Partial aggregation in Drill-on-Phoenix

2015-10-01 Thread Julian Hyde
Phoenix is able to perform quite a few relational operations on the region server: scan, filter, project, aggregate, sort (optionally with limit). However, the sort and aggregate are necessarily "local". They can only deal with data on that region server, and there needs to be a further operation t
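A minimal sketch of the "local, then final" aggregation pattern described above, in plain Python rather than Phoenix or Drill code. The function names and the two sample "regions" are hypothetical, chosen only for illustration. Each region produces a partial (sum, count) per key, and a further step merges the partials.

```python
from collections import defaultdict

def local_aggregate(rows):
    """Partial aggregate over the rows held by one region: key -> (sum, count)."""
    partial = defaultdict(lambda: (0, 0))
    for key, value in rows:
        s, c = partial[key]
        partial[key] = (s + value, c + 1)
    return dict(partial)

def final_aggregate(partials):
    """Merge the per-region partials into one global (sum, count) per key."""
    merged = defaultdict(lambda: (0, 0))
    for partial in partials:
        for key, (s, c) in partial.items():
            ms, mc = merged[key]
            merged[key] = (ms + s, mc + c)
    return dict(merged)

# Two hypothetical regions, each seeing only its own rows.
region_a = [("x", 1), ("y", 10), ("x", 3)]
region_b = [("x", 5), ("y", 20)]

totals = final_aggregate([local_aggregate(region_a), local_aggregate(region_b)])
print(totals)
```

Carrying sum and count separately is what lets the final step compute an average correctly; averaging the per-region averages would not.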

Re: Partial aggregation in Drill-on-Phoenix

2015-10-06 Thread Julian Hyde
>> >>>> With Distribution Trait application here: >>>> >>>> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/DrillDistributionTraitDef.java >>>> >>>> To me, the easiest way to s

Re: flatten() function, scalar functions, nested ?

2015-10-12 Thread Julian Hyde
> On Oct 12, 2015, at 3:42 PM, Jacques Nadeau wrote: > > - we have shortcut for a lateral join combined with a table function used > in the select clause It’s funny, Postgres has a short-cut that allows you to use UNNEST in the SELECT clause[1]. James and I discussed it for Phoenix Unnest supp

Re: Improvements to storage plugin planning integration support

2015-10-12 Thread Julian Hyde
Dead air means I’ve had ‘flu and been busy… can you give me another day to think about this? > On Oct 12, 2015, at 4:12 PM, Jacques Nadeau wrote: > > The dead air must mean that everyone is onboard with my recommendation > > PlannerIntegration StoragePlugin.getPlannerIntegrations() > > interf

Re: flatten() function, scalar functions, nested ?

2015-10-13 Thread Julian Hyde
> On Oct 12, 2015, at 5:47 PM, Jacques Nadeau wrote: > > Julian, have you spent time thinking > about how to extend validator and sql2rel? To extend the validator, SqlOperator, SqlValidatorNamespace, SqlValidatorScope, SqlValidator can all be sub-classed — are those enough control points? You

Re: [VOTE] Release Apache Drill 1.2.0 RC3

2015-10-14 Thread Julian Hyde
Sorry, I don't think I'll be able to vote this time. Just too crunched with other things. For the record I tried RC2 and got a crash but I didn't get time to figure out whether it was just my environment. So you shouldn't take that as any indication of the quality of the release. On Wed, Oct 14, 2

Re: Apache Drill

2015-10-17 Thread Julian Hyde
Seems to me the biggest problem is to make drill understand the nested structure of an xml document. That work has been done for json, so let's build on it. Suppose there was a translator that converted xml to json (adding attributes for things that json lacks, such as namespaces, text, element

Re: Apache Drill

2015-10-17 Thread Julian Hyde
lex xml's, that I currently use in Storm. >> Takes attributes, and everything. >> >> I can share it with the community if interesting. >> >> /Magnus >> Den 17 okt 2015 7:02 em skrev "Julian Hyde" : >> >>> Seems to me the biggest pr

Re: Apache Drill

2015-10-17 Thread Julian Hyde
hing limiting its use in other domains and > resulting in larger file(s). > > Sent from my iPhone > >> On Oct 17, 2015, at 4:51 PM, Julian Hyde wrote: >> >> Yes, frankly, performance is a concern. But there are also many >> concerns about how to fit a deep XML docum

Re: Apache Drill

2015-10-18 Thread Julian Hyde
) >>>> to not make a noticeable difference (which is what I think Julian is >>>> implying)? >>>> >>>> Sent from my iPhone >>>> >>>>> On Oct 17, 2015, at 1:41 PM, Magnus Pierre >>> wrote: >>>>> >>&

Re: select from table with options

2015-10-20 Thread Julian Hyde
+1 to use table functions In Calcite (and I presume Drill) a “table function” may actually function more like a (Lisp) macro. The function gets called at prepare time to yield a RelNode (say a TableScan). So a table function is every bit as efficient as using a table, but it allows extra parame

Re: select from table with options

2015-10-21 Thread Julian Hyde
Whatever API is used to scan files from SQL, there will need to be a corresponding way to accomplish the same thing in a user interface. Probably a form with various fields, some of them with drop-boxes etc. And ideally a facility that samples a few hundred rows to deduce the probable field names

Re: [DISCUSS] Design Documents

2015-10-22 Thread Julian Hyde
Zelaine, Welcome to the Drill community! It was great working with you a few years ago (I think some of your contributions are still present in the code base that evolved into Calcite) and look forward to working together again. Julian

Re: Drill 1.3 Timing: Let's start the vote next week

2015-10-26 Thread Julian Hyde
Sounds good to me. On a related note, Calcite’s 1.5 release. That release has slipped about a week, so there might be time to get https://issues.apache.org/jira/browse/CALCITE-911 in, which I know is important to Drill. Ironically if Drill re

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Julian Hyde
+100 Thanks for spearheading this, Jacques. They say memory is the new disk. So, it’s no longer sufficient to use the same on-disk data format if we want our tools to interoperate. The idea of engines interoperating by reading the same in-memory temporary tables, and passing data from one engi

Re: Drill hangout conflict with Calcite hangout..

2015-10-27 Thread Julian Hyde
I didn’t realize I had scheduled the Calcite hangout at the same time as the Drill hangout. Sorry! And thanks for flexing with this, Drillers. Here’s a virtual slice of Calcite’s graduation cake, if it’s any consolation… https://twitter.com/ApacheCalcite/status/659072259164803072

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-27 Thread Julian Hyde
Jacques, Can you please log the JIRA case you mentioned, and also attach any documentation (e.g. javadoc) you already have.

Re: select from table with options

2015-11-01 Thread Julian Hyde
On Sun, Oct 25, 2015 at 10:13 PM, Jacques Nadeau wrote: > Agreed. We need both select with option and .drill (by etl process or by > sql ascribe metadata). > > Let's start with the select with options. My only goal would be to make > sure that creation of .drill file through SQL uses a similar pat

Temporary branches

2015-11-05 Thread Julian Hyde
> On Nov 5, 2015, at 1:12 PM, Jacques Nadeau wrote: > > I'm not sure what to do here. INFRA just changed the Git behavior so it is > no longer possible to delete branches. I generally don't like to have > failed branches in a release history (otherwise you get a release branch > with all these m

Re: select from table with options

2015-11-05 Thread Julian Hyde
> On Nov 5, 2015, at 5:00 PM, Julien Le Dem wrote: > > TL;DR: TableMacro works for me; I need help with a bug in Calcite when > there's more than 1 function with the same name. Yes; see below. > FYI: I have a prototype of TableMacro working in Drill. For now just being > able to specify the

Re: Temporary branches

2015-11-05 Thread Julian Hyde
n that other thread so I can +1? > > -- > Jacques Nadeau > CTO and Co-Founder, Dremio > > On Thu, Nov 5, 2015 at 1:28 PM, Julian Hyde wrote: > >> >>> On Nov 5, 2015, at 1:12 PM, Jacques Nadeau wrote: >>> >>> I'm not sure what to do here. IN

Re: select from table with options

2015-11-06 Thread Julian Hyde
parameter. We do want to > overload in this case, which is why I'm looking into it. > > I'll file a JIRA for my other branch > > Julien > > On Thu, Nov 5, 2015 at 5:39 PM, Julian Hyde wrote: > >> >> On Nov 5, 2015, at 5:00 PM, Julien Le

Re: select from table with options

2015-11-07 Thread Julian Hyde
` (type => 'CSV', fieldDelimiter >> => '|', skipFirstRow => true)) >> >> It also looks much more like a hint to the table (which is our goal). >> >> >> >> >> >> >> >> -- >> Jacques Nadeau >> CTO and Co-Foun

Re: select from table with options

2015-11-10 Thread Julian Hyde
3:34 PM, Jacques Nadeau >> wrote: >>> >>>> My proposal was an a or b using the freemarker template in the grammar, >>>> not something later. >>>> >>>> Actually, put another way: we may want to consider stating that we only >>&

Re: select from table with options

2015-11-10 Thread Julian Hyde
y DATE)'. On Tue, Nov 10, 2015 at 12:28 PM, Julien Le Dem wrote: > In the patch I just sent, probably not. > I will adjust it and add the corresponding test. > > On Tue, Nov 10, 2015 at 11:51 AM, Julian Hyde wrote: > >> Can you use both together? Say >> >>

Re: [DISCUSS] Get off Calcite Forked Version

2015-11-10 Thread Julian Hyde
Going forward, I think you should start a "calcite first" policy -- get changes accepted into Calcite, with tests of course, then cherry-pick into Drill's branch. Recent changes like "Allow a MAP literal type." should have done that. And especially changes where you could just parameterize planner

Re: select from table with options

2015-11-12 Thread Julian Hyde
tions > - allow using table functions and extend together. > Does it make sense? > Julien > > > On Tue, Nov 10, 2015 at 12:51 PM, Julian Hyde wrote: > >> To be clear, it should be possible to use a table function with all of >> the options -- EXTENDS clause, OVE

Re: select from table with options

2015-11-14 Thread Julian Hyde
u, Nov 12, 2015 at 8:34 PM, Julian Hyde wrote: > >> You’re hitting the grammar ambiguity I expected. >> >> I think that base Calcite should require the full verbose syntax: the >> TABLE keyword for table functions and the EXTEND keyword for extends >> clauses. Then

Re: Proposal for Skipping Records

2015-11-16 Thread Julian Hyde
It would be useful if you could describe the different ways that a record can be “bad”. IIRC the SQL standard divides the conditions into errors and warnings. Examples of a warning would be a string column that is truncated because it is too large for a varchar(20), or numeric underflow when you

Re: Proposal for Skipping Records

2015-11-16 Thread Julian Hyde
. However, > from the aspect of users' experience, they could just proceed and see the > different types of errors at the log, which helps them judge whether the > failure is tolerable or not. > > On Mon, Nov 16, 2015 at 11:56 AM, Julian Hyde wrote: > >> It would be usef

Re: [DISCUSS] Get off Calcite Forked Version

2015-11-17 Thread Julian Hyde
> operators (e.g. Calcite, Phoenix) going to implement the operator now >> that Drill's lax validation policy has allowed it in? >> >> [ Will discuss in Calcite list once we have the PR ready for this patch. >> ] >> >> 9. I can't see yet how the [Sta

Re: [DISCUSS] Get off Calcite Forked Version

2015-11-17 Thread Julian Hyde
asn't been super successful. And sometimes I feel like > Jinfeng gets stuck doing all the heavy lifting. > > -- > Jacques Nadeau > CTO and Co-Founder, Dremio > >> On Tue, Nov 17, 2015 at 9:48 AM, Julian Hyde wrote: >> >> I can do a hackathon if it help

Drill on JDK 1.8

2015-11-19 Thread Julian Hyde
What’s the state of Drill on JDK 1.8? https://issues.apache.org/jira/browse/DRILL-1491 has been open for over a year, and doesn’t describe what the remaining issues are. Julian

Re: Drill on JDK 1.8

2015-11-19 Thread Julian Hyde
rformance tests using > Drill and JDK8. > > On Thu, Nov 19, 2015 at 4:35 PM, Julian Hyde wrote: > >> What’s the state of Drill on JDK 1.8? >> https://issues.apache.org/jira/browse/DRILL-1491 has been ope

Re: Announcing new committer: Ellen Friedman

2015-11-23 Thread Julian Hyde
Congratulations, Ellen! Thanks for all you’ve done for Drill so far. Julian > On Nov 22, 2015, at 5:50 PM, Worthy LaFollette wrote: > > Congrats, Welcome! > > On Sun, Nov 22, 2015 at 6:38 PM, Jacques Nadeau wrote: > >> The Apache Drill PMC is very pleased to announce Ellen Friedman as a new

Re: Moving directory based pruning to fire earlier

2015-11-23 Thread Julian Hyde
I’m not sure what properties / behavior you want to override but remember that Calcite specifies a lot of things as traits or metadata. For example, “double RelNode.getRows()” is deprecated and these days you would use RelMetadataQuery.getRowCount(). You would not need to sub-class a RelNode to

Re: Moving directory based pruning to fire earlier

2015-11-23 Thread Julian Hyde
> So, for now, we do have to override/extend all DrillLogicalRel. >> >> >> On Mon, Nov 23, 2015 at 4:55 PM, Julian Hyde wrote: >>> I’m not sure what properties / behavior you want to override but >> remember that Calcite specifies a lot of b

Re: Moving directory based pruning to fire earlier

2015-11-23 Thread Julian Hyde
h Rel node to > decide how to estimate it's own cost, given the row count, distinct > row count etc from MetadataProvider. Are you suggesting we completely > remove the Drill's costing estimation method, and use Calcite's > default one? > > > > On Mon, No

Re: Naming the new ValueVector Initiative

2015-11-26 Thread Julian Hyde
roject [1]. We could just do > this in a private email thread but I think doing it on Drill dev is better > in the interest of transparency. This isn't the perfect place for that but > I'm not sure a better place exists. > > I'm up for changing any or all of this

Re: Naming the new ValueVector Initiative

2015-11-30 Thread Julian Hyde
deau wrote: > > +1 > > -- > Jacques Nadeau > CTO and Co-Founder, Dremio > > On Mon, Nov 30, 2015 at 6:34 PM, Wes McKinney wrote: > >> Should we have a last call for votes, closing EOD tomorrow (Tuesday)? I >> missed this for a few days last week with holiday

Re: Can we pass the #skipped records with RecordBatch?

2015-12-01 Thread Julian Hyde
+1 for a sideband mechanism. Sideband can also allow correlated restart of sub-queries. In sideband use cases you described, the messages ran in the opposite direction to the data. Would the sideband also run in the same direction as the data? If so it could carry warnings, rejected rows, progr

Re: Can we pass the #skipped records with RecordBatch?

2015-12-08 Thread Julian Hyde
king. I'll try to >> write >>> up some thoughts later this week to get the ball rolling. Sound good? >>> >>> -- >>> Jacques Nadeau >>> CTO and Co-Founder, Dremio >>> +1 on having a framework. >>> OTOH, as with the warnings imp

Re: Naming the new ValueVector Initiative

2015-12-17 Thread Julian Hyde
>>> On Thu, Dec 3, 2015 at 1:23 AM, Marcel Kornacker >>> <mailto:mar...@cloudera.com>> >>>> wrote: >>>>> >>>>> Just added my vote. >>>>> >>>>> On Thu, Dec 3, 2015 at 12:51 PM, Wes McKinney >>>

Re: Naming the new ValueVector Initiative

2016-01-12 Thread Julian Hyde
> On Jan 12, 2016, at 09:21, Jason Altekruse wrote: > > I would not > advocate for Julian's suggestion of diverging histories in the one repo. He > seemed to just be mentioning for completeness in the discussion anyway +1

Re: Naming the new ValueVector Initiative

2016-01-21 Thread Julian Hyde
To expand on what “straight to TLP” means (correct me if I’m wrong, Jacques). From an IP standpoint, the new project is a clone of Drill. It starts off with Drill’s code base. We then, as the sculptor said [1], chip away everything that doesn’t look like Arrow. Julian [1] http://quoteinvestig

Re: Deterministic behavior of Negative Function?

2016-02-02 Thread Julian Hyde
I don’t recall interval literals being discussed on the Calcite list. We do support interval literals of the standard types (day-to-second or year-to-month) but we don’t support interval literals (or interval values) of month-to-day type. I think there’s a good reason that that kind of literal

Re: Optimizing SUM(1) query

2016-02-19 Thread Julian Hyde
And indeed COUNT(*) is equivalent to COUNT(1). COUNT(*) is the same as COUNT(e) where e is any not-null value. I would argue that SUM(1) should be optimized to COUNT(*). Or, generalizing a bit, that SUM(c) should be optimized to COUNT(*) * c. IIRC, Hive performs that optimization. It's a bit tric
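A small check of the SUM(c) = COUNT(*) * c identity, as a sketch only. It uses SQLite from the Python standard library as a stand-in, not Drill, Hive, or Calcite, and the table `t` is hypothetical. The two expressions agree on a non-empty table but differ on an empty one, where SUM returns NULL and COUNT(*) returns 0. Whether that is the difficulty the truncated message goes on to describe is not visible in the snippet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

# Three rows: SUM(7) and COUNT(*) * 7 give the same answer.
non_empty = conn.execute("SELECT SUM(7), COUNT(*) * 7 FROM t").fetchone()

# No rows: SUM(7) is NULL (None in Python) while COUNT(*) * 7 is 0.
conn.execute("DELETE FROM t")
empty = conn.execute("SELECT SUM(7), COUNT(*) * 7 FROM t").fetchone()

print(non_empty, empty)
conn.close()
```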

Re: Optimizing SUM(1) query

2016-02-19 Thread Julian Hyde
PS I did recall correctly: https://issues.apache.org/jira/browse/HIVE-6192. But it's not implemented using Calcite, sadly. On Fri, Feb 19, 2016 at 12:11 PM, Julian Hyde wrote: > And indeed COUNT(*) is equivalent to COUNT(1). COUNT(*) is the same as > COUNT(e) where e is any not-null v

Re: hive translate function is not working from Drill

2016-02-29 Thread Julian Hyde
Arina: I did reply to your message on dev@calcite. See http://mail-archives.apache.org/mod_mbox/calcite-dev/201602.mbox/%3CDB1F2B6D-C23A-45E7-B400-C7458DCD9CF1%40apache.org%3E

Re: Optimizing SUM(1) query

2016-03-15 Thread Julian Hyde
Is there any reason why Drill cannot transform SUM(1) to COUNT(*) at an early stage (i.e. using a logical optimization rule) so that this optimization does not need to be done for each engine? > On Mar 15, 2016, at 5:29 AM, Sudip Mukherjee wrote: > > I was trying to have an Optimizer rule for

Re: Invoking UDF that doesn't have parameters without paranthesis

2016-12-20 Thread Julian Hyde
codegen/includes/parserImpls.ftl > > <https://github.com/apache/drill/blob/master/exec/java-exec/src/main/codegen/includes/parserImpls.ftl> > > Thank you, > Sudheesh > >> On Dec 19, 2016, at 3:00 PM, Nagarajan Chinnasamy >> wrote: >> >> Hi Julian Hyde,

Re: Drill: Memory Spilling for the Hash Aggregate Operator

2017-01-13 Thread Julian Hyde
The attachment didn’t come through. I’m hoping that you settled on a “hybrid” hash algorithm that can write to disk, or write to memory, and the cost of discovering that is wrong is not too great. With Goetz Graefe’s hybrid hash join (which can be easily adapted to hybrid hash aggregate) if the

Re: Drill: Memory Spilling for the Hash Aggregate Operator

2017-01-16 Thread Julian Hyde
hAggregation.pdf<https://drive.google.com/file/d/0ByUg32jfEW16ajNiQlVRczhPTjA/view?usp=sharing> > drive.google.com > > > >-- Boaz > > > From: Julian Hyde > Sent: Friday, January 13, 2017 11:00 PM > To: dev@drill.apache.org >

Re: Drill: Memory Spilling for the Hash Aggregate Operator

2017-01-17 Thread Julian Hyde
ing would do that. > > Thanks for the suggestions and the link; I’ll go over Goetz’ paper again > and look for more ideas. > > — Boaz > > >> On Jan 16, 2017, at 4:09 PM, Julian Hyde wrote: >> >> Does the data need to be written into a disk-

Re: NPE when connecting to sqlline using username and password

2017-02-08 Thread Julian Hyde
Agreed, you should use ’-n’. But also, please log a bug at https://github.com/julianhyde/sqlline . sqlline should give an error message, not throw a NullPointerException. Julian > On Feb 8, 2017, at 10:57 AM, Andries Engelbrecht > wrote: > > Use -n ma

Re: NPE when connecting to sqlline using username and password

2017-02-09 Thread Julian Hyde
ck this - > https://github.com/julianhyde/sqlline/issues/55 > > > Regards, > Khurram > ____ > From: Julian Hyde > Sent: Thursday, February 9, 2017 1:28:28 AM > To: dev@drill.apache.org > Subject: Re: NPE when connecting to sqlline using u

Time zone

2017-02-10 Thread Julian Hyde
Can someone please clarify the timezone behavior of Drill’s TIMESTAMP data type. According to the SQL standard, there is no timezone stored in a TIMESTAMP value, nor is there an implicit time zone (such as UTC or the server or session’s time zone). Under the standard model, TIMESTAMP ‘2017-02-1

Re: [ANNOUNCE] New Committer: Arina Ielchiieva

2017-02-24 Thread Julian Hyde
Congratulations, and welcome! On Fri, Feb 24, 2017 at 9:17 AM, Abhishek Girish wrote: > Congratulations Arina! > > On Fri, Feb 24, 2017 at 9:06 AM, Sudheesh Katkam > wrote: > >> The Project Management Committee (PMC) for Apache Drill has invited Arina >> Ielchiieva to become a committer, and we

Re: Drill date & time types encoding

2017-03-14 Thread Julian Hyde
I don’t think 4713 BC comes from the SQL standard. That is a Postgres thing. I believe that the standard says you should support timestamp precision up to 9 (i.e. nanoseconds). 2 ^ 64 nanoseconds is 584 years. So, it’s not possible to cram all of the timestamp values we’d like into a 64 bit inte
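The "584 years" figure can be checked with a few lines of arithmetic. This is only a sanity check of the number quoted above; the 365.25-day year is an assumption made here for the conversion.

```python
NANOS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed average year length

# Range covered by a 64-bit count of nanoseconds, expressed in years.
years = 2**64 / NANOS_PER_SECOND / SECONDS_PER_YEAR
print(int(years))
```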

Re: Drill date & time types encoding

2017-03-16 Thread Julian Hyde
> On Mar 16, 2017, at 4:25 PM, Jinfeng Ni wrote: > > Time/Timestamp without t/z should be interpreted as local time. No. If I am in pacific time and I have a TIMESTAMP value “1970-01-01 12:00:00” and I send it to you in central european time you receive a TIMESTAMP value “1970-01-01 12:00:0
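A rough illustration of the distinction being argued here, using Python's datetime as a stand-in for SQL types; it is not Drill code. A naive datetime plays the role of a zone-less TIMESTAMP, and an aware datetime plays the role of an instant. The fixed offsets of -8 and +1 hours for Pacific and Central European time are assumptions for January dates.

```python
from datetime import datetime, timedelta, timezone

# Zone-less value: serializing and parsing it elsewhere leaves the fields unchanged.
naive = datetime(1970, 1, 1, 12, 0, 0)
received = datetime.fromisoformat(naive.isoformat())

# Instant: the same moment has different wall-clock fields in another zone.
pacific = timezone(timedelta(hours=-8))  # assumed fixed offset
cet = timezone(timedelta(hours=1))       # assumed fixed offset
instant = datetime(1970, 1, 1, 12, 0, 0, tzinfo=pacific)
instant_in_cet = instant.astimezone(cet)

print(received, instant_in_cet)
```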

Re: Drill date & time types encoding

2017-03-16 Thread Julian Hyde
n Thu, Mar 16, 2017 at 4:41 PM, Julian Hyde <mailto:jh...@apache.org>> wrote: >> >>> On Mar 16, 2017, at 4:25 PM, Jinfeng Ni wrote: >>> >>> Time/Timestamp without t/z should be interpreted as local time. >> >> >> No. >> >>

Re: Understanding the science and concepts behind Calcite

2017-04-29 Thread Julian Hyde
Adding dev@drill to the cc list, because Muhammad also asked the question there. But please reply to dev@calcite only. I gave a talk “Why you should care about relational algebra”[1], intended for an audience of people who know SQL, but with a lot of details about algebra and algebraic transfor

Re: Issues categorization suggestion

2017-05-25 Thread Julian Hyde
In Calcite we assign a "newbie" flag to some issues. A more detailed categorization takes significant effort for the person triaging the bugs, so isn't worth it. On Thu, May 25, 2017 at 9:23 AM, Paul Rogers wrote: > Great suggestion. > > What I’ve learned over the last year, however, is that if s

Re: Why isn't Drill using a more recent version of Calcite ?

2017-05-28 Thread Julian Hyde
It's not exactly true that "calcite came from drill". Calcite was originally called Optiq. Drill was the second project to use Optiq (Cascading was the first) and Optiq was a pretty significant code base (almost 200k lines of code) when Drill started to use it. Drill created their own branch/fork

Re: Thinking about Drill 2.0

2017-06-09 Thread Julian Hyde
> On Jun 5, 2017, at 11:59 AM, Paul Rogers wrote: > > Similarly, the storage plugin API exposes details of Calcite (which seems to > evolve with each new version), exposes value vector implementations, and so > on. A cleaner, simpler, more isolated API will allow storage plugins to be > built

Re: [ANNOUNCE] New Committer: Charles Givre

2017-06-12 Thread Julian Hyde
Congratulations, Charles, and welcome! Thank you, not only for your code contributions, but also for your work promoting Drill by writing and speaking at conferences. A simple search[1] turns up a lot of material. Julian [1] https://www.google.com/search?q=charles+givre+apache+drill > O

Re: Drill Summit/Conference Proposal

2017-06-14 Thread Julian Hyde
I like the idea of co-hosting a conference. ApacheCon in particular is a good venue, and they explicitly encourage sub-conferences (there are “Big Data” and “IoT” tracks, and this year there were sub-conferences for Tomcat and CloudStack). DrillCon was part of ApacheCon, people could attend a w

Re: Drill Summit/Conference Proposal

2017-06-16 Thread Julian Hyde
als, thus I’m not certain what the right choice is, > but I wanted to bring up the point for discussion. > > > On June 14, 2017 at 2:32:45 PM, Julian Hyde (jh...@apache.org > <mailto:jh...@apache.org>) wrote: > >> I like the idea of co-hosting a conference. Apac

Re: Thinking about Drill 2.0

2017-06-16 Thread Julian Hyde
Avatica? > On Jun 15, 2017, at 10:39 AM, Paul Rogers wrote: > > Hi Uwe, > > This is incredibly helpful information! Your explanation makes perfect sense. > > We work quite a bit with ODBC and JDBC: two interfaces that are very much > synchronous and row-based. There are three challenges key wi

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2017-06-24 Thread Julian Hyde
Congratulations! On Fri, Jun 23, 2017 at 5:00 PM, Aman Sinha wrote: > Thank you all ! > > -Aman > > On Fri, Jun 23, 2017 at 3:59 PM, Paul Rogers wrote: > >> Congratulations Aman! >> >> And, thanks much to Parth, our outgoing chair, for his contributions as >> PMC chair over the last year! >> >>

LinkedIn requests

2017-06-27 Thread Julian Hyde
A few people (you know who you are!) have been sending LinkedIn requests to dev@drill. Please don’t send LinkedIn requests to dev or any other mailing list. As a list moderator I reject the requests, and a few days later LinkedIn re-sends the email and I have to moderate it again. If you know p

Re: Why rules from all plugins contribute into optimizing any type of query ?

2017-07-02 Thread Julian Hyde
What Ted said. But also, conversely, you should know that in Calcite you can write a general-purpose rule. Or better, re-use a general-purpose rule that someone else has written. There are logical rules, for example constant reduction and logic simplification, that work regardless of the data s

Re: Why Drill required a special Calcite fork ?

2017-07-17 Thread Julian Hyde
Leaving aside the fact that Drill needs a fork of Calcite (I accept the arguments for that, more or less), it’s embarrassing that the fork is poorly documented, poorly named (Calcite has been out of the incubator for almost 2 years, and hasn’t been called Optiq for even longer), and is in a part

Re: Drill query planning error

2017-07-26 Thread Julian Hyde
Aman, Thanks for moving dev@calcite to Bcc. This is properly a Drill question. A blanket restriction on cartesian joins is a blunt instrument. Sometimes cartesian joins are valid, safe, and the best plan for a query. This is a case in point. Users shouldn’t have to change config parameters to g

Re: Drill query planning error

2017-07-26 Thread Julian Hyde
> Best Regards >> >> >> On Thu, 27 Jul 2017 at 3:10 AM Aman Sinha > <mailto:amansi...@apache.org>> wrote: >> >>> Yes, the RelMdMaxRowCount statistic would be useful for this. Thanks for >>> pointing that out. I'll see if we can

Re: [ANNOUNCE] New PMC member: Arina Ielchiieva

2017-08-02 Thread Julian Hyde
Welcome! Well deserved. > On Aug 2, 2017, at 11:34 AM, rahul challapalli > wrote: > > Congratulations Arina! > > On Wed, Aug 2, 2017 at 11:27 AM, Kunal Khatua wrote: > >> Congratulations, Arina!! >> >> >> Thank you for your contributions to Drill ! >> >> >> ~ Kunal >> >> ___

Re: [DISCUSS] Draft board report

2017-08-03 Thread Julian Hyde
+1 > On Aug 3, 2017, at 2:58 PM, Aman Sinha wrote: > > Drill developers, > The quarterly board report for Drill is due in the next week or so. Pls > take a look at the draft report below send me your comments if any. I > would like to send it by tomorrow since I will be on vacation next week.

Re: Moving sqlline to Maven Central

2015-03-11 Thread Julian Hyde
OK, it's done. sqlline release 1.1.9 is now at maven central[1]. Cc: drill-dev and calcite-dev, because I know those communities heavily use sqlline. Julian [1] http://search.maven.org/#search|ga|1|sqlline On Fri, Mar 6, 2015 at 11:48 AM, Julian Hyde wrote: > For the last few relea

Re: Please be a little more choosey with method name verbs -- get vs create

2015-03-20 Thread Julian Hyde
+1 Principle of least surprise applies to every API you create. > On Mar 19, 2015, at 6:38 PM, Chris Westin wrote: > > In the course of things I've been doing, I've been cleaning up a large > number of compilation warnings. These can help us find problems, so it pays > to eliminate them wheneve

Re: [VOTE] Release Apache Drill 0.8.0

2015-03-23 Thread Julian Hyde
+1 (non-binding) Downloaded source tarball, verified checksums, verified notice & license, built, ran tests. (Ubuntu 14.10, Java 1.7.0_21, Maven 3.2.1.) Julian On Sun, Mar 22, 2015 at 9:33 PM, Yash Sharma wrote: > +1 . Non-binding. > > - Verified checksums for both distributions. > - Verified

Re: [VOTE] Release Apache Drill 0.8.0 (rc1)

2015-03-26 Thread Julian Hyde
+1 (non-binding) Downloaded source tarball, verified checksums, verified notice & license, built, ran tests. (Ubuntu 14.10, Java 1.7.0_21, Maven 3.2.1.) On Thu, Mar 26, 2015 at 1:08 PM, Hsuan Yi Chu wrote: > +1 > mvn clean install on linux three times in a row. All passed!!! > > On Thu, Mar 26,

Re: [VOTE] Release Apache Drill 0.8.0 (rc1)

2015-03-26 Thread Julian Hyde
:19 PM, Julian Hyde wrote: > +1 (non-binding) > > Downloaded source tarball, verified checksums, verified notice & > license, built, ran tests. > > (Ubuntu 14.10, Java 1.7.0_21, Maven 3.2.1.) > > On Thu, Mar 26, 2015 at 1:08 PM, Hsuan Yi Chu wrote: >> +1 >>

Re: limit and count do odd things

2015-04-08 Thread Julian Hyde
Ted, I'll forgive you (and anyone) the occasional intellectual lapse. I came across your wonderful t-digest paper earlier today. https://github.com/tdunning/t-digest/raw/master/docs/t-digest-paper/histo.pdf Julian > On Apr 8, 2015, at 20:20, Ted Dunning wrote: > > Thanks. > > Brain fart.

Re: [DISCUSS] improve physical plan formatting

2015-04-13 Thread Julian Hyde
+1 I have no idea whether you use Calcite for formatting plans or use Drill code. If the former, I’d be happy to accept a patch to Calcite that allows this as a formatting option for Calcite plan. Anyone have a good descriptive name for this format? Steven, were you inspired by some other sys

Re: Isn't it about time for a 0.9.0?

2015-04-18 Thread Julian Hyde
+1 Let me know if I can help with the Calcite task. Julian > On Apr 16, 2015, at 18:23, Jacques Nadeau wrote: > > Hey Guys, > > It looks like it is about time to start prepping for a 0.9 release. I > think we need to get a few last things into 0.9 and can then get a > candidate up for vote.

Re: Review Request 33136: DRILL-1384: Rebase Drill's forked Optiq library onto Calcite release 1.0.

2015-04-20 Thread Julian Hyde
Drill's RelNodes and Rules to match Calcite's new naming scheme. Strictly optional of course. - Julian Hyde On April 18, 2015, 8:34 p.m., Jinfeng Ni wrote: > > --- > This is an automatically generated e-mail. To re

Re: Should we make dir* columns only exist when requested?

2015-04-23 Thread Julian Hyde
+1 to returning directories as context. Very useful feature. Could be used to return context for other adapters (e.g. an adapter that concatenates all versions of versioned logfiles). +1 making dir an array, per Ted's suggestion I think dir should not appear in *; thus you'd have to write sele

Re: Should we make dir* columns only exist when requested?

2015-04-23 Thread Julian Hyde
commands. Relational algebra is potentially MORE efficient. I find myself writing ' ... | sort | uniq -c | sort -nr' almost daily and wish I could write ' ... order by count(*) desc'. On Thu, Apr 23, 2015 at 6:27 PM, Julian Hyde wrote: > +1 to returning directories as con

Re: Review Request 33523: DRILL-1957: Support nested loop join planning (for NOT-IN, uncorrelated EXISTS, Inequality)

2015-04-27 Thread Julian Hyde
> On April 27, 2015, 3:46 p.m., Jinfeng Ni wrote: > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java, > > line 43 > > > > > > If a join has both equality and inequality condition,

Re: [DISCUSS] Reduce use of freemarker templates for functions

2015-04-27 Thread Julian Hyde
If you can write all of this as Java code and the JIT can handle it, that would be great. But I’d be surprised. If freemarker template for a particular function is complex, that is probably because it is doing a lot of customization for particular cases. In other words, it is adding a lot of va

Re: [VOTE] Release Apache Drill 0.9.0

2015-04-29 Thread Julian Hyde
+1 (binding) Downloaded tar.gz, built, ran tests. On Wed, Apr 29, 2015 at 11:03 AM, Hsuan Yi Chu wrote: > + 1 (non-binding) > Tried some queries; > Ran unit test on MAC OS. Things worked fine. > > On Wed, Apr 29, 2015 at 12:50 AM, Jacques Nadeau wrote: > >> Good evening, >> >> I would like to p

Re: [VOTE] Release Apache Drill 0.9.0 (rc2, third time is the charm...)

2015-04-30 Thread Julian Hyde
+1 Downloaded src, built & ran tests on x86_64, ubuntu-15.04, jdk 1.7.0_79-b15, America/Los_Angeles[1]. Verified md5, sha1, asc for src and binary[2]. Julian [1] Guess who was bitten by a timezone bug last release. [2] Here’s the tool I use to check signatures: https://github.com/julianhyde/

Re: TODOs in comments

2015-05-05 Thread Julian Hyde
Is the proposal to disallow TODOs that do not have a JIRA case number? I’d be +1 to that. I’m much less concerned with the problem that TODO(DRILL-abcd) might linger after in the code after DRILL-abcd has been fixed. Julian On May 5, 2015, at 12:38 PM, Jason Altekruse wrote: > It could opti

Re: TODOs in comments

2015-05-05 Thread Julian Hyde
r less-formal things, doesn't > seem good. > > > > > > > Daniel > > Sudheesh Katkam wrote: >> Yes, TODOs must have an associated JIRA, with the specified format. >> >>> On May 5, 2015, at 1:14 PM, Julian Hyde wrote: >>> >>> Is

Re: [DISCUSS] Remove required type

2016-03-23 Thread Julian Hyde
Jacques, Doesn't Drill detect the type of each column within each batch? If so, does it (or could it) also detect that a particular column is not null (again, within the batch)? You may not generate not-null data, but a lot of data is not-null. Let's not be too hasty to dismiss this as a theoret

Re: [DISCUSS] Remove required type

2016-03-23 Thread Julian Hyde
If the function is strict (i.e. produces null result if any of the arguments are null) then the user needs to provide only one implementation: the one where all arguments are not null. Drill would check for null arguments before calling it. Most of the SQL built-in functions are strict. Aren't mos
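A minimal sketch of the strict-function idea, written in Python rather than against Drill's actual UDF interface. The `strict` decorator and the `add` function are hypothetical names. The author supplies only the not-null implementation, and the wrapper performs the null check before calling it.

```python
def strict(fn):
    """Wrap fn so that any None (standing in for SQL NULL) argument yields None."""
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None      # null in, null out; fn is never called
        return fn(*args)     # all arguments are known to be non-null here
    return wrapper

@strict
def add(a, b):
    # The only implementation the author writes: the all-not-null case.
    return a + b

print(add(1, 2), add(None, 2))
```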

Re: Proposal: Create v2 branch to work on breaking changes

2016-03-26 Thread Julian Hyde
Do you plan to be doing significant development on both the v1 and v2 branches, and if so, for how long? I have been bitten badly by that pattern in the past. Developers put lots of unrelated, destabilizing changes into v2, it took longer than expected to stabilize v2, product management lost co

Re: Getting back on Calcite master: only a few steps left

2016-03-31 Thread Julian Hyde
I’ve closed 1149, if we don’t need the feature. Yes, we need a unit test for 1151. I offered a suggestion how. > On Mar 31, 2016, at 11:59 AM, Sudheesh Katkam wrote: > > I submitted a patch for CALCITE-1151 > (with changes to resolve > a ch

Re: Continued Avro Frustration

2016-04-01 Thread Julian Hyde
Stefan, I wanted to chime in. I don’t think Jacques was out of line. I understand your frustration, but the project does not owe you anything. The only surefire way to get a feature into the project is to contribute it yourself AND to work through the process of getting it accepted. If you co

Re: Getting back on Calcite master: only a few steps left

2016-04-20 Thread Julian Hyde
together >> test cases for CALCITE-1150? Maybe you could provide guidance on the set of >> queries to test? >> >> thanks, >> Jacques >> >> >> -- >> Jacques Nadeau >> CTO and Co-Founder, Dremio >> >> On Thu, Mar 31, 201
