drill's calcite branch

2017-02-26 Thread weijie tong
Hi Drillers: where can I find the source code of the Calcite branch that Drill now maintains itself?

Re: drill's calcite branch

2017-02-27 Thread weijie tong
cubator-calcite/tree/DrillCalcite1.4.0 > > Kind regards > Arina > > On Mon, Feb 27, 2017 at 5:51 AM, weijie tong > wrote: > > > hi Drills: > > where can I find the calcite branch source code that drill are now > self > > maintaining ? > > >

questions about FPGA plan to Drill

2017-03-16 Thread weijie tong
Hi all: I noticed that @Eric Fukuda once did some work on using FPGAs to power Drill. It seems very impressive! I would like to hear any recent news about it. Does anyone know more? I have no way to contact him. Best regards

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-23 Thread weijie tong
I am working on pushing joins down to the Druid storage plugin. In my experience, you first need to write a rule that uses your storage plugin's metadata to decide whether the join can be pushed down. If it can, you transform the join node into a scan node that carries the query-relevant information. T

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-28 Thread weijie tong
My suggestion is to define a rule that matches the DrillJoinRel RelNode. In its onMatch method, traverse the join's children to find the ScanRel nodes. Then define a new ScanRel that includes the ScanRel nodes found in the previous step, and transform the JoinRel into this equivalent new ScanRel.
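The suggestion above can be sketched as a small tree rewrite. This is only an illustration under loud assumptions: `Node`, `Scan`, `Join`, and `pushDown` are hypothetical stand-ins written for this sketch, not Calcite's `RelNode` or Drill's `DrillJoinRel`/`ScanRel`, and a real rule would also have to check that the storage plugin can actually evaluate the join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JoinToScanSketch {
    /** Hypothetical plan node; NOT Calcite's RelNode. */
    interface Node {}

    /** Stand-in for a scan node: records which tables it reads. */
    static final class Scan implements Node {
        final List<String> tables;
        Scan(List<String> tables) { this.tables = tables; }
    }

    /** Stand-in for a join node with two inputs. */
    static final class Join implements Node {
        final Node left;
        final Node right;
        Join(Node left, Node right) { this.left = left; this.right = right; }
    }

    /** Walks the join's children depth-first and collects every scan found. */
    static void collectScans(Node node, List<Scan> out) {
        if (node instanceof Scan) {
            out.add((Scan) node);
        } else if (node instanceof Join) {
            Join join = (Join) node;
            collectScans(join.left, out);
            collectScans(join.right, out);
        }
    }

    /** Replaces a join subtree with one scan that covers all joined tables. */
    static Node pushDown(Join join) {
        List<Scan> scans = new ArrayList<>();
        collectScans(join, scans);
        List<String> tables = new ArrayList<>();
        for (Scan scan : scans) {
            tables.addAll(scan.tables);
        }
        return new Scan(tables);
    }

    public static void main(String[] args) {
        Join plan = new Join(
                new Scan(Arrays.asList("orders")),
                new Join(new Scan(Arrays.asList("customers")),
                         new Scan(Arrays.asList("regions"))));
        Scan merged = (Scan) pushDown(plan);
        System.out.println(merged.tables); // [orders, customers, regions]
    }
}
```

In a planner rule, the body of `pushDown` would correspond to the work done in onMatch, with the merged scan handed back to the planner as the equivalent node.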

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-28 Thread weijie tong
To avoid misunderstanding: the new equivalent ScanRel should hold the GroupScans of the joined ScanRel nodes, as the GroupScans indirectly hold the underlying storage information. On Wed, Mar 29, 2017 at 10:15 AM, weijie tong wrote: > > my suggestion is you define a rule which

what's the difficult points to cache the PrepareStatement's execution plan?

2017-03-28 Thread weijie tong
If we could cache the execution plan of a prepared statement, as traditional relational databases do, we could save the time spent on logical and physical planning. I want to know what the potential difficulties are. If we list these difficulties out, we may find some solutions to solve them directly

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-29 Thread weijie tong
re hopefully you may offer > more guidance. THANKS A LOT. > > *-* > *Muhammad Gelbana* > http://www.linkedin.com/in/mgelbana > > On Wed, Mar 29, 2017 at 4:23 AM, weijie tong > wrote: > > > to avoid misunderstanding , the new equivalent ScanRel is t

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-31 Thread weijie tong
how to obtain those ! >3. Precisely, what kind of object should I use to represent a *RelNode* >that represents the whole join ? I understand that I need to use an > object >that has implements the *RelNode* interface. Then I should add the >created *GroupScan* to th

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-04-06 Thread weijie tong
different set of fields ? How can I resolve this ? > > > >6. *List*: I assume I can call this method and pass my > >columns names to it, one by one. (i.e. > >*org.apache.drill.common.expression.SchemaPath. > getCompoundPath(String...)* > >) > &

questions about Drill's implementation detail to implement HashJoin

2017-05-13 Thread weijie tong
Hi Drillers: Could anyone give a detailed description of Drill's HashJoin implementation, ideally with a picture of the HashTable's data structure? I also wonder how the HashTable implementations of Drill and Flink differ. It seems Drill uses a batch model while Flink does not. Am I right?

Questions about Drill's multi-thread model

2017-07-19 Thread weijie tong
Hi there, Our production environment has a situation where, if one query is blocked by the storage, all queries that arrive later take a very long time to run even though they actually need little time. At that time, the cluster's load is not too high. I know that every foreman will r

Re: Questions about Drill's multi-thread model

2017-07-21 Thread weijie tong
is an issue with some particular piece of code. > > What type of file is being read? In what environment? > > > > Thanks, > > > > - Paul > > > > > On Jul 19, 2017, at 7:20 AM, weijie tong > > wrote: > > > > > > Hi there, > > > Our pr

Re: Questions about Drill's multi-thread model

2017-07-21 Thread weijie tong
an looking for uses of the Druid data > source, and only add those to the Druid queue. Visitors exist which you can > implement to obtain the required info. > > The result would be that Druid queries block waiting for capacity on Druid > to become available, while all other queries ru

Drill query planning error

2017-07-26 Thread weijie tong
Hi all: I materialize the count distinct query result to a cache; then, when a user queries the count distinct, a specific rule translates the query to the cache. It works correctly when the query has only one count(distinct) operator, but when it has two count(distinct) operators it causes an error. Th

Re: Drill query planning error

2017-07-26 Thread weijie tong
sounds like after > you > > > did the transformation to use the cache, that scalar property somehow > did > > > not get propagated. > > > You can override this behavior by a session configuration: (this will > > use > > > a nested loop join even if

Re: Drill query planning error

2017-07-26 Thread weijie tong
hen 2 or more cartesian joins are present. > > Please file a JIRA with relevant details. > > > > On Wed, Jul 26, 2017 at 3:13 PM, weijie tong <mailto:tongweijie...@gmail.com>> > > wrote: > > > >> Thanks for pointing out the possible reasons @Aman @Julian .

Re: Drill query planning error

2017-07-26 Thread weijie tong
here is the jira issue link: https://issues.apache.org/jira/browse/DRILL-5691 On Thu, Jul 27, 2017 at 8:32 AM, weijie tong wrote: > another tips ,the coun distinct query examples mentioned above are all > transferred by my rule,that is from Aggregate,Aggregate,Project,Scan to > Pro

Which code compiler is better

2017-07-30 Thread weijie tong
The compile process is long when we have 20 sum or avg expressions and the compiler is Janino. But if we change the compiler to the JDK one, the compile time drops. It seems the JDK compiler is better. If that's true, why not make the JDK compiler the default?

Re: Which code compiler is better

2017-07-31 Thread weijie tong
r. Newer > JDK could potentially be faster, so we would need to do the comparison > again. Perhaps you should file a JIRA with your observations. > > There is also the complexity of the expressions. For simple expressions, > my understanding is Janino is typically better. Ca

Re: [GitHub] drill issue #888: Merge pull request #1 from apache/master

2017-08-02 Thread weijie tong
sorry for that wrong operation , I have closed it. On Wed, Aug 2, 2017 at 5:20 PM, arina-ielchiieva wrote: > Github user arina-ielchiieva commented on the issue: > > https://github.com/apache/drill/pull/888 > > @weijietong could you please close this PR? > > > --- > If your project is se

Questions about rpc

2017-08-07 Thread weijie tong
There's a case where the partitioned sender spent a lot of time waiting. From the profile, we saw that the sender waits for 1 hour, while the opposite receiver and its subsequent hash aggregate operator spend 1 hour on processing. (39 sender minor fragments, 7 receiver minor fragments, each sender

IntelliJ code format

2017-08-08 Thread weijie tong
The IntelliJ code format downloaded from the Drill web site seems to use a 4-space indent, while the Eclipse one uses a 2-space indent. I wonder whether it's a problem with my personal environment or an error in the provided IntelliJ code format?

Re: IntelliJ code format

2017-08-08 Thread weijie tong
the download site url : https://drill.apache.org/docs/apache-drill-contribution-guidelines/ On Tue, Aug 8, 2017 at 10:59 PM, weijie tong wrote: > The IntelliJ code format downloaded from the Drill web site seems to have > 4 indent .The eclipse one is 2 indent. Wonder ti's my p

Re: Which code compiler is better

2017-08-11 Thread weijie tong
nse: > > In DRILL-4778, JDK was faster in compilation but generated slower code. > Janino was slower in compilation and generate faster code. Your JIRA did > not mention how was the performance when running generated code. You may > want to test this aspect as well. > > > From: w

Re: Which code compiler is better

2017-08-11 Thread weijie tong
java to extend the template. On Fri, 11 Aug 2017 at 5:52 PM weijie tong wrote: > @chunhui we just adjust different compiler options ,the generating code > strategy does not affected by the compiler option. so I think the > different result just reflects the compiler's performan

Re: [GitHub] drill pull request #904: DRILL-5717: let some date time test cases be Local ...

2017-08-14 Thread weijie tong
Thanks for the advice ,will consider that. On Mon, 14 Aug 2017 at 7:56 PM vvysotskyi wrote: > Github user vvysotskyi commented on a diff in the pull request: > > https://github.com/apache/drill/pull/904#discussion_r132932858 > > --- Diff: > exec/java-exec/src/main/codegen/templates/DateI

Discuss about Drill's schedule policy

2017-08-20 Thread weijie tong
Hi all: Drill's current scheduling policy seems a little simple. The SimpleParallelizer assigns endpoints in a round-robin fashion, which ignores the system's load and other factors. In critical scenarios, some drillbits suffer frequent full GCs, which block their control RPC. Current a

Re: Discuss about Drill's schedule policy

2017-08-21 Thread weijie tong
la schedulers, so we’re > somewhat hesitant to move away from a purely symmetrical configuration. > Suggestions in this area are very welcome. > > For now, try turning on the ZK queues to limit concurrent queries and > prevent overload. Ensure your cluster is sized for your w

Re: IntelliJ code format

2017-08-21 Thread weijie tong
@padma ,what's the process? On Wed, 9 Aug 2017 at 1:04 AM Padma Penumarthy wrote: > You are right. It is configured as 4. We should fix that. > > Thanks, > Padma > > > > On Aug 8, 2017, at 8:12 AM, weijie tong wrote: > > > > the download site url :

Re: [ANNOUNCE]: New committer: Ankush Kapur

2020-08-05 Thread weijie tong
Congratulations Ankush! On Thu, Aug 6, 2020 at 2:37 AM Charles Givre wrote: > The Project Management Committee (PMC) for Apache [PROJECT] has invited > Ankush Kapur to become a committer and we are pleased to announce that he > has accepted. > > Being a committer enables easier contribution to t

Re: [VOTE]: James Turton for Committer

2020-11-05 Thread weijie tong
+1 On Fri, Nov 6, 2020 at 1:18 AM Ted Dunning wrote: > I think that looks like a great addition. > > +1 for James. > > I don't think that lazy consensus is a great idea, however. Happily, you > now have three positives. > > > On Wed, Nov 4, 2020 at 11:00 AM Charles Givre wrote: > > > Hello all,

How to generate hash code for each build side one of the hash join columns

2018-05-28 Thread weijie tong
Hi all: While implementing the JPPD feature ( https://issues.apache.org/jira/browse/DRILL-6385) , I was blocked by this problem: how to get the hash code of each build-side hash join column through the dynamically generated Java code. I hope someone can give some advice. I supposed to add

Re: How to generate hash code for each build side one of the hash join columns

2018-05-28 Thread weijie tong
uld you also > > JBlock ifBlock = > > cg.getEvalBlock()._if(fieldIdParamHolder.getValue().eq(targe > > tBuildSideFieldId))._then(); > > > > > > > > On Mon, May 28, 2018 at 4:17 AM, weijie tong > > wrote: > > > >> HI All: > >> Through

Re: How to generate hash code for each build side one of the hash join columns

2018-05-28 Thread weijie tong
ntly and can > probably give you additional pointers. > > Thanks, > - Paul > > > > On Monday, May 28, 2018, 8:52:19 PM PDT, weijie tong < > tongweijie...@gmail.com> wrote: > > @aman thanks for your reply. "For the ifBlock, do you need an _else() > b

Re: How to generate hash code for each build side one of the hash join columns

2018-05-28 Thread weijie tong
e, May 29, 2018 at 1:47 PM weijie tong wrote: > HI Paul: > > Thanks for your enthusiasm. I have managed this skill as you ever > mentioned me at another mail thread. It's really helpful ,thanks for your > valuable work. > > Now I have solved this tough problem by ad

Re: How to generate hash code for each build side one of the hash join columns

2018-05-29 Thread weijie tong
I found that ClassGenerator's nestEvalBlock(JBlock block) and unNestEvalBlock() have the same effect as the change I made to ClassGenerator. So I am dropping my change to ClassGenerator, and I hope this can help someone else. On Tue, May 29, 2018 at 1:53 PM weijie tong wrote: >

Re: How to generate hash code for each build side one of the hash join columns

2018-05-30 Thread weijie tong
spill-to-disk feature. > So, this may pose some integration challenges for your run-time join > pushdown feature. > Also, one other question/clarification: for the bloom filter itself are > you implementing it natively in Drill or using an external library ? > > -Aman > > O

Re: How to generate hash code for each build side one of the hash join columns

2018-05-31 Thread weijie tong
hese are the first columns; their number can be found from the > config: e.g., htConfig.getKeyExprsBuild().size() ) > >With such implementation, that evalHash() could be used anywhere (e.g., > to match the Bloom filters on the left side of the join). > >Thanks, > >

Re: How to generate hash code for each build side one of the hash join columns

2018-06-01 Thread weijie tong
size. So why is SelectionVector4 not supported by the ProjectBatch? The same question applies to the FilterBatch's SelectionVector2, which also only supports a 2-byte record count. On Fri, Jun 1, 2018 at 1:40 PM weijie tong wrote: > Hi Boaz: > > Your propose is valuable though I h

Re: How to generate hash code for each build side one of the hash join columns

2018-06-01 Thread weijie tong
I found the answer: a RecordBatch's maximum size is 2^16, which is defined by RecordBatch's MAX_BATCH_SIZE. On Fri, Jun 1, 2018 at 3:36 PM weijie tong wrote: > Some questions about SelectionVector2 and SelectionVector4: > > I want to create SelectionVector4 or SelectionVecto
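To illustrate why a 2^16 batch limit and a 2-byte selection index fit together, here is a minimal sketch. It assumes nothing about Drill's real SelectionVector2 beyond what the messages above state; the class `Sv2Sketch` and its methods are hypothetical, and the sketch uses a plain `java.nio.ByteBuffer` rather than Drill's buffers.

```java
import java.nio.ByteBuffer;

class Sv2Sketch {
    /** 2^16 = 65536, the batch limit quoted in the message above. */
    static final int MAX_BATCH_SIZE = 1 << 16;

    /** Two bytes per entry; each entry is an unsigned 16-bit record index. */
    private final ByteBuffer buf;

    Sv2Sketch(int entries) {
        this.buf = ByteBuffer.allocate(entries * 2);
    }

    /** Stores a record index at the given position of the selection vector. */
    void setIndex(int position, int recordIndex) {
        if (recordIndex < 0 || recordIndex >= MAX_BATCH_SIZE) {
            throw new IllegalArgumentException("index does not fit in 2 bytes: " + recordIndex);
        }
        // Java's char is an unsigned 16-bit type, so it covers 0..65535 exactly.
        buf.putChar(position * 2, (char) recordIndex);
    }

    /** Reads the record index back; widening char to int keeps it unsigned. */
    int getIndex(int position) {
        return buf.getChar(position * 2);
    }

    public static void main(String[] args) {
        Sv2Sketch sv2 = new Sv2Sketch(2);
        sv2.setIndex(0, 7);
        sv2.setIndex(1, MAX_BATCH_SIZE - 1);
        System.out.println(sv2.getIndex(0) + " " + sv2.getIndex(1)); // 7 65535
    }
}
```

Any index from 0 to 65535 round-trips through two bytes, so a batch capped at 2^16 records never needs a wider per-record index.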

Re: How to generate hash code for each build side one of the hash join columns

2018-06-01 Thread weijie tong
2018 at 5:14 PM weijie tong wrote: > I find the answer that RecordBatch's max size is 2^16 which is defined at > RecordBatch's MAX_BATCH_SIZE. > > On Fri, Jun 1, 2018 at 3:36 PM weijie tong > wrote: > >> Some questions about SelectionVector2 and SelectionVe

Re: How to generate hash code for each build side one of the hash join columns

2018-06-01 Thread weijie tong
ject reference of SV2 java object. Only the > underlying buffer for the SV2 object will change to store new indexes for > new incoming batch. It's only after OK_NEW_SCHEMA outcome is seen when > setup will be again called. > > > Thanks, > Sorabh > > > _

Re: [ANNOUNCE] New Committer: Padma Penumarthy

2018-06-19 Thread weijie tong
Congratulations Padma! On Tue, Jun 19, 2018 at 4:41 AM salim achouche wrote: > Congratulations Padma! > > Regards, > Salim > > > On Jun 15, 2018, at 9:58 AM, Vitalii Diravka > wrote: > > > > Congrats Padma! > > > > Kind regards > > Vitalii > > > > > > On Fri, Jun 15, 2018 at 7:40 PM Arina Ielch

how to release allocated ByteBuf which steps across two threads

2018-06-19 Thread weijie tong
Hi: I ran into a complicated problem when releasing the BloomFilter's direct memory in some special cases. I hope someone can give some advice. Say one join node sends a BloomFilter to the foreman node (TestHashJoin.simpleEqualityJoin()). The sending thread is netty's BitClient. The BloomFi

Re: how to release allocated ByteBuf which steps across two threads

2018-06-20 Thread weijie tong
the Bloom Filter > > reaching its destination. How does the destination fragment know when it > > has to wait for the Bloom Filter? I suspect this may be more > > complicated than it appears at first glance. > > > > Not sure if this helps narrow it down. If you can s

Re: how to release allocated ByteBuf which steps across two threads

2018-06-20 Thread weijie tong
to release the ByteBuf maybe happed behind the allocator thread. On Thu, Jun 21, 2018 at 8:51 AM weijie tong wrote: > Hi Parth: > > Thanks for your reply. Your detail description explain that problem > clearly. This problem is not a common case. The bloom filter has not been >

Re: how to release allocated ByteBuf which steps across two threads

2018-06-20 Thread weijie tong
describe above. By following this pattern, I solved the problem. I really appreciate your advice, thank you so much! On Thu, Jun 21, 2018 at 9:13 AM weijie tong wrote: > I also think this is a common problem to the case that the receiver has no > chance to sent out a ack reply,

Re: [ANNOUNCE] New PMC member: Vitalii Diravka

2018-06-26 Thread weijie tong
Congratulations Vitalii! On Wed, Jun 27, 2018 at 8:11 AM Paul Rogers wrote: > Congratulations Vitalii! > - Paul > > > > On Tuesday, June 26, 2018, 11:12:16 AM PDT, Aman Sinha < > amansi...@apache.org> wrote: > > I am pleased to announce that Drill PMC invited Vitalii Diravka to the PMC > an

Discussion about the metadata design

2018-06-28 Thread weijie tong
Hi all: @aman once pointed me to the roadmap of DRILL-2.0, which includes a description of the metadata design ( https://lists.apache.org/thread.html/74cf48dd78d323535dc942c969e72008884e51f8715f4a20f6f8fb66@%3Cdev.drill.apache.org%3E) . I am interested in taking the role to implement

Re: Discussion about the metadata design

2018-06-28 Thread weijie tong
elchiyeva < > arina.yelchiy...@gmail.com> > wrote: > > > Hi, > > > > Vitalii and Vova is also looking at this part, you might want to sync up > > with them. Or even better, we can create Jira for this and held all > > discussions there. > > Vitalii,

Actual vectorization execution

2018-06-29 Thread weijie tong
Hi all: I have investigated the JIT assembly code of some vector-friendly Java code with the JITWatch tool, and found that the JVM did not generate the expected AVX code. According to conclusions from JVM experts, the JVM generates AVX code only in a restricted set of cases. I found Intel h

Re: Actual vectorization execution

2018-06-29 Thread weijie tong
the test program or workload you were running was already written > to exploit vectorization. Have you also looked into > Drill's code-gen to see which ones are amenable to vectorization ? We > could start with some small use case and expand. > > [1] > http://www.oracle.co

Re: [DISCUSSION] current project state

2018-08-14 Thread weijie tong
My thoughts on this topic: Drill does well now. But to be better, we need to be idealists and bring in more use cases or more advanced query performance compared to other projects like Flink, Spark, Presto, and Impala. On performance, I wonder whether we need to adopt the Gandiva project, which is so exciting o

[DISCUSSION] Does schema-free really need

2018-08-15 Thread weijie tong
Hi all: I hope this statement does not seem too rash to you. Drill claims to be a schema-free distributed SQL engine. It takes a lot of work to make the execution engine support this, in order to support JSON-like storage formats. It makes bugs easier to introduce and makes the code logic ugly. I wonder do we still insis

Re: [DISCUSSION] Does schema-free really need

2018-08-15 Thread weijie tong
lly running a query. That gives us pretty much the > flexibility of schema on read without as much of the burden. > > > > On Wed, Aug 15, 2018 at 5:02 PM weijie tong > wrote: > > > Hi all: > > Hope the statement not seems too dash to you. > > Drill claims

Re: [DISCUSSION] Does schema-free really need

2018-08-15 Thread weijie tong
for > > Drill would be schema-on-read. > > 2: I would not call it a battle between non-relational data and > > relational engine. The extended relational model has type of > > array/composite types, similar to what Drill has. > > > > > > > > > >

Re: [ANNOUNCE] New PMC member: Boaz Ben-Zvi

2018-08-17 Thread weijie tong
Congrats Boaz! On Fri, Aug 17, 2018 at 5:56 PM Vitalii Diravka wrote: > Congrats Boaz! > > Kind regards > Vitalii > > > On Fri, Aug 17, 2018 at 12:51 PM Arina Ielchiieva > wrote: > > > I am pleased to announce that Drill PMC invited Boaz Ben-Zvi to the PMC > and > > he has accepted the invitati

Re: [ANNOUNCE] New PMC member: Volodymyr Vysotskyi

2018-08-25 Thread weijie tong
Congratulations Volodymyr! On Sat, Aug 25, 2018 at 8:30 AM salim achouche wrote: > Congrats Volodymyr! > > On Fri, Aug 24, 2018 at 11:32 AM Gautam Parai wrote: > > > Congratulations Vova! > > > > Gautam > > > > On Fri, Aug 24, 2018 at 10:59 AM, Khurram Faraaz > wrote: > > > > > Congratulations

Re: [DISCUSS] Deprecation policy in Drill

2018-08-27 Thread weijie tong
I think we should keep these deprecated options so that users can upgrade more easily. Another solution: if we remove the deprecated options, we should add a startup check to let users know they have been removed. On Mon, Aug 27, 2018 at 3:54 PM Arina Ielchiieva wrote: > Hi all, > > when it sho

Re: [ANNOUNCE] New Committer: Weijie Tong

2018-09-01 Thread weijie tong
Weijie, thanks for your contributions to Drill. > >>>> Thanks, > >>>> - Paul > >>>> > >>>> > >>>> > >>>> On Friday, August 31, 2018, 8:51:30 AM PDT, Arina Ielchiieva < > >>>> ar...@apache.org>

Re: Possible way to specify column types in query

2018-09-06 Thread weijie tong
Google's latest paper about F1 [1] claims to support any data source through an extension API called TVF (see section 6.3). It also needs the column data types to be declared before the query. [1] http://www.vldb.org/pvldb/vol11/p1835-samwel.pdf On Fri, Sep 7, 2018 at 9:47 AM Paul Rogers wrote: > Hi All, > >

[Discuss] Integrate Arrow gandiva into Drill

2019-04-03 Thread weijie tong
Hi: Gandiva is a subproject of Arrow. Using LLVM codegen and SIMD techniques, Arrow Gandiva can achieve better query performance. Arrow and Drill have similar columnar memory formats. The main difference now is the null representation. Arrow has also made great changes to the ValueVector. To adopt Arro

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-04 Thread weijie tong
target for publishing > desired methods with package access ? > > Thanks, Igor > > On Thu, Apr 4, 2019 at 9:51 AM weijie tong > wrote: > > > > HI : > > > > Gandiva is a sub project of Arrow. Arrow gandiva using LLVM codegen and > > simd skill could ach

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-04 Thread weijie tong
I have a question about the ProjectRecordBatch implementation, and hope someone can explain it. At line 234 of ProjectRecordBatch, in what case is the projector's output row count less than the input size? On Thu, Apr 4, 2019 at 5:11 PM weijie tong wrote: > Hi Igor: > Th

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-04 Thread weijie tong
Volodymyr Vysotskyi > > > On Thu, Apr 4, 2019 at 5:17 PM weijie tong > wrote: > > > I have a doubt about the ProjectRecordBatch implementation. Hope someone > > could give an explanation about that. To the line 234 of > > ProjectRecordBatch, at what case,the proje

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-05 Thread weijie tong
alues, I think line 234 of ProjectRecordBatch will never be executed. Only since DRILL-6340, where we control the output batch memory size, has that part of the code finally come into use. If I am wrong, please let me know. On Fri, Apr 5, 2019 at 12:15 AM weijie tong wrote: > Thanks for the reply, But

Re: [ANNOUNCE] New PMC member: Sorabh Hamirwasia

2019-04-05 Thread weijie tong
Congratulations Sorabh! On Sat, Apr 6, 2019 at 7:17 AM Sorabh Hamirwasia wrote: > Thank You everyone for your wishes!! > > Looking forward for everyone's help to vote on release candidate next week > :) > > Thanks, > Sorabh > > On Fri, Apr 5, 2019 at 2:12 PM Parth Chandra wrote: > > > Congrats

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-05 Thread weijie tong
rate operator in > > Drill but I think that code is there to handle such cases. > > > > Thanks, > > Sorabh > > > > On Fri, Apr 5, 2019 at 6:08 AM weijie tong > > wrote: > > > > > The first appearance of the compari

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-18 Thread weijie tong
am very curious how you were able to > solve these problems. > > > Thanks, > > - Paul > > > > On Wednesday, April 3, 2019, 11:51:34 PM PDT, weijie tong < > tongweijie...@gmail.com> wrote: > > HI : > > Gandiva is a sub project of Arrow. Arrow g

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-23 Thread weijie tong
t; > > If the user wants to use Gandiva, he/she could set a config option to > > point to the Gandiva library (and supporting files, if any.) Or, use the > > existing LD_LIBRARY_PATH env. variable. > > > > Thanks, > > - Paul > > > > > > > &g

Questions about bushy join

2019-05-27 Thread weijie tong
Hi all: Does anyone know why we don't support bushy joins in query plan generation when the HEP planner is enabled? The codebase shows that PlannerPhase.JOIN_PLANNING uses LoptOptimizeJoinRule, not Calcite's MultiJoinOptimizeBushyRule.
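For readers unfamiliar with the term, the difference between a left-deep and a bushy join tree can be sketched as follows. This only illustrates the two plan shapes; it is not how LoptOptimizeJoinRule or MultiJoinOptimizeBushyRule choose a join order, and `Plan`, `Table`, `Join`, `leftDeep`, and `bushy` are hypothetical names made up for the sketch.

```java
class JoinShapeSketch {
    /** Hypothetical plan node that knows its depth and can print itself. */
    interface Plan {
        int depth();
        String show();
    }

    static final class Table implements Plan {
        final String name;
        Table(String name) { this.name = name; }
        public int depth() { return 0; }
        public String show() { return name; }
    }

    static final class Join implements Plan {
        final Plan left;
        final Plan right;
        Join(Plan left, Plan right) { this.left = left; this.right = right; }
        public int depth() { return 1 + Math.max(left.depth(), right.depth()); }
        public String show() { return "(" + left.show() + " JOIN " + right.show() + ")"; }
    }

    /** Left-deep shape: every join's right input is a single table. */
    static Plan leftDeep(String... tables) {
        Plan plan = new Table(tables[0]);
        for (int i = 1; i < tables.length; i++) {
            plan = new Join(plan, new Table(tables[i]));
        }
        return plan;
    }

    /** Bushy shape: both inputs of a join may themselves be joins. */
    static Plan bushy(String[] tables, int from, int to) {
        if (to - from == 1) {
            return new Table(tables[from]);
        }
        int mid = (from + to) / 2;
        return new Join(bushy(tables, from, mid), bushy(tables, mid, to));
    }

    public static void main(String[] args) {
        String[] tables = {"A", "B", "C", "D"};
        System.out.println(leftDeep(tables).show());                // (((A JOIN B) JOIN C) JOIN D)
        System.out.println(bushy(tables, 0, tables.length).show()); // ((A JOIN B) JOIN (C JOIN D))
    }
}
```

With four tables the left-deep tree has depth 3 while the bushy tree has depth 2, which is why bushy plans can expose more independent joins to run in parallel.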

Re: Questions about bushy join

2019-05-27 Thread weijie tong
nitude more than > estimated. This could happen easily in big data systems where statistics > are constantly changing due to new data ingestion and even running ANALYZE > continuously is not feasible. > That said, it is not a bad idea to experiment with such plans with say more > than 5 tab

Re: Questions about bushy join

2019-05-29 Thread weijie tong
k, 0.0 > memory}, id = 10932 > 00-20Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=classpath:/tpch/region.parquet]], > selectionRoot=classpath:/tpch/region.parquet, numFiles=1, > usedMetadataFile=false, columns=[`r_regionkey`, `r_name`]

Re: Drill storage plugin for IPFS, any suggestion is welcome :)

2019-07-08 Thread weijie tong
It is amazing to see Paul's Chinese words of welcome! I am also glad to hear about Wang Liang's use case for Drill, and he is welcome to contribute it as a Drill storage plugin. On Tue, Jul 9, 2019 at 1:00 AM Paul Rogers wrote: > 王亮 你好, > > > Very creative use of Drill! We usually think of Drill as a tool for "

Re: Apache Drill Hangout - July 9, 2019

2019-07-08 Thread weijie tong
I could give a short talk about my recent work on parallel HashJoin and some other things. On Mon, Jul 8, 2019 at 7:28 PM Bohdan Kazydub wrote: > Hi Drillers, > > We will have our bi-weekly hangout tomorrow, July 9th, at 10 AM PST > (link: https://meet.google.com/yki-iqdf-tai ). > > If there a

Re: Apache Drill Hangout - July 9, 2019

2019-07-10 Thread weijie tong
me that works for you in response to the email, so that Apache Drill > > community decides how to proceed with this (i.e. we find a convenient > time > > that works for all interested in the topic). > > > > Kind regards, > > Bohdan Kazydub > > > > On Tue, Jul 9,

Re: [ANNOUNCE] New Committer: Bohdan Kazydub

2019-07-23 Thread weijie tong
> > > > > > > Congratulations Bohdan! > > > > > > > > > > > > Gautam > > > > > > > > > > > > On Mon, Jul 15, 2019 at 11:53 PM Bohdan Kazydub > > > > > bohdan.kazy...@gmail.com> &

Re: [ANNOUNCE] New Committer: Igor Guzenko

2019-07-23 Thread weijie tong
Congratulations Igor! On Wed, Jul 24, 2019 at 1:23 AM Igor Guzenko wrote: > Hello Drillers, > > Thank you all for the greetings. It is an honor for me to be part of the > Apache Drill community. > > Best regards, > Igor Guzenko > > On Tue, Jul 23, 2019 at 6:37 PM Charles Givre wrote: > > > Cong

Re: Apache Drill Hangout July 23rd

2019-07-23 Thread weijie tong
Sorry for missing the meeting. I forgot to set the alarm and overslept. I can't join the meeting now; it has probably finished. I will submit the ParallelHashJoin PR soon. On Tue, Jul 23, 2019 at 10:14 AM Aman Sinha wrote: > Hi Drillers, > > We will have our bi-weekly hangout tomorrow, Jul

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2019-08-25 Thread weijie tong
Congratulations Charles. On Sat, Aug 24, 2019 at 11:33 AM Robert Hou wrote: > Congratulations Charles, and thanks for your contributions to Drill! > > Thank you Arina for all you have done as PMC Chair this past year. > > --Robert > > On Fri, Aug 23, 2019 at 4:16 PM Khurram Faraaz > wrote: > >

Re: Discuss about Drill's schedule policy

2017-08-23 Thread weijie tong
e can have different scheduler implementations (central or non-central ,maybe non-central like sparrow be the default one ). On Mon, Aug 21, 2017 at 11:51 PM, weijie tong wrote: > Thanks for all your suggestions. > > @paul your analysis is impressive . I agree with your opinion. C

Re: Discuss about Drill's schedule policy

2017-08-27 Thread weijie tong
could adopt the core of Sparrow (or whatever) with the algorithm needed > for Drill to avoid the need to invent yet another new scheduler. > > Thanks, > > - Paul > > > [1] https://www.usenix.org/system/files/conference/osdi14/ > osdi14-paper-boutin_0.pdf > > On Aug 23,

Re: Discuss about Drill's schedule policy

2017-08-27 Thread weijie tong
d let MajorFragments execute from top to leaf ,then the corresponding execution tasks from top to down are all sure to be allocated to do the pipeline works. On Sun, 27 Aug 2017 at 7:46 PM weijie tong wrote: > Hi Paul: > >I have read the codes of Sparrow and Spark-Sparrow last few da

Re: IntelliJ code format

2017-09-07 Thread weijie tong
updated now with the new jar. > Please check it out and let us know if any other issues. > > Thanks, > Padma > > > On Aug 21, 2017, at 8:59 AM, weijie tong tongweijie...@gmail.com>> wrote: > > @padma ,what's the process? > > On Wed, 9 Aug 2017 at 1:04 AM P

Propose about join push down

2017-09-19 Thread weijie tong
All: This is a proposal for tuning join queries by pushing down the join condition. Suggestions, discussion, and objections are welcome. Suppose we have a join query "select t2.a, t2.s, t3.d from (select a, sum(b) as s from t1 where a='1' group by a) t2 join t3 on t2.a = t3.a". This query will be transfer

Re: Propose about join push down

2017-09-20 Thread weijie tong
perators would > need to accept a next() call with some “data” parameter). > > Boaz > > > On 9/19/17, 8:45 AM, "weijie tong" wrote: > > All: >This is a propose about join query tuning by pushing down the join > con

Re: Propose about join push down

2017-09-20 Thread weijie tong
For the NLJ, indeed the current > Drill does not support “down flow” of data (and most storage does not have > indexes), and it’ll take some work to implement (e.g., all operators would > need to accept a next() call with some “data” parameter). > > Boaz > -

Re: [ANNOUNCE] New Committer: Vitalii Diravka

2017-12-12 Thread weijie tong
Congratulations Vitalii On Wed, 13 Dec 2017 at 5:50 AM Khurram Faraaz wrote: > Congratulations Vitalii. > > > Regards, > > Khurram > > > From: Vlad Rozov > Sent: Tuesday, December 12, 2017 1:10:09 PM > To: dev@drill.apache.org > Subject: Re: [ANNOUNCE] New Commi

Re: [ANNOUNCE] New Committer: Boaz Ben-Zvi

2017-12-14 Thread weijie tong
Congratulations , Boaz! On Thu, 14 Dec 2017 at 7:21 AM Boaz Ben-Zvi wrote: > > Thank you all for the warm wishes; I hope to be worthy of the new status > … > > Boaz > > On 12/13/17, 2:45 PM, "Vlad Rozov" wrote: > > Congrats! > > Thank you, > > Vlad > > On 12/13/17 14:4

Re: [ANNOUNCE] New Committer: Chunhui Shi

2018-09-28 Thread weijie tong
Congratulations Chunhui ! On Fri, Sep 28, 2018 at 10:58 PM Abhishek Girish wrote: > Congrats Chunhui! > On Fri, Sep 28, 2018 at 7:39 AM Vova Vysotskyi wrote: > > > Congratulations! Well deserved! > > > > Kind regards, > > Volodymyr Vysotskyi > > > > > > On Fri, Sep 28, 2018 at 12:17 PM Arina Ie

Re: [ANNOUNCE] New Committer: Gautam Parai

2018-10-22 Thread weijie tong
Congratulations Gautam ! On Tue, Oct 23, 2018 at 6:28 AM Aman Sinha wrote: > Congratulations Gautam ! > > On Mon, Oct 22, 2018 at 3:00 PM Jyothsna Reddy > wrote: > > > Congrats Gautam!! > > > > > > > > On Mon, Oct 22, 2018 at 2:01 PM Vitalii Diravka > > wrote: > > > > > Congratulations! > > >

Re: [HANGOUT] 29th Oct 2018 (9PM PST)

2018-10-29 Thread weijie tong
Hi : Thanks for the invitation. Here is slide: JPPD On Tue, Oct 30, 2018 at 12:12 PM Pritesh Maker wrote: > Hi, > > Apologies for the late notice - we are currently having a Hangout with > Weij

Re: [ANNOUNCE] New Committer: Hanumath Rao Maduri

2018-11-01 Thread weijie tong
Congratulations, Hanu! On Fri, Nov 2, 2018 at 8:22 AM Robert Hou wrote: > Congratulations, Hanu. Thanks for contributing to Drill. > > --Robert > > On Thu, Nov 1, 2018 at 4:06 PM Jyothsna Reddy > wrote: > > > Congrats Hanu!! Well deserved :D > > > > Thank you, > > Jyothsna > > > > On Thu, Nov

Re: [ANNOUNCE] New Committer: Karthikeyan Manivannan

2018-12-07 Thread weijie tong
Congratulations Karthik ! On Sat, Dec 8, 2018 at 12:10 PM Karthikeyan Manivannan wrote: > Thanks! In addition to all you wonderful Drillers, I would also like to > thank Google, StackOverflow and Larry Tesler > < > https://www.indiatoday.in/education-today/gk-current-affairs/story/copy-paste-inv

Re: Good DB theory references

2019-01-22 Thread weijie tong
Hi Paul: Thanks for sharing. I would like to share another good recent paper here, "Everything you always wanted to know about compiled and vectorized queries but were afraid to ask": http://www.vldb.org/pvldb/vol11/p2209-kersten.pdf It explains the two kinds of database execution architectur

[jira] [Created] (DRILL-7607) Dynamic credit based flow control

2020-02-26 Thread Weijie Tong (Jira)
Weijie Tong created DRILL-7607: -- Summary: Dynamic credit based flow control Key: DRILL-7607 URL: https://issues.apache.org/jira/browse/DRILL-7607 Project: Apache Drill Issue Type: New Feature

[jira] [Created] (DRILL-7656) Support injecting BufferManager into UDF

2020-03-20 Thread Weijie Tong (Jira)
Weijie Tong created DRILL-7656: -- Summary: Support injecting BufferManager into UDF Key: DRILL-7656 URL: https://issues.apache.org/jira/browse/DRILL-7656 Project: Apache Drill Issue Type: New

[jira] [Created] (DRILL-7663) Code refactor to DefaultFunctionResolver

2020-03-25 Thread Weijie Tong (Jira)
Weijie Tong created DRILL-7663: -- Summary: Code refactor to DefaultFunctionResolver Key: DRILL-7663 URL: https://issues.apache.org/jira/browse/DRILL-7663 Project: Apache Drill Issue Type: New