Re: drill's calcite branch

2017-02-27 Thread weijie tong
or-calcite/tree/DrillCalcite1.4.0 > > Kind regards > Arina > > On Mon, Feb 27, 2017 at 5:51 AM, weijie tong <tongweijie...@gmail.com> > wrote: > > > hi Drills: > > where can I find the calcite branch source code that drill are now > self > > maintaining ? > > >

drill's calcite branch

2017-02-26 Thread weijie tong
Hi Drills: where can I find the source code of the Calcite branch that Drill now maintains itself?

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-23 Thread weijie tong
I am working on pushing down joins to the Druid storage plugin. In my experience, you first need to write a rule that decides, from your storage plugin's metadata, whether the join can be pushed down. If it can, you transform the join node into a scan node that carries the query-relevant information.

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-29 Thread weijie tong
fore hopefully you may offer > more guidance. THANKS A LOT. > > *-----* > *Muhammad Gelbana* > http://www.linkedin.com/in/mgelbana > > On Wed, Mar 29, 2017 at 4:23 AM, weijie tong <tongweijie...@gmail.com> > wrote: > > > to avoid misunderstanding ,

what's the difficult points to cache the PrepareStatement's execution plan?

2017-03-29 Thread weijie tong
If we can cache a PrepareStatement's corresponding execution plan, as traditional relational databases do, we can save the logical and physical planning time. I want to know what the potential difficulties are. If we list these difficulties, we may find ways to solve them.
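A minimal sketch of the idea in the message above, in Python rather than Drill's Java: an LRU cache that maps SQL text to an already-built plan. All names (`PlanCache`, `toy_planner`, `schema_version`) are hypothetical and not part of Drill. The `schema_version` part of the key illustrates one plausible difficulty that the thread itself does not state: a cached plan may become stale when the underlying metadata changes.

```python
from collections import OrderedDict


class PlanCache:
    """Tiny LRU cache from (SQL text, schema version) to a plan (hypothetical)."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._plans = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_plan(self, sql, schema_version, planner):
        # The schema version is part of the key so that a plan built against
        # old metadata is never reused after the metadata has changed.
        key = (sql, schema_version)
        if key in self._plans:
            self._plans.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._plans[key]
        self.misses += 1
        plan = planner(sql)  # the expensive logical + physical planning step
        self._plans[key] = plan
        if len(self._plans) > self.capacity:
            self._plans.popitem(last=False)  # evict the least recently used plan
        return plan


def toy_planner(sql):
    # Stand-in for real planning; returns an opaque "plan" object.
    return ("PLAN", sql.strip().lower())


cache = PlanCache(capacity=2)
sql = "SELECT a FROM t WHERE b = ?"
p1 = cache.get_plan(sql, 1, toy_planner)  # miss: plan is built
p2 = cache.get_plan(sql, 1, toy_planner)  # hit: same plan object is reused
p3 = cache.get_plan(sql, 2, toy_planner)  # miss: schema version changed
```

Keying on the raw SQL text only works for parameterized statements; queries that differ only in literals would each get their own entry.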

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-28 Thread weijie tong
My suggestion is to define a rule that matches the DrillJoinRel RelNode. Then, in the onMatch method, traverse the join's children to find the ScanRel nodes. Define a new ScanRel that includes the ScanRel nodes found in the previous step, then transform the JoinRel into this equivalent new ScanRel.

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-28 Thread weijie tong
To avoid misunderstanding, the new equivalent ScanRel should hold the GroupScans of the joined ScanRel nodes, as the GroupScans indirectly hold the underlying storage information. On Wed, Mar 29, 2017 at 10:15 AM, weijie tong <tongweijie...@gmail.com> wrote: > > my suggestion is you d
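The rule described in the two messages above can be sketched on a toy plan tree. This is a Python model under stated assumptions, not Calcite or Drill code: `Scan`, `Join`, `collect_scans` and `push_down_join` are hypothetical stand-ins for ScanRel, DrillJoinRel, the traversal in onMatch, and the transformation. The group scans are represented as plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scan:
    plugin: str                      # which storage plugin serves this scan
    group_scans: List[str]           # stand-in for the GroupScan objects
    pushed_joins: List[str] = field(default_factory=list)


@dataclass
class Join:
    left: object                     # Scan or Join
    right: object                    # Scan or Join
    condition: str


def collect_scans(node):
    """Walk the join's children and return every Scan leaf, left to right."""
    if isinstance(node, Scan):
        return [node]
    return collect_scans(node.left) + collect_scans(node.right)


def collect_conditions(node):
    """Return the conditions of all joins in the subtree."""
    if isinstance(node, Scan):
        return []
    return collect_conditions(node.left) + collect_conditions(node.right) + [node.condition]


def push_down_join(join, pushable_plugins):
    """Replace the join by one equivalent Scan when the storage can run it."""
    scans = collect_scans(join)
    plugins = {s.plugin for s in scans}
    # Only push down when every input comes from one plugin that supports joins.
    if len(plugins) != 1 or not plugins <= pushable_plugins:
        return join
    merged = [gs for s in scans for gs in s.group_scans]
    return Scan(scans[0].plugin, merged, collect_conditions(join))


same_store = Join(Scan("druid", ["t1"]), Scan("druid", ["t3"]), "t1.a = t3.a")
mixed_store = Join(Scan("druid", ["t1"]), Scan("dfs", ["t3"]), "t1.a = t3.a")
pushed = push_down_join(same_store, {"druid"})
kept = push_down_join(mixed_store, {"druid"})
```

The new scan keeps the group scans of both inputs, which matches the clarification that the GroupScans carry the underlying storage information.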

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-03-31 Thread weijie tong
in those ! >3. Precisely, what kind of object should I use to represent a *RelNode* >that represents the whole join ? I understand that I need to use an > object >that has implements the *RelNode* interface. Then I should add the >created *GroupScan* to that *RelN

questions about FPGA plan to Drill

2017-03-16 Thread weijie tong
Hi all: I noticed that @Eric Fukuda has done some work on using FPGAs to power Drill. It seems very impressive! I would like to hear any recent news about that. Does anyone know about it? I have no way to contact him. Best Regards

Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-04-06 Thread weijie tong
)*). Why couldn't I provide a *RelDataType* with > >a different set of fields ? How can I resolve this ? > > > >6. *List*: I assume I can call this method and pass my > >columns names to it, one by one. (i.e. > > *org.apache.drill.common.expression.SchemaPath. > getCompoundPath(String.

Which code compiler is better

2017-07-30 Thread weijie tong
The compile process is long when we have 20 sum or avg expressions and the compiler is Janino. But if we change the compiler to JDK, the compile time is lower. It seems the JDK compiler is better. If that's true, why not make JDK the default one?

Re: [GitHub] drill issue #888: Merge pull request #1 from apache/master

2017-08-02 Thread weijie tong
Sorry for that wrong operation; I have closed it. On Wed, Aug 2, 2017 at 5:20 PM, arina-ielchiieva wrote: > Github user arina-ielchiieva commented on the issue: > > https://github.com/apache/drill/pull/888 > > @weijietong could you please close this PR? > > > --- >

Re: Which code compiler is better

2017-08-11 Thread weijie tong
my previous response: > > In DRILL-4778, JDK was faster in compilation but generated slower code. > Janino was slower in compilation and generate faster code. Your JIRA did > not mention how was the performance when running generated code. You may > want to test this aspect as we

Re: Which code compiler is better

2017-08-11 Thread weijie tong
java to extend the template. On Fri, 11 Aug 2017 at 5:52 PM weijie tong <tongweijie...@gmail.com> wrote: > @chunhui we just adjust different compiler options ,the generating code > strategy does not affected by the compiler option. so I think the > different result just reflects

Re: [GitHub] drill pull request #904: DRILL-5717: let some date time test cases be Local ...

2017-08-14 Thread weijie tong
Thanks for the advice ,will consider that. On Mon, 14 Aug 2017 at 7:56 PM vvysotskyi wrote: > Github user vvysotskyi commented on a diff in the pull request: > > https://github.com/apache/drill/pull/904#discussion_r132932858 > > --- Diff: >

Re: Questions about Drill's multi-thread model

2017-07-21 Thread weijie tong
> > Otherwise, perhaps there is an issue with some particular piece of code. > > What type of file is being read? In what environment? > > > > Thanks, > > > > - Paul > > > > > On Jul 19, 2017, at 7:20 AM, weijie tong <tongweijie...@gmail.com> > >

Re: Questions about Drill's multi-thread model

2017-07-22 Thread weijie tong
plan looking for uses of the Druid data > source, and only add those to the Druid queue. Visitors exist which you can > implement to obtain the required info. > > The result would be that Druid queries block waiting for capacity on Druid > to become available, while all othe

Drill query planning error

2017-07-26 Thread weijie tong
Hi all: I materialize the count distinct query result to a cache; then, when a user queries the count distinct, a specific rule translates the query to the cache. It works when the query has only one count(distinct) operator, but when it has two count(distinct) operators, it causes an error

Re: Drill query planning error

2017-07-26 Thread weijie tong
input > > > of the join is known to be scalar (single row). It sounds like after > you > > > did the transformation to use the cache, that scalar property somehow > did > > > not get propagated. > > > You can override this behavior by a session config

Re: Drill query planning error

2017-07-26 Thread weijie tong
d be an issue > > with the scalar check when 2 or more cartesian joins are present. > > Please file a JIRA with relevant details. > > > > On Wed, Jul 26, 2017 at 3:13 PM, weijie tong <tongweijie...@gmail.com> > > wrote:

Questions about Drill's multi-thread model

2017-07-19 Thread weijie tong
Hi there, our production environment has a situation where, if one query is blocked by the storage, all queries that come later take a very long time to run even though they really need much less time. At that time, the cluster's load is not too high. I know that every foreman will

Questions about rpc

2017-08-07 Thread weijie tong
There's a case where the partitioned sender spends a lot of time waiting. From the profile, we saw that the sender waits for 1 hour, while the opposite receiver and its subsequent hash aggregate operator spend 1 hour in process time. (39 sender minor fragments, 7 receiver minor fragments, each

IntelliJ code format

2017-08-08 Thread weijie tong
The IntelliJ code format downloaded from the Drill web site seems to use an indent of 4. The Eclipse one uses an indent of 2. I wonder whether it's a problem with my personal environment or an error in the provided IntelliJ code format?

Re: IntelliJ code format

2017-08-08 Thread weijie tong
the download site url : https://drill.apache.org/docs/apache-drill-contribution-guidelines/ On Tue, Aug 8, 2017 at 10:59 PM, weijie tong <tongweijie...@gmail.com> wrote: > The IntelliJ code format downloaded from the Drill web site seems to have > 4 indent .The eclipse one is 2 in

questions about Drill's implementation detail to implement HashJoin

2017-05-13 Thread weijie tong
Hi Drillers: Could anyone give a detailed description of Drill's HashJoin implementation, or a picture of the HashTable's data structure? I also wonder about the difference between the HashTable implementations of Drill and Flink. It seems Drill uses a batch model while Flink does not. Am I right?

Re: Propose about join push down

2017-09-20 Thread weijie tong
For the NLJ, indeed the current > Drill does not support “down flow” of data (and most storage does not have > indexes), and it’ll take some work to implement (e.g., all operators would > need to accept a next() call with some “data” parameter). > > Boaz > --

Re: Propose about join push down

2017-09-20 Thread weijie tong
operators would > need to accept a next() call with some “data” parameter). > > Boaz > > > On 9/19/17, 8:45 AM, "weijie tong" <tongweijie...@gmail.com> wrote: > > All: >This is a propose about join query tuning by

Propose about join push down

2017-09-19 Thread weijie tong
All: This is a proposal about join query tuning by pushing down the join condition. Suggestions, discussion, and objections are welcome. Suppose we have a join query "select t1.a,t1.s,t3.d (select a, sum(b) as s from t1 where a='1' group by a ) t2 join t3 on t2.a = t3.a" . This query will be

Re: IntelliJ code format

2017-09-07 Thread weijie tong
s updated now with the new jar. > Please check it out and let us know if any other issues. > > Thanks, > Padma > > > On Aug 21, 2017, at 8:59 AM, weijie tong <tongweijie...@gmail.com> wrote: > > @padma ,what's the process? > > On We

Re: Discuss about Drill's schedule policy

2017-08-21 Thread weijie tong
es with maintaining the YARN and Impala schedulers, so we’re > somewhat hesitant to move away from a purely symmetrical configuration. > Suggestions in this area are very welcome. > > For now, try turning on the ZK queues to limit concurrent queries and > prevent overload. Ensure your cluster

Re: IntelliJ code format

2017-08-21 Thread weijie tong
@padma ,what's the process? On Wed, 9 Aug 2017 at 1:04 AM Padma Penumarthy <ppenumar...@mapr.com> wrote: > You are right. It is configured as 4. We should fix that. > > Thanks, > Padma > > > > On Aug 8, 2017, at 8:12 AM, weijie tong <tongweijie...@gmail.com> w

Re: Discuss about Drill's schedule policy

2017-08-27 Thread weijie tong
ould adopt the core of Sparrow (or whatever) with the algorithm needed > for Drill to avoid the need to invent yet another new scheduler. > > Thanks, > > - Paul > > > [1] https://www.usenix.org/system/files/conference/osdi14/ > osdi14-paper-boutin_0.pdf > > On Aug 23, 2017,

Re: Discuss about Drill's schedule policy

2017-08-27 Thread weijie tong
MajorFragments execute from top to leaf ,then the corresponding execution tasks from top to down are all sure to be allocated to do the pipeline works. On Sun, 27 Aug 2017 at 7:46 PM weijie tong <tongweijie...@gmail.com> wrote: > Hi Paul: > >I have read the codes of Sparrow an

Re: Discuss about Drill's schedule policy

2017-08-23 Thread weijie tong
have different scheduler implementations (central or non-central ,maybe non-central like sparrow be the default one ). On Mon, Aug 21, 2017 at 11:51 PM, weijie tong <tongweijie...@gmail.com> wrote: > Thanks for all your suggestions. > > @paul your analysis is impressive . I agree wit

Discuss about Drill's schedule policy

2017-08-20 Thread weijie tong
Hi all: Drill's current schedule policy seems a little simple. The SimpleParallelizer assigns endpoints in a round-robin fashion, which ignores the system's load and other factors. In critical scenarios, some drillbits suffer frequent full GCs, which block their control RPC. Current
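To make the concern above concrete, here is a small Python sketch that contrasts a round-robin assignment with a load-aware one. It is not Drill's SimpleParallelizer; the function names and the integer "load" metric are assumptions made up for illustration.

```python
import itertools


def round_robin(endpoints, n_fragments):
    """Assign fragments to endpoints in turn, ignoring their current load."""
    cycle = itertools.cycle(endpoints)
    return [next(cycle) for _ in range(n_fragments)]


def least_loaded(load_by_endpoint, n_fragments):
    """Assign each fragment to the endpoint with the lowest load so far."""
    load = dict(load_by_endpoint)  # copy so the caller's view is untouched
    assignment = []
    for _ in range(n_fragments):
        # Ties are broken by endpoint name to keep the result deterministic.
        target = min(load, key=lambda e: (load[e], e))
        assignment.append(target)
        load[target] += 1
    return assignment


# Hypothetical load figures: bit2 is struggling (for example, with full GCs).
loads = {"bit1": 0, "bit2": 5, "bit3": 0}
rr = round_robin(sorted(loads), 4)
ll = least_loaded(loads, 4)
```

With these inputs the round-robin plan still sends a fragment to the overloaded `bit2`, while the load-aware plan avoids it. A real scheduler would also need a trustworthy, fresh load signal, which is the harder part of the problem.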

Re: [ANNOUNCE] New Committer: Boaz Ben-Zvi

2017-12-14 Thread weijie tong
Congratulations , Boaz! On Thu, 14 Dec 2017 at 7:21 AM Boaz Ben-Zvi wrote: > > Thank you all for the warm wishes; I hope to be worthy of the new status > … > > Boaz > > On 12/13/17, 2:45 PM, "Vlad Rozov" wrote: > > Congrats! > > Thank

Re: [ANNOUNCE] New Committer: Vitalii Diravka

2017-12-12 Thread weijie tong
Congratulations Vitalii On Wed, 13 Dec 2017 at 5:50 AM Khurram Faraaz wrote: > Congratulations Vitalii. > > > Regards, > > Khurram > > > From: Vlad Rozov > Sent: Tuesday, December 12, 2017 1:10:09 PM > To:

Re: How to generate hash code for each build side one of the hash join columns

2018-05-30 Thread weijie tong
disk feature. > So, this may pose some integration challenges for your run-time join > pushdown feature. > Also, one other question/clarification: for the bloom filter itself are > you implementing it natively in Drill or using an external library ? > > -Aman > > On Tue, May

Re: How to generate hash code for each build side one of the hash join columns

2018-05-29 Thread weijie tong
I found ClassGenerator's nestEvalBlock(JBlock block) and unNestEvalBlock(), which have the same effect as my change to the ClassGenerator. So I am dropping my change to the ClassGenerator, and I hope this can help someone else. On Tue, May 29, 2018 at 1:53 PM weijie tong wrote: > The c

How to generate hash code for each build side one of the hash join columns

2018-05-28 Thread weijie tong
Hi all: While implementing the JPPD feature ( https://issues.apache.org/jira/browse/DRILL-6385) , I was blocked by this problem: how to get the hash code of each build-side hash join column through the dynamically generated Java code. I hope someone can give some advice. I supposed to add
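For readers unfamiliar with the idea behind join predicate push down, the following Python sketch shows why per-key hash codes of the build side are needed: they populate a Bloom filter that the probe side can consult before sending rows to the join. This is a generic Bloom filter, not Drill's implementation; the class, its parameters, and the use of SHA-256 as the hash source are all assumptions for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter (hypothetical, not Drill's BloomFilter class)."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits          # must be a multiple of 8 in this sketch
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, key):
        # Derive several bit positions from two base hashes (double hashing).
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.n_bits for i in range(self.n_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


# Build side: hash every join key into the filter.
build_keys = [1, 5, 9]
bf = BloomFilter()
for k in build_keys:
    bf.add(k)

# Probe side: drop rows whose key cannot match before they reach the join.
probe_rows = [(1, "a"), (2, "b"), (5, "c"), (7, "d"), (9, "e")]
survivors = [row for row in probe_rows if bf.might_contain(row[0])]
```

A Bloom filter never produces false negatives, so every matching probe row survives; some non-matching rows may survive too and are removed by the join itself.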

Re: How to generate hash code for each build side one of the hash join columns

2018-05-31 Thread weijie tong
he first columns; their number can be found from the > config: e.g., htConfig.getKeyExprsBuild().size() ) > >With such implementation, that evalHash() could be used anywhere (e.g., > to match the Bloom filters on the left side of the join). > >Thanks, > >

Re: How to generate hash code for each build side one of the hash join columns

2018-06-01 Thread weijie tong
. So why is SelectionVector4 not supported by the ProjectBatch? The same question applies to the FilterBatch's SelectionVector2, which also supports only a 2-byte record count. On Fri, Jun 1, 2018 at 1:40 PM weijie tong wrote: > Hi Boaz: > > Your propose is valuable though I have im

Re: How to generate hash code for each build side one of the hash join columns

2018-06-01 Thread weijie tong
I found the answer: RecordBatch's max size is 2^16, which is defined at RecordBatch's MAX_BATCH_SIZE. On Fri, Jun 1, 2018 at 3:36 PM weijie tong wrote: > Some questions about SelectionVector2 and SelectionVector4: > > I want to create SelectionVector4 or SelectionVector2 to
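The arithmetic behind that answer can be checked in a few lines of Python: a 2-byte unsigned index addresses exactly 2^16 records, which is why a batch limited to 2^16 rows fits a 2-byte selection vector. The 4-byte layout shown here (batch index in the upper 16 bits, record index in the lower 16 bits) and the little-endian byte order are my understanding and an assumption, not something the thread states; the helper names are hypothetical.

```python
import struct

MAX_BATCH_SIZE = 1 << 16  # 65536, the value quoted in the message above


def sv2_encode(record_index):
    """Pack one record index into 2 bytes (unsigned 16-bit)."""
    return struct.pack("<H", record_index)


def sv4_encode(batch_index, record_index):
    """Pack (batch, record) into one 32-bit value: batch high, record low."""
    return (batch_index << 16) | (record_index & 0xFFFF)


def sv4_decode(value):
    """Split a 32-bit value back into (batch index, record index)."""
    return value >> 16, value & 0xFFFF


last_index = sv2_encode(MAX_BATCH_SIZE - 1)   # the largest index still fits
round_trip = sv4_decode(sv4_encode(3, 42))    # (3, 42)
```

Index 65535 is the last one that fits in 2 bytes; trying to pack 65536 with the `"<H"` format raises `struct.error`.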

Re: How to generate hash code for each build side one of the hash join columns

2018-06-01 Thread weijie tong
reference of SV2 java object. Only the > underlying buffer for the SV2 object will change to store new indexes for > new incoming batch. It's only after OK_NEW_SCHEMA outcome is seen when > setup will be again called. > > > Thanks, > Sorabh > > > ___

Re: Actual vectorization execution

2018-06-29 Thread weijie tong
re running was already written > to exploit vectorization. Have you also looked into > Drill's code-gen to see which ones are amenable to vectorization ? We > could start with some small use case and expand. > > [1] > http://www.oracle.com/technetwork/java/jvmls2016-ajila-v

Re: how to release allocated ByteBuf which steps across two threads

2018-06-20 Thread weijie tong
to release the ByteBuf maybe happed behind the allocator thread. On Thu, Jun 21, 2018 at 8:51 AM weijie tong wrote: > Hi Parth: > > Thanks for your reply. Your detail description explain that problem > clearly. This problem is not a common case. The bloom filter has not been >

Re: how to release allocated ByteBuf which steps across two threads

2018-06-20 Thread weijie tong
Bloom Filter > > reaching its destination. How does the destination fragment know when it > > has to wait for the Bloom Filter? I suspect this may be more > > complicated than it appears at first glance. > > > > Not sure if this helps narrow it down. If you can share

how to release allocated ByteBuf which steps across two threads

2018-06-19 Thread weijie tong
Hi: I faced a complicated problem releasing the BloomFilter's direct memory in some special cases. I hope someone can give some advice. Say one join node sends a BloomFilter to the foreman node (TestHashJoin.simpleEqualityJoin()). The sending thread is netty's BitClient. The

Re: [ANNOUNCE] New PMC member: Vitalii Diravka

2018-06-26 Thread weijie tong
Congratulations Vitalii! On Wed, Jun 27, 2018 at 8:11 AM Paul Rogers wrote: > Congratulations Vitalii! > - Paul > > > > On Tuesday, June 26, 2018, 11:12:16 AM PDT, Aman Sinha < > amansi...@apache.org> wrote: > > I am pleased to announce that Drill PMC invited Vitalii Diravka to the PMC >

Re: [ANNOUNCE] New Committer: Padma Penumarthy

2018-06-19 Thread weijie tong
Congratulations Padma! On Tue, Jun 19, 2018 at 4:41 AM salim achouche wrote: > Congratulations Padma! > > Regards, > Salim > > > On Jun 15, 2018, at 9:58 AM, Vitalii Diravka > wrote: > > > > Congrats Padma! > > > > Kind regards > > Vitalii > > > > > > On Fri, Jun 15, 2018 at 7:40 PM Arina

[DISCUSSION] Does schema-free really need

2018-08-15 Thread weijie tong
Hi all: I hope this statement does not seem too rash to you. Drill claims to be a schema-free distributed SQL engine. It takes a lot of work to make the execution engine support JSON-file-like storage formats. This makes it easier to introduce bugs and makes the code logic ugly. I wonder do we still

Re: [DISCUSSION] Does schema-free really need

2018-08-15 Thread weijie tong
That gives us pretty much the > flexibility of schema on read without as much of the burden. > > > > On Wed, Aug 15, 2018 at 5:02 PM weijie tong > wrote: > > > Hi all: > > Hope the statement not seems too dash to you. > > Drill claims be a schema-free dis

Re: [DISCUSSION] Does schema-free really need

2018-08-16 Thread weijie tong
2: I would not call it a battle between non-relational data and > > relational engine. The extended relational model has type of > > array/composite types, similar to what Drill has. > > > > > > > > > > > > On Wed, Aug 15, 2018 at 7:27 PM, weijie tong

Re: [DISCUSSION] current project state

2018-08-14 Thread weijie tong
My thoughts on this topic: Drill does well now. But to be better, we need to be idealists and bring in more use cases or more advanced query performance compared to other projects like Flink, Spark, Presto, and Impala. For performance, I wonder whether we need to adopt the Gandiva project, which is so exciting

Re: [ANNOUNCE] New PMC member: Volodymyr Vysotskyi

2018-08-25 Thread weijie tong
Congratulations Volodymyr! On Sat, Aug 25, 2018 at 8:30 AM salim achouche wrote: > Congrats Volodymyr! > > On Fri, Aug 24, 2018 at 11:32 AM Gautam Parai wrote: > > > Congratulations Vova! > > > > Gautam > > > > On Fri, Aug 24, 2018 at 10:59 AM, Khurram Faraaz > wrote: > > > > >

Re: [DISCUSS] Deprecation policy in Drill

2018-08-27 Thread weijie tong
I think we should keep these deprecated options so that users can upgrade more easily. Alternatively, if we remove these deprecated options, we should add a startup check to let users know that they have been removed. On Mon, Aug 27, 2018 at 3:54 PM Arina Ielchiieva wrote: > Hi all, > > when it

Re: [ANNOUNCE] New Committer: Weijie Tong

2018-09-01 Thread weijie tong
Weijie, thanks for your contributions to Drill. > >>>> Thanks, > >>>> - Paul > >>>> > >>>> > >>>> > >>>> On Friday, August 31, 2018, 8:51:30 AM PDT, Arina Ielchiieva < > >>>> ar...@apache.org>

Re: Possible way to specify column types in query

2018-09-06 Thread weijie tong
Google's latest paper about F1 [1] claims to support any data source by using an extension API called TVF; see section 6.3. It also needs the column data types to be declared before the query. [1] http://www.vldb.org/pvldb/vol11/p1835-samwel.pdf On Fri, Sep 7, 2018 at 9:47 AM Paul Rogers wrote: > Hi All, >

Re: Discussion about the metadata design

2018-06-28 Thread weijie tong
< > arina.yelchiy...@gmail.com> > wrote: > > > Hi, > > > > Vitalii and Vova is also looking at this part, you might want to sync up > > with them. Or even better, we can create Jira for this and held all > > discussions there. > > Vitalii, what do yo

Re: how to release allocated ByteBuf which steps across two threads

2018-06-21 Thread weijie tong
describe above. By following this pattern, I solved the problem. I really appreciate your advice. Thank you so much! On Thu, Jun 21, 2018 at 9:13 AM weijie tong wrote: > I also think this is a common problem to the case that the receiver has no > chance to sent out a ack reply,

Discussion about the metadata design

2018-06-28 Thread weijie tong
Hi all: @aman once pointed me to the roadmap of DRILL-2.0, which includes the description of the metadata design ( https://lists.apache.org/thread.html/74cf48dd78d323535dc942c969e72008884e51f8715f4a20f6f8fb66@%3Cdev.drill.apache.org%3E) . I am interested in taking the role to implement

Actual vectorization execution

2018-06-29 Thread weijie tong
Hi all: I have investigated the JIT assembly code of some vector-friendly Java code with the JITWatch tool. I found that the JVM did not generate the expected AVX code. According to some JVM experts, the JVM generates AVX code only in restricted cases. I found Intel

Re: [ANNOUNCE] New Committer: Gautam Parai

2018-10-22 Thread weijie tong
Congratulations Gautam ! On Tue, Oct 23, 2018 at 6:28 AM Aman Sinha wrote: > Congratulations Gautam ! > > On Mon, Oct 22, 2018 at 3:00 PM Jyothsna Reddy > wrote: > > > Congrats Gautam!! > > > > > > > > On Mon, Oct 22, 2018 at 2:01 PM Vitalii Diravka > > wrote: > > > > > Congratulations! > > >

Re: [HANGOUT] 29th Oct 2018 (9PM PST)

2018-10-29 Thread weijie tong
Hi : Thanks for the invitation. Here is slide: JPPD On Tue, Oct 30, 2018 at 12:12 PM Pritesh Maker wrote: > Hi, > > Apologies for the late notice - we are currently having a Hangout with >

Re: [ANNOUNCE] New Committer: Hanumath Rao Maduri

2018-11-01 Thread weijie tong
Congratulations, Hanu! On Fri, Nov 2, 2018 at 8:22 AM Robert Hou wrote: > Congratulations, Hanu. Thanks for contributing to Drill. > > --Robert > > On Thu, Nov 1, 2018 at 4:06 PM Jyothsna Reddy > wrote: > > > Congrats Hanu!! Well deserved :D > > > > Thank you, > > Jyothsna > > > > On Thu, Nov

Re: [ANNOUNCE] New Committer: Chunhui Shi

2018-09-28 Thread weijie tong
Congratulations Chunhui ! On Fri, Sep 28, 2018 at 10:58 PM Abhishek Girish wrote: > Congrats Chunhui! > On Fri, Sep 28, 2018 at 7:39 AM Vova Vysotskyi wrote: > > > Congratulations! Well deserved! > > > > Kind regards, > > Volodymyr Vysotskyi > > > > > > On Fri, Sep 28, 2018 at 12:17 PM Arina

Re: Good DB theory references

2019-01-22 Thread weijie tong
Hi Paul: Thanks for sharing. I would like to share another good recent paper here, "Everything you always wanted to know about compiled and vectorized queries but were afraid to ask" : http://www.vldb.org/pvldb/vol11/p2209-kersten.pdf It explains the two kinds of database execution

Re: [ANNOUNCE] New Committer: Karthikeyan Manivannan

2018-12-07 Thread weijie tong
Congratulations Karthik ! On Sat, Dec 8, 2018 at 12:10 PM Karthikeyan Manivannan wrote: > Thanks! In addition to all you wonderful Drillers, I would also like to > thank Google, StackOverflow and Larry Tesler > < >

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-04 Thread weijie tong
r publishing > desired methods with package access ? > > Thanks, Igor > > On Thu, Apr 4, 2019 at 9:51 AM weijie tong > wrote: > > > > HI : > > > > Gandiva is a sub project of Arrow. Arrow gandiva using LLVM codegen and > > simd skill could achieve bette

[Discuss] Integrate Arrow gandiva into Drill

2019-04-04 Thread weijie tong
Hi: Gandiva is a subproject of Arrow. Using LLVM codegen and SIMD, Arrow Gandiva can achieve better query performance. Arrow and Drill have similar columnar memory formats. The main difference now is the null representation. Arrow has also made great changes to the ValueVector. To adopt

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-04 Thread weijie tong
I have a question about the ProjectRecordBatch implementation and hope someone can explain it. At line 234 of ProjectRecordBatch, in what case is the projector's output row count less than the input size? On Thu, Apr 4, 2019 at 5:11 PM weijie tong wrote: > Hi Igor: >

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-04 Thread weijie tong
Volodymyr Vysotskyi > > > On Thu, Apr 4, 2019 at 5:17 PM weijie tong > wrote: > > > I have a doubt about the ProjectRecordBatch implementation. Hope someone > > could give an explanation about that. To the line 234 of > > ProjectRecordBatch, at what case,the proje

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-05 Thread weijie tong
, I think line 234 of ProjectRecordBatch will never be executed. Only since DRILL-6340, where we control the output batch memory size, has that part of the code finally come into use. If I am wrong, please let me know. On Fri, Apr 5, 2019 at 12:15 AM weijie tong wrote: > Thanks for the reply, But it se

Re: [ANNOUNCE] New PMC member: Sorabh Hamirwasia

2019-04-05 Thread weijie tong
Congratulations Sorabh! On Sat, Apr 6, 2019 at 7:17 AM Sorabh Hamirwasia wrote: > Thank You everyone for your wishes!! > > Looking forward for everyone's help to vote on release candidate next week > :) > > Thanks, > Sorabh > > On Fri, Apr 5, 2019 at 2:12 PM Parth Chandra wrote: > > > Congrats

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-05 Thread weijie tong
separate operator in > > Drill but I think that code is there to handle such cases. > > > > Thanks, > > Sorabh > > > > On Fri, Apr 5, 2019 at 6:08 AM weijie tong > > wrote: > > > > > The first appearance of the compari

Questions about bushy join

2019-05-27 Thread weijie tong
Hi all: Does anyone know why we don't support bushy joins in query plan generation while the HEP planner is enabled? The codebase shows that PlannerPhase.JOIN_PLANNING uses LoptOptimizeJoinRule, not Calcite's MultiJoinOptimizeBushyRule.

Re: Questions about bushy join

2019-05-27 Thread weijie tong
more than > estimated. This could happen easily in big data systems where statistics > are constantly changing due to new data ingestion and even running ANALYZE > continuously is not feasible. > That said, it is not a bad idea to experiment with such plans with say more > than 5 table joins a

Re: Questions about bushy join

2019-05-29 Thread weijie tong
id = 10932 > 00-20Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=classpath:/tpch/region.parquet]], > selectionRoot=classpath:/tpch/region.parquet, numFiles=1, > usedMetadataFile=false, columns=[`r_regionkey`, `r_name`]]]) : rowType > =

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-19 Thread weijie tong
t; solve these problems. > > > Thanks, > > - Paul > > > > On Wednesday, April 3, 2019, 11:51:34 PM PDT, weijie tong < > tongweijie...@gmail.com> wrote: > > HI : > > Gandiva is a sub project of Arrow. Arrow gandiva using LLVM codegen and > simd

Re: Apache Drill Hangout - July 9, 2019

2019-07-10 Thread weijie tong
you in response to the email, so that Apache Drill > > community decides how to proceed with this (i.e. we find a convenient > time > > that works for all interested in the topic). > > > > Kind regards, > > Bohdan Kazydub > > > > On Tue, Jul 9, 2019 at 2:37 AM weij

Re: Apache Drill Hangout - July 9, 2019

2019-07-08 Thread weijie tong
I could give a short talk about my recent work on parallel HashJoin and some other things. On Mon, Jul 8, 2019 at 7:28 PM Bohdan Kazydub wrote: > Hi Drillers, > > We will have our bi-weekly hangout tomorrow, July 9th, at 10 AM PST > (link: https://meet.google.com/yki-iqdf-tai ). > > If there

Re: Drill storage plugin for IPFS, any suggestion is welcome :)

2019-07-08 Thread weijie tong
It is amazing to see Paul’s Chinese welcome words! I am also glad to hear about Wang Liang’s use case for Drill, and contributing it as a Drill storage plugin is welcome. On Tue, Jul 9, 2019 at 1:00 AM Paul Rogers wrote: > 王亮 你好, > > > Very creative use of Drill! We usually think of Drill as a tool for

Re: [Discuss] Integrate Arrow gandiva into Drill

2019-04-23 Thread weijie tong
to use Gandiva, he/she could set a config option to > > point to the Gandiva library (and supporting files, if any.) Or, use the > > existing LD_LIBRARY_PATH env. variable. > > > > Thanks, > > - Paul > > > > > > > > On Thursday, April 18,

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2019-08-25 Thread weijie tong
Congratulations Charles. On Sat, Aug 24, 2019 at 11:33 AM Robert Hou wrote: > Congratulations Charles, and thanks for your contributions to Drill! > > Thank you Arina for all you have done as PMC Chair this past year. > > --Robert > > On Fri, Aug 23, 2019 at 4:16 PM Khurram Faraaz > wrote: > >

Re: Apache Drill Hangout July 23rd

2019-07-23 Thread weijie tong
Well, sorry about missing the meeting. I forgot to set the alarm and overslept. Now I can't join the meeting; maybe it has finished. I will submit the ParallelHashJoin PR soon. On Tue, Jul 23, 2019 at 10:14 AM Aman Sinha wrote: > Hi Drillers, > > We will have our bi-weekly hangout tomorrow,

Re: [ANNOUNCE] New Committer: Bohdan Kazydub

2019-07-23 Thread weijie tong
Parai wrote: > > > > > > > > > > > Congratulations Bohdan! > > > > > > > > > > > > Gautam > > > > > > > > > > > > On Mon, Jul 15, 2019 at 11:53 PM Bohdan Kazydub > > > > > bohdan.kazy...@gmail.com>

Re: [ANNOUNCE] New Committer: Igor Guzenko

2019-07-23 Thread weijie tong
Congratulations Igor! On Wed, Jul 24, 2019 at 1:23 AM Igor Guzenko wrote: > Hello Drillers, > > Thank you all for the greetings. It is an honor for me to be part of the > Apache Drill community. > > Best regards, > Igor Guzenko > > On Tue, Jul 23, 2019 at 6:37 PM Charles Givre wrote: > > >

Re: [ANNOUNCE]: New committer: Ankush Kapur

2020-08-05 Thread weijie tong
Congratulations Ankush! On Thu, Aug 6, 2020 at 2:37 AM Charles Givre wrote: > The Project Management Committee (PMC) for Apache [PROJECT] has invited > Ankush Kapur to become a committer and we are pleased to announce that he > has accepted. > > Being a committer enables easier contribution to

Re: [VOTE]: James Turton for Committer

2020-11-05 Thread weijie tong
+1 On Fri, Nov 6, 2020 at 1:18 AM Ted Dunning wrote: > I think that looks like a great addition. > > +1 for James. > > I don't think that lazy consensus is a great idea, however. Happily, you > now have three positives. > > > On Wed, Nov 4, 2020 at 11:00 AM Charles Givre wrote: > > > Hello

[jira] [Created] (DRILL-7607) Dynamic credit based flow control

2020-02-26 Thread Weijie Tong (Jira)
Weijie Tong created DRILL-7607: -- Summary: Dynamic credit based flow control Key: DRILL-7607 URL: https://issues.apache.org/jira/browse/DRILL-7607 Project: Apache Drill Issue Type: New Feature

[jira] [Created] (DRILL-7663) Code refactor to DefaultFunctionResolver

2020-03-25 Thread Weijie Tong (Jira)
Weijie Tong created DRILL-7663: -- Summary: Code refactor to DefaultFunctionResolver Key: DRILL-7663 URL: https://issues.apache.org/jira/browse/DRILL-7663 Project: Apache Drill Issue Type: New

[jira] [Created] (DRILL-7656) Support injecting BufferManager into UDF

2020-03-20 Thread Weijie Tong (Jira)
Weijie Tong created DRILL-7656: -- Summary: Support injecting BufferManager into UDF Key: DRILL-7656 URL: https://issues.apache.org/jira/browse/DRILL-7656 Project: Apache Drill Issue Type: New