Re: compiling drill 1.11.0 with cdh profile

2017-08-09 Thread yuliya Feldman
Could you click on the following link [1], or try to wget it from the machine 
you are building on?
Since apache-14.pom is NOT found in any of the repos except [1], Maven tries to 
download it from there and fails. You may or may not have access to the other 
repos, but [1] does not look accessible from your build machine.
You say it works fine with the HW (Hortonworks) profile; do you build on a 
different machine? Maybe the HW Maven repo hosts apache-14.pom, or you have a 
different internet access pattern on the HW machine (if it is indeed a 
different machine).
[1]  https://repo.maven.apache.org/maven2/org/apache/apache/14/apache-14.pom
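A quick way to check the first failure mode from the build machine (an illustrative Python sketch, not part of any Drill tooling; it only tests DNS resolution, not proxies or TLS):

```python
import socket

def can_resolve(host):
    """True if DNS (or /etc/hosts) can resolve `host`; False mirrors the
    'Name or service not known' failure seen in the Maven log."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

# Check the two repos implicated in the failing build.
for host in ("repo.maven.apache.org", "repository.cloudera.com"):
    print(host, "resolvable:", can_resolve(host))
```

If a host prints False here, Maven cannot possibly download the parent POM from it, regardless of profile.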

  From: Dor Ben Dov <dor.ben-...@amdocs.com>
 To: "dev@drill.apache.org" <dev@drill.apache.org>; yuliya Feldman 
<yufeld...@yahoo.com> 
 Sent: Tuesday, August 8, 2017 11:32 PM
 Subject: RE: compiling drill 1.11.0 with cdh profile
   
Yuliya,
I ran 'mvn -U -DskipTests clean install -Pcdh'. Isn't this enough?
What am I missing on Cloudera?
When I take Drill to Hortonworks, it works well, by the way.

Regards,
Dor

-Original Message-
From: yuliya Feldman [mailto:yufeld...@yahoo.com.INVALID] 
Sent: Tuesday, 08 August 2017 17:46
To: dev@drill.apache.org
Subject: Re: compiling drill 1.11.0 with cdh profile

Feels like you can't access: 
https://repo.maven.apache.org/maven2/org/apache/apache/14/apache-14.pom
as no other repo contains that pom.



      From: Dor Ben Dov <dor.ben-...@amdocs.com>
 To: "dev@drill.apache.org" <dev@drill.apache.org>
 Sent: Tuesday, August 8, 2017 1:12 AM
 Subject: RE: compiling drill 1.11.0 with cdh profile
  
 Can anyone help me with this? 

Dor

-Original Message-
From: Dor Ben Dov
Sent: Monday, 07 August 2017 13:44
To: dev@drill.apache.org
Subject: compiling drill 1.11.0 with cdh profile

Hi all,

I tried to compile the source code of branch 1.11.0 with the cdh profile for 
Cloudera and am getting this exception. Anyone?

[dor@dor-fedora64 drill]$ mvn -U -DskipTests clean install -Pcdh
[INFO] Scanning for projects...
Downloading: 
https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/apache/14/apache-14.pom
Downloading: http://conjars.org/repo/org/apache/apache/14/apache-14.pom
Downloading: http://repository.mapr.com/maven/org/apache/apache/14/apache-14.pom
Downloading: http://repo.dremio.com/release/org/apache/apache/14/apache-14.pom
Downloading: 
http://repository.mapr.com/nexus/content/repositories/drill/org/apache/apache/14/apache-14.pom
Downloading: 
https://repo.maven.apache.org/maven2/org/apache/apache/14/apache-14.pom
[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[FATAL] Non-resolvable parent POM for org.apache.drill:drill-root:1.11.0: Could not transfer artifact org.apache:apache:pom:14 from/to cloudera (https://repository.cloudera.com/artifactory/cloudera-repos/): repository.cloudera.com: Name or service not known and 'parent.relativePath' points at wrong local POM @ line 15, column 11
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR]  The project org.apache.drill:drill-root:1.11.0 (/home/dor/Downloads/drill/pom.xml) has 1 error
[ERROR]    Non-resolvable parent POM for org.apache.drill:drill-root:1.11.0: Could not transfer artifact org.apache:apache:pom:14 from/to cloudera (https://repository.cloudera.com/artifactory/cloudera-repos/): repository.cloudera.com: Name or service not known and 'parent.relativePath' points at wrong local POM @ line 15, column 11: Unknown host repository.cloudera.com: Name or service not known -> [Help 2]
[ERROR]
To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] 
http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException
[dor@dor-fedora64 drill]$


** I am using Fedora 26 **

Regards,
Dor
This message and the information contained herein is proprietary and 
confidential and subject to the Amdocs policy statement, which you may review 
at https://www.amdocs.com/about/email-disclaimer



Re: [ANNOUNCE] New PMC member: Arina Ielchiieva

2017-08-03 Thread yuliya Feldman
Congrats Arina!!!
Very glad to see this happening.
Yuliya

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
 Sent: Thursday, August 3, 2017 2:53 AM
 Subject: Re: [ANNOUNCE] New PMC member: Arina Ielchiieva
   
Thank all you!

Kind regards
Arina

On Thu, Aug 3, 2017 at 5:58 AM, Sudheesh Katkam  wrote:

> Congratulations and thank you, Arina.
>
> On Wed, Aug 2, 2017 at 1:38 PM, Paul Rogers  wrote:
>
> > The success of the Drill 1.11 release proves this is a well-deserved
> move.
> > Congratulations!
> >
> > - Paul
> >
> > > On Aug 2, 2017, at 11:23 AM, Aman Sinha  wrote:
> > >
> > > I am pleased to announce that Drill PMC invited Arina Ielchiieva to the
> > PMC
> > > and she has accepted the invitation.
> > >
> > > Congratulations Arina and thanks for your contributions !
> > >
> > > -Aman
> > > (on behalf of Drill PMC)
> >
> >
>


   

Re: [HANGOUT] Topics for 7/25/17

2017-07-26 Thread yuliya Feldman
Sorry for the late chime in. Just a note regarding S3: even after the upgrade 
to Hadoop 2.8.x you may need to separately update the AWS SDK version, as the 
one provided with the upgrade does not support all the newly added regions.
Thanks,
Yuliya

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
Cc: user 
 Sent: Tuesday, July 25, 2017 10:35 AM
 Subject: Re: [HANGOUT] Topics for 7/25/17
   
Meeting minutes 25 July 2017:

Attendees:
Rob, Vova, Sorabh, Pritesh, Paul, Aman, Padma, Jyothsna, Sindhuri.

Two topics were discussed.
1. Release candidate for 1.11.0.
Everybody is encouraged to test the release candidate and vote.
Aman asked about the release candidate performance testing.
Asked Kunal via email and he confirmed that performance testing is in
progress.

2. Upgrade to hadoop version 2.8.
Padma was looking into S3 connectivity issues and found out that switching
to Hadoop version 2.8.1 will solve these problems.
However, the hadoop release notes for 2.8.1 (and 2.8.0 as well) say the
following:
"Please note that 2.8.x release line continues to be not yet ready for
production use”.
It was decided to wait until the next stable Hadoop release (hopefully before 
the Drill 1.12.0 release) and, for now, to document that users may switch to 
2.8.1 themselves.


Thank you all for attending the hangout today.

Kind regards
Arina

On Tue, Jul 25, 2017 at 8:04 PM, Arina Yelchiyeva <
arina.yelchiy...@gmail.com> wrote:

> Hangouts is starting now...
>
> On Tue, Jul 25, 2017 at 7:41 AM, Padma Penumarthy 
> wrote:
>
>> I have a topic to discuss. Lot of folks on the user mailing list raised
>> the issue of not being able to access all S3 regions using Drill.
>> We need hadoop version 2.8 or higher to be able to connect to
>> regions which support only Version 4 signature.
>> I tried with 2.8.1, which just got released and it works i.e. I am able to
>> connect to both old and new regions (by specifying the endpoint in the
>> config).
>> There are some failures in unit tests, which can be fixed.
>>
>> Fixing S3 connectivity issues is important.
>> However, the hadoop release notes for 2.8.1 (and 2.8.0 as well) say the
>> following:
>> "Please note that 2.8.x release line continues to be not yet ready for
>> production use”.
>>
>> So, should we or not move to 2.8.1 ?
>>
>> Thanks,
>> Padma
>>
>>
>> On Jul 24, 2017, at 9:46 AM, Arina Yelchiyeva wrote:
>>
>> Hi all,
>>
>> We'll have the hangout tomorrow at the usual time [1]. Any topics to be
>> discussed?
>>
>> [1] https://drill.apache.org/community-resources/
>>
>> Kind regards
>> Arina
>>
>>
>

   

Re: [ANNOUNCE] New Committer: Arina Ielchiieva

2017-02-26 Thread yuliya Feldman
Congratulations Arina!!!

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
 Sent: Sunday, February 26, 2017 5:23 AM
 Subject: Re: [ANNOUNCE] New Committer: Arina Ielchiieva
   
Thank you all for congratulations! I really appreciate that.

Kind regards
Arina

On Sat, Feb 25, 2017 at 3:30 PM, Parth Chandra  wrote:

> Congratulations Arina. Welcome and thank you for your great work so far.
>
>
>
> On Fri, Feb 24, 2017 at 9:06 AM, Sudheesh Katkam 
> wrote:
>
> > The Project Management Committee (PMC) for Apache Drill has invited Arina
> > Ielchiieva to become a committer, and we are pleased to announce that she
> > has accepted.
> >
> > Arina has a long list of contributions [1] that have touched many aspects
> > of the product. Her work includes features such as dynamic UDF support
> and
> > temporary tables support.
> >
> > Welcome Arina, and thank you for your contributions.
> >
> > - Sudheesh, on behalf of the Apache Drill PMC
> >
> > [1] https://github.com/apache/drill/commits/master?author=
> arina-ielchiieva
> >
>


   

Re: [ANNOUNCE] - New Apache Drill Committer - Chris Westin

2016-12-01 Thread yuliya Feldman
Congratulations Chris!!!

  From: Jacques Nadeau 
 To: dev  
 Sent: Thursday, December 1, 2016 8:54 AM
 Subject: [ANNOUNCE] - New Apache Drill Committer - Chris Westin
   
On behalf of the Apache Drill PMC, I am very pleased to announce that Chris
Westin has accepted the invitation to become a committer in the project.

Welcome Chris and thanks for your great contributions!


--
Jacques Nadeau
CTO and Co-Founder, Dremio


   

Re: [ANNOUNCE] - New Apache Drill Committer - Neeraja Rentachintala

2016-11-18 Thread yuliya Feldman
Congratulations Neeraja!!!

  From: Parth Chandra 
 To: dev  
 Sent: Thursday, November 17, 2016 11:10 AM
 Subject: [ANNOUNCE] - New Apache Drill Committer - Neeraja Rentachintala
   
On behalf of the Apache Drill PMC, I am very pleased to announce that
Neeraja Rentachintala has accepted the invitation to become a committer in
the project.


Welcome Neeraja !


   

[jira] [Created] (DRILL-4809) Drill to provide ability to support parameterized conditions

2016-07-26 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-4809:
-

 Summary: Drill to provide ability to support parameterized conditions
 Key: DRILL-4809
 URL: https://issues.apache.org/jira/browse/DRILL-4809
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow, Query Planning & Optimization, SQL Parser
Reporter: Yuliya Feldman


Currently Drill does not provide the ability to specify variables in the WHERE 
clause, which means that a user has to create a new query to handle any new 
condition.

For example, if someone wants to execute the following query:
select id, name from foo where dir0=$1 and dir1=$2

(s)he is unable to do so, and thus if dir0 and dir1 get created on the fly (by 
day, month, or what not), a new query needs to be created to handle the data in 
the new directories.
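Until such support exists, a common client-side workaround is to substitute values into a query template before submission. The sketch below is illustrative only; the `render` helper is hypothetical and not part of Drill:

```python
from string import Template

# Hypothetical client-side helper: fills $-placeholders before the query text
# is sent to Drill, emulating outside the engine what DRILL-4809 asks for.
QUERY = Template("select id, name from foo where dir0='$d0' and dir1='$d1'")

def render(d0, d1):
    # NOTE: no escaping or validation here; a real client must guard against
    # SQL injection before substituting user-supplied values.
    return QUERY.substitute(d0=d0, d1=d1)

print(render("2016", "07"))
```

This keeps one query template per logical query, at the cost of re-planning on every submission, which is exactly the overhead the JIRA proposes to eliminate.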






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Dynamic UDFs support

2016-07-26 Thread yuliya Feldman
Thank you Arina
Yuliya

  From: Arina Yelchiyeva <arina.yelchiy...@gmail.com>
 To: dev@drill.apache.org; yuliya Feldman <yufeld...@yahoo.com> 
 Sent: Tuesday, July 26, 2016 10:11 AM
 Subject: Re: Dynamic UDFs support
   
Sure, I'll add this option. I'll send a link to final document once it's
done.

On Tue, Jul 26, 2016 at 8:06 PM Keys Botzum <kbot...@maprtech.com> wrote:

> +1
>
> Keys
> ___
> Keys Botzum
> Senior Principal Technologist
> kbot...@maprtech.com
> 443-718-0098
> MapR Technologies
> http://www.mapr.com
> > On Jul 26, 2016, at 1:05 PM, yuliya Feldman <yufeld...@yahoo.com.INVALID>
> wrote:
> >
> > I want to make sure (and will also make a note in the design doc) that we
> > have an option to disable dynamic loading/unloading of UDFs until we are
> > able to do proper authentication AND authorization of the user(s).
> >
> >      From: Arina Yelchiyeva <arina.yelchiy...@gmail.com>
> > To: dev@drill.apache.org
> > Sent: Monday, July 25, 2016 9:09 AM
> > Subject: Re: Dynamic UDFs support
> >
> > My fault, agree, DROP is more appropriate.
> > Thanks Julian!
> >
> > On Mon, Jul 25, 2016 at 7:07 PM Julian Hyde <jhyde.apa...@gmail.com
> <mailto:jhyde.apa...@gmail.com>> wrote:
> >
> >> But don't call it DELETE. In SQL the opposite of CREATE is DROP.
> >>
> >> Julian
> >>
> >>> On Jul 25, 2016, at 8:48 AM, Keys Botzum <kbot...@maprtech.com> wrote:
> >>>
> >>> I like the approach to handling DELETE. This is very useful. I think an
> >> implementation that does not guarantee consistent behavior is perfectly
> >> fine for use that is targeted at developers that are working on UDFs. As
> >> long as the docs make the intent clear this makes me very happy.
> >>>
> >>> I'll defer to others more expert than I on the remainder of the design.
> >>>
> >>> Keys
> >>> ___
> >>> Keys Botzum
> >>> Senior Principal Technologist
> >>> kbot...@maprtech.com
> >>> 443-718-0098
> >>> MapR Technologies
> >>> http://www.mapr.com
> >>>> On Jul 25, 2016, at 9:55 AM, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
> >>>>
> >>>> Taking into account all previous comments and discussion we had with
> >> Parth
> >>>> and Paul, please find below my design notes (I am going to prepare
> >> proper
> >>>> design document, just want to see if all agree with raw version).
> >>>> I propose to use lazy-init for dynamically loaded UDFs: in that case,
> >>>> when a user issues the CREATE UDF command, the foreman will only
> >>>> validate the jar and update the ZK function registry, and only if a
> >>>> function is needed will it be loaded to the appropriate drillbit
> >>>> (during the planning stage or fragment execution). We might add
> >>>> listeners (as Paul proposed) to pre-load UDFs, but I didn't include
> >>>> that in the current release to simplify the solution; we might
> >>>> re-consider this.
> >>>> I have looked at the issue with class loading and unloading, and if we
> >>>> ship each jar with its own classloader, DELETE functionality can be
> >>>> introduced in the current release, at least marked as experimental or
> >>>> for developer use only, to ease the UDF development process.
> >>>>
> >>>> Any comments are welcomed.
> >>>>
> >>>> *Invariants*
> >>>>
> >>>> 1. DFS staging area where user copies jar to be loaded
> >>>>
> >>>> 2. DFS udf area (former registration area) where all validated jars
> are
> >>>> present
> >>>>
> >>>> 3. ZK function registry - contains list of all dynamically loaded UDFs
> >> and
> >>>> their jars. UDF name will be represented as combination of name and
> >> input
> >>>> parameters.
> >>>>

Re: Dynamic UDFs support

2016-07-26 Thread yuliya Feldman
I want to make sure (also will make a note in the design doc) that we have an 
option to disable dynamic loading/unloading of UDFs until we will be able to 
have an ability to do proper authentication AND authorization of the user(s).

  From: Arina Yelchiyeva 
 To: dev@drill.apache.org 
 Sent: Monday, July 25, 2016 9:09 AM
 Subject: Re: Dynamic UDFs support
   
My fault, agree, DROP is more appropriate.
Thanks Julian!

On Mon, Jul 25, 2016 at 7:07 PM Julian Hyde  wrote:

> But don't call it DELETE. In SQL the opposite of CREATE is DROP.
>
> Julian
>
> > On Jul 25, 2016, at 8:48 AM, Keys Botzum  wrote:
> >
> > I like the approach to handling DELETE. This is very useful. I think an
> implementation that does not guarantee consistent behavior is perfectly
> fine for use that is targeted at developers that are working on UDFs. As
> long as the docs make the intent clear this makes me very happy.
> >
> > I'll defer to others more expert than I on the remainder of the design.
> >
> > Keys
> > ___
> > Keys Botzum
> > Senior Principal Technologist
> > kbot...@maprtech.com 
> > 443-718-0098
> > MapR Technologies
> > http://www.mapr.com 
> >> On Jul 25, 2016, at 9:55 AM, Arina Yelchiyeva <
> arina.yelchiy...@gmail.com> wrote:
> >>
> >> Taking into account all previous comments and discussion we had with
> Parth
> >> and Paul, please find below my design notes (I am going to prepare
> proper
> >> design document, just want to see if all agree with raw version).
> >> I propose to use lazy-init for dynamically loaded UDFs: in that case,
> >> when a user issues the CREATE UDF command, the foreman will only validate
> >> the jar and update the ZK function registry, and only if a function is
> >> needed will it be loaded to the appropriate drillbit (during the planning
> >> stage or fragment execution). We might add listeners (as Paul proposed)
> >> to pre-load UDFs, but I didn't include that in the current release to
> >> simplify the solution; we might re-consider this.
> >> I have looked at the issue with class loading and unloading, and if we
> >> ship each jar with its own classloader, DELETE functionality can be
> >> introduced in the current release, at least marked as experimental or for
> >> developer use only, to ease the UDF development process.
> >>
> >> Any comments are welcomed.
> >>
> >> *Invariants*
> >>
> >> 1. DFS staging area where user copies jar to be loaded
> >>
> >> 2. DFS udf area (former registration area) where all validated jars are
> >> present
> >>
> >> 3. ZK function registry - contains list of all dynamically loaded UDFs
> and
> >> their jars. UDF name will be represented as combination of name and
> input
> >> parameters.
> >>
> >> 4. Lazy-init - all dynamically loaded UDFs will be loaded to drillbit
> upon
> >> request, i.e. if drillbits receives query or fragment that contains
> such UDF
> >>
> >> 5. Currently only CREATE and DELETE statements are supported
> >>
> >>
> >> *Adding UDFs*
> >>
> >> 1. User copies source and binary (hereinafter jar) to DFS staging area
> >> 2. User issues CREATE UDF command
> >> 3. Foreman receives request to create UDF:
> >> a) checks if jar is present in staging area
> >> b) copies jar to temporary DFS location
> >> c) validates UDFs present in jar locally:
> >> 1) copies jar to temporary local fs
> >> 2) scans jar using temporary classloader
> >> 3) checks if there are any duplicates in local function registry
> >> 4) returns list of UDFs to be registered
> >> d) validates UDFs present in jar in ZK:
> >> 1) takes list of dynamically loaded UDFs from ZK
> >> 2) checks if there are no duplicates either by jar name or among UDFs
> >> 3) moves jar from DFS temporary area to DFS udf area
> >> 4) updates ZK with list of new dynamic UDFs
> >> 5) removes jar from staging area
> >> 6) returns confirmation to user that UDFs were registered
> >>
> >>
> >> *Lazy-init*
> >>
> >> 1. User issues query with dynamically loaded UDF.
> >>
> >> 2. During planning stage or fragment execution, if UDF is not present in
> >> local function registry,  drillbit:
> >>
> >> a) checks if such UDF is present in ZK function registry
> >>
> >> b) if present, loads UDF using jar name, otherwise return an error
> >>
> >> c) proceeds planning stage or fragment execution
> >>
> >>
> >> *New drillbit registration / Drillbit re-start*
> >>
> >> Local udf directory is re-created, to clean up previously loaded jars
> if any
> >>
> >>
> >> *Delete UDF*
> >>
> >> Each jar that going to be loaded dynamically will have its own
> classloader
> >> which will solve problem with loading and unloading classes with the
> same
> >> name.
> >>
> >>
> >> 1. User issues DELETE command (delete will operate on jar name level)
> >>
> >> 2. Foreman receives DELETE request:
> >>
> >> a) checks if such jar is present in ZK function registry
> >>
> >> b) creates ephemeral znode 
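The lazy-init flow described in the quoted design can be sketched as follows. This is an illustrative model only: the ZK function registry and local registry are plain dicts, and "loading a jar" is simulated; real Drill talks to ZooKeeper and DFS.

```python
zk_registry = {"my_udf": "my_udfs.jar"}   # function name -> jar name (models ZK)
local_registry = {}                        # functions already loaded on this node
loaded_jars = set()                        # jars fetched from the DFS udf area

def resolve_udf(name):
    """Return the jar providing `name`, loading it on first use."""
    if name in local_registry:             # already loaded: nothing to do
        return local_registry[name]
    jar = zk_registry.get(name)            # a) check the ZK function registry
    if jar is None:
        raise LookupError(f"no such function: {name}")  # b) not registered
    loaded_jars.add(jar)                   # b) "load" the jar by its name
    local_registry[name] = jar
    return jar                             # c) planning/execution proceeds
```

Note that loading is idempotent: a second query using the same function hits the local registry and skips the ZK lookup entirely.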

Re: Getting java.lang.VerifyError: class io.netty.buffer.UnsafeDirectLittleEndian

2016-07-19 Thread yuliya Feldman
Could be a Netty version mismatch between the version Drill is using and the 
one your project is using.
In netty-4.0.27.Final, clear() is not "final".
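If the mismatch is confirmed (e.g. via `mvn dependency:tree -Dincludes=io.netty`), one common remedy is to pin a single Netty version in the application's POM. Treat the version below as an assumption to verify against your actual Drill build, not a confirmed fix:

```xml
<dependencyManagement>
  <dependencies>
    <!-- Force one Netty version for the whole dependency graph so that
         drill-jdbc and vert.x-web do not load incompatible ByteBuf classes. -->
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
      <version>4.0.27.Final</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Pinning via `dependencyManagement` overrides transitive version choices without excluding Netty from either library.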


  From: Rajesh Chejerla 
 To: u...@drill.apache.org; dev@drill.apache.org 
 Sent: Tuesday, July 19, 2016 6:27 AM
 Subject: Getting java.lang.VerifyError: class 
io.netty.buffer.UnsafeDirectLittleEndian
   
Hi,

I'm getting a "java.lang.VerifyError: class
io.netty.buffer.UnsafeDirectLittleEndian" error while getting a connection to
the database. This happens when I use another library (vert.x-web) along
with Apache Drill.

Could you please help with this issue.

java.lang.VerifyError: class io.netty.buffer.UnsafeDirectLittleEndian
overrides final method clear.()Lio/netty/buffer/ByteBuf;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at
io.netty.buffer.PooledByteBufAllocatorL.(PooledByteBufAllocatorL.java:56)
    at
org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:60)
    at
org.apache.drill.exec.memory.BaseAllocator.(BaseAllocator.java:44)
    at
org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:38)
    at
org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:140)
    at
org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:64)
    at
org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
    at
net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
    at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at
com.gainsight.services.data.transformer.drill.api.impl.DrillServiceImpl.submitDrillQuerySync(DrillServiceImpl.java:48)
    at
com.gainsight.services.dataprocessing.dataprocessor.dagdataprocessor.nirmata.utils.NirmataUtils.transformQuery(NirmataUtils.java:54)
    at
com.gainsight.services.dataprocessing.dataprocessor.dagdataprocessor.nirmata.executors.TrasnsformExecutor$1.execute(TrasnsformExecutor.java:30)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at
com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:310)
    at
com.nirmata.workflow.details.WorkflowManagerImpl.executeTask(WorkflowManagerImpl.java:553)
    at
com.nirmata.workflow.details.WorkflowManagerImpl.lambda$null$9(WorkflowManagerImpl.java:591)
    at
com.nirmata.workflow.queue.zookeeper.SimpleQueue.processNode(SimpleQueue.java:274)
    at
com.nirmata.workflow.queue.zookeeper.SimpleQueue.runLoop(SimpleQueue.java:228)
    at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


-- 

Thanks & Regards,
Rajesh Chejerla


  

Re: Dynamic UDFs support

2016-06-21 Thread yuliya Feldman
Just thoughts:
You can try to reuse the distributed cache and let the Drill AM do the needful 
in terms of orchestrating UDF jar distribution.
But I would be inclined to have a common path that is independent of whether it 
is Drill-on-YARN or not, as maintaining two separate ways of loading/unloading 
UDFs will be painful and error-prone.
One more note (I left a comment in the doc): I am not sure about the 
authorization model here; we need to have one.
Just my 2c.
Thanks

  From: Paul Rogers 
 To: "dev@drill.apache.org"  
 Sent: Monday, June 20, 2016 7:32 PM
 Subject: Re: Dynamic UDFs support
   
Hi Neeraja,

The proposal calls for the user to copy the jar file to each Drillbit node. The 
jar would go into a new $DRILL_HOME/jars/3rdparty/udf directory.

In Drill-on-YARN (DoY), YARN is responsible for copying Drill code to each node 
(which is good.) YARN puts that code in a location known only to YARN. Since 
the location is private to YARN, the user can’t easily hunt down the location 
in order to add the udf jar. Even if the user did find the location, the next 
Drillbit to start would create a new copy of the Drill software, without the 
udf jar.

Second, in DoY we have separated user files from Drill software. This makes it 
much easier to distribute the software to each node: we give the Drill 
distribution tar archive to YARN, and YARN copies it to each node and untars 
the Drill files. We make a separate copy of the (far smaller) set of user 
config files.

If the udf jar goes into a Drill folder ($DRILL_HOME/jars/3rdparty/udf), then 
the user would have to rebuild the Drill tar file each time they add a udf jar. 
When I tried this myself when building DoY, I found it to be slow and 
error-prone.

So, the solution is to place the udf code in the new “site” directory: 
$DRILL_SITE/jars. That’s what that is for. Then, let DoY automatically 
distribute the code to every node. Perfect! Except that it does not work to 
dynamically distribute code after Drill starts.

For DoY, the solution requirements are:

1. Distribute code using Drill itself, rather than manually copying jars to 
(unknown) Drill directories.
2. Ensure the solution works even if another Drillbit is spun up later, and 
uses the original Drill tar file.

I’m thinking we want to leverage DFS: place udf files into a well-known DFS 
directory. Register the udf into, say, ZK. When a new Drillbit starts, it looks 
for new udf jars in ZK, copies the file to a temporary location, and launches. 
An existing Drill is notified of the change and does the same download process. 
Clean-up is needed at some point to remove ZK entries if the udf jar becomes 
statically available on the next launch. That needs more thought.

We’d still need the phases mentioned earlier to ensure consistency.
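That DFS + ZK hand-off can be modeled minimally as below. This is a sketch under stated assumptions: both stores are in-memory dicts/sets standing in for DFS and ZooKeeper, and the change notification is a direct method call rather than a ZK watch.

```python
# Jars live in a well-known DFS directory; ZK records which jars are
# registered; each drillbit syncs on startup or on a change notification.
dfs_udf_dir = {"geo_udfs.jar": b"...jar bytes..."}   # models the DFS udf area
zk_registered = {"geo_udfs.jar"}                      # models ZK registrations

class Drillbit:
    def __init__(self):
        self.local_jars = {}   # jar name -> bytes in a local temp location

    def sync_udfs(self):
        """Run at startup, and again whenever notified of a ZK change."""
        for jar in zk_registered - self.local_jars.keys():
            self.local_jars[jar] = dfs_udf_dir[jar]   # copy DFS -> temp dir

bit = Drillbit()
bit.sync_udfs()   # a freshly launched drillbit picks up already-registered jars
```

The same `sync_udfs` path covers both cases Paul raises: a Drillbit spun up later from the original tar file, and an existing Drillbit notified of a new registration.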

Suggestions anyone as to how to do this super simply & still get it to work 
with DoY?

Thanks,

- Paul
 
> On Jun 20, 2016, at 7:18 PM, Neeraja Rentachintala 
>  wrote:
> 
> This will need to work with YARN (Once Drill is YARN enabled, I would
> expect a lot of users using it in conjunction with YARN).
> Paul, I am not clear why this wouldn't work with YARN. Can you elaborate.
> 
> -Neeraja
> 
> On Mon, Jun 20, 2016 at 7:01 PM, Paul Rogers  wrote:
> 
>> Good enough, as long as we document the limitation that this feature can’t
>> work with YARN deployment as users generally do not have access to the
>> temporary “localization” directories where the Drill code is placed by YARN.
>> 
>> Note that the jar distribution race condition issue occurs with the
>> proposed design: I believe I sketched out a scenario in one of the earlier
>> comments. Drillbit A receives the CREATE FUNCTION command. It tells
>> Drillbit B. While informing the other Drillbits, Drillbit B plans and
>> launches a query that uses the function. Drillbit Z starts execution of the
>> query before it learns from A about the new function. This will be rare —
>> just rare enough to create very hard to reproduce bugs.
>> 
>> The only reliable solution is to do the work in multiple passes:
>> 
>> Pass 1: Ask each node to load the function, but not make it available to
>> the planner. (it would be available to the execution engine.)
>> Pass 2: Await confirmation from each node that this is done.
>> Pass 3: Alert every node that it is now free to plan queries with the
>> function.
>> 
>> Finally, I wonder if we should design the SQL syntax based on a long-term
>> design, even if the feature itself is a short-term work-around. Changing
>> the syntax later might break scripts that users might write.
>> 
>> So, the question for the group is this: is the value of semi-complete
>> feature sufficient to justify the potential problems?
>> 
>> - Paul
>> 
>>> On Jun 20, 2016, at 6:15 PM, Parth Chandra 
>> wrote:
>>> 
>>> Moving discussion to dev.
>>> 
>>> I believe the aim is to do a simple 

[jira] [Created] (DRILL-4597) Calcite type validation assertions when planner.enable_type_inference is enabled for system tables

2016-04-08 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-4597:
-

 Summary: Calcite type validation assertions when 
planner.enable_type_inference is enabled for system tables
 Key: DRILL-4597
 URL: https://issues.apache.org/jira/browse/DRILL-4597
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Yuliya Feldman


With calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11 and type inference 
enabled, the following query fails:
select concat(hostname, ':', user_port) from sys.drillbits where `current`=true;

with the exception below:

at 
org.apache.calcite.sql.type.SqlTypeFactoryImpl.createSqlType(SqlTypeFactoryImpl.java:62)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.drill.exec.planner.sql.TypeInferenceUtils$DrillConcatSqlReturnTypeInference.inferReturnType(TypeInferenceUtils.java:420)
 ~[classes/:na]
at org.apache.calcite.sql.SqlOperator.inferReturnType(SqlOperator.java:468) 
~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:435) 
~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:287) 
~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:222) 
~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:4288)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:4275)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:130) 
~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1495)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1478)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.expandSelectItem(SqlValidatorImpl.java:440)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelectList(SqlValidatorImpl.java:3447)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2995)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:877)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210) 
~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:837)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:551)
 ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
at 
org.apache.drill.exec.planner.sql.SqlConverter.validate(SqlConverter.java:155) 
~[classes/:na]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:596)
 ~[classes/:na]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:192)
 ~[classes/:na]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164)
 ~[classes/:na]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:94)
 ~[classes/:na]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:970) 
[classes/:na]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:254) 
[classes/:na]

So far we could not reproduce it on non-system tables, but any type inference on 
a system table leads to the exception



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4412) Have an array of DrillBitEndPoints (at least) for leaf fragments instead of single one

2016-02-17 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-4412:
-

 Summary: Have an array of DrillBitEndPoints (at least) for leaf 
fragments instead of single one
 Key: DRILL-4412
 URL: https://issues.apache.org/jira/browse/DRILL-4412
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


To follow up on the ability to submit a simple physical plan directly to a 
DrillBit for execution 
[JIRA-4132|https://issues.apache.org/jira/browse/DRILL-4132], it would be 
beneficial to have an array of DrillBitEndPoint in PlanFragment. Leaf fragments 
that scan the data can have an array of DrillBitEndPoint based on data 
locality: since data may be replicated, a Scan fragment that needs to be 
restarted can be restarted on a DrillBit that holds a replica of the data, 
instead of always retrying the same DrillBit.





Re: [DISCUSS] DRILL-4132

2016-01-19 Thread yuliya Feldman
Great idea.
Created a poll [1]. Anybody interested can vote on a time.
Thanks,
Yuliya
[1] http://doodle.com/poll/c7w37sbvxh36k576


 

  From: Hanifi Gunes <hgu...@maprtech.com>
 To: dev@drill.apache.org; yuliya Feldman <yufeld...@yahoo.com> 
 Sent: Tuesday, January 19, 2016 11:44 AM
 Subject: Re: [DISCUSS] DRILL-4132
   
Do you want to create a doodle for this? [1]

-Hanifi

1: http://doodle.com/create

On Mon, Jan 18, 2016 at 11:02 PM, yuliya Feldman <
yufeld...@yahoo.com.invalid> wrote:

>
> Hello here,
> I wanted to start discussion on [1]
> Would be nice to have a hangout session with @jacques-n,
> @hnfgns, @StevenMPhillips
> Let me know suitable time
> Thanks,Yuliya
> [1] https://issues.apache.org/jira/browse/DRILL-4132


  

[DISCUSS] DRILL-4132

2016-01-18 Thread yuliya Feldman

Hello here,
I wanted to start discussion on [1]
Would be nice to have a hangout session with @jacques-n, @hnfgns, 
@StevenMPhillips
Let me know a suitable time.
Thanks,
Yuliya
[1] https://issues.apache.org/jira/browse/DRILL-4132

Re: Codehale Metrics JMXReporter Disabled?

2015-12-02 Thread yuliya Feldman
I know I was enabling it for a small project I did with Drill for Strata in 
Feb.
If you enable JMX, I believe there was a bug lurking somewhere - I think a 
static initialization order issue. I may have code somewhere that enables JMX 
and fixes the issue I described.
  From: Jacques Nadeau 
 To: dev  
 Sent: Wednesday, December 2, 2015 1:54 PM
 Subject: Re: Codehale Metrics JMXReporter Disabled?
   
Afraid not. I think it may have been debug reasons and shouldn't have been
merged.

--
Jacques Nadeau
CTO and Co-Founder, Dremio



On Wed, Dec 2, 2015 at 1:44 PM, Sudheesh Katkam 
wrote:

> Jacques,
>
> Do you happen to remember why JMXReporter was disabled <
> https://github.com/apache/drill/commit/4eea03a052d1b0a9190b9d1512088da9f81cc037#diff-5a33b44e2c23b1f09d338938c8f1e742R47
> >?
>
> Thank you,
> Sudheesh


  

Re: Zookeeper down before query starts/after query finishes

2015-11-08 Thread yuliya Feldman
In reality, if you cannot connect to ZK (and ConnectionLoss is a client-side 
error), it means either network issues on the client node itself or issues 
with the ZK quorum. In those situations, unless you eventually receive "Session 
Expiration" or "Connection reestablished", you don't know what is going on. 
What would probably be prudent is to time out if, after ConnectionLoss, you do 
not hear anything back from the ZK server for longer than the ZK client timeout 
(30 sec. by default, I think).
And again, it will need to depend on the client: in your example it is a good 
idea to fail; in other cases it may be a good idea to wait (e.g. if you deal 
with non-idempotent operations)
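The timeout policy described here could be sketched roughly as follows (illustrative only; none of these names are ZooKeeper or Drill APIs):

```java
// After a ConnectionLoss, wait up to one ZK client timeout for the server to
// say something ("reconnected" or "expired"); after that, give up.
class ConnectionLossPolicy {

    static final long ZK_CLIENT_TIMEOUT_MS = 30_000; // default mentioned above

    enum Outcome { RECONNECTED, SESSION_EXPIRED, KEEP_WAITING, GIVE_UP }

    // lastLossAtMs/nowMs are epoch millis; serverEvent is null while nothing
    // has been heard back from the ZK server since the ConnectionLoss.
    static Outcome decide(long lastLossAtMs, long nowMs, String serverEvent) {
        if ("reconnected".equals(serverEvent)) return Outcome.RECONNECTED;
        if ("expired".equals(serverEvent)) return Outcome.SESSION_EXPIRED;
        // Nothing heard back: time out once we exceed the ZK client timeout.
        return (nowMs - lastLossAtMs > ZK_CLIENT_TIMEOUT_MS)
                ? Outcome.GIVE_UP : Outcome.KEEP_WAITING;
    }
}
```

What a caller does with GIVE_UP would then depend on whether the pending operation is idempotent, as noted above.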
  From: Hsuan Yi Chu 
 To: dev@drill.apache.org 
 Sent: Sunday, November 8, 2015 9:36 AM
 Subject: Re: Zookeeper down before query starts/after query finishes
   
I just submitted a pull request to address DRILL-3751, which focuses on the
scenario where query already finishes and zookeeper dies. So Foreman cannot
delete the profiles of running queries in zookeeper.

I think in this case, after a few retries, Foreman can assume Zookeeper is
down. And, this query is assumed to fail since client might not be able to
receive the result (see the behavior in DRILL-3751).

Does this make sense?




On Fri, Nov 6, 2015 at 10:43 AM, Hsuan Yi Chu  wrote:

> My understanding is :
> Before query starts/After query finishes, Foreman will put/delete running
> query profiles in zookeeper.
>
> However, if zookeeper is down before the put/delete is successful, Drill
> would be blocked at the put/delete operation.
>
> See https://issues.apache.org/jira/browse/DRILL-3751
>
> I think it is not quite right to let Drill just wait for Zookeeper to
> respond. Does it make sense to use "time-out" here?
>
>
>


  

Re: Zookeeper down before query starts/after query finishes

2015-11-08 Thread yuliya Feldman
Did not notice your reply :)
Yes - I agree with Jacques - we should consider a variety of scenarios here.
Thanks,
Yuliya
  From: Jacques Nadeau 
 To: dev  
 Sent: Sunday, November 8, 2015 11:56 AM
 Subject: Re: Zookeeper down before query starts/after query finishes
   
I think we need to talk through a couple of different scenarios and decide
on Drill behavior in each.

Client Based
1) Initial connection to ZK from client fails
2) Client loses ZK Connection
  a) Reconnects within session timeout
  b) Cannot reconnect within session timeout (loses session)
3) ZK Connection gets reconnected with a new session (2b)

Drillbit Based
4) Drillbit initial connection fails to complete
5) Drillbit loses connection
  a) reconnects within session timeout
  b) cannot reconnect within session timeout (loses session)
6) Drillbit reestablishes connection after timeout (5b)
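The list above reads as a small decision table: the desired behavior is a function of both who lost the connection and what happened to the session. A toy encoding (purely illustrative, not Drill code):

```java
// Maps (role, event) pairs back to the item numbers in the list above,
// making explicit that e.g. 2b and 5b are distinct cases.
class ZkScenarios {

    enum Role { CLIENT, DRILLBIT }

    enum Event {
        INITIAL_CONNECT_FAILED,   // items 1 / 4
        RECONNECTED_IN_SESSION,   // items 2a / 5a
        SESSION_LOST,             // items 2b / 5b
        RECONNECTED_NEW_SESSION   // items 3 / 6
    }

    static String item(Role who, Event what) {
        boolean client = (who == Role.CLIENT);
        switch (what) {
            case INITIAL_CONNECT_FAILED:  return client ? "1"  : "4";
            case RECONNECTED_IN_SESSION:  return client ? "2a" : "5a";
            case SESSION_LOST:            return client ? "2b" : "5b";
            default:                      return client ? "3"  : "6";
        }
    }
}
```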

It seems like your initial proposal is entirely focused on item (5b) in the
list above. However, the code change affects all items 1-6. I think it
would be worthwhile to come up with a clear definition of the desired behavior
for all items 1-6. I also think the behavior in 2b should probably be very
different than in 5b.

Note, I'm not suggesting that this initial fix needs to resolve all items
to the desired behavior. However, it is hard to review the patch without
measuring it against what our target is across the items. My hope out of this
is a clear framework to review the patch as well as a number of JIRAs to
resolve issues across each of these items where there are gaps.

thanks!
jacques



--
Jacques Nadeau
CTO and Co-Founder, Dremio



On Sun, Nov 8, 2015 at 9:36 AM, Hsuan Yi Chu  wrote:

> I just submitted a pull request to address DRILL-3751, which focuses on the
> scenario where query already finishes and zookeeper dies. So Foreman cannot
> delete the profiles of running queries in zookeeper.
>
> I think in this case, after a few retries, Foreman can assume Zookeeper is
> down. And, this query is assumed to fail since client might not be able to
> receive the result (see the behavior in DRILL-3751).
>
> Does this make sense?
>
>
> On Fri, Nov 6, 2015 at 10:43 AM, Hsuan Yi Chu  wrote:
>
> > My understanding is :
> > Before query starts/After query finishes, Foreman will put/delete running
> > query profiles in zookeeper.
> >
> > However, if zookeeper is down before the put/delete is successful, Drill
> > would be blocked at the put/delete operation.
> >
> > See https://issues.apache.org/jira/browse/DRILL-3751
> >
> > I think it is not quite right to let Drill just wait for Zookeeper to
> > respond. Does it make sense to use "time-out" here?
> >
> >
> >
>


  

Re: [DISCUSS] Design Documents

2015-10-18 Thread yuliya Feldman
+1
  From: Parth Chandra 
 To: dev@drill.apache.org 
 Sent: Friday, October 16, 2015 10:21 AM
 Subject: [DISCUSS] Design Documents
   
Hi guys,

Now that 1.2 is out I wanted to bring up the exciting topic of design
documents for Drill. As the project gets more contributors, we definitely
need to start documenting our designs and also allow for a more substantial
review process. In particular, we need to make sure that there is
sufficient time for comment as well as a time limit for comments so that
developers are not left stranded. It is understood that committers should
ensure they spend enough time in reviewing designs.

I can see some substantial improvements in the works (some may even have
pull requests for initial work) and I think that this is a good time to
make sure that the design is done and understood by all before we get too
far ahead with the implementation.

[1] is an example from Spark, though that might be asking for a lot.

[2] is an example from Drill - Hash Aggregation in Drill - This is an ideal
design document. It could be improved even further perhaps by adding some
implementation level details (for example parameters that could be used to
tune Hash aggregation) that could aid QA/documentation.

What do people think? Can we start enforcing the requirement to have
reviewed design docs before submitting pull requests for *advanced*
features?

Parth

[1] http://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf
[2] https://issues.apache.org/jira/secure/attachment/12622804/DrillAggrs.pdf


   

Re: Suspicious direct memory consumption when running queries concurrently

2015-07-31 Thread yuliya Feldman
How much memory is your JVM taking?
Do you even have enough disk space to dump it?
  From: Abdel Hakim Deneche adene...@maprtech.com
 To: dev@drill.apache.org dev@drill.apache.org 
 Sent: Friday, July 31, 2015 9:19 PM
 Subject: Re: Suspicious direct memory consumption when running queries 
concurrently
   
I tried getting a jmap dump multiple times without success, each time it
crashes the jvm with the following exception:

Dumping heap to /home/mapr/private-sql-hadoop-test/framework/myfile.hprof
 ...
 Exception in thread "main" java.io.IOException: Premature EOF
        at
 sun.tools.attach.HotSpotVirtualMachine.readInt(HotSpotVirtualMachine.java:248)
        at
 sun.tools.attach.LinuxVirtualMachine.execute(LinuxVirtualMachine.java:199)
        at
 sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:217)
        at
 sun.tools.attach.HotSpotVirtualMachine.dumpHeap(HotSpotVirtualMachine.java:180)
        at sun.tools.jmap.JMap.dump(JMap.java:242)
        at sun.tools.jmap.JMap.main(JMap.java:140)


On Mon, Jul 27, 2015 at 3:45 PM, Jacques Nadeau jacq...@dremio.com wrote:

 An allocate-release cycle all on the same thread goes into a per-thread
 cache.

 A bunch of Netty arena settings are configurable.  The big issue I believe
 is that the limits are soft limits implemented by the allocation-time
 release mechanism.  As such, if you allocate a bunch of memory, then
 release it all, that won't necessarily trigger any actual chunk releases.

 --
 Jacques Nadeau
 CTO and Co-Founder, Dremio

 On Mon, Jul 27, 2015 at 12:47 PM, Abdel Hakim Deneche 
 adene...@maprtech.com
  wrote:

  @Jacques, my understanding is that chunks are not owned by a specific
  thread but they are part of a specific memory arena which is in turn only
  accessed by specific threads. Do you want me to find which threads are
  associated with the same arena where we have hanging chunks ?
 
 
  On Mon, Jul 27, 2015 at 11:04 AM, Jacques Nadeau jacq...@dremio.com
  wrote:
 
    It sounds like your statement is that we're caching too many unused
   chunks.  Hanifi and I previously discussed implementing a separate
  flushing
   mechanism to release unallocated chunks that are hanging around.  The
  main
   question is, why are so many chunks hanging around and what threads are
    they associated with.  A jmap dump and analysis should allow you to
   determine which thread owns the excess chunks.  My guess would be the
 RPC
   pool since those are long lasting (as opposed to the WorkManager pool,
   which is contracting).
  
   --
   Jacques Nadeau
   CTO and Co-Founder, Dremio
  
   On Mon, Jul 27, 2015 at 9:53 AM, Abdel Hakim Deneche 
   adene...@maprtech.com
   wrote:
  
When running a set of mostly window-function queries concurrently on a
single drillbit with 8GB max direct memory, we are seeing a continuous
increase of direct memory allocation.
   
We repeat the following steps multiple times:
- we launch an iteration of tests that runs all queries in a random
  order, 10 queries at a time
- after the iteration finishes, we wait for a couple of minutes to give
  Drill time to release the memory being held by the finishing fragments
   
Using Drill's memory logger (drill.allocator) we were able to get
snapshots of how memory was internally used by Netty. We only focused on
the number of allocated chunks: if we take this number and multiply it
by 16MB (Netty's chunk size) we get approximately the same value
reported by Drill's direct memory allocation.
Here is a graph that shows the evolution of the number of allocated
   chunks
on a 500 iterations run (I'm working on improving the plots) :
   
http://bit.ly/1JL6Kp3
   
In this specific case, after the first iteration Drill was allocating ~2GB
of direct memory; this number kept rising after each iteration to ~6GB. We
suspect this caused one of our previous runs to crash the JVM.
   
If we only focus on the log lines between iterations (when Drill's memory
usage is below 10MB), then all allocated chunks are at most 2% used. At
some point we end up with 288 nearly empty chunks, yet the next iteration
will cause more chunks to be allocated!!!
   
is this expected ?
   
PS: I am running more tests and will update this thread with more
information.
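The arithmetic quoted in this message is easy to check (plain numbers; nothing here calls Netty itself):

```java
// Back-of-the-envelope check from the thread: Drill's reported direct memory
// should roughly equal allocatedChunks * 16MB, Netty's default chunk size.
class ChunkMath {

    static final long CHUNK_SIZE_BYTES = 16L * 1024 * 1024; // 16MB

    static long approxDirectBytes(int allocatedChunks) {
        return allocatedChunks * CHUNK_SIZE_BYTES;
    }
}
```

By this estimate, 128 chunks correspond to exactly 2GB, and the 288 nearly empty chunks mentioned above pin about 4.5GB of direct memory even though almost none of it is in use.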
   
--
   
Abdelhakim Deneche
   
Software Engineer
   
     http://www.mapr.com/
   
   
Now Available - Free Hadoop On-Demand Training

   
  
 
  http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available



   
  
 
 
 
 




-- 

Abdelhakim Deneche

Software Engineer

  http://www.mapr.com/


Re: eclipse:eclipse failing

2015-06-11 Thread yuliya Feldman
I have been using mvn eclipse:eclipse w/o issues for quite a long time.
As Jacques pointed out you need to run install target first.
  From: Jacques Nadeau jacq...@apache.org
 To: dev@drill.apache.org dev@drill.apache.org 
 Sent: Wednesday, June 10, 2015 12:36 PM
 Subject: Re: eclipse:eclipse failing
   
The problem is you'll need to first run a complete mvn install -DskipTests
command before you can use eclipse:eclipse.

 Furthermore, I strongly recommend using Eclipse's import capability as we
haven't tested the eclipse:eclipse behavior with Drill.



On Mon, Jun 8, 2015 at 6:27 PM, 오진박 kimble_...@foxmail.com wrote:

 Hi, when using mvn eclipse:eclipse, I got the following errors. I need help
 to resolve them.


  

Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-15 Thread Yuliya Feldman


 On March 15, 2015, 1:28 p.m., Jacques Nadeau wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java,
   line 85
  https://reviews.apache.org/r/31107/diff/7/?file=889340#file889340line85
 
  returning null here seems weird.  When would that happen?

Since there can be > 1 partitioners, the current partitioner may not correspond to 
the index. There is a method in PartitionerDecorator that loops over the 
partitioners and returns the one that matches the index. I will add comments to the 
method.
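The lookup described here might look roughly like this (a sketch with made-up types; the real Drill Partitioner interface differs):

```java
import java.util.List;

// Hypothetical sketch: with more than one partitioner per sender, find the
// partitioner that owns a given outgoing partition index.
class PartitionerLookup {

    interface Partitioner {
        int start(); // first outgoing partition index this partitioner owns
        int end();   // one past the last index it owns
    }

    // Returns the partitioner covering 'index', or null when none matches,
    // which is why a caller can legitimately observe null here.
    static Partitioner find(List<Partitioner> partitioners, int index) {
        for (Partitioner p : partitioners) {
            if (index >= p.start() && index < p.end()) {
                return p;
            }
        }
        return null;
    }
}
```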


 On March 15, 2015, 1:28 p.m., Jacques Nadeau wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java,
   line 234
  https://reviews.apache.org/r/31107/diff/7/?file=889333#file889333line234
 
  shouldn't this parameter be instanceCount?  instanceNumber seems like 
  an index.

will do


 On March 15, 2015, 1:28 p.m., Jacques Nadeau wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java, 
  line 70
  https://reviews.apache.org/r/31107/diff/7/?file=889334#file889334line70
 
  missing description of what isClean means.

will do


- Yuliya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/#review76510
---


On March 9, 2015, 9:31 a.m., Yuliya Feldman wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31107/
 ---
 
 (Updated March 9, 2015, 9:31 a.m.)
 
 
 Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
 Venki Korukanti.
 
 
 Bugs: DRILL-2210
 https://issues.apache.org/jira/browse/DRILL-2210
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 In addition to the description:
 
 Fixed a few classes that did not handle multithreading well
 Added/Changed some Stats behavior to allow stats merges from multiple threads, 
 since again this class is not suitable for use in a multithreaded environment
 Introduced a new decorator class to handle multithreading (or not) to 
 minimize changes to the PartitionSenderRootExec class
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
 7cc350e 
   exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
 e413921 
   exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
 0e9da0e 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java
  64cf7c5 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
  7af7b65 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
  a23bd7a 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
  5ed9c39 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
  PRE-CREATION 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
  71ffd41 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java
  961b603 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
  bbfbbcb 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java
  0fb10ff 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
  3d3e96f 
   exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
 99c6ab8 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
  478 
   
 exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/31107/diff/
 
 
 Testing
 ---
 
 Still need to provide Unit Tests.
 
 Functional tests are passing
 
 Performance tests were run and look promising for some queries
 
 
 Thanks,
 
 Yuliya Feldman
 




Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-09 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated March 9, 2015, 9:31 a.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

More unit tests for failure scenarios
Further optimized the partitioner distribution algorithm
Fixed OperatorStats merging metrics


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description:

Fixed a few classes that did not handle multithreading well
Added/Changed some Stats behavior to allow stats merges from multiple threads, 
since again this class is not suitable for use in a multithreaded environment
Introduced a new decorator class to handle multithreading (or not) to minimize 
changes to the PartitionSenderRootExec class
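The stats-merge idea mentioned in the description could be sketched roughly like this (illustrative only; this is not Drill's OperatorStats):

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Sketch of merging per-thread operator metrics: each worker accumulates
// counters locally, and the owner adds them into a shared atomic array once
// the worker finishes, so no metric updates are lost across threads.
class StatsMergeSketch {

    final AtomicLongArray totals;

    StatsMergeSketch(int numMetrics) {
        totals = new AtomicLongArray(numMetrics);
    }

    // Called once per finished worker with that worker's local counters.
    void mergeFrom(long[] workerMetrics) {
        for (int i = 0; i < workerMetrics.length; i++) {
            totals.addAndGet(i, workerMetrics[i]);
        }
    }
}
```

Keeping workers on thread-local counters and merging at the end avoids contention on every update, which matters when a stats class was never designed for multithreaded use.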


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
7cc350e 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
e413921 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java
 64cf7c5 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 a23bd7a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 71ffd41 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java
 961b603 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 bbfbbcb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
0fb10ff 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 3d3e96f 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
99c6ab8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
 478 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-03-06 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated March 6, 2015, 2:59 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Added Unit tests
Added number of threads as a metric to PartitionSender operator
Fixed the partitioner distribution algorithm to distribute evenly between 
threads
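The even-distribution fix mentioned above presumably amounts to something like this classic split (a sketch, not the actual patch):

```java
// Distribute N partitioners across T threads: every thread gets N/T, and the
// first N%T threads get one extra, so no two counts differ by more than one.
class EvenSplit {

    static int[] split(int partitioners, int threads) {
        int[] counts = new int[threads];
        int base = partitioners / threads;
        int extra = partitioners % threads;
        for (int i = 0; i < threads; i++) {
            counts[i] = base + (i < extra ? 1 : 0);
        }
        return counts;
    }
}
```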


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description:

Fixed a few classes that did not handle multithreading well
Added/Changed some Stats behavior to allow stats merges from multiple threads, 
since again this class is not suitable for use in a multithreaded environment
Introduced a new decorator class to handle multithreading (or not) to minimize 
changes to the PartitionSenderRootExec class


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
7cc350e 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
e413921 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java
 64cf7c5 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 a23bd7a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 71ffd41 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java
 961b603 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 bbfbbcb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
0fb10ff 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 3d3e96f 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
99c6ab8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
 478 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-26 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

(Updated Feb. 26, 2015, 12:34 a.m.)


Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
Venki Korukanti.


Changes
---

Addressing review comments after code review with Jacques and Venki


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator to add a new column EXPRHASH with the hash expression 
for fields that are used for HashToRandomExchange.
Remove the Project operator after HashToRandomExchange (or Demux) since it will 
create problems with field ordering in HashJoin.

Tie this to MuxExchange - so if MuxExchange is enabled, the Project is inserted.


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java
 1adc54f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-25 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated Feb. 25, 2015, 11:24 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Changes based on latest code review with Jacques and Venki

1. Reuse of ExecutorService from WorkManager
2. Not using anonymous objects
3. Using runnables instead of callables
4. Other small corrections
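For context, the first and third points above can be sketched as follows: work is submitted as plain Runnables to a shared ExecutorService (analogous to reusing the WorkManager's pool rather than creating a new one), with no Future bookkeeping since there is no result to collect. Class and method names here are illustrative, not Drill's actual API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedExecutorSketch {
    // Submit partition-copy work as Runnables to a shared pool and block
    // until every task has finished.
    public static int runTasks(ExecutorService executor, int taskCount) {
        CountDownLatch done = new CountDownLatch(taskCount);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // Runnable rather than Callable: no result to collect,
            // so no Future.get() bookkeeping is needed.
            executor.execute(() -> {
                completed.incrementAndGet();
                done.countDown();
            });
        }
        try {
            done.await(); // wait for all copy/flush tasks to complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        ExecutorService shared = Executors.newFixedThreadPool(4); // created once, reused
        System.out.println(runTasks(shared, 8)); // prints 8
        shared.shutdown();
    }
}
```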


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description:

Fixed a few classes that did not handle multithreading well.
Added/changed some stats behavior to allow merging stats from multiple threads, 
since, again, that class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not) to minimize 
changes to the PartitionSenderRootExec class.
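The decorator idea described above - wrapping the partitioners so the caller issues one call and never knows whether the work ran in one thread or several - can be sketched roughly like this. The interface and class names are simplified stand-ins, not Drill's actual Partitioner/PartitionerDecorator API.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

interface Partitioner {
    void copyBatch();
}

// Decorator: presents a single copyBatch() call to the sender while
// optionally fanning the work out across an executor.
class PartitionerDecoratorSketch implements Partitioner {
    private final List<Partitioner> delegates;
    private final ExecutorService executor; // null => run inline, single-threaded

    PartitionerDecoratorSketch(List<Partitioner> delegates, ExecutorService executor) {
        this.delegates = delegates;
        this.executor = executor;
    }

    @Override
    public void copyBatch() {
        if (executor == null) {
            // Single-threaded path: same behavior as before the change.
            for (Partitioner p : delegates) p.copyBatch();
            return;
        }
        // Multithreaded path: one task per delegate, wait for all to finish.
        CountDownLatch done = new CountDownLatch(delegates.size());
        for (Partitioner p : delegates) {
            executor.execute(() -> {
                try { p.copyBatch(); } finally { done.countDown(); }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because the caller only sees the `Partitioner` interface, the sender class itself needs no changes to gain (or lose) the multithreaded behavior.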


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
7cc350e 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
e413921 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
83a89df 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
99c6ab8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
 478 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-23 Thread Yuliya Feldman


 On Feb. 23, 2015, 5:44 p.m., Jinfeng Ni wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java,
   line 112
  https://reviews.apache.org/r/30965/diff/2/?file=871428#file871428line112
 
MuxExchange has Project as its child. So, MuxExchange will have the same 
  traits as Project (addColumnprojectPrel), instead of its parent (prel).

Will definitely fix it - thank you for pointing it out


 On Feb. 23, 2015, 5:44 p.m., Jinfeng Ni wrote:
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java,
   line 127
  https://reviews.apache.org/r/30965/diff/2/?file=871428#file871428line127
 
I'm not fully clear about the motivation for inserting the hash 
  expression into Project. But here, if we remove the computed hash 
  expression, does it mean that the downstream operator will not be able to 
  refer to this computed value and will have to re-compute it?

The problem is that if we have a HashJoin later on, it is not aware of the additional 
column and will fail. So, after discussing with Jacques, we decided to 
add the Project before the HashExchange and remove it after - so to the world outside of 
Mux/HashExchange/Demux it will look as if the Project was never inserted.


- Yuliya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/#review73732
---


On Feb. 23, 2015, 4:09 p.m., Yuliya Feldman wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30965/
 ---
 
 (Updated Feb. 23, 2015, 4:09 p.m.)
 
 
 Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
 Venki Korukanti.
 
 
 Bugs: DRILL-2209
 https://issues.apache.org/jira/browse/DRILL-2209
 
 
 Repository: drill-git
 
 
 Description
 ---
 
 Insert a Project operator to add a new column, EXPRHASH, with the hash expression for 
 the fields that are used for HashToRandomExchange.
 Remove the Project operator after HashToRandomExchange (or Demux), since leaving it in 
 would break field ordering in HashJoin.
 
 Tie this to MuxExchange - so the Project is inserted only if MuxExchange is enabled.
 
 
 Diffs
 -
 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
  372c75d 
   
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30965/diff/
 
 
 Testing
 ---
 
 Need to add Unit Tests. Tested live; ran Functional and TPCH tests
 
 
 Thanks,
 
 Yuliya Feldman
 




Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-23 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated Feb. 23, 2015, 3:29 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Addressing review comments


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description:

Fixed a few classes that did not handle multithreading well.
Added/changed some stats behavior to allow merging stats from multiple threads, 
since, again, that class is not suitable for use in a multithreaded environment.
Introduced a new decorator class to handle multithreading (or not) to minimize 
changes to the PartitionSenderRootExec class.


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



[jira] [Created] (DRILL-2209) Save on CPU cycles by adding Project with column that has hash calculated

2015-02-10 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-2209:
-

 Summary: Save on CPU cycles by adding Project with column that has 
hash calculated
 Key: DRILL-2209
 URL: https://issues.apache.org/jira/browse/DRILL-2209
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow, Query Planning & Optimization
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


Related to DRILL-133. Wrap HashToRandomExchange and/or LocalExchange with a 
Project operator that adds a column representing the hash of the 
column(s) we are hashing on. This saves CPU cycles by not recalculating the 
hash every time.
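A minimal sketch of the idea - compute the hash once in a "projected" column and let the exchange route by the precomputed value instead of re-hashing - using hypothetical names, not Drill's planner or vector code:

```java
import java.util.Objects;

public class HashColumnSketch {
    // "Project" step: compute hash(key) exactly once per row and carry it
    // along as an extra column (the EXPRHASH idea).
    static int[] appendHashColumn(int[] keyColumn) {
        int[] hashes = new int[keyColumn.length];
        for (int i = 0; i < keyColumn.length; i++) {
            hashes[i] = Objects.hash(keyColumn[i]); // computed once, reused downstream
        }
        return hashes;
    }

    // "Exchange" step: pick the target receiver from the precomputed hash
    // rather than recomputing the hash at each exchange.
    static int targetReceiver(int precomputedHash, int receiverCount) {
        return Math.floorMod(precomputedHash, receiverCount);
    }
}
```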



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2210) Allow multithreaded copy and/or flush in PartitionSender

2015-02-10 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-2210:
-

 Summary: Allow multithreaded copy and/or flush in PartitionSender
 Key: DRILL-2210
 URL: https://issues.apache.org/jira/browse/DRILL-2210
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


Related to DRILL-133. Since LocalExchange merges data from multiple receivers 
and later fans it out to multiple senders, the amount of data that 
needs to be sent out increases. Add the ability to copy/flush data in multiple 
threads.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1926) Fix back pressure logic

2015-01-04 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created DRILL-1926:
-

 Summary: Fix back pressure logic
 Key: DRILL-1926
 URL: https://issues.apache.org/jira/browse/DRILL-1926
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


While enqueueing incoming requests in UnlimitedRawBatchBuffer, replies to the 
sender(s) will queue up only if the size of the queue is equal to the soft limit, 
and not when it is >= the soft limit - which means it will work only once, while 
requests keep piling up.
Also improve the logic of sending responses back to the senders: instead of 
sending them all in one shot, which can create a flood of requests again, send 
them in batches based on the difference between the soft limit and the queue size.
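The corrected back-pressure logic can be sketched as follows - hold the sender's ack whenever the queue is at or above the soft limit, and on dequeue release held acks in batches bounded by the gap between the soft limit and the current queue size. Names are hypothetical; this is not the actual UnlimitedRawBatchBuffer code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BackPressureSketch {
    private final int softLimit;
    private final Deque<Object> queue = new ArrayDeque<>();
    private final Deque<Runnable> pendingAcks = new ArrayDeque<>();

    public BackPressureSketch(int softLimit) {
        this.softLimit = softLimit;
    }

    public void enqueue(Object batch, Runnable ack) {
        queue.add(batch);
        if (queue.size() >= softLimit) { // ">=", not "==": still works after overshoot
            pendingAcks.add(ack);        // hold the ack -> sender stops pushing
        } else {
            ack.run();
        }
    }

    public Object dequeue() {
        Object batch = queue.poll();
        // Release held acks in a batch bounded by the free space below the
        // soft limit, rather than all at once (which would re-flood the buffer).
        int release = Math.min(pendingAcks.size(),
                               Math.max(0, softLimit - queue.size()));
        for (int i = 0; i < release; i++) {
            pendingAcks.poll().run();
        }
        return batch;
    }
}
```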



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)