[GitHub] drill pull request #1096: DRILL-6099 : Push limit past flatten(project) with...

2018-02-28 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1096#discussion_r171479641
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
@@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
     }
   }
 
+  public static boolean isLimit0(RexNode fetch) {
+    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
+      RexLiteral l = (RexLiteral) fetch;
+      switch (l.getTypeName()) {
+        case BIGINT:
+        case INTEGER:
+        case DECIMAL:
+          if (((long) l.getValue2()) == 0) {
+            return true;
+          }
+      }
+    }
+    return false;
+  }
+
+  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
+    assert project instanceof Project : "Rel is NOT an instance of project!";
+    try {
+      RexVisitor visitor =
--- End diff --

Would FLATTEN ever occur within other expressions? I believe it always 
occurs as an independent expression. If that's the case, it seems to me that 
having a visitor is overkill. What do you think? Even the original rewrite 
from project to flatten just iterates over the project exprs here [1].

[1]  
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/RewriteProjectToFlatten.java#L77
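
The difference between the two approaches can be sketched in plain Java. This is only an illustration: `Expr` is a hypothetical stand-in for a Calcite expression node (it is not the `RexNode` API), and `hasTopLevelFlatten` / `hasFlattenAnywhere` are made-up names. The flat scan is sufficient only under the assumption stated in the comment, namely that FLATTEN never appears nested inside another expression; the recursive scan is roughly what the visitor in the diff buys.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class FlattenCheck {
  private FlattenCheck() {}

  /** Hypothetical stand-in for a Calcite expression node; NOT the real RexNode API. */
  static final class Expr {
    final String operatorName;   // e.g. "flatten", "upper", or an input ref such as "$0"
    final List<Expr> operands;

    Expr(String operatorName, Expr... operands) {
      this.operatorName = operatorName;
      this.operands = Arrays.asList(operands);
    }
  }

  /** Flat scan: inspects only the top-level project expressions. */
  static boolean hasTopLevelFlatten(List<Expr> projects) {
    for (Expr e : projects) {
      if ("flatten".equalsIgnoreCase(e.operatorName)) {
        return true;
      }
    }
    return false;
  }

  /** Recursive scan: also finds FLATTEN nested inside other calls. */
  static boolean hasFlattenAnywhere(List<Expr> exprs) {
    for (Expr e : exprs) {
      if ("flatten".equalsIgnoreCase(e.operatorName) || hasFlattenAnywhere(e.operands)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<Expr> topLevel = Arrays.asList(new Expr("$0"), new Expr("flatten", new Expr("$1")));
    List<Expr> nested =
        Collections.singletonList(new Expr("upper", new Expr("flatten", new Expr("$1"))));

    System.out.println("top-level, flat scan:      " + hasTopLevelFlatten(topLevel));
    System.out.println("nested,    flat scan:      " + hasTopLevelFlatten(nested));
    System.out.println("nested,    recursive scan: " + hasFlattenAnywhere(nested));
  }
}
```

If FLATTEN can only be a top-level project expression, the two methods always agree and the simpler loop is enough.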


---


[GitHub] drill pull request #1096: DRILL-6099 : Push limit past flatten(project) with...

2018-02-28 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1096#discussion_r171480227
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushLimitToScanRule.java ---
@@ -55,18 +62,21 @@ public void onMatch(RelOptRuleCall call) {
     }
   };
 
-  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT =
-      new DrillPushLimitToScanRule(
-          RelOptHelper.some(DrillLimitRel.class, RelOptHelper.some(
-              DrillProjectRel.class, RelOptHelper.any(DrillScanRel.class))),
-          "DrillPushLimitToScanRule_LimitOnProject") {
+  public static DrillPushLimitToScanRule LIMIT_ON_PROJECT = new DrillPushLimitToScanRule(
+      RelOptHelper.some(DrillLimitRel.class, RelOptHelper.any(DrillProjectRel.class)), "DrillPushLimitToScanRule_LimitOnProject") {
     @Override
     public boolean matches(RelOptRuleCall call) {
       DrillLimitRel limitRel = call.rel(0);
-      DrillScanRel scanRel = call.rel(2);
-      // For now only applies to Parquet. And pushdown only apply limit but not offset,
+      DrillProjectRel projectRel = call.rel(1);
+      // pushdown only apply limit but not offset,
       // so if getFetch() return null no need to run this rule.
-      if (scanRel.getGroupScan().supportsLimitPushdown() && (limitRel.getFetch() != null)) {
--- End diff --

One implication of this: suppose the underlying Scan does not support 
Limit pushdown. You could then end up with a plan `Scan->Limit->Project->Limit` 
where the Limit above the Scan is redundant (assuming there is no FLATTEN in 
this query). Can this be avoided?
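
One possible reading of this concern, written as a guard condition. This is a sketch only and not the PR's implementation: the class, the method, and its three boolean inputs are hypothetical. In the rule itself they would presumably be derived from `limitRel.getFetch() != null`, a check such as `isProjectOutputRowcountUnknown`, and `supportsLimitPushdown()` on the group scan.

```java
final class LimitPushdownGuard {
  private LimitPushdownGuard() {}

  /**
   * Hypothetical guard: push the Limit below the Project only when doing so
   * can pay off, either because the Project contains FLATTEN or because the
   * Scan underneath can absorb the limit.
   */
  static boolean shouldPushLimitBelowProject(
      boolean hasFetch, boolean projectHasFlatten, boolean scanSupportsLimitPushdown) {
    if (!hasFetch) {
      return false; // only LIMIT is pushed, never a bare OFFSET
    }
    return projectHasFlatten || scanSupportsLimitPushdown;
  }

  public static void main(String[] args) {
    // The case raised in the review: no FLATTEN, scan cannot take the limit.
    System.out.println(shouldPushLimitBelowProject(true, false, false));
    // The case the PR targets: FLATTEN in the project.
    System.out.println(shouldPushLimitBelowProject(true, true, false));
  }
}
```

With a guard of this shape, the `Scan->Limit->Project->Limit` plan would not be produced for a plain project over a scan that cannot take the limit.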


---


[GitHub] drill pull request #1096: DRILL-6099 : Push limit past flatten(project) with...

2018-02-28 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1096#discussion_r171478117
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
@@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
     }
   }
 
+  public static boolean isLimit0(RexNode fetch) {
+    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
+      RexLiteral l = (RexLiteral) fetch;
+      switch (l.getTypeName()) {
+        case BIGINT:
+        case INTEGER:
+        case DECIMAL:
+          if (((long) l.getValue2()) == 0) {
+            return true;
+          }
+      }
+    }
+    return false;
+  }
+
+  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
+    assert project instanceof Project : "Rel is NOT an instance of project!";
+    try {
+      RexVisitor visitor =
+          new RexVisitorImpl(true) {
+            public Void visitCall(RexCall call) {
+              if ("flatten".equals(call.getOperator().getName().toLowerCase())) {
+                throw new Util.FoundOne(call); /* throw exception to interrupt tree walk (this is similar to
+                                                  other utility methods in RexUtil.java */
+              }
+              return super.visitCall(call);
+            }
+          };
+      for (RexNode rex : ((Project) project).getProjects()) {
+        rex.accept(visitor);
+      }
+    } catch (Util.FoundOne e) {
+      Util.swallow(e, null);
+      return true;
+    }
+    return false;
+  }
+
+  public static boolean isProjectOutputSchemaUnknown(RelNode project) {

Please add Javadoc for this method.


---


[GitHub] drill pull request #1096: DRILL-6099 : Push limit past flatten(project) with...

2018-02-28 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1096#discussion_r171478085
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillRelOptUtil.java ---
@@ -224,4 +226,64 @@ public Void visitInputRef(RexInputRef inputRef) {
     }
   }
 
+  public static boolean isLimit0(RexNode fetch) {
+    if (fetch != null && fetch.isA(SqlKind.LITERAL)) {
+      RexLiteral l = (RexLiteral) fetch;
+      switch (l.getTypeName()) {
+        case BIGINT:
+        case INTEGER:
+        case DECIMAL:
+          if (((long) l.getValue2()) == 0) {
+            return true;
+          }
+      }
+    }
+    return false;
+  }
+
+  public static boolean isProjectOutputRowcountUnknown(RelNode project) {
--- End diff --

Could you add Javadoc for this utility function?


---


Re: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-28 Thread Parth Chandra
Congrats Kunal.

On Thu, Mar 1, 2018 at 1:36 AM, Ramana I N  wrote:

> Congrats Kunal!
>
> Regards
> Ramana
>
>
> On Wed, Feb 28, 2018 at 11:02 AM, Robert Hou  wrote:
>
> > Congrats Kunal!
> >
> >
> > --Robert
> >
> > 
> > From: Robert Wu 
> > Sent: Wednesday, February 28, 2018 10:50 AM
> > To: dev@drill.apache.org
> > Subject: RE: [ANNOUNCE] New Committer: Kunal Khatua
> >
> > Congratulations, Kunal!
> >
> > Best regards,
> >
> > Rob
> >
> > -Original Message-
> > From: Vitalii Diravka [mailto:vitalii.dira...@gmail.com]
> > Sent: Wednesday, February 28, 2018 10:48 AM
> > To: dev@drill.apache.org
> > Subject: Re: [ANNOUNCE] New Committer: Kunal Khatua
> >
> > Congrats, Kunal!
> >
> > Kind regards
> > Vitalii
> >
> > On Wed, Feb 28, 2018 at 6:39 PM, Timothy Farkas wrote:
> >
> > > Congrats!
> > >
> > > 
> > > From: Paul Rogers 
> > > Sent: Wednesday, February 28, 2018 9:58:32 AM
> > > To: dev@drill.apache.org
> > > Subject: Re: [ANNOUNCE] New Committer: Kunal Khatua
> > >
> > > Congrats, Kunal! Well deserved.
> > >
> > > - Paul
> > >
> > >
> > > > > On Feb 27, 2018, at 10:42 AM, Prasad Nagaraj Subramanya
> > > > > <prasadn...@gmail.com> wrote:
> > > > >
> > > > > Congratulations Kunal!
> > > > >
> > > > >
> > > > > On Tue, Feb 27, 2018 at 10:41 AM, Padma Penumarthy wrote:
> > > >
> > > >> Congratulations Kunal !
> > > >>
> > > >> Thanks
> > > >> Padma
> > > >>
> > > >>
> > > > >>> On Feb 27, 2018, at 8:42 AM, Aman Sinha wrote:
> > > > >>>
> > > > >>> The Project Management Committee (PMC) for Apache Drill has invited
> > > > >>> Kunal Khatua to become a committer, and we are pleased to announce
> > > > >>> that he has accepted.
> > > > >>>
> > > > >>> Over the last couple of years, Kunal has made substantial
> > > > >>> contributions to the process of creating and interpreting query
> > > > >>> profiles, among other code contributions. He has led the efforts
> > > > >>> for Drill performance evaluation and benchmarking. He is a prolific
> > > > >>> writer on the user mailing list, providing detailed responses.
> > > > >>>
> > > > >>> Welcome Kunal, and thank you for your contributions. Keep up the
> > > > >>> good work!
> > > > >>>
> > > > >>> - Aman
> > > > >>> (on behalf of the Apache Drill PMC)
> > > >>
> > > >>
> > >
> > >
> >
>


Re: [DISCUSS] 1.13.0 release

2018-02-28 Thread Parth Chandra
Moved Ted's PRs down in the list. Let's see where we are at the end of the
week.
Arina, Volodymyr, any ETA on JDK 8 work? It's the gating factor for the
release.
Meanwhile, people, feel free to commit your work as usual.

Updated list:

DRILL-6185: Error is displaying while accessing query profiles via the
Web-UI  -- Ready to commit
DRILL-6174: Parquet pushdown planning improvements -- Ready to commit
DRILL-6188: Fix C++ client build on Centos 7 and OS X  --  Ready to commit

DRILL-1491:  Support for JDK 8 --* In progress.*

DRILL-6191: Need more information on TCP flags -- *In progress*

DRILL-6190: Packets can be bigger than strictly legal  -- *In progress*

DRILL-1170: YARN support for Drill -- Needs Committer +1 and Travis fix.

DRILL-6027: Implement spill to disk for the Hash Join   --- No PR and is a
major feature that should be reviewed (properly!).

DRILL-6173: Support transitive closure during filter push down and
partition pruning.  -- No PR and depends on 3 Apache Calcite issues that
are open.

DRILL-6023: Graceful shutdown improvements -- No PR. Consists of 6
sub-JIRAs, none of which have PRs.

On Wed, Feb 28, 2018 at 5:45 PM, Ted Dunning  wrote:

> 6190 and/or 6191 cause test failures that I have been unable to spend time
> on yet. I don't think that they are ready to commit.
>
> At least one of these is likely to be something very simple like a test
> that didn't clean up after itself. The other should be as simple, but I
> can't understand it yet. It may be a memory pressure thing rather than a
> real problem with the test.
>
>
> On Wed, Feb 28, 2018 at 3:18 AM, Parth Chandra  wrote:
>
> > OK. So let's try to get as many of the following as we can without
> > breaking anything. As far as I can see none of the open items below are
> > show stoppers for a release, but I'm happy to give in to popular demand
> > for JDK 8 :).
> >
> > Note that the last three appear to be big ticket items that have no PR
> > yet. Usually, it is a mistake to rush these into a release (one advantage
> > of frequent, predictable releases is that they won't have to wait too
> > long for the next release).
> >
> > Here's what I'm tracking :
> >
> > DRILL-6185: Error is displaying while accessing query profiles via the
> > Web-UI  -- Ready to commit
> > DRILL-6174: Parquet pushdown planning improvements -- Ready to commit
> > DRILL-6191: Need more information on TCP flags -- Ready to commit
> > DRILL-6190: Packets can be bigger than strictly legal  -- Ready to commit
> >
> > DRILL-6188: Fix C++ client build on Centos 7 and OS X  --  Needs
> > committer +1
> >
> > DRILL-1491:  Support for JDK 8 --* In progress.*
> >
> > DRILL-1170: YARN support for Drill -- Needs Committer +1 and Travis fix.
> >
> > DRILL-6027: Implement spill to disk for the Hash Join   --- No PR and is
> > a major feature that should be reviewed (properly!).
> >
> > DRILL-6173: Support transitive closure during filter push down and
> > partition pruning.  -- No PR and depends on 3 Apache Calcite issues that
> > are open.
> >
> > DRILL-6023: Graceful shutdown improvements -- No PR. Consists of 6
> > sub-JIRAs, none of which have PRs.
> >
> >
> >
> >
> >
> >
> >
> >
> > On Wed, Feb 28, 2018 at 12:32 AM, Ted Dunning wrote:
> >
> > > I have two very small improvements to PCAP support with DRILL-6190 and
> > > DRILL-6191 that I would like to get in.
> > >
> > > I think that PCAP-NG support is too far from ready.
> > >
> > >
> > >
> > > On Tue, Feb 27, 2018 at 10:52 AM, Pritesh Maker wrote:
> > >
> > > > I see a few more issues that are in review and worth including for
> > > > the 1.13 release (maybe give another week to resolve this before the
> > > > 1st RC is created?)
> > > >
> > > > DRILL-6027 Implement spill to disk for the Hash Join  -- Boaz and Tim
> > > > DRILL-6173 Support transitive closure during filter push down and
> > > > partition pruning - Vitalii
> > > > DRILL-6023 Graceful shutdown improvements -- Jyothsna
> > > >
> > > > There are several other bugs/improvements that are marked in
> > > > progress -
> > > > https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12332152
> > > > - if folks are not working on them, we should remove the fixVersion
> > > > for 1.13.
> > > >
> > > > Pritesh
> > > >
> > > >
> > > > -Original Message-
> > > > From: Abhishek Girish 
> > > > Sent: February 27, 2018 10:44 AM
> > > > To: dev@drill.apache.org
> > > > Subject: Re: [DISCUSS] 1.13.0 release
> > > >
> > > > For JDK 8, we've run through both unit tests & regression tests from
> > > > [1] and have observed no issues - so I think we should get the fixes
> > > > into 1.13.0 and claim support for JDK 8.
> > > >
> > > >
> > > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.
> > > > com_mapr_drill-2Dtest-2Dframework=DwIBaQ=
> cskdkSMqhcnjZxdQVpwTXg=
> > > > 

[GitHub] drill issue #1133: DRILL-6190 - Fix handling of packets longer than legally ...

2018-02-28 Thread tdunning
Github user tdunning commented on the issue:

https://github.com/apache/drill/pull/1133
  

Fixed the test regression. Deferring investigation into why the data field 
doesn't look unique to Drill because we will probably need to revamp how raw 
data is returned anyway.


---


[GitHub] drill pull request #1141: DRILL-6197: Skip duplicate entry for OperatorStats

2018-02-28 Thread amansinha100
Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1141#discussion_r171431227
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentStats.java ---
@@ -31,6 +32,13 @@
 public class FragmentStats {
 //  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(FragmentStats.class);
 
+  //Skip operators that already have stats reported by org.apache.drill.exec.physical.impl.BaseRootExec
+  private static final List operatorStatsInitToSkip = Lists.newArrayList(
--- End diff --

This could get out of sync with the types of senders that extend 
BaseRootExec. 


---


[GitHub] drill pull request #1141: DRILL-6197: Skip duplicate entry for OperatorStats

2018-02-28 Thread kkhatua
GitHub user kkhatua opened a pull request:

https://github.com/apache/drill/pull/1141

DRILL-6197: Skip duplicate entry for OperatorStats

`org.apache.drill.exec.ops.FragmentStats` should skip injecting the 
`org.apache.drill.exec.ops.OperatorStats` instance for these operators:
```
org.apache.drill.exec.proto.beans.CoreOperatorType.SCREEN
org.apache.drill.exec.proto.beans.CoreOperatorType.SINGLE_SENDER
org.apache.drill.exec.proto.beans.CoreOperatorType.BROADCAST_SENDER
org.apache.drill.exec.proto.beans.CoreOperatorType.HASH_PARTITION_SENDER
```
They all use the `org.apache.drill.exec.physical.impl.BaseRootExec` to 
inject the correct statistics.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkhatua/drill DRILL-6197

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1141.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1141


commit f61e0416b10ebf9826540bb0bbe7de5d826de029
Author: Kunal Khatua 
Date:   2018-02-28T21:58:09Z

DRILL-6197: Skip duplicate entry for OperatorStats

org.apache.drill.exec.ops.FragmentStats should skip injecting the 
org.apache.drill.exec.ops.OperatorStats instance for these operators:
org.apache.drill.exec.proto.beans.CoreOperatorType.SCREEN
org.apache.drill.exec.proto.beans.CoreOperatorType.SINGLE_SENDER
org.apache.drill.exec.proto.beans.CoreOperatorType.BROADCAST_SENDER
org.apache.drill.exec.proto.beans.CoreOperatorType.HASH_PARTITION_SENDER




---


[GitHub] drill issue #1141: DRILL-6197: Skip duplicate entry for OperatorStats

2018-02-28 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1141
  
@amansinha100  please review


---


[jira] [Created] (DRILL-6197) Duplicate entries in inputProfiles of minor fragments for specific operators

2018-02-28 Thread Kunal Khatua (JIRA)
Kunal Khatua created DRILL-6197:
---

 Summary: Duplicate entries in inputProfiles of minor fragments for 
specific operators
 Key: DRILL-6197
 URL: https://issues.apache.org/jira/browse/DRILL-6197
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Monitoring
Affects Versions: 1.12.0
Reporter: Kunal Khatua
Assignee: Kunal Khatua
 Fix For: 1.13.0


Minor fragments for the following operators show duplicate entries of the 
inputProfile ({{org.apache.drill.exec.ops.OperatorStats}} instance) when viewed 
in the Profile UI.
For example:
{code:json}
{
...
"query": "select * from sys.version",
...
[ ...
{
"inputProfile": [{
"records": 0,
"batches": 0,
"schemas": 0
}],
"operatorId": 0,
"operatorType": 13,
"setupNanos": 0,
"processNanos": 0,
"peakLocalMemoryAllocated": 27131904,
"waitNanos": 0
},
{
"inputProfile": [{
"records": 1,
"batches": 1,
"schemas": 1
}],
"operatorId": 0,
"operatorType": 13,
"setupNanos": 0,
"processNanos": 752448,
"peakLocalMemoryAllocated": 27131904,
"metric": [{
"metricId": 0,
"longValue": 178
}],
"waitNanos": 889492
}]
...
}
{code}

{{operatorType: 13}} is the screen operator, for which there can be only one 
inputProfile.
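
A quick way to spot this symptom in a parsed profile is to count entries per operator id. The sketch below is illustrative only: `OperatorEntry` is a hypothetical, trimmed-down view of one operator entry (it is not a Drill class), and the sample values are taken from the JSON excerpt above plus one made-up non-duplicated entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateOperatorProfiles {
  private DuplicateOperatorProfiles() {}

  /** Hypothetical view of one operator entry in a minor fragment profile. */
  static final class OperatorEntry {
    final int operatorId;
    final int operatorType;
    final long records;

    OperatorEntry(int operatorId, int operatorType, long records) {
      this.operatorId = operatorId;
      this.operatorType = operatorType;
      this.records = records;
    }
  }

  /** Returns the operator ids that occur more than once, in first-seen order. */
  static List<Integer> duplicatedOperatorIds(List<OperatorEntry> entries) {
    Map<Integer, Integer> counts = new LinkedHashMap<>();
    for (OperatorEntry e : entries) {
      counts.merge(e.operatorId, 1, Integer::sum);
    }
    List<Integer> duplicates = new ArrayList<>();
    for (Map.Entry<Integer, Integer> c : counts.entrySet()) {
      if (c.getValue() > 1) {
        duplicates.add(c.getKey());
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    List<OperatorEntry> entries = Arrays.asList(
        new OperatorEntry(0, 13, 0),   // zero-valued entry from the excerpt
        new OperatorEntry(0, 13, 1),   // populated entry for the same operator
        new OperatorEntry(1, 10, 1));  // made-up entry for a different operator
    System.out.println("Duplicated operator ids: " + duplicatedOperatorIds(entries));
  }
}
```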

It turns out that, by default, every operator in a minor fragment is given a 
list of inputProfiles by 
{{org.apache.drill.exec.ops.FragmentStats.newOperatorStats(OpProfileDef, 
BufferAllocator)}}. However, for the following 4 operators, the 
{{org.apache.drill.exec.physical.impl.BaseRootExec}} constructors also inject 
{{OperatorStats}}.

{code:java}
org.apache.drill.exec.proto.beans.CoreOperatorType.SCREEN
org.apache.drill.exec.proto.beans.CoreOperatorType.SINGLE_SENDER
org.apache.drill.exec.proto.beans.CoreOperatorType.BROADCAST_SENDER
org.apache.drill.exec.proto.beans.CoreOperatorType.HASH_PARTITION_SENDER
{code}

All updates to the inputProfiles are done by the latter, while the former only 
reports zero values.

The workaround is to have {{org.apache.drill.exec.ops.FragmentStats}} skip 
injecting the {{org.apache.drill.exec.ops.OperatorStats}} instance for these 
operators.
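
The shape of that workaround can be sketched as follows. This is a simplified model, not Drill's code: `FragmentStatsSketch`, `OpType`, and `OpStats` are hypothetical stand-ins, and only the four operator names in the skip set come from the description above.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class FragmentStatsSketch {
  /** Hypothetical subset of operator types; PROJECT is included only for contrast. */
  enum OpType { SCREEN, SINGLE_SENDER, BROADCAST_SENDER, HASH_PARTITION_SENDER, PROJECT }

  /** Hypothetical stand-in for an operator's statistics holder. */
  static final class OpStats {
    final OpType type;

    OpStats(OpType type) {
      this.type = type;
    }
  }

  // Operators whose root executor registers its own stats instance.
  private static final Set<OpType> REGISTERED_BY_ROOT_EXEC = EnumSet.of(
      OpType.SCREEN, OpType.SINGLE_SENDER, OpType.BROADCAST_SENDER, OpType.HASH_PARTITION_SENDER);

  private final List<OpStats> operators = new ArrayList<>();

  /** Creates stats for an operator; registers them unless the root executor will. */
  OpStats newOperatorStats(OpType type) {
    OpStats stats = new OpStats(type);
    if (!REGISTERED_BY_ROOT_EXEC.contains(type)) {
      operators.add(stats);
    }
    return stats;
  }

  /** Models the registration performed by the root executor. */
  void addOperatorStats(OpStats stats) {
    operators.add(stats);
  }

  int reportedCount() {
    return operators.size();
  }

  public static void main(String[] args) {
    FragmentStatsSketch fragment = new FragmentStatsSketch();
    OpStats screen = fragment.newOperatorStats(OpType.SCREEN); // skipped here
    fragment.addOperatorStats(screen);                         // registered once by the root executor
    fragment.newOperatorStats(OpType.PROJECT);                 // registered here
    System.out.println("Reported operator entries: " + fragment.reportedCount());
  }
}
```

A fixed set like this is simple, but it has to be updated by hand whenever a new operator starts registering its own stats.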



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] drill issue #1101: DRILL-6032: Made the batch sizing for HashAgg more accura...

2018-02-28 Thread cchang738
Github user cchang738 commented on the issue:

https://github.com/apache/drill/pull/1101
  
My tests fail with OOM. @ilooner has the test log.


---


Re: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-28 Thread Ramana I N
Congrats Kunal!

Regards
Ramana




[GitHub] drill issue #1140: DRILL-6195: Quering Hive non-partitioned transactional ta...

2018-02-28 Thread vdiravka
Github user vdiravka commented on the issue:

https://github.com/apache/drill/pull/1140
  
@arina-ielchiieva The current implementation of schema creation for the Drill 
Hive embedded metastore does not allow creating transactional tables. That's 
why I have created a separate Jira task to update Drill HiveTestGenerator - 
[DRILL-6196](https://issues.apache.org/jira/browse/DRILL-6196).


---


[GitHub] drill issue #1129: DRILL-6180: Use System Option "output_batch_size" for Ext...

2018-02-28 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/1129
  
@paul-rogers Made the change you suggested. Please take a look when you get 
a chance. 


---


Re: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-28 Thread Robert Hou
Congrats Kunal!


--Robert




[GitHub] drill issue #1101: DRILL-6032: Made the batch sizing for HashAgg more accura...

2018-02-28 Thread priteshm
Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/1101
  
Spoke with Chun, he will run the tests and update the PR with the test 
results.


---


[GitHub] drill issue #1137: DRILL-6185: Fixed error while displaying system profiles ...

2018-02-28 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1137
  
The purpose of parsing the plan as a String is primarily to extract the 
alternative operator names for the UI. The rest of the items are irrelevant for 
that use case. Were you looking for a way to deserialize the plan text so that 
other scenarios could make use of it?


---


RE: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-28 Thread Robert Wu
Congratulations, Kunal!

Best regards,

Rob



Re: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-28 Thread Vitalii Diravka
Congrats, Kunal!

Kind regards
Vitalii



Re: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-28 Thread Timothy Farkas
Congrats!





[GitHub] drill issue #1140: DRILL-6195: Quering Hive non-partitioned transactional ta...

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1140
  
Looks good, could you add unit test?


---


Re: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-28 Thread Paul Rogers
Congrats, Kunal! Well deserved.

- Paul





[GitHub] drill issue #1125: DRILL-6126: Allocate memory for value vectors upfront in ...

2018-02-28 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/1125
  
@paul-rogers Paul, thanks a lot for your review comments and for bringing up 
some good issues. Just to let you know: I am refactoring the batch sizer code 
and writing a bunch of unit tests to test sizing and vector allocation for all 
the different vector types. I found some bugs in the process and fixed them. I 
will be posting new changes soon and will need your review once they are 
ready.


---


Avro storage format behaviour

2018-02-28 Thread Vova Vysotskyi
Hi all,

I am working on DRILL-4120: dir0 does not work when the directory structure
contains Avro files.

DRILL-3810 added validation of the query against the Avro schema before
query execution starts.
As a result, Drill throws an exception when the query references a
non-existent column and the table is in Avro format.
Other storage formats, such as JSON or Parquet, allow the use of non-existent
fields.

So here is my question: should we continue to treat avro as a format with
fixed schema, or we should start treating avro as a dynamic format to be
consistent with other storage formats?
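
The two behaviours in question can be contrasted with a small sketch. It is a simplified model and not Drill's reader code: `ColumnResolutionSketch`, `project`, and the `fixedSchema` flag are hypothetical names, and returning null for a missing column is an assumed simplification of the dynamic-format behaviour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ColumnResolutionSketch {
  private ColumnResolutionSketch() {}

  /**
   * Projects the requested columns out of one record.
   * fixedSchema = true  : unknown columns are rejected up front (schema validation).
   * fixedSchema = false : unknown columns are allowed and come back as null.
   */
  static Map<String, Object> project(Map<String, Object> record, Set<String> schemaColumns,
                                     List<String> requested, boolean fixedSchema) {
    if (fixedSchema) {
      for (String column : requested) {
        if (!schemaColumns.contains(column)) {
          throw new IllegalArgumentException("Column not found in schema: " + column);
        }
      }
    }
    Map<String, Object> row = new LinkedHashMap<>();
    for (String column : requested) {
      row.put(column, record.get(column)); // null when the record has no such field
    }
    return row;
  }

  public static void main(String[] args) {
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("id", 1);
    record.put("name", "a");
    Set<String> schema = new HashSet<>(Arrays.asList("id", "name"));
    List<String> requested = Arrays.asList("id", "no_such_column");

    System.out.println("dynamic: " + project(record, schema, requested, false));
    try {
      project(record, schema, requested, true);
    } catch (IllegalArgumentException e) {
      System.out.println("fixed:   " + e.getMessage());
    }
  }
}
```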

-- 
Kind regards,
Volodymyr Vysotskyi


[GitHub] drill issue #1112: DRILL-6114: Metadata revisions

2018-02-28 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1112
  
@arina-ielchiieva, @parthchandra can either of you perhaps give this one a 
committer review? Thanks! 


---


[GitHub] drill issue #1140: DRILL-6195: Quering Hive non-partitioned transactional ta...

2018-02-28 Thread vdiravka
Github user vdiravka commented on the issue:

https://github.com/apache/drill/pull/1140
  
@arina-ielchiieva Please review


---


[GitHub] drill pull request #1135: DRILL-6040: Added usage for graceful_stop in drill...

2018-02-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1135#discussion_r171319850
  
--- Diff: distribution/src/resources/drillbit.sh ---
@@ -45,7 +45,7 @@
 # configuration file. The option takes precedence over the
 # DRILL_CONF_DIR environment variable.
 #
-# The command is one of: start|stop|status|restart|run
+# The command is one of: start|stop|status|restart|run|graceful_stop
--- End diff --

This command will be typed by hand sometimes. Can we find a shorter 
command? Retire? Remove? Or, should stop be changed to do this, with a new kill 
for the "ungraceful" stop?


---


[GitHub] drill pull request #1140: DRILL-6195: Quering Hive non-partitioned transacti...

2018-02-28 Thread vdiravka
GitHub user vdiravka opened a pull request:

https://github.com/apache/drill/pull/1140

DRILL-6195: Quering Hive non-partitioned transactional tables via Drill



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vdiravka/drill DRILL-6195

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1140.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1140


commit 655075f9cea5f41e2770924ceabee79a57ea323e
Author: Vitalii Diravka 
Date:   2018-02-28T14:16:17Z

DRILL-6195: Quering Hive non-partitioned transactional tables via Drill




---


[GitHub] drill issue #1138: DRILL-4120: Allow implicit columns for Avro storage forma...

2018-02-28 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1138
  
General comment: consider moving to the new scan framework. It handles 
implicit columns for all file-based readers. It also handles projection, 
missing columns, etc.


---


[GitHub] drill pull request #1129: DRILL-6180: Use System Option "output_batch_size" ...

2018-02-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1129#discussion_r171314892
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java
 ---
@@ -71,8 +72,8 @@
 
   private final int mSortBatchSize;
 
-  public SortConfig(DrillConfig config) {
-
+  public SortConfig(FragmentContext context) {
+DrillConfig config = context.getConfig();
--- End diff --

Suggestion: pass in the original `DrillConfig` plus an option manager 
rather than the fragment context. The suggestion minimizes undesired 
dependencies.
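
A minimal sketch of the shape this suggestion points at. The interfaces `BootConfig` and `OptionLookup` and the config key `sort.batch.size` are hypothetical stand-ins, not Drill's real `DrillConfig` or option manager; only the field name `mSortBatchSize` and the `output_batch_size` option name are taken from this pull request.

```java
// Hypothetical stand-ins for illustration; these are NOT Drill's real types.
interface BootConfig {
    int getInt(String path);
}

interface OptionLookup {
    long getLong(String name);
}

final class SortConfigSketch {
    private final int mSortBatchSize;
    private final long outputBatchSize;

    // The constructor depends only on the two narrow sources it reads from,
    // rather than on a whole fragment context.
    SortConfigSketch(BootConfig config, OptionLookup options) {
        this.mSortBatchSize = config.getInt("sort.batch.size"); // hypothetical key
        this.outputBatchSize =
            options.getLong("drill.exec.memory.operator.output_batch_size");
    }

    int sortBatchSize() {
        return mSortBatchSize;
    }

    long outputBatchSize() {
        return outputBatchSize;
    }
}
```

With this shape a unit test can build the config from two lambdas, for example `new SortConfigSketch(path -> 1024, name -> 16777216L)`, without constructing a fragment context.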


---


[GitHub] drill pull request #1129: DRILL-6180: Use System Option "output_batch_size" ...

2018-02-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1129#discussion_r171315310
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -421,7 +416,7 @@ drill.exec.options: {
 drill.exec.storage.implicit.fqn.column.label: "fqn",
 drill.exec.storage.implicit.suffix.column.label: "suffix",
 drill.exec.testing.controls: "{}",
-drill.exec.memory.operator.output_batch_size : 33554432, # 32 MB
+drill.exec.memory.operator.output_batch_size : 16777216, # 16 MB
--- End diff --

Thanks for making this adjustment.


---


[jira] [Created] (DRILL-6196) Upgrade HiveTestDataGenerator to leverage "schematool"

2018-02-28 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-6196:
--

 Summary: Upgrade HiveTestDataGenerator to leverage "schematool"
 Key: DRILL-6196
 URL: https://issues.apache.org/jira/browse/DRILL-6196
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build & Test
Reporter: Vitalii Diravka


Since version 2.0, Hive uses 
["schematool"|https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool]
 to create the necessary schema in the metastore on startup if one doesn't 
exist.
 The old method, which uses the datanucleus property "METASTORE_AUTO_CREATE_ALL", is 
[deprecated|https://github.com/apache/hive/blob/branch-2.1/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L718].

That is especially needed to add test cases for transactional tables - 
[https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] drill pull request #1139: DRILL-6189: Security: passwords logging and file p...

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1139#discussion_r171308023
  
--- Diff: 
protocol/src/main/java/org/apache/drill/exec/proto/UserProtos.java ---
@@ -5798,6 +5798,34 @@ public static UserToBitHandshake 
getDefaultInstance() {
 public UserToBitHandshake getDefaultInstanceForType() {
   return defaultInstance;
 }
+public String safeLogString() {
--- End diff --

You cannot add custom methods to proto buffers. Also consider using tabs 
instead of multiple spaces.
Please add to the Jira an example of how the log files looked before and after 
your changes.
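
One possible alternative, sketched under assumptions: keep the masking logic in a separate utility class instead of inside the generated class. The class name `SafeLog`, the use of a plain `Map` as a stand-in for the handshake properties, and the rule "mask any key containing `password`" are all hypothetical and are not taken from the pull request.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper that lives outside any generated protobuf class.
final class SafeLog {
    private SafeLog() {}

    // Renders key=value pairs, masking the value of any key that contains
    // "password" (case-insensitive). Keys keep the map's iteration order.
    static String safeLogString(Map<String, String> properties) {
        StringBuilder sb = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, String> e : properties.entrySet()) {
            boolean sensitive =
                e.getKey().toLowerCase(Locale.ROOT).contains("password");
            sb.append(separator)
              .append(e.getKey())
              .append('=')
              .append(sensitive ? "********" : e.getValue());
            separator = ", ";
        }
        return sb.append('}').toString();
    }
}
```

Because the helper is not part of the generated file, regenerating the protobuf sources would not discard it.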


---


[GitHub] drill pull request #1139: DRILL-6189: Security: passwords logging and file p...

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1139#discussion_r171307607
  
--- Diff: 
logical/src/main/java/org/apache/drill/common/config/LogicalPlanPersistence.java
 ---
@@ -52,6 +53,7 @@ public LogicalPlanPersistence(DrillConfig conf, 
ScanResult scanResult) {
 mapper.configure(Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
 mapper.configure(JsonGenerator.Feature.QUOTE_FIELD_NAMES, true);
 mapper.configure(Feature.ALLOW_COMMENTS, true);
+mapper.setFilterProvider(new 
SimpleFilterProvider().setFailOnUnknownId(false));
--- End diff --

Will filtering passwords work when profiles are sent between nodes (i.e. 
when we have several major fragments)?


---


[GitHub] drill pull request #1139: DRILL-6189: Security: passwords logging and file p...

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1139#discussion_r171307292
  
--- Diff: 
contrib/storage-jdbc/src/main/java/org/apache/drill/exec/store/jdbc/JdbcStorageConfig.java
 ---
@@ -17,13 +17,15 @@
  */
 package org.apache.drill.exec.store.jdbc;
 
+import com.fasterxml.jackson.annotation.JsonFilter;
 import org.apache.drill.common.logical.StoragePluginConfig;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
 
 @JsonTypeName(JdbcStorageConfig.NAME)
+@JsonFilter("passwordFilter")
--- End diff --

Please explain how this works?


---


[GitHub] drill pull request #1139: DRILL-6189: Security: passwords logging and file p...

2018-02-28 Thread vladimirtkach
GitHub user vladimirtkach opened a pull request:

https://github.com/apache/drill/pull/1139

DRILL-6189: Security: passwords logging and file permisions

1. Overrode serialization methods for instances with passwords
2. Changed file permissions for configuration files

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vladimirtkach/drill DRILL-6189

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1139


commit 9bf7f464fe921cef92ad9802f56c75b72064b0aa
Author: Vladimir Tkach 
Date:   2018-02-28T11:10:50Z

DRILL-6189: Security: passwords logging and file permisions

1. Overrided serialization methods for instances with passwords
2. Changed file permissions for configuration files




---


[jira] [Created] (DRILL-6195) Quering Hive non-partitioned transactional tables via Drill

2018-02-28 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-6195:
--

 Summary: Quering Hive non-partitioned transactional tables via 
Drill
 Key: DRILL-6195
 URL: https://issues.apache.org/jira/browse/DRILL-6195
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive
Affects Versions: 1.12.0
Reporter: Vitalii Diravka
Assignee: Vitalii Diravka
 Fix For: 1.13.0


After updating the Hive client, Drill can query Hive partitioned bucketed tables.
The same logic can be used for Hive non-partitioned transactional bucketed 
tables.

Use case:
{code}
Hive
CREATE TABLE test_txn_2 (userid VARCHAR(64), link STRING, came_from STRING)
CLUSTERED BY (userid) INTO 8 BUCKETS STORED AS ORC
TBLPROPERTIES (
 'transactional'='true'
);
INSERT INTO TABLE test_txn_2 VALUES ('jsmith', 'mail.com', 'sports.com'), 
('jdoe', 'mail.com', null);
{code}
{code}
0: jdbc:drill:> select * from hive.test_txn_2;
Error: SYSTEM ERROR: IOException: Open failed for file: 
/user/hive/warehouse/test_txn_2, error: Invalid argument (22)

Setup failed for null
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6194) Allow un-caching of parquet metadata or stop queries from failing when metadata is old.

2018-02-28 Thread John Humphreys (JIRA)
John Humphreys created DRILL-6194:
-

 Summary: Allow un-caching of parquet metadata or stop queries from 
failing when metadata is old.
 Key: DRILL-6194
 URL: https://issues.apache.org/jira/browse/DRILL-6194
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.10.0
Reporter: John Humphreys


Let's say you have files stored in the standard hierarchical way and the data 
is held in parquet:
 * year/
 ** month/
 *** day/
  filev2.parquet

If you cache the metadata under year/ or one of the other levels, and then you 
replace filev2.parquet with filev3.parquet, you will get errors when running 
queries, complaining that filev2.parquet is not present.

I'm specifically seeing this when using maxdir(), and dir0/1/2 for 
year/month/day, but I suspect it's a general issue.

Queries using cached metadata should not fail if the metadata is outdated; they 
should just choose not to use it.  Otherwise there should be an uncache 
operator for the metadata so people can just decide to stop using it.

It's not always efficient to run a metadata refresh before every single query 
you do, and it's difficult to run one from every program that touches HDFS files 
immediately after it touches them.
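
The "choose not to use it" behaviour requested above could look roughly like the following sketch. The class `MetadataCacheCheck`, its method, and the idea of passing an existence check as a `Predicate` are hypothetical illustrations, not Drill's actual metadata cache code.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch: treat a stale cache as "no cache" instead of failing.
final class MetadataCacheCheck {
    private MetadataCacheCheck() {}

    // Returns the cached file list only if every cached path still exists.
    // Otherwise returns empty, so the caller can fall back to listing the
    // directory as if no metadata had been cached.
    static Optional<List<String>> usableCache(List<String> cachedFiles,
                                              Predicate<String> exists) {
        for (String file : cachedFiles) {
            if (!exists.test(file)) {
                return Optional.empty();
            }
        }
        return Optional.of(cachedFiles);
    }
}
```

In the scenario described, a cache listing filev2.parquet would be rejected once only filev3.parquet exists, and the query would proceed without the cache. The cost is one existence check per cached file, which may or may not be acceptable on a large tree.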

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] 1.13.0 release

2018-02-28 Thread Ted Dunning
6190 and/or 6191 cause test failures that I have been unable to spend time
on yet. I don't think that they are ready to commit.

At least one of these is likely to be something very simple like a test
that didn't clean up after itself. The other should be as simple, but I
can't understand it yet. It may be a memory pressure thing rather than a
real problem with the test.



[GitHub] drill issue #1138: DRILL-4120: Allow implicit columns for Avro storage forma...

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1138
  
You are basically reverting the changes made in DRILL-3810 [1] to support schema 
validation in Avro. 
The Avro format is strict and has a schema. Should Drill treat it the same way, or 
should it loosen parsing?

We should evaluate the option of keeping the schema for Avro but adding 
implicit columns. The change may not be as easy as changing 
`AvroDrillTable` to `DynamicDrillTable`, but it might be more correct.

You can also start a mailing thread on the dev / user list, asking about treating 
Avro as a dynamic format (listing pros and cons), and get feedback from the users. 

[1] https://issues.apache.org/jira/browse/DRILL-3810


---


Re: [DISCUSS] 1.13.0 release

2018-02-28 Thread Parth Chandra
OK. So let's try to get as many of the following as we can without breaking
anything. As far as I can see none of the open items below are show
stoppers for a release, but I'm happy to give in to popular demand for JDK
8 :).

Note that the last three appear to be big ticket items that have no PR yet.
Usually, it is a mistake to rush these into a release (one advantage of
frequent, predictable releases is that they won't have to wait too long for
the next release).

Here's what I'm tracking :

DRILL-6185: Error is displaying while accessing query profiles via the
Web-UI  -- Ready to commit
DRILL-6174: Parquet pushdown planning improvements -- Ready to commit
DRILL-6191: Need more information on TCP flags -- Ready to commit
DRILL-6190: Packets can be bigger than strictly legal  -- Ready to commit

DRILL-6188: Fix C++ client build on Centos 7 and OS X  --  Needs committer
+1

DRILL-1491:  Support for JDK 8 -- In progress.

DRILL-1170: YARN support for Drill -- Needs Committer +1 and Travis fix.

DRILL-6027: Implement spill to disk for the Hash Join   --- No PR and is a
major feature that should be reviewed (properly!).

DRILL-6173: Support transitive closure during filter push down and
partition pruning.  -- No PR and depends on 3 Apache Calcite issues that
are open.

DRILL-6023: Graceful shutdown improvements -- No PR. Consists of 6 sub-JIRAs,
none of which have PRs.








On Wed, Feb 28, 2018 at 12:32 AM, Ted Dunning  wrote:

> I have two very small improvements to PCAP support with DRILL-6190 and
> DRILL-6191 that I would like to get in.
>
> I think that PCAP-NG support is too far from ready.
>
>
>
> On Tue, Feb 27, 2018 at 10:52 AM, Pritesh Maker  wrote:
>
> > I see a few more issues that are in review and worth including for the
> > 1.13 release (maybe give another week to resolve this before the 1st RC is
> > created?)
> >
> > DRILL-6027 Implement spill to disk for the Hash Join  -- Boaz and Tim
> > DRILL-6173 Support transitive closure during filter push down and
> > partition pruning - Vitalii
> > DRILL-6023 Graceful shutdown improvements -- Jyothsna
> >
> > There are several other bugs/improvements that are marked in progress -
> > https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12332152
> > - if folks are not working on them, we should remove the fixVersion for
> > 1.13.
> >
> > Pritesh
> >
> >
> > -Original Message-
> > From: Abhishek Girish 
> > Sent: February 27, 2018 10:44 AM
> > To: dev@drill.apache.org
> > Subject: Re: [DISCUSS] 1.13.0 release
> >
> > For JDK 8, we've run through both unit tests & regression tests from [1]
> > and have observed no issues - so i think we should get the fixes into
> > 1.13.0 and claim support for JDK 8.
> >
> >
> > [1] https://github.com/mapr/drill-test-framework
> >
> > On Tue, Feb 27, 2018 at 9:46 AM, Aman Sinha 
> wrote:
> >
> > > Agree with Arina on JDK8 support..let's get through the last remaining
> > > hurdles for this for 1.13 release.
> > > Calcite in fact is dropping support for JDK 7 in their next release.
> > > They are even looking ahead to JDK 9.
> > >
> > > For Drill-on-Yarn, I think ultimately it is at the discretion of the
> > > release manager.  My personal opinion is as long as a committer (in
> > > this case Arina) feels comfortable with it, and given that the
> > > functionality has been well-tested, I am okay to include it.
> > >
> > > -Aman
> > >
> > > On Tue, Feb 27, 2018 at 8:51 AM, Arina Yelchiyeva <
> > > arina.yelchiy...@gmail.com> wrote:
> > >
> > > > I want to include DRILL-6174. It has already passed code review.
> > > >
> > > > > Regarding JDK8 support. Volodymyr Tkach is working on this issue. Currently
> > > > > all unit tests have passed. Now he is working on enforcing Java 8 (changes
> > > > > in travis.yml, drill-config.sh, pom.xml etc).
> > > >
> > > > Regarding Drill on Yarn, Salim has done code review. Failing Travis
> > > > check is easy to fix. Tim has already proposed the solution (Paul
> > > > just needs to add the dependency).
> > > > > I think it's safe to include these changes in this release. They go
> > > > > in a separate module and have no impact on any existing functionality.
> > > > > Even if there are some flaws, users will be able to give it a try
> > > > > and provide feedback in case of issues.
> > > >
> > > > On Tue, Feb 27, 2018 at 8:53 AM, Parth Chandra 
> > > wrote:
> > > >
> > > > > > There are two issues marked as blockers for 1.13.0 -

[GitHub] drill issue #1132: DRILL-6188: Fix C++ client build on Centos7, OS X

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1132
  
+1


---


[GitHub] drill pull request #1138: DRILL-4120: Allow implicit columns for Avro storag...

2018-02-28 Thread vvysotskyi
GitHub user vvysotskyi opened a pull request:

https://github.com/apache/drill/pull/1138

DRILL-4120: Allow implicit columns for Avro storage format

The existing implementation of `AvroDrillTable` does not allow dynamic column 
discovery. The `AvroDrillTable.getRowType()` method returns a `RelDataTypeImpl` 
instance with the list of all table columns. This forces the validator to check 
the columns from the select list against the `RowType` list, which makes it 
impossible to use implicit columns.

This fix replaces the usage of `AvroDrillTable` with `DynamicDrillTable` for the 
Avro format and also allows non-existent columns to be used in Avro tables, to be 
consistent with other storage formats.
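
The difference between the two behaviours can be modelled with a small sketch. The types `TableSketch`, `FixedSchemaTable` and `DynamicTable` are hypothetical and only illustrate the validation outcome; they are not Drill's `AvroDrillTable` / `DynamicDrillTable` or any Calcite class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the two validation behaviours.
interface TableSketch {
    boolean acceptsColumn(String name);
}

// Fixed schema: only columns declared in the row type pass validation,
// so an implicit column such as dir0 is rejected.
final class FixedSchemaTable implements TableSketch {
    private final Set<String> columns;

    FixedSchemaTable(String... columns) {
        this.columns = new HashSet<>(Arrays.asList(columns));
    }

    @Override
    public boolean acceptsColumn(String name) {
        return columns.contains(name);
    }
}

// Dynamic schema: any column name passes validation; whether it resolves
// to data is decided later, at read time.
final class DynamicTable implements TableSketch {
    @Override
    public boolean acceptsColumn(String name) {
        return true;
    }
}
```

The trade-off is visible in the sketch: the dynamic table accepts `dir0`, but it also accepts a misspelled column name that the fixed-schema table would have rejected at validation time.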

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vvysotskyi/drill DRILL-4120

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1138.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1138


commit 402accca668481bb6816aad438c867781157fac6
Author: Volodymyr Vysotskyi 
Date:   2018-02-27T16:39:22Z

DRILL-4120: Allow implicit columns for Avro storage format




---


[GitHub] drill issue #1134: DRILL-6191 - Add acknowledgement sequence number and flag...

2018-02-28 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1134
  
+1


---


[GitHub] drill issue #1133: DRILL-6190 - Fix handling of packets longer than legally ...

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1133
  
@tdunning Travis fails with:
`Failed tests: TestPcapRecordReader.testDistinctQuery:51->runSQLVerifyCount:56->printResultAndVerifyRowCount:68 expected:<1> but was:<2>`


---


[GitHub] drill issue #1133: DRILL-6190 - Fix handling of packets longer than legally ...

2018-02-28 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1133
  
+1


---


[GitHub] drill issue #1137: DRILL-6185: Fixed error while displaying system profiles ...

2018-02-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1137
  
+1, LGTM.


---