[jira] [Created] (DRILL-7148) TPCH query 17 increases execution time with Statistics enabled because join order is changed

2019-04-01 Thread Gautam Parai (JIRA)
Gautam Parai created DRILL-7148:
---

 Summary: TPCH query 17 increases execution time with Statistics 
enabled because join order is changed
 Key: DRILL-7148
 URL: https://issues.apache.org/jira/browse/DRILL-7148
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Gautam Parai
Assignee: Gautam Parai
 Fix For: 1.16.0


TPCH query 17 at scale factor 1000 runs 45% slower with Statistics enabled. One 
issue is that the join order has flipped the build and probe sides in Major Fragment 01.

Here is the query:
select
  sum(l.l_extendedprice) / 7.0 as avg_yearly
from
  lineitem l,
  part p
where
  p.p_partkey = l.l_partkey
  and p.p_brand = 'Brand#13'
  and p.p_container = 'JUMBO CAN'
  and l.l_quantity < (
    select
      0.2 * avg(l2.l_quantity)
    from
      lineitem l2
    where
      l2.l_partkey = p.p_partkey
  );
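
For reference, a minimal way to compare the two plans, assuming the Drill 1.16 
ANALYZE TABLE ... COMPUTE STATISTICS command and the planner.statistics.use 
session option (both names should be verified against the build), with 
hypothetical dfs.tpch table paths:

{code:sql}
-- Gather statistics for the tables involved (Drill 1.16 ANALYZE syntax, assumed).
ANALYZE TABLE dfs.tpch.`lineitem` COMPUTE STATISTICS;
ANALYZE TABLE dfs.tpch.`part` COMPUTE STATISTICS;

-- Plan with statistics enabled (assumed option name).
ALTER SESSION SET `planner.statistics.use` = true;
EXPLAIN PLAN FOR
select sum(l.l_extendedprice) / 7.0 as avg_yearly
from dfs.tpch.`lineitem` l, dfs.tpch.`part` p
where p.p_partkey = l.l_partkey
  and p.p_brand = 'Brand#13'
  and p.p_container = 'JUMBO CAN'
  and l.l_quantity < (select 0.2 * avg(l2.l_quantity)
                      from dfs.tpch.`lineitem` l2
                      where l2.l_partkey = p.p_partkey);

-- Plan with statistics disabled, to compare the build/probe sides of the HashJoin.
ALTER SESSION SET `planner.statistics.use` = false;
-- (re-run the same EXPLAIN PLAN FOR statement)
{code}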

Here is the original plan:
{noformat}
00-00 Screen : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative 
cost = \{7.853786601428E10 rows, 6.6179786770537E11 cpu, 3.0599948545E10 io, 
1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489493
00-01 Project(avg_yearly=[/($0, 7.0)]) : rowType = RecordType(ANY avg_yearly): 
rowcount = 1.0, cumulative cost = \{7.853786601418E10 rows, 6.6179786770527E11 
cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 
memory}, id = 489492
00-02 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): 
rowcount = 1.0, cumulative cost = \{7.853786601318E10 rows, 6.6179786770127E11 
cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 
memory}, id = 489491
00-03 UnionExchange : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative 
cost = \{7.853786601218E10 rows, 6.6179786768927E11 cpu, 3.0599948545E10 io, 
1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489490
01-01 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): 
rowcount = 1.0, cumulative cost = \{7.853786601118E10 rows, 6.6179786768127E11 
cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 
memory}, id = 489489
01-02 Project(l_extendedprice=[$1]) : rowType = RecordType(ANY 
l_extendedprice): rowcount = 2.948545E9, cumulative cost = 
\{7.553787115668E10 rows, 6.2579792942727E11 cpu, 3.0599948545E10 io, 
1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489488
01-03 SelectionVectorRemover : rowType = RecordType(ANY l_quantity, ANY 
l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 
2.948545E9, cumulative cost = \{7.253787630218E10 rows, 6.2279793457277E11 
cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 
memory}, id = 489487
01-04 Filter(condition=[<($0, *(0.2, $4))]) : rowType = RecordType(ANY 
l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): 
rowcount = 2.948545E9, cumulative cost = \{6.953788144768E10 rows, 
6.1979793971827E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 
1.17294998955024E11 memory}, id = 489486
01-05 HashJoin(condition=[=($2, $3)], joinType=[inner], semi-join: =[false]) : 
rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY 
l_partkey, ANY $f1): rowcount = 5.89709E9, cumulative cost = 
\{6.353789173867999E10 rows, 5.8379800146427E11 cpu, 3.0599948545E10 io, 
1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489485
01-07 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2]) : rowType 
= RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 
5.89709E9, cumulative cost = \{4.2417927963E10 rows, 2.71618536905E11 cpu, 
1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489476
01-09 HashToRandomExchange(dist0=[[$2]]) : rowType = RecordType(ANY l_quantity, 
ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
5.89709E9, cumulative cost = \{3.6417938254E10 rows, 2.53618567778E11 cpu, 
1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489475
02-01 UnorderedMuxExchange : rowType = RecordType(ANY l_quantity, ANY 
l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 
5.89709E9, cumulative cost = \{3.0417948545E10 rows, 1.57618732434E11 cpu, 
1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489474
04-01 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 1301011)]) : rowType = 
RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY 
E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.89709E9, cumulative cost = 
\{2.4417958836E10 rows, 1.51618742725E11 cpu, 1.8599969127E10 io, 1.677312E11 
network, 7.92E7 memory}, id = 489473
04-02 Project(l_quantity=[$1], l_extendedprice=[$2], p_partkey=[$3]) : rowType 
= RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 
5.89709E9, cumulative 

[jira] [Commented] (DRILL-7146) Query failing with NPE when ZK queue is enabled

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807334#comment-16807334
 ] 

ASF GitHub Bot commented on DRILL-7146:
---

sohami commented on pull request #1725: DRILL-7146: Query failing with NPE when 
ZK queue is enabled.
URL: https://github.com/apache/drill/pull/1725#discussion_r271104111
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/planner/rm/TestMemoryCalculator.java
 ##
 @@ -59,6 +59,7 @@
 
   private static final long DEFAULT_SLICE_TARGET = 10L;
   private static final long DEFAULT_BATCH_SIZE = 16*1024*1024;
+  private static final String ENABLE_QUEUE = 
"drill.exec.queue.embedded.enable";
 
 Review comment:
  Minor comment. You can use `EmbeddedQueryQueue.ENABLED` instead of defining it 
again here.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Query failing with NPE when ZK queue is enabled
> ---
>
> Key: DRILL-7146
> URL: https://issues.apache.org/jira/browse/DRILL-7146
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.16.0
>Reporter: Sorabh Hamirwasia
>Assignee: Hanumath Rao Maduri
>Priority: Major
> Fix For: 1.16.0
>
>
>  
> {code:java}
> >> Query: alter system reset all;
>  SYSTEM ERROR: NullPointerException
> Please, refer to logs for more information.
> [Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
>  java.sql.SQLException: SYSTEM ERROR: NullPointerException
> Please, refer to logs for more information.
> [Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
>  at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:535)
>  at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:607)
>  at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1278)
>  at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:58)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>  at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1107)
>  at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1118)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:200)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
>  at org.apache.drill.test.framework.Utils.execSQL(Utils.java:917)
>  at org.apache.drill.test.framework.TestDriver.setup(TestDriver.java:632)
>  at org.apache.drill.test.framework.TestDriver.runTests(TestDriver.java:152)
>  at org.apache.drill.test.framework.TestDriver.main(TestDriver.java:94)
>  Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: 
> SYSTEM ERROR: NullPointerException
> Please, refer to logs for more information.
> [Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
>  at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
>  at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
>  at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
>  at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
>  at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
>  at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>  at 
> oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  at 
> 

[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Paul Rogers (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807333#comment-16807333
 ] 

Paul Rogers commented on DRILL-7147:


Hi [~agirish], there is no harm in users setting variables the "simple" way – 
as long as they don't want to inherit values from the environment.

However, this ticket mentions {{distrib-env.sh}}, which allows a distribution to 
override the built-in settings. Both {{distrib-env.sh}} and the built-in 
settings use the {{${FOO:-value}}} form to allow overrides in either the 
environment or {{drill-env.sh}}.

You mention that people set variables using the simple form in {{drill-env.sh}} 
and that this causes problems. Again, the only problem this can cause is to 
overwrite the environment. It cannot cause problems with {{distrib-env.sh}}.

Did someone set a value in {{distrib-env.sh}} without using the proper 
notation? Is this the cause of the issue?

Reversing the file order will fix that particular issue, but will break the 
ability to set the variables in the environment: a feature that DoY requires.

> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Abhishek Girish (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807305#comment-16807305
 ] 

Abhishek Girish commented on DRILL-7147:


I'm working with [~haozhu] to understand whether he observed any inconsistent 
behavior; we can then fix those accordingly, as you suggested.

> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7146) Query failing with NPE when ZK queue is enabled

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807304#comment-16807304
 ] 

ASF GitHub Bot commented on DRILL-7146:
---

HanumathRao commented on pull request #1725: DRILL-7146: Query failing with NPE 
when ZK queue is enabled.
URL: https://github.com/apache/drill/pull/1725
 
 
   This PR fixes the issue by not looking for the physical operator in the 
memory-adjusted operators of the QueueQueryParallelizer when planHasMemory is 
set to true.
   
   @sohami, can you please review this fix?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Query failing with NPE when ZK queue is enabled
> ---
>
> Key: DRILL-7146
> URL: https://issues.apache.org/jira/browse/DRILL-7146
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.16.0
>Reporter: Sorabh Hamirwasia
>Assignee: Hanumath Rao Maduri
>Priority: Major
> Fix For: 1.16.0
>
>
>  
> {code:java}
> >> Query: alter system reset all;
>  SYSTEM ERROR: NullPointerException
> Please, refer to logs for more information.
> [Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
>  java.sql.SQLException: SYSTEM ERROR: NullPointerException
> Please, refer to logs for more information.
> [Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
>  at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:535)
>  at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:607)
>  at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1278)
>  at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:58)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>  at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1107)
>  at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1118)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:200)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>  at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
>  at org.apache.drill.test.framework.Utils.execSQL(Utils.java:917)
>  at org.apache.drill.test.framework.TestDriver.setup(TestDriver.java:632)
>  at org.apache.drill.test.framework.TestDriver.runTests(TestDriver.java:152)
>  at org.apache.drill.test.framework.TestDriver.main(TestDriver.java:94)
>  Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: 
> SYSTEM ERROR: NullPointerException
> Please, refer to logs for more information.
> [Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
>  at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
>  at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
>  at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
>  at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
>  at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
>  at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>  at 
> oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>  at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>  at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  at 
> 

[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807300#comment-16807300
 ] 

ASF GitHub Bot commented on DRILL-7147:
---

Agirish commented on issue #1724: DRILL-7147: Source order of drill-env.sh and 
distrib-env.sh should be swapped
URL: https://github.com/apache/drill/pull/1724#issuecomment-478789323
 
 
   @kkhatua, I had the same reaction at first, hence the PR :) But Paul 
pointed out that the current behavior is right.
   @paul-rogers, I agree with your explanation! Thanks for taking a look. I'll 
close this PR.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807301#comment-16807301
 ] 

ASF GitHub Bot commented on DRILL-7147:
---

Agirish commented on pull request #1724: DRILL-7147: Source order of 
drill-env.sh and distrib-env.sh should be swapped
URL: https://github.com/apache/drill/pull/1724
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (DRILL-7107) Unable to connect to Drill 1.15 through ZK

2019-04-01 Thread Karthikeyan Manivannan (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthikeyan Manivannan reopened DRILL-7107:
---

> Unable to connect to Drill 1.15 through ZK
> --
>
> Key: DRILL-7107
> URL: https://issues.apache.org/jira/browse/DRILL-7107
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.15.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> After upgrading to Drill 1.15, users are seeing they are no longer able to 
> connect to Drill using ZK quorum. They are getting the following "Unable to 
> setup ZK for client" error.
> [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl"
> Error: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. 
> (state=,code=0)
> java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:174)
>  at 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>  at 
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>  at 
> org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>  at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
>  at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130)
>  at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179)
>  at sqlline.Commands.connect(Commands.java:1247)
>  at sqlline.Commands.connect(Commands.java:1139)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>  at sqlline.SqlLine.dispatch(SqlLine.java:722)
>  at sqlline.SqlLine.initArgs(SqlLine.java:416)
>  at sqlline.SqlLine.begin(SqlLine.java:514)
>  at sqlline.SqlLine.start(SqlLine.java:264)
>  at sqlline.SqlLine.main(SqlLine.java:195)
> Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for 
> client.
>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340)
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:165)
>  ... 18 more
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68)
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47)
>  at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:114)
>  at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:86)
>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:337)
>  ... 19 more
> Apache Drill 1.15.0.0
> "This isn't your grandfather's SQL."
> sqlline>
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6050) Provide a limit to number of rows fetched for a query in UI

2019-04-01 Thread Kunal Khatua (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua updated DRILL-6050:

Labels: doc-impacting ready-to-commit user-experience  (was: 
ready-to-commit user-experience)

> Provide a limit to number of rows fetched for a query in UI
> ---
>
> Key: DRILL-6050
> URL: https://issues.apache.org/jira/browse/DRILL-6050
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Minor
>  Labels: doc-impacting, ready-to-commit, user-experience
> Fix For: 1.16.0, 1.17.0
>
>
> Currently, the WebServer needs to process the entire set of results and 
> stream it back to the WebClient. 
> Since the WebUI does paginate results, we can load a larger set for 
> pagination on the browser client and relieve pressure on the WebServer to 
> host all the data.
> For example, fetching all rows from a 1-billion-record table is impractical and can 
> be capped at 10K. Currently, the user has to explicitly specify LIMIT in the 
> submitted query. 
> An option can be provided in the UI field to allow for this entry.
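
For illustration, the explicit workaround mentioned above is simply a LIMIT 
clause in the submitted query; a minimal sketch with a hypothetical table path:

{code:sql}
-- Without a UI-side cap, the user must bound the result set manually.
SELECT *
FROM dfs.`/data/big_table`  -- hypothetical 1-billion-row table
LIMIT 10000;
{code}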



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6939) Indicate when a query is submitted and is in progress

2019-04-01 Thread Kunal Khatua (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua updated DRILL-6939:

Labels: doc-impacting ready-to-commit user-experience  (was: 
ready-to-commit user-experience)

> Indicate when a query is submitted and is in progress
> -
>
> Key: DRILL-6939
> URL: https://issues.apache.org/jira/browse/DRILL-6939
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.14.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Critical
>  Labels: doc-impacting, ready-to-commit, user-experience
> Fix For: 1.16.0
>
>
> When submitting a long running query, the web UI shows no indication of the 
> query having been submitted. What is needed is some form of UI enhancement 
> that shows that the submitted query is in progress and the results will load 
> when available.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-04-01 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807290#comment-16807290
 ] 

Kunal Khatua commented on DRILL-7032:
-

[~cgivre]  does this require any additional Documentation beyond a mention in 
the release notes? (cc: [~bbevens])

> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.4
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have some ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> This is because there are many PCAP files with corrupted lines, and this 
> functionality would avoid having to pre-fix the packet captures (example 
> attached file).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7054) PCAP timestamp in milliseconds

2019-04-01 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807288#comment-16807288
 ] 

Kunal Khatua commented on DRILL-7054:
-

[~manang] / [~cgivre] does this require any additional Documentation beyond a 
mention in the release notes? (cc: [~bbevens])

> PCAP timestamp in milliseconds
> --
>
> Key: DRILL-7054
> URL: https://issues.apache.org/jira/browse/DRILL-7054
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - Other
>Reporter: Angelo Mantellini
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> It is important to show the timestamp with microsecond precision.
> The timestamp currently has millisecond precision, and in some cases that is 
> not enough.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7060) Support JsonParser Feature 'ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER' in JsonReader

2019-04-01 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807287#comment-16807287
 ] 

Kunal Khatua commented on DRILL-7060:
-

[~agirish] does this require any additional Documentation beyond a mention in 
the release notes? (cc: [~bbevens])

> Support JsonParser Feature 'ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER' in 
> JsonReader
> -
>
> Key: DRILL-7060
> URL: https://issues.apache.org/jira/browse/DRILL-7060
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Some JSON files may have strings with backslashes, which are read as escape 
> characters. By default, only standard escape characters are allowed, so 
> querying such files fails. For example, see:
> Data
> {code}
> {"file":"C:\Sfiles\escape.json"}
> {code}
> Error
> {code}
> (com.fasterxml.jackson.core.JsonParseException) Unrecognized character escape 
> 'S' (code 83)
>  at [Source: (org.apache.drill.exec.store.dfs.DrillFSDataInputStream); line: 
> 1, column: 178]
> com.fasterxml.jackson.core.JsonParser._constructError():1804
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError():663
> 
> com.fasterxml.jackson.core.base.ParserMinimalBase._handleUnrecognizedCharacterEscape():640
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._decodeEscaped():3243
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipString():2537
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken():683
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData():342
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch():298
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeToVector():246
> org.apache.drill.exec.vector.complex.fn.JsonReader.write():205
> org.apache.drill.exec.store.easy.json.JSONRecordReader.next():216
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():223
> ...
> ...
> {code}
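
For reference, a minimal sketch of enabling the requested behavior once it is 
exposed as a JSON-reader session option. The option name below is an assumption 
and should be checked against the Drill option list, and the file path is 
hypothetical:

{code:sql}
-- Allow backslash-escaping of any character in the JSON reader (assumed option name).
ALTER SESSION SET `store.json.reader.allow_escape_any_char` = true;

-- The record from the description, stored at a hypothetical path, should then parse.
SELECT * FROM dfs.`/tmp/escape.json`;
{code}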



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7069) Poor performance of transformBinaryInMetadataCache

2019-04-01 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807286#comment-16807286
 ] 

Kunal Khatua commented on DRILL-7069:
-

[~ben-zvi] does this require any additional Documentation beyond a mention in 
the release notes? (cc: [~bbevens])

> Poor performance of transformBinaryInMetadataCache
> --
>
> Key: DRILL-7069
> URL: https://issues.apache.org/jira/browse/DRILL-7069
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: 1.15.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The performance of the method *transformBinaryInMetadataCache* scales poorly 
> as the table's numbers of underlying files, row-groups and columns grow. This 
> method is invoked during planning of every query using this table.
>      A test on a table using 219 directories (each with 20 files), 1 
> row-group in each file, and 94 columns, measured about *1340 milliseconds*.
>     The main culprits are the version checks, which take place in *every 
> iteration* (i.e., about 400k times in the previous example) and involve 
> construction of 6 MetadataVersion objects (and possibly garbage collections).
>      Removing the version checks from the loops improved this method's 
> performance on the above test down to about *250 milliseconds*.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7072) Query with semi join fails for JDBC storage plugin

2019-04-01 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807284#comment-16807284
 ] 

Kunal Khatua commented on DRILL-7072:
-

[~vvysotskyi] does this require any Documentation? (cc: [~bbevens])

> Query with semi join fails for JDBC storage plugin
> --
>
> Key: DRILL-7072
> URL: https://issues.apache.org/jira/browse/DRILL-7072
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JDBC
>Affects Versions: 1.15.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> When running a query with a semi join against the JDBC storage plugin, it fails 
> with a class cast exception:
> {code:sql}
> select person_id from mysql.`drill_mysql_test`.person t1
> where exists (
> select person_id from mysql.`drill_mysql_test`.person
> where t1.person_id = person_id)
> {code}
> {noformat}
> SYSTEM ERROR: ClassCastException: 
> org.apache.calcite.adapter.jdbc.JdbcRules$JdbcAggregate cannot be cast to 
> org.apache.drill.exec.planner.logical.DrillAggregateRel
> Please, refer to logs for more information.
> [Error Id: 85a27762-a4e5-4571-909f-0efa18ca0689 on user515050-pc:31013]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> ClassCastException: org.apache.calcite.adapter.jdbc.JdbcRules$JdbcAggregate 
> cannot be cast to org.apache.drill.exec.planner.logical.DrillAggregateRel
> Please, refer to logs for more information.
> [Error Id: 85a27762-a4e5-4571-909f-0efa18ca0689 on user515050-pc:31013]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:779)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83)
>  [classes/:na]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:299) 
> [classes/:na]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_191]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_191]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: 
> org.apache.calcite.adapter.jdbc.JdbcRules$JdbcAggregate cannot be cast to 
> org.apache.drill.exec.planner.logical.DrillAggregateRel
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:300) 
> [classes/:na]
>   ... 3 common frames omitted
> Caused by: java.lang.ClassCastException: 
> org.apache.calcite.adapter.jdbc.JdbcRules$JdbcAggregate cannot be cast to 
> org.apache.drill.exec.planner.logical.DrillAggregateRel
>   at 
> org.apache.drill.exec.planner.logical.DrillSemiJoinRule.matches(DrillSemiJoinRule.java:171)
>  ~[classes/:na]
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:557) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:257)
>  ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>  ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:216) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:203) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:431)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:382)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:365)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:289)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:331)
>  ~[classes/:na]
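
As an aside, until the fix lands, one possible workaround is to disable 
semi-join planning for the session so the EXISTS subquery is planned as a 
regular join; this sketch assumes the planner.enable_semijoin option that came 
with semi-join support (verify the option name against the build):

{code:sql}
-- Disable semi-join planning (assumed option name), then re-run the failing query.
ALTER SESSION SET `planner.enable_semijoin` = false;

select person_id from mysql.`drill_mysql_test`.person t1
where exists (
  select person_id from mysql.`drill_mysql_test`.person
  where t1.person_id = person_id);
{code}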

[jira] [Commented] (DRILL-7107) Unable to connect to Drill 1.15 through ZK

2019-04-01 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807282#comment-16807282
 ] 

Kunal Khatua commented on DRILL-7107:
-

[~karthikm] does this require any Documentation? (cc: [~bbevens])

> Unable to connect to Drill 1.15 through ZK
> --
>
> Key: DRILL-7107
> URL: https://issues.apache.org/jira/browse/DRILL-7107
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.15.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> After upgrading to Drill 1.15, users are seeing they are no longer able to 
> connect to Drill using ZK quorum. They are getting the following "Unable to 
> setup ZK for client" error.
> [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl"
> Error: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. 
> (state=,code=0)
> java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:174)
>  at 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>  at 
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>  at 
> org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>  at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
>  at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130)
>  at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179)
>  at sqlline.Commands.connect(Commands.java:1247)
>  at sqlline.Commands.connect(Commands.java:1139)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>  at sqlline.SqlLine.dispatch(SqlLine.java:722)
>  at sqlline.SqlLine.initArgs(SqlLine.java:416)
>  at sqlline.SqlLine.begin(SqlLine.java:514)
>  at sqlline.SqlLine.start(SqlLine.java:264)
>  at sqlline.SqlLine.main(SqlLine.java:195)
> Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for 
> client.
>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340)
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:165)
>  ... 18 more
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68)
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47)
>  at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:114)
>  at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.(ZKClusterCoordinator.java:86)
>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:337)
>  ... 19 more
> Apache Drill 1.15.0.0
> "This isn't your grandfather's SQL."
> sqlline>
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7125) REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0

2019-04-01 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807281#comment-16807281
 ] 

Kunal Khatua commented on DRILL-7125:
-

[~shamirwasia] does this require any Documentation? (cc: [~bbevens])

> REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0
> 
>
> Key: DRILL-7125
> URL: https://issues.apache.org/jira/browse/DRILL-7125
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The REFRESH TABLE METADATA command worked successfully on Drill 1.13.0; however, 
> after upgrading to Drill 1.15.0 there are sometimes errors.
> {code:java}
> In sqlline, logging in as regular user "alice" or Drill process user "admin" 
> gives the same error (permission denied).
> If this helps, here's also what I am seeing on sqlline:
> The error message contains random but valid user names other than the user 
> ("alice") that logged in to refresh the metadata. It looks like during the 
> metadata refresh the drillbits incorrectly attempt the metadata generation as 
> some random user, which obviously does not have write access.
> 2019-03-12 15:27:20,564 [2377cdd9-dd6e-d213-de1a-70b50d3641d7:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2377cdd9-dd6e-d213-de1a-70b50d3641d7:0:0: State change requested RUNNING --> 
> FINISHED
> 2019-03-12 15:27:20,564 [2377cdd9-dd6e-d213-de1a-70b50d3641d7:frag:0:0] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 2377cdd9-dd6e-d213-de1a-70b50d3641d7:0:0: State to report: FINISHED
> 2019-03-12 15:27:23,032 [2377cdb3-86cc-438d-8ada-787d2a84df9a:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query with id 
> 2377cdb3-86cc-438d-8ada-787d2a84df9a issued by alice: REFRESH TABLE METADATA 
> dfs.root.`/user/alice/logs/hive/warehouse/detail`
> 2019-03-12 15:27:23,350 [2377cdb3-86cc-438d-8ada-787d2a84df9a:foreman] ERROR 
> o.a.d.e.s.parquet.metadata.Metadata - Failed to read 
> 'file://user/alice/logs/hive/warehouse/detail/.drill.parquet_metadata_directories'
>  metadata file
> java.io.IOException: 2879.5854742.1036302960 
> /user/alice/logs/hive/warehouse/detail/file1/.drill.parquet_metadata 
> (Permission denied)
> at com.mapr.fs.Inode.throwIfFailed(Inode.java:390) 
> ~[maprfs-6.1.0-mapr.jar:na]
> at com.mapr.fs.Inode.flushPages(Inode.java:505) 
> ~[maprfs-6.1.0-mapr.jar:na]
> at com.mapr.fs.Inode.releaseDirty(Inode.java:583) 
> ~[maprfs-6.1.0-mapr.jar:na]
> at 
> com.mapr.fs.MapRFsOutStream.dropCurrentPage(MapRFsOutStream.java:73) 
> ~[maprfs-6.1.0-mapr.jar:na]
> at com.mapr.fs.MapRFsOutStream.write(MapRFsOutStream.java:85) 
> ~[maprfs-6.1.0-mapr.jar:na]
> at 
> com.mapr.fs.MapRFsDataOutputStream.write(MapRFsDataOutputStream.java:39) 
> ~[maprfs-6.1.0-mapr.jar:na]
> at 
> com.fasterxml.jackson.core.json.UTF8JsonGenerator._flushBuffer(UTF8JsonGenerator.java:2085)
>  ~[jackson-core-2.9.5.jar:2.9.5]
> at 
> com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1097)
>  ~[jackson-core-2.9.5.jar:2.9.5]
> at 
> com.fasterxml.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:2645)
>  ~[jackson-databind-2.9.5.jar:2.9.5]
> at 
> com.fasterxml.jackson.core.base.GeneratorBase.writeObject(GeneratorBase.java:381)
>  ~[jackson-core-2.9.5.jar:2.9.5]
> at 
> com.fasterxml.jackson.core.JsonGenerator.writeObjectField(JsonGenerator.java:1726)
>  ~[jackson-core-2.9.5.jar:2.9.5]
> at 
> org.apache.drill.exec.store.parquet.metadata.Metadata_V3$ColumnMetadata_v3$Serializer.serialize(Metadata_V3.java:448)
>  ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.store.parquet.metadata.Metadata_V3$ColumnMetadata_v3$Serializer.serialize(Metadata_V3.java:417)
>  ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serializeContents(IndexedListSerializer.java:119)
>  ~[jackson-databind-2.9.5.jar:2.9.5]
> at 
> com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:79)
>  ~[jackson-databind-2.9.5.jar:2.9.5]
> at 
> com.fasterxml.jackson.databind.ser.impl.IndexedListSerializer.serialize(IndexedListSerializer.java:18)
>  ~[jackson-databind-2.9.5.jar:2.9.5]
> at 
> com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
>  ~[jackson-databind-2.9.5.jar:2.9.5]
> at 
> 

[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Abhishek Girish (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807279#comment-16807279
 ] 

Abhishek Girish commented on DRILL-7147:


Regarding the other variables not being set correctly, I'll take a look and see 
if I can fix them in a different PR (to keep things clearer).

> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Abhishek Girish (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807277#comment-16807277
 ] 

Abhishek Girish commented on DRILL-7147:


[~Paul.Rogers], thanks for taking a look! I remember this did work as expected 
when it was first added during DoY, so I was a bit confused to see the ordering 
in drill-config.sh. So, I knew you'd have an answer :)

Your explanation makes sense, but from what I've seen, users often set values 
in drill-env.sh in the form you first mention (export key=value), and that 
causes the issue of values not being correctly overridden by Drill during 
startup. 

I understand that the right way to set variables is with the latter form 
(export key=${key:-"val"}). I think this should be more clearly documented in 
our user-facing docs, especially because it's not very intuitive to 
troubleshoot. We should see if there is a way for the start-up scripts to help 
with that. 

> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807260#comment-16807260
 ] 

ASF GitHub Bot commented on DRILL-7147:
---

paul-rogers commented on issue #1724: DRILL-7147: Source order of drill-env.sh 
and distrib-env.sh should be swapped
URL: https://github.com/apache/drill/pull/1724#issuecomment-478780266
 
 
   Please see my note in the JIRA ticket. The current behavior is correct. I suspect 
that the issue is with one specific variable. Which one caused the issue that 
this is trying to fix?
   
   I believe that there are DoY unit tests that attempt to verify correct 
behavior (I remember writing them.) This fix would break those tests.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Paul Rogers (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807259#comment-16807259
 ] 

Paul Rogers commented on DRILL-7147:


This is an easy misunderstanding. If we imagine that files contain lines of the 
form:

{noformat}
drill-env.sh:
  export FOO="ABC"

distrib-env.sh:
   export FOO="DEF"
{noformat}

If this is how things worked, then the change suggested here would be correct. 
However, this is *not* how the variables are supposed to be set. In order to 
create the hierarchy explained in the comments, the correct form, explained in 
{{drill-env.sh}}, is:

{noformat}
drill-env.sh:
  export FOO=${FOO:-"ABC"}

distrib-env.sh:
   export FOO=${FOO:-"DEF"}
{noformat}

In this new form, we want to source the top-most levels of the hierarchy before 
the bottom-most levels. Consider:

* If {{FOO}} is set in the environment, it must take precedence, which it can 
do only using the form explained above. (There is no way to, say, source 
{{distrib-env.sh}} before the environment.)
* If {{FOO}} is set in drill-env.sh (but not in the environment), then it must 
take precedence, which it does using the above form.
* And so on.

All this said, it is likely that there is some particular variable that 
triggered this issue. Suggestion: track that variable down. Did it not follow 
the pattern above? For example, it appears that there are some entries in 
{{drill-env.sh}} for which the comments are wrong:

{code:bash}
# Maximum amount of direct memory to allocate to the Drillbit in the format
# supported by -XX:MaxDirectMemorySize. Default is 8G.

#export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"8G"}

# Native library path passed to Java. Note: use this form instead
# of the old form of DRILLBIT_JAVA_OPTS="-Djava.library.path="
# The old form is not compatible with Drill-on-YARN.

# export DRILL_JAVA_LIB_PATH=":"

# Value for the code cache size for the Drillbit. Because the Drillbit generates
# code, it benefits from a large cache. Default is 1G.

#export DRILLBIT_CODE_CACHE_SIZE=${DRILLBIT_CODE_CACHE_SIZE:-"1G"}

# Provide a customized host name for when the default mechanism is not accurate

#export DRILL_HOST_NAME=`hostname`

# Base name for Drill log files. Files are named ${DRILL_LOG_NAME}.out, etc.

# DRILL_LOG_NAME="drillbit"
{code}

In the above, {{DRILL_MAX_DIRECT_MEMORY}} is correct, while 
{{DRILL_JAVA_LIB_PATH}}, {{DRILL_HOST_NAME}}, and {{DRILL_LOG_NAME}} are wrong. 
They should be:

{code:bash}
# Native library path passed to Java. Note: use this form instead
# of the old form of DRILLBIT_JAVA_OPTS="-Djava.library.path="
# The old form is not compatible with Drill-on-YARN.

# export DRILL_JAVA_LIB_PATH="$DRILL_JAVA_LIB_PATH${DRILL_JAVA_LIB_PATH:=;}:"

# Provide a customized host name for when the default mechanism is not accurate

#export DRILL_HOST_NAME=${DRILL_HOST_NAME:-`hostname`}

# Base name for Drill log files. Files are named ${DRILL_LOG_NAME}.out, etc.

# DRILL_LOG_NAME=${DRILL_LOG_NAME:-"drillbit"}
{code}
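
As a quick, illustrative check (nothing below is Drill-specific; the host name 
is a made-up example value), the difference between the two forms can be seen 
in any shell:

{code:bash}
# The ${VAR:-default} form keeps a value that is already exported:
export DRILL_HOST_NAME="drill-node-01.example.com"   # pretend this came from the environment
export DRILL_HOST_NAME=${DRILL_HOST_NAME:-`hostname`}
echo "$DRILL_HOST_NAME"   # still drill-node-01.example.com

# The plain form silently overwrites it, breaking the hierarchy:
export DRILL_HOST_NAME=`hostname`
echo "$DRILL_HOST_NAME"   # now the local host name; the earlier value is lost
{code}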

> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-04-01 Thread Bridget Bevens (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805567#comment-16805567
 ] 

Bridget Bevens edited comment on DRILL-7038 at 4/1/19 9:49 PM:
---

Thanks, [~KazydubB]!

Is this okay to add to the document?

Starting in Drill 1.16, Drill uses a Values operator instead of a Scan operator 
to read data when a query references only directory columns and has a DISTINCT 
or GROUP BY operation. Instead of scanning the entire dataset, Drill either 
reads the directory column values from the metadata cache file (if one exists) 
or selects them directly from the directory (partition location). The presence 
of the Values operator (instead of the Scan operator) in the query plan 
indicates that Drill is using this optimization, as shown in the following 
examples:

select distinct dir0 from `/logs`;
+--+
| dir0 |
+--+
| 2015 |
| 2016 |
| 2017 |
+--+


explain plan for select distinct dir0 from `/logs`;
+------+------+
| text | json |
+------+------+
| 00-00  Screen
00-01    Project(dir0=[$0])
00-02      StreamAgg(group=[{0}])
00-03        Sort(sort0=[$0], dir0=[ASC])
00-04          Values(tuples=[[{ '2015' }, { '2015' }, { '2016' }, { '2015' }, { '2017' }, { '2015' }]])



select dir0 from `/logs` group by dir0;
+--+
| dir0 |
+--+
| 2015 |
| 2016 |
| 2017 |
+--+

explain plan for select dir0 from `/logs` group by dir0;
+------+------+
| text | json |
+------+------+
| 00-00  Screen
00-01    Project(dir0=[$0])
00-02      StreamAgg(group=[{0}])
00-03        Sort(sort0=[$0], dir0=[ASC])
00-04          Values(tuples=[[{ '2015' }, { '2015' }, { '2016' }, { '2015' }, { '2017' }, { '2015' }]])

Thanks,
Bridget


was (Author: bbevens):
Thanks, [~KazydubB]!

Is this okay to add to the document?

Starting in 1.16, Drill uses a Values operator instead of the Scan operator for 
DISTINCT and GROUP BY queries on tables and directories, as shown in the 
following examples:

select distinct dir0 from `/logs`;
+--+
| dir0 |
+--+
| 2015 |
| 2016 |
| 2017 |
+--+


explain plan for select distinct dir0 from `/logs`;
+------+------+
| text | json |
+------+------+
| 00-00  Screen
00-01    Project(dir0=[$0])
00-02      StreamAgg(group=[{0}])
00-03        Sort(sort0=[$0], dir0=[ASC])
00-04          Values(tuples=[[{ '2015' }, { '2015' }, { '2016' }, { '2015' }, { '2017' }, { '2015' }]])



select dir0 from `/logs` group by dir0;
+--+
| dir0 |
+--+
| 2015 |
| 2016 |
| 2017 |
+--+

explain plan for select dir0 from `/logs` group by dir0;
+------+------+
| text | json |
+------+------+
| 00-00  Screen
00-01    Project(dir0=[$0])
00-02      StreamAgg(group=[{0}])
00-03        Sort(sort0=[$0], dir0=[ASC])
00-04          Values(tuples=[[{ '2015' }, { '2015' }, { '2016' }, { '2015' }, { '2017' }, { '2015' }]])

Thanks,
Bridget

> Queries on partitioned columns scan the entire datasets
> ---
>
> Key: DRILL-7038
> URL: 

[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807216#comment-16807216
 ] 

ASF GitHub Bot commented on DRILL-7147:
---

Agirish commented on issue #1724: DRILL-7147: Source order of drill-env.sh and 
distrib-env.sh should be swapped
URL: https://github.com/apache/drill/pull/1724#issuecomment-478757808
 
 
   + @paul-rogers (not able to tag you as a reviewer).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7117) Support creation of histograms for numeric data types (except Decimal) and date/time/timestamp

2019-04-01 Thread Aman Sinha (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-7117:
--
Labels: doc-impacting  (was: )

> Support creation of histograms for numeric data types (except Decimal) and 
> date/time/timestamp
> --
>
> Key: DRILL-7117
> URL: https://issues.apache.org/jira/browse/DRILL-7117
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Query Planning  Optimization
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
>
> This JIRA is specific to creating histograms for numeric data types: INT, 
> BIGINT, FLOAT4, FLOAT8  and their corresponding nullable/non-nullable 
> versions.  Additionally, since DATE/TIME/TIMESTAMP are internally stored as 
> longs, we should allow the same numeric type histogram creation for these 
> data types as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Abhishek Girish (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7147:
---
Fix Version/s: 1.16.0

> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.16.0
>
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807201#comment-16807201
 ] 

ASF GitHub Bot commented on DRILL-7147:
---

Agirish commented on pull request #1724: DRILL-7147: Source order of 
drill-env.sh and distrib-env.sh should be swapped
URL: https://github.com/apache/drill/pull/1724
 
 
   Drill environment properties in drill-env.sh should override corresponding 
properties set in distrib-env.sh
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Abhishek Girish (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish reassigned DRILL-7147:
--

Assignee: Abhishek Girish

> Source order of "drill-env.sh" and "distrib-env.sh" should be swapped
> -
>
> Key: DRILL-7147
> URL: https://issues.apache.org/jira/browse/DRILL-7147
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Hao Zhu
>Assignee: Abhishek Girish
>Priority: Minor
>
> In bin/drill-config.sh, the description of the source order is:
> {code:java}
> # Variables may be set in one of four places:
> #
> #   Environment (per run)
> #   drill-env.sh (per site)
> #   distrib-env.sh (per distribution)
> #   drill-config.sh (this file, Drill defaults)
> #
> # Properties "inherit" from items lower on the list, and may be "overridden" 
> by items
> # higher on the list. In the environment, just set the variable:
> {code}
> However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
> distrib-env.sh.
> {code:java}
> drillEnv="$DRILL_CONF_DIR/drill-env.sh"
> if [ -r "$drillEnv" ]; then
>   . "$drillEnv"
> fi
> ...
> distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
> if [ -r "$distribEnv" ]; then
>   . "$distribEnv"
> else
>   distribEnv="$DRILL_HOME/conf/distrib-env.sh"
>   if [ -r "$distribEnv" ]; then
> . "$distribEnv"
>   fi
> fi
> {code}
> We need to swap the source order of drill-env.sh and distrib-env.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7147) Source order of "drill-env.sh" and "distrib-env.sh" should be swapped

2019-04-01 Thread Hao Zhu (JIRA)
Hao Zhu created DRILL-7147:
--

 Summary: Source order of "drill-env.sh" and "distrib-env.sh" 
should be swapped
 Key: DRILL-7147
 URL: https://issues.apache.org/jira/browse/DRILL-7147
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.15.0
Reporter: Hao Zhu


In bin/drill-config.sh, the description of the source order is:
{code:java}
# Variables may be set in one of four places:
#
#   Environment (per run)
#   drill-env.sh (per site)
#   distrib-env.sh (per distribution)
#   drill-config.sh (this file, Drill defaults)
#
# Properties "inherit" from items lower on the list, and may be "overridden" by 
items
# higher on the list. In the environment, just set the variable:
{code}
However actually bin/drill-config.sh sources drill-env.sh firstly, and then 
distrib-env.sh.
{code:java}
drillEnv="$DRILL_CONF_DIR/drill-env.sh"
if [ -r "$drillEnv" ]; then
  . "$drillEnv"
fi
...

distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
if [ -r "$distribEnv" ]; then
  . "$distribEnv"
else
  distribEnv="$DRILL_HOME/conf/distrib-env.sh"
  if [ -r "$distribEnv" ]; then
. "$distribEnv"
  fi
fi

{code}
We need to swap the source order of drill-env.sh and distrib-env.sh.
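
For illustration only, the reordering requested here would look roughly like 
the sketch below (this is not a patch, and the review comments on this issue 
argue that the existing order is intentional when variables are set with the 
${FOO:-...} form):

{code:bash}
# Hypothetical sketch: source distrib-env.sh first, then drill-env.sh, so that
# a plain `export FOO=...` in drill-env.sh overrides the distribution value.
distribEnv="$DRILL_CONF_DIR/distrib-env.sh"
if [ ! -r "$distribEnv" ]; then
  distribEnv="$DRILL_HOME/conf/distrib-env.sh"
fi
if [ -r "$distribEnv" ]; then
  . "$distribEnv"
fi

drillEnv="$DRILL_CONF_DIR/drill-env.sh"
if [ -r "$drillEnv" ]; then
  . "$drillEnv"
fi
{code}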



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7117) Support creation of histograms for numeric data types (except Decimal) and date/time/timestamp

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807160#comment-16807160
 ] 

ASF GitHub Bot commented on DRILL-7117:
---

amansinha100 commented on pull request #1715: DRILL-7117: Support creation of 
equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support creation of histograms for numeric data types (except Decimal) and 
> date/time/timestamp
> --
>
> Key: DRILL-7117
> URL: https://issues.apache.org/jira/browse/DRILL-7117
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Query Planning  Optimization
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>Priority: Major
> Fix For: 1.16.0
>
>
> This JIRA is specific to creating histograms for numeric data types: INT, 
> BIGINT, FLOAT4, FLOAT8  and their corresponding nullable/non-nullable 
> versions.  Additionally, since DATE/TIME/TIMESTAMP are internally stored as 
> longs, we should allow the same numeric type histogram creation for these 
> data types as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7117) Support creation of histograms for numeric data types (except Decimal) and date/time/timestamp

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807087#comment-16807087
 ] 

ASF GitHub Bot commented on DRILL-7117:
---

gparai commented on issue #1715: DRILL-7117: Support creation of equi-depth 
histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#issuecomment-478704575
 
 
   The changes look good. +1
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support creation of histograms for numeric data types (except Decimal) and 
> date/time/timestamp
> --
>
> Key: DRILL-7117
> URL: https://issues.apache.org/jira/browse/DRILL-7117
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Query Planning  Optimization
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>Priority: Major
> Fix For: 1.16.0
>
>
> This JIRA is specific to creating histograms for numeric data types: INT, 
> BIGINT, FLOAT4, FLOAT8  and their corresponding nullable/non-nullable 
> versions.  Additionally, since DATE/TIME/TIMESTAMP are internally stored as 
> longs, we should allow the same numeric type histogram creation for these 
> data types as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7141) Hash-Join (and Agg) should always spill to disk the least used partition

2019-04-01 Thread Kunal Khatua (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua updated DRILL-7141:

Fix Version/s: (was: Future)
   1.17.0

> Hash-Join (and Agg) should always spill to disk the least used partition
> 
>
> Key: DRILL-7141
> URL: https://issues.apache.org/jira/browse/DRILL-7141
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.17.0
>
>
> When the probe-side data for a hash join is skewed, it is preferable to keep 
> the corresponding partition on the build side in memory. 
> Currently, with the spill-to-disk feature, the partition to spill to disk is 
> selected at random. This means that highly skewed probe-side data may also 
> spill for lack of a corresponding in-memory hash table partition. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7146) Query failing with NPE when ZK queue is enabled

2019-04-01 Thread Sorabh Hamirwasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-7146:
-
Description: 
 
{code:java}
>> Query: alter system reset all;
 SYSTEM ERROR: NullPointerException
Please, refer to logs for more information.
[Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
 java.sql.SQLException: SYSTEM ERROR: NullPointerException
Please, refer to logs for more information.
[Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
 at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:535)
 at 
org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:607)
 at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1278)
 at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:58)
 at 
oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
 at 
org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1107)
 at 
org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1118)
 at 
oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
 at 
org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:200)
 at 
oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
 at 
oadd.org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
 at org.apache.drill.test.framework.Utils.execSQL(Utils.java:917)
 at org.apache.drill.test.framework.TestDriver.setup(TestDriver.java:632)
 at org.apache.drill.test.framework.TestDriver.runTests(TestDriver.java:152)
 at org.apache.drill.test.framework.TestDriver.main(TestDriver.java:94)
 Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
ERROR: NullPointerException
Please, refer to logs for more information.
[Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
 at 
oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
 at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
 at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
 at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
 at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
 at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
 at 
oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
 at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
 at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
 at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
 at 
oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
 at 
oadd.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
 at 

[jira] [Created] (DRILL-7146) Query failing with NPE when ZK queue is enabled

2019-04-01 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-7146:


 Summary: Query failing with NPE when ZK queue is enabled
 Key: DRILL-7146
 URL: https://issues.apache.org/jira/browse/DRILL-7146
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning  Optimization
Affects Versions: 1.16.0
Reporter: Sorabh Hamirwasia
Assignee: Hanumath Rao Maduri
 Fix For: 1.16.0


>> Query: alter system reset all;
SYSTEM ERROR: NullPointerException


Please, refer to logs for more information.

[Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
java.sql.SQLException: SYSTEM ERROR: NullPointerException


Please, refer to logs for more information.

[Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:535)
at 
org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:607)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1278)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:58)
at 
oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
at 
org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1107)
at 
org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1118)
at 
oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
at 
org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:200)
at 
oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
at 
oadd.org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:217)
at org.apache.drill.test.framework.Utils.execSQL(Utils.java:917)
at org.apache.drill.test.framework.TestDriver.setup(TestDriver.java:632)
at 
org.apache.drill.test.framework.TestDriver.runTests(TestDriver.java:152)
at org.apache.drill.test.framework.TestDriver.main(TestDriver.java:94)
Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
ERROR: NullPointerException


Please, refer to logs for more information.

[Error Id: ec4b9c66-9f5c-4736-acf3-605f84ea0226 on drill80:31010]
at 
oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
at 
oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
at 
oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 

[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807063#comment-16807063
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r270996130
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/test/Drill2489CallsAfterCloseThrowExceptionsTest.java
 ##
 @@ -537,8 +537,6 @@ public void 
testclosedPreparedStmtOfOpenConnMethodsThrowRight() {
 new ClosedPreparedStatementChecker(PreparedStatement.class,
closedPreparedStmtOfOpenConn);
 
-checker.testAllMethods();
 
 Review comment:
   You're right. It was an accidental change, which is why it got through the 
tests. I'll restore it. Thanks for catching it.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning  Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the Server can have a pre-defined value as 
> well (default 0; i.e. no limit) so that an _admin_ would be able to ensure a 
> max limit on the resultset size as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7145) Exceptions happened during retrieving values from ValueVector are not being displayed at the Drill Web UI

2019-04-01 Thread Arina Ielchiieva (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806990#comment-16806990
 ] 

Arina Ielchiieva commented on DRILL-7145:
-

The bug is due to DRILL-6477, which was partially fixed in DRILL-6591.

> Exceptions happened during retrieving values from ValueVector are not being 
> displayed at the Drill Web UI
> -
>
> Key: DRILL-7145
> URL: https://issues.apache.org/jira/browse/DRILL-7145
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.16.0
>
>
> *Data:*
> A text file with the following content:
> {noformat}
> Id,col1,col2
> 1,aaa,bbb
> 2,ccc,ddd
> 3,eee
> 4,fff,ggg
> {noformat}
> Note that the record with id 3 has no value for the third column.
> exec.storage.enable_v3_text_reader should be false.
> *Submit the query from the Web UI:*
> {code:sql}
> select * from 
> table(dfs.tmp.`/drill/text/test`(type=>'text',lineDelimiter=>'\n',fieldDelimiter=>',',extractHeader=>true))
> {code}
> *Expected result:*
> Exception should happen due to DRILL-4814. It should be properly displayed.
> *Actual result:*
> Incorrect data is returned but without error. Query status: success.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7145) Exceptions happened during retrieving values from ValueVector are not being displayed at the Drill Web UI

2019-04-01 Thread Anton Gozhiy (JIRA)
Anton Gozhiy created DRILL-7145:
---

 Summary: Exceptions happened during retrieving values from 
ValueVector are not being displayed at the Drill Web UI
 Key: DRILL-7145
 URL: https://issues.apache.org/jira/browse/DRILL-7145
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.15.0
Reporter: Anton Gozhiy
Assignee: Anton Gozhiy
 Fix For: 1.16.0


*Data:*
A text file with the following content:
{noformat}
Id,col1,col2
1,aaa,bbb
2,ccc,ddd
3,eee
4,fff,ggg
{noformat}
Note that the record with id 3 has no value for the third column.

exec.storage.enable_v3_text_reader should be false.

*Submit the query from the Web UI:*
{code:sql}
select * from 
table(dfs.tmp.`/drill/text/test`(type=>'text',lineDelimiter=>'\n',fieldDelimiter=>',',extractHeader=>true))
{code}

*Expected result:*
Exception should happen due to DRILL-4814. It should be properly displayed.

*Actual result:*
Incorrect data is returned but without error. Query status: success.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7076) NPE is logged when querying postgres tables

2019-04-01 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806882#comment-16806882
 ] 

Volodymyr Vysotskyi commented on DRILL-7076:


[~gparai], this issue is also reproduced when a query uses {{sys}} tables, for 
example with the following query:
{code:sql}
select * from sys.version
{code}
Logs from {{drillbit.log}}:
{noformat}
2019-04-01 08:00:34,886 [235dd86c-8d7d-f5bf-2682-4d6698d5d3d2:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query with id 
235dd86c-8d7d-f5bf-2682-4d6698d5d3d2 issued by anonymous: select * from 
sys.version
2019-04-01 08:00:36,008 [235dd86c-8d7d-f5bf-2682-4d6698d5d3d2:foreman] WARN  
o.a.d.e.p.common.DrillStatsTable - Failed to materialize the stats. Continuing 
without stats.
org.apache.drill.common.exceptions.DrillRuntimeException: Failed to find the 
stats for table [[sys, version]] in schema 
[org.apache.drill.exec.planner.sql.SqlConverter$DrillCalciteCatalogReader@3003126c]
at 
org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:223)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72) 
[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
at org.apache.calcite.rel.RelVisitor.visit(RelVisitor.java:44) 
[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
at 
org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:231)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at org.apache.calcite.rel.RelVisitor.go(RelVisitor.java:61) 
[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
at 
org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.materialize(DrillStatsTable.java:206)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:235)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:331)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:178)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:211)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:116)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:82)
 [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:587) 
[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:271) 
[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[na:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[na:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
{noformat}

> NPE is logged when querying postgres tables
> ---
>
> Key: DRILL-7076
> URL: https://issues.apache.org/jira/browse/DRILL-7076
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Gautam Parai
>Priority: Blocker
> Fix For: 1.16.0
>
>
> NPE is seen in logs when querying Postgres table:
> {code:sql}
> select 1 from postgres.public.tdt
> {code}
> Stack trace from {{sqlline.log}}:
> {noformat}
> 2019-03-05 13:49:19,395 [23819dc0-abf8-24f3-ea81-6ced1b6e11af:foreman] WARN  
> o.a.d.e.p.common.DrillStatsTable - Failed to materialize the stats. 
> Continuing without stats.
> java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:189)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72) 
> [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at org.apache.calcite.rel.RelVisitor.visit(RelVisitor.java:44) 
> [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:202)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 

[jira] [Commented] (DRILL-7076) NPE is logged when querying postgres tables

2019-04-01 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806867#comment-16806867
 ] 

Volodymyr Vysotskyi commented on DRILL-7076:


[~gparai], to reproduce this issue, bring up Postgres, create any table, 
connect to it from Drill, select from this table, and check the log file for 
this exception.

> NPE is logged when querying postgres tables
> ---
>
> Key: DRILL-7076
> URL: https://issues.apache.org/jira/browse/DRILL-7076
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Gautam Parai
>Priority: Blocker
> Fix For: 1.16.0
>
>
> NPE is seen in logs when querying Postgres table:
> {code:sql}
> select 1 from postgres.public.tdt
> {code}
> Stack trace from {{sqlline.log}}:
> {noformat}
> 2019-03-05 13:49:19,395 [23819dc0-abf8-24f3-ea81-6ced1b6e11af:foreman] WARN  
> o.a.d.e.p.common.DrillStatsTable - Failed to materialize the stats. 
> Continuing without stats.
> java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:189)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at org.apache.calcite.rel.SingleRel.childrenAccept(SingleRel.java:72) 
> [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at org.apache.calcite.rel.RelVisitor.visit(RelVisitor.java:44) 
> [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.visit(DrillStatsTable.java:202)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at org.apache.calcite.rel.RelVisitor.go(RelVisitor.java:61) 
> [calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
>   at 
> org.apache.drill.exec.planner.common.DrillStatsTable$StatsMaterializationVisitor.materialize(DrillStatsTable.java:177)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:235)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:331)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:178)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:204)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:114)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:80)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:584) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:272) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_191]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_191]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
> {noformat}
> But query runs and returns the correct result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806752#comment-16806752
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

ihuzenko commented on issue #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#issuecomment-478578390
 
 
   Hello @kkhatua, generally the PR looks good; I have only one comment 
related to a test. Also, an interesting thing with the UI: setting the 
auto-limit value to "100 " is rejected because of the space after the number. 
Should we trim all accepted values?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning  Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the Server can have a pre-defined value as 
> well (default 0; i.e. no limit) so that an _admin_ would be able to ensure a 
> max limit on the resultset size as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806751#comment-16806751
 ] 

ASF GitHub Bot commented on DRILL-6970:
---

asfgit commented on pull request #1673: DRILL-6970: fix issue with logregex 
format plugin where drillbuf was overflowing
URL: https://github.com/apache/drill/pull/1673
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Issue with LogRegex format plugin where drillbuf was overflowing 
> -
>
> Key: DRILL-6970
> URL: https://issues.apache.org/jira/browse/DRILL-6970
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The log format plugin does not re-allocate the drillbuf when it fills up. 
> You can query small log files, but larger ones will fail with this error:
> 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`;
> Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, 
> 32768))
> Fragment 0:0
> Please, refer to logs for more information.
>  
> I'm running drill-embeded. The log storage plugin is configured like so
> {code:java}
> "log": {
> "type": "logRegex",
> "regex": "(.+)",
> "extension": "log",
> "maxErrors": 10,
> "schema": [
> {
> "fieldName": "line"
> }
> ]
> },
> {code}
> The log files is very simple
> {code:java}
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> ...{code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7140) RM: Drillbits fail with "No enum constant QueueSelectionPolicy.SelectionPolicy.bestfit"

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806750#comment-16806750
 ] 

ASF GitHub Bot commented on DRILL-7140:
---

asfgit commented on pull request #1720: DRILL-7140: RM: Drillbits fail with "No 
enum constant org.apache.dril…
URL: https://github.com/apache/drill/pull/1720
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> RM: Drillbits fail with "No enum constant 
> QueueSelectionPolicy.SelectionPolicy.bestfit"
> ---
>
> Key: DRILL-7140
> URL: https://issues.apache.org/jira/browse/DRILL-7140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
> Environment: master + changes to enable RM
>Reporter: Abhishek Ravi
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> A sample configuration for RM with value *{{bestfit}}* for 
> *{{queue_selection_policy}}* fails with
> {noformat}
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:82)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.rm.DistributedResourceManager.(DistributedResourceManager.java:46)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 6 common frames omitted
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit
>   at java.lang.Enum.valueOf(Enum.java:238) ~[na:1.8.0_181]
>   at 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy$SelectionPolicy.valueOf(QueueSelectionPolicy.java:32)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:74)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 8 common frames omitted
> {noformat}
> The issue here seems to be the case mismatch between *{{bestfit}}* and enum 
> constant *{{BESTFIT}}*. Hence {{SelectionPolicy.valueOf}} does not find 
> *{{bestfit}}*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806749#comment-16806749
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

ihuzenko commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r270863652
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/test/Drill2489CallsAfterCloseThrowExceptionsTest.java
 ##
 @@ -537,8 +537,6 @@ public void 
testclosedPreparedStmtOfOpenConnMethodsThrowRight() {
 new ClosedPreparedStatementChecker(PreparedStatement.class,
closedPreparedStmtOfOpenConn);
 
-checker.testAllMethods();
 
 Review comment:
   Does deletion of this line mean that the test now checks nothing? Maybe 
it's an accidental change that needs to be reverted, or maybe the test method 
can be safely removed?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning  Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the Server can have a pre-defined value as 
> well (default 0; i.e. no limit) so that an _admin_ would be able to ensure a 
> max limit on the resultset size as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7140) RM: Drillbits fail with "No enum constant QueueSelectionPolicy.SelectionPolicy.bestfit"

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7140:

Fix Version/s: (was: 2.0.0)
   1.16.0

> RM: Drillbits fail with "No enum constant 
> QueueSelectionPolicy.SelectionPolicy.bestfit"
> ---
>
> Key: DRILL-7140
> URL: https://issues.apache.org/jira/browse/DRILL-7140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 2.0.0
> Environment: master + changes to enable RM
>Reporter: Abhishek Ravi
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> A sample configuration for RM with value *{{bestfit}}* for 
> *{{queue_selection_policy}}* fails with
> {noformat}
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:82)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.rm.DistributedResourceManager.(DistributedResourceManager.java:46)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 6 common frames omitted
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit
>   at java.lang.Enum.valueOf(Enum.java:238) ~[na:1.8.0_181]
>   at 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy$SelectionPolicy.valueOf(QueueSelectionPolicy.java:32)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:74)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 8 common frames omitted
> {noformat}
> The issue here seems to be the case mismatch between *{{bestfit}}* and enum 
> constant *{{BESTFIT}}*. Hence {{SelectionPolicy.valueOf}} does not find 
> *{{bestfit}}*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7140) RM: Drillbits fail with "No enum constant QueueSelectionPolicy.SelectionPolicy.bestfit"

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7140:

Affects Version/s: (was: 2.0.0)

> RM: Drillbits fail with "No enum constant 
> QueueSelectionPolicy.SelectionPolicy.bestfit"
> ---
>
> Key: DRILL-7140
> URL: https://issues.apache.org/jira/browse/DRILL-7140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
> Environment: master + changes to enable RM
>Reporter: Abhishek Ravi
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> A sample configuration for RM with value *{{bestfit}}* for 
> *{{queue_selection_policy}}* fails with
> {noformat}
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:82)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.rm.DistributedResourceManager.(DistributedResourceManager.java:46)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 6 common frames omitted
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit
>   at java.lang.Enum.valueOf(Enum.java:238) ~[na:1.8.0_181]
>   at 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy$SelectionPolicy.valueOf(QueueSelectionPolicy.java:32)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.(ResourcePoolTreeImpl.java:74)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 8 common frames omitted
> {noformat}
> The issue here seems to be the case mismatch between *{{bestfit}}* and enum 
> constant *{{BESTFIT}}*. Hence {{SelectionPolicy.valueOf}} does not find 
> *{{bestfit}}*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7140) RM: Drillbits fail with "No enum constant QueueSelectionPolicy.SelectionPolicy.bestfit"

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7140:

Reviewer: Hanumath Rao Maduri

> RM: Drillbits fail with "No enum constant 
> QueueSelectionPolicy.SelectionPolicy.bestfit"
> ---
>
> Key: DRILL-7140
> URL: https://issues.apache.org/jira/browse/DRILL-7140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
> Environment: master + changes to enable RM
>Reporter: Abhishek Ravi
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> A sample configuration for RM with value *{{bestfit}}* for 
> *{{queue_selection_policy}}* fails with
> {noformat}
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:82)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.rm.DistributedResourceManager.<init>(DistributedResourceManager.java:46)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 6 common frames omitted
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit
>   at java.lang.Enum.valueOf(Enum.java:238) ~[na:1.8.0_181]
>   at 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy$SelectionPolicy.valueOf(QueueSelectionPolicy.java:32)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:74)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 8 common frames omitted
> {noformat}
> The issue here seems to be the case mismatch between *{{bestfit}}* and enum 
> constant *{{BESTFIT}}*. Hence {{SelectionPolicy.valueOf}} does not find 
> *{{bestfit}}*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7140) RM: Drillbits fail with "No enum constant QueueSelectionPolicy.SelectionPolicy.bestfit"

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7140:

Summary: RM: Drillbits fail with "No enum constant 
QueueSelectionPolicy.SelectionPolicy.bestfit"  (was: RM: Drillbits fail with 
"No enum constant 
org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit")

> RM: Drillbits fail with "No enum constant 
> QueueSelectionPolicy.SelectionPolicy.bestfit"
> ---
>
> Key: DRILL-7140
> URL: https://issues.apache.org/jira/browse/DRILL-7140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 2.0.0
> Environment: master + changes to enable RM
>Reporter: Abhishek Ravi
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 2.0.0
>
>
> A sample configuration for RM with value *{{bestfit}}* for 
> *{{queue_selection_policy}}* fails with
> {noformat}
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:82)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.rm.DistributedResourceManager.<init>(DistributedResourceManager.java:46)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 6 common frames omitted
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit
>   at java.lang.Enum.valueOf(Enum.java:238) ~[na:1.8.0_181]
>   at 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy$SelectionPolicy.valueOf(QueueSelectionPolicy.java:32)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:74)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 8 common frames omitted
> {noformat}
> The issue here seems to be the case mismatch between *{{bestfit}}* and enum 
> constant *{{BESTFIT}}*. Hence {{SelectionPolicy.valueOf}} does not find 
> *{{bestfit}}*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7140) RM: Drillbits fail with "No enum constant org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit"

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806703#comment-16806703
 ] 

ASF GitHub Bot commented on DRILL-7140:
---

vvysotskyi commented on issue #1720: DRILL-7140: RM: Drillbits fail with "No 
enum constant org.apache.dril…
URL: https://github.com/apache/drill/pull/1720#issuecomment-478555853
 
 
   @HanumathRao, in the future, when approving a PR, please check that the Jira 
is in a reviewable state and add the `ready-to-commit` label, so the Jira will be 
present in the list of Jiras that are ready to be merged.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> RM: Drillbits fail with "No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit"
> 
>
> Key: DRILL-7140
> URL: https://issues.apache.org/jira/browse/DRILL-7140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 2.0.0
> Environment: master + changes to enable RM
>Reporter: Abhishek Ravi
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 2.0.0
>
>
> A sample configuration for RM with value *{{bestfit}}* for 
> *{{queue_selection_policy}}* fails with
> {noformat}
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:82)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.rm.DistributedResourceManager.<init>(DistributedResourceManager.java:46)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 6 common frames omitted
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit
>   at java.lang.Enum.valueOf(Enum.java:238) ~[na:1.8.0_181]
>   at 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy$SelectionPolicy.valueOf(QueueSelectionPolicy.java:32)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:74)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 8 common frames omitted
> {noformat}
> The issue here seems to be the case mismatch between *{{bestfit}}* and enum 
> constant *{{BESTFIT}}*. Hence {{SelectionPolicy.valueOf}} does not find 
> *{{bestfit}}*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7140) RM: Drillbits fail with "No enum constant org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit"

2019-04-01 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-7140:
---
Labels: ready-to-commit  (was: )

> RM: Drillbits fail with "No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit"
> 
>
> Key: DRILL-7140
> URL: https://issues.apache.org/jira/browse/DRILL-7140
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 2.0.0
> Environment: master + changes to enable RM
>Reporter: Abhishek Ravi
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 2.0.0
>
>
> A sample configuration for RM with value *{{bestfit}}* for 
> *{{queue_selection_policy}}* fails with
> {noformat}
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:82)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.rm.DistributedResourceManager.<init>(DistributedResourceManager.java:46)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 6 common frames omitted
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy.SelectionPolicy.bestfit
>   at java.lang.Enum.valueOf(Enum.java:238) ~[na:1.8.0_181]
>   at 
> org.apache.drill.exec.resourcemgr.config.selectionpolicy.QueueSelectionPolicy$SelectionPolicy.valueOf(QueueSelectionPolicy.java:32)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.resourcemgr.config.ResourcePoolTreeImpl.<init>(ResourcePoolTreeImpl.java:74)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   ... 8 common frames omitted
> {noformat}
> The issue here seems to be the case mismatch between *{{bestfit}}* and enum 
> constant *{{BESTFIT}}*. Hence {{SelectionPolicy.valueOf}} does not find 
> *{{bestfit}}*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806694#comment-16806694
 ] 

ASF GitHub Bot commented on DRILL-6970:
---

vvysotskyi commented on pull request #1649: DRILL-6970: Issue with LogRegex 
format plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Issue with LogRegex format plugin where drillbuf was overflowing 
> -
>
> Key: DRILL-6970
> URL: https://issues.apache.org/jira/browse/DRILL-6970
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The log format plugin does not re-allocate the drillbuf when it fills up. You can 
> query small log files, but larger ones will fail with this error:
> 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`;
> Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, 
> 32768))
> Fragment 0:0
> Please, refer to logs for more information.
>  
> I'm running drill-embedded. The log storage plugin is configured like so:
> {code:java}
> "log": {
> "type": "logRegex",
> "regex": "(.+)",
> "extension": "log",
> "maxErrors": 10,
> "schema": [
> {
> "fieldName": "line"
> }
> ]
> },
> {code}
> The log file is very simple:
> {code:java}
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> ...{code}
>  
>  
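For context, the general shape of a fix for this kind of overflow is to grow the write buffer before appending data that would run past its end. Below is a minimal, self-contained sketch of that pattern in plain Java with invented names; it is not Drill's actual reader or DrillBuf API:
{code:java}
import java.util.Arrays;

public class GrowableLineBuffer {

  private byte[] buf = new byte[32768];
  private int writeIndex = 0;

  // Grow the buffer before appending a field that would otherwise overflow it,
  // which is the kind of "index: 32724, length: 108" failure reported above.
  void append(byte[] field) {
    if (writeIndex + field.length > buf.length) {
      int newSize = buf.length;
      while (writeIndex + field.length > newSize) {
        newSize *= 2;
      }
      buf = Arrays.copyOf(buf, newSize);
    }
    System.arraycopy(field, 0, buf, writeIndex, field.length);
    writeIndex += field.length;
  }

  public static void main(String[] args) {
    GrowableLineBuffer buffer = new GrowableLineBuffer();
    byte[] line = new byte[108];
    for (int i = 0; i < 400; i++) { // 400 * 108 > 32768, forces at least one resize
      buffer.append(line);
    }
    System.out.println("final capacity: " + buffer.buf.length);
  }
}
{code}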



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806693#comment-16806693
 ] 

ASF GitHub Bot commented on DRILL-6970:
---

vvysotskyi commented on issue #1649: DRILL-6970: Issue with LogRegex format 
plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-478550902
 
 
   Closing this PR, since the latest changes are present in 
https://github.com/apache/drill/pull/1673
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Issue with LogRegex format plugin where drillbuf was overflowing 
> -
>
> Key: DRILL-6970
> URL: https://issues.apache.org/jira/browse/DRILL-6970
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The log format plugin does not re-allocate the drillbuf when it fills up. You can 
> query small log files, but larger ones will fail with this error:
> 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`;
> Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, 
> 32768))
> Fragment 0:0
> Please, refer to logs for more information.
>  
> I'm running drill-embedded. The log storage plugin is configured like so:
> {code:java}
> "log": {
> "type": "logRegex",
> "regex": "(.+)",
> "extension": "log",
> "maxErrors": 10,
> "schema": [
> {
> "fieldName": "line"
> }
> ]
> },
> {code}
> The log file is very simple:
> {code:java}
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> ...{code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7144) sqlline option : !set useLineContinuation false, fails with ParseException

2019-04-01 Thread Arina Ielchiieva (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806635#comment-16806635
 ] 

Arina Ielchiieva commented on DRILL-7144:
-

[~khfaraaz] please try on the latest Apache master after the upgrade to 
SqlLine 1.7. I could not reproduce the issue there.

> sqlline option : !set useLineContinuation false, fails with ParseException
> --
>
> Key: DRILL-7144
> URL: https://issues.apache.org/jira/browse/DRILL-7144
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0, 1.15.0
>Reporter: Khurram Faraaz
>Assignee: Arina Ielchiieva
>Priority: Major
>
> The sqlline option does not work as intended; it returns a ParseException instead.
> !set useLineContinuation false
> On mapr-drill-1.13.0 we hit the exception below.
> {noformat}
> 0: jdbc:drill:drillbit=drill-abcd-dev.dev.schw> !set useLineContinuation false
> Error setting configuration: useLineContinuation: 
> java.lang.IllegalArgumentException: No method matching 
> "setuseLineContinuation" was found in sqlline.SqlLineOpts.
> {noformat}
> It does not work on drill-1.15.0-mapr-r1
> git.branch=drill-1.15.0-mapr-r1
> git.commit.id=ebc9fe49d4477b04701fdd81884d5a0b748a13ae
> {noformat}
> [test@test-ab bin]# ./sqlline -u 
> "jdbc:drill:schema=dfs.tmp;auth=MAPRSASL;drillbit=test-ab.qa.lab" -n mapr -p 
> mapr
> Apache Drill 1.15.0.3-mapr
> "Start your SQL engine."
> 0: jdbc:drill:schema=dfs.tmp> !set useLineContinuation false
> 0: jdbc:drill:schema=dfs.tmp> select * from sys.version
> > select * from sys.memory
> Error: PARSE ERROR: Encountered "select" at line 2, column 1.
> Was expecting one of:
>  
>  "ORDER" ...
>  "LIMIT" ...
>  "OFFSET" ...
>  "FETCH" ...
>  "NATURAL" ...
>  "JOIN" ...
>  "INNER" ...
>  "LEFT" ...
>  "RIGHT" ...
>  "FULL" ...
>  "CROSS" ...
>  "," ...
>  "OUTER" ...
>  "EXTEND" ...
>  "(" ...
>  "MATCH_RECOGNIZE" ...
>  "AS" ...
>   ...
>   ...
>   ...
>   ...
>   ...
>  "TABLESAMPLE" ...
>  "WHERE" ...
>  "GROUP" ...
>  "HAVING" ...
>  "WINDOW" ...
>  "UNION" ...
>  "INTERSECT" ...
>  "EXCEPT" ...
>  "MINUS" ...
>  "." ...
>  "[" ...
> SQL Query select * from sys.version
> select * from sys.memory
> ^
> [Error Id: 067d5402-b965-4660-8981-34491ab5a051 on test-ab.qa.lab:31010] 
> (state=,code=0)
> {noformat}
> {noformat}
> [Error Id: 067d5402-b965-4660-8981-34491ab5a051 ]
>  at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:185) 
> [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:138)
>  [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:110)
>  [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:76)
>  [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:584) 
> [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:272) 
> [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_151]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_151]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
> Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered 
> "select" at line 2, column 1.
> Was expecting one of:
>  
>  "ORDER" ...
>  "LIMIT" ...
>  "OFFSET" ...
>  "FETCH" ...
>  ...
>  "[" ...
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.convertException(DrillParserImpl.java:350)
>  ~[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.normalizeException(DrillParserImpl.java:131)
>  ~[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:137) 
> ~[calcite-core-1.17.0-drill-r2.jar:1.17.0-drill-r2]
>  at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:162) 
> ~[calcite-core-1.17.0-drill-r2.jar:1.17.0-drill-r2]
>  at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:177) 
> [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
>  ... 8 common frames omitted
> Caused by: org.apache.drill.exec.planner.sql.parser.impl.ParseException: 
> Encountered "select" at line 2, column 1.
> Was expecting one of:
>  
>  "ORDER" ...
>  "LIMIT" ...
>  "OFFSET" ...
>  "FETCH" ...
>  

[jira] [Assigned] (DRILL-7143) Enforce column-level constraints when using a schema

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva reassigned DRILL-7143:
---

Assignee: Paul Rogers  (was: Arina Ielchiieva)

> Enforce column-level constraints when using a schema
> 
>
> Key: DRILL-7143
> URL: https://issues.apache.org/jira/browse/DRILL-7143
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.16.0
>
>
> The recently added schema framework enforces schema constraints at the table 
> level. We now wish to add additional constraints at the column level.
> * If a column is marked as "strict", then the reader will use the exact type 
> and mode from the column schema, or fail if it is not possible to do so.
> * If a column is marked as required, and provides a default value, then that 
> value is used instead of 0 if a row is missing a value for that column.
> This PR may also contain other fixes to the base functionality revealed through 
> additional testing.
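A hypothetical sketch of the column-default behavior described in the bullets above; the class and names are invented for illustration and are not Drill's schema API:
{code:java}
public class ColumnDefaultSketch {

  // Invented holder for the two column-level properties described above.
  static class ColumnSchema {
    final String name;
    final boolean strict;       // reader must honor the declared type/mode or fail
    final Integer defaultValue; // used when a row has no value for the column

    ColumnSchema(String name, boolean strict, Integer defaultValue) {
      this.name = name;
      this.strict = strict;
      this.defaultValue = defaultValue;
    }
  }

  // A row missing a value for a required column gets the declared default, not 0.
  static int resolve(ColumnSchema column, Integer rawValue) {
    if (rawValue != null) {
      return rawValue;
    }
    return column.defaultValue != null ? column.defaultValue : 0;
  }

  public static void main(String[] args) {
    ColumnSchema quantity = new ColumnSchema("l_quantity", true, 1);
    System.out.println(resolve(quantity, null)); // prints 1 instead of 0
  }
}
{code}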



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7143) Enforce column-level constraints when using a schema

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva reassigned DRILL-7143:
---

Assignee: Arina Ielchiieva  (was: Paul Rogers)

> Enforce column-level constraints when using a schema
> 
>
> Key: DRILL-7143
> URL: https://issues.apache.org/jira/browse/DRILL-7143
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.16.0
>
>
> The recently added schema framework enforces schema constraints at the table 
> level. We now wish to add additional constraints at the column level.
> * If a column is marked as "strict", then the reader will use the exact type 
> and mode from the column schema, or fail if it is not possible to do so.
> * If a column is marked as required, and provides a default value, then that 
> value is used instead of 0 if a row is missing a value for that column.
> This PR may also contain other fixes to the base functionality revealed through 
> additional testing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7143) Enforce column-level constraints when using a schema

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7143:

Priority: Major  (was: Minor)

> Enforce column-level constraints when using a schema
> 
>
> Key: DRILL-7143
> URL: https://issues.apache.org/jira/browse/DRILL-7143
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.16.0
>
>
> The recently added schema framework enforces schema constraints at the table 
> level. We now wish to add additional constraints at the column level.
> * If a column is marked as "strict", then the reader will use the exact type 
> and mode from the column schema, or fail if it is not possible to do so.
> * If a column is marked as required, and provides a default value, then that 
> value is used instead of 0 if a row is missing a value for that column.
> This PR may also contain other fixes to the base functionality revealed through 
> additional testing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7143) Enforce column-level constraints when using a schema

2019-04-01 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7143:

Reviewer: Arina Ielchiieva

> Enforce column-level constraints when using a schema
> 
>
> Key: DRILL-7143
> URL: https://issues.apache.org/jira/browse/DRILL-7143
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
> Fix For: 1.16.0
>
>
> The recently added schema framework enforces schema constraints at the table 
> level. We now wish to add additional constraints at the column level.
> * If a column is marked as "strict", then the reader will use the exact type 
> and mode from the column schema, or fail if it is not possible to do so.
> * If a column is marked as required, and provides a default value, then that 
> value is used instead of 0 if a row is missing a value for that column.
> This PR may also contain other fixes to the base functionality revealed through 
> additional testing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806578#comment-16806578
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270779624
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java
 ##
 @@ -633,43 +713,120 @@ private void readBlockMeta(Path path, boolean dirsOnly, 
MetadataContext metaCont
 parquetTableMetadataDirs.updateRelativePaths(metadataParentDirPath);
 if (!alreadyCheckedModification && 
tableModified(parquetTableMetadataDirs.getDirectories(), path, 
metadataParentDir, metaContext, fs)) {
   parquetTableMetadataDirs =
-  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null)).getRight();
+  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getRight();
   newMetadata = true;
 }
   } else {
-parquetTableMetadata = mapper.readValue(is, 
ParquetTableMetadataBase.class);
+if (isFileMetadata) {
+  parquetTableMetadata.assignFiles((mapper.readValue(is, 
FileMetadata.class)).getFiles());
+  if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(4, 0)) >= 0) {
+((ParquetTableMetadata_v4) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
+  }
+
+  if (!alreadyCheckedModification && 
tableModified(parquetTableMetadata.getDirectories(), path, metadataParentDir, 
metaContext, fs)) {
+parquetTableMetadata =
+
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getLeft();
+newMetadata = true;
+  }
+} else if (isSummaryFile) {
+  MetadataSummary metadataSummary = mapper.readValue(is, 
Metadata_V4.MetadataSummary.class);
+  ParquetTableMetadata_v4 parquetTableMetadata_v4 = new 
ParquetTableMetadata_v4(metadataSummary);
+  parquetTableMetadata = (ParquetTableMetadataBase) 
parquetTableMetadata_v4;
+} else {
+  parquetTableMetadata = mapper.readValue(is, 
ParquetTableMetadataBase.class);
+  if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(3, 0)) >= 0) {
+((Metadata_V3.ParquetTableMetadata_v3) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
+  }
+  if (!alreadyCheckedModification && 
tableModified((parquetTableMetadata.getDirectories()), path, metadataParentDir, 
metaContext, fs)) {
+parquetTableMetadata =
+
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getLeft();
+newMetadata = true;
+  }
+}
 if (timer != null) {
   logger.debug("Took {} ms to read metadata from cache file", 
timer.elapsed(TimeUnit.MILLISECONDS));
   timer.stop();
 }
-if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(3, 0)) >= 0) {
-  ((ParquetTableMetadata_v3) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
-}
-  if (!alreadyCheckedModification && 
tableModified(parquetTableMetadata.getDirectories(), path, metadataParentDir, 
metaContext, fs)) {
-  // TODO change with current columns in existing metadata (auto 
refresh feature)
-  parquetTableMetadata =
-  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null)).getLeft();
-  newMetadata = true;
+if (!isSummaryFile) {
+  // DRILL-5009: Remove the RowGroup if it is empty
+  List<? extends ParquetFileMetadata> files = parquetTableMetadata.getFiles();
+  if (files != null) {
+for (ParquetFileMetadata file : files) {
+  List<? extends RowGroupMetadata> rowGroups = file.getRowGroups();
+  rowGroups.removeIf(r -> r.getRowCount() == 0);
+}
+  }
 }
-
-// DRILL-5009: Remove the RowGroup if it is empty
-List<? extends ParquetFileMetadata> files = parquetTableMetadata.getFiles();
-for (ParquetFileMetadata file : files) {
-  List<? extends RowGroupMetadata> rowGroups = file.getRowGroups();
-  rowGroups.removeIf(r -> r.getRowCount() == 0);
+if (newMetadata) {
+  // if new metadata files were created, invalidate the existing 
metadata context
+  metaContext.clear();
 }
-
-  }
-  if (newMetadata) {
-// if new metadata files were created, 

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806565#comment-16806565
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270769220
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V4.java
 ##
 @@ -0,0 +1,648 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.parquet.metadata;
+
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.KeyDeserializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import org.apache.drill.common.expression.SchemaPath;
+
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ColumnMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetFileMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetTableMetadataBase;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.RowGroupMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataVersion.Constants.V4;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.OriginalType;
+import org.apache.parquet.schema.PrimitiveType;
+
+public class Metadata_V4 {
+
+  public static class ParquetTableMetadata_v4 extends ParquetTableMetadataBase 
{
+
+MetadataSummary metadataSummary = new MetadataSummary();
+FileMetadata fileMetadata = new FileMetadata();
+
+public ParquetTableMetadata_v4(MetadataSummary metadataSummary) {
+  this.metadataSummary = metadataSummary;
+}
+
+public ParquetTableMetadata_v4(MetadataSummary metadataSummary, 
FileMetadata fileMetadata) {
+  this.metadataSummary = metadataSummary;
+  this.fileMetadata = fileMetadata;
+}
+
+public ParquetTableMetadata_v4(String metadataVersion, 
ParquetTableMetadataBase parquetTableMetadata,
+   List files, 
List directories, String drillVersion, long totalRowCount, boolean 
allColumnsInteresting) {
+  this.metadataSummary.metadataVersion = metadataVersion;
+  this.fileMetadata.files = files;
+  this.metadataSummary.directories = directories;
+  this.metadataSummary.columnTypeInfo = ((ParquetTableMetadata_v4) 
parquetTableMetadata).metadataSummary.columnTypeInfo;
+  this.metadataSummary.drillVersion = drillVersion;
+  this.metadataSummary.totalRowCount = totalRowCount;
+  this.metadataSummary.allColumnsInteresting = allColumnsInteresting;
+}
+
+public ColumnTypeMetadata_v4 getColumnTypeInfo(String[] name) {
+  return metadataSummary.getColumnTypeInfo(name);
+}
+
+@JsonIgnore
+@Override
+public List getDirectories() {
+  return metadataSummary.getDirectories();
+}
+
+@Override
+public List getFiles() {
+  return fileMetadata.getFiles();
+}
+
+@Override
+public String getMetadataVersion() {
+  return metadataSummary.getMetadataVersion();
+}
+
+/**
+ * If directories list and file metadata list contain relative paths, 
update it to absolute ones
+ *
+ * @param baseDir base parent directory
+ */
+public void updateRelativePaths(String baseDir) {
+  // update directories paths to absolute ones
+  this.metadataSummary.directories = 
MetadataPathUtils.convertToAbsolutePaths(metadataSummary.directories, baseDir);
+
+  // update files paths to absolute ones
+  this.fileMetadata.files = 

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806568#comment-16806568
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270773857
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java
 ##
 @@ -91,9 +95,14 @@
 public class Metadata {
   private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(Metadata.class);
 
-  public static final String[] OLD_METADATA_FILENAMES = 
{".drill.parquet_metadata.v2"};
+  public static final String[] OLD_METADATA_FILENAMES = 
{".drill.parquet_metadata", ".drill.parquet_metadata.v2"};
   public static final String METADATA_FILENAME = ".drill.parquet_metadata";
   public static final String METADATA_DIRECTORIES_FILENAME = 
".drill.parquet_metadata_directories";
+  public static final String FILE_METADATA_FILENAME = 
".drill.parquet_file_metadata.v4";
+  public static final String METADATA_SUMMARY_FILENAME = 
".drill.parquet_summary_metadata.v4";
+  public static final String[] CURRENT_METADATA_FILENAMES = 
{METADATA_SUMMARY_FILENAME, FILE_METADATA_FILENAME};
 
 Review comment:
   What about `METADATA_DIRECTORIES_FILENAME`? Is it a current metadata file name?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806569#comment-16806569
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270784411
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
 ##
 @@ -956,4 +957,302 @@ public void testRefreshNone() throws Exception {
 int actualRowCount = testSql(query);
 assertEquals(expectedRowCount, actualRowCount);
   }
+
+  @Test
+  public void testTotalRowCount() throws Exception {
+String tableName = "nation_ctas_rowcount";
+test("use dfs");
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+long rowCount = testSql("select * from `nation_ctas_rowcount`");
+test("refresh table metadata %s", tableName);
+checkForMetadataFile(tableName);
+createMetadataDir(tableName);
+testBuilder()
+.sqlQuery("select t.totalRowCount as rowCount from 
`%s/metadataDir/summary_meta.json` as t", tableName)
+.unOrdered()
+.baselineColumns("rowCount")
+.baselineValues(rowCount)
+.go();
+  }
+
+  @Test
+  public void testTotalRowCountPerFile() throws Exception {
+String tableName = "nation_ctas_rowcount1";
+test("use dfs");
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+long rowCount = testSql("select * from `nation_ctas_rowcount1/t1`");
+test("refresh table metadata %s", tableName);
+tableName = tableName + "/t1";
+checkForMetadataFile(tableName);
+createMetadataDir(tableName);
+testBuilder()
+.sqlQuery("select t.totalRowCount as rowCount from 
`%s/metadataDir/summary_meta.json` as t", tableName)
+.unOrdered()
+.baselineColumns("rowCount")
+.baselineValues(rowCount)
+.go();
+  }
+
+
+  @Test
+  public void testTotalRowCountAddDirectory() throws Exception {
+String tableName = "nation_ctas_rowcount2";
+test("use dfs");
+
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+
+test("refresh table metadata %s", tableName);
+sleep(1000);
+test("create table `%s/t5` as select * from cp.`tpch/nation.parquet`", 
tableName);
+
+String query = String.format("select count(*) as count from `%s`", 
tableName);
+String rowCountQuery = String.format("select t.totalRowCount as rowCount 
from `%s/metadataDir/summary_meta.json` as t", tableName);
+
+testBuilder()
+.sqlQuery(query)
+.unOrdered()
+.baselineColumns("count")
+.baselineValues(125L)
+.go();
+checkForMetadataFile(tableName);
+createMetadataDir(tableName);
+testBuilder()
+.sqlQuery(rowCountQuery)
+.unOrdered()
+.baselineColumns("rowCount")
+.baselineValues(125L)
+.go();
+  }
+
+
+  @Test
+  public void testTotalRowCountAddSubDir() throws Exception {
+String tableName = "nation_ctas_rowcount3";
+test("use dfs");
+
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+
+test("refresh table metadata %s", tableName);
+sleep(1000);
+tableName = tableName + "/t1";
+test("create table `%s/t5` as select * from cp.`tpch/nation.parquet`", 
tableName);
+String query = String.format("select count(*) as count from `%s`", 
tableName);
+String rowCountQuery = String.format("select t.totalRowCount as rowCount 
from `%s/metadataDir/summary_meta.json` as t", tableName);
+testBuilder()
+.sqlQuery(query)
+

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806576#comment-16806576
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270784820
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
 ##
 @@ -956,4 +957,302 @@ public void testRefreshNone() throws Exception {
 int actualRowCount = testSql(query);
 assertEquals(expectedRowCount, actualRowCount);
   }
+
+  @Test
+  public void testTotalRowCount() throws Exception {
+String tableName = "nation_ctas_rowcount";
+test("use dfs");
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+long rowCount = testSql("select * from `nation_ctas_rowcount`");
+test("refresh table metadata %s", tableName);
+checkForMetadataFile(tableName);
+createMetadataDir(tableName);
+testBuilder()
+.sqlQuery("select t.totalRowCount as rowCount from 
`%s/metadataDir/summary_meta.json` as t", tableName)
+.unOrdered()
+.baselineColumns("rowCount")
+.baselineValues(rowCount)
+.go();
+  }
+
+  @Test
+  public void testTotalRowCountPerFile() throws Exception {
+String tableName = "nation_ctas_rowcount1";
+test("use dfs");
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+long rowCount = testSql("select * from `nation_ctas_rowcount1/t1`");
+test("refresh table metadata %s", tableName);
+tableName = tableName + "/t1";
+checkForMetadataFile(tableName);
+createMetadataDir(tableName);
+testBuilder()
+.sqlQuery("select t.totalRowCount as rowCount from 
`%s/metadataDir/summary_meta.json` as t", tableName)
+.unOrdered()
+.baselineColumns("rowCount")
+.baselineValues(rowCount)
+.go();
+  }
+
+
+  @Test
+  public void testTotalRowCountAddDirectory() throws Exception {
+String tableName = "nation_ctas_rowcount2";
+test("use dfs");
+
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+
+test("refresh table metadata %s", tableName);
+sleep(1000);
+test("create table `%s/t5` as select * from cp.`tpch/nation.parquet`", 
tableName);
+
+String query = String.format("select count(*) as count from `%s`", 
tableName);
+String rowCountQuery = String.format("select t.totalRowCount as rowCount 
from `%s/metadataDir/summary_meta.json` as t", tableName);
+
+testBuilder()
+.sqlQuery(query)
+.unOrdered()
+.baselineColumns("count")
+.baselineValues(125L)
+.go();
+checkForMetadataFile(tableName);
+createMetadataDir(tableName);
+testBuilder()
+.sqlQuery(rowCountQuery)
+.unOrdered()
+.baselineColumns("rowCount")
+.baselineValues(125L)
+.go();
+  }
+
+
+  @Test
+  public void testTotalRowCountAddSubDir() throws Exception {
+String tableName = "nation_ctas_rowcount3";
+test("use dfs");
+
+test("create table `%s/t1` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t2` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t3` as select * from cp.`tpch/nation.parquet`", 
tableName);
+test("create table `%s/t4` as select * from cp.`tpch/nation.parquet`", 
tableName);
+
+test("refresh table metadata %s", tableName);
+sleep(1000);
+tableName = tableName + "/t1";
+test("create table `%s/t5` as select * from cp.`tpch/nation.parquet`", 
tableName);
+String query = String.format("select count(*) as count from `%s`", 
tableName);
+String rowCountQuery = String.format("select t.totalRowCount as rowCount 
from `%s/metadataDir/summary_meta.json` as t", tableName);
+testBuilder()
+.sqlQuery(query)
+

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806575#comment-16806575
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270767620
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V4.java
 ##
 @@ -0,0 +1,648 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.parquet.metadata;
+
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.KeyDeserializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import org.apache.drill.common.expression.SchemaPath;
+
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ColumnMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetFileMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetTableMetadataBase;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.RowGroupMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataVersion.Constants.V4;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.OriginalType;
+import org.apache.parquet.schema.PrimitiveType;
+
+public class Metadata_V4 {
+
+  public static class ParquetTableMetadata_v4 extends ParquetTableMetadataBase 
{
+
+MetadataSummary metadataSummary = new MetadataSummary();
+FileMetadata fileMetadata = new FileMetadata();
+
+public ParquetTableMetadata_v4(MetadataSummary metadataSummary) {
+  this.metadataSummary = metadataSummary;
+}
+
+public ParquetTableMetadata_v4(MetadataSummary metadataSummary, 
FileMetadata fileMetadata) {
+  this.metadataSummary = metadataSummary;
+  this.fileMetadata = fileMetadata;
+}
+
+public ParquetTableMetadata_v4(String metadataVersion, 
ParquetTableMetadataBase parquetTableMetadata,
+   List files, 
List directories, String drillVersion, long totalRowCount, boolean 
allColumnsInteresting) {
+  this.metadataSummary.metadataVersion = metadataVersion;
+  this.fileMetadata.files = files;
+  this.metadataSummary.directories = directories;
+  this.metadataSummary.columnTypeInfo = ((ParquetTableMetadata_v4) 
parquetTableMetadata).metadataSummary.columnTypeInfo;
+  this.metadataSummary.drillVersion = drillVersion;
+  this.metadataSummary.totalRowCount = totalRowCount;
+  this.metadataSummary.allColumnsInteresting = allColumnsInteresting;
+}
+
+public ColumnTypeMetadata_v4 getColumnTypeInfo(String[] name) {
+  return metadataSummary.getColumnTypeInfo(name);
+}
+
+@JsonIgnore
+@Override
+public List getDirectories() {
+  return metadataSummary.getDirectories();
+}
+
+@Override
+public List getFiles() {
+  return fileMetadata.getFiles();
+}
+
+@Override
+public String getMetadataVersion() {
+  return metadataSummary.getMetadataVersion();
+}
+
+/**
+ * If directories list and file metadata list contain relative paths, 
update it to absolute ones
+ *
+ * @param baseDir base parent directory
+ */
+public void updateRelativePaths(String baseDir) {
+  // update directories paths to absolute ones
+  this.metadataSummary.directories = 
MetadataPathUtils.convertToAbsolutePaths(metadataSummary.directories, baseDir);
+
+  // update files paths to absolute ones
+  this.fileMetadata.files = 

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806564#comment-16806564
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270782041
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/MetadataPathUtils.java
 ##
 @@ -60,6 +61,22 @@
* @param baseDir base parent directory
* @return list of files with absolute paths
*/
+  public static List 
convertToFilesWithAbsolutePathsForV4(
 
 Review comment:
   Please modify the existing method `convertToFilesWithAbsolutePaths` so it can 
be used with both `ParquetFileMetadata_v4` and `ParquetFileMetadata_v3`, to avoid 
duplicating the code.
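As a hedged sketch of what such a shared helper could look like, the snippet below uses a generic type parameter so both metadata versions reuse the same path-rewriting logic. The interface and signatures are assumptions made for illustration, not Drill's actual code:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class AbsolutePathSketch {

  // Minimal stand-in for what both ParquetFileMetadata_v3 and _v4 would expose.
  interface FileMeta {
    String getPath();
  }

  // One helper for every metadata version: the caller supplies how to rebuild
  // an entry with the rewritten absolute path.
  static <T extends FileMeta> List<T> convertToAbsolutePaths(
      List<T> files, String baseDir, BiFunction<T, String, T> withPath) {
    List<T> result = new ArrayList<>();
    for (T file : files) {
      String path = file.getPath();
      String absolutePath = path.startsWith(baseDir) ? path : baseDir + "/" + path;
      result.add(withPath.apply(file, absolutePath));
    }
    return result;
  }
}
{code}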
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806560#comment-16806560
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270772909
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetMetadataCache.java
 ##
 @@ -17,6 +17,7 @@
  */
 package org.apache.drill.exec.store.parquet;
 
+import static java.lang.Thread.sleep;
 
 Review comment:
   There is no need to statically import the `sleep` method. There are many 
similar usages in the Drill code, and calling it through the class name is better 
for readability.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806573#comment-16806573
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270769440
 
 

 ##
 File path: exec/java-exec/src/test/java/org/apache/drill/PlanTestBase.java
 ##
 @@ -465,4 +469,24 @@ private static String getPrefixJoinOrderFromPlan(String 
plan, String joinKeyWord
 
 return builder.toString();
   }
+
+  /**
+   * Create a temp metadata directory to query the metadata summary cache file
+   * @param table table name or table path
+   */
+  public static void createMetadataDir(String table) throws IOException {
+final String tmpDir;
+try {
+  tmpDir = dirTestWatcher.getRootDir().getCanonicalPath();
+} catch (IOException e) {
+  throw new RuntimeException(e);
+}
+File metadataDir = 
dirTestWatcher.makeRootSubDir(Paths.get(tmpDir+"/"+table+"/metadataDir"));
+File metaFile, newFile;
+metaFile = table.startsWith(tmpDir) ? FileUtils.getFile(table, 
Metadata.METADATA_SUMMARY_FILENAME)
+: FileUtils.getFile(tmpDir, table, 
Metadata.METADATA_SUMMARY_FILENAME);
+newFile = new File(tmpDir+"/"+table+"/summary_meta.json");
 
 Review comment:
   ```suggestion
   newFile = new File(tmpDir + "/" + table + "/summary_meta.json");
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806561#comment-16806561
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270738499
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java
 ##
 @@ -301,8 +301,25 @@ private Path getMetadataPath(FileStatus dir) {
   return new Path(dir.getPath(), Metadata.METADATA_FILENAME);
 }
 
+/**
+ * Check if the metadata version 4 files exist
+ * @param dir the path of the directory
+ * @param fs
+ * @return true if both file metadata and summary cache file exist
+ * @throws IOException in case of problems during accessing files
+ */
+private boolean currentMetadataFileExists(FileStatus dir, FileSystem fs) 
throws IOException {
+  for (String metaFileName : Metadata.CURRENT_METADATA_FILENAMES) {
 
 Review comment:
   It's a pity that a functional style can't be used here, since `FileSystem.exists()` throws a checked exception.
   ```
   return Arrays.stream(Metadata.OLD_METADATA_FILENAMES)
 .allMatch(metaFileName -> fs.exists(new Path(dir.getPath(), 
metaFileName)));
   ```
   
   Possibly we will introduce a `FunctionWithException` in Drill in the future.
   
   Everything is fine here for now.
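
   For illustration only, here is one way the checked exception could be tunnelled so a stream-based version works today. This is a hedged sketch, not Drill's code; the helper name `allMetadataFilesExist` and its parameter list are made up for the example:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class MetadataFileCheckSketch {
  // Stream-based variant of the loop under review: the checked IOException thrown by
  // FileSystem.exists() is wrapped in UncheckedIOException inside the lambda and
  // unwrapped again at the call site, so the method keeps its "throws IOException" contract.
  static boolean allMetadataFilesExist(FileStatus dir, FileSystem fs, String[] metaFileNames)
      throws IOException {
    try {
      return Arrays.stream(metaFileNames)
          .allMatch(name -> {
            try {
              return fs.exists(new Path(dir.getPath(), name));
            } catch (IOException e) {
              throw new UncheckedIOException(e);
            }
          });
    } catch (UncheckedIOException e) {
      throw e.getCause();
    }
  }
}
```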
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806574#comment-16806574
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270783164
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V4.java
 ##
 @@ -0,0 +1,648 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.parquet.metadata;
+
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.KeyDeserializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import org.apache.drill.common.expression.SchemaPath;
+
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ColumnMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetFileMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetTableMetadataBase;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.RowGroupMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataVersion.Constants.V4;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.OriginalType;
+import org.apache.parquet.schema.PrimitiveType;
+
+public class Metadata_V4 {
 
 Review comment:
   Please check all methods to see whether they are serializable or not.
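
   As a hedged illustration of what that check amounts to (the class and members below are hypothetical, not Drill's), Jackson serializes every public field and every public getter that is not annotated with `@JsonIgnore`, so a derived getter on the metadata classes silently becomes part of the cache file unless it is excluded:

```java
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

public class SerializationCheckSketch {

  // Hypothetical summary-like class used only to show which members Jackson picks up.
  public static class Summary {
    @JsonProperty
    public long totalRowCount = 42;

    // Serialized by default: Jackson exposes this getter as a "drillVersion" property.
    public String getDrillVersion() {
      return "1.16.0";
    }

    // Excluded: a derived value that should not end up in the cache file.
    @JsonIgnore
    public boolean isEmpty() {
      return totalRowCount == 0;
    }
  }

  public static void main(String[] args) throws Exception {
    // The output contains totalRowCount and drillVersion, but no "empty" property.
    System.out.println(new ObjectMapper().writeValueAsString(new Summary()));
  }
}
```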
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806558#comment-16806558
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270744162
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java
 ##
 @@ -387,6 +389,14 @@ private static boolean allowBinaryMetadata(String 
drillVersion, ParquetReaderCon
   names.add(Arrays.asList(columnTypeMetadata.name));
 }
   }
+} else if (parquetTableMetadata instanceof ParquetTableMetadata_v4) {
 
 Review comment:
   This logic is almost the same as for `ParquetTableMetadata_v3`.
   Can you change the usage of the `columnTypeInfo` field to a getter everywhere?
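
   A small sketch of the direction this points in, with hypothetical names (Drill's actual classes and key/value types may differ): once every metadata version exposes the map through the same getter, the caller no longer needs a branch per version:

```java
import java.util.Map;

// Hypothetical common accessor shared by the versioned metadata classes.
interface ColumnTypeInfoHolder {
  Map<String, String> getColumnTypeInfo();
}

class ColumnNameCollector {
  // One code path serves any metadata version that implements the getter,
  // instead of near-identical instanceof branches for _v3 and _v4.
  static void printColumnNames(ColumnTypeInfoHolder metadata) {
    metadata.getColumnTypeInfo().keySet().forEach(System.out::println);
  }
}
```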
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806570#comment-16806570
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270779624
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java
 ##
 @@ -633,43 +713,120 @@ private void readBlockMeta(Path path, boolean dirsOnly, 
MetadataContext metaCont
 parquetTableMetadataDirs.updateRelativePaths(metadataParentDirPath);
 if (!alreadyCheckedModification && 
tableModified(parquetTableMetadataDirs.getDirectories(), path, 
metadataParentDir, metaContext, fs)) {
   parquetTableMetadataDirs =
-  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null)).getRight();
+  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getRight();
   newMetadata = true;
 }
   } else {
-parquetTableMetadata = mapper.readValue(is, 
ParquetTableMetadataBase.class);
+if (isFileMetadata) {
+  parquetTableMetadata.assignFiles((mapper.readValue(is, 
FileMetadata.class)).getFiles());
+  if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(4, 0)) >= 0) {
+((ParquetTableMetadata_v4) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
+  }
+
+  if (!alreadyCheckedModification && 
tableModified(parquetTableMetadata.getDirectories(), path, metadataParentDir, 
metaContext, fs)) {
+parquetTableMetadata =
+
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getLeft();
+newMetadata = true;
+  }
+} else if (isSummaryFile) {
+  MetadataSummary metadataSummary = mapper.readValue(is, 
Metadata_V4.MetadataSummary.class);
+  ParquetTableMetadata_v4 parquetTableMetadata_v4 = new 
ParquetTableMetadata_v4(metadataSummary);
+  parquetTableMetadata = (ParquetTableMetadataBase) 
parquetTableMetadata_v4;
+} else {
+  parquetTableMetadata = mapper.readValue(is, 
ParquetTableMetadataBase.class);
+  if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(3, 0)) >= 0) {
+((Metadata_V3.ParquetTableMetadata_v3) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
+  }
+  if (!alreadyCheckedModification && 
tableModified((parquetTableMetadata.getDirectories()), path, metadataParentDir, 
metaContext, fs)) {
+parquetTableMetadata =
+
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getLeft();
+newMetadata = true;
+  }
+}
 if (timer != null) {
   logger.debug("Took {} ms to read metadata from cache file", 
timer.elapsed(TimeUnit.MILLISECONDS));
   timer.stop();
 }
-if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(3, 0)) >= 0) {
-  ((ParquetTableMetadata_v3) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
-}
-  if (!alreadyCheckedModification && 
tableModified(parquetTableMetadata.getDirectories(), path, metadataParentDir, 
metaContext, fs)) {
-  // TODO change with current columns in existing metadata (auto 
refresh feature)
-  parquetTableMetadata =
-  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null)).getLeft();
-  newMetadata = true;
+if (!isSummaryFile) {
+  // DRILL-5009: Remove the RowGroup if it is empty
+  List<? extends ParquetFileMetadata> files = parquetTableMetadata.getFiles();
+  if (files != null) {
+for (ParquetFileMetadata file : files) {
+  List<? extends RowGroupMetadata> rowGroups = file.getRowGroups();
+  rowGroups.removeIf(r -> r.getRowCount() == 0);
+}
+  }
 }
-
-// DRILL-5009: Remove the RowGroup if it is empty
-List<? extends ParquetFileMetadata> files = parquetTableMetadata.getFiles();
-for (ParquetFileMetadata file : files) {
-  List<? extends RowGroupMetadata> rowGroups = file.getRowGroups();
-  rowGroups.removeIf(r -> r.getRowCount() == 0);
+if (newMetadata) {
+  // if new metadata files were created, invalidate the existing 
metadata context
+  metaContext.clear();
 }
-
-  }
-  if (newMetadata) {
-// if new metadata files were created, 

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806567#comment-16806567
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270729791
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/RefreshMetadataHandler.java
 ##
 @@ -161,7 +161,7 @@ public PhysicalPlan getPlan(SqlNode sqlNode) throws 
ForemanSetupException {
*/
   private SqlNodeList getColumnList(final SqlRefreshMetadata 
sqlrefreshMetadata) {
 SqlNodeList columnList = sqlrefreshMetadata.getFieldList();
-if (columnList == null || !SqlNodeList.isEmptyList(columnList)) {
 
 Review comment:
   `SqlNodeList.isEmptyList(node)` checks `node instanceof SqlNodeList`, so there is no need to check `columnList == null` explicitly.
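
   A minimal sketch of the equivalence, assuming Calcite's `SqlNodeList.isEmptyList` returns `false` for anything that is not a `SqlNodeList` (including `null`), so negating it already covers the null case:

```java
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.SqlNodeList;

class EmptyListCheckSketch {
  // Both conditions evaluate identically for every input, including null.
  static boolean withNullCheck(SqlNode columnList) {
    return columnList == null || !SqlNodeList.isEmptyList(columnList);
  }

  static boolean withoutNullCheck(SqlNode columnList) {
    return !SqlNodeList.isEmptyList(columnList);
  }
}
```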
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806566#comment-16806566
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270776958
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java
 ##
 @@ -91,9 +95,14 @@
 public class Metadata {
   private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(Metadata.class);
 
-  public static final String[] OLD_METADATA_FILENAMES = 
{".drill.parquet_metadata.v2"};
+  public static final String[] OLD_METADATA_FILENAMES = 
{".drill.parquet_metadata", ".drill.parquet_metadata.v2"};
   public static final String METADATA_FILENAME = ".drill.parquet_metadata";
   public static final String METADATA_DIRECTORIES_FILENAME = 
".drill.parquet_metadata_directories";
+  public static final String FILE_METADATA_FILENAME = 
".drill.parquet_file_metadata.v4";
 
 Review comment:
   Looks like `METADATA_FILENAME` should become some sort of `OLD_METADATA_FILENAME`,
   while `FILE_METADATA_FILENAME` should take over as `METADATA_FILENAME`.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806572#comment-16806572
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270782722
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V1.java
 ##
 @@ -90,6 +90,16 @@ public Integer getDefinitionLevel(String[] columnName) {
   return null;
 }
 
+@Override
 
 Review comment:
   `@JsonIgnore` ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806571#comment-16806571
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270779898
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java
 ##
 @@ -633,43 +713,120 @@ private void readBlockMeta(Path path, boolean dirsOnly, 
MetadataContext metaCont
 parquetTableMetadataDirs.updateRelativePaths(metadataParentDirPath);
 if (!alreadyCheckedModification && 
tableModified(parquetTableMetadataDirs.getDirectories(), path, 
metadataParentDir, metaContext, fs)) {
   parquetTableMetadataDirs =
-  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null)).getRight();
+  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getRight();
   newMetadata = true;
 }
   } else {
-parquetTableMetadata = mapper.readValue(is, 
ParquetTableMetadataBase.class);
+if (isFileMetadata) {
+  parquetTableMetadata.assignFiles((mapper.readValue(is, 
FileMetadata.class)).getFiles());
+  if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(4, 0)) >= 0) {
+((ParquetTableMetadata_v4) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
+  }
+
+  if (!alreadyCheckedModification && 
tableModified(parquetTableMetadata.getDirectories(), path, metadataParentDir, 
metaContext, fs)) {
+parquetTableMetadata =
+
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getLeft();
+newMetadata = true;
+  }
+} else if (isSummaryFile) {
+  MetadataSummary metadataSummary = mapper.readValue(is, 
Metadata_V4.MetadataSummary.class);
+  ParquetTableMetadata_v4 parquetTableMetadata_v4 = new 
ParquetTableMetadata_v4(metadataSummary);
+  parquetTableMetadata = (ParquetTableMetadataBase) 
parquetTableMetadata_v4;
+} else {
+  parquetTableMetadata = mapper.readValue(is, 
ParquetTableMetadataBase.class);
+  if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(3, 0)) >= 0) {
+((Metadata_V3.ParquetTableMetadata_v3) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
+  }
+  if (!alreadyCheckedModification && 
tableModified((parquetTableMetadata.getDirectories()), path, metadataParentDir, 
metaContext, fs)) {
+parquetTableMetadata =
+
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null, true)).getLeft();
+newMetadata = true;
+  }
+}
 if (timer != null) {
   logger.debug("Took {} ms to read metadata from cache file", 
timer.elapsed(TimeUnit.MILLISECONDS));
   timer.stop();
 }
-if (new 
MetadataVersion(parquetTableMetadata.getMetadataVersion()).compareTo(new 
MetadataVersion(3, 0)) >= 0) {
-  ((ParquetTableMetadata_v3) 
parquetTableMetadata).updateRelativePaths(metadataParentDirPath);
-}
-  if (!alreadyCheckedModification && 
tableModified(parquetTableMetadata.getDirectories(), path, metadataParentDir, 
metaContext, fs)) {
-  // TODO change with current columns in existing metadata (auto 
refresh feature)
-  parquetTableMetadata =
-  
(createMetaFilesRecursivelyAsProcessUser(Path.getPathWithoutSchemeAndAuthority(path.getParent()),
 fs, true, null)).getLeft();
-  newMetadata = true;
+if (!isSummaryFile) {
+  // DRILL-5009: Remove the RowGroup if it is empty
+  List<? extends ParquetFileMetadata> files = parquetTableMetadata.getFiles();
+  if (files != null) {
+for (ParquetFileMetadata file : files) {
+  List<? extends RowGroupMetadata> rowGroups = file.getRowGroups();
+  rowGroups.removeIf(r -> r.getRowCount() == 0);
+}
+  }
 }
-
-// DRILL-5009: Remove the RowGroup if it is empty
-List<? extends ParquetFileMetadata> files = parquetTableMetadata.getFiles();
-for (ParquetFileMetadata file : files) {
-  List<? extends RowGroupMetadata> rowGroups = file.getRowGroups();
-  rowGroups.removeIf(r -> r.getRowCount() == 0);
+if (newMetadata) {
+  // if new metadata files were created, invalidate the existing 
metadata context
+  metaContext.clear();
 }
-
-  }
-  if (newMetadata) {
-// if new metadata files were created, 

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806562#comment-16806562
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270779983
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata.java
 ##
 @@ -712,5 +869,4 @@ private boolean logAndStopTimer(boolean isModified, String 
directoryName,
 return isModified;
   }
 
-}
-
+}
 
 Review comment:
   ```suggestion
   }
   
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806563#comment-16806563
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270761210
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/Metadata_V4.java
 ##
 @@ -0,0 +1,648 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.parquet.metadata;
+
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.databind.JsonSerializer;
+import com.fasterxml.jackson.databind.KeyDeserializer;
+import com.fasterxml.jackson.databind.SerializerProvider;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+import org.apache.drill.common.expression.SchemaPath;
+
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ColumnMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetFileMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.ParquetTableMetadataBase;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataBase.RowGroupMetadata;
+import static 
org.apache.drill.exec.store.parquet.metadata.MetadataVersion.Constants.V4;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.OriginalType;
+import org.apache.parquet.schema.PrimitiveType;
+
+public class Metadata_V4 {
+
+  public static class ParquetTableMetadata_v4 extends ParquetTableMetadataBase 
{
+
+MetadataSummary metadataSummary = new MetadataSummary();
+FileMetadata fileMetadata = new FileMetadata();
+
+public ParquetTableMetadata_v4(MetadataSummary metadataSummary) {
+  this.metadataSummary = metadataSummary;
+}
+
+public ParquetTableMetadata_v4(MetadataSummary metadataSummary, 
FileMetadata fileMetadata) {
+  this.metadataSummary = metadataSummary;
+  this.fileMetadata = fileMetadata;
+}
+
+public ParquetTableMetadata_v4(String metadataVersion, 
ParquetTableMetadataBase parquetTableMetadata,
+   List files, 
List directories, String drillVersion, long totalRowCount, boolean 
allColumnsInteresting) {
+  this.metadataSummary.metadataVersion = metadataVersion;
+  this.fileMetadata.files = files;
+  this.metadataSummary.directories = directories;
+  this.metadataSummary.columnTypeInfo = ((ParquetTableMetadata_v4) 
parquetTableMetadata).metadataSummary.columnTypeInfo;
+  this.metadataSummary.drillVersion = drillVersion;
+  this.metadataSummary.totalRowCount = totalRowCount;
+  this.metadataSummary.allColumnsInteresting = allColumnsInteresting;
+}
+
+public ColumnTypeMetadata_v4 getColumnTypeInfo(String[] name) {
+  return metadataSummary.getColumnTypeInfo(name);
+}
+
+@JsonIgnore
+@Override
+public List<Path> getDirectories() {
+  return metadataSummary.getDirectories();
+}
+
+@Override
+public List<? extends ParquetFileMetadata> getFiles() {
+  return fileMetadata.getFiles();
+}
+
+@Override
+public String getMetadataVersion() {
+  return metadataSummary.getMetadataVersion();
+}
+
+/**
+ * If directories list and file metadata list contain relative paths, 
update it to absolute ones
+ *
+ * @param baseDir base parent directory
+ */
+public void updateRelativePaths(String baseDir) {
+  // update directories paths to absolute ones
+  this.metadataSummary.directories = 
MetadataPathUtils.convertToAbsolutePaths(metadataSummary.directories, baseDir);
+
+  // update files paths to absolute ones
+  this.fileMetadata.files = 

[jira] [Commented] (DRILL-7063) Create separate summary file for schema, totalRowCount, totalNullCount (includes maintenance)

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806559#comment-16806559
 ] 

ASF GitHub Bot commented on DRILL-7063:
---

vdiravka commented on pull request #1723: DRILL-7063: Seperate metadata cache 
file into summary, file metadata
URL: https://github.com/apache/drill/pull/1723#discussion_r270771552
 
 

 ##
 File path: exec/java-exec/src/test/java/org/apache/drill/PlanTestBase.java
 ##
 @@ -465,4 +469,24 @@ private static String getPrefixJoinOrderFromPlan(String 
plan, String joinKeyWord
 
 return builder.toString();
   }
+
+  /**
+   * Create a temp metadata directory to query the metadata summary cache file
+   * @param table table name or table path
+   */
+  public static void createMetadataDir(String table) throws IOException {
+final String tmpDir;
+try {
+  tmpDir = dirTestWatcher.getRootDir().getCanonicalPath();
+} catch (IOException e) {
+  throw new RuntimeException(e);
+}
+File metadataDir = 
dirTestWatcher.makeRootSubDir(Paths.get(tmpDir+"/"+table+"/metadataDir"));
+File metaFile, newFile;
+metaFile = table.startsWith(tmpDir) ? FileUtils.getFile(table, 
Metadata.METADATA_SUMMARY_FILENAME)
+: FileUtils.getFile(tmpDir, table, 
Metadata.METADATA_SUMMARY_FILENAME);
+newFile = new File(tmpDir+"/"+table+"/summary_meta.json");
 
 Review comment:
   The file separator is different on different OSes.
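
   A hedged sketch of the portable alternative this comment hints at, using `java.nio.file.Paths` so the platform's separator is applied automatically (the variable names only mirror the test helper above; the snippet is illustrative, not the actual fix):

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

class PortablePathSketch {
  public static void main(String[] args) {
    String tmpDir = System.getProperty("java.io.tmpdir"); // stand-in for the test root directory
    String table = "nation";                              // hypothetical table name

    // Paths.get joins the segments with the platform separator, so no "/" literals are needed.
    Path summaryPath = Paths.get(tmpDir, table, "summary_meta.json");
    File newFile = summaryPath.toFile();
    System.out.println(newFile.getAbsolutePath());
  }
}
```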
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create separate summary file for schema, totalRowCount, totalNullCount 
> (includes maintenance)
> -
>
> Key: DRILL-7063
> URL: https://issues.apache.org/jira/browse/DRILL-7063
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-04-01 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806450#comment-16806450
 ] 

Volodymyr Vysotskyi edited comment on DRILL-7077 at 4/1/19 6:39 AM:


Merged with commit id 3c798d338f4f4dab6713956e0a94d18e6e5c72bd


was (Author: vvysotskyi):
Merged with commit 3c798d338f4f4dab6713956e0a94d18e6e5c72bd

> Add Function to Facilitate Time Series Analysis
> ---
>
> Key: DRILL-7077
> URL: https://issues.apache.org/jira/browse/DRILL-7077
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>
> When analyzing time based data, you will often have to aggregate by time 
> grains. While some time grains will be easy to calculate, others, such as 
> quarter, can be quite difficult. These functions enable a user to quickly and 
> easily aggregate data by various units of time. Usage is as follows:
> {code:java}
> SELECT <fields>
> FROM <table>
> GROUP BY nearestDate(<date_field>, <time_grain>){code}
> So, if a user wanted to count the number of hits on a web server 
> per 15 minutes, the query might look like this:
> {code:java}
> SELECT nearestDate(`eventDate`, '15MINUTE' ) AS eventDate,
> COUNT(*) AS hitCount
> FROM dfs.`log.httpd`
> GROUP BY nearestDate(`eventDate`, '15MINUTE'){code}
> Currently supports the following time units:
>  * YEAR
>  * QUARTER
>  * MONTH
>  * WEEK_SUNDAY
>  * WEEK_MONDAY
>  * DAY
>  * HOUR
>  * HALF_HOUR / 30MIN
>  * QUARTER_HOUR / 15MIN
>  * MINUTE
>  * 30SECOND
>  * 15SECOND
>  * SECOND
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806444#comment-16806444
 ] 

ASF GitHub Bot commented on DRILL-7032:
---

asfgit commented on pull request #1637: DRILL-7032: Ignore corrupt rows in a 
PCAP file
URL: https://github.com/apache/drill/pull/1637
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.4
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have the ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> This is because there are many pcap files with corrupted lines, and this 
> functionality would avoid having to pre-fix the packet captures (example 
> file attached).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7051) Upgrade to Jetty 9.3

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806445#comment-16806445
 ] 

ASF GitHub Bot commented on DRILL-7051:
---

asfgit commented on pull request #1681: DRILL-7051: Upgrade to Jetty 9.3 
URL: https://github.com/apache/drill/pull/1681
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Upgrade to Jetty 9.3 
> -
>
> Key: DRILL-7051
> URL: https://issues.apache.org/jira/browse/DRILL-7051
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.15.0
>Reporter: Veera Naranammalpuram
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Is Drill using a version of the Jetty web server that's really old? The jars 
> suggest it's using Jetty 9.1, which was built sometime in 2014: 
> {noformat}
> -rw-r--r-- 1 veeranaranammalpuram staff 15988 Nov 20 2017 
> jetty-continuation-9.1.1.v20140108.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 103288 Nov 20 2017 
> jetty-http-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 101519 Nov 20 2017 
> jetty-io-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 95906 Nov 20 2017 
> jetty-security-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 401593 Nov 20 2017 
> jetty-server-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 110992 Nov 20 2017 
> jetty-servlet-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 119215 Nov 20 2017 
> jetty-servlets-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 341683 Nov 20 2017 
> jetty-util-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 38707 Dec 21 15:42 
> jetty-util-ajax-9.3.19.v20170502.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 111466 Nov 20 2017 
> jetty-webapp-9.1.1.v20140108.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 41763 Nov 20 2017 
> jetty-xml-9.1.1.v20140108.jar {noformat}
> This version is shown as deprecated: 
> [https://www.eclipse.org/jetty/documentation/current/what-jetty-version.html#d0e203]
> Opening this to upgrade jetty to the latest stable supported version. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806446#comment-16806446
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

asfgit commented on pull request #1680: DRILL-7077: Add Function to Facilitate 
Time Series Analysis
URL: https://github.com/apache/drill/pull/1680
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Function to Facilitate Time Series Analysis
> ---
>
> Key: DRILL-7077
> URL: https://issues.apache.org/jira/browse/DRILL-7077
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>
> When analyzing time based data, you will often have to aggregate by time 
> grains. While some time grains will be easy to calculate, others, such as 
> quarter, can be quite difficult. These functions enable a user to quickly and 
> easily aggregate data by various units of time. Usage is as follows:
> {code:java}
> SELECT <fields>
> FROM <table>
> GROUP BY nearestDate(<date_field>, <time_grain>){code}
> So, if a user wanted to count the number of hits on a web server 
> per 15 minutes, the query might look like this:
> {code:java}
> SELECT nearestDate(`eventDate`, '15MINUTE' ) AS eventDate,
> COUNT(*) AS hitCount
> FROM dfs.`log.httpd`
> GROUP BY nearestDate(`eventDate`, '15MINUTE'){code}
> Currently supports the following time units:
>  * YEAR
>  * QUARTER
>  * MONTH
>  * WEEK_SUNDAY
>  * WEEK_MONDAY
>  * DAY
>  * HOUR
>  * HALF_HOUR / 30MIN
>  * QUARTER_HOUR / 15MIN
>  * MINUTE
>  * 30SECOND
>  * 15SECOND
>  * SECOND
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7142) Add space after > in SqlLine prompt

2019-04-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806447#comment-16806447
 ] 

ASF GitHub Bot commented on DRILL-7142:
---

asfgit commented on pull request #1721: DRILL-7142: Add space after > in 
SqlLine prompt
URL: https://github.com/apache/drill/pull/1721
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add space after > in SqlLine prompt
> ---
>
> Key: DRILL-7142
> URL: https://issues.apache.org/jira/browse/DRILL-7142
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Add a space after > in the SqlLine prompt, as it was before the SqlLine update.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7144) sqlline option : !set useLineContinuation false, fails with ParseException

2019-04-01 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-7144:
-

 Summary: sqlline option : !set useLineContinuation false, fails 
with ParseException
 Key: DRILL-7144
 URL: https://issues.apache.org/jira/browse/DRILL-7144
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.15.0, 1.13.0
Reporter: Khurram Faraaz
Assignee: Arina Ielchiieva


The following sqlline option does not work as intended; it returns a ParseException instead:
!set useLineContinuation false

On mapr-drill-1.13.0 we hit the below Exception.

{noformat}
0: jdbc:drill:drillbit=drill-abcd-dev.dev.schw> !set useLineContinuation false
Error setting configuration: useLineContinuation: 
java.lang.IllegalArgumentException: No method matching "setuseLineContinuation" 
was found in sqlline.SqlLineOpts.
{noformat}

It does not work on drill-1.15.0-mapr-r1

git.branch=drill-1.15.0-mapr-r1
git.commit.id=ebc9fe49d4477b04701fdd81884d5a0b748a13ae

{noformat}
[test@test-ab bin]# ./sqlline -u 
"jdbc:drill:schema=dfs.tmp;auth=MAPRSASL;drillbit=test-ab.qa.lab" -n mapr -p 
mapr
Apache Drill 1.15.0.3-mapr
"Start your SQL engine."
0: jdbc:drill:schema=dfs.tmp> !set useLineContinuation false
0: jdbc:drill:schema=dfs.tmp> select * from sys.version
> select * from sys.memory
Error: PARSE ERROR: Encountered "select" at line 2, column 1.
Was expecting one of:
 
 "ORDER" ...
 "LIMIT" ...
 "OFFSET" ...
 "FETCH" ...
 "NATURAL" ...
 "JOIN" ...
 "INNER" ...
 "LEFT" ...
 "RIGHT" ...
 "FULL" ...
 "CROSS" ...
 "," ...
 "OUTER" ...
 "EXTEND" ...
 "(" ...
 "MATCH_RECOGNIZE" ...
 "AS" ...
  ...
  ...
  ...
  ...
  ...
 "TABLESAMPLE" ...
 "WHERE" ...
 "GROUP" ...
 "HAVING" ...
 "WINDOW" ...
 "UNION" ...
 "INTERSECT" ...
 "EXCEPT" ...
 "MINUS" ...
 "." ...
 "[" ...


SQL Query select * from sys.version
select * from sys.memory
^

[Error Id: 067d5402-b965-4660-8981-34491ab5a051 on test-ab.qa.lab:31010] 
(state=,code=0)
{noformat}


{noformat}
[Error Id: 067d5402-b965-4660-8981-34491ab5a051 ]
 at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
 ~[drill-common-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:185) 
[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:138)
 [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:110)
 [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:76)
 [drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:584) 
[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:272) 
[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[na:1.8.0_151]
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[na:1.8.0_151]
 at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered 
"select" at line 2, column 1.
Was expecting one of:
 
 "ORDER" ...
 "LIMIT" ...
 "OFFSET" ...
 "FETCH" ...
 ...
 "[" ...

at 
org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.convertException(DrillParserImpl.java:350)
 ~[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at 
org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.normalizeException(DrillParserImpl.java:131)
 ~[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:137) 
~[calcite-core-1.17.0-drill-r2.jar:1.17.0-drill-r2]
 at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:162) 
~[calcite-core-1.17.0-drill-r2.jar:1.17.0-drill-r2]
 at org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:177) 
[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 ... 8 common frames omitted
Caused by: org.apache.drill.exec.planner.sql.parser.impl.ParseException: 
Encountered "select" at line 2, column 1.
Was expecting one of:
 
 "ORDER" ...
 "LIMIT" ...
 "OFFSET" ...
 "FETCH" ...
 "NATURAL" ...
 ...
 ...
 "[" ...

at 
org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.generateParseException(DrillParserImpl.java:24076)
 ~[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at 
org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.jj_consume_token(DrillParserImpl.java:23893)
 ~[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at 
org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.SqlStmtEof(DrillParserImpl.java:899)
 ~[drill-java-exec-1.15.0.3-mapr.jar:1.15.0.3-mapr]
 at