[GitHub] drill pull request #903: DRILL-5712: Update the pom files with dependency ex...

2017-09-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/903


---


[GitHub] drill pull request #938: DRILL-5694: Handle HashAgg OOM by spill and retry, ...

2017-09-14 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139045903
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggregator.java
 ---
@@ -47,10 +47,7 @@
   // OK - batch returned, NONE - end of data, RESTART - call again
   public enum AggIterOutcome { AGG_OK, AGG_NONE, AGG_RESTART }
 
-  public abstract void setup(HashAggregate hashAggrConfig, HashTableConfig 
htConfig, FragmentContext context,
- OperatorStats stats, OperatorContext 
oContext, RecordBatch incoming, HashAggBatch outgoing,
- LogicalExpression[] valueExprs, 
List valueFieldIds, TypedFieldId[] keyFieldIds,
- VectorContainer outContainer) throws 
SchemaChangeException, IOException, ClassTransformationException;
+  public abstract void setup(HashAggregate hashAggrConfig, HashTableConfig 
htConfig, FragmentContext context, OperatorStats stats, OperatorContext 
oContext, RecordBatch incoming, HashAggBatch outgoing, LogicalExpression[] 
valueExprs, List valueFieldIds, TypedFieldId[] keyFieldIds, 
VectorContainer outContainer, int extraRowBytes) throws SchemaChangeException, 
IOException, ClassTransformationException;
--- End diff --

That was one of the IDE's ideas.
Simplification could be done as part of future cleanup work (like 
DRILL-5779).


---


[GitHub] drill pull request #938: DRILL-5694: Handle HashAgg OOM by spill and retry, ...

2017-09-14 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139045744
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ---
@@ -1335,7 +1470,7 @@ private void updateStats(HashTable[] htables) {
 }
 if ( rowsReturnedEarly > 0 ) {
   stats.setLongStat(Metric.SPILL_MB, // update stats - est. total MB 
returned early
-  (int) Math.round( rowsReturnedEarly * estRowWidth / 1024.0D / 
1024.0));
+  (int) Math.round( rowsReturnedEarly * estOutputRowWidth / 
1024.0D / 1024.0));
--- End diff --

Work will be done later as part of DRILL-5779 


---


[GitHub] drill pull request #938: DRILL-5694: Handle HashAgg OOM by spill and retry, ...

2017-09-14 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139045329
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ---
@@ -545,16 +584,19 @@ public AggOutcome doWork() {
   if (EXTRA_DEBUG_1) {
 logger.debug("Starting outer loop of doWork()...");
   }
-  for (; underlyingIndex < currentBatchRecordCount; incIndex()) {
+  while (underlyingIndex < currentBatchRecordCount) {
 if (EXTRA_DEBUG_2) {
   logger.debug("Doing loop with values underlying {}, current {}", 
underlyingIndex, currentIndex);
 }
 checkGroupAndAggrValues(currentIndex);
+
+if ( retrySameIndex ) { retrySameIndex = false; }  // need to 
retry this row (e.g. we had an OOM)
--- End diff --

So why does "or before" have spaces ? :-)  
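
The loop change in the diff above (a `for` with an unconditional increment replaced by a `while` plus a `retrySameIndex` flag) can be sketched in isolation. This is only an illustration of the pattern, not the HashAggTemplate code: the class name, the `capacity` field standing in for a memory limit, and the list-based "table" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: retry the same input index after a simulated out-of-memory condition.
final class RetrySameIndexSketch {
    private final int capacity;                          // stand-in for a memory limit; must be >= 1
    private final List<Integer> table = new ArrayList<>();
    private final List<Integer> emitted = new ArrayList<>();
    private boolean retrySameIndex = false;

    RetrySameIndexSketch(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    // Tries to store a row; when "memory" is full, flushes early and requests a retry instead.
    private void put(int row) {
        if (table.size() >= capacity) {
            emitted.addAll(table);   // output early to free space
            table.clear();
            retrySameIndex = true;   // the row was NOT stored
            return;
        }
        table.add(row);
    }

    List<Integer> run(int[] rows) {
        int index = 0;
        while (index < rows.length) {
            put(rows[index]);
            if (retrySameIndex) {
                retrySameIndex = false;  // process the same index again
                continue;
            }
            index++;                     // advance only after a successful put
        }
        emitted.addAll(table);           // flush whatever is left
        table.clear();
        return emitted;
    }
}
```

With a `for` loop that always increments, the row that triggered the flush would be skipped; advancing the index only on success is what keeps every row in the output.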


---


[GitHub] drill pull request #938: DRILL-5694: Handle HashAgg OOM by spill and retry, ...

2017-09-14 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139045072
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ---
@@ -109,14 +107,21 @@
 
   private boolean isTwoPhase = false; // 1 phase or 2 phase aggr?
   private boolean is2ndPhase = false;
-  private boolean canSpill = true; // make it false in case can not spill
+  private boolean is1stPhase = false;
+  private boolean canSpill = true; // make it false in case can not 
spill/return-early
   private ChainedHashTable baseHashTable;
   private boolean earlyOutput = false; // when 1st phase returns a 
partition due to no memory
   private int earlyPartition = 0; // which partition to return early
-
-  private long memoryLimit; // max memory to be used by this oerator
-  private long estMaxBatchSize = 0; // used for adjusting #partitions
-  private long estRowWidth = 0;
+  private boolean retrySameIndex = false; // in case put failed during 1st 
phase - need to output early, then retry
+  private boolean useMemoryPrediction = false; // whether to use memory 
prediction to decide when to spill
+  private long estMaxBatchSize = 0; // used for adjusting #partitions and 
deciding when to spill
+  private long estRowWidth = 0; // the size of the internal "row" (keys + 
values + extra columns)
+  private long estValuesRowWidth = 0; // the size of the internal values ( 
values + extra )
+  private long estOutputRowWidth = 0; // the size of the output "row" (no 
extra columns)
+  private long estValuesBatchSize = 0; // used for "reserving" memory for 
the Values batch to overcome an OOM
+  private long estOutgoingAllocSize = 0; // used for "reserving" memory 
for the Outgoing Output Values to overcome an OOM
+  private long reserveValueBatchMemory; // keep "reserve memory" for 
Values Batch
+  private long reserveOutgoingMemory; // keep "reserve memory" for the 
Outgoing (Values only) output
--- End diff --

Will wait for some future cleanup opportunity.


---


[jira] [Created] (DRILL-5794) Projection pushdown does not preserve collation

2017-09-14 Thread Gautam Kumar Parai (JIRA)
Gautam Kumar Parai created DRILL-5794:
-

 Summary: Projection pushdown does not preserve collation
 Key: DRILL-5794
 URL: https://issues.apache.org/jira/browse/DRILL-5794
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.11.0
Reporter: Gautam Kumar Parai
Assignee: Gautam Kumar Parai


While looking at the rule that pushes projection down into the scan in Drill, 
it seems that we do not consider changes to collation. This would happen in 
general and not just for the projection pushdown across other rels.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5793) NPE on close

2017-09-14 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-5793:
-

 Summary: NPE on close
 Key: DRILL-5793
 URL: https://issues.apache.org/jira/browse/DRILL-5793
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.12.0
 Environment: Drill 1.12.0 commit : 
aaff1b35b7339fb4e6ab480dd517994ff9f0a5c5
Reporter: Khurram Faraaz


The code looks wrong:
{noformat}
 @Override
 public void close() throws Exception {
   options.close();
 }
{noformat}
If the shutdown occurs too early, options is not yet assigned and an NPE results.
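
One way to avoid this kind of failure is a null guard in close(). The sketch below is a minimal, self-contained illustration and not the SystemOptionManager code: the class name, the init() method, and the delegateClosed() accessor are hypothetical, and whether a null check is the right fix for Drill is an assumption.

```java
// Hypothetical sketch: close() tolerates being called before initialization.
final class NullSafeCloseSketch implements AutoCloseable {
    private AutoCloseable options;          // stays null until init() runs
    private boolean delegateClosed = false;

    // Simulates the late assignment of the field during startup.
    void init() {
        options = () -> delegateClosed = true;
    }

    boolean delegateClosed() {
        return delegateClosed;
    }

    @Override
    public void close() throws Exception {
        // Guard: shutdown may begin before init() has assigned the field.
        if (options != null) {
            options.close();
        }
    }
}
```

Calling close() on a fresh instance is then a no-op, while calling it after init() still closes the delegate.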

{noformat}
2017-09-14 20:16:39,551 [main] DEBUG o.apache.drill.exec.server.Drillbit - 
Shutdown begun.
2017-09-14 20:16:41,560 [pool-5-thread-1] INFO  
o.a.drill.exec.rpc.user.UserServer - closed eventLoopGroup 
io.netty.channel.nio.NioEventLoopGroup@71a84ff4 in 1006 ms
2017-09-14 20:16:41,560 [pool-5-thread-2] INFO  
o.a.drill.exec.rpc.data.DataServer - closed eventLoopGroup 
io.netty.channel.nio.NioEventLoopGroup@f711283 in 1005 ms
2017-09-14 20:16:41,561 [pool-5-thread-1] INFO  
o.a.drill.exec.service.ServiceEngine - closed userServer in 1007 ms
2017-09-14 20:16:41,562 [pool-5-thread-2] DEBUG 
o.a.drill.exec.memory.BaseAllocator - closed allocator[rpc:bit-data].
2017-09-14 20:16:41,562 [pool-5-thread-2] INFO  
o.a.drill.exec.service.ServiceEngine - closed dataPool in 1008 ms
2017-09-14 20:16:41,563 [main] DEBUG o.a.drill.exec.memory.BaseAllocator - 
closed allocator[rpc:user].
2017-09-14 20:16:41,563 [main] DEBUG o.a.drill.exec.memory.BaseAllocator - 
closed allocator[rpc:bit-control].
2017-09-14 20:16:41,593 [main] DEBUG o.a.drill.exec.memory.BaseAllocator - 
closed allocator[ROOT].
2017-09-14 20:16:41,593 [main] WARN  o.apache.drill.exec.server.Drillbit - 
Failure on close()
java.lang.NullPointerException: null
at 
org.apache.drill.exec.server.options.SystemOptionManager.close(SystemOptionManager.java:369)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.server.DrillbitContext.close(DrillbitContext.java:241) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at org.apache.drill.exec.work.WorkManager.close(WorkManager.java:154) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:173) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:314) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:290) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:286) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
{noformat}





[GitHub] drill pull request #926: DRILL-5269 Make DirectSubScan Jackson JSON deserial...

2017-09-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/926


---


Code Generation Question

2017-09-14 Thread Timothy Farkas
Hi All,

As I've been looking at the TopN operator and code generation, I've been 
wondering why we have 2 forms of code generation:


  *   One is the method of stitching compiled methods into a template class 
with ASM.
  *   The other simply creates a class that extends the TemplateClass and 
compiles it without using custom ASM techniques. This is the PlainJava 
technique.

With my high level understanding, it seems like using the PlainJava approach 
would be the simplest, and would also probably be the most performant since we 
inherit all the java compiler optimizations. Is there a specific reason why we 
still use our custom ASM technique? Would it be safe to start retiring the old 
ASM technique in favor of PlainJava?
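
To make the second (PlainJava) style concrete, here is a rough sketch of the idea: fixed logic lives in a template class, and the generator emits the source of a subclass that fills in one hook. All names (ComparatorTemplate, PlainJavaSketch, GeneratedAscending) are hypothetical and are not Drill's classes; the sketch stops at producing source text and does not invoke a compiler.

```java
// Hypothetical template: shared driver logic here, generated code supplies the hook.
abstract class ComparatorTemplate {
    // Hook that a generated subclass would implement for a specific query.
    protected abstract int doCompare(long left, long right);

    // Fixed logic shared by every generated subclass; expects a non-empty array.
    public final long max(long[] values) {
        long best = values[0];
        for (int i = 1; i < values.length; i++) {
            if (doCompare(values[i], best) > 0) {
                best = values[i];
            }
        }
        return best;
    }
}

final class PlainJavaSketch {
    // Emits the source of a subclass; a real system would pass this text to a Java compiler.
    static String generateSubclassSource(String className, String compareExpr) {
        return "final class " + className + " extends ComparatorTemplate {\n"
             + "    @Override\n"
             + "    protected int doCompare(long left, long right) {\n"
             + "        return " + compareExpr + ";\n"
             + "    }\n"
             + "}\n";
    }
}

// Hand-written equivalent of the emitted source for compareExpr "Long.compare(left, right)".
final class GeneratedAscending extends ComparatorTemplate {
    @Override
    protected int doCompare(long left, long right) {
        return Long.compare(left, right);
    }
}
```

Because the result is an ordinary subclass, the usual compiler and JIT treatment applies to it, which is the property the question above is pointing at.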

Thanks,
Tim


[jira] [Created] (DRILL-5792) CONVERT_FROM_JSON on an empty file throws runtime exception

2017-09-14 Thread Prasad Nagaraj Subramanya (JIRA)
Prasad Nagaraj Subramanya created DRILL-5792:


 Summary: CONVERT_FROM_JSON on an empty file throws runtime 
exception
 Key: DRILL-5792
 URL: https://issues.apache.org/jira/browse/DRILL-5792
 Project: Apache Drill
  Issue Type: Bug
Reporter: Prasad Nagaraj Subramanya


Sample query to reproduce-
{code}
SELECT CONVERT_FROM(columns[1], 'JSON') as col1 FROM dfs.`file1.tbl`;
{code}

Throws runtime exception-
{code}
Error: Unexpected RuntimeException: java.lang.UnsupportedOperationException: 
Unable to find sql accessor for minor type [NULL] and mode [OPTIONAL] 
(state=,code=0)
{code}
The expected result is that the query returns 0 rows.

Issue observed with commit id - 7a900b71fd269aceee7301afb18fd8d303df5bcd

Expected result-
{code}
+---+
| col1  |
+---+
+---+
{code}
Works with drill 1.11.0





[jira] [Created] (DRILL-5791) Unit test Jackson polymorphic unmarshalling

2017-09-14 Thread Vlad Rozov (JIRA)
Vlad Rozov created DRILL-5791:
-

 Summary: Unit test Jackson polymorphic unmarshalling
 Key: DRILL-5791
 URL: https://issues.apache.org/jira/browse/DRILL-5791
 Project: Apache Drill
  Issue Type: Test
Reporter: Vlad Rozov
Assignee: Vlad Rozov








[GitHub] drill pull request #943: DRILL-5749: solve deadlock between foreman and nett...

2017-09-14 Thread weijietong
GitHub user weijietong opened a pull request:

https://github.com/apache/drill/pull/943

DRILL-5749: solve deadlock between foreman and netty threads

@paul-rogers please review this PR again. I failed to squash the commits in 
the previous PR, sorry about that.

For the related thread stacks, please see 
[DRILL-5749](https://issues.apache.org/jira/browse/DRILL-5749).
The approach is to break up the nested condition invocation.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/weijietong/drill drill-5749

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/943.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #943


commit b44f780a948c4a0898e7cee042c0590f0713f780
Author: weijietong 
Date:   2017-06-08T08:03:46Z

Merge pull request #1 from apache/master

sync

commit d045c757c80a759b435479cc89f33c749fc16ac2
Author: weijie.tong 
Date:   2017-08-11T08:01:36Z

Merge branch 'master' of github.com:weijietong/drill

commit 08b7006f4c70c45a17ebf7eae6beaa2bdb0d0454
Author: weijie.tong 
Date:   2017-08-20T12:05:51Z

update

commit 9e9ebb497a183e61a72665019e6e04070d912027
Author: weijie.tong 
Date:   2017-08-20T12:07:41Z

revert

commit 837d9fc58440fb584690f93b5f638ddcedf042a1
Author: weijie.tong 
Date:   2017-08-22T10:35:12Z

Merge branch 'master' of github.com:apache/drill

commit b1fc840ad9d0a9959b05a84bfd17f17067def32d
Author: weijie.tong 
Date:   2017-08-29T16:39:48Z

Merge branch 'master' of github.com:apache/drill

commit 52d7a0b795cf2ef29c596e84277cc01f1c105d19
Author: weijie.tong 
Date:   2017-09-14T11:55:26Z

Merge branch 'master' of github.com:apache/drill

commit 2fbc23998ff5c8cb8a2a476221be856d69a559c4
Author: weijie.tong 
Date:   2017-09-14T12:02:55Z

solve deadlock occured between foreman and netty threads




---


[GitHub] drill pull request #925: DRILL-5749: solve foreman and netty threads deadloc...

2017-09-14 Thread weijietong
Github user weijietong closed the pull request at:

https://github.com/apache/drill/pull/925


---


[GitHub] drill pull request #942: DRILL-5781: Fix unit test failures to use tests con...

2017-09-14 Thread vvysotskyi
GitHub user vvysotskyi opened a pull request:

https://github.com/apache/drill/pull/942

DRILL-5781: Fix unit test failures to use tests config even if default 
config is available

Please see [DRILL-5781](https://issues.apache.org/jira/browse/DRILL-5781) 
for details.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vvysotskyi/drill DRILL-5781

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/942.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #942


commit b47b4d760adc62e1625d23e80aae611a54ea9e28
Author: Volodymyr Vysotskyi 
Date:   2017-09-07T18:01:12Z

DRILL-5781: Fix unit test failures to use tests config even if default 
config is available




---


[jira] [Created] (DRILL-5790) PCAP format explicitly opens local file

2017-09-14 Thread Ted Dunning (JIRA)
Ted Dunning created DRILL-5790:
--

 Summary: PCAP format explicitly opens local file
 Key: DRILL-5790
 URL: https://issues.apache.org/jira/browse/DRILL-5790
 Project: Apache Drill
  Issue Type: Bug
Reporter: Ted Dunning


Note the new FileInputStream line
{code}
@Override
public void setup(final OperatorContext context, final OutputMutator output) throws ExecutionSetupException {
  try {
    this.output = output;
    this.buffer = new byte[10];
    this.in = new FileInputStream(inputPath);
    this.decoder = new PacketDecoder(in);
    this.validBytes = in.read(buffer);
    this.projectedCols = getProjectedColsIfItNull();
    setColumns(projectedColumns);
  } catch (IOException io) {
    throw UserException.dataReadError(io)
        .addContext("File name:", inputPath)
        .build(logger);
  }
}
{code}
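
Assuming the concern is that FileInputStream bypasses whatever file system the query is configured to use, one possible direction is to have the reader obtain its stream through an abstraction supplied by the caller. The sketch below is illustrative only: StreamOpener and HeaderReaderSketch are hypothetical names, not Drill or Hadoop APIs, and this is not the fix that was applied.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical abstraction over "the file system the query is actually using".
interface StreamOpener {
    InputStream open(String path) throws IOException;
}

final class HeaderReaderSketch {
    private final StreamOpener opener;

    HeaderReaderSketch(StreamOpener opener) {
        this.opener = opener;
    }

    // Reads leading bytes through the supplied opener; returns the count read, or -1 at end of stream.
    int readHeader(String path, byte[] buffer) throws IOException {
        try (InputStream in = opener.open(path)) {
            return in.read(buffer);
        }
    }
}
```

The reader no longer decides where the bytes come from, so the same code can be pointed at a local file, a distributed file system, or an in-memory stream in a test.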


