[jira] [Commented] (DRILL-6656) Add Regex To Disallow Extra Semicolons In Imports

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570960#comment-16570960
 ] 

ASF GitHub Bot commented on DRILL-6656:
---

ilooner commented on a change in pull request #1415: DRILL-6656: Disallow extra 
semicolons in import statements.
URL: https://github.com/apache/drill/pull/1415#discussion_r208065944
 
 

 ##
 File path: 
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseRegionScanAssignments.java
 ##
 @@ -176,8 +176,19 @@ public void 
testHBaseGroupScanAssignmentSomeAfinedWithOrphans() throws Exception
 scan.applyAssignments(endpoints);
 
 LinkedList<Integer> sizes = Lists.newLinkedList();
-sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1);
-sizes.add(2); sizes.add(2); sizes.add(2); sizes.add(2); sizes.add(2);
+sizes.add(1);
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Regex To Disallow Extra Semicolons In Imports
> -
>
> Key: DRILL-6656
> URL: https://issues.apache.org/jira/browse/DRILL-6656
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6656) Add Regex To Disallow Extra Semicolons In Imports

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570758#comment-16570758
 ] 

ASF GitHub Bot commented on DRILL-6656:
---

ilooner commented on a change in pull request #1415: DRILL-6656: Disallow extra 
semicolons in import statements.
URL: https://github.com/apache/drill/pull/1415#discussion_r208024603
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillAggFuncHolder.java
 ##
 @@ -93,13 +93,13 @@ public boolean isAggregating() {
 //Loop through all workspace vectors to get the minimum size of all workspace vectors.
 JVar sizeVar = setupBlock.decl(g.getModel().INT, "vectorSize", JExpr.lit(Integer.MAX_VALUE));
 JClass mathClass = g.getModel().ref(Math.class);
-for (int id = 0; id < getWorkspaceVars().length; id ++) {
+for (int id = 0; id < getWorkspaceVars().length; id++) {
   if (!getWorkspaceVars()[id].isInject()) {
 
setupBlock.assign(sizeVar,mathClass.staticInvoke("min").arg(sizeVar).arg(g.getWorkspaceVectors().get(getWorkspaceVars()[id]).invoke("getValueCapacity")));
   }
 }
 
-for(int i =0 ; i < getWorkspaceVars().length; i++) {
+for(int i =0; i < getWorkspaceVars().length; i++) {
 
 Review comment:
   @vvysotskyi I don't understand; could you give an example? I fixed the 
spacing on these lines because they violated the NoWhitespaceBefore rule.
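
A minimal sketch of what checkstyle's NoWhitespaceBefore rule flags on the 
lines above; the class is a hypothetical standalone example, not Drill code:

{code:java}
public class NoWhitespaceBeforeExample {
  public static void main(String[] args) {
    int total = 0;
    // Violation: whitespace before the postfix '++', as on the removed lines of the diff.
    for (int id = 0; id < 10; id ++) {
      total += id;
    }
    // Compliant form, matching the added lines of the diff.
    for (int i = 0; i < 10; i++) {
      total += i;
    }
    System.out.println(total); // 90
  }
}
{code}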


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Regex To Disallow Extra Semicolons In Imports
> -
>
> Key: DRILL-6656
> URL: https://issues.apache.org/jira/browse/DRILL-6656
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6656) Add Regex To Disallow Extra Semicolons In Imports

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570690#comment-16570690
 ] 

ASF GitHub Bot commented on DRILL-6656:
---

vrozov commented on a change in pull request #1415: DRILL-6656: Disallow extra 
semicolons in import statements.
URL: https://github.com/apache/drill/pull/1415#discussion_r208005986
 
 

 ##
 File path: 
contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseRegionScanAssignments.java
 ##
 @@ -176,8 +176,19 @@ public void 
testHBaseGroupScanAssignmentSomeAfinedWithOrphans() throws Exception
 scan.applyAssignments(endpoints);
 
 LinkedList<Integer> sizes = Lists.newLinkedList();
-sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1); sizes.add(1);
-sizes.add(2); sizes.add(2); sizes.add(2); sizes.add(2); sizes.add(2);
+sizes.add(1);
 
 Review comment:
   `Collections.addAll(sizes, 1, 1, ...);`?
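
A minimal sketch of the suggested refactor; the eight 1s and five 2s mirror 
the removed lines in the diff above:

{code:java}
import java.util.Collections;
import java.util.LinkedList;

public class AddAllExample {
  public static void main(String[] args) {
    LinkedList<Integer> sizes = new LinkedList<>();
    // Replaces the eight chained sizes.add(1) calls.
    Collections.addAll(sizes, 1, 1, 1, 1, 1, 1, 1, 1);
    // Replaces the five chained sizes.add(2) calls.
    Collections.addAll(sizes, 2, 2, 2, 2, 2);
    System.out.println(sizes.size()); // 13
  }
}
{code}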


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Regex To Disallow Extra Semicolons In Imports
> -
>
> Key: DRILL-6656
> URL: https://issues.apache.org/jira/browse/DRILL-6656
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6663) Shutdown not working when IP is used to load WebUI in secure cluster

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570589#comment-16570589
 ] 

ASF GitHub Bot commented on DRILL-6663:
---

dvjyothsna commented on a change in pull request #1424: DRILL-6663: Fixed 
shutdown button in Web UI
URL: https://github.com/apache/drill/pull/1424#discussion_r207985048
 
 

 ##
 File path: exec/java-exec/src/main/resources/rest/index.ftl
 ##
 @@ -435,6 +435,10 @@
 let rowElem = $(shutdownBtn).parent().parent();
 let hostAddr = 
$(rowElem).find('#address').contents().get(0).nodeValue.trim();
 let hostPort = $(rowElem).find('#httpPort').html();
+// Always use the host address from the url for the current Drillbit
 
 Review comment:
   Done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Shutdown not working when IP is used to load WebUI in secure cluster
> 
>
> Key: DRILL-6663
> URL: https://issues.apache.org/jira/browse/DRILL-6663
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.14.0
>Reporter: Sorabh Hamirwasia
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
>
> In the index.ftl shutdown() function, the shutdown request is always sent to 
> the server using the hostname. In a secure cluster, when a user logs in with an 
> IP address or hostname in the URL, the cookie it receives carries that same IP 
> address or hostname.
> So in a secure cluster (without HTTPS enabled), when a user logs in using the 
> IP address and submits a shutdown request, that POST request is sent to the 
> server using the hostname. Since the cookie presented to the server carries the 
> IP address, the request is rejected or ignored, and the server does not shut 
> down. When the user logs in using the hostname, everything works fine.
> The fix is to use the [IP/hostname information from the URL rather than from the 
> Drillbits list table in the index.ftl 
> page|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/resources/rest/index.ftl#L436].
> That way the shutdown request always goes to the domain name for which the 
> browser holds the cookie.
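
A small illustration of the cookie/domain mismatch described above, using the 
JDK's cookie domain-matching rules; the host names are hypothetical:

{code:java}
import java.net.HttpCookie;

public class CookieDomainExample {
  public static void main(String[] args) {
    // A session cookie issued for the IP the user logged in with does not
    // match a request sent to the hostname, so the shutdown POST is rejected.
    System.out.println(HttpCookie.domainMatches("10.0.0.5", "drillbit.example.com")); // false
    // Sending the request to the same host the cookie was issued for matches.
    System.out.println(HttpCookie.domainMatches("drillbit.example.com", "drillbit.example.com")); // true
  }
}
{code}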



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6649) Query with unnest of column from nested subquery fails

2018-08-06 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-6649:
---
Labels: ready-to-commit  (was: )

> Query with unnest of column from nested subquery fails
> --
>
> Key: DRILL-6649
> URL: https://issues.apache.org/jira/browse/DRILL-6649
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.15.0
>
>
> This query:
> {code:sql}
> select t2.o from (select * from cp.`lateraljoin/nested-customer.json` limit 
> 1) t, unnest(t.orders) t2(o)
> {code}
> fails with error:
> {noformat}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError
> [Error Id: 6868e327-ab2c-44a2-ab0c-cf30f4a64349 on user515050-pc:31010]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:761)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83)
>  [classes/:na]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:293) 
> [classes/:na]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_181]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_181]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: null
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:294) 
> [classes/:na]
>   ... 3 common frames omitted
> Caused by: java.lang.AssertionError: null
>   at 
> org.apache.calcite.sql.SqlUnnestOperator.inferReturnType(SqlUnnestOperator.java:80)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:437) 
> ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.UnnestNamespace.validateImpl(UnnestNamespace.java:67)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2960)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3012)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2969)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3219)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> 

[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570554#comment-16570554
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r202754788
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java
 ##
 @@ -29,13 +30,28 @@
 import org.apache.drill.exec.record.selection.SelectionVector2;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 
-public class RowSetBatch implements RecordBatch {
-  private final RowSet rowSet;
+/**
+ * A mock operator that returns the provided {@link RowSet}s as batches.
+ * Currently it is assumed that all the {@link RowSet}s have the same schema.
+ */
+public class RowSetBatch implements CloseableRecordBatch {
 
 Review comment:
   This is similar to 
[MockRecordBatch](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java).
 I created that class for the Lateral-Unnest project. 
   Its main use was to be able to simulate different IterOutcome values with 
different types of batches while unit testing an operator. Maybe we can try to 
merge these two classes.
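
A hedged sketch of the testing idea described here: a mock source replays a 
scripted sequence of outcomes to the operator under test. The outcome names 
echo Drill's RecordBatch.IterOutcome, but the class is an illustrative 
stand-in, not MockRecordBatch's actual API:

{code:java}
import java.util.Arrays;
import java.util.Iterator;

public class ScriptedMockBatch {
  enum IterOutcome { OK_NEW_SCHEMA, OK, NONE }

  private final Iterator<IterOutcome> script;

  ScriptedMockBatch(IterOutcome... outcomes) {
    this.script = Arrays.asList(outcomes).iterator();
  }

  // The operator under test pulls outcomes one at a time, as it would from an
  // upstream batch.
  IterOutcome next() {
    return script.hasNext() ? script.next() : IterOutcome.NONE;
  }

  public static void main(String[] args) {
    ScriptedMockBatch mock =
        new ScriptedMockBatch(IterOutcome.OK_NEW_SCHEMA, IterOutcome.OK, IterOutcome.OK);
    IterOutcome outcome;
    while ((outcome = mock.next()) != IterOutcome.NONE) {
      System.out.println(outcome);
    }
  }
}
{code}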


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570553#comment-16570553
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r207977722
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java
 ##
 @@ -42,6 +42,8 @@
   private final BufferAllocator allocator;
  protected final List<VectorWrapper<?>> wrappers = Lists.newArrayList();
   private BatchSchema schema;
+  private SelectionVector2 selectionVector2;
+  private SelectionVector4 selectionVector4;
 
 Review comment:
   I don't think we should include SV2/SV4 inside a VectorContainer. They are 
more RecordBatch-level members. A VectorContainer should only hold metadata and 
its set of vectors, which is what it does today.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6461) Add Basic Data Correctness Unit Tests

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570552#comment-16570552
 ] 

ASF GitHub Bot commented on DRILL-6461:
---

sohami commented on a change in pull request #1344: DRILL-6461: Added basic 
data correctness tests for hash agg, and improved operator unit testing 
framework.
URL: https://github.com/apache/drill/pull/1344#discussion_r202754788
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetBatch.java
 ##
 @@ -29,13 +30,28 @@
 import org.apache.drill.exec.record.selection.SelectionVector2;
 import org.apache.drill.exec.record.selection.SelectionVector4;
 
+import java.util.ArrayList;
 import java.util.Iterator;
+import java.util.List;
 
-public class RowSetBatch implements RecordBatch {
-  private final RowSet rowSet;
+/**
+ * A mock operator that returns the provided {@link RowSet}s as batches.
+ * Currently it is assumed that all the {@link RowSet}s have the same schema.
+ */
+public class RowSetBatch implements CloseableRecordBatch {
 
 Review comment:
   This is the same as 
[MockRecordBatch](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/MockRecordBatch.java).
 I created that class for the Lateral-Unnest project. 
   Its main use was to be able to simulate different IterOutcome values with 
different types of batches while unit testing an operator. Maybe we can try to 
merge these two classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Basic Data Correctness Unit Tests
> -
>
> Key: DRILL-6461
> URL: https://issues.apache.org/jira/browse/DRILL-6461
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> There are no data correctness unit tests for HashAgg. We need to add some.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6663) Shutdown not working when IP is used to load WebUI in secure cluster

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570497#comment-16570497
 ] 

ASF GitHub Bot commented on DRILL-6663:
---

sohami commented on a change in pull request #1424: DRILL-6663: Fixed shutdown 
button in Web UI
URL: https://github.com/apache/drill/pull/1424#discussion_r207964236
 
 

 ##
 File path: exec/java-exec/src/main/resources/rest/index.ftl
 ##
 @@ -435,6 +435,10 @@
 let rowElem = $(shutdownBtn).parent().parent();
 let hostAddr = 
$(rowElem).find('#address').contents().get(0).nodeValue.trim();
 let hostPort = $(rowElem).find('#httpPort').html();
+// Always use the host address from the url for the current Drillbit
 
 Review comment:
   please add an explanation or say `For details refer DRILL-6663`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Shutdown not working when IP is used to load WebUI in secure cluster
> 
>
> Key: DRILL-6663
> URL: https://issues.apache.org/jira/browse/DRILL-6663
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.14.0
>Reporter: Sorabh Hamirwasia
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
>
> In the index.ftl shutdown() function, the shutdown request is always sent to 
> the server using the hostname. In a secure cluster, when a user logs in with an 
> IP address or hostname in the URL, the cookie it receives carries that same IP 
> address or hostname.
> So in a secure cluster (without HTTPS enabled), when a user logs in using the 
> IP address and submits a shutdown request, that POST request is sent to the 
> server using the hostname. Since the cookie presented to the server carries the 
> IP address, the request is rejected or ignored, and the server does not shut 
> down. When the user logs in using the hostname, everything works fine.
> The fix is to use the [IP/hostname information from the URL rather than from the 
> Drillbits list table in the index.ftl 
> page|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/resources/rest/index.ftl#L436].
> That way the shutdown request always goes to the domain name for which the 
> browser holds the cookie.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Oleksandr Kalinin (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570487#comment-16570487
 ] 

Oleksandr Kalinin commented on DRILL-6670:
--

Brief debug results:

- In 1.13, `SchemaElement.getConverted_type()`, used in type conversion, returns 
null, which triggers code to handle it as the 'default' BIGINT (`Type.Minor.BIGINT`).
- In 1.14, `SchemaElement.getConverted_type()` returns TIMESTAMP_MICROS, which 
leads to the exception.

The query works when forcing an explicit conversion of TIMESTAMP_MICROS to 
`Type.Minor.BIGINT` in `ParquetToDrillTypeConverter` and `ColumnReaderFactory` 
in 1.14.

Not sure where it comes from, though; it seems like a Parquet change?
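
A hypothetical sketch of the forced conversion described above; it stands in 
for the change tried in `ParquetToDrillTypeConverter`, using plain strings 
rather than Drill's or Parquet's real enums:

{code:java}
public class ConvertedTypeFallback {
  // A null converted type reproduces the 1.13 default of BIGINT; TIMESTAMP_MICROS
  // is forced to BIGINT as in the workaround, instead of throwing as 1.14 does.
  static String toDrillMinorType(String convertedType) {
    if (convertedType == null || "TIMESTAMP_MICROS".equals(convertedType)) {
      return "BIGINT";
    }
    throw new UnsupportedOperationException("Unsupported converted type: " + convertedType);
  }

  public static void main(String[] args) {
    System.out.println(toDrillMinorType(null));               // BIGINT (1.13 behavior)
    System.out.println(toDrillMinorType("TIMESTAMP_MICROS")); // BIGINT (forced workaround)
  }
}
{code}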

> Error in parquet record reader - previously readable file fails to be read in 
> 1.14
> --
>
> Key: DRILL-6670
> URL: https://issues.apache.org/jira/browse/DRILL-6670
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Dave Challis
>Priority: Major
> Attachments: example.parquet
>
>
> Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
> and 1.13, but fails to be read with 1.14.
> Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
> error message from the Drill web query UI:
> {code}
> Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Error in parquet record reader. Message: Failure in setting up reader Parquet 
> Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional 
> binary name (UTF8); optional binary creation_parameters (UTF8); optional 
> int64 creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional 
> int32 schema_version; } , metadata: {pandas={"index_columns": [], 
> "column_indexes": [], "columns": [{"name": "name", "field_name": "name", 
> "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_parameters", "field_name": "creation_parameters", "pandas_type": 
> "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_date", "field_name": "creation_date", "pandas_type": "datetime", 
> "numpy_type": "datetime64[ns]", "metadata": null}, {"name": "data_version", 
> "field_name": "data_version", "pandas_type": "int32", "numpy_type": "int32", 
> "metadata": null}, {"name": "schema_version", "field_name": "schema_version", 
> "pandas_type": "int32", "numpy_type": "int32", "metadata": null}], 
> "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 27142 
> [ColumnMetaData{SNAPPY [name] optional binary name (UTF8) [PLAIN, RLE], 4}, 
> ColumnMetaData{SNAPPY [creation_parameters] optional binary 
> creation_parameters (UTF8) [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
> [creation_date] optional int64 creation_date (TIMESTAMP_MICROS) [PLAIN, RLE], 
> 46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version 
> [PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
> schema_version [PLAIN, RLE], 46593}]}]} Fragment 0:0 [Error Id: 
> bdb2e4d5-5982-4cc6-b95e-244782f827d2 on f9d0456cddd2:31010] 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6663) Shutdown not working when IP is used to load WebUI in secure cluster

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570477#comment-16570477
 ] 

ASF GitHub Bot commented on DRILL-6663:
---

dvjyothsna opened a new pull request #1424: DRILL-6663: Fixed shutdown button 
in Web UI
URL: https://github.com/apache/drill/pull/1424
 
 
   @sohami Please review


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Shutdown not working when IP is used to load WebUI in secure cluster
> 
>
> Key: DRILL-6663
> URL: https://issues.apache.org/jira/browse/DRILL-6663
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.14.0
>Reporter: Sorabh Hamirwasia
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
>
> In the index.ftl shutdown() function, the shutdown request is always sent to 
> the server using the hostname. In a secure cluster, when a user logs in with an 
> IP address or hostname in the URL, the cookie it receives carries that same IP 
> address or hostname.
> So in a secure cluster (without HTTPS enabled), when a user logs in using the 
> IP address and submits a shutdown request, that POST request is sent to the 
> server using the hostname. Since the cookie presented to the server carries the 
> IP address, the request is rejected or ignored, and the server does not shut 
> down. When the user logs in using the hostname, everything works fine.
> The fix is to use the [IP/hostname information from the URL rather than from the 
> Drillbits list table in the index.ftl 
> page|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/resources/rest/index.ftl#L436].
> That way the shutdown request always goes to the domain name for which the 
> browser holds the cookie.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570419#comment-16570419
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

sohami commented on a change in pull request #1334: DRILL-6385: Support JPPD 
feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207947589
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterManager.java
 ##
 @@ -0,0 +1,666 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.filter;
+
+import org.apache.calcite.plan.volcano.RelSubset;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.JoinInfo;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.commons.collections.CollectionUtils;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.AccountingDataTunnel;
+import org.apache.drill.exec.ops.Consumer;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.ops.SendingAccountor;
+import org.apache.drill.exec.ops.StatusHandler;
+import org.apache.drill.exec.physical.PhysicalPlan;
+
+import org.apache.drill.exec.physical.base.AbstractPhysicalVisitor;
+import org.apache.drill.exec.physical.base.Exchange;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.config.BroadcastExchange;
+import org.apache.drill.exec.physical.config.HashJoinPOP;
+import org.apache.drill.exec.planner.fragment.Fragment;
+import org.apache.drill.exec.planner.fragment.Wrapper;
+import org.apache.drill.exec.planner.physical.HashAggPrel;
+import org.apache.drill.exec.planner.physical.HashJoinPrel;
+import org.apache.drill.exec.planner.physical.Prel;
+import org.apache.drill.exec.planner.physical.ScanPrel;
+import org.apache.drill.exec.planner.physical.StreamAggPrel;
+import org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor;
+import org.apache.drill.exec.proto.BitData;
+import org.apache.drill.exec.proto.CoordinationProtos;
+import org.apache.drill.exec.proto.GeneralRPCProtos;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.drill.exec.rpc.data.DataTunnel;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.util.Pointer;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * This class traverses the physical operator tree to find the HashJoin operators
+ * for which JPPD (join predicate push down) is possible. The prerequisites for JPPD are:
+ * 1. The join condition is an equality
+ * 2. The physical join node is a HashJoin
+ * 3. The probe side children of the HashJoin node should not contain a blocking operator like HashAgg
+ */
+public class RuntimeFilterManager {
+
+  private Wrapper rootWrapper;
+  //HashJoin node's major fragment id to its corresponding probe side nodes' endpoints
+  private Map<Integer, List<CoordinationProtos.DrillbitEndpoint>> joinMjId2probdeScanEps = new HashMap<>();
+  //HashJoin node's major fragment id to its corresponding probe side nodes' number
+  private Map<Integer, Integer> joinMjId2scanSize = new ConcurrentHashMap<>();
+  //HashJoin node's major fragment id to the major fragment id that its probe side scan node belongs to
+  private Map<Integer, Integer> joinMjId2ScanMjId = new HashMap<>();
+
+  private RuntimeFilterWritable aggregatedRuntimeFilter;
+
+  private DrillbitContext drillbitContext;
+
+  private SendingAccountor sendingAccountor = new SendingAccountor();
+
+  private String lineSeparator;
+
+  private 

[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570420#comment-16570420
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

sohami commented on a change in pull request #1334: DRILL-6385: Support JPPD 
feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207948737
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterManager.java
 ##
 @@ -0,0 +1,666 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.filter;
+
+import org.apache.calcite.plan.volcano.RelSubset;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.JoinInfo;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.commons.collections.CollectionUtils;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.AccountingDataTunnel;
+import org.apache.drill.exec.ops.Consumer;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.ops.SendingAccountor;
+import org.apache.drill.exec.ops.StatusHandler;
+import org.apache.drill.exec.physical.PhysicalPlan;
+
+import org.apache.drill.exec.physical.base.AbstractPhysicalVisitor;
+import org.apache.drill.exec.physical.base.Exchange;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.config.BroadcastExchange;
+import org.apache.drill.exec.physical.config.HashJoinPOP;
+import org.apache.drill.exec.planner.fragment.Fragment;
+import org.apache.drill.exec.planner.fragment.Wrapper;
+import org.apache.drill.exec.planner.physical.HashAggPrel;
+import org.apache.drill.exec.planner.physical.HashJoinPrel;
+import org.apache.drill.exec.planner.physical.Prel;
+import org.apache.drill.exec.planner.physical.ScanPrel;
+import org.apache.drill.exec.planner.physical.StreamAggPrel;
+import org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor;
+import org.apache.drill.exec.proto.BitData;
+import org.apache.drill.exec.proto.CoordinationProtos;
+import org.apache.drill.exec.proto.GeneralRPCProtos;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.drill.exec.rpc.data.DataTunnel;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.util.Pointer;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * This class traverses the physical operator tree to find the HashJoin operators
+ * for which JPPD (join predicate push down) is possible. The prerequisites for JPPD are:
+ * 1. The join condition is an equality
+ * 2. The physical join node is a HashJoin
+ * 3. The probe side children of the HashJoin node should not contain a blocking operator like HashAgg
+ */
+public class RuntimeFilterManager {
+
+  private Wrapper rootWrapper;
+  //HashJoin node's major fragment id to its corresponding probe side nodes' endpoints
+  private Map<Integer, List<CoordinationProtos.DrillbitEndpoint>> joinMjId2probdeScanEps = new HashMap<>();
+  //HashJoin node's major fragment id to its corresponding probe side nodes' number
+  private Map<Integer, Integer> joinMjId2scanSize = new ConcurrentHashMap<>();
+  //HashJoin node's major fragment id to the major fragment id that its probe side scan node belongs to
+  private Map<Integer, Integer> joinMjId2ScanMjId = new HashMap<>();
+
+  private RuntimeFilterWritable aggregatedRuntimeFilter;
+
+  private DrillbitContext drillbitContext;
+
+  private SendingAccountor sendingAccountor = new SendingAccountor();
+
+  private String lineSeparator;
+
+  private 

[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570418#comment-16570418
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

sohami commented on a change in pull request #1334: DRILL-6385: Support JPPD 
feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207946163
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java
 ##
 @@ -408,6 +424,9 @@ private void runPhysicalPlan(final PhysicalPlan plan) 
throws ExecutionSetupExcep
 fragmentsRunner.setFragmentsInfo(work.getFragments(), 
work.getRootFragment(), work.getRootOperator());
 
 startQueryProcessing();
+if (enableRuntimeFilter) {
+  runtimeFilterManager.waitForComplete();
 
 Review comment:
   But based on my understanding, `RuntimeFilterManager` only sends something 
over the network when broadcasting the aggregated bloom filter. The broadcast 
currently happens on the RPC thread, which needs to be changed. So at what 
point will this Foreman thread also end up sending the buffers over the network?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support JPPD (Join Predicate Push Down)
> ---
>
> Key: DRILL-6385
> URL: https://issues.apache.org/jira/browse/DRILL-6385
> Project: Apache Drill
>  Issue Type: New Feature
>  Components:  Server, Execution - Flow
>Affects Versions: 1.14.0
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Major
>
> This feature adds support for JPPD (Join Predicate Push Down). It will 
> benefit HashJoin and Broadcast HashJoin performance by reducing the number 
> of rows sent across the network and the memory consumed. This feature is 
> already supported by Impala, which calls it RuntimeFilter 
> ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]).
>  The first PR will try to push down a bloom filter from the HashJoin node to 
> Parquet's scan node. The proposed basic procedure is as follows:
>  # The HashJoin build side accumulates the equal join condition rows to 
> construct a bloom filter. It then sends the bloom filter to the foreman node.
>  # The foreman node passively accepts the bloom filters from all the fragments 
> that have the HashJoin operator. It then aggregates the bloom filters to form 
> a global bloom filter.
>  # The foreman node broadcasts the global bloom filter to all the probe side 
> scan nodes, which may already have sent out partial data to the hash join 
> nodes (currently the hash join node prefetches one batch from both sides).
>  # The scan node accepts the global bloom filter from the foreman node and 
> filters the remaining rows against the bloom filter.
>  
> To implement the above execution flow, the main new notions are described below:
>       1. RuntimeFilter
> A filter container which may contain a BloomFilter or a MinMaxFilter.
>       2. RuntimeFilterReporter
> Wraps the logic to send the hash join's bloom filter to the foreman. The 
> serialized bloom filter is sent out through the data tunnel. This object is 
> instantiated by the FragmentExecutor and passed to the FragmentContext, so 
> the HashJoin operator can obtain it through the FragmentContext.
>      3. RuntimeFilterRequestHandler
> Responsible for accepting a SendRuntimeFilterRequest RPC and stripping the 
> actual BloomFilter from the network. It then hands this filter to the 
> WorkerBee's new registerRuntimeFilter interface.
> Another RPC type is BroadcastRuntimeFilterRequest. It registers the 
> accepted global bloom filter with the WorkerBee via the registerRuntimeFilter 
> method and then propagates it to the FragmentContext, through which the probe 
> side scan node can fetch the aggregated bloom filter.
>       4. RuntimeFilterManager
> The foreman instantiates a RuntimeFilterManager, which indirectly collects 
> every RuntimeFilter via the WorkerBee. Once all the BloomFilters have been 
> accepted and aggregated, it broadcasts the aggregated bloom filter to 
> all the probe side scan nodes through the data tunnel via a 
> BroadcastRuntimeFilterRequest RPC.
>      5. RuntimeFilterEnableOption
>  A global option will be added to decide whether to enable this new feature.
>  
> Welcome suggestions and advice. The related PR will be presented as soon as 
> possible.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
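
To make the bloom-filter flow in the DRILL-6385 description above concrete, 
here is a minimal, illustrative sketch of the kind of structure steps 1-4 pass 
around. It is not Drill's RuntimeFilter implementation, and the two-probe 
hashing is an assumption chosen for brevity:

{code:java}
import java.util.BitSet;

public class BloomFilterSketch {
  private final BitSet bits;
  private final int size;

  public BloomFilterSketch(int size) {
    this.size = size;
    this.bits = new BitSet(size);
  }

  // Two cheap probe positions derived from the key's hash (illustrative only).
  private int h1(Object key) { return Math.floorMod(key.hashCode(), size); }
  private int h2(Object key) { return Math.floorMod(31 * key.hashCode() + 17, size); }

  // Build side: record each equi-join key (step 1).
  public void add(Object key) { bits.set(h1(key)); bits.set(h2(key)); }

  // Foreman: union partial filters into the global filter (step 2).
  public void merge(BloomFilterSketch other) { bits.or(other.bits); }

  // Probe-side scan: rows whose key cannot be present are dropped (step 4).
  public boolean mightContain(Object key) { return bits.get(h1(key)) && bits.get(h2(key)); }

  public static void main(String[] args) {
    BloomFilterSketch build = new BloomFilterSketch(1024);
    build.add("order-1");
    build.add("order-2");
    System.out.println(build.mightContain("order-1")); // true
    System.out.println(build.mightContain("order-9")); // false with high probability
  }
}
{code}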


[jira] [Commented] (DRILL-6649) Query with unnest of column from nested subquery fails

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570403#comment-16570403
 ] 

ASF GitHub Bot commented on DRILL-6649:
---

vvysotskyi edited a comment on issue #1421: DRILL-6649: Query with unnest of 
column from nested star subquery fails
URL: https://github.com/apache/drill/pull/1421#issuecomment-410752353
 
 
   @sohami, yes, it was fixed in 
https://github.com/apache/calcite/commit/41a06771876374474d46ef8d3a14886d742c66e9
 and cherry-picked to our Calcite fork 
(https://github.com/mapr/incubator-calcite/commits/DrillCalcite1.16.0).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Query with unnest of column from nested subquery fails
> --
>
> Key: DRILL-6649
> URL: https://issues.apache.org/jira/browse/DRILL-6649
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> This query:
> {code:sql}
> select t2.o from (select * from cp.`lateraljoin/nested-customer.json` limit 
> 1) t, unnest(t.orders) t2(o)
> {code}
> fails with error:
> {noformat}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError
> [Error Id: 6868e327-ab2c-44a2-ab0c-cf30f4a64349 on user515050-pc:31010]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:761)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83)
>  [classes/:na]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:293) 
> [classes/:na]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_181]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_181]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: null
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:294) 
> [classes/:na]
>   ... 3 common frames omitted
> Caused by: java.lang.AssertionError: null
>   at 
> org.apache.calcite.sql.SqlUnnestOperator.inferReturnType(SqlUnnestOperator.java:80)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:437) 
> ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.UnnestNamespace.validateImpl(UnnestNamespace.java:67)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2960)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3012)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2969)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
> 

[jira] [Commented] (DRILL-6649) Query with unnest of column from nested subquery fails

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570376#comment-16570376
 ] 

ASF GitHub Bot commented on DRILL-6649:
---

vvysotskyi commented on issue #1421: DRILL-6649: Query with unnest of column 
from nested star subquery fails
URL: https://github.com/apache/drill/pull/1421#issuecomment-410752353
 
 
   @sohami, yes, it was fixed in 
https://github.com/apache/calcite/commit/41a06771876374474d46ef8d3a14886d742c66e9
 and cherry-picked to our Calcite fork.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Query with unnest of column from nested subquery fails
> --
>
> Key: DRILL-6649
> URL: https://issues.apache.org/jira/browse/DRILL-6649
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> This query:
> {code:sql}
> select t2.o from (select * from cp.`lateraljoin/nested-customer.json` limit 
> 1) t, unnest(t.orders) t2(o)
> {code}
> fails with error:
> {noformat}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError
> [Error Id: 6868e327-ab2c-44a2-ab0c-cf30f4a64349 on user515050-pc:31010]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:761)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83)
>  [classes/:na]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:293) 
> [classes/:na]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_181]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_181]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: null
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:294) 
> [classes/:na]
>   ... 3 common frames omitted
> Caused by: java.lang.AssertionError: null
>   at 
> org.apache.calcite.sql.SqlUnnestOperator.inferReturnType(SqlUnnestOperator.java:80)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:437) 
> ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.UnnestNamespace.validateImpl(UnnestNamespace.java:67)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2960)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3012)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2969)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> 

[jira] [Commented] (DRILL-6649) Query with unnest of column from nested subquery fails

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570369#comment-16570369
 ] 

ASF GitHub Bot commented on DRILL-6649:
---

sohami commented on issue #1421: DRILL-6649: Query with unnest of column from 
nested star subquery fails
URL: https://github.com/apache/drill/pull/1421#issuecomment-410750881
 
 
   I am assuming the actual fix is already in Calcite 1.16.0-drill-r7, which is 
checked in.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Query with unnest of column from nested subquery fails
> --
>
> Key: DRILL-6649
> URL: https://issues.apache.org/jira/browse/DRILL-6649
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> This query:
> {code:sql}
> select t2.o from (select * from cp.`lateraljoin/nested-customer.json` limit 
> 1) t, unnest(t.orders) t2(o)
> {code}
> fails with error:
> {noformat}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError
> [Error Id: 6868e327-ab2c-44a2-ab0c-cf30f4a64349 on user515050-pc:31010]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:761)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221)
>  [classes/:na]
>   at 
> org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83)
>  [classes/:na]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:293) 
> [classes/:na]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_181]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_181]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: null
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:294) 
> [classes/:na]
>   ... 3 common frames omitted
> Caused by: java.lang.AssertionError: null
>   at 
> org.apache.calcite.sql.SqlUnnestOperator.inferReturnType(SqlUnnestOperator.java:80)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:437) 
> ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.UnnestNamespace.validateImpl(UnnestNamespace.java:67)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2960)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3012)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2969)
>  ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3219)
>  

[jira] [Updated] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Dave Challis (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Challis updated DRILL-6670:

Attachment: example.parquet

> Error in parquet record reader - previously readable file fails to be read in 
> 1.14
> --
>
> Key: DRILL-6670
> URL: https://issues.apache.org/jira/browse/DRILL-6670
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Dave Challis
>Priority: Major
> Attachments: example.parquet
>
>
> Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
> and 1.13, but fails to be read with 1.14.
> Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
> error message from the Drill web query UI:
> {code}
> Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Error in parquet record reader. Message: Failure in setting up reader Parquet 
> Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional 
> binary name (UTF8); optional binary creation_parameters (UTF8); optional 
> int64 creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional 
> int32 schema_version; } , metadata: {pandas={"index_columns": [], 
> "column_indexes": [], "columns": [{"name": "name", "field_name": "name", 
> "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_parameters", "field_name": "creation_parameters", "pandas_type": 
> "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_date", "field_name": "creation_date", "pandas_type": "datetime", 
> "numpy_type": "datetime64[ns]", "metadata": null}, {"name": "data_version", 
> "field_name": "data_version", "pandas_type": "int32", "numpy_type": "int32", 
> "metadata": null}, {"name": "schema_version", "field_name": "schema_version", 
> "pandas_type": "int32", "numpy_type": "int32", "metadata": null}], 
> "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 27142 
> [ColumnMetaData{SNAPPY [name] optional binary name (UTF8) [PLAIN, RLE], 4}, 
> ColumnMetaData{SNAPPY [creation_parameters] optional binary 
> creation_parameters (UTF8) [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
> [creation_date] optional int64 creation_date (TIMESTAMP_MICROS) [PLAIN, RLE], 
> 46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version 
> [PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
> schema_version [PLAIN, RLE], 46593}]}]} Fragment 0:0 [Error Id: 
> bdb2e4d5-5982-4cc6-b95e-244782f827d2 on f9d0456cddd2:31010] 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Dave Challis (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570334#comment-16570334
 ] 

Dave Challis commented on DRILL-6670:
-

Thanks for the reminder, I've attached an example file now.

> Error in parquet record reader - previously readable file fails to be read in 
> 1.14
> --
>
> Key: DRILL-6670
> URL: https://issues.apache.org/jira/browse/DRILL-6670
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Dave Challis
>Priority: Major
> Attachments: example.parquet
>
>
> Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
> and 1.13, but fails to be read with 1.14.
> Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
> error message from the Drill web query UI:
> {code}
> Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Error in parquet record reader. Message: Failure in setting up reader Parquet 
> Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional 
> binary name (UTF8); optional binary creation_parameters (UTF8); optional 
> int64 creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional 
> int32 schema_version; } , metadata: {pandas={"index_columns": [], 
> "column_indexes": [], "columns": [{"name": "name", "field_name": "name", 
> "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_parameters", "field_name": "creation_parameters", "pandas_type": 
> "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_date", "field_name": "creation_date", "pandas_type": "datetime", 
> "numpy_type": "datetime64[ns]", "metadata": null}, {"name": "data_version", 
> "field_name": "data_version", "pandas_type": "int32", "numpy_type": "int32", 
> "metadata": null}, {"name": "schema_version", "field_name": "schema_version", 
> "pandas_type": "int32", "numpy_type": "int32", "metadata": null}], 
> "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 27142 
> [ColumnMetaData{SNAPPY [name] optional binary name (UTF8) [PLAIN, RLE], 4}, 
> ColumnMetaData{SNAPPY [creation_parameters] optional binary 
> creation_parameters (UTF8) [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
> [creation_date] optional int64 creation_date (TIMESTAMP_MICROS) [PLAIN, RLE], 
> 46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version 
> [PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
> schema_version [PLAIN, RLE], 46593}]}]} Fragment 0:0 [Error Id: 
> bdb2e4d5-5982-4cc6-b95e-244782f827d2 on f9d0456cddd2:31010] 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Oleksandr Kalinin (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570333#comment-16570333
 ] 

Oleksandr Kalinin commented on DRILL-6670:
--

Checking this against the changes in DRILL-5797, having a sample file would 
indeed be helpful.

> Error in parquet record reader - previously readable file fails to be read in 
> 1.14
> --
>
> Key: DRILL-6670
> URL: https://issues.apache.org/jira/browse/DRILL-6670
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Dave Challis
>Priority: Major
>
> Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
> and 1.13, but fails to be read with 1.14.
> Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
> error message from the Drill web query UI:
> {code}
> Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Error in parquet record reader. Message: Failure in setting up reader Parquet 
> Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional 
> binary name (UTF8); optional binary creation_parameters (UTF8); optional 
> int64 creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional 
> int32 schema_version; } , metadata: {pandas={"index_columns": [], 
> "column_indexes": [], "columns": [{"name": "name", "field_name": "name", 
> "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_parameters", "field_name": "creation_parameters", "pandas_type": 
> "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_date", "field_name": "creation_date", "pandas_type": "datetime", 
> "numpy_type": "datetime64[ns]", "metadata": null}, {"name": "data_version", 
> "field_name": "data_version", "pandas_type": "int32", "numpy_type": "int32", 
> "metadata": null}, {"name": "schema_version", "field_name": "schema_version", 
> "pandas_type": "int32", "numpy_type": "int32", "metadata": null}], 
> "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 27142 
> [ColumnMetaData{SNAPPY [name] optional binary name (UTF8) [PLAIN, RLE], 4}, 
> ColumnMetaData{SNAPPY [creation_parameters] optional binary 
> creation_parameters (UTF8) [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
> [creation_date] optional int64 creation_date (TIMESTAMP_MICROS) [PLAIN, RLE], 
> 46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version 
> [PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
> schema_version [PLAIN, RLE], 46593}]}]} Fragment 0:0 [Error Id: 
> bdb2e4d5-5982-4cc6-b95e-244782f827d2 on f9d0456cddd2:31010] 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Arina Ielchiieva (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570301#comment-16570301
 ] 

Arina Ielchiieva commented on DRILL-6670:
-

Please attach an example file as well.

> Error in parquet record reader - previously readable file fails to be read in 
> 1.14
> --
>
> Key: DRILL-6670
> URL: https://issues.apache.org/jira/browse/DRILL-6670
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Dave Challis
>Priority: Major
>
> Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
> and 1.13, but fails to be read with 1.14.
> Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
> error message from the Drill web query UI:
> {code}
> Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Error in parquet record reader. Message: Failure in setting up reader Parquet 
> Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional 
> binary name (UTF8); optional binary creation_parameters (UTF8); optional 
> int64 creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional 
> int32 schema_version; } , metadata: {pandas={"index_columns": [], 
> "column_indexes": [], "columns": [{"name": "name", "field_name": "name", 
> "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_parameters", "field_name": "creation_parameters", "pandas_type": 
> "unicode", "numpy_type": "object", "metadata": null}, {"name": 
> "creation_date", "field_name": "creation_date", "pandas_type": "datetime", 
> "numpy_type": "datetime64[ns]", "metadata": null}, {"name": "data_version", 
> "field_name": "data_version", "pandas_type": "int32", "numpy_type": "int32", 
> "metadata": null}, {"name": "schema_version", "field_name": "schema_version", 
> "pandas_type": "int32", "numpy_type": "int32", "metadata": null}], 
> "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 27142 
> [ColumnMetaData{SNAPPY [name] optional binary name (UTF8) [PLAIN, RLE], 4}, 
> ColumnMetaData{SNAPPY [creation_parameters] optional binary 
> creation_parameters (UTF8) [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
> [creation_date] optional int64 creation_date (TIMESTAMP_MICROS) [PLAIN, RLE], 
> 46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version 
> [PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
> schema_version [PLAIN, RLE], 46593}]}]} Fragment 0:0 [Error Id: 
> bdb2e4d5-5982-4cc6-b95e-244782f827d2 on f9d0456cddd2:31010] 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Dave Challis (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570201#comment-16570201
 ] 

Dave Challis edited comment on DRILL-6670 at 8/6/18 1:26 PM:
-

From further digging in the logs, it looks like this is an issue related to 
Parquet-to-Drill type conversion; this is the relevant stack trace from the 
logs:

{code}
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error in 
parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
  optional binary name (UTF8);
  optional binary creation_parameters (UTF8);
  optional int64 creation_date (TIMESTAMP_MICROS);
  optional int32 data_version;
  optional int32 schema_version;
}
, metadata: {pandas={"index_columns": [], "column_indexes": [], "columns": 
[{"name": "name", "field_name": "name", "pandas_type": "unicode", "numpy_type": 
"object", "metadata": null}, {"name": "creation_parameters", "field_name": 
"creation_parameters", "pandas_type": "unicode", "numpy_type": "object", 
"metadata": null}, {"name": "creation_date", "field_name": "creation_date", 
"pandas_type": "datetime", "numpy_type": "datetime64[ns]", "metadata": null}, 
{"name": "data_version", "field_name": "data_version", "pandas_type": "int32", 
"numpy_type": "int32", "metadata": null}, {"name": "schema_version", 
"field_name": "schema_version", "pandas_type": "int32", "numpy_type": "int32", 
"metadata": null}], "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 
8394 [ColumnMetaData{SNAPPY [name] optional binary name (UTF8)  [PLAIN, RLE], 
4}, ColumnMetaData{SNAPPY [creation_parameters] optional binary 
creation_parameters (UTF8)  [PLAIN, RLE], 162}, ColumnMetaData{SNAPPY 
[creation_date] optional int64 creation_date (TIMESTAMP_MICROS)  [PLAIN, RLE], 
14197}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version  
[PLAIN, RLE], 14341}, ColumnMetaData{SNAPPY [schema_version] optional int32 
schema_version  [PLAIN, RLE], 14456}]}]}
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:271)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:255)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas(ScanBatch.java:251)
 [drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:169) 
[drill-java-exec-1.14.0.jar:1.14.0]
... 40 common frames omitted
Caused by: java.lang.UnsupportedOperationException: unsupported type: INT64 
TIMESTAMP_MICROS
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter.getMinorType(ParquetToDrillTypeConverter.java:70)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter.toMajorType(ParquetToDrillTypeConverter.java:128)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetColumnMetadata.resolveDrillType(ParquetColumnMetadata.java:61)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetSchema.loadParquetSchema(ParquetSchema.java:132)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetSchema.buildSchema(ParquetSchema.java:115)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:250)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
... 42 common frames omitted
{code}

I couldn't see anything related to this in the release notes.
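
For illustration only, here is a minimal, self-contained sketch (not Drill's 
actual converter code) of the failure mode: a type-mapping switch that knows 
TIMESTAMP_MILLIS but not TIMESTAMP_MICROS falls through to exactly this kind 
of UnsupportedOperationException. The enum and method names are hypothetical.

{code}
public class TypeMappingSketch {
  // Hypothetical subset of Parquet logical types; not Drill's real enum.
  enum LogicalType { UTF8, TIMESTAMP_MILLIS, TIMESTAMP_MICROS }

  static String toDrillType(String primitive, LogicalType logical) {
    switch (logical) {
      case UTF8:             return "VARCHAR";
      case TIMESTAMP_MILLIS: return "TIMESTAMP";
      default:
        // No case for TIMESTAMP_MICROS, so INT64 TIMESTAMP_MICROS lands here.
        throw new UnsupportedOperationException(
            "unsupported type: " + primitive + " " + logical);
    }
  }

  public static void main(String[] args) {
    try {
      toDrillType("INT64", LogicalType.TIMESTAMP_MICROS);
    } catch (UnsupportedOperationException e) {
      // Prints: unsupported type: INT64 TIMESTAMP_MICROS
      System.out.println(e.getMessage());
    }
  }
}
{code}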


was (Author: suicas):
From further digging in the logs, it looks like this is an issue related to 
Parquet-to-Drill type conversion; this is the relevant stack trace from the 
logs:

{code}
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error in 
parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
  optional binary name (UTF8);
  optional binary creation_parameters (UTF8);
  optional int64 creation_date (TIMESTAMP_MICROS);
  optional int32 data_version;
  optional int32 schema_version;
}
, metadata: {pandas={"index_columns": [], "column_indexes": [], "columns": 
[{"name": "name", "field_name": "name", "pandas_type": "unicode", "numpy_type": 
"object", "metadata": null}, {"name": "creation_parameters", "field_name": 
"creation_parameters", "pandas_type": "unicode", "numpy_type": "object", 
"metadata": null}, {"name": "creation_date", "field_name": "creation_date", 
"pandas_type": "datetime", "numpy_type": "datetime64[ns]", "metadata": null}, 
{"name": 

[jira] [Commented] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Dave Challis (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570201#comment-16570201
 ] 

Dave Challis commented on DRILL-6670:
-

From further digging in the logs, it looks like this is an issue related to 
Parquet-to-Drill type conversion; this is the relevant stack trace from the 
logs:

{code}
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error in 
parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
  optional binary name (UTF8);
  optional binary creation_parameters (UTF8);
  optional int64 creation_date (TIMESTAMP_MICROS);
  optional int32 data_version;
  optional int32 schema_version;
}
, metadata: {pandas={"index_columns": [], "column_indexes": [], "columns": 
[{"name": "name", "field_name": "name", "pandas_type": "unicode", "numpy_type": 
"object", "metadata": null}, {"name": "creation_parameters", "field_name": 
"creation_parameters", "pandas_type": "unicode", "numpy_type": "object", 
"metadata": null}, {"name": "creation_date", "field_name": "creation_date", 
"pandas_type": "datetime", "numpy_type": "datetime64[ns]", "metadata": null}, 
{"name": "data_version", "field_name": "data_version", "pandas_type": "int32", 
"numpy_type": "int32", "metadata": null}, {"name": "schema_version", 
"field_name": "schema_version", "pandas_type": "int32", "numpy_type": "int32", 
"metadata": null}], "pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 
8394 [ColumnMetaData{SNAPPY [name] optional binary name (UTF8)  [PLAIN, RLE], 
4}, ColumnMetaData{SNAPPY [creation_parameters] optional binary 
creation_parameters (UTF8)  [PLAIN, RLE], 162}, ColumnMetaData{SNAPPY 
[creation_date] optional int64 creation_date (TIMESTAMP_MICROS)  [PLAIN, RLE], 
14197}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version  
[PLAIN, RLE], 14341}, ColumnMetaData{SNAPPY [schema_version] optional int32 
schema_version  [PLAIN, RLE], 14456}]}]}
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:271)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:255)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas(ScanBatch.java:251)
 [drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:169) 
[drill-java-exec-1.14.0.jar:1.14.0]
... 40 common frames omitted
Caused by: java.lang.UnsupportedOperationException: unsupported type: INT64 
TIMESTAMP_MICROS
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter.getMinorType(ParquetToDrillTypeConverter.java:70)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter.toMajorType(ParquetToDrillTypeConverter.java:128)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetColumnMetadata.resolveDrillType(ParquetColumnMetadata.java:61)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetSchema.loadParquetSchema(ParquetSchema.java:132)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetSchema.buildSchema(ParquetSchema.java:115)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:250)
 ~[drill-java-exec-1.14.0.jar:1.14.0]
... 42 common frames omitted
{code}

> Error in parquet record reader - previously readable file fails to be read in 
> 1.14
> --
>
> Key: DRILL-6670
> URL: https://issues.apache.org/jira/browse/DRILL-6670
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Dave Challis
>Priority: Major
>
> Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
> and 1.13, but fails to be read with 1.14.
> Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
> error message from the Drill web query UI:
> {code}
> Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Error in parquet record reader. Message: Failure in setting up reader Parquet 
> Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional 
> binary name (UTF8); optional binary creation_parameters (UTF8); optional 
> int64 creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional 
> int32 

[jira] [Updated] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Dave Challis (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Challis updated DRILL-6670:

Description: 
Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
and 1.13, but fails to be read with 1.14.

Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
error message from the Drill web query UI:

{code}
Query Failed: An Error Occurred

org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
Error in parquet record reader. Message: Failure in setting up reader Parquet 
Metadata: ParquetMetaData{FileMetaData{schema: message schema { optional binary 
name (UTF8); optional binary creation_parameters (UTF8); optional int64 
creation_date (TIMESTAMP_MICROS); optional int32 data_version; optional int32 
schema_version; } , metadata: {pandas={"index_columns": [], "column_indexes": 
[], "columns": [{"name": "name", "field_name": "name", "pandas_type": 
"unicode", "numpy_type": "object", "metadata": null}, {"name": 
"creation_parameters", "field_name": "creation_parameters", "pandas_type": 
"unicode", "numpy_type": "object", "metadata": null}, {"name": "creation_date", 
"field_name": "creation_date", "pandas_type": "datetime", "numpy_type": 
"datetime64[ns]", "metadata": null}, {"name": "data_version", "field_name": 
"data_version", "pandas_type": "int32", "numpy_type": "int32", "metadata": 
null}, {"name": "schema_version", "field_name": "schema_version", 
"pandas_type": "int32", "numpy_type": "int32", "metadata": null}], 
"pandas_version": "0.22.0"}}}, blocks: [BlockMetaData{1, 27142 
[ColumnMetaData{SNAPPY [name] optional binary name (UTF8) [PLAIN, RLE], 4}, 
ColumnMetaData{SNAPPY [creation_parameters] optional binary creation_parameters 
(UTF8) [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY [creation_date] optional int64 
creation_date (TIMESTAMP_MICROS) [PLAIN, RLE], 46334}, ColumnMetaData{SNAPPY 
[data_version] optional int32 data_version [PLAIN, RLE], 46478}, 
ColumnMetaData{SNAPPY [schema_version] optional int32 schema_version [PLAIN, 
RLE], 46593}]}]} Fragment 0:0 [Error Id: bdb2e4d5-5982-4cc6-b95e-244782f827d2 
on f9d0456cddd2:31010] 
{code}

  was:
Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
and 1.13, but fails to be read with 1.14.

Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
error message from the Drill web query UI:

{code}
{"code":500,"message":"SQL error while querying Drill DB Failed to create 
prepared statement: INTERNAL_ERROR ERROR: Error in parquet record 
reader.\nMessage: Failure in setting up reader\nParquet Metadata: 
ParquetMetaData{FileMetaData{schema: message schema {\n  optional binary name 
(UTF8);\n  optional binary creation_parameters (UTF8);\n  optional int64 
creation_date (TIMESTAMP_MICROS);\n  optional int32 data_version;\n  optional 
int32 schema_version;\n}\n, metadata: {pandas={\"index_columns\": [], 
\"column_indexes\": [], \"columns\": [{\"name\": \"name\", \"field_name\": 
\"name\", \"pandas_type\": \"unicode\", \"numpy_type\": \"object\", 
\"metadata\": null}, {\"name\": \"creation_parameters\", \"field_name\": 
\"creation_parameters\", \"pandas_type\": \"unicode\", \"numpy_type\": 
\"object\", \"metadata\": null}, {\"name\": \"creation_date\", \"field_name\": 
\"creation_date\", \"pandas_type\": \"datetime\", \"numpy_type\": 
\"datetime64[ns]\", \"metadata\": null}, {\"name\": \"data_version\", 
\"field_name\": \"data_version\", \"pandas_type\": \"int32\", \"numpy_type\": 
\"int32\", \"metadata\": null}, {\"name\": \"schema_version\", \"field_name\": 
\"schema_version\", \"pandas_type\": \"int32\", \"numpy_type\": \"int32\", 
\"metadata\": null}], \"pandas_version\": \"0.22.0\"}}}, blocks: 
[BlockMetaData{1, 27142 [ColumnMetaData{SNAPPY [name] optional binary name 
(UTF8)  [PLAIN, RLE], 4}, ColumnMetaData{SNAPPY [creation_parameters] optional 
binary creation_parameters (UTF8)  [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
[creation_date] optional int64 creation_date (TIMESTAMP_MICROS)  [PLAIN, RLE], 
46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version  
[PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
schema_version  [PLAIN, RLE], 46593}]}]}\n\nFragment 0:0\n\n[Error Id: 
7c76ae97-03e3-4fab-9125-ec19fc572bf5 on f9d0456cddd2:31010]"}
{code}


> Error in parquet record reader - previously readable file fails to be read in 
> 1.14
> --
>
> Key: DRILL-6670
> URL: https://issues.apache.org/jira/browse/DRILL-6670
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Dave Challis
>Priority: Major
>
> Parquet file which was generated by PyArrow was readable in Apache 

[jira] [Created] (DRILL-6670) Error in parquet record reader - previously readable file fails to be read in 1.14

2018-08-06 Thread Dave Challis (JIRA)
Dave Challis created DRILL-6670:
---

 Summary: Error in parquet record reader - previously readable file 
fails to be read in 1.14
 Key: DRILL-6670
 URL: https://issues.apache.org/jira/browse/DRILL-6670
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.14.0
Reporter: Dave Challis


Parquet file which was generated by PyArrow was readable in Apache Drill 1.12 
and 1.13, but fails to be read with 1.14.

Running the query "SELECT * FROM dfs.`foo.parquet`" results in the following 
error message from the Drill web query UI:

{code}
{"code":500,"message":"SQL error while querying Drill DB Failed to create 
prepared statement: INTERNAL_ERROR ERROR: Error in parquet record 
reader.\nMessage: Failure in setting up reader\nParquet Metadata: 
ParquetMetaData{FileMetaData{schema: message schema {\n  optional binary name 
(UTF8);\n  optional binary creation_parameters (UTF8);\n  optional int64 
creation_date (TIMESTAMP_MICROS);\n  optional int32 data_version;\n  optional 
int32 schema_version;\n}\n, metadata: {pandas={\"index_columns\": [], 
\"column_indexes\": [], \"columns\": [{\"name\": \"name\", \"field_name\": 
\"name\", \"pandas_type\": \"unicode\", \"numpy_type\": \"object\", 
\"metadata\": null}, {\"name\": \"creation_parameters\", \"field_name\": 
\"creation_parameters\", \"pandas_type\": \"unicode\", \"numpy_type\": 
\"object\", \"metadata\": null}, {\"name\": \"creation_date\", \"field_name\": 
\"creation_date\", \"pandas_type\": \"datetime\", \"numpy_type\": 
\"datetime64[ns]\", \"metadata\": null}, {\"name\": \"data_version\", 
\"field_name\": \"data_version\", \"pandas_type\": \"int32\", \"numpy_type\": 
\"int32\", \"metadata\": null}, {\"name\": \"schema_version\", \"field_name\": 
\"schema_version\", \"pandas_type\": \"int32\", \"numpy_type\": \"int32\", 
\"metadata\": null}], \"pandas_version\": \"0.22.0\"}}}, blocks: 
[BlockMetaData{1, 27142 [ColumnMetaData{SNAPPY [name] optional binary name 
(UTF8)  [PLAIN, RLE], 4}, ColumnMetaData{SNAPPY [creation_parameters] optional 
binary creation_parameters (UTF8)  [PLAIN, RLE], 252}, ColumnMetaData{SNAPPY 
[creation_date] optional int64 creation_date (TIMESTAMP_MICROS)  [PLAIN, RLE], 
46334}, ColumnMetaData{SNAPPY [data_version] optional int32 data_version  
[PLAIN, RLE], 46478}, ColumnMetaData{SNAPPY [schema_version] optional int32 
schema_version  [PLAIN, RLE], 46593}]}]}\n\nFragment 0:0\n\n[Error Id: 
7c76ae97-03e3-4fab-9125-ec19fc572bf5 on f9d0456cddd2:31010]"}
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6662) Access AWS access key ID and secret access key using Credential Provider API for S3 storage plugin

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570157#comment-16570157
 ] 

ASF GitHub Bot commented on DRILL-6662:
---

KazydubB commented on a change in pull request #1419: DRILL-6662: Access AWS 
access key ID and secret access key using Cred…
URL: https://github.com/apache/drill/pull/1419#discussion_r207876730
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java
 ##
 @@ -104,6 +108,31 @@ public FileSystemPlugin(FileSystemConfig config, 
DrillbitContext context, String
 }
   }
 
+  private boolean isS3() {
+java.net.URI uri = FileSystem.getDefaultUri(fsConf);
+return uri.getScheme().equals("s3a");
+  }
+
+  /**
+   * Retrieve secret and access keys from configured (with
+   * {@link 
org.apache.hadoop.security.alias.CredentialProviderFactory#CREDENTIAL_PROVIDER_PATH}
 property)
+   * credential providers and set them into {@link #fsConf}. If the provider path
+   * is not configured or a credential is absent in the providers, it will
+   * conditionally fall back to the configuration setting. The fallback will occur unless
+   * {@link 
org.apache.hadoop.security.alias.CredentialProvider#CLEAR_TEXT_FALLBACK} is set 
to false.
+   * @throws IOException thrown if a credential cannot be retrieved from 
provider
+   */
+  private void handleS3Credentials() throws IOException {
+final String[] credentialKeys = {"fs.s3a.secret.key", "fs.s3a.access.key"};
+for (String key : credentialKeys) {
+  char[] credentialChars = fsConf.getPassword(key);
+  if (credentialChars != null) {
+fsConf.set(key, String.valueOf(credentialChars));
 
 Review comment:
   Hm, what do you mean by an "inline declaration"? In the S3 storage plugin, 
and with the hadoop credential CLI's -value option (though this option is only 
for testing, because it is insecure), the value should be wrapped in quotes if 
it is meant to contain spaces. If set in core-site.xml, the value is wrapped 
in tags, so any space is considered intentional.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Access AWS access key ID and secret access key using Credential Provider API 
> for S3 storage plugin
> --
>
> Key: DRILL-6662
> URL: https://issues.apache.org/jira/browse/DRILL-6662
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>
> Hadoop provides the [CredentialProvider 
> API|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html],
>  which allows passwords and other sensitive secrets to be stored in an 
> external provider rather than in configuration files in plaintext.
> Currently the S3 storage plugin accesses the passwords, namely 
> 'fs.s3a.access.key' and 'fs.s3a.secret.key', stored in clear text in the 
> Configuration, using its get() method. To give users the ability to remove 
> clear-text passwords for S3 from configuration files, the 
> Configuration.getPassword() method should be used, given that they configure 
> the 'hadoop.security.credential.provider.path' property, which points to a 
> file containing encrypted passwords, instead of configuring the two 
> aforementioned properties.
> By using this approach, credential providers will be checked first, and if 
> the secret is not provided or no providers are configured, there will be a 
> fallback to secrets configured in clear text (unless 
> 'hadoop.security.credential.clear-text-fallback' is set to "false"), thus 
> making the new change backwards-compatible.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570123#comment-16570123
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support 
JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207868475
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/BloomFilter.java
 ##
 @@ -0,0 +1,190 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.filter;
+
+import com.google.common.base.Preconditions;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.memory.BufferAllocator;
+
+import java.util.Arrays;
+
+
+/**
+ * According to Putze et al.'s "Cache-, Hash- and Space-Efficient Bloom Filters"
+ * (see http://algo2.iti.kit.edu/singler/publications/cacheefficientbloomfilters-wea2007.pdf
+ * for details), the main idea is to construct tiny bucketed bloom filters that
+ * are friendly to the CPU cache and SIMD opcodes.
+ */
+
+public class BloomFilter {
+  // Bytes in a bucket.
+  private static final int BYTES_PER_BUCKET = 32;
+  // Minimum bloom filter data size.
+  private static final int MINIMUM_BLOOM_SIZE_IN_BYTES = 256;
+
+  private static final int DEFAULT_MAXIMUM_BLOOM_FILTER_SIZE_IN_BYTES = 16 * 
1024 * 1024;
+
+  private DrillBuf byteBuf;
+
+  private int numBytes;
+
+  private int mask[] = new int[8];
+
+  private byte[] tempBucket = new byte[32];
+
+
+  public BloomFilter(int numBytes, BufferAllocator bufferAllocator) {
+int size = BloomFilter.adjustByteSize(numBytes);
+this.byteBuf = bufferAllocator.buffer(size);
+this.numBytes = byteBuf.capacity();
+this.byteBuf.writerIndex(numBytes);
+  }
+
+  public BloomFilter(int ndv, double fpp, BufferAllocator bufferAllocator) {
+int numBytes = BloomFilter.optimalNumOfBytes(ndv, fpp);
+int size = BloomFilter.adjustByteSize(numBytes);
+this.byteBuf = bufferAllocator.buffer(size);
+this.numBytes = byteBuf.capacity();
+this.byteBuf.writerIndex(numBytes);
+  }
+
+
+  public static int adjustByteSize(int numBytes) {
+if (numBytes < MINIMUM_BLOOM_SIZE_IN_BYTES) {
+  numBytes = MINIMUM_BLOOM_SIZE_IN_BYTES;
+}
+
+if (numBytes > DEFAULT_MAXIMUM_BLOOM_FILTER_SIZE_IN_BYTES) {
+  numBytes = DEFAULT_MAXIMUM_BLOOM_FILTER_SIZE_IN_BYTES;
+}
+
+// 32 bytes alignment, one bucket.
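+// e.g. 100 -> (100 + 0x1F) & ~0x1F = 128: rounded up to a multiple of 32.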
+numBytes = (numBytes + 0x1F) & (~0x1F);
+return numBytes;
+  }
+
+  private void setMask(int key) {
+// 8 odd numbers act as salt values in the computation of the mask.
+final int SALT[] = {0x47b6137b, 0x44974d91, 0x8824ad5b, 0xa2b7289d, 
0x705495c7, 0x2df1424b, 0x9efc4947, 0x5c6bfb31};
+
+Arrays.fill(mask, 0);
+
+for (int i = 0; i < 8; ++i) {
+  mask[i] = key * SALT[i];
+}
+
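+// The top 5 bits of each product pick a bit position: Java masks shift
+// counts to their low 5 bits, so the arithmetic shift's sign is harmless
+// and each word of the mask ends up with exactly one bit set.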
+for (int i = 0; i < 8; ++i) {
+  mask[i] = mask[i] >> 27;
+}
+
+for (int i = 0; i < 8; ++i) {
+  mask[i] = 0x1 << mask[i];
+}
+  }
+
+  /**
+   * Add an element's hash value to this bloom filter.
+   * @param hash hash result of element.
+   */
+  public void insert(long hash) {
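+// The upper 32 bits of the hash select the bucket; the lower 32 bits
+// drive the per-word bit masks computed in setMask().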
+int bucketIndex = (int) (hash >> 32) & (numBytes / BYTES_PER_BUCKET - 1);
+int key = (int) hash;
+setMask(key);
+int initialStartIndex = bucketIndex * BYTES_PER_BUCKET;
+byteBuf.getBytes(initialStartIndex, tempBucket);
+for (int i = 0; i < 8; i++) {
+  // each iteration sets 32 bits (one word) of the bucket
+  int bitsetIndex = i * 4;
+  tempBucket[bitsetIndex] = (byte) (tempBucket[bitsetIndex] | (byte) 
(mask[i] >>> 24));
+  tempBucket[bitsetIndex + 1] = (byte) (tempBucket[(bitsetIndex) + 1] | 
(byte) (mask[i] >>> 16));
+  tempBucket[bitsetIndex + 2] = (byte) (tempBucket[(bitsetIndex) + 2] | 
(byte) (mask[i] >>> 8));
+  tempBucket[bitsetIndex + 3] = (byte) (tempBucket[(bitsetIndex) + 3] | 
(byte) (mask[i]));
+}
+byteBuf.setBytes(initialStartIndex, tempBucket);
+  }
+
+  /**
+   * Determine whether an element is set or not.
+   *
+   * @param hash the hash value of element.
+   * @return false if the element is not set, true if 

[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570117#comment-16570117
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support 
JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207867048
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java
 ##
 @@ -190,11 +213,21 @@ public IterOutcome next() {
 if (isNewSchema) {
   // Even when recordCount = 0, we should return OK_NEW_SCHEMA 
if the current reader presents a new schema.
   // This could happen when data sources have a non-trivial schema 
with 0 rows.
-  container.buildSchema(SelectionVectorMode.NONE);
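+// TWO_BYTE attaches a two-byte-per-record selection vector, so rows
+// dropped by the runtime filter can be skipped without copying the batch.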
+  if (firstRuntimeFiltered) {
+container.buildSchema(SelectionVectorMode.TWO_BYTE);
+runtimeFiltered = true;
+  } else {
+container.buildSchema(SelectionVectorMode.NONE);
+  }
 
 Review comment:
   I will take this suggestion into account.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support JPPD (Join Predicate Push Down)
> ---
>
> Key: DRILL-6385
> URL: https://issues.apache.org/jira/browse/DRILL-6385
> Project: Apache Drill
>  Issue Type: New Feature
>  Components:  Server, Execution - Flow
>Affects Versions: 1.14.0
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Major
>
> This feature is to support JPPD (Join Predicate Push Down). It will improve 
> HashJoin and Broadcast HashJoin performance by reducing the number of rows 
> sent across the network and the memory consumed. This feature is already 
> supported by Impala, which calls it RuntimeFilter 
> ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]).
>  The first PR will try to push down a bloom filter of the HashJoin node to 
> Parquet’s scan node. The proposed basic procedure is described as follows:
>  # The HashJoin build side accumulates the rows of the equality join 
> condition to construct a bloom filter. It then sends the bloom filter to the 
> foreman node.
>  # The foreman node passively accepts the bloom filters from all the 
> fragments that have the HashJoin operator. It then aggregates them to form a 
> global bloom filter (a sketch of this step follows the quoted description).
>  # The foreman node broadcasts the global bloom filter to all the probe side 
> scan nodes, which may already have sent out partial data to the hash join 
> nodes (currently the hash join node prefetches one batch from both sides).
>  # The scan node accepts the global bloom filter from the foreman node. It 
> will filter the remaining rows against the bloom filter.
>  
> To implement the above execution flow, the main new notions are described 
> below:
>       1. RuntimeFilter
> It’s a filter container which may contain a BloomFilter or a MinMaxFilter.
>       2. RuntimeFilterReporter
> It wraps the logic to send the hash join’s bloom filter to the foreman. The 
> serialized bloom filter will be sent out through the data tunnel. This 
> object will be instantiated by the FragmentExecutor and passed to the 
> FragmentContext, so the HashJoin operator can obtain it through the 
> FragmentContext.
>      3. RuntimeFilterRequestHandler
> It is responsible for accepting a SendRuntimeFilterRequest RPC and stripping 
> the actual BloomFilter from the network. It then passes this filter to the 
> WorkerBee’s new interface registerRuntimeFilter.
> Another RPC type is BroadcastRuntimeFilterRequest. It registers the accepted 
> global bloom filter with the WorkerBee via the registerRuntimeFilter method 
> and then propagates it to the FragmentContext, through which the probe side 
> scan node can fetch the aggregated bloom filter.
>       4. RuntimeFilterManager
> The foreman will instantiate a RuntimeFilterManager. It will indirectly get 
> every RuntimeFilter via the WorkerBee. Once all the BloomFilters have been 
> accepted and aggregated, it will broadcast the aggregated bloom filter to 
> all the probe side scan nodes through the data tunnel by a 
> BroadcastRuntimeFilterRequest RPC.
>      5. RuntimeFilterEnableOption 
>  A global option will be added to decide whether to enable this new feature.
>  
> Suggestions and advice are welcome. The related PR will be presented as soon 
> as possible.
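
A minimal sketch of the aggregation in step 2 above, assuming all partial 
filters share the same size and hash configuration (the class and method 
names are illustrative, not the PR's API):

{code}
import java.util.List;

public class BloomFilterUnion {
  // The global filter is the bitwise OR of same-sized per-fragment filters,
  // so it contains every element inserted on any fragment.
  static byte[] aggregate(List<byte[]> partials) {
    byte[] global = partials.get(0).clone();
    for (int i = 1; i < partials.size(); i++) {
      byte[] part = partials.get(i);
      for (int j = 0; j < global.length; j++) {
        global[j] |= part[j];
      }
    }
    return global;
  }
}
{code}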



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570113#comment-16570113
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support 
JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207866392
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java
 ##
 @@ -408,6 +424,9 @@ private void runPhysicalPlan(final PhysicalPlan plan) 
throws ExecutionSetupExcep
 fragmentsRunner.setFragmentsInfo(work.getFragments(), 
work.getRootFragment(), work.getRootOperator());
 
 startQueryProcessing();
+if (enableRuntimeFilter) {
+  runtimeFilterManager.waitForComplete();
 
 Review comment:
   To make sure the `RuntimeFilterManager`'s outgoing `ByteBuf`s are safely 
released whether the network send succeeds or fails.
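
   For context, a minimal sketch of the wait-for-outstanding-sends idea 
(illustrative names, not Drill's actual SendingAccountor API):

{code}
// Counts in-flight sends; waitForComplete() blocks until every send's
// completion callback, success or failure, has decremented the counter,
// after which the buffers can be released safely.
class InFlightSends {
  private int outstanding = 0;

  synchronized void sendStarted() { outstanding++; }

  synchronized void sendFinished() {
    if (--outstanding == 0) {
      notifyAll();
    }
  }

  synchronized void waitForComplete() throws InterruptedException {
    while (outstanding > 0) {
      wait();
    }
  }
}
{code}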


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support JPPD (Join Predicate Push Down)
> ---
>
> Key: DRILL-6385
> URL: https://issues.apache.org/jira/browse/DRILL-6385
> Project: Apache Drill
>  Issue Type: New Feature
>  Components:  Server, Execution - Flow
>Affects Versions: 1.14.0
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Major
>
> This feature is to support JPPD (Join Predicate Push Down). It will improve 
> HashJoin and Broadcast HashJoin performance by reducing the number of rows 
> sent across the network and the memory consumed. This feature is already 
> supported by Impala, which calls it RuntimeFilter 
> ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]).
>  The first PR will try to push down a bloom filter of the HashJoin node to 
> Parquet’s scan node. The proposed basic procedure is described as follows:
>  # The HashJoin build side accumulates the rows of the equality join 
> condition to construct a bloom filter. It then sends the bloom filter to the 
> foreman node.
>  # The foreman node passively accepts the bloom filters from all the 
> fragments that have the HashJoin operator. It then aggregates them to form a 
> global bloom filter.
>  # The foreman node broadcasts the global bloom filter to all the probe side 
> scan nodes, which may already have sent out partial data to the hash join 
> nodes (currently the hash join node prefetches one batch from both sides).
>  # The scan node accepts the global bloom filter from the foreman node. It 
> will filter the remaining rows against the bloom filter.
>  
> To implement the above execution flow, the main new notions are described 
> below:
>       1. RuntimeFilter
> It’s a filter container which may contain a BloomFilter or a MinMaxFilter.
>       2. RuntimeFilterReporter
> It wraps the logic to send the hash join’s bloom filter to the foreman. The 
> serialized bloom filter will be sent out through the data tunnel. This 
> object will be instantiated by the FragmentExecutor and passed to the 
> FragmentContext, so the HashJoin operator can obtain it through the 
> FragmentContext.
>      3. RuntimeFilterRequestHandler
> It is responsible for accepting a SendRuntimeFilterRequest RPC and stripping 
> the actual BloomFilter from the network. It then passes this filter to the 
> WorkerBee’s new interface registerRuntimeFilter.
> Another RPC type is BroadcastRuntimeFilterRequest. It registers the accepted 
> global bloom filter with the WorkerBee via the registerRuntimeFilter method 
> and then propagates it to the FragmentContext, through which the probe side 
> scan node can fetch the aggregated bloom filter.
>       4. RuntimeFilterManager
> The foreman will instantiate a RuntimeFilterManager. It will indirectly get 
> every RuntimeFilter via the WorkerBee. Once all the BloomFilters have been 
> accepted and aggregated, it will broadcast the aggregated bloom filter to 
> all the probe side scan nodes through the data tunnel by a 
> BroadcastRuntimeFilterRequest RPC.
>      5. RuntimeFilterEnableOption 
>  A global option will be added to decide whether to enable this new feature.
>  
> Suggestions and advice are welcome. The related PR will be presented as soon 
> as possible.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570081#comment-16570081
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support 
JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207860831
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterManager.java
 ##
 @@ -0,0 +1,666 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.filter;
+
+import org.apache.calcite.plan.volcano.RelSubset;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.JoinInfo;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.commons.collections.CollectionUtils;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.AccountingDataTunnel;
+import org.apache.drill.exec.ops.Consumer;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.ops.SendingAccountor;
+import org.apache.drill.exec.ops.StatusHandler;
+import org.apache.drill.exec.physical.PhysicalPlan;
+
+import org.apache.drill.exec.physical.base.AbstractPhysicalVisitor;
+import org.apache.drill.exec.physical.base.Exchange;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.config.BroadcastExchange;
+import org.apache.drill.exec.physical.config.HashJoinPOP;
+import org.apache.drill.exec.planner.fragment.Fragment;
+import org.apache.drill.exec.planner.fragment.Wrapper;
+import org.apache.drill.exec.planner.physical.HashAggPrel;
+import org.apache.drill.exec.planner.physical.HashJoinPrel;
+import org.apache.drill.exec.planner.physical.Prel;
+import org.apache.drill.exec.planner.physical.ScanPrel;
+import org.apache.drill.exec.planner.physical.StreamAggPrel;
+import org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor;
+import org.apache.drill.exec.proto.BitData;
+import org.apache.drill.exec.proto.CoordinationProtos;
+import org.apache.drill.exec.proto.GeneralRPCProtos;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.drill.exec.rpc.data.DataTunnel;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.util.Pointer;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * This class traverses the physical operator tree to find the HashJoin operators
+ * for which JPPD (join predicate push down) is possible. The prerequisites for
+ * JPPD are:
+ * 1. The join condition is equality
+ * 2. The physical join node is a HashJoin one
+ * 3. The probe side children of the HashJoin node should not contain a 
blocking operator like HashAgg
+ */
+public class RuntimeFilterManager {
+
+  private Wrapper rootWrapper;
+  // HashJoin node's major fragment id to its corresponding probe side nodes' endpoints
+  private Map<Integer, List<CoordinationProtos.DrillbitEndpoint>> joinMjId2probdeScanEps = new HashMap<>();
+  // HashJoin node's major fragment id to the number of its corresponding probe side scan nodes
+  private Map<Integer, Integer> joinMjId2scanSize = new ConcurrentHashMap<>();
+  // HashJoin node's major fragment id to the major fragment id that its probe side scan nodes belong to
+  private Map<Integer, Integer> joinMjId2ScanMjId = new HashMap<>();
+
+  private RuntimeFilterWritable aggregatedRuntimeFilter;
+
+  private DrillbitContext drillbitContext;
+
+  private SendingAccountor sendingAccountor = new SendingAccountor();
+
+  private String lineSeparator;
+
+  

[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569986#comment-16569986
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support 
JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r207836945
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterManager.java
 ##
 @@ -0,0 +1,666 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.filter;
+
+import org.apache.calcite.plan.volcano.RelSubset;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.JoinInfo;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.commons.collections.CollectionUtils;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.AccountingDataTunnel;
+import org.apache.drill.exec.ops.Consumer;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.ops.SendingAccountor;
+import org.apache.drill.exec.ops.StatusHandler;
+import org.apache.drill.exec.physical.PhysicalPlan;
+
+import org.apache.drill.exec.physical.base.AbstractPhysicalVisitor;
+import org.apache.drill.exec.physical.base.Exchange;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.config.BroadcastExchange;
+import org.apache.drill.exec.physical.config.HashJoinPOP;
+import org.apache.drill.exec.planner.fragment.Fragment;
+import org.apache.drill.exec.planner.fragment.Wrapper;
+import org.apache.drill.exec.planner.physical.HashAggPrel;
+import org.apache.drill.exec.planner.physical.HashJoinPrel;
+import org.apache.drill.exec.planner.physical.Prel;
+import org.apache.drill.exec.planner.physical.ScanPrel;
+import org.apache.drill.exec.planner.physical.StreamAggPrel;
+import org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor;
+import org.apache.drill.exec.proto.BitData;
+import org.apache.drill.exec.proto.CoordinationProtos;
+import org.apache.drill.exec.proto.GeneralRPCProtos;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.drill.exec.rpc.data.DataTunnel;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.util.Pointer;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * This class traverses the physical operator tree to find the HashJoin operators
+ * for which JPPD (join predicate push down) is possible. The prerequisites for
+ * JPPD are:
+ * 1. The join condition is equality
+ * 2. The physical join node is a HashJoin one
+ * 3. The probe side children of the HashJoin node should not contain a 
blocking operator like HashAgg
+ */
+public class RuntimeFilterManager {
+
+  private Wrapper rootWrapper;
+  // HashJoin node's major fragment id to its corresponding probe side nodes' endpoints
+  private Map<Integer, List<CoordinationProtos.DrillbitEndpoint>> joinMjId2probdeScanEps = new HashMap<>();
+  // HashJoin node's major fragment id to the number of its corresponding probe side scan nodes
+  private Map<Integer, Integer> joinMjId2scanSize = new ConcurrentHashMap<>();
+  // HashJoin node's major fragment id to the major fragment id that its probe side scan nodes belong to
+  private Map<Integer, Integer> joinMjId2ScanMjId = new HashMap<>();
+
+  private RuntimeFilterWritable aggregatedRuntimeFilter;
+
+  private DrillbitContext drillbitContext;
+
+  private SendingAccountor sendingAccountor = new SendingAccountor();
+
+  private String lineSeparator;
+
+  

[jira] [Commented] (DRILL-6656) Add Regex To Disallow Extra Semicolons In Imports

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569962#comment-16569962
 ] 

ASF GitHub Bot commented on DRILL-6656:
---

vvysotskyi commented on a change in pull request #1415: DRILL-6656: Disallow 
extra semicolons in import statements.
URL: https://github.com/apache/drill/pull/1415#discussion_r207830787
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillAggFuncHolder.java
 ##
 @@ -93,13 +93,13 @@ public boolean isAggregating() {
 //Loop through all workspace vectors, to get the minimum of size of 
all workspace vectors.
 JVar sizeVar = setupBlock.decl(g.getModel().INT, "vectorSize", 
JExpr.lit(Integer.MAX_VALUE));
 JClass mathClass = g.getModel().ref(Math.class);
-for (int id = 0; id < getWorkspaceVars().length; id ++) {
+for (int id = 0; id < getWorkspaceVars().length; id++) {
   if (!getWorkspaceVars()[id].isInject()) {
 
setupBlock.assign(sizeVar,mathClass.staticInvoke("min").arg(sizeVar).arg(g.getWorkspaceVectors().get(getWorkspaceVars()[id]).invoke("getValueCapacity")));
   }
 }
 
-for(int i =0 ; i < getWorkspaceVars().length; i++) {
+for(int i =0; i < getWorkspaceVars().length; i++) {
 
 Review comment:
   Could you please fix the spacing on the lines where you made changes? 
   It may make it easier to add new checkstyle rules in the future.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Regex To Disallow Extra Semicolons In Imports
> -
>
> Key: DRILL-6656
> URL: https://issues.apache.org/jira/browse/DRILL-6656
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6656) Add Regex To Disallow Extra Semicolons In Imports

2018-08-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569948#comment-16569948
 ] 

ASF GitHub Bot commented on DRILL-6656:
---

vvysotskyi commented on a change in pull request #1415: DRILL-6656: Disallow 
extra semicolons in import statements.
URL: https://github.com/apache/drill/pull/1415#discussion_r207829569
 
 

 ##
 File path: src/main/resources/checkstyle-config.xml
 ##
 @@ -35,11 +35,26 @@
 [The XML module definitions in this hunk were stripped by the mail archive; 
 the hunk added new checks, including the custom RegexpSingleline rule 
 discussed below.]
 
 Review comment:
   Sorry, I didn't expect that it would require more changes. 
   
   Looks like it was a bug in Checkstyle's OneStatementPerLine check, and it 
was fixed in the 6.9 release [1]. So after updating Checkstyle, the custom 
RegexpSingleline rule may be deleted.
   
   [1] http://checkstyle.sourceforge.net/releasenotes.html#Release_6.9
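
   For reference, a minimal illustration of the kind of single-line pattern 
such a rule needs (the exact regex in the patch may differ):

{code}
import java.util.regex.Pattern;

public class ExtraSemicolonCheckSketch {
  // Flags an import statement followed by a stray extra semicolon.
  private static final Pattern EXTRA_SEMI = Pattern.compile("^import\\s+[^;]+;\\s*;");

  public static void main(String[] args) {
    System.out.println(EXTRA_SEMI.matcher("import java.util.List;;").find()); // true
    System.out.println(EXTRA_SEMI.matcher("import java.util.List;").find());  // false
  }
}
{code}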


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Regex To Disallow Extra Semicolons In Imports
> -
>
> Key: DRILL-6656
> URL: https://issues.apache.org/jira/browse/DRILL-6656
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)