[jira] [Commented] (DRILL-6204) Pass tables columns without partition columns to empty Hive reader

2018-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384792#comment-16384792
 ] 

ASF GitHub Bot commented on DRILL-6204:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1146


> Pass tables columns without partition columns to empty Hive reader
> --
>
> Key: DRILL-6204
> URL: https://issues.apache.org/jira/browse/DRILL-6204
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.12.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.13.0
>
>
> When {{store.hive.optimize_scan_with_native_readers}} is enabled,
> {{HiveDrillNativeScanBatchCreator}} is used to read data from Hive tables
> directly from the file system. When the table is empty or no row groups are
> matched, an empty {{HiveDefaultReader}} is created to output the schema.
> In this situation, Drill currently fails with the following error:
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> NullPointerException Setup failed for null 
> {noformat}
> This happens because instead of passing only the table columns to the empty
> reader (as we do when creating a non-empty reader), we passed all columns, which
> may also contain partition columns. The reader then fails to find the partition
> columns in the table schema. As noted on lines 81 - 82 of
> {{HiveDrillNativeScanBatchCreator}}, we deliberately separate partition columns
> from table columns so that partition columns can be passed separately:
> {noformat}
>   // Separate out the partition and non-partition columns. Non-partition columns
>   // are passed directly to the ParquetRecordReader. Partition columns are passed to ScanBatch.
> {noformat}
> To fix the problem, we need to pass the table columns instead of all columns.
> {code:java}
> if (readers.size() == 0) {
>   readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
>     ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
> }
> {code}
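The split the description refers to can be sketched in isolation. The following is a minimal, hypothetical illustration, not Drill's actual code: plain strings stand in for Drill's {{SchemaPath}} objects, and the method and column names (including the partition column "dir0") are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the partition/non-partition column split described above.
// Plain strings stand in for Drill's SchemaPath; names here are illustrative.
public class ColumnSplit {

  // Returns only the columns that belong to the table schema; these are what
  // the (possibly empty) record reader should receive. The partition columns
  // (the complement) are handled separately by ScanBatch.
  static List<String> tableColumns(List<String> projected, List<String> partitionCols) {
    List<String> result = new ArrayList<>();
    for (String col : projected) {
      if (!partitionCols.contains(col)) {
        result.add(col);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // The projection includes a partition column ("dir0" is a made-up name).
    List<String> projected = Arrays.asList("id", "name", "dir0");
    List<String> partitions = Arrays.asList("dir0");

    // Only the table columns go to the reader. Handing the partition column to
    // the empty reader as well is what triggered the NullPointerException.
    System.out.println(tableColumns(projected, partitions)); // prints [id, name]
  }
}
```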



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6204) Pass tables columns without partition columns to empty Hive reader

2018-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384590#comment-16384590
 ] 

ASF GitHub Bot commented on DRILL-6204:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1146
  
+1




[jira] [Commented] (DRILL-6204) Pass tables columns without partition columns to empty Hive reader

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383965#comment-16383965
 ] 

ASF GitHub Bot commented on DRILL-6204:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1146
  
@vdiravka, thanks for the code review. Updated PR.




[jira] [Commented] (DRILL-6204) Pass tables columns without partition columns to empty Hive reader

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383861#comment-16383861
 ] 

ASF GitHub Bot commented on DRILL-6204:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1146#discussion_r171912800
  
--- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveDrillNativeScanBatchCreator.java ---
@@ -174,7 +174,7 @@ public ScanBatch getBatch(ExecutorFragmentContext context, HiveDrillNativeParque
     // If there are no readers created (which is possible when the table is empty or no row groups are matched),
     // create an empty RecordReader to output the schema
     if (readers.size() == 0) {
-      readers.add(new HiveDefaultReader(table, null, null, columns, context, conf,
+      readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
--- End diff --

Could we rename newColumns -> nonPartitionedColumns or tableColumns?




[jira] [Commented] (DRILL-6204) Pass tables columns without partition columns to empty Hive reader

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383594#comment-16383594
 ] 

ASF GitHub Bot commented on DRILL-6204:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1146
  
@parthchandra, @vdiravka please review.




[jira] [Commented] (DRILL-6204) Pass tables columns without partition columns to empty Hive reader

2018-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383571#comment-16383571
 ] 

ASF GitHub Bot commented on DRILL-6204:
---

GitHub user arina-ielchiieva opened a pull request:

https://github.com/apache/drill/pull/1146

DRILL-6204: Pass tables columns without partition columns to empty Hive reader

Details in [DRILL-6204](https://issues.apache.org/jira/browse/DRILL-6204).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arina-ielchiieva/drill DRILL-6204

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1146.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1146


commit 50dd97c612645d025dc3fa77795e142eabee2b70
Author: Arina Ielchiieva 
Date:   2018-03-02T11:38:00Z

DRILL-6204: Pass tables columns without partition columns to empty Hive 
reader



