[
https://issues.apache.org/jira/browse/DRILL-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383571#comment-16383571
]
ASF GitHub Bot commented on DRILL-6204:
---------------------------------------
GitHub user arina-ielchiieva opened a pull request:
https://github.com/apache/drill/pull/1146
DRILL-6204: Pass tables columns without partition columns to empty Hive reader
Details in [DRILL-6204](https://issues.apache.org/jira/browse/DRILL-6204).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arina-ielchiieva/drill DRILL-6204
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/1146.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1146
----
commit 50dd97c612645d025dc3fa77795e142eabee2b70
Author: Arina Ielchiieva <arina.yelchiyeva@...>
Date: 2018-03-02T11:38:00Z
DRILL-6204: Pass tables columns without partition columns to empty Hive
reader
----
> Pass tables columns without partition columns to empty Hive reader
> ------------------------------------------------------------------
>
> Key: DRILL-6204
> URL: https://issues.apache.org/jira/browse/DRILL-6204
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Hive
> Affects Versions: 1.12.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Priority: Major
> Fix For: 1.13.0
>
>
> When {{store.hive.optimize_scan_with_native_readers}} is enabled,
> {{HiveDrillNativeScanBatchCreator}} is used to read data from Hive tables
> directly from the file system. When the table is empty or no row groups are
> matched, an empty {{HiveDefaultReader}} is called to output the schema.
> In this situation, Drill currently fails with the following error:
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> NullPointerException Setup failed for null
> {noformat}
> This happens because instead of passing only the table columns to the empty
> reader (as we do when creating a non-empty reader), we passed all columns,
> which may include partition columns as well. As noted on lines 81 - 82 in
> {{HiveDrillNativeScanBatchCreator}}, we deliberately separate out partition
> columns and table columns in order to pass the partition columns separately:
> {noformat}
> // Separate out the partition and non-partition columns. Non-partition columns are passed directly to the
> // ParquetRecordReader. Partition columns are passed to ScanBatch.
> {noformat}
> To fix the problem, we need to pass the table columns instead of all columns.
> {code:java}
> if (readers.size() == 0) {
>   readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
>       ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
> }
> {code}
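The fix above boils down to filtering the partition columns out of the full column list before constructing the empty reader. A minimal, self-contained sketch of that separation (hypothetical class and method names for illustration only; this is not Drill's actual API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ColumnSplitSketch {
  // Return only the non-partition (table) columns from the full column list,
  // mirroring the partition/table split described in the issue.
  static List<String> tableColumns(List<String> allColumns, Set<String> partitionColumns) {
    List<String> result = new ArrayList<>();
    for (String column : allColumns) {
      if (!partitionColumns.contains(column)) {
        result.add(column);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> all = List.of("id", "name", "dir0");  // dir0 stands in for a partition column
    Set<String> partitions = Set.of("dir0");
    System.out.println(tableColumns(all, partitions)); // prints [id, name]
  }
}
```

Passing only the filtered list to the empty reader avoids the NullPointerException, since the reader never sees partition columns it cannot resolve against the table schema.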
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)