Arina Ielchiieva created DRILL-6204:
---------------------------------------
Summary: Pass table columns without partition columns to empty Hive reader
Key: DRILL-6204
URL: https://issues.apache.org/jira/browse/DRILL-6204
Project: Apache Drill
Issue Type: Bug
Components: Storage - Hive
Affects Versions: 1.12.0
Reporter: Arina Ielchiieva
Assignee: Arina Ielchiieva
Fix For: 1.13.0
When {{store.hive.optimize_scan_with_native_readers}} is enabled,
{{HiveDrillNativeScanBatchCreator}} is used to read data from Hive tables
directly from the file system. When the table is empty or no row groups are
matched, an empty {{HiveDefaultReader}} is created to output only the schema.
In this situation Drill currently fails with the following error:
{noformat}
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
NullPointerException Setup failed for null
{noformat}
This happens because, instead of passing only the table columns to the empty
reader (as we do when creating a non-empty reader), we pass all columns, which
may include partition columns as well. As noted on lines 81 - 82 in
{{HiveDrillNativeScanBatchCreator}}, we deliberately separate partition
columns from table columns and pass the partition columns separately:
{noformat}
// Separate out the partition and non-partition columns. Non-partition columns are passed directly to the
// ParquetRecordReader. Partition columns are passed to ScanBatch.
{noformat}
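For illustration, here is a minimal sketch of that separation. It uses plain
{{String}} column names and a hypothetical set of partition column names
instead of Drill's actual {{SchemaPath}} and Hive metadata types, so it only
mirrors the idea, not the real code in {{HiveDrillNativeScanBatchCreator}}:
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ColumnSeparationSketch {

  public static void main(String[] args) {
    // Hypothetical projected columns: two table columns and one partition column.
    List<String> projectedColumns = Arrays.asList("id", "name", "part_col");
    Set<String> partitionColumnNames = new HashSet<>(Arrays.asList("part_col"));

    // Separate out the partition and non-partition columns, mirroring the
    // comment above: non-partition (table) columns go to the record reader,
    // partition columns go to ScanBatch.
    List<String> tableColumns = new ArrayList<>();
    List<String> partitionColumns = new ArrayList<>();
    for (String column : projectedColumns) {
      if (partitionColumnNames.contains(column)) {
        partitionColumns.add(column);
      } else {
        tableColumns.add(column);
      }
    }

    // Only tableColumns should be handed to the (possibly empty) reader;
    // passing projectedColumns there is what leads to the NPE described above.
    System.out.println("Table columns: " + tableColumns);
    System.out.println("Partition columns: " + partitionColumns);
  }
}
{code}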
To fix the problem, the empty reader should receive only the table columns
({{newColumns}}) instead of all columns:
{code:java}
if (readers.size() == 0) {
  // Create the empty reader with the table (non-partition) columns only,
  // i.e. the same column list used when creating a non-empty reader.
  readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
      ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
}
{code}
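With this change the empty reader receives the same column list as the
non-empty {{ParquetRecordReader}} path, while partition columns continue to be
handled by {{ScanBatch}}.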