Arina Ielchiieva created DRILL-6204:
---------------------------------------
Summary: Pass table columns without partition columns to empty Hive reader
Key: DRILL-6204
URL: https://issues.apache.org/jira/browse/DRILL-6204
Project: Apache Drill
Issue Type: Bug
Components: Storage - Hive
Affects Versions: 1.12.0
Reporter: Arina Ielchiieva
Assignee: Arina Ielchiieva
Fix For: 1.13.0

When {{store.hive.optimize_scan_with_native_readers}} is enabled, {{HiveDrillNativeScanBatchCreator}} is used to read data from Hive tables directly from the file system. When the table is empty or no row groups are matched, an empty {{HiveDefaultReader}} is called to output the schema. In this situation Drill currently fails with the following error:

{noformat}
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException
Setup failed for null
{noformat}

This happens because instead of passing only the table columns to the empty reader (as we do when creating a non-empty reader), we pass all columns, which may include partition columns as well. As the comment on lines 81 - 82 of {{HiveDrillNativeScanBatchCreator}} notes, partition columns and table columns are deliberately separated so that partition columns can be passed on their own:

{noformat}
// Separate out the partition and non-partition columns. Non-partition columns are passed directly to the
// ParquetRecordReader. Partition columns are passed to ScanBatch.
{noformat}

To fix the problem we need to pass the table columns ({{newColumns}}) instead of all columns:

{code:java}
if (readers.size() == 0) {
  readers.add(new HiveDefaultReader(table, null, null, newColumns, context, conf,
      ImpersonationUtil.createProxyUgi(config.getUserName(), context.getQueryUserName())));
}
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
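For illustration, the separation the comment above describes can be sketched in plain Java. This is a hedged sketch, not Drill's actual implementation: the column names and the {{partitionNames}} set are hypothetical stand-ins for what {{HiveDrillNativeScanBatchCreator}} derives from the Hive table metadata. It only shows why the empty reader must receive the non-partition ({{newColumns}}) list rather than the full column list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ColumnSplitSketch {
  public static void main(String[] args) {
    // Hypothetical selection: two table columns plus one Hive partition column.
    List<String> allColumns = Arrays.asList("id", "name", "part_col");
    // In Drill this set would come from the Hive table's partition metadata.
    Set<String> partitionNames = new HashSet<>(Arrays.asList("part_col"));

    List<String> tableColumns = new ArrayList<>();     // goes to the (possibly empty) record reader
    List<String> partitionColumns = new ArrayList<>(); // goes to ScanBatch
    for (String col : allColumns) {
      if (partitionNames.contains(col)) {
        partitionColumns.add(col);
      } else {
        tableColumns.add(col);
      }
    }

    // Only the non-partition columns should reach the empty HiveDefaultReader;
    // passing allColumns is what triggered the NullPointerException.
    System.out.println(tableColumns);
    System.out.println(partitionColumns);
  }
}
```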