Ashish Sharma created HIVE-19103:
------------------------------------

             Summary: Reading only required column in nested structure schema in ORC
                 Key: HIVE-19103
                 URL: https://issues.apache.org/jira/browse/HIVE-19103
             Project: Hive
          Issue Type: Improvement
            Reporter: Ashish Sharma
            Assignee: Ashish Sharma

Read only the required columns in a nested structure schema.

Example -

*Current state* -
Schema - struct<a:int, b:bigint, c:struct<d:int, e:struct<f:int>, g:string>>
Query - select c.e.f from t where c.e.f > 10;
Current state - The entire c struct is read from the file and then filtered, because only "hive.io.file.readcolumn.ids" is consulted, so all child columns of c are selected for reading.
Conf -
_hive.io.file.readcolumn.ids = "2"_
_hive.io.file.readNestedColumn.paths = "c.e.f"_
Result - boolean[] include = [true,false,false,true,true,true,true,true]

*Expected state* -
Schema - struct<a:int, b:bigint, c:struct<d:int, e:struct<f:int>, g:string>>
Query - select c.e.f from t where c.e.f > 10;
Expected state - Instead of reading the entire c struct from the file, read only the f column (together with its enclosing structs c and e) by referring to "hive.io.file.readNestedColumn.paths".
Conf -
_hive.io.file.readcolumn.ids = "2"_
_hive.io.file.readNestedColumn.paths = "c.e.f"_
Result - boolean[] include = [true,false,false,true,false,true,true,false]
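
To make the two include arrays above concrete, here is a minimal sketch of how they could be derived. It assumes that column ids are assigned in pre-order over the schema tree with the root struct as id 0, which matches the eight-entry arrays in the example. The names `Node`, `assign_ids`, `include_from_column_ids`, and `include_from_nested_paths` are hypothetical helpers written for illustration; they are not Hive or ORC APIs, and this is not the proposed patch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical stand-in for a schema node; not an ORC or Hive class.
    name: str
    children: List["Node"] = field(default_factory=list)
    col_id: int = -1


def assign_ids(root: Node) -> int:
    """Number nodes in pre-order (root = 0) and return the node count."""
    counter = 0
    stack = [root]
    while stack:
        node = stack.pop()
        node.col_id = counter
        counter += 1
        # Push children reversed so the leftmost child is popped first.
        stack.extend(reversed(node.children))
    return counter


def mark_subtree(node: Node, include: List[bool]) -> None:
    """Mark a node and every descendant as included."""
    include[node.col_id] = True
    for child in node.children:
        mark_subtree(child, include)


def include_from_column_ids(root: Node, column_ids: List[int]) -> List[bool]:
    """Current state: each requested top-level column pulls in its whole subtree."""
    include = [False] * assign_ids(root)
    include[root.col_id] = True
    for i in column_ids:
        mark_subtree(root.children[i], include)
    return include


def include_from_nested_paths(root: Node, paths: List[str]) -> List[bool]:
    """Expected state: include only the nodes along each dotted path."""
    include = [False] * assign_ids(root)
    include[root.col_id] = True
    for path in paths:
        node = root
        for part in path.split("."):
            node = next(c for c in node.children if c.name == part)
            include[node.col_id] = True
        # If the path ends at a struct, its children are needed as well.
        mark_subtree(node, include)
    return include


# struct<a:int, b:bigint, c:struct<d:int, e:struct<f:int>, g:string>>
schema = Node("root", [
    Node("a"),
    Node("b"),
    Node("c", [Node("d"), Node("e", [Node("f")]), Node("g")]),
])

current = include_from_column_ids(schema, [2])
expected = include_from_nested_paths(schema, ["c.e.f"])
print(current)   # [True, False, False, True, True, True, True, True]
print(expected)  # [True, False, False, True, False, True, True, False]
```

With ids 0..7 mapping to root, a, b, c, d, e, f, g, the second array drops d (4) and g (7) while keeping c (3) and e (5), since the enclosing structs still have to be marked in order to reach f (6).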