> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java, line 272 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061874#file1061874line272> > > > > Who calls this method? If this is required later can you put this in > > that patch. Focus only on refactoring in this patch. Easy to review too :)
it's called from ORC splits and metastore logic in subsequent patches. If patches were all merged the final patch would be too epic > On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java, line 363 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061874#file1061874line363> > > > > I think this is messed up from the beginning. The second argument is > > list of all projected column names and not just sarg column names. So > > rename of this method will be helpful. this is an existing method. One of the reasons for this patch is that it's messed up ;) > On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java, line 1439 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061874#file1061874line1439> > > > > where is this used? I don't see it being used anywhere. Can this be > > moved to relevant jira? future metastore usage > On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line > > 1228 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061876#file1061876line1228> > > > > indexInSourceTable will never be null. Right? it will be, for partition columns and stuff. At least some code that I was reading handles this case > On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line > > 1231 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061876#file1061876line1231> > > > > Whats this translation doing exactly? > > Is this trying to map between sarg column -> table column index -> orc > > internal column index? It's making SARG halfway self contained by replacing columns by their indexes in the table. Then ORC can translate from those to ORC internal indexes > On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java, > > line 517 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061879#file1061879line517> > > > > Create a bug? removed comment instead :) > On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java, > > line 146 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061881#file1061881line146> > > > > I also don't understand why would it contain duplicates. IIRC, this was > > probably caused by multiple concatenation to the READ_COLUMN_IDS_CONF_STR. > > I am not sure if this happens anymore, in any case we should create a bug > > and remove this code. Or may be remove in the next patch. Also use, guava > > Splitter.splitToList? existing code :) I added logging as per some other comment > On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote: > > storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactory.java, > > line 28 > > <https://reviews.apache.org/r/37985/diff/2/?file=1061882#file1061882line28> > > > > I would prefer setting new name based on internal column names. > > For example: > > Schema from filedump: struct<_col0:tinyint,_col1:smallint> > > > > Current SARG: leaf-0 = (EQUALS i1 100) > > > > After this patch: leaf-0 = (EQUALS _col1 100) > > > > If we can map to internal names, the it will be easy to map sarg column > > names to internal column index. i1 -> 1 (after ripping off _col) how can we map to internal column name? We are working against a file we have not read (in split generation), these names could be anything - Sergey ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/37985/#review97837 ----------------------------------------------------------- On Sept. 2, 2015, 1:06 a.m., Sergey Shelukhin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/37985/ > ----------------------------------------------------------- > > (Updated Sept. 2, 2015, 1:06 a.m.) > > > Review request for hive and Prasanth_J. > > > Repository: hive-git > > > Description > ------- > > see jira > > > Diffs > ----- > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 05efc5f > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java 8beff4b > ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java fcb3746 > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java > 4480600 > ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java > e034650 > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java > 2dc15f9 > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b809a23 > serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java > 10086c5 > > storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactory.java > 0778935 > > storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java > d27ac16 > > Diff: https://reviews.apache.org/r/37985/diff/ > > > Testing > ------- > > > Thanks, > > Sergey Shelukhin > >