> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java, line 272
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061874#file1061874line272>
> >
> >     Who calls this method? If this is required later can you put this in 
> > that patch. Focus only on refactoring in this patch. Easy to review too :)

it's called from ORC splits and metastore logic in subsequent patches. If 
patches were all merged the final patch would be too epic


> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java, line 363
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061874#file1061874line363>
> >
> >     I think this is messed up from the beginning. The second argument is 
> > list of all projected column names and not just sarg column names. So 
> > rename of this method will be helpful.

this is an existing method. One of the reasons for this patch is that it's 
messed up ;)


> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java, line 1439
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061874#file1061874line1439>
> >
> >     where is this used? I don't see it being used anywhere. Can this be 
> > moved to relevant jira?

future metastore usage


> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line 
> > 1228
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061876#file1061876line1228>
> >
> >     indexInSourceTable will never be null. Right?

it will be, for partition columns and stuff. At least some code that I was 
reading handles this case


> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line 
> > 1231
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061876#file1061876line1231>
> >
> >     Whats this translation doing exactly?
> >     Is this trying to map between sarg column -> table column index -> orc 
> > internal column index?

It's making SARG halfway self contained by replacing columns by their indexes 
in the table. Then ORC can translate from those to ORC internal indexes


> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java,
> >  line 517
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061879#file1061879line517>
> >
> >     Create a bug?

removed comment instead :)


> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java, 
> > line 146
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061881#file1061881line146>
> >
> >     I also don't understand why would it contain duplicates. IIRC, this was 
> > probably caused by multiple concatenation to the READ_COLUMN_IDS_CONF_STR. 
> > I am not sure if this happens anymore, in any case we should create a bug 
> > and remove this code. Or may be remove in the next patch. Also use, guava 
> > Splitter.splitToList?

existing code :) I added logging as per some other comment


> On Sept. 5, 2015, 1:59 a.m., Prasanth_J wrote:
> > storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactory.java,
> >  line 28
> > <https://reviews.apache.org/r/37985/diff/2/?file=1061882#file1061882line28>
> >
> >     I would prefer setting new name based on internal column names. 
> >     For example:
> >     Schema from filedump: struct<_col0:tinyint,_col1:smallint>
> >     
> >     Current SARG: leaf-0 = (EQUALS i1 100)
> >     
> >     After this patch: leaf-0 = (EQUALS _col1 100)
> >     
> >     If we can map to internal names, the it will be easy to map sarg column 
> > names to internal column index. i1 -> 1 (after ripping off _col)

how can we map to internal column name? We are working against a file we have 
not read (in split generation), these names could be anything


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37985/#review97837
-----------------------------------------------------------


On Sept. 2, 2015, 1:06 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37985/
> -----------------------------------------------------------
> 
> (Updated Sept. 2, 2015, 1:06 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 05efc5f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java 8beff4b 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java fcb3746 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java 
> 4480600 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 
> e034650 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
> 2dc15f9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b809a23 
>   serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 
> 10086c5 
>   
> storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactory.java
>  0778935 
>   
> storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
>  d27ac16 
> 
> Diff: https://reviews.apache.org/r/37985/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>

Reply via email to