[ 
https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641209#comment-13641209
 ] 

Phabricator commented on HIVE-4221:
-----------------------------------

kevinwilfong has commented on the revision "HIVE-4221 [jira] Stripe-level merge 
for ORC files".

  For the testcases orcfile_merge1-4.q, it's not clear to me what you're 
testing, and I have doubts as to whether multiple files are being 
generated/merged. Can you add comments and confirm?

  Can you also add negative testcases where you create a table with certain 
parameters, write a file to it, alter one or more of those parameters, add 
another file into the table (insert into, not insert overwrite), and try to 
merge it?  The parameters should include:
  orc.compress
  orc.compress.size
  orc.row.index.stride
  orc.create.index

  This should cause the merge to fail.
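
  As a rough sketch of the kind of negative testcase described above (not 
taken from the patch): the table name is made up, and using ALTER TABLE ... 
CONCATENATE as the merge trigger is an assumption, so substitute whatever 
mechanism the patch actually uses to invoke the stripe-level merge. The same 
pattern would be repeated for orc.compress.size, orc.row.index.stride and 
orc.create.index.

```sql
-- Hypothetical negative testcase; orcfile_merge_neg is an illustrative name.
CREATE TABLE orcfile_merge_neg (key STRING, value STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB");

-- First file, written with the original parameters.
INSERT INTO TABLE orcfile_merge_neg SELECT key, value FROM src;

-- Alter one parameter so that the next file is written differently.
ALTER TABLE orcfile_merge_neg SET TBLPROPERTIES ("orc.compress"="SNAPPY");

-- Second file, appended with INSERT INTO (not INSERT OVERWRITE).
INSERT INTO TABLE orcfile_merge_neg SELECT key, value FROM src;

-- Assumed merge trigger; this statement is the one expected to fail,
-- because the two files no longer share the same compression setting.
ALTER TABLE orcfile_merge_neg CONCATENATE;
```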

INLINE COMMENTS
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:481-485 Could you 
add the new configs to conf/hive-default.xml.template?
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeOutputFormat.java:33 
Could you make this string a constant? It seems fragile to have to change it in 
two places.
  ql/src/test/queries/clientpositive/orcfile_merge1.q:1 What's the purpose of 
this test?

  src is a very small table, so I don't think you'll be able to get multiple 
splits from it.  So I don't think the merge will do anything.

  I could be wrong, though. Did you confirm there were multiple files being 
generated and they were being merged?
  ql/src/test/queries/clientpositive/orcfile_merge2.q:1 Again, not clear to me 
what the purpose of this test is.

  Could you put a comment in it describing what the goal is?

REVISION DETAIL
  https://reviews.facebook.net/D9759

To: kevinwilfong, omalley, sxyuan
Cc: JIRA

                
> Stripe-level merge for ORC files
> --------------------------------
>
>                 Key: HIVE-4221
>                 URL: https://issues.apache.org/jira/browse/HIVE-4221
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Samuel Yuan
>            Assignee: Samuel Yuan
>         Attachments: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch
>
>
> As with RC files, we would like to be able to merge ORC files efficiently by 
> reading/writing stripes without decompressing/recompressing them. This will 
> be similar to the RC file merge, except that footers will have to be updated 
> with the stripe positions in the new file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
