----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50896/#review145227 -----------------------------------------------------------
I'm ambivilent, I would rather have pursued the change to make it in superCSV to be better in long run. But I do see it might not move very fast (did you guys try contacting them?). The patch itself looks mostly fine though. My only question is, does it need to be a 2nd version of the format? That is, is there anything that is actually backward incompatibie other than adding a new flag? Thanks. - Szehon Ho On Aug. 8, 2016, 3:13 p.m., Marta Kuczora wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/50896/ > ----------------------------------------------------------- > > (Updated Aug. 8, 2016, 3:13 p.m.) > > > Review request for hive, Naveen Gangam, Sergio Pena, Szehon Ho, and Xuefu > Zhang. > > > Repository: hive-git > > > Description > ------- > > Introduced a new outputformat (dsv2) which supports multiple characters as > delimiter. > For generating the dsv, csv2 and tsv2 outputformats, the Super CSV library is > used. This library doesn’t support multiple characters as delimiter. Since > the same logic is used for generating csv2, tsv2 and dsv outputformats, I > decided not to change this logic, rather introduce a new outputformat (dsv2) > which supports multiple characters as delimiter. > The new dsv2 outputformat has the same escaping logic as the dsv outputformat > if the quoting is not disabled. > Extended the TestBeeLineWithArgs tests with new test steps which are using > multiple characters as delimiter. > > Main changes in the code: > - Changed the SeparatedValuesOutputFormat class to be an abstract class and > created two new child classes to separate the logic for single-character and > multi-character delimiters: SingleCharSeparatedValuesOutputFormat and > MultiCharSeparatedValuesOutputFormat > > - Kept the methods which are used by both children in the > SeparatedValuesOutputFormat and moved the methods specific to the > single-character case to the SingleCharSeparatedValuesOutputFormat class. > > - Didn’t change the logic which was in the SeparatedValuesOutputFormat, only > moved some parts to the child class. > > - Implemented the value escaping and concatenation with the delimiter string > in the MultiCharSeparatedValuesOutputFormat. > > > Diffs > ----- > > beeline/src/java/org/apache/hive/beeline/BeeLine.java e0fa032 > beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java e6e24b1 > > beeline/src/java/org/apache/hive/beeline/MultiCharSeparatedValuesOutputFormat.java > PRE-CREATION > beeline/src/java/org/apache/hive/beeline/SeparatedValuesOutputFormat.java > 66d9fd0 > > beeline/src/java/org/apache/hive/beeline/SingleCharSeparatedValuesOutputFormat.java > PRE-CREATION > > itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java > 892c733 > > Diff: https://reviews.apache.org/r/50896/diff/ > > > Testing > ------- > > - Tested manually in BeeLine. > - Extended the TestBeeLineWithArgs tests with new test steps which are using > multiple characters as delimiter. > > > Thanks, > > Marta Kuczora > >