> On March 26, 2015, 7:11 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java, line 65
> > <https://reviews.apache.org/r/32499/diff/1/?file=906071#file906071line65>
> >
> > why remove static ?

Thanks Mohit. At first I did not know what the benefit of 'private static' was, so I thought it was just extra code. I now see that it has some benefits: it guarantees that the method does not touch instance fields, and because static methods are statically linked, calling them may be a little faster.


> On March 26, 2015, 7:11 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java, line 90
> > <https://reviews.apache.org/r/32499/diff/1/?file=906071#file906071line90>
> >
> > Looks like this method is called recursively (to deal with nested
> > fields). Can we have duplicate column names across nesting levels ?

Yes, Parquet supports duplicate column names across nesting levels. Here is an example:

  optional group a {
    required binary name;
    optional group addr {
      optional binary a;
    }
  }
  optional group b {
    required binary name;
    optional group addr {
      optional binary b;
    }
  }


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32499/#review77924
-----------------------------------------------------------


On March 25, 2015, 10:42 p.m., Sergio Pena wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32499/
> -----------------------------------------------------------
>
> (Updated March 25, 2015, 10:42 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10086
>     https://issues.apache.org/jira/browse/HIVE-10086
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Attached is the patch that handles schemas that do not match between Parquet
> and Hive.
>
> In this case, Parquet data is accessed by name matching. The table columns
> may be in a different order, but if a column name matches the Parquet
> column name, then the value is retrieved.
>
> Also, if the Hive schema has columns or struct elements that do not match
> the Parquet schema, then NULL values are returned for them instead.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 57ae7a9740d55b407cadfc8bc030593b29f90700
>   ql/src/test/queries/clientpositive/parquet_schema_evolution.q PRE-CREATION
>   ql/src/test/queries/clientpositive/parquet_table_with_subschema.q PRE-CREATION
>   ql/src/test/results/clientpositive/parquet_schema_evolution.q.out PRE-CREATION
>   ql/src/test/results/clientpositive/parquet_table_with_subschema.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/32499/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Sergio Pena
>
>
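
For illustration only, the name-matching behaviour described in the quoted request (columns resolved by name at each nesting level, NULL for anything the file does not contain) might be sketched in plain Java as follows. This is a sketch under assumptions, not code from the patch or from DataWritableReadSupport: nested Maps stand in for Parquet groups, type checking is omitted, and NameMatchSketch, group, project, sampleFile and sampleTable are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the patch: nested Maps stand in for Parquet groups.
final class NameMatchSketch {

    // Build an insertion-ordered map from alternating key/value arguments.
    static Map<String, Object> group(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], kv[i + 1]);
        }
        return m;
    }

    // Project a file-side record onto the table-side schema by column name.
    // Lookup is scoped to the current group, so the same name may be reused
    // at different nesting levels. Anything the file lacks becomes null.
    @SuppressWarnings("unchecked")
    static Map<String, Object> project(Map<String, Object> fileRecord,
                                       Map<String, Object> tableSchema) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> col : tableSchema.entrySet()) {
            Object wanted = col.getValue();
            Object found = fileRecord.get(col.getKey());
            if (wanted instanceof Map) {
                // Struct requested: recurse if the file has a group of that name.
                out.put(col.getKey(), found instanceof Map
                        ? project((Map<String, Object>) found, (Map<String, Object>) wanted)
                        : null);
            } else {
                // Leaf requested: take the value by name, or null if absent.
                out.put(col.getKey(), found);
            }
        }
        return out;
    }

    // File contents shaped like the groups "a" and "b" in the example schema.
    static Map<String, Object> sampleFile() {
        return group(
            "a", group("name", "x", "addr", group("a", "a1")),
            "b", group("name", "y", "addr", group("b", "b1")));
    }

    // Table schema in a different order, with an extra column "c" and an
    // extra struct element "zip" that the file does not contain.
    static Map<String, Object> sampleTable() {
        return group(
            "b", group("addr", group("b", "binary", "zip", "binary"), "name", "binary"),
            "c", "binary",
            "a", group("name", "binary"));
    }

    public static void main(String[] args) {
        // prints {b={addr={b=b1, zip=null}, name=y}, c=null, a={name=x}}
        System.out.println(project(sampleFile(), sampleTable()));
    }
}
```

Because each lookup only sees the fields of the group it is currently in, the name "a" can be used both for a top-level group and for a field inside "addr" without any conflict.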