> On Jan. 20, 2014, 6:56 p.m., Eric Hanson wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line 1119
> > <https://reviews.apache.org/r/17005/diff/1/?file=425358#file425358line1119>
> >
> > It seems odd that we're reading from a scaleStream because the scale
> > should be the same for every value in the column. Is this necessary?

The ORC decimal encoding currently supports arbitrary scale. Although Hive doesn't allow variable scales, the ORC format does. We should add another decimal encoding in Hive that is optimized for a specific precision and scale, and correspondingly we will have to add an additional vectorized reader for decimal as well. Since the reader is part of the ORC code, I think it should also allow reading variable scales as per the encoding. If a scale read from the data doesn't match the scale in the schema, then we definitely have a data/schema corruption issue.


> On Jan. 20, 2014, 6:56 p.m., Eric Hanson wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line 1123
> > <https://reviews.apache.org/r/17005/diff/1/?file=425358#file425358line1123>
> >
> > If any scale values are different inside a single DecimalColumnVector,
> > I think that could cause unpredictable or wrong results.
> >
> > Later operations on DecimalColumnVector take the scale from the
> > columnvector sometimes, not each individual object.

If the scale in the data is different from the scale assumed in the vectorized reader, we would still get erroneous results.


- Jitendra


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17005/#review32299
-----------------------------------------------------------


On Jan. 24, 2014, 10:28 p.m., Jitendra Pandey wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17005/
> -----------------------------------------------------------
>
> (Updated Jan. 24, 2014, 10:28 p.m.)
>
>
> Review request for hive and Eric Hanson.
>
>
> Bugs: HIVE-6178
>     https://issues.apache.org/jira/browse/HIVE-6178
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Vectorized reader for DECIMAL datatype for ORC format.
>
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 3939511
>   common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java d71ebb3
>   common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java fbb2aa0
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java 23564bb
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java 0df82b9
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestVectorizedORCReader.java 0d5b7ff
>
> Diff: https://reviews.apache.org/r/17005/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Jitendra Pandey
>
>
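
For reference, the scale check discussed in the reply (accept the per-value scale that the encoding provides, but treat any disagreement with the schema scale as corruption) could be sketched roughly as follows. This is only an illustration under stated assumptions: `DecimalScaleCheck` and `readColumn` are hypothetical names that do not exist in Hive or ORC, and plain arrays stand in for the value and scale streams that the real reader decodes.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration only; not the code under review in RecordReaderImpl.
class DecimalScaleCheck {

    /**
     * Builds decimals from parallel arrays of unscaled values and per-value
     * scales (stand-ins for the ORC value and scale streams), and fails fast
     * if any per-value scale differs from the scale declared in the schema.
     */
    public static BigDecimal[] readColumn(BigInteger[] unscaledValues,
                                          int[] scales,
                                          int schemaScale) {
        if (unscaledValues.length != scales.length) {
            throw new IllegalArgumentException(
                "value and scale arrays differ in length");
        }
        BigDecimal[] result = new BigDecimal[unscaledValues.length];
        for (int i = 0; i < unscaledValues.length; i++) {
            // The encoding permits any scale here; the schema does not.
            if (scales[i] != schemaScale) {
                throw new IllegalStateException(
                    "row " + i + ": scale " + scales[i]
                    + " does not match schema scale " + schemaScale);
            }
            result[i] = new BigDecimal(unscaledValues[i], scales[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        BigInteger[] values = {BigInteger.valueOf(12345), BigInteger.valueOf(1050)};
        int[] scales = {2, 2};
        for (BigDecimal d : readColumn(values, scales, 2)) {
            System.out.println(d); // prints 123.45, then 10.50
        }
    }
}
```

With a check like this, a single scale can be stored on the column vector once, because every value that reaches the vector is known to share it.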