Yes, they seem to be valid. Thank you, Xu. I will be validating the data in 35 tables next week and will report back when I have the results.

Regards,
Pratik

On Sun, Aug 31, 2014 at 8:26 AM, Xu, Qian A <[email protected]> wrote:

> Hi Pratik,
>
> If reopen the file reader can solve the problem, can I come to a
> conclusion that the exported Parquet files are valid?
>
> Best regards
> --Qian Xu (Stanley)
>
> *From:* pratik khadloya [mailto:[email protected]]
> *Sent:* Friday, August 29, 2014 3:46 AM
> *To:* [email protected]
> *Subject:* Re: Issue with reading parquet file exported by sqoop
>
> Strangely enough another version of my reader works
> https://gist.github.com/tispratik/f7a66f6a40b7ae3b98ad
>
> The difference is that i have to re-open the file again when i read a new
> column.
>
> The reopening happens through the following line:
>
> ParquetFileReader fileReader = new ParquetFileReader(conf, filePath, blocks, schema.getColumns());
>
> which i am calling in a loop where i am looping over column descriptors.
>
> ~Pratik
>
> On Thu, Aug 28, 2014 at 11:49 AM, pratik khadloya <[email protected]>
> wrote:
>
> This issue only occurs for some columns and that too after reading a few
> thousand records.
>
> ~Pratik
>
> On Thu, Aug 28, 2014 at 11:48 AM, pratik khadloya <[email protected]>
> wrote:
>
> Hello,
>
> I am facing the following exception when reading a parquet file exported
> by sqoop.
>
> My parquet column reader code is at
> https://gist.github.com/tispratik/f0044dd84dc8d8c6cbcf
>
> Exception in thread "main" parquet.io.ParquetDecodingException: Can't read value in column [description] BINARY at value 44899 out of 57096, 44899 out of 57096 in currentPage. repetition level: 0, definition level: 1
>     at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:450)
>     at parquet.column.impl.ColumnReaderImpl.getBinary(ColumnReaderImpl.java:398)
>     at com.rocketfuel.grid.lookup_new.RfiParquetFileReader.load(RfiParquetFileReader.java:147)
>     at com.rocketfuel.grid.lookup_new.RfiParquetFileReader.<init>(RfiParquetFileReader.java:87)
>     at com.rocketfuel.grid.lookup_new.RfiParquetFileReader.main(RfiParquetFileReader.java:114)
> Caused by: java.lang.IllegalArgumentException: Reading past RLE/BitPacking stream.
>     at parquet.Preconditions.checkArgument(Preconditions.java:47)
>     at parquet.column.values.rle.RunLengthBitPackingHybridDecoder.readNext(RunLengthBitPackingHybridDecoder.java:80)
>     at parquet.column.values.rle.RunLengthBitPackingHybridDecoder.readInt(RunLengthBitPackingHybridDecoder.java:62)
>     at parquet.column.values.dictionary.DictionaryValuesReader.readBytes(DictionaryValuesReader.java:82)
>     at parquet.column.impl.ColumnReaderImpl$2$6.read(ColumnReaderImpl.java:295)
>     at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:446)
>     ... 4 more
>
> Does anyone know what this could be related to? What i could be doing
> wrong?
>
> Thanks,
> ~Pratik
