AngersZhuuuu commented on pull request #34308:
URL: https://github.com/apache/spark/pull/34308#issuecomment-962397519


   > Oops sorry @AngersZhuuuu . Maybe we can try the following then:
   > 
   > ```scala
   >   test("XXX") {
   >     val data = (0 to 10).flatMap(n => Seq.fill(10)(n)).map(i => (i, i.toString))
   >     withParquetFile(data) { dir =>
   >       val file = SpecificParquetRecordReaderBase.listDirectory(new File(dir)).get(0)
   >       val filePath = new Path(file)
   >       val reader = ParquetFileReader.open(HadoopInputFile.fromPath(filePath, new Configuration))
   >       try {
   >         val descriptor = reader.getFileMetaData.getSchema.getColumns.get(0)
   >         val pages = reader.readNextRowGroup().getPageReader(descriptor)
   > 
   >         val dictionaryPage = pages.readDictionaryPage()
   >         assert(dictionaryPage != null, "dictionaryPage shouldn't be null")
   >         val dictionary = dictionaryPage.getEncoding.initDictionary(descriptor, dictionaryPage)
   >         val parquetDictionary = new ParquetDictionary(dictionary, file, true)
   >         val msg = intercept[UnsupportedOperationException] {
   >           parquetDictionary.decodeToInt(0)
   >         }.getMessage
   >         assert(msg.contains("Decoding to Int is not supported"))
   >       } finally {
   >         reader.close()
   >       }
   >     }
   >   }
   > ```
   
   Thanks a lot, this approach can trigger the issue. I had tried these APIs before but failed to get them working. I have updated the PR and added you as a co-author.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


