Fan Mo created PARQUET-1563:
-------------------------------

             Summary: cannot read 'date' datatype which write by spark
                 Key: PARQUET-1563
                 URL: https://issues.apache.org/jira/browse/PARQUET-1563
             Project: Parquet
          Issue Type: Bug
         Environment: jdk: 1.8

macOS Mojave 10.14.4
            Reporter: Fan Mo


I'm using Spark 2.4.0 to write a Parquet file and trying to use 
parquet-column-1.10.jar to read the data. All the primitive datatypes work, 
but for the date datatype I get a meaningless number. For example, the input 
date is '1970-04-26' and the output is '115'. If I use Spark to read the data, 
I get the correct date. 

The following is my reader code:

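import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.column.page.PageReadStore
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile
import org.apache.parquet.io.ColumnIOFactory

val reader = ParquetFileReader.open(
  HadoopInputFile.fromPath(new Path("testfile.snappy.parquet"), new Configuration()))
val schema = reader.getFooter.getFileMetaData.getSchema
val columnIO = new ColumnIOFactory().getColumnIO(schema)
// In Scala an assignment evaluates to Unit, so fetch the row group before the null check.
var pages: PageReadStore = reader.readNextRowGroup()
while (pages != null) {
  val rows = pages.getRowCount
  val recordReader = columnIO.getRecordReader(pages, new GroupRecordConverter(schema))
  (0L until rows).foreach { _ =>
    val simpleGroup = recordReader.read()
    println(simpleGroup)
  }
  pages = reader.readNextRowGroup() // advance to the next row group
}
reader.close()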
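
Note: the value 115 is consistent with Parquet's DATE logical type, which 
annotates an INT32 holding the number of days since 1970-01-01. The sketch 
below assumes that interpretation and uses only java.time. The helper name 
epochDayToDate is hypothetical and is not part of any Parquet API.

```scala
import java.time.LocalDate

// Hypothetical helper: interpret a raw INT32 read from a DATE column
// as a count of days since the Unix epoch (1970-01-01).
def epochDayToDate(days: Int): LocalDate =
  LocalDate.ofEpochDay(days.toLong)

// The value from the report maps back to the date that was written.
println(epochDayToDate(115)) // 1970-04-26
```

Under this assumption the reader is returning the stored value unchanged, and 
the conversion to a calendar date has to be applied by the caller.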



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
