[ https://issues.apache.org/jira/browse/PARQUET-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16819332#comment-16819332 ]

Fan Mo commented on PARQUET-1563:
---------------------------------

I'm not trying to write the date datatype; I'm trying to read a date column that was 
written by a Spark job. The reader code automatically converts the date value to an 
Int and returns a meaningless number.
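A minimal sketch of working around this, assuming the file has a single date column 
(named "event_date" here purely for illustration) that Spark wrote as INT32 annotated 
with the DATE logical type. Parquet stores DATE as the number of days since the Unix 
epoch (1970-01-01), so 115 corresponds to 1970-04-26; the int just needs to be 
converted back, e.g. with java.time.LocalDate.ofEpochDay:

import java.time.LocalDate
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile
import org.apache.parquet.io.ColumnIOFactory
import org.apache.parquet.schema.OriginalType

val reader = ParquetFileReader.open(
  HadoopInputFile.fromPath(new Path("testfile.snappy.parquet"), new Configuration()))
val schema = reader.getFooter.getFileMetaData.getSchema

// The DATE annotation is visible on the column's type in the footer schema
// ("event_date" is an assumed column name, not from the original report).
val isDate = schema.getType("event_date").getOriginalType == OriginalType.DATE

var pages = reader.readNextRowGroup()
while (pages != null) {
  val columnIO = new ColumnIOFactory().getColumnIO(schema)
  val recordReader = columnIO.getRecordReader(pages, new GroupRecordConverter(schema))
  (0L until pages.getRowCount).foreach { _ =>
    val group = recordReader.read()
    val days = group.getInteger("event_date", 0)   // e.g. 115
    if (isDate) println(LocalDate.ofEpochDay(days.toLong))  // 1970-04-26
    else println(days)
  }
  pages = reader.readNextRowGroup()
}
reader.close()

The low-level example API hands back the raw physical value, so the conversion from 
days-since-epoch to a calendar date has to be done by the caller; Spark performs that 
conversion internally, which is why it shows the correct date.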

> cannot read 'date' datatype which write by spark
> ------------------------------------------------
>
>                 Key: PARQUET-1563
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1563
>             Project: Parquet
>          Issue Type: Bug
>         Environment: jdk: 1.8
> macOS Mojave 10.14.4
>            Reporter: Fan Mo
>            Priority: Major
>
> I'm using Spark 2.4.0 to write a Parquet file and am trying to use 
> parquet-column-1.10.jar to read the data. All the primitive datatypes 
> work, but for the date datatype I get a meaningless number. For 
> example, the input date is '1970-04-26' and the output value is '115'. If I use 
> Spark to read the data, it returns the correct date.
> The following is my reader code:
> val reader = ParquetFileReader.open(
>   HadoopInputFile.fromPath(new Path("testfile.snappy.parquet"), new Configuration()))
> val schema = reader.getFooter.getFileMetaData.getSchema
> var pages: PageReadStore = reader.readNextRowGroup()
> while (pages != null) {
>   val rows = pages.getRowCount
>   val columnIO = new ColumnIOFactory().getColumnIO(schema)
>   val recordReader = columnIO.getRecordReader(pages, new GroupRecordConverter(schema))
>   (0L until rows).foreach { _: Long =>
>     val simpleGroup = recordReader.read()
>     println(simpleGroup)
>   }
>   pages = reader.readNextRowGroup()
> }



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
