Maybe check to see which version of Avro is bundled with your deployment of Spark?
On Thu, May 31, 2018 at 3:26 AM Mohit <mohit.j...@open-insights.co.in> wrote:

> Hi Mike,
>
> I have created the Hive external table on top of the Parquet files and am able to read it from Hive.
>
> While querying Hive from Spark, these are the errors:
>
> For the decimal type (in Hive, the data type is decimal(12,5)):
>
> Caused by: org.apache.spark.sql.execution.QueryExecutionException: Parquet column cannot be converted in file hdfs://ip-10-0-0-216.ap-south-1.compute.internal:8020/user/hermes/nifi_test1/test_pqt2_dcm/4963966040134. Column: [dc_type], Expected: DecimalType(12,5), Found: BINARY
>         at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:192)
>         at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
>
> For the time, which was converted into TIME_MILLIS (in Hive, the data type is int):
>
> Caused by: org.apache.spark.sql.AnalysisException: Parquet type not yet supported: INT32 (TIME_MILLIS);
>         at org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter.typeNotImplemented$1(ParquetSchemaConverter.scala:105)
>         at org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter.convertPrimitiveField(ParquetSchemaConverter.scala:141)
>
> Thanks,
> Mohit
>
> From: Mike Thomsen <mikerthom...@gmail.com>
> Sent: 30 May 2018 17:28
> To: users@nifi.apache.org
> Subject: Re: Unable to read the Parquet file written by NiFi through Spark when Logical Data Type is set to true.
>
> What's the error from Spark? Logical data types are just a variant on existing data types in Avro 1.8.
>
> On Wed, May 30, 2018 at 7:54 AM Mohit <mohit.j...@open-insights.co.in> wrote:
>
> > Hi all,
> >
> > I'm fetching the data from an RDBMS and writing it to Parquet using the PutParquet processor. I'm not able to read the data from Spark when Logical Data Type is set to true. I'm able to read it from Hive.
> >
> > Do I have to set some specific properties in the PutParquet processor to make it readable from Spark as well?
> >
> > Regards,
> > Mohit
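For context on what the two errors refer to: Parquet's DECIMAL logical type annotates a physical BINARY (or FIXED_LEN_BYTE_ARRAY/INT) value holding the unscaled integer as big-endian two's-complement bytes, and TIME_MILLIS annotates an INT32 holding milliseconds since midnight. A minimal sketch of how those raw physical values map back to the logical values (pure Python, no Spark; the function names are made up for illustration):

```python
from datetime import time
from decimal import Decimal


def decode_parquet_decimal(raw: bytes, scale: int) -> Decimal:
    """DECIMAL stores the unscaled integer as big-endian
    two's-complement bytes; apply the declared scale to recover it."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


def decode_time_millis(millis: int) -> time:
    """TIME_MILLIS stores milliseconds since midnight in an INT32."""
    seconds, ms = divmod(millis, 1000)
    minutes, s = divmod(seconds, 60)
    h, m = divmod(minutes, 60)
    return time(h, m, s, ms * 1000)


# A decimal(12,5) value 123.45678 is stored as the unscaled integer 12345678:
print(decode_parquet_decimal((12345678).to_bytes(4, "big", signed=True), 5))  # 123.45678
# 34,200,000 ms after midnight:
print(decode_time_millis(34_200_000))  # 09:30:00
```

This is only meant to illustrate the encoding the stack traces are complaining about: Spark's vectorized Parquet reader in the version reported here refuses the BINARY-backed decimal and has no converter for INT32 (TIME_MILLIS), even though the bytes themselves are decodable as above.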