> Hello,
> Tahsin and I are trying to use the Apache Parquet file format with Spark
> SQL, but are running into errors when reading Parquet files that contain
> TimeType columns. We're wondering whether this is unsupported in Spark SQL
> due to an architectural limitation, or due to lack of resources?
> Context: When reading some Parquet files with Spark, we get an error
> message like the following:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> in stage 186.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> 186.0 (TID 1970,, executor 1): Could
> not read or convert schema for file:
> dbfs:/test/randomdata/sample001.parquet
> ...
> Caused by: org.apache.spark.sql.AnalysisException: Illegal Parquet type:
> at
> org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter.illegalType$1(ParquetSchemaConverter.scala:106)
> This only seems to occur with Parquet files that have a column of type
> "TimeType" (or the deprecated "TIME_MILLIS"/"TIME_MICROS"). After digging
> into this a bit, we think that the error
> message is coming from "ParquetSchemaConverter.scala".
> This seems to imply that the Spark SQL engine does not support reading
> Parquet files with TimeType columns.
> We are wondering if anyone on the mailing list could shed some more light
> on this: are there architectural/datatype limitations in Spark that are
> resulting in this error, or is TimeType support for Parquet files something
> that hasn't been implemented yet due to lack of resources/interest?
> Thanks,
