Re: Unsupported Catalyst types in Parquet
Sorry! My bad. I had stale Spark jars sitting on the slave nodes...

Alex
Re: Unsupported Catalyst types in Parquet
Gents,

I tried #3820. It doesn't work. I'm still getting the following exceptions:

Exception in thread "Thread-45" java.lang.RuntimeException: Unsupported datatype DateType
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.parquet.ParquetTypesConverter$anonfun$fromDataType$2.apply(ParquetTypes.scala:343)
    at org.apache.spark.sql.parquet.ParquetTypesConverter$anonfun$fromDataType$2.apply(ParquetTypes.scala:292)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.parquet.ParquetTypesConverter$.fromDataType(ParquetTypes.scala:291)
    at org.apache.spark.sql.parquet.ParquetTypesConverter$anonfun$4.apply(ParquetTypes.scala:363)
    at org.apache.spark.sql.parquet.ParquetTypesConverter$anonfun$4.apply(ParquetTypes.scala:362)

I would be more than happy to fix this myself, but I would need some help wading through the code. Could anyone explain to me what exactly is needed to support a new data type in SparkSQL's Parquet storage engine?

Thanks.

Alex
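To Alex's question about what adding a new data type involves: judging from the stack trace, ParquetTypesConverter.fromDataType has no branch for DateType, so at minimum the Catalyst-to-Parquet type mapping and the value read/write paths need one. A natural encoding for dates is an INT32 day count since the Unix epoch. The sketch below (Java for illustration; the class and method names are my own, not Spark's) shows just that value conversion:

```java
import java.time.LocalDate;

// Hypothetical helper: encode/decode a date as the INT32 day count
// that a Parquet DATE column would carry (days since 1970-01-01).
public class DateEncoding {
    public static int toEpochDays(LocalDate d) {
        // toEpochDay() is a long; dates in the INT32 range fit comfortably.
        return Math.toIntExact(d.toEpochDay());
    }

    public static LocalDate fromEpochDays(int days) {
        return LocalDate.ofEpochDay(days);
    }

    public static void main(String[] args) {
        System.out.println(toEpochDays(LocalDate.of(1970, 1, 2))); // 1
    }
}
```

The rest of the work would be wiring this conversion into the converter's read and write paths wherever the unsupported-datatype error above is raised.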
RE: Unsupported Catalyst types in Parquet
By adding a flag in SQLContext, I have modified #3822 to include nanoseconds now. Since passing too many flags is ugly, I now need the whole SQLContext, so that we can put more flags there.

Thanks,
Daoyuan
Re: Unsupported Catalyst types in Parquet
Yeah, I saw those. The problem is that #3822 truncates timestamps that include nanoseconds.
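The truncation Michael describes is easy to reproduce with java.sql.Timestamp, which keeps nanosecond precision separately from its millisecond epoch value: any path that round-trips a timestamp through getTime() alone drops the sub-millisecond part. A minimal Java illustration (my own demo, not Spark's code):

```java
import java.sql.Timestamp;

public class NanosTruncation {
    public static void main(String[] args) {
        Timestamp ts = new Timestamp(0L);
        ts.setNanos(123_456_789);            // full nanosecond precision
        long millis = ts.getTime();          // 123 — only whole milliseconds survive
        Timestamp roundTripped = new Timestamp(millis);
        // roundTripped.getNanos() is 123_000_000, not 123_456_789
        System.out.println(ts.getNanos() - roundTripped.getNanos()); // 456789
    }
}
```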
Re: Unsupported Catalyst types in Parquet
Michael,

Actually, Adrian Wang already created pull requests for these issues:

https://github.com/apache/spark/pull/3820
https://github.com/apache/spark/pull/3822

What do you think?

Alex
Re: Unsupported Catalyst types in Parquet
I'd love to get both of these in. There is some trickiness that I talk about on the JIRA for timestamps, since the SQL timestamp class can support nanoseconds and I don't think Parquet has a type for this. Other systems (Impala) seem to use INT96. It would be great to ask on the Parquet mailing list what the plan is there, to make sure that whatever we do is going to be compatible long term.

Michael
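For reference, the Impala-style INT96 timestamp Michael mentions is usually described as 12 bytes: an 8-byte little-endian nanoseconds-of-day value followed by a 4-byte little-endian Julian day number. A hedged Java sketch of that encoding (my own helper based on that common description, not Spark or Parquet API):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.sql.Timestamp;

public class Int96Timestamp {
    /**
     * Encode a Timestamp into the 12-byte INT96 layout: 8-byte LE
     * nanoseconds-of-day, then 4-byte LE Julian day number.
     */
    public static byte[] toInt96(Timestamp ts) {
        long millis = ts.getTime();
        long epochSeconds = Math.floorDiv(millis, 1000L);
        // 2440588 is the Julian day number of 1970-01-01.
        int julianDay = (int) (2440588L + Math.floorDiv(epochSeconds, 86400L));
        // getNanos() carries the full sub-second fraction, including millis.
        long nanosOfDay = Math.floorMod(epochSeconds, 86400L) * 1_000_000_000L
                + ts.getNanos();
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }

    public static void main(String[] args) {
        byte[] b = toInt96(new Timestamp(0L)); // 1970-01-01T00:00:00 UTC
        System.out.println(b.length);          // 12
    }
}
```

Note that this preserves full nanosecond precision, which is exactly what the plain millisecond encodings cannot do.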
RE: Unsupported Catalyst types in Parquet
Daoyuan,

Thanks for creating the JIRAs. I need these features by... last week, so I'd be happy to take care of this myself, if only you or someone more experienced than me in the SparkSQL codebase could provide some guidance.

Alex
RE: Unsupported Catalyst types in Parquet
Hi Alex,

I'll create JIRA SPARK-4985 for date type support in Parquet, and SPARK-4987 for timestamp type support. For decimal type, I think we only support decimals that fit in a long.

Thanks,
Daoyuan

-----Original Message-----
From: Alessandro Baretta [mailto:alexbare...@gmail.com]
Sent: Saturday, December 27, 2014 2:47 PM
To: dev@spark.apache.org; Michael Armbrust
Subject: Unsupported Catalyst types in Parquet

Michael,

I'm having trouble storing my SchemaRDDs in Parquet format with SparkSQL, due to my RDDs having DateType and DecimalType fields. What would it take to add Parquet support for these Catalyst types? Are there any other Catalyst types for which there is no Parquet support?

Alex
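On Daoyuan's point about decimals that fit in a long: a decimal whose unscaled value fits in 64 bits (precision up to 18 digits) can be stored as a plain long plus a scale. A small Java sketch of that representation (illustrative helper, not Spark's implementation):

```java
import java.math.BigDecimal;

public class DecimalAsLong {
    /**
     * Store a decimal as its unscaled long value; throws
     * ArithmeticException if it does not fit in 64 bits, which
     * matches the "fits in a long" restriction.
     */
    public static long toUnscaledLong(BigDecimal d) {
        return d.unscaledValue().longValueExact();
    }

    public static BigDecimal fromUnscaledLong(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        // "123.45" has unscaled value 12345 and scale 2.
        System.out.println(toUnscaledLong(new BigDecimal("123.45"))); // 12345
    }
}
```

Wider decimals would need an arbitrary-precision byte-array encoding, which is presumably why only the long-backed case is supported so far.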