David Milicevic created SPARK-55444:
---------------------------------------
Summary: Types Framework - Phase 3 - Storage Formats
Key: SPARK-55444
URL: https://issues.apache.org/jira/browse/SPARK-55444
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.2.0
Reporter: David Milicevic
*Summary:*
Add storage format support to the Types Framework.
*Description:*
Extend the framework to cover storage format integration points (Parquet, ORC,
Avro, CSV, JSON, XML, columnar caching).
*What this includes:*
* New interface(s) for storage format operations (schema conversion,
read/write support)
* Integration in ~15+ files across Parquet ({{ParquetSchemaConverter}},
{{ParquetRowConverter}}, {{ParquetWriteSupport}}), ORC
({{OrcSerializer}}, {{OrcDeserializer}}, {{OrcUtils}}), Avro
({{AvroSerializer}}, {{AvroDeserializer}}, {{SchemaConverters}}),
CSV ({{UnivocityParser}}, {{UnivocityGenerator}}), JSON
({{JacksonParser}}, {{JacksonGenerator}}), XML ({{StaxXmlParser}},
{{StaxXmlGenerator}}), and columnar caching ({{ColumnBuilder}},
{{ColumnAccessor}}, {{ColumnType}}, {{GenerateColumnAccessor}})
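A rough illustration of what a per-type storage format interface could look like, written in Python for brevity (the Spark classes listed in this ticket are Scala). This is only a sketch under assumptions: {{StorageFormatOps}}, {{register}}, {{lookup}}, {{convert_schema}}, and the "celsius" type are hypothetical names invented for the example and are not taken from the design doc or from Spark.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class StorageFormatOps:
    """Hypothetical per-type hooks that a storage format would consult."""
    physical_type: str            # schema conversion: type name used in the file schema
    write: Callable[[Any], Any]   # write support: in-memory value -> stored value
    read: Callable[[Any], Any]    # read support: stored value -> in-memory value


# Hypothetical registry keyed by (format name, logical type name).
_REGISTRY: Dict[Tuple[str, str], StorageFormatOps] = {}


def register(fmt: str, logical_type: str, ops: StorageFormatOps) -> None:
    """Attach storage hooks for one logical type in one format."""
    _REGISTRY[(fmt, logical_type)] = ops


def lookup(fmt: str, logical_type: str) -> Optional[StorageFormatOps]:
    """Return the hooks if the type opted in, else None (caller falls back)."""
    return _REGISTRY.get((fmt, logical_type))


def convert_schema(fmt: str, fields: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Map (name, logical type) pairs to (name, physical type) pairs.

    Types without registered hooks keep their logical name, standing in for
    whatever built-in handling the format already has.
    """
    converted = []
    for name, logical_type in fields:
        ops = lookup(fmt, logical_type)
        converted.append((name, ops.physical_type if ops else logical_type))
    return converted


# Made-up example type: temperatures stored as integer tenths of a degree.
register(
    "parquet",
    "celsius",
    StorageFormatOps(
        physical_type="INT32",
        write=lambda v: round(v * 10),
        read=lambda stored: stored / 10,
    ),
)
```

The point of the sketch is the lookup-then-fallback shape: each format's converter asks one place whether a type supplies its own schema and read/write behavior, instead of each of the listed files carrying its own per-type branch. Whether the actual framework uses a registry, a trait on the type, or something else is not stated in this ticket.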
*Design doc:*
Linked in the parent work item.