David Milicevic created SPARK-55444:
---------------------------------------

             Summary: Types Framework - Phase 3 - Storage Formats
                 Key: SPARK-55444
                 URL: https://issues.apache.org/jira/browse/SPARK-55444
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.2.0
            Reporter: David Milicevic


*Summary:*

Add storage format support to the Types Framework.

*Description:*

Extend the framework to cover storage format integration points (Parquet, ORC, 
Avro, CSV, JSON, XML, columnar caching).

*What this includes:*
 * New interface(s) for storage format operations (schema conversion, 
read/write support)
 * Integration in ~15+ files across:
 ** Parquet: {{ParquetSchemaConverter}}, {{ParquetRowConverter}}, 
{{ParquetWriteSupport}}
 ** ORC: {{OrcSerializer}}, {{OrcDeserializer}}, {{OrcUtils}}
 ** Avro: {{AvroSerializer}}, {{AvroDeserializer}}, {{SchemaConverters}}
 ** CSV: {{UnivocityParser}}, {{UnivocityGenerator}}
 ** JSON: {{JacksonParser}}, {{JacksonGenerator}}
 ** XML: {{StaxXmlParser}}, {{StaxXmlGenerator}}
 ** Columnar caching: {{ColumnBuilder}}, {{ColumnAccessor}}, {{ColumnType}}, 
{{GenerateColumnAccessor}}
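
*Illustrative sketch (not the proposed API):*

To make the scope above more concrete, here is a minimal Python sketch of how a per-type storage format interface might be shaped. It is only a language-neutral illustration under stated assumptions: the actual interface is defined in the design doc and would live in Spark's Scala code. Every name here ({{StorageFormatOps}}, {{MicrosTimeOps}}, {{physical_schema}}, {{write_value}}, {{read_value}}, {{REGISTRY}}, {{ops_for}}) is hypothetical, as is the choice of an int64 physical type for binary formats and a string for text formats.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class StorageFormatOps(ABC):
    """Hypothetical per-type hook covering schema conversion and read/write."""

    @abstractmethod
    def physical_schema(self, fmt: str) -> str:
        """Schema conversion: name the physical type used by format `fmt`."""

    @abstractmethod
    def write_value(self, fmt: str, value: Any) -> Any:
        """Write support: convert the in-memory value to its stored form."""

    @abstractmethod
    def read_value(self, fmt: str, raw: Any) -> Any:
        """Read support: convert the stored form back to the in-memory value."""


class MicrosTimeOps(StorageFormatOps):
    """Illustrative type: time of day held as microseconds since midnight."""

    # Assumption for this sketch: binary formats store the integer directly,
    # text formats (CSV, JSON, XML) store an HH:MM:SS.ffffff string.
    _BINARY = {"parquet", "orc", "avro"}

    def physical_schema(self, fmt: str) -> str:
        return "int64" if fmt in self._BINARY else "string"

    def write_value(self, fmt: str, value: int) -> Any:
        if fmt in self._BINARY:
            return value
        secs, micros = divmod(value, 1_000_000)
        hours, rem = divmod(secs, 3600)
        minutes, seconds = divmod(rem, 60)
        return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{micros:06d}"

    def read_value(self, fmt: str, raw: Any) -> int:
        if fmt in self._BINARY:
            return int(raw)
        hms, micros = raw.split(".")
        hours, minutes, seconds = (int(part) for part in hms.split(":"))
        return (hours * 3600 + minutes * 60 + seconds) * 1_000_000 + int(micros)


# Hypothetical registry: each format integration point looks up the ops for a
# type by name instead of hard-coding a per-type branch.
REGISTRY: Dict[str, StorageFormatOps] = {"time": MicrosTimeOps()}


def ops_for(type_name: str) -> StorageFormatOps:
    try:
        return REGISTRY[type_name]
    except KeyError:
        raise ValueError(f"no storage format ops registered for {type_name!r}")
```

In this sketch, a CSV writer would call {{ops_for("time").write_value("csv", v)}} and a Parquet schema converter would call {{ops_for("time").physical_schema("parquet")}}, so adding a new type means registering one object rather than editing each format's files.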

*Design doc:*

Linked in the parent work item.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
