David Milicevic created SPARK-55441:
---------------------------------------
Summary: Phase 1b - Client Integration
Key: SPARK-55441
URL: https://issues.apache.org/jira/browse/SPARK-55441
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.2.0
Reporter: David Milicevic
*Summary:*
Add client-side Ops interfaces and integrate with Spark Connect, Arrow, JDBC,
Python, and Thrift
*Description:*
Extend the framework with client-facing operations and integrate with all
client communication paths.
*What this includes:*
* New interfaces: {{ProtoTypeOps}} (Spark Connect proto serialization),
{{ClientTypeOps}} (JDBC, Arrow, Python, Thrift)
* Extend TimeType Ops with client operation implementations
* Integration in ~13 files: {{DataTypeProtoConverter}},
{{LiteralValueProtoConverter}}, {{ArrowUtils}}, {{ArrowWriter}},
{{ArrowSerializer}}, {{ArrowVectorReader}}, {{JdbcTypeUtils}},
{{SparkExecuteStatementOperation}}, {{HiveResult}},
{{EvaluatePython}}, {{DeserializerBuildHelper}},
{{SerializerBuildHelper}}, {{udaf.scala}}
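As a rough illustration of the two interfaces listed above, the sketch below shows one possible shape. Every method name and signature here is hypothetical and not taken from the Spark code base. The {{DataType}} stand-ins are simplified, and the internal TIME value is assumed to be nanoseconds since midnight.

```scala
// Hypothetical sketch: none of these signatures come from the Spark code base.
object ClientOpsSketch {
  // Simplified stand-ins for Spark's DataType hierarchy.
  sealed trait DataType
  final case class TimeType(precision: Int) extends DataType
  case object IntegerType extends DataType

  // Proto-facing operations (Spark Connect serialization). Assumed shape.
  trait ProtoTypeOps {
    // Name of the field that carries this type in a proto message.
    def protoFieldName: String
    // Converts an internal value to the value stored in a proto literal.
    def toProtoLiteral(internal: Long): Long
  }

  // Operations for the other client paths (JDBC, Arrow, Python, Thrift). Assumed shape.
  trait ClientTypeOps {
    def arrowVectorName: String
    def thriftTypeName: String
    def toClientString(internal: Long): String
  }

  // One Ops class per type implements both interfaces.
  final class TimeTypeOps(t: TimeType) extends ProtoTypeOps with ClientTypeOps {
    def protoFieldName: String = "time"
    def toProtoLiteral(internal: Long): Long = internal
    def arrowVectorName: String = "TimeNanoVector"
    def thriftTypeName: String = "STRING_TYPE"

    // Assumes the internal value is nanoseconds since midnight.
    def toClientString(nanos: Long): String = {
      val secs = nanos / 1000000000L
      val h = secs / 3600
      val m = (secs % 3600) / 60
      val s = secs % 60
      val base = f"$h%02d:$m%02d:$s%02d"
      if (t.precision <= 0) base
      else {
        // Keep only as many fractional digits as the type's precision.
        val frac = f"${nanos % 1000000000L}%09d".take(t.precision)
        s"$base.$frac"
      }
    }
  }
}
```

With this shape, a Thrift or JDBC code path would ask the Ops object for a client-side string instead of formatting the value inline.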
*Note:*
Client integration involves many non-obvious patterns beyond
{{case _: TimeType =>}}: Arrow uses {{TimeNanoVector}}, proto uses
{{.hasTime()}}/{{.getTime()}}, and Thrift uses {{TTypeId.STRING_TYPE}}.
The framework consolidates all of these behind a single Ops interface.
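To make the consolidation concrete, here is a minimal sketch of how an integration point might consult the framework before falling back to its existing pattern match. The lookup function {{clientOpsFor}}, the fallback behaviour, and the string stand-ins for {{TTypeId}} values are all assumptions made for illustration. The actual dispatch mechanism is not described in this ticket.

```scala
// Hypothetical sketch: the lookup and fallback shown here are assumptions.
object OpsDispatchSketch {
  // Simplified stand-ins for Spark's DataType hierarchy.
  sealed trait DataType
  case object TimeType extends DataType
  case object IntegerType extends DataType
  case object StringType extends DataType

  // Reduced to one method to keep the example short.
  trait ClientTypeOps {
    def thriftTypeName: String
  }

  object TimeTypeOps extends ClientTypeOps {
    def thriftTypeName: String = "STRING_TYPE"
  }

  // Single lookup point: types handled by the framework return Some(ops).
  def clientOpsFor(dt: DataType): Option[ClientTypeOps] = dt match {
    case TimeType => Some(TimeTypeOps)
    case _ => None
  }

  // An integration point asks the framework first and keeps its
  // existing match for types the framework does not cover.
  def thriftTypeName(dt: DataType): String =
    clientOpsFor(dt).map(_.thriftTypeName).getOrElse {
      dt match {
        case IntegerType => "INT_TYPE"
        case StringType => "STRING_TYPE"
        case other =>
          throw new IllegalArgumentException(s"Unsupported type: $other")
      }
    }
}
```

Under this arrangement the type-specific knowledge lives in one Ops object, and each of the integration files needs only the lookup call rather than its own {{case _: TimeType =>}} branch.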