[
https://issues.apache.org/jira/browse/SAMZA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275710#comment-14275710
]
Yi Pan (Data Infrastructure) commented on SAMZA-484:
----------------------------------------------------
Consolidating the comments in SAMZA-482 on nested data structures in tuples
with the generic message API from SAMZA-429, I think we should try to solve
the two tickets together.
Having a generic message API would help to isolate the details of Serde from
the SQL operators. Hence, the Tuple interface would only expose the generic
message APIs.
Regarding the schema for tuples discussed in SAMZA-482, I am thinking that the
generic message API could also include a generic form of schema description,
such as:
{code}
Map<String, Type> schema = envelope.getGenericMessageSchema();
{code}
The {code}Type{code} would itself take one of the following three generic
forms:
# primitive types, e.g. string, integer, boolean
# object, which is described by {code}Map<String, Type>{code}
# array, which is described by {code}Type[]{code}
That would allow us to get the schema of each tuple directly from the message,
without needing to include a separate DDL.
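To make the three forms concrete, here is a rough Java sketch of how such a
schema description might be modelled. All names in it (SchemaSketch, Kind, the
factory methods on Type, exampleSchema) are hypothetical and are not part of
Samza or of any existing API. Reading Type[] as the list of element types of an
array is also an assumption on my part.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only: none of these names exist in Samza.
final class SchemaSketch {

    /** The three kinds of type listed above. */
    enum Kind { PRIMITIVE, OBJECT, ARRAY }

    /** Generic type descriptor; only the part matching 'kind' is non-null. */
    static final class Type {
        final Kind kind;
        final String primitiveName;      // PRIMITIVE only, e.g. "string"
        final Map<String, Type> fields;  // OBJECT only: field name -> field type
        final Type[] elements;           // ARRAY only (assumed: element types)

        private Type(Kind kind, String primitiveName,
                     Map<String, Type> fields, Type[] elements) {
            this.kind = kind;
            this.primitiveName = primitiveName;
            this.fields = fields;
            this.elements = elements;
        }

        static Type primitive(String name) {
            return new Type(Kind.PRIMITIVE, name, null, null);
        }

        static Type object(Map<String, Type> fields) {
            // Copy into a LinkedHashMap so field order is kept and callers
            // cannot mutate the schema afterwards.
            Map<String, Type> copy =
                Collections.unmodifiableMap(new LinkedHashMap<>(fields));
            return new Type(Kind.OBJECT, null, copy, null);
        }

        static Type array(Type... elements) {
            return new Type(Kind.ARRAY, null, null, elements.clone());
        }

        /** Renders the type recursively, e.g. "{city: string, zip: integer}". */
        String describe() {
            StringBuilder sb = new StringBuilder();
            String sep = "";
            switch (kind) {
                case PRIMITIVE:
                    return primitiveName;
                case OBJECT:
                    sb.append("{");
                    for (Map.Entry<String, Type> e : fields.entrySet()) {
                        sb.append(sep).append(e.getKey()).append(": ")
                          .append(e.getValue().describe());
                        sep = ", ";
                    }
                    return sb.append("}").toString();
                default: // ARRAY
                    sb.append("[");
                    for (Type t : elements) {
                        sb.append(sep).append(t.describe());
                        sep = ", ";
                    }
                    return sb.append("]").toString();
            }
        }
    }

    /** Builds a small nested schema of the Map<String, Type> shape above. */
    static Map<String, Type> exampleSchema() {
        Map<String, Type> address = new LinkedHashMap<>();
        address.put("city", Type.primitive("string"));
        address.put("zip", Type.primitive("integer"));

        Map<String, Type> schema = new LinkedHashMap<>();
        schema.put("name", Type.primitive("string"));
        schema.put("active", Type.primitive("boolean"));
        schema.put("address", Type.object(address));
        schema.put("tags", Type.array(Type.primitive("string")));
        return schema;
    }
}
```

With something like this, an operator could walk the Map<String, Type> of a
tuple to discover nested fields (here, address.city) without knowing which
Serde produced the message.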
> Define the serialization/deserialization format for stream tuple
> ----------------------------------------------------------------
>
> Key: SAMZA-484
> URL: https://issues.apache.org/jira/browse/SAMZA-484
> Project: Samza
> Issue Type: Sub-task
> Reporter: Yi Pan (Data Infrastructure)
> Priority: Minor
> Labels: project
>
> It came out in the discussion for streaming SQL that we will need to define
> the serialization/deserialization format for stream tuple.
> The ideal serialization/deserialization format should allow both forward and
> backward compatibility on additional/missing fields in the data.
> Several choices to be considered:
> 1) Avro
> 2) Protobuf
> 3) Flatbuffer
> It might also be interesting to consider a pluggable serialization interface
> that allows different serialization methods for different Samza jobs.