[ https://issues.apache.org/jira/browse/SAMZA-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739217#comment-13739217 ]
Chris Riccomini commented on SAMZA-4:
-------------------------------------
Throwing an error by default seems pretty reasonable in KafkaSystemFactory. If
people *really* want to get at the byte array, they can just use a pass-through
serde.
One minor tweak to your statement would be that all non-changelog streams
should throw an error. Having a changelog with no serde seems OK, since we're
trying to avoid double serialization in that case. An alternative would be to
default changelogs to pass-through serdes, but that's effectively the same as
just allowing them to not have a serde.
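
To make the proposed default concrete, here is a minimal sketch of the check, assuming serdes are looked up by stream name. The class, method, and parameter names are hypothetical and are not Samza's actual API; a null return stands in for "no serde, pass the object through".

```java
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: SerdeCheck and resolveSerde are hypothetical names.
final class SerdeCheck {
    /**
     * Returns the configured serde name for a stream, or null when the stream
     * is a changelog and may legitimately go without a serde.
     */
    static String resolveSerde(String stream,
                               Map<String, String> configuredSerdes,
                               Set<String> changelogStreams) {
        String serde = configuredSerdes.get(stream);
        if (serde != null) {
            return serde;
        }
        if (changelogStreams.contains(stream)) {
            // Changelog without a serde is allowed: avoids double serialization.
            return null;
        }
        // Every other stream must declare a serde (a pass-through one if raw bytes are wanted).
        throw new IllegalStateException("No serde configured for stream: " + stream);
    }
}
```

With this shape, anyone who wants the raw byte array opts in explicitly by configuring a pass-through serde rather than by leaving the serde undefined.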
> SerdeManager should cache serdes on startup
> -------------------------------------------
>
> Key: SAMZA-4
> URL: https://issues.apache.org/jira/browse/SAMZA-4
> Project: Samza
> Issue Type: Bug
> Reporter: Chris Riccomini
>
> The SerdeManager does a complex set of evaluations to determine which serde
> should be used for a given incoming/outgoing message envelope. For example,
> the ordered list of rules for outgoing keys (toBytes) is:
> 1. Use the key object, itself (don't serialize), if the stream is a changelog
> stream.
> 2. Use the key serializer defined in the envelope, if it's defined.
> 3. Use the stream's key serde defined in config, if defined.
> 4. Use the system's key serde defined in config, if defined.
> 5. Use the key object, itself (don't serialize).
> These rules are evaluated on every incoming/outgoing message right now (it's
> a bunch of if statements). Instead of this, we should just cache the
> appropriate serdes rather than doing the full re-evaluation every time.
> For outgoing messages, we'll need to make sure that we can handle arbitrary
> streams, since an outgoing message can be sent to any (undefined) stream.
> The two ways that I can think to do this are:
> 1. Cache when constructing the SerdeManager.
> 2. Cache in toBytes/fromBytes.
> I'm in favor of #2, since it means we can cache the decisions for outgoing
> message envelopes that are sent to a new (undefined in config) stream, as
> well. We couldn't do this in the constructor because we don't know all
> outgoing streams at that point.
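
As a rough illustration of option #2 in the description above, the sketch below caches the per-stream decision the first time toBytes sees a stream, so streams that are undefined in config are handled too. All names are hypothetical and not Samza's API. One assumption worth calling out: rule 2 depends on the individual envelope, so it is still checked per message, and only the config-derived rules (1, 3, 4, 5) are cached.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Illustrative sketch only: class and method names are hypothetical, not Samza's API.
final class CachingKeySerializer {
    // Cached outcome of the config-derived rules for one stream.
    private static final class Decision {
        final boolean changelog;
        final Function<Object, Object> serde;

        Decision(boolean changelog, Function<Object, Object> serde) {
            this.changelog = changelog;
            this.serde = serde;
        }
    }

    private final Set<String> changelogStreams;                       // "system.stream" names
    private final Map<String, Function<Object, Object>> streamSerdes; // keyed by "system.stream"
    private final Map<String, Function<Object, Object>> systemSerdes; // keyed by system name
    private final Map<String, Decision> cache = new ConcurrentHashMap<>();
    private final AtomicInteger evaluations = new AtomicInteger();    // demo counter

    CachingKeySerializer(Set<String> changelogStreams,
                         Map<String, Function<Object, Object>> streamSerdes,
                         Map<String, Function<Object, Object>> systemSerdes) {
        this.changelogStreams = changelogStreams;
        this.streamSerdes = streamSerdes;
        this.systemSerdes = systemSerdes;
    }

    Object toBytes(String system, String stream,
                   Function<Object, Object> envelopeSerializer, Object key) {
        // Rules 1, 3, 4, 5 are resolved once per stream, on first use.
        Decision d = cache.computeIfAbsent(system + "." + stream, name -> resolve(system, name));
        if (d.changelog) {
            return key;                           // rule 1: changelog, don't serialize
        }
        if (envelopeSerializer != null) {
            return envelopeSerializer.apply(key); // rule 2: per-envelope override
        }
        return d.serde.apply(key);                // rule 3, 4, or 5 from the cache
    }

    private Decision resolve(String system, String name) {
        evaluations.incrementAndGet();
        if (changelogStreams.contains(name)) {
            return new Decision(true, Function.identity());
        }
        Function<Object, Object> serde = streamSerdes.get(name);  // rule 3
        if (serde == null) {
            serde = systemSerdes.get(system);                     // rule 4
        }
        if (serde == null) {
            serde = Function.identity();                          // rule 5
        }
        return new Decision(false, serde);
    }

    // Number of full rule evaluations performed so far.
    int evaluations() {
        return evaluations.get();
    }
}
```

Sending two messages to the same previously unseen stream should run the full rule evaluation only once; the second call is a single map lookup plus the per-envelope check.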