[
https://issues.apache.org/jira/browse/FLINK-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064567#comment-16064567
]
ASF GitHub Bot commented on FLINK-6983:
---------------------------------------
Github user kl0u commented on the issue:
https://github.com/apache/flink/pull/4172
@dianfu, I am also including @wuchong on this, as the two are related.
The way I see it is that by not serializing the condition and the states,
you are trying to gain some speed, especially when using RocksDB where you
serialize/deserialize on every element, right? My suggestion is to not do these
optimizations yet.
First, this seems like a premature optimization to me, as we are
not yet sure about the interplay between all the features we are planning to
put in `CEP`, and we know that if we allow users to add `Patterns` at runtime,
then we will need 1) to store both States and Conditions and 2) to match the
States and Conditions of a given NFA with its SharedBuffer. In other words, we
will need a unique Id for each NFA that matches the restored SharedBuffer
(which is still serialized and deserialized as before) with the States
(`metastates` in this PR) and Conditions (`ConditionRegistry` in
https://github.com/apache/flink/pull/4145) of the NFA.
So I propose to implement
https://issues.apache.org/jira/browse/FLINK-7008 and
https://issues.apache.org/jira/browse/FLINK-6938 right away so that we can
proceed with the SQL integration, and to think about a more general solution
for checkpointing the static state of the NFA (states and conditions)
separately from the dynamic one (SharedBuffer), which will lead to runtime gains.
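
To make the static/dynamic split concrete, here is a rough sketch of how a unique NFA Id could tie a restored SharedBuffer back to its States and Conditions. This is only an illustration under assumptions: `NfaStaticRegistry`, `StaticPart`, and `DynamicNfaState` are hypothetical names, not classes from Flink or from either PR, and plain strings stand in for states, conditions, and buffered events.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry for the static part of each NFA, keyed by a unique id.
// Not a Flink class; it only illustrates the lookup discussed above.
final class NfaStaticRegistry {

    // Static part: state names and conditions (strings as stand-ins).
    static final class StaticPart {
        final List<String> stateNames;
        final Map<String, String> conditions;

        StaticPart(List<String> stateNames, Map<String, String> conditions) {
            this.stateNames = stateNames;
            this.conditions = conditions;
        }
    }

    private final Map<String, StaticPart> partsById = new HashMap<>();

    void register(String nfaId, StaticPart part) {
        partsById.put(nfaId, part);
    }

    StaticPart lookup(String nfaId) {
        StaticPart part = partsById.get(nfaId);
        if (part == null) {
            throw new IllegalStateException("No static part registered for NFA id " + nfaId);
        }
        return part;
    }
}

// Hypothetical dynamic part: only the NFA id and the buffered events are
// written per element; states and conditions are never serialized here.
final class DynamicNfaState {
    final String nfaId;
    final List<String> sharedBuffer;

    DynamicNfaState(String nfaId, List<String> sharedBuffer) {
        this.nfaId = nfaId;
        this.sharedBuffer = sharedBuffer;
    }

    byte[] serialize() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(nfaId);
            out.writeInt(sharedBuffer.size());
            for (String event : sharedBuffer) {
                out.writeUTF(event);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static DynamicNfaState deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String nfaId = in.readUTF();
            int size = in.readInt();
            List<String> buffer = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                buffer.add(in.readUTF());
            }
            return new DynamicNfaState(nfaId, buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On restore, the operator would deserialize the dynamic part and call `lookup(restored.nfaId)` to reattach the States and Conditions. How the Id is generated and how the registry itself is checkpointed are left open here, as they are in the discussion above.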
> Do not serialize States with NFA
> --------------------------------
>
> Key: FLINK-6983
> URL: https://issues.apache.org/jira/browse/FLINK-6983
> Project: Flink
> Issue Type: Improvement
> Components: CEP
> Reporter: Dawid Wysakowicz
> Assignee: Dian Fu
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)