[ https://issues.apache.org/jira/browse/FLINK-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephan Ewen resolved FLINK-1259.
---------------------------------
Resolution: Fixed
Fix Version/s: 0.9
Assignee: Stephan Ewen
Added description to documentation.
Fixed in e71ee0b7953f7b061f3541c63650651a471cb6b7
> FilterFunction can modify data
> ------------------------------
>
> Key: FLINK-1259
> URL: https://issues.apache.org/jira/browse/FLINK-1259
> Project: Flink
> Issue Type: Bug
> Components: Java API, Optimizer, Scala API
> Affects Versions: 0.7.0-incubating
> Reporter: Fabian Hueske
> Assignee: Stephan Ewen
> Fix For: 0.9
>
>
> The FilterFunction returns a boolean for each input record, which determines
> whether the record is kept or filtered out.
> However, the function can also modify the input record, and the modification
> is visible downstream if the record is not filtered out.
> The optimizer assumes that the data is not changed by a FilterFunction, i.e.,
> it assumes that a Filter preserves physical data properties (orders,
> partitionings, etc.). In the future, Filters might also be pushed down. These
> assumptions can result in semantically incorrect programs if the function
> actually changes its incoming records.
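To make the failure mode concrete, here is a minimal sketch that does not use Flink at all. `SimpleFilter`, `Rec`, and `FilterMutationSketch` are hypothetical stand-ins invented for illustration; the sketch only assumes a runtime that passes the same record objects on after the filter, and an input that is sorted by key.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a filter function, so the sketch runs without Flink.
interface SimpleFilter<T> {
    boolean filter(T value);
}

// Mutable record type, assumed for illustration only.
final class Rec {
    int key;
    Rec(int key) { this.key = key; }
}

final class FilterMutationSketch {

    // Applies the filter and forwards the same objects, as a runtime without copying would.
    static List<Rec> applyFilter(List<Rec> input, SimpleFilter<Rec> f) {
        List<Rec> out = new ArrayList<>();
        for (Rec r : input) {
            if (f.filter(r)) {
                out.add(r);
            }
        }
        return out;
    }

    // Checks the physical property an optimizer would assume is preserved.
    static boolean isSortedByKey(List<Rec> recs) {
        for (int i = 1; i < recs.size(); i++) {
            if (recs.get(i - 1).key > recs.get(i).key) {
                return false;
            }
        }
        return true;
    }

    // Input sorted ascending by key: 1, 2, 3, 4.
    static List<Rec> sortedInput() {
        List<Rec> in = new ArrayList<>();
        for (int k = 1; k <= 4; k++) {
            in.add(new Rec(k));
        }
        return in;
    }

    public static void main(String[] args) {
        // Well-behaved filter: only inspects the record.
        SimpleFilter<Rec> pure = r -> r.key > 1;
        // Misbehaving filter: negates the key of every record it sees.
        SimpleFilter<Rec> mutating = r -> { r.key = -r.key; return r.key < -1; };

        System.out.println("pure keeps order:     "
                + isSortedByKey(applyFilter(sortedInput(), pure)));
        System.out.println("mutating keeps order: "
                + isSortedByKey(applyFilter(sortedInput(), mutating)));
    }
}
```

With the mutating filter the surviving records are no longer in ascending key order, so any downstream operator that relies on the assumed sort order would work on wrong input.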
> Possible solutions are:
> - document the requirements (and hope that users read them and behave nicely)
> - hand a copy to the function, which can be modified but is not passed on.
> This has major performance implications and might confuse users because their
> changes are discarded. However, this could also be integrated with the
> mutable/immutable runtime switch (FLINK-1005).
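The second option could look roughly like the sketch below. It is not Flink code: `RecordFilter`, `Row`, and `filterWithCopy` are hypothetical names made up for illustration, and the copy is done with a caller-supplied copier rather than any Flink serializer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a filter function, so the sketch runs without Flink.
interface RecordFilter<T> {
    boolean filter(T value);
}

// Mutable record type with an explicit copy method, assumed for illustration only.
final class Row {
    int key;
    Row(int key) { this.key = key; }
    Row copy() { return new Row(key); }
}

final class FilterCopySketch {

    // Hands a defensive copy to the user function and forwards the untouched original.
    static <T> List<T> filterWithCopy(List<T> input, RecordFilter<T> f, UnaryOperator<T> copier) {
        List<T> out = new ArrayList<>();
        for (T original : input) {
            T scratch = copier.apply(original); // one extra copy per record: the performance cost
            if (f.filter(scratch)) {
                out.add(original);              // changes made to scratch are discarded
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> in = new ArrayList<>();
        for (int k = 1; k <= 4; k++) {
            in.add(new Row(k));
        }
        // Misbehaving filter: negates the key of every record it sees.
        RecordFilter<Row> mutating = r -> { r.key = -r.key; return r.key < -1; };

        for (Row r : filterWithCopy(in, mutating, Row::copy)) {
            System.out.println(r.key);
        }
    }
}
```

The forwarded records keep their original keys, which protects the optimizer's assumptions, but the user's modification silently disappears and every record is copied once. Both drawbacks are the ones named in the issue description.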