[
https://issues.apache.org/jira/browse/FLINK-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377574#comment-14377574
]
ASF GitHub Bot commented on FLINK-1656:
---------------------------------------
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/525#issuecomment-85422608
I think it is good to fix this.
I was wondering whether we can solve this in a way that treats global and
local properties separately. Cancelling out the properties from the non-key
fields was originally motivated by the fact that the group operation destroys
orders/groupings, which are actually local properties.
Is there a way we can still preserve the global properties?
What would happen if we moved the new code that "cleans" the semantic
properties from the API operators to the optimizer's operator descriptors?
There, we could filter the local properties and global properties independently.
The main benefit is probably to preserve the global properties for
`mapPartition()`, which is desirable.
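
As a rough illustration of filtering the two kinds of properties
independently, here is a minimal plain-Java sketch. It does not use Flink's
optimizer classes: `PropertyFilterSketch`, `filterGlobal`, and `filterLocal`
are hypothetical names, and the rules they encode (a partitioning survives if
all of its fields are forwarded; an order or grouping survives only on
forwarded key fields) are assumptions read out of the comment above, not the
actual implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the proposal; not Flink's optimizer API.
final class PropertyFilterSketch {

    // Global property (partitioning across parallel instances):
    // assumed to survive if every partitioning field is forwarded.
    static Set<Integer> filterGlobal(Set<Integer> partitionFields, Set<Integer> forwarded) {
        return forwarded.containsAll(partitionFields) ? partitionFields : new HashSet<Integer>();
    }

    // Local property (order/grouping inside one partition):
    // assumed to survive only if its fields are forwarded AND are key fields,
    // because the group operation may destroy orders on non-key fields.
    static Set<Integer> filterLocal(Set<Integer> localFields, Set<Integer> forwarded,
                                    Set<Integer> keyFields) {
        boolean keep = forwarded.containsAll(localFields) && keyFields.containsAll(localFields);
        return keep ? localFields : new HashSet<Integer>();
    }

    public static void main(String[] args) {
        Set<Integer> forwarded = new HashSet<>(Arrays.asList(0, 1));
        Set<Integer> onField1 = new HashSet<>(Arrays.asList(1));
        Set<Integer> noKeys = new HashSet<>(); // e.g. mapPartition() has no key fields

        System.out.println("global kept: " + filterGlobal(onField1, forwarded));
        System.out.println("local kept:  " + filterLocal(onField1, forwarded, noKeys));
    }
}
```

With no key fields, as for `mapPartition()`, this sketch keeps the
partitioning on the forwarded field and drops the order or grouping on it.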
> Filtered Semantic Properties for Operators with Iterators
> ---------------------------------------------------------
>
> Key: FLINK-1656
> URL: https://issues.apache.org/jira/browse/FLINK-1656
> Project: Flink
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 0.9
> Reporter: Fabian Hueske
> Assignee: Fabian Hueske
> Priority: Critical
>
> The documentation of ForwardedFields is incomplete for operators with
> iterator inputs (GroupReduce, CoGroup).
> This should be fixed ASAP, because it can lead to incorrect program execution.
> The conditions for forwarded fields on operators with iterator input are:
> 1) forwarded fields must be emitted in the order in which they are received
> through the iterator
> 2) all forwarded fields of a record must stick together, i.e., if your
> function builds a record from field 0 of the 1st, 3rd, 5th, ... and field 1 of
> the 2nd, 4th, ... record coming through the iterator, these are not valid
> forwarded fields.
> 3) it is OK to completely filter out records coming through the iterator.
> The reason for these conditions is that the optimizer uses forwarded fields
> to reason about physical data properties such as order and grouping. Mixing
> up the order of records, or emitting records that are composed from different
> input records, might destroy a (secondary) order or grouping.
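
One way to read the three conditions above is that the emitted records,
restricted to the forwarded fields, must form an in-order subsequence of the
records received through the iterator. The plain-Java sketch below illustrates
that reading without any Flink dependency. The class and method names are
hypothetical, a record is modelled as an `int[]` of two forwarded fields, and
the check is stricter than the issue text in one respect: it also rejects
emitting the same record twice, which the issue does not discuss.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; a record is an int[]{field0, field1}.
final class ForwardedFieldsSketch {

    // Valid: whole records, in arrival order, some filtered out (condition 3).
    static List<int[]> filterInOrder(Iterator<int[]> group) {
        List<int[]> out = new ArrayList<>();
        while (group.hasNext()) {
            int[] rec = group.next();
            if (rec[1] % 2 == 0) {
                out.add(rec.clone());
            }
        }
        return out;
    }

    // Invalid: emits records in reverse arrival order (violates condition 1).
    static List<int[]> reverseOrder(Iterator<int[]> group) {
        List<int[]> out = new ArrayList<>();
        while (group.hasNext()) {
            out.add(0, group.next().clone());
        }
        return out;
    }

    // Invalid: field 0 from one record, field 1 from the next (violates condition 2).
    static List<int[]> mixNeighbours(Iterator<int[]> group) {
        List<int[]> out = new ArrayList<>();
        while (group.hasNext()) {
            int[] first = group.next();
            if (!group.hasNext()) {
                break;
            }
            int[] second = group.next();
            out.add(new int[] {first[0], second[1]});
        }
        return out;
    }

    // True if 'output' can be obtained from 'input' by only dropping records.
    static boolean isInOrderSubsequence(List<int[]> input, List<int[]> output) {
        int i = 0;
        for (int[] rec : output) {
            while (i < input.size() && !Arrays.equals(input.get(i), rec)) {
                i++;
            }
            if (i == input.size()) {
                return false;
            }
            i++; // each input record may be matched at most once
        }
        return true;
    }

    public static void main(String[] args) {
        List<int[]> group = Arrays.asList(
                new int[] {10, 1}, new int[] {20, 2}, new int[] {30, 3}, new int[] {40, 4});
        System.out.println(isInOrderSubsequence(group, filterInOrder(group.iterator())));
        System.out.println(isInOrderSubsequence(group, reverseOrder(group.iterator())));
        System.out.println(isInOrderSubsequence(group, mixNeighbours(group.iterator())));
    }
}
```

Only `filterInOrder` passes the check; for the other two functions, declaring
both fields as forwarded would be incorrect under this reading.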