[jira] [Commented] (FLINK-6097) Guaranteed the order of the extracted field references

ASF GitHub Bot (JIRA) Fri, 17 Mar 2017 18:08:53 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930960#comment-15930960
 ]


ASF GitHub Bot commented on FLINK-6097:
---------------------------------------

Github user sunjincheng121 commented on the issue:

    https://github.com/apache/flink/pull/3560
  
    Hi @KurtYoung, thanks for your attention to this PR. Good question. I am 
glad to share why I noticed this method:
    When we implemented the first prototype of the OVER window Table API, we 
did not consider that the table fields could be out of order in the 
translateToPlan method, so we set the outputRow fields from inputRow according 
to the initial order of the table field indexes.
    At first, with fewer than 5 projected columns in the select statement, it 
worked well. Unfortunately, with 5 or more projections we got results in 
random order. Debugging showed that the 
ProjectionTranslator#identifyFieldReferences method uses a Set to temporarily 
store the fields. With fewer than 5 elements, the Set is backed by the Set1, 
Set2, Set3, or Set4 classes, which return elements in the order they were 
added. With 5 or more elements, the Set becomes a HashSet.HashTrieSet, which 
does not preserve insertion order. So we thought of 2 approaches to solve this 
problem:
    
    1. Let the ProjectionTranslator#identifyFieldReferences method guarantee 
that the extracted field references are in the same order as the input.
    2. Add an explicit mapping between the input and output fields.
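
    To illustrate the first approach, here is a minimal sketch, not the actual 
ProjectionTranslator code. The names FieldOrderSketch, extractUnordered, and 
extractOrdered are hypothetical, and plain strings stand in for field 
references. It assumes the extraction only needs to de-duplicate fields while 
keeping their first-seen order.

```scala
import scala.collection.mutable

object FieldOrderSketch {
  // Hypothetical stand-in for the current behavior: collect distinct
  // field names into an immutable Set. Iteration order is not guaranteed
  // once the Set grows to 5 or more elements.
  def extractUnordered(fields: Seq[String]): Set[String] =
    fields.foldLeft(Set.empty[String])(_ + _)

  // Order-preserving variant: LinkedHashSet iterates in insertion order,
  // so duplicates are dropped and first-seen order is kept.
  def extractOrdered(fields: Seq[String]): List[String] = {
    val seen = mutable.LinkedHashSet.empty[String]
    seen ++= fields
    seen.toList
  }

  def main(args: Array[String]): Unit = {
    val fields = Seq("a", "b", "c", "d", "e", "a")
    println(extractUnordered(fields).toList) // order is implementation-dependent
    println(extractOrdered(fields))          // List(a, b, c, d, e)
  }
}
```

    For a plain sequence, fields.distinct would give the same first-seen order; 
the LinkedHashSet form is shown only because it stays close to the Set-based 
code described above.
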
    In the end we used approach 2 to solve the problem, so this change is not 
necessary for the problem I faced. But I feel it is better for the output of 
this method to be in the same order as the input; it may be helpful in other 
cases, though I am currently not aware of any. I am OK with not making this 
change, but then we should add a comment to highlight that the output of the 
current method is potentially unordered. Otherwise, some people may not pay 
attention to this and assume it is in order.
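
    For comparison, the second approach could look roughly like the sketch 
below. This is an assumption about the general idea, not the code we actually 
wrote: FieldMappingSketch, buildMapping, and project are hypothetical names, 
and rows are modeled as plain sequences instead of Flink Row objects. The 
point is that output fields are looked up by name, so the order in which the 
references were extracted no longer matters.

```scala
object FieldMappingSketch {
  // For each output field, find the index of that field in the input row.
  // Throws NoSuchElementException if an output field is not in the input.
  def buildMapping(inputFields: Seq[String], outputFields: Seq[String]): Seq[Int] = {
    val indexOf: Map[String, Int] = inputFields.zipWithIndex.toMap
    outputFields.map(name => indexOf(name))
  }

  // Copy values from the input row into output positions using the mapping.
  def project(inputRow: Seq[Any], mapping: Seq[Int]): Seq[Any] =
    mapping.map(i => inputRow(i))

  def main(args: Array[String]): Unit = {
    val inputFields  = Seq("a", "b", "c", "d", "e")
    val outputFields = Seq("e", "a", "b", "c", "d") // e.g. an unordered extraction result
    val mapping = buildMapping(inputFields, outputFields)
    println(mapping)                                // List(4, 0, 1, 2, 3)
    println(project(Seq(1, 2, 3, 4, 5), mapping))   // List(5, 1, 2, 3, 4)
  }
}
```
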
    
    Thanks,
    SunJincheng


> Guaranteed the order of the extracted field references
> ------------------------------------------------------
>
>                 Key: FLINK-6097
>                 URL: https://issues.apache.org/jira/browse/FLINK-6097
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>
> When we implemented the first prototype of the `OVER window` Table API, we 
> did not consider that the table fields could be out of order in the 
> `translateToPlan` method, so we set the `outputRow` fields from `inputRow` 
> according to the initial order of the table field indexes.
> At first, with fewer than 5 projected columns in the select statement, it 
> worked well. Unfortunately, with 5 or more projections we got results in 
> random order. Debugging showed that the 
> `ProjectionTranslator#identifyFieldReferences` method uses a `Set` to 
> temporarily store the fields. With fewer than 5 elements, the Set is backed 
> by the Set1, Set2, Set3, or Set4 classes, which return elements in the order 
> they were added. With 5 or more elements, the Set becomes a 
> HashSet.HashTrieSet, which does not preserve insertion order.
> For example, adding the following elements in turn:
> {code}
> a, b, c, d, e
> Set(a)
> class scala.collection.immutable.Set$Set1
> Set(a, b)
> class scala.collection.immutable.Set$Set2
> Set(a, b, c)
> class scala.collection.immutable.Set$Set3
> Set(a, b, c, d)
> class scala.collection.immutable.Set$Set4
> // we want (a, b, c, d, e)
> Set(e, a, b, c, d)
> class scala.collection.immutable.HashSet$HashTrieSet
> {code}
> So we thought of 2 approaches to solve this problem:
> 1. Let the `ProjectionTranslator#identifyFieldReferences` method guarantee 
> that the extracted field references are in the same order as the input.
> 2. Add an explicit mapping between the input and output fields.
> In the end we used approach 2 to solve the problem, so this change is not 
> necessary for the problem I faced. But I feel it is better for the output of 
> this method to be in the same order as the input; it may be helpful in other 
> cases, though I am currently not aware of any. I am OK with not making this 
> change, but then we should add a comment to highlight that the output of the 
> current method is potentially unordered. Otherwise, some people may not pay 
> attention to this and assume it is in order.
> Hi guys, what do you think? Any feedback is welcome.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
