[
https://issues.apache.org/jira/browse/FLINK-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933738#comment-15933738
]
ASF GitHub Bot commented on FLINK-6097:
---------------------------------------
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/3560#discussion_r107033984
--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/ProjectionTranslator.scala ---
@@ -227,18 +227,23 @@ object ProjectionTranslator {
* @param exprs a list of expressions to extract
* @return a list of field references extracted from the given expressions
*/
- def extractFieldReferences(exprs: Seq[Expression]): Seq[NamedExpression] = {
- exprs.foldLeft(Set[NamedExpression]()) {
+ def extractFieldReferences(exprs: Seq[Expression]): List[NamedExpression] = {
+ exprs.foldLeft(List[NamedExpression]()) {
(fieldReferences, expr) => identifyFieldReferences(expr, fieldReferences)
- }.toSeq
+ }
}
private def identifyFieldReferences(
expr: Expression,
- fieldReferences: Set[NamedExpression]): Set[NamedExpression] = expr match {
+ fieldReferences: List[NamedExpression]): List[NamedExpression] = expr match {
--- End diff --
We can also use a `LinkedHashSet` which preserves the order in which
elements are inserted.
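To make the two options concrete, here is a minimal sketch comparing a `List` accumulator with a `LinkedHashSet`. It is not the code from the PR: plain strings stand in for `NamedExpression`, and the object and method names (`FieldOrderSketch`, `extractWithList`, `extractWithLinkedHashSet`) are hypothetical. The `contains` check in the `List` variant is an assumption about how duplicates would be avoided.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the extraction logic: strings replace
// NamedExpression so the sketch runs without Flink on the classpath.
object FieldOrderSketch {

  // List-based variant: append a field only if it has not been seen yet.
  // The contains check is linear, so this is quadratic in the worst case.
  def extractWithList(fields: Seq[String]): List[String] =
    fields.foldLeft(List.empty[String]) { (acc, f) =>
      if (acc.contains(f)) acc else acc :+ f
    }

  // LinkedHashSet variant: duplicates are dropped by the set itself and
  // iteration follows insertion order.
  def extractWithLinkedHashSet(fields: Seq[String]): List[String] = {
    val seen = mutable.LinkedHashSet.empty[String]
    fields.foreach(seen += _)
    seen.toList
  }
}
```

Both variants return the distinct fields in first-seen order. The `LinkedHashSet` version avoids the explicit duplicate check.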
> Guaranteed the order of the extracted field references
> ------------------------------------------------------
>
> Key: FLINK-6097
> URL: https://issues.apache.org/jira/browse/FLINK-6097
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Reporter: sunjincheng
> Assignee: sunjincheng
>
> While implementing the `OVER window` Table API, the first version of the
> prototype did not consider that the table fields could end up out of order
> in the `translateToPlan` method: we set the `outputRow` fields from
> `inputRow` according to the initial order of the table field indexes.
> At first, the select statements projected fewer than 5 columns and
> everything worked well. Unfortunately, once the number of projections reached
> 5 or more, the fields came out in an unpredictable order. While debugging, we
> found that `ProjectionTranslator#identifyFieldReferences` uses a `Set` to
> temporarily hold the fields. When the Set has fewer than 5 elements,
> it is backed by the Set1, Set2, Set3, and Set4 data structures. When it has
> 5 or more elements, it is backed by `HashSet.HashTrieSet`,
> which does not preserve insertion order.
> e.g.:
> Add the following elements in turn:
> {code}
> a, b, c, d, e
> Set(a)
> class scala.collection.immutable.Set$Set1
> Set(a, b)
> class scala.collection.immutable.Set$Set2
> Set(a, b, c)
> class scala.collection.immutable.Set$Set3
> Set(a, b, c, d)
> class scala.collection.immutable.Set$Set4
> // we want (a, b, c, d, e)
> Set(e, a, b, c, d)
> class scala.collection.immutable.HashSet$HashTrieSet
> {code}
> So we considered 2 approaches to solve this problem:
> 1. Let `ProjectionTranslator#identifyFieldReferences` guarantee that the
> extracted field references are in the same order as the input.
> 2. Add a mapping between the input and output fields.
> In the end we used approach #2 to solve the problem, so this change is not
> necessary for the problem I faced. But I feel it is better for the output of
> this method to be in the same order as the input; it may be helpful for other
> cases, though I am currently not aware of any. I am OK with not making this
> change, but then we should add a comment highlighting that the output of the
> current implementation may be out of order. Otherwise, some people may
> overlook this and assume it is in order.
> Hi guys, what do you think? Any feedback is welcome.
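The Set behaviour described in the issue can be checked with a small sketch like the one below. The names `SmallSetSketch` and `classNames` are hypothetical. The class reported for the 5-element set depends on the Scala version: the issue shows `HashSet$HashTrieSet`, while newer standard libraries may report a different `HashSet` class. The iteration order of that set should not be relied on either way.

```scala
object SmallSetSketch {

  // For each non-empty prefix of `elems`, return the runtime class name of
  // the immutable Set built by adding the elements one at a time.
  def classNames(elems: Seq[String]): List[String] =
    elems
      .scanLeft(Set.empty[String])(_ + _) // empty set, then one set per prefix
      .tail                               // drop the initial empty set
      .map(_.getClass.getName)
      .toList

  def main(args: Array[String]): Unit =
    classNames(Seq("a", "b", "c", "d", "e")).foreach(println)
}
```

The first four lines printed are the `Set$Set1` to `Set$Set4` classes, which keep insertion order. The fifth is a hash-based implementation, which does not.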
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)