[jira] [Commented] (FLINK-34156) Move Flink Calcite rules from Scala to Java

2024-03-06 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824223#comment-17824223
 ] 

Yunhong Zheng commented on FLINK-34156:
---

Thanks, [~Sergey Nuyanzin] . I will try to review these PRs.

> Move Flink Calcite rules from Scala to Java
> ---
>
> Key: FLINK-34156
> URL: https://issues.apache.org/jira/browse/FLINK-34156
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Table SQL / Planner
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
> Fix For: 2.0.0
>
>
> This is an umbrella task for the migration of Calcite rules from Scala to 
> Java mentioned at https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> The reason is that since 1.28.0 ( CALCITE-4787 - Move core to use Immutables 
> instead of ImmutableBeans ) Calcite started to use Immutables 
> (https://immutables.github.io/) and since 1.29.0 removed ImmutableBeans ( 
> CALCITE-4839 - Remove remnants of ImmutableBeans post 1.28 release ). All 
> rule-configuration API that is not Immutables-based is marked as 
> deprecated. Since Immutables relies on code generation during Java 
> compilation, it seems impossible to use for rules in Scala code.
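For context, a minimal hedged sketch (all class names hypothetical, not from the ticket) of the Immutables-based rule-configuration pattern Calcite now expects. The ImmutableMyFlinkRuleConfig builder below is generated by the Immutables annotation processor during Java compilation, which is exactly the step a Scala-compiled rule cannot participate in:
{code:java}
import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.plan.RelRule;
import org.apache.calcite.rel.core.Project;
import org.immutables.value.Value;

@Value.Immutable
public interface MyFlinkRuleConfig extends RelRule.Config {

    // ImmutableMyFlinkRuleConfig does not exist in source; it is emitted by
    // the Immutables annotation processor while compiling this Java file.
    MyFlinkRuleConfig DEFAULT = ImmutableMyFlinkRuleConfig.builder()
            .operandSupplier(b -> b.operand(Project.class).anyInputs())
            .build();

    @Override
    default RelOptRule toRule() {
        // A real config would return the migrated Java rule here.
        throw new UnsupportedOperationException("sketch only");
    }
}
{code}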



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34156) Move Flink Calcite rules from Scala to Java

2024-02-19 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818607#comment-17818607
 ] 

Yunhong Zheng commented on FLINK-34156:
---

Hi, [~Sergey Nuyanzin]. Since I have been continuously involved in development 
related to the table planner and Calcite, I am quite familiar with this area. 
Could I join this work and help you with some of the subtasks? Looking forward 
to your reply. Thanks.

> Move Flink Calcite rules from Scala to Java
> ---
>
> Key: FLINK-34156
> URL: https://issues.apache.org/jira/browse/FLINK-34156
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Table SQL / Planner
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
> Fix For: 2.0.0
>
>
> This is an umbrella task for the migration of Calcite rules from Scala to 
> Java mentioned at https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> The reason is that since 1.28.0 ( CALCITE-4787 - Move core to use Immutables 
> instead of ImmutableBeans ) Calcite started to use Immutables 
> (https://immutables.github.io/) and since 1.29.0 removed ImmutableBeans ( 
> CALCITE-4839 - Remove remnants of ImmutableBeans post 1.28 release ). All 
> rule-configuration API that is not Immutables-based is marked as 
> deprecated. Since Immutables relies on code generation during Java 
> compilation, it seems impossible to use for rules in Scala code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34374) Complete work for syntax `DESCRIBE EXTENDED DATABASE databaseName`

2024-02-05 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814587#comment-17814587
 ] 

Yunhong Zheng commented on FLINK-34374:
---

Hi, [~xuyangzhong]. I want to take this issue; can you assign it to me? Thanks.

> Complete work for syntax `DESCRIBE EXTENDED DATABASE databaseName`
> --
>
> Key: FLINK-34374
> URL: https://issues.apache.org/jira/browse/FLINK-34374
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.10.0, 1.19.0
>Reporter: xuyang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34375) Complete work for syntax `DESCRIBE EXTENDED tableName`

2024-02-05 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814586#comment-17814586
 ] 

Yunhong Zheng commented on FLINK-34375:
---

Hi, [~xuyangzhong] . I worked on the DESCRIBE EXTENDED syntax last year, so 
I'll take this issue and continue to complete it. Thank you.

> Complete work for syntax `DESCRIBE EXTENDED tableName`
> --
>
> Key: FLINK-34375
> URL: https://issues.apache.org/jira/browse/FLINK-34375
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.10.0, 1.19.0
>Reporter: xuyang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34376) FLINK SQL SUM() causes a precision error

2024-02-05 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814582#comment-17814582
 ] 

Yunhong Zheng commented on FLINK-34376:
---

Hi, [~liufangliang] . Can you provide your SQL pattern? This would allow for a 
clearer description of the issue.

> FLINK SQL SUM() causes a precision error
> 
>
> Key: FLINK-34376
> URL: https://issues.apache.org/jira/browse/FLINK-34376
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.14.3, 1.18.1
>Reporter: Fangliang Liu
>Priority: Major
> Attachments: image-2024-02-06-11-15-02-669.png, 
> image-2024-02-06-11-17-03-399.png
>
>
> {code:java}
> select cast(sum(CAST(9.11 AS DECIMAL(38,18)) *10 ) as STRING) 
> {code}
> the result in 1.14.3 and master branch is 
> !image-2024-02-06-11-15-02-669.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (FLINK-34376) FLINK SQL SUM() causes a precision error

2024-02-05 Thread Yunhong Zheng (Jira)


[ https://issues.apache.org/jira/browse/FLINK-34376 ]


Yunhong Zheng deleted comment on FLINK-34376:
---

was (Author: JIRAUSER287975):
Hi, [~liufangliang] . Can you provide your SQL pattern? This would allow for a 
clearer description of the issue.

> FLINK SQL SUM() causes a precision error
> 
>
> Key: FLINK-34376
> URL: https://issues.apache.org/jira/browse/FLINK-34376
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.14.3, 1.18.1
>Reporter: Fangliang Liu
>Priority: Major
> Attachments: image-2024-02-06-11-15-02-669.png, 
> image-2024-02-06-11-17-03-399.png
>
>
> {code:java}
> select cast(sum(CAST(9.11 AS DECIMAL(38,18)) *10 ) as STRING) 
> {code}
> the result in 1.14.3 and master branch is 
> !image-2024-02-06-11-15-02-669.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34147) TimestampData to/from LocalDateTime is ambiguous

2024-02-05 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814566#comment-17814566
 ] 

Yunhong Zheng commented on FLINK-34147:
---

Hi, [~lirui] and [~jark]. Can you assign this issue to me? I can help improve 
the doc. Thanks!

> TimestampData to/from LocalDateTime is ambiguous
> 
>
> Key: FLINK-34147
> URL: https://issues.apache.org/jira/browse/FLINK-34147
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Rui Li
>Priority: Major
>
> It seems TimestampData is essentially an {{Instant}}. Therefore an implicit 
> time zone is used in the {{fromLocalDateTime}} and {{toLocalDateTime}} 
> methods. However neither the method name nor the API doc indicates which time 
> zone is used. So from caller's perspective, the results of these two methods 
> are ambiguous.
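A hedged illustration of the ambiguity (the method names below are the real TimestampData API; the open question in this ticket is only how to interpret them):
{code:java}
import java.time.LocalDateTime;
import org.apache.flink.table.data.TimestampData;

public class TimestampAmbiguityDemo {
    public static void main(String[] args) {
        LocalDateTime ldt = LocalDateTime.of(2024, 1, 1, 12, 0);
        TimestampData ts = TimestampData.fromLocalDateTime(ldt);
        // Neither the method name nor the Javadoc says whether this long is
        // an epoch instant or wall-clock millis, i.e. which time zone (if
        // any) was applied during the conversion above.
        long millis = ts.getMillisecond();
        System.out.println(millis + " -> " + ts.toLocalDateTime());
    }
}
{code}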



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34370) [Umbrella] Complete work about enhanced Flink SQL DDL

2024-02-05 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814565#comment-17814565
 ] 

Yunhong Zheng commented on FLINK-34370:
---

Hi, [~xuyangzhong]. I hope I can also participate in the development of the 
remaining FLIP features. Please cc me if there are any further developments. 
Thanks!

> [Umbrella] Complete work about enhanced Flink SQL DDL 
> --
>
> Key: FLINK-34370
> URL: https://issues.apache.org/jira/browse/FLINK-34370
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: xuyang
>Priority: Major
>
> This is an umbrella Jira for completing the work for 
> [FLIP-69](https://cwiki.apache.org/confluence/display/FLINK/FLIP-69%3A+Flink+SQL+DDL+Enhancement)
>  about enhanced Flink SQL DDL.
> With [FLINK-34254](https://issues.apache.org/jira/browse/FLINK-34254), it 
> seems that this FLIP is not finished yet.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34362) Add argument to reuse connector docs cache in setup_docs.sh to improve build times

2024-02-05 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814558#comment-17814558
 ] 

Yunhong Zheng commented on FLINK-34362:
---

Hi, [~qingyue] and [~martijnvisser]. I'm looking forward to taking this issue; 
can you assign it to me?

> Add argument to reuse connector docs cache in setup_docs.sh to improve build 
> times
> --
>
> Key: FLINK-34362
> URL: https://issues.apache.org/jira/browse/FLINK-34362
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: Jane Chan
>Priority: Minor
>
> Problem:
> The current build process of Flink's documentation involves the 
> `setup_docs.sh` script, which re-clones connector repositories every time the 
> documentation is built. This operation is time-consuming, particularly for 
> developers in regions with slower internet connections or facing network 
> restrictions (like the Great Firewall in China). This results in a build 
> process that can take an excessive amount of time, hindering developer 
> productivity.
>  
> Proposal:
> We could add a command-line argument (e.g., --use-doc-cache) to the 
> `setup_docs.sh` script, which, when set, skips the cloning step if the 
> connector repositories have already been cloned previously. As a result, 
> developers can opt to use the cache when they do not require the latest 
> versions of the connectors' documentation. This change will reduce build 
> times significantly and improve the developer experience for those working on 
> the documentation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33760) Group Window agg has different result when only consuming -D records while using or not using minibatch

2024-01-24 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810625#comment-17810625
 ] 

Yunhong Zheng commented on FLINK-33760:
---

After an offline discussion with [~lincoln.86xy] and [~xuyangzhong]: the key 
question is whether to retract the data already emitted from the window, to 
ensure eventual consistency when -D records exist alone in the current window, 
or to simply discard the retracted record. There is currently no final 
solution; a more detailed design is needed.

> Group Window agg has different result when only consuming -D records while 
> using or not using minibatch
> ---
>
> Key: FLINK-33760
> URL: https://issues.apache.org/jira/browse/FLINK-33760
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Reporter: xuyang
>Assignee: Yunhong Zheng
>Priority: Major
>
> Add the test in AggregateITCase to re-produce this bug.
>  
> {code:java}
> @Test
> def test(): Unit = {
>   val upsertSourceCurrencyData = List(
> changelogRow("-D", 1.bigDecimal, "a"),
> changelogRow("-D", 1.bigDecimal, "b"),
> changelogRow("-D", 1.bigDecimal, "b")
>   )
>   val upsertSourceDataId = registerData(upsertSourceCurrencyData);
>   tEnv.executeSql(s"""
>  |CREATE TABLE T (
>  | `a` DECIMAL(32, 8),
>  | `d` STRING,
>  | proctime as proctime()
>  |) WITH (
>  | 'connector' = 'values',
>  | 'data-id' = '$upsertSourceDataId',
>  | 'changelog-mode' = 'I,UA,UB,D',
>  | 'failing-source' = 'true'
>  |)
>  |""".stripMargin)
>   val sql =
> "SELECT max(a), sum(a), min(a), TUMBLE_START(proctime, INTERVAL '0.005' 
> SECOND), TUMBLE_END(proctime, INTERVAL '0.005' SECOND), d FROM T GROUP BY d, 
> TUMBLE(proctime, INTERVAL '0.005' SECOND)"
>   val sink = new TestingRetractSink
>   tEnv.sqlQuery(sql).toRetractStream[Row].addSink(sink)
>   env.execute()
>   // Use the result precision/scale calculated for sum and don't override 
> with the one calculated
> // for plus()/minus(), which results in losing a decimal digit.
>   val expected = 
> List("6.41671935,65947.230719357070,609.0286740370369970")
>   assertEquals(expected, sink.getRetractResults.sorted)
> } {code}
> When MiniBatch is ON, the result is `List()`.
>  
> When MiniBatch is OFF, the result is 
> `List(null,-1.,null,2023-12-06T11:29:21.895,2023-12-06T11:29:21.900,a)`.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34016) Janino compile failed when watermark with column by udf

2024-01-24 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810339#comment-17810339
 ] 

Yunhong Zheng commented on FLINK-34016:
---

Hi, [~wczhu], I can't reproduce this error in my local environment with the 
master branch; can you try to reproduce it using the master branch?

> Janino compile failed when watermark with column by udf
> ---
>
> Key: FLINK-34016
> URL: https://issues.apache.org/jira/browse/FLINK-34016
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.15.0, 1.18.0
>Reporter: ude
>Priority: Major
>
> After submit the following flink sql by sql-client.sh will throw an exception:
> {code:java}
> Caused by: java.lang.RuntimeException: Could not instantiate generated class 
> 'WatermarkGenerator$0'
>     at 
> org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:74)
>     at 
> org.apache.flink.table.runtime.generated.GeneratedWatermarkGeneratorSupplier.createWatermarkGenerator(GeneratedWatermarkGeneratorSupplier.java:69)
>     at 
> org.apache.flink.streaming.api.operators.source.ProgressiveTimestampsAndWatermarks.createMainOutput(ProgressiveTimestampsAndWatermarks.java:109)
>     at 
> org.apache.flink.streaming.api.operators.SourceOperator.initializeMainOutput(SourceOperator.java:462)
>     at 
> org.apache.flink.streaming.api.operators.SourceOperator.emitNextNotReading(SourceOperator.java:438)
>     at 
> org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:414)
>     at 
> org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68)
>     at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:562)
>     at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807)
>     at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
>     at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkRuntimeException: 
> org.apache.flink.api.common.InvalidProgramException: Table program cannot be 
> compiled. This is a bug. Please file an issue.
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:94)
>     at 
> org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:101)
>     at 
> org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:68)
>     ... 16 more
> Caused by: 
> org.apache.flink.shaded.guava31.com.google.common.util.concurrent.UncheckedExecutionException:
>  org.apache.flink.api.common.InvalidProgramException: Table program cannot be 
> compiled. This is a bug. Please file an issue.
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2055)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache.get(LocalCache.java:3966)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4863)
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:92)
>     ... 18 more
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue.
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:107)
>     at 
> org.apache.flink.table.runtime.generated.CompileUtils.lambda$compile$0(CompileUtils.java:92)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4868)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3533)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2282)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2159)
>     at 
> org.apache.flink.shaded.guava31.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049)
>     ... 21 more
> Caused by: 

[jira] [Commented] (FLINK-34147) TimestampData to/from LocalDateTime is ambiguous

2024-01-24 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810300#comment-17810300
 ] 

Yunhong Zheng commented on FLINK-34147:
---

Hi, [~lirui]. I also think it is ambiguous. IMO, `TimestampData` could be 
split into two classes, such as `TimestampLtz` and `TimestampNtz`. 
`TimestampLtz` would contain the methods `fromInstant()/toInstant()`, and 
`TimestampNtz` would contain the methods `fromLocalDateTime()/toLocalDateTime()`. 
The logical type `LocalZonedTimestampType` would then be represented by 
`TimestampLtz`, and the logical type `TimestampType` by `TimestampNtz`.

I'm not sure how significant the impact of such a change would be on the Flink 
data type system, or whether it would be backward compatible. [~jark] WDYT?
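A rough sketch of the proposed split (an assumption for discussion, not an agreed design): each class would keep only the unambiguous conversions, here reduced to millisecond precision for brevity:
{code:java}
import java.time.Instant;
import java.time.LocalDateTime;

final class TimestampLtz {
    private final long epochMillis; // a point on the absolute timeline

    private TimestampLtz(long epochMillis) {
        this.epochMillis = epochMillis;
    }

    static TimestampLtz fromInstant(Instant instant) {
        return new TimestampLtz(instant.toEpochMilli());
    }

    Instant toInstant() {
        return Instant.ofEpochMilli(epochMillis);
    }
}

final class TimestampNtz {
    private final LocalDateTime dateTime; // wall-clock time, no zone attached

    private TimestampNtz(LocalDateTime dateTime) {
        this.dateTime = dateTime;
    }

    static TimestampNtz fromLocalDateTime(LocalDateTime ldt) {
        return new TimestampNtz(ldt);
    }

    LocalDateTime toLocalDateTime() {
        return dateTime;
    }
}
{code}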

> TimestampData to/from LocalDateTime is ambiguous
> 
>
> Key: FLINK-34147
> URL: https://issues.apache.org/jira/browse/FLINK-34147
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Rui Li
>Priority: Major
>
> It seems TimestampData is essentially an {{Instant}}. Therefore an implicit 
> time zone is used in the {{fromLocalDateTime}} and {{toLocalDateTime}} 
> methods. However neither the method name nor the API doc indicates which time 
> zone is used. So from caller's perspective, the results of these two methods 
> are ambiguous.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33760) Group Window agg has different result when only consuming -D records while using or not using minibatch

2024-01-23 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810232#comment-17810232
 ] 

Yunhong Zheng commented on FLINK-33760:
---

[~xuyangzhong], it looks like a bug; I want to take this ticket!

> Group Window agg has different result when only consuming -D records while 
> using or not using minibatch
> ---
>
> Key: FLINK-33760
> URL: https://issues.apache.org/jira/browse/FLINK-33760
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Reporter: xuyang
>Priority: Major
>
> Add the test in AggregateITCase to re-produce this bug.
>  
> {code:java}
> @Test
> def test(): Unit = {
>   val upsertSourceCurrencyData = List(
> changelogRow("-D", 1.bigDecimal, "a"),
> changelogRow("-D", 1.bigDecimal, "b"),
> changelogRow("-D", 1.bigDecimal, "b")
>   )
>   val upsertSourceDataId = registerData(upsertSourceCurrencyData);
>   tEnv.executeSql(s"""
>  |CREATE TABLE T (
>  | `a` DECIMAL(32, 8),
>  | `d` STRING,
>  | proctime as proctime()
>  |) WITH (
>  | 'connector' = 'values',
>  | 'data-id' = '$upsertSourceDataId',
>  | 'changelog-mode' = 'I,UA,UB,D',
>  | 'failing-source' = 'true'
>  |)
>  |""".stripMargin)
>   val sql =
> "SELECT max(a), sum(a), min(a), TUMBLE_START(proctime, INTERVAL '0.005' 
> SECOND), TUMBLE_END(proctime, INTERVAL '0.005' SECOND), d FROM T GROUP BY d, 
> TUMBLE(proctime, INTERVAL '0.005' SECOND)"
>   val sink = new TestingRetractSink
>   tEnv.sqlQuery(sql).toRetractStream[Row].addSink(sink)
>   env.execute()
>   // Use the result precision/scale calculated for sum and don't override 
> with the one calculated
> // for plus()/minus(), which results in losing a decimal digit.
>   val expected = 
> List("6.41671935,65947.230719357070,609.0286740370369970")
>   assertEquals(expected, sink.getRetractResults.sorted)
> } {code}
> When MiniBatch is ON, the result is `List()`.
>  
> When MiniBatch is OFF, the result is 
> `List(null,-1.,null,2023-12-06T11:29:21.895,2023-12-06T11:29:21.900,a)`.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33928) Should not throw exception while creating view with specify field names even if the query conflicts in field names

2024-01-12 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805962#comment-17805962
 ] 

Yunhong Zheng commented on FLINK-33928:
---

Hi, [~xuyangzhong]. I want to take this issue. Thanks.

> Should not throw exception while creating view with specify field names even 
> if the query conflicts in field names
> --
>
> Key: FLINK-33928
> URL: https://issues.apache.org/jira/browse/FLINK-33928
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: xuyang
>Priority: Major
>
> The following sql should be valid.
> {code:java}
> create view view1(a, b) as select t1.name, t2.name from t1 join t1 t2 on 
> t1.score = t2.score; {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34066) LagFunction throw NPE when input argument are not null

2024-01-12 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-34066:
--
Description: 
This issue is related to https://issues.apache.org/jira/browse/FLINK-31967. In 
FLINK-31967, the NPE error was not thoroughly fixed. If LAG(len, 1, cast(null 
as int)) and LAG(len, 1, 1) appear together in the test case 
AggregateITCase.testLagAggFunction(), as in:
{code:java}
val sql =
  s"""
 |select
 |  LAG(len, 1, cast(null as int)) OVER w AS nullable_prev_quantity,
 |  LAG(len, 1, 1) OVER w AS prev_quantity,
 |  LAG(len) OVER w AS prev_quantity
 |from src
 |WINDOW w AS (ORDER BY proctime)
 |""".stripMargin {code}
where it previously was:
{code:java}
val sql =
  s"""
 |select
 |  LAG(len, 1, cast(null as int)) OVER w AS prev_quantity,
 |  LAG(len) OVER w AS prev_quantity
 |from src
 |WINDOW w AS (ORDER BY proctime)
 |""".stripMargin {code}
an NPE will be thrown.

  was:This issue is related to 


> LagFunction throw NPE when input argument are not null
> --
>
> Key: FLINK-34066
> URL: https://issues.apache.org/jira/browse/FLINK-34066
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> This issue is related to https://issues.apache.org/jira/browse/FLINK-31967. 
> In FLINK-31967, the NPE error was not thoroughly fixed. If LAG(len, 1, 
> cast(null as int)) and LAG(len, 1, 1) appear together in the test case 
> AggregateITCase.testLagAggFunction(), as in:
> {code:java}
> val sql =
>   s"""
>  |select
>  |  LAG(len, 1, cast(null as int)) OVER w AS nullable_prev_quantity,
>  |  LAG(len, 1, 1) OVER w AS prev_quantity,
>  |  LAG(len) OVER w AS prev_quantity
>  |from src
>  |WINDOW w AS (ORDER BY proctime)
>  |""".stripMargin {code}
> where it previously was:
> {code:java}
> val sql =
>   s"""
>  |select
>  |  LAG(len, 1, cast(null as int)) OVER w AS prev_quantity,
>  |  LAG(len) OVER w AS prev_quantity
>  |from src
>  |WINDOW w AS (ORDER BY proctime)
>  |""".stripMargin {code}
> an NPE will be thrown.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34066) LagFunction throw NPE when input argument are not null

2024-01-12 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-34066:
--
Description: This issue is related to   (was: LagFunction throw NPE when 
input argument are not null.)

> LagFunction throw NPE when input argument are not null
> --
>
> Key: FLINK-34066
> URL: https://issues.apache.org/jira/browse/FLINK-34066
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> This issue is related to 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34066) LagFunction throw NPE when input argument are not null

2024-01-12 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805938#comment-17805938
 ] 

Yunhong Zheng commented on FLINK-34066:
---

[~martijnvisser], sorry for not adding more details; I will add them. Thanks 
for your reminder. (y)

> LagFunction throw NPE when input argument are not null
> --
>
> Key: FLINK-34066
> URL: https://issues.apache.org/jira/browse/FLINK-34066
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> LagFunction throw NPE when input argument are not null.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34066) LagFunction throw NPE when input argument are not null

2024-01-12 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805936#comment-17805936
 ] 

Yunhong Zheng commented on FLINK-34066:
---

Hi, [~martijnvisser] . The new test case has not abandoned the previous tests; 
instead, it covers both the former cases and the new ones (they work in 
conjunction). As for the newly added field, if it is not handled with 
nullable(), it will still throw an NPE error.

> LagFunction throw NPE when input argument are not null
> --
>
> Key: FLINK-34066
> URL: https://issues.apache.org/jira/browse/FLINK-34066
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> LagFunction throw NPE when input argument are not null.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34066) LagFunction throw NPE when input argument are not null

2024-01-12 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805933#comment-17805933
 ] 

Yunhong Zheng commented on FLINK-34066:
---

Hi, [~martijnvisser]. This is a bug related to 
https://issues.apache.org/jira/browse/FLINK-31967. In FLINK-31967, the NPE 
exception was not thoroughly fixed; see, for instance, the case I've added now. 
(In fact, in the current case I just add a new field without modifying the 
original SQL pattern, and it still leads to an NPE exception.)

> LagFunction throw NPE when input argument are not null
> --
>
> Key: FLINK-34066
> URL: https://issues.apache.org/jira/browse/FLINK-34066
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> LagFunction throw NPE when input argument are not null.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34066) LagFunction throw NPE when input argument are not null

2024-01-11 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-34066:
-

 Summary: LagFunction throw NPE when input argument are not null
 Key: FLINK-34066
 URL: https://issues.apache.org/jira/browse/FLINK-34066
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Yunhong Zheng
 Fix For: 1.19.0


LagFunction throw NPE when input argument are not null.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33691) Support agg push down for 'count(*)/count(1)/count(column not null)'

2023-11-29 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-33691:
--
Description: Now,  PushLocalAggIntoScanRule cannot push down 'count( * 
)/count(1)/count(column not null)', but it can push down count(column 
nullable). The reason is that count( * ) and count( 1 ) will be optimized to a 
scan with calc as '0 AS $f0' to reduce read cost, which will not match the push 
down rule  (was: Now,  PushLocalAggIntoScanRule cannot push down 
'count(*)/count(1)/count(column not null)', but it can push down count(column 
nullable). The reason is that count(*) and count(n) will be optimized to a scan 
with calc as '0 AS $f0' to reduce read cost, which will not match the push down 
rule)

> Support agg push down for 'count(*)/count(1)/count(column not null)'
> 
>
> Key: FLINK-33691
> URL: https://issues.apache.org/jira/browse/FLINK-33691
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.19.0
>
>
> Now, PushLocalAggIntoScanRule cannot push down 
> count(*)/count(1)/count(column not null), but it can push down count(column 
> nullable). The reason is that count(*) and count(1) will be optimized to 
> a scan with a calc of '0 AS $f0' to reduce read cost, which will not match 
> the push-down rule.
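To make the mismatch concrete, a hedged sketch (table name is made up, and the 'values' connector is a test-only source from the planner test jar) that prints both plans; the count(b) query keeps a local aggregate shape the rule can match, while count(*) is first rewritten into a scan plus a calc projecting '0 AS $f0':
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CountPushDownDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inBatchMode());
        tEnv.executeSql(
                "CREATE TABLE t (a INT, b STRING) WITH ('connector' = 'values')");
        // count on a nullable column: matched by PushLocalAggIntoScanRule
        System.out.println(tEnv.explainSql("SELECT COUNT(b) FROM t"));
        // count(*): first optimized to scan + calc ('0 AS $f0'), so the
        // push-down rule no longer sees the shape it expects
        System.out.println(tEnv.explainSql("SELECT COUNT(*) FROM t"));
    }
}
{code}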



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33691) Support agg push down for 'count(*)/count(1)/count(column not null)'

2023-11-29 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-33691:
-

 Summary: Support agg push down for 'count(*)/count(1)/count(column 
not null)'
 Key: FLINK-33691
 URL: https://issues.apache.org/jira/browse/FLINK-33691
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Yunhong Zheng
 Fix For: 1.19.0


Now, PushLocalAggIntoScanRule cannot push down 'count(*)/count(1)/count(column 
not null)', but it can push down count(column nullable). The reason is that 
count(*) and count(1) will be optimized to a scan with a calc of '0 AS $f0' to 
reduce read cost, which will not match the push-down rule.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-22870) Grouping sets + case when + constant string throws AssertionError

2023-10-16 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17775662#comment-17775662
 ] 

Yunhong Zheng commented on FLINK-22870:
---

Hi, [~fsk119]. I want to take this ticket; can you assign this issue to me? 
Thanks!

> Grouping sets + case when + constant string throws AssertionError
> -
>
> Key: FLINK-22870
> URL: https://issues.apache.org/jira/browse/FLINK-22870
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.14.0, 1.13.2
>Reporter: Caizhi Weng
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Add the following case to 
> {{org.apache.flink.table.api.TableEnvironmentITCase}} to reproduce this issue.
> {code:scala}
> @Test
> def myTest2(): Unit = {
>   tEnv.executeSql(
> """
>   |create temporary table my_source(
>   |  a INT
>   |) WITH (
>   |  'connector' = 'values'
>   |)
>   |""".stripMargin)
>   tEnv.executeSql(
> """
>   |create temporary view my_view as select a, 'test' as b from my_source
>   |""".stripMargin)
>   tEnv.executeSql(
> """
>   |create temporary view my_view2 as select
>   |  a,
>   |  case when GROUPING(b) = 1 then 'test2' else b end as b
>   |from my_view
>   |group by grouping sets(
>   |  (),
>   |  (a),
>   |  (b),
>   |  (a, b)
>   |)
>   |""".stripMargin)
>   System.out.println(tEnv.explainSql(
> """
>   |select a, b from my_view2
>   |""".stripMargin))
> }
> {code}
> The exception stack is
> {code}
> java.lang.AssertionError: Conversion to relational algebra failed to preserve 
> datatypes:
> validated type:
> RecordType(INTEGER a, VARCHAR(5) CHARACTER SET "UTF-16LE" NOT NULL b) NOT NULL
> converted type:
> RecordType(INTEGER a, VARCHAR(5) CHARACTER SET "UTF-16LE" b) NOT NULL
> rel:
> LogicalProject(a=[$0], b=[CASE(=($2, 1), _UTF-16LE'test2':VARCHAR(5) 
> CHARACTER SET "UTF-16LE", CAST($1):VARCHAR(5) CHARACTER SET "UTF-16LE")])
>   LogicalAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}, {1}, {}]], 
> agg#0=[GROUPING($1)])
> LogicalProject(a=[$0], b=[_UTF-16LE'test'])
>   LogicalTableScan(table=[[default_catalog, default_database, my_source]])
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:467)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:582)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:177)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:169)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.toQueryOperation(SqlToOperationConverter.java:1048)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convertViewQuery(SqlToOperationConverter.java:897)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convertCreateView(SqlToOperationConverter.java:864)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:259)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:730)
>   at 
> org.apache.flink.table.api.TableEnvironmentITCase.myTest2(TableEnvironmentITCase.scala:148)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26497) Cast Exception when use split agg with multiple filters

2023-10-11 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773909#comment-17773909
 ] 

Yunhong Zheng commented on FLINK-26497:
---

It seems this problem cannot be reproduced on the master branch. Hi [~fsk119], 
can you help close this issue? Thanks!

> Cast Exception when use split agg with multiple filters
> ---
>
> Key: FLINK-26497
> URL: https://issues.apache.org/jira/browse/FLINK-26497
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.14.4, 1.15.0
>Reporter: Jingsong Lee
>Assignee: Yunhong Zheng
>Priority: Major
>
> table.optimizer.distinct-agg.split.enabled is true.
> count(distinct c) filter (...), count(distinct c) filter (...), 
> count(distinct c) filter (...)
> The number of filter conditions exceeds 8.
> java.lang.RuntimeException: [J cannot be cast to [Ljava.lang.Object;
> (BTW, it seems that we don't have a test case to cover this situation.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33143) Wrongly throw error while temporary table join with invalidScan lookup source and selected as view

2023-09-25 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng closed FLINK-33143.
-
Resolution: Resolved

> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view
> --
>
> Key: FLINK-33143
> URL: https://issues.apache.org/jira/browse/FLINK-33143
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33143) Wrongly throw error while temporary table join with invalidScan lookup source and selected as view

2023-09-25 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768969#comment-17768969
 ] 

Yunhong Zheng commented on FLINK-33143:
---

This fix cannot cover all SQL patterns (such as subqueries), so this issue 
will be closed.

> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view
> --
>
> Key: FLINK-33143
> URL: https://issues.apache.org/jira/browse/FLINK-33143
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Wrongly throw error while temporary table join with invalidScan lookup source 
> and selected as view.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-09-24 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768484#comment-17768484
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Hi, [~Sergey Nuyanzin], [~renqs], [~mapohl]. I have opened a PR in the 
flink-ci-docker repo (https://github.com/flink-ci/flink-ci-docker/pull/1). If 
this error is a blocker for 1.18, can you help contact [~chesnay] to review 
this PR? Thanks very much. Hi, [~tangyun], we encountered the same memory leak 
error as FLINK-19125 in the Flink CI Docker image. As you have experience in 
fixing this problem, can you help review it? Thanks!

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 
> 1.19.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33143) Wrongly throw error while temporary table join with invalidScan lookup source and selected as view

2023-09-24 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-33143:
-

 Summary: Wrongly throw error while temporary table join with 
invalidScan lookup source and selected as view
 Key: FLINK-33143
 URL: https://issues.apache.org/jira/browse/FLINK-33143
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Yunhong Zheng
 Fix For: 1.19.0


Wrongly throw error while temporary table join with invalidScan lookup source 
and selected as view.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-22128) Window aggregation should have unique keys

2023-09-15 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765504#comment-17765504
 ] 

Yunhong Zheng commented on FLINK-22128:
---

Hi, [~lzljs3620320]. Does this issue still need to be done? Could you briefly 
introduce the background of this issue? If it still needs doing, I can take 
this ticket. Thanks!

> Window aggregation should have unique keys
> --
>
> Key: FLINK-22128
> URL: https://issues.apache.org/jira/browse/FLINK-22128
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Jingsong Lee
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> auto-unassigned
>
> We should add a match method in {{FlinkRelMdUniqueKeys}} for 
> {{StreamPhysicalWindowAggregate}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25593) A redundant scan could be skipped if it is an input of join and the other input is empty after partition prune

2023-09-13 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764951#comment-17764951
 ] 

Yunhong Zheng commented on FLINK-25593:
---

Hi, [~jark]. I think the better final plan for this case would be empty values.

> A redundant scan could be skipped if it is an input of join and the other 
> input is empty after partition prune
> --
>
> Key: FLINK-25593
> URL: https://issues.apache.org/jira/browse/FLINK-25593
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jing Zhang
>Assignee: luoyuxia
>Priority: Major
>
> A redundant scan could be skipped if it is an input of join and the other 
> input is empty after partition prune.
> For example:
> ltable has two partitions: pt=0 and pt=1, rtable has one partition pt1=0.
> The schema of ltable is (lkey string, value int).
> The schema of rtable is (rkey string, value int).
> {code:sql}
> SELECT * FROM ltable, rtable WHERE pt=2 and pt1=0 and `lkey`=rkey
> {code}
> The plan is as following.
> {code:java}
> Calc(select=[lkey, value, CAST(2 AS INTEGER) AS pt, rkey, value1, CAST(0 AS 
> INTEGER) AS pt1])
> +- HashJoin(joinType=[InnerJoin], where=[=(lkey, rkey)], select=[lkey, value, 
> rkey, value1], build=[right])
>:- Exchange(distribution=[hash[lkey]])
>:  +- TableSourceScan(table=[[hive, source_db, ltable, partitions=[], 
> project=[lkey, value]]], fields=[lkey, value])
>+- Exchange(distribution=[hash[rkey]])
>   +- TableSourceScan(table=[[hive, source_db, rtable, 
> partitions=[{pt1=0}], project=[rkey, value1]]], fields=[rkey, value1])
> {code}
> There is no need to scan right side because the left input of join has 0 
> partitions after partition prune.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25593) A redundant scan could be skipped if it is an input of join and the other input is empty after partition prune

2023-09-13 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764949#comment-17764949
 ] 

Yunhong Zheng commented on FLINK-25593:
---

Hi, [~luoyuxia]. Is this still in progress? If not, could you assign it to me? 
Thanks!

> A redundant scan could be skipped if it is an input of join and the other 
> input is empty after partition prune
> --
>
> Key: FLINK-25593
> URL: https://issues.apache.org/jira/browse/FLINK-25593
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jing Zhang
>Assignee: luoyuxia
>Priority: Major
>
> A redundant scan could be skipped if it is an input of join and the other 
> input is empty after partition prune.
> For example:
> ltable has two partitions: pt=0 and pt=1, rtable has one partition pt1=0.
> The schema of ltable is (lkey string, value int).
> The schema of rtable is (rkey string, value int).
> {code:sql}
> SELECT * FROM ltable, rtable WHERE pt=2 and pt1=0 and `lkey`=rkey
> {code}
> The plan is as following.
> {code:java}
> Calc(select=[lkey, value, CAST(2 AS INTEGER) AS pt, rkey, value1, CAST(0 AS 
> INTEGER) AS pt1])
> +- HashJoin(joinType=[InnerJoin], where=[=(lkey, rkey)], select=[lkey, value, 
> rkey, value1], build=[right])
>:- Exchange(distribution=[hash[lkey]])
>:  +- TableSourceScan(table=[[hive, source_db, ltable, partitions=[], 
> project=[lkey, value]]], fields=[lkey, value])
>+- Exchange(distribution=[hash[rkey]])
>   +- TableSourceScan(table=[[hive, source_db, rtable, 
> partitions=[{pt1=0}], project=[rkey, value1]]], fields=[rkey, value1])
> {code}
> There is no need to scan right side because the left input of join has 0 
> partitions after partition prune.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33083) SupportsReadingMetadata is not applied when loading a CompiledPlan

2023-09-13 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764666#comment-17764666
 ] 

Yunhong Zheng commented on FLINK-33083:
---

It looks like this is a planner bug, and we don't have tests covering the 
situation where a connector implements SupportsReadingMetadata and 
supportsMetadataProjection() returns false:
{code:java}
default boolean supportsMetadataProjection() {
    return false;
}{code}
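For reference, a hedged sketch (class name hypothetical) of the connector shape described above: SupportsReadingMetadata is implemented, metadata projection is disabled, and SupportsProjectionPushDown is absent:
{code:java}
import java.util.List;
import java.util.Map;
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.connector.source.abilities.SupportsReadingMetadata;
import org.apache.flink.table.types.DataType;

// Abstract on purpose: a real source would also implement ScanTableSource,
// but deliberately not SupportsProjectionPushDown.
abstract class NoProjectionMetadataSource implements SupportsReadingMetadata {

    @Override
    public Map<String, DataType> listReadableMetadata() {
        // e.g. a "timestamp" metadata column, as in the ticket's DDL
        return Map.of("timestamp", DataTypes.TIMESTAMP_LTZ(3).notNull());
    }

    @Override
    public void applyReadableMetadata(List<String> metadataKeys, DataType producedDataType) {
        // a real source would adjust its produced row type here
    }

    @Override
    public boolean supportsMetadataProjection() {
        return false; // the untested branch this comment is about
    }
}
{code}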
 

> SupportsReadingMetadata is not applied when loading a CompiledPlan
> --
>
> Key: FLINK-33083
> URL: https://issues.apache.org/jira/browse/FLINK-33083
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.2, 1.17.1
>Reporter: Dawid Wysakowicz
>Priority: Major
>
> If a few conditions are met, we can not apply ReadingMetadata interface:
> # source overwrites:
>  {code}
> @Override
> public boolean supportsMetadataProjection() {
> return false;
> }
>  {code}
> # source does not implement {{SupportsProjectionPushDown}}
> # table has metadata columns e.g.
> {code}
> CREATE TABLE src (
>   physical_name STRING,
>   physical_sum INT,
>   timestamp TIMESTAMP_LTZ(3) NOT NULL METADATA VIRTUAL
> )
> {code}
> # we query the table {{SELECT * FROM src}}
> It fails with:
> {code}
> Caused by: java.lang.IllegalArgumentException: Row arity: 1, but serializer 
> arity: 2
>   at 
> org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:124)
> {code}
> The reason is {{SupportsReadingMetadataSpec}} is created only in the 
> {{PushProjectIntoTableSourceScanRule}}, but the rule is not applied when 1 & 2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33085) Improve the error message when the invalidate lookupTableSource without primary key is used as temporal join table

2023-09-13 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-33085:
-

 Summary: Improve the error message when the invalidate 
lookupTableSource without primary key is used as temporal join table
 Key: FLINK-33085
 URL: https://issues.apache.org/jira/browse/FLINK-33085
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Yunhong Zheng
 Fix For: 1.19.0


Improve the error message when an invalid lookupTableSource without a primary 
key is used as a temporal join table. This PR checks the legality of the 
temporal table join syntax in the sqlToRel phase and makes the thrown error 
clearer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33064) Improve the error message when the lookup source is used as the scan source

2023-09-07 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-33064:
-

 Summary: Improve the error message when the lookup source is used 
as the scan source
 Key: FLINK-33064
 URL: https://issues.apache.org/jira/browse/FLINK-33064
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Yunhong Zheng
 Fix For: 1.19.0


Improve the error message when the lookup source is used as the scan source. 
Currently, if we use a source that only implements LookupTableSource but not 
ScanTableSource as a scan source, the planner cannot produce a proper plan and 
throws 'Cannot generate a valid execution plan for the given query', which can 
be improved.
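For illustration, a hedged sketch (class name hypothetical) of such a lookup-only source; using it in a plain SELECT forces the planner to treat it as a scan source, which it cannot satisfy:
{code:java}
import org.apache.flink.table.connector.source.DynamicTableSource;
import org.apache.flink.table.connector.source.LookupTableSource;

public class LookupOnlySource implements LookupTableSource {

    @Override
    public LookupRuntimeProvider getLookupRuntimeProvider(LookupContext context) {
        // a real source would return e.g. a TableFunctionProvider here
        throw new UnsupportedOperationException("sketch only");
    }

    @Override
    public DynamicTableSource copy() {
        return new LookupOnlySource();
    }

    @Override
    public String asSummaryString() {
        return "lookup-only source";
    }
}
{code}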



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser

2023-09-07 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762932#comment-17762932
 ] 

Yunhong Zheng commented on FLINK-33063:
---

Hi, [~lincoln.86xy] . Can you assign this issue to me? Thanks !

> udaf with user defined pojo object throw error while generate record equaliser
> --
>
> Key: FLINK-33063
> URL: https://issues.apache.org/jira/browse/FLINK-33063
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.18.0
>
>
> Udaf with a user-defined POJO object throws an error while generating the 
> record equaliser: 
> When a user creates a UDAF whose record contains a user-defined complex POJO 
> object (like List or Map), the codegen 
> will throw an error while generating the record equaliser; the error is:
> {code:java}
> A method named "compareTo" is not declared in any enclosing class nor any 
> subtype, nor through a static import.{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33063) udaf with user defined pojo object throw error while generate record equaliser

2023-09-07 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-33063:
-

 Summary: udaf with user defined pojo object throw error while 
generate record equaliser
 Key: FLINK-33063
 URL: https://issues.apache.org/jira/browse/FLINK-33063
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.18.0
Reporter: Yunhong Zheng
 Fix For: 1.18.0


Udaf with a user-defined POJO object throws an error while generating the 
record equaliser: 

When a user creates a UDAF whose record contains a user-defined complex POJO 
object (like List or Map), the codegen will 
throw an error while generating the record equaliser; the error is:
{code:java}
A method named "compareTo" is not declared in any enclosing class nor any 
subtype, nor through a static import.{code}
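For reproduction context, a hedged sketch (all names made up) of the kind of UDAF the description refers to: the accumulator holds user-defined complex fields (List/Map), which the generated record equaliser then tries to compare:
{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.flink.table.functions.AggregateFunction;

public class CollectingUdaf extends AggregateFunction<Long, CollectingUdaf.MyAcc> {

    // User-defined POJO accumulator with non-comparable complex fields.
    public static class MyAcc {
        public List<String> names = new ArrayList<>();
        public Map<String, Long> counts = new HashMap<>();
    }

    @Override
    public MyAcc createAccumulator() {
        return new MyAcc();
    }

    public void accumulate(MyAcc acc, String name) {
        acc.names.add(name);
        acc.counts.merge(name, 1L, Long::sum);
    }

    @Override
    public Long getValue(MyAcc acc) {
        return (long) acc.names.size();
    }
}
{code}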



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33010) NPE when using GREATEST() in Flink SQL

2023-08-31 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760852#comment-17760852
 ] 

Yunhong Zheng commented on FLINK-33010:
---

Hi, [~Sergey Nuyanzin], It looks like master branch also will throw this error. 
The test pattern:
{code:java}
@Test
public void test1() {
String query1 =
"SELECT\n"
+ "SecurityId,\n"
+ "ccy1,\n"
+ "CAST(publishTimestamp AS TIMESTAMP(3)) as 
publishTimestamp\n"
+ "FROM (VALUES\n"
+ "(1, 'USD', '2022-01-01'),\n"
+ "(2, 'GBP', '2022-02-02'),\n"
+ "(3, 'GBX', '2022-03-03'),\n"
+ "(4, 'GBX', '2022-04-4'))\n"
+ "AS ccy(SecurityId, ccy1, publishTimestamp)";
Table table = tEnv().sqlQuery(query1);
tEnv().createTemporaryView("Positions", table);

String query2 =
"SELECT\n"
+ "SecurityId,\n"
+ "ccy1,\n"
+ "CAST(publishTimestamp AS TIMESTAMP(3)) as 
publishTimestamp\n"
+ "FROM (VALUES\n"
+ "(3, 'USD', '2023-01-01'),\n"
+ "(4, 'GBP', '2023-02-02'),\n"
+ "(5, 'GBX', '2023-03-03'),\n"
+ "(6, 'GBX', '2023-04-4'))\n"
+ "AS ccy(SecurityId, ccy1, publishTimestamp)";
Table table2 = tEnv().sqlQuery(query2);
tEnv().createTemporaryView("Benchmarks", table2);

String sqlQuery =
"SELECT *,\n"
+ "GREATEST(\n"
+ "IFNULL(Positions.publishTimestamp,CAST('1970-1-1' AS 
TIMESTAMP(3))),\n"
+ "IFNULL(Benchmarks.publishTimestamp,CAST('1970-1-1' AS 
TIMESTAMP(3)))\n"
+ ")\n"
+ "FROM Positions\n"
+ "FULL JOIN Benchmarks ON Positions.SecurityId = 
Benchmarks.SecurityId ";
List<String> actual =
        CollectionUtil.iteratorToList(tEnv().executeSql(sqlQuery).collect()).stream()
                .map(Object::toString)
                .collect(Collectors.toList());
actual.sort(String::compareTo);
List<String> expected = Arrays.asList("");
assertEquals(expected, actual);
} {code}
The error msg:
{code:java}
Caused by: java.lang.NullPointerException
    at StreamExecCalc$90.processElement_split3(Unknown Source)
    at StreamExecCalc$90.processElement(Unknown Source)
    at org.apache.flink.streaming.runtime.tasks.ChainingOutput.pushToOperator(ChainingOutput.java:94)
    at org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:75)
    at org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:39)
    at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
    at org.apache.flink.table.runtime.operators.join.stream.StreamingJoinOperator.outputNullPadding(StreamingJoinOperator.java:349)
    at org.apache.flink.table.runtime.operators.join.stream.StreamingJoinOperator.processElement(StreamingJoinOperator.java:234)
    at org.apache.flink.table.runtime.operators.join.stream.StreamingJoinOperator.processRight(StreamingJoinOperator.java:145)
    at org.apache.flink.table.runtime.operators.join.stream.AbstractStreamingJoinOperator.processElement2(AbstractStreamingJoinOperator.java:141)
    at org.apache.flink.streaming.runtime.io.RecordProcessorUtils.lambda$getRecordProcessor2$2(RecordProcessorUtils.java:116)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory$StreamTaskNetworkOutput.emitRecord(StreamTwoInputProcessorFactory.java:254)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:179)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:143)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:68)
    at org.apache.flink.streaming.runtime.io.StreamMultipleInputProcessor.processInput(StreamMultipleInputProcessor.java:89)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:613)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:1059)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:1008)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:959)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:938)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:751)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:567) {code}

[jira] [Commented] (FLINK-32997) [JUnit5 Migration] Module: flink-table-planner (StreamingTestBase)

2023-08-30 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760342#comment-17760342
 ] 

Yunhong Zheng commented on FLINK-32997:
---

Hi, [~jiabao.sun] , thanks for your contribution. IMO, there are too many tests 
in the table-planner module, so migrating all of them to JUnit 5 at once has a 
significant impact on the current PR and the tests. Can we complete the migration 
in a smoother way? For example, by adding StreamingTestBaseV2 (JUnit 5) and 
BatchTestBaseV2 (JUnit 5): new tests extend StreamingTestBaseV2, tests being 
modified switch over, and existing tests migrate gradually (see the sketch 
below). This would have a smaller impact on existing code and developers.  cc 
[~lincoln.86xy]  [~jark] 
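
A minimal sketch of what such a JUnit 5 base class could look like (the class 
itself is hypothetical; it assumes Flink's JUnit 5 MiniClusterExtension):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.test.junit5.MiniClusterExtension;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.extension.RegisterExtension;

public abstract class StreamingTestBaseV2 {

    // One shared mini cluster per test class, managed by JUnit 5.
    @RegisterExtension
    static final MiniClusterExtension MINI_CLUSTER = new MiniClusterExtension();

    protected TableEnvironment tEnv;

    @BeforeEach
    void setup() {
        tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
    }
}
{code}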

> [JUnit5 Migration] Module: flink-table-planner (StreamingTestBase)
> --
>
> Key: FLINK-32997
> URL: https://issues.apache.org/jira/browse/FLINK-32997
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Jiabao Sun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> JUnit5 Migration Module: flink-table-planner (StreamingTestBase)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32993) Datagen connector produce unexpected Char type data

2023-08-29 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760234#comment-17760234
 ] 

Yunhong Zheng commented on FLINK-32993:
---

Hi, [~liyubin117] . This is by-design behavior. If you want to specify the 
length of a CHAR column, you need to add 'fields.#.length' to the table options, 
as described in the 
[document|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/table/datagen/#fields-length].
 Specifying char(50) in the schema has no effect on the generated length (the 
default is 100). In your case, your DDL needs to change to:
{code:java}
create table t1(name string, addr string) with ('connector' = 'datagen', 
'fields.name.length' = '50');{code}
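
Equivalently, a hedged sketch using the CHAR column from your report (this 
assumes the same 'fields.#.length' option is honored for CHAR as for STRING):
{code:java}
CREATE TABLE t1 (
  name CHAR(50),
  addr STRING
) WITH (
  'connector' = 'datagen',
  'fields.name.length' = '50'
);{code}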

> Datagen connector produce unexpected Char type data
> ---
>
> Key: FLINK-32993
> URL: https://issues.apache.org/jira/browse/FLINK-32993
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Affects Versions: 1.19.0
>Reporter: Yubin Li
>Priority: Major
> Attachments: image-2023-08-30-11-47-05-887.png, 
> image-2023-08-30-11-47-43-719.png, image-2023-08-30-11-51-44-498.png
>
>
> create table as follows:
> !image-2023-08-30-11-51-44-498.png!
> results:
> !image-2023-08-30-11-47-05-887.png!
> I have found that Char type data length is 100, which same as String type, it 
> is caused by the two types use the same data generation logic, and we could 
> correct it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25242) UDF with primitive int argument does not accept int values even after a not null filter

2023-08-25 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758873#comment-17758873
 ] 

Yunhong Zheng commented on FLINK-25242:
---

Hi, [~lincoln.86xy] . I found that this case cannot be reproduced on the master 
branch, but I can't find the fixing PR. So I think this issue can be marked as resolved.

> UDF with primitive int argument does not accept int values even after a not 
> null filter
> ---
>
> Key: FLINK-25242
> URL: https://issues.apache.org/jira/browse/FLINK-25242
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.15.0
>Reporter: Caizhi Weng
>Assignee: Yunhong Zheng
>Priority: Major
>
> Add the following test case to {{TableEnvironmentITCase}} to reproduce this 
> issue.
> {code:scala}
> @Test
> def myTest(): Unit = {
>   tEnv.executeSql("CREATE TEMPORARY FUNCTION MyUdf AS 
> 'org.apache.flink.table.api.MyUdf'")
>   tEnv.executeSql(
> """
>   |CREATE TABLE T (
>   |  a INT
>   |) WITH (
>   |  'connector' = 'values',
>   |  'bounded' = 'true'
>   |)
>   |""".stripMargin)
>   tEnv.executeSql("SELECT MyUdf(a) FROM T WHERE a IS NOT NULL").print()
> }
> {code}
> UDF code
> {code:scala}
> class MyUdf extends ScalarFunction {
>   def eval(a: Int): Int = {
> a + 1
>   }
> }
> {code}
> Exception stack
> {code}
> org.apache.flink.table.api.ValidationException: SQL validation failed. 
> Invalid function call:
> default_catalog.default_database.MyUdf(INT)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:168)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:107)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:219)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:736)
>   at 
> org.apache.flink.table.api.TableEnvironmentITCase.myTest(TableEnvironmentITCase.scala:97)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at 

[jira] [Commented] (FLINK-28518) Exception: AssertionError: Cannot add expression of different type to set in sub-query with ROW type

2023-08-25 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758872#comment-17758872
 ] 

Yunhong Zheng commented on FLINK-28518:
---

Hi, [~martijnvisser]  and [~KoylubaevNT] . This case is already fixed now that 
Flink has upgraded Calcite to 1.30. This issue can be closed.

> Exception: AssertionError: Cannot add expression of different type to set in 
> sub-query with ROW type
> 
>
> Key: FLINK-28518
> URL: https://issues.apache.org/jira/browse/FLINK-28518
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.15.1
>Reporter: Koylubaev Nikita
>Priority: Major
> Attachments: SubQueryRowTypeTest.scala, test.sql
>
>
> All scripts is attached to file: test.sql
> Create 2 tables:
>  
> {code:java}
> SET
> 'sql-client.execution.result-mode' = 'tableau';
> SET
> 'execution.runtime-mode' = 'batch';
> SET
> 'sql-client.execution.mode' = 'streaming';
> SET
> 'parallelism.default' = '8';
> SET
> 'table.dml-sync' = 'true';
> CREATE
> TEMPORARY TABLE fl (
> `id` INT,
> `name` STRING) 
> WITH (
> 'connector' = 'faker',
> 'number-of-rows' = '10',
> 'rows-per-second' = '1',
> 'fields.id.expression' = '#{number.numberBetween ''0'',''10''}',
> 'fields.name.expression' = '#{superhero.name}');
> CREATE
> TEMPORARY TABLE application (
> `id` INT,
> `fl_id` INT,
> `num` INT,
> `db` DOUBLE) 
> WITH (
> 'connector' = 'faker',
> 'number-of-rows' = '100',
> 'rows-per-second' = '100',
> 'fields.id.expression' = '#{number.numberBetween ''0'',''100''}',
> 'fields.fl_id.expression' = '#{number.numberBetween ''0'',''10''}',
> 'fields.num.expression' = '#{number.numberBetween 
> ''-2147483648'',''2147483647''}',
> 'fields.db.expression' = '#{number.randomDouble ''3'',''-1000'',''1000''}'); 
> {code}
> The next SQL throw exception:
> {code:java}
> select fl.name,
>(select (COLLECT(application.num), COLLECT(application.db))
> from application
> where fl.id = application.fl_id)
> from fl;{code}
> Error stack trace is (I marked what is different in type: it's just NOT NULL):
>  
>  
> {code:java}
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.AssertionError: Cannot add expression of different type to set:
> set type is RecordType(INTEGER id, VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" name, RecordType(INTEGER MULTISET EXPR$0, DOUBLE MULTISET EXPR$1) 
> $f0) NOT NULL
> expression type is RecordType(INTEGER id, VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" name, RecordType(INTEGER MULTISET EXPR$0, DOUBLE MULTISET EXPR$1) 
> NOT NULL $f0) NOT NULL
> set is 
> rel#129:LogicalCorrelate.NONE.any.[](left=HepRelVertex#119,right=HepRelVertex#128,correlation=$cor0,joinType=left,requiredColumns={0})
> expression is LogicalProject(id=[$0], name=[$1], $f0=[ROW($2, $3)])
>   LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{0}])
>     LogicalTableScan(table=[[default_catalog, default_database, fl]])
>     LogicalAggregate(group=[{}], agg#0=[COLLECT($0)], agg#1=[COLLECT($1)])
>       LogicalProject(num=[$2], db=[$3])
>         LogicalFilter(condition=[=($cor0.id, $1)])
>           LogicalTableScan(table=[[default_catalog, default_database, 
> application]])
>  {code}
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30201) Function "unnest" can't process nesting JSON properly

2023-08-25 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758852#comment-17758852
 ] 

Yunhong Zheng commented on FLINK-30201:
---

Hi, [~mooonzhang] . I think this is not a bug; you need to reference the column 
by its full name, like 'TableName.ColumnName.SubColumnName', instead of 
'ColumnName.SubColumnName'.  In your case, you need to use 
'riskRuleEngineResultLevel2_3.data.rule_results'.
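
Concretely, a hedged sketch of the corrected query (untested; only the UNNEST 
argument changes, the alias list is copied from your original query):
{code:java}
SELECT data.flow_id AS flow_id, t.rule_id, t.rule_name, t.rule_type,
       t.rule_type_name, t.node_id, t.`result`
FROM riskRuleEngineResultLevel2_3,
UNNEST(riskRuleEngineResultLevel2_3.data.rule_results) AS t
(rule_id, rule_name, rule_type, rule_type_name, node_id, `result`, policy_name, in_path)
{code}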

> Function "unnest" can't process nesting JSON properly
> -
>
> Key: FLINK-30201
> URL: https://issues.apache.org/jira/browse/FLINK-30201
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.16.0
>Reporter: zhangyue
>Priority: Major
>
> Here is the CREATE TABLE DDL:
> {code:java}
> riskRuleEngineResultLevel2_3 = CREATE TABLE `riskRuleEngineResultLevel2_3`(\
> `data` ROW<\
> `flow_id` STRING, \
> `flow_name` STRING, \
> `flow_version` STRING, \
> `risk_id` BIGINT, \
> `uid` BIGINT, \
> `is_pass` INT, \
> `result` INT, \
> `country_id` INT, \
> `business` STRING, \
> `engine_scene_id` STRING, \
> `flow_type` STRING, \
> `source` STRING, \
> `rule_results` ARRAY `rule_name` STRING, \
> `rule_type` STRING, \
> `rule_type_name` STRING, \
> `node_id` STRING, \
> `result` INT, \
> `policy_name` STRING, \
> `in_path` BOOLEAN>>\
> >,\
> proctime as proctime()\
> ) WITH (\
> 'connector' = 'kafka',\
> 'topic' = 'riskRuleEngineResultLevel2_3',\
> 'scan.startup.mode' = '%s',\
> 'properties.bootstrap.servers' = '%s',\
> 'properties.group.id' = '%s',\
> 'format' = 'json'\
> ) {code}
> flink sql:
> {code:java}
> String executeSql = "select data.flow_id as 
> flow_id,t.rule_id,t.rule_name,t.rule_type,t.rule_type_name,t.node_id,t.`result`
>  from riskRuleEngineResultLevel2_3, unnest(data.rule_results) as t 
> (rule_id,rule_name,rule_type,rule_type_name,node_id,`result`,policy_name,in_path)";
>  {code}
>   When the param in the "unnest" function is "data.rule_results", which is 
> actually the right structure, the ERROR below occurs. And when I use 
> "rule_results" instead of "data.rule_results" in the "unnest" function, it goes 
> well. I think it is weird.
> {code:java}
> // Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. From line 0, column 0 to line 1, column 149: Column 
> 'data.data' not found in table 'riskRuleEngineResultLevel2_3'
>     at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:186)
>     at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:113)
>     at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:261)
>     at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:106)
>     at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:723)
>     at 
> com.akulaku.flink_tasks_project.tasks.FlowsRuleResultRiskCalc.main(FlowsRuleResultRiskCalc.java:42)
> Caused by: org.apache.calcite.runtime.CalciteContextException: From line 0, 
> column 0 to line 1, column 149: Column 'data.data' not found in table 
> 'riskRuleEngineResultLevel2_3'
>     at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>  Method)
>     at 
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at 
> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
>     at 
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
>     at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:883)
>     at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:868)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4867)
>     at 
> org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:439)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visit(SqlValidatorImpl.java:5839)
>     at 
> 

[jira] [Commented] (FLINK-32940) Support projection pushdown to table source for column projections through UDTF

2023-08-24 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758824#comment-17758824
 ] 

Yunhong Zheng commented on FLINK-32940:
---

Hi, [~vsowrirajan] . I think your idea is good, but the biggest problem currently 
is how to pass the column pruning information from LogicalTableFunctionScan to 
LogicalTableScan and rewrite LogicalTableFunctionScan. So I think we need to add 
a rule in the project_rewrite stage.
 # First, we need to add the rule CoreRules.ProjectCorrelateTransposeRule to 
FlinkBatchRuleSets to push the project into LogicalCorrelate.
 # Then we need to add a rule in the project_rewrite stage to push this project 
down to the LogicalTableScan side and rewrite LogicalTableFunctionScan.
 # For the NPE problem, you can add an if check to avoid it.

Here is an example to explain step 2.

For this DDL:

 
{code:java}
String ddl =
"CREATE TABLE NestedItemTable1 (\n"
+ "  `deptno` INT,\n"
+ "  `employees` MAP\n"
+ ") WITH (\n"
+ " 'connector' = 'values',\n"
+ " 'nested-projection-supported' = 'true',"
+ " 'bounded' = 'true'\n"
+ ")";
util.tableEnv().executeSql(ddl);

util.verifyRelPlan(
"SELECT t1.deptno, k FROM NestedItemTable1 t1, UNNEST(t1.employees) as 
f(k, v)");{code}
 

We will get the plan below after adding CoreRules.ProjectCorrelateTransposeRule:

 
{code:java}
optimize project_rewrite cost 413675 ms.
optimize result: 
LogicalProject(inputs=[0], exprs=[[$2]])
+- LogicalCorrelate(correlation=[$cor1], joinType=[inner], 
requiredColumns=[{1}])
   :- LogicalTableScan(table=[[default_catalog, default_database, 
NestedItemTable1]])
   +- LogicalProject(inputs=[0])
      +- LogicalTableFunctionScan(invocation=[$UNNEST_ROWS$1($cor0.employees)], 
rowType=[RecordType:peek_no_expand(VARCHAR(2147483647) f0, VARCHAR(2147483647) 
f1)]){code}
 

I think we need to add a new rule to match this pattern:

 
{code:java}
+- LogicalCorrelate
   :- LogicalTableScan
   +- LogicalProject
  +- LogicalTableFunctionScan{code}
 

In this rule, we first need to create a new LogicalTableFunctionScan by merging 
LogicalProject and LogicalTableFunctionScan.

Second, we need to add a new LogicalProject on top of LogicalTableScan, which 
will be pushed down into LogicalTableScan in the logical stage.

IMO, the new plan after matching this rule will be (just an example, not the 
exact plan):

 
{code:java}
LogicalProject(inputs=[0], exprs=[[$2]])
+- LogicalCorrelate(correlation=[$cor1], joinType=[inner], 
requiredColumns=[{1}])
   :- LogicalProject(inputs=[employees.k])    
  +- LogicalTableScan(table=[[default_catalog, default_database, 
NestedItemTable1]])
   :- LogicalTableFunctionScan(invocation=[$UNNEST_ROWS$1($cor0.employees.k)], 
rowType=[RecordType:peek_no_expand(VARCHAR(2147483647) f0)]){code}
 

WDYT, [~vsowrirajan]? Once the solution is determined and you complete the 
development, you can ping me to review it. A structural sketch of the proposed 
rule follows.
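
A skeleton only (my assumption of the rule's shape, not a working 
implementation; the class name and the steps in the comments are hypothetical), 
based on Calcite's RelOptRule API:
{code:java}
import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.plan.RelOptRuleCall;
import org.apache.calcite.rel.logical.LogicalCorrelate;
import org.apache.calcite.rel.logical.LogicalProject;
import org.apache.calcite.rel.logical.LogicalTableFunctionScan;
import org.apache.calcite.rel.logical.LogicalTableScan;

public class ProjectCorrelateScanRewriteRule extends RelOptRule {

    public ProjectCorrelateScanRewriteRule() {
        // Matches the shape left behind by ProjectCorrelateTransposeRule:
        // Correlate(TableScan, Project(TableFunctionScan)).
        super(operand(LogicalCorrelate.class,
                operand(LogicalTableScan.class, none()),
                operand(LogicalProject.class,
                        operand(LogicalTableFunctionScan.class, none()))));
    }

    @Override
    public void onMatch(RelOptRuleCall call) {
        final LogicalCorrelate correlate = call.rel(0);
        final LogicalTableScan scan = call.rel(1);
        final LogicalProject project = call.rel(2);
        final LogicalTableFunctionScan functionScan = call.rel(3);
        // 1. Merge `project` into `functionScan` so the UNNEST invocation only
        //    references the fields it really needs (e.g. $cor0.employees.k).
        // 2. Build a new LogicalProject over `scan` keeping only the pruned
        //    nested fields; the logical phase can push it into the source.
        // 3. Rebuild the correlate over the two new inputs and call
        //    call.transformTo(...) with the result.
    }
}
{code}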

> Support projection pushdown to table source for column projections through 
> UDTF
> ---
>
> Key: FLINK-32940
> URL: https://issues.apache.org/jira/browse/FLINK-32940
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Venkata krishnan Sowrirajan
>Priority: Major
>
> Currently, Flink doesn't push down columns projected through UDTF like 
> _UNNEST_ to the table source.
> For eg:
> {code:java}
> SELECT t1.deptno, t2.ename FROM db.dept_nested t1, UNNEST(t1.employees) AS 
> t2{code}
> For the above SQL, Flink projects all the columns for DEPT_NESTED rather than 
> only _name_ and {_}employees{_}. If the table source supports nested fields 
> column projection, ideally it should project only _t1.employees.ename_ from 
> the table source.
> Query plan:
> {code:java}
> == Abstract Syntax Tree ==
> LogicalProject(deptno=[$0], ename=[$5])
> +- LogicalCorrelate(correlation=[$cor1], joinType=[inner], 
> requiredColumns=[{3}])
>    :- LogicalTableScan(table=[[hive_catalog, db, dept_nested]])
>    +- Uncollect
>       +- LogicalProject(employees=[$cor1.employees])
>          +- LogicalValues(tuples=[[{ 0 }]]){code}
> {code:java}
> == Optimized Physical Plan ==
> Calc(select=[deptno, ename])
> +- Correlate(invocation=[$UNNEST_ROWS$1($cor1.employees)], 
> correlate=[table($UNNEST_ROWS$1($cor1.employees))], 
> select=[deptno,name,skillrecord,employees,empno,ename,skills], 
> rowType=[RecordType(BIGINT deptno, VARCHAR(2147483647) name, 
> RecordType:peek_no_expand(VARCHAR(2147483647) skilltype, VARCHAR(2147483647) 
> desc, RecordType:peek_no_expand(VARCHAR(2147483647) a, VARCHAR(2147483647) b) 
> others) skillrecord, RecordType:peek_no_expand(BIGINT empno, 
> VARCHAR(2147483647) ename, 

[jira] [Updated] (FLINK-32952) Scan reuse with readable metadata and watermark push down will get wrong watermark

2023-08-24 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-32952:
--
Description: Scan reuse with readable metadata and watermark push down will get 
a wrong result. In the class ScanReuser, we re-build the watermark spec after 
projection push down. However, we get a wrong index while trying to find the 
index in the new source type.  (was: Source reuse with readable metadata and 
watermark push down will get wrong result. In class SourceReuser, we will 
re-build watermark spec after projection push down. However, we will get wrong 
index while try to find index in new source type.)

> Scan reuse with readable metadata and watermark push down will get wrong 
> watermark 
> ---
>
> Key: FLINK-32952
> URL: https://issues.apache.org/jira/browse/FLINK-32952
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.18.0
>
>
> Scan reuse with readable metadata and watermark push down will get a wrong 
> result. In the class ScanReuser, we re-build the watermark spec after projection 
> push down. However, we get a wrong index while trying to find the index in the 
> new source type.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32952) Scan reuse with readable metadata and watermark push down will get wrong watermark

2023-08-24 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-32952:
--
Summary: Scan reuse with readable metadata and watermark push down will get 
wrong watermark   (was: Source reuse with readable metadata and watermark push 
down will get wrong watermark )

> Scan reuse with readable metadata and watermark push down will get wrong 
> watermark 
> ---
>
> Key: FLINK-32952
> URL: https://issues.apache.org/jira/browse/FLINK-32952
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.18.0
>
>
> Source reuse with readable metadata and watermark push down will get a wrong 
> result. In the class SourceReuser, we re-build the watermark spec after 
> projection push down. However, we get a wrong index while trying to find the 
> index in the new source type.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32952) Source reuse with readable metadata and watermark push down will get wrong watermark

2023-08-24 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-32952:
-

 Summary: Source reuse with readable metadata and watermark push 
down will get wrong watermark 
 Key: FLINK-32952
 URL: https://issues.apache.org/jira/browse/FLINK-32952
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Yunhong Zheng
 Fix For: 1.18.0


Source reuse with readable metadata and watermark push down will get a wrong 
result. In the class SourceReuser, we re-build the watermark spec after 
projection push down. However, we get a wrong index while trying to find the 
index in the new source type.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25242) UDF with primitive int argument does not accept int values even after a not null filter

2023-08-24 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758465#comment-17758465
 ] 

Yunhong Zheng commented on FLINK-25242:
---

Yes, [~lincoln.86xy] . Thank you for your reminder. I still want to fix this 
issue; could you assign it to me? Thanks.

> UDF with primitive int argument does not accept int values even after a not 
> null filter
> ---
>
> Key: FLINK-25242
> URL: https://issues.apache.org/jira/browse/FLINK-25242
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.15.0
>Reporter: Caizhi Weng
>Priority: Major
>
> Add the following test case to {{TableEnvironmentITCase}} to reproduce this 
> issue.
> {code:scala}
> @Test
> def myTest(): Unit = {
>   tEnv.executeSql("CREATE TEMPORARY FUNCTION MyUdf AS 
> 'org.apache.flink.table.api.MyUdf'")
>   tEnv.executeSql(
> """
>   |CREATE TABLE T (
>   |  a INT
>   |) WITH (
>   |  'connector' = 'values',
>   |  'bounded' = 'true'
>   |)
>   |""".stripMargin)
>   tEnv.executeSql("SELECT MyUdf(a) FROM T WHERE a IS NOT NULL").print()
> }
> {code}
> UDF code
> {code:scala}
> class MyUdf extends ScalarFunction {
>   def eval(a: Int): Int = {
> a + 1
>   }
> }
> {code}
> Exception stack
> {code}
> org.apache.flink.table.api.ValidationException: SQL validation failed. 
> Invalid function call:
> default_catalog.default_database.MyUdf(INT)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:168)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:107)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:219)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:736)
>   at 
> org.apache.flink.table.api.TableEnvironmentITCase.myTest(TableEnvironmentITCase.scala:97)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 

[jira] (FLINK-32296) Flink SQL handle array of row incorrectly

2023-08-23 Thread Yunhong Zheng (Jira)


[ https://issues.apache.org/jira/browse/FLINK-32296 ]


Yunhong Zheng deleted comment on FLINK-32296:
---

was (Author: JIRAUSER287975):
Hi, [~Sergey Nuyanzin]. Sorry for not checking at the beginning whether the same 
issue already existed in Jira; initially I thought it was a bug in the Kafka 
connector. This is indeed a duplicate issue and it can be closed. Thanks for 
your contribution!

> Flink SQL handle array of row incorrectly
> -
>
> Key: FLINK-32296
> URL: https://issues.apache.org/jira/browse/FLINK-32296
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.15.3, 1.16.2, 1.17.1
>Reporter: Lim Qing Wei
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.3, 1.17.2, 1.19.0
>
>
> FlinkSQL produces incorrect results when involving data with type of 
> ARRAY, here's a reproduction:
>  
>  
> {code:java}
> CREATE TEMPORARY VIEW bug_data as (
> SELECT CAST(ARRAY[
> (10, '2020-01-10'), (101, '244ddf'), (1011, '2asdfaf'), (1110, '200'), (2210, 
> '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION
> SELECT CAST(ARRAY[
> (10, '2020-01-10'), (121, '244ddf'), (, '2asdfaf'), (32243, '200'), 
> (2210, '3-01-10'), (4410, '23243243')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (222, '244ddf'), (1011, '2asdfaf'), (1110, '200'), 
> (24367, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (5666, '244ddf'), (435243, '2asdfaf'), (56567, '200'), 
> (2210, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (43543, '244ddf'), (1011, '2asdfaf'), (1110, '200'), 
> (8967564, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> );
> CREATE TABLE sink (
> r ARRAY>
> ) WITH ('connector' = 'print'); {code}
>  
>  
> In all 1.15. 1.16 and 1.17 version I've tested, it produces the following:
>  
> {noformat}
> [+I[4410, 2], +I[4410, 2], +I[4410, 2], +I[4410, 2], +I[4410, 
> 2], +I[4410, 2]]
> [+I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 
> 23243243], +I[4410, 23243243], +I[4410, 23243243]]{noformat}
>  
>  
> I think this is unexpected/wrong because:
>  # The query should produce 5 rows, not 2
>  # The data is also wrong; notice it just makes every row in the array the 
> same, but the inputs are not the same.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32897) RowToRowCastRule will get wrong result while cast Array(Row()) type

2023-08-23 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758324#comment-17758324
 ] 

Yunhong Zheng commented on FLINK-32897:
---

Hi, [~Sergey Nuyanzin]. Sorry for not checking at the beginning whether the same 
issue already existed in Jira; initially I thought it was a bug in the Kafka 
connector. This is indeed a duplicate issue and it can be closed. Thanks for 
your contribution!

> RowToRowCastRule will get wrong result while cast Array(Row()) type
> ---
>
> Key: FLINK-32897
> URL: https://issues.apache.org/jira/browse/FLINK-32897
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Yunhong Zheng
>Assignee: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
> Attachments: codegen.java, image-2023-08-21-08-48-25-116.png
>
>
> {code:java}
> // code placeholder
> CREATE TABLE kafkaTest (
>   a STRING NOT NULL,
>   config ARRAY   NOT NULL,autocreate BOOLEAN NOT NULL> NOT NULL> NOT NULL,
>   ingestionTime TIMESTAMP(3) METADATA FROM 'timestamp',
>   PRIMARY KEY (businessEvent) NOT ENFORCED) 
> WITH (
>  'connector' = 'kafka',
>   'topic' = 'test',
>   'properties.group.id' = 'testGroup',
>   'scan.startup.mode' = 'earliest-offset',
>   'properties.bootstrap.servers' = '', 
>   'properties.security.protocol' = 'SASL_SSL',
>   'properties.sasl.mechanism' = 'PLAIN',
>   'properties.sasl.jaas.config' = ';',
>   'value.format' = 'json',
>   'sink.partitioner' = 'fixed'
> ); {code}
> If we run the following INSERT, we see that the last item in the array 
> is placed in the topic 3 times and the first two are ignored.
> {code:java}
> // code placeholder
> INSERT INTO kafkaTest VALUES ('Transaction', ARRAY[ROW('G', 'IT', 
> true),ROW('H', 'FR', true), ROW('I', 'IT', false)], TIMESTAMP '2023-08-30 
> 14:01:00'); {code}
> The result:
> !image-2023-08-21-08-48-25-116.png|width=296,height=175!
> If I use the 'print' sink, I can get the right result. So I think this is a 
> bug in the 'kafka' connector.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32296) Flink SQL handle array of row incorrectly

2023-08-23 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758323#comment-17758323
 ] 

Yunhong Zheng commented on FLINK-32296:
---

Hi, [~Sergey Nuyanzin]. Sorry for not checking at the beginning whether the same 
issue already existed in Jira; initially I thought it was a bug in the Kafka 
connector. This is indeed a duplicate issue and it can be closed. Thanks for 
your contribution!

> Flink SQL handle array of row incorrectly
> -
>
> Key: FLINK-32296
> URL: https://issues.apache.org/jira/browse/FLINK-32296
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.15.3, 1.16.2, 1.17.1
>Reporter: Lim Qing Wei
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.3, 1.17.2, 1.19.0
>
>
> FlinkSQL produces incorrect results when involving data with type of 
> ARRAY, here's a reproduction:
>  
>  
> {code:java}
> CREATE TEMPORARY VIEW bug_data as (
> SELECT CAST(ARRAY[
> (10, '2020-01-10'), (101, '244ddf'), (1011, '2asdfaf'), (1110, '200'), (2210, 
> '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION
> SELECT CAST(ARRAY[
> (10, '2020-01-10'), (121, '244ddf'), (, '2asdfaf'), (32243, '200'), 
> (2210, '3-01-10'), (4410, '23243243')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (222, '244ddf'), (1011, '2asdfaf'), (1110, '200'), 
> (24367, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (5666, '244ddf'), (435243, '2asdfaf'), (56567, '200'), 
> (2210, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> UNION SELECT CAST(ARRAY[
> (10, '2020-01-10'), (43543, '244ddf'), (1011, '2asdfaf'), (1110, '200'), 
> (8967564, '20-01-10'), (4410, '2')
> ] AS ARRAY>)
> );
> CREATE TABLE sink (
> r ARRAY>
> ) WITH ('connector' = 'print'); {code}
>  
>  
> In all 1.15. 1.16 and 1.17 version I've tested, it produces the following:
>  
> {noformat}
> [+I[4410, 2], +I[4410, 2], +I[4410, 2], +I[4410, 2], +I[4410, 
> 2], +I[4410, 2]]
> [+I[4410, 23243243], +I[4410, 23243243], +I[4410, 23243243], +I[4410, 
> 23243243], +I[4410, 23243243], +I[4410, 23243243]]{noformat}
>  
>  
> I think this is unexpected/wrong because:
>  # The query should produce 5 rows, not 2
>  # The data is also wrong; notice it just makes every row in the array the 
> same, but the inputs are not the same.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32940) Support projection pushdown to table source for column projections through UDTF

2023-08-23 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757996#comment-17757996
 ] 

Yunhong Zheng commented on FLINK-32940:
---

Sure, [~jark] . Please assign it to me. Thanks!

> Support projection pushdown to table source for column projections through 
> UDTF
> ---
>
> Key: FLINK-32940
> URL: https://issues.apache.org/jira/browse/FLINK-32940
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Venkata krishnan Sowrirajan
>Priority: Major
>
> Currently, Flink doesn't push down columns projected through UDTF like 
> _UNNEST_ to the table source.
> For eg:
> {code:java}
> select t1.name, t2.ename from DEPT_NESTED as t1, unnest(t1.employees) as 
> t2{code}
> For the above SQL, Flink projects all the columns for DEPT_NESTED rather than 
> only _name_ and {_}employees{_}. If the table source supports nested fields 
> column projection, ideally it should project only _t1.employees.ename_ from 
> the table source.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32897) RowToRowCastRule will get wrong result while cast Array(Row()) type

2023-08-21 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-32897:
--
Summary: RowToRowCastRule will get wrong result while cast Array(Row()) 
type  (was: Kafka with nested array row type will get wrong result)

> RowToRowCastRule will get wrong result while cast Array(Row()) type
> ---
>
> Key: FLINK-32897
> URL: https://issues.apache.org/jira/browse/FLINK-32897
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Yunhong Zheng
>Assignee: Yunhong Zheng
>Priority: Major
> Fix For: 1.19.0
>
> Attachments: codegen.java, image-2023-08-21-08-48-25-116.png
>
>
> {code:java}
> // code placeholder
> CREATE TABLE kafkaTest (
>   a STRING NOT NULL,
>   config ARRAY   NOT NULL,autocreate BOOLEAN NOT NULL> NOT NULL> NOT NULL,
>   ingestionTime TIMESTAMP(3) METADATA FROM 'timestamp',
>   PRIMARY KEY (businessEvent) NOT ENFORCED) 
> WITH (
>  'connector' = 'kafka',
>   'topic' = 'test',
>   'properties.group.id' = 'testGroup',
>   'scan.startup.mode' = 'earliest-offset',
>   'properties.bootstrap.servers' = '', 
>   'properties.security.protocol' = 'SASL_SSL',
>   'properties.sasl.mechanism' = 'PLAIN',
>   'properties.sasl.jaas.config' = ';',
>   'value.format' = 'json',
>   'sink.partitioner' = 'fixed'
> ); {code}
> If we run the following INSERT, we see that the last item in the array 
> is placed in the topic 3 times and the first two are ignored.
> {code:java}
> // code placeholder
> INSERT INTO kafkaTest VALUES ('Transaction', ARRAY[ROW('G', 'IT', 
> true),ROW('H', 'FR', true), ROW('I', 'IT', false)], TIMESTAMP '2023-08-30 
> 14:01:00'); {code}
> The result:
> !image-2023-08-21-08-48-25-116.png|width=296,height=175!
> If I use the 'print' sink, I can get the right result. So I think this is a 
> bug in the 'kafka' connector.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32897) Kafka with nested array row type will get wrong result

2023-08-21 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756725#comment-17756725
 ] 

Yunhong Zheng commented on FLINK-32897:
---

Thanks, [~renqs] . Can you assign this issue to me? I will try to fix it.

> Kafka with nested array row type will get wrong result
> --
>
> Key: FLINK-32897
> URL: https://issues.apache.org/jira/browse/FLINK-32897
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.16.2, 1.18.0, 1.17.1
>Reporter: Yunhong Zheng
>Assignee: Yunhong Zheng
>Priority: Major
> Fix For: 1.19.0
>
> Attachments: codegen.java, image-2023-08-21-08-48-25-116.png
>
>
> {code:java}
> // code placeholder
> CREATE TABLE kafkaTest (
>   a STRING NOT NULL,
>   config ARRAY   NOT NULL,autocreate BOOLEAN NOT NULL> NOT NULL> NOT NULL,
>   ingestionTime TIMESTAMP(3) METADATA FROM 'timestamp',
>   PRIMARY KEY (businessEvent) NOT ENFORCED) 
> WITH (
>  'connector' = 'kafka',
>   'topic' = 'test',
>   'properties.group.id' = 'testGroup',
>   'scan.startup.mode' = 'earliest-offset',
>   'properties.bootstrap.servers' = '', 
>   'properties.security.protocol' = 'SASL_SSL',
>   'properties.sasl.mechanism' = 'PLAIN',
>   'properties.sasl.jaas.config' = ';',
>   'value.format' = 'json',
>   'sink.partitioner' = 'fixed'
> ); {code}
> If we run the following INSERT, we see that the last item in the array 
> is placed in the topic 3 times and the first two are ignored.
> {code:java}
> // code placeholder
> INSERT INTO kafkaTest VALUES ('Transaction', ARRAY[ROW('G', 'IT', 
> true),ROW('H', 'FR', true), ROW('I', 'IT', false)], TIMESTAMP '2023-08-30 
> 14:01:00'); {code}
> The result:
> !image-2023-08-21-08-48-25-116.png|width=296,height=175!
> If I use the 'print' sink, I can get the right result. So I think this is a 
> bug in the 'kafka' connector.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32897) Kafka with nested array row type will get wrong result

2023-08-20 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-32897:
-

 Summary: Kafka with nested array row type will get wrong result
 Key: FLINK-32897
 URL: https://issues.apache.org/jira/browse/FLINK-32897
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.18.0
Reporter: Yunhong Zheng
 Fix For: 1.19.0
 Attachments: image-2023-08-21-08-48-25-116.png

{code:java}
// code placeholder
CREATE TABLE kafkaTest (
  a STRING NOT NULL,
  config ARRAY NOT NULL> NOT NULL,
  ingestionTime TIMESTAMP(3) METADATA FROM 'timestamp',
  PRIMARY KEY (businessEvent) NOT ENFORCED) 
WITH (
 'connector' = 'kafka',
  'topic' = 'test',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'earliest-offset',
  'properties.bootstrap.servers' = '', 
  'properties.security.protocol' = 'SASL_SSL',
  'properties.sasl.mechanism' = 'PLAIN',
  'properties.sasl.jaas.config' = ';',
  'value.format' = 'json',
  'sink.partitioner' = 'fixed'
); {code}
If we run the following INSERT, we see that the last item in the array is 
placed in the topic 3 times and the first two are ignored.
{code:java}
// code placeholder
INSERT INTO kafkaTest VALUES ('Transaction', ARRAY[ROW('G', 'IT', 
true),ROW('H', 'FR', true), ROW('I', 'IT', false)], TIMESTAMP '2023-08-30 
14:01:00'); {code}
The result:

!image-2023-08-21-08-48-25-116.png|width=296,height=175!

If I use the 'print' sink, I can get the right result. So I think this is a bug 
in the 'kafka' connector.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32321) Temporal Join job missing condition after “ON”

2023-08-10 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752710#comment-17752710
 ] 

Yunhong Zheng commented on FLINK-32321:
---

Hi, [~jark] . Could you assign this issue to me? Thanks!

I think we shouldn't delete a constant lookup condition on the primary key, like 
'rule_type = 0', when we push that constant condition down to the source. 
Otherwise the lookup join may be left without any join condition after pushdown, 
as the case below shows in streaming mode (Flink 1.18):
{code:java}
util.addTable("""
|CREATE TABLE MyTable (
|  `a` INT,
|  `b` STRING,
|  `c` INT,
|   PROCTIME()
|) WITH (
|  'connector' = 'values'
|)
|""".stripMargin)

util.addTable("""
|CREATE TABLE LookupTableWithFilterPushDown (
|  `id` INT,
|  `name` STRING,
|  `age` INT,
|  PRIMARY KEY(age) NOT ENFORCED
|) WITH (
|  'connector' = 'values',
|  'filterable-fields' = 'age'
|)
|""".stripMargin)
val sql =
  """
| SELECT * FROM MyTable AS T LEFT JOIN LookupTableWithFilterPushDown
| FOR SYSTEM_TIME AS OF T.proctime D
| ON D.age = 100
|""".stripMargin
util.verifyExecPlan(sql)

{code}
The physical plan will be :
{code:java}
optimize result: 
Calc(select=[a, b, c, PROCTIME_MATERIALIZE(proctime) AS proctime, rowtime, id, 
name, age])
+- 
LookupJoin(table=[default_catalog.default_database.LookupTableWithFilterPushDown],
 joinType=[LeftOuterJoin], lookup=[], select=[a, b, c, proctime, rowtime, id, 
name, age])
   +- DataStreamScan(table=[[default_catalog, default_database, MyTable]], 
fields=[a, b, c, proctime, rowtime]) {code}
I think this will produce a wrong lookup join result, since no equality join 
conditions remain after the filter push down. For contrast, see the sketch below.
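
A hedged sketch (my assumption, not a verified fix): if the query keeps at least 
one equality condition on a lookup key in addition to the constant one, the 
LookupJoin should retain a non-empty lookup clause:
{code:java}
SELECT * FROM MyTable AS T LEFT JOIN LookupTableWithFilterPushDown
FOR SYSTEM_TIME AS OF T.proctime D
ON D.age = 100 AND D.id = T.a
{code}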

> Temporal Join job missing condition after “ON”
> --
>
> Key: FLINK-32321
> URL: https://issues.apache.org/jira/browse/FLINK-32321
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.1
>Reporter: macdoor615
>Priority: Major
>
> We have a SQL job, like this
> {code:java}
> select ... from prod_kafka.f_alarm_tag_dev
>  /*+ OPTIONS('scan.startup.mode' = 'latest-offset') */ as f 
> left join mysql_bnpmp.gem_bnpmp.f_a_alarm_filter
>  /*+ OPTIONS('lookup.cache.max-rows' = '5000',
>  'lookup.cache.ttl' = '30s') */
> FOR SYSTEM_TIME AS OF f.proctime ff  on ff.rule_type = 0 and f.ne_ip = ff.ip 
> {code}
> We submit to flink 1.17.1 cluster with sql-gateway. We found job detail 
> missing lookup condition (rule_type=0) 
> {code:java}
>   +- [1196]:LookupJoin(table=[mysql_bnpmp.gem_bnpmp.f_a_alarm_filter], 
> joinType=[LeftOuterJoin], lookup=[ip=ne_ip], select=[event_id, {code}
> We submit same sql to flink 1.17.0 cluster with sql-gateway. There is 
> (rule_type=0) lookup condition
> {code:java}
>   +- [3]:LookupJoin(table=[mysql_bnpmp.gem_bnpmp.f_a_alarm_filter], 
> joinType=[LeftOuterJoin], lookup=[rule_type=0, ip=ne_ip], where=[(rule_type = 
> 0)], select=[event_id, severity,{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-27402) Unexpected boolean expression simplification for AND expression

2023-08-08 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752024#comment-17752024
 ] 

Yunhong Zheng commented on FLINK-27402:
---

Currently, Flink does not have the ability to perform implicit conversions, so 
this error needs to be solved on the Calcite side, which would require 
re-implementing RexSimplify.simplifyAnd(). A possible user-side workaround is 
sketched below.
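
An untested workaround sketch (assumptions: the table is named t, since its name 
was lost in the report, and Flink accepts an explicit STRING-to-BOOLEAN cast 
here). Making the comparison explicitly boolean should stop the simplifier from 
producing a non-boolean term:
{code:java}
select * from t where c1 = 1 and cast(c2 as boolean) = true;
{code}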

> Unexpected boolean expression simplification  for AND expression 
> -
>
> Key: FLINK-27402
> URL: https://issues.apache.org/jira/browse/FLINK-27402
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: luoyuxia
>Assignee: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
>
> Flink supports comparing string and boolean, so the following sql 
> can work fine
>  
> {code:java}
> create table (c1 int, c2 string);
> select * from c2 = true;
> {code}
> But the following sql will throw an exception 
>  
> {code:java}
> select * from c1 = 1 and c2 = true; {code}
> The reason is that Flink will try to simplify BOOLEAN expressions if possible 
> in "c1 = 1 and c2 = true".
> So "c2 = true" will be simplified to "c2" by the following code in Flink:
>  
> {code:java}
> RexSimplify#simplifyAnd2ForUnknownAsFalse
> // Simplify BOOLEAN expressions if possible
> while (term.getKind() == SqlKind.EQUALS) {
> RexCall call = (RexCall) term;
> if (call.getOperands().get(0).isAlwaysTrue()) {
> term = call.getOperands().get(1);
> terms.set(i, term);
> continue;
> } else if (call.getOperands().get(1).isAlwaysTrue()) {
> term = call.getOperands().get(0);
> terms.set(i, term);
> continue;
> }
> break;
> } {code}
> So the expression will be reduced to "c1 = 1 and c2". But AND requires both 
> sides to be boolean expressions, and c2 is not a boolean expression since it 
> actually is a string.
> Then the exception "Boolean expression type expected" is thrown.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (FLINK-20255) Nested decorrelate failed

2023-08-08 Thread Yunhong Zheng (Jira)


[ https://issues.apache.org/jira/browse/FLINK-20255 ]


Yunhong Zheng deleted comment on FLINK-20255:
---

was (Author: JIRAUSER287975):
Hi, [~jark] . After reading the source code, I think this is not a bug caused by 
Flink. Could you help close this issue or change its type to a feature request?

> Nested decorrelate failed
> -
>
> Key: FLINK-20255
> URL: https://issues.apache.org/jira/browse/FLINK-20255
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0, 1.12.0
>Reporter: godfrey he
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> This issue is from ML 
> https://www.mail-archive.com/user@flink.apache.org/msg37746.html
> We can reproduce the issue through the following code
> {code:java}
> @FunctionHint(output = new DataTypeHint("ROW"))
> class SplitStringToRows extends TableFunction[Row] {
>   def eval(str: String, separator: String = ";"): Unit = {
> if (str != null) {
>   str.split(separator).foreach(s => collect(Row.of(s.trim(
> }
>   }
> }
> object Job {
>   def main(args: Array[String]): Unit = {
> val settings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
> val streamEnv = StreamExecutionEnvironment.getExecutionEnvironment
> val streamTableEnv = StreamTableEnvironment.create(streamEnv, settings)
> streamTableEnv.createTemporarySystemFunction(
>   "SplitStringToRows",
>   classOf[SplitStringToRows]
> ) // Class defined in previous email
> streamTableEnv.executeSql(
>   """
>   CREATE TABLE table2 (
> attr1 STRING,
> attr2 STRING,
> attr3 DECIMAL,
> attr4 DATE
>   ) WITH (
>'connector' = 'datagen'
>)""")
> val q2 = streamTableEnv.sqlQuery(
>   """
> SELECT
>   a.attr1 AS attr1,
>   attr2,
>   attr3,
>   attr4
> FROM table2 p, LATERAL TABLE(SplitStringToRows(p.attr1, ';')) AS 
> a(attr1)
> """)
> streamTableEnv.createTemporaryView("view2", q2)
> val q3 =
>   """
> SELECT
>   w.attr1,
>   p.attr3
> FROM table2 w
> LEFT JOIN LATERAL (
>   SELECT
> attr1,
> attr3
>   FROM (
> SELECT
>   attr1,
>   attr3,
>   ROW_NUMBER() OVER (
> PARTITION BY attr1
> ORDER BY
>   attr4 DESC NULLS LAST,
>   w.attr2 = attr2 DESC NULLS LAST
>   ) AS row_num
>   FROM view2)
>   WHERE row_num = 1) p
> ON (w.attr1 = p.attr1)
> """
> println(streamTableEnv.explainSql(q3))
>   }
> }
> {code}
> The reason is {{RelDecorrelator}} in Calcite can't handle such nested 
> decorrelate pattern now



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-20255) Nested decorrelate failed

2023-08-08 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17751959#comment-17751959
 ] 

Yunhong Zheng commented on FLINK-20255:
---

Hi, [~jark] . After reading the source code, I think this is not a bug caused by 
Flink. Could you help close this issue or change its type to a feature request?

> Nested decorrelate failed
> -
>
> Key: FLINK-20255
> URL: https://issues.apache.org/jira/browse/FLINK-20255
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0, 1.12.0
>Reporter: godfrey he
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> This issue is from ML 
> https://www.mail-archive.com/user@flink.apache.org/msg37746.html
> We can reproduce the issue through the following code
> {code:java}
> @FunctionHint(output = new DataTypeHint("ROW"))
> class SplitStringToRows extends TableFunction[Row] {
>   def eval(str: String, separator: String = ";"): Unit = {
> if (str != null) {
>   str.split(separator).foreach(s => collect(Row.of(s.trim(
> }
>   }
> }
> object Job {
>   def main(args: Array[String]): Unit = {
> val settings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
> val streamEnv = StreamExecutionEnvironment.getExecutionEnvironment
> val streamTableEnv = StreamTableEnvironment.create(streamEnv, settings)
> streamTableEnv.createTemporarySystemFunction(
>   "SplitStringToRows",
>   classOf[SplitStringToRows]
> ) // Class defined in previous email
> streamTableEnv.executeSql(
>   """
>   CREATE TABLE table2 (
> attr1 STRING,
> attr2 STRING,
> attr3 DECIMAL,
> attr4 DATE
>   ) WITH (
>'connector' = 'datagen'
>)""")
> val q2 = streamTableEnv.sqlQuery(
>   """
> SELECT
>   a.attr1 AS attr1,
>   attr2,
>   attr3,
>   attr4
> FROM table2 p, LATERAL TABLE(SplitStringToRows(p.attr1, ';')) AS 
> a(attr1)
> """)
> streamTableEnv.createTemporaryView("view2", q2)
> val q3 =
>   """
> SELECT
>   w.attr1,
>   p.attr3
> FROM table2 w
> LEFT JOIN LATERAL (
>   SELECT
> attr1,
> attr3
>   FROM (
> SELECT
>   attr1,
>   attr3,
>   ROW_NUMBER() OVER (
> PARTITION BY attr1
> ORDER BY
>   attr4 DESC NULLS LAST,
>   w.attr2 = attr2 DESC NULLS LAST
>   ) AS row_num
>   FROM view2)
>   WHERE row_num = 1) p
> ON (w.attr1 = p.attr1)
> """
> println(streamTableEnv.explainSql(q3))
>   }
> }
> {code}
> The reason is {{RelDecorrelator}} in Calcite can't handle such nested 
> decorrelate pattern now



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-20255) Nested decorrelate failed

2023-08-08 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17751957#comment-17751957
 ] 

Yunhong Zheng commented on FLINK-20255:
---

Hi [~lam167] . By design, your example cannot work. Currently, 'UNNEST(xx) AS xx 
(xx) ON' only supports a condition equal to 'TRUE', like 'UNNEST(relatedUserIds) 
AS t (relatedUserId) ON TRUE'. Calcite translates UNNEST into a 'Correlate', 
which you can treat as a nested-loop join, so you cannot give it a condition 
that is not always true.

For your SQL example, you can convert the condition into a filter and add it to 
the WHERE clause, such as:
SELECT * 
FROM Messages 
CROSS JOIN UNNEST(relatedUserIds) AS t (relatedUserId) 
WHERE userId = t.relatedUserId

> Nested decorrelate failed
> -
>
> Key: FLINK-20255
> URL: https://issues.apache.org/jira/browse/FLINK-20255
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.11.0, 1.12.0
>Reporter: godfrey he
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> This issue is from ML 
> https://www.mail-archive.com/user@flink.apache.org/msg37746.html
> We can reproduce the issue through the following code
> {code:java}
> @FunctionHint(output = new DataTypeHint("ROW"))
> class SplitStringToRows extends TableFunction[Row] {
>   def eval(str: String, separator: String = ";"): Unit = {
> if (str != null) {
>   str.split(separator).foreach(s => collect(Row.of(s.trim(
> }
>   }
> }
> object Job {
>   def main(args: Array[String]): Unit = {
> val settings = 
> EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
> val streamEnv = StreamExecutionEnvironment.getExecutionEnvironment
> val streamTableEnv = StreamTableEnvironment.create(streamEnv, settings)
> streamTableEnv.createTemporarySystemFunction(
>   "SplitStringToRows",
>   classOf[SplitStringToRows]
> ) // Class defined in previous email
> streamTableEnv.executeSql(
>   """
>   CREATE TABLE table2 (
> attr1 STRING,
> attr2 STRING,
> attr3 DECIMAL,
> attr4 DATE
>   ) WITH (
>'connector' = 'datagen'
>)""")
> val q2 = streamTableEnv.sqlQuery(
>   """
> SELECT
>   a.attr1 AS attr1,
>   attr2,
>   attr3,
>   attr4
> FROM table2 p, LATERAL TABLE(SplitStringToRows(p.attr1, ';')) AS 
> a(attr1)
> """)
> streamTableEnv.createTemporaryView("view2", q2)
> val q3 =
>   """
> SELECT
>   w.attr1,
>   p.attr3
> FROM table2 w
> LEFT JOIN LATERAL (
>   SELECT
> attr1,
> attr3
>   FROM (
> SELECT
>   attr1,
>   attr3,
>   ROW_NUMBER() OVER (
> PARTITION BY attr1
> ORDER BY
>   attr4 DESC NULLS LAST,
>   w.attr2 = attr2 DESC NULLS LAST
>   ) AS row_num
>   FROM view2)
>   WHERE row_num = 1) p
> ON (w.attr1 = p.attr1)
> """
> println(streamTableEnv.explainSql(q3))
>   }
> }
> {code}
> The reason is {{RelDecorrelator}} in Calcite can't handle such nested 
> decorrelate pattern now



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-08-08 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17751907#comment-17751907
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Thanks [~mapohl] . I will do this work.

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32638) CI build failed because e2e_1_ci throw error Bash exited with code '1'

2023-07-21 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745473#comment-17745473
 ] 

Yunhong Zheng commented on FLINK-32638:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51510=results

> CI build failed because e2e_1_ci throw error Bash exited with code '1'
> --
>
> Key: FLINK-32638
> URL: https://issues.apache.org/jira/browse/FLINK-32638
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / CI
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Critical
> Fix For: 1.18.0
>
>
> CI build failed because e2e_1_ci throw error Bash exited with code '1'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-20 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745356#comment-17745356
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Hello, [~chesnay]. Could you please switch the allocator from glibc to 
libjemalloc in the CI Docker image? Without this change, the table-planner 
module occupies a very high amount of memory while running ITCases, and it is 
difficult to optimize that memory consumption inside Flink. The specific 
analysis is described above. Thanks very much.

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32638) CI build failed because e2e_1_ci throw error Bash exited with code '1'

2023-07-20 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-32638:
-

 Summary: CI build failed because e2e_1_ci throw error Bash exited 
with code '1'
 Key: FLINK-32638
 URL: https://issues.apache.org/jira/browse/FLINK-32638
 Project: Flink
  Issue Type: Improvement
  Components: Build System / CI
Affects Versions: 1.18.0
Reporter: Yunhong Zheng
 Fix For: 1.18.0


CI build failed because e2e_1_ci throw error Bash exited with code '1'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32579) The filter criteria on the lookup table of Lookup join has no effect

2023-07-13 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743009#comment-17743009
 ] 

Yunhong Zheng commented on FLINK-32579:
---

Hi, [~jasonliangyc] . I got it.

For question 1: why is there no filter condition "p.name = ''" in the 
LookupJoin RelNode?
 * This is normal, because the filter condition has been pushed down to the 
JDBC source (the JDBC source supports filter pushdown). A pushed-down condition 
is not displayed in the LookupJoin node.

For question 2: the wrong join result?
 * I think this is a bug in the JDBC lookup source: it did not apply the 
pushed-down filter condition correctly. After reading the code, I suspect the 
JDBC source does not evaluate this filter condition for the dimension table.
 * To quickly verify this, you can disable filter pushdown by setting 
'table.optimizer.source.predicate-pushdown-enabled' to 'false', as in the 
sketch below.
 * Also, if verification confirms the error is caused by the JDBC source, you 
can ping [~ruanhang1993] to take a look. 
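A minimal self-contained sketch of that verification step (the option key is 
the one mentioned above; the environment setup around it is illustrative):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DisablePredicatePushdown {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());
        // Turn off filter pushdown into sources, so the lookup join result
        // can be compared with and without the pushed-down predicate.
        tEnv.getConfig().getConfiguration().setString(
                "table.optimizer.source.predicate-pushdown-enabled", "false");
        // ... register the 'orders' and 'products' tables here and re-run
        // the lookup join query from the issue description ...
    }
}
{code}
In the SQL client, the same switch can be set with SET 
'table.optimizer.source.predicate-pushdown-enabled' = 'false';.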

 

> The filter criteria on the lookup table of Lookup join has no effect 
> -
>
> Key: FLINK-32579
> URL: https://issues.apache.org/jira/browse/FLINK-32579
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.17.0, 1.17.1
>Reporter: jasonliangyc
>Priority: Major
> Attachments: image-2023-07-12-09-31-18-261.png, 
> image-2023-07-12-09-42-59-231.png, image-2023-07-12-09-47-31-397.png, 
> image-2023-07-13-17-19-26-972.png, image-2023-07-13-22-35-35-696.png, 
> image-2023-07-13-22-38-16-709.png, image-2023-07-13-22-43-24-213.png, 
> image-2023-07-13-22-43-45-957.png, test_case.sql
>
>
> *1.* I joined two tables using the lookup join query below in sql-client; 
> the filter criterion (p.name = '??') didn't show up in the execution 
> detail, and it returned rows based on only one condition (cdc.product_id = 
> p.id)
> {code:java}
> select
> cdc.order_id,
> cdc.order_date,
> cdc.customer_name,
> cdc.price,
> p.name
> FROM orders AS cdc
> left JOIN products 
> FOR SYSTEM_TIME AS OF cdc.proc_time as p ON p.name = '??' and 
> cdc.product_id = p.id
> ; {code}
> !image-2023-07-12-09-31-18-261.png|width=657,height=132!
>  
> *2.* It showed weird results when I changed the query as below, because 
> there is no data in the table (products) where the value of column 'name' is 
> '??', and the execution detail didn't show us the WHERE criterion.
> {code:java}
> select
> cdc.order_id,
> cdc.order_date,
> cdc.customer_name,
> cdc.price,
> p.name
> FROM orders AS cdc
> left JOIN products 
> FOR SYSTEM_TIME AS OF cdc.proc_time as p ON cdc.product_id = p.id
> where p.name = '??'
> ; {code}
> !image-2023-07-12-09-42-59-231.png|width=684,height=102!
> !image-2023-07-12-09-47-31-397.png|width=685,height=120!
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32577) Avoid memory fragmentation when running CI for flink-table-planner module

2023-07-13 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-32577:
--
Description: 
This issue is a sub-issue of FLINK-18356.

 

 

  was:This issue is a sub-issue of FLINK-18356.


> Avoid memory fragmentation when running CI for flink-table-planner module
> -
>
> Key: FLINK-32577
> URL: https://issues.apache.org/jira/browse/FLINK-32577
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / CI, Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.18.0
>
>
> This issue is a sub-issue of FLINK-18356.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32579) The filter criteria on the lookup table of Lookup join has no effect

2023-07-13 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17742728#comment-17742728
 ] 

Yunhong Zheng commented on FLINK-32579:
---

Hi [~jasonliangyc] . Are you using Flink version 1.17.0? I am unable to 
reproduce the problem from your first figure in my local UT case. My SQL 
pattern is:
{code:java}
SELECT * FROM MyTable AS T LEFT JOIN LookupTable FOR SYSTEM_TIME AS OF 
T.proctime AS D ON T.b = '?' and T.a = D.id {code}
My resulting rel plan is:

!image-2023-07-13-17-19-26-972.png|width=708,height=49!

BTW, I don't quite understand what the problem in your second figure is. Can 
you clarify it? Thanks!

> The filter criteria on the lookup table of Lookup join has no effect 
> -
>
> Key: FLINK-32579
> URL: https://issues.apache.org/jira/browse/FLINK-32579
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.17.0, 1.17.1
>Reporter: jasonliangyc
>Priority: Major
> Attachments: image-2023-07-12-09-31-18-261.png, 
> image-2023-07-12-09-42-59-231.png, image-2023-07-12-09-47-31-397.png, 
> image-2023-07-13-17-19-26-972.png
>
>
> *1.* I joined two tables using the lookup join query below in sql-client; 
> the filter criterion (p.name = '??') didn't show up in the execution 
> detail, and it returned rows based on only one condition (cdc.product_id = 
> p.id)
> {code:java}
> select
> cdc.order_id,
> cdc.order_date,
> cdc.customer_name,
> cdc.price,
> p.name
> FROM orders AS cdc
> left JOIN products 
> FOR SYSTEM_TIME AS OF cdc.proc_time as p ON p.name = '??' and 
> cdc.product_id = p.id
> ; {code}
> !image-2023-07-12-09-31-18-261.png|width=657,height=132!
>  
> *2.* It showed weird results when I changed the query as below, because 
> there is no data in the table (products) where the value of column 'name' is 
> '??', and the execution detail didn't show us the WHERE criterion.
> {code:java}
> select
> cdc.order_id,
> cdc.order_date,
> cdc.customer_name,
> cdc.price,
> p.name
> FROM orders AS cdc
> left JOIN products 
> FOR SYSTEM_TIME AS OF cdc.proc_time as p ON cdc.product_id = p.id
> where p.name = '??'
> ; {code}
> !image-2023-07-12-09-42-59-231.png|width=684,height=102!
> !image-2023-07-12-09-47-31-397.png|width=685,height=120!
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32579) The filter criteria on the lookup table of Lookup join has no effect

2023-07-13 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-32579:
--
Attachment: image-2023-07-13-17-19-26-972.png

> The filter criteria on the lookup table of Lookup join has no effect 
> -
>
> Key: FLINK-32579
> URL: https://issues.apache.org/jira/browse/FLINK-32579
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.17.0, 1.17.1
>Reporter: jasonliangyc
>Priority: Major
> Attachments: image-2023-07-12-09-31-18-261.png, 
> image-2023-07-12-09-42-59-231.png, image-2023-07-12-09-47-31-397.png, 
> image-2023-07-13-17-19-26-972.png
>
>
> *1.* I joined two tables using the lookup join query below in sql-client; 
> the filter criterion (p.name = '??') didn't show up in the execution 
> detail, and it returned rows based on only one condition (cdc.product_id = 
> p.id)
> {code:java}
> select
> cdc.order_id,
> cdc.order_date,
> cdc.customer_name,
> cdc.price,
> p.name
> FROM orders AS cdc
> left JOIN products 
> FOR SYSTEM_TIME AS OF cdc.proc_time as p ON p.name = '??' and 
> cdc.product_id = p.id
> ; {code}
> !image-2023-07-12-09-31-18-261.png|width=657,height=132!
>  
> *2.* It showed weird results when I changed the query as below, because 
> there is no data in the table (products) where the value of column 'name' is 
> '??', and the execution detail didn't show us the WHERE criterion.
> {code:java}
> select
> cdc.order_id,
> cdc.order_date,
> cdc.customer_name,
> cdc.price,
> p.name
> FROM orders AS cdc
> left JOIN products 
> FOR SYSTEM_TIME AS OF cdc.proc_time as p ON cdc.product_id = p.id
> where p.name = '??'
> ; {code}
> !image-2023-07-12-09-42-59-231.png|width=684,height=102!
> !image-2023-07-12-09-47-31-397.png|width=685,height=120!
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32528) The RexCall a = a,if a's datatype is nullable, and when a is null, a = a is null, it isn't true in BinaryComparisonExprReducer

2023-07-11 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741974#comment-17741974
 ] 

Yunhong Zheng commented on FLINK-32528:
---

Hi, [~shenlang] , can you provide a specific SQL query as an example? I don't 
quite understand the statement 'if a's datatype is nullable, and when a is 
null, a = a is null, a <> a is null'. Thank you!

> The RexCall a = a,if a's datatype is nullable, and when a is null, a = a is 
> null, it isn't true in BinaryComparisonExprReducer
> --
>
> Key: FLINK-32528
> URL: https://issues.apache.org/jira/browse/FLINK-32528
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: LakeShen
>Priority: Major
>
> Now I'am reading flink sql planner's source code,when I saw the 
> FlinkRexUtil.java, in the 
> org.apache.flink.table.planner.plan.utils.FlinkRexUtil#simplify method,it 
> used the BinaryComparisonExprReducer the deal with BINARY_COMPARISON's 
> operator which the operands are RexInputRef and the operands are same, e.g. a 
> = a, a <> a,a >=a...
> In BinaryComparisonExprReducer, a = a,a <=a,a>=a will be simplified the true 
> literal,a <> a,a < a, a > a will be simplified the false literal.
> if a's datatype is nullable, and when a is null, a = a is null, a <> a is 
> null. In BinaryComparisonExprReducer's logic,It does not consider the case of 
> the nullable data type.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32577) Avoid memory fragmentation when running CI for flink-table-planner module

2023-07-11 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-32577:
--
Description: This issue is a sub-issue of FLINK-18356.  (was: This issue is 
a sub-issue of FLINK-18356.

When I ran mvn verify for flink-table-planner in Azure CI and on my own 
machine, I found that the heap and non-heap memory of the JVM are stable and 
within the normal range. However, the total memory usage ({*}RES{*}) of the 
fork process is very high, as shown in the following figure (PID: 2958793 and 
2958794):

!image-2023-07-11-19-28-52-851.png|width=537,height=245!

I tried to delve deeper into the specific memory allocation of these two 
processes:
{code:java}
pmap -p 2958793 {code}
I found a lot of memory fragments here, each close to *64MB* in size (more 
than 200 fragments):

!image-2023-07-11-19-35-54-530.png|width=237,height=413!

Based on past experience, this issue likely triggers the classic problem of 
incorrect memory fragmentation management by *glibc* on *JDK8*. So we 
downloaded *libjemalloc* and added the environment variable:
{code:java}
export LD_PRELOAD=${JAVA_HOME}/lib/amd64/libjemalloc.so.2{code}
After that, the overall memory of the fork process became stable and met 
expectations (5GB):

!image-2023-07-11-19-41-18-626.png|width=488,height=208!

!image-2023-07-11-19-41-37-105.png|width=228,height=287!

The solution requires modifying the CI execution 
[Docker image|https://github.com/flink-ci/flink-ci-docker], replacing *glibc* 
with *libjemalloc* like FLINK-19125, cc [~chesnay]:
{code:java}
apt-get -y install libjemalloc-dev

ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so {code}
I have opened a new Jira (FLINK-32577) to track and fix this issue. cc 
[~mapohl]  [~jark].

 )

> Avoid memory fragmentation when running CI for flink-table-planner module
> -
>
> Key: FLINK-32577
> URL: https://issues.apache.org/jira/browse/FLINK-32577
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / CI, Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.18.0
>
>
> This issue is a sub-issue of FLINK-18356.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32577) Avoid memory fragmentation when running CI for flink-table-planner module

2023-07-11 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-32577:
--
Description: 
This issue is a sub-issue of FLINK-18356.

When I ran mvn verify for flink-table-planner in Azure CI and on my own 
machine, I found that the heap and non-heap memory of the JVM are stable and 
within the normal range. However, the total memory usage ({*}RES{*}) of the 
fork process is very high, as shown in the following figure (PID: 2958793 and 
2958794):

!image-2023-07-11-19-28-52-851.png|width=537,height=245!

I tried to delve deeper into the specific memory allocation of these two 
processes:
{code:java}
pmap -p 2958793 {code}
I found a lot of memory fragments here, each close to *64MB* in size (more 
than 200 fragments):

!image-2023-07-11-19-35-54-530.png|width=237,height=413!

Based on past experience, this issue likely triggers the classic problem of 
incorrect memory fragmentation management by *glibc* on *JDK8*. So we 
downloaded *libjemalloc* and added the environment variable:
{code:java}
export LD_PRELOAD=${JAVA_HOME}/lib/amd64/libjemalloc.so.2{code}
After that, the overall memory of the fork process became stable and met 
expectations (5GB):

!image-2023-07-11-19-41-18-626.png|width=488,height=208!

!image-2023-07-11-19-41-37-105.png|width=228,height=287!

The solution requires modifying the CI execution 
[Docker image|https://github.com/flink-ci/flink-ci-docker], replacing *glibc* 
with *libjemalloc* like FLINK-19125, cc [~chesnay]:
{code:java}
apt-get -y install libjemalloc-dev

ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so {code}
I have opened a new Jira (FLINK-32577) to track and fix this issue. cc 
[~mapohl]  [~jark].

 

> Avoid memory fragmentation when running CI for flink-table-planner module
> -
>
> Key: FLINK-32577
> URL: https://issues.apache.org/jira/browse/FLINK-32577
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / CI, Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.18.0
>
>
> This issue is a sub-issue of FLINK-18356.
> When I ran mvn verify for flink-table-planner in Azure CI and on my own 
> machine, I found that the heap and non-heap memory of the JVM are stable and 
> within the normal range. However, the total memory usage ({*}RES{*}) of the 
> fork process is very high, as shown in the following figure (PID: 2958793 
> and 2958794):
> !image-2023-07-11-19-28-52-851.png|width=537,height=245!
> I tried to delve deeper into the specific memory allocation of these two 
> processes:
> {code:java}
> pmap -p 2958793 {code}
> I found a lot of memory fragments here, each close to *64MB* in size (more 
> than 200 fragments):
> !image-2023-07-11-19-35-54-530.png|width=237,height=413!
> Based on past experience, this issue likely triggers the classic problem of 
> incorrect memory fragmentation management by *glibc* on *JDK8*. So we 
> downloaded *libjemalloc* and added the environment variable:
> {code:java}
> export LD_PRELOAD=${JAVA_HOME}/lib/amd64/libjemalloc.so.2{code}
> After that, the overall memory of the fork process became stable and met 
> expectations (5GB):
> !image-2023-07-11-19-41-18-626.png|width=488,height=208!
> !image-2023-07-11-19-41-37-105.png|width=228,height=287!
> The solution requires modifying the CI execution 
> [Docker image|https://github.com/flink-ci/flink-ci-docker], replacing 
> *glibc* with *libjemalloc* like FLINK-19125, cc [~chesnay]:
> {code:java}
> apt-get -y install libjemalloc-dev
> ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so {code}
> I have opened a new Jira (FLINK-32577) to track and fix this issue. cc 
> [~mapohl]  [~jark].
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-11 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741961#comment-17741961
 ] 

Yunhong Zheng edited comment on FLINK-18356 at 7/11/23 11:55 AM:
-

Hi, all. I think I found the root cause of the table-planner exit 137 error 
under the guidance of [~lincoln.86xy] . This error is similar to issue 
FLINK-19125; both are caused by incorrect memory fragmentation management by 
{*}glibc{*}, which does not return memory to the kernel gracefully (refer to 
[glibc bugzilla|https://sourceware.org/bugzilla/show_bug.cgi?id=15321] and the 
[glibc manual|https://www.gnu.org/software/libc/manual/html_mono/libc.html#Freeing-after-Malloc]).

When I ran mvn verify for flink-table-planner in Azure CI and on my own 
machine, I found that the heap and non-heap memory of the JVM are stable and 
within the normal range. However, the total memory usage ({*}RES{*}) of the 
fork process is very high, as shown in the following figure (PID: 2958793 and 
2958794):

!image-2023-07-11-19-28-52-851.png|width=537,height=245!

I tried to delve deeper into the specific memory allocation of these two 
processes:
{code:java}
pmap -p 2958793 {code}
I found a lot of memory fragments here, each close to *64MB* in size (more 
than 200 fragments):

!image-2023-07-11-19-35-54-530.png|width=237,height=413!

Based on past experience, this issue likely triggers the classic problem of 
incorrect memory fragmentation management by *glibc* on *JDK8*. So we 
downloaded *libjemalloc* and added the environment variable:
{code:java}
export LD_PRELOAD=${JAVA_HOME}/lib/amd64/libjemalloc.so.2{code}
After that, the overall memory of the fork process became stable and met 
expectations (5GB):

!image-2023-07-11-19-41-18-626.png|width=488,height=208!

!image-2023-07-11-19-41-37-105.png|width=228,height=287!

The solution requires modifying the CI execution 
[Docker image|https://github.com/flink-ci/flink-ci-docker], replacing *glibc* 
with *libjemalloc* like FLINK-19125, cc [~chesnay] .
{code:java}
apt-get -y install libjemalloc-dev

ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so {code}
I have opened a new Jira (FLINK-32577) to track and fix this issue. cc 
[~mapohl]  [~jark].

 


was (Author: JIRAUSER287975):
Hi, all. I think I found the root cause of the table-planner exit 137 error 
under the guidance of [~lincoln.86xy] . This error is similar to issue 
[FLINK-19125|https://issues.apache.org/jira/browse/FLINK-19125]; both are 
caused by incorrect memory fragmentation management by {*}glibc{*}, which does 
not return memory to the kernel gracefully (refer to [glibc 
bugzilla|https://sourceware.org/bugzilla/show_bug.cgi?id=15321] and the [glibc 
manual|https://www.gnu.org/software/libc/manual/html_mono/libc.html#Freeing-after-Malloc]).

When I ran mvn verify for flink-table-planner in Azure CI and on my own 
machine, I found that the heap and non-heap memory of the JVM are stable and 
within the normal range. However, the total memory usage ({*}RES{*}) of the 
fork process is very high, as shown in the following figure (PID: 2958793 and 
2958794):

!image-2023-07-11-19-28-52-851.png|width=537,height=245!

I tried to delve deeper into the specific memory allocation of these two 
processes:
{code:java}
pmap -p 2958793 {code}
I found a lot of memory fragments here, each close to *64MB* in size (more 
than 200 fragments):

!image-2023-07-11-19-35-54-530.png|width=237,height=413!

Based on past experience, this issue likely triggers the classic problem of 
incorrect memory fragmentation management by *glibc* on *JDK8*. So we 
downloaded *libjemalloc* and added the environment variable:
{code:java}
export LD_PRELOAD=${JAVA_HOME}/lib/amd64/libjemalloc.so.2{code}
After that, the overall memory of the fork process became stable and met 
expectations (5GB):

!image-2023-07-11-19-41-18-626.png|width=488,height=208!

!image-2023-07-11-19-41-37-105.png|width=228,height=287!

The solution requires modifying the CI execution 
[Docker image|https://github.com/flink-ci/flink-ci-docker], replacing *glibc* 
with *libjemalloc* like FLINK-19125, cc [~chesnay]:
{code:java}
apt-get -y install libjemalloc-dev

ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so {code}
I have opened a new Jira (FLINK-32577) to track and fix this issue. cc 
[~mapohl]  [~jark].

 

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: 

[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-11 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741963#comment-17741963
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Hi [~chesnay] , could you take a look at this issue and assist in modifying the 
Docker image? Thank you!

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-11 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17741961#comment-17741961
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Hi, all. I think I found the root cause of the table-planner exit 137 error 
under the guidance of [~lincoln.86xy] . This error is similar to issue 
[FLINK-19125|https://issues.apache.org/jira/browse/FLINK-19125]; both are 
caused by incorrect memory fragmentation management by {*}glibc{*}, which does 
not return memory to the kernel gracefully (refer to [glibc 
bugzilla|https://sourceware.org/bugzilla/show_bug.cgi?id=15321] and the [glibc 
manual|https://www.gnu.org/software/libc/manual/html_mono/libc.html#Freeing-after-Malloc]).

When I ran mvn verify for flink-table-planner in Azure CI and on my own 
machine, I found that the heap and non-heap memory of the JVM are stable and 
within the normal range. However, the total memory usage ({*}RES{*}) of the 
fork process is very high, as shown in the following figure (PID: 2958793 and 
2958794):

!image-2023-07-11-19-28-52-851.png|width=537,height=245!

I tried to delve deeper into the specific memory allocation of these two 
processes:
{code:java}
pmap -p 2958793 {code}
I found a lot of memory fragments here, each close to *64MB* in size (more 
than 200 fragments):

!image-2023-07-11-19-35-54-530.png|width=237,height=413!

Based on past experience, this issue likely triggers the classic problem of 
incorrect memory fragmentation management by *glibc* on *JDK8*. So we 
downloaded *libjemalloc* and added the environment variable:
{code:java}
export LD_PRELOAD=${JAVA_HOME}/lib/amd64/libjemalloc.so.2{code}
After that, the overall memory of the fork process became stable and met 
expectations (5GB):

!image-2023-07-11-19-41-18-626.png|width=488,height=208!

!image-2023-07-11-19-41-37-105.png|width=228,height=287!

The solution requires modifying the CI execution 
[Docker image|https://github.com/flink-ci/flink-ci-docker], replacing *glibc* 
with *libjemalloc* like FLINK-19125, cc [~chesnay]:
{code:java}
apt-get -y install libjemalloc-dev

ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so {code}
I have opened a new Jira (FLINK-32577) to track and fix this issue. cc 
[~mapohl]  [~jark].

 

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32577) Avoid memory fragmentation when running CI for flink-table-planner module

2023-07-11 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-32577:
-

 Summary: Avoid memory fragmentation when running CI for 
flink-table-planner module
 Key: FLINK-32577
 URL: https://issues.apache.org/jira/browse/FLINK-32577
 Project: Flink
  Issue Type: Improvement
  Components: Build System / CI, Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Yunhong Zheng
 Fix For: 1.18.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-11 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-18356:
--
Attachment: image-2023-07-11-19-41-18-626.png

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-11 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-18356:
--
Attachment: image-2023-07-11-19-41-37-105.png

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, 
> image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-11 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-18356:
--
Attachment: image-2023-07-11-19-35-54-530.png

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, 
> image-2023-07-11-19-35-54-530.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-11 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-18356:
--
Attachment: image-2023-07-11-19-28-52-851.png

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Assignee: Yunhong Zheng
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-07-06 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17740572#comment-17740572
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Hi, all. The frequent occurrence of this error recently may be because I am 
running CI with reuseForks set to true in the table-planner module in this 
PR (https://github.com/apache/flink/pull/22880). I found that this error is a 
global OOM caused by insufficient Linux memory resources, which may kill all 
processes running on the machine, thus affecting other CI agents. 

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32471) IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH

2023-06-30 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738939#comment-17738939
 ] 

Yunhong Zheng commented on FLINK-32471:
---

Hi, [~grandfisher] . I agree with you: pushing 'IS_NOT_NULL' to the left side 
is semantically equivalent in SQL. You can try adding this SqlKind to the 
SUITABLE_FILTER_TO_PUSH set, as in the sketch below. Btw, is this a 
requirement you found while using Flink batch in daily work? Thanks. 
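A rough sketch of the proposed change (the real SUITABLE_FILTER_TO_PUSH field 
lives inside the planner rule; the member kinds listed here are illustrative, 
the point is only the added IS_NOT_NULL):
{code:java}
import java.util.EnumSet;
import java.util.Set;

import org.apache.calcite.sql.SqlKind;

public class SuitableFilterToPushSketch {
    // Illustrative stand-in for the planner's SUITABLE_FILTER_TO_PUSH set;
    // comparison kinds like GREATER_THAN imply IS NOT NULL on their operands,
    // so IS_NOT_NULL itself should be safe to push as well.
    static final Set<SqlKind> SUITABLE_FILTER_TO_PUSH = EnumSet.of(
            SqlKind.EQUALS,
            SqlKind.GREATER_THAN,
            SqlKind.GREATER_THAN_OR_EQUAL,
            SqlKind.LESS_THAN,
            SqlKind.LESS_THAN_OR_EQUAL,
            SqlKind.IS_NOT_NULL); // the kind proposed in this issue

    public static void main(String[] args) {
        System.out.println(
                SUITABLE_FILTER_TO_PUSH.contains(SqlKind.IS_NOT_NULL)); // true
    }
}
{code}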

> IS_NOT_NULL can add to SUITABLE_FILTER_TO_PUSH
> --
>
> Key: FLINK-32471
> URL: https://issues.apache.org/jira/browse/FLINK-32471
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Reporter: grandfisher
>Priority: Major
>
> According to FLINK-31273:
> The reason for the error is that other filters conflict with IS_NULL, but in 
> fact it won't conflict with IS_NOT_NULL, because operators in 
> SUITABLE_FILTER_TO_PUSH  such as 'SqlKind.GREATER_THAN'  has an implicit 
> filter 'IS_NOT_NULL' according to SQL Semantics.
>  
> So we think it is feasible to add  IS_NOT_NULL to the SUITABLE_FILTER_TO_PUSH 
> list.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31273) Left join with IS_NULL filter be wrongly pushed down and get wrong join results

2023-03-13 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31273:
--
Fix Version/s: 1.17.0
   (was: 1.17.1)

> Left join with IS_NULL filter be wrongly pushed down and get wrong join 
> results
> ---
>
> Key: FLINK-31273
> URL: https://issues.apache.org/jira/browse/FLINK-31273
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0, 1.16.1
>Reporter: Yunhong Zheng
>Assignee: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> A left join with an IS_NULL filter is wrongly pushed down and produces wrong 
> join results. The SQL is:
> {code:java}
> SELECT * FROM MyTable1 LEFT JOIN MyTable2 ON a1 = a2 WHERE a2 IS NULL AND a1 
> < 10
> The wrong plan is:
> LogicalProject(a1=[$0], b1=[$1], c1=[$2], b2=[$3], c2=[$4], a2=[$5])
> +- LogicalFilter(condition=[IS NULL($5)])
>    +- LogicalJoin(condition=[=($0, $5)], joinType=[left])
>       :- LogicalValues(tuples=[[]])
>       +- LogicalTableScan(table=[[default_catalog, default_database, 
> MyTable2]]) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31273) Left join with IS_NULL filter be wrongly pushed down and get wrong join results

2023-03-13 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31273:
--
Priority: Blocker  (was: Major)

> Left join with IS_NULL filter be wrongly pushed down and get wrong join 
> results
> ---
>
> Key: FLINK-31273
> URL: https://issues.apache.org/jira/browse/FLINK-31273
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0, 1.16.1
>Reporter: Yunhong Zheng
>Assignee: Yunhong Zheng
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> A left join with an IS_NULL filter is wrongly pushed down and produces wrong 
> join results. The SQL is:
> {code:java}
> SELECT * FROM MyTable1 LEFT JOIN MyTable2 ON a1 = a2 WHERE a2 IS NULL AND a1 
> < 10
> The wrong plan is:
> LogicalProject(a1=[$0], b1=[$1], c1=[$2], b2=[$3], c2=[$4], a2=[$5])
> +- LogicalFilter(condition=[IS NULL($5)])
>    +- LogicalJoin(condition=[=($0, $5)], joinType=[left])
>       :- LogicalValues(tuples=[[]])
>       +- LogicalTableScan(table=[[default_catalog, default_database, 
> MyTable2]]) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31273) Left join with IS_NULL filter be wrongly pushed down and get wrong join results

2023-02-28 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-31273:
-

 Summary: Left join with IS_NULL filter be wrongly pushed down and 
get wrong join results
 Key: FLINK-31273
 URL: https://issues.apache.org/jira/browse/FLINK-31273
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.16.1, 1.17.0
Reporter: Yunhong Zheng
 Fix For: 1.17.1


A left join with an IS_NULL filter is wrongly pushed down and produces wrong 
join results. The SQL is:
{code:java}
SELECT * FROM MyTable1 LEFT JOIN MyTable2 ON a1 = a2 WHERE a2 IS NULL AND a1 < 
10

The wrong plan is:

LogicalProject(a1=[$0], b1=[$1], c1=[$2], b2=[$3], c2=[$4], a2=[$5])
+- LogicalFilter(condition=[IS NULL($5)])
   +- LogicalJoin(condition=[=($0, $5)], joinType=[left])
      :- LogicalValues(tuples=[[]])
      +- LogicalTableScan(table=[[default_catalog, default_database, 
MyTable2]]) {code}
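For intuition, a minimal runnable sketch of the expected semantics (the table 
contents are invented for illustration):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class IsNullPushdownExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inBatchMode().build());
        // The query is an anti-join in disguise: it must return the MyTable1
        // rows that have no match in MyTable2 (and satisfy a1 < 10), here
        // exactly the row (1, NULL). Pushing 'a2 IS NULL' below the left join
        // filters a table before the join and changes that result.
        tEnv.sqlQuery(
                "SELECT * FROM (VALUES (1), (5), (20)) AS MyTable1(a1) "
                        + "LEFT JOIN (VALUES (5)) AS MyTable2(a2) ON a1 = a2 "
                        + "WHERE a2 IS NULL AND a1 < 10")
                .execute()
                .print();
    }
}
{code}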



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-31061) Release Testing: Verify FLINK-30376 Introduce a new flink bushy join reorder rule which based on greedy algorithm

2023-02-21 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng closed FLINK-31061.
-
Resolution: Fixed

> Release Testing: Verify FLINK-30376 Introduce a new flink bushy join reorder 
> rule which based on greedy algorithm
> -
>
> Key: FLINK-31061
> URL: https://issues.apache.org/jira/browse/FLINK-31061
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Assignee: Junrui Li
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue aims to verify FLINK-30376: [Introduce a new flink bushy join 
> reorder rule which based on greedy 
> algorithm|https://issues.apache.org/jira/browse/FLINK-30376].
>  In Flink 1.17, the bushy join reorder strategy is the default join reorder 
> strategy, and it can be disabled by setting the factor 
> 'table.optimizer.bushy-join-reorder-threshold' smaller than the number of 
> tables to be reordered. If disabled, the Lopt join reorder strategy, which 
> is the default join reorder strategy in Flink 1.16, will be chosen. 
> We can verify it in the SQL client after we build the flink-dist package.
>  # Firstly, we need to create several tables (the best case is that these 
> tables have table and column statistics).
>  # Secondly, we need to set 'table.optimizer.join-reorder-enabled = true' to 
> enable join reorder.
>  # Verify bushy join reorder (the default bushy join reorder threshold is 
> 12, so if the number of tables is smaller than 12, the join reorder strategy 
> is bushy join reorder).
>  # Compare the results of the bushy join reorder and the Lopt join reorder 
> strategies; they need to be the same.
>  # If we want to create a bushy join tree after join reorder, we need to add 
> statistics, like 'JoinReorderITCaseBase.testBushyTreeJoinReorder'. 
> If you meet any problems, you are welcome to ping me directly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31061) Release Testing: Verify FLINK-30376 Introduce a new flink bushy join reorder rule which based on greedy algorithm

2023-02-21 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17691961#comment-17691961
 ] 

Yunhong Zheng commented on FLINK-31061:
---

Thanks, [~JunRuiLi] , for your careful testing. No problems were found during 
the test, so I will close this issue.

> Release Testing: Verify FLINK-30376 Introduce a new flink bushy join reorder 
> rule which based on greedy algorithm
> -
>
> Key: FLINK-31061
> URL: https://issues.apache.org/jira/browse/FLINK-31061
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Assignee: Junrui Li
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue aims to verify FLINK-30376: [Introduce a new flink bushy join 
> reorder rule which based on greedy 
> algorithm|https://issues.apache.org/jira/browse/FLINK-30376].
>  In Flink 1.17, the bushy join reorder strategy is the default join reorder 
> strategy, and it can be disabled by setting the factor 
> 'table.optimizer.bushy-join-reorder-threshold' smaller than the number of 
> tables to be reordered. If disabled, the Lopt join reorder strategy, which 
> is the default join reorder strategy in Flink 1.16, will be chosen. 
> We can verify it in the SQL client after we build the flink-dist package.
>  # Firstly, we need to create several tables (the best case is that these 
> tables have table and column statistics).
>  # Secondly, we need to set 'table.optimizer.join-reorder-enabled = true' to 
> enable join reorder.
>  # Verify bushy join reorder (the default bushy join reorder threshold is 
> 12, so if the number of tables is smaller than 12, the join reorder strategy 
> is bushy join reorder).
>  # Compare the results of the bushy join reorder and the Lopt join reorder 
> strategies; they need to be the same.
>  # If we want to create a bushy join tree after join reorder, we need to add 
> statistics, like 'JoinReorderITCaseBase.testBushyTreeJoinReorder'. 
> If you meet any problems, you are welcome to ping me directly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-02-16 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689727#comment-17689727
 ] 

Yunhong Zheng commented on FLINK-18356:
---

!image-2023-02-16-20-18-09-431.png|width=628,height=261!

Enabling/disabling this factor can only alleviate the CI OOM problem. We need 
to find the root cause of the memory leak in the table-planner module.

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-02-16 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-18356:
--
Attachment: image-2023-02-16-20-18-09-431.png

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, 
> image-2023-02-16-20-18-09-431.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-02-16 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689724#comment-17689724
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Hi, [~mapohl] and [~martijnvisser]. I found that I had modified the pom.xml file 
by mistake; the correct change has been submitted in this PR: 
[https://github.com/apache/flink/pull/21950/files]. Could you review it when you 
have time? Thank you. I will also cherry-pick it to release-1.17. 

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31082) Setting maven property 'flink.resueForks' to false in table planner module

2023-02-15 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31082:
--
Description: 
This issue is created to alleviate the OOM problem mentioned in issue: 
https://issues.apache.org/jira/browse/FLINK-18356

Setting the maven property 'flink.resueForks' to false in the table planner 
module can only reduce the frequency of OOMs; it cannot solve the problem. To 
solve it completely, we need to identify the specific root cause, but that is 
time-consuming work.

  was:
This issue is created to alleviate the OOM problem mentioned in issue: 
https://issues.apache.org/jira/browse/FLINK-18356

Setting maven property 'flink.resueForks' to false in table planner module can 
only reduce the frequency of oom, but can't solve this problem. To completely 
solve this problem, we need to identify the specific reasons, but this is a 
time-consuming work,


> Setting maven property 'flink.resueForks' to false in table planner module 
> ---
>
> Key: FLINK-31082
> URL: https://issues.apache.org/jira/browse/FLINK-31082
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue is created to alleviate the OOM problem mentioned in issue: 
> https://issues.apache.org/jira/browse/FLINK-18356
> Setting the maven property 'flink.resueForks' to false in the table planner 
> module can only reduce the frequency of OOMs; it cannot solve the problem. To 
> solve it completely, we need to identify the specific root cause, but that is 
> time-consuming work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31082) Setting maven property 'flink.resueForks' to false in table planner module

2023-02-15 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31082:
--
Description: 
This issue is created to alleviate the OOM problem mentioned in issue: 
https://issues.apache.org/jira/browse/FLINK-18356

Setting the maven property 'flink.resueForks' to false in the table planner 
module can only reduce the frequency of OOMs; it cannot solve the problem. To 
solve it completely, we need to identify the specific root cause, but that is 
time-consuming work.

  was:
This issue is created to alleviate the OOM problem mentioned in issue: 
https://issues.apache.org/jira/browse/FLINK-18356

 


> Setting maven property 'flink.resueForks' to false in table planner module 
> ---
>
> Key: FLINK-31082
> URL: https://issues.apache.org/jira/browse/FLINK-31082
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue is created to alleviate the OOM problem mentioned in issue: 
> https://issues.apache.org/jira/browse/FLINK-18356
> Setting the maven property 'flink.resueForks' to false in the table planner 
> module can only reduce the frequency of OOMs; it cannot solve the problem. To 
> solve it completely, we need to identify the specific root cause, but that is 
> time-consuming work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31082) Setting maven property 'flink.resueForks' to false in table planner module

2023-02-15 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31082:
--
Description: 
This issue is created to alleviate the OOM problem mentioned in issue: 
https://issues.apache.org/jira/browse/FLINK-18356

 

  was:This issue is created to alleviate the OOM problem mentioned in issue: 
[https://issues.apache.org/jira/browse/FLINK-18356|https://issues.apache.org/jira/browse/FLINK-18356]


> Setting maven property 'flink.resueForks' to false in table planner module 
> ---
>
> Key: FLINK-31082
> URL: https://issues.apache.org/jira/browse/FLINK-31082
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue is created to alleviate the OOM problem mentioned in issue: 
> https://issues.apache.org/jira/browse/FLINK-18356
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31082) Setting maven property 'flink.resueForks' to false in table planner module

2023-02-15 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31082:
--
Description: This issue is created to alleviate the OOM problem mentioned 
in issue: 
[https://issues.apache.org/jira/browse/FLINK-18356|https://issues.apache.org/jira/browse/FLINK-18356]
  (was: This issue is created to alleviate the OOM problem mentioned in issue: 
[Exit 137|http://https://issues.apache.org/jira/browse/FLINK-18356])

> Setting maven property 'flink.resueForks' to false in table planner module 
> ---
>
> Key: FLINK-31082
> URL: https://issues.apache.org/jira/browse/FLINK-31082
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue is created to alleviate the OOM problem mentioned in issue: 
> [https://issues.apache.org/jira/browse/FLINK-18356|https://issues.apache.org/jira/browse/FLINK-18356]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31082) Setting maven property 'flink.resueForks' to false in table planner module

2023-02-15 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31082:
--
Description: This issue is created to alleviate the OOM problem mentioned 
in issue: [Exit 137|http://https://issues.apache.org/jira/browse/FLINK-18356]

> Setting maven property 'flink.resueForks' to false in table planner module 
> ---
>
> Key: FLINK-31082
> URL: https://issues.apache.org/jira/browse/FLINK-31082
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue is created to alleviate the OOM problem mentioned in issue: [Exit 
> 137|http://https://issues.apache.org/jira/browse/FLINK-18356]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31082) Setting maven property 'flink.resueForks' to false in table planner module

2023-02-15 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-31082:
-

 Summary: Setting maven property 'flink.resueForks' to false in 
table planner module 
 Key: FLINK-31082
 URL: https://issues.apache.org/jira/browse/FLINK-31082
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.17.0
Reporter: Yunhong Zheng
 Fix For: 1.17.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31079) Release Testing: Verify FLINK-29663 Further improvements of adaptive batch scheduler

2023-02-15 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688900#comment-17688900
 ] 

Yunhong Zheng commented on FLINK-31079:
---

I would like to take this ticket, [~wanglijie] 

> Release Testing: Verify FLINK-29663 Further improvements of adaptive batch 
> scheduler
> 
>
> Key: FLINK-31079
> URL: https://issues.apache.org/jira/browse/FLINK-31079
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Priority: Blocker
> Fix For: 1.17.0
>
>
> This task aims to verify FLINK-29663 which improves the adaptive batch 
> scheduler.
> Before the change of FLINK-29663, the adaptive batch scheduler distributed 
> subpartitions according to the number of subpartitions, making different 
> downstream subtasks consume roughly the same number of subpartitions. This 
> leads to imbalanced loads across downstream tasks when the subpartitions 
> contain different amounts of data.
> To solve this problem, FLINK-29663 makes the adaptive batch scheduler 
> distribute subpartitions according to the amount of data, so that different 
> downstream subtasks consume roughly the same amount of data. Note that 
> currently this only takes effect for All-To-All edges.
> The documentation of the adaptive scheduler can be found 
> [here|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/elastic_scaling/#adaptive-batch-scheduler]
> One can verify it by creating intentional data skew on All-To-All edges.
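
To make the balancing idea concrete, here is a minimal sketch of volume-based 
range assignment under simplified assumptions (contiguous subpartition ranges, 
byte sizes known up front); {{subpartitionBytes}}, {{numSubtasks}} and the greedy 
policy are illustrative, not the scheduler's actual code:

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: split subpartitions into contiguous ranges of similar data volume. */
public class EvenVolumeSplit {

    /**
     * Returns the start index of each range; range i covers
     * [starts.get(i), starts.get(i + 1)), the last range running to the end.
     */
    static List<Integer> splitByDataVolume(long[] subpartitionBytes, int numSubtasks) {
        long totalBytes = 0;
        for (long b : subpartitionBytes) {
            totalBytes += b;
        }
        long target = Math.max(1, totalBytes / numSubtasks);

        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        long current = 0;
        for (int i = 0; i < subpartitionBytes.length; i++) {
            current += subpartitionBytes[i];
            int rangesStillNeeded = numSubtasks - starts.size();
            int subpartitionsLeft = subpartitionBytes.length - (i + 1);
            // Close the range once it holds enough data, or when the remaining
            // subpartitions are just enough to give each remaining subtask one.
            if (rangesStillNeeded > 0
                    && (current >= target || subpartitionsLeft == rangesStillNeeded)) {
                starts.add(i + 1);
                current = 0;
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // Skewed data: the first two subpartitions are much larger than the rest.
        long[] bytes = {500, 500, 100, 100, 100, 100};
        // Prints [0, 1, 2]: ranges carrying 500 / 500 / 400 bytes, instead of the
        // 1000 / 200 / 200 a plain 2-2-2 split by subpartition count would give.
        System.out.println(splitByDataVolume(bytes, 3));
    }
}
{code}

Compared to a plain split by subpartition count, the ranges are chosen so each 
downstream subtask sees a similar number of bytes even when the per-subpartition 
sizes are skewed.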



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process

2023-02-14 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688791#comment-17688791
 ] 

Yunhong Zheng commented on FLINK-18356:
---

Hi all, we are still working on this problem. We are trying to reproduce it on 
our own cluster and will continue to follow up on this issue.

> flink-table-planner Exit code 137 returned from process
> ---
>
> Key: FLINK-18356
> URL: https://issues.apache.org/jira/browse/FLINK-18356
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0
>Reporter: Piotr Nowojski
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Attachments: 1234.jpg, app-profiling_4.gif, 
> image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png
>
>
> {noformat}
> = test session starts 
> ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
> pyflink/common/tests/test_configuration.py ..[  
> 1%]
> pyflink/common/tests/test_execution_config.py ...[  
> 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', 
> arguments 'exec -i -u 1002 
> 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb 
> /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729=logs=9cada3cb-c1d3-5621-16da-0f718fb86602=8d78fe4f-d658-5c70-12f8-4921589024c3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31059) Release Testing: Verify FLINK-29717 Supports hive udaf such as sum/count by native implementation

2023-02-14 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688435#comment-17688435
 ] 

Yunhong Zheng edited comment on FLINK-31059 at 2/14/23 10:43 AM:
-

I hope to get this ticket, [~lsy] 


was (Author: JIRAUSER287975):
I hope to get this tickets, [~lsy] 

> Release Testing: Verify FLINK-29717 Supports hive udaf such as sum/count by 
> native implementation
> -
>
> Key: FLINK-31059
> URL: https://issues.apache.org/jira/browse/FLINK-31059
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Affects Versions: 1.17.0
>Reporter: dalongliu
>Priority: Blocker
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31059) Release Testing: Verify FLINK-29717 Supports hive udaf such as sum/count by native implementation

2023-02-14 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688435#comment-17688435
 ] 

Yunhong Zheng commented on FLINK-31059:
---

I hope to get this tickets, [~lsy] 

> Release Testing: Verify FLINK-29717 Supports hive udaf such as sum/count by 
> native implementation
> -
>
> Key: FLINK-31059
> URL: https://issues.apache.org/jira/browse/FLINK-31059
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Affects Versions: 1.17.0
>Reporter: dalongliu
>Priority: Blocker
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31060) Release Testing: Verify FLINK-30542 Support adaptive local hash aggregate in runtime

2023-02-14 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31060:
--
Description: 
This issue aims to verify FLINK-30542: Support adaptive local hash aggregate in 
runtime.

Adaptive local hash aggregation is an optimization of local hash aggregation 
that adaptively decides whether to keep doing local hash aggregation based on 
the distinct-value rate of the sampled data. If the distinct-value rate is above 
the configured threshold (see parameter: 
'table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold'), we stop 
aggregating and just send the input data downstream after a simple projection. 
Otherwise, we continue to aggregate.

We can verify it in the SQL client after building the flink-dist package.
 # Create a source table first. (Note: the source table needs to produce data 
with different degrees of aggregation, i.e. the distinct count should be 
controllable by the source connector; we recommend modifying the dataGen table 
source to produce data with different numbers of distinct rows.)
 # Verify the result with different distinct-value rates. (See: 
table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold)
 # Check the TM logs to see whether adaptive local hash aggregation kicks in.

If you meet any problems, feel free to ping me directly.

  was:This issue aims to verify FLINK-30542: [Support adaptive local hash 
aggregate in runtime|https://issues.apache.org/jira/browse/FLINK-30542].


> Release Testing: Verify FLINK-30542 Support adaptive local hash aggregate in 
> runtime
> 
>
> Key: FLINK-31060
> URL: https://issues.apache.org/jira/browse/FLINK-31060
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue aims to verify FLINK-30542: Support adaptive local hash aggregate 
> in runtime.
> Adaptive local hash aggregation is an optimization of local hash aggregation 
> that adaptively decides whether to keep doing local hash aggregation based on 
> the distinct-value rate of the sampled data. If the distinct-value rate is 
> above the configured threshold (see parameter: 
> 'table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold'), we stop 
> aggregating and just send the input data downstream after a simple projection. 
> Otherwise, we continue to aggregate.
> We can verify it in the SQL client after building the flink-dist package.
>  # Create a source table first. (Note: the source table needs to produce data 
> with different degrees of aggregation, i.e. the distinct count should be 
> controllable by the source connector; we recommend modifying the dataGen table 
> source to produce data with different numbers of distinct rows.)
>  # Verify the result with different distinct-value rates. (See: 
> table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold)
>  # Check the TM logs to see whether adaptive local hash aggregation kicks in.
> If you meet any problems, feel free to ping me directly.
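
To make the decision logic concrete, here is a minimal sketch of the sampling 
idea described above, under simplified assumptions (a fixed-size sample of the 
first input rows, one grouping key per row); the class name, {{sampleSize}} and 
the HashSet-based distinct counting are illustrative, not the actual operator 
code, and {{threshold}} stands for 
'table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold':

{code:java}
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch of adaptive local hash aggregation (not the actual operator). */
public class AdaptiveLocalAggSketch<K> {

    private final int sampleSize;    // number of rows to sample before deciding
    private final double threshold;  // distinct-value-rate threshold from the config option
    private final Set<K> sampledKeys = new HashSet<>();
    private int sampledRows;
    private boolean aggregationDisabled;

    AdaptiveLocalAggSketch(int sampleSize, double threshold) {
        this.sampleSize = sampleSize;
        this.threshold = threshold;
    }

    /** Returns true once rows should bypass local aggregation (simple projection only). */
    boolean shouldBypass(K groupingKey) {
        if (!aggregationDisabled && sampledRows < sampleSize) {
            sampledKeys.add(groupingKey);
            sampledRows++;
            if (sampledRows == sampleSize) {
                // A high distinct-value rate means local aggregation barely reduces
                // the data, so stop aggregating and just forward rows downstream.
                double distinctValueRate = (double) sampledKeys.size() / sampledRows;
                aggregationDisabled = distinctValueRate > threshold;
                sampledKeys.clear();
            }
        }
        return aggregationDisabled;
    }

    public static void main(String[] args) {
        AdaptiveLocalAggSketch<Integer> agg = new AdaptiveLocalAggSketch<>(4, 0.5);
        for (int key : new int[] {1, 2, 3, 4, 5, 6}) { // nearly all keys distinct
            System.out.println("key " + key + " -> bypass=" + agg.shouldBypass(key));
        }
        // After the 4-row sample the distinct-value rate is 1.0 > 0.5,
        // so aggregation is disabled and later rows are passed through.
    }
}
{code}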



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31060) Release Testing: Verify FLINK-30542 Support adaptive local hash aggregate in runtime

2023-02-14 Thread Yunhong Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunhong Zheng updated FLINK-31060:
--
Description: This issue aims to verify FLINK-30542: [Support adaptive local 
hash aggregate in runtime|https://issues.apache.org/jira/browse/FLINK-30542].

> Release Testing: Verify FLINK-30542 Support adaptive local hash aggregate in 
> runtime
> 
>
> Key: FLINK-31060
> URL: https://issues.apache.org/jira/browse/FLINK-31060
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> This issue aims to verify FLINK-30542: [Support adaptive local hash aggregate 
> in runtime|https://issues.apache.org/jira/browse/FLINK-30542].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

