Re: [PR] [FLINK-34336][test] Fix the bug that AutoRescalingITCase may hang sometimes [flink]

2024-02-19 Thread via GitHub


1996fanrui commented on PR #24248:
URL: https://github.com/apache/flink/pull/24248#issuecomment-1953382889

   > LGTM  
   
   Thanks @XComp for the review! Merging
   
   > Can you create a 1.19 backport PR?
   
   Sure, I will do it asap.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34336][test] Fix the bug that AutoRescalingITCase may hang sometimes [flink]

2024-02-19 Thread via GitHub


1996fanrui merged PR #24248:
URL: https://github.com/apache/flink/pull/24248





[jira] [Updated] (FLINK-34147) TimestampData to/from LocalDateTime is ambiguous

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34147:
---
Labels: pull-request-available  (was: )

> TimestampData to/from LocalDateTime is ambiguous
> 
>
> Key: FLINK-34147
> URL: https://issues.apache.org/jira/browse/FLINK-34147
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Rui Li
>Assignee: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
>
> It seems TimestampData is essentially an {{Instant}}. Therefore an implicit 
> time zone is used in the {{fromLocalDateTime}} and {{toLocalDateTime}} 
> methods. However, neither the method name nor the API doc indicates which 
> time zone is used, so from the caller's perspective the results of these two 
> methods are ambiguous.
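The ambiguity can be illustrated with plain java.time (a standalone sketch, not Flink code): the epoch value obtained from a LocalDateTime depends entirely on which zone is assumed, which is exactly the information the TimestampData Javadoc leaves implicit.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Standalone illustration (not Flink code): converting a LocalDateTime to an
// epoch-based value requires choosing a time zone, so a method that does this
// implicitly is ambiguous unless the zone is documented.
public class ZoneAmbiguityDemo {
    // The epoch millis of a LocalDateTime depend entirely on the chosen zone.
    static long toEpochMillis(LocalDateTime ldt, ZoneId zone) {
        return ldt.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDateTime ldt = LocalDateTime.of(2024, 2, 19, 12, 0);
        long utc = toEpochMillis(ldt, ZoneOffset.UTC);
        long shanghai = toEpochMillis(ldt, ZoneId.of("Asia/Shanghai"));
        // Same local wall-clock time, 8 hours apart on the epoch timeline.
        System.out.println(utc - shanghai); // prints 28800000
    }
}
```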



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-34147][table-planner] Enhance the java doc of TimestampData to distinguish the different usage of Instant and LocalDateTime [flink]

2024-02-19 Thread via GitHub


swuferhong opened a new pull request, #24339:
URL: https://github.com/apache/flink/pull/24339

   
   
   ## What is the purpose of the change
   
   Currently, the Javadocs of `TimestampData` are ambiguous. `TimestampData` 
represents both `Instant` and `LocalDateTime`: the class uses 
`fromInstant()/toInstant()` to convert an `Instant` from/to `TimestampData`, 
and `fromLocalDateTime()/toLocalDateTime()` to convert a `LocalDateTime` 
from/to `TimestampData`. This needs to be stated in the Javadocs. 
   
   ## Brief change log
   
   
   
   ## Verifying this change
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? no docs
   





[jira] [Commented] (FLINK-29114) TableSourceITCase#testTableHintWithLogicalTableScanReuse sometimes fails with result mismatch

2024-02-19 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818606#comment-17818606
 ] 

Jane Chan commented on FLINK-29114:
---

Hi [~mapohl], sorry for the late reply; I just noticed your message. I'll take 
a look now.

> TableSourceITCase#testTableHintWithLogicalTableScanReuse sometimes fails with 
> result mismatch 
> --
>
> Key: FLINK-29114
> URL: https://issues.apache.org/jira/browse/FLINK-29114
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner, Tests
>Affects Versions: 1.15.0, 1.19.0, 1.20.0
>Reporter: Sergey Nuyanzin
>Priority: Major
>  Labels: auto-deprioritized-major, test-stability
> Attachments: FLINK-29114.log
>
>
> It can be reproduced locally by repeating the test. Usually about 100 
> iterations are enough to produce several failures.
> {noformat}
> [ERROR] Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 1.664 s <<< FAILURE! - in 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase
> [ERROR] 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase.testTableHintWithLogicalTableScanReuse
>   Time elapsed: 0.108 s  <<< FAILURE!
> java.lang.AssertionError: expected: 3,2,Hello world, 3,2,Hello world, 3,2,Hello world)> but was: 2,2,Hello, 2,2,Hello, 3,2,Hello world, 3,2,Hello world)>
>     at org.junit.Assert.fail(Assert.java:89)
>     at org.junit.Assert.failNotEquals(Assert.java:835)
>     at org.junit.Assert.assertEquals(Assert.java:120)
>     at org.junit.Assert.assertEquals(Assert.java:146)
>     at 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase.testTableHintWithLogicalTableScanReuse(TableSourceITCase.scala:428)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>     at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
>     at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
>     at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
>     at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
>     at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
>     at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
>     at 
> 

[jira] [Commented] (FLINK-34156) Move Flink Calcite rules from Scala to Java

2024-02-19 Thread Yunhong Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818607#comment-17818607
 ] 

Yunhong Zheng commented on FLINK-34156:
---

Hi [~Sergey Nuyanzin]. Since I have been continuously involved in development 
related to the table planner and Calcite, I am quite familiar with this area. 
Could I join this work and help you with some of the subtasks? Looking forward 
to your reply. Thanks.

> Move Flink Calcite rules from Scala to Java
> ---
>
> Key: FLINK-34156
> URL: https://issues.apache.org/jira/browse/FLINK-34156
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Table SQL / Planner
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
> Fix For: 2.0.0
>
>
> This is an umbrella task for the migration of Calcite rules from Scala to 
> Java mentioned at https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> The reason is that since 1.28.0 ( CALCITE-4787 - Move core to use Immutables 
> instead of ImmutableBeans ) Calcite started to use Immutables 
> (https://immutables.github.io/) and since 1.29.0 removed ImmutableBeans ( 
> CALCITE-4839 - Remove remnants of ImmutableBeans post 1.28 release ). All 
> rule-configuration APIs that are not Immutables-based are marked as 
> deprecated. Since Immutables relies on code generation during Java 
> compilation, it seems impossible to use it for rules written in Scala.





[jira] [Commented] (FLINK-34355) Release Testing: Verify FLINK-34054 Support named parameters for functions and procedures

2024-02-19 Thread Shuai Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818611#comment-17818611
 ] 

Shuai Xu commented on FLINK-34355:
--

Hi, I have finished this verification.

> Release Testing: Verify FLINK-34054 Support named parameters for functions 
> and procedures
> -
>
> Key: FLINK-34355
> URL: https://issues.apache.org/jira/browse/FLINK-34355
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: lincoln lee
>Assignee: Shuai Xu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
>
> Test suggestion:
> 1. Implement a test UDF or procedure that supports named parameters.
> 2. When calling the function or procedure, use named parameters to verify 
> that the results are as expected.
> You can test the following scenarios:
> 1. Normal usage of named parameters, fully specifying each parameter.
> 2. Omitting optional parameters.
> 3. Omitting required parameters to confirm that an error is reported.





Re: [PR] [FLINK-34434][slotmanager] Complete the returnedFuture when slot remo… [flink]

2024-02-19 Thread via GitHub


KarmaGYZ commented on code in PR #24325:
URL: https://github.com/apache/flink/pull/24325#discussion_r1494398967


##
flink-runtime/src/test/java/org/apache/flink/runtime/resourcemanager/slotmanager/DefaultSlotStatusSyncerTest.java:
##
@@ -216,6 +202,58 @@ void testAllocateSlotFailsWithException() {
 
assertThat(taskManagerInfo.getAllocatedSlots()).isEmpty());
 }
 
+@Test
+void testAllocationUpdatesIgnoredIfSlotRemoved() throws Exception {

Review Comment:
   Code deduplication is always worthwhile. Thanks for the proposal :).






[jira] [Commented] (FLINK-34346) Release Testing: Verify FLINK-24024 Support session Window TVF

2024-02-19 Thread Shuai Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818612#comment-17818612
 ] 

Shuai Xu commented on FLINK-34346:
--

Hi, I have finished this testing. The exception message that I think could be 
improved has been linked to this Jira.

> Release Testing: Verify FLINK-24024 Support session Window TVF
> --
>
> Key: FLINK-34346
> URL: https://issues.apache.org/jira/browse/FLINK-34346
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: xuyang
>Assignee: Shuai Xu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
>
> Session window TVF is ready. Users can use session window TVF aggregation 
> instead of the legacy session group window aggregation.
> You can verify this feature by following the 
> [doc]([https://github.com/apache/flink/pull/24250]), although it is still 
> under review. 
> Furthermore, although session window join, session window rank, and session 
> window deduplicate are still experimental, if you find bugs in them, you can 
> also open a Jira linked to this one to report them.





[jira] [Closed] (FLINK-27891) Add ARRAY_APPEND and ARRAY_PREPEND supported in SQL & Table API

2024-02-19 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin closed FLINK-27891.
---

> Add ARRAY_APPEND and ARRAY_PREPEND supported in SQL & Table API
> ---
>
> Key: FLINK-27891
> URL: https://issues.apache.org/jira/browse/FLINK-27891
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.20.0
>
>
> {{ARRAY_APPEND}} - adds element to the end of the array and returns the 
> resulting array
> {{ARRAY_PREPEND}} - adds element to the beginning of the array and returns 
> the resulting array
> Syntax:
> {code:sql}
> ARRAY_APPEND(  ,  );
> ARRAY_PREPEND(  ,  );
> {code}
> Arguments:
> array: The ARRAY to which the new element is added.
> new_element: The element to add.
> Returns:
> An array. If array is NULL, the result is NULL.
> Examples:
> {code:sql}
> SELECT array_append(array[1, 2, 3], 4);
> -- array[1, 2, 3, 4]
> select array_append(cast(null as int array), 2);
> -- null
> SELECT array_prepend(4, array[1, 2, 3]);
> -- array[4, 1, 2, 3]
> SELECT array_prepend(null, array[1, 2, 3]);
> -- array[null, 1, 2, 3]
> {code}
> See more:
> {{ARRAY_APPEND}}
> Snowflake 
> [https://docs.snowflake.com/en/sql-reference/functions/array_append.html]
> PostgreSQL 
> [https://www.postgresql.org/docs/14/functions-array.html#ARRAY-FUNCTIONS-TABLE]
> {{ARRAY_PREPEND}}
> Snowflake 
> [https://docs.snowflake.com/en/sql-reference/functions/array_prepend.html]
> PostgreSQL 
> [https://www.postgresql.org/docs/14/functions-array.html#ARRAY-FUNCTIONS-TABLE]
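The documented semantics above can be sketched in plain Java (an illustrative sketch, not Flink's implementation): appending or prepending an element, with a NULL array yielding NULL and a NULL element kept inside the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of ARRAY_APPEND / ARRAY_PREPEND semantics as described in
// the ticket (not Flink's actual implementation).
public class ArrayOps {
    static List<Integer> arrayAppend(List<Integer> array, Integer element) {
        if (array == null) {
            return null; // NULL array -> NULL result
        }
        List<Integer> result = new ArrayList<>(array);
        result.add(element); // a NULL element is stored as-is
        return result;
    }

    static List<Integer> arrayPrepend(Integer element, List<Integer> array) {
        if (array == null) {
            return null; // NULL array -> NULL result
        }
        List<Integer> result = new ArrayList<>(array);
        result.add(0, element);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(arrayAppend(Arrays.asList(1, 2, 3), 4));     // [1, 2, 3, 4]
        System.out.println(arrayAppend(null, 2));                       // null
        System.out.println(arrayPrepend(4, Arrays.asList(1, 2, 3)));    // [4, 1, 2, 3]
        System.out.println(arrayPrepend(null, Arrays.asList(1, 2, 3))); // [null, 1, 2, 3]
    }
}
```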





[jira] [Resolved] (FLINK-27891) Add ARRAY_APPEND and ARRAY_PREPEND supported in SQL & Table API

2024-02-19 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin resolved FLINK-27891.
-
Fix Version/s: 1.20.0
   Resolution: Fixed

> Add ARRAY_APPEND and ARRAY_PREPEND supported in SQL & Table API
> ---
>
> Key: FLINK-27891
> URL: https://issues.apache.org/jira/browse/FLINK-27891
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.20.0
>
>
> {{ARRAY_APPEND}} - adds element to the end of the array and returns the 
> resulting array
> {{ARRAY_PREPEND}} - adds element to the beginning of the array and returns 
> the resulting array
> Syntax:
> {code:sql}
> ARRAY_APPEND(  ,  );
> ARRAY_PREPEND(  ,  );
> {code}
> Arguments:
> array: The ARRAY to which the new element is added.
> new_element: The element to add.
> Returns:
> An array. If array is NULL, the result is NULL.
> Examples:
> {code:sql}
> SELECT array_append(array[1, 2, 3], 4);
> -- array[1, 2, 3, 4]
> select array_append(cast(null as int array), 2);
> -- null
> SELECT array_prepend(4, array[1, 2, 3]);
> -- array[4, 1, 2, 3]
> SELECT array_prepend(null, array[1, 2, 3]);
> -- array[null, 1, 2, 3]
> {code}
> See more:
> {{ARRAY_APPEND}}
> Snowflake 
> [https://docs.snowflake.com/en/sql-reference/functions/array_append.html]
> PostgreSQL 
> [https://www.postgresql.org/docs/14/functions-array.html#ARRAY-FUNCTIONS-TABLE]
> {{ARRAY_PREPEND}}
> Snowflake 
> [https://docs.snowflake.com/en/sql-reference/functions/array_prepend.html]
> PostgreSQL 
> [https://www.postgresql.org/docs/14/functions-array.html#ARRAY-FUNCTIONS-TABLE]





[jira] [Commented] (FLINK-27891) Add ARRAY_APPEND and ARRAY_PREPEND supported in SQL & Table API

2024-02-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818580#comment-17818580
 ] 

Sergey Nuyanzin commented on FLINK-27891:
-

Merged as 
[e644beac8e5ffe71d9b6185c06ed31050e7c5268|https://github.com/apache/flink/commit/e644beac8e5ffe71d9b6185c06ed31050e7c5268]

> Add ARRAY_APPEND and ARRAY_PREPEND supported in SQL & Table API
> ---
>
> Key: FLINK-27891
> URL: https://issues.apache.org/jira/browse/FLINK-27891
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>
> {{ARRAY_APPEND}} - adds element to the end of the array and returns the 
> resulting array
> {{ARRAY_PREPEND}} - adds element to the beginning of the array and returns 
> the resulting array
> Syntax:
> {code:sql}
> ARRAY_APPEND(  ,  );
> ARRAY_PREPEND(  ,  );
> {code}
> Arguments:
> array: The ARRAY to which the new element is added.
> new_element: The element to add.
> Returns:
> An array. If array is NULL, the result is NULL.
> Examples:
> {code:sql}
> SELECT array_append(array[1, 2, 3], 4);
> -- array[1, 2, 3, 4]
> select array_append(cast(null as int array), 2);
> -- null
> SELECT array_prepend(4, array[1, 2, 3]);
> -- array[4, 1, 2, 3]
> SELECT array_prepend(null, array[1, 2, 3]);
> -- array[null, 1, 2, 3]
> {code}
> See more:
> {{ARRAY_APPEND}}
> Snowflake 
> [https://docs.snowflake.com/en/sql-reference/functions/array_append.html]
> PostgreSQL 
> [https://www.postgresql.org/docs/14/functions-array.html#ARRAY-FUNCTIONS-TABLE]
> {{ARRAY_PREPEND}}
> Snowflake 
> [https://docs.snowflake.com/en/sql-reference/functions/array_prepend.html]
> PostgreSQL 
> [https://www.postgresql.org/docs/14/functions-array.html#ARRAY-FUNCTIONS-TABLE]





[jira] [Closed] (FLINK-34160) Migrate FlinkCalcMergeRule

2024-02-19 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin closed FLINK-34160.
---

> Migrate FlinkCalcMergeRule
> --
>
> Key: FLINK-34160
> URL: https://issues.apache.org/jira/browse/FLINK-34160
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Resolved] (FLINK-34160) Migrate FlinkCalcMergeRule

2024-02-19 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin resolved FLINK-34160.
-
Resolution: Fixed

> Migrate FlinkCalcMergeRule
> --
>
> Key: FLINK-34160
> URL: https://issues.apache.org/jira/browse/FLINK-34160
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Commented] (FLINK-34160) Migrate FlinkCalcMergeRule

2024-02-19 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818586#comment-17818586
 ] 

Sergey Nuyanzin commented on FLINK-34160:
-

Merged as 
[c2eac7ec85bef93fe2b61c028984e704c5a9d126|https://github.com/apache/flink/commit/c2eac7ec85bef93fe2b61c028984e704c5a9d126]

> Migrate FlinkCalcMergeRule
> --
>
> Key: FLINK-34160
> URL: https://issues.apache.org/jira/browse/FLINK-34160
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
>






Re: [PR] [FLINK-27891][Table] Add ARRAY_APPEND and ARRAY_PREPEND functions [flink]

2024-02-19 Thread via GitHub


snuyanzin merged PR #19873:
URL: https://github.com/apache/flink/pull/19873





Re: [PR] [FLINK-34160][table] Migration of FlinkCalcMergeRule to java [flink]

2024-02-19 Thread via GitHub


snuyanzin merged PR #24142:
URL: https://github.com/apache/flink/pull/24142





[jira] [Commented] (FLINK-32596) The partition key will be wrong when use Flink dialect to create Hive table

2024-02-19 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818603#comment-17818603
 ] 

luoyuxia commented on FLINK-32596:
--

[~walls.flink.m] Thanks for your investigation. So do you mean the Hive 
metastore will always take the last columns as the partition columns, 
regardless of which columns we specify as partition columns?

> The partition key will be wrong when use Flink dialect to create Hive table
> ---
>
> Key: FLINK-32596
> URL: https://issues.apache.org/jira/browse/FLINK-32596
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.15.0, 1.16.0, 1.17.0
>Reporter: luoyuxia
>Assignee: Vallari Rastogi
>Priority: Major
> Attachments: image-2024-02-14-16-06-13-126.png, 
> image-2024-02-15-03-05-22-541.png, image-2024-02-15-03-06-28-175.png, 
> image-2024-02-15-03-08-50-029.png
>
>
> Can be reproduced by the following SQL:
>  
> {code:java}
> tableEnv.getConfig().setSqlDialect(SqlDialect.DEFAULT);
> tableEnv.executeSql(
> "create table t1(`date` string, `geo_altitude` FLOAT) partitioned by 
> (`date`)"
> + " with ('connector' = 'hive', 
> 'sink.partition-commit.delay'='1 s',  
> 'sink.partition-commit.policy.kind'='metastore,success-file')");
> CatalogTable catalogTable =
> (CatalogTable) 
> hiveCatalog.getTable(ObjectPath.fromString("default.t1"));
> // the following assertion will fail
> assertThat(catalogTable.getPartitionKeys().toString()).isEqualTo("[date]");{code}
>  
>  
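The behavior being discussed can be sketched in plain Java (a hypothetical helper for illustration, not HiveCatalog code): since Hive stores partition columns separately from the data columns, a catalog that maps a Flink schema onto a Hive table effectively reorders the schema so that the partition columns come last, which is why a partition column declared first in the Flink DDL can appear to move.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not Flink/Hive code): reorder a schema the way a Hive
// table lays it out, with data columns first and partition columns appended.
public class PartitionLayout {
    static List<String> hiveColumnOrder(List<String> schema, List<String> partitionKeys) {
        List<String> dataCols = new ArrayList<>();
        for (String col : schema) {
            if (!partitionKeys.contains(col)) {
                dataCols.add(col);
            }
        }
        // Data columns first, partition columns at the end.
        dataCols.addAll(partitionKeys);
        return dataCols;
    }

    public static void main(String[] args) {
        // `date` is declared first in the Flink DDL but is a partition column.
        System.out.println(hiveColumnOrder(
                Arrays.asList("date", "geo_altitude"), Arrays.asList("date")));
        // prints [geo_altitude, date]
    }
}
```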





Re: [PR] [FLINK-34147][table-planner] Enhance the java doc of TimestampData to distinguish the different usage of Instant and LocalDateTime [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24339:
URL: https://github.com/apache/flink/pull/24339#issuecomment-1953353325

   
   ## CI report:
   
   * 1e0acf858f5dfa9a029dfc0507d345cdd36a17b3 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [BP-1.17][hotfix][docs] Update the versions of mongodb supported by mongodb-connector [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24345:
URL: https://github.com/apache/flink/pull/24345#issuecomment-1953438118

   
   ## CI report:
   
   * 181a6cc57d9fc59cf18c69686fc7ddc4735c4f40 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [FLINK-34463] Open catalog in CatalogManager should use proper context classloader [flink]

2024-02-19 Thread via GitHub


hackergin commented on code in PR #24328:
URL: https://github.com/apache/flink/pull/24328#discussion_r1495218821


##
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/catalog/CatalogManager.java:
##
@@ -310,7 +311,10 @@ public void createCatalog(String catalogName, 
CatalogDescriptor catalogDescripto
 }
 
 Catalog catalog = initCatalog(catalogName, catalogDescriptor);
-catalog.open();
+try (TemporaryClassLoaderContext context =

Review Comment:
   @jrthe42  Thank you for your contribution. Could you please add a test to 
verify this change?
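The try-with-resources pattern under review can be sketched with a minimal stand-in for Flink's TemporaryClassLoaderContext (the class below is illustrative, not the real Flink class): swap the thread context classloader before `catalog.open()` runs, and restore it on close even if the call throws.

```java
// Minimal illustrative stand-in for Flink's TemporaryClassLoaderContext:
// installs a temporary thread context classloader and restores the original
// one when closed, making it safe to wrap calls like catalog.open().
public class TempClassLoaderContext implements AutoCloseable {
    private final ClassLoader original;

    private TempClassLoaderContext(ClassLoader original) {
        this.original = original;
    }

    static TempClassLoaderContext of(ClassLoader temporary) {
        Thread t = Thread.currentThread();
        ClassLoader original = t.getContextClassLoader();
        t.setContextClassLoader(temporary);
        return new TempClassLoaderContext(original);
    }

    @Override
    public void close() {
        // Runs even if the guarded block throws, restoring the original loader.
        Thread.currentThread().setContextClassLoader(original);
    }

    public static void main(String[] args) {
        ClassLoader before = Thread.currentThread().getContextClassLoader();
        try (TempClassLoaderContext ignored =
                TempClassLoaderContext.of(TempClassLoaderContext.class.getClassLoader())) {
            // catalog.open() would run here with the user classloader installed
        }
        // The original context classloader is restored afterwards.
        System.out.println(before == Thread.currentThread().getContextClassLoader()); // prints true
    }
}
```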






[jira] [Commented] (FLINK-34397) Resource wait timeout can't be disabled

2024-02-19 Thread Pulkit Jain (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818627#comment-17818627
 ] 

Pulkit Jain commented on FLINK-34397:
-

[~chesnay] 

Is there any update on this issue? We are also facing it on Flink 1.16.1. 
Could you confirm whether this issue applies to that release as well?

 

Thanks

Pulkit

> Resource wait timeout can't be disabled
> ---
>
> Key: FLINK-34397
> URL: https://issues.apache.org/jira/browse/FLINK-34397
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration
>Affects Versions: 1.17.2
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
> Fix For: 1.19.0, 1.17.3, 1.18.2
>
>
> The documentation for {{jobmanager.adaptive-scheduler.resource-wait-timeout}} 
> states that:
> ??Setting a negative duration will disable the resource timeout: The 
> JobManager will wait indefinitely for resources to appear.??
> However, we don't support parsing negative durations.
> {code}
> Could not parse value '-1 s' for key 
> 'jobmanager.adaptive-scheduler.resource-wait-timeout'.
> Caused by: java.lang.NumberFormatException: text does not start with a number
>   at org.apache.flink.util.TimeUtils.parseDuration(TimeUtils.java:80)
>   at 
> org.apache.flink.configuration.ConfigurationUtils.convertToDuration(ConfigurationUtils.java:399)
>   at 
> org.apache.flink.configuration.ConfigurationUtils.convertValue(ConfigurationUtils.java:331)
>   at 
> org.apache.flink.configuration.Configuration.lambda$getOptional$3(Configuration.java:729)
>   at java.base/java.util.Optional.map(Optional.java:260)
>   at 
> org.apache.flink.configuration.Configuration.getOptional(Configuration.java:729)
>   ... 2 more
> {code}
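The fix the documentation implies can be sketched as a tolerant parser (illustrative only; Flink's actual TimeUtils.parseDuration rejects the leading minus, which is the bug reported here): accept a leading minus sign and treat any negative duration as "wait indefinitely".

```java
import java.time.Duration;

// Illustrative sketch, not Flink code: parse "<number> s"-style values while
// tolerating a leading minus, and treat negative durations as "disabled".
public class WaitTimeout {
    static Duration parseWaitTimeout(String text) {
        String trimmed = text.trim();
        boolean negative = trimmed.startsWith("-");
        if (negative) {
            trimmed = trimmed.substring(1).trim();
        }
        // Very small parser for the "<number> s" form used in the example.
        String[] parts = trimmed.split("\\s+");
        long seconds = Long.parseLong(parts[0]);
        Duration d = Duration.ofSeconds(seconds);
        return negative ? d.negated() : d;
    }

    static boolean isDisabled(Duration timeout) {
        return timeout.isNegative(); // negative -> wait indefinitely
    }

    public static void main(String[] args) {
        Duration d = parseWaitTimeout("-1 s");
        System.out.println(isDisabled(d)); // prints true
    }
}
```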





Re: [PR] [FLINK-33636][autoscaler] Support the JDBCAutoScalerEventHandler [flink-kubernetes-operator]

2024-02-19 Thread via GitHub


1996fanrui commented on PR #765:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/765#issuecomment-1953567662

   Hi @mxm  @gyfora , gentle ping, do you have any other comments?





[jira] [Closed] (FLINK-34348) Release Testing: Verify FLINK-20281 Window aggregation supports changelog stream input

2024-02-19 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34348.
---
Resolution: Fixed

[~hackergin] Thanks for your testing work!

> Release Testing: Verify FLINK-20281 Window aggregation supports changelog 
> stream input
> --
>
> Key: FLINK-34348
> URL: https://issues.apache.org/jira/browse/FLINK-34348
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: xuyang
>Assignee: Feng Jin
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
> Attachments: 截屏2024-02-07 16.21.37.png, 截屏2024-02-07 16.21.55.png, 
> 截屏2024-02-07 16.22.24.png, 截屏2024-02-07 16.23.12.png, 截屏2024-02-07 
> 16.23.27.png, 截屏2024-02-07 16.23.38.png, 截屏2024-02-07 16.29.09.png, 
> 截屏2024-02-07 16.29.21.png, 截屏2024-02-07 16.29.34.png, 截屏2024-02-07 
> 16.46.12.png, 截屏2024-02-07 16.46.23.png, 截屏2024-02-07 16.46.37.png, 
> 截屏2024-02-07 16.53.37.png, 截屏2024-02-07 16.53.47.png, 截屏2024-02-07 
> 16.54.01.png, 截屏2024-02-07 16.59.22.png, 截屏2024-02-07 16.59.33.png, 
> 截屏2024-02-07 16.59.42.png
>
>
> Window TVF aggregation support for changelog stream input is ready for 
> testing. Users can add a window TVF aggregation downstream of a CDC source or 
> of nodes that produce CDC records.
> You can verify this feature as follows:
>  # Prepare a MySQL table and insert some data first.
>  # Start sql-client and prepare the DDL for this MySQL table as a CDC source.
>  # Verify the plan with `EXPLAIN PLAN_ADVICE` to check that there is a 
> window aggregate node and that the changelog contains "UA", "UB", or "D" in 
> its upstream. 
>  # Use different kinds of window TVFs to test window TVF aggregation while 
> updating the source data, to check data correctness.





[jira] [Commented] (FLINK-31472) AsyncSinkWriterThrottlingTest failed with Illegal mailbox thread

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818685#comment-17818685
 ] 

Matthias Pohl commented on FLINK-31472:
---

https://github.com/apache/flink/actions/runs/7967481900/job/21750506043#step:10:10473

> AsyncSinkWriterThrottlingTest failed with Illegal mailbox thread
> 
>
> Key: FLINK-31472
> URL: https://issues.apache.org/jira/browse/FLINK-31472
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Affects Versions: 1.17.0, 1.16.1, 1.18.0, 1.19.0
>Reporter: Ran Tao
>Assignee: Ahmed Hamdy
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.19.0
>
>
> when run mvn clean test, this case failed occasionally.
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.955 
> s <<< FAILURE! - in 
> org.apache.flink.connector.base.sink.writer.AsyncSinkWriterThrottlingTest
> [ERROR] 
> org.apache.flink.connector.base.sink.writer.AsyncSinkWriterThrottlingTest.testSinkThroughputShouldThrottleToHalfBatchSize
>   Time elapsed: 0.492 s  <<< ERROR!
> java.lang.IllegalStateException: Illegal thread detected. This method must be 
> called from inside the mailbox thread!
>         at 
> org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.checkIsMailboxThread(TaskMailboxImpl.java:262)
>         at 
> org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.take(TaskMailboxImpl.java:137)
>         at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.yield(MailboxExecutorImpl.java:84)
>         at 
> org.apache.flink.connector.base.sink.writer.AsyncSinkWriter.flush(AsyncSinkWriter.java:367)
>         at 
> org.apache.flink.connector.base.sink.writer.AsyncSinkWriter.lambda$registerCallback$3(AsyncSinkWriter.java:315)
>         at 
> org.apache.flink.streaming.runtime.tasks.TestProcessingTimeService$CallbackTask.onProcessingTime(TestProcessingTimeService.java:199)
>         at 
> org.apache.flink.streaming.runtime.tasks.TestProcessingTimeService.setCurrentTime(TestProcessingTimeService.java:76)
>         at 
> org.apache.flink.connector.base.sink.writer.AsyncSinkWriterThrottlingTest.testSinkThroughputShouldThrottleToHalfBatchSize(AsyncSinkWriterThrottlingTest.java:64)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>         at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>         at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>         at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>         at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>         at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>         at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>         at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
>         at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
>         at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
>         at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
>         at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
>         at 
> 

[jira] [Commented] (FLINK-29114) TableSourceITCase#testTableHintWithLogicalTableScanReuse sometimes fails with result mismatch

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818691#comment-17818691
 ] 

Matthias Pohl commented on FLINK-29114:
---

Much appreciated!

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57642=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11539

> TableSourceITCase#testTableHintWithLogicalTableScanReuse sometimes fails with 
> result mismatch 
> --
>
> Key: FLINK-29114
> URL: https://issues.apache.org/jira/browse/FLINK-29114
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner, Tests
>Affects Versions: 1.15.0, 1.19.0, 1.20.0
>Reporter: Sergey Nuyanzin
>Priority: Major
>  Labels: auto-deprioritized-major, test-stability
> Attachments: FLINK-29114.log
>
>
> It could be reproduced locally by repeating tests. Usually about 100 
> iterations are enough to have several failed tests
> {noformat}
> [ERROR] Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 1.664 s <<< FAILURE! - in 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase
> [ERROR] 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase.testTableHintWithLogicalTableScanReuse
>   Time elapsed: 0.108 s  <<< FAILURE!
> java.lang.AssertionError: expected:<List(3,2,Hello world, 3,2,Hello world, 3,2,Hello world)> but was:<List(2,2,Hello, 2,2,Hello, 3,2,Hello world, 3,2,Hello world)>
>     at org.junit.Assert.fail(Assert.java:89)
>     at org.junit.Assert.failNotEquals(Assert.java:835)
>     at org.junit.Assert.assertEquals(Assert.java:120)
>     at org.junit.Assert.assertEquals(Assert.java:146)
>     at 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase.testTableHintWithLogicalTableScanReuse(TableSourceITCase.scala:428)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>     at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
>     at 
> org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
>     at 
> org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
>     at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
>     at 
> org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
>     at 
> 

Re: [PR] [FLINK-34266] Compute Autoscaler metrics correctly over metric window [flink-kubernetes-operator]

2024-02-19 Thread via GitHub


1996fanrui commented on code in PR #774:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/774#discussion_r1495183092


##
flink-autoscaler/src/main/java/org/apache/flink/autoscaler/metrics/ScalingMetric.java:
##
@@ -39,12 +39,6 @@ public enum ScalingMetric {
 /** Current processing rate. */
 CURRENT_PROCESSING_RATE(true),

Review Comment:
   Thanks for the update.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34434][slotmanager] Complete the returnedFuture when slot remo… [flink]

2024-02-19 Thread via GitHub


KarmaGYZ merged PR #24325:
URL: https://github.com/apache/flink/pull/24325


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33241][doc] Align config option generation documentation for Flink's config documentation. [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24344:
URL: https://github.com/apache/flink/pull/24344#issuecomment-1953437389

   
   ## CI report:
   
   * 8d4e9490af1859f884721320949815886bdf0f6f UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [BP-1.18][hotfix][docs] Update the versions of mongodb supported by mongodb-connector [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24343:
URL: https://github.com/apache/flink/pull/24343#issuecomment-1953436713

   
   ## CI report:
   
   * 12c488ade5bc133738eadb5dcc181c61947c935b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [Draft][bugfix] Move the deserialization of shuffleDescriptor to a separate … [flink]

2024-02-19 Thread via GitHub


caodizhou commented on PR #24115:
URL: https://github.com/apache/flink/pull/24115#issuecomment-1953442349

   @flinkbot  run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33634] Add Conditions to Flink CRD's Status field [flink-kubernetes-operator]

2024-02-19 Thread via GitHub


lajith2006 commented on PR #749:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/749#issuecomment-1953491190

   > > > > > @gyfora yes, I am having JIRA account using which I can login to 
https://issues.apache.org/jira/projects/FLINK/.
   > > > > 
   > > > > 
   > > > > okay then can you please tell me the account name? :D
   > > > 
   > > > 
   > > > account name : **lajithk**
   > > 
   > > 
   > > It seems like you need to create a Confluence account (cwiki.apache.org); 
once you have that I can give you permissions to create a FLIP page
   > 
   > I have been trying to create a Confluence account at 
https://cwiki.apache.org/confluence. It says to register via the log-in page, 
but I don't see any option to register there. On further digging I noticed 
something like 
https://cwiki.apache.org/confluence/display/DIRxTRIPLESEC/User+Registration. 
Is that something I have to follow up on, or is there another path I can take 
for registration?
   
   @gyfora, could you please point me to anyone I can reach out to for 
assistance with getting an account created on 
https://cwiki.apache.org/confluence? Thank you in advance. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-26515) RetryingExecutorTest. testDiscardOnTimeout failed on azure

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818686#comment-17818686
 ] 

Matthias Pohl commented on FLINK-26515:
---

https://github.com/apache/flink/actions/runs/7967481900/job/21750524527#step:10:10931

> RetryingExecutorTest. testDiscardOnTimeout failed on azure
> --
>
> Key: FLINK-26515
> URL: https://issues.apache.org/jira/browse/FLINK-26515
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.14.3, 1.17.0, 1.16.1, 1.18.0, 1.19.0
>Reporter: Yun Gao
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available, 
> test-stability
>
> {code:java}
> Mar 06 01:20:29 [ERROR] Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 1.941 s <<< FAILURE! - in 
> org.apache.flink.changelog.fs.RetryingExecutorTest
> Mar 06 01:20:29 [ERROR] testTimeout  Time elapsed: 1.934 s  <<< FAILURE!
> Mar 06 01:20:29 java.lang.AssertionError: expected:<500.0> but 
> was:<1922.869766>
> Mar 06 01:20:29   at org.junit.Assert.fail(Assert.java:89)
> Mar 06 01:20:29   at org.junit.Assert.failNotEquals(Assert.java:835)
> Mar 06 01:20:29   at org.junit.Assert.assertEquals(Assert.java:555)
> Mar 06 01:20:29   at org.junit.Assert.assertEquals(Assert.java:685)
> Mar 06 01:20:29   at 
> org.apache.flink.changelog.fs.RetryingExecutorTest.testTimeout(RetryingExecutorTest.java:145)
> Mar 06 01:20:29   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Mar 06 01:20:29   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Mar 06 01:20:29   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Mar 06 01:20:29   at java.lang.reflect.Method.invoke(Method.java:498)
> Mar 06 01:20:29   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Mar 06 01:20:29   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Mar 06 01:20:29   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Mar 06 01:20:29   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Mar 06 01:20:29   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> Mar 06 01:20:29   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Mar 06 01:20:29   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Mar 06 01:20:29   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Mar 06 01:20:29   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> Mar 06 01:20:29   at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> Mar 06 01:20:29   at 
> org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
> Mar 06 01:20:29   at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> Mar 06 01:20:29   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> Mar 06 01:20:29   at 
> java.util.Iterator.forEachRemaining(Iterator.java:116)
> Mar 06 01:20:29   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> Mar 06 01:20:29   at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> Mar 06 01:20:29   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>  {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=32569=logs=f450c1a5-64b1-5955-e215-49cb1ad5ec88=cc452273-9efa-565d-9db8-ef62a38a0c10=22554



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [CP-1.19][FLINK-34434][slotmanager] Complete the returnedFuture when slot remo… [flink]

2024-02-19 Thread via GitHub


KarmaGYZ commented on PR #24326:
URL: https://github.com/apache/flink/pull/24326#issuecomment-1953396289

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34336][test] Fix the bug that AutoRescalingITCase may hang sometimes [flink]

2024-02-19 Thread via GitHub


1996fanrui commented on code in PR #24340:
URL: https://github.com/apache/flink/pull/24340#discussion_r1495176445


##
flink-tests/src/test/java/org/apache/flink/test/checkpointing/AutoRescalingITCase.java:
##
@@ -163,6 +163,8 @@ public void setup() throws Exception {
 
NettyShuffleEnvironmentOptions.NETWORK_BUFFERS_PER_CHANNEL, buffersPerChannel);
 
 config.set(JobManagerOptions.SCHEDULER, 
JobManagerOptions.SchedulerType.Adaptive);
+// Disable the scaling cooldown to speed up the test
+config.set(JobManagerOptions.SCHEDULER_SCALING_INTERVAL_MIN, 
Duration.ofMillis(0));

Review Comment:
   Disabling the scaling cooldown increases the possibility of hanging, so I 
didn't merge this into 1.19 before.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-34336][test] Fix the bug that AutoRescalingITCase may hang sometimes [flink]

2024-02-19 Thread via GitHub


1996fanrui opened a new pull request, #24340:
URL: https://github.com/apache/flink/pull/24340

   Backporting FLINK-34336 to 1.19


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34424) BoundedBlockingSubpartitionWriteReadTest#testRead10ConsumersConcurrent times out

2024-02-19 Thread Yuxin Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818615#comment-17818615
 ] 

Yuxin Tan commented on FLINK-34424:
---

[~mapohl] [~pnowojski] Sorry for the late reply. I tried to reproduce the 
issue, but it cannot be reproduced in my local environment. I think it may be 
an occasional case with a low probability. [~yunfengzhou] and I will continue 
investigating the cause.

> BoundedBlockingSubpartitionWriteReadTest#testRead10ConsumersConcurrent times 
> out
> 
>
> Key: FLINK-34424
> URL: https://issues.apache.org/jira/browse/FLINK-34424
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Matthias Pohl
>Assignee: Yunfeng Zhou
>Priority: Critical
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57446=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9151
> {code}
> Feb 11 13:55:29 "ForkJoinPool-50-worker-25" #414 daemon prio=5 os_prio=0 
> tid=0x7f19503af800 nid=0x284c in Object.wait() [0x7f191b6db000]
> Feb 11 13:55:29java.lang.Thread.State: WAITING (on object monitor)
> Feb 11 13:55:29   at java.lang.Object.wait(Native Method)
> Feb 11 13:55:29   at java.lang.Thread.join(Thread.java:1252)
> Feb 11 13:55:29   - locked <0xe2e019a8> (a 
> org.apache.flink.runtime.io.network.partition.BoundedBlockingSubpartitionWriteReadTest$LongReader)
> Feb 11 13:55:29   at 
> org.apache.flink.core.testutils.CheckedThread.trySync(CheckedThread.java:104)
> Feb 11 13:55:29   at 
> org.apache.flink.core.testutils.CheckedThread.sync(CheckedThread.java:92)
> Feb 11 13:55:29   at 
> org.apache.flink.core.testutils.CheckedThread.sync(CheckedThread.java:81)
> Feb 11 13:55:29   at 
> org.apache.flink.runtime.io.network.partition.BoundedBlockingSubpartitionWriteReadTest.testRead10ConsumersConcurrent(BoundedBlockingSubpartitionWriteReadTest.java:177)
> Feb 11 13:55:29   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [...]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34336][test] Fix the bug that AutoRescalingITCase may hang sometimes [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24340:
URL: https://github.com/apache/flink/pull/24340#issuecomment-1953400607

   
   ## CI report:
   
   * 3c5a7ccf0f1c9c2a4cc27b34ffebb51cb5296c62 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix][docs] Update the versions of mongodb supported by mongodb-connector [flink]

2024-02-19 Thread via GitHub


leonardBang merged PR #24337:
URL: https://github.com/apache/flink/pull/24337


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34383][Documentation] Modify the comment with incorrect syntax [flink]

2024-02-19 Thread via GitHub


zhuzhurk commented on PR #24273:
URL: https://github.com/apache/flink/pull/24273#issuecomment-1953432542

   Thanks for participating in the community work! @lxliyou001 
   Usually we avoid creating individual JIRAs and PRs for this kind of 
typo to avoid polluting the commit history, as long as the typo does not 
confuse or mislead users/developers.
   
   To participate in Flink community work, you can find `starter` tasks in 
JIRA [1].
   Or you may take a task from the release testing of Flink 1.19 [2]. It is 
very important for the release of Flink.
   
   [1] 
https://issues.apache.org/jira/browse/FLINK-34419?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20starter
   [2] https://issues.apache.org/jira/browse/FLINK-34285


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34463] Open catalog in CatalogManager should use proper context classloader [flink]

2024-02-19 Thread via GitHub


hackergin commented on code in PR #24328:
URL: https://github.com/apache/flink/pull/24328#discussion_r1495218821


##
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/catalog/CatalogManager.java:
##
@@ -310,7 +311,10 @@ public void createCatalog(String catalogName, 
CatalogDescriptor catalogDescripto
 }
 
 Catalog catalog = initCatalog(catalogName, catalogDescriptor);
-catalog.open();
+try (TemporaryClassLoaderContext context =

Review Comment:
   @jrthe42  Thanks for the contribution. Can you add a test to verify this 
change?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-34346) Release Testing: Verify FLINK-24024 Support session Window TVF

2024-02-19 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34346.
---
Resolution: Fixed

[~xu_shuai_] Thanks for testing this!

> Release Testing: Verify FLINK-24024 Support session Window TVF
> --
>
> Key: FLINK-34346
> URL: https://issues.apache.org/jira/browse/FLINK-34346
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: xuyang
>Assignee: Shuai Xu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
>
> Session window TVF is ready. Users can use session window TVF aggregation 
> instead of the legacy session group window aggregation.
> Someone can verify this feature by following the 
> [doc]([https://github.com/apache/flink/pull/24250]), although it is still 
> being reviewed. 
> Furthermore, although session window join, session window rank and session 
> window deduplicate are in an experimental state, if someone finds bugs 
> in them, you could also open a Jira linked to this one to report them.
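A minimal sketch of the session window TVF syntax under test follows; the table and column names are illustrative assumptions, not from the ticket:

```sql
-- Session windows per item close after a 5-minute gap of inactivity.
SELECT item, window_start, window_end, SUM(price) AS total_price
FROM TABLE(
    SESSION(TABLE bid PARTITION BY item, DESCRIPTOR(bidtime), INTERVAL '5' MINUTES))
GROUP BY item, window_start, window_end;
```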



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-34355) Release Testing: Verify FLINK-34054 Support named parameters for functions and procedures

2024-02-19 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-34355.
---
Resolution: Fixed

[~xu_shuai_] Thanks for your testing work!

> Release Testing: Verify FLINK-34054 Support named parameters for functions 
> and procedures
> -
>
> Key: FLINK-34355
> URL: https://issues.apache.org/jira/browse/FLINK-34355
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: lincoln lee
>Assignee: Shuai Xu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
>
> Test suggestion:
> 1. Implement a test UDF or procedure that supports named parameters.
> 2. When calling the function or procedure, use named parameters to verify 
> that the results are as expected.
> You can test the following scenarios:
> 1. Normal usage of named parameters, fully specifying each parameter.
> 2. Omitting unnecessary parameters.
> 3. Omitting necessary parameters to confirm that an error is reported.
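The three scenarios above can be sketched in Flink SQL. This is a minimal sketch; the function name `my_greet` and its parameters are illustrative assumptions, not from the ticket:

```sql
-- Assume my_greet(name [required], greeting [optional]) is a registered UDF.
SELECT my_greet(name => 'Flink', greeting => 'Hi');  -- 1. fully specified
SELECT my_greet(name => 'Flink');                    -- 2. optional parameter omitted
SELECT my_greet(greeting => 'Hi');                   -- 3. required parameter omitted: expect an error
```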



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29436) Upgrade Spotless Maven Plugin to 2.27.1

2024-02-19 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-29436:
---
Component/s: Build System
 Connectors / Parent

> Upgrade Spotless Maven Plugin to 2.27.1
> ---
>
> Key: FLINK-29436
> URL: https://issues.apache.org/jira/browse/FLINK-29436
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System, Connectors / Parent
>Reporter: Zili Chen
>Assignee: Zili Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, connector-parent-1.1.0
>
>
> This blocker is fixed by: https://github.com/diffplug/spotless/pull/1224 and 
> https://github.com/diffplug/spotless/pull/1228.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34336) AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may hang sometimes

2024-02-19 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818688#comment-17818688
 ] 

Rui Fan commented on FLINK-34336:
-

Hi [~mapohl], thanks for your reminder. I have submitted a PR to backport it 
to 1.19: [https://github.com/apache/flink/pull/24340]

Please correct me if I misunderstood.

> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang sometimes
> -
>
> Key: FLINK-34336
> URL: https://issues.apache.org/jira/browse/FLINK-34336
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.19.0, 1.20.0
>
>
> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang in 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);
> h2. Reason:
> The job has 2 tasks (vertices) after calling updateJobResourceRequirements. 
> The source parallelism isn't changed (it stays at parallelism), and the 
> FlatMapper+Sink is changed from parallelism to parallelism2.
> So we expect the task number to be parallelism + parallelism2 instead of 
> parallelism2.
>  
> h2. Why it can be passed for now?
> Flink 1.19 supports the scaling cooldown, and the cooldown time is 30s by 
> default. It means the Flink job will rescale 30 seconds after 
> updateJobResourceRequirements is called.
>  
> So the running tasks still use the old parallelism when we call 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);
> IIUC, it cannot be guaranteed, and it's unexpected.
>  
> h2. How to reproduce this bug?
> [https://github.com/1996fanrui/flink/commit/ffd713e24d37db2c103e4cd4361d0cd916d0d2f6]
>  * Disable the cooldown
>  * Sleep for a while before waitForRunningTasks
> If so, the job runs at the new parallelism, so `waitForRunningTasks` will hang 
> forever.
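The task-count reasoning quoted above can be illustrated with a small, self-contained sketch (hypothetical code for illustration only — the class and method names are invented and are not Flink's actual test code):

```java
// Hypothetical sketch of the expectation described in the issue above:
// the source vertex keeps its old parallelism while the FlatMapper+Sink
// vertex is rescaled, so the total number of running tasks should be
// parallelism + parallelism2, not parallelism2.
public class ExpectedTaskCount {
    static int expectedRunningTasks(int sourceParallelism, int rescaledParallelism) {
        // source vertex (unchanged) + FlatMapper+Sink vertex (rescaled)
        return sourceParallelism + rescaledParallelism;
    }

    public static void main(String[] args) {
        int parallelism = 2;   // initial parallelism of both vertices
        int parallelism2 = 4;  // new parallelism of FlatMapper+Sink
        System.out.println(expectedRunningTasks(parallelism, parallelism2)); // prints 6
    }
}
```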





[jira] [Commented] (FLINK-34202) python tests take suspiciously long in some of the cases

2024-02-19 Thread Jing Ge (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818693#comment-17818693
 ] 

Jing Ge commented on FLINK-34202:
-

Thanks for the info, [~jeyhunkarimov] could you help check the Alibaba001 VM?

> python tests take suspiciously long in some of the cases
> 
>
> Key: FLINK-34202
> URL: https://issues.apache.org/jira/browse/FLINK-34202
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.2, 1.19.0, 1.18.1
>Reporter: Matthias Pohl
>Assignee: Xingbo Huang
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> [This release-1.18 
> build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603=logs=3e4dd1a2-fe2f-5e5d-a581-48087e718d53=b4612f28-e3b5-5853-8a8b-610ae894217a]
>  has the python stage running into a timeout without any obvious reason. The 
> [python stage run for 
> JDK17|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603=logs=b53e1644-5cb4-5a3b-5d48-f523f39bcf06]
>  was also getting close to the 4h timeout.
> I'm creating this issue for documentation purposes.





[PR] [hotfix][docs] Integrate mongodb v1.1 docs [flink]

2024-02-19 Thread via GitHub


leonardBang opened a new pull request, #24341:
URL: https://github.com/apache/flink/pull/24341

   [hotfix][docs] Integrate mongodb v1.1 docs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [BP-1.17][hotfix][docs] Update the versions of mongodb supported by mongodb-connector [flink]

2024-02-19 Thread via GitHub


Jiabao-Sun opened a new pull request, #24345:
URL: https://github.com/apache/flink/pull/24345

   
   
   ## What is the purpose of the change
   
   Update the versions of mongodb supported by mongodb-connector
   
   ## Brief change log
   
   Currently, we are using mongodb-driver version 4.7.2. 
   https://www.mongodb.com/docs/drivers/java/sync/current/compatibility/
   
   Mongo-driver version 4.7.2 theoretically supports MongoDB server 
versions below 3.6. 
   However, this has not undergone rigorous testing. 
   Additionally, mongo-driver version 4.8 and above no longer supports 
MongoDB versions below 3.6. 
   Therefore, we declare 3.6 as the minimum supported MongoDB server version.
   
   ## Verifying this change
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (docs)
   





[PR] [FLINK-33241][doc] Align config option generation documentation for Flink's config documentation. [flink]

2024-02-19 Thread via GitHub


JunRuiLee opened a new pull request, #24344:
URL: https://github.com/apache/flink/pull/24344

   
   
   
   ## What is the purpose of the change
   
   The configuration parameter docs generation is documented in two places in 
different ways:
   
[docs/README.md:62](https://github.com/apache/flink/blob/5c1e9f3b1449cb77276d578b344d9a69c7cf9a3c/docs/README.md#L62)
 and 
[flink-docs/README.md:44](https://github.com/apache/flink/blob/7bebd2d9fac517c28afc24c0c034d77cfe2b43a6/flink-docs/README.md#L44).
   
   We should remove the corresponding command from docs/README.md and refer to 
flink-docs/README.md for the documentation. That way, we only have to maintain 
a single file.
   
   
   ## Brief change log
   
   remove the corresponding command from docs/README.md and refer to 
flink-docs/README.md for the documentation
   
   ## Verifying this change
   
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   





[PR] [BP-1.19][hotfix][docs] Update the versions of mongodb supported by mongodb-connector [flink]

2024-02-19 Thread via GitHub


Jiabao-Sun opened a new pull request, #24342:
URL: https://github.com/apache/flink/pull/24342

   
   
   ## What is the purpose of the change
   
   Update the versions of mongodb supported by mongodb-connector
   
   ## Brief change log
   
   Currently, we are using mongodb-driver version 4.7.2. 
   https://www.mongodb.com/docs/drivers/java/sync/current/compatibility/
   
   Mongo-driver version 4.7.2 theoretically supports MongoDB server 
versions below 3.6. 
   However, this has not undergone rigorous testing. 
   Additionally, mongo-driver version 4.8 and above no longer supports 
MongoDB versions below 3.6. 
   Therefore, we declare 3.6 as the minimum supported MongoDB server version.
   
   ## Verifying this change
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (docs)
   





[PR] [BP-1.18][hotfix][docs] Update the versions of mongodb supported by mongodb-connector [flink]

2024-02-19 Thread via GitHub


Jiabao-Sun opened a new pull request, #24343:
URL: https://github.com/apache/flink/pull/24343

   
   
   ## What is the purpose of the change
   
   Update the versions of mongodb supported by mongodb-connector
   
   ## Brief change log
   
   Currently, we are using mongodb-driver version 4.7.2. 
   https://www.mongodb.com/docs/drivers/java/sync/current/compatibility/
   
   Mongo-driver version 4.7.2 theoretically supports MongoDB server 
versions below 3.6. 
   However, this has not undergone rigorous testing. 
   Additionally, mongo-driver version 4.8 and above no longer supports 
MongoDB versions below 3.6. 
   Therefore, we declare 3.6 as the minimum supported MongoDB server version.
   
   ## Verifying this change
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (docs)
   





[jira] [Updated] (FLINK-33241) Align config option generation documentation for Flink's config documentation

2024-02-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33241:
---
Labels: pull-request-available starter  (was: starter)

> Align config option generation documentation for Flink's config documentation
> -
>
> Key: FLINK-33241
> URL: https://issues.apache.org/jira/browse/FLINK-33241
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Assignee: Junrui Li
>Priority: Major
>  Labels: pull-request-available, starter
>
> The configuration parameter docs generation is documented in two places in 
> different ways:
> [docs/README.md:62|https://github.com/apache/flink/blob/5c1e9f3b1449cb77276d578b344d9a69c7cf9a3c/docs/README.md#L62]
>  and 
> [flink-docs/README.md:44|https://github.com/apache/flink/blob/7bebd2d9fac517c28afc24c0c034d77cfe2b43a6/flink-docs/README.md#L44].
> We should remove the corresponding command from {{docs/README.md}} and refer 
> to {{flink-docs/README.md}} for the documentation. That way, we only have to 
> maintain a single file.





Re: [PR] [BP-1.19][hotfix][docs] Update the versions of mongodb supported by mongodb-connector [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24342:
URL: https://github.com/apache/flink/pull/24342#issuecomment-1953436142

   
   ## CI report:
   
   * 144537475e2e0e05d0535d2934d0c7a935126507 UNKNOWN
   
   
   Bot commands:
   The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





Re: [PR] [hotfix][docs] Integrate mongodb v1.1 docs [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24341:
URL: https://github.com/apache/flink/pull/24341#issuecomment-1953435662

   
   ## CI report:
   
   * 5a4261423d548338d3ca8f1109bec13184ac346a UNKNOWN
   
   
   Bot commands:
   The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Resolved] (FLINK-34434) DefaultSlotStatusSyncer doesn't complete the returned future

2024-02-19 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo resolved FLINK-34434.

Resolution: Fixed

> DefaultSlotStatusSyncer doesn't complete the returned future
> 
>
> Key: FLINK-34434
> URL: https://issues.apache.org/jira/browse/FLINK-34434
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0, 1.18.2, 1.20.0
>
>
> When looking into FLINK-34427 (unrelated), I noticed an odd line in 
> [DefaultSlotStatusSyncer:155|https://github.com/apache/flink/blob/15fe1653acec45d7c7bac17071e9773a4aa690a4/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DefaultSlotStatusSyncer.java#L155]
>  where we complete a future that should be already completed (because the 
> callback is triggered after the {{requestFuture}} is already completed in 
> some way). Shouldn't we complete the {{returnedFuture}} instead?
> I'm keeping the priority at {{Major}} because it doesn't seem to have been an 
> issue in the past.





[jira] [Commented] (FLINK-34434) DefaultSlotStatusSyncer doesn't complete the returned future

2024-02-19 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818657#comment-17818657
 ] 

Yangze Guo commented on FLINK-34434:


1.19: 8cf29969d9aec4943713f0a6096b703718ce0dd0
45d4dc10248402757e203aa266b19c95e2e93b46

> DefaultSlotStatusSyncer doesn't complete the returned future
> 
>
> Key: FLINK-34434
> URL: https://issues.apache.org/jira/browse/FLINK-34434
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
>
> When looking into FLINK-34427 (unrelated), I noticed an odd line in 
> [DefaultSlotStatusSyncer:155|https://github.com/apache/flink/blob/15fe1653acec45d7c7bac17071e9773a4aa690a4/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DefaultSlotStatusSyncer.java#L155]
>  where we complete a future that should be already completed (because the 
> callback is triggered after the {{requestFuture}} is already completed in 
> some way). Shouldn't we complete the {{returnedFuture}} instead?
> I'm keeping the priority at {{Major}} because it doesn't seem to have been an 
> issue in the past.





[jira] [Updated] (FLINK-34434) DefaultSlotStatusSyncer doesn't complete the returned future

2024-02-19 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo updated FLINK-34434:
---
Fix Version/s: 1.19.0
   1.18.2
   1.20.0

> DefaultSlotStatusSyncer doesn't complete the returned future
> 
>
> Key: FLINK-34434
> URL: https://issues.apache.org/jira/browse/FLINK-34434
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0, 1.18.2, 1.20.0
>
>
> When looking into FLINK-34427 (unrelated), I noticed an odd line in 
> [DefaultSlotStatusSyncer:155|https://github.com/apache/flink/blob/15fe1653acec45d7c7bac17071e9773a4aa690a4/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DefaultSlotStatusSyncer.java#L155]
>  where we complete a future that should be already completed (because the 
> callback is triggered after the {{requestFuture}} is already completed in 
> some way). Shouldn't we complete the {{returnedFuture}} instead?
> I'm keeping the priority at {{Major}} because it doesn't seem to have been an 
> issue in the past.





Re: [PR] [CP-1.19][FLINK-34434][slotmanager] Complete the returnedFuture when slot remo… [flink]

2024-02-19 Thread via GitHub


KarmaGYZ merged PR #24326:
URL: https://github.com/apache/flink/pull/24326





[jira] [Commented] (FLINK-34348) Release Testing: Verify FLINK-20281 Window aggregation supports changelog stream input

2024-02-19 Thread Feng Jin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818666#comment-17818666
 ] 

Feng Jin commented on FLINK-34348:
--

cc [~lincoln.86xy]  The test has been completed as above.

> Release Testing: Verify FLINK-20281 Window aggregation supports changelog 
> stream input
> --
>
> Key: FLINK-34348
> URL: https://issues.apache.org/jira/browse/FLINK-34348
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: xuyang
>Assignee: Feng Jin
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
> Attachments: 截屏2024-02-07 16.21.37.png, 截屏2024-02-07 16.21.55.png, 
> 截屏2024-02-07 16.22.24.png, 截屏2024-02-07 16.23.12.png, 截屏2024-02-07 
> 16.23.27.png, 截屏2024-02-07 16.23.38.png, 截屏2024-02-07 16.29.09.png, 
> 截屏2024-02-07 16.29.21.png, 截屏2024-02-07 16.29.34.png, 截屏2024-02-07 
> 16.46.12.png, 截屏2024-02-07 16.46.23.png, 截屏2024-02-07 16.46.37.png, 
> 截屏2024-02-07 16.53.37.png, 截屏2024-02-07 16.53.47.png, 截屏2024-02-07 
> 16.54.01.png, 截屏2024-02-07 16.59.22.png, 截屏2024-02-07 16.59.33.png, 
> 截屏2024-02-07 16.59.42.png
>
>
> Window TVF aggregation support for changelog streams is ready for testing. 
> Users can add a window TVF aggregation downstream of a CDC source or of 
> nodes that produce CDC records.
> Someone can verify this feature with:
>  # Prepare a mysql table, and insert some data at first.
>  # Start sql-client and prepare ddl for this mysql table as a cdc source.
>  # You can verify the plan by `EXPLAIN PLAN_ADVICE` to check if there is a 
> window aggregate node and the changelog contains "UA" or "UB" or "D" in its 
> upstream. 
>  # Use different kinds of window tvf to test window tvf aggregation while 
> updating the source data to check the data correctness.





Re: [PR] [FLINK-34252][table] Fix lastRecordTime tracking in WatermarkAssignerOperator [flink]

2024-02-19 Thread via GitHub


1996fanrui commented on code in PR #24211:
URL: https://github.com/apache/flink/pull/24211#discussion_r1495341757


##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/wmassigners/WatermarkAssignerOperator.java:
##
@@ -100,11 +100,14 @@ public void open() throws Exception {
 
 @Override
 public void processElement(StreamRecord element) throws Exception 
{
-if (idleTimeout > 0 && currentStatus.equals(WatermarkStatus.IDLE)) {
-// mark the channel active
-emitWatermarkStatus(WatermarkStatus.ACTIVE);
+if (idleTimeout > 0) {
+if (currentStatus.equals(WatermarkStatus.IDLE)) {
+// mark the channel active
+emitWatermarkStatus(WatermarkStatus.ACTIVE);
+}
 lastRecordTime = 
getProcessingTimeService().getCurrentProcessingTime();
 }

Review Comment:
   Thanks @pnowojski for this valuable comment!
   
   > We can not call getProcessingTimeService().getCurrentProcessingTime() per 
every record, that's a too costly operation.
   
   Good catch, I didn't notice it before.
   
   > I think we could use a trick to count emitted records here, in the 
processElement. Then in WatermarkAssignerOperator#onProcessingTime you could 
periodically check if the processed elements count has changed and update 
lastRecordTime there if it did. That would loose us a little bit of accuracy, 
but not much. For example if processing timer is triggered ~5x more frequently 
than idleTimeout, the average accuracy lost would be only ~10%, which is 
negligible.
   
   The solution makes sense to me.
   
   > If both idleTimeout > 0 && watermarkInterval > 0, we might need to somehow 
handle two timers frequencies:
   > I guess we could register two different ProcessingTimeCallback with two 
different frequencies.
   > Or we can ignore the problem and if both idleTimeout > 0 && 
watermarkInterval > 0, we could just have a single timer with watermarkInterval 
latency. This option is probably simpler, and might be good enough as usually 
(almost always?) watermarkInterval << idleTimeout.
   
   In general, `watermarkInterval << idleTimeout`, but that isn't always the case. 
Users can set them arbitrarily; we cannot control how they are set. 
   
   A workaround is to register a single timer with `min(idleTimeout, 
watermarkInterval)` latency.
   
   
   > For `watermarkInterval == 0 && idleTimeout > 0`.
   
   We can register a timer with `idleTimeout` latency.
   
   WDYT?
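   A minimal sketch of the strategy discussed above, under the stated assumptions: a single timer with `min(idleTimeout, watermarkInterval)` latency, and a record counter checked periodically instead of reading the clock on every record. The class and method names are illustrative, not the actual `WatermarkAssignerOperator` code:

```java
// Illustrative sketch of the timer-frequency discussion above: pick one
// timer latency from the two configured intervals, and detect idleness by
// comparing a record counter between timer firings rather than calling
// getCurrentProcessingTime() per record. Not actual Flink code.
public class TimerIntervalSketch {
    static long timerInterval(long idleTimeout, long watermarkInterval) {
        if (idleTimeout > 0 && watermarkInterval > 0) {
            // workaround from the comment: a single timer with the smaller latency
            return Math.min(idleTimeout, watermarkInterval);
        }
        return idleTimeout > 0 ? idleTimeout : watermarkInterval;
    }

    // counter updated in processElement; checked periodically in onProcessingTime
    private long processedCount = 0;
    private long countAtLastCheck = 0;

    void processElement() {
        processedCount++; // cheap: no clock read on the per-record path
    }

    boolean sawRecordsSinceLastCheck() {
        boolean changed = processedCount != countAtLastCheck;
        countAtLastCheck = processedCount;
        return changed;
    }

    public static void main(String[] args) {
        System.out.println(timerInterval(30_000, 200)); // prints 200
    }
}
```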






[jira] [Commented] (FLINK-34274) AdaptiveSchedulerTest.testRequirementLowerBoundDecreaseAfterResourceScarcityBelowAvailableSlots times out

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818690#comment-17818690
 ] 

Matthias Pohl commented on FLINK-34274:
---

master (1.20): 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57627=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9762

> AdaptiveSchedulerTest.testRequirementLowerBoundDecreaseAfterResourceScarcityBelowAvailableSlots
>  times out
> -
>
> Key: FLINK-34274
> URL: https://issues.apache.org/jira/browse/FLINK-34274
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Assignee: David Morávek
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> {code:java}
> Jan 30 03:15:46 "ForkJoinPool-420-worker-25" #9746 daemon prio=5 os_prio=0 
> tid=0x7fdfbb635800 nid=0x2dbd waiting on condition [0x7fdf39528000]
> Jan 30 03:15:46    java.lang.Thread.State: WAITING (parking)
> Jan 30 03:15:46   at sun.misc.Unsafe.park(Native Method)
> Jan 30 03:15:46   - parking to wait for  <0xfe642548> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> Jan 30 03:15:46   at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> Jan 30 03:15:46   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> Jan 30 03:15:46   at 
> java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
> Jan 30 03:15:46   at 
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerTest$SubmissionBufferingTaskManagerGateway.waitForSubmissions(AdaptiveSchedulerTest.java:2225)
> Jan 30 03:15:46   at 
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerTest.awaitJobReachingParallelism(AdaptiveSchedulerTest.java:1333)
> Jan 30 03:15:46   at 
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveSchedulerTest.testRequirementLowerBoundDecreaseAfterResourceScarcityBelowAvailableSlots(AdaptiveSchedulerTest.java:1273)
> Jan 30 03:15:46   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [...] {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57086=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=9893





[jira] [Commented] (FLINK-30784) HiveTableSourceITCase.testPartitionFilter failed with assertion error

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818689#comment-17818689
 ] 

Matthias Pohl commented on FLINK-30784:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57626=logs=fc5181b0-e452-5c8f-68de-1097947f6483=995c650b-6573-581c-9ce6-7ad4cc038461=22553

> HiveTableSourceITCase.testPartitionFilter  failed with assertion error
> --
>
> Key: FLINK-30784
> URL: https://issues.apache.org/jira/browse/FLINK-30784
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.17.0, 1.18.1
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: auto-deprioritized-critical, test-stability
>
> We see a test failure in {{HiveTableSourceITCase.testPartitionFilter}}:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45184=logs=a5ef94ef-68c2-57fd-3794-dc108ed1c495=2c68b137-b01d-55c9-e603-3ff3f320364b=23909
> {code}
> Jan 25 01:14:55 [ERROR] 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.testPartitionFilter  
> Time elapsed: 2.212 s  <<< FAILURE!
> Jan 25 01:14:55 org.opentest4j.AssertionFailedError: 
> Jan 25 01:14:55 
> Jan 25 01:14:55 Expecting value to be false but was true
> Jan 25 01:14:55   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> Jan 25 01:14:55   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> Jan 25 01:14:55   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> Jan 25 01:14:55   at 
> org.apache.flink.connectors.hive.HiveTableSourceITCase.testPartitionFilter(HiveTableSourceITCase.java:314)
> Jan 25 01:14:55   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [...]
> {code}
> There's a similar test stability issue still open with FLINK-20975. The 
> stacktraces don't match. That's why I decided to open a new one.





Re: [PR] [CP-1.18][FLINK-34434][slotmanager] Complete the returnedFuture when slot remo… [flink]

2024-02-19 Thread via GitHub


KarmaGYZ merged PR #24327:
URL: https://github.com/apache/flink/pull/24327





[jira] [Updated] (FLINK-34452) Release Testing: Verify FLINK-15959 Add min number of slots configuration to limit total number of slots

2024-02-19 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34452:

Fix Version/s: 1.19.0

> Release Testing: Verify FLINK-15959 Add min number of slots configuration to 
> limit total number of slots
> 
>
> Key: FLINK-34452
> URL: https://issues.apache.org/jira/browse/FLINK-34452
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.19.0
>Reporter: xiangyu feng
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
>
> Test Suggestion:
>  # Prepare configuration options:
>  ** taskmanager.numberOfTaskSlots = 2,
>  ** slotmanager.number-of-slots.min = 7,
>  ** slot.idle.timeout = 5
>  # Setup a Flink session Cluster on Yarn or Native Kubernetes based on 
> following docs:
>  ** 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/yarn/#starting-a-flink-session-on-yarn]
>  ** 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#starting-a-flink-session-on-kubernetes]
>  # Verify that 4 TaskManagers will be registered even though no job has been 
> submitted
>  # Verify that these TaskManagers will not be destroyed after 50 seconds.
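The TaskManager count in step 3 follows from ceiling division of the minimum slot count by the slots per TaskManager; a hypothetical sketch of that arithmetic (illustrative only, not Flink code):

```java
// Sketch of the arithmetic behind step 3 above: with a minimum of 7 slots
// (slotmanager.number-of-slots.min) and 2 slots per TaskManager
// (taskmanager.numberOfTaskSlots), the cluster needs ceil(7 / 2) = 4
// TaskManagers even before any job is submitted.
public class MinSlotsSketch {
    static int requiredTaskManagers(int minSlots, int slotsPerTaskManager) {
        // integer ceiling division
        return (minSlots + slotsPerTaskManager - 1) / slotsPerTaskManager;
    }

    public static void main(String[] args) {
        System.out.println(requiredTaskManagers(7, 2)); // prints 4
    }
}
```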





[jira] [Commented] (FLINK-34305) Release Testing Instructions: Verify FLINK-33261 Support Setting Parallelism for Table/SQL Sources

2024-02-19 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818677#comment-17818677
 ] 

lincoln lee commented on FLINK-34305:
-

[~sudewei.sdw][~Zhanghao Chen] Can you help estimate when the docs will be 
ready?

> Release Testing Instructions: Verify FLINK-33261 Support Setting Parallelism 
> for Table/SQL Sources 
> ---
>
> Key: FLINK-34305
> URL: https://issues.apache.org/jira/browse/FLINK-34305
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: lincoln lee
>Assignee: SuDewei
>Priority: Blocker
> Fix For: 1.19.0
>
>






[jira] [Commented] (FLINK-34336) AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may hang sometimes

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818684#comment-17818684
 ] 

Matthias Pohl commented on FLINK-34336:
---

Thanks [~fanrui]. Can you also create a 1.19 backport?

And just as a hint: The fix version would be only 1.19.0 up to the point where 
the 1.19.0 release actually happened (even if the change also ended up in 
{{{}master{}}}). Any change that is backported to 1.19 right now is still 
considered a 1.19.0 (and not a 1.20.0 fix).

> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang sometimes
> -
>
> Key: FLINK-34336
> URL: https://issues.apache.org/jira/browse/FLINK-34336
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.19.0, 1.20.0
>
>
> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang in 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);
> h2. Reason:
> The job has 2 tasks (vertices) after calling updateJobResourceRequirements. 
> The source parallelism isn't changed (it stays at parallelism), and the 
> FlatMapper+Sink is changed from parallelism to parallelism2.
> So we expect the number of running tasks to be parallelism + parallelism2 
> instead of parallelism2.
>  
> h2. Why can it pass for now?
> Flink 1.19 supports the scaling cooldown, and the cooldown time is 30s by 
> default. It means the Flink job will rescale 30 seconds after 
> updateJobResourceRequirements is called.
>  
> So the running tasks still use the old parallelism when we call 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);.
> IIUC, this cannot be guaranteed, and it's unexpected.
>  
> h2. How to reproduce this bug?
> [https://github.com/1996fanrui/flink/commit/ffd713e24d37db2c103e4cd4361d0cd916d0d2f6]
>  * Disable the cooldown
>  * Sleep for a while before waitForRunningTasks
> If so, the job is already running at the new parallelism, so 
> `waitForRunningTasks` will hang forever.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-29114) TableSourceITCase#testTableHintWithLogicalTableScanReuse sometimes fails with result mismatch

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818691#comment-17818691
 ] 

Matthias Pohl edited comment on FLINK-29114 at 2/20/24 7:56 AM:


Much appreciated!
 * 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57642=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11539]
 * 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57647=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=11599]
 * 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57647=logs=32715a4c-21b8-59a3-4171-744e5ab107eb=ff64056b-5320-5afe-c22c-6fa339e59586=11508]


was (Author: mapohl):
Much appreciated!

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57642=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11539

> TableSourceITCase#testTableHintWithLogicalTableScanReuse sometimes fails with 
> result mismatch 
> --
>
> Key: FLINK-29114
> URL: https://issues.apache.org/jira/browse/FLINK-29114
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner, Tests
>Affects Versions: 1.15.0, 1.19.0, 1.20.0
>Reporter: Sergey Nuyanzin
>Priority: Major
>  Labels: auto-deprioritized-major, test-stability
> Attachments: FLINK-29114.log
>
>
> It could be reproduced locally by repeating tests. Usually about 100 
> iterations are enough to have several failed tests
> {noformat}
> [ERROR] Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 1.664 s <<< FAILURE! - in 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase
> [ERROR] 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase.testTableHintWithLogicalTableScanReuse
>   Time elapsed: 0.108 s  <<< FAILURE!
> java.lang.AssertionError: expected: 3,2,Hello world, 3,2,Hello world, 3,2,Hello world)> but was: 2,2,Hello, 2,2,Hello, 3,2,Hello world, 3,2,Hello world)>
>     at org.junit.Assert.fail(Assert.java:89)
>     at org.junit.Assert.failNotEquals(Assert.java:835)
>     at org.junit.Assert.assertEquals(Assert.java:120)
>     at org.junit.Assert.assertEquals(Assert.java:146)
>     at 
> org.apache.flink.table.planner.runtime.batch.sql.TableSourceITCase.testTableHintWithLogicalTableScanReuse(TableSourceITCase.scala:428)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>     at 
> 

[jira] [Updated] (FLINK-34364) Fix release utils mount point to match the release doc and scripts

2024-02-19 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-34364:
---
Fix Version/s: connector-parent-1.1.0

> Fix release utils mount point to match the release doc and scripts
> --
>
> Key: FLINK-34364
> URL: https://issues.apache.org/jira/browse/FLINK-34364
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Parent, Release System
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Labels: pull-request-available
> Fix For: connector-parent-1.1.0
>
>
> parent_pom branch refers to an incorrect mount point tools/*release*/shared 
> instead of tools/*releasing*/shared for the release_utils. 
> _tools/releasing_/shared is the one used in the release scripts and in the 
> release docs



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34434) DefaultSlotStatusSyncer doesn't complete the returned future

2024-02-19 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818613#comment-17818613
 ] 

Yangze Guo commented on FLINK-34434:


master: 15af3e49ca42fea1e3f6c53d6d1498315b1322ac
1a494bc1f04571b8b8248d7c2c1af364222a0c61
1.18: e95cb6e73900fbbc2039407be1bd87271b2a950b
21cfe998f4fb21afe24ceb8b6f4fef180e89b9e9

> DefaultSlotStatusSyncer doesn't complete the returned future
> 
>
> Key: FLINK-34434
> URL: https://issues.apache.org/jira/browse/FLINK-34434
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
>
> When looking into FLINK-34427 (unrelated), I noticed an odd line in 
> [DefaultSlotStatusSyncer:155|https://github.com/apache/flink/blob/15fe1653acec45d7c7bac17071e9773a4aa690a4/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DefaultSlotStatusSyncer.java#L155]
>  where we complete a future that should be already completed (because the 
> callback is triggered after the {{requestFuture}} is already completed in 
> some way. Shouldn't we complete the {{returnedFuture}} instead?
> I'm keeping the priority at {{Major}} because it doesn't seem to have been an 
> issue in the past.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34336) AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may hang sometimes

2024-02-19 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818614#comment-17818614
 ] 

Rui Fan commented on FLINK-34336:
-

Merged to

master(1.20) via: e2e3de2d48e3f02b746bdbdcb4da7b0477986a11

> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang sometimes
> -
>
> Key: FLINK-34336
> URL: https://issues.apache.org/jira/browse/FLINK-34336
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.19.0
>
>
> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang in 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);
> h2. Reason:
> The job has 2 tasks (vertices) after calling updateJobResourceRequirements. 
> The source parallelism isn't changed (it stays at parallelism), and the 
> FlatMapper+Sink is changed from parallelism to parallelism2.
> So we expect the number of running tasks to be parallelism + parallelism2 
> instead of parallelism2.
>  
> h2. Why can it pass for now?
> Flink 1.19 supports the scaling cooldown, and the cooldown time is 30s by 
> default. It means the Flink job will rescale 30 seconds after 
> updateJobResourceRequirements is called.
>  
> So the running tasks still use the old parallelism when we call 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);.
> IIUC, this cannot be guaranteed, and it's unexpected.
>  
> h2. How to reproduce this bug?
> [https://github.com/1996fanrui/flink/commit/ffd713e24d37db2c103e4cd4361d0cd916d0d2f6]
>  * Disable the cooldown
>  * Sleep for a while before waitForRunningTasks
> If so, the job is already running at the new parallelism, so 
> `waitForRunningTasks` will hang forever.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34336) AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may hang sometimes

2024-02-19 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-34336:

Fix Version/s: 1.20.0

> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang sometimes
> -
>
> Key: FLINK-34336
> URL: https://issues.apache.org/jira/browse/FLINK-34336
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.19.0, 1.20.0
>
>
> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang in 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);
> h2. Reason:
> The job has 2 tasks (vertices) after calling updateJobResourceRequirements. 
> The source parallelism isn't changed (it stays at parallelism), and the 
> FlatMapper+Sink is changed from parallelism to parallelism2.
> So we expect the number of running tasks to be parallelism + parallelism2 
> instead of parallelism2.
>  
> h2. Why can it pass for now?
> Flink 1.19 supports the scaling cooldown, and the cooldown time is 30s by 
> default. It means the Flink job will rescale 30 seconds after 
> updateJobResourceRequirements is called.
>  
> So the running tasks still use the old parallelism when we call 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);.
> IIUC, this cannot be guaranteed, and it's unexpected.
>  
> h2. How to reproduce this bug?
> [https://github.com/1996fanrui/flink/commit/ffd713e24d37db2c103e4cd4361d0cd916d0d2f6]
>  * Disable the cooldown
>  * Sleep for a while before waitForRunningTasks
> If so, the job is already running at the new parallelism, so 
> `waitForRunningTasks` will hang forever.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34386][state] Add RocksDB bloom filter metrics [flink]

2024-02-19 Thread via GitHub


hejufang commented on PR #24274:
URL: https://github.com/apache/flink/pull/24274#issuecomment-1953405068

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33182][table] Allow metadata columns in Ndu-analyze with ChangelogNormalize [flink]

2024-02-19 Thread via GitHub


lincoln-lil commented on PR #24121:
URL: https://github.com/apache/flink/pull/24121#issuecomment-1953417819

   @twalthr Thank you for reviewing this! Yes, some work is needed to make it 
more readable: the original Scala version used the same case-branch style as 
FlinkChangelogModeInferenceProgram, but the specific NDU logic was too 
detailed, and the if-else nesting became even deeper in the Java rewrite. 
Will find some time to do the refactoring.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34406) Expose RuntimeContext in FunctionContext

2024-02-19 Thread yisha zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818616#comment-17818616
 ] 

yisha zhou commented on FLINK-34406:


Hi [~twalthr], what do you think about this proposal? I found that most of the 
RuntimeContext functionalities in FunctionContext were introduced by you in 
https://issues.apache.org/jira/browse/FLINK-22857. 

> Expose RuntimeContext in FunctionContext
> 
>
> Key: FLINK-34406
> URL: https://issues.apache.org/jira/browse/FLINK-34406
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: yisha zhou
>Priority: Major
>
> When I implement a LookupFunction and utilize a RateLimiter in it, I need to 
> open the RateLimiter in the open function. And I can only get a 
> FunctionContext in the function. 
> However the RateLimiter needs to call getNumberOfParallelSubtasks of 
> RuntimeContext to get the parallelism of the job, so that it can calculate 
> the flow limitation for each subtask.
> Actually, getMetricGroup, getUserCodeClassLoader and so many FunctionContext 
> functionalities all come from RuntimeContext. Why not just expose the 
> RuntimeContext here?
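To illustrate the use case described above, here is a minimal sketch of how a per-subtask rate limit would be derived if the parallelism were available in open(). This is hypothetical illustration code, not a Flink API; the class and method names are invented:

```java
// Hypothetical sketch (not Flink API): deriving a per-subtask rate limit,
// assuming the job parallelism is obtainable from the runtime context.
final class RateShare {
    static double perSubtaskRate(double globalRatePerSecond, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        // Each parallel subtask gets an equal share of the global budget.
        return globalRatePerSecond / parallelism;
    }
}
```

In the scenario above, the parallelism argument would come from something like getNumberOfParallelSubtasks(), which is exactly what FunctionContext currently does not expose.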



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33728] Do not rewatch when KubernetesResourceManagerDriver wat… [flink]

2024-02-19 Thread via GitHub


zhougit86 commented on code in PR #24163:
URL: https://github.com/apache/flink/pull/24163#discussion_r1495227606


##
flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesResourceManagerDriver.java:
##
@@ -114,7 +116,8 @@ public KubernetesResourceManagerDriver(
 
 @Override
 protected void initializeInternal() throws Exception {
-podsWatchOpt = watchTaskManagerPods();
+podsWatchOptFuture = watchTaskManagerPods();
+podsWatchOptFuture.get();

Review Comment:
   This is because watchTaskManagerPods was previously a synchronous method; I 
just want to keep the same behavior as before.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34399) Release Testing: Verify FLINK-33644 Make QueryOperations SQL serializable

2024-02-19 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-34399:

Component/s: Table SQL / API

> Release Testing: Verify FLINK-33644 Make QueryOperations SQL serializable
> -
>
> Key: FLINK-34399
> URL: https://issues.apache.org/jira/browse/FLINK-34399
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Dawid Wysakowicz
>Assignee: Ferenc Csaky
>Priority: Major
>  Labels: release-testing
>
> Test suggestions:
> 1. Write a few Table API programs.
> 2. Call Table.getQueryOperation#asSerializableString, manually verify the 
> produced SQL query
> 3. Check the produced SQL query is runnable and produces the same results as 
> the Table API program:
> {code}
> Table table = tEnv.from("a") ...
> String sqlQuery = table.getQueryOperation().asSerializableString();
> //verify the sqlQuery is runnable
> tEnv.sqlQuery(sqlQuery).execute().collect();
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34336][test] Fix the bug that AutoRescalingITCase may hang sometimes [flink]

2024-02-19 Thread via GitHub


XComp commented on code in PR #24340:
URL: https://github.com/apache/flink/pull/24340#discussion_r1495372884


##
flink-tests/src/test/java/org/apache/flink/test/checkpointing/AutoRescalingITCase.java:
##
@@ -163,6 +163,8 @@ public void setup() throws Exception {
 
NettyShuffleEnvironmentOptions.NETWORK_BUFFERS_PER_CHANNEL, buffersPerChannel);
 
 config.set(JobManagerOptions.SCHEDULER, JobManagerOptions.SchedulerType.Adaptive);
+// Disable the scaling cooldown to speed up the test
+config.set(JobManagerOptions.SCHEDULER_SCALING_INTERVAL_MIN, Duration.ofMillis(0));

Review Comment:
   Why does it increase the hang possibility? Why do we merge it then now? And 
why did we merge it into `master`? :thinking: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34202) python tests take suspiciously long in some of the cases

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818374#comment-17818374
 ] 

Matthias Pohl commented on FLINK-34202:
---

Thanks for looking into it, everyone. We use random feature selection in other 
places as well (e.g. with [buffer 
debloating|https://github.com/apache/flink/blob/c8f27c25e8726360bd09fd21fa8e908c40376881/flink-runtime/src/test/java/org/apache/flink/runtime/testutils/MiniClusterResource.java#L256]).
 So, generally, I'm not against it but leave the judgement to you who are more 
familiar with the Python module. On the other hand, [~hxb] concludes that 
there's something wrong with the Alibaba001 VM. Therefore, we might want to 
check that VM and fix the actual issue there rather than working around it in 
the Python tests. The actual cause might lead to problems in other tests as well. 
WDYT?

> python tests take suspiciously long in some of the cases
> 
>
> Key: FLINK-34202
> URL: https://issues.apache.org/jira/browse/FLINK-34202
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.2, 1.19.0, 1.18.1
>Reporter: Matthias Pohl
>Assignee: Xingbo Huang
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> [This release-1.18 
> build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603=logs=3e4dd1a2-fe2f-5e5d-a581-48087e718d53=b4612f28-e3b5-5853-8a8b-610ae894217a]
>  has the python stage running into a timeout without any obvious reason. The 
> [python stage run for 
> JDK17|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603=logs=b53e1644-5cb4-5a3b-5d48-f523f39bcf06]
>  was also getting close to the 4h timeout.
> I'm creating this issue for documentation purposes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34252][table] Fix lastRecordTime tracking in WatermarkAssignerOperator [flink]

2024-02-19 Thread via GitHub


pnowojski commented on code in PR #24211:
URL: https://github.com/apache/flink/pull/24211#discussion_r1494192141


##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/wmassigners/WatermarkAssignerOperator.java:
##
@@ -100,11 +100,14 @@ public void open() throws Exception {
 
 @Override
 public void processElement(StreamRecord element) throws Exception {
-if (idleTimeout > 0 && currentStatus.equals(WatermarkStatus.IDLE)) {
-// mark the channel active
-emitWatermarkStatus(WatermarkStatus.ACTIVE);
+if (idleTimeout > 0) {
+if (currentStatus.equals(WatermarkStatus.IDLE)) {
+// mark the channel active
+emitWatermarkStatus(WatermarkStatus.ACTIVE);
+}
 lastRecordTime = getProcessingTimeService().getCurrentProcessingTime();
 }

Review Comment:
   We can not call `getProcessingTimeService().getCurrentProcessingTime()` per 
every record, that's a too costly operation.
   
   I think we could use a trick to count emitted records here, in the 
`processElement`. Then in `WatermarkAssignerOperator#onProcessingTime` you 
could periodically check if the processed elements count has changed and update 
`lastRecordTime` there if it did. That would loose us a little bit of accuracy, 
but not much. For example if processing timer is triggered ~5x more frequently 
than `idleTimeout`, the average accuracy lost would be only ~10%, which is 
negligible.
   
   A couple of extra complications that I see are:
   - This operator registers a processing time timer only if periodic 
watermarks are enabled (`watermarkInterval > 0`). The code would have to be 
adapted to register and fire timers also if `watermarkInterval == 0 && 
idleTimeout > 0`.
   - If both `idleTimeout > 0 && watermarkInterval > 0`, we might need to 
somehow handle two timers frequencies:
 - I guess we could register two different `ProcessingTimeCallback` with 
two different frequencies.
 - Or we can ignore the problem and if both `idleTimeout > 0 && 
watermarkInterval > 0`, we could just have a single timer with 
`watermarkInterval` latency. This option is probably simpler, and might be good 
enough as usually (almost always?) `watermarkInterval << idleTimeout`.
   
   Or have I missed something?
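   The record-counting trick sketched above could look roughly like this. This is a minimal, self-contained illustration of the idea, not the actual Flink operator code; the class and method names are hypothetical:
   
```java
// Hypothetical sketch: count records in processElement and let the periodic
// processing-time timer detect whether the count changed, instead of reading
// the clock on every record. Names are illustrative only.
class IdlenessTracker {
    private long processedCount = 0;   // bumped per record, no clock access
    private long lastSeenCount = -1;   // count observed at the last timer firing
    private long lastRecordTime;       // only updated from the timer callback

    IdlenessTracker(long nowMillis) {
        this.lastRecordTime = nowMillis;
    }

    /** Called from processElement: just a counter increment, which is cheap. */
    void onElement() {
        processedCount++;
    }

    /** Called from the periodic onProcessingTime callback. */
    boolean isIdle(long nowMillis, long idleTimeoutMillis) {
        if (processedCount != lastSeenCount) {
            // Records arrived since the last timer; treat "now" as the last
            // record time, losing at most one timer interval of accuracy.
            lastSeenCount = processedCount;
            lastRecordTime = nowMillis;
        }
        return nowMillis - lastRecordTime >= idleTimeoutMillis;
    }
}
```
   
   With a timer firing roughly 5x per idleTimeout, the tracked lastRecordTime lags the true last record by at most one timer interval, matching the ~10% accuracy loss estimated above.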



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34459) Results column names should match SELECT clause expression names

2024-02-19 Thread Lorenzo Affetti (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818385#comment-17818385
 ] 

Lorenzo Affetti commented on FLINK-34459:
-

[~martijnvisser] well, if the name is a long one it can be truncated 
automatically via a simple check.

I like the proposal as it would increase the quality of result understanding (y)

> Results column names should match SELECT clause expression names
> 
>
> Key: FLINK-34459
> URL: https://issues.apache.org/jira/browse/FLINK-34459
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.18.1
>Reporter: Jeyhun Karimov
>Priority: Minor
>
> When printing {{SQL SELECT}} results, Flink will output generated expression 
> name when the expression is not {{column reference or used with alias/over.}}
> For example, select a, a + 1 from T would result in 
> {code:java}
> ++-+-+
> | op |   a |  EXPR$1 |
> ++-+-+
> | +I |   1 |   2 |
> | +I |   1 |   2 |
> | +I |   1 |   2 |
> ++-+-+
> {code}
> Instead of the generated {{EXPR$1}} it would be nice to have {{a + 1}} (which 
> is the case in some other data processing systems like Spark).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-22765][test] Fixes test instability in ExceptionUtilsITCase#testIsMetaspaceOutOfMemoryError [flink]

2024-02-19 Thread via GitHub


XComp commented on code in PR #24315:
URL: https://github.com/apache/flink/pull/24315#discussion_r1494214115


##
flink-core/src/test/java/org/apache/flink/testutils/ClassLoaderUtils.java:
##
@@ -55,7 +58,7 @@ private static URLClassLoader createClassLoader(File root, ClassLoader parent)
 return new URLClassLoader(new URL[] {root.toURI().toURL()}, parent);
 }
 
-private static void writeAndCompile(File root, String filename, String source)
+public static void writeAndCompile(File root, String filename, String source)

Review Comment:
   > do we need to have it public?
   
   Good catch. That is the leftover from my initial approach which I missed to 
roll back.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-22765][test] Fixes test instability in ExceptionUtilsITCase#testIsMetaspaceOutOfMemoryError [flink]

2024-02-19 Thread via GitHub


XComp commented on code in PR #24315:
URL: https://github.com/apache/flink/pull/24315#discussion_r1494214673


##
flink-core/src/test/java/org/apache/flink/testutils/ClassLoaderUtils.java:
##
@@ -158,7 +170,7 @@ public ClassLoaderBuilder withParentClassLoader(ClassLoader classLoader) {
 return this;
 }
 
-public URLClassLoader build() throws IOException {
+public ClassLoaderBuilder compile() throws IOException {

Review Comment:
   Good point. I went for `generateSourcesAndCompile` :+1: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-22765][test] Fixes test instability in ExceptionUtilsITCase#testIsMetaspaceOutOfMemoryError [flink]

2024-02-19 Thread via GitHub


XComp commented on PR #24315:
URL: https://github.com/apache/flink/pull/24315#issuecomment-1951990498

   > I guess if we migrate from junit4 to junit5 then we don't need to depend 
on TestLogger which contains JUnit4 rules, and instead need to use 
@ExtendWith(TestLoggerExtension.class)
   > Also it would make sense to harden modifiers for tests since JUnit5 
doesn't require public
   
   Yikes, that I missed in the end. Thanks for noticing and sorry for being 
careless here. :innocent: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34459) Results column names should match SELECT clause expression names

2024-02-19 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818393#comment-17818393
 ] 

Martijn Visser commented on FLINK-34459:


I'm still not convinced this is an actual user improvement: why is this better 
than adding an "AS x" in your SQL statement? Is this really a user problem?

How would you determine where to truncate? Is it a static value, or do you base 
it on the available width of the returned table? Do you return the full result 
from the planner and make it a client-side option only, or do you want to apply 
this everywhere? Is this even SQL standard compliant?


> Results column names should match SELECT clause expression names
> 
>
> Key: FLINK-34459
> URL: https://issues.apache.org/jira/browse/FLINK-34459
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.18.1
>Reporter: Jeyhun Karimov
>Priority: Minor
>
> When printing {{SQL SELECT}} results, Flink will output generated expression 
> name when the expression is not {{column reference or used with alias/over.}}
> For example, select a, a + 1 from T would result in 
> {code:java}
> ++-+-+
> | op |   a |  EXPR$1 |
> ++-+-+
> | +I |   1 |   2 |
> | +I |   1 |   2 |
> | +I |   1 |   2 |
> ++-+-+
> {code}
> Instead of the generated {{EXPR$1}} it would be nice to have {{a + 1}} (which 
> is the case in some other data processing systems like Spark).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34463) Open catalog in CatalogManager should use proper context classloader

2024-02-19 Thread jrthe42 (Jira)
jrthe42 created FLINK-34463:
---

 Summary: Open catalog in CatalogManager should use proper context 
classloader
 Key: FLINK-34463
 URL: https://issues.apache.org/jira/browse/FLINK-34463
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.18.1, 1.17.2
Reporter: jrthe42


When we try to create a catalog in CatalogManager, if the catalog jar is added 
using `ADD JAR` and the catalog itself relies on the SPI mechanism, the 
operation may fail.
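
A common fix for this class of problem is to temporarily swap the thread context classloader around the catalog open call, so that SPI lookups (ServiceLoader) resolve against the user classloader that contains the `ADD JAR` classes. Below is a minimal, generic sketch of that pattern; the helper class is hypothetical (Flink ships a similar try-with-resources utility, TemporaryClassLoaderContext):

```java
import java.util.function.Supplier;

// Hypothetical helper: run an action with a given context classloader and
// always restore the previous one, even if the action throws.
final class ContextClassLoaderUtil {
    static <T> T withContextClassLoader(ClassLoader cl, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(cl);
        try {
            return action.get();  // e.g. open the catalog, triggering SPI lookups
        } finally {
            current.setContextClassLoader(previous);  // restore unconditionally
        }
    }
}
```

Restoring the previous loader in the finally block matters: catalog open can fail, and leaking a swapped context classloader would affect unrelated code running on the same thread.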



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34463] Open catalog in CatalogManager should use proper context classloader [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24328:
URL: https://github.com/apache/flink/pull/24328#issuecomment-1952032998

   
   ## CI report:
   
   * 0867037b4fde47c69379dfd8b650e91393b9f249 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34264[plugins] add pushgateway http basic auth [flink]

2024-02-19 Thread via GitHub


flinkbot commented on PR #24329:
URL: https://github.com/apache/flink/pull/24329#issuecomment-1952034920

   
   ## CI report:
   
   * b95032b5dbcacfb7be570adec95bc9fc43f022fb UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34418) Disk space issues for Docker-ized GitHub Action jobs

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818411#comment-17818411
 ] 

Matthias Pohl commented on FLINK-34418:
---

https://github.com/apache/flink/actions/runs/7945888201/job/21693426141

> Disk space issues for Docker-ized GitHub Action jobs
> 
>
> Key: FLINK-34418
> URL: https://issues.apache.org/jira/browse/FLINK-34418
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: github-actions, pull-request-available, test-stability
>
> [https://github.com/apache/flink/actions/runs/7838691874/job/21390739806#step:10:27746]
> {code:java}
> [...]
> Feb 09 03:00:13 Caused by: java.io.IOException: No space left on device
> 27608Feb 09 03:00:13  at java.io.FileOutputStream.writeBytes(Native Method)
> 27609Feb 09 03:00:13  at 
> java.io.FileOutputStream.write(FileOutputStream.java:326)
> 27610Feb 09 03:00:13  at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:250)
> 27611Feb 09 03:00:13  ... 39 more
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26644) python StreamExecutionEnvironmentTests.test_generate_stream_graph_with_dependencies failed on azure

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818410#comment-17818410
 ] 

Matthias Pohl commented on FLINK-26644:
---

https://github.com/apache/flink/actions/runs/7945888201/job/21693425916#step:10:24316

> python 
> StreamExecutionEnvironmentTests.test_generate_stream_graph_with_dependencies 
> failed on azure
> ---
>
> Key: FLINK-26644
> URL: https://issues.apache.org/jira/browse/FLINK-26644
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.14.4, 1.15.0, 1.16.0, 1.19.0
>Reporter: Yun Gao
>Priority: Minor
>  Labels: auto-deprioritized-major, test-stability
>
> {code:java}
> 2022-03-14T18:50:24.6842853Z Mar 14 18:50:24 
> === FAILURES 
> ===
> 2022-03-14T18:50:24.6844089Z Mar 14 18:50:24 _ 
> StreamExecutionEnvironmentTests.test_generate_stream_graph_with_dependencies _
> 2022-03-14T18:50:24.6844846Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6846063Z Mar 14 18:50:24 self = 
>   testMethod=test_generate_stream_graph_with_dependencies>
> 2022-03-14T18:50:24.6847104Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6847766Z Mar 14 18:50:24 def 
> test_generate_stream_graph_with_dependencies(self):
> 2022-03-14T18:50:24.6848677Z Mar 14 18:50:24 python_file_dir = 
> os.path.join(self.tempdir, "python_file_dir_" + str(uuid.uuid4()))
> 2022-03-14T18:50:24.6849833Z Mar 14 18:50:24 os.mkdir(python_file_dir)
> 2022-03-14T18:50:24.6850729Z Mar 14 18:50:24 python_file_path = 
> os.path.join(python_file_dir, "test_stream_dependency_manage_lib.py")
> 2022-03-14T18:50:24.6852679Z Mar 14 18:50:24 with 
> open(python_file_path, 'w') as f:
> 2022-03-14T18:50:24.6853646Z Mar 14 18:50:24 f.write("def 
> add_two(a):\nreturn a + 2")
> 2022-03-14T18:50:24.6854394Z Mar 14 18:50:24 env = self.env
> 2022-03-14T18:50:24.6855019Z Mar 14 18:50:24 
> env.add_python_file(python_file_path)
> 2022-03-14T18:50:24.6855519Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6856254Z Mar 14 18:50:24 def plus_two_map(value):
> 2022-03-14T18:50:24.6857045Z Mar 14 18:50:24 from 
> test_stream_dependency_manage_lib import add_two
> 2022-03-14T18:50:24.6857865Z Mar 14 18:50:24 return value[0], 
> add_two(value[1])
> 2022-03-14T18:50:24.6858466Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6858924Z Mar 14 18:50:24 def add_from_file(i):
> 2022-03-14T18:50:24.6859806Z Mar 14 18:50:24 with 
> open("data/data.txt", 'r') as f:
> 2022-03-14T18:50:24.6860266Z Mar 14 18:50:24 return i[0], 
> i[1] + int(f.read())
> 2022-03-14T18:50:24.6860879Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6862022Z Mar 14 18:50:24 from_collection_source = 
> env.from_collection([('a', 0), ('b', 0), ('c', 1), ('d', 1),
> 2022-03-14T18:50:24.6863259Z Mar 14 18:50:24  
>  ('e', 2)],
> 2022-03-14T18:50:24.6864057Z Mar 14 18:50:24  
> type_info=Types.ROW([Types.STRING(),
> 2022-03-14T18:50:24.6864651Z Mar 14 18:50:24  
>  Types.INT()]))
> 2022-03-14T18:50:24.6865150Z Mar 14 18:50:24 
> from_collection_source.name("From Collection")
> 2022-03-14T18:50:24.6866212Z Mar 14 18:50:24 keyed_stream = 
> from_collection_source.key_by(lambda x: x[1], key_type=Types.INT())
> 2022-03-14T18:50:24.6867083Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6867793Z Mar 14 18:50:24 plus_two_map_stream = 
> keyed_stream.map(plus_two_map).name("Plus Two Map").set_parallelism(3)
> 2022-03-14T18:50:24.6868620Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6869412Z Mar 14 18:50:24 add_from_file_map = 
> plus_two_map_stream.map(add_from_file).name("Add From File Map")
> 2022-03-14T18:50:24.6870239Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6870883Z Mar 14 18:50:24 test_stream_sink = 
> add_from_file_map.add_sink(self.test_sink).name("Test Sink")
> 2022-03-14T18:50:24.6871803Z Mar 14 18:50:24 
> test_stream_sink.set_parallelism(4)
> 2022-03-14T18:50:24.6872291Z Mar 14 18:50:24 
> 2022-03-14T18:50:24.6872756Z Mar 14 18:50:24 archive_dir_path = 
> os.path.join(self.tempdir, "archive_" + str(uuid.uuid4()))
> 2022-03-14T18:50:24.6873557Z Mar 14 18:50:24 
> os.mkdir(archive_dir_path)
> 2022-03-14T18:50:24.6874817Z Mar 14 18:50:24 with 
> open(os.path.join(archive_dir_path, "data.txt"), 'w') as f:
> 2022-03-14T18:50:24.6875414Z Mar 14 18:50:24 f.write("3")
> 2022-03-14T18:50:24.6875906Z Mar 14 18:50:24 archive_file_path = \

[jira] [Commented] (FLINK-34418) Disk space issues for Docker-ized GitHub Action jobs

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818409#comment-17818409
 ] 

Matthias Pohl commented on FLINK-34418:
---

https://github.com/apache/flink/actions/runs/7945888061/job/21693403355#step:10:28837

> Disk space issues for Docker-ized GitHub Action jobs
> 
>
> Key: FLINK-34418
> URL: https://issues.apache.org/jira/browse/FLINK-34418
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: github-actions, pull-request-available, test-stability
>
> [https://github.com/apache/flink/actions/runs/7838691874/job/21390739806#step:10:27746]
> {code:java}
> [...]
> Feb 09 03:00:13 Caused by: java.io.IOException: No space left on device
> 27608Feb 09 03:00:13  at java.io.FileOutputStream.writeBytes(Native Method)
> 27609Feb 09 03:00:13  at 
> java.io.FileOutputStream.write(FileOutputStream.java:326)
> 27610Feb 09 03:00:13  at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:250)
> 27611Feb 09 03:00:13  ... 39 more
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34336) AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may hang sometimes

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818429#comment-17818429
 ] 

Matthias Pohl commented on FLINK-34336:
---

Yes, I postponed looking into the PR. I will do it today. (y)

> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang sometimes
> -
>
> Key: FLINK-34336
> URL: https://issues.apache.org/jira/browse/FLINK-34336
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.19.0
>
>
> AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
> may hang in waitForRunningTasks(restClusterClient, jobID, parallelism2);
> h2. Reason:
> The job has 2 tasks (vertices) after calling updateJobResourceRequirements. 
> The source parallelism isn't changed (it stays at parallelism), while the 
> FlatMapper+Sink is changed from parallelism to parallelism2.
> So we expect the task count to be parallelism + parallelism2 instead of 
> parallelism2.
>  
> h2. Why it can be passed for now?
> Flink 1.19 supports the scaling cooldown, and the cooldown time is 30s by 
> default. It means, flink job will rescale job 30 seconds after 
> updateJobResourceRequirements is called.
>  
> So the running tasks still have the old parallelism when we call 
> waitForRunningTasks(restClusterClient, jobID, parallelism2);.
> IIUC, this cannot be guaranteed, and it's unexpected.
>  
> h2. How to reproduce this bug?
> [https://github.com/1996fanrui/flink/commit/ffd713e24d37db2c103e4cd4361d0cd916d0d2f6]
>  * Disable the cooldown
>  * Sleep for a while before waitForRunningTasks
> If so, the job is running with the new parallelism, so `waitForRunningTasks` 
> will hang forever.
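The expected-task arithmetic above can be made explicit with a tiny sketch (illustrative only; the names mirror the test variables, not Flink internals):

```java
// After updateJobResourceRequirements, the source keeps its old parallelism
// while FlatMapper+Sink rescales to parallelism2. The test should therefore
// wait for the sum of both, not for parallelism2 alone.
public final class ExpectedTasks {
    public static int expectedRunningTasks(int sourceParallelism, int rescaledParallelism) {
        return sourceParallelism + rescaledParallelism;
    }
}
```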



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-21672) End to end tests (streaming) aren't Java vendor neutral (sun.management bean used)

2024-02-19 Thread david radley (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818182#comment-17818182
 ] 

david radley edited comment on FLINK-21672 at 2/19/24 9:51 AM:
---

Hi [~martijnvisser] 

Here is a list of sun. classes I have found in core Flink.
 # The one that I reported in StickyAllocationAndLocalRecoveryTestJob. This 
test code uses the sun management class to get the current pid. We have some 
options
 ## we change the constructor so it is driven reflectively so there is no 
compile error. This change would allow the test to compile.
 ## To have this test run in Semeru we could use logic like Datadog and try to 
reflectively load other classes. Something like 
[https://github.com/DataDog/dd-trace-java/blob/aee3ca59c6a05233f4295552f2ede80bc4fc[…]agent/src/main/java/datadog/trace/bootstrap/AgentBootstrap.java|https://github.com/DataDog/dd-trace-java/blob/aee3ca59c6a05233f4295552f2ede80bc4fc%5B%E2%80%A6%5Dagent/src/main/java/datadog/trace/bootstrap/AgentBootstrap.java]
   . I see we could reflectively drive 
com.ibm.lang.management.RuntimeMXBean.getProcessID() if the sun class is not 
there.
 ## We fix this properly at Flink v2, using either:
 ### methods introduced at Java 10: java.lang.management.RuntimeMXBean runtime 
= java.lang.management.ManagementFactory.getRuntimeMXBean(); runtime.getPid();
 ### or the Java 9 approach: long pid = ProcessHandle.current().pid();

      2. core Flink MemorySegment.java uses sun.misc.Unsafe. I see 
[https://cr.openjdk.org/~psandoz/dv14-uk-paul-sandoz-unsafe-the-situation.pdf] 
I wonder if we can remove Unsafe at Flink v2; I am not sure how widely used 
_off-heap unsafe_ memory is (also Flink v2 is changing how memory is being 
handled). I am not seeing an alternative.

     3. I see _SignalHandler and_ TestSignalHandler uses sun.misc.Signal . This 
seems to have come from Hadoop implementations that have been inherited. 

     4. There are 2 imports of import sun.security.krb5.KrbException; that can 
be produced when calling sun.security.krb5.Config.refresh()

 

I would like to implement 1.2 or if this is not acceptable 1.1. This would 
really help us short term as we could at least build with skipTests on Semeru.

 

Sun usages 2 and 3 would need some consensus in the community - as it seems 
we would be removing capability unless we can find an alternative. The 
sun.security references are used when testing Hadoop with Kerberos; I have not 
looked into it.
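For reference, the vendor-neutral PID options discussed above need only the standard library (a sketch, not the proposed Flink patch itself):

```java
import java.lang.management.ManagementFactory;

public final class Pids {
    // Java 9+: ProcessHandle replaces the sun.management trick entirely.
    public static long pidViaProcessHandle() {
        return ProcessHandle.current().pid();
    }

    // Java 10+: the standard RuntimeMXBean gained getPid(), so no cast to
    // vendor-specific bean classes is needed.
    public static long pidViaMxBean() {
        return ManagementFactory.getRuntimeMXBean().getPid();
    }
}
```

Both calls work identically on HotSpot and Semeru/OpenJ9, which is the point of the proposal.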

 

 


was (Author: JIRAUSER300523):
Hi [~martijnvisser] 

Here is a list of sun. classes I have found in core Flink.
 # The one that I reported in StickyAllocationAndLocalRecoveryTestJob. This 
test code uses the sun management class to get the current pid. We have some 
options
 ## we change the constructor so it is driven reflectively so there is no 
compile error. This change would allow the tst to compile.
 ## To have this test run in Semeru we could use logic like Datadog and try to 
reflectively load other classes. Something like 
[https://github.com/DataDog/dd-trace-java/blob/aee3ca59c6a05233f4295552f2ede80bc4fc[…]agent/src/main/java/datadog/trace/bootstrap/AgentBootstrap.java|https://github.com/DataDog/dd-trace-java/blob/aee3ca59c6a05233f4295552f2ede80bc4fc%5B%E2%80%A6%5Dagent/src/main/java/datadog/trace/bootstrap/AgentBootstrap.java]
   . I see we could reflectively drive 
com.ibm.lang.management.RuntimeMXBean.getProcessID() if the sun class is not 
there.
 ## We fix this properly at Flink v2, using  methods introduced at java 10:
 ## _java.lang.management.RuntimeMXBean runtime =_

                           
_java.lang.management.ManagementFactory.getRuntimeMXBean();_

                           _runtime.getPid();_

                           _2. Or use a java 9 approach_ _long pid = 
ProcessHandle.current().pid();_ 

      2. core Flink MemorySegment.java uses sun.misc.Unsafe. I see 
[https://cr.openjdk.org/~psandoz/dv14-uk-paul-sandoz-unsafe-the-situation.pdf]
 I wonder if we can remove unsafe at Flink v2; I am not sure how used _off-heap 
unsafe_ memory is. I am not seeing an alternative.

     3. I see _SignalHandler and_ TestSignalHandler uses sun.misc.Signal . This 
seems to have come from Hadoop implementations that have been inherited. 

     4. There are 2 imports of import sun.security.krb5.KrbException; that can 
be produced when calling sun.security.krb5.Config.refresh()

 

I would like to implement 1.2 or if this is not acceptable 1.1. This would 
really help us short term as we could at least build with skipTests on Semeru.

 

Sun usages 2 and 3  would need so some consensus in the community - as is seems 
we would be removing capability unless we can find an alternative. The 
sun.security 

[jira] [Commented] (FLINK-34202) python tests take suspiciously long in some of the cases

2024-02-19 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818439#comment-17818439
 ] 

lincoln lee commented on FLINK-34202:
-

1.20(master): 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57573=logs=3e4dd1a2-fe2f-5e5d-a581-48087e718d53=b4612f28-e3b5-5853-8a8b-610ae894217a

> python tests take suspiciously long in some of the cases
> 
>
> Key: FLINK-34202
> URL: https://issues.apache.org/jira/browse/FLINK-34202
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.2, 1.19.0, 1.18.1
>Reporter: Matthias Pohl
>Assignee: Xingbo Huang
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> [This release-1.18 
> build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603=logs=3e4dd1a2-fe2f-5e5d-a581-48087e718d53=b4612f28-e3b5-5853-8a8b-610ae894217a]
>  has the python stage running into a timeout without any obvious reason. The 
> [python stage run for 
> JDK17|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603=logs=b53e1644-5cb4-5a3b-5d48-f523f39bcf06]
>  was also getting close to the 4h timeout.
> I'm creating this issue for documentation purposes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34459) Results column names should match SELECT clause expression names

2024-02-19 Thread Jeyhun Karimov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818440#comment-17818440
 ] 

Jeyhun Karimov commented on FLINK-34459:


Hi [~martijnvisser] yes, there are some tradeoffs. Using AS is always a 
solution, but then a user needs to modify the query [and maybe modify it back]. 
And for queries with many projection expressions, the user needs to remember 
the mapping between EXPR$X -> actual expression in the query. 

Some vendors, like Spark, do not truncate (at least for the large 
expressions I tried); others, like MySQL/SQLite, truncate after some point 
(they decide where and how to truncate large expressions). 

So we have multiple options to deal with very large expressions: fall back 
to the current (EXPR$X) version, truncate, etc.
WDYT?
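The truncation-with-fallback option could be sketched like this (a purely hypothetical helper for discussion, not Flink API):

```java
public final class ColumnNames {
    // Derive a display name from the SELECT expression text, falling back to
    // the generated EXPR$<index> name when the expression text is too long.
    public static String displayName(String expressionText, int index, int maxLen) {
        if (expressionText.length() <= maxLen) {
            return expressionText;
        }
        return "EXPR$" + index; // fallback for very large expressions
    }
}
```

Short expressions like `a + 1` would surface as-is, while a multi-line CASE expression would keep today's generated name; the open question is where `maxLen` should sit and whether truncating mid-expression is preferable to falling back.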


> Results column names should match SELECT clause expression names
> 
>
> Key: FLINK-34459
> URL: https://issues.apache.org/jira/browse/FLINK-34459
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.18.1
>Reporter: Jeyhun Karimov
>Priority: Minor
>
> When printing {{SQL SELECT}} results, Flink will output generated expression 
> name when the expression is not {{column reference or used with alias/over.}}
> For example, select a, a + 1 from T would result in 
> {code:java}
> ++-+-+
> | op |   a |  EXPR$1 |
> ++-+-+
> | +I |   1 |   2 |
> | +I |   1 |   2 |
> | +I |   1 |   2 |
> ++-+-+
> {code}
> Instead of the generated {{EXPR$1}} it would be nice to have {{a + 1}} (which 
> is the case in some other data processing systems like Spark).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [hotfix] Integrate mongodb v1.1 docs [flink]

2024-02-19 Thread via GitHub


leonardBang opened a new pull request, #24332:
URL: https://github.com/apache/flink/pull/24332

   [hotfix] Integrate mongodb v1.1 docs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-22765) ExceptionUtilsITCase.testIsMetaspaceOutOfMemoryError is unstable

2024-02-19 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818368#comment-17818368
 ] 

lincoln lee commented on FLINK-22765:
-

JDK21: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57573=logs=a657ddbf-d986-5381-9649-342d9c92e7fb=dc085d4a-05c8-580e-06ab-21f5624dab16

> ExceptionUtilsITCase.testIsMetaspaceOutOfMemoryError is unstable
> 
>
> Key: FLINK-22765
> URL: https://issues.apache.org/jira/browse/FLINK-22765
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.13.5, 1.15.0, 1.17.2, 1.19.0, 1.20.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Critical
>  Labels: pull-request-available, stale-assigned, test-stability
> Fix For: 1.14.0, 1.16.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18292=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=a99e99c7-21cd-5a1f-7274-585e62b72f56
> {code}
> May 25 00:56:38 java.lang.AssertionError: 
> May 25 00:56:38 
> May 25 00:56:38 Expected: is ""
> May 25 00:56:38  but: was "The system is out of resources.\nConsult the 
> following stack trace for details."
> May 25 00:56:38   at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> May 25 00:56:38   at org.junit.Assert.assertThat(Assert.java:956)
> May 25 00:56:38   at org.junit.Assert.assertThat(Assert.java:923)
> May 25 00:56:38   at 
> org.apache.flink.runtime.util.ExceptionUtilsITCase.run(ExceptionUtilsITCase.java:94)
> May 25 00:56:38   at 
> org.apache.flink.runtime.util.ExceptionUtilsITCase.testIsMetaspaceOutOfMemoryError(ExceptionUtilsITCase.java:70)
> May 25 00:56:38   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> May 25 00:56:38   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> May 25 00:56:38   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> May 25 00:56:38   at java.lang.reflect.Method.invoke(Method.java:498)
> May 25 00:56:38   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> May 25 00:56:38   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> May 25 00:56:38   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> May 25 00:56:38   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> May 25 00:56:38   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> May 25 00:56:38   at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> May 25 00:56:38   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> May 25 00:56:38   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> May 25 00:56:38   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> May 25 00:56:38   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> May 25 00:56:38   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> May 25 00:56:38   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> May 25 00:56:38   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> May 25 00:56:38   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> May 25 00:56:38   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> May 25 00:56:38   at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
> May 25 00:56:38   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> May 25 00:56:38   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> May 25 00:56:38   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> May 25 00:56:38   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> May 25 00:56:38   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> May 25 00:56:38   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> May 25 00:56:38   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> May 25 00:56:38   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> May 25 00:56:38   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> May 25 00:56:38   at 
> 

[jira] [Commented] (FLINK-34418) Disk space issues for Docker-ized GitHub Action jobs

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818401#comment-17818401
 ] 

Matthias Pohl commented on FLINK-34418:
---

https://github.com/apache/flink/actions/runs/7938595320/job/21677809941

> Disk space issues for Docker-ized GitHub Action jobs
> 
>
> Key: FLINK-34418
> URL: https://issues.apache.org/jira/browse/FLINK-34418
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.19.0, 1.18.1, 1.20.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: github-actions, pull-request-available, test-stability
>
> [https://github.com/apache/flink/actions/runs/7838691874/job/21390739806#step:10:27746]
> {code:java}
> [...]
> Feb 09 03:00:13 Caused by: java.io.IOException: No space left on device
> 27608Feb 09 03:00:13  at java.io.FileOutputStream.writeBytes(Native Method)
> 27609Feb 09 03:00:13  at 
> java.io.FileOutputStream.write(FileOutputStream.java:326)
> 27610Feb 09 03:00:13  at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStreamManager.java:250)
> 27611Feb 09 03:00:13  ... 39 more
> [...] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-34443) YARNFileReplicationITCase.testPerJobModeWithCustomizedFileReplication failed when deploying job cluster

2024-02-19 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818396#comment-17818396
 ] 

Matthias Pohl edited comment on FLINK-34443 at 2/19/24 9:23 AM:


* 
[https://github.com/apache/flink/actions/runs/7938595181/job/21677803913#step:10:28799]
 * 
[https://github.com/apache/flink/actions/runs/7938595184/job/21677788845#step:10:27633]
 * 
[https://github.com/apache/flink/actions/runs/7938595184/job/21677813511#step:10:28731]
 * 
[https://github.com/apache/flink/actions/runs/7938595184/job/21677790189#step:10:27633]
 *  


was (Author: mapohl):
* 
[https://github.com/apache/flink/actions/runs/7938595181/job/21677803913#step:10:28799]
 * 
https://github.com/apache/flink/actions/runs/7938595184/job/21677813511#step:10:28731

> YARNFileReplicationITCase.testPerJobModeWithCustomizedFileReplication failed 
> when deploying job cluster
> ---
>
> Key: FLINK-34443
> URL: https://issues.apache.org/jira/browse/FLINK-34443
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI, Runtime / Coordination, Test 
> Infrastructure
>Affects Versions: 1.19.0, 1.20.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: github-actions, test-stability
>
> https://github.com/apache/flink/actions/runs/7895502206/job/21548246199#step:10:28804
> {code}
> Error: 03:04:05 03:04:05.066 [ERROR] Tests run: 2, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 68.10 s <<< FAILURE! -- in 
> org.apache.flink.yarn.YARNFileReplicationITCase
> Error: 03:04:05 03:04:05.067 [ERROR] 
> org.apache.flink.yarn.YARNFileReplicationITCase.testPerJobModeWithCustomizedFileReplication
>  -- Time elapsed: 1.982 s <<< ERROR!
> Feb 14 03:04:05 
> org.apache.flink.client.deployment.ClusterDeploymentException: Could not 
> deploy Yarn job cluster.
> Feb 14 03:04:05   at 
> org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:566)
> Feb 14 03:04:05   at 
> org.apache.flink.yarn.YARNFileReplicationITCase.deployPerJob(YARNFileReplicationITCase.java:109)
> Feb 14 03:04:05   at 
> org.apache.flink.yarn.YARNFileReplicationITCase.lambda$testPerJobModeWithCustomizedFileReplication$0(YARNFileReplicationITCase.java:73)
> Feb 14 03:04:05   at 
> org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:303)
> Feb 14 03:04:05   at 
> org.apache.flink.yarn.YARNFileReplicationITCase.testPerJobModeWithCustomizedFileReplication(YARNFileReplicationITCase.java:73)
> Feb 14 03:04:05   at java.lang.reflect.Method.invoke(Method.java:498)
> Feb 14 03:04:05   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> Feb 14 03:04:05   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> Feb 14 03:04:05   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> Feb 14 03:04:05   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> Feb 14 03:04:05   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> Feb 14 03:04:05 Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
> /user/root/.flink/application_1707879779446_0002/log4j-api-2.17.1.jar could 
> only be written to 0 of the 1 minReplication nodes. There are 2 datanode(s) 
> running and 2 node(s) are excluded in this operation.
> Feb 14 03:04:05   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2260)
> Feb 14 03:04:05   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
> Feb 14 03:04:05   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2813)
> Feb 14 03:04:05   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:908)
> Feb 14 03:04:05   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
> Feb 14 03:04:05   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> Feb 14 03:04:05   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
> Feb 14 03:04:05   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
> Feb 14 03:04:05   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
> Feb 14 03:04:05   at 
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029)
> Feb 14 03:04:05   at 
> 

[jira] [Comment Edited] (FLINK-21672) End to end tests (streaming) aren't Java vendor neutral (sun.management bean used)

2024-02-19 Thread david radley (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818182#comment-17818182
 ] 

david radley edited comment on FLINK-21672 at 2/19/24 9:48 AM:
---

Hi [~martijnvisser] 

Here is a list of sun. classes I have found in core Flink.
 # The one that I reported in StickyAllocationAndLocalRecoveryTestJob. This 
test code uses the sun management class to get the current pid. We have some 
options
 ## we change the constructor so it is driven reflectively so there is no 
compile error. This change would allow the test to compile.
 ## To have this test run in Semeru we could use logic like Datadog and try to 
reflectively load other classes. Something like 
[https://github.com/DataDog/dd-trace-java/blob/aee3ca59c6a05233f4295552f2ede80bc4fc[…]agent/src/main/java/datadog/trace/bootstrap/AgentBootstrap.java|https://github.com/DataDog/dd-trace-java/blob/aee3ca59c6a05233f4295552f2ede80bc4fc%5B%E2%80%A6%5Dagent/src/main/java/datadog/trace/bootstrap/AgentBootstrap.java]
   . I see we could reflectively drive 
com.ibm.lang.management.RuntimeMXBean.getProcessID() if the sun class is not 
there.
 ## We fix this properly at Flink v2, using either:
 ### methods introduced at Java 10: java.lang.management.RuntimeMXBean runtime 
= java.lang.management.ManagementFactory.getRuntimeMXBean(); runtime.getPid();
 ### or the Java 9 approach: long pid = ProcessHandle.current().pid();

      2. core Flink MemorySegment.java uses sun.misc.Unsafe. I see 
[https://cr.openjdk.org/~psandoz/dv14-uk-paul-sandoz-unsafe-the-situation.pdf] 
I wonder if we can remove Unsafe at Flink v2; I am not sure how widely used 
_off-heap unsafe_ memory is. I am not seeing an alternative.

     3. I see _SignalHandler and_ TestSignalHandler uses sun.misc.Signal . This 
seems to have come from Hadoop implementations that have been inherited. 

     4. There are 2 imports of import sun.security.krb5.KrbException; that can 
be produced when calling sun.security.krb5.Config.refresh()

 

I would like to implement 1.2, or if this is not acceptable, 1.1. This would really help us short term, as we could at least build with skipTests on Semeru.

 

Sun usages 2 and 3 would need some consensus in the community, as it seems we would be removing capability unless we can find an alternative. The sun.security references are used when testing Hadoop with Kerberos; I have not looked into it.

 

 



[PR] [FLINK-34152] Tune all directly observable memory types [flink-kubernetes-operator]

2024-02-19 Thread via GitHub


mxm opened a new pull request, #778:
URL: https://github.com/apache/flink-kubernetes-operator/pull/778

   This is the generalized version of #762 in which we tune all directly 
observable Flink memory pools: heap, managed memory, network, and metaspace. 
   
   Notable changes:
   
   - New metrics to measure managed memory, network, and metaspace usage in 
addition to heap usage have been added
   - All memory pools are tuned, i.e. their size is increased or decreased 
according to their usage.
   - Max memory size of a pool is no longer limited by the original configured 
limit for that pool. The only cap is the max TM memory in the spec.
   - Memory can be freely moved between the pools
   - The feature to "give back" memory to RocksDB has been removed. The reason 
is that the new version does that automatically. The available memory is 
distributed across the memory pools based on their usage. Managed memory is 
assigned last. In the case of a streaming job without RocksDB, this should 
result in very low managed memory usage and hence reduce the managed and total 
memory. However, when RocksDB is used, the managed memory is allowed to grow 
until the max available memory assigned in the spec.
   - We assign a minimum of 256 MB (configurable) per memory pool
   
   The changes have been verified with an actual Kubernetes deployment.
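The distribution rule described above (a minimum per pool, usage-based targets, managed memory assigned last, total capped by the TM spec) could be sketched as follows; pool names and numbers are illustrative, not the operator's actual code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PoolTuningSketch {

    // Grants each pool max(minPerPool, usage), in iteration order,
    // never exceeding maxTotal overall; iterate managed memory last
    // so it absorbs whatever budget remains.
    static Map<String, Long> tune(Map<String, Long> usage,
                                  long minPerPool, long maxTotal) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        long remaining = maxTotal;
        for (Map.Entry<String, Long> e : usage.entrySet()) {
            long target = Math.max(minPerPool, e.getValue());
            long granted = Math.min(target, remaining);
            sizes.put(e.getKey(), granted);
            remaining -= granted;
        }
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, Long> usage = new LinkedHashMap<>();
        usage.put("heap", 900L);
        usage.put("metaspace", 100L);
        usage.put("network", 50L);
        usage.put("managed", 2000L); // e.g. RocksDB wants more
        // With a 2048 MB cap, managed memory only gets the 636 MB
        // left over after the other pools are served.
        System.out.println(tune(usage, 256L, 2048L));
    }
}
```

With no RocksDB the managed-memory usage metric would be near zero, so the pool shrinks to the configured minimum, matching the behavior described in the PR.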


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


