[jira] [Resolved] (FLINK-32851) [JUnit5 Migration] The rest package of flink-runtime module

2023-09-21 Thread Weihua Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weihua Hu resolved FLINK-32851.
---
Fix Version/s: 1.19.0
   Resolution: Fixed

master: 4817db916889a7701481fb04333a7f8da9f5b583

> [JUnit5 Migration] The rest package of flink-runtime module
> ---
>
> Key: FLINK-32851
> URL: https://issues.apache.org/jira/browse/FLINK-32851
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Rui Fan
>Assignee: Matt Wang
>Priority: Minor
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.19.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] huwh closed pull request #23242: [FLINK-32851][runtime][JUnit5 Migration] The rest package of flink-runtime module

2023-09-21 Thread via GitHub


huwh closed pull request #23242: [FLINK-32851][runtime][JUnit5 Migration] The 
rest package of flink-runtime module
URL: https://github.com/apache/flink/pull/23242


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33118) Remove the PythonBridgeUtils

2023-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33118:
---
Labels: pull-request-available  (was: )

> Remove the PythonBridgeUtils
> 
>
> Key: FLINK-33118
> URL: https://issues.apache.org/jira/browse/FLINK-33118
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / Machine Learning
>Reporter: Jiang Xin
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-2.4.0
>
>
> We previously added org.apache.flink.ml.python.PythonBridgeUtils.java to work 
> around FLINK-30168 and FLINK-29477. Both are now fixed, so we can remove the 
> class along with its dependencies.





[GitHub] [flink-ml] jiangxin369 commented on pull request #256: [FLINK-33118] Remove the PythonBridgeUtils

2023-09-21 Thread via GitHub


jiangxin369 commented on PR #256:
URL: https://github.com/apache/flink-ml/pull/256#issuecomment-1730801486

   @lindong28 Could you help review this PR?





[jira] [Updated] (FLINK-33122) [Benchmark] Null checkpoint directory in rescaling benchmarks

2023-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33122:
---
Labels: pull-request-available  (was: )

> [Benchmark] Null checkpoint directory in rescaling benchmarks
> -
>
> Key: FLINK-33122
> URL: https://issues.apache.org/jira/browse/FLINK-33122
> Project: Flink
>  Issue Type: Bug
>  Components: Benchmarks
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
>  Labels: pull-request-available
>
> Currently, when setting up a rescaling benchmark, a local checkpoint storage 
> is created based on a local path configured by "benchmark.state.data-dir". 
> When the user does not provide a value for this option, an exception is thrown. 
> In this case, the right behavior is to create a temporary directory for 
> checkpoints, just as _StateBackendBenchmarkUtils#createKeyedStateBackend_
> does for the local data directory.
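
The fallback described above could look roughly like the following. This is only a minimal sketch, not the code from the pull request: the class name `CheckpointDirSketch`, the method `resolveCheckpointDir`, and the prefix string are hypothetical, and only JDK calls are used.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

final class CheckpointDirSketch {

    /**
     * Hypothetical helper (not the PR's code): returns the configured directory,
     * or a fresh temporary directory when no path is configured.
     */
    static File resolveCheckpointDir(String configuredPath, String prefix) throws IOException {
        if (configuredPath == null || configuredPath.trim().isEmpty()) {
            // No "benchmark.state.data-dir" value: fall back to a temp directory
            // instead of throwing.
            return Files.createTempDirectory(prefix).toFile();
        }
        File dir = new File(configuredPath);
        // Create the directory (and any missing parents) if it does not exist yet.
        Files.createDirectories(dir.toPath());
        return dir;
    }

    public static void main(String[] args) throws IOException {
        // Prints the path of a newly created temporary directory.
        System.out.println(resolveCheckpointDir(null, "rescaling-checkpoint-"));
    }
}
```

A real benchmark would probably also delete the temporary directory during teardown; that part is left out here.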





[GitHub] [flink-benchmarks] masteryhx commented on a diff in pull request #79: [FLINK-33122][benchmark] Support null checkpoint data path for rescaling benchmarks

2023-09-21 Thread via GitHub


masteryhx commented on code in PR #79:
URL: https://github.com/apache/flink-benchmarks/pull/79#discussion_r1333857024


##
src/main/java/org/apache/flink/state/benchmark/RescalingBenchmarkBase.java:
##
@@ -57,6 +63,26 @@ public static void runBenchmark(Class clazz) throws 
RunnerException {
 new Runner(options).run();
 }
 
+protected static File prepareDirectory(String prefix) throws IOException {

Review Comment:
   This code path is similar to StateBenchmarkBase#createKeyedStateBackend 
and StateBackendBenchmarkUtils#prepareDirectory. Could we merge them?






[GitHub] [flink-kubernetes-operator] 1996fanrui commented on a diff in pull request #677: [FLINK-33097][autoscaler] Initialize the generic autoscaler module and interfaces

2023-09-21 Thread via GitHub


1996fanrui commented on code in PR #677:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/677#discussion_r1333849523


##
flink-autoscaler/src/main/java/org/apache/flink/autoscaler/state/AutoScalerStateStore.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.autoscaler.state;
+
+import org.apache.flink.annotation.Experimental;
+import org.apache.flink.autoscaler.JobAutoScalerContext;
+
+/**
+ * The state store is responsible for store all states during scaling.
+ *
+ * @param <KEY> The job key.
+ * @param <Context> Instance of JobAutoScalerContext.
+ */
+@Experimental
+public interface AutoScalerStateStore<KEY, Context extends JobAutoScalerContext<KEY>> {
+
+void storeScalingHistory(Context jobContext, String scalingHistory);

Review Comment:
   Hi @Samrat002 , thanks for your feedback!
   
   I have updated these method parameters of `AutoScalerStateStore` to the 
specific class instead of String, such as: `Map> scalingHistory`.
   
   ```
   public interface AutoScalerStateStore> {
   
   void storeScalingHistory(
   Context jobContext,
   Map> 
scalingHistory);
   
   Optional>> 
getScalingHistory(
   Context jobContext);
   
   void removeScalingHistory(Context jobContext);
   }
   ```
   
   The PR has been updated as well.
   
   Do you think this is OK? It means the state store is responsible for how to 
serialize and deserialize the state, for example:
   
   - The default `KubernetesAutoScalerStateStore` will serialize all states to 
String inside `KubernetesAutoScalerStateStore`.
   - As you mentioned before, if any complex type is added in the future, each 
state store determines how to serialize it.
   
   Also, let me add the reason for updating these parameters:
   
   Currently, all states are stored in a ConfigMap, which has a size limit. 
The size limit should only apply to `KubernetesAutoScalerStateStore`, and 
enforcing it is part of serialization. So we should move serialization 
and deserialization into the `AutoScalerStateStore`.
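   
   To illustrate the idea of the store owning serialization, here is a minimal, self-contained sketch. It is not the code in this PR: `HistoryStore`, `StringBackedStore`, the `String` vertex ids, the `Integer` values, and the text format are all hypothetical stand-ins, and the character limit only imitates a ConfigMap size limit.
   
```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.SortedMap;
import java.util.TreeMap;

final class StateStoreSketch {

    /** Hypothetical typed interface: callers never see the serialized form. */
    interface HistoryStore<C> {
        void storeScalingHistory(C jobContext, Map<String, SortedMap<Instant, Integer>> history);

        Optional<Map<String, SortedMap<Instant, Integer>>> getScalingHistory(C jobContext);

        void removeScalingHistory(C jobContext);
    }

    /** Hypothetical string-backed store; the size limit lives next to the serialization. */
    static final class StringBackedStore implements HistoryStore<String> {
        private final Map<String, String> backing = new HashMap<>();
        private final int maxChars;

        StringBackedStore(int maxChars) {
            this.maxChars = maxChars;
        }

        @Override
        public void storeScalingHistory(
                String jobContext, Map<String, SortedMap<Instant, Integer>> history) {
            String serialized = serialize(history);
            if (serialized.length() > maxChars) {
                throw new IllegalStateException("Serialized history exceeds the size limit");
            }
            backing.put(jobContext, serialized);
        }

        @Override
        public Optional<Map<String, SortedMap<Instant, Integer>>> getScalingHistory(
                String jobContext) {
            return Optional.ofNullable(backing.get(jobContext))
                    .map(StringBackedStore::deserialize);
        }

        @Override
        public void removeScalingHistory(String jobContext) {
            backing.remove(jobContext);
        }

        // Format: vertex=epochMilli:value,epochMilli:value;vertex=...
        // Assumes vertex ids contain none of the characters ';', '=', ',' or ':'.
        static String serialize(Map<String, SortedMap<Instant, Integer>> history) {
            StringBuilder sb = new StringBuilder();
            for (Map.Entry<String, SortedMap<Instant, Integer>> vertex :
                    new TreeMap<>(history).entrySet()) {
                if (sb.length() > 0) {
                    sb.append(';');
                }
                sb.append(vertex.getKey()).append('=');
                boolean first = true;
                for (Map.Entry<Instant, Integer> e : vertex.getValue().entrySet()) {
                    if (!first) {
                        sb.append(',');
                    }
                    sb.append(e.getKey().toEpochMilli()).append(':').append(e.getValue());
                    first = false;
                }
            }
            return sb.toString();
        }

        static Map<String, SortedMap<Instant, Integer>> deserialize(String serialized) {
            Map<String, SortedMap<Instant, Integer>> result = new HashMap<>();
            if (serialized.isEmpty()) {
                return result;
            }
            for (String vertex : serialized.split(";")) {
                String[] keyAndEntries = vertex.split("=", 2);
                SortedMap<Instant, Integer> entries = new TreeMap<>();
                if (!keyAndEntries[1].isEmpty()) {
                    for (String entry : keyAndEntries[1].split(",")) {
                        String[] parts = entry.split(":");
                        entries.put(
                                Instant.ofEpochMilli(Long.parseLong(parts[0])),
                                Integer.parseInt(parts[1]));
                    }
                }
                result.put(keyAndEntries[0], entries);
            }
            return result;
        }
    }
}
```
   
   With this shape, a different store (for example a database-backed one) could pick its own format and limits without changing the callers.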






[jira] [Updated] (FLINK-33051) Unify the GlobalFailureHandler and LabeledGlobalFailureHandler interface

2023-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33051:
---
Labels: pull-request-available  (was: )

> Unify the GlobalFailureHandler and LabeledGlobalFailureHandler interface
> 
>
> Key: FLINK-33051
> URL: https://issues.apache.org/jira/browse/FLINK-33051
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Panagiotis Garefalakis
>Assignee: Matt Wang
>Priority: Minor
>  Labels: pull-request-available
>
> FLIP-304 introduced the `LabeledGlobalFailureHandler` interface, which is an 
> extension of the `GlobalFailureHandler` interface. The latter can thus be 
> removed in the future to avoid having interfaces with duplicate functions.





[GitHub] [flink] wangzzu closed pull request #23437: [FLINK-33051] Unify the GlobalFailureHandler and LabeledGlobalFailureHandler interface

2023-09-21 Thread via GitHub


wangzzu closed pull request #23437: [FLINK-33051] Unify the 
GlobalFailureHandler and LabeledGlobalFailureHandler interface
URL: https://github.com/apache/flink/pull/23437





[jira] [Commented] (FLINK-33051) Unify the GlobalFailureHandler and LabeledGlobalFailureHandler interface

2023-09-21 Thread Matt Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767805#comment-17767805
 ] 

Matt Wang commented on FLINK-33051:
---

[~pgaref] I agree with your point of view. Currently, replacing 
GlobalFailureHandler with LabeledGlobalFailureHandler would have a relatively 
high implementation cost. I would prefer to unify the exception handling modules 
of AdaptiveScheduler and DefaultScheduler in the future, and I will investigate 
the feasibility of this idea. If it is feasible, I will create a separate ticket 
or FLIP for it at that time. Can we close this ticket for now? WDYT?

> Unify the GlobalFailureHandler and LabeledGlobalFailureHandler interface
> 
>
> Key: FLINK-33051
> URL: https://issues.apache.org/jira/browse/FLINK-33051
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Panagiotis Garefalakis
>Assignee: Matt Wang
>Priority: Minor
>
> FLIP-304 introduced the `LabeledGlobalFailureHandler` interface, which is an 
> extension of the `GlobalFailureHandler` interface. The latter can thus be 
> removed in the future to avoid having interfaces with duplicate functions.





[jira] [Commented] (FLINK-30883) Missing JobID caused the k8s e2e test to fail

2023-09-21 Thread Fei Feng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767800#comment-17767800
 ] 

Fei Feng commented on FLINK-30883:
--

We encountered this problem in our production environment as well.

We have a job with a fixed restart strategy that restarts at most 3 times. 
After 3 restarts, the job status becomes FAILED, and the failed job's 
information is deleted from the JobManager UI after 1 hour. When we then stop 
the SessionJob CR, cancelling the job fails with an exception because the job ID 
is missing ...

> Missing JobID caused the k8s e2e test to fail
> -
>
> Key: FLINK-30883
> URL: https://issues.apache.org/jira/browse/FLINK-30883
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes, Runtime / Coordination
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: auto-deprioritized-critical, test-stability
> Attachments: e2e_test_failure.log, 
> flink-vsts-client-fv-az378-840.log, jobmanager.0.log, jobmanager.1.log, 
> taskmanager.log
>
>
> We've experienced a test failure in {{Run kubernetes application HA test}} 
> due to a {{CliArgsException}}:
> {code}
> Feb 01 15:03:15 org.apache.flink.client.cli.CliArgsException: Missing JobID. 
> Specify a JobID to cancel a job.
> Feb 01 15:03:15   at 
> org.apache.flink.client.cli.CliFrontend.cancel(CliFrontend.java:689) 
> ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
> Feb 01 15:03:15   at 
> org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1107) 
> ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
> Feb 01 15:03:15   at 
> org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189)
>  ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
> Feb 01 15:03:15   at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>  [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
> Feb 01 15:03:15   at 
> org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189) 
> [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
> Feb 01 15:03:15   at 
> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157) 
> [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45569=logs=bea52777-eaf8-5663-8482-18fbc3630e81=ae4f8708-9994-57d3-c2d7-b892156e7812=b2642e3a-5b86-574d-4c8a-f7e2842bfb14=9866





[GitHub] [flink] hackergin commented on pull request #23446: [FLINK-32976][runtime] Fix NullPointException when starting flink cluster in standalone mode

2023-09-21 Thread via GitHub


hackergin commented on PR #23446:
URL: https://github.com/apache/flink/pull/23446#issuecomment-1730708373

   @gaborgsomogyi Should we backport this commit to the 1.17 and 1.18 
branches? If so, I'll create the PR.





[GitHub] [flink] flinkbot commented on pull request #23450: [FLINK-33119][table] The pojo result returned be procedure should be Row of fields in the pojo instead of whole pojo object

2023-09-21 Thread via GitHub


flinkbot commented on PR #23450:
URL: https://github.com/apache/flink/pull/23450#issuecomment-1730695359

   
   ## CI report:
   
   * 4106cf094c32a9aad6c88771529e8f1294f67df9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] luoyuxia opened a new pull request, #23450: [FLINK-33119][table] The pojo result returned be procedure should be Row of fields in the pojo instead of whole pojo object

2023-09-21 Thread via GitHub


luoyuxia opened a new pull request, #23450:
URL: https://github.com/apache/flink/pull/23450

   Backport of #23438





[GitHub] [flink] luoyuxia closed pull request #23448: [FLINK-33050][table] Atomicity is not supported prompting the user to disable

2023-09-21 Thread via GitHub


luoyuxia closed pull request #23448: [FLINK-33050][table] Atomicity is not 
supported prompting the user to disable
URL: https://github.com/apache/flink/pull/23448





[GitHub] [flink] luoyuxia commented on pull request #23438: [FLINK-33119][table] The pojo result returned be procedure should be Row of fields in the pojo instead of whole pojo object

2023-09-21 Thread via GitHub


luoyuxia commented on PR #23438:
URL: https://github.com/apache/flink/pull/23438#issuecomment-1730617600

   @hackergin Thanks for your review. I have addressed your comment.





[GitHub] [flink] luoyuxia commented on a diff in pull request #23438: [FLINK-33119][table] The pojo result returned be procedure should be Row of fields in the pojo instead of whole pojo object

2023-09-21 Thread via GitHub


luoyuxia commented on code in PR #23438:
URL: https://github.com/apache/flink/pull/23438#discussion_r1333768943


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/operations/PlannerCallProcedureOperation.java:
##
@@ -237,6 +249,14 @@ private TableResultInternal procedureResultToTableResult(
 tableResultType = DataTypes.ROW(DataTypes.FIELD("result", 
tableResultType));
 }
 
+RowRowConverter rowConverter = null;
+// if the output is struct type,
+// we need a row converter to convert the struct value to Row

Review Comment:
   Thx, I have refined the comment: first convert the struct value to RowData, 
and then use the RowRowConverter to convert the RowData to Row. Hope it is clear now.






[jira] [Updated] (FLINK-32851) [JUnit5 Migration] The rest package of flink-runtime module

2023-09-21 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-32851:
---
Labels: pull-request-available stale-assigned  (was: pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue is assigned but has not 
received an update in 30 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it.


> [JUnit5 Migration] The rest package of flink-runtime module
> ---
>
> Key: FLINK-32851
> URL: https://issues.apache.org/jira/browse/FLINK-32851
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Rui Fan
>Assignee: Matt Wang
>Priority: Minor
>  Labels: pull-request-available, stale-assigned
>






[jira] [Updated] (FLINK-33056) NettyClientServerSslTest#testValidSslConnection fails on AZP

2023-09-21 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-33056:
---
Labels: stale-critical test-stability  (was: test-stability)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue has been marked as 
Critical but is unassigned and neither itself nor its Sub-Tasks have been 
updated for 14 days. I have gone ahead and marked it "stale-critical". If this 
ticket is critical, please either assign yourself or give an update. 
Afterwards, please remove the label or in 7 days the issue will be 
deprioritized.


> NettyClientServerSslTest#testValidSslConnection fails on AZP
> 
>
> Key: FLINK-33056
> URL: https://issues.apache.org/jira/browse/FLINK-33056
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration, Runtime / Coordination
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: stale-critical, test-stability
> Attachments: logs-cron_azure-test_cron_azure_core-1694048924.zip
>
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53020=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=125e07e7-8de0-5c6c-a541-a567415af3ef=8592
> fails with 
> {noformat}
> Test testValidSslConnection[SSL provider = 
> JDK](org.apache.flink.runtime.io.network.netty.NettyClientServerSslTest) is 
> running.
> 
> 01:20:31,479 [main] INFO  
> org.apache.flink.runtime.io.network.netty.NettyConfig[] - NettyConfig 
> [server address: localhost/127.0.0.1, server port range: 36717, ssl enabled: 
> true, memory segment size (bytes): 1024, transport type: AUTO, number of 
> server threads: 1 (manual), number of client threads: 1 (manual), server 
> connect backlog: 0 (use Netty's default), client 
> connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's 
> default)]
> 01:20:31,479 [main] INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer[] - Transport 
> type 'auto': using EPOLL.
> 01:20:31,475 [Flink Netty Client (42359) Thread 0] WARN  
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline [] - 
> An exceptionCaught() event was fired, and it reached at the tail of the 
> pipeline. It usually means the last handler in the pipeline did not handle 
> the exception.
> org.apache.flink.shaded.netty4.io.netty.handler.codec.DecoderException: 
> javax.net.ssl.SSLHandshakeException: server certificate with unknown 
> fingerprint: CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, 
> C=Unknown
> at 
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499)
>  ~[flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
>  ~[flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> org.apache.flink.shaded.netty4.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
>  [flink-shaded-netty-4.1.91.Final-17.0.jar:?]
> at 
> 

[jira] [Updated] (FLINK-32995) TPC-DS end-to-end test fails with chmod: cannot access '../target/generator/dsdgen_linux':

2023-09-21 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-32995:
---
  Labels: auto-deprioritized-critical test-stability  (was: stale-critical 
test-stability)
Priority: Major  (was: Critical)

This issue was labeled "stale-critical" 7 days ago and has not received any 
updates so it is being deprioritized. If this ticket is actually Critical, 
please raise the priority and ask a committer to assign you the issue or revive 
the public discussion.


> TPC-DS end-to-end test fails with chmod: cannot access 
> '../target/generator/dsdgen_linux': 
> ---
>
> Key: FLINK-32995
> URL: https://issues.apache.org/jira/browse/FLINK-32995
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Priority: Major
>  Labels: auto-deprioritized-critical, test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=52773=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=0f3adb59-eefa-51c6-2858-3654d9e0749d=5504
>  fails as
> {noformat}
> Aug 29 10:03:20 [INFO] 10:03:20 Generating TPC-DS qualification data, this 
> need several minutes, please wait...
> chmod: cannot access '../target/generator/dsdgen_linux': No such file or 
> directory
> Aug 29 10:03:20 [FAIL] Test script contains errors.
> Aug 29 10:03:20 Checking for errors...
> {noformat}





[GitHub] [flink-kubernetes-operator] Samrat002 commented on a diff in pull request #677: [FLINK-33097][autoscaler] Initialize the generic autoscaler module and interfaces

2023-09-21 Thread via GitHub


Samrat002 commented on code in PR #677:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/677#discussion_r1333545703


##
flink-autoscaler/src/main/java/org/apache/flink/autoscaler/state/AutoScalerStateStore.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.autoscaler.state;
+
+import org.apache.flink.annotation.Experimental;
+import org.apache.flink.autoscaler.JobAutoScalerContext;
+
+/**
+ * The state store is responsible for store all states during scaling.
+ *
+ * @param <KEY> The job key.
+ * @param <Context> Instance of JobAutoScalerContext.
+ */
+@Experimental
+public interface AutoScalerStateStore<KEY, Context extends JobAutoScalerContext<KEY>> {
+
+void storeScalingHistory(Context jobContext, String scalingHistory);

Review Comment:
   
   Much appreciated, @1996fanrui , for summarizing our discussion.
   
   I would like to add one more point: the `AutoScalerEventHandler` currently 
offers two types, namely `warning` and `normal`. In my opinion, it would be 
beneficial to include an `Error` or `Fatal` option as part of the interface. 
This flexibility would allow different users implementing this autoscaler 
module to define and use these options according to their specific requirements.
   
   I would definitely appreciate hearing the thoughts of @mxm  and @gyfora  
regarding the adoption of structured classes over string and the proposal to 
introduce a new type called Fatal in `AutoScalerEventHandler`, as described in 
this thread.






[jira] [Updated] (FLINK-31854) Flink connector redshift

2023-09-21 Thread Samrat Deb (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samrat Deb updated FLINK-31854:
---
Description: 
This is an umbrella Jira for 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift

 

 

  was:
Proposal :

Add new feature (submodule) flink connector redshift in flink-connector-aws.

 

Next steps 
 - Create a flip for the flink connector redshift 

 

 


> Flink connector redshift 
> -
>
> Key: FLINK-31854
> URL: https://issues.apache.org/jira/browse/FLINK-31854
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / AWS
>Reporter: Samrat Deb
>Priority: Major
>
> This is an umbrella Jira for 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
>  
>  





[jira] [Updated] (FLINK-33107) Update stable-spec upgrade mode on reconciled-spec change

2023-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33107:
---
Labels: pull-request-available  (was: )

> Update stable-spec upgrade mode on reconciled-spec change
> -
>
> Key: FLINK-33107
> URL: https://issues.apache.org/jira/browse/FLINK-33107
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.6.0, kubernetes-operator-1.7.0
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: pull-request-available
>
> Since the rollback mechanism now uses the regular upgrade flow, we need to 
> ensure that the lastStableSpec upgrade mode is kept in sync with the 
> lastReconciled spec to guarantee correct stateful upgrades.





[jira] [Updated] (FLINK-33119) The pojo result returned be procedure should be Row of fields in the pojo instead of the whole pojo object

2023-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33119:
---
Labels: pull-request-available  (was: )

> The pojo result returned be procedure should be Row of fields in the pojo 
> instead of the whole pojo object
> --
>
> Key: FLINK-33119
> URL: https://issues.apache.org/jira/browse/FLINK-33119
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: luoyuxia
>Priority: Major
>  Labels: pull-request-available
>
> If a procedure returns a Pojo object as a table result of Row, Flink 
> currently treats it as Row.of(pojo) instead of Row.of(f1, f2, ..), where f1, f2 
> are the fields of the Pojo object. This behavior for procedures is not 
> aligned with that of functions, so we should treat it as Row.of(f1, f2, ..).
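
To make the difference between the two shapes concrete, here is a minimal sketch in plain Java that does not depend on Flink. An `Object[]` stands in for the fields of a Row, and `User`, `wrap`, and `flatten` are hypothetical names introduced only for illustration; this is not how the planner performs the conversion.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class PojoRowSketch {

    // Hypothetical POJO standing in for a procedure's return type.
    public static final class User {
        public String name;
        public int age;

        public User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Behavior described as current: the whole POJO becomes one row field.
    static Object[] wrap(Object pojo) {
        return new Object[] {pojo};
    }

    // Behavior described as desired: one row field per public instance field.
    // Note: Class.getFields() does not guarantee any particular field order.
    static Object[] flatten(Object pojo) {
        List<Object> values = new ArrayList<>();
        for (Field field : pojo.getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // skip static fields, they are not part of the record
            }
            try {
                values.add(field.get(pojo));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return values.toArray();
    }

    public static void main(String[] args) {
        User user = new User("alice", 30);
        System.out.println(wrap(user).length);    // 1
        System.out.println(flatten(user).length); // 2
    }
}
```

The arity of the result is the visible difference: one column holding the object versus one column per field.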





[GitHub] [flink] hackergin commented on a diff in pull request #23438: [FLINK-33119][table] The pojo result returned be procedure should be Row of fields in the pojo instead of whole pojo object

2023-09-21 Thread via GitHub


hackergin commented on code in PR #23438:
URL: https://github.com/apache/flink/pull/23438#discussion_r1333458365


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/operations/PlannerCallProcedureOperation.java:
##
@@ -277,15 +297,20 @@ static final class CallProcedureResultProvider implements 
ResultProvider {
 
 private final DataStructureConverter converter;
 private final RowDataToStringConverter toStringConverter;
+
+// a converter to convert internal RowData to Row
+private final @Nullable RowRowConverter rowConverter;
 private final Object[] result;
 
 public CallProcedureResultProvider(
 DataStructureConverter converter,
 RowDataToStringConverter toStringConverter,
+RowRowConverter rowConverter,

Review Comment:
   ```suggestion
   @Nullable RowRowConverter rowConverter,
   ```



##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/operations/PlannerCallProcedureOperation.java:
##
@@ -237,6 +249,14 @@ private TableResultInternal procedureResultToTableResult(
 tableResultType = DataTypes.ROW(DataTypes.FIELD("result", 
tableResultType));
 }
 
+RowRowConverter rowConverter = null;
+// if the output is struct type,
+// we need a row converter to convert the struct value to Row

Review Comment:
   The comment here is a bit confusing. Doesn't RowRowConverter convert between 
RowData and Row, rather than struct values?






[jira] [Closed] (FLINK-32884) PyFlink remote execution should support URLs with paths and https scheme

2023-09-21 Thread Thomas Weise (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise closed FLINK-32884.

Resolution: Fixed

> PyFlink remote execution should support URLs with paths and https scheme
> 
>
> Key: FLINK-32884
> URL: https://issues.apache.org/jira/browse/FLINK-32884
> Project: Flink
>  Issue Type: New Feature
>  Components: Client / Job Submission, Runtime / REST
>Affects Versions: 1.17.1
>Reporter: Elkhan Dadashov
>Assignee: Elkhan Dadashov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Currently, `SUBMIT_ARGS` uses the `remote -m http://:` format. For 
> local execution, `SUBMIT_ARGS=remote -m http://localhost:8081/` works fine, 
> but the format does not support placing the JobManager behind a proxy or 
> using an Ingress to route to a specific Flink cluster based on the URL 
> path. In the current scenario, it expects the JobManager to be reachable 
> at the `http://:/v1/jobs` endpoint. Mapping to a non-root 
> location, 
> `https://:/flink-clusters/namespace/flink_job_deployment/v1/jobs`,
> is not supported.
> This will use changes from 
> [FLINK-32885](https://issues.apache.org/jira/browse/FLINK-32885).
> Since RestClusterClient talks to the JobManager via its REST endpoint, the 
> right format for `SUBMIT_ARGS` is a URL with a path (with support for the https 
> scheme as well).
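
As a rough illustration of what "a URL with a path and an https scheme" means for endpoint resolution, the sketch below uses only `java.net.URI` from the JDK. The class and method names are hypothetical, `example.com` is a placeholder host, and the `/v1/jobs` suffix is taken from the description above; this is not the logic used by RestClusterClient or by the merged PR.

```java
import java.net.URI;

final class JobManagerUrlSketch {

    // Builds the jobs endpoint from a JobManager URL that may carry a path prefix.
    // Accepts http and https; trailing slashes on the prefix are dropped.
    static String jobsEndpoint(String managerUrl) {
        URI uri = URI.create(managerUrl);
        String scheme = uri.getScheme();
        if (!"http".equals(scheme) && !"https".equals(scheme)) {
            throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("Missing host in: " + managerUrl);
        }
        String path = uri.getPath() == null ? "" : uri.getPath();
        while (path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        int port = uri.getPort(); // -1 when the URL has no explicit port
        String authority = uri.getHost() + (port == -1 ? "" : ":" + port);
        return scheme + "://" + authority + path + "/v1/jobs";
    }

    public static void main(String[] args) {
        // Root location, as in the local-execution case.
        System.out.println(jobsEndpoint("http://localhost:8081/"));
        // Non-root location behind a proxy or Ingress (placeholder host).
        System.out.println(
                jobsEndpoint("https://example.com/flink-clusters/namespace/flink_job_deployment"));
    }
}
```

The point of the sketch is only that the path prefix has to be preserved when the REST endpoint is derived, rather than assuming the API lives at the root of the host.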





[GitHub] [flink] tweise merged pull request #23406: [FLINK-32884] [flink-clients] PyFlink remote execution should support URLs with paths and https scheme

2023-09-21 Thread via GitHub


tweise merged PR #23406:
URL: https://github.com/apache/flink/pull/23406





[jira] [Commented] (FLINK-33096) Flink on k8s,if one taskmanager pod was crashed,the whole flink job will be failed

2023-09-21 Thread wawa (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767595#comment-17767595
 ] 

wawa commented on FLINK-33096:
--

Sorry, let me correct the previous description: 

There was an exception during the scheduling of the taskManager, and the 
restart strategy took effect. As a result, the Duration value was reset and 
started counting again.

However, due to pod memory overuse, the taskManager pod was killed and evicted 
by Kubernetes. When trying to schedule a new taskManager pod, it failed, 
resulting in the failure of the entire Flink job.

> Flink on k8s,if one taskmanager pod was crashed,the whole flink job will be 
> failed
> --
>
> Key: FLINK-33096
> URL: https://issues.apache.org/jira/browse/FLINK-33096
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.14.3
>Reporter: wawa
>Priority: Major
>
> The Flink version is 1.14.3, and the job is submitted to Kubernetes using the 
> Native Kubernetes application mode. During the scheduling process, when a 
> TaskManager pod crashes due to an exception, Kubernetes will attempt to start 
> a new TaskManager pod. However, the scheduling process is halted immediately, 
> resulting in the entire Flink job being terminated. On the other hand, if the 
> JobManager pod crashes, Kubernetes is able to successfully schedule a new 
> JobManager pod. This observation was made during application usage. Can you 
> please help analyze the underlying issue?





[jira] [Commented] (FLINK-30649) Shutting down MiniCluster times out

2023-09-21 Thread Flaviu Cicio (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767581#comment-17767581
 ] 

Flaviu Cicio commented on FLINK-30649:
--

The issue no longer reproduces, so we only have the logs.

I can compare them with the logs of runs that did not fail; maybe we will find 
a clue about what happened.

Other things that we can try:
 * Verify whether it happened multiple times and check if there's a pattern
 * Access the pod logs, although I'm not sure that's possible

> Shutting down MiniCluster times out
> ---
>
> Key: FLINK-30649
> URL: https://issues.apache.org/jira/browse/FLINK-30649
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes, Test Infrastructure
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: stale-assigned, starter, test-stability
>
> {{Run kubernetes session test (default input)}} failed with a timeout.
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44748=logs=bea52777-eaf8-5663-8482-18fbc3630e81=b2642e3a-5b86-574d-4c8a-f7e2842bfb14=6317]
> It appears that there was some issue with shutting down the pods of the 
> MiniCluster:
> {code:java}
> 2023-01-12T08:22:13.1388597Z timed out waiting for the condition on 
> pods/flink-native-k8s-session-1-7dc9976688-gq788
> 2023-01-12T08:22:13.1390040Z timed out waiting for the condition on 
> pods/flink-native-k8s-session-1-taskmanager-1-1 {code}





[jira] [Closed] (FLINK-33125) Upgrade JOSDK to 4.4.4

2023-09-21 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-33125.
--
Resolution: Fixed

merged to main fd68924f8db5872092177fc77ab07349d688d0fb

> Upgrade JOSDK to 4.4.4
> --
>
> Key: FLINK-33125
> URL: https://issues.apache.org/jira/browse/FLINK-33125
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Nicolas Fraison
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.7.0
>
>
> JOSDK 
> [4.4.4|https://github.com/operator-framework/java-operator-sdk/releases/tag/v4.4.4]
>  contains a fix for the leader election issue we face in our environment.
> Here is more information on the 
> [issue|https://github.com/operator-framework/java-operator-sdk/issues/2056] 
> we faced.





[GitHub] [flink] Zakelly commented on pull request #23441: [FLINK-33060] Fix the javadoc of ListState interfaces about not allowing null value

2023-09-21 Thread via GitHub


Zakelly commented on PR #23441:
URL: https://github.com/apache/flink/pull/23441#issuecomment-1729445124

   @masteryhx  Would you please review this PR? Thanks.





[jira] [Commented] (FLINK-33000) SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory

2023-09-21 Thread Jiabao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767488#comment-17767488
 ] 

Jiabao Sun commented on FLINK-33000:


I think OperationManagerTest and ResultFetcherTest should make the same change 
as well.

> SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using 
> a ThreadFactory
> -
>
> Key: FLINK-33000
> URL: https://issues.apache.org/jira/browse/FLINK-33000
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway, Tests
>Affects Versions: 1.16.2, 1.18.0, 1.17.1, 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: pull-request-available, starter
>
> {{SqlGatewayServiceITCase}} uses an {{ExecutorThreadFactory}} for its 
> asynchronous operations. Instead, one should use {{TestExecutorExtension}} to 
> ensure proper cleanup of threads.
> We might also want to remove the {{AbstractTestBase}} parent class because 
> that uses JUnit4, whereas {{SqlGatewayServiceITCase}} is already based on 
> JUnit5.





[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #680: [FLINK-33125] Bump JOSDK to 4.4.4

2023-09-21 Thread via GitHub


gyfora commented on PR #680:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/680#issuecomment-1729104110

   > How do you manage to update/generate this NOTICE file, should I do it 
manually?
   
   In the core Flink project there is some util for this but usually I just run 
this in the `flink-kubernetes-operator` submodule:
   ```
   mvn clean package -DskipTests | grep Including | sort | cut -d' ' -f3
   ```
   and do it by hand. 
   
   In this case I think only the JOSDK related versions will change.





[GitHub] [flink] mxm commented on pull request #23406: [FLINK-32884] [flink-clients] PyFlink remote execution should support URLs with paths and https scheme

2023-09-21 Thread via GitHub


mxm commented on PR #23406:
URL: https://github.com/apache/flink/pull/23406#issuecomment-1729080927

   @elkhand Thanks for addressing our comments. We need to squash the commits 
prior to merging. If I merge using the GitHub UI, it will use 
elkh...@users.noreply.github.com. This will still associate the commit with 
your GitHub account. If you want to use a different email address, please 
squash and push the resulting commit.





[GitHub] [flink] Jiabao-Sun commented on pull request #23449: [FLINK-33000][sql-gateway] SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory

2023-09-21 Thread via GitHub


Jiabao-Sun commented on PR #23449:
URL: https://github.com/apache/flink/pull/23449#issuecomment-1729078222

   Hi @fsk119, could you help review it when you have time?
   Thanks.





[GitHub] [flink] flinkbot commented on pull request #23449: [FLINK-33000][sql-gateway] SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory

2023-09-21 Thread via GitHub


flinkbot commented on PR #23449:
URL: https://github.com/apache/flink/pull/23449#issuecomment-1729076569

   
   ## CI report:
   
   * 51f70ad4520ec6255aef04caccc258968f7f96bd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-33000) SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory

2023-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33000:
---
Labels: pull-request-available starter  (was: starter)

> SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using 
> a ThreadFactory
> -
>
> Key: FLINK-33000
> URL: https://issues.apache.org/jira/browse/FLINK-33000
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway, Tests
>Affects Versions: 1.16.2, 1.18.0, 1.17.1, 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>  Labels: pull-request-available, starter
>
> {{SqlGatewayServiceITCase}} uses an {{ExecutorThreadFactory}} for its 
> asynchronous operations. Instead, one should use {{TestExecutorExtension}} to 
> ensure proper cleanup of threads.
> We might also want to remove the {{AbstractTestBase}} parent class because 
> that uses JUnit4, whereas {{SqlGatewayServiceITCase}} is already based on 
> JUnit5.





[GitHub] [flink] Jiabao-Sun opened a new pull request, #23449: [FLINK-33000][sql-gateway] SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory

2023-09-21 Thread via GitHub


Jiabao-Sun opened a new pull request, #23449:
URL: https://github.com/apache/flink/pull/23449

   
   
   ## What is the purpose of the change
   
   [FLINK-33000][sql-gateway] SqlGatewayServiceITCase should utilize 
TestExecutorExtension instead of using a ThreadFactory
   
   ## Brief change log
   
SqlGatewayServiceITCase should utilize TestExecutorExtension instead of 
using a ThreadFactory
   
   
   ## Verifying this change
   
   This change is already covered by existing tests
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? ( no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[GitHub] [flink] jiaoqingbo commented on pull request #23127: [FLINK-32223][runtime][security] Add Hive delegation token support And Add HiveServer2DelegationTokenIdentifier to fix compile failed in h

2023-09-21 Thread via GitHub


jiaoqingbo commented on PR #23127:
URL: https://github.com/apache/flink/pull/23127#issuecomment-1729060548

   > Just arrived back here, was busy lately. Please add the original PR 
description to this PR + the change what solved the hadoop_313 issue.
   
   Updated, please take a look. Thanks, @gaborgsomogyi.





[jira] [Updated] (FLINK-33125) Upgrade JOSDK to 4.4.4

2023-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33125:
---
Labels: pull-request-available  (was: )

> Upgrade JOSDK to 4.4.4
> --
>
> Key: FLINK-33125
> URL: https://issues.apache.org/jira/browse/FLINK-33125
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Nicolas Fraison
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.7.0
>
>
> JOSDK 
> [4.4.4|https://github.com/operator-framework/java-operator-sdk/releases/tag/v4.4.4]
>  contains a fix for the leader election issue we face in our environment.
> Here is more information on the 
> [issue|https://github.com/operator-framework/java-operator-sdk/issues/2056] 
> we faced.





[GitHub] [flink-kubernetes-operator] ashangit commented on pull request #680: [FLINK-33125] Bump JOSDK to 4.4.4

2023-09-21 Thread via GitHub


ashangit commented on PR #680:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/680#issuecomment-1729001237

   How do you manage to update/generate this NOTICE file, should I do it 
manually?





[GitHub] [flink-web] victor9309 closed pull request #675: [hotfix] cannot jump to /zh/how-to-contribute/contribute-code/

2023-09-21 Thread via GitHub


victor9309 closed pull request #675: [hotfix] cannot jump to 
/zh/how-to-contribute/contribute-code/
URL: https://github.com/apache/flink-web/pull/675





[jira] [Assigned] (FLINK-33007) Integrate autoscaler config validation into the general validator flow

2023-09-21 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-33007:
--

Assignee: Praneeth Ramesh

> Integrate autoscaler config validation into the general validator flow
> --
>
> Key: FLINK-33007
> URL: https://issues.apache.org/jira/browse/FLINK-33007
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Praneeth Ramesh
>Priority: Major
> Fix For: kubernetes-operator-1.7.0
>
>
> Currently, autoscaler configs are not validated at all, so invalid values 
> cause runtime failures of the autoscaler mechanism. 
> We should create a custom autoscaler config validator plugin and hook it up 
> to the core validation flow.
>  
> As part of this, we should start validating the percentage-based config ranges.
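
A range check of the kind mentioned above could look roughly like the following sketch. It is plain Java with no operator dependencies; the class name, the method name, and the config key used in `main` are all hypothetical, and the assumption that percentage-based options are expressed as fractions in [0.0, 1.0] is mine rather than something stated in the issue.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

final class PercentageConfigValidatorSketch {

    // Returns an error message if the option is set but is not a number in
    // [0.0, 1.0]; returns empty if the option is unset or valid.
    static Optional<String> validateFraction(Map<String, String> conf, String key) {
        String raw = conf.get(key);
        if (raw == null) {
            return Optional.empty(); // unset: nothing to validate here
        }
        double value;
        try {
            value = Double.parseDouble(raw);
        } catch (NumberFormatException e) {
            return Optional.of(key + " is not a number: " + raw);
        }
        if (Double.isNaN(value) || value < 0.0 || value > 1.0) {
            return Optional.of(key + " must be between 0.0 and 1.0, got " + raw);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical key, used only for illustration.
        String key = "example.autoscaler.target-utilization";
        System.out.println(validateFraction(Collections.singletonMap(key, "0.7"), key));
        System.out.println(validateFraction(Collections.singletonMap(key, "70"), key));
    }
}
```

Reporting the problem as an optional error message, instead of throwing, lets a caller collect the result at validation time so that a bad value is rejected up front rather than failing later inside the autoscaler.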





[GitHub] [flink] Tartarus0zm commented on pull request #23448: [FLINK-33050][table] Atomicity is not supported prompting the user to disable

2023-09-21 Thread via GitHub


Tartarus0zm commented on PR #23448:
URL: https://github.com/apache/flink/pull/23448#issuecomment-1728933478

   @luoyuxia CI has passed, PTAL, thanks





[jira] [Commented] (FLINK-33123) Wrong dynamic replacement of partitioner from FORWARD to REBLANCE for autoscaler and adaptive scheduler

2023-09-21 Thread Zhanghao Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767397#comment-17767397
 ] 

Zhanghao Chen commented on FLINK-33123:
---

Hi [~fanrui], thanks for sharing the issues. For issue 1: directly updating the 
execution graph should also work. For issue 2: this is indeed an issue, and I 
don't know whether there is a good solution to it if we use REBALANCE. But I 
think it would be better to use RESCALE with some help from the autoscaling 
algorithm side. Internally, we chose to change it from FORWARD to RESCALE 
instead, for the following reasons:
 * REBALANCE takes quite a lot of network memory, which can lead to memory 
issues after rescaling, especially if the parallelism is large. It may also 
degrade performance due to the extra shuffle.
 * FORWARD and RESCALE are actually interchangeable: they share the same 
shuffle behavior under the same upstream and downstream parallelism setting. 
This avoids the problems mentioned in issue 2 here.
 * To address the concern that RESCALE might lead to imbalanced data on the 
downstream side, we introduced an improvement on the autoscaling algorithm side 
to make the upstream and downstream task parallelisms multiples of each other.

> Wrong dynamic replacement of partitioner from FORWARD to REBLANCE for 
> autoscaler and adaptive scheduler
> ---
>
> Key: FLINK-33123
> URL: https://issues.apache.org/jira/browse/FLINK-33123
> Project: Flink
>  Issue Type: Bug
>  Components: Autoscaler, Runtime / Coordination
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Zhanghao Chen
>Priority: Critical
> Attachments: image-2023-09-20-15-09-22-733.png, 
> image-2023-09-20-15-14-04-679.png
>
>
> *Background*
> https://issues.apache.org/jira/browse/FLINK-30213 reported that the edge is 
> wrong when the parallelism is changed for a vertex with a FORWARD edge, which 
> is used by both the autoscaler and adaptive scheduler where one can change 
> the vertex parallelism dynamically. A fix was applied to dynamically replace 
> the partitioner from FORWARD to REBALANCE on task deployment in 
> {{StreamTask}}: 
>  
> !image-2023-09-20-15-09-22-733.png|width=560,height=221!
> *Problem*
> Unfortunately, the fix is still buggy in two aspects:
>  # The connections between upstream and downstream tasks are determined by 
> the distribution type of the partitioner when generating the execution graph 
> on the JM side. When the edge is FORWARD, the distribution type is POINTWISE, 
> and Flink will try to evenly distribute subpartitions to all downstream 
> tasks. If one wants to change it to REBALANCE, the distribution type has to be 
> changed to ALL_TO_ALL to make all-to-all connections between upstream and 
> downstream tasks. However, the fix did not change the distribution type, so 
> the network connections are set up in the wrong way.
>  # The FORWARD partitioner will be replaced if 
> environment.getWriter(outputIndex).getNumberOfSubpartitions() equals the 
> task parallelism. However, the number of subpartitions here equals the 
> number of downstream tasks of this particular task, which is also determined 
> by the distribution type of the partitioner when generating the execution 
> graph on the JM side. When ceil(downstream task parallelism / upstream task 
> parallelism) = upstream task parallelism, we will have the number of 
> subpartitions = task parallelism. For example, for a topology A (parallelism 
> 2) -> B (parallelism 5), we will have 1 A task with 2 subpartitions and 1 A 
> task with 3 subpartitions; hence 1 task will have its number of 
> subpartitions equal to the task parallelism 2 and skip partitioner 
> replacement. As a result, that task will send data to only one 
> downstream task, as the FORWARD partitioner always sends data to the first 
> subpartition. In fact, for a normal job with a FORWARD edge without any 
> autoscaling action, you will find that the partitioner is changed to 
> REBALANCE internally, as the number of subpartitions always equals 1 in 
> this case.
> !image-2023-09-20-15-14-04-679.png|width=892,height=301!
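
The subpartition counts in the A (parallelism 2) -> B (parallelism 5) example can be reproduced with a small calculation. The sketch below assumes downstream subtasks are assigned to upstream subtasks in contiguous, evenly sized ranges; that is one even distribution consistent with the numbers in the description, not necessarily the exact code Flink uses, and which upstream subtask receives the smaller share may differ. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class PointwiseSketch {

    // For each upstream subtask, counts how many downstream subtasks it is
    // connected to (its number of subpartitions) when `downstream` subtasks
    // are split into contiguous ranges over `upstream` subtasks.
    // Assumes 1 <= upstream <= downstream.
    static int[] subpartitionsPerUpstreamTask(int upstream, int downstream) {
        int[] counts = new int[upstream];
        for (int i = 0; i < upstream; i++) {
            int start = i * downstream / upstream;     // first downstream index (inclusive)
            int end = (i + 1) * downstream / upstream; // last downstream index (exclusive)
            counts[i] = end - start;
        }
        return counts;
    }

    public static void main(String[] args) {
        // A (parallelism 2) -> B (parallelism 5)
        System.out.println(Arrays.toString(subpartitionsPerUpstreamTask(2, 5))); // [2, 3]
        // Equal parallelism: every upstream subtask has exactly one subpartition.
        System.out.println(Arrays.toString(subpartitionsPerUpstreamTask(2, 2))); // [1, 1]
    }
}
```

With 2 -> 5, one upstream subtask ends up with 2 subpartitions, which happens to equal the upstream parallelism of 2. That coincidence is the case the description identifies as slipping through a check that compares the subpartition count with the task parallelism.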





[GitHub] [flink] Jiabao-Sun commented on pull request #23424: [FLINK-32445][runtime] Refactor BlobStoreService's closeAndCleanupAllData to cleanupAllData

2023-09-21 Thread via GitHub


Jiabao-Sun commented on PR #23424:
URL: https://github.com/apache/flink/pull/23424#issuecomment-1728913476

   Hi @XComp, could you help take a look when you have time?
   Thanks.





[jira] [Resolved] (FLINK-33126) Fix EventTimeAllWindowCheckpointingITCase jobName typo

2023-09-21 Thread Weihua Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weihua Hu resolved FLINK-33126.
---
Fix Version/s: 1.19.0
   Resolution: Fixed

master: 4f09bbb39239f5523da8ecb7c7cd5ac9fd34c0e5

> Fix EventTimeAllWindowCheckpointingITCase jobName typo
> --
>
> Key: FLINK-33126
> URL: https://issues.apache.org/jira/browse/FLINK-33126
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.17.1
>Reporter: Yue Ma
>Assignee: Yue Ma
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Fix EventTimeAllWindowCheckpointingITCase jobName Typo 





[GitHub] [flink] huwh merged pull request #23444: [FLINK-33126] Fix EventTimeAllWindowCheckpointingITCase jobName typo

2023-09-21 Thread via GitHub


huwh merged PR #23444:
URL: https://github.com/apache/flink/pull/23444

