[GitHub] [flink-kubernetes-operator] wangyang0918 commented on pull request #177: [FLINK-27303][FLINK-27309] Introduce FlinkConfigManager for efficient config management

2022-04-22 Thread GitBox


wangyang0918 commented on PR #177:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/177#issuecomment-1106075709

   Just a quick note: instead of creating a watch on the ConfigMap, I prefer to 
simply use `ScheduledExecutorService#scheduleAtFixedRate` to detect default 
configuration file changes on the fly. It is similar to `monitorInterval` in 
log4j. Because a naked watch is not stable and could be closed in many cases, 
we would need to use an informer instead, but an informer might be overkill.
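
   For illustration, a minimal sketch of the polling approach (assuming a 
hypothetical reload hook; `ConfigFileWatcher` and `onChange` are illustrative 
names, not operator code):

   ```java
   import java.io.File;
   import java.util.concurrent.Executors;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;

   // Polls the default configuration file at a fixed rate, similar to log4j's
   // monitorInterval; a hypothetical sketch, not the FlinkConfigManager itself.
   public class ConfigFileWatcher {

       private final File configFile;
       private final Runnable onChange;
       private volatile long lastModified;

       public ConfigFileWatcher(File configFile, Runnable onChange) {
           this.configFile = configFile;
           this.onChange = onChange;
           this.lastModified = configFile.lastModified();
       }

       public void start(long intervalSeconds) {
           ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
           executor.scheduleAtFixedRate(
                   () -> {
                       long current = configFile.lastModified();
                       if (current != lastModified) {
                           lastModified = current;
                           onChange.run(); // e.g. trigger a config reload
                       }
                   },
                   intervalSeconds,
                   intervalSeconds,
                   TimeUnit.SECONDS);
       }
   }
   ```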


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27340) [JUnit5 Migration] Module: flink-python

2022-04-22 Thread EMing Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526234#comment-17526234
 ] 

EMing Zhou commented on FLINK-27340:


Hi [~Sergey Nuyanzin] 

    Can I take this ticket?

 

> [JUnit5 Migration] Module: flink-python
> ---
>
> Key: FLINK-27340
> URL: https://issues.apache.org/jira/browse/FLINK-27340
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Tests
>Reporter: Sergey Nuyanzin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] zentol commented on a diff in pull request #19445: [FLINK-27209][build] Half the memory and double forkCount for running unit tests

2022-04-22 Thread GitBox


zentol commented on code in PR #19445:
URL: https://github.com/apache/flink/pull/19445#discussion_r855835204


##
flink-runtime/src/test/java/org/apache/flink/runtime/taskexecutor/TaskExecutorRecoveryITCase.java:
##
@@ -58,7 +58,7 @@
 
 /** Recovery tests for {@link TaskExecutor}. */
 @ExtendWith(TestLoggerExtension.class)
-class TaskExecutorRecoveryTest {
+class TaskExecutorRecoveryITCase {

Review Comment:
   `Could not allocate enough memory segments for NetworkBufferPool (required 
(Mb): 1024, allocated (Mb): 1022, missing (Mb): 2)`
   
   Looks like we're using NETWORK_MEMORY_MAX (1g) by default, which pushes us a 
_bit_ over the memory budget when starting 2 TMs...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] alpreu opened a new pull request, #19552: [FLINK-27185][connectors][formats] Convert formats and connectors modules to assertj

2022-04-22 Thread GitBox


alpreu opened a new pull request, #19552:
URL: https://github.com/apache/flink/pull/19552

   This PR fixes and grandfathers #19425 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] lsyldliu opened a new pull request, #19553: [FLINK-27243][table] Support SHOW PARTITIONS statement for partitioned table

2022-04-22 Thread GitBox


lsyldliu opened a new pull request, #19553:
URL: https://github.com/apache/flink/pull/19553

   
   ## What is the purpose of the change
   
   Support SHOW PARTITIONS statement for partitioned table
   
   ## Brief change log
   
 - *Support SHOW PARTITIONS statement for partitioned table*
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *Added unit tests in SqlToOperationConverterTest*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (docs)
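
   As a usage sketch (hypothetical table and connector, not part of this PR), 
the new statement would be driven through the Table API like this:

   ```java
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;

   public class ShowPartitionsExample {
       public static void main(String[] args) {
           TableEnvironment tEnv =
                   TableEnvironment.create(EnvironmentSettings.inStreamingMode());
           // a partitioned filesystem table; path and schema are illustrative
           tEnv.executeSql(
                   "CREATE TABLE orders (order_id BIGINT, dt STRING) PARTITIONED BY (dt) "
                           + "WITH ('connector' = 'filesystem', 'path' = '/tmp/orders', 'format' = 'csv')");
           // lists all partitions of the partitioned table
           tEnv.executeSql("SHOW PARTITIONS orders").print();
       }
   }
   ```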
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27243) Support SHOW PARTITIONS statement for partitioned table

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27243:
---
Labels: pull-request-available  (was: )

> Support SHOW PARTITIONS statement for partitioned table
> ---
>
> Key: FLINK-27243
> URL: https://issues.apache.org/jira/browse/FLINK-27243
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] alpreu commented on a diff in pull request #19377: [FLINK-27048][test] Add ArchUnit rule that connectors should only depend on public API

2022-04-22 Thread GitBox


alpreu commented on code in PR #19377:
URL: https://github.com/apache/flink/pull/19377#discussion_r855846859


##
flink-architecture-tests/flink-architecture-tests-production/src/main/java/org/apache/flink/architecture/rules/ConnectorRules.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.architecture.rules;
+
+import org.apache.flink.annotation.Public;
+import org.apache.flink.annotation.PublicEvolving;
+
+import com.tngtech.archunit.base.DescribedPredicate;
+import com.tngtech.archunit.core.domain.JavaClass;
+import com.tngtech.archunit.junit.ArchTest;
+import com.tngtech.archunit.lang.ArchRule;
+import com.tngtech.archunit.thirdparty.com.google.common.base.Joiner;
+
+import static com.tngtech.archunit.base.DescribedPredicate.not;
+import static com.tngtech.archunit.core.domain.JavaModifier.PUBLIC;
+import static com.tngtech.archunit.core.domain.properties.HasModifiers.Predicates.modifier;
+import static com.tngtech.archunit.library.freeze.FreezingArchRule.freeze;
+import static org.apache.flink.architecture.common.GivenJavaClasses.noJavaClassesThat;
+import static org.apache.flink.architecture.common.Predicates.areDirectlyAnnotatedWithAtLeastOneOf;
+import static org.apache.flink.architecture.common.SourcePredicates.areProductionCode;
+
+/** Rules for Flink connectors. */
+public class ConnectorRules {
+    private static final String[] CONNECTOR_PACKAGES = {
+        "org.apache.flink.connector..", "org.apache.flink.streaming.connectors.."
+    };
+
+    private static DescribedPredicate<JavaClass> areNotPublicAndResideOutsideOfPackages(
+            String... packageIdentifiers) {
+        return JavaClass.Predicates.resideOutsideOfPackages(packageIdentifiers)
+                .and(
+                        not(areDirectlyAnnotatedWithAtLeastOneOf(
+                                        Public.class, PublicEvolving.class))
+                                .and(not(modifier(PUBLIC

Review Comment:
   I think it's fair to keep it; annotated classes should also be public Java 
classes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-27351) TaskExecutorBuilder configures unreasonable amount of memory

2022-04-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-27351:


 Summary: TaskExecutorBuilder configures unreasonable amount of 
memory
 Key: FLINK-27351
 URL: https://issues.apache.org/jira/browse/FLINK-27351
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Coordination, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.16.0


The TaskExecutorBuilder defines a resource spec where task, off-heap, managed 
and network memory are all set to 1g.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] zentol commented on a diff in pull request #19445: [FLINK-27209][build] Half the memory and double forkCount for running unit tests

2022-04-22 Thread GitBox


zentol commented on code in PR #19445:
URL: https://github.com/apache/flink/pull/19445#discussion_r855849054


##
flink-runtime/src/test/java/org/apache/flink/runtime/taskexecutor/TaskExecutorRecoveryITCase.java:
##
@@ -58,7 +58,7 @@
 
 /** Recovery tests for {@link TaskExecutor}. */
 @ExtendWith(TestLoggerExtension.class)
-class TaskExecutorRecoveryTest {
+class TaskExecutorRecoveryITCase {

Review Comment:
   https://issues.apache.org/jira/browse/FLINK-27351



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #19552: [FLINK-27185][connectors][formats] Convert formats and connectors modules to assertj

2022-04-22 Thread GitBox


flinkbot commented on PR #19552:
URL: https://github.com/apache/flink/pull/19552#issuecomment-1106103895

   
   ## CI report:
   
   * 16349ae9b8c9293dbfbb6034abd58709145565d9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #19553: [FLINK-27243][table] Support SHOW PARTITIONS statement for partitioned table

2022-04-22 Thread GitBox


flinkbot commented on PR #19553:
URL: https://github.com/apache/flink/pull/19553#issuecomment-1106103994

   
   ## CI report:
   
   * bdbe0ba06575e3b32e6b87ebe456f6cf0a57 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #19554: [FLINK-27351][tests] Use sane TM resource spec

2022-04-22 Thread GitBox


flinkbot commented on PR #19554:
URL: https://github.com/apache/flink/pull/19554#issuecomment-1106108502

   
   ## CI report:
   
   * 9b48269ad19f2534e29c719cf8ffdb8c6eb5f1e8 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27351) TaskExecutorBuilder configures unreasonable amount of memory

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27351:
---
Labels: pull-request-available  (was: )

> TaskExecutorBuilder configures unreasonable amount of memory
> 
>
> Key: FLINK-27351
> URL: https://issues.apache.org/jira/browse/FLINK-27351
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Coordination, Tests
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> The TaskExecutorBuilder defines a resource spec where task, off-heap, managed 
> and network memory are all set to 1g.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #177: [FLINK-27303][FLINK-27309] Introduce FlinkConfigManager for efficient config management

2022-04-22 Thread GitBox


gyfora commented on PR #177:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/177#issuecomment-1106114156

   > Just a quick note: instead of creating a watch on the ConfigMap, I prefer 
to simply use `ScheduledExecutorService#scheduleAtFixedRate` to detect default 
configuration file changes on the fly. It is similar to `monitorInterval` in 
log4j. Because a naked watch is not stable and could be closed in many cases, 
we would need to use an informer instead, but an informer might be overkill.
   
   Makes sense @wangyang0918, that's a simple change to make it more robust.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] masteryhx commented on a diff in pull request #19142: [FLINK-23252][changelog] Support recovery from checkpoint after disab…

2022-04-22 Thread GitBox


masteryhx commented on code in PR #19142:
URL: https://github.com/apache/flink/pull/19142#discussion_r855863007


##
flink-runtime/src/main/java/org/apache/flink/runtime/state/AbstractKeyedStateBackend.java:
##
@@ -155,6 +155,25 @@ public AbstractKeyedStateBackend(
 this.keySelectionListeners = new ArrayList<>(1);
 }
 
+// Copy constructor
+public AbstractKeyedStateBackend(AbstractKeyedStateBackend<K> abstractKeyedStateBackend) {

Review Comment:
   I have modified as you suggested.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] masteryhx commented on a diff in pull request #19142: [FLINK-23252][changelog] Support recovery from checkpoint after disab…

2022-04-22 Thread GitBox


masteryhx commented on code in PR #19142:
URL: https://github.com/apache/flink/pull/19142#discussion_r855863884


##
flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/StateChangelogStorageLoader.java:
##
@@ -99,4 +101,24 @@ public static StateChangelogStorage load(
 return factory.createStorage(configuration, metricGroup);
 }
 }
+
+@Nonnull
+public static StateChangelogStorageView load(ChangelogStateHandle changelogStateHandle)

Review Comment:
   replaced `load` with `loadFromStateHandle`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #178: [FLINK-27334] Support auto generate the doc for the `KubernetesOperatorConfigOptions`

2022-04-22 Thread GitBox


wangyang0918 commented on code in PR #178:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/178#discussion_r855853330


##
Dockerfile:
##
@@ -23,21 +23,23 @@ WORKDIR /app
 ENV SHADED_DIR=flink-kubernetes-shaded
 ENV OPERATOR_DIR=flink-kubernetes-operator
 ENV WEBHOOK_DIR=flink-kubernetes-webhook
+ENV DOCS_DIR=flink-kubernetes-docs
 
 RUN mkdir $OPERATOR_DIR $WEBHOOK_DIR
 
 COPY pom.xml .
 COPY $SHADED_DIR/pom.xml ./$SHADED_DIR/
 COPY $WEBHOOK_DIR/pom.xml ./$WEBHOOK_DIR/
 COPY $OPERATOR_DIR/pom.xml ./$OPERATOR_DIR/
+COPY $DOCS_DIR/pom.xml ./$DOCS_DIR/

Review Comment:
   Do we really need to copy `flink-kubernetes-docs` when building the image?



##
flink-kubernetes-docs/README.md:
##
@@ -0,0 +1,35 @@
+
+
+# Documentation generators
+
+This module contains generators that create HTML files directly from Flink 
Kubernetes Operator's source code.
+
+## Configuration documentation
+
+The `ConfigOptionsDocGenerator` can be used to generate a reference of 
`ConfigOptions`. By default, a separate file is generated for each `*Options` 
class found in `org.apache.flink.kubernetes.operator.docs.configuration`. The 
`@ConfigGroups` annotation can be used to generate multiple files from a single 
class.
+
+To integrate an `*Options` class from another package, add another 
module-package argument pair to `ConfigOptionsDocGenerator#LOCATIONS`.
+
The files can be generated by running `mvn package -Dgenerate-config-docs -pl flink-kubernetes-docs -nsu -DskipTests`, and can be integrated into the documentation using `{{< generated/ >}}`.

Review Comment:
   I believe we need to update the `ci.yml` to cover the new generated doc.



##
flink-kubernetes-docs/src/test/java/org/apache/flink/kubernetes/operator/docs/util/TestLoggerExtension.java:
##
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.docs.util;

Review Comment:
   We do not need this class in current project.



##
docs/content/docs/operations/configuration.md:
##
@@ -52,13 +52,4 @@ To learn more about metrics and logging configuration please 
refer to the dedica
 
 ## Operator Configuration Reference
 
-| Key | Default | Type | Description |
-| --- | --- | --- | --- |
-| kubernetes.operator.reconciler.reschedule.interval | 60s | Duration | The interval for the controller to reschedule the reconcile process. |
-| kubernetes.operator.observer.rest-ready.delay | 10s | Duration | Final delay before deployment is marked ready after port becomes accessible. |
-| kubernetes.operator.reconciler.max.parallelism | 5 | Integer | The maximum number of threads running the reconciliation loop. Use -1 for infinite. |
-| kubernetes.operator.observer.progress-check.interval | 10s | Duration | The interval for observing status for in-progress operations such as deployment and savepoints. |
-| kubernetes.operator.observer.savepoint.trigger.grace-period | 10s | Duration | The interval before a savepoint trigger attempt is marked as unsuccessful. |
-| kubernetes.operator.observer.flink.client.timeout | 10s | Duration | The timeout for the observer to wait the flink rest client to return. |
-| kubernetes.operator.reconciler.flink.cancel.job.timeout | 1min | Duration | The timeout for the reconciler to wait for flink to cancel job. |
-| kubernetes.operator.reconciler.flink.cluster.shutdown.timeout | 60s | Duration | The timeout for the reconciler to wait for flink to shutdown cluster. |
+{{< generated/kubernetes_operator_config_configuration >}}

Review Comment:
   Could you please verify locally to make sure the `configuration` page 
renders correctly?



##
flink-kubernetes-docs/src/test/resources/META-INF/services/org.junit.jupiter.api.extension.Extension:
##
@@ -0,0 +1,16 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.

[jira] [Commented] (FLINK-27345) operator does not update related resource when flinkConfiguration, logConfiguration are updated.

2022-04-22 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526255#comment-17526255
 ] 

Yang Wang commented on FLINK-27345:
---

I think updating the flinkConfiguration or logConfiguration in CR could trigger 
the redeploy. Could you please check the operator logs for more information.

> operator does not update related resource when flinkConfiguration, 
> logConfiguration are updated.
> 
>
> Key: FLINK-27345
> URL: https://issues.apache.org/jira/browse/FLINK-27345
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: ChangZhuo Chen (陳昌倬)
>Priority: Major
>
> The CRD FlinkDeployment does not update running jobmanager/taskmanager when 
> flinkConfiguration or logConfiguration are updated. We need to recreate 
> the FlinkDeployment to make it work.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27345) operator does not update related resource when flinkConfiguration, logConfiguration are updated.

2022-04-22 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526255#comment-17526255
 ] 

Yang Wang edited comment on FLINK-27345 at 4/22/22 8:05 AM:


I think updating the flinkConfiguration or logConfiguration in CR could trigger 
the redeploy. Could you please check the operator logs for more information?


was (Author: fly_in_gis):
I think updating the flinkConfiguration or logConfiguration in CR could trigger 
the redeploy. Could you please check the operator logs for more information.

> operator does not update related resource when flinkConfiguration, 
> logConfiguration are updated.
> 
>
> Key: FLINK-27345
> URL: https://issues.apache.org/jira/browse/FLINK-27345
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: ChangZhuo Chen (陳昌倬)
>Priority: Major
>
> The CRD FlinkDeployment does not update running jobmanager/taskmanager when 
> flinkConfiguration or logConfiguration are updated. We need to recreate 
> the FlinkDeployment to make it work.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore

2022-04-22 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526258#comment-17526258
 ] 

Yun Tang commented on FLINK-27155:
--

From my experience, serializing the accesses to the same file could help 
improve the IO performance.
For the time to clean up: once all tasks start to be RUNNING on the 
taskmanager, I think we could safely discard them.

> Reduce multiple reads to the same Changelog file in the same taskmanager 
> during restore
> ---
>
> Key: FLINK-27155
> URL: https://issues.apache.org/jira/browse/FLINK-27155
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
> Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, State changes of different operators in the 
> same taskmanager may be written to the same changelog file, which effectively 
> reduces the number of files and requests to DFS.
> But on the other hand, the current implementation also reads the same 
> changelog file multiple times on recovery. More specifically, the number of 
> times the same changelog file is accessed is related to the number of 
> ChangeSets contained in it. And since each read needs to skip the preceding 
> bytes, this network traffic is also wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple 
> slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same 
> taskmanager during restore.
> One possible approach is to read the changelog file all at once and cache it 
> in memory or local file for a period of time when reading the changelog file.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental 
> checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] .
> Hi [~ym], [~roman], what do you think?
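
For illustration, a minimal sketch of the read-once cache idea (names like 
ChangelogFileCache are placeholders, not Flink classes):
{code:java}
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Reads each changelog file from DFS once and serves later ChangeSet reads
// from the cached bytes; the cache is discarded once all tasks are RUNNING.
public class ChangelogFileCache {

    private final Map<Path, byte[]> cache = new ConcurrentHashMap<>();

    public InputStream open(Path changelogFile, long offset) throws IOException {
        byte[] bytes =
                cache.computeIfAbsent(
                        changelogFile,
                        p -> {
                            try {
                                return Files.readAllBytes(p);
                            } catch (IOException e) {
                                throw new UncheckedIOException(e);
                            }
                        });
        InputStream in = new ByteArrayInputStream(bytes);
        long skipped = in.skip(offset); // position at the requested ChangeSet
        assert skipped == offset;
        return in;
    }

    public void discardAll() {
        cache.clear();
    }
}
{code}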



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] JingGe commented on a diff in pull request #19377: [FLINK-27048][test] Add ArchUnit rule that connectors should only depend on public API

2022-04-22 Thread GitBox


JingGe commented on code in PR #19377:
URL: https://github.com/apache/flink/pull/19377#discussion_r855893792


##
flink-architecture-tests/flink-architecture-tests-production/src/main/java/org/apache/flink/architecture/rules/ConnectorRules.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.architecture.rules;
+
+import org.apache.flink.annotation.Public;
+import org.apache.flink.annotation.PublicEvolving;
+
+import com.tngtech.archunit.base.DescribedPredicate;
+import com.tngtech.archunit.core.domain.JavaClass;
+import com.tngtech.archunit.junit.ArchTest;
+import com.tngtech.archunit.lang.ArchRule;
+import com.tngtech.archunit.thirdparty.com.google.common.base.Joiner;
+
+import static com.tngtech.archunit.base.DescribedPredicate.not;
+import static com.tngtech.archunit.core.domain.JavaModifier.PUBLIC;
+import static com.tngtech.archunit.core.domain.properties.HasModifiers.Predicates.modifier;
+import static com.tngtech.archunit.library.freeze.FreezingArchRule.freeze;
+import static org.apache.flink.architecture.common.GivenJavaClasses.noJavaClassesThat;
+import static org.apache.flink.architecture.common.Predicates.areDirectlyAnnotatedWithAtLeastOneOf;
+import static org.apache.flink.architecture.common.SourcePredicates.areProductionCode;
+
+/** Rules for Flink connectors. */
+public class ConnectorRules {
+    private static final String[] CONNECTOR_PACKAGES = {
+        "org.apache.flink.connector..", "org.apache.flink.streaming.connectors.."
+    };
+
+    private static DescribedPredicate<JavaClass> areNotPublicAndResideOutsideOfPackages(
+            String... packageIdentifiers) {
+        return JavaClass.Predicates.resideOutsideOfPackages(packageIdentifiers)
+                .and(
+                        not(areDirectlyAnnotatedWithAtLeastOneOf(
+                                        Public.class, PublicEvolving.class))
+                                .and(not(modifier(PUBLIC

Review Comment:
   Yeah, it should be fine. The question is: is there a case where a protected 
class outside of the connector packages, annotated with @Public or 
@PublicEvolving, would be used by connectors?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-27352) [JUnit5 Migration] Module: flink-json

2022-04-22 Thread EMing Zhou (Jira)
EMing Zhou created FLINK-27352:
--

 Summary: [JUnit5 Migration] Module: flink-json
 Key: FLINK-27352
 URL: https://issues.apache.org/jira/browse/FLINK-27352
 Project: Flink
  Issue Type: Sub-task
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.16.0
Reporter: EMing Zhou






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27352) [JUnit5 Migration] Module: flink-json

2022-04-22 Thread EMing Zhou (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

EMing Zhou updated FLINK-27352:
---
Affects Version/s: (was: 1.16.0)

> [JUnit5 Migration] Module: flink-json
> -
>
> Key: FLINK-27352
> URL: https://issues.apache.org/jira/browse/FLINK-27352
> Project: Flink
>  Issue Type: Sub-task
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: EMing Zhou
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] alpreu commented on a diff in pull request #19377: [FLINK-27048][test] Add ArchUnit rule that connectors should only depend on public API

2022-04-22 Thread GitBox


alpreu commented on code in PR #19377:
URL: https://github.com/apache/flink/pull/19377#discussion_r855898749


##
flink-architecture-tests/flink-architecture-tests-production/src/main/java/org/apache/flink/architecture/rules/ConnectorRules.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.architecture.rules;
+
+import org.apache.flink.annotation.Public;
+import org.apache.flink.annotation.PublicEvolving;
+
+import com.tngtech.archunit.base.DescribedPredicate;
+import com.tngtech.archunit.core.domain.JavaClass;
+import com.tngtech.archunit.junit.ArchTest;
+import com.tngtech.archunit.lang.ArchRule;
+import com.tngtech.archunit.thirdparty.com.google.common.base.Joiner;
+
+import static com.tngtech.archunit.base.DescribedPredicate.not;
+import static com.tngtech.archunit.core.domain.JavaModifier.PUBLIC;
+import static com.tngtech.archunit.core.domain.properties.HasModifiers.Predicates.modifier;
+import static com.tngtech.archunit.library.freeze.FreezingArchRule.freeze;
+import static org.apache.flink.architecture.common.GivenJavaClasses.noJavaClassesThat;
+import static org.apache.flink.architecture.common.Predicates.areDirectlyAnnotatedWithAtLeastOneOf;
+import static org.apache.flink.architecture.common.SourcePredicates.areProductionCode;
+
+/** Rules for Flink connectors. */
+public class ConnectorRules {
+    private static final String[] CONNECTOR_PACKAGES = {
+        "org.apache.flink.connector..", "org.apache.flink.streaming.connectors.."
+    };
+
+    private static DescribedPredicate<JavaClass> areNotPublicAndResideOutsideOfPackages(
+            String... packageIdentifiers) {
+        return JavaClass.Predicates.resideOutsideOfPackages(packageIdentifiers)
+                .and(
+                        not(areDirectlyAnnotatedWithAtLeastOneOf(
+                                        Public.class, PublicEvolving.class))
+                                .and(not(modifier(PUBLIC

Review Comment:
   I see, yeah let's just remove it, we can always add it back later



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-27345) operator does not update related resource when flinkConfiguration, logConfiguration are updated.

2022-04-22 Thread 陳昌倬


 [ 
https://issues.apache.org/jira/browse/FLINK-27345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChangZhuo Chen (陳昌倬) closed FLINK-27345.

Resolution: Invalid

> operator does not update related resource when flinkConfiguration, 
> logConfiguration are updated.
> 
>
> Key: FLINK-27345
> URL: https://issues.apache.org/jira/browse/FLINK-27345
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: ChangZhuo Chen (陳昌倬)
>Priority: Major
>
> The CRD FlinkDeployment does not update running jobmanager/taskmanager when 
> flinkConfiguration or logConfiguration are updated. We need to recreate 
> the FlinkDeployment to make it work.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27345) operator does not update related resource when flinkConfiguration, logConfiguration are updated.

2022-04-22 Thread 陳昌倬


[ 
https://issues.apache.org/jira/browse/FLINK-27345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526264#comment-17526264
 ] 

ChangZhuo Chen (陳昌倬) commented on FLINK-27345:
--

Looks like it can be updated now, thanks for the help.

> operator does not update related resource when flinkConfiguration, 
> logConfiguration are updated.
> 
>
> Key: FLINK-27345
> URL: https://issues.apache.org/jira/browse/FLINK-27345
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: ChangZhuo Chen (陳昌倬)
>Priority: Major
>
> The CRD FlinkDeployment does not update running jobmanager/taskmanager when 
> flinkConfiguration or logConfiguration are updated. We need to recreate 
> the FlinkDeployment to make it work.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Closed] (FLINK-27215) JDBC sink transiently deleted a record because of -u message of that record

2022-04-22 Thread tim yu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tim yu closed FLINK-27215.
--
Resolution: Not A Problem

> JDBC sink transiently deleted a record because of -u message of that record
> ---
>
> Key: FLINK-27215
> URL: https://issues.apache.org/jira/browse/FLINK-27215
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.12.7, 1.13.5, 1.14.3
>Reporter: tim yu
>Priority: Major
>
> A record is deleted transiently when using the JDBC sink in upsert mode.
> The -U message is processed as a delete operation in the class 
> TableBufferReducedStatementExecutor.
> The following code shows how the -U message is processed:
> {code:java}
>     /**
>      * Returns true if the row kind is INSERT or UPDATE_AFTER, returns false if the row kind is
>      * DELETE or UPDATE_BEFORE.
>      */
>     private boolean changeFlag(RowKind rowKind) {
>         switch (rowKind) {
>             case INSERT:
>             case UPDATE_AFTER:
>                 return true;
>             case DELETE:
>             case UPDATE_BEFORE:
>                 return false;
>             default:
>                 throw new UnsupportedOperationException(
>                         String.format(
>                                 "Unknown row kind, the supported row kinds is: INSERT, UPDATE_BEFORE, UPDATE_AFTER,"
>                                         + " DELETE, but get: %s.",
>                                 rowKind));
>         }
>     }
>
>     @Override
>     public void executeBatch() throws SQLException {
>         for (Map.Entry<RowData, Tuple2<Boolean, RowData>> entry : reduceBuffer.entrySet()) {
>             if (entry.getValue().f0) {
>                 upsertExecutor.addToBatch(entry.getValue().f1);
>             } else {
>                 // delete by key
>                 deleteExecutor.addToBatch(entry.getKey());
>             }
>         }
>         upsertExecutor.executeBatch();
>         deleteExecutor.executeBatch();
>         reduceBuffer.clear();
>     }
> {code}
> If the -U and +U messages of one record are executed separately in different JDBC 
> batches, that record will be transiently deleted in the external database and 
> then re-inserted as a new updated record. In fact, this record should merely be 
> updated once in the external database.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] gaoyunhaii commented on pull request #19512: [FLINK-23902][connectors/hive] Make flink support Hive 3.1.3 version

2022-04-22 Thread GitBox


gaoyunhaii commented on PR #19512:
URL: https://github.com/apache/flink/pull/19512#issuecomment-1106209669

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zentol opened a new pull request, #19555: [FLINK-23659][metrics][prometheus] Cleanup code

2022-04-22 Thread GitBox


zentol opened a new pull request, #19555:
URL: https://github.com/apache/flink/pull/19555

   General cleanup in the prometheus reporter code base. Removes various 
warnings, simplifies some tests and reduces dependencies on runtime-internals.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-23659) Cleanup Prometheus reporter

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-23659:
---
Labels: pull-request-available  (was: )

> Cleanup Prometheus reporter
> ---
>
> Key: FLINK-23659
> URL: https://issues.apache.org/jira/browse/FLINK-23659
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore

2022-04-22 Thread Feifan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526281#comment-17526281
 ] 

Feifan Wang commented on FLINK-27155:
-

Thanks for your reply [~yunta]. I think we can serialize all accesses to the 
same file in a separate ticket once this ticket is resolved.

 

Also, I don't understand what the following sentence means: as far as I know, 
downloading and applying changelog files all happen in the RUNNING state, so 
at what point do you mean we could discard the local cache file?
{quote}For the time to clean up: once all tasks start to be RUNNING on the 
taskmanager, I think we could safely discard them.
{quote}
 
 

> Reduce multiple reads to the same Changelog file in the same taskmanager 
> during restore
> ---
>
> Key: FLINK-27155
> URL: https://issues.apache.org/jira/browse/FLINK-27155
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
> Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, State changes of different operators in the 
> same taskmanager may be written to the same changelog file, which effectively 
> reduces the number of files and requests to DFS.
> But on the other hand, the current implementation also reads the same 
> changelog file multiple times on recovery. More specifically, the number of 
> times the same changelog file is accessed is related to the number of 
> ChangeSets contained in it. And since each read needs to skip the preceding 
> bytes, this network traffic is also wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple 
> slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same 
> taskmanager during restore.
> One possible approach is to read the changelog file all at once and cache it 
> in memory or local file for a period of time when reading the changelog file.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental 
> checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] .
> Hi [~ym], [~roman], what do you think?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore

2022-04-22 Thread Feifan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526281#comment-17526281
 ] 

Feifan Wang edited comment on FLINK-27155 at 4/22/22 9:08 AM:
--

Thanks for your reply [~yunta]. I think we can serialize all accesses to the 
same file in a separate ticket once this ticket is resolved.

 

Also, I don't understand what the following sentence means: as far as I know, 
downloading and applying changelog files all happen in the RUNNING state, so 
at what point do you mean we could discard the local cache file?
{quote}For the time to clean up: once all tasks start to be RUNNING on the 
taskmanager, I think we could safely discard them.
{quote}


was (Author: feifan wang):
Thanks for your reply [~yunta]. I think we can serialize all accesses to the 
same file in a separate ticket once this ticket is resolved.

 

Also, I don't understand what the following sentence means: as far as I know, 
downloading and applying changelog files all happen in the RUNNING state, so 
at what point do you mean we could discard the local cache file?
{quote}For the time to clean up: once all tasks start to be RUNNING on the 
taskmanager, I think we could safely discard them.
{quote}
 
 

> Reduce multiple reads to the same Changelog file in the same taskmanager 
> during restore
> ---
>
> Key: FLINK-27155
> URL: https://issues.apache.org/jira/browse/FLINK-27155
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
> Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, State changes of different operators in the 
> same taskmanager may be written to the same changelog file, which effectively 
> reduces the number of files and requests to DFS.
> But on the other hand, the current implementation also reads the same 
> changelog file multiple times on recovery. More specifically, the number of 
> times the same changelog file is accessed is related to the number of 
> ChangeSets contained in it. And since each read needs to skip the preceding 
> bytes, this network traffic is also wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple 
> slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same 
> taskmanager during restore.
> One possible approach is to read the changelog file all at once and cache it 
> in memory or local file for a period of time when reading the changelog file.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental 
> checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] .
> Hi [~ym], [~roman], what do you think?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-training] NicoK merged pull request #46: [FLINK-26382]Add Chinese documents for flink-training exercises

2022-04-22 Thread GitBox


NicoK merged PR #46:
URL: https://github.com/apache/flink-training/pull/46


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #19555: [FLINK-23659][metrics][prometheus] Cleanup code

2022-04-22 Thread GitBox


flinkbot commented on PR #19555:
URL: https://github.com/apache/flink/pull/19555#issuecomment-1106223761

   
   ## CI report:
   
   * 81e659448db7c78ee65291996ada0fc9d48f4081 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] openinx opened a new pull request, #99: [FLINK-27307] Flink table store support append-only ingestion without primary keys.

2022-04-22 Thread GitBox


openinx opened a new pull request, #99:
URL: https://github.com/apache/flink-table-store/pull/99

   This PR is trying to give the table store the ability to accept append-only 
ingestion without any defined primary keys.
   
   The previous table store abstraction was built on top of primary keys, so 
in theory the whole read & write path will need to be reconsidered or 
refactored, so that we can abstract the correct API which works fine for both 
primary-key storage and immutable logs (without primary keys). 
   
   The current version is a draft PR (actually, I'm not quite familiar with 
the flink-table-store project, so I'm trying to implement this append-only 
abstraction to understand the API & implementation better). 
   
   There are TODO issues that I didn't consider clearly in this PR (I think I 
will need a follow-up update to address these): 
   
   1.  The append-only table's file-level statistics are quite different from 
the primary-key tables'. For example, the primary-key tables generate a 
collection of `SstFileMeta` when calling `writer#prepareCommit()`, and then 
accomplish the first-stage commit in Flink's two-phase commit. The 
`SstFileMeta` includes the statistics for both key fields and value fields, 
while an append-only table doesn't have any key fields (its statistics should 
include all columns' max-min, count, etc.). So in theory, we are required to 
abstract a common file-level statistics data structure for the two kinds of 
table; 
   
   2.  The different manifest designs for the two kinds of tables.
   
   3.  The read API abstraction for the two kinds of tables. I still don't 
have a clear proposal for it; will try to update this PR with it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27307) Flink table store support append-only ingestion without primary keys.

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27307:
---
Labels: pull-request-available  (was: )

> Flink table store support append-only ingestion without primary keys.
> -
>
> Key: FLINK-27307
> URL: https://issues.apache.org/jira/browse/FLINK-27307
> Project: Flink
>  Issue Type: New Feature
>  Components: Table Store
>Reporter: Zheng Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.2.0
>
>
> Currently, flink table store only supports row data ingestion with defined 
> primary keys. Those tables are quite suitable for maintaining CDC events or 
> a flink upsert stream. 
> But in fact, there are many real scenarios which don't have the required 
> primary keys, such as user click logs; we only need to maintain those data 
> like a Hive/Iceberg table. I mean we don't need to define primary keys, and 
> we don't need to maintain all those rows in a sorted LSM. We only need to 
> partition those rows into the correct partitions and maintain the basic 
> statistics to speed up the query. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27215) JDBC sink transiently deleted a record because of -u message of that record

2022-04-22 Thread tim yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526283#comment-17526283
 ] 

tim yu commented on FLINK-27215:


Hi [~fsk119], I have seen some classes like 
FlinkChangelogModeInferenceProgram, StreamPhysicalDropUpdateBefore and 
StreamExecDropUpdateBefore. Thank you. 

> JDBC sink transiently deleted a record because of -u message of that record
> ---
>
> Key: FLINK-27215
> URL: https://issues.apache.org/jira/browse/FLINK-27215
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.12.7, 1.13.5, 1.14.3
>Reporter: tim yu
>Priority: Major
>
> A record is deleted transiently when using the JDBC sink in upsert mode.
> The -U message is processed as a delete operation in the class 
> TableBufferReducedStatementExecutor.
> The following code shows how the -U message is processed:
> {code:java}
>     /**
>      * Returns true if the row kind is INSERT or UPDATE_AFTER, returns false if the row kind is
>      * DELETE or UPDATE_BEFORE.
>      */
>     private boolean changeFlag(RowKind rowKind) {
>         switch (rowKind) {
>             case INSERT:
>             case UPDATE_AFTER:
>                 return true;
>             case DELETE:
>             case UPDATE_BEFORE:
>                 return false;
>             default:
>                 throw new UnsupportedOperationException(
>                         String.format(
>                                 "Unknown row kind, the supported row kinds is: INSERT, UPDATE_BEFORE, UPDATE_AFTER,"
>                                         + " DELETE, but get: %s.",
>                                 rowKind));
>         }
>     }
>
>     @Override
>     public void executeBatch() throws SQLException {
>         for (Map.Entry<RowData, Tuple2<Boolean, RowData>> entry : reduceBuffer.entrySet()) {
>             if (entry.getValue().f0) {
>                 upsertExecutor.addToBatch(entry.getValue().f1);
>             } else {
>                 // delete by key
>                 deleteExecutor.addToBatch(entry.getKey());
>             }
>         }
>         upsertExecutor.executeBatch();
>         deleteExecutor.executeBatch();
>         reduceBuffer.clear();
>     }
> {code}
> If the -U and +U messages of one record are executed separately in different JDBC 
> batches, that record will be transiently deleted in the external database and 
> then re-inserted as a new updated record. In fact, this record should merely be 
> updated once in the external database.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (FLINK-26382) Add Chinese documents for flink-training exercises

2022-04-22 Thread Nico Kruber (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber resolved FLINK-26382.
-
Fix Version/s: 1.15.0
   1.14.5
   (was: 1.14.3)
 Assignee: tonny
   Resolution: Fixed

Fixed on
- release-1.14: 0132dd7be8c881607f9a374613309493ade8c6dd, 
18e6db2206ca4156e21276b14d35bebaf222c151
- master: aca6c47b79d486eb38969492c7e2dc8cb200d146

> Add Chinese documents for flink-training exercises
> --
>
> Key: FLINK-26382
> URL: https://issues.apache.org/jira/browse/FLINK-26382
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation / Training / Exercises
>Affects Versions: 1.14.3
>Reporter: tonny
>Assignee: tonny
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.14.5
>
>
> Provide Chinese documents for all `README` and `DISCUSSION` files, matching 
> the Chinese documentation of Flink. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-24144) Improve DataGenerator to prevent excessive creation of new Random objects

2022-04-22 Thread Nico Kruber (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-24144:

Priority: Not a Priority  (was: Major)

> Improve DataGenerator to prevent excessive creation of new Random objects
> -
>
> Key: FLINK-24144
> URL: https://issues.apache.org/jira/browse/FLINK-24144
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation / Training / Exercises
>Affects Versions: 1.14.0, 1.13.2
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Not a Priority
>
> For a couple of methods in {{DataGenerator}}, new {{Random}} objects are 
> created with a specific seed. Instead, we could create a single {{Random}} 
> object and reset the seed when needed.
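
For illustration, a minimal sketch of the proposed reuse (the class, field and 
method names are illustrative, not the actual DataGenerator API):
{code:java}
import java.util.Random;

public class SeededValues {

    // one shared instance instead of a new Random(seed) per method call
    private final Random random = new Random();

    public long randomLongWithSeed(long seed, long bound) {
        random.setSeed(seed); // reset the seed when needed
        return Math.floorMod(random.nextLong(), bound);
    }
}
{code}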



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (FLINK-24144) Improve DataGenerator to prevent excessive creation of new Random objects

2022-04-22 Thread Nico Kruber (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber reassigned FLINK-24144:
---

Assignee: (was: Nico Kruber)

> Improve DataGenerator to prevent excessive creation of new Random objects
> -
>
> Key: FLINK-24144
> URL: https://issues.apache.org/jira/browse/FLINK-24144
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation / Training / Exercises
>Affects Versions: 1.14.0, 1.13.2
>Reporter: Nico Kruber
>Priority: Not a Priority
>
> For a couple of methods in {{DataGenerator}}, new {{Random}} objects are 
> created with a specific seed. Instead, we could create a single {{Random}} 
> object and reset the seed when needed.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-ml] HuangXingBo closed pull request #88: [FLINK-26268][ml][python] Add classfication algorithm support for LogisticRegression, KNN and NaiveBayes in ML Python API

2022-04-22 Thread GitBox


HuangXingBo closed pull request #88: [FLINK-26268][ml][python] Add 
classfication algorithm support for LogisticRegression, KNN and NaiveBayes in 
ML Python API
URL: https://github.com/apache/flink-ml/pull/88


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-26268) Add classfication algorithm support for LogisticRegression, KNN and NaiveBayes in ML Python API

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26268:
---
Labels: pull-request-available  (was: )

> Add classfication algorithm support for LogisticRegression, KNN and 
> NaiveBayes in ML Python API
> ---
>
> Key: FLINK-26268
> URL: https://issues.apache.org/jira/browse/FLINK-26268
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Library / Machine Learning
>Affects Versions: ml-2.1.0
>Reporter: Huang Xingbo
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-2.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Closed] (FLINK-26268) Add classfication algorithm support for LogisticRegression, KNN and NaiveBayes in ML Python API

2022-04-22 Thread Huang Xingbo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huang Xingbo closed FLINK-26268.

Resolution: Done

Merged into master via aedf66335ff3475e449ebeee795d43372c9fa703

> Add classfication algorithm support for LogisticRegression, KNN and 
> NaiveBayes in ML Python API
> ---
>
> Key: FLINK-26268
> URL: https://issues.apache.org/jira/browse/FLINK-26268
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Library / Machine Learning
>Affects Versions: ml-2.1.0
>Reporter: Huang Xingbo
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-2.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (FLINK-26268) Add classfication algorithm support for LogisticRegression, KNN and NaiveBayes in ML Python API

2022-04-22 Thread Huang Xingbo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huang Xingbo reassigned FLINK-26268:


Assignee: Huang Xingbo

> Add classfication algorithm support for LogisticRegression, KNN and 
> NaiveBayes in ML Python API
> ---
>
> Key: FLINK-26268
> URL: https://issues.apache.org/jira/browse/FLINK-26268
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Library / Machine Learning
>Affects Versions: ml-2.1.0
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-2.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27348) Flink KafkaSource doesn't set groupId

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-27348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526291#comment-17526291
 ] 

Ahmet Gürbüz commented on FLINK-27348:
--

How will I do that with KafkaSource? [~Jiangang] 

> Flink KafkaSource doesn't set groupId
> -
>
> Key: FLINK-27348
> URL: https://issues.apache.org/jira/browse/FLINK-27348
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.14.4
> Environment: OS: windows 8.1.
> Java version:
> java version "11.0.13" 2021-10-19 LTS
> Java(TM) SE Runtime Environment 18.9 (build 11.0.13+10-LTS-370)
> Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.13+10-LTS-370, mixed mode)
>  
>  
>Reporter: Ahmet Gürbüz
>Priority: Major
> Attachments: image-2022-04-22-05-43-06-475.png, 
> image-2022-04-22-05-44-56-494.png, image-2022-04-22-05-46-45-592.png, 
> image-2022-04-22-05-52-04-760.png
>
>
> I have one very simple Flink application. I have installed Kafka locally 
> and I am reading data from Kafka with Flink. I am using the KafkaSource class in 
> Flink. Although I have assigned a groupId with setGroupId, this groupId does 
> not appear in Kafka.
>  
> {code:java}
> object FlinkKafkaSource extends App {
>   val env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI()
>   case class Event(partitionNo:Long, eventTime:String, eventTimestamp:Long, 
> userId:String, firstName:String)
>   implicit val readsEvent: Reads[Event] = Json.reads[Event]
>   env
> .fromSource(KafkaSource.builder[Event]
>   .setBootstrapServers("localhost:9092")
>   .setTopics("flink-connection")
>   .setGroupId("test-group") // I can't see this groupId in 
> kafka-consumer-groups
>   .setStartingOffsets(OffsetsInitializer.latest)
>   .setDeserializer(new KafkaRecordDeserializationSchema[Event] {
> override def deserialize(record: ConsumerRecord[Array[Byte], 
> Array[Byte]], out: Collector[Event]): Unit = {
>   val rec = record.value.map(_.toChar).mkString
>   Try(Json.fromJson[Event](Json.parse(rec)).get) match {
> case Success(event) => out.collect(event)
> case Failure(exception) => println(s"Couldn't parse string: $rec, 
> error: ${exception.toString}")
>   }
> }
> override def getProducedType: TypeInformation[Event] = 
> createTypeInformation[Event]
>   })
>   .build,
>   WatermarkStrategy.noWatermarks[Event],
>   "kafka-source"
> )
> .keyBy(l => l.userId)
> .print
>   env.execute("flink-kafka-source")
> } {code}
> I have created a topic in kafka named "flink-connection".
>  
> I am using a simple kafka-python producer to produce data to the 
> flink-connection topic.
> !image-2022-04-22-05-52-04-760.png!
> I am able to consume data from kafka to flink.
> !image-2022-04-22-05-44-56-494.png!
> But can't see the groupId in kafka-consumer-groups
> !image-2022-04-22-05-46-45-592.png!
> Does anyone have any idea why the groupId is not being set?
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Closed] (FLINK-11694) Enforce space before left curly brace

2022-04-22 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-11694.

Resolution: Fixed

Should be covered by spotless.

> Enforce space before left curly brace
> -
>
> Key: FLINK-11694
> URL: https://issues.apache.org/jira/browse/FLINK-11694
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Chesnay Schepler
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We currently don't enforce spaces before left curly brace (\{), as a result 
> of which the following code block would not raise any error:
> {code}
> void someMethod(){
>   if (someCondition){
>   try {
>   ...
>   } catch (Exception e){
>   ...
>   }
>   }
> }
> {code}
> It is not the aim of this JIRA to forbid {} as a shorthand for empty bodies, 
> such as:
> {code}
> new TypeHint>(){}
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27353) Update training exercises to use Flink 1.15

2022-04-22 Thread Nico Kruber (Jira)
Nico Kruber created FLINK-27353:
---

 Summary: Update training exercises to use Flink 1.15
 Key: FLINK-27353
 URL: https://issues.apache.org/jira/browse/FLINK-27353
 Project: Flink
  Issue Type: New Feature
  Components: Documentation / Training / Exercises
Reporter: Nico Kruber
Assignee: Nico Kruber
 Fix For: 1.15.0






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-training] NicoK opened a new pull request, #48: [FLINK-27353] Update to Flink 1.15

2022-04-22 Thread GitBox


NicoK opened a new pull request, #48:
URL: https://github.com/apache/flink-training/pull/48

   This PR will only go through CI once 1.15.0 is released. Until then, you can 
apply the following change to `build.gradle` to see it in action:
   ```
   flinkVersion = '1.15-SNAPSHOT'
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27353) Update training exercises to use Flink 1.15

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27353:
---
Labels: pull-request-available  (was: )

> Update training exercises to use Flink 1.15
> ---
>
> Key: FLINK-27353
> URL: https://issues.apache.org/jira/browse/FLINK-27353
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation / Training / Exercises
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore

2022-04-22 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526298#comment-17526298
 ] 

Roman Khachatryan commented on FLINK-27155:
---

I think Yun Tang means waiting for all tasks of a TM to switch from INITIALIZING 
to RUNNING.

I think it could be problematic, because the set of tasks is dynamic.

Besides that, because the local space can be limited and the changelog can be 
large, it's better to clean up the cache earlier.

 

Probably, we should combine reference counting and timeouts.

periodic-materialize.interval can still be less than the recovery time.

 

As for serializing file accesses, I'm afraid that without it the ticket could 
introduce a regression. Furthermore, the need to serialize can affect the design 
of the cache, so I'd at least design both at once.
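
As an illustration of combining the two (all names hypothetical): a cached file 
is discarded when its reference count drops to zero, with a periodic 
timeout-based sweep as a fallback for readers that never show up.
{code:java}
import java.util.concurrent.atomic.AtomicInteger;

class CachedChangelogFile {
    private final AtomicInteger refCount = new AtomicInteger();
    private volatile long lastReleaseMillis = System.currentTimeMillis();

    void retain() {
        refCount.incrementAndGet();
    }

    void release() {
        lastReleaseMillis = System.currentTimeMillis();
        if (refCount.decrementAndGet() == 0) {
            discardLocalCopy(); // normal path: last registered reader is done
        }
    }

    // called by a periodic janitor; covers readers that never arrive
    boolean isExpired(long ttlMillis) {
        return refCount.get() == 0
                && System.currentTimeMillis() - lastReleaseMillis > ttlMillis;
    }

    private void discardLocalCopy() {
        // delete the locally cached changelog file
    }
}
{code}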

> Reduce multiple reads to the same Changelog file in the same taskmanager 
> during restore
> ---
>
> Key: FLINK-27155
> URL: https://issues.apache.org/jira/browse/FLINK-27155
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
> Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, State changes of different operators in the 
> same taskmanager may be written to the same changelog file, which effectively 
> reduces the number of files and requests to DFS.
> But on the other hand, the current implementation also reads the same 
> changelog file multiple times on recovery. More specifically, the number of 
> times the same changelog file is accessed is related to the number of 
> ChangeSets contained in it. And since each read needs to skip the preceding 
> bytes, this network traffic is also wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple 
> slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same 
> taskmanager during restore.
> One possible approach is to read the changelog file all at once and cache it 
> in memory or in a local file for a period of time.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental 
> checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] .
> Hi [~ym] , [~roman]  how do you think about ?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27348) Flink KafkaSource doesn't set groupId

2022-04-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-27348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526291#comment-17526291
 ] 

Ahmet Gürbüz edited comment on FLINK-27348 at 4/22/22 9:49 AM:
---

Yes, I found the necessary property for that [~Jiangang]:
{code:java}
.setProperty("enable.auto.commit", "true") {code}
does it, many thanks ...
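
For reference, a minimal sketch of the resulting source (topic, group and 
bootstrap values are placeholders): the group only becomes visible in 
kafka-consumer-groups once offsets are actually committed, e.g. via the Kafka 
client's auto-commit or on Flink checkpoints.
{code:java}
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("localhost:9092")
        .setTopics("flink-connection")
        .setGroupId("test-group")
        .setProperty("enable.auto.commit", "true") // commit without checkpointing
        .setStartingOffsets(OffsetsInitializer.latest())
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();
{code}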


was (Author: JIRAUSER288443):
How will I do that with KafkaSource? [~Jiangang] 

> Flink KafkaSource doesn't set groupId
> -
>
> Key: FLINK-27348
> URL: https://issues.apache.org/jira/browse/FLINK-27348
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.14.4
> Environment: OS: windows 8.1.
> Java version:
> java version "11.0.13" 2021-10-19 LTS
> Java(TM) SE Runtime Environment 18.9 (build 11.0.13+10-LTS-370)
> Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.13+10-LTS-370, mixed mode)
>  
>  
>Reporter: Ahmet Gürbüz
>Priority: Major
> Attachments: image-2022-04-22-05-43-06-475.png, 
> image-2022-04-22-05-44-56-494.png, image-2022-04-22-05-46-45-592.png, 
> image-2022-04-22-05-52-04-760.png
>
>
> I have one very simple Flink application. I have installed Kafka locally 
> and I am reading data from Kafka with Flink. I am using the KafkaSource class in 
> Flink. Although I have assigned a groupId with setGroupId, this groupId does 
> not appear in Kafka.
>  
> {code:java}
> object FlinkKafkaSource extends App {
>   val env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI()
>   case class Event(partitionNo:Long, eventTime:String, eventTimestamp:Long, 
> userId:String, firstName:String)
>   implicit val readsEvent: Reads[Event] = Json.reads[Event]
>   env
> .fromSource(KafkaSource.builder[Event]
>   .setBootstrapServers("localhost:9092")
>   .setTopics("flink-connection")
>   .setGroupId("test-group") // I can't see this groupId in 
> kafka-consumer-groups
>   .setStartingOffsets(OffsetsInitializer.latest)
>   .setDeserializer(new KafkaRecordDeserializationSchema[Event] {
> override def deserialize(record: ConsumerRecord[Array[Byte], 
> Array[Byte]], out: Collector[Event]): Unit = {
>   val rec = record.value.map(_.toChar).mkString
>   Try(Json.fromJson[Event](Json.parse(rec)).get) match {
> case Success(event) => out.collect(event)
> case Failure(exception) => println(s"Couldn't parse string: $rec, 
> error: ${exception.toString}")
>   }
> }
> override def getProducedType: TypeInformation[Event] = 
> createTypeInformation[Event]
>   })
>   .build,
>   WatermarkStrategy.noWatermarks[Event],
>   "kafka-source"
> )
> .keyBy(l => l.userId)
> .print
>   env.execute("flink-kafka-source")
> } {code}
> I have created a topic in kafka named "flink-connection".
>  
> I am using a simple kafka-python producer to produce data to the 
> flink-connection topic.
> !image-2022-04-22-05-52-04-760.png!
> I am able to consume data from kafka to flink.
> !image-2022-04-22-05-44-56-494.png!
> But can't see the groupId in kafka-consumer-groups
> !image-2022-04-22-05-46-45-592.png!
> Does anyone have any idea why the groupId is not being set?
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Closed] (FLINK-27348) Flink KafkaSource doesn't set groupId

2022-04-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-27348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Gürbüz closed FLINK-27348.

Resolution: Done

> Flink KafkaSource doesn't set groupId
> -
>
> Key: FLINK-27348
> URL: https://issues.apache.org/jira/browse/FLINK-27348
> Project: Flink
>  Issue Type: Bug
>  Components: API / Scala
>Affects Versions: 1.14.4
> Environment: OS: windows 8.1.
> Java version:
> java version "11.0.13" 2021-10-19 LTS
> Java(TM) SE Runtime Environment 18.9 (build 11.0.13+10-LTS-370)
> Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.13+10-LTS-370, mixed mode)
>  
>  
>Reporter: Ahmet Gürbüz
>Priority: Major
> Attachments: image-2022-04-22-05-43-06-475.png, 
> image-2022-04-22-05-44-56-494.png, image-2022-04-22-05-46-45-592.png, 
> image-2022-04-22-05-52-04-760.png
>
>
> I have one very simple Flink application. I have installed Kafka locally 
> and I am reading data from Kafka with Flink. I am using the KafkaSource class in 
> Flink. Although I have assigned a groupId with setGroupId, this groupId does 
> not appear in Kafka.
>  
> {code:java}
> object FlinkKafkaSource extends App {
>   val env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI()
>   case class Event(partitionNo:Long, eventTime:String, eventTimestamp:Long, 
> userId:String, firstName:String)
>   implicit val readsEvent: Reads[Event] = Json.reads[Event]
>   env
> .fromSource(KafkaSource.builder[Event]
>   .setBootstrapServers("localhost:9092")
>   .setTopics("flink-connection")
>   .setGroupId("test-group") // I can't see this groupId in 
> kafka-consumer-groups
>   .setStartingOffsets(OffsetsInitializer.latest)
>   .setDeserializer(new KafkaRecordDeserializationSchema[Event] {
> override def deserialize(record: ConsumerRecord[Array[Byte], 
> Array[Byte]], out: Collector[Event]): Unit = {
>   val rec = record.value.map(_.toChar).mkString
>   Try(Json.fromJson[Event](Json.parse(rec)).get) match {
> case Success(event) => out.collect(event)
> case Failure(exception) => println(s"Couldn't parse string: $rec, 
> error: ${exception.toString}")
>   }
> }
> override def getProducedType: TypeInformation[Event] = 
> createTypeInformation[Event]
>   })
>   .build,
>   WatermarkStrategy.noWatermarks[Event],
>   "kafka-source"
> )
> .keyBy(l => l.userId)
> .print
>   env.execute("flink-kafka-source")
> } {code}
> I have created a topic in kafka named "flink-connection".
>  
> I am using a simple kafka-python producer to produce data to the 
> flink-connection topic.
> !image-2022-04-22-05-52-04-760.png!
> I am able to consume data from kafka to flink.
> !image-2022-04-22-05-44-56-494.png!
> But can't see the groupId in kafka-consumer-groups
> !image-2022-04-22-05-46-45-592.png!
> Does anyone have any idea why the groupId is not being set?
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27354) JobMaster still processes requests while terminating

2022-04-22 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-27354:
-

 Summary: JobMaster still processes requests while terminating
 Key: FLINK-27354
 URL: https://issues.apache.org/jira/browse/FLINK-27354
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.14.4, 1.13.6, 1.15.0
Reporter: Matthias Pohl


An issue was reported in the [user 
ML|https://lists.apache.org/thread/5pm3crntmb1hl17h4txnlhjz34clghrg] about the 
JobMaster trying to reconnect to the ResourceManager during shutdown.

The JobMaster is disconnecting from the ResourceManager during shutdown (see 
[JobMaster:1182|https://github.com/apache/flink/blob/da532423487e0534b5fe61f5a02366833f76193a/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java#L1182]).
 This triggers the deregistration of the job in the {{ResourceManager}}. The RM 
responds asynchronously at the end of this deregistration through 
{{disconnectResourceManager}} (see 
[ResourceManager:993|https://github.com/apache/flink/blob/da532423487e0534b5fe61f5a02366833f76193a/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java#L993])
 which will trigger a reconnect on the {{JobMaster}}'s side (see 
[JobMaster::disconnectResourceManager|https://github.com/apache/flink/blob/da532423487e0534b5fe61f5a02366833f76193a/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobMaster.java#L789])
 if it's still around because the {{resourceManagerAddress}} (used in 
{{isConnectingToResourceManager}}) is not cleared. This would only happen 
during a RM leader change.

The {{disconnectResourceManager}} will be ignored if the {{JobMaster}} is gone 
already.

We should add a guard in some way to {{JobMaster}} to avoid reconnecting to 
other components during shutdown. This might not only include the 
ResourceManager connection but might also affect other parts of the 
{{JobMaster}} API.
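
A hypothetical sketch of such a guard (the flag is assumed and the RPC 
signature simplified): callbacks that would trigger a reconnect are ignored 
once shutdown has started.
{code:java}
private volatile boolean shuttingDown = false;

public void disconnectResourceManager(ResourceManagerId resourceManagerId, Exception cause) {
    if (shuttingDown || !isConnectingToResourceManager(resourceManagerId)) {
        return; // stale callback, e.g. the RM's async deregistration response
    }
    reconnectToResourceManager(cause);
}
{code}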





--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27174) Non-null check for bootstrapServers field is incorrect in KafkaSink

2022-04-22 Thread Zhengqi Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526313#comment-17526313
 ] 

Zhengqi Zhang commented on FLINK-27174:
---

[~fpaul] , please take a look

> Non-null check for bootstrapServers field is incorrect in KafkaSink
> ---
>
> Key: FLINK-27174
> URL: https://issues.apache.org/jira/browse/FLINK-27174
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Zhengqi Zhang
>Priority: Major
>  Labels: easyfix
> Attachments: image-2022-04-11-18-11-18-576.png, 
> image-2022-04-11-18-17-48-514.png
>
>
> If the user-supplied kafkaProducerConfig contains bootstrapServers 
> information, there is no need to define the value of this field separately 
> through the setBootstrapServers method. Obviously, the current code doesn't 
> notice this.
> !image-2022-04-11-18-11-18-576.png|width=859,height=261!
>  
> Perhaps we can check bootstrapServers as follows:
> !image-2022-04-11-18-17-48-514.png|width=861,height=322!
>  
> {color:#172b4d}Or check bootstrapServers like KafkaSourceBuilder.{color}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-kubernetes-operator] Aitozi commented on pull request #176: [FLINK-27279] Extract common status interfaces

2022-04-22 Thread GitBox


Aitozi commented on PR #176:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/176#issuecomment-1106331702

   Due to Java type erasure, I cannot completely eliminate the 
`FlinkDeploymentReconciliationStatus` and `FlinkSessionJobReconciliationStatus` 
because I can't get the class at runtime 😔.
   The new interface `ReconcileTarget` is responsible for getting the common view 
of the custom resource. If we have special reconcile update logic for different 
custom resources, we can extend the interface to make it happen. What do you 
think about the current implementation @gyfora 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] afedulov commented on pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-22 Thread GitBox


afedulov commented on PR #19405:
URL: https://github.com/apache/flink/pull/19405#issuecomment-1106341694

   @flinkbot run azure
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #176: [FLINK-27279] Extract common status interfaces

2022-04-22 Thread GitBox


gyfora commented on PR #176:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/176#issuecomment-1106387142

   Thank you, I will check this later today :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27174) Non-null check for bootstrapServers field is incorrect in KafkaSink

2022-04-22 Thread Fabian Paul (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526331#comment-17526331
 ] 

Fabian Paul commented on FLINK-27174:
-

I think that is a valid point. We introduced the setBootstrapServers method to 
keep it consistent with the KafkaSource. Do you want to work on relaxing the 
check so that it also allows the bootstrap servers as part of the properties?

> Non-null check for bootstrapServers field is incorrect in KafkaSink
> ---
>
> Key: FLINK-27174
> URL: https://issues.apache.org/jira/browse/FLINK-27174
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Zhengqi Zhang
>Priority: Major
>  Labels: easyfix
> Attachments: image-2022-04-11-18-11-18-576.png, 
> image-2022-04-11-18-17-48-514.png
>
>
> If the user-supplied kafkaProducerConfig contains bootstrapServers 
> information, there is no need to define the value of this field separately 
> through the setBootstrapServers method. Obviously, the current code doesn't 
> notice this.
> !image-2022-04-11-18-11-18-576.png|width=859,height=261!
>  
> Perhaps we can check bootstrapServers as follows:
> !image-2022-04-11-18-17-48-514.png|width=861,height=322!
>  
> {color:#172b4d}Or check bootstrapServers like KafkaSourceBuilder.{color}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-26114) DefaultScheduler fails fatally in case of an error when shutting down the checkpoint-related resources

2022-04-22 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-26114:
--
Description: 
In contrast to the {{AdaptiveScheduler}}, the {{DefaultScheduler}} fails 
fatally in case of an error while cleaning up the checkpoint-related resources. 
This contradicts our new approach of retrying the cleanup of job-related data 
(see FLINK-25433). Instead, we would want the {{DefaultScheduler}} to return an 
exceptionally completed future with the exception. This enables the 
{{DefaultResourceCleaner}} to trigger a retry.

Both scheduler implementations do not expose the error during shutdown of the 
{{CompletedCheckpointStore}} or {{CheckpointIDCounter}} right now. This would 
need to be addressed as well.

  was:In contrast to the {{AdaptiveScheduler}}, the {{DefaultScheduler}} fails 
fatally in case of an error while cleaning up the checkpoint-related resources. 
This contradicts our new approach of retrying the cleanup of job-related data 
(see FLINK-25433). Instead, we would want the {{DefaultScheduler}} to return an 
exceptionally completed future with the exception. This enables the 
{{DefaultResourceCleaner}} to trigger a retry.


> DefaultScheduler fails fatally in case of an error when shutting down the 
> checkpoint-related resources
> --
>
> Key: FLINK-26114
> URL: https://issues.apache.org/jira/browse/FLINK-26114
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.15.0
>Reporter: Matthias Pohl
>Assignee: Niklas Semmler
>Priority: Critical
>
> In contrast to the {{AdaptiveScheduler}}, the {{DefaultScheduler}} fails 
> fatally in case of an error while cleaning up the checkpoint-related 
> resources. This contradicts our new approach of retrying the cleanup of 
> job-related data (see FLINK-25433). Instead, we would want the 
> {{DefaultScheduler}} to return an exceptionally completed future with the 
> exception. This enables the {{DefaultResourceCleaner}} to trigger a retry.
> Both scheduler implementations do not expose the error during shutdown of the 
> {{CompletedCheckpointStore}} or {{CheckpointIDCounter}} right now. This would 
> need to be addressed as well.
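
A hypothetical sketch (component and method names assumed) of surfacing the 
error as an exceptionally completed future instead of failing fatally, so that 
the {{DefaultResourceCleaner}} can retry:
{code:java}
private CompletableFuture<Void> shutdownCheckpointServices(JobStatus jobStatus) {
    try {
        completedCheckpointStore.shutdown(jobStatus, checkpointsCleaner);
        checkpointIdCounter.shutdown(jobStatus);
        return FutureUtils.completedVoidFuture();
    } catch (Throwable t) {
        // expose the failure to the caller instead of escalating fatally
        return FutureUtils.completedExceptionally(t);
    }
}
{code}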



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (FLINK-26114) DefaultScheduler fails fatally in case of an error when shutting down the checkpoint-related resources

2022-04-22 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl reassigned FLINK-26114:
-

Assignee: (was: Niklas Semmler)

> DefaultScheduler fails fatally in case of an error when shutting down the 
> checkpoint-related resources
> --
>
> Key: FLINK-26114
> URL: https://issues.apache.org/jira/browse/FLINK-26114
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.15.0
>Reporter: Matthias Pohl
>Priority: Critical
>
> In contrast to the {{AdaptiveScheduler}}, the {{DefaultScheduler}} fails 
> fatally in case of an error while cleaning up the checkpoint-related 
> resources. This contradicts our new approach of retrying the cleanup of 
> job-related data (see FLINK-25433). Instead, we would want the 
> {{DefaultScheduler}} to return an exceptionally completed future with the 
> exception. This enables the {{DefaultResourceCleaner}} to trigger a retry.
> Both scheduler implementations do not expose the error during shutdown of the 
> {{CompletedCheckpointStore}} or {{CheckpointIDCounter}} right now. This would 
> need to be addressed as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-kubernetes-operator] SteNicholas commented on a diff in pull request #178: [FLINK-27334] Support auto generate the doc for the `KubernetesOperatorConfigOptions`

2022-04-22 Thread GitBox


SteNicholas commented on code in PR #178:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/178#discussion_r856126535


##
Dockerfile:
##
@@ -23,21 +23,23 @@ WORKDIR /app
 ENV SHADED_DIR=flink-kubernetes-shaded
 ENV OPERATOR_DIR=flink-kubernetes-operator
 ENV WEBHOOK_DIR=flink-kubernetes-webhook
+ENV DOCS_DIR=flink-kubernetes-docs
 
 RUN mkdir $OPERATOR_DIR $WEBHOOK_DIR
 
 COPY pom.xml .
 COPY $SHADED_DIR/pom.xml ./$SHADED_DIR/
 COPY $WEBHOOK_DIR/pom.xml ./$WEBHOOK_DIR/
 COPY $OPERATOR_DIR/pom.xml ./$OPERATOR_DIR/
+COPY $DOCS_DIR/pom.xml ./$DOCS_DIR/

Review Comment:
   @wangyang0918, this only adds the pom.xml of the `flink-kubernetes-docs` 
module, which is referenced in the parent pom.xml. I tried not copying this 
file, but the Maven command then failed because the module does not exist.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-27355) JobManagerRunnerRegistry.localCleanupAsync does not call the JobManagerRunner.close method repeatedly

2022-04-22 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-27355:
-

 Summary: JobManagerRunnerRegistry.localCleanupAsync does not call 
the JobManagerRunner.close method repeatedly
 Key: FLINK-27355
 URL: https://issues.apache.org/jira/browse/FLINK-27355
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.15.0
Reporter: Matthias Pohl


The {{DefaultJobManagerRunnerRegistry.localCleanupAsync}} method deregisters the 
JobManagerRunner and calls close on it. If close fails for whatever reason, it 
will be identified but the next retry would just notice that the 
JobManagerRunner is already deregistered and not do anything.

Hence, JobMaster shutdown won't be retriggered (i.e. errors in the 
{{CompletedCheckpointStore}} or the {{CheckpointIDCounter}} won't be handled). 
FLINK-26114 is related: Both components don't expose any errors right now, 
anyway.
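
A minimal sketch of a retry-safe ordering (method names assumed): close first 
and deregister only after close succeeded, so a failed close is observed again 
on the next cleanup attempt.
{code:java}
public CompletableFuture<Void> localCleanupAsync(JobID jobId, Executor executor) {
    if (!isRegistered(jobId)) {
        return FutureUtils.completedVoidFuture(); // already cleaned up
    }
    return get(jobId)
            .closeAsync()
            .thenRun(() -> unregister(jobId)); // keep the registration on failure
}
{code}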



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-web] NicoK commented on a diff in pull request #526: Announcement blogpost for the 1.15 release

2022-04-22 Thread GitBox


NicoK commented on code in PR #526:
URL: https://github.com/apache/flink-web/pull/526#discussion_r856088102


##
_posts/2022-04-11-1.15-announcement.md:
##
@@ -0,0 +1,431 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.15"
+subtitle: ""
+date: 2022-04-11T08:00:00.000Z
+categories: news
+authors:
+- yungao:
+  name: "Yun Gao"
+  twitter: "YunGao16"
+- joemoe:
+  name: "Joe Moser"
+  twitter: "JoemoeAT"
+
+---
+
+Thanks to our well-organized and open community, Apache Flink continues 
+[to grow](https://www.apache.org/foundation/docs/FY2021AnnualReport.pdf) as a 
+technology and remain one of the most active projects in
+the Apache community. With the release of Flink 1.15, we are proud to announce 
a number of 
+exciting changes.
+
+One of the main concepts that makes Apache Flink stand out is the unification 
of 
+batch (aka bounded) and stream (aka unbounded) data processing. A lot of 
+effort went into this unification in the previous releases but you can expect 
more efforts in this direction. 
+Apache Flink is not only growing when it comes to contributions and users, but
+also out of the original use cases. We are seeing a trend towards more 
business/analytics 
+use cases implemented in low-/no-code. Flink SQL is the feature in the Flink 
ecosystem 
+that enables such use cases and this is why its popularity continues to grow. 
 
+
+Apache Flink is an essential building block in data pipelines/architectures 
and 
+is used with many other technologies in order to drive all sorts of use cases. 
While new ideas/products
+may appear in this domain, existing technologies continue to establish 
themselves as standards for solving 
+mission-critical problems. Knowing that we have such a wide reach and play a 
role in the success of many 
+projects, it is important that the experience of 
+integrating with Apache Flink is as seamless and easy as possible. 
+
+In the 1.15 release the Apache Flink community made significant progress 
across all 
+these areas. Still those are not the only things that made it into 1.15. The 
+contributors improved the experience of operating Apache Flink by making it 
much 
+easier and more transparent to handle checkpoints and savepoints and their 
ownership, 
+making auto scaling more seamless and complete, by removing side effects of 
use cases 
+in which different data sources produce varying amounts of data, and - finally 
- the 
+ability to upgrade SQL jobs without losing the state. By continuing on 
supporting 
+checkpoints after tasks finished and adding window table valued functions in 
batch 
+mode, the experience of unified stream and batch processing was once more 
improved 
+making hybrid use cases way easier. In the SQL space, not only the first step 
in 
+version upgrades have been added but also JSON functions to make it easier to 
import 
+and export structured data in SQL. Both will allow users to better rely on 
Flink SQL 
+for production use cases in the long term. To establish Apache Flink as part 
of the 
+data processing ecosystem we improved the cloud interoperability and added 
more sink 
+connectors and formats. And yes we enabled a Scala-free runtime 
+([the hype is real](https://flink.apache.org/2022/02/22/scala-free.html)).
+
+
+## Operating Apache Flink with ease
+
+Even Flink jobs that have been built and tuned by the best engineering teams 
still need to 
+be operated, usually on a long-term basis. The many deployment 
+patterns, APIs, tuneable configs, and use cases covered by Apache Flink mean 
that operation
+support is vital and can be burdensome.
+
+In this release, we listened to user feedback and now operating Flink is made 
much 
+easier. It is now more transparent in terms of handling checkpoints and 
savepoints and their ownership, 
+which makes auto-scaling more seamless and complete (by removing side effects 
of use cases 
+where different data sources produce varying amounts of data) and enables the  
+ability to upgrade SQL jobs without losing the state. 
+
+
+### Clarification of checkpoint and savepoint semantics
+
+An essential cornerstone of Flink’s fault tolerance strategy is based on 
+[checkpoints](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/checkpoints/)
 and 
+[savepoints](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/)
 (see [the 
comparison](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/checkpoints_vs_savepoints/).
 

Review Comment:
   ```suggestion
   
[savepoints](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/)
 (see [the 
comparison](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/checkpoints_vs_savepoints/)).
 
   ```



##
_posts/2022-04-11-1.15-announcement.md:
##
@@ -0,0 +1,431 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.15"
+subtitle: ""
+date: 2022-04-11T08:00:00.000

[GitHub] [flink] snuyanzin commented on pull request #18109: [FLINK-25284][Table SQL / API] Add proportion of null values to generate with datagen

2022-04-22 Thread GitBox


snuyanzin commented on PR #18109:
URL: https://github.com/apache/flink/pull/18109#issuecomment-1106425611

   @JingsongLi sorry for the poke
   Since you are one of the committers dealing with datagen, could you please 
have a look here, once you have time?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #176: [FLINK-27279] Extract common status interfaces

2022-04-22 Thread GitBox


gyfora commented on code in PR #176:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/176#discussion_r856142721


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/ReconcileTarget.java:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.reconciler;
+
+import org.apache.flink.kubernetes.operator.crd.spec.JobSpec;
+import org.apache.flink.kubernetes.operator.crd.status.ReconciliationStatus;
+
+import javax.annotation.Nullable;
+
+/**
+ * The interface is responsible to handle the reconciliation result. For the 
common logic, it
+ * provides method to extract the common view between the {@link
+ * org.apache.flink.kubernetes.operator.crd.FlinkDeployment} and {@link
+ * org.apache.flink.kubernetes.operator.crd.FlinkSessionJob} to simplify the 
custom resource
+ * manipulation. For the special part of each custom resource, we can extend 
the interface to let
+ * the target custom resource react to the reconciliation result 
correspondingly.
+ *
+ * @param <SPEC> the common view of the custom resource getSpec
+ */
+public interface ReconcileTarget<SPEC> {
+
+/** The common view of the spec. */
+interface SpecView {
+JobSpec getJobSpec();
+}
+
+/**
+ * Get the current getSpec of the custom resource.
+ *
+ * @return the current getSpec.
+ */
+SPEC getSpec();
+
+/**
+ * Get the current reconciliation status.
+ *
+ * @return the current reconciliation status.
+ */
+ReconciliationStatus getReconcileStatus();
+
+/**
+ * Let the target custom resource handle the reconciliation error.
+ *
+ * @param error The error to be handled.
+ */
+void handleError(@Nullable String error);

Review Comment:
   I think these methods should be part of the CommonStatus interface instead. 
Simply have the ReconciliationStatus and error fields there and use get/set.
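
   A minimal sketch of that layout (field names assumed):
   ```java
   public abstract class CommonStatus {
       private ReconciliationStatus reconciliationStatus;
       private String error;

       // plain accessors replace getReconcileStatus()/handleError(...)
       public ReconciliationStatus getReconciliationStatus() { return reconciliationStatus; }
       public void setReconciliationStatus(ReconciliationStatus s) { this.reconciliationStatus = s; }
       public String getError() { return error; }
       public void setError(String error) { this.error = error; }
   }
   ```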



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #176: [FLINK-27279] Extract common status interfaces

2022-04-22 Thread GitBox


gyfora commented on code in PR #176:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/176#discussion_r856144200


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/ReconcileTarget.java:
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.reconciler;
+
+import org.apache.flink.kubernetes.operator.crd.spec.JobSpec;
+import org.apache.flink.kubernetes.operator.crd.status.ReconciliationStatus;
+
+import javax.annotation.Nullable;
+
+/**
+ * The interface is responsible to handle the reconciliation result. For the 
common logic, it
+ * provides method to extract the common view between the {@link
+ * org.apache.flink.kubernetes.operator.crd.FlinkDeployment} and {@link
+ * org.apache.flink.kubernetes.operator.crd.FlinkSessionJob} to simplify the 
custom resource
+ * manipulation. For the special part of each custom resource, we can extend 
the interface to let
+ * the target custom resource react to the reconciliation result 
correspondingly.
+ *
+ * @param <SPEC> the common view of the custom resource getSpec
+ */
+public interface ReconcileTarget<SPEC> {
+
+/** The common view of the spec. */
+interface SpecView {
+JobSpec getJobSpec();
+}
+
+/**
+ * Get the current getSpec of the custom resource.
+ *
+ * @return the current getSpec.
+ */
+SPEC getSpec();
+
+/**
+ * Get the current reconciliation status.
+ *
+ * @return the current reconciliation status.
+ */
+ReconciliationStatus getReconcileStatus();
+
+/**
+ * Let the target custom resource handle the reconciliation error.
+ *
+ * @param error The error to be handled.
+ */
+void handleError(@Nullable String error);

Review Comment:
   That way the ReconcileTarget interface could also be removed and 
`CustomResource` used directly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] fapaul commented on a diff in pull request #19405: [FLINK-27066] Reintroduce e2e tests in ES as Java tests.

2022-04-22 Thread GitBox


fapaul commented on code in PR #19405:
URL: https://github.com/apache/flink/pull/19405#discussion_r856141131


##
flink-end-to-end-tests/flink-end-to-end-tests-elasticsearch7/pom.xml:
##
@@ -30,10 +30,14 @@ under the License.
..

 
-   <artifactId>flink-elasticsearch7-test</artifactId>
-   <name>Flink : E2E Tests : Elasticsearch 7</name>
+   <artifactId>flink-end-to-end-tests-elasticsearch7</artifactId>
+   <name>Flink : E2E Tests : Elasticsearch 7 Java</name>

Review Comment:
   Is `Java` really important here?



##
flink-end-to-end-tests/flink-end-to-end-tests-common-elasticsearch/pom.xml:
##
@@ -43,50 +43,52 @@ under the License.


org.apache.flink
-   flink-connector-elasticsearch6
+   flink-connector-test-utils
${project.version}
+   compile

+   
+   org.apache.flink
+   
flink-connector-elasticsearch-base
+   ${project.version}
+   
+   
+   org.testcontainers
+   elasticsearch
+   ${testcontainers.version}
+   
+   
+   org.apache.flink
+   flink-end-to-end-tests-common
+   ${project.version}
+   test

 
+   
+  
+ 
+org.apache.httpcomponents
+httpcore-nio
+4.4.12
+ 
+  
+   
+



org.apache.maven.plugins
-   maven-shade-plugin
+   maven-jar-plugin


-   
Elasticsearch6SinkExample
+   Jar Package
package

-   shade
+   test-jar

Review Comment:
   Can you avoid introducing more test jars and maybe move the common utilities 
to the main package?



##
flink-end-to-end-tests/flink-end-to-end-tests-common-elasticsearch/src/test/java/org/apache/flink/streaming/tests/ElasticsearchSinkE2ECaseBase.java:
##
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.flink.streaming.tests;
+
+import 
org.apache.flink.connector.testframe.external.DefaultContainerizedExternalSystem;
+import org.apache.flink.connector.testframe.external.ExternalSystemDataReader;
+import org.apache.flink.connector.testframe.junit.annotations.TestEnv;
+import 
org.apache.flink.connector.testframe.junit.annotations.TestExternalSystem;
+import org.apache.flink.connector.testframe.junit.annotations.TestSemantics;
+import org.apache.flink.connector.testframe.testsuites.SinkTestSuiteBase;
+import org.apache.flink.streaming.api.CheckpointingMode;
+import org.apache.flink.tests.util.flink.FlinkContainerTestEnvironment;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.testcontainers.elasticsearch.ElasticsearchContainer;
+import org.testcontainers.utility.DockerImageName;
+
+import java.time.Duration;
+import java.util.Collections;
+import java.util.List;
+
+import static 
org.apache.flink.connector.testframe.utils.CollectIteratorAssertions.assertThat;
+import static 
org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition;
+
+/** Base class for end-to-end ElasticsearchSink tests based on connector 
testing framework. */
+@SuppressWarnings("unused")
+public abstract class ElasticsearchSinkE2ECaseBase<T extends Comparable<T>>
+extends SinkTestSuiteBase<T> {
+
+private static final Logger LOG = 
LoggerFactory.getLogger(ElasticsearchSinkE2ECaseBase.class);
+private static final int READER_RETRY_ATTEMPTS = 10;
+private static final int READER_TIMEOUT = 10;
+
+protected sta

[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #176: [FLINK-27279] Extract common status interfaces

2022-04-22 Thread GitBox


gyfora commented on code in PR #176:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/176#discussion_r856146587


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/crd/status/ReconciliationStatus.java:
##
@@ -54,19 +49,20 @@ public class ReconciliationStatus {
 private ReconciliationState state = ReconciliationState.DEPLOYED;
 
 @JsonIgnore
-public FlinkDeploymentSpec deserializeLastReconciledSpec() {
-return ReconciliationUtils.deserializedSpecWithVersion(
-lastReconciledSpec, FlinkDeploymentSpec.class);
+public abstract Class getSpecClass();

Review Comment:
   You could also simply add a constructor that takes the spec class (you have 
that when you call initStatus by calling getSpec().getClass()). Then you would 
not need 2 subclasses just to implement this.
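
   A sketch of that variant (illustrative only):
   ```java
   public class ReconciliationStatus<SPEC> {
       private final Class<SPEC> specClass;

       public ReconciliationStatus(Class<SPEC> specClass) {
           this.specClass = specClass;
       }

       public Class<SPEC> getSpecClass() {
           return specClass;
       }
   }
   ```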



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan commented on a diff in pull request #19441: [FLINK-27187][state/changelog] Add changelog storage metric totalAttemptsPerUpload

2022-04-22 Thread GitBox


rkhachatryan commented on code in PR #19441:
URL: https://github.com/apache/flink/pull/19441#discussion_r856152839


##
flink-dstl/flink-dstl-dfs/src/test/java/org/apache/flink/changelog/fs/ChangelogStorageMetricsTest.java:
##
@@ -295,6 +346,55 @@ public void close() {
 }
 }
 
+private static class WaitingMaxAttemptUploader implements 
StateChangeUploader {
+private final ConcurrentHashMap 
remainingAttemptsPerTask;
+private final int maxAttempts;
+
+public WaitingMaxAttemptUploader(int maxAttempts) {
+if (maxAttempts < 1) {
+throw new IllegalArgumentException("maxAttempts < 1");
+}
+this.maxAttempts = maxAttempts;
+this.remainingAttemptsPerTask = new ConcurrentHashMap<>();
+}
+
+@Override
+public UploadTasksResult upload(Collection<UploadTask> tasks) throws 
IOException {
+
+for (UploadTask uploadTask : tasks) {
+CountDownLatch remainingAttempts = 
remainingAttemptsPerTask.get(uploadTask);
+if (remainingAttempts == null) {
+remainingAttempts = new CountDownLatch(maxAttempts - 1);
+remainingAttemptsPerTask.put(uploadTask, 
remainingAttempts);
+} else {
+remainingAttempts.countDown();
+}
+}
+for (UploadTask uploadTask : tasks) {
+CountDownLatch remainingAttempts = 
remainingAttemptsPerTask.get(uploadTask);
+try {
+remainingAttempts.await();

Review Comment:
   You're right, completing first and last attempts sounds good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27174) Non-null check for bootstrapServers field is incorrect in KafkaSink

2022-04-22 Thread Zhengqi Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526356#comment-17526356
 ] 

Zhengqi Zhang commented on FLINK-27174:
---

In the current code, if the user does not use the setBootstrapServers method to 
set bootstrapServers, even if he provides it in a separate property, the 
non-null check on bootstrapServers will fail, which is obviously unreasonable. 
In fact, we can just check bootstrapServers in the final property.
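
A minimal sketch of such a relaxed check (builder field names assumed, using 
Kafka's {{ProducerConfig}} constant and Flink's {{checkState}}):
{code:java}
public KafkaSink<IN> build() {
    if (bootstrapServers != null) {
        kafkaProducerConfig.setProperty(
                ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    }
    checkState(
            kafkaProducerConfig.containsKey(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG),
            "Bootstrap servers must be set via setBootstrapServers() or the producer properties.");
    // ... continue building the sink ...
}
{code}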

> Non-null check for bootstrapServers field is incorrect in KafkaSink
> ---
>
> Key: FLINK-27174
> URL: https://issues.apache.org/jira/browse/FLINK-27174
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Zhengqi Zhang
>Priority: Major
>  Labels: easyfix
> Attachments: image-2022-04-11-18-11-18-576.png, 
> image-2022-04-11-18-17-48-514.png
>
>
> If the user-supplied kafkaProducerConfig contains bootstrapServers 
> information, there is no need to define the value of this field separately 
> through the setBootstrapServers method. Obviously, the current code doesn't 
> notice this.
> !image-2022-04-11-18-11-18-576.png|width=859,height=261!
>  
> Perhaps we can check bootstrapServers as follows:
> !image-2022-04-11-18-17-48-514.png|width=861,height=322!
>  
> {color:#172b4d}Or check bootstrapServers like KafkaSourceBuilder.{color}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27174) Non-null check for bootstrapServers field is incorrect in KafkaSink

2022-04-22 Thread Zhengqi Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526356#comment-17526356
 ] 

Zhengqi Zhang edited comment on FLINK-27174 at 4/22/22 11:57 AM:
-

Yes. In the current code, if the user does not use the setBootstrapServers 
method to set bootstrapServers, even if he provides it in a separate property, 
the non-null check on bootstrapServers will fail, which is obviously 
unreasonable. In fact, we can just check bootstrapServers in the final property.


was (Author: tony giao):
In the current code, if the user does not use the setBootstrapServers method to 
set bootstrapServers, even if he provides it in a separate property, the 
non-null check on bootstrapServers will fail, which is obviously unreasonable. 
In fact, we can just check bootstrapServers in the final property.

> Non-null check for bootstrapServers field is incorrect in KafkaSink
> ---
>
> Key: FLINK-27174
> URL: https://issues.apache.org/jira/browse/FLINK-27174
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.4
>Reporter: Zhengqi Zhang
>Priority: Major
>  Labels: easyfix
> Attachments: image-2022-04-11-18-11-18-576.png, 
> image-2022-04-11-18-17-48-514.png
>
>
> If the user-supplied kafkaProducerConfig contains bootstrapServers 
> information, there is no need to define the value of this field separately 
> through the setBootstrapServers method. Obviously, the current code doesn't 
> notice this.
> !image-2022-04-11-18-11-18-576.png|width=859,height=261!
>  
> Perhaps we can check bootstrapServers as follows:
> !image-2022-04-11-18-17-48-514.png|width=861,height=322!
>  
> {color:#172b4d}Or check bootstrapServers like KafkaSourceBuilder.{color}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] zoltar9264 commented on pull request #19441: [FLINK-27187][state/changelog] Add changelog storage metric totalAttemptsPerUpload

2022-04-22 Thread GitBox


zoltar9264 commented on PR #19441:
URL: https://github.com/apache/flink/pull/19441#issuecomment-1106451265

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on pull request #99: [FLINK-27307] Flink table store support append-only ingestion without primary keys.

2022-04-22 Thread GitBox


JingsongLi commented on PR #99:
URL: https://github.com/apache/flink-table-store/pull/99#issuecomment-1106461636

   Can we set the append-only write file to an empty key? This allows for a 
good integration of these two modes.
   Actually, `SstFileMeta` can be renamed to `DataFileMeta`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #99: [FLINK-27307] Flink table store support append-only ingestion without primary keys.

2022-04-22 Thread GitBox


JingsongLi commented on code in PR #99:
URL: https://github.com/apache/flink-table-store/pull/99#discussion_r856178226


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/WriteMode.java:
##
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file;
+
+/** Defines the write mode for flink table store. */
+public enum WriteMode {
+INSERT_ONLY(
+"insert-only",
+"The table can only accept append-only insert operations. All rows 
will be "
++ "inserted into the table store without any deduplication 
or primary/unique key constraint"),
+DELETABLE("deletable", "The table can accept both insert and delete operations.");

Review Comment:
   Maybe just `changelog`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] JingsongLi commented on pull request #99: [FLINK-27307] Flink table store support append-only ingestion without primary keys.

2022-04-22 Thread GitBox


JingsongLi commented on PR #99:
URL: https://github.com/apache/flink-table-store/pull/99#issuecomment-1106466192

   > The different manifest designs for the two kinds of tables.
   
   Can you clarify which parts of the design are different?
   
   > What's the read API abstraction for those two kinds of tables? I still 
don't have a clear proposal for it. Will try to update this PR for this.
   
   A new `RecordReader` too?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia opened a new pull request, #19556: [FLINK-26413][hive] Hive dialect supports "LOAD DATA INPATH"

2022-04-22 Thread GitBox


luoyuxia opened a new pull request, #19556:
URL: https://github.com/apache/flink/pull/19556

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-26413) Hive dialect support "LOAD DATA INPATH"

2022-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26413:
---
Labels: pull-request-available  (was: )

> Hive dialect support "LOAD DATA INPATH" 
> 
>
> Key: FLINK-26413
> URL: https://issues.apache.org/jira/browse/FLINK-26413
> Project: Flink
>  Issue Type: Sub-task
>Reporter: luoyuxia
>Priority: Major
>  Labels: pull-request-available
>
> In Hive, it's supported to use SQL like 
> {code:java}
> LOAD DATA INPATH 
> {code}
> to import data into a Hive table.
> It also needs to be supported when using the Hive dialect in Flink.
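> A minimal sketch of how this could look from the Table API once supported 
> (assumes a HiveCatalog is set up; the table name and path are placeholders):
> {code:java}
> import org.apache.flink.table.api.EnvironmentSettings;
> import org.apache.flink.table.api.SqlDialect;
> import org.apache.flink.table.api.TableEnvironment;
> 
> public class LoadDataExample {
>     public static void main(String[] args) {
>         TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
>         // the Hive dialect requires a registered HiveCatalog
>         tEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
>         // standard Hive syntax; OVERWRITE and PARTITION clauses are optional
>         tEnv.executeSql("LOAD DATA INPATH '/tmp/new_data' OVERWRITE INTO TABLE my_table");
>     }
> }
> {code}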



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] snuyanzin opened a new pull request, #19557: [hotfix][docs] Fix class name in docs for ExecutionEnvironment class

2022-04-22 Thread GitBox


snuyanzin opened a new pull request, #19557:
URL: https://github.com/apache/flink/pull/19557

   
   
   ## What is the purpose of the change
   
   Fixes a trivial misprint in the name of the `ExecutionEnvironment` class in 
   docs/content.zh/docs/connectors/datastream/formats/hadoop.md
   docs/content.zh/docs/dev/dataset/hadoop_compatibility.md
   docs/content/docs/connectors/dataset/formats/hadoop.md
   docs/content/docs/connectors/datastream/formats/hadoop.md
   
   `ExecutionEnvironmen` => `ExecutionEnvironment`
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #19556: [FLINK-26413][hive] Hive dialect supports "LOAD DATA INPATH"

2022-04-22 Thread GitBox


flinkbot commented on PR #19556:
URL: https://github.com/apache/flink/pull/19556#issuecomment-1106473307

   
   ## CI report:
   
   * 4d99ef8a1ea72740067fb5838d9c854a622be5d5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #19557: [hotfix][docs] Fix class name in docs for ExecutionEnvironment class

2022-04-22 Thread GitBox


flinkbot commented on PR #19557:
URL: https://github.com/apache/flink/pull/19557#issuecomment-1106473407

   
   ## CI report:
   
   * 1940799765d3b35c3e93f7bc6f78138f1e33fbbb UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] JulioPerezGitHub commented on pull request #17020: ---

2022-04-22 Thread GitBox


JulioPerezGitHub commented on PR #17020:
URL: https://github.com/apache/flink/pull/17020#issuecomment-1106481367

   https://discord.gg/ccZe2WMd


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan commented on a diff in pull request #18539: [FLINK-25745] Support RocksDB incremental native savepoints

2022-04-22 Thread GitBox


rkhachatryan commented on code in PR #18539:
URL: https://github.com/apache/flink/pull/18539#discussion_r856196969


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java:
##
@@ -369,4 +373,37 @@ public static CheckpointStorage loadCheckpointStorage(
 
 /** This class contains only static utility methods and is not meant to be 
instantiated. */
 private Checkpoints() {}
+
+private static class ClaimModeCompletedStorageLocation
+implements CompletedCheckpointStorageLocation {
+
+private final CompletedCheckpointStorageLocation wrapped;
+
+private 
ClaimModeCompletedStorageLocation(CompletedCheckpointStorageLocation location) {
+wrapped = location;
+}
+
+@Override
+public String getExternalPointer() {
+return wrapped.getExternalPointer();
+}
+
+@Override
+public StreamStateHandle getMetadataHandle() {
+return wrapped.getMetadataHandle();
+}
+
+@Override
+public void disposeStorageLocation() throws IOException {
+try {
+wrapped.disposeStorageLocation();
+} catch (Exception ex) {
+LOG.debug(
+"We could not delete the storage location: {} in CLAIM 
restore mode. It is"
++ " most probably because of shared files 
still being used by newer"
++ " checkpoints",

Review Comment:
   @dawidwys could you please explain this scenario?
   
   AFAIK, shared state files should be placed in a separate 
`checkpoints/shared` folder and therefore should not prevent the 
`checkpoints/chk-xxx` folder from being deleted.
   However, this is not true when migrating to Changelog, when private state 
becomes "re-usable".
   
   I'd like to understand whether it's a general case or only related to 
Changelog.
   
   cc: @zoltar9264 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] (FLINK-27340) [JUnit5 Migration] Module: flink-python

2022-04-22 Thread EMing Zhou (Jira)


[ https://issues.apache.org/jira/browse/FLINK-27340 ]


EMing Zhou deleted comment on FLINK-27340:


was (Author: zsigner):
Hi [~Sergey Nuyanzin] 

    Can I get the ticket?

 

> [JUnit5 Migration] Module: flink-python
> ---
>
> Key: FLINK-27340
> URL: https://issues.apache.org/jira/browse/FLINK-27340
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python, Tests
>Reporter: Sergey Nuyanzin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] dawidwys commented on a diff in pull request #18539: [FLINK-25745] Support RocksDB incremental native savepoints

2022-04-22 Thread GitBox


dawidwys commented on code in PR #18539:
URL: https://github.com/apache/flink/pull/18539#discussion_r856213164


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java:
##
@@ -369,4 +373,37 @@ public static CheckpointStorage loadCheckpointStorage(
 
 /** This class contains only static utility methods and is not meant to be 
instantiated. */
 private Checkpoints() {}
+
+private static class ClaimModeCompletedStorageLocation
+implements CompletedCheckpointStorageLocation {
+
+private final CompletedCheckpointStorageLocation wrapped;
+
+private 
ClaimModeCompletedStorageLocation(CompletedCheckpointStorageLocation location) {
+wrapped = location;
+}
+
+@Override
+public String getExternalPointer() {
+return wrapped.getExternalPointer();
+}
+
+@Override
+public StreamStateHandle getMetadataHandle() {
+return wrapped.getMetadataHandle();
+}
+
+@Override
+public void disposeStorageLocation() throws IOException {
+try {
+wrapped.disposeStorageLocation();
+} catch (Exception ex) {
+LOG.debug(
+"We could not delete the storage location: {} in CLAIM 
restore mode. It is"
++ " most probably because of shared files 
still being used by newer"
++ " checkpoints",

Review Comment:
   This is the case for native savepoints. Native "incremental" RocksDB 
savepoints place all files in a single savepoint folder. They do not place 
shared files in the shared folder and thus they can prevent deleting the 
savepoint folder.
   
   This is the case because savepoints need to be self-contained to be 
relocatable.
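   
   A minimal sketch of a restore in CLAIM mode, the mode this disposal path 
runs in (1.15+ option names per FLIP-193; the savepoint path is a placeholder):
   ```java
   import org.apache.flink.configuration.Configuration;
   import org.apache.flink.runtime.jobgraph.RestoreMode;
   import org.apache.flink.runtime.jobgraph.SavepointConfigOptions;
   
   public class ClaimModeRestore {
       public static Configuration claimModeConfig() {
           Configuration conf = new Configuration();
           conf.set(SavepointConfigOptions.SAVEPOINT_PATH, "s3://bucket/savepoint-xxx");
           // in CLAIM mode Flink owns the snapshot and may delete it once subsumed
           conf.set(SavepointConfigOptions.RESTORE_MODE, RestoreMode.CLAIM);
           return conf;
       }
   }
   ```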



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #177: [FLINK-27303][FLINK-27309] Introduce FlinkConfigManager for efficient config management

2022-04-22 Thread GitBox


gyfora commented on PR #177:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/177#issuecomment-1106522084

   Reworked the watcher logic and added some more tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27352) [JUnit5 Migration] Module: flink-json

2022-04-22 Thread EMing Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526450#comment-17526450
 ] 

EMing Zhou commented on FLINK-27352:


Hi, can you assign this ticket to me? Thank you.

> [JUnit5 Migration] Module: flink-json
> -
>
> Key: FLINK-27352
> URL: https://issues.apache.org/jira/browse/FLINK-27352
> Project: Flink
>  Issue Type: Sub-task
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: EMing Zhou
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] rkhachatryan commented on a diff in pull request #18539: [FLINK-25745] Support RocksDB incremental native savepoints

2022-04-22 Thread GitBox


rkhachatryan commented on code in PR #18539:
URL: https://github.com/apache/flink/pull/18539#discussion_r856290060


##
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java:
##
@@ -369,4 +373,37 @@ public static CheckpointStorage loadCheckpointStorage(
 
 /** This class contains only static utility methods and is not meant to be 
instantiated. */
 private Checkpoints() {}
+
+private static class ClaimModeCompletedStorageLocation
+implements CompletedCheckpointStorageLocation {
+
+private final CompletedCheckpointStorageLocation wrapped;
+
+private 
ClaimModeCompletedStorageLocation(CompletedCheckpointStorageLocation location) {
+wrapped = location;
+}
+
+@Override
+public String getExternalPointer() {
+return wrapped.getExternalPointer();
+}
+
+@Override
+public StreamStateHandle getMetadataHandle() {
+return wrapped.getMetadataHandle();
+}
+
+@Override
+public void disposeStorageLocation() throws IOException {
+try {
+wrapped.disposeStorageLocation();
+} catch (Exception ex) {
+LOG.debug(
+"We could not delete the storage location: {} in CLAIM 
restore mode. It is"
++ " most probably because of shared files 
still being used by newer"
++ " checkpoints",

Review Comment:
   Got it. Thanks for the explanation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan merged pull request #19508: [FLINK-27218] fix the problem that the internal Serializer in Operato…

2022-04-22 Thread GitBox


rkhachatryan merged PR #19508:
URL: https://github.com/apache/flink/pull/19508


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan commented on pull request #19508: [FLINK-27218] fix the problem that the internal Serializer in Operato…

2022-04-22 Thread GitBox


rkhachatryan commented on PR #19508:
URL: https://github.com/apache/flink/pull/19508#issuecomment-1106582214

   Thanks @mayuehappy.
   Merged the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] SteNicholas commented on a diff in pull request #178: [FLINK-27334] Support auto generate the doc for the `KubernetesOperatorConfigOptions`

2022-04-22 Thread GitBox


SteNicholas commented on code in PR #178:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/178#discussion_r856126535


##
Dockerfile:
##
@@ -23,21 +23,23 @@ WORKDIR /app
 ENV SHADED_DIR=flink-kubernetes-shaded
 ENV OPERATOR_DIR=flink-kubernetes-operator
 ENV WEBHOOK_DIR=flink-kubernetes-webhook
+ENV DOCS_DIR=flink-kubernetes-docs
 
 RUN mkdir $OPERATOR_DIR $WEBHOOK_DIR
 
 COPY pom.xml .
 COPY $SHADED_DIR/pom.xml ./$SHADED_DIR/
 COPY $WEBHOOK_DIR/pom.xml ./$WEBHOOK_DIR/
 COPY $OPERATOR_DIR/pom.xml ./$OPERATOR_DIR/
+COPY $DOCS_DIR/pom.xml ./$DOCS_DIR/

Review Comment:
   @wangyang0918, only the `pom.xml` of `flink-kubernetes-docs` is copied in the 
Dockerfile, because the module is introduced in the parent `pom.xml` and 
therefore needs to be copied.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] dianfu closed pull request #19551: [FLINK-22984][python] Disable PushWatermarkIntoTableSourceScanAcrossCalcRule when having Python UDF

2022-04-22 Thread GitBox


dianfu closed pull request #19551: [FLINK-22984][python] Disable 
PushWatermarkIntoTableSourceScanAcrossCalcRule when having Python UDF
URL: https://github.com/apache/flink/pull/19551


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-22984) UnsupportedOperationException when using Python UDF to generate watermark

2022-04-22 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu closed FLINK-22984.
---
Fix Version/s: 1.16.0
   1.13.7
   1.14.5
   1.15.1
 Assignee: Juntao Hu
   Resolution: Fixed

Fixed in:
- master via 7ce5a7c6e1eab6823094a94bc0bca30d0ee618f1
- release-1.15 via 703b10ca5d004e8e79059e814fcf8503f84e2da8
- release-1.14 via 0806ad5a154e37d09b53ce56d59cec8dc11209da
- release-1.13 via 79a86f35fb321cb5f8dd40442db8c6bafb00153c

> UnsupportedOperationException when using Python UDF to generate watermark
> -
>
> Key: FLINK-22984
> URL: https://issues.apache.org/jira/browse/FLINK-22984
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.13.0, 1.13.1
>Reporter: Maciej Bryński
>Assignee: Juntao Hu
>Priority: Minor
>  Labels: auto-deprioritized-critical, auto-deprioritized-major, 
> pull-request-available
> Fix For: 1.16.0, 1.13.7, 1.14.5, 1.15.1
>
>
> Hi,
> I'm trying to use the output of a Python UDF (parse_data) to set the 
> watermark for the table
> {code:java}
> CREATE TABLE test (
> data BYTES,
> ts as parse_data(data).ts,
> WATERMARK for ts as ts
> ) WITH (
>'connector' = 'kafka',
>'topic' = 'test',
>'properties.bootstrap.servers' = 'localhost:9092',
>'properties.group.id' = 'flink',
>'scan.startup.mode' = 'earliest-offset',
>'format' = 'raw'
> ){code}
> Then running SELECT on this table gives me exception
> {code:java}
> Py4JJavaError: An error occurred while calling o311.hasNext.
> : java.lang.RuntimeException: Failed to fetch next result
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109)
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
>   at 
> org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:370)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>   at 
> org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>   at 
> org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
>   at 
> org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>   at 
> org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
>   at 
> org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.io.IOException: Failed to fetch job execution result
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.getAccumulatorResults(CollectResultFetcher.java:177)
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:120)
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:106)
>   ... 13 more
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
>   at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022)
>   at 
> org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.getAccumulatorResults(CollectResultFetcher.java:175)
>   ... 15 more
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
>   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:137)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:680)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:658)
>   at 
> java.base/java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:2094)
>   at 
> org.apache.

[jira] [Resolved] (FLINK-27218) Serializer in OperatorState has not been updated when new Serializers are NOT incompatible

2022-04-22 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan resolved FLINK-27218.
---
Fix Version/s: 1.16.0
   1.15.1
   Resolution: Fixed

Merged into master as 4033ddc5fa682a1619f8f22348e2ee38afcc1c85,

into 1.15 as 3d4b3a495b273c3a15ce7d35ba5a5b2e4ddc4c20.
 

> Serializer in OperatorState has not been updated when new Serializers are NOT 
> incompatible
> --
>
> Key: FLINK-27218
> URL: https://issues.apache.org/jira/browse/FLINK-27218
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.15.1
>Reporter: Yue Ma
>Assignee: Yue Ma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.15.1
>
> Attachments: image-2022-04-13-14-50-10-921.png, 
> image-2022-04-18-21-48-30-519.png
>
>
> OperatorState such as *BroadcastState* or *PartitionableListState* can only 
> be constructed via *DefaultOperatorStateBackend*. But when the 
> *BroadcastState* or *PartitionableListState* Serializer changes after we 
> restart the job, the following problem appears.
> As an example, we can see how PartitionableListState is initialized.
> First, RestoreOperation will construct a restored PartitionableListState 
> based on the information in the snapshot.
> Then the StateMetaInfo in partitionableListState is updated as in the 
> following code:
> {code:java}
> TypeSerializerSchemaCompatibility stateCompatibility =
>                 
> restoredPartitionableListStateMetaInfo.updatePartitionStateSerializer(newPartitionStateSerializer);
> partitionableListState.setStateMetaInfo(restoredPartitionableListStateMetaInfo);{code}
> The main problem is that there is also an *internalListCopySerializer* in 
> *PartitionableListState* that is built using the previous Serializer, and it 
> has not been updated.
> Therefore, when we update the StateMetaInfo, the *internalListCopySerializer* 
> also needs to be updated.
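> A rough sketch of the shape of the fix (simplified, not the actual Flink code):
> {code:java}
> import org.apache.flink.runtime.state.ArrayListSerializer;
> import org.apache.flink.runtime.state.RegisteredOperatorStateBackendMetaInfo;
> 
> class PartitionableListStateSketch<S> {
>     private RegisteredOperatorStateBackendMetaInfo<S> stateMetaInfo;
>     private ArrayListSerializer<S> internalListCopySerializer;
> 
>     void setStateMetaInfo(RegisteredOperatorStateBackendMetaInfo<S> metaInfo) {
>         this.stateMetaInfo = metaInfo;
>         // rebuild the copy serializer from the updated element serializer,
>         // instead of keeping the one built from the old serializer
>         this.internalListCopySerializer =
>                 new ArrayListSerializer<>(metaInfo.getPartitionStateSerializer());
>     }
> }
> {code}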
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink] rkhachatryan merged pull request #19441: [FLINK-27187][state/changelog] Add changelog storage metric totalAttemptsPerUpload

2022-04-22 Thread GitBox


rkhachatryan merged PR #19441:
URL: https://github.com/apache/flink/pull/19441


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27343) flink jdbc sink will lead to unordered result, because the sink buffer records execute unorder

2022-04-22 Thread pengyusong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pengyusong updated FLINK-27343:
---
Priority: Minor  (was: Critical)

> flink jdbc sink will lead to unordered result, because the sink buffer 
> records execute unorder
> --
>
> Key: FLINK-27343
> URL: https://issues.apache.org/jira/browse/FLINK-27343
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Affects Versions: 1.13.6
> Environment: flink 1.13.6
> kafka
> postgres jdbc sink
>Reporter: pengyusong
>Priority: Minor
>
> * situation one
>     When I use the Flink SQL Kafka connector to re-consume a topic that 
> already has many messages,
>     with the JDBC sink params left at their defaults.
>     The Kafka topic is a compact topic whose contents are the CDC events of a 
> MySQL table.
>     There are records with the same key in one batch, buffered within one 
> batch, and they are finally written to Postgres out of order: later records in 
> the buffered batch are executed first.
>     This causes older messages in Kafka to be processed after newer ones, so 
> the results are inconsistent with the Kafka message order.
>  * situation two
>      If I set
> sink.buffer-flush.interval = 0
> sink.buffer-flush.max-rows = 1
>    the results are still inconsistent with the Kafka message order.
>  
> So I suspect that the execution order within the JDBC buffer is 
> non-deterministic, leading to unordered results in JDBC.
>  
> Update:
> I found that it is my left join operator that disorders the record order. The 
> question is why a left join disorders the order.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-26052) Update chinese documentation regarding FLIP-203

2022-04-22 Thread Feifan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526478#comment-17526478
 ] 

Feifan Wang commented on FLINK-26052:
-

Hi [~pnowojski] and [~dwysakowicz], I'm a Chinese speaker and I'm glad to 
pick up this work. Can you assign this ticket to me?

> Update chinese documentation regarding FLIP-203
> ---
>
> Key: FLINK-26052
> URL: https://issues.apache.org/jira/browse/FLINK-26052
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Runtime / Checkpointing
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: translation-zh
>
> Relevant english commits: 
> * c1f5c5320150402fc0cb4fbf3a31f9a27b1e4d9a
> * cd8ea8d5b207569f68acc5a3c8db95cd2ca47ba6



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (FLINK-27187) The attemptsPerUpload metric may be lower than it actually is

2022-04-22 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan resolved FLINK-27187.
---
Resolution: Fixed

Thanks for adding this metric [~Feifan Wang],

merged as cb68ccf1b2cb879148fb17d2fd6394e15d1ae46c.

> The attemptsPerUpload metric may be lower than it actually is
> -
>
> Key: FLINK-27187
> URL: https://issues.apache.org/jira/browse/FLINK-27187
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / State Backends
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> The attemptsPerUpload metric in ChangelogStorageMetricGroup indicates the 
> distribution of the number of attempts per upload.
> In the current implementation, each successful attempt tries to update 
> attemptsPerUpload with its attemptNumber.
> But consider this case: 
>  # attempt 1 times out, so attempt 2 is scheduled
>  # attempt 1 completes before attempt 2 and updates attemptsPerUpload with 1
> In fact there were two attempts, but attemptsPerUpload was updated with 1.
> So, I think we should add an "actionAttemptsCount" to 
> RetryExecutor.RetriableActionAttempt; this field is shared across all attempts 
> executing the same upload action and represents the number of upload attempts. 
> The completed attempt should then use this field to update attemptsPerUpload.
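> A minimal sketch of the idea (names are illustrative, not the actual 
> RetryExecutor code):
> {code:java}
> import java.util.concurrent.atomic.AtomicInteger;
> import java.util.function.IntConsumer;
> 
> class RetriableAttemptSketch implements Runnable {
>     private final AtomicInteger actionAttemptsCount; // shared by all attempts of one upload
>     private final IntConsumer attemptsPerUpload;     // updates the histogram
>     private final Runnable upload;
> 
>     RetriableAttemptSketch(AtomicInteger shared, IntConsumer histogram, Runnable upload) {
>         this.actionAttemptsCount = shared;
>         this.attemptsPerUpload = histogram;
>         this.upload = upload;
>     }
> 
>     @Override
>     public void run() {
>         actionAttemptsCount.incrementAndGet();
>         upload.run();
>         // report the total number of attempts started so far,
>         // not this attempt's own attemptNumber
>         attemptsPerUpload.accept(actionAttemptsCount.get());
>     }
> }
> {code}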
>  
> What do you think? [~ym], [~roman] 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-26052) Update chinese documentation regarding FLIP-203

2022-04-22 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526480#comment-17526480
 ] 

Dawid Wysakowicz commented on FLINK-26052:
--

Sure, I've assigned it to you. Thank you for your help!

> Update chinese documentation regarding FLIP-203
> ---
>
> Key: FLINK-26052
> URL: https://issues.apache.org/jira/browse/FLINK-26052
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Runtime / Checkpointing
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: translation-zh
>
> Relevant english commits: 
> * c1f5c5320150402fc0cb4fbf3a31f9a27b1e4d9a
> * cd8ea8d5b207569f68acc5a3c8db95cd2ca47ba6



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

