date:20211216

[GitHub] [flink] AHeise commented on a change in pull request #18105: [FLINK-17510][e2e] Convert StreamingKafkaITCase to SmokeKafkaITCase covering application packaging

2021-12-16 Thread GitBox



AHeise commented on a change in pull request #18105:
URL: https://github.com/apache/flink/pull/18105#discussion_r771176056



##
File path: 
flink-end-to-end-tests/flink-end-to-end-tests-common/src/main/java/org/apache/flink/tests/util/flink/container/FlinkContainers.java
##
@@ -254,6 +259,55 @@ public void submitSQLJob(SQLJobSubmission job) throws 
IOException, InterruptedEx
 }
 }
 
+/**
+ * Submits the given job to the cluster.
+ *
+ * @param job job to submit
+ */
+public JobID submitJob(JobSubmission job) throws IOException, 
InterruptedException {
+final List commands = new ArrayList<>();
+commands.add("flink/bin/flink");
+commands.add("run");
+
+if (job.isDetached()) {
+commands.add("-d");
+}
+if (job.getParallelism() > 0) {
+commands.add("-p");
+commands.add(String.valueOf(job.getParallelism()));
+}
+job.getMainClass()
+.ifPresent(
+mainClass -> {
+commands.add("--class");
+commands.add(mainClass);
+});
+final Path jobJar = job.getJar();
+final String containerPath = "/tmp/" + jobJar.getFileName();
+commands.add(containerPath);
+jobManager.copyFileToContainer(
+MountableFile.forHostPath(jobJar.toAbsolutePath()), 
containerPath);

Review comment:
   Does this work with resubmissions?

##
File path: 
flink-end-to-end-tests/flink-end-to-end-tests-common/src/main/java/org/apache/flink/tests/util/flink/container/FlinkContainers.java
##
@@ -254,6 +259,55 @@ public void submitSQLJob(SQLJobSubmission job) throws 
IOException, InterruptedEx
 }
 }
 
+/**
+ * Submits the given job to the cluster.
+ *
+ * @param job job to submit
+ */
+public JobID submitJob(JobSubmission job) throws IOException, 
InterruptedException {

Review comment:
   `submitJobWithCLI`? I could imagine that someone wants to specificially 
test different entry points.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] alpreu commented on pull request #17819: [FLINK-15816][k8s] Limit kubernetes.cluster-id to a maximum of 40 characters

2021-12-16 Thread GitBox



alpreu commented on pull request #17819:
URL: https://github.com/apache/flink/pull/17819#issuecomment-996510750


   Sure, I can do this, just have to deprioritize it for some SDK stuff for now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] wanglijie95 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



wanglijie95 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r771175287



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, false, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicPointWiseGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, false, 
Arrays.asList(4, 3));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicAllToAllGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, true, 
Arrays.asList(7, 7));

Review comment:
   This design is because the parallelism of some operators can only be a 
certain value. For example, `GlobalAgg`, it's parallelism can only be 1. In 
this case, the parallelism of `GlobalAgg` will be set to 1 in jobgraph during 
compilation, and the adaptive batch scheduler will not change it's parallelism.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25223) ElasticsearchWriterITCase fails on AZP

2021-12-16 Thread Arvid Heise (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461253#comment-17461253
 ] 

Arvid Heise commented on FLINK-25223:
-

[~alexanderpreuss] could you please double-check if we need a backport? The 
related docker image was only used in master afaik but there seems to be 
similar issues on old release branches. So maybe we also need to apply a 
similar fix to that legacy code.

> ElasticsearchWriterITCase fails on AZP
> --
>
> Key: FLINK-25223
> URL: https://issues.apache.org/jira/browse/FLINK-25223
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.15.0
>Reporter: Till Rohrmann
>Assignee: Alexander Preuss
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.15.0
>
>
> The {{ElasticsearchWriterITCase}} fails on AZP because
> {code}
> 2021-12-08T13:56:59.5449851Z Dec 08 13:56:59 [ERROR] 
> org.apache.flink.connector.elasticsearch.sink.ElasticsearchWriterITCase  Time 
> elapsed: 171.046 s  <<< ERROR!
> 2021-12-08T13:56:59.5450680Z Dec 08 13:56:59 
> org.testcontainers.containers.ContainerLaunchException: Container startup 
> failed
> 2021-12-08T13:56:59.5451652Z Dec 08 13:56:59  at 
> org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:336)
> 2021-12-08T13:56:59.5452677Z Dec 08 13:56:59  at 
> org.testcontainers.containers.GenericContainer.start(GenericContainer.java:317)
> 2021-12-08T13:56:59.5453637Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension$StoreAdapter.start(TestcontainersExtension.java:242)
> 2021-12-08T13:56:59.5454757Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension$StoreAdapter.access$200(TestcontainersExtension.java:229)
> 2021-12-08T13:56:59.5455946Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension.lambda$null$1(TestcontainersExtension.java:59)
> 2021-12-08T13:56:59.5457322Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.lambda$getOrComputeIfAbsent$4(ExtensionValuesStore.java:86)
> 2021-12-08T13:56:59.5458571Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$MemoizingSupplier.computeValue(ExtensionValuesStore.java:223)
> 2021-12-08T13:56:59.5459771Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$MemoizingSupplier.get(ExtensionValuesStore.java:211)
> 2021-12-08T13:56:59.5460693Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$StoredValue.evaluate(ExtensionValuesStore.java:191)
> 2021-12-08T13:56:59.5461437Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$StoredValue.access$100(ExtensionValuesStore.java:171)
> 2021-12-08T13:56:59.5462198Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.getOrComputeIfAbsent(ExtensionValuesStore.java:89)
> 2021-12-08T13:56:59.5467999Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.NamespaceAwareStore.getOrComputeIfAbsent(NamespaceAwareStore.java:53)
> 2021-12-08T13:56:59.5468791Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension.lambda$beforeAll$2(TestcontainersExtension.java:59)
> 2021-12-08T13:56:59.5469436Z Dec 08 13:56:59  at 
> java.util.ArrayList.forEach(ArrayList.java:1259)
> 2021-12-08T13:56:59.5470058Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension.beforeAll(TestcontainersExtension.java:59)
> 2021-12-08T13:56:59.5470846Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$invokeBeforeAllCallbacks$10(ClassBasedTestDescriptor.java:381)
> 2021-12-08T13:56:59.5471641Z Dec 08 13:56:59  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2021-12-08T13:56:59.5472403Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeBeforeAllCallbacks(ClassBasedTestDescriptor.java:381)
> 2021-12-08T13:56:59.5473190Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.before(ClassBasedTestDescriptor.java:205)
> 2021-12-08T13:56:59.5474001Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.before(ClassBasedTestDescriptor.java:80)
> 2021-12-08T13:56:59.5474759Z Dec 08 13:56:59  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:148)
> 2021-12-08T13:56:59.5475833Z Dec 08 13:56:59  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2021-12-08T13:56:59.5476739Z Dec 08 13:56:59  at 
>

[jira] [Resolved] (FLINK-25223) ElasticsearchWriterITCase fails on AZP

2021-12-16 Thread Arvid Heise (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise resolved FLINK-25223.
-
Resolution: Fixed

Reactivated tests and provided memory limit in master as 
98a74baeaf45350fb665558cdbed5fb72fb310dd.

> ElasticsearchWriterITCase fails on AZP
> --
>
> Key: FLINK-25223
> URL: https://issues.apache.org/jira/browse/FLINK-25223
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.15.0
>Reporter: Till Rohrmann
>Assignee: Alexander Preuss
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.15.0
>
>
> The {{ElasticsearchWriterITCase}} fails on AZP because
> {code}
> 2021-12-08T13:56:59.5449851Z Dec 08 13:56:59 [ERROR] 
> org.apache.flink.connector.elasticsearch.sink.ElasticsearchWriterITCase  Time 
> elapsed: 171.046 s  <<< ERROR!
> 2021-12-08T13:56:59.5450680Z Dec 08 13:56:59 
> org.testcontainers.containers.ContainerLaunchException: Container startup 
> failed
> 2021-12-08T13:56:59.5451652Z Dec 08 13:56:59  at 
> org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:336)
> 2021-12-08T13:56:59.5452677Z Dec 08 13:56:59  at 
> org.testcontainers.containers.GenericContainer.start(GenericContainer.java:317)
> 2021-12-08T13:56:59.5453637Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension$StoreAdapter.start(TestcontainersExtension.java:242)
> 2021-12-08T13:56:59.5454757Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension$StoreAdapter.access$200(TestcontainersExtension.java:229)
> 2021-12-08T13:56:59.5455946Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension.lambda$null$1(TestcontainersExtension.java:59)
> 2021-12-08T13:56:59.5457322Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.lambda$getOrComputeIfAbsent$4(ExtensionValuesStore.java:86)
> 2021-12-08T13:56:59.5458571Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$MemoizingSupplier.computeValue(ExtensionValuesStore.java:223)
> 2021-12-08T13:56:59.5459771Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$MemoizingSupplier.get(ExtensionValuesStore.java:211)
> 2021-12-08T13:56:59.5460693Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$StoredValue.evaluate(ExtensionValuesStore.java:191)
> 2021-12-08T13:56:59.5461437Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore$StoredValue.access$100(ExtensionValuesStore.java:171)
> 2021-12-08T13:56:59.5462198Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.getOrComputeIfAbsent(ExtensionValuesStore.java:89)
> 2021-12-08T13:56:59.5467999Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.execution.NamespaceAwareStore.getOrComputeIfAbsent(NamespaceAwareStore.java:53)
> 2021-12-08T13:56:59.5468791Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension.lambda$beforeAll$2(TestcontainersExtension.java:59)
> 2021-12-08T13:56:59.5469436Z Dec 08 13:56:59  at 
> java.util.ArrayList.forEach(ArrayList.java:1259)
> 2021-12-08T13:56:59.5470058Z Dec 08 13:56:59  at 
> org.testcontainers.junit.jupiter.TestcontainersExtension.beforeAll(TestcontainersExtension.java:59)
> 2021-12-08T13:56:59.5470846Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.lambda$invokeBeforeAllCallbacks$10(ClassBasedTestDescriptor.java:381)
> 2021-12-08T13:56:59.5471641Z Dec 08 13:56:59  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2021-12-08T13:56:59.5472403Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.invokeBeforeAllCallbacks(ClassBasedTestDescriptor.java:381)
> 2021-12-08T13:56:59.5473190Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.before(ClassBasedTestDescriptor.java:205)
> 2021-12-08T13:56:59.5474001Z Dec 08 13:56:59  at 
> org.junit.jupiter.engine.descriptor.ClassBasedTestDescriptor.before(ClassBasedTestDescriptor.java:80)
> 2021-12-08T13:56:59.5474759Z Dec 08 13:56:59  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:148)
> 2021-12-08T13:56:59.5475833Z Dec 08 13:56:59  at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
> 2021-12-08T13:56:59.5476739Z Dec 08 13:56:59  at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
> 2021-12-08T13:56:59.5477520Z Dec 08 13:56:59  at 
>

[GitHub] [flink] AHeise merged pull request #18064: [FLINK-25223][connectors/elasticsearch] Limit Elasticsearch testcontainer memory allocation

2021-12-16 Thread GitBox



AHeise merged pull request #18064:
URL: https://github.com/apache/flink/pull/18064


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] wanglijie95 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



wanglijie95 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r771171607



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, false, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicPointWiseGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, false, 
Arrays.asList(4, 3));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicAllToAllGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, true, 
Arrays.asList(7, 7));

Review comment:
   Yes, it exists, users can set parallelism of a certain vertex. The 
adaptive batch scheduler only decide parallelism for vertices whose parallelim 
is not set by user(the vertex's parallelism in jobgraph is -1). For the 
vertices whose parallelism is set by user or set in compile phase (the vertex's 
parallelism in jobgraph is > 0), adaptive batch scheduler will not change it's 
parallelism.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] wanglijie95 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



wanglijie95 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r771171790



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, false, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicPointWiseGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, false, 
Arrays.asList(4, 3));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicAllToAllGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, true, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicPointWiseGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, true, 
Arrays.asList(4, 4));

Review comment:
   Same as above




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] wanglijie95 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



wanglijie95 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r771171607



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, false, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicPointWiseGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, false, 
Arrays.asList(4, 3));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicAllToAllGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, true, 
Arrays.asList(7, 7));

Review comment:
   Yes, it exists, users can set parallelism of a certain vertex. The 
adaptive batch scheduler only decide parallelism for vertex whose parallelim is 
not set by user(the vertex's parallelism in jobgraph is -1). For the job vertex 
whose parallelism is set by user or set in compile phase (the vertex's 
parallelism in jobgraph is > 0), adaptive batch scheduler will not change it's 
parallelism.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25167) Support user-defined `StreamOperatorFactory` in `ConnectedStreams`#transform

2021-12-16 Thread Arvid Heise (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461250#comment-17461250
 ] 

Arvid Heise commented on FLINK-25167:
-

I assigned. [~MartijnVisser] could you please keep an eye on it?

> Support user-defined `StreamOperatorFactory` in `ConnectedStreams`#transform
> 
>
> Key: FLINK-25167
> URL: https://issues.apache.org/jira/browse/FLINK-25167
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Lsw_aka_laplace
>Assignee: Lsw_aka_laplace
>Priority: Minor
>
>   From my side, it is necessary to set my custom `StreamOperatorFactory` when 
> I ’m calling  `ConnectedStreams`#transform so that I can set up my own 
> `OperatorCoordinator`. 
>  Well, currently, `ConnectStreams` seems not to give the access, the default 
> behavior is using `SimpleOperatorFactory`.  After checking the code, I think 
> it is a trivial change to support that. If no one is working on it, I'm 
> willing to doing that.  : )



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Assigned] (FLINK-25167) Support user-defined `StreamOperatorFactory` in `ConnectedStreams`#transform

2021-12-16 Thread Arvid Heise (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise reassigned FLINK-25167:
---

Assignee: Lsw_aka_laplace

> Support user-defined `StreamOperatorFactory` in `ConnectedStreams`#transform
> 
>
> Key: FLINK-25167
> URL: https://issues.apache.org/jira/browse/FLINK-25167
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Lsw_aka_laplace
>Assignee: Lsw_aka_laplace
>Priority: Minor
>
>   From my side, it is necessary to set my custom `StreamOperatorFactory` when 
> I ’m calling  `ConnectedStreams`#transform so that I can set up my own 
> `OperatorCoordinator`. 
>  Well, currently, `ConnectStreams` seems not to give the access, the default 
> behavior is using `SimpleOperatorFactory`.  After checking the code, I think 
> it is a trivial change to support that. If no one is working on it, I'm 
> willing to doing that.  : )



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-25319) Quickstarts Scala nightly end-to-end test failed due to akka rpc server failed to start

2021-12-16 Thread Yangze Guo (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461249#comment-17461249
 ] 

Yangze Guo commented on FLINK-25319:


another instance 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28305=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=160c9ae5-96fd-516e-1c91-deb81f59292a=2831

> Quickstarts Scala nightly end-to-end test failed due to akka rpc server 
> failed to start
> ---
>
> Key: FLINK-25319
> URL: https://issues.apache.org/jira/browse/FLINK-25319
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.15.0
>Reporter: Yun Gao
>Priority: Major
>  Labels: test-stability
>
> {code:java}
> Dec 14 17:36:15   at 
> akka.dispatch.Mailbox.exec(Mailbox.scala:243) ~[?:?]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) ~[?:1.8.0_292]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) 
> ~[?:1.8.0_292]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) 
> ~[?:1.8.0_292]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 
> ~[?:1.8.0_292]
> Dec 14 17:36:15 Caused by: java.lang.NullPointerException
> Dec 14 17:36:15   at 
> org.apache.flink.runtime.rpc.RpcEndpoint.getAddress(RpcEndpoint.java:283) 
> ~[flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
> Dec 14 17:36:15   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:177)
>  ~[?:?]
> Dec 14 17:36:15   at 
> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) ~[?:?]
> Dec 14 17:36:15   at 
> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) ~[?:?]
> Dec 14 17:36:15   at 
> scala.PartialFunction.applyOrElse(PartialFunction.scala:123) 
> ~[flink-scala_2.12-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
> Dec 14 17:36:15   at 
> scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) 
> ~[flink-scala_2.12-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
> Dec 14 17:36:15   at 
> akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) ~[?:?]
> Dec 14 17:36:15   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
> ~[flink-scala_2.12-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
> Dec 14 17:36:15   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) 
> ~[flink-scala_2.12-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
> Dec 14 17:36:15   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) 
> ~[flink-scala_2.12-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
> Dec 14 17:36:15   at akka.actor.Actor.aroundReceive(Actor.scala:537) 
> ~[?:?]
> Dec 14 17:36:15   at akka.actor.Actor.aroundReceive$(Actor.scala:535) 
> ~[?:?]
> Dec 14 17:36:15   at 
> akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) ~[?:?]
> Dec 14 17:36:15   at 
> akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) ~[?:?]
> Dec 14 17:36:15   at akka.actor.ActorCell.invoke(ActorCell.scala:548) 
> ~[?:?]
> Dec 14 17:36:15   at 
> akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) ~[?:?]
> Dec 14 17:36:15   at akka.dispatch.Mailbox.run(Mailbox.scala:231) ~[?:?]
> Dec 14 17:36:15   at akka.dispatch.Mailbox.exec(Mailbox.scala:243) ~[?:?]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) ~[?:1.8.0_292]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) 
> ~[?:1.8.0_292]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) 
> ~[?:1.8.0_292]
> Dec 14 17:36:15   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 
> ~[?:1.8.0_292]
> Dec 14 17:36:15 2021-12-14 17:35:43,572 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - Shutting 
> StandaloneSessionClusterEntrypoint down with application status UNKNOWN. 
> Diagnostics Cluster entrypoint has been closed externally..
> Dec 14 17:36:15 2021-12-14 17:35:43,582 INFO  
> org.apache.flink.runtime.blob.BlobServer [] - Stopped 
> BLOB server at 0.0.0.0:42449
> Dec 14 17:36:15 2021-12-14 17:35:43,584 INFO  
> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint   [] - Shutting 
> down rest endpoint.
> Dec 14 17:36:15 2021-12-14 17:35:40,722 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner  [] - 
> 
> Dec 14 17:36:15 2021-12-14 17:35:40,724 INFO  
>

[GitHub] [flink] wanglijie95 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



wanglijie95 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r771163449



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartition.java
##
@@ -77,6 +85,69 @@ public ResultPartitionType getResultType() {
 return getEdgeManager().getConsumedPartitionGroupsById(partitionId);
 }
 
+public int getNumberOfSubpartitions() {
+if (numberOfSubpartitions == UNKNOWN) {
+numberOfSubpartitions = getOrComputeNumberOfSubpartitions();
+}
+checkState(
+numberOfSubpartitions > 0,
+"Number of subpartitions is an unexpected value: " + 
numberOfSubpartitions);
+
+return numberOfSubpartitions;
+}
+
+private int getOrComputeNumberOfSubpartitions() {
+if (!getProducer().getExecutionGraphAccessor().isDynamic()) {
+// The produced data is partitioned among a number of 
subpartitions.
+//
+// If no consumers are known at this point, we use a single 
subpartition, otherwise we
+// have one for each consuming sub task.
+int numberOfSubpartitions = 1;
+List consumerVertexGroups = 
getConsumerVertexGroups();
+if (!consumerVertexGroups.isEmpty() && 
!consumerVertexGroups.get(0).isEmpty()) {
+if (consumerVertexGroups.size() > 1) {
+throw new IllegalStateException(
+"Currently, only a single consumer group per 
partition is supported.");
+}
+numberOfSubpartitions = consumerVertexGroups.get(0).size();
+}
+
+return numberOfSubpartitions;
+} else {
+if (totalResult.isBroadcast()) {
+// for dynamic graph and broadcast result, we only produced 
one subpartition,
+// and all the downstream vertices should consume this 
subpartition.
+return 1;
+} else {
+return computeNumberOfMaxPossiblePartitionConsumers();
+}
+}
+}
+
+private int computeNumberOfMaxPossiblePartitionConsumers() {
+final ExecutionJobVertex consumerJobVertex =
+getIntermediateResult().getConsumerExecutionJobVertex();
+final DistributionPattern distributionPattern =
+getIntermediateResult().getConsumingDistributionPattern();
+
+// decide the max possible consumer job vertex parallelism
+int maxConsumerJobVertexParallelism = 
consumerJobVertex.getParallelism();

Review comment:
   When the consumer's parallelism is specified( > 0), we can directly set 
the subpartition number to the consumer's parallelism. Only when we don't know 
the consumer's parallelism, we need to set it to the max parallelism.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-16 Thread Yun Gao (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-25185:

Affects Version/s: 1.15.0

> StreamFaultToleranceTestBase hangs on AZP
> -
>
> Key: FLINK-25185
> URL: https://issues.apache.org/jira/browse/FLINK-25185
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Test Infrastructure
>Affects Versions: 1.13.3, 1.15.0
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The {{StreamFaultToleranceTestBase}} hangs on AZP.
> {code}
> 2021-12-06T04:24:48.1676089Z 
> ==
> 2021-12-06T04:24:48.1678883Z === WARNING: This task took already 95% of the 
> available time budget of 237 minutes ===
> 2021-12-06T04:24:48.1679596Z 
> ==
> 2021-12-06T04:24:48.1680326Z 
> ==
> 2021-12-06T04:24:48.1680877Z The following Java processes are running (JPS)
> 2021-12-06T04:24:48.1681467Z 
> ==
> 2021-12-06T04:24:48.6514536Z 13701 surefirebooter17740627448580534543.jar
> 2021-12-06T04:24:48.6515353Z 1622 Jps
> 2021-12-06T04:24:48.6515795Z 780 Launcher
> 2021-12-06T04:24:48.6825889Z 
> ==
> 2021-12-06T04:24:48.6826565Z Printing stack trace of Java process 13701
> 2021-12-06T04:24:48.6827012Z 
> ==
> 2021-12-06T04:24:49.1876086Z 2021-12-06 04:24:49
> 2021-12-06T04:24:49.1877098Z Full thread dump OpenJDK 64-Bit Server VM 
> (11.0.10+9 mixed mode):
> 2021-12-06T04:24:49.1877362Z 
> 2021-12-06T04:24:49.1877672Z Threads class SMR info:
> 2021-12-06T04:24:49.1878049Z _java_thread_list=0x7f254c007630, 
> length=365, elements={
> 2021-12-06T04:24:49.1878504Z 0x7f2598028000, 0x7f2598280800, 
> 0x7f2598284800, 0x7f2598299000,
> 2021-12-06T04:24:49.1878973Z 0x7f259829b000, 0x7f259829d800, 
> 0x7f259829f800, 0x7f25982a1800,
> 2021-12-06T04:24:49.1879680Z 0x7f2598337800, 0x7f25983e3000, 
> 0x7f2598431000, 0x7f2528016000,
> 2021-12-06T04:24:49.1896613Z 0x7f2599003000, 0x7f259972e000, 
> 0x7f2599833800, 0x7f259984c000,
> 2021-12-06T04:24:49.1897558Z 0x7f259984f000, 0x7f2599851000, 
> 0x7f2599892000, 0x7f2599894800,
> 2021-12-06T04:24:49.1898075Z 0x7f2499a16000, 0x7f2485acd800, 
> 0x7f2485ace000, 0x7f24876bb800,
> 2021-12-06T04:24:49.1898562Z 0x7f2461e59000, 0x7f2499a0e800, 
> 0x7f2461e5e800, 0x7f2461e81000,
> 2021-12-06T04:24:49.1899037Z 0x7f24dc015000, 0x7f2461e86800, 
> 0x7f2448002000, 0x7f24dc01c000,
> 2021-12-06T04:24:49.1899522Z 0x7f2438001000, 0x7f2438003000, 
> 0x7f2438005000, 0x7f2438006800,
> 2021-12-06T04:24:49.1899982Z 0x7f2438008800, 0x7f2434017800, 
> 0x7f243401a800, 0x7f2414008800,
> 2021-12-06T04:24:49.1900495Z 0x7f24e8089800, 0x7f24e809, 
> 0x7f23e4005800, 0x7f24e8092800,
> 2021-12-06T04:24:49.1901163Z 0x7f24e8099000, 0x7f2414015800, 
> 0x7f24dc04c000, 0x7f2414018800,
> 2021-12-06T04:24:49.1901680Z 0x7f241402, 0x7f24dc058000, 
> 0x7f24dc05b000, 0x7f2414022000,
> 2021-12-06T04:24:49.1902283Z 0x7f24d400f000, 0x7f241402e800, 
> 0x7f2414031800, 0x7f2414033800,
> 2021-12-06T04:24:49.1902880Z 0x7f2414035000, 0x7f2414037000, 
> 0x7f2414038800, 0x7f241403a800,
> 2021-12-06T04:24:49.1903354Z 0x7f241403c000, 0x7f241403e000, 
> 0x7f241403f800, 0x7f2414041800,
> 2021-12-06T04:24:49.1903812Z 0x7f2414043000, 0x7f2414045000, 
> 0x7f24dc064800, 0x7f2414047000,
> 2021-12-06T04:24:49.1904284Z 0x7f2414048800, 0x7f241404a800, 
> 0x7f241404c800, 0x7f241404e000,
> 2021-12-06T04:24:49.1904800Z 0x7f241405, 0x7f2414051800, 
> 0x7f2414053800, 0x7f2414055000,
> 2021-12-06T04:24:49.1905455Z 0x7f2414057000, 0x7f2414059000, 
> 0x7f241405a800, 0x7f241405c800,
> 2021-12-06T04:24:49.1906098Z 0x7f241405e000, 0x7f241406, 
> 0x7f2414062000, 0x7f2414063800,
> 2021-12-06T04:24:49.1906728Z 0x7f22e400c800, 0x7f2328008000, 
> 0x7f2284007000, 0x7f22cc019800,
> 2021-12-06T04:24:49.1907396Z 0x7f21f8004000, 0x7f2304012800, 
> 0x7f230001b000, 0x7f223c011000,
> 2021-12-06T04:24:49.1908080Z 0x7f24e40c1800, 0x7f2454001000, 
> 0x7f24e40c3000, 0x7f2454003000,
> 2021-12-06T04:24:49.1908794Z

[jira] [Commented] (FLINK-25185) StreamFaultToleranceTestBase hangs on AZP

2021-12-16 Thread Yun Gao (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461237#comment-17461237
 ] 

Yun Gao commented on FLINK-25185:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28297=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=19003

> StreamFaultToleranceTestBase hangs on AZP
> -
>
> Key: FLINK-25185
> URL: https://issues.apache.org/jira/browse/FLINK-25185
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Test Infrastructure
>Affects Versions: 1.13.3
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
>
> The {{StreamFaultToleranceTestBase}} hangs on AZP.
> {code}
> 2021-12-06T04:24:48.1676089Z 
> ==
> 2021-12-06T04:24:48.1678883Z === WARNING: This task took already 95% of the 
> available time budget of 237 minutes ===
> 2021-12-06T04:24:48.1679596Z 
> ==
> 2021-12-06T04:24:48.1680326Z 
> ==
> 2021-12-06T04:24:48.1680877Z The following Java processes are running (JPS)
> 2021-12-06T04:24:48.1681467Z 
> ==
> 2021-12-06T04:24:48.6514536Z 13701 surefirebooter17740627448580534543.jar
> 2021-12-06T04:24:48.6515353Z 1622 Jps
> 2021-12-06T04:24:48.6515795Z 780 Launcher
> 2021-12-06T04:24:48.6825889Z 
> ==
> 2021-12-06T04:24:48.6826565Z Printing stack trace of Java process 13701
> 2021-12-06T04:24:48.6827012Z 
> ==
> 2021-12-06T04:24:49.1876086Z 2021-12-06 04:24:49
> 2021-12-06T04:24:49.1877098Z Full thread dump OpenJDK 64-Bit Server VM 
> (11.0.10+9 mixed mode):
> 2021-12-06T04:24:49.1877362Z 
> 2021-12-06T04:24:49.1877672Z Threads class SMR info:
> 2021-12-06T04:24:49.1878049Z _java_thread_list=0x7f254c007630, 
> length=365, elements={
> 2021-12-06T04:24:49.1878504Z 0x7f2598028000, 0x7f2598280800, 
> 0x7f2598284800, 0x7f2598299000,
> 2021-12-06T04:24:49.1878973Z 0x7f259829b000, 0x7f259829d800, 
> 0x7f259829f800, 0x7f25982a1800,
> 2021-12-06T04:24:49.1879680Z 0x7f2598337800, 0x7f25983e3000, 
> 0x7f2598431000, 0x7f2528016000,
> 2021-12-06T04:24:49.1896613Z 0x7f2599003000, 0x7f259972e000, 
> 0x7f2599833800, 0x7f259984c000,
> 2021-12-06T04:24:49.1897558Z 0x7f259984f000, 0x7f2599851000, 
> 0x7f2599892000, 0x7f2599894800,
> 2021-12-06T04:24:49.1898075Z 0x7f2499a16000, 0x7f2485acd800, 
> 0x7f2485ace000, 0x7f24876bb800,
> 2021-12-06T04:24:49.1898562Z 0x7f2461e59000, 0x7f2499a0e800, 
> 0x7f2461e5e800, 0x7f2461e81000,
> 2021-12-06T04:24:49.1899037Z 0x7f24dc015000, 0x7f2461e86800, 
> 0x7f2448002000, 0x7f24dc01c000,
> 2021-12-06T04:24:49.1899522Z 0x7f2438001000, 0x7f2438003000, 
> 0x7f2438005000, 0x7f2438006800,
> 2021-12-06T04:24:49.1899982Z 0x7f2438008800, 0x7f2434017800, 
> 0x7f243401a800, 0x7f2414008800,
> 2021-12-06T04:24:49.1900495Z 0x7f24e8089800, 0x7f24e809, 
> 0x7f23e4005800, 0x7f24e8092800,
> 2021-12-06T04:24:49.1901163Z 0x7f24e8099000, 0x7f2414015800, 
> 0x7f24dc04c000, 0x7f2414018800,
> 2021-12-06T04:24:49.1901680Z 0x7f241402, 0x7f24dc058000, 
> 0x7f24dc05b000, 0x7f2414022000,
> 2021-12-06T04:24:49.1902283Z 0x7f24d400f000, 0x7f241402e800, 
> 0x7f2414031800, 0x7f2414033800,
> 2021-12-06T04:24:49.1902880Z 0x7f2414035000, 0x7f2414037000, 
> 0x7f2414038800, 0x7f241403a800,
> 2021-12-06T04:24:49.1903354Z 0x7f241403c000, 0x7f241403e000, 
> 0x7f241403f800, 0x7f2414041800,
> 2021-12-06T04:24:49.1903812Z 0x7f2414043000, 0x7f2414045000, 
> 0x7f24dc064800, 0x7f2414047000,
> 2021-12-06T04:24:49.1904284Z 0x7f2414048800, 0x7f241404a800, 
> 0x7f241404c800, 0x7f241404e000,
> 2021-12-06T04:24:49.1904800Z 0x7f241405, 0x7f2414051800, 
> 0x7f2414053800, 0x7f2414055000,
> 2021-12-06T04:24:49.1905455Z 0x7f2414057000, 0x7f2414059000, 
> 0x7f241405a800, 0x7f241405c800,
> 2021-12-06T04:24:49.1906098Z 0x7f241405e000, 0x7f241406, 
> 0x7f2414062000, 0x7f2414063800,
> 2021-12-06T04:24:49.1906728Z 0x7f22e400c800, 0x7f2328008000, 
> 0x7f2284007000, 0x7f22cc019800,
> 2021-12-06T04:24:49.1907396Z 0x7f21f8004000, 0x7f2304012800, 
> 0x7f230001b000,

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28300)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25359) flink1.13.2 心跳超时

2021-12-16 Thread Caizhi Weng (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461234#comment-17461234
 ] 

Caizhi Weng commented on FLINK-25359:
-

Hi!

There are various reasons causing heartbeat timeout, among which the most 
frequent reason is heavy GC. Please check the GC log of your job and try to 
increase the resources for each task manager. If you have other questions, feel 
free to post them in the [user mailing 
list|https://flink.apache.org/community.html#mailing-lists].

Closing this issue as this is not a bug.

> flink1.13.2 心跳超时
> 
>
> Key: FLINK-25359
> URL: https://issues.apache.org/jira/browse/FLINK-25359
> Project: Flink
>  Issue Type: Bug
>Reporter: mindezhi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Closed] (FLINK-25359) flink1.13.2 心跳超时

2021-12-16 Thread Caizhi Weng (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng closed FLINK-25359.
---
Resolution: Not A Problem

> flink1.13.2 心跳超时
> 
>
> Key: FLINK-25359
> URL: https://issues.apache.org/jira/browse/FLINK-25359
> Project: Flink
>  Issue Type: Bug
>Reporter: mindezhi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (FLINK-25359) flink1.13.2 心跳超时

2021-12-16 Thread mindezhi (Jira)

mindezhi created FLINK-25359:


 Summary: flink1.13.2 心跳超时
 Key: FLINK-25359
 URL: https://issues.apache.org/jira/browse/FLINK-25359
 Project: Flink
  Issue Type: Bug
Reporter: mindezhi






--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-25338) Improvement of connection from TM to JM in session cluster

2021-12-16 Thread Xintong Song (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461226#comment-17461226
 ] 

Xintong Song commented on FLINK-25338:
--

In general, I like the idea that the JM process maintains connections to all 
TMs in one place, and forward only necessary messages to JobMasters.

My gut feeling, the component responsible for managing JM-TM connections should 
be the dispatcher. I'd prefer not to introduce a new component at the cluster 
entry point level. Maybe we can separate the effort into 2 steps? We can 
firstly make this improvement with a single-thread dispatcher, and introduce a 
thread pool based approach secondary if it indeed turns into a bottleneck. WDYT?

Moreover, I think this improvement changes the architecture how RPC endpoints 
talk to each other, thus may deserve a FLIP discussion and a formal vote.

> Improvement of connection from TM to JM in session cluster
> --
>
> Key: FLINK-25338
> URL: https://issues.apache.org/jira/browse/FLINK-25338
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.12.7, 1.13.5, 1.14.2
>Reporter: Shammon
>Priority: Major
>
> When taskmanager receives slot request from resourcemanager for the specify 
> job, it will connect to the jobmaster with given job address. Taskmanager 
> register itself, monitor the heartbeat of job and update task's state by this 
> connection. There's no need to create connections in one taskmanager for each 
> job, and when the taskmanager is busy, it will increase the latency of job. 
> One idea is that taskmanager manages the connection to `Dispatcher`, sends 
> events such as heartbeat, state update to `Dispatcher`,  and `Dispatcher` 
> tell the local `JobMaster`. The main problem is that `Dispatcher` is an actor 
> and can only be executed in one thread, it may be the performance bottleneck 
> for deserialize event.
> The other idea is to create a netty service in `SessionClusterEntrypoint`, it 
> can receive and deserialize events from taskmanagers in a threadpool, and 
> send the event to the `Dispatcher` or `JobMaster`. Taskmanagers manager the 
> connection to the netty service when it start. Thus a service can also 
> receive the result of a job from taskmanager later.
> [~xtsong] What do you think? THX



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28300)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] RocMarshal commented on pull request #18114: [FLINK-25173][table][hive] Introduce CatalogLock and implement HiveCatalogLock

2021-12-16 Thread GitBox



RocMarshal commented on pull request #18114:
URL: https://github.com/apache/flink/pull/18114#issuecomment-996475009


   @JingsongLi Thanks for the efforts. Shall we add a description section into 
the relevant documentation page based on `HiveConfOptions.java`? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-962917757


   
   ## CI report:
   
   * b4e978359c87b33f8c5de84f6e0987f712335b19 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28299)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #17582: [FLINK-24674][kubernetes] Create corresponding resouces for task manager Pods

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #17582:
URL: https://github.com/apache/flink/pull/17582#issuecomment-953241058


   
   ## CI report:
   
   * ed1e59cced3590c5cbb27e0214b3746d148c8aa2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28292)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Comment Edited] (FLINK-25328) Improvement of share memory manager between jobs if they use the same slot in TaskManager for flink olap queries

2021-12-16 Thread Xintong Song (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461218#comment-17461218
 ] 

Xintong Song edited comment on FLINK-25328 at 12/17/21, 6:24 AM:
-

[~zjureel],
The reason we have operators, rocksdb and python sharing one memory pool is 
that, Flink can manage to take whatever amount of memory that is available. The 
problem of a dedicated memory pool is that, the memory may be unused in some 
scenarios. Network buffer pool does not have this problem because it's always 
needed. Unfortunately, operators that use memory segments, rocksdb state 
backend and python udfs are not.


was (Author: xintongsong):
[~zjureel],
The reason we have operators, rocksdb and python sharing one memory pool is 
that, Flink can manage to take whatever amount of memory that is available. The 
problem of a dedicated memory pool is that, the memory may be unused in some 
scenarios. Network buffer pool does not have this problem because it's always 
needed. Unfortunately, operators that use memory segment, rocksdb state backend 
and python udfs are not.

> Improvement of share memory manager between jobs if they use the same slot in 
> TaskManager for flink olap queries
> 
>
> Key: FLINK-25328
> URL: https://issues.apache.org/jira/browse/FLINK-25328
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Shammon
>Priority: Major
>
> We submit batch jobs to flink session cluster as olap queries, and these 
> jobs' subtasks in TaskManager are frequently created and destroyed because 
> they finish their work quickly. Each slot in taskmanager manages 
> `MemoryManager` for multiple tasks in one job, and the `MemoryManager` is 
> closed when all the subtasks are finished. Join/Aggregate/Sort and etc. 
> operators in the subtasks allocate `MemorySegment` via `MemoryManager` and 
> these `MemorySegment` will be free when they are finished. 
> 
> It causes too much memory allocation and free of `MemorySegment` in 
> taskmanager. For example, a TaskManager contains 50 slots, one job has 3 
> join/agg operatos run in the slot, each operator will allocate 2000 segments 
> and initialize them. If the subtasks of a job take 100ms to execute, then the 
> taskmanager will execute 10 jobs' subtasks one second and it will allocate 
> and free 2000 * 3 * 50 * 10 = 300w segments for them. Allocate and free too 
> many segments from memory will cause two issues:
> 1) Increases the CPU usage of taskmanager
> 2) Increase the cost of subtasks in taskmanager, which will increase the 
> latency of job and decrease the qps.
>   To improve the usage of memory segment between jobs in the same slot, 
> we propose not drop memory manager when all the subtasks in the slot are 
> finished. The slot will hold the `MemoryManager` and not free the allocated 
> `MemorySegment` in it immediately. When some subtasks of another job are 
> assigned to the slot, they don't need to allocate segments from memory and 
> can reuse the `MemoryManager` and `MemorySegment` in it.  WDYT?  [~xtsong] THX



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] zuston commented on pull request #18087: [FLINK-25268] Support task manager node-label in Yarn deployment

2021-12-16 Thread GitBox



zuston commented on pull request #18087:
URL: https://github.com/apache/flink/pull/18087#issuecomment-996469650


   All done @KarmaGYZ Could you help review it again? Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] zuston commented on a change in pull request #18087: [FLINK-25268] Support task manager node-label in Yarn deployment

2021-12-16 Thread GitBox



zuston commented on a change in pull request #18087:
URL: https://github.com/apache/flink/pull/18087#discussion_r771131336



##
File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManagerDriver.java
##
@@ -550,7 +559,33 @@ private FinalApplicationStatus 
getYarnStatus(ApplicationStatus status) {
 @Nonnull
 @VisibleForTesting
 static AMRMClient.ContainerRequest getContainerRequest(
-Resource containerResource, Priority priority) {
+Resource containerResource, Priority priority, String nodeLabel) {

Review comment:
   Removed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25328) Improvement of share memory manager between jobs if they use the same slot in TaskManager for flink olap queries

2021-12-16 Thread Xintong Song (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461218#comment-17461218
 ] 

Xintong Song commented on FLINK-25328:
--

[~zjureel],
The reason we have operators, rocksdb and python sharing one memory pool is 
that, Flink can manage to take whatever amount of memory that is available. The 
problem of a dedicated memory pool is that, the memory may be unused in some 
scenarios. Network buffer pool does not have this problem because it's always 
needed. Unfortunately, operators that use memory segment, rocksdb state backend 
and python udfs are not.

> Improvement of share memory manager between jobs if they use the same slot in 
> TaskManager for flink olap queries
> 
>
> Key: FLINK-25328
> URL: https://issues.apache.org/jira/browse/FLINK-25328
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Shammon
>Priority: Major
>
> We submit batch jobs to flink session cluster as olap queries, and these 
> jobs' subtasks in TaskManager are frequently created and destroyed because 
> they finish their work quickly. Each slot in taskmanager manages 
> `MemoryManager` for multiple tasks in one job, and the `MemoryManager` is 
> closed when all the subtasks are finished. Join/Aggregate/Sort and etc. 
> operators in the subtasks allocate `MemorySegment` via `MemoryManager` and 
> these `MemorySegment` will be free when they are finished. 
> 
> It causes too much memory allocation and free of `MemorySegment` in 
> taskmanager. For example, a TaskManager contains 50 slots, one job has 3 
> join/agg operatos run in the slot, each operator will allocate 2000 segments 
> and initialize them. If the subtasks of a job take 100ms to execute, then the 
> taskmanager will execute 10 jobs' subtasks one second and it will allocate 
> and free 2000 * 3 * 50 * 10 = 300w segments for them. Allocate and free too 
> many segments from memory will cause two issues:
> 1) Increases the CPU usage of taskmanager
> 2) Increase the cost of subtasks in taskmanager, which will increase the 
> latency of job and decrease the qps.
>   To improve the usage of memory segment between jobs in the same slot, 
> we propose not drop memory manager when all the subtasks in the slot are 
> finished. The slot will hold the `MemoryManager` and not free the allocated 
> `MemorySegment` in it immediately. When some subtasks of another job are 
> assigned to the slot, they don't need to allocate segments from memory and 
> can reuse the `MemoryManager` and `MemorySegment` in it.  WDYT?  [~xtsong] THX



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18087: [FLINK-25268] Support task manager node-label in Yarn deployment

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18087:
URL: https://github.com/apache/flink/pull/18087#issuecomment-992135476


   
   ## CI report:
   
   * 89360d57d3e879d255d53229f20c96279ad2ecb8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28156)
 
   * 552ab95c290f88a04e2df63232c1dfe5be8f882b Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28305)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] zhipeng93 commented on a change in pull request #30: [FLINK-24845] Add allreduce utility function in FlinkML

2021-12-16 Thread GitBox



zhipeng93 commented on a change in pull request #30:
URL: https://github.com/apache/flink-ml/pull/30#discussion_r771129221



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/common/datastream/AllReduceUtils.java
##
@@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.common.datastream;
+
+import org.apache.flink.api.common.functions.RichFlatMapFunction;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.util.Collector;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Applies all-reduce on a DataStream where each partition contains only one 
double array.
+ *
+ * AllReduce is a communication primitive widely used in MPI. In this 
implementation, all workers
+ * do reduce on a partition of the whole data and they all get the final 
reduce result. In detail,
+ * we split each double array into chunks of fixed size buffer (4KB by 
default) and let each subtask
+ * handle several chunks.
+ *
+ * There're mainly three stages:
+ * 1. All workers send their partial data to other workers for reduce.
+ * 2. All workers do reduce on all data it received and then send partial 
results to others.
+ * 3. All workers merge partial results into final result.
+ */
+public class AllReduceUtils {
+
+private static final int TRANSFER_BUFFER_SIZE = 1024 * 4;
+

Review comment:
   The `TRANSFER_BUFFER_SIZE ` here is actually `4 * 1024 * DOUBLE.bytes` 
(32KB), which is consistent with Flink's network buffer size.
   
   BTW, I don't think the algorithm developper need to see the `chunk size` 
here. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Closed] (FLINK-24845) Add allreduce utility function in FlinkML

2021-12-16 Thread Yun Gao (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-24845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao closed FLINK-24845.
---
  Assignee: Zhipeng Zhang
Resolution: Fixed

> Add allreduce utility function in FlinkML
> -
>
> Key: FLINK-24845
> URL: https://issues.apache.org/jira/browse/FLINK-24845
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Machine Learning
>Reporter: Zhipeng Zhang
>Assignee: Zhipeng Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-0.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-24845) Add allreduce utility function in FlinkML

2021-12-16 Thread Yun Gao (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-24845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-24845:

Fix Version/s: ml-0.1.0

> Add allreduce utility function in FlinkML
> -
>
> Key: FLINK-24845
> URL: https://issues.apache.org/jira/browse/FLINK-24845
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Machine Learning
>Reporter: Zhipeng Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-0.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18087: [FLINK-25268] Support task manager node-label in Yarn deployment

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18087:
URL: https://github.com/apache/flink/pull/18087#issuecomment-992135476


   
   ## CI report:
   
   * 89360d57d3e879d255d53229f20c96279ad2ecb8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28156)
 
   * 552ab95c290f88a04e2df63232c1dfe5be8f882b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-24845) Add allreduce utility function in FlinkML

2021-12-16 Thread Yun Gao (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-24845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461216#comment-17461216
 ] 

Yun Gao commented on FLINK-24845:
-

Fix on master via eeccb82129dd666070a9c1c220f2f8de3b0e5aec

> Add allreduce utility function in FlinkML
> -
>
> Key: FLINK-24845
> URL: https://issues.apache.org/jira/browse/FLINK-24845
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Machine Learning
>Reporter: Zhipeng Zhang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink-ml] gaoyunhaii closed pull request #30: [FLINK-24845] Add allreduce utility function in FlinkML

2021-12-16 Thread GitBox



gaoyunhaii closed pull request #30:
URL: https://github.com/apache/flink-ml/pull/30


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] gaoyunhaii commented on pull request #30: [FLINK-24845] Add allreduce utility function in FlinkML

2021-12-16 Thread GitBox



gaoyunhaii commented on pull request #30:
URL: https://github.com/apache/flink-ml/pull/30#issuecomment-996464991


   Will merge~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] gaoyunhaii commented on a change in pull request #30: [FLINK-24845] Add allreduce utility function in FlinkML

2021-12-16 Thread GitBox



gaoyunhaii commented on a change in pull request #30:
URL: https://github.com/apache/flink-ml/pull/30#discussion_r771127080



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/common/datastream/AllReduceUtils.java
##
@@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.common.datastream;
+
+import org.apache.flink.api.common.functions.RichFlatMapFunction;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.util.Collector;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Applies all-reduce on a DataStream where each partition contains only one 
double array.
+ *
+ * AllReduce is a communication primitive widely used in MPI. In this 
implementation, all workers
+ * do reduce on a partition of the whole data and they all get the final 
reduce result. In detail,
+ * we split each double array into chunks of fixed size buffer (4KB by 
default) and let each subtask
+ * handle several chunks.
+ *
+ * There're mainly three stages:
+ * 1. All workers send their partial data to other workers for reduce.
+ * 2. All workers do reduce on all data it received and then send partial 
results to others.
+ * 3. All workers merge partial results into final result.
+ */
+public class AllReduceUtils {
+
+private static final int TRANSFER_BUFFER_SIZE = 1024 * 4;
+
+/**
+ * Applies allReduce on the input data stream. The input data stream is 
supposed to contain one
+ * double array in each partition. The result data stream has the same 
parallelism as the input,
+ * where each partition contains one double array that sums all of the 
double arrays in the
+ * input data stream.
+ *
+ * Note that we throw exception when one of the following two cases 
happen:
+ * 1. There exists one partition that contains more than one double 
array.
+ * 2. The length of double array is not consistent among all 
partitions.
+ *
+ * @param input The input data stream.
+ * @return The result data stream.
+ */
+public static DataStream allReduce(DataStream input) {
+// chunkId, totalElements, partitionedArray
+DataStream> allReduceSend =
+input.flatMap(new AllReduceSend()).name("all-reduce-send");
+
+// taskId, chunkId, totalElements, partitionedArray
+DataStream> allReduceSum =
+allReduceSend
+.partitionCustom(
+(chunkId, numPartitions) -> chunkId % 
numPartitions, x -> x.f0)
+.transform(
+"all-reduce-sum",
+new TupleTypeInfo<>(
+BasicTypeInfo.INT_TYPE_INFO,
+BasicTypeInfo.INT_TYPE_INFO,
+BasicTypeInfo.INT_TYPE_INFO,
+
PrimitiveArrayTypeInfo.DOUBLE_PRIMITIVE_ARRAY_TYPE_INFO),
+new AllReduceSum())
+.name("all-reduce-sum");
+
+return allReduceSum
+.partitionCustom((taskIdx, numPartitions) -> taskIdx % 
numPartitions, x -> x.f0)
+.transform(
+"all-reduce-recv", TypeInformation.of(double[].class), 
new AllReduceRecv())
+.name("all-reduce-recv");
+}
+
+/**
+ * Splits each double array into multiple chunks and send each chunk to 
the corresponding
+ * partition.
+ */
+private

[GitHub] [flink-ml] gaoyunhaii commented on a change in pull request #30: [FLINK-24845] Add allreduce utility function in FlinkML

2021-12-16 Thread GitBox



gaoyunhaii commented on a change in pull request #30:
URL: https://github.com/apache/flink-ml/pull/30#discussion_r771125346



##
File path: 
flink-ml-lib/src/main/java/org/apache/flink/ml/common/datastream/AllReduceUtils.java
##
@@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.common.datastream;
+
+import org.apache.flink.api.common.functions.RichFlatMapFunction;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
+import org.apache.flink.streaming.api.operators.BoundedOneInput;
+import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.util.Collector;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Applies all-reduce on a DataStream where each partition contains only one 
double array.
+ *
+ * AllReduce is a communication primitive widely used in MPI. In this 
implementation, all workers
+ * do reduce on a partition of the whole data and they all get the final 
reduce result. In detail,
+ * we split each double array into chunks of fixed size buffer (4KB by 
default) and let each subtask
+ * handle several chunks.
+ *
+ * There're mainly three stages:
+ * 1. All workers send their partial data to other workers for reduce.
+ * 2. All workers do reduce on all data it received and then send partial 
results to others.
+ * 3. All workers merge partial results into final result.
+ */
+public class AllReduceUtils {
+
+private static final int TRANSFER_BUFFER_SIZE = 1024 * 4;
+

Review comment:
   I think perhaps we could first see if there are requirements on 
different chunk size from the algorithm's view. If currently we do not have new 
requirements, we could first keep it as is and we could add a new method with 
the chunk size parameter in the future if needed. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18132: [FLINK-25223][connectors/elasticsearch] Disable Elasticsearch 7 Testcontainer tests due to frequent CI failures

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18132:
URL: https://github.com/apache/flink/pull/18132#issuecomment-995779473


   
   ## CI report:
   
   * 4211aa4e7ddc72673fc57693a0db63a55d833d5c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28268)
 
   * 39c21f4d5685ee74a70a6f6391abd0eaf55955c1 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28304)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18132: [FLINK-25223][connectors/elasticsearch] Disable Elasticsearch 7 Testcontainer tests due to frequent CI failures

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18132:
URL: https://github.com/apache/flink/pull/18132#issuecomment-995779473


   
   ## CI report:
   
   * 4211aa4e7ddc72673fc57693a0db63a55d833d5c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28268)
 
   * 39c21f4d5685ee74a70a6f6391abd0eaf55955c1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18131: [FLINK-25223][connectors/elasticsearch] Disable Elasticsearch 7 Testcontainer tests due to frequent CI failures

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18131:
URL: https://github.com/apache/flink/pull/18131#issuecomment-995779333


   
   ## CI report:
   
   * 3db9717e8cc373c68ec93b0b78aae3191469984d Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28267)
 
   * b1de312163a6659db1b084e5e0a6df4d516de815 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28303)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18131: [FLINK-25223][connectors/elasticsearch] Disable Elasticsearch 7 Testcontainer tests due to frequent CI failures

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18131:
URL: https://github.com/apache/flink/pull/18131#issuecomment-995779333


   
   ## CI report:
   
   * 3db9717e8cc373c68ec93b0b78aae3191469984d Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28267)
 
   * b1de312163a6659db1b084e5e0a6df4d516de815 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Closed] (FLINK-24728) Batch SQL file sink forgets to close the output stream

2021-12-16 Thread Jingsong Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-24728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-24728.

Resolution: Fixed

master:

6af0b9965293cb732a540b9364b6aae76a9b356a

2e9f9ad166f472edd693c8a47857e14e76928dc9

release-1.14:

9a0c5e00839983de23f662e337dfc626d0bdaad9

11d24708be32605243bc404679b17758c4e76e79

release-1.13:

3200e8ef43b3024b0b44f184dfa833d1aa7d7d75

> Batch SQL file sink forgets to close the output stream
> --
>
> Key: FLINK-24728
> URL: https://issues.apache.org/jira/browse/FLINK-24728
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.4, 1.14.0, 1.12.5, 1.13.3
>Reporter: Caizhi Weng
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.13.6, 1.14.3
>
>
> I tried to write a large avro file into HDFS and discover that the displayed 
> file size in HDFS is extremely small, but copying that file to local yields 
> the correct size. If we create another Flink job and read that avro file from 
> HDFS, the job will finish without outputting any record because the file size 
> Flink gets from HDFS is the very small file size.
> This is because the output format created in 
> {{FileSystemTableSink#createBulkWriterOutputFormat}} only finishes the 
> {{BulkWriter}}. According to the java doc of {{BulkWriter#finish}} bulk 
> writers should not close the output stream and should leave them to the 
> framework.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] JingsongLi merged pull request #18074: [FLINK-24728][table-runtime-blink] Close output stream in batch SQL file sink

2021-12-16 Thread GitBox



JingsongLi merged pull request #18074:
URL: https://github.com/apache/flink/pull/18074


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] JingsongLi merged pull request #18073: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-12-16 Thread GitBox



JingsongLi merged pull request #18073:
URL: https://github.com/apache/flink/pull/18073


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-25278) Azure failed due to unable to transfer jar from confluent maven repo

2021-12-16 Thread Yun Gao (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-25278:

Priority: Critical  (was: Major)

> Azure failed due to unable to transfer jar from confluent maven repo
> 
>
> Key: FLINK-25278
> URL: https://issues.apache.org/jira/browse/FLINK-25278
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines
>Affects Versions: 1.13.3
>Reporter: Yun Gao
>Priority: Critical
>  Labels: test-stability
>
> {code:java}
> Dec 12 00:46:45 [ERROR] Failed to execute goal on project 
> flink-avro-confluent-registry: Could not resolve dependencies for project 
> org.apache.flink:flink-avro-confluent-registry:jar:1.13-SNAPSHOT: Could not 
> transfer artifact io.confluent:common-utils:jar:5.5.2 from/to confluent 
> (https://packages.confluent.io/maven/): transfer failed for 
> https://packages.confluent.io/maven/io/confluent/common-utils/5.5.2/common-utils-5.5.2.jar:
>  Connection reset -> [Help 1]
> Dec 12 00:46:45 [ERROR] 
> Dec 12 00:46:45 [ERROR] To see the full stack trace of the errors, re-run 
> Maven with the -e switch.
> Dec 12 00:46:45 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> Dec 12 00:46:45 [ERROR] 
> Dec 12 00:46:45 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> Dec 12 00:46:45 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> Dec 12 00:46:45 [ERROR] 
> Dec 12 00:46:45 [ERROR] After correcting the problems, you can resume the 
> build with the command
> Dec 12 00:46:45 [ERROR]   mvn  -rf :flink-avro-confluent-registry
>  {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=27994=logs=a5ef94ef-68c2-57fd-3794-dc108ed1c495=9c1ddabe-d186-5a2c-5fcc-f3cafb3ec699=8812



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-25278) Azure failed due to unable to transfer jar from confluent maven repo

2021-12-16 Thread Yun Gao (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461199#comment-17461199
 ] 

Yun Gao commented on FLINK-25278:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28290=logs=fc5181b0-e452-5c8f-68de-1097947f6483=995c650b-6573-581c-9ce6-7ad4cc038461=19618

> Azure failed due to unable to transfer jar from confluent maven repo
> 
>
> Key: FLINK-25278
> URL: https://issues.apache.org/jira/browse/FLINK-25278
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines
>Affects Versions: 1.13.3
>Reporter: Yun Gao
>Priority: Major
>  Labels: test-stability
>
> {code:java}
> Dec 12 00:46:45 [ERROR] Failed to execute goal on project 
> flink-avro-confluent-registry: Could not resolve dependencies for project 
> org.apache.flink:flink-avro-confluent-registry:jar:1.13-SNAPSHOT: Could not 
> transfer artifact io.confluent:common-utils:jar:5.5.2 from/to confluent 
> (https://packages.confluent.io/maven/): transfer failed for 
> https://packages.confluent.io/maven/io/confluent/common-utils/5.5.2/common-utils-5.5.2.jar:
>  Connection reset -> [Help 1]
> Dec 12 00:46:45 [ERROR] 
> Dec 12 00:46:45 [ERROR] To see the full stack trace of the errors, re-run 
> Maven with the -e switch.
> Dec 12 00:46:45 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> Dec 12 00:46:45 [ERROR] 
> Dec 12 00:46:45 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> Dec 12 00:46:45 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> Dec 12 00:46:45 [ERROR] 
> Dec 12 00:46:45 [ERROR] After correcting the problems, you can resume the 
> build with the command
> Dec 12 00:46:45 [ERROR]   mvn  -rf :flink-avro-confluent-registry
>  {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=27994=logs=a5ef94ef-68c2-57fd-3794-dc108ed1c495=9c1ddabe-d186-5a2c-5fcc-f3cafb3ec699=8812



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-25358) New File Sink end-to-end test failed due to TM could not connect to RM

2021-12-16 Thread Yun Gao (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-25358:

Labels: test-stability  (was: )

> New File Sink end-to-end test failed due to TM could not connect to RM
> --
>
> Key: FLINK-25358
> URL: https://issues.apache.org/jira/browse/FLINK-25358
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Runtime / Coordination
>Affects Versions: 1.14.2
>Reporter: Yun Gao
>Priority: Major
>  Labels: test-stability
>
> {code:java}
> 2021-12-16T16:53:21.8386046Z Dec 16 16:53:15 2021-12-16 16:39:16,946 INFO  
> org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Start job 
> leader service.
> 2021-12-16T16:53:21.8386987Z Dec 16 16:53:15 2021-12-16 16:39:16,948 INFO  
> org.apache.flink.runtime.filecache.FileCache [] - User file 
> cache uses directory 
> /tmp/flink-dist-cache-3a780c9f-c395-4355-b496-ade62a0757ad
> 2021-12-16T16:53:21.8388219Z Dec 16 16:53:15 2021-12-16 16:39:16,960 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Connecting 
> to ResourceManager 
> akka.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*().
> 2021-12-16T16:53:21.8389898Z Dec 16 16:53:15 2021-12-16 16:39:17,485 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$a]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8391767Z Dec 16 16:53:15 2021-12-16 16:39:18,851 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8393631Z Dec 16 16:53:15 2021-12-16 16:39:20,374 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8395509Z Dec 16 16:53:15 2021-12-16 16:39:20,376 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8397548Z Dec 16 16:53:15 2021-12-16 16:39:27,018 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Could not 
> resolve ResourceManager address 
> akka.ssl.tcp://flink@localhost:***@localhost:6123/user/rpc/resourcemanager_*.
> 2021-12-16T16:53:21.8399173Z Dec 16 16:53:15 2021-12-16 16:39:28,864 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8401223Z Dec 16 16:53:15 2021-12-16 16:39:30,357 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8403087Z Dec 16 16:53:15 2021-12-16 16:39:37,036 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$b]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using akka.remote.use-passive-connections=off 
> or use Artery TCP.
> 2021-12-16T16:53:21.8404956Z Dec 16 16:53:15 2021-12-16 16:39:38,886 WARN  
> akka.remote.EndpointReader   [] - Discarding 
> inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] 
> in read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this 
> happens often you may consider using

[jira] [Created] (FLINK-25358) New File Sink end-to-end test failed due to TM could not connect to RM

2021-12-16 Thread Yun Gao (Jira)

Yun Gao created FLINK-25358:
---

 Summary: New File Sink end-to-end test failed due to TM could not 
connect to RM
 Key: FLINK-25358
 URL: https://issues.apache.org/jira/browse/FLINK-25358
 Project: Flink
  Issue Type: Bug
  Components: Build System / Azure Pipelines, Runtime / Coordination
Affects Versions: 1.14.2
Reporter: Yun Gao



{code:java}
2021-12-16T16:53:21.8386046Z Dec 16 16:53:15 2021-12-16 16:39:16,946 INFO  
org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Start job 
leader service.
2021-12-16T16:53:21.8386987Z Dec 16 16:53:15 2021-12-16 16:39:16,948 INFO  
org.apache.flink.runtime.filecache.FileCache [] - User file 
cache uses directory /tmp/flink-dist-cache-3a780c9f-c395-4355-b496-ade62a0757ad
2021-12-16T16:53:21.8388219Z Dec 16 16:53:15 2021-12-16 16:39:16,960 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Connecting to 
ResourceManager 
akka.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*().
2021-12-16T16:53:21.8389898Z Dec 16 16:53:15 2021-12-16 16:39:17,485 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$a]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8391767Z Dec 16 16:53:15 2021-12-16 16:39:18,851 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8393631Z Dec 16 16:53:15 2021-12-16 16:39:20,374 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8395509Z Dec 16 16:53:15 2021-12-16 16:39:20,376 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8397548Z Dec 16 16:53:15 2021-12-16 16:39:27,018 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - Could not 
resolve ResourceManager address 
akka.ssl.tcp://flink@localhost:***@localhost:6123/user/rpc/resourcemanager_*.
2021-12-16T16:53:21.8399173Z Dec 16 16:53:15 2021-12-16 16:39:28,864 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8401223Z Dec 16 16:53:15 2021-12-16 16:39:30,357 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8403087Z Dec 16 16:53:15 2021-12-16 16:39:37,036 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/temp/_user_rpc_resourcemanager_*$b]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8404956Z Dec 16 16:53:15 2021-12-16 16:39:38,886 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using akka.remote.use-passive-connections=off or use 
Artery TCP.
2021-12-16T16:53:21.8406801Z Dec 16 16:53:15 2021-12-16 16:39:40,374 WARN  
akka.remote.EndpointReader   [] - Discarding 
inbound message to [Actor[akka://flink/user/rpc/taskmanager_0#-1901973634]] in 
read-only association to [akka.ssl.tcp://flink@localhost:6123]. If this happens 
often you may consider using

[GitHub] [flink] wenlong88 commented on pull request #18017: [FLINK-25171] Validation of duplicate fields in ddl sql

2021-12-16 Thread GitBox



wenlong88 commented on pull request #18017:
URL: https://github.com/apache/flink/pull/18017#issuecomment-996448224


   LGTM，cc @godfreyhe  to do the final check


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] JingsongLi commented on a change in pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



JingsongLi commented on a change in pull request #18088:
URL: https://github.com/apache/flink/pull/18088#discussion_r77853



##
File path: 
flink-table/flink-table-common/src/main/java/org/apache/flink/table/factories/FactoryUtil.java
##
@@ -620,10 +620,16 @@ public static String getFormatPrefix(
 Class factoryClass, DynamicTableFactory.Context context) {
 final String connectorOption = 
context.getCatalogTable().getOptions().get(CONNECTOR.key());
 if (connectorOption == null) {
-throw new ValidationException(
-String.format(
-"Table options do not contain an option key '%s' 
for discovering a connector.",
-CONNECTOR.key()));
+ManagedTableFactory factory =

Review comment:
   You have convinced me that I think we can use `default`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25283) End-to-end application modules create oversized jars

2021-12-16 Thread Dian Fu (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461196#comment-17461196
 ] 

Dian Fu commented on FLINK-25283:
-

[~chesnay]  [~MartijnVisser] Could you share more information about this? For 
example, what you are trying to do in this ticket and the failure for the 
Python tests?

> End-to-end application modules create oversized jars
> 
>
> Key: FLINK-25283
> URL: https://issues.apache.org/jira/browse/FLINK-25283
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Tests
>Affects Versions: 1.13.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
> Fix For: 1.15.0
>
>
> Various modules that create jars for e2e tests (e.g., 
> flink-streaming-kinesis-test) create oversized jars (100mb+) because they 
> bundle their entire dependency tree, including many parts of Flink, along 
> with test dependencies and test resources.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #17582: [FLINK-24674][kubernetes] Create corresponding resouces for task manager Pods

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #17582:
URL: https://github.com/apache/flink/pull/17582#issuecomment-953241058


   
   ## CI report:
   
   * ed1e59cced3590c5cbb27e0214b3746d148c8aa2 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28292)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #17582: [FLINK-24674][kubernetes] Create corresponding resouces for task manager Pods

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #17582:
URL: https://github.com/apache/flink/pull/17582#issuecomment-953241058


   
   ## CI report:
   
   * ed1e59cced3590c5cbb27e0214b3746d148c8aa2 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28292)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] mbalassi commented on pull request #17582: [FLINK-24674][kubernetes] Create corresponding resouces for task manager Pods

2021-12-16 Thread GitBox



mbalassi commented on pull request #17582:
URL: https://github.com/apache/flink/pull/17582#issuecomment-996443534


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-24119) KafkaITCase.testTimestamps fails due to "Topic xxx already exist"

2021-12-16 Thread Yun Gao (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-24119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461192#comment-17461192
 ] 

Yun Gao commented on FLINK-24119:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28260=logs=c5f0071e-1851-543e-9a45-9ac140befc32=15a22db7-8faa-5b34-3920-d33c9f0ca23c

> KafkaITCase.testTimestamps fails due to "Topic xxx already exist"
> -
>
> Key: FLINK-24119
> URL: https://issues.apache.org/jira/browse/FLINK-24119
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0
>Reporter: Xintong Song
>Assignee: Fabian Paul
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.15.0, 1.14.3
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23328=logs=c5f0071e-1851-543e-9a45-9ac140befc32=15a22db7-8faa-5b34-3920-d33c9f0ca23c=7419
> {code}
> Sep 01 15:53:20 [ERROR] Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, 
> Time elapsed: 162.65 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.kafka.KafkaITCase
> Sep 01 15:53:20 [ERROR] testTimestamps  Time elapsed: 23.237 s  <<< FAILURE!
> Sep 01 15:53:20 java.lang.AssertionError: Create test topic : tstopic failed, 
> org.apache.kafka.common.errors.TopicExistsException: Topic 'tstopic' already 
> exists.
> Sep 01 15:53:20   at org.junit.Assert.fail(Assert.java:89)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.createTestTopic(KafkaTestEnvironmentImpl.java:226)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironment.createTestTopic(KafkaTestEnvironment.java:112)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaTestBase.createTestTopic(KafkaTestBase.java:212)
> Sep 01 15:53:20   at 
> org.apache.flink.streaming.connectors.kafka.KafkaITCase.testTimestamps(KafkaITCase.java:191)
> Sep 01 15:53:20   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Sep 01 15:53:20   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Sep 01 15:53:20   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Sep 01 15:53:20   at java.lang.reflect.Method.invoke(Method.java:498)
> Sep 01 15:53:20   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Sep 01 15:53:20   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> Sep 01 15:53:20   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> Sep 01 15:53:20   at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Sep 01 15:53:20   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28300)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Comment Edited] (FLINK-25021) Source/Sink for Azure Data Explorer (ADX)

2021-12-16 Thread Xue Wang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461169#comment-17461169
 ] 

Xue Wang edited comment on FLINK-25021 at 12/17/21, 3:22 AM:
-

Hi [~MartijnVisser], I checked the Kusto Java client in detail today and found 
they haven't implemented an Async client as in C#. What if I wrap the 
synchronous APIs with CompletableFuture like 
[this|https://github.com/Azure/azure-kusto-java/blob/b3d73b8bb1334c2565e551b7d1d2853d14f3fd13/samples/src/main/java/FileIngestionCompletableFuture.java#L77]
 and provide a customized executor? Any concerns or suggestions? Or should I 
use FLIP-143 instead?


was (Author: xwang51):
Hi [~MartijnVisser], I checked the Kusto Java client in detail today and found 
they haven't implemented an Async client as in C#. What if I wrap the 
synchronous APIs with CompletableFuture like 
[this|https://github.com/Azure/azure-kusto-java/blob/b3d73b8bb1334c2565e551b7d1d2853d14f3fd13/samples/src/main/java/FileIngestionCompletableFuture.java#L77]?
 Any concerns?

> Source/Sink for Azure Data Explorer (ADX)
> -
>
> Key: FLINK-25021
> URL: https://issues.apache.org/jira/browse/FLINK-25021
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Xue Wang
>Assignee: Xue Wang
>Priority: Major
>
> Hi all,
> I'm considering implementing source/sink for Azure Data Explorer (ADX). But 
> first I'd like to check with the community if this is already a supported 
> scenario. If not, will you consider adding it to the list of official 
> connectors?
> FWIW, Azure Data Explorer is a widely used analytic service on Azure well 
> suited for ad-hoc and time series analysis over large volume of structured, 
> semi-structured, and unstructured data. And it can integrate with a wide 
> range of data sources and visualization tools. 
> References:
> [General Intro by ADX 
> PM|https://vincentlauzon.com/2020/02/19/azure-data-explorer-kusto]
> [ADX 
> Documentation|https://docs.microsoft.com/en-us/azure/data-explorer/data-explorer-overview]
> [Ingest data using the Azure Data Explorer Java 
> SDK|https://docs.microsoft.com/en-us/azure/data-explorer/java-ingest-data]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] xiangqiao123 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



xiangqiao123 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r770484017



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, false, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicPointWiseGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, false, 
Arrays.asList(4, 3));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicAllToAllGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, true, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicPointWiseGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, true, 
Arrays.asList(4, 4));

Review comment:
   Will this scenario exist in real job?
   consumerParallelism != -1 and isDynamicGraph = true




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Closed] (FLINK-24955) Add One-hot Encoder to Flink ML

2021-12-16 Thread Yun Gao (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-24955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao closed FLINK-24955.
---
  Assignee: Yunfeng Zhou
Resolution: Fixed

> Add One-hot Encoder to Flink ML
> ---
>
> Key: FLINK-24955
> URL: https://issues.apache.org/jira/browse/FLINK-24955
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Machine Learning
>Reporter: Yunfeng Zhou
>Assignee: Yunfeng Zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-0.1.0
>
>
> Add One-hot Encoder to Flink ML



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-24955) Add One-hot Encoder to Flink ML

2021-12-16 Thread Yun Gao (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-24955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461174#comment-17461174
 ] 

Yun Gao commented on FLINK-24955:
-

Merge on master via 79d0c247e0cc900c9a16e045a99000423003780b

> Add One-hot Encoder to Flink ML
> ---
>
> Key: FLINK-24955
> URL: https://issues.apache.org/jira/browse/FLINK-24955
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Machine Learning
>Reporter: Yunfeng Zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-0.1.0
>
>
> Add One-hot Encoder to Flink ML



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-24955) Add One-hot Encoder to Flink ML

2021-12-16 Thread Yun Gao (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-24955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-24955:

Fix Version/s: ml-0.1.0

> Add One-hot Encoder to Flink ML
> ---
>
> Key: FLINK-24955
> URL: https://issues.apache.org/jira/browse/FLINK-24955
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Machine Learning
>Reporter: Yunfeng Zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: ml-0.1.0
>
>
> Add One-hot Encoder to Flink ML



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] xiangqiao123 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



xiangqiao123 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r770483911



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, false, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicPointWiseGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, false, 
Arrays.asList(4, 3));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicAllToAllGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, true, 
Arrays.asList(7, 7));

Review comment:
   Will this scenario exist in real job?
   consumerParallelism != -1 and isDynamicGraph = true




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] gaoyunhaii closed pull request #37: [FLINK-24955] Add Estimator and Transformer for One Hot Encoder

2021-12-16 Thread GitBox



gaoyunhaii closed pull request #37:
URL: https://github.com/apache/flink-ml/pull/37


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-ml] gaoyunhaii commented on pull request #37: [FLINK-24955] Add Estimator and Transformer for One Hot Encoder

2021-12-16 Thread GitBox



gaoyunhaii commented on pull request #37:
URL: https://github.com/apache/flink-ml/pull/37#issuecomment-996403629


   merging...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] xiangqiao123 commented on a change in pull request #18050: [FLINK-25034][runtime] Support flexible number of subpartitions in IntermediateResultPartition

2021-12-16 Thread GitBox



xiangqiao123 commented on a change in pull request #18050:
URL: https://github.com/apache/flink/pull/18050#discussion_r770484017



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, false, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicPointWiseGraph() 
throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, false, 
Arrays.asList(4, 3));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicAllToAllGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.ALL_TO_ALL, true, 
Arrays.asList(7, 7));
+}
+
+@Test
+public void 
testGetNumberOfSubpartitionsFromConsumerParallelismForDynamicPointWiseGraph()
+throws Exception {
+testGetNumberOfSubpartitions(7, DistributionPattern.POINTWISE, true, 
Arrays.asList(4, 4));

Review comment:
   Will this scenario exist in real job?

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/InternalExecutionGraphAccessor.java
##
@@ -109,4 +110,8 @@ void notifySchedulerNgAboutInternalTaskFailure(
 void deleteBlobs(List blobKeys);
 
 void initializeJobVertex(ExecutionJobVertex ejv, long createTimestamp) 
throws JobException;
+
+ExecutionJobVertex getJobVertex(JobVertexID id);

Review comment:
   Is it better to use `getExecutionJobVertex` for the method name？

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartition.java
##
@@ -77,6 +85,69 @@ public ResultPartitionType getResultType() {
 return getEdgeManager().getConsumedPartitionGroupsById(partitionId);
 }
 
+public int getNumberOfSubpartitions() {
+if (numberOfSubpartitions == UNKNOWN) {
+numberOfSubpartitions = getOrComputeNumberOfSubpartitions();
+}
+checkState(
+numberOfSubpartitions > 0,
+"Number of subpartitions is an unexpected value: " + 
numberOfSubpartitions);
+
+return numberOfSubpartitions;
+}
+
+private int getOrComputeNumberOfSubpartitions() {
+if (!getProducer().getExecutionGraphAccessor().isDynamic()) {
+// The produced data is partitioned among a number of 
subpartitions.
+//
+// If no consumers are known at this point, we use a single 
subpartition, otherwise we
+// have one for each consuming sub task.
+int numberOfSubpartitions = 1;
+List consumerVertexGroups = 
getConsumerVertexGroups();
+if (!consumerVertexGroups.isEmpty() && 
!consumerVertexGroups.get(0).isEmpty()) {
+if (consumerVertexGroups.size() > 1) {
+throw new IllegalStateException(
+"Currently, only a single consumer group per 
partition is supported.");
+}
+numberOfSubpartitions = consumerVertexGroups.get(0).size();
+}
+
+return numberOfSubpartitions;
+} else {
+if (totalResult.isBroadcast()) {
+// for dynamic graph and broadcast result, we only produced 
one subpartition,
+// and all the downstream vertices should consume this 
subpartition.
+return 1;
+} else {
+return computeNumberOfMaxPossiblePartitionConsumers();
+}
+}
+}
+
+private int computeNumberOfMaxPossiblePartitionConsumers() {
+final ExecutionJobVertex consumerJobVertex =
+getIntermediateResult().getConsumerExecutionJobVertex();
+final DistributionPattern distributionPattern =
+getIntermediateResult().getConsumingDistributionPattern();
+
+// decide the max possible consumer job vertex parallelism
+int maxConsumerJobVertexParallelism = 
consumerJobVertex.getParallelism();

Review comment:
   It doesn't seem necessary，and may may cause `return (int) 
Math.ceil(((double) maxConsumerJobVertexParallelism) / numberOfPartitions);`  
is 0

##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartitionTest.java
##
@@ -140,6 +150,127 @@ public void testBlockingPartitionResetting() throws 
Exception {
 assertFalse(consumedPartitionGroup.areAllPartitionsFinished());
 }
 
+@Test
+public void testGetNumberOfSubpartitionsForNonDynamicAllToAllGraph()

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Created] (FLINK-25357) SQL planner incorrectly changes a streaming join with FLOOR(rowtime) into interval join

2021-12-16 Thread Caizhi Weng (Jira)

Caizhi Weng created FLINK-25357:
---

 Summary: SQL planner incorrectly changes a streaming join with 
FLOOR(rowtime) into interval join
 Key: FLINK-25357
 URL: https://issues.apache.org/jira/browse/FLINK-25357
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.14.2
Reporter: Caizhi Weng


This issue is reported from the [user mailing 
list|https://lists.apache.org/thread/v8omhomp58hb8m5dj4noxbr1dsyy6zjl].

Add the following test case to {{TableEnvironmentITCase}} to reproduce this 
issue.
{code:scala}
@Test
def myTest(): Unit = {
  val data = Seq(
Row.of(
  "1",
  java.time.LocalDateTime.of(2021, 12, 13, 12, 5, 8)
),
Row.of(
  "1",
  java.time.LocalDateTime.of(2021, 12, 13, 13, 5, 4)
),
Row.of(
  "1",
  java.time.LocalDateTime.of(2021, 12, 13, 14, 5, 6)
)
  )

  tEnv.executeSql(
s"""
   |create table T (
   |  id STRING,
   |  b TIMESTAMP(3),
   |  WATERMARK FOR b AS b - INTERVAL '60' MINUTES
   |) WITH (
   |  'connector' = 'values',
   |  'bounded' = 'true',
   |  'data-id' = '${TestValuesTableFactory.registerData(data)}'
   |)
   |""".stripMargin)
  tEnv.executeSql(
"""
  |SELECT
  |  source.id AS sourceid,
  |  CAST(source.b AS TIMESTAMP) AS source_startat,
  |  CAST(target.b AS TIMESTAMP) AS target_startat
  |FROM T source, T target
  |WHERE source.id = target.id
  |AND source.id IN ('1', '2', '3')
  |AND source.b >= FLOOR(target.b TO HOUR) + INTERVAL '1' HOUR AND source.b 
< FLOOR(target.b TO HOUR) + INTERVAL '2' HOUR
  |""".stripMargin).print()
}
{code}

Results (correct) for the batch task is
{code}
++++
|   sourceid | source_startat | 
target_startat |
++++
|  1 | 2021-12-13 13:05:04.00 | 2021-12-13 
12:05:08.00 |
|  1 | 2021-12-13 14:05:06.00 | 2021-12-13 
13:05:04.00 |
++++
{code}

Results (incorrect) for the streaming task is
{code}
+++++
| op |   sourceid | source_startat |
 target_startat |
+++++
| +I |  1 | 2021-12-13 14:05:06.00 | 2021-12-13 
12:05:08.00 |
| +I |  1 | 2021-12-13 14:05:06.00 | 2021-12-13 
13:05:04.00 |
+++++
{code}

Plan for the streaming task is
{code}
LogicalProject(sourceid=[$0], source_startat=[CAST($1):TIMESTAMP(6)], 
target_startat=[CAST($3):TIMESTAMP(6)])
+- LogicalFilter(condition=[AND(=($0, $2), OR(=($0, _UTF-16LE'1'), =($0, 
_UTF-16LE'2'), =($0, _UTF-16LE'3')), >=($1, +(FLOOR($3, FLAG(HOUR)), 
360:INTERVAL HOUR)), <($1, +(FLOOR($3, FLAG(HOUR)), 720:INTERVAL 
HOUR)))])
   +- LogicalJoin(condition=[true], joinType=[inner])
  :- LogicalWatermarkAssigner(rowtime=[b], watermark=[-($1, 
360:INTERVAL MINUTE)])
  :  +- LogicalTableScan(table=[[default_catalog, default_database, T]])
  +- LogicalWatermarkAssigner(rowtime=[b], watermark=[-($1, 
360:INTERVAL MINUTE)])
 +- LogicalTableScan(table=[[default_catalog, default_database, T]])

== Optimized Physical Plan ==
Calc(select=[id AS sourceid, CAST(CAST(b)) AS source_startat, CAST(CAST(b0)) AS 
target_startat])
+- IntervalJoin(joinType=[InnerJoin], windowBounds=[isRowTime=true, 
leftLowerBound=360, leftUpperBound=719, leftTimeIndex=1, 
rightTimeIndex=1], where=[AND(=(id, id0), >=(b, +(FLOOR(b0, FLAG(HOUR)), 
360:INTERVAL HOUR)), <(b, +(FLOOR(b0, FLAG(HOUR)), 720:INTERVAL 
HOUR)))], select=[id, b, id0, b0])
   :- Exchange(distribution=[hash[id]])
   :  +- Calc(select=[id, b], where=[SEARCH(id, Sarg[_UTF-16LE'1', 
_UTF-16LE'2', _UTF-16LE'3']:CHAR(1) CHARACTER SET "UTF-16LE")])
   : +- WatermarkAssigner(rowtime=[b], watermark=[-(b, 360:INTERVAL 
MINUTE)])
   :+- TableSourceScan(table=[[default_catalog, default_database, T]], 
fields=[id, b])
   +- Exchange(distribution=[hash[id]])
  +- WatermarkAssigner(rowtime=[b], watermark=[-(b, 360:INTERVAL 
MINUTE)])
 +- TableSourceScan(table=[[default_catalog, default_database, T]], 
fields=[id, b])

== Optimized Execution Plan ==
Calc(select=[id AS sourceid, CAST(CAST(b)) AS source_startat, CAST(CAST(b0)) AS 
target_startat])
+- IntervalJoin(joinType=[InnerJoin],

[jira] [Commented] (FLINK-25167) Support user-defined `StreamOperatorFactory` in `ConnectedStreams`#transform

2021-12-16 Thread Lsw_aka_laplace (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461171#comment-17461171
 ] 

Lsw_aka_laplace commented on FLINK-25167:
-

[~pnowojski] Sorry for late reply. I guess we have reached an agreement on 
exposing {{StreamOperatorFactory}}, right?  If so, would you mind assigning  
this ticket to me?  I’ll work on it, thanks.

> Support user-defined `StreamOperatorFactory` in `ConnectedStreams`#transform
> 
>
> Key: FLINK-25167
> URL: https://issues.apache.org/jira/browse/FLINK-25167
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Lsw_aka_laplace
>Priority: Minor
>
>   From my side, it is necessary to set my custom `StreamOperatorFactory` when 
> I ’m calling  `ConnectedStreams`#transform so that I can set up my own 
> `OperatorCoordinator`. 
>  Well, currently, `ConnectStreams` seems not to give the access, the default 
> behavior is using `SimpleOperatorFactory`.  After checking the code, I think 
> it is a trivial change to support that. If no one is working on it, I'm 
> willing to doing that.  : )



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] jelly-1203 commented on pull request #18017: [FLINK-25171] Validation of duplicate fields in ddl sql

2021-12-16 Thread GitBox



jelly-1203 commented on pull request #18017:
URL: https://github.com/apache/flink/pull/18017#issuecomment-996393881


   Hi, @wenlong88 Could you please review it and see what needs to be improved


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25021) Source/Sink for Azure Data Explorer (ADX)

2021-12-16 Thread Xue Wang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461169#comment-17461169
 ] 

Xue Wang commented on FLINK-25021:
--

Hi [~MartijnVisser], I checked the Kusto Java client in detail today and found 
they haven't implemented an Async client as in C#. What if I wrap the 
synchronous APIs with CompletableFuture like 
[this|https://github.com/Azure/azure-kusto-java/blob/b3d73b8bb1334c2565e551b7d1d2853d14f3fd13/samples/src/main/java/FileIngestionCompletableFuture.java#L77]?
 Any concerns?

> Source/Sink for Azure Data Explorer (ADX)
> -
>
> Key: FLINK-25021
> URL: https://issues.apache.org/jira/browse/FLINK-25021
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Xue Wang
>Assignee: Xue Wang
>Priority: Major
>
> Hi all,
> I'm considering implementing source/sink for Azure Data Explorer (ADX). But 
> first I'd like to check with the community if this is already a supported 
> scenario. If not, will you consider adding it to the list of official 
> connectors?
> FWIW, Azure Data Explorer is a widely used analytic service on Azure well 
> suited for ad-hoc and time series analysis over large volume of structured, 
> semi-structured, and unstructured data. And it can integrate with a wide 
> range of data sources and visualization tools. 
> References:
> [General Intro by ADX 
> PM|https://vincentlauzon.com/2020/02/19/azure-data-explorer-kusto]
> [ADX 
> Documentation|https://docs.microsoft.com/en-us/azure/data-explorer/data-explorer-overview]
> [Ingest data using the Azure Data Explorer Java 
> SDK|https://docs.microsoft.com/en-us/azure/data-explorer/java-ingest-data]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-962917757


   
   ## CI report:
   
   * d143a927289d7f6a88851d6db732169c4752ff6e Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28160)
 
   * b4e978359c87b33f8c5de84f6e0987f712335b19 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28299)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-962917757


   
   ## CI report:
   
   * d143a927289d7f6a88851d6db732169c4752ff6e Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28160)
 
   * b4e978359c87b33f8c5de84f6e0987f712335b19 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] yangjunhan commented on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2021-12-16 Thread GitBox



yangjunhan commented on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-996390103


   > I think this will confuse the majority of developers as most do not work 
on the UI. Can we do something to restore the old behavior of not producing 
output unless the commit contains changes to the UI?
   
   @Airblader  There is such output because "flink-runtime-web" is introducing 
**husky + lint-staged** for linting which essentially uses git hook through 
pre-commit, and the pre-commit is at the repo level.
   
   One way I can think of to restore the older behavior is triggering npm 
script lint-staged only when `git diff --cached --quiet -- 
"flink-runtime-web/web-dashboard/*"` (maybe written in an independent shell 
script). Could you think of any other better way to achieve so?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25337) Check whether the target table is valid when SqlToOperationConverter.convertSqlInsert

2021-12-16 Thread vim-wang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461167#comment-17461167
 ] 

vim-wang commented on FLINK-25337:
--

英文不好，我用中文描述一下哈。

是想实现对一段sql做语法检测的功能，比如下面这段sql：

CREATE TABLE datagen (
  f1 BIGINT,
  f2 STRING
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1'
);
CREATE TABLE print_table (
  f1 BIGINT,
  f2 STRING
) WITH (
  'connector' = 'print'
);
INSERT INTO print_table
SELECT 
  f1,
  f2
FROM datagen;

 

原本的实现方式是：

SqlParser sqlParser = SqlParser.create(sqls, sqlparserConfig); //sqls就是指上面那段sql
List sqlNodeList = sqlParser.parseStmtList().getList();

for (SqlNode sqlNode : sqlNodeList) {
    Operation operation = 
SqlToOperationConverter.convert(planner.createFlinkPlanner(),catalogManager, 
sqlNode).get();

}

 

这样做就存在之前描述中的问题，INSERT INTO 
print_table，print_table写成任何值SqlToOperationConverter.convert都不会报异常。所以我后来把operation也做了处理，比如modifyOperations调用planner.translate(modifyOperations)，这样功能也能实现，但多了不少代码。

想问问针对这个场景，有便捷的实现方式吗？希望大佬指点。

> Check whether the target table is valid when 
> SqlToOperationConverter.convertSqlInsert
> -
>
> Key: FLINK-25337
> URL: https://issues.apache.org/jira/browse/FLINK-25337
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.14.0
>Reporter: vim-wang
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> when I execute insert sql like "insert into t1 select ...", 
> If the t1 is not defined，sql will not throw an exception after 
> SqlToOperationConverter.convertSqlInsert(), I think this is unreasonable, why 
> not use catalogManager to check whether the target table is valid?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25330) Flink SQL doesn't retract all versions of Hbase data

2021-12-16 Thread Bruce Wong (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461166#comment-17461166
 ] 

Bruce Wong commented on FLINK-25330:


Hi, [~jingge] 

As for HBase usage in the CDC scenario, I can prepare the Docker example later.
Also, I would like to know when using Flink and HBase would require removing 
the last version and keeping the previous one.

Thanks for your reply.

> Flink SQL doesn't retract all versions of Hbase data
> 
>
> Key: FLINK-25330
> URL: https://issues.apache.org/jira/browse/FLINK-25330
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / HBase
>Affects Versions: 1.14.0
>Reporter: Bruce Wong
>Assignee: Jing Ge
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2021-12-15-20-05-18-236.png
>
>
> h2. Background
> When we use CDC to synchronize mysql data to HBase, we find that HBase 
> deletes only the last version of the specified rowkey when deleting mysql 
> data. The data of the old version still exists. You end up using the wrong 
> data. And I think its a bug of HBase connector.
> The following figure shows Hbase data changes before and after mysql data is 
> deleted.
> !image-2021-12-15-20-05-18-236.png|width=910,height=669!
>  
> h2.  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   * 075b1874f67ad8a148e73e8a00e0fcd122d5a7f7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28298)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18088: [FLINK-25174][table] Introduce managed table interfaces and callback

2021-12-16 Thread GitBox



flinkbot edited a comment on pull request #18088:
URL: https://github.com/apache/flink/pull/18088#issuecomment-992202380


   
   ## CI report:
   
   * 08822a7f25a5dd09e3e235da0c6e8738bb9c6bde Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28226)
 
   * 3bfb0e859ac9f775571ecb6ab8db2c27a1b0fcce UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Created] (FLINK-25356) Add benchmarks for performance in OLAP scenarios

2021-12-16 Thread Xintong Song (Jira)

Xintong Song created FLINK-25356:


 Summary: Add benchmarks for performance in OLAP scenarios
 Key: FLINK-25356
 URL: https://issues.apache.org/jira/browse/FLINK-25356
 Project: Flink
  Issue Type: Sub-task
  Components: Benchmarks
Reporter: Xintong Song


As discussed in FLINK-25318, we would need a unified, public visible benchmark 
setups, for supporting OLAP performance improvements and investigations.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-25318) Improvement of scheduler and execution for Flink OLAP

2021-12-16 Thread Xintong Song (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461164#comment-17461164
 ] 

Xintong Song commented on FLINK-25318:
--

I think [~pnowojski] has raised a very important point.

Currently, we have been relying on ByteDance & Alibaba internal benchmark 
setups for performance analysis. In order for the wide community get involved 
(at least being cautious not to break things), we certainly need a unified & 
public visible benchmark setup, either the micro benchmarks or something else.

I've created a subtask FLINK-25356 to track this effort.

> Improvement of scheduler and execution for Flink OLAP
> -
>
> Key: FLINK-25318
> URL: https://issues.apache.org/jira/browse/FLINK-25318
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Network
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Shammon
>Priority: Major
>  Labels: Umbrella
> Fix For: 1.15.0
>
>
> We use flink to perform OLAP queries. We launch flink session cluster, submit 
> batch jobs to the cluster as OLAP queries, and fetch the jobs' results. OLAP 
> jobs are generally small queries which will finish at the seconds or 
> milliseconds, and users always submit multiple jobs to the session cluster 
> concurrently. We found the qps and latency of jobs will be greatly affected 
> when there're tens jobs are running, even when there's little data in each 
> query. We will give the result of benchmark for the latest version later.
> After discussed with [~xtsong], and thanks for his advice, we create this 
> issue to trace and manager Flink OLAP related improvements. More users and 
> developers are welcome and feel free to create Flink OLAP related subtasks 
> here, thanks



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-25318) Improvement of scheduler and execution for Flink OLAP

2021-12-16 Thread Shammon (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461158#comment-17461158
 ] 

Shammon commented on FLINK-25318:
-

Hi [~pnowojski] Thanks for your comment. The suggestion about tests sounds 
good, I like it and I agree that it's very important. I have read [various 
micro 
benchmarks|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=115511847]
 and clone flink-benchmarks
code, it's wonderful! I will think about the tests for flink olap, discuss it 
with [~xtsong], we will give the design and plan then. Thanks for [~xtsong] to 
push this in the community, and I also hope you [~pnowojski] can help to review 
them when you are free, thanks

In briefly we use flink session cluster in our htap system in bytedance for 
olap, and this system serves some businesses in production. We met some  
bottlenecks about qps and latency on job scheduler and query execution when 
flink cluster executes multiple jobs in parallel. We want to improve flink on 
them, and I think some features are good for flink streaming && batch too. But 
indeed it should be careful and the tests on these improvements is necessary 
and important!

We mentioned this issue and hope that more OLAP and flink devs will participate 
in and promote the progress of Flink in OLAP. THX :)



> Improvement of scheduler and execution for Flink OLAP
> -
>
> Key: FLINK-25318
> URL: https://issues.apache.org/jira/browse/FLINK-25318
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Network
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Shammon
>Priority: Major
>  Labels: Umbrella
> Fix For: 1.15.0
>
>
> We use flink to perform OLAP queries. We launch flink session cluster, submit 
> batch jobs to the cluster as OLAP queries, and fetch the jobs' results. OLAP 
> jobs are generally small queries which will finish at the seconds or 
> milliseconds, and users always submit multiple jobs to the session cluster 
> concurrently. We found the qps and latency of jobs will be greatly affected 
> when there're tens jobs are running, even when there's little data in each 
> query. We will give the result of benchmark for the latest version later.
> After discussed with [~xtsong], and thanks for his advice, we create this 
> issue to trace and manager Flink OLAP related improvements. More users and 
> developers are welcome and feel free to create Flink OLAP related subtasks 
> here, thanks



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Comment Edited] (FLINK-25321) standalone deploy on k8s,pod always OOM killed,actual heap memory usage is normal, gc is normal

2021-12-16 Thread Gao Fei (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17461157#comment-17461157
 ] 

Gao Fei edited comment on FLINK-25321 at 12/17/21, 1:52 AM:


[~wangyang0918] 

I tried to adjust the overhead ratio from the default 0.1 to 0.3. It really 
works well, but how do I judge how much memory is needed?
The overhead memory mainly contains those pieces of memory, and there is no 
specific description in the document. I use Native Memory Tracking to track the 
off-heap memory, is it overhead = thread+code+gc+compiler+internal+symbol? Do 
you want to set at least 1.5GB here?
There is another problem. I set the total memory of the TM process, but the 
actual process RSS memory will always exceed this value. Does it mean that the 
overhead memory is only a relative value and cannot be absolutely restricted?

(taskmanager.memory.jvm-overhead.fraction: 0.3)
INFO  [] - Final TaskExecutor Memory configuration:
INFO  [] -   Total Process Memory:          3.000gb (3221225472 bytes)
INFO  [] -     Total Flink Memory:          1.850gb (1986422336 bytes)
INFO  [] -       Total JVM Heap Memory:     1.540gb (1653562372 bytes)
INFO  [] -         Framework:               128.000mb (134217728 bytes)
INFO  [] -         Task:                    1.415gb (1519344644 bytes)
INFO  [] -       Total Off-heap Memory:     317.440mb (332859964 bytes)
INFO  [] -         Managed:                 0 bytes
INFO  [] -         Total JVM Direct Memory: 317.440mb (332859964 bytes)
INFO  [] -           Framework:             128.000mb (134217728 bytes)
INFO  [] -           Task:                  0 bytes
INFO  [] -           Network:               189.440mb (198642236 bytes)
INFO  [] -     JVM Metaspace:               256.000mb (268435456 bytes)
INFO  [] -     JVM Overhead:                921.600mb (966367680 bytes)

 

Native Memory Tracking:

Total: reserved=4211MB +32MB, committed=2992MB +517MB

-                 Java Heap (reserved=1578MB, committed=1578MB +464MB)
                            (mmap: reserved=1578MB, committed=1578MB +464MB)
 
-                     Class (reserved=1103MB +2MB, committed=89MB +1MB)
                            (classes #14013 -213)
                            (malloc=3MB #20610 +1596)
                            (mmap: reserved=1100MB +2MB, committed=87MB +1MB)
 
-                    Thread (reserved=854MB +1MB, committed=854MB +1MB)
                            (thread #848 +1)
                            (stack: reserved=850MB +1MB, committed=850MB +1MB)
                            (malloc=3MB #5077 +6)
                            (arena=1MB #1692 +2)
 
-                      Code (reserved=252MB +1MB, committed=49MB +6MB)
                            (malloc=8MB +1MB #15043 +1500)
                            (mmap: reserved=244MB, committed=41MB +5MB)
 
-                        GC (reserved=121MB +15MB, committed=121MB +32MB)
                            (malloc=31MB +15MB #44400 +9384)
                            (mmap: reserved=91MB, committed=91MB +17MB)
 
-                  Compiler (reserved=3MB, committed=3MB)
                            (malloc=3MB #4000 +134)
 
-                  Internal (reserved=262MB +3MB, committed=262MB +3MB)
                            (malloc=262MB +3MB #51098 +2499)
 
-                    Symbol (reserved=20MB, committed=20MB)
                            (malloc=18MB #160625 -83)
                            (arena=2MB #1)
 
-    Native Memory Tracking (reserved=5MB, committed=5MB)
                            (tracking overhead=5MB)
 
-               Arena Chunk (reserved=11MB +10MB, committed=11MB +10MB)
                            (malloc=11MB +10MB)
 
-                   Unknown (reserved=3MB, committed=0MB)
                            (mmap: reserved=3MB, committed=0MB)


was (Author: jackin853):
I tried to adjust the overhead ratio from the default 0.1 to 0.3. It really 
works well, but how do I judge how much memory is needed?
The overhead memory mainly contains those pieces of memory, and there is no 
specific description in the document. I use Native Memory Tracking to track the 
off-heap memory, is it overhead = thread+code+gc+compiler+internal+symbol? Do 
you want to set at least 1.5GB here?
There is another problem. I set the total memory of the TM process, but the 
actual process RSS memory will always exceed this value. Does it mean that the 
overhead memory is only a relative value and cannot be absolutely restricted?

(taskmanager.memory.jvm-overhead.fraction: 0.3)
INFO  [] - Final TaskExecutor Memory configuration:
INFO  [] -   Total Process Memory:          3.000gb (3221225472 bytes)
INFO  [] -     Total Flink Memory:          1.850gb (1986422336 bytes)
INFO  [] -       Total JVM Heap Memory:     1.540gb (1653562372 bytes)
INFO  [] -         Framework:               128.000mb (134217728 bytes)
INFO  [] -         Task:

1 2 3 4 5 6 7 >

1 - 100 of 638 matches

Mail list logo