[jira] [Commented] (FLINK-10819) The instability problem of CI, JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure test fail.

2019-01-30 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756979#comment-16756979
 ] 

Till Rohrmann commented on FLINK-10819:
---

Another instance: https://api.travis-ci.org/v3/job/486563876/log.txt

> The instability problem of CI, 
> JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure test 
> fail.
> ---
>
> Key: FLINK-10819
> URL: https://issues.apache.org/jira/browse/FLINK-10819
> Project: Flink
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 1.7.0
>Reporter: sunjincheng
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.8.0
>
>
> Found the following error in the process of CI:
> Results :
> Tests in error: 
>  JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure:331 » 
> IllegalArgument
> Tests run: 1463, Failures: 0, Errors: 1, Skipped: 29
> 18:40:55.828 [INFO] 
> 
> 18:40:55.829 [INFO] BUILD FAILURE
> 18:40:55.829 [INFO] 
> 
> 18:40:55.830 [INFO] Total time: 30:19 min
> 18:40:55.830 [INFO] Finished at: 2018-11-07T18:40:55+00:00
> 18:40:56.294 [INFO] Final Memory: 92M/678M
> 18:40:56.294 [INFO] 
> 
> 18:40:56.294 [WARNING] The requested profile "include-kinesis" could not be 
> activated because it does not exist.
> 18:40:56.295 [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (integration-tests) on project flink-tests_2.11: There are test failures.
> 18:40:56.295 [ERROR] 
> 18:40:56.295 [ERROR] Please refer to 
> /home/travis/build/sunjincheng121/flink/flink-tests/target/surefire-reports 
> for the individual test results.
> 18:40:56.295 [ERROR] -> [Help 1]
> 18:40:56.295 [ERROR] 
> 18:40:56.295 [ERROR] To see the full stack trace of the errors, re-run Maven 
> with the -e switch.
> 18:40:56.295 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 18:40:56.295 [ERROR] 
> 18:40:56.295 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 18:40:56.295 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> MVN exited with EXIT CODE: 1.
> Trying to KILL watchdog (11329).
> ./tools/travis_mvn_watchdog.sh: line 269: 11329 Terminated watchdog
> PRODUCED build artifacts.
> But after the rerun, the error disappeared. 
> No specific cause has been found yet; we will continue to monitor this test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] tillrohrmann commented on issue #7606: [FLINK-10774] Rework lifecycle management of partitionDiscoverer in FlinkKafkaConsumerBase

2019-01-30 Thread GitBox
tillrohrmann commented on issue #7606: [FLINK-10774] Rework lifecycle 
management of partitionDiscoverer in FlinkKafkaConsumerBase
URL: https://github.com/apache/flink/pull/7606#issuecomment-459245503
 
 
   Thanks for the review @tzulitai. Merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (FLINK-11370) Check and port ZooKeeperLeaderElectionITCase to new code base if necessary

2019-01-30 Thread lining (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756943#comment-16756943
 ] 

lining edited comment on FLINK-11370 at 1/31/19 7:08 AM:
-

[~till.rohrmann], I have found a bug in 

LeaderChangeClusterComponentsTest in the cluster initialization. Since I was 
updating this test anyway, I fixed it in this PR. Please review it as well. 


was (Author: lining):
[~till.rohrmann], I have found some bugs in 

LeaderChangeClusterComponentsTest in the cluster initialization. Since I was 
updating this test anyway, I fixed it in this PR. Please review it as well. 

> Check and port ZooKeeperLeaderElectionITCase to new code base if necessary
> --
>
> Key: FLINK-11370
> URL: https://issues.apache.org/jira/browse/FLINK-11370
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: lining
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Check and port {{ZooKeeperLeaderElectionITCase}} to new code base if 
> necessary.





[GitHub] jinglining commented on a change in pull request #7613: [FLINK-11370][test]Check and port ZooKeeperLeaderElectionITCase to ne…

2019-01-30 Thread GitBox
jinglining commented on a change in pull request #7613: 
[FLINK-11370][test]Check and port ZooKeeperLeaderElectionITCase to ne…
URL: https://github.com/apache/flink/pull/7613#discussion_r252554942
 
 

 ##
 File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/leaderelection/LeaderChangeClusterComponentsTest.java
 ##
 @@ -68,8 +69,8 @@ public static void setupClass() throws Exception  {
 
miniCluster = new TestingMiniCluster(
new MiniClusterConfiguration.Builder()
+   .setNumTaskManagers(NUM_TMS)
.setNumSlotsPerTaskManager(SLOTS_PER_TM)
-   .setNumSlotsPerTaskManager(NUM_TMS)
 
 Review comment:
   @tillrohrmann, this is the bug. I've fixed it.
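The diff above shows the bug: a copy-pasted builder call meant `setNumSlotsPerTaskManager` was invoked twice while `setNumTaskManagers` was never invoked at all, so the task-manager count silently stayed at its default. A minimal self-contained sketch (with a hypothetical `ClusterConfig` builder, not Flink's actual `MiniClusterConfiguration` class) shows how such a bug hides:

```java
// Hypothetical stand-in for a cluster configuration builder,
// NOT Flink's MiniClusterConfiguration. Illustrates how a duplicated
// setter call leaves another field at its default without any error.
public class ClusterConfig {
    final int numTaskManagers;
    final int numSlotsPerTaskManager;

    ClusterConfig(int tms, int slots) {
        this.numTaskManagers = tms;
        this.numSlotsPerTaskManager = slots;
    }

    static class Builder {
        private int numTaskManagers = 1;        // default
        private int numSlotsPerTaskManager = 1; // default

        Builder setNumTaskManagers(int n) { this.numTaskManagers = n; return this; }
        Builder setNumSlotsPerTaskManager(int n) { this.numSlotsPerTaskManager = n; return this; }
        ClusterConfig build() { return new ClusterConfig(numTaskManagers, numSlotsPerTaskManager); }
    }

    // Buggy variant: mirrors the original code, which called
    // setNumSlotsPerTaskManager twice and setNumTaskManagers never.
    public static ClusterConfig buggy(int numTms, int slotsPerTm) {
        return new Builder()
                .setNumSlotsPerTaskManager(slotsPerTm)
                .setNumSlotsPerTaskManager(numTms)  // overwrites slots; TM count stays 1
                .build();
    }

    // Fixed variant: mirrors the change in the PR.
    public static ClusterConfig fixed(int numTms, int slotsPerTm) {
        return new Builder()
                .setNumTaskManagers(numTms)
                .setNumSlotsPerTaskManager(slotsPerTm)
                .build();
    }

    public static void main(String[] args) {
        System.out.println("buggy TMs=" + buggy(2, 4).numTaskManagers);  // still the default 1
        System.out.println("fixed TMs=" + fixed(2, 4).numTaskManagers);  // 2
    }
}
```

Because both setters return the builder, the duplicated call compiles and runs cleanly, which is why the mistake only surfaced when a test needed more than one task manager.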




[jira] [Commented] (FLINK-11370) Check and port ZooKeeperLeaderElectionITCase to new code base if necessary

2019-01-30 Thread lining (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756943#comment-16756943
 ] 

lining commented on FLINK-11370:


[~till.rohrmann], I have found some bugs in 

LeaderChangeClusterComponentsTest in the cluster initialization. Since I was 
updating this test anyway, I fixed it in this PR. Please review it as well. 

> Check and port ZooKeeperLeaderElectionITCase to new code base if necessary
> --
>
> Key: FLINK-11370
> URL: https://issues.apache.org/jira/browse/FLINK-11370
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: lining
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Check and port {{ZooKeeperLeaderElectionITCase}} to new code base if 
> necessary.





[jira] [Commented] (FLINK-11370) Check and port ZooKeeperLeaderElectionITCase to new code base if necessary

2019-01-30 Thread lining (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756941#comment-16756941
 ] 

lining commented on FLINK-11370:


Hi [~Tison], I have pushed the code; you can review it now. Thanks for your reply.

> Check and port ZooKeeperLeaderElectionITCase to new code base if necessary
> --
>
> Key: FLINK-11370
> URL: https://issues.apache.org/jira/browse/FLINK-11370
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: lining
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Check and port {{ZooKeeperLeaderElectionITCase}} to new code base if 
> necessary.





[GitHub] jinglining opened a new pull request #7613: [FLINK-11370][test]Check and port ZooKeeperLeaderElectionITCase to ne…

2019-01-30 Thread GitBox
jinglining opened a new pull request #7613: [FLINK-11370][test]Check and port 
ZooKeeperLeaderElectionITCase to ne…
URL: https://github.com/apache/flink/pull/7613
 
 
   
   
   ## What is the purpose of the change
   This pull request checks and ports ZooKeeperLeaderElectionITCase to the new 
code base.
   
   
   ## Brief change log
   
   - delete ZooKeeperLeaderElectionITCase; its 
testJobExecutionOnClusterWithLeaderReelection is already covered by 
LeaderChangeClusterComponentsTest
   - add testTaskManagerRegisterReelectionOfResourceManager to 
LeaderChangeClusterComponentsTest, replacing 
testTaskManagerRegistrationAtReelectedLeader.
   
   
   ## Verifying this change
   
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   




[jira] [Updated] (FLINK-11370) Check and port ZooKeeperLeaderElectionITCase to new code base if necessary

2019-01-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11370:
---
Labels: pull-request-available  (was: )

> Check and port ZooKeeperLeaderElectionITCase to new code base if necessary
> --
>
> Key: FLINK-11370
> URL: https://issues.apache.org/jira/browse/FLINK-11370
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: lining
>Priority: Major
>  Labels: pull-request-available
>
> Check and port {{ZooKeeperLeaderElectionITCase}} to new code base if 
> necessary.





[GitHub] klion26 commented on issue #7612: [FLINK-11483][tests] Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized

2019-01-30 Thread GitBox
klion26 commented on issue #7612: [FLINK-11483][tests] Improve 
StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
URL: https://github.com/apache/flink/pull/7612#issuecomment-459236367
 
 
   @tzulitai thank you for your quick review. I think you may have pinged the 
wrong person in the last comment.




[jira] [Closed] (FLINK-11322) Use try-with-resource for FlinkKafkaConsumer010

2019-01-30 Thread Tzu-Li (Gordon) Tai (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai closed FLINK-11322.
---
Resolution: Fixed

Merged for 1.8.0:
64cbf2a20178cf8b4f643ba510653881a3e5ce2f
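For context, the try-with-resources pattern this ticket introduces closes any `AutoCloseable` declared in the try header automatically, whether the block completes normally or throws. A minimal sketch with a hypothetical consumer class (not the actual FlinkKafkaConsumer010 API):

```java
public class TryWithResourcesDemo {
    // Hypothetical consumer; only illustrates the AutoCloseable contract,
    // NOT the actual FlinkKafkaConsumer010 API.
    static class DemoConsumer implements AutoCloseable {
        private final StringBuilder log;
        DemoConsumer(StringBuilder log) { this.log = log; }
        void poll() { log.append("poll;"); }
        @Override public void close() { log.append("close;"); }
    }

    public static String run() {
        StringBuilder log = new StringBuilder();
        // The consumer is closed automatically when the try block exits,
        // with no explicit finally clause needed.
        try (DemoConsumer consumer = new DemoConsumer(log)) {
            consumer.poll();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // poll;close;
    }
}
```

Compared with a manual `finally { consumer.close(); }`, this removes the risk of forgetting to close the consumer on an early exit or exception path.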

> Use try-with-resource for FlinkKafkaConsumer010
> ---
>
> Key: FLINK-11322
> URL: https://issues.apache.org/jira/browse/FLINK-11322
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.7.1
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Created] (FLINK-11485) Migrate PojoSerializer to use new serialization compatibility abstractions

2019-01-30 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-11485:
---

 Summary: Migrate PojoSerializer to use new serialization 
compatibility abstractions
 Key: FLINK-11485
 URL: https://issues.apache.org/jira/browse/FLINK-11485
 Project: Flink
  Issue Type: Sub-task
  Components: Type Serialization System
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: 1.8.0


This subtask covers migration of the {{PojoSerializer}}.
Pojo schema evolution is out of scope here. This ticket only covers migrating 
the {{PojoSerializer}} so that it is equivalent in behaviour and features to the 
current master (1.8-SNAPSHOT).






[GitHub] klion26 commented on a change in pull request #7612: [FLINK-11483][tests] Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized

2019-01-30 Thread GitBox
klion26 commented on a change in pull request #7612: [FLINK-11483][tests] 
Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
URL: https://github.com/apache/flink/pull/7612#discussion_r252551404
 
 

 ##
 File path: 
flink-tests/src/test/java/org/apache/flink/test/state/operator/restore/StreamOperatorSnapshotRestoreTest.java
 ##
 @@ -80,6 +94,18 @@
 
protected static TemporaryFolder temporaryFolder;
 
+   @Parameterized.Parameter
+   public StateBackendEnum stateBackendEnum;
+
+   enum StateBackendEnum {
 
 Review comment:
   Here we use the `Enum` to iterate over the three state backends; without the 
`Enum`, we would have to return the StateBackend instances in `parameter()` 
directly. Please correct me if I'm wrong.
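The reviewer's point, that an enum lets the parameterized test both enumerate the backends and construct the matching instance, can be sketched without JUnit. `StateBackend` and the enum below are hypothetical stand-ins, not Flink's classes:

```java
import java.util.ArrayList;
import java.util.List;

public class BackendParams {
    // Hypothetical stand-in for a state backend; NOT Flink's StateBackend interface.
    interface StateBackend { String name(); }

    // Each enum constant knows how to build its backend, so a parameterized
    // runner can iterate values() as the parameter set while the test body
    // calls create() to obtain the concrete instance.
    enum StateBackendEnum {
        FILE { StateBackend create() { return () -> "file"; } },
        ROCKSDB_FULL { StateBackend create() { return () -> "rocksdb-full"; } },
        ROCKSDB_INCREMENTAL { StateBackend create() { return () -> "rocksdb-incremental"; } };

        abstract StateBackend create();
    }

    // Stand-in for running the same test once per backend, as JUnit's
    // Parameterized runner would.
    public static List<String> runAllBackends() {
        List<String> results = new ArrayList<>();
        for (StateBackendEnum b : StateBackendEnum.values()) {
            results.add(b.create().name()); // a real test would snapshot/restore here
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runAllBackends());
    }
}
```

The alternative the comment mentions, returning backend instances directly from `parameter()`, would work too, but the enum keeps the parameter list readable in test names and defers backend construction until the test actually runs.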




[jira] [Closed] (FLINK-11461) Remove useless MockRecordWriter

2019-01-30 Thread Tzu-Li (Gordon) Tai (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai closed FLINK-11461.
---
Resolution: Fixed

Merged for 1.8.0: 0e94ecfc7b65021905087b7efa94e3e04f761baf

> Remove useless MockRecordWriter
> ---
>
> Key: FLINK-11461
> URL: https://issues.apache.org/jira/browse/FLINK-11461
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{MockRecordWriter}} is not used in the current code path, so it is 
> deleted directly.





[GitHub] hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] Implement proctime DataStream to Table upsert conversion

2019-01-30 Thread GitBox
hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] 
Implement proctime DataStream to Table upsert conversion
URL: https://github.com/apache/flink/pull/6787#discussion_r252550288
 
 

 ##
 File path: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/logical/FlinkLogicalUpsertToRetraction.scala
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.logical
+
+import java.util.{List => JList}
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.{RelNode, SingleRel}
+import org.apache.calcite.rel.convert.ConverterRule
+import org.apache.calcite.rel.metadata.RelMetadataQuery
+import org.apache.flink.table.plan.logical.rel.LogicalUpsertToRetraction
+import org.apache.flink.table.plan.nodes.FlinkConventions
+
+class FlinkLogicalUpsertToRetraction(
 
 Review comment:
   Thanks for your reply. I guess we are on the same page now? 




[GitHub] asfgit closed pull request #7611: [FLINK-11460][test] Remove the useless class AcknowledgeStreamMockEnvironment

2019-01-30 Thread GitBox
asfgit closed pull request #7611: [FLINK-11460][test] Remove the useless class 
AcknowledgeStreamMockEnvironment
URL: https://github.com/apache/flink/pull/7611
 
 
   




[GitHub] asfgit closed pull request #7610: [FLINK-11461][test] Remove the useless MockRecordReader class

2019-01-30 Thread GitBox
asfgit closed pull request #7610: [FLINK-11461][test] Remove the useless 
MockRecordReader class
URL: https://github.com/apache/flink/pull/7610
 
 
   




[jira] [Closed] (FLINK-11460) Remove useless AcknowledgeStreamMockEnvironment

2019-01-30 Thread Tzu-Li (Gordon) Tai (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai closed FLINK-11460.
---
Resolution: Fixed

Merged for 1.8.0: 52c2b22bd047912c690edd1d770d567250f6d710

> Remove useless AcknowledgeStreamMockEnvironment
> ---
>
> Key: FLINK-11460
> URL: https://issues.apache.org/jira/browse/FLINK-11460
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This class is no longer used in the code path, so it is deleted directly.





[GitHub] tzulitai commented on issue #7610: [FLINK-11461][test] Remove the useless MockRecordReader class

2019-01-30 Thread GitBox
tzulitai commented on issue #7610: [FLINK-11461][test] Remove the useless 
MockRecordReader class
URL: https://github.com/apache/flink/pull/7610#issuecomment-459232224
 
 
   LGTM, +1
   Merging ..




[GitHub] tzulitai commented on issue #7611: [FLINK-11460][test] Remove the useless class AcknowledgeStreamMockEnvironment

2019-01-30 Thread GitBox
tzulitai commented on issue #7611: [FLINK-11460][test] Remove the useless class 
AcknowledgeStreamMockEnvironment
URL: https://github.com/apache/flink/pull/7611#issuecomment-459232032
 
 
   LGTM, +1.
   Merging this ..




[GitHub] tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized

2019-01-30 Thread GitBox
tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] 
Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
URL: https://github.com/apache/flink/pull/7612#discussion_r252544276
 
 

 ##
 File path: 
flink-tests/src/test/java/org/apache/flink/test/state/operator/restore/StreamOperatorSnapshotRestoreTest.java
 ##
 @@ -26,6 +26,7 @@
 import org.apache.flink.api.common.typeinfo.TypeInformation;
 import org.apache.flink.api.common.typeutils.base.IntSerializer;
 import org.apache.flink.api.java.functions.KeySelector;
+import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
 
 Review comment:
   ah I see, we need to move the test to a different module because of the 
dependency on the RocksDB backend classes




[GitHub] tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized

2019-01-30 Thread GitBox
tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] 
Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
URL: https://github.com/apache/flink/pull/7612#discussion_r252544131
 
 

 ##
 File path: 
flink-tests/src/test/java/org/apache/flink/test/state/operator/restore/StreamOperatorSnapshotRestoreTest.java
 ##
 @@ -26,6 +26,7 @@
 import org.apache.flink.api.common.typeinfo.TypeInformation;
 import org.apache.flink.api.common.typeutils.base.IntSerializer;
 import org.apache.flink.api.java.functions.KeySelector;
+import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
 
 Review comment:
   ah I see, we need to move the test to a different module because of the 
dependency on the RocksDB backend classes




[GitHub] tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized

2019-01-30 Thread GitBox
tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] 
Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
URL: https://github.com/apache/flink/pull/7612#discussion_r252544244
 
 

 ##
 File path: 
flink-tests/src/test/java/org/apache/flink/test/state/operator/restore/StreamOperatorSnapshotRestoreTest.java
 ##
 @@ -80,6 +94,18 @@
 
protected static TemporaryFolder temporaryFolder;
 
+   @Parameterized.Parameter
+   public StateBackendEnum stateBackendEnum;
+
+   enum StateBackendEnum {
 
 Review comment:
   I don't think we need `Enum` in the name; it seems redundant.




[GitHub] tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized

2019-01-30 Thread GitBox
tzulitai commented on a change in pull request #7612: [FLINK-11483][tests] 
Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
URL: https://github.com/apache/flink/pull/7612#discussion_r252543815
 
 

 ##
 File path: 
flink-tests/src/test/java/org/apache/flink/test/state/operator/restore/StreamOperatorSnapshotRestoreTest.java
 ##
 @@ -16,7 +16,7 @@
  * limitations under the License.
  */
 
-package org.apache.flink.streaming.api.operators;
+package org.apache.flink.test.state.operator.restore;
 
 Review comment:
   Is this necessary?
   I think the test was placed here so that it is in the same namespace as 
`StreamOperator`.




[jira] [Commented] (FLINK-11471) Job is reported RUNNING prematurely for queued scheduling

2019-01-30 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756910#comment-16756910
 ] 

vinoyang commented on FLINK-11471:
--

I am not sure whether it is feasible to introduce another JobStatus; I just 
wanted to put forward my thoughts. I am also not sure whether the maintainers 
feel this should be improved. Maybe we can ask for their opinions? 
[~StephanEwen] [~aljoscha] [~till.rohrmann]

> Job is reported RUNNING prematurely for queued scheduling
> -
>
> Key: FLINK-11471
> URL: https://issues.apache.org/jira/browse/FLINK-11471
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, Scheduler
>Affects Versions: 1.7.1
>Reporter: Nico Kruber
>Assignee: vinoyang
>Priority: Major
>
> With queued scheduling enabled (seems to be the default now), the job's 
> status is changed to RUNNING before all tasks are actually running.
> Although {{JobStatus#RUNNING}} states
> {quote}Some tasks are scheduled or running, some may be pending, some may be 
> finished.{quote}
> you may argue whether this is the right thing to report, e.g. in the REST 
> interface, when a user wants to react on the actual state change from 
> SCHEDULED to RUNNING. It seems, some intermediate state is missing here which 
> would clarify things.
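One way to read the "missing intermediate state" remark above is as a status machine in which the job only reports RUNNING once at least one task is actually running. The sketch below is purely speculative; the status names and derivation logic are hypothetical, not Flink's actual JobStatus enum:

```java
import java.util.List;

public class JobStatusSketch {
    // Hypothetical statuses; DEPLOYING is the speculative intermediate
    // state discussed in the ticket, NOT part of Flink's actual enum.
    enum Status { SCHEDULED, DEPLOYING, RUNNING }
    enum TaskState { SCHEDULED, DEPLOYING, RUNNING }

    // Derive the job status from its tasks: report RUNNING only once
    // at least one task actually runs; report DEPLOYING while tasks are
    // being brought up; otherwise the job is still SCHEDULED.
    static Status jobStatus(List<TaskState> tasks) {
        if (tasks.stream().anyMatch(t -> t == TaskState.RUNNING)) {
            return Status.RUNNING;
        }
        if (tasks.stream().anyMatch(t -> t == TaskState.DEPLOYING)) {
            return Status.DEPLOYING;
        }
        return Status.SCHEDULED;
    }

    public static void main(String[] args) {
        System.out.println(jobStatus(List.of(TaskState.SCHEDULED, TaskState.DEPLOYING))); // DEPLOYING
        System.out.println(jobStatus(List.of(TaskState.RUNNING, TaskState.SCHEDULED)));   // RUNNING
    }
}
```

Under such a scheme, a REST client polling for the SCHEDULED-to-RUNNING transition would no longer observe RUNNING before any task has started.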





[GitHub] tzulitai commented on issue #7206: [FLINK-11042][kafka, tests] Drop invalid kafka producer tests

2019-01-30 Thread GitBox
tzulitai commented on issue #7206: [FLINK-11042][kafka,tests] Drop invalid 
kafka producer tests
URL: https://github.com/apache/flink/pull/7206#issuecomment-459229838
 
 
   Ok. Since we've already been removing tests that rely on shutting down Kafka 
brokers (which were mostly the ones causing a lot of the instabilities), I 
don't feel strongly against removing this one either.
   
   +1 from my side.




[GitHub] casidiablo commented on issue #2671: [FLINK-4862] fix Timer register in ContinuousEventTimeTrigger

2019-01-30 Thread GitBox
casidiablo commented on issue #2671: [FLINK-4862] fix Timer register in 
ContinuousEventTimeTrigger
URL: https://github.com/apache/flink/pull/2671#issuecomment-459222191
 
 
   I don't have enough time to fix this. I documented what I found here 
https://issues.apache.org/jira/browse/FLINK-11408 though 




[jira] [Commented] (FLINK-11471) Job is reported RUNNING prematurely for queued scheduling

2019-01-30 Thread Zhu Zhu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756836#comment-16756836
 ] 

Zhu Zhu commented on FLINK-11471:
-

I see. So in the future, a job is "RUNNING" if and only if at least one task is 
"RUNNING". Looks good.

Will there be a new JobStatus indicating that the job is trying to schedule 
tasks but no task is RUNNING yet?

> Job is reported RUNNING prematurely for queued scheduling
> -
>
> Key: FLINK-11471
> URL: https://issues.apache.org/jira/browse/FLINK-11471
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, Scheduler
>Affects Versions: 1.7.1
>Reporter: Nico Kruber
>Assignee: vinoyang
>Priority: Major
>
> With queued scheduling enabled (seems to be the default now), the job's 
> status is changed to RUNNING before all tasks are actually running.
> Although {{JobStatus#RUNNING}} states
> {quote}Some tasks are scheduled or running, some may be pending, some may be 
> finished.{quote}
> you may argue whether this is the right thing to report, e.g. in the REST 
> interface, when a user wants to react on the actual state change from 
> SCHEDULED to RUNNING. It seems, some intermediate state is missing here which 
> would clarify things.





[GitHub] hequn8128 commented on a change in pull request #7587: [FLINK-11064] [table] Setup a new flink-table module structure

2019-01-30 Thread GitBox
hequn8128 commented on a change in pull request #7587: [FLINK-11064] [table] 
Setup a new flink-table module structure
URL: https://github.com/apache/flink/pull/7587#discussion_r252527514
 
 

 ##
 File path: flink-table/flink-table-planner/pom.xml
 ##
 @@ -22,13 +22,18 @@ under the License.
 

org.apache.flink
-   flink-libraries
+   flink-table
1.8-SNAPSHOT
..

 
-   flink-table_${scala.binary.version}
-   flink-table
+   flink-table-planner_${scala.binary.version}
+   flink-table-planner
+   
+   This module bridges Table/SQL API and runtime. It contains
+   all resources that are required during pre-flight and runtime
+   phase.
+   
 
 Review comment:
   Make sense. Thanks for the explanation.




[jira] [Updated] (FLINK-11483) Improve StreamOperatorSnapshotRestoreTest with Parameterized

2019-01-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11483:
---
Labels: pull-request-available  (was: )

> Improve StreamOperatorSnapshotRestoreTest with Parameterized
> 
>
> Key: FLINK-11483
> URL: https://issues.apache.org/jira/browse/FLINK-11483
> Project: Flink
>  Issue Type: Test
>  Components: State Backends, Checkpointing, Tests
>Reporter: Congxian Qiu
>Assignee: Congxian Qiu
>Priority: Major
>  Labels: pull-request-available
>
> In the current implementation, we test {{StreamOperatorSnapshot}} with three 
> state backends: {{File}}, {{RocksDB_FULL}}, and {{RocksDB_Incremental}}, each 
> in a separate class. We could improve this with Parameterized.





[GitHub] klion26 opened a new pull request #7612: [FLINK-11483][tests] Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized

2019-01-30 Thread GitBox
klion26 opened a new pull request #7612: [FLINK-11483][tests] Improve 
StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
URL: https://github.com/apache/flink/pull/7612
 
 
   
   ## What is the purpose of the change
   
   Improve StreamOperatorSnapshotRestoreTest with JUnit's Parameterized
   
   ## Brief change log
   
   Currently we test StreamOperatorSnapshotRestore with three backends: `file`, `rocksdb full` and `rocksdb incremental`, each in a separate class.
   This patch improves that by using JUnit's Parameterized.
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as 
`StreamOperatorSnapshotRestoreTest`.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (**no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**no**)
 - The serializers: (**no**)
 - The runtime per-record code paths (performance sensitive): (**no**)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**no**)
 - The S3 file system connector: (**no**)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**no**)
 - If yes, how is the feature documented? (**not applicable**)
   




[jira] [Closed] (FLINK-10769) port InMemoryExternalCatalog to java

2019-01-30 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li closed FLINK-10769.

  Resolution: Invalid
Release Note: no need to port the existing InMemoryCatalog since we will 
develop a new in-memory catalog with completely new APIs from scratch

> port InMemoryExternalCatalog to java
> 
>
> Key: FLINK-10769
> URL: https://issues.apache.org/jira/browse/FLINK-10769
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the Flink-Hive integration design, we propose a new FlinkInMemoryCatalog 
> (FLINK-10697) for production use. FlinkInMemoryCatalog will share some parts 
> with the existing InMemoryExternalCatalog, so we need to make changes to 
> InMemoryExternalCatalog.
> Background: there are two parallel efforts going on right now - FLINK-10686, 
> driven by Timo, includes moving external catalog APIs from flink-table to 
> flink-table-common, also from Scala to Java; FLINK-10744, which I'm working 
> on right now, integrates Flink with Hive and enhances external catalog 
> functionality.
> As discussed with @twalthr in FLINK-10689, we'd better parallelize these 
> efforts while introducing minimal overhead for integrating them later. An 
> agreed way is to write new code/features related to external catalogs/Hive 
> in Java in flink-table. This will minimize the later effort of moving this 
> new code into flink-table-common. If existing classes are modified for a 
> feature, we can start by migrating them to Java in a separate commit first 
> and then perform the actual feature changes; migrated classes can be placed 
> in flink-table/src/main/java until we find a better module structure.
> Therefore, we will port InMemoryExternalCatalog to Java first. This PR only 
> ports Scala to Java with NO feature or behavior change. This is a 
> prerequisite for FLINK-10697





[jira] [Assigned] (FLINK-10026) Integrate heap based timers in RocksDB backend with incremental checkpoints

2019-01-30 Thread boshu Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

boshu Zheng reassigned FLINK-10026:
---

Assignee: boshu Zheng

> Integrate heap based timers in RocksDB backend with incremental checkpoints
> ---
>
> Key: FLINK-10026
> URL: https://issues.apache.org/jira/browse/FLINK-10026
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Affects Versions: 1.6.0
>Reporter: Stefan Richter
>Assignee: boshu Zheng
>Priority: Major
> Fix For: 1.8.0
>
>
> Currently, heap based priority queue state does not participate in RocksDB 
> incremental checkpointing and still uses the legacy (synchronous) way of 
> checkpointing timers.
> We should integrate it with incremental checkpoints in such way that the 
> state is written non-incrementally into a file, and another key-grouped state 
> handle becomes part of the non-shared section of the incremental state handle.
> After that we can also remove the path for legacy checkpointing completely 
> and also remove the key-grouping from the de-duplication hash map in the 
> {{HeapPriorityQueueSet}}. All can go into one map.





[jira] [Commented] (FLINK-11471) Job is reported RUNNING prematurely for queued scheduling

2019-01-30 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756815#comment-16756815
 ] 

vinoyang commented on FLINK-11471:
--

Currently, the transition to {{JobStatus.RUNNING}} does not depend on the state 
of any Task. Its state machine is completely independent on the JM side (it does 
not even require any Task to be scheduled or executed). I think we can make it 
more accurate by basing it on the Tasks' status reports. For example, we could 
only change the JobStatus to RUNNING after receiving at least one Task report 
of "RUNNING".
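The suggestion above can be sketched as a tiny state tracker: the job stays in SCHEDULED until the first task actually reports RUNNING. This is a conceptual illustration only; the class, enum, and method names are hypothetical and not Flink's actual scheduler API:

```java
import java.util.HashMap;
import java.util.Map;

public class JobStatusTracker {
    enum Status { CREATED, SCHEDULED, RUNNING }

    private Status jobStatus = Status.CREATED;
    private final Map<String, Status> taskStatus = new HashMap<>();

    // Today Flink flips the job to RUNNING as soon as scheduling starts;
    // here we only move to SCHEDULED and wait for task reports.
    void schedule() {
        jobStatus = Status.SCHEDULED;
    }

    // The proposed refinement: promote the job to RUNNING only once the
    // first task actually reports RUNNING.
    void onTaskReport(String taskId, Status status) {
        taskStatus.put(taskId, status);
        if (jobStatus == Status.SCHEDULED && status == Status.RUNNING) {
            jobStatus = Status.RUNNING;
        }
    }

    Status jobStatus() {
        return jobStatus;
    }
}
```

Zhu Zhu's follow-up question still applies to this sketch: for a batch job where tasks run in waves, "first task RUNNING" may not be a satisfying definition either, so an intermediate status might be needed.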

> Job is reported RUNNING prematurely for queued scheduling
> -
>
> Key: FLINK-11471
> URL: https://issues.apache.org/jira/browse/FLINK-11471
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, Scheduler
>Affects Versions: 1.7.1
>Reporter: Nico Kruber
>Assignee: vinoyang
>Priority: Major
>
> With queued scheduling enabled (seems to be the default now), the job's 
> status is changed to RUNNING before all tasks are actually running.
> Although {{JobStatus#RUNNING}} states
> {quote}Some tasks are scheduled or running, some may be pending, some may be 
> finished.{quote}
> you may argue whether this is the right thing to report, e.g. in the REST 
> interface, when a user wants to react on the actual state change from 
> SCHEDULED to RUNNING. It seems, some intermediate state is missing here which 
> would clarify things.





[jira] [Updated] (FLINK-11461) Remove useless MockRecordWriter

2019-01-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11461:
---
Labels: pull-request-available  (was: )

> Remove useless MockRecordWriter
> ---
>
> Key: FLINK-11461
> URL: https://issues.apache.org/jira/browse/FLINK-11461
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> The {{MockRecordWriter}} is not used in the current code path, so delete it 
> directly.





[jira] [Updated] (FLINK-11460) Remove useless AcknowledgeStreamMockEnvironment

2019-01-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11460:
---
Labels: pull-request-available  (was: )

> Remove useless AcknowledgeStreamMockEnvironment
> ---
>
> Key: FLINK-11460
> URL: https://issues.apache.org/jira/browse/FLINK-11460
> Project: Flink
>  Issue Type: Task
>  Components: Tests
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> This class is not used any more in the code path, so delete it directly.





[GitHub] zhijiangW opened a new pull request #7611: [FLINK-11460][test] Remove the useless class AcknowledgeStreamMockEnvironment

2019-01-30 Thread GitBox
zhijiangW opened a new pull request #7611: [FLINK-11460][test] Remove the 
useless class AcknowledgeStreamMockEnvironment
URL: https://github.com/apache/flink/pull/7611
 
 
   ## What is the purpose of the change
   
   *The class `AcknowledgeStreamMockEnvironment` is not used any more in the 
current code path, so delete it to keep the code clean.*
   
   ## Brief change log
   
 - *Delete the class `AcknowledgeStreamMockEnvironment` from code path*
   
   ## Verifying this change
   
   *This change is a code cleanup without any test coverage.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   




[GitHub] zhijiangW opened a new pull request #7610: [FLINK-11461][test] Remove the useless MockRecordReader class

2019-01-30 Thread GitBox
zhijiangW opened a new pull request #7610: [FLINK-11461][test] Remove the 
useless MockRecordReader class
URL: https://github.com/apache/flink/pull/7610
 
 
   ## What is the purpose of the change
   
   *The `MockRecordReader` class is not used any more in the current code path, 
so delete it to keep the code clean.*
   
   ## Brief change log
   
 - *Delete the class `MockRecordReader`*
   
   ## Verifying this change
   
   This change is a code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   




[GitHub] yanghua commented on a change in pull request #7571: [FLINK-10724] Refactor failure handling in check point coordinator

2019-01-30 Thread GitBox
yanghua commented on a change in pull request #7571: [FLINK-10724] Refactor 
failure handling in check point coordinator
URL: https://github.com/apache/flink/pull/7571#discussion_r252518530
 
 

 ##
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
 ##
 @@ -1277,6 +1293,34 @@ private void discardCheckpoint(PendingCheckpoint pendingCheckpoint, @Nullable Th
 		}
 	}
 
+	private void tryHandleCheckpointException(DeclineCheckpoint message, PendingCheckpoint currentCheckpoint) {
+		boolean needFailJob = message.getReason() != null && !(message.getReason() instanceof CheckpointDeclineException)
+			&& failOnCheckpointingErrors && !currentCheckpoint.getProps().isSavepoint();
+
+		if (needFailJob) {
+			boolean processed = false;
+			ExecutionAttemptID failedExecutionID = message.getTaskExecutionId();
+			for (ExecutionVertex vertex : tasksToWaitFor) {
+				if (vertex.getCurrentExecutionAttempt().getAttemptId().equals(failedExecutionID)) {
+					if (currentPeriodicTrigger != null) {
+						currentPeriodicTrigger.cancel(true);
+						currentPeriodicTrigger = null;
+					}
+
+					vertex.fail(message.getReason());
 
 Review comment:
   @aljoscha Can you help determine if this operation will have side effects? 
cc @azagrebin 




[jira] [Commented] (FLINK-11440) ChainLengthIncreaseTest failed on Travis

2019-01-30 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756800#comment-16756800
 ] 

vinoyang commented on FLINK-11440:
--

[~aljoscha] In fact, I am not sure whether this failure is due to my PR. 
Coincidentally, my PR refactors the checkpoint implementation, and the 
refactored logic contains an operation that triggers the ExecutionVertex#fail 
call in the CheckpointCoordinator. I will ping you on the related code in that 
PR; can you help determine whether this operation has side effects?

> ChainLengthIncreaseTest failed on Travis
> 
>
> Key: FLINK-11440
> URL: https://issues.apache.org/jira/browse/FLINK-11440
> Project: Flink
>  Issue Type: Test
>  Components: Tests
>Reporter: vinoyang
>Priority: Major
>  Labels: test-stability
>
>  
> {code:java}
> 12:55:46.212 [ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time 
> elapsed: 11.875 s <<< FAILURE! - in 
> org.apache.flink.test.state.operator.restore.unkeyed.ChainLengthIncreaseTest
> 12:55:46.214 [ERROR] testMigrationAndRestore[Migrate Savepoint: 
> 1.7](org.apache.flink.test.state.operator.restore.unkeyed.ChainLengthIncreaseTest)
>   Time elapsed: 0.109 s  <<< ERROR!
> java.util.concurrent.ExecutionException: 
> java.util.concurrent.CompletionException: java.lang.IllegalStateException: 
> Checkpoint executing was failureTask received cancellation from one of its 
> inputs.
> Caused by: java.util.concurrent.CompletionException: 
> java.lang.IllegalStateException: Checkpoint executing was failureTask 
> received cancellation from one of its inputs.
> Caused by: java.lang.IllegalStateException: Checkpoint executing was 
> failureTask received cancellation from one of its inputs.
> {code}
>  
> log details : https://api.travis-ci.org/v3/job/485352065/log.txt
>  
>  





[jira] [Commented] (FLINK-11471) Job is reported RUNNING prematurely for queued scheduling

2019-01-30 Thread Zhu Zhu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756792#comment-16756792
 ] 

Zhu Zhu commented on FLINK-11471:
-

Hi [~yanghua] and [~NicoK], do you have the solution for this issue already?

For a batch job, there can always be some tasks running, some tasks being 
scheduled, and some tasks pending scheduling. What would the job status be in 
this case?

> Job is reported RUNNING prematurely for queued scheduling
> -
>
> Key: FLINK-11471
> URL: https://issues.apache.org/jira/browse/FLINK-11471
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, Scheduler
>Affects Versions: 1.7.1
>Reporter: Nico Kruber
>Assignee: vinoyang
>Priority: Major
>
> With queued scheduling enabled (seems to be the default now), the job's 
> status is changed to RUNNING before all tasks are actually running.
> Although {{JobStatus#RUNNING}} states
> {quote}Some tasks are scheduled or running, some may be pending, some may be 
> finished.{quote}
> you may argue whether this is the right thing to report, e.g. in the REST 
> interface, when a user wants to react on the actual state change from 
> SCHEDULED to RUNNING. It seems, some intermediate state is missing here which 
> would clarify things.





[GitHub] tzulitai commented on issue #7606: [FLINK-10774] Rework lifecycle management of partitionDiscoverer in FlinkKafkaConsumerBase

2019-01-30 Thread GitBox
tzulitai commented on issue #7606: [FLINK-10774] Rework lifecycle management of 
partitionDiscoverer in FlinkKafkaConsumerBase
URL: https://github.com/apache/flink/pull/7606#issuecomment-459186967
 
 
   Since I've already reviewed this as part of the efforts in #7020, and Travis 
is green, LGTM +1  




[GitHub] tzulitai removed a comment on issue #7606: [FLINK-10774] Rework lifecycle management of partitionDiscoverer in FlinkKafkaConsumerBase

2019-01-30 Thread GitBox
tzulitai removed a comment on issue #7606: [FLINK-10774] Rework lifecycle 
management of partitionDiscoverer in FlinkKafkaConsumerBase
URL: https://github.com/apache/flink/pull/7606#issuecomment-459186967
 
 
   Since I've already reviewed this as part of the efforts in #7020, and Travis 
is green, LGTM +1  
   Thanks for fixing this @stevenzwu @tillrohrmann 




[GitHub] tzulitai edited a comment on issue #7606: [FLINK-10774] Rework lifecycle management of partitionDiscoverer in FlinkKafkaConsumerBase

2019-01-30 Thread GitBox
tzulitai edited a comment on issue #7606: [FLINK-10774] Rework lifecycle 
management of partitionDiscoverer in FlinkKafkaConsumerBase
URL: https://github.com/apache/flink/pull/7606#issuecomment-459186967
 
 
   Since I've already reviewed this as part of the efforts in #7020, and Travis 
is green, LGTM +1  
   Thanks for fixing this @stevenzwu @tillrohrmann 




[jira] [Created] (FLINK-11484) Blink java.util.concurrent.TimeoutException

2019-01-30 Thread pj (JIRA)
pj created FLINK-11484:
--

 Summary: Blink java.util.concurrent.TimeoutException
 Key: FLINK-11484
 URL: https://issues.apache.org/jira/browse/FLINK-11484
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.5.5
 Environment: The link of blink source code: 
[github.com/apache/flink/tree/blink|https://github.com/apache/flink/tree/blink]
Reporter: pj


*If I run a Blink application on YARN and the parallelism is larger than 1.*
*Following is the command:*

./flink run -m yarn-cluster -ynm FLINK_NG_ENGINE_1 -ys 4 -yn 10 -ytm 5120 -p 40 
-c XXMain ~/xx.jar

*Following is the code:*

{{DataStream<Row> outputStream = tableEnv.toAppendStream(curTable, Row.class);
outputStream.print();}}

*The whole application's subtasks hang for a long time, and finally the 
{{toAppendStream()}} job fails with an exception like the one below:*

{{org.apache.flink.client.program.ProgramInvocationException: Job failed. 
(JobID: f5e4f7243d06035202e8fa250c364304) at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:276)
 at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:482) 
at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.executeInternal(StreamContextEnvironment.java:85)
 at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.executeInternal(StreamContextEnvironment.java:37)
 at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1893)
 at com.ngengine.main.KafkaMergeMain.startApp(KafkaMergeMain.java:352) at 
com.ngengine.main.KafkaMergeMain.main(KafkaMergeMain.java:94) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:561)
 at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:445)
 at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:419) 
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:786) 
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280) at 
org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215) at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1029) 
at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1105) 
at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
 at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
 at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1105) Caused 
by: java.util.concurrent.TimeoutException at 
org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:834)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745)}}





[jira] [Assigned] (FLINK-11471) Job is reported RUNNING prematurely for queued scheduling

2019-01-30 Thread vinoyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned FLINK-11471:


Assignee: vinoyang

> Job is reported RUNNING prematurely for queued scheduling
> -
>
> Key: FLINK-11471
> URL: https://issues.apache.org/jira/browse/FLINK-11471
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, Scheduler
>Affects Versions: 1.7.1
>Reporter: Nico Kruber
>Assignee: vinoyang
>Priority: Major
>
> With queued scheduling enabled (seems to be the default now), the job's 
> status is changed to RUNNING before all tasks are actually running.
> Although {{JobStatus#RUNNING}} states
> {quote}Some tasks are scheduled or running, some may be pending, some may be 
> finished.{quote}
> you may argue whether this is the right thing to report, e.g. in the REST 
> interface, when a user wants to react on the actual state change from 
> SCHEDULED to RUNNING. It seems, some intermediate state is missing here which 
> would clarify things.





[GitHub] jinglining commented on issue #7552: [FLINK-11405][rest]can see task and task attempt exception by start e…

2019-01-30 Thread GitBox
jinglining commented on issue #7552: [FLINK-11405][rest]can see task and task 
attempt exception by start e…
URL: https://github.com/apache/flink/pull/7552#issuecomment-459183827
 
 
   @zentol can you review it?




[GitHub] themez opened a new pull request #7609: [FLINK-filesystems] [flink-s3-fs-hadoop] Add mirrored config key for s3 endpoint

2019-01-30 Thread GitBox
themez opened a new pull request #7609: [FLINK-filesystems] 
[flink-s3-fs-hadoop] Add mirrored config key for s3 endpoint
URL: https://github.com/apache/flink/pull/7609
 
 
   
   
   ## What is the purpose of the change
   
   Add an S3 endpoint configuration so Flink can connect to S3 in China 
regions, which have different endpoints.
   
   ## Brief change log
   
   Add `MIRRORED_CONFIG_KEYS` item for s3 endpoint
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (yes)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (docs)
   




[jira] [Created] (FLINK-11483) Improve StreamOperatorSnapshotRestoreTest with Parameterized

2019-01-30 Thread Congxian Qiu (JIRA)
Congxian Qiu created FLINK-11483:


 Summary: Improve StreamOperatorSnapshotRestoreTest with 
Parameterized
 Key: FLINK-11483
 URL: https://issues.apache.org/jira/browse/FLINK-11483
 Project: Flink
  Issue Type: Test
  Components: State Backends, Checkpointing, Tests
Reporter: Congxian Qiu
Assignee: Congxian Qiu


In the current implementation, we test {{StreamOperatorSnapshot}} with three 
state backends: {{File}}, {{RocksDB_FULL}}, {{RocksDB_Incremental}}, each in a 
separate class; we could improve this with Parameterized.





[jira] [Assigned] (FLINK-11470) LocalEnvironment doesn't call FileSystem.initialize()

2019-01-30 Thread vinoyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned FLINK-11470:


Assignee: vinoyang

> LocalEnvironment doesn't call FileSystem.initialize()
> -
>
> Key: FLINK-11470
> URL: https://issues.apache.org/jira/browse/FLINK-11470
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.6.2
>Reporter: Nico Kruber
>Assignee: vinoyang
>Priority: Major
>
> Proper Flink cluster components, e.g. task manager or job manager, initialize 
> configured file systems with their parsed {{Configuration}} objects. However, 
> the {{LocalEnvironment}} does not seem to do that, and we therefore lack the 
> ability to configure access credentials etc., as in the following example:
> {code}
> Configuration config = new Configuration();
> config.setString("s3.access-key", "user");
> config.setString("s3.secret-key", "secret");
> //FileSystem.initialize(config);
> final ExecutionEnvironment exEnv = 
> ExecutionEnvironment.createLocalEnvironment(config);
> {code}
> The workaround is to call {{FileSystem.initialize(config);}} yourself but it 
> is actually surprising that this is not done automatically.





[jira] [Assigned] (FLINK-11455) Support evictor operations on slicing and merging operators

2019-01-30 Thread vinoyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned FLINK-11455:


Assignee: vinoyang

> Support evictor operations on slicing and merging operators
> ---
>
> Key: FLINK-11455
> URL: https://issues.apache.org/jira/browse/FLINK-11455
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Rong Rong
>Assignee: vinoyang
>Priority: Major
>
> The original POC implementation of SliceStream and MergeStream does not 
> consider evicting window operations. This support can be further expanded to 
> cover session windows with multiple timeout durations. See 
> [https://docs.google.com/document/d/1ziVsuW_HQnvJr_4a9yKwx_LEnhVkdlde2Z5l6sx5HlY/edit#heading=h.ihxm3alf3tk0.]
>  





[GitHub] EronWright commented on issue #7390: [FLINK-11240] [table] External Catalog Factory and Descriptor

2019-01-30 Thread GitBox
EronWright commented on issue #7390: [FLINK-11240] [table] External Catalog 
Factory and Descriptor
URL: https://github.com/apache/flink/pull/7390#issuecomment-459178024
 
 
   @twalthr ping




[jira] [Commented] (FLINK-10744) Integrate Flink with Hive metastore

2019-01-30 Thread Eron Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756766#comment-16756766
 ] 

Eron Wright  commented on FLINK-10744:
--

Update on my end, all issues assigned to me have PRs open; awaiting review.

> Integrate Flink with Hive metastore 
> 
>
> Key: FLINK-10744
> URL: https://issues.apache.org/jira/browse/FLINK-10744
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.6.2
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>
> This JIRA keeps track of the effort of FLINK-10556 on Hive metastore 
> integration. It mainly covers two aspects:
> # Register Hive metastore as an external catalog of Flink, such that Hive 
> table metadata can be accessed directly.
> # Store Flink metadata (tables, views, UDFs, etc) in a catalog that utilizes 
> Hive as the schema registry.
> Discussions and resulting design doc will be shared here, but detailed work 
> items will be tracked by sub-tasks.





[GitHub] jinglining commented on issue #7554: [FLINK-11357][test]Check and port LeaderChangeJobRecoveryTest to new …

2019-01-30 Thread GitBox
jinglining commented on issue #7554: [FLINK-11357][test]Check and port 
LeaderChangeJobRecoveryTest to new …
URL: https://github.com/apache/flink/pull/7554#issuecomment-459178142
 
 
   Thanks.




[GitHub] EronWright commented on issue #7392: [FLINK-11241][table] - Enhance TableEnvironment to connect to a catalog via descriptor

2019-01-30 Thread GitBox
EronWright commented on issue #7392:  [FLINK-11241][table] - Enhance 
TableEnvironment to connect to a catalog via descriptor
URL: https://github.com/apache/flink/pull/7392#issuecomment-459178101
 
 
   @twalthr ping




[GitHub] EronWright commented on issue #7393: [FLINK-9172][table][sql-client] - Support external catalogs in SQL-Client

2019-01-30 Thread GitBox
EronWright commented on issue #7393: [FLINK-9172][table][sql-client] - Support 
external catalogs in SQL-Client
URL: https://github.com/apache/flink/pull/7393#issuecomment-459177902
 
 
   @twalthr ping




[GitHub] EronWright commented on issue #7389: [FLINK-11237] [table] Enhance LocalExecutor to wrap TableEnvironment w/ classloader

2019-01-30 Thread GitBox
EronWright commented on issue #7389: [FLINK-11237] [table] Enhance 
LocalExecutor to wrap TableEnvironment w/ classloader
URL: https://github.com/apache/flink/pull/7389#issuecomment-459177499
 
 
   @twalthr ping




[jira] [Assigned] (FLINK-11456) Improve window operator with sliding window assigners

2019-01-30 Thread vinoyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang reassigned FLINK-11456:


Assignee: vinoyang

> Improve window operator with sliding window assigners
> -
>
> Key: FLINK-11456
> URL: https://issues.apache.org/jira/browse/FLINK-11456
> Project: Flink
>  Issue Type: Sub-task
>  Components: DataStream API
>Reporter: Rong Rong
>Assignee: vinoyang
>Priority: Major
>
> With slicing and merging operators that expose the internals of window 
> operators, the current sliding window can be improved by eliminating duplicate 
> aggregations and duplicate element inserts into multiple panes (e.g. 
> namespaces).
> The following sliding window operation
> {code:java}
> val resultStream: DataStream = inputStream
>   .keyBy("key")
>   .window(SlidingEventTimeWindows.of(Time.seconds(15L), Time.seconds(5L)))
>   .sum("value")
> {code}
> can produce job graph equivalent to
> {code:java}
> val resultStream: DataStream = inputStream
>   .keyBy("key")
>   .sliceWindow(Time.seconds(5L))
>   .sum("value")
>   .slideOver(Count.of(3))
> {code}
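The pane-sharing idea behind the two snippets above can be illustrated outside of Flink's API. The following standalone Python sketch (all names are mine, not Flink's) pre-aggregates each non-overlapping 5-second slice exactly once and then combines three slices per sliding window, so no element is inserted into more than one pane:

```python
from collections import defaultdict

def sliding_sums(events, slice_size=5, slices_per_window=3):
    """Sliding-window sums (window = slice_size * slices_per_window,
    slide = slice_size) computed by pre-aggregating each non-overlapping
    slice (pane) once and then combining panes, instead of inserting
    every element into each of the overlapping windows it belongs to.

    events: iterable of (timestamp_seconds, value) pairs.
    Returns {window_end_seconds: sum}.
    """
    # Phase 1: each element is aggregated into exactly one pane.
    panes = defaultdict(int)
    for ts, value in events:
        panes[ts // slice_size] += value

    # Phase 2: each sliding window combines `slices_per_window` panes.
    results = {}
    for pane in panes:
        # This pane contributes to the windows ending at each of the
        # next `slices_per_window` slice boundaries.
        for offset in range(slices_per_window):
            last = pane + offset
            end = (last + 1) * slice_size
            results[end] = sum(
                panes.get(last - k, 0) for k in range(slices_per_window)
            )
    return results
```

With a 5-second slice and three slices per window this reproduces a 15-second window sliding by 5 seconds, while each element is aggregated once rather than three times.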
>  





[jira] [Issue Comment Deleted] (FLINK-10618) Introduce catalog for Flink tables

2019-01-30 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated FLINK-10618:

Comment: was deleted

(was: This task is superseded by FLINK-11482. )

> Introduce catalog for Flink tables
> --
>
> Key: FLINK-10618
> URL: https://issues.apache.org/jira/browse/FLINK-10618
> Project: Flink
>  Issue Type: Sub-task
>  Components: SQL Client
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
> Fix For: 1.8.0
>
>
> This JIRA covers the 2nd aspect of Flink-Hive metastore integration.
> Besides meta objects such as tables that may come from an 
> {{ExternalCatalog}}, Flink also deals with tables/views/functions that are 
> created on the fly (in memory), or specified in a configuration file. Those 
> objects don't belong to any {{ExternalCatalog}}, yet Flink either stores them 
> in memory, where they are non-persistent, or recreates them from a file, which 
> is a big pain for the user. Those objects are known only to Flink, and Flink 
> manages them poorly.
> Since they are typical objects in a database catalog, it's natural to have a 
> catalog that manages those objects. The interface will be similar to 
> {{ExternalCatalog}}, which contains meta objects that are not managed by 
> Flink. There are several possible implementations of the Flink internal 
> catalog interface: memory, file, external registry (such as confluent schema 
> registry or Hive metastore), and relational database, etc. 
> The initial functionality as well as the catalog hierarchy could be very 
> simple. The basic functionality of the catalog will be mostly create, alter, 
> and drop tables, views, functions, etc. Obviously, this can evolve over time.
> We plan to provide implementations: in-memory and in Hive metastore.
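The basic create/alter/drop surface described above can be sketched as follows; this is an illustrative in-memory sketch under my own naming assumptions, not Flink's actual catalog interface:

```python
class InMemoryCatalog:
    """Illustrative sketch of the catalog surface described above
    (create/alter/drop for meta objects); names are hypothetical and
    this is not Flink's actual interface."""

    def __init__(self):
        self._tables = {}  # table name -> schema / metadata object

    def create_table(self, name, schema, ignore_if_exists=False):
        if name in self._tables:
            if ignore_if_exists:
                return
            raise ValueError(f"table '{name}' already exists")
        self._tables[name] = schema

    def alter_table(self, name, new_schema):
        if name not in self._tables:
            raise KeyError(f"table '{name}' does not exist")
        self._tables[name] = new_schema

    def drop_table(self, name, ignore_if_not_exists=False):
        if name not in self._tables:
            if ignore_if_not_exists:
                return
            raise KeyError(f"table '{name}' does not exist")
        del self._tables[name]

    def list_tables(self):
        return sorted(self._tables)
```

A file- or metastore-backed implementation would expose the same surface and differ only in where `_tables` is persisted.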





[jira] [Commented] (FLINK-10697) Create FlinkInMemoryCatalog class, an in-memory catalog that stores Flink's meta objects for production use

2019-01-30 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756756#comment-16756756
 ] 

Xuefu Zhang commented on FLINK-10697:
-

The task here is superseded by FLINK-11475.

> Create FlinkInMemoryCatalog class, an in-memory catalog that stores Flink's 
> meta objects for production use
> ---
>
> Key: FLINK-10697
> URL: https://issues.apache.org/jira/browse/FLINK-10697
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> Currently all Flink meta objects (tables only, for now) are stored in memory 
> as part of the Calcite catalog. Some of those objects are temporary (such as 
> inline tables), while others are meant to live beyond the user session. As we 
> introduce a catalog for those objects (tables, views, and UDFs), it makes sense 
> to organize them neatly. Further, a catalog implementation that stores those 
> objects in memory retains the current behavior, and can be configured by the 
> user.
> Please note that this implementation is different from the current 
> {{InMemoryExternalCatalog}}, which is used mainly for testing and doesn't 
> reflect what's actually needed for Flink meta objects.





[jira] [Closed] (FLINK-10697) Create FlinkInMemoryCatalog class, an in-memory catalog that stores Flink's meta objects for production use

2019-01-30 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang closed FLINK-10697.
---
   Resolution: Duplicate
Fix Version/s: (was: 1.8.0)

> Create FlinkInMemoryCatalog class, an in-memory catalog that stores Flink's 
> meta objects for production use
> ---
>
> Key: FLINK-10697
> URL: https://issues.apache.org/jira/browse/FLINK-10697
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>  Labels: pull-request-available
>
> Currently all Flink meta objects (tables only, for now) are stored in memory 
> as part of the Calcite catalog. Some of those objects are temporary (such as 
> inline tables), while others are meant to live beyond the user session. As we 
> introduce a catalog for those objects (tables, views, and UDFs), it makes sense 
> to organize them neatly. Further, a catalog implementation that stores those 
> objects in memory retains the current behavior, and can be configured by the 
> user.
> Please note that this implementation is different from the current 
> {{InMemoryExternalCatalog}}, which is used mainly for testing and doesn't 
> reflect what's actually needed for Flink meta objects.





[jira] [Closed] (FLINK-10699) Create FlinkHmsCatalog for persistent Flink meta objects using Hive metastore as a registry

2019-01-30 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang closed FLINK-10699.
---
Resolution: Duplicate

This duplicates FLINK-11482.

> Create FlinkHmsCatalog for persistent Flink meta objects using Hive metastore 
> as a registry
> ---
>
> Key: FLINK-10699
> URL: https://issues.apache.org/jira/browse/FLINK-10699
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Bowen Li
>Priority: Major
>
> Similar to FLINK-10697, but using Hive metastore as persistent storage.





[jira] [Created] (FLINK-11482) Implement GenericHiveMetastoreCatalog

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11482:
---

 Summary: Implement GenericHiveMetastoreCatalog
 Key: FLINK-11482
 URL: https://issues.apache.org/jira/browse/FLINK-11482
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Bowen Li


{{GenericHiveMetastoreCatalog}} is a special implementation of 
{{ReadableWritableCatalog}} interface to store tables/views/functions defined 
in Flink to Hive metastore. With respect to the objects stored, 
{{GenericHiveMetastoreCatalog}} is similar to {{GenericInMemoryCatalog}}, but 
the storage used is a Hive metastore instead of memory.





[jira] [Commented] (FLINK-10618) Introduce catalog for Flink tables

2019-01-30 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756752#comment-16756752
 ] 

Xuefu Zhang commented on FLINK-10618:
-

This task is superseded by FLINK-11482. 

> Introduce catalog for Flink tables
> --
>
> Key: FLINK-10618
> URL: https://issues.apache.org/jira/browse/FLINK-10618
> Project: Flink
>  Issue Type: Sub-task
>  Components: SQL Client
>Affects Versions: 1.6.1
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
> Fix For: 1.8.0
>
>
> This JIRA covers the 2nd aspect of Flink-Hive metastore integration.
> Besides meta objects such as tables that may come from an 
> {{ExternalCatalog}}, Flink also deals with tables/views/functions that are 
> created on the fly (in memory), or specified in a configuration file. Those 
> objects don't belong to any {{ExternalCatalog}}, yet Flink either stores them 
> in memory, where they are non-persistent, or recreates them from a file, which 
> is a big pain for the user. Those objects are known only to Flink, and Flink 
> manages them poorly.
> Since they are typical objects in a database catalog, it's natural to have a 
> catalog that manages those objects. The interface will be similar to 
> {{ExternalCatalog}}, which contains meta objects that are not managed by 
> Flink. There are several possible implementations of the Flink internal 
> catalog interface: memory, file, external registry (such as confluent schema 
> registry or Hive metastore), and relational database, etc. 
> The initial functionality as well as the catalog hierarchy could be very 
> simple. The basic functionality of the catalog will be mostly create, alter, 
> and drop tables, views, functions, etc. Obviously, this can evolve over time.
> We plan to provide implementations: in-memory and in Hive metastore.





[jira] [Closed] (FLINK-10542) Implement an external catalog for Hive

2019-01-30 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang closed FLINK-10542.
---
Resolution: Duplicate

> Implement an external catalog for Hive
> --
>
> Key: FLINK-10542
> URL: https://issues.apache.org/jira/browse/FLINK-10542
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>
> Similar to FLINK-2167 but rather register Hive metastore as an external 
> catalog in the {{TableEnvironment}}. After registration, Table API and SQL 
> queries should be able to access all Hive tables (metadata only with this 
> JIRA completed).
> This might supersede FLINK-2167 as Hive metastore stores a superset of tables 
> available via hCat without an indirection.
> This JIRA covers the 1st aspect of Flink-Hive metastore integration outlined 
> in FLINK-10744





[jira] [Commented] (FLINK-10542) Implement an external catalog for Hive

2019-01-30 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756748#comment-16756748
 ] 

Xuefu Zhang commented on FLINK-10542:
-

The task here is superseded by FLINK-11479.

> Implement an external catalog for Hive
> --
>
> Key: FLINK-10542
> URL: https://issues.apache.org/jira/browse/FLINK-10542
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.6.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>
> Similar to FLINK-2167 but rather register Hive metastore as an external 
> catalog in the {{TableEnvironment}}. After registration, Table API and SQL 
> queries should be able to access all Hive tables (metadata only with this 
> JIRA completed).
> This might supersede FLINK-2167 as Hive metastore stores a superset of tables 
> available via hCat without an indirection.
> This JIRA covers the 1st aspect of Flink-Hive metastore integration outlined 
> in FLINK-10744





[jira] [Created] (FLINK-11481) Integrate HiveTableFactory with existing table factory discovery mechanism

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11481:
---

 Summary: Integrate HiveTableFactory with existing table factory 
discovery mechanism
 Key: FLINK-11481
 URL: https://issues.apache.org/jira/browse/FLINK-11481
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Currently, a table factory is auto-discovered based on table properties. 
However, {{HiveTableFactory}}, the factory class for Hive tables, is 
specifically returned from {{HiveCatalog}}. Since we allow both mechanisms, we 
need to integrate the two so that they work seamlessly.

Please refer to the design doc for details. Some further design thoughts are 
necessary, however.
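The combination of the two lookup mechanisms could be sketched like this; all names and the matching rule are my own simplifying assumptions, not the design's actual API. A catalog-supplied factory (as described above for HiveCatalog/HiveTableFactory) takes precedence, with property-based discovery as the fallback:

```python
class CsvTableFactory:
    """Toy property-discovered factory (hypothetical name)."""
    def required_context(self):
        return {"connector.type": "filesystem", "format.type": "csv"}

def resolve_table_factory(catalog, table_properties, discovered_factories):
    """Prefer a factory supplied by the catalog itself; otherwise fall
    back to property-based discovery over the registered factories.
    The exact-match rule below is a simplification of real discovery."""
    factory = getattr(catalog, "table_factory", None)
    if factory is not None:
        return factory
    for candidate in discovered_factories:
        required = candidate.required_context()
        if all(table_properties.get(k) == v for k, v in required.items()):
            return candidate
    raise LookupError("no table factory matches the given properties")
```

The seam between the mechanisms is then a single resolution function, which is roughly the integration this issue asks for.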





[jira] [Created] (FLINK-11480) Create HiveTableFactory that creates TableSource/Sink from a Hive table

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11480:
---

 Summary: Create HiveTableFactory that creates TableSource/Sink 
from a Hive table
 Key: FLINK-11480
 URL: https://issues.apache.org/jira/browse/FLINK-11480
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


This may require some design thought because {{HiveTableFactory}} is 
different from existing {{TableFactory}} implementations.





[jira] [Updated] (FLINK-11275) Unified Catalog APIs

2019-01-30 Thread Xuefu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated FLINK-11275:

Description: 
During Flink-Hive integration, we found that the current Catalog APIs are quite 
cumbersome and to a large extent require significant rework. While the previous 
APIs are essentially unused, at least in the Flink code base, we need to be 
careful in defining a new set of APIs to avoid yet more rework in the future.

This is an uber JIRA covering all the work outlined in 
[FLIP-30|https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs].

  was:
During Flink-Hive integration, we found that the current Catalog APIs are quite 
cumbersome and to a large extent require significant rework. While the previous 
APIs are essentially unused, at least in the Flink code base, we need to be 
careful in defining a new set of APIs to avoid yet more rework in the future.

This is an uber JIRA covering all the work outlined in FLIP-30.


> Unified Catalog APIs
> 
>
> Key: FLINK-11275
> URL: https://issues.apache.org/jira/browse/FLINK-11275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>
> During Flink-Hive integration, we found that the current Catalog APIs are 
> quite cumbersome and to a large extent require significant rework. While the 
> previous APIs are essentially unused, at least in the Flink code base, we 
> need to be careful in defining a new set of APIs to avoid yet more rework in 
> the future.
> This is an uber JIRA covering all the work outlined in 
> [FLIP-30|https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs].





[jira] [Created] (FLINK-11479) Implement HiveCatalog

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11479:
---

 Summary: Implement HiveCatalog
 Key: FLINK-11479
 URL: https://issues.apache.org/jira/browse/FLINK-11479
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Bowen Li


{{HiveCatalog}} is an implementation of {{ReadableWritableCatalog}} interface 
for meta objects managed by Hive Metastore.





[jira] [Created] (FLINK-11477) Define catalog entries in SQL client YAML file and handle the creation and registration of those entries

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11477:
---

 Summary: Define catalog entries in SQL client YAML file and handle 
the creation and registration of those entries
 Key: FLINK-11477
 URL: https://issues.apache.org/jira/browse/FLINK-11477
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client
Reporter: Xuefu Zhang
Assignee: Bowen Li


As the configuration for the SQL client, the YAML file currently allows one to 
register tables along with other entities such as deployment, execution, and so 
on. However, it doesn't have a section for catalog registration. We need to 
support users registering catalogs in the SQL client.
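A catalog section in the YAML file might look like the fragment below; this is a hypothetical sketch only, since the concrete key names were still to be defined by this issue at the time of this thread:

```yaml
# Hypothetical sketch -- key names are assumptions, not the final design.
catalogs:
  - name: myhive
    type: hive
    hive-conf-dir: /opt/hive/conf
  - name: generic
    type: generic_in_memory
```

Each entry would name a catalog and carry type-specific properties, mirroring how the existing `tables` section describes table sources.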





[jira] [Created] (FLINK-11478) Handle existing table registration via YAML file in the context of catalog support

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11478:
---

 Summary: Handle existing table registration via YAML file in the 
context of catalog support
 Key: FLINK-11478
 URL: https://issues.apache.org/jira/browse/FLINK-11478
 Project: Flink
  Issue Type: Sub-task
  Components: SQL Client, Table API  SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Before we introduced catalogs, it was common for users to define their tables 
in the SQL client YAML file. With catalog support, this is no longer necessary. 
However, to keep backward compatibility, this mechanism should continue 
functioning. Behind the scenes, we need to solve the problem that the same 
tables are re-registered every time the user launches the SQL client.





[jira] [Created] (FLINK-11476) Create CatalogManager to manage multiple catalogs and encapsulate Calcite schema

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11476:
---

 Summary: Create CatalogManager to manage multiple catalogs and 
encapsulate Calcite schema
 Key: FLINK-11476
 URL: https://issues.apache.org/jira/browse/FLINK-11476
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Flink allows more than one registered catalog. The {{CatalogManager}} class is 
the holding class that manages and encapsulates the catalogs and their 
interrelations with Calcite.
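Such a holding class could be sketched as a registry with a current (default) catalog and path resolution; the method names below are illustrative assumptions, and the Calcite-schema wiring is intentionally omitted:

```python
class CatalogManager:
    """Sketch of a registry for multiple named catalogs with a current
    (default) catalog. Method names are hypothetical, not Flink's API."""

    def __init__(self, default_name, default_catalog):
        self._catalogs = {default_name: default_catalog}
        self._current = default_name

    def register_catalog(self, name, catalog):
        if name in self._catalogs:
            raise ValueError(f"catalog '{name}' is already registered")
        self._catalogs[name] = catalog

    def get_catalog(self, name):
        return self._catalogs[name]

    def set_current_catalog(self, name):
        if name not in self._catalogs:
            raise KeyError(f"unknown catalog '{name}'")
        self._current = name

    def resolve_table_path(self, path, default_db="default"):
        """Expand 'table', 'db.table', or 'catalog.db.table' into a fully
        qualified (catalog, db, table) triple against the current catalog."""
        parts = path.split(".")
        if len(parts) == 3:
            return tuple(parts)
        if len(parts) == 2:
            return (self._current, parts[0], parts[1])
        return (self._current, default_db, parts[0])
```

The encapsulation point is that queries only ever see fully qualified triples, however partially the user wrote the path.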





[jira] [Created] (FLINK-11475) Adapt existing InMemoryExternalCatalog to GenericInMemoryCatalog

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11475:
---

 Summary: Adapt existing InMemoryExternalCatalog to 
GenericInMemoryCatalog
 Key: FLINK-11475
 URL: https://issues.apache.org/jira/browse/FLINK-11475
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Bowen Li


{{GenericInMemoryCatalog}} needs to implement the {{ReadableWritableCatalog}} 
interface based on the design.





[jira] [Created] (FLINK-11474) Add ReadableCatalog, ReadableWritableCatalog, and other related interfaces

2019-01-30 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-11474:
---

 Summary: Add ReadableCatalog, ReadableWritableCatalog, and other 
related interfaces
 Key: FLINK-11474
 URL: https://issues.apache.org/jira/browse/FLINK-11474
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Also deprecate ReadableCatalog, ReadableWritableCatalog, and other related 
existing classes.





[GitHub] TisonKun commented on issue #7570: [FLINK-11422] Prefer testing class to mock StreamTask in AbstractStre…

2019-01-30 Thread GitBox
TisonKun commented on issue #7570: [FLINK-11422] Prefer testing class to mock 
StreamTask in AbstractStre…
URL: https://github.com/apache/flink/pull/7570#issuecomment-459151470
 
 
   @pnowojski thanks for your review!




[jira] [Comment Edited] (FLINK-11363) Check and remove TaskManagerConfigurationTest

2019-01-30 Thread Gary Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756567#comment-16756567
 ] 

Gary Yao edited comment on FLINK-11363 at 1/30/19 9:24 PM:
---

Fixed via

7af5a98adebb83b6a8e687ba6ec38ec527bb325b

85c3e7e2c2b46b3a2a7efeee77317dd8aabe3479


was (Author: gjy):
Fixed via

969c8b2d08afae9dafe0349c2a10cf43c6deb918

7af5a98adebb83b6a8e687ba6ec38ec527bb325b

> Check and remove TaskManagerConfigurationTest
> -
>
> Key: FLINK-11363
> URL: https://issues.apache.org/jira/browse/FLINK-11363
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: Yun Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Check whether {{TaskManagerConfigurationTest}} contains any relevant tests 
> for the new code base and then remove this test.





[jira] [Updated] (FLINK-11363) Check and remove TaskManagerConfigurationTest

2019-01-30 Thread Gary Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-11363:
-
Fix Version/s: 1.8.0

> Check and remove TaskManagerConfigurationTest
> -
>
> Key: FLINK-11363
> URL: https://issues.apache.org/jira/browse/FLINK-11363
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: Yun Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Check whether {{TaskManagerConfigurationTest}} contains any relevant tests 
> for the new code base and then remove this test.





[jira] [Closed] (FLINK-11363) Check and remove TaskManagerConfigurationTest

2019-01-30 Thread Gary Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao closed FLINK-11363.

Resolution: Fixed

> Check and remove TaskManagerConfigurationTest
> -
>
> Key: FLINK-11363
> URL: https://issues.apache.org/jira/browse/FLINK-11363
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: Yun Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Check whether {{TaskManagerConfigurationTest}} contains any relevant tests 
> for the new code base and then remove this test.





[jira] [Reopened] (FLINK-11363) Check and remove TaskManagerConfigurationTest

2019-01-30 Thread Gary Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao reopened FLINK-11363:
--

> Check and remove TaskManagerConfigurationTest
> -
>
> Key: FLINK-11363
> URL: https://issues.apache.org/jira/browse/FLINK-11363
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: Yun Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Check whether {{TaskManagerConfigurationTest}} contains any relevant tests 
> for the new code base and then remove this test.





[jira] [Closed] (FLINK-11363) Check and remove TaskManagerConfigurationTest

2019-01-30 Thread Gary Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao closed FLINK-11363.

Resolution: Fixed

Fixed via

969c8b2d08afae9dafe0349c2a10cf43c6deb918

7af5a98adebb83b6a8e687ba6ec38ec527bb325b

> Check and remove TaskManagerConfigurationTest
> -
>
> Key: FLINK-11363
> URL: https://issues.apache.org/jira/browse/FLINK-11363
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: Yun Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Check whether {{TaskManagerConfigurationTest}} contains any relevant tests 
> for the new code base and then remove this test.





[GitHub] GJL merged pull request #7595: [FLINK-11363][tests] Port TaskManagerConfigurationTest to new code base

2019-01-30 Thread GitBox
GJL merged pull request #7595: [FLINK-11363][tests] Port 
TaskManagerConfigurationTest to new code base
URL: https://github.com/apache/flink/pull/7595
 
 
   




[GitHub] walterddr commented on issue #7607: [FLINK-10076][table] Upgrade Calcite dependency to 1.18

2019-01-30 Thread GitBox
walterddr commented on issue #7607: [FLINK-10076][table] Upgrade Calcite 
dependency to 1.18
URL: https://github.com/apache/flink/pull/7607#issuecomment-459101279
 
 
   @zentol yes I think definitely wait for @twalthr to merge #7587 first. It 
shouldn't be too much of a merge issue.




[jira] [Updated] (FLINK-11473) Clarify Documentation on Latency Tracking

2019-01-30 Thread Konstantin Knauf (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Knauf updated FLINK-11473:
-
Description: 
The documentation on Latency Tracking states:
{noformat}
All intermediate operators keep a list of the last n latencies from each source 
to compute a latency distribution. The sink operators keep a list from each 
source, and each parallel source instance to allow detecting latency issues 
caused by individual machines.{noformat}
This is no longer correct, as this behavior is now controlled by the 
configuration option {{metrics.latency.granularity}}.
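For reference, the behavior is switched in flink-conf.yaml; the values below reflect my understanding of the option as of Flink 1.7 and should be checked against the configuration documentation:

```yaml
# metrics.latency.granularity controls how latency histograms are grouped
# (believed values as of Flink 1.7): "single" (no per-source breakdown),
# "operator" (per source, the default), or "subtask" (per source subtask).
metrics.latency.granularity: operator
# Latency tracking itself is enabled by a positive interval in milliseconds.
metrics.latency.interval: 30000
```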

  was:
The documentation on Latency Markers states:
{noformat}
All intermediate operators keep a list of the last n latencies from each source 
to compute a latency distribution. The sink operators keep a list from each 
source, and each parallel source instance to allow detecting latency issues 
caused by individual machines.{noformat}
This is no longer correct, as this behavior is now controlled by the 
configuration option {{metrics.latency.granularity}}.


> Clarify Documentation on Latency Tracking
> 
>
> Key: FLINK-11473
> URL: https://issues.apache.org/jira/browse/FLINK-11473
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.7.1
>Reporter: Konstantin Knauf
>Assignee: Konstantin Knauf
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The documentation on Latency Tracking states:
> {noformat}
> All intermediate operators keep a list of the last n latencies from each 
> source to compute a latency distribution. The sink operators keep a list from 
> each source, and each parallel source instance to allow detecting latency 
> issues caused by individual machines.{noformat}
> This is no longer correct, as this behavior is now controlled by the 
> configuration option {{metrics.latency.granularity}}.





[jira] [Updated] (FLINK-11473) Clarify Documentation on Latency Tracking

2019-01-30 Thread Konstantin Knauf (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Knauf updated FLINK-11473:
-
Summary: Clarify Documentation on Latency Tracking  (was: Clarify 
Documentation on Latency Markers)

> Clarify Documentation on Latency Tracking
> 
>
> Key: FLINK-11473
> URL: https://issues.apache.org/jira/browse/FLINK-11473
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.7.1
>Reporter: Konstantin Knauf
>Assignee: Konstantin Knauf
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The documentation on Latency Markers states:
> {noformat}
> All intermediate operators keep a list of the last n latencies from each 
> source to compute a latency distribution. The sink operators keep a list from 
> each source, and each parallel source instance to allow detecting latency 
> issues caused by individual machines.{noformat}
> This is no longer correct, as this behavior is now controlled by the 
> configuration option {{metrics.latency.granularity}}.





[jira] [Updated] (FLINK-11473) Clarify Documentation on Latency Markers

2019-01-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11473:
---
Labels: pull-request-available  (was: )

> Clarify Documenation on Latency Markers
> ---
>
> Key: FLINK-11473
> URL: https://issues.apache.org/jira/browse/FLINK-11473
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.7.1
>Reporter: Konstantin Knauf
>Assignee: Konstantin Knauf
>Priority: Minor
>  Labels: pull-request-available
>
> The documenation on Latency Markers states:
> {noformat}
> All intermediate operators keep a list of the last n latencies from each 
> source to compute a latency distribution. The sink operators keep a list from 
> each source, and each parallel source instance to allow detecting latency 
> issues caused by individual machines.{noformat}
> {{This is not correct anymore as this behavior is controlled by a 
> configuration `metrics.latency.granularity` now.}}





[GitHub] knaufk opened a new pull request #7608: FLINK-11473 [documentation][ Clarify documenation on Latency Tracking

2019-01-30 Thread GitBox
knaufk opened a new pull request #7608: FLINK-11473 [documentation][ Clarify 
documenation on Latency Tracking
URL: https://github.com/apache/flink/pull/7608
 
 
   

[jira] [Created] (FLINK-11473) Clarify Documenation on Latency Markers

2019-01-30 Thread Konstantin Knauf (JIRA)
Konstantin Knauf created FLINK-11473:


 Summary: Clarify Documenation on Latency Markers
 Key: FLINK-11473
 URL: https://issues.apache.org/jira/browse/FLINK-11473
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.7.1
Reporter: Konstantin Knauf
Assignee: Konstantin Knauf


The documenation on Latency Markers states:
{noformat}
All intermediate operators keep a list of the last n latencies from each source 
to compute a latency distribution. The sink operators keep a list from each 
source, and each parallel source instance to allow detecting latency issues 
caused by individual machines.{noformat}
{{This is not correct anymore as this behavior is controlled by a configuration 
`metrics.latency.granularity` now.}}
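For reference, the behavior quoted above is now controlled by the latency-tracking options. A minimal `flink-conf.yaml` sketch (the interval value is illustrative; `metrics.latency.granularity` accepts `single`, `operator`, or `subtask`):

```yaml
# Emit latency markers every 30 s; 0 disables latency tracking.
metrics.latency.interval: 30000
# Track one latency distribution per source/operator pair
# (alternatives: single, subtask).
metrics.latency.granularity: operator
```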





[GitHub] zentol commented on issue #7607: [FLINK-10076][table] Upgrade Calcite dependency to 1.18

2019-01-30 Thread GitBox
zentol commented on issue #7607: [FLINK-10076][table] Upgrade Calcite 
dependency to 1.18
URL: https://github.com/apache/flink/pull/7607#issuecomment-459066684
 
 
   With #7587 close to being merged, you should probably rebase your PR on top 
of it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (FLINK-11357) Check and port LeaderChangeJobRecoveryTest to new code base if necessary

2019-01-30 Thread Gary Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao closed FLINK-11357.

Resolution: Fixed

Fixed via 969c8b2d08afae9dafe0349c2a10cf43c6deb918

> Check and port LeaderChangeJobRecoveryTest to new code base if necessary
> 
>
> Key: FLINK-11357
> URL: https://issues.apache.org/jira/browse/FLINK-11357
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Till Rohrmann
>Assignee: lining
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Check and port {{LeaderChangeJobRecoveryTest}} to new code base if necessary.





[GitHub] GJL merged pull request #7554: [FLINK-11357][test]Check and port LeaderChangeJobRecoveryTest to new …

2019-01-30 Thread GitBox
GJL merged pull request #7554: [FLINK-11357][test]Check and port 
LeaderChangeJobRecoveryTest to new …
URL: https://github.com/apache/flink/pull/7554
 
 
   




[jira] [Updated] (FLINK-10076) Upgrade Calcite dependency to 1.18

2019-01-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10076:
---
Labels: pull-request-available  (was: )

> Upgrade Calcite dependency to 1.18
> --
>
> Key: FLINK-10076
> URL: https://issues.apache.org/jira/browse/FLINK-10076
> Project: Flink
>  Issue Type: Task
>  Components: Table API  SQL
>Reporter: Shuyi Chen
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>






[GitHub] walterddr opened a new pull request #7607: [FLINK-10076][table] Upgrade Calcite dependency to 1.18

2019-01-30 Thread GitBox
walterddr opened a new pull request #7607: [FLINK-10076][table] Upgrade Calcite 
dependency to 1.18
URL: https://github.com/apache/flink/pull/7607
 
 
   
   
   ## What is the purpose of the change
   
   Upgrade the Calcite dependency to 1.18 for the flink-table package, 
including various bug fixes and improvements (details can be found in the JIRA 
discussion).
   
   
   ## Brief change log
   - Fix TimeIndicator type optimization
 - Due to: [CALCITE-1413], [CALCITE-2695]
 - Fix: include `digest` for `TimeIndicatorRelDataType`
   - Under-simplified RexCall chains
 - Due to: [CALCITE-2726], RexSimplify does not simplify some nested 
operations enough (no nested visitCall)
 - Fix: updated tests; further fixes needed on the Calcite side
   - Remove duplicated expressions and casting optimization
 - Due to: [CALCITE-1413], [CALCITE-2631], [CALCITE-2639]
 - Fix: updated tests
   - Nested selected time indicator types not working properly
 - Due to: [CALCITE-2470], the project method combines expressions if the 
underlying node is a project
 - Fix: pulled in `RelBuilder` from Calcite and changed the flag to not merge 
projects
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): yes
   - Updated Calcite dependency and list of exclusions
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? N/A
   




[jira] [Commented] (FLINK-11419) StreamingFileSink fails to recover after taskmanager failure

2019-01-30 Thread Edward Rojas (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756394#comment-16756394
 ] 

Edward Rojas commented on FLINK-11419:
--

[~kkl0u] Travis build is failing but it's not related to the changes made in 
this PR.

[~StephanEwen] I saw that you were the last person to modify this file, maybe 
you could take a look at the changes proposed.

> StreamingFileSink fails to recover after taskmanager failure
> 
>
> Key: FLINK-11419
> URL: https://issues.apache.org/jira/browse/FLINK-11419
> Project: Flink
>  Issue Type: Bug
>  Components: filesystem-connector
>Affects Versions: 1.7.1
>Reporter: Edward Rojas
>Assignee: Edward Rojas
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a job with a StreamingFileSink sending data to HDFS is running in a 
> cluster with multiple taskmanagers and the taskmanager executing the job goes 
> down for some reason, then when another taskmanager starts executing the job, 
> it fails with "missing data in tmp file" because it is not able to perform a 
> truncate on the file.
>  Here the full stack trace:
> {code:java}
> java.io.IOException: Missing data in tmp file: 
> hdfs://path/to/hdfs/2019-01-20/.part-0-0.inprogress.823f9c20-3594-4fe3-ae8c-f57b6c35e191
>   at 
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.(HadoopRecoverableFsDataOutputStream.java:93)
>   at 
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.recover(HadoopRecoverableWriter.java:72)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.restoreInProgressFile(Bucket.java:140)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.(Bucket.java:127)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.restore(Bucket.java:396)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.DefaultBucketFactoryImpl.restoreBucket(DefaultBucketFactoryImpl.java:64)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.handleRestoredBucketState(Buckets.java:177)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.initializeActiveBuckets(Buckets.java:165)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.initializeState(Buckets.java:149)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.initializeState(StreamingFileSink.java:334)
>   at 
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
>   at 
> org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:278)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException):
>  Failed to TRUNCATE_FILE 
> /path/to/hdfs/2019-01-20/.part-0-0.inprogress.823f9c20-3594-4fe3-ae8c-f57b6c35e191
>  for DFSClient_NONMAPREDUCE_-2103482360_62 on x.xxx.xx.xx because this file 
> lease is currently owned by DFSClient_NONMAPREDUCE_1834204750_59 on x.xx.xx.xx
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3190)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.truncateInternal(FSNamesystem.java:2282)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.truncateInt(FSNamesystem.java:2228)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.truncate(FSNamesystem.java:2198)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.truncate(NameNodeRpcServer.java:1056)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.truncate(ClientNamenodeProtocolServerSideTranslatorPB.java:622)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
>   at 

[GitHub] NicoK commented on a change in pull request #6705: [FLINK-10356][network] add sanity checks to SpillingAdaptiveSpanningRecordDeserializer

2019-01-30 Thread GitBox
NicoK commented on a change in pull request #6705: [FLINK-10356][network] add 
sanity checks to SpillingAdaptiveSpanningRecordDeserializer
URL: https://github.com/apache/flink/pull/6705#discussion_r252367679
 
 

 ##
 File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/api/serialization/SpanningRecordSerializationTest.java
 ##
 @@ -105,11 +119,235 @@ public void testHandleMixedLargeRecords() throws 
Exception {
testSerializationRoundTrip(originalRecords, segmentSize);
}
 
+   /**
+* Non-spanning, deserialization reads one byte too many and succeeds - 
failure report comes
+* from an additional check in {@link 
SpillingAdaptiveSpanningRecordDeserializer}.
+*/
+   @Test
+   public void testHandleDeserializingTooMuchNonSpanning1() throws 
Exception {
+   testHandleWrongDeserialization(
+   DeserializingTooMuch.getValue(),
+   32 * 1024);
+   }
+
+   /**
+* Non-spanning, deserialization reads one byte too many and fails.
+*/
+   @Test
+   public void testHandleDeserializingTooMuchNonSpanning2() throws 
Exception {
+   testHandleWrongDeserialization(
+   DeserializingTooMuch.getValue(),
+   (serializedLength) -> serializedLength,
+   isA(IndexOutOfBoundsException.class));
+   }
+
+   /**
+* Spanning, deserialization reads one byte too many and fails.
+*/
+   @Test
+   public void testHandleDeserializingTooMuchSpanning1() throws Exception {
+   testHandleWrongDeserialization(
+   DeserializingTooMuch.getValue(),
+   (serializedLength) -> serializedLength - 1,
+   isA(EOFException.class));
+   }
+
+   /**
+* Spanning, deserialization reads one byte too many and fails.
+*/
+   @Test
+   public void testHandleDeserializingTooMuchSpanning2() throws Exception {
+   testHandleWrongDeserialization(
+   DeserializingTooMuch.getValue(),
+   (serializedLength) -> 1,
+   isA(EOFException.class));
+   }
+
+   /**
+* Spanning, spilling, deserialization reads one byte too many.
+*/
+   @Test
+   public void testHandleDeserializingTooMuchSpanningLargeRecord() throws 
Exception {
+   testHandleWrongDeserialization(
+   LargeObjectTypeDeserializingTooMuch.getRandom(),
+   32 * 1024,
+   isA(EOFException.class));
+   }
+
+   /**
+* Non-spanning, deserialization forgets to read one byte - failure 
report comes from an
+* additional check in {@link 
SpillingAdaptiveSpanningRecordDeserializer}.
+*/
+   @Test
+   public void testHandleDeserializingNotEnoughNonSpanning() throws 
Exception {
+   testHandleWrongDeserialization(
+   DeserializingNotEnough.getValue(),
+   32 * 1024);
+   }
+
+   /**
+* Spanning, deserialization forgets to read one byte - failure report 
comes from an additional
+* check in {@link SpillingAdaptiveSpanningRecordDeserializer}.
+*/
+   @Test
+   public void testHandleDeserializingNotEnoughSpanning1() throws 
Exception {
+   testHandleWrongDeserialization(
+   DeserializingNotEnough.getValue(),
+   (serializedLength) -> serializedLength - 1);
+   }
+
+   /**
+* Spanning, serialization length is 17 (including headers), 
deserialization forgets to read one
+* byte - failure report comes from an additional check in {@link 
SpillingAdaptiveSpanningRecordDeserializer}.
+*/
+   @Test
+   public void testHandleDeserializingNotEnoughSpanning2() throws 
Exception {
+   testHandleWrongDeserialization(
+   DeserializingNotEnough.getValue(),
+   1);
+   }
+
+   /**
+* Spanning, spilling, deserialization forgets to read one byte - 
failure report comes from an
+* additional check in {@link 
SpillingAdaptiveSpanningRecordDeserializer}.
+*/
+   @Test
+   public void testHandleDeserializingNotEnoughSpanningLargeRecord() 
throws Exception {
+   testHandleWrongDeserialization(
+   LargeObjectTypeDeserializingNotEnough.getRandom(),
+   32 * 1024);
+   }
+
+   private void testHandleWrongDeserialization(
+   WrongDeserializationValue testValue,
+   IntFunction segmentSizeProvider,
+   Matcher expectedCause) throws 
Exception {
+   expectedException.expectCause(expectedCause);
+   testHandleWrongDeserialization(testValue, 

[GitHub] tillrohrmann commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers

2019-01-30 Thread GitBox
tillrohrmann commented on a change in pull request #7356: 
[FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers
URL: https://github.com/apache/flink/pull/7356#discussion_r252360494
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java
 ##
 @@ -395,12 +405,20 @@ public void onContainersAllocated(List<Container> containers) {

nodeManagerClient.startContainer(container, taskExecutorLaunchContext);
} catch (Throwable t) {
log.error("Could not start 
TaskManager in container {}.", container.getId(), t);
-
// release the failed container

workerNodeMap.remove(resourceId);

resourceManagerClient.releaseAssignedContainer(container.getId());
-   // and ask for a new one
-   
requestYarnContainerIfRequired();
+   log.error("Could not start 
TaskManager in container {}.", container.getId(), t);
+   recordFailure();
+   if (shouldRejectRequests()) {
+   
rejectAllPendingSlotRequests(new MaximumFailedTaskManagerExceedingException(
+   
String.format("Maximum number of failed container %d in interval %s"
+   
+ "is detected in Resource Manager", maximumFailureTaskExecutorPerInternal,
+   
failureInterval.toString()), t));
 
 Review comment:
   This branch should go into the `recordFailure` method.




[GitHub] tillrohrmann commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers

2019-01-30 Thread GitBox
tillrohrmann commented on a change in pull request #7356: 
[FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers
URL: https://github.com/apache/flink/pull/7356#discussion_r252354300
 
 

 ##
 File path: docs/_includes/generated/mesos_configuration.html
 ##
 @@ -27,6 +27,11 @@
 -1
 The maximum number of failed workers before the cluster fails. 
May be set to -1 to disable this feature. This option is ignored unless Flink 
is in legacy mode.
 
+
+mesos.maximum-failed-workers-per-interval
+-1
+Maximum number of workers the system is going to reallocate in 
case of a failure in an interval.
 
 Review comment:
   Is this a Mesos specific configuration? If we want to support that for all 
`ResourceManagers`, then it might be better to use a more generic configuration 
name. E.g. `resourcemanager.maximum-workers-failure-rate`.
   
   Should we combine `maximum-failed-workers-per-interval` and 
`workers-failure-rate-interval` into a `failure-rate` config option which 
defines how many workers can fail per minute or per hour?
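The combined option suggested here could be a single rate value parsed into a count and an interval. A minimal sketch, assuming a hypothetical option name and a `<count>/<minutes>min` value format that Flink does not actually define:

```java
import java.time.Duration;

/**
 * Sketch of a combined "failure-rate" option: how many workers may fail
 * per interval. The class name and the "<count>/<minutes>min" syntax are
 * illustrative assumptions, not Flink's actual configuration format.
 */
public final class FailureRateSpec {
    public final int maxFailures;
    public final Duration interval;

    private FailureRateSpec(int maxFailures, Duration interval) {
        this.maxFailures = maxFailures;
        this.interval = interval;
    }

    /** Parses values such as "10/5min" (at most 10 failures per 5 minutes). */
    public static FailureRateSpec parse(String value) {
        String[] parts = value.split("/");
        if (parts.length != 2 || !parts[1].trim().endsWith("min")) {
            throw new IllegalArgumentException(
                "expected <count>/<minutes>min but got: " + value);
        }
        int count = Integer.parseInt(parts[0].trim());
        long minutes = Long.parseLong(parts[1].trim().replace("min", ""));
        return new FailureRateSpec(count, Duration.ofMinutes(minutes));
    }

    /** Returns true if the observed failure count exceeds the configured rate. */
    public boolean isExceeded(int failuresInInterval) {
        return failuresInInterval > maxFailures;
    }
}
```

This keeps one option instead of two, so the count and interval can never be set inconsistently.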




[GitHub] tillrohrmann commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers

2019-01-30 Thread GitBox
tillrohrmann commented on a change in pull request #7356: 
[FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers
URL: https://github.com/apache/flink/pull/7356#discussion_r252354509
 
 

 ##
 File path: docs/_includes/generated/yarn_config_configuration.html
 ##
 @@ -42,6 +47,11 @@
 (none)
 Maximum number of containers the system is going to reallocate 
in case of a failure.
 
+
+yarn.maximum-failed-containers-per-interval
+-1
+Maximum number of containers the system is going to reallocate 
in case of a failure in an interval.
+
 
 Review comment:
   Duplicate config option.




[GitHub] tillrohrmann commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers

2019-01-30 Thread GitBox
tillrohrmann commented on a change in pull request #7356: 
[FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers
URL: https://github.com/apache/flink/pull/7356#discussion_r252364001
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java
 ##
 @@ -145,6 +148,15 @@
/** All registered listeners for status updates of the ResourceManager. 
*/
private ConcurrentMap 
infoMessageListeners;
 
+   protected final Time failureInterval;
+
+   protected final int maximumFailureTaskExecutorPerInternal;
+
+   private boolean checkFailureRate;
+
+   private final ArrayDeque taskExecutorFailureTimestamps;
 
 Review comment:
   I think we should encapsulate the metering logic behind some interface. 
Ideally I would like to use exponentially-weighted moving average to 
approximate the failure rate. This would have the benefit that we would not 
have to store every timestamp here. However, with an interface we could also 
have different implementations (e.g. the exact one you have implemented here).
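The suggestion above (approximating the failure rate with an exponentially-weighted moving average, so that no per-failure timestamps need to be stored) can be sketched as follows; class and method names are hypothetical, not Flink's actual API:

```java
import java.util.concurrent.TimeUnit;

/**
 * Minimal sketch of an EWMA failure-rate meter: only a smoothed rate and the
 * count for the current interval are kept, so memory use is constant.
 */
public class EwmaFailureRater {
    private final double alpha;       // smoothing factor in (0, 1]
    private final long intervalNanos; // length of one rate interval
    private double rate;              // smoothed failures per interval
    private long windowStart;         // start of the current interval
    private int countInWindow;        // failures seen in the current interval

    public EwmaFailureRater(double alpha, long interval, TimeUnit unit, long startNanos) {
        this.alpha = alpha;
        this.intervalNanos = unit.toNanos(interval);
        this.windowStart = startNanos;
    }

    /** Records one failure observed at the given nanosecond timestamp. */
    public synchronized void recordFailure(long nowNanos) {
        rollWindows(nowNanos);
        countInWindow++;
    }

    /** Returns the smoothed failure rate (failures per interval). */
    public synchronized double currentRate(long nowNanos) {
        rollWindows(nowNanos);
        return rate;
    }

    // Folds every completed interval into the moving average.
    private void rollWindows(long nowNanos) {
        while (nowNanos - windowStart >= intervalNanos) {
            rate = alpha * countInWindow + (1 - alpha) * rate;
            countInWindow = 0;
            windowStart += intervalNanos;
        }
    }
}
```

A threshold check such as `currentRate(now) > maxFailuresPerInterval` could then back the proposed interface, with the exact timestamp-queue meter as an alternative implementation.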




[GitHub] tillrohrmann commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers

2019-01-30 Thread GitBox
tillrohrmann commented on a change in pull request #7356: 
[FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers
URL: https://github.com/apache/flink/pull/7356#discussion_r252360333
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManager.java
 ##
 @@ -300,6 +304,18 @@ public boolean registerSlotRequest(SlotRequest 
slotRequest) throws SlotManagerEx
}
}
 
+   /**
+* Rejects all pending slot requests.
+* @param cause the exception caused the rejection
+*/
+   public void rejectAllPendingSlotRequests(Exception cause) {
+   for (PendingSlotRequest pendingSlotRequest : 
pendingSlotRequests.values()) {
+   rejectPendingSlotRequest(pendingSlotRequest, cause);
+   }
+
+   pendingSlotRequests.clear();
+   }
 
 Review comment:
   We should add a test for this method.




[GitHub] tillrohrmann commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers

2019-01-30 Thread GitBox
tillrohrmann commented on a change in pull request #7356: 
[FLINK-10868][flink-yarn] Enforce maximum TMs failure rate in ResourceManagers
URL: https://github.com/apache/flink/pull/7356#discussion_r252360809
 
 

 ##
 File path: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnResourceManagerTest.java
 ##
 @@ -521,4 +533,56 @@ public void testOnContainerCompleted() throws Exception {
});
}};
}
+
+   /**
+*  Tests that YarnResourceManager rejects all pending slot requests
+*  when the maximum number of failed containers is hit.
+*/
+   @Test
+   public void testOnContainersAllocatedWithFailure() throws Exception {
 
 Review comment:
   With the generalized test for the failure rate behaviour in 
`ResourceManagerTest` we would only need to check that a failing container 
would call `ResourceManager#recordFailure`.



