[GitHub] [flink] lindong28 commented on a change in pull request #15161: [FLINK-20114][connector/kafka,common] Fix a few KafkaSource-related bugs

2021-03-23 Thread GitBox


lindong28 commented on a change in pull request #15161:
URL: https://github.com/apache/flink/pull/15161#discussion_r600194416



##
File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/source/coordinator/SourceCoordinatorContextTest.java
##
@@ -182,6 +191,44 @@ public void testSnapshotAndRestore() throws Exception {
 restoredTracker.assignmentsByCheckpointId());
 }
 
+@Test
+public void testCallableInterruptedDuringShutdownDoNotFailJob() throws 
InterruptedException {
+AtomicReference expectedError = new AtomicReference<>(null);
+
+ManuallyTriggeredScheduledExecutorService manualWorkerExecutor =
+new ManuallyTriggeredScheduledExecutorService();
+ManuallyTriggeredScheduledExecutorService manualCoordinatorExecutor =
+new ManuallyTriggeredScheduledExecutorService();
+
+SourceCoordinatorContext testingContext =
+new SourceCoordinatorContext<>(
+manualCoordinatorExecutor,
+manualWorkerExecutor,
+new 
SourceCoordinatorProvider.CoordinatorExecutorThreadFactory(
+TEST_OPERATOR_ID.toHexString(), 
getClass().getClassLoader()),
+operatorCoordinatorContext,
+new MockSourceSplitSerializer(),
+splitSplitAssignmentTracker);
+
+testingContext.callAsync(
+() -> {
+throw new InterruptedException();
+},
+(ignored, e) -> {
+if (e != null) {
+expectedError.set(e);
+throw new RuntimeException(e);
+}
+});
+
+manualWorkerExecutor.triggerAll();

Review comment:
   It is needed here. `callAsync(...)` indirectly calls `manualWorkerExecutor.execute(...)`, which enqueues the runnable into an internal queue (in `ManuallyTriggeredScheduledExecutorService`) without actually executing it.
   
   We need to call `manualWorkerExecutor.triggerAll()` to execute the runnable.
   
   I verified that if we remove `manualWorkerExecutor.triggerAll()`, the test fails because `assertTrue(expectedError.get() instanceof InterruptedException)` no longer holds.
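   For readers unfamiliar with manually triggered executors, here is a simplified, hypothetical sketch of the behavior described above (not the actual `ManuallyTriggeredScheduledExecutorService` implementation):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

// Tasks handed to execute(...) are only enqueued; nothing runs until the test
// explicitly calls triggerAll().
class ManualExecutor implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();

    @Override
    public void execute(Runnable task) {
        tasks.add(task); // enqueue only, do not run
    }

    void triggerAll() {
        Runnable task;
        while ((task = tasks.poll()) != null) {
            task.run();
        }
    }
}
```

   So without `triggerAll()`, the callable passed to `callAsync(...)` never runs, `expectedError` is never set, and the assertion on `InterruptedException` fails.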




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fsk119 commented on pull request #15332: [FLINK-21701][sql-client] Extend the "RESET" syntax in the SQL Client

2021-03-23 Thread GitBox


fsk119 commented on pull request #15332:
URL: https://github.com/apache/flink/pull/15332#issuecomment-805523390


   It looks good to me. CC @wuchong 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fsk119 commented on a change in pull request #15332: [FLINK-21701][sql-client] Extend the "RESET" syntax in the SQL Client

2021-03-23 Thread GitBox


fsk119 commented on a change in pull request #15332:
URL: https://github.com/apache/flink/pull/15332#discussion_r600193100



##
File path: 
flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliClient.java
##
@@ -424,8 +424,14 @@ private void callReset(SqlCommandCall cmdCall) {
 else {
 String key = cmdCall.operands[0].trim();
 executor.resetSessionProperty(sessionId, key);
-printRemovedAndDeprecatedKeyMessage(key);
-if (!YamlConfigUtils.isRemovedKey(key)) {
+
+boolean isRemovedKey = YamlConfigUtils.isRemovedKey(key);
+boolean isDeprecatedKey = YamlConfigUtils.isDeprecatedKey(key);
+if (isRemovedKey || isDeprecatedKey) {
+printRemovedAndDeprecatedKeyMessage(key);
+}
+// it's not removedKey, need to print info message
+if (!isRemovedKey) {

Review comment:
   nit: Why not use
   ```
   if (...) {
   
   } else {
   ...
   }
   
   ```
   
   It seems a deprecated key will also get into this if-block.
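   A hypothetical, self-contained sketch of the suggested if/else shape; the helper methods are placeholders mirroring the names in the diff, not the real `CliClient` code:

```java
class ResetKeySketch {
    void callResetKey(String key) {
        boolean isRemovedKey = isRemovedKey(key);       // stands in for YamlConfigUtils.isRemovedKey
        boolean isDeprecatedKey = isDeprecatedKey(key); // stands in for YamlConfigUtils.isDeprecatedKey
        if (isRemovedKey || isDeprecatedKey) {
            printRemovedAndDeprecatedKeyMessage(key);
        } else {
            // with a plain if/else, a deprecated (but not removed) key would NOT
            // reach this branch -- which is exactly the behavioral question above
            printInfoMessage(key);
        }
    }

    private boolean isRemovedKey(String key) { return false; }
    private boolean isDeprecatedKey(String key) { return false; }
    private void printRemovedAndDeprecatedKeyMessage(String key) {}
    private void printInfoMessage(String key) {}
}
```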




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xintongsong commented on a change in pull request #15273: [FLINK-21800][core] Guard MemorySegment against concurrent frees.

2021-03-23 Thread GitBox


xintongsong commented on a change in pull request #15273:
URL: https://github.com/apache/flink/pull/15273#discussion_r600192987



##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/MemorySegment.java
##
@@ -217,10 +222,14 @@ public int size() {
 /**
  * Checks whether the memory segment was freed.
  *
+ * This method internally involves cross-thread synchronization. Do not 
use for performance
+ * sensitive code paths.
+ *
  * @return true, if the memory segment has been freed, 
false otherwise.
  */
 public boolean isFreed() {
-return address > addressLimit;
+// in performance sensitive cases, use 'address > addressLimit' instead
+return isFreedAtomic.get();

Review comment:
   I think we have both made a point.
   1. A memory segment should deal with multiple frees, concurrency, and leaks regardless of memory type.
   2. The unsafe memory (de-)allocation should be self-contained, without relying on the correctness of the memory segment and its usages. To be specific, the cleaner itself should make sure it is called exactly once.
   
   I'm leaning towards `1` for its broader coverage, and am neutral on `2`. I think `2` might not be necessary if `1` works properly, but it does no harm to have such an extra safety net given that, as far as I can see, the price is not high.
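   A minimal sketch of the guarding idea discussed above, assuming nothing about the actual `MemorySegment` internals beyond what the diff shows (class, field, and exception names are illustrative):

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class GuardedSegment {
    private final AtomicBoolean isFreedAtomic = new AtomicBoolean(false);
    private long address;            // start of the segment's memory
    private final long addressLimit; // address > addressLimit means "freed" on hot paths

    GuardedSegment(long address, int size) {
        this.address = address;
        this.addressLimit = address + size;
    }

    void free() {
        // only the first caller wins; a second (possibly concurrent) free fails loudly
        if (!isFreedAtomic.compareAndSet(false, true)) {
            throw new IllegalStateException("MemorySegment can only be freed once!");
        }
        address = addressLimit + 1; // keep the cheap, non-synchronized check consistent
        // ... release the underlying memory here ...
    }

    boolean isFreed() {
        // involves cross-thread synchronization; performance-sensitive paths can
        // still use 'address > addressLimit' as noted in the diff
        return isFreedAtomic.get();
    }
}
```

   The cheap `address > addressLimit` check stays valid for hot paths, while the CAS on the atomic flag is what detects a concurrent or repeated free deterministically.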




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15353: [FLINK-21940][table] Rowtime/proctime should be obtained from getTimestamp instead of getLong

2021-03-23 Thread GitBox


flinkbot commented on pull request #15353:
URL: https://github.com/apache/flink/pull/15353#issuecomment-805519624


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 964add8dee7853a840f3d2da7b518584314ea942 (Wed Mar 24 
05:48:43 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-21940) Rowtime/proctime should be obtained from getTimestamp instead of getLong

2021-03-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21940:
---
Labels: pull-request-available  (was: )

> Rowtime/proctime should be obtained from getTimestamp instead of getLong
> 
>
> Key: FLINK-21940
> URL: https://issues.apache.org/jira/browse/FLINK-21940
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] JingsongLi opened a new pull request #15353: [FLINK-21940][table] Rowtime/proctime should be obtained from getTimestamp instead of getLong

2021-03-23 Thread GitBox


JingsongLi opened a new pull request #15353:
URL: https://github.com/apache/flink/pull/15353


   
   ## What is the purpose of the change
   
   Rowtime/proctime should be obtained from getTimestamp instead of getLong.
   
   Because the input row is a BinaryRowData, the legacy `getLong` happens to work. But we should avoid relying on this; in the future, the row may be a GenericRowData.
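   For illustration, a minimal sketch (not the actual planner-generated code; the rowtime field position and precision are assumptions) of reading the time attribute in an implementation-agnostic way:

```java
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.TimestampData;

public final class RowtimeAccess {

    /** Hypothetical helper: rowtime field position and precision are assumptions. */
    static long rowtimeMillis(RowData row, int rowtimePos, int precision) {
        // Works for BinaryRowData, GenericRowData, or any other RowData implementation.
        TimestampData ts = row.getTimestamp(rowtimePos, precision);
        return ts.getMillisecond();
        // By contrast, row.getLong(rowtimePos) only happens to work because of how
        // BinaryRowData stores small-precision timestamps internally.
    }
}
```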
   
   ## Verifying this change
   
   This change is a trivial rework without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no) 
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)no
 - The serializers: (yes / no / don't know)no
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't 
know)no
 - The S3 file system connector: (yes / no / don't know)no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)no


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] curcur edited a comment on pull request #15323: [FLINK-21857] StackOverflow for large parallelism jobs when processing EndOfChannelStateEvent

2021-03-23 Thread GitBox


curcur edited a comment on pull request #15323:
URL: https://github.com/apache/flink/pull/15323#issuecomment-805515175


   I think the amended approach is a more natural way to handle an "end of 
recovery" event.
   
   **One suggestion** is to add one more test for "InputStatus.END_OF_RECOVERY".
   
   Previously, this was only implicitly tested in `testUpstreamResumedUponEndOfRecovery`.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] curcur commented on pull request #15323: [FLINK-21857] StackOverflow for large parallelism jobs when processing EndOfChannelStateEvent

2021-03-23 Thread GitBox


curcur commented on pull request #15323:
URL: https://github.com/apache/flink/pull/15323#issuecomment-805515175


   I think the amended approach is a more natural way to handle an "end of 
recovery" event.
   
   One suggestion is to add one more test for "InputStatus.END_OF_RECOVERY".
   
   Previously, this was only implicitly tested in `testUpstreamResumedUponEndOfRecovery`.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] kezhuw commented on pull request #15344: [FLINK-21564][tests] Don't call condition or throw exception after condition met for CommonTestUtils.waitUntil

2021-03-23 Thread GitBox


kezhuw commented on pull request #15344:
URL: https://github.com/apache/flink/pull/15344#issuecomment-805511312


   @flinkbot attention @StephanEwen 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] kezhuw commented on pull request #15344: [FLINK-21564][tests] Don't call condition or throw exception after condition met for CommonTestUtils.waitUntil

2021-03-23 Thread GitBox


kezhuw commented on pull request #15344:
URL: https://github.com/apache/flink/pull/15344#issuecomment-805510806


   The failure is unrelated and has been reported in FLINK-20329 (Elasticsearch7DynamicSinkITCase hangs).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] curcur commented on a change in pull request #15323: [FLINK-21857] StackOverflow for large parallelism jobs when processing EndOfChannelStateEvent

2021-03-23 Thread GitBox


curcur commented on a change in pull request #15323:
URL: https://github.com/apache/flink/pull/15323#discussion_r600150920



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/checkpointing/CheckpointedInputGate.java
##
@@ -198,9 +197,6 @@ private void waitForPriorityEvents(InputGate inputGate, 
MailboxExecutor mailboxE
 bufferOrEvent.getChannelInfo());
 } else if (bufferOrEvent.getEvent().getClass() == 
EndOfChannelStateEvent.class) {
 
upstreamRecoveryTracker.handleEndOfRecovery(bufferOrEvent.getChannelInfo());
-if (!upstreamRecoveryTracker.allChannelsRecovered()) {
-return pollNext();
-}

Review comment:
   `pollNext()` has a unified entry point (to be called), so why is `pollNext()` called from the place where the event is handled? Is there any reason to do so?
   
   I cannot think of one, but I just want to double-check that there are no hidden assumptions here.
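   For context, a hypothetical, simplified illustration of why the removed recursion mattered: with N channels each delivering an `EndOfChannelStateEvent`, the recursive form nests up to N `pollNext()` frames, while an iterative form at the single entry point uses constant stack. This is a toy model, not the real `CheckpointedInputGate`:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Strings stand in for buffers/events; "END_OF_CHANNEL_STATE" stands in for
// EndOfChannelStateEvent.
class RecoveryPolling {
    private final Queue<String> events = new ArrayDeque<>();

    // Recursive shape (what the change removes): one extra stack frame per event
    // that is handled internally, i.e. stack depth grows with parallelism.
    String pollNextRecursive() {
        String e = events.poll();
        if ("END_OF_CHANNEL_STATE".equals(e)) {
            return pollNextRecursive();
        }
        return e;
    }

    // Iterative shape: same observable behavior, constant stack depth.
    String pollNextIterative() {
        String e;
        do {
            e = events.poll();
        } while ("END_OF_CHANNEL_STATE".equals(e));
        return e;
    }
}
```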




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20329) Elasticsearch7DynamicSinkITCase hangs

2021-03-23 Thread Kezhu Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307580#comment-17307580
 ] 

Kezhu Wang commented on FLINK-20329:


Another case: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15281=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20

> Elasticsearch7DynamicSinkITCase hangs
> -
>
> Key: FLINK-20329
> URL: https://issues.apache.org/jira/browse/FLINK-20329
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.12.0
>Reporter: Dian Fu
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10052=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20
> {code}
> 2020-11-24T16:04:05.9260517Z [INFO] Running 
> org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch7DynamicSinkITCase
> 2020-11-24T16:19:25.5481231Z 
> ==
> 2020-11-24T16:19:25.5483549Z Process produced no output for 900 seconds.
> 2020-11-24T16:19:25.5484064Z 
> ==
> 2020-11-24T16:19:25.5484498Z 
> ==
> 2020-11-24T16:19:25.5484882Z The following Java processes are running (JPS)
> 2020-11-24T16:19:25.5485475Z 
> ==
> 2020-11-24T16:19:25.5694497Z Picked up JAVA_TOOL_OPTIONS: 
> -XX:+HeapDumpOnOutOfMemoryError
> 2020-11-24T16:19:25.7263048Z 16192 surefirebooter5057948964630155904.jar
> 2020-11-24T16:19:25.7263515Z 18566 Jps
> 2020-11-24T16:19:25.7263709Z 959 Launcher
> 2020-11-24T16:19:25.7411148Z 
> ==
> 2020-11-24T16:19:25.7427013Z Printing stack trace of Java process 16192
> 2020-11-24T16:19:25.7427369Z 
> ==
> 2020-11-24T16:19:25.7484365Z Picked up JAVA_TOOL_OPTIONS: 
> -XX:+HeapDumpOnOutOfMemoryError
> 2020-11-24T16:19:26.0848776Z 2020-11-24 16:19:26
> 2020-11-24T16:19:26.0849578Z Full thread dump OpenJDK 64-Bit Server VM 
> (25.275-b01 mixed mode):
> 2020-11-24T16:19:26.0849831Z 
> 2020-11-24T16:19:26.0850185Z "Attach Listener" #32 daemon prio=9 os_prio=0 
> tid=0x7fc148001000 nid=0x48e7 waiting on condition [0x]
> 2020-11-24T16:19:26.0850595Zjava.lang.Thread.State: RUNNABLE
> 2020-11-24T16:19:26.0850814Z 
> 2020-11-24T16:19:26.0851375Z "testcontainers-ryuk" #31 daemon prio=5 
> os_prio=0 tid=0x7fc251232000 nid=0x3fb0 in Object.wait() 
> [0x7fc1012c4000]
> 2020-11-24T16:19:26.0854688Zjava.lang.Thread.State: TIMED_WAITING (on 
> object monitor)
> 2020-11-24T16:19:26.0855379Z  at java.lang.Object.wait(Native Method)
> 2020-11-24T16:19:26.0855844Z  at 
> org.testcontainers.utility.ResourceReaper.lambda$null$1(ResourceReaper.java:142)
> 2020-11-24T16:19:26.0857272Z  - locked <0x8e2bd2d0> (a 
> java.util.ArrayList)
> 2020-11-24T16:19:26.0857977Z  at 
> org.testcontainers.utility.ResourceReaper$$Lambda$93/1981729428.run(Unknown 
> Source)
> 2020-11-24T16:19:26.0858471Z  at 
> org.rnorth.ducttape.ratelimits.RateLimiter.doWhenReady(RateLimiter.java:27)
> 2020-11-24T16:19:26.0858961Z  at 
> org.testcontainers.utility.ResourceReaper.lambda$start$2(ResourceReaper.java:133)
> 2020-11-24T16:19:26.0859422Z  at 
> org.testcontainers.utility.ResourceReaper$$Lambda$92/40191541.run(Unknown 
> Source)
> 2020-11-24T16:19:26.0859788Z  at java.lang.Thread.run(Thread.java:748)
> 2020-11-24T16:19:26.0860030Z 
> 2020-11-24T16:19:26.0860371Z "process reaper" #24 daemon prio=10 os_prio=0 
> tid=0x7fc0f803b800 nid=0x3f92 waiting on condition [0x7fc10296e000]
> 2020-11-24T16:19:26.0860913Zjava.lang.Thread.State: TIMED_WAITING 
> (parking)
> 2020-11-24T16:19:26.0861387Z  at sun.misc.Unsafe.park(Native Method)
> 2020-11-24T16:19:26.0862495Z  - parking to wait for  <0x8814bf30> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
> 2020-11-24T16:19:26.0863253Z  at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> 2020-11-24T16:19:26.0863760Z  at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
> 2020-11-24T16:19:26.0864274Z  at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
> 2020-11-24T16:19:26.0864762Z  at 
> java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
> 2020-11-24T16:19:26.0865299Z  at 
> 

[GitHub] [flink] flinkbot edited a comment on pull request #15352: [FLINK-21581][core] Remove @PublicEvolving from RuntimeContext.getJobID

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15352:
URL: https://github.com/apache/flink/pull/15352#issuecomment-805496444


   
   ## CI report:
   
   * 8fa482cb4e2822a46447266e626b96d0a966d7b2 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15338)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] kezhuw commented on pull request #15331: [FLINK-21626][core] Make RuntimeContext.jobID non-optional

2021-03-23 Thread GitBox


kezhuw commented on pull request #15331:
URL: https://github.com/apache/flink/pull/15331#issuecomment-805506049


   @rkhachatryan @dawidwys Hi all, I have done my round. Could you take another look at this, @dawidwys?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on pull request #15173: [FLINK-21367][Connectors / JDBC] Support objectReuse in JDBC sink

2021-03-23 Thread GitBox


rkhachatryan commented on pull request #15173:
URL: https://github.com/apache/flink/pull/15173#issuecomment-805505956


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] kezhuw commented on a change in pull request #15273: [FLINK-21800][core] Guard MemorySegment against concurrent frees.

2021-03-23 Thread GitBox


kezhuw commented on a change in pull request #15273:
URL: https://github.com/apache/flink/pull/15273#discussion_r600180203



##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/MemorySegment.java
##
@@ -217,10 +222,14 @@ public int size() {
 /**
  * Checks whether the memory segment was freed.
  *
+ * This method internally involves cross-thread synchronization. Do not 
use for performance
+ * sensitive code paths.
+ *
  * @return true, if the memory segment has been freed, 
false otherwise.
  */
 public boolean isFreed() {
-return address > addressLimit;
+// in performance sensitive cases, use 'address > addressLimit' instead
+return isFreedAtomic.get();

Review comment:
   I have no objection to enforcing rules on memory types. The visible difference is minor: a concurrent free will fail for all memory segments.
   
   I guess it is the price of a fully aligned API unification for safe and unsafe memory segments. I think we could go with the current approach and see whether it introduces much overhead for streaming users. At a later stage, we could decide whether to drop the whole thing and compromise a bit on implementation or API semantics based on concrete cases/usages/benchmarks.
   
   Though, I still think it is good to guard the unsafe part even after all existing invalid cases are detected, especially since it is uncertain (FLINK-21798) whether the unsafe part could become "unmanaged". It is a safety net for that uncertain future; otherwise we would just keep rolling things back. That is why I think making it configurable is good.
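   To illustrate point `2` from the earlier comment (the cleaner making sure it runs exactly once), here is a minimal, hypothetical sketch; it does not reflect Flink's actual cleaner wiring:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Wraps the actual release action (e.g. the unsafe deallocation) so that it runs
// at most once, no matter how many callers race on it.
final class RunOnceCleaner implements Runnable {
    private final AtomicBoolean released = new AtomicBoolean(false);
    private final Runnable releaseAction;

    RunOnceCleaner(Runnable releaseAction) {
        this.releaseAction = releaseAction;
    }

    @Override
    public void run() {
        // only the first call performs the release; later calls are no-ops
        if (released.compareAndSet(false, true)) {
            releaseAction.run();
        }
    }
}
```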




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #15119: [FLINK-21736][state] Introduce state scope latency tracking metrics

2021-03-23 Thread GitBox


rkhachatryan commented on a change in pull request #15119:
URL: https://github.com/apache/flink/pull/15119#discussion_r600179518



##
File path: flink-runtime/pom.xml
##
@@ -56,6 +56,12 @@ under the License.
${project.version}

 
+   
+   io.dropwizard.metrics
+   metrics-core
+   ${dropwizard.version}
+   
+

Review comment:
   Could you explain why we can't or shouldn't use the Flink wrappers (like `org.apache.flink.metrics.Histogram`)?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #15119: [FLINK-21736][state] Introduce state scope latency tracking metrics

2021-03-23 Thread GitBox


rkhachatryan commented on a change in pull request #15119:
URL: https://github.com/apache/flink/pull/15119#discussion_r600176709



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/metrics/LatencyTrackingStateConfig.java
##
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state.metrics;
+
+import org.apache.flink.configuration.ReadableConfig;
+import org.apache.flink.configuration.StateBackendOptions;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.util.Preconditions;
+
+import java.io.Serializable;
+
+/** Config to create latency tracking state metric. */
+public class LatencyTrackingStateConfig {
+
+private final MetricGroup metricGroup;
+
+private final boolean enabled;
+private final int sampleInterval;
+private final long slidingWindow;
+
+LatencyTrackingStateConfig(
+MetricGroup metricGroup, boolean enabled, int sampleInterval, long 
slidingWindow) {
+this.metricGroup = metricGroup;
+this.enabled = enabled;
+this.sampleInterval = sampleInterval;
+this.slidingWindow = slidingWindow;
+}
+
+public boolean isEnabled() {
+return enabled;
+}
+
+public MetricGroup getMetricGroup() {
+return metricGroup;
+}
+
+public long getSlidingWindow() {
+return slidingWindow;
+}
+
+public int getSampleInterval() {
+return sampleInterval;
+}
+
+public static LatencyTrackingStateConfig disabled() {
+return newBuilder().setEnabled(false).build();
+}
+
+public static Builder newBuilder() {
+return new Builder();
+}
+
+public static class Builder implements Serializable {
+private static final long serialVersionUID = 1L;
+
+private boolean enabled = 
StateBackendOptions.LATENCY_TRACK_ENABLED.defaultValue();
+private int sampleInterval =
+
StateBackendOptions.LATENCY_TRACK_SAMPLE_INTERVAL.defaultValue();
+private long slidingWindow =
+
StateBackendOptions.LATENCY_TRACK_SLIDING_WINDOW.defaultValue();
+private MetricGroup metricGroup;
+
+public Builder setEnabled(boolean enabled) {
+this.enabled = enabled;
+return this;
+}
+
+public Builder setSampleInterval(int sampleInterval) {
+this.sampleInterval = sampleInterval;
+return this;
+}
+
+public Builder setSlidingWindow(long slidingWindow) {
+this.slidingWindow = slidingWindow;
+return this;
+}
+
+public Builder setMetricGroup(MetricGroup metricGroup) {
+this.metricGroup = metricGroup;
+return this;
+}
+
+public Builder configure(ReadableConfig config) {
+
this.setEnabled(config.get(StateBackendOptions.LATENCY_TRACK_ENABLED))
+.setSampleInterval(
+
config.get(StateBackendOptions.LATENCY_TRACK_SAMPLE_INTERVAL))
+
.setSlidingWindow(config.get(StateBackendOptions.LATENCY_TRACK_SLIDING_WINDOW));
+return this;
+}
+
+public LatencyTrackingStateConfig build() {
+if (enabled) {
+Preconditions.checkNotNull(
+metricGroup, "Metric group cannot be null if latency 
tracking is enabled.");
+Preconditions.checkArgument(sampleInterval >= 1);

Review comment:
   I personally don't see any reason why the sampling interval cannot be zero (the linked article doesn't discuss it), and therefore those descriptions would be misleading (I'm also fine with disallowing `0`).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15352: [FLINK-21581][core] Remove @PublicEvolving from RuntimeContext.getJobID

2021-03-23 Thread GitBox


flinkbot commented on pull request #15352:
URL: https://github.com/apache/flink/pull/15352#issuecomment-805496444


   
   ## CI report:
   
   * 8fa482cb4e2822a46447266e626b96d0a966d7b2 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15351: Only for debug some tests and trigger ci running

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15351:
URL: https://github.com/apache/flink/pull/15351#issuecomment-805475704


   
   ## CI report:
   
   * f1ebcc9ca219b38a038cc25f6e0a426c8609e9d9 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15334)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15331: [FLINK-21626][core] Make RuntimeContext.jobID non-optional

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15331:
URL: https://github.com/apache/flink/pull/15331#issuecomment-804297575


   
   ## CI report:
   
   * cbf4ab0e0683981a078a1b16e94d0406dc5ac8e9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15264)
 
   * d037e4b5d619af67de8d9d7387558f34ef378bb7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15319)
 
   * 21436b81133864ffebfb2450c4c6094055e7ba43 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15337)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15323: [FLINK-21857] StackOverflow for large parallelism jobs when processing EndOfChannelStateEvent

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15323:
URL: https://github.com/apache/flink/pull/15323#issuecomment-804037491


   
   ## CI report:
   
   * 501c55365989bb81e7713a4fb8f07029ddecdf2f Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15336)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15148: [FLINK-21731] Add benchmarks for DefaultScheduler's creation, scheduling and deploying

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15148:
URL: https://github.com/apache/flink/pull/15148#issuecomment-796572621


   
   ## CI report:
   
   * 25e912512943d29b15c8fa5a22adc5cf0e2294a1 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15226)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15188)
 
   * cd1977e701621bdd1f827dce3f0b463b100958c0 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15335)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #15119: [FLINK-21736][state] Introduce state scope latency tracking metrics

2021-03-23 Thread GitBox


rkhachatryan commented on a change in pull request #15119:
URL: https://github.com/apache/flink/pull/15119#discussion_r600171838



##
File path: 
flink-core/src/main/java/org/apache/flink/configuration/StateBackendOptions.java
##
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.configuration;
+
+import org.apache.flink.annotation.docs.Documentation;
+import org.apache.flink.api.common.time.Time;
+import org.apache.flink.configuration.description.Description;
+import org.apache.flink.configuration.description.TextElement;
+
+/** A collection of all configuration options that relate to state backend. */
+public class StateBackendOptions {
+
+// 
+//  general state backend options
+// 
+
+/**
+ * The checkpoint storage used to store operator state locally within the 
cluster during
+ * execution.
+ *
+ * The implementation can be specified either via their shortcut name, 
or via the class name
+ * of a {@code StateBackendFactory}. If a StateBackendFactory class name 
is specified, the
+ * factory is instantiated (via its zero-argument constructor) and its 
{@code
+ * StateBackendFactory#createFromConfig(ReadableConfig, ClassLoader)} 
method is called.
+ *
+ * Recognized shortcut names are 'hashmap' and 'rocksdb'.
+ */
+@Documentation.Section(value = 
Documentation.Sections.COMMON_STATE_BACKENDS, position = 1)
+public static final ConfigOption STATE_BACKEND =
+ConfigOptions.key("state.backend")
+.stringType()
+.noDefaultValue()
+.withDescription(
+Description.builder()
+.text("The state backend to be used to 
store state.")
+.linebreak()
+.text(
+"The implementation can be 
specified either via their shortcut "
++ " name, or via the class 
name of a %s. "
++ "If a factory is 
specified it is instantiated via its "
++ "zero argument 
constructor and its %s "
++ "method is called.",
+
TextElement.code("StateBackendFactory"),
+TextElement.code(
+
"StateBackendFactory#createFromConfig(ReadableConfig, ClassLoader)"))
+.linebreak()
+.text("Recognized shortcut names are 
'hashmap' and 'rocksdb'.")
+.build());
+
+@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
+public static final ConfigOption LATENCY_TRACK_ENABLED =
+ConfigOptions.key("state.backend.latency-track-enabled")
+.booleanType()
+.defaultValue(false)
+.withDescription(
+"Whether to track latency of state operations, e.g 
value state put/get/clear.");
+
+@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
+public static final ConfigOption LATENCY_TRACK_SAMPLE_INTERVAL =
+ConfigOptions.key("state.backend.latency-track-sample-interval")
+.intType()
+.defaultValue(100)

Review comment:
   I see.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong commented on a change in pull request #15265: [FLINK-21836][table-api] Introduce ParseStrategyParser

2021-03-23 Thread GitBox


wuchong commented on a change in pull request #15265:
URL: https://github.com/apache/flink/pull/15265#discussion_r600144592



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/parse/ParseStrategyParser.java
##
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.parse;
+
+import org.apache.flink.table.api.TableException;
+import org.apache.flink.table.operations.Operation;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+/** Parser that uses {@link StatementParseStrategy} to parse statement to 
{@link Operation}. */
+public class ParseStrategyParser {

Review comment:
   What do you think about renaming it to `ExtendedParser` and also moving `CalciteParser` into the same package?
   
   
   ```suggestion
   /** {@link ExtendedParser} is used for parsing some special commands which can't be supported by {@link CalciteParser}, e.g. {@code SET key=value} contains special characters in the key and value identifiers. It's also good to move some parsing here to avoid introducing new reserved keywords. */
   public class ParseStrategyParser {
   ```

##
File path: 
flink-table/flink-table-planner-blink/src/test/java/org/apache/flink/table/planner/delegation/ParserImplTest.java
##
@@ -0,0 +1,177 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.delegation;
+
+import org.apache.flink.table.api.SqlParserException;
+import org.apache.flink.table.api.TableConfig;
+import org.apache.flink.table.catalog.Catalog;
+import org.apache.flink.table.catalog.CatalogManager;
+import org.apache.flink.table.catalog.FunctionCatalog;
+import org.apache.flink.table.catalog.GenericInMemoryCatalog;
+import org.apache.flink.table.delegation.Parser;
+import org.apache.flink.table.module.ModuleManager;
+import org.apache.flink.table.operations.Operation;
+import org.apache.flink.table.operations.command.ClearOperation;
+import org.apache.flink.table.operations.command.HelpOperation;
+import org.apache.flink.table.operations.command.QuitOperation;
+import org.apache.flink.table.operations.command.ResetOperation;
+import org.apache.flink.table.operations.command.SetOperation;
+import org.apache.flink.table.operations.command.SourceOperation;
+import org.apache.flink.table.planner.calcite.FlinkPlannerImpl;
+import org.apache.flink.table.planner.catalog.CatalogManagerCalciteSchema;
+import org.apache.flink.table.utils.CatalogManagerMocks;
+
+import org.hamcrest.Matcher;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.function.Supplier;
+
+import static org.apache.calcite.jdbc.CalciteSchemaBuilder.asRootSchema;
+import static org.hamcrest.CoreMatchers.instanceOf;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertThat;
+
+/** Test for {@link ParserImpl}. */
+public class ParserImplTest {
+
+@Rule public ExpectedException thrown = ExpectedException.none();
+
+private final boolean isStreamingMode = false;
+private final TableConfig tableConfig = new TableConfig();
+private final Catalog catalog = new GenericInMemoryCatalog("MockCatalog", 
"default");
+private final 

[GitHub] [flink] flinkbot commented on pull request #15352: [FLINK-21581][core] Remove @PublicEvolving from RuntimeContext.getJobID

2021-03-23 Thread GitBox


flinkbot commented on pull request #15352:
URL: https://github.com/apache/flink/pull/15352#issuecomment-805486235


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 8fa482cb4e2822a46447266e626b96d0a966d7b2 (Wed Mar 24 
04:41:20 UTC 2021)
   
   **Warnings:**
* **1 pom.xml files were touched**: Check for build and licensing issues.
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15351: Only for debug some tests and trigger ci running

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15351:
URL: https://github.com/apache/flink/pull/15351#issuecomment-805475704


   
   ## CI report:
   
   * f1ebcc9ca219b38a038cc25f6e0a426c8609e9d9 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15334)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15331: [FLINK-21626][core] Make RuntimeContext.jobID non-optional

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15331:
URL: https://github.com/apache/flink/pull/15331#issuecomment-804297575


   
   ## CI report:
   
   * cbf4ab0e0683981a078a1b16e94d0406dc5ac8e9 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15264)
 
   * d037e4b5d619af67de8d9d7387558f34ef378bb7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15319)
 
   * 21436b81133864ffebfb2450c4c6094055e7ba43 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan opened a new pull request #15352: [FLINK-21581][core] Remove @PublicEvolving from RuntimeContext.getJobID

2021-03-23 Thread GitBox


rkhachatryan opened a new pull request #15352:
URL: https://github.com/apache/flink/pull/15352


   ## What is the purpose of the change
   
   Remove @PublicEvolving from RuntimeContext.getJobID
   to make the API more consistent.
   
   The method is added to japicmp ignore list.
   
   ## Verifying this change
   
   This change is a trivial rework without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15148: [FLINK-21731] Add benchmarks for DefaultScheduler's creation, scheduling and deploying

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15148:
URL: https://github.com/apache/flink/pull/15148#issuecomment-796572621


   
   ## CI report:
   
   * 25e912512943d29b15c8fa5a22adc5cf0e2294a1 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15226)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15188)
 
   * cd1977e701621bdd1f827dce3f0b463b100958c0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21935) Remove "state.backend.async" option.

2021-03-23 Thread Yu Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307562#comment-17307562
 ] 

Yu Li commented on FLINK-21935:
---

+1 to remove the option and the synchronous snapshot support for the heap backend. Shall we send a notice to our user mailing list, or is a release note enough? Since the default value of {{state.backend.async}} is {{true}}, I believe the use of synchronous snapshots is pretty rare.

I can see more related work to do, such as deprecating {{NestedMapsStateTable}} and doing clean-ups in the next release (or maybe we could carry this out in this release, since if we remove the option there is no way to use it anyway?).

I will be ready to do the PR review, just let me know (smile)

> Remove "state.backend.async" option.
> 
>
> Key: FLINK-21935
> URL: https://issues.apache.org/jira/browse/FLINK-21935
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Stephan Ewen
>Priority: Blocker
> Fix For: 1.13.0
>
>
> Checkpoints are always asynchronous, there is no case ever for a synchronous 
> checkpoint.
> The RocksDB state backend doesn't even support synchronous snapshots, and the 
> HashMap Heap backend also has no good use case for synchronous snapshots 
> (other than a very minor reduction in heap objects).
> Most importantly, we should not expose this option in the constructors of the 
> new state backend API classes, like {{HashMapStateBackend}}. 
> I marked this a blocker because it is part of the new user-facing State 
> Backend API and I would like to avoid that this option enters this API and 
> causes confusion when we eventually remove it.
> /cc [~sjwiesman] and [~liyu]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21570) Add Job ID to RuntimeContext

2021-03-23 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307561#comment-17307561
 ] 

Roman Khachatryan commented on FLINK-21570:
---

Thanks for raising these concerns. I've reopened FLINK-21581 to ignore the 
method in japicmp.

> Add Job ID to RuntimeContext
> 
>
> Key: FLINK-21570
> URL: https://issues.apache.org/jira/browse/FLINK-21570
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> (the issue added retroactively after the PR was merged for reference)
>  
> There are some cases (e.g. 
> [1|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-get-flink-JobId-in-runtime-td36756.html],
>  2) when job ID needs to be accessed from the user code (function).
>  Existing workarounds don't look clean (or reliable).
>  
> One solution discussed offline is to add {{Optional}} to the 
> {{RuntimeContext}} (the latter already contains some information of the same 
> level, such as subtask index).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (FLINK-21581) Add PublicEvolving to RuntimeContext.jobId

2021-03-23 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan reopened FLINK-21581:
---

Reopening to address concerns raised in 
https://issues.apache.org/jira/browse/FLINK-21570?focusedCommentId=17306270=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17306270

> Add PublicEvolving to RuntimeContext.jobId
> --
>
> Key: FLINK-21581
> URL: https://issues.apache.org/jira/browse/FLINK-21581
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> The JobID added in FLINK-21570 is 1) a breaking change; 2) may be changed after DataSet is removed.
>  
> So PublicEvolving should be added.
>  
> cc: [~chesnay]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] rkhachatryan commented on a change in pull request #15331: [FLINK-21626][core] Make RuntimeContext.jobID non-optional

2021-03-23 Thread GitBox


rkhachatryan commented on a change in pull request #15331:
URL: https://github.com/apache/flink/pull/15331#discussion_r600153326



##
File path: 
flink-core/src/main/java/org/apache/flink/api/common/operators/CollectionExecutor.java
##
@@ -110,7 +111,7 @@ public JobExecutionResult execute(Plan program) throws 
Exception {
 initCache(program.getCachedFiles());
 Collection> sinks = 
program.getDataSinks();
 for (Operator sink : sinks) {
-execute(sink);
+execute(sink, program.getJobId());
 }

Review comment:
   Sorry, I misplaced this fallback, you're right.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15351: Only for debug some tests and trigger ci running

2021-03-23 Thread GitBox


flinkbot commented on pull request #15351:
URL: https://github.com/apache/flink/pull/15351#issuecomment-805475704


   
   ## CI report:
   
   * f1ebcc9ca219b38a038cc25f6e0a426c8609e9d9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15332: [FLINK-21701][sql-client] Extend the "RESET" syntax in the SQL Client

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15332:
URL: https://github.com/apache/flink/pull/15332#issuecomment-804297721


   
   ## CI report:
   
   * 618f95c70e6f8e6fffac1171ff97856c8c6d38fb Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15259)
 
   * f42ce79779ba1596232890890122d0ac14428c23 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15333)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15273: [FLINK-21800][core] Guard MemorySegment against concurrent frees.

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15273:
URL: https://github.com/apache/flink/pull/15273#issuecomment-802508808


   
   ## CI report:
   
   * d25b8d14546059db78a2ae8d391043604b729be8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15237)
 
   * 60aa542080e1857552ba5cd7e9cbd39510d273fc Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15332)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-21940) Rowtime/proctime should be obtained from getTimestamp instead of getLong

2021-03-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-21940:

Component/s: Table SQL / Planner

> Rowtime/proctime should be obtained from getTimestamp instead of getLong
> 
>
> Key: FLINK-21940
> URL: https://issues.apache.org/jira/browse/FLINK-21940
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
> Fix For: 1.13.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21940) Rowtime/proctime should be obtained from getTimestamp instead of getLong

2021-03-23 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307549#comment-17307549
 ] 

Jark Wu commented on FLINK-21940:
-

+1

> Rowtime/proctime should be obtained from getTimestamp instead of getLong
> 
>
> Key: FLINK-21940
> URL: https://issues.apache.org/jira/browse/FLINK-21940
> Project: Flink
>  Issue Type: Improvement
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
> Fix For: 1.13.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21898) JobRetrievalITCase crash

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307548#comment-17307548
 ] 

Guowei Ma commented on FLINK-21898:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15316=logs=8fd9202e-fd17-5b26-353c-ac1ff76c8f28=a0a633b8-47ef-5c5a-2806-3c13b9e48228=4440

> JobRetrievalITCase crash
> 
>
> Key: FLINK-21898
> URL: https://issues.apache.org/jira/browse/FLINK-21898
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15083=logs=8fd9202e-fd17-5b26-353c-ac1ff76c8f28=a0a633b8-47ef-5c5a-2806-3c13b9e48228=4383



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21745) JobMasterTest.testReconnectionAfterDisconnect hangs on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307547#comment-17307547
 ] 

Guowei Ma commented on FLINK-21745:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15316=logs=c91190b6-40ae-57b2-5999-31b869b0a7c1=43529380-51b4-5e90-5af4-2dccec0ef402=13547

> JobMasterTest.testReconnectionAfterDisconnect hangs on azure
> 
>
> Key: FLINK-21745
> URL: https://issues.apache.org/jira/browse/FLINK-21745
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Matthias
>Priority: Major
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=14500=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7030a106-e977-5851-a05e-535de648c9c9=8884
> {code}
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21745) JobMasterTest.testReconnectionAfterDisconnect hangs on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307546#comment-17307546
 ] 

Guowei Ma commented on FLINK-21745:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15316=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7030a106-e977-5851-a05e-535de648c9c9=9378

> JobMasterTest.testReconnectionAfterDisconnect hangs on azure
> 
>
> Key: FLINK-21745
> URL: https://issues.apache.org/jira/browse/FLINK-21745
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Matthias
>Priority: Major
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=14500=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7030a106-e977-5851-a05e-535de648c9c9=8884
> {code}
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21895) IncrementalAggregateJsonPlanTest.testIncrementalAggregate fail

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307545#comment-17307545
 ] 

Guowei Ma commented on FLINK-21895:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15316=logs=f66801b3-5d8b-58b4-03aa-cc67e0663d23=1abe556e-1530-599d-b2c7-b8c00d549e53=6795

> IncrementalAggregateJsonPlanTest.testIncrementalAggregate fail
> --
>
> Key: FLINK-21895
> URL: https://issues.apache.org/jira/browse/FLINK-21895
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15083=logs=f66801b3-5d8b-58b4-03aa-cc67e0663d23=1abe556e-1530-599d-b2c7-b8c00d549e53=6303
> {code:java}
> org.junit.ComparisonFailure: 
> 

[jira] [Commented] (FLINK-21896) GroupAggregateJsonPlanTest.testDistinctAggCalls fail

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307544#comment-17307544
 ] 

Guowei Ma commented on FLINK-21896:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15316=logs=f66801b3-5d8b-58b4-03aa-cc67e0663d23=1abe556e-1530-599d-b2c7-b8c00d549e53=6795

> GroupAggregateJsonPlanTest.testDistinctAggCalls fail
> 
>
> Key: FLINK-21896
> URL: https://issues.apache.org/jira/browse/FLINK-21896
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15083=logs=f66801b3-5d8b-58b4-03aa-cc67e0663d23=1abe556e-1530-599d-b2c7-b8c00d549e53=6364
> {code:java}
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.stream.GroupAggregateJsonPlanTest.testDistinctAggCalls(GroupAggregateJsonPlanTest.java:148)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21416) FileBufferReaderITCase.testSequentialReading fails on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307543#comment-17307543
 ] 

Guowei Ma commented on FLINK-21416:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15316=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=7c61167f-30b3-5893-cc38-a9e3d057e392=8425

> FileBufferReaderITCase.testSequentialReading fails on azure
> ---
>
> Key: FLINK-21416
> URL: https://issues.apache.org/jira/browse/FLINK-21416
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.13.0
>Reporter: Dawid Wysakowicz
>Assignee: Yun Gao
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13473=logs=59c257d0-c525-593b-261d-e96a86f1926b=b93980e3-753f-5433-6a19-13747adae66a
> {code}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
>   at 
> org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:811)
>   at 
> org.apache.flink.runtime.io.network.partition.FileBufferReaderITCase.testSequentialReading(FileBufferReaderITCase.java:128)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by 
> NoRestartBackoffTimeStrategy
>   at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:117)
>   at 
> 

[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307539#comment-17307539
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/24/21, 4:00 AM:
-

on branch 1.12
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15277=results


was (Author: maguowei):
on brach 1.12
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15277=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305801#comment-17305801
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/24/21, 3:59 AM:
-

on branch 1.11
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15084=results


was (Author: maguowei):
on brach 1.11
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15084=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305912#comment-17305912
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/24/21, 3:59 AM:
-

on branch 1.12
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15139=results


was (Author: maguowei):
on brach 1.12
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15139=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21103) E2e tests time out on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307542#comment-17307542
 ] 

Guowei Ma commented on FLINK-21103:
---

on branch 1.13
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15272=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21942) KubernetesLeaderRetrievalDriver not closed after terminated which lead to connection leak

2021-03-23 Thread Yi Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Tang updated FLINK-21942:

Description: 
Looks like KubernetesLeaderRetrievalDriver is not closed even if the 
KubernetesLeaderElectionDriver is closed and the job has reached a globally 
terminal state. This leaves many configmap watches active, each holding a 
connection to K8s.

When the connections exceed the max concurrent requests, new configmap watches 
cannot be started, which finally causes all newly submitted jobs to time out.

[~fly_in_gis] [~trohrmann] This may be related to FLINK-20695, could you 
confirm this issue?
But when many jobs are running in the same session cluster, the config map 
watching is required to stay active. Maybe we should merge the watching of all 
config maps?

  was:
Looks like KubernetesLeaderRetrievalDriver is not closed even if the 
KubernetesLeaderElectionDriver is closed and the job has reached a globally 
terminal state. This leaves many configmap watches active, each holding a 
connection to K8s.

When the connections exceed the max concurrent requests, new configmap watches 
cannot be started, which finally causes all newly submitted jobs to time out.

[~fly_in_gis] [~trohrmann] This may be related to FLINK-20695, could you 
confirm this issue?
But when many jobs are running in the same session cluster, the config map is 
required to stay active. Maybe we should merge the watching of all config maps?


> KubernetesLeaderRetrievalDriver not closed after terminated which lead to 
> connection leak
> -
>
> Key: FLINK-21942
> URL: https://issues.apache.org/jira/browse/FLINK-21942
> Project: Flink
>  Issue Type: Bug
>Reporter: Yi Tang
>Priority: Major
>
> Looks like KubernetesLeaderRetrievalDriver is not closed even if the 
> KubernetesLeaderElectionDriver is closed and the job has reached a globally 
> terminal state. This leaves many configmap watches active, each holding a 
> connection to K8s.
> When the connections exceed the max concurrent requests, new configmap 
> watches cannot be started, which finally causes all newly submitted jobs to 
> time out.
> [~fly_in_gis] [~trohrmann] This may be related to FLINK-20695, could you 
> confirm this issue?
> But when many jobs are running in the same session cluster, the config map 
> watching is required to stay active. Maybe we should merge the watching of 
> all config maps?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21942) KubernetesLeaderRetrievalDriver not closed after terminated which lead to connection leak

2021-03-23 Thread Yi Tang (Jira)
Yi Tang created FLINK-21942:
---

 Summary: KubernetesLeaderRetrievalDriver not closed after 
terminated which lead to connection leak
 Key: FLINK-21942
 URL: https://issues.apache.org/jira/browse/FLINK-21942
 Project: Flink
  Issue Type: Bug
Reporter: Yi Tang


Looks like KubernetesLeaderRetrievalDriver is not closed even if the 
KubernetesLeaderElectionDriver is closed and the job has reached a globally 
terminal state. This leaves many configmap watches active, each holding a 
connection to K8s.

When the connections exceed the max concurrent requests, new configmap watches 
cannot be started, which finally causes all newly submitted jobs to time out.

[~fly_in_gis] [~trohrmann] This may be related to FLINK-20695, could you 
confirm this issue?
But when many jobs are running in the same session cluster, the config map is 
required to stay active. Maybe we should merge the watching of all config maps?
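
A purely illustrative sketch of the leak pattern described above; the class names 
are hypothetical stand-ins, not Flink's actual internals. The point is the 
lifecycle: each retrieval driver owns a configmap watch, and if nothing closes the 
driver once the job terminates, the watch and its connection to the API server stay 
open.

{code:java}
// Hypothetical stand-ins used only to illustrate the missing close() call.
final class ConfigMapWatch implements AutoCloseable {
    @Override
    public void close() {
        // Stop watching and release the HTTP connection to the K8s API server.
    }
}

final class LeaderRetrievalHandle implements AutoCloseable {
    private final ConfigMapWatch watch = new ConfigMapWatch();

    @Override
    public void close() {
        // Must be invoked when the owning job reaches a globally terminal state;
        // otherwise every finished job leaves one watch (one connection) behind
        // until the client's concurrent-request limit is exhausted.
        watch.close();
    }
}
{code}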



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21941) testSavepointRescalingOutPartitionedOperatorStateList fail

2021-03-23 Thread Guowei Ma (Jira)
Guowei Ma created FLINK-21941:
-

 Summary: testSavepointRescalingOutPartitionedOperatorStateList fail
 Key: FLINK-21941
 URL: https://issues.apache.org/jira/browse/FLINK-21941
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing, Runtime / State Backends
Affects Versions: 1.12.2
Reporter: Guowei Ma


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15277=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=f508e270-48d6-5f1e-3138-42a17e0714f0=4117

{code:java}
[ERROR] testSavepointRescalingOutPartitionedOperatorStateList[backend = 
filesystem](org.apache.flink.test.checkpointing.RescalingITCase)  Time elapsed: 
1.756 s  <<< ERROR!
java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find 
Flink job (8c806e07748f451ef6927883a33d3b5e)
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingPartitionedOperatorState(RescalingITCase.java:529)
at 
org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingOutPartitionedOperatorStateList(RescalingITCase.java:476)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-21103) E2e tests time out on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307539#comment-17307539
 ] 

Guowei Ma edited comment on FLINK-21103 at 3/24/21, 3:54 AM:
-

on brach 1.12
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15277=results


was (Author: maguowei):
on brach 1.11
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15277=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21103) E2e tests time out on azure

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307539#comment-17307539
 ] 

Guowei Ma commented on FLINK-21103:
---

on brach 1.11
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15277=results

> E2e tests time out on azure
> ---
>
> Key: FLINK-21103
> URL: https://issues.apache.org/jira/browse/FLINK-21103
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Tests
>Affects Versions: 1.11.3, 1.12.1, 1.13.0
>Reporter: Dawid Wysakowicz
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12377=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529
> {code}
> Creating worker2 ... done
> Jan 22 13:16:17 Waiting for hadoop cluster to come up. We have been trying 
> for 0 seconds, retrying ...
> Jan 22 13:16:22 Waiting for hadoop cluster to come up. We have been trying 
> for 5 seconds, retrying ...
> Jan 22 13:16:27 Waiting for hadoop cluster to come up. We have been trying 
> for 10 seconds, retrying ...
> Jan 22 13:16:32 Waiting for hadoop cluster to come up. We have been trying 
> for 15 seconds, retrying ...
> Jan 22 13:16:37 Waiting for hadoop cluster to come up. We have been trying 
> for 20 seconds, retrying ...
> Jan 22 13:16:43 Waiting for hadoop cluster to come up. We have been trying 
> for 26 seconds, retrying ...
> Jan 22 13:16:48 Waiting for hadoop cluster to come up. We have been trying 
> for 31 seconds, retrying ...
> Jan 22 13:16:53 Waiting for hadoop cluster to come up. We have been trying 
> for 36 seconds, retrying ...
> Jan 22 13:16:58 Waiting for hadoop cluster to come up. We have been trying 
> for 41 seconds, retrying ...
> Jan 22 13:17:03 Waiting for hadoop cluster to come up. We have been trying 
> for 46 seconds, retrying ...
> Jan 22 13:17:08 We only have 0 NodeManagers up. We have been trying for 0 
> seconds, retrying ...
> 21/01/22 13:17:10 INFO client.RMProxy: Connecting to ResourceManager at 
> master.docker-hadoop-cluster-network/172.19.0.3:8032
> 21/01/22 13:17:11 INFO client.AHSProxy: Connecting to Application History 
> server at master.docker-hadoop-cluster-network/172.19.0.3:10200
> Jan 22 13:17:11 We now have 2 NodeManagers up.
> ==
> === WARNING: This E2E Run took already 80% of the allocated time budget of 
> 250 minutes ===
> ==
> ==
> === WARNING: This E2E Run will time out in the next few minutes. Starting to 
> upload the log output ===
> ==
> ##[error]The task has timed out.
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.0' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.1' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Async Command Start: Upload Artifact
> Uploading 1 files
> File upload succeed.
> Upload '/tmp/_e2e_watchdog.output.2' to file container: 
> '#/11824779/e2e-timeout-logs'
> Associated artifact 140921 with build 12377
> Async Command End: Upload Artifact
> Finishing: Run e2e tests
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21235) leaderChange_withBlockingJobManagerTermination_doesNotAffectNewLeader hang

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307533#comment-17307533
 ] 

Guowei Ma commented on FLINK-21235:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15283=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=4ed44b66-cdd6-5dcf-5f6a-88b07dda665d=7619

> leaderChange_withBlockingJobManagerTermination_doesNotAffectNewLeader hang
> --
>
> Key: FLINK-21235
> URL: https://issues.apache.org/jira/browse/FLINK-21235
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.11.3
>Reporter: Guowei Ma
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.13.0
>
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=12759=logs=3b6ec2fd-a816-5e75-c775-06fb87cb6670=b33fdd4f-3de5-542e-2624-5d53167bb672]
> {code:java}
>   at 
> java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
>   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at 
> org.apache.flink.util.AutoCloseableAsync.close(AutoCloseableAsync.java:36)
>   at 
> org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherRunnerITCase.leaderChange_withBlockingJobManagerTermination_doesNotAffectNewLeader(DefaultDispatcherRunnerITCase.java:211)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithReru{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21940) Rowtime/proctime should be obtained from getTimestamp instead of getLong

2021-03-23 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-21940:


 Summary: Rowtime/proctime should be obtained from getTimestamp 
instead of getLong
 Key: FLINK-21940
 URL: https://issues.apache.org/jira/browse/FLINK-21940
 Project: Flink
  Issue Type: Improvement
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: 1.13.0
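
A hedged sketch (not the planner's actual code) of what the summary refers to: a 
TIMESTAMP(3) rowtime/proctime column in a RowData record is backed by 
TimestampData, so it should be read via getTimestamp(pos, precision) rather than 
getLong(pos).

{code:java}
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.TimestampData;

final class RowtimeAccess {

    /** Reads the time attribute at {@code pos} as epoch milliseconds. */
    static long rowtimeMillis(RowData row, int pos, int precision) {
        TimestampData ts = row.getTimestamp(pos, precision); // correct accessor
        return ts.getMillisecond();
        // row.getLong(pos) is what the issue says should be avoided here.
    }
}
{code}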






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #15351: Only for debug some tests and trigger ci running

2021-03-23 Thread GitBox


flinkbot commented on pull request #15351:
URL: https://github.com/apache/flink/pull/15351#issuecomment-805463642


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit f1ebcc9ca219b38a038cc25f6e0a426c8609e9d9 (Wed Mar 24 
03:42:42 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **Invalid pull request title: No valid Jira ID provided**
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer of PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication

2021-03-23 Thread Guowei Ma (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guowei Ma updated FLINK-20461:
--
Affects Version/s: 1.13.0

> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.3, 1.12.0, 1.13.0
>Reporter: Huang Xingbo
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: testability
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10450=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf]
> {code:java}
> [ERROR] 
> testPerJobModeWithDefaultFileReplication(org.apache.flink.yarn.YARNFileReplicationITCase)
>  Time elapsed: 32.501 s <<< ERROR! java.io.FileNotFoundException: File does 
> not exist: 
> hdfs://localhost:46072/user/agent04_azpcontainer/.flink/application_1606950278664_0001/flink-dist_2.11-1.12-SNAPSHOT.jar
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434)
>  at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434)
>  at 
> org.apache.flink.yarn.YARNFileReplicationITCase.extraVerification(YARNFileReplicationITCase.java:148)
>  at 
> org.apache.flink.yarn.YARNFileReplicationITCase.deployPerJob(YARNFileReplicationITCase.java:113)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15332: [FLINK-21701][sql-client] Extend the "RESET" syntax in the SQL Client

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15332:
URL: https://github.com/apache/flink/pull/15332#issuecomment-804297721


   
   ## CI report:
   
   * 618f95c70e6f8e6fffac1171ff97856c8c6d38fb Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15259)
 
   * f42ce79779ba1596232890890122d0ac14428c23 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15273: [FLINK-21800][core] Guard MemorySegment against concurrent frees.

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15273:
URL: https://github.com/apache/flink/pull/15273#issuecomment-802508808


   
   ## CI report:
   
   * d25b8d14546059db78a2ae8d391043604b729be8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15237)
 
   * 60aa542080e1857552ba5cd7e9cbd39510d273fc UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20461) YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307532#comment-17307532
 ] 

Guowei Ma commented on FLINK-20461:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15302=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf=28982


{code:java}
[ERROR] 
testPerJobModeWithDefaultFileReplication(org.apache.flink.yarn.YARNFileReplicationITCase)
  Time elapsed: 26.797 s  <<< ERROR!
java.io.FileNotFoundException: File does not exist: 
hdfs://localhost:40564/user/agent07_azpcontainer/.flink/application_1616528383248_0001/flink-dist_2.11-1.13-SNAPSHOT.jar
at 
org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434)
at 
org.apache.flink.yarn.YARNFileReplicationITCase.extraVerification(YARNFileReplicationITCase.java:165)
at 
org.apache.flink.yarn.YARNFileReplicationITCase.deployPerJob(YARNFileReplicationITCase.java:128)
at 
org.apache.flink.yarn.YARNFileReplicationITCase.lambda$testPerJobModeWithDefaultFileReplication$1(YARNFileReplicationITCase.java:78)
at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:286)
at 
org.apache.flink.yarn.YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication(YARNFileReplicationITCase.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)


{code}


> YARNFileReplicationITCase.testPerJobModeWithDefaultFileReplication
> --
>
> Key: FLINK-20461
> URL: https://issues.apache.org/jira/browse/FLINK-20461
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.3, 1.12.0
>Reporter: Huang Xingbo
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: testability
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10450=logs=fc5181b0-e452-5c8f-68de-1097947f6483=62110053-334f-5295-a0ab-80dd7e2babbf]
> {code:java}
> [ERROR] 
> testPerJobModeWithDefaultFileReplication(org.apache.flink.yarn.YARNFileReplicationITCase)
>  Time elapsed: 32.501 s <<< ERROR! java.io.FileNotFoundException: File does 
> not exist: 
> hdfs://localhost:46072/user/agent04_azpcontainer/.flink/application_1606950278664_0001/flink-dist_2.11-1.12-SNAPSHOT.jar
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1441)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1434)
>  at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1434)
>  at 
> 

[GitHub] [flink] flinkbot edited a comment on pull request #15119: [FLINK-21736][state] Introduce state scope latency tracking metrics

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15119:
URL: https://github.com/apache/flink/pull/15119#issuecomment-792934307


   
   ## CI report:
   
   * f2de7e07c66e6dda2b4caef75e2f1dfde9f0e5c3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15330)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15301)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Thesharing commented on a change in pull request #15310: [FLINK-21330] Optimize the performance of PipelinedRegionSchedulingStrategy

2021-03-23 Thread GitBox


Thesharing commented on a change in pull request #15310:
URL: https://github.com/apache/flink/pull/15310#discussion_r600139542



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/strategy/PipelinedRegionSchedulingStrategy.java
##
@@ -49,12 +51,12 @@
 
 private final DeploymentOption deploymentOption = new 
DeploymentOption(false);
 
-/** Result partitions are correlated if they have the same result id. */
-private final Map>
-correlatedResultPartitions = new HashMap<>();
+/** ConsumedPartitionGroups are correlated if they have the same result 
id. */
+private final Map>
+correlatedResultPartitionGroups = new HashMap<>();

Review comment:
   correlatedResultPartitionGroups --> correlatedConsumedPartitionGroups




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii opened a new pull request #15351: Only for debug some tests and trigger ci running

2021-03-23 Thread GitBox


gaoyunhaii opened a new pull request #15351:
URL: https://github.com/apache/flink/pull/15351


   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluser with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't 
know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wsry commented on a change in pull request #15199: [FLINK-20740][network] Introduce a separated buffer pool and a separated thread pool for sort-merge blocking shuffle

2021-03-23 Thread GitBox


wsry commented on a change in pull request #15199:
URL: https://github.com/apache/flink/pull/15199#discussion_r600139401



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/disk/BatchShuffleReadBufferPool.java
##
@@ -0,0 +1,353 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.disk;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.configuration.TaskManagerOptions;
+import org.apache.flink.core.memory.MemorySegment;
+import org.apache.flink.core.memory.MemorySegmentFactory;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.annotation.concurrent.GuardedBy;
+
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Queue;
+import java.util.concurrent.TimeoutException;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+import static org.apache.flink.util.Preconditions.checkNotNull;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/**
+ * A fixed-size {@link MemorySegment} pool used by batch shuffle for shuffle 
data read (currently
+ * only used by sort-merge blocking shuffle).
+ */
+@Internal
+public class BatchShuffleReadBufferPool {
+
+private static final Logger LOG = 
LoggerFactory.getLogger(BatchShuffleReadBufferPool.class);
+
+/** Minimum total memory size in bytes of this buffer pool. */
+public static final int MIN_TOTAL_BYTES = 32 * 1024 * 1024;
+
+/**
+ * Memory size in bytes can be allocated from this buffer pool for a 
single request (8M is for
+ * better sequential read).
+ */
+public static final int NUM_BYTES_PER_REQUEST = 8 * 1024 * 1024;
+
+/** Total direct memory size in bytes that can be allocated and used by 
this buffer pool. */
+private final long totalBytes;
+
+/**
+ * Maximum time to wait in milliseconds when requesting read buffers from 
this buffer pool
+ * before throwing an exception.
+ */
+private final long requestTimeout;
+
+/** The number of total buffers in this buffer pool. */
+private final int numTotalBuffers;
+
+/** Size of each buffer in bytes in this buffer pool. */
+private final int bufferSize;
+
+/** The number of buffers to be returned for a single request. */
+private final int numBuffersPerRequest;
+
+/**
+ * The maximum number of buffers can be allocated from this buffer pool 
for a single buffer
+ * requester.
+ */
+private final int maxBuffersPerRequester;
+
+/** All available buffers in this buffer pool currently. */
+@GuardedBy("buffers")
+private final Queue buffers = new ArrayDeque<>();
+
+/** Account for all the buffers requested per requester. */
+@GuardedBy("buffers")
+private final Map numBuffersAllocated = new HashMap<>();
+
+/** Whether this buffer pool has been destroyed or not. */
+@GuardedBy("buffers")
+private boolean destroyed;
+
+/** Whether this buffer pool has been initialized or not. */
+@GuardedBy("buffers")
+private boolean initialized;
+
+public BatchShuffleReadBufferPool(long totalBytes, int bufferSize) {
+// 5 min default buffer request timeout
+this(totalBytes, bufferSize, 5 * 60 * 1000);
+}
+
+public BatchShuffleReadBufferPool(long totalBytes, int bufferSize, long 
requestTimeout) {
+checkArgument(totalBytes > 0, "Total memory size must be positive.");
+checkArgument(bufferSize > 0, "Size of buffer must be positive.");
+checkArgument(requestTimeout > 0, "Request timeout must be positive.");
+
+this.totalBytes = totalBytes;
+this.bufferSize = bufferSize;
+this.requestTimeout = requestTimeout;
+
+this.numTotalBuffers = (int) Math.min(totalBytes / bufferSize, 
Integer.MAX_VALUE);

Review comment:
   I totally agree we should not enforce that and there is no such 
enforcement in the latest version.
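
   As a side note for readers of this hunk, a small worked example of the sizing 
arithmetic in the constructor above; the 32 KiB buffer size is an assumption for 
illustration, only MIN_TOTAL_BYTES and NUM_BYTES_PER_REQUEST come from the diff:

```java
public class BufferPoolSizingSketch {
    public static void main(String[] args) {
        long totalBytes = 32 * 1024 * 1024; // MIN_TOTAL_BYTES from the diff above
        int bufferSize = 32 * 1024;         // assumed buffer size, not fixed by the diff
        // Mirrors the numTotalBuffers computation in the constructor above.
        int numTotalBuffers = (int) Math.min(totalBytes / bufferSize, Integer.MAX_VALUE);
        // A single request hands out NUM_BYTES_PER_REQUEST worth of buffers.
        int buffersPerRequest = (8 * 1024 * 1024) / bufferSize;
        // Prints: 1024 buffers in total, 256 buffers per request
        System.out.println(numTotalBuffers + " buffers in total, " + buffersPerRequest + " buffers per request");
    }
}
```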




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific 

[GitHub] [flink] Thesharing commented on a change in pull request #15310: [FLINK-21330] Optimize the performance of PipelinedRegionSchedulingStrategy

2021-03-23 Thread GitBox


Thesharing commented on a change in pull request #15310:
URL: https://github.com/apache/flink/pull/15310#discussion_r600139119



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adapter/DefaultSchedulingPipelinedRegion.java
##
@@ -62,21 +85,34 @@ public DefaultExecutionVertex getVertex(final 
ExecutionVertexID vertexId) {
 
 @Override
 public Iterable getConsumedResults() {
-if (consumedResults == null) {
+if (consumedPartitionGroups == null) {
 initializeConsumedResults();
 }
-return consumedResults;
+return () -> flatMap(consumedPartitionGroups, 
resultPartitionRetriever);
 }
 
 private void initializeConsumedResults() {
-final Set consumedResults = new HashSet<>();
+final Set consumedResultGroupSet = new 
HashSet<>();

Review comment:
   consumedResultGroupSet --> consumedPartitionGroupSet




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wsry commented on a change in pull request #15199: [FLINK-20740][network] Introduce a separated buffer pool and a separated thread pool for sort-merge blocking shuffle

2021-03-23 Thread GitBox


wsry commented on a change in pull request #15199:
URL: https://github.com/apache/flink/pull/15199#discussion_r600138798



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/disk/BatchShuffleReadBufferPool.java
##
@@ -0,0 +1,265 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.disk;
+
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.configuration.TaskManagerOptions;
+import org.apache.flink.core.memory.MemorySegment;
+import org.apache.flink.core.memory.MemorySegmentFactory;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.annotation.concurrent.GuardedBy;
+
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.Queue;
+import java.util.concurrent.TimeoutException;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+import static org.apache.flink.util.Preconditions.checkState;
+
+/**
+ * A fixed-size {@link MemorySegment} pool used by batch shuffle for shuffle 
data read (currently
+ * only used by sort-merge blocking shuffle).
+ */
+public class BatchShuffleReadBufferPool {
+
+private static final Logger LOG = 
LoggerFactory.getLogger(BatchShuffleReadBufferPool.class);
+
+/**
+ * Memory size in bytes can be allocated from this buffer pool for a 
single request (8M is for
+ * better sequential read).
+ */
+private static final int NUM_BYTES_PER_REQUEST = 8 * 1024 * 1024;
+
+/** Total direct memory size in bytes that can be allocated and used by 
this buffer pool. */
+private final long totalBytes;
+
+/**
+ * Maximum time to wait in milliseconds when requesting read buffers from 
this buffer pool
+ * before throwing an exception.
+ */
+private final long requestTimeout;
+
+/** The number of total buffers in this buffer pool. */
+private final int numTotalBuffers;
+
+/** Size of each buffer in bytes in this buffer pool. */
+private final int bufferSize;
+
+/** The number of buffers to be returned for a single request. */
+private final int numBuffersPerRequest;
+
+/** All available buffers in this buffer pool currently. */
+@GuardedBy("buffers")
+private final Queue buffers = new ArrayDeque<>();
+
+/** Whether this buffer pool has been destroyed or not. */
+@GuardedBy("buffers")
+private boolean destroyed;
+
+/** Whether this buffer pool has been initialized or not. */
+@GuardedBy("buffers")
+private boolean initialized;
+
+public BatchShuffleReadBufferPool(long totalBytes, int bufferSize, long 
requestTimeout) {

Review comment:
   @StephanEwen Sorry for the misunderstanding of this reply, let us just 
ignore this. I misunderstood the context.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21603) SQLClientKafkaITCase failed due to unexpected end of file

2021-03-23 Thread Guowei Ma (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307530#comment-17307530
 ] 

Guowei Ma commented on FLINK-21603:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15306=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=ff888d9b-cd34-53cc-d90f-3e446d355529=27516

> SQLClientKafkaITCase failed due to unexpected end of file
> -
>
> Key: FLINK-21603
> URL: https://issues.apache.org/jira/browse/FLINK-21603
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Tests
>Affects Versions: 1.13.0
>Reporter: Yun Tang
>Priority: Major
>
> https://myasuka.visualstudio.com/flink/_build/results?buildId=252=logs=9401bf33-03c4-5a24-83fe-e51d75db73ef=72901ab2-7cd0-57be-82b1-bca51de20fba
> {code:java}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 498.992 s <<< FAILURE! - in 
> org.apache.flink.tests.util.kafka.SQLClientKafkaITCase
> [ERROR] testKafka[0: kafka-version:2.4.1 
> kafka-sql-version:universal](org.apache.flink.tests.util.kafka.SQLClientKafkaITCase)
>   Time elapsed: 448.465 s  <<< ERROR!
> java.io.IOException: 
> Process execution failed due error. Error output:
> gzip: stdin: unexpected end of file
> tar: Unexpected EOF in archive
> tar: Unexpected EOF in archive
> tar: Error is not recoverable: exiting now
>   at 
> org.apache.flink.tests.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:133)
>   at 
> org.apache.flink.tests.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:108)
>   at 
> org.apache.flink.tests.util.AutoClosableProcess.runBlocking(AutoClosableProcess.java:70)
>   at 
> org.apache.flink.tests.util.kafka.LocalStandaloneKafkaResource.setupKafkaDist(LocalStandaloneKafkaResource.java:112)
>   at 
> org.apache.flink.tests.util.kafka.LocalStandaloneKafkaResource.before(LocalStandaloneKafkaResource.java:102)
>   at 
> org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:46)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> 

[GitHub] [flink] SteNicholas commented on pull request #15349: [FLINK-21937][python] Support batch mode in Python DataStream API for basic operations

2021-03-23 Thread GitBox


SteNicholas commented on pull request #15349:
URL: https://github.com/apache/flink/pull/15349#issuecomment-805460793


   @dianfu Thanks for your pull request. LGTM.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-21698) Disable problematic cast conversion between NUMERIC type and TIMESTAMP type

2021-03-23 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-21698.
---
Fix Version/s: 1.13.0
   Resolution: Fixed

Fixed in master: c8b0da5aaa71fa0f1f797f30fc2414e45f7cd78f

> Disable problematic cast conversion between NUMERIC type and TIMESTAMP type
> ---
>
> Key: FLINK-21698
> URL: https://issues.apache.org/jira/browse/FLINK-21698
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> The cast conversion between NUMERIC type and TIMESTAMP type is problematic, 
> so we should disable it; the cast conversion between NUMERIC type and 
> TIMESTAMP_LTZ type remains valid.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wuchong closed pull request #15132: [FLINK-21698][table-planner] Disable problematic cast conversion between NUMERIC type and TIMESTAMP type

2021-03-23 Thread GitBox


wuchong closed pull request #15132:
URL: https://github.com/apache/flink/pull/15132


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15350: [FLINK-21629][python] Support Python UDAF in Sliding Window

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15350:
URL: https://github.com/apache/flink/pull/15350#issuecomment-805441787


   
   ## CI report:
   
   * 18d5d7a39ec309bd76ea95c67db75fc612a5f9d3 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15329)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15349: [FLINK-21937][python] Support batch mode in Python DataStream API for basic operations

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15349:
URL: https://github.com/apache/flink/pull/15349#issuecomment-805441726


   
   ## CI report:
   
   * 0210004ca94697eb0d1e88436f18c347ead32694 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15328)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15149: [FLINK-21294][python] Support state access API for the map/flat_map operation of Python ConnectedStreams

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15149:
URL: https://github.com/apache/flink/pull/15149#issuecomment-796635889


   
   ## CI report:
   
   * 44b79b3ce008d71fda142e56288d30ac19e2ad40 UNKNOWN
   * ec90293567a79dca2d23941aca6019f5f4c7d36e UNKNOWN
   * d833aad89167a74c7b3c6da84ad30f4470568cc2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15307)
 
   * 5239c49d3c69a39494d5fb011cb4395fed64279e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15327)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15119: [FLINK-21736][state] Introduce state scope latency tracking metrics

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15119:
URL: https://github.com/apache/flink/pull/15119#issuecomment-792934307


   
   ## CI report:
   
   * f2de7e07c66e6dda2b4caef75e2f1dfde9f0e5c3 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15301)
 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15330)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xintongsong commented on a change in pull request #15273: [FLINK-21800][core] Guard MemorySegment against concurrent frees.

2021-03-23 Thread GitBox


xintongsong commented on a change in pull request #15273:
URL: https://github.com/apache/flink/pull/15273#discussion_r600131068



##
File path: 
flink-core/src/main/java/org/apache/flink/core/memory/MemorySegment.java
##
@@ -217,10 +222,14 @@ public int size() {
 /**
  * Checks whether the memory segment was freed.
  *
+ * This method internally involves cross-thread synchronization. Do not 
use for performance
+ * sensitive code paths.
+ *
  * @return true, if the memory segment has been freed, 
false otherwise.
  */
 public boolean isFreed() {
-return address > addressLimit;
+// in performance sensitive cases, use 'address > addressLimit' instead
+return isFreedAtomic.get();

Review comment:
   In the long term, I think it would be better to rule out all of the 
following cases. And by "rule out", I mean getting rid of the existing cases 
and preventing new cases from being introduced.
   - A segment is freed more than once (concurrently or not)
   - A segment is being GC-ed without having been freed
   
   Once the existing cases are all removed, we can strictly fail on these 
cases. That also means guarding concurrent frees should not be needed. Any 
concurrent frees should have already failed due to multiple frees during CI 
tests.
   
   The benefit of enforcing these rules on all memory types is that it simplifies 
the contract of using a memory segment and improves maintainability. Otherwise, it 
would be complicated to understand that a segment has to be freed exactly once 
if it's unsafe, while it's okay to free another segment multiple times (or 
never) if it's heap/direct. Users of `MemorySegment` should not be forced to 
understand the underlying memory type.
   
   I admit enforcing such rules comes at a price of extra overhead in 
freeing/finalizing the segments. However, I'd argue that the price is 
insignificant compared to the benefits.
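
   To make the proposed contract concrete, here is a minimal sketch (not Flink's 
actual `MemorySegment`; names are invented for the example) of a free-exactly-once 
guard in the spirit of the `isFreedAtomic` field in the diff above:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only; not the real MemorySegment implementation.
class GuardedSegment {
    private final AtomicBoolean freed = new AtomicBoolean(false);

    void free() {
        // compareAndSet lets exactly one caller win, so a second (possibly
        // concurrent) free is detected instead of silently passing.
        if (!freed.compareAndSet(false, true)) {
            throw new IllegalStateException("Segment was freed more than once.");
        }
        releaseUnderlyingMemory();
    }

    boolean isFreed() {
        // Involves cross-thread synchronization; avoid on hot code paths.
        return freed.get();
    }

    private void releaseUnderlyingMemory() {
        // Placeholder for the heap/direct/unsafe specific cleanup.
    }
}
```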




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-21731) Add benchmarks for DefaultScheduler's creation, scheduling and deploying

2021-03-23 Thread Zhilong Hong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhilong Hong updated FLINK-21731:
-
Description: 
We notice that {{DefaultScheduler#allocateSlotsAndDeploy}} is not covered in 
the current scheduler benchmark. When we are trying to implement a benchmark 
related to this procedure, we think that it's better to implement a benchmark 
that covers the entire procedures of DefaultScheduler's creation, scheduling 
and deploying instead.

This can avoid missing any parts that may affect the performance of scheduling. 
Also in this way we don't need to add benchmarks for every sub-procedure 
related to scheduling, which makes the benchmark heavy and hard to maintain. 
Then we can just focus on the performance-sensitive procedures, as the existing 
benchmarks do.

The new benchmark items are implemented based on 
{{DefaultSchedulerBatchSchedulingTest}}.

  was:
We notice that {{DefaultScheduler#allocateSlotsAndDeploy}} is not covered in 
the current scheduler benchmark. When we are trying to implement a benchmark 
related to this procedure, we think that it's better to implement a benchmark 
that covers the entire procedures of DefaultScheduler's creation, scheduling 
and deploying instead.

This can avoid missing any parts that may affect the performance of scheduling. 
Also in this way we don't need to add benchmarks for every sub-procedure 
related to scheduling, which makes the benchmark heavy and hard to maintain. 
Then we can just focus on the performance-sensitive procedures, as the existing 
benchmarks do.

The new benchmark item is implemented based on 
{{DefaultSchedulerBatchSchedulingTest}}.


> Add benchmarks for DefaultScheduler's creation, scheduling and deploying
> 
>
> Key: FLINK-21731
> URL: https://issues.apache.org/jira/browse/FLINK-21731
> Project: Flink
>  Issue Type: Improvement
>  Components: Benchmarks, Runtime / Coordination
>Affects Versions: 1.13.0
>Reporter: Zhilong Hong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> We notice that {{DefaultScheduler#allocateSlotsAndDeploy}} is not covered in 
> the current scheduler benchmark. When we are trying to implement a benchmark 
> related to this procedure, we think that it's better to implement a benchmark 
> that covers the entire procedures of DefaultScheduler's creation, scheduling 
> and deploying instead.
> This can avoid missing any parts that may affect the performance of 
> scheduling. Also in this way we don't need to add benchmarks for every 
> sub-procedure related to scheduling, which makes the benchmark heavy and hard 
> to maintain. Then we can just focus on the performance-sensitive procedures, 
> as the existing benchmarks do.
> The new benchmark items are implemented based on 
> {{DefaultSchedulerBatchSchedulingTest}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21939) Support batch mode in Python DataStream API for process operation

2021-03-23 Thread Dian Fu (Jira)
Dian Fu created FLINK-21939:
---

 Summary: Support batch mode in Python DataStream API for process 
operation
 Key: FLINK-21939
 URL: https://issues.apache.org/jira/browse/FLINK-21939
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Dian Fu
Assignee: Dian Fu
 Fix For: 1.13.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] xiangtao commented on a change in pull request #15332: [FLINK-21701][sql-client] Extend the "RESET" syntax in the SQL Client

2021-03-23 Thread GitBox


xiangtao commented on a change in pull request #15332:
URL: https://github.com/apache/flink/pull/15332#discussion_r600125968



##
File path: flink-table/flink-sql-client/src/test/resources/sql/set.q
##
@@ -104,3 +104,70 @@ Was expecting one of:
 "," ...
 
 !error
+
+# test reset remove key
+reset execution.max-idle-state-retention;
+[WARNING] The specified key is not supported anymore.
+!warning
+
+set execution.max-table-result-rows=100;
+[WARNING] The specified key 'execution.max-table-result-rows' is deprecated. 
Please use 'sql-client.execution.max-table-result.rows' instead.
+[INFO] Session property has been set.
+!warning
+
+# test reset the deprecated key
+reset execution.max-table-result-rows;
+[WARNING] The specified key 'execution.max-table-result-rows' is deprecated. 
Please use 'sql-client.execution.max-table-result.rows' instead.
+[INFO] Session property has been reset.
+!warning
+
+set;
+execution.attached=true
+execution.savepoint.ignore-unclaimed-state=false
+execution.shutdown-on-attached-exit=false
+execution.target=remote
+jobmanager.rpc.address=$VAR_JOBMANAGER_RPC_ADDRESS
+pipeline.classpaths=
+pipeline.jars=$VAR_PIPELINE_JARS
+rest.port=$VAR_REST_PORT
+!ok
+
+
+set parallelism.default=3;
+[INFO] Session property has been set.
+!info
+
+# test reset option key
+reset parallelism.default;

Review comment:
   Oh, it seems I got it wrong. Yes, you're right.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Myasuka commented on pull request #15119: [FLINK-21736][state] Introduce state scope latency tracking metrics

2021-03-23 Thread GitBox


Myasuka commented on pull request #15119:
URL: https://github.com/apache/flink/pull/15119#issuecomment-805447550


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21938) Add documentation about how to test Python UDFs

2021-03-23 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307521#comment-17307521
 ] 

Dian Fu commented on FLINK-21938:
-

[~Yik San Chan] Great! Thanks a lot! I have assigned the ticket to you~

> Add documentation about how to test Python UDFs
> ---
>
> Key: FLINK-21938
> URL: https://issues.apache.org/jira/browse/FLINK-21938
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python, Documentation
>Reporter: Dian Fu
>Assignee: Yik San Chan
>Priority: Major
> Fix For: 1.13.0
>
>
> It should be similar to the Java UDFs:
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#testing-user-defined-functions



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-21688) we use setIncreaseParallelism function, can cause slow flush in restore

2021-03-23 Thread xiaogang zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307520#comment-17307520
 ] 

xiaogang zhou commented on FLINK-21688:
---

[~yunta] Hello, can you please help me understand why setIncreaseParallelism is 
used to set the RocksDB thread number instead of setMaxBackgroundJobs?

setIncreaseParallelism limits the flush threads to 1 for the whole process. Why 
do we have to limit it?

Thanks a lot if you can let me know your thinking.
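
For illustration, a minimal sketch of the two RocksJava DBOptions calls discussed 
here (plain DBOptions usage, outside Flink's RocksDB options factory; the comments 
restate the claim made in this thread about flush threads, they are not verified here):

{code:java}
import org.rocksdb.DBOptions;
import org.rocksdb.RocksDB;

public class BackgroundThreadsSketch {
    public static void main(String[] args) throws Exception {
        RocksDB.loadLibrary();
        try (DBOptions options = new DBOptions()) {
            // Grows the LOW-priority (compaction) thread pool; per the discussion
            // above, the HIGH-priority (flush) pool stays limited to one thread.
            options.setIncreaseParallelism(4);

            // Alternative raised in the question: let RocksDB share N background
            // threads between flushes and compactions.
            options.setMaxBackgroundJobs(4);
        }
    }
}
{code}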

> we use setIncreaseParallelism function, can cause slow flush in restore
> ---
>
> Key: FLINK-21688
> URL: https://issues.apache.org/jira/browse/FLINK-21688
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Affects Versions: 1.11.3
>Reporter: xiaogang zhou
>Priority: Major
>
> Using the setIncreaseParallelism function can cause slow flushes in the 
> restore/rescaling case, as this function limits the HIGH-priority threads to 1.
>  
> Why not set the max background jobs to 40, which would offer more flush 
> threads and enable a faster recovery?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-21938) Add documentation about how to test Python UDFs

2021-03-23 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu reassigned FLINK-21938:
---

Assignee: Yik San Chan

> Add documentation about how to test Python UDFs
> ---
>
> Key: FLINK-21938
> URL: https://issues.apache.org/jira/browse/FLINK-21938
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python, Documentation
>Reporter: Dian Fu
>Assignee: Yik San Chan
>Priority: Major
> Fix For: 1.13.0
>
>
> It should be similar to the Java UDFs:
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#testing-user-defined-functions



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #15350: [FLINK-21629][python] Support Python UDAF in Sliding Window

2021-03-23 Thread GitBox


flinkbot commented on pull request #15350:
URL: https://github.com/apache/flink/pull/15350#issuecomment-805441787


   
   ## CI report:
   
   * 18d5d7a39ec309bd76ea95c67db75fc612a5f9d3 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15349: [FLINK-21937][python] Support batch mode in Python DataStream API for basic operations

2021-03-23 Thread GitBox


flinkbot commented on pull request #15349:
URL: https://github.com/apache/flink/pull/15349#issuecomment-805441726


   
   ## CI report:
   
   * 0210004ca94697eb0d1e88436f18c347ead32694 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15117: [FLINK-21655] [table-planner-blink] Fix incorrect simplification for coalesce call on a groupingsets' result

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15117:
URL: https://github.com/apache/flink/pull/15117#issuecomment-792736135


   
   ## CI report:
   
   * 79d036eb98a74f28c87ada0e2be3b6947f94a965 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15252)
 
   * b06d70f7ef08aa1ff5278f864571fb52b7848113 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15326)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15350: [FLINK-21629][python] Support Python UDAF in Sliding Window

2021-03-23 Thread GitBox


flinkbot commented on pull request #15350:
URL: https://github.com/apache/flink/pull/15350#issuecomment-805438428


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 18d5d7a39ec309bd76ea95c67db75fc612a5f9d3 (Wed Mar 24 
02:39:01 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-21938) Add documentation about how to test Python UDFs

2021-03-23 Thread Yik San Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17307518#comment-17307518
 ] 

Yik San Chan commented on FLINK-21938:
--

Do you mind if I work on this issue, given that I raised the question? Thank you!

> Add documentation about how to test Python UDFs
> ---
>
> Key: FLINK-21938
> URL: https://issues.apache.org/jira/browse/FLINK-21938
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python, Documentation
>Reporter: Dian Fu
>Priority: Major
> Fix For: 1.13.0
>
>
> It should be similar to the Java UDFs:
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#testing-user-defined-functions



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-21629) Support Python UDAF in Sliding Window

2021-03-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21629:
---
Labels: pull-request-available  (was: )

> Support Python UDAF in Sliding Window
> -
>
> Key: FLINK-21629
> URL: https://issues.apache.org/jira/browse/FLINK-21629
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Affects Versions: 1.13.0
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> Support Python UDAF in Sliding Window



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] HuangXingBo opened a new pull request #15350: [FLINK-21629][python] Support Python UDAF in Sliding Window

2021-03-23 Thread GitBox


HuangXingBo opened a new pull request #15350:
URL: https://github.com/apache/flink/pull/15350


   ## What is the purpose of the change
   
   *This pull request will support Python UDAF in Sliding Window*
   
   
   ## Brief change log
   
 - *Add `PanedWindowAssigner`, `SlidingWindowAssigner` and 
`CountSlidingWindowAssigner`*
 - *Add `PanedWindowProcessFunction`*
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *integration tests `test_sliding_group_window_over_time` and 
`test_sliding_group_window_over_count` in `test_udaf.py`*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-21938) Add documentation about how to test Python UDFs

2021-03-23 Thread Dian Fu (Jira)
Dian Fu created FLINK-21938:
---

 Summary: Add documentation about how to test Python UDFs
 Key: FLINK-21938
 URL: https://issues.apache.org/jira/browse/FLINK-21938
 Project: Flink
  Issue Type: Improvement
  Components: API / Python, Documentation
Reporter: Dian Fu
 Fix For: 1.13.0


It should be similar to the Java UDFs:

https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html#testing-user-defined-functions



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #15149: [FLINK-21294][python] Support state access API for the map/flat_map operation of Python ConnectedStreams

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15149:
URL: https://github.com/apache/flink/pull/15149#issuecomment-796635889


   
   ## CI report:
   
   * 44b79b3ce008d71fda142e56288d30ac19e2ad40 UNKNOWN
   * ec90293567a79dca2d23941aca6019f5f4c7d36e UNKNOWN
   * d833aad89167a74c7b3c6da84ad30f4470568cc2 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15307)
 
   * 5239c49d3c69a39494d5fb011cb4395fed64279e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #15117: [FLINK-21655] [table-planner-blink] Fix incorrect simplification for coalesce call on a groupingsets' result

2021-03-23 Thread GitBox


flinkbot edited a comment on pull request #15117:
URL: https://github.com/apache/flink/pull/15117#issuecomment-792736135


   
   ## CI report:
   
   * 79d036eb98a74f28c87ada0e2be3b6947f94a965 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=15252)
 
   * b06d70f7ef08aa1ff5278f864571fb52b7848113 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #15349: [FLINK-21937][python] Support batch mode in Python DataStream API for basic operations

2021-03-23 Thread GitBox


flinkbot commented on pull request #15349:
URL: https://github.com/apache/flink/pull/15349#issuecomment-805429839


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 0210004ca94697eb0d1e88436f18c347ead32694 (Wed Mar 24 
02:20:12 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-21794) Support retrieving slot details via rest api

2021-03-23 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-21794.

Fix Version/s: 1.13.0
   Resolution: Done

Merged via:
* master (1.13): 99c2a415e9eeefafacf70762b6f54070f7911ceb

> Support retrieving slot details via rest api 
> -
>
> Key: FLINK-21794
> URL: https://issues.apache.org/jira/browse/FLINK-21794
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Reporter: Xintong Song
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> It would be helpful to allow retrieving detail information of slots via rest 
> api.
>  * JobID that the slot is assigned to
>  * Slot resources (for dynamic slot allocation)
> Such information should be displayed on webui, once fine-grained resource 
> management is enabled in future.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] xintongsong closed pull request #15249: [FLINK-21794][metrics] Support retrieving slot details via rest api

2021-03-23 Thread GitBox


xintongsong closed pull request #15249:
URL: https://github.com/apache/flink/pull/15249


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-21937) Support batch mode in Python DataStream API for basic operations

2021-03-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-21937:
---
Labels: pull-request-available  (was: )

> Support batch mode in Python DataStream API for basic operations
> 
>
> Key: FLINK-21937
> URL: https://issues.apache.org/jira/browse/FLINK-21937
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Dian Fu
>Assignee: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> This ticket is dedicated to support batch mode for basic operations such as 
> map/flatmap/filter, etc in Python DataStream API. 
> For the other operations such as reduce, etc, we will support them in 
> separate tickets.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] lindong28 commented on a change in pull request #15161: [FLINK-20114][connector/kafka,common] Fix a few KafkaSource-related bugs

2021-03-23 Thread GitBox


lindong28 commented on a change in pull request #15161:
URL: https://github.com/apache/flink/pull/15161#discussion_r600110083



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/source/coordinator/SourceCoordinatorContext.java
##
@@ -103,18 +106,22 @@ public SourceCoordinatorContext(
 SimpleVersionedSerializer splitSerializer) {
 this(
 coordinatorExecutor,
+Executors.newScheduledThreadPool(
+numWorkerThreads,
+new ExecutorThreadFactory(
+
coordinatorThreadFactory.getCoordinatorThreadName() + "-worker")),
 coordinatorThreadFactory,
-numWorkerThreads,
 operatorCoordinatorContext,
 splitSerializer,
 new SplitAssignmentTracker<>());
 }
 
 // Package private method for unit test.
+@VisibleForTesting

Review comment:
   Sounds good. Will continue to follow this pattern in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



