[GitHub] [flink] liyubin117 commented on pull request #18361: [FLINK-25631][table] Support enhanced `show tables` syntax

2022-01-13 Thread GitBox


liyubin117 commented on pull request #18361:
URL: https://github.com/apache/flink/pull/18361#issuecomment-1012879158


   @wuchong Hi, could you please review this PR? Thanks very much!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25654) Remove the redundant lock in SortMergeResultPartition

2022-01-13 Thread zhangtianyu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476005#comment-17476005
 ] 

zhangtianyu commented on FLINK-25654:
-

t-1
-- Original Message --

From: Yingjie Cao (Jira) "j...@apache.org"
Date: 2022/01/14 15:42:00 Friday
To: d...@flink.apache.org
Cc:
Subject: [jira] [Created] (FLINK-25654) Remove the redundant lock in
 SortMergeResultPartition

Yingjie Cao created FLINK-25654:
---

 Summary: Remove the redundant lock in SortMergeResultPartition
 Key: FLINK-25654
 URL: https://issues.apache.org/jira/browse/FLINK-25654
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Affects Versions: 1.14.3
Reporter: Yingjie Cao
 Fix For: 1.15.0, 1.14.4


After FLINK-2372, the task canceler will never call the close method of 
ResultPartition; this reduces some race conditions and simplifies the code. 
This ticket aims to remove some redundant locks in SortMergeResultPartition.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


> Remove the redundant lock in SortMergeResultPartition
> -
>
> Key: FLINK-25654
> URL: https://issues.apache.org/jira/browse/FLINK-25654
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.14.3
>Reporter: Yingjie Cao
>Priority: Major
> Fix For: 1.15.0, 1.14.4
>
>
> After FLINK-2372, the task canceler will never call the close method of 
> ResultPartition; this reduces some race conditions and simplifies the code. 
> This ticket aims to remove some redundant locks in SortMergeResultPartition.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25654) Remove the redundant lock in SortMergeResultPartition

2022-01-13 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25654:
---

 Summary: Remove the redundant lock in SortMergeResultPartition
 Key: FLINK-25654
 URL: https://issues.apache.org/jira/browse/FLINK-25654
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Affects Versions: 1.14.3
Reporter: Yingjie Cao
 Fix For: 1.15.0, 1.14.4


After FLINK-2372, the task canceler will never call the close method of 
ResultPartition; this reduces some race conditions and simplifies the code. 
This ticket aims to remove some redundant locks in SortMergeResultPartition.
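
A minimal sketch of the lock-removal pattern (hypothetical code for 
illustration; not the actual SortMergeResultPartition implementation): once 
close() can no longer race with the task thread, the defensive lock serves no 
purpose and can be dropped.

{code:java}
class BufferHolderSketch {
    interface SimpleBuffer { void recycle(); }

    private SimpleBuffer buffer;

    // Before: the lock guarded against a concurrent close() from the task
    // canceler, which can no longer happen after the change described above.
    void releaseWithRedundantLock() {
        synchronized (this) {
            if (buffer != null) {
                buffer.recycle();
                buffer = null;
            }
        }
    }

    // After: only the task thread ever calls this, so no lock is needed.
    void release() {
        if (buffer != null) {
            buffer.recycle();
            buffer = null;
        }
    }
}
{code}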



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-25649) Scheduling jobs fails with org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException

2022-01-13 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475999#comment-17475999
 ] 

Till Rohrmann edited comment on FLINK-25649 at 1/14/22, 7:30 AM:
-

cc [~dmvk], [~zhuzh]


was (Author: till.rohrmann):
cc [~dmvk]

> Scheduling jobs fails with 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException
> -
>
> Key: FLINK-25649
> URL: https://issues.apache.org/jira/browse/FLINK-25649
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.1
>Reporter: Gil De Grove
>Priority: Major
>
> Following comment from Till on this [SO 
> question|https://stackoverflow.com/questions/70683048/scheduling-jobs-fails-with-org-apache-flink-runtime-jobmanager-scheduler-noresou?noredirect=1#comment124980546_70683048]
> h2. *Summary*
> We are currently experiencing a scheduling issue with our flink cluster.
> The symptoms are that some/most/all (it depends; the symptoms are not always 
> the same) of our tasks are shown as _SCHEDULED_ but fail after a timeout. 
> The jobs are then shown as _RUNNING_.
> The failing exception is the following one:
> {{Caused by: java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Slot request bulk is not fulfillable! Could not allocate the required slot 
> within slot request timeout}}
> After analysis, we assume (we cannot prove it, as there are not many 
> logs for that part of the code) that the failure is due to a deadlock/race 
> condition that happens when several jobs are submitted at the same 
> time to the flink cluster, even though we have enough slots available in the 
> cluster.
> We actually see the error with 52 available task slots and 12 jobs 
> that are not scheduled.
> h2. Additional information
>  * Flink version: 1.13.1 commit a7f3192
>  * Flink cluster in session mode
>  * 2 Job managers using k8s HA mode (resource requests: 2 CPU, 4Gb Ram, 
> limits sets on memory to 4Gb)
>  * 50 task managers with 2 slots each (resource requests: 2 CPUs, 2GB Ram. No 
> limits set).
>  * Our Flink cluster is shut down every night and restarted every morning. 
> The error seems to occur when a lot of jobs need to be scheduled. The jobs 
> are configured to restore their state, and we do not see any issues for jobs 
> that are scheduled and run correctly; it really seems to be related to 
> a scheduling issue.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25649) Scheduling jobs fails with org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException

2022-01-13 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475999#comment-17475999
 ] 

Till Rohrmann commented on FLINK-25649:
---

cc [~dmvk]

> Scheduling jobs fails with 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException
> -
>
> Key: FLINK-25649
> URL: https://issues.apache.org/jira/browse/FLINK-25649
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.13.1
>Reporter: Gil De Grove
>Priority: Major
>
> Following comment from Till on this [SO 
> question|https://stackoverflow.com/questions/70683048/scheduling-jobs-fails-with-org-apache-flink-runtime-jobmanager-scheduler-noresou?noredirect=1#comment124980546_70683048]
> h2. *Summary*
> We are currently experiencing a scheduling issue with our flink cluster.
> The symptoms are that some/most/all (it depends; the symptoms are not always 
> the same) of our tasks are shown as _SCHEDULED_ but fail after a timeout. 
> The jobs are then shown as _RUNNING_.
> The failing exception is the following one:
> {{Caused by: java.util.concurrent.CompletionException: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Slot request bulk is not fulfillable! Could not allocate the required slot 
> within slot request timeout}}
> After analysis, we assume (we cannot prove it, as there are not many 
> logs for that part of the code) that the failure is due to a deadlock/race 
> condition that happens when several jobs are submitted at the same 
> time to the flink cluster, even though we have enough slots available in the 
> cluster.
> We actually see the error with 52 available task slots and 12 jobs 
> that are not scheduled.
> h2. Additional information
>  * Flink version: 1.13.1 commit a7f3192
>  * Flink cluster in session mode
>  * 2 Job managers using k8s HA mode (resource requests: 2 CPU, 4Gb Ram, 
> limits sets on memory to 4Gb)
>  * 50 task managers with 2 slots each (resource requests: 2 CPUs, 2GB Ram. No 
> limits set).
>  * Our Flink cluster is shut down every night and restarted every morning. 
> The error seems to occur when a lot of jobs need to be scheduled. The jobs 
> are configured to restore their state, and we do not see any issues for jobs 
> that are scheduled and run correctly; it really seems to be related to 
> a scheduling issue.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25653) Move buffer recycle in SortMergeSubpartitionReader out of lock to avoid deadlock

2022-01-13 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25653:
---

 Summary: Move buffer recycle in SortMergeSubpartitionReader out of 
lock to avoid deadlock
 Key: FLINK-25653
 URL: https://issues.apache.org/jira/browse/FLINK-25653
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Affects Versions: 1.13.5, 1.14.3
Reporter: Yingjie Cao
 Fix For: 1.15.0, 1.13.6, 1.14.4


In the current sort-shuffle implementation, the different lock order in 
SortMergeSubpartitionReader and SortMergeResultPartitionReadScheduler can cause 
a deadlock. To solve the problem, we can move the buffer recycling in 
SortMergeSubpartitionReader out of the lock.
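
A sketch of the general fix pattern (hypothetical names, not the actual reader 
code): collect the buffers while holding the lock, but recycle them only after 
releasing it, so no thread holds one lock while acquiring the other in the 
opposite order.

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

class SubpartitionReaderSketch {
    private final Object lock = new Object();
    private final Queue<Runnable> pendingRecycles = new ArrayDeque<>();

    void releaseBuffers() {
        Queue<Runnable> toRecycle;
        synchronized (lock) {
            // Only detach the pending buffers while holding the lock ...
            toRecycle = new ArrayDeque<>(pendingRecycles);
            pendingRecycles.clear();
        }
        // ... and run the recycling callbacks outside of it, because recycling
        // may take another component's lock in the opposite order.
        toRecycle.forEach(Runnable::run);
    }
}
{code}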



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #7: [FLINK-25643] Introduce Predicate to table store

2022-01-13 Thread GitBox


JingsongLi commented on a change in pull request #7:
URL: https://github.com/apache/flink-table-store/pull/7#discussion_r784608572



##
File path: 
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/predicate/Equal.java
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.predicate;
+
+import org.apache.flink.table.store.file.stats.FieldStats;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/** A {@link Predicate} to eval equal. */
+public class Equal implements Predicate {
+
+private final int index;
+
+private final Literal literal;
+
+public Equal(int index, Literal literal) {
+this.index = index;
+this.literal = checkNotNull(literal);
+}
+
+@Override
+public boolean test(Object[] values) {
+Object field = values[index];
+return field != null && literal.compareValueTo(field) == 0;
+}
+
+@Override
+public boolean test(long rowCount, FieldStats[] fieldStats) {
+FieldStats stats = fieldStats[index];
+if (rowCount == stats.nullCount()) {
+return false;
+}
+return literal.compareValueTo(stats.minValue()) >= 0
+&& literal.compareValueTo(stats.maxValue()) <= 0;

Review comment:
   I think this is currently something that needs to be improved and should 
not exist; we need to complete the statistics before integrating everything 
together.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-table-store] tsreaper commented on a change in pull request #7: [FLINK-25643] Introduce Predicate to table store

2022-01-13 Thread GitBox


tsreaper commented on a change in pull request #7:
URL: https://github.com/apache/flink-table-store/pull/7#discussion_r784594032



##
File path: 
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/predicate/Equal.java
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.predicate;
+
+import org.apache.flink.table.store.file.stats.FieldStats;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/** A {@link Predicate} to eval equal. */
+public class Equal implements Predicate {
+
+private final int index;
+
+private final Literal literal;
+
+public Equal(int index, Literal literal) {
+this.index = index;
+this.literal = checkNotNull(literal);
+}
+
+@Override
+public boolean test(Object[] values) {
+Object field = values[index];
+return field != null && literal.compareValueTo(field) == 0;
+}
+
+@Override
+public boolean test(long rowCount, FieldStats[] fieldStats) {
+FieldStats stats = fieldStats[index];
+if (rowCount == stats.nullCount()) {
+return false;
+}
+return literal.compareValueTo(stats.minValue()) >= 0
+&& literal.compareValueTo(stats.maxValue()) <= 0;

Review comment:
   `stats.minValue()` and `stats.maxValue()` can be null even for 
comparable types (see current implementation of `SstFile.RollingFile#finish`). 
Please deal with this case and add more tests.
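
   A possible null-safe variant (a sketch of the idea only, not the final 
implementation): when the min/max statistics are absent, the predicate cannot 
prune, so it should conservatively keep the file.

```java
@Override
public boolean test(long rowCount, FieldStats[] fieldStats) {
    FieldStats stats = fieldStats[index];
    if (rowCount == stats.nullCount()) {
        return false;
    }
    if (stats.minValue() == null || stats.maxValue() == null) {
        // No usable statistics: we cannot rule out matching rows.
        return true;
    }
    return literal.compareValueTo(stats.minValue()) >= 0
            && literal.compareValueTo(stats.maxValue()) <= 0;
}
```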




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





[GitHub] [flink] Airblader edited a comment on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2022-01-13 Thread GitBox


Airblader edited a comment on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-1012833850


   I would prefer bumping this in a separate PR using the JIRA issue 
@MartijnVisser pointed to. I'm happy to review and merge that one as well, of 
course (no vacation in between this time).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Airblader commented on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2022-01-13 Thread GitBox


Airblader commented on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-1012833850


   I would prefer bumping this in a separate PR using the JIRA issue 
@MartijnVisser pointed to.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #8: [FLINK-25630] Introduce manifest related data structure and files

2022-01-13 Thread GitBox


JingsongLi commented on a change in pull request #8:
URL: https://github.com/apache/flink-table-store/pull/8#discussion_r784581070



##
File path: 
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/manifest/ManifestEntrySerializer.java
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.manifest;
+
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.runtime.typeutils.RowDataSerializer;
+import org.apache.flink.table.store.file.ValueKind;
+import org.apache.flink.table.store.file.mergetree.sst.SstFileMetaSerializer;
+import org.apache.flink.table.store.file.utils.ObjectSerializer;
+import org.apache.flink.table.types.logical.RowType;
+
+/** Serializer for {@link ManifestEntry}. */
+public class ManifestEntrySerializer extends ObjectSerializer<ManifestEntry> {
+

Review comment:
   Can you add `serialVersionUID` to all `ObjectSerializer`?
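
   For example, a sketch of what that would look like:

```java
/** Serializer for {@link ManifestEntry}. */
public class ManifestEntrySerializer extends ObjectSerializer<ManifestEntry> {

    private static final long serialVersionUID = 1L;

    // ... the rest of the class stays unchanged ...
}
```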




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-25652) Can "duration" and "received records" be updated at the same time in WebUI's task detail?

2022-01-13 Thread jeff-zou (Jira)
jeff-zou created FLINK-25652:


 Summary: Can "duration" and "received records" be updated at the 
same time in WebUI's task detail?
 Key: FLINK-25652
 URL: https://issues.apache.org/jira/browse/FLINK-25652
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.13.2
Reporter: jeff-zou
 Attachments: 20220114145221.png

Can "duration“ and ”records received"  be updated at the same time in WebUI's 
task detail  ?

then I can get more precise QPS which the value is  equal to  ”records 
received" div "duration“. 

!20220114145221.png!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] chenzihao5 commented on pull request #18261: [hotfix][javadocs] Fix the doc content in the createFieldGetter method of RowData

2022-01-13 Thread GitBox


chenzihao5 commented on pull request #18261:
URL: https://github.com/apache/flink/pull/18261#issuecomment-1012819645


   > Thank you for your contribution.
   > 
   > Please update your commit message to follow the Code Style & Quality 
Guidelines, which can be found at 
https://flink.apache.org/contributing/code-style-and-quality-pull-requests.html#commit-naming-conventions.
 Right now it mentions `[docs]` as what it fixes; this should be `[javadocs]`
   
   @MartijnVisser Thanks for your correction. I have modified it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-table-store] tsreaper opened a new pull request #8: [FLINK-25630] Introduce manifest related data structure and files

2022-01-13 Thread GitBox


tsreaper opened a new pull request #8:
URL: https://github.com/apache/flink-table-store/pull/8


   Manifest files are the metadata for sst files. They record the changes 
between two snapshots.
   
   * Introduce manifest entry and manifest file meta
   * Add tests for FileStorePathFactory
   * Introduce ManifestFile and ManifestList


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zjureel closed pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-13 Thread GitBox


zjureel closed pull request #18346:
URL: https://github.com/apache/flink/pull/18346


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #18361: [FLINK-25631][table] Support enhanced `show tables` syntax

2022-01-13 Thread GitBox


flinkbot commented on pull request #18361:
URL: https://github.com/apache/flink/pull/18361#issuecomment-1012808315


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 4aec51002d41ee5968d7c9cd8b912f7538785aad (Fri Jan 14 
06:24:39 UTC 2022)
   
✅ no warnings
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wsry closed pull request #17936: [FLINK-24954][network] Refresh read buffer request timeout on buffer recycling/requesting for sort-shuffle

2022-01-13 Thread GitBox


wsry closed pull request #17936:
URL: https://github.com/apache/flink/pull/17936


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25631) Support enhanced `show tables` statement

2022-01-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25631:
---
Labels: pull-request-available  (was: )

> Support enhanced `show tables` statement
> 
>
> Key: FLINK-25631
> URL: https://issues.apache.org/jira/browse/FLINK-25631
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.14.4
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
>  Labels: pull-request-available
>
> An enhanced `show tables` statement like ` show tables from db1 like 't%' ` is 
> broadly supported in many popular data processing engines such as 
> Presto/Trino/Spark:
> [https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html]
> I have investigated the syntax of the engines mentioned above.
>  
> We could use such a statement to easily show the tables of specified databases 
> without switching databases frequently; we could also use a regexp pattern to 
> quickly find the tables of interest among plenty of tables. Besides, the new 
> statement is completely compatible with the old one, so users can use `show 
> tables` as before.
> h3. SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]
>  
> I have implemented the syntax, demo as below:
> {code:java}
> Flink SQL> create database d1;
> [INFO] Execute statement succeed.
> Flink SQL> create table d1.b1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table t1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table m1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> show tables like 'm%';
> ++
> | table name |
> ++
> |         m1 |
> ++
> 1 row in set
> Flink SQL> show tables from d1 like 'b%';
> ++
> | table name |
> ++
> |         b1 |
> ++
> 1 row in set{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] liyubin117 opened a new pull request #18361: [FLINK-25631][table] Support enhanced `show tables` syntax

2022-01-13 Thread GitBox


liyubin117 opened a new pull request #18361:
URL: https://github.com/apache/flink/pull/18361


   ## What is the purpose of the change
   
   Support enhanced `show tables` syntax like `show tables from catalog1.db1 
like 't%'`
   
   ## Brief change log
   
   * SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]
   
   ## Verifying this change
   
   Docs only change.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? yes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] paul8263 commented on pull request #18334: [BP-1.14][FLINK-22821][core] Stabilize NetUtils#getAvailablePort in order to a…

2022-01-13 Thread GitBox


paul8263 commented on pull request #18334:
URL: https://github.com/apache/flink/pull/18334#issuecomment-1012787097


   Hi @zentol ,
   
   Thank you very much for your reply. I have tested Netty's ServerBootstrap, and 
it turns out that if we bind 0 as the port number, the server will 
choose a random unallocated port. I agree with you that for Netty-related tests 
the file lock could be simplified by assigning port 0 to the NettyServer, as the 
file lock implementation is more cumbersome.
   
   I think we may need to discuss whether it makes sense to keep using the 
uniform logic of calling NetUtils#getAvailablePort. 
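
   For reference, a minimal sketch of the port-0 behavior (plain `java.net`, 
nothing Flink- or Netty-specific):

```java
import java.net.ServerSocket;

public class PortZeroExample {
    public static void main(String[] args) throws Exception {
        // Binding to port 0 asks the OS to pick a free ephemeral port.
        try (ServerSocket socket = new ServerSocket(0)) {
            // The chosen port is only known after the bind.
            System.out.println("Bound to port " + socket.getLocalPort());
        }
    }
}
```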


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25483) When FlinkSQL writes ES, it will not write and update the null value field

2022-01-13 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475943#comment-17475943
 ] 

Martijn Visser commented on FLINK-25483:


[~empcl] I understand that. What I mean is that with that given SQL statement, 
the user explicitly doesn't take into account that a null value could end up in 
the result. 

I disagree with the statement "Users do not want our null values to affect the 
original records in Elasticsearch". There are users that do want this to happen.

Like I said before, if you have this actual requirement, I think the best way 
to solve it is to have a custom Elasticsearch sink that you modify to meet your 
requirement. This should not end up in Flink. 

> When FlinkSQL writes ES, it will not write and update the null value field
> --
>
> Key: FLINK-25483
> URL: https://issues.apache.org/jira/browse/FLINK-25483
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Ecosystem
>Reporter: 陈磊
>Priority: Minor
>
> When using Flink SQL to consume from Kafka and write to ES, sometimes some 
> fields do not exist, and we do not want to write the missing fields to ES. 
> How should this situation be handled?
> For example: the source data has 3 fields, a, b, c
> insert into table2
> select
> a,b,c
> from table1
> When b=null, we only want to write a and c
> When c=null, we only want to write a and b
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] MartijnVisser commented on pull request #15140: [FLINK-20628][connectors/rabbitmq2] RabbitMQ connector using new connector API

2022-01-13 Thread GitBox


MartijnVisser commented on pull request #15140:
URL: https://github.com/apache/flink/pull/15140#issuecomment-1012787224


   @SteNicholas Sounds good. Keep in mind that we might not want to merge in 
new connectors in 1.16, depending on how quickly the external connector 
discussion progresses and the building blocks are delivered. It could 
actually be that we start moving connectors out in 1.16. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25420) Port JDBC Source to new Source API (FLIP-27)

2022-01-13 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475942#comment-17475942
 ] 

Martijn Visser commented on FLINK-25420:


[~RocMarshal] I'm not sure if a FLIP is needed (in the end, it should be just 
an implementation of Flink's APIs; it's not introducing/breaking anything in 
Flink). Outlining it in this Jira could be sufficient, or if you want to 
validate the implementation steps, another option could be to have a [DISCUSS] 
thread on the Dev mailing list to validate those implementation steps.

> Port JDBC Source to new Source API (FLIP-27)
> 
>
> Key: FLINK-25420
> URL: https://issues.apache.org/jira/browse/FLINK-25420
> Project: Flink
>  Issue Type: Improvement
>Reporter: Martijn Visser
>Priority: Major
>
> The current JDBC connector is using the old SourceFunction interface, which 
> is going to be deprecated. We should port/refactor the JDBC Source to use the 
> new Source API, based on FLIP-27 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot commented on pull request #18360: [FLINK-25329][runtime] Support memory execution graph store in session cluster

2022-01-13 Thread GitBox


flinkbot commented on pull request #18360:
URL: https://github.com/apache/flink/pull/18360#issuecomment-1012784974


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit ff1d1019b6b2d674dd699d398299233dea6cf93e (Fri Jan 14 
05:19:58 UTC 2022)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-25329).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25329) Improvement of execution graph store in flink session cluster for jobs

2022-01-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25329:
---
Labels: pull-request-available  (was: )

> Improvement of execution graph store in flink session cluster for jobs
> --
>
> Key: FLINK-25329
> URL: https://issues.apache.org/jira/browse/FLINK-25329
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Shammon
>Priority: Major
>  Labels: pull-request-available
>
> The Flink session cluster uses files to store info about jobs after they reach 
> termination with `FileExecutionGraphInfoStore`; each job generates one 
> file. When the cluster executes many small jobs concurrently, there will be 
> many disk-related operations, which will
> 1> Increase the CPU usage of `Dispatcher`
> 2> Decrease the performance of the jobs in the cluster.
> We hope to improve the disk operations in `FileExecutionGraphInfoStore` to 
> increase the performance of the session cluster, or support a memory store.
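
A sketch of what a size- and age-bounded in-memory store could look like (an 
assumption for illustration, e.g. built on Guava's cache; not necessarily the 
PR's actual implementation):

{code:java}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import java.time.Duration;
import java.util.concurrent.TimeUnit;

// K/V stand in for Flink's JobID and ExecutionGraphInfo types here.
class MemoryGraphStoreSketch<K, V> {
    private final Cache<K, V> cache;

    MemoryGraphStoreSketch(long maximumCapacity, Duration expirationTime) {
        this.cache = CacheBuilder.newBuilder()
                // Bound the number of retained execution graphs.
                .maximumSize(maximumCapacity)
                // Drop entries that have outlived the configured TTL.
                .expireAfterWrite(expirationTime.toMillis(), TimeUnit.MILLISECONDS)
                .build();
    }

    void put(K jobId, V graphInfo) {
        cache.put(jobId, graphInfo);
    }

    V get(K jobId) {
        return cache.getIfPresent(jobId);
    }
}
{code}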



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] zjureel opened a new pull request #18360: [FLINK-25329][runtime] Support memory execution graph store in session cluster

2022-01-13 Thread GitBox


zjureel opened a new pull request #18360:
URL: https://github.com/apache/flink/pull/18360


   ## What is the purpose of the change
   
   This PR aims to support memory execution graph store in session cluster
   
   ## Brief change log
 - *Improve MemoryExecutionGraphInfoStore to support expiration time and 
maximum capacity*
 - *Support memory and file execution graph store in 
SessionClusterEntrypoint*
   
   ## Verifying this change
   This change added tests and can be verified as follows:
 - *Add `MemoryExecutionGraphInfoStoreTest` that tests the methods in 
`MemoryExecutionGraphInfoStore`*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no) no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no) no
 - The serializers: (yes / no / don't know) no
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know) no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) 
no
 - The S3 file system connector: (yes / no / don't know) no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no) no
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25584) Azure failed on install node and npm for Flink : Runtime web

2022-01-13 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475937#comment-17475937
 ] 

Martijn Visser commented on FLINK-25584:


+1

> Azure failed on install node and npm for Flink : Runtime web
> 
>
> Key: FLINK-25584
> URL: https://issues.apache.org/jira/browse/FLINK-25584
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Runtime / Web Frontend
>Affects Versions: 1.15.0, 1.14.2
>Reporter: Yun Gao
>Priority: Critical
>  Labels: test-stability
>
> {code:java}
> [INFO] --- frontend-maven-plugin:1.11.0:install-node-and-npm (install node 
> and npm) @ flink-runtime-web ---
> [INFO] Installing node version v12.14.1
> [INFO] Downloading 
> https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.gz to 
> /__w/1/.m2/repository/com/github/eirslett/node/12.14.1/node-12.14.1-linux-x64.tar.gz
> [INFO] No proxies configured
> [INFO] No proxy was configured, downloading directly
> [INFO] 
> 
> [INFO] Reactor Summary:
> 
> [ERROR] Failed to execute goal 
> com.github.eirslett:frontend-maven-plugin:1.11.0:install-node-and-npm 
> (install node and npm) on project flink-runtime-web: Could not download 
> Node.js: Could not download 
> https://nodejs.org/dist/v12.14.1/node-v12.14.1-linux-x64.tar.gz: Remote host 
> terminated the handshake: SSL peer shut down incorrectly -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :flink-runtime-web
> {code}
> (The Maven reactor summary is omitted)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] MartijnVisser commented on pull request #18257: [FLINK-25504][Kafka] Upgrade Kafka Client to 2.8.1 and Confluent Platform to 6.2.2

2022-01-13 Thread GitBox


MartijnVisser commented on pull request #18257:
URL: https://github.com/apache/flink/pull/18257#issuecomment-1012779888


   NB: Flinkbot reports the CI as still `PENDING`, but it has completed 
successfully. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] MartijnVisser commented on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2022-01-13 Thread GitBox


MartijnVisser commented on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-1012774822


   @yangjunhan I'm not sure if it's better to make it a separate PR (since we 
also have https://issues.apache.org/jira/browse/FLINK-24768 open for this) or 
if we should do it in a separate commit. Let's see what @Airblader says on 
that. It would be great to have this update in place!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] yangjunhan commented on pull request #17711: [FLINK-24737][runtime-web] Update outdated web dependencies

2022-01-13 Thread GitBox


yangjunhan commented on pull request #17711:
URL: https://github.com/apache/flink/pull/17711#issuecomment-1012727601


   > @yangjunhan I apologize for the late response; I was on vacation over the 
holidays.
   > 
   > The changes overall look fine to me, except for the open point around the 
linting output. I think the solution you proposed sounds good, at least I can't 
think of anything better either.
   
   @Airblader Hope you enjoyed your holidays. Since NG-ZORRO-antd recently 
released its new version, I am considering bumping Runtime-Web directly to 
Angular v13 and NG-ZORRO-antd v13 in this PR, if that is ok with you and 
@MartijnVisser .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #18359: [FLINK-25484][connectors/filesystem] Support inactivityInterval config in table api

2022-01-13 Thread GitBox


flinkbot commented on pull request #18359:
URL: https://github.com/apache/flink/pull/18359#issuecomment-1012727099


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 111d05bc4d409c357223c238e67bcb34880a1951 (Fri Jan 14 
03:53:33 UTC 2022)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25484) TableRollingPolicy do not support inactivityInterval config which is supported in datastream api

2022-01-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25484:
---
Labels: pull-request-available  (was: )

> TableRollingPolicy do not support inactivityInterval config which is 
> supported in datastream api
> 
>
> Key: FLINK-25484
> URL: https://issues.apache.org/jira/browse/FLINK-25484
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Table SQL / Ecosystem
>Affects Versions: 1.15.0
>Reporter: LiChang
>Assignee: LiChang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
> Attachments: image-2022-01-13-11-36-24-102.png
>
>
> TableRollingPolicy does not support the inactivityInterval config:
> {code:java}
> public static class TableRollingPolicy extends CheckpointRollingPolicy {
>     private final boolean rollOnCheckpoint;
>     private final long rollingFileSize;
>     private final long rollingTimeInterval;
>
>     public TableRollingPolicy(
>             boolean rollOnCheckpoint, long rollingFileSize, long rollingTimeInterval) {
>         this.rollOnCheckpoint = rollOnCheckpoint;
>         Preconditions.checkArgument(rollingFileSize > 0L);
>         Preconditions.checkArgument(rollingTimeInterval > 0L);
>         this.rollingFileSize = rollingFileSize;
>         this.rollingTimeInterval = rollingTimeInterval;
>     }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] lichang-bd opened a new pull request #18359: [FLINK-25484][connectors/filesystem] Support inactivityInterval config in table api #18344

2022-01-13 Thread GitBox


lichang-bd opened a new pull request #18359:
URL: https://github.com/apache/flink/pull/18359


   Change-Id: Ica8ba43307476198c4f525ac40a54ab862b58ea7
   
   
   
   ## What is the purpose of the change
   Support the inactivityInterval config in the Table API
   
   ## Brief change log
 - *Add SINK_ROLLING_POLICY_INACTIVITY_INTERVAL table config*
 - *Support file inactivity check when rolling, which is already 
supported in the DataStream API (see the sketch below)*
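   
   A sketch of the DataStream-side counterpart (using the long-millisecond 
builder methods of `DefaultRollingPolicy` from the 1.14 API; the values are 
placeholders):
   
   ```java
   import java.util.concurrent.TimeUnit;
   
   import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.DefaultRollingPolicy;
   
   class RollingPolicySketch {
       static DefaultRollingPolicy<Object, String> policy() {
           return DefaultRollingPolicy.builder()
                   .withRolloverInterval(TimeUnit.MINUTES.toMillis(15))
                   // Roll the in-progress part file after 5 minutes without new records.
                   .withInactivityInterval(TimeUnit.MINUTES.toMillis(5))
                   .withMaxPartSize(128L * 1024L * 1024L)
                   .build();
       }
   }
   ```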
   
   ## Verifying this change
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Documentation
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-25312) Hive supports managed table

2022-01-13 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-25312:


Assignee: Jane Chan

> Hive supports managed table
> ---
>
> Key: FLINK-25312
> URL: https://issues.apache.org/jira/browse/FLINK-25312
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Assignee: Jane Chan
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25312) Hive supports managed table

2022-01-13 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475921#comment-17475921
 ] 

Jingsong Lee commented on FLINK-25312:
--

[~qingyue] Thanks!

> Hive supports managed table
> ---
>
> Key: FLINK-25312
> URL: https://issues.apache.org/jira/browse/FLINK-25312
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Assignee: Jane Chan
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25651) Flink1.14.2 DataStream Connectors Kafka Deserializer example method uses the wrong parameter

2022-01-13 Thread shouzuo meng (Jira)
shouzuo meng created FLINK-25651:


 Summary: Flink1.14.2 DataStream Connectors Kafka Deserializer 
example method uses the wrong parameter
 Key: FLINK-25651
 URL: https://issues.apache.org/jira/browse/FLINK-25651
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.14.2
Reporter: shouzuo meng
 Attachments: deserializer.png

The official documentation's DataStream Connectors Kafka Deserializer section 
introduces KafkaRecordDeserializationSchema.valueOnly but uses the wrong 
parameters in its example.
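
For reference, a sketch of a correct call, assuming the 1.14 Kafka connector 
API (broker address and topic are placeholders):

{code:java}
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.kafka.common.serialization.StringDeserializer;

class ValueOnlySketch {
    static KafkaSource<String> source() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("input-topic")
                // valueOnly takes a Kafka Deserializer class (or a Flink
                // DeserializationSchema), nothing else.
                .setDeserializer(
                        KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
                .build();
    }
}
{code}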



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot commented on pull request #18358: Update kafka.md

2022-01-13 Thread GitBox


flinkbot commented on pull request #18358:
URL: https://github.com/apache/flink/pull/18358#issuecomment-1012712193


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit f081bab4fe89823da05cbad2f0be0ebae668e310 (Fri Jan 14 
03:17:11 UTC 2022)
   
   **Warnings:**
* Documentation files were touched, but no `docs/content.zh/` files: Update 
Chinese documentation or file Jira ticket.
* **Invalid pull request title: No valid Jira ID provided**
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] jelly-1203 opened a new pull request #18358: Update kafka.md

2022-01-13 Thread GitBox


jelly-1203 opened a new pull request #18358:
URL: https://github.com/apache/flink/pull/18358


   Modify the parameters of the `KafkaRecordDeserializationSchema.valueOnly` method
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluser with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25634) flink-read-onyarn-configuration

2022-01-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475910#comment-17475910
 ] 

宇宙先生 commented on FLINK-25634:
--

thanks for review, I can't 

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png, 
> image-2022-01-13-16-33-24-459.png, image-2022-01-13-16-33-46-046.png, 
> image-2022-01-13-16-35-37-432.png, image-2022-01-13-19-17-01-097.png, 
> image-2022-01-13-19-17-10-731.png, image-2022-01-13-19-17-15-511.png
>
>
> in flink-src.code :
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high
>  availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case
>  
> in my cluster,the number of retries for failed YARN ApplicationMasters is 2
> yarn's configuration  also like this
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails,
> !image-2022-01-13-10-10-44-945.png!
> I would like to know the reason why the configuration is not taking effect.
> sincere thanks!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] lichang-bd closed pull request #18357: Flink 25484

2022-01-13 Thread GitBox


lichang-bd closed pull request #18357:
URL: https://github.com/apache/flink/pull/18357


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #18357: Flink 25484

2022-01-13 Thread GitBox


flinkbot commented on pull request #18357:
URL: https://github.com/apache/flink/pull/18357#issuecomment-1012700380


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 111d05bc4d409c357223c238e67bcb34880a1951 (Fri Jan 14 
02:46:36 UTC 2022)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **Invalid pull request title: No valid Jira ID provided**
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] lichang-bd opened a new pull request #18357: Flink 25484

2022-01-13 Thread GitBox


lichang-bd opened a new pull request #18357:
URL: https://github.com/apache/flink/pull/18357


   Change-Id: Ica8ba43307476198c4f525ac40a54ab862b58ea7
   
   
   
   ## What is the purpose of the change
   Support inactivityInterval config in table api
   
   ## Brief change log
 - *Add SINK_ROLLING_POLICY_INACTIVITY_INTERVAL table config*
 - *Support the file inactivity check when rolling, which is already 
supported in the DataStream API*
   
   ## Verifying this change
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Documentation
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] lichang-bd closed pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-13 Thread GitBox


lichang-bd closed pull request #18344:
URL: https://github.com/apache/flink/pull/18344


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25329) Improvement of execution graph store in flink session cluster for jobs

2022-01-13 Thread Shammon (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shammon updated FLINK-25329:

Description: 
Flink session cluster uses files to store info of jobs after they reach 
termination with `FileExecutionGraphInfoStore`, each job will generate one 
file. When the cluster executes many small jobs concurrently, there will be 
many disk related operations, which will

1> Increase the CPU usage of `Dispatcher`
2> Decrease the performance of the jobs in the cluster.

We hope to improve the disk operations in `FileExecutionGraphInfoStore` to 
increase the performance of session cluster, or support memory store.

  was:
Flink session cluster uses files to store info of jobs after they reach 
termination with `FileExecutionGraphInfoStore`, each job will generate one 
file. When the cluster executes many small jobs concurrently, there will be 
many disk related operations, which will

1> Increase the CPU usage of `Dispatcher`
2> Decrease the performance of the jobs in the cluster.

We hope to improve the disk operations in `FileExecutionGraphInfoStore` to 
increase the performance of session cluster.


> Improvement of execution graph store in flink session cluster for jobs
> --
>
> Key: FLINK-25329
> URL: https://issues.apache.org/jira/browse/FLINK-25329
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Shammon
>Priority: Major
>
> Flink session cluster uses files to store info of jobs after they reach 
> termination with `FileExecutionGraphInfoStore`, each job will generate one 
> file. When the cluster executes many small jobs concurrently, there will be 
> many disk related operations, which will
> 1> Increase the CPU usage of `Dispatcher`
> 2> Decrease the performance of the jobs in the cluster.
> We hope to improve the disk operations in `FileExecutionGraphInfoStore` to 
> increase the performance of session cluster, or support memory store.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (FLINK-25329) Improvement of execution graph store in flink session cluster for jobs

2022-01-13 Thread Shammon (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shammon updated FLINK-25329:

Summary: Improvement of execution graph store in flink session cluster for 
jobs  (was: Improvement of disk operations in flink session cluster for jobs)

> Improvement of execution graph store in flink session cluster for jobs
> --
>
> Key: FLINK-25329
> URL: https://issues.apache.org/jira/browse/FLINK-25329
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Shammon
>Priority: Major
>
> Flink session cluster uses files to store info of jobs after they reach 
> termination with `FileExecutionGraphInfoStore`, each job will generate one 
> file. When the cluster executes many small jobs concurrently, there will be 
> many disk related operations, which will
> 1> Increase the CPU usage of `Dispatcher`
> 2> Decrease the performance of the jobs in the cluster.
> We hope to improve the disk operations in `FileExecutionGraphInfoStore` to 
> increase the performance of session cluster.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] liuyongvs commented on pull request #17777: [FLINK-24886][core] TimeUtils supports the form of m.

2022-01-13 Thread GitBox


liuyongvs commented on pull request #17777:
URL: https://github.com/apache/flink/pull/17777#issuecomment-1012689676


   hi @tillrohrmann @Thesharing, do you have time to take a look? Sorry to @ you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25614) Let LocalWindowAggregate be chained with upstream

2022-01-13 Thread Q Kang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475897#comment-17475897
 ] 

Q Kang commented on FLINK-25614:


Hi [~wenlong.lwl], I have fixed `SlicingWindowAggOperatorTest`, but there are 
still many other places heavily affected by this change. Maybe we should 
consult [~jark] & [~jingzhang] for details about the reason to use Long instead 
of TimestampData.

> Let LocalWindowAggregate be chained with upstream
> -
>
> Key: FLINK-25614
> URL: https://issues.apache.org/jira/browse/FLINK-25614
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Affects Versions: 1.14.2
>Reporter: Q Kang
>Priority: Minor
>  Labels: pull-request-available
>
> When enabling two-phase aggregation (local-global) strategy for Window TVF, 
> the physical plan is shown as follows:
> {code:java}
> TableSourceScan -> Calc -> WatermarkAssigner -> Calc
> ||
> || [FORWARD]
> ||
> LocalWindowAggregate
> ||
> || [HASH]
> ||
> GlobalWindowAggregate
> ||
> ||
> ...{code}
> We can let the `LocalWindowAggregate` node be chained with upstream operators 
> in order to improve efficiency, just like the non-windowing counterpart 
> `LocalGroupAggregate`.
>  
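
For context, a minimal sketch of how the two-phase (local-global) strategy above is typically switched on, assuming the `table.optimizer.agg-phase-strategy` option (everything else is illustrative):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

// Ask the planner to split (window) aggregations into local + global phases.
TableEnvironment tEnv =
        TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());
tEnv.getConfig().getConfiguration()
        .setString("table.optimizer.agg-phase-strategy", "TWO_PHASE");
{code}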



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25312) Hive supports managed table

2022-01-13 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475895#comment-17475895
 ] 

Jane Chan commented on FLINK-25312:
---

Hi, [~lzljs3620320], I'm interested in this subtask, and would you mind 
assigning it to me?

> Hive supports managed table
> ---
>
> Key: FLINK-25312
> URL: https://issues.apache.org/jira/browse/FLINK-25312
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] xintongsong commented on pull request #15599: [FLINK-11838][flink-gs-fs-hadoop] Create Google Storage file system with recoverable writer support

2022-01-13 Thread GitBox


xintongsong commented on pull request #15599:
URL: https://github.com/apache/flink/pull/15599#issuecomment-1012679021


   @galenwarren,
   The commit doesn't need to be too detailed, as more details can be found 
with the JIRA id. Something that briefly describes the major changes would be good 
enough.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zjureel commented on pull request #18303: [FLINK-25085][runtime] Add a scheduled thread pool for periodic tasks in RpcEndpoint

2022-01-13 Thread GitBox


zjureel commented on pull request #18303:
URL: https://github.com/apache/flink/pull/18303#issuecomment-1012656102


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2

2022-01-13 Thread GitBox


JingGe commented on pull request #18330:
URL: https://github.com/apache/flink/pull/18330#issuecomment-1012608343


   Thanks for your contributions!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on a change in pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2

2022-01-13 Thread GitBox


JingGe commented on a change in pull request #18330:
URL: https://github.com/apache/flink/pull/18330#discussion_r784393519



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableSummary.java
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.connector.sink2;
+
+import org.apache.flink.annotation.Experimental;
+
+import javax.annotation.Nullable;
+
+import java.util.OptionalLong;
+
+/**
+ * This class tracks the information about committables belonging to one 
checkpoint coming from one
+ * subtask.
+ *
+ * It is sent to down-stream consumers to depict the progress of the 
committing process.
+ *
+ * @param <CommT> type of the committable
+ */
+@Experimental
+public class CommittableSummary<CommT> implements CommittableMessage<CommT> {

Review comment:
   It seems that `<CommT>` is unused.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on a change in pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2

2022-01-13 Thread GitBox


JingGe commented on a change in pull request #18330:
URL: https://github.com/apache/flink/pull/18330#discussion_r784386592



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableMessageTypeInfo.java
##
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.connector.sink2;
+
+import org.apache.flink.annotation.Experimental;
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.common.typeutils.TypeSerializer;
+import org.apache.flink.api.connector.sink2.Committer;
+import org.apache.flink.api.connector.sink2.SinkWriter;
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+import org.apache.flink.core.io.SimpleVersionedSerializerTypeSerializerProxy;
+import org.apache.flink.util.function.SerializableSupplier;
+
+import java.util.Objects;
+
+/** The message send from {@link SinkWriter} to {@link Committer}. */
+@Experimental
+public class CommittableMessageTypeInfo<CommT> extends 
TypeInformation<CommittableMessage<CommT>> {
+
+    private final SerializableSupplier<SimpleVersionedSerializer<CommT>>
+            committableSerializerFactory;
+
+    private CommittableMessageTypeInfo(
+            SerializableSupplier<SimpleVersionedSerializer<CommT>> 
committableSerializerFactory) {
+        this.committableSerializerFactory = committableSerializerFactory;
+    }
+
+    /**
+     * Returns the type information based on the serializer for a {@link 
CommittableMessage}.
+     *
+     * @param committableSerializerFactory factory to create the serializer 
for a {@link
+     *     CommittableMessage}
+     * @param <CommT> type of the committable
+     * @return
+     */
+    public static <CommT>
+            TypeInformation<CommittableMessage<CommT>> 
forCommittableSerializerFactory(

Review comment:
   NIT:
   public static <CommT>
   TypeInformation<CommittableMessage<CommT>> 
for(SerializableSupplier<SimpleVersionedSerializer<CommT>>
   committableSerializerFactory)

##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableSummary.java
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.connector.sink2;
+
+import org.apache.flink.annotation.Experimental;
+
+import javax.annotation.Nullable;
+
+import java.util.OptionalLong;
+
+/**
+ * This class tracks the information about committables belonging to one 
checkpoint coming from one
+ * subtask.
+ *
+ * It is sent to down-stream consumers to depict the progress of the 
committing process.
+ *
+ * @param <CommT> type of the committable
+ */
+@Experimental
+public class CommittableSummary<CommT> implements CommittableMessage<CommT> {

Review comment:
   It seems that `<CommT>` is unused.

##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableMessageTypeInfo.java
##
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed

[jira] [Comment Edited] (FLINK-25646) Document buffer debloating issues with high parallelism

2022-01-13 Thread Mason Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475778#comment-17475778
 ] 

Mason Chen edited comment on FLINK-25646 at 1/13/22, 10:01 PM:
---

Hi [~akalashnikov], thanks for the documentation! Could you attach a reference 
to the benchmark? I'd like to take a look.

Is the observation with unaligned and/or aligned checkpoints?


was (Author: mason6345):
Hi [~akalashnikov], thanks for the documentation! Could you attach a reference 
to the benchmark? I'd like to take a look

> Document buffer debloating issues with high parallelism
> ---
>
> Key: FLINK-25646
> URL: https://issues.apache.org/jira/browse/FLINK-25646
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / Network
>Affects Versions: 1.14.0
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: pull-request-available
>
> According to last benchmarks, there are some problems with buffer debloat 
> when job has high parallelism. The high parallelism means the different value 
> from job to job but in general it is more than 200. So it makes sense to 
> document that problem and propose the solution - increasing the number of 
> buffers.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25646) Document buffer debloating issues with high parallelism

2022-01-13 Thread Mason Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475778#comment-17475778
 ] 

Mason Chen commented on FLINK-25646:


Hi [~akalashnikov], thanks for the documentation! Could you attach a reference 
to the benchmark? I'd like to take a look

> Document buffer debloating issues with high parallelism
> ---
>
> Key: FLINK-25646
> URL: https://issues.apache.org/jira/browse/FLINK-25646
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / Network
>Affects Versions: 1.14.0
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: pull-request-available
>
> According to last benchmarks, there are some problems with buffer debloat 
> when job has high parallelism. The high parallelism means the different value 
> from job to job but in general it is more than 200. So it makes sense to 
> document that problem and propose the solution - increasing the number of 
> buffers.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-21896) GroupAggregateJsonPlanTest.testDistinctAggCalls fail

2022-01-13 Thread Soumyajit Sahu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-21896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475743#comment-17475743
 ] 

Soumyajit Sahu commented on FLINK-21896:


[~godfreyhe] [~twalthr] This would break an upgrade from Flink 1.12 to 1.13, right?
If serialization was done using Flink 1.12 and the app was then redeployed using 
Flink 1.13, deserialization throws this error.

To mitigate, we had to delete our app's state and start the Flink app afresh.


java.io.InvalidClassException: 
org.apache.flink.table.types.logical.LogicalType; local class incompatible: 
stream classdesc serialVersionUID = -7381419642101800907, local class 
serialVersionUID = 1
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1850)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1850)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2160)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:615)
at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:593)
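
For illustration, a generic sketch (not Flink's actual class) of the rule at play: a class written into state via Java serialization must pin its serialVersionUID, otherwise the implicitly computed UID changes with recompilation and old streams become unreadable:

{code:java}
import java.io.Serializable;

// Sketch: the explicit constant keeps serialized instances readable across releases.
public abstract class MyStateType implements Serializable {
    private static final long serialVersionUID = 1L; // pinned, never changed again

    protected abstract String describe();
}
{code}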

> GroupAggregateJsonPlanTest.testDistinctAggCalls fail
> 
>
> Key: FLINK-21896
> URL: https://issues.apache.org/jira/browse/FLINK-21896
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.0
>Reporter: Guowei Ma
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.13.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15083&view=logs&j=f66801b3-5d8b-58b4-03aa-cc67e0663d23&t=1abe556e-1530-599d-b2c7-b8c00d549e53&l=6364
> {code:java}
>   at 
> org.apache.flink.table.planner.plan.nodes.exec.stream.GroupAggregateJsonPlanTest.testDistinctAggCalls(GroupAggregateJsonPlanTest.java:148)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats

2022-01-13 Thread GitBox


rkhachatryan commented on a change in pull request #18324:
URL: https://github.com/apache/flink/pull/18324#discussion_r784314617



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/SubtaskStateStats.java
##
@@ -85,6 +88,7 @@
 this.subtaskIndex = subtaskIndex;
 checkArgument(stateSize >= 0, "Negative state size");
 this.stateSize = stateSize;
+this.incrementalStateSize = incrementalStateSize;

Review comment:
   Add `checkState` similar to `stateSize`?
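   
   A minimal sketch of the suggested guard (hypothetical constructor excerpt; field declarations are elided, and whether `checkArgument` or `checkState` fits better is exactly the question above):
   
   ```java
   import static org.apache.flink.util.Preconditions.checkArgument;
   
   SubtaskStateStats(int subtaskIndex, long stateSize, long incrementalStateSize) {
       this.subtaskIndex = subtaskIndex;
       checkArgument(stateSize >= 0, "Negative state size");
       this.stateSize = stateSize;
       // mirrors the existing stateSize check for the new field
       checkArgument(incrementalStateSize >= 0, "Negative incremental state size");
       this.incrementalStateSize = incrementalStateSize;
   }
   ```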




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats

2022-01-13 Thread GitBox


rkhachatryan commented on a change in pull request #18324:
URL: https://github.com/apache/flink/pull/18324#discussion_r784312461



##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java
##
@@ -368,19 +372,34 @@ public boolean 
deregisterKeySelectionListener(KeySelectionListener<K> listener)
 // collections don't change once started and handles are immutable
 List prevDeltaCopy =
 new 
ArrayList<>(changelogStateBackendStateCopy.getRestoredNonMaterialized());
+long incrementalMaterializeSize = 0L;
 if (delta != null && delta.getStateSize() > 0) {
 prevDeltaCopy.add(delta);
+incrementalMaterializeSize += delta.getIncrementalStateSize();
 }
 
 if (prevDeltaCopy.isEmpty()
 && 
changelogStateBackendStateCopy.getMaterializedSnapshot().isEmpty()) {
 return SnapshotResult.empty();
 } else {
+List<KeyedStateHandle> materializedSnapshot =
+changelogStateBackendStateCopy.getMaterializedSnapshot();
+for (KeyedStateHandle keyedStateHandle : materializedSnapshot) {
+if (!lastCompletedHandles.contains(keyedStateHandle)) {
+incrementalMaterializeSize += 
keyedStateHandle.getStateSize();

Review comment:
   (ditto [1st 
comment](https://github.com/apache/flink/pull/18324#discussion_r784298983)): I 
think we should NOT count materialized state size in incremental state size




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats

2022-01-13 Thread GitBox


rkhachatryan commented on a change in pull request #18324:
URL: https://github.com/apache/flink/pull/18324#discussion_r784309007



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/StateObject.java
##
@@ -63,4 +63,14 @@
  * @return Size of the state in bytes.
  */
 long getStateSize();
+
+/**
+ * Returns the incremental state size in bytes. If the size is unknown, 
this method would return
+ * same result as {@link #getStateSize()}.
+ *
+ * @return Size of incremental state in bytes.
+ */
+default long getIncrementalStateSize() {
+return getStateSize();
+}

Review comment:
   1. Conceptually, incremental state is only relevant to 
`CompositeStateHandle` (which defines `registerSharedStates` method). WDYT 
about moving this method there?
   2. Then we could force non-default implementation
   3. In javadoc, could you clarify what "incremental" means (as per [comment 
above](https://github.com/apache/flink/pull/18324#discussion_r784298983))
   4. In javadoc, could you clarify the relation to channel state? Or maybe in 
some other place, like `OperatorSubtaskState`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats

2022-01-13 Thread GitBox


rkhachatryan commented on a change in pull request #18324:
URL: https://github.com/apache/flink/pull/18324#discussion_r784305375



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/IncrementalRemoteKeyedStateHandle.java
##
@@ -204,6 +203,21 @@ public long getStateSize() {
 return size;
 }
 
+@Override
+public long getIncrementalStateSize() {
+long size = StateUtil.getStateSize(metaStateHandle);
+
+for (StreamStateHandle sharedStateHandle : sharedState.values()) {
+size += sharedStateHandle.getIncrementalStateSize();

Review comment:
   I guess this only works because 
`PlaceholderStreamStateHandle.getIncrementalStateSize` returns `0`, right?
   But the backend isn't required to return a placeholder IMO; in fact, it currently 
doesn't - without FLINK-25395 / #18297 (in the future, the latter PR quite 
likely will be reverted, I think).
   
   WDYT about computing incremental state size externally (in 
`RocksIncrementalSnapshotStrategy`) and storing it in metadata?
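   
   A hedged sketch of that alternative (hypothetical helper inside `RocksIncrementalSnapshotStrategy`; the map of newly uploaded sst files is assumed to be available there):
   
   ```java
   private static long computeIncrementalSize(
           StreamStateHandle metaStateHandle,
           Map<StateHandleID, StreamStateHandle> newlyUploadedSstFiles) {
       // Count only the meta file plus files actually transferred for this snapshot;
       // files reused from earlier checkpoints are excluded by construction.
       long size = metaStateHandle.getStateSize();
       for (StreamStateHandle handle : newlyUploadedSstFiles.values()) {
           size += handle.getStateSize();
       }
       return size;
   }
   ```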




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats

2022-01-13 Thread GitBox


rkhachatryan commented on a change in pull request #18324:
URL: https://github.com/apache/flink/pull/18324#discussion_r784298983



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateBackendHandle.java
##
@@ -113,6 +124,19 @@ public long getStateSize() {
 + 
nonMaterialized.stream().mapToLong(StateObject::getStateSize).sum();
 }
 
+@Override
+public long getIncrementalStateSize() {
+long incrementalStateSize =
+incrementalMaterializeSize == 
undefinedIncrementalMaterializeSize
+? materialized.stream()
+
.mapToLong(StateObject::getIncrementalStateSize)
+.sum()
+: incrementalMaterializeSize;

Review comment:
   Depending on how we define "incremental state size", materialized part 
should be included or not:
   1. if it's everything that was uploaded for **this** checkpoint, then it 
should
   1. if it's the difference from the previous checkpoint, it should **not** be 
included
   
   Right?
   
   It seems problematic to find out what exactly was uploaded for **this** 
checkpoint because multiple checkpoints will likely include the same 
materialized state, and therefore report the same incremental state multiple 
times.
   Besides that, the 2nd option seems more intuitive to me personally.
   
   WDYT?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rkhachatryan commented on a change in pull request #18324: [FLINK-25557][checkpoint] Introduce incremental/full checkpoint size stats

2022-01-13 Thread GitBox


rkhachatryan commented on a change in pull request #18324:
URL: https://github.com/apache/flink/pull/18324#discussion_r784301442



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateBackendHandle.java
##
@@ -50,17 +50,28 @@
 
 class ChangelogStateBackendHandleImpl implements 
ChangelogStateBackendHandle {
 private static final long serialVersionUID = 1L;
+private static final long undefinedIncrementalMaterializeSize = -1L;

Review comment:
   Why don't we store this size in checkpoint metadata? So that we can get 
rid of unknown size and show the correct size after recovery?
   
   Nit: `UNDEFINED_INCREMENTAL_MATERIALIZE_SIZE` ?

##
File path: 
flink-runtime-web/web-dashboard/src/app/pages/job/checkpoints/detail/job-checkpoints-detail.component.html
##
@@ -45,12 +45,13 @@
   >
 
   
-
+
 Name
 Acknowledged
 Latest Acknowledgment
 End to End Duration
-Checkpointed Data Size
+Incremental Checkpoint Data Size
+Full Checkpoint Data Size

Review comment:
   WDYT about adding a tooltip here and/or in the other added `<th>` tags? (maybe 
copying the javadoc).
   I think the `title` attribute should work.

##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java
##
@@ -368,19 +372,34 @@ public boolean 
deregisterKeySelectionListener(KeySelectionListener<K> listener)
 // collections don't change once started and handles are immutable
 List prevDeltaCopy =
 new 
ArrayList<>(changelogStateBackendStateCopy.getRestoredNonMaterialized());
+long incrementalMaterializeSize = 0L;
 if (delta != null && delta.getStateSize() > 0) {
 prevDeltaCopy.add(delta);
+incrementalMaterializeSize += delta.getIncrementalStateSize();

Review comment:
   :+1: 
   We should **not** count `prevDeltaCopy.getIncrementalStateSize()` in 
`incrementalMaterializeSize`.
   
   Could you add a comment that it's intentional?

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateBackendHandle.java
##
@@ -113,6 +124,19 @@ public long getStateSize() {
 + 
nonMaterialized.stream().mapToLong(StateObject::getStateSize).sum();
 }
 
+@Override
+public long getIncrementalStateSize() {
+long incrementalStateSize =
+incrementalMaterializeSize == 
undefinedIncrementalMaterializeSize
+? materialized.stream()
+
.mapToLong(StateObject::getIncrementalStateSize)
+.sum()
+: incrementalMaterializeSize;

Review comment:
   Depending on how we define "incremental state size", materialized part 
should be included or not:
   1. if it's everything that was uploaded for **this** checkpoint, then it 
should
   1. if it's the difference from the previous checkpoint, it should **not** be 
included
   Right?
   
   And it's problematic to find out what exactly was uploaded for **this** 
checkpoint because multiple checkpoints will likely include the same 
materialized state, and therefore report the same incremental state multiple 
times.
   Besides that, the 2nd option seems more intuitive to me personally.
   
   WDYT?

##
File path: 
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java
##
@@ -368,19 +372,34 @@ public boolean 
deregisterKeySelectionListener(KeySelectionListener listener)
 // collections don't change once started and handles are immutable
 List prevDeltaCopy =
 new 
ArrayList<>(changelogStateBackendStateCopy.getRestoredNonMaterialized());
+long incrementalMaterializeSize = 0L;
 if (delta != null && delta.getStateSize() > 0) {
 prevDeltaCopy.add(delta);
+incrementalMaterializeSize += delta.getIncrementalStateSize();
 }
 
 if (prevDeltaCopy.isEmpty()
 && 
changelogStateBackendStateCopy.getMaterializedSnapshot().isEmpty()) {
 return SnapshotResult.empty();
 } else {
+List<KeyedStateHandle> materializedSnapshot =
+changelogStateBackendStateCopy.getMaterializedSnapshot();
+for (KeyedStateHandle keyedStateHandle : materializedSnapshot) {
+if (!lastCompletedHandles.contains(keyedStateHandle)) {
+incrementalMaterializeSize += 
keyedStateHandle.getStateSize();

Review comment:
   (ditto 1st comment): I think we should NOT count materialized state size 
in incremental state size

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogState

[jira] [Commented] (FLINK-25505) Fix NetworkBufferPoolTest, SystemResourcesCounterTest on Apple M1

2022-01-13 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475724#comment-17475724
 ] 

Robert Metzger commented on FLINK-25505:


For fixing the SystemResourcesCounterTest, we need to bump oshi-core to version 
5.5.0 (that's when the last fix mentioning M1 got in): 
https://github.com/oshi/oshi/blob/master/CHANGELOG.md

This requires us fixing some system metrics code.
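
For reference, a sketch of the kind of call site affected, assuming the oshi 5.x tick-based API (not Flink's actual code):

{code:java}
import oshi.SystemInfo;
import oshi.hardware.CentralProcessor;

// In oshi 5.x, CPU load is typically derived from deltas between tick snapshots.
SystemInfo systemInfo = new SystemInfo();
CentralProcessor processor = systemInfo.getHardware().getProcessor();
long[] previousTicks = processor.getSystemCpuLoadTicks();
// ... let some time pass ...
double cpuLoad = processor.getSystemCpuLoadBetweenTicks(previousTicks);
{code}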

> Fix NetworkBufferPoolTest, SystemResourcesCounterTest on Apple M1 
> --
>
> Key: FLINK-25505
> URL: https://issues.apache.org/jira/browse/FLINK-25505
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics, Runtime / Network
>Affects Versions: 1.15.0
>Reporter: Robert Metzger
>Priority: Major
>
> As discussed in https://issues.apache.org/jira/browse/FLINK-23230, some tests 
> in flink-runtime are not passing on M1 / Apple Silicon Macs.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] galenwarren commented on pull request #15599: [FLINK-11838][flink-gs-fs-hadoop] Create Google Storage file system with recoverable writer support

2022-01-13 Thread GitBox


galenwarren commented on pull request #15599:
URL: https://github.com/apache/flink/pull/15599#issuecomment-1012485195


   @xintongsong Sounds great! Many thanks for all your work and guidance on 
this project. Once we get the CI issues resolved, I'll squash everything so it 
will be ready to merge. You mentioned before using an appropriate commit 
message; would something based on the initial description in the PR work, i.e.
   
   > This PR adds RecoverableWriter support for Google Storage (GS), to allow 
writing to GS buckets from Flink's StreamingFileSink. To do this, we implement 
and register a FileSystem implementation for GS and then provide a 
RecoverableWriter implementation via createRecoverableWriter.
   
   ... or were you thinking of something more detailed?
   
   @zentol Thanks for your suggestions. I'm trying one other thing first -- 
when we first started this PR, the versions of Guava and some other 
dependencies in `flink-fs-hadoop-shaded` were older than what was required by 
some of the Google libraries we use; so, I had to exclude some dependencies in 
`flink-fs-hadoop-shaded` in favor of the Google-supplied ones. Since then, 
`flink-fs-hadoop-shaded` has been updated, and its versions seem fine. So I'm 
reversing things, excluding dependencies from `google-cloud-storage` and 
`gcs-connector` that are supplied by `flink-fs-hadoop-shaded`.  This does seem 
to have reduced the number of different grpc versions in the dependency tree, 
and I'm curious if it will affect our CI issues. 
   
   I'm waiting for the CI build to occur now. If this doesn't fix it, I'll 
follow through on your dependency-convergence suggestion.
   
   Last, I've excluded the ` org.checkerframework:checker-compat-qual`, ` 
org.checkerframework:checker-qual`, and 
`org.codehaus.mojo:animal-sniffer-annotations` libraries and removed their 
license files and entries in NOTICE.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] galenwarren commented on pull request #15599: [FLINK-11838][flink-gs-fs-hadoop] Create Google Storage file system with recoverable writer support

2022-01-13 Thread GitBox


galenwarren commented on pull request #15599:
URL: https://github.com/apache/flink/pull/15599#issuecomment-1012456967


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] codecholeric commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


codecholeric commented on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012456359


   Hey, I quickly looked over it (it's quite late for me already :wink:), but 
here are my 2 cents why things probably differ (as you already guessed in the 
last comments):
   
   For the case of `KafkaShuffleITCase` or `FlinkKafkaInternalProducerITCase` 
the `test-jar` goal is defined like this:
   
   ```xml
   <plugin>
       <groupId>org.apache.maven.plugins</groupId>
       <artifactId>maven-jar-plugin</artifactId>
       <executions>
           <execution>
               <goals>
                   <goal>test-jar</goal>
               </goals>
               <configuration>
                   <includes>
                       <include>**/KafkaTestEnvironment*</include>
                       <include>**/testutils/*</include>
                       <include>META-INF/LICENSE</include>
                       <include>META-INF/NOTICE</include>
                   </includes>
               </configuration>
           </execution>
       </executions>
   </plugin>
   ```
   
   AFAIK IntelliJ doesn't support this fine-grained configuration. It emulates 
Maven, and if there is a `<goal>test-jar</goal>` in one project and a 
`<type>test-jar</type>` dependency in another, then it will simply dump all 
classes compiled from `src/test/java` (i.e. `target/test-classes`) onto the 
classpath. I think that's why `KafkaShuffleITCase` doesn't appear when you run 
it with Maven (I also checked the contents of the `test-jar` archive and 
`KafkaShuffleITCase` is indeed not contained).
   
   I see 2 solutions:
   
   1) do you really need this optimization to only include a little bit of 
those classes in the test jars? Most other modules seem to simply dump 
everything in the test jars, which makes it simpler and the difference here 
should vanish (after all test jars are not released, right? Thus, maybe it's 
not really relevant to optimize for a couple KB more or less?)
   2) exclude those classes from the test, even though they are not tested with 
Maven (but then it also doesn't hurt)
   
   Theoretically there would be 3): don't go via the classpath, but compile 
everything and then just point the importer at the Flink root folder to look for 
`.class` files on the file system. But I wouldn't go this way if you can get around it.
   
   If there is any way, I would simply follow 1), because that's also the only 
way to really test all relevant classes in a downstream module.
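   
   For completeness, option 3) would roughly look like this (assumed ArchUnit API; the path is made up):
   
   ```java
   import com.tngtech.archunit.core.domain.JavaClasses;
   import com.tngtech.archunit.core.importer.ClassFileImporter;
   
   import java.nio.file.Paths;
   
   // Import compiled classes straight from the file system instead of the classpath.
   JavaClasses classes = new ClassFileImporter()
           .importPath(Paths.get("/path/to/flink")); // made-up root folder
   ```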


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on a change in pull request #18322: [FLINK-25589][docs][connectors] Update Chinese version of Elasticsearch connector docs

2022-01-13 Thread GitBox


JingGe commented on a change in pull request #18322:
URL: https://github.com/apache/flink/pull/18322#discussion_r784252017



##
File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md
##
@@ -294,131 +210,163 @@ input.addSink(esSinkBuilder.build)
 
 ### Elasticsearch Sinks 和容错
 
-启用 Flink checkpoint 后,Flink Elasticsearch Sink 保证至少一次将操作请求发送到 Elasticsearch 集群。
+默认情况下,Flink Elasticsearch Sink 不会提供任何传递健壮性的保障。
+用户可以通过配置自行启用 Elasticsearch sink 的 at-least-once 语义。

Review comment:
   ```suggestion
   用户可以选择启用 Elasticsearch sink 的 at-least-once 语义。
   ```

##
File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md
##
@@ -40,16 +40,12 @@ under the License.
 
   
   
-
-5.x
-{{< artifact flink-connector-elasticsearch5 >}}
-
 
 6.x

Review comment:
   ```suggestion
  <= 6.3.1
   ```

##
File path: docs/content.zh/docs/connectors/table/elasticsearch.md
##
@@ -192,31 +177,24 @@ CREATE TABLE myUserTable (
 
   sink.bulk-flush.backoff.max-retries
   可选
-  8
+  (none)
   Integer
   最大回退重试次数。
 
 
   sink.bulk-flush.backoff.delay
   可选
-  50ms
+  (none)
   Duration
   每次回退尝试之间的延迟。对于 CONSTANT 回退策略,该值是每次重试之间的延迟。对于 
EXPONENTIAL 回退策略,该值是初始的延迟。

Review comment:
   ```suggestion
 每次退避尝试之间的延迟。对于 CONSTANT 退避策略,该值是每次重试之间的延迟。对于 
EXPONENTIAL 退避策略,该值是初始的延迟。
   ```

##
File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md
##
@@ -40,16 +40,12 @@ under the License.
 
   
   
-
-5.x
-{{< artifact flink-connector-elasticsearch5 >}}
-
 
 6.x
 {{< artifact flink-connector-elasticsearch6 >}}
 
 
-7 及更高版本
+7.x

Review comment:
   ```suggestion
   <= 7.5.1
   ```

##
File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md
##
@@ -294,131 +210,163 @@ input.addSink(esSinkBuilder.build)
 
 ### Elasticsearch Sinks 和容错
 
-启用 Flink checkpoint 后,Flink Elasticsearch Sink 保证至少一次将操作请求发送到 Elasticsearch 集群。
+默认情况下,Flink Elasticsearch Sink 不会提供任何传递健壮性的保障。
+用户可以通过配置自行启用 Elasticsearch sink 的 at-least-once 语义。
+
+启用 Flink checkpoint 后,Flink Elasticsearch Sink 可以保证至少一次将操作请求发送到 Elasticsearch 
集群。

Review comment:
   ```suggestion
   通过启用 Flink checkpoint ,Flink Elasticsearch Sink 可以保证至少一次将操作请求发送到 
Elasticsearch 集群。
   ```

##
File path: docs/content.zh/docs/connectors/datastream/elasticsearch.md
##
@@ -294,131 +210,163 @@ input.addSink(esSinkBuilder.build)
 
 ### Elasticsearch Sinks 和容错
 
-启用 Flink checkpoint 后,Flink Elasticsearch Sink 保证至少一次将操作请求发送到 Elasticsearch 集群。
+默认情况下,Flink Elasticsearch Sink 不会提供任何传递健壮性的保障。
+用户可以通过配置自行启用 Elasticsearch sink 的 at-least-once 语义。
+
+启用 Flink checkpoint 后,Flink Elasticsearch Sink 可以保证至少一次将操作请求发送到 Elasticsearch 
集群。
 这是通过在进行 checkpoint 时等待 `BulkProcessor` 中所有挂起的操作请求来实现。
 这有效地保证了在触发 checkpoint 之前所有的请求被 Elasticsearch 成功确认,然后继续处理发送到 sink 的记录。
 
 关于 checkpoint 和容错的更多详细信息,请参见[容错文档]({{< ref "docs/learn-flink/fault_tolerance" 
>}})。
 
-要使用具有容错特性的 Elasticsearch Sinks,需要在执行环境中启用作业拓扑的 checkpoint:
+要使用具有容错特性的 Elasticsearch Sinks,需要配置启用 at-least-once 分发并且在执行环境中启用作业拓扑的 
checkpoint:
 
 {{< tabs "d00d1e93-4844-40d7-b0ec-9ec37e73145e" >}}
 {{< tab "Java" >}}
+Elasticsearch 6:
 ```java
 final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
 env.enableCheckpointing(5000); // 每 5000 毫秒执行一次 checkpoint
+
+Elasticsearch6SinkBuilder<String> sinkBuilder = new Elasticsearch6SinkBuilder<String>()
+.setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
+.setHosts(new HttpHost("127.0.0.1", 9200, "http"))
+.setEmitter(
+(element, context, indexer) -> 
+indexer.add(createIndexRequest(element)));
+```
+
+Elasticsearch 7:
+```java
+final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+env.enableCheckpointing(5000); // 每 5000 毫秒执行一次 checkpoint
+
+Elasticsearch7SinkBuilder<String> sinkBuilder = new Elasticsearch7SinkBuilder<String>()
+.setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
+.setHosts(new HttpHost("127.0.0.1", 9200, "http"))
+.setEmitter(
+(element, context, indexer) -> 
+indexer.add(createIndexRequest(element)));
 ```
 {{< /tab >}}
 {{< tab "Scala" >}}
+Elasticsearch 6:
 ```scala
 val env = StreamExecutionEnvironment.getExecutionEnvironment()
 env.enableCheckpointing(5000) // 每 5000 毫秒执行一次 checkpoint
+
+val sinkBuilder = new Elasticsearch6SinkBuilder[String]
+  .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
+  .setHosts(new HttpHost("127.0.0.1", 9200, "http"))
+  .setEmitter((element: String, context: SinkWriter.Context, indexer: 
RequestIndexer) =>
+  indexer.add(createIndexRequest(element)))
+```
+
+Elasticsearch 7:
+```scala
+val env = StreamExecutionEnvironment.getExecutionEnvironment()
+env.enableCheckpointing(5000) // 每 5000 毫秒执行一次 checkpoint
+
+val sinkBuilder = new Elasticse

[GitHub] [flink] rkhachatryan commented on a change in pull request #18297: [FLINK-25395][state] Fix TM error handling when uploading shared state

2022-01-13 Thread GitBox


rkhachatryan commented on a change in pull request #18297:
URL: https://github.com/apache/flink/pull/18297#discussion_r784256807



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/changelog/ChangelogStateHandleStreamImpl.java
##
@@ -83,11 +80,7 @@ public KeyedStateHandle getIntersection(KeyGroupRange 
keyGroupRange) {
 
 @Override
 public void discardState() throws Exception {
-// discard only on TM; on JM, shared state is removed on subsumption
-if (stateRegistry == null) {
-bestEffortDiscardAllStateObjects(
-handlesAndOffsets.stream().map(t2 -> 
t2.f0).collect(Collectors.toList()));
-}
+// do nothing: rely on SharedStateRegistry for cleanup after ACK;

Review comment:
   The in-memory object will be GCed on checkpoint subsumption;
   The actual handles will be discarded by `SharedStateRegistry` when no longer 
used (they are registered in `registerSharedStates`).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-25341) Improve casting STRUCTURED type to STRING

2022-01-13 Thread Timo Walther (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-25341.

Fix Version/s: 1.15.0
   Resolution: Fixed

Fixed in master: 0157aa3bf9cd7bce32cd4abb6fa7b6fdd1de2677

> Improve casting STRUCTURED type to STRING
> -
>
> Key: FLINK-25341
> URL: https://issues.apache.org/jira/browse/FLINK-25341
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Timo Walther
>Assignee: Francesco Guardiani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Structured types currently use ROW to STRING logic. However, this is not very 
> useful for users as the field order might be determined by Flink. Also, 
> structured types have the nice property of defining a custom {{toString}} and 
> attribute names.
> I would suggest the following:
> If the structured type has a {{StructuredType.getImplementationClass}}, 
> convert to the external class and call {{toString}}.
> If no implementation class is present or calling {{toString}} is not 
> possible, use the string representation of maps.
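
A minimal sketch of the suggested fallback order (hypothetical helper names, not the planner's actual cast rule):

```java
import java.util.Optional;

// Hypothetical sketch: prefer the user's toString() when an implementation
// class exists, otherwise fall back to a map-style rendering.
final class StructuredToStringSketch {

    static String castToString(Object externalValue, Optional<Class<?>> implementationClass) {
        if (implementationClass.isPresent() && externalValue != null) {
            // structured type with an implementation class: use the POJO's toString()
            return externalValue.toString();
        }
        // fallback: string representation of maps, e.g. {name=Bob, age=42}
        return renderAsMap(externalValue);
    }

    private static String renderAsMap(Object value) {
        // placeholder for the map-style rendering mentioned in the ticket
        return String.valueOf(value);
    }
}
```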



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] twalthr closed pull request #18182: [FLINK-25341][table-planner] Add StructuredToStringCastRule to support user POJO toString implementation

2022-01-13 Thread GitBox


twalthr closed pull request #18182:
URL: https://github.com/apache/flink/pull/18182


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on a change in pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe commented on a change in pull request #18333:
URL: https://github.com/apache/flink/pull/18333#discussion_r784026310



##
File path: 
flink-architecture-tests/src/test/java/org/apache/flink/architecture/common/JavaFieldPredicates.java
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.architecture.common;
+
+import com.tngtech.archunit.base.DescribedPredicate;
+import com.tngtech.archunit.core.domain.JavaAnnotation;
+import com.tngtech.archunit.core.domain.JavaField;
+import com.tngtech.archunit.core.domain.JavaModifier;
+
+import java.lang.annotation.Annotation;
+import java.util.Arrays;
+
+/** Fine-grained predicates focus on the JavaField. */
+public class JavaFieldPredicates {
+
+/**
+ * Match the public modifier of the {@link JavaField}.
+ *
+ * @return A {@link DescribedPredicate} returning true, if and only if the 
tested {@link
+ * JavaField} has the public modifier.
+ */
+public static DescribedPredicate<JavaField> isPublic() {
+return DescribedPredicate.describe(
+"public", field -> 
field.getModifiers().contains(JavaModifier.PUBLIC));
+}
+
+/**
+ * Match the static modifier of the {@link JavaField}.
+ *
+ * @return A {@link DescribedPredicate} returning true, if and only if the 
tested {@link
+ * JavaField} has the static modifier.
+ */
+public static DescribedPredicate<JavaField> isStatic() {
+return DescribedPredicate.describe(
+"static", field -> 
field.getModifiers().contains(JavaModifier.STATIC));
+}
+
+/**
+ * Match the non-static modifier of the {@link JavaField}.
+ *
+ * @return A {@link DescribedPredicate} returning true, if and only if the 
tested {@link
+ * JavaField} has no static modifier.
+ */
+public static DescribedPredicate<JavaField> isNotStatic() {
+return DescribedPredicate.describe(
+"not static", field -> 
!field.getModifiers().contains(JavaModifier.STATIC));
+}
+
+/**
+ * Match the final modifier of the {@link JavaField}.
+ *
+ * @return A {@link DescribedPredicate} returning true, if and only if the 
tested {@link
+ * JavaField} has the final modifier.
+ */
+public static DescribedPredicate<JavaField> isFinal() {
+return DescribedPredicate.describe(
+"final", field -> 
field.getModifiers().contains(JavaModifier.FINAL));
+}

Review comment:
   I think it is actually the other way around: there are too many combinations. 
Methods like `arePublicStaticFinalOfType` provide convenience for the most 
common scenarios, while these fine-grained methods can be composed for the 
other combinations, like Lego bricks; a sketch follows below.
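
   A sketch of such a composition, assuming the predicates above live in 
`JavaFieldPredicates` and return `DescribedPredicate<JavaField>`:

```java
import com.tngtech.archunit.base.DescribedPredicate;
import com.tngtech.archunit.core.domain.JavaField;

import static org.apache.flink.architecture.common.JavaFieldPredicates.isFinal;
import static org.apache.flink.architecture.common.JavaFieldPredicates.isPublic;
import static org.apache.flink.architecture.common.JavaFieldPredicates.isStatic;

// Composing the fine-grained building blocks into a new predicate.
class PredicateComposition {
    static final DescribedPredicate<JavaField> PUBLIC_STATIC_FINAL =
            isPublic().and(isStatic()).and(isFinal());
}
```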




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #18356: [hotfix][flink-annotations] Add v1.15 as the next Flink version to master

2022-01-13 Thread GitBox


flinkbot commented on pull request #18356:
URL: https://github.com/apache/flink/pull/18356#issuecomment-1012346346


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit a10aee84fae72f74cb17b2b9f736b8d04d6e6200 (Thu Jan 13 
17:22:12 UTC 2022)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] matriv opened a new pull request #18356: [hotfix][flink-annotations] Add v1.15 as the next Flink version to master

2022-01-13 Thread GitBox


matriv opened a new pull request #18356:
URL: https://github.com/apache/flink/pull/18356


   The upcoming version to be released needs to exist already in master so that
   when the new release branch is created from master, the version of the release
   is already there.
   
   Follows: #18340
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr commented on a change in pull request #18352: [FLINK-15585][table] Improve function identifier string in plan digest

2022-01-13 Thread GitBox


twalthr commented on a change in pull request #18352:
URL: https://github.com/apache/flink/pull/18352#discussion_r784160659



##
File path: 
flink-table/flink-table-common/src/test/java/org/apache/flink/table/functions/UserDefinedFunctionHelperTest.java
##
@@ -143,34 +162,27 @@ private void testErrorMessage() {
 
 @Nullable String expectedErrorMessage;
 
-TestSpec(Class functionClass) {
+TestSpec(
+@Nullable Class functionClass,
+@Nullable UserDefinedFunction functionInstance,
+@Nullable CatalogFunction catalogFunction) {
 this.functionClass = functionClass;
-this.functionInstance = null;
-this.catalogFunction = null;
-}
-
-TestSpec(UserDefinedFunction functionInstance) {
-this.functionClass = null;
 this.functionInstance = functionInstance;
-this.catalogFunction = null;
-}
-
-TestSpec(CatalogFunction catalogFunction) {
-this.functionClass = null;
-this.functionInstance = null;
 this.catalogFunction = catalogFunction;
 }
 
 static TestSpec forClass(Class 
function) {
-return new TestSpec(function);
+return new TestSpec(function, null, null);
 }
 
 static TestSpec forInstance(UserDefinedFunction function) {
-return new TestSpec(function);
+return new TestSpec(null, function, null);
 }
 
-static TestSpec forCatalogFunction(String className, FunctionLanguage 
language) {
-return new TestSpec(new CatalogFunctionMock(className, language));
+static TestSpec forCatalogFunction(
+String className,
+@SuppressWarnings("SameParameterValue") FunctionLanguage 
language) {

Review comment:
   You are right. I will simply always choose Java for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] twalthr commented on a change in pull request #18352: [FLINK-15585][table] Improve function identifier string in plan digest

2022-01-13 Thread GitBox


twalthr commented on a change in pull request #18352:
URL: https://github.com/apache/flink/pull/18352#discussion_r784159595



##
File path: 
flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/UserDefinedFunctionHelper.java
##
@@ -243,6 +244,36 @@ public static void prepareInstance(ReadableConfig config, 
UserDefinedFunction fu
 cleanFunction(config, function);
 }
 
+/**
+ * Returns whether a {@link UserDefinedFunction} can be easily serialized 
and identified by only
+ * a fully qualified class name. It must have a default constructor and no 
serializable fields.
+ */
+public static boolean isClassNameSerializable(UserDefinedFunction 
function) {
+final Class<?> functionClass = function.getClass();
+if (!InstantiationUtil.hasPublicNullaryConstructor(functionClass)) {
+// function must be parameterized
+return false;
+}
+Class<?> currentClass = functionClass;
+while (!currentClass.equals(UserDefinedFunction.class)) {
+for (Field field : currentClass.getDeclaredFields()) {
+if (!Modifier.isTransient(field.getModifiers())
+&& !Modifier.isStatic(field.getModifiers())) {
+// function seems to be stateful
+return false;
+}
+}
+currentClass = currentClass.getSuperclass();
+}

Review comment:
   Even with a marker interface you would still need to perform this check. 
User-defined functions come in various flavors, and the more interfaces there 
are, the harder it gets to get the job done. A rough illustration of the check 
follows below.
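
```java
// As a rough illustration of the check (hypothetical UDF classes;
// isClassNameSerializable as proposed in this patch):
import org.apache.flink.table.functions.ScalarFunction;

import static org.apache.flink.table.functions.UserDefinedFunctionHelper.isClassNameSerializable;

public class Demo {

    // Stateless: public nullary constructor, no non-transient instance fields.
    public static class UpperFunction extends ScalarFunction {
        public String eval(String s) {
            return s.toUpperCase();
        }
    }

    // Stateful: parameterized through a field, so the class name alone cannot identify it.
    public static class PrefixFunction extends ScalarFunction {
        private final String prefix;

        public PrefixFunction(String prefix) {
            this.prefix = prefix;
        }

        public String eval(String s) {
            return prefix + s;
        }
    }

    public static void main(String[] args) {
        System.out.println(isClassNameSerializable(new UpperFunction()));      // expected: true
        System.out.println(isClassNameSerializable(new PrefixFunction("x_"))); // expected: false
    }
}
```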




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] echauchot commented on pull request #18284: [FLINK-21407][docs][formats] Remove MongoDb connector as it is not supported in DataStream API

2022-01-13 Thread GitBox


echauchot commented on pull request #18284:
URL: https://github.com/apache/flink/pull/18284#issuecomment-1012320859


   > Sorry again, but do we expect more changes to come from 
https://issues.apache.org/jira/browse/FLINK-21407? If so, we should create 
subtasks for the different connectors and formats. I think we should start with 
it right away because now we have multiple PRs coming from the same ticket to 
the same branch.
   
   Hi @fapaul, I had the same question: we don't expect other changes; as 
@MartijnVisser answered above, there are no other connectors to clean. There 
were indeed 2 PRs on each branch (1.13, 1.14, master): the first one adds the 
formats and the second one removes MongoDb, which was added by mistake. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] gaoyunhaii commented on a change in pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2

2022-01-13 Thread GitBox


gaoyunhaii commented on a change in pull request #18330:
URL: https://github.com/apache/flink/pull/18330#discussion_r784116528



##
File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/CommittableMessage.java
##
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.connector.sink2;
+
+import org.apache.flink.annotation.Experimental;
+
+import java.util.OptionalLong;
+
+/** The message sent from {@link SinkWriter} to {@link Committer}. */

Review comment:
   The two links would need imports. 
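
   Presumably the Sink V2 types, i.e. something like:

```java
import org.apache.flink.api.connector.sink2.Committer;
import org.apache.flink.api.connector.sink2.SinkWriter;
```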




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] CrynetLogistics commented on a change in pull request #18165: [FLINK-24904][docs] Updated docs to reflect new KDS Sink and deprecat…

2022-01-13 Thread GitBox


CrynetLogistics commented on a change in pull request #18165:
URL: https://github.com/apache/flink/pull/18165#discussion_r784138446



##
File path: docs/content/docs/connectors/datastream/kinesis.md
##
@@ -29,9 +29,26 @@ under the License.
 
 The Kinesis connector provides access to [Amazon AWS Kinesis 
Streams](http://aws.amazon.com/kinesis/streams/).
 
-To use the connector, add the following Maven dependency to your project:
-
-{{< artifact flink-connector-kinesis >}}
+To use this connector, add one or more of the following dependencies to your 
project, depending on whether you are reading from and/or writing to Kinesis 
Data Streams:
+
+<table>
+  <thead>
+    <tr>
+      <th>KDS Connectivity</th>
+      <th>Maven Dependency</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>Source</td>
+      <td>{{< artifact flink-connector-kinesis >}}</td>
+    </tr>
+    <tr>
+      <td>Sink</td>
+      <td>{{< artifact flink-connector-aws-kinesis-data-streams >}}</td>
+    </tr>
+  </tbody>
+</table>
 
 {{< hint warning >}}
 **Attention** Prior to Flink version 1.10.0 the `flink-connector-kinesis` has 
a dependency on code licensed under the [Amazon Software 
License](https://aws.amazon.com/asl/).

Review comment:
   You're right, is it time to remove it?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] CrynetLogistics commented on a change in pull request #18165: [FLINK-24904][docs] Updated docs to reflect new KDS Sink and deprecat…

2022-01-13 Thread GitBox


CrynetLogistics commented on a change in pull request #18165:
URL: https://github.com/apache/flink/pull/18165#discussion_r784136808



##
File path: docs/content/docs/connectors/datastream/kinesis.md
##
@@ -566,124 +583,126 @@ Retry and backoff parameters can be configured using 
the `ConsumerConfigConstant
 this is called once per stream during stream consumer deregistration, unless 
the `NONE` or `EAGER` registration strategy is configured.
 Retry and backoff parameters can be configured using the 
`ConsumerConfigConstants.DEREGISTER_STREAM_*` keys.  
 
-## Kinesis Producer
-
-The `FlinkKinesisProducer` uses [Kinesis Producer Library 
(KPL)](http://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-kpl.html)
 to put data from a Flink stream into a Kinesis stream.
-
-Note that the producer is not participating in Flink's checkpointing and 
doesn't provide exactly-once processing guarantees. Also, the Kinesis producer 
does not guarantee that records are written in order to the shards (See 
[here](https://github.com/awslabs/amazon-kinesis-producer/issues/23) and 
[here](http://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecord.html#API_PutRecord_RequestSyntax)
 for more details).
+## Kinesis Data Streams Sink
 
-In case of a failure or a resharding, data will be written again to Kinesis, 
leading to duplicates. This behavior is usually called "at-least-once" 
semantics.
+The Kinesis Data Streams sink (hereafter "Kinesis sink") uses the [AWS v2 SDK 
for 
Java](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/home.html)
 to write data from a Flink stream into a Kinesis stream.
 
-To put data into a Kinesis stream, make sure the stream is marked as "ACTIVE" 
in the AWS dashboard.
+To write data into a Kinesis stream, make sure the stream is marked as 
"ACTIVE" in the AWS dashboard.
 
 For the monitoring to work, the user accessing the stream needs access to the 
CloudWatch service.
 
 {{< tabs "6df3b696-c2ca-4f44-bea0-96cf8275d61c" >}}
 {{< tab "Java" >}}
 ```java
-Properties producerConfig = new Properties();
-// Required configs
-producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1");
-producerConfig.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id");
-producerConfig.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key");
-// Optional configs
-producerConfig.put("AggregationMaxCount", "4294967295");
-producerConfig.put("CollectionMaxCount", "1000");
-producerConfig.put("RecordTtl", "3");
-producerConfig.put("RequestTimeout", "6000");
-producerConfig.put("ThreadPoolSize", "15");
-
-// Disable Aggregation if it's not supported by a consumer
-// producerConfig.put("AggregationEnabled", "false");
-// Switch KinesisProducer's threading model
-// producerConfig.put("ThreadingModel", "PER_REQUEST");
-
-FlinkKinesisProducer<String> kinesis = new FlinkKinesisProducer<>(new 
SimpleStringSchema(), producerConfig);
-kinesis.setFailOnError(true);
-kinesis.setDefaultStream("kinesis_stream_name");
-kinesis.setDefaultPartition("0");
+ElementConverter<String, PutRecordsRequestEntry> elementConverter =
+KinesisDataStreamsSinkElementConverter.builder()
+.setSerializationSchema(new SimpleStringSchema())
+.setPartitionKeyGenerator(element -> 
String.valueOf(element.hashCode()))
+.build();
+
+Properties sinkProperties = new Properties();
+// Required
+sinkProperties.put(AWSConfigConstants.AWS_REGION, "us-east-1");
+
+// Optional, provide via alternative routes e.g. environment variables
+sinkProperties.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id");
+sinkProperties.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key");
+
+KinesisDataStreamsSink<String> kdsSink =
+KinesisDataStreamsSink.builder()
+.setKinesisClientProperties(sinkProperties)// Required
+.setElementConverter(elementConverter) // Required
+.setStreamName("your-stream-name") // Required
+.setFailOnError(false) // Optional
+.setMaxBatchSize(500)  // Optional
+.setMaxInFlightRequests(16)// Optional
+.setMaxBufferedRequests(10_000)// Optional
+.setMaxBatchSizeInBytes(5 * 1024 * 1024)   // Optional
+.setMaxTimeInBufferMS(5000)// Optional
+.setMaxRecordSizeInBytes(1 * 1024 * 1024)  // Optional
+.build();
 
 DataStream<String> simpleStringStream = ...;
-simpleStringStream.addSink(kinesis);
+simpleStringStream.sinkTo(kdsSink);
 ```
 {{< /tab >}}
 {{< tab "Scala" >}}
 ```scala
-val producerConfig = new Properties()
-// Required configs
-producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1")
-producerConfig.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "aws_access_key_id")
-producerConfig.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key")
-// Optional KPL configs
-producerConfig.put("AggregationMaxCount", "4294967295")
-prod

[GitHub] [flink] fapaul commented on pull request #18284: [FLINK-21407][docs][formats] Remove MongoDb connector as it is not supported in DataStream API

2022-01-13 Thread GitBox


fapaul commented on pull request #18284:
URL: https://github.com/apache/flink/pull/18284#issuecomment-1012308846


   Sorry again, but do we expect more changes to come from 
https://issues.apache.org/jira/browse/FLINK-21407? If so, we should create 
subtasks for the different connectors and formats.
   I think we should start with it right away because now we have multiple PRs 
coming from the same ticket to the same branch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fapaul commented on pull request #17937: [FLINK-25044][testing][Pulsar Connector] Add More Unit Test For Pulsar Source

2022-01-13 Thread GitBox


fapaul commented on pull request #17937:
URL: https://github.com/apache/flink/pull/17937#issuecomment-1012304383


   Looks good. Please squash and rebase, then I'll take care of merging.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] slinkydeveloper commented on a change in pull request #18352: [FLINK-15585][table] Improve function identifier string in plan digest

2022-01-13 Thread GitBox


slinkydeveloper commented on a change in pull request #18352:
URL: https://github.com/apache/flink/pull/18352#discussion_r784104335



##
File path: 
flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/UserDefinedFunctionHelper.java
##
@@ -243,6 +244,36 @@ public static void prepareInstance(ReadableConfig config, 
UserDefinedFunction fu
 cleanFunction(config, function);
 }
 
+/**
+ * Returns whether a {@link UserDefinedFunction} can be easily serialized 
and identified by only
+ * a fully qualified class name. It must have a default constructor and no 
serializable fields.
+ */
+public static boolean isClassNameSerializable(UserDefinedFunction 
function) {
+final Class<?> functionClass = function.getClass();
+if (!InstantiationUtil.hasPublicNullaryConstructor(functionClass)) {
+// function must be parameterized
+return false;
+}
+Class<?> currentClass = functionClass;
+while (!currentClass.equals(UserDefinedFunction.class)) {
+for (Field field : currentClass.getDeclaredFields()) {
+if (!Modifier.isTransient(field.getModifiers())
+&& !Modifier.isStatic(field.getModifiers())) {
+// function seems to be stateful
+return false;
+}
+}
+currentClass = currentClass.getSuperclass();
+}

Review comment:
   That sounds like a rather magical way to find out whether a function is 
stateful or not. Don't we have a marker interface for that?

##
File path: 
flink-table/flink-table-common/src/test/java/org/apache/flink/table/functions/UserDefinedFunctionHelperTest.java
##
@@ -143,34 +162,27 @@ private void testErrorMessage() {
 
 @Nullable String expectedErrorMessage;
 
-TestSpec(Class functionClass) {
+TestSpec(
+@Nullable Class functionClass,
+@Nullable UserDefinedFunction functionInstance,
+@Nullable CatalogFunction catalogFunction) {
 this.functionClass = functionClass;
-this.functionInstance = null;
-this.catalogFunction = null;
-}
-
-TestSpec(UserDefinedFunction functionInstance) {
-this.functionClass = null;
 this.functionInstance = functionInstance;
-this.catalogFunction = null;
-}
-
-TestSpec(CatalogFunction catalogFunction) {
-this.functionClass = null;
-this.functionInstance = null;
 this.catalogFunction = catalogFunction;
 }
 
 static TestSpec forClass(Class 
function) {
-return new TestSpec(function);
+return new TestSpec(function, null, null);
 }
 
 static TestSpec forInstance(UserDefinedFunction function) {
-return new TestSpec(function);
+return new TestSpec(null, function, null);
 }
 
-static TestSpec forCatalogFunction(String className, FunctionLanguage 
language) {
-return new TestSpec(new CatalogFunctionMock(className, language));
+static TestSpec forCatalogFunction(
+String className,
+@SuppressWarnings("SameParameterValue") FunctionLanguage 
language) {

Review comment:
   What warning is this one?

##
File path: 
flink-table/flink-table-common/src/main/java/org/apache/flink/table/functions/UserDefinedFunction.java
##
@@ -58,9 +60,13 @@
 
 /** Returns a unique, serialized representation for this function. */
 public final String functionIdentifier() {
+final String className = getClass().getName();
+if (isClassNameSerializable(this)) {
+return className;
+}
 final String md5 =
 
EncodingUtils.hex(EncodingUtils.md5(EncodingUtils.encodeObjectToString(this)));
-return getClass().getName().replace('.', '$').concat("$").concat(md5);
+return className.concat("$").concat(md5);

Review comment:
   Does this logic work when the function is implemented as a Scala 
`object`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fapaul commented on pull request #18330: [FLINK-25570][streaming] Add topology extension points to Sink V2

2022-01-13 Thread GitBox


fapaul commented on pull request #18330:
URL: https://github.com/apache/flink/pull/18330#issuecomment-1012277251


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #18355: [FLINK-25160][docs] Clarified purpose of execution.checkpointing.tole…

2022-01-13 Thread GitBox


flinkbot commented on pull request #18355:
URL: https://github.com/apache/flink/pull/18355#issuecomment-1012276469


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 6b89b5c2cb9734084a479e7dfcea82a3218dd3ff (Thu Jan 13 
16:04:21 UTC 2022)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] leonardBang commented on pull request #17657: [FLINK-24745][format][json] Add support for Oracle OGG JSON format parser

2022-01-13 Thread GitBox


leonardBang commented on pull request #17657:
URL: https://github.com/apache/flink/pull/17657#issuecomment-1012275729


   > @leonardBang I'm sorry, I've been a bit busy for a while; in fact, I 
probably won't be able to fix the bugs this month. Should I close this PR 
for now, or set it to draft mode and submit it when I'm done fixing it?
   
   It's okay, let me find someone to help take over this PR; we hope this 
feature can appear in 1.15.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-25160) Make doc clear: tolerable-failed-checkpoints counts consecutive failures

2022-01-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25160:
---
Labels: pull-request-available  (was: )

> Make doc clear: tolerable-failed-checkpoints counts consecutive failures
> 
>
> Key: FLINK-25160
> URL: https://issues.apache.org/jira/browse/FLINK-25160
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.14.0, 1.12.5, 1.13.3
>Reporter: Jun Qin
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: pull-request-available
>
> According to the code, tolerable-failed-checkpoints counts the consecutive 
> failures. We should make this clear in the doc 
> [config|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/]
>  
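
For reference, a minimal sketch of configuring this limit programmatically via 
`CheckpointConfig` (behavior as described above; the counter resets after a 
successful checkpoint):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TolerableFailuresExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(5000);
        // Tolerate up to 3 consecutive checkpoint failures before failing the job.
        env.getCheckpointConfig().setTolerableCheckpointFailureNumber(3);
    }
}
```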



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] akalash opened a new pull request #18355: [FLINK-25160][docs] Clarified purpose of execution.checkpointing.tole…

2022-01-13 Thread GitBox


akalash opened a new pull request #18355:
URL: https://github.com/apache/flink/pull/18355


   …rable-failed-checkpoints
   
   
   
   ## What is the purpose of the change
   
   Clarified that execution.checkpointing.tolerable-failed-checkpoints counts 
consecutive failures.
   
   ## Verifying this change
   
   It is just documentation
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (not applicable / **docs** / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe edited a comment on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe edited a comment on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012252594


   One additional piece of information I found to share (thanks @fapaul): some 
Flink modules customize their test-jar build logic, so it turns out that not 
all test code can be found in the test jars. This might also be a reason why 
tests run with Maven only cover a smaller scope than when run in IntelliJ.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe commented on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012257162


   > 
   
   Sure, this is the option I mentioned previously to shrink the scope.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Airblader commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


Airblader commented on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012254365


   > some Flink modules customize their test-jar build logic, so it turns out 
that not all test code can be found in the test jars
   
   Yeah, that sounds like it might well be the cause, or at least be partially 
the reason. Do you know which or how many modules? If removing one or two such 
modules fixes the issue, we could use that as a temporary solution. I'd rather 
cover less code with the tests than have inconsistent results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe commented on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012252594


   One additional piece of information I found to share: some Flink modules 
customize their test-jar build logic, so it turns out that not all test code 
can be found in the test jars. This might also be a reason why tests run with 
Maven only cover a smaller scope than when run in IntelliJ.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Airblader commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


Airblader commented on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012247801


   > It means that maven and IntelliJ are running in different scopes, i.e. 
"more violations" will be found by IntelliJ because it uses a bigger scope than 
maven
   
   Running your branch in IntelliJ adds new violations, but also removes some 
violations.
   
   > […] than working on it within this PR?
   
   I'm also OK if you want to resolve this outside of this PR, but then the PR 
would have to wait until then. If we want to merge this PR, we need to address 
it here. As long as the test executes in both IntelliJ and Maven, but produces 
different results in each, I will not be +1, because I think it is too 
confusing and frustrating for developers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on a change in pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe commented on a change in pull request #18333:
URL: https://github.com/apache/flink/pull/18333#discussion_r784063682



##
File path: 
flink-architecture-tests/src/test/java/org/apache/flink/architecture/common/Predicates.java
##
@@ -50,5 +83,78 @@
 && field.getRawType().isEquivalentTo(clazz));
 }
 
+/**
+ * Tests that the given field is {@code public final} and not {@code 
static} and of the given
+ * type {@code clazz} .
+ */
+public static DescribedPredicate<JavaField> arePublicFinalOfTyp(Class<?> clazz) {
+return 
is(ofType(clazz)).and(isPublic()).and(isFinal()).and(isNotStatic());
+}
+
+/**
+ * Tests that the given field is {@code public static final} and of the 
given type {@code clazz}
+ * .
+ */
+public static DescribedPredicate<JavaField> arePublicStaticFinalOfTyp(Class<?> clazz) {
+return arePublicStaticOfType(clazz).and(isFinal());
+}
+
+/**
+ * Tests that the given field is {@code public final} and of the given 
type {@code clazz} with
+ * exactly the given {@code annotationType}.
+ */
+public static DescribedPredicate<JavaField> arePublicFinalOfTypeWithAnnotation(
+Class<?> clazz, Class<? extends Annotation> annotationType) {
+return arePublicFinalOfTyp(clazz).and(annotatedWith(annotationType));
+}
+
+/**
+ * Tests that the given field is {@code public static final} and of the 
given type {@code clazz}
+ * with exactly the given {@code annotationType}.
+ */
+public static DescribedPredicate<JavaField> arePublicStaticFinalOfTypeWithAnnotation(
+Class<?> clazz, Class<? extends Annotation> annotationType) {
+return 
arePublicStaticFinalOfTyp(clazz).and(annotatedWith(annotationType));
+}
+
+/**
+ * Tests that the given field is {@code public static final} and of the 
given type. It must have
+ * any of the given {@code annotationTypes}
+ */
+@SafeVarargs
+public static DescribedPredicate<JavaField> arePublicStaticFinalOfTypeWithOneOfAnnotations(

Review comment:
   I have nothing against removing unused code. It was committed to provide 
convenience for further development.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] matriv commented on a change in pull request #18342: [FLINK-25230][table-planner] Replace RelDataType with LogicalType serialization

2022-01-13 Thread GitBox


matriv commented on a change in pull request #18342:
URL: https://github.com/apache/flink/pull/18342#discussion_r784059921



##
File path: 
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/serde/RelDataTypeJsonSerdeTest.java
##
@@ -46,83 +35,113 @@
 import org.apache.calcite.sql.parser.SqlParserPos;
 import org.apache.calcite.sql.type.SqlTypeName;
 import org.junit.Test;
-import org.junit.runner.RunWith;
-import org.junit.runners.Parameterized;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.MethodSource;
+import org.junit.runners.Parameterized.Parameters;
 
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Arrays;
-import java.util.Collection;
 import java.util.List;
 
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertSame;
+import static 
org.apache.flink.table.planner.plan.nodes.exec.serde.JsonSerdeMocks.configuredSerdeContext;
+import static 
org.apache.flink.table.planner.plan.nodes.exec.serde.JsonSerdeMocks.toJson;
+import static 
org.apache.flink.table.planner.plan.nodes.exec.serde.JsonSerdeMocks.toObject;
+import static org.assertj.core.api.Assertions.assertThat;
 
-/** Tests for serialization/deserialization of {@link RelDataType}. */
-@RunWith(Parameterized.class)
+/** Tests for {@link RelDataType} serialization and deserialization. */
 public class RelDataTypeJsonSerdeTest {
+
 private static final FlinkTypeFactory FACTORY = 
FlinkTypeFactory.INSTANCE();
 
-@Parameterized.Parameters(name = "type = {0}")
-public static Collection<RelDataType> parameters() {
+@ParameterizedTest
+@MethodSource("testRelDataTypeSerde")
+public void testRelDataTypeSerde(RelDataType relDataType) throws 
IOException {
+final SerdeContext serdeContext = configuredSerdeContext();
+
+final String json = toJson(serdeContext, relDataType);
+final RelDataType actual = toObject(serdeContext, json, 
RelDataType.class);
+
+assertThat(actual).isSameAs(relDataType);
+}
+
+@Test
+public void testMissingPrecisionAndScale() {
+final SerdeContext serdeContext = configuredSerdeContext();
+
+final String json =
+toJson(
+serdeContext,
+FACTORY.createSqlIntervalType(
+new SqlIntervalQualifier(
+TimeUnit.DAY, TimeUnit.SECOND, 
SqlParserPos.ZERO)));
+final RelDataType actual = toObject(serdeContext, json, 
RelDataType.class);
+
+assertThat(actual)
+.isSameAs(
+FACTORY.createSqlIntervalType(
+new SqlIntervalQualifier(
+TimeUnit.DAY,
+
DayTimeIntervalType.DEFAULT_DAY_PRECISION,
+TimeUnit.SECOND,
+
DayTimeIntervalType.DEFAULT_FRACTIONAL_PRECISION,
+SqlParserPos.ZERO)));
+}
+
+// 

+// Test data
+// 

+
+@Parameters(name = "{0}")
+public static List<RelDataType> testRelDataTypeSerde() {
 // the values in the list do not care about nullable.
-List<RelDataType> types =
+final List<RelDataType> types =
 Arrays.asList(
 FACTORY.createSqlType(SqlTypeName.BOOLEAN),
 FACTORY.createSqlType(SqlTypeName.TINYINT),
 FACTORY.createSqlType(SqlTypeName.SMALLINT),
 FACTORY.createSqlType(SqlTypeName.INTEGER),
 FACTORY.createSqlType(SqlTypeName.BIGINT),
-FACTORY.createSqlType(SqlTypeName.DECIMAL, 3, 10),
-FACTORY.createSqlType(SqlTypeName.DECIMAL, 0, 19),
-FACTORY.createSqlType(SqlTypeName.DECIMAL, -1, 19),

Review comment:
   I can't think of such a case; it doesn't make sense.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe edited a comment on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe edited a comment on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012232986


   > Of course technically it seems like IntelliJ actually finds more correct 
violations, so the IntelliJ result is "more correct", but IntelliJ is not our 
source of truth.
   
   It means that maven and IntelliJ are running in different scopes, i.e. "more 
violations" will be found by IntelliJ because it uses a bigger scope than maven; 
IMHO, it has nothing to do with correctness. It might be an issue 
(incorrectness) that ArchUnit should fix.
   
   > The problem here are the test-jars and how they're handled; we currently 
don't have an issue because we do not use JARs for the dependencies in the 
module.
   
   that is what I mentioned that complex scope has not been touched.
   
   > 
   > When IntelliJ executes the tests, it handles the dependencies differently 
than Maven, and that is causing the inconsistency. We can fix this by improving 
the import options to cover both scenarios equally.
   
   It sounds more like a compatible IntelliJ setup with maven project w.r.t. 
dependencies or ArchUnit abstracts and hides the hybrid dependencies with 
maven, IntelliJ, gradle, Bazel, etc.  than working on it within this PR?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe edited a comment on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe edited a comment on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012232986


   > Of course technically it seems like IntelliJ actually finds more correct 
violations, so the IntelliJ result is "more correct", but IntelliJ is not our 
source of truth.
   
   It means that maven and IntelliJ are running in different scopes, i.e. "more 
violations" will be found by IntelliJ because it uses a bigger scope than maven; 
IMHO, it has nothing to do with correctness.
   
   > The problem here are the test-jars and how they're handled; we currently 
don't have an issue because we do not use JARs for the dependencies in the 
module.
   
   that is what I mentioned that complex scope has not been touched.
   
   > 
   > When IntelliJ executes the tests, it handles the dependencies differently 
than Maven, and that is causing the inconsistency. We can fix this by improving 
the import options to cover both scenarios equally.
   
   It sounds more like a compatible IntelliJ setup with maven project w.r.t. 
dependencies or ArchUnit abstracts and hides the hybrid dependencies with 
maven, IntelliJ, gradle, Bazel, etc.  than working on it within this PR?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe edited a comment on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe edited a comment on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012232986


   > Of course technically it seems like IntelliJ actually finds more correct 
violations, so the IntelliJ result is "more correct", but IntelliJ is not our 
source of truth.
   
   It means that maven and IntelliJ are running in different scopes, i.e. "more 
violations"; IMHO, it has nothing to do with correctness.
   
   > The problem here are the test-jars and how they're handled; we currently 
don't have an issue because we do not use JARs for the dependencies in the 
module.
   
   that is what I mentioned that complex scope has not been touched.
   
   > 
   > When IntelliJ executes the tests, it handles the dependencies differently 
than Maven, and that is causing the inconsistency. We can fix this by improving 
the import options to cover both scenarios equally.
   
   It sounds more like a compatible IntelliJ setup with maven project w.r.t. 
dependencies or ArchUnit abstracts and hides the hybrid dependencies with 
maven, IntelliJ, gradle, Bazel, etc.  than working on it within this PR?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingGe commented on pull request #18333: [FLINK-25220][test] Write an architectural rule for all IT cases w.r.t. the MiniCluster

2022-01-13 Thread GitBox


JingGe commented on pull request #18333:
URL: https://github.com/apache/flink/pull/18333#issuecomment-1012232986


   > Of course technically it seems like IntelliJ actually finds more correct 
violations, so the IntelliJ result is "more correct", but IntelliJ is not our 
source of truth.
   
   It means that maven and IntelliJ are running in different scopes, i.e. "more 
violations"; IMHO, it has nothing to do with correctness.
   
   > The problem here are the test-jars and how they're handled; we currently 
don't have an issue because we do not use JARs for the dependencies in the 
module.
   
   that is what I mentioned that complex scope has not been touched.
   
   > 
   > When IntelliJ executes the tests, it handles the dependencies differently 
than Maven, and that is causing the inconsistency. We can fix this by improving 
the import options to cover both scenarios equally.
   
   It sounds more like a compatible IntelliJ setup with maven project w.r.t. 
dependencies or ArchUnit abstracts and hides the hybrid dependencies with 
maven, IntelliJ, gradle, Bazel, etc.  than working on it within this PR?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



