[GitHub] [flink] flinkbot commented on pull request #20246: [FLINK-28074][table-planner] show statistics details for DESCRIBE EXT…

2022-07-11 Thread GitBox


flinkbot commented on PR #20246:
URL: https://github.com/apache/flink/pull/20246#issuecomment-1181339251

   
   ## CI report:
   
   * 231f19cae1d48a8a9fe17842065656d8e02f6ac6 UNKNOWN
   
   
   **Bot commands**
   The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] fredia commented on pull request #20160: [FLINK-28286][state] move 'enablechangelog' constant to flink-core module

2022-07-11 Thread GitBox


fredia commented on PR #20160:
URL: https://github.com/apache/flink/pull/20160#issuecomment-1181335690

   Sure, rebased. Thanks a lot for your review and patience!





[jira] [Updated] (FLINK-28074) show statistics details for DESCRIBE EXTENDED

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-28074:
---
Labels: pull-request-available  (was: )

> show statistics details for DESCRIBE EXTENDED
> -
>
> Key: FLINK-28074
> URL: https://issues.apache.org/jira/browse/FLINK-28074
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner
>Reporter: godfrey he
>Assignee: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Currently, the DESCRIBE command only shows the schema of a given table; EXTENDED 
> does not work. So in EXTENDED mode, the statistics details can also be shown.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] swuferhong opened a new pull request, #20246: [FLINK-28074][table-planner] show statistics details for DESCRIBE EXT…

2022-07-11 Thread GitBox


swuferhong opened a new pull request, #20246:
URL: https://github.com/apache/flink/pull/20246

   …ENDED
   
   
   
   ## What is the purpose of the change
   
   - This PR aims to let the `DESCRIBE EXTENDED` syntax support printing table 
statistics details. The table statistics details include rowCount, fileCount, 
rawDataSize, and totalSize. 
   - To provide a readable print style, this PR introduces a new PrintStyle 
named `DescExtendedStyle`, which can display contents in different parts and 
calculate the column widths for each part respectively. The display effect is 
as follows:
   
![image](https://user-images.githubusercontent.com/27358654/178415725-9722fadc-4fb3-428d-9928-e69f8657ad50.png)
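   The per-part column-width calculation described above can be sketched as 
follows. This is a minimal standalone example with illustrative names, not the 
actual `DescExtendedStyle` code from this PR: each column is sized to its 
widest cell (header included), then every row is padded to those widths.

```java
import java.util.Arrays;
import java.util.List;

/** Minimal sketch of per-column width calculation for an aligned text table. */
public class ColumnWidths {

    /** Returns, for each column, the width of its widest cell (header included). */
    static int[] computeWidths(List<String[]> rows, String[] header) {
        int[] widths = new int[header.length];
        for (int i = 0; i < header.length; i++) {
            widths[i] = header[i].length();
        }
        for (String[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                widths[i] = Math.max(widths[i], row[i].length());
            }
        }
        return widths;
    }

    /** Pads each cell to its column width, producing one aligned line. */
    static String formatRow(String[] row, int[] widths) {
        StringBuilder sb = new StringBuilder("|");
        for (int i = 0; i < row.length; i++) {
            sb.append(' ').append(String.format("%-" + widths[i] + "s", row[i])).append(" |");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] header = {"name", "type"};
        List<String[]> rows = Arrays.asList(
                new String[] {"user_id", "BIGINT"},
                new String[] {"ts", "TIMESTAMP(3)"});
        int[] widths = computeWidths(rows, header);
        System.out.println(formatRow(header, widths));
        for (String[] r : rows) {
            System.out.println(formatRow(r, widths));
        }
    }
}
```

   Computing the widths per part (schema, statistics, ...) rather than globally 
is what lets each section of the `DESCRIBE EXTENDED` output stay compact.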
   
   ## Brief change log
   
   - In `TableEnvironmentImpl`, change the method `buildDescribeRows`.
   - Add a new print style named `DescExtendedStyle`.
   - Add tests in `TableEnvironment`.
   
   
   ## Verifying this change
   
   Tests were added in `TableEnvironment` to cover `DESCRIBE EXTENDED`.
   
   Please make sure both new and modified tests in this PR follow the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency):  no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): don't know
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper:  no 
 - The S3 file system connector: no 
   
   ## Documentation
   
 - Does this pull request introduce a new feature?  yes 
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   Currently there are no docs for this feature.





[jira] [Assigned] (FLINK-28500) Add Transformer for Tokenizer

2022-07-11 Thread Zhipeng Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhipeng Zhang reassigned FLINK-28500:
-

Assignee: Zhipeng Zhang

> Add Transformer for Tokenizer
> -
>
> Key: FLINK-28500
> URL: https://issues.apache.org/jira/browse/FLINK-28500
> Project: Flink
>  Issue Type: New Feature
>  Components: Library / Machine Learning
>Reporter: Zhipeng Zhang
>Assignee: Zhipeng Zhang
>Priority: Major
> Fix For: ml-2.2.0
>
>






[GitHub] [flink] wuchong merged pull request #20213: [hotfix][docs] Fix formatting in Chinese doc

2022-07-11 Thread GitBox


wuchong merged PR #20213:
URL: https://github.com/apache/flink/pull/20213





[jira] [Updated] (FLINK-28499) resource leak when job failed with unknown status In Application Mode

2022-07-11 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang updated FLINK-28499:
-
Priority: Major  (was: Minor)

> resource leak when job failed with unknown status In Application Mode
> -
>
> Key: FLINK-28499
> URL: https://issues.apache.org/jira/browse/FLINK-28499
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.13.1
>Reporter: lihe ma
>Priority: Major
> Attachments: cluster-pod-error.png
>
>
> I found a job that restarted thousands of times, and the jobmanager tried to 
> create a new taskmanager pod every time. The jobmanager restarted because it 
> was submitted with a duplicate job id[1] (we preset the jobId rather than 
> generating it), but unfortunately I hadn't saved the logs.
> This job requires one taskmanager pod in normal circumstances, but thousands 
> of pods were eventually leaked. You can find the screenshot in the attachment.
>  
> In application mode, cluster resources are released when a job finishes in 
> succeeded, failed or canceled status[2][3]. When an exception happens, the 
> job may be terminated in unknown status[4].
> In this case, the job exited with unknown status without releasing 
> taskmanager pods. So is it reasonable not to release taskmanagers when a job 
> exits in unknown status?
>  
>  
> one line in original logs:
> 2022-07-01 09:45:40,712 [main] INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster 
> entrypoint process KubernetesApplicationClusterEntrypoint with exit code 1445.
>  
> [1] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L452]
> [2] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L90-L91]
> [3] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L175]
> [4] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ApplicationStatus.java#L39]
>  
>  
>  
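
The release logic this report questions can be sketched as below. This is an 
illustrative model only, not Flink's actual `ApplicationStatus` or 
`ApplicationDispatcherBootstrap` code: the point is that a status outside the 
three known terminal states falls through without triggering cleanup.

```java
/** Illustrative model of the cleanup decision; names mirror but simplify Flink's. */
public class StatusCheck {
    enum ApplicationStatus { SUCCEEDED, FAILED, CANCELED, UNKNOWN }

    /** Cluster resources are released only for the three known terminal states. */
    static boolean releasesClusterResources(ApplicationStatus status) {
        switch (status) {
            case SUCCEEDED:
            case FAILED:
            case CANCELED:
                return true;
            default:
                // UNKNOWN falls through here: the job terminates, but no cleanup
                // runs, which is the leak scenario described in this issue.
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(releasesClusterResources(ApplicationStatus.UNKNOWN)); // false
    }
}
```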





[GitHub] [flink] 1996fanrui commented on pull request #20137: Just for CI

2022-07-11 Thread GitBox


1996fanrui commented on PR #20137:
URL: https://github.com/apache/flink/pull/20137#issuecomment-1181305318

   @flinkbot run azure





[jira] [Created] (FLINK-28502) Add Transformer for RegexTokenizer

2022-07-11 Thread Zhipeng Zhang (Jira)
Zhipeng Zhang created FLINK-28502:
-

 Summary: Add Transformer for RegexTokenizer
 Key: FLINK-28502
 URL: https://issues.apache.org/jira/browse/FLINK-28502
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: Zhipeng Zhang
 Fix For: ml-2.2.0








[jira] [Created] (FLINK-28501) Add Transformer and Estimator for VectorIndexer

2022-07-11 Thread Zhipeng Zhang (Jira)
Zhipeng Zhang created FLINK-28501:
-

 Summary: Add Transformer and Estimator for VectorIndexer
 Key: FLINK-28501
 URL: https://issues.apache.org/jira/browse/FLINK-28501
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: Zhipeng Zhang
 Fix For: ml-2.2.0








[jira] [Updated] (FLINK-28499) resource leak when job failed with unknown status In Application Mode

2022-07-11 Thread lihe ma (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lihe ma updated FLINK-28499:

Description: 
I found a job that restarted thousands of times, and the jobmanager tried to 
create a new taskmanager pod every time. The jobmanager restarted because it 
was submitted with a duplicate job id[1] (we preset the jobId rather than 
generating it), but unfortunately I hadn't saved the logs.

This job requires one taskmanager pod in normal circumstances, but thousands of 
pods were eventually leaked. You can find the screenshot in the attachment.

In application mode, cluster resources are released when a job finishes in 
succeeded, failed or canceled status[2][3]. When an exception happens, the job 
may be terminated in unknown status[4].

In this case, the job exited with unknown status without releasing taskmanager 
pods. So is it reasonable not to release taskmanagers when a job exits in 
unknown status?

 

 

one line in original logs:
2022-07-01 09:45:40,712 [main] INFO 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster 
entrypoint process KubernetesApplicationClusterEntrypoint with exit code 1445.

 

[1] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L452]

[2] 
[https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L90-L91]

[3] 
[https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L175]

[4] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ApplicationStatus.java#L39]

 

 

 

  was:
I found a job that restarted thousands of times, and the jobmanager tried to 
create a new taskmanager pod every time. The jobmanager restarted because it 
was submitted with a duplicate job id[1] (we preset the jobId rather than 
generating it), but unfortunately I hadn't saved the logs.

This job requires one taskmanager pod in normal circumstances, but thousands of 
pods were eventually leaked.
!image-2022-07-12-11-02-43-009.png|width=666,height=366!

In application mode, cluster resources are released when a job finishes in 
succeeded, failed or canceled status[2][3]. When an exception happens, the job 
may be terminated in unknown status[4].

In this case, the job exited with unknown status without releasing taskmanager 
pods. So is it reasonable not to release taskmanagers when a job exits in 
unknown status?

 

 

one line in original logs:
2022-07-01 09:45:40,712 [main] INFO 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster 
entrypoint process KubernetesApplicationClusterEntrypoint with exit code 1445.

 

[1] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L452]

[2] 
[https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L90-L91]


[3] 
[https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L175]

[4] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ApplicationStatus.java#L39]

 

 

 

   Priority: Minor  (was: Major)

> resource leak when job failed with unknown status In Application Mode
> -
>
> Key: FLINK-28499
> URL: https://issues.apache.org/jira/browse/FLINK-28499
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.13.1
>Reporter: lihe ma
>Priority: Minor
> Attachments: cluster-pod-error.png
>
>
> I found a job that restarted thousands of times, and the jobmanager tried to 
> create a new taskmanager pod every time. The jobmanager restarted because it 
> was submitted with a duplicate job id[1] (we preset the jobId rather than 
> generating it), but unfortunately I hadn't saved the logs.
> This job requires one taskmanager pod in normal circumstances, but thousands 
> of pods were eventually leaked. You can find the screenshot in the attachment.
>  
> In application mode, cluster resources are released when a job finishes in 
> succeeded, failed or canceled status[2][3]. When an exception happens, the 
> job may be terminated in unknown status[4].
> In this case, the job exited with unknown status without releasing 
> taskmanager pods. So is it reasonable not to release taskmanagers when a job 
> exits in unknown status?
>  
>  
> one line in original logs:
> 2022-07-01 

[jira] [Updated] (FLINK-28499) resource leak when job failed with unknown status In Application Mode

2022-07-11 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang updated FLINK-28499:
-
Priority: Major  (was: Minor)

> resource leak when job failed with unknown status In Application Mode
> -
>
> Key: FLINK-28499
> URL: https://issues.apache.org/jira/browse/FLINK-28499
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.13.1
>Reporter: lihe ma
>Priority: Major
> Attachments: cluster-pod-error.png
>
>
> I found a job that restarted thousands of times, and the jobmanager tried to 
> create a new taskmanager pod every time. The jobmanager restarted because it 
> was submitted with a duplicate job id[1] (we preset the jobId rather than 
> generating it), but unfortunately I hadn't saved the logs.
> This job requires one taskmanager pod in normal circumstances, but thousands 
> of pods were eventually leaked.
> !image-2022-07-12-11-02-43-009.png|width=666,height=366!
> In application mode, cluster resources are released when a job finishes in 
> succeeded, failed or canceled status[2][3]. When an exception happens, the 
> job may be terminated in unknown status[4].
> In this case, the job exited with unknown status without releasing 
> taskmanager pods. So is it reasonable not to release taskmanagers when a job 
> exits in unknown status?
>  
>  
> one line in original logs:
> 2022-07-01 09:45:40,712 [main] INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster 
> entrypoint process KubernetesApplicationClusterEntrypoint with exit code 1445.
>  
> [1] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L452]
> [2] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L90-L91]
> [3] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L175]
> [4] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ApplicationStatus.java#L39]
>  
>  
>  





[jira] [Closed] (FLINK-28498) resource leak when job failed with unknown status In Application Mode

2022-07-11 Thread lihe ma (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lihe ma closed FLINK-28498.
---
Resolution: Duplicate

> resource leak when job failed with unknown status In Application Mode
> -
>
> Key: FLINK-28498
> URL: https://issues.apache.org/jira/browse/FLINK-28498
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.13.1
>Reporter: lihe ma
>Priority: Minor
> Attachments: cluster-pod-error.png
>
>
> I found a job that restarted thousands of times, and the jobmanager tried to 
> create a new taskmanager pod every time. The jobmanager restarted because it 
> was submitted with a duplicate job id[1] (we preset the jobId rather than 
> generating it), but unfortunately I hadn't saved the logs.
> This job requires one taskmanager pod in normal circumstances, but thousands 
> of pods were eventually leaked.
> !image-2022-07-12-11-02-43-009.png|width=666,height=366!
> In application mode, cluster resources are released when a job finishes in 
> succeeded, failed or canceled status[2][3]. When an exception happens, the 
> job may be terminated in unknown status[4].
> In this case, the job exited with unknown status without releasing 
> taskmanager pods. So is it reasonable not to release taskmanagers when a job 
> exits in unknown status?
>  
>  
> one line in original logs:
> 2022-07-01 09:45:40,712 [main] INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster 
> entrypoint process KubernetesApplicationClusterEntrypoint with exit code 1445.
>  
> [1] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L452]
> [2] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L90-L91]
> [3] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L175]
> [4] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ApplicationStatus.java#L39]
>  
>  
>  





[jira] [Created] (FLINK-28500) Add Transformer for Tokenizer

2022-07-11 Thread Zhipeng Zhang (Jira)
Zhipeng Zhang created FLINK-28500:
-

 Summary: Add Transformer for Tokenizer
 Key: FLINK-28500
 URL: https://issues.apache.org/jira/browse/FLINK-28500
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: Zhipeng Zhang
 Fix For: ml-2.2.0








[jira] [Commented] (FLINK-25545) [JUnit5 Migration] Module: flink-clients

2022-07-11 Thread Hang Ruan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565253#comment-17565253
 ] 

Hang Ruan commented on FLINK-25545:
---

[~RocMarshal], thanks for your work. Please feel free to take this ticket.

> [JUnit5 Migration] Module: flink-clients
> 
>
> Key: FLINK-25545
> URL: https://issues.apache.org/jira/browse/FLINK-25545
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hang Ruan
>Priority: Minor
>  Labels: pull-request-available
>






[jira] [Closed] (FLINK-28497) resource leak when job failed with unknown status In Application Mode

2022-07-11 Thread lihe ma (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lihe ma closed FLINK-28497.
---
Resolution: Duplicate

> resource leak when job failed with unknown status In Application Mode
> -
>
> Key: FLINK-28497
> URL: https://issues.apache.org/jira/browse/FLINK-28497
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.13.1
>Reporter: lihe ma
>Priority: Minor
>
> I found a job that restarted thousands of times, and the jobmanager tried to 
> create a new taskmanager pod every time. The jobmanager restarted because it 
> was submitted with a duplicate job id[1] (we preset the jobId rather than 
> generating it), but unfortunately I hadn't saved the logs.
> This job requires one taskmanager pod in normal circumstances, but thousands 
> of pods were eventually leaked.
> !image-2022-07-12-11-02-43-009.png|width=666,height=366!
> In application mode, cluster resources are released when a job finishes in 
> succeeded, failed or canceled status[2][3]. When an exception happens, the 
> job may be terminated in unknown status[4].
> In this case, the job exited with unknown status without releasing 
> taskmanager pods. So is it reasonable not to release taskmanagers when a job 
> exits in unknown status?
>  
>  
> one line in original logs:
> 2022-07-01 09:45:40,712 [main] INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster 
> entrypoint process KubernetesApplicationClusterEntrypoint with exit code 1445.
>  
> [1] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L452]
> [2] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L90-L91]
> [3] 
> [https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L175]
> [4] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ApplicationStatus.java#L39]
>  
>  
>  





[jira] [Commented] (FLINK-25546) [JUnit5 Migration] Module: flink-connector-base

2022-07-11 Thread Hang Ruan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565252#comment-17565252
 ] 

Hang Ruan commented on FLINK-25546:
---

[~Sergey Nuyanzin], thanks for your work.

[~renqs], could you help assign this ticket to [~Sergey Nuyanzin]?

> [JUnit5 Migration] Module: flink-connector-base
> ---
>
> Key: FLINK-25546
> URL: https://issues.apache.org/jira/browse/FLINK-25546
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hang Ruan
>Priority: Minor
>






[jira] [Commented] (FLINK-28178) Show the delegated StateBackend and whether changelog is enabled in the UI

2022-07-11 Thread Feifan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565251#comment-17565251
 ] 

Feifan Wang commented on FLINK-28178:
-

Thanks [~yunta], I got it.

> Show the delegated StateBackend and whether changelog is enabled in the UI
> --
>
> Key: FLINK-28178
> URL: https://issues.apache.org/jira/browse/FLINK-28178
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing, Runtime / Web Frontend
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>
> If changelog is enabled, the StateBackend shown in the Web UI is always 
> 'ChangelogStateBackend'. I think ChangelogStateBackend should not be exposed 
> to users; we should show the delegated StateBackend in this place. And we 
> should add a row to indicate whether changelog is enabled.





[GitHub] [flink] xintongsong commented on a diff in pull request #20100: [FLINK-27905][runtime] Introduce HsSpillingStrategy and HsSelectiveSpillingStrategy implementation

2022-07-11 Thread GitBox


xintongsong commented on code in PR #20100:
URL: https://github.com/apache/flink/pull/20100#discussion_r918520765


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingInfoProvider.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid;
+
+import java.util.Deque;
+import java.util.List;
+
+/** This component is responsible for providing some information needed for 
the spill decision. */
+public interface HsSpillingInfoProvider {
+/**
+ * Get the number of downstream consumers.
+ *
+ * @return Number of subpartitions.
+ */
+int getNumSubpartitions();
+
+/**
+ * Get all downstream consumption progress.
+ *
+ * @return A list containing all downstream consumption progress. If the 
downstream subpartition
+ * view has not been registered, the corresponding return value is -1.
+ */
+List getConsumptionProgress();

Review Comment:
   "Consuming progress" is ambiguous. It can be "last consumed buffer index", 
or "next buffer index to consume".



##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsAllSpillingStrategy.java:
##
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/** A special implementation of {@link HsSpillingStrategy} that spills all 
buffers to disk. */
+public class HsAllSpillingStrategy implements HsSpillingStrategy {
+private final int numBuffersTriggerSpilling;
+
+public HsAllSpillingStrategy(HybridShuffleConfiguration 
hybridShuffleConfiguration) {
+this.numBuffersTriggerSpilling = 
hybridShuffleConfiguration.getNumBuffersTriggerSpilling();
+}
+
+/**
+ * When a buffer is finished, whenever the number of unspilled buffers reaches 
{@link
+ * #numBuffersTriggerSpilling}, spill and release all buffers.
+ */

Review Comment:
   This should be a comment, not a JavaDoc.



##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HybridShuffleConfiguration.java:
##
@@ -74,4 +88,82 @@ public int getMaxBuffersReadAhead() {
 public Duration getBufferRequestTimeout() {
 return bufferRequestTimeout;
 }
+
+/**
+ * When the number of buffers that have been requested exceeds this 
threshold, trigger the
+ * spilling operation. Used by {@link HsSelectiveSpillingStrategy}.
+ */
+public float getSelectiveSpillThreshold() {
+return selectiveSpillThreshold;
+}
+
+/** The proportion of buffers to be spilled. Used by {@link 
HsSelectiveSpillingStrategy}. */
+public float getSelectiveSpillBufferRatio() {
+return selectiveSpillBufferRatio;
+}
+
+/**
+ * When the number of unSpilled buffers equals this value, trigger the 
spilling operation.
+ * Used by {@link HsAllSpillingStrategy}.
+ */
+public int getNumBuffersTriggerSpilling() {
+return numBuffersTriggerSpilling;
+}

Review Comment:
   I'd suggest prefixing these properties with "Selective|FullStrategy" to 
clearly indicate that the configuration is used by a specific strategy.




[GitHub] [flink] ruanhang1993 closed pull request #18113: [FLINK-25315][tests] add new extensions and utils to help the Junit5 migration

2022-07-11 Thread GitBox


ruanhang1993 closed pull request #18113: [FLINK-25315][tests] add new 
extensions and utils to help the Junit5 migration
URL: https://github.com/apache/flink/pull/18113





[GitHub] [flink] ruanhang1993 commented on pull request #18113: [FLINK-25315][tests] add new extensions and utils to help the Junit5 migration

2022-07-11 Thread GitBox


ruanhang1993 commented on PR #18113:
URL: https://github.com/apache/flink/pull/18113#issuecomment-1181282045

   Closed by https://github.com/apache/flink/pull/20145





[GitHub] [flink] ruanhang1993 commented on pull request #18113: [FLINK-25315][tests] add new extensions and utils to help the Junit5 migration

2022-07-11 Thread GitBox


ruanhang1993 commented on PR #18113:
URL: https://github.com/apache/flink/pull/18113#issuecomment-1181281616

   Closed by #20145





[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #209: [FLINK-27103] Don't store redundant primary key fields

2022-07-11 Thread GitBox


JingsongLi commented on code in PR #209:
URL: https://github.com/apache/flink-table-store/pull/209#discussion_r918515556


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/KeyValueSerializer.java:
##
@@ -45,6 +49,13 @@ public class KeyValueSerializer extends 
ObjectSerializer {
 private final OffsetRowData reusedValue;
 private final KeyValue reusedKv;
 
+private TableSchema tableSchema;
+
+public KeyValueSerializer(RowType keyType, RowType valueType, TableSchema 
tableSchema) {

Review Comment:
   I think we can introduce a special deserializer for `ColumnarRowIterator`. After getting `VectorizedColumnBatch` from `ColumnarRowData`, we can get `ColumnVector[]`. Then we can do our projection to produce the key and value for `KeyValue`.
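
   The projection idea can be illustrated with a stand-alone sketch. The helper below is hypothetical and not the table store's actual API; it only shows how the key can be derived by projecting the value row's columns by index instead of storing the key fields a second time:

   ```java
   import java.util.Arrays;

   // Stand-alone illustration of key projection: instead of storing the key
   // fields redundantly, derive the key from the value row's columns.
   public class KeyProjectionDemo {

       // Returns the key fields selected from the value row by index.
       public static String[] project(String[] valueRow, int[] keyIndices) {
           String[] key = new String[keyIndices.length];
           for (int i = 0; i < keyIndices.length; i++) {
               key[i] = valueRow[keyIndices[i]];
           }
           return key;
       }

       public static void main(String[] args) {
           String[] value = {"k1", "v1", "k2"};
           // prints [k1, k2]
           System.out.println(Arrays.toString(project(value, new int[] {0, 2})));
       }
   }
   ```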






[jira] [Commented] (FLINK-28178) Show the delegated StateBackend and whether changelog is enabled in the UI

2022-07-11 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565246#comment-17565246
 ] 

Yun Tang commented on FLINK-28178:
--

[~Feifan Wang] sure, you can ping me on GitHub directly.

> Show the delegated StateBackend and whether changelog is enabled in the UI
> --
>
> Key: FLINK-28178
> URL: https://issues.apache.org/jira/browse/FLINK-28178
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing, Runtime / Web Frontend
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>
> If changelog is enabled, the StateBackend shown in the Web UI is always 
> 'ChangelogStateBackend'. I think ChangelogStateBackend should not be exposed 
> to users; we should show the delegated StateBackend in this place. And we 
> should add a row to indicate whether changelog is enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-web] lindong28 merged pull request #559: Rebuild website

2022-07-11 Thread GitBox


lindong28 merged PR #559:
URL: https://github.com/apache/flink-web/pull/559





[GitHub] [flink-web] lindong28 commented on pull request #559: Rebuild website

2022-07-11 Thread GitBox


lindong28 commented on PR #559:
URL: https://github.com/apache/flink-web/pull/559#issuecomment-1181272522

   LGTM.





[jira] [Created] (FLINK-28499) resource leak when job failed with unknown status In Application Mode

2022-07-11 Thread lihe ma (Jira)
lihe ma created FLINK-28499:
---

 Summary: resource leak when job failed with unknown status In 
Application Mode
 Key: FLINK-28499
 URL: https://issues.apache.org/jira/browse/FLINK-28499
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes
Affects Versions: 1.13.1
Reporter: lihe ma
 Attachments: cluster-pod-error.png

I found a job that restarted thousands of times, and the jobmanager tried to 
create a new taskmanager pod every time. The jobmanager restarted because the 
job was submitted with a duplicate job id[1] (we preset the jobId rather than 
generating it), but unfortunately I did not save the logs.

This job requires one taskmanager pod under normal circumstances, but 
thousands of pods were eventually leaked.
!image-2022-07-12-11-02-43-009.png|width=666,height=366!



In application mode, cluster resources are released when the job finishes in 
SUCCEEDED, FAILED, or CANCELED status[2][3]. When certain exceptions happen, 
the job may be terminated in UNKNOWN status[4].

In this case, the job exited with UNKNOWN status without releasing its 
taskmanager pods. So is it reasonable not to release taskmanagers when a job 
exits in UNKNOWN status?

 

 

one line in original logs:
2022-07-01 09:45:40,712 [main] INFO 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster 
entrypoint process KubernetesApplicationClusterEntrypoint with exit code 1445.

 

[1] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L452]

[2] 
[https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L90-L91]


[3] 
[https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/deployment/application/ApplicationDispatcherBootstrap.java#L175]

[4] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ApplicationStatus.java#L39]
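
The cleanup decision described above can be sketched with a hypothetical enum (this is not Flink's actual ApplicationStatus class): treating UNKNOWN like the other terminal states would avoid leaking taskmanager pods.

```java
// Hypothetical sketch: decide whether cluster resources should be released
// for a given terminal application status. Treating UNKNOWN like the other
// terminal states would avoid leaking taskmanager pods.
public class StatusCleanupDemo {
    public enum Status { SUCCEEDED, FAILED, CANCELED, UNKNOWN }

    public static boolean shouldReleaseResources(Status status) {
        switch (status) {
            case SUCCEEDED:
            case FAILED:
            case CANCELED:
            case UNKNOWN: // proposed: also clean up on UNKNOWN termination
                return true;
            default:
                return false;
        }
    }
}
```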

 

 

 





[GitHub] [flink-table-store] LadyForest commented on a diff in pull request #208: [FLINK-28483] Basic schema evolution for table store

2022-07-11 Thread GitBox


LadyForest commented on code in PR #208:
URL: https://github.com/apache/flink-table-store/pull/208#discussion_r918514001


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/schema/SchemaChange.java:
##
@@ -0,0 +1,206 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.schema;
+
+import org.apache.flink.table.types.logical.LogicalType;
+
+import javax.annotation.Nullable;
+
+import java.util.Objects;
+
+/** Schema change to table. */
+public interface SchemaChange {

Review Comment:
   > Lack `updateColumnNullability` and `updateColumnComment`.
   
   What about adding a TODO to track them?







[GitHub] [flink-web] zhipeng93 opened a new pull request, #559: Rebuild website

2022-07-11 Thread GitBox


zhipeng93 opened a new pull request, #559:
URL: https://github.com/apache/flink-web/pull/559

   This is a hotfix of https://github.com/apache/flink-web/pull/558.
   
   In https://github.com/apache/flink-web/pull/558, some auto-generated files 
were missing.
   





[GitHub] [flink] Tartarus0zm commented on pull request #20223: [FLINK-28457][runtime] Introduce JobStatusHook

2022-07-11 Thread GitBox


Tartarus0zm commented on PR #20223:
URL: https://github.com/apache/flink/pull/20223#issuecomment-1181267795

   @lsyldliu @reswqa  thanks for your review. I have addressed the comments 
left.
   Please take a look again.
   cc @gaoyunhaii 
   





[GitHub] [flink-web] lindong28 merged pull request #558: Release Flink ML 2.1.0

2022-07-11 Thread GitBox


lindong28 merged PR #558:
URL: https://github.com/apache/flink-web/pull/558





[GitHub] [flink-web] lindong28 commented on pull request #558: Release Flink ML 2.1.0

2022-07-11 Thread GitBox


lindong28 commented on PR #558:
URL: https://github.com/apache/flink-web/pull/558#issuecomment-1181263032

   Thank you for the fix.





[jira] [Updated] (FLINK-27103) Don't store redundant primary key fields

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27103:
---
Labels: pull-request-available  (was: )

> Don't store redundant primary key fields
> 
>
> Key: FLINK-27103
> URL: https://issues.apache.org/jira/browse/FLINK-27103
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: pull-request-available
> Fix For: table-store-0.2.0
>
>
> We are currently storing the primary key redundantly in the file, we can 
> directly use the primary key field in the original fields to avoid redundant 
> storage





[GitHub] [flink-table-store] SteNicholas opened a new pull request, #209: [FLINK-27103] Don't store redundant primary key fields

2022-07-11 Thread GitBox


SteNicholas opened a new pull request, #209:
URL: https://github.com/apache/flink-table-store/pull/209

   `ChangelogWithKeyFileStoreTable` currently stores the primary key 
redundantly in the file, which could directly use the primary key field in the 
original fields to avoid redundant storage.
   
   **The brief change log**
   - The `primaryKey` of `SinkRecord` is set to null for 
`ChangelogWithKeyFileStoreTable` and `KeyValueSerializer` reads the primary key 
from the value.





[GitHub] [flink] Tartarus0zm commented on a diff in pull request #20223: [FLINK-28457][runtime] Introduce JobStatusHook

2022-07-11 Thread GitBox


Tartarus0zm commented on code in PR #20223:
URL: https://github.com/apache/flink/pull/20223#discussion_r918502121


##
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/DefaultExecutionGraph.java:
##
@@ -1544,6 +1552,30 @@ private void notifyJobStatusChange(JobStatus newState) {
 }
 }
 
+private void notifyJobStatusHooks(JobStatus newState, Throwable cause) {
+JobID jobID = jobInformation.getJobId();
+for (JobStatusHook hook : jobStatusHooks) {
+try {
+switch (newState) {
+case CREATED:
+hook.onCreated(jobID);
+break;
+case CANCELED:
+hook.onCanceled(jobID);
+break;
+case FAILED:
+hook.onFailed(jobID, cause);
+break;
+case FINISHED:
+hook.onFinished(jobID);
+break;
+}
+} catch (Throwable ignore) {
+LOG.warn("Error while notifying JobStatusHook[{}]", 
hook.getClass(), ignore);

Review Comment:
   good catch






[GitHub] [flink] Tartarus0zm commented on a diff in pull request #20223: [FLINK-28457][runtime] Introduce JobStatusHook

2022-07-11 Thread GitBox


Tartarus0zm commented on code in PR #20223:
URL: https://github.com/apache/flink/pull/20223#discussion_r918501911


##
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/DefaultExecutionGraph.java:
##
@@ -1544,6 +1552,30 @@ private void notifyJobStatusChange(JobStatus newState) {
 }
 }
 
+private void notifyJobStatusHooks(JobStatus newState, Throwable cause) {
+JobID jobID = jobInformation.getJobId();
+for (JobStatusHook hook : jobStatusHooks) {
+try {
+switch (newState) {
+case CREATED:
+hook.onCreated(jobID);

Review Comment:
   This is not necessary; only the FAILED branch needs to use `cause`.
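
   The dispatch-and-guard pattern in the diff above can be sketched in isolation. The `JobStatusHook` interface below is a simplified stand-in, not Flink's actual one; the point is that each hook call is guarded so one misbehaving hook cannot prevent the remaining hooks from being notified:

   ```java
   import java.util.List;

   // Stand-alone sketch of guarded hook notification: a throwing hook must
   // not prevent the remaining hooks from being notified.
   public class HookNotifierDemo {

       // Simplified stand-in for Flink's JobStatusHook (FAILED case only).
       public interface JobStatusHook {
           void onFailed(String jobId, Throwable cause);
       }

       // Returns the number of hooks notified successfully.
       public static int notifyFailed(
               List<JobStatusHook> hooks, String jobId, Throwable cause) {
           int notified = 0;
           for (JobStatusHook hook : hooks) {
               try {
                   hook.onFailed(jobId, cause);
                   notified++;
               } catch (Throwable t) {
                   // Log and continue; do not let one hook break the others.
               }
           }
           return notified;
       }
   }
   ```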






[GitHub] [flink] liuzhuang2017 commented on pull request #20208: [hotfix][flink-rpc] Fix the RemoteRpcInvocation class typo.

2022-07-11 Thread GitBox


liuzhuang2017 commented on PR #20208:
URL: https://github.com/apache/flink/pull/20208#issuecomment-1181253936

   @MartijnVisser , Sorry to bother you again, can you help me review this pr? 
Thank you.





[GitHub] [flink] liuzhuang2017 commented on pull request #20210: [FLINK-27205][docs-zh] Translate "Concepts -> Glossary" page into Chinese.

2022-07-11 Thread GitBox


liuzhuang2017 commented on PR #20210:
URL: https://github.com/apache/flink/pull/20210#issuecomment-1181253412

   @MartijnVisser , Sorry to bother you again, can you help me review this pr? 
Thank you.





[GitHub] [flink] liuzhuang2017 commented on pull request #20226: Fix Chinese document format errors.

2022-07-11 Thread GitBox


liuzhuang2017 commented on PR #20226:
URL: https://github.com/apache/flink/pull/20226#issuecomment-1181253038

   @MartijnVisser , Sorry to bother you again, can you help me review this pr? 
Thank you.





[jira] [Commented] (FLINK-28178) Show the delegated StateBackend and whether changelog is enabled in the UI

2022-07-11 Thread Feifan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565228#comment-17565228
 ] 

Feifan Wang commented on FLINK-28178:
-

Hi [~yunta] , can you help me review the 
[pr|https://github.com/apache/flink/pull/20103] ?

> Show the delegated StateBackend and whether changelog is enabled in the UI
> --
>
> Key: FLINK-28178
> URL: https://issues.apache.org/jira/browse/FLINK-28178
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing, Runtime / Web Frontend
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-1.png
>
>
> If changelog is enabled, StateBackend shown in Web UI is always 
> 'ChangelogStateBackend'. I think ChangelogStateBackend should not expose to 
> user, we should show the delegated StateBackend in this place. And We should 
> add add a row to indicate whether changelog is enabled.





[GitHub] [flink] swuferhong commented on pull request #20007: [FLINK-27989][table-planner] Csv format supports reporting statistics

2022-07-11 Thread GitBox


swuferhong commented on PR #20007:
URL: https://github.com/apache/flink/pull/20007#issuecomment-1181245110

   @godfreyhe CI passed, could you help me to merge it ! Thanks! 





[GitHub] [flink] Myasuka commented on pull request #19701: [FLINK-24786][state] Introduce and expose RocksDB statistics as metrics

2022-07-11 Thread GitBox


Myasuka commented on PR #19701:
URL: https://github.com/apache/flink/pull/19701#issuecomment-1181241854

   @flinkbot run azure





[GitHub] [flink] luoyuxia commented on a diff in pull request #20227: [FLINK-28471]Translate hive_read_write.md into Chinese

2022-07-11 Thread GitBox


luoyuxia commented on code in PR #20227:
URL: https://github.com/apache/flink/pull/20227#discussion_r918482407


##
docs/content.zh/docs/connectors/table/hive/hive_read_write.md:
##
@@ -25,74 +25,63 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Hive Read & Write
+# Hive 读 & 写
 
-Using the `HiveCatalog`, Apache Flink can be used for unified `BATCH` and 
`STREAM` processing of Apache 
-Hive Tables. This means Flink can be used as a more performant alternative to 
Hive’s batch engine,
-or to continuously read and write data into and out of Hive tables to power 
real-time data
-warehousing applications. 
+通过使用 `HiveCatalog`,Apache Flink 可以对 Apache Hive 表做统一的批和流处理。这意味着 Flink 可以成为 
Hive 批处理引擎的一个性能更好的选择,或者连续读写 Hive 表中的数据以支持为实时数据仓库应用。

Review Comment:
   ```suggestion
   通过使用 `HiveCatalog`,Apache Flink 可以对 Apache Hive 表做统一的批和流处理。这意味着 Flink 可以成为 
Hive 批处理引擎的一个性能更好的选择,或者连续读写 Hive 表中的数据以支持实时数据仓库应用。
   ```



##
docs/content.zh/docs/connectors/table/hive/hive_read_write.md:
##
@@ -25,74 +25,63 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-# Hive Read & Write
+# Hive 读 & 写
 
-Using the `HiveCatalog`, Apache Flink can be used for unified `BATCH` and 
`STREAM` processing of Apache 
-Hive Tables. This means Flink can be used as a more performant alternative to 
Hive’s batch engine,
-or to continuously read and write data into and out of Hive tables to power 
real-time data
-warehousing applications. 
+通过使用 `HiveCatalog`,Apache Flink 可以对 Apache Hive 表做统一的批和流处理。这意味着 Flink 可以成为 
Hive 批处理引擎的一个性能更好的选择,或者连续读写 Hive 表中的数据以支持为实时数据仓库应用。
 
-## Reading
+## 读
 
-Flink supports reading data from Hive in both `BATCH` and `STREAMING` modes. 
When run as a `BATCH`
-application, Flink will execute its query over the state of the table at the 
point in time when the
-query is executed. `STREAMING` reads will continuously monitor the table and 
incrementally fetch
-new data as it is made available. Flink will read tables as bounded by default.
+Flink 支持以批和流两种模式从 Hive 表中读取数据。批读的时候,Flink 
会基于执行查询时表的状态进行查询。流读时将持续监控表,并在表中新数据可用时进行增量获取,默认情况下,Flink 将以批模式读取数据。
 
-`STREAMING` reads support consuming both partitioned and non-partitioned 
tables. 
-For partitioned tables, Flink will monitor the generation of new partitions, 
and read
-them incrementally when available. For non-partitioned tables, Flink will 
monitor the generation
-of new files in the folder and read new files incrementally.
+流读支持消费分区表和非分区表。对于分区表,Flink 会监控新分区的生成,并且在数据可用的情况下增量获取数据。对于非分区表,Flink 
将监控文件夹中新文件的生成,并增量地读取新文件。
 
 
   
 
-Key
-Default
-Type
-Description
+键
+默认值
+类型
+描述
 
   
   
 
 streaming-source.enable
 false
 Boolean
-Enable streaming source or not. NOTES: Please make sure that each 
partition/file should be written atomically, otherwise the reader may get 
incomplete data.
+是否启动流读。注意:请确保每个分区/文件都应该原子地写入,否则读取不到完整的数据。
 
 
 streaming-source.partition.include
 all
 String
-Option to set the partitions to read, the supported option are 
`all` and `latest`, the `all` means read all partitions; the `latest` means 
read latest partition in order of 'streaming-source.partition.order', the 
`latest` only works` when the streaming hive source table used as temporal 
table. By default the option is `all`.
-Flink supports temporal join the latest hive partition by enabling 
'streaming-source.enable' and setting 'streaming-source.partition.include' to 
'latest', at the same time, user can assign the partition compare order and 
data update interval by configuring following partition-related options.  
+选择读取的分区,可选项为 `all` 和 `latest`,`all` 读取所有分区;`latest` 读取按照 
'streaming-source.partition.order' 排序后的最新分区,`latest` 仅在 streaming 模式的 hive 
source table 作为时态表时有效。默认的选项是`all`。在开启 'streaming-source.enable' 并设置 
'streaming-source.partition.include' 为 'latest' 时,Flink 支持 temporal join 
最新的hive分区,同时,用户可以通过配置分区相关的选项来分配分区比较顺序和数据更新时间间隔。 
 
  
 
 streaming-source.monitor-interval
 None
 Duration
-Time interval for consecutively monitoring partition/file.
-Notes: The default interval for hive streaming reading is '1 min', 
the default interval for hive streaming temporal join is '60 min', this is 
because there's one framework limitation that every TM will visit the Hive 
metaStore in current hive streaming temporal join implementation which may 
produce pressure to metaStore, this will improve in the future.
+连续监控分区/文件的时间间隔。
+注意: 默认情况下,流式读 Hive 的间隔为 '1 min',但流读 Hive 的 temporal join 的默认时间间隔是 
'60 min',这是因为当前流读 Hive 的 temporal join 实现上有一个框架限制,即每个TM都要访问 Hive 
metaStore,这可能会对metaStore产生压力,这个问题将在未来得到改善。
 
 
 streaming-source.partition-order
 partition-name
 String
-The partition order of streaming 

[GitHub] [flink-connector-elasticsearch] deadwind4 commented on pull request #24: [FLINK-18887][connectors/elasticsearch][docs][docs-zh] Add doc of Elasticsearch connector for Python DataStream API

2022-07-11 Thread GitBox


deadwind4 commented on PR #24:
URL: 
https://github.com/apache/flink-connector-elasticsearch/pull/24#issuecomment-1181235582

   @dianfu Please review it.





[GitHub] [flink] leozhangsr commented on pull request #20234: [FLINK-28475] [Connector/kafka] Stopping offset can be 0

2022-07-11 Thread GitBox


leozhangsr commented on PR #20234:
URL: https://github.com/apache/flink/pull/20234#issuecomment-1181224833

   > @leozhangsr Can you add a test for this situation?
   
   ok, I will add a test later





[jira] [Commented] (FLINK-27579) The param client.timeout can not be set by dynamic properties when stopping the job

2022-07-11 Thread Yao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565208#comment-17565208
 ] 

Yao Zhang commented on FLINK-27579:
---

Hi [~wangyang0918] ,

Thank you very much for your suggestions and code review.

If there is something need to update in this PR, please let me know.

> The param client.timeout can not be set by dynamic properties when stopping 
> the job 
> 
>
> Key: FLINK-27579
> URL: https://issues.apache.org/jira/browse/FLINK-27579
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission
>Affects Versions: 1.16.0
>Reporter: Liu
>Priority: Major
>  Labels: pull-request-available
>
> The default client.timeout value is one minute, which may be too short for 
> stop-with-savepoint on jobs with large state.
> When we stop the job via dynamic properties (-D or -yD for YARN), 
> client.timeout does not take effect.
> From the code, we can see that the dynamic properties only take effect for 
> the run command. We should support them for the stop command as well.





[jira] [Updated] (FLINK-26256) AWS SDK Async Event Loop Group Classloader Issue

2022-07-11 Thread Flink Jira Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-26256:
---
Labels: pull-request-available stale-assigned  (was: pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue is assigned but has not 
received an update in 30 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it.


> AWS SDK Async Event Loop Group Classloader Issue
> 
>
> Key: FLINK-26256
> URL: https://issues.apache.org/jira/browse/FLINK-26256
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Reporter: Danny Cranmer
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h3. Background
> AWS SDK v2 async clients use a Netty async client for Kinesis Data 
> Streams/Firehose sink and Kinesis Data Streams EFO consumer. The SDK creates 
> a shared thread pool for Netty to use for network operations when one is not 
> configured. The thread pool is managed by a shared ELG (event loop group), 
> and this is stored in a static field. We do not configure this for the AWS 
> connectors in the Flink codebase. 
> When threads are spawned within the ELG, they inherit the context classloader 
> from the current thread. If the ELG is created from a shared classloader, for 
> instance Flink parent classloader, or MiniCluster parent classloader, 
> multiple Flink jobs can share the same ELG. When an ELG thread is spawned 
> from a Flink job, it will inherit the Flink user classloader. When this job 
> completes/fails, the classloader is destroyed, however the Netty thread is 
> still referencing it, and this leads to below exception.
> h3. Impact
> This issue *does not* impact jobs deployed to TM when AWS SDK v2 is loaded 
> via the Flink User Classloader. It is expected this is the standard 
> deployment configuration.
> This issue is known to impact:
> - Flink mini cluster, for example in integration tests (FLINK-26064)
> - Flink cluster loading AWS SDK v2 via parent classloader
> h3. Suggested solution
> There are a few possible solutions, as discussed in 
> https://github.com/apache/flink/pull/18733
> 1. Create a separate ELG per Flink job
> 2. Create a separate ELG per subtask
> 3. Attach the correct classloader to ELG spawned threads
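The mechanism behind this bug (and behind solution 3) is plain JDK behavior: a newly created thread inherits the context classloader of the thread that spawns it. A self-contained sketch, with no AWS SDK involved, demonstrating that inheritance:

```java
public class ContextClassLoaderInheritanceDemo {

    // Returns true if a thread spawned from another thread inherits that
    // thread's context classloader -- standard JDK behavior, and the reason an
    // ELG thread created from a Flink job keeps referencing the job's user
    // classloader after the job finishes.
    public static boolean childInheritsContextClassLoader() {
        ClassLoader custom =
                new ClassLoader(ContextClassLoaderInheritanceDemo.class.getClassLoader()) {};
        final ClassLoader[] seenByChild = new ClassLoader[1];

        Thread parent =
                new Thread(
                        () -> {
                            // The child is created while "parent" runs, so it
                            // copies the parent's context classloader.
                            Thread child =
                                    new Thread(
                                            () ->
                                                    seenByChild[0] =
                                                            Thread.currentThread()
                                                                    .getContextClassLoader());
                            child.start();
                            joinQuietly(child);
                        });
        parent.setContextClassLoader(custom);
        parent.start();
        joinQuietly(parent);

        return seenByChild[0] == custom;
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(childInheritsContextClassLoader()); // prints "true"
    }
}
```

This is why a shared static ELG is dangerous: whichever job's thread happens to trigger ELG thread creation leaks its user classloader into threads that outlive the job.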
> h3. Error Stack
> (shortened stack trace, as full is too large)
> {noformat}
> Feb 09 20:05:04 java.util.concurrent.ExecutionException: 
> software.amazon.awssdk.core.exception.SdkClientException: Unable to execute 
> HTTP request: Trying to access closed classloader. Please check if you store 
> classloaders directly or indirectly in static fields. If the stacktrace 
> suggests that the leak occurs in a third party library and cannot be fixed 
> immediately, you can disable this check with the configuration 
> 'classloader.check-leaked-classloader'.
> Feb 09 20:05:04   at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> Feb 09 20:05:04   at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> (...)
> Feb 09 20:05:04 Caused by: 
> software.amazon.awssdk.core.exception.SdkClientException: Unable to execute 
> HTTP request: Trying to access closed classloader. Please check if you store 
> classloaders directly or indirectly in static fields. If the stacktrace 
> suggests that the leak occurs in a third party library and cannot be fixed 
> immediately, you can disable this check with the configuration 
> 'classloader.check-leaked-classloader'.
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:43)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:204)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:200)
> Feb 09 20:05:04   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:179)
> Feb 09 20:05:04 

[jira] [Assigned] (FLINK-27142) Remove bash tests dependencies on the Elasticsearch connector.

2022-07-11 Thread Alexander Fedulov (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Fedulov reassigned FLINK-27142:
-

Assignee: Alexander Fedulov  (was: Alexander Preuss)

> Remove bash tests dependencies on the Elasticsearch connector.
> --
>
> Key: FLINK-27142
> URL: https://issues.apache.org/jira/browse/FLINK-27142
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Alexander Fedulov
>Assignee: Alexander Fedulov
>Priority: Major
>  Labels: pull-request-available
>
> The Elasticsearch connector is used in test_quickstart.sh and test_sql_client.sh. 
> It is desirable to prevent such a cyclic dependency and remove the connector 
> usage from within Flink.





[GitHub] [flink] RyanSkraba commented on pull request #20230: [FLINK-28449][tests][JUnit5 migration] flink-parquet

2022-07-11 Thread GitBox


RyanSkraba commented on PR #20230:
URL: https://github.com/apache/flink/pull/20230#issuecomment-1180875380

   @flinkbot run azure
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan commented on a diff in pull request #19679: [FLINK-23143][state/changelog] Support state migration for ChangelogS…

2022-07-11 Thread GitBox


rkhachatryan commented on code in PR #19679:
URL: https://github.com/apache/flink/pull/19679#discussion_r918295754


##
flink-runtime/src/main/java/org/apache/flink/runtime/state/heap/HeapKeyedStateBackend.java:
##
@@ -163,6 +170,32 @@ KeyGroupedInternalPriorityQueue create(
 return priorityQueuesManager.createOrUpdate(stateName, 
byteOrderedElementSerializer);
 }
 
+@Override
+public <N, SV, SEV, S extends State, IS extends S> IS upgradeKeyedState(
+TypeSerializer<N> namespaceSerializer,
+StateDescriptor<S, SV> stateDescriptor,
+@Nonnull StateSnapshotTransformFactory<SEV> snapshotTransformFactory)
+throws Exception {
+
Preconditions.checkState(createdKVStates.containsKey(stateDescriptor.getName()));
+registeredKVStates.computeIfPresent(
+stateDescriptor.getName(),
+(stateName, stateTable) -> {
+stateTable.setMetaInfo(
+new RegisteredKeyValueStateBackendMetaInfo<>(
+stateTable.getMetaInfo().snapshot()));
+return stateTable;

Review Comment:
   1. I think we also need to update `StateMap`s serializers, don't we?
   2. The purpose of re-creating `RegisteredKeyValueStateBackendMetaInfo` isn't 
obvious (it is using a different `StateSerializerProvider`, right?). I think it 
deserves a comment, WDYT?



##
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java:
##
@@ -565,18 +568,44 @@ public  IS 
createInternalState(
 StateSnapshotTransformer.StateSnapshotTransformFactory
 snapshotTransformFactory)
 throws Exception {
+ChangelogState changelogState =
+changelogStateFactory.getExistingState(
+stateDesc.getName(), BackendStateType.KEY_VALUE);
+if (changelogState == null) {
+InternalKvState state =
+keyedStateBackend.createInternalState(
+namespaceSerializer, stateDesc, 
snapshotTransformFactory);
+
+changelogState =
+changelogStateFactory.create(
+stateDesc,
+state,
+getKvStateChangeLogger(state, stateDesc, 
snapshotTransformFactory),
+keyedStateBackend /* pass the nested backend as 
key context so that it get key updates on recovery*/);
+} else {
+InternalKvState state =
+keyedStateBackend.upgradeKeyedState(
+namespaceSerializer, stateDesc, 
snapshotTransformFactory);
+changelogState.setDelegatedState(state);

Review Comment:
   Is it only the delegated state that needs to be upgraded?
   I think at least the serializer inside the logger needs to be upgraded as 
well.
   
   ditto: PQ state



##
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogKeyedStateBackend.java:
##
@@ -565,18 +568,44 @@ public  IS 
createInternalState(
 StateSnapshotTransformer.StateSnapshotTransformFactory
 snapshotTransformFactory)
 throws Exception {
+ChangelogState changelogState =
+changelogStateFactory.getExistingState(
+stateDesc.getName(), BackendStateType.KEY_VALUE);
+if (changelogState == null) {

Review Comment:
   I'm wondering about the "normal" case - i.e. with both materialized and 
non-materialized state:
   1. The internal backend might receive serializers from its initial snapshot
   2. when first reading the `METADATA` record from the changelog, `changelogState` 
will be `null`
   3. so `keyedStateBackend.upgradeKeyedState` will not be called
   
   Or am I missing something?
   
   ditto: PQ state






[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


gyfora commented on PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#issuecomment-1180808724

   cc @tweise @wangyang0918 





[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


morhidi commented on PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#issuecomment-1180789758

   > > What is `FlinkDeployment.Lifecycle.State.STATE_NAME.Count: 0` ?
   > 
   > @morhidi `STATE_NAME` is a placeholder in this example. In practice it's 
gonna be one of CREATED, UPGRADING, DEPLOYED etc.
   
   
   
   
   nit: you could update the description to 
`FlinkDeployment.Lifecycle.State.<STATE_NAME>.Count = ...` to make it obvious





[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


gyfora commented on PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#issuecomment-1180786814

   > What is `FlinkDeployment.Lifecycle.State.STATE_NAME.Count: 0` ?
   
   @morhidi `STATE_NAME` is a placeholder in this example. In practice it's 
gonna be one of CREATED, UPGRADING, DEPLOYED etc.





[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


gyfora commented on code in PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#discussion_r918283173


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/metrics/lifecycle/ResourceLifecycleMetricsTest.java:
##
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics.lifecycle;
+
+import org.apache.flink.api.common.JobStatus;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.TestUtils;
+import org.apache.flink.kubernetes.operator.crd.spec.JobState;
+import org.apache.flink.kubernetes.operator.crd.status.ReconciliationState;
+import org.apache.flink.kubernetes.operator.reconciler.ReconciliationUtils;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogram;
+
+import org.junit.jupiter.api.Test;
+
+import java.time.Instant;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.CREATED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.DEPLOYED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.FAILED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLED_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLING_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.STABLE;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.SUSPENDED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.UPGRADING;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+/** Test for resource lifecycle metrics. */
+public class ResourceLifecycleMetricsTest {
+
+@Test
+public void lifecycleStateTest() {
+var application = TestUtils.buildApplicationCluster();
+assertEquals(CREATED, application.getStatus().getLifecycleState());
+
+ReconciliationUtils.updateStatusBeforeDeploymentAttempt(application, 
new Configuration());
+assertEquals(UPGRADING, application.getStatus().getLifecycleState());
+
+ReconciliationUtils.updateStatusForDeployedSpec(application, new 
Configuration());
+assertEquals(DEPLOYED, application.getStatus().getLifecycleState());
+
+
application.getStatus().getReconciliationStatus().markReconciledSpecAsStable();
+assertEquals(STABLE, application.getStatus().getLifecycleState());
+
+application.getStatus().setError("errr");
+assertEquals(STABLE, application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.FAILED.name());
+assertEquals(FAILED, application.getStatus().getLifecycleState());
+
+application.getStatus().setError("");
+
+application
+.getStatus()
+.getReconciliationStatus()
+.setState(ReconciliationState.ROLLING_BACK);
+assertEquals(ROLLING_BACK, 
application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.RECONCILING.name());
+
application.getStatus().getReconciliationStatus().setState(ReconciliationState.ROLLED_BACK);
+assertEquals(ROLLED_BACK, application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.FAILED.name());
+assertEquals(FAILED, application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.RUNNING.name());
+application.getSpec().getJob().setState(JobState.SUSPENDED);
+ReconciliationUtils.updateStatusForDeployedSpec(application, new 
Configuration());
+assertEquals(SUSPENDED, application.getStatus().getLifecycleState());
+}
+
+@Test
+public void testLifecycleTracker() {
+var histos = initHistos();
+  

[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


morhidi commented on PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#issuecomment-1180785668

   I love this PR @gyfora, I added some minor comments.





[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


gyfora commented on code in PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#discussion_r918282383


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/metrics/lifecycle/LifecycleMetrics.java:
##
@@ -0,0 +1,205 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics.lifecycle;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.crd.AbstractFlinkResource;
+import 
org.apache.flink.kubernetes.operator.metrics.KubernetesOperatorMetricGroup;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogram;
+
+import lombok.RequiredArgsConstructor;
+import lombok.ToString;
+
+import java.time.Clock;
+import java.time.Instant;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.CREATED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.DEPLOYED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLED_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLING_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.STABLE;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.SUSPENDED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.UPGRADING;
+
+/**
+ * Utility for tracking resource lifecycle metrics globally and per namespace.
+ *
+ * @param <CR> Flink resource type.
+ */
+public class LifecycleMetrics<CR extends AbstractFlinkResource<?, ?>> {
+
+public static final List TRACKED_TRANSITIONS = 
getTrackedTransitions();
+
+private final Map<Tuple2<String, String>, ResourceLifecycleMetricTracker> lifecycleTrackers =
+new ConcurrentHashMap<>();
+private final Set<String> namespaces = Collections.newSetFromMap(new ConcurrentHashMap<>());
+
+private final int histogramWindowSize = 1000;

Review Comment:
   yes, that would be nice I agree
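For background, the transition tracking these windowed histograms feed reduces to recording the elapsed time between lifecycle state changes. A simplified, hypothetical sketch (not the operator's actual `ResourceLifecycleMetricTracker`):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: names and structure are hypothetical, not the
// flink-kubernetes-operator implementation.
public class TransitionTrackerSketch {

    private String currentState;
    private Instant since;
    private final Map<String, List<Long>> transitionMillis = new HashMap<>();

    public TransitionTrackerSketch(String initialState, Instant time) {
        this.currentState = initialState;
        this.since = time;
    }

    // Record a state change and the time spent in the previous state.
    public void onUpdate(String newState, Instant time) {
        if (!newState.equals(currentState)) {
            String transition = currentState + "->" + newState;
            transitionMillis
                    .computeIfAbsent(transition, k -> new ArrayList<>())
                    .add(Duration.between(since, time).toMillis());
            currentState = newState;
            since = time;
        }
    }

    public List<Long> durations(String transition) {
        return transitionMillis.getOrDefault(transition, Collections.emptyList());
    }

    public static void main(String[] args) {
        TransitionTrackerSketch tracker =
                new TransitionTrackerSketch("CREATED", Instant.ofEpochMilli(0));
        tracker.onUpdate("DEPLOYED", Instant.ofEpochMilli(500));
        System.out.println(tracker.durations("CREATED->DEPLOYED")); // prints "[500]"
    }
}
```

In the real tracker the recorded durations would be fed into per-transition histograms (e.g. a windowed `DescriptiveStatisticsHistogram`) rather than kept in raw lists.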






[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


morhidi commented on code in PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#discussion_r91828


##
flink-kubernetes-operator/src/test/java/org/apache/flink/kubernetes/operator/metrics/lifecycle/ResourceLifecycleMetricsTest.java:
##
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics.lifecycle;
+
+import org.apache.flink.api.common.JobStatus;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.TestUtils;
+import org.apache.flink.kubernetes.operator.crd.spec.JobState;
+import org.apache.flink.kubernetes.operator.crd.status.ReconciliationState;
+import org.apache.flink.kubernetes.operator.reconciler.ReconciliationUtils;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogram;
+
+import org.junit.jupiter.api.Test;
+
+import java.time.Instant;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.CREATED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.DEPLOYED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.FAILED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLED_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLING_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.STABLE;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.SUSPENDED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.UPGRADING;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+/** Test for resource lifecycle metrics. */
+public class ResourceLifecycleMetricsTest {
+
+@Test
+public void lifecycleStateTest() {
+var application = TestUtils.buildApplicationCluster();
+assertEquals(CREATED, application.getStatus().getLifecycleState());
+
+ReconciliationUtils.updateStatusBeforeDeploymentAttempt(application, 
new Configuration());
+assertEquals(UPGRADING, application.getStatus().getLifecycleState());
+
+ReconciliationUtils.updateStatusForDeployedSpec(application, new 
Configuration());
+assertEquals(DEPLOYED, application.getStatus().getLifecycleState());
+
+
application.getStatus().getReconciliationStatus().markReconciledSpecAsStable();
+assertEquals(STABLE, application.getStatus().getLifecycleState());
+
+application.getStatus().setError("errr");
+assertEquals(STABLE, application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.FAILED.name());
+assertEquals(FAILED, application.getStatus().getLifecycleState());
+
+application.getStatus().setError("");
+
+application
+.getStatus()
+.getReconciliationStatus()
+.setState(ReconciliationState.ROLLING_BACK);
+assertEquals(ROLLING_BACK, 
application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.RECONCILING.name());
+
application.getStatus().getReconciliationStatus().setState(ReconciliationState.ROLLED_BACK);
+assertEquals(ROLLED_BACK, application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.FAILED.name());
+assertEquals(FAILED, application.getStatus().getLifecycleState());
+
+
application.getStatus().getJobStatus().setState(JobStatus.RUNNING.name());
+application.getSpec().getJob().setState(JobState.SUSPENDED);
+ReconciliationUtils.updateStatusForDeployedSpec(application, new 
Configuration());
+assertEquals(SUSPENDED, application.getStatus().getLifecycleState());
+}
+
+@Test
+public void testLifecycleTracker() {
+var histos = initHistos();
+ 

[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


gyfora commented on code in PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#discussion_r918282131


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/metrics/MetricManager.java:
##
@@ -18,31 +18,37 @@
 package org.apache.flink.kubernetes.operator.metrics;
 
 import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.crd.AbstractFlinkResource;
 import org.apache.flink.kubernetes.operator.crd.FlinkDeployment;
 import org.apache.flink.kubernetes.operator.crd.FlinkSessionJob;
+import org.apache.flink.kubernetes.operator.metrics.lifecycle.LifecycleMetrics;
 
-import io.fabric8.kubernetes.client.CustomResource;
-
+import java.time.Clock;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 
 /** Metric manager for Operator managed custom resources. */
-public class MetricManager<CR extends CustomResource<?, ?>> {
+public class MetricManager<CR extends AbstractFlinkResource<?, ?>> {

Review Comment:
   I think the name is fine for now.






[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


morhidi commented on code in PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#discussion_r918277417


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/metrics/lifecycle/LifecycleMetrics.java:
##
@@ -0,0 +1,205 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics.lifecycle;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.crd.AbstractFlinkResource;
+import 
org.apache.flink.kubernetes.operator.metrics.KubernetesOperatorMetricGroup;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogram;
+
+import lombok.RequiredArgsConstructor;
+import lombok.ToString;
+
+import java.time.Clock;
+import java.time.Instant;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.CREATED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.DEPLOYED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLED_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLING_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.STABLE;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.SUSPENDED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.UPGRADING;
+
+/**
+ * Utility for tracking resource lifecycle metrics globally and per namespace.
+ *
+ * @param <CR> Flink resource type.
+ */
+public class LifecycleMetrics<CR extends AbstractFlinkResource<?, ?>> {
+
+public static final List TRACKED_TRANSITIONS = 
getTrackedTransitions();
+
+private final Map<Tuple2<String, String>, ResourceLifecycleMetricTracker> lifecycleTrackers =
+new ConcurrentHashMap<>();
+private final Set<String> namespaces = Collections.newSetFromMap(new ConcurrentHashMap<>());
+
+private final int histogramWindowSize = 1000;
+private final Configuration configuration;
+private final Clock clock;
+private final KubernetesOperatorMetricGroup operatorMetricGroup;
+
+private Map>> 
transitionMetrics;
+private Function metricGroupFunction;
+
+public LifecycleMetrics(
+Configuration configuration,
+Clock clock,
+KubernetesOperatorMetricGroup operatorMetricGroup) {
+this.configuration = configuration;
+this.clock = clock;
+this.operatorMetricGroup = operatorMetricGroup;
+}
+
+public void onUpdate(CR cr) {
+
getLifecycleMetricTracker(cr).onUpdate(cr.getStatus().getLifecycleState(), 
clock.instant());
+}
+
+public void onRemove(CR cr) {
+lifecycleTrackers.remove(
+Tuple2.of(cr.getMetadata().getNamespace(), 
cr.getMetadata().getName()));
+}
+
+private ResourceLifecycleMetricTracker getLifecycleMetricTracker(CR cr) {
+init(cr);
+createNamespaceStateCountIfMissing(cr.getMetadata().getNamespace());
+return lifecycleTrackers.computeIfAbsent(
+Tuple2.of(cr.getMetadata().getNamespace(), 
cr.getMetadata().getName()),
+k -> {
+var initialState = cr.getStatus().getLifecycleState();
+var time =
+initialState == CREATED
+? 
Instant.parse(cr.getMetadata().getCreationTimestamp())
+: clock.instant();
+return new ResourceLifecycleMetricTracker(
+initialState, time, getTransitionHistograms(cr));
+});
+}
+
+private void createNamespaceStateCountIfMissing(String namespace) {
+if (!namespaces.add(namespace)) {
+return;
+   

[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


morhidi commented on code in PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#discussion_r918274673


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/metrics/lifecycle/LifecycleMetrics.java:
##
@@ -0,0 +1,205 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.metrics.lifecycle;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.crd.AbstractFlinkResource;
+import 
org.apache.flink.kubernetes.operator.metrics.KubernetesOperatorMetricGroup;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogram;
+
+import lombok.RequiredArgsConstructor;
+import lombok.ToString;
+
+import java.time.Clock;
+import java.time.Instant;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.CREATED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.DEPLOYED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLED_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.ROLLING_BACK;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.STABLE;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.SUSPENDED;
+import static 
org.apache.flink.kubernetes.operator.metrics.lifecycle.ResourceLifecycleState.UPGRADING;
+
+/**
+ * Utility for tracking resource lifecycle metrics globally and per namespace.
+ *
+ * @param <CR> Flink resource type.
+ */
+public class LifecycleMetrics<CR extends AbstractFlinkResource<?, ?>> {
+
+public static final List TRACKED_TRANSITIONS = 
getTrackedTransitions();
+
+private final Map<Tuple2<String, String>, ResourceLifecycleMetricTracker> lifecycleTrackers =
+new ConcurrentHashMap<>();
+private final Set<String> namespaces = Collections.newSetFromMap(new ConcurrentHashMap<>());
+
+private final int histogramWindowSize = 1000;

Review Comment:
   Don't you think it's worth introducing a global window size config for the 
histograms across the operator? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] morhidi commented on a diff in pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


morhidi commented on code in PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#discussion_r918272226


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/metrics/MetricManager.java:
##
@@ -18,31 +18,37 @@
 package org.apache.flink.kubernetes.operator.metrics;
 
 import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.crd.AbstractFlinkResource;
 import org.apache.flink.kubernetes.operator.crd.FlinkDeployment;
 import org.apache.flink.kubernetes.operator.crd.FlinkSessionJob;
+import org.apache.flink.kubernetes.operator.metrics.lifecycle.LifecycleMetrics;
 
-import io.fabric8.kubernetes.client.CustomResource;
-
+import java.time.Clock;
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 
 /** Metric manager for Operator managed custom resources. */
-public class MetricManager<CR extends CustomResource<?, ?>> {
+public class MetricManager<CR extends AbstractFlinkResource<?, ?>> {

Review Comment:
   Shall we call it `ResourceMetricManager`?






[jira] [Created] (FLINK-28496) Expose label selector support in k8s operator

2022-07-11 Thread Jira
Márton Balassi created FLINK-28496:
--

 Summary: Expose label selector support in k8s operator
 Key: FLINK-28496
 URL: https://issues.apache.org/jira/browse/FLINK-28496
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.1.0
Reporter: Márton Balassi


The JOSDK has a 
[labelselector|https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ControllerConfigurationOverrider.java#L34]
 which can let users filter the custom resources watched. We should expose this 
via our config.
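
For illustration only, the exposed option might look like the following config fragment; the key name below is a hypothetical placeholder, not an existing operator setting:

```yaml
# Hypothetical key -- reconcile only custom resources matching this label
# selector (e.g. forwarded to JOSDK's ControllerConfigurationOverrider).
kubernetes.operator.label.selector: flink.apache.org/managed-by=team-a
```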



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28496) Expose label selector support in k8s operator

2022-07-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-28496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi reassigned FLINK-28496:
--

Assignee: Márton Balassi

> Expose label selector support in k8s operator
> -
>
> Key: FLINK-28496
> URL: https://issues.apache.org/jira/browse/FLINK-28496
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-1.1.0
>Reporter: Márton Balassi
>Assignee: Márton Balassi
>Priority: Major
>
> The JOSDK has a 
> [labelselector|https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ControllerConfigurationOverrider.java#L34]
>  which can let users filter the custom resources watched. We should expose 
> this via our config.





[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


morhidi commented on PR #311:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/311#issuecomment-1180771363

   What is `FlinkDeployment.Lifecycle.State.STATE_NAME.Count: 0` ?





[jira] [Updated] (FLINK-28479) Add latency histogram for resource state transitions

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-28479:
---
Labels: pull-request-available  (was: )

> Add latency histogram for resource state transitions
> 
>
> Key: FLINK-28479
> URL: https://issues.apache.org/jira/browse/FLINK-28479
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.1.0
>
>
> We should track how long operator lifecycle transitions take, between states:
> SUSPENDED,
> UPGRADING,
> DEPLOYED,
> STABLE,
> ROLLING_BACK,
> ROLLED_BACK,
> FAILED





[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request, #311: [FLINK-28479] Add metrics for resource lifecycle state transitions

2022-07-11 Thread GitBox


gyfora opened a new pull request, #311:
URL: https://github.com/apache/flink-kubernetes-operator/pull/311

   ## What is the purpose of the change
   
   Introduce histogram metrics for tracking how long resource lifecycle state 
transitions take between the following states:
   
- CREATED
- SUSPENDED
- UPGRADING
- DEPLOYED
- STABLE
- ROLLING_BACK
- ROLLED_BACK
- FAILED
   
   New metrics:
   
   ```
   FlinkDeployment.Lifecycle.Transition.ResumeTimeSeconds: count=0, min=0, 
max=0, mean=NaN, ...
   FlinkDeployment.Lifecycle.Transition.SuspendTimeSeconds: count=1, min=2, 
max=2, mean=2.0, ...
   FlinkDeployment.Lifecycle.Transition.UpgradeTimeSeconds: count=1, min=33, 
max=33, mean=33.0, ...
   FlinkDeployment.Lifecycle.Transition.StabilizationTimeSeconds: count=1, 
min=29, max=29, mean=29.0, ...
   FlinkDeployment.Lifecycle.Transition.RollbackTimeSeconds: count=0, min=0, 
max=0, mean=NaN, ...
   FlinkDeployment.Lifecycle.Transition.SubmissionTimeSeconds: count=1, min=1, 
max=1, mean=1.0, ...
   
   FlinkDeployment.Lifecycle.State.STATE_NAME.Count: 0
   ```
   
   ## Brief change log
   
- Introduce ResourceLifecycleState derived from the resource status
- Add mechanism to track ResourceLifecycleState transitions
- Create histogram metrics for select transitions
- Add count metrics for each state
- Add tests
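   
   The tracking mechanism in the change log can be sketched in a heavily simplified, dependency-free form; all names below are illustrative, not the operator's actual classes (a real implementation would feed the recorded durations into Flink `Histogram` metrics instead of a list):
   
   ```java
   import java.time.Duration;
   import java.time.Instant;
   import java.util.ArrayList;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   
   /** Sketch: record how long a resource spends between lifecycle states. */
   public class LifecycleTrackerSketch {
       public enum State { CREATED, SUSPENDED, UPGRADING, DEPLOYED, STABLE, ROLLING_BACK, ROLLED_BACK, FAILED }
   
       private State current;
       private Instant since;
       // "FROM->TO" -> observed transition durations
       private final Map<String, List<Duration>> observed = new HashMap<>();
   
       public LifecycleTrackerSketch(State initial, Instant now) {
           this.current = initial;
           this.since = now;
       }
   
       public void onUpdate(State next, Instant now) {
           if (next == current) {
               return; // no state change, nothing to record
           }
           observed.computeIfAbsent(current + "->" + next, k -> new ArrayList<>())
                   .add(Duration.between(since, now));
           current = next;
           since = now;
       }
   
       public List<Duration> durations(String transition) {
           return observed.getOrDefault(transition, List.of());
       }
   }
   ```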
   
   ## Verifying this change
   
   New unit tests + manually verified on minikube
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
no
 - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? **[TODO]**





[jira] [Updated] (FLINK-28476) Add metrics for Kubernetes API server access

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-28476:
---
Labels: pull-request-available  (was: )

> Add metrics for Kubernetes API server access
> 
>
> Key: FLINK-28476
> URL: https://issues.apache.org/jira/browse/FLINK-28476
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Matyas Orhidi
>Assignee: Matyas Orhidi
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.1.0
>
>
> e.g.:
>  * http response counter
>  * http response latency histogram
>  * http response status counter





[GitHub] [flink-kubernetes-operator] morhidi opened a new pull request, #310: [FLINK-28476] Add metrics for Kubernetes API server access

2022-07-11 Thread GitBox


morhidi opened a new pull request, #310:
URL: https://github.com/apache/flink-kubernetes-operator/pull/310

   ## What is the purpose of the change
   
   This pull request adds metrics and KPIs related to Kubernetes API server 
access. Metrics can be enabled by 
`kubernetes.operator.kubernetes.client.metrics.enabled` (defaults to `true`).
   
   ## Brief change log
   - added various request/response counters
   ```
   -- Counters 
---
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.Count:
 94
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.POST.Count:
 6
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.PATCH.Count:
 10
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.DELETE.Count:
 4
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.PUT.Count:
 8
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.GET.Count:
 66
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.Failed.Count:
 3
   
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.Count:
 91
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.101.Count:
 5
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.409.Count:
 1
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.201.Count:
 6
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.404.Count:
 10
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.200.Count:
 69
   
   ```
   - added key request/response KPIs:
   
   ```
   -- Meters 
-
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpRequest.NumPerSecond:
 0.08333
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.NumPerSecond:
 0.0
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.Failed.NumPerSecond:
 0.05
   ```
   
   ```
   -- Histograms 
-
   
localhost.k8soperator.default.flink-kubernetes-operator.KubeClient.HttpResponse.LatencyNanos:
 count=91, min=2588875, max=273916959, mean=1.8684283417582415E7, 
stddev=4.088778006829815E7, p50=7575458.0, p75=1.3146208E7, p95=5.92533498E7, 
p98=2.7390890844E8, p99=2.73916959E8, p999=2.73916959E8
   ```
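   A dependency-free sketch of the counting shown above (class and method names are illustrative; the actual change hooks into the Kubernetes client's HTTP layer and Flink metric groups):
   
   ```java
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.atomic.LongAdder;
   
   /** Sketch: per-method request counters and per-status response counters. */
   public class KubeClientMetricsSketch {
       private final Map<String, LongAdder> requestsByMethod = new ConcurrentHashMap<>();
       private final Map<Integer, LongAdder> responsesByStatus = new ConcurrentHashMap<>();
       private final LongAdder totalRequests = new LongAdder();
   
       public void onRequest(String httpMethod) {
           totalRequests.increment();
           requestsByMethod.computeIfAbsent(httpMethod, m -> new LongAdder()).increment();
       }
   
       public void onResponse(int statusCode) {
           responsesByStatus.computeIfAbsent(statusCode, s -> new LongAdder()).increment();
       }
   
       public long requestCount() {
           return totalRequests.sum();
       }
   
       public long requestCount(String httpMethod) {
           LongAdder a = requestsByMethod.get(httpMethod);
           return a == null ? 0L : a.sum();
       }
   
       public long responseCount(int statusCode) {
           LongAdder a = responsesByStatus.get(statusCode);
           return a == null ? 0L : a.sum();
       }
   }
   ```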
   ## Verifying this change
   
   This change added tests that covers the functionality and can be verified as 
follows:
   
   Manually by enabling the `Slf4jReporterFactory` that dumps the metrics into 
the logs:
   ```
   kubernetes.operator.metrics.reporter.slf4j.factory.class: 
org.apache.flink.metrics.slf4j.Slf4jReporterFactory
   kubernetes.operator.metrics.reporter.slf4j.interval: 10 SECONDS
   ```
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
no
 - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? 
   - docs for `kubernetes.operator.kubernetes.client.metrics.enabled` 
property is autogenerated
   - Metrics descriptions are added to the documentation





[jira] [Updated] (FLINK-28472) Add JmStatus as a group for JobManagerDeploymentStatus count metrics

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-28472:
---
Labels: pull-request-available  (was: )

> Add JmStatus as a group for JobManagerDeploymentStatus count metrics
> 
>
> Key: FLINK-28472
> URL: https://issues.apache.org/jira/browse/FLINK-28472
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Matyas Orhidi
>Priority: Critical
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.1.0
>
>
> Currently the metric simply looks like
> ...FlinkDeployment.READY.Count
> We should add a group to clarify that READY is the status of the Jm 
> deployment. This will make it possible to add other status count metrics in 
> the future:
> FlinkDeployment.JmStatus.READY.Count
> So in the future we can track other states with similar counts:
> FlinkDeployment.JobStatus.RUNNING
> ...





[GitHub] [flink-kubernetes-operator] morhidi opened a new pull request, #309: [FLINK-28472] Add JMDeploymentStatus as a group for JobManagerDeploymentStatuscount metrics

2022-07-11 Thread GitBox


morhidi opened a new pull request, #309:
URL: https://github.com/apache/flink-kubernetes-operator/pull/309

   ## What is the purpose of the change
   
   Add JMDeploymentStatus as a group for JobManagerDeploymentStatuscount 
metrics.
   
   ## Brief change log
   
   Before:
   ```
   
localhost.k8soperator.default.flink-kubernetes-operator.namespace.default.FlinkDeployment.READY.Count:
 1
   ```
   After:
   ```
   
localhost.k8soperator.default.flink-kubernetes-operator.namespace.default.FlinkDeployment.JMDeploymentStatus.READY.Count:
 1
   ```
   
   ## Verifying this change
   This change is already covered by existing tests, such as:
   
   - `StatusRecorderTest` (removed redundant tests)
   
   - `FlinkDeploymentMetricsTest` (updated existing tests)
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
(no)
 - Core observer or reconciler logic that is regularly executed: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (docs)
   - Metrics documentation is updated





[GitHub] [flink-docker] alpinegizmo opened a new pull request, #124: [readme] explicitly mention that the full GPG fingerprint is required

2022-07-11 Thread GitBox


alpinegizmo opened a new pull request, #124:
URL: https://github.com/apache/flink-docker/pull/124

   My pull request against official-images wasn't accepted until I updated it 
to refer to commits in this repo that build images that use the full GPG 
fingerprint of my signing key. 
   
   This PR updates the README so that others can avoid this problem.
   





[jira] [Updated] (FLINK-26621) ChangelogPeriodicMaterializationITCase crashes JVM on CI in RocksDB cleanup

2022-07-11 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-26621:
-
Issue Type: Technical Debt  (was: Bug)

> ChangelogPeriodicMaterializationITCase crashes JVM on CI in RocksDB cleanup
> ---
>
> Key: FLINK-26621
> URL: https://issues.apache.org/jira/browse/FLINK-26621
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Checkpointing, Tests
>Affects Versions: 1.15.0
>Reporter: Yun Gao
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0, 1.15.2
>
>
> {code:java}
> 2022-03-11T16:20:12.6929558Z Mar 11 16:20:12 [WARNING] The requested profile 
> "skip-webui-build" could not be activated because it does not exist.
> 2022-03-11T16:20:12.6939269Z Mar 11 16:20:12 [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test 
> (integration-tests) on project flink-tests: There are test failures.
> 2022-03-11T16:20:12.6940062Z Mar 11 16:20:12 [ERROR] 
> 2022-03-11T16:20:12.6940954Z Mar 11 16:20:12 [ERROR] Please refer to 
> /__w/2/s/flink-tests/target/surefire-reports for the individual test results.
> 2022-03-11T16:20:12.6941875Z Mar 11 16:20:12 [ERROR] Please refer to dump 
> files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
> 2022-03-11T16:20:12.6942966Z Mar 11 16:20:12 [ERROR] ExecutionException Error 
> occurred in starting fork, check output in log
> 2022-03-11T16:20:12.6943919Z Mar 11 16:20:12 [ERROR] 
> org.apache.maven.surefire.booter.SurefireBooterForkException: 
> ExecutionException Error occurred in starting fork, check output in log
> 2022-03-11T16:20:12.6945023Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
> 2022-03-11T16:20:12.6945878Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:479)
> 2022-03-11T16:20:12.6946761Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:322)
> 2022-03-11T16:20:12.6947532Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
> 2022-03-11T16:20:12.6953051Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
> 2022-03-11T16:20:12.6954035Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
> 2022-03-11T16:20:12.6954917Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:932)
> 2022-03-11T16:20:12.6955749Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
> 2022-03-11T16:20:12.6956542Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
> 2022-03-11T16:20:12.6957456Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> 2022-03-11T16:20:12.6958232Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> 2022-03-11T16:20:12.6959038Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
> 2022-03-11T16:20:12.6960553Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
> 2022-03-11T16:20:12.6962116Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> 2022-03-11T16:20:12.6963009Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
> 2022-03-11T16:20:12.6963737Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
> 2022-03-11T16:20:12.6964644Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
> 2022-03-11T16:20:12.6965647Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
> 2022-03-11T16:20:12.6966732Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
> 2022-03-11T16:20:12.6967818Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
> 2022-03-11T16:20:12.6968857Z Mar 11 16:20:12 [ERROR] at 
> 

[jira] [Closed] (FLINK-26621) ChangelogPeriodicMaterializationITCase crashes JVM on CI in RocksDB cleanup

2022-07-11 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-26621.

Resolution: Fixed

> ChangelogPeriodicMaterializationITCase crashes JVM on CI in RocksDB cleanup
> ---
>
> Key: FLINK-26621
> URL: https://issues.apache.org/jira/browse/FLINK-26621
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Tests
>Affects Versions: 1.15.0
>Reporter: Yun Gao
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0, 1.15.2
>
>

[jira] [Comment Edited] (FLINK-26621) ChangelogPeriodicMaterializationITCase crashes JVM on CI in RocksDB cleanup

2022-07-11 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565119#comment-17565119
 ] 

Chesnay Schepler edited comment on FLINK-26621 at 7/11/22 5:45 PM:
---

master: c1d0c8fac16988c22f4ce4781e4a92a1de4b61fc

1.15: 31fca4af7df6b8c8424c6c37fb298a24d8836444 


was (Author: zentol):
master: c1d0c8fac16988c22f4ce4781e4a92a1de4b61fc

> ChangelogPeriodicMaterializationITCase crashes JVM on CI in RocksDB cleanup
> ---
>
> Key: FLINK-26621
> URL: https://issues.apache.org/jira/browse/FLINK-26621
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Tests
>Affects Versions: 1.15.0
>Reporter: Yun Gao
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0, 1.15.2
>
>
> {code:java}
> 2022-03-11T16:20:12.6929558Z Mar 11 16:20:12 [WARNING] The requested profile 
> "skip-webui-build" could not be activated because it does not exist.
> 2022-03-11T16:20:12.6939269Z Mar 11 16:20:12 [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test 
> (integration-tests) on project flink-tests: There are test failures.
> 2022-03-11T16:20:12.6940062Z Mar 11 16:20:12 [ERROR] 
> 2022-03-11T16:20:12.6940954Z Mar 11 16:20:12 [ERROR] Please refer to 
> /__w/2/s/flink-tests/target/surefire-reports for the individual test results.
> 2022-03-11T16:20:12.6941875Z Mar 11 16:20:12 [ERROR] Please refer to dump 
> files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
> 2022-03-11T16:20:12.6942966Z Mar 11 16:20:12 [ERROR] ExecutionException Error 
> occurred in starting fork, check output in log
> 2022-03-11T16:20:12.6943919Z Mar 11 16:20:12 [ERROR] 
> org.apache.maven.surefire.booter.SurefireBooterForkException: 
> ExecutionException Error occurred in starting fork, check output in log
> 2022-03-11T16:20:12.6945023Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
> 2022-03-11T16:20:12.6945878Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:479)
> 2022-03-11T16:20:12.6946761Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:322)
> 2022-03-11T16:20:12.6947532Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
> 2022-03-11T16:20:12.6953051Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
> 2022-03-11T16:20:12.6954035Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
> 2022-03-11T16:20:12.6954917Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:932)
> 2022-03-11T16:20:12.6955749Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
> 2022-03-11T16:20:12.6956542Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
> 2022-03-11T16:20:12.6957456Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> 2022-03-11T16:20:12.6958232Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> 2022-03-11T16:20:12.6959038Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
> 2022-03-11T16:20:12.6960553Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
> 2022-03-11T16:20:12.6962116Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> 2022-03-11T16:20:12.6963009Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
> 2022-03-11T16:20:12.6963737Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
> 2022-03-11T16:20:12.6964644Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
> 2022-03-11T16:20:12.6965647Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
> 2022-03-11T16:20:12.6966732Z Mar 11 16:20:12 [ERROR] at 
> org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
> 

[jira] [Commented] (FLINK-26621) ChangelogPeriodicMaterializationITCase crashes JVM on CI in RocksDB cleanup

2022-07-11 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565119#comment-17565119
 ] 

Chesnay Schepler commented on FLINK-26621:
--

master: c1d0c8fac16988c22f4ce4781e4a92a1de4b61fc


[GitHub] [flink] flinkbot commented on pull request #20245: [FLINK-28487][connectors] Introduce configurable RateLimitingStrategy…

2022-07-11 Thread GitBox


flinkbot commented on PR #20245:
URL: https://github.com/apache/flink/pull/20245#issuecomment-1180685284

   
   ## CI report:
   
   * c342f9dc454ef0fdd2e4b8666c59778ad960696a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zentol merged pull request #20240: [FLINK-26621][tests] Close delegate keyed state backend

2022-07-11 Thread GitBox


zentol merged PR #20240:
URL: https://github.com/apache/flink/pull/20240


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #20244: [FLINK-28487][connectors] Introduce configurable RateLimitingStrategy…

2022-07-11 Thread GitBox


flinkbot commented on PR #20244:
URL: https://github.com/apache/flink/pull/20244#issuecomment-1180680321

   
   ## CI report:
   
   * a3df3733e444ae9e2d66d94960b7401003378a0c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] hlteoh37 opened a new pull request, #20245: [FLINK-28487][connectors] Introduce configurable RateLimitingStrategy…

2022-07-11 Thread GitBox


hlteoh37 opened a new pull request, #20245:
URL: https://github.com/apache/flink/pull/20245

   … for AsyncSinkWriter
   
   ## What is the purpose of the change
   This pull request makes the RateLimitingStrategy used in AsyncSinkWriter 
configurable by the sink implementer. That way a sink implementer can decide to 
implement a custom RateLimitingStrategy.
   
   
   ## Brief change log
   - Introduced RateLimitingStrategy interface
   - Implemented a CongestionControlRateLimitingStrategy
   - Migrated tracking of inFlightMessages from AsyncSinkWriter into the 
RateLimitingStrategy
   - Introduced logic within AsyncSinkWriter to check with RateLimitingStrategy 
if the next request should be blocked.
   - Introduced a new AsyncSinkWriter constructor and deprecated the old 
AsyncSinkWriter constructors.
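The pieces in the change log can be illustrated with a minimal, hypothetical sketch of a pluggable strategy interface plus an AIMD-style implementation (the names, signatures, and constants below are illustrative assumptions, not the PR's actual API):

```java
// Hypothetical sketch of a pluggable rate limiting strategy for an async sink writer.
// Names and signatures are illustrative assumptions, not the PR's actual API.
interface RateLimitingStrategy {
    // Record the outcome of a completed request so the strategy can adapt.
    void registerCompletedRequest(boolean failed);

    // Decide whether the next request should be blocked given current in-flight requests.
    boolean shouldBlock(int inFlightRequests);
}

// AIMD (additive-increase / multiplicative-decrease) congestion control:
// grow the limit linearly on success, cut it multiplicatively on failure.
class AimdRateLimitingStrategy implements RateLimitingStrategy {
    private final int increment;
    private final double decreaseFactor;
    private int currentLimit;

    AimdRateLimitingStrategy(int initialLimit, int increment, double decreaseFactor) {
        this.currentLimit = initialLimit;
        this.increment = increment;
        this.decreaseFactor = decreaseFactor;
    }

    @Override
    public void registerCompletedRequest(boolean failed) {
        if (failed) {
            // multiplicative decrease, never dropping below one in-flight request
            currentLimit = Math.max(1, (int) (currentLimit * decreaseFactor));
        } else {
            currentLimit += increment; // additive (linear) increase
        }
    }

    @Override
    public boolean shouldBlock(int inFlightRequests) {
        return inFlightRequests >= currentLimit;
    }

    int currentLimit() {
        return currentLimit;
    }
}
```

A sink implementer would pass such a strategy into the writer's constructor instead of relying on a hard-coded default, which is the configurability this change log describes.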
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   - Added test that verifies AsyncSinkWriter respects the RateLimitingStrategy 
passed into the constructor
   - Added test that verifies AIMDRateLimitingStrategy follows a linear 
increase and multiplicative decrease
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? [FLIP-242: Introduce configurable 
RateLimitingStrategy for Async 
Sink](https://cwiki.apache.org/confluence/display/FLINK/FLIP-242%3A+Introduce+configurable+RateLimitingStrategy+for+Async+Sink)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] hlteoh37 closed pull request #20244: [FLINK-28487][connectors] Introduce configurable RateLimitingStrategy…

2022-07-11 Thread GitBox


hlteoh37 closed pull request #20244: [FLINK-28487][connectors] Introduce 
configurable RateLimitingStrategy…
URL: https://github.com/apache/flink/pull/20244


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-28487) Introduce configurable RateLimitingStrategy for Async Sink

2022-07-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-28487:
---
Labels: pull-request-available  (was: )

> Introduce configurable RateLimitingStrategy for Async Sink
> --
>
> Key: FLINK-28487
> URL: https://issues.apache.org/jira/browse/FLINK-28487
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Hong Liang Teoh
>Assignee: Hong Liang Teoh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Introduce a configurable RateLimitingStrategy to the AsyncSinkWriter.
> This change will allow sink implementers using AsyncSinkWriter to configure 
> their own RateLimitingStrategy instead of using the default 
> AIMDRateLimitingStrategy.
> See [FLIP-242: Introduce configurable RateLimitingStrategy for Async 
> Sink|https://cwiki.apache.org/confluence/display/FLINK/FLIP-242%3A+Introduce+configurable+RateLimitingStrategy+for+Async+Sink].
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] hlteoh37 opened a new pull request, #20244: [FLINK-28487][connectors] Introduce configurable RateLimitingStrategy…

2022-07-11 Thread GitBox


hlteoh37 opened a new pull request, #20244:
URL: https://github.com/apache/flink/pull/20244

   … for AsyncSinkWriter
   
   ## What is the purpose of the change
   This pull request makes the RateLimitingStrategy used in AsyncSinkWriter 
configurable by the sink implementer. That way a sink implementer can decide to 
implement a custom RateLimitingStrategy.
   
   
   ## Brief change log
   - Introduced RateLimitingStrategy interface
   - Implemented a CongestionControlRateLimitingStrategy
   - Migrated tracking of inFlightMessages from AsyncSinkWriter into the 
RateLimitingStrategy
   - Introduced logic within AsyncSinkWriter to check with RateLimitingStrategy 
if the next request should be blocked.
   - Introduced a new AsyncSinkWriter constructor and deprecated the old 
AsyncSinkWriter constructors.
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   - Added test that verifies AsyncSinkWriter respects the RateLimitingStrategy 
passed into the constructor
   - Added test that verifies AIMDRateLimitingStrategy follows a linear 
increase and multiplicative decrease
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? [FLIP-242: Introduce configurable 
RateLimitingStrategy for Async 
Sink](https://cwiki.apache.org/confluence/display/FLINK/FLIP-242%3A+Introduce+configurable+RateLimitingStrategy+for+Async+Sink)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] reswqa commented on a diff in pull request #20100: [FLINK-27905][runtime] Introduce HsSpillingStrategy and HsSelectiveSpillingStrategy implementation

2022-07-11 Thread GitBox


reswqa commented on code in PR #20100:
URL: https://github.com/apache/flink/pull/20100#discussion_r918178186


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSelectiveSpillingStrategy.java:
##
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.hybrid;
+
+import org.apache.flink.util.Preconditions;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.PriorityQueue;
+
+/**
+ * A special implementation of {@link HsSpillingStrategy} that reduces disk writes as much as
+ * possible.
+ */
+public class HsSelectiveSpillingStrategy implements HsSpillingStrategy {
+/** The proportion of buffers to be spilled. */
+public static final float SPILL_BUFFER_RATIO = 0.4f;
+
+/**
+ * When the number of buffers that have been requested exceeds this 
threshold, trigger the
+ * spilling operation.
+ */
+public static final float SPILL_THRESHOLD = 0.7f;
+
+/**
+ * For the case of buffer finished, there is no need to make a decision 
for {@link
+ * HsSelectiveSpillingStrategy}.
+ */
+@Override
+public Action onBufferFinished(
+HsSpillingInfoProvider spillingInfoProvider, BufferWithIdentity 
finishedBuffer) {
+return Action.EMPTY;
+}
+
+/**
+ * For the case of a consumed buffer, the buffer needs to be released. Control of the buffer is
+ * taken over by the downstream task.
+ */
+@Override
+public Action onBufferConsumed(
+HsSpillingInfoProvider spillingInfoProvider, BufferWithIdentity 
consumedBuffer) {
+return 
Action.onlyNeedRelease(Collections.singletonList(consumedBuffer));
+}
+
+/**
+ * When the amount of memory used exceeds the threshold, trigger the spilling operation and
+ * score the buffers of each subpartition. The lower the score, the more likely the buffer is
+ * to be consumed soon, and the more it should be kept in memory. Buffers to be spilled are
+ * selected by score, from high to low.
+ */
+@Override
+public Action maybeSpill(HsSpillingInfoProvider spillInfoProvider) {
+if (spillInfoProvider.getNumRequestedBuffers()
+< spillInfoProvider.getPoolSize() * SPILL_THRESHOLD) {
+return Action.EMPTY;
+}
+
+List<Integer> consumptionProgress = spillInfoProvider.getConsumptionProgress();
+PriorityQueue<BufferIdentityWithScore> heap = new PriorityQueue<>();
+int numBuffers = 0;
+for (int channel = 0; channel < 
spillInfoProvider.getNumSubpartitions(); channel++) {
+Collection<BufferWithIdentity> finishedBuffers =
+spillInfoProvider.getSubpartitionBuffers(channel);
+int subpartitionProgress = consumptionProgress.get(channel);
+// compute score for each subpartition buffers.
+for (BufferWithIdentity bufferWithInfo : finishedBuffers) {
+// buffer index in finishedBuffers must be greater than or 
equal to the downstream
+// consumption progress, so the score must be greater than or 
equal to 0.
+int score = bufferWithInfo.getBufferIndex() - 
subpartitionProgress;
+heap.add(new BufferIdentityWithScore(bufferWithInfo, score));
+numBuffers++;
+}
+}
+List<BufferWithIdentity> toSpill = new ArrayList<>();
+int spillNum = (int) (numBuffers * SPILL_BUFFER_RATIO);
+for (int i = 0; i < spillNum; i++) {
+
toSpill.add(Preconditions.checkNotNull(heap.poll()).bufferWithInfo);
+}

Review Comment:
   It's really a nice idea; this can greatly improve the efficiency of the 
algorithm when there are a large number of buffers.
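The efficiency point discussed above — avoiding putting every buffer into a full heap when only a fraction will be spilled — can be sketched as a bounded-heap top-k selection, O(n log k) instead of O(n log n). This is a hypothetical standalone example over plain int scores, not the PR's actual code:

```java
import java.util.PriorityQueue;

class TopKSelector {
    // Selects the k highest scores from `scores` using a min-heap bounded to size k,
    // instead of heapifying all n entries.
    static int[] topKScores(int[] scores, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap of the k largest seen so far
        for (int s : scores) {
            heap.add(s);
            if (heap.size() > k) {
                heap.poll(); // evict the current smallest
            }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll(); // drain ascending, fill the result from the back
        }
        return result;
    }
}
```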



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-28402) Create FailureHandlingResultSnapshot with the truly failed execution

2022-07-11 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28402.
---
Resolution: Done

Done via edd10a1d8fd09f8a5e296ecdf0201945d77d5ff2

> Create FailureHandlingResultSnapshot with the truly failed execution
> 
>
> Key: FLINK-28402
> URL: https://issues.apache.org/jira/browse/FLINK-28402
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Previously, FailureHandlingResultSnapshot was always created to treat the 
> only current attempt of an execution vertex as the failed execution. This is 
> no longer right in speculative execution cases, in which an execution vertex 
> can have multiple current executions, and any of them may fail.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] zhuzhurk closed pull request #20221: [FLINK-28402] Create FailureHandlingResultSnapshot with the truly failed execution

2022-07-11 Thread GitBox


zhuzhurk closed pull request #20221: [FLINK-28402] Create 
FailureHandlingResultSnapshot with the truly failed execution
URL: https://github.com/apache/flink/pull/20221


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on pull request #20221: [FLINK-28402] Create FailureHandlingResultSnapshot with the truly failed execution

2022-07-11 Thread GitBox


zhuzhurk commented on PR #20221:
URL: https://github.com/apache/flink/pull/20221#issuecomment-1180611626

   Thanks for reviewing! @wanglijie95 
   Merging.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] reswqa commented on a diff in pull request #20229: [FLINK-28399] Refactor CheckedThread to accept runnable via constructor

2022-07-11 Thread GitBox


reswqa commented on code in PR #20229:
URL: https://github.com/apache/flink/pull/20229#discussion_r918116847


##
flink-test-utils-parent/flink-test-utils-junit/src/main/java/org/apache/flink/core/testutils/CheckedThread.java:
##
@@ -18,26 +18,42 @@
 
 package org.apache.flink.core.testutils;
 
+import java.util.concurrent.TimeoutException;
+
 /**
  * A thread that additionally catches exceptions and offers a joining method 
that re-throws the
  * exceptions.
  *
- * Rather than overriding {@link Thread#run()} (or supplying a {@link 
Runnable}), one needs to
- * extends this class and implement the {@link #go()} method. That method may 
throw exceptions.
+ * This class can be supplied with a {@link RunnableWithException} that may throw exceptions,
+ * or subclasses can override the {@link #go()} method.
+ *
+ * You can use it the same way as a plain thread: instead of {@code new Thread(Runnable
+ * runnable)} or {@code new Thread()} with an overridden {@link Thread#run()} method, use
+ * {@code new CheckedThread(RunnableWithException runnableWithException)} or {@code new
+ * CheckedThread()} and then override {@link CheckedThread#go()}.
  *
- * Exception from the {@link #go()} method are caught and re-thrown when 
joining this thread via
- * the {@link #sync()} method.
+ * Exceptions from the {@link #runnable} or the overridden {@link #go()} method are caught
+ * and re-thrown when joining this thread via the {@link #sync()} method.
  */
-public abstract class CheckedThread extends Thread {
+public class CheckedThread extends Thread {

Review Comment:
   In addition, I still want to know the clear benefit of removing the 
overridable go method. In my mind, it is more reasonable to use 
`CheckedThread` the same way we use `Thread`. When using a thread, we usually 
build it in one of two ways: either pass the runnable object to the 
constructor, or override its run method. So the only difference is that the 
`run` method is replaced by the `go` method. Previously, `CheckedThread` did 
not support construction from a runnable, but now we can add this capability through this PR.
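The two construction styles under discussion (pass a runnable to the constructor, or subclass and override `go()`) can be sketched roughly as follows. This is a simplified standalone sketch, not Flink's actual `CheckedThread` implementation:

```java
// Simplified sketch of a Thread that captures a checked exception from its body
// and re-throws it when the caller joins via sync(). Not Flink's actual class.
class CheckedThread extends Thread {
    // Functional interface so the thread body may throw checked exceptions.
    interface RunnableWithException {
        void run() throws Exception;
    }

    private final RunnableWithException runnable;
    private volatile Throwable error;

    CheckedThread(RunnableWithException runnable) {
        this.runnable = runnable;
    }

    CheckedThread() {
        this.runnable = null; // subclass is expected to override go()
    }

    // Subclasses may override go() instead of passing a runnable.
    public void go() throws Exception {
        if (runnable != null) {
            runnable.run();
        }
    }

    @Override
    public final void run() {
        try {
            go();
        } catch (Throwable t) {
            error = t; // remember the failure for sync()
        }
    }

    // Joins the thread and re-throws any exception raised by go().
    public void sync() throws Exception {
        join();
        if (error != null) {
            throw new Exception(error);
        }
    }
}
```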



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-28495) Fix typos or mistakes of Flink CEP Document in the official website

2022-07-11 Thread Biao Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Biao Geng updated FLINK-28495:
--
Description: 
1. "how you can migrate your job from an older Flink version to Flink-1.3." -> 
"how you can migrate your job from an older Flink version to Flink-1.13."

2. "Will generate the following matches for an input sequence: C D A1 A2 A3 D 
A4 B. with combinations enabled: {quote}\\{ C A1 B\}, \{C A1 A2 B\}, \{C A1 A3 
B\}, \{C A1 A4 B\}, \{C A1 A2 A3 B\}, \{C A1 A2 A4 B\}, \{C A1 A3 A4 B\}, \{C 
A1 A2 A3 A4 B\}{quote}" -> "Will generate the following matches for an input 
sequence: C D A1 A2 A3 D A4 B. with combinations enabled: {quote}\{C A1 B\}, 
\{C A1 A2 B\}, \{C A1 A3 B\}, \{C A1 A4 B\}, \{C A1 A2 A3 B\}, \{C A1 A2 A4 
B\}, \{C A1 A3 A4 B\}, \{C A1 A2 A3 A4 B\}, \{C A2 B\}, \{C A2 A3 B\}, \{C A2 
A4 B\}, \{C A2 A3 A4 B\}, \{C A3 B\}, \{C A3 A4 B\}, \{C A4 B\}{quote}"


3. "For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
there are two options how to handle cases when there are no events mapped to 
the *PatternName*."

  was:
1. "how you can migrate your job from an older Flink version to Flink-1.3." -> 
"how you can migrate your job from an older Flink version to Flink-1.13."

2. "Will generate the following matches for an input sequence: C D A1 A2 A3 D 
A4 B. with combinations enabled: {quote}{C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C 
A1 A4 B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 
B}{quote}" -> "Will generate the following matches for an input sequence: C D 
A1 A2 A3 D A4 B. with combinations enabled: {quote}{C A1 B}, {C A1 A2 B}, {C A1 
A3 B}, {C A1 A4 B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 
A4 B}, {C A2 B}, {C A2 A3 B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 
B}, {C A4 B}{quote}"


3. "For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
there are two options how to handle cases when there are no events mapped to 
the *PatternName*."


> Fix typos or mistakes of Flink CEP Document in the official website
> ---
>
> Key: FLINK-28495
> URL: https://issues.apache.org/jira/browse/FLINK-28495
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / CEP
>Reporter: Biao Geng
>Priority: Minor
>
> 1. "how you can migrate your job from an older Flink version to Flink-1.3." 
> -> "how you can migrate your job from an older Flink version to Flink-1.13."
> 2. "Will generate the following matches for an input sequence: C D A1 A2 A3 D 
> A4 B. with combinations enabled: {quote}\\{ C A1 B\}, \{C A1 A2 B\}, \{C A1 
> A3 B\}, \{C A1 A4 B\}, \{C A1 A2 A3 B\}, \{C A1 A2 A4 B\}, \{C A1 A3 A4 B\}, 
> \{C A1 A2 A3 A4 B\}{quote}" -> "Will generate the following matches for an 
> input sequence: C D A1 A2 A3 D A4 B. with combinations enabled: {quote}\{C A1 
> B\}, \{C A1 A2 B\}, \{C A1 A3 B\}, \{C A1 A4 B\}, \{C A1 A2 A3 B\}, \{C A1 A2 
> A4 B\}, \{C A1 A3 A4 B\}, \{C A1 A2 A3 A4 B\}, \{C A2 B\}, \{C A2 A3 B\}, \{C 
> A2 A4 B\}, \{C A2 A3 A4 B\}, \{C A3 B\}, \{C A3 A4 B\}, \{C A4 B\}{quote}"
> 3. "For SKIP_TO_FIRST/LAST there are two options how to handle cases when 
> there are no elements mapped to the specified variable." -> "For 
> SKIP_TO_FIRST/LAST there are two options how to handle cases when there are 
> no events mapped to the *PatternName*."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
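The corrected combinations list in the description above is exactly every non-empty subset of the intermediate events A1..A4, kept in order, between C and B — 2^4 - 1 = 15 matches. A small standalone sketch (illustrative only, not Flink CEP API code) that enumerates them:

```java
import java.util.ArrayList;
import java.util.List;

class CombinationsDemo {
    // Enumerates all matches {C <subset> B} for every non-empty, order-preserving
    // subset of the intermediate events, mirroring the corrected list above.
    static List<String> matches(String[] intermediates) {
        List<String> result = new ArrayList<>();
        int n = intermediates.length;
        for (int mask = 1; mask < (1 << n); mask++) { // each non-zero bitmask = one subset
            StringBuilder sb = new StringBuilder("{C");
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    sb.append(' ').append(intermediates[i]);
                }
            }
            sb.append(" B}");
            result.add(sb.toString());
        }
        return result;
    }
}
```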


[jira] [Updated] (FLINK-28495) Fix typos or mistakes of Flink CEP Document in the official website

2022-07-11 Thread Biao Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Biao Geng updated FLINK-28495:
--
Description: 
1. "how you can migrate your job from an older Flink version to Flink-1.3." -> 
"how you can migrate your job from an older Flink version to Flink-1.13."

2. "Will generate the following matches for an input sequence: C D A1 A2 A3 D 
A4 B. with combinations enabled: {quote}{C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C 
A1 A4 B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 
B}{quote}" -> "Will generate the following matches for an input sequence: C D 
A1 A2 A3 D A4 B. with combinations enabled: {quote}{C A1 B}, {C A1 A2 B}, {C A1 
A3 B}, {C A1 A4 B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 
A4 B}, {C A2 B}, {C A2 A3 B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 
B}, {C A4 B}{quote}"


3. "For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
there are two options how to handle cases when there are no events mapped to 
the *PatternName*."

  was:
"how you can migrate your job from an older Flink version to Flink-1.3." -> 
"how you can migrate your job from an older Flink version to Flink-1.13."

"Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, 
{C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}" -> "Will 
generate the following matches for an input sequence: C D A1 A2 A3 D A4 B. with 
combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, {C A1 A2 
A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}, {C A2 B}, {C A2 A3 
B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 B}, {C A4 B}"


"For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
there are two options how to handle cases when there are no events mapped to 
the *PatternName*."


> Fix typos or mistakes of Flink CEP Document in the official website
> ---
>
> Key: FLINK-28495
> URL: https://issues.apache.org/jira/browse/FLINK-28495
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / CEP
>Reporter: Biao Geng
>Priority: Minor
>
> 1. "how you can migrate your job from an older Flink version to Flink-1.3." 
> -> "how you can migrate your job from an older Flink version to Flink-1.13."
> 2. "Will generate the following matches for an input sequence: C D A1 A2 A3 D 
> A4 B. with combinations enabled: {quote}{C A1 B}, {C A1 A2 B}, {C A1 A3 B}, 
> {C A1 A4 B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 
> B}{quote}" -> "Will generate the following matches for an input sequence: C D 
> A1 A2 A3 D A4 B. with combinations enabled: {quote}{C A1 B}, {C A1 A2 B}, {C 
> A1 A3 B}, {C A1 A4 B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 
> A2 A3 A4 B}, {C A2 B}, {C A2 A3 B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C 
> A3 A4 B}, {C A4 B}{quote}"
> 3. "For SKIP_TO_FIRST/LAST there are two options how to handle cases when 
> there are no elements mapped to the specified variable." -> "For 
> SKIP_TO_FIRST/LAST there are two options how to handle cases when there are 
> no events mapped to the *PatternName*."
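>
> For context, the corrected match list in item 2 can be checked mechanically: with `oneOrMore().allowCombinations()`, the looping pattern may match any non-empty ordered subset of the A events between C and B, i.e. 2^4 - 1 = 15 matches. A minimal, framework-free sketch (plain Java, not Flink CEP itself; names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class CombinationsDemo {
    // Enumerate all matches {C, <non-empty ordered subset of As>, B},
    // mirroring oneOrMore().allowCombinations() semantics.
    static List<List<String>> matches(List<String> as) {
        List<List<String>> result = new ArrayList<>();
        int n = as.size();
        for (int mask = 1; mask < (1 << n); mask++) {
            List<String> match = new ArrayList<>();
            match.add("C");
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    match.add(as.get(i));
                }
            }
            match.add("B");
            result.add(match);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> all = matches(List.of("A1", "A2", "A3", "A4"));
        // 2^4 - 1 = 15 combinations, matching the corrected list above.
        assert all.size() == 15 : all.size();
        assert all.contains(List.of("C", "A2", "A3", "A4", "B"));
        assert all.contains(List.of("C", "A4", "B"));
        System.out.println(all.size() + " matches");
    }
}
```

The 15 enumerated lists are exactly the combinations the corrected description quotes, including the {C A2 ...}, {C A3 ...} and {C A4 B} families missing from the old text.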



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28495) Fix typos or mistakes of Flink CEP Document in the official website

2022-07-11 Thread Biao Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Biao Geng updated FLINK-28495:
--
Description: 
"how you can migrate your job from an older Flink version to Flink-1.3." -> 
"how you can migrate your job from an older Flink version to Flink-1.13."

"Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, 
{C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}" -> "Will 
generate the following matches for an input sequence: C D A1 A2 A3 D A4 B. with 
combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, {C A1 A2 
A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}, {C A2 B}, {C A2 A3 
B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 B}, {C A4 B}"


"For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
there are two options how to handle cases when there are no events mapped to 
the *PatternName*."

  was:
"Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, 
{C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}" -> "Will 
generate the following matches for an input sequence: C D A1 A2 A3 D A4 B. with 
combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, {C A1 A2 
A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}, {C A2 B}, {C A2 A3 
B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 B}, {C A4 B}"
"For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
there are two options how to handle cases when there are no events mapped to 
the *PatternName*."


> Fix typos or mistakes of Flink CEP Document in the official website
> ---
>
> Key: FLINK-28495
> URL: https://issues.apache.org/jira/browse/FLINK-28495
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / CEP
>Reporter: Biao Geng
>Priority: Minor
>
> "how you can migrate your job from an older Flink version to Flink-1.3." -> 
> "how you can migrate your job from an older Flink version to Flink-1.13."
> "Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
> B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 
> B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}" -> 
> "Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
> B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 
> B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}, {C A2 
> B}, {C A2 A3 B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 B}, {C A4 B}"
> "For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
> are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
> there are two options how to handle cases when there are no events mapped to 
> the *PatternName*."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28144) Let JobMaster support blocklist mechanism

2022-07-11 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28144.
---
Fix Version/s: 1.16.0
   Resolution: Done

Done via:
f2f83e1956eccecaa2371b21bddaf7778bb4f819
04f2f0c2660b312449419a3acb58a46a38d84f64
72ea8b5999bf36125aa5f1a38df4ec52c7a95702
387b2a473d0c0a8d58d1ca0401894dffc0527b31

> Let JobMaster support blocklist mechanism
> -
>
> Key: FLINK-28144
> URL: https://issues.apache.org/jira/browse/FLINK-28144
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> SlotPool should avoid allocating slots that located on blocked nodes. To do 
> that, our core idea is to keep the SlotPool in such a state: there is no slot 
> in SlotPool that is free (no task assigned) and located on blocked nodes. 
> Details are as following:
> 1. When receiving slot offers from task managers located on blocked nodes, 
> all offers should be rejected.
> 2. When a node is newly blocked, we should release all free(no task assigned) 
> slots on it. We need to find all task managers on blocked nodes and release 
> all free slots on them by SlotPoolService#releaseFreeSlotsOnTaskManager.
> 3. When a slot state changes from reserved(task assigned) to free(no task 
> assigned), it will check whether the corresponding task manager is blocked. 
> If yes, release the slot.
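>
> The invariant described by the three rules above — the pool never holds a free slot whose task manager lives on a blocked node — can be sketched as follows. This is a hypothetical, simplified model; the class and method names (`BlocklistAwareSlotPool`, `offerSlot`, ...) are illustrative and not Flink's actual API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model of the invariant: no FREE slot on a blocked node.
public class BlocklistAwareSlotPool {
    private final Set<String> blockedNodes = new HashSet<>();
    private final Map<String, String> freeSlotToNode = new HashMap<>();     // slotId -> nodeId
    private final Map<String, String> reservedSlotToNode = new HashMap<>(); // slotId -> nodeId

    // Rule 1: reject slot offers from task managers on blocked nodes.
    boolean offerSlot(String slotId, String nodeId) {
        if (blockedNodes.contains(nodeId)) {
            return false;
        }
        freeSlotToNode.put(slotId, nodeId);
        return true;
    }

    void reserveSlot(String slotId) {
        String node = freeSlotToNode.remove(slotId);
        if (node != null) {
            reservedSlotToNode.put(slotId, node);
        }
    }

    // Rule 2: when a node is newly blocked, release all free slots on it.
    void blockNode(String nodeId) {
        blockedNodes.add(nodeId);
        freeSlotToNode.values().removeIf(nodeId::equals);
    }

    // Rule 3: a slot going reserved -> free is released if its node is blocked.
    void freeSlot(String slotId) {
        String node = reservedSlotToNode.remove(slotId);
        if (node != null && !blockedNodes.contains(node)) {
            freeSlotToNode.put(slotId, node);
        }
    }

    int freeSlotCount() {
        return freeSlotToNode.size();
    }

    public static void main(String[] args) {
        BlocklistAwareSlotPool pool = new BlocklistAwareSlotPool();
        assert pool.offerSlot("s1", "nodeA");
        assert pool.offerSlot("s2", "nodeB");
        pool.reserveSlot("s2");
        pool.blockNode("nodeB");
        assert !pool.offerSlot("s3", "nodeB"); // rule 1
        pool.freeSlot("s2");                   // rule 3: dropped, not kept
        assert pool.freeSlotCount() == 1;      // only s1 on nodeA remains free
        System.out.println("invariant holds");
    }
}
```

In the real implementation the releases go through `SlotPoolService#releaseFreeSlotsOnTaskManager`, as noted above; this sketch only shows the bookkeeping that keeps the invariant.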



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28495) Fix typos or mistakes of Flink CEP Document in the official website

2022-07-11 Thread Biao Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Biao Geng updated FLINK-28495:
--
Component/s: Library / CEP

> Fix typos or mistakes of Flink CEP Document in the official website
> ---
>
> Key: FLINK-28495
> URL: https://issues.apache.org/jira/browse/FLINK-28495
> Project: Flink
>  Issue Type: Improvement
>  Components: Library / CEP
>Reporter: Biao Geng
>Priority: Minor
>
> "Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
> B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 
> B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}" -> 
> "Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
> B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 
> B}, {C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}, {C A2 
> B}, {C A2 A3 B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 B}, {C A4 B}"
> "For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
> are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
> there are two options how to handle cases when there are no events mapped to 
> the *PatternName*."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] reswqa commented on a diff in pull request #20229: [FLINK-28399] Refactor CheckedThread to accept runnable via constructor

2022-07-11 Thread GitBox


reswqa commented on code in PR #20229:
URL: https://github.com/apache/flink/pull/20229#discussion_r918081449


##
flink-test-utils-parent/flink-test-utils-junit/src/main/java/org/apache/flink/core/testutils/CheckedThread.java:
##
@@ -18,26 +18,42 @@
 
 package org.apache.flink.core.testutils;
 
+import java.util.concurrent.TimeoutException;
+
 /**
  * A thread that additionally catches exceptions and offers a joining method 
that re-throws the
  * exceptions.
  *
- * Rather than overriding {@link Thread#run()} (or supplying a {@link 
Runnable}), one needs to
- * extends this class and implement the {@link #go()} method. That method may 
throw exceptions.
+ * This class needs to supply a {@link RunnableWithException} that may 
throw exceptions or
+ * override {@link #go()} method.
+ *
+ * you can use it as the same way of using threads like: {@code new 
Thread(Runnable runnable)} or
+ * {@Code new Thread()} and then override {@link Thread#run()} method. Just 
change it to {@code new
+ * CheckedThread(RunnableWithException runnableWithException)} or {@Code new 
CheckedThread()} and
+ * then override {@link CheckedThread#go()} method.
  *
- * Exception from the {@link #go()} method are caught and re-thrown when 
joining this thread via
- * the {@link #sync()} method.
+ * Exception from the {@link #runnable} or the override {@link #go()} are 
caught and re-thrown
+ * when joining this thread via the {@link #sync()} method.
  */
-public abstract class CheckedThread extends Thread {
+public class CheckedThread extends Thread {

Review Comment:
   Nice idea! But when I tried to get rid of the support for overriding the 
`go()` method, I found one class that is not easy to deal with: 
`org.apache.flink.streaming.runtime.io.benchmark.ReceiverThread`. Do you 
have any suggestions?
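
   For context, the behavior the javadoc above describes — an exception thrown by the thread body is captured and re-thrown when the caller joins via `sync()` — can be sketched without Flink's test utilities. This is a simplified stand-in, not the actual `CheckedThread`:

```java
// Simplified stand-in for the pattern under discussion: a thread that stores
// any Throwable from its body and re-throws it when the caller joins via sync().
public class CheckedThreadSketch extends Thread {
    public interface RunnableWithException {
        void run() throws Exception;
    }

    private final RunnableWithException body;
    private volatile Throwable error;

    public CheckedThreadSketch(RunnableWithException body) {
        this.body = body;
    }

    @Override
    public void run() {
        try {
            body.run();
        } catch (Throwable t) {
            error = t; // remember the failure for sync()
        }
    }

    // Joins the thread and re-throws any exception thrown by the body.
    public void sync() throws Exception {
        join();
        if (error != null) {
            if (error instanceof Exception) {
                throw (Exception) error;
            }
            throw new RuntimeException(error);
        }
    }

    public static void main(String[] args) throws Exception {
        CheckedThreadSketch ok = new CheckedThreadSketch(() -> {});
        ok.start();
        ok.sync(); // completes normally

        CheckedThreadSketch failing =
                new CheckedThreadSketch(() -> { throw new IllegalStateException("boom"); });
        failing.start();
        try {
            failing.sync();
            throw new AssertionError("expected exception");
        } catch (IllegalStateException expected) {
            System.out.println("re-thrown: " + expected.getMessage());
        }
    }
}
```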



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-28495) Fix typos or mistakes of Flink CEP Document in the official website

2022-07-11 Thread Biao Geng (Jira)
Biao Geng created FLINK-28495:
-

 Summary: Fix typos or mistakes of Flink CEP Document in the 
official website
 Key: FLINK-28495
 URL: https://issues.apache.org/jira/browse/FLINK-28495
 Project: Flink
  Issue Type: Improvement
Reporter: Biao Geng


"Will generate the following matches for an input sequence: C D A1 A2 A3 D A4 
B. with combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, 
{C A1 A2 A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}" -> "Will 
generate the following matches for an input sequence: C D A1 A2 A3 D A4 B. with 
combinations enabled: {C A1 B}, {C A1 A2 B}, {C A1 A3 B}, {C A1 A4 B}, {C A1 A2 
A3 B}, {C A1 A2 A4 B}, {C A1 A3 A4 B}, {C A1 A2 A3 A4 B}, {C A2 B}, {C A2 A3 
B}, {C A2 A4 B}, {C A2 A3 A4 B}, {C A3 B}, {C A3 A4 B}, {C A4 B}"
"For SKIP_TO_FIRST/LAST there are two options how to handle cases when there 
are no elements mapped to the specified variable." -> "For SKIP_TO_FIRST/LAST 
there are two options how to handle cases when there are no events mapped to 
the *PatternName*."



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] zhuzhurk closed pull request #20153: [FLINK-28144][runtime] Let JobMaster support blocklist.

2022-07-11 Thread GitBox


zhuzhurk closed pull request #20153: [FLINK-28144][runtime] Let JobMaster 
support blocklist.
URL: https://github.com/apache/flink/pull/20153


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on pull request #20221: [FLINK-28402] Create FailureHandlingResultSnapshot with the truly failed execution

2022-07-11 Thread GitBox


zhuzhurk commented on PR #20221:
URL: https://github.com/apache/flink/pull/20221#issuecomment-1180561038

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


