[jira] [Updated] (FLINK-24392) Upgrade presto s3 fs implementation to Trino >= 348

2021-09-28 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-24392:
---
Description: 
The Presto s3 filesystem implementation currently shipped with Flink doesn't 
support streaming uploads. All data needs to be materialized into a single file 
on disk before it can be uploaded.
This can lead to situations where TaskManagers run out of disk space when 
creating a savepoint.

The Hadoop filesystem implementation supports streaming uploads (by locally 
buffering smaller chunks, say 100 MB each, and uploading them as a multipart 
upload), but it makes more API calls, leading to other issues.

Trino version >= 348 supports streaming uploads.

During experiments, I also noticed that the current presto s3 fs implementation 
seems to allocate a lot of memory outside the heap (when shipping large data, 
for example when creating a savepoint). On a K8s pod with a memory limit of 
4000Mi, I was not able to run Flink with a "taskmanager.memory.flink.size" 
above 3000m. This means that an additional 1 GB of memory needs to be reserved 
just for the peaks in memory allocation while presto s3 is taking a savepoint. 
It would be good to confirm this behavior, and then adjust either the default 
memory configuration or the documentation.

As part of this upgrade, we also need to make sure that the new presto / Trino 
version is not making substantially more S3 API calls than the current version. 
After switching from presto s3 to hadoop s3, I noticed that disposing of an old 
checkpoint (~100 GB) can take up to 15 minutes. The upgraded presto s3 fs 
should still be able to dispose of state quickly.
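
For illustration, a minimal sketch of what a streaming upload looks like 
against the plain AWS SDK v1 multipart API: each bounded chunk is uploaded and 
then discarded, so no full-size local file is ever materialized. The bucket, 
key and the 100 MB part size are assumptions for the example, not Flink's or 
Trino's actual implementation:

{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.PartETag;
import com.amazonaws.services.s3.model.UploadPartRequest;

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class StreamingS3UploadSketch {

    // Uploads 'source' to s3://bucket/key in bounded parts instead of one big local file.
    // Note: S3 requires every part except the last to be at least 5 MB.
    static void streamUpload(AmazonS3 s3, String bucket, String key, InputStream source)
            throws Exception {
        final int partSize = 100 * 1024 * 1024; // ~100 MB per part, like the Hadoop s3 fs
        String uploadId = s3.initiateMultipartUpload(
                new InitiateMultipartUploadRequest(bucket, key)).getUploadId();
        List<PartETag> etags = new ArrayList<>();
        byte[] buffer = new byte[partSize];
        int read;
        int partNumber = 1;
        while ((read = source.read(buffer)) > 0) {
            UploadPartRequest part = new UploadPartRequest()
                    .withBucketName(bucket)
                    .withKey(key)
                    .withUploadId(uploadId)
                    .withPartNumber(partNumber++)
                    .withInputStream(new ByteArrayInputStream(buffer, 0, read))
                    .withPartSize(read);
            etags.add(s3.uploadPart(part).getPartETag()); // the chunk can be discarded here
        }
        s3.completeMultipartUpload(
                new CompleteMultipartUploadRequest(bucket, key, uploadId, etags));
    }
}
{code}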

  was:
The Presto s3 filesystem implementation currently shipped with Flink doesn't 
support streaming uploads. All data needs to be materialized into a single file 
on disk before it can be uploaded.
This can lead to situations where TaskManagers run out of disk space when 
creating a savepoint.

The Hadoop filesystem implementation supports streaming uploads (by locally 
buffering smaller chunks, say 100 MB each, and uploading them as a multipart 
upload), but it makes more API calls, leading to other issues.

Trino version >= 348 supports streaming uploads.

During experiments, I also noticed that the current presto s3 fs implementation 
seems to allocate a lot of memory outside the heap (when shipping large data, 
for example when creating a savepoint). On a K8s pod with a memory limit of 
4000Mi, I was not able to run Flink with a "taskmanager.memory.flink.size" 
above 3000m. This means that an additional 1 GB of memory needs to be reserved 
just for the peaks in memory allocation while presto s3 is taking a savepoint.
As part of this upgrade, we also need to make sure that the new presto / Trino 
version is not making substantially more S3 API calls than the current version. 
After switching from presto s3 to hadoop s3, I noticed that disposing of an old 
checkpoint (~100 GB) can take up to 15 minutes. The upgraded presto s3 fs 
should still be able to dispose of state quickly.


> Upgrade presto s3 fs implementation to Trino >= 348
> ---
>
> Key: FLINK-24392
> URL: https://issues.apache.org/jira/browse/FLINK-24392
> Project: Flink
>  Issue Type: Improvement
>  Components: FileSystems
>Affects Versions: 1.14.0
>Reporter: Robert Metzger
>Priority: Major
> Fix For: 1.15.0
>
>
> The Presto s3 filesystem implementation currently shipped with Flink doesn't 
> support streaming uploads. All data needs to be materialized into a single 
> file on disk before it can be uploaded.
> This can lead to situations where TaskManagers run out of disk space when 
> creating a savepoint.
> The Hadoop filesystem implementation supports streaming uploads (by locally 
> buffering smaller chunks, say 100 MB each, and uploading them as a multipart 
> upload), but it makes more API calls, leading to other issues.
> Trino version >= 348 supports streaming uploads.
> During experiments, I also noticed that the current presto s3 fs 
> implementation seems to allocate a lot of memory outside the heap (when 
> shipping large data, for example when creating a savepoint). On a K8s pod 
> with a memory limit of 4000Mi, I was not able to run Flink with a 
> "taskmanager.memory.flink.size" above 3000m. This means that an additional 
> 1 GB of memory needs to be reserved just for the peaks in memory allocation 
> while presto s3 is taking a savepoint. It would be good to confirm this 
> behavior, and then adjust either the default memory configuration or the 
> documentation.
> As part of this upgrade, we also need to make sure that the new presto / 
> Trino version is not making substantially more S3 API calls than the current 
> version. After switching from presto s3 to hadoop s3, I noticed that 
> disposing of an old checkpoint (~100 GB) can take up to 15 minutes.

[GitHub] [flink] RocMarshal edited a comment on pull request #16962: [FLINK-15352][connector-jdbc] Develop MySQLCatalog to connect Flink with MySQL tables and ecosystem.

2021-09-28 Thread GitBox


RocMarshal edited a comment on pull request #16962:
URL: https://github.com/apache/flink/pull/16962#issuecomment-925446816


   Hi @MartijnVisser @Airblader @twalthr @zhuzhurk @JingsongLi, I made some 
changes based on your suggestions. Could you help me review it? Thank you so 
much for your attention.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17379: Update kafka.md

2021-09-28 Thread GitBox


flinkbot edited a comment on pull request #17379:
URL: https://github.com/apache/flink/pull/17379#issuecomment-929802829


   
   ## CI report:
   
   * af61c2cdd87041b08acd0d52ec63bf39b50bd89a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24606)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #17379: Update kafka.md

2021-09-28 Thread GitBox


flinkbot edited a comment on pull request #17379:
URL: https://github.com/apache/flink/pull/17379#issuecomment-929802829


   
   ## CI report:
   
   * af61c2cdd87041b08acd0d52ec63bf39b50bd89a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24606)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Updated] (FLINK-24401) TM cannot exit after Metaspace OOM

2021-09-28 Thread future (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

future updated FLINK-24401:
---
Description: 
Hi all, from the code and the logs we can see that a heap OOM will call 
terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
shutdown. The code comment mentions: {{_it does not usually require more class 
loading to fail again with the Metaspace OutOfMemoryError_}}.

But we encountered the following: after a Metaspace OutOfMemoryError, 
{{_java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the TM 
unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, keeps 
failing to load classes, until the TM is killed manually.

I want to catch Throwable in the onFatalError method and call terminateJVM() 
directly in the catch block. Is there any problem with this strategy? (A 
sketch follows below.)

 

[code link 
|https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]
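
For clarity, a minimal sketch of the proposed change. The method and helper 
names are illustrative and only loosely follow TaskManagerRunner; this is the 
proposal, not merged code:

{code:java}
public class FatalErrorSketch {

    // Proposal: never let a failure inside fatal-error handling keep the JVM alive.
    public void onFatalError(Throwable exception) {
        try {
            handleFatalError(exception); // existing path: log + graceful shutdown
        } catch (Throwable t) {
            // e.g. a NoClassDefFoundError raised while handling a Metaspace OOM:
            // terminate immediately instead of retrying class loading forever.
            terminateJVM();
        }
    }

    private void handleFatalError(Throwable exception) {
        // placeholder for the existing handling, which may itself fail to load classes
        throw new NoClassDefFoundError(
                "Could not initialize class ...TaskManagerRunner$Result");
    }

    private void terminateJVM() {
        // hard exit, skipping shutdown hooks (exit code is illustrative)
        Runtime.getRuntime().halt(1);
    }
}
{code}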

picture:

 

!image-2021-09-29-12-00-44-812.png|width=1337,height=692!

  !image-2021-09-29-12-00-28-510.png!

 

 

  was:
Hi all, from the code and the logs we can see that a heap OOM will call 
terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
shutdown. The code comment mentions: {{_it does not usually require more class 
loading to fail again with the Metaspace OutOfMemoryError_}}.

But we encountered the following: after a Metaspace OutOfMemoryError, 
{{_java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the TM 
unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, keeps 
failing to load classes, until the TM is killed manually.

I want to catch Throwable in the onFatalError method and call terminateJVM() 
directly in the catch block. Is there any problem with this strategy?

 

[code link 
|https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]

picture:

!image-2021-09-29-11-45-48-098.png|width=663,height=343!

 

!image-2021-09-29-11-47-47-157.png!

 


> TM cannot exit after Metaspace OOM
> --
>
> Key: FLINK-24401
> URL: https://issues.apache.org/jira/browse/FLINK-24401
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.12.0, 1.13.0
>Reporter: future
>Priority: Major
> Fix For: 1.13.3, 1.14.1
>
> Attachments: image-2021-09-29-12-00-28-510.png, 
> image-2021-09-29-12-00-44-812.png
>
>
> Hi all, from the code and the logs we can see that a heap OOM will call 
> terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
> shutdown. The code comment mentions: {{_it does not usually require more 
> class loading to fail again with the Metaspace OutOfMemoryError_}}.
> But we encountered the following: after a Metaspace OutOfMemoryError, 
> {{_java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the 
> TM unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, 
> keeps failing to load classes, until the TM is killed manually.
> I want to catch Throwable in the onFatalError method and call terminateJVM() 
> directly in the catch block. Is there any problem with this strategy?
>  
> [code link 
> |https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]
> picture:
>  
> !image-2021-09-29-12-00-44-812.png|width=1337,height=692!
>   !image-2021-09-29-12-00-28-510.png!
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-24401) TM cannot exit after Metaspace OOM

2021-09-28 Thread future (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

future updated FLINK-24401:
---
Attachment: image-2021-09-29-12-00-28-510.png

> TM cannot exit after Metaspace OOM
> --
>
> Key: FLINK-24401
> URL: https://issues.apache.org/jira/browse/FLINK-24401
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.12.0, 1.13.0
>Reporter: future
>Priority: Major
> Fix For: 1.13.3, 1.14.1
>
> Attachments: image-2021-09-29-12-00-28-510.png, 
> image-2021-09-29-12-00-44-812.png
>
>
> Hi all, from the code and the logs we can see that a heap OOM will call 
> terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
> shutdown. The code comment mentions: {{_it does not usually require more 
> class loading to fail again with the Metaspace OutOfMemoryError_}}.
> But we encountered the following: after a Metaspace OutOfMemoryError, 
> {{_java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the 
> TM unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, 
> keeps failing to load classes, until the TM is killed manually.
> I want to catch Throwable in the onFatalError method and call terminateJVM() 
> directly in the catch block. Is there any problem with this strategy?
>  
> [code link 
> |https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]
> picture:
> !image-2021-09-29-11-45-48-098.png|width=663,height=343!
>  
> !image-2021-09-29-11-47-47-157.png!
>  





[jira] [Updated] (FLINK-24401) TM cannot exit after Metaspace OOM

2021-09-28 Thread future (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

future updated FLINK-24401:
---
Attachment: image-2021-09-29-12-00-44-812.png

> TM cannot exit after Metaspace OOM
> --
>
> Key: FLINK-24401
> URL: https://issues.apache.org/jira/browse/FLINK-24401
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.12.0, 1.13.0
>Reporter: future
>Priority: Major
> Fix For: 1.13.3, 1.14.1
>
> Attachments: image-2021-09-29-12-00-28-510.png, 
> image-2021-09-29-12-00-44-812.png
>
>
> Hi all, from the code and the logs we can see that a heap OOM will call 
> terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
> shutdown. The code comment mentions: {{_it does not usually require more 
> class loading to fail again with the Metaspace OutOfMemoryError_}}.
> But we encountered the following: after a Metaspace OutOfMemoryError, 
> {{_java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the 
> TM unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, 
> keeps failing to load classes, until the TM is killed manually.
> I want to catch Throwable in the onFatalError method and call terminateJVM() 
> directly in the catch block. Is there any problem with this strategy?
>  
> [code link 
> |https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]
> picture:
> !image-2021-09-29-11-45-48-098.png|width=663,height=343!
>  
> !image-2021-09-29-11-47-47-157.png!
>  





[jira] [Updated] (FLINK-24401) TM cannot exit after Metaspace OOM

2021-09-28 Thread future (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

future updated FLINK-24401:
---
Attachment: (was: image-2021-09-29-11-47-47-157.png)

> TM cannot exit after Metaspace OOM
> --
>
> Key: FLINK-24401
> URL: https://issues.apache.org/jira/browse/FLINK-24401
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.12.0, 1.13.0
>Reporter: future
>Priority: Major
> Fix For: 1.13.3, 1.14.1
>
> Attachments: image-2021-09-29-12-00-28-510.png, 
> image-2021-09-29-12-00-44-812.png
>
>
> Hi all, from the code and the logs we can see that a heap OOM will call 
> terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
> shutdown. The code comment mentions: {{_it does not usually require more 
> class loading to fail again with the Metaspace OutOfMemoryError_}}.
> But we encountered the following: after a Metaspace OutOfMemoryError, 
> {{_java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the 
> TM unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, 
> keeps failing to load classes, until the TM is killed manually.
> I want to catch Throwable in the onFatalError method and call terminateJVM() 
> directly in the catch block. Is there any problem with this strategy?
>  
> [code link 
> |https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]
> picture:
> !image-2021-09-29-11-45-48-098.png|width=663,height=343!
>  
> !image-2021-09-29-11-47-47-157.png!
>  





[jira] [Updated] (FLINK-24401) TM cannot exit after Metaspace OOM

2021-09-28 Thread future (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

future updated FLINK-24401:
---
Attachment: (was: image-2021-09-29-11-45-48-098.png)

> TM cannot exit after Metaspace OOM
> --
>
> Key: FLINK-24401
> URL: https://issues.apache.org/jira/browse/FLINK-24401
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.12.0, 1.13.0
>Reporter: future
>Priority: Major
> Fix For: 1.13.3, 1.14.1
>
> Attachments: image-2021-09-29-11-47-47-157.png
>
>
> Hi all, from the code and the logs we can see that a heap OOM will call 
> terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
> shutdown. The code comment mentions: {{_it does not usually require more 
> class loading to fail again with the Metaspace OutOfMemoryError_}}.
> But we encountered the following: after a Metaspace OutOfMemoryError, 
> {{_java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the 
> TM unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, 
> keeps failing to load classes, until the TM is killed manually.
> I want to catch Throwable in the onFatalError method and call terminateJVM() 
> directly in the catch block. Is there any problem with this strategy?
>  
> [code link 
> |https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]
> picture:
> !image-2021-09-29-11-45-48-098.png|width=663,height=343!
>  
> !image-2021-09-29-11-47-47-157.png!
>  





[jira] [Commented] (FLINK-24400) Can the Elasticsearch connector support UpdateByQueryRequest?

2021-09-28 Thread jacky jia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421917#comment-17421917
 ] 

jacky jia commented on FLINK-24400:
---

I have found the original pull request: 
[https://github.com/apache/flink/pull/6043]

As it says, it was only meant to keep compatibility with the 5.x versions, so 
maybe it can be expanded.

> Can the Elasticsearch connector support UpdateByQueryRequest?
> --
>
> Key: FLINK-24400
> URL: https://issues.apache.org/jira/browse/FLINK-24400
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Reporter: jacky jia
>Priority: Minor
>
> Currently, the RequestIndexer in the Elasticsearch connector only supports 
> Index, Delete and Update requests.
>  Is it possible to support UpdateByQueryRequest?
>  





[jira] [Created] (FLINK-24401) TM cannot exit after Metaspace OOM

2021-09-28 Thread future (Jira)
future created FLINK-24401:
--

 Summary: TM cannot exit after Metaspace OOM
 Key: FLINK-24401
 URL: https://issues.apache.org/jira/browse/FLINK-24401
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.13.0, 1.12.0
Reporter: future
 Fix For: 1.13.3, 1.14.1
 Attachments: image-2021-09-29-11-45-48-098.png, 
image-2021-09-29-11-47-47-157.png

Hi all, from the code and the logs we can see that a heap OOM will call 
terminateJVM() directly, but a Metaspace OutOfMemoryError triggers a graceful 
shutdown. The code comment mentions: {{_it does not usually require more class 
loading to fail again with the Metaspace OutOfMemoryError_}}.

But we encountered the following: after a Metaspace OutOfMemoryError, 
{{_java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}} makes the TM 
unable to exit. It keeps retrying, keeps hitting NoClassDefFoundError, keeps 
failing to load classes, until the TM is killed manually.

I want to catch Throwable in the onFatalError method and call terminateJVM() 
directly in the catch block. Is there any problem with this strategy?

 

[code link 
|https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]

picture:

!image-2021-09-29-11-45-48-098.png|width=663,height=343!

 

!image-2021-09-29-11-47-47-157.png!

 





[jira] [Updated] (FLINK-24400) Can the Elasticsearch connector support UpdateByQueryRequest?

2021-09-28 Thread jacky jia (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jacky jia updated FLINK-24400:
--
Priority: Minor  (was: Not a Priority)

> Can the Elasticsearch connector support UpdateByQueryRequest?
> --
>
> Key: FLINK-24400
> URL: https://issues.apache.org/jira/browse/FLINK-24400
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Reporter: jacky jia
>Priority: Minor
>
> Currently, the RequestIndexer in the Elasticsearch connector only supports 
> Index, Delete and Update requests.
>  Is it possible to support UpdateByQueryRequest?
>  





[jira] [Updated] (FLINK-24400) Can the Elasticsearch connector support UpdateByQueryRequest?

2021-09-28 Thread jacky jia (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jacky jia updated FLINK-24400:
--
Issue Type: Improvement  (was: Technical Debt)

> Can the Elasticsearch connector support UpdateByQueryRequest?
> --
>
> Key: FLINK-24400
> URL: https://issues.apache.org/jira/browse/FLINK-24400
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Reporter: jacky jia
>Priority: Not a Priority
>
> Currently, the RequestIndexer in the Elasticsearch connector only supports 
> Index, Delete and Update requests.
>  Is it possible to support UpdateByQueryRequest?
>  





[GitHub] [flink] flinkbot commented on pull request #17379: Update kafka.md

2021-09-28 Thread GitBox


flinkbot commented on pull request #17379:
URL: https://github.com/apache/flink/pull/17379#issuecomment-929802829


   
   ## CI report:
   
   * af61c2cdd87041b08acd0d52ec63bf39b50bd89a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #17370: [BP-1.14][FLINK-24380][k8s] Terminate the pod if it failed

2021-09-28 Thread GitBox


flinkbot edited a comment on pull request #17370:
URL: https://github.com/apache/flink/pull/17370#issuecomment-929030266


   
   ## CI report:
   
   * dcd53f6a0fe2cb509e9593cdfc935d1945812d31 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24567)
 
   * 579496274548581d5e08f9597ebfb4644192f86a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24604)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #17371: [BP-1.13][FLINK-24380][k8s] Terminate the pod if it failed

2021-09-28 Thread GitBox


flinkbot edited a comment on pull request #17371:
URL: https://github.com/apache/flink/pull/17371#issuecomment-929030407


   
   ## CI report:
   
   * eb7ba466c81412f77297fd09956c4058b4ef2571 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24568)
 
   * 1f761e65aa4c3ce202641a88d2c579f9db8b3567 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24605)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-24400) Can the Elasticsearch connector support UpdateByQueryRequest?

2021-09-28 Thread jacky jia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421913#comment-17421913
 ] 

jacky jia commented on FLINK-24400:
---

I think it is easy to support, but it may be unsafe and could affect the 
at-least-once or exactly-once semantics.

> Can the Elasticsearch connector support UpdateByQueryRequest?
> --
>
> Key: FLINK-24400
> URL: https://issues.apache.org/jira/browse/FLINK-24400
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / ElasticSearch
>Reporter: jacky jia
>Priority: Not a Priority
>
> Currently, the RequestIndexer in the Elasticsearch connector only supports 
> Index, Delete and Update requests.
>  Is it possible to support UpdateByQueryRequest?
>  





[jira] [Created] (FLINK-24400) Can the Elasticsearch connector support UpdateByQueryRequest?

2021-09-28 Thread jacky jia (Jira)
jacky jia created FLINK-24400:
-

 Summary: Can the Elasticsearch connector support UpdateByQueryRequest?
 Key: FLINK-24400
 URL: https://issues.apache.org/jira/browse/FLINK-24400
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / ElasticSearch
Reporter: jacky jia


Currently, the RequestIndexer in the Elasticsearch connector only supports 
Index, Delete and Update requests.

Is it possible to support UpdateByQueryRequest?
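
For reference, this is roughly what such a request looks like with the 
Elasticsearch high-level REST client (ES 7.x); the index, field and script 
here are illustrative, and the connector's RequestIndexer would need a 
corresponding add(...) overload to pass it through:

{code:java}
import java.util.Collections;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.reindex.UpdateByQueryRequest;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;

public class UpdateByQuerySketch {

    // Updates every document matching the query in one server-side operation.
    static void updateByQuery(RestHighLevelClient client) throws Exception {
        UpdateByQueryRequest request = new UpdateByQueryRequest("my-index");
        request.setQuery(QueryBuilders.termQuery("status", "stale"));
        request.setScript(new Script(ScriptType.INLINE, "painless",
                "ctx._source.retries += 1", Collections.emptyMap()));
        client.updateByQuery(request, RequestOptions.DEFAULT);
    }
}
{code}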

 





[GitHub] [flink] flinkbot commented on pull request #17379: Update kafka.md

2021-09-28 Thread GitBox


flinkbot commented on pull request #17379:
URL: https://github.com/apache/flink/pull/17379#issuecomment-929793876


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit af61c2cdd87041b08acd0d52ec63bf39b50bd89a (Wed Sep 29 
03:18:48 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **Invalid pull request title: No valid Jira ID provided**
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   






[GitHub] [flink] flinkbot edited a comment on pull request #17371: [BP-1.13][FLINK-24380][k8s] Terminate the pod if it failed

2021-09-28 Thread GitBox


flinkbot edited a comment on pull request #17371:
URL: https://github.com/apache/flink/pull/17371#issuecomment-929030407


   
   ## CI report:
   
   * eb7ba466c81412f77297fd09956c4058b4ef2571 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24568)
 
   * 1f761e65aa4c3ce202641a88d2c579f9db8b3567 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #17370: [BP-1.14][FLINK-24380][k8s] Terminate the pod if it failed

2021-09-28 Thread GitBox


flinkbot edited a comment on pull request #17370:
URL: https://github.com/apache/flink/pull/17370#issuecomment-929030266


   
   ## CI report:
   
   * dcd53f6a0fe2cb509e9593cdfc935d1945812d31 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=24567)
 
   * 579496274548581d5e08f9597ebfb4644192f86a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] BradyYue opened a new pull request #17379: Update kafka.md

2021-09-28 Thread GitBox


BradyYue opened a new pull request #17379:
URL: https://github.com/apache/flink/pull/17379


   The Java and Scala example code 
`WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(20))` has one 
point (dot) too many.
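
   For reference, a minimal Java sketch of the call in question (the record 
type `String` and the standalone wrapper class are assumptions for 
illustration, not the exact kafka.md snippet):

   ```java
   import java.time.Duration;
   import org.apache.flink.api.common.eventtime.WatermarkStrategy;

   public class WatermarkExample {
       public static void main(String[] args) {
           // Bounded-out-of-orderness watermarks with a 20 second bound, as in kafka.md.
           WatermarkStrategy<String> strategy =
                   WatermarkStrategy.<String>forBoundedOutOfOrderness(Duration.ofSeconds(20));
           System.out.println(strategy);
       }
   }
   ```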
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   






[jira] [Commented] (FLINK-20110) Support 'merge' method for first_value and last_value UDAF

2021-09-28 Thread Caizhi Weng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421902#comment-17421902
 ] 

Caizhi Weng commented on FLINK-20110:
-

Hi [~hailong wang], are you still working on this PR? If no one is currently 
working on it, I would like to fix it.

> Support 'merge' method for first_value and last_value UDAF
> --
>
> Key: FLINK-20110
> URL: https://issues.apache.org/jira/browse/FLINK-20110
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Affects Versions: 1.12.0
>Reporter: hailong wang
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.15.0
>
>
> From the user-zh mailing list: when using the first_value function in a hop 
> window, an exception is thrown because first_value does not implement the 
> merge method.
> We can support the 'merge' method for the first_value and last_value UDAFs.
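
For illustration, a minimal sketch of what 'merge' semantics look like for a 
first_value-style aggregate. The names are illustrative, not Flink's actual 
FirstValueAggFunction:

{code:java}
// Each accumulator remembers the earliest (order, value) pair it has seen;
// merging keeps the pair with the smallest order across accumulators.
public class FirstValueAcc<T> {
    T firstValue;
    long firstOrder = Long.MAX_VALUE; // e.g. rowtime or input sequence number

    void accumulate(T value, long order) {
        if (order < firstOrder) {
            firstOrder = order;
            firstValue = value;
        }
    }

    void merge(Iterable<FirstValueAcc<T>> others) {
        for (FirstValueAcc<T> other : others) {
            if (other.firstOrder < firstOrder) {
                firstOrder = other.firstOrder;
                firstValue = other.firstValue;
            }
        }
    }
}
{code}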





[jira] [Updated] (FLINK-15130) Drop "RequiredParameters" and "Options"

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15130:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Drop "RequiredParameters" and "Options"
> ---
>
> Key: FLINK-15130
> URL: https://issues.apache.org/jira/browse/FLINK-15130
> Project: Flink
>  Issue Type: Technical Debt
>  Components: API / DataSet, API / DataStream
>Affects Versions: 1.9.1
>Reporter: Stephan Ewen
>Priority: Minor
> Fix For: 1.15.0
>
>
> As per mailing list discussion, we want to drop those because they are 
> unused, redundant code.
> There are many options for command line parsing, including one in Flink 
> (ParameterTool).
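
For reference, a minimal sketch of the ParameterTool usage that replaces 
RequiredParameters (the argument names are illustrative):

{code:java}
import org.apache.flink.api.java.utils.ParameterTool;

public class ParameterToolExample {
    public static void main(String[] args) {
        // e.g. args: --input hdfs:///data --parallelism 4
        ParameterTool params = ParameterTool.fromArgs(args);
        String input = params.getRequired("input");        // throws if missing
        int parallelism = params.getInt("parallelism", 1); // with a default
        System.out.println(input + " / " + parallelism);
    }
}
{code}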





[jira] [Updated] (FLINK-15351) develop PostgresCatalog to connect Flink with Postgres tables and ecosystem

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15351:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> develop PostgresCatalog to connect Flink with Postgres tables and ecosystem
> ---
>
> Key: FLINK-15351
> URL: https://issues.apache.org/jira/browse/FLINK-15351
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / JDBC
>Reporter: Bowen Li
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
>






[jira] [Updated] (FLINK-15183) Use SQL-CLI for the TPC-DS E2E test

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15183:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Use SQL-CLI for the TPC-DS E2E test
> --
>
> Key: FLINK-15183
> URL: https://issues.apache.org/jira/browse/FLINK-15183
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Tests
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> Now that SQL-CLI supports DDL, we can use it to run the TPC-DS tests.





[jira] [Updated] (FLINK-21799) Detect and fail explicitly on managed memory leaking

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-21799:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Detect and fail explicitly on managed memory leaking
> 
>
> Key: FLINK-21799
> URL: https://issues.apache.org/jira/browse/FLINK-21799
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> As discussed in FLINK-21419, it would be helpful to fail explicitly if a 
> managed memory leak is detected (see the sketch below).
> A leak can be detected when:
> * a segment is garbage-collected without ever having been freed;
> * a memory manager is closed without all of its memory having been released.
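
For illustration, a minimal sketch of the first detection idea using 
java.lang.ref.Cleaner (Java 9+). The class and message are illustrative, not 
Flink's actual MemorySegment:

{code:java}
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

public class LeakDetectingSegment {
    private static final Cleaner CLEANER = Cleaner.create();
    private final AtomicBoolean freed = new AtomicBoolean(false);

    public LeakDetectingSegment() {
        final AtomicBoolean freedRef = freed;
        // Runs when the segment becomes unreachable; must not capture 'this'.
        CLEANER.register(this, () -> {
            if (!freedRef.get()) {
                // A real implementation would fail the job here instead of logging.
                System.err.println("Managed memory leak: segment GC-ed but never freed");
            }
        });
    }

    public void free() {
        freed.set(true);
    }
}
{code}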





[jira] [Updated] (FLINK-22585) Add deprecated message when "-m yarn-cluster" is used

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-22585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-22585:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Add deprecated message when "-m yarn-cluster" is used
> -
>
> Key: FLINK-22585
> URL: https://issues.apache.org/jira/browse/FLINK-22585
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Yang Wang
>Assignee: Rainie Li
>Priority: Major
>  Labels: auto-unassigned, pull-request-available, stale-assigned
> Fix For: 1.15.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The unified executor interface was introduced a long time ago; it can be 
> activated via "\--target 
> yarn-per-job/yarn-session/kubernetes-application" and is more descriptive and 
> clearer. We should deprecate "-m yarn-cluster" and suggest that our users use 
> the new CLI commands.
>  
> However, AFAIK, many companies are using these CLI commands to integrate with 
> their deployers, so we cannot remove "-m yarn-cluster" very soon. Maybe we 
> can do it in release 2.0, where breaking changes are possible.
>  
> For now, I suggest adding the {{@Deprecated}} annotation and printing a WARN 
> log message when "-m yarn-cluster" is used (a sketch follows below). This 
> lets users know the long-term goal and migrate as soon as possible.
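
A minimal sketch of the suggested WARN message. The class and method names 
are illustrative, not the actual Flink CLI code:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class YarnClusterFlagCheck {
    private static final Logger LOG = LoggerFactory.getLogger(YarnClusterFlagCheck.class);

    // Call with the parsed "-m" value; warns users off the legacy mode.
    static void warnIfLegacyYarnCluster(String jobManagerAddress) {
        if ("yarn-cluster".equals(jobManagerAddress)) {
            LOG.warn("The \"-m yarn-cluster\" option is deprecated and will be removed "
                    + "in a future release. Please use \"--target yarn-per-job\" "
                    + "or \"--target yarn-session\" instead.");
        }
    }
}
{code}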





[jira] [Updated] (FLINK-20103) Improve test coverage with chaos testing & side-by-side tests

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-20103:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Improve test coverage with chaos testing & side-by-side tests
> -
>
> Key: FLINK-20103
> URL: https://issues.apache.org/jira/browse/FLINK-20103
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing, Runtime / Network, Tests
>Reporter: Roman Khachatryan
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
> Fix For: 1.15.0
>
>
> This is a follow-up ticket after FLINK-20097.
> With the current setup (UnalignedITCase):
>  - race conditions are not detected reliably (1 per tens of runs)
>  - require changing the configuration (low checkpoint timeout)
>  - adding a new job graph often reveals a new bug
> An additional issue with the current setup is that it's difficult to git 
> bisect (for long ranges). 
> Changes that might hide the bugs:
>  - having Preconditions in ChannelStatePersister (slow down processing)
>  - some Preconditions may mask errors by causing job restart
>  - timings in tests (UnalignedITCase)
>  Some options to consider
>  # chaos monkey tests including induced latency and/or CPU bursts - on 
> different workloads/configs
>  # side-by-side tests with randomized inputs/configs
> Extending Jepsen coverage further (validating output) does not seem promising 
> in the context of Flink because its output isn't linearisable.
>   
> Some tools for (1) that could be used:
> 1. https://github.com/chaosblade-io/chaosblade (docs need translation)
> 2. https://github.com/Netflix/chaosmonkey - requires spinnaker (CD)
> 3. jvm agent: https://github.com/mrwilson/byte-monkey
> 4. https://vmware.github.io/mangle/ - supports java method latency; ui 
> oriented?; not actively maintained?
>  
>  





[jira] [Updated] (FLINK-24281) Migrate all existing tests to new Kafka Sink

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-24281:
-
Fix Version/s: (was: 1.14.1)
   1.14.0

> Migrate all existing tests to new Kafka Sink
> 
>
> Key: FLINK-24281
> URL: https://issues.apache.org/jira/browse/FLINK-24281
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Fabian Paul
>Assignee: Fabian Paul
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> The FlinkKafkaProducer has been deprecated since 1.14, but a lot of existing 
> tests are still using it.
> We should replace it with the KafkaSink, which completely subsumes it.
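
For reference, a minimal sketch of the 1.14-style KafkaSink that test code 
would build instead of instantiating the deprecated FlinkKafkaProducer (the 
broker address and topic are illustrative):

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class KafkaSinkSketch {

    static KafkaSink<String> buildSink() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092") // illustrative address
                .setRecordSerializer(
                        KafkaRecordSerializationSchema.builder()
                                .setTopic("test-topic") // illustrative topic
                                .setValueSerializationSchema(new SimpleStringSchema())
                                .build())
                .build();
    }
}
{code}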





[jira] [Updated] (FLINK-23607) Cleanup unnecessary dependencies in dstl pom.xml

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-23607:
-
Fix Version/s: (was: 1.14.1)
   1.14.0

> Cleanup unnecessary dependencies in dstl pom.xml
> 
>
> Key: FLINK-23607
> URL: https://issues.apache.org/jira/browse/FLINK-23607
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Affects Versions: 1.14.0
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.15.0
>
>
> - Clean up `flink-statebackend-changelog` dependencies (move the 
> flink-statebackend-changelog dependency to flink-test-utils) 
>  - check whether some dependencies (e.g. flink-streaming-java, shaded guava, 
> flink-test-utils-junit) are indeed necessary; comment if they are transitive
>  - fix the scope of flink-runtime and flink-core - compile (not provided) and 
> test for test-jar
>  - document how the changelog randomization flags work.





[jira] [Updated] (FLINK-21769) Encapsulate component meta data

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-21769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-21769:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Encapsulate component meta data
> ---
>
> Key: FLINK-21769
> URL: https://issues.apache.org/jira/browse/FLINK-21769
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Major
>  Labels: auto-unassigned, stale-assigned
> Fix For: 1.15.0
>
>
> Encapsulate jm/tm/job/task/operator meta data in simple pojos that can be 
> passed around, reducing the dependency on strict metric group hierarchies.





[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-24315:
-
Fix Version/s: (was: 1.14.0)
   1.14.1

> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.14.0, 1.13.2
>Reporter: ouyangwulin
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.3, 1.15.0, 1.14.1
>
>
> In the native K8s integration, Flink will try to rebuild the watcher thread 
> if the API server is temporarily unavailable. However, if the jitter is 
> longer than the web socket timeout, the rebuilding of the watcher will time 
> out and Flink cannot handle the pod events correctly.





[jira] [Updated] (FLINK-24353) Bash scripts do not respect dynamic configurations when calculating memory sizes

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-24353:
-
Fix Version/s: (was: 1.14.0)
   1.14.1

> Bash scripts do not respect dynamic configurations when calculating memory 
> sizes
> 
>
> Key: FLINK-24353
> URL: https://issues.apache.org/jira/browse/FLINK-24353
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Scripts
>Affects Versions: 1.14.0, 1.13.2, 1.15.0
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.3, 1.15.0, 1.14.1
>
>
> Dynamic configurations (the '-D' arguments) are lost due to changes in 
> FLINK-21128.
> Consequently, dynamic configurations like the one in the following command 
> will not take effect.
> {code}
> ./bin/taskmanager.sh start -D taskmanager.memory.task.off-heap.size=128m
> {code}





[jira] [Updated] (FLINK-23798) Avoid using reflection to get filter when partition filter is enabled

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-23798:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Avoid using reflection to get filter when partition filter is enabled
> -
>
> Key: FLINK-23798
> URL: https://issues.apache.org/jira/browse/FLINK-23798
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Yun Tang
>Assignee: PengFei Li
>Priority: Minor
> Fix For: 1.15.0
>
>
> FLINK-20496 introduced partitioned index & filter support in Flink. However, 
> RocksDB only supports the new full filter format for this feature, so we 
> need to replace the previous filter if the user enables it. [The previous 
> implementation uses reflection to get the 
> filter|https://github.com/apache/flink/blob/7ff4cbdc25aa971dccaf5ce02aaf46dc1e7345cc/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBResourceContainer.java#L251-L258],
>  and we could use a proper API to get it after upgrading to a newer version.
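
For illustration, a hedged sketch of the difference. The field name "filter" 
is an illustrative stand-in for the private field in RocksJava's 
BlockBasedTableConfig, and the post-upgrade getter is deliberately left 
unnamed since it depends on the RocksJava version:

{code:java}
import java.lang.reflect.Field;

public class FilterAccessSketch {

    // Today: read a private field reflectively (fragile; breaks on renames).
    static Object getFilterViaReflection(Object tableConfig) throws Exception {
        Field field = tableConfig.getClass().getDeclaredField("filter");
        field.setAccessible(true);
        return field.get(tableConfig);
    }

    // After upgrading RocksDB, the idea is to call a public getter on the
    // table config instead of using reflection.
}
{code}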





[jira] [Updated] (FLINK-24159) document of entropy injection may mislead users

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-24159:
-
Fix Version/s: (was: 1.14.1)
   1.14.0

> document of entropy injection may mislead users
> ---
>
> Key: FLINK-24159
> URL: https://issues.apache.org/jira/browse/FLINK-24159
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / Checkpointing
>Affects Versions: 1.14.0, 1.12.5, 1.13.2
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3, 1.15.0
>
>
> FLINK-9061 introduced entropy injection into the S3 path for better 
> scalability, but the documentation on 
> [entropy-injection-for-s3-file-systems|https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#entropy-injection-for-s3-file-systems]
>  uses an example with the checkpoint directory 
> "{color:#ff}s3://my-bucket/checkpoints/_entropy_/dashboard-job/{color}". 
> With this configuration, every checkpoint key will still start with the 
> constant checkpoints/ prefix, which actually reduces scalability.
> Thanks to dmtolpeko for describing this issue in his blog ( 
> [flink-and-s3-entropy-injection-for-checkpoints 
> |http://cloudsqale.com/2021/01/02/flink-and-s3-entropy-injection-for-checkpoints/]).
> h3. Proposal
> Alter the checkpoint directory in the documentation on 
> [entropy-injection-for-s3-file-systems|https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#entropy-injection-for-s3-file-systems]
>  to 
> "{color:#ff}s3://my-bucket/_entropy_/checkpoints/dashboard-job/{color}" 
> (putting the entropy key at the start of the keys).
>  
> If this proposal is appropriate, I am glad to submit a PR to modify the 
> documentation. Any other ideas?
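
To make the difference concrete, a minimal sketch of how the proposed path 
would be set from a job (assuming the Flink 1.14-style CheckpointConfig API; 
the bucket and job names are from the docs example):

{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EntropyCheckpointPath {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Documented today (entropy after a constant prefix -> poor key spread):
        //   s3://my-bucket/checkpoints/_entropy_/dashboard-job/
        // Proposed (entropy first, so S3 keys spread across partitions):
        env.getCheckpointConfig().setCheckpointStorage(
                "s3://my-bucket/_entropy_/checkpoints/dashboard-job/");
    }
}
{code}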





[jira] [Updated] (FLINK-24159) document of entropy injection may mislead users

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-24159:
-
Fix Version/s: (was: 1.14.0)
   1.14.1

> document of entropy injection may mislead users
> ---
>
> Key: FLINK-24159
> URL: https://issues.apache.org/jira/browse/FLINK-24159
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / Checkpointing
>Affects Versions: 1.14.0, 1.12.5, 1.13.2
>Reporter: Feifan Wang
>Assignee: Feifan Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.3, 1.15.0, 1.14.1
>
>
> FLINK-9061 introduced entropy injection into the S3 path for better 
> scalability, but the documentation on 
> [entropy-injection-for-s3-file-systems|https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#entropy-injection-for-s3-file-systems]
>  uses an example with the checkpoint directory 
> "{color:#ff}s3://my-bucket/checkpoints/_entropy_/dashboard-job/{color}". 
> With this configuration, every checkpoint key will still start with the 
> constant checkpoints/ prefix, which actually reduces scalability.
> Thanks to dmtolpeko for describing this issue in his blog ( 
> [flink-and-s3-entropy-injection-for-checkpoints 
> |http://cloudsqale.com/2021/01/02/flink-and-s3-entropy-injection-for-checkpoints/]).
> h3. Proposal
> Alter the checkpoint directory in the documentation on 
> [entropy-injection-for-s3-file-systems|https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#entropy-injection-for-s3-file-systems]
>  to 
> "{color:#ff}s3://my-bucket/_entropy_/checkpoints/dashboard-job/{color}" 
> (putting the entropy key at the start of the keys).
>  
> If this proposal is appropriate, I am glad to submit a PR to modify the 
> documentation. Any other ideas?





[GitHub] [flink] iyupeng commented on pull request #17344: [FLINK-20895] [flink-table-planner] support local aggregate push down in table planner

2021-09-28 Thread GitBox


iyupeng commented on pull request #17344:
URL: https://github.com/apache/flink/pull/17344#issuecomment-929774236


   @wuchong @godfreyhe Please take a look, thanks a lot.






[jira] [Updated] (FLINK-9407) Support orc rolling sink writer

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-9407:

Fix Version/s: (was: 1.14.0)
   1.15.0

> Support orc rolling sink writer
> ---
>
> Key: FLINK-9407
> URL: https://issues.apache.org/jira/browse/FLINK-9407
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: zhangminglei
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available, 
> usability
> Fix For: 1.15.0
>
>
> Currently, we only support {{StringWriter}}, {{SequenceFileWriter}} and 
> {{AvroKeyValueSinkWriter}}. I would suggest adding an ORC writer for the 
> rolling sink.
> FYI, below:
> I tested the PR and verified the results with Spark SQL, i.e. we can read 
> back the results we had written before. I will add more tests in the next 
> couple of days, including the performance under compression with short 
> checkpoint intervals, and more UTs.
> {code:java}
> scala> spark.read.orc("hdfs://10.199.196.0:9000/data/hive/man/2018-07-06--21")
> res1: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala>
> scala> res1.registerTempTable("tablerice")
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> scala> spark.sql("select * from tablerice")
> res3: org.apache.spark.sql.DataFrame = [name: string, age: int ... 1 more 
> field]
> scala> res3.show(3)
> +-----+---+-------+
> | name|age|married|
> +-----+---+-------+
> |Sagar| 26|  false|
> |Sagar| 30|  false|
> |Sagar| 34|  false|
> +-----+---+-------+
> only showing top 3 rows
> {code}





[jira] [Updated] (FLINK-13735) Support session window with blink planner in batch mode

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-13735:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Support session window with blink planner in batch mode
> ---
>
> Key: FLINK-13735
> URL: https://issues.apache.org/jira/browse/FLINK-13735
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner, Table SQL / Runtime
>Reporter: Kurt Young
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>






[jira] [Updated] (FLINK-14340) Specify an unique DFSClient name for Hadoop FileSystem

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14340:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Specify an unique DFSClient name for Hadoop FileSystem
> --
>
> Key: FLINK-14340
> URL: https://issues.apache.org/jira/browse/FLINK-14340
> Project: Flink
>  Issue Type: Improvement
>  Components: FileSystems
>Reporter: Congxian Qiu
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> Currently, when Flink reads/writes to HDFS, we do not set the DFSClient name 
> for the connections, so we cannot distinguish them and cannot quickly find 
> the specific Job or TM.
> This issue proposes to add the {{container_id}} as a unique name when 
> initializing the Hadoop File System, so we can easily tell which Job/TM a 
> connection belongs to.
> The core change is to add a line such as the one below in 
> {{org.apache.flink.runtime.fs.hdfs.HadoopFsFactory#create}}:
> {code:java}
> hadoopConfig.set("mapreduce.task.attempt.id",
>     System.getenv().getOrDefault(CONTAINER_KEY_IN_ENV, DEFAULT_CONTAINER_ID));
> {code}
> Currently, both {{YarnResourceManager}} and {{MesosResourceManager}} have the 
> environment key {{ENV_FLINK_CONTAINER_ID = "_FLINK_CONTAINER_ID"}}, so maybe 
> we should introduce this key in {{StandaloneResourceManager}} as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14032) Make the cache size of RocksDBPriorityQueueSetFactory configurable

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14032:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Make the cache size of RocksDBPriorityQueueSetFactory configurable
> --
>
> Key: FLINK-14032
> URL: https://issues.apache.org/jira/browse/FLINK-14032
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Reporter: Yun Tang
>Priority: Minor
>  Labels: auto-unassigned, usability
> Fix For: 1.15.0
>
>
> Currently, the cache size of {{RocksDBPriorityQueueSetFactory}} is hard-coded 
> to 128 and there is no way to configure it to another value. (We could 
> increase it to obtain better performance if necessary.) Actually, this has 
> also been a TODO for quite a long time.
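> A sketch of what such an option could look like, assuming Flink's 
> ConfigOptions builder; the option key is illustrative, not a decided name:
> {code:java}
> import org.apache.flink.configuration.ConfigOption;
> import org.apache.flink.configuration.ConfigOptions;
>
> public class RocksDBPriorityQueueOptions {
>     // hypothetical key; today the value is hard-coded to 128
>     public static final ConfigOption<Integer> PRIORITY_QUEUE_CACHE_SIZE =
>         ConfigOptions.key("state.backend.rocksdb.priority-queue.cache-size")
>             .intType()
>             .defaultValue(128)
>             .withDescription("Cache size of the RocksDBPriorityQueueSetFactory.");
> }
> {code}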



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-11499) Extend StreamingFileSink BulkFormats to support arbitrary roll policies

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-11499:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Extend StreamingFileSink BulkFormats to support arbitrary roll policies
> ---
>
> Key: FLINK-11499
> URL: https://issues.apache.org/jira/browse/FLINK-11499
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Seth Wiesman
>Priority: Minor
>  Labels: auto-deprioritized-major, usability
> Fix For: 1.15.0
>
>
> Currently, when using the StreamingFileSink, bulk-encoding formats can only 
> be combined with the `OnCheckpointRollingPolicy`, which rolls the in-progress 
> part file on every checkpoint.
> However, many bulk formats such as Parquet are most efficient when written as 
> large files; this is not possible when frequent checkpointing is enabled. 
> Currently the only work-around is to use long checkpoint intervals, which is 
> not ideal.
> The StreamingFileSink should be enhanced to support arbitrary rolling 
> policies so users may write large bulk files while retaining frequent 
> checkpoints.
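> For contrast, a sketch of the current API, where bulk formats always roll on 
> checkpoint; MyEvent and the S3 path are illustrative:
> {code:java}
> import org.apache.flink.core.fs.Path;
> import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
> import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
>
> public class ParquetSinkSketch {
>     public static class MyEvent {
>         public String name;
>         public long timestamp;
>     }
>
>     public static StreamingFileSink<MyEvent> buildSink() {
>         return StreamingFileSink
>             .forBulkFormat(new Path("s3://bucket/out"),
>                 ParquetAvroWriters.forReflectRecord(MyEvent.class))
>             // the proposal: accept size/time-based policies here instead of
>             // the implicit OnCheckpointRollingPolicy
>             .build();
>     }
> }
> {code}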



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-12358) Verify whether rest documenation needs to be updated when building pull request

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-12358:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Verify whether rest documenation needs to be updated when building pull 
> request
> ---
>
> Key: FLINK-12358
> URL: https://issues.apache.org/jira/browse/FLINK-12358
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Yun Tang
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.15.0
>
>
> Currently, unlike the configuration docs, the REST API docs have no mechanism 
> to check whether they are up to date with the latest code. This is really 
> annoying and hard to track if it is only checked by developers.
> I plan to add a check in Travis that regenerates the docs and then verifies 
> via `git status` whether any files have changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-12491) Incorrect documentation for directory path separators of CoreOptions.TMP_DIRS

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-12491:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Incorrect documentation for directory path separators of CoreOptions.TMP_DIRS
> -
>
> Key: FLINK-12491
> URL: https://issues.apache.org/jira/browse/FLINK-12491
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Affects Versions: 1.6.4, 1.7.2, 1.8.0, 1.9.0
>Reporter: Kezhu Wang
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{CoreOptions.TMP_DIRS}} and {{ConfigConstants.TASK_MANAGER_TMP_DIR_KEY}} 
> both say that:
> {quote}
> The config parameter defining the directories for temporary files, separated 
> by ",", "|", or the system's {@link java.io.File#pathSeparator}.
> {quote}
> But the parsing phase eventually uses {{String.split}} with the argument 
> {{",|" + File.pathSeparator}}. The sole parameter of {{String.split}} is a 
> regular expression, in which "|" is the alternation operator, so the actual 
> directory path separators are "," or {{java.io.File#pathSeparator}}. After 
> digging into the history, I found that the documentation was introduced in 
> commit {{a7c407ace4f6cbfbde3e247071cee5a755ae66db}} and inherited by 
> {{76abcaa55d0d6ab704b7ab8164718e8e2dcae2c4}}. So I think it is safe to drop 
> "|" from the documentation.
> {code:title=ConfigurationUtils.java}
> public class ConfigurationUtils {
>     private static String[] splitPaths(@Nonnull String separatedPaths) {
>         return separatedPaths.length() > 0
>             ? separatedPaths.split(",|" + File.pathSeparator)
>             : EMPTY;
>     }
> }
> {code}
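> A quick check that "|" acts as regex alternation rather than a literal 
> separator (runnable as-is; on Unix, File.pathSeparator is ":"):
> {code:java}
> import java.io.File;
> import java.util.Arrays;
>
> public class SplitCheck {
>     public static void main(String[] args) {
>         // regex ",|:" splits on "," and ":" only; "|" stays inside the path
>         String[] parts = "a,b|c:d".split(",|" + File.pathSeparator);
>         System.out.println(Arrays.toString(parts)); // [a, b|c, d] on Unix
>     }
> }
> {code}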



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13809) The log directory of Flink Python API is unwritable if it is installed via "sudo"

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-13809:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> The log directory of Flink Python API  is unwritable if it is installed via 
> "sudo"
> --
>
> Key: FLINK-13809
> URL: https://issues.apache.org/jira/browse/FLINK-13809
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.9.0
>Reporter: Wei Zhong
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if the python apache-flink package is installed via "sudo", an 
> exception will be thrown when starting the flink python shell:
> {code:java}
> log4j:ERROR setFile(null,false) call failed.
> java.io.FileNotFoundException: /Library/Python/2.7/site-packages/pyflink/log/flink-zhongwei-python-zhongweideMacBook-Pro.local.log (Permission denied)
>     at java.io.FileOutputStream.open0(Native Method)
>     at java.io.FileOutputStream.open(FileOutputStream.java:270)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
>     at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
>     at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
>     at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
>     at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
>     at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
>     at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
>     at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
>     at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
>     at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
>     at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
>     at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
>     at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
>     at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:81)
>     at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:329)
>     at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:349)
>     at org.apache.flink.api.java.ExecutionEnvironment.<clinit>(ExecutionEnvironment.java:102)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:348)
>     at org.apache.flink.api.python.shaded.py4j.reflection.CurrentThreadClassLoadingStrategy.classForName(CurrentThreadClassLoadingStrategy.java:40)
>     at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionUtil.classForName(ReflectionUtil.java:51)
>     at org.apache.flink.api.python.shaded.py4j.reflection.TypeUtil.forName(TypeUtil.java:243)
>     at org.apache.flink.api.python.shaded.py4j.commands.ReflectionCommand.getUnknownMember(ReflectionCommand.java:175)
>     at org.apache.flink.api.python.shaded.py4j.commands.ReflectionCommand.execute(ReflectionCommand.java:87)
>     at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
> It does not affect the running of the Flink Python shell, but it would be 
> better if we fixed it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14684) Add Pinterest to Chinese Powered By page

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14684:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Add Pinterest to Chinese Powered By page
> 
>
> Key: FLINK-14684
> URL: https://issues.apache.org/jira/browse/FLINK-14684
> Project: Flink
>  Issue Type: New Feature
>  Components: chinese-translation
>Reporter: Hequn Cheng
>Priority: Minor
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Pinterest was added to the English Powered By page with commit:
> [51f7e3ced85b94dcbe3c051069379d22c88fbc5c|https://github.com/apache/flink-web/pull/281]
> It should be added to the Chinese Powered By (and index.html) page as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14125) Display memory and CPU usage in the overview page

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14125:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Display memory and CPU usage in the overview page
> -
>
> Key: FLINK-14125
> URL: https://issues.apache.org/jira/browse/FLINK-14125
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.10.0
>Reporter: Yadong Xie
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
> Attachments: 屏幕快照 2019-09-19 下午5.03.14.png
>
>
> On the overview page of the Web UI, besides the task slots and jobs, we could 
> add cluster-level memory and CPU usage metrics; these metrics are already 
> available in the Blink branch.
> !屏幕快照 2019-09-19 下午5.03.14.png|width=564,height=283!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-12412) Allow ListTypeInfo used for java.util.List and MapTypeInfo used for java.util.Map

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-12412:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Allow ListTypeInfo used for java.util.List and MapTypeInfo used for 
> java.util.Map
> -
>
> Key: FLINK-12412
> URL: https://issues.apache.org/jira/browse/FLINK-12412
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Type Serialization System
>Affects Versions: 1.9.0
>Reporter: YangFei
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available, starer
> Fix For: 1.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> public static class UserBehavior {
>     public long userId;
>     public long itemId;
>     public int categoryId;
>     public long timestamp;
>     public List<String> comments;
> }
>
> public static void main(String[] args) throws Exception {
>     PojoTypeInfo<UserBehavior> pojoType =
>         (PojoTypeInfo<UserBehavior>) TypeExtractor.createTypeInfo(UserBehavior.class);
> }
> {code}
>  
> The field {{comments}} in {{UserBehavior}} will be extracted by TypeExtractor 
> as a GenericType.
> I think it could be extracted as a {{ListTypeInfo}} instead.
> This would be a big improvement, as in many cases classes include List or Map 
> fields; see the sketch below.
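> What the extractor could produce instead of a GenericType; {{ListTypeInfo}} 
> and {{MapTypeInfo}} already exist and can be created by hand today:
> {code:java}
> import java.util.List;
> import java.util.Map;
> import org.apache.flink.api.common.typeinfo.TypeInformation;
> import org.apache.flink.api.common.typeinfo.Types;
> import org.apache.flink.api.java.typeutils.ListTypeInfo;
> import org.apache.flink.api.java.typeutils.MapTypeInfo;
>
> public class ListTypeInfoSketch {
>     public static void main(String[] args) {
>         // what TypeExtractor could derive for "List<String> comments" automatically
>         TypeInformation<List<String>> comments = new ListTypeInfo<>(Types.STRING);
>         TypeInformation<Map<String, Long>> counts =
>             new MapTypeInfo<>(Types.STRING, Types.LONG);
>         System.out.println(comments + " / " + counts);
>     }
> }
> {code}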



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14707) Refactor checkpoint related methods within Environment

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14707:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Refactor checkpoint related methods within Environment
> --
>
> Key: FLINK-14707
> URL: https://issues.apache.org/jira/browse/FLINK-14707
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Reporter: Yun Tang
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> Since FLINK-7720 was fixed, the two {{Environment#acknowledgeCheckpoint}} 
> methods have actually become useless, and their usage has been superseded by 
> {{TaskStateManager#reportTaskStateSnapshots}}. More generally, we consider 
> the {{Environment}} interface a getter for accessing components rather than 
> something that performs real actions. I prefer to remove the 
> {{acknowledgeCheckpoint}} and {{declineCheckpoint}} methods, or at least 
> deprecate them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14690) Support 'DESCRIBE CATALOG catalogName' statement in TableEnvironment and SQL Client

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14690:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Support 'DESCRIBE CATALOG catalogName' statement in TableEnvironment and SQL 
> Client
> ---
>
> Key: FLINK-14690
> URL: https://issues.apache.org/jira/browse/FLINK-14690
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Client
>Reporter: Terry Wang
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> 1. showCatalogsStatement (already supported):
> SHOW CATALOGS
> 2. describeCatalogStatement:
> DESCRIBE CATALOG catalogName
> See 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+69+-+Flink+SQL+DDL+Enhancement
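> Hypothetical usage once implemented, going through the regular Table API 
> entry points (assuming a Flink version with {{TableEnvironment#executeSql}}):
> {code:java}
> import org.apache.flink.table.api.EnvironmentSettings;
> import org.apache.flink.table.api.TableEnvironment;
> import org.apache.flink.table.api.TableResult;
>
> public class DescribeCatalogSketch {
>     public static void main(String[] args) {
>         TableEnvironment tEnv =
>             TableEnvironment.create(EnvironmentSettings.newInstance().build());
>         // not supported yet; this is the statement the issue proposes
>         TableResult result = tEnv.executeSql("DESCRIBE CATALOG default_catalog");
>         result.print();
>     }
> }
> {code}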



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13015) Create validators, strategies and transformations required for porting logical expressions

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-13015:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Create validators, strategies and transformations required for porting 
> logical expressions
> --
>
> Key: FLINK-13015
> URL: https://issues.apache.org/jira/browse/FLINK-13015
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The goal of this task is to implement:
> InputTypeValidator:
> * by type root
> TypeStrategies:
> * cascade
> * explicit
> TypeTransformations:
> * to_nullable
> This set of classes will enable porting 
> AND/OR/NOT/IS_TRUE/IS_FALSE/IS_NOT_TRUE/IS_NOT_FALSE to the new type 
> inference stack.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14479) Strange exceptions found in log file after executing `test_udf.py`

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14479:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Strange exceptions found in log file after executing `test_udf.py`
> --
>
> Key: FLINK-14479
> URL: https://issues.apache.org/jira/browse/FLINK-14479
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: sunjincheng
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
> Attachments: screenshot-1.png
>
>
> There are several strange exceptions like the following in 
> `${flink_source}/build-target/log/flink-${username}-python-udf-boot-${machine_name}.local.log`
>  after executing 
> `${flink_source}/flink-python/pyflink/table/tests/test_udf.py`:
> {code}
> Traceback (most recent call last):
>   File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/Users/zhongwei/flink/flink-python/pyflink/fn_execution/sdk_worker_main.py", line 30, in <module>
>     apache_beam.runners.worker.sdk_worker_main.main(sys.argv)
>   File "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker_main.py", line 148, in main
>     sdk_pipeline_options.view_as(pipeline_options.ProfilingOptions))
>   File "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 133, in run
>     for work_request in control_stub.Control(get_responses()):
>   File "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", line 364, in __next__
>     return self._next()
>   File "/Users/zhongwei/pyflink_env/py37/lib/python3.7/site-packages/grpc/_channel.py", line 347, in _next
>     raise self
> grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
>     status = StatusCode.CANCELLED
>     details = "Runner closed connection"
>     debug_error_string = "{"created":"@1571660342.057172000","description":"Error received from peer ipv6:[::1]:52699","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Runner closed connection","grpc_status":1}"
> {code}
> It appears randomly when executing test cases of the Blink planner. Although 
> it does not affect test results, we need to find out why it appears.
> Any feedback is welcome!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-9879) Find sane defaults for (advanced) SSL session parameters

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-9879:

Fix Version/s: (was: 1.14.0)
   1.15.0

> Find sane defaults for (advanced) SSL session parameters
> 
>
> Key: FLINK-9879
> URL: https://issues.apache.org/jira/browse/FLINK-9879
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.5.2, 1.6.0
>Reporter: Nico Kruber
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> After adding these configuration parameters with 
> https://issues.apache.org/jira/browse/FLINK-9878:
> - SSL session cache size
> - SSL session timeout
> - SSL handshake timeout
> - SSL close notify flush timeout
> We should try to find sane defaults that "just work" :)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14934) Remove error log statement in ES connector

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14934:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Remove error log statement in ES connector
> --
>
> Key: FLINK-14934
> URL: https://issues.apache.org/jira/browse/FLINK-14934
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.9.1
>Reporter: Arvid Heise
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> The ES connector currently uses the [log and throw 
> antipattern|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-elasticsearch-base/src/main/java/org/apache/flink/streaming/connectors/elasticsearch/ElasticsearchSinkBase.java#L406],
>  which prevents users from ignoring certain types of errors without getting 
> their logs spammed.
> The log statement should be removed completely.
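> A sketch of the antipattern versus the fix; the class and method names are 
> illustrative, not the actual ElasticsearchSinkBase code:
> {code:java}
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
>
> class FailureHandling {
>     private static final Logger LOG = LoggerFactory.getLogger(FailureHandling.class);
>
>     // antipattern: the caller gets the exception AND an unsuppressable log line
>     static void logAndThrow(Exception failure) throws Exception {
>         LOG.error("Failed Elasticsearch item request: {}", failure.getMessage(), failure);
>         throw failure;
>     }
>
>     // suggested: throw only, and let the caller decide what is log-worthy
>     static void throwOnly(Exception failure) throws Exception {
>         throw failure;
>     }
> }
> {code}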



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14879) Support 'DESCRIBE DATABSE databaseName' in TableEnvironment and SQL Client

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14879:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Support 'DESCRIBE DATABSE databaseName' in TableEnvironment and SQL Client
> --
>
> Key: FLINK-14879
> URL: https://issues.apache.org/jira/browse/FLINK-14879
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Client
>Reporter: Terry Wang
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> 1. showDatabasesStatement (already supported):
> SHOW DATABASES
> 2. descDatabaseStatement:
> DESCRIBE DATABASE [EXTENDED] [catalogName.]databaseName
> See 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+69+-+Flink+SQL+DDL+Enhancement



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-10230) Support 'SHOW CREATE VIEW' syntax to print the query of a view

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-10230:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Support 'SHOW CREATE VIEW' syntax to print the query of a view
> --
>
> Key: FLINK-10230
> URL: https://issues.apache.org/jira/browse/FLINK-10230
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Client
>Reporter: Timo Walther
>Assignee: Roc Marshal
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>
> FLINK-10163 added initial support for views in SQL Client. We should add a 
> command that allows for printing the query of a view for debugging. MySQL 
> offers {{SHOW CREATE VIEW}} for this. Hive generalizes this to {{SHOW CREATE 
> TABLE}}. The latter one could be extended to also show information about the 
> used table factories and properties.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15471) HA e2e check for empty .out files does not print specific error

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15471:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> HA e2e check for empty .out files does not print specific error
> ---
>
> Key: FLINK-15471
> URL: https://issues.apache.org/jira/browse/FLINK-15471
> Project: Flink
>  Issue Type: Improvement
>  Components: Test Infrastructure
>Affects Versions: 1.9.0
>Reporter: Chesnay Schepler
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.10.4, 1.11.5, 1.15.0
>
>
> {{common_ha.sh#verify_logs:}}
> {code}
> if ! check_logs_for_non_empty_out_files; then
>     echo "FAILURE: Alerts found at the general purpose job."
>     EXIT_CODE=1
> fi
> {code}
> Since check_logs_for_non_empty_out_files only sets EXIT_CODE without 
> modifying the return value, the check will never fail.
> While the test will still fail (since EXIT_CODE is evaluated later), we may 
> not actually print the error cause.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15740) Remove Deadline#timeLeft()

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15740:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Remove Deadline#timeLeft()
> --
>
> Key: FLINK-15740
> URL: https://issues.apache.org/jira/browse/FLINK-15740
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Tests
>Reporter: Chesnay Schepler
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As shown in FLINK-13662, {{Deadline#timeLeft()}} is conceptually broken since 
> there is no reliable way to call said method while ensuring that
>  a) the value is non-negative (desired since most time-based APIs reject 
> negative values)
>  b) the value sign (+,-) corresponds to preceding calls to {{#hasTimeLeft()}}
>  
> As a result, any usage of the following form is unreliable and obfuscates 
> error messages.
> {code:java}
> while (deadline.hasTimeLeft()) {
>   doSomething(deadline.timeLeft());
> } {code}
>  
> All existing usage should be migrated to either
> {code:java}
> while (deadline.hasTimeLeft()) {
>   doSomething();
> } {code}
> or
> {code:java}
> while (true) {
>   doSomething(deadline.timeLeftIfAny());
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15922) Show "Warn - received late message for checkpoint" only when checkpoint actually expired

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15922:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Show "Warn - received late message for checkpoint" only when checkpoint 
> actually expired
> 
>
> Key: FLINK-15922
> URL: https://issues.apache.org/jira/browse/FLINK-15922
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.10.0
>Reporter: Stephan Ewen
>Priority: Minor
>  Labels: auto-deprioritized-major, usability
> Fix For: 1.15.0
>
>
> The message "Warn - received late message for checkpoint" is shown frequently 
> in the logs, also when a checkpoint was purposefully canceled.
> In those case, this message is unhelpful and misleading.
> We should log this only when the checkpoint is actually expired.
> Meaning that when receiving the message, we check if we have an expired 
> checkpoint for that ID. If yes, we log that message, if not, we simply drop 
> the message.
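> A sketch of the proposed guard; the field and method names are illustrative, 
> not the actual CheckpointCoordinator bookkeeping:
> {code:java}
> import java.util.Set;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
>
> class LateMessageGuard {
>     private static final Logger LOG = LoggerFactory.getLogger(LateMessageGuard.class);
>     private final Set<Long> recentlyExpiredCheckpointIds;
>
>     LateMessageGuard(Set<Long> recentlyExpiredCheckpointIds) {
>         this.recentlyExpiredCheckpointIds = recentlyExpiredCheckpointIds;
>     }
>
>     void onLateAcknowledge(long checkpointId) {
>         if (recentlyExpiredCheckpointIds.contains(checkpointId)) {
>             LOG.warn("Received late message for expired checkpoint {}.", checkpointId);
>         }
>         // otherwise the checkpoint was canceled on purpose: drop it silently
>     }
> }
> {code}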



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15018) Add event time page to Concepts

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15018:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Add event time page to Concepts
> ---
>
> Key: FLINK-15018
> URL: https://issues.apache.org/jira/browse/FLINK-15018
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Seth Wiesman
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.10.4, 1.11.5, 1.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should disentangle the conceptual overview of event time from the 
> DataStream documentation so that it can be used by DataStream and SQL users. 
> This has the added benefit of making the page more stable as it will not have 
> to be rewritten or touched every time an API is changed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17372) SlotManager should expose total required resources

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-17372:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> SlotManager should expose total required resources
> --
>
> Key: FLINK-17372
> URL: https://issues.apache.org/jira/browse/FLINK-17372
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.11.0
>Reporter: Till Rohrmann
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, the {{SlotManager}} exposes the set of required resources which 
> have not been fulfilled via {{SlotManager.getRequiredResources}}. The idea of 
> this function is to allow the {{ResourceManager}} to decide whether new 
> pods/containers need to be started or not.
> The problem is that once a resource has been registered at the 
> {{SlotManager}} it will decrease the set of required resources. If now a 
> pod/container fails, then the {{ResourceManager}} won't know whether it needs 
> to restart the container or not.
> In order to simplify the interaction, I propose to let the {{SlotManager}} 
> announce all of its required resources (pending + registered resources). That 
> way the {{ResourceManager}} only needs to compare the set of required 
> resources with the set of pending and allocated containers/pods.
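> The arithmetic of the proposal in a nutshell; all names are made up for 
> illustration:
> {code:java}
> public class RequiredResourcesSketch {
>     /** How many containers the ResourceManager still needs to request. */
>     static int containersToRequest(
>             int pendingSlotRequests, int registeredSlots,      // SlotManager's view
>             int pendingContainers, int allocatedContainers) {  // ResourceManager's view
>         int announcedRequired = pendingSlotRequests + registeredSlots;
>         int available = pendingContainers + allocatedContainers;
>         return Math.max(0, announcedRequired - available);
>     }
> }
> {code}
> With this, a failed container simply drops out of the allocated count and the 
> difference becomes positive again, so the ResourceManager knows to restart it.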



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15859) Unify identifiers in the interface methods of CatalogManager

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15859:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Unify identifiers in the interface methods of CatalogManager
> 
>
> Key: FLINK-15859
> URL: https://issues.apache.org/jira/browse/FLINK-15859
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> We're not being consistent about the type of identifier that the 
> FunctionCatalog/CatalogManager accepts.
> Some methods accept {{UnresolvedIdentifier}}, e.g. 
> {{FunctionCatalog#registerTemporaryCatalogFunction}} and 
> {{CatalogManager#dropTemporaryView}}. 
> Some accept the resolved {{ObjectIdentifier}}, e.g. 
> {{CatalogManager#createTemporaryTable}} and {{CatalogManager#createTable}}.
> I am not sure which one we should prefer. If we go with 
> {{UnresolvedIdentifier}}, the benefit is that we always qualify it in a 
> {{Catalog*}}. The downside is that we would use {{UnresolvedIdentifier}} in 
> {{*Operations}} (e.g. {{CreateTableOperation}}), whereas we said that 
> all Operations should be fully resolved...
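> The two flavors side by side, as a sketch (assuming 
> {{CatalogManager#qualifyIdentifier}}, which resolves against the current 
> catalog and database):
> {code:java}
> import org.apache.flink.table.catalog.CatalogManager;
> import org.apache.flink.table.catalog.ObjectIdentifier;
> import org.apache.flink.table.catalog.UnresolvedIdentifier;
>
> class IdentifierSketch {
>     static ObjectIdentifier resolve(CatalogManager catalogManager) {
>         UnresolvedIdentifier raw = UnresolvedIdentifier.of("myTable");
>         // the result carries catalog, database and object name explicitly
>         return catalogManager.qualifyIdentifier(raw);
>     }
> }
> {code}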



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16023) jdbc connector's 'connector.table' property should be optional rather than required

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16023:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> jdbc connector's 'connector.table' property should be optional rather than 
> required
> ---
>
> Key: FLINK-16023
> URL: https://issues.apache.org/jira/browse/FLINK-16023
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Reporter: Bowen Li
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> The JDBC connector's 'connector.table' property should be optional rather 
> than required.
> The connector should assume the table name in the DBMS is the same as the one 
> in Flink when this property is not present.
> The fundamental reason is that this design didn't consider integration with 
> catalogs. Once catalogs are introduced, the Flink table's name should simply 
> be the table's name in the corresponding external system; see the sketch 
> below.
> cc [~ykt836]
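> A sketch of the proposed defaulting; the property key is the real one, the 
> surrounding names are illustrative:
> {code:java}
> import java.util.Optional;
>
> class TableNameDefaulting {
>     /** Falls back to the Flink table's own name when 'connector.table' is absent. */
>     static String resolveTableName(Optional<String> connectorTableProperty, String flinkTableName) {
>         return connectorTableProperty.orElse(flinkTableName);
>     }
> }
> {code}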



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-12273) The default value of CheckpointRetentionPolicy should be RETAIN_ON_FAILURE, otherwise it cannot be recovered according to checkpoint after failure.

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-12273:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> The default value of CheckpointRetentionPolicy should be RETAIN_ON_FAILURE, 
> otherwise it cannot be recovered according to checkpoint after failure.
> ---
>
> Key: FLINK-12273
> URL: https://issues.apache.org/jira/browse/FLINK-12273
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.6.3, 1.6.4, 1.7.2, 1.8.0
>Reporter: Mr.Nineteen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The default value of CheckpointRetentionPolicy should be RETAIN_ON_FAILURE, 
> otherwise it cannot be recovered according to checkpoint after failure.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15553) Create table ddl support comment after computed column

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15553:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Create table ddl support  comment after computed column
> ---
>
> Key: FLINK-15553
> URL: https://issues.apache.org/jira/browse/FLINK-15553
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: hailong wang
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> For now, we can define a computed column in the CREATE TABLE DDL, but we 
> cannot add a comment after it as we can for a regular table column. We should 
> support this; its grammar is as follows:
> {code:java}
> col_name AS expr [COMMENT 'string']
> {code}
> My idea is that we can introduce a class {{SqlTableComputedColumn}} to wrap 
> the name, expression and comment, and then simply read these elements from 
> it.
> As for parserImpls.ftl, it could look as follows:
> {code:java}
> identifier = SimpleIdentifier()
> <AS>
> expr = Expression(ExprContext.ACCEPT_NON_QUERY)
> [ <COMMENT> <QUOTED_STRING> {
>     String p = SqlParserUtil.parseString(token.image);
>     comment = SqlLiteral.createCharString(p, getPos());
> } ]
> {
>     SqlTableComputedColumn tableComputedColumn =
>         new SqlTableComputedColumn(identifier, expr, comment, getPos());
>     context.columnList.add(tableComputedColumn);
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15585) Improve function identifier string in plan digest

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15585:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Improve function identifier string in plan digest
> -
>
> Key: FLINK-15585
> URL: https://issues.apache.org/jira/browse/FLINK-15585
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we are using {{UserDefinedFunction#functionIdentifier}} as the 
> identifier string of UDFs in plan digest, for example: 
> {code:java}
> LogicalTableFunctionScan(invocation=[org$apache$flink$table$planner$utils$TableFunc1$8050927803993624f40152a838c98018($2)],
>  rowType=...)
> {code}
> However, the result of {{UserDefinedFunction#functionIdentifier}} will change 
> if we merely add a method to UserDefinedFunction, because it uses Java 
> serialization. Then we have to update 60 plan tests, which is very annoying. 
> On the other hand, displaying the function identifier string in the operator 
> name in the Web UI is verbose for users. 
> To improve this situation, there are a few things we can do:
> 1) If the UDF has a catalog function name, just use the catalog name as the 
> digest. Otherwise, fall back to (2). 
> 2) If the UDF doesn't contain fields, just use the full class name as the 
> digest. Otherwise, fall back to (3).
> 3) Use the identifier string, which performs the full serialization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-13698:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.10.0
>Reporter: Piotr Nowojski
>Priority: Minor
>  Labels: auto-deprioritized-critical, auto-deprioritized-major
> Fix For: 1.15.0
>
>
> Currently {{CheckpointCoordinator}} and {{CheckpointFailureManager}} code is 
> executed by multiple different threads (mostly {{ioExecutor}}, but not only). 
> It's causing multiple concurrency issues, for example: 
> https://issues.apache.org/jira/browse/FLINK-13497
> The proper fix would be to rethink the threading model there. At first 
> glance, this code doesn't seem to need multiple threads, except for the parts 
> doing the actual IO operations, so it should be possible to run everything in 
> the single ExecutionGraph thread and just run the necessary IO operations 
> asynchronously with some feedback loop ("mailbox style").
> I would strongly recommend fixing this issue before adding new features to 
> the {{CheckpointCoordinator}} component.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-11868) [filesystems] Introduce listStatusIterator API to file system

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-11868:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> [filesystems] Introduce listStatusIterator API to file system
> -
>
> Key: FLINK-11868
> URL: https://issues.apache.org/jira/browse/FLINK-11868
> Project: Flink
>  Issue Type: New Feature
>  Components: FileSystems
>Reporter: Yun Tang
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
>
> From experience, we know {{listStatus}} is expensive on many distributed file 
> systems, especially when the folder contains many files. This method not only 
> blocks the thread until the result is returned but can also cause OOM because 
> the returned array of {{FileStatus}} can be really large. We should have 
> learned this from FLINK-7266 and FLINK-8540.
> However, listing the file status under a path is really helpful in many 
> situations. Thankfully, many distributed file systems noticed that and 
> provide APIs such as 
> {{[listStatusIterator|https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#listStatusIterator(org.apache.hadoop.fs.Path)]}}
>  to call the file system on demand.
> We should introduce this API as well and replace the current implementation 
> that uses {{listStatus}}.
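> A sketch of iterator-based listing with Hadoop's API, which fetches entries 
> on demand instead of materializing one huge array (the method name is the 
> real Hadoop one; the surrounding code is illustrative):
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.RemoteIterator;
>
> class ListingSketch {
>     static long countEntries(FileSystem fs, Path dir) throws IOException {
>         long n = 0;
>         RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
>         while (it.hasNext()) {
>             FileStatus status = it.next(); // fetched on demand, memory stays bounded
>             n++;
>         }
>         return n;
>     }
> }
> {code}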



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16701) Elasticsearch sink support alias for indices.

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16701:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Elasticsearch sink support alias for indices.
> -
>
> Key: FLINK-16701
> URL: https://issues.apache.org/jira/browse/FLINK-16701
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.11.0
>Reporter: Leonard Xu
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> This is related to 
> [FLINK-15400|https://issues.apache.org/jira/browse/FLINK-15400]. FLINK-15400 
> will only support dynamic indices and will not support aliases. Because alias 
> support is needed in both the Streaming API and the Table API, I think 
> splitting the original design into two PRs makes sense.
> PR for FLINK-15400:
>         support dynamic index for ElasticsearchTableSink
> PR for this issue:
>         support alias for the Streaming API and Table API



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16027) kafka connector's 'connector.topic' property should be optional rather than required

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16027:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> kafka connector's 'connector.topic' property should be optional rather than 
> required
> 
>
> Key: FLINK-16027
> URL: https://issues.apache.org/jira/browse/FLINK-16027
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Reporter: Bowen Li
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15973) Optimize the execution plan where it refers the Python UDF result field in the where clause

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15973:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Optimize the execution plan where it refers the Python UDF result field in 
> the where clause
> ---
>
> Key: FLINK-15973
> URL: https://issues.apache.org/jira/browse/FLINK-15973
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: Dian Fu
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
>
> For the following job:
> {code}
> t_env.register_function("inc", inc)
> table.select("inc(id) as inc_id") \
>  .where("inc_id > 0") \
>  .insert_into("sink")
> {code}
> The execution plan is as follows:
> {code}
> StreamExecPythonCalc(select=[inc(f0) AS inc_id])
> +- StreamExecCalc(select=[id AS f0], where=[>(f0, 0)])
>    +- StreamExecPythonCalc(select=[id, inc(f0) AS f0])
>       +- StreamExecCalc(select=[id, id AS f0])
>          +- StreamExecTableSourceScan(fields=[id])
> {code}
> This plan is not optimal. It could be optimized as follows:
> {code}
> StreamExecCalc(select=[f0], where=[>(f0, 0)])
> +- StreamExecPythonCalc(select=[inc(f0) AS f0])
>    +- StreamExecCalc(select=[id, id AS f0])
>       +- StreamExecTableSourceScan(fields=[id])
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15860) Store temporary functions as CatalogFunctions in FunctionCatalog

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15860:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Store temporary functions as CatalogFunctions in FunctionCatalog
> 
>
> Key: FLINK-15860
> URL: https://issues.apache.org/jira/browse/FLINK-15860
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> We should change the {{FunctionCatalog}} so that it stores temporary 
> functions as {{CatalogFunction}}s instead of instances of 
> {{FunctionDefinition}} the same way we store {{CatalogTable}}s for temporary 
> tables.
> For functions that were registered with their instance we should create a 
> {{CatalogFunction}} wrapper similar to {{ConnectorCatalogTable}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14379) Supplement documentation about raw state

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14379:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Supplement documentation about raw state
> 
>
> Key: FLINK-14379
> URL: https://issues.apache.org/jira/browse/FLINK-14379
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Runtime / State Backends
>Reporter: Yun Tang
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
>
> Currently, we only have very simple 
> [documentation|https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#raw-and-managed-state]
>  about raw state, or one could even say only one sentence. It might leave 
> Flink beginners unclear about this concept, so I think we should supplement 
> the documentation about raw state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-12590) Replace http links in documentation

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-12590:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Replace http links in documentation
> ---
>
> Key: FLINK-12590
> URL: https://issues.apache.org/jira/browse/FLINK-12590
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.7.2, 1.8.0, 1.9.0
>Reporter: Chesnay Schepler
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17784) Better detection for parquet and orc in hive

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-17784:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Better detection for parquet and orc in hive
> 
>
> Key: FLINK-17784
> URL: https://issues.apache.org/jira/browse/FLINK-17784
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16024) support filter pushdown in jdbc connector

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16024:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> support filter pushdown in jdbc connector
> -
>
> Key: FLINK-16024
> URL: https://issues.apache.org/jira/browse/FLINK-16024
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Reporter: Bowen Li
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13876) Remove ExecutionConfig field from PojoSerializer

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-13876:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Remove ExecutionConfig field from PojoSerializer
> 
>
> Key: FLINK-13876
> URL: https://issues.apache.org/jira/browse/FLINK-13876
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Type Serialization System
>Affects Versions: 1.7.2, 1.8.1, 1.9.0, 1.10.0
>Reporter: Dawid Wysakowicz
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> The PojoSerializer stores an instance of ExecutionConfig as an internal 
> field, even though the only information it may ever need is the registered 
> Kryo serializers.
> This has a few drawbacks:
> * It blocks the evolution of {{ExecutionConfig}}, as serializers were stored 
> in state. Therefore any change to ExecutionConfig must be backwards 
> compatible with respect to Java serialization.
> * It probably already introduced a bug, as upon restore the snapshot actually 
> recreates the serializer with an empty ExecutionConfig (see 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializerSnapshot#restoreSerializer).
> I suggest removing the field completely and adjusting the corresponding 
> usages.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14500) Support Flink Python User-Defined Stateless Function for Table - Phase 2

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14500:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Support Flink Python User-Defined Stateless Function for Table - Phase 2
> 
>
> Key: FLINK-14500
> URL: https://issues.apache.org/jira/browse/FLINK-14500
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python
>Reporter: Dian Fu
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
>
> This is the umbrella Jira tracking the functionalities of "Python 
> User-Defined Stateless Function for Table" that are planned for 1.11, such as 
> Docker mode support, user-defined metrics support, Arrow support, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15003) Improve hive test performance

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15003:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Improve hive test performance
> -
>
> Key: FLINK-15003
> URL: https://issues.apache.org/jira/browse/FLINK-15003
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> Our Hive tests are currently too slow. As a result, we do not cover all 
> formats well.
> We can just use the embedded mode to improve test performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17589) Extend StreamingFileSink to Support Streaming Hive Connector

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-17589:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Extend StreamingFileSink to Support Streaming Hive Connector
> 
>
> Key: FLINK-17589
> URL: https://issues.apache.org/jira/browse/FLINK-17589
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / FileSystem
>Reporter: Yun Gao
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
>
> According to the [discussion on 
> FLIP-115|https://lists.apache.org/thread.html/rb1795428e481dbeaa1af2dcffed87b4804ed81d2af60256cd50032d7%40%3Cdev.flink.apache.org%3E],
>  we will use the StreamingFileSink to support the streaming Hive connector. 
> This requires some extensions to the current StreamingFileSink:
>  # Support a path-based writer that writes to a specified Hadoop path.
>  # Provide listeners that collect the bucket state, so that the Hive sink can 
> decide when a bucket is terminated and then write meta-info to Hive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16531) Add full integration tests for "GROUPING SETS" for streaming mode

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16531:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Add full integration tests for "GROUPING SETS" for streaming mode
> -
>
> Key: FLINK-16531
> URL: https://issues.apache.org/jira/browse/FLINK-16531
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Table SQL / Planner
>Reporter: Jark Wu
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> We have a plan test for GROUPING SETS for streaming mode, i.e. 
> {{GroupingSetsTest}}. But we should also have a full IT coverage for it, just 
> like batch's 
> {{org.apache.flink.table.planner.runtime.batch.sql.agg.GroupingSetsITCase}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15378:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> StreamFileSystemSink supported mutil hdfs plugins.
> --
>
> Key: FLINK-15378
> URL: https://issues.apache.org/jira/browse/FLINK-15378
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.9.2, 1.10.0
>Reporter: ouyangwulin
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
> Fix For: 1.15.0
>
> Attachments: jobmananger.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [As reported on the mailing 
> list|https://lists.apache.org/thread.html/7a6b1e341bde0ef632a82f8d46c9c93da358244b6bac0d8d544d11cb%40%3Cuser.flink.apache.org%3E]:
> Request 1: FileSystem plugins should not affect the default YARN dependencies.
> Request 2: The StreamingFileSink should support multiple HDFS plugins under 
> the same scheme.
> Problem description:
>     When I put a filesystem plugin into FLINK_HOME/plugins, and the class 
> '*com.filesystem.plugin.FileSystemFactoryEnhance*' implements 
> '*FileSystemFactory*', then on startup the JM calls 
> FileSystem.initialize(configuration, 
> PluginUtils.createPluginManagerFromRootFolder(configuration)) to load the 
> factories into the map FileSystem#FS_FACTORIES, whose key is only the scheme. 
> The TM/JM use the local Hadoop conf A, while the user code in the filesystem 
> plugin uses Hadoop conf B; conf A and conf B point to different Hadoop 
> clusters. The JM then fails to start, because the blob server in the JM loads 
> conf B to get the filesystem. The full log is attached.
> Proposed resolution:
>     Use the scheme plus a specific identifier as the key for 
> FileSystem#FS_FACTORIES.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16548) Expose consistent environment variable to identify the component name and resource id of jm/tm

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16548:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Expose consistent environment variable to identify the component name and 
> resource id of jm/tm
> --
>
> Key: FLINK-16548
> URL: https://issues.apache.org/jira/browse/FLINK-16548
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Reporter: hejianchao
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> We propose to expose environment variables that identify the component name 
> and resource id of the JM/TM.
> Specifically:
> - Expose {{FLINK_COMPONENT_NAME}}. For the JM it should be "jobmanager"; for 
> the TM it should be "taskexecutor".
> - Expose {{FLINK_COMPONENT_ID}}. For the JM/TM it should be the resource id.
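> A minimal sketch of how user code could read the proposed variables once they 
> are exposed (the variables are not set by current Flink releases):
> {code:java}
> public class ComponentInfo {
>     public static void main(String[] args) {
>         // Proposed variables as described above.
>         String name = System.getenv("FLINK_COMPONENT_NAME"); // "jobmanager" or "taskexecutor"
>         String id = System.getenv("FLINK_COMPONENT_ID");     // resource id of this JM/TM
>         System.out.println("Running inside " + name + " (" + id + ")");
>     }
> }
> {code}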



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16323) Support to join a static table in streaming mode

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16323:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Support to join a static table in streaming mode
> 
>
> Key: FLINK-16323
> URL: https://issues.apache.org/jira/browse/FLINK-16323
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> Currently, we already support joining a stream with a bounded stream using a 
> regular join. However, this is translated into a stream-stream join, which is 
> not efficient, because it outputs early results that may be retracted 
> afterwards. 
> A better and more native approach would be a special temporal join operator 
> that blocks the streaming side until all the static table data is loaded. 
> This can help users join a huge table, e.g. a MySQL table with billions of 
> records that changes little.
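> The kind of query this targets, for illustration (table and field names are 
> made up):
> {code:java}
> // Today this regular join becomes a stream-stream join with retractions.
> // The proposed operator would first load the bounded "products" table fully,
> // then stream "orders" against it without early or retracted results.
> String query =
>     "SELECT o.order_id, p.name "
>         + "FROM orders AS o JOIN products AS p ON o.product_id = p.id";
> {code}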



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15000) WebUI Metrics is very slow in large parallelism

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15000:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> WebUI Metrics is very slow in large parallelism
> ---
>
> Key: FLINK-15000
> URL: https://issues.apache.org/jira/browse/FLINK-15000
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.9.0, 1.9.1
>Reporter: fa zheng
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Metrics in the web UI are very slow when the parallelism is huge: it is hard 
> to add a metric and to choose one. I ran the TopSpeedWindowing example with 
> this command:
> {code:java}
> flink run -m yarn-cluster -p 1200 examples/streaming/TopSpeedWindowing.jar
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15460) planner dependencies won't be necessary for JDBC connector

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15460:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> planner dependencies won't be necessary for JDBC connector
> --
>
> Key: FLINK-15460
> URL: https://issues.apache.org/jira/browse/FLINK-15460
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Affects Versions: 1.10.0
>Reporter: Zhenghua Gao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Remove the planner dependencies from the JDBC connector by changing their scope to test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14867) Move TextInputFormat & TextOutputFormat to flink-core

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-14867:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Move TextInputFormat & TextOutputFormat to flink-core
> -
>
> Key: FLINK-14867
> URL: https://issues.apache.org/jira/browse/FLINK-14867
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataSet, API / DataStream
>Reporter: Zili Chen
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is one step toward removing the dependency of flink-streaming-java on 
> flink-java. We already have a package {{o.a.f.core.io}} for these formats.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16775) expose FlinkKafkaConsumer/FlinkKafkaProducer Properties for other systems

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16775:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> expose FlinkKafkaConsumer/FlinkKafkaProducer Properties for other systems
> 
>
> Key: FLINK-16775
> URL: https://issues.apache.org/jira/browse/FLINK-16775
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.10.0
>Reporter: jackylau
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
>
> I want to expose the Properties of FlinkKafkaConsumer/FlinkKafkaProducer, 
> e.g. via a getProperties() accessor, so that other systems such as Atlas can 
> consume them; I think this is needed.
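> A minimal sketch of the proposed accessor (hypothetical; not part of the 
> current connector API):
> {code:java}
> import java.util.Properties;
> 
> // Sketch of the shape of a patch to FlinkKafkaConsumer/FlinkKafkaProducer:
> // a read-only view of the client configuration for systems such as Atlas.
> public class KafkaConnectorProperties {
>     private final Properties properties;
> 
>     public KafkaConnectorProperties(Properties properties) {
>         this.properties = properties;
>     }
> 
>     public Properties getProperties() {
>         Properties copy = new Properties();
>         copy.putAll(properties); // defensive copy; callers cannot mutate the connector
>         return copy;
>     }
> }
> {code}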



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15012) Checkpoint directory not cleaned up

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15012:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Checkpoint directory not cleaned up
> ---
>
> Key: FLINK-15012
> URL: https://issues.apache.org/jira/browse/FLINK-15012
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.9.1
>Reporter: Nico Kruber
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I started a Flink cluster with 2 TMs using {{start-cluster.sh}} and the 
> following config (in addition to the default {{flink-conf.yaml}})
> {code:java}
> state.checkpoints.dir: file:///path/to/checkpoints/
> state.backend: rocksdb {code}
> After submitting a job with checkpoints enabled (every 5s), checkpoints show 
> up, e.g.
> {code:java}
> bb969f842bbc0ecc3b41b7fbe23b047b/
> ├── chk-2
> │   ├── 238969e1-6949-4b12-98e7-1411c186527c
> │   ├── 2702b226-9cfc-4327-979d-e5508ab2e3d5
> │   ├── 4c51cb24-6f71-4d20-9d4c-65ed6e826949
> │   ├── e706d574-c5b2-467a-8640-1885ca252e80
> │   └── _metadata
> ├── shared
> └── taskowned {code}
> If I shut down the cluster via {{stop-cluster.sh}}, these files will remain 
> on disk and not be cleaned up.
> In contrast, if I cancel the job, at least {{chk-2}} will be deleted, but 
> still leaving the (empty) directories.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16609) Promotes the column name representation of SQL-CLI

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16609:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Promotes the column name representation of SQL-CLI
> --
>
> Key: FLINK-16609
> URL: https://issues.apache.org/jira/browse/FLINK-16609
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.10.0
>Reporter: Danny Chen
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned
> Fix For: 1.15.0
>
>
> The SQL-CLI currently outputs the column name as it comes from the plan, 
> which is sometimes not that readable. I can think of 2 cases that can be 
> improved:
> * An expression like "a + b" should be displayed as "a + b" (as the MySQL CLI 
> does) instead of "$Expr{index}".
> * We should always output the alias if it is there; currently the alias in 
> the plan may be dropped for 2 reasons:
> 1. The project-remove rule removes the project without considering the 
> alias.
> 2. After CALCITE-3713, some CALC/PROJ nodes are reused while ignoring the 
> column alias.
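> For illustration (a hypothetical session, column header names as described 
> above):
> {code:java}
> // Desired: the CLI header shows the alias or the original expression text,
> //   SELECT a + b AS total FROM t  =>  column header "total" (or "a + b"),
> // rather than a generated plan name like "$Expr0".
> String query = "SELECT a + b AS total FROM t";
> {code}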



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17362) Improve table examples to reflect latest status

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-17362:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Improve table examples to reflect latest status
> ---
>
> Key: FLINK-17362
> URL: https://issues.apache.org/jira/browse/FLINK-17362
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Reporter: Kurt Young
>Priority: Minor
> Fix For: 1.15.0
>
>
> Currently the table examples seem outdated, especially now that the Blink 
> planner has become the default choice. We might need to refactor the 
> structure of all examples and cover the following items:
>  # streaming sql & table api examples
>  # batch sql & table api examples
>  # table/sql & datastream interoperation
>  # table/sql & dataset interoperation
>  # DDL & DML examples



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17011) Introduce builder to create AbstractStreamOperatorTestHarness for testing

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-17011:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Introduce builder to create AbstractStreamOperatorTestHarness for testing
> -
>
> Key: FLINK-17011
> URL: https://issues.apache.org/jira/browse/FLINK-17011
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Tests
>Affects Versions: 1.10.0
>Reporter: Yun Tang
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current {{AbstractStreamOperatorTestHarness}} lacks a builder, which 
> forces us to create more and more constructors. Moreover, to set a customized 
> component we may have to call {{AbstractStreamOperatorTestHarness#setup}}, 
> which should be treated as a deprecated interface, before use.
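> A sketch of the builder shape this proposes (names are illustrative, not a 
> final API; myOperator stands for any operator under test):
> {code:java}
> // Hypothetical fluent builder replacing the growing constructor overloads
> // and the deprecated setup() call.
> AbstractStreamOperatorTestHarness<String> harness =
>     AbstractStreamOperatorTestHarnessBuilder.<String>newBuilder()
>         .setOperator(myOperator)
>         .setMaxParallelism(128)
>         .setSubtaskIndex(0)
>         .build();
> {code}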



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15845) Add java-based StreamingFileSink test using s3

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-15845:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Add java-based StreamingFileSink test using s3
> --
>
> Key: FLINK-15845
> URL: https://issues.apache.org/jira/browse/FLINK-15845
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Tests
>Affects Versions: 1.11.0
>Reporter: Arvid Heise
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> StreamingFileSink is covered by a few shell-based e2e tests but no Java-based 
> tests. To reach our long-term goal of replacing shell-based tests, this issue 
> will provide a first version that runs on minio/aws and aims to 
> extend/replace `test_streaming_file_sink.sh`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16774) expose HBaseUpsertSinkFunction hTableName and schema for other systems

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16774:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> expose HBaseUpsertSinkFunction hTableName and schema for other systems
> -
>
> Key: FLINK-16774
> URL: https://issues.apache.org/jira/browse/FLINK-16774
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / HBase
>Affects Versions: 1.10.0
>Reporter: jackylau
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.15.0
>
> Attachments: flink-atlas.pdf
>
>
> I want to expose the hTableName and schema of HBaseUpsertSinkFunction, e.g. 
> via getTableName and getTableSchema accessors, so that other systems such as 
> Atlas can consume them; I think this is needed.
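> A minimal sketch of the proposed accessors (hypothetical; the real field 
> types would follow the connector's internals):
> {code:java}
> // Sketch: read-only metadata accessors on the HBase upsert sink so external
> // catalogs such as Atlas can record lineage.
> public class HBaseSinkMetadata {
>     private final String hTableName;
>     private final String schemaDescription; // stand-in for the connector's schema type
> 
>     public HBaseSinkMetadata(String hTableName, String schemaDescription) {
>         this.hTableName = hTableName;
>         this.schemaDescription = schemaDescription;
>     }
> 
>     public String getTableName() {
>         return hTableName;
>     }
> 
>     public String getTableSchema() {
>         return schemaDescription;
>     }
> }
> {code}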



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16799) add hive partition limit when read from hive

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16799:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> add hive partition limit when read from hive
> 
>
> Key: FLINK-16799
> URL: https://issues.apache.org/jira/browse/FLINK-16799
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Jun Zhang
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add a partition limit when reading from Hive: a query will not be executed if 
> it attempts to fetch more partitions per table than the configured limit. 
> This avoids accidental full table scans.
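> A minimal sketch of the intended guard (the option and method names are 
> hypothetical):
> {code:java}
> import java.util.List;
> 
> // Hypothetical check before executing a Hive scan: fail fast rather than
> // silently scanning every partition of the table.
> static void checkPartitionLimit(List<String> matchedPartitions, int limit) {
>     if (limit >= 0 && matchedPartitions.size() > limit) {
>         throw new IllegalStateException(
>             "Query would read " + matchedPartitions.size()
>                 + " partitions, exceeding the configured limit of " + limit);
>     }
> }
> {code}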



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-17642) Exception while reading broken ORC file is hidden

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-17642:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Exception while reading broken ORC file is hidden
> -
>
> Key: FLINK-17642
> URL: https://issues.apache.org/jira/browse/FLINK-17642
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / ORC
>Affects Versions: 1.8.3, 1.9.3, 1.10.1
>Reporter: Nikola
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 1.10.4, 1.11.5, 1.15.0
>
>
> I have a simple setup of a batch job like this:
> {code:java}
> BatchTableEnvironment tableEnvFirst = BatchTableEnvironment.create(env);
> OrcTableSource orcTableSource = OrcTableSource.builder()
>  .path("path", true)
>  .forOrcSchema(ORC.getSchema())
>  .withConfiguration(hdfsConfig)
>  .build();
> tableEnvFirst.registerTableSource("table", orcTableSource);
> Table nnfTable = tableEnvFirst.sqlQuery(sqlString);
> return tableEnvFirst.toDataSet(nnfTable, Row.class);{code}
>   
> And that works just fine to fetch ORC files from HDFS as a DataSet.
> However, there are some ORC files which are broken. "Broken" means that they 
> are invalid in some way and cannot be processed / fetched normally. They throw 
> exceptions. Examples of those are:
> {code:java}
> org.apache.orc.FileFormatException: Malformed ORC file /user/hdfs/orcfile-1 
> Invalid postscript length 2 
> at org.apache.orc.impl.ReaderImpl.ensureOrcFooter(ReaderImpl.java:258) 
> at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:562) 
> at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370) 
> at org.apache.orc.OrcFile.createReader(OrcFile.java:342) 
> at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:225) 
> at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:63) 
> at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
>  
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) 
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) 
> at java.lang.Thread.run(Thread.java:748){code}
>  
> {code:java}
> com.google.protobuf.InvalidProtocolBufferException: Protocol message 
> contained an invalid tag (zero). 
> at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
>  
> at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108) 
> at org.apache.orc.OrcProto$PostScript.<init>(OrcProto.java:18526) 
> at org.apache.orc.OrcProto$PostScript.<init>(OrcProto.java:18490) 
> at org.apache.orc.OrcProto$PostScript$1.parsePartialFrom(OrcProto.java:18628) 
> at org.apache.orc.OrcProto$PostScript$1.parsePartialFrom(OrcProto.java:18623) 
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:89) 
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:95) 
> at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) 
> at org.apache.orc.OrcProto$PostScript.parseFrom(OrcProto.java:19022) 
> at org.apache.orc.impl.ReaderImpl.extractPostScript(ReaderImpl.java:436) 
> at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:564) 
> at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370) 
> at org.apache.orc.OrcFile.createReader(OrcFile.java:342) 
> at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:225) 
> at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:63) 
> at 
> org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:173)
>  
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) 
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) 
> at java.lang.Thread.run(Thread.java:748){code}
>   
> Given that some specific files are broken, it is OK that an exception is 
> thrown. However, the issue is that I cannot catch those exceptions, and they 
> make my job fail. I tried to wrap everything in a try-catch block just to see 
> what I can catch and handle, but it seems that when Flink runs it, it is not 
> run from that place, but rather from DataSourceTask.invoke().
> I dug a little bit to find out why I don't get an exception, and I can see 
> that {{OrcTableSource}} creates an {{OrcRowInputFormat}} instance 
> [here|#L157] which then calls open(), and open() has this signature:
> {code:java}
> public void open(FileInputSplit fileSplit) throws IOException
> {code}
> So open() throws the exception, but I am not able to catch it.
> Is what I am doing correct, or is there any other way to handle an exception 
> coming from DataSourceTask.invoke()? In general my goal would be to ignore 
> all broken/corrupted ORC files, but that does not seem to be possible.
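> One possible workaround, sketched under the assumption that one can subclass 
> the input format and skip splits whose footer is corrupt (the constructor 
> signature may need adjusting to the actual OrcRowInputFormat):
> {code:java}
> import java.io.IOException;
> import org.apache.flink.core.fs.FileInputSplit;
> import org.apache.flink.orc.OrcRowInputFormat;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.orc.TypeDescription;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> 
> // Sketch: treat a split as "empty" when its ORC footer cannot be parsed,
> // instead of failing the whole DataSourceTask.
> public class TolerantOrcRowInputFormat extends OrcRowInputFormat {
>     private static final Logger LOG = LoggerFactory.getLogger(TolerantOrcRowInputFormat.class);
>     private transient boolean splitBroken;
> 
>     public TolerantOrcRowInputFormat(String path, TypeDescription schema, Configuration conf, int batchSize) {
>         super(path, schema, conf, batchSize);
>     }
> 
>     @Override
>     public void open(FileInputSplit split) throws IOException {
>         try {
>             super.open(split);
>             splitBroken = false;
>         } catch (IOException e) {
>             LOG.warn("Skipping corrupt ORC split {}", split, e);
>             splitBroken = true;
>         }
>     }
> 
>     @Override
>     public boolean reachedEnd() throws IOException {
>         return splitBroken || super.reachedEnd();
>     }
> }
> {code}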



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-17755) Support side-output of expiring states with TTL.

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-17755:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Support side-output of expiring states with TTL.
> 
>
> Key: FLINK-17755
> URL: https://issues.apache.org/jira/browse/FLINK-17755
> Project: Flink
>  Issue Type: New Feature
>  Components: API / State Processor
>Reporter: Roey Shem Tov
>Priority: Minor
>  Labels: auto-deprioritized-major
> Fix For: 2.0.0, 1.15.0
>
>
> When we set a StateTtlConfig on a StateDescriptor, a record that has expired 
> is deleted from the state backend.
> I want to suggest a new feature: getting the expiring results as a side 
> output, so that we can process them and not just delete them.
> For example, if we have a ListState with TTL enabled, we could receive the 
> expiring records of the list as a side output.
> What do you think?
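> A sketch of how this could look (only the StateTtlConfig part exists today; 
> the side-output hook and the MyEvent type are hypothetical):
> {code:java}
> import org.apache.flink.api.common.state.StateTtlConfig;
> import org.apache.flink.api.common.time.Time;
> import org.apache.flink.util.OutputTag;
> 
> // Existing TTL setup:
> StateTtlConfig ttlConfig = StateTtlConfig
>     .newBuilder(Time.hours(1))
>     .cleanupFullSnapshot()
>     .build();
> 
> // Proposed addition: route expired entries to a side output instead of
> // silently deleting them (the method below does not exist yet).
> final OutputTag<String> expiredTag = new OutputTag<String>("expired-state") {};
> // ttlConfig = ttlConfig.withExpiredStateSideOutput(expiredTag);
> {code}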



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16762) Relocate Beam dependency of PyFlink

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16762:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Relocate Beam dependency of PyFlink
> -
>
> Key: FLINK-16762
> URL: https://issues.apache.org/jira/browse/FLINK-16762
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.10.0
>Reporter: sunjincheng
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-unassigned, 
> pull-request-available
> Fix For: 1.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Some users may already use Beam on their own cluster, which can cause 
> conflicts between the Beam jars shipped with PyFlink and the Beam jars of the 
> user's cluster. So, I would like to relocate the Beam dependency of PyFlink.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16952) Parquet file system format support filter pushdown

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-16952:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Parquet file system format support filter pushdown
> ---
>
> Key: FLINK-16952
> URL: https://issues.apache.org/jira/browse/FLINK-16952
> Project: Flink
>  Issue Type: Sub-task
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Jingsong Lee
>Priority: Major
>  Labels: auto-unassigned, pull-request-available
> Fix For: 1.15.0
>
>
> We can create the conversion between Flink Expression (NOTE: this should be 
> the new Expression instead of PlannerExpression) and Parquet FilterPredicate, 
> and apply it to the Parquet file system format.
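> A minimal sketch of the conversion target (Parquet's FilterApi is the real 
> Parquet API; the example expression and column names are illustrative):
> {code:java}
> import org.apache.parquet.filter2.predicate.FilterApi;
> import org.apache.parquet.filter2.predicate.FilterPredicate;
> import org.apache.parquet.io.api.Binary;
> 
> // e.g. the Flink expression  amount > 100 AND region = 'EU'  could map to:
> FilterPredicate pushed = FilterApi.and(
>     FilterApi.gt(FilterApi.intColumn("amount"), 100),
>     FilterApi.eq(FilterApi.binaryColumn("region"), Binary.fromString("EU")));
> {code}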



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

