[jira] [Commented] (FLINK-29427) LookupJoinITCase failed with classloader problem
[ https://issues.apache.org/jira/browse/FLINK-29427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649614#comment-17649614 ] Matthias Pohl commented on FLINK-29427:
---
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44084=logs=de826397-1924-5900-0034-51895f69d4b7=f311e913-93a2-5a37-acab-4a63e1328f94=17551

> LookupJoinITCase failed with classloader problem
>
> Key: FLINK-29427
> URL: https://issues.apache.org/jira/browse/FLINK-29427
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.16.0, 1.17.0
> Reporter: Huang Xingbo
> Assignee: Alexander Smirnov
> Priority: Blocker
> Labels: pull-request-available, test-stability
>
> {code:java}
> 2022-09-27T02:49:20.9501313Z Sep 27 02:49:20 Caused by: org.codehaus.janino.InternalCompilerException: Compiling "KeyProjection$108341": Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.
> 2022-09-27T02:49:20.9502654Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:382)
> 2022-09-27T02:49:20.9503366Z Sep 27 02:49:20 	at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:237)
> 2022-09-27T02:49:20.9504044Z Sep 27 02:49:20 	at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:465)
> 2022-09-27T02:49:20.9504704Z Sep 27 02:49:20 	at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:216)
> 2022-09-27T02:49:20.9505341Z Sep 27 02:49:20 	at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:207)
> 2022-09-27T02:49:20.9505965Z Sep 27 02:49:20 	at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80)
> 2022-09-27T02:49:20.9506584Z Sep 27 02:49:20 	at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:75)
> 2022-09-27T02:49:20.9507261Z Sep 27 02:49:20 	at org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:104)
> 2022-09-27T02:49:20.9507883Z Sep 27 02:49:20 	... 30 more
> 2022-09-27T02:49:20.9509266Z Sep 27 02:49:20 Caused by: java.lang.IllegalStateException: Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.
> 2022-09-27T02:49:20.9510835Z Sep 27 02:49:20 	at org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:184)
> 2022-09-27T02:49:20.9511760Z Sep 27 02:49:20 	at org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:192)
> 2022-09-27T02:49:20.9512456Z Sep 27 02:49:20 	at java.lang.Class.forName0(Native Method)
> 2022-09-27T02:49:20.9513014Z Sep 27 02:49:20 	at java.lang.Class.forName(Class.java:348)
> 2022-09-27T02:49:20.9513649Z Sep 27 02:49:20 	at org.codehaus.janino.ClassLoaderIClassLoader.findIClass(ClassLoaderIClassLoader.java:89)
> 2022-09-27T02:49:20.9514339Z Sep 27 02:49:20 	at org.codehaus.janino.IClassLoader.loadIClass(IClassLoader.java:312)
> 2022-09-27T02:49:20.9514990Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler.findTypeByName(UnitCompiler.java:8556)
> 2022-09-27T02:49:20.9515659Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:6749)
> 2022-09-27T02:49:20.9516337Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler.getReferenceType(UnitCompiler.java:6594)
> 2022-09-27T02:49:20.9516989Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler.getType2(UnitCompiler.java:6573)
> 2022-09-27T02:49:20.9517632Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler.access$13900(UnitCompiler.java:215)
> 2022-09-27T02:49:20.9518319Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler$22$1.visitReferenceType(UnitCompiler.java:6481)
> 2022-09-27T02:49:20.9519018Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler$22$1.visitReferenceType(UnitCompiler.java:6476)
> 2022-09-27T02:49:20.9519680Z Sep 27 02:49:20 	at org.codehaus.janino.Java$ReferenceType.accept(Java.java:3928)
> 2022-09-27T02:49:20.9520386Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6476)
> 2022-09-27T02:49:20.9521042Z Sep 27 02:49:20 	at org.codehaus.janino.UnitCompiler$22.visitType(UnitCompiler.java:6469)
> 2022-09-27T02:49:20.9521677Z Sep 27 02:49:20 	at
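The exception text above names its own escape hatch: the leaked-classloader check can be disabled via configuration while the underlying leak is investigated. A minimal flink-conf.yaml sketch (the option name is taken verbatim from the error message; disabling it is a workaround, not a fix):

```yaml
# Disables the "Trying to access closed classloader" safety check quoted in the
# stack trace above. Only use this while the classloader leak itself is being fixed.
classloader.check-leaked-classloader: false
```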
[jira] [Commented] (FLINK-29859) TPC-DS end-to-end test with adaptive batch scheduler failed due to non-empty .out files.
[ https://issues.apache.org/jira/browse/FLINK-29859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649613#comment-17649613 ] Matthias Pohl commented on FLINK-29859:
---
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44084=logs=f8e16326-dc75-5ba0-3e95-6178dd55bf6c=15c1d318-5ca8-529f-77a2-d113a700ec34=9253

> TPC-DS end-to-end test with adaptive batch scheduler failed due to non-empty .out files.
>
> Key: FLINK-29859
> URL: https://issues.apache.org/jira/browse/FLINK-29859
> Project: Flink
> Issue Type: Bug
> Components: Tests
> Affects Versions: 1.16.0, 1.17.0
> Reporter: Leonard Xu
> Priority: Major
> Labels: test-stability
>
> Nov 03 02:02:12 [FAIL] 'TPC-DS end-to-end test with adaptive batch scheduler' failed after 21 minutes and 44 seconds! Test exited with exit code 0 but the logs contained errors, exceptions or non-empty .out files
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42766=logs=ae4f8708-9994-57d3-c2d7-b892156e7812=af184cdd-c6d8-5084-0b69-7e9c67b35f7a

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Closed] (FLINK-29664) Collect subpartition sizes of BLOCKING result partitions
[ https://issues.apache.org/jira/browse/FLINK-29664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang closed FLINK-29664.
---
Resolution: Done

> Collect subpartition sizes of BLOCKING result partitions
>
> Key: FLINK-29664
> URL: https://issues.apache.org/jira/browse/FLINK-29664
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Lijie Wang
> Assignee: Lijie Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.17.0
>
> In order to divide subpartition ranges according to the amount of data, the scheduler needs to collect the size of each subpartition produced by upstream tasks.
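The motivation in the issue description — dividing subpartition ranges according to the amount of data collected from upstream tasks — can be illustrated with a simple greedy split. This is only a sketch of the idea, not Flink's actual scheduler code; the class and method names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative greedy split of subpartitions into ranges of roughly equal data volume. */
public class SubpartitionRangeSplitter {

    /** Returns inclusive [start, end] index pairs covering all subpartitions in order. */
    static List<int[]> splitIntoRanges(long[] subpartitionBytes, int numRanges) {
        long total = 0;
        for (long b : subpartitionBytes) {
            total += b;
        }
        // Desired amount of data per downstream range.
        long target = Math.max(1, total / numRanges);
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        long acc = 0;
        for (int i = 0; i < subpartitionBytes.length; i++) {
            acc += subpartitionBytes[i];
            // Close the current range once it reaches the target,
            // leaving room for the final range to absorb the remainder.
            if (acc >= target && ranges.size() < numRanges - 1) {
                ranges.add(new int[] {start, i});
                start = i + 1;
                acc = 0;
            }
        }
        if (start < subpartitionBytes.length) {
            ranges.add(new int[] {start, subpartitionBytes.length - 1});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Six subpartitions totalling 3000 "bytes", split into three ~1000-byte ranges.
        long[] sizes = {300, 700, 200, 800, 400, 600};
        for (int[] r : splitIntoRanges(sizes, 3)) {
            System.out.println(r[0] + "-" + r[1]); // prints 0-1, 2-3, 4-5 on separate lines
        }
    }
}
```

The point of size-based (rather than count-based) splitting is that skewed subpartitions no longer force one downstream task to read far more data than its siblings.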
[jira] [Commented] (FLINK-29664) Collect subpartition sizes of BLOCKING result partitions
[ https://issues.apache.org/jira/browse/FLINK-29664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649603#comment-17649603 ] Lijie Wang commented on FLINK-29664:
---
Done via 96795ae5bf0b6d6c8b1fa793be742a7c1af3
[GitHub] [flink] wanglijie95 closed pull request #21111: [FLINK-29664][runtime] Collect subpartition sizes of blocking result partitions
wanglijie95 closed pull request #21111: [FLINK-29664][runtime] Collect subpartition sizes of blocking result partitions
URL: https://github.com/apache/flink/pull/21111

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-30456) OLM Bundle Description Version Problems
[ https://issues.apache.org/jira/browse/FLINK-30456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora reassigned FLINK-30456:
---
Assignee: James Busche

> OLM Bundle Description Version Problems
>
> Key: FLINK-30456
> URL: https://issues.apache.org/jira/browse/FLINK-30456
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.3.0
> Reporter: James Busche
> Assignee: James Busche
> Priority: Major
> Labels: pull-request-available
>
> OLM is working great with OperatorHub, but I noticed a few items that need fixing.
> 1. The basic.yaml example version is release-1.1 instead of the latest release (release-1.3). This needs fixing in two places:
> tools/olm/generate-olm-bundle.sh
> tools/olm/docker-entry.sh
> 2. The label versions in the description are hardcoded to 1.2.0 instead of the latest release (1.3.0).
> 3. The Provider is a blank space " " but soon needs to have some text in it to avoid CI errors with the latest operator-sdk versions. The person who noticed it recommended "Community" for now, but maybe we can begin making it "The Apache Software Foundation" now? Not sure if we're ready for that yet or not...
>
> I'm working on a PR to address these items. Can you assign the issue to me? Thanks!
> fyi [~tedhtchang] [~gyfora]
[GitHub] [flink-kubernetes-operator] jbusche commented on pull request #491: [FLINK-30456] Fixing Version and provider in OLM Description
jbusche commented on PR #491:
URL: https://github.com/apache/flink-kubernetes-operator/pull/491#issuecomment-1358917278

I tested my PR briefly on OpenShift 4.10 and will do some more thorough testing tomorrow and post the results here. I think @tedhtchang might test on his server(s) too.
[jira] [Updated] (FLINK-30453) Fix 'can't find CatalogFactory' error when using FLINK sql-client to add table store bundle jar
[ https://issues.apache.org/jira/browse/FLINK-30453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30453:
---
Labels: pull-request-available (was: )

> Fix 'can't find CatalogFactory' error when using FLINK sql-client to add table store bundle jar
>
> Key: FLINK-30453
> URL: https://issues.apache.org/jira/browse/FLINK-30453
> Project: Flink
> Issue Type: Bug
> Components: Table Store
> Reporter: yuzelin
> Priority: Major
> Labels: pull-request-available
> Fix For: table-store-0.3.0
>
> Flink 1.16 introduced a new mechanism that allows passing a ClassLoader to EnvironmentSettings ([FLINK-15635|https://issues.apache.org/jira/browse/FLINK-15635]), and the Flink SQL Client passes a `ClientWrapperClassLoader`, so the table store CatalogFactory cannot be found if it is added by an 'ADD JAR' statement through the SQL Client.
[GitHub] [flink-table-store] yuzelin opened a new pull request, #442: [FLINK-30453] Fix 'can't find CatalogFactory' error when using FLINK sql-client to add table store bundle jar
yuzelin opened a new pull request, #442:
URL: https://github.com/apache/flink-table-store/pull/442

Main changes:
- Allow passing a ClassLoader parameter when creating a catalog;
- Pass the ClassLoader from the context when creating `FlinkCatalog`.
[GitHub] [flink] huwh commented on pull request #21496: [FLINK-29869][ResourceManager] make ResourceAllocator declarative.
huwh commented on PR #21496:
URL: https://github.com/apache/flink/pull/21496#issuecomment-1358905082

Thanks for your comment, @xintongsong. I made the following changes:
1. Deleted UnwantedWorker and UnwantedWorkerWithResourceProfile; unwanted workers are now treated as a hint.
2. Added totalWorkerCounter to maintain the count of all workers that have a resource spec (including recovered workers).
3. Removed the delayed check of declarationResource, since its cost is not very heavy.
[jira] [Commented] (FLINK-30260) FLIP-271: Add initial Autoscaling implementation
[ https://issues.apache.org/jira/browse/FLINK-30260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649592#comment-17649592 ] SwathiChandrashekar commented on FLINK-30260:
---
Hi [~mxm], [~gyfora], I would like to volunteer if there are any subtasks where I can contribute.

> FLIP-271: Add initial Autoscaling implementation
>
> Key: FLINK-30260
> URL: https://issues.apache.org/jira/browse/FLINK-30260
> Project: Flink
> Issue Type: New Feature
> Components: Kubernetes Operator
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Major
> Fix For: kubernetes-operator-1.4.0
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-271%3A+Autoscaling
[jira] [Commented] (FLINK-30435) `SHOW CREATE TABLE` statement shows column comment
[ https://issues.apache.org/jira/browse/FLINK-30435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649587#comment-17649587 ] Yubin Li commented on FLINK-30435:
---
Hi [~lzljs3620320], after FLINK-29679 was merged, only a few lines need to be modified to implement the feature. Could you please give it a review? Thanks!

> `SHOW CREATE TABLE` statement shows column comment
>
> Key: FLINK-30435
> URL: https://issues.apache.org/jira/browse/FLINK-30435
> Project: Flink
> Issue Type: Improvement
> Reporter: Yubin Li
> Priority: Major
> Labels: pull-request-available
>
> After a table with column comments is created, the results generated by the `SHOW CREATE TABLE` statement lose the column comments.
> As created tables have been migrated to the new schema framework in https://issues.apache.org/jira/browse/FLINK-29679, it is easy to get this done.
> Note: the feature doesn't change SQL syntax; it just makes the output consistent with the DDL and more user-friendly.
[jira] [Updated] (FLINK-30445) Add Flink Web3 Connector
[ https://issues.apache.org/jira/browse/FLINK-30445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] luoyuxia updated FLINK-30445:
---
Description:
# Web3 is very hot. But if you search GitHub for open source blockchain explorers, the project with the most stars is blockscout ([https://github.com/blockscout/blockscout]), which uses Elixir as a parallel engine to sync blocks from a blockchain node into a file (CSV format). I think Flink is the best solution for this ingestion. Reasons:
(1) A blockchain connector needs to support different chains, including Ethereum, Bitcoin, Solana, etc., through JSON-RPC.
(2) Like EtherScan, the blockchain needs to fetch the latest blocks into storage for the index to search.
(3) As a supplement to (2), we need a connector to fully sync all blocks from a blockchain node. I think Flink's stream/batch alignment feature suits this scenario.
(4) According to FLIP-27, we could use the block number as the SourceSplit to read. It is very natural.
(5) The Flink community could use the web3 topic to get PR effects in the web3 circle.

> Add Flink Web3 Connector
>
> Key: FLINK-30445
> URL: https://issues.apache.org/jira/browse/FLINK-30445
> Project: Flink
> Issue Type: New Feature
> Reporter: Junyao Huang
> Priority: Major
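Point (4) of the description — using the block number as the SourceSplit — can be sketched as follows. Note this is purely illustrative: the `SourceSplit` interface below only mimics the shape of Flink's FLIP-27 interface of the same name, and the class and method names are invented, not part of any proposed connector:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch: modeling block-number ranges as FLIP-27-style source splits. */
public class BlockRangeSplits {

    /** Minimal stand-in for Flink's SourceSplit interface (split id only). */
    interface SourceSplit {
        String splitId();
    }

    /** A split covering an inclusive range of block numbers on the chain. */
    static final class BlockRangeSplit implements SourceSplit {
        final long startBlock; // inclusive
        final long endBlock;   // inclusive

        BlockRangeSplit(long startBlock, long endBlock) {
            this.startBlock = startBlock;
            this.endBlock = endBlock;
        }

        @Override
        public String splitId() {
            return startBlock + "-" + endBlock;
        }
    }

    /** Enumerates fixed-size block ranges, e.g. for a batch backfill of historical blocks. */
    static List<BlockRangeSplit> enumerate(long fromBlock, long toBlock, long blocksPerSplit) {
        List<BlockRangeSplit> splits = new ArrayList<>();
        for (long start = fromBlock; start <= toBlock; start += blocksPerSplit) {
            long end = Math.min(start + blocksPerSplit - 1, toBlock);
            splits.add(new BlockRangeSplit(start, end));
        }
        return splits;
    }

    public static void main(String[] args) {
        for (BlockRangeSplit s : enumerate(0, 99, 40)) {
            System.out.println(s.splitId()); // prints 0-39, 40-79, 80-99 on separate lines
        }
    }
}
```

Because block numbers are monotonically increasing, a split enumerator could hand out bounded ranges for the historical (batch) part and keep assigning new ranges as the chain head advances for the streaming part — which is exactly the stream/batch alignment argument in point (3).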
[jira] [Commented] (FLINK-30445) Add Flink Web3 Connector
[ https://issues.apache.org/jira/browse/FLINK-30445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649558#comment-17649558 ] Junyao Huang commented on FLINK-30445:
---
[https://lists.apache.org/list.html?d...@flink.apache.org] posted. [~martijnvisser]
[jira] [Commented] (FLINK-30445) Add Flink Web3 Connector
[ https://issues.apache.org/jira/browse/FLINK-30445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649553#comment-17649553 ] Junyao Huang commented on FLINK-30445:
---
Is it this one? [https://lists.apache.org/list.html?u...@flink.apache.org] [~martijnvisser]
[GitHub] [flink] ramkrish86 commented on a diff in pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path
ramkrish86 commented on code in PR #21508:
URL: https://github.com/apache/flink/pull/21508#discussion_r1052900496

## flink-filesystems/flink-azure-fs-hadoop/pom.xml:
@@ -185,13 +185,6 @@ under the License.
-
- org.apache.flink:flink-hadoop-fs
-
- org/apache/flink/runtime/util/HadoopUtils
- org/apache/flink/runtime/fs/hdfs/HadoopRecoverable*
-
-

Review Comment:
Can I resolve this?
[GitHub] [flink] ramkrish86 commented on a diff in pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path
ramkrish86 commented on code in PR #21508:
URL: https://github.com/apache/flink/pull/21508#discussion_r1052900374

## flink-filesystems/flink-azure-fs-hadoop/src/main/java/org/apache/flink/fs/azurefs/AzureBlobFsRecoverableDataOutputStream.java:
@@ -0,0 +1,299 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.fs.azurefs;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.core.fs.RecoverableFsDataOutputStream;
+import org.apache.flink.core.fs.RecoverableWriter;
+import org.apache.flink.runtime.fs.hdfs.HadoopFsRecoverable;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * An implementation of the {@link RecoverableFsDataOutputStream} for AzureBlob's file system
+ * abstraction.
+ */
+@Internal
+public class AzureBlobFsRecoverableDataOutputStream extends RecoverableFsDataOutputStream {
+
+    private static final Logger LOG =
+            LoggerFactory.getLogger(AzureBlobFsRecoverableDataOutputStream.class);
+
+    private final FileSystem fs;
+
+    private final Path targetFile;
+
+    private final Path tempFile;
+
+    private final FSDataOutputStream out;
+
+    // Not final to override in tests
+    public static int minBufferLength = 2097152;
+
+    // init to 0. When ever recovery is done add this to the pos.
+    private long initialFileSize = 0;
+
+    public AzureBlobFsRecoverableDataOutputStream(FileSystem fs, Path targetFile, Path tempFile)
+            throws IOException {
+        this.fs = checkNotNull(fs);
+        this.targetFile = checkNotNull(targetFile);
+        LOG.debug("The targetFile is {}", targetFile.getName());
+        this.tempFile = checkNotNull(tempFile);
+        LOG.debug("The tempFile is {}", tempFile.getName());
+        this.out = fs.create(tempFile);
+    }
+
+    public AzureBlobFsRecoverableDataOutputStream(FileSystem fs, HadoopFsRecoverable recoverable)
+            throws IOException {
+        this.fs = checkNotNull(fs);
+        this.targetFile = checkNotNull(recoverable.targetFile());
+        this.tempFile = checkNotNull(recoverable.tempFile());
+        long len = fs.getFileStatus(tempFile).getLen();
+        LOG.info("The recoverable offset is {} and the file len is {}", recoverable.offset(), len);
+        // Happens when we recover from a previously committed offset. Otherwise this is not
+        // really needed
+        if (len > recoverable.offset()) {
+            truncate(fs, recoverable);
+        } else if (len < recoverable.offset()) {
+            LOG.error(
+                    "Temp file length {} is less than the expected recoverable offset {}",
+                    len,
+                    recoverable.offset());
+            throw new IOException(
+                    "Unable to create recoverable outputstream as length of file "
+                            + len
+                            + " is less than "
+                            + "recoverable offset "
+                            + recoverable.offset());
+        }
+        // In ABFS when we try to append we don't account for the initial file size like we do in
+        // DFS.
+        // So we explicitly store this and when we do a persist call we make use of it.
+        initialFileSize = fs.getFileStatus(tempFile).getLen();
+        out = fs.append(tempFile);
+        LOG.debug("Created a new OS for appending {}", tempFile);
+    }
+
+    private void truncate(FileSystem fs, HadoopFsRecoverable recoverable) throws IOException {
+        Path renameTempPath = new Path(tempFile.toString() + ".rename");
+        try {
+            LOG.info(
+                    "Creating the temp rename file {} for truncating the tempFile {}",
+                    renameTempPath,
+                    tempFile);
+            FSDataOutputStream fsDataOutputStream =
[jira] [Created] (FLINK-30457) Introduce periodic compaction to FRocksDB
Yun Tang created FLINK-30457:
---
Summary: Introduce periodic compaction to FRocksDB
Key: FLINK-30457
URL: https://issues.apache.org/jira/browse/FLINK-30457
Project: Flink
Issue Type: Improvement
Components: Runtime / State Backends
Reporter: Yun Tang
Assignee: Yun Tang
Fix For: 1.17.0

RocksDB supports periodic compaction once the compaction filter is enabled (see https://github.com/facebook/rocksdb/wiki/Leveled-Compaction#periodic-compaction), which is really useful for the Flink TTL feature to pick up really old files during compaction. The current FRocksDB base version, 6.20.3, already includes the necessary code on the C++ side (https://github.com/facebook/rocksdb/pull/5166 and https://github.com/facebook/rocksdb/pull/5184); we just need to cherry-pick https://github.com/facebook/rocksdb/pull/8579 on the Java side to support it in Flink.
[GitHub] [flink] yangjunhan commented on a diff in pull request #21483: [FLINK-30358][rest] Exception location contains TM resource id and link to TM pages.
yangjunhan commented on code in PR #21483:
URL: https://github.com/apache/flink/pull/21483#discussion_r1052849668

## flink-runtime-web/web-dashboard/src/app/pages/job/exceptions/job-exceptions.component.ts:
@@ -151,4 +153,10 @@ export class JobExceptionsComponent implements OnInit, OnDestroy {
     });
   });
 }
+
+  public navigateTo(taskManagerId: string): void {
+    if (taskManagerId != null) {

Review Comment:
```suggestion
    if (taskManagerId !== null) {
```
[GitHub] [flink] yangjunhan commented on a diff in pull request #21483: [FLINK-30358][rest] Exception location contains TM resource id and link to TM pages.
yangjunhan commented on code in PR #21483:
URL: https://github.com/apache/flink/pull/21483#discussion_r1052849595

## flink-runtime-web/web-dashboard/src/app/pages/job/exceptions/job-exceptions.component.ts:
@@ -151,4 +153,10 @@ export class JobExceptionsComponent implements OnInit, OnDestroy {
     });
   });
 }
+
+  public navigateTo(taskManagerId: string): void {

Review Comment:
```suggestion
  public navigateTo(taskManagerId: string | null): void {
```
[jira] [Updated] (FLINK-24932) Frocksdb cannot run on Apple M1
[ https://issues.apache.org/jira/browse/FLINK-24932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang updated FLINK-24932: - Priority: Major (was: Minor) > Frocksdb cannot run on Apple M1 > --- > > Key: FLINK-24932 > URL: https://issues.apache.org/jira/browse/FLINK-24932 > Project: Flink > Issue Type: Sub-task > Components: Runtime / State Backends >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Major > Fix For: 1.17.0 > > > After we bump up RocksDB version to 6.20.3, we support to run RocksDB on > linux arm cluster. However, according to the feedback from Robert, Apple M1 > machines cannot run FRocksDB yet: > {code:java} > java.lang.Exception: Exception while creating StreamOperatorStateContext. > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:255) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:268) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:109) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:711) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:687) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) > ~[flink-runtime-1.14.0.jar:1.14.0] > 
at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > ~[flink-runtime-1.14.0.jar:1.14.0] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > ~[flink-runtime-1.14.0.jar:1.14.0] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) > ~[flink-runtime-1.14.0.jar:1.14.0] > at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_312] > Caused by: org.apache.flink.util.FlinkException: Could not restore keyed > state backend for StreamFlatMap_c21234bcbf1e8eb4c61f1927190efebd_(1/1) from > any of the 1 provided restore options. > at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:160) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:346) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:164) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > ... 
11 more > Caused by: java.io.IOException: Could not load the native RocksDB library > at > org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend.ensureRocksDBIsLoaded(EmbeddedRocksDBStateBackend.java:882) > ~[flink-statebackend-rocksdb_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend.createKeyedStateBackend(EmbeddedRocksDBStateBackend.java:402) > ~[flink-statebackend-rocksdb_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:345) > ~[flink-statebackend-rocksdb_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:87) > ~[flink-statebackend-rocksdb_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:329) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) > ~[flink-streaming-java_2.11-1.14.0.jar:1.14.0] > at >
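Until an Apple M1 (aarch64) build of FRocksDB is available, one possible stop-gap for local development — a suggestion on the editor's part, not something proposed in this ticket — is to switch the job to the heap-based state backend, which needs no native library. This is only appropriate when the job's state fits in JVM heap memory:

```yaml
# flink-conf.yaml — stop-gap for machines where the FRocksDB native
# library cannot be loaded. Only suitable if state fits on the JVM heap.
state.backend: hashmap
state.checkpoints.dir: file:///tmp/flink-checkpoints  # example path
```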
[GitHub] [flink-kubernetes-operator] jbusche commented on pull request #491: [FLINK-30456] Fixing Version and provider in OLM Description
jbusche commented on PR #491: URL: https://github.com/apache/flink-kubernetes-operator/pull/491#issuecomment-1358748799 Yes @morhidi it took me a while to fill out completely -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #491: [FLINK-30456] Fixing Version and provider in OLM Description
morhidi commented on PR #491: URL: https://github.com/apache/flink-kubernetes-operator/pull/491#issuecomment-1358742932 @jbusche please fill out the PR template properly
[jira] [Updated] (FLINK-30456) OLM Bundle Description Version Problems
[ https://issues.apache.org/jira/browse/FLINK-30456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30456: --- Labels: pull-request-available (was: ) > OLM Bundle Description Version Problems > --- > > Key: FLINK-30456 > URL: https://issues.apache.org/jira/browse/FLINK-30456 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.3.0 >Reporter: James Busche >Priority: Major > Labels: pull-request-available > > OLM is working great with OperatorHub, but I noticed a few items that need > fixing. > 1. The basic.yaml example version is release-1.1 instead of the latest > release (release-1.3). This needs fixing in two places: > tools/olm/generate-olm-bundle.sh > tools/olm/docker-entry.sh > 2. The label versions in the description are hardcoded to 1.2.0 instead of > the latest release (1.3.0) > 3. The Provider is a blank space " " but soon needs to have some text in there > to avoid CI errors with the latest operator-sdk versions. The person who > noticed it recommended "Community" for now, but maybe we can begin making it > "The Apache Software Foundation" now? Not sure if we're ready for that yet > or not... > > I'm working on a PR to address these items. Can you assign the issue to me? > Thanks! > fyi [~tedhtchang] [~gyfora] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-table-store] zjureel closed pull request #441: [FLINK-30423] Introduce codegen for CastExecutor
zjureel closed pull request #441: [FLINK-30423] Introduce codegen for CastExecutor URL: https://github.com/apache/flink-table-store/pull/441
[GitHub] [flink-kubernetes-operator] jbusche opened a new pull request, #491: [FLINK-30456] Fixing Version and provider in OLM Description
jbusche opened a new pull request, #491: URL: https://github.com/apache/flink-kubernetes-operator/pull/491 Signed-off-by: James Busche ## What is the purpose of the change *(For example: This pull request adds a new feature to periodically create and maintain savepoints through the `FlinkDeployment` custom resource.)* ## Brief change log *(for example:)* - *Periodic savepoint trigger is introduced to the custom resource* - *The operator checks on reconciliation whether the required time has passed* - *The JobManager's dispose savepoint API is used to clean up obsolete savepoints* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changes to the `CustomResourceDescriptors`: (yes / no) - Core observer or reconciler logic that is regularly executed: (yes / no) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] xintongsong commented on pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path
xintongsong commented on PR #21508: URL: https://github.com/apache/flink/pull/21508#issuecomment-1358726442 > Mainly 2 TODOs. One is the 2MB buffer and next is the flush support. I think for now it should be good. But I can raise TODO JIRAs and work on it later. Is that ok ? I'm fine with the plan. Just trying to understand the plan. I have no strong opinions on this.
[GitHub] [flink] xintongsong commented on a diff in pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path
xintongsong commented on code in PR #21508: URL: https://github.com/apache/flink/pull/21508#discussion_r1052787197 ## flink-filesystems/flink-azure-fs-hadoop/src/main/java/org/apache/flink/fs/azurefs/AzureBlobFsRecoverableDataOutputStream.java: ## @@ -0,0 +1,299 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.fs.azurefs; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.RecoverableFsDataOutputStream; +import org.apache.flink.core.fs.RecoverableWriter; +import org.apache.flink.runtime.fs.hdfs.HadoopFsRecoverable; + +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.FileNotFoundException; +import java.io.IOException; + +import static org.apache.flink.util.Preconditions.checkNotNull; + +/** + * An implementation of the {@link RecoverableFsDataOutputStream} for AzureBlob's file system + * abstraction. 
+ */ +@Internal +public class AzureBlobFsRecoverableDataOutputStream extends RecoverableFsDataOutputStream { + +private static final Logger LOG = + LoggerFactory.getLogger(AzureBlobFsRecoverableDataOutputStream.class); + +private final FileSystem fs; + +private final Path targetFile; + +private final Path tempFile; + +private final FSDataOutputStream out; + +// Not final to override in tests +public static int minBufferLength = 2097152; + +// init to 0. When ever recovery is done add this to the pos. +private long initialFileSize = 0; + +public AzureBlobFsRecoverableDataOutputStream(FileSystem fs, Path targetFile, Path tempFile) +throws IOException { +this.fs = checkNotNull(fs); +this.targetFile = checkNotNull(targetFile); +LOG.debug("The targetFile is {}", targetFile.getName()); +this.tempFile = checkNotNull(tempFile); +LOG.debug("The tempFile is {}", tempFile.getName()); +this.out = fs.create(tempFile); +} + +public AzureBlobFsRecoverableDataOutputStream(FileSystem fs, HadoopFsRecoverable recoverable) +throws IOException { +this.fs = checkNotNull(fs); +this.targetFile = checkNotNull(recoverable.targetFile()); +this.tempFile = checkNotNull(recoverable.tempFile()); +long len = fs.getFileStatus(tempFile).getLen(); +LOG.info("The recoverable offset is {} and the file len is {}", recoverable.offset(), len); +// Happens when we recover from a previously committed offset. Otherwise this is not +// really needed +if (len > recoverable.offset()) { +truncate(fs, recoverable); +} else if (len < recoverable.offset()) { +LOG.error( +"Temp file length {} is less than the expected recoverable offset {}", +len, +recoverable.offset()); +throw new IOException( +"Unable to create recoverable outputstream as length of file " ++ len ++ " is less than " ++ "recoverable offset " ++ recoverable.offset()); +} +// In ABFS when we try to append we don't account for the initial file size like we do in +// DFS. +// So we explicitly store this and when we do a persist call we make use of it. 
+initialFileSize = fs.getFileStatus(tempFile).getLen(); +out = fs.append(tempFile); +LOG.debug("Created a new OS for appending {}", tempFile); +} + +private void truncate(FileSystem fs, HadoopFsRecoverable recoverable) throws IOException { +Path renameTempPath = new Path(tempFile.toString() + ".rename"); +try { +LOG.info( +"Creating the temp rename file {} for truncating the tempFile {}", +renameTempPath, +tempFile); +FSDataOutputStream fsDataOutputStream =
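The `truncate` helper in the diff above is cut off mid-method, but the visible portion suggests a copy-into-side-file-and-rename approach (ABFS does not expose a native truncate). As a rough, generic illustration of that pattern — plain `java.nio` on local files with hypothetical names, not the actual `AzureBlobFsRecoverableDataOutputStream` code, which streams rather than buffering the whole file:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class TruncateByRename {
    /**
     * Truncates {@code file} to {@code length} bytes by writing the prefix
     * into a ".rename" side file and moving it over the original.
     * Sketch only: a real implementation would stream, not buffer in memory.
     */
    static void truncateByRename(Path file, long length) throws IOException {
        Path side = file.resolveSibling(file.getFileName() + ".rename");
        byte[] contents = Files.readAllBytes(file);
        Files.write(side, Arrays.copyOf(contents, (int) length));
        Files.move(side, file, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("truncate-demo", ".bin");
        Files.write(f, new byte[] {1, 2, 3, 4, 5});
        truncateByRename(f, 3);
        if (Files.size(f) != 3) {
            throw new AssertionError("expected 3 bytes after truncate");
        }
        Files.delete(f);
        System.out.println("truncate-by-rename ok");
    }
}
```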
[jira] [Updated] (FLINK-30456) OLM Bundle Description Version Problems
[ https://issues.apache.org/jira/browse/FLINK-30456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Busche updated FLINK-30456: - Description: OLM is working great with OperatorHub, but I noticed a few items that need fixing. 1. The basic.yaml example version is release-1.1 instead of the latest release (release-1.3). This needs fixing in two places: tools/olm/generate-olm-bundle.sh tools/olm/docker-entry.sh 2. The label versions in the description are hardcoded to 1.2.0 instead of the latest release (1.3.0) 3. The Provider is a blank space " " but soon needs to have some text in there to avoid CI errors with the latest operator-sdk versions. The person who noticed it recommended "Community" for now, but maybe we can begin making it "The Apache Software Foundation" now? Not sure if we're ready for that yet or not... I'm working on a PR to address these items. Can you assign the issue to me? Thanks! fyi [~tedhtchang] [~gyfora] was: OLM is working great with OperatorHub, but I noticed a few items that need fixing. 1. The basic.yaml example version is release-1.1 instead of the latest release (release-1.3). This needs fixing in two places: tools/olm/generate-olm-bundle.sh tools/olm/docker-entry.sh 2. The label versions in the description are hardcoded to 1.2.0 instead of the latest release (1.3.0) 3. The Provider is a blank space " " but soon needs to have some text in there to avoid CI errors with the latest operator-sdk versions. The person who noticed it recommended "Community" for now, but maybe we can begin making it "The Apache Software Foundation" now? Not sure if we're ready for that yet or not... I'm working on a PR to address these items. Can you assign the issue to me? Thanks!
fyi [~tedchang] [~gyfora] > OLM Bundle Description Version Problems > --- > > Key: FLINK-30456 > URL: https://issues.apache.org/jira/browse/FLINK-30456 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.3.0 >Reporter: James Busche >Priority: Major > > OLM is working great with OperatorHub, but I noticed a few items that need > fixing. > 1. The basic.yaml example version is release-1.1 instead of the latest > release (release-1.3). This needs fixing in two places: > tools/olm/generate-olm-bundle.sh > tools/olm/docker-entry.sh > 2. The label versions in the description are hardcoded to 1.2.0 instead of > the latest release (1.3.0) > 3. The Provider is a blank space " " but soon needs to have some text in there > to avoid CI errors with the latest operator-sdk versions. The person who > noticed it recommended "Community" for now, but maybe we can begin making it > "The Apache Software Foundation" now? Not sure if we're ready for that yet > or not... > > I'm working on a PR to address these items. Can you assign the issue to me? > Thanks! > fyi [~tedhtchang] [~gyfora]
[jira] [Created] (FLINK-30456) OLM Bundle Description Version Problems
James Busche created FLINK-30456: Summary: OLM Bundle Description Version Problems Key: FLINK-30456 URL: https://issues.apache.org/jira/browse/FLINK-30456 Project: Flink Issue Type: Bug Components: Kubernetes Operator Affects Versions: kubernetes-operator-1.3.0 Reporter: James Busche OLM is working great with OperatorHub, but I noticed a few items that need fixing. 1. The basic.yaml example version is release-1.1 instead of the latest release (release-1.3). This needs fixing in two places: tools/olm/generate-olm-bundle.sh tools/olm/docker-entry.sh 2. The label versions in the description are hardcoded to 1.2.0 instead of the latest release (1.3.0) 3. The Provider is a blank space " " but soon needs to have some text in there to avoid CI errors with the latest operator-sdk versions. The person who noticed it recommended "Community" for now, but maybe we can begin making it "The Apache Software Foundation" now? Not sure if we're ready for that yet or not... I'm working on a PR to address these items. Can you assign the issue to me? Thanks! fyi [~tedchang] [~gyfora]
[jira] [Updated] (FLINK-30406) Jobmanager Deployment error without HA metadata should not lead to unrecoverable error
[ https://issues.apache.org/jira/browse/FLINK-30406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30406: --- Labels: pull-request-available (was: ) > Jobmanager Deployment error without HA metadata should not lead to > unrecoverable error > -- > > Key: FLINK-30406 > URL: https://issues.apache.org/jira/browse/FLINK-30406 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.4.0 > > > Currently we don't have a good way of asserting that the job never started > after a savepoint upgrade when the JM deployment fails (such as on an incorrect > image). > This easily leads to scenarios which require manual recovery by the user. > We should try to avoid this with some mechanism to greatly improve the > robustness of savepoint upgrades.
[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request, #489: [FLINK-30406] Detect when jobmanager never started
gyfora opened a new pull request, #489: URL: https://github.com/apache/flink-kubernetes-operator/pull/489 ## What is the purpose of the change The purpose of this PR is to fix the long-standing, annoying case where the job was stuck in a non-upgradable state after starting/upgrading from a savepoint when the JobManager never starts. In these cases we previously only supported last-state (HA-based) upgrade, which was impossible if the JM never started and never created the HA metadata configmaps. The PR introduces a check of whether the JobManager pods ever started, by checking the Availability conditions on the JM deployment and comparing the condition times with the deployment creation timestamp. If availability is False and the Deployment never transitioned out of this state after creation, we can assume that the JM never started and we can perform the upgrade using the last recorded savepoint. This also removes the slightly ad hoc logic we had in place for upgrades on initial deployments before stable state (which basically intended to work around this limitation). ## Verifying this change Unit tests + manual testing on minikube ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changes to the `CustomResourceDescriptors`: no - Core observer or reconciler logic that is regularly executed: yes ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable
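The timestamp comparison the PR describes can be sketched in isolation as follows — a plain-Java approximation with a hypothetical `Condition` type, not the operator's actual Kubernetes model classes or the real PR code:

```java
import java.time.Instant;

public class JmStartupCheck {
    // Hypothetical, simplified view of a Deployment "Available" condition.
    record Condition(String type, String status, Instant lastTransitionTime) {}

    /**
     * True if the deployment's Available condition is False and has never
     * transitioned since the deployment was created — i.e. the JobManager
     * pods most likely never started, so upgrading from the last recorded
     * savepoint is safe.
     */
    static boolean jobManagerNeverStarted(Condition available, Instant deploymentCreated) {
        return "False".equals(available.status())
                && !available.lastTransitionTime().isAfter(deploymentCreated);
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2022-12-19T10:00:00Z");
        // Condition stuck at False since creation -> JM never started.
        Condition stuck = new Condition("Available", "False", created);
        // Condition that flipped later -> the JM ran at some point.
        Condition flipped = new Condition("Available", "False", created.plusSeconds(600));
        if (!jobManagerNeverStarted(stuck, created)) throw new AssertionError("stuck case");
        if (jobManagerNeverStarted(flipped, created)) throw new AssertionError("flipped case");
        System.out.println("startup-check sketch ok");
    }
}
```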
[jira] [Closed] (FLINK-30260) FLIP-271: Add initial Autoscaling implementation
[ https://issues.apache.org/jira/browse/FLINK-30260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora closed FLINK-30260. -- Fix Version/s: kubernetes-operator-1.4.0 Resolution: Fixed > FLIP-271: Add initial Autoscaling implementation > - > > Key: FLINK-30260 > URL: https://issues.apache.org/jira/browse/FLINK-30260 > Project: Flink > Issue Type: New Feature > Components: Kubernetes Operator >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: kubernetes-operator-1.4.0 > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-271%3A+Autoscaling
[jira] [Updated] (FLINK-30437) State incompatibility issue might cause state loss
[ https://issues.apache.org/jira/browse/FLINK-30437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-30437: --- Fix Version/s: kubernetes-operator-1.4.0 > State incompatibility issue might cause state loss > -- > > Key: FLINK-30437 > URL: https://issues.apache.org/jira/browse/FLINK-30437 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Blocker > Labels: pull-request-available > Fix For: kubernetes-operator-1.4.0 > > > Even though we set: > execution.shutdown-on-application-finish: false > execution.submit-failed-job-on-application-error: true > If there is a state incompatibility, the jobmanager marks the job failed, > cleans up the HA metadata and restarts itself. This is a very concerning behaviour, > but we have to fix this on the operator side to at least guarantee no state > loss. > The solution is to harden the HA metadata check properly
[jira] [Closed] (FLINK-30408) Add unit test for HA metadata check logic
[ https://issues.apache.org/jira/browse/FLINK-30408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora closed FLINK-30408. -- Resolution: Fixed merged to main c4e76402f02f05932c6446d97bdc3d60861b9b27 > Add unit test for HA metadata check logic > - > > Key: FLINK-30408 > URL: https://issues.apache.org/jira/browse/FLINK-30408 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Gyula Fora >Priority: Critical > Fix For: kubernetes-operator-1.4.0 > > > The current mechanism to check for the existence of HA metadata in the > operator is not guarded by any unit tests, which makes it more susceptible to > accidental regressions. > We should add at least a few simple test cases to cover the expected behaviour
[jira] [Closed] (FLINK-30437) State incompatibility issue might cause state loss
[ https://issues.apache.org/jira/browse/FLINK-30437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora closed FLINK-30437. -- Resolution: Fixed merged to main c4e76402f02f05932c6446d97bdc3d60861b9b27 > State incompatibility issue might cause state loss > -- > > Key: FLINK-30437 > URL: https://issues.apache.org/jira/browse/FLINK-30437 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Blocker > Labels: pull-request-available > > Even though we set: > execution.shutdown-on-application-finish: false > execution.submit-failed-job-on-application-error: true > If there is a state incompatibility, the jobmanager marks the job failed, > cleans up the HA metadata and restarts itself. This is a very concerning behaviour, > but we have to fix this on the operator side to at least guarantee no state > loss. > The solution is to harden the HA metadata check properly
[GitHub] [flink-kubernetes-operator] gyfora merged pull request #488: [FLINK-30437][FLINK-30408] Harden HA meta check to avoid state loss
gyfora merged PR #488: URL: https://github.com/apache/flink-kubernetes-operator/pull/488
[jira] [Commented] (FLINK-30444) State recovery error not handled correctly and always causes JM failure
[ https://issues.apache.org/jira/browse/FLINK-30444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649470#comment-17649470 ] Gyula Fora commented on FLINK-30444: I think the JobManager shutting down is inconsistent with the execution.shutdown-on-application-finish configuration. All other fatal job errors simply leave the jobmanager there. Also, in Application mode, this should trigger the proper shutdown of the whole application, which in Kubernetes means that it won't restart. > State recovery error not handled correctly and always causes JM failure > --- > > Key: FLINK-30444 > URL: https://issues.apache.org/jira/browse/FLINK-30444 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.16.0, 1.14.6, 1.15.3 >Reporter: Gyula Fora >Assignee: David Morávek >Priority: Critical > > When you submit a job in Application mode and you try to restore from an > incompatible savepoint, there is a very unexpected behaviour. > Even with the following config: > {noformat} > execution.shutdown-on-application-finish: false > execution.submit-failed-job-on-application-error: true{noformat} > The job goes into a FAILED state, and the jobmanager fails. In a Kubernetes > environment (when using the native Kubernetes integration) this means that > the JobManager is restarted automatically. > This will mean that if you have the job result store enabled, after the JM comes > back you will end up with an empty application cluster. > I think the correct behaviour would be, depending on the above-mentioned config: > 1. If there is a job recovery error and you have > (execution.submit-failed-job-on-application-error) configured, then the job > should show up as failed, and the JM should not exit (if > execution.shutdown-on-application-finish is false) > 2. 
If (execution.shutdown-on-application-finish is true) then the jobmanager > should exit cleanly, as on a normal job terminal state, and thus stop the > deployment in Kubernetes, preventing a JM restart cycle
[GitHub] [flink-kubernetes-operator] gyfora merged pull request #484: FLIP-271: Autoscaling
gyfora merged PR #484: URL: https://github.com/apache/flink-kubernetes-operator/pull/484
[GitHub] [flink] MartijnVisser commented on pull request #21529: [DO NOT MERGE] Testing CI with Hadoop 3 upgrade
MartijnVisser commented on PR #21529: URL: https://github.com/apache/flink/pull/21529#issuecomment-1358263883 @luoyuxia I could use your help with this. This is a PR to validate that after the update of Hadoop from 2.8.5 to 2.10.2 everything would still work. As part of the upgrade for Hadoop 2, we've concluded that we also need to update our Hadoop 3 support from Hadoop 3.1.3 to Hadoop 3.2.3. This PR was just to check that everything would work if we would run the Hadoop 3 profile (via `-Dflink.hadoop.version=3.2.3 -Phadoop3-tests,hive3`). Everything works except that now the `HiveRunnerITCase` fails with: ``` 2022-12-19T19:42:44.1764995Z Dec 19 19:42:44 [ERROR] org.apache.flink.connectors.hive.HiveRunnerITCase.testOrcSchemaEvol Time elapsed: 0.882 s <<< ERROR! 2022-12-19T19:42:44.1766652Z Dec 19 19:42:44 java.lang.IllegalArgumentException: Failed to executeQuery Hive query insert into table db1.src values (1,100),(2,200): Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. 
org/apache/hadoop/metrics/Updater 2022-12-19T19:42:44.1767729Z Dec 19 19:42:44 at com.klarna.hiverunner.HiveServerContainer.executeStatement(HiveServerContainer.java:143) 2022-12-19T19:42:44.1768485Z Dec 19 19:42:44 at com.klarna.hiverunner.builder.HiveShellBase.executeStatementsWithCommandShellEmulation(HiveShellBase.java:121) 2022-12-19T19:42:44.1769312Z Dec 19 19:42:44 at com.klarna.hiverunner.builder.HiveShellBase.executeScriptWithCommandShellEmulation(HiveShellBase.java:110) 2022-12-19T19:42:44.1770263Z Dec 19 19:42:44 at com.klarna.hiverunner.builder.HiveShellBase.execute(HiveShellBase.java:129) 2022-12-19T19:42:44.1770974Z Dec 19 19:42:44 at org.apache.flink.connectors.hive.HiveRunnerITCase.testOrcSchemaEvol(HiveRunnerITCase.java:540) 2022-12-19T19:42:44.1771599Z Dec 19 19:42:44 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 2022-12-19T19:42:44.1772276Z Dec 19 19:42:44 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 2022-12-19T19:42:44.1773807Z Dec 19 19:42:44 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 2022-12-19T19:42:44.1775029Z Dec 19 19:42:44 at java.lang.reflect.Method.invoke(Method.java:498) 2022-12-19T19:42:44.1776298Z Dec 19 19:42:44 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) 2022-12-19T19:42:44.1777597Z Dec 19 19:42:44 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) 2022-12-19T19:42:44.1778841Z Dec 19 19:42:44 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) 2022-12-19T19:42:44.1780098Z Dec 19 19:42:44 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) 2022-12-19T19:42:44.1781377Z Dec 19 19:42:44 at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 2022-12-19T19:42:44.1782663Z Dec 19 19:42:44 at 
org.apache.flink.connectors.hive.FlinkEmbeddedHiveRunner.runTestMethod(FlinkEmbeddedHiveRunner.java:128) 2022-12-19T19:42:44.1784165Z Dec 19 19:42:44 at org.apache.flink.connectors.hive.FlinkEmbeddedHiveRunner.runChild(FlinkEmbeddedHiveRunner.java:116) 2022-12-19T19:42:44.1785673Z Dec 19 19:42:44 at org.apache.flink.connectors.hive.FlinkEmbeddedHiveRunner.runChild(FlinkEmbeddedHiveRunner.java:71) 2022-12-19T19:42:44.1786903Z Dec 19 19:42:44 at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 2022-12-19T19:42:44.1787956Z Dec 19 19:42:44 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 2022-12-19T19:42:44.1789050Z Dec 19 19:42:44 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 2022-12-19T19:42:44.1790152Z Dec 19 19:42:44 at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 2022-12-19T19:42:44.1791230Z Dec 19 19:42:44 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 2022-12-19T19:42:44.1792405Z Dec 19 19:42:44 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 2022-12-19T19:42:44.1793627Z Dec 19 19:42:44 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 2022-12-19T19:42:44.1794944Z Dec 19 19:42:44 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) 2022-12-19T19:42:44.1796202Z Dec 19 19:42:44 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) 2022-12-19T19:42:44.1797250Z Dec 19 19:42:44 at org.junit.rules.RunRules.evaluate(RunRules.java:20) 2022-12-19T19:42:44.1797836Z Dec 19 19:42:44 at
[jira] [Commented] (FLINK-30232) shading of netty epoll shared library does not account for ARM64 platform
[ https://issues.apache.org/jira/browse/FLINK-30232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649454#comment-17649454 ] Martijn Visser commented on FLINK-30232: [~cthomson] Have you heard anything? Otherwise I can take a look myself > shading of netty epoll shared library does not account for ARM64 platform > - > > Key: FLINK-30232 > URL: https://issues.apache.org/jira/browse/FLINK-30232 > Project: Flink > Issue Type: Bug > Components: BuildSystem / Shaded >Affects Versions: 1.15.2 > Environment: Kubernetes 1.23 provided by the AWS managed Kubernetes > service (EKS) with Graviton 2 based EC2 instances (ARM64) using Flink 1.15.2, > native epoll enabled (taskmanager.network.netty.transport: epoll) >Reporter: Chris Thomson >Priority: Major > > While evaluating migration of a Flink application to Graviton 2 based EC2 > instances in an AWS managed Kubernetes service (EKS) using Kubernetes 1.23, I > found that the shaded Netty library renames the AMD64 version of the shared > library as part of relocation of the Netty library, but does not rename the > matching ARM64 shared library.
> This results in the following error when `taskmanager.network.netty.transport: epoll` is used:
>
> Suppressed: java.lang.UnsatisfiedLinkError: no org_apache_flink_shaded_netty4_netty_transport_native_epoll in java.library.path
>     at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860) ~[?:1.8.0_352]
>     at java.lang.Runtime.loadLibrary0(Runtime.java:843) ~[?:1.8.0_352]
>     at java.lang.System.loadLibrary(System.java:1136) ~[?:1.8.0_352]
>     at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38) ~[flink-dist-1.15.2.jar:1.15.2]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_352]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_352]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_352]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_352]
>     at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:335) ~[flink-dist-1.15.2.jar:1.15.2]
>     at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_352]
>     at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:327) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:293) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:136) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:309) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.<clinit>(Native.java:85) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:40) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:51) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:185) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:36) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:60) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:49) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:113) ~[flink-dist-1.15.2.jar:1.15.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:100) ~[flink-dist-1.15.2.jar:1.15.2]
>     at
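The naming mismatch behind this failure can be illustrated with a short sketch. The shade prefix and the architecture suffixes below follow the names visible in the report and Netty's native-library naming convention; the class and helper methods are invented for illustration and are not part of Flink's build tooling:

```java
// Sketch of the native-library names involved in FLINK-30232.
// Hypothetical helpers; not Flink's actual shading/build logic.
public class ShadedNativeLibNames {
    static final String SHADE_PREFIX = "org_apache_flink_shaded_netty4_";

    /** Name Netty derives for the epoll transport on a given platform. */
    static String epollLibraryName(String osArch) {
        String base = "netty_transport_native_epoll";
        // Netty appends an architecture suffix such as _x86_64 or _aarch_64.
        String arch = osArch.contains("aarch") ? "_aarch_64" : "_x86_64";
        return base + arch;
    }

    /** Name the relocated (shaded) loader looks for inside the jar. */
    static String shadedName(String libraryName) {
        return SHADE_PREFIX + libraryName;
    }

    public static void main(String[] args) {
        // On ARM64 the loader asks for the shaded _aarch_64 variant; the bug is
        // that the build only renamed the x86_64 .so inside flink-dist.
        System.out.println(shadedName(epollLibraryName("aarch64")));
        System.out.println(shadedName(epollLibraryName("amd64")));
    }
}
```

The takeaway is that relocation has to rename every architecture-suffixed `.so`, not just the x86_64 one, for the shaded loader to find its library on ARM64.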
[GitHub] [flink] MartijnVisser commented on pull request #21529: [DO NOT MERGE] Testing CI with Hadoop 3 upgrade
MartijnVisser commented on PR #21529: URL: https://github.com/apache/flink/pull/21529#issuecomment-1358137678 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-30444) State recovery error not handled correctly and always causes JM failure
[ https://issues.apache.org/jira/browse/FLINK-30444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649416#comment-17649416 ] David Morávek commented on FLINK-30444: --- This is a deeper issue with how the savepoints are recovered when JobMaster starts up. If anything goes sideways during execution graph restore, we fail the job master, which is ultimately handled as a fatal exception by the dispatcher (_DefaultExecutionGraphFactory.tryRestoreExecutionGraphFromSavepoint_ is the relevant code path). Since the job has already been registered for execution by the dispatcher, we should handle the exception and fail the job accordingly because we can't recover from this. > This also not consistent with some other startup errors such as, missing > application jar. That causes a jobmanager restart loop, but does not put the > job a terminal FAILED state. This behaviour is more desirable as it doesn't > lead to empty application clusters on Kubernetes This is a special class of errors that could be qualified as "pre-main method errors"; I think this is orthogonal to this issue and should be discussed separately. > State recovery error not handled correctly and always causes JM failure > --- > > Key: FLINK-30444 > URL: https://issues.apache.org/jira/browse/FLINK-30444 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.16.0, 1.14.6, 1.15.3 >Reporter: Gyula Fora >Assignee: David Morávek >Priority: Critical > > When you submit a job in Application mode and you try to restore from an > incompatible savepoint, there is a very unexpected behaviour. > Even with the following config: > {noformat} > execution.shutdown-on-application-finish: false > execution.submit-failed-job-on-application-error: true{noformat} > The job goes into a FAILED state, and the jobmanager fails. In a kubernetes > environment (when using the native kubernetes integration) this means that > the JobManager is restarted automatically. 
> This will mean that if you have jobresult store enabled, after the JM comes > back you will end up with an empty application cluster. > I think the correct behaviour would be, depending on the above mention config: > 1. If there is a job recovery error and you have > (execution.submit-failed-job-on-application-error) configured, then the job > should show up as failed, and the JM should not exit (if > execution.shutdown-on-application-finish is false) > 2. If (execution.shutdown-on-application-finish is true) then the jobmanager > should exit cleanly like on normal job terminal state and thus stop the > deployment in Kubernetes, preventing a JM restart cycle -- This message was sent by Atlassian Jira (v8.20.10#820010)
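The behaviour proposed in this thread can be condensed into a sketch. All class and method names below are hypothetical stand-ins, not Flink's actual dispatcher code; the point is only that a savepoint-restore failure should end in a terminal FAILED job instead of escalating to a fatal JobManager error:

```java
// Hypothetical sketch of the error handling discussed in FLINK-30444.
public class RecoveryHandlingSketch {
    enum JobStatus { RUNNING, FAILED }

    interface GraphRestorer {
        void restoreFromSavepoint() throws Exception;
    }

    /**
     * Desired behaviour: if restoring the execution graph fails (e.g. from an
     * incompatible savepoint), mark the already-registered job FAILED rather
     * than rethrowing, so that with
     * `execution.submit-failed-job-on-application-error: true` the failed job
     * stays visible and the JobManager process survives.
     */
    static JobStatus startJob(GraphRestorer restorer) {
        try {
            restorer.restoreFromSavepoint();
            return JobStatus.RUNNING;
        } catch (Exception e) {
            // Unrecoverable restore error: fail the job, don't kill the JM.
            return JobStatus.FAILED;
        }
    }

    public static void main(String[] args) {
        JobStatus status = startJob(() -> {
            throw new IllegalStateException("incompatible savepoint");
        });
        System.out.println(status); // FAILED
    }
}
```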
[GitHub] [flink] pnowojski commented on a diff in pull request #20151: [FLINK-26803][checkpoint] Merging channel state files
pnowojski commented on code in PR #20151: URL: https://github.com/apache/flink/pull/20151#discussion_r1052418421

## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateWriteRequestExecutorImpl.java:

```
@@ -51,27 +66,40 @@ class ChannelStateWriteRequestExecutorImpl implements ChannelStateWriteRequestEx
 private final Thread thread;
 private volatile Exception thrown = null;
 private volatile boolean wasClosed = false;
-private final String taskName;
+
+private final Map> unreadyQueues =
+new ConcurrentHashMap<>();
+
+private final JobID jobID;
+private final Set subtasks;
+private final AtomicBoolean isRegistering = new AtomicBoolean(true);
+private final int numberOfSubtasksShareFile;
```

Review Comment:
> BlockingDeque.take() will wait until an element becomes available. Deque is hard to achieve. BlockingDeque is easy to implement the producer & consumer model.

That's a slight complexity, but should be easily solved via `lock.wait()` and `lock.notifyAll()` called in one or two places (`close()` and whenever we add anything to the current `dequeue`)? https://howtodoinjava.com/java/multi-threading/wait-notify-and-notifyall-methods/

The loop probably should look like this:
```
while (true) {
    synchronized (lock) {
        if (wasClosed) {
            return;
        }
        (...)
        ChannelStateWriteRequest request = waitAndTakeUnsafe();
        (...)
    }
}

private ChannelStateWriteRequest waitAndTakeUnsafe() throws InterruptedException {
    while (true) {
        ChannelStateWriteRequest request = dequeue.pollFirst();
        if (request == null) {
            lock.wait();
        } else {
            return request;
        }
    }
}
```
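For reference, the wait/notify pattern suggested in the review above can be fleshed out into a small self-contained class. The names below (`WaitNotifyQueue`, `add`, `close`, `take`) are invented for illustration; the real code lives in `ChannelStateWriteRequestExecutorImpl` and works on `ChannelStateWriteRequest` elements:

```java
import java.util.ArrayDeque;

/**
 * Minimal sketch of replacing a BlockingDeque with a plain deque guarded by
 * lock.wait()/lock.notifyAll(), as proposed in the review comment.
 */
public class WaitNotifyQueue<T> {
    private final Object lock = new Object();
    private final ArrayDeque<T> deque = new ArrayDeque<>();
    private boolean closed = false;

    /** Producer side: enqueue and wake up the waiting consumer. */
    public void add(T element) {
        synchronized (lock) {
            deque.addLast(element);
            lock.notifyAll();
        }
    }

    /** Wakes up the consumer so it can observe the closed flag. */
    public void close() {
        synchronized (lock) {
            closed = true;
            lock.notifyAll();
        }
    }

    /** Consumer side: blocks until an element arrives; null once closed and drained. */
    public T take() throws InterruptedException {
        synchronized (lock) {
            while (true) {
                T element = deque.pollFirst();
                if (element != null) {
                    return element;
                }
                if (closed) {
                    return null;
                }
                lock.wait();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WaitNotifyQueue<String> queue = new WaitNotifyQueue<>();
        queue.add("request-1");
        System.out.println(queue.take()); // request-1
        queue.close();
        System.out.println(queue.take()); // null (queue closed)
    }
}
```

Note that `wait()` must always sit inside a loop that rechecks the condition, since a thread can wake up spuriously or after another consumer already drained the element.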
[GitHub] [flink] flinkbot commented on pull request #21532: FLINK-30455 [core] Excluding java.lang.String and primitive types from closure cleaning
flinkbot commented on PR #21532: URL: https://github.com/apache/flink/pull/21532#issuecomment-1357891772 ## CI report: * 5cb42cb1d6c5850906ac77e54d1943182097474d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] gunnarmorling commented on pull request #21531: FLINK-30454 [runtime] Fixing visibility issue for SizeGauge.SizeSupplier
gunnarmorling commented on PR #21531: URL: https://github.com/apache/flink/pull/21531#issuecomment-1357888979 // CC @rmetzger
[jira] [Commented] (FLINK-30455) Avoid "cleaning" of java.lang.String in ClosureCleaner
[ https://issues.apache.org/jira/browse/FLINK-30455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649387#comment-17649387 ] Gunnar Morling commented on FLINK-30455: Sent PR [https://github.com/apache/flink/pull/21532,] seems simple enough. > Avoid "cleaning" of java.lang.String in ClosureCleaner > -- > > Key: FLINK-30455 > URL: https://issues.apache.org/jira/browse/FLINK-30455 > Project: Flink > Issue Type: Sub-task >Reporter: Gunnar Morling >Assignee: Gunnar Morling >Priority: Major > Labels: pull-request-available > > When running a simple "hello world" example on JDK 17, I noticed the closure > cleaner tries to reflectively access the {{java.lang.String}} class, which > fails due to the strong encapsulation enabled by default in JDK 17 and > beyond. I don't think the closure cleaner needs to act on {{String}} at all, > as it doesn't contain any inner classes. Unless there are objections, I'll > provide a fix along those lines. With this change in place, I can run that > example on JDK 17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30455) Avoid "cleaning" of java.lang.String in ClosureCleaner
[ https://issues.apache.org/jira/browse/FLINK-30455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30455: --- Labels: pull-request-available (was: ) > Avoid "cleaning" of java.lang.String in ClosureCleaner > -- > > Key: FLINK-30455 > URL: https://issues.apache.org/jira/browse/FLINK-30455 > Project: Flink > Issue Type: Sub-task >Reporter: Gunnar Morling >Assignee: Gunnar Morling >Priority: Major > Labels: pull-request-available > > When running a simple "hello world" example on JDK 17, I noticed the closure > cleaner tries to reflectively access the {{java.lang.String}} class, which > fails due to the strong encapsulation enabled by default in JDK 17 and > beyond. I don't think the closure cleaner needs to act on {{String}} at all, > as it doesn't contain any inner classes. Unless there are objections, I'll > provide a fix along those lines. With this change in place, I can run that > example on JDK 17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] gunnarmorling opened a new pull request, #21532: FLINK-30455 [core] Excluding java.lang.String and primitive types from closure cleaning
gunnarmorling opened a new pull request, #21532: URL: https://github.com/apache/flink/pull/21532 https://issues.apache.org/jira/browse/FLINK-30455 ## What is the purpose of the change Avoiding reflective access to java.lang.String which fails on Java 17 and beyond. ## Brief change log Excluding java.lang.String and primitive types from closure cleaning ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? No - If yes, how is the feature documented? Not applicable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
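As a rough illustration of the guard this PR describes: the method name and structure below are hypothetical; Flink's actual ClosureCleaner walks object graphs reflectively and its real signatures differ.

```java
// Hypothetical sketch of skipping String and primitives during closure cleaning.
public class CleaningGuard {
    /** Types that never need cleaning: they contain no user inner classes. */
    static boolean needsCleaning(Class<?> cls) {
        if (cls == null) {
            return false;
        }
        if (cls.isPrimitive() || cls == String.class) {
            // Reflective access into java.lang.String fails on JDK 17+ because
            // of the strong encapsulation of the java.base module.
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(needsCleaning(String.class));   // false
        System.out.println(needsCleaning(int.class));      // false
        System.out.println(needsCleaning(Runnable.class)); // true
    }
}
```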
[GitHub] [flink] flinkbot commented on pull request #21531: FLINK-30454 [runtime] Fixing visibility issue for SizeGauge.SizeSupplier
flinkbot commented on PR #21531: URL: https://github.com/apache/flink/pull/21531#issuecomment-1357877649 ## CI report: * b64e541cca8bfe43c5d0ec84918bbb2bd0851d24 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-30455) Avoid "cleaning" of java.lang.String in ClosureCleaner
[ https://issues.apache.org/jira/browse/FLINK-30455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser reassigned FLINK-30455: -- Assignee: Gunnar Morling > Avoid "cleaning" of java.lang.String in ClosureCleaner > -- > > Key: FLINK-30455 > URL: https://issues.apache.org/jira/browse/FLINK-30455 > Project: Flink > Issue Type: Sub-task >Reporter: Gunnar Morling >Assignee: Gunnar Morling >Priority: Major > > When running a simple "hello world" example on JDK 17, I noticed the closure > cleaner tries to reflectively access the {{java.lang.String}} class, which > fails due to the strong encapsulation enabled by default in JDK 17 and > beyond. I don't think the closure cleaner needs to act on {{String}} at all, > as it doesn't contain any inner classes. Unless there are objections, I'll > provide a fix along those lines. With this change in place, I can run that > example on JDK 17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
[ https://issues.apache.org/jira/browse/FLINK-30454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649383#comment-17649383 ] Gunnar Morling edited comment on FLINK-30454 at 12/19/22 3:51 PM: -- Ok, cool. Created PR [https://github.com/apache/flink/pull/21531/.|https://github.com/apache/flink/pull/21531/] was (Author: gunnar.morling): Ok, cool. Logged [https://github.com/apache/flink/pull/21531/.|https://github.com/apache/flink/pull/21531/] > Inconsistent class hierarchy in TaskIOMetricGroup > - > > Key: FLINK-30454 > URL: https://issues.apache.org/jira/browse/FLINK-30454 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Gunnar Morling >Priority: Major > Labels: pull-request-available > > I noticed an interesting issue when trying to compile the flink-runtime > module with Eclipse (same for VSCode): the _private_ inner class > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has > yet another _public_ inner class, {{{}SizeSupplier{}}}. The public method > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} > has a parameter of that type. > The invocation of this method in > {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, > TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, > TaskMailbox)}} can be compiled with the javac compiler of the JDK, but fails > to compile with ecj: > {code:java} > The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the > target context is not visible here. > {code} > I tend to believe that the behavior of Eclipse's compiler is the correct one. > After all, you couldn't explicitly reference the {{SizeSupplier}} type either. > One possible fix would be to promote {{SizeSupplier}} to the same level as > {{{}SizeGauge{}}}. This would be source-compatible but not binary-compatible, > though. I.e. 
code compiled against the earlier signature of > {{registerMailboxSizeSupplier()}} would be broken with a > {{{}NoSuchMethodError{}}}. Depending on whether > {{registerMailboxSizeSupplier()}} are expected in client code or not, this > may or may not be acceptable. > Another fix would be to make {{SizeGauge}} public. I think that's the change > I'd do. Curious what other folks here think. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
[ https://issues.apache.org/jira/browse/FLINK-30454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649383#comment-17649383 ] Gunnar Morling commented on FLINK-30454: Ok, cool. Logged [https://github.com/apache/flink/pull/21531/.|https://github.com/apache/flink/pull/21531/] > Inconsistent class hierarchy in TaskIOMetricGroup > - > > Key: FLINK-30454 > URL: https://issues.apache.org/jira/browse/FLINK-30454 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Gunnar Morling >Priority: Major > Labels: pull-request-available > > I noticed an interesting issue when trying to compile the flink-runtime > module with Eclipse (same for VSCode): the _private_ inner class > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has > yet another _public_ inner class, {{{}SizeSupplier{}}}. The public method > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} > has a parameter of that type. > The invocation of this method in > {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, > TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, > TaskMailbox)}} can be compiled with the javac compiler of the JDK, but fails > to compile with ecj: > {code:java} > The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the > target context is not visible here. > {code} > I tend to believe that the behavior of Eclipse's compiler is the correct one. > After all, you couldn't explicitly reference the {{SizeSupplier}} type either. > One possible fix would be to promote {{SizeSupplier}} to the same level as > {{{}SizeGauge{}}}. This would be source-compatible but not binary-compatible, > though. I.e. code compiled against the earlier signature of > {{registerMailboxSizeSupplier()}} would be broken with a > {{{}NoSuchMethodError{}}}. 
Depending on whether > {{registerMailboxSizeSupplier()}} are expected in client code or not, this > may or may not be acceptable. > Another fix would be to make {{SizeGauge}} public. I think that's the change > I'd do. Curious what other folks here think.
[jira] [Updated] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
[ https://issues.apache.org/jira/browse/FLINK-30454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30454: --- Labels: pull-request-available (was: ) > Inconsistent class hierarchy in TaskIOMetricGroup > - > > Key: FLINK-30454 > URL: https://issues.apache.org/jira/browse/FLINK-30454 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Gunnar Morling >Priority: Major > Labels: pull-request-available > > I noticed an interesting issue when trying to compile the flink-runtime > module with Eclipse (same for VSCode): the _private_ inner class > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has > yet another _public_ inner class, {{{}SizeSupplier{}}}. The public method > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} > has a parameter of that type. > The invocation of this method in > {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, > TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, > TaskMailbox)}} can be compiled with the javac compiler of the JDK, but fails > to compile with ecj: > {code:java} > The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the > target context is not visible here. > {code} > I tend to believe that the behavior of Eclipse's compiler is the correct one. > After all, you couldn't explicitly reference the {{SizeSupplier}} type either. > One possible fix would be to promote {{SizeSupplier}} to the same level as > {{{}SizeGauge{}}}. This would be source-compatible but not binary-compatible, > though. I.e. code compiled against the earlier signature of > {{registerMailboxSizeSupplier()}} would be broken with a > {{{}NoSuchMethodError{}}}. Depending on whether > {{registerMailboxSizeSupplier()}} are expected in client code or not, this > may or may not be acceptable. > Another fix would be to make {{SizeGauge}} public. 
I think that's the change > I'd do. Curious what other folks here think.
[GitHub] [flink] gunnarmorling opened a new pull request, #21531: FLINK-30454 [runtime] Fixing visibility issue for SizeGauge.SizeSupplier
gunnarmorling opened a new pull request, #21531: URL: https://github.com/apache/flink/pull/21531 ## What is the purpose of the change Making sure the runtime module can be compiled with ecj. I encountered two locations which seem inconsistent with the JLS but were still accepted by javac. ## Brief change log Reworked two code places to make it compile with ecj. Note the change around `SizeGauge.SizeSupplier` is source-compatible but not binary-compatible, based on the assumption that this code is not invoked in external client code. If that actually is the case, I'd rework the change to simply change the visibility of `SizeGauge` to public. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature: no - If yes, how is the feature documented: not applicable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
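The javac/ecj inconsistency this PR works around can be reproduced in a few lines. The names below mirror the ones in the FLINK-30454 report, but this is not the actual Flink source:

```java
// Minimal reproduction of the nested-visibility structure from FLINK-30454.
public class VisibilitySketch {
    private static class SizeGauge {
        // A *public* interface nested inside a *private* class:
        public interface SizeSupplier<R> {
            R get();
        }
    }

    // Per the report, javac accepts call sites of this method in other
    // classes, while ecj rejects them: the parameter's descriptor mentions
    // SizeGauge, which is not visible outside VisibilitySketch.
    public static int register(SizeGauge.SizeSupplier<Integer> supplier) {
        return supplier.get();
    }

    public static void main(String[] args) {
        // A lambda call site never has to name the nested type explicitly.
        System.out.println(register(() -> 42)); // 42
    }
}
```

Promoting `SizeSupplier` out of `SizeGauge` (the PR's approach) keeps call sites source-compatible, while making `SizeGauge` public would also keep them binary-compatible.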
[jira] [Commented] (FLINK-30455) Avoid "cleaning" of java.lang.String in ClosureCleaner
[ https://issues.apache.org/jira/browse/FLINK-30455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649371#comment-17649371 ] Gunnar Morling commented on FLINK-30455: [~rmetzger] et al., if you think this makes sense, could you assign this issue to me? Thanks! > Avoid "cleaning" of java.lang.String in ClosureCleaner > -- > > Key: FLINK-30455 > URL: https://issues.apache.org/jira/browse/FLINK-30455 > Project: Flink > Issue Type: Sub-task >Reporter: Gunnar Morling >Priority: Major > > When running a simple "hello world" example on JDK 17, I noticed the closure > cleaner tries to reflectively access the {{java.lang.String}} class, which > fails due to the strong encapsulation enabled by default in JDK 17 and > beyond. I don't think the closure cleaner needs to act on {{String}} at all, > as it doesn't contain any inner classes. Unless there are objections, I'll > provide a fix along those lines. With this change in place, I can run that > example on JDK 17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30455) Avoid "cleaning" of java.lang.String in ClosureCleaner
Gunnar Morling created FLINK-30455: -- Summary: Avoid "cleaning" of java.lang.String in ClosureCleaner Key: FLINK-30455 URL: https://issues.apache.org/jira/browse/FLINK-30455 Project: Flink Issue Type: Sub-task Reporter: Gunnar Morling When running a simple "hello world" example on JDK 17, I noticed the closure cleaner tries to reflectively access the {{java.lang.String}} class, which fails due to the strong encapsulation enabled by default in JDK 17 and beyond. I don't think the closure cleaner needs to act on {{String}} at all, as it doesn't contain any inner classes. Unless there are objections, I'll provide a fix along those lines. With this change in place, I can run that example on JDK 17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
[ https://issues.apache.org/jira/browse/FLINK-30454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649365#comment-17649365 ] Robert Metzger commented on FLINK-30454: I'm pretty sure these are not public classes. We use the @Public / @PublicEvolving / @Internal annotations to mark interface visibility in Flink. The TaskIOMetricGroup-stuff is, in my understanding anyways internal. > Inconsistent class hierarchy in TaskIOMetricGroup > - > > Key: FLINK-30454 > URL: https://issues.apache.org/jira/browse/FLINK-30454 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Gunnar Morling >Priority: Major > > I noticed an interesting issue when trying to compile the flink-runtime > module with Eclipse (same for VSCode): the _private_ inner class > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has > yet another _public_ inner class, {{{}SizeSupplier{}}}. The public method > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} > has a parameter of that type. > The invocation of this method in > {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, > TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, > TaskMailbox)}} can be compiled with the javac compiler of the JDK, but fails > to compile with ecj: > {code:java} > The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the > target context is not visible here. > {code} > I tend to believe that the behavior of Eclipse's compiler is the correct one. > After all, you couldn't explicitly reference the {{SizeSupplier}} type either. > One possible fix would be to promote {{SizeSupplier}} to the same level as > {{{}SizeGauge{}}}. This would be source-compatible but not binary-compatible, > though. I.e. code compiled against the earlier signature of > {{registerMailboxSizeSupplier()}} would be broken with a > {{{}NoSuchMethodError{}}}. 
Depending on whether > {{registerMailboxSizeSupplier()}} are expected in client code or not, this > may or may not be acceptable. > Another fix would be to make {{SizeGauge}} public. I think that's the change > I'd do. Curious what other folks here think.
[GitHub] [flink] pnowojski commented on a diff in pull request #20151: [FLINK-26803][checkpoint] Merging channel state files
pnowojski commented on code in PR #20151: URL: https://github.com/apache/flink/pull/20151#discussion_r1052319366 ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java: ## @@ -286,27 +277,32 @@ private void runWithChecks(RunnableWithException r) { } } -public void fail(JobID jobID, JobVertexID jobVertexID, int subtaskIndex, Throwable e) { +/** + * The throwable is just used for special subtask, other subtasks should fail by {@link + * CHANNEL_STATE_SHARED_STREAM_EXCEPTION}. Review Comment: nit: > The throwable is just used for specific subtask that triggered the failure. Other subtasks should fail by {@link CHANNEL_STATE_SHARED_STREAM_EXCEPTION} ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-30450) FileSystem supports exporting client-side metrics
[ https://issues.apache.org/jira/browse/FLINK-30450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649364#comment-17649364 ] Anton Ippolitov commented on FLINK-30450: - I am mostly interested in the checkpoint / savepoint side of things, the main idea is to monitor the pressure on the FS during these operations. I already have some server-side metrics for S3 etc but client-side metrics would offer a more granular view. > FileSystem supports exporting client-side metrics > - > > Key: FLINK-30450 > URL: https://issues.apache.org/jira/browse/FLINK-30450 > Project: Flink > Issue Type: New Feature > Components: FileSystems >Reporter: Hangxiang Yu >Priority: Major > > Client-side metrics, or job level metrics for filesystem could help us to > monitor filesystem more precisely. > Some metrics (like request rate , throughput, latency, retry count, etc) are > useful to monitor the network or client problem of checkpointing or other > access cases for a job. > Some filesystems like s3, s3-presto, gs have supported enabling some metrics, > these could be exported in the filesystem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-15786) Load connector code with separate classloader
[ https://issues.apache.org/jira/browse/FLINK-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser updated FLINK-15786: --- Priority: Major (was: Not a Priority) > Load connector code with separate classloader > - > > Key: FLINK-15786 > URL: https://issues.apache.org/jira/browse/FLINK-15786 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Reporter: Guowei Ma >Priority: Major > Labels: auto-deprioritized-major, auto-deprioritized-minor, > usability > > Currently, connector code can be seen as part of user code. Usually, users > only need to add the corresponding connector as a dependency and package it > in the user jar. This is convenient enough. > However, connectors usually need to interact with external systems and often > introduce heavy dependencies, there is a high possibility of a class conflict > of different connectors or the user code of the same job. For example, every > one or two weeks, we will receive issue reports relevant with connector class > conflict from our users. The problem can get worse when users want to analyze > data from different sources and write output to different sinks. > Using separate classloader to load the different connector code could resolve > the problem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30450) FileSystem supports exporting client-side metrics
[ https://issues.apache.org/jira/browse/FLINK-30450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649362#comment-17649362 ] Martijn Visser commented on FLINK-30450: With regards to this request, are you looking at FileSource / FileSink (to read and/or write files), or also looking at this from a checkpoint/savepoint perspective? If the former, it would be a duplicate of FLINK-28021 / FLINK-28117 imho. Exposing the Presto / Hadoop metrics might not be trivial given that these are loaded via the Flink plugin mechanism https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/plugins/ It would probably require a FLIP and a discussion on the Dev mailing list
[GitHub] [flink] pnowojski commented on a diff in pull request #20151: [FLINK-26803][checkpoint] Merging channel state files
pnowojski commented on code in PR #20151: URL: https://github.com/apache/flink/pull/20151#discussion_r1052295272 ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/channel/ChannelStateCheckpointWriter.java: ## @@ -18,280 +18,265 @@ package org.apache.flink.runtime.checkpoint.channel; import org.apache.flink.annotation.VisibleForTesting; -import org.apache.flink.runtime.checkpoint.channel.ChannelStateWriter.ChannelStateWriteResult; +import org.apache.flink.api.common.JobID; +import org.apache.flink.runtime.checkpoint.CheckpointException; import org.apache.flink.runtime.io.network.buffer.Buffer; import org.apache.flink.runtime.io.network.logger.NetworkActionsLogger; -import org.apache.flink.runtime.state.AbstractChannelStateHandle; +import org.apache.flink.runtime.jobgraph.JobVertexID; import org.apache.flink.runtime.state.AbstractChannelStateHandle.StateContentMetaInfo; import org.apache.flink.runtime.state.CheckpointStateOutputStream; import org.apache.flink.runtime.state.CheckpointStreamFactory; -import org.apache.flink.runtime.state.InputChannelStateHandle; -import org.apache.flink.runtime.state.ResultSubpartitionStateHandle; import org.apache.flink.runtime.state.StreamStateHandle; -import org.apache.flink.runtime.state.memory.ByteStreamStateHandle; import org.apache.flink.util.Preconditions; import org.apache.flink.util.function.RunnableWithException; import org.slf4j.Logger; import org.slf4j.LoggerFactory; +import javax.annotation.Nonnull; import javax.annotation.concurrent.NotThreadSafe; import java.io.DataOutputStream; import java.io.IOException; -import java.util.ArrayList; -import java.util.Collection; import java.util.HashMap; -import java.util.List; +import java.util.HashSet; import java.util.Map; -import java.util.Optional; -import java.util.concurrent.CompletableFuture; +import java.util.Objects; +import java.util.Set; -import static java.util.Collections.emptyList; -import static java.util.Collections.singletonList; -import static 
java.util.UUID.randomUUID; +import static org.apache.flink.runtime.checkpoint.CheckpointFailureReason.CHANNEL_STATE_SHARED_STREAM_EXCEPTION; import static org.apache.flink.runtime.state.CheckpointedStateScope.EXCLUSIVE; import static org.apache.flink.util.ExceptionUtils.findThrowable; import static org.apache.flink.util.ExceptionUtils.rethrow; +import static org.apache.flink.util.Preconditions.checkArgument; import static org.apache.flink.util.Preconditions.checkNotNull; import static org.apache.flink.util.Preconditions.checkState; -/** Writes channel state for a specific checkpoint-subtask-attempt triple. */ +/** Writes channel state for multiple subtasks of the same checkpoint. */ @NotThreadSafe class ChannelStateCheckpointWriter { private static final Logger LOG = LoggerFactory.getLogger(ChannelStateCheckpointWriter.class); private final DataOutputStream dataStream; private final CheckpointStateOutputStream checkpointStream; -private final ChannelStateWriteResult result; -private final Map inputChannelOffsets = new HashMap<>(); -private final Map resultSubpartitionOffsets = -new HashMap<>(); + +/** + * Indicates whether the current checkpoints of all subtasks have exception. If it's not null, + * the checkpoint will fail. + */ +private Throwable throwable; + private final ChannelStateSerializer serializer; private final long checkpointId; -private boolean allInputsReceived = false; -private boolean allOutputsReceived = false; private final RunnableWithException onComplete; -private final int subtaskIndex; -private final String taskName; + +// Subtasks that have not yet register writer result. 
+private final Set waitedSubtasks; + +private final Map pendingResults = new HashMap<>(); ChannelStateCheckpointWriter( -String taskName, -int subtaskIndex, -CheckpointStartRequest startCheckpointItem, +Set subtasks, +long checkpointId, CheckpointStreamFactory streamFactory, ChannelStateSerializer serializer, RunnableWithException onComplete) throws Exception { this( -taskName, -subtaskIndex, -startCheckpointItem.getCheckpointId(), -startCheckpointItem.getTargetResult(), +subtasks, +checkpointId, streamFactory.createCheckpointStateOutputStream(EXCLUSIVE), serializer, onComplete); } @VisibleForTesting ChannelStateCheckpointWriter( -String taskName, -int subtaskIndex, +Set subtasks, long checkpointId, -ChannelStateWriteResult result, CheckpointStateOutputStream stream, ChannelStateSerializer serializer,
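The core bookkeeping change in this diff — one writer shared across several subtasks of the same checkpoint, which completes only once every expected subtask has registered its result — can be sketched roughly as follows (class and method names are illustrative stand-ins, not the actual Flink classes; error handling and the stream/serializer plumbing are omitted):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the "wait for all subtasks" bookkeeping: the writer is created
// with the full set of expected subtasks and marks itself finished only
// after the last one registers its result.
class SharedChannelStateWriter {
    private final Set<String> waitingSubtasks;
    private boolean finished = false;

    SharedChannelStateWriter(Set<String> expectedSubtasks) {
        this.waitingSubtasks = new HashSet<>(expectedSubtasks);
    }

    /** Returns true once the last expected subtask has reported in. */
    boolean registerSubtaskResult(String subtask) {
        if (!waitingSubtasks.remove(subtask)) {
            throw new IllegalStateException("unknown or duplicate subtask: " + subtask);
        }
        if (waitingSubtasks.isEmpty()) {
            finished = true; // here the real writer would finalize the shared output stream
        }
        return finished;
    }
}
```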
[GitHub] [flink] Aitozi commented on pull request #21441: [hotfix] Remove the duplicated test case
Aitozi commented on PR #21441: URL: https://github.com/apache/flink/pull/21441#issuecomment-1357774211 Hi @twalthr could help take a look at this simple cleanup? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-30444) State recovery error not handled correctly and always causes JM failure
[ https://issues.apache.org/jira/browse/FLINK-30444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Morávek reassigned FLINK-30444: - Assignee: David Morávek > State recovery error not handled correctly and always causes JM failure > --- > > Key: FLINK-30444 > URL: https://issues.apache.org/jira/browse/FLINK-30444 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.16.0, 1.14.6, 1.15.3 >Reporter: Gyula Fora >Assignee: David Morávek >Priority: Critical > > When you submit a job in Application mode and try to restore from an > incompatible savepoint, the behaviour is very unexpected. > Even with the following config: > {noformat} > execution.shutdown-on-application-finish: false > execution.submit-failed-job-on-application-error: true{noformat} > the job goes into a FAILED state and the JobManager fails. In a Kubernetes > environment (when using the native Kubernetes integration) this means that > the JobManager is restarted automatically. > This means that if you have the job result store enabled, after the JM comes > back you will end up with an empty application cluster. > I think the correct behaviour would be, depending on the above-mentioned config: > 1. If there is a job recovery error and you have > execution.submit-failed-job-on-application-error configured, then the job > should show up as failed, and the JM should not exit (if > execution.shutdown-on-application-finish is false) > 2. If execution.shutdown-on-application-finish is true, then the JobManager > should exit cleanly as on a normal job terminal state and thus stop the > deployment in Kubernetes, preventing a JM restart cycle
[jira] [Commented] (FLINK-30450) FileSystem supports exporting client-side metrics
[ https://issues.apache.org/jira/browse/FLINK-30450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649357#comment-17649357 ] Anton Ippolitov commented on FLINK-30450: - Thank you for creating this ticket! To expand on what I mentioned in the original [email thread,|https://lists.apache.org/thread/dpgh6sh0r21sgohjxxbqtm2mrmjdolgr] we'd like to be able to see the following metrics: * Request rate (if possible tagged by HTTP method) * Request latency * Upload / download byte rates * Error rate (if possible tagged by error) - would be useful to track throttling errors from S3 for example * Retry count * Number of active connections As mentioned in the thread, the S3 Presto client already gathers these metrics [here|https://github.com/prestodb/presto/blob/0.272/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystemStats.java] but they are not exposed anywhere in Flink. The S3A client also has built-in [metrics|https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Metrics] that are actually already exposed via JMX when the client is used in Flink but it would obviously be great to standardize the way we expose FS metrics on the Flink side. I haven't looked into GCS or Azure Storage yet but definitely interested in metrics from these clients too. > FileSystem supports exporting client-side metrics > - > > Key: FLINK-30450 > URL: https://issues.apache.org/jira/browse/FLINK-30450 > Project: Flink > Issue Type: New Feature > Components: FileSystems >Reporter: Hangxiang Yu >Priority: Major > > Client-side metrics, or job level metrics for filesystem could help us to > monitor filesystem more precisely. > Some metrics (like request rate , throughput, latency, retry count, etc) are > useful to monitor the network or client problem of checkpointing or other > access cases for a job. > Some filesystems like s3, s3-presto, gs have supported enabling some metrics, > these could be exported in the filesystem. 
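Client-side byte and request counters of the kind requested in this thread could, in principle, be collected by wrapping the filesystem's streams. The following is a hedged sketch of that idea only — not Flink's metrics API (in Flink, such counters would be registered on a metric group):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: count bytes and read calls flowing through a filesystem input
// stream. Only the array-based read is instrumented, and only that variant
// is used below.
class CountingInputStream extends FilterInputStream {
    final AtomicLong bytesRead = new AtomicLong();
    final AtomicLong readCalls = new AtomicLong();

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        readCalls.incrementAndGet();
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead.addAndGet(n);
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        CountingInputStream in =
                new CountingInputStream(new ByteArrayInputStream(new byte[1024]));
        byte[] buf = new byte[256];
        while (in.read(buf, 0, buf.length) != -1) {
            // drain the stream
        }
        System.out.println(in.bytesRead.get()); // 1024
    }
}
```

The same wrapping approach could track request latency or error counts around an S3/GCS client call, which is what the server-side view cannot provide.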
[GitHub] [flink] flinkbot commented on pull request #21530: [FLINK-30376][table-planner] Introduce a new flink busy join reorder rule which based on greedy algorithm
flinkbot commented on PR #21530: URL: https://github.com/apache/flink/pull/21530#issuecomment-1357765946 ## CI report: * c139c227d20dd5e3406ac080d3efcc0e1c106e9d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-30376) Introduce a new flink busy join reorder rule which based on greedy algorithm
[ https://issues.apache.org/jira/browse/FLINK-30376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30376: --- Labels: pull-request-available (was: ) > Introduce a new flink busy join reorder rule which based on greedy algorithm > > > Key: FLINK-30376 > URL: https://issues.apache.org/jira/browse/FLINK-30376 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.17.0 >Reporter: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > Introducing a new Flink busy join reorder strategy which based on the greedy > algorithm. The old join reorder rule will also be the default join reorder > rule and the new busy join reorder strategy will be optional. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] swuferhong opened a new pull request, #21530: [FLINK-30376][table-planner] Introduce a new flink busy join reorder rule which based on greedy algorithm
swuferhong opened a new pull request, #21530: URL: https://github.com/apache/flink/pull/21530 ## What is the purpose of the change Introducing a new Flink busy join reorder strategy which is based on the greedy algorithm. The old join reorder rule will remain the default join reorder rule, and the new busy join reorder strategy will be optional. ## Brief change log - Introducing a new Flink busy join reorder strategy named `FlinkBusyJoinReorderRule` - Adding plan tests - Adding ITCase ## Verifying this change - Adding plan tests: `FlinkBusyJoinReorderRuleTest` - Adding ITCase: `JoinReorderITCase` in stream and batch mode ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? no docs
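A greedy join-reorder heuristic of the kind this PR describes can be sketched as follows. This is a simplified illustration under an assumed constant join selectivity; the actual `FlinkBusyJoinReorderRule` works on the planner's cost model and join conditions:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Greedy join reorder sketch: repeatedly merge the pair of join groups with
// the smallest estimated result size until all relations are joined.
class GreedyJoinReorder {
    static final double SELECTIVITY = 0.1; // stand-in for real statistics

    static List<String> order(LinkedHashMap<String, Double> cardinalities) {
        List<List<String>> groups = new ArrayList<>();
        List<Double> sizes = new ArrayList<>();
        for (Map.Entry<String, Double> e : cardinalities.entrySet()) {
            groups.add(new ArrayList<>(List.of(e.getKey())));
            sizes.add(e.getValue());
        }
        while (groups.size() > 1) {
            int bi = 0, bj = 1;
            double best = Double.MAX_VALUE;
            // Find the cheapest pair of groups to join next.
            for (int i = 0; i < groups.size(); i++) {
                for (int j = i + 1; j < groups.size(); j++) {
                    double est = sizes.get(i) * sizes.get(j) * SELECTIVITY;
                    if (est < best) { best = est; bi = i; bj = j; }
                }
            }
            groups.get(bi).addAll(groups.get(bj)); // merge the cheapest pair
            sizes.set(bi, best);
            groups.remove(bj);
            sizes.remove(bj);
        }
        return groups.get(0);
    }
}
```

For example, with estimated cardinalities A=100, B=10, C=1000 the heuristic joins A and B first (smallest estimated intermediate result), then attaches C.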
[GitHub] [flink] ramkrish86 commented on pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path
ramkrish86 commented on PR #21508: URL: https://github.com/apache/flink/pull/21508#issuecomment-1357757731 @flinkbot run azure
[GitHub] [flink] Aitozi commented on pull request #15137: [FLINK-21396][table] Resolve tables when calling Catalog.createTable/alterTable
Aitozi commented on PR #15137: URL: https://github.com/apache/flink/pull/15137#issuecomment-1357750969 Thanks @twalthr for your input. Because `TableEnvironment#getCatalog` leaks the internal `Catalog` to users, it's not compatible to enforce that the Catalog interacts with resolved tables only, right? Recently, I've been trying to migrate `TableSchema` to the new APIs in the Hive connector, and I ran into some problems when dealing with the HiveCatalog. The best way, I think, is to let it deal with the ResolvedCatalogTable directly. Now, with your notes, I think it makes sense to do an extra cast internally.
[jira] [Updated] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
[ https://issues.apache.org/jira/browse/FLINK-30454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunnar Morling updated FLINK-30454: --- Description: I noticed an interesting issue when trying to compile the flink-runtime module with Eclipse (same for VSCode): the _private_ inner class {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has yet another _public_ inner class, {{{}SizeSupplier{}}}. The public method {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} has a parameter of that type. The invocation of this method in {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, TaskMailbox)}} can be compiled with the javac compiler of the JDK, but fails to compile with ecj: {code:java} The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the target context is not visible here. {code} I tend to believe that the behavior of Eclipse's compiler is the correct one. After all, you couldn't explicitly reference the {{SizeSupplier}} type either. One possible fix would be to promote {{SizeSupplier}} to the same level as {{{}SizeGauge{}}}. This would be source-compatible but not binary-compatible, though. I.e. code compiled against the earlier signature of {{registerMailboxSizeSupplier()}} would be broken with a {{{}NoSuchMethodError{}}}. Depending on whether {{registerMailboxSizeSupplier()}} are expected in client code or not, this may or may not be acceptable. Another fix would be to make {{SizeGauge}} public. I think that's the change I'd do. Curious what other folks here think. was: I noticed an interesting issue when trying to compile the flink-runtime module with Eclipse (same for VSCode): the _private_ inner class {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has yet another _public_ inner class, {{SizeSupplier}}. 
The public method {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} has a parameter of that type. The invocation of this method in {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, TaskMailbox)}} can be compiled with the javac compiler of the JDK but fails to compile with ecj: {code} The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the target context is not visible here. {code} I tend to believe that the behavior of Eclipse's compiler is the correct one. After all, you couldn't explicitly reference the {{SizeSupplier}} type either. One possible fix would be to promote {{SizeSupplier}} to the same level as {{SizeGauge}}. This would be source-compatible but not binary-compatible, though. I.e. code compiled against the earlier signature of {{registerMailboxSizeSupplier()}} would be broken with a {{NoSuchMethodError}}. Depending on whether {{registerMailboxSizeSupplier()}} are expected in client code or not, this may or may not be acceptable. Another fix would be to make {{SizeGauge}} public. I think that's the change I'd do. Curious what other folks here think. > Inconsistent class hierarchy in TaskIOMetricGroup > - > > Key: FLINK-30454 > URL: https://issues.apache.org/jira/browse/FLINK-30454 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Gunnar Morling >Priority: Major > > I noticed an interesting issue when trying to compile the flink-runtime > module with Eclipse (same for VSCode): the _private_ inner class > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has > yet another _public_ inner class, {{{}SizeSupplier{}}}. The public method > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} > has a parameter of that type. 
> The invocation of this method in > {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, > TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, > TaskMailbox)}} can be compiled with the javac compiler of the JDK, but fails > to compile with ecj: > {code:java} > The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the > target context is not visible here. > {code} > I tend to believe that the behavior of Eclipse's compiler is the correct one. > After all, you couldn't explicitly reference the {{SizeSupplier}} type either. > One possible fix would be to promote {{SizeSupplier}} to the same level as > {{{}SizeGauge{}}}. This would be source-compatible but not binary-compatible, > though. I.e. code compiled against the earlier signature of > {{registerMailboxSizeSupplier()}} would be
[jira] [Comment Edited] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
[ https://issues.apache.org/jira/browse/FLINK-30454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649345#comment-17649345 ] Gunnar Morling edited comment on FLINK-30454 at 12/19/22 2:16 PM: -- [~rmetzger], curious about your thoughts on that compatibility question. was (Author: gunnar.morling): [~rmetzger] , curious about your thoughts on that compatibility question. > Inconsistent class hierarchy in TaskIOMetricGroup > - > > Key: FLINK-30454 > URL: https://issues.apache.org/jira/browse/FLINK-30454 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Gunnar Morling >Priority: Major > > I noticed an interesting issue when trying to compile the flink-runtime > module with Eclipse (same for VSCode): the _private_ inner class > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has > yet another _public_ inner class, {{SizeSupplier}}. The public method > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} > has a parameter of that type. The invocation of this method in > {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, > TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, > TaskMailbox)}} can be compiled with the javac compiler of the JDK but fails > to compile with ecj: > {code} > The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the > target context is not visible here. > {code} > I tend to believe that the behavior of Eclipse's compiler is the correct one. > After all, you couldn't explicitly reference the {{SizeSupplier}} type > either. One possible fix would be to promote {{SizeSupplier}} to the same > level as {{SizeGauge}}. This would be source-compatible but not > binary-compatible, though. I.e. code compiled against the earlier signature > of {{registerMailboxSizeSupplier()}} would be broken with a > {{NoSuchMethodError}}. 
Depending on whether {{registerMailboxSizeSupplier()}} > are expected in client code or not, this may or may not be acceptable. > Another fix would be to make {{SizeGauge}} public. I think that's the change > I'd do. Curious what other folks here think. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
Gunnar Morling created FLINK-30454: -- Summary: Inconsistent class hierarchy in TaskIOMetricGroup Key: FLINK-30454 URL: https://issues.apache.org/jira/browse/FLINK-30454 Project: Flink Issue Type: Improvement Components: Runtime / Metrics Reporter: Gunnar Morling I noticed an interesting issue when trying to compile the flink-runtime module with Eclipse (same for VSCode): the _private_ inner class {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has yet another _public_ inner class, {{SizeSupplier}}. The public method {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} has a parameter of that type. The invocation of this method in {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, TaskMailbox)}} can be compiled with the javac compiler of the JDK but fails to compile with ecj: {code} The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the target context is not visible here. {code} I tend to believe that the behavior of Eclipse's compiler is the correct one. After all, you couldn't explicitly reference the {{SizeSupplier}} type either. One possible fix would be to promote {{SizeSupplier}} to the same level as {{SizeGauge}}. This would be source-compatible but not binary-compatible, though. I.e. code compiled against the earlier signature of {{registerMailboxSizeSupplier()}} would be broken with a {{NoSuchMethodError}}. Depending on whether {{registerMailboxSizeSupplier()}} are expected in client code or not, this may or may not be acceptable. Another fix would be to make {{SizeGauge}} public. I think that's the change I'd do. Curious what other folks here think. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30454) Inconsistent class hierarchy in TaskIOMetricGroup
[ https://issues.apache.org/jira/browse/FLINK-30454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649345#comment-17649345 ] Gunnar Morling commented on FLINK-30454: [~rmetzger] , curious about your thoughts on that compatibility question. > Inconsistent class hierarchy in TaskIOMetricGroup > - > > Key: FLINK-30454 > URL: https://issues.apache.org/jira/browse/FLINK-30454 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Gunnar Morling >Priority: Major > > I noticed an interesting issue when trying to compile the flink-runtime > module with Eclipse (same for VSCode): the _private_ inner class > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.SizeGauge}} has > yet another _public_ inner class, {{SizeSupplier}}. The public method > {{org.apache.flink.runtime.metrics.groups.TaskIOMetricGroup.registerMailboxSizeSupplier(SizeSupplier)}} > has a parameter of that type. The invocation of this method in > {{org.apache.flink.streaming.runtime.tasks.StreamTask.StreamTask(Environment, > TimerService, UncaughtExceptionHandler, StreamTaskActionExecutor, > TaskMailbox)}} can be compiled with the javac compiler of the JDK but fails > to compile with ecj: > {code} > The type TaskIOMetricGroup.SizeGauge from the descriptor computed for the > target context is not visible here. > {code} > I tend to believe that the behavior of Eclipse's compiler is the correct one. > After all, you couldn't explicitly reference the {{SizeSupplier}} type > either. One possible fix would be to promote {{SizeSupplier}} to the same > level as {{SizeGauge}}. This would be source-compatible but not > binary-compatible, though. I.e. code compiled against the earlier signature > of {{registerMailboxSizeSupplier()}} would be broken with a > {{NoSuchMethodError}}. Depending on whether {{registerMailboxSizeSupplier()}} > are expected in client code or not, this may or may not be acceptable. 
> Another fix would be to make {{SizeGauge}} public. I think that's the change > I'd do. Curious what other folks here think.
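The visibility quirk Gunnar describes can be reproduced with a self-contained sketch. The names below are stand-ins for the Flink classes, not the real ones: a public method's parameter type is a public interface nested inside a private inner class, which javac tolerates at lambda call sites while ecj does not.

```java
// Stand-in for TaskIOMetricGroup: the parameter type of a public method is
// nested inside a *private* inner class.
class MetricGroup {
    private static class SizeGauge {
        public interface SizeSupplier<R> {
            R get();
        }
    }

    // Callers cannot name SizeGauge.SizeSupplier explicitly, but javac still
    // accepts lambda arguments here; ecj reports the descriptor type as not
    // visible at the call site.
    public int registerMailboxSizeSupplier(SizeGauge.SizeSupplier<Integer> supplier) {
        return supplier.get();
    }
}

public class DescriptorVisibility {
    public static void main(String[] args) {
        // Compiles with javac; ecj rejects this line with the error quoted
        // in the ticket.
        int size = new MetricGroup().registerMailboxSizeSupplier(() -> 42);
        System.out.println(size);
    }
}
```

Promoting `SizeSupplier` to a top-level nested type of `MetricGroup` (the fix discussed above) makes both compilers agree, at the cost of the binary-compatibility concern raised in the ticket.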
[jira] [Commented] (FLINK-28875) Add FlinkSessionJobControllerTest
[ https://issues.apache.org/jira/browse/FLINK-28875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649343#comment-17649343 ] Gyula Fora commented on FLINK-28875: merged to release-1.3 69f7e94808cbded27f2238bf3039b9d693009f4a > Add FlinkSessionJobControllerTest > - > > Key: FLINK-28875 > URL: https://issues.apache.org/jira/browse/FLINK-28875 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Gyula Fora >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.4.0, kubernetes-operator-1.3.1 > > > There are currently no tests for the FlinkSessionJobController, only unit > tests for the individual components of it. > We should add a larger integration style test similar to the > FlinkDeploymentControllerTest.
[jira] [Updated] (FLINK-28875) Add FlinkSessionJobControllerTest
[ https://issues.apache.org/jira/browse/FLINK-28875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-28875: --- Fix Version/s: kubernetes-operator-1.3.1
[GitHub] [flink-kubernetes-operator] gyfora merged pull request #486: [backport][FLINK-28875] Add FlinkSessionJobControllerTest
gyfora merged PR #486: URL: https://github.com/apache/flink-kubernetes-operator/pull/486
[GitHub] [flink] ramkrish86 commented on pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path
ramkrish86 commented on PR #21508: URL: https://github.com/apache/flink/pull/21508#issuecomment-1357728930 Pushed the changes. We will update here once the other tests are done.
[GitHub] [flink-connector-mongodb] Jiabao-Sun commented on a diff in pull request #1: [FLINK-6573][connectors/mongodb] Flink MongoDB Connector
Jiabao-Sun commented on code in PR #1: URL: https://github.com/apache/flink-connector-mongodb/pull/1#discussion_r1052217885 ## flink-connector-mongodb/src/test/java/org/apache/flink/connector/mongodb/table/converter/MongoConvertersTest.java: ## @@ -146,28 +146,31 @@ public void testConvertBsonToRowData() { GenericRowData.of( StringData.fromString(oid.toHexString()), StringData.fromString("string"), -StringData.fromString(uuid.toString()), +StringData.fromString( +"{\"_value\": {\"$binary\": {\"base64\": \"gR+qXamERr+L0IyxvO9daQ==\", \"subType\": \"04\"}}}"), 2, 3L, 4.1d, DecimalData.fromBigDecimal(new BigDecimal("5.1"), 10, 2), false, TimestampData.fromEpochMillis(now.getEpochSecond() * 1000), +TimestampData.fromEpochMillis(now.toEpochMilli()), StringData.fromString( -OffsetDateTime.ofInstant( - Instant.ofEpochMilli(now.toEpochMilli()), -ZoneOffset.UTC) -.format(ISO_OFFSET_DATE_TIME)), -StringData.fromString("/^9$/i"), -StringData.fromString("function() { return 10; }"), -StringData.fromString("function() { return 11; }"), -StringData.fromString("12"), -StringData.fromString(oid.toHexString()), +"{\"_value\": {\"$regularExpression\": {\"pattern\": \"^9$\", \"options\": \"i\"}}}"), Review Comment: No matter which form it is, it cannot be directly handed over to MongoDB for processing. The bson document stored in mongodb needs to be converted into `BsonDocument`. We need to extract and convert the `RowData` fileds stored as string types, and finally convert the entire `RowData` into `BsonDocument`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-30355) crictl causes long wait in e2e tests
[ https://issues.apache.org/jira/browse/FLINK-30355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser closed FLINK-30355. -- Fix Version/s: 1.17.0 Resolution: Fixed Fixed in master: f9557068cf203a2eabf31ffd3911c40373d7e9ce > crictl causes long wait in e2e tests > > > Key: FLINK-30355 > URL: https://issues.apache.org/jira/browse/FLINK-30355 > Project: Flink > Issue Type: Bug > Components: Test Infrastructure, Tests >Affects Versions: 1.17.0 >Reporter: Matthias Pohl >Assignee: Martijn Visser >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.17.0 > > > We observed strange behavior in the e2e test where the e2e test run times > out: > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43824=logs=bea52777-eaf8-5663-8482-18fbc3630e81=ae4f8708-9994-57d3-c2d7-b892156e7812=b2642e3a-5b86-574d-4c8a-f7e2842bfb14=7446 > The issue seems to be related to {{crictl}} again because we see the > following error message in multiple tests. No logs are produced afterwards > for ~30mins resulting in the overall test run taking too long: > {code} > Dec 09 08:55:39 crictl > fatal: destination path 'cri-dockerd' already exists and is not an empty > directory. > fatal: a branch named 'v0.2.3' already exists > mkdir: cannot create directory ‘bin’: File exists > Dec 09 09:26:41 fs.protected_regular = 0 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] MartijnVisser merged pull request #21520: [FLINK-30355][Kubernetes][Tests] Remove `cri-dockerd` if already cloned earlier
MartijnVisser merged PR #21520: URL: https://github.com/apache/flink/pull/21520
[GitHub] [flink-connector-mongodb] Jiabao-Sun commented on a diff in pull request #1: [FLINK-6573][connectors/mongodb] Flink MongoDB Connector
Jiabao-Sun commented on code in PR #1: URL: https://github.com/apache/flink-connector-mongodb/pull/1#discussion_r1052209447

## flink-connector-mongodb/src/test/java/org/apache/flink/connector/mongodb/table/converter/MongoConvertersTest.java: (quotes the same `testConvertBsonToRowData` hunk as the earlier comment)

Review Comment: The current BSON API only supports converting whole BSON documents. For a single BSON value, we would need to customize `JsonWriter` and `JsonReader`. So in the previous implementation we used a `_value` field to wrap the single BSON value into a BSON document, so that we can parse it easily.

```java
import org.bson.BsonDocument;
import org.bson.BsonRegularExpression;
import org.bson.json.JsonMode;
import org.bson.json.JsonWriterSettings;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

public class JsonConversionTest {

    @Test
    public void bsonToJsonTest() {
        BsonRegularExpression original = new BsonRegularExpression("regex", "i");
        BsonDocument wrapped = new BsonDocument("_value", original);

        String json =
                wrapped.toJson(
                        JsonWriterSettings.builder().outputMode(JsonMode.EXTENDED).build());
        // {"_value": {"$regularExpression": {"pattern": "regex", "options": "i"}}}
        System.out.println(json);

        BsonDocument parsed = BsonDocument.parse(json);
        BsonRegularExpression parsedRegularExpression = parsed.getRegularExpression("_value");
        assertEquals(parsedRegularExpression, original);
    }
}
```
[GitHub] [flink] MartijnVisser commented on pull request #21128: [FLINK-29710] Bump minimum supported Hadoop version to 2.10.2
MartijnVisser commented on PR #21128: URL: https://github.com/apache/flink/pull/21128#issuecomment-1357654602 @flinkbot run azure
[GitHub] [flink-connector-mongodb] Jiabao-Sun commented on a diff in pull request #1: [FLINK-6573][connectors/mongodb] Flink MongoDB Connector
Jiabao-Sun commented on code in PR #1: URL: https://github.com/apache/flink-connector-mongodb/pull/1#discussion_r1052185934

## flink-connector-mongodb/src/main/java/org/apache/flink/connector/mongodb/table/MongoDynamicTableFactory.java:

```diff
@@ -151,16 +146,11 @@ public DynamicTableSink createDynamicTableSink(Context context) {
         MongoConfiguration config = new MongoConfiguration(helper.getOptions());
         helper.validate();

-        ResolvedSchema schema = context.getCatalogTable().getResolvedSchema();
-        SerializableFunction keyExtractor =
-                MongoKeyExtractor.createKeyExtractor(schema);
-
         return new MongoDynamicTableSink(
                 getConnectionOptions(config),
                 getWriteOptions(config),
                 config.getSinkParallelism(),
-                context.getPhysicalRowDataType(),
-                keyExtractor);
+                context.getCatalogTable().getResolvedSchema());
```

Review Comment: OK
[GitHub] [flink-connector-mongodb] twalthr commented on a diff in pull request #1: [FLINK-6573][connectors/mongodb] Flink MongoDB Connector
twalthr commented on code in PR #1: URL: https://github.com/apache/flink-connector-mongodb/pull/1#discussion_r1052181728

## flink-connector-mongodb/src/main/java/org/apache/flink/connector/mongodb/table/MongoDynamicTableFactory.java: (quotes the same `createDynamicTableSink` hunk as the preceding comment)

Review Comment: Personally, I would keep the catalog's schema out of the sink. The sink has no schema, only a data type. Let's use a boolean `isUpsert` flag in the constructor.

## flink-connector-mongodb/src/test/java/org/apache/flink/connector/mongodb/table/converter/MongoConvertersTest.java: (quotes the same `testConvertBsonToRowData` hunk as the earlier comment)

Review Comment: Whether we need to customize a `JsonWriter` depends on the round-trip story: is it possible to read a RegExp as a string and write it out again as a string that MongoDB would correctly classify as a RegExp?
[jira] [Updated] (FLINK-27177) Implement Binary Sorted State
[ https://issues.apache.org/jira/browse/FLINK-27177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Etienne Chauchot updated FLINK-27177: - Description: Following the plan in [FLIP-220|https://cwiki.apache.org/confluence/x/Xo_FD] (was: Following the plan in [FLIP-220|[https://cwiki.apache.org/confluence/x/Xo_FD]|https://cwiki.apache.org/confluence/x/Xo_FD],]) > Implement Binary Sorted State > - > > Key: FLINK-27177 > URL: https://issues.apache.org/jira/browse/FLINK-27177 > Project: Flink > Issue Type: Sub-task > Components: Runtime / State Backends >Reporter: David Anderson >Priority: Major > > Following the plan in [FLIP-220|https://cwiki.apache.org/confluence/x/Xo_FD]
[GitHub] [flink] wangyang0918 commented on pull request #21527: [FLINK-27925] watch tm pod performance optimization
wangyang0918 commented on PR #21527: URL: https://github.com/apache/flink/pull/21527#issuecomment-1357635178 We need to add a test to guard this behavior.
[GitHub] [flink] flinkbot commented on pull request #21529: [DO NOT MERGE] Testing CI with Hadoop 3 upgrade
flinkbot commented on PR #21529: URL: https://github.com/apache/flink/pull/21529#issuecomment-1357633959

## CI report:

* d56fcee002d98d3a9b70818c0243cbb4856b3874 UNKNOWN

Bot commands: the @flinkbot bot supports the following commands:

- `@flinkbot run azure` re-run the last Azure build
[jira] [Assigned] (FLINK-27925) Avoid to create watcher without the resourceVersion
[ https://issues.apache.org/jira/browse/FLINK-27925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang reassigned FLINK-27925: - Assignee: ouyangwulin > Avoid to create watcher without the resourceVersion > --- > > Key: FLINK-27925 > URL: https://issues.apache.org/jira/browse/FLINK-27925 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Reporter: Aitozi >Assignee: ouyangwulin >Priority: Major > Labels: pull-request-available > Attachments: image-2022-12-19-20-19-41-303.png > > > Currently, we create the watcher in KubernetesResourceManager. But it do not > pass the resourceVersion parameter, it will trigger a request to etcd. It > will bring the burden to the etcd in large scale cluster (which have been > seen in our internal k8s cluster). More detail can be found > [here|https://kubernetes.io/docs/reference/using-api/api-concepts/#the-resourceversion-parameter] > > I think we could use the informer to improve it (which will spawn a > list-watch and maintain the resourceVersion internally)
[jira] [Commented] (FLINK-27925) Avoid to create watcher without the resourceVersion
[ https://issues.apache.org/jira/browse/FLINK-27925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649317#comment-17649317 ] Yang Wang commented on FLINK-27925: --- Thanks [~ouyangwuli] for sharing the investigation. +1 for adding the resourceVersion=0 when doing the list pods.
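The effect of that parameter can be illustrated with a small, self-contained sketch. Note this is a hypothetical helper, not Flink's actual Kubernetes client code; the class, method names, and URL base are illustrative. Leaving `resourceVersion` unset forces a consistent quorum read from etcd, while `resourceVersion=0` lets the apiserver answer from its watch cache:

```java
// Hypothetical sketch of the REST query a Kubernetes client ends up issuing
// when listing pods; not Flink's real client code.
public class ResourceVersionSketch {

    // Builds the "list pods" URL. With resourceVersion="0" the apiserver may
    // serve the list from its cache; with it unset, it does a quorum read on etcd.
    static String listPodsUrl(String namespace, String resourceVersion) {
        String base = "https://kubernetes.default.svc/api/v1/namespaces/" + namespace + "/pods";
        return resourceVersion == null ? base : base + "?resourceVersion=" + resourceVersion;
    }

    public static void main(String[] args) {
        System.out.println(listPodsUrl("flink", null)); // consistent, etcd-backed read
        System.out.println(listPodsUrl("flink", "0"));  // cache-backed read, cheaper at scale
    }
}
```

At large scale the difference matters because every consistent list is a quorum read against etcd, which is exactly the bottleneck described in this issue.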
[GitHub] [flink] MartijnVisser opened a new pull request, #21529: [DO NOT MERGE] Testing CI with Hadoop 3 upgrade
MartijnVisser opened a new pull request, #21529: URL: https://github.com/apache/flink/pull/21529 This PR is only to validate if https://github.com/apache/flink/pull/21128 compiles with the Hadoop 3 profile
[GitHub] [flink-connector-aws] boring-cyborg[bot] commented on pull request #41: [FLINK-30224][Connectors/Kinesis] An IT test for slow FlinKinesisConsumer's run() which caused an NPE in close
boring-cyborg[bot] commented on PR #41: URL: https://github.com/apache/flink-connector-aws/pull/41#issuecomment-1357608490 Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)
[GitHub] [flink-connector-aws] astamur opened a new pull request, #41: [FLINK-30224][Connectors/Kinesis] An IT test for slow FlinKinesisConsumer's run() which caused an NPE in close
astamur opened a new pull request, #41: URL: https://github.com/apache/flink-connector-aws/pull/41

## What is the purpose of the change

[FLINK-30224](https://issues.apache.org/jira/projects/FLINK/issues/FLINK-30224)
- An IT test was added which ensures that the issue with a slow KDS consumer initialization doesn't cause an NPE during cancellation;
- Previously the issue itself was fixed in [FLINK-29324](https://issues.apache.org/jira/projects/FLINK/issues/FLINK-29324).

## Brief change log

Improvement. An IT test was added which checks this fix: [FLINK-29324](https://issues.apache.org/jira/projects/FLINK/issues/FLINK-29324)

## Verifying this change

This change only adds a new test. Azure build succeeded.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): **No**
- The public API, i.e., is any changed class annotated with @Public(Evolving): **No**
- The serializers: **No**
- The runtime per-record code paths (performance sensitive): **No**
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **No**
- The S3 file system connector: **No**

## Documentation

- Does this pull request introduce a new feature? **No**
- If yes, how is the feature documented? not applicable
[jira] [Comment Edited] (FLINK-27925) Avoid to create watcher without the resourceVersion
[ https://issues.apache.org/jira/browse/FLINK-27925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649232#comment-17649232 ] ouyangwulin edited comment on FLINK-27925 at 12/19/22 12:31 PM:

In the case of large-scale starting and stopping of jobs, constantly reading data from etcd can make etcd performance a bottleneck. I agree with [~wangyang0918] that using an informer will increase memory pressure. Instead, we can set resourceVersion=0 in the watcher to reduce the data read from etcd. As [https://kubernetes.io/docs/reference/using-api/api-concepts/#the-resourceversion-parameter] and the code screenshot describe:

1. resourceVersion unset means "Most Recent": the returned data must be consistent (in detail: served from etcd via a quorum read).
2. resourceVersion="0" means "Any": return data at any resource version. The newest available resource version is preferred, but strong consistency is not required; data at any resource version may be served. It is possible for the request to return data at a much older resource version than the client has previously observed, particularly in high-availability configurations, due to partitions or stale caches.

!image-2022-12-19-20-19-41-303.png!

was (Author: ouyangwuli): In the case of large-scale start and stop jobs, constantly reading data from etcd can cause a bottleneck in etcd performance. I agree with [~wangyang0918] using informer will increase memory pressure. We can increase resourceversion=0 in watch to reduce data read from etcd.
[jira] [Updated] (FLINK-30423) Introduce cast executor codegen for column type evolution
[ https://issues.apache.org/jira/browse/FLINK-30423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30423: --- Labels: pull-request-available (was: ) > Introduce cast executor codegen for column type evolution > - > > Key: FLINK-30423 > URL: https://issues.apache.org/jira/browse/FLINK-30423 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Shammon >Priority: Major > Labels: pull-request-available > > Introduce cast executor codegen for column type evolution
[GitHub] [flink-table-store] zjureel opened a new pull request, #441: [FLINK-30423] Introduce codegen for CastExecutor
zjureel opened a new pull request, #441: URL: https://github.com/apache/flink-table-store/pull/441 This PR aims to introduce codegen for `CastExecutor`; the main code is copied from Flink's `CastRule` and related classes.
[GitHub] [flink-connector-mongodb] Jiabao-Sun commented on a diff in pull request #1: [FLINK-6573][connectors/mongodb] Flink MongoDB Connector
Jiabao-Sun commented on code in PR #1: URL: https://github.com/apache/flink-connector-mongodb/pull/1#discussion_r1052145855

## flink-connector-mongodb/src/test/java/org/apache/flink/connector/mongodb/table/converter/MongoConvertersTest.java: (quotes the same `testConvertBsonToRowData` hunk as the earlier comment)

Review Comment: Thanks @twalthr. The `JsonWriter` cannot directly write a `BsonValue` into a string. It throws an exception when writing a bare `BsonValue`, so we used `_value` to wrap the BSON value into a BSON document. However, we could also extend `JsonWriter` so that it does not perform this check when writing a BSON value directly. Do you think we need to customize a `JsonWriter`?

```java
package org.apache.flink;

import org.bson.BsonRegularExpression;
import org.bson.json.JsonWriter;

import java.io.IOException;
import java.io.StringWriter;

public class JsonWriterTest {

    public static void main(String[] args) throws IOException {
        try (StringWriter stringWriter = new StringWriter();
                JsonWriter jsonWriter = new JsonWriter(stringWriter)) {
            BsonRegularExpression regularExpression = new BsonRegularExpression("regex", "i");
            jsonWriter.writeRegularExpression(regularExpression);
        }
    }
}
```

```shell
Exception in thread "main" org.bson.BsonInvalidOperationException: A RegularExpression value cannot be written to the root level of a BSON document.
	at org.bson.AbstractBsonWriter.throwInvalidState(AbstractBsonWriter.java:740)
	at org.bson.AbstractBsonWriter.checkPreconditions(AbstractBsonWriter.java:701)
	at org.bson.AbstractBsonWriter.writeRegularExpression(AbstractBsonWriter.java:590)
	at org.apache.flink.JsonWriterTest.main(JsonWriterTest.java:15)
```
[jira] [Resolved] (FLINK-29996) Link to the task manager's thread dump page in the backpressure tab
[ https://issues.apache.org/jira/browse/FLINK-29996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yun Tang resolved FLINK-29996. -- Resolution: Fixed merged in master: 3c043cfd68fd52135175d4ae3d65a1e1909af133 > Link to the task manager's thread dump page in the backpressure tab > --- > > Key: FLINK-29996 > URL: https://issues.apache.org/jira/browse/FLINK-29996 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Reporter: Yun Tang >Assignee: Yu Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > Currently, we have a complex steps to find the thread dump of backpressured > tasks, however, this could be simplified with a link in the backpressure tab > of web UI.
[GitHub] [flink] Myasuka merged pull request #21465: [FLINK-29996] Link to the task manager's thread dump page in the back…
Myasuka merged PR #21465: URL: https://github.com/apache/flink/pull/21465