[GitHub] [flink] xintongsong commented on a diff in pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path

2023-01-08 Thread GitBox


xintongsong commented on code in PR #21508:
URL: https://github.com/apache/flink/pull/21508#discussion_r1064353993


##
flink-filesystems/flink-azure-fs-hadoop/src/main/java/org/apache/flink/fs/azurefs/AzureBlobFsRecoverableDataOutputStream.java:
##
@@ -0,0 +1,303 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.fs.azurefs;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.core.fs.RecoverableFsDataOutputStream;
+import org.apache.flink.core.fs.RecoverableWriter;
+import 
org.apache.flink.runtime.fs.hdfs.BaseHadoopFsRecoverableFsDataOutputStream;
+import org.apache.flink.runtime.fs.hdfs.HadoopFsRecoverable;
+
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * An implementation of the {@link RecoverableFsDataOutputStream} for 
AzureBlob's file system
+ * abstraction.
+ */
+@Internal
+public class AzureBlobFsRecoverableDataOutputStream
+extends BaseHadoopFsRecoverableFsDataOutputStream {
+
+private static final Logger LOG =
+
LoggerFactory.getLogger(AzureBlobFsRecoverableDataOutputStream.class);
+private static final String RENAME = ".rename";
+
+// Not final to override in tests
+public static int minBufferLength = 2097152;

Review Comment:
   We use `@VisibleForTesting` for fields that have broader visibility then 
needed for testing purpose.



##
flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/BaseHadoopFsRecoverableFsDataOutputStream.java:
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.fs.hdfs;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.core.fs.RecoverableFsDataOutputStream;
+import org.apache.flink.core.fs.RecoverableWriter;
+
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+import java.io.IOException;
+
+/** Base class for ABFS and Hadoop recoverable stream. */
+@Internal
+public abstract class BaseHadoopFsRecoverableFsDataOutputStream
+extends RecoverableFsDataOutputStream {
+
+protected FileSystem fs;
+
+protected Path targetFile;
+
+protected Path tempFile;
+
+protected FSDataOutputStream out;
+
+// In ABFS outputstream we need to add this to the current pos
+protected long initialFileSize = 0;
+
+public long getPos() throws IOException {
+return out.getPos();
+}
+
+@Override
+public void write(int b) throws IOException {
+out.write(b);
+}
+
+@Override
+public void write(byte[] b, int off, int len) throws IOException {
+out.write(b, off, len);
+}
+
+@Override
+public abstract void flush() throws IOException;
+
+@Override
+public void sync() throws IOException {
+out.hflush();
+out.hsync();
+}

Review Comment:
   1. IIUC, @anoopsjohn 's comment was that you can call `hsync` without 
calling `hflush` in `sync()` for ABFS. It was not about ABFS should call 
`hsync` in `flush()`. 

[jira] [Comment Edited] (FLINK-30489) flink-sql-connector-pulsar doesn't shade all dependencies

2023-01-08 Thread Yufan Sheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652982#comment-17652982
 ] 

Yufan Sheng edited comment on FLINK-30489 at 1/9/23 7:56 AM:
-

We didn't shade these dependencies because these dependencies are not always 
required for Pulsar connector. Such as {{org.bouncycastel}}, you will need it 
only if you use the end-to-end encryption.

But we can truly shade all the dependencies.

{{org.sfl4j}} dependency will be removed in the next Pulsar 2.11 release. So no 
need to shade it.


was (Author: syhily):
We didn't shade these dependencies because these dependencies are not always 
required for Pulsar connector. Such as {{org.bouncycastel}}, you will need it 
only if you use the end-to-end encryption.

But we can truly shade all the dependencies.

> flink-sql-connector-pulsar doesn't shade all dependencies
> -
>
> Key: FLINK-30489
> URL: https://issues.apache.org/jira/browse/FLINK-30489
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.0
>Reporter: Henri Yandell
>Priority: Major
>
> Looking at 
> [https://repo1.maven.org/maven2/org/apache/flink/flink-sql-connector-pulsar/1.16.0/flink-sql-connector-pulsar-1.16.0.jar]
>  I'm seeing that some dependencies are shaded (com.fasterxml, com.yahoo etc), 
> but others are not (org.sfl4j, org.bouncycastel, com.scurrilous, ...) and 
> will presumably clash with other jar files.
> Additionally, this bundling is going on in the '.jar' file rather than in a 
> more clearly indicated separate -bundle or -shaded jar file. 
> As a jar file this seems confusing and potentially bug inducing; though I 
> note I'm just a review of the jar and not Flink experienced.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30508) CliClientITCase.testSqlStatements failed with output not matched with expected

2023-01-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655957#comment-17655957
 ] 

Matthias Pohl commented on FLINK-30508:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44552=logs=f2c100be-250b-5e85-7bbe-176f68fcddc5=05efd11e-5400-54a4-0d27-a4663be008a9=13186

> CliClientITCase.testSqlStatements failed with output not matched with expected
> --
>
> Key: FLINK-30508
> URL: https://issues.apache.org/jira/browse/FLINK-30508
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.17.0
>Reporter: Qingsheng Ren
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44246=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=14992



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30508) CliClientITCase.testSqlStatements failed with output not matched with expected

2023-01-08 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl updated FLINK-30508:
--
Labels: test-stability  (was: )

> CliClientITCase.testSqlStatements failed with output not matched with expected
> --
>
> Key: FLINK-30508
> URL: https://issues.apache.org/jira/browse/FLINK-30508
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.17.0
>Reporter: Qingsheng Ren
>Priority: Major
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44246=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=14992



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30525) Cannot open jobmanager configuration web page

2023-01-08 Thread Weihua Hu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655952#comment-17655952
 ] 

Weihua Hu commented on FLINK-30525:
---

[~renqs] Can you take a look at this? This will affect the release of 1.17

> Cannot open jobmanager configuration web page
> -
>
> Key: FLINK-30525
> URL: https://issues.apache.org/jira/browse/FLINK-30525
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Affects Versions: 1.17.0
>Reporter: Weihua Hu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-12-28-20-37-00-825.png, 
> image-2022-12-28-20-37-05-551.png
>
>
> we remove the environments in rest api in 
> https://issues.apache.org/jira/browse/FLINK-30116.
> The jobmanager.configuration web page will throw "TypeError: Cannot read 
> properties of undefined (reading 'length')" 
> the environment in jobmanager.configuration web page should be delete too.
>  !image-2022-12-28-20-37-00-825.png! 
>  !image-2022-12-28-20-37-05-551.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28424) JdbcExactlyOnceSinkE2eTest hangs on Azure

2023-01-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655951#comment-17655951
 ] 

Matthias Pohl commented on FLINK-28424:
---

https://issues.apache.org/jira/browse/FLINK-28424?filter=-1=project%20%3D%20FLINK%20AND%20text%20~%20%22JdbcExactlyOnceSinkE2eTest%22%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved)

> JdbcExactlyOnceSinkE2eTest hangs on Azure
> -
>
> Key: FLINK-28424
> URL: https://issues.apache.org/jira/browse/FLINK-28424
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / JDBC
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Martijn Visser
>Priority: Major
>  Labels: auto-deprioritized-critical, test-stability
>
> {code:java}
> 2022-07-06T07:10:57.8133295Z 
> ==
> 2022-07-06T07:10:57.8137200Z === WARNING: This task took already 95% of the 
> available time budget of 232 minutes ===
> 2022-07-06T07:10:57.8140723Z 
> ==
> 2022-07-06T07:10:57.8186584Z 
> ==
> 2022-07-06T07:10:57.8187530Z The following Java processes are running (JPS)
> 2022-07-06T07:10:57.8188571Z 
> ==
> 2022-07-06T07:10:58.2136012Z 825016 Jps
> 2022-07-06T07:10:58.2136438Z 34359 surefirebooter8568713056714319310.jar
> 2022-07-06T07:10:58.2136774Z 525 Launcher
> 2022-07-06T07:10:58.2240260Z 
> ==
> 2022-07-06T07:10:58.2240814Z Printing stack trace of Java process 825016
> 2022-07-06T07:10:58.2241256Z 
> ==
> 2022-07-06T07:10:58.4498109Z 825016: No such process
> 2022-07-06T07:10:58.4524779Z 
> ==
> 2022-07-06T07:10:58.4525272Z Printing stack trace of Java process 34359
> 2022-07-06T07:10:58.4525713Z 
> ==
> 2022-07-06T07:10:58.6399085Z 2022-07-06 07:10:58
> 2022-07-06T07:10:58.6400425Z Full thread dump OpenJDK 64-Bit Server VM 
> (25.292-b10 mixed mode):
> 2022-07-06T07:10:58.7332738Z "Legacy Source Thread - Source: Custom Source -> 
> Map -> Sink: Unnamed (1/4)#44585" #870775 prio=5 os_prio=0 
> tid=0x7fca5c06f800 nid=0xc3c26 waiting on condition [0x7fca503b1000]
> 2022-07-06T07:10:58.786Zjava.lang.Thread.State: WAITING (parking)
> 2022-07-06T07:10:58.7333759Z  at sun.misc.Unsafe.park(Native Method)
> 2022-07-06T07:10:58.7334404Z  - parking to wait for  <0xd5998448> (a 
> java.util.concurrent.CountDownLatch$Sync)
> 2022-07-06T07:10:58.7334943Z  at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 2022-07-06T07:10:58.7335605Z  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> 2022-07-06T07:10:58.7336392Z  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> 2022-07-06T07:10:58.7337195Z  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> 2022-07-06T07:10:58.7337966Z  at 
> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> 2022-07-06T07:10:58.7338677Z  at 
> org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest$TestEntrySource.waitForConsumers(JdbcExactlyOnceSinkE2eTest.java:314)
> 2022-07-06T07:10:58.7339566Z  at 
> org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest$TestEntrySource.run(JdbcExactlyOnceSinkE2eTest.java:300)
> 2022-07-06T07:10:58.7340281Z  at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110)
> 2022-07-06T07:10:58.7340883Z  at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:67)
> 2022-07-06T07:10:58.7341583Z  at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:333)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37685=logs=075127ba-54d5-54b0-cccf-6a36778b332d=c35a13eb-0df9-505f-29ac-8097029d4d79=14871



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29405) InputFormatCacheLoaderTest is unstable

2023-01-08 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655950#comment-17655950
 ] 

Matthias Pohl commented on FLINK-29405:
---

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44553=logs=ce3801ad-3bd5-5f06-d165-34d37e757d90=5e4d9387-1dcc-5885-a901-90469b7e6d2f=11690

> InputFormatCacheLoaderTest is unstable
> --
>
> Key: FLINK-29405
> URL: https://issues.apache.org/jira/browse/FLINK-29405
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Chesnay Schepler
>Assignee: Alexander Smirnov
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably 
> when run in a loop.
> {code}
> java.lang.AssertionError: 
> Expecting AtomicInteger(0) to have value:
>   0
> but did not.
>   at 
> org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #21622: [FLINK-30570] Fix error inferred partition condition

2023-01-08 Thread GitBox


flinkbot commented on PR #21622:
URL: https://github.com/apache/flink/pull/21622#issuecomment-1375214257

   
   ## CI report:
   
   * fbfa489271fd7ede903d76682fb09565466bf082 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30570) RexNodeExtractor#isSupportedPartitionPredicate generates unexpected partition predicates

2023-01-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30570:
---
Labels: pull-request-available  (was: )

> RexNodeExtractor#isSupportedPartitionPredicate generates unexpected partition 
> predicates
> 
>
> Key: FLINK-30570
> URL: https://issues.apache.org/jira/browse/FLINK-30570
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Aitozi
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the condition {{where rand(1) < 0.0001}} will be recognized as a 
> partition predicates and will be evaluated to false when compiling the SQL. 
> It has two problem. 
> First, it should not be recognized as a partition predicates, and the 
> nondeterministic function should never pass the partition pruner 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] Aitozi opened a new pull request, #21622: [FLINK-30570] Fix error inferred partition condition

2023-01-08 Thread GitBox


Aitozi opened a new pull request, #21622:
URL: https://github.com/apache/flink/pull/21622

   ## What is the purpose of the change
   
   This PR is meant to fix the unexpected partition pruning. It fix in two 
parts:
   
   - The partition condition should contains the InputRef of the partition 
field. Otherwise, it will be grouped to non-partition predicates
   - The partition condition should only accept the deterministic predicates.
   
   ## Verifying this change
   
   Some tests are added to verify it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30603) CompactActionITCase in table store is unstable

2023-01-08 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655945#comment-17655945
 ] 

Jingsong Lee commented on FLINK-30603:
--

[~TsReaper] CC

> CompactActionITCase in table store is unstable
> --
>
> Key: FLINK-30603
> URL: https://issues.apache.org/jira/browse/FLINK-30603
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.4.0
>Reporter: Shammon
>Priority: Major
>
> https://github.com/apache/flink-table-store/actions/runs/3871130631/jobs/6598625030
> [INFO] Results:
> [INFO] 
> Error:  Failures: 
> Error:CompactActionITCase.testStreamingCompact:187 expected:<[+I 
> 1|100|15|20221208, +I 1|100|15|20221209]> but was:<[+I 1|100|15|20221209]>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] xintongsong commented on a diff in pull request #21347: [FLINK-29640][Client/Job Submission]Enhance the function configured by execution.shutdown-on-attached-exi…

2023-01-08 Thread GitBox


xintongsong commented on code in PR #21347:
URL: https://github.com/apache/flink/pull/21347#discussion_r1064320554


##
flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java:
##
@@ -1756,6 +1756,9 @@ public final class ConfigConstants {
 /** The user lib directory name. */
 public static final String DEFAULT_FLINK_USR_LIB_DIR = "usrlib";
 
+/** The initial client timeout when submitting the job. */
+public static final String INITIAL_CLIENT_HEARTBEAT_TIMEOUT = 
"initialClientHeartbeatTimeout";

Review Comment:
   `ConfigConstants` is public api annotated with `@Public`. We should not add 
this internal config key here.
   
   I think it would be good enough to introduce the config key as a constant in 
`JobGraph`.



##
flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java:
##
@@ -206,4 +206,14 @@ default CompletableFuture> 
listCompletedClusterDatasetIds() {
 default CompletableFuture invalidateClusterDataset(AbstractID 
clusterDatasetId) {
 return CompletableFuture.completedFuture(null);
 }
+
+/**
+ * The client reports the heartbeat to the dispatcher for aliveness.
+ *
+ * @param jobId The jobId for the client and the job.
+ * @return
+ */
+default CompletableFuture reportHeartbeat(JobID jobId, long 
expiredTimestamp) {
+return CompletableFuture.completedFuture(null);

Review Comment:
   ```suggestion
   return FutureUtils.completedVoidFuture();
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30561) ChangelogStreamHandleReaderWithCache cause FileNotFoundException

2023-01-08 Thread Feifan Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feifan Wang updated FLINK-30561:

Description: 
When a job with state changelog enabled continues to restart, the following 
exceptions may occur :
{code:java}
java.lang.RuntimeException: java.io.FileNotFoundException: 
/data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp
 (No such file or directory)
    at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
    at 
org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87)
    at 
org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69)
    at 
org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107)
    at 
org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78)
    at 
org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94)
    at 
org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136)
    at 
org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336)
    at 
org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
    at 
org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
    at 
org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353)
    at 
org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165)
    at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265)
    at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669)
    at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: 
/data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp
 (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.init(FileInputStream.java:138)
    at 
org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:158)
    at 
org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:95)
    at 
org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42)
    at 
org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85)
    ... 21 more {code}
*Problem causes:*
 # *_ChangelogStreamHandleReaderWithCache_* use RefCountedFile manager local 
cache file. The reference count is incremented when the input stream is opened 
from the cache file, and decremented by one when the input stream is closed. So 
the input stream must be closed and only once.
 # _*StateChangelogHandleStreamHandleReader#getChanges()*_ may cause the input 
stream to be closed twice. This happens when changeIterator.read(tuple2.f0, 
tuple2.f1) throws an exception (for example, when the task is canceled for 
other reasons during the restore process) the current state change iterator 
will be closed twice.

{code:java}
private void advance() {
while (!current.hasNext() && handleIterator.hasNext()) {
try {
current.close();
Tuple2 tuple2 = handleIterator.next();
LOG.debug("read at {} from {}", tuple2.f1, tuple2.f0);
current = 

[jira] [Updated] (FLINK-30471) Optimize the enriching network memory process in SsgNetworkMemoryCalculationUtils

2023-01-08 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-30471:
--
Description: 
In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, getting PartitionTypes 
is run in a separate loop, which is not friendly to performance.  If we want to 
add inputPartitionTypes in the subsequential PR, a new separate loop may be 
introduced too, which I think is not a good choice.

Using a separate loop to get each collection just looks simpler in code style, 
but it will affect the performance. We can get all the results of 
maxSubpartitionNums and partitionTypes through one loop instead of multiple 
loops, which will be faster. In this way, when we need to add 
inputPartitionTypes later, we do not need to add a new loop logic.

  was:
In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, getting PartitionTypes 
is run in a separate loop, which is not friendly to performance. If we want to 
get inputPartitionTypes, a new separate loop may be introduced too. 

It just looks simpler in code, but it will affect the performance. We can get 
all the results through one loop instead of multiple loops, which will be 
faster.


> Optimize the enriching network memory process in 
> SsgNetworkMemoryCalculationUtils
> -
>
> Key: FLINK-30471
> URL: https://issues.apache.org/jira/browse/FLINK-30471
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.17.0
>Reporter: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, getting 
> PartitionTypes is run in a separate loop, which is not friendly to 
> performance.  If we want to add inputPartitionTypes in the subsequential PR, 
> a new separate loop may be introduced too, which I think is not a good choice.
> Using a separate loop to get each collection just looks simpler in code 
> style, but it will affect the performance. We can get all the results of 
> maxSubpartitionNums and partitionTypes through one loop instead of multiple 
> loops, which will be faster. In this way, when we need to add 
> inputPartitionTypes later, we do not need to add a new loop logic.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30561) ChangelogStreamHandleReaderWithCache cause FileNotFoundException

2023-01-08 Thread Feifan Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feifan Wang updated FLINK-30561:

Description: 
When a job with state changelog enabled continues to restart, the following 
exceptions may occur :
{code:java}
java.lang.RuntimeException: java.io.FileNotFoundException: 
/data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp
 (No such file or directory)
    at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
    at 
org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87)
    at 
org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69)
    at 
org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107)
    at 
org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78)
    at 
org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94)
    at 
org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136)
    at 
org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336)
    at 
org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
    at 
org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
    at 
org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353)
    at 
org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165)
    at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265)
    at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702)
    at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669)
    at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: 
/data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp
 (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.init(FileInputStream.java:138)
    at 
org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:158)
    at 
org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:95)
    at 
org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42)
    at 
org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85)
    ... 21 more {code}
*Problem causes:*
 # *_ChangelogStreamHandleReaderWithCache_* use RefCountedFile manager local 
cache file. The reference count is incremented when the input stream is opened 
from the cache file, and decremented by one when the input stream is closed. So 
the input stream must be closed and only once.
 # _*StateChangelogHandleStreamHandleReader#getChanges()*_ may cause the input 
stream to be closed twice. This happens when changeIterator.read(tuple2.f0, 
tuple2.f1) throws an exception (for example, when the task is canceled for 
other reasons during the restore process) the current state change iterator 
will be closed twice.

{code:java}
private void advance() {
while (!current.hasNext() && handleIterator.hasNext()) {
try {
current.close();
Tuple2 tuple2 = handleIterator.next();
LOG.debug("read at {} from {}", tuple2.f1, tuple2.f0);
current = 

[jira] [Closed] (FLINK-30602) Remove FileStoreTableITCase in table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-30602.

Fix Version/s: table-store-0.4.0
 Assignee: Shammon
   Resolution: Fixed

master: b27fa51ed05bca3596ef1ba5131e97921ec01e33

> Remove FileStoreTableITCase in table store
> --
>
> Key: FLINK-30602
> URL: https://issues.apache.org/jira/browse/FLINK-30602
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.4.0
>Reporter: Shammon
>Assignee: Shammon
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.4.0
>
>
> Remove `FileStoreTableITCase` in table store



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] JingsongLi merged pull request #470: [FLINK-30602] Remove FileStoreTableITCase in table store

2023-01-08 Thread GitBox


JingsongLi merged PR #470:
URL: https://github.com/apache/flink-table-store/pull/470


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-24989) Upgrade shade-plugin to 3.2.4

2023-01-08 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655934#comment-17655934
 ] 

Dian Fu commented on FLINK-24989:
-

[~chesnay]  It seems that Maven shade plugin supports JDK17 since 3.3.0. I 
guess we need to bump it again.See 
[https://blogs.apache.org/maven/entry/apache-maven-shade-plugin-version6] for 
more details.

I encountered the following issue when building Flink with JDK17:
{code}

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (shade-flink) on 
project flink-runtime: Error creating shaded jar: Problem shading JAR 
/Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar
 entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: 
java.lang.IllegalArgumentException: Unsupported class file major version 61 -> 
[Help 1]

org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (shade-flink) on 
project flink-runtime: Error creating shaded jar: Problem shading JAR 
/Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar
 entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: 
java.lang.IllegalArgumentException: Unsupported class file major version 61

 at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:216)

 at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)

 at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)

 at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)

 at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)

 at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)

 at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)

 at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)

 at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)

 at org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)

 at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)

 at org.apache.maven.cli.MavenCli.main(MavenCli.java:160)

 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)

 at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)

 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.base/java.lang.reflect.Method.invoke(Method.java:568)

 at 
org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)

 at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)

 at 
org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)

 at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)

Caused by: org.apache.maven.plugin.MojoExecutionException: Error creating 
shaded jar: Problem shading JAR 
/Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar
 entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: 
java.lang.IllegalArgumentException: Unsupported class file major version 61

 at org.apache.maven.plugins.shade.mojo.ShadeMojo.execute(ShadeMojo.java:607)

 at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)

 at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)

 ... 19 more

Caused by: java.io.IOException: Problem shading JAR 
/Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar
 entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: 
java.lang.IllegalArgumentException: Unsupported class file major version 61

 at 
org.apache.maven.plugins.shade.DefaultShader.shadeJars(DefaultShader.java:201)

 at org.apache.maven.plugins.shade.DefaultShader.shade(DefaultShader.java:108)

 at org.apache.maven.plugins.shade.mojo.ShadeMojo.execute(ShadeMojo.java:463)

 ... 21 more

Caused by: java.lang.IllegalArgumentException: Unsupported class file major 
version 61

 at org.objectweb.asm.ClassReader.(ClassReader.java:196)

 at org.objectweb.asm.ClassReader.(ClassReader.java:177)

 at org.objectweb.asm.ClassReader.(ClassReader.java:163)

 at org.objectweb.asm.ClassReader.(ClassReader.java:284)

 at 
org.apache.maven.plugins.shade.DefaultShader.addRemappedClass(DefaultShader.java:465)

 at 
org.apache.maven.plugins.shade.DefaultShader.shadeSingleJar(DefaultShader.java:234)

 at 
org.apache.maven.plugins.shade.DefaultShader.shadeJars(DefaultShader.java:196)

 ... 23 more{code}

> Upgrade shade-plugin to 3.2.4
> 

[GitHub] [flink] flinkbot commented on pull request #21621: [FLINK-30601][runtime] Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance

2023-01-08 Thread GitBox


flinkbot commented on PR #21621:
URL: https://github.com/apache/flink/pull/21621#issuecomment-1375167857

   
   ## CI report:
   
   * 3e738de71ae44af8940e8e170f71a82c473d012b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance

2023-01-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30601:
---
Labels: pull-request-available  (was: )

> Omit "setKeyContextElement" call for non-keyed stream/operators to improve 
> performance
> --
>
> Key: FLINK-30601
> URL: https://issues.apache.org/jira/browse/FLINK-30601
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Currently, flink will set the correct key context(by call 
> [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
>  before processing each record, which is typically used to extract key from 
> record and pass that key to the state backends.
> However, the "setKeyContextElement" is obviously not need for non-keyed 
> stream/operator, in which case we can omit the "setKeyContextElement" calls 
> to improve performance. Note that "setKeyContextElement" is an interface 
> method, it requires looking up the interface table when calling, which will 
> further increase the method call overhead.
>  
> We run the following program as benchmark with parallelism=1 and object 
> re-use enabled. The benchmark results are averaged across 5 runs for each 
> setup. Before and after applying the proposed change, the average execution 
> time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
> {code:java}
> env.fromSequence(1, 10L)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x).addSink(new DiscardingSink<>());{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] wanglijie95 opened a new pull request, #21621: [FLINK-30601][runtime] Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance

2023-01-08 Thread GitBox


wanglijie95 opened a new pull request, #21621:
URL: https://github.com/apache/flink/pull/21621

   ## What is the purpose of the change
   Currently, flink will set the correct key context(by call 
[setKeyContextElement](https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B))
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends.
   
   However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we can omit the "setKeyContextElement" calls to 
improve performance.
   
   ## Verifying this change
   Add test `RecordProcessorUtilsTest`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (**no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**no**)
 - The serializers: (**no**)
 - The runtime per-record code paths (performance sensitive): (**no**)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (**no**)
 - The S3 file system connector: (**no**)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**no**)
 - If yes, how is the feature documented? (**not applicable**)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] xintongsong commented on a diff in pull request #21565: [FLINK-18229][ResourceManager] Support cancel pending workers if no longer needed.

2023-01-08 Thread GitBox


xintongsong commented on code in PR #21565:
URL: https://github.com/apache/flink/pull/21565#discussion_r1064310906


##
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java:
##
@@ -284,7 +284,7 @@ public void close() throws Exception {
 @Override
 public void clearResourceRequirements(JobID jobId) {
 jobMasterTargetAddresses.remove(jobId);
-taskManagerTracker.clearPendingAllocationsOfJob(jobId);
+taskManagerTracker.clearPendingAllocationsOfJob(jobId, 
resourceAllocator.isSupported());

Review Comment:
   We should not pass `resourceAllocator.isSupported()` into 
`taskManagerTracker`. Instead, we should not call 
`taskManagerTracker.clearPendingAllocationsOfJob()` if the allocator is not 
supported.



##
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/active/ActiveResourceManager.java:
##
@@ -342,21 +352,53 @@ private void checkResourceDeclarations() {
 resourceDeclaration.getUnwantedWorkers(),
 releaseOrRequestWorkerNumber);
 
-// TODO, release pending/starting/running workers to exceed 
declared worker number.
 if (remainingReleasingWorkerNumber > 0) {
-log.debug(
-"need release {} workers after release unwanted 
workers.",
-remainingReleasingWorkerNumber);
+// release not allocated workers;
+remainingReleasingWorkerNumber =
+releaseUnallocatedWorkers(
+workerResourceSpec, 
remainingReleasingWorkerNumber);
 }
+
+if (remainingReleasingWorkerNumber > 0) {
+// release starting workers and running workers;
+List workerCanReleaseInStartingOrder = new 
ArrayList<>();
+currentAttemptUnregisteredWorkers.stream()
+.filter(r -> 
workerResourceSpec.equals(workerResourceSpecs.get(r)))
+.forEach(workerCanReleaseInStartingOrder::add);
+workerNodeMap.keySet().stream()
+.filter(
+r ->
+
workerResourceSpec.equals(workerResourceSpecs.get(r))
+&& 
!currentAttemptUnregisteredWorkers.contains(
+r))
+.forEach(workerCanReleaseInStartingOrder::add);
+
+remainingReleasingWorkerNumber =
+releaseResources(
+workerCanReleaseInStartingOrder,
+remainingReleasingWorkerNumber);
+}

Review Comment:
   This is not what I meant by deduplicating. There's no need to collect 
starting and running workers into one list.
   
   The following logics for starting and running workers are identical:
   - Filtering workers that matches the given spec
   - Releasing as many resources as needed
   - Returning the remaining number of resources that needs to be released
   
   The above logics can be deduplicated as a method, with the only difference 
being the collection of workers passed in.



##
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManagerDriver.java:
##
@@ -373,8 +421,8 @@ private void onContainersOfPriorityAllocated(Priority 
priority, List
 requestResourceFutures.remove(taskExecutorProcessSpec);
 }
 
-startTaskExecutorInContainerAsync(
-container, taskExecutorProcessSpec, resourceId, 
requestResourceFuture);
+startTaskExecutorInContainerAsync(container, 
taskExecutorProcessSpec, resourceId);
+requestResourceFuture.complete(new YarnWorkerNode(container, 
resourceId));

Review Comment:
   Minor:
   ```suggestion
   requestResourceFuture.complete(new YarnWorkerNode(container, 
resourceId));
   startTaskExecutorInContainerAsync(container, 
taskExecutorProcessSpec, resourceId);
   ```



##
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java:
##
@@ -284,7 +284,7 @@ public void close() throws Exception {
 @Override
 public void clearResourceRequirements(JobID jobId) {
 jobMasterTargetAddresses.remove(jobId);
-taskManagerTracker.clearPendingAllocationsOfJob(jobId);
+taskManagerTracker.clearPendingAllocationsOfJob(jobId, 
resourceAllocator.isSupported());

Review Comment:
   Same for `replaceAllPendingAllocations`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to 

[GitHub] [flink] TanYuxin-tyx commented on pull request #21618: [FLINK-30472][network] Modify the default value of the max network memory config option

2023-01-08 Thread GitBox


TanYuxin-tyx commented on PR #21618:
URL: https://github.com/apache/flink/pull/21618#issuecomment-1375163838

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] 1996fanrui commented on pull request #21614: [FLINK-30584][docs] Document flame graph at the subtask level

2023-01-08 Thread GitBox


1996fanrui commented on PR #21614:
URL: https://github.com/apache/flink/pull/21614#issuecomment-1375162700

   Hi @xintongsong , please help take a look in your free time, thanks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30603) CompactActionITCase in table store is unstable

2023-01-08 Thread Shammon (Jira)
Shammon created FLINK-30603:
---

 Summary: CompactActionITCase in table store is unstable
 Key: FLINK-30603
 URL: https://issues.apache.org/jira/browse/FLINK-30603
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Affects Versions: table-store-0.4.0
Reporter: Shammon


https://github.com/apache/flink-table-store/actions/runs/3871130631/jobs/6598625030

[INFO] Results:
[INFO] 
Error:  Failures: 
Error:CompactActionITCase.testStreamingCompact:187 expected:<[+I 
1|100|15|20221208, +I 1|100|15|20221209]> but was:<[+I 1|100|15|20221209]>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] lsyldliu commented on pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime

2023-01-08 Thread GitBox


lsyldliu commented on PR #21556:
URL: https://github.com/apache/flink/pull/21556#issuecomment-1375139834

   @godfreyhe Thanks for reviewing, I've address the comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-30600) Merge flink-table-store-kafka to flink-table-store-connector

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-30600.

Resolution: Fixed

master: dd95fc47da7dad818c13bb178a7671db326088d9

> Merge flink-table-store-kafka to flink-table-store-connector
> 
>
> Key: FLINK-30600
> URL: https://issues.apache.org/jira/browse/FLINK-30600
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.4.0
>
>
> At present, Kafka heavily relies on the implementation of Flink, which is 
> difficult to extract, so it can be directly incorporated into the Flink 
> connector.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] JingsongLi merged pull request #469: [FLINK-30600] Merge flink-table-store-kafka to flink-table-store-connector

2023-01-08 Thread GitBox


JingsongLi merged PR #469:
URL: https://github.com/apache/flink-table-store/pull/469


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] zjureel commented on pull request #469: [FLINK-30600] Merge flink-table-store-kafka to flink-table-store-connector

2023-01-08 Thread GitBox


zjureel commented on PR #469:
URL: 
https://github.com/apache/flink-table-store/pull/469#issuecomment-1375132561

   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] laughingman7743 commented on pull request #21613: [FLINK-30093][formats] Fix compile errors for google.protobuf.Timestamp type

2023-01-08 Thread GitBox


laughingman7743 commented on PR #21613:
URL: https://github.com/apache/flink/pull/21613#issuecomment-1375132192

   @maosuhan I have added a description of data mapping of type 
google.protobuf.Timestamp to the documentation.
   
https://github.com/apache/flink/pull/21613/commits/23b5fb0b35abb5e888ad7b50fcc22c837f860977
   
   
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44568=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=160c9ae5-96fd-516e-1c91-deb81f59292a
   The e2e test seems to be failing, but I can't find the cause. Could the 
changes made in this pull request be affecting it? Please let me know if I am 
missing something.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30602) Remove FileStoreTableITCase in table store

2023-01-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30602:
---
Labels: pull-request-available  (was: )

> Remove FileStoreTableITCase in table store
> --
>
> Key: FLINK-30602
> URL: https://issues.apache.org/jira/browse/FLINK-30602
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.4.0
>Reporter: Shammon
>Priority: Major
>  Labels: pull-request-available
>
> Remove `FileStoreTableITCase` in table store



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-table-store] zjureel opened a new pull request, #470: [FLINK-30602] Remove FileStoreTableITCase in table store

2023-01-08 Thread GitBox


zjureel opened a new pull request, #470:
URL: https://github.com/apache/flink-table-store/pull/470

   Remove `FileStoreTableITCase` and fix related test case


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #21620: [FLINK-30473][network] Optimize the InputGate network memory management for TaskManager

2023-01-08 Thread GitBox


flinkbot commented on PR #21620:
URL: https://github.com/apache/flink/pull/21620#issuecomment-1375130869

   
   ## CI report:
   
   * a92b74f449522a5ccf1c9cb05e3079c2fdfa7581 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] KarlManong commented on pull request #18059: [FLINK-25224][filesystem] Bump Hadoop version to 2.8.4

2023-01-08 Thread GitBox


KarlManong commented on PR #18059:
URL: https://github.com/apache/flink/pull/18059#issuecomment-1375128934

   @liufangqi could you please fix the title?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30473) Optimize the InputGate network memory management for TaskManager

2023-01-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30473:
---
Labels: pull-request-available  (was: )

> Optimize the InputGate network memory management for TaskManager
> 
>
> Key: FLINK-30473
> URL: https://issues.apache.org/jira/browse/FLINK-30473
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.17.0
>Reporter: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> Based on the 
> [FLIP-266|https://cwiki.apache.org/confluence/display/FLINK/FLIP-266%3A+Simplify+network+memory+configurations+for+TaskManager],
>  this issue mainly focuses on the first issue.
> This change proposes a method to control the maximum required memory buffers 
> in an inputGate according to parallelism size.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] TanYuxin-tyx opened a new pull request, #21620: [FLINK-30473][network] Optimize the InputGate network memory management for TaskManager

2023-01-08 Thread GitBox


TanYuxin-tyx opened a new pull request, #21620:
URL: https://github.com/apache/flink/pull/21620

   
   
   ## What is the purpose of the change
   
   *This change proposes a method to control the maximum required memory 
buffers in an inputGate according to parallelism size and data partition type*
   
   
   ## Brief change log
   
 - *Modify required buffers calculation in SingleInputGateFactory*
 - *Add a new class GateBuffersNumCalculator in SingleInputGateFactory*
 - *Add a new option 
taskmanager.network.memory.read-buffer.required-per-gate.max*
 - *Deparacate taskmanager.network.memory.buffers-per-channel and 
taskmanager.network.memory.floating-buffers-per-gate*
   
   
   ## Verifying this change
   
   This change added tests GateBuffersNumCalculatorTest.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**yes** / no)
 - If yes, how is the feature documented? (not applicable / **docs** / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] ramkrish86 commented on pull request #21512: [FLINK-30375][hotfix]sql-client leaks flink-table-planner jar in /tmp

2023-01-08 Thread GitBox


ramkrish86 commented on PR #21512:
URL: https://github.com/apache/flink/pull/21512#issuecomment-1375123666

   +1 (non binding).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-30602) Remove FileStoreTableITCase in table store

2023-01-08 Thread Shammon (Jira)
Shammon created FLINK-30602:
---

 Summary: Remove FileStoreTableITCase in table store
 Key: FLINK-30602
 URL: https://issues.apache.org/jira/browse/FLINK-30602
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: table-store-0.4.0
Reporter: Shammon


Remove `FileStoreTableITCase` in table store



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #21619: [FLINK-30308][python] Fix ClassLeakCleaner to work with Java 11.0.16+ and 17.0.4+

2023-01-08 Thread GitBox


flinkbot commented on PR #21619:
URL: https://github.com/apache/flink/pull/21619#issuecomment-1375105572

   
   ## CI report:
   
   * a3ec6a28e6443ec364e8a578bfba0f18ef845cd4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30308) ClassCastException: class java.io.ObjectStreamClass$Caches$1 cannot be cast to class java.util.Map is showing in the logging when the job shutdown

2023-01-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30308:
---
Labels: pull-request-available  (was: )

> ClassCastException: class java.io.ObjectStreamClass$Caches$1 cannot be cast 
> to class java.util.Map is showing in the logging when the job shutdown
> --
>
> Key: FLINK-30308
> URL: https://issues.apache.org/jira/browse/FLINK-30308
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Reporter: Dian Fu
>Assignee: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.16.1, 1.15.4
>
>
> {code:java}
> 2022-12-05 18:26:40,229 WARN  
> org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Failed 
> to clean up the leaking objects.
> java.lang.ClassCastException: class java.io.ObjectStreamClass$Caches$1 cannot 
> be cast to class java.util.Map (java.io.ObjectStreamClass$Caches$1 and 
> java.util.Map are in module java.base of loader 'bootstrap')
> at 
> org.apache.flink.streaming.api.utils.ClassLeakCleaner.clearCache(ClassLeakCleaner.java:58)
>  
> ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0]
> at 
> org.apache.flink.streaming.api.utils.ClassLeakCleaner.cleanUpLeakingClasses(ClassLeakCleaner.java:39)
>  
> ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0]
> at 
> org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.close(AbstractPythonFunctionOperator.java:142)
>  
> ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0]
> at 
> org.apache.flink.streaming.api.operators.python.AbstractExternalPythonFunctionOperator.close(AbstractExternalPythonFunctionOperator.java:73)
>  
> ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0]
> at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:163)
>  ~[flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:125)
>  ~[flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:997)
>  ~[flink-dist-1.15.2.jar:1.15.2]
> at org.apache.flink.util.IOUtils.closeAll(IOUtils.java:254) 
> ~[flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.core.fs.AutoCloseableRegistry.doClose(AutoCloseableRegistry.java:72)
>  ~[flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.util.AbstractAutoCloseableRegistry.close(AbstractAutoCloseableRegistry.java:127)
>  ~[flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:916)
>  ~[flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$0(Task.java:930)
>  ~[flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
>  [flink-dist-1.15.2.jar:1.15.2]
> at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:930) 
> [flink-dist-1.15.2.jar:1.15.2]
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741) 
> [flink-dist-1.15.2.jar:1.15.2]
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) 
> [flink-dist-1.15.2.jar:1.15.2]
> at java.lang.Thread.run(Unknown Source) [?:?]{code}
> Reported in Slack: 
> https://apache-flink.slack.com/archives/C03G7LJTS2G/p1670265131083639?thread_ts=1670265114.640369=C03G7LJTS2G



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] dianfu opened a new pull request, #21619: [FLINK-30308][python] Fix ClassLeakCleaner to work with Java 11.0.16+ and 17.0.4+

2023-01-08 Thread GitBox


dianfu opened a new pull request, #21619:
URL: https://github.com/apache/flink/pull/21619

   ## What is the purpose of the change
   
   *This pull request fixes ClassLeakCleaner to work with Java 11.0.16+ and 
17.0.4+.*
   
   
   
   ## Verifying this change
   
   This change is a trivial rework without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): ( no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: ( no)
 - The S3 file system connector: ( no )
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-30561) ChangelogStreamHandleReaderWithCache cause FileNotFoundException

2023-01-08 Thread Yanfei Lei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655905#comment-17655905
 ] 

Yanfei Lei commented on FLINK-30561:


[~Feifan Wang] Thanks for the clarification. I think this makes sense in case 
the TM process doesn't restart when the job failover.

 

> ChangelogStreamHandleReaderWithCache cause FileNotFoundException
> 
>
> Key: FLINK-30561
> URL: https://issues.apache.org/jira/browse/FLINK-30561
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.16.0
>Reporter: Feifan Wang
>Priority: Major
>
> When a job with state changelog enabled continues to restart, the following 
> exceptions may occur :
> {code:java}
> java.lang.RuntimeException: java.io.FileNotFoundException: 
> /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp
>  (No such file or directory)
>     at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
>     at 
> org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87)
>     at 
> org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69)
>     at 
> org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107)
>     at 
> org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78)
>     at 
> org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94)
>     at 
> org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136)
>     at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336)
>     at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
>     at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>     at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353)
>     at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165)
>     at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265)
>     at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702)
>     at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669)
>     at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
>     at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: 
> /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp
>  (No such file or directory)
>     at java.io.FileInputStream.open0(Native Method)
>     at java.io.FileInputStream.open(FileInputStream.java:195)
>     at java.io.FileInputStream.init(FileInputStream.java:138)
>     at 
> org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:158)
>     at 
> org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:95)
>     at 
> org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42)
>     at 
> org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85)
>     ... 21 more {code}
> *Problem causes:*
>  # *_ChangelogStreamHandleReaderWithCache_* use RefCountedFile manager local 
> cache file. The reference count is incremented when the input stream is 
> opened from the cache file, and 

[jira] [Updated] (FLINK-30542) Support adaptive local hash aggregate in runtime

2023-01-08 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He updated FLINK-30542:
---
Summary: Support adaptive local hash aggregate in runtime  (was: Support 
adaptive hash aggregate in runtime)

> Support adaptive local hash aggregate in runtime
> 
>
> Key: FLINK-30542
> URL: https://issues.apache.org/jira/browse/FLINK-30542
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> Introduce a new strategy to adaptively determine whether local hash aggregate 
> is required according to the aggregation degree of local hash aggregate.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30491) Hive table partition supports to deserialize later during runtime

2023-01-08 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He reassigned FLINK-30491:
--

Assignee: dalongliu

> Hive table partition supports to deserialize later during runtime
> -
>
> Key: FLINK-30491
> URL: https://issues.apache.org/jira/browse/FLINK-30491
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Affects Versions: 1.16.0
>Reporter: dalongliu
>Assignee: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30542) Support adaptive local hash aggregate in runtime

2023-01-08 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He reassigned FLINK-30542:
--

Assignee: Yunhong Zheng

> Support adaptive local hash aggregate in runtime
> 
>
> Key: FLINK-30542
> URL: https://issues.apache.org/jira/browse/FLINK-30542
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Assignee: Yunhong Zheng
>Priority: Major
> Fix For: 1.17.0
>
>
> Introduce a new strategy to adaptively determine whether local hash aggregate 
> is required according to the aggregation degree of local hash aggregate.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30365) New dynamic partition pruning strategy to support more dpp patterns

2023-01-08 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He closed FLINK-30365.
--
Resolution: Fixed

Fixed in 1.17.0: d9f9b55f82dfbc1676572cc36b718a99001497f8

> New dynamic partition pruning strategy to support more dpp patterns
> ---
>
> Key: FLINK-30365
> URL: https://issues.apache.org/jira/browse/FLINK-30365
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> New dynamic partition pruning strategy to support more dpp patterns. Now, dpp 
> rules is coupled with the join reorder rules, which will affect the result of 
> join reorder. At the same time, the dpp rule don't support these patterns 
> like union node in fact side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-30365) New dynamic partition pruning strategy to support more dpp patterns

2023-01-08 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He reassigned FLINK-30365:
--

Assignee: Yunhong Zheng

> New dynamic partition pruning strategy to support more dpp patterns
> ---
>
> Key: FLINK-30365
> URL: https://issues.apache.org/jira/browse/FLINK-30365
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.17.0
>Reporter: Yunhong Zheng
>Assignee: Yunhong Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> New dynamic partition pruning strategy to support more dpp patterns. Now, dpp 
> rules is coupled with the join reorder rules, which will affect the result of 
> join reorder. At the same time, the dpp rule don't support these patterns 
> like union node in fact side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] godfreyhe closed pull request #21489: [FLINK-30365][table-planner] New dynamic partition pruning strategy to support more dpp patterns

2023-01-08 Thread GitBox


godfreyhe closed pull request #21489: [FLINK-30365][table-planner] New dynamic 
partition pruning strategy to support more dpp patterns
URL: https://github.com/apache/flink/pull/21489


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] fsk119 commented on pull request #21597: [FLINK-30554][sql-client] Remove useless sessionId in the Executor

2023-01-08 Thread GitBox


fsk119 commented on PR #21597:
URL: https://github.com/apache/flink/pull/21597#issuecomment-1375092445

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance

2023-01-08 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-30601:
---
Description: 
Currently, flink will set the correct key context(by call 
[setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends.

However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we can omit the "setKeyContextElement" calls to 
improve performance. Note that setKeyContextElement is an interface method, it 
requires looking up the interface table when calling, which will further 
increase the method call overhead.
 
We run the following program as benchmark with parallelism=1 and object re-use 
enabled. The benchmark results are averaged across 5 runs for each setup. 
Before and after applying the proposed change, the average execution time 
changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
 
{code:java}
env.fromSequence(1, 10L)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x).addSink(new DiscardingSink<>());
{code}
 
 

  was:
Currently, flink will set the correct key context(by call 
[setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends. 

However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we should omit the "setKeyContextElement" calls 
to improve performance. Note that setKeyContextElement is an interface method, 
it requires looking up the interface table when calling, which will further 
increase the method call overhead.
 
We run the following program as benchmark with parallelism=1 and object re-use 
enabled. The benchmark results are averaged across 5 runs for each setup. 
Before and after applying the proposed change, the average execution time 
changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
 
{code:java}
env.fromSequence(1, 10L)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x).addSink(new DiscardingSink<>());
{code}
 
 


> Omit "setKeyContextElement" call for non-keyed stream/operators to improve 
> performance
> --
>
> Key: FLINK-30601
> URL: https://issues.apache.org/jira/browse/FLINK-30601
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Lijie Wang
>Priority: Major
> Fix For: 1.17.0
>
>
> Currently, flink will set the correct key context(by call 
> [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
>  before processing each record, which is typically used to extract key from 
> record and pass that key to the state backends.
> However, the "setKeyContextElement" is obviously not need for non-keyed 
> stream/operator, in which case we can omit the "setKeyContextElement" calls 
> to improve performance. Note that setKeyContextElement is an interface 
> method, it requires looking up the interface table when calling, which will 
> further increase the method call overhead.
>  
> We run the following program as benchmark with parallelism=1 and object 
> re-use enabled. The benchmark results are averaged across 5 runs for each 
> setup. Before and after applying the proposed change, the average execution 
> time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
>  
> {code:java}
> env.fromSequence(1, 10L)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x).addSink(new DiscardingSink<>());
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance

2023-01-08 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-30601:
---
Description: 
Currently, flink will set the correct key context(by call 
[setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends.

However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we can omit the "setKeyContextElement" calls to 
improve performance. Note that "setKeyContextElement" is an interface method, 
it requires looking up the interface table when calling, which will further 
increase the method call overhead.
 
We run the following program as benchmark with parallelism=1 and object re-use 
enabled. The benchmark results are averaged across 5 runs for each setup. 
Before and after applying the proposed change, the average execution time 
changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
{code:java}
env.fromSequence(1, 10L)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x).addSink(new DiscardingSink<>());
{code}
 
 

  was:
Currently, flink will set the correct key context(by call 
[setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends.

However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we can omit the "setKeyContextElement" calls to 
improve performance. Note that setKeyContextElement is an interface method, it 
requires looking up the interface table when calling, which will further 
increase the method call overhead.
 
We run the following program as benchmark with parallelism=1 and object re-use 
enabled. The benchmark results are averaged across 5 runs for each setup. 
Before and after applying the proposed change, the average execution time 
changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
 
{code:java}
env.fromSequence(1, 10L)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x).addSink(new DiscardingSink<>());
{code}
 
 


> Omit "setKeyContextElement" call for non-keyed stream/operators to improve 
> performance
> --
>
> Key: FLINK-30601
> URL: https://issues.apache.org/jira/browse/FLINK-30601
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Lijie Wang
>Priority: Major
> Fix For: 1.17.0
>
>
> Currently, flink will set the correct key context(by call 
> [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
>  before processing each record, which is typically used to extract key from 
> record and pass that key to the state backends.
> However, the "setKeyContextElement" is obviously not need for non-keyed 
> stream/operator, in which case we can omit the "setKeyContextElement" calls 
> to improve performance. Note that "setKeyContextElement" is an interface 
> method, it requires looking up the interface table when calling, which will 
> further increase the method call overhead.
>  
> We run the following program as benchmark with parallelism=1 and object 
> re-use enabled. The benchmark results are averaged across 5 runs for each 
> setup. Before and after applying the proposed change, the average execution 
> time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
> {code:java}
> env.fromSequence(1, 10L)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x).addSink(new DiscardingSink<>());
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance

2023-01-08 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-30601:
---
Description: 
Currently, flink will set the correct key context(by call 
[setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends.

However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we can omit the "setKeyContextElement" calls to 
improve performance. Note that "setKeyContextElement" is an interface method, 
it requires looking up the interface table when calling, which will further 
increase the method call overhead.
 
We run the following program as benchmark with parallelism=1 and object re-use 
enabled. The benchmark results are averaged across 5 runs for each setup. 
Before and after applying the proposed change, the average execution time 
changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
{code:java}
env.fromSequence(1, 10L)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x).addSink(new DiscardingSink<>());{code}

  was:
Currently, flink will set the correct key context(by call 
[setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends.

However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we can omit the "setKeyContextElement" calls to 
improve performance. Note that "setKeyContextElement" is an interface method, 
it requires looking up the interface table when calling, which will further 
increase the method call overhead.
 
We run the following program as benchmark with parallelism=1 and object re-use 
enabled. The benchmark results are averaged across 5 runs for each setup. 
Before and after applying the proposed change, the average execution time 
changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
{code:java}
env.fromSequence(1, 10L)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x).addSink(new DiscardingSink<>());
{code}
 
 


> Omit "setKeyContextElement" call for non-keyed stream/operators to improve 
> performance
> --
>
> Key: FLINK-30601
> URL: https://issues.apache.org/jira/browse/FLINK-30601
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Lijie Wang
>Priority: Major
> Fix For: 1.17.0
>
>
> Currently, flink will set the correct key context(by call 
> [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
>  before processing each record, which is typically used to extract key from 
> record and pass that key to the state backends.
> However, the "setKeyContextElement" is obviously not need for non-keyed 
> stream/operator, in which case we can omit the "setKeyContextElement" calls 
> to improve performance. Note that "setKeyContextElement" is an interface 
> method, it requires looking up the interface table when calling, which will 
> further increase the method call overhead.
>  
> We run the following program as benchmark with parallelism=1 and object 
> re-use enabled. The benchmark results are averaged across 5 runs for each 
> setup. Before and after applying the proposed change, the average execution 
> time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
> {code:java}
> env.fromSequence(1, 10L)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x)
> .map(x -> x).addSink(new DiscardingSink<>());{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance

2023-01-08 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-30601:
--

 Summary: Omit "setKeyContextElement" call for non-keyed 
stream/operators to improve performance
 Key: FLINK-30601
 URL: https://issues.apache.org/jira/browse/FLINK-30601
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Task
Reporter: Lijie Wang
 Fix For: 1.17.0


Currently, flink will set the correct key context(by call 
[setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B])
 before processing each record, which is typically used to extract key from 
record and pass that key to the state backends. 

However, the "setKeyContextElement" is obviously not need for non-keyed 
stream/operator, in which case we should omit the "setKeyContextElement" calls 
to improve performance. Note that setKeyContextElement is an interface method, 
it requires looking up the interface table when calling, which will further 
increase the method call overhead.
 
We run the following program as benchmark with parallelism=1 and object re-use 
enabled. The benchmark results are averaged across 5 runs for each setup. 
Before and after applying the proposed change, the average execution time 
changed from 88.39 s to 78.76 s, which increases throughput by 10.8%.
 
{code:java}
env.fromSequence(1, 10L)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x)
.map(x -> x).addSink(new DiscardingSink<>());
{code}
 
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28796) Add Statement Completement API for sql gateway rest endpoint

2023-01-08 Thread Shengkai Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengkai Fang closed FLINK-28796.
-
Fix Version/s: 1.17.0
   Resolution: Implemented

Merged into master: 6301cca7a6924cebd1107fcd39c063b7a091551d

> Add Statement Completement API for sql gateway rest endpoint
> 
>
> Key: FLINK-28796
> URL: https://issues.apache.org/jira/browse/FLINK-28796
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Gateway
>Reporter: Wencong Liu
>Assignee: yuzelin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> SQL Gateway supports various clients: sql client, rest, hiveserver2, etc. 
> Given the 1.16 feature freeze date, we won't be able to finish all the 
> endpoints. Thus, we'd exclude one of the rest apis (tracked by this ticket) 
> from [FLINK-28163] Introduce the statement related API for REST endpoint - 
> ASF JIRA (apache.org)], which is only needed by the sql client, and still try 
> to complete the remaining of them.
> In other words, we'd expect the sql gateway to support rest & hiveserver2 
> apis in 1.16, and sql client in 1.17.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] fsk119 merged pull request #21526: [FLINK-28796] Add completeStatement REST API in the SQL Gateway

2023-01-08 Thread GitBox


fsk119 merged PR #21526:
URL: https://github.com/apache/flink/pull/21526


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-22320) Add documentation for new introduced ALTER TABLE statements

2023-01-08 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655896#comment-17655896
 ] 

Jane Chan commented on FLINK-22320:
---

Hi, I would like to take on this task. cc [~fsk119] 

> Add documentation for new introduced ALTER TABLE statements
> ---
>
> Key: FLINK-22320
> URL: https://issues.apache.org/jira/browse/FLINK-22320
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Table SQL / API
>Reporter: Jark Wu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] chrismartin823 commented on pull request #21611: [FLINK-30592][doc] remove unsupported hive version in hive overview document

2023-01-08 Thread GitBox


chrismartin823 commented on PR #21611:
URL: https://github.com/apache/flink/pull/21611#issuecomment-1375070781

   > @chrismartin823 Could you please create a backport for release 1.16. 
Thanks.
   
   ok,I will create a backport for release 1.16.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-connector-hive] chenqin closed pull request #2: Move Flink Hive Connector

2023-01-08 Thread GitBox


chenqin closed pull request #2: Move Flink Hive Connector
URL: https://github.com/apache/flink-connector-hive/pull/2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-19883) Support "IF EXISTS" in DDL for ALTER TABLE

2023-01-08 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655894#comment-17655894
 ] 

Jane Chan commented on FLINK-19883:
---

Hi [~nicholasjiang], I would like to continue working on this issue if you are 
unavailable.

> Support "IF EXISTS" in DDL for ALTER TABLE
> --
>
> Key: FLINK-19883
> URL: https://issues.apache.org/jira/browse/FLINK-19883
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.11.2
>Reporter: Ingo Bürk
>Assignee: Nicholas Jiang
>Priority: Not a Priority
>  Labels: pull-request-available, stale-assigned
>
> ALTER TABLE does not seem to support the "IF EXISTS" part in the DDL, but the 
> corresponding methods Catalog#renameTable and Catalog#alterTable do support 
> it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #21618: [FLINK-30472][network] Modify the default value of the max network memory config option

2023-01-08 Thread GitBox


flinkbot commented on PR #21618:
URL: https://github.com/apache/flink/pull/21618#issuecomment-1375067510

   
   ## CI report:
   
   * 82022cdf0c82952340b60648ff258c63228ac8ba UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] godfreyhe commented on a diff in pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime

2023-01-08 Thread GitBox


godfreyhe commented on code in PR #21556:
URL: https://github.com/apache/flink/pull/21556#discussion_r1064264590


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/util/HivePartitionUtils.java:
##
@@ -290,4 +291,33 @@ private static void listStatusRecursively(
 }
 }
 }
+
+public static List serializeHiveTablePartition(

Review Comment:
   nit: it's better we can add some test to verify this utility method.



##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTablePartitionSerializer.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connectors.hive;
+
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.ObjectInputStream;
+import java.io.ObjectOutputStream;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/** SerDe for {@link HiveTablePartition}. */
+public class HiveTablePartitionSerializer implements 
SimpleVersionedSerializer {
+
+private static final int VERSION = 1;
+
+public static final HiveTablePartitionSerializer INSTANCE = new 
HiveTablePartitionSerializer();
+
+@Override
+public int getVersion() {
+return VERSION;
+}
+
+@Override
+public byte[] serialize(HiveTablePartition hiveTablePartition) throws 
IOException {
+checkArgument(
+hiveTablePartition.getClass() == HiveTablePartition.class,
+"Cannot serialize subclasses of HiveTablePartition");
+ByteArrayOutputStream byteArrayOutputStream = new 
ByteArrayOutputStream();
+try (ObjectOutputStream outputStream = new 
ObjectOutputStream(byteArrayOutputStream)) {
+outputStream.writeObject(hiveTablePartition);
+}
+return byteArrayOutputStream.toByteArray();
+}
+
+@Override
+public HiveTablePartition deserialize(int version, byte[] serialized) 
throws IOException {
+if (version == 1) {

Review Comment:
   if (version == CURRENT_VERSION)



##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTablePartitionSerializer.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connectors.hive;
+
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.ObjectInputStream;
+import java.io.ObjectOutputStream;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/** SerDe for {@link HiveTablePartition}. */
+public class HiveTablePartitionSerializer implements 
SimpleVersionedSerializer {
+
+private static final int VERSION = 1;

Review Comment:
   nit: CURRENT_VERSION



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30472) Modify the default value of the max network memory config option

2023-01-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30472:
---
Labels: pull-request-available  (was: )

> Modify the default value of the max network memory config option
> 
>
> Key: FLINK-30472
> URL: https://issues.apache.org/jira/browse/FLINK-30472
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.17.0
>Reporter: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> This issue mainly focuses on the second issue in 
> [FLIP-266|https://cwiki.apache.org/confluence/display/FLINK/FLIP-266%3A+Simplify+network+memory+configurations+for+TaskManager],
>  modifying the default value of taskmanager.memory.network.max



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] TanYuxin-tyx opened a new pull request, #21618: [FLINK-30472][network] Modify the default value of the max network memory config option

2023-01-08 Thread GitBox


TanYuxin-tyx opened a new pull request, #21618:
URL: https://github.com/apache/flink/pull/21618

   
   
   
   
   ## What is the purpose of the change
   
   *This issue mainly focuses on the second issue in 
[FLIP-266](https://cwiki.apache.org/confluence/display/FLINK/FLIP-266%3A+Simplify+network+memory+configurations+for+TaskManager),
 modifying the default value of taskmanager.memory.network.max*
   
   
   ## Brief change log
   
 - *Set modifying the default value of taskmanager.memory.network.max to 
MemorySize.MAX_VALUE.*
 - *Update the config option documents about 
taskmanager.memory.network.max.*
   
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as 
*TaskExecutorProcessUtilsTest*.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] lsyldliu commented on a diff in pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime

2023-01-08 Thread GitBox


lsyldliu commented on code in PR #21556:
URL: https://github.com/apache/flink/pull/21556#discussion_r1064259672


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveSourceFileEnumerator.java:
##
@@ -222,19 +225,24 @@ private static int getThreadNumToSplitHiveFile(JobConf 
jobConf) {
 /** A factory to create {@link HiveSourceFileEnumerator}. */
 public static class Provider implements FileEnumerator.Provider {
 
+private static final Logger LOG = 
LoggerFactory.getLogger(Provider.class);
+
 private static final long serialVersionUID = 1L;
 
-private final List partitions;
+private final List partitionBytes;
 private final JobConfWrapper jobConfWrapper;
 
-public Provider(List partitions, JobConfWrapper 
jobConfWrapper) {
-this.partitions = partitions;
+public Provider(List partitionBytes, JobConfWrapper 
jobConfWrapper) {
+this.partitionBytes = partitionBytes;
 this.jobConfWrapper = jobConfWrapper;
 }
 
 @Override
 public FileEnumerator create() {
-return new HiveSourceFileEnumerator(partitions, 
jobConfWrapper.conf());
+LOG.info("Deserialize hive table partition in 
HiveSourceFileEnumerator.");

Review Comment:
   I also think it is no need.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] swuferhong commented on pull request #21489: [FLINK-30365][table-planner] New dynamic partition pruning strategy to support more dpp patterns

2023-01-08 Thread GitBox


swuferhong commented on PR #21489:
URL: https://github.com/apache/flink/pull/21489#issuecomment-1375056765

   > It's better we can add this optimization in flink-tpcds-test module, we 
can do it in another pr.
   Ok. I will make a jira issue to track this work. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] lsyldliu commented on pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime

2023-01-08 Thread GitBox


lsyldliu commented on PR #21556:
URL: https://github.com/apache/flink/pull/21556#issuecomment-1375056600

   @luoyuxia Thanks for review, I have addressed your comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-28690) UpdateSchema#fromCatalogTable lost column comment

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-28690:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> UpdateSchema#fromCatalogTable lost column comment
> -
>
> Key: FLINK-28690
> URL: https://issues.apache.org/jira/browse/FLINK-28690
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Not a Priority
> Fix For: table-store-0.4.0
>
>
> The reason is that 
> org.apache.flink.table.api.TableSchema#toPhysicalRowDataType lost column 
> comments, which leads to comparison failure in 
> AbstractTableStoreFactory#buildFileStoreTable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-27845) Introduce Column Type Schema evolution

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-27845:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Introduce Column Type Schema evolution
> --
>
> Key: FLINK-27845
> URL: https://issues.apache.org/jira/browse/FLINK-27845
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: pull-request-available
> Fix For: table-store-0.4.0
>
>
> After schema changes, for example, a column with int type to bigint type. We 
> should convert the data from the old type to the new type.
>  * What types are safe to convert? According to the implicitCastingRules in 
> LogicalTypeCasts.
>  * Where to convert? DataFileReader
>  * How to convert? In RecordIterator.next, we can convert old RowData to new 
> RowData if there is a schema change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] swuferhong commented on a diff in pull request #21489: [FLINK-30365][table-planner] New dynamic partition pruning strategy to support more dpp patterns

2023-01-08 Thread GitBox


swuferhong commented on code in PR #21489:
URL: https://github.com/apache/flink/pull/21489#discussion_r1064259324


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/utils/DynamicPartitionPruningUtils.java:
##
@@ -60,318 +69,470 @@
 public class DynamicPartitionPruningUtils {
 
 /**
- * For the input join node, judge whether the join left side and join 
right side meet the
- * requirements of dynamic partition pruning. Fact side in left or right 
join is not clear.
+ * Judge whether the input RelNode meets the conditions of dimSide. If 
joinKeys is null means we
+ * need not consider the join keys in dim side, which already deal by 
dynamic partition pruning
+ * rule. If joinKeys not null means we need to judge whether joinKeys 
changed in dim side, if
+ * changed, this RelNode is not dim side.
  */
-public static boolean supportDynamicPartitionPruning(Join join) {
-return supportDynamicPartitionPruning(join, true)
-|| supportDynamicPartitionPruning(join, false);
+public static boolean isDppDimSide(RelNode rel) {
+DppDimSideChecker dimSideChecker = new DppDimSideChecker(rel);
+return dimSideChecker.isDppDimSide();
 }
 
 /**
- * For the input join node, judge whether the join left side and join 
right side meet the
- * requirements of dynamic partition pruning. Fact side in left or right 
is clear. If meets the
- * requirements, return true.
+ * Judge whether the input RelNode can be converted to the dpp fact side. 
If the input RelNode
+ * can be converted, this method will return the converted fact side whose 
partitioned table
+ * source will be converted to {@link 
BatchPhysicalDynamicFilteringTableSourceScan}, If not,
+ * this method will return the origin RelNode.
  */
-public static boolean supportDynamicPartitionPruning(Join join, boolean 
factInLeft) {
-if (!ShortcutUtils.unwrapContext(join)
-.getTableConfig()
-
.get(OptimizerConfigOptions.TABLE_OPTIMIZER_DYNAMIC_FILTERING_ENABLED)) {
-return false;
-}
-// Now dynamic partition pruning supports left/right join, inner and 
semi join. but now semi
-// join can not join reorder.
-if (join.getJoinType() == JoinRelType.LEFT) {
-if (factInLeft) {
-return false;
-}
-} else if (join.getJoinType() == JoinRelType.RIGHT) {
-if (!factInLeft) {
-return false;
-}
-} else if (join.getJoinType() != JoinRelType.INNER
-&& join.getJoinType() != JoinRelType.SEMI) {
+public static Tuple2 canConvertAndConvertDppFactSide(
+RelNode rel,
+ImmutableIntList joinKeys,
+RelNode dimSide,
+ImmutableIntList dimSideJoinKey) {
+DppFactSideChecker dppFactSideChecker =
+new DppFactSideChecker(rel, joinKeys, dimSide, dimSideJoinKey);
+return dppFactSideChecker.canConvertAndConvertDppFactSide();
+}
+
+/** Judge whether the join node is suitable one for dpp pattern. */
+public static boolean isSuitableJoin(Join join) {
+// Now dynamic partition pruning supports left/right join, inner and 
semi
+// join. but now semi join can not join reorder.
+if (join.getJoinType() != JoinRelType.INNER
+&& join.getJoinType() != JoinRelType.SEMI
+&& join.getJoinType() != JoinRelType.LEFT
+&& join.getJoinType() != JoinRelType.RIGHT) {
 return false;
 }
 
 JoinInfo joinInfo = join.analyzeCondition();
-if (joinInfo.leftKeys.isEmpty()) {
-return false;
-}
-RelNode left = join.getLeft();
-RelNode right = join.getRight();
-
-// TODO Now fact side and dim side don't support many complex 
patterns, like join inside
-// fact/dim side, agg inside fact/dim side etc. which will support 
next.
-return factInLeft
-? isDynamicPartitionPruningPattern(left, right, 
joinInfo.leftKeys)
-: isDynamicPartitionPruningPattern(right, left, 
joinInfo.rightKeys);
+return !joinInfo.leftKeys.isEmpty();
 }
 
-private static boolean isDynamicPartitionPruningPattern(
-RelNode factSide, RelNode dimSide, ImmutableIntList 
factSideJoinKey) {
-return isDimSide(dimSide) && isFactSide(factSide, factSideJoinKey);
-}
+/** This class is used to check whether the relNode is dpp dim side. */
+private static class DppDimSideChecker {
+private final RelNode relNode;
+private boolean hasFilter;
+private boolean hasPartitionedScan;
+private final List tables = new ArrayList<>();
 
-/** make a dpp fact side factor to recurrence in fact side. */
-private static boolean isFactSide(RelNode rel, ImmutableIntList 

[GitHub] [flink] lsyldliu commented on a diff in pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime

2023-01-08 Thread GitBox


lsyldliu commented on code in PR #21556:
URL: https://github.com/apache/flink/pull/21556#discussion_r1064259298


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveSourceDynamicFileEnumerator.java:
##
@@ -187,31 +187,36 @@ public static class Provider implements 
DynamicFileEnumerator.Provider {
 
 private static final long serialVersionUID = 1L;
 
+private static final Logger LOG = 
LoggerFactory.getLogger(Provider.class);
+
 private final String table;
 private final List dynamicFilterPartitionKeys;
-private final List partitions;
+private final List partitionBytes;
 private final String hiveVersion;
 private final JobConfWrapper jobConfWrapper;
 
 public Provider(
 String table,
 List dynamicFilterPartitionKeys,
-List partitions,
+List partitionBytes,
 String hiveVersion,
 JobConfWrapper jobConfWrapper) {
 this.table = checkNotNull(table);
 this.dynamicFilterPartitionKeys = 
checkNotNull(dynamicFilterPartitionKeys);
-this.partitions = checkNotNull(partitions);
+this.partitionBytes = checkNotNull(partitionBytes);
 this.hiveVersion = checkNotNull(hiveVersion);
 this.jobConfWrapper = checkNotNull(jobConfWrapper);
 }
 
 @Override
 public DynamicFileEnumerator create() {
+LOG.info(

Review Comment:
   After rethinking, we don't need this log, if deserializing the hive table 
partition fails, the exception will be thrown.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-29735) Introduce Metadata tables for table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29735:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Introduce Metadata tables for table store
> -
>
> Key: FLINK-29735
> URL: https://issues.apache.org/jira/browse/FLINK-29735
> Project: Flink
>  Issue Type: New Feature
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> You can query the related metadata of the table through SQL, for example, 
> query the historical version information of table "T" through the following 
> SQL:
> SELECT * FROM T$history;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30276) [umbrella] Flink free for table store core

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-30276:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> [umbrella] Flink free for table store core
> --
>
> Key: FLINK-30276
> URL: https://issues.apache.org/jira/browse/FLINK-30276
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> In FLINK-30080, We need a core that does not rely on specific Flink versions 
> to support flexible deployment and ecology.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30394) [umbrella] Refactor filesystem support in table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-30394:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> [umbrella] Refactor filesystem support in table store
> -
>
> Key: FLINK-30394
> URL: https://issues.apache.org/jira/browse/FLINK-30394
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> - Let other computing engines, such as hive, spark, trino, support object 
> storage file systems, such as OSS and s3.
> - Let table store access different file systems from Flink cluster according 
> to configuration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30390) Ensure that no compaction is in progress before closing the writer

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-30390:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Ensure that no compaction is in progress before closing the writer
> --
>
> Key: FLINK-30390
> URL: https://issues.apache.org/jira/browse/FLINK-30390
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> When the writer does not generate a new submission file, it will be closed. 
> (In AbstractFileStoreWrite) However, at this time, there may be asynchronous 
> interactions that have not been completed and are forced to close, which will 
> cause some strange exceptions to be printed in the log.
> We can avoid this situation, ensure that no compaction is in progress before 
> closing the writer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28026) Add benchmark module for flink table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-28026.

Resolution: Invalid

> Add benchmark module for flink table store
> --
>
> Key: FLINK-28026
> URL: https://issues.apache.org/jira/browse/FLINK-28026
> Project: Flink
>  Issue Type: New Feature
>  Components: Table Store
>Reporter: Aiden Gong
>Assignee: Aiden Gong
>Priority: Minor
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
>
> Add benchmark module for flink table store.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30000) Introduce FileSystemFactory to create FileSystem from custom configuration

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-3:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Introduce FileSystemFactory to create FileSystem from custom configuration
> --
>
> Key: FLINK-3
> URL: https://issues.apache.org/jira/browse/FLINK-3
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> Currently, table store uses static Flink FileSystem. This can not support:
> 1. Use another FileSystem different from checkpoint FileSystem.
> 2. Use FileSystem in Hive and Spark from custom configuration instead of 
> using FileSystem.initialize.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29953) Get rid of flink-connector-hive dependency in flink-table-store-hive

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29953:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Get rid of flink-connector-hive dependency in flink-table-store-hive
> 
>
> Key: FLINK-29953
> URL: https://issues.apache.org/jira/browse/FLINK-29953
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> It is unnecessary for the tablestore to rely on it in the test. Its 
> incompatible modifications will make the tablestore troublesome.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-29963) Flink Table Store supports pluggable filesystem

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-29963.

  Assignee: Jane Chan
Resolution: Fixed

> Flink Table Store supports pluggable filesystem
> ---
>
> Key: FLINK-29963
> URL: https://issues.apache.org/jira/browse/FLINK-29963
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
> Fix For: table-store-0.3.0
>
>
> Currently, users cannot query the FTS table from Spark/Hive if using OSS/S3 
> as the underlying filesystem. We need to support them to improve user 
> experience.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29080) Migrate all tests from managed table to catalog-based tests

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29080:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Migrate all tests from managed table to catalog-based tests
> ---
>
> Key: FLINK-29080
> URL: https://issues.apache.org/jira/browse/FLINK-29080
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> To get rid of ManagedTableFactory and enable test on -Pflink-1.14



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-29965) Support Spark/Hive with S3

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-29965.

Resolution: Invalid

> Support Spark/Hive with S3
> --
>
> Key: FLINK-29965
> URL: https://issues.apache.org/jira/browse/FLINK-29965
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
> Fix For: table-store-0.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29756) Support materialized column to improve query performance for complex types

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29756:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Support materialized column to improve query performance for complex types
> --
>
> Key: FLINK-29756
> URL: https://issues.apache.org/jira/browse/FLINK-29756
> Project: Flink
>  Issue Type: New Feature
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Nicholas Jiang
>Priority: Minor
> Fix For: table-store-0.4.0
>
>
> In the world of data warehouse, it is very common to use one or more columns 
> from a complex type such as a map, or to put many subfields into it. These 
> operations can greatly affect query performance because:
>  # These operations are very wasteful IO. For example, if we have a field 
> type of Map, which contains dozens of subfields, we need to read the entire 
> column when reading this column. And Spark will traverse the entire map to 
> get the value of the target key.
>  # Cannot take advantage of vectorized reads when reading nested type columns.
>  # Filter pushdown cannot be used when reading nested columns.
> It is necessary to introduce the materialized column feature in Flink Table 
> Store, which transparently solves the above problems of arbitrary columnar 
> storage (not just Parquet).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-26900) Introduce Best Practices document for table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-26900.

Resolution: Invalid

> Introduce Best Practices document for table store
> -
>
> Key: FLINK-26900
> URL: https://issues.apache.org/jira/browse/FLINK-26900
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.3.0
>
>
> # Eventual Mode
>  # Transaction Mode ChanglogAll
>  # Without Log System
>  # ...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29882) LargeDataITCase is not stable

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29882:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> LargeDataITCase is not stable
> -
>
> Key: FLINK-29882
> URL: https://issues.apache.org/jira/browse/FLINK-29882
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> https://github.com/apache/flink-table-store/actions/runs/3391781964/jobs/5637271002



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28690) UpdateSchema#fromCatalogTable lost column comment

2023-01-08 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655884#comment-17655884
 ] 

Jingsong Lee commented on FLINK-28690:
--

wait for Flink 1.17

> UpdateSchema#fromCatalogTable lost column comment
> -
>
> Key: FLINK-28690
> URL: https://issues.apache.org/jira/browse/FLINK-28690
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Not a Priority
> Fix For: table-store-0.3.0
>
>
> The reason is that 
> org.apache.flink.table.api.TableSchema#toPhysicalRowDataType lost column 
> comments, which leads to comparison failure in 
> AbstractTableStoreFactory#buildFileStoreTable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-27166) Introduce file structure document for table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-27166.

Resolution: Invalid

> Introduce file structure document for table store
> -
>
> Key: FLINK-27166
> URL: https://issues.apache.org/jira/browse/FLINK-27166
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-26902) Introduce performence tune and benchmark document for table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-26902.

Fix Version/s: table-store-0.3.0
   (was: table-store-0.4.0)
   Resolution: Invalid

> Introduce performence tune and benchmark document for table store
> -
>
> Key: FLINK-26902
> URL: https://issues.apache.org/jira/browse/FLINK-26902
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-27103) Don't store redundant primary key fields

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-27103:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Don't store redundant primary key fields
> 
>
> Key: FLINK-27103
> URL: https://issues.apache.org/jira/browse/FLINK-27103
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
>  Labels: pull-request-available
> Fix For: table-store-0.4.0
>
>
> We are currently storing the primary key redundantly in the file, we can 
> directly use the primary key field in the original fields to avoid redundant 
> storage



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-26902) Introduce performence tune and benchmark document for table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-26902:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Introduce performence tune and benchmark document for table store
> -
>
> Key: FLINK-26902
> URL: https://issues.apache.org/jira/browse/FLINK-26902
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-26937) Introduce Leveled compression for LSM

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-26937:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Introduce Leveled compression for LSM
> -
>
> Key: FLINK-26937
> URL: https://issues.apache.org/jira/browse/FLINK-26937
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: xingyuan cheng
>Priority: Minor
> Fix For: table-store-0.4.0
>
>
> Currently ORC is all ZLIB compression by default, in fact the files at level 
> 0, will definitely be rewritten and we can have different compression for 
> different levels.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28718) SinkSavepointITCase.testRecoverFromSavepoint is unstable

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-28718:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> SinkSavepointITCase.testRecoverFromSavepoint is unstable
> 
>
> Key: FLINK-28718
> URL: https://issues.apache.org/jira/browse/FLINK-28718
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> https://github.com/apache/flink-table-store/runs/7537817210?check_suite_focus=true
> {code:java}
> Error:  Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 185.274 s <<< FAILURE! - in 
> org.apache.flink.table.store.connector.sink.SinkSavepointITCase
> Error:  testRecoverFromSavepoint  Time elapsed: 180.157 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 18 
> milliseconds
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.flink.table.store.connector.sink.SinkSavepointITCase.testRecoverFromSavepoint(SinkSavepointITCase.java:84)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:750)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-27548) Improve quick-start of table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-27548:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Improve quick-start of table store
> --
>
> Key: FLINK-27548
> URL: https://issues.apache.org/jira/browse/FLINK-27548
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.4.0
>
>
> When the quick-start is completed, the stream job needs to be killed on the 
> flink page and the table needs to be dropped.
> But the exiting of the stream job is asynchronous and we need to wait a while 
> between these two actions. Otherwise there will be residue in the file 
> directory.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-26465) Optimize SortMergeReader: use loser tree to reduce comparisons

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-26465:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Optimize SortMergeReader: use loser tree to reduce comparisons
> --
>
> Key: FLINK-26465
> URL: https://issues.apache.org/jira/browse/FLINK-26465
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Aiden Gong
>Priority: Minor
>  Labels: pull-request-available, stale-assigned
> Fix For: table-store-0.4.0
>
>
> See https://en.wikipedia.org/wiki/K-way_merge_algorithm



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28086) Table Store Catalog supports partition methods

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-28086:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Table Store Catalog supports partition methods
> --
>
> Key: FLINK-28086
> URL: https://issues.apache.org/jira/browse/FLINK-28086
> Project: Flink
>  Issue Type: New Feature
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Nicholas Jiang
>Priority: Minor
> Fix For: table-store-0.4.0
>
>
> Table Store Catalog can support:
>  * listPartitions
>  * listPartitionsByFilter
>  * getPartition
>  * partitionExists
>  * dropPartition



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-27679) Add tests for append-only table with log store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-27679.

Resolution: Invalid

> Add tests for append-only table with log store
> --
>
> Key: FLINK-27679
> URL: https://issues.apache.org/jira/browse/FLINK-27679
> Project: Flink
>  Issue Type: Bug
>Reporter: Zheng Hu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: table-store-0.3.0
>
>
> Will publish separate PR to support append-only table for log table.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-27365) [umbrella] Schema evolution on table store

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-27365:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> [umbrella] Schema evolution on table store
> --
>
> Key: FLINK-27365
> URL: https://issues.apache.org/jira/browse/FLINK-27365
> Project: Flink
>  Issue Type: New Feature
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.4.0
>
>
> FLIP in: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-226%3A+Introduce+Schema+Evolution+on+Table+Store



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30248) Spark writer supports insert overwrite

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-30248:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Spark writer supports insert overwrite
> --
>
> Key: FLINK-30248
> URL: https://issues.apache.org/jira/browse/FLINK-30248
> Project: Flink
>  Issue Type: New Feature
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] luoyuxia commented on pull request #21611: [FLINK-30592][doc] remove unsupported hive version in hive overview document

2023-01-08 Thread GitBox


luoyuxia commented on PR #21611:
URL: https://github.com/apache/flink/pull/21611#issuecomment-1375051560

   @chrismartin823 Could you please create a backport for release 1.16. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27002) Optimize batch multiple partitions inserting

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-27002:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Optimize batch multiple partitions inserting
> 
>
> Key: FLINK-27002
> URL: https://issues.apache.org/jira/browse/FLINK-27002
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Minor
> Fix For: table-store-0.4.0
>
>
> By default, batch sink should sort the input by partition and sequence_field 
> to avoid generating a large number of small files. Too many small files cause 
> poor performance, especially object storage.
> We can not implement `SupportsPartitioning.requiresPartitionGrouping`.  we 
> need sequence.field to sort, otherwise we can't confirm what the last record 
> is.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29630) Junit 5.8.1 run unit test with temporary directory will occur Failed to delete temp directory.

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29630:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> Junit 5.8.1 run unit test with temporary directory will occur Failed to 
> delete temp directory.
> --
>
> Key: FLINK-29630
> URL: https://issues.apache.org/jira/browse/FLINK-29630
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: table-store-0.3.0
>Reporter: Aiden Gong
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.4.0
>
> Attachments: image-2022-10-14-09-12-33-903.png
>
>
> Junit 5.8.1 run unit test with temporary directory will occur Failed to 
> delete temp directory.
> My local :
> windows 10
> jdk1.8
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29331) pre-aggregated merge supports changelog inputs

2023-01-08 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-29331:
-
Fix Version/s: table-store-0.4.0
   (was: table-store-0.3.0)

> pre-aggregated merge supports changelog inputs
> --
>
> Key: FLINK-29331
> URL: https://issues.apache.org/jira/browse/FLINK-29331
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> In FLINK-27626 ,  we have supported pre-agg merge, but no changelog inputs 
> support.
> We can support changelog inputs for some function, like sum/count.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


  1   2   >