[GitHub] [flink] xintongsong commented on a diff in pull request #21508: [FLINK-30128] Introduce Azure Data Lake Gen2 APIs in the Hadoop Recoverable path
xintongsong commented on code in PR #21508: URL: https://github.com/apache/flink/pull/21508#discussion_r1064353993 ## flink-filesystems/flink-azure-fs-hadoop/src/main/java/org/apache/flink/fs/azurefs/AzureBlobFsRecoverableDataOutputStream.java: ## @@ -0,0 +1,303 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.fs.azurefs; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.RecoverableFsDataOutputStream; +import org.apache.flink.core.fs.RecoverableWriter; +import org.apache.flink.runtime.fs.hdfs.BaseHadoopFsRecoverableFsDataOutputStream; +import org.apache.flink.runtime.fs.hdfs.HadoopFsRecoverable; + +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.FileNotFoundException; +import java.io.IOException; + +import static org.apache.flink.util.Preconditions.checkNotNull; + +/** + * An implementation of the {@link RecoverableFsDataOutputStream} for AzureBlob's file system + * abstraction. + */ +@Internal +public class AzureBlobFsRecoverableDataOutputStream +extends BaseHadoopFsRecoverableFsDataOutputStream { + +private static final Logger LOG = + LoggerFactory.getLogger(AzureBlobFsRecoverableDataOutputStream.class); +private static final String RENAME = ".rename"; + +// Not final to override in tests +public static int minBufferLength = 2097152; Review Comment: We use `@VisibleForTesting` for fields that have broader visibility then needed for testing purpose. ## flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/BaseHadoopFsRecoverableFsDataOutputStream.java: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.fs.hdfs; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.core.fs.RecoverableFsDataOutputStream; +import org.apache.flink.core.fs.RecoverableWriter; + +import org.apache.hadoop.fs.FSDataOutputStream; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; + +import java.io.IOException; + +/** Base class for ABFS and Hadoop recoverable stream. */ +@Internal +public abstract class BaseHadoopFsRecoverableFsDataOutputStream +extends RecoverableFsDataOutputStream { + +protected FileSystem fs; + +protected Path targetFile; + +protected Path tempFile; + +protected FSDataOutputStream out; + +// In ABFS outputstream we need to add this to the current pos +protected long initialFileSize = 0; + +public long getPos() throws IOException { +return out.getPos(); +} + +@Override +public void write(int b) throws IOException { +out.write(b); +} + +@Override +public void write(byte[] b, int off, int len) throws IOException { +out.write(b, off, len); +} + +@Override +public abstract void flush() throws IOException; + +@Override +public void sync() throws IOException { +out.hflush(); +out.hsync(); +} Review Comment: 1. IIUC, @anoopsjohn 's comment was that you can call `hsync` without calling `hflush` in `sync()` for ABFS. It was not about ABFS should call `hsync` in `flush()`.
[jira] [Comment Edited] (FLINK-30489) flink-sql-connector-pulsar doesn't shade all dependencies
[ https://issues.apache.org/jira/browse/FLINK-30489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17652982#comment-17652982 ] Yufan Sheng edited comment on FLINK-30489 at 1/9/23 7:56 AM: - We didn't shade these dependencies because these dependencies are not always required for Pulsar connector. Such as {{org.bouncycastel}}, you will need it only if you use the end-to-end encryption. But we can truly shade all the dependencies. {{org.sfl4j}} dependency will be removed in the next Pulsar 2.11 release. So no need to shade it. was (Author: syhily): We didn't shade these dependencies because these dependencies are not always required for Pulsar connector. Such as {{org.bouncycastel}}, you will need it only if you use the end-to-end encryption. But we can truly shade all the dependencies. > flink-sql-connector-pulsar doesn't shade all dependencies > - > > Key: FLINK-30489 > URL: https://issues.apache.org/jira/browse/FLINK-30489 > Project: Flink > Issue Type: Bug > Components: Connectors / Pulsar >Affects Versions: 1.16.0 >Reporter: Henri Yandell >Priority: Major > > Looking at > [https://repo1.maven.org/maven2/org/apache/flink/flink-sql-connector-pulsar/1.16.0/flink-sql-connector-pulsar-1.16.0.jar] > I'm seeing that some dependencies are shaded (com.fasterxml, com.yahoo etc), > but others are not (org.sfl4j, org.bouncycastel, com.scurrilous, ...) and > will presumably clash with other jar files. > Additionally, this bundling is going on in the '.jar' file rather than in a > more clearly indicated separate -bundle or -shaded jar file. > As a jar file this seems confusing and potentially bug inducing; though I > note I'm just a review of the jar and not Flink experienced. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30508) CliClientITCase.testSqlStatements failed with output not matched with expected
[ https://issues.apache.org/jira/browse/FLINK-30508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655957#comment-17655957 ] Matthias Pohl commented on FLINK-30508: --- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44552=logs=f2c100be-250b-5e85-7bbe-176f68fcddc5=05efd11e-5400-54a4-0d27-a4663be008a9=13186 > CliClientITCase.testSqlStatements failed with output not matched with expected > -- > > Key: FLINK-30508 > URL: https://issues.apache.org/jira/browse/FLINK-30508 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.17.0 >Reporter: Qingsheng Ren >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44246=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=14992 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30508) CliClientITCase.testSqlStatements failed with output not matched with expected
[ https://issues.apache.org/jira/browse/FLINK-30508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-30508: -- Labels: test-stability (was: ) > CliClientITCase.testSqlStatements failed with output not matched with expected > -- > > Key: FLINK-30508 > URL: https://issues.apache.org/jira/browse/FLINK-30508 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.17.0 >Reporter: Qingsheng Ren >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44246=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=14992 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30525) Cannot open jobmanager configuration web page
[ https://issues.apache.org/jira/browse/FLINK-30525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655952#comment-17655952 ] Weihua Hu commented on FLINK-30525: --- [~renqs] Can you take a look at this? This will affect the release of 1.17 > Cannot open jobmanager configuration web page > - > > Key: FLINK-30525 > URL: https://issues.apache.org/jira/browse/FLINK-30525 > Project: Flink > Issue Type: Bug > Components: Runtime / Web Frontend >Affects Versions: 1.17.0 >Reporter: Weihua Hu >Priority: Major > Labels: pull-request-available > Attachments: image-2022-12-28-20-37-00-825.png, > image-2022-12-28-20-37-05-551.png > > > we remove the environments in rest api in > https://issues.apache.org/jira/browse/FLINK-30116. > The jobmanager.configuration web page will throw "TypeError: Cannot read > properties of undefined (reading 'length')" > the environment in jobmanager.configuration web page should be delete too. > !image-2022-12-28-20-37-00-825.png! > !image-2022-12-28-20-37-05-551.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-28424) JdbcExactlyOnceSinkE2eTest hangs on Azure
[ https://issues.apache.org/jira/browse/FLINK-28424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655951#comment-17655951 ] Matthias Pohl commented on FLINK-28424: --- https://issues.apache.org/jira/browse/FLINK-28424?filter=-1=project%20%3D%20FLINK%20AND%20text%20~%20%22JdbcExactlyOnceSinkE2eTest%22%20AND%20status%20NOT%20IN%20(Closed%2C%20Resolved) > JdbcExactlyOnceSinkE2eTest hangs on Azure > - > > Key: FLINK-28424 > URL: https://issues.apache.org/jira/browse/FLINK-28424 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.16.0, 1.17.0 >Reporter: Martijn Visser >Priority: Major > Labels: auto-deprioritized-critical, test-stability > > {code:java} > 2022-07-06T07:10:57.8133295Z > == > 2022-07-06T07:10:57.8137200Z === WARNING: This task took already 95% of the > available time budget of 232 minutes === > 2022-07-06T07:10:57.8140723Z > == > 2022-07-06T07:10:57.8186584Z > == > 2022-07-06T07:10:57.8187530Z The following Java processes are running (JPS) > 2022-07-06T07:10:57.8188571Z > == > 2022-07-06T07:10:58.2136012Z 825016 Jps > 2022-07-06T07:10:58.2136438Z 34359 surefirebooter8568713056714319310.jar > 2022-07-06T07:10:58.2136774Z 525 Launcher > 2022-07-06T07:10:58.2240260Z > == > 2022-07-06T07:10:58.2240814Z Printing stack trace of Java process 825016 > 2022-07-06T07:10:58.2241256Z > == > 2022-07-06T07:10:58.4498109Z 825016: No such process > 2022-07-06T07:10:58.4524779Z > == > 2022-07-06T07:10:58.4525272Z Printing stack trace of Java process 34359 > 2022-07-06T07:10:58.4525713Z > == > 2022-07-06T07:10:58.6399085Z 2022-07-06 07:10:58 > 2022-07-06T07:10:58.6400425Z Full thread dump OpenJDK 64-Bit Server VM > (25.292-b10 mixed mode): > 2022-07-06T07:10:58.7332738Z "Legacy Source Thread - Source: Custom Source -> > Map -> Sink: Unnamed (1/4)#44585" #870775 prio=5 os_prio=0 > tid=0x7fca5c06f800 nid=0xc3c26 waiting on condition [0x7fca503b1000] > 2022-07-06T07:10:58.786Zjava.lang.Thread.State: WAITING (parking) > 2022-07-06T07:10:58.7333759Z at sun.misc.Unsafe.park(Native Method) > 2022-07-06T07:10:58.7334404Z - parking to wait for <0xd5998448> (a > java.util.concurrent.CountDownLatch$Sync) > 2022-07-06T07:10:58.7334943Z at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > 2022-07-06T07:10:58.7335605Z at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > 2022-07-06T07:10:58.7336392Z at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > 2022-07-06T07:10:58.7337195Z at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > 2022-07-06T07:10:58.7337966Z at > java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > 2022-07-06T07:10:58.7338677Z at > org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest$TestEntrySource.waitForConsumers(JdbcExactlyOnceSinkE2eTest.java:314) > 2022-07-06T07:10:58.7339566Z at > org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest$TestEntrySource.run(JdbcExactlyOnceSinkE2eTest.java:300) > 2022-07-06T07:10:58.7340281Z at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110) > 2022-07-06T07:10:58.7340883Z at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:67) > 2022-07-06T07:10:58.7341583Z at > org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:333) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37685=logs=075127ba-54d5-54b0-cccf-6a36778b332d=c35a13eb-0df9-505f-29ac-8097029d4d79=14871 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29405) InputFormatCacheLoaderTest is unstable
[ https://issues.apache.org/jira/browse/FLINK-29405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655950#comment-17655950 ] Matthias Pohl commented on FLINK-29405: --- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44553=logs=ce3801ad-3bd5-5f06-d165-34d37e757d90=5e4d9387-1dcc-5885-a901-90469b7e6d2f=11690 > InputFormatCacheLoaderTest is unstable > -- > > Key: FLINK-29405 > URL: https://issues.apache.org/jira/browse/FLINK-29405 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.16.0, 1.17.0 >Reporter: Chesnay Schepler >Assignee: Alexander Smirnov >Priority: Blocker > Labels: pull-request-available, test-stability > > #testExceptionDuringReload/#testCloseAndInterruptDuringReload fail reliably > when run in a loop. > {code} > java.lang.AssertionError: > Expecting AtomicInteger(0) to have value: > 0 > but did not. > at > org.apache.flink.table.runtime.functions.table.fullcache.inputformat.InputFormatCacheLoaderTest.testCloseAndInterruptDuringReload(InputFormatCacheLoaderTest.java:161) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #21622: [FLINK-30570] Fix error inferred partition condition
flinkbot commented on PR #21622: URL: https://github.com/apache/flink/pull/21622#issuecomment-1375214257 ## CI report: * fbfa489271fd7ede903d76682fb09565466bf082 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30570) RexNodeExtractor#isSupportedPartitionPredicate generates unexpected partition predicates
[ https://issues.apache.org/jira/browse/FLINK-30570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30570: --- Labels: pull-request-available (was: ) > RexNodeExtractor#isSupportedPartitionPredicate generates unexpected partition > predicates > > > Key: FLINK-30570 > URL: https://issues.apache.org/jira/browse/FLINK-30570 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Reporter: Aitozi >Priority: Major > Labels: pull-request-available > > Currently, the condition {{where rand(1) < 0.0001}} will be recognized as a > partition predicates and will be evaluated to false when compiling the SQL. > It has two problem. > First, it should not be recognized as a partition predicates, and the > nondeterministic function should never pass the partition pruner -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] Aitozi opened a new pull request, #21622: [FLINK-30570] Fix error inferred partition condition
Aitozi opened a new pull request, #21622: URL: https://github.com/apache/flink/pull/21622 ## What is the purpose of the change This PR is meant to fix the unexpected partition pruning. It fix in two parts: - The partition condition should contains the InputRef of the partition field. Otherwise, it will be grouped to non-partition predicates - The partition condition should only accept the deterministic predicates. ## Verifying this change Some tests are added to verify it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-30603) CompactActionITCase in table store is unstable
[ https://issues.apache.org/jira/browse/FLINK-30603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655945#comment-17655945 ] Jingsong Lee commented on FLINK-30603: -- [~TsReaper] CC > CompactActionITCase in table store is unstable > -- > > Key: FLINK-30603 > URL: https://issues.apache.org/jira/browse/FLINK-30603 > Project: Flink > Issue Type: Bug > Components: Table Store >Affects Versions: table-store-0.4.0 >Reporter: Shammon >Priority: Major > > https://github.com/apache/flink-table-store/actions/runs/3871130631/jobs/6598625030 > [INFO] Results: > [INFO] > Error: Failures: > Error:CompactActionITCase.testStreamingCompact:187 expected:<[+I > 1|100|15|20221208, +I 1|100|15|20221209]> but was:<[+I 1|100|15|20221209]> -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] xintongsong commented on a diff in pull request #21347: [FLINK-29640][Client/Job Submission]Enhance the function configured by execution.shutdown-on-attached-exi…
xintongsong commented on code in PR #21347: URL: https://github.com/apache/flink/pull/21347#discussion_r1064320554 ## flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java: ## @@ -1756,6 +1756,9 @@ public final class ConfigConstants { /** The user lib directory name. */ public static final String DEFAULT_FLINK_USR_LIB_DIR = "usrlib"; +/** The initial client timeout when submitting the job. */ +public static final String INITIAL_CLIENT_HEARTBEAT_TIMEOUT = "initialClientHeartbeatTimeout"; Review Comment: `ConfigConstants` is public api annotated with `@Public`. We should not add this internal config key here. I think it would be good enough to introduce the config key as a constant in `JobGraph`. ## flink-clients/src/main/java/org/apache/flink/client/program/ClusterClient.java: ## @@ -206,4 +206,14 @@ default CompletableFuture> listCompletedClusterDatasetIds() { default CompletableFuture invalidateClusterDataset(AbstractID clusterDatasetId) { return CompletableFuture.completedFuture(null); } + +/** + * The client reports the heartbeat to the dispatcher for aliveness. + * + * @param jobId The jobId for the client and the job. + * @return + */ +default CompletableFuture reportHeartbeat(JobID jobId, long expiredTimestamp) { +return CompletableFuture.completedFuture(null); Review Comment: ```suggestion return FutureUtils.completedVoidFuture(); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30561) ChangelogStreamHandleReaderWithCache cause FileNotFoundException
[ https://issues.apache.org/jira/browse/FLINK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifan Wang updated FLINK-30561: Description: When a job with state changelog enabled continues to restart, the following exceptions may occur : {code:java} java.lang.RuntimeException: java.io.FileNotFoundException: /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp (No such file or directory) at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69) at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107) at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78) at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94) at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336) at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.FileNotFoundException: /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp (No such file or directory) at java.io.FileInputStream.open0(Native Method) at java.io.FileInputStream.open(FileInputStream.java:195) at java.io.FileInputStream.init(FileInputStream.java:138) at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:158) at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:95) at org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85) ... 21 more {code} *Problem causes:* # *_ChangelogStreamHandleReaderWithCache_* use RefCountedFile manager local cache file. The reference count is incremented when the input stream is opened from the cache file, and decremented by one when the input stream is closed. So the input stream must be closed and only once. # _*StateChangelogHandleStreamHandleReader#getChanges()*_ may cause the input stream to be closed twice. This happens when changeIterator.read(tuple2.f0, tuple2.f1) throws an exception (for example, when the task is canceled for other reasons during the restore process) the current state change iterator will be closed twice. {code:java} private void advance() { while (!current.hasNext() && handleIterator.hasNext()) { try { current.close(); Tuple2 tuple2 = handleIterator.next(); LOG.debug("read at {} from {}", tuple2.f1, tuple2.f0); current =
[jira] [Updated] (FLINK-30471) Optimize the enriching network memory process in SsgNetworkMemoryCalculationUtils
[ https://issues.apache.org/jira/browse/FLINK-30471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuxin Tan updated FLINK-30471: -- Description: In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, getting PartitionTypes is run in a separate loop, which is not friendly to performance. If we want to add inputPartitionTypes in the subsequential PR, a new separate loop may be introduced too, which I think is not a good choice. Using a separate loop to get each collection just looks simpler in code style, but it will affect the performance. We can get all the results of maxSubpartitionNums and partitionTypes through one loop instead of multiple loops, which will be faster. In this way, when we need to add inputPartitionTypes later, we do not need to add a new loop logic. was: In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, getting PartitionTypes is run in a separate loop, which is not friendly to performance. If we want to get inputPartitionTypes, a new separate loop may be introduced too. It just looks simpler in code, but it will affect the performance. We can get all the results through one loop instead of multiple loops, which will be faster. > Optimize the enriching network memory process in > SsgNetworkMemoryCalculationUtils > - > > Key: FLINK-30471 > URL: https://issues.apache.org/jira/browse/FLINK-30471 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Network >Affects Versions: 1.17.0 >Reporter: Yuxin Tan >Priority: Major > Labels: pull-request-available > > In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, getting > PartitionTypes is run in a separate loop, which is not friendly to > performance. If we want to add inputPartitionTypes in the subsequential PR, > a new separate loop may be introduced too, which I think is not a good choice. > Using a separate loop to get each collection just looks simpler in code > style, but it will affect the performance. We can get all the results of > maxSubpartitionNums and partitionTypes through one loop instead of multiple > loops, which will be faster. In this way, when we need to add > inputPartitionTypes later, we do not need to add a new loop logic. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30561) ChangelogStreamHandleReaderWithCache cause FileNotFoundException
[ https://issues.apache.org/jira/browse/FLINK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feifan Wang updated FLINK-30561: Description: When a job with state changelog enabled continues to restart, the following exceptions may occur : {code:java} java.lang.RuntimeException: java.io.FileNotFoundException: /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp (No such file or directory) at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69) at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107) at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78) at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94) at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336) at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.FileNotFoundException: /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp (No such file or directory) at java.io.FileInputStream.open0(Native Method) at java.io.FileInputStream.open(FileInputStream.java:195) at java.io.FileInputStream.init(FileInputStream.java:138) at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:158) at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:95) at org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85) ... 21 more {code} *Problem causes:* # *_ChangelogStreamHandleReaderWithCache_* use RefCountedFile manager local cache file. The reference count is incremented when the input stream is opened from the cache file, and decremented by one when the input stream is closed. So the input stream must be closed and only once. # _*StateChangelogHandleStreamHandleReader#getChanges()*_ may cause the input stream to be closed twice. This happens when changeIterator.read(tuple2.f0, tuple2.f1) throws an exception (for example, when the task is canceled for other reasons during the restore process) the current state change iterator will be closed twice. {code:java} private void advance() { while (!current.hasNext() && handleIterator.hasNext()) { try { current.close(); Tuple2 tuple2 = handleIterator.next(); LOG.debug("read at {} from {}", tuple2.f1, tuple2.f0); current =
[jira] [Closed] (FLINK-30602) Remove FileStoreTableITCase in table store
[ https://issues.apache.org/jira/browse/FLINK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-30602. Fix Version/s: table-store-0.4.0 Assignee: Shammon Resolution: Fixed master: b27fa51ed05bca3596ef1ba5131e97921ec01e33 > Remove FileStoreTableITCase in table store > -- > > Key: FLINK-30602 > URL: https://issues.apache.org/jira/browse/FLINK-30602 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Affects Versions: table-store-0.4.0 >Reporter: Shammon >Assignee: Shammon >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.4.0 > > > Remove `FileStoreTableITCase` in table store -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-table-store] JingsongLi merged pull request #470: [FLINK-30602] Remove FileStoreTableITCase in table store
JingsongLi merged PR #470: URL: https://github.com/apache/flink-table-store/pull/470 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-24989) Upgrade shade-plugin to 3.2.4
[ https://issues.apache.org/jira/browse/FLINK-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655934#comment-17655934 ] Dian Fu commented on FLINK-24989: - [~chesnay] It seems that Maven shade plugin supports JDK17 since 3.3.0. I guess we need to bump it again.See [https://blogs.apache.org/maven/entry/apache-maven-shade-plugin-version6] for more details. I encountered the following issue when building Flink with JDK17: {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (shade-flink) on project flink-runtime: Error creating shaded jar: Problem shading JAR /Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: java.lang.IllegalArgumentException: Unsupported class file major version 61 -> [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (shade-flink) on project flink-runtime: Error creating shaded jar: Problem shading JAR /Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: java.lang.IllegalArgumentException: Unsupported class file major version 61 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:216) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80) at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51) at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120) at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355) at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155) at org.apache.maven.cli.MavenCli.execute(MavenCli.java:584) at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216) at org.apache.maven.cli.MavenCli.main(MavenCli.java:160) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289) at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229) at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415) at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356) Caused by: org.apache.maven.plugin.MojoExecutionException: Error creating shaded jar: Problem shading JAR /Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: java.lang.IllegalArgumentException: Unsupported class file major version 61 at org.apache.maven.plugins.shade.mojo.ShadeMojo.execute(ShadeMojo.java:607) at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208) ... 19 more Caused by: java.io.IOException: Problem shading JAR /Users/dianfu/code/src/apache/flink/flink-runtime/target/flink-runtime-1.17-SNAPSHOT.jar entry org/apache/flink/runtime/jobmaster/ExecutionDeploymentReconciler.class: java.lang.IllegalArgumentException: Unsupported class file major version 61 at org.apache.maven.plugins.shade.DefaultShader.shadeJars(DefaultShader.java:201) at org.apache.maven.plugins.shade.DefaultShader.shade(DefaultShader.java:108) at org.apache.maven.plugins.shade.mojo.ShadeMojo.execute(ShadeMojo.java:463) ... 21 more Caused by: java.lang.IllegalArgumentException: Unsupported class file major version 61 at org.objectweb.asm.ClassReader.(ClassReader.java:196) at org.objectweb.asm.ClassReader.(ClassReader.java:177) at org.objectweb.asm.ClassReader.(ClassReader.java:163) at org.objectweb.asm.ClassReader.(ClassReader.java:284) at org.apache.maven.plugins.shade.DefaultShader.addRemappedClass(DefaultShader.java:465) at org.apache.maven.plugins.shade.DefaultShader.shadeSingleJar(DefaultShader.java:234) at org.apache.maven.plugins.shade.DefaultShader.shadeJars(DefaultShader.java:196) ... 23 more{code} > Upgrade shade-plugin to 3.2.4 >
[GitHub] [flink] flinkbot commented on pull request #21621: [FLINK-30601][runtime] Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance
flinkbot commented on PR #21621: URL: https://github.com/apache/flink/pull/21621#issuecomment-1375167857 ## CI report: * 3e738de71ae44af8940e8e170f71a82c473d012b UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance
[ https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30601: --- Labels: pull-request-available (was: ) > Omit "setKeyContextElement" call for non-keyed stream/operators to improve > performance > -- > > Key: FLINK-30601 > URL: https://issues.apache.org/jira/browse/FLINK-30601 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Reporter: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > Currently, flink will set the correct key context(by call > [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) > before processing each record, which is typically used to extract key from > record and pass that key to the state backends. > However, the "setKeyContextElement" is obviously not need for non-keyed > stream/operator, in which case we can omit the "setKeyContextElement" calls > to improve performance. Note that "setKeyContextElement" is an interface > method, it requires looking up the interface table when calling, which will > further increase the method call overhead. > > We run the following program as benchmark with parallelism=1 and object > re-use enabled. The benchmark results are averaged across 5 runs for each > setup. Before and after applying the proposed change, the average execution > time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. > {code:java} > env.fromSequence(1, 10L) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x).addSink(new DiscardingSink<>());{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] wanglijie95 opened a new pull request, #21621: [FLINK-30601][runtime] Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance
wanglijie95 opened a new pull request, #21621: URL: https://github.com/apache/flink/pull/21621 ## What is the purpose of the change Currently, flink will set the correct key context(by call [setKeyContextElement](https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B)) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we can omit the "setKeyContextElement" calls to improve performance. ## Verifying this change Add test `RecordProcessorUtilsTest` ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (**no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**no**) - The serializers: (**no**) - The runtime per-record code paths (performance sensitive): (**no**) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (**no**) - The S3 file system connector: (**no**) ## Documentation - Does this pull request introduce a new feature? (**no**) - If yes, how is the feature documented? (**not applicable**) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] xintongsong commented on a diff in pull request #21565: [FLINK-18229][ResourceManager] Support cancel pending workers if no longer needed.
xintongsong commented on code in PR #21565: URL: https://github.com/apache/flink/pull/21565#discussion_r1064310906 ## flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java: ## @@ -284,7 +284,7 @@ public void close() throws Exception { @Override public void clearResourceRequirements(JobID jobId) { jobMasterTargetAddresses.remove(jobId); -taskManagerTracker.clearPendingAllocationsOfJob(jobId); +taskManagerTracker.clearPendingAllocationsOfJob(jobId, resourceAllocator.isSupported()); Review Comment: We should not pass `resourceAllocator.isSupported()` into `taskManagerTracker`. Instead, we should not call `taskManagerTracker.clearPendingAllocationsOfJob()` if the allocator is not supported. ## flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/active/ActiveResourceManager.java: ## @@ -342,21 +352,53 @@ private void checkResourceDeclarations() { resourceDeclaration.getUnwantedWorkers(), releaseOrRequestWorkerNumber); -// TODO, release pending/starting/running workers to exceed declared worker number. if (remainingReleasingWorkerNumber > 0) { -log.debug( -"need release {} workers after release unwanted workers.", -remainingReleasingWorkerNumber); +// release not allocated workers; +remainingReleasingWorkerNumber = +releaseUnallocatedWorkers( +workerResourceSpec, remainingReleasingWorkerNumber); } + +if (remainingReleasingWorkerNumber > 0) { +// release starting workers and running workers; +List workerCanReleaseInStartingOrder = new ArrayList<>(); +currentAttemptUnregisteredWorkers.stream() +.filter(r -> workerResourceSpec.equals(workerResourceSpecs.get(r))) +.forEach(workerCanReleaseInStartingOrder::add); +workerNodeMap.keySet().stream() +.filter( +r -> + workerResourceSpec.equals(workerResourceSpecs.get(r)) +&& !currentAttemptUnregisteredWorkers.contains( +r)) +.forEach(workerCanReleaseInStartingOrder::add); + +remainingReleasingWorkerNumber = +releaseResources( +workerCanReleaseInStartingOrder, +remainingReleasingWorkerNumber); +} Review Comment: This is not what I meant by deduplicating. There's no need to collect starting and running workers into one list. The following logics for starting and running workers are identical: - Filtering workers that matches the given spec - Releasing as many resources as needed - Returning the remaining number of resources that needs to be released The above logics can be deduplicated as a method, with the only difference being the collection of workers passed in. ## flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManagerDriver.java: ## @@ -373,8 +421,8 @@ private void onContainersOfPriorityAllocated(Priority priority, List requestResourceFutures.remove(taskExecutorProcessSpec); } -startTaskExecutorInContainerAsync( -container, taskExecutorProcessSpec, resourceId, requestResourceFuture); +startTaskExecutorInContainerAsync(container, taskExecutorProcessSpec, resourceId); +requestResourceFuture.complete(new YarnWorkerNode(container, resourceId)); Review Comment: Minor: ```suggestion requestResourceFuture.complete(new YarnWorkerNode(container, resourceId)); startTaskExecutorInContainerAsync(container, taskExecutorProcessSpec, resourceId); ``` ## flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/FineGrainedSlotManager.java: ## @@ -284,7 +284,7 @@ public void close() throws Exception { @Override public void clearResourceRequirements(JobID jobId) { jobMasterTargetAddresses.remove(jobId); -taskManagerTracker.clearPendingAllocationsOfJob(jobId); +taskManagerTracker.clearPendingAllocationsOfJob(jobId, resourceAllocator.isSupported()); Review Comment: Same for `replaceAllPendingAllocations`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to
[GitHub] [flink] TanYuxin-tyx commented on pull request #21618: [FLINK-30472][network] Modify the default value of the max network memory config option
TanYuxin-tyx commented on PR #21618: URL: https://github.com/apache/flink/pull/21618#issuecomment-1375163838 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] 1996fanrui commented on pull request #21614: [FLINK-30584][docs] Document flame graph at the subtask level
1996fanrui commented on PR #21614: URL: https://github.com/apache/flink/pull/21614#issuecomment-1375162700 Hi @xintongsong , please help take a look in your free time, thanks~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-30603) CompactActionITCase in table store is unstable
Shammon created FLINK-30603: --- Summary: CompactActionITCase in table store is unstable Key: FLINK-30603 URL: https://issues.apache.org/jira/browse/FLINK-30603 Project: Flink Issue Type: Bug Components: Table Store Affects Versions: table-store-0.4.0 Reporter: Shammon https://github.com/apache/flink-table-store/actions/runs/3871130631/jobs/6598625030 [INFO] Results: [INFO] Error: Failures: Error:CompactActionITCase.testStreamingCompact:187 expected:<[+I 1|100|15|20221208, +I 1|100|15|20221209]> but was:<[+I 1|100|15|20221209]> -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] lsyldliu commented on pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime
lsyldliu commented on PR #21556: URL: https://github.com/apache/flink/pull/21556#issuecomment-1375139834 @godfreyhe Thanks for reviewing, I've address the comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-30600) Merge flink-table-store-kafka to flink-table-store-connector
[ https://issues.apache.org/jira/browse/FLINK-30600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-30600. Resolution: Fixed master: dd95fc47da7dad818c13bb178a7671db326088d9 > Merge flink-table-store-kafka to flink-table-store-connector > > > Key: FLINK-30600 > URL: https://issues.apache.org/jira/browse/FLINK-30600 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.4.0 > > > At present, Kafka heavily relies on the implementation of Flink, which is > difficult to extract, so it can be directly incorporated into the Flink > connector. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-table-store] JingsongLi merged pull request #469: [FLINK-30600] Merge flink-table-store-kafka to flink-table-store-connector
JingsongLi merged PR #469: URL: https://github.com/apache/flink-table-store/pull/469 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-table-store] zjureel commented on pull request #469: [FLINK-30600] Merge flink-table-store-kafka to flink-table-store-connector
zjureel commented on PR #469: URL: https://github.com/apache/flink-table-store/pull/469#issuecomment-1375132561 LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] laughingman7743 commented on pull request #21613: [FLINK-30093][formats] Fix compile errors for google.protobuf.Timestamp type
laughingman7743 commented on PR #21613: URL: https://github.com/apache/flink/pull/21613#issuecomment-1375132192 @maosuhan I have added a description of data mapping of type google.protobuf.Timestamp to the documentation. https://github.com/apache/flink/pull/21613/commits/23b5fb0b35abb5e888ad7b50fcc22c837f860977 https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44568=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=160c9ae5-96fd-516e-1c91-deb81f59292a The e2e test seems to be failing, but I can't find the cause. Could the changes made in this pull request be affecting it? Please let me know if I am missing something. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30602) Remove FileStoreTableITCase in table store
[ https://issues.apache.org/jira/browse/FLINK-30602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30602: --- Labels: pull-request-available (was: ) > Remove FileStoreTableITCase in table store > -- > > Key: FLINK-30602 > URL: https://issues.apache.org/jira/browse/FLINK-30602 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Affects Versions: table-store-0.4.0 >Reporter: Shammon >Priority: Major > Labels: pull-request-available > > Remove `FileStoreTableITCase` in table store -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink-table-store] zjureel opened a new pull request, #470: [FLINK-30602] Remove FileStoreTableITCase in table store
zjureel opened a new pull request, #470: URL: https://github.com/apache/flink-table-store/pull/470 Remove `FileStoreTableITCase` and fix related test case -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #21620: [FLINK-30473][network] Optimize the InputGate network memory management for TaskManager
flinkbot commented on PR #21620: URL: https://github.com/apache/flink/pull/21620#issuecomment-1375130869 ## CI report: * a92b74f449522a5ccf1c9cb05e3079c2fdfa7581 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] KarlManong commented on pull request #18059: [FLINK-25224][filesystem] Bump Hadoop version to 2.8.4
KarlManong commented on PR #18059: URL: https://github.com/apache/flink/pull/18059#issuecomment-1375128934 @liufangqi could you please fix the title? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30473) Optimize the InputGate network memory management for TaskManager
[ https://issues.apache.org/jira/browse/FLINK-30473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30473: --- Labels: pull-request-available (was: ) > Optimize the InputGate network memory management for TaskManager > > > Key: FLINK-30473 > URL: https://issues.apache.org/jira/browse/FLINK-30473 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Network >Affects Versions: 1.17.0 >Reporter: Yuxin Tan >Priority: Major > Labels: pull-request-available > > Based on the > [FLIP-266|https://cwiki.apache.org/confluence/display/FLINK/FLIP-266%3A+Simplify+network+memory+configurations+for+TaskManager], > this issue mainly focuses on the first issue. > This change proposes a method to control the maximum required memory buffers > in an inputGate according to parallelism size. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] TanYuxin-tyx opened a new pull request, #21620: [FLINK-30473][network] Optimize the InputGate network memory management for TaskManager
TanYuxin-tyx opened a new pull request, #21620: URL: https://github.com/apache/flink/pull/21620 ## What is the purpose of the change *This change proposes a method to control the maximum required memory buffers in an inputGate according to parallelism size and data partition type* ## Brief change log - *Modify required buffers calculation in SingleInputGateFactory* - *Add a new class GateBuffersNumCalculator in SingleInputGateFactory* - *Add a new option taskmanager.network.memory.read-buffer.required-per-gate.max* - *Deparacate taskmanager.network.memory.buffers-per-channel and taskmanager.network.memory.floating-buffers-per-gate* ## Verifying this change This change added tests GateBuffersNumCalculatorTest. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**yes** / no) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (**yes** / no) - If yes, how is the feature documented? (not applicable / **docs** / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] ramkrish86 commented on pull request #21512: [FLINK-30375][hotfix]sql-client leaks flink-table-planner jar in /tmp
ramkrish86 commented on PR #21512: URL: https://github.com/apache/flink/pull/21512#issuecomment-1375123666 +1 (non binding). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-30602) Remove FileStoreTableITCase in table store
Shammon created FLINK-30602: --- Summary: Remove FileStoreTableITCase in table store Key: FLINK-30602 URL: https://issues.apache.org/jira/browse/FLINK-30602 Project: Flink Issue Type: Sub-task Components: Table Store Affects Versions: table-store-0.4.0 Reporter: Shammon Remove `FileStoreTableITCase` in table store -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #21619: [FLINK-30308][python] Fix ClassLeakCleaner to work with Java 11.0.16+ and 17.0.4+
flinkbot commented on PR #21619: URL: https://github.com/apache/flink/pull/21619#issuecomment-1375105572 ## CI report: * a3ec6a28e6443ec364e8a578bfba0f18ef845cd4 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30308) ClassCastException: class java.io.ObjectStreamClass$Caches$1 cannot be cast to class java.util.Map is showing in the logging when the job shutdown
[ https://issues.apache.org/jira/browse/FLINK-30308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30308: --- Labels: pull-request-available (was: ) > ClassCastException: class java.io.ObjectStreamClass$Caches$1 cannot be cast > to class java.util.Map is showing in the logging when the job shutdown > -- > > Key: FLINK-30308 > URL: https://issues.apache.org/jira/browse/FLINK-30308 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0, 1.16.1, 1.15.4 > > > {code:java} > 2022-12-05 18:26:40,229 WARN > org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Failed > to clean up the leaking objects. > java.lang.ClassCastException: class java.io.ObjectStreamClass$Caches$1 cannot > be cast to class java.util.Map (java.io.ObjectStreamClass$Caches$1 and > java.util.Map are in module java.base of loader 'bootstrap') > at > org.apache.flink.streaming.api.utils.ClassLeakCleaner.clearCache(ClassLeakCleaner.java:58) > > ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0] > at > org.apache.flink.streaming.api.utils.ClassLeakCleaner.cleanUpLeakingClasses(ClassLeakCleaner.java:39) > > ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0] > at > org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.close(AbstractPythonFunctionOperator.java:142) > > ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0] > at > org.apache.flink.streaming.api.operators.python.AbstractExternalPythonFunctionOperator.close(AbstractExternalPythonFunctionOperator.java:73) > > ~[blob_p-a72e14b9030c3ca0d3d0a8fc6e70166c7419d431-f7f18b2164971cb6798db9ab762feabd:1.15.0] > at > org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:163) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:125) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:997) > ~[flink-dist-1.15.2.jar:1.15.2] > at org.apache.flink.util.IOUtils.closeAll(IOUtils.java:254) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.core.fs.AutoCloseableRegistry.doClose(AutoCloseableRegistry.java:72) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.util.AbstractAutoCloseableRegistry.close(AbstractAutoCloseableRegistry.java:127) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:916) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$0(Task.java:930) > ~[flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948) > [flink-dist-1.15.2.jar:1.15.2] > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:930) > [flink-dist-1.15.2.jar:1.15.2] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741) > [flink-dist-1.15.2.jar:1.15.2] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) > [flink-dist-1.15.2.jar:1.15.2] > at java.lang.Thread.run(Unknown Source) [?:?]{code} > Reported in Slack: > https://apache-flink.slack.com/archives/C03G7LJTS2G/p1670265131083639?thread_ts=1670265114.640369=C03G7LJTS2G -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] dianfu opened a new pull request, #21619: [FLINK-30308][python] Fix ClassLeakCleaner to work with Java 11.0.16+ and 17.0.4+
dianfu opened a new pull request, #21619: URL: https://github.com/apache/flink/pull/21619 ## What is the purpose of the change *This pull request fixes ClassLeakCleaner to work with Java 11.0.16+ and 17.0.4+.* ## Verifying this change This change is a trivial rework without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): ( no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: ( no) - The S3 file system connector: ( no ) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-30561) ChangelogStreamHandleReaderWithCache cause FileNotFoundException
[ https://issues.apache.org/jira/browse/FLINK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655905#comment-17655905 ] Yanfei Lei commented on FLINK-30561: [~Feifan Wang] Thanks for the clarification. I think this makes sense in case the TM process doesn't restart when the job failover. > ChangelogStreamHandleReaderWithCache cause FileNotFoundException > > > Key: FLINK-30561 > URL: https://issues.apache.org/jira/browse/FLINK-30561 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Affects Versions: 1.16.0 >Reporter: Feifan Wang >Priority: Major > > When a job with state changelog enabled continues to restart, the following > exceptions may occur : > {code:java} > java.lang.RuntimeException: java.io.FileNotFoundException: > /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp > (No such file or directory) > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) > at > org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87) > at > org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69) > at > org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107) > at > org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78) > at > org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94) > at > org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136) > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336) > at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) > at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353) > at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726) > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.FileNotFoundException: > /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp > (No such file or directory) > at java.io.FileInputStream.open0(Native Method) > at java.io.FileInputStream.open(FileInputStream.java:195) > at java.io.FileInputStream.init(FileInputStream.java:138) > at > org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:158) > at > org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:95) > at > org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42) > at > org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85) > ... 21 more {code} > *Problem causes:* > # *_ChangelogStreamHandleReaderWithCache_* use RefCountedFile manager local > cache file. The reference count is incremented when the input stream is > opened from the cache file, and
[jira] [Updated] (FLINK-30542) Support adaptive local hash aggregate in runtime
[ https://issues.apache.org/jira/browse/FLINK-30542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Godfrey He updated FLINK-30542: --- Summary: Support adaptive local hash aggregate in runtime (was: Support adaptive hash aggregate in runtime) > Support adaptive local hash aggregate in runtime > > > Key: FLINK-30542 > URL: https://issues.apache.org/jira/browse/FLINK-30542 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.17.0 >Reporter: Yunhong Zheng >Priority: Major > Fix For: 1.17.0 > > > Introduce a new strategy to adaptively determine whether local hash aggregate > is required according to the aggregation degree of local hash aggregate. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-30491) Hive table partition supports to deserialize later during runtime
[ https://issues.apache.org/jira/browse/FLINK-30491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Godfrey He reassigned FLINK-30491: -- Assignee: dalongliu > Hive table partition supports to deserialize later during runtime > - > > Key: FLINK-30491 > URL: https://issues.apache.org/jira/browse/FLINK-30491 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Affects Versions: 1.16.0 >Reporter: dalongliu >Assignee: dalongliu >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-30542) Support adaptive local hash aggregate in runtime
[ https://issues.apache.org/jira/browse/FLINK-30542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Godfrey He reassigned FLINK-30542: -- Assignee: Yunhong Zheng > Support adaptive local hash aggregate in runtime > > > Key: FLINK-30542 > URL: https://issues.apache.org/jira/browse/FLINK-30542 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.17.0 >Reporter: Yunhong Zheng >Assignee: Yunhong Zheng >Priority: Major > Fix For: 1.17.0 > > > Introduce a new strategy to adaptively determine whether local hash aggregate > is required according to the aggregation degree of local hash aggregate. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-30365) New dynamic partition pruning strategy to support more dpp patterns
[ https://issues.apache.org/jira/browse/FLINK-30365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Godfrey He closed FLINK-30365. -- Resolution: Fixed Fixed in 1.17.0: d9f9b55f82dfbc1676572cc36b718a99001497f8 > New dynamic partition pruning strategy to support more dpp patterns > --- > > Key: FLINK-30365 > URL: https://issues.apache.org/jira/browse/FLINK-30365 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.17.0 >Reporter: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > New dynamic partition pruning strategy to support more dpp patterns. Now, dpp > rules is coupled with the join reorder rules, which will affect the result of > join reorder. At the same time, the dpp rule don't support these patterns > like union node in fact side. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-30365) New dynamic partition pruning strategy to support more dpp patterns
[ https://issues.apache.org/jira/browse/FLINK-30365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Godfrey He reassigned FLINK-30365: -- Assignee: Yunhong Zheng > New dynamic partition pruning strategy to support more dpp patterns > --- > > Key: FLINK-30365 > URL: https://issues.apache.org/jira/browse/FLINK-30365 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.17.0 >Reporter: Yunhong Zheng >Assignee: Yunhong Zheng >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > New dynamic partition pruning strategy to support more dpp patterns. Now, dpp > rules is coupled with the join reorder rules, which will affect the result of > join reorder. At the same time, the dpp rule don't support these patterns > like union node in fact side. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] godfreyhe closed pull request #21489: [FLINK-30365][table-planner] New dynamic partition pruning strategy to support more dpp patterns
godfreyhe closed pull request #21489: [FLINK-30365][table-planner] New dynamic partition pruning strategy to support more dpp patterns URL: https://github.com/apache/flink/pull/21489 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] fsk119 commented on pull request #21597: [FLINK-30554][sql-client] Remove useless sessionId in the Executor
fsk119 commented on PR #21597: URL: https://github.com/apache/flink/pull/21597#issuecomment-1375092445 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance
[ https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-30601: --- Description: Currently, flink will set the correct key context(by call [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we can omit the "setKeyContextElement" calls to improve performance. Note that setKeyContextElement is an interface method, it requires looking up the interface table when calling, which will further increase the method call overhead. We run the following program as benchmark with parallelism=1 and object re-use enabled. The benchmark results are averaged across 5 runs for each setup. Before and after applying the proposed change, the average execution time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. {code:java} env.fromSequence(1, 10L) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x).addSink(new DiscardingSink<>()); {code} was: Currently, flink will set the correct key context(by call [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we should omit the "setKeyContextElement" calls to improve performance. Note that setKeyContextElement is an interface method, it requires looking up the interface table when calling, which will further increase the method call overhead. We run the following program as benchmark with parallelism=1 and object re-use enabled. The benchmark results are averaged across 5 runs for each setup. Before and after applying the proposed change, the average execution time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. {code:java} env.fromSequence(1, 10L) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x).addSink(new DiscardingSink<>()); {code} > Omit "setKeyContextElement" call for non-keyed stream/operators to improve > performance > -- > > Key: FLINK-30601 > URL: https://issues.apache.org/jira/browse/FLINK-30601 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Reporter: Lijie Wang >Priority: Major > Fix For: 1.17.0 > > > Currently, flink will set the correct key context(by call > [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) > before processing each record, which is typically used to extract key from > record and pass that key to the state backends. > However, the "setKeyContextElement" is obviously not need for non-keyed > stream/operator, in which case we can omit the "setKeyContextElement" calls > to improve performance. Note that setKeyContextElement is an interface > method, it requires looking up the interface table when calling, which will > further increase the method call overhead. > > We run the following program as benchmark with parallelism=1 and object > re-use enabled. The benchmark results are averaged across 5 runs for each > setup. Before and after applying the proposed change, the average execution > time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. > > {code:java} > env.fromSequence(1, 10L) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x).addSink(new DiscardingSink<>()); > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance
[ https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-30601: --- Description: Currently, flink will set the correct key context(by call [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we can omit the "setKeyContextElement" calls to improve performance. Note that "setKeyContextElement" is an interface method, it requires looking up the interface table when calling, which will further increase the method call overhead. We run the following program as benchmark with parallelism=1 and object re-use enabled. The benchmark results are averaged across 5 runs for each setup. Before and after applying the proposed change, the average execution time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. {code:java} env.fromSequence(1, 10L) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x).addSink(new DiscardingSink<>()); {code} was: Currently, flink will set the correct key context(by call [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we can omit the "setKeyContextElement" calls to improve performance. Note that setKeyContextElement is an interface method, it requires looking up the interface table when calling, which will further increase the method call overhead. We run the following program as benchmark with parallelism=1 and object re-use enabled. The benchmark results are averaged across 5 runs for each setup. Before and after applying the proposed change, the average execution time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. {code:java} env.fromSequence(1, 10L) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x).addSink(new DiscardingSink<>()); {code} > Omit "setKeyContextElement" call for non-keyed stream/operators to improve > performance > -- > > Key: FLINK-30601 > URL: https://issues.apache.org/jira/browse/FLINK-30601 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Reporter: Lijie Wang >Priority: Major > Fix For: 1.17.0 > > > Currently, flink will set the correct key context(by call > [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) > before processing each record, which is typically used to extract key from > record and pass that key to the state backends. > However, the "setKeyContextElement" is obviously not need for non-keyed > stream/operator, in which case we can omit the "setKeyContextElement" calls > to improve performance. Note that "setKeyContextElement" is an interface > method, it requires looking up the interface table when calling, which will > further increase the method call overhead. > > We run the following program as benchmark with parallelism=1 and object > re-use enabled. The benchmark results are averaged across 5 runs for each > setup. Before and after applying the proposed change, the average execution > time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. > {code:java} > env.fromSequence(1, 10L) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x).addSink(new DiscardingSink<>()); > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance
[ https://issues.apache.org/jira/browse/FLINK-30601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lijie Wang updated FLINK-30601: --- Description: Currently, flink will set the correct key context(by call [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we can omit the "setKeyContextElement" calls to improve performance. Note that "setKeyContextElement" is an interface method, it requires looking up the interface table when calling, which will further increase the method call overhead. We run the following program as benchmark with parallelism=1 and object re-use enabled. The benchmark results are averaged across 5 runs for each setup. Before and after applying the proposed change, the average execution time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. {code:java} env.fromSequence(1, 10L) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x).addSink(new DiscardingSink<>());{code} was: Currently, flink will set the correct key context(by call [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we can omit the "setKeyContextElement" calls to improve performance. Note that "setKeyContextElement" is an interface method, it requires looking up the interface table when calling, which will further increase the method call overhead. We run the following program as benchmark with parallelism=1 and object re-use enabled. The benchmark results are averaged across 5 runs for each setup. Before and after applying the proposed change, the average execution time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. {code:java} env.fromSequence(1, 10L) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x).addSink(new DiscardingSink<>()); {code} > Omit "setKeyContextElement" call for non-keyed stream/operators to improve > performance > -- > > Key: FLINK-30601 > URL: https://issues.apache.org/jira/browse/FLINK-30601 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Reporter: Lijie Wang >Priority: Major > Fix For: 1.17.0 > > > Currently, flink will set the correct key context(by call > [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) > before processing each record, which is typically used to extract key from > record and pass that key to the state backends. > However, the "setKeyContextElement" is obviously not need for non-keyed > stream/operator, in which case we can omit the "setKeyContextElement" calls > to improve performance. Note that "setKeyContextElement" is an interface > method, it requires looking up the interface table when calling, which will > further increase the method call overhead. > > We run the following program as benchmark with parallelism=1 and object > re-use enabled. The benchmark results are averaged across 5 runs for each > setup. Before and after applying the proposed change, the average execution > time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. > {code:java} > env.fromSequence(1, 10L) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x) > .map(x -> x).addSink(new DiscardingSink<>());{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30601) Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance
Lijie Wang created FLINK-30601: -- Summary: Omit "setKeyContextElement" call for non-keyed stream/operators to improve performance Key: FLINK-30601 URL: https://issues.apache.org/jira/browse/FLINK-30601 Project: Flink Issue Type: Improvement Components: Runtime / Task Reporter: Lijie Wang Fix For: 1.17.0 Currently, flink will set the correct key context(by call [setKeyContextElement|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/ChainingOutput.java#:~:text=input.setKeyContextElement(castRecord)%3B]) before processing each record, which is typically used to extract key from record and pass that key to the state backends. However, the "setKeyContextElement" is obviously not need for non-keyed stream/operator, in which case we should omit the "setKeyContextElement" calls to improve performance. Note that setKeyContextElement is an interface method, it requires looking up the interface table when calling, which will further increase the method call overhead. We run the following program as benchmark with parallelism=1 and object re-use enabled. The benchmark results are averaged across 5 runs for each setup. Before and after applying the proposed change, the average execution time changed from 88.39 s to 78.76 s, which increases throughput by 10.8%. {code:java} env.fromSequence(1, 10L) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x) .map(x -> x).addSink(new DiscardingSink<>()); {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28796) Add Statement Completement API for sql gateway rest endpoint
[ https://issues.apache.org/jira/browse/FLINK-28796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shengkai Fang closed FLINK-28796. - Fix Version/s: 1.17.0 Resolution: Implemented Merged into master: 6301cca7a6924cebd1107fcd39c063b7a091551d > Add Statement Completement API for sql gateway rest endpoint > > > Key: FLINK-28796 > URL: https://issues.apache.org/jira/browse/FLINK-28796 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Gateway >Reporter: Wencong Liu >Assignee: yuzelin >Priority: Major > Labels: pull-request-available > Fix For: 1.17.0 > > > SQL Gateway supports various clients: sql client, rest, hiveserver2, etc. > Given the 1.16 feature freeze date, we won't be able to finish all the > endpoints. Thus, we'd exclude one of the rest apis (tracked by this ticket) > from [FLINK-28163] Introduce the statement related API for REST endpoint - > ASF JIRA (apache.org)], which is only needed by the sql client, and still try > to complete the remaining of them. > In other words, we'd expect the sql gateway to support rest & hiveserver2 > apis in 1.16, and sql client in 1.17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] fsk119 merged pull request #21526: [FLINK-28796] Add completeStatement REST API in the SQL Gateway
fsk119 merged PR #21526: URL: https://github.com/apache/flink/pull/21526 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-22320) Add documentation for new introduced ALTER TABLE statements
[ https://issues.apache.org/jira/browse/FLINK-22320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655896#comment-17655896 ] Jane Chan commented on FLINK-22320: --- Hi, I would like to take on this task. cc [~fsk119] > Add documentation for new introduced ALTER TABLE statements > --- > > Key: FLINK-22320 > URL: https://issues.apache.org/jira/browse/FLINK-22320 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table SQL / API >Reporter: Jark Wu >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] chrismartin823 commented on pull request #21611: [FLINK-30592][doc] remove unsupported hive version in hive overview document
chrismartin823 commented on PR #21611: URL: https://github.com/apache/flink/pull/21611#issuecomment-1375070781 > @chrismartin823 Could you please create a backport for release 1.16. Thanks. ok,I will create a backport for release 1.16. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-connector-hive] chenqin closed pull request #2: Move Flink Hive Connector
chenqin closed pull request #2: Move Flink Hive Connector URL: https://github.com/apache/flink-connector-hive/pull/2 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19883) Support "IF EXISTS" in DDL for ALTER TABLE
[ https://issues.apache.org/jira/browse/FLINK-19883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655894#comment-17655894 ] Jane Chan commented on FLINK-19883: --- Hi [~nicholasjiang], I would like to continue working on this issue if you are unavailable. > Support "IF EXISTS" in DDL for ALTER TABLE > -- > > Key: FLINK-19883 > URL: https://issues.apache.org/jira/browse/FLINK-19883 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.11.2 >Reporter: Ingo Bürk >Assignee: Nicholas Jiang >Priority: Not a Priority > Labels: pull-request-available, stale-assigned > > ALTER TABLE does not seem to support the "IF EXISTS" part in the DDL, but the > corresponding methods Catalog#renameTable and Catalog#alterTable do support > it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] flinkbot commented on pull request #21618: [FLINK-30472][network] Modify the default value of the max network memory config option
flinkbot commented on PR #21618: URL: https://github.com/apache/flink/pull/21618#issuecomment-1375067510 ## CI report: * 82022cdf0c82952340b60648ff258c63228ac8ba UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] godfreyhe commented on a diff in pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime
godfreyhe commented on code in PR #21556: URL: https://github.com/apache/flink/pull/21556#discussion_r1064264590 ## flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/util/HivePartitionUtils.java: ## @@ -290,4 +291,33 @@ private static void listStatusRecursively( } } } + +public static List serializeHiveTablePartition( Review Comment: nit: it's better we can add some test to verify this utility method. ## flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTablePartitionSerializer.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.connectors.hive; + +import org.apache.flink.core.io.SimpleVersionedSerializer; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.ObjectOutputStream; + +import static org.apache.flink.util.Preconditions.checkArgument; + +/** SerDe for {@link HiveTablePartition}. */ +public class HiveTablePartitionSerializer implements SimpleVersionedSerializer { + +private static final int VERSION = 1; + +public static final HiveTablePartitionSerializer INSTANCE = new HiveTablePartitionSerializer(); + +@Override +public int getVersion() { +return VERSION; +} + +@Override +public byte[] serialize(HiveTablePartition hiveTablePartition) throws IOException { +checkArgument( +hiveTablePartition.getClass() == HiveTablePartition.class, +"Cannot serialize subclasses of HiveTablePartition"); +ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(); +try (ObjectOutputStream outputStream = new ObjectOutputStream(byteArrayOutputStream)) { +outputStream.writeObject(hiveTablePartition); +} +return byteArrayOutputStream.toByteArray(); +} + +@Override +public HiveTablePartition deserialize(int version, byte[] serialized) throws IOException { +if (version == 1) { Review Comment: if (version == CURRENT_VERSION) ## flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTablePartitionSerializer.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.connectors.hive; + +import org.apache.flink.core.io.SimpleVersionedSerializer; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.ObjectOutputStream; + +import static org.apache.flink.util.Preconditions.checkArgument; + +/** SerDe for {@link HiveTablePartition}. */ +public class HiveTablePartitionSerializer implements SimpleVersionedSerializer { + +private static final int VERSION = 1; Review Comment: nit: CURRENT_VERSION -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-30472) Modify the default value of the max network memory config option
[ https://issues.apache.org/jira/browse/FLINK-30472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-30472: --- Labels: pull-request-available (was: ) > Modify the default value of the max network memory config option > > > Key: FLINK-30472 > URL: https://issues.apache.org/jira/browse/FLINK-30472 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Network >Affects Versions: 1.17.0 >Reporter: Yuxin Tan >Priority: Major > Labels: pull-request-available > > This issue mainly focuses on the second issue in > [FLIP-266|https://cwiki.apache.org/confluence/display/FLINK/FLIP-266%3A+Simplify+network+memory+configurations+for+TaskManager], > modifying the default value of taskmanager.memory.network.max -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] TanYuxin-tyx opened a new pull request, #21618: [FLINK-30472][network] Modify the default value of the max network memory config option
TanYuxin-tyx opened a new pull request, #21618: URL: https://github.com/apache/flink/pull/21618 ## What is the purpose of the change *This issue mainly focuses on the second issue in [FLIP-266](https://cwiki.apache.org/confluence/display/FLINK/FLIP-266%3A+Simplify+network+memory+configurations+for+TaskManager), modifying the default value of taskmanager.memory.network.max* ## Brief change log - *Set modifying the default value of taskmanager.memory.network.max to MemorySize.MAX_VALUE.* - *Update the config option documents about taskmanager.memory.network.max.* ## Verifying this change This change is already covered by existing tests, such as *TaskExecutorProcessUtilsTest*. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**yes** / no) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] lsyldliu commented on a diff in pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime
lsyldliu commented on code in PR #21556: URL: https://github.com/apache/flink/pull/21556#discussion_r1064259672 ## flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveSourceFileEnumerator.java: ## @@ -222,19 +225,24 @@ private static int getThreadNumToSplitHiveFile(JobConf jobConf) { /** A factory to create {@link HiveSourceFileEnumerator}. */ public static class Provider implements FileEnumerator.Provider { +private static final Logger LOG = LoggerFactory.getLogger(Provider.class); + private static final long serialVersionUID = 1L; -private final List partitions; +private final List partitionBytes; private final JobConfWrapper jobConfWrapper; -public Provider(List partitions, JobConfWrapper jobConfWrapper) { -this.partitions = partitions; +public Provider(List partitionBytes, JobConfWrapper jobConfWrapper) { +this.partitionBytes = partitionBytes; this.jobConfWrapper = jobConfWrapper; } @Override public FileEnumerator create() { -return new HiveSourceFileEnumerator(partitions, jobConfWrapper.conf()); +LOG.info("Deserialize hive table partition in HiveSourceFileEnumerator."); Review Comment: I also think it is no need. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] swuferhong commented on pull request #21489: [FLINK-30365][table-planner] New dynamic partition pruning strategy to support more dpp patterns
swuferhong commented on PR #21489: URL: https://github.com/apache/flink/pull/21489#issuecomment-1375056765 > It's better we can add this optimization in flink-tpcds-test module, we can do it in another pr. Ok. I will make a jira issue to track this work. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] lsyldliu commented on pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime
lsyldliu commented on PR #21556: URL: https://github.com/apache/flink/pull/21556#issuecomment-1375056600 @luoyuxia Thanks for review, I have addressed your comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-28690) UpdateSchema#fromCatalogTable lost column comment
[ https://issues.apache.org/jira/browse/FLINK-28690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-28690: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > UpdateSchema#fromCatalogTable lost column comment > - > > Key: FLINK-28690 > URL: https://issues.apache.org/jira/browse/FLINK-28690 > Project: Flink > Issue Type: Bug > Components: Table Store >Affects Versions: table-store-0.2.0 >Reporter: Jane Chan >Assignee: Jane Chan >Priority: Not a Priority > Fix For: table-store-0.4.0 > > > The reason is that > org.apache.flink.table.api.TableSchema#toPhysicalRowDataType lost column > comments, which leads to comparison failure in > AbstractTableStoreFactory#buildFileStoreTable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27845) Introduce Column Type Schema evolution
[ https://issues.apache.org/jira/browse/FLINK-27845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-27845: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Introduce Column Type Schema evolution > -- > > Key: FLINK-27845 > URL: https://issues.apache.org/jira/browse/FLINK-27845 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Labels: pull-request-available > Fix For: table-store-0.4.0 > > > After schema changes, for example, a column with int type to bigint type. We > should convert the data from the old type to the new type. > * What types are safe to convert? According to the implicitCastingRules in > LogicalTypeCasts. > * Where to convert? DataFileReader > * How to convert? In RecordIterator.next, we can convert old RowData to new > RowData if there is a schema change. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] swuferhong commented on a diff in pull request #21489: [FLINK-30365][table-planner] New dynamic partition pruning strategy to support more dpp patterns
swuferhong commented on code in PR #21489: URL: https://github.com/apache/flink/pull/21489#discussion_r1064259324 ## flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/utils/DynamicPartitionPruningUtils.java: ## @@ -60,318 +69,470 @@ public class DynamicPartitionPruningUtils { /** - * For the input join node, judge whether the join left side and join right side meet the - * requirements of dynamic partition pruning. Fact side in left or right join is not clear. + * Judge whether the input RelNode meets the conditions of dimSide. If joinKeys is null means we + * need not consider the join keys in dim side, which already deal by dynamic partition pruning + * rule. If joinKeys not null means we need to judge whether joinKeys changed in dim side, if + * changed, this RelNode is not dim side. */ -public static boolean supportDynamicPartitionPruning(Join join) { -return supportDynamicPartitionPruning(join, true) -|| supportDynamicPartitionPruning(join, false); +public static boolean isDppDimSide(RelNode rel) { +DppDimSideChecker dimSideChecker = new DppDimSideChecker(rel); +return dimSideChecker.isDppDimSide(); } /** - * For the input join node, judge whether the join left side and join right side meet the - * requirements of dynamic partition pruning. Fact side in left or right is clear. If meets the - * requirements, return true. + * Judge whether the input RelNode can be converted to the dpp fact side. If the input RelNode + * can be converted, this method will return the converted fact side whose partitioned table + * source will be converted to {@link BatchPhysicalDynamicFilteringTableSourceScan}, If not, + * this method will return the origin RelNode. */ -public static boolean supportDynamicPartitionPruning(Join join, boolean factInLeft) { -if (!ShortcutUtils.unwrapContext(join) -.getTableConfig() - .get(OptimizerConfigOptions.TABLE_OPTIMIZER_DYNAMIC_FILTERING_ENABLED)) { -return false; -} -// Now dynamic partition pruning supports left/right join, inner and semi join. but now semi -// join can not join reorder. -if (join.getJoinType() == JoinRelType.LEFT) { -if (factInLeft) { -return false; -} -} else if (join.getJoinType() == JoinRelType.RIGHT) { -if (!factInLeft) { -return false; -} -} else if (join.getJoinType() != JoinRelType.INNER -&& join.getJoinType() != JoinRelType.SEMI) { +public static Tuple2 canConvertAndConvertDppFactSide( +RelNode rel, +ImmutableIntList joinKeys, +RelNode dimSide, +ImmutableIntList dimSideJoinKey) { +DppFactSideChecker dppFactSideChecker = +new DppFactSideChecker(rel, joinKeys, dimSide, dimSideJoinKey); +return dppFactSideChecker.canConvertAndConvertDppFactSide(); +} + +/** Judge whether the join node is suitable one for dpp pattern. */ +public static boolean isSuitableJoin(Join join) { +// Now dynamic partition pruning supports left/right join, inner and semi +// join. but now semi join can not join reorder. +if (join.getJoinType() != JoinRelType.INNER +&& join.getJoinType() != JoinRelType.SEMI +&& join.getJoinType() != JoinRelType.LEFT +&& join.getJoinType() != JoinRelType.RIGHT) { return false; } JoinInfo joinInfo = join.analyzeCondition(); -if (joinInfo.leftKeys.isEmpty()) { -return false; -} -RelNode left = join.getLeft(); -RelNode right = join.getRight(); - -// TODO Now fact side and dim side don't support many complex patterns, like join inside -// fact/dim side, agg inside fact/dim side etc. which will support next. -return factInLeft -? isDynamicPartitionPruningPattern(left, right, joinInfo.leftKeys) -: isDynamicPartitionPruningPattern(right, left, joinInfo.rightKeys); +return !joinInfo.leftKeys.isEmpty(); } -private static boolean isDynamicPartitionPruningPattern( -RelNode factSide, RelNode dimSide, ImmutableIntList factSideJoinKey) { -return isDimSide(dimSide) && isFactSide(factSide, factSideJoinKey); -} +/** This class is used to check whether the relNode is dpp dim side. */ +private static class DppDimSideChecker { +private final RelNode relNode; +private boolean hasFilter; +private boolean hasPartitionedScan; +private final List tables = new ArrayList<>(); -/** make a dpp fact side factor to recurrence in fact side. */ -private static boolean isFactSide(RelNode rel, ImmutableIntList
[GitHub] [flink] lsyldliu commented on a diff in pull request #21556: [FLINK-30491][hive] Hive table partition supports deserializing later during runtime
lsyldliu commented on code in PR #21556: URL: https://github.com/apache/flink/pull/21556#discussion_r1064259298 ## flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveSourceDynamicFileEnumerator.java: ## @@ -187,31 +187,36 @@ public static class Provider implements DynamicFileEnumerator.Provider { private static final long serialVersionUID = 1L; +private static final Logger LOG = LoggerFactory.getLogger(Provider.class); + private final String table; private final List dynamicFilterPartitionKeys; -private final List partitions; +private final List partitionBytes; private final String hiveVersion; private final JobConfWrapper jobConfWrapper; public Provider( String table, List dynamicFilterPartitionKeys, -List partitions, +List partitionBytes, String hiveVersion, JobConfWrapper jobConfWrapper) { this.table = checkNotNull(table); this.dynamicFilterPartitionKeys = checkNotNull(dynamicFilterPartitionKeys); -this.partitions = checkNotNull(partitions); +this.partitionBytes = checkNotNull(partitionBytes); this.hiveVersion = checkNotNull(hiveVersion); this.jobConfWrapper = checkNotNull(jobConfWrapper); } @Override public DynamicFileEnumerator create() { +LOG.info( Review Comment: After rethinking, we don't need this log, if deserializing the hive table partition fails, the exception will be thrown. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-29735) Introduce Metadata tables for table store
[ https://issues.apache.org/jira/browse/FLINK-29735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29735: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Introduce Metadata tables for table store > - > > Key: FLINK-29735 > URL: https://issues.apache.org/jira/browse/FLINK-29735 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > You can query the related metadata of the table through SQL, for example, > query the historical version information of table "T" through the following > SQL: > SELECT * FROM T$history; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30276) [umbrella] Flink free for table store core
[ https://issues.apache.org/jira/browse/FLINK-30276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-30276: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > [umbrella] Flink free for table store core > -- > > Key: FLINK-30276 > URL: https://issues.apache.org/jira/browse/FLINK-30276 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > In FLINK-30080, We need a core that does not rely on specific Flink versions > to support flexible deployment and ecology. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30394) [umbrella] Refactor filesystem support in table store
[ https://issues.apache.org/jira/browse/FLINK-30394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-30394: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > [umbrella] Refactor filesystem support in table store > - > > Key: FLINK-30394 > URL: https://issues.apache.org/jira/browse/FLINK-30394 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > - Let other computing engines, such as hive, spark, trino, support object > storage file systems, such as OSS and s3. > - Let table store access different file systems from Flink cluster according > to configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30390) Ensure that no compaction is in progress before closing the writer
[ https://issues.apache.org/jira/browse/FLINK-30390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-30390: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Ensure that no compaction is in progress before closing the writer > -- > > Key: FLINK-30390 > URL: https://issues.apache.org/jira/browse/FLINK-30390 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > When the writer does not generate a new submission file, it will be closed. > (In AbstractFileStoreWrite) However, at this time, there may be asynchronous > interactions that have not been completed and are forced to close, which will > cause some strange exceptions to be printed in the log. > We can avoid this situation, ensure that no compaction is in progress before > closing the writer. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28026) Add benchmark module for flink table store
[ https://issues.apache.org/jira/browse/FLINK-28026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-28026. Resolution: Invalid > Add benchmark module for flink table store > -- > > Key: FLINK-28026 > URL: https://issues.apache.org/jira/browse/FLINK-28026 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Aiden Gong >Assignee: Aiden Gong >Priority: Minor > Labels: pull-request-available > Fix For: table-store-0.3.0 > > > Add benchmark module for flink table store. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30000) Introduce FileSystemFactory to create FileSystem from custom configuration
[ https://issues.apache.org/jira/browse/FLINK-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-3: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Introduce FileSystemFactory to create FileSystem from custom configuration > -- > > Key: FLINK-3 > URL: https://issues.apache.org/jira/browse/FLINK-3 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > Currently, table store uses static Flink FileSystem. This can not support: > 1. Use another FileSystem different from checkpoint FileSystem. > 2. Use FileSystem in Hive and Spark from custom configuration instead of > using FileSystem.initialize. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29953) Get rid of flink-connector-hive dependency in flink-table-store-hive
[ https://issues.apache.org/jira/browse/FLINK-29953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29953: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Get rid of flink-connector-hive dependency in flink-table-store-hive > > > Key: FLINK-29953 > URL: https://issues.apache.org/jira/browse/FLINK-29953 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > It is unnecessary for the tablestore to rely on it in the test. Its > incompatible modifications will make the tablestore troublesome. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29963) Flink Table Store supports pluggable filesystem
[ https://issues.apache.org/jira/browse/FLINK-29963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-29963. Assignee: Jane Chan Resolution: Fixed > Flink Table Store supports pluggable filesystem > --- > > Key: FLINK-29963 > URL: https://issues.apache.org/jira/browse/FLINK-29963 > Project: Flink > Issue Type: Improvement > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Jane Chan >Assignee: Jane Chan >Priority: Major > Fix For: table-store-0.3.0 > > > Currently, users cannot query the FTS table from Spark/Hive if using OSS/S3 > as the underlying filesystem. We need to support them to improve user > experience. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29080) Migrate all tests from managed table to catalog-based tests
[ https://issues.apache.org/jira/browse/FLINK-29080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29080: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Migrate all tests from managed table to catalog-based tests > --- > > Key: FLINK-29080 > URL: https://issues.apache.org/jira/browse/FLINK-29080 > Project: Flink > Issue Type: Improvement > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Jane Chan >Priority: Major > Fix For: table-store-0.4.0 > > > To get rid of ManagedTableFactory and enable test on -Pflink-1.14 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-29965) Support Spark/Hive with S3
[ https://issues.apache.org/jira/browse/FLINK-29965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-29965. Resolution: Invalid > Support Spark/Hive with S3 > -- > > Key: FLINK-29965 > URL: https://issues.apache.org/jira/browse/FLINK-29965 > Project: Flink > Issue Type: Sub-task > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Jane Chan >Assignee: Jane Chan >Priority: Major > Fix For: table-store-0.3.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29756) Support materialized column to improve query performance for complex types
[ https://issues.apache.org/jira/browse/FLINK-29756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29756: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Support materialized column to improve query performance for complex types > -- > > Key: FLINK-29756 > URL: https://issues.apache.org/jira/browse/FLINK-29756 > Project: Flink > Issue Type: New Feature > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Nicholas Jiang >Priority: Minor > Fix For: table-store-0.4.0 > > > In the world of data warehouse, it is very common to use one or more columns > from a complex type such as a map, or to put many subfields into it. These > operations can greatly affect query performance because: > # These operations are very wasteful IO. For example, if we have a field > type of Map, which contains dozens of subfields, we need to read the entire > column when reading this column. And Spark will traverse the entire map to > get the value of the target key. > # Cannot take advantage of vectorized reads when reading nested type columns. > # Filter pushdown cannot be used when reading nested columns. > It is necessary to introduce the materialized column feature in Flink Table > Store, which transparently solves the above problems of arbitrary columnar > storage (not just Parquet). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-26900) Introduce Best Practices document for table store
[ https://issues.apache.org/jira/browse/FLINK-26900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-26900. Resolution: Invalid > Introduce Best Practices document for table store > - > > Key: FLINK-26900 > URL: https://issues.apache.org/jira/browse/FLINK-26900 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Fix For: table-store-0.3.0 > > > # Eventual Mode > # Transaction Mode ChanglogAll > # Without Log System > # ... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29882) LargeDataITCase is not stable
[ https://issues.apache.org/jira/browse/FLINK-29882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29882: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > LargeDataITCase is not stable > - > > Key: FLINK-29882 > URL: https://issues.apache.org/jira/browse/FLINK-29882 > Project: Flink > Issue Type: Bug > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Jane Chan >Priority: Major > Fix For: table-store-0.4.0 > > > https://github.com/apache/flink-table-store/actions/runs/3391781964/jobs/5637271002 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-28690) UpdateSchema#fromCatalogTable lost column comment
[ https://issues.apache.org/jira/browse/FLINK-28690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17655884#comment-17655884 ] Jingsong Lee commented on FLINK-28690: -- wait for Flink 1.17 > UpdateSchema#fromCatalogTable lost column comment > - > > Key: FLINK-28690 > URL: https://issues.apache.org/jira/browse/FLINK-28690 > Project: Flink > Issue Type: Bug > Components: Table Store >Affects Versions: table-store-0.2.0 >Reporter: Jane Chan >Assignee: Jane Chan >Priority: Not a Priority > Fix For: table-store-0.3.0 > > > The reason is that > org.apache.flink.table.api.TableSchema#toPhysicalRowDataType lost column > comments, which leads to comparison failure in > AbstractTableStoreFactory#buildFileStoreTable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-27166) Introduce file structure document for table store
[ https://issues.apache.org/jira/browse/FLINK-27166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-27166. Resolution: Invalid > Introduce file structure document for table store > - > > Key: FLINK-27166 > URL: https://issues.apache.org/jira/browse/FLINK-27166 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Fix For: table-store-0.3.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-26902) Introduce performence tune and benchmark document for table store
[ https://issues.apache.org/jira/browse/FLINK-26902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-26902. Fix Version/s: table-store-0.3.0 (was: table-store-0.4.0) Resolution: Invalid > Introduce performence tune and benchmark document for table store > - > > Key: FLINK-26902 > URL: https://issues.apache.org/jira/browse/FLINK-26902 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Fix For: table-store-0.3.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27103) Don't store redundant primary key fields
[ https://issues.apache.org/jira/browse/FLINK-27103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-27103: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Don't store redundant primary key fields > > > Key: FLINK-27103 > URL: https://issues.apache.org/jira/browse/FLINK-27103 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Labels: pull-request-available > Fix For: table-store-0.4.0 > > > We are currently storing the primary key redundantly in the file, we can > directly use the primary key field in the original fields to avoid redundant > storage -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-26902) Introduce performence tune and benchmark document for table store
[ https://issues.apache.org/jira/browse/FLINK-26902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-26902: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Introduce performence tune and benchmark document for table store > - > > Key: FLINK-26902 > URL: https://issues.apache.org/jira/browse/FLINK-26902 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Fix For: table-store-0.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-26937) Introduce Leveled compression for LSM
[ https://issues.apache.org/jira/browse/FLINK-26937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-26937: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Introduce Leveled compression for LSM > - > > Key: FLINK-26937 > URL: https://issues.apache.org/jira/browse/FLINK-26937 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Assignee: xingyuan cheng >Priority: Minor > Fix For: table-store-0.4.0 > > > Currently ORC is all ZLIB compression by default, in fact the files at level > 0, will definitely be rewritten and we can have different compression for > different levels. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28718) SinkSavepointITCase.testRecoverFromSavepoint is unstable
[ https://issues.apache.org/jira/browse/FLINK-28718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-28718: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > SinkSavepointITCase.testRecoverFromSavepoint is unstable > > > Key: FLINK-28718 > URL: https://issues.apache.org/jira/browse/FLINK-28718 > Project: Flink > Issue Type: Bug > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > https://github.com/apache/flink-table-store/runs/7537817210?check_suite_focus=true > {code:java} > Error: Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 185.274 s <<< FAILURE! - in > org.apache.flink.table.store.connector.sink.SinkSavepointITCase > Error: testRecoverFromSavepoint Time elapsed: 180.157 s <<< ERROR! > org.junit.runners.model.TestTimedOutException: test timed out after 18 > milliseconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.flink.table.store.connector.sink.SinkSavepointITCase.testRecoverFromSavepoint(SinkSavepointITCase.java:84) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27548) Improve quick-start of table store
[ https://issues.apache.org/jira/browse/FLINK-27548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-27548: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Improve quick-start of table store > -- > > Key: FLINK-27548 > URL: https://issues.apache.org/jira/browse/FLINK-27548 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Fix For: table-store-0.4.0 > > > When the quick-start is completed, the stream job needs to be killed on the > flink page and the table needs to be dropped. > But the exiting of the stream job is asynchronous and we need to wait a while > between these two actions. Otherwise there will be residue in the file > directory. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-26465) Optimize SortMergeReader: use loser tree to reduce comparisons
[ https://issues.apache.org/jira/browse/FLINK-26465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-26465: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Optimize SortMergeReader: use loser tree to reduce comparisons > -- > > Key: FLINK-26465 > URL: https://issues.apache.org/jira/browse/FLINK-26465 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Assignee: Aiden Gong >Priority: Minor > Labels: pull-request-available, stale-assigned > Fix For: table-store-0.4.0 > > > See https://en.wikipedia.org/wiki/K-way_merge_algorithm -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28086) Table Store Catalog supports partition methods
[ https://issues.apache.org/jira/browse/FLINK-28086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-28086: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Table Store Catalog supports partition methods > -- > > Key: FLINK-28086 > URL: https://issues.apache.org/jira/browse/FLINK-28086 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Jingsong Lee >Assignee: Nicholas Jiang >Priority: Minor > Fix For: table-store-0.4.0 > > > Table Store Catalog can support: > * listPartitions > * listPartitionsByFilter > * getPartition > * partitionExists > * dropPartition -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-27679) Add tests for append-only table with log store
[ https://issues.apache.org/jira/browse/FLINK-27679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-27679. Resolution: Invalid > Add tests for append-only table with log store > -- > > Key: FLINK-27679 > URL: https://issues.apache.org/jira/browse/FLINK-27679 > Project: Flink > Issue Type: Bug >Reporter: Zheng Hu >Priority: Minor > Labels: pull-request-available > Fix For: table-store-0.3.0 > > > Will publish separate PR to support append-only table for log table. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27365) [umbrella] Schema evolution on table store
[ https://issues.apache.org/jira/browse/FLINK-27365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-27365: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > [umbrella] Schema evolution on table store > -- > > Key: FLINK-27365 > URL: https://issues.apache.org/jira/browse/FLINK-27365 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Minor > Fix For: table-store-0.4.0 > > > FLIP in: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-226%3A+Introduce+Schema+Evolution+on+Table+Store -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-30248) Spark writer supports insert overwrite
[ https://issues.apache.org/jira/browse/FLINK-30248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-30248: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Spark writer supports insert overwrite > -- > > Key: FLINK-30248 > URL: https://issues.apache.org/jira/browse/FLINK-30248 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] luoyuxia commented on pull request #21611: [FLINK-30592][doc] remove unsupported hive version in hive overview document
luoyuxia commented on PR #21611: URL: https://github.com/apache/flink/pull/21611#issuecomment-1375051560 @chrismartin823 Could you please create a backport for release 1.16. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-27002) Optimize batch multiple partitions inserting
[ https://issues.apache.org/jira/browse/FLINK-27002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-27002: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Optimize batch multiple partitions inserting > > > Key: FLINK-27002 > URL: https://issues.apache.org/jira/browse/FLINK-27002 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Minor > Fix For: table-store-0.4.0 > > > By default, batch sink should sort the input by partition and sequence_field > to avoid generating a large number of small files. Too many small files cause > poor performance, especially object storage. > We can not implement `SupportsPartitioning.requiresPartitionGrouping`. we > need sequence.field to sort, otherwise we can't confirm what the last record > is. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29630) Junit 5.8.1 run unit test with temporary directory will occur Failed to delete temp directory.
[ https://issues.apache.org/jira/browse/FLINK-29630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29630: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > Junit 5.8.1 run unit test with temporary directory will occur Failed to > delete temp directory. > -- > > Key: FLINK-29630 > URL: https://issues.apache.org/jira/browse/FLINK-29630 > Project: Flink > Issue Type: Improvement > Components: Table Store >Affects Versions: table-store-0.3.0 >Reporter: Aiden Gong >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.4.0 > > Attachments: image-2022-10-14-09-12-33-903.png > > > Junit 5.8.1 run unit test with temporary directory will occur Failed to > delete temp directory. > My local : > windows 10 > jdk1.8 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29331) pre-aggregated merge supports changelog inputs
[ https://issues.apache.org/jira/browse/FLINK-29331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-29331: - Fix Version/s: table-store-0.4.0 (was: table-store-0.3.0) > pre-aggregated merge supports changelog inputs > -- > > Key: FLINK-29331 > URL: https://issues.apache.org/jira/browse/FLINK-29331 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Fix For: table-store-0.4.0 > > > In FLINK-27626 , we have supported pre-agg merge, but no changelog inputs > support. > We can support changelog inputs for some function, like sum/count. -- This message was sent by Atlassian Jira (v8.20.10#820010)