[jira] [Created] (FLINK-31397) Introduce write-once hash lookup store
Jingsong Lee created FLINK-31397: Summary: Introduce write-once hash lookup store Key: FLINK-31397 URL: https://issues.apache.org/jira/browse/FLINK-31397 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0
Introduce an interface for the lookup changelog producer:
{code:java}
/**
 * A key-value store for lookup. The store should be a single binary file that is written once and
 * is then ready to be used. This factory provides two interfaces:
 *
 * <ul>
 *   <li>Writer: written once to prepare the binary file.
 *   <li>Reader: looks up a value by key bytes.
 * </ul>
 */
public interface LookupStoreFactory {

    LookupStoreWriter createWriter(File file) throws IOException;

    LookupStoreReader createReader(File file) throws IOException;
}
{code}
We can convert remote columnar data into a local lookup store that is then ready to serve lookups.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
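The write-once/lookup contract of the factory can be sketched with a toy in-memory store. This is only an illustration of the intended semantics (write phase, then seal, then key-byte lookups); the class and method names below are assumptions, not the actual table store implementation, which writes a single binary file.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of a write-once lookup store: puts are only allowed before seal(),
// afterwards the store is read-only and serves lookups by key bytes.
class LookupStoreSketch {
    private final Map<String, byte[]> data = new HashMap<>();
    private boolean sealed = false;

    // "Writer" phase: may only be used before the store is sealed.
    void put(byte[] key, byte[] value) {
        if (sealed) {
            throw new IllegalStateException("store is write-once and already sealed");
        }
        data.put(hex(key), value);
    }

    void seal() {
        sealed = true;
    }

    // "Reader" phase: look up the value by key bytes, null if absent.
    byte[] lookup(byte[] key) {
        return data.get(hex(key));
    }

    // Hex-encode key bytes so the map compares by content, not array identity.
    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LookupStoreSketch store = new LookupStoreSketch();
        store.put(new byte[] {1}, new byte[] {42});
        store.seal();
        System.out.println(store.lookup(new byte[] {1})[0]);
        System.out.println(store.lookup(new byte[] {2}));
    }
}
```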
[jira] [Created] (FLINK-31392) Refactor full-compaction classes
Jingsong Lee created FLINK-31392: Summary: Refactor full-compaction classes Key: FLINK-31392 URL: https://issues.apache.org/jira/browse/FLINK-31392 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 Refactor the full-compaction classes to prepare shared code for the lookup changelog producer. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31391) Introduce lookup changelog producer
Jingsong Lee created FLINK-31391: Summary: Introduce lookup changelog producer Key: FLINK-31391 URL: https://issues.apache.org/jira/browse/FLINK-31391 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 Currently, only full-compaction can produce a changelog, but some merge engines must produce one, for example partial-update and aggregation. Full-compaction is very heavy and its write amplification is huge. We should introduce a new changelog producer that supports producing a changelog with lower latency. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31343) Remove JMH dependency in flink-table-store-micro-benchmark
Jingsong Lee created FLINK-31343: Summary: Remove JMH dependency in flink-table-store-micro-benchmark Key: FLINK-31343 URL: https://issues.apache.org/jira/browse/FLINK-31343 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31331) Flink 1.16 should implement new LookupFunction
Jingsong Lee created FLINK-31331: Summary: Flink 1.16 should implement new LookupFunction Key: FLINK-31331 URL: https://issues.apache.org/jira/browse/FLINK-31331 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 Only by implementing the new LookupFunction can the retry lookup join work. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31329) Fix Parquet stats extractor
Jingsong Lee created FLINK-31329: Summary: Fix Parquet stats extractor Key: FLINK-31329 URL: https://issues.apache.org/jira/browse/FLINK-31329 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 Some bugs in the Parquet stats extractor:
# Decimal support
# Timestamp support
# Null nullCounts support
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31315) FlinkActionsE2eTest.testMergeInto is unstable
Jingsong Lee created FLINK-31315: Summary: FlinkActionsE2eTest.testMergeInto is unstable Key: FLINK-31315 URL: https://issues.apache.org/jira/browse/FLINK-31315 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0
{code:java}
Error: Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 320.272 s <<< FAILURE! - in org.apache.flink.table.store.tests.FlinkActionsE2eTest
Error: testMergeInto Time elapsed: 111.826 s <<< FAILURE!
org.opentest4j.AssertionFailedError:
Result is still unexpected after 60 retries.
Expected: {3, v_3, creation, 02-27=1, 2, v_2, creation, 02-27=1, 6, v_6, creation, 02-28=1, 1, v_1, creation, 02-27=1, 8, v_8, insert, 02-29=1, 11, v_11, insert, 02-29=1, 7, Seven, matched_upsert, 02-28=1, 5, v_5, creation, 02-28=1, 10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
Actual: {4, v_4, creation, 02-27=1, 8, v_8, creation, 02-28=1, 3, v_3, creation, 02-27=1, 7, v_7, creation, 02-28=1, 2, v_2, creation, 02-27=1, 6, v_6, creation, 02-28=1, 1, v_1, creation, 02-27=1, 5, v_5, creation, 02-28=1, 10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
    at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
    at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
    at org.apache.flink.table.store.tests.E2eTestBase.checkResult(E2eTestBase.java:261)
    at org.apache.flink.table.store.tests.FlinkActionsE2eTest.testMergeInto(FlinkActionsE2eTest.java:355)
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31311) Supports Bounded Watermark streaming read
Jingsong Lee created FLINK-31311: Summary: Supports Bounded Watermark streaming read Key: FLINK-31311 URL: https://issues.apache.org/jira/browse/FLINK-31311 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 There are some bounded stream scenarios that require stream reading to be able to end. Generally speaking, ending by event time is best. So in this ticket, we support writing the watermark to the snapshot and specifying an ending watermark when reading the stream. -- This message was sent by Atlassian Jira (v8.20.10#820010)
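The bounded-read rule described above can be sketched as follows: if each snapshot carries the maximum watermark written so far, a bounded stream read consumes snapshots until one reaches the configured end watermark. The method and variable names are illustrative assumptions, not the table store API.

```java
// Sketch: decide how many snapshots a bounded streaming read consumes,
// assuming each snapshot stores the max watermark written up to that point.
class BoundedWatermarkScan {
    static int snapshotsToRead(long[] snapshotWatermarks, long endWatermark) {
        int count = 0;
        for (long wm : snapshotWatermarks) {
            count++;
            if (wm >= endWatermark) {
                break; // bounded: the stream ends at this snapshot
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long[] wms = {100L, 200L, 300L, 400L};
        // The read stops at the third snapshot, whose watermark 300 passes the end of 250.
        System.out.println(snapshotsToRead(wms, 250L));
    }
}
```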
[jira] [Created] (FLINK-31310) Force clear directory no matter what situation in HiveCatalog.dropTable
Jingsong Lee created FLINK-31310: Summary: Force clear directory no matter what situation in HiveCatalog.dropTable Key: FLINK-31310 URL: https://issues.apache.org/jira/browse/FLINK-31310 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 Currently, if the table does not exist in Hive, the table directory is not cleared. We should clear the table directory in every situation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31309) Rollback DFS schema if hive sync fail in HiveCatalog.createTable
Jingsong Lee created FLINK-31309: Summary: Rollback DFS schema if hive sync fail in HiveCatalog.createTable Key: FLINK-31309 URL: https://issues.apache.org/jira/browse/FLINK-31309 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 Avoid schema residue on DFS. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31292) Use HadoopUtils to get Configuration in CatalogContext
Jingsong Lee created FLINK-31292: Summary: Use HadoopUtils to get Configuration in CatalogContext Key: FLINK-31292 URL: https://issues.apache.org/jira/browse/FLINK-31292 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 At present, if a Hadoop configuration is not passed to the CatalogContext, a new one is generated directly, which may lack required parameters. We can follow HadoopUtils to obtain the Hadoop configuration from the Flink configuration and environment variables. -- This message was sent by Atlassian Jira (v8.20.10#820010)
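A minimal sketch of the lookup order such HadoopUtils-style discovery typically uses: an explicitly configured directory first, then the `HADOOP_CONF_DIR` environment variable, then `HADOOP_HOME/etc/hadoop`. This ordering and the helper names are assumptions for illustration, not Flink's exact implementation.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch: collect candidate Hadoop configuration directories in priority order
// (assumed order: explicit option > HADOOP_CONF_DIR > HADOOP_HOME/etc/hadoop).
class HadoopConfLocator {
    static List<String> candidateConfDirs(String explicitDir, String hadoopConfDir, String hadoopHome) {
        List<String> dirs = new ArrayList<>();
        if (explicitDir != null) {
            dirs.add(explicitDir);
        }
        if (hadoopConfDir != null) {
            dirs.add(hadoopConfDir);
        }
        if (hadoopHome != null) {
            dirs.add(hadoopHome + File.separator + "etc" + File.separator + "hadoop");
        }
        return dirs;
    }

    public static void main(String[] args) {
        // No explicit option set; fall back to the (hypothetical) environment values.
        System.out.println(candidateConfDirs(null, "/etc/hadoop/conf", "/opt/hadoop"));
    }
}
```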
[jira] [Created] (FLINK-31291) Document table.exec.sink.upsert-materialize to none
Jingsong Lee created FLINK-31291: Summary: Document table.exec.sink.upsert-materialize to none Key: FLINK-31291 URL: https://issues.apache.org/jira/browse/FLINK-31291 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 The table store has the ability to correct disorder, see: [https://nightlies.apache.org/flink/flink-table-store-docs-master/docs/concepts/primary-key-table/#sequence-field] But Flink SQL's default sink materialization results in strange behavior, in particular when writing to an aggregation table of the table store. We should document this: always set table.exec.sink.upsert-materialize to none, and set 'sequence.field' on the table in case of disorder. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31290) Remove features in documentation
Jingsong Lee created FLINK-31290: Summary: Remove features in documentation Key: FLINK-31290 URL: https://issues.apache.org/jira/browse/FLINK-31290 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 The Features section of the documentation is confusing. Currently there are two pages under features: log system and lookup join. We can move log system to concepts and lookup join to how-to. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31289) Default aggregate-function for field can be last_non_null_value
Jingsong Lee created FLINK-31289: Summary: Default aggregate-function for field can be last_non_null_value Key: FLINK-31289 URL: https://issues.apache.org/jira/browse/FLINK-31289 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 At present, an NPE is thrown when no aggregate function is configured. When the table has many fields, configuring each one is troublesome. We can give fields a default aggregate function such as last_non_null_value, which is consistent with the partial-update table. -- This message was sent by Atlassian Jira (v8.20.10#820010)
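The proposed last_non_null_value default can be captured in a one-line merge rule: the new value wins unless it is null, in which case the previously accumulated value is kept. This is a sketch of the semantics; the method name is an assumption.

```java
// Sketch of last_non_null_value merge semantics: a null input never
// overwrites an already accumulated value.
class LastNonNullValue {
    static Object merge(Object accumulated, Object newValue) {
        return newValue != null ? newValue : accumulated;
    }

    public static void main(String[] args) {
        Object acc = null;
        acc = merge(acc, "a");  // becomes "a"
        acc = merge(acc, null); // null is ignored, stays "a"
        acc = merge(acc, "b");  // becomes "b"
        System.out.println(acc);
    }
}
```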
[jira] [Created] (FLINK-31287) Default value of 'changelog-producer.compaction-interval' can be zero
Jingsong Lee created FLINK-31287: Summary: Default value of 'changelog-producer.compaction-interval' can be zero Key: FLINK-31287 URL: https://issues.apache.org/jira/browse/FLINK-31287 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 At present, the default 30-minute interval is too conservative. We can set it to 0 by default, so that every checkpoint triggers a full-compaction and generates a changelog. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31254) Improve the read performance for files table
Jingsong Lee created FLINK-31254: Summary: Improve the read performance for files table Key: FLINK-31254 URL: https://issues.apache.org/jira/browse/FLINK-31254 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 At present, the read performance of the files table is very poor; every read even re-reads the schema file. We can optimize the read performance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31253) Port itcases to Flink 1.15 and 1.14
Jingsong Lee created FLINK-31253: Summary: Port itcases to Flink 1.15 and 1.14 Key: FLINK-31253 URL: https://issues.apache.org/jira/browse/FLINK-31253 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 At present, only the common module has tests. We need to copy part of the ITCases to 1.14 and 1.15 to ensure they work normally. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31252) Improve StaticFileStoreSplitEnumerator to assign batch splits
Jingsong Lee created FLINK-31252: Summary: Improve StaticFileStoreSplitEnumerator to assign batch splits Key: FLINK-31252 URL: https://issues.apache.org/jira/browse/FLINK-31252 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0
{code:java}
// The following batch assignment operation is for two things:
// 1. It can be evenly distributed during batch reading to avoid scheduling problems (for
//    example, the current resource can only schedule part of the tasks) that cause some tasks
//    to fail to read data.
// 2. Read with limit, if split is assigned one by one, it may cause the task to repeatedly
//    create SplitFetchers. After the task is created, it is found that it is idle and then
//    closed. Then, new split coming, it will create SplitFetcher and repeatedly read the data
//    of the limit number (the limit status is in the SplitFetcher).
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31248) Improve documentation for append-only table
Jingsong Lee created FLINK-31248: Summary: Improve documentation for append-only table Key: FLINK-31248 URL: https://issues.apache.org/jira/browse/FLINK-31248 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31236) Limit pushdown will open useless RecordReader
Jingsong Lee created FLINK-31236: Summary: Limit pushdown will open useless RecordReader Key: FLINK-31236 URL: https://issues.apache.org/jira/browse/FLINK-31236 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31213) Aggregation merge engine supports retract inputs
Jingsong Lee created FLINK-31213: Summary: Aggregation merge engine supports retract inputs Key: FLINK-31213 URL: https://issues.apache.org/jira/browse/FLINK-31213 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 For sum, retraction can be supported. Other functions do not support retraction (`UPDATE_BEFORE` and `DELETE`). If the user allows some functions to ignore retraction messages, they can configure: `'fields.${field_name}.ignore-retract'='true'`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
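For sum, retraction support reduces to subtracting the retracted value. A minimal sketch of that accumulate/retract pair (method names are illustrative, not the table store API):

```java
// Sketch: a sum aggregate that honors retraction messages.
// INSERT / UPDATE_AFTER adds; UPDATE_BEFORE / DELETE retracts by subtracting.
class RetractableSum {
    private long sum;

    void accumulate(long v) {
        sum += v;
    }

    void retract(long v) {
        sum -= v;
    }

    public static void main(String[] args) {
        RetractableSum agg = new RetractableSum();
        agg.accumulate(10); // INSERT 10
        agg.accumulate(5);  // INSERT 5
        agg.retract(10);    // DELETE 10
        System.out.println(agg.sum);
    }
}
```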
[jira] [Created] (FLINK-31209) Introduce creation time to files table
Jingsong Lee created FLINK-31209: Summary: Introduce creation time to files table Key: FLINK-31209 URL: https://issues.apache.org/jira/browse/FLINK-31209 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31201) Provides option to sort partition for full stage in streaming read
Jingsong Lee created FLINK-31201: Summary: Provides option to sort partition for full stage in streaming read Key: FLINK-31201 URL: https://issues.apache.org/jira/browse/FLINK-31201 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 The overall order may be broken by writes to old partitions. We can provide an option to sort the full reading stage by partition fields to avoid the disorder. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31195) FullChangelogStoreSinkWrite bucket writer conflicts with rescale
Jingsong Lee created FLINK-31195: Summary: FullChangelogStoreSinkWrite bucket writer conflicts with rescale Key: FLINK-31195 URL: https://issues.apache.org/jira/browse/FLINK-31195 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 At present, this operator relies on ListState. Flink redistributes list state round-robin when rescaling, which may differ from the distribution rules of our buckets after rescaling. We need to change to UnionListState: broadcast the state to every task and let each task decide which entries belong to it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
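The UnionListState approach can be sketched as a restore-time filter: every subtask receives all bucket entries and keeps only the ones it owns. The ownership rule used here (`bucket % parallelism == subtaskId`) is an assumed example, not necessarily the table store's actual bucket assignment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: after a union-list-state restore, each subtask filters the broadcast
// entries down to the buckets it owns under the assumed assignment
// bucket % parallelism == subtaskId.
class UnionStateFilter {
    static List<Integer> ownedBuckets(List<Integer> allBuckets, int parallelism, int subtaskId) {
        List<Integer> owned = new ArrayList<>();
        for (int bucket : allBuckets) {
            if (bucket % parallelism == subtaskId) {
                owned.add(bucket);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        List<Integer> all = Arrays.asList(0, 1, 2, 3, 4, 5);
        // Rescale to parallelism 2: subtask 0 and subtask 1 filter independently,
        // and together they cover every bucket exactly once.
        System.out.println(ownedBuckets(all, 2, 0));
        System.out.println(ownedBuckets(all, 2, 1));
    }
}
```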
[jira] [Created] (FLINK-31179) Make data structures serializable
Jingsong Lee created FLINK-31179: Summary: Make data structures serializable Key: FLINK-31179 URL: https://issues.apache.org/jira/browse/FLINK-31179 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31178) Public Writer API
Jingsong Lee created FLINK-31178: Summary: Public Writer API Key: FLINK-31178 URL: https://issues.apache.org/jira/browse/FLINK-31178 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31117) Split flink connector to each module of each version
Jingsong Lee created FLINK-31117: Summary: Split flink connector to each module of each version Key: FLINK-31117 URL: https://issues.apache.org/jira/browse/FLINK-31117 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 This will make compilation and testing much easier. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31111) Introduce CatalogTestBase
Jingsong Lee created FLINK-31111: Summary: Introduce CatalogTestBase Key: FLINK-31111 URL: https://issues.apache.org/jira/browse/FLINK-31111 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 Currently, there are only tests for FlinkCatalog and ITCases; we should add cases for all catalogs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31103) Public Table API for table store
Jingsong Lee created FLINK-31103: Summary: Public Table API for table store Key: FLINK-31103 URL: https://issues.apache.org/jira/browse/FLINK-31103 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31072) Introduce streaming-read-atomic to ensure UB and UA cannot be split
Jingsong Lee created FLINK-31072: Summary: Introduce streaming-read-atomic to ensure UB and UA cannot be split Key: FLINK-31072 URL: https://issues.apache.org/jira/browse/FLINK-31072 Project: Flink Issue Type: Improvement Reporter: Jingsong Lee Currently, the streaming source can checkpoint at any time, which means UPDATE_BEFORE and UPDATE_AFTER can be split into two checkpoints, and downstream can see the intermediate state. This is weird in some cases. So in this ticket, add streaming-read-atomic: an option to return per iterator instead of per record in streaming reads. This ensures there is no checkpoint segmentation within iterator consumption. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31054) Flink free to common codegen core shade
Jingsong Lee created FLINK-31054: Summary: Flink free to common codegen core shade Key: FLINK-31054 URL: https://issues.apache.org/jira/browse/FLINK-31054 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31038) Avoid accessing non-TableStore tables in HiveCatalog.listTables
Jingsong Lee created FLINK-31038: Summary: Avoid accessing non-TableStore tables in HiveCatalog.listTables Key: FLINK-31038 URL: https://issues.apache.org/jira/browse/FLINK-31038 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 In the current implementation of HiveCatalog.listTables, getTable is called for every table name. However, this environment may not be able to access non-TableStore tables. We can avoid accessing non-TableStore tables by checking in advance whether a table is a TableStore table. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31037) Table Store supports streaming reading a whole snapshot in one checkpoint
Jingsong Lee created FLINK-31037: Summary: Table Store supports streaming reading a whole snapshot in one checkpoint Key: FLINK-31037 URL: https://issues.apache.org/jira/browse/FLINK-31037 Project: Flink Issue Type: Improvement Reporter: Jingsong Lee Fix For: table-store-0.4.0 At present, in streaming reads of the table store, a checkpoint may be performed before a snapshot has been completely read; a single snapshot can then be cut into multiple slices, and the downstream sees intermediate state. In some scenarios this intermediate state is not allowed. We need to support a mode that prohibits this situation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31024) Copy code splitter to table store from flink table
Jingsong Lee created FLINK-31024: Summary: Copy code splitter to table store from flink table Key: FLINK-31024 URL: https://issues.apache.org/jira/browse/FLINK-31024 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31023) Introduce ConfigOption for table store
Jingsong Lee created FLINK-31023: Summary: Introduce ConfigOption for table store Key: FLINK-31023 URL: https://issues.apache.org/jira/browse/FLINK-31023 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31022) Using new Serializer for table store
Jingsong Lee created FLINK-31022: Summary: Using new Serializer for table store Key: FLINK-31022 URL: https://issues.apache.org/jira/browse/FLINK-31022 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31009) Add recordCount to snapshot meta
Jingsong Lee created FLINK-31009: Summary: Add recordCount to snapshot meta Key: FLINK-31009 URL: https://issues.apache.org/jira/browse/FLINK-31009 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee The record count represents the total number of data records. It is simply the sum of the record counts of all files. -- This message was sent by Atlassian Jira (v8.20.10#820010)
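The computation described above is just a sum over the per-file counts already tracked in file metadata. A minimal sketch (the variable names are illustrative):

```java
import java.util.Arrays;
import java.util.List;

// Sketch: the snapshot-level recordCount is the sum of the record counts
// of all data files referenced by the snapshot.
class SnapshotRecordCount {
    public static void main(String[] args) {
        // Per-file record counts as already stored in the file metadata.
        List<Long> fileRecordCounts = Arrays.asList(100L, 250L, 50L);
        long recordCount = fileRecordCounts.stream().mapToLong(Long::longValue).sum();
        System.out.println(recordCount);
    }
}
```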
[jira] [Created] (FLINK-31002) Provide data sampling query
Jingsong Lee created FLINK-31002: Summary: Provide data sampling query Key: FLINK-31002 URL: https://issues.apache.org/jira/browse/FLINK-31002 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee We want to take several records randomly from each partition, but LIMIT always returns a fixed prefix. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31001) Introduce Hive writer
Jingsong Lee created FLINK-31001: Summary: Introduce Hive writer Key: FLINK-31001 URL: https://issues.apache.org/jira/browse/FLINK-31001 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30980) Support s3.signer-type for S3
Jingsong Lee created FLINK-30980: Summary: Support s3.signer-type for S3 Key: FLINK-30980 URL: https://issues.apache.org/jira/browse/FLINK-30980 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 Currently, s3.signer-type must be written as s3a.signer-type; we can also support the s3.signer-type configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30979) The buckets of the secondary partition should fall on different tasks
Jingsong Lee created FLINK-30979: Summary: The buckets of the secondary partition should fall on different tasks Key: FLINK-30979 URL: https://issues.apache.org/jira/browse/FLINK-30979 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 In a Flink streaming job sinking to the table store: considering that only one bucket is set but there are many secondary partitions, we expect to use multiple parallel tasks. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30882) Introduce partition.expiration-time to automatically delete expired partitions
Jingsong Lee created FLINK-30882: Summary: Introduce partition.expiration-time to automatically delete expired partitions Key: FLINK-30882 URL: https://issues.apache.org/jira/browse/FLINK-30882 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 Same as snapshot expiration, we can also introduce partition expiration to automatically delete expired partitions in the commit node. -- This message was sent by Atlassian Jira (v8.20.10#820010)
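The expiration check can be sketched as a simple date comparison: a partition is expired when it is older than `partition.expiration-time` relative to the current date. The assumption here is a date-valued partition; the method name is illustrative.

```java
import java.time.Duration;
import java.time.LocalDate;

// Sketch: a date partition expires once it is older than the configured
// 'partition.expiration-time' relative to the current date.
class PartitionExpiration {
    static boolean isExpired(LocalDate partitionDate, LocalDate today, Duration expirationTime) {
        return partitionDate.plusDays(expirationTime.toDays()).isBefore(today);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2023, 3, 10);
        // A partition from 2023-03-01 with a 7-day expiration time is expired.
        System.out.println(isExpired(LocalDate.of(2023, 3, 1), today, Duration.ofDays(7)));
        // A partition from 2023-03-08 is still within the 7-day window.
        System.out.println(isExpired(LocalDate.of(2023, 3, 8), today, Duration.ofDays(7)));
    }
}
```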
[jira] [Created] (FLINK-30866) Introduce FileIO for table store
Jingsong Lee created FLINK-30866: Summary: Introduce FileIO for table store Key: FLINK-30866 URL: https://issues.apache.org/jira/browse/FLINK-30866 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 FileIO aims to make the table store independent of Flink's FileSystem. In this way, we can provide different FileSystem support inside a Flink cluster, such as other S3 buckets. In addition, different engines can provide the same FileIO experience (such as configuration and usage). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30706) Remove flink-table-common dependency for table store core
Jingsong Lee created FLINK-30706: Summary: Remove flink-table-common dependency for table store core Key: FLINK-30706 URL: https://issues.apache.org/jira/browse/FLINK-30706 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30676) Introduce Data Structures for table store
Jingsong Lee created FLINK-30676: Summary: Introduce Data Structures for table store Key: FLINK-30676 URL: https://issues.apache.org/jira/browse/FLINK-30676 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 Copy data structures to table store from Flink. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30632) Introduce DataType for table store
Jingsong Lee created FLINK-30632: Summary: Introduce DataType for table store Key: FLINK-30632 URL: https://issues.apache.org/jira/browse/FLINK-30632 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 Introduce table store own DataType to decouple Flink SQL LogicalType. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30628) Kerberos in HiveCatalog does not work
Jingsong Lee created FLINK-30628: Summary: Kerberos in HiveCatalog does not work Key: FLINK-30628 URL: https://issues.apache.org/jira/browse/FLINK-30628 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 We should read the Kerberos keytab from the catalog options and doAs for the Hive metastore client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30611) Expire snapshot should be reentrant
Jingsong Lee created FLINK-30611: Summary: Expire snapshot should be reentrant Key: FLINK-30611 URL: https://issues.apache.org/jira/browse/FLINK-30611 Project: Flink Issue Type: Bug Reporter: Jingsong Lee Fix For: table-store-0.3.0 At present, expire throws an exception if a file is incomplete. However, the snapshot being expired may itself be incomplete, since a previous expire can be interrupted or killed suddenly. Therefore, we should make expire safe and reentrant, and avoid throwing exceptions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
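The reentrancy property amounts to treating "already deleted" as success, so an expire run that was killed halfway can simply be re-run. A minimal sketch of that idea using plain `java.nio.file` (the helper name is an assumption; the real expire deals with table store files):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: reentrant deletion treats an already-missing file as success,
// so repeating an interrupted expire run is harmless instead of an error.
class ReentrantDelete {
    static void deleteQuietly(Path file) throws IOException {
        Files.deleteIfExists(file); // no exception if the file is already gone
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("snapshot-", ".tmp");
        deleteQuietly(tmp); // first expire run deletes the file
        deleteQuietly(tmp); // a repeated run is a no-op, not a failure
        System.out.println(Files.exists(tmp));
    }
}
```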
[jira] [Created] (FLINK-30610) Flink-table-runtime free for disk io in flink-core
Jingsong Lee created FLINK-30610: Summary: Flink-table-runtime free for disk io in flink-core Key: FLINK-30610 URL: https://issues.apache.org/jira/browse/FLINK-30610 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30600) Merge flink-table-store-kafka to flink-table-store-connector
Jingsong Lee created FLINK-30600: Summary: Merge flink-table-store-kafka to flink-table-store-connector Key: FLINK-30600 URL: https://issues.apache.org/jira/browse/FLINK-30600 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 At present, the Kafka module heavily relies on the implementation of Flink and is difficult to extract, so it can be merged directly into the Flink connector. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30582) Flink-avro Flink-orc free for flink-table-store-format
Jingsong Lee created FLINK-30582: Summary: Flink-avro Flink-orc free for flink-table-store-format Key: FLINK-30582 URL: https://issues.apache.org/jira/browse/FLINK-30582 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30581) Deprecate FileStoreTableITCase and use CatalogITCaseBase
Jingsong Lee created FLINK-30581: Summary: Deprecate FileStoreTableITCase and use CatalogITCaseBase Key: FLINK-30581 URL: https://issues.apache.org/jira/browse/FLINK-30581 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 We recommend that users use catalog tables instead of managed tables. Managed tables should be deprecated; we already no longer expose them in the documentation, so we can remove them. Before removing them, the tests should be refactored: FileStoreTableITCase tests using managed tables should be changed to CatalogITCaseBase. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30580) [umbrella] Refactor tests for table store
Jingsong Lee created FLINK-30580: Summary: [umbrella] Refactor tests for table store Key: FLINK-30580 URL: https://issues.apache.org/jira/browse/FLINK-30580 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0 This is an umbrella issue to improve tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30572) Make parquet as default data file format
Jingsong Lee created FLINK-30572: Summary: Make parquet as default data file format Key: FLINK-30572 URL: https://issues.apache.org/jira/browse/FLINK-30572 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.4.0
- We have done some tests; Parquet is 30% faster.
- After FLINK-30565, Parquet can support complex types and file systems such as OSS and S3 (decoupled from the Hadoop file system).
- After FLINK-30569, a table can switch formats at will.
Therefore, once detailed and comprehensive tests have been carried out, we can use Parquet as the default format. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30569) File Format can not change with data file exists
Jingsong Lee created FLINK-30569: Summary: File format can not be changed when data files exist Key: FLINK-30569 URL: https://issues.apache.org/jira/browse/FLINK-30569 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 # Set the file format to orc # Write records # Set the file format to parquet # Write records # Read -> an exception is thrown... We should support changing the file format. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30565) Flink-parquet free for flink-table-store-format
Jingsong Lee created FLINK-30565: Summary: Flink-parquet free for flink-table-store-format Key: FLINK-30565 URL: https://issues.apache.org/jira/browse/FLINK-30565 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30555) Hive cluster can not read oss/s3 tables
Jingsong Lee created FLINK-30555: Summary: Hive cluster can not read oss/s3 tables Key: FLINK-30555 URL: https://issues.apache.org/jira/browse/FLINK-30555 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 FLINK-29964 added OSS support for Hive, but it is only valid for standalone Hive; a distributed Hive compute engine cannot access the tables. We should add more FileSystems.initialize calls to the Hive connector. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30547) Flink-table-runtime free for flink-table-store-common
Jingsong Lee created FLINK-30547: Summary: Flink-table-runtime free for flink-table-store-common Key: FLINK-30547 URL: https://issues.apache.org/jira/browse/FLINK-30547 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.4.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30398) Introduce S3 support for table store
Jingsong Lee created FLINK-30398: Summary: Introduce S3 support for table store Key: FLINK-30398 URL: https://issues.apache.org/jira/browse/FLINK-30398 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 S3 support brings in a large number of dependencies, which can easily lead to class conflicts. We need a plugin mechanism to load the corresponding jars through a separate classloader. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30395) Refactor module name and documentation for filesystems
Jingsong Lee created FLINK-30395: Summary: Refactor module name and documentation for filesystems Key: FLINK-30395 URL: https://issues.apache.org/jira/browse/FLINK-30395 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30394) [umbrella] Refactor filesystem support in table store
Jingsong Lee created FLINK-30394: Summary: [umbrella] Refactor filesystem support in table store Key: FLINK-30394 URL: https://issues.apache.org/jira/browse/FLINK-30394 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 - Let other computing engines, such as Hive, Spark and Trino, support object storage file systems such as OSS and S3. - Let the table store access file systems different from the Flink cluster's, according to configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30390) Ensure that no compaction is in progress before closing the writer
Jingsong Lee created FLINK-30390: Summary: Ensure that no compaction is in progress before closing the writer Key: FLINK-30390 URL: https://issues.apache.org/jira/browse/FLINK-30390 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 When the writer does not generate a new commit file, it will be closed (in AbstractFileStoreWrite). However, at that moment there may be asynchronous compactions that have not yet completed; forcing them to close causes strange exceptions to be printed in the log. We can avoid this by ensuring that no compaction is in progress before closing the writer. -- This message was sent by Atlassian Jira (v8.20.10#820010)
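The proposed guard can be sketched as follows. This is a minimal, self-contained illustration: CompactingWriter and triggerCompaction are hypothetical names standing in for AbstractFileStoreWrite's actual compaction handling.

```java
import java.util.concurrent.CompletableFuture;

/** Sketch: a writer that waits for an in-flight compaction before closing (hypothetical names). */
class CompactingWriter implements AutoCloseable {
    private CompletableFuture<Void> compaction = CompletableFuture.completedFuture(null);

    /** Starts an asynchronous compaction task. */
    void triggerCompaction(Runnable task) {
        compaction = CompletableFuture.runAsync(task);
    }

    @Override
    public void close() {
        // Ensure no compaction is in progress before releasing resources,
        // so the async task is never interrupted mid-flight.
        compaction.join();
    }
}
```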
[jira] [Created] (FLINK-30389) Add retry to read hints
Jingsong Lee created FLINK-30389: Summary: Add retry to read hints Key: FLINK-30389 URL: https://issues.apache.org/jira/browse/FLINK-30389 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 For the OSS (object store) filesystem: when writing the hint file, it is deleted first and then re-created, so reading the hint file may fail frequently. We don't need to give up immediately on failure; we can add a retry. -- This message was sent by Atlassian Jira (v8.20.10#820010)
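A minimal sketch of such a retry, in plain Java with hypothetical names (the real code would read the hint through the table store's filesystem abstraction):

```java
import java.util.function.Supplier;

/** Minimal sketch of a retrying hint-file read (hypothetical API, not the actual table store code). */
class HintRetry {
    /** Retries the read up to maxAttempts times; returns null if all attempts fail. */
    static <T> T readWithRetry(Supplier<T> read, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return read.get();
            } catch (RuntimeException e) {
                // The hint file may be briefly missing: it is deleted and re-created on write.
                if (i == maxAttempts - 1) {
                    return null; // fall back to listing snapshots, as before
                }
            }
        }
        return null;
    }
}
```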
[jira] [Created] (FLINK-30373) Flink-table-runtime free for flink-table-store-codegen
Jingsong Lee created FLINK-30373: Summary: Flink-table-runtime free for flink-table-store-codegen Key: FLINK-30373 URL: https://issues.apache.org/jira/browse/FLINK-30373 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30341) Introduce audit_log system table
Jingsong Lee created FLINK-30341: Summary: Introduce audit_log system table Key: FLINK-30341 URL: https://issues.apache.org/jira/browse/FLINK-30341 Project: Flink Issue Type: New Feature Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 In some scenarios, users need to consume the changelog to do auditing work, such as counting the number of updates and inserts. We can provide an audit_log system table from which users can read the rowkind information column.
{code:java}
INSERT INTO T VALUES ('1', '2', '3');
INSERT INTO T VALUES ('1', '4', '5');
SELECT * FROM T$audit_log;

users can get:
- "+I", "1", "2", "3"
- "-U", "1", "2", "3"
- "+U", "1", "4", "5"
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30333) Supports lookup a partial-update table with full compaction
Jingsong Lee created FLINK-30333: Summary: Supports lookup a partial-update table with full compaction Key: FLINK-30333 URL: https://issues.apache.org/jira/browse/FLINK-30333 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 Lookup uses streaming read to read the table (in TableStreamingReader), but a partial-update table without full compaction does not support streaming read. We should throw an unsupported exception for that case, and we should support looking up a partial-update table with full compaction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30293) Create an enumerator for static (batch)
Jingsong Lee created FLINK-30293: Summary: Create an enumerator for static (batch) Key: FLINK-30293 URL: https://issues.apache.org/jira/browse/FLINK-30293 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 In FLINK-30207, we created an enumerator for continuous reading. We should also have an enumerator for static (batch) reading. For example, besides the current read-compacted mode, time travel may specify a commit time to read snapshots in the future. I think these capabilities need to be in the core, but should they be in scan? (It seems they should not.) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30276) [umbrella] Flink free for table store core
Jingsong Lee created FLINK-30276: Summary: [umbrella] Flink free for table store core Key: FLINK-30276 URL: https://issues.apache.org/jira/browse/FLINK-30276 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 As noted in FLINK-30080, we need a core that does not rely on a specific Flink version, to support flexible deployment and a broader ecosystem. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30273) Introduce RecordReaderUtils.transform to transform RecordReader
Jingsong Lee created FLINK-30273: Summary: Introduce RecordReaderUtils.transform to transform RecordReader Key: FLINK-30273 URL: https://issues.apache.org/jira/browse/FLINK-30273 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 It returns a RecordReader that applies a given function to each element of fromReader. -- This message was sent by Atlassian Jira (v8.20.10#820010)
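The idea can be sketched in plain Java; note this is an assumption-laden stand-in: the real RecordReader reads batches and handles IO, while a plain Iterator is used here only to show the wrapping pattern.

```java
import java.util.Iterator;
import java.util.function.Function;

/** Sketch of the proposed transform utility; Iterator stands in for the real RecordReader. */
class ReaderUtils {
    /** Returns a reader that applies {@code function} to each element of {@code fromReader}. */
    static <T, R> Iterator<R> transform(Iterator<T> fromReader, Function<T, R> function) {
        return new Iterator<R>() {
            @Override public boolean hasNext() { return fromReader.hasNext(); }
            @Override public R next() { return function.apply(fromReader.next()); }
        };
    }
}
```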
[jira] [Created] (FLINK-30272) Introduce a Predicate Visitor
Jingsong Lee created FLINK-30272: Summary: Introduce a Predicate Visitor Key: FLINK-30272 URL: https://issues.apache.org/jira/browse/FLINK-30272 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 At present, predicates are traversed ad hoc in many places. We need a visitor pattern, which allows traversing a Predicate in a more structured way. -- This message was sent by Atlassian Jira (v8.20.10#820010)
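A minimal sketch of the visitor idea, with illustrative names only (the actual Predicate hierarchy in the table store has more node types):

```java
/** Hypothetical predicate hierarchy with a visitor, to illustrate the pattern. */
interface Predicate {
    <T> T visit(PredicateVisitor<T> visitor);
}

interface PredicateVisitor<T> {
    T visitLeaf(String field);
    T visitAnd(Predicate left, Predicate right);
}

class Leaf implements Predicate {
    final String field;
    Leaf(String field) { this.field = field; }
    public <T> T visit(PredicateVisitor<T> v) { return v.visitLeaf(field); }
}

class And implements Predicate {
    final Predicate left, right;
    And(Predicate left, Predicate right) { this.left = left; this.right = right; }
    public <T> T visit(PredicateVisitor<T> v) { return v.visitAnd(left, right); }
}
```

A traversal (e.g. collecting referenced fields, or pushing filters down) then becomes one visitor implementation instead of scattered instanceof checks.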
[jira] [Created] (FLINK-30271) Introduce Table.copy from dynamic options
Jingsong Lee created FLINK-30271: Summary: Introduce Table.copy from dynamic options Key: FLINK-30271 URL: https://issues.apache.org/jira/browse/FLINK-30271 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 At present, our processing of dynamic options is isolated in FileStoreTableFactory, which is not convenient for other engines that want to configure dynamic options. We should expose an interface on Table so that dynamic options can be applied at any time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
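The copy semantics could look like this minimal sketch (TableOptions is a hypothetical stand-in for the Table interface; only the option-merging behavior is illustrated):

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of copying a table with dynamic options layered on top (hypothetical shape). */
class TableOptions {
    final Map<String, String> options;
    TableOptions(Map<String, String> options) { this.options = options; }

    /** Returns a new instance with merged options; the original is left untouched. */
    TableOptions copy(Map<String, String> dynamicOptions) {
        Map<String, String> merged = new HashMap<>(options);
        merged.putAll(dynamicOptions); // dynamic options win over persisted ones
        return new TableOptions(merged);
    }
}
```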
[jira] [Created] (FLINK-30248) Spark writer supports insert overwrite
Jingsong Lee created FLINK-30248: Summary: Spark writer supports insert overwrite Key: FLINK-30248 URL: https://issues.apache.org/jira/browse/FLINK-30248 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30247) Introduce Time Travel reading for table store
Jingsong Lee created FLINK-30247: Summary: Introduce Time Travel reading for table store Key: FLINK-30247 URL: https://issues.apache.org/jira/browse/FLINK-30247 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 For example: - SELECT * FROM T /*+ OPTIONS('as-of-timestamp-mills'='121230')*/; reads the snapshot specified by commit time. - SELECT * FROM T /*+ OPTIONS('as-of-snapshot'='12')*/; reads the snapshot specified by snapshot id. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30223) Refactor Lock to provide Lock.Factory
Jingsong Lee created FLINK-30223: Summary: Refactor Lock to provide Lock.Factory Key: FLINK-30223 URL: https://issues.apache.org/jira/browse/FLINK-30223 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 The core should not see too many Flink Table concepts, such as database and tableName; it only needs to create a Lock. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30164) Expose BucketComputer from SupportsWrite
Jingsong Lee created FLINK-30164: Summary: Expose BucketComputer from SupportsWrite Key: FLINK-30164 URL: https://issues.apache.org/jira/browse/FLINK-30164 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 When other engines integrate with the sink, they need to know the bucketing rule so that records can be correctly distributed to each bucket. -- This message was sent by Atlassian Jira (v8.20.10#820010)
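The contract can be sketched as follows; the actual hash function used by the table store may differ, so this only illustrates why every engine must share one rule:

```java
/** Sketch of a bucket computer: all engines writing to the sink must agree on this rule. */
class BucketComputer {
    final int numBuckets;
    BucketComputer(int numBuckets) { this.numBuckets = numBuckets; }

    /** Non-negative modulo, so every record maps to a stable bucket in [0, numBuckets). */
    int bucket(Object key) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }
}
```

If two engines computed buckets differently, records for the same key would land in different buckets, breaking reads that assume a consistent distribution.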
[jira] [Created] (FLINK-30139) CodeGenLoader fails when temporary directory is a symlink
Jingsong Lee created FLINK-30139: Summary: CodeGenLoader fails when temporary directory is a symlink Key: FLINK-30139 URL: https://issues.apache.org/jira/browse/FLINK-30139 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 Same to FLINK-28102 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30125) Projection pushdown does not work for partial update
Jingsong Lee created FLINK-30125: Summary: Projection pushdown does not work for partial update Key: FLINK-30125 URL: https://issues.apache.org/jira/browse/FLINK-30125 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 We did not properly apply the projection in MergeFunction, which resulted in wrong field positions in subsequent reads. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30114) Introduce PyFlink example for table store
Jingsong Lee created FLINK-30114: Summary: Introduce PyFlink example for table store Key: FLINK-30114 URL: https://issues.apache.org/jira/browse/FLINK-30114 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30110) Enable from-timestamp log scan when timestamp-millis is configured
Jingsong Lee created FLINK-30110: Summary: Enable from-timestamp log scan when timestamp-millis is configured Key: FLINK-30110 URL: https://issues.apache.org/jira/browse/FLINK-30110 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30082) Enable write-buffer-spillable by default only for object storage
Jingsong Lee created FLINK-30082: Summary: Enable write-buffer-spillable by default only for object storage Key: FLINK-30082 URL: https://issues.apache.org/jira/browse/FLINK-30082 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 After a lot of testing, we found that spilling does not improve HDFS performance much, but does introduce some jitter. In this Jira, spilling is enabled by default only for object storage, so that performance can be improved without affecting HDFS. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30080) Introduce public programming api and dependency jar for table store
Jingsong Lee created FLINK-30080: Summary: Introduce public programming api and dependency jar for table store Key: FLINK-30080 URL: https://issues.apache.org/jira/browse/FLINK-30080 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Users need to access the table store through programming interfaces without using a computing engine such as Flink or Spark. We can expose a programming API to read and write the table store, and also expose the corresponding dependency jar. Note that this dependency must not conflict with multiple versions of Flink, which helps third-party systems integrate. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30049) CsvBulkWriter is unsupported for S3 FileSystem
Jingsong Lee created FLINK-30049: Summary: CsvBulkWriter is unsupported for S3 FileSystem Key: FLINK-30049 URL: https://issues.apache.org/jira/browse/FLINK-30049 Project: Flink Issue Type: Bug Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.15.2, 1.16.0 Reporter: Jingsong Lee
{code:java}
Caused by: org.apache.flink.util.SerializedThrowable: Cannot sync state to system like S3. Use persist() to create a persistent recoverable intermediate point.
    at org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111)
    at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync
    at org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:106)
    at org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:653)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:64)
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30000) Introduce FileSystemFactory to create FileSystem from custom configuration
Jingsong Lee created FLINK-30000: Summary: Introduce FileSystemFactory to create FileSystem from custom configuration Key: FLINK-30000 URL: https://issues.apache.org/jira/browse/FLINK-30000 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 Currently, the table store uses the static Flink FileSystem. This can not support: 1. Using a FileSystem different from the checkpoint FileSystem. 2. Using a FileSystem in Hive and Spark from custom configuration instead of using FileSystem.initialize. -- This message was sent by Atlassian Jira (v8.20.10#820010)
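A minimal sketch of such a factory, with assumed names (the real type would hand back Flink's FileSystem; FileSystemHandle here is a self-contained stand-in):

```java
import java.util.Map;

/** Hypothetical per-table factory: each table can build its own FileSystem from its
 *  own configuration, instead of relying on the statically initialized Flink one. */
interface TableFileSystemFactory {
    FileSystemHandle create(Map<String, String> conf);
}

/** Stand-in for a file system handle; the real type would be Flink's FileSystem. */
class FileSystemHandle {
    final String scheme;
    FileSystemHandle(String scheme) { this.scheme = scheme; }
}
```

With this shape, Hive or Spark could pass their own configuration when opening a table, without calling a global FileSystem.initialize.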
[jira] [Created] (FLINK-29988) Improve upper case fields for hive metastore
Jingsong Lee created FLINK-29988: Summary: Improve upper case fields for hive metastore Key: FLINK-29988 URL: https://issues.apache.org/jira/browse/FLINK-29988 Project: Flink Issue Type: Improvement Reporter: Jingsong Lee If the fields in the fts table are uppercase, there will be a mismatch exception when the table is used in Hive. 1. If this is not supported, throw an exception when Flink creates the table in the Hive metastore. 2. If it is supported, make sure no error is reported in the whole process, but store lower-case names in the Hive metastore. We can check for columns with the same name when creating a table in Flink with the Hive metastore. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29987) PartialUpdateITCase.testForeignKeyJo is unstable
Jingsong Lee created FLINK-29987: Summary: PartialUpdateITCase.testForeignKeyJo is unstable Key: FLINK-29987 URL: https://issues.apache.org/jira/browse/FLINK-29987 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29953) Get rid of flink-connector-hive dependency in flink-table-store-hive
Jingsong Lee created FLINK-29953: Summary: Get rid of flink-connector-hive dependency in flink-table-store-hive Key: FLINK-29953 URL: https://issues.apache.org/jira/browse/FLINK-29953 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 It is unnecessary for the table store to rely on flink-connector-hive in tests; its incompatible changes cause trouble for the table store. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29933) Bump Flink version to 1.16.0
Jingsong Lee created FLINK-29933: Summary: Bump Flink version to 1.16.0 Key: FLINK-29933 URL: https://issues.apache.org/jira/browse/FLINK-29933 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Assignee: Nicholas Jiang Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29774) Introduce options metadata table
Jingsong Lee created FLINK-29774: Summary: Introduce options metadata table Key: FLINK-29774 URL: https://issues.apache.org/jira/browse/FLINK-29774 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0
SELECT * FROM T$options;

KEY | VALUE
... | ...
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29760) Introduce snapshots metadata table
Jingsong Lee created FLINK-29760: Summary: Introduce snapshots metadata table Key: FLINK-29760 URL: https://issues.apache.org/jira/browse/FLINK-29760 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 Introduce snapshots metadata table to show snapshot history. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29736) Abstract a table interface for both data and metadata tables
Jingsong Lee created FLINK-29736: Summary: Abstract a table interface for both data and metadata tables Key: FLINK-29736 URL: https://issues.apache.org/jira/browse/FLINK-29736 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29735) Introduce Metadata tables for table store
Jingsong Lee created FLINK-29735: Summary: Introduce Metadata tables for table store Key: FLINK-29735 URL: https://issues.apache.org/jira/browse/FLINK-29735 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 You can query the metadata of a table through SQL; for example, query the historical version information of table "T" with: SELECT * FROM T$history; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29700) Serializer to BinaryInMemorySortBuffer is wrong
Jingsong Lee created FLINK-29700: Summary: Serializer to BinaryInMemorySortBuffer is wrong Key: FLINK-29700 URL: https://issues.apache.org/jira/browse/FLINK-29700 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0, table-store-0.2.2 In SortBufferMemTable, `BinaryInMemorySortBuffer.createBuffer(BinaryRowDataSerializer serializer)` is called with a serializer for the full row, not just the sort key fields. Problems may occur when there are many fields. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29690) listTables in HiveCatalog should only return table store tables
Jingsong Lee created FLINK-29690: Summary: listTables in HiveCatalog should only return table store tables Key: FLINK-29690 URL: https://issues.apache.org/jira/browse/FLINK-29690 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0, table-store-0.2.2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29621) Append-only with eventual log.consistency can not work
Jingsong Lee created FLINK-29621: Summary: Append-only with eventual log.consistency can not work Key: FLINK-29621 URL: https://issues.apache.org/jira/browse/FLINK-29621 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0, table-store-0.2.2 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29614) Introduce Spark writer for table store
Jingsong Lee created FLINK-29614: Summary: Introduce Spark writer for table store Key: FLINK-29614 URL: https://issues.apache.org/jira/browse/FLINK-29614 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Jingsong Lee The main difficulty is that the Spark SourceV2 interface currently does not support custom distribution, while the table store requires a consistent distribution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29586) Let write buffer spillable
Jingsong Lee created FLINK-29586: Summary: Let write buffer spillable Key: FLINK-29586 URL: https://issues.apache.org/jira/browse/FLINK-29586 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0 Columnar formats and a remote DFS may greatly affect the performance of compaction. We can make the write buffer spillable to improve performance. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29554) Add partial-update.ignore-delete option to avoid exception after join
Jingsong Lee created FLINK-29554: Summary: Add partial-update.ignore-delete option to avoid exception after join Key: FLINK-29554 URL: https://issues.apache.org/jira/browse/FLINK-29554 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0, table-store-0.2.1 When the partial-update input is a normal database CDC input, it works as long as there is no delete data. However, if a join is performed upstream, the join node in the Flink job will generate delete messages, which cause the partial-update insertion to throw an exception. We can add an option to decide whether to ignore delete messages in this case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
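The intended behavior can be sketched as follows; the class and field names are illustrative, not the table store's actual MergeFunction API:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of partial-update merging with an ignore-delete switch (hypothetical names). */
class PartialUpdateMerge {
    final boolean ignoreDelete;
    final Map<String, String> fields = new HashMap<>();

    PartialUpdateMerge(boolean ignoreDelete) { this.ignoreDelete = ignoreDelete; }

    void add(boolean isDelete, String field, String value) {
        if (isDelete) {
            if (ignoreDelete) {
                return; // silently drop retractions produced by an upstream join
            }
            throw new IllegalStateException("Partial update can not accept delete records");
        }
        if (value != null) {
            fields.put(field, value); // non-null values overwrite previous ones
        }
    }
}
```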
[jira] [Created] (FLINK-29491) Primary key without partition field can be supported from full changelog
Jingsong Lee created FLINK-29491: Summary: Primary key without partition field can be supported from full changelog Key: FLINK-29491 URL: https://issues.apache.org/jira/browse/FLINK-29491 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 When the primary key does not contain the partition fields, an exception is currently thrown under any circumstances. We can relax this restriction when the input is a complete changelog. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29490) Timestamp LTZ is unsupported in table store
Jingsong Lee created FLINK-29490: Summary: Timestamp LTZ is unsupported in table store Key: FLINK-29490 URL: https://issues.apache.org/jira/browse/FLINK-29490 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 Due to an ORC format limitation, timestamp ltz is unsupported now. We should fix this and validate the type across multiple engines (Hive, Spark, Trino). We need to be careful about time zones. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29412) Connection leak in orc reader
Jingsong Lee created FLINK-29412: Summary: Connection leak in orc reader Key: FLINK-29412 URL: https://issues.apache.org/jira/browse/FLINK-29412 Project: Flink Issue Type: Bug Components: Table Store Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: table-store-0.3.0, table-store-0.2.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)