[jira] [Created] (FLINK-31397) Introduce write-once hash lookup store

2023-03-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31397:


 Summary: Introduce write-once hash lookup store
 Key: FLINK-31397
 URL: https://issues.apache.org/jira/browse/FLINK-31397
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Introduce interface for lookup changelog producer:
{code:java}
import java.io.File;
import java.io.IOException;

/**
 * A key-value store for lookup. The store should be a single binary file that is written once
 * and is then ready to be used. This factory provides two interfaces:
 *
 * <ul>
 *   <li>Writer: written once to prepare the binary file.
 *   <li>Reader: looks up a value by key bytes.
 * </ul>
 */
public interface LookupStoreFactory {

    LookupStoreWriter createWriter(File file) throws IOException;

    LookupStoreReader createReader(File file) throws IOException;
}
 {code}
We can convert remote columnar data into a local lookup store that is then
ready to serve lookups.
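
As a rough illustration of the write-once idea (not the implementation proposed
in this ticket), the sketch below writes all entries to a single binary file
once and then serves lookups by key bytes. SimpleLookupStore and its file
layout are hypothetical. The reader here simply loads the file into an
in-memory hash map, whereas a real store would more likely look values up on
disk.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical write-once file: [count], then repeated [keyLen][key][valueLen][value]. */
final class SimpleLookupStore {

    /** Writes all entries in one pass; the file is complete when this method returns. */
    static void write(File file, byte[][] keys, byte[][] values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            out.writeInt(keys.length);
            for (int i = 0; i < keys.length; i++) {
                out.writeInt(keys[i].length);
                out.write(keys[i]);
                out.writeInt(values[i].length);
                out.write(values[i]);
            }
        }
    }

    /** Loads the file into a hash index; ByteBuffer gives content-based equals/hashCode. */
    static Map<ByteBuffer, byte[]> load(File file) throws IOException {
        Map<ByteBuffer, byte[]> index = new HashMap<>();
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte[] key = new byte[in.readInt()];
                in.readFully(key);
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                index.put(ByteBuffer.wrap(key), value);
            }
        }
        return index;
    }

    /** Returns the value stored for the key bytes, or null if the key is absent. */
    static byte[] lookup(Map<ByteBuffer, byte[]> index, byte[] key) {
        return index.get(ByteBuffer.wrap(key));
    }
}
```

The file is never modified after write returns, so readers need no
coordination with the writer.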



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31392) Refactor classes code of full-compaction

2023-03-09 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31392:


 Summary: Refactor classes code of full-compaction
 Key: FLINK-31392
 URL: https://issues.apache.org/jira/browse/FLINK-31392
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Refactor the full-compaction classes to prepare shared code for the lookup
changelog producer.





[jira] [Created] (FLINK-31391) Introduce lookup changelog producer

2023-03-09 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31391:


 Summary: Introduce lookup changelog producer
 Key: FLINK-31391
 URL: https://issues.apache.org/jira/browse/FLINK-31391
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Currently, only full-compaction can produce a changelog, yet some merge
engines, for example partial-update and aggregation, must have changelog
production. But full-compaction is very heavy and its write amplification is
huge.

We should introduce a new changelog producer that can produce the changelog
with lower latency.





[jira] [Created] (FLINK-31343) Remove JMH dependency in flink-table-store-micro-benchmark

2023-03-06 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31343:


 Summary: Remove JMH dependency in flink-table-store-micro-benchmark
 Key: FLINK-31343
 URL: https://issues.apache.org/jira/browse/FLINK-31343
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31331) Flink 1.16 should implement new LookupFunction

2023-03-06 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31331:


 Summary: Flink 1.16 should implement new LookupFunction
 Key: FLINK-31331
 URL: https://issues.apache.org/jira/browse/FLINK-31331
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Retry lookup join only works if the new LookupFunction is implemented.





[jira] [Created] (FLINK-31329) Fix Parquet stats extractor

2023-03-05 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31329:


 Summary: Fix Parquet stats extractor
 Key: FLINK-31329
 URL: https://issues.apache.org/jira/browse/FLINK-31329
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


The Parquet stats extractor has bugs in:
 # Decimal support
 # Timestamp support
 # Null nullCounts support





[jira] [Created] (FLINK-31315) FlinkActionsE2eTest.testMergeInto is unstable

2023-03-03 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31315:


 Summary: FlinkActionsE2eTest.testMergeInto is unstable
 Key: FLINK-31315
 URL: https://issues.apache.org/jira/browse/FLINK-31315
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


{code:java}
Error:  Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 320.272 s <<< FAILURE! - in org.apache.flink.table.store.tests.FlinkActionsE2eTest
Error:  testMergeInto  Time elapsed: 111.826 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: 
Result is still unexpected after 60 retries.
Expected: {3, v_3, creation, 02-27=1, 2, v_2, creation, 02-27=1, 6, v_6, creation, 02-28=1, 1, v_1, creation, 02-27=1, 8, v_8, insert, 02-29=1, 11, v_11, insert, 02-29=1, 7, Seven, matched_upsert, 02-28=1, 5, v_5, creation, 02-28=1, 10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
Actual: {4, v_4, creation, 02-27=1, 8, v_8, creation, 02-28=1, 3, v_3, creation, 02-27=1, 7, v_7, creation, 02-28=1, 2, v_2, creation, 02-27=1, 6, v_6, creation, 02-28=1, 1, v_1, creation, 02-27=1, 5, v_5, creation, 02-28=1, 10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
  at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
  at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
  at org.apache.flink.table.store.tests.E2eTestBase.checkResult(E2eTestBase.java:261)
  at org.apache.flink.table.store.tests.FlinkActionsE2eTest.testMergeInto(FlinkActionsE2eTest.java:355)
 {code}





[jira] [Created] (FLINK-31311) Supports Bounded Watermark streaming read

2023-03-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31311:


 Summary: Supports Bounded Watermark streaming read
 Key: FLINK-31311
 URL: https://issues.apache.org/jira/browse/FLINK-31311
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Some bounded-stream scenarios require that a streaming read can end. Generally
speaking, an end event time is the better way to bound it.

This ticket therefore supports writing the watermark to the snapshot and
specifying an ending watermark when reading the stream.
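
A minimal sketch of how such a bound might be applied, assuming each snapshot
records the watermark that was current when it was committed. The class
BoundedWatermarkScan, its Snapshot type, and the inclusive comparison are
assumptions for illustration and are not taken from the ticket.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a streaming read that ends at a user-supplied watermark. */
final class BoundedWatermarkScan {

    /** Hypothetical snapshot metadata: an id plus the watermark stored with the snapshot. */
    static final class Snapshot {
        final long id;
        final long watermark;

        Snapshot(long id, long watermark) {
            this.id = id;
            this.watermark = watermark;
        }
    }

    /**
     * Returns the ids of the snapshots to read, in order. Reading stops at the first snapshot
     * whose watermark is greater than endWatermark (the inclusive bound is an assumption).
     */
    static List<Long> snapshotsToRead(List<Snapshot> snapshots, long endWatermark) {
        List<Long> ids = new ArrayList<>();
        for (Snapshot snapshot : snapshots) {
            if (snapshot.watermark > endWatermark) {
                break; // the stream is bounded: stop instead of waiting for new snapshots
            }
            ids.add(snapshot.id);
        }
        return ids;
    }
}
```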





[jira] [Created] (FLINK-31310) Force clear directory no matter what situation in HiveCatalog.dropTable

2023-03-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31310:


 Summary: Force clear directory no matter what situation in 
HiveCatalog.dropTable
 Key: FLINK-31310
 URL: https://issues.apache.org/jira/browse/FLINK-31310
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


Currently, if the table does not exist in Hive, the table directory is not
cleared.

We should clear the table directory in every situation.





[jira] [Created] (FLINK-31309) Rollback DFS schema if hive sync fail in HiveCatalog.createTable

2023-03-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31309:


 Summary: Rollback DFS schema if hive sync fail in 
HiveCatalog.createTable
 Key: FLINK-31309
 URL: https://issues.apache.org/jira/browse/FLINK-31309
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


Avoid schema residue on DFS.





[jira] [Created] (FLINK-31292) User HadoopUtils to get Configuration in CatalogContext

2023-03-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31292:


 Summary: User HadoopUtils to get Configuration in CatalogContext
 Key: FLINK-31292
 URL: https://issues.apache.org/jira/browse/FLINK-31292
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


At present, if HadoopConf is not passed in the CatalogContext, a new HadoopConf 
will be directly generated, which may not have the required parameters.

We can refer to HadoopUtils to obtain hadoopConf from the configuration and 
environment variables.





[jira] [Created] (FLINK-31291) Document table.exec.sink.upsert-materialize to none

2023-03-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31291:


 Summary: Document table.exec.sink.upsert-materialize to none
 Key: FLINK-31291
 URL: https://issues.apache.org/jira/browse/FLINK-31291
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


The table store has the ability to correct disorder, such as:

[https://nightlies.apache.org/flink/flink-table-store-docs-master/docs/concepts/primary-key-table/#sequence-field]

However, Flink SQL's default sink materialization results in strange behavior,
in particular when writing to an aggregation table of the Flink Table Store
(FTS).

We should document this: always set table.exec.sink.upsert-materialize to none,
and set 'sequence.field' on the table in case of disorder.

 





[jira] [Created] (FLINK-31290) Remove features in documentation

2023-03-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31290:


 Summary: Remove features in documentation
 Key: FLINK-31290
 URL: https://issues.apache.org/jira/browse/FLINK-31290
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


The Features section of the documentation is confusing.

It currently contains two pages: log system and lookup join.

We can move log system to concepts.

And move lookup join to how-to.





[jira] [Created] (FLINK-31289) Default aggregate-function for field can be last_non_null_value

2023-03-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31289:


 Summary: Default aggregate-function for field can be 
last_non_null_value
 Key: FLINK-31289
 URL: https://issues.apache.org/jira/browse/FLINK-31289
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


At present, an NPE is thrown when no aggregate function is configured for a
field. When the table has many fields, configuring each one is troublesome.

We can give fields a default aggregate function, such as last_non_null_value,
which is consistent with the partial-update table.
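
The fallback could look roughly like the sketch below. FieldAggregators is a
hypothetical helper, the option key 'fields.<field>.aggregate-function' is
assumed here, and only two functions are modelled to keep the example short.

```java
import java.util.Map;
import java.util.function.BinaryOperator;

/** Sketch: resolve a field's aggregate function, defaulting to last_non_null_value. */
final class FieldAggregators {

    static final String DEFAULT_FUNCTION = "last_non_null_value";

    /** The option key format is an assumption made for this illustration. */
    static BinaryOperator<Long> forField(String field, Map<String, String> options) {
        String name =
                options.getOrDefault("fields." + field + ".aggregate-function", DEFAULT_FUNCTION);
        switch (name) {
            case "last_non_null_value":
                // Keep the accumulator unless the new input carries a value.
                return (acc, in) -> in != null ? in : acc;
            case "sum":
                return (acc, in) -> {
                    if (acc == null) {
                        return in;
                    }
                    if (in == null) {
                        return acc;
                    }
                    return acc + in;
                };
            default:
                throw new IllegalArgumentException("Unknown aggregate function: " + name);
        }
    }
}
```

With this shape, a field without any configuration never yields a null
function, which is what avoids the NPE.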





[jira] [Created] (FLINK-31287) Default value of 'changelog-producer.compaction-interval' can be zero

2023-03-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31287:


 Summary: Default value of 'changelog-producer.compaction-interval' 
can be zero
 Key: FLINK-31287
 URL: https://issues.apache.org/jira/browse/FLINK-31287
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


At present, the 30-minute interval is too conservative. We can set it to 0 by 
default, so that each checkpoint will have a full-compaction and generate a 
changelog.





[jira] [Created] (FLINK-31254) Improve the read performance for files table

2023-02-27 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31254:


 Summary: Improve the read performance for files table
 Key: FLINK-31254
 URL: https://issues.apache.org/jira/browse/FLINK-31254
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


At present, the read performance of the files table is very poor: the schema
file is read again for every data read. We can optimize this.





[jira] [Created] (FLINK-31253) Port itcases to Flink 1.15 and 1.14

2023-02-27 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31253:


 Summary: Port itcases to Flink 1.15 and 1.14
 Key: FLINK-31253
 URL: https://issues.apache.org/jira/browse/FLINK-31253
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


At present, only the common module has tests. We need to copy some of the
ITCases to 1.14 and 1.15 to ensure that those versions work.





[jira] [Created] (FLINK-31252) Improve StaticFileStoreSplitEnumerator to assign batch splits

2023-02-27 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31252:


 Summary: Improve StaticFileStoreSplitEnumerator to assign batch 
splits
 Key: FLINK-31252
 URL: https://issues.apache.org/jira/browse/FLINK-31252
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


{code:java}
// The following batch assignment operation is for two things:
// 1. It can be evenly distributed during batch reading to avoid scheduling problems (for
// example, the current resource can only schedule part of the tasks) that cause some tasks
// to fail to read data.
// 2. Read with limit, if split is assigned one by one, it may cause the task to repeatedly
// create SplitFetchers. After the task is created, it is found that it is idle and then
// closed. Then, new split coming, it will create SplitFetcher and repeatedly read the data
// of the limit number (the limit status is in the SplitFetcher).
{code}
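
To make the batch assignment described in the comment above concrete, here is
a small sketch that hands every reader its whole share of splits up front.
BatchSplitAssigner is a hypothetical name, and round-robin is an assumed
distribution rule; the ticket does not state which rule the enumerator uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: pre-assign all splits to reader tasks in one batch instead of one by one. */
final class BatchSplitAssigner {

    /** Distributes splits round-robin over task indexes 0..parallelism-1. */
    static <T> Map<Integer, List<T>> assign(List<T> splits, int parallelism) {
        Map<Integer, List<T>> assignment = new HashMap<>();
        for (int task = 0; task < parallelism; task++) {
            assignment.put(task, new ArrayList<>());
        }
        for (int i = 0; i < splits.size(); i++) {
            assignment.get(i % parallelism).add(splits.get(i));
        }
        return assignment;
    }
}
```

Because each task receives its full list at once, a reader does not go idle
and get recreated between individual splits.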






[jira] [Created] (FLINK-31248) Improve documentation for append-only table

2023-02-27 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31248:


 Summary: Improve documentation for append-only table
 Key: FLINK-31248
 URL: https://issues.apache.org/jira/browse/FLINK-31248
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31236) Limit pushdown will open useless RecordReader

2023-02-27 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31236:


 Summary: Limit pushdown will open useless RecordReader
 Key: FLINK-31236
 URL: https://issues.apache.org/jira/browse/FLINK-31236
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31213) Aggregation merge engine supports retract inputs

2023-02-24 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31213:


 Summary: Aggregation merge engine supports retract inputs
 Key: FLINK-31213
 URL: https://issues.apache.org/jira/browse/FLINK-31213
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


The sum function can support retraction.
The other functions do not support retraction (`UPDATE_BEFORE` and `DELETE`).
If the user wants a function to ignore retraction messages, the user can
configure: `'fields.${field_name}.ignore-retract'='true'`.
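
A minimal sketch of the two behaviours, assuming a local RowKind enum as a
stand-in for Flink's row kinds. RetractableAggregation and its method
signatures are hypothetical; the ignoreRetract flag models the
'ignore-retract' option.

```java
/** Sketch: sum handles retraction itself, other functions may ignore or reject it. */
final class RetractableAggregation {

    /** Local stand-in for the row kinds named in the ticket. */
    enum RowKind {
        INSERT,
        UPDATE_BEFORE,
        UPDATE_AFTER,
        DELETE
    }

    static boolean isRetract(RowKind kind) {
        return kind == RowKind.UPDATE_BEFORE || kind == RowKind.DELETE;
    }

    /** Sum can undo a previous contribution by subtracting it. */
    static long sum(long acc, long value, RowKind kind) {
        return isRetract(kind) ? acc - value : acc + value;
    }

    /** Max cannot be undone from the accumulator alone, so it skips or rejects retractions. */
    static long max(long acc, long value, RowKind kind, boolean ignoreRetract) {
        if (isRetract(kind)) {
            if (ignoreRetract) {
                return acc; // drop the retraction message
            }
            throw new UnsupportedOperationException("max does not support retraction");
        }
        return Math.max(acc, value);
    }
}
```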





[jira] [Created] (FLINK-31209) Introduce creation time to files table

2023-02-24 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31209:


 Summary: Introduce creation time to files table
 Key: FLINK-31209
 URL: https://issues.apache.org/jira/browse/FLINK-31209
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31201) Provides option to sort partition for full stage in streaming read

2023-02-23 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31201:


 Summary: Provides option to sort partition for full stage in 
streaming read
 Key: FLINK-31201
 URL: https://issues.apache.org/jira/browse/FLINK-31201
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


The overall order may be disrupted by writes to old partitions. We can provide
an option to sort the full reading stage by partition fields to avoid this
disorder.





[jira] [Created] (FLINK-31195) FullChangelogStoreSinkWrite bucket writer conflicts with rescale

2023-02-23 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31195:


 Summary: FullChangelogStoreSinkWrite bucket writer conflicts with 
rescale
 Key: FLINK-31195
 URL: https://issues.apache.org/jira/browse/FLINK-31195
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


At present, this operator relies on ListState. When rescaling, Flink
redistributes the state round-robin, which may differ from our bucket
distribution rules after rescaling.

We need to switch to UnionListState, which broadcasts the full state to every
task, and then let each task decide which entries belong to it.
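
The filtering step after a union-state restore might look like the sketch
below. BucketOwnership is hypothetical, and the rule "bucket modulo
parallelism" is only an assumed stand-in for the table store's real bucket
distribution rule.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: every subtask receives the full union state and keeps only its own buckets. */
final class BucketOwnership {

    /** Assumed ownership rule; the real rule may also involve the partition. */
    static boolean ownedBy(int bucket, int subtaskIndex, int parallelism) {
        return Math.floorMod(bucket, parallelism) == subtaskIndex;
    }

    /** Filters the broadcast state down to the buckets written by this subtask. */
    static List<Integer> restore(List<Integer> unionState, int subtaskIndex, int parallelism) {
        List<Integer> mine = new ArrayList<>();
        for (int bucket : unionState) {
            if (ownedBy(bucket, subtaskIndex, parallelism)) {
                mine.add(bucket);
            }
        }
        return mine;
    }
}
```

Since the decision depends only on the bucket and the new parallelism, the
result stays consistent with the writer's routing no matter how the state was
split before the rescale.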





[jira] [Created] (FLINK-31179) Make data structures serializable

2023-02-21 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31179:


 Summary: Make data structures serializable
 Key: FLINK-31179
 URL: https://issues.apache.org/jira/browse/FLINK-31179
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31178) Public Writer API

2023-02-21 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31178:


 Summary: Public Writer API
 Key: FLINK-31178
 URL: https://issues.apache.org/jira/browse/FLINK-31178
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31117) Split flink connector to each module of each version

2023-02-17 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31117:


 Summary: Split flink connector to each module of each version
 Key: FLINK-31117
 URL: https://issues.apache.org/jira/browse/FLINK-31117
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


This will make compilation and testing much easier.





[jira] [Created] (FLINK-31111) Introduce CatalogTestBase

2023-02-16 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31111:


 Summary: Introduce CatalogTestBase
 Key: FLINK-31111
 URL: https://issues.apache.org/jira/browse/FLINK-31111
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


Currently, there are only tests for FlinkCatalog and ITCases. We should add
test cases for catalogs.





[jira] [Created] (FLINK-31103) Public Table API for table store

2023-02-16 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31103:


 Summary: Public Table API for table store
 Key: FLINK-31103
 URL: https://issues.apache.org/jira/browse/FLINK-31103
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31072) Introduce streaming-read-atomic to ensure UB and UA cannot be split

2023-02-14 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31072:


 Summary: Introduce streaming-read-atomic to ensure UB and UA 
cannot be split
 Key: FLINK-31072
 URL: https://issues.apache.org/jira/browse/FLINK-31072
 Project: Flink
  Issue Type: Improvement
Reporter: Jingsong Lee


Currently, the streaming source can be checkpointed at any time, which means
UPDATE_BEFORE and UPDATE_AFTER can be split across two checkpoints.
Downstream operators can then see an intermediate state, which is undesirable
in some cases.
This ticket therefore adds streaming-read-atomic:
An option to return records per iterator instead of per record in streaming
read. This ensures that no checkpoint is taken in the middle of consuming an
iterator.
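
The effect of emitting per iterator can be sketched as follows.
AtomicIteratorEmitter is hypothetical, and the lock is only a stand-in for
whatever mechanism the source uses to keep a checkpoint from interleaving with
emission; it is not how the Flink source API is necessarily wired.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Sketch: a snapshot sees either none or all of the records of one iterator. */
final class AtomicIteratorEmitter<T> {

    private final Object checkpointLock = new Object();
    private final List<T> emitted = new ArrayList<>();

    /** Emits every record of the iterator while holding the lock. */
    void emitAll(Iterator<T> records) {
        synchronized (checkpointLock) {
            while (records.hasNext()) {
                emitted.add(records.next());
            }
        }
    }

    /** Stand-in for a checkpoint: returns how many records have been emitted so far. */
    int snapshot() {
        synchronized (checkpointLock) {
            return emitted.size();
        }
    }
}
```

If an UPDATE_BEFORE and its UPDATE_AFTER arrive in the same iterator, a
snapshot taken this way can never observe only the first of the two.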





[jira] [Created] (FLINK-31054) Flink free to common codegen core shade

2023-02-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31054:


 Summary: Flink free to common codegen core shade
 Key: FLINK-31054
 URL: https://issues.apache.org/jira/browse/FLINK-31054
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31038) Avoid accessing non-TableStore tables in HiveCatalog.listTables

2023-02-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31038:


 Summary: Avoid accessing non-TableStore tables in 
HiveCatalog.listTables
 Key: FLINK-31038
 URL: https://issues.apache.org/jira/browse/FLINK-31038
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


In the current implementation of HiveCatalog.listTables, getTable is called for
each TableName. However, the environment here may not be able to access
non-TableStore tables.
We can avoid accessing non-TableStore tables by checking in advance whether a
table is a TableStore table.





[jira] [Created] (FLINK-31037) Table Store supports streaming reading a whole snapshot in one checkpoint

2023-02-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31037:


 Summary: Table Store supports streaming reading a whole snapshot 
in one checkpoint
 Key: FLINK-31037
 URL: https://issues.apache.org/jira/browse/FLINK-31037
 Project: Flink
  Issue Type: Improvement
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


At present, during Table Store streaming reads, a checkpoint may be taken
before a snapshot has been read completely. A single snapshot may then be cut
into multiple slices, and downstream operators will see an intermediate state.

In some scenarios this intermediate state is not allowed. We need to support a
mode that prevents it.





[jira] [Created] (FLINK-31024) Copy code splitter to table store from flink table

2023-02-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31024:


 Summary: Copy code splitter to table store from flink table
 Key: FLINK-31024
 URL: https://issues.apache.org/jira/browse/FLINK-31024
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31023) Introduce ConfigOption for table store

2023-02-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31023:


 Summary: Introduce ConfigOption for table store
 Key: FLINK-31023
 URL: https://issues.apache.org/jira/browse/FLINK-31023
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31022) Using new Serializer for table store

2023-02-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31022:


 Summary: Using new Serializer for table store
 Key: FLINK-31022
 URL: https://issues.apache.org/jira/browse/FLINK-31022
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-31009) Add recordCount to snapshot meta

2023-02-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31009:


 Summary: Add recordCount to snapshot meta
 Key: FLINK-31009
 URL: https://issues.apache.org/jira/browse/FLINK-31009
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee


Record count represents the total number of data records. It is simply the sum
of the record counts of all files.
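
As a tiny worked example of that sum, assuming the per-file record counts are
already available from the file metadata (SnapshotStats is a hypothetical
helper):

```java
import java.util.List;

/** Sketch: a snapshot's record count is the sum of the record counts of its files. */
final class SnapshotStats {

    static long recordCount(List<Long> fileRecordCounts) {
        long total = 0;
        for (long count : fileRecordCounts) {
            total += count;
        }
        return total;
    }
}
```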





[jira] [Created] (FLINK-31002) Provide data sampling query

2023-02-09 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31002:


 Summary: Provide data sampling query
 Key: FLINK-31002
 URL: https://issues.apache.org/jira/browse/FLINK-31002
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee


Users want to take several records at random from each partition, but a limit
query always returns a fixed result.





[jira] [Created] (FLINK-31001) Introduce Hive writer

2023-02-09 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31001:


 Summary: Introduce Hive writer
 Key: FLINK-31001
 URL: https://issues.apache.org/jira/browse/FLINK-31001
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee








[jira] [Created] (FLINK-30980) Support s3.signer-type for S3

2023-02-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30980:


 Summary: Support s3.signer-type for S3
 Key: FLINK-30980
 URL: https://issues.apache.org/jira/browse/FLINK-30980
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


Currently, the signer type must be configured as s3a.signer-type. We can also
support the s3.signer-type configuration key.
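
One way to accept both spellings is a simple key fallback, sketched below.
S3Options is hypothetical, and letting the s3a key win when both are set is an
assumption rather than something the ticket specifies.

```java
import java.util.Map;

/** Sketch: read the signer type from either configuration key. */
final class S3Options {

    static final String HADOOP_KEY = "s3a.signer-type";
    static final String ALIAS_KEY = "s3.signer-type";

    /** Returns the configured signer type, or null if neither key is set. */
    static String signerType(Map<String, String> options) {
        String value = options.get(HADOOP_KEY);
        return value != null ? value : options.get(ALIAS_KEY);
    }
}
```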





[jira] [Created] (FLINK-30979) The buckets of the secondary partition should fall on different tasks

2023-02-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30979:


 Summary: The buckets of the secondary partition should fall on 
different tasks
 Key: FLINK-30979
 URL: https://issues.apache.org/jira/browse/FLINK-30979
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


Consider a Flink streaming job that sinks to the table store.
Even if only one bucket is configured, there may be many secondary partitions,
so I expect the writes to be spread over multiple parallel tasks.
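
The difference can be sketched by comparing two routing rules.
PartitionBucketChannel is hypothetical, and both hash rules are assumptions
chosen only to show why including the partition in the key spreads a single
bucket over several tasks.

```java
import java.util.Objects;

/** Sketch: route by (partition, bucket) instead of by bucket alone. */
final class PartitionBucketChannel {

    /** With one bucket, every record lands on the same task. */
    static int selectByBucketOnly(int bucket, int parallelism) {
        return Math.floorMod(bucket, parallelism);
    }

    /** Including the partition lets the same bucket id map to different tasks. */
    static int select(String partition, int bucket, int parallelism) {
        return Math.floorMod(Objects.hash(partition, bucket), parallelism);
    }
}
```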





[jira] [Created] (FLINK-30882) Introduce partition.expiration-time to automatically delete expired partitions

2023-02-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30882:


 Summary: Introduce partition.expiration-time to automatically 
delete expired partitions
 Key: FLINK-30882
 URL: https://issues.apache.org/jira/browse/FLINK-30882
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Similar to snapshot expiration, we can also introduce partition expiration to
automatically delete expired partitions in the commit node.
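
The expiration check itself could be as small as the sketch below, assuming
the partition value is an ISO date such as 2023-01-01. PartitionExpire and the
way the partition time is derived are assumptions; the ticket only names the
partition.expiration-time option.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

/** Sketch: find partitions whose age exceeds the configured expiration time. */
final class PartitionExpire {

    /** A partition is expired when partitionTime + expirationTime is before now. */
    static List<String> expired(
            List<String> partitions, LocalDateTime now, Duration expirationTime) {
        List<String> result = new ArrayList<>();
        for (String partition : partitions) {
            LocalDateTime partitionTime = LocalDate.parse(partition).atStartOfDay();
            if (partitionTime.plus(expirationTime).isBefore(now)) {
                result.add(partition);
            }
        }
        return result;
    }
}
```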





[jira] [Created] (FLINK-30866) Introduce FileIO for table store

2023-02-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30866:


 Summary: Introduce FileIO for table store
 Key: FLINK-30866
 URL: https://issues.apache.org/jira/browse/FLINK-30866
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


FileIO aims to make table-store independent of Flink's FileSystem. In this way,
we can provide different FileSystem support within a Flink cluster, such as
access to other S3 buckets. In addition, different engines can provide the same
FileIO experience (such as configuration and usage).





[jira] [Created] (FLINK-30706) Remove flink-table-common dependency for table store core

2023-01-16 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30706:


 Summary: Remove flink-table-common dependency for table store core
 Key: FLINK-30706
 URL: https://issues.apache.org/jira/browse/FLINK-30706
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-30676) Introduce Data Structures for table store

2023-01-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30676:


 Summary: Introduce Data Structures for table store
 Key: FLINK-30676
 URL: https://issues.apache.org/jira/browse/FLINK-30676
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Copy data structures to table store from Flink.





[jira] [Created] (FLINK-30632) Introduce DataType for table store

2023-01-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30632:


 Summary: Introduce DataType for table store
 Key: FLINK-30632
 URL: https://issues.apache.org/jira/browse/FLINK-30632
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


Introduce the table store's own DataType to decouple it from Flink SQL's LogicalType.





[jira] [Created] (FLINK-30628) Kerberos in HiveCatalog is not work

2023-01-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30628:


 Summary: Kerberos in HiveCatalog is not work
 Key: FLINK-30628
 URL: https://issues.apache.org/jira/browse/FLINK-30628
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


We should read the Kerberos keytab from the catalog options and use doAs for
the Hive metastore client.





[jira] [Created] (FLINK-30611) Expire snapshot should be reentrant

2023-01-09 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30611:


 Summary: Expire snapshot should be reentrant
 Key: FLINK-30611
 URL: https://issues.apache.org/jira/browse/FLINK-30611
 Project: Flink
  Issue Type: Bug
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


At present, expire throws an exception if the files are incomplete.
However, a snapshot being expired may legitimately be incomplete, because an
earlier expire run can be interrupted or killed suddenly.
Therefore, we should ensure the safety of expire, make it reentrant, and avoid
throwing exceptions.
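
A reentrant delete step can be sketched with java.nio.file.Files.deleteIfExists,
which treats an already-missing file as a normal case. ReentrantExpire is a
hypothetical helper and works on local paths only; the table store's own file
abstraction is not modelled here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Sketch: deleting files so that a repeated or resumed run never fails on missing files. */
final class ReentrantExpire {

    /** Returns how many files this run actually deleted. */
    static int deleteAll(List<Path> files) {
        int deleted = 0;
        for (Path file : files) {
            try {
                if (Files.deleteIfExists(file)) {
                    deleted++;
                }
            } catch (IOException e) {
                // Leave the file for the next expire run instead of failing the whole run.
            }
        }
        return deleted;
    }
}
```

Running the same expire twice then simply deletes nothing the second time.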





[jira] [Created] (FLINK-30610) Flink-table-runtime free for disk io in flink-core

2023-01-09 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30610:


 Summary: Flink-table-runtime free for disk io in flink-core
 Key: FLINK-30610
 URL: https://issues.apache.org/jira/browse/FLINK-30610
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-30600) Merge flink-table-store-kafka to flink-table-store-connector

2023-01-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30600:


 Summary: Merge flink-table-store-kafka to 
flink-table-store-connector
 Key: FLINK-30600
 URL: https://issues.apache.org/jira/browse/FLINK-30600
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


At present, the Kafka module relies heavily on the implementation of Flink and 
is difficult to extract, so it can be merged directly into the Flink 
connector module.





[jira] [Created] (FLINK-30582) Flink-avro Flink-orc free for flink-table-store-format

2023-01-05 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30582:


 Summary: Flink-avro Flink-orc free for flink-table-store-format
 Key: FLINK-30582
 URL: https://issues.apache.org/jira/browse/FLINK-30582
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-30581) Deprecate FileStoreTableITCase and use CatalogITCaseBase

2023-01-05 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30581:


 Summary: Deprecate FileStoreTableITCase and use CatalogITCaseBase
 Key: FLINK-30581
 URL: https://issues.apache.org/jira/browse/FLINK-30581
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


We recommend that users use catalog tables instead of managed tables.
Managed tables should be deprecated. They are already no longer exposed in the 
documentation, so we can remove them.
Before removing them, the tests should be refactored.

FileStoreTableITCase tests that use managed tables should be changed to use CatalogITCaseBase.





[jira] [Created] (FLINK-30580) [umbrella] Refactor tests for table store

2023-01-05 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30580:


 Summary: [umbrella] Refactor tests for table store
 Key: FLINK-30580
 URL: https://issues.apache.org/jira/browse/FLINK-30580
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


This is an umbrella issue to improve tests.





[jira] [Created] (FLINK-30572) Make parquet as default data file format

2023-01-05 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30572:


 Summary: Make parquet as default data file format
 Key: FLINK-30572
 URL: https://issues.apache.org/jira/browse/FLINK-30572
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


- We have done some tests. Parquet is 30% faster.
- After FLINK-30565, Parquet supports complex types and file systems such as 
OSS and S3 (decoupled from the Hadoop filesystem).
- After FLINK-30569, a table can switch formats at will.

Therefore, once detailed and comprehensive tests have been carried out, we 
can make Parquet the default format.





[jira] [Created] (FLINK-30569) File Format can not change with data file exists

2023-01-04 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30569:


 Summary: File Format can not change with data file exists
 Key: FLINK-30569
 URL: https://issues.apache.org/jira/browse/FLINK-30569
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0


# Set the file format to orc.
# Write records.
# Set the file format to parquet.
# Write records.
# Read -> an exception is thrown.

We should support changing the file format of a table that already has data files.
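
One possible direction, sketched under the assumption that each data file name carries its format as an extension, is to resolve the reader format per file instead of from the current table option. The class and method below are hypothetical and only illustrate the idea; they do not describe the actual fix.

```java
import java.util.Locale;

/** Hypothetical sketch: pick the reader format per data file, not per table. */
final class DataFileFormat {

    /** Returns the lower-cased extension of {@code fileName}, e.g. "orc" or "parquet". */
    static String formatOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("No format extension in: " + fileName);
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

With such a rule, files written before and after the switch can be read side by side, because each file is opened with the format it was written in.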





[jira] [Created] (FLINK-30565) Flink-parquet free for flink-table-store-format

2023-01-04 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30565:


 Summary: Flink-parquet free for flink-table-store-format
 Key: FLINK-30565
 URL: https://issues.apache.org/jira/browse/FLINK-30565
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-30555) Hive cluster can not read oss/s3 tables

2023-01-03 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30555:


 Summary: Hive cluster can not read oss/s3 tables
 Key: FLINK-30555
 URL: https://issues.apache.org/jira/browse/FLINK-30555
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


FLINK-29964 added OSS support for Hive, but it only works with standalone 
Hive; the distributed Hive compute engine cannot access OSS.
We should add more FileSystems.initialize calls to the Hive connector.





[jira] [Created] (FLINK-30547) Flink-table-runtime free for flink-table-store-common

2023-01-03 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30547:


 Summary: Flink-table-runtime free for flink-table-store-common
 Key: FLINK-30547
 URL: https://issues.apache.org/jira/browse/FLINK-30547
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.4.0








[jira] [Created] (FLINK-30398) Introduce S3 support for table store

2022-12-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30398:


 Summary: Introduce S3 support for table store
 Key: FLINK-30398
 URL: https://issues.apache.org/jira/browse/FLINK-30398
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


S3 contains a large number of dependencies, which can easily lead to class 
conflicts. We need a plugin mechanism to load the corresponding jars through 
the classloader.





[jira] [Created] (FLINK-30395) Refactor module name and documentation for filesystems

2022-12-12 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30395:


 Summary: Refactor module name and documentation for filesystems
 Key: FLINK-30395
 URL: https://issues.apache.org/jira/browse/FLINK-30395
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-30394) [umbrella] Refactor filesystem support in table store

2022-12-12 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30394:


 Summary: [umbrella] Refactor filesystem support in table store
 Key: FLINK-30394
 URL: https://issues.apache.org/jira/browse/FLINK-30394
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


- Let other computing engines, such as Hive, Spark and Trino, support object 
storage file systems, such as OSS and S3.
- Let table store access file systems different from those of the Flink 
cluster, according to configuration.





[jira] [Created] (FLINK-30390) Ensure that no compaction is in progress before closing the writer

2022-12-12 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30390:


 Summary: Ensure that no compaction is in progress before closing 
the writer
 Key: FLINK-30390
 URL: https://issues.apache.org/jira/browse/FLINK-30390
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


When a writer does not produce any new files to commit, it is closed (in 
AbstractFileStoreWrite). However, at that moment there may still be 
asynchronous work that has not completed and is forcibly closed, which causes 
some strange exceptions to be printed in the log.

We can avoid this situation by ensuring that no compaction is in progress 
before closing the writer.
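
A minimal sketch of the idea, assuming compaction runs as a task on an executor. CompactingWriter and its methods are hypothetical names for illustration and do not mirror AbstractFileStoreWrite.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: a writer that waits for its compaction before closing. */
final class CompactingWriter implements AutoCloseable {

    private final ExecutorService compactExecutor = Executors.newSingleThreadExecutor();
    private Future<?> pendingCompaction;

    /** Submits an asynchronous compaction task and remembers its handle. */
    void submitCompaction(Runnable task) {
        pendingCompaction = compactExecutor.submit(task);
    }

    @Override
    public void close() throws InterruptedException, ExecutionException {
        try {
            if (pendingCompaction != null) {
                // Block until the compaction has finished instead of killing it.
                pendingCompaction.get();
            }
        } finally {
            compactExecutor.shutdown();
        }
    }
}
```

Because close() blocks on the pending task, the compaction is never interrupted half-way, so it has no reason to log spurious exceptions.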





[jira] [Created] (FLINK-30389) Add retry to read hints

2022-12-12 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30389:


 Summary: Add retry to read hints
 Key: FLINK-30389
 URL: https://issues.apache.org/jira/browse/FLINK-30389
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


On the OSS (object store) filesystem, writing the hint file means deleting it 
first and then adding it again, so reading the hint file may fail frequently. 
We do not need to return immediately on failure. We can add a retry.
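
A minimal sketch of such a retry, assuming the hint file holds a single number as text. HintReader, readHint and the backoff parameter are hypothetical names, not the table store API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;

/** Hypothetical sketch: read a hint file, retrying while it is being rewritten. */
final class HintReader {

    /** Returns the hint value, or empty if every attempt failed. */
    static OptionalLong readHint(Path hintFile, int maxAttempts, long backoffMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                byte[] bytes = Files.readAllBytes(hintFile);
                String text = new String(bytes, StandardCharsets.UTF_8).trim();
                return OptionalLong.of(Long.parseLong(text));
            } catch (IOException | NumberFormatException e) {
                // The file may be missing or half-written between delete and add.
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis);
                }
            }
        }
        return OptionalLong.empty();
    }
}
```

Returning an empty result after the last attempt lets the caller fall back to a slower path instead of failing.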





[jira] [Created] (FLINK-30373) Flink-table-runtime free for flink-table-store-codegen

2022-12-12 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30373:


 Summary: Flink-table-runtime free for flink-table-store-codegen
 Key: FLINK-30373
 URL: https://issues.apache.org/jira/browse/FLINK-30373
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-30341) Introduce audit_log system table

2022-12-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30341:


 Summary: Introduce audit_log system table
 Key: FLINK-30341
 URL: https://issues.apache.org/jira/browse/FLINK-30341
 Project: Flink
  Issue Type: New Feature
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


In some scenarios, users need to get the changelog to do some auditing work, 
such as determining the number of updates and inserts.

We can provide an audit_log system table from which users can read the row 
kind as a column.


{code:sql}
INSERT INTO T VALUES ('1', '2', '3');
INSERT INTO T VALUES ('1', '4', '5');

SELECT * FROM T$audit_log;

-- Users get:
-- "+I", "1", "2", "3"
-- "-U", "1", "2", "3"
-- "+U", "1", "4", "5"
{code}






[jira] [Created] (FLINK-30333) Supports lookup a partial-update table with full compaction

2022-12-07 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30333:


 Summary: Supports lookup a partial-update table with full 
compaction
 Key: FLINK-30333
 URL: https://issues.apache.org/jira/browse/FLINK-30333
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


Lookup uses streaming read to read the table (in TableStreamingReader).
But a partial-update table without full compaction does not support streaming read. 
We should throw an unsupported exception in this case.
And we should support looking up a partial-update table with full compaction.





[jira] [Created] (FLINK-30293) Create an enumerator for static (batch)

2022-12-04 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30293:


 Summary: Create an enumerator for static (batch)
 Key: FLINK-30293
 URL: https://issues.apache.org/jira/browse/FLINK-30293
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


In FLINK-30207, we created an enumerator for continuous reading.
We should also have an enumerator for static (batch) reading.
For example, besides the current read-compacted mode, time travel may in the 
future specify a commit time to select the snapshot to read.
I think these capabilities need to be in the core, but should they be in scan? 
(It seems that they should not.)





[jira] [Created] (FLINK-30276) [umbrella] Flink free for table store core

2022-12-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30276:


 Summary: [umbrella] Flink free for table store core
 Key: FLINK-30276
 URL: https://issues.apache.org/jira/browse/FLINK-30276
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


As described in FLINK-30080, we need a core that does not rely on a specific 
Flink version, to support flexible deployment and a broader ecosystem.





[jira] [Created] (FLINK-30273) Introduce RecordReaderUtils.transform to transform RecordReader

2022-12-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30273:


 Summary: Introduce RecordReaderUtils.transform to transform 
RecordReader
 Key: FLINK-30273
 URL: https://issues.apache.org/jira/browse/FLINK-30273
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


RecordReaderUtils.transform returns a RecordReader that applies a function to each element of fromReader.
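
A minimal sketch of such a transform. The RecordReader interface below is a deliberately simplified, hypothetical stand-in (one record per call, null when exhausted) and not the table store's real interface.

```java
import java.util.Iterator;
import java.util.function.Function;

/** Hypothetical sketch of a transforming reader wrapper. */
final class RecordReaderSketch {

    /** Simplified stand-in for a record reader: returns null when exhausted. */
    interface RecordReader<T> {
        T next();
    }

    /** Returns a reader that applies {@code function} to each element of {@code fromReader}. */
    static <T, R> RecordReader<R> transform(
            RecordReader<T> fromReader, Function<? super T, ? extends R> function) {
        return () -> {
            T record = fromReader.next();
            // Propagate end-of-input unchanged; map everything else.
            return record == null ? null : function.apply(record);
        };
    }

    /** Helper for the example: adapts an Iterator to the simplified reader. */
    static <T> RecordReader<T> fromIterator(Iterator<T> iterator) {
        return () -> iterator.hasNext() ? iterator.next() : null;
    }
}
```

The wrapper is lazy: the function is applied only when the caller pulls the next record.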





[jira] [Created] (FLINK-30272) Introduce a Predicate Visitor

2022-12-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30272:


 Summary: Introduce a Predicate Visitor
 Key: FLINK-30272
 URL: https://issues.apache.org/jira/browse/FLINK-30272
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


At present, Predicate is traversed in many places. We need a visitor pattern 
to traverse Predicate in a cleaner way.
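
A minimal sketch of the visitor pattern applied to a predicate tree. All type names here (PredicateVisitor, LeafPredicate, CompoundPredicate) are assumptions chosen for illustration; the real classes may look different.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a visitor over a predicate tree. */
final class PredicateVisitorSketch {

    interface PredicateVisitor<T> {
        T visit(LeafPredicate predicate);

        T visit(CompoundPredicate predicate);
    }

    interface Predicate {
        <T> T visit(PredicateVisitor<T> visitor);
    }

    /** A leaf that refers to a single field, e.g. "a = 1". */
    static final class LeafPredicate implements Predicate {
        final String fieldName;

        LeafPredicate(String fieldName) {
            this.fieldName = fieldName;
        }

        @Override
        public <T> T visit(PredicateVisitor<T> visitor) {
            return visitor.visit(this);
        }
    }

    /** An AND/OR node with children. */
    static final class CompoundPredicate implements Predicate {
        final List<Predicate> children;

        CompoundPredicate(List<? extends Predicate> children) {
            this.children = new ArrayList<>(children);
        }

        @Override
        public <T> T visit(PredicateVisitor<T> visitor) {
            return visitor.visit(this);
        }
    }

    /** Example visitor: collects the field names referenced by a predicate. */
    static List<String> referencedFields(Predicate predicate) {
        return predicate.visit(
                new PredicateVisitor<List<String>>() {
                    @Override
                    public List<String> visit(LeafPredicate leaf) {
                        List<String> fields = new ArrayList<>();
                        fields.add(leaf.fieldName);
                        return fields;
                    }

                    @Override
                    public List<String> visit(CompoundPredicate compound) {
                        List<String> fields = new ArrayList<>();
                        for (Predicate child : compound.children) {
                            fields.addAll(child.visit(this));
                        }
                        return fields;
                    }
                });
    }
}
```

Each new traversal then becomes one visitor implementation instead of another hand-written instanceof chain.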





[jira] [Created] (FLINK-30271) Introduce Table.copy from dynamic options

2022-12-01 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30271:


 Summary: Introduce Table.copy from dynamic options
 Key: FLINK-30271
 URL: https://issues.apache.org/jira/browse/FLINK-30271
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


At present, dynamic options are handled in isolation, inside 
FileStoreTableFactory, which makes it hard for other engines to configure 
dynamic options.

We should provide an interface on Table so that dynamic options can be 
configured at any time.
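
A minimal sketch of what a copy-with-options method could look like. TableSketch is a hypothetical stand-in for Table, and the merge rule (dynamic options override existing ones) is an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: an immutable table handle that can be copied with extra options. */
final class TableSketch {

    private final Map<String, String> options;

    TableSketch(Map<String, String> options) {
        this.options = Collections.unmodifiableMap(new HashMap<>(options));
    }

    Map<String, String> options() {
        return options;
    }

    /** Returns a new table whose options are the current ones overridden by {@code dynamicOptions}. */
    TableSketch copy(Map<String, String> dynamicOptions) {
        Map<String, String> merged = new HashMap<>(options);
        merged.putAll(dynamicOptions);
        return new TableSketch(merged);
    }
}
```

Because copy returns a new instance, any engine can apply its own dynamic options without affecting other users of the original table object.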





[jira] [Created] (FLINK-30248) Spark writer supports insert overwrite

2022-11-30 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30248:


 Summary: Spark writer supports insert overwrite
 Key: FLINK-30248
 URL: https://issues.apache.org/jira/browse/FLINK-30248
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-30247) Introduce Time Travel reading for table store

2022-11-30 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30247:


 Summary: Introduce Time Travel reading for table store
 Key: FLINK-30247
 URL: https://issues.apache.org/jira/browse/FLINK-30247
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


For example:
- SELECT * FROM T /*+ OPTIONS('as-of-timestamp-mills'='121230')*/; reads the 
snapshot specified by commit time.
- SELECT * FROM T /*+ OPTIONS('as-of-snapshot'='12')*/; reads the snapshot 
specified by snapshot id.
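
A minimal sketch of resolving a timestamp to a snapshot, under the assumption that "specified by commit time" means the latest snapshot committed at or before the given timestamp. The issue does not state this rule, and the class below is hypothetical.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical sketch: resolve a time-travel timestamp to a snapshot id. */
final class TimeTravelSketch {

    /**
     * @param snapshotIdByCommitMillis snapshot ids keyed by commit time in milliseconds
     * @return the id of the latest snapshot committed at or before {@code timestampMillis}
     */
    static OptionalLong snapshotAsOf(
            TreeMap<Long, Long> snapshotIdByCommitMillis, long timestampMillis) {
        // floorEntry returns the entry with the greatest key <= timestampMillis.
        Map.Entry<Long, Long> entry = snapshotIdByCommitMillis.floorEntry(timestampMillis);
        return entry == null ? OptionalLong.empty() : OptionalLong.of(entry.getValue());
    }
}
```

An empty result means the requested time is earlier than the first available snapshot.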





[jira] [Created] (FLINK-30223) Refactor Lock to provide Lock.Factory

2022-11-27 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30223:


 Summary: Refactor Lock to provide Lock.Factory
 Key: FLINK-30223
 URL: https://issues.apache.org/jira/browse/FLINK-30223
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


The core should not be exposed to too many Flink Table concepts, such as 
database and tableName. It only needs to be able to create a Lock.





[jira] [Created] (FLINK-30164) Expose BucketComputer from SupportsWrite

2022-11-23 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30164:


 Summary: Expose BucketComputer from SupportsWrite
 Key: FLINK-30164
 URL: https://issues.apache.org/jira/browse/FLINK-30164
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


When other engines integrate with the sink, they need to know the bucket 
assignment rules so that records can be distributed to the correct buckets.
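
A minimal sketch of a bucket rule that every engine could share. The modulo rule and the names below are assumptions for illustration; the actual hash function and rule are defined by the table store.

```java
/** Hypothetical sketch: map a key hash to a bucket. */
final class BucketSketch {

    /** Returns a bucket in the range [0, numBuckets) for the given key hash. */
    static int bucket(int keyHash, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive: " + numBuckets);
        }
        // floorMod never returns a negative value, unlike the % operator.
        return Math.floorMod(keyHash, numBuckets);
    }
}
```

Whatever the real rule is, the point is that all writers must compute it identically, otherwise records with the same key end up in different buckets.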





[jira] [Created] (FLINK-30139) CodeGenLoader fails when temporary directory is a symlink

2022-11-22 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30139:


 Summary: CodeGenLoader fails when temporary directory is a symlink
 Key: FLINK-30139
 URL: https://issues.apache.org/jira/browse/FLINK-30139
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


Same as FLINK-28102.





[jira] [Created] (FLINK-30125) Projection pushdown is not work for partial update

2022-11-21 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30125:


 Summary: Projection pushdown is not work for partial update
 Key: FLINK-30125
 URL: https://issues.apache.org/jira/browse/FLINK-30125
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


We did not handle the projection properly in MergeFunction, which caused 
fields to be read from the wrong positions afterwards.





[jira] [Created] (FLINK-30114) Introduce PyFlink example for table store

2022-11-21 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30114:


 Summary: Introduce PyFlink example for table store
 Key: FLINK-30114
 URL: https://issues.apache.org/jira/browse/FLINK-30114
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-30110) Enable from-timestamp log scan when timestamp-millis is configured

2022-11-20 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30110:


 Summary: Enable from-timestamp log scan when timestamp-millis is 
configured
 Key: FLINK-30110
 URL: https://issues.apache.org/jira/browse/FLINK-30110
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-30082) Enable write-buffer-spillable by default only for object storage

2022-11-17 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30082:


 Summary: Enable write-buffer-spillable by default only for object 
storage
 Key: FLINK-30082
 URL: https://issues.apache.org/jira/browse/FLINK-30082
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


After extensive testing, we found that enabling the spillable write buffer 
does not improve performance much on HDFS, but introduces some jitter.
In this issue, spillable is enabled by default only for object storage, 
so that performance can be improved there without affecting HDFS.





[jira] [Created] (FLINK-30080) Introduce public programming api and dependency jar for table store

2022-11-17 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30080:


 Summary: Introduce public programming api and dependency jar for 
table store
 Key: FLINK-30080
 URL: https://issues.apache.org/jira/browse/FLINK-30080
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee


Users need to access the table store through programming interfaces, without 
having to use a computing engine such as Flink or Spark.
We can expose a programming API to read and write the table store, and we also 
need to expose the corresponding dependency jar. Note that this dependency must 
not conflict with multiple versions of Flink, which helps third-party systems 
integrate with it.





[jira] [Created] (FLINK-30049) CsvBulkWriter is unsupported for S3 FileSystem

2022-11-16 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30049:


 Summary: CsvBulkWriter is unsupported for S3 FileSystem
 Key: FLINK-30049
 URL: https://issues.apache.org/jira/browse/FLINK-30049
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.15.2, 1.16.0
Reporter: Jingsong Lee


{code:java}
Caused by: org.apache.flink.util.SerializedThrowable: Cannot sync state to system like S3. Use persist() to create a persistent recoverable intermediate point.
    at org.apache.flink.core.fs.RefCountedBufferingFileStream.sync(RefCountedBufferingFileStream.java:111)
    at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.sync
    at org.apache.flink.formats.csv.CsvBulkWriter.finish(CsvBulkWriter.java:106)
    at org.apache.flink.connector.file.table.FileSystemTableSink$ProjectionBulkFactory$1.finish(FileSystemTableSink.java:653)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:64)
{code}






[jira] [Created] (FLINK-30000) Introduce FileSystemFactory to create FileSystem from custom configuration

2022-11-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30000:


 Summary: Introduce FileSystemFactory to create FileSystem from 
custom configuration
 Key: FLINK-30000
 URL: https://issues.apache.org/jira/browse/FLINK-30000
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


Currently, table store uses the static Flink FileSystem. This cannot support:
1. Using a FileSystem different from the checkpoint FileSystem.
2. Using a FileSystem in Hive and Spark that is created from custom 
configuration instead of through FileSystem.initialize.





[jira] [Created] (FLINK-29988) Improve upper case fields for hive metastore

2022-11-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29988:


 Summary: Improve upper case fields for hive metastore
 Key: FLINK-29988
 URL: https://issues.apache.org/jira/browse/FLINK-29988
 Project: Flink
  Issue Type: Improvement
Reporter: Jingsong Lee


If field names in an FTS (Flink Table Store) table contain uppercase letters, 
a mismatch exception is thrown when the table is used in Hive.

1. If uppercase is not supported, throw an exception up front, when Flink 
creates the table in the Hive metastore.
2. If it is supported, no error should be reported anywhere in the process, but 
the names are saved in lower case in the Hive metastore. We can check for columns 
with the same name when creating a table in Flink with the Hive metastore.
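
A minimal sketch of the duplicate-name check mentioned in option 2, assuming names are compared after lower-casing. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: reject field names that collide once lower-cased. */
final class FieldNameCheck {

    static void checkNoCaseInsensitiveDuplicates(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            String lower = name.toLowerCase(Locale.ROOT);
            // Set.add returns false when the lower-cased name was already present.
            if (!seen.add(lower)) {
                throw new IllegalArgumentException(
                        "Field name '" + name + "' collides with another field when lower-cased");
            }
        }
    }
}
```

For example, a schema with both "ID" and "id" would be rejected at table creation time instead of failing later in Hive.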





[jira] [Created] (FLINK-29987) PartialUpdateITCase.testForeignKeyJo is unstable

2022-11-10 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29987:


 Summary: PartialUpdateITCase.testForeignKeyJo is unstable
 Key: FLINK-29987
 URL: https://issues.apache.org/jira/browse/FLINK-29987
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-29953) Get rid of flink-connector-hive dependency in flink-table-store-hive

2022-11-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29953:


 Summary: Get rid of flink-connector-hive dependency in 
flink-table-store-hive
 Key: FLINK-29953
 URL: https://issues.apache.org/jira/browse/FLINK-29953
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


The table store does not need to rely on flink-connector-hive in its tests. 
Incompatible changes in that module cause trouble for the table store.





[jira] [Created] (FLINK-29933) Bump Flink version to 1.16.0

2022-11-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29933:


 Summary: Bump Flink version to 1.16.0
 Key: FLINK-29933
 URL: https://issues.apache.org/jira/browse/FLINK-29933
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Nicholas Jiang
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-29774) Introduce options metadata table

2022-10-26 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29774:


 Summary: Introduce options metadata table
 Key: FLINK-29774
 URL: https://issues.apache.org/jira/browse/FLINK-29774
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


SELECT * FROM T$options;
KEY | VALUE
... | ...





[jira] [Created] (FLINK-29760) Introduce snapshots metadata table

2022-10-25 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29760:


 Summary: Introduce snapshots metadata table
 Key: FLINK-29760
 URL: https://issues.apache.org/jira/browse/FLINK-29760
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


Introduce snapshots metadata table to show snapshot history.





[jira] [Created] (FLINK-29736) Abstract a table interface for both data and metadata tables

2022-10-24 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29736:


 Summary: Abstract a table interface for both data and metadata 
tables
 Key: FLINK-29736
 URL: https://issues.apache.org/jira/browse/FLINK-29736
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0








[jira] [Created] (FLINK-29735) Introduce Metadata tables for table store

2022-10-24 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29735:


 Summary: Introduce Metadata tables for table store
 Key: FLINK-29735
 URL: https://issues.apache.org/jira/browse/FLINK-29735
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


You can query the related metadata of the table through SQL, for example, query 
the historical version information of table "T" through the following SQL:

SELECT * FROM T$history;





[jira] [Created] (FLINK-29700) Serializer to BinaryInMemorySortBuffer is wrong

2022-10-20 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29700:


 Summary: Serializer to BinaryInMemorySortBuffer is wrong
 Key: FLINK-29700
 URL: https://issues.apache.org/jira/browse/FLINK-29700
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0, table-store-0.2.2


SortBufferMemTable calls 
`BinaryInMemorySortBuffer.createBuffer(BinaryRowDataSerializer serializer)`. 
The serializer there is for the full row, not just for the sort key fields.

Problems may occur when there are many fields.






[jira] [Created] (FLINK-29690) listTables in HiveCatalog should only return table store tables

2022-10-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29690:


 Summary: listTables in HiveCatalog should only return table store 
tables
 Key: FLINK-29690
 URL: https://issues.apache.org/jira/browse/FLINK-29690
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0, table-store-0.2.2








[jira] [Created] (FLINK-29621) Append-only with eventual log.consistency can not work

2022-10-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29621:


 Summary: Append-only with eventual log.consistency can not work
 Key: FLINK-29621
 URL: https://issues.apache.org/jira/browse/FLINK-29621
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0, table-store-0.2.2








[jira] [Created] (FLINK-29614) Introduce Spark writer for table store

2022-10-13 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29614:


 Summary: Introduce Spark writer for table store
 Key: FLINK-29614
 URL: https://issues.apache.org/jira/browse/FLINK-29614
 Project: Flink
  Issue Type: New Feature
  Components: Table Store
Reporter: Jingsong Lee


The main difficulty is that the Spark SourceV2 interface currently does not 
support custom distribution, and the Table Store must have consistent 
distribution.





[jira] [Created] (FLINK-29586) Let write buffer spillable

2022-10-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29586:


 Summary: Let write buffer spillable
 Key: FLINK-29586
 URL: https://issues.apache.org/jira/browse/FLINK-29586
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0


Columnar formats and remote DFS may greatly affect the performance of compaction. 
We can make the writeBuffer spillable to improve performance.





[jira] [Created] (FLINK-29554) Add partial-update.ignore-delete option to avoid exception after join

2022-10-09 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29554:


 Summary: Add partial-update.ignore-delete option to avoid 
exception after join
 Key: FLINK-29554
 URL: https://issues.apache.org/jira/browse/FLINK-29554
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0, table-store-0.2.1


When the input of a partial update is a normal database CDC stream, it works 
as long as there are no delete records.
However, if a join is performed upstream, the join node in the Flink job 
generates delete messages, which causes the partial update insertion to 
throw an exception.
We can add an option to decide whether to ignore delete messages in this 
case.
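
A minimal sketch of how such an option could behave inside a partial-update merge. The class, the two-value RowKind enum and the field layout are simplified assumptions and not the real merge function.

```java
/** Hypothetical sketch: partial-update merge with an ignore-delete switch. */
final class PartialUpdateSketch {

    /** Simplified row kinds; the real changelog has more. */
    enum RowKind {
        INSERT,
        DELETE
    }

    private final boolean ignoreDelete;
    private final Object[] row;

    PartialUpdateSketch(int fieldCount, boolean ignoreDelete) {
        this.row = new Object[fieldCount];
        this.ignoreDelete = ignoreDelete;
    }

    /** Merges {@code fields} into the current row; non-null fields overwrite. */
    void add(RowKind kind, Object[] fields) {
        if (kind == RowKind.DELETE) {
            if (ignoreDelete) {
                return; // Drop the delete message silently.
            }
            throw new IllegalArgumentException("Partial update cannot accept delete records");
        }
        for (int i = 0; i < row.length; i++) {
            if (fields[i] != null) {
                row[i] = fields[i];
            }
        }
    }

    Object[] result() {
        return row.clone();
    }
}
```

With the switch off, the behaviour matches the exception described above; with it on, delete messages produced by an upstream join are skipped.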





[jira] [Created] (FLINK-29491) Primary key without partition field can be supported from full changelog

2022-09-30 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29491:


 Summary: Primary key without partition field can be supported from 
full changelog
 Key: FLINK-29491
 URL: https://issues.apache.org/jira/browse/FLINK-29491
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


When the primary key does not contain the partition fields, an exception is 
always thrown. We can relax this restriction when the input is a complete 
changelog.





[jira] [Created] (FLINK-29490) Timestamp LTZ is unsupported in table store

2022-09-30 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29490:


 Summary: Timestamp LTZ is unsupported in table store 
 Key: FLINK-29490
 URL: https://issues.apache.org/jira/browse/FLINK-29490
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


Due to a limitation of the ORC format, timestamp ltz is currently unsupported. 
We should fix this and validate this type across multiple engines (Hive, Spark, Trino).
We need to be careful about time zones.





[jira] [Created] (FLINK-29412) Connection leak in orc reader

2022-09-26 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-29412:


 Summary: Connection leak in orc reader
 Key: FLINK-29412
 URL: https://issues.apache.org/jira/browse/FLINK-29412
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.3.0, table-store-0.2.1






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

