[
https://issues.apache.org/jira/browse/FLINK-23730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Carl updated FLINK-23730:
-------------------------
Attachment: image-2021-08-26-09-51-38-899.png
> Source from hive sink hbase lost data
> -------------------------------------
>
> Key: FLINK-23730
> URL: https://issues.apache.org/jira/browse/FLINK-23730
> Project: Flink
> Issue Type: Bug
> Components: Connectors / HBase, Connectors / Hive
> Affects Versions: 1.12.1
> Reporter: Carl
> Priority: Major
> Attachments: image-2021-08-26-09-43-39-055.png,
> image-2021-08-26-09-44-20-390.png, image-2021-08-26-09-50-35-061.png,
> image-2021-08-26-09-51-38-899.png
>
>
> Our use case is as follows:
> # Hive source: create a Hive table whose metadata is stored in HMS
> # Create the HBase table using the HBase shell
> # Flink SQL DDL: create the Flink table that maps to the HBase table
> # Use the Hive catalog: run a Flink SQL INSERT INTO the HBase Flink table
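>
> Steps 3 and 4 might look roughly like the following Flink SQL sketch. The
> catalog name myhive (assumed to be registered already), the database mydb,
> the table names hive_src, hbase_sink and hbase_table, the columns, the
> column family cf, and the ZooKeeper address are all hypothetical
> placeholders, not the reporter's actual values. The connector identifier
> assumes the HBase 1.4 connector; 'hbase-2.2' is the other option in Flink 1.12.
>
> ```sql
> -- Step 3: Flink table mapped onto the table created in the HBase shell.
> -- All names and addresses below are hypothetical placeholders.
> CREATE TABLE default_catalog.default_database.hbase_sink (
>   rowkey STRING,
>   cf ROW<name STRING>,
>   PRIMARY KEY (rowkey) NOT ENFORCED
> ) WITH (
>   'connector' = 'hbase-1.4',
>   'table-name' = 'hbase_table',
>   'zookeeper.quorum' = 'localhost:2181'
> );
>
> -- Step 4: read the Hive table through the Hive catalog and write to HBase.
> USE CATALOG myhive;
> INSERT INTO default_catalog.default_database.hbase_sink
> SELECT id, ROW(name) FROM mydb.hive_src;
> ```
>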
> If I set the table config table.exec.hive.infer-source-parallelism = false,
> the program runs with a parallelism of 1 and the number of records in the
> result is correct.
> But if I set table.exec.hive.infer-source-parallelism = true, the program
> runs with a parallelism of 20 (the source parallelism is inferred from the
> number of splits) and the number of records in the result is not correct.
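>
> A sketch of how the two runs could be compared, assuming the Flink 1.12 SQL
> client (newer versions expect the key and value in quotes; in a Table API
> program the option would be set on the TableConfig's configuration instead).
> The names myhive, mydb, hive_src and the row key column id are hypothetical
> placeholders. Because HBase keeps one row per row key, the distinct row key
> count is the figure to compare with the row count reported by HBase.
>
> ```sql
> -- Toggle between false (single source task) and true (inferred from splits).
> SET table.exec.hive.infer-source-parallelism=false;
>
> -- Reference counts on the Hive side, to compare with the HBase row count.
> USE CATALOG myhive;
> SELECT COUNT(*) AS total_rows,
>        COUNT(DISTINCT id) AS distinct_rowkeys
> FROM mydb.hive_src;
> ```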
>
> The test was repeated many times, and no exception occurred.
>
> So I guess it is related to the higher parallelism. Is data being lost
> because of the higher parallelism?
>
>
>