[
https://issues.apache.org/jira/browse/FLINK-23730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399613#comment-17399613
]
luoyuxia edited comment on FLINK-23730 at 8/16/21, 8:59 AM:
------------------------------------------------------------
[~yanchenyun] Thanks for reporting it. It's strange that the result isn't
correct when infer-source-parallelism is enabled. It shouldn't lose data with
high concurrency.
Would you like to share the complete Flink SQL?
was (Author: luoyuxia):
[~yanchenyun] Thanks for reporting it. It's strange that the result isn't
correct when infer-source-parallelism is enabled. It shouldn't lose data because
of high concurrency.
Would you like to share the complete Flink SQL?
> Source from hive sink hbase lost data
> -------------------------------------
>
> Key: FLINK-23730
> URL: https://issues.apache.org/jira/browse/FLINK-23730
> Project: Flink
> Issue Type: Bug
> Components: Connectors / HBase, Connectors / Hive
> Affects Versions: 1.12.1
> Reporter: Carl
> Priority: Major
>
> Our use case is as follows:
> # hive source: create a hive table whose metadata is in HMS
> # create the hbase table using the hbase shell
> # flink sql ddl: create the hbase flink table
> # use hive catalog: use flink sql to insert into the hbase flink table
> If I set the table config table.exec.hive.infer-source-parallelism = false,
> the program runs with a parallelism of one, and the number of records in the
> result is correct.
> But if I set the table config table.exec.hive.infer-source-parallelism = true,
> the program runs with a parallelism of twenty, meaning the source parallelism
> is inferred from the number of splits, and the number of records in the result
> is not correct.
>
> The test was repeated many times and no exception occurred.
>
> So I guess it has something to do with high concurrency. Does it lose data
> because of high concurrency?
>
>
>
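For readers following the thread, the setup described in the issue could look roughly like the sketch below. It is not the reporter's actual SQL, which was not posted. The catalog name (myhive), table names (hive_source, hbase_sink, demo_table), columns, connector version and ZooKeeper address are all assumptions made for illustration. Only the option name table.exec.hive.infer-source-parallelism comes from the issue itself.

```sql
-- Assumes a HiveCatalog named "myhive" is already registered (hypothetical name)
-- and that it contains a Hive table "hive_source" with columns id and col1.
USE CATALOG myhive;

-- Toggle split-based source parallelism inference.
-- Unquoted form as used by the Flink 1.12 SQL client; later versions
-- use SET 'key' = 'value'.
SET table.exec.hive.infer-source-parallelism=false;

-- Flink table mapped onto an HBase table created beforehand in the HBase shell.
-- 'demo_table', the column family 'cf' and the quorum address are placeholders.
CREATE TABLE hbase_sink (
  rowkey STRING,
  cf ROW<col1 STRING>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-1.4',
  'table-name' = 'demo_table',
  'zookeeper.quorum' = 'localhost:2181'
);

-- Copy the Hive rows into HBase; the first column becomes the HBase row key.
INSERT INTO hbase_sink
SELECT CAST(id AS STRING), ROW(col1)
FROM hive_source;
```

Running the same INSERT once with the option set to false and once with it set to true, then comparing row counts, would reproduce the comparison described in the report.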
--
This message was sent by Atlassian Jira
(v8.3.4#803005)