[
https://issues.apache.org/jira/browse/FLINK-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126741#comment-17126741
]
Rui Li commented on FLINK-18025:
--------------------------------
I have manually tried the following test cases.
# Create a {{datagen}} table as the source table. The source table generates 5
records per second, 1000 records in total.
# Insert the source table into a partitioned Hive table. Use {{process-time}}
as commit trigger. Verify partitions are being committed as the job progresses.
Verify success file is written for each partition. Verify number of records
after the job finishes.
# Insert the source table into a partitioned Hive table. Use {{partition-time}}
as commit trigger, with the timestamp extracted from a single partition
column. Verify partitions are being committed as the job progresses. Verify
success file is written for each partition. Verify number of records after the
job finishes.
# Insert the source table into a single partition repeatedly. Use
{{process-time}} as commit trigger. Verify number of records after the job
finishes.
# Insert the source table into a non-partitioned Hive table. Verify number of
records after the job finishes.
# Insert the source table into a partitioned Hive table. Use {{process-time}}
as commit trigger. Kill one TM during job execution. Verify number of records
after the job finishes.
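The "success file is written for each partition" check in the cases above can be scripted. The following is only a rough sketch, not what was run for these tests. It assumes the table location is reachable as a local directory (a real Hive table would typically live on HDFS), that the success file uses the name {{_SUCCESS}}, and that hidden or underscore-prefixed entries are temporary files. The function name {{partitions_missing_success}} is hypothetical.

```python
import os

# Assumption: the sink writes a success file with this name into each committed partition.
SUCCESS_FILE = "_SUCCESS"


def partitions_missing_success(table_dir):
    """Return relative paths of directories that hold data files but no success file."""
    missing = []
    for root, dirs, files in os.walk(table_dir):
        # Assumption: hidden or underscore-prefixed directories are staging/temporary.
        dirs[:] = [d for d in dirs if not d.startswith((".", "_"))]
        # Assumption: hidden or underscore-prefixed files are not committed data files.
        data_files = [f for f in files if not f.startswith((".", "_"))]
        if data_files and SUCCESS_FILE not in files:
            missing.append(os.path.relpath(root, table_dir))
    return sorted(missing)
```

An empty result after the job finishes would mean every partition directory that contains data also contains a success file.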
Currently test cases #4 and #6 fail. For #4, old data can be overwritten, which
doesn't meet the semantics of {{INSERT OVERWRITE}}. For #6, the job finishes
successfully after restart, but there's data loss.
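To make a "number of records" check also show which rows were lost, one option is to compare the ids read back from the sink against the expected range. This is a minimal sketch under an assumption that is not part of the tests above: the source table has an id column whose values run from 1 to 1000. The function name {{compare_with_expected}} is hypothetical.

```python
from collections import Counter


def compare_with_expected(sink_ids, expected_total=1000):
    """Compare ids read back from the sink with the expected ids 1..expected_total.

    Returns (missing, duplicated): ids that never arrived, and ids that arrived
    more than once.
    """
    counts = Counter(sink_ids)
    missing = [i for i in range(1, expected_total + 1) if counts[i] == 0]
    duplicated = sorted(i for i, c in counts.items() if c > 1)
    return missing, duplicated
```

A non-empty {{missing}} list corresponds to the data loss seen in #6, while a non-empty {{duplicated}} list would indicate records written more than once after the restart.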
> E2E tests manually for Hive streaming sink
> ------------------------------------------
>
> Key: FLINK-18025
> URL: https://issues.apache.org/jira/browse/FLINK-18025
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Hive, Tests
> Affects Versions: 1.11.0
> Reporter: Danny Chen
> Assignee: Rui Li
> Priority: Blocker
> Fix For: 1.11.0
>
>
> - hive streaming sink failover
> - hive streaming sink job re-run
> - hive streaming sink without partition
> - ...
--
This message was sent by Atlassian Jira
(v8.3.4#803005)