[ https://issues.apache.org/jira/browse/FLINK-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126741#comment-17126741 ]

Rui Li commented on FLINK-18025:
--------------------------------

I have manually run the following test cases:

# Create a {{datagen}} source table that generates 5 records per second, 1000 
records in total.
# Insert the source table into a partitioned Hive table. Use {{process-time}} 
as commit trigger. Verify partitions are being committed as the job progresses. 
Verify success file is written for each partition. Verify number of records 
after the job finishes.
# Insert the source table into a partitioned Hive table. Use {{partition-time}} 
as commit trigger and the timestamp is extracted from a single partition 
column. Verify partitions are being committed as the job progresses. Verify 
success file is written for each partition. Verify number of records after the 
job finishes.
# Insert the source table into a single partition repeatedly. Use 
{{process-time}} as commit trigger. Verify number of records after the job 
finishes.
# Insert the source table into a non-partitioned Hive table. Verify number of 
records after the job finishes.
# Insert the source table into a partitioned Hive table. Use {{process-time}} 
as commit trigger. Kill one TM during job execution. Verify number of records 
after the job finishes.

Currently, test cases #4 and #6 fail. For #4, old data can be overwritten, which 
violates the semantics of {{INSERT OVERWRITE}}. For #6, the job finishes 
successfully after the restart, but some data is lost.

> E2E tests manually for Hive streaming sink
> ------------------------------------------
>
>                 Key: FLINK-18025
>                 URL: https://issues.apache.org/jira/browse/FLINK-18025
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Hive, Tests
>    Affects Versions: 1.11.0
>            Reporter: Danny Chen
>            Assignee: Rui Li
>            Priority: Blocker
>             Fix For: 1.11.0
>
>
> - hive streaming sink failover
>  - hive streaming sink job re-run
>  - hive streaming sink without partition
>  - ...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
