[ 
https://issues.apache.org/jira/browse/GOBBLIN-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zihan Li resolved GOBBLIN-1343.
-------------------------------
    Resolution: Fixed

> Fix the data loss issue caused by the cache expiration in 
> PartitionerDataWriter
> -------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1343
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1343
>             Project: Apache Gobblin
>          Issue Type: Task
>            Reporter: Zihan Li
>            Priority: Major
>
> Problem statement:
> Previously, we maintained a cache in PartitionedDataWriter to avoid 
> accumulating writers in memory during long-running jobs. But when a writer 
> expires from the cache, we only close it without flushing or committing, 
> which can cause data loss when HDFS is slow.
>  
> Potential solutions:
>  # In the removal logic, ensure the writer has been committed correctly, 
> i.e. force a commit before close. The issue with this approach is that the 
> writer is still removed from the cache, so the next flush message would be 
> handled and returned without calling commit on the right writer, and the 
> watermark would advance without the data being published to HDFS.
>  # Measure the time taken by each write operation, and if it takes too 
> long, force the writer back into the cache so that the next flush message 
> is picked up by the writer.
> Here we use the second solution.
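The chosen fix (the second option above) can be sketched roughly as follows. `SlowWriteGuard`, the threshold knob, and the plain `HashMap` standing in for the expiring cache are all illustrative assumptions, not Gobblin's actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of solution 2: time each write, and if it was slow enough that the
// cache may have expired the writer mid-write, put the writer back into the
// cache so the next flush message still commits it before the watermark moves.
public class SlowWriteGuard {
    interface Writer {
        void write(Object record) throws Exception;
    }

    // Stand-in for the expiring writer cache held by PartitionedDataWriter.
    final Map<String, Writer> writerCache = new HashMap<>();
    final long slowWriteThresholdMs; // illustrative knob, not a real Gobblin config

    SlowWriteGuard(long slowWriteThresholdMs) {
        this.slowWriteThresholdMs = slowWriteThresholdMs;
    }

    // Returns true if the writer was re-added to the cache after a slow write.
    boolean writeWithGuard(String partition, Writer writer, Object record) throws Exception {
        long start = System.currentTimeMillis();
        writer.write(record);
        long elapsed = System.currentTimeMillis() - start;
        if (elapsed >= slowWriteThresholdMs && !writerCache.containsKey(partition)) {
            // Expiration may have removed the writer during the slow write; restore it.
            writerCache.put(partition, writer);
            return true;
        }
        return false;
    }
}
```

A writer re-added this way will receive the next flush message, so commit runs before the watermark advances.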



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
