[ 
https://issues.apache.org/jira/browse/GOBBLIN-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanghang Liu resolved GOBBLIN-1949.
-----------------------------------
    Resolution: Fixed

> Add option to detect malformed orc during commit
> ------------------------------------------------
>
>                 Key: GOBBLIN-1949
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1949
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Hanghang Liu
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hot fix for the malformed ORC file issue.
> The issue was observed during compaction, where a malformed ORC file can't
> be opened. There are two scenarios of malformed file. In the first, the file
> contains only the last keyword of the PostScript, meaning only the bytes
> "ORC" were written to the file. In the second, the file contains concrete
> data but doesn't end properly, so the read fails in
> ReaderImpl.extractPostScript().
> The fix is to add a validation step for the ORC file during commit, more
> specifically after closing the writer and before the commit. This prevents
> malformed data from being moved to the output directory or even published
> to the destination.
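
For illustration only, the kind of check described above can be sketched with
the Java standard library. This is not the code from the GOBBLIN-1949 patch:
the class OrcTailCheck and the method hasPlausibleTail are hypothetical names,
and the check is a cheap heuristic that assumes a writer that stores the "ORC"
magic at the end of the PostScript, followed by a one-byte PostScript length.
It rejects both scenarios from the description but does not prove the file is
readable. Opening the file with the ORC reader (for example through
OrcFile.createReader) would exercise the full tail parsing.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Gobblin or ORC. */
final class OrcTailCheck {
  private static final byte[] MAGIC = "ORC".getBytes(StandardCharsets.US_ASCII);

  private OrcTailCheck() {}

  /**
   * Returns true if the end of the file looks like a complete ORC tail:
   * the final byte is a PostScript length that fits in the file, and the
   * three bytes before it are the "ORC" magic.
   */
  static boolean hasPlausibleTail(Path file) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
      long size = raf.length();
      // Too short to hold the magic plus the length byte. This covers the
      // first scenario, a file that contains only the bytes "ORC".
      if (size < MAGIC.length + 1) {
        return false;
      }
      raf.seek(size - 1);
      int psLen = raf.readUnsignedByte();
      // The PostScript must hold more than the magic and must fit in the file.
      if (psLen <= MAGIC.length || psLen + 1L > size) {
        return false;
      }
      byte[] tail = new byte[MAGIC.length];
      raf.seek(size - 1 - MAGIC.length);
      raf.readFully(tail);
      // A file that was cut off mid-write will normally fail here.
      return Arrays.equals(tail, MAGIC);
    }
  }
}
```

In a commit path, such a check would run after the writer is closed and before
the staged file is moved. A file that fails it would be deleted or reported
instead of being committed.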



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
