[ 
https://issues.apache.org/jira/browse/HIVE-24266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita reassigned HIVE-24266:
---------------------------------


> Committed rows in hflush'd ACID files may be missing from query result
> ----------------------------------------------------------------------
>
>                 Key: HIVE-24266
>                 URL: https://issues.apache.org/jira/browse/HIVE-24266
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major
>
> In an HDFS environment, if a writer uses hflush to write ORC ACID files 
> during a transaction commit, committed rows may be missing from query results 
> when the table is read before the file is completely persisted to disk (that 
> is, synced).
> This happens because hflush does not persist the new buffers to disk; it only 
> ensures that new readers can see the new content. As a result, the block 
> information that BISplitStrategy relies on is incomplete. Although 
> the side file (_flush_length) tracks the proper end of the file being 
> written, this information is ignored in favour of the block information, 
> and we may end up generating a very short split instead of one covering the 
> larger, available length.
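
The length selection described above can be illustrated with a minimal, self-contained sketch. It is not Hive's actual code or the eventual fix: the class and method names (FlushLengthSketch, lastFlushLength, readableLength) are hypothetical, and it assumes the _flush_length side file holds a sequence of 8-byte big-endian lengths of which the last complete entry is the current one.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; none of these names come from Hive.
class FlushLengthSketch {

    // Returns the last complete 8-byte length recorded in the side file
    // contents, or -1 if there is no complete entry. A trailing partial
    // entry (a write in progress) is ignored.
    static long lastFlushLength(byte[] sideFile) {
        ByteBuffer buf = ByteBuffer.wrap(sideFile); // big-endian by default
        long result = -1;
        while (buf.remaining() >= Long.BYTES) {
            result = buf.getLong();
        }
        return result;
    }

    // Picks the length a split should cover: the side file's length when
    // one is available, otherwise the length derived from block information.
    static long readableLength(long blockReportedLength, byte[] sideFile) {
        long flushed = (sideFile == null) ? -1 : lastFlushLength(sideFile);
        return flushed >= 0 ? flushed : blockReportedLength;
    }

    public static void main(String[] args) {
        // Side file with two recorded flushes: 1024 bytes, then 4096 bytes.
        byte[] sideFile = ByteBuffer.allocate(16).putLong(1024L).putLong(4096L).array();
        // Block information still reports only 512 bytes after an hflush.
        System.out.println(readableLength(512L, sideFile)); // 4096
        System.out.println(readableLength(512L, null));     // 512
    }
}
```

With block information alone, the split in this example would cover 512 bytes and miss the rows between offsets 512 and 4096; consulting the side file first avoids the short split.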



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
