Oh, but S3Guard will not solve the atomicity problem, right?

Reference:
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/bk_cloud-data-access/content/ch03s07s01.html

*What S3Guard Cannot Do*
*...*

*Mimic the "directory rename is a single atomic transaction" behavior of a
filesystem like HDFS. Directory renames are still slow and visible while in
progress. This means that if the operations fail partway through, the
source and destination paths may contain a mix (including some duplicate)
copies of data files.*

So the destination directory is "visible while in progress", and a reader
might pick up the compacted directory before all of its files have been
copied.
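
To make the concern concrete, here is a rough toy model in Python. It is not
the S3A or S3Guard API; the dict-based store, the function names, and the
paths (tmp/base_5, table/base_5, bucket_N) are all hypothetical and only
illustrate a rename done as per-object copy plus delete, which I am assuming
is roughly what happens underneath.

```python
# Toy model of an object store: a flat mapping from key to bytes.
# Hypothetical names throughout; this is not the S3A or S3Guard API.

def rename_dir(store, src, dst):
    """Rename a 'directory' as a series of per-object copies and deletes.

    Yields after each object so a caller can observe intermediate states.
    """
    # Snapshot the keys first so the store can be mutated inside the loop.
    for key in sorted(k for k in store if k.startswith(src + "/")):
        new_key = dst + key[len(src):]
        store[new_key] = store[key]  # copy one object to the destination
        del store[key]               # then delete the original
        yield new_key


def list_dir(store, prefix):
    """What a concurrent reader would see when listing a prefix."""
    return sorted(k for k in store if k.startswith(prefix + "/"))


store = {
    "tmp/base_5/bucket_0": b"a",
    "tmp/base_5/bucket_1": b"b",
    "tmp/base_5/bucket_2": b"c",
}

steps = rename_dir(store, "tmp/base_5", "table/base_5")
next(steps)  # only the first object has been moved so far

# A reader listing the destination now sees an incomplete directory.
partial_view = list_dir(store, "table/base_5")
print(partial_view)

for _ in steps:  # let the rename run to completion
    pass

final_view = list_dir(store, "table/base_5")
print(final_view)
```

The first listing is the state I am worried about: the destination already
exists and is readable, but it holds only part of the data.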

Thanks,
Somani

On Fri, Nov 9, 2018 at 10:25 PM Gopal Vijayaraghavan <gop...@apache.org>
wrote:

> >    To me it looks like this problem will be solved by
> >    https://issues.apache.org/jira/browse/HIVE-20823, but until then, is
> >    this broken or I have missed a crucial detail?
>
> Yes, S3Guard.
>
>
> https://www.slideshare.net/hortonworks/s3guard-whats-in-your-consistency-model
>
> However, that's another daemon you need to run (+ provision DynamoDB etc).
>
> It is not the most convenient of setups to run on S3.
>
> Cheers,
> Gopal
