On Wed, 23 Sep 2020 at 20:16, Jim Brennan
<james.bren...@verizonmedia.com.invalid> wrote:

> I replied in the Jira. The speedup provided by the v2 commit algorithm
> is very important to us at Verizon Media (Yahoo). Please do not remove it.
> I referred to this comment from Jason Lowe on the original Jira:
>
> https://issues.apache.org/jira/browse/MAPREDUCE-4815?focusedCommentId=14271115&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14271115
>
> I think it would be appropriate to better document the limitations of the
> v2 algorithm and possibly make it not be the default, as long as we can
> still use it.
>


What about:
- change the default to v1
- log at WARN in job setup (but not in tasks) when v2 is selected


People like yourself, aware of and happy with the risk, can carry on, but
everyone else gets a warning about it.

I could also use a dedicated logger for the warning so you can turn it off.
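
A rough sketch of that proposal in Java, purely illustrative and not the
actual FileOutputCommitter code. The class name, the logger name
"example.committer.v2.warning", the helper methods, and the use of
java.util.logging with a plain Map in place of Hadoop's own logging and
Configuration classes are all assumptions made to keep the example
self-contained. The property key is the one Hadoop documents for selecting
the commit algorithm, as far as I know.

```java
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: default to v1, warn once at job setup if v2 is chosen. */
final class CommitAlgorithmWarning {
    // Property used to select the commit algorithm version.
    static final String ALGORITHM_VERSION_KEY =
        "mapreduce.fileoutputcommitter.algorithm.version";

    // Assumed name for a dedicated logger, so the warning can be silenced
    // on its own without touching the committer's other log output.
    static final String WARNING_LOGGER_NAME = "example.committer.v2.warning";

    // The proposed new default: v1 unless the job asks for something else.
    static final int PROPOSED_DEFAULT_VERSION = 1;

    private static final Logger WARNING_LOG =
        Logger.getLogger(WARNING_LOGGER_NAME);

    /** Reads the requested version, falling back to the proposed default. */
    static int resolveVersion(Map<String, String> conf) {
        String value = conf.get(ALGORITHM_VERSION_KEY);
        return value == null
            ? PROPOSED_DEFAULT_VERSION
            : Integer.parseInt(value.trim());
    }

    /**
     * Logs the risk at WARN only during job setup, never in tasks.
     * Returns true if a warning was actually emitted.
     */
    static boolean warnIfV2(int version, boolean isJobSetup) {
        if (version == 2 && isJobSetup
                && WARNING_LOG.isLoggable(Level.WARNING)) {
            WARNING_LOG.warning("v2 commit algorithm selected: a task attempt"
                + " failing partway through its nonatomic commit can leave"
                + " partial output behind");
            return true;
        }
        return false;
    }
}
```

In this sketch, anyone who accepts the risk could set the level of the
dedicated logger to OFF and keep using v2 without the message.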

>
> On Wed, Sep 23, 2020 at 2:07 PM Igor Dvorzhak <i...@google.com.invalid>
> wrote:
>
> > What will be the solution for object stores to have fast and correct
> > commit algorithms?
> >
> > On Wed, Sep 23, 2020 at 11:42 AM Steve Loughran
> > <ste...@cloudera.com.invalid> wrote:
> >
> >> I've got a PR up to completely remove the v2 commit algorithm
> >>
> >> https://github.com/apache/hadoop/pull/2320
> >>
> >> That may seem overkill, but while *we* know there's a small window of risk
> >> (task attempt 1 failing partway through a nonatomic commit), that's not
> >> known/appreciated by others.
> >>
> >> The patch removes the v2 codepath from FileOutputCommitter, making it a lot
> >> less complicated, and when v2 is requested, a warning is printed and the
> >> option ignored.
> >>
> >> Overkill? Maybe. But it guarantees correctness.
> >>
> >
>
