OK, thank you, Sean.

I think we are in agreement.
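To make the concern concrete for the archive, here is a minimal, self-contained sketch in plain java.io. The names here (DeferredFailingStream, this quietClose) are hypothetical stand-ins, not Mahout's actual o.a.m.common.ioutils or any HDFS class: the point is just that when close() is the call that actually commits buffered bytes, a log-and-ignore close swallows the lost write, while a checked close lets the M/R task fail and be retried.

```java
import java.io.IOException;
import java.io.OutputStream;

public class QuietCloseDemo {

    // Hypothetical stream whose "device" only receives bytes at close(),
    // mimicking a client that ships buffered data when the stream is closed.
    static class DeferredFailingStream extends OutputStream {
        private final StringBuilder buffer = new StringBuilder();

        @Override
        public void write(int b) {
            buffer.append((char) b); // write() only buffers; nothing is committed yet
        }

        @Override
        public void close() throws IOException {
            // The real commit happens here -- and fails.
            throw new IOException("flush on close failed; "
                    + buffer.length() + " byte(s) lost");
        }
    }

    // Modeled on the log-and-ignore pattern under discussion (not Mahout's code).
    static void quietClose(OutputStream out) {
        try {
            out.close();
        } catch (IOException swallowed) {
            // logged somewhere, then ignored -- the caller never learns
        }
    }

    public static void main(String[] args) throws IOException {
        OutputStream quiet = new DeferredFailingStream();
        quiet.write('x');      // "succeeds" -- data is only buffered
        quietClose(quiet);     // failure swallowed: caller believes the write landed
        System.out.println("quietClose: no error surfaced, data silently lost");

        OutputStream checked = new DeferredFailingStream();
        checked.write('x');
        try {
            checked.close();   // checked close: failure propagates, task can retry
        } catch (IOException e) {
            System.out.println("checked close: " + e.getMessage());
        }
    }
}
```

With a checked close, the task attempt fails and the framework re-runs it; with the quiet variant, the job "succeeds" with missing output.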

On Wed, Mar 2, 2011 at 9:22 AM, Sean Owen <[email protected]> wrote:

> On Wed, Mar 2, 2011 at 5:05 PM, Dmitriy Lyubimov <[email protected]>
> wrote:
> > I am not saying that. I actually was just affirming that it holds in
> > the case of HDFS (albeit not in general per the close() contract --
> > it's not true for files, for example).
>
> I hope there's no such case, or else things like file writes may
> silently never happen in Java. That shouldn't be true in general, I
> hope.
>
>
> > OK, what I am trying to argue is that using quietClose(...) in
> > o.a.m.common.ioutils is a bad practice for HDFS writes such as side
> > files and multiple outputs. I can't seem to get you to agree on this.
>
> Please go ahead as you have proposed already. My only suggestion is
> try to make the naming consistent ("close()"?) and ideally we have a
> similar API and implementation. You can alter the existing
> implementation.
>
> I actually don't even mind if all the quiet closes are made noisy.
> Wherever they exist, it should be the case that the close()s don't "do"
> anything and shouldn't fail anyway. And if one does fail, well, sure,
> the caller can't do anything meaningful, but maybe it's better to fall
> over if something very weird happened, rather than just log it.
>
> The worst thing that happens with a "noisy" close in your situation is
> that an M/R task fails and is restarted when it may or may not have
> been needed -- and that's no big deal at all. You're right that this is
> actually a case of the caller being able to recover, so it makes sense.
>
> And come to think of it, I bet there are certain things
> SequenceFile.Writer must do to finish writing a file, given that HDFS
> files are write-once, and it has no finish() or flush() method like,
> for instance, GZIPOutputStream does. So yes, there is a case where in
> practice you do have to make sure close() succeeds, rather than being
> able to explicitly finish the write. That's a good example indeed.
>
>
> >
> > But perhaps I can hope to get you to agree that checking for close
> > errors at least doesn't hurt, so I can continue checking for them and
> > not using quietClose() during MR task commits?
> >
> >
> >
> >
> > -d
> >
> > On Wed, Mar 2, 2011 at 8:20 AM, Sean Owen (JIRA) <[email protected]>
> wrote:
> >>
> >>    [
> https://issues.apache.org/jira/browse/MAHOUT-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001470#comment-13001470]
> >>
> >> Sean Owen commented on MAHOUT-593:
> >> ----------------------------------
> >>
> >> I think we're just talking about semantics now. You are welcome to
> >> commit as you like.
> >>
> >> I do agree that a stream could buffer writes, that it could commit
> >> as a way to clean up before close, and that the commit could fail
> >> during close. If the caller is trying to commit by closing, that's
> >> really the problem. But it's a real problem. Yes, I don't think that
> >> in any Java API I can think of, or in HDFS, you would have a
> >> successful write and a successful close but fail to write.
> >>
> >>> Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure
> compatibility with current Mahout dependencies.
> >>>
> ----------------------------------------------------------------------------------------------------------------------
> >>>
> >>>                 Key: MAHOUT-593
> >>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-593
> >>>             Project: Mahout
> >>>          Issue Type: New Feature
> >>>          Components: Math
> >>>    Affects Versions: 0.4
> >>>            Reporter: Dmitriy Lyubimov
> >>>             Fix For: 0.5
> >>>
> >>>         Attachments: MAHOUT-593.patch.gz, MAHOUT-593.patch.gz,
> MAHOUT-593.patch.gz, SSVD-givens-CLI.pdf, ssvdclassdiag.png
> >>>
> >>>
> >>> The current Mahout-376 patch requires the 'new' Hadoop API. Certain
> >>> elements of that API (namely, multiple outputs) are not available in
> >>> the standard Hadoop 0.20.2 release. As such, it may work only with
> >>> either CDH or 0.21 distributions.
> >>> In order to bring it into sync with current Mahout dependencies, a
> >>> backport of the patch to the 'old' API is needed.
> >>> Also, some work is needed to resolve math dependencies. The
> >>> existing patch relies on Apache commons-math 2.1 for
> >>> eigendecomposition of small matrices. This dependency is not
> >>> currently set up in Mahout core. So certain snippets of code are
> >>> required either to move to mahout-math or to use the Colt
> >>> eigendecomposition (last time I tried, my results with that one were
> >>> mixed: it seems to produce results inconsistent with those from the
> >>> mahout-math eigensolver; at the very least, it doesn't produce
> >>> singular values in sorted order).
> >>> So this patch is mainly moving some Mahout-376 code around.
> >>
> >> --
> >> This message is automatically generated by JIRA.
> >> -
> >> For more information on JIRA, see:
> http://www.atlassian.com/software/jira
> >>
> >>
> >>
> >
>

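P.S. On the GzipOutputStream point in the mail above: java.util.zip.GZIPOutputStream (via DeflaterOutputStream) really does let you commit explicitly with finish(), after which close() has nothing substantive left to do -- exactly the escape hatch SequenceFile.Writer lacks. A small self-contained sketch (plain JDK, writing to an in-memory sink rather than HDFS):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class FinishVsCloseDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(sink);

        gz.write("hello".getBytes(StandardCharsets.UTF_8));

        // Explicit commit: remaining compressed data and the gzip trailer
        // are written to the sink, without closing the underlying stream.
        gz.finish();
        int committed = sink.size();

        // close() is now "uneventful"; a failure here would be ignorable
        // because the write was already finished explicitly.
        gz.close();

        System.out.println("bytes committed by finish(): " + committed);
        System.out.println("bytes added by close(): " + (sink.size() - committed));
    }
}
```

A writer without a finish()-style method forces the caller to treat close() itself as the commit, which is why its result has to be checked.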