The trickiest case is where a user-defined function fails on some, but not all, inputs. Do you:

a) fail the entire job
b) log the errors, omit those inputs, and keep on processing (a questionable course of action in the case of non-monotonic queries/programs, e.g. set difference, as well as aggregation like COUNT)
c) log the errors, and insert NULLs into the stream (perhaps the best option, especially given that we're going to support nulls anyway)
d) support some subset of (a,b,c) and permit the user to choose on a per-job basis

?
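
To make the options concrete, here is a minimal Java sketch of (d), a per-job policy switch over (a), (b), and (c). Every name in it (UdfErrorPolicySketch, ErrorPolicy, applyUdf, the errorLog list) is hypothetical and only for illustration; none of it is Pig's actual API, and a real implementation would work on tuples and a proper logger rather than in-memory lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; these names are assumptions, not Pig classes.
final class UdfErrorPolicySketch {

    /** Options (a), (b), and (c); choosing one per job is option (d). */
    enum ErrorPolicy { FAIL_JOB, SKIP_INPUT, INSERT_NULL }

    /**
     * Applies udf to each input in order. When the UDF throws, the
     * chosen policy decides whether the failure aborts the run, drops
     * the input, or puts a null in its place. Failures that do not
     * abort the run are recorded in errorLog.
     */
    static <T, R> List<R> applyUdf(List<T> inputs, Function<T, R> udf,
                                   ErrorPolicy policy, List<String> errorLog) {
        List<R> output = new ArrayList<>();
        for (T input : inputs) {
            try {
                output.add(udf.apply(input));
            } catch (RuntimeException e) {
                if (policy == ErrorPolicy.FAIL_JOB) {
                    throw e; // (a): the first failure ends the whole job
                }
                errorLog.add("UDF failed on input " + input + ": " + e.getMessage());
                if (policy == ErrorPolicy.INSERT_NULL) {
                    output.add(null); // (c): keep the row, null the value
                }
                // (b) SKIP_INPUT: the input is simply omitted from the output
            }
        }
        return output;
    }
}
```

For example, with the inputs "1", "x", "3" and Integer::parseInt as the UDF, SKIP_INPUT produces two output rows and INSERT_NULL produces three, so a downstream COUNT sees different row counts depending on the policy. That is the concern raised in (b).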

-Chris

On Feb 29, 2008, at 11:10 AM, Olga Natkovich wrote:



-----Original Message-----
From: Benjamin Francisoud [mailto:[EMAIL PROTECTED]
Sent: Friday, February 29, 2008 7:05 AM
To: [email protected]
Subject: Re: Proposal for error handling in Pig

About Internal Errors, do you consider such code to be part of them ?

public void something(Object object) {
    if (object == null) {
        throw new IllegalArgumentException("Object can't be null");
    }
    ...
}

class StateMachine {
    public void start() {...}
    public void end() {
        if (startCalled == false) {
            throw new IllegalStateException("You didn't call start()");
        }
    }
}

About user errors, how should we handle them ?
The way I proposed in PIG-100 (1) ?

Yes, that's fine. I personally don't see a strong reason to log the exception stack in this case but I am fine with doing it if others find it helpful. I will update the doc to include this information.


try {
    plan = parser.Parse();
} catch (ParseException e) {
    log.error(e.getMessage());
    log.debug(e);
}



[1]
https://issues.apache.org/jira/browse/PIG-100?focusedCommentId=12573218#action_12573218

Olga Natkovich wrote:
Pig developers,

We had many patches submitted that are trying to improve error handling.
This is really great as many users ask exactly for that. So it seems
timely to establish some guidelines on how errors should be handled,
propagated, delivered, etc.

I put together a proposal to start the discussion. Please, review and
comment. Once we have an agreement we would need to add the missing
pieces to deploy it into Pig and then review the existing patches to
make sure they follow the proposed practice.

http://wiki.apache.org/pig/PigDeveloperCookbook

I have also started a general document called Pig Developer Cookbook
where we can keep track of development patterns we as a community want
to follow.

Thanks again for everybody's contributions!

Olga





--
Christopher Olston, Ph.D.
Sr. Research Scientist
Yahoo! Research

