Hi Patrick,

I think what you need is
OutputCommitter#commitTask() <http://hadoop.apache.org/docs/r1.1.2/api/org/apache/hadoop/mapreduce/OutputCommitter.html#commitTask(org.apache.hadoop.mapreduce.TaskAttemptContext)>.
Hadoop calls it in each task process, so you can write your own
OutputCommitter class and associate it with your StoreFunc. Then you can
make a single call to your DB for the batched output of each task.
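
Roughly, the buffering pattern looks like the sketch below. To keep it
compilable on its own I replaced the Hadoop and Pig types with stand-ins:
BatchSink and BufferedTaskOutput are hypothetical names, not real APIs, and
the methods only mirror the roles of StoreFunc.putNext(),
OutputCommitter.commitTask() and OutputCommitter.abortTask(). How the buffer
is shared between your StoreFunc and your committer in a real job is not
shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the database client; not a Pig or Hadoop API.
interface BatchSink {
    void writeBatch(List<String> rows);
}

// Mirrors the roles of a StoreFunc and its OutputCommitter without Hadoop types.
class BufferedTaskOutput {
    private final List<String> buffer = new ArrayList<>();
    private final BatchSink sink;

    BufferedTaskOutput(BatchSink sink) {
        this.sink = sink;
    }

    // Role of StoreFunc.putNext(): only accumulate, no DB round trip.
    void putNext(String row) {
        buffer.add(row);
    }

    // Role of OutputCommitter.commitTask(): one call for the whole batch.
    void commitTask() {
        if (!buffer.isEmpty()) {
            sink.writeBatch(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    // Role of OutputCommitter.abortTask(): drop the rows of a failed attempt.
    void abortTask() {
        buffer.clear();
    }
}
```

In a real committer, needsTaskCommit() also has to return true, otherwise
Hadoop skips commitTask().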

If you're looking for a way to do some final work per job, you will have to
rely on either
commitJob() <http://hadoop.apache.org/docs/r1.1.2/api/org/apache/hadoop/mapreduce/OutputCommitter.html#commitJob(org.apache.hadoop.mapreduce.JobContext)>
or cleanupOnSuccess(). But again, neither of these is called by the task
process. I am not sure what context you want to share between putNext() and
cleanupOnSuccess(). The JobConf object is constructed on the frontend
before the MR jobs are launched, and the properties set in it at that point
are available everywhere. However, you won't be able to update a property
in putNext() and see the new value in cleanupOnSuccess(). Hope this is clear.
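
To illustrate the one-way visibility, here is a toy model using
java.util.Properties in place of JobConf. It assumes each task works on its
own copy of the configuration built on the frontend; the class name and the
property keys are made up for the example.

```java
import java.util.Properties;

class ConfVisibilityDemo {
    // Stand-in for shipping the frontend configuration to a task:
    // the task gets its own copy, not a shared reference.
    static Properties copyForTask(Properties frontendConf) {
        Properties taskConf = new Properties();
        taskConf.putAll(frontendConf);
        return taskConf;
    }

    static String run() {
        Properties frontendConf = new Properties();
        frontendConf.setProperty("my.store.table", "events"); // hypothetical key

        Properties taskConf = copyForTask(frontendConf);
        // Properties set before launch are visible inside the task.
        String seenInTask = taskConf.getProperty("my.store.table");
        // An update made inside the task stays in the task's copy.
        taskConf.setProperty("my.store.rows.written", "42"); // hypothetical key

        // The frontend never sees the task's update.
        return seenInTask + "/" + frontendConf.getProperty("my.store.rows.written");
    }
}
```

Here run() reads the frontend value inside the task, but the value written
by the task is missing when the frontend looks for it afterwards.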

Thanks,
Cheolsoo



On Sun, Dec 15, 2013 at 7:11 AM, Patrick Thompson <[email protected]> wrote:

> So is there a good way to flush a buffer accumulated by putNext? I was
> hoping it was possible in cleanUpOnSuccess, but that apparently isn't going
> to work. This is horrible for something talking to a store such as MySQL,
> as it means you have to do updates one at a time.
>
> Patrick
>
>
> On Sun, Dec 15, 2013 at 12:41 AM, Cheolsoo Park <[email protected]> wrote:
>
> > >> putNext and cleanUpOnSuccess will be called in the same execution
> > >> context?
> >
> > putNext() is called on the backend during the job execution, whereas
> > cleanUpOnSuccess() is called on the frontend after the job is finished.
> > So they won't be executed by the same object. From the comment, I also
> > doubt that you can share properties between them via JobConf.
> >
> > See MapReduceLauncher.java for how cleanUpOnSuccess() is used.
> >
> > On Thu, Dec 5, 2013 at 11:10 AM, Patrick Thompson <[email protected]> wrote:
> >
> > > It's not clear from the docs where the various StoreFuncInterface
> > > functions get called. There are some hints in the API docs
> > > <http://pig.apache.org/docs/r0.12.0/api/>, but I am left wondering,
> > > does pig guarantee that, for example, putNext and cleanUpOnSuccess
> > > will be called in the same execution context?
> > >
> > > Is this documented somewhere? Maybe someone can provide an answer? It
> > > would save me a lot of time experimenting and spelunking in the code.
> > >
> > > Thanks
> > >
> > > Patrick
> > >
> >
>
>
>
> --
> fun and games - a blog <http://funazonki.blogspot.com/>, a word
> game <http://1.whatwouldwho.appspot.com/wwws.html> and
> CanCan <http://www.standingwaiting.com/CanCan/Game.html>
>
