Hive supports post-execute hooks that can do this. Look at org.apache.hadoop.hive.ql.hooks.PostExecute.
After you implement this hook, you can register it through the hive.exec.post.hooks variable, which is a comma-separated list of hook implementations. The caveat is that this hook gets called at the end of the query (which may comprise a number of Hadoop jobs). If you need a per-job hook, there is an effort underway to add PreTask and PostTask hooks:

https://issues.apache.org/jira/browse/HIVE-1347

Would those help for your use case, or is the post-execute hook enough? We use the post-execute hook to drive replication and to collect a lot of usage stats.

Ashish

-----Original Message-----
From: Ashutosh Chauhan [mailto:[email protected]]
Sent: Wednesday, May 26, 2010 9:17 AM
To: [email protected]
Subject: Re: job level output committer in storage handler

Hi Kortni,

Thanks for your suggestion, but we can't use it in our setup. We are not spinning up Hive jobs in a separate process that we can monitor; rather, I want to get a handle on when the job finishes in my storage handler / SerDe.

Ashutosh

On Tue, May 25, 2010 at 12:25, Kortni Smith <[email protected]> wrote:
> Hi Ashutosh,
>
> I'm not sure how to accomplish that on the Hive side of things, but in
> case it helps: it sounds like you want to know when your job is done so
> you can update something externally, and my company will also be
> implementing this in the near future. Our plan is to have the process
> that kicks off our Hive jobs in the cloud monitor each job's status
> periodically using Amazon's EMR Java library and, when their state
> changes to complete, update our external systems accordingly.
>
> Kortni Smith | Software Developer
> AbeBooks.com  Passion for books.
> [email protected]
> phone: 250.412.3272 | fax: 250.475.6014
> Suite 500 - 655 Tyee Rd. Victoria, BC, Canada V9A 6X5
> www.abebooks.com | www.abebooks.co.uk | www.abebooks.de
> www.abebooks.fr | www.abebooks.it | www.iberlibro.com
>
> -----Original Message-----
> From: Ashutosh Chauhan [mailto:[email protected]]
> Sent: Tuesday, May 25, 2010 12:13 PM
> To: [email protected]
> Subject: job level output committer in storage handler
>
> Hi,
>
> I am implementing my own SerDe and storage handler. Is there any method
> in one of these interfaces (or any other) that gives me a handle to do
> some operation after all the records have been written by all reducers,
> something very similar to a job-level output committer? I want to update
> some state in an external system once I know the job has completed
> successfully. Ideally, I would do this kind of thing in a job-level
> output committer, but since Hive is on the old MR API, I don't have
> access to that. There is Hive's RecordWriter#close(); I tried that, but
> it looks like it is a task-level handle, so every reducer would try to
> update the state of my external system, which is not what I want. Any
> pointers on how to achieve this would be much appreciated. If it's
> unclear what I am asking for, let me know and I will provide more
> details.
>
> Thanks,
> Ashutosh
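
[Editorial note: the registration mechanism Ashish describes can be sketched in plain Java. This is a minimal, self-contained stand-in for illustration only: the names PostExecuteHook, ExternalStateHook, and loadHooks are hypothetical and are not the real classes in org.apache.hadoop.hive.ql.hooks, but the pattern (a comma-separated list of class names, each instantiated by reflection and run once per query) matches how hive.exec.post.hooks is consumed.]

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hive's post-execute hook pattern.
// Not the real Hive API; for illustration of the mechanism only.
public class HookDemo {

    // Stand-in for a post-execute hook: invoked once per query, after
    // all of the query's Hadoop jobs have finished (not once per task).
    interface PostExecuteHook {
        void run(String queryId);
    }

    // Example hook: update an external system once the whole query succeeds,
    // which is the job-level handle Ashutosh is asking for.
    public static class ExternalStateHook implements PostExecuteHook {
        @Override
        public void run(String queryId) {
            System.out.println("query " + queryId + " done; updating external state");
        }
    }

    // Mimics how a comma-separated hook list (like hive.exec.post.hooks)
    // is consumed: split on commas, instantiate each class by reflection.
    static List<PostExecuteHook> loadHooks(String commaSeparated) throws Exception {
        List<PostExecuteHook> hooks = new ArrayList<>();
        for (String name : commaSeparated.split(",")) {
            name = name.trim();
            if (name.isEmpty()) {
                continue;
            }
            hooks.add((PostExecuteHook) Class.forName(name)
                    .getDeclaredConstructor().newInstance());
        }
        return hooks;
    }

    public static void main(String[] args) throws Exception {
        // In Hive this value would come from the configuration, e.g.
        //   set hive.exec.post.hooks=com.example.MyHook;
        String conf = "HookDemo$ExternalStateHook";
        for (PostExecuteHook hook : loadHooks(conf)) {
            hook.run("query_001");
        }
    }
}
```

Because the hook runs after the whole query, it fires exactly once even when the query fans out into several Hadoop jobs, which avoids the every-reducer problem Ashutosh hit with RecordWriter#close().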
