Hi Artem, would you mind describing your use case in more detail? I'm especially interested in what you mean by executing from a map/reduce program.

Sqoop itself will spawn a MapReduce job, so executing it from another map/reduce job does not make much sense, as the load would multiply. Imagine 50 mappers, each spawning a Sqoop job that in turn spawns 50 mappers: that's 50 * 50 = 2500 running map tasks, which would most likely kill your remote database. Thus it might be more appropriate to execute Sqoop prior to running your MapReduce job, as you've mentioned you're already doing.

As for your question whether Sqoop needs to be installed on each node: it does not. Hadoop provides a facility called DistributedCache [1] that allows you to distribute arbitrary files with your job. One benefit is that jars distributed this way can be added to the task classpath automatically.

Jarcec

Links:
1: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html

On Tue, Feb 12, 2013 at 03:26:50PM +0000, Artem Ervits wrote:
> Hello all,
>
> I'd like to know if there's a way to execute an incremental job from a
> map/reduce program. If there is a way, please point to a user guide I can
> take a look at to achieve it. In case it is possible, does Sqoop need to be
> installed on every node of the Hadoop cluster? I'm aware of the fact that
> Oozie would be able to achieve this but I was wondering if there are other
> ways. Right now I have a script that first calls the Sqoop job and then
> executes the M/R job.
>
> Thank you.
>
> Artem Ervits
> New York Presbyterian Hospital
>
> --------------------
>
> This electronic message is intended to be for the use only of the named
> recipient, and may contain information that is confidential or privileged.
> If you are not the intended recipient, you are hereby notified that any
> disclosure, copying, distribution or use of the contents of this message is
> strictly prohibited.
> If you have received this message in error or are not
> the named recipient, please notify us immediately by contacting the sender at
> the electronic mail address noted above, and delete and destroy all copies of
> this message. Thank you.
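
P.S. The sequential approach recommended above (run Sqoop first, then the MapReduce job) could also be driven from a single small Java program instead of a shell script. The following is only a rough sketch under some assumptions: the incremental import is stored as a saved Sqoop job and started with "sqoop job --exec", both the sqoop and hadoop commands are on the PATH, and the class name SqoopThenMapReduce plus the job name, jar and main class arguments are hypothetical placeholders.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical driver: runs a saved Sqoop job, then a MapReduce job, one after the other.
final class SqoopThenMapReduce {

    // Command line for executing a saved Sqoop job by name.
    static List<String> sqoopCommand(String savedJobName) {
        return Arrays.asList("sqoop", "job", "--exec", savedJobName);
    }

    // Command line for the follow-up MapReduce job packaged in a jar.
    static List<String> mapReduceCommand(String jarPath, String mainClass) {
        return Arrays.asList("hadoop", "jar", jarPath, mainClass);
    }

    // Runs one command, forwards its output to this console, and returns its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: SqoopThenMapReduce <saved-sqoop-job> <jar> <main-class>");
            return;
        }
        int sqoopExit = run(sqoopCommand(args[0]));
        if (sqoopExit != 0) {
            // Do not start the MapReduce job after a failed import.
            System.exit(sqoopExit);
        }
        System.exit(run(mapReduceCommand(args[1], args[2])));
    }
}
```

Because only one Sqoop job is started, the database sees a single set of import mappers, and the MapReduce job starts only if the import exited with status 0.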
