Hi Jarcec,

If we are merging two HDFS datasets, I do not understand why we would need a database connection. Could you explain?
Thanks,
Chalcy

On Tue, Oct 23, 2012 at 10:59 AM, Jarek Jarcec Cecho <[email protected]> wrote:

> Hi Chalcy,
> Sqoop needs to be able to parse the files you're trying to merge as newer
> entries must be updated. Usually Sqoop generates a special class for this
> purpose based on the connection in use, however in the merge case there is no
> connection to the database and therefore you need to specify such a class
> manually. This class is automatically generated for you in the case of the
> import tool and might be manually generated using the codegen tool [1]. You
> might get additional information about those two arguments of the merge tool in
> our user guide [2].
>
> Jarcec
>
> Links:
> 1: http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_codegen_literal
> 2: http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_merge_literal
>
> On Tue, Oct 23, 2012 at 09:41:07AM -0400, Chalcy wrote:
> > Hello Sqoop users,
> >
> > I tried to use sqoop merge and understand all the parameters except
> > --class-name and --jar-file. What should they be? Sqoop errors out if I
> > do not specify them.
> >
> > The command I am using is
> > sqoop merge --new-data user/hadoop/testincrement --onto
> > /user/hadoop/exisitngdata --target-dir /user/hadoop/mergeddir --merge-key
> > rowid
> >
> > Thanks,
> > Chalcy
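
A rough sketch of the two-step workflow described in the quoted reply: generate the record class with the codegen tool, then hand the resulting jar and class name to merge. The JDBC URL, user name, table name, the class name ExistingData, and the /tmp/sqoop-codegen directories are placeholders assumed for illustration, not values from this thread. The jar path in particular is a guess; use whatever path codegen reports when it writes the jar. The HDFS paths and merge key are copied from the original command. By default the script only prints the commands (DRY_RUN=1) so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: every value marked "assumed" or "hypothetical" is a placeholder.

DRY_RUN="${DRY_RUN:-1}"                          # 1 = print commands, 0 = execute them

CONNECT_URL="jdbc:mysql://db.example.com/mydb"   # assumed JDBC URL
DB_USER="hadoop"                                 # assumed database user
TABLE="existingdata"                             # assumed source table
CLASS_NAME="ExistingData"                        # hypothetical record class name
CODEGEN_DIR="/tmp/sqoop-codegen"                 # hypothetical output directory
JAR_FILE="$CODEGEN_DIR/bin/$CLASS_NAME.jar"      # assumed; check codegen's log for the real path

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: codegen needs a database connection, because it derives the
# record class from the table's schema. -P prompts for the password.
codegen_step() {
  run sqoop codegen \
    --connect "$CONNECT_URL" --username "$DB_USER" -P \
    --table "$TABLE" --class-name "$CLASS_NAME" \
    --outdir "$CODEGEN_DIR/src" --bindir "$CODEGEN_DIR/bin"
}

# Step 2: merge has no connection, so the class is supplied explicitly
# through --jar-file and --class-name. Paths are as given in the thread.
merge_step() {
  run sqoop merge \
    --new-data user/hadoop/testincrement \
    --onto /user/hadoop/exisitngdata \
    --target-dir /user/hadoop/mergeddir \
    --jar-file "$JAR_FILE" --class-name "$CLASS_NAME" \
    --merge-key rowid
}

codegen_step
merge_step
```

If the data being merged was originally produced by sqoop import, the class generated during that import can be reused instead of running codegen again, as the quoted reply notes.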
