Hi Marc,

I was wrong about sequence files in my previous email. SqoopRecord is in fact needed by MR jobs to deserialize the sequence files that Sqoop writes, so the generated class and the Sqoop jar both have to be available to your job. Again, I am sorry for the confusion.
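As far as I understand it, a sequence file records the class name of its values, and the reader instantiates that class by reflection. That means both the generated class and its superclass com.cloudera.sqoop.lib.SqoopRecord must be loadable wherever the job runs. Below is a minimal sketch for checking this. `ClasspathCheck` and `isLoadable` are hypothetical names made up for illustration, and `table_name` stands in for the generated class.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class (and therefore its superclasses)
     * can be loaded with the current thread's class loader.
     */
    public static boolean isLoadable(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        try {
            // initialize=false: only locate and load the class,
            // do not run its static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, which is what a
            // missing superclass such as SqoopRecord would typically raise.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default names are taken from this thread; "table_name" is a
        // placeholder for the class Sqoop generated for the imported table.
        String[] names = args.length > 0
                ? args
                : new String[] {"com.cloudera.sqoop.lib.SqoopRecord", "table_name"};
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "NOT FOUND"));
        }
    }
}
```

Running this with the same classpath as the job (the Sqoop jar plus the jar holding the generated class) shows whether both classes resolve. If they resolve locally but the tasks still fail with -libjars, one possible cause is that the driver does not go through ToolRunner/GenericOptionsParser, which is what parses -libjars. That is an assumption about the setup, not something the thread confirms.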
Thanks,
Cheolsoo

On Wed, Apr 11, 2012 at 11:52 AM, Cheolsoo Park <[email protected]> wrote:

> Hi Marc,
>
> It would be helpful if you could provide the complete log with the --verbose
> option on.
>
>> I believe the result in the hdfs file is a serialization of the java
>> object of a class generated automatically by sqoop, the class name is the
>> table name and extends SqoopRecord: let’s call it table_name.java.
>
> A serialization of 'table_name' is not the result. The auto-generated
> Java class is only for Sqoop to interface with the DB. The result is
> sequence files that contain data.
>
>> Now, I am trying to run a MapReduce job against this file but it is
>> failing, I added the class table_name.java in my jar. But when I run the
>> mapreduce job, I get “ClassNotFoundException:
>> com.cloudera.sqoop.lib.SqoopRecord”. Even with the option -libjars
>> sqoop-1.3.0.jar.
>
> I am not clear what MR jobs you're running here.
>
> 1) If you're importing data, I am wondering why you have to do this
> manually since it should be automatically done by Sqoop: compile table_name
> into a jar, load the jar into hdfs, pass the path to the jar to import
> mapper jobs, etc.
>
> 2) If you're running your own MR jobs on imported data, they don't need to
> know about 'table_name' or 'SqoopRecord' since data are already in sequence
> file format, so your MR jobs should be able to understand them.
>
> Hope this is helpful.
>
> Thanks,
> Cheolsoo
>
> On Wed, Apr 11, 2012 at 9:28 AM, Marc Sturm <[email protected]> wrote:
>
>> Hi,
>>
>> I am new to hadoop and sqoop. So far I was able to run a single node
>> hadoop cluster on my mac and I am trying to load data from sql server using
>> sqoop 1.3 and Microsoft’s sqoop connector.
>>
>> The data is stored as varbinary column (though it is text blob) and I am
>> loading it into hadoop with sqoop using the --as-sequencefile option.
>> I believe the result in the hdfs file is a serialization of the java object
>> of a class generated automatically by sqoop, the class name is the table
>> name and extends SqoopRecord: let’s call it table_name.java. This was done
>> successfully.
>>
>> Now, I am trying to run a MapReduce job against this file but it is
>> failing, I added the class table_name.java in my jar. But when I run the
>> mapreduce job, I get “ClassNotFoundException:
>> com.cloudera.sqoop.lib.SqoopRecord”.
>>
>> Even with the option -libjars sqoop-1.3.0.jar.
>>
>> I hope all this makes sense to you. If you can help me understand what
>> the problem is or point me to the right documentation that would be great.
>>
>> Thanks,
>>
>> Marc
>>
>> ------------------------------
>> This electronic message is intended to be for the use only of the named
>> recipient, and may contain information that is confidential or privileged.
>> If you are not the intended recipient, you are hereby notified that any
>> disclosure, copying, distribution or use of the contents of this message is
>> strictly prohibited. If you have received this message in error or are not
>> the named recipient, please notify us immediately by contacting the sender
>> at the electronic mail address noted above, and delete and destroy all
>> copies of this message. Thank you.
