Thanks for your help, Cheolsoo. I added the Sqoop jar to Hadoop's lib directory, 
and now my MR job runs fine.
Marc

From: Cheolsoo Park [mailto:[email protected]]
Sent: Wednesday, April 11, 2012 4:06 PM
To: [email protected]
Subject: Re: loading as sequencefile and running an hadoop mapreduce job

Hi Mark,

I was totally wrong about sequence files in my previous email. In fact, I 
realized that SqoopRecord is needed by MR jobs to deserialize sequence files. 
Again, I am sorry for the confusion.
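
For anyone who wants to see why the record class matters: as far as I 
understand the format, a sequence file starts with the bytes "SEQ", a version 
byte, and then the key and value class names, and the reader instantiates 
those classes by name. The sketch below only peeks at those two names in a 
local copy of a part file. It assumes the length-prefixed string layout of 
version 6 files and handles only names shorter than 128 bytes. SeqHeaderPeek, 
demoHeader and the class names in the demo are made up for illustration and 
are not part of Sqoop or Hadoop.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SeqHeaderPeek {

    // Reads one length-prefixed string. Only single-byte lengths (0..127)
    // are handled; longer names use a multi-byte prefix this sketch skips.
    static String readShortString(ByteBuffer buf) {
        int len = buf.get();
        if (len < 0) {
            throw new IllegalArgumentException("multi-byte length not handled");
        }
        byte[] bytes = new byte[len];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Returns {keyClassName, valueClassName} from the first bytes of a file.
    static String[] readClassNames(byte[] head) {
        ByteBuffer buf = ByteBuffer.wrap(head);
        if (buf.remaining() < 4
                || buf.get() != 'S' || buf.get() != 'E' || buf.get() != 'Q') {
            throw new IllegalArgumentException("not a sequence file header");
        }
        buf.get(); // format version byte, not interpreted here
        String key = readShortString(buf);
        String value = readShortString(buf);
        return new String[] {key, value};
    }

    // Builds a fake header so the parser can be tried without a real file.
    static byte[] demoHeader(String keyClass, String valueClass) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write('S');
        out.write('E');
        out.write('Q');
        out.write(6); // assumed version number
        for (String name : new String[] {keyClass, valueClass}) {
            byte[] b = name.getBytes(StandardCharsets.UTF_8);
            out.write(b.length); // single-byte length prefix
            out.write(b, 0, b.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] head;
        if (args.length > 0) {
            // args[0]: local copy of a part file from the import directory
            byte[] buf = new byte[512];
            int n;
            try (InputStream in = new FileInputStream(args[0])) {
                n = in.read(buf);
            }
            head = Arrays.copyOf(buf, Math.max(n, 0));
        } else {
            // Placeholder class names, for demonstration only
            head = demoHeader("org.apache.hadoop.io.LongWritable", "table_name");
        }
        String[] names = readClassNames(head);
        System.out.println("key class:   " + names[0]);
        System.out.println("value class: " + names[1]);
    }
}
```

If the value class printed for a real file is the generated table class, then 
any job reading that file needs both that class and its SqoopRecord parent on 
its classpath.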

Thanks,
Cheolsoo
On Wed, Apr 11, 2012 at 11:52 AM, Cheolsoo Park <[email protected]> wrote:
Hi Mark,

It would be helpful if you could provide the complete log with the --verbose 
option on.

> I believe the result in the HDFS file is a serialization of the Java object 
> of a class generated automatically by Sqoop; the class is named after the 
> table and extends SqoopRecord. Let's call it table_name.java.

A serialization of 'table_name' is not the result.  The auto-generated Java 
class is only for Sqoop to interface with the DB. The result is sequence files 
that contain data.

> Now I am trying to run a MapReduce job against this file, but it is failing. 
> I added the class table_name.java to my jar, but when I run the MapReduce 
> job I get "ClassNotFoundException: com.cloudera.sqoop.lib.SqoopRecord", even 
> with the option -libjars sqoop-1.3.0.jar.

I am not clear what MR jobs you're running here.

1) If you're importing data, I am wondering why you have to do this manually, 
since Sqoop should do it automatically: compile table_name into a jar, load 
the jar into HDFS, pass the path to the jar to the import mapper jobs, etc.

2) If you're running your own MR jobs on imported data, they don't need to know 
about 'table_name' or 'SqoopRecord', since the data are already in sequence 
file format, so your MR jobs should be able to understand them.

Hope this is helpful.

Thanks,
Cheolsoo

On Wed, Apr 11, 2012 at 9:28 AM, Marc Sturm <[email protected]> wrote:
Hi,

I am new to Hadoop and Sqoop. So far I have been able to run a single-node 
Hadoop cluster on my Mac, and I am trying to load data from SQL Server using 
Sqoop 1.3 and Microsoft's Sqoop connector.
The data is stored in a varbinary column (though it is really a text blob), and 
I am loading it into Hadoop with Sqoop using the --as-sequencefile option. I 
believe the result in the HDFS file is a serialization of the Java object of a 
class generated automatically by Sqoop; the class is named after the table and 
extends SqoopRecord. Let's call it table_name.java. The import completed 
successfully.

Now I am trying to run a MapReduce job against this file, but it is failing. I 
added the class table_name.java to my jar, but when I run the MapReduce job I 
get "ClassNotFoundException: com.cloudera.sqoop.lib.SqoopRecord", even with the 
option -libjars sqoop-1.3.0.jar.
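
A quick way to narrow down this kind of error is to ask the JVM directly 
whether it can load the classes in question. The following is a minimal 
sketch, not anything shipped with Sqoop: ClasspathCheck and isLoadable are 
hypothetical names, and table_name stands in for the generated class. It only 
tests the classpath of the JVM it runs in, so it says nothing certain about 
the task JVMs on the cluster, but it does show whether a given classpath 
contains SqoopRecord at all.

```java
final class ClasspathCheck {

    // Returns true if the named class can be loaded without initializing it.
    // LinkageError is caught as well because a class whose parent class is
    // missing fails with NoClassDefFoundError rather than
    // ClassNotFoundException.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default names are placeholders; pass real class names as arguments.
        String[] names = args.length > 0
                ? args
                : new String[] {"com.cloudera.sqoop.lib.SqoopRecord", "table_name"};
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Running it once with and once without the Sqoop jar on the classpath should 
show the difference for SqoopRecord.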

I hope all this makes sense to you. If you can help me understand what the 
problem is or point me to the right documentation that would be great.

Thanks,
Marc


________________________________
This electronic message is intended to be for the use only of the named 
recipient, and may contain information that is confidential or privileged. If 
you are not the intended recipient, you are hereby notified that any 
disclosure, copying, distribution or use of the contents of this message is 
strictly prohibited. If you have received this message in error or are not the 
named recipient, please notify us immediately by contacting the sender at the 
electronic mail address noted above, and delete and destroy all copies of this 
message. Thank you.
