Please ask CDH questions on CDH lists.

On Oct 23, 2012, at 3:17 PM, Kartashov, Andy wrote:
> Guys, tried for hours to resolve this error.
>
> I am trying to import a table to Hadoop using Sqoop.
>
> ERROR is:
> Error:
> org.hsqldb.DatabaseURL.parseURL(Ljava/lang/String;ZZ)Lorg/hsqldb/persist/HsqlProperties
>
> I realise that there is an issue with the versions of the hsqldb.jar files.
>
> At first, Sqoop was spitting out the above error until I realised that my
> /usr/lib/sqoop/lib folder had both versions, hsqldb-1.8.0.10.jar and plain
> hsqldb.jar (2.0, I suppose), and sqoop-conf was picking up the first (wrong)
> jar.
>
> When I moved hsqldb-1.8.0.10.jar away, Sqoop stopped complaining, but then
> Hadoop began spitting out the same error. No matter what I tried, I could not
> get Hadoop to pick up the right jar.
>
> I tried setting:
> export HADOOP_CLASSPATH="/usr/lib/sqoop/lib/hsqldb.jar"
> and then
> export HADOOP_USER_CLASSPATH_FIRST=true
> without luck.
>
> Please help.
>
> Thanks,
> AK
>
>
> From: Jonathan Bishop [mailto:[email protected]]
> Sent: Tuesday, October 23, 2012 2:41 PM
> To: [email protected]
> Subject: Re: zlib does not uncompress gzip during MR run
>
> Just to follow up on my own question...
>
> I believe the problem is caused by the input split during MR. So my real
> question is how to handle input splits when the input is gzipped.
>
> Is it even possible to have splits of a gzipped file?
>
> Thanks,
>
> Jon
>
> On Tue, Oct 23, 2012 at 11:10 AM, Jonathan Bishop <[email protected]> wrote:
> Hi,
>
> My input files are gzipped, and I am using the built-in Java codecs
> successfully to uncompress them in a normal Java run...
>
>     fileIn = fs.open(fsplit.getPath());
>     codec = compressionCodecs.getCodec(fsplit.getPath());
>     in = new LineReader(codec != null ? codec.createInputStream(fileIn) : fileIn, config);
>
> But when I use the same piece of code in an MR job I am getting...
>
>     12/10/23 11:02:25 INFO util.NativeCodeLoader: Loaded the native-hadoop library
>     12/10/23 11:02:25 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
>     12/10/23 11:02:25 INFO compress.CodecPool: Got brand-new compressor
>     12/10/23 11:02:26 INFO mapreduce.HFileOutputFormat: Incremental table output configured.
>     12/10/23 11:02:26 INFO input.FileInputFormat: Total input paths to process : 3
>     12/10/23 11:02:27 INFO mapred.JobClient: Running job: job_201210221549_0014
>     12/10/23 11:02:28 INFO mapred.JobClient:  map 0% reduce 0%
>     12/10/23 11:02:49 INFO mapred.JobClient: Task Id : attempt_201210221549_0014_m_000003_0, Status : FAILED
>     java.io.IOException: incorrect header check
>         at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native Method)
>         at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:221)
>         at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>         at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>         at java.io.InputStream.read(InputStream.java:101)
>
> So I am thinking that there is some incompatibility between zlib and my gzip.
> Is there a way to force Hadoop to use the Java built-in compression codecs?
>
> Also, I would like to try LZO, which I hope will allow splitting of the input
> files (I recall reading this somewhere). Can someone point me to the best way
> to do this?
>
> Thanks,
>
> Jon
--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
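
A note on the two questions in the quoted zlib/gzip thread. Plain gzip is not a
splittable compression format: a .gz file cannot be cut into multiple input
splits, so each gzipped file has to be read end to end by a single mapper.
The "incorrect header check" error is what the zlib decompressor typically
reports when it is handed bytes that do not start at a gzip header, for
instance a split that begins in the middle of a file. bzip2 is splittable out
of the box, and LZO can be made splittable by indexing the files and using the
hadoop-lzo input formats. If a custom FileInputFormat is in play here, it
needs to make the same splittability check that TextInputFormat makes,
otherwise the framework will happily split the .gz files. The sketch below is
illustrative only (the class name GzipAwareTextInputFormat is invented) and
assumes the org.apache.hadoop.mapreduce API:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;
    import org.apache.hadoop.io.compress.SplittableCompressionCodec;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    // Illustrative only: the splittability check a gzip-aware input format
    // needs. A FileInputFormat subclass that omits this will split .gz files
    // and hand mappers byte ranges that do not begin with a gzip header.
    public class GzipAwareTextInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            CompressionCodec codec =
                new CompressionCodecFactory(context.getConfiguration()).getCodec(file);
            if (codec == null) {
                return true;                  // uncompressed text: split freely
            }
            // GzipCodec is not a SplittableCompressionCodec, so each .gz file
            // becomes a single split read by a single mapper; bzip2 (and
            // indexed LZO via its own input format) can return true here.
            return codec instanceof SplittableCompressionCodec;
        }
    }

As for forcing the built-in pure-Java codecs, the usual switch is the standard
io.native.lib.available configuration property; setting it to false in the job
configuration should make GzipCodec fall back to java.util.zip instead of the
native zlib bindings, though whether that changes the error above is only a
guess.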

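On the hsqldb error at the top of the thread: a message consisting of a JVM
method descriptor like parseURL(Ljava/lang/String;ZZ)... looks like a
NoSuchMethodError, i.e. some JVM in the chain is still resolving
org.hsqldb.DatabaseURL from the old hsqldb-1.8.0.10.jar rather than the newer
hsqldb.jar that Sqoop expects. A quick way to confirm which jar a given
classpath actually serves the class from is a throwaway check like the sketch
below (the class name WhichHsqldb is invented; run it with the same classpath
the failing process uses):

    // Illustrative only: report which jar org.hsqldb.DatabaseURL is loaded from.
    public class WhichHsqldb {
        public static void main(String[] args) throws Exception {
            Class<?> c = Class.forName("org.hsqldb.DatabaseURL");
            // Prints the jar (or directory) that supplied the class.
            System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
        }
    }

If that still points at the 1.8.0.10 jar, the exported HADOOP_CLASSPATH and
HADOOP_USER_CLASSPATH_FIRST settings are presumably not reaching the JVMs that
fail.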