Please ask CDH questions on CDH lists.

On Oct 23, 2012, at 3:17 PM, Kartashov, Andy wrote:
> Guys, tried for hours to resolve this error.
>
> I am trying to import a table to Hadoop using Sqoop.
>
> ERROR is:
> Error:
> org.hsqldb.DatabaseURL.parseURL(Ljava/lang/String;ZZ)Lorg/hsqldb/persist/HsqlProperties
>
> I realise that there is an issue with the versions of the hsqldb.jar files.
>
> At first, Sqoop was spitting out the above error until I realised that my
> /usr/lib/sqoop/lib folder had both versions, hsqldb-1.8.0.10.jar and plain
> hsqldb.jar (2.0, I suppose), and sqoop-conf was picking up the first (wrong)
> jar.
>
> When I moved hsqldb-1.8.0.10.jar away, Sqoop stopped complaining, but then
> Hadoop began spitting out the same error. No matter what I tried, I could not
> get Hadoop to pick up the right jar.
>
> I tried setting:
> export HADOOP_CLASSPATH="/usr/lib/sqoop/lib/hsqldb.jar"
> and then
> export HADOOP_USER_CLASSPATH_FIRST=true
> without luck.
>
> Please help.
>
> Thanks,
> AK
>
>
> From: Jonathan Bishop [mailto:[email protected]]
> Sent: Tuesday, October 23, 2012 2:41 PM
> To: [email protected]
> Subject: Re: zlib does not uncompress gzip during MR run
>
> Just to follow up on my own question...
>
> I believe the problem is caused by the input split during MR. So my real
> question is how to handle input splits when the input is gzipped.
>
> Is it even possible to have splits of a gzipped file?
>
> Thanks,
>
> Jon
>
> On Tue, Oct 23, 2012 at 11:10 AM, Jonathan Bishop <[email protected]> wrote:
> Hi,
>
> My input files are gzipped, and I am using the built-in Java codecs
> successfully to uncompress them in a normal Java run...
>
>     fileIn = fs.open(fsplit.getPath());
>     codec = compressionCodecs.getCodec(fsplit.getPath());
>     in = new LineReader(codec != null ? codec.createInputStream(fileIn) : fileIn, config);
>
> But when I use the same piece of code in an MR job I am getting...
>
>     12/10/23 11:02:25 INFO util.NativeCodeLoader: Loaded the native-hadoop library
>     12/10/23 11:02:25 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
>     12/10/23 11:02:25 INFO compress.CodecPool: Got brand-new compressor
>     12/10/23 11:02:26 INFO mapreduce.HFileOutputFormat: Incremental table output configured.
>     12/10/23 11:02:26 INFO input.FileInputFormat: Total input paths to process : 3
>     12/10/23 11:02:27 INFO mapred.JobClient: Running job: job_201210221549_0014
>     12/10/23 11:02:28 INFO mapred.JobClient:  map 0% reduce 0%
>     12/10/23 11:02:49 INFO mapred.JobClient: Task Id : attempt_201210221549_0014_m_000003_0, Status : FAILED
>     java.io.IOException: incorrect header check
>         at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native Method)
>         at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:221)
>         at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>         at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>         at java.io.InputStream.read(InputStream.java:101)
>
> So I am thinking that there is some incompatibility between zlib and my gzip.
> Is there a way to force Hadoop to use the Java built-in compression codecs?
>
> Also, I would like to try LZO, which I hope will allow splitting of the input
> files (I recall reading this somewhere). Can someone point me to the best way
> to do this?
>
> Thanks,
>
> Jon
--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
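
A note on the two questions in the quoted zlib/gzip thread. Plain gzip is not a
splittable compression format: a .gz file cannot be cut into multiple input
splits, so each gzipped file has to be read end to end by a single mapper.
The "incorrect header check" error is what the zlib decompressor typically
reports when it is handed bytes that do not start at a gzip header, for
instance a split that begins in the middle of a file. bzip2 is splittable out
of the box, and LZO can be made splittable by indexing the files and using the
hadoop-lzo input formats. If a custom FileInputFormat is in play here, it
needs to make the same splittability check that TextInputFormat makes,
otherwise the framework will happily split the .gz files. The sketch below is
illustrative only (the class name GzipAwareTextInputFormat is invented) and
assumes the org.apache.hadoop.mapreduce API:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;
    import org.apache.hadoop.io.compress.SplittableCompressionCodec;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    // Illustrative only: the splittability check a gzip-aware input format
    // needs. A FileInputFormat subclass that omits this will split .gz files
    // and hand mappers byte ranges that do not begin with a gzip header.
    public class GzipAwareTextInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            CompressionCodec codec =
                new CompressionCodecFactory(context.getConfiguration()).getCodec(file);
            if (codec == null) {
                return true;                  // uncompressed text: split freely
            }
            // GzipCodec is not a SplittableCompressionCodec, so each .gz file
            // becomes a single split read by a single mapper; bzip2 (and
            // indexed LZO via its own input format) can return true here.
            return codec instanceof SplittableCompressionCodec;
        }
    }

As for forcing the built-in pure-Java codecs, the usual switch is the standard
io.native.lib.available configuration property; setting it to false in the job
configuration should make GzipCodec fall back to java.util.zip instead of the
native zlib bindings, though whether that changes the error above is only a
guess.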

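On the hsqldb error at the top of the thread: a message consisting of a JVM
method descriptor like parseURL(Ljava/lang/String;ZZ)... looks like a
NoSuchMethodError, i.e. some JVM in the chain is still resolving
org.hsqldb.DatabaseURL from the old hsqldb-1.8.0.10.jar rather than the newer
hsqldb.jar that Sqoop expects. A quick way to confirm which jar a given
classpath actually serves the class from is a throwaway check like the sketch
below (the class name WhichHsqldb is invented; run it with the same classpath
the failing process uses):

    // Illustrative only: report which jar org.hsqldb.DatabaseURL is loaded from.
    public class WhichHsqldb {
        public static void main(String[] args) throws Exception {
            Class<?> c = Class.forName("org.hsqldb.DatabaseURL");
            // Prints the jar (or directory) that supplied the class.
            System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
        }
    }

If that still points at the 1.8.0.10 jar, the exported HADOOP_CLASSPATH and
HADOOP_USER_CLASSPATH_FIRST settings are presumably not reaching the JVMs that
fail.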