Definitely, Raja, but it looks like the one for zip has been blocked for some time now:

https://issues.apache.org/jira/browse/MAPREDUCE-210

Regards
Bejoy KS

> Date: Sun, 30 Sep 2012 12:41:29 -0700
> Subject: Re: zip file or tar file cosumption
> From: thiruvath...@gmail.com
> To: user@hive.apache.org
> 
> we can write custom codecs
> 
> On Sun, Sep 30, 2012 at 11:47 AM, Bejoy KS <bejo...@outlook.com> wrote:
> > Yes, Manish, zip is not supported in Hadoop. You may have to use gzip
> > instead.
> >
> > Regards
> > Bejoy KS
> >
> >
> > ________________________________
> > Subject: RE: zip file or tar file cosumption
> > From: manishbh...@rocketmail.com
> > To: user@hive.apache.org
> > CC: chuck.conn...@nuance.com
> > Date: Sun, 30 Sep 2012 20:35:35 +0530
> >
> > Thanks, Bejoy. I have a zip file; does it make sense to convert it into gzip again?
> >
> > Chuck, I got what you are trying to say. So I need to process it outside
> > HDFS and bring the text file into HDFS.
> >
> >
> > On Sun, 2012-09-30 at 18:21 +0530, Bejoy KS wrote:
> >
> > Hi Manish
> >
> > Gzip works well if you have the compression codec available in
> > 'io.compression.codecs'. The Gzip codec is present by default.
> >
> > I don't think untarring would be done by MapReduce jobs, so tar files may
> > not work with Hive. You need to untar the files outside of Hadoop/Hive as a
> > prerequisite.
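A minimal sketch of that prerequisite step, assuming the tar archive is still on the local filesystem before anything is copied into HDFS. It uses Python's standard `tarfile` module; the function name `untar_flat` and the flat output layout are illustrative assumptions, not something from this thread.

```python
import tarfile
from pathlib import Path


def untar_flat(tar_path, out_dir):
    """Extract the regular files of a tar archive into out_dir.

    Directory structure inside the archive is dropped on purpose so that
    every extracted file lands directly in out_dir (hypothetical layout).
    Returns the list of written paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # "r:*" lets tarfile detect plain, gzip, bz2 or xz compressed tars.
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, devices
            # Keep only the base name, which also avoids "../" path tricks.
            target = out_dir / Path(member.name).name
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            written.append(target)
    return written
```

The extracted text files could then be copied into HDFS and loaded the usual way.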
> >
> >
> >
> > Regards
> >
> > Bejoy KS
> >
> >
> > ________________________________
> >
> > To: user@hive.apache.org; keshav.c.sav...@fisglobal.com
> > Subject: Re: zip file or tar file cosumption
> > From: manishbh...@rocketmail.com
> > Date: Sun, 30 Sep 2012 12:32:15 +0000
> >
> > What about a .gz or tar file? Does it need to be unzipped in HDFS before
> > loading into Hive? How do you resolve it?
> >
> > Sent from my BlackBerry, pls excuse typo
> >
> > ________________________________
> >
> > From: "Connell, Chuck" <chuck.conn...@nuance.com>
> > Date: Sun, 30 Sep 2012 12:24:37 +0000
> > To: user@hive.apache.org<user@hive.apache.org>; Savant,
> > Keshav<keshav.c.sav...@fisglobal.com>
> > ReplyTo: user@hive.apache.org
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> > I have seen that error when I try to overwrite an existing file.
> >
> > But, more importantly, Hive cannot understand ZIP files. There was a long
> > thread about this just a few days ago. Your table def says "stored as
> > textfile" but you are not giving it a text file.
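One way to catch this before running LOAD DATA is to look at the first bytes of the file. A rough sketch in Python; the helper name `sniff_format` is hypothetical, and it only checks the well-known zip (`PK`) and gzip (`1f 8b`) signatures, nothing more.

```python
def sniff_format(path):
    """Guess whether a file is a zip archive, a gzip stream, or something else.

    Only the leading magic bytes are inspected; this is a sanity check,
    not a full validation of the file.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    # "PK\x03\x04" starts a normal zip; "PK\x05\x06" is an empty archive.
    if head in (b"PK\x03\x04", b"PK\x05\x06"):
        return "zip"
    # Every gzip member starts with the two bytes 0x1f 0x8b.
    if head.startswith(b"\x1f\x8b"):
        return "gzip"
    return "unknown"
```

A file reported as "zip" would need to be unpacked (or recompressed as gzip) before it is placed under a TEXTFILE table location.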
> >
> > Chuck
> >
> >
> > ________________________________
> >
> > From: Manish [manishbh...@rocketmail.com]
> > Sent: Sunday, September 30, 2012 7:38 AM
> > To: Savant, Keshav
> > Cc: user@hive.apache.org
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> > I am getting the error below when loading a zip file:
> >
> > Driver returned: 9.  Errors: Hive history
> > file=/tmp/hue/hive_job_log_hue_201209300434_1768401171.txt
> > Loading data to table default.pageview_zip
> > Failed with exception Error moving:
> > hdfs://localhost:54310/user/manish/input/zip/11sep12.zip into:
> > /user/manish/input/zip
> > FAILED: Execution Error, return code 1 from
> > org.apache.hadoop.hive.ql.exec.MoveTask
> >
> > My load statement is: LOAD DATA INPATH '/user/manish/input/11sep12.zip'
> > OVERWRITE INTO TABLE `pageview_zip`
> >
> > Table definition:
> > CREATE external TABLE pageview_zip
> > (
> > C_0 STRING,
> > C_1 STRING,
> > C_7 MAP<STRING,STRING>,
> > C_8 STRING,
> > C_13 MAP<STRING,STRING>,
> > C_21 STRING
> > )
> > COMMENT 'Page View'
> > ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' COLLECTION ITEMS TERMINATED BY
> > ';' MAP KEYS TERMINATED BY '='
> > STORED AS TEXTFILE LOCATION '/user/manish/input/zip'
> >
> > Thank You,
> > Manish
> >
> >
> >
> > On Thu, 2012-09-27 at 11:11 +0000, Savant, Keshav wrote:
> >
> > True Manish.
> >
> >
> >
> > Keshav C Savant
> >
> >
> >
> >
> > From: Manish.Bhoge [mailto:manish.bh...@target.com]
> > Sent: Thursday, September 27, 2012 4:26 PM
> > To: user@hive.apache.org; manishbh...@rocketmail.com
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> > Thanks, Savant. I believe this will hold good for .zip files also.
> >
> >
> >
> > Thank You,
> >
> > Manish.
> >
> >
> >
> > From: Savant, Keshav [mailto:keshav.c.sav...@fisglobal.com]
> > Sent: Thursday, September 27, 2012 10:19 AM
> > To: user@hive.apache.org; manishbh...@rocketmail.com
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> > Manish, the table created for zipped text files should be
> > defined as a sequence file, for example:
> >
> >
> >
> > CREATE TABLE my_table_zip(col1 STRING,col2 STRING) ROW FORMAT DELIMITED
> > FIELDS TERMINATED BY ',' stored as sequencefile;
> >
> >
> >
> > After this you can use the regular load command to load these files, for example:
> >
> >
> >
> > load data local inpath 'path-to-csv-file.gz' into table my_table_zip;
> >
> >
> >
> > hope this helps
> >
> >
> >
> > Keshav C Savant
> >
> >
> >
> >
> > From: Manish Bhoge [mailto:manishbh...@rocketmail.com]
> > Sent: Wednesday, September 26, 2012 9:43 PM
> > To: user@hive.apache.org
> > Subject: Re: zip file or tar file cosumption
> >
> >
> >
> >
> > Hi Richin,
> >
> > Thanks! Yes, this is what I wanted to understand: how to load a zip file into a Hive
> > table. Now I'll try this option.
> >
> > Thank You,
> > Manish.
> >
> > Sent from my BlackBerry, pls excuse typo
> >
> >
> > ________________________________
> >
> > From: <richin.j...@nokia.com>
> > Date: Wed, 26 Sep 2012 14:51:39 +0000
> > To: <user@hive.apache.org>
> > ReplyTo: user@hive.apache.org
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> >
> > You are right Chuck. I thought his question was how to use zip files or any
> > compressed files in Hive tables.
> >
> >
> >
> > Yeah, seems like you can’t do that; see:
> > http://mail-archives.apache.org/mod_mbox/hive-user/201203.mbox/%3CCAENxBwxkF--3PzCkpz1HX21=gb9yvasr2jl0u3yul2tfgu0...@mail.gmail.com%3E
> >
> > But you can always compress your files in gzip format and they should be
> > good to go.
> >
> >
> >
> > Richin
> >
> >
> >
> > From: ext Connell, Chuck [mailto:chuck.conn...@nuance.com]
> > Sent: Wednesday, September 26, 2012 10:44 AM
> > To: user@hive.apache.org
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> > But TEXTFILE in Hive always has newline as the record delimiter. How could
> > this possibly work with a zip/tar file that can contain ASCII 10 characters
> > at random locations, and certainly does not have ASCII 10 at the end of each
> > data record?
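A small illustration of this point, as a sketch rather than a description of Hive's actual reader: build a tiny zip in memory with Python's standard `zipfile` module and compare what a naive newline split of the raw archive bytes yields against the real records. The three sample rows and the member name `rows.txt` are made up for the example.

```python
import io
import zipfile

# Three made-up records, one per line, as a TEXTFILE table would expect them.
records = [b"a,1", b"b,2", b"c,3"]
text = b"\n".join(records) + b"\n"

# Pack the text into a zip archive held in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("rows.txt", text)
raw = buf.getvalue()

# What a newline-delimited reader would treat as "rows" if handed the archive:
# chunks of zip headers and compressed bytes, cut wherever 0x0A happens to occur.
naive_rows = raw.split(b"\n")

# What the rows actually are once the archive is properly unpacked.
real_rows = zipfile.ZipFile(io.BytesIO(raw)).read("rows.txt").splitlines()

print(raw[:4])                  # zip local-file-header signature
print(naive_rows == real_rows)  # the naive split does not recover the records
```

The archive starts with the `PK` header rather than the first record, so splitting on ASCII 10 cannot line up with the original rows.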
> >
> >
> >
> > Chuck Connell
> >
> > Nuance R&D Data Team
> >
> > Burlington, MA
> >
> >
> >
> >
> >
> >
> > From:richin.j...@nokia.com [mailto:richin.j...@nokia.com]
> > Sent: Wednesday, September 26, 2012 10:14 AM
> > To: user@hive.apache.org; manishbh...@rocketmail.com
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> > Hi Manish,
> >
> >
> >
> > If you have your zip file at location -  /home/manish/zipfile, you can just
> > point your external table to that location like
> >
> > CREATE EXTERNAL TABLE manish_test (field1 string, field2 string) ROW FORMAT
> > DELIMITED FIELDS TERMINATED BY <your_column_delimiter> STORED AS TEXTFILE
> > LOCATION '/home/manish/zipfile';
> >
> >
> >
> > OR
> >
> >
> >
> > If you already have external table pointing to a certain location you can
> > load this zip file into your table as
> >
> > LOAD DATA INPATH '/home/manish/zipfile' INTO TABLE manish_test;
> >
> >
> >
> > Hope this helps.
> >
> >
> >
> > Richin
> >
> >
> >
> > From: ext Manish Bhoge [mailto:manishbh...@rocketmail.com]
> > Sent: Wednesday, September 26, 2012 9:13 AM
> > To: user@hive.apache.org
> > Subject: Re: zip file or tar file cosumption
> >
> >
> >
> >
> > Hi Savant,
> >
> > Got it. But I still need to understand how to load a zip file. Can I directly
> > use a zip file in an external table? Can you please help me with the load statement?
> >
> > Sent from my BlackBerry, pls excuse typo
> >
> >
> > ________________________________
> >
> > From: "Savant, Keshav" <keshav.c.sav...@fisglobal.com>
> > Date: Wed, 26 Sep 2012 12:25:38 +0000
> > To: user@hive.apache.org<user@hive.apache.org>
> > ReplyTo: user@hive.apache.org
> > Cc: manish.bh...@target.com<manish.bh...@target.com>;
> > chuck.conn...@nuance.com<chuck.conn...@nuance.com>
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> >
> > Another solution would be
> >
> >
> >
> > Using a shell script, do the following:
> >
> > 1.      unzip txt files,
> >
> > 2.      one by one merge those 50 (or N number of) text files into one text
> > file,
> >
> > 3.      then zip/tar that bigger text file,
> >
> > 4.      then that big zip/tar file can be uploaded into hive.
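A sketch of steps 1 to 3 in Python instead of shell, with one deliberate change: the merged file is written as gzip rather than zip/tar, since other replies in this thread point out that zip is not supported and gzip is. The function name `zip_to_single_gzip` and the `.txt` filter are assumptions made for illustration.

```python
import gzip
import zipfile


def zip_to_single_gzip(zip_path, gz_path, suffix=".txt"):
    """Concatenate every member of zip_path ending in `suffix` into one gzip file.

    Members are merged in sorted name order. Returns the number of members merged.
    """
    merged = 0
    with zipfile.ZipFile(zip_path) as zf, gzip.open(gz_path, "wb") as out:
        for name in sorted(zf.namelist()):
            if not name.endswith(suffix):
                continue  # ignore anything that is not one of the text files
            data = zf.read(name)
            out.write(data)
            # Make sure the last record of one file does not fuse with
            # the first record of the next one.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
            merged += 1
    return merged
```

For example, `zip_to_single_gzip("11sep12.zip", "11sep12.txt.gz")` would leave a single gzip-compressed text file that can be uploaded in step 4. Note that this reads each member fully into memory, which is fine for a sketch but may need streaming for very large files.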
> >
> >
> >
> > Keshav C Savant
> >
> >
> >
> >
> > From: Connell, Chuck [mailto:chuck.conn...@nuance.com]
> > Sent: Wednesday, September 26, 2012 4:04 PM
> > To: user@hive.apache.org
> > Subject: RE: zip file or tar file cosumption
> >
> >
> >
> >
> > This could be a problem. Hive uses newline as the record separator. A ZIP
> > file will certainly contain newline characters. So I doubt this is possible.
> >
> > BUT, I would like to hear from anyone who has solved the "newline is always
> > a record separator" problem, because we ran into it for another type of
> > compressed file.
> >
> > Chuck
> >
> > ________________________________
> >
> > From: Manish.Bhoge [manish.bh...@target.com]
> > Sent: Wednesday, September 26, 2012 3:17 AM
> > To: user@hive.apache.org
> > Subject: zip file or tar file cosumption
> >
> >
> > Hivers,
> >
> >
> >
> > I want to understand whether it would be possible to use zip/tar files
> > directly in Hive. All the files have a similar schema (structure). Say 50
> > *.txt files are zipped into a single zip file: can we load data directly from
> > this zip file, or do we need to unzip it first?
> >
> >
> >
> > Thanks & Regards
> >
> > Manish Bhoge | Technical Architect | Target DW/BI | +919379850010 (M) Ext:
> > 5691 VOIP: 22165 | “Excellence is not a skill, It is an attitude.”
> >
> >
> >
> >
> > _____________
> > The information contained in this message is proprietary and/or
> > confidential. If you are not the intended recipient, please: (i) delete the
> > message and all copies; (ii) do not disclose, distribute or use the message
> > in any manner; and (iii) notify the sender immediately. In addition, please
> > be aware that any message addressed to our domain is subject to archiving
> > and review by persons other than the intended recipient. Thank you.
> >
> >
> >
> >
> >
> >
> >
> 
> 
> 
> -- 
> 
> Raja Thiruvathuru
                                          
