Re: [jira] [Resolved] (IMPALA-6829) how to get compressed hdfs file using impala or hive

Tim Armstrong Wed, 11 Apr 2018 10:36:12 -0700

Hi,
  If I understood correctly, the query is behaving as expected but you're
wondering how it works, right?


Impala detects the compression type based on the file suffix. We mention
this in the docs in the "Using gzip, bzip2, or Snappy-Compressed Text
Files" section:
https://impala.apache.org/docs/build/html/topics/impala_txtfile.html
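
As a rough illustration of the suffix rule (a sketch only, not Impala's
actual code), the Python below maps a file name to a codec by its last
suffix and writes gzip and bzip2 copies of a small delimited file. The
names detect_codec, write_compressed_copies and rows.csv are hypothetical,
and the suffix table covers only the three formats named in that docs
section. Snappy is left out of the writer because the Python standard
library has no Snappy codec.

```python
import bz2
import gzip
import os
import tempfile

# Illustrative assumption: only the three suffixes named in the docs section
# title are mapped here. This is not Impala's internal lookup table.
SUFFIX_TO_CODEC = {".gz": "gzip", ".bz2": "bzip2", ".snappy": "snappy"}


def detect_codec(path):
    """Guess the codec from the last file suffix; default to uncompressed."""
    _, suffix = os.path.splitext(path)
    return SUFFIX_TO_CODEC.get(suffix.lower(), "uncompressed")


def write_compressed_copies(rows, out_dir, base_name="rows.csv"):
    """Write the same comma-delimited rows as <base_name>.gz and <base_name>.bz2."""
    text = "".join(",".join(row) + "\n" for row in rows)
    paths = []
    for suffix, opener in ((".gz", gzip.open), (".bz2", bz2.open)):
        path = os.path.join(out_dir, base_name + suffix)
        # "wt" opens the compressed stream in text mode.
        with opener(path, "wt", encoding="utf-8") as handle:
            handle.write(text)
        paths.append(path)
    return paths


if __name__ == "__main__":
    rows = [("one", "two", "three"), ("abc", "xyz", "123")]
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_compressed_copies(rows, tmp):
            print(os.path.basename(path), "->", detect_codec(path))
```

Files produced this way would then be copied into the table directory
(for example with hdfs dfs -put) and followed by a REFRESH of the table in
Impala. The file written by the INSERT in your example has no compression
suffix, which is why it is read as plain text.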

- Tim

On Mon, Apr 9, 2018 at 7:30 PM, Sathishkumar Paramasivam <
[email protected]> wrote:

>
>
> Please help
>
> ---------- Forwarded message ---------
> From: Tim Armstrong (JIRA) <[email protected]>
> Date: Mon, Apr 9, 2018 at 7:18 PM
> Subject: [jira] [Resolved] (IMPALA-6829) how to get compressed hdfs file
> using impala or hive
> To: <[email protected]>
>
>
>
>      [ https://issues.apache.org/jira/browse/IMPALA-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Tim Armstrong resolved IMPALA-6829.
> -----------------------------------
>     Resolution: Not A Bug
>
> We're happy to help you out with learning Impala, but it would be best to
> have the discussion on the user list: [email protected]
>
> We mainly use JIRA for tracking changes we want to make to Impala, so
> discussions with users tend to get lost here.
>
> > how to get compressed hdfs file using impala or hive
> > ----------------------------------------------------
> >
> >                 Key: IMPALA-6829
> >                 URL: https://issues.apache.org/jira/browse/IMPALA-6829
> >             Project: IMPALA
> >          Issue Type: Question
> >            Reporter: sathishkumar paramasivam
> >            Priority: Major
> >
> > Hi,
> >
> > I am teaching myself Impala and trying to enable compression for a table,
> > but the HDFS files it writes do not get a compression extension.
> > I am referring to
> > [https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_txtfile.html]
> > but I am not sure how the final compressed files are created.
> > When I use Sqoop, I do get a compressed file. Please guide me.
> > create table csv_compressed (a string, b string, c string)
> >   row format delimited fields terminated by ",";
> > insert into csv_compressed values
> >   ('one - uncompressed', 'two - uncompressed', 'three - uncompressed'),
> >   ('abc - uncompressed', 'xyz - uncompressed', '123 - uncompressed');
> > ...make equivalent .gz, .bz2, and .snappy files and load them into same
> > table directory...
> > select * from csv_compressed;
> > +--------------------+--------------------+----------------------+
> > | a                  | b                  | c                    |
> > +--------------------+--------------------+----------------------+
> > | one - snappy       | two - snappy       | three - snappy       |
> > | one - uncompressed | two - uncompressed | three - uncompressed |
> > | abc - uncompressed | xyz - uncompressed | 123 - uncompressed   |
> > | one - bz2          | two - bz2          | three - bz2          |
> > | abc - bz2          | xyz - bz2          | 123 - bz2            |
> > | one - gzip         | two - gzip         | three - gzip         |
> > | abc - gzip         | xyz - gzip         | 123 - gzip           |
> > +--------------------+--------------------+----------------------+
> > $ hdfs dfs -ls 'hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/';
> > ...truncated for readability...
> > 75 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed.snappy
> > 79 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_bz2.csv.bz2
> > 80 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_gzip.csv.gz
> > 116 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/dd414df64d67d49b_data.0.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
>
