Please help.

---------- Forwarded message ---------
From: Tim Armstrong (JIRA) <j...@apache.org>
Date: Mon, Apr 9, 2018 at 7:18 PM
Subject: [jira] [Resolved] (IMPALA-6829) how to get compressed hdfs file
using impala or hive
To: <kumar.sathish...@gmail.com>



[ https://issues.apache.org/jira/browse/IMPALA-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-6829.
-----------------------------------
    Resolution: Not A Bug

We're happy to help you out with learning Impala, but it would be best to
have the discussion on the user list: user@impala.apache.org

We mainly use JIRA for tracking changes we want to make to Impala, so
discussions with users tend to get lost here.

> how to get compressed hdfs file using impala or hive
> ----------------------------------------------------
>
>                 Key: IMPALA-6829
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6829
>             Project: IMPALA
>          Issue Type: Question
>            Reporter: sathishkumar paramasivam
>            Priority: Major
>
> Hi,
>
> I am teaching myself Impala and trying to enable compression for a table,
> but the HDFS files it writes do not get a compressed-file extension.
> I am referring to
> [ https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_txtfile.html ]
> but I am not sure how the final compressed files are created.
> When I use Sqoop, I do get a compressed file. Please guide me.
> create table csv_compressed (a string, b string, c string)
>   row format delimited fields terminated by ",";
> insert into csv_compressed values
>   ('one - uncompressed', 'two - uncompressed', 'three - uncompressed'),
>   ('abc - uncompressed', 'xyz - uncompressed', '123 - uncompressed');
> ...make equivalent .gz, .bz2, and .snappy files and load them into same table directory...
> select * from csv_compressed;
> +--------------------+--------------------+----------------------+
> | a                  | b                  | c                    |
> +--------------------+--------------------+----------------------+
> | one - snappy       | two - snappy       | three - snappy       |
> | one - uncompressed | two - uncompressed | three - uncompressed |
> | abc - uncompressed | xyz - uncompressed | 123 - uncompressed   |
> | one - bz2          | two - bz2          | three - bz2          |
> | abc - bz2          | xyz - bz2          | 123 - bz2            |
> | one - gzip         | two - gzip         | three - gzip         |
> | abc - gzip         | xyz - gzip         | 123 - gzip           |
> +--------------------+--------------------+----------------------+
> $ hdfs dfs -ls 'hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/';
> ...truncated for readability...
> 75 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed.snappy
> 79 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_bz2.csv.bz2
> 80 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_gzip.csv.gz
> 116 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/dd414df64d67d49b_data.0.
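
The "...make equivalent .gz, .bz2, and .snappy files" step in the quoted example is done outside Impala: the compressed files are produced with ordinary tools and then copied into the table directory. A minimal sketch of that step for gzip and bzip2 might look like the following. The scratch directory, the row contents, and the UPLOAD_TO_HDFS switch are assumptions made for illustration; the table path is the one from the listing above and will differ on another cluster.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory; the table path is copied from the listing
# above and is only an example location.
workdir=$(mktemp -d)
table_dir='hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/'

# Two comma-delimited rows matching the table's three string columns.
printf '%s\n' \
  'one - gzip,two - gzip,three - gzip' \
  'abc - gzip,xyz - gzip,123 - gzip' > "$workdir/csv_compressed_gzip.csv"

# gzip -c writes the compressed stream to stdout and keeps the original file.
gzip -c "$workdir/csv_compressed_gzip.csv" > "$workdir/csv_compressed_gzip.csv.gz"

# Same idea for bzip2, skipped when the tool is not installed.
if command -v bzip2 >/dev/null 2>&1; then
  printf '%s\n' \
    'one - bz2,two - bz2,three - bz2' \
    'abc - bz2,xyz - bz2,123 - bz2' > "$workdir/csv_compressed_bz2.csv"
  bzip2 -c "$workdir/csv_compressed_bz2.csv" > "$workdir/csv_compressed_bz2.csv.bz2"
fi

# Copy into the table directory only when explicitly requested
# (UPLOAD_TO_HDFS is a made-up switch for this sketch, not a Hadoop setting).
if [ "${UPLOAD_TO_HDFS:-0}" = 1 ]; then
  hdfs dfs -put "$workdir"/*.gz "$workdir"/*.bz2 "$table_dir"
fi
```

After files are added to the directory this way, running "refresh csv_compressed;" in impala-shell should make Impala pick them up before the next select.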



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
