Sahil Takiar created IMPALA-8549:
------------------------------------
Summary: Add support for scanning DEFLATE text files
Key: IMPALA-8549
URL: https://issues.apache.org/jira/browse/IMPALA-8549
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Sahil Takiar
Several Hadoop tools (e.g. Hive and MapReduce) support reading and writing
text files compressed with zlib / deflate, which produces files such as
{{000000_0.deflate}}. Impala currently cannot read {{.deflate}} text files
and returns errors such as: {{ERROR: Scanner plugin 'DEFLATE' is not one
of the enabled plugins: 'LZO'}}.
Moreover, the default compression codec in Hadoop is zlib / deflate (see
{{org.apache.hadoop.io.compress.DefaultCodec}}). So if users set
{{hive.exec.compress.output}} to true and write to a text table in Hive,
{{.deflate}} files are written by default.
Impala does, however, support zlib / deflate with other file formats: Avro,
RCFiles, and SequenceFiles (see
https://impala.apache.org/docs/build/html/topics/impala_file_formats.html).
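
To illustrate what scanning such a file involves, here is a minimal Python sketch, not Impala's implementation. It assumes that a {{.deflate}} text file is a single zlib-wrapped (RFC 1950) stream of newline-delimited UTF-8 rows; the function names {{scan_deflate_text}} and {{_decompressed_chunks}} are hypothetical and exist only for this example.

```python
import zlib
from typing import Iterator


def _decompressed_chunks(path: str, chunk_size: int) -> Iterator[bytes]:
    """Stream-decompress a file, yielding raw decompressed byte chunks."""
    # Assumption: the file carries a zlib header, which is what the
    # default wbits setting of decompressobj() expects.
    decompressor = zlib.decompressobj()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield decompressor.decompress(chunk)
    # Emit anything still buffered inside the decompressor.
    yield decompressor.flush()


def scan_deflate_text(path: str, chunk_size: int = 64 * 1024) -> Iterator[str]:
    """Yield newline-delimited rows without loading the whole file."""
    pending = b""  # bytes of a row that is not yet terminated
    for data in _decompressed_chunks(path, chunk_size):
        # Everything before the last newline is complete rows; the
        # remainder is carried over to the next chunk.
        *rows, pending = (pending + data).split(b"\n")
        for row in rows:
            yield row.decode("utf-8")
    if pending:
        # Final row with no trailing newline.
        yield pending.decode("utf-8")
```

The sketch decompresses in fixed-size chunks and carries partial rows across chunk boundaries, which is the part a text scanner has to get right. It does not handle files made of several concatenated streams.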