Hi,

Here is something that has been confusing me lately: I have not installed or 
built Snappy on my Hadoop cluster, yet while testing and comparing the 
compression ratios of the Parquet and ORC storage formats, I was able to set 
the compression codec for both formats, for example with "TBLPROPERTIES 
("orc.compress"="Snappy");" or "set parquet.compression=snappy;". Both 
commands worked. However, when I tried to compress the textfile format with 
Snappy, I got an error saying "can not find or access the snappy 
library".
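
A minimal sketch of the three cases, assuming placeholder table names 
(t_src, t_orc, t_parquet, t_text) and assuming the textfile case was 
configured through the usual Hive/Hadoop output-compression properties (the 
exact textfile settings are an assumption, only the error message is known):

```sql
-- ORC: codec set as a table property (this worked)
CREATE TABLE t_orc STORED AS ORC
TBLPROPERTIES ("orc.compress"="Snappy")
AS SELECT * FROM t_src;

-- Parquet: codec set for the session (this worked)
SET parquet.compression=snappy;
CREATE TABLE t_parquet STORED AS PARQUET
AS SELECT * FROM t_src;

-- Textfile (assumed settings): this is the case that reports
-- that the snappy library cannot be found
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
CREATE TABLE t_text STORED AS TEXTFILE
AS SELECT * FROM t_src;
```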


I wonder why this happens, and I am not sure whether the ORC and Parquet 
files are really being compressed with Snappy. Still, the stored data does 
become smaller, and its size differs from what I get with "gzip" or "zlib" 
compression.


Looking forward to your reply and help.


Best,
Zhefu Peng
