No, Parquet and ORC have internal compression, which must be used rather than the external compression that you are referring to.
Internal compression can be decompressed in parallel, which is significantly faster. Internally, Parquet supports only snappy, gzip, lzo, brotli (2.4), lz4 (2.4), and zstd (2.4).

> On 22. Aug 2018, at 07:33, Tanvi Thacker <tanvithack...@gmail.com> wrote:
>
> Hi Patrick,
>
> What are other formats supported?
> - As far as I know, you can set any compression with any format (ORC, Text
> with snappy, gzip, etc.). Are you looking for any specific format or
> compression?
>
> How can I verify a file is compressed and with what algorithm?
> - You may check whether parquet-tools provides any meta information about
> compression.
>
> And, on another note, if you already have uncompressed data and you
> are creating a table with snappy compression, you need to use "CREATE TABLE
> new_compressed_table AS SELECT * FROM un_compressed_table" in order to
> actually compress the data.
>
> Regards,
> Tanvi Thacker
>
>> On Fri, Aug 10, 2018 at 6:30 AM Patrick Duin <patd...@gmail.com> wrote:
>> Hi,
>>
>> I have some Hive tables in Parquet format and I am trying to find out how
>> best to enable compression.
>>
>> I have done a bit of searching and the information is a bit scattered, but
>> I found I can use this Hive property to enable compression. It needs to be
>> set before doing an insert.
>>
>> set parquet.compression=SNAPPY;
>>
>> What other formats are supported?
>> How can I verify a file is compressed and with what algorithm?
>>
>> Thanks,
>> Patrick
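
For reference, the approach discussed in this thread (set the Parquet codec, then rewrite the existing data into a new table) might look roughly like the HiveQL below. This is only a sketch: the table names are placeholders taken from the thread, and setting the codec both as a session property and as a table property is an assumption about what is convenient, not something the thread prescribes.

```sql
-- Sketch only: table names are placeholders from the thread.

-- Session-level setting from the original question; it has to be set
-- before the insert that writes the data.
SET parquet.compression=SNAPPY;

-- Rewrite the existing uncompressed data into a new Parquet table so that
-- the files are actually written with the chosen codec.
-- The table property is an assumed alternative to the session setting.
CREATE TABLE new_compressed_table
STORED AS PARQUET
TBLPROPERTIES ('parquet.compression'='SNAPPY')
AS
SELECT * FROM un_compressed_table;
```

To check the result, running `parquet-tools meta` against one of the new table's files should show the codec in the column chunk metadata, although the exact output depends on the parquet-tools version.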