Hi all, I am planning to compress my Hive tables with LZO and I have a few questions:

1) Is there a point in compressing both SequenceFile and TextFile formats?

2) Before an INSERT command I set the following variables:

    SET hive.exec.compress.output=true;
    SET io.seqfile.compression.type=BLOCK;
    SET mapred.output.compress=true;
    SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

Is that enough to ensure LZO compression, or do I need to specify something else when I create my tables?

3) Do I need to run the LZOIndexer externally, or will Hive do it for me automatically?

4) Do I need to set up anything else to make sure that Hive will use the LZO index to split my tables for read operations?

Thanks for your help.

Cheers,
Michael