Actually, compressed sequence files may not work with Pig or Hive then, right?

On Apr 9, 2013, at 9:50 AM, Mark <[email protected]> wrote:

> Forgetting Impala, what format would be best to use with daily logs?
>
> Block-compressed sequence files?
>
> On Apr 8, 2013, at 8:12 PM, Harsh J <[email protected]> wrote:
>
>> Hey Mark,
>>
>> Gzip codec creates extension .gz, not .deflate (which is
>> DeflateCodec). You may want to re-check your settings.
>>
>> Impala questions are best resolved at its current user and developer
>> community at
>> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user.
>> Impala does currently support LZO (and also Indexed LZO) compressed
>> text files however, so you may want to try that as it's splittable
>> (compared to Gzip ones).
>>
>> On Tue, Apr 9, 2013 at 5:18 AM, Mark <[email protected]> wrote:
>>> Trying to determine the best format to use for storing daily logs. We
>>> recently switched from snappy (.snappy) to gzip (.deflate) but I'm wondering
>>> if there is something better? Our main clients for these daily logs are Pig
>>> and Hive using an external table. We were thinking about testing out Impala
>>> but we see that it doesn't work with compressed text files. Any suggestions?
>>>
>>> Thanks
>>
>> --
>> Harsh J
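
For reference, the extension-to-codec mapping discussed in this thread can be sketched as a small lookup. This is only an illustrative Python sketch, not part of Hadoop: `CODECS` and `describe` are hypothetical names, the class names are the stock Hadoop codecs plus the separately installed hadoop-lzo package, and the splittable flag applies to plain compressed text files only (block-compressed sequence files stay splittable whichever codec is used).

```python
import os

# Default output extension -> (codec class, splittable as a plain text file).
# Assumption: stock Hadoop codecs; the .lzo entry comes from hadoop-lzo.
CODECS = {
    ".gz": ("org.apache.hadoop.io.compress.GzipCodec", False),
    ".deflate": ("org.apache.hadoop.io.compress.DefaultCodec", False),
    ".bz2": ("org.apache.hadoop.io.compress.BZip2Codec", True),
    ".snappy": ("org.apache.hadoop.io.compress.SnappyCodec", False),
    # LZO text becomes splittable only after an index file has been built,
    # so it is marked False here for the unindexed case.
    ".lzo": ("com.hadoop.compression.lzo.LzopCodec", False),
}


def describe(path):
    """Return (codec class or None, splittable) for a text file path.

    Files with an unrecognised extension are treated as uncompressed text,
    which is splittable.
    """
    ext = os.path.splitext(path)[1]
    return CODECS.get(ext, (None, True))
```

So output files ending in `.deflate` point to the default (deflate) codec being configured rather than `GzipCodec`, which is what the settings re-check above is about.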
