Forgetting Impala, what format would be best to use with daily logs? Block-compressed sequence files?
On Apr 8, 2013, at 8:12 PM, Harsh J wrote:

> Hey Mark,
>
> Gzip codec creates extension .gz, not .deflate (which is
> DeflateCodec). You may want to re-check your settings.
>
> Impala questions are best resolved at its current user and developer
> community at
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user.
> Impala does currently support LZO (and also Indexed LZO) compressed
> text files however, so you may want to try that as it's splittable
> (compared to Gzip ones).
>
> On Tue, Apr 9, 2013 at 5:18 AM, Mark wrote:
>> Trying to determine what the best format to use for storing daily logs. We
>> recently switched from snappy (.snappy) to gzip (.deflate) but I'm wondering
>> if there is something better? Our main clients for these daily logs are pig
>> and hive using an external table. We were thinking about testing out impala
>> but we see that it doesn't work with compressed text files. Any suggestions?
>>
>> Thanks
>
> --
> Harsh J
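For reference, the extension point from the thread can be checked with a small lookup. This is only a sketch, assuming Hadoop's stock codecs plus the separately installed hadoop-lzo package. The table and the `describe` helper are hypothetical names made up for illustration, not part of any Hadoop API, and the splittability column applies to plain compressed text files, not to container formats such as sequence files.

```python
import os

# Default output extensions of common Hadoop codecs (assumed stock settings),
# and whether a plain text file compressed this way can be split across tasks.
CODECS = {
    ".gz": ("GzipCodec", "no"),
    ".deflate": ("DefaultCodec / DeflateCodec", "no"),
    ".snappy": ("SnappyCodec", "no"),
    ".bz2": ("BZip2Codec", "yes"),
    ".lzo": ("LzopCodec (hadoop-lzo)", "only with an index"),
}

def describe(path):
    """Return (codec, splittable) guessed from the file extension alone."""
    ext = os.path.splitext(path)[1].lower()
    return CODECS.get(ext, ("unknown", "unknown"))

if __name__ == "__main__":
    # Hypothetical file names, just to show the lookup.
    for name in ("part-00000.deflate", "part-00000.gz", "part-00000.lzo"):
        print(name, describe(name))
```

A .deflate suffix on the output therefore suggests the job is configured with the default (zlib) codec, not GzipCodec, which is worth confirming in the job's compression settings.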
