Johannes Rußek
Wed, 17 Mar 2010 09:21:30 -0700
Hello Dmitriy!

Sure, can do. I would love to give it a test run, though :) No hurry.

Thanks and regards,
Johannes

On 17.03.2010 16:34, Dmitriy Ryaboy wrote:
Johannes,

If you can wait a week or two, we (Twitter) are about to open-source all of our LZO+Protobuf+Pig stuff. Just documentation left to do :-).

-Dmitriy

On Wed, Mar 17, 2010 at 8:14 AM, Johannes Rußek <johannes.rus...@io-consulting.net> wrote:

Hello everybody,

I'm trying to use Pig with compressed input files. I have a bunch of Apache log files, 1-2 GB each, which bzip2 compresses down to 30-40 MB. I tried to simply load the .bz2 file, but it only "kind of" worked: it seems that Pig loaded only a fraction of the file and processed that. With the uncompressed file I ended up with ~3500 lines of output, but with the .bz2 input file I got only ten. Does this make any sense to you?

I've also tried using .lzo files, but Pig wouldn't read them at all, so I figure I have to install some LZO classes for that. Any hints on where I can find them and how to integrate them?

Thanks and best regards,
Johannes