Hi,

The comparison is between plain text files and Lzo with Protobuf. I am
using LzoIndexer to index the files so that splits can be calculated. The
intermediate data (the map outputs) is not compressed.

What I am trying to do is run simple select queries in Pig scripts against
both the plain text data and the Lzo data with Protobuf, and then decide,
based on the results, which format to use in the project.

I have tried the following comparisons:
  - Plain text files vs. Lzo + Protobuf (with and without compression of
    the final output)
  - Plain text files vs. Lzo-compressed plain text, using LzoTokenizedLoader

In all cases, the plain text version performs better than the others.

Am I missing something about how Lzo should be used?

thanks and regards,
Vijaya Bhaskar Peddinti

On Sun, Dec 11, 2011 at 12:52 PM, Prashant Kommireddi
<[email protected]> wrote:

> Vijay, it really depends on what you are doing with LZO. Is it being
> used for creating splits, map output compression, or intermediate files?
> Also, what are you comparing this to? Simple text files, gzip/bzip
> compressed files?
>
> Sent from my iPhone
>
> On Dec 10, 2011, at 11:12 PM, vijaya bhaskar peddinti
> <[email protected]> wrote:
>
> > Dear All,
> >
> > I am doing a PoC on Lzo compression with Protobuf using Elephant Bird and
> > Pig 0.8.0. I am running this PoC on a cluster of 10 nodes. I have also
> > indexed the Lzo file. I have noticed that there is no performance
> > improvement compared with uncompressed data. Is Lzo supported in Pig?
> >
> > The data size is 1.5GB for the PoC. The Pig script is a select-style
> > query that reads and writes data using the Lzo*ProtoBuf loader and
> > storage methods.
> >
> > Please provide any suggestions and pointers in this regard.
> >
> >
> > thanks and regards,
> > Vijaya Bhaskar Peddinti
>
