Re: Yahoo Streaming Benchmark

Sandesh Hegde Thu, 28 Jan 2016 10:40:57 -0800

Thanks Ashwin.

I have changed the app to use a POJO instead of a map.
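
For anyone following along, here is a rough sketch of the difference between
the two tuple styles. It is not the benchmark code: the AdEvent class, its
field names, and the "view" check are assumptions made up for illustration.
It only shows that the map version boxes the long and does a hash lookup on
each read, while the POJO version is a plain field access.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical event type; the field names are illustrative assumptions.
class AdEvent {
    final String adId;
    final String eventType;
    final long eventTime;

    AdEvent(String adId, String eventType, long eventTime) {
        this.adId = adId;
        this.eventType = eventType;
        this.eventTime = eventTime;
    }
}

class PojoVsMapSketch {
    // Map-based tuple: values are stored as Object, so the long is boxed
    // and every read is a hash lookup.
    static Map<String, Object> asMap(String adId, String eventType, long eventTime) {
        Map<String, Object> tuple = new HashMap<>();
        tuple.put("adId", adId);
        tuple.put("eventType", eventType);
        tuple.put("eventTime", eventTime);
        return tuple;
    }

    static boolean isViewMap(Map<String, Object> tuple) {
        return "view".equals(tuple.get("eventType"));
    }

    // POJO-based tuple: a plain field read, checked by the compiler.
    static boolean isViewPojo(AdEvent event) {
        return "view".equals(event.eventType);
    }

    public static void main(String[] args) {
        System.out.println(isViewMap(asMap("ad-1", "view", 1L)));
        System.out.println(isViewPojo(new AdEvent("ad-1", "view", 1L)));
    }
}
```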


Regarding your point 2:
Running filterTuples first reduces the number of tuples, which also reduces
the total amount of data passing through the stream. Ultimately, the choice
between the two orderings depends on the type of dataset.
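
To make the trade-off concrete, here is a small sketch that counts how many
tuples cross the intermediate stream under each ordering. It uses plain Java
lists rather than Apex operators, and the RawEvent, PickedEvent, and
SlimEvent classes, their fields, and the "view" predicate are all
hypothetical. Note that when the field picker runs first, it still has to
carry the field that the later filter tests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical full-width input tuple.
class RawEvent {
    final String adId, eventType, userId, pageId;
    final long eventTime;

    RawEvent(String adId, String eventType, String userId, String pageId, long eventTime) {
        this.adId = adId;
        this.eventType = eventType;
        this.userId = userId;
        this.pageId = pageId;
        this.eventTime = eventTime;
    }
}

// Hypothetical narrowed tuple that still carries the field the filter needs.
class PickedEvent {
    final String adId, eventType;
    final long eventTime;

    PickedEvent(String adId, String eventType, long eventTime) {
        this.adId = adId;
        this.eventType = eventType;
        this.eventTime = eventTime;
    }
}

// Hypothetical final output tuple.
class SlimEvent {
    final String adId;
    final long eventTime;

    SlimEvent(String adId, long eventTime) {
        this.adId = adId;
        this.eventTime = eventTime;
    }
}

class OrderingSketch {
    // Tuples sent between the two stages, and tuples emitted at the end.
    static class Counts {
        final int intermediate, output;

        Counts(int intermediate, int output) {
            this.intermediate = intermediate;
            this.output = output;
        }
    }

    // filterTuples --> filterFields: fewer, wider tuples on the stream.
    static Counts filterThenPick(List<RawEvent> in) {
        List<RawEvent> kept = new ArrayList<>();
        for (RawEvent e : in) {
            if ("view".equals(e.eventType)) kept.add(e);
        }
        List<SlimEvent> out = new ArrayList<>();
        for (RawEvent e : kept) out.add(new SlimEvent(e.adId, e.eventTime));
        return new Counts(kept.size(), out.size());
    }

    // filterFields --> filterTuples: every tuple crosses the stream, but narrower.
    static Counts pickThenFilter(List<RawEvent> in) {
        List<PickedEvent> picked = new ArrayList<>();
        for (RawEvent e : in) picked.add(new PickedEvent(e.adId, e.eventType, e.eventTime));
        List<SlimEvent> out = new ArrayList<>();
        for (PickedEvent p : picked) {
            if ("view".equals(p.eventType)) out.add(new SlimEvent(p.adId, p.eventTime));
        }
        return new Counts(picked.size(), out.size());
    }
}
```

Both orderings emit the same final tuples; they differ only in how many
tuples, and of what width, travel between the two stages. Which one wins
depends on how selective the filter is and how much the picker shrinks each
tuple.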

Here is the latest change:
https://github.com/sandeshh/streaming-benchmarks/tree/squash


On Mon, Jan 18, 2016 at 11:01 AM Ashwin Chandra Putta <
[email protected]> wrote:

> Sandesh,
>
> I had a quick look at the application. Here are a few comments.
>
> 1. Please use a POJO instead of a map for better performance.
> 2. Can you also check whether changing the order of processing from
> filterTuples --> filterFields to filterFields --> filterTuples increases
> performance? We would basically be reducing the size of the object going
> across the stream before the filter.
> 3. Can you rename filterFields to something like fieldPicker? It seems to
> be picking fields rather than filtering.
>
> Regards,
> Ashwin.
>
> On Mon, Jan 18, 2016 at 10:46 AM, Sandesh Hegde <[email protected]>
> wrote:
>
> > Hello All,
> >
> > Yahoo benchmarked several streaming systems; here is the blog post:
> >
> > http://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
> > The code is at: https://github.com/yahoo/streaming-benchmarks
> >
> > Currently, they don't have Apex. I am working on the Apex app for the
> > benchmark. Here is the "apex-benchmark" that I am working on:
> > https://github.com/sandeshh/streaming-benchmarks
> >
> > The benchmark app is mostly done; currently I am doing the plumbing. I am
> > sending this mail to get your valuable feedback.
> >
> > Thanks
> > Sandesh
> >
>
>
>
> --
>
> Regards,
> Ashwin.
>
