It sounds like a good idea to me.

Regards
JB

On 10/18/2016 08:08 PM, Amit Sela wrote:
@Jesse how about runners "tracing" the DAG constructed by Beam, so that
it's clear what the runner actually executed?

Example:
For the SparkRunner, a ParDo translates to a mapPartitions transformation.

That could provide transparency when debugging/benchmarking pipelines
per-runner.
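
To make the idea concrete, here is a minimal sketch of what such a trace
could record. The names TraceEntry and format_trace are hypothetical and
are not an existing Beam or runner API; the only mapping taken from this
thread is ParDo to mapPartitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One line of a hypothetical translation trace."""
    beam_transform: str    # the transform as written in the Beam pipeline
    runner_operation: str  # what the runner executed for that transform


def format_trace(runner, entries):
    """Render the trace as plain text, one Beam transform per line."""
    lines = [f"{runner} translation trace:"]
    for entry in entries:
        lines.append(f"  {entry.beam_transform} -> {entry.runner_operation}")
    return "\n".join(lines)


# The single mapping mentioned above for the SparkRunner.
trace = [TraceEntry("ParDo", "mapPartitions")]
print(format_trace("SparkRunner", trace))
```

A trace like this, emitted next to the benchmark numbers, would let a
reader see which runner operations a measurement actually covers.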

On Tue, Oct 18, 2016 at 8:25 PM Jesse Anderson <je...@smokinghand.com>
wrote:

@Dan before starting with Beam, I'd want to know how much performance I'm
giving up by not programming directly to the API.

On Tue, Oct 18, 2016 at 10:03 AM Dan Halperin <dhalp...@google.com.invalid>
wrote:

I think there are lots of excellent one-off performance studies, but I'm
not sure how useful that is to Beam.

From a test infra point of view, I'm wondering more about tracking of
performance over time, identifying regressions, etc.
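
As a rough illustration of what "identifying regressions" could mean, the
sketch below flags any run that is noticeably slower than the median of
the runs just before it. The function name, the window size, and the 20%
threshold are all assumptions made for the example, and the runtimes are
made-up numbers, not measurements.

```python
from statistics import median


def find_regressions(runtimes, window=5, threshold=1.2):
    """Return indices of runs slower than threshold x the median
    of the previous `window` runs."""
    flagged = []
    for i in range(window, len(runtimes)):
        baseline = median(runtimes[i - window:i])
        if runtimes[i] > threshold * baseline:
            flagged.append(i)
    return flagged


# Illustrative runtimes in seconds, one per nightly run (invented data).
history = [100.0, 102.0, 99.0, 101.0, 100.0, 103.0, 130.0]
print(find_regressions(history))  # index 6 is flagged
```

Using a median over a short window keeps one noisy run from shifting the
baseline much; a real setup would also need to store the history somewhere.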

Google has some tools like PerfKit
<https://github.com/GoogleCloudPlatform/PerfKitBenchmarker> which is
basically a skin on a database + some scripts to load and query data;
but I don't love it. Do other Apache projects do public, long-term
benchmarking and performance regression testing?

Dan

On Tue, Oct 18, 2016 at 8:52 AM, Jesse Anderson <je...@smokinghand.com>
wrote:

I found data Artisans' benchmarking post
<http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/>.
They also shared the code <https://github.com/dataArtisans/performance>.
I didn't dig in much, but they covered a wide range of algorithms. They
have the native code, so you could write the Beam code and check it
against the native performance.

On Mon, Oct 17, 2016 at 5:14 PM amir bahmanyari <amirto...@yahoo.com.invalid>
wrote:

Hi Jason,

I have been busy benchmarking a Flink cluster (Spark next) under Beam.
I can share my experience. Can you list the items of interest so I can
answer them to the best of my knowledge?

Cheers

From: Jason Kuster <jasonkus...@google.com.INVALID>
To: dev@beam.incubator.apache.org
Sent: Monday, October 17, 2016 5:06 PM
Subject: Exploring Performance Testing

Hey all,

Now that we've covered some of the initial ground with regard to
correctness testing, I'm going to be starting work on performance testing
and benchmarking. I wanted to reach out and see what people's experiences
have been with performance testing and benchmarking frameworks,
particularly in other Apache projects. Anyone have any experience or
thoughts?

Best,

Jason

--
-------
Jason Kuster
Apache Beam (Incubating) / Google Cloud Dataflow

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
