Thanks, Ismaël, for the comments! Replied inline.

On Wed, Mar 15, 2017 at 8:18 AM, Ismaël Mejía <ieme...@gmail.com> wrote:

> Excellent proposal. Sorry to jump into this discussion so late; this
> has been on my to-read list for almost two weeks, and I finally found the
> time to read the document. I have two minor comments:
>
> I have the impression that the strict separation of Providers (the
> data-processing systems) and Resources (the concrete data stores)
> makes sense for the general case, but falls short if we want to
> test things in the Hadoop ecosystem, where the data stores commonly
> co-exist on the same group of machines as the data-processing
> systems (the Providers), e.g. HDFS, HBase + YARN. This is important
> for verifying that data locality works correctly, for example. Have
> you considered such a case?
>

Definitely interesting to think about, and I don't think I added provisions
for this in the doc. My impression, though, is that since the providers and
the data stores are not coupled, if the provider we are bringing up also
provides the data store, we can just omit the data store for that benchmark
and use what we've already brought up. Does that answer your question, or
have I misunderstood?
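
A rough sketch of that idea, purely for illustration: the class and function
names below are hypothetical and are not PKB's actual API. It only shows the
"skip any data store the provider already brings up" logic described above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Provider:
    """Hypothetical data-processing system brought up for a benchmark."""
    name: str
    # Data stores that come bundled with this provider (assumption: e.g. a
    # Hadoop cluster that already runs HDFS and HBase next to YARN).
    bundled_stores: Set[str] = field(default_factory=set)


def stores_to_provision(provider: Provider, required: List[str]) -> List[str]:
    """Return the data stores that still need to be brought up separately."""
    return [store for store in required if store not in provider.bundled_stores]


yarn = Provider("yarn", bundled_stores={"hdfs", "hbase"})
# HDFS is already up with the provider, so only the other store is provisioned.
print(stores_to_provision(yarn, ["hdfs", "cassandra"]))  # ['cassandra']
```

A benchmark that only needs stores the provider already bundles would get an
empty list back, so it would reuse what was brought up with the provider.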

>
> Another thing I noticed is that the Direct Runner is not included in
> the list of runners supporting PKB. Is there any particular reason for
> this? Even if performance is not the main goal of the Direct Runner,
> it would be nice to have it there too, to catch any performance
> regressions. Or is it because it is already ready for it?
> What do you think?
>
>
Great point -- I neglected to include the DirectRunner in the plans here.
I'll add it to the doc and file a JIRA.


> Thanks,
> Ismaël
>
> On Thu, Mar 2, 2017 at 11:49 PM, Amit Sela <amitsel...@gmail.com> wrote:
> > Looks great, and I'll be sure to follow this. Ping me if I can assist in
> > any way!
> >
> > On Fri, Mar 3, 2017 at 12:09 AM Ahmet Altay <al...@google.com.invalid>
> > wrote:
> >
> >> Sounds great, thank you!
> >>
> >> On Thu, Mar 2, 2017 at 1:41 PM, Jason Kuster <jasonkus...@google.com.invalid>
> >> wrote:
> >>
> >> > D'oh, my bad Ahmet. I've opened BEAM-1610, which handles support for Python
> >> > in PKB against the Dataflow runner. Once the Fn API progresses some more we
> >> > can add some work items for the other runners too. Let's chat about this
> >> > more, maybe next week?
> >> >
> >> > On Thu, Mar 2, 2017 at 1:31 PM, Ahmet Altay <al...@google.com.invalid>
> >> > wrote:
> >> >
> >> > > Thank you Jason, this is great.
> >> > >
> >> > > Which of these issues fall into the land of sdk-py?
> >> > >
> >> > > Ahmet
> >> > >
> >> > > On Thu, Mar 2, 2017 at 12:34 PM, Jason Kuster <
> >> > > jasonkus...@google.com.invalid> wrote:
> >> > >
> >> > > > Glad to hear the excitement. :)
> >> > > >
> >> > > > Filed BEAM-1595 - 1609 to track work items. Some of these fall under runner
> >> > > > components; please feel free to reach out to me if you have any questions
> >> > > > about how to accomplish these.
> >> > > >
> >> > > > Best,
> >> > > >
> >> > > > Jason
> >> > > >
> >> > > > On Wed, Mar 1, 2017 at 5:50 AM, Aljoscha Krettek <aljos...@apache.org>
> >> > > > wrote:
> >> > > >
> >> > > > > Thanks for writing this and taking care of this, Jason!
> >> > > > >
> >> > > > > I'm afraid I also cannot add anything except that I'm excited to see some
> >> > > > > results from this.
> >> > > > >
> >> > > > > On Wed, 1 Mar 2017 at 03:28 Kenneth Knowles <k...@google.com.invalid>
> >> > > > > wrote:
> >> > > > >
> >> > > > > Just got a chance to look this over. I don't have anything to add, but I'm
> >> > > > > pretty excited to follow this project. Have the JIRAs been filed since you
> >> > > > > shared the doc?
> >> > > > >
> >> > > > > On Wed, Feb 22, 2017 at 10:38 AM, Jason Kuster <
> >> > > > > jasonkus...@google.com.invalid> wrote:
> >> > > > >
> >> > > > > > Hey all, just wanted to pop this up again for people -- if anyone has
> >> > > > > > thoughts on performance testing please feel welcome to chime in. :)
> >> > > > > >
> >> > > > > > On Fri, Feb 17, 2017 at 4:03 PM, Jason Kuster <jasonkus...@google.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hi all,
> >> > > > > > >
> >> > > > > > > I've written up a doc on next steps for getting performance testing up and
> >> > > > > > > running for Beam. I'd love to hear from people -- there's a fair amount of
> >> > > > > > > work encapsulated in here, but the end result is that we have a performance
> >> > > > > > > testing system which we can use for benchmarking all aspects of Beam, which
> >> > > > > > > would be really exciting. Looking forward to your thoughts.
> >> > > > > > >
> >> > > > > > > https://docs.google.com/document/d/1PsjGPSN6FuorEEPrKEP3u3m16tyOzph5FnL2DhaRDz0/edit?ts=58a78e73
> >> > > > > > >
> >> > > > > > > Best,
> >> > > > > > >
> >> > > > > > > Jason
> >> > > > > > >
> >> > > > > > > --
> >> > > > > > > -------
> >> > > > > > > Jason Kuster
> >> > > > > > > Apache Beam / Google Cloud Dataflow
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >> >
> >>
>



-- 
-------
Jason Kuster
Apache Beam / Google Cloud Dataflow
