> .. if the provider we are bringing up also
> provides the data store, we can just omit the data store for that benchmark
> and use what we've already brought up. Does that answer your question, or
> have I misunderstood?

Yes, and it is a perfect approach for the case, great idea.

> Great point -- I neglected to include the DirectRunner in the plans here.
> I'll add it to the doc and file a JIRA.

Excellent.

This work is super interesting, so don't hesitate to ask anything of
us, the rest of the community. I think many of us are interested, and
we can give a hand if needed.


On Thu, Mar 16, 2017 at 9:17 AM, Jason Kuster
<jasonkus...@google.com.invalid> wrote:
> Thanks Ismael for the comments! Replied inline.
>
> On Wed, Mar 15, 2017 at 8:18 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> Excellent proposal. Sorry to jump into this discussion so late; this
>> was on my to-read list for almost two weeks, and I finally got the time
>> to read the document. I have two minor comments:
>>
>> I have the impression that the strict separation of Providers (the
>> data-processing systems) and Resources (the concrete Data Stores)
>> makes sense for the general case, but is lacking if what we want to
>> test are things in the Hadoop ecosystem where the data stores commonly
>> co-exist in the same group of machines with the data-processing
>> systems (the Providers), e.g. HDFS, HBase + YARN. This is important to
>> test that data locality works correctly, for example. Have
>> you considered such a case?
>>
>
> Definitely interesting to think about, and I don't think I added provisions
> for this in the doc. My impression, though, is that since the providers and
> the data stores are not coupled, if the provider we are bringing up also
> provides the data store, we can just omit the data store for that benchmark
> and use what we've already brought up. Does that answer your question, or
> have I misunderstood?
>
>>
>> Another thing I noticed is that the Direct Runner is not included in the
>> list of runners supporting PKB. Is there any particular reason for
>> this? I think that even if performance is not the main goal of the
>> direct runner, it would be nice to have it there too to catch any
>> performance regressions. Or is it because it is already ready for it?
>> What do you think?
>>
>>
> Great point -- I neglected to include the DirectRunner in the plans here.
> I'll add it to the doc and file a JIRA.
>
>
>> Thanks,
>> Ismaël
>>
>> On Thu, Mar 2, 2017 at 11:49 PM, Amit Sela <amitsel...@gmail.com> wrote:
>> > Looks great, and I'll be sure to follow this. Ping me if I can assist in
>> > any way!
>> >
>> > On Fri, Mar 3, 2017 at 12:09 AM Ahmet Altay <al...@google.com.invalid>
>> > wrote:
>> >
>> >> Sounds great, thank you!
>> >>
>> >> On Thu, Mar 2, 2017 at 1:41 PM, Jason Kuster <jasonkus...@google.com.invalid>
>> >> wrote:
>> >>
>> >> > D'oh, my bad Ahmet. I've opened BEAM-1610, which handles support for Python
>> >> > in PKB against the Dataflow runner. Once the Fn API progresses some more we
>> >> > can add some work items for the other runners too. Let's chat about this
>> >> > more, maybe next week?
>> >> >
>> >> > On Thu, Mar 2, 2017 at 1:31 PM, Ahmet Altay <al...@google.com.invalid>
>> >> > wrote:
>> >> >
>> >> > > Thank you Jason, this is great.
>> >> > >
>> >> > > Which of these issues fall into the land of sdk-py?
>> >> > >
>> >> > > Ahmet
>> >> > >
>> >> > > On Thu, Mar 2, 2017 at 12:34 PM, Jason Kuster <
>> >> > > jasonkus...@google.com.invalid> wrote:
>> >> > >
>> >> > > > Glad to hear the excitement. :)
>> >> > > >
>> >> > > > Filed BEAM-1595 - 1609 to track work items. Some of these fall under runner
>> >> > > > components, please feel free to reach out to me if you have any questions
>> >> > > > about how to accomplish these.
>> >> > > >
>> >> > > > Best,
>> >> > > >
>> >> > > > Jason
>> >> > > >
>> >> > > > On Wed, Mar 1, 2017 at 5:50 AM, Aljoscha Krettek <aljos...@apache.org>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > Thanks for writing this and taking care of this, Jason!
>> >> > > > >
>> >> > > > > I'm afraid I also cannot add anything except that I'm excited to see some
>> >> > > > > results from this.
>> >> > > > >
>> >> > > > > On Wed, 1 Mar 2017 at 03:28 Kenneth Knowles <k...@google.com.invalid>
>> >> > > > > wrote:
>> >> > > > >
>> >> > > > > Just got a chance to look this over. I don't have anything to add, but I'm
>> >> > > > > pretty excited to follow this project. Have the JIRAs been filed since you
>> >> > > > > shared the doc?
>> >> > > > >
>> >> > > > > On Wed, Feb 22, 2017 at 10:38 AM, Jason Kuster <
>> >> > > > > jasonkus...@google.com.invalid> wrote:
>> >> > > > >
>> >> > > > > > Hey all, just wanted to pop this up again for people -- if anyone has
>> >> > > > > > thoughts on performance testing please feel welcome to chime in. :)
>> >> > > > > >
>> >> > > > > > On Fri, Feb 17, 2017 at 4:03 PM, Jason Kuster <jasonkus...@google.com>
>> >> > > > > > wrote:
>> >> > > > > >
>> >> > > > > > > Hi all,
>> >> > > > > > >
>> >> > > > > > > I've written up a doc on next steps for getting performance testing up and
>> >> > > > > > > running for Beam. I'd love to hear from people -- there's a fair amount of
>> >> > > > > > > work encapsulated in here, but the end result is that we have a performance
>> >> > > > > > > testing system which we can use for benchmarking all aspects of Beam, which
>> >> > > > > > > would be really exciting. Looking forward to your thoughts.
>> >> > > > > > >
>> >> > > > > > > https://docs.google.com/document/d/1PsjGPSN6FuorEEPrKEP3u3m16tyOzph5FnL2DhaRDz0/edit?ts=58a78e73
>> >> > > > > > >
>> >> > > > > > > Best,
>> >> > > > > > >
>> >> > > > > > > Jason
>> >> > > > > > >
>> >> > > > > > > --
>> >> > > > > > > -------
>> >> > > > > > > Jason Kuster
>> >> > > > > > > Apache Beam / Google Cloud Dataflow
>> >> > > > > > >
>> >> > > > > >
>> >> > > > > >
>> >> > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > >
>> >> >
>> >> >
>> >> >
>> >> >
>> >>
>>
>
>
>
