Re: [VOTE] Apache Samza 1.4.0 RC1

2020-03-10 Thread Xinyu Liu
+1 (binding).

Ran check-all.sh and the integration tests for both YARN and standalone. All
passed.

Thanks,
Xinyu


On Fri, Mar 6, 2020 at 6:46 PM Yi Pan wrote:

> Downloaded the files, built with check-all.sh, and ran both YARN and
> standalone integration tests. All passed.
>
> +1 (binding).
>
> Thanks!
>
> -Yi
>
> On Tue, Mar 3, 2020 at 3:03 PM Cameron Lee wrote:
>
> > Hi all,
> >
> > This is a call for a vote on a release of Apache Samza 1.4.0. Thanks to
> > everyone who has contributed to this release.
> >
> > The release candidate can be downloaded from here:
> > https://home.apache.org/~cameronlee/samza-1.4.0-rc1/
> >
> > The release candidate is signed with pgp key 0x54CB3CE3, which can be found
> > here:
> >
> >
> https://keyserver.ubuntu.com/pks/lookup?search=0x54CB3CE3=on=index
> > or to directly see the public key here:
> >
> >
> https://keyserver.ubuntu.com/pks/lookup?op=get=0x71b0145290ecdbfa5caea6dbd786a7ba54cb3ce3
> >
> > The git tag is release-1.4.0-rc1, signed by the same pgp key above:
> >
> >
> https://gitbox.apache.org/repos/asf?p=samza.git;a=commit;h=5327fafb8502b126482ec0c4efc8d1aa9b96ba44
> >
> > Test binaries have been published to Maven's staging repository, and are
> > available here:
> > https://repository.apache.org/content/repositories/orgapachesamza-1077
> >
> > The vote will be open for 72 hours (until Friday, March 6, 2020 at 3pm
> > PST).
> >
> > Please download the release candidate, check the hashes/signature, build it
> > and test it, and then please vote:
> > [ ] +1 approve
> > [ ] +0 no opinion
> > [ ] -1 disapprove (and reason why)
> >
> > I ran check-all.sh and integration tests.
> >
> > +1 (non-binding) from my side.
> >
> > Thank you,
> > Cameron
> >
>
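
The "check the hashes" step in the vote call above can be sketched in Java as follows. This is only an illustration under stated assumptions: the thread does not say which digest was published, so SHA-512 is assumed here, and the class and method names (ChecksumCheck, sha512Hex, matches) are hypothetical and not part of Samza or its release tooling.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

class ChecksumCheck {
    /** Returns the lowercase hex SHA-512 digest of the given bytes (SHA-512 is an assumption). */
    static String sha512Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-512").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available in this JVM", e);
        }
    }

    /** True when the computed digest equals the published one, ignoring case and surrounding whitespace. */
    static boolean matches(byte[] data, String publishedHex) {
        return sha512Hex(data).equals(publishedHex.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real check would read the downloaded artifact from disk.
        byte[] artifact = "stand-in for the downloaded tarball".getBytes(StandardCharsets.UTF_8);
        // Stand-in for the checksum published next to the artifact.
        String published = sha512Hex(artifact).toUpperCase(Locale.ROOT);
        System.out.println(matches(artifact, published));
    }
}
```

For a real artifact, the bytes could come from java.nio.file.Files.readAllBytes. This covers only the hash comparison; checking the signature needs an OpenPGP tool and the signer's public key.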


Re: [VOTE] SEP-24: Cluster-based Job Coordinator Dependency Isolation

2020-03-10 Thread Cameron Lee
a) The "yarn.resources.*" configs are for localizing the necessary
resources into the working directory for the process. I felt that the
specific configuration format to specify these resources might be
YARN-specific (e.g. YARN has type and visibility configs for each of its
resources), so a generic format might not apply. In a non-YARN case, the
localization configs would need to be specified according to the technology
being used.
b) It is correct that the Avro version will need to be compatible with the
version that is used by the infrastructure, if the infrastructure needs to use
Avro and pass the Avro object to the application. This is the case with any
serde technology that needs to be used. For the job coordinator, it is not
much of a concern anyway, since it does not do serde of Avro messages.
This may be more of a concern for general split deployment, which will
impact the processing containers and will be covered in a separate SEP.
c) It should work to leave infrastructure serdes in the infrastructure
classpath. The infrastructure serdes just see generic types (which are
java.lang.Object at runtime) for the messages, and they don't do anything
with the concrete types, so in the infrastructure classes, the messages get
passed around as Object, but their concrete classes can still be loaded
from the application. As with (b), this is more of a concern for general
split deployment, since the job coordinator doesn't do message serde. I
have run some tests regarding this classloading pattern, but we will do
further verification for general split deployment.
d) Yes, you are correct. Good catch. It should be "described above at
Application classloader".
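
The classloading pattern in (c) can be illustrated with a minimal Java sketch. It is not the SEP-24 implementation: the names InfraLoader, newMessage, and the prefix list are hypothetical, and the demo uses the system classloader as a stand-in for both the infrastructure and the application classloaders. It shows only the idea that infrastructure code handles a message as Object while the concrete class is resolved by a different loader.

```java
import java.util.Arrays;
import java.util.List;

class DelegatingLoaderSketch {
    /**
     * Hypothetical infrastructure-side loader: class names starting with one of
     * the delegated prefixes are resolved by the application loader; everything
     * else follows the normal parent-first lookup.
     */
    static final class InfraLoader extends ClassLoader {
        private final ClassLoader applicationLoader;
        private final List<String> delegatedPrefixes;

        InfraLoader(ClassLoader parent, ClassLoader applicationLoader, List<String> delegatedPrefixes) {
            super(parent);
            this.applicationLoader = applicationLoader;
            this.delegatedPrefixes = delegatedPrefixes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            for (String prefix : delegatedPrefixes) {
                if (name.startsWith(prefix)) {
                    // Concrete application types come from the application loader.
                    return applicationLoader.loadClass(name);
                }
            }
            return super.loadClass(name, resolve);
        }
    }

    /** Infrastructure code only ever holds the message as Object. */
    static Object newMessage(ClassLoader loader, String className) {
        try {
            return loader.loadClass(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // Assumption for the demo: the system loader plays both roles.
        ClassLoader system = ClassLoader.getSystemClassLoader();
        ClassLoader infra = new InfraLoader(system, system, Arrays.asList("java.util."));
        Object message = newMessage(infra, "java.util.ArrayList");
        System.out.println(message.getClass().getName());
    }
}
```

In a real split deployment the two loaders would be built from separate classpaths, and the delegated prefixes would name application packages rather than java.util.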

Thanks for all of your questions. I will clarify some details in the doc
regarding your questions.

Cameron

On Mon, Mar 9, 2020 at 12:07 PM Yi Pan wrote:

> Hi, Cameron,
>
> Sorry to chime in late. Overall, looks great! I do have a few
> suggestions/questions before I can cast my vote here:
> a) for the configuration variable names, why are we limiting ourselves to
> yarn.resources.*? We have changed some of the configuration variables from
> YARN-specific to non-YARN-specific names. I would love to keep that consistent
> (i.e. gradually move all our YARN-specific configuration variables to
> non-YARN-specific names).
> b) for the Avro case referred to in the delegation case in the
> Infrastructure classloader, if we delegate the object deserialization class
> to the application classloader, would it be possible that the application
> provides a version of the Avro classes that is incompatible with the ones used
> within the "infrastructure plugins", hence causing a runtime exception in the
> infrastructure plugin? Or is the solution: do not directly use serde
> classes in the infrastructure code?
> c) following the description of infrastructure classloader flow, where
> should we expect the serde classes? In the application classpath, I guess?
> So, does that mean that we should exclude serde classes (including
> SerializableSerde and JsonSerdeV2) in the Samza infrastructure package, and
> tell the users to package them in application package?
> d) I am a bit confused about the description of "multiple" application
> classloaders on the job coordinator: one is for the describe flow and the
> other is in the "Application" classloader, instead of "API" classloader,
> right?
>
> Best,
>
> -Yi
>
>
> On Wed, Mar 4, 2020 at 11:32 AM Ke Wu wrote:
>
> > +1.
> >
> > Thanks for driving this effort.
> >
> > Best,
> > Ke
> >
> > > On Mar 3, 2020, at 6:28 PM, Jagadish Venkatraman <jagadish1...@gmail.com> wrote:
> > >
> > > +1 binding.
> > >
> > > Thanks Cameron. I look forward to this feature taking our "Stream
> > > Processing as a service" offering to the next level.
> > >
> > > Cheers
> > >
> > > On Tuesday, March 3, 2020, Prateek Maheshwari wrote:
> > >
> > >> +1 (binding) from me. Thanks for contributing this feature. Looking forward
> > >> to having dependency isolation and to the ability to upgrade the framework
> > >> independently from an application.
> > >>
> > >> Thanks,
> > >> Prateek
> > >>
> > >> On Fri, Feb 28, 2020 at 10:48 AM Cameron Lee wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> This is a call for a vote on SEP-24: Cluster-based Job Coordinator
> > >>> Dependency Isolation. Thanks to everyone who reviewed the proposal and
> > >>> provided feedback.
> > >>>
> > >>> I have addressed comments on the SEP, and I am not aware of any further
> > >>> major questions or objections, so I am starting this vote.
> > >>>
> > >>> SEP link:
> > >>>
> > >>> https://cwiki.apache.org/confluence/display/SAMZA/SEP-24%3A+Cluster-based+Job+Coordinator+Dependency+Isolation
> > >>>
> > >>> Discuss thread:
> > >>>
> > >>> https://mail-archives.apache.org/mod_mbox/samza-dev/202001.mbox/%3cCAMja7KeGcRZ3H95Rxk5XE=60zxm6jxjkjuwwxmgmadpfbyk...@mail.gmail.com%3e
> > >>> There was also some discussion through comments on the SEP page (see
>