Re: [ANNOUNCE] Please welcome Boris Shkolnik to the Samza PMC

2019-06-07 Thread santhosh venkat
Congratulations boris! Very well deserved.

On Fri, Jun 7, 2019 at 3:41 PM Daniel Nishimura 
wrote:

> Congrats!
>
> On Fri, Jun 7, 2019 at 3:35 PM Ignacio Solis  wrote:
>
> > Congrats Boris!
> >
> > On Fri, Jun 7, 2019 at 3:20 PM Bharath Kumara Subramanian <
> > codin.mart...@gmail.com> wrote:
> >
> > > Congratulations Boris!
> > >
> > > On Fri, Jun 7, 2019 at 3:19 PM Jagadish Venkatraman <
> > > jagadish1...@gmail.com>
> > > wrote:
> > >
> > > > Congratulations Boris!
> > > >
> > > > On Fri, Jun 7, 2019 at 3:15 PM Xinyu Liu 
> > wrote:
> > > >
> > > > > Congrats, Boris!
> > > > >
> > > > > Xinyu
> > > > >
> > > > > On Fri, Jun 7, 2019 at 3:13 PM Jakob Homan 
> > wrote:
> > > > >
> > > > > > Howdy all-
> > > > > >I'm very pleased to announce that the Samza PMC has voted
> Boris
> > > > > > Shkolnik to be a Project Management Committee (PMC) Member.  The
> > PMC
> > > > > > > is responsible for the overall health of a project and for
> voting
> > in
> > > > > > new committers and PMC members, as well as VOTEing on releases.
> > Over
> > > > > > the past two years, Boris has been a valuable committer on the
> > > > > > project.
> > > > > >
> > > > > > Congrats Boris!
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jakob
> > > > > > on behalf of the Samza PMC
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Jagadish V,
> > > > Graduate Student,
> > > > Department of Computer Science,
> > > > Stanford University
> > > >
> > >
> >
> >
> > --
> > Nacho - Ignacio Solis - iso...@igso.net
> >
>


Re: REMINDER. [VOTE] Apache Samza 1.2.0 RC4

2019-06-04 Thread santhosh venkat
+1(non-binding)

1. ./bin/check-all.sh succeeded
2. ./bin/integration-tests.sh succeeded
3. Expanded samza-tools and followed the tutorial steps for the standalone SQL
examples successfully.
4. Verified all sha1 hashes and asc signatures successfully

Thanks,


On Tue, Jun 4, 2019 at 1:26 PM Xinyu Liu  wrote:

> +1 (binding).
>
> Ran check-all.sh and the build passed.
>
> Having trouble running the integration tests in both linux and mac,
> possibly due to my local machine env.
>
> Thanks,
> Xinyu
>
> On Mon, Jun 3, 2019 at 11:00 AM Daniel Nishimura 
> wrote:
>
> > check-all.sh and integration tests passed. +1 from me.
> >
> > Just a side note, the link in the original email is a broken link. The
> link
> > to the RC archive is: http://home.apache.org/~boryas/samza-1.2.0-rc4
> >
> > On Sun, Jun 2, 2019 at 5:00 PM Boris Shkolnik  wrote:
> >
> > > Hi,
> > >
> > > This is a call for a vote on a release of Apache Samza 1.2.0. Thanks to
> > > everyone who has contributed to this release.
> > >
> > >
> > > The release candidate can be downloaded from here:
> > > > http://home.apache.org/~boryas/samza-1.2.0-rc4
> > >
> > > (this release has a fix for standalone integration test)
> > >
> > > > The release candidate is signed with pgp key 0x7D74D0CD5B5EB041, which
> > > > can be found at:
> > > > http://keyserver.ubuntu.com/pks/lookup?op=get&search=0x7d74d0cd5b5eb041
> > > >
> > > The git tag is release-1.2.0-rc4 and signed with the same pgp key:
> > >
> > >
> >
> https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.2.0-rc
> > > <
> > >
> >
> https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.1.0-rc1
> > > >
> > > 4
> > >
> > > > Test binaries have been published to Maven's staging repository, and
> > > > are available here:
> > > > https://repository.apache.org/content/repositories/orgapachesamza-1069
> > >
> > > The vote will be open until 06:00 PM PST Monday, 06/03/2019.
> > >
> > >
> > > Please download the release candidate, check the hashes/signature,
> build
> > it
> > > and test it, and then please vote:
> > >
> > > [ ] +1 approve
> > >
> > > [ ] +0 no opinion
> > >
> > > [ ] -1 disapprove (and reason why)
> > >
> > > I ran check-all.sh and integration tests.
> > >
> > > +1 from my side.
> > >
> > > Thanks
> > >
> >
>


Re: [DISCUSS] 1.2 release

2019-05-20 Thread santhosh venkat
+1 (non-binding)

Thanks,

On Mon, May 20, 2019 at 10:10 AM Xinyu Liu  wrote:

> +1 on the 1.2 release. It's time to get the newly added features out!
>
> Thanks,
> Xinyu
>
> On Mon, May 20, 2019 at 9:39 AM Jake Maes  wrote:
>
> > I don't think we did anything for "Making sendTo(table), sendTo(stream)
> > non-terminal". The ticket was just closed as a "won't fix" IIRC.
> >
> > Nevertheless, I think the Kafka 2.0 upgrade warrants a release by itself.
> >
> > Let's do it.
> >
> > On Fri, May 17, 2019 at 12:17 PM Boris S  wrote:
> >
> > > Hi all,
> > >
> > > We have added a number of major features and changes to master since
> > > 1.1 that warrant a new 1.2 release.
> > >
> > > Within LinkedIn, some of these features have already been tested as
> > > part of our test suites. We plan to continue our testing in the coming
> > > week to validate the stability prior to release.
> > >
> > > We wanted to kick off the discussion in the open source forum to keep
> > > the momentum flowing.
> > > Here is a selected list of features that are part of the new release
> > >
> > >   Kafka 2.0 upgrade
> > >
> > >   Couchbase support for Samza Table API
> > >   Making sendTo(table), sendTo(stream) non-terminal
> > >
> > > We have also worked on the following upgrades and bugfixes.
> > > You can find a concrete list of the features, bug-fixes, upgrades
> > > here:
> > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20%22SAMZA%22%20and%20fixVersion%20in%20(1.2)
> > >
> > >
> > > Some of these Jiras are not marked as fixed (but they are marked as
> > > committed in the git log). Please close the Jiras if they are fixed.
> > >
> > > Here is my proposal on our release schedule and timelines.
> > >1. Cut the 1.2 release branch.
> > >2. Target a release vote on the week of May 20, 2019
> > >
> > >
> > > Thanks
> > > Boris
> > >
> >
>


Re: [VOTE] Apache Samza 1.1.0 RC2

2019-03-18 Thread santhosh venkat
Hi,

The vote of Samza 1.1.0 has been open for more than 72 hours. We got +1
(binding) x 3 and +1 (non-binding) x 3 and no vetoes.

*Binding +1: Prateek M, Jagadish V, Jake Maes*
*Non-binding +1: Rayman P, Daniel C, Shanthoosh V*

Thanks everyone for helping validate the release. Samza 1.1.0 has
officially passed the VOTE.

Thanks,
Shanthoosh
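
A tally like the one above can be checked mechanically. The following is a minimal sketch, assuming the usual Apache release-vote rule (at least three binding +1 votes, and more binding +1 than binding -1 votes). The function name and the vote representation are hypothetical and are not part of any Samza tooling.

```python
def release_vote_passes(votes):
    """Decide whether a release vote passes.

    votes: iterable of (value, binding) pairs, where value is +1, 0 or -1
    and binding is True for PMC votes. Non-binding votes are recorded on
    the thread but are not counted toward the result here.
    """
    votes = list(votes)
    binding_plus = sum(1 for value, binding in votes if binding and value == 1)
    binding_minus = sum(1 for value, binding in votes if binding and value == -1)
    # Assumed rule: at least three binding +1 and a majority of binding votes.
    return binding_plus >= 3 and binding_plus > binding_minus


# The 1.1.0 RC2 result: three binding +1 and three non-binding +1.
rc2_votes = [(1, True)] * 3 + [(1, False)] * 3
print(release_vote_passes(rc2_votes))
```

Keeping the binding flag on each vote makes it easy to report both counts in the result mail, as done above.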


On Mon, Mar 18, 2019 at 4:32 PM Prateek Maheshwari 
wrote:

> 1. Verified checksum and signatures for the binaries.
> 2. Ran ./check-all.sh
> 3. Ran YARN and Standalone integration tests with the config patch
> successfully.
>
> +1(binding) from my side as well.
>
> Thanks,
> Prateek
>
> On Mon, Mar 18, 2019 at 2:06 PM Jagadish Venkatraman <
> jagadish1...@gmail.com>
> wrote:
>
> > 1. Verified check-sum and signatures for the release binaries.
> > 2. Ran ./check-all.sh successfully
> > 3. Ran YARN integration tests successfully
> > 4. Encountered an error on the standalone integration test, but it
> > succeeded after setting Kafka's replication factor config to 1.
> >
> > +1(binding) from my side.
> >
> > Thanks Daniel Chen and Shanthoosh for shepherding Samza 1.0.1!
> >
> > On Mon, Mar 18, 2019 at 9:47 AM Jake Maes  wrote:
> >
> > > Verified with check-all on RHEL 7
> > >
> > > Verified pgp and sha.
> > >
> > > +1 (binding)
> > >
> > > On Fri, Mar 15, 2019 at 11:39 AM rayman preet 
> > > wrote:
> > >
> > > > +1 (Non-binding)
> > > >
> > > > --
> > > > thanks
> > > > rayman
> > > >
> > > > On Wed, Mar 13, 2019 at 7:17 PM Daniel Chen 
> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I performed the following verifications:
> > > > >
> > > > > 1. ./bin/check-all.sh succeeded.
> > > > >
> > > > > 2. Verified both ./bin/integration-tests.sh yarn-integration-tests
> > and
> > > > > ./bin/integration-tests.sh standalone-integration-tests succeeded.
> > > > >
> > > > > > 3. Verified that SQL console is available in samza-tool.tgz.
> > > > >
> > > > > +1 (Non-binding)
> > > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Daniel
> > > > >
> > > > >
> > > > > On Tue, Mar 12, 2019 at 4:11 PM santhosh venkat <
> > > > > santhoshvenkat1...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > This is a call for a vote on a release of Apache Samza 1.1.0.
> > Thanks
> > > to
> > > > > > everyone who has contributed to this release.
> > > > > >
> > > > > > The release candidate can be downloaded from here:
> > > > > > http://home.apache.org/~shanthoosh/samza-1.1.0-rc2/
> > > > > >
> > > > > > The release candidate is signed with pgp key 0xF8B95961A401BF0F,
> > > which
> > > > > can
> > > > > > be found
> > > > > >
> > > >
> > http://keyserver.ubuntu.com/pks/lookup?op=get&search=0xF8B95961A401BF0F
> > > > > >
> > > > > > > The git tag is release-1.1.0-rc2 and signed with the same pgp
> key:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.1.0-rc2
> > > > > >
> > > > > > Test binaries have been published to Maven's staging repository,
> > and
> > > > are
> > > > > > available here:
> > > > > >
> > > >
> > https://repository.apache.org/content/repositories/orgapachesamza-1060/
> > > > > >
> > > > > > > The vote will be open for 72 hours (ending at 4:30 PM PST
> > > > > > > Friday, 03/15/2019).
> > > > > >
> > > > > > Please download the release candidate, check the
> hashes/signature,
> > > > build
> > > > > it
> > > > > > and test it, and then please vote:
> > > > > >
> > > > > > [ ] +1 approve
> > > > > >
> > > > > > [ ] +0 no opinion
> > > > > >
> > > > > > [ ] -1 disapprove (and reason why)
> > > > > >
> > > > > > I ran check-all.sh, integration tests and verified the SQL
> console
> > > > > > in samza-tool tgz.
> > > > > >
> > > > > > +1 (non-binding) from my side.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > thanks
> > > > rayman
> > > >
> > >
> >
> >
> > --
> > Jagadish V,
> > Graduate Student,
> > Department of Computer Science,
> > Stanford University
> >
>


[VOTE] Apache Samza 1.1.0 RC2

2019-03-12 Thread santhosh venkat
Hi,

This is a call for a vote on a release of Apache Samza 1.1.0. Thanks to
everyone who has contributed to this release.

The release candidate can be downloaded from here:
http://home.apache.org/~shanthoosh/samza-1.1.0-rc2/

The release candidate is signed with pgp key 0xF8B95961A401BF0F, which can
be found
http://keyserver.ubuntu.com/pks/lookup?op=get&search=0xF8B95961A401BF0F

The git tag is release-1.1.0-rc2 and signed with the same pgp key:
https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.1.0-rc2

Test binaries have been published to Maven's staging repository, and are
available here:
https://repository.apache.org/content/repositories/orgapachesamza-1060/

The vote will be open for 72 hours (ending at 4:30 PM PST Friday,
03/15/2019).

Please download the release candidate, check the hashes/signature, build it
and test it, and then please vote:

[ ] +1 approve

[ ] +0 no opinion

[ ] -1 disapprove (and reason why)

I ran check-all.sh, integration tests and verified the SQL console
in samza-tool tgz.

+1 (non-binding) from my side.

Thanks,
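
For anyone validating a candidate, the hash check requested above can be scripted. This is only a sketch under stated assumptions: the checksum file is assumed to begin with the hex digest (optionally followed by the file name), the hash algorithm is whatever was published next to the artifact, and the helper names are hypothetical. Verifying the .asc signature is a separate step done with gpg and is not covered here.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(artifact, checksum_file, algorithm):
    """Compare an artifact's digest with the first token of its checksum file."""
    # Assumption: the first whitespace-separated token is the hex digest.
    published = Path(checksum_file).read_text().split()[0].lower()
    return file_digest(artifact, algorithm) == published
```

The algorithm name is passed straight to hashlib.new, so "sha1", "sha256" or "sha512" all work, depending on which checksum file accompanies the release candidate.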


[CANCEL][VOTE] Apache Samza 1.1.0 RC1

2019-03-11 Thread santhosh venkat
Hi all,

This is the CANCEL notification for Samza 1.1.0 RC1. We found a few bugs
with batch processing in Samza, which we will fix in SAMZA-2126.

Thanks,


[VOTE] Apache Samza 1.1.0 RC1

2019-03-11 Thread santhosh venkat
Hi,

This is a call for a vote on a release of Apache Samza 1.1.0. Thanks to
everyone who has contributed to this release.

The release candidate can be downloaded from here:
http://home.apache.org/~shanthoosh/samza-1.1.0-rc1/

The release candidate is signed with pgp key 0xF8B95961A401BF0F, which can
be found
http://keyserver.ubuntu.com/pks/lookup?op=get&search=0xF8B95961A401BF0F

The git tag is release-1.1.0-rc1 and signed with the same pgp key:
https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.1.0-rc1

Test binaries have been published to Maven's staging repository, and are
available here:
https://repository.apache.org/content/repositories/orgapachesamza-1058/

The vote will be open for 72 hours (ending at 10:00 PM PST Thursday,
03/14/2019).

Please download the release candidate, check the hashes/signature, build it
and test it, and then please vote:

[ ] +1 approve

[ ] +0 no opinion

[ ] -1 disapprove (and reason why)

I ran check-all.sh, integration tests and verified the SQL console
in samza-tool tgz.

+1 (non-binding) from my side.

Thanks,


[CANCEL][VOTE] Apache Samza 1.1.0 RC0

2019-03-11 Thread santhosh venkat
Hi all,

This is the CANCEL notification for the 1.1.0 RC0. We found
gpg key issues with the artifacts published in the RC0. This will be fixed
in the follow-up RC.

Thanks,


[VOTE] Apache Samza 1.1.0 RC0

2019-03-11 Thread santhosh venkat
Hi all,

This is a call for a vote on a release of Apache Samza 1.1.0. Thanks to
everyone who has contributed to this release.

The release candidate can be downloaded from here:
http://home.apache.org/~shanthoosh/samza-1.1.0-rc0/

The release candidate is signed with pgp key 0x2F38CB810438EDE3, which can
be found
http://keyserver.ubuntu.com/pks/lookup?op=get&search=0x2F38CB810438EDE3

The git tag is release-1.1.0-rc0 and signed with the same pgp key:
https://gitbox.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.1.0-rc0

Test binaries have been published to Maven's staging repository, and are
available here:
https://repository.apache.org/content/repositories/orgapachesamza-1058/

The vote will be open for 72 hours (ending at 10:00 PM PST Thursday,
03/14/2019).

Please download the release candidate, check the hashes/signature, build it
and test it, and then please vote:

[ ] +1 approve

[ ] +0 no opinion

[ ] -1 disapprove (and reason why)

I ran check-all.sh, integration tests and verified the SQL console
in samza-tools.tgz.

All of them passed. +1 (non-binding) from my side.

Thanks,


Re: [DISCUSS] Samza 1.1.0 release

2019-03-07 Thread santhosh venkat
+1 (non-binding)

Thanks,

On Wed, Mar 6, 2019 at 10:42 PM Yi Pan  wrote:

> +1 (binding)
>
> On Wed, Mar 6, 2019 at 10:08 PM Daniel Chen  wrote:
>
> > Hello everyone,
> >
> > We have added a couple of major features to master since 1.0.0 that
> > warrant a major release.
> >
> > Within LinkedIn, some of these features have already been tested as part
> > of our test suites. We plan to continue our testing in the coming weeks to
> > validate the stability prior to release.
> >
> > Here is the highlighted list of features that are part of the new release
> > (in chronological order)
> > SAMZA-1981: Consolidate table descriptors to samza-api
> > SAMZA-1985: Implement Startpoints model and StartpointManager
> > SAMZA-1998: Table API refactoring
> > SAMZA-2012: Add API for wiring an external context through to application processing code
> > SAMZA-2041: Add system descriptors for HDFS and Kinesis
> > SAMZA-2043: Consolidate ReadableTable and ReadWriteTable
> > SAMZA-2106: Samza App & Job Config Refactor
> > SAMZA-2081: Samza SQL : Type system for Samza SQL
> >
> > You can find a complete list of features here:
> >
> >
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SAMZA%20AND%20resolution%20%20%3D%20Fixed%20%20AND%20(fixVersion%20%3E%3D%201.1%20)%20ORDER%20BY%20createdDate%20%20DESC
> >
> > Here is my proposal on our release schedule and timelines.
> >
> >1. Cut a release version 1.1.0 from master
> >2. Target a release vote in the week of March 13th (next week)
> >
> > Thoughts?
> >
> > Thanks,
> > Daniel
> >
>


Re: [VOTE] Migration of Samza git repo to gitbox.apache.org

2019-01-24 Thread santhosh venkat
+1 (non-binding).

On Thu, Jan 24, 2019 at 7:10 AM Jake Maes  wrote:

> +1 (binding)
>
> On Wed, Jan 23, 2019 at 10:35 PM santhosh venkat <
> santhoshvenkat1...@gmail.com> wrote:
>
> > +1 (binding).
> >
> > Thanks,
> >
> > On Wed, Jan 23, 2019 at 2:43 PM Jagadish Venkatraman <
> > jagadish1...@gmail.com>
> > wrote:
> >
> > > +1 (binding). Thank you Pawas for driving this!
> > >
> > > On Wed, Jan 23, 2019 at 2:40 PM Xinyu Liu 
> wrote:
> > >
> > > > +1 (binding).
> > > >
> > > > On Wed, Jan 23, 2019 at 2:39 PM Prateek Maheshwari <
> > prateek...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > +1 (binding) again
> > > > >
> > > > > - Prateek
> > > > >
> > > > > On Wed, Jan 23, 2019 at 11:50 AM Pawas Chhokra <
> pawas2...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > This is a call for a vote on migrating Samza git repo to
> > > > > gitbox.apache.org, on
> > > > > > 11 AM, Jan 29, 2019. As mandated by the Apache Infrastructure
> Team,
> > > all
> > > > > git
> > > > > > repositories must be migrated from git-wip-us.apache.org URL to
> > > > > > gitbox.apache.org, as the old service is being decommissioned.
> > > > > > The vote will be open for 72 hours (ending at 12:00 PM PST
> Monday,
> > > > > > January 28). You can vote as follows:
> > > > > >
> > > > > > [ ] +1 approve
> > > > > >
> > > > > > [ ] +0 no opinion
> > > > > >
> > > > > > [ ] -1 disapprove (and reason why)
> > > > > >
> > > > > > The vote is +1 from my side.
> > > > > >
> > > > > > Thanks & Regards,
> > > > > > Pawas Chhokra
> > > > >
> > > >
> > >
> > >
> > > --
> > > Jagadish V,
> > > Graduate Student,
> > > Department of Computer Science,
> > > Stanford University
> > >
> >
>


Re: [VOTE] Migration of Samza git repo to gitbox.apache.org

2019-01-23 Thread santhosh venkat
+1 (binding).

Thanks,

On Wed, Jan 23, 2019 at 2:43 PM Jagadish Venkatraman 
wrote:

> +1 (binding). Thank you Pawas for driving this!
>
> On Wed, Jan 23, 2019 at 2:40 PM Xinyu Liu  wrote:
>
> > +1 (binding).
> >
> > On Wed, Jan 23, 2019 at 2:39 PM Prateek Maheshwari  >
> > wrote:
> >
> > > +1 (binding) again
> > >
> > > - Prateek
> > >
> > > On Wed, Jan 23, 2019 at 11:50 AM Pawas Chhokra 
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This is a call for a vote on migrating Samza git repo to
> > > gitbox.apache.org, on
> > > > 11 AM, Jan 29, 2019. As mandated by the Apache Infrastructure Team,
> all
> > > git
> > > > repositories must be migrated from git-wip-us.apache.org URL to
> > > > gitbox.apache.org, as the old service is being decommissioned.
> > > > The vote will be open for 72 hours (ending at 12:00 PM PST Monday,
> > > > January 28). You can vote as follows:
> > > >
> > > > [ ] +1 approve
> > > >
> > > > [ ] +0 no opinion
> > > >
> > > > [ ] -1 disapprove (and reason why)
> > > >
> > > > The vote is +1 from my side.
> > > >
> > > > Thanks & Regards,
> > > > Pawas Chhokra
> > >
> >
>
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University
>


Re: [VOTE] Apache Samza 1.0.0 RC4

2018-11-05 Thread santhosh venkat
1. ./bin/check-all.sh succeeded.
2. Both the commands ./bin/integration-tests.sh yarn-integration-tests and
./bin/integration-tests.sh standalone-integration-tests succeeded.
3. Verified the SQL console available in samza-tool tgz.

+1

Thanks.

On Mon, Nov 5, 2018 at 2:13 PM Daniel Nishimura 
wrote:

> Ran check-all and integration tests. All passed.
> Verified signatures.
> Also as an extra sanity check, I ran a Samza snapshot build with
> samza-hello-samza (High level and low level jobs).
> +1
>
> On Wed, Oct 31, 2018 at 7:15 PM Jagadish Venkatraman 
> wrote:
>
> > Hi all,
> >
> > This is a call for a vote on a release of Apache Samza 1.0.0. Thanks to
> > everyone who has contributed to this release.
> >
> > The release candidate can be downloaded from here:
> > http://home.apache.org/~jagadish/samza-1.0.0-rc4/
> >
> > The release candidate is signed with pgp key AF81FFBF, which can be found
> > on keyservers:
> > http://pgp.mit.edu/pks/lookup?op=get&search=0xAF81FFBF
> >
> > The git tag is release-1.0.0-rc4 and signed with the same pgp key:
> >
> >
> https://git-wip-us.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.0.0-rc4
> >
> > Test binaries have been published to Maven's staging repository, and are
> > available here:
> > https://repository.apache.org/content/repositories/orgapachesamza-1055/
> >
> > The vote will be open for 72 hours (ending at 7:00 PM PST Saturday,
> > November 3).
> >
> > Please download the release candidate, check the hashes/signature, build
> it
> > and test it, and then please vote:
> >
> > [ ] +1 approve
> >
> > [ ] +0 no opinion
> >
> > [ ] -1 disapprove (and reason why)
> >
> > For me, I ran check-all.sh, integration tests and verified the SQL
> console
> > in samza-tool tgz. So +1 (binding) from my side.
> >
> > Thanks,
> > Jagadish
> >
> > --
> > Jagadish V
> >
>


Re: [VOTE] Apache Samza 1.0.0 RC2

2018-10-25 Thread santhosh venkat
1. ./bin/check-all.sh succeeded.
2. Both the commands ./bin/integration-tests.sh yarn-integration-tests and
./bin/integration-tests.sh standalone-integration-tests succeeded.
3. Verified the SQL console available in samza-tool tgz.

+1

Thanks.

On Wed, Oct 24, 2018 at 10:12 PM Yi Pan  wrote:

> Ran check-all and deployed locally with the test jobs. All tests passed.
>
> +1 (binding) from my end.
>
> Thanks for pushing the release!
>
> -Yi
>
> On Wed, Oct 24, 2018 at 8:53 AM Prateek Maheshwari 
> wrote:
>
> > Hi Jagadish,
> >
> > PR 755 is mis-titled. It's only adding back the tests for the old
> > consumer. The old consumer was already added back in
> > https://github.com/apache/samza/pull/740.
> >
> > Thanks,
> > Prateek
> > On Wed, Oct 24, 2018 at 12:02 AM Jagadish Venkatraman
> >  wrote:
> > >
> > > Boris,
> > >
> > > Do users have the option to switch to use the "old" Kafka consumer if
> > > they encounter any issue with the "new" consumer? If not, should we pull in
> > > https://github.com/apache/samza/pull/755? It is my understanding that
> > > PR-755 adds support for this.
> > >
> > > Thanks,
> > > Jagadish
> > >
> > > On Tue, Oct 23, 2018 at 2:50 PM Boris S  wrote:
> > >
> > > > Ran build, test and integration test on Linux.
> > > > Verified the signatures.
> > > >
> > > > +1
> > > >
> > > > On Tue, Oct 23, 2018 at 11:55 AM Prateek Maheshwari <
> > prate...@utexas.edu>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > This is a call for a vote on a release of Apache Samza 1.0.0.
> Thanks
> > to
> > > > > everyone who has contributed to this release.
> > > > >
> > > > > The release candidate can be downloaded from here:
> > > > > http://home.apache.org/~pmaheshwari/samza-1.0.0-rc2/
> > > > >
> > > > > The release candidate is signed with pgp key 6585B3D7, which can be
> > found
> > > > > on keyservers:
> > https://pgp.mit.edu/pks/lookup?op=get&search=0x6585B3D7
> > > > >
> > > > > The git tag is release-1.0.0-rc2 and signed with the same pgp key:
> > > > >
> > > > >
> > > >
> >
> https://git-wip-us.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.0.0-rc2
> > > > >
> > > > > Test binaries have been published to Maven's staging repository,
> and
> > are
> > > > > available here:
> > > > >
> > https://repository.apache.org/content/repositories/orgapachesamza-1053/
> > > > >
> > > > > The vote will be open for 72 hours (ending at 12:00 PM PST Friday,
> > > > > 10/26/2018).
> > > > >
> > > > > Please download the release candidate, check the hashes/signature,
> > build
> > > > it
> > > > > and test it, and then please vote:
> > > > >
> > > > > [ ] +1 approve
> > > > >
> > > > > [ ] +0 no opinion
> > > > >
> > > > > [ ] -1 disapprove (and reason why)
> > > > >
> > > > > For me, I ran check-all.sh, integration tests and verified the SQL
> > > > console
> > > > > in samza-tool tgz. So +1 (non-binding) from my side.
> > > > >
> > > > > Thanks,
> > > > > Prateek
> > > > >
> > > >
> > >
> > >
> > > --
> > > Jagadish V,
> > > Graduate Student,
> > > Department of Computer Science,
> > > Stanford University
> >
>


Re: [VOTE] Apache Samza 1.0.0 RC1

2018-10-22 Thread santhosh venkat
I tried building the release candidate (RC1), and it fails with the following
checkstyle errors.

[ant:checkstyle]
/Users/svenkata/Documents/apache-samza-1.0.0-src/samza-test/src/main/java/org/apache/samza/test/integration/TestStandaloneIntegrationApplication.java:43:
'member def type' have incorrect indentation level 5, expected level should
be 4.
[ant:checkstyle]
/Users/svenkata/Documents/apache-samza-1.0.0-src/samza-test/src/main/java/org/apache/samza/test/integration/TestStandaloneIntegrationApplication.java:43:
'method def' child have incorrect indentation level 5, expected level
should be 4.
[ant:checkstyle]
/Users/svenkata/Documents/apache-samza-1.0.0-src/samza-test/src/main/java/org/apache/samza/test/integration/TestStandaloneIntegrationApplication.java:44:
'method def' child have incorrect indentation level 5, expected level
should be 4.

Thanks.

On Mon, Oct 22, 2018 at 9:09 PM Prateek Maheshwari 
wrote:

> Hi all,
>
> This is a call for a vote on a release of Apache Samza 1.0.0. Thanks to
> everyone who has contributed to this release.
>
> The release candidate can be downloaded from here:
> http://home.apache.org/~pmaheshwari/samza-1.0.0-rc1/
>
> The release candidate is signed with pgp key 6585B3D7, which can be found
> on keyservers: https://pgp.mit.edu/pks/lookup?op=get&search=0x6585B3D7
>
> The git tag is release-1.0.0-rc1 and signed with the same pgp key:
>
> https://git-wip-us.apache.org/repos/asf?p=samza.git;a=tag;h=refs/tags/release-1.0.0-rc1
>
> Test binaries have been published to Maven's staging repository, and are
> available here:
> https://repository.apache.org/content/repositories/orgapachesamza-1052/
>
> The vote will be open for 72 hours (ending at 9:00 PM PST Thursday,
> 10/25/2018).
>
> Please download the release candidate, check the hashes/signature, build it
> and test it, and then please vote:
>
> [ ] +1 approve
>
> [ ] +0 no opinion
>
> [ ] -1 disapprove (and reason why)
>
> For me, I ran check-all.sh, integration tests and verified the SQL console
> in samza-tool tgz. So +1 (non-binding) from my side.
>
> Thanks,
> Prateek
>


Re: [VOTE] SEP-11: Host affinity in standalone.

2018-04-09 Thread santhosh venkat
Hi,

The vote of SEP-11 has been open for more than 72 hours and we got +1
(binding) x 3.

Votes are as follows:
+1 (binding) - Yi Pan, Xinyu Liu, Jagadish Venkatraman.

SEP-11 has officially passed the VOTE!

Thanks.


Re: [VOTE] SEP-11: Host affinity in standalone.

2018-04-09 Thread santhosh venkat
Hi Jagadish,

Thanks for your feedback.

I have added the suggested test case to the test plan.

Thanks.

On Mon, Apr 9, 2018 at 12:43 PM, Jagadish Venkatraman <
jagadish1...@gmail.com> wrote:

> Shanthoosh,
>
> As discussed earlier, I think we should add an explicit test-case for
> verifying that we minimize partition movements during rolling upgrades.
> Please update the proposal with this missing test-case. Other than this, +1
> (binding) from my side.
>
> Thanks!
>
> On Fri, Apr 6, 2018 at 6:30 PM, Jagadish Venkatraman <
> jagadish1...@gmail.com
> > wrote:
>
> > Let's extend the vote to Monday 11:59PM PST.
> >
> > On Thu, Apr 5, 2018 at 5:05 PM, xinyu liu <xinyuliu...@gmail.com> wrote:
> >
> >> +1 (binding). Look forward to the implementation.
> >>
> >> Xinyu
> >>
> >> On Wed, Apr 4, 2018 at 2:43 PM, Yi Pan <nickpa...@gmail.com> wrote:
> >>
> >> > +1 (binding). Thanks for the revisions!
> >> >
> >> > -Yi
> >> >
> >> > On Wed, Apr 4, 2018 at 2:39 PM, santhosh venkat <
> >> > santhoshvenkat1...@gmail.com> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > This is a voting thread for SEP-11: Host affinity in standalone.
> >> > >
> >> > > For reference, here is the wiki link: https://cwiki.apache.org
> >> > > /confluence/pages/viewpage.action?pageId=75957309
> >> > >
> >> > > Thanks.
> >> > >
> >> >
> >>
> >
> >
> >
> > --
> > Jagadish V,
> > Graduate Student,
> > Department of Computer Science,
> > Stanford University
> >
>
>
>
> --
> Jagadish V,
> Graduate Student,
> Department of Computer Science,
> Stanford University
>


[VOTE] SEP-11: Host affinity in standalone.

2018-04-04 Thread santhosh venkat
Hi,

This is a voting thread for SEP-11: Host affinity in standalone.

For reference, here is the wiki link: https://cwiki.apache.org
/confluence/pages/viewpage.action?pageId=75957309

Thanks.


[CANCEL][VOTE] SEP-11: Host affinity in standalone.

2018-03-15 Thread santhosh venkat
Hi,


This is an official CANCEL for the SEP-11: Host affinity in standalone vote.


Thanks.


[VOTE] SEP-11: Host affinity in standalone.

2018-02-20 Thread santhosh venkat
Hi everyone,

This is a voting thread for  SEP-11: Host affinity in standalone.
For reference, here is the wiki link:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75957309

Link to discussion mail thread:
http://mail-archives.apache.org/mod_mbox/samza-dev/201802.mbox/%3CCAFvExu2hrUhnjknWoH3er=q4agh3pzqjtbqwrp9xpdijawv...@mail.gmail.com%3E

Thanks.


Re: [DISCUSS] SEP: Host affinity in standalone.

2018-02-06 Thread santhosh venkat
Boris,

Thanks for your review. Responses are inline.

>> I think we also need MetadataStorage one (details may be worked out
later) to hide the locality storage implementation details.

  - Agree, my initial proposal had a LocalityManager interface, which
was logically similar to MetadataStore. I changed the interface name to
MetadataStore and updated it everywhere in the proposal.


   >>  Instead of using physical hostname we should stick to the
LocationId, since some VMs may be running multiple processors on a single
physical host.

  - Agree. This is taken into account and we have a pluggable
abstraction to generate it for different execution environments.


   >> I think, though, using function names doesn't give enough clarity on
what is going on. May be we should add more explanation.

  - Updated the proposal based upon this comment.



   >> First diagram describes how local storage works. Please label it as
such.

  - Updated the proposal to address this comment.


   >> Some time the perfect mapping to the same Locality is not
possible(especially when a task dies and is distributed between other
tasks).

  - Yes, this is the worst-case scenario, where the task will be
assigned to any processor when there are no live processors registered from
its preferred host.

Thanks.


Re: [DISCUSS] SEP: Host affinity in standalone.

2018-02-06 Thread santhosh venkat
Yi,

Thanks for taking the time to review this and providing your feedback.
Responses inline.

>> 1) ContainerInfo: (containerId, physicalResourceId): this mapping is the
reported mapping from container processes in standalone, and preferred
mapping in YARN

>> 2) TaskLocality: (taskId, physicalResourceId):this mapping is actually
reported task location from container processes in both standalone and
YARN.

Yes, totally agree with the above points. I think the group interface contract
already reflects that. The locationId registered by live processors and the task
locality of the previous generation will be used to calculate the assignment of
the current generation in standalone. Preferred host mapping will be used for
task and processor locality in the case of YARN. Any new task/processor for
which the grouping is unknown (unavailable in preferred host/task-locality in
the underlying storage layer) will be treated as any_host during assignment.
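
To make the assignment rule above concrete, here is a rough sketch in Python. It is an illustration only and not the grouper interface from the proposal: the function name, the dictionary shapes, the ANY_HOST constant and the least-loaded tie-break are all assumptions made for the example.

```python
from collections import defaultdict

ANY_HOST = "ANY_HOST"  # hypothetical marker for tasks with no known locality


def assign_tasks(task_ids, previous_task_location, live_processors):
    """Assign tasks to live processors, preferring each task's previous location.

    task_ids: iterable of task names.
    previous_task_location: dict of task name -> locationId from the previous generation.
    live_processors: dict of processorId -> locationId registered by that processor.
    Returns a dict of processorId -> list of task names.
    """
    if not live_processors:
        raise ValueError("no live processors to assign tasks to")

    processors_by_location = defaultdict(list)
    for processor_id, location_id in sorted(live_processors.items()):
        processors_by_location[location_id].append(processor_id)

    assignment = {processor_id: [] for processor_id in live_processors}

    def least_loaded(candidates):
        # Fewest tasks first; processor id breaks ties deterministically.
        return min(candidates, key=lambda pid: (len(assignment[pid]), pid))

    deferred = []
    for task in sorted(task_ids):
        location_id = previous_task_location.get(task, ANY_HOST)
        candidates = processors_by_location.get(location_id)
        if candidates:
            # A live processor is registered at the task's previous location.
            assignment[least_loaded(candidates)].append(task)
        else:
            # Unknown locality, or no live processor there: treat as any host.
            deferred.append(task)

    for task in deferred:
        assignment[least_loaded(list(assignment))].append(task)
    return assignment
```

Placing the locality-matched tasks first and the any_host tasks second keeps the second pass free to even out the load across processors.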

I don't think it's a good idea to unify the locality storage formats
between YARN and standalone as a part of this change (which will require an
elaborate migration plan and extensive testing). I think it's fair to
consider it out of scope for this proposal.

>>  Should the leader validate that everyone has picked up the new version
of JobModel and reported the correct task-locality expected in the
JobModel, after step 10 in the graph?
 Though it's an extra precaution taken by the leader to ensure correctness, I
think it might be a premature optimization. Even in the existing setup in YARN,
we don't have corresponding validations by the ApplicationMaster after
generations. I think it's fair to keep the behavior consistent.

>> Why are we missing a processor-to-locationId mapping in the zknode data
model?
It is stored as part of the value of each live processor node. I had it in my
initial proposal, but received feedback to include only the things that I'm
changing in the existing setup (hence I removed it). Added it back now.

>> Also, why don’t we write locationId as a value to task0N znode, instead
of a child node?
It is stored as the value of the task zookeeper node. I was unable to
represent that pictorially in the zookeeper hierarchical model (hence I showed it
one level down, like a child). Added corresponding descriptions to the
data model to make it clear.

>> And which znode is the distributed barrier that you used in the graph?
This was removed after initial feedback (suggesting that I add only the stuff I'm
changing in the data model). Added the barrier zookeeper node to the data model for
clarity.

I think we are on the same page about most of the choices made in this
proposal. If there are other major concerns/feedback, let's discuss offline.

Thanks.


Re: [DISCUSS] SEP: Host affinity in standalone.

2018-02-06 Thread santhosh venkat
Hi Yi,

Thanks for your feedback. Responses inline.


> It seems like we can deprecate the whole BalancingTaskNameGrouper
altogether.

   - Yes, that’s part of the proposed interface changes.

>  That also means that you will somehow store the task-to-container
mapping info in the locality znode as well. It would be nice to make it
clear how the task-to-container-to-physical-resource mapping is stored and
read in ZK.

- I think the task assignment (task to processor) mapping is stored
in the most recent version of the JobModel in zookeeper. I don’t see the value in
duplicating it in the Locality znode as well (which would burden us with
maintaining consistency between the same data stored in two places). I want to
derive the container to locationId mapping from the task to Locality mapping
in the locality zookeeper node and the container to task mapping available in
the latest JobModel. When new processors join, they will not have any previous
task assignment. Any new tasks added by changing the SSPGrouper will not have
any previous host assigned (they will be open to be distributed to any available
processor in the group). Please share your thoughts. I will update the
proposal if there's a consensus on this.
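
To make the derivation concrete, here is a small sketch of joining the two mappings. The function name and the majority-vote rule for containers whose tasks disagree are assumptions for illustration only; the proposal does not prescribe how such conflicts are resolved.

```python
from collections import Counter


def derive_container_locations(container_to_tasks, task_to_location):
    """Derive container -> locationId by joining two existing mappings.

    container_to_tasks: dict of container_id -> list of task_ids (latest JobModel)
    task_to_location: dict of task_id -> location_id (locality zookeeper node)
    """
    locations = {}
    for container_id, task_ids in container_to_tasks.items():
        known = [task_to_location[t] for t in task_ids if t in task_to_location]
        if not known:
            # New container or new tasks: no previous locality is available.
            locations[container_id] = None
            continue
        counts = Counter(known)
        best = max(counts.values())
        # Assumption: pick the most common location, ties broken alphabetically.
        locations[container_id] = min(
            loc for loc, n in counts.items() if n == best
        )
    return locations
```

A container with no previously located tasks maps to `None`, which corresponds to the "open to any available processor" case described above.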


> Why are we missing a processor-to-locationId mapping in the zknode data
model?

- Planning to derive it from the task-to-locationId mapping
in the locality zookeeper node and the container-to-taskId mapping in the latest job
model.


>  Also needs to include compatibility test after deprecating/changing
TaskNameGrouper API, make sure the default behavior of default groupers is
the same.

- Added it to the compatibility section.


>  In Compatibility section, remove the future versions (i.e. 0.15) since
we are not sure when this change can be completed and released yet. It
seems that we are not changing the persistent data format for locality info
in coordinator stream. Make it explicit.

- Updated the compatibility section.

>  If you are making LocalityManager class an interface, are you planning
to make it pluggable as well? Actually, I was thinking that the model we
wanted to go is that making the metadata store for locality info an
interface and pluggable, while keep the LocalityManager as a common
implementation.

- Yes, LocalityManager is a pluggable interface (there will be two
implementations, one for coordinator-stream and the other for zookeeper). I
think you’re proposing the same thing as in my change, but with a
different interface name (MetaStore instead of LocalityManager). I don't
think there'll be any value in the LocalityManager class at all once we have
the meta-store interface.

>  what’s the definition of LocationId? An interface? An abstract class?
   - It’s a unique identifier which represents a
virtual-container/physical-host (any physical execution environment) in
which a processor runs. It’s a pluggable interface (much like
ProcessorIdGenerator). In the case of YARN, the physical hostname will be used as
the locationId. In standalone, a combination of slice-name, slice-id, and
instance-id will be used as the locationId.
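
A minimal sketch of the two flavors of locationId described above. The class and function names, and the "-" separator used to combine the standalone fields, are hypothetical and chosen only for the example.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationId:
    """Opaque identifier of the execution environment a processor runs in."""
    value: str


def yarn_location_id(hostname=None):
    # In YARN the physical hostname serves as the locationId.
    return LocationId(hostname or socket.gethostname())


def standalone_location_id(slice_name, slice_id, instance_id):
    # Assumption: the three fields are joined with "-"; the proposal only
    # says that a combination of them is used.
    return LocationId(f"{slice_name}-{slice_id}-{instance_id}")
```

Because the dataclass is frozen, two ids built from the same fields compare equal and can be used as dictionary keys when grouping processors by location.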

> For Semantics of host affinity with run.id, “all stream processors of a samza
application stops” = one run is not true for YARN deployed Samza
applications.
  - Updated the proposal based upon this feedback.

> Lastly, from the grouper API definition, you will not be able to get the
physical location info, if it is not passed in via
currentGenerationProcessorIds or ContainerModel. How are you going to
resolve that w/o creating a LocalityManager in Grouper implementation
class? I would strongly recommend not to create an instance of
LocalityManager in the Grouper implementation class.
   - LocationId is part of the ContainerModel class and will be
used to propagate the previous run's locality information.

> if the input topic partitions change, or the SSP grouper changes, the
task-to-ssp mapping changes and the task locality may not make sense at
all. Is this considered out-of-scope.

   - SystemStreamPartitionCountMonitor is out of scope for
this proposal.
I think we can detect a change in the input streams' partitions by comparing the
previous job model version with the current number of partitions of the input
streams, and then purge the task locality information. After SEP-5, the existing
task locality information can be reused.
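
The check described above could look roughly like the following sketch. Both function names are hypothetical, and representing the partition counts as plain dicts is an assumption; in practice the previous counts would come from the previous JobModel version.

```python
def changed_streams(previous_partition_counts, current_partition_counts):
    """Return the input streams whose partition count differs between versions."""
    streams = set(previous_partition_counts) | set(current_partition_counts)
    return sorted(
        s for s in streams
        if previous_partition_counts.get(s) != current_partition_counts.get(s)
    )


def purge_locality_if_changed(task_locality, previous_counts, current_counts):
    """Drop all task locality when any input stream's partition count changed."""
    if changed_streams(previous_counts, current_counts):
        return {}
    return dict(task_locality)
```

Streams that were added or removed between the two versions also count as changed, since their partition count is missing on one side.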

> not quite clear to me that what’s the distributed barrier is used for in
the graph. For every container process to pick up a certain version of
JobModel? Who is waiting on the barrier? The leader or the followers? Or
everyone?
- Both the leader and the followers wait on the barrier to agree upon a
JobModel. I will add that to the state diagram.
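
As an in-process analogy only (the real barrier lives in zookeeper, which this sketch does not touch), the snippet below uses Python's `threading.Barrier` to show every participant, leader and followers alike, blocking until all of them have reached the same JobModel version. The function name and the timeout are assumptions.

```python
import threading


def run_generation(processor_ids, job_model_version, timeout_seconds=5.0):
    """Have every processor wait on one barrier before adopting a JobModel version."""
    barrier = threading.Barrier(len(processor_ids))
    agreed = {}
    lock = threading.Lock()

    def participant(processor_id):
        # Leader and followers all block here until everyone has arrived.
        barrier.wait(timeout=timeout_seconds)
        with lock:
            agreed[processor_id] = job_model_version

    threads = [
        threading.Thread(target=participant, args=(p,)) for p in processor_ids
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return agreed
```

No participant records the new version until all of them have reached the barrier, which is the property the state diagram relies on.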

Thanks.


Re: Welcome Xinyu as new Samza PMC!

2018-01-17 Thread santhosh venkat
Congrats Xinyu.

On Wed, Jan 17, 2018 at 1:10 PM, Daniel Nishimura 
wrote:

> Congrats Xinyu!!
>
> On Wed, Jan 17, 2018 at 11:55 AM, Bharath Kumarasubramanian <
> bkumarasubraman...@linkedin.com> wrote:
>
> > Congratulations Xinyu! Well deserved
> >
> > On 1/17/18, 11:23 AM, "Fred Haifeng Ji"  wrote:
> >
> > Congratulations Xinyu!
> >
> > Fred
> >
> > On Wed, Jan 17, 2018 at 10:53 AM, Aditya  wrote:
> >
> > > Congrats Xinyu!
> > >
> > > On Wed, Jan 17, 2018 at 10:42 AM, Prateek Maheshwari <
> > > pmaheshw...@linkedin.com> wrote:
> > >
> > > > This is great news. Congrats Xinyu, and thanks for your
> > contributions!
> > > >
> > > > > On Jan 17, 2018, at 10:39 AM, Srinivasulu Punuru <
> > s...@outlook.com>
> > > > wrote:
> > > > >
> > > > > Congrats Xinyu, Very well deserved!
> > > > >
> > > > > 
> > > > > From: Jagadish Venkatraman 
> > > > > Sent: Wednesday, January 17, 2018 10:37:46 AM
> > > > > To: dev@samza.apache.org
> > > > > Subject: Re: Welcome Xinyu as new Samza PMC!
> > > > >
> > > > > Big Congrats Xinyu. Thanks for your continued contributions to
> > all
> > > > aspects
> > > > > of the project!
> > > > >
> > > > > On Wed, Jan 17, 2018 at 10:36 AM, Wei Song  >
> > wrote:
> > > > >
> > > > >> Congrats, Xinyu!
> > > > >>
> > > > >> --
> > > > >> Thanks
> > > > >> -Wei
> > > > >>
> > > > >>
> > > > >> On 1/17/18, 10:35 AM, "Navina Ramesh" 
> > wrote:
> > > > >>
> > > > >>Congratulations, Xinyu!
> > > > >>Thanks for all your contribution and looking forward to
> more
> > 
> > > > >>
> > > > >>
> > > > >>Cheers!
> > > > >>Navina
> > > > >>
> > > > >>
> > > > >>From: Yi Pan 
> > > > >>Sent: Wednesday, January 17, 2018 10:26:54 AM
> > > > >>To: dev@samza.apache.org
> > > > >>Subject: Welcome Xinyu as new Samza PMC!
> > > > >>
> > > > >>Finally all the documentation procedure is completed and
> > Xinyu Liu
> > > > has
> > > > >> been
> > > > >>officially promoted to Samza PMC member! This is well
> > deserved due
> > > to
> > > > >> his
> > > > >>continued contribution to the Samza project.
> > > > >>
> > > > >>Please join me to welcome Xinyu as our newest PMC member!
> > > > >>
> > > > >>Cheers!
> > > > >>
> > > > >>-Yi Pan
> > > > >>
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > > Jagadish V,
> > > > > Graduate Student,
> > > > > Department of Computer Science,
> > > > > Stanford University
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Haifeng (Fred)  Ji
> >
> >
> >
>


[DISCUSS] SEP: Host affinity in standalone.

2018-01-10 Thread santhosh venkat
Hi,


I created an SEP for SAMZA-1554: Host affinity in standalone.


The link to the SEP is here:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75957309


Please review; comments are welcome.


Thanks.


Re: [VOTE] Apache Samza 0.14.0 RC5

2017-12-28 Thread santhosh venkat
+1

Verified by running check-all.sh and integration tests successfully on OS-X.

On Thu, Dec 28, 2017 at 11:02 AM, Daniel Nishimura 
wrote:

> +1
>
> Verified check-all.sh and integration tests on Ubuntu 14.04 and OSX 10.11.
>
> On Thu, Dec 28, 2017 at 10:18 AM, xinyu liu  wrote:
>
> > +1 on my side.
> >
> > Verified by running check-all.sh and integration tests. They both passed.
> >
> > Thanks,
> > Xinyu
> >
> > On Thu, Dec 28, 2017 at 5:06 AM, Jagadish Venkatraman <
> > jagadish1...@gmail.com> wrote:
> >
> > > +1 (binding)
> > >
> > > Verified the RC. Ran *check-all.sh* and integration tests successfully
> on
> > > OS X. Thanks Xinyu, and everyone for driving Samza-0.14!
> > >
> > > On Thu, Dec 28, 2017 at 3:22 AM, Yi Pan  wrote:
> > >
> > > > +1 (binding).
> > > >
> > > > Verified the signature and MD5
> > > > Ran ./bin/check-all.sh on OSX
> > > > Ran integration tests on OSX
> > > > Verified ./gradlew releaseToolsTarGz generated samza-tools-0.14.0.tgz
> > in
> > > > build directory
> > > >
> > > > Thanks for all!
> > > >
> > > > -Yi
> > > >
> > > > On Fri, Dec 22, 2017 at 6:10 PM, Boris S  wrote:
> > > >
> > > > > Verified the signature.
> > > > > Ran build, tests and integration tests on Unix.
> > > > > All passed (as before requires python 2.7, neither higher nor
> lower).
> > > > >
> > > > > +1
> > > > > Thanks guys !!
> > > > >
> > > > > On Fri, Dec 22, 2017 at 2:50 PM, xinyu liu 
> > > > wrote:
> > > > >
> > > > > > This is a call for a vote on a release of Apache Samza 0.14.0.
> > Thanks
> > > > > > to everyone
> > > > > > who has contributed to this release.
> > > > > >
> > > > > > The release candidate can be downloaded from here:
> > > > > > http://home.apache.org/~xinyu/samza-0.14.0-rc5/
> > > > > >
> > > > > > The release candidate is signed with pgp key C31D7061, which can
> be
> > > > found
> > > > > > on
> > > > > > keyservers:
> > > > > > > http://pgp.mit.edu/pks/lookup?op=get&search=0x35964389C31D7061
> > > > > >
> > > > > > > The git tag is release-0.14.0-rc5 and signed with the same pgp
> key:
> > > > > > https://git-wip-us.apache.org/repos/asf?p=samza.git;a=tag;h=
> > > > > > refs/tags/release-0.14.0-rc5
> > > > > >
> > > > > > Test binaries have been published to Maven's staging repository,
> > and
> > > > > > are available
> > > > > > here:
> > > > > > https://repository.apache.org/content/repositories/
> > > orgapachesamza-1042
> > > > > >
> > > > > > 61 issues have been resolved as part of this release
> > > > > > https://issues.apache.org/jira/browse/SAMZA-1519?jql=project
> > > > > > %20%3D%20SAMZA%20AND%20fixVersion%20%3D%200.14.0%20AND%
> > > > > > 20status%20%3D%20Resolved
> > > > > >
> > > > > > The vote will be open for 72 hours (ending at 15:00 PM Thursday,
> > > > > > 12/28/2017).
> > > > > >
> > > > > > Please download the release candidate, check the
> hashes/signature,
> > > > build
> > > > > it
> > > > > > and test it, and then please vote:
> > > > > >
> > > > > > [ ] +1 approve
> > > > > >
> > > > > > [ ] +0 no opinion
> > > > > >
> > > > > > [ ] -1 disapprove (and reason why)
> > > > > >
> > > > > > Thanks,
> > > > > > Xinyu
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Jagadish V,
> > > Graduate Student,
> > > Department of Computer Science,
> > > Stanford University
> > >
> >
>


Re: Samza 0.12.0 artifact for scala 2.10 in maven central

2017-03-22 Thread santhosh venkat
Hi Maskim,

+1 to your suggestion.

I've created a JIRA here : https://issues.apache.org/jira/browse/SAMZA-1163.
We will work on publishing a samza-0.12 build that supports Scala 2.10.

Thanks.


Re: Periodic cleanup of unused local stores

2016-09-08 Thread santhosh venkat
Hi Navina,

Thanks for the review and the comments. Please find my replies inline.

1. It is always very useful to provide more context to the reader, esp. in
explaining what the different terms mean (like host-affinity, tombstone
etc) and how it relates to the problem being described.

>> Updated the design doc with a glossary section, where the
terms are described briefly.

2. "The Host Affinity feature in Samza enables it to restore local state
from disk instead of bootstrapping the entire changelog" -> host-affinity
as a features only tries to bring-up the container in the same host as
before. This will help samza leverage the locally persisted store data. It
doesn't actually help it restore state in anyway.

>> I've rephrased it accordingly in the design doc.

3. "To achieve this, Samza stores local state for change logged stores in a
shared directory so it is not tied to a resource manager’s storage
structure and cleanup schedule." -> I think by shared directory, you are
referring to the yarn application's workspace. This shared workspace is
part of the NM, not the RM. You can rephrase this and additionally, provide
the logical path to the state stores.

>> Yes, it was mentioned incorrectly. I've fixed it in the design doc.

4. " Expose an API in samza-rest that" -> Can you elaborate on what the API
looks like?

>> This API would take in jobId and jobName as parameters
and return the preferred host for all the tasks in the job.

Request URL:  http://Host:Port/v1/jobs/{jobName}/{jobId}/containers

Sample json response

{
  "jobName" : "Job name",
  "jobId" : "Job id",
  "containers" : [
    {
      "name" : "Container name",
      "id" : "1",
      "tasks" : [
        {
          "name" : "Task name",
          "partitions" : ["Id 1", "Id 2"],
          "preferredHost" : "Host name"
        }
      ]
    }
  ]
}

Alternatively, granular APIs at the task and container levels could be exposed
rather than a single API returning the complete job model hierarchy. However, to
construct the complete job hierarchy with the granular APIs, the job's
coordinator stream has to be queried multiple times (once for each of the
containers and tasks), leading to performance problems.
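
For illustration, a client such as the monitor could flatten the sample response above into a task-to-host map as sketched below. The function name is hypothetical and the snippet parses an embedded copy of the sample instead of calling the endpoint, since the API is still only a proposal.

```python
import json

# Copy of the sample response from this mail; the values are placeholders.
SAMPLE_RESPONSE = """
{
  "jobName": "Job name",
  "jobId": "Job id",
  "containers": [
    {
      "name": "Container name",
      "id": "1",
      "tasks": [
        {
          "name": "Task name",
          "partitions": ["Id 1", "Id 2"],
          "preferredHost": "Host name"
        }
      ]
    }
  ]
}
"""


def preferred_hosts(response_text):
    """Flatten the job hierarchy into a dict of task name -> preferred host."""
    job = json.loads(response_text)
    hosts = {}
    for container in job.get("containers", []):
        for task in container.get("tasks", []):
            hosts[task["name"]] = task.get("preferredHost")
    return hosts
```

A single call returning the whole hierarchy lets the client build this map in one pass, which is the advantage over the granular APIs mentioned above.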


5. Is the rest-api to be invoked by the monitor for all jobs in the cluster
or all running jobs ? What is the criteria there? Please do mention them,
if any.

>> The monitor will use the rest-api for all the jobs in the cluster
that have host affinity enabled.


Updated Design doc is here:
https://issues.apache.org/jira/secure/attachment/12827691/DESIGN-SAMZA-656.pdf

Please let me know your thoughts.

Thanks.


Periodic cleanup of unused local stores

2016-09-01 Thread santhosh venkat
Currently in Samza, to enable reuse of the local store between restarts, the
local store is persisted outside of YARN’s working directory. However, there
is currently no mechanism to periodically clean up unused
local stores. Here is a proposal detailing a possible way to accomplish
this:

https://issues.apache.org/jira/secure/attachment/12826531/GCstalelocalstate.pdf

This is tracked in SAMZA-656. Any feedback/comments are welcome.

Thanks.