Re: [DISCUSS] SEP-23: Simplify Job Runner

2019-12-02 Thread Ke Wu
Hi Xinyu,

Please see my responses inline:

>   1. After this change, seems the original config-factory and config-path
>   are only used to supply parameters for submitting job. Is that the case?
>   Which configs are still needed in the submission?

Yes, only configs related to job submission are needed.

job.name, job.factory.class and yarn.package.path are the three configs
required at minimum for job submission, and they can be supplied via --config
instead.
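
To make the minimum set concrete, here is a small sketch in Python, purely for
illustration. It is not the logic in CommandLine.scala; the helper names and
the example values (job name, package path) are hypothetical. It collects
key=value pairs passed through repeated --config flags and reports which of
the three submission configs are missing.

```python
# Hypothetical sketch; not the actual CommandLine.scala behavior.
REQUIRED_SUBMISSION_KEYS = ("job.name", "job.factory.class", "yarn.package.path")


def parse_config_overrides(args):
    """Collect the key=value pair that follows each --config flag."""
    config = {}
    it = iter(args)
    for arg in it:
        if arg == "--config":
            pair = next(it, None)
            if pair is None or "=" not in pair:
                raise ValueError("--config expects key=value")
            key, value = pair.split("=", 1)
            config[key] = value
    return config


def missing_submission_keys(config):
    """Return the required submission keys that are absent from config."""
    return [key for key in REQUIRED_SUBMISSION_KEYS if key not in config]


# Example values below are placeholders, not taken from a real deployment.
args = [
    "--config", "job.name=my-job",
    "--config", "job.factory.class=org.apache.samza.job.yarn.YarnJobFactory",
    "--config", "yarn.package.path=hdfs://example/path/my-job.tgz",
]
config = parse_config_overrides(args)
```

With all three keys supplied, missing_submission_keys(config) returns an empty
list; dropping any one of them would list it as missing.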

>   2. For backward compatibility, does it still work if the user doesn't
>   specify the new ConfigLoader in the command line? The
>   PropertiesConfigLoader class seems requiring the path of the config after
>   exploding the tgz.

If the user does not specify a config loader in the config, the job follows
the previous flow: the runner publishes the configs to the coordinator stream,
and the job coordinator/application master picks them up by reading from
Kafka. So this is a backward-compatible change.
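
The fallback can be pictured with the following Python sketch. It is only an
illustration of the decision, not Samza code. The config key name
job.config.loader.factory is an assumption on my part here (please check the
SEP wiki for the exact key), and the function name and return labels are
hypothetical.

```python
# Hypothetical sketch of the backward-compatible choice between the two flows.
CONFIG_LOADER_KEY = "job.config.loader.factory"  # assumed key name


def choose_submission_flow(submission_config):
    """Pick the flow based on whether a config loader is configured."""
    if submission_config.get(CONFIG_LOADER_KEY):
        # New flow: the AM loads the full config itself after submission.
        return "load-on-am"
    # Previous flow: the runner publishes configs to the coordinator stream
    # and the job coordinator/AM reads them back from Kafka.
    return "coordinator-stream"
```

A job that never sets the loader key keeps getting "coordinator-stream", which
is why existing jobs should not need to change.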

>   3. If the final plan is to remove the original config factory/path, how
>   do we pass the parameters needed for Yarn submission, e.g. job name, id,
>   and tgz path?

We can either pass them via --config or introduce dedicated command line
arguments for them in CommandLine.scala.


Let me know if you have any further questions.

Best,
Ke

> On Nov 27, 2019, at 11:02 AM, Xinyu Liu  wrote:
> 
> Thanks a lot for putting out the design for simplifying the job submission
> process. The motivation makes sense to me that most of the planning and
> config generation should be done after submitting to the cluster, instead
> of during the submission, which can happen in a local sandbox without the
> access to the resources needed for planning. It also improves the process
> from a security standpoint.
> 
> A few questions regarding the interface changes:
> 
>   1. After this change, seems the original config-factory and config-path
>   are only used to supply parameters for submitting job. Is that the case?
>   Which configs are still needed in the submission?
>   2. For backward compatibility, does it still work if the user doesn't
>   specify the new ConfigLoader in the command line? The
>   PropertiesConfigLoader class seems requiring the path of the config after
>   exploding the tgz.
>   3. If the final plan is to remove the original config factory/path, how
>   do we pass the parameters needed for Yarn submission, e.g. job name, id,
>   and tgz path?
> 
> Thanks,
> Xinyu
> 
> On Fri, Nov 15, 2019 at 3:00 PM Ke Wu  wrote:
> 
>> We created SEP-23: Simplify Job Runner, which simplifies job runner by
>> moving config retrieval and planning to AM.
>> 
>> Please find the SEP wiki below:
>> 
>> https://cwiki.apache.org/confluence/display/SAMZA/SEP-23%3A+Simplify+Job+Runner
>> 
>> Please take a look and chime in with your thoughts.
>> 
>> Thanks,
>> Ke
>> 



Re: [VOTE] Apache Samza 1.3.0 RC2

2019-12-02 Thread Xinyu Liu
+1 (binding)

Verified the signatures, built, and ran the integration tests. All passed.
There was one flaky test failure while running check-all.sh:

org.apache.samza.table.batching.TestBatchProcessor$TestBatchTriggered.testBatchOperationTriggeredByBatchSize FAILED
    java.lang.AssertionError
        at org.junit.Assert.fail(Assert.java:86)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at org.junit.Assert.assertTrue(Assert.java:52)
        at org.apache.samza.table.batching.TestBatchProcessor$TestBatchTriggered.testBatchOperationTriggeredByBatchSize(TestBatchProcessor.java:122)

This shouldn't block the release, as the test is flaky. We should either fix
or disable this test for future releases. I created a ticket to track it:
https://issues.apache.org/jira/browse/SAMZA-2411

Thanks,
Xinyu



On Sun, Dec 1, 2019 at 6:20 PM Yi Pan  wrote:

> +1 (binding), verified the signature, built and local integration tests
> passed.
>
> Thanks!
>
> -Yi
>
> On Wed, Nov 27, 2019 at 2:49 PM Hai Lu  wrote:
>
> > Hi,
> >
> > This is a call for a vote on a release of Apache Samza 1.3.0. Thanks to
> > everyone who has contributed to this release.
> >
> > The release candidate can be downloaded from here:
> > http://home.apache.org/~lhaiesp/samza-1.3.0-rc2/
> >
> > The release candidate is signed with pgp key 0x07678C76, which can be found
> > here:
> >
> > https://keyserver.ubuntu.com/pks/lookup?search=0x07678C76&fingerprint=on&op=index
> >
> > or to directly see the public key here:
> >
> > https://keyserver.ubuntu.com/pks/lookup?op=get&search=0x1513eaedf69d7ca77ff283b534ea3ca507678c76
> >
> > The git tag is release-1.3.0-rc2 and signed with the same pgp key above:
> >
> > https://gitbox.apache.org/repos/asf?p=samza.git;a=commit;h=573ef951dd9d96d9d547db86bbc8023557714f47
> >
> > Test binaries have been published to Maven's staging repository, and are
> > available here:
> > https://repository.apache.org/content/repositories/orgapachesamza-1073
> >
> > The vote will be open for 171 hours (ending at 6:00 PM PST Wednesday,
> > 12/4/2019).
> >
> > Please download the release candidate, check the hashes/signature, build it
> > and test it, and then please vote:
> >
> > [ ] +1 approve
> >
> > [ ] +0 no opinion
> >
> > [ ] -1 disapprove (and reason why)
> >
> > I ran check-all.sh and integration tests (both YARN and standalone).
> >
> > +1 (non-binding) from my side.
> >
> > Thanks,
> > Hai
> >
>