Re: [VOTE] SEP-1: Semantics of ProcessorId in Samza

2017-03-30 Thread Yi Pan
deals with the > > underlying execution environment. > > > > Let me know if you have a different perspective on this. > > > > Cheers! > > Navina > > > > On Thu, Mar 30, 2017 at 9:42 AM, Yi Pan <nickpa...@gmail.com> wrote: > > > &g

Re: Steps to Upgrading Samza (0.9 to 0.12)

2017-03-30 Thread Yi Pan
Hi, Thomas, Sorry to hear that you were hit by the removal of migration in Samza 0.11. The reason we removed it is following a deprecate-removal policy in two versions. We are not aware that people still using 0.9 after we released 0.11 and were not expecting a direct upgrade from 0.9 to 0.12.

Re: [VOTE] SEP-1: Semantics of ProcessorId in Samza

2017-03-30 Thread Yi Pan
@Navina, Sorry to chime in late. One question: 1. Why is it in JobCoordinator, and why not in StreamProcessor class? Because JobCoordinator provides coordination service across many processors, an interface getProcessorId() in JobCoordinator is confusing regarding to which processorId we are

Re: Review Request 56913: SAMZA-1099: Documentation updates for Samza 0.12 release (for master branch)

2017-02-22 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56913/#review166454 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Feb. 22

Re: Review Request 56911: SAMZA-1099: Documentation updates for Samza 0.12 release (for 0.12.0 branch)

2017-02-22 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56911/#review166453 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Feb. 22

Re: Review Request 56911: SAMZA-1099: Documentation updates for Samza 0.12 release (for 0.12.0 branch)

2017-02-22 Thread Yi Pan (Data Infrastructure)
mza-yarn_2.10 0.11.0 docs/startup/hello-samza/versioned/index.md (line 63) <https://reviews.apache.org/r/56911/#comment238282> This section is talking about "dev" build of hello-samza, w/ PR#59 checked in, we should refer to hello-samza-0.13.0-SNAPSHOT-dist.tar

Re: Review Request 56909: Changes to hello-samza for the 0.12.0 release

2017-02-21 Thread Yi Pan (Data Infrastructure)
- Yi Pan (Data Infrastructure) On Feb. 22, 2017, 1:38 a.m., Jagadish Venkatraman wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache

Streams meetup @LinkedIn on 2/16

2017-02-16 Thread Yi Pan
Hi, all, Just a kind reminder that our first meetup in 2017 will be held tomorrow. Details here: https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/237171557/ Looking forward to seeing you all! -Yi

Re: [VOTE] Apache Samza 0.12.0 RC2

2017-02-15 Thread Yi Pan
Ran check-all and integration tests on Mac. Passed and verified the pgp key. P.S. I do see a non-consistent test hanging issue on my Mac in TestStreamProcessor test. Drilled in a bit and found out that the Kafka broker is not started serving correctly during the job initialization. It passed in

Join us in the first Streams meetup @LinkedIn in 2017!

2017-01-31 Thread Yi Pan
Hi, all, We have an exciting agenda setup in the Streams meetup @LinkedIn, to welcome the year 2017! It will be on Thursday, February 16, 2017, at 6pm in LinkedIn office in Sunnyvale. There will be talks from Uber, LinkedIn, and Optimizly on Kafka and Samza, talking about Kafka on SSD, async

Re: How to gracefully stop samza job

2017-01-17 Thread Yi Pan
; > 舒琦 > 地址:长沙市岳麓区文轩路27号麓谷企业广场A4栋1单元6F > 网址:http://www.eefung.com > 微博:http://weibo.com/eefung > 邮编:410013 > 电话:400-677-0986 > 传真:0731-88519609 > > > 在 2017年1月17日,17:18,Yi Pan <nickpa...@gmail.com> 写道: > > > > Hi, Qi, > &g

Re: How to gracefully stop samza job

2017-01-17 Thread Yi Pan
ot be collected, only the log of am > can be seen. > > > > > ShuQi > > 在 2017年1月16日,10:39,Liu Bo <diabl...@gmail.com> 写道: > > Hi, > > *container log will be removed automatically,* > > you can turn on yarn log aggregation, so that terminat

Re: How to gracefully stop samza job

2017-01-13 Thread Yi Pan
Hi, Qi, Sorry to reply late. I am curious on your comment that the close and stop methods are not called. When user initiated a kill request, the graceful shutdown sequence is triggered by the shutdown hook added to SamzaContainer. The shutdown sequence is the following in the code: {code}

[REPORT] Samza - Jan 2017

2017-01-13 Thread Yi Pan
## Description: - Apache Samza is a stream processing framework built on top of Apache Hadoop YARN and Apache Kafka. ## Issues: - there are no issues requiring board attention at this time ## Activity: - Streaming and SQL analytics on Samza in QCon SF'16 - Streaming meetup at LinkedIn in

Re: Review Request 54647: SAMZA-1073: job-level fluent API

2017-01-03 Thread Yi Pan (Data Infrastructure)
://reviews.apache.org/r/54647/diff/ Testing --- Thanks, Yi Pan (Data Infrastructure)

Re: [DISCUSS] Samza 0.12.0 release

2016-12-23 Thread Yi Pan
e jetty dependency to Jetty 9 from Jetty 8 > > Thanks, > > Fred > > On Fri, Dec 23, 2016 at 2:38 PM, Yi Pan <nickpa...@gmail.com> wrote: > > > lgtm, +1 > > > > On Fri, Dec 23, 2016 at 10:44 AM, santhosh venkat < > > santhoshvenkat1...@gmail.com&

Re: [DISCUSS] Samza 0.12.0 release

2016-12-23 Thread Yi Pan
lgtm, +1 On Fri, Dec 23, 2016 at 10:44 AM, santhosh venkat < santhoshvenkat1...@gmail.com> wrote: > Hi All, > > There have been quite a lot of new features added to master since 0.11 > release to warrant a new major release. At LinkedIn, we've done functional > and performance testing against

Re: Review Request 54647: WIP: job-level fluent API

2016-12-13 Thread Yi Pan (Data Infrastructure)
: https://reviews.apache.org/r/54647/diff/ Testing --- Thanks, Yi Pan (Data Infrastructure)

Review Request 54647: WIP: job-level fluent API

2016-12-12 Thread Yi Pan (Data Infrastructure)
/main/java/org/apache/samza/operators/spec/WindowOperatorSpec.java 2f5b1e76f2dfbecec44195142ffe309f46226d6c Diff: https://reviews.apache.org/r/54647/diff/ Testing --- Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 53000: SAMZA-1038: Update hello-samza master to use Samza 0.11.0

2016-11-15 Thread Yi Pan (Data Infrastructure)
000/#comment226032> Have we decided to release 0.11.1 as the next one? I thought that the next planned release (i.e. the latest here) is 0.12.0? - Yi Pan (Data Infrastructure) On Oct. 18, 2016, 10:17 p.m., Xinyu Liu

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-20 Thread Yi Pan (Data Infrastructure)
/InputJsonSystemMessage.java PRE-CREATION Diff: https://reviews.apache.org/r/47994/diff/ Testing --- ./gradlew clean build. Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-16 Thread Yi Pan (Data Infrastructure)
info. > > > > 3. Does the order of the Operators in the list have any meaning? e.g. > > does it implicitly define the order of processing, or is it just for > > consistency, or is the List used to allow duplicates? > > Yi Pan (Data Infrastruct

Re: [VOTE] Apache Samza 0.11.0 RC2

2016-10-11 Thread Yi Pan
Build, validated MD5, test w/ integration tests and passed. Thanks! +1 (binding) On Mon, Oct 10, 2016 at 4:07 PM, xinyu liu wrote: > Hey all, > > This is a call for a vote on a release of Apache Samza 0.11.0. Thanks to > everyone who has contributed to this release. We

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-08 Thread Yi Pan (Data Infrastructure)
-operator/src/test/java/org/apache/samza/task/BroadcastOperatorTask.java PRE-CREATION samza-operator/src/test/java/org/apache/samza/task/InputJsonSystemMessage.java PRE-CREATION Diff: https://reviews.apache.org/r/47994/diff/ Testing --- ./gradlew clean build. Thanks, Yi Pan (Data

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-08 Thread Yi Pan (Data Infrastructure)
hat is not enough to help understanding the layered representation of the DAG (from programming to representation to implementation). I will try to embed something in the code then. Closing this one since the first issue is similar and is kept open. - Yi --------

Re: [DISCUSS] [VOTE] Apache Samza 0.11.0 RC0

2016-10-05 Thread Yi Pan
Hi, guys, I found a few issues w/ the current release candidate: - integration test (./bin/integration-tests.sh) is broken due to reference to "0.11.0-SNAPSHOT" version of tgz files in the build directory. samza-test/src/main/python/configs/tests.json needs to be updated w/ official 0.11.0

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-05 Thread Yi Pan (Data Infrastructure)
> Do we also need the `offset` of the incoming message that flows through > > each of these operators? > > > > Ideally, the offset should be a part of the context.(since, this RB is > > just for the wire-up, I'm certainly open to doing it later.) You a

Re: Review Request 47994: SAMZA-915: implementation of StreamPipeline and operator runtime impl classes

2016-10-05 Thread Yi Pan (Data Infrastructure)
; > > > I wonder if this is a candidate for being made `final`? From what I can > > tell, this is not modified elsewhere. Make sense. Fixed. - Yi --- This is an automatically generated e-mail. To reply, visit: https

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-10-04 Thread Yi Pan (Data Infrastructure)
ct. I will remove them for now. - Yi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review151345 ------- On Oct. 4, 2016, 12:45 a

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-10-03 Thread Yi Pan (Data Infrastructure)
4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-10-03 Thread Yi Pan (Data Infrastructure)
d demo. We can move them to examples as you suggest later. - Yi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review151227 ---

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-10-03 Thread Yi Pan (Data Infrastructure)
> On Sept. 29, 2016, 10:02 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala, > > line 197 > > <https://reviews.apache.org/r/51142/diff/5-7/?file=1493810#file1493810line197> > > > >

Re: [Discuss] Moving Samza to Java 1.8 source compatibility.

2016-09-30 Thread Yi Pan
1.0. > > > will be a good starting point? Or should we wait until we are at 1.0. > > > > > > I think the users in the community need to provide feedback so we can > > make > > > progress accordingly. > > > > > > Thanks! > > > Navina >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
tps://reviews.apache.org/r/51142/#comment219062> Question: why do we need this in open source? Don't we already have a run-job.sh in open source that is general for any YARN application? - Yi Pan (Data Infrastructure) On Sept. 28, 2016, 9:57 p.m., Hai Lu

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
stMultiFileHdfsReader.java (line 17) <https://reviews.apache.org/r/51142/#comment219061> nit: I would recommend to test negative case where the offset is out-of-range as well. - Yi Pan (Data Infrastructure) On

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
nitely log a JIRA for improvement here though, since I heard from Venice team that Hadoop actually is less reliable than Kafka. - Yi Pan (Data Infrastructure) On Sept. 28, 2016, 9:57 p.m., Hai Lu wrote: > > --- > This is an automatically

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 14, 2016, 6:19 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala, > > line 38 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493812#file1493812line38> > > > >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
--- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51142/ > ----------- > > (Updated Sept. 28, 2016, 9:57 p.m.) > > > Review request for samza, Yi Pan (Data Infrastructu

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 14, 2016, 6:19 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala, > > line 66 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493810#file1493810line66> > > > >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 14, 2016, 6:19 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/MultiFileHdfsReader.java, > > line 59 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493808#file1493808line59> > > > &

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-29 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 1:37 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java, > > line 58 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493803#file1493803line58> >

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-28 Thread Yi Pan (Data Infrastructure)
/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-21 Thread Yi Pan (Data Infrastructure)
left out of the scope of the current RB. We can add the customized aggregate function easily when we figure out all to allow users to customize window operator's state later. - Yi --- This is an automatically generated e-mail. To reply,

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-21 Thread Yi Pan (Data Infrastructure)
AGG result. It's possible to do this by > > instantiating a new `SessionWindow` object but, it maybe a good idea to > > expose this as a builder in `Windows`. > > > > Let me know what you think. > > Yi Pan (Data Infrastructure) wrote: > Discussed offline. This can b

Re: Review Request 51126: SAMZA 998: Documentation updates for refactored Job Coordinator

2016-09-19 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51126/#review149579 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Sept

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-19 Thread Yi Pan (Data Infrastructure)
t: https://reviews.apache.org/r/47835/#review149545 ------- On Sept. 14, 2016, 8:53 a.m., Yi Pan (Data Infrastructure) wrote: > > --- > This is an automatically generated e

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-19 Thread Yi Pan (Data Infrastructure)
To reply, visit: https://reviews.apache.org/r/47835/#review148908 ----------- On Sept. 14, 2016, 8:53 a.m., Yi Pan (Data Infrastructure) wrote: > > --- > This is an au

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-19 Thread Yi Pan (Data Infrastructure)
te class need to be exposed? Interfaces are generally > > nicer to program against for extensibility and testing. Though in this case > > it looks like Window is ABC not an interface. > > Yi Pan (Data Infrastructure) wrote: > Window is an ABC, and is intended to be use

Re: Review Request 51962: SAMZA-1021 Remove the redundent poll waiting inside AsyncRunLoop blockIfBusy

2016-09-16 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51962/#review149278 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Sept

Re: Review Request 50174: SAMZA-977: User doc for samza multithreading

2016-09-16 Thread Yi Pan (Data Infrastructure)
ache.org/r/50174/#comment216843> Isn't this just talking about commit() is mutally exclusive to process/processAsync and window? We can simply state: - Checkpointing is guaranteed to only cover events that are fully processed. It is persisted in commit() method. - Yi Pan

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-16 Thread Yi Pan (Data Infrastructure)
n't appear to be testing anything (other than that types > > check). This RB is limited to APIs. I will add more assert/test code to ensure the internal variable/methods are setup/invoked correctly. > On Sept. 14, 2016, 7:03 p.m., Chris Pettitt wrote: > > samza-operator/src/tes

Re: Issue with consuming non-existent topics in 0.10.1

2016-09-14 Thread Yi Pan
Hi, Tommy, Could you open a JIRA for this one? Also, could you include the Kafka broker version in this test? Thanks! -Yi On Wed, Sep 14, 2016 at 6:06 AM, Tommy Becker wrote: > We are testing an upgrade to 0.10.1 from 0.9.1 and noticed a regression. > When starting a

Re: SIGSEGV in RocksDB when killing jobs

2016-09-14 Thread Yi Pan
Hi, Tommy, Thanks for reporting this. Definitely we can be more defensive in coding here. I just wonder what's the specific reason for you to call RocksDB store close() explicitly? As you see that SamzaContainer#shutdownStores already calling flush() and close() automatically. Does it work for

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
PRE-CREATION samza-sql-core/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
joinSource2) > > .join(joinSource3) > > .window(SessionWindows.into ()) > > .sink(SinkFunction) > > Yi Pan (Data Infrastructure) wrote: > As we discussed offline, there are more details in why the above join() > does not play well w/ t

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
king a better version of MessageStream.join() public. For now, let's pond on it a bit more before making the change. - Yi ------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review148787 ---

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-14 Thread Yi Pan (Data Infrastructure)
- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review148787 --- On Sept. 12, 2016, 5:53 p.m., Yi Pan (Data Infrastructure) wrote: > > --

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-14 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 1:37 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/AvroFileHdfsReader.java, > > line 24 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493806#file1493806line24> >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-14 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 1:37 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java, > > line 83 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493803#file1493803line83> >

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-14 Thread Yi Pan (Data Infrastructure)
> On Sept. 13, 2016, 12:33 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemAdmin.java, > > line 91 > > <https://reviews.apache.org/r/51142/diff/5/?file=1493800#file1493800line91> > > > > You

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-14 Thread Yi Pan (Data Infrastructure)
y re-formatting samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala (line 338) <https://reviews.apache.org/r/51142/#comment216370> nit: unnecessary re-formatting samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.s

Re: Review Request 50174: SAMZA-977: User doc for samza multithreading

2016-09-13 Thread Yi Pan (Data Infrastructure)
/documentation/versioned/container/event-loop.md (line 28) <https://reviews.apache.org/r/50174/#comment216325> What about thread-safety among multiple process() operations? I thought that multiple process() can be in multiple threads as well? - Yi Pan (Data Infrastructure) On Sept. 13, 2

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-12 Thread Yi Pan (Data Infrastructure)
roups). How does it work here? If we assume that "offset" here only refers to fileOffset, please clarify and discard this comment. - Yi Pan (Data Infrastructure) On Sept. 9, 2016, 1:34 a.m., Hai Lu wrote: > > -

Re: Review Request 51142: SAMZA-967: HDFS System Consumer

2016-09-12 Thread Yi Pan (Data Infrastructure)
one loop like below instead of two embedded loops: while (!isShutdown) { if (!reader.hasNext()) { break; } IncomingMessageEnvelope messageEnvelope = reader.readNext(); try { super.put() ... } catch () {

Re: Review Request 51613: Fix SAMZA-842

2016-09-12 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51613/#review148593 --- Ship it! lgtm! - Yi Pan (Data Infrastructure) On Sept. 2

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-12 Thread Yi Pan (Data Infrastructure)
> MessageStram.apply(new MyCustomFilter()) > > ``` > > > > The former doesn't read like English. Could be slightly better if the > > name was "filterWith" but apply() still feels best. > > Yi Pan (Data Infrastructure) wrote: >

Re: Review Request 50174: SAMZA-977: User doc for samza multithreading

2016-09-07 Thread Yi Pan (Data Infrastructure)
on/versioned/container/event-loop.md (line 50) <https://reviews.apache.org/r/50174/#comment215426> from *a* different user thread. - Yi Pan (Data Infrastructure) On Sept. 7, 2016, 5:16 p.m., Xinyu Liu wrote: > > --- >

Re: Review Request 51126: SAMZA 998: Documentation updates for refactored Job Coordinator

2016-09-07 Thread Yi Pan (Data Infrastructure)
5399> Same here. docs/learn/documentation/versioned/jobs/configuration-table.html (line 1597) <https://reviews.apache.org/r/51126/#comment215400> Same. - Yi Pan (Data Infrastructure) On Aug. 16, 2016, 2:18 a.m., Jagadish Venkatraman wrote: > > --

Re: Samza container hang on exception

2016-09-06 Thread Yi Pan
c21udUViWW8tSUVV > > On Fri, Sep 2, 2016 at 4:41 PM, Yi Pan <nickpa...@gmail.com> wrote: > > > Hi, Sining, > > > > You note is on a site that I don't have account/access and it requires > > sign-up. Can you share it via google doc, since you have a gmail

Re: checkpoint on flush of system producer

2016-09-06 Thread Yi Pan
Hi, Jarrad, Yes! You have found your answer! Looking forward to your implementation of SystemProducer. Just curious, what's the target output system that you are writing to? -Yi On Tue, Sep 6, 2016 at 9:01 AM, Jarrad, Ken wrote: > I think I have discovered the

Re: Samza container hang on exception

2016-09-02 Thread Yi Pan
1 want to fix by stopping the process if failed too much > >> > times. But the process is still there and hanging. > >> > > >> > On Mon, Aug 22, 2016 at 1:14 PM, 李斯宁 <lisin...@gmail.com> wrote: > >> > > >> >> Thanks so much, I'll

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-02 Thread Yi Pan (Data Infrastructure)
the type parameter, there is no way to find out the key type in build time and use that to define the type parameters for window and join functions. Let's discuss in person tomorrow. - Yi --- This is an automatically generated e-mail

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-09-02 Thread Yi Pan (Data Infrastructure)
are for the super simple cases. > > > > If the goal of this is to actually test the lambdas, then ignore this > > feedback. Sure. I will make the complex lambdas as predefined functions. I assume that you mainly refer to sink()? - Yi ---

Re: Samza Mesos

2016-08-31 Thread Yi Pan
am Ramachandrasekaran < > > sri.ram...@gmail.com> wrote: > > > > > Yi, > > > That's a good amount of history to know. I will take a look at 680 and > > then > > > see if I can implement something as well. If there's some stuff that's > > > a

Re: Samza Mesos

2016-08-31 Thread Yi Pan
Hi, Sriram, The story behind delaying the integration of SAMZA-375 is that there are tons of repeated code in SamzaAppMaster that exist in both samza-yarn and Mesos. W/o the change we recently made in SAMZA-680, we are going to copy the SamzaAppMaster code for every distributed execution system

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-29 Thread Yi Pan (Data Infrastructure)
;K, M> to SK. What would happen, for example, if > > Message<K0, M0> mapped to SK0 and then a call to stateCreator with > > Message<K0, M0> produced Entry<SK1, SS0>? Instead, it seems that we should > > just take SS in and produce SS out. You could mayb

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-26 Thread Yi Pan (Data Infrastructure)
ide external access to those method in org.apache.samza.operator.impl package? The only potential solution I can think of is to fold these methods into package private and expose to implementation classes via WindowOperator inside MessageStream. However, I will have to move Window related classes to

Re: Question on changelog partition mapping

2016-08-26 Thread Yi Pan
y implement. Would you consider accepting a PR that makes this > change to the standard groupers? It's just strange that the generated > partition mappings can vary like this, even for identical inputs. > > -Tommy > > > On 08/16/2016 03:04 PM, Yi Pan wrote: > > Hi, Tommy

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-26 Thread Yi Pan (Data Infrastructure)
gt; into a different store as well, for the reverse lookup. So, to summarize, yes, we will need buffered store for each input stream in the join. For one particular PartialJoinOperator in MessageStream<K, M1>, it will use the buffered store for MessageStream<K, M2> as the joinStore

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-26 Thread Yi Pan (Data Infrastructure)
/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 49212: RFC: SAMZA-855: Update kafka client to 0.10.0.0

2016-08-24 Thread Yi Pan (Data Infrastructure)
/KafkaCheckpointMigration.scala (line 37) <https://reviews.apache.org/r/49212/#comment213264> This whole class is deleted in another RB. Could you rebase w/ latest master? - Yi Pan (Data Infrastructure) On June 24, 2016, 7:45 p.m., Robert Crim

Re: [DISCUSS] Samza 0.11.0 release

2016-08-24 Thread Yi Pan
Hi, Nicolas, Could you explain to me why Samza is blocking you from upgrading your Kafka brokers to 0.10? At LinkedIn, we are running Samza 0.10 w/ Kafka 0.10 brokers. This is a valid combination since Kafka 0.10 brokers should be backward compatible w/ 0.8.2 clients (which is the version Samza

Re: Review Request 51250: Support container restart when partition count diff happens.

2016-08-23 Thread Yi Pan (Data Infrastructure)
s.t. some new containers can start earlier when a subset of old containers are killed (i.e. all new containers processing partitions that are not processed by old containers can start immediately). I would recommend to go w/ option a) first, and gradually moving toward b)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-23 Thread Yi Pan (Data Infrastructure)
-sql-core/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: kafka dependency version

2016-08-22 Thread Yi Pan
Hi, Gaurav, There is already an effort going on for this one: SAMZA-855 . It would be good if you can try out the patch. Thanks! -Yi On Mon, Aug 22, 2016 at 1:11 AM, Gaurav Agarwal wrote: > My initial attempt to build

Re: Review Request 48356: RFC: Samza as a library

2016-08-17 Thread Yi Pan (Data Infrastructure)
e nice. samza-core/src/main/java/org/apache/samza/configbuilder/RewriterConfig.java (line 41) <https://reviews.apache.org/r/48356/#comment212325> Can we keep it package-private? Any external reference to this method? - Yi Pan (Data Infrastructure) On Aug. 12, 2016, 3:29 a.m., Navina

Re: Review Request 51047: SAMZA-1000 Fix hello-samza documentation to not use latest branch by default

2016-08-16 Thread Yi Pan (Data Infrastructure)
hello-samza documentation include the checkout latest and make this publishToMavenLocal as mandatory, while removing both the checkout latest and the publishToMavenLocal in the released version of document? - Yi Pan (Data Infrastructure) On Aug. 15, 2016, 6:18 p.m., Jake Maes

Re: Question on changelog partition mapping

2016-08-16 Thread Yi Pan
; changelog mapping each time it runs, even if the number of tasks is the > same. Does that make sense? The code has changed some since 0.9.1 but > seems to have the same issue even in 0.10.1. > > -Tommy > > On 08/11/2016 06:12 PM, Yi Pan wrote: > > Hi, Tommy

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-15 Thread Yi Pan (Data Infrastructure)
/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Question on changelog partition mapping

2016-08-11 Thread Yi Pan
Hi, Tommy, Which version of Samza are you using? Since 0.10, the changelog partition mapping has been moved to the coordinator stream, not in the checkpoint topic any more. That said, I want to ask a few more questions to understand what you referred to as "non-deterministic" behavior. So,

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-11 Thread Yi Pan (Data Infrastructure)
/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 48356: RFC: Samza as a library

2016-08-10 Thread Yi Pan (Data Infrastructure)
> On July 15, 2016, 12:12 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-core/src/main/java/org/apache/samza/configbuilder/ConfigBuilder.java, > > line 107 > > <https://reviews.apache.org/r/48356/diff/8/?file=1441886#file1441886line107> > > > > H

Upcoming Streams Meetup @LinkedIn

2016-08-09 Thread Yi Pan
Hi, all, I am pleased to announce that LinkedIn invites you to attend a Streams Processing meetup on Tuesday, August 23 at our Mountain View campus. There will be speakers from LinkedIn, Confluent, and TripAdvisor.

Re: Review Request 50619: SAMZA-963: add KV storage engine timers to help identify the issues on kv stores and also add unit test

2016-08-08 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50619/#review145135 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Aug. 5

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-08 Thread Yi Pan (Data Infrastructure)
: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: [DISCUSS] [VOTE] Apache Samza 0.10.1 RC0

2016-08-07 Thread Yi Pan
by tomorrow. > > Thanks! > Navina > > > On Sun, Aug 7, 2016 at 7:53 PM, Yi Pan <nickpa...@gmail.com> wrote: > > > It has been more than 5 days and we have got 3 +1 (binding) and 5 +1 > > (non-binding) already. Can we conclude this vote? Thanks! > > > &

Re: [DISCUSS] [VOTE] Apache Samza 0.10.1 RC0

2016-08-07 Thread Yi Pan
It has been more than 5 days and we have got 3 +1 (binding) and 5 +1 (non-binding) already. Can we conclude this vote? Thanks! On Tue, Aug 2, 2016 at 1:10 PM, Boris Shkolnik wrote: > +1 (non-binding). > > Boris. > > On Mon, Aug 1, 2016 at 11:39 AM, Navina Ramesh >

Re: Review Request 50828: SAMZA-994 Fix StreamAppender to work with the refactored Job Coordinator

2016-08-05 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50828/#review144983 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Aug. 5

Re: Review Request 50619: SAMZA-963: add KV storage engine timers to help identify the issues on kv stores and also add unit test

2016-08-05 Thread Yi Pan (Data Infrastructure)
/main/scala/org/apache/samza/storage/kv/BaseKeyValueStorageEngineFactory.scala (line 135) <https://reviews.apache.org/r/50619/#comment211122> Why do we need to add the default function, if you already defined the default in KeyValueStorageEngine.scala? - Yi Pan (Data Infrastr

Re: Samza yarn job - cannot bind to local host

2016-08-04 Thread Yi Pan
mer.zookeeper.connect=host1:2181, > host2:2181,host3:2181 > > > > systems.kafka.consumer.auto.offset.reset=largest > > > > systems.kafka.producer.producer.type=sync > > > > # Normally, we'd set this much higher, but we want things to look snappy >

<    1   2   3   4   5   6   7   8   >