Re: Review Request 48356: RFC: Samza as a library

2016-07-11 Thread Fred Ji
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48356/#review141799 --- Thanks a lot for the RB! I only have a few INFO questions to

Re: Review Request 48356: RFC: Samza as a library

2016-07-11 Thread Navina Ramesh
To reply, visit: https://reviews.apache.org/r/48356/ (Updated July 12, 2016, midnight) Review request for samza, Chris Pettitt and

Re: [NEED COMMENTS] import-control & checkstyle plugin

2016-07-11 Thread Jacob Maes
I don't particularly mind that it doesn't get cleaned up. Theoretically if it was once reasonable for one class/package to be referenced within another, it will continue to be reasonable, even if the code no longer makes the reference. That said, import control has been a pain every time we add

Re: Review Request 48393: Integrate Kerberos with JC UI refactoring (part 2).

2016-07-11 Thread Yi Pan (Data Infrastructure)
> On June 15, 2016, 7:22 a.m., Yi Pan (Data Infrastructure) wrote:
> > samza-core/src/main/java/org/apache/samza/clustermanager/AbstractContainerAllocator.java, line 158
> >
> > I thought that we discussed

Re: [NEED COMMENTS] import-control & checkstyle plugin

2016-07-11 Thread Yi Pan
+1 on removing the import control. The original idea behind including the checkstyle.xml was to enforce some coding style guidelines, not to strictly control the imports. With the outdated import control list, it practically does not serve the purpose... On Mon, Jul 11, 2016 at 4:02 PM, Navina Ramesh

[NEED COMMENTS] import-control & checkstyle plugin

2016-07-11 Thread Navina Ramesh
Hi Samza devs, Lately, with major reworks such as standalone, multithreading, etc., it is getting harder to keep track of the package/class dependencies in import-control.xml. What I have noticed is that we don't bother removing class/package dependencies when they are no longer valid. This

Re: Review Request 48356: RFC: Samza as a library

2016-07-11 Thread Navina Ramesh
> On June 27, 2016, 6:53 p.m., Chris Pettitt wrote:
> > samza-core/src/main/java/org/apache/samza/processor/StreamProcessor.java, lines 125-126
> >
> > Don't we need to stop the container directly here?

Re: Review Request 48356: RFC: Samza as a library

2016-07-11 Thread Navina Ramesh
To reply, visit: https://reviews.apache.org/r/48356/ (Updated July 11, 2016, 10:47 p.m.) Review request for samza, Chris Pettitt

Re: Review Request 48356: RFC: Samza as a library

2016-07-11 Thread Navina Ramesh
To reply, visit: https://reviews.apache.org/r/48356/ (Updated July 11, 2016, 10:27 p.m.) Review request for samza, Chris Pettitt

Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

2016-07-11 Thread Jagadish Venkatraman
> On July 11, 2016, 6:47 p.m., Chris Pettitt wrote:
> > Very high level question: I assume you looked at `ps -o rss` and disqualified it for some reason. Could you elaborate as to why? `ps` itself is certainly more portable than procfs (though `-o rss` is not part of the POSIX
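
For readers following this thread, the two ways of reading a process's resident set size that the question contrasts can be sketched roughly as below. This is only an illustration, not the code under review in SAMZA-972: the function names are hypothetical, `rss_via_ps` assumes a `ps` that accepts `-o rss=` (common on Linux and BSD-derived systems, but as noted above not required by POSIX), and `rss_via_procfs` assumes a Linux `/proc/<pid>/status` file with a `VmRSS` line.

```python
import subprocess


def parse_ps_rss(output: str) -> int:
    """Parse the output of `ps -o rss= -p <pid>` (a bare number, in kB)."""
    return int(output.strip())


def parse_proc_status_rss(status_text: str) -> int:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:      2048 kB".
            return int(line.split()[1])
    raise ValueError("VmRSS line not found")


def rss_via_ps(pid: int) -> int:
    """Hypothetical helper: ask `ps` for the RSS of one process."""
    result = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_rss(result.stdout)


def rss_via_procfs(pid: int) -> int:
    """Hypothetical helper: read the RSS of one process from procfs (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_proc_status_rss(status_file.read())
```

The `ps` route costs a process spawn per sample, while the procfs route is a plain file read; whether either consideration mattered for the patch is not stated in the excerpt.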

Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

2016-07-11 Thread Xinyu Liu
To reply, visit: https://reviews.apache.org/r/49877/#review141760 Fix it, then Ship it!

Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

2016-07-11 Thread Jake Maes
To reply, visit: https://reviews.apache.org/r/49877/#review141747 Ship it!

Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

2016-07-11 Thread Chris Pettitt
To reply, visit: https://reviews.apache.org/r/49877/#review141745 Very high level question: I assume you looked at `ps -o rss` and

Re: Review Request 48243: SAMZA-961: Async tasks and multithreading model

2016-07-11 Thread Xinyu Liu
To reply, visit: https://reviews.apache.org/r/48243/ (Updated July 11, 2016, 5:30 p.m.) Review request for samza, Chris Pettitt,

Re: The best way to import data into kv store?

2016-07-11 Thread Yi Pan
Hi, Sining, Yes! What you did is exactly what I meant by "batch-to-stream job"! Enjoy Samza! -Yi
On Mon, Jul 11, 2016 at 8:50 AM, 李斯宁 wrote:
> Hi, Yi
> Thanks for your response.
> My old userid-db is stored in an HDFS folder, and I have found a way to import my userid

Re: The best way to import data into kv store?

2016-07-11 Thread 李斯宁
Hi, Yi. Thanks for your response. My old userid-db is stored in an HDFS folder, and I have found a way to import my userid data.
1) Create a MapReduce job to write the complete userid data to a Kafka topic; let's call it "import_uid".
2) In the Samza JoinTask's configuration, set "import_uid" as a

Re: flushing changelog & checkpointing

2016-07-11 Thread Jacob Maes
Hey Ramanan, Confirmed. It all happens in commit (at the "checkpoint interval"). The operations are executed serially for each task. The order is exactly as Yi described. The order was chosen with crashes in mind. That is, the checkpoint is not written until state has been updated and output
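
The ordering described in this reply can be illustrated with a small sketch. It is not Samza's actual API: `TaskSketch`, `commit`, and the step names are hypothetical, and the sketch assumes the truncated sentence refers to output being flushed before the checkpoint is written. The point it shows is only that a crash before the final step leaves the old checkpoint in place, so the affected messages would be reprocessed on restart rather than skipped.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskSketch:
    """Hypothetical stand-in for one task; not a Samza class."""
    name: str
    checkpoint: int = 0                      # last offset recorded as processed
    steps: List[str] = field(default_factory=list)

    def flush_state(self) -> None:
        self.steps.append("flush_state")

    def flush_output(self) -> None:
        self.steps.append("flush_output")

    def write_checkpoint(self, offset: int) -> None:
        self.steps.append("write_checkpoint")
        self.checkpoint = offset


def commit(tasks: List[TaskSketch], offset: int, crash_before_checkpoint: bool = False) -> None:
    """Run the commit steps serially for each task, writing the checkpoint last."""
    for task in tasks:
        task.flush_state()
        task.flush_output()
        if crash_before_checkpoint:
            # Simulated failure: state and output are flushed, checkpoint is not.
            raise RuntimeError("simulated crash before checkpoint")
        task.write_checkpoint(offset)
```

In this sketch, calling `commit([task], 10, crash_before_checkpoint=True)` raises before `write_checkpoint` runs, so `task.checkpoint` keeps its previous value.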

Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

2016-07-11 Thread Jagadish Venkatraman
To reply, visit: https://reviews.apache.org/r/49877/ Review request for samza, Boris Shkolnik, Chris Pettitt, Jake Maes, Yi Pan