Re: Samza yarn job - cannot bind to local host

2016-08-04 Thread Yi Pan
Hi, Shekar, Did you check your firewall configuration? Could you also paste your configuration, especially task.opts? -Yi On Wed, Aug 3, 2016 at 5:56 PM, Shekar Tippur wrote: > I am trying to submit a Samza job to yarn and I get an error: > > Exception in thread "main"

Re: Kafka Streams

2016-08-03 Thread Yi Pan
Hi, Nick, IMHO, the following points differentiate Samza from KStreams: - Stability of local state management. Samza supports durable local state and host-affinity for faster state recovery. 0.10.1 makes further progress in host-affinity to allow a) continuous check-pointing of state store;

Re: Different Serde for Store and Changelog

2016-08-03 Thread Yi Pan
Hi, Nick, Thanks a lot for the input. Does it work for you if you only encrypt the value? If that works, you won't have the problem w/ the order of keys in RocksDB store. Regarding the decryption cost, if you enable the cache store, most of the cache access is to get the deserialized objects.

Re: Review Request 50670: SAMZA-991: Continue to report SamzaAppMasterMetrics

2016-08-02 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50670/#review144562 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Aug. 1

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-08-02 Thread Yi Pan (Data Infrastructure)
settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 50667: SAMZA-989 - Update hello-samza to use the startup logger

2016-08-01 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50667/#review144372 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On Aug. 1

Re: Review Request 50583: SAMZA-954 Improve logging for Samza

2016-08-01 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50583/#review144369 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On July 28

Re: Review Request 50317: SAMZA-978: update md files to resolve inconsistent links, broken links and some confusing sentences

2016-08-01 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50317/#review144365 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On July 22

Re: Review Request 50614: SAMZA-970 - Problems with integration tests and SAMZA-987 - Preparing for 0.10.1 version release

2016-07-29 Thread Yi Pan (Data Infrastructure)
if we just overwrite the current 0.10 online doc site. - Yi Pan (Data Infrastructure) On July 29, 2016, 9:04 p.m., Navina Ramesh wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.

Re: Review Request 50614: SAMZA-970 - Problems with integration tests and SAMZA-987 - Preparing for 0.10.1 version release

2016-07-29 Thread Yi Pan (Data Infrastructure)
- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/50614/ > --- > > (Updated July 29, 2016, 9:04 p.m.) > > > Review

Re: Review Request 50451: SAMZA-981: Set consistent Kafka clientId for a job instance

2016-07-28 Thread Yi Pan (Data Infrastructure)
y we are passing in "" as clientId. Or, make the "" as a constant string w/ name like: NOT_USED_EMPTY_CONSUMER_CLIENT_ID - Yi Pan (Data Infrastructure) On July 26, 2016, 11:46 p.m., Xinyu Liu wrote: > > --- > This i

Re: Review Request 50318: SAMZA-979: Remove KafkaCheckpointMigration

2016-07-28 Thread Yi Pan (Data Infrastructure)
/reviews.apache.org/r/50318/ > --- > > (Updated July 25, 2016, 9:29 p.m.) > > > Review request for samza, Navina Ramesh and Yi Pan (Data Infrastructure). > > > Repository: samza > > > Description > --- > > KafkaCheckpointMigration is not needed anymore

Re: Review Request 50583: SAMZA-954 Improve logging for Samza

2016-07-28 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50583/#review144003 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On July 28

Re: Review Request 50527: SAMZA-970: fix integration tests

2016-07-27 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50527/#review143839 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On July 27

Re: SamzaSQL document required

2016-07-27 Thread Yi Pan
Hi, Ankita, There is no official release documentation for SamzaSQL yet. If you are referring to the paper in HPBDC this year by Milinda, it is based on several patches under development. I will start by listing the relevant JIRAs: - SAMZA-390: the over-arching ticket describing the view of SQL

Re: Review Request 50318: SAMZA-979: Fix for KafkaCheckpointMigration not registering source correctly

2016-07-22 Thread Yi Pan (Data Infrastructure)
0.9 directly to 0.11. And the code here is to migrate 0.9 jobs to 0.10 that have the changelog partition map from checkpoint to the coordinator stream. I would propose to remove the migration code completely. - Yi Pan (Data Infrastructure) On July 22, 2016, 1:24 a.m., Xinyu Liu wrote

Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

2016-07-20 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/49877/#review143037 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On July 19

Re: Review Request 50056: SAMZA-863: Multithreading changes

2016-07-20 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50056/#review142960 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On July 19

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-07-19 Thread Yi Pan (Data Infrastructure)
hing else in a package private namespace. > > 6. I suspect you want to limit the ability to create custom operators (at > > least if the assumption about how graph walking would work in bullet 2 > > holds), so StreamOperator's constructor probably needs to be package > > p

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-07-19 Thread Yi Pan (Data Infrastructure)
y generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review142421 ----------- On July 19, 2016, 6:04 p.m., Yi Pan (Data Infrastructure) wrote: > > --- > Th

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-07-19 Thread Yi Pan (Data Infrastructure)
the above points s.t. I can accommodate them in the design doc and the next update for this RB. - Yi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47835/#review142421 ---

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-07-19 Thread Yi Pan (Data Infrastructure)
://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 50082: SAMZA-973: Disk Quotas: clamp max delay, better measure processing time

2016-07-15 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/50082/#review142457 --- Ship it! +1. lgtm. - Yi Pan (Data Infrastructure) On July

Re: Review Request 49877: SAMZA-972: Holistic memory monitoring for SamzaContainer

2016-07-15 Thread Yi Pan (Data Infrastructure)
.java (line 63) <https://reviews.apache.org/r/49877/#comment208014> Is there any test for PosixBasedStatisticsGetter? Since that is the default, we need to add this to unit test. - Yi Pan (Data Infrastructure) On

Re: Review Request 48356: RFC: Samza as a library

2016-07-14 Thread Yi Pan (Data Infrastructure)
ig.java <https://reviews.apache.org/r/48356/#comment207880> I think that we should deprecate this one w/ job.coordinator.host-affinity.enabled. Maybe copying over this value and print a warning for now and remove completely later. - Y
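The "copy the value over and print a warning" step suggested above could look roughly like the sketch below. It is an illustration in Python rather than Samza code: `OLD_KEY` is a hypothetical placeholder because the deprecated key's name is cut off in this excerpt, `resolve_host_affinity` is an invented helper, and letting an explicitly set new key win over the old one is an assumption.

```python
import warnings

# Hypothetical placeholder: the deprecated key's real name is cut off in the excerpt.
OLD_KEY = "old.host-affinity.enabled"
NEW_KEY = "job.coordinator.host-affinity.enabled"


def resolve_host_affinity(config):
    """Return a copy of config in which NEW_KEY is filled in from OLD_KEY when needed."""
    resolved = dict(config)
    if OLD_KEY in resolved:
        warnings.warn(
            f"{OLD_KEY} is deprecated; use {NEW_KEY} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: an explicitly set new key wins; otherwise carry the old value over.
        resolved.setdefault(NEW_KEY, resolved[OLD_KEY])
    return resolved


migrated = resolve_host_affinity({OLD_KEY: "true"})
```

Once the deprecation period is over, the `if` branch can be deleted without touching callers, since they only ever read the new key.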

Re: Review Request 48356: RFC: Samza as a library

2016-07-14 Thread Yi Pan (Data Infrastructure)
-- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/48356/ > ----------- > > (Updated July 13, 2016, 9:58 p.m.) > > > Review request for samza, Chris Pett

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-07-13 Thread Yi Pan (Data Infrastructure)
/apache/samza/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 48393: Integrate Kerberos with JC UI refactoring (part 2).

2016-07-11 Thread Yi Pan (Data Infrastructure)
> On June 15, 2016, 7:22 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-core/src/main/java/org/apache/samza/clustermanager/AbstractContainerAllocator.java, > > line 158 > > <https://reviews.apache.org/r/48393/diff/3/?file=1417391#file1417391line158> > > &g

Re: [NEED COMMENTS] import-control & checkstyle plugin

2016-07-11 Thread Yi Pan
+1 on removing the import control. The original idea to include the checkstyle.xml is to enforce some coding style guidelines, not to strictly control the imports. W/ the outdated import control list, it practically does not serve the purpose... On Mon, Jul 11, 2016 at 4:02 PM, Navina Ramesh

Re: The best way to import data into kv store?

2016-07-11 Thread Yi Pan
"import_uid" topic, write to the kv store. > 4) In the task, when processing my realtime stream, read from kv store and > do the join. > > The "key point" is import data with bootstrap stream. > Is this your "batch-to-stream" approach mean? > > >

Re: flushing changelog & checkpointing

2016-07-06 Thread Yi Pan
Hi, Buvana, Please see answers below. On Tue, Jul 5, 2016 at 11:47 AM, Ramanan, Buvana (Nokia - US) < buvana.rama...@nokia-bell-labs.com> wrote: > > Does this mean that all writes to the disk for state store purposes will > be done at the checkpointing time (which is also the time Samza

Re: The best way to import data into kv store?

2016-07-06 Thread Yi Pan
Hi, Sining, There are a few questions to be asked s.t. we know your application use case better. 1) In what format is your old userid-db data? 2) Is the old userid-db data partitioned using the same key and the same number of partitions as you expect to consume in your Samza job? Generally

Re: A magic question

2016-06-29 Thread Yi Pan
Hi, Shaodong, Could you try to paste the graphs somewhere else? Apache mailing list seems to remove all the embedded images in your email. Hence, I can not see what your exact problem is. Thanks! On Wed, Jun 29, 2016 at 12:58 AM, 吴少东 wrote: > Hello everyone: > >

Review Request 49138: SAMZA-889: Change log not working properly with In memory Store

2016-06-23 Thread Yi Pan (Data Infrastructure)
e5a66a4770b9553a1cc48fbb505f52d123c6c754 samza-test/src/test/scala/org/apache/samza/storage/kv/TestKeyValueStores.scala 23f8a1a6bee8ef38e0640a4e90778e53d982deeb Diff: https://reviews.apache.org/r/49138/diff/ Testing --- Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 46287: Add a double serde.

2016-06-23 Thread Yi Pan (Data Infrastructure)
to update Util.scala class to add the support for "double" as a built-in serde factory name. - Yi Pan (Data Infrastructure) On April 15, 2016, 11:17 p.m., Jon Bringhurst wrote: > > --- > This is an automatically generat

Re: Review Request 49116: SAMZA-889: Change log not working properly with In memory Store

2016-06-22 Thread Yi Pan (Data Infrastructure)
/TestKeyValueStores.scala 23f8a1a6bee8ef38e0640a4e90778e53d982deeb Diff: https://reviews.apache.org/r/49116/diff/ Testing --- Local build w/ unit tests Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 48808: Rebase samza-41 with master

2016-06-20 Thread Yi Pan (Data Infrastructure)
> On June 20, 2016, 5:02 p.m., Yi Pan (Data Infrastructure) wrote: > > samza-core/src/main/scala/org/apache/samza/system/RangeSystemStreamPartitionMatcher.scala, > > line 52 > > <https://reviews.apache.org/r/48808/diff/1/?file=1421716#file1421716line52> > &

Re: Review Request 48808: Rebase samza-41 with master

2016-06-20 Thread Yi Pan (Data Infrastructure)
The same thing here. We need to at least document that usage of ssp matcher and broadcast stream together could be an issue. - Yi Pan (Data Infrastructure) On June 16, 2016, 6:40 p.m., Jagadish Venkatraman wrote: > > --- &g

Re: Review Request 48811: SAMZA-968 - SequenceFileHdfsFileWriter does not close file properly

2016-06-20 Thread Yi Pan (Data Infrastructure)
the file handle within the writer object and after calling writer.close(), make sure that the file is closed as well? - Yi Pan (Data Infrastructure) On June 16, 2016, 6:59 p.m., Benjamin Smith wrote: > > --- > This is an auto
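One way to read the suggestion above is that the writer owns the underlying file handle and releases it in its own `close()`. The sketch below illustrates that idea in Python with a hypothetical `RecordWriter` class and an in-memory stream; it is not the `SequenceFileHdfsFileWriter` code under review.

```python
import io


class RecordWriter:
    """Hypothetical writer that keeps the underlying stream so it can close it."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, record):
        self._stream.write(record + b"\n")

    def close(self):
        # Flush first, but release the handle even if the flush fails.
        try:
            self._stream.flush()
        finally:
            self._stream.close()


stream = io.BytesIO()
writer = RecordWriter(stream)
writer.write(b"k1,v1")
written = stream.getvalue()  # read back before close(); a closed BytesIO cannot be read
writer.close()
```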

Re: Review Request 48862: allow empty serde for SystemStream

2016-06-17 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48862/#review138282 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On June 17

Re: Bug in SequenceFileHdfsFileWriter

2016-06-16 Thread Yi Pan
Hi, Benjamin, Thanks a lot for reporting this! It makes sense from reading the posts. Could you open a JIRA? Are you interested in assigning it to yourself and contributing the fix? Thanks a lot again! -Yi On Thu, Jun 16, 2016 at 9:52 AM, Benjamin Smith < ben.sm...@ranksoftwareinc.com> wrote: > >

Re: Manually Commit Offsets?

2016-06-16 Thread Yi Pan
the info. > > Is there any way to 'pause' the job or stop processing kafka from inside a > StreamTask.process() method? That would work for me too. > > > Jeremiah Adams > Software Engineer > www.helixeducation.com > Blog | Twitter | Facebook | LinkedIn > > _________

Re: Manually Commit Offsets?

2016-06-15 Thread Yi Pan
anager to do this but cannot see how > to wire it into my StreamTask. > > > Jeremiah Adams > Software Engineer > www.helixeducation.com > Blog | Twitter | Facebook | LinkedIn > > > From: Yi Pan <nickpa...@gmail.com> > Se

Re: Review Request 48393: Integrate Kerberos with JC UI refactoring (part 2).

2016-06-15 Thread Yi Pan (Data Infrastructure)
needed in index.scaml file? - Yi Pan (Data Infrastructure) On June 13, 2016, 11:59 p.m., Jagadish Venkatraman wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.

Re: Manually Commit Offsets?

2016-06-14 Thread Yi Pan
Sorry. Correction: > 2) in your code, call TaskContext.commit() whenever you are ready to > checkpoint. > > *TaskCoordinator.commit()* > > On Tue, Jun 14, 2016 at 10:16 AM, Jeremiah Adams < > jad...@helixeducation.com> wrote: > >> We need to send messages to a remote service. I need to

Re: Manually Commit Offsets?

2016-06-14 Thread Yi Pan
Hi, Jeremiah, Samza does support manual checkpointing. You can follow the steps below: 1) turn off auto-commit by setting task.commit.ms=-1 2) in your code, call TaskContext.commit() whenever you are ready to checkpoint. We have applications in LinkedIn using this pattern to successfully
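The two steps above can be sketched as follows. This is a simulation in Python, not Samza code: Samza's task API is Java, and `FakeCoordinator`, `BatchingTask`, and `send_batch` are hypothetical stand-ins. It assumes `task.commit.ms=-1` is set in the job configuration, and it follows the later correction in this thread, which names `TaskCoordinator.commit()` as the call to make.

```python
class FakeCoordinator:
    """Hypothetical stand-in for Samza's TaskCoordinator; it only counts commit requests."""

    def __init__(self):
        self.commits = 0

    def commit(self):
        self.commits += 1


class BatchingTask:
    """Buffers messages and requests a checkpoint only after a batch was delivered."""

    def __init__(self, send_batch, batch_size=3):
        self.send_batch = send_batch  # hypothetical callable that talks to the remote service
        self.batch_size = batch_size
        self.pending = []

    def process(self, message, coordinator):
        self.pending.append(message)
        if len(self.pending) < self.batch_size:
            return
        # If send_batch raises, commit() is never reached, so nothing is
        # checkpointed for a batch that was not delivered.
        self.send_batch(list(self.pending))
        self.pending.clear()
        coordinator.commit()


delivered = []
coordinator = FakeCoordinator()
task = BatchingTask(delivered.append)
for i in range(7):
    task.process(i, coordinator)
```

Messages that were processed but not yet committed are read again after a restart, so with this pattern the remote call should tolerate duplicates.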

Re: No updates to some of the store changelog partitions

2016-06-13 Thread Yi Pan
Hi, David, Did you check the log to see whether there are any log lines indicating producer issues on the three partitions that you suspect? And could you also check whether you have auto-commit turned on? If your auto-commit is on and producer does not report any issue writing to the

Re: Review Request 37026: SAMZA-727: Support for Kerberos

2016-06-08 Thread Yi Pan (Data Infrastructure)
Need license header as well. - Yi Pan (Data Infrastructure) On June 8, 2016, 7:10 p.m., Chen Song wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://re

Re: Review Request 48182: SAMZA-958: Make store/cache thread safe

2016-06-08 Thread Yi Pan (Data Infrastructure)
erator behavior under the synchronized lock. It turns out that RocksDB access is safe under the synchronized lock and this test probably is no longer needed. - Yi Pan (Data Infrastructure) On June 3, 2016, 9:30 p.m., Xinyu Liu

Re: Review Request 37026: SAMZA-727: Support for Kerberos

2016-06-08 Thread Yi Pan (Data Infrastructure)
3> Same here. Should be getYarnKerberosPrincipal() and getYarnKerberosKeytab() - Yi Pan (Data Infrastructure) On June 8, 2016, 7:10 p.m., Chen Song wrote: > > --- > This is an automatically generated e-mail. To reply, visit:

Re: Review Request 48393: Integrate Kerberos with JC UI refactoring (part 2).

2016-06-08 Thread Yi Pan (Data Infrastructure)
ick question: is YarnAppState going to be a super set of SamzaAppState? Or reverse? I thought that w/ the refactoring, we are shooting for a reverse logic: SamzaAppState would be the super set that includes YARN-specific states? - Yi Pan (Data Infrastructure) On June 8, 2016, 4:03 a.m., Jagadish

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-06-07 Thread Yi Pan (Data Infrastructure)
Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-06-07 Thread Yi Pan (Data Infrastructure)
PRE-CREATION samza-sql-core/src/test/java/org/apache/samza/task/sql/UserCallbacksSqlTask.java PRE-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-06-07 Thread Yi Pan (Data Infrastructure)
--- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Update all values in RocksDB

2016-06-07 Thread Yi Pan
r level) why iteration through the entries can > be a slow process? > > Thanks, > David > > On Mon, Jun 6, 2016 at 2:34 PM, Yi Pan <nickpa...@gmail.com> wrote: > > > Hi, David, > > > > I would recommend to keep a separate table of closed sessions as a

Re: Update all values in RocksDB

2016-06-06 Thread Yi Pan
Hi, David, I would recommend keeping a separate table of closed sessions as a "queue", ordered by the time the session is closed. And in your window method, just create an iterator in the "queue" and only make progress toward the end of the "queue", and do a point deletion in the sessionStore,
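A rough sketch of the "queue of closed sessions" idea, written in Python with in-memory stand-ins for the two stores. The class and method names, the `retention` cutoff, and the `max_deletes` bound per window call are assumptions added for illustration; the message is cut off before it says when a queued session should actually be deleted.

```python
import bisect


class SessionCleaner:
    def __init__(self):
        self.session_store = {}  # session_id -> state; stand-in for the session KV store
        self.closed_queue = []   # sorted list of (close_time, session_id); stand-in for the "queue" table

    def close_session(self, session_id, close_time):
        # Keep the queue ordered by close time so cleanup never scans open sessions.
        bisect.insort(self.closed_queue, (close_time, session_id))

    def window(self, now, retention, max_deletes=100):
        """Delete sessions closed at or before now - retention, oldest first."""
        cutoff = now - retention
        deleted = 0
        for close_time, session_id in self.closed_queue[:max_deletes]:
            if close_time > cutoff:
                break  # every later queue entry was closed even more recently
            self.session_store.pop(session_id, None)  # point deletion in the session store
            deleted += 1
        del self.closed_queue[:deleted]
        return deleted


cleaner = SessionCleaner()
cleaner.session_store.update({"a": "state-a", "b": "state-b", "c": "state-c"})
cleaner.close_session("b", close_time=20)
cleaner.close_session("a", close_time=10)
removed = cleaner.window(now=25, retention=10)
```

Each window call touches only the head of the queue, so its cost depends on how many sessions have expired rather than on the total number of entries in the session store.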

Re: Review Request 48109: SAMZA-957 Avoid unnecessary KV Store flushes (part 3)

2016-06-02 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48109/#review136036 --- Ship it! lgtm. Thanks! - Yi Pan (Data Infrastructure

Re: Review Request 47835: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-05-27 Thread Yi Pan (Data Infrastructure)
-CREATION settings.gradle 4c1aa107a11d413777e69bc4e48847b811aff7d2 Diff: https://reviews.apache.org/r/47835/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Re: Review Request 47687: Merged patch 44920

2016-05-27 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47687/#review135346 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On May 23

Merged Kafka/Samza meetup @LinkedIn

2016-05-27 Thread Yi Pan
Hi, all, In order to organize better quality talks @ more frequent cadence, we are merging the Kafka/Samza meetups @ LinkedIn. The new meetup is titled Stream Processing Meetup @ LinkedIn and the first one was just announced on 6/15. Please see the details at the link here:

Re: Review Request 37026: SAMZA-727: Support for Kerberos

2016-05-24 Thread Yi Pan (Data Infrastructure)
> On May 24, 2016, 3:31 a.m., Yi Pan (Data Infrastructure) wrote: > > Hi, Chen, thanks for the update! One more comment: please rebase the patch > > against the latest master s.t. it can be applied cleanly. > > > > Thanks! > > Yi Pan (Data Infrastructure) wrot

Re: Review Request 37026: SAMZA-727: Support for Kerberos

2016-05-24 Thread Yi Pan (Data Infrastructure)
> On May 24, 2016, 3:31 a.m., Yi Pan (Data Infrastructure) wrote: > > Hi, Chen, thanks for the update! One more comment: please rebase the patch > > against the latest master s.t. it can be applied cleanly. > > > > Thanks! Before it is too late. I tried to reb

Re: Review Request 37026: SAMZA-727: Support for Kerberos

2016-05-23 Thread Yi Pan (Data Infrastructure)
to keep this code when the application exiting status is either "SUCCEEDED" or "KILLED" (defined in FinalApplicationStatus class). - Yi Pan (Data Infrastructure) On May 8, 2016, 12:50 a.m., Chen Song wrote: > > -

Re: Review Request 47620: SAMZA-951 - Improve event loop timing metrics

2016-05-23 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47620/#review134473 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On May 23

Re: Review Request 47620: SAMZA-951 - Improve event loop timing metrics

2016-05-23 Thread Yi Pan (Data Infrastructure)
comment regarding "metrics name compatibility". We have three interfaces: programming interface, configuration, and metrics. The backward incompatible changes in all three interfaces need to be stated in the release notes. - Yi Pan (Data Infrastructure)

Re: Java 8, RocksDB and Samza 0.10.0

2016-05-23 Thread Yi Pan
Awesome! @Louis, would you mind contributing a section to the FAQ for this issue? That would help all users who encounter this issue later. Thanks! -Yi On Fri, May 20, 2016 at 10:09 AM, Louis Calisi wrote: > We finally found the issue. The two links in my original email

Re: Review Request 45324: SAMZA-914: Initial draft for Java programming APIs on operators supporting DAGs

2016-05-22 Thread Yi Pan (Data Infrastructure)
t; > > > > Will Batch need to be templated with generics? Or is it really just a > > window cardinality? I have changed it to Limit class later. The whole idea is that this is describing a counter-threshold based triggering condition. > On May 20, 2016,

Re: Samza job killed but left orphaned on YARN

2016-05-19 Thread Yi Pan
Hi, David and all, The "ultimate" solution is probably to implement SAMZA-871, which allows the Samza JobCoordinator to directly identify whether a container is alive or not w/o dependency on the cluster management systems. This is also considered

Re: Java 8, RocksDB and Samza 0.10.0

2016-05-19 Thread Yi Pan
Hi, all, Samza 0.10 was tested and validated using Java 8 and Redhat 6.6 in LinkedIn. The RocksDB native library issue was not seen in our runtime environment. We did see a unit test failure and were able to trace it back to the issue reported to RocksDB here:

Re: 0.10.1 Release

2016-05-12 Thread Yi Pan
Hi, Andy, We are doing some pre-release work at this moment. My rough estimation on 0.10.1 timeline would be about 1 month away. Thanks a lot! -Yi On Thu, May 12, 2016 at 9:56 AM, Andy Throgmorton wrote: > Hi, > > I'm wondering if anyone has a rough estimate on when

Re: Review Request 47251: SAMZA-852 - Better logging when system can not be created Always log the exception when we cant instantiate a producer or consumer

2016-05-11 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47251/#review132796 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On May 11

Re: Review Request 47247: SAMZA-947 - TaskAssignmentManager registration exception when partition count changes.

2016-05-11 Thread Yi Pan (Data Infrastructure)
rating with Xinyu's change. - Yi Pan (Data Infrastructure) On May 11, 2016, 8:26 p.m., Jake Maes wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.

Re: Review Request 47247: SAMZA-947 - TaskAssignmentManager registration exception when partition count changes.

2016-05-11 Thread Yi Pan (Data Infrastructure)
> On May 11, 2016, 8:07 p.m., Yi Pan (Data Infrastructure) wrote: > > It seems that all we needed to fix the issue is to remove the line to > > register the coordinator stream consumer? Did I miss any other things? > > Jake Maes wrote: > The key change is to remove

Re: Review Request 47197: SAMZA-948 CoordinatorSystemStreamConsumer is not threadsafe

2016-05-11 Thread Yi Pan (Data Infrastructure)
/coordinator/stream/CoordinatorStreamSystemConsumer.java (line 146) <https://reviews.apache.org/r/47197/#comment196987> nit: you can do a double check on isBootstraped here again to avoid doing another round of bootstrap in case two threads are contending on bootstrapLock(). - Yi Pan
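The double check suggested in this comment is the usual double-checked locking pattern: test the flag, take the lock, then test the flag again before doing the work. Below is a minimal Python sketch with a hypothetical `CoordinatorStreamConsumerSketch` class; it illustrates the pattern only and is not the code under review.

```python
import threading


class CoordinatorStreamConsumerSketch:
    """Hypothetical class showing a bootstrap step that must run only once."""

    def __init__(self):
        self._bootstrap_lock = threading.Lock()
        self.is_bootstrapped = False
        self.bootstrap_runs = 0

    def _do_bootstrap(self):
        self.bootstrap_runs += 1  # stands in for reading the stream to its end

    def bootstrap(self):
        if self.is_bootstrapped:  # first check, without the lock
            return
        with self._bootstrap_lock:
            # Second check: another thread may have finished the bootstrap
            # while this one was waiting for the lock.
            if self.is_bootstrapped:
                return
            self._do_bootstrap()
            self.is_bootstrapped = True


consumer = CoordinatorStreamConsumerSketch()
threads = [threading.Thread(target=consumer.bootstrap) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the second check, every thread that passed the first check before the flag was set would repeat the bootstrap once it acquired the lock. In Java the flag would typically also need to be `volatile` for the unlocked first read to be safe.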

Re: Review Request 47247: SAMZA-947 - TaskAssignmentManager registration exception when partition count changes.

2016-05-11 Thread Yi Pan (Data Infrastructure)
ent196984> There is no place that we change the registered flag. Why do we need to have it if registered is always false? - Yi Pan (Data Infrastructure) On May 11, 2016, 6:19 p.m., Jake Maes wrote: > > --- > This is an automatic

Re: Review Request 46287: Add a double serde.

2016-05-11 Thread Yi Pan (Data Infrastructure)
-in serde class to the document. One example is in the configure table. - Yi Pan (Data Infrastructure) On April 15, 2016, 11:17 p.m., Jon Bringhurst wrote: > > --- > This is an automatically generated e-mail. To reply, visit

Re: Kafka dependency

2016-05-10 Thread Yi Pan
ts to the release > when there is no official 0.10.0 release. > > Thanks! > Nick > > > -Original Message- > From: Yi Pan [mailto:nickpa...@gmail.com] > Sent: Tuesday, May 10, 2016 11:22 AM > To: dev@samza.apache.org > Subject: Re: Kafka dependency > > Hi, Ni

Re: Kafka dependency

2016-05-10 Thread Yi Pan
Hi, Nick, We do have a plan to update the Kafka dependency in Samza. However, Samza only uses Kafka client library. We have confirmed that any Kafka 0.8.2 clients should be supported by Kafka 0.9 brokers. Hence, it should not block you if you are thinking of upgrading Kafka broker versions (e.g.

Re: Review Request 46287: Add a double serde.

2016-05-10 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/46287/#review132481 --- Ship it! +1. lgtm! - Yi Pan (Data Infrastructure) On April

Re: samza job start takes 20 minutes to figure out the Checkpointed offset

2016-05-09 Thread Yi Pan
Hi, Bo, I embedded my answers in-between: On Sun, May 8, 2016 at 9:00 PM, Liu Bo wrote: > The other thing is log retention is set to 24 hour or 30GB. But seems not > working for checkpoint topic. As all the *.log file are there unlike the > data topic which only has recent

Re: Review Request 46644: SAMZA-889 - Change log not working properly with In memory Store

2016-05-06 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/46644/#review132109 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On May 6

Re: Review Request 47029: SAMZA-932 port collisions in JmxServer

2016-05-06 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/47029/#review132076 --- Ship it! LGTM! Thanks! - Yi Pan (Data Infrastructure

Review Request 47043: SAMZA-915: operator implementation - part 2

2016-05-05 Thread Yi Pan (Data Infrastructure)
samza-operator/src/main/java/org/apache/samza/operators/impl/window/WindowImpl.java PRE-CREATION Diff: https://reviews.apache.org/r/47043/diff/ Testing --- ./gradlew clean build Thanks, Yi Pan (Data Infrastructure)

Review Request 47042: SAMZA-915: linear pipeline programming APIs - part 1

2016-05-05 Thread Yi Pan (Data Infrastructure)
build passes locally Thanks, Yi Pan (Data Infrastructure)

Re: Recover from SamzaException thrown by KeyValueIterator.all()

2016-05-04 Thread Yi Pan
Hi, Jack, Unfortunately, this would happen for all stores that have a changelog configured, even if you try to iterate and remove the large records *before* they are flushed. The reason that you saw this in CachedStore.all() is that we call flush() in CachedStore when creating the iterator,

Re: Review Request 44422: SAMZA-41 Update current patch to target v0.10.0.

2016-05-02 Thread Yi Pan (Data Infrastructure)
ing template. - Yi Pan (Data Infrastructure) On March 5, 2016, 8:47 a.m., Jose Barrueta wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://re

Re: [Discuss] Moving Samza to Java 1.8 source compatibility.

2016-04-28 Thread Yi Pan
I am +1 on the JDK8 move. As Jake has elaborated, there are numerous advantages from 1.8 source compatible code. As for the downside of dropping JDK7 support, obviously, binary backward compatibility will be broken. However, moving to JDK8 binary is not a big effort for JDK7-compatible Java and

Re: Review Request 46732: SAMZA-930 fix issue with json deserialisation in YarnUtil

2016-04-28 Thread Yi Pan (Data Infrastructure)
/autoscaling/utils/YarnUtilTest.java:40: method def rcurly at indentation level 4 not at correct indentation, 2 :samza-autoscaling_2.10:checkstyleTest FAILED Could you help to address them? You can run ./gradlew clean check to re-pro the above errors. Thanks! - Yi Pan (Data Infrastructure) On April

Re: Review Request 44920: SAMZA-680 Refactor the Samza AppMaster to support other cluster managers

2016-04-28 Thread Yi Pan (Data Infrastructure)
/ContainerProcessManagerMetrics.scala (line 38) <https://reviews.apache.org/r/44920/#comment194855> nit: I thought that the formatting of function signature we are following the ones in the previous code? - Yi Pan (Data Infrastructure) On April 27, 2016, 6:29 p.m., Jagadish Venkatraman

Re: Review Request 46732: SAMZA-930 fix issue with json deserialisation in YarnUtil

2016-04-27 Thread Yi Pan (Data Infrastructure)
inning of this file. Thanks! - Yi Pan (Data Infrastructure) On April 27, 2016, 10:22 a.m., Alex Buck wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.

Re: Review Request 46732: SAMZA-930 fix issue with json deserialisation in YarnUtil

2016-04-27 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/46732/#review130870 --- Ship it! +1 lgtm. Thanks! - Yi Pan (Data Infrastructure

Re: Review Request 45258: Abandon producer retry after a certain # of errors : SAMZA-911

2016-04-27 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/45258/#review130847 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On April

Re: Review Request 45258: Abandon producer retry after a certain # of errors : SAMZA-911

2016-04-27 Thread Yi Pan (Data Infrastructure)
> On April 21, 2016, 5:21 a.m., Yi Pan (Data Infrastructure) wrote: > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala, > > line 44 > > <https://reviews.apache.org/r/45258/diff/2/?file=1346073#file1346073line44> > > > >

Re: Review Request 37026: SAMZA-727: Support for Kerberos

2016-04-27 Thread Yi Pan (Data Infrastructure)
hread pool size = 1, it would be clearer if we use Executors.newSingleThreadScheduledExecutor() samza-yarn/src/main/scala/org/apache/samza/job/yarn/YarnJob.scala (line 63) <https://reviews.apache.org/r/37026/#comment194739> nit: since submitApplication would require to access non yarnConfig

Re: Review Request 44920: SAMZA-680 Refactor the Samza AppMaster to support other cluster managers

2016-04-25 Thread Yi Pan (Data Infrastructure)
erId if it is not passed via command line here? samza-yarn/src/main/scala/org/apache/samza/job/yarn/refactor/SamzaAppMasterService.scala (line 1) <https://reviews.apache.org/r/44920/#comment194295> Maybe change it to YarnAppMasterWebService and remove the 'refactor' in the path

Re: Review Request 45258: Abandon producer retry after a certain # of errors : SAMZA-911

2016-04-20 Thread Yi Pan (Data Infrastructure)
/KafkaSystemProducer.scala (line 44) <https://reviews.apache.org/r/45258/#comment193405> It would be nice to add a producer configuration for the max retries. - Yi Pan (Data Infrastructure) On April 15, 2016, 1:32 a.m., Jagadish Venkatraman

Re: Samza 0.10.0 with Kafka 0.9.0.0

2016-04-04 Thread Yi Pan
Hi, Krishna, I just replied to Nick's question to dev list. Let me know if it makes sense to you or not. Thanks! -Yi On Mon, Apr 4, 2016 at 10:18 AM, Krishna wrote: > Hi Yi, > > Any update on Kafka 0.9 move on Samza 0.10.0 ? > > Thanks > > Krishna >

Re: Review Request 45388: SAMZA-919 - Samza - Add milliseconds and threadname to log4j config. Also switch DailyRollingFileAppender

2016-04-04 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/45388/#review126940 --- Ship it! Ship It! - Yi Pan (Data Infrastructure) On March

Re: Review Request 44604: split deployment logic

2016-04-01 Thread Yi Pan (Data Infrastructure)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/44604/#review126604 --- Ship it! LGTM! +1 - Yi Pan (Data Infrastructure) On March

Re: Review Request 44920: SAMZA-881

2016-03-31 Thread Yi Pan (Data Infrastructure)
(line 158) <https://reviews.apache.org/r/44920/#comment189478> Not sure why we need a separate SamzaTaskManager?? samza-core/src/main/java/org/apache/samza/clustermanager/ClusterBasedJobCoordinator.java (line 173) <https://reviews.apache.org/r/44920/#com
