[GitHub] samza pull request #888: SAMZA-2078: Add zookeeper session metrics to ZkJobC...
GitHub user shanthoosh opened a pull request: https://github.com/apache/samza/pull/888 SAMZA-2078: Add zookeeper session metrics to ZkJobCoordinator. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shanthoosh/samza SAMZA-2078 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/samza/pull/888.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #888 commit ee796d57b4fe69856b1918a50f8c0393b3d6c87e Author: Shanthoosh Venkataraman Date: 2019-01-18T00:32:53Z SAMZA-2078: Add zookeeper session metrics to ZkJobCoordinator. ---
[GitHub] samza pull request #882: DiagnosticsAppender for log4j2
Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/882 ---
[GitHub] samza pull request #887: SAMZA-2076 RocksDbTableDescriptor should use Long t...
Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/887 ---
[GitHub] samza pull request #887: SAMZA-2076 RocksDbTableDescriptor should use Long t...
GitHub user weisong44 opened a pull request: https://github.com/apache/samza/pull/887 SAMZA-2076 RocksDbTableDescriptor should use Long type for TTL Samza uses millisec as config value, while in rocksDB it's defined as int32. It's currently defined as integer in RocksDbTableDescriptor, the range isn't large enough to match, and it should be of Long type. You can merge this pull request into a Git repository by running: $ git pull https://github.com/weisong44/samza SAMZA-2076 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/samza/pull/887.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #887 commit 3f7ed71f8f5d6efe3567f039734ac96bb12679b6 Author: Wei Song Date: 2018-07-25T01:34:06Z Added self to committer list commit 5cbf9af9c12b96c5773a37274538f8baea8347e1 Author: Wei Song Date: 2018-08-03T20:38:41Z Merge remote-tracking branch 'upstream/master' commit a15a7c9a231adcc1b441351963e15a7855ea5879 Author: Wei Song Date: 2018-08-06T18:49:30Z Merge remote-tracking branch 'upstream/master' commit aae0f380e49950bc36a18e34d08041d554fc13a7 Author: Wei Song Date: 2018-08-16T17:50:45Z Merge remote-tracking branch 'upstream/master' commit 0440f75fcd16061d8ff87e775d4a1e731936ece0 Author: Wei Song Date: 2018-08-20T18:42:56Z Merge remote-tracking branch 'upstream/master' commit 4782c61df5eb046a57c0401ab94c1d88dc6ba0a7 Author: Wei Song Date: 2018-08-21T19:47:29Z Merge remote-tracking branch 'upstream/master' commit f28b491de20cf18357f9f92b780fb3eba52f6301 Author: Wei Song Date: 2018-08-23T22:22:40Z Merge remote-tracking branch 'upstream/master' commit df2f8d7b6ebc714a42376d98a45db20343f0a0fd Author: Wei Song Date: 2018-08-27T16:10:59Z Merge remote-tracking branch 'upstream/master' commit de708f5e93bcdd352137db8c570469dd6a657ee5 Author: Wei Song Date: 2018-09-05T19:17:04Z Merge remote-tracking branch 'upstream/master' commit 5156239155e0a010242fd5f7d96de1892494e376 Author: Wei Song Date: 2018-09-11T22:27:45Z Merge remote-tracking branch 'upstream/master' commit eca0020411ef3997b947fb8ec7c30a400bd796df Author: Wei Song Date: 2018-09-18T04:33:54Z Merge remote-tracking branch 'upstream/master' commit 239a0950539899ea14d1bd40dd462172f477b735 Author: Wei Song Date: 2018-09-19T20:37:32Z Merge remote-tracking branch 'upstream/master' commit 41299b5b03ccd6135715a55e39210d2e8cd15fe3 Author: Wei Song Date: 2018-09-20T16:41:17Z Merge remote-tracking branch 'upstream/master' commit a6c94add7724c0f1a41873e2c2050a9088ec0850 Author: Wei Song Date: 2018-09-20T17:39:31Z Merge remote-tracking branch 'upstream/master' commit 1c6a2eae475127e79078ddd378f95f5bb5a8bd45 Author: Wei Song Date: 2018-09-21T21:44:40Z Merge remote-tracking branch 'upstream/master' commit 8ee784415a329468972e894d96194a8240071ff4 Author: Wei Song Date: 2018-09-25T00:31:24Z Merge remote-tracking branch 'upstream/master' commit e19b4dc9a56dd6d9f18d4fa461c9349d7c29c5e2 Author: Wei Song Date: 2018-09-25T22:09:11Z Merge remote-tracking branch 'upstream/master' commit ec7d84095142895ce83d3a33120138520d07d872 Author: Wei Song Date: 2018-09-26T19:30:01Z Merge remote-tracking branch 'upstream/master' commit 242d8442b7d0ba458c4b94dce52cebce14d79217 Author: Wei Song Date: 2018-09-26T21:15:41Z Merge remote-tracking branch 'upstream/master' commit c85604e0b1b977eda236e249baf40a718087cea7 Author: Wei Song Date: 2018-09-27T17:16:46Z Merge remote-tracking branch 'upstream/master' commit 1e5de45a84e8a55353174ac8afb86ab4d5ed44d0 Author: Wei Song Date: 2018-09-29T05:45:06Z Merge remote-tracking branch 'upstream/master' commit f5731b10b8772d318e34b0cd0687361ffa158dcd Author: Wei Song Date: 2018-10-01T20:33:10Z Merge remote-tracking branch 'upstream/master' commit 7706ab1f8f04cb3d90de23eed5d7dbb3d3b8bf3f Author: Wei Song Date: 2018-10-02T17:05:30Z Merge remote-tracking branch 'upstream/master' commit f748050528cadb9c7126232c24c4a9d66942bf6c Author: Wei Song Date: 2018-10-03T18:49:16Z Merge remote-tracking branch 'upstream/master' commit 05822f0a4001ab7723898ea5f4e61505ef1fe819 Author: Wei Song Date: 2018-10-04T22:52:17Z Merge remote-tracking branch 'upstream/master' commit 097958c897fb2a1ef48e2203b1c2433fe2ecd8d0 Author: Wei Song Date: 2018-10-05T22:17:18Z Merge remote-tracking branch 'upstream/master' commit a56c28dc09ada7c0783aa0c0b601c36bd63d989f Author: Wei Song Date: 2018-10-11T17:06:44Z Merge remote-tracking branch 'upstream/master' commit 2c679c3915c96a703c6f528d504ea99063e82401 Author: Wei Song Date: 2018-10-11T21:58:19Z Merge remote-tracking branch 'upstream/master' commit a06e8ec2088fe0a2737624957077d6260b73e1c6 Author: Wei Song Date: 2018-10-16T
[GitHub] samza pull request #880: SAMZA-2066 Refactor remote tables to separate vario...
Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/880 ---
[GitHub] samza pull request #751: Samza 1965: SAMZA 1.0 DOCUMENTATION FOR TEST Framew...
Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/751 ---
[GitHub] samza pull request #871: SAMZA-2056 : Adding a TaskMode in the TaskModel
Github user asfgit closed the pull request at: https://github.com/apache/samza/pull/871 ---
Re: InMemorySystemDescriptor ignores serde
Hi, Tom, First, your observation about current InMemorySystem is exactly right and thanks for raising this issue to the community! The current InMemorySystem came up with a tight coupling with the Samza test framework, which I believe put quite a lot of limitations on its uses, e.g. using InMemorySystem in prototyping. Currently we are working on ways to improve it so it can be used by normal user code. For your case, it seems to me the inconsistency of using serde caused the confusion. You point of being more consistent to support Serde in InMemory stream sounds reasonable to me. I have the same impression that InMemory can be treated the same way as other input streams. The initial rollout doesn't have this feature and I create the ticket https://issues.apache.org/jira/browse/SAMZA-2075 to track this. We will see whether we can get it there in the next release. Thanks, Xinyu On Thu, Jan 17, 2019 at 6:11 AM Tom Davis wrote: > Hey Sanil, thanks for the reply. I eventually figured that not > supporting serdes for in-memory streams was an intentional restriction, > I was just pointing out that it is inconsistent with earlier versions > since it was relatively easy to supply stream serdes directly before the > Descriptor API. > > I can't really send a test case along because it's all in Clojure and > uses a Clojure-based API wrapper I wrote for interacting with Samza. In > theory, the easiest test would be one where a Config contains the > property I mentioned; with that, you should be able to run a simple > pipeline that shows -- despite the NoOpSerde forced by > InMemorySystemDescriptor -- the input is serialized using that serde. > > Anyway, I'm not sure if it's worth the trouble. I get why you'd forgo > serialization for the in-memory system, it was just a handy way to test > my entire pipeline which contains a few non-trivial custom serdes. > > > Sanil Jain writes: > > > Hi Tom, > > > > InMemorySystem is a system that is supposed to only support NoOpSerde > since > > all the associated steams for this system are maintained in memory. In > > addition to this, if your test is using the Samza's Test Framework, it > will > > override any explicit serde configs specified for streams to NoOp. > > > > > > You are expected to supply deserialized objects to the in-memory system. > > > > > > In addition to that in your email you mentioned: > > > > > > {unformat} > > > > I had still specified in my config: > > > > streams.in-0.samza.msg.serde=integer > > > > > > Apparently, that *was* respected by some part of the system because > > integers were > > deserialized properly! Removing this configuration value results in my > > operator > > receiving a byte array since the in-memory system only uses NoOpSerde. > > > > {unformat} > > > > > > Can you send me a snippet of test you were trying to fix so that I can > > understand the problem better? > > > > > > Thanks > > > > Sanil > > > > On Tue, 8 Jan 2019 at 17:28, Tom Davis wrote: > > > >> I am in the process of updating a project to 1.0 and spent today > debugging > >> a > >> rather odd test failure. When using input/output streams with > IntegerSerde, > >> things worked fine -- however, using LongSerde, every message value was > 0! > >> I > >> eventually found that InMemorySystemDescriptor#getInputDescriptor > ignores > >> the > >> serde passed to it. However, I had still specified in my config: > >> > >> streams.in-0.samza.msg.serde=integer > >> > >> Apparently that *was* respected by some part of the system because > >> integers were > >> deserialized properly! Removing this configuration value results in my > >> operator > >> receiving a byte array since the in-memory system only uses NoOpSerde. > >> > >> This behavior appears inconsistent with the previous version of Samza. > The > >> old > >> `getInputStream` was passed a serde that was always used, but since the > new > >> version receives a Descriptor that has already discarded the serde, I am > >> forced > >> into assuming NoOpSerde everywhere, at least for testing purposes. > >> > >> Not the end of the world, but it does introduce an inconsistency between > >> the > >> in-memory system and any other -- one that requires a fair bit of domain > >> knowledge to avoid. > >> > >> As always, thanks for the great project! > >> >
Re: InMemorySystemDescriptor ignores serde
Hey Sanil, thanks for the reply. I eventually figured that not supporting serdes for in-memory streams was an intentional restriction, I was just pointing out that it is inconsistent with earlier versions since it was relatively easy to supply stream serdes directly before the Descriptor API. I can't really send a test case along because it's all in Clojure and uses a Clojure-based API wrapper I wrote for interacting with Samza. In theory, the easiest test would be one where a Config contains the property I mentioned; with that, you should be able to run a simple pipeline that shows -- despite the NoOpSerde forced by InMemorySystemDescriptor -- the input is serialized using that serde. Anyway, I'm not sure if it's worth the trouble. I get why you'd forgo serialization for the in-memory system, it was just a handy way to test my entire pipeline which contains a few non-trivial custom serdes. Sanil Jain writes: Hi Tom, InMemorySystem is a system that is supposed to only support NoOpSerde since all the associated steams for this system are maintained in memory. In addition to this, if your test is using the Samza's Test Framework, it will override any explicit serde configs specified for streams to NoOp. You are expected to supply deserialized objects to the in-memory system. In addition to that in your email you mentioned: {unformat} I had still specified in my config: streams.in-0.samza.msg.serde=integer Apparently, that *was* respected by some part of the system because integers were deserialized properly! Removing this configuration value results in my operator receiving a byte array since the in-memory system only uses NoOpSerde. {unformat} Can you send me a snippet of test you were trying to fix so that I can understand the problem better? Thanks Sanil On Tue, 8 Jan 2019 at 17:28, Tom Davis wrote: I am in the process of updating a project to 1.0 and spent today debugging a rather odd test failure. When using input/output streams with IntegerSerde, things worked fine -- however, using LongSerde, every message value was 0! I eventually found that InMemorySystemDescriptor#getInputDescriptor ignores the serde passed to it. However, I had still specified in my config: streams.in-0.samza.msg.serde=integer Apparently that *was* respected by some part of the system because integers were deserialized properly! Removing this configuration value results in my operator receiving a byte array since the in-memory system only uses NoOpSerde. This behavior appears inconsistent with the previous version of Samza. The old `getInputStream` was passed a serde that was always used, but since the new version receives a Descriptor that has already discarded the serde, I am forced into assuming NoOpSerde everywhere, at least for testing purposes. Not the end of the world, but it does introduce an inconsistency between the in-memory system and any other -- one that requires a fair bit of domain knowledge to avoid. As always, thanks for the great project!