Your discovery seems to have helped me, too. I'm not sure yet exactly what my problem was, but adding the chill-storm dependency and registering the BlizzardKryoFactory seems to have made it work. Now to go back and see if there's anything in my configuration that I added in desperation that's no longer necessary.
Thanks!

Mark

On Tue, Mar 3, 2015 at 6:03 PM, Matthew Waymost <[email protected]> wrote:

> I was able to solve my issue.
>
> Once I verified that all my simulation logic was valid, I started looking
> for reasons why the registrations in my decorator weren't being picked up.
> Knowing that this was a consistent behavior, as opposed to what I
> originally thought, helped greatly (thanks Bill).
>
> Ultimately, I found a component in chill that contains an implementation
> of KryoFactory. Substituting this for the default one storm provides solved
> my problem.
>
> In case someone else happens upon this with the same issue, I first had to
> add chill-storm as a dependency in sbt. Then I added the following to my
> topology's configuration:
>
>     conf.put("com.twitter.chill.config.configuredinstantiator",
>       "com.twitter.chill.ScalaKryoInstantiator")
>     conf.setKryoFactory(classOf[com.twitter.chill.storm.BlizzardKryoFactory])
>
> The first line tells chill which KryoInstantiator to use (it has many).
> Also, with this in place, the KryoDecorator I have, which I still needed
> for my custom serialization, worked fine as well.
>
> Matthew
>
> On Mon, Mar 2, 2015 at 12:22 PM, Brunner, Bill <[email protected]>
> wrote:
>
>> Yeah, I use the storm serializer out of the box… I had a chill
>> implementation a while back but didn’t notice much of an improvement in my
>> case. But my particular use case is not designed to be super fast so I
>> can’t really answer that irt a high performance system. I’ve only ever
>> run into serialization problems with scala maps and the filterKeys method,
>> which is documented as unserializable anyway (and simple enough to work
>> around).
>>
>> *From:* Matthew Waymost [mailto:[email protected]]
>> *Sent:* Monday, March 02, 2015 2:50 PM
>> *To:* [email protected]
>> *Subject:* Re: KryoDecorator not working when setNumWorkers > 1
>>
>> I didn't realize that locally storm would optimize to not serialize, but
>> that makes total sense and is extremely helpful to know.
>>
>> I've had issues in the past with kryo not properly serializing scala case
>> classes, and I've solved them by adding twitter/chill's scala registrations
>> before. So I assumed I would need the same thing here as I didn't see any
>> documentation indicating that they were already included.
>>
>> The custom serializer is for a class that uses MapProxy (which I need to
>> get away from using, admittedly). Neither kryo nor chill has handled
>> MapProxy properly in the past, so that's what the custom serializer is for.
>>
>> I'll definitely take a much closer look at my serialization logic and see
>> if I can isolate the problem there.
>>
>> Out of curiosity, do you typically use java's built-in serialization
>> instead of kryo? I've read and heard that it's very slow and inefficient,
>> so I'd be interested in hearing your experience.
>>
>> On Mon, Mar 2, 2015 at 6:49 AM, Brunner, Bill <[email protected]>
>> wrote:
>>
>> The reason your code is working locally or with a single worker is
>> because there is no reason for serialization to happen when everything is
>> contained in the same JVM. Once you add a worker, your parallelism hint
>> now has the opportunity to ship the tuples to another JVM, thus
>> serialization has to occur. So the issue is not with an increasing number
>> of workers, it’s with your serialization. I am using scala as well and
>> have yet to uncover an instance where I needed custom serialization… the
>> out of the box java serialization seems to work well.
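Since the failure described above only shows up once tuples cross a JVM boundary, one way to look for it without a multi-worker cluster is to force a serialization round trip inside a single JVM. The following is only a minimal sketch under stated assumptions: it uses plain Java serialization from the standard library, so it does not exercise Storm's Kryo configuration or any decorator registrations, and `SerializationCheck` and `Reading` are hypothetical names invented for the illustration rather than anything from Storm or chill.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical helper (not part of Storm or chill): pushes a value through
// Java serialization and back, all inside the current JVM.
object SerializationCheck {
  def roundTrip[A <: AnyRef](value: A): A = {
    // Serialize the value into an in-memory byte buffer.
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()

    // Deserialize a fresh copy from those bytes.
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }
}

// Hypothetical tuple payload, for illustration only.
final case class Reading(sensor: String, value: Double)

object SerializationCheckDemo {
  def main(args: Array[String]): Unit = {
    val original = Reading("s1", 1.5)
    val copy = SerializationCheck.roundTrip(original)
    // The copy is equal to the original but is a distinct object,
    // which shows it really went through the byte stream.
    assert(copy == original)
    assert(!(copy eq original))
  }
}
```

If a class cannot survive this round trip, the problem surfaces as an exception from `writeObject` or `readObject` on a developer machine. A check aimed at the Kryo path would need the same kind of round trip through a Kryo instance configured the way the topology configures it.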
>>
>> *From:* Matthew Waymost [mailto:[email protected]]
>> *Sent:* Friday, February 27, 2015 4:14 PM
>> *To:* [email protected]
>> *Subject:* KryoDecorator not working when setNumWorkers > 1
>>
>> Hi everybody,
>>
>> I'm a new user to storm and have hit a roadblock in getting my topology
>> to run over multiple workers.
>>
>> Our codebase is in scala and we send scala classes to storm, so I'm using
>> a kryo decorator to call to chill's scala registrar to add all the
>> serialization logic for scala classes to kryo. In addition, I have a custom
>> serializer that I'm adding in the same decorator.
>>
>> This has worked perfectly fine for me so far locally and on our cluster
>> until I tried turning up the number of workers on which the topology runs.
>> When I use conf.setNumWorkers to set the number of workers greater than 1,
>> the topology gives me InvalidClassExceptions when attempting to deserialize
>> our classes. Removing the setNumWorkers call such that the number of
>> workers stays at the default of 1 resolves the problem and everything runs
>> fine.
>>
>> I'm completely stumped as to why this is happening, and I'm not sure how
>> to diagnose the issue. I've tried the following:
>>
>> * Configure the decorator through storm.yaml instead of in source code on
>>   all worker nodes and nimbus.
>> * Kill the topology, shut down all worker nodes, nimbus, and zookeeper,
>>   clear all temporary data, and bring it all back up.
>> * Verify that everything is using the same version of storm.
>> * Search google and stare at code.
>>
>> Looking at what's going on in the UI, it doesn't fail at the very first
>> chance either. It appears only to fail around the part of the topology
>> where I have a parallelismHint set, which is a few steps in. So I'm
>> guessing it's directly a result of trying to run it over multiple workers,
>> but I don't know what to do with that info.
>>
>> We're running openjdk 7, zk 3.4.6, and storm 0.9.3 on gce. We've got 1 zk
>> server, 1 nimbus server, and 3 worker servers. The call to the topology is
>> made over drpc, and drpc is hosted on the nimbus server. The topology is
>> implemented using trident.
>>
>> Thanks for any help you can provide.
>>
>> Matthew
>> ------------------------------
>> This message, and any attachments, is for the intended recipient(s) only,
>> may contain information that is privileged, confidential and/or proprietary
>> and subject to important terms and conditions available at
>> http://www.bankofamerica.com/emaildisclaimer. If you are not the
>> intended recipient, please delete this message.
