Serialization isn't free. Skipping it where possible, even in a cluster, is worth doing to conserve CPU resources.
Using immutable objects is cheaper than serializing every tuple. Assuming you're coding in Java, consider using ImmutableMap, ImmutableMap.Builder, and similar classes in the Guava library from Google.

http://docs.guava-libraries.googlecode.com/git-history/v18.0/javadoc/com/google/common/collect/ImmutableMap.html

Grant Overby
Software Engineer
Cisco.com <http://www.cisco.com/>

From: Nathan Leung
Reply-To: user
Date: Tuesday, December 1, 2015 at 9:30 AM
To: user
Subject: Re: [Discussion] storm local-mode event object reuse bug

It is bypassed by design. As noted in https://storm.apache.org/apidocs/backtype/storm/task/OutputCollector.html, the emitted objects must be immutable. If you're intent on modifying them, be very careful.

On Tue, Dec 1, 2015 at 4:28 AM, Stephen Powis wrote:

I believe any time tuples are passed between bolts on the same JVM (either in local mode, or in remote mode where the upstream and downstream bolts both reside on the same worker), serialization is bypassed by design.

On Tue, Dec 1, 2015 at 1:46 PM, Edward Zhang wrote:

Hi Storm developers,

Today I hit a possible Storm issue that happens in local mode. In local mode, an event object emitted by a spout does not appear to go through serialization/deserialization; instead, the event object, including its members, is directly referenced by the downstream bolts. So when one bolt modifies the event object, another bolt sees the change immediately.

For example, suppose the event object emitted by the spout includes a Java Map, and two bolts follow the spout. If one bolt modifies the Map, the other bolt will see the modification, or will throw ConcurrentModificationException if it is iterating over the Map at the time.

Please let us know whether this behavior should be corrected by the Storm framework or by the Storm application. In the application, we can do a deep copy when running in local mode, but in the framework, serialization/deserialization should probably always be executed.

Let me know your thoughts.

Thanks,
Edward Zhang
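
For illustration, the aliasing described in this thread and the immutable-object remedy suggested in the top reply can be sketched in plain Java. This is only a sketch under stated assumptions: it uses no Storm API, the class and method names (SharedTupleSketch, mutatingBolt, freeze) are hypothetical, and the JDK's Collections.unmodifiableMap over a defensive copy stands in for Guava's ImmutableMap.copyOf so that the example needs nothing beyond the JDK.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SharedTupleSketch {

    // Hypothetical stand-in for a bolt that modifies the event it receives.
    static void mutatingBolt(Map<String, String> event) {
        event.put("seenBy", "boltA");
    }

    // Snapshot the map before emitting it: copy the entries, then wrap the
    // copy so that any later put/remove throws UnsupportedOperationException.
    // With Guava on the classpath, ImmutableMap.copyOf(event) plays this role.
    static Map<String, String> freeze(Map<String, String> event) {
        return Collections.unmodifiableMap(new HashMap<>(event));
    }

    // Two consumers hold the same reference, as when serialization is
    // bypassed inside one JVM. Returns true if one consumer's change is
    // visible to the other.
    static boolean mutationLeaks() {
        Map<String, String> event = new HashMap<>();
        event.put("id", "42");
        Map<String, String> seenByBoltB = event; // same object, no copy
        mutatingBolt(event);
        return seenByBoltB.containsKey("seenBy");
    }

    // Returns true if the frozen map refuses the modification outright.
    static boolean frozenRejectsMutation() {
        Map<String, String> event = new HashMap<>();
        event.put("id", "42");
        Map<String, String> frozen = freeze(event);
        try {
            mutatingBolt(frozen);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("mutation leaks through shared reference: " + mutationLeaks());
        System.out.println("frozen map rejects mutation: " + frozenRejectsMutation());
    }
}
```

Freezing the map once at emit time costs a single shallow copy, whereas forcing serialization would cost a full encode and decode for every tuple on every hop. Note that the copy is shallow: any mutable values stored inside the map would still be shared.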
