[ https://issues.apache.org/jira/browse/STORM-263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Noll updated STORM-263:
-------------------------------
Description:
In a nutshell: As reported by Twitter (see below), Kryo versions prior to 2.21
apparently have issues that cause data corruption [1]. Also, though less
critical, Storm's current insistence on Kryo 2.17 prevents (or at least
unnecessarily complicates) the use of Twitter Chill/Bijection, which are
helpful utility libraries that simplify data serialization in Storm topologies
(e.g. when using Avro).
For these reasons we may want to consider upgrading Storm's version of
carbonite -- a Clojure library for Kryo -- which, through a transitive
dependency, determines the actual Kryo version that Storm uses.
*Background*
I originally discovered this when I ran into a version conflict between the
Kryo versions used by Storm and by Twitter Chill. Storm 0.9.1-incubating
(latest version) uses Kryo 2.17 whereas Chill (latest version) uses Kryo 2.21.
Without resorting to `exclude` tricks in my build file, I couldn't integrate
Chill with Storm, which I needed because I wanted to use Chill/Bijection to
simplify Avro encoding/decoding in my Storm topologies.
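For anyone who wants to check which Kryo jar actually ends up on their
classpath in a setup like this, a small diagnostic along the following lines
may help. This is only a sketch: the class name `WhichJar` and the helper
`locate` are made up for illustration, and the only assumption about Kryo is
that its main class is `com.esotericsoftware.kryo.Kryo`. It reports where a
class was loaded from, not the version number itself; the version usually has
to be read off the jar file name.

```java
import java.security.CodeSource;

public class WhichJar {

    /**
     * Returns the location (typically a jar URL) that the named class is
     * loaded from, or a short note if that cannot be determined.
     */
    static String locate(String className) {
        try {
            // Load without initializing the class, to avoid side effects.
            Class<?> cls = Class.forName(className, false,
                    WhichJar.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK usually have no code source.
            if (src == null || src.getLocation() == null) {
                return "(no code source; probably a JDK class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Defaults to Kryo's main class; pass another class name to check it.
        String name = args.length > 0 ? args[0]
                : "com.esotericsoftware.kryo.Kryo";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Run with the same classpath as the topology. If the printed location points
at a Kryo 2.17 jar even though Chill is also on the classpath, the build
resolved the conflict in favor of Storm's version.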
I filed an issue at the Chill project:
* CHILL-173: Kryo version conflict between Chill and Storm 0.9.1-incubating
causes Avro serialization to fail [2]
Ian O'Connell (@ianoc) replied and pointed out that, due to data corruption
issues seen in production at Twitter when using Kryo < 2.21, the Chill project
cannot downgrade from Kryo 2.21 to Storm's 2.17 version (Scalding, Spark, and
Summingbird all use Chill with Kryo at 2.21).
*Carbonite*
Storm uses {{com.twitter:carbonite}}, which is maintained by [~sritchie09]
(@sritchie) at https://github.com/sritchie/carbonite. Sam would be OK with a
patch for carbonite to address this Kryo versioning issue, if needed.
[1] https://github.com/twitter/chill/issues/173#issuecomment-36534229
[2] https://github.com/twitter/chill/issues/173
was:
In a nutshell: As reported by Twitter (see below) there are apparently issues
with [Kryo versions prior to 2.21 which are causing data
corruption|https://github.com/twitter/chill/issues/173#issuecomment-36534229].
Also, albeit of lesser critical importance, is that Storm's current insistence
on Kryo 2.17 prevents (or at least unnecessarily complicates) the use of
Twitter Chill/Bijection, which are helpful utility libraries to simplify data
serialization in Storm topologies (e.g. when using Avro).
For this reason we may consider upgrading Storm's version of carbonite -- a
Clojure library for Kryo -- which through a transitive dependency determines
the actual Kryo version that Storm uses.
*Background*
I originally discovered this when I ran into a version conflict between the
Kryo versions used by Storm and by Twitter Chill. Storm 0.9.1-incubating
(latest version) uses Kryo 2.17 whereas Chill (latest version) uses Kryo 2.21.
Without resorting to `exclude` tricks in my build file I couldn't integrate
Chill with Storm, and I wanted to use Chill/Bijection to simplify Avro
encoding/decoding in my Storm topologies.
I filed an issue at the Chill project:
* [CHILL-173: Kryo version conflict between Chill and Storm 0.9.1-incubating
causes Avro serialization to fail|https://github.com/twitter/chill/issues/173].
Ian O'Connell (@ianoc) replied and pointed out that due to data corruption
issues seen in production at Twitter when using Kryo < 2.21 the Chill project
cannot downgrade from Kryo 2.21 to Storm's 2.17 version (Scalding, Spark, and
Summingbird all use chill with Kryo at 2.21).
*Carbonite*
Storm uses {{com.twitter:carbonite}}, which is maintained by [~sritchie09]
(@sritchie) at https://github.com/sritchie/carbonite. Sam would be ok with a
patch for carbonite to address this Kryo versioning issue, if needed.
> Update Kryo version to 2.21+
> ----------------------------
>
> Key: STORM-263
> URL: https://issues.apache.org/jira/browse/STORM-263
> Project: Apache Storm (Incubating)
> Issue Type: Improvement
> Affects Versions: 0.9.2-incubating
> Reporter: Michael Noll
> Labels: carbonite, kryo, serialization