GitHub user nickwallen reopened a pull request:
https://github.com/apache/metron/pull/1012
METRON-1551 Profiler Should Not Use Java Serialization
When the Profiler runs in a topology where tuples must be serialized, the
following error occurs. This can happen when the number of workers is greater
than 1.
The topology should not use Java serialization for tuple values, as this
negatively impacts performance.
```
2018-05-09 10:48:35.136 o.a.s.d.executor [ERROR]
java.lang.RuntimeException: java.lang.RuntimeException: java.io.NotSerializableException: org.apache.metron.common.configuration.profiler.ProfileResult
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:485) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.disruptor$consume_loop_STAR_$fn__7183.invoke(disruptor.clj:83) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484) [storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
Caused by: java.lang.RuntimeException: java.io.NotSerializableException: org.apache.metron.common.configuration.profiler.ProfileResult
    at org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:41) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) ~[kryo-3.0.3.jar:?]
    at org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:44) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:44) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.daemon.worker$mk_transfer_fn$transfer_fn__7805.invoke(worker.clj:193) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.daemon.executor$start_batch_transfer__GT_worker_handler_BANG_$fn__7430.invoke(executor.clj:309) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.disruptor$clojure_handler$reify__7166.onEvent(disruptor.clj:40) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    ... 6 more
Caused by: java.io.NotSerializableException: org.apache.metron.common.configuration.profiler.ProfileResult
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) ~[?:1.8.0_162]
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) ~[?:1.8.0_162]
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509) ~[?:1.8.0_162]
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) ~[?:1.8.0_162]
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) ~[?:1.8.0_162]
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) ~[?:1.8.0_162]
    at org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:38) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) ~[kryo-3.0.3.jar:?]
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) ~[kryo-3.0.3.jar:?]
    at org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:44) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:44) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.daemon.worker$mk_transfer_fn$transfer_fn__7805.invoke(worker.clj:193) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.daemon.executor$start_batch_transfer__GT_worker_handler_BANG_$fn__7430.invoke(executor.clj:309) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.disruptor$clojure_handler$reify__7166.onEvent(disruptor.clj:40) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) ~[storm-core-1.1.0.2.6.4.0-91.jar:1.1.0.2.6.4.0-91]
    ... 6 more
```
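The innermost `Caused by` in the trace above comes from plain JDK serialization: `ObjectOutputStream` reaches an object whose class does not implement `Serializable`. A minimal sketch of that failure mode, using only the JDK, might look like the following. The names `ResultLike` and `HolderLike` are hypothetical stand-ins, not Metron classes, and the idea that the object was reached as a field of a serializable holder is only inferred from the `defaultWriteFields` frame.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class JavaSerializationFailureSketch {

    // Hypothetical stand-in for a class that does NOT implement Serializable.
    static class ResultLike {
        final String expression;

        ResultLike(String expression) {
            this.expression = expression;
        }
    }

    // Hypothetical stand-in for a Serializable value that holds such a field.
    static class HolderLike implements Serializable {
        private static final long serialVersionUID = 1L;
        final ResultLike result;

        HolderLike(ResultLike result) {
            this.result = result;
        }
    }

    // Attempts Java serialization. Returns null on success, otherwise the
    // message of the NotSerializableException raised by the JDK.
    static String tryJavaSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The holder is Serializable, but its field is not, so this fails.
        System.out.println(tryJavaSerialize(new HolderLike(new ResultLike("stats"))));
    }
}
```

Running `main` prints the exception message, which identifies the class that could not be written, just as the trace names `ProfileResult`.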
### Changes
* Defined the Storm configuration property `topology.kryo.register`, which
lists the classes that should use Kryo serialization. Storm requires each such
class to be registered explicitly.
* If an advanced user creates a profile that results in a user-defined type
(a class not already in `topology.kryo.register`), this property allows them to
register the class for Kryo serialization. This could happen if a user is
using a third-party library containing Stellar functions.
* Updated the existing integration tests to force them to serialize tuples
and to fail if Java serialization is used.
* Created another integration test that creates a profile that uses the
STATS library. This ensures that the STATS library can be Kryo serialized.
* Updated the README to reflect this change.
* Updated all Profiler classes to be Java serializable in case a user
chooses to use Java serialization in their Storm topology. This is not
recommended, but there is no reason to block a user from doing so.
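As a rough illustration of the registration described above, the sketch below builds a Storm-style configuration map in plain Java. It assumes a `Map<String, Object>` stands in for Storm's own configuration object; the two keys are the Storm property names used in this PR. `buildConfig` and the class name `com.example.thirdparty.CustomSketch` are hypothetical, and the default list shown is illustrative rather than the list this PR actually ships.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KryoRegistrationSketch {

    // Builds a Storm-style configuration map (string keys to values).
    // 'defaults' are class names registered out of the box; 'userClasses'
    // are extra class names an advanced user wants Kryo to handle.
    static Map<String, Object> buildConfig(List<String> defaults, List<String> userClasses) {
        List<String> registered = new ArrayList<>(defaults);
        for (String name : userClasses) {
            if (!registered.contains(name)) {
                registered.add(name); // skip duplicates, keep order
            }
        }
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.kryo.register", registered);
        // Fail instead of silently falling back to Java serialization.
        conf.put("topology.fall.back.on.java.serialization", false);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = buildConfig(
            // Illustrative defaults; class names taken from this PR's output.
            Arrays.asList(
                "org.apache.metron.common.configuration.profiler.ProfileResult",
                "org.apache.metron.statistics.OnlineStatisticsProvider"),
            // Hypothetical user-defined type from a third-party library.
            Arrays.asList("com.example.thirdparty.CustomSketch"));
        System.out.println(conf.get("topology.kryo.register"));
    }
}
```

Keeping the user-supplied classes in the same list as the defaults means a single property covers both cases.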
### Manual Testing
1. Launch the development environment.
1. Alter the Profiler properties.
Set the Profiler duration to 1 minute.
```
profiler.period.duration=1
profiler.period.duration.units=MINUTES
```
Force the topology to fail if Kryo serialization is not used.
```
topology.fall.back.on.java.serialization=false
```
Force the topology to serialize values, even with a single worker.
```
topology.testing.always.try.serialize=true
```
1. Restart the Profiler.
```
source /etc/default/metron
storm kill profiler
$METRON_HOME/bin/start_profiler_topology.sh
```
1. Create a Profile that uses the STATS library.
```
$METRON_HOME/bin/stellar -z $ZOOKEEPER
```
```
[Stellar]>>> conf := SHELL_EDIT()
{
  "profiles": [
    {
      "profile": "profile-with-stats",
      "foreach": "'global'",
      "init": { "stats": "STATS_INIT()" },
      "update": { "stats": "STATS_ADD(stats, 1)" },
      "result": "stats"
    }
  ],
  "timestampField": "timestamp"
}
[Stellar]>>> CONFIG_PUT("PROFILER",conf)
```
1. Wait until a flush event occurs, then attempt to retrieve the profile
values. If values can be retrieved, the change has been successful.
```
[Stellar]>>> stats := PROFILE_GET("profile-with-stats", "global", PROFILE_FIXED(30, "DAYS"))
[org.apache.metron.statistics.OnlineStatisticsProvider@c8bab446,
org.apache.metron.statistics.OnlineStatisticsProvider@2990520a,
org.apache.metron.statistics.OnlineStatisticsProvider@f6affafd,
org.apache.metron.statistics.OnlineStatisticsProvider@de0d511,
org.apache.metron.statistics.OnlineStatisticsProvider@9247daa4,
org.apache.metron.statistics.OnlineStatisticsProvider@ddef62a6,
org.apache.metron.statistics.OnlineStatisticsProvider@2e6874a6,
org.apache.metron.statistics.OnlineStatisticsProvider@8568c433,
org.apache.metron.statistics.OnlineStatisticsProvider@dc65d27e,
org.apache.metron.statistics.OnlineStatisticsProvider@28b2e728]
```
```
[Stellar]>>> STATS_MEAN(STATS_MERGE(stats))
1.0
```
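The expected `1.0` follows from the profile definition: every message adds the value 1, so the mean of the merged statistics is 1 regardless of how many messages land in each period. The sketch below shows that arithmetic with a hypothetical `RunningStats` class that tracks only a count and a sum. It is not Metron's `OnlineStatisticsProvider` and makes no claim about how that class is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeableMeanSketch {

    // Hypothetical stand-in for a mergeable statistics object.
    static final class RunningStats {
        private long count;
        private double sum;

        // Records one observation, analogous in spirit to STATS_ADD.
        RunningStats add(double value) {
            count++;
            sum += value;
            return this;
        }

        // Combines per-period statistics, analogous in spirit to STATS_MERGE.
        static RunningStats merge(List<RunningStats> parts) {
            RunningStats merged = new RunningStats();
            for (RunningStats part : parts) {
                merged.count += part.count;
                merged.sum += part.sum;
            }
            return merged;
        }

        double mean() {
            return count == 0 ? Double.NaN : sum / count;
        }
    }

    public static void main(String[] args) {
        // Ten periods with differing message counts, each message adding 1.
        List<RunningStats> periods = new ArrayList<>();
        for (int period = 1; period <= 10; period++) {
            RunningStats stats = new RunningStats();
            for (int i = 0; i < period * 3; i++) {
                stats.add(1);
            }
            periods.add(stats);
        }
        System.out.println(RunningStats.merge(periods).mean()); // prints 1.0
    }
}
```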
## Pull Request Checklist
- [x] Is there a JIRA ticket associated with this PR? If not one needs to
be created at [Metron
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA
number you are trying to resolve? Pay particular attention to the hyphen "-"
character.
- [x] Has your PR been rebased against the latest commit within the target
branch (typically master)?
- [x] Have you included steps to reproduce the behavior or problem that is
being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified
and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been
executed in the root metron folder via:
- [x] Have you written or updated unit tests and/or integration tests to
verify your changes?
- [x] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building
and running locally with Vagrant full-dev environment or the equivalent?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nickwallen/metron METRON-1551
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/metron/pull/1012.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1012
----
commit 387cd2de46b6a2392d1fe7a537fc8107b3bf4a59
Author: Nick Allen <nick@...>
Date: 2018-05-11T19:22:00Z
METRON-1551 Profiler Should Not Use Java Serialization
commit 2898e30dc5537bdfe850462fcb98a32f176acd47
Author: Nick Allen <nick@...>
Date: 2018-05-12T18:31:44Z
Fixed a broken hyperlink in the README
commit 788b86166e867d99c3e499b04a3f2480a05b3ddf
Author: Nick Allen <nick@...>
Date: 2018-05-14T16:28:39Z
Made all Profiler classes serializable in case a user chooses Java
serialization in their topology. Test cases added to validate
commit db9446e1cbf73cd1e853c2d7af65169882ca9044
Author: Nick Allen <nick@...>
Date: 2018-05-14T17:43:47Z
Added Java serializable check to test classes and also added 'Serializable'
where applicable
commit 13f9a2dad60b7e9b6cd8a31302d926312cc8c66f
Author: Nick Allen <nick@...>
Date: 2018-05-14T18:39:30Z
Added extra details in error message
----