Hi Petr,

I encountered an issue similar to your first point a while ago. See 
https://issues.apache.org/jira/browse/STORM-2038 (STORM-2038) for further 
discussion.

Regards,
Paul

-----Original Message-----
From: Petr Janeček [mailto:janecekp...@seznam.cz]
Sent: 20 February 2017 10:02
To: user@storm.apache.org
Subject: Local Mode issues and undocumented behaviour

Hello,

you might notice that the email below is a simplified repost. The last one got 
no attention, possibly because it was in the wrong thread. Sorry about that. 
Any answer from a reliable source would help, even "nobody knows anymore" or 
"not sure, please file a Jira with a minimal reproducible test".

We're using Storm heavily and are trying to get things tested locally as much 
as possible. While doing so, we accumulated a few questions:


1. Since 1.0.3 the Local Cluster on Windows needs the ability to create 
symlinks, but it did not need to do that before. Both 
`LocalCluster.submitTopology()` and `Testing.withSimulatedTimeLocalCluster() 
... Testing.completeTopology()` do this, and it's a major pain for local 
development where our IDEs constantly run tests in local mode.

    We read <link mangled by corporate email filter - removed - PM> (which 404s 
for 1.0.1 and 1.0.2, by the way), and running our IDEs as Administrator fixes 
the issue.

    It might be a good idea to add this change to the release notes - 
"Running in Local Mode now requires the symlink creation permission, too." 
Introducing new major features in .build versions is unfortunate :(. Is there 
any configuration to revert to the old behaviour, please?


2. Since 1.0.3, every time we run any test on Local Cluster, there is an extra 
directory being created in the root of our project in IDE: 
./logs/workers-artifacts/topologytest-random-uuid, and it contains a single 
file, "worker.yaml".

    Is there anything we can do to move this logging somewhere else, 
preferably the ./target directory? I went through the release notes and did not 
find anything related.


3. How does `Testing.completeTopology()` know the topology is completed? We're 
not acking any tuples, so I'd expect the method to return once all tuples have 
internally timed out (or the `CompleteTopologyParam` timeout has passed). 
However, the method returns much sooner (sooner than we'd say our topology is 
"completed"), implying a more clever strategy. Is this deterministic? Are there 
any knobs to turn?


4. Does local mode not honor `conf.registerSerialization()`? This seems 
strange, but if we're sending an instance of the `OurData` class in local mode, 
the serialization fails with `NotSerializableException`, like this:

        java.io.NotSerializableException: com.our.company.data.OurData
                at org.apache.storm.utils.Utils.javaSerialize(Utils.java:236)
                at org.apache.storm.thrift$serialize_component_object.invoke(thrift.clj:172)
                at org.apache.storm.testing$complete_topology.doInvoke(testing.clj:514)
                at clojure.lang.RestFn.invoke(RestFn.java:1124)
                at org.apache.storm.testing4j$_completeTopology.invoke(testing4j.clj:63)
                at org.apache.storm.Testing.completeTopology(Unknown Source)

    ...even though we have used 
`conf.registerSerialization(OurData.class, OurDataSerializer.class);`

    Slapping `Serializable` on the class fixes the issue, but obviously that's 
not the solution - we don't want to change our class because of local testing, 
and we definitely *want* local testing to use the same serialization mechanism 
as production. I'm sure there are a lot of people using this functionality, 
so we have probably just overlooked something. We even tried:

        // Thanks for fixing this in 1.0.3, by the way!
        conf.put(Config.TOPOLOGY_TESTING_ALWAYS_TRY_SERIALIZE, true);
        conf.put(Config.TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION, false);

    By the way, as far as I know, Kryo can serialize non-serializable classes, 
too, by using the `FieldSerializer`. Is there any hidden option to enable this 
by default instead of Java serialization? Do you have any plans to use this 
instead of Java serialization?
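
    For what it's worth, the trace above passes through `Utils.javaSerialize` 
via `serialize_component_object`, which suggests (this is only a reading of the 
trace, not something confirmed here) that a component object is going through 
plain Java serialization rather than the registered Kryo serializer. The 
JDK-only sketch below involves no Storm code at all; `OurData`, 
`SerializableData` and `Holder` are hypothetical stand-ins. It only shows how 
`ObjectOutputStream` reacts to a class without `Serializable`, including when 
such an object is merely a field of an otherwise serializable holder:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class JavaSerializationDemo {

    // Hypothetical stand-in for com.our.company.data.OurData:
    // deliberately does NOT implement Serializable.
    static class OurData {
        final String value;
        OurData(String value) { this.value = value; }
    }

    // Same shape, but with the Serializable marker interface added.
    static class SerializableData implements Serializable {
        private static final long serialVersionUID = 1L;
        final String value;
        SerializableData(String value) { this.value = value; }
    }

    // Hypothetical stand-in for any serializable object that keeps
    // a payload as a (non-transient) field.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object payload;
        Holder(Object payload) { this.payload = payload; }
    }

    // Returns true if plain Java serialization accepts the object graph,
    // false if it is rejected with NotSerializableException.
    static boolean canJavaSerialize(Object obj) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            // Any other I/O failure is unexpected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canJavaSerialize(new OurData("x")));
        System.out.println(canJavaSerialize(new SerializableData("x")));
        System.out.println(canJavaSerialize(new Holder(new OurData("x"))));
    }
}
```

    If that reading is right, it would explain why neither 
`registerSerialization()` nor the two config flags change the outcome, but 
someone who knows the Storm internals would have to confirm it.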


Thank you in advance for any responses, we've been scratching our heads lately.
Petr Janeček