Re: [HELP NEEDED] Please test 2.6.0-SNAPSHOT

2023-11-03 Thread Alexandre Vermeerbergen
Thanks Richard for the analysis.

My thought on this are very simple: let me fix my own old code, after
all this was impacting only one of my topologies, the other one were
using more careful Integer.parseInt or Integer.parseLong to avoid
making assumption on the way the values in the config Map were
serialized.

My other concern with 2.6.0 Snapshot is that in Storm UI, the "Storm
UI" title at the top left hand side of each page is missing. This may
be the result of one of my own PR, I'll have to check why...

Alex

Le ven. 3 nov. 2023 à 20:23, Richard Zowalla  a écrit :
>
> Hi,
>
> thanks for the feedback.
>
> The main difference to 2.5.0 is, that we switched the JSON
> implementation in Utils, which relied on a
> super old JSON library. What is happening here is, that the new JSON
> library (net.minidev) behaves differently.
>
> We can see it by writing some simple code without Storm (The related
> code line with the parsing is taken from Storm Utils)
>
>
> Map conf = new HashMap<>();
> conf.put("test", 1L);
>
> Map deserTopoConf = (Map)
> JSONValue.parseWithException(JSONValue.toJSONString(conf));
>
> for(Map.Entry e : deserTopoConf.entrySet()) {
>   System.out.println(e.getValue().getClass().getSimpleName());
> }
>
> This will print "Integer". Using "1L" as a value will
> result in "Long".
>
> The main issue derives from the fact, that we serialize the config map
> into a JSON String and parse the result.
> Based on the String representation, "1L" will still fit into an
> Integer, so the parser will not use Long ;-)
>
> If the number is big enough, the parser will switch to a Long.
> The old library (> 10 years old now):
>
> deserTopoConf = (Map)
> org.json.simple.JSONValue.parseWithException(org.json.simple.JSONValue.
> toJSONString(conf));
>
> for(Map.Entry e : deserTopoConf.entrySet()) {
>   System.out.println(e.getValue().getClass().getSimpleName());
> }
>
> This will print "Long".
>
> The difference between these libraries is, that the super old
> org.json.simple just always returns Long as the type.
>
> If can eventually tune the new JSON parser to something like:
>
>   deserTopoConf = (Map) new
> JSONParser(JSONParser.BIG_DIGIT_UNRESTRICTED).parse(JSONValue.toJSONStr
> ing(conf), JSONValue.defaultReader.DEFAULT);
>
> This will result in "Long" but might have other side-effects.
>
> From my point of view, the newer JSON library works as expected
> (although it might break backward compatibility).
> Unchecked casts are always evil, imho. In your specific case, it might
> be more save to cast to "Number" and than invoke "longValue()" instead.
>
> Nevertheless, we might being able to restore the old behaviour by
> tuning the JSONParser configuration.
>
> Thoughts?
>
> Gruß
> Richard
>
>
>
>
> Am Freitag, dem 03.11.2023 um 18:32 +0100 schrieb Alexandre
> Vermeerbergen:
> > On a side note, the Long value that "MyKOBolt" is set by a
> > DefaultTopologyConfig to a value with is a java.lang.Long instance...
> > hence it's surprising to see it internally converted into a
> > java.lang.Integer in the Map passed to prepare() method...
> >
> >
> > Le ven. 3 nov. 2023 à 18:27, Alexandre Vermeerbergen
> >  a écrit :
> > >
> > > Hello,
> > >
> > > I found a breaking change using Apache Storm 2.6.0 snapshot build
> > > dated 20231103 and without changing my topologies' code (keeping
> > > the
> > > same big jars).
> > >
> > > This seems to be tied to a datatype change in Map argument of
> > > prepare() method of Bolt.
> > >
> > > For example, this Bolt is KO at runtime:
> > >
> > > public class MyKOBolt extends BaseRichBolt {
> > >
> > > @Override
> > > public void prepare(Map stormConf, TopologyContext context,
> > > OutputCollector collector) {
> > > cacheMaxSize = (Long)
> > > stormConf.get(ConfKeys.SVC_DEF_CACHE_SIZE);
> > > cacheTtlMin = (Long)
> > > stormConf.get(ConfKeys.SVC_DEF_CACHE_TTL_MIN);
> > >
> > > as it throws this error:
> > > 2023-11-03 16:42:20.230 o.a.s.e.b.BoltExecutor
> > > Thread-14-__system-executor[-1, -1] [INFO] Preparing bolt
> > > __system:[-1]
> > > 2023-11-03 16:42:20.193 o.a.s.u.Utils
> > > Thread-15-evaluateTriggers-executor[11, 11] [ERROR] Async loop
> > > died!
> > > java.lang.ClassCastException: java.lang.Integer incompati

Re: [HELP NEEDED] Please test 2.6.0-SNAPSHOT

2023-11-03 Thread Richard Zowalla
Hi,

thanks for the feedback.

The main difference to 2.5.0 is, that we switched the JSON
implementation in Utils, which relied on a 
super old JSON library. What is happening here is, that the new JSON
library (net.minidev) behaves differently.

We can see it by writing some simple code without Storm (The related
code line with the parsing is taken from Storm Utils)


Map conf = new HashMap<>();
conf.put("test", 1L);

Map deserTopoConf = (Map)
JSONValue.parseWithException(JSONValue.toJSONString(conf));

for(Map.Entry e : deserTopoConf.entrySet()) {
  System.out.println(e.getValue().getClass().getSimpleName());
}

This will print "Integer". Using "1L" as a value will
result in "Long".

The main issue derives from the fact, that we serialize the config map
into a JSON String and parse the result. 
Based on the String representation, "1L" will still fit into an
Integer, so the parser will not use Long ;-)

If the number is big enough, the parser will switch to a Long. 
The old library (> 10 years old now):

deserTopoConf = (Map)
org.json.simple.JSONValue.parseWithException(org.json.simple.JSONValue.
toJSONString(conf));

for(Map.Entry e : deserTopoConf.entrySet()) {
  System.out.println(e.getValue().getClass().getSimpleName());
}

This will print "Long". 

The difference between these libraries is, that the super old
org.json.simple just always returns Long as the type.

If can eventually tune the new JSON parser to something like:

  deserTopoConf = (Map) new
JSONParser(JSONParser.BIG_DIGIT_UNRESTRICTED).parse(JSONValue.toJSONStr
ing(conf), JSONValue.defaultReader.DEFAULT);

This will result in "Long" but might have other side-effects.

>From my point of view, the newer JSON library works as expected
(although it might break backward compatibility).
Unchecked casts are always evil, imho. In your specific case, it might
be more save to cast to "Number" and than invoke "longValue()" instead.

Nevertheless, we might being able to restore the old behaviour by
tuning the JSONParser configuration.

Thoughts?

Gruß
Richard




Am Freitag, dem 03.11.2023 um 18:32 +0100 schrieb Alexandre
Vermeerbergen:
> On a side note, the Long value that "MyKOBolt" is set by a
> DefaultTopologyConfig to a value with is a java.lang.Long instance...
> hence it's surprising to see it internally converted into a
> java.lang.Integer in the Map passed to prepare() method...
> 
> 
> Le ven. 3 nov. 2023 à 18:27, Alexandre Vermeerbergen
>  a écrit :
> > 
> > Hello,
> > 
> > I found a breaking change using Apache Storm 2.6.0 snapshot build
> > dated 20231103 and without changing my topologies' code (keeping
> > the
> > same big jars).
> > 
> > This seems to be tied to a datatype change in Map argument of
> > prepare() method of Bolt.
> > 
> > For example, this Bolt is KO at runtime:
> > 
> > public class MyKOBolt extends BaseRichBolt {
> > 
> >     @Override
> >     public void prepare(Map stormConf, TopologyContext context,
> > OutputCollector collector) {
> >     cacheMaxSize = (Long)
> > stormConf.get(ConfKeys.SVC_DEF_CACHE_SIZE);
> >     cacheTtlMin = (Long)
> > stormConf.get(ConfKeys.SVC_DEF_CACHE_TTL_MIN);
> > 
> > as it throws this error:
> > 2023-11-03 16:42:20.230 o.a.s.e.b.BoltExecutor
> > Thread-14-__system-executor[-1, -1] [INFO] Preparing bolt
> > __system:[-1]
> > 2023-11-03 16:42:20.193 o.a.s.u.Utils
> > Thread-15-evaluateTriggers-executor[11, 11] [ERROR] Async loop
> > died!
> > java.lang.ClassCastException: java.lang.Integer incompatible with
> > java.lang.Long
> >  at com.acme.storm.alerting.MyKOBolt
> > .prepare(EvaluationBolt.java:71)
> > ~[stormjar.jar:?]
> >  at
> > org.apache.storm.executor.bolt.BoltExecutor.init(BoltExecutor.java:
> > 128)
> > ~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
> >  at
> > org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:
> > 138)
> > ~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
> >  at
> > org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:
> > 54)
> > ~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
> >  at org.apache.storm.utils.Utils$1.run(Utils.java:393)
> > [storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
> >  at java.lang.Thread.run(Thread.java:857) [?:?]
> > 
> > whereas this bolt is fine with Storm 2.6.0:
> > 
> > public class MyOKBolt extends BaseBasicBolt {
> > 
> >     @Override
> >     public void prepare(final Map stormConf, final TopologyContext
> > context) {
> >

Re: [HELP NEEDED] Please test 2.6.0-SNAPSHOT

2023-11-03 Thread Alexandre Vermeerbergen
On a side note, the Long value that "MyKOBolt" is set by a
DefaultTopologyConfig to a value with is a java.lang.Long instance...
hence it's surprising to see it internally converted into a
java.lang.Integer in the Map passed to prepare() method...


Le ven. 3 nov. 2023 à 18:27, Alexandre Vermeerbergen
 a écrit :
>
> Hello,
>
> I found a breaking change using Apache Storm 2.6.0 snapshot build
> dated 20231103 and without changing my topologies' code (keeping the
> same big jars).
>
> This seems to be tied to a datatype change in Map argument of
> prepare() method of Bolt.
>
> For example, this Bolt is KO at runtime:
>
> public class MyKOBolt extends BaseRichBolt {
>
> @Override
> public void prepare(Map stormConf, TopologyContext context,
> OutputCollector collector) {
> cacheMaxSize = (Long) stormConf.get(ConfKeys.SVC_DEF_CACHE_SIZE);
> cacheTtlMin = (Long) stormConf.get(ConfKeys.SVC_DEF_CACHE_TTL_MIN);
>
> as it throws this error:
> 2023-11-03 16:42:20.230 o.a.s.e.b.BoltExecutor
> Thread-14-__system-executor[-1, -1] [INFO] Preparing bolt
> __system:[-1]
> 2023-11-03 16:42:20.193 o.a.s.u.Utils
> Thread-15-evaluateTriggers-executor[11, 11] [ERROR] Async loop died!
> java.lang.ClassCastException: java.lang.Integer incompatible with 
> java.lang.Long
>  at com.acme.storm.alerting.MyKOBolt .prepare(EvaluationBolt.java:71)
> ~[stormjar.jar:?]
>  at org.apache.storm.executor.bolt.BoltExecutor.init(BoltExecutor.java:128)
> ~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
>  at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:138)
> ~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
>  at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:54)
> ~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
>  at org.apache.storm.utils.Utils$1.run(Utils.java:393)
> [storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
>  at java.lang.Thread.run(Thread.java:857) [?:?]
>
> whereas this bolt is fine with Storm 2.6.0:
>
> public class MyOKBolt extends BaseBasicBolt {
>
> @Override
> public void prepare(final Map stormConf, final TopologyContext context) {
> super.prepare(stormConf, context);
>
> this.redisPort   = Integer.parseInt((String)
> stormConf.get(ConfKeys.REDIS_PORT));
>
> That said, I can modify the code of "MyKOBolt" to use same
> Integer.parseInt  or Integer.parseLong trick, but it's the first time
> in my long history of upgrades that I have seen such runtime
> incompatibility.
>
> Maybe "MyKOBolt" was badly written since the beginning and I have just
> hit the punishment for it: is there a documentation which clarifies
> datatypes of the Map argument of prepare() method?
>
> Also worth noting: "MyKOBolt" derives from BaseRichBolt , while
> "MyOKBolt" derives from BaseBasicBolt => could this have any impact on
> this finding ?
>
> Last, I'm running this on Redhat Linux 8 and IBM Semeru JDK 17.0.8.1.
>
> Kind regards,
> Alexandre
>
> Le ven. 3 nov. 2023 à 15:16, Julien Nioche
>  a écrit :
> >
> > Thanks Richard.
> >
> > Tried the latest snapshot with StormCrawler both in local and deployed mode
> > and did not find any issues.
> > Will try it on a topology generating WARC files next week to check that the
> > dependency changes on Hadoop have not broken anything.
> >
> > Have a good week end
> >
> > Julien
> >
> > On Thu, 2 Nov 2023 at 19:25, Richard Zowalla  wrote:
> >
> > > Short update. Sorted out an issue with the tar.gz/zip with Julien today
> > > and re-uploaded them to the nightlies area.
> > >
> > > This new bundle works as expected in my deployment but happy to receive
> > > additional feedback before getting up a first release candidate ;-)
> > >
> > > Am Montag, dem 30.10.2023 um 08:21 +0100 schrieb Richard Zowalla:
> > > > Hi Alexandre,
> > > >
> > > > we are not in a hurry here :) - take as much as time you need.
> > > >
> > > > Gruß
> > > > Richard
> > > >
> > > >
> > > > Am Montag, dem 30.10.2023 um 07:50 +0100 schrieb Alexandre
> > > > Vermeerbergen:
> > > > > Hello Richard,
> > > > >
> > > > > Okay, I'm more than happy to do that with my pre-production cluster
> > > > > (~10 topologies) using the binary artifacts.
> > > > > Would it be OK if I can use up to end of this week  so that I'll be
> > > > > able to have enough time to check all potential issues that this
> > > > > upg

Re: [HELP NEEDED] Please test 2.6.0-SNAPSHOT

2023-11-03 Thread Alexandre Vermeerbergen
Hello,

I found a breaking change using Apache Storm 2.6.0 snapshot build
dated 20231103 and without changing my topologies' code (keeping the
same big jars).

This seems to be tied to a datatype change in Map argument of
prepare() method of Bolt.

For example, this Bolt is KO at runtime:

public class MyKOBolt extends BaseRichBolt {

@Override
public void prepare(Map stormConf, TopologyContext context,
OutputCollector collector) {
cacheMaxSize = (Long) stormConf.get(ConfKeys.SVC_DEF_CACHE_SIZE);
cacheTtlMin = (Long) stormConf.get(ConfKeys.SVC_DEF_CACHE_TTL_MIN);

as it throws this error:
2023-11-03 16:42:20.230 o.a.s.e.b.BoltExecutor
Thread-14-__system-executor[-1, -1] [INFO] Preparing bolt
__system:[-1]
2023-11-03 16:42:20.193 o.a.s.u.Utils
Thread-15-evaluateTriggers-executor[11, 11] [ERROR] Async loop died!
java.lang.ClassCastException: java.lang.Integer incompatible with java.lang.Long
 at com.acme.storm.alerting.MyKOBolt .prepare(EvaluationBolt.java:71)
~[stormjar.jar:?]
 at org.apache.storm.executor.bolt.BoltExecutor.init(BoltExecutor.java:128)
~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
 at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:138)
~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
 at org.apache.storm.executor.bolt.BoltExecutor.call(BoltExecutor.java:54)
~[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
 at org.apache.storm.utils.Utils$1.run(Utils.java:393)
[storm-client-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT]
 at java.lang.Thread.run(Thread.java:857) [?:?]

whereas this bolt is fine with Storm 2.6.0:

public class MyOKBolt extends BaseBasicBolt {

@Override
public void prepare(final Map stormConf, final TopologyContext context) {
super.prepare(stormConf, context);

this.redisPort   = Integer.parseInt((String)
stormConf.get(ConfKeys.REDIS_PORT));

That said, I can modify the code of "MyKOBolt" to use same
Integer.parseInt  or Integer.parseLong trick, but it's the first time
in my long history of upgrades that I have seen such runtime
incompatibility.

Maybe "MyKOBolt" was badly written since the beginning and I have just
hit the punishment for it: is there a documentation which clarifies
datatypes of the Map argument of prepare() method?

Also worth noting: "MyKOBolt" derives from BaseRichBolt , while
"MyOKBolt" derives from BaseBasicBolt => could this have any impact on
this finding ?

Last, I'm running this on Redhat Linux 8 and IBM Semeru JDK 17.0.8.1.

Kind regards,
Alexandre

Le ven. 3 nov. 2023 à 15:16, Julien Nioche
 a écrit :
>
> Thanks Richard.
>
> Tried the latest snapshot with StormCrawler both in local and deployed mode
> and did not find any issues.
> Will try it on a topology generating WARC files next week to check that the
> dependency changes on Hadoop have not broken anything.
>
> Have a good week end
>
> Julien
>
> On Thu, 2 Nov 2023 at 19:25, Richard Zowalla  wrote:
>
> > Short update. Sorted out an issue with the tar.gz/zip with Julien today
> > and re-uploaded them to the nightlies area.
> >
> > This new bundle works as expected in my deployment but happy to receive
> > additional feedback before getting up a first release candidate ;-)
> >
> > Am Montag, dem 30.10.2023 um 08:21 +0100 schrieb Richard Zowalla:
> > > Hi Alexandre,
> > >
> > > we are not in a hurry here :) - take as much as time you need.
> > >
> > > Gruß
> > > Richard
> > >
> > >
> > > Am Montag, dem 30.10.2023 um 07:50 +0100 schrieb Alexandre
> > > Vermeerbergen:
> > > > Hello Richard,
> > > >
> > > > Okay, I'm more than happy to do that with my pre-production cluster
> > > > (~10 topologies) using the binary artifacts.
> > > > Would it be OK if I can use up to end of this week  so that I'll be
> > > > able to have enough time to check all potential issues that this
> > > > upgrade bring?
> > > >
> > > > Thanks,
> > > > Alexandre
> > > >
> > > > Le lun. 30 oct. 2023 à 07:35, Richard Zowalla  a
> > > > écrit :
> > > > >
> > > > > Hi all,
> > > > >
> > > > > before starting a release of 2.6.0-SNAPSHOT, I would like to
> > > > > receive
> > > > > some feedback on the current SNAPSHOT build.
> > > > >
> > > > > I just uploaded a 2.6.0-SNAPSHOT of Storm build from
> > > > >
> > https://github.com/apache/storm/commit/8f883086032669a8f04b09a3b312d60af5b44533
> > > > >
> > > > > It is available via the ASF Snapshot repository:
> > > > > https://repos

Re: [HELP NEEDED] Please test 2.6.0-SNAPSHOT

2023-11-03 Thread Julien Nioche
Thanks Richard.

Tried the latest snapshot with StormCrawler both in local and deployed mode
and did not find any issues.
Will try it on a topology generating WARC files next week to check that the
dependency changes on Hadoop have not broken anything.

Have a good week end

Julien

On Thu, 2 Nov 2023 at 19:25, Richard Zowalla  wrote:

> Short update. Sorted out an issue with the tar.gz/zip with Julien today
> and re-uploaded them to the nightlies area.
>
> This new bundle works as expected in my deployment but happy to receive
> additional feedback before getting up a first release candidate ;-)
>
> Am Montag, dem 30.10.2023 um 08:21 +0100 schrieb Richard Zowalla:
> > Hi Alexandre,
> >
> > we are not in a hurry here :) - take as much as time you need.
> >
> > Gruß
> > Richard
> >
> >
> > Am Montag, dem 30.10.2023 um 07:50 +0100 schrieb Alexandre
> > Vermeerbergen:
> > > Hello Richard,
> > >
> > > Okay, I'm more than happy to do that with my pre-production cluster
> > > (~10 topologies) using the binary artifacts.
> > > Would it be OK if I can use up to end of this week  so that I'll be
> > > able to have enough time to check all potential issues that this
> > > upgrade bring?
> > >
> > > Thanks,
> > > Alexandre
> > >
> > > Le lun. 30 oct. 2023 à 07:35, Richard Zowalla  a
> > > écrit :
> > > >
> > > > Hi all,
> > > >
> > > > before starting a release of 2.6.0-SNAPSHOT, I would like to
> > > > receive
> > > > some feedback on the current SNAPSHOT build.
> > > >
> > > > I just uploaded a 2.6.0-SNAPSHOT of Storm build from
> > > >
> https://github.com/apache/storm/commit/8f883086032669a8f04b09a3b312d60af5b44533
> > > >
> > > > It is available via the ASF Snapshot repository:
> > > > https://repository.apache.org/content/repositories/snapshots/
> > > >
> > > > You can consume by adding
> > > >
> > > >  
> > > >   apache.snapshots
> > > >   Apache Snapshot Repository
> > > >   https://repository.apache.org/snapshots
> > > >   
> > > > false
> > > >   
> > > >  
> > > >
> > > > to your project pom. As we do not deploy SNAPSHOTS automatically,
> > > > it
> > > > should be easy to just consume the latest SNAPSHOT.
> > > >
> > > > The packaged binaries are available at nightlies.apache.org:
> > > >
> > > >
> https://nightlies.apache.org/storm/2.6.0-SNAPSHOT/8f883086032669a8f04b09a3b312d60af5b44533/
> > > >
> > > > If you have some minutes left: Please test and report any issues
> > > > with
> > > > this binaries, so we can fix before attempting to release.
> > > >
> > > > The most significant changes are in the are of hadoop/hbase/hdfs
> > > > as
> > > > we
> > > > upgraded from 2.x to 3.x - our own test coverage within the build
> > > > looks
> > > > good but would be  nice to get some real world use-case feedback
> > > > on
> > > > this. In addition, we had quite a lof of 3rd party dependency
> > > > upgrades.
> > > >
> > > > In addition, it contains the pruning of external modules as
> > > > listed
> > > > in
> > > >
> > > > https://issues.apache.org/jira/browse/STORM-3988
> > > >
> > > > A summaryin Jira is here:
> > > >
> > > >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12314820=12353484
> > > >
> > > > Gruß
> > > > Richard
> > > >
> > > >
> >
>
>

-- 

*Open Source Solutions for Text Engineering*

http://www.digitalpebble.com
http://digitalpebble.blogspot.com/
#digitalpebble