[jira] [Work logged] (AMQ-9855) Intermittent null/empty body when consuming from a topic (vm:// transport)

ASF GitHub Bot (Jira) Wed, 18 Feb 2026 16:15:04 -0800


     [ 
https://issues.apache.org/jira/browse/AMQ-9855?focusedWorklogId=1005962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-1005962
 ]


ASF GitHub Bot logged work on AMQ-9855:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Feb/26 00:14
            Start Date: 19/Feb/26 00:14
    Worklog Time Spent: 10m 
      Work Description: cshannon commented on PR #1659:
URL: https://github.com/apache/activemq/pull/1659#issuecomment-3923949758

   > To clarify the background task involvement: in a standard serial flow, the 
Destination handles the guards correctly. However, in my 
ActiveMQTextMessageStressTest, the race occurs during high-concurrency 
scenarios where multiple threads interact with the message command.
   > 
   > The failure specifically occurs here: Caused by: java.lang.AssertionError: 
Text should never be null during stress at line 138
   > 
   > This trace confirms that even though the message was produced with text, 
the unmarshalled state was cleared out from under the consumer thread before it 
could be read.
   > 
   > Because clearUnmarshalledState() is public, it can be invoked by internal 
broker components (like Advisory dispatchers or NIO transport threads) to 
reduce memory footprint. Since the current implementation doesn't check if 
content is actually populated before nulling the text, it creates this 
'double-null' state.
   > 
   > Hardening the command itself makes it 'safe by default,' protecting data 
integrity regardless of which internal broker component calls the cleanup.
   
   So taking a step back, at this point I'm not sure there's a real problem we 
need to solve here. If there is one, I haven't yet seen evidence of an actual 
broker problem to fix, because the messages should be copied on dispatch.
   
   A few things to point out:
   
   First, the goal here should not be to change the code simply to make the 
unit test pass. The unit test is an artificial recreation of an error, 
produced by manually calling that method outside of normal operation in a 
multi-threaded environment. As @tabish121 pointed out, those messages were 
**never** intended to be used by multiple threads, so there will be race 
conditions if 2 threads are operating on the same message. That alone makes 
the test invalid: the messages are inherently not thread safe, so a stress 
test will break them.
   
   Second, simply changing clearUnmarshalledState() to marshal the data when 
it has not been marshaled yet does not make the messages thread safe. There 
is still a race condition, and using that method is only safe in a 
single-threaded environment. You could still get into trouble when calling 
the method from multiple threads, so the change doesn't really solve the 
issue if 2 threads are touching the message by mistake.
   
   Third, yes, the method is public, so in theory someone could certainly 
write code that invokes the method and clears the unmarshalled state without 
marshaling the data first. However, a user who writes code to do that is 
also capable of checking that the data has been marshaled before calling the 
method, just like the broker does. If anything, it may make more sense to do 
a state check and throw an exception inside of clearUnmarshalledState() if 
the marshaled data is missing. Anyone calling that method likely expects the 
data to already be marshaled, and transparently marshaling would just be 
hiding an error.
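
   A minimal sketch of that state-check idea, for illustration only. The class 
GuardedTextMessage and its fields are hypothetical stand-ins, not the real 
ActiveMQTextMessage, and the guard does not make the object thread safe; it 
only turns a silent loss of the body into an immediate error for a 
single-threaded caller.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a text message command; NOT the real ActiveMQ class.
final class GuardedTextMessage {
    private String text;     // unmarshalled form
    private byte[] content;  // marshalled form

    GuardedTextMessage(String text) {
        this.text = text;
    }

    // Marshal the text into bytes if that has not happened yet.
    void storeContent() {
        if (content == null && text != null) {
            content = text.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Fail fast instead of marshalling transparently: a caller that clears
    // the text before marshalling has a bug that should be surfaced.
    void clearUnmarshalledState() {
        if (text != null && content == null) {
            throw new IllegalStateException(
                "text has not been marshalled; clearing it would lose the body");
        }
        text = null;
    }

    // Rebuild the text lazily from the marshalled bytes when needed.
    String getText() {
        if (text == null && content != null) {
            text = new String(content, StandardCharsets.UTF_8);
        }
        return text;
    }

    public static void main(String[] args) {
        GuardedTextMessage msg = new GuardedTextMessage("hello");
        try {
            msg.clearUnmarshalledState();  // not marshalled yet: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        msg.storeContent();                // marshal first
        msg.clearUnmarshalledState();      // now allowed
        System.out.println(msg.getText()); // body is rebuilt from the bytes
    }
}
```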
   
   Lastly, this issue and PR were originally opened because of null message 
bodies received in a real environment, but I have yet to see evidence or a 
demonstration of how that can happen, because the messages should be copied 
and be unique for each consumer. I'm not saying it's impossible, but so far 
receiving null bodies has only been demonstrated through a unit test that is 
not really valid: it creates a race condition that should not be possible in 
normal broker operation, because those messages should always be copied.
   
   Originally I was ok with adding synchronization because I was thinking 
there was a spot in the broker where multiple threads might be interacting 
with the same copy in the VM transport. However, so far that doesn't appear 
to be the case, or at least we haven't found where that is yet, as we should 
be copying the message on dispatch. If there is a bug, the fix would be to 
locate the spot where multiple threads are interacting with the same copy 
instead of their own copies and fix that. Any client code that calls 
clearUnmarshalledState() should make sure to do so on its own copy of the 
message and should verify that it's safe to call that method (i.e., check 
that it's marshaled first).
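
   The caller-side discipline described above could look roughly like the 
following sketch. DispatchCopySketch, Msg, and trimIfSafe are hypothetical 
names used only for illustration and are not broker APIs. The sketch assumes 
the copy is taken by the one thread that owns the shared instance, so that 
each consumer only ever touches its own copy.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of "copy on dispatch, check before clearing".
final class DispatchCopySketch {

    // Minimal stand-in for a message command; not thread safe by design.
    static final class Msg {
        private String text;     // unmarshalled form
        private byte[] content;  // marshalled form

        Msg(String text) {
            this.text = text;
        }

        void marshal() {
            if (content == null && text != null) {
                content = text.getBytes(StandardCharsets.UTF_8);
            }
        }

        boolean isMarshalled() {
            return content != null;
        }

        // Each consumer gets its own instance with its own state.
        Msg copy() {
            Msg c = new Msg(text);
            c.content = (content == null) ? null : content.clone();
            return c;
        }

        // Unguarded, like a public cleanup method the caller must use carefully.
        void clearUnmarshalledState() {
            text = null;
        }

        String getText() {
            if (text == null && content != null) {
                text = new String(content, StandardCharsets.UTF_8);
            }
            return text;
        }
    }

    // Clear the text only on the caller's own copy, and only once marshalled.
    // Returns true if the unmarshalled state was actually cleared.
    static boolean trimIfSafe(Msg own) {
        if (!own.isMarshalled()) {
            return false;  // clearing now would lose the body
        }
        own.clearUnmarshalledState();
        return true;
    }

    public static void main(String[] args) {
        Msg shared = new Msg("order-123");
        Msg mine = shared.copy();              // private copy for this consumer
        System.out.println(trimIfSafe(mine));  // not marshalled, nothing cleared
        mine.marshal();
        System.out.println(trimIfSafe(mine));  // marshalled, text cleared
        System.out.println(mine.getText());    // rebuilt from the bytes
        System.out.println(shared.getText());  // original instance untouched
    }
}
```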
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 1005962)
    Time Spent: 7h 40m  (was: 7.5h)

> Intermittent null/empty body when consuming from a topic (vm:// transport)
> --------------------------------------------------------------------------
>
>                 Key: AMQ-9855
>                 URL: https://issues.apache.org/jira/browse/AMQ-9855
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: AMQP, Camel
>    Affects Versions: 6.2.0, 6.1.2, 6.1.6, 6.1.7
>            Reporter: JJ
>            Priority: Major
>             Fix For: 6.3.0
>
>          Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> Also see AMQ-6708. This is very much the same issue but with more details. The 
> op on that ticket hasn't been seen since 2017.
> We have a simple AMQ instance using Camel. It connects to an upstream remote 
> server via OpenWire and subscribes to topics. It bridges those topics to the 
> local AMQ with some later Camel processing.
> The route looks like this:
> <route id="Route_SPLITTER">
>     <from uri="remoteServer:topic:TOPIC_A?durableSubscriptionName=some.user"/>
>     <choice>
>         <when>
>             <simple>${body} == null || ${body} == ''</simple>
>             <log message="Received message with missing body: ${header.CamelMessageHistory}"/>
>         </when>
>         <otherwise>
>         </otherwise>
>     </choice>
>         
>     <to uri="localAMQ:topic:MY_TOPIC_A"/>
>     <split streaming="true" >
>         <method ref="Splitter" method="processMessage"/>
>         <multicast>
>             <to uri="direct:routeSorter"/>
>         </multicast>    
>     </split>
> </route>
>  
> Logging was added to make sure it wasn't an upstream issue (and it's not)
>  
> The data being passed is formatted as arrays of JSON. The <to 
> uri="localAMQ:topic:MY_TOPIC_A"/> just passes it untouched. The Splitter sends 
> a copy elsewhere to be filtered by an order number prefix.
> The internal Camel to AMQ connection is via the vm:// transport using 
> org.apache.camel.component.activemq.ActiveMQComponent (but I have also tried 
> a pooled JMS connection factory with the same results)
> When I connect a test non-durable consumer from a Ruby script using STOMP or 
> NIO, I see the same issue. Some messages appear to have a 0 sized body.
> I can connect a C++ OpenWire consumer from the same server and that 
> instance gets all messages with no 0 size bodies.
> I have tried various versions of Camel and all exhibit the same results. 
> It's also worth noting that the data sent to the splitter function reports 
> no errors either.
> I have also tried some of the older STOMP gem packages but no change (though 
> I have found some odd connection issue when you upgrade to io-wait-0.4.0 from 
> 0.3.1).
>  
> After much swapping things round and testing I've finally narrowed it down to 
> some issue with the vm:// transport...
> I have swapped the internal Camel connection from using vm:// to tcp:// and 
> for the last 24hrs have seen no client errors with 0 sized bodies. 
> I don't have any way to debug this deeper but hopefully someone else will 
> pick this up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information, visit: https://activemq.apache.org/contact
