[
https://issues.apache.org/jira/browse/AMQ-9855?focusedWorklogId=1005508&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-1005508
]
ASF GitHub Bot logged work on AMQ-9855:
---------------------------------------
Author: ASF GitHub Bot
Created on: 17/Feb/26 06:17
Start Date: 17/Feb/26 06:17
Worklog Time Spent: 10m
Work Description: pradeep85841 commented on PR #1659:
URL: https://github.com/apache/activemq/pull/1659#issuecomment-3912506450
> I think adding the storeContent() part to the clearUnmarshalledState() is
ok but I'm curious where/why we need to do it. The only spot I could find it
being called is when reduceMemoryFootprint is true and we are validating that
it's marshaled before calling it.
>
> I just want to make sure we fully understand the root cause of your issue
so that we can validate the fix is correct.
Hi @cshannon,
Great question. After digging into the analysis from @tabish121 and my test
results, here is the 'why':
The root cause is a state-transition gap. Currently,
clearUnmarshalledState() assumes the marshaled content bytes are already
present and simply wipes the 'live' data (String, Map, etc.). If a background
task (like a memory sweep) hits this method before marshaling is finished, the
message is left empty, which is the data loss I caught.
By adding a quick storeContent() check inside clearUnmarshalledState(), we
make the message 'self-healing.' It ensures that the message protects its own
data integrity regardless of the timing or threading context.
I’ve confirmed in my tests that this fix resolves the issue without needing
synchronized locks. Since the same vulnerability exists in MapMessage,
ObjectMessage, and StreamMessage as well, I’d like to apply this hardening to
all of them for consistency.
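To make the state-transition gap described above concrete, here is a minimal, single-threaded sketch. It is not ActiveMQ's actual code: the class SketchMessage, its fields, and the guarded flag are hypothetical stand-ins that only borrow the method names storeContent() and clearUnmarshalledState() from the discussion. It shows what happens when the live data is cleared before the marshaled bytes exist, with and without the proposed guard. It does not model the threading or timing aspects.

```java
import java.nio.charset.StandardCharsets;

class ClearStateSketch {

    /** Hypothetical stand-in for a text message; NOT the real ActiveMQ class. */
    static class SketchMessage {
        private final boolean guarded; // assumption: toggles the proposed fix
        private String text;           // unmarshalled ("live") form
        private byte[] content;        // marshaled form, null until stored

        SketchMessage(boolean guarded) {
            this.guarded = guarded;
        }

        void setText(String text) {
            this.text = text;
            this.content = null; // marshaled bytes are now stale
        }

        /** Marshal the live text into bytes if that has not happened yet. */
        void storeContent() {
            if (content == null && text != null) {
                content = text.getBytes(StandardCharsets.UTF_8);
            }
        }

        /** Drop the live form; the guarded variant marshals first. */
        void clearUnmarshalledState() {
            if (guarded) {
                storeContent();
            }
            text = null;
        }

        /** Rebuild the live text from the marshaled bytes when needed. */
        String getText() {
            if (text == null && content != null) {
                text = new String(content, StandardCharsets.UTF_8);
            }
            return text;
        }
    }

    public static void main(String[] args) {
        SketchMessage unguarded = new SketchMessage(false);
        unguarded.setText("payload");
        unguarded.clearUnmarshalledState(); // cleared before marshaling
        System.out.println("unguarded: " + unguarded.getText());

        SketchMessage guarded = new SketchMessage(true);
        guarded.setText("payload");
        guarded.clearUnmarshalledState(); // marshals first, then clears
        System.out.println("guarded: " + guarded.getText());
    }
}
```

In this sketch the unguarded message ends up with neither live text nor bytes, while the guarded one can rebuild its text from the stored bytes. Whether the real fix is also safe under concurrent access is a separate question that this example does not answer.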
Issue Time Tracking
-------------------
Worklog Id: (was: 1005508)
Time Spent: 7h 10m (was: 7h)
> Intermittent null/empty body when consuming from a topic (vm:// transport)
> --------------------------------------------------------------------------
>
> Key: AMQ-9855
> URL: https://issues.apache.org/jira/browse/AMQ-9855
> Project: ActiveMQ
> Issue Type: Bug
> Components: AMQP, Camel
> Affects Versions: 6.2.0, 6.1.2, 6.1.6, 6.1.7
> Reporter: JJ
> Priority: Major
> Fix For: 6.3.0
>
> Time Spent: 7h 10m
> Remaining Estimate: 0h
>
> Also see AMQ-6708. This is very much the same issue, but with more detail.
> The OP on that ticket hasn't been seen since 2017.
> We have a simple AMQ instance using Camel. It connects to an upstream remote
> server via OpenWire and subscribes to topics. It bridges those topics to the
> local AMQ with some later Camel processing.
> The route looks like this:
> <route id="Route_SPLITTER">
>   <from uri="remoteServer:topic:TOPIC_A?durableSubscriptionName=some.user"/>
>   <choice>
>     <when>
>       <simple>${body} == null || ${body} == ''</simple>
>       <log message="Received message with missing body: ${header.CamelMessageHistory}"/>
>     </when>
>     <otherwise>
>     </otherwise>
>   </choice>
>
>   <to uri="localAMQ:topic:MY_TOPIC_A"/>
>   <split streaming="true">
>     <method ref="Splitter" method="processMessage"/>
>     <multicast>
>       <to uri="direct:routeSorter"/>
>     </multicast>
>   </split>
> </route>
>
> Logging was added to make sure it wasn't an upstream issue (and it's not).
>
> The data being passed is formatted as arrays of JSON. The <to
> uri="localAMQ:topic:MY_TOPIC_A"/> just passes it untouched. The Splitter sends
> a copy elsewhere to be filtered by an order number prefix.
> The internal Camel to AMQ connection is via the vm:// transport using
> org.apache.camel.component.activemq.ActiveMQComponent (but I have also tried
> a pooled JMS connection factory with the same results).
> When I connect a test non-durable consumer from a Ruby script using STOMP or
> NIO, I see the same issue. Some messages appear to have a 0-sized body.
> I can connect a C++ OpenWire consumer from the same server and that
> instance gets all messages with no 0-sized bodies.
> I have tried various versions of Camel and all exhibit the same results.
> It's also worth noting that the data sent to the splitter function reports
> no errors either.
> I have also tried some of the older STOMP gem packages but saw no change.
> (Though I have found some odd connection issue when you upgrade to
> io-wait-0.4.0 from 0.3.1.)
>
> After much swapping things round and testing I've finally narrowed it down to
> some issue with the vm:// transport...
> I have swapped the internal Camel connection from using vm:// to tcp:// and
> for the last 24hrs have seen no client errors with 0 sized bodies.
> I don't have any way to debug this deeper but hopefully someone else will
> pick this up.