Mikael Andersson Wigander,

While it is not a Camel solution, large documents can be parsed using a 
streaming XPath parser, which can be implemented on top of SAX. This would work 
in your use case because the XPath expressions in question do not look back at 
earlier nodes. This way the entire document is never read into memory at once. 
When you need millisecond performance, this is a good option.
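
As a rough illustration of the SAX idea, here is a minimal sketch and not a 
full streaming XPath engine. It assumes the fields of interest are leaf 
elements identified by name only (so a path such as //TradgVnRltdAttrbts/Id 
would need an extra element stack). The class name StreamingRecordReader, the 
WANTED set and the callback are hypothetical.

```java
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams a document and emits one small map per record.
final class StreamingRecordReader extends DefaultHandler {
    // Assumption: these leaf element names are the only fields we want.
    private static final Set<String> WANTED = Set.of("NtnlCcy", "FullNm", "Issr");

    private final String recordTag;
    private final Consumer<Map<String, String>> sink;
    private Map<String, String> current; // non-null only while inside a record
    private StringBuilder text;          // non-null only while inside a wanted leaf

    StreamingRecordReader(String recordTag, Consumer<Map<String, String>> sink) {
        this.recordTag = recordTag;
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals(recordTag)) {
            current = new HashMap<>();
        } else if (current != null && WANTED.contains(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so append instead of assigning.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text != null && WANTED.contains(qName)) {
            current.putIfAbsent(qName, text.toString().trim()); // keep first match
            text = null;
        } else if (current != null && qName.equals(recordTag)) {
            sink.accept(current); // hand the record over, then forget it
            current = null;
        }
    }

    // Parses the stream; only the record currently being read is kept in memory.
    static void parse(InputStream in, String recordTag,
                      Consumer<Map<String, String>> sink) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(in, new StreamingRecordReader(recordTag, sink));
    }
}
```

The callback receives each record as soon as its closing tag is seen, so 
memory use depends on the size of one record rather than the whole file.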

Alex Mattern

From: Siano, Stephan <stephan.si...@sap.com.INVALID>
Sent: Tuesday, November 9, 2021 12:41 AM
To: users@camel.apache.org
Subject: [EXTERNAL SENDER:] RE: How to make a bean thread safe?

Hi

I don’t think that this issue is related to thread safety. XPath as such is a 
very expensive operation as it requires parsing the document into a DOM. You 
have 10 of those XPath parameters and the heap dump shows 10 XPath builders 
that are consuming a lot of memory. You would probably do better to pass the 
payload only once (maybe as a Node or Document) and then execute the XPath 
expressions on it inside the bean (then you will only parse your document once 
and have only one DOM tree).
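
A minimal sketch of that idea using only the JDK's javax.xml.xpath API: the 
record is parsed into one Document and every expression is evaluated against 
that same tree. The class name RecordFields and the map keys are hypothetical; 
the three expressions are taken from the bean in the original question. In 
Camel the bean method would presumably take the body as an 
org.w3c.dom.Document parameter instead of ten @XPath parameters, but that 
wiring is not shown here.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: one DOM per record, many XPath lookups on it.
final class RecordFields {
    // Field name -> XPath expression (a subset of those in the question).
    private static final Map<String, String> PATHS = Map.of(
            "isin", "//FinInstrmGnlAttrbts/Id",
            "currency", "//NtnlCcy",
            "fullName", "//FullNm");

    // Parses one record exactly once.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates every expression against the already parsed tree.
    static Map<String, String> extract(Document record) throws XPathExpressionException {
        // XPath objects are not thread-safe, so this sketch creates one per call
        // rather than sharing it between parallel workers.
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : PATHS.entrySet()) {
            // evaluate(String, Object) returns the string value, "" if no match
            out.put(e.getKey(), xpath.evaluate(e.getValue(), record));
        }
        return out;
    }
}
```

Creating the XPath object per call keeps the sketch safe under parallel 
processing; whether caching it per thread is worth the effort would need 
measuring.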

Best regards
Stephan

From: Mikael Andersson Wigander <mikael.andersson.wigan...@pm.me.INVALID>
Sent: Monday, 8 November 2021 20:30
To: Camel Mail List <users@camel.apache.org>
Subject: Sv: How to make a bean thread safe?

There’s a typo in the code sample.
The processing SHOULD be parallel, not sequential as in the snippet.


/M


On Mon, Nov 8, 2021 at 20:11, Mikael Andersson Wigander 
<mikael.andersson.wigan...@pm.me.INVALID> wrote:
Hi

With the risk of being seen as a n00b (again)…

We are processing large XML files (0.5GB/~500.000 records).
To process them we use stream caching, split, parallel processing, XPath and a 
bean.

We get a lot of OutOfMemoryErrors and, after analysing, we see that the call 
to the bean method is the villain.

The process is to split() using tokenizeXML() on a tag that makes up one record 
in the XML.

For each of these records we call a bean whose method uses @XPath on each of 
its parameters.

We see in the heap dump that these calls are never GC'd; we have 90% leftovers.

The question is: is this related to a bean/method that is not thread safe, or 
what could be the reason?
The documentation states that the default behaviour is a singleton and that, 
when used in concurrent processing, it must be thread safe…
https://camel.apache.org/components/3.11.x/bean-component.html#_options

Running as a war under Tomcat 9 on Windows using Camel 3.11.3 and Spring Boot 
2.5.6.
Server has 32GB of RAM…

Route:
from(file("Full"))
                .streamCaching()
                .unmarshal()
                .zipFile()
                .split()
                .tokenizeXML("RefData")
                .streaming()
                .parallelProcessing(false)
                .bean(XmlToSqlBean.class)
                .to(jdbc("default"))
                .end();

Bean:
public class XmlToSqlBean {
    public String toSql(@XPath("//FinInstrmGnlAttrbts/Id") final String isin,
                        @XPath("//NtnlCcy") final String currency,
                        @XPath("//FullNm") final String fullName,
                        @XPath("//TradgVnRltdAttrbts/Id") final String venue,
                        @XPath("//ClssfctnTp") final String classification,
                        @XPath("//TradgVnRltdAttrbts/TermntnDt") final String terminationDate,
                        @XPath("//Issr") final String issuer,
                        @XPath("//MtrtyDt") String maturityDate,
                        @XPath("//TermntdRcrd") final String termnRecord,
                        @XPath("//NewRcrd") final String newRecord) {
        …
    }
}


Thanks

/M



