Hi

Some time back (a couple of years ago) I used split with xtokenize on a
large XML file at three levels, and it worked very well. Two of the three
elements had the same tag name, but I needed elements from each level, so
I did the split/tokenize in separate routes: the first route splits and
tokenizes (to get an element) and calls the next route, which splits and
tokenizes the next section(s), and finally a third route handles the third
level. I use the XML DSL, and this was with Camel 2.15.1, so I'm sure it
will still work.
The file had between 500,000 and 1,000,000 rows, and I tried different
approaches to handle both the size and the speed of processing (Claus had
a good article on processing large XML files, which I followed).
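
For anyone hitting the nested-tag problem described further down the
thread, the idea behind a streaming (StAX-style) split is to count nesting
depth instead of stopping at the first closing tag. Below is a minimal
sketch using only the JDK's javax.xml.stream API, not Camel's own
splitter. The class and method names (NestedTagSplitter, splitOutermost)
are made up for illustration, and it matches on the local tag name only,
ignoring namespaces.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper for illustration; not part of Camel.
class NestedTagSplitter {

    // Returns one string per outermost <tag> element. Inner elements with
    // the same name stay inside the fragment of their outermost ancestor.
    static List<String> splitOutermost(String xml, String tag) {
        List<String> fragments = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
            StringWriter buffer = null;
            XMLEventWriter writer = null;
            int depth = 0; // how many <tag> elements are currently open

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                boolean opens = event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(tag);
                boolean closes = event.isEndElement()
                        && event.asEndElement().getName().getLocalPart().equals(tag);

                if (opens) {
                    if (depth == 0) {
                        // Outermost start tag: begin a new fragment.
                        buffer = new StringWriter();
                        writer = outputFactory.createXMLEventWriter(buffer);
                    }
                    depth++;
                }
                if (depth > 0) {
                    // Copy every event that lies inside the current fragment.
                    writer.add(event);
                }
                if (closes) {
                    depth--;
                    if (depth == 0) {
                        // Matching outermost end tag: the fragment is complete.
                        writer.close();
                        fragments.add(buffer.toString());
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return fragments;
    }

    public static void main(String[] args) {
        String xml = "<Rpt><Tx><New><TxId>1</TxId><Tx><TradVn>XOFF</TradVn></Tx></New></Tx>"
                + "<Tx><New><TxId>2</TxId></New></Tx></Rpt>";
        for (String fragment : splitOutermost(xml, "Tx")) {
            System.out.println(fragment);
        }
    }
}
```

For a large file you would probably pass an InputStream to the reader
instead of a String, and hand each fragment on as soon as it completes
rather than collecting them all in a list.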

Cheers

Bob




On Wed, Jun 17, 2020 at 1:22 PM Claus Ibsen <claus.ib...@gmail.com> wrote:

> Hi
>
> It can be both; however, it's a little better as an argument
>
> split(xpath(xxxx), xxxx)
>
>
>
> On Wed, Jun 17, 2020 at 1:10 PM Mikael Andersson Wigander
> <mikael.grevs...@gmail.com> wrote:
> >
> > Hi
> >
> > Makes sense.
> >
> > Tried splitting using xpath but it didn’t work either.
> > Should xpath be passed as an argument to split, or used as its own
> > statement (.xpath())?
> >
> > // Mikael Andersson Wigander
> >
> >
> > > On 17 June 2020 at 10:14, Claus Ibsen <claus.ib...@gmail.com> wrote:
> > >
> > > Hi
> > >
> > > No, tokenizeXML is not for complex XML with tags that are nested. It
> > > uses regexp parsing etc.
> > >
> > > Instead, use camel-stax or camel-jaxb or something like that.
> > >
> > >
> > >> On Wed, Jun 17, 2020 at 9:14 AM Mikael Andersson Wigander
> > >> <mikael.grevs...@gmail.com> wrote:
> > >>
> > >> Hi
> > >>
> > > >> We have an XML file to split on the tag <Tx>.
> > > >> However, this tag is also present in a node further down the tree.
> > > >>
> > > >> tokenizeXML is used in our application, but it won’t work here
> > > >> because the split ends prematurely.
> > >>
> > >> Here’s the XML
> > >>
> > >> <?xml version="1.0" encoding="UTF-8"?>
> > > >> <UVMiFIRDocument xmlns="urn:uv:xsd:unavista.mifir.iso20022.001.001.001">
> > >>    <UVHeader>
> > >>        <UVHeader xmlns="unavista.header.001.001.001">
> > >>            <SubmittingEntityID>1312312</SubmittingEntityID>
> > >>        </UVHeader>
> > >>    </UVHeader>
> > >>    <Document>
> > >>        <Document
> xmlns="urn:iso:std:iso:20022:tech:xsd:DRAFT15auth.016.001.01">
> > >>            <FinInstrmRptgTxRpt>
> > >>                <Tx>
> > >>                    <New>
> > >>                        <TxId>197X85138XMT</TxId>
> > >>                        <ExctgPty>1231231</ExctgPty>
> > >>                        <InvstmtPtyInd>true</InvstmtPtyInd>
> > >>                        <SubmitgPty>312312</SubmitgPty>
> > >>                        <Buyr>
> > >>                            <AcctOwnr>
> > >>                                <Id>
> > >>                                    <LEI>123123</LEI>
> > >>                                </Id>
> > >>                                <CtryOfBrnch>NL</CtryOfBrnch>
> > >>                            </AcctOwnr>
> > >>                            <DcsnMakr>
> > >>                                <LEI>549300DLR3UX38D4Z689</LEI>
> > >>                            </DcsnMakr>
> > >>                        </Buyr>
> > >>                        <Sellr>
> > >>                            <AcctOwnr>
> > >>                                <Id>
> > >>                                    <LEI>123123123</LEI>
> > >>                                </Id>
> > >>                            </AcctOwnr>
> > >>                        </Sellr>
> > >>                        <OrdrTrnsmssn>
> > >>                            <TrnsmssnInd>true</TrnsmssnInd>
> > >>                        </OrdrTrnsmssn>
> > >>                        <Tx>
> > >>                            <TradDt>2020-06-05T21:18:32.000Z</TradDt>
> > >>                            <TradgCpcty>AOTC</TradgCpcty>
> > >>                            <Qty>
> > >>                                <NmnlVal Ccy="EUR">3.57</NmnlVal>
> > >>                            </Qty>
> > >>                            <Pric>
> > >>                                <Pric>
> > >>                                    <MntryVal>
> > >>                                        <Amt Ccy="USD">1.131818</Amt>
> > >>                                    </MntryVal>
> > >>                                </Pric>
> > >>                            </Pric>
> > >>                            <TradVn>XOFF</TradVn>
> > >>                        </Tx>
> > >>                        <FinInstrm>
> > >>                            <Othr>
> > >>                                <FinInstrmGnlAttrbts>
> > >>                                    <FullNm>USD/EUR</FullNm>
> > >>                                    <ClssfctnTp>JFTXFP</ClssfctnTp>
> > >>                                    <NtnlCcy>USD</NtnlCcy>
> > >>                                </FinInstrmGnlAttrbts>
> > >>                                <DerivInstrmAttrbts>
> > >>                                    <XpryDt>2020-06-09</XpryDt>
> > >>                                    <PricMltplr>1</PricMltplr>
> > >>                                    <UndrlygInstrm>
> > >>                                        <Othr>
> > >>                                            <Sngl>
> > >>                                                <Indx>
> > >>                                                    <Nm>
> > >>                                                        <RefRate>
> > >>
> <Nm>USD/EUR</Nm>
> > >>                                                        </RefRate>
> > >>                                                    </Nm>
> > >>                                                </Indx>
> > >>                                            </Sngl>
> > >>                                        </Othr>
> > >>                                    </UndrlygInstrm>
> > >>                                    <DlvryTp>PHYS</DlvryTp>
> > >>                                </DerivInstrmAttrbts>
> > >>                            </Othr>
> > >>                        </FinInstrm>
> > >>                        <ExctgPrsn>
> > >>                            <Clnt>NORE</Clnt>
> > >>                        </ExctgPrsn>
> > >>                        <AddtlAttrbts>
> > >>
> <SctiesFincgTxInd>false</SctiesFincgTxInd></AddtlAttrbts>
> > >>                    </New>
> > >>                </Tx>
> > >>            </FinInstrmRptgTxRpt>
> > >>        </Document>
> > >>    </Document>
> > >> </UVMiFIRDocument>
> > >>
> > > >> The debugger shows that the split fragment is “broken”:
> > >>
> > >> <Tx>
> > >>                    <New>
> > >>                        <TxId>197X85138XMT</TxId>
> > >>                        <ExctgPty>549300DLR3UX38D4Z689</ExctgPty>
> > >>                        <InvstmtPtyInd>true</InvstmtPtyInd>
> > >>                        <SubmitgPty>549300FVRWYPDFJTH118</SubmitgPty>
> > >>                        <Buyr>
> > >>                            <AcctOwnr>
> > >>                                <Id>
> > >>                                    <LEI>5493000WZY3YLO3WB727</LEI>
> > >>                                </Id>
> > >>                                <CtryOfBrnch>NL</CtryOfBrnch>
> > >>                            </AcctOwnr>
> > >>                            <DcsnMakr>
> > >>                                <LEI>549300DLR3UX38D4Z689</LEI>
> > >>                            </DcsnMakr>
> > >>                        </Buyr>
> > >>                        <Sellr>
> > >>                            <AcctOwnr>
> > >>                                <Id>
> > >>                                    <LEI>5493006KMX1VFTPYPW14</LEI>
> > >>                                </Id>
> > >>                            </AcctOwnr>
> > >>                        </Sellr>
> > >>                        <OrdrTrnsmssn>
> > >>                            <TrnsmssnInd>true</TrnsmssnInd>
> > >>                        </OrdrTrnsmssn>
> > >>                        <Tx>
> > >>                            <TradDt>2020-06-05T21:18:32.000Z</TradDt>
> > >>                            <TradgCpcty>AOTC</TradgCpcty>
> > >>                            <Qty>
> > >>                                <NmnlVal Ccy="EUR">3.57</NmnlVal>
> > >>                            </Qty>
> > >>                            <Pric>
> > >>                                <Pric>
> > >>                                    <MntryVal>
> > >>                                        <Amt Ccy="USD">1.131818</Amt>
> > >>                                    </MntryVal>
> > >>                                </Pric>
> > >>                            </Pric>
> > >>                            <TradVn>XOFF</TradVn>
> > >>                        </Tx>
> > >>
> > >>
> > > >> Can this be done using tokenizeXML, or is another approach needed?
> > >>
> > >>
> > >>
> > >> Thx
> > >
> > >
> > >
> > > --
> > > Claus Ibsen
> > > -----------------
> > > http://davsclaus.com @davsclaus
> > > Camel in Action 2: https://www.manning.com/ibsen2
>


-- 
Bob Anderson
+27 (0) 82 389 0335