So it seems like it's a UTF-8 (encoding) issue: when I changed the string format to Hex instead of Text and used the hex code with 00 padding (2 bytes per character), SplitContent worked.
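For illustration, here is a minimal Python sketch (not NiFi code) of why the 00-padded hex matches. Bytes of the form 3c 00 50 00 ... are what the delimiter looks like when encoded as UTF-16LE, whereas UTF-8 would use one byte per ASCII character. That the file is UTF-16LE is an assumption based on that padding; nothing in this thread confirms it. The helper names delimiter_hex and split_documents, and the urn:example namespace in the sample data, are hypothetical.

```python
from typing import List

# Delimiter text taken from the thread.
DELIMITER = "<POSTransaction xmlns"


def delimiter_hex(encoding: str) -> str:
    """Return the delimiter as a hex string in the given encoding."""
    return DELIMITER.encode(encoding).hex()


def split_documents(data: bytes, encoding: str) -> List[bytes]:
    """Split raw bytes before each occurrence of the encoded delimiter.

    Each returned piece starts with the delimiter. Bytes before the first
    occurrence are dropped. If the delimiter is never found, the result
    is an empty list.
    """
    marker = DELIMITER.encode(encoding)
    pieces = []
    start = data.find(marker)
    while start != -1:
        nxt = data.find(marker, start + len(marker))
        pieces.append(data[start:] if nxt == -1 else data[start:nxt])
        start = nxt
    return pieces


if __name__ == "__main__":
    print(delimiter_hex("utf-8"))
    print(delimiter_hex("utf-16-le"))

    # Hypothetical sample: two documents, stored as UTF-16LE (assumption).
    sample = (
        '<POSTransaction xmlns="urn:example">a</POSTransaction>\n'
        '<POSTransaction xmlns="urn:example">b</POSTransaction>\n'
    ).encode("utf-16-le")

    # Searching with the matching encoding finds both documents;
    # searching with the plain one-byte-per-character form finds none.
    print(len(split_documents(sample, "utf-16-le")))
    print(len(split_documents(sample, "utf-8")))
```

If the file does turn out to be UTF-16, converting it to UTF-8 before splitting might be simpler than matching on hex, but that depends on what the downstream processors expect.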
<POSTransaction xmlns is the string I was looking to split on, which
translates into the following hex code:

3c0050004f0053005400720061006e00730061006300740069006f006e00200078006d006c006e007300

TransformXml is now failing, I think because of the UTF-8. I know I had
it working with a plain ASCII file.

Do I need to specify somewhere that the flow files are UTF-8, or is it
smart enough to figure it out on its own? Based on some reading, I see
that some processors expect UTF-8, so the next question would be: do all
processors support UTF-8?

Anuj

On Mon, Jun 13, 2016 at 3:01 PM, Anuj Handa <[email protected]> wrote:

> Thanks Joe, unfortunately since my XML has namespaces (xmlns), that
> approach won't work.
> Any thoughts on why the split doesn't work using the tag? Does it accept
> UTF-8 flow files?
>
> Anuj
>
> On Mon, Jun 13, 2016 at 2:50 PM, ski n <[email protected]> wrote:
>
>> You can also make your input XML well-formed by creating a custom root
>> element (e.g. <PostTransactions>...xmldocuments</PostTransactions>)
>> and then use the SplitXml processor (or just the transformation step).
>>
>> 2016-06-13 18:04 GMT+02:00 Anuj Handa <[email protected]>:
>>
>>> I have a text file which has multiple XML documents, each of which
>>> starts with <POSTransaction xmlns.
>>> I am trying to break each one of the XML docs into one flow file so I
>>> can then use evaluate XML, convert it into JSON, and load it into a
>>> database.
>>>
>>> I tried just the split content and that didn't work. The file is UTF-8;
>>> not sure if that plays into it. I am running NiFi on Linux and the
>>> file is also local on Linux.
>>>
>>> [image: Inline image 1]
>>>
>>> This is my entire workflow.
>>>
>>> [image: Inline image 2]
>>>
>>>
>>> On Mon, Jun 13, 2016 at 11:43 AM, Joe Percivall <[email protected]>
>>> wrote:
>>>
>>>> Awesome, and what processor were you planning to use to split on
>>>> "#|#|#"?
>>>> The SplitContent processor[1] can be used to split the content on
>>>> a sequence of text characters, so it could split on
>>>> "<POSTransaction xmlns" without needing to add "#|#|#".
>>>>
>>>> Also, I see "xmlns" and think this is an XML file you are trying to
>>>> split. If so, are you by chance trying to split evenly on each child?
>>>> If so, the "SplitXml" processor[2] would easily take care of that.
>>>>
>>>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html
>>>> [2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitXml/index.html
>>>>
>>>> Joe
>>>> - - - - - -
>>>> Joseph Percivall
>>>> linkedin.com/in/Percivall
>>>> e: [email protected]
>>>>
>>>> On Monday, June 13, 2016 11:26 AM, Anuj Handa <[email protected]>
>>>> wrote:
>>>> Yes that's exactly correct.
>>>>
>>>>> On Jun 13, 2016, at 11:14 AM, Joe Percivall <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Sorry, I got a bit confused. In your original question you said that
>>>>> you wanted to append the value, and I took it that you just wanted to
>>>>> append the value to the end of the line or text.
>>>>>
>>>>> Let me try and restate your goal so I'm sure I understand: ultimately
>>>>> you want to split the incoming FlowFile on each occurrence of
>>>>> "<POSTransaction xmlns" and you are planning on using ReplaceText to
>>>>> add "#|#|#" before each occurrence so that it will be easy to split?
>>>>>
>>>>> Joe
>>>>>
>>>>> On Monday, June 13, 2016 11:05 AM, Anuj Handa <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Anuj
>>>>> Hi Joe,
>>>>>
>>>>> I modified the process per your suggestion but it only works to
>>>>> replace the first occurrence. There are multiple such tags which it
>>>>> doesn't replace.
>>>>> When I used evaluation mode line by line it appended it to every line
>>>>> in the file and not to the one I wanted to.
>>>>>
>>>>> On Mon, Jun 13, 2016 at 10:40 AM, Joe Percivall <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hello,
>>>>>>
>>>>>> In order to use ReplaceText[1] to solely append a value to the end
>>>>>> of the entire text, change the "Replacement Strategy" to "Append"
>>>>>> and leave "Evaluation Mode" as "Entire Text". This will take whatever
>>>>>> is the "Replacement Value" and append it as a literal (without
>>>>>> interpreting back-references) to the end of the text.
>>>>>>
>>>>>> Alternatively, if you want to append to the end of each line, then
>>>>>> change "Evaluation Mode" to "Line-by-Line".
>>>>>>
>>>>>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ReplaceText/index.html
>>>>>>
>>>>>> Hope that helps,
>>>>>> Joe
>>>>>>
>>>>>> On Monday, June 13, 2016 10:05 AM, Anuj Handa <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to read a file and then use ReplaceText to append a
>>>>>> string so I can split the line in the next step. I am unable to make
>>>>>> the ReplaceText work.
>>>>>> The flowfile is going through as success without the string being
>>>>>> appended or replaced.
>>>>>>
>>>>>> Any thoughts on what I could be doing wrong?
