Re: Replace Text

Anuj Handa Mon, 13 Jun 2016 13:45:39 -0700

So it seems like it's a UTF-8 issue. When I changed the string to use Hex
instead of Text and used the hex code with 00 after each character (2 bytes
per character), SplitContent worked.


The string I was looking to split on is "<POSTransaction xmlns", which
translates into the following hex code:
3c0050004f0053005400720061006e00730061006300740069006f006e00200078006d006c006e007300
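
As a quick check of the hex value above, here is a minimal Python sketch. It assumes (this is not stated in the thread) that the file is actually UTF-16 little-endian: a 00 byte after every ASCII character is what UTF-16LE produces, whereas UTF-8 encodes each of these characters as a single byte. The sample data and variable names are hypothetical and only illustrate splitting on the byte sequence.

```python
# The marker string that each XML document in the file starts with.
marker = "<POSTransaction xmlns"

# UTF-8 encodes each of these ASCII characters as a single byte ...
utf8_hex = marker.encode("utf-8").hex()

# ... while UTF-16LE (assumed here) adds a 00 byte after each one.
utf16le_hex = marker.encode("utf-16-le").hex()

# Hypothetical sample: two tiny documents stored back to back as UTF-16LE.
sample = '<POSTransaction xmlns="a"/><POSTransaction xmlns="b"/>'.encode("utf-16-le")

# Split on the byte sequence, then put the marker back in front of each piece.
needle = bytes.fromhex(utf16le_hex)
pieces = [needle + part for part in sample.split(needle) if part]
documents = [piece.decode("utf-16-le") for piece in pieces]
```

If `utf16le_hex` matches the hex string that made the split work, the content is most likely UTF-16LE rather than UTF-8.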

TransformXml is now failing, I think because of the UTF-8 encoding. I know I had
it working with a plain ASCII file.

Do I need to specify somewhere that the flow files are UTF-8, or is NiFi smart
enough to figure it out on its own?
Based on some reading, I see that some processors expect UTF-8, so the next
question would be: do all processors support UTF-8?
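
For illustration only, here is a rough Python sketch of the kind of conversion that might help before the XML steps, assuming the content is UTF-16 as the 2-byte hex pattern suggests. Inside NiFi, the ConvertCharacterSet processor looks like the closest fit, though I have not confirmed it against this flow. The function name `to_utf8` and the BOM-less detection heuristic are my own assumptions.

```python
import codecs


def to_utf8(raw: bytes) -> bytes:
    """Re-encode UTF-16 bytes as UTF-8; return anything else unchanged."""
    # A byte order mark identifies UTF-16 and its byte order unambiguously.
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        return raw.decode("utf-16").encode("utf-8")
    # Heuristic (an assumption): "<" followed by a 00 byte suggests
    # UTF-16LE XML without a byte order mark.
    if raw[:2] == b"<\x00":
        return raw.decode("utf-16-le").encode("utf-8")
    # Otherwise assume the content is already ASCII or UTF-8.
    return raw
```

If the documents carry an XML declaration that names UTF-16 as the encoding, that declaration would also need updating after the conversion; the sketch does not handle this.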

Anuj



On Mon, Jun 13, 2016 at 3:01 PM, Anuj Handa <[email protected]> wrote:

> Thanks Joe. Unfortunately, since my XML has namespaces (xmlns), that
> approach won't work.
> Any thoughts on why the split doesn't work using the tag? Does it accept UTF-8
> flow files?
>
> Anuj
>
> On Mon, Jun 13, 2016 at 2:50 PM, ski n <[email protected]> wrote:
>
>> You can also make your input XML well-formed by wrapping it in a custom root
>> element (e.g. <PostTransactions>...xmldocuments</PostTransactions>)
>> and then using the SplitXml processor (or just the transformation step).
>>
>> 2016-06-13 18:04 GMT+02:00 Anuj Handa <[email protected]>:
>>
>>> I have a text file that contains multiple XML documents, each of which
>>> starts with <POSTransaction xmlns.
>>> I am trying to break each of the XML docs out into its own flow file so I can
>>> then evaluate the XML, convert it to JSON, and load it into a
>>> database.
>>>
>>> I tried just SplitContent and that didn't work. The file is UTF-8;
>>> not sure if that plays into it. I am running NiFi on Linux and the
>>> file is also local on Linux.
>>>
>>> [image: Inline image 1]
>>>
>>> this is my entire workflow.
>>>
>>> [image: Inline image 2]
>>>
>>>
>>> On Mon, Jun 13, 2016 at 11:43 AM, Joe Percivall <[email protected]>
>>> wrote:
>>>
>>>> Awesome, and what processor were you planning to use to split on
>>>> "#|#|#"? The SplitContent processor[1] can be used to split the content on
>>>> a sequence of text characters which could split on "<POSTransaction xmlns"
>>>> without needing to add "#|#|#".
>>>>
>>>> Also I see "xmlns" and think this is an xml file you are trying to
>>>> split. If so are you by chance trying to split evenly on each child? If so
>>>> the "SplitXml" processor[2] would easily take care of that.
>>>>
>>>> [1]
>>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html
>>>> [2]
>>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitXml/index.html
>>>>
>>>> Joe
>>>> - - - - - -
>>>> Joseph Percivall
>>>> linkedin.com/in/Percivall
>>>> e: [email protected]
>>>>
>>>>
>>>>
>>>>
>>>> On Monday, June 13, 2016 11:26 AM, Anuj Handa <[email protected]>
>>>> wrote:
>>>> Yes that's exactly correct.
>>>>
>>>>
>>>> > On Jun 13, 2016, at 11:14 AM, Joe Percivall <[email protected]>
>>>> wrote:
>>>> >
>>>> > Sorry I got a bit confused, in your original question you said that
>>>> you wanted to append the value and I took it that you just wanted to append
>>>> the value to the end of the line or text.
>>>> >
>>>> > Let me try and restate your goal so I'm sure I understand, ultimately
>>>> you want to split the incoming FlowFile on each occurrence of
>>>> "<POSTransaction xmlns" and you are planning on using ReplaceText to add
>>>> "#|#|#" before each occurrence so that it will be easy to split?
>>>> >
>>>> >
>>>> > Joe
>>>> > - - - - - -
>>>> > Joseph Percivall
>>>> > linkedin.com/in/Percivall
>>>> > e: [email protected]
>>>> >
>>>> >
>>>> >
>>>> > On Monday, June 13, 2016 11:05 AM, Anuj Handa <[email protected]>
>>>> wrote:
>>>> >
>>>> >
>>>> >
>>>> > Anuj
>>>> > Hi Joe,
>>>> >
>>>> > I modified the processor per your suggestion, but it only
>>>> replaces the first occurrence. There are multiple such tags that it doesn't
>>>> replace.
>>>> > When I used the Line-by-Line evaluation mode, it appended the value to every line
>>>> in the file and not only to the ones I wanted.
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Jun 13, 2016 at 10:40 AM, Joe Percivall <
>>>> [email protected]> wrote:
>>>> >
>>>> > Hello,
>>>> >>
>>>> >> In order to use ReplaceText[1] to solely append a value to the end
>>>> of the entire text, change the "Replacement Strategy" to "Append" and
>>>> leave "Evaluation Mode" as "Entire Text". This will take whatever is the
>>>> "Replacement Value" and append it as a literal (without interpreting
>>>> back-references) to the end of the text.
>>>> >>
>>>> >> Alternatively, if you want to append to the end of each line then
>>>> change "Evaluation Mode" to "Line-by-Line".
>>>> >>
>>>> >> [1]
>>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ReplaceText/index.html
>>>> >>
>>>> >>
>>>> >> Hope that helps,
>>>> >> Joe
>>>> >> - - - - - -
>>>> >> Joseph Percivall
>>>> >> linkedin.com/in/Percivall
>>>> >> e: [email protected]
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Monday, June 13, 2016 10:05 AM, Anuj Handa <[email protected]>
>>>> wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> I am trying to read a file and then use ReplaceText to append a
>>>> string so I can split the line in the next step. I am unable to make
>>>> ReplaceText work.
>>>> >> The flow file goes through as success without the string being
>>>> appended or replaced.
>>>> >>
>>>> >> Any thoughts on what I could be doing wrong?
>>>> >>
>>>>
>>>
>>>
>>
>
