Does anybody have any thoughts on UTF-8 flow files with TransformXml and
other processors?

Anuj

On Mon, Jun 13, 2016 at 4:45 PM, Anuj Handa <[email protected]> wrote:

> So it seems like it's a UTF-8 issue. When I changed the string to use Hex
> instead of Text and used the hex code with 00 (2 bytes), SplitContent
> worked.
>
> <POSTransaction xmlns is the string I was looking to split on, which
> translates into the following hex code:
>
> *3c0050004f0053005400720061006e00730061006300740069006f006e00200078006d006c006e007300*
>
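A quick way to check what that hex string actually is: the sketch below decodes it outside NiFi, in plain Python. A `00` byte after every ASCII byte is the usual signature of UTF-16 little-endian rather than UTF-8, so the file may well be UTF-16LE; that is an inference from the hex above, not something confirmed in this thread. The names `hex_marker` and `marker_text` are made up for the illustration.

```python
# The hex string quoted above, copied verbatim (without the asterisks).
hex_marker = (
    "3c0050004f0053005400720061006e00"
    "730061006300740069006f006e002000"
    "78006d006c006e007300"
)

raw = bytes.fromhex(hex_marker)

# Every second byte is 0x00, which is how UTF-16LE stores ASCII characters.
marker_text = raw.decode("utf-16-le")
print(marker_text)

# For comparison: the same text in UTF-8 has no 0x00 padding at all.
utf8_hex = marker_text.encode("utf-8").hex()
print(utf8_hex)
```

If the file really is UTF-16, that would also explain why a plain Text match on the tag found nothing.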
> TransformXml is now failing, I think because of the UTF-8. I know I had
> it working with a normal ASCII file.
>
> Do I need to specify somewhere that the flow files are UTF-8, or is it smart
> enough to figure it out on its own?
> Based on some reading, I see that some processors expect UTF-8, so the next
> question would be: do all processors support UTF-8?
>
> Anuj
>
>
>
> On Mon, Jun 13, 2016 at 3:01 PM, Anuj Handa <[email protected]> wrote:
>
>> Thanks Joe. Unfortunately, since my XML has namespaces (xmlns), that
>> approach won't work.
>> Any thoughts on why the split doesn't work using the tag? Does it accept UTF-8
>> flow files?
>>
>> Anuj
>>
>> On Mon, Jun 13, 2016 at 2:50 PM, ski n <[email protected]> wrote:
>>
>>> You can also make your input XML well-formed by creating a custom root
>>> element (e.g. <PostTransactions>...xmldocuments</PostTransactions>)
>>> and then use the SplitXml processor (or just the transformation step).
>>>
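The wrap-in-a-root idea can be tried outside NiFi first. The following is a minimal Python sketch, not the SplitXml processor itself: it assumes the documents are simply concatenated in one text file, strips any per-document XML declarations, and wraps the rest in a root element. The function name `split_documents` and the namespace `urn:example:pos` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Per-document declarations such as <?xml version="1.0"?> are not allowed
# inside a wrapper element, so they are removed before wrapping.
XML_DECL = re.compile(r"<\?xml[^>]*\?>")


def split_documents(text, root_tag="PostTransactions"):
    """Wrap concatenated XML documents in one root and return each child."""
    body = XML_DECL.sub("", text)
    wrapped = "<{0}>{1}</{0}>".format(root_tag, body)
    root = ET.fromstring(wrapped)
    # Each direct child of the wrapper is one original document.
    return [ET.tostring(child, encoding="unicode") for child in root]


# Hypothetical sample input: two namespaced documents in one file.
sample = (
    '<?xml version="1.0"?>'
    '<POSTransaction xmlns="urn:example:pos"><Id>1</Id></POSTransaction>\n'
    '<?xml version="1.0"?>'
    '<POSTransaction xmlns="urn:example:pos"><Id>2</Id></POSTransaction>\n'
)

docs = split_documents(sample)
print(len(docs))
```

In this sketch the namespaced children parse fine under a wrapper that has no namespace of its own, so the xmlns declaration by itself does not rule the approach out.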
>>> 2016-06-13 18:04 GMT+02:00 Anuj Handa <[email protected]>:
>>>
>>>> I have a text file which contains multiple XML documents, each of which
>>>> starts with <POSTransaction xmlns.
>>>> I am trying to break each of the XML docs into one flow file so I can
>>>> then evaluate the XML, convert it into JSON, and load it into a
>>>> database.
>>>>
>>>> I tried just SplitContent and that didn't work. The file is UTF-8;
>>>> not sure if that plays into it. I am running NiFi on Linux and the
>>>> file is also local on Linux.
>>>>
>>>> [image: Inline image 1]
>>>>
>>>> this is my entire workflow.
>>>>
>>>> [image: Inline image 2]
>>>>
>>>>
>>>> On Mon, Jun 13, 2016 at 11:43 AM, Joe Percivall <[email protected]> wrote:
>>>>
>>>>> Awesome, and what processor were you planning to use to split on
>>>>> "#|#|#"? The SplitContent processor[1] can be used to split the content on
>>>>> a sequence of text characters which could split on "<POSTransaction xmlns"
>>>>> without needing to add "#|#|#".
>>>>>
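Splitting on a character sequence is really splitting on bytes, so the marker has to be encoded the same way as the file. The sketch below is plain Python, not SplitContent's implementation; the helper `split_on_marker` and the sample data are hypothetical. It keeps the marker at the start of each piece.

```python
from typing import List


def split_on_marker(data: bytes, marker: bytes) -> List[bytes]:
    """Split data into pieces that each begin with marker."""
    pieces = []
    start = data.find(marker)
    while start != -1:
        nxt = data.find(marker, start + len(marker))
        end = nxt if nxt != -1 else len(data)
        pieces.append(data[start:end])
        start = nxt
    return pieces


marker_text = "<POSTransaction xmlns"
# Hypothetical file content: two documents, stored as UTF-16LE.
content = (
    '<POSTransaction xmlns="urn:example:pos">a</POSTransaction>'
    '<POSTransaction xmlns="urn:example:pos">b</POSTransaction>'
)
data = content.encode("utf-16-le")

# Marker bytes in the wrong encoding never match the file's bytes.
wrong = split_on_marker(data, marker_text.encode("utf-8"))
# Marker bytes in the file's own encoding match once per document.
right = split_on_marker(data, marker_text.encode("utf-16-le"))
print(len(wrong), len(right))
```

The same reasoning applies to any byte-oriented splitter: a marker typed as text is only found if its bytes match the file's encoding.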
>>>>> Also I see "xmlns" and think this is an xml file you are trying to
>>>>> split. If so are you by chance trying to split evenly on each child? If so
>>>>> the "SplitXml" processor[2] would easily take care of that.
>>>>>
>>>>> [1]
>>>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitContent/index.html
>>>>> [2]
>>>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitXml/index.html
>>>>>
>>>>> Joe
>>>>> - - - - - -
>>>>> Joseph Percivall
>>>>> linkedin.com/in/Percivall
>>>>> e: [email protected]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Monday, June 13, 2016 11:26 AM, Anuj Handa <[email protected]>
>>>>> wrote:
>>>>> Yes that's exactly correct.
>>>>>
>>>>>
>>>>> > On Jun 13, 2016, at 11:14 AM, Joe Percivall <[email protected]>
>>>>> wrote:
>>>>> >
>>>>> > Sorry, I got a bit confused. In your original question you said that
>>>>> > you wanted to append the value, and I took it that you just wanted to
>>>>> > append the value to the end of the line or text.
>>>>> >
>>>>> > Let me try to restate your goal so I'm sure I understand:
>>>>> > ultimately you want to split the incoming FlowFile on each occurrence of
>>>>> > "<POSTransaction xmlns", and you are planning on using ReplaceText to add
>>>>> > "#|#|#" before each occurrence so that it will be easy to split?
>>>>> >
>>>>> >
>>>>> > Joe
>>>>> > - - - - - -
>>>>> > Joseph Percivall
>>>>> > linkedin.com/in/Percivall
>>>>> > e: [email protected]
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Monday, June 13, 2016 11:05 AM, Anuj Handa <[email protected]>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> >
>>>>> > Anuj
>>>>> > Hi Joe,
>>>>> >
>>>>> > I modified the process per your suggestion, but it only replaces
>>>>> > the first occurrence. There are multiple such tags which it doesn't
>>>>> > replace.
>>>>> > When I used the line-by-line evaluation mode, it appended the value to
>>>>> > every line in the file and not just to the ones I wanted.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Jun 13, 2016 at 10:40 AM, Joe Percivall <
>>>>> [email protected]> wrote:
>>>>> >
>>>>> > Hello,
>>>>> >>
>>>>> >> In order to use ReplaceText[1] to solely append a value to the end
>>>>> >> of the entire text, change the "Replacement Strategy" to "Append" and
>>>>> >> leave "Evaluation Mode" as "Entire Text". This will take whatever is the
>>>>> >> "Replacement Value" and append it as a literal (without interpreting
>>>>> >> back-references) to the end of the text.
>>>>> >>
>>>>> >> Alternatively, if you want to append to the end of each line, then
>>>>> >> change "Evaluation Mode" to "Line-by-Line".
>>>>> >>
>>>>> >> [1]
>>>>> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ReplaceText/index.html
>>>>> >>
>>>>> >>
>>>>> >> Hope that helps,
>>>>> >> Joe
>>>>> >> - - - - - -
>>>>> >> Joseph Percivall
>>>>> >> linkedin.com/in/Percivall
>>>>> >> e: [email protected]
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Monday, June 13, 2016 10:05 AM, Anuj Handa <[email protected]>
>>>>> wrote:
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> I am trying to read a file and then use ReplaceText to append a
>>>>> >> string so I can split the line in the next step. I am unable to make
>>>>> >> ReplaceText work.
>>>>> >> The flow file is going through as success without the string being
>>>>> >> appended or replaced.
>>>>> >>
>>>>> >> Any thoughts on what I could be doing wrong?
>>>>> >>
>>>>>
>>>>
>>>>
>>>
>>
>
