Re: How to avoid this splitting of single line as multi lines in SplitText?

Andy LoPresto Thu, 16 Feb 2017 22:20:19 -0800

This isn’t working because of a known issue, NIFI-3255. Oleg has submitted a PR 
with a patch and Koji has been reviewing it. There are some outstanding questions 
about provenance chain decisions with original vs. split, but the code fixes 
the exception that was raised, and I was able to build a working flow once I 
applied the patch.


All of this is updated on the Stack Overflow question as well.
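
For anyone who wants to see the intent of the ReplaceText step outside NiFi, 
here is a minimal Python sketch. It is not NiFi code and not the patch: the 
function name escape_newlines_in_quotes is hypothetical, and it assumes LF line 
endings and the double quote as the CSV quote character. It replaces only the 
newlines that fall inside a quoted field with a literal backslash-n, so a 
line-based split then sees one record per line.

```python
def escape_newlines_in_quotes(text: str) -> str:
    """Replace newlines inside double-quoted fields with a literal backslash-n.

    Assumes LF line endings and '"' as the quote character. A doubled quote
    ("") inside a field toggles the state twice, so it is handled correctly.
    """
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "\n" and in_quotes:
            out.append("\\n")  # two characters: backslash and 'n'
        else:
            out.append(ch)
    return "".join(out)


# Sample data taken from the original question in this thread.
sample = (
    "No,NAme,ID,Description\n"
    '1,Stack,232,"ABCDEFGHIJKLMNO\n'
    ' -- Jiuaslkm asdasdasd"\n'
)

flattened = escape_newlines_in_quotes(sample)

# Mimic a one-record-per-split with the header repeated in each split.
header, *records = flattened.splitlines()
splits = [f"{header}\n{record}\n" for record in records]

print(len(splits))
print(splits[0])
```

With the sample above this yields a single split that carries the header plus 
the complete record, which is the result the original question was after.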

Andy LoPresto
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Feb 15, 2017, at 2:19 AM, prabhu Mahendran <[email protected]> wrote:
> 
> Andy,
> 
> I have used the following properties in the ReplaceText processor:
> 
> Search Value: "(.*?)(\n)(.*?)"
> 
> Replacement Value: "$1\\n$3"
> 
> Character Set: UTF-8
> 
> Maximum Buffer Size: 1 MB
> 
> Replacement Strategy: Regex Replace
> 
> Evaluation Mode: Entire Text
> 
> The output of this processor is the same as the input; it did not make any change.
> 
> Thanks,
> prabhu
> 
> On Wed, Feb 15, 2017 at 12:35 PM, Andy LoPresto <[email protected]> wrote:
> Prabhu,
> 
> I answered this on Stack Overflow [1] but I think you could do it with 
> ReplaceText before the SplitText using a regex like
> 
> "(.*?)(\n)(.*?)" replaced with "$1\\n$3"
> 
> [1] http://stackoverflow.com/a/42242665/70465
> 
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> 
>> On Feb 14, 2017, at 10:52 PM, Lee Laim <[email protected]> wrote:
>> 
>> Prabhu,
>> 
>> You need to remove the newlines from within the last field.  I'd recommend 
>> using awk in an ExecuteStreamCommand processor first, then splitting the 
>> text.  Alternatively, you could write a custom processor to specifically 
>> handle the incoming data.
>> 
>> Lee
>> 
>> On Feb 14, 2017, at 11:01 PM, prabhu Mahendran <[email protected]> wrote:
>> 
>>> I have a CSV file which contains the following lines.
>>> 
>>> No,NAme,ID,Description
>>> 1,Stack,232,"ABCDEFGHIJKLMNO
>>>  -- Jiuaslkm asdasdasd"
>>> I used the following processor structure: GetFile --> SplitText
>>> 
>>> In SplitText I set both the header line count and the line split count to 1.
>>> 
>>> So I expected it to split the rows as below:
>>> 
>>>  No,NAme,ID,Description
>>> 1,Stack,232,"ABCDEFGHIJKLMNO
>>>  -- Jiuaslkm asdasdasd:"
>>> But it actually split the CSV into 2 splits, as below:
>>> 
>>> First Split:
>>> 
>>> No,NAme,ID,Description
>>> 1,Stack,232,"ABCDEFGHIJKLMNO
>>> Second Split:
>>> 
>>> No,NAme,ID,Description
>>>     -- Jiuaslkm asdasdasd"
>>> So the data is not being handled correctly.
>>> 
>>> Goal: I need these data lines to be handled as a single line.
>>> 
>>> Can anyone help me resolve this?
>>> 
> 
>

