Re: I have attribute called X. But X.0 and X.1 also got created. Why?

Andy LoPresto Mon, 13 Mar 2017 18:12:07 -0700

Srini,

I thought about it a little bit more and I think I have a temporary solution 
that will actually work for you. I still recommend you open the Jira but the 
following regex should work for you:


^.*(.??)$


I’ll break down the regex:

^     - Match at the start of the content
.*    - Match any character any number of times
(.??) - Capture group to match any character 0 or 1 times, lazy (i.e. will 
prefer 0 over 1)
$     - Match the end of the content

This results in the following LogAttribute output:

--------------------------------------------------
Standard FlowFile Attributes
Key: 'entryDate'
        Value: 'Mon Mar 13 17:38:03 PDT 2017'
Key: 'lineageStartDate'
        Value: 'Mon Mar 13 17:38:03 PDT 2017'
Key: 'fileSize'
        Value: '29'
FlowFile Attribute Map Content
Key: 'entire_match.0'
        Value: 'This is a plaintext message. '
Key: 'filename'
        Value: '1343455595942828'
Key: 'path'
        Value: './'
Key: 'uuid'
        Value: '9382e5f0-782d-4c71-963f-1004c2a50275'
--------------------------------------------------
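
The breakdown above can also be checked outside NiFi. The following is a minimal sketch, not a NiFi API: NiFi evaluates the pattern with Java's regex engine, and Python's re module is used here only as a stand-in (for this particular pattern the lazy `??` quantifier behaves the same way in both). The sample content is the 29-byte message from the LogAttribute output; the variable names are made up for illustration.

```python
import re

# The workaround pattern from this message.
PATTERN = re.compile(r"^.*(.??)$")

# Sample content, copied from the LogAttribute output above (29 characters).
CONTENT = "This is a plaintext message. "

match = PATTERN.match(CONTENT)

# Number of explicit capture groups in the pattern (group 0 is not counted).
group_count = PATTERN.groups

# Group 0 is the entire match. Group 1 is the lazy `.??`, which tries
# zero characters first and succeeds because `.*` already reached the end.
entire_match = match.group(0)
lazy_group = match.group(1)

print("capture groups:", group_count)
print("group 0:", repr(entire_match))
print("group 1:", repr(lazy_group))
```

The pattern has one explicit capture group, group 0 holds the whole content, and group 1 comes back as an empty string.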

Now your expression passes validation (because it has 1 explicit capture 
group), but won’t waste space on duplicate attributes. You just have to 
reference “attribute.0” instead of “attribute” in your follow-on processors (or 
use UpdateAttribute to copy and delete the original attribute, but this also 
wastes space).
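
To illustrate why the empty group avoids the duplicates, here is a rough model of the attribute naming. It is inferred only from the behavior reported in this thread (X, X.0 and X.1 for a regex like (.*), and only entire_match.0 in the output above). The function name model_extract_attributes and both rules marked as assumptions in the comments are hypothetical; this is not the ExtractText source, and Python's re module stands in for Java's regex engine.

```python
import re


def model_extract_attributes(name, pattern, content, include_group_zero=True):
    """Hypothetical model of the attribute naming, NOT the ExtractText source."""
    attributes = {}
    match = re.search(pattern, content)
    if match is None:
        return attributes

    first_group = 0 if include_group_zero else 1
    for index in range(first_group, match.re.groups + 1):
        value = match.group(index)
        # Assumption: an empty or unmatched group produces no attribute.
        if not value:
            continue
        attributes[f"{name}.{index}"] = value
        # Assumption: group 1 is also copied to the bare attribute name.
        if index == 1:
            attributes[name] = value
    return attributes


content = "This is a plaintext message. "

# A regex with a greedy capture group, as in the original question.
print(sorted(model_extract_attributes("X", r"(.*)", content)))

# The same regex with "Include Capture Group 0" set to false.
print(sorted(model_extract_attributes("X", r"(.*)", content, include_group_zero=False)))

# The workaround regex: only the ".0" attribute survives.
print(sorted(model_extract_attributes("entire_match", r"^.*(.??)$", content)))
```

Under these assumptions the first call yields three attributes, the second two, and the workaround only one, which matches the 3*n to 2*n reduction mentioned earlier in the thread and the single entire_match.0 attribute in the log output.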

Hope this helps until we can provide the improved UX.

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Mar 13, 2017, at 5:00 PM, Andy LoPresto <[email protected]> wrote:
> 
> Here is the specific source code for reference: 
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractText.java#L262-L262
> 
> Andy LoPresto
> [email protected] <mailto:[email protected]>
> [email protected] <mailto:[email protected]>
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> 
>> On Mar 13, 2017, at 4:56 PM, Andy LoPresto <[email protected] 
>> <mailto:[email protected]>> wrote:
>> 
>> Yes, I evaluated locally and apparently the ExtractText regex validation 
>> requires “1 to 40 capturing groups”. You can set “Include Capture Group 0” 
>> to false to reduce the duplication of the captured attribute (you’ll go from 
>> 3*n to 2*n). I am unaware of a technical reason the provided regex is 
>> required to have at least one capture group. I would recommend you open a 
>> Jira to reduce the minimum capture group count to 0 during validation if 
>> “Include Capture Group 0” is set to true.
>> 
>> <Screen Shot 2017-03-13 at 4.54.49 PM.png><Screen Shot 2017-03-13 at 4.55.25 
>> PM.png>
>> 
>> 
>> Andy LoPresto
>> [email protected] <mailto:[email protected]>
>> [email protected] <mailto:[email protected]>
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>> 
>>> On Mar 13, 2017, at 3:25 PM, srini <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> 
>>> Hi Andy,
>>> I dropped the idea of saving the flowfile to an attribute, so I am good on
>>> that part.
>>> 
>>> And you said "An immediate fix is to remove the parentheses from your regex;
>>> .*"
>>> But the regex is not accepted if I remove the parentheses.
>>> 
>>> thanks
>>> Srini
>>> 
>>> 
>>> 
>>> --
>>> View this message in context: 
>>> http://apache-nifi-developer-list.39713.n7.nabble.com/I-have-attribute-called-X-But-X-0-and-X-1-also-got-created-Why-tp15062p15114.html
>>> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com 
>>> <http://nabble.com/>.
>> 
>
