[ https://issues.apache.org/jira/browse/NIFI-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15093257#comment-15093257 ]

ASF subversion and git services commented on NIFI-1316:
-------------------------------------------------------

Commit 6b54753dbb9bf6b2694a0cee7ac485fdcc8c3d01 in nifi's branch 
refs/heads/master from [~JPercivall]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=6b54753 ]

NIFI-1316 adding option to DetectDuplicate to not cache the entry identifier

Signed-off-by: Aldrin Piri <[email protected]>


> Allow DetectDuplicate to only detect and not cache
> --------------------------------------------------
>
>                 Key: NIFI-1316
>                 URL: https://issues.apache.org/jira/browse/NIFI-1316
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 0.4.1
>            Reporter: Joseph Percivall
>            Priority: Minor
>             Fix For: 0.5.0
>
>         Attachments: 
> 0001-NIFI-1316-adding-option-to-DetectDuplicate-to-not-ca.patch, 
> 0002-NIFI-1316.patch, WebCrawler.xml
>
>
> While working on a web crawler template and its documentation, I find myself 
> wanting a pair of DetectDuplicate processors. One does the typical check: 
> cache the entry, and remove it if it is a duplicate. The other should only 
> check and remove duplicates, without adding anything to the cache in that 
> processor.
> The use case is that I want to add URLs to the cache only after the 
> InvokeHttp processor has reached them successfully. I would also like to 
> check for URLs that were already reached before sending them to InvokeHttp, 
> but I do not want to add them to the cache at that point because the request 
> might not succeed.
> I attached the template to the ticket. It shows how the DetectDuplicate 
> feeding into InvokeHttp should only check for duplicates and not cache them 
> (because the URL hasn't been successfully hit yet).
> Ideally, this improvement would only require a new configuration option on 
> the processor that controls whether or not to cache.
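
The two behaviours described in the issue can be sketched roughly as follows. This is an illustration only, not the code from the commit: the class name DuplicateCheckSketch, the cacheIdentifier flag, and the in-memory Set standing in for the shared cache are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch of a duplicate check with an optional caching step.
 * The Set stands in for whatever cache the real processors share.
 */
class DuplicateCheckSketch {
    private final Set<String> cache;
    private final boolean cacheIdentifier;

    DuplicateCheckSketch(Set<String> sharedCache, boolean cacheIdentifier) {
        this.cache = sharedCache;
        this.cacheIdentifier = cacheIdentifier;
    }

    /** Returns true if the identifier has already been cached. */
    boolean isDuplicate(String identifier) {
        if (cacheIdentifier) {
            // Check and cache: add() returns false when the entry was already present.
            return !cache.add(identifier);
        }
        // Check only: look the entry up without modifying the cache.
        return cache.contains(identifier);
    }
}
```

In the web crawler scenario, an instance created with cacheIdentifier set to false would sit in front of InvokeHttp, and a second instance sharing the same cache with cacheIdentifier set to true would sit after it, so a URL is recorded only once it has been reached successfully.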



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
