Re: DetectDuplicate Processor Doesn't Remove From Cache

2020-07-28 Thread bsavard
Hi,

I'm questioning my understanding of how the code is working.  I took a
screenshot while debugging and I'll attach it here.  If you look at the values
of 'now' and 'originalCacheValue.entryTimeMS', how could this ever work?
Contrast the value of cacheValue.entryTimeMS with
originalCacheValue.entryTimeMS.

Stepping through the code, I think the value of entryTimeMS is derived from
the flowfile's value, not from a timestamp.  I'm assuming that I haven't had
enough coffee yet and I'm just being stupid. :)
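
One way this observation could arise is sketched below. It is an assumption,
not a confirmed reading of the NiFi source: suppose the cached value is laid
out as an 8-byte big-endian entry time followed by a UTF-8 description. Bytes
written by something else (for example raw flowfile content) would then decode
to a meaningless "time". The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration. The byte layout is an assumption, not taken from the NiFi source.
class CacheValueLayoutSketch {

    // Encode an entry in the assumed layout: 8-byte big-endian time, then the description.
    static byte[] encode(long entryTimeMs, String description) {
        byte[] desc = description.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + desc.length)
                .putLong(entryTimeMs)
                .put(desc)
                .array();
    }

    // Interpret the first 8 bytes of any cached value as the entry time.
    // Throws BufferUnderflowException if fewer than 8 bytes are supplied.
    static long decodeEntryTime(byte[] cached) {
        return ByteBuffer.wrap(cached).getLong();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();

        // A value written in the assumed layout round-trips cleanly.
        byte[] written = encode(now, "Ingestion");
        System.out.println("round trip ok: " + (decodeEntryTime(written) == now));

        // Arbitrary content decoded the same way yields a number unrelated to any clock.
        byte[] rawContent = "hello world".getBytes(StandardCharsets.UTF_8);
        long bogus = decodeEntryTime(rawContent);
        System.out.println("decoded from raw content: " + bogus);
        System.out.println("passes a 60 s age-off check: " + (now >= bogus + 60_000L));
    }
}
```

If the layout assumption holds, an entry would only carry a usable timestamp
when the same component that reads the value also wrote it.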

--
Sent from: http://apache-nifi-users-list.2361937.n4.nabble.com/


Re: DetectDuplicate Processor Doesn't Remove From Cache

2020-07-28 Thread Mark Payne
Hello,

I’m not sure that I understand the concern. CacheValue.getEntryTimeMS() returns
a Long (64-bit signed integer) value, which represents the timestamp of when the
entry was added. Are you experiencing problems with events not being evicted and
resulting in heap exhaustion, or is your concern just stemming from a lack of
clarity around how the code is working?
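
For reference, the age-off condition quoted from line 189 can be exercised in
isolation. This is a minimal sketch, assuming the entry time is a wall-clock
value in milliseconds; the class and method names are hypothetical and not part
of NiFi.

```java
// Hypothetical helper mirroring the condition quoted in this thread; names are illustrative.
class AgeOffSketch {

    // True when a previously seen entry is old enough to be removed from the cache.
    // A null duration means no age-off is configured, so nothing is ever removed.
    static boolean shouldRemove(boolean duplicate, Long durationMs, long now, long entryTimeMs) {
        return duplicate && durationMs != null && now >= entryTimeMs + durationMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Long sixtySeconds = 60_000L;

        // Entry added 61 s ago: older than the 60 s duration.
        System.out.println(shouldRemove(true, sixtySeconds, now, now - 61_000L));
        // Entry added 10 s ago: still inside the duration.
        System.out.println(shouldRemove(true, sixtySeconds, now, now - 10_000L));
        // No duration configured.
        System.out.println(shouldRemove(true, null, now, now - 61_000L));
    }
}
```

The comparison only behaves as intended when the entry time really is a
timestamp on the same clock as "now".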

Thanks
-Mark


> On Jul 28, 2020, at 9:28 AM, bsavard  wrote:
> 
> Hi,
> 
> I've developed a test where I'm caching a flowfile using
> PutDistributedMapCache, then checking for entries using a DetectDuplicate
> processor.  I'm finding that DetectDuplicate is, in fact, finding the
> duplicate, but it's not removing it from the cache.
> The problem is the test in DetectDuplicate on line 189:
> if (duplicate && durationMS != null && (now >=
> originalCacheValue.getEntryTimeMS() + durationMS)) {
> 
> If I dig into the code, originalCacheValue.getEntryTimeMS() doesn't look to
> me like it really represents a time value, so the comparison with "now"
> always fails.  I say that because the cache value seems to be a
> serialization of the flowfile.
> 
> Maybe I've misunderstood what the doc means when it says "determines if the
> cached value has already been seen".  Has anyone else had difficulty with
> this?
> 
> 
> 



DetectDuplicate Processor Doesn't Remove From Cache

2020-07-28 Thread bsavard
Hi,

I've developed a test where I'm caching a flowfile using
PutDistributedMapCache, then checking for entries using a DetectDuplicate
processor.  I'm finding that DetectDuplicate is, in fact, finding the
duplicate, but it's not removing it from the cache.
The problem is the test in DetectDuplicate on line 189:

    if (duplicate && durationMS != null && (now >= originalCacheValue.getEntryTimeMS() + durationMS)) {

If I dig into the code, originalCacheValue.getEntryTimeMS() doesn't look to
me like it really represents a time value, so the comparison with "now"
always fails.  I say that because the cache value seems to be a
serialization of the flowfile.

Maybe I've misunderstood what the doc means when it says "determines if the
cached value has already been seen".  Has anyone else had difficulty with
this?





Re: Need help with DetectDuplicate

2020-01-02 Thread Matt Burgess
William,

DistributedMapCacheClientService works in a standalone NiFi. Do you
have a DistributedMapCacheServer configured for localhost:4557 and
enabled?
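
A quick way to check whether anything is listening on that host and port is a
plain TCP probe. This is a minimal sketch, not a NiFi utility: the class name
is hypothetical, and a successful connect only shows that some process accepts
connections there, not that it is a DistributedMapCacheServer.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic; it uses only the JDK and makes no NiFi calls.
class CachePortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unresolved host all land here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port taken from the client service config in this thread.
        System.out.println("localhost:4557 listening: " + isListening("localhost", 4557, 2000));
    }
}
```

If the probe reports false, the "Connection refused" in the log is consistent
with no server controller service being enabled on that port.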

On Thu, Jan 2, 2020 at 4:39 PM William Gosse
 wrote:
>
> Will the DetectDuplicate with DistributedMapCacheClientService only work in a 
> cluster?  I’m trying to use it in a standalone NiFi.

RE: Need help with DetectDuplicate

2020-01-02 Thread William Gosse
Will the DetectDuplicate with DistributedMapCacheClientService only work in a 
cluster?  I’m trying to use it in a standalone NiFi.



Re: Need help with DetectDuplicate

2019-12-25 Thread Shawn Weeks
Did you create the actual map cache server in controller services? I couldn’t 
tell. All I saw was the client service.

Thanks
Shawn



Re: Need help with DetectDuplicate

2019-12-24 Thread Emanuel Oliveira
Hi,

Depending on how your cluster is set up, you may need to add and configure an
SSL controller service?

Emanuel



Need help with DetectDuplicate

2019-12-24 Thread William Gosse
I’m trying to use the DetectDuplicate processor but not having much luck. Here’s
the config:

Cache Entry Identifier:       ${resourceId}
FlowFile Description:         Ingestion
Age Off Duration:             60 sec
Cache The Entry Identifier:   true
Distributed Cache Service:    DistributedMapCacheClientService

I created and enabled a DistributedMapCacheClientService. Here’s its config:

Server Hostname:          localhost
Server Port:              4557
SSL Context Service:      No value set
Communications Timeout:   30 secs

When I run it I get the following error:

2019-12-24 13:14:05,355 ERROR [Timer-Driven Process Thread-9]
o.a.n.p.standard.DetectDuplicate
DetectDuplicate[id=38cb8a64-016f-1000-b55b-f6c4e0f69f61] Unable to communicate
with cache when processing
StandardFlowFileRecord[uuid=7f049d8e-1d04-4fee-9f04-320d6980bc55,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1577202782598-43, container=default,
section=43], offset=74111,
length=4528],offset=0,name=84068ffb-69b1-4471-abbd-29243d3be39e,size=4528] due
to java.net.ConnectException: Connection refused: no further information:
java.net.ConnectException: Connection refused: no further information
java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
    at org.apache.nifi.distributed.cache.client.StandardCommsSession.<init>(StandardCommsSession.java:52)
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.createCommsSession(DistributedMapCacheClientService.java:410)
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.leaseCommsSession(DistributedMapCacheClientService.java:425)
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.withCommsSession(DistributedMapCacheClientService.java:491)
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.getAndPutIfAbsent(DistributedMapCacheClientService.java:174)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:87)
    at com.sun.proxy.$Proxy142.getAndPutIfAbsent(Unknown Source)
    at org.apache.nifi.processors.standard.DetectDuplicate.onTrigger(DetectDuplicate.java:183)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1176)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Not sure what’s missing?


Re: Record-oriented DetectDuplicate?

2019-02-16 Thread Mike Thomsen
Andrew, Mark, etc.

A new contributor alerted me on Jira that he did his own take on this
processor. I encouraged him to join the dev list so we can discuss the use
case in more depth and sort out what is the best way forward.

See https://issues.apache.org/jira/browse/NIFI-6047

I'll give him a little while to join and announce he's ready to go over it
before I move forward with a discussion on this.

On Sat, Feb 9, 2019 at 12:34 PM Mike Thomsen  wrote:

> PR if anyone is interested:
>
> https://github.com/apache/nifi/pull/3298

Re: Record-oriented DetectDuplicate?

2019-02-09 Thread Mike Thomsen
PR if anyone is interested:

https://github.com/apache/nifi/pull/3298

On Fri, Feb 8, 2019 at 5:34 PM Mike Thomsen  wrote:

> With Redis and HBase you can set a TTL on the data itself in the lookup
> table. Were you thinking something more than that?

Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Mike Thomsen
With Redis and HBase you can set a TTL on the data itself in the lookup
table. Were you thinking something more than that?

On Fri, Feb 8, 2019 at 4:42 PM Andrew Grande  wrote:

> Can I suggest a time-based option for specifying the window? I think we
> only mentioned the number of records.
>
> Andrew

Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Andrew Grande
Can I suggest a time-based option for specifying the window? I think we
only mentioned the number of records.
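
A time-based window could look roughly like the sketch below. It is an
illustration only, assuming an in-memory map keyed by a record identifier; the
class name is hypothetical and this is not the design of any NiFi processor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a time-based duplicate window; not NiFi code.
class TimeWindowDedupSketch {

    private final long windowMs;
    // Record key -> time (ms) at which the key was last accepted as new.
    private final Map<String, Long> firstSeen = new HashMap<>();

    TimeWindowDedupSketch(long windowMs) {
        this.windowMs = windowMs;
    }

    // Returns true if the key was already accepted less than windowMs before nowMs.
    boolean isDuplicate(String key, long nowMs) {
        Long seenAt = firstSeen.get(key);
        if (seenAt != null && nowMs - seenAt < windowMs) {
            return true;
        }
        // New key, or the earlier entry has aged out: start a fresh window.
        firstSeen.put(key, nowMs);
        return false;
    }

    public static void main(String[] args) {
        TimeWindowDedupSketch dedup = new TimeWindowDedupSketch(60_000L);
        System.out.println(dedup.isDuplicate("record-1", 0L));
        System.out.println(dedup.isDuplicate("record-1", 30_000L));
        System.out.println(dedup.isDuplicate("record-1", 60_000L));
    }
}
```

A real implementation would also need to evict aged-out keys so the map does
not grow without bound.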

Andrew

On Fri, Feb 8, 2019, 8:22 AM Mike Thomsen  wrote:

> Thanks. That answers it succinctly for me. I'll build out a
> DetectDuplicateRecord processor to handle this.

Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Mike Thomsen
Thanks. That answers it succinctly for me. I'll build out a
DetectDuplicateRecord processor to handle this.

On Fri, Feb 8, 2019 at 11:17 AM Mark Payne  wrote:

> Matt,
>
> That would work if you want to select distinct records in a given FlowFile
> but not across FlowFiles.
> PartitionRecord -> UpdateAttribute (optionally to combine multiple
> attributes into one) -> DetectDuplicate
> would work, but given that you expect the records to be unique generally,
> this would have the effect of
> splitting each FlowFile into Record-per-FlowFile, which is certainly not
> ideal.
>
> Thanks
> -Mark
>
>
> > On Feb 8, 2019, at 11:14 AM, Matt Burgess  wrote:
> >
> > Mike,
> >
> > I don't think so, but you could try a SELECT DISTINCT in QueryRecord,
> > might be a bit of a pain if you want to select all columns and there
> > are lots of them.
> >
> > Alternatively you could try PartitionRecord -> QueryRecord (select *
> > limit 1). Neither PartitionRecord nor QueryRecord keeps state so you'd
> > likely need to use distributed cache or UpdateAttribute.
> >
> > Regards,
> > Matt
> >
> > On Fri, Feb 8, 2019 at 11:08 AM Mike Thomsen wrote:
> >>
> >> Do we have anything like DetectDuplicate for the Record API already?
> >> Didn't see anything, but wanted to ask before reinventing the wheel.
> >>
> >> Thanks,
> >>
> >> Mike
>
>


Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Mark Payne
Matt,

That would work if you want to select distinct records in a given FlowFile but
not across FlowFiles.
PartitionRecord -> UpdateAttribute (optionally to combine multiple attributes
into one) -> DetectDuplicate
would work, but given that you generally expect the records to be unique, this
would have the effect of splitting each FlowFile into Record-per-FlowFile,
which is certainly not ideal.

Thanks
-Mark


> On Feb 8, 2019, at 11:14 AM, Matt Burgess  wrote:
> 
> Mike,
> 
> I don't think so, but you could try a SELECT DISTINCT in QueryRecord,
> might be a bit of a pain if you want to select all columns and there
> are lots of them.
> 
> Alternatively you could try PartitionRecord -> QueryRecord (select *
> limit 1). Neither PartitionRecord nor QueryRecord keeps state so you'd
> likely need to use distributed cache or UpdateAttribute.
> 
> Regards,
> Matt
> 
> On Fri, Feb 8, 2019 at 11:08 AM Mike Thomsen  wrote:
>> 
>> Do we have anything like DetectDuplicate for the Record API already? Didn't 
>> see anything, but wanted to ask before reinventing the wheel.
>> 
>> Thanks,
>> 
>> Mike



Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Mark Payne
We do not. I've thought about it, but I have not had a chance to put any work
towards it. My vision of how it would work would be to allow the user to
specify any number of RecordPath values as user-defined properties. Those
values would then be extracted, and another Record would be considered a
'duplicate' if all RecordPaths evaluated to the same values. However, we then
have to be rather careful, because this can certainly be sensitive data that is
stored in a DistributedMapCache or something of the sort, so we'll have to
ensure that we support secure comms well and document this.
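
The idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the eventual processor: `lookup` handles a simplified `/a/b` path rather than real RecordPath syntax, records are plain dictionaries, and a Python set stands in for the DistributedMapCache. Hashing the extracted values before storing them is also an assumption of this sketch; it keeps raw values out of the store but is not a substitute for securing the connection to the cache.

```python
import hashlib
import json
from typing import Any, Dict, Iterable, List, Set, Tuple


def lookup(record: Dict[str, Any], path: str) -> Any:
    """Resolve a simplified '/a/b' path against nested dicts (not real RecordPath)."""
    value: Any = record
    for part in path.strip("/").split("/"):
        value = value[part]
    return value


def dedup_key(record: Dict[str, Any], paths: List[str]) -> str:
    """Build one cache key from the values that all configured paths evaluate to."""
    values = [lookup(record, p) for p in paths]
    encoded = json.dumps(values, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def split_duplicates(records: Iterable[Dict[str, Any]], paths: List[str],
                     seen: Set[str]) -> Tuple[List[dict], List[dict]]:
    """Partition records into (unique, duplicates); 'seen' stands in for the cache."""
    unique, duplicates = [], []
    for record in records:
        key = dedup_key(record, paths)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates
```

Because `seen` is passed in rather than created per call, the same set can be reused for every batch, which is what allows duplicates to be caught across FlowFiles and not only within one.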


> On Feb 8, 2019, at 10:57 AM, Mike Thomsen  wrote:
> 
> Do we have anything like DetectDuplicate for the Record API already? Didn't 
> see anything, but wanted to ask before reinventing the wheel.
> 
> Thanks,
> 
> Mike



Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Matt Burgess
Mike,

I don't think so, but you could try a SELECT DISTINCT in QueryRecord,
although it might be a bit of a pain if you want to select all columns and
there are lots of them.
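
As a rough illustration of what SELECT DISTINCT does to the records of a single FlowFile, here is a small Python sketch that uses an in-memory SQLite table as a stand-in for QueryRecord. The column names `id` and `name` and the helper `distinct_rows` are made up for the example; only the table name FLOWFILE mirrors what QueryRecord queries against.

```python
import sqlite3
from typing import List, Tuple


def distinct_rows(rows: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Load one FlowFile's worth of rows and return them de-duplicated."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE FLOWFILE (id INTEGER, name TEXT)")
        conn.executemany("INSERT INTO FLOWFILE VALUES (?, ?)", rows)
        # Every column has to be listed (or '*' used) for whole-row distinctness.
        cursor = conn.execute(
            "SELECT DISTINCT id, name FROM FLOWFILE ORDER BY id, name"
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

Each call models one FlowFile, so a row that appears in two separate calls is returned by both. The query has no memory of earlier FlowFiles.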

Alternatively you could try PartitionRecord -> QueryRecord (select *
limit 1). Neither PartitionRecord nor QueryRecord keeps state so you'd
likely need to use distributed cache or UpdateAttribute.

Regards,
Matt

On Fri, Feb 8, 2019 at 11:08 AM Mike Thomsen  wrote:
>
> Do we have anything like DetectDuplicate for the Record API already? Didn't 
> see anything, but wanted to ask before reinventing the wheel.
>
> Thanks,
>
> Mike


Record-oriented DetectDuplicate?

2019-02-08 Thread Mike Thomsen
Do we have anything like DetectDuplicate for the Record API already? Didn't
see anything, but wanted to ask before reinventing the wheel.

Thanks,

Mike


Re: DetectDuplicate

2016-12-19 Thread Andrew Grande
Juan, no change from how you remember this processor yet. I personally
would love to have a more pluggable backend for it, too.

Andrew

On Mon, Dec 19, 2016, 2:35 PM Juan Sequeiros  wrote:

> Hello,
>
> I am wondering whether DetectDuplicate still has a single dependency on the
> Distributed Cache Service.
> If so, can I assume that DetectDuplicate will fail if the Distributed Cache
> server is down?
>
>
> I want to replace our DetectDuplicate solution (an external DB) with NiFi's,
> but the single-point reliance on the cache server is a blocker. I am not sure
> whether I am missing something; does it possibly use ZooKeeper now?
>
>
>


DetectDuplicate

2016-12-19 Thread Juan Sequeiros
Hello,

I am wondering whether DetectDuplicate still has a single dependency on the
Distributed Cache Service.
If so, can I assume that DetectDuplicate will fail if the Distributed Cache
server is down?


I want to replace our DetectDuplicate solution (an external DB) with NiFi's,
but the single-point reliance on the cache server is a blocker. I am not sure
whether I am missing something; does it possibly use ZooKeeper now?


RE: NiFI 1.0.0 errors with DetectDuplicate processor

2016-09-08 Thread Porta Léonard
Thanks Bryan,

Everything worked well using the map cache server.
Four eyes are better than two!

Regards,
Léo

From: Bryan Bende [mailto:bbe...@gmail.com]
Sent: jeudi 8 septembre 2016 15:28
To: users@nifi.apache.org
Subject: Re: NiFI 1.0.0 errors with DetectDuplicate processor

Hello,

I'm not sure if this is the problem, but I noticed you are using the map cache 
client with the set cache server.

I think the map cache client needs to be used with the map cache server. Can 
you let us know if that works.

-Bryan



On Thu, Sep 8, 2016 at 9:00 AM, Porta Léonard <leonard.po...@kudelskisecurity.com> wrote:
Hello,

Under NiFi 1.0.0 I am trying to configure a DetectDuplicate processor.

I have configured and enabled
DistributedSetCacheServer, with the default configuration (Port=4557)
and
DistributedMapCacheClientService (Server Hostname=localhost, Server Port=4557,
SSL Context Service=, Comm. Timeout=30 secs)


DetectDuplicate configuration is
Cache Entry Identifier=${my-attribute-key}
FlowFile Description=my text here
Age Off Duration=24 hrs
Distributed Cache Service= DistributedMapCacheClientService
Cache The Entry Identifier=true


Could someone help?

Thanks
Leo

Here is the log from nifi-user.log:
2016-09-08 12:12:26,157 ERROR [Distributed Cache Server Communications Thread: 09a381b1-0157-1000-902d-067af19e4c19] o.a.n.d.cache.server.AbstractCacheServer org.apache.nifi.distributed.cache.server.AbstractCacheServer$1$1@759fdc8a unable to communicate with remote peer localhost due to java.io.IOException: IllegalRequest
2016-09-08 12:12:26,166 ERROR [Timer-Driven Process Thread-1] o.a.n.p.standard.DetectDuplicate DetectDuplicate[id=09981881-0157-1000-ac0f-43f500adcf1a] Unable to communicate with cache when processing StandardFlowFileRecord[uuid=35de4947-92b3-46fb-ada7-0a195600e4f1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1473336740239-1293, container=default, section=269], offset=22578, length=1044],offset=0,name=689694820062267,size=1044] due to java.io.EOFException: java.io.EOFException
2016-09-08 12:12:26,169 ERROR [Timer-Driven Process Thread-1] o.a.n.p.standard.DetectDuplicate
java.io.EOFException: null
    at java.io.DataInputStream.readInt(DataInputStream.java:392) ~[na:1.8.0_101]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.readLengthDelimitedResponse(DistributedMapCacheClientService.java:220) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.access$100(DistributedMapCacheClientService.java:51) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService$4.execute(DistributedMapCacheClientService.java:176) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.withCommsSession(DistributedMapCacheClientService.java:305) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.getAndPutIfAbsent(DistributedMapCacheClientService.java:164) ~[na:na]
    at sun.reflect.GeneratedMethodAccessor882.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_101]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_101]
    at org.apache.nifi.controller.service.StandardControllerServiceProvider$1.invoke(StandardControllerServiceProvider.java:177) ~[na:na]
    at com.sun.proxy.$Proxy163.getAndPutIfAbsent(Unknown Source) ~[na:na]
    at org.apache.nifi.processors.standard.DetectDuplicate.onTrigger(DetectDuplicate.java:182) ~[nifi-standard-processors-1.0.0.jar:1.0.0]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) [nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.0.0.jar:1.0.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.

Re: NiFI 1.0.0 errors with DetectDuplicate processor

2016-09-08 Thread Bryan Bende
Hello,

I'm not sure if this is the problem, but I noticed you are using the map
cache client with the set cache server.

I think the map cache client needs to be used with the map cache server.
Can you let us know if that works.

-Bryan



On Thu, Sep 8, 2016 at 9:00 AM, Porta Léonard <
leonard.po...@kudelskisecurity.com> wrote:

> Hello,
>
>
>
> Under NiFi 1.0.0 I am trying to configure a DetectDuplicate processor.
>
>
>
> I have configured and enabled
>
> DistributedSetCacheServer, with the default configuration (Port=4557)
>
> and
>
> DistributedMapCacheClientService (Server Hostname=localhost, Server
> Port=4557, SSL Context Service=, Comm. Timeout=30 secs)
>
>
>
>
>
> DetectDuplicate configuration is
>
> Cache Entry Identifier=${my-attribute-key}
>
> FlowFile Description=my text here
>
> Age Off Duration=24 hrs
>
> Distributed Cache Service= DistributedMapCacheClientService
>
> Cache The Entry Identifier=true
>
>
>
>
>
> Could someone help?
>
>
>
> Thanks
>
> Leo
>
>
>
> Here is the log from nifi-user.log:
>
> 2016-09-08 12:12:26,157 ERROR [Distributed Cache Server Communications Thread: 09a381b1-0157-1000-902d-067af19e4c19] o.a.n.d.cache.server.AbstractCacheServer org.apache.nifi.distributed.cache.server.AbstractCacheServer$1$1@759fdc8a unable to communicate with remote peer localhost due to java.io.IOException: IllegalRequest
>
> 2016-09-08 12:12:26,166 ERROR [Timer-Driven Process Thread-1] o.a.n.p.standard.DetectDuplicate DetectDuplicate[id=09981881-0157-1000-ac0f-43f500adcf1a] Unable to communicate with cache when processing StandardFlowFileRecord[uuid=35de4947-92b3-46fb-ada7-0a195600e4f1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1473336740239-1293, container=default, section=269], offset=22578, length=1044],offset=0,name=689694820062267,size=1044] due to java.io.EOFException: java.io.EOFException
>
> 2016-09-08 12:12:26,169 ERROR [Timer-Driven Process Thread-1] o.a.n.p.standard.DetectDuplicate
>
> java.io.EOFException: null
>     at java.io.DataInputStream.readInt(DataInputStream.java:392) ~[na:1.8.0_101]
>     at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.readLengthDelimitedResponse(DistributedMapCacheClientService.java:220) ~[na:na]
>     at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.access$100(DistributedMapCacheClientService.java:51) ~[na:na]
>     at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService$4.execute(DistributedMapCacheClientService.java:176) ~[na:na]
>     at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.withCommsSession(DistributedMapCacheClientService.java:305) ~[na:na]
>     at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.getAndPutIfAbsent(DistributedMapCacheClientService.java:164) ~[na:na]
>     at sun.reflect.GeneratedMethodAccessor882.invoke(Unknown Source) ~[na:na]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_101]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_101]
>     at org.apache.nifi.controller.service.StandardControllerServiceProvider$1.invoke(StandardControllerServiceProvider.java:177) ~[na:na]
>     at com.sun.proxy.$Proxy163.getAndPutIfAbsent(Unknown Source) ~[na:na]
>     at org.apache.nifi.processors.standard.DetectDuplicate.onTrigger(DetectDuplicate.java:182) ~[nifi-standard-processors-1.0.0.jar:1.0.0]
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.0.0.jar:1.0.0]
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) [nifi-framework-core-1.0.0.jar:1.0.0]
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.0.0.jar:1.0.0]
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.0.0.jar:1.0.0]
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.0.0.jar:1.0.0]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>

NiFI 1.0.0 errors with DetectDuplicate processor

2016-09-08 Thread Porta Léonard
Hello,

Under NiFi 1.0.0 I am trying to configure a DetectDuplicate processor.

I have configured and enabled
DistributedSetCacheServer, with the default configuration (Port=4557)
and
DistributedMapCacheClientService (Server Hostname=localhost, Server Port=4557,
SSL Context Service=, Comm. Timeout=30 secs)


DetectDuplicate configuration is
Cache Entry Identifier=${my-attribute-key}
FlowFile Description=my text here
Age Off Duration=24 hrs
Distributed Cache Service= DistributedMapCacheClientService
Cache The Entry Identifier=true


Could someone help?

Thanks
Leo

Here is the log from nifi-user.log:
2016-09-08 12:12:26,157 ERROR [Distributed Cache Server Communications Thread: 09a381b1-0157-1000-902d-067af19e4c19] o.a.n.d.cache.server.AbstractCacheServer org.apache.nifi.distributed.cache.server.AbstractCacheServer$1$1@759fdc8a unable to communicate with remote peer localhost due to java.io.IOException: IllegalRequest
2016-09-08 12:12:26,166 ERROR [Timer-Driven Process Thread-1] o.a.n.p.standard.DetectDuplicate DetectDuplicate[id=09981881-0157-1000-ac0f-43f500adcf1a] Unable to communicate with cache when processing StandardFlowFileRecord[uuid=35de4947-92b3-46fb-ada7-0a195600e4f1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1473336740239-1293, container=default, section=269], offset=22578, length=1044],offset=0,name=689694820062267,size=1044] due to java.io.EOFException: java.io.EOFException
2016-09-08 12:12:26,169 ERROR [Timer-Driven Process Thread-1] o.a.n.p.standard.DetectDuplicate
java.io.EOFException: null
    at java.io.DataInputStream.readInt(DataInputStream.java:392) ~[na:1.8.0_101]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.readLengthDelimitedResponse(DistributedMapCacheClientService.java:220) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.access$100(DistributedMapCacheClientService.java:51) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService$4.execute(DistributedMapCacheClientService.java:176) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.withCommsSession(DistributedMapCacheClientService.java:305) ~[na:na]
    at org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService.getAndPutIfAbsent(DistributedMapCacheClientService.java:164) ~[na:na]
    at sun.reflect.GeneratedMethodAccessor882.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_101]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_101]
    at org.apache.nifi.controller.service.StandardControllerServiceProvider$1.invoke(StandardControllerServiceProvider.java:177) ~[na:na]
    at com.sun.proxy.$Proxy163.getAndPutIfAbsent(Unknown Source) ~[na:na]
    at org.apache.nifi.processors.standard.DetectDuplicate.onTrigger(DetectDuplicate.java:182) ~[nifi-standard-processors-1.0.0.jar:1.0.0]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) [nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.0.0.jar:1.0.0]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.0.0.jar:1.0.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]




DetectDuplicate : java.net.ConnectException

2016-03-19 Thread Arathi Maddula
Hi,
I am new to Apache NiFi and I am trying it out on my Windows machine. When I use
the DetectDuplicate processor, I get a java.net.ConnectException. How can I start
the DistributedMapCacheClientService?

Thanks,
Arathi



Re: DetectDuplicate : java.net.ConnectException

2016-03-19 Thread Matt Burgess
Arathi,

You'll need to add another Controller Service, one of
type DistributedMapCacheServer, set up on port 4557 (to match your
DistributedMapCacheClientService), and enable/start it. Then you should be
able to connect successfully.
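
A java.net.ConnectException of this kind usually means nothing is listening on the port the client points at. A quick way to check that from outside NiFi is a plain TCP probe; the Python sketch below is only a diagnostic aid, and the helper name `is_port_open` and its default timeout are assumptions of the example.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For the setup discussed here, `is_port_open("localhost", 4557)` should return True once the DistributedMapCacheServer has been enabled. The probe only shows that something is listening on the port, not which kind of cache server it is.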

Regards,
Matt

On Thu, Mar 17, 2016 at 4:15 PM, Arathi Maddula 
wrote:

> Hi Aldrin,
>
> I already created an instance of the DistributedMapCacheClientService
> with the server hostname pointing to localhost and port 4557.  I use this
> service in the
> “Configure Processor” dialog of “DetectDuplicate” processor. But when I
> execute the workflow, I get java.net.ConnectException in Bulletin board.
>
> Could you tell me if I missed anything?
>
>
>
> Thanks,
> Arathi
>
>
>
> *From:* Aldrin Piri [mailto:aldrinp...@gmail.com]
> *Sent:* Thursday, March 17, 2016 4:06 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: DetectDuplicate : java.net.ConnectException
>
>
>
> Hi Arathi,
>
>
>
> Welcome to the NiFi community.  The DistributedMapCacheClientService works
> in tandem with the DistributedMapCacheServer.  Creating an instance of that
> Controller Service will let you configure the associated properties which
> can then also be entered for your client, DistributedMapCacheClientService,
> listed above.
>
>
>
> Please let us know if you run into any issues.
>
>
>
> [1]
> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.distributed.cache.server.map.DistributedMapCacheServer/index.html
>
>
>
> On Thu, Mar 17, 2016 at 3:50 PM, Arathi Maddula 
> wrote:
>
> Hi,
>
> I am new to Apache NiFi and I am trying it out on my Windows machine. When I
> use the DetectDuplicate processor, I get a java.net.ConnectException. How can
> I start the DistributedMapCacheClientService?
>
>
>
> Thanks,
>
> Arathi
>
>
>
>
>


RE: DetectDuplicate : java.net.ConnectException

2016-03-19 Thread Arathi Maddula
Hi Matt,

Thanks a lot! DistributedMapCacheClientService is working now.

Thanks,
Arathi

From: Matt Burgess [mailto:mattyb...@gmail.com]
Sent: Thursday, March 17, 2016 4:18 PM
To: users@nifi.apache.org
Subject: Re: DetectDuplicate : java.net.ConnectException

Arathi,

You'll need to add another Controller Service, one of type 
DistributedMapCacheServer, set up on port 4557 (to match your 
DistributedMapCacheClientService), and enable/start it. Then you should be able 
to connect successfully.

Regards,
Matt

On Thu, Mar 17, 2016 at 4:15 PM, Arathi Maddula <amadd...@boardreader.com> wrote:
Hi Aldrin,
I already created an instance of the DistributedMapCacheClientService with the
server hostname pointing to localhost and port 4557.  I use this service in the
“Configure Processor” dialog of the “DetectDuplicate” processor. But when I execute
the workflow, I get a java.net.ConnectException in the Bulletin Board.
Could you tell me if I missed anything?

Thanks,
Arathi

From: Aldrin Piri [mailto:aldrinp...@gmail.com]
Sent: Thursday, March 17, 2016 4:06 PM
To: users@nifi.apache.org
Subject: Re: DetectDuplicate : java.net.ConnectException

Hi Arathi,

Welcome to the NiFi community.  The DistributedMapCacheClientService works in 
tandem with the DistributedMapCacheServer.  Creating an instance of that 
Controller Service will let you configure the associated properties which can 
then also be entered for your client, DistributedMapCacheClientService, listed 
above.

Please let us know if you run into any issues.

[1] 
http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.distributed.cache.server.map.DistributedMapCacheServer/index.html

On Thu, Mar 17, 2016 at 3:50 PM, Arathi Maddula <amadd...@boardreader.com> wrote:
Hi,
I am new to Apache NiFi and I am trying it out on my Windows machine. When I use
the DetectDuplicate processor, I get a java.net.ConnectException. How can I start
the DistributedMapCacheClientService?

Thanks,
Arathi





Re: DetectDuplicate : java.net.ConnectException

2016-03-19 Thread Bryan Bende
Hello,

DistributedMapCacheClientService requires a DistributedMapCacheServer to be
set up first.

You then point the client service at the host (most likely localhost in
your scenario) and port of the server.

Let us know if that doesn't help.

-Bryan

On Thu, Mar 17, 2016 at 3:50 PM, Arathi Maddula 
wrote:

> Hi,
>
> I am new to Apache NiFi and I am trying it out on my Windows machine. When I
> use the DetectDuplicate processor, I get a java.net.ConnectException. How can
> I start the DistributedMapCacheClientService?
>
>
>
> Thanks,
>
> Arathi
>
>
>


RE: DetectDuplicate : java.net.ConnectException

2016-03-19 Thread Arathi Maddula
Hi Aldrin,
I already created an instance of the DistributedMapCacheClientService with the
server hostname pointing to localhost and port 4557.  I use this service in the
“Configure Processor” dialog of the “DetectDuplicate” processor. But when I execute
the workflow, I get a java.net.ConnectException in the Bulletin Board.
Could you tell me if I missed anything?

Thanks,
Arathi

From: Aldrin Piri [mailto:aldrinp...@gmail.com]
Sent: Thursday, March 17, 2016 4:06 PM
To: users@nifi.apache.org
Subject: Re: DetectDuplicate : java.net.ConnectException

Hi Arathi,

Welcome to the NiFi community.  The DistributedMapCacheClientService works in 
tandem with the DistributedMapCacheServer.  Creating an instance of that 
Controller Service will let you configure the associated properties which can 
then also be entered for your client, DistributedMapCacheClientService, listed 
above.

Please let us know if you run into any issues.

[1] 
http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.distributed.cache.server.map.DistributedMapCacheServer/index.html

On Thu, Mar 17, 2016 at 3:50 PM, Arathi Maddula <amadd...@boardreader.com> wrote:
Hi,
I am new to Apache NiFi and I am trying it out on my Windows machine. When I use
the DetectDuplicate processor, I get a java.net.ConnectException. How can I start
the DistributedMapCacheClientService?

Thanks,
Arathi




RE: DetectDuplicate : java.net.ConnectException

2016-03-19 Thread Arathi Maddula
Hi Bryan,

How can I set up a DistributedMapCacheServer?

Thanks,
Arathi

From: Bryan Bende [mailto:bbe...@gmail.com]
Sent: Thursday, March 17, 2016 4:09 PM
To: users@nifi.apache.org
Subject: Re: DetectDuplicate : java.net.ConnectException

Hello,

DistributedMapCacheClientService requires a DistributedMapCacheServer to be
set up first.

You then point the client service at the host (most likely localhost in your
scenario) and port of the server.

Let us know if that doesn't help.

-Bryan

On Thu, Mar 17, 2016 at 3:50 PM, Arathi Maddula <amadd...@boardreader.com> wrote:
Hi,
I am new to Apache NiFi and I am trying it out on my Windows machine. When I use
the DetectDuplicate processor, I get a java.net.ConnectException. How can I start
the DistributedMapCacheClientService?

Thanks,
Arathi




Re: DetectDuplicate : java.net.ConnectException

2016-03-18 Thread Aldrin Piri
Hi Arathi,

Welcome to the NiFi community.  The DistributedMapCacheClientService works
in tandem with the DistributedMapCacheServer.  Creating an instance of that
Controller Service will let you configure the associated properties which
can then also be entered for your client, DistributedMapCacheClientService,
listed above.

Please let us know if you run into any issues.

[1]
http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.distributed.cache.server.map.DistributedMapCacheServer/index.html

On Thu, Mar 17, 2016 at 3:50 PM, Arathi Maddula 
wrote:

> Hi,
>
> I am new to Apache NiFi and I am trying it out on my Windows machine. When I
> use the DetectDuplicate processor, I get a java.net.ConnectException. How can
> I start the DistributedMapCacheClientService?
>
>
>
> Thanks,
>
> Arathi
>
>
>