Mark,
Definitely sounds like a possibility. In my usage, the GetSQS processor
would yield often (because the SQS queue is low volume), but it was only
run once per second (or 10 seconds, depending on the setting I used).
I'm not sure whether or how this explains why I saw high CPU usage for
several days and then, once I played with the settings, saw it drop to a
normal level, but hopefully it's another data point.
Cheers,
Adam
On 11/4/15 2:33 PM, Mark Payne wrote:
Adam,
I wonder if this ticket that I just created [1] is actually the same
issue that you're seeing here.
When GetSQS determines there is nothing to do, it will "yield", essentially
pausing itself for some amount of time (by default 1 second). But the
framework wasn't properly pausing the processor. Instead, it continually ran
a task that simply asks "are you yielded?" Since the answer was yes, it
finished that task and ran it again. This can cause the CPU usage to be
significantly higher when there's nothing to do than when there is actually
work to do.
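The difference between the two scheduling behaviours described above can be sketched in a few lines of Java. This is only an illustration, not NiFi framework code: the class name YieldDemo and both method names are hypothetical, and the real scheduler is more involved. The first method re-runs the "are you yielded?" check in a tight loop until the yield period ends; the second schedules a single run for when the yield period expires, so the thread sleeps in between.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; it only models the behaviour described in the thread.
class YieldDemo {

    // Busy approach: keep re-running a task that only asks "are you yielded?"
    // until the yield period expires. Returns how many times the check ran.
    static long busyChecks(long yieldMillis) {
        long yieldExpiration = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(yieldMillis);
        long checks = 0;
        while (true) {
            checks++;
            boolean yielded = System.nanoTime() < yieldExpiration;
            if (!yielded) {
                return checks;
            }
            // Still yielded: the task finishes and is immediately run again.
        }
    }

    // Delayed approach: schedule one run for the moment the yield period ends,
    // so no CPU is spent while yielded. Returns how many times the task ran.
    static long delayedChecks(long yieldMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicLong checks = new AtomicLong();
        try {
            scheduler.schedule(() -> { checks.incrementAndGet(); },
                    yieldMillis, TimeUnit.MILLISECONDS).get(); // block until the task has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdown();
        }
        return checks.get();
    }

    public static void main(String[] args) {
        System.out.println("busy checks during yield:    " + busyChecks(50));
        System.out.println("delayed checks during yield: " + delayedChecks(50));
    }
}
```

On a typical machine the busy variant performs a very large number of checks in a 50 ms window while the delayed variant performs exactly one, which is consistent with an idle processor using more CPU than a busy one.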
Thanks
-Mark
[1] https://issues.apache.org/jira/browse/NIFI-1111
On Nov 3, 2015, at 1:05 PM, Adam Lamar <adamond...@gmail.com> wrote:
Hey Joe,
I think there are two possible JIRAs.
1) Add long polling support using setWaitTimeSeconds() - should be
really easy. I can take a crack at a pull request. Here's a JIRA:
https://issues.apache.org/jira/browse/NIFI-1103
2) Investigate the high CPU usage. I saw this initially for several
days, but it went away after I adjusted the run schedule (from 1
second to 10 seconds back to 1 second). I have CPU charts showing the
high usage and corresponding drop, but I need to reproduce the issue.
I'll circle back in a few days when I get some time to work on it.
Cheers,
Adam
On 11/3/15 2:41 AM, Joe Witt wrote:
Adam,
Just wanted to follow up on this. Have you had any better results and
should we put a JIRA in behind what you're seeing?
Thanks
Joe
On Tue, Oct 20, 2015 at 7:58 PM, Adam Lamar <adamond...@gmail.com> wrote:
Adam,
Thanks for the reply!
Amazon supports (and recommends) long polling on SQS queues [1]. The GetSQS
code doesn't attempt long polling at all, but I wasn't sure if this was
intentional or if the option had just never been added. With a 20 second
long poll, the processor would make 3 requests per minute instead of 60,
assuming the queue was empty during that time.
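The arithmetic behind those numbers can be written out as a small Java sketch. It assumes an empty queue, so every request runs for its full cycle: the 1 second run schedule for short polling, or the 20 second wait (the value one would pass to setWaitTimeSeconds) for long polling. The class and method names are hypothetical, and the run schedule delay between long polls is ignored.

```java
// Hypothetical helper; models request counts against an empty queue only.
class EmptyQueuePolling {

    // Number of receive requests issued in one minute when each request
    // cycle takes cycleSeconds (run schedule for short polling, wait time
    // for long polling).
    static int requestsPerMinute(int cycleSeconds) {
        if (cycleSeconds <= 0) {
            throw new IllegalArgumentException("cycleSeconds must be positive");
        }
        return 60 / cycleSeconds;
    }

    public static void main(String[] args) {
        int shortPolling = requestsPerMinute(1);  // 1 second run schedule
        int longPolling = requestsPerMinute(20);  // 20 second long poll
        System.out.println("short polling: " + shortPolling + " requests/minute");
        System.out.println("long polling:  " + longPolling + " requests/minute");
    }
}
```

Under these assumptions the long poll cuts the request count from 60 to 3 per minute, a 95% reduction while the queue stays empty.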
Another data point: even during high CPU usage, the GetSQS processor was
only making one request per second to SQS (verified via tcpdump). While not
ideal from a billing perspective, doesn't it seem wrong that 1 request a
second is causing such high CPU?
Perhaps to muddy the waters a bit, I played with the run schedule yesterday,
and even now that I've turned it back to 1 second, CPU usage is remaining
low. Before, I could start/stop GetSQS repeatedly and observe the high CPU
usage, but now I can't reproduce it. If I'm able to consistently reproduce
the issue in the future, I'll be sure to post again.
Cheers,
Adam
[1] http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html
On 10/20/15 4:37 AM, Adam Estrada wrote:
Adam,
I suspect that GetSQS is polling Amazon to check for data. It's not exactly
like your standard message broker in that you have to force the poll.
Anyway, throw a wait time in there and see if that fixes it. This will also
help lower your monthly Amazon bill...
Adam
On Oct 19, 2015, at 11:41 PM, Adam Lamar <adamond...@gmail.com> wrote:
Hi everybody!
I've been testing NiFi 0.3.0 with the GetSQS processor to fetch objects
from an AWS S3 bucket as they're created. My flow looks like this:
GetSQS
SplitJson
ExtractText
FetchS3Object
PutFile
I noticed that GetSQS causes a high amount of CPU usage, about 90% of one
core. If I turn off GetSQS, CPU usage immediately drops to 2%. If I turn
GetSQS back on with the run schedule at 10 seconds, it stays at 2%.
Would it be worth using setWaitTimeSeconds [1] to make the SQS receive a
blocking call? Alternatively, should GetSQS default to a longer run schedule?
Cheers,
Adam
[1] http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/sqs/model/ReceiveMessageRequest.html#setWaitTimeSeconds(java.lang.Integer)