Are you running Zookeeper on the same machine as the Nimbus box?

Michael Rose (@Xorlev <https://twitter.com/xorlev>)
Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
[email protected]


On Sun, Mar 2, 2014 at 6:16 PM, Sean Solbak <[email protected]> wrote:

> This is the first step of 4. When I save to the db I'm actually saving to a
> queue (just using the db for now).  In the 2nd step we index the data, and
> in the 3rd we do aggregation/counts for reporting.  The last is a search
> that I'm planning on using DRPC for.  Within step 2 we pipe certain datasets
> in real time to the clients they apply to.  I'd like both that and the DRPC
> calls to be sub-2s, which should be reasonable.
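>
> For the search in step 4, I'm picturing something like the sketch below.
> ParseQuery and SearchIndexQuery are placeholders for functions we'd still
> have to write, and indexState would be the TridentState built in step 2:
>
>   TridentTopology topology = new TridentTopology();
>   topology.newDRPCStream("search")
>           // turn the raw DRPC args into a query
>           .each(new Fields("args"), new ParseQuery(), new Fields("query"))
>           // run the query against the index state built in step 2
>           .stateQuery(indexState, new Fields("query"),
>                       new SearchIndexQuery(), new Fields("hits"));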
>
> You're right that I could speed up step 1 by not using Trident, but our
> requirements seem like a good use case for it in the other 3 steps.  With
> many results per second, batching should help performance a lot as long as
> the batch size is small enough.
>
> What would cause Nimbus to be at 100% CPU with the topologies killed?
>
> Sent from my iPhone
>
> On Mar 2, 2014, at 5:46 PM, Sean Allen <[email protected]>
> wrote:
>
> Is there a reason you are using Trident?
>
> If you don't need to handle the events as a batch, you are probably going
> to get better performance without it.
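>
> e.g. a plain topology shaped something like this (the spout/bolt class
> names are made up, just to show the structure):
>
>   TopologyBuilder builder = new TopologyBuilder();
>   // event source
>   builder.setSpout("events", new EventSpout(), 4);
>   // CPU-heavy serialization in its own bolt
>   builder.setBolt("serialize", new SerializeBolt(), 4)
>          .shuffleGrouping("events");
>   // write each serialized record straight to the db/queue
>   builder.setBolt("save", new DbWriterBolt(), 4)
>          .shuffleGrouping("serialize");
>
> Each tuple then flows through on its own instead of waiting on a batch.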
>
>
> On Sun, Mar 2, 2014 at 2:23 PM, Sean Solbak <[email protected]> wrote:
>
>> I'm writing a fairly basic Trident topology, as follows (rough sketch below):
>>
>> - 4 spouts of events
>> - merges into one stream
>> - serializes each event object to a string
>> - saves to the db
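>>
>> Something like this, where spout1..spout4, SerializeEvent, DbStateFactory
>> and DbUpdater stand in for our own spouts, function and state code (not
>> the actual implementation):
>>
>>   TridentTopology topology = new TridentTopology();
>>   Stream s1 = topology.newStream("events-1", spout1);
>>   Stream s2 = topology.newStream("events-2", spout2);
>>   Stream s3 = topology.newStream("events-3", spout3);
>>   Stream s4 = topology.newStream("events-4", spout4);
>>
>>   topology.merge(s1, s2, s3, s4)
>>           // CPU-heavy serialization, kept out of the spouts
>>           .each(new Fields("event"), new SerializeEvent(), new Fields("json"))
>>           // write each batch of serialized records to the db/queue
>>           .partitionPersist(new DbStateFactory(), new Fields("json"),
>>                             new DbUpdater());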
>>
>> To speed things up, I split the serialization task away from the spout,
>> since it was CPU intensive.
>>
>> The problem I have is that after 10 minutes there are over 910k tuples
>> emitted/transferred, but only 193k records are saved.
>>
>> The overall load of the topology seems fine.
>>
>> - 536.404 ms complete latency at the topology level
>> - The highest capacity of any bolt is 0.3, and that's the serialization one.
>> - Each bolt task has sub-20 ms execute latency and sub-40 ms process
>> latency.
>>
>> So it seems Trident has all the records internally, but I need these
>> events as close to real time as possible.
>>
>> Does anyone have any guidance as to how to increase the throughput?  Is
>> it simply a matter of tweaking max spout pending and the batch size?
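>>
>> Concretely I'm thinking of knobs like these (the numbers are just guesses
>> on my part, not what we're running):
>>
>>   Config conf = new Config();
>>   // in Trident, max spout pending is the number of batches in flight
>>   conf.setMaxSpoutPending(10);
>>   // how often a new batch may be emitted
>>   conf.put(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS, 100);
>>
>> The batch size itself would come from the spout, i.e. how many tuples it
>> emits per batch.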
>>
>> I'm running it on 2 m1.smalls for now.  I don't see the need to upgrade
>> until the demand on the boxes is higher.  However, CPU usage on the
>> Nimbus box is pinned at around 99%.  Why would that be?  It's at 99%
>> even when all the topologies are killed.
>>
>> We are currently targeting 200 million records per day (roughly 2,300
>> records per second on average), which seems like it should be quite easy
>> based on what I've read about what other people have achieved.  I realize
>> that more hardware should be able to boost this as well, but my first goal
>> is to get Trident to push the records to the db quicker.
>>
>> Thanks in advance,
>> Sean
>>
>>
>
>
> --
>
> This is not a signature
>
>
