GitHub user revans2 commented on the pull request:

    https://github.com/apache/incubator-storm/pull/168#issuecomment-49057769
  
    I don't know of any hard limits when force sync is off.  It depends on the 
operating system, file system, disk RPM/type, the amount of free memory, and 
what else is going on in the system.  
    
    When force sync is off, ZK does not wait for the edit to hit the platter; it 
only waits for the data to reach the OS page cache before going on.  The 
OS decides when to write the dirty pages back to disk.  Linux is typically 
configured so that dirty pages do not stay around for more than a couple of 
minutes.  But even a few seconds' worth of edits can become a rather large 
batch that can usually be written to disk in a single large extent, so seeks 
are greatly reduced.  We regularly see disk utilization hovering around 5% 
and occasionally spiking to 20% for a 400 node cluster.  It was very similar 
with the 800 simulated nodes, with disk utilization at about 7%.  I did not 
save the metrics for the simulated 2000 node run, so I don't feel comfortable 
giving hard numbers.  Off the top of my head I seem to remember it being 10 to 
15%, but I really don't know for sure.
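
    How long dirty pages may linger can be checked on a given Linux box.  The 
sketch below is only an illustration, not something from the pull request: it 
reads the two `vm` sysctl files that control page-cache writeback timing and 
converts them to seconds.  The helper names are made up for this example, and 
the values differ per system (stock kernels usually ship 3000 and 500 
centiseconds, but distributions and admins may tune them).

```python
from pathlib import Path


def centisecs_to_seconds(raw: str) -> float:
    """Convert the text of a *_centisecs sysctl file to seconds."""
    return int(raw.strip()) / 100.0


def read_dirty_page_timers(base: str = "/proc/sys/vm") -> dict:
    """Return the writeback timers in seconds, or None where unreadable.

    dirty_expire_centisecs:    age at which a dirty page becomes eligible
                               for writeback.
    dirty_writeback_centisecs: how often the kernel flusher threads wake up.
    """
    timers = {}
    for name in ("dirty_expire_centisecs", "dirty_writeback_centisecs"):
        try:
            timers[name] = centisecs_to_seconds((Path(base) / name).read_text())
        except (OSError, ValueError):
            # Not Linux, no permission, or unexpected file content.
            timers[name] = None
    return timers


if __name__ == "__main__":
    for name, seconds in read_dirty_page_timers().items():
        print(f"{name}: {seconds}")
```

    Running it on the ZooKeeper hosts gives a rough upper bound on how many 
seconds of acknowledged edits could be lost in a power failure while force 
sync is off.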
    
    Even if utilization doubled to 10% going from 400 to 800 nodes and then 
grew linearly thereafter, disk would not become a bottleneck again until 6500+ 
nodes.  (It is really hard to actually hit 100% disk utilization in the real 
world, so I picked 80% as the threshold for this number.)
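
    The back-of-the-envelope estimate above can be reproduced with a straight 
line through the two hypothetical points (400 nodes at 5%, 800 nodes at 10%).  
This is just a sketch of that arithmetic; the function name and defaults are 
made up for the example, and the linear growth is the same assumption as in 
the text, not a measurement.

```python
def nodes_at_utilization(target_pct: float,
                         n1: int = 400, u1: float = 5.0,
                         n2: int = 800, u2: float = 10.0) -> float:
    """Node count at which a line through (n1, u1) and (n2, u2) reaches
    target_pct disk utilization."""
    # Multiply before dividing so the default inputs stay exact in floats.
    return n1 + (target_pct - u1) * (n2 - n1) / (u2 - u1)


if __name__ == "__main__":
    print(nodes_at_utilization(80.0))   # threshold used in the estimate
    print(nodes_at_utilization(100.0))  # theoretical ceiling
```

    With these inputs the 80% threshold lands at 6400 nodes, in the same 
ballpark as the rough 6500+ figure quoted above.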

