Let me consider this case: 100 processes each mutate a single tiny cell every 50 milliseconds, all on a single range. The RangeServer then receives 2000 update requests per second and issues 2000 small write requests to the DFS. Writing many small blocks of data to the DFS separately is inefficient, yet the user commit log is appended to and flushed on every update. By my measurement, writing one tiny block on KFS takes about 0.6 milliseconds.
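To put numbers on this, here is a rough back-of-the-envelope calculation using only the figures above. It assumes the commit log writes are serialized on one range, and the `batched_busy_us` helper further assumes (hypothetically) that a bundled write costs about the same as a tiny one, which I have not measured.

```python
# Figures from the message above, in integer microseconds to avoid rounding.
CLIENTS = 100
UPDATE_INTERVAL_MS = 50
WRITE_LATENCY_US = 600  # measured: about 0.6 ms per tiny write on KFS

updates_per_second = CLIENTS * (1000 // UPDATE_INTERVAL_MS)

# Assumption: log writes are serialized, so their latencies add up.
log_busy_us_per_second = updates_per_second * WRITE_LATENCY_US


def batched_busy_us(batch_size):
    """Log busy time per second if `batch_size` updates share one write.

    Hypothetical: assumes a bundled write costs the same as a tiny one.
    """
    writes = -(-updates_per_second // batch_size)  # ceiling division
    return writes * WRITE_LATENCY_US


print(updates_per_second)        # update requests per second
print(log_busy_us_per_second)    # microseconds of log writing per second
print(batched_busy_us(10))       # same, with 10 updates per log write
```

Under these assumptions the unbatched case needs 1.2 seconds of log writing for every second of incoming updates, so a single range cannot keep up, while bundling 10 updates per write brings that down to 0.12 seconds.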
To make Hypertable scale better with a huge number of mutator clients, it would be good to add an update request queue: when many update requests are pending, bundle them, write their commit log entries together in a single write, and then apply the updates to the corresponding ranges.
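A minimal sketch of the queue I have in mind, in Python for brevity. This is not RangeServer code: `GroupCommitQueue`, `submit`, `commit_pending`, `max_batch` and the `append_and_flush` callback are all hypothetical names, and the commit log is stood in for by a plain list. It also assumes a single committer thread calls `commit_pending`.

```python
import threading
from collections import defaultdict, deque


class GroupCommitQueue:
    """Hypothetical update queue: bundles pending updates into one log write."""

    def __init__(self, append_and_flush, max_batch=1000):
        # Callback that performs ONE commit log append + flush per call.
        self._append_and_flush = append_and_flush
        self._max_batch = max_batch
        self._pending = deque()
        self._lock = threading.Lock()
        self.ranges = defaultdict(list)  # range_id -> applied mutations

    def submit(self, range_id, mutation):
        """Enqueue an update; the returned event is set once it is logged."""
        done = threading.Event()
        with self._lock:
            self._pending.append((range_id, mutation, done))
        return done

    def commit_pending(self):
        """Log up to max_batch pending updates in one write, then apply them."""
        with self._lock:
            count = min(len(self._pending), self._max_batch)
            batch = [self._pending.popleft() for _ in range(count)]
        if not batch:
            return 0
        # One commit log write for the whole bundle instead of one per update.
        self._append_and_flush([(rid, mut) for rid, mut, _ in batch])
        # Ranges are updated and clients released only after the log write.
        for rid, mut, done in batch:
            self.ranges[rid].append(mut)
            done.set()
        return len(batch)


log_writes = []  # stands in for the commit log on the DFS
update_queue = GroupCommitQueue(append_and_flush=log_writes.append)
acks = [update_queue.submit("range-0", ("cell", i)) for i in range(2000)]
while update_queue.commit_pending():
    pass
print(len(log_writes), "log writes for", len(acks), "updates")
```

The point of the design is that a client is acknowledged only after the bundled log write that contains its update has completed, so durability is unchanged while the number of DFS writes drops from one per update to one per bundle.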
