I imported 9 billion rows in 5 days or so, with a few minor crashes and
an average speed of 50-200k ops/sec.  The client needs some love to make
it more efficient at grouping commits during bulk upload.
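
To make "grouping commits" concrete, here is a rough sketch of the kind of
client-side buffering I mean. It is not HBase client code: BatchingWriter,
the batch size, and the sink callback are all hypothetical names, and the
sink just stands in for whatever call would send one grouped commit to the
server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical client-side buffer: collects rows and hands them to a sink in groups. */
class BatchingWriter<T> implements AutoCloseable {
    private final int batchSize;
    // Assumption: the sink represents one grouped commit (one round trip) to the server.
    private final Consumer<List<T>> sink;
    private List<T> buffer;

    BatchingWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Buffers one row and flushes automatically once the batch is full. */
    void add(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered as a single group; does nothing if the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>(batchSize); // start a fresh buffer before handing off
        sink.accept(batch);
    }

    /** Flushes the final partial batch so no buffered rows are lost. */
    @Override
    public void close() {
        flush();
    }
}
```

With a batch size of 3 and 7 rows added, the sink would be called three
times (3, 3, and then 1 row on close). The right batch size for a real
bulk upload is something to measure, not something this sketch decides.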

On Jun 27, 2009 4:02 PM, "Andrew Purtell" <[email protected]> wrote:

Test:

- Latest trunk.

- Config modified only with a store file split threshold of 1GB

- 4 node testbed:
   1) namenode, datanode, hmaster, heritrix, jobtracker
   2) datanode, regionserver, heritrix, tasktracker, mapper (2)
   3) datanode, regionserver, heritrix, tasktracker, mapper (2)
   4) datanode, regionserver, heritrix, tasktracker, mapper (2)

- 100 heritrix threads - 4 hosts, 25 threads each - feeding in ~5MB/sec
average new edits

- 2 mappers x 3 hosts processing new edits and writing back
serialized/compressed Documents

- 3K average transactions/sec reported by master

- 'hadoop balancer -threshold 0.1'

- 1 hour test run

Result:

Passed with no incidents!

  - Andy
