Is it possible to build a "bucket" or "container" system that holds a fixed
amount of data and rolls over to the next bucket once that size has been reached?

The issue I have is that a db with 235 million pages takes FOREVER to do
anything, simply because it makes a duplicate of itself for every process.

Would it make sense to create buckets, much like extents (but larger) in
the RDBMS world, and work on smaller volumes of data at a time?
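To make the idea concrete, here's a minimal sketch (not anything Nutch actually ships; the class and constant names are made up for illustration) of how pages could be routed to fixed buckets, so a batch of updates only has to merge against the buckets it touches rather than rewriting the whole 235M-page db:

```java
import java.util.*;

// Hypothetical sketch: route page updates to a fixed set of "buckets"
// so an update pass only merges against the bucket files it touches,
// not the entire db. NUM_BUCKETS and class names are illustrative.
public class BucketRouter {
    static final int NUM_BUCKETS = 64; // e.g. ~4M pages per bucket at 235M total

    // Stable assignment: the same URL always lands in the same bucket,
    // so merging new entries only rewrites that one bucket.
    static int bucketFor(String url) {
        return Math.floorMod(url.hashCode(), NUM_BUCKETS);
    }

    public static void main(String[] args) {
        List<String> addUrls = List.of(
            "http://example.com/a",
            "http://example.org/b",
            "http://example.net/c");

        // Group the day's addurl requests by bucket; each group becomes
        // one small merge job instead of one whole-db rewrite.
        Map<Integer, List<String>> jobs = new HashMap<>();
        for (String u : addUrls) {
            jobs.computeIfAbsent(bucketFor(u), k -> new ArrayList<>()).add(u);
        }
        System.out.println("buckets touched: " + jobs.size()
            + " of " + NUM_BUCKETS);
    }
}
```

The key property is the stable hash assignment: a daily addurl batch then scales with the number of buckets it actually hits, not with the total db size.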

I guess I envision a Nutch db daemon that processes all "bucket requests",
queuing them up appropriately while it works through whatever load it can.
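The daemon idea above might look something like this: a single worker draining a queue of per-bucket requests, so callers enqueue work and return immediately instead of each one spawning its own whole-db pass. Again, this is just an assumed sketch, not existing Nutch code:

```java
import java.util.concurrent.*;

// Hypothetical sketch of the "db daemon" idea: callers submit bucket
// requests to a queue; one worker thread processes them in order.
public class BucketDaemon {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(() -> {
        try {
            while (true) {
                queue.take().run(); // process one bucket request at a time
            }
        } catch (InterruptedException e) {
            // interrupted = shutdown requested
        }
    });

    public BucketDaemon() {
        worker.setDaemon(true);
        worker.start();
    }

    public void submit(Runnable bucketRequest) {
        queue.add(bucketRequest); // returns immediately; work happens later
    }

    public static void main(String[] args) throws Exception {
        BucketDaemon d = new BucketDaemon();
        CountDownLatch done = new CountDownLatch(3);
        for (int i = 0; i < 3; i++) {
            final int bucket = i;
            d.submit(() -> {
                System.out.println("processed bucket " + bucket);
                done.countDown();
            });
        }
        done.await(); // all queued requests have been processed
    }
}
```

Since the queue is FIFO and there is one worker, requests against the same bucket never race, which is the serialization you'd want before merging into a bucket file.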

I envision an era where I add in my daily addurl requests and it doesn't take
4 days to process them :)


