Unfortunately, Andrej's concerns are well-founded. sizelimit= was in 1.2.0, but I removed it from 1.3.0, and reserved_space= does not offer exactly the same functionality. This ought to be mentioned in the NEWS file (and sizelimit= should be listed as removed in docs/configuration.txt .. if not, please file a docs bug ticket).

I removed sizelimit= because it turned out to be too expensive to use on large storage servers, such as the ones we run here at allmydata.com, which hold several million shares each. Each time the node started, it had to do the python equivalent of /bin/du: walk through all the shares, measure their sizes, add them all together, then compare the total against the sizelimit= value. This caused node startup to block for a long time (upwards of 15 minutes), hurting server availability and discouraging us from upgrading servers in a timely fashion.
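
To make the cost concrete, here is a rough sketch (not the actual Tahoe code) of what that startup scan amounts to; `storage_dir` and `total_share_bytes` are names I've invented for illustration:

```python
import os

def total_share_bytes(storage_dir):
    """Walk every share file under storage_dir and sum the sizes --
    roughly what the old sizelimit= startup scan had to do.  Cost is
    O(number of shares): one stat() per share, millions of them on a
    big server."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(storage_dir):
        for name in filenames:
            total += os.stat(os.path.join(dirpath, name)).st_size
    return total
```

With millions of shares, that loop is millions of syscalls on every restart, which is where the 15-minute startup delay came from.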

In addition, using du to measure space consumed was inaccurate, as it didn't always take the filesystem's minimum block size into account. Consequently the server could easily use much more space than you wanted it to.
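
The discrepancy is visible in the stat results themselves. A sketch (assuming a POSIX system where `st_blocks` is available; `bytes_on_disk` is my name, not a Tahoe API):

```python
import os

def bytes_on_disk(path):
    """Actual disk usage of a file.  st_blocks is counted in 512-byte
    units, so a 1-byte file typically consumes a full filesystem block
    (often 4096 bytes) on disk, not 1 byte as st_size would suggest."""
    st = os.stat(path)
    return st.st_blocks * 512
```

Summing st_size across many small shares therefore undercounts real disk consumption, letting the server drift past the intended limit.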

The new reserved_space= control, added in 1.3.0, uses the python equivalent of /bin/df (specifically os.statvfs), which is practically instantaneous (a single syscall), because the filesystem keeps track of partition-wide space usage continually. However, it doesn't enable the kind of limit that Andrej would like to enforce, and we're still looking for an os.statvfs equivalent for Windows (currently reserved_space= is not honored on Windows).
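
The df-style check is cheap no matter how many shares are stored. A minimal sketch (the function names are mine; this mirrors the idea, not the exact Tahoe implementation, and os.statvfs is POSIX-only, hence the Windows gap):

```python
import os

def free_space(path):
    """df-equivalent: a single statvfs() call, independent of share
    count.  f_frsize * f_bavail is the space available to an
    unprivileged user."""
    s = os.statvfs(path)
    return s.f_frsize * s.f_bavail

def accepting_shares(path, reserved_space):
    """Stop accepting new shares once free space on the partition
    drops below the reserved_space= threshold."""
    return free_space(path) > reserved_space
```

Note this bounds *free space remaining*, not *space Tahoe itself uses*, which is exactly the distinction Andrej is drawing below.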

With the new share-crawling framework I just added last week, we could conceivably bring back sizelimit=. It would probably be applied slowly: when first enabled, we'd do a slow (hours or days) crawl of all shares, adding up their sizes in a non-blocking, CPU-yielding manner, then start enforcing the limit once we'd found out how much space we were actually using. We'd persist the results of the crawl so subsequent restarts could get moving faster.
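
A generator-based sketch of that idea (purely illustrative -- the real crawler framework is Twisted-based, and `crawl_share_sizes`, `state_file`, and `batch` are names I've made up): yield control back to the caller every `batch` shares, and persist the running total so a restart can resume rather than re-scan.

```python
import os
import pickle

def crawl_share_sizes(storage_dir, state_file, batch=1000):
    """Non-blocking sketch of the proposed sizelimit= crawl: walk the
    share tree a batch at a time, yielding the running total so the
    caller (e.g. an event loop) can interleave other work, and
    checkpointing progress to state_file so restarts start warm."""
    total, count = 0, 0
    for dirpath, _dirnames, filenames in os.walk(storage_dir):
        for name in filenames:
            total += os.stat(os.path.join(dirpath, name)).st_size
            count += 1
            if count % batch == 0:
                with open(state_file, "wb") as f:
                    pickle.dump(total, f)  # checkpoint for fast restart
                yield total  # hand control back between batches
    with open(state_file, "wb") as f:
        pickle.dump(total, f)
    yield total
```

Until the first full crawl completes, the node wouldn't yet know its true usage, which is why enforcement would only begin afterwards.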

We could also bring sizelimit= back as-is, with a warning that it may take a long time to restart the node when there are a lot of shares.

Or, the Accounting work I'm slowly accomplishing in the background would provide more fine-grained limits, and would include a coarse server-wide limit as a side-effect.

Hope that helps,
 -Brian


On Mar 1, 2009, at 1:44 AM, Rogério Schneider <[email protected]> wrote:

Andrej, what you want is 'sizelimit'. This option limits how much storage a
given node will provide.

To share only 2GB, for example:

[storage]
enabled = true
sizelimit = 2000000000

http://allmydata.org/source/tahoe/trunk/docs/configuration.txt

Regards,
Rogério Schneider


On Sun, Mar 1, 2009 at 3:09 AM, Andrej Falout <[email protected]> wrote:
Thanks Rogério,


That 'reserved_space' is, as documented, the minimum space Tahoe will try to keep free on your disk. Say, when your disk goes down to only 2GB free (in my config), it will stop storing new chunks. This is exactly what you want.

Unless I'm missing something, not exactly.

On my 3T mount, this would let Tahoe consume all the space between the currently used 1T and (3T - 2G).

I'd like to allocate only a portion of that (3T - 2G) to Tahoe storage, as I will need the rest in the future.

Hence the need for something like a storage_use_max=xxxx parameter.

IIUC, reserved_space= means "don't store any more data if free partition space is less than this".

I would like to be able to declare "use at most x MB, but only if doing so will not reduce available space below y MB".

--
Andrej Falout


_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev



--
Rogério Schneider

MSN: [email protected]
GTalk: [email protected]
TerraVoip: stockrt
Skype: stockrt
