Unfortunately, Andrej's concerns are well-founded. sizelimit= was in
1.2.0, but I removed it from 1.3.0, and reserved_space= does not offer
exactly the same functionality. This ought to be mentioned in the NEWS
file (and sizelimit= should be listed as removed from
docs/configuration.txt; if it isn't, please file a docs bug ticket).
I removed sizelimit= because it turned out to be too expensive to use
on large storage servers, such as the ones we run here at
allmydata.com, which hold several million shares each. Each time the
node started, it had to do the python equivalent of /bin/du: walk
through all shares, measure their sizes, add them together, then
compare the total against the sizelimit= value. This caused node
startup to block for a long time (upwards of 15 minutes), impacting
server availability and discouraging us from upgrading servers in a
timely fashion.
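To give a sense of the cost, the startup scan amounted to something
like the following (a simplified sketch; the function name and the
flat directory layout are my illustration, not the node's actual
code). With millions of shares, that's one stat() call per share
before the node can even start enforcing the limit:

```python
import os

def total_share_size(storage_dir):
    """Sum the byte sizes of every share file under storage_dir.

    Roughly what the old sizelimit= code had to do at startup: one
    stat() per file, so a server holding millions of shares blocks
    here for many minutes before it can serve anything.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(storage_dir):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total
```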
In addition, that du-style accounting of space consumed was
inaccurate, since it didn't always take the filesystem's minimum
block size into account. Consequently the server could easily use
much more space than you wanted it to.
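The gap comes from summing st_size (logical file length) rather than
the space the filesystem actually allocated. On POSIX systems,
st_blocks reports allocation in 512-byte units, so a tiny share on a
4 KiB-block filesystem consumes 4096 bytes on disk while st_size says
only a few. A sketch of the allocation-aware measurement (my
illustration; not code from the node):

```python
import os

def allocated_size(path):
    """Bytes actually consumed on disk, not the logical file length.

    POSIX st_blocks counts 512-byte units regardless of the
    filesystem's block size, so a 1-byte file on a 4 KiB-block
    filesystem reports 4096 here but 1 via st_size -- the gap that
    st_size-based accounting misses. (st_blocks is unavailable on
    Windows.)
    """
    return os.stat(path).st_blocks * 512
```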
The new reserved_space= control, added in 1.3.0, uses the python
equivalent of /bin/df (specifically os.statvfs), which is practically
instantaneous (a single syscall), because the filesystem keeps track
of partition-wide space usage continually. However, it doesn't enable
the kind of limits that Andrej would like to enforce, and we're still
looking for an os.statvfs equivalent for Windows (currently
reserved_space= is not honored on Windows).
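For reference, the statvfs check can be sketched in a couple of lines
(the function name and the f_bavail-based arithmetic are my
illustration, not the node's actual code). The key point is that the
cost is constant no matter how many shares the server holds:

```python
import os

def remaining_space(storage_dir, reserved_space):
    """Bytes the server may still accept before eating into
    reserved_space.

    os.statvfs is a single syscall; the filesystem already tracks
    partition-wide usage, so this is effectively instantaneous.
    f_bavail is the block count available to non-root users.
    (os.statvfs does not exist on Windows, which is why
    reserved_space= is not honored there.)
    """
    s = os.statvfs(storage_dir)
    free_bytes = s.f_frsize * s.f_bavail
    return max(0, free_bytes - reserved_space)
```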
With the new share-crawling framework I just added last week, we could
conceivably bring back sizelimit=. It would probably be applied
slowly: when first enabled, the node would do a slow (hours or days)
crawl of all shares, adding up their sizes in a non-blocking,
CPU-yielding manner, then start enforcing the limit once it had found
out how much space was actually in use. We'd persist the results of
the crawl to let us get moving faster on subsequent restarts.
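A crawl along those lines could look something like this (entirely
hypothetical sketch: the function name, JSON state file, and batch
scheme are mine, not the actual crawler framework). Each call
processes a bounded batch of shares and persists its progress, so the
node's event loop stays responsive and a restart resumes instead of
starting over:

```python
import json
import os

def crawl_step(state_path, storage_dir, batch=1000):
    """One bounded slice of a non-blocking share-size crawl.

    The node would schedule repeated calls from its reactor, yielding
    the CPU between slices. Progress is persisted to state_path so
    subsequent restarts pick up where the crawl left off.
    """
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    else:
        state = {"pending": [storage_dir], "total": 0, "done": False}

    processed = 0
    while state["pending"] and processed < batch:
        path = state["pending"].pop()
        if os.path.isdir(path):
            # Queue children; directories don't count against the batch.
            state["pending"].extend(
                os.path.join(path, name) for name in os.listdir(path))
        else:
            state["total"] += os.path.getsize(path)
            processed += 1

    state["done"] = not state["pending"]
    with open(state_path, "w") as f:
        json.dump(state, f)
    return state
```

Once state["done"] is true, state["total"] is the figure the node
would compare against sizelimit=.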
We could also bring sizelimit= back as-is, with a warning that it may
take a long time to restart the node when there are a lot of shares.
Or, the Accounting work I'm slowly accomplishing in the background
would provide a more fine-grained limit, and would include a coarse
server-wide limit as a side-effect.
Hope that helps,
-Brian
On Mar 1, 2009, at 1:44 AM, Rogério Schneider <[email protected]>
wrote:
Andrej, what you want is 'sizelimit'. This configuration limits the
utilization of storage for a given client.
To share only 2GB, for example:
[storage]
enabled = true
sizelimit = 2000000000
http://allmydata.org/source/tahoe/trunk/docs/configuration.txt
Regards,
Rogério Schneider
On Sun, Mar 1, 2009 at 3:09 AM, Andrej Falout <[email protected]>
wrote:
Thanks Rogério,
That 'reserved_space' is, as documented, the minimum space tahoe
will try to keep free on your disk. Say, when your disk goes down to
only 2GB (in my config) it will stop storing new chunks. This is
exactly what you want.
Unless I'm missing something, not exactly.
On my 3T mount, this would consume all the space between the
currently used 1T and (3T-2G).
I'd like to allocate only a portion of (3T-2G) to Tahoe storage, as
I will need the rest in the future.
Therefore the need for something like a storage_use_max=xxxx
parameter.
IIUC, reserved_space= has the meaning of "don't store any more data
if partition free space is less than this".
I would like to be able to declare "use a maximum of x MB, but only
if doing so will not reduce available space under y MB".
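The two-sided rule Andrej describes could be expressed as a single
admission check (a hypothetical sketch: storage_use_max is not a real
Tahoe option, it just names the requested feature, and the function
and its parameters are mine):

```python
def may_accept(share_size, used_by_tahoe, disk_free,
               storage_use_max, reserved_space):
    """Accept a new share only if both limits hold: Tahoe's own
    usage stays under storage_use_max, AND the partition keeps at
    least reserved_space free after the write. This combines the
    removed sizelimit= semantics with the 1.3.0 reserved_space=
    semantics.
    """
    return (used_by_tahoe + share_size <= storage_use_max
            and disk_free - share_size >= reserved_space)
```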
--
Andrej Falout
_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
--
Rogério Schneider
MSN: [email protected]
GTalk: [email protected]
TerraVoip: stockrt
Skype: stockrt