Hello Matthew,

On Thursday 20 January 2011 at 15:23:21, Matthew Ife wrote:
> The patch below adds native support for quotas in bacula 5.0.3.

Thanks for your interest; this feature is really interesting for big hosting
providers.

> We resell bacula and require a method to prevent resource abuse of the
> service we provide. We have used another method to do this in the past
> (Max Volume Jobs and Max Volume Bytes). However, this solution does not
> scale to our needs, leads to avoidable delays in backups due to waiting
> for volumes to become free and is difficult to administratively manage.

Another quick way to handle this is to have a RunScript that checks the
quota when the job starts and either lets the job run or prevents it. That is
perhaps more flexible, but not as well integrated.

> Personally I believe volume management should not be responsible for this
> type of work, and that it belongs in job management. Thus the patch below
> provides this functionality.
> 
> Quotas attempt to mostly emulate the functionality of filesystem quotas,
> providing soft and hard limits with grace times.
> 
> I've tried to document where I think necessary. I would like the patch to be
> looked into and criticized - I am not a developer by trade and imagine
> there are some things that could be done better! I am open to suggestions.

No problem, your code is clean (next time, I would appreciate a git diff or
diff -Naur, which are easier to read).

> A "Quota" is determined by the sum total of all JobBytes values within the
> JobRetention period. Thus increasing your job retention increases the
> scope for which quota is evaluated. In addition deleting or purging your
> jobs has the effect of modifying the reported quota value.
> 
> Quotas are checked at two spots. The first is when the job is started.
> Quotas are checked and if you exceed quota your job is terminated before
> connections to bacula-fd are initialized.

Ok, nice.
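
To check that I read the definition correctly, here is a minimal sketch of the
start-of-job check in Python. It is not taken from the patch: the names (Job,
quota_used, may_start_job, soft_exceeded_at) are hypothetical, and the grace
handling assumes filesystem-like behaviour, where usage above the soft limit is
tolerated until the grace period has elapsed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass
class Job:
    job_bytes: int       # JobBytes recorded in the catalog for this job
    end_time: datetime   # when the job finished


def quota_used(jobs: Iterable[Job], now: datetime, job_retention: timedelta) -> int:
    """Sum JobBytes over the jobs that are still inside the JobRetention window."""
    cutoff = now - job_retention
    return sum(j.job_bytes for j in jobs if j.end_time >= cutoff)


def may_start_job(used: int, soft: int, hard: int, grace: timedelta,
                  soft_exceeded_at: Optional[datetime], now: datetime) -> bool:
    """Decide at job start whether the quota allows the job to run.

    Assumption: soft_exceeded_at is the time the soft limit was first
    exceeded, or None if the grace timer has not started yet.
    """
    if used > hard:
        return False              # hard limit is an absolute ceiling
    if used <= soft:
        return True               # under the soft limit, nothing to enforce
    if soft_exceeded_at is None:
        return True               # grace timer would start with this job
    return now - soft_exceeded_at <= grace
```

With this reading, a longer JobRetention widens the window and so raises the
reported usage, and pruned jobs simply drop out of the sum.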

> The second place they are
> checked is in the Job Monitor job. Every minute those jobs which have
> quotas enabled are checked again for their quotas against their running
> jobs.


I don't like the idea of spawning a storage daemon connection every minute for
every job in the watchdog; this is a big overhead. What is the benefit of
checking the quota while the job is running?

I can understand that it's more "strict", but the associated cost seems to be
very high for a very small benefit. For example, a job that is almost
complete could be cancelled for a few bytes and would have to be restarted
from scratch later; the cancelled job will have used resources and won't be
usable for restore (directly).

If you absolutely don't want to use more resources than expected, you can
probably estimate the size of the next job by looking at previous jobs in the
catalog.
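
As a rough illustration of that idea, the sketch below (Python, with
hypothetical helper names) averages the JobBytes of the last few jobs and
refuses to start if the estimate would push usage past the hard quota. The
averaging window and the use of a plain mean are assumptions; a real
estimate would probably need to distinguish job levels (Full, Incremental).

```python
from statistics import mean
from typing import Sequence


def estimate_next_job_bytes(previous_job_bytes: Sequence[int], window: int = 5) -> int:
    """Estimate the next job size as the mean of the most recent jobs."""
    recent = list(previous_job_bytes)[-window:]
    return int(mean(recent)) if recent else 0


def fits_in_quota(used: int, hard_quota: int,
                  previous_job_bytes: Sequence[int], window: int = 5) -> bool:
    """True if current usage plus the estimated next job stays within the hard quota."""
    return used + estimate_next_job_bytes(previous_job_bytes, window) <= hard_quota
```

This only needs catalog data at job start, so no storage daemon connection is
involved while the job runs.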


> This is done by taking the sum total of all jobs so far, plus the
> bytes value (as reported by the storage daemon for that job) and
> performing the quota check against that.
> 
> To enable this facility in the Client resource of a bacula config file the
> following four items have been introduced.
> 
> Hard Quota - (takes a byte size) The absolute ceiling on the size a total
>   quota can be.
> Soft Quota - (takes a byte size) The limit of quota you can have once you
>   have exceeded your grace period.
> Soft Quota Grace Period - (takes a time period) The amount of grace period
>   time you are permitted before soft quotas are enforced.
> Strict Quotas - (takes yes or no) This changes the behavior of quotas. When
>   in the soft quota you can 'burst' up to the hard limit or grace time (as
>   per filesystem quotas). When strict is off (the default) the client ends
>   up receiving the total quota they burst up to as their new soft quota.
>   If strict is turned on, then the true soft quota is enforced the next
>   time a backup is attempted.

Sorry, I'm a bit lost with the "Strict Quotas" keyword. Can you explain it
again with some examples?

> In addition, I added a "purge -> quotas" option in bconsole to reset quotas
> (this effectively resets the grace time so exceeding soft quota starts the
> grace timer again).

Ok

> Onto the code. Now, I'm certainly open to criticisms and suggestions on how
> to improve this. There are a few things I consciously did which might need
> to change.

Your code is rather clean. I see a few tweaks that can wait, but this is a very
good start. A very important point is to have a simple test for this feature.
You can adapt an existing regress script and ensure that your code does the
right thing; I can help you with this part. First, take a look at
regress/tests/prune-test, for example (it shows how to add directives and how
to check results).

> Firstly, the nature of this code means that locking inside of it must be
> avoided. Thus, I have had to rewrite some functions already used elsewhere
> in bacula omitting the locks and working around potential indefinite
> blocks.

This part is associated with the SD pooling code, which is not essential 
(IMHO)

> These have been added to a quota.c file along with the actual
> quota checking code. For the most part they duplicate already used
> functions but I felt updating existing functions to support what work I
> was doing was dangerous. This might not be the best approach and I'm open
> to suggestions here.

What about groups of clients? I can imagine that some users will want to
allocate quota per "group" of clients (or by Pool) rather than on a
client-by-client basis. Is it something that should be considered now? (IMHO,
it can wait)

From what I can see, the Quota table can be merged with the Client table.

> The majority of quota checking code is in quota.c, there is some stuff in
> other header files and some stuff in the cats directory used for database
> access to quota check with.
> 
> This patch is running on our own network of 300 backup servers spread
> amongst 1200 (so far) clients using it. We are gradually cranking up the
> usage week by week. So far this patch works flawlessly. Other tortures
> I've done include running every job with quotas on the backup server
> immediately (about 15) to see if it would cope. So far things have been
> alright.
> 
> This does need more testing, particularly on postgres and sqlite3 which I
> have not done.
>
> I also don't know how this code would cope on one director
> running 1000 concurrent jobs (this is something we intend to be doing in
> the coming months).

With the SD pooling code, I can imagine that if you have a network problem, it 
can lead to strange things.

> OK, enough rambling on. Here's the patch below, I apologize for the size.
> Feedback is greatly appreciated. Again, I'm not a developer so if anyone
> knows how I can make the patch put quota.c/quota.h in the correct
> directory (not the one I originally diffed from) I would be much obliged!

Having specific files for quota is OK, no worries. Once we have made a decision
on the pooling part, it will be easier to decide.

To use git, first clone the git repo from sourceforge, create your own branch 
from the Branch-5.1 code, add your changes, and generate diffs (you can also 
publish them on github for example).

git clone git://bacula.git.sourceforge.net/gitroot/bacula/bacula
cd bacula
git checkout -b quota origin/Branch-5.1

git add ...
git commit

git diff origin/Branch-5.1


You can find information on our git usage at:
http://www.bacula.org/5.1.x-manuals/en/developers/developers/Git_Usage.html

Bye

-- 
Need professional help and support for Bacula ?
Visit http://www.baculasystems.com
