On Thursday 15 November 2018 14:17:29 Austin S. Hemmelgarn wrote:

> On 2018-11-15 13:36, Gene Heskett wrote:
> > On Thursday 15 November 2018 12:57:54 Austin S. Hemmelgarn wrote:
> >> On 2018-11-15 11:53, Gene Heskett wrote:
> >>> On Thursday 15 November 2018 07:36:37 Austin S. Hemmelgarn wrote:
> >>>> On 2018-11-15 06:16, Gene Heskett wrote:
> >>>>> I ask because after last night's run it showed one huge and 3
> >>>>> teeny level 0's for the 4 new dle's.  So I just re-adjusted the
> >>>>> locations of some categories and broke the big one up into 2
> >>>>> pieces, "./[A-P]*" and "./[Q-Z]*", so the next run will have 5
> >>>>> new dle's.
> >>>>>
> >>>>> But an estimate does not show the new names that results in.
> >>>>> I even took the 'estimate calcsize' setting back out of the
> >>>>> global dumptype, which, according to the manpage, forces the
> >>>>> estimates to be derived from a dummy run of tar. That didn't help.
> >>>>>
> >>>>> Clues? Having this info from an estimate query might take a
> >>>>> couple hours, but it sure would be helpful when redesigning
> >>>>> one's dle's.
> >>>>
> >>>> I'm fairly certain you can't, because it specifically shows
> >>>> server-side estimates, which have no data to work from if there
> >>>> has never been a dump run for the DLE.
> >>>
> >>> Even if you told it to use tar for the estimate phase? That has
> >>> enough legs to be called a bug. IMO anyway.
> >>
> >> As mentioned in one of my other responses, I can kind of see the
> >> value in this not bothering the client systems.  Keep in mind that
> >> server estimates cost nothing on the client, while calcsize or
> >> client estimates may use a significant amount of resources.
> >
> > My default has been calcsize for three or 4 years, changed because
> > tar was changed & was screwing up the estimates. I can remember 15+
> > years ago when I was using real tar estimates, on a much smaller
> > machine, and it could come within 50 megabytes of filling a DDS-2
> > tape (4 GB compressed) for weeks at a time. So that part of amanda
> > worked a lot better than it does today. And it's slowly gone to the
> > dogs as my system grew in complexity.  And went in a handbasket when
> > I had to change to calcsize during the tar churn.
>
> I've not been using AMANDA anywhere near as long as you have, but I've
> actually not seen any issues with accuracy of 'estimate client' mode
> estimates with current versions of GNU tar, except when the estimate
> ran while data in the DLE was being modified (and in that case, it
> makes sense that it would be bogus).  I generally don't 'estimate
> client' on my own systems though because it consistently takes far
> longer than 'estimate calcsize', and I'm not picky about the estimates
> being perfect.
>
> >> In this case, I do think the documentation should be a bit clearer,
> >
> > Yes, but who is to rewrite it?  He should know a heck of a lot more
> > about the amanda innards than I do even after 2 decades, and use
> > better defined words here and there too. diskdevice is a very
> > poor substitute for the far more common slanguage of "/path/to/"
> >
> >> and it would be useful to be able to get regular (calcsize and/or
> >> client) estimates on-demand, but I do think that the default is
> >> reasonably sane.
> >
> > It may well be sane, we'll see how it works in the morning. AIUI,
> > calcsize runs only on old history, so that should not impose a load
> > on the client, even when the client is itself.
>
> Unless I'm mistaken:
>
> * 'estimate server' runs only on historical data, and doesn't even
> talk to the client systems.  It's good at limiting the impact the
> estimate has on the client, but reliably gives bogus estimates if your
> DLEs don't show consistent behavior (that is, each backup of a given
> level is roughly the same size as every other backup at that level).
> * 'estimate client' relies on the backup program being used to give it
> info about how big it will be.  It gives estimates that are close to
> 100% accurate, but currently essentially requires running the backup
> process twice (once for the estimate, once for the actual backup) and
> imposes a non-negligible amount of load on the client.

That depends on the client's duties at that instant. I have backed up a
milling machine while it was running 90 lines of code that took 3 days
to finish while sharpening a saw blade, with no apparent interaction, on
a dual core atom powered box. One core was locked away for LCNC
(isolcpus at work), the other was free to do the backup client. Didn't
bother it a bit. :)

> * 'estimate calcsize' does something kind of in-between.  AIUI, it
> looks at some historical data, and also looks at the on-disk size of
> the data,

That would take time to access the dle's, and the answer is effectively 
instant, ergo it is not questioning the client(s), it has to be working 
only from the history in its own logs.

> then factors in compression ratios and such to give an 
> estimate that's usually reasonably accurate without needing the DLEs
> to be consistent or imposing significant load on the clients.

That compression is also in the logs, so calcsize can find that in a few 
milliseconds.
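
For anyone following along, here is a minimal sketch of how the three
estimate modes discussed above are selected per dumptype in amanda.conf.
The dumptype names are made up for illustration and the program line is
an assumption, not copied from anyone's actual config in this thread;
only the "estimate" lines are the point.

```
# Hypothetical dumptype names; only the "estimate" lines matter here.
define dumptype example-est-server {
    program "GNUTAR"
    estimate server      # works from history only, client is not asked
}

define dumptype example-est-client {
    program "GNUTAR"
    estimate client      # the backup program itself reports the size
}

define dumptype example-est-calcsize {
    program "GNUTAR"
    estimate calcsize    # the in-between mode described above
}
```

A DLE then picks up whichever mode its dumptype carries, so switching
modes for a test run should only need the one line changed.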


Copyright 2018 by Maurice E. Heskett
-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page <http://geneslinuxbox.net:6309/gene>
