Re: [gentoo-user] Genlop wonky again

2024-01-08 Thread Wols Lists

On 09/01/2024 03:35, Peter Humphrey wrote:
> On Sunday, 7 January 2024 08:34:15 GMT Wols Lists wrote:
>
> > Weird! I took a module on statistics in my Open University (Chemistry)
> > degree 40-odd years ago. Probably the same one? I've still got the
> > modules as a reference work, though I probably couldn't lay my hands on
> > them easily now ...
>
> Could have been the same. It was M100, the first version of their maths
> foundation course, in 1976.

Ah. So you predate me slightly. I took M101. But I also took the second
level statistics course, can't remember what it was ...


Cheers,
Wol



Re: [gentoo-user] Genlop wonky again

2024-01-08 Thread Peter Humphrey
On Sunday, 7 January 2024 08:34:15 GMT Wols Lists wrote:

> Weird! I took a module on statistics in my Open University (Chemistry)
> degree 40-odd years ago. Probably the same one? I've still got the
> modules as a reference work, though I probably couldn't lay my hands on
> them easily now ...

Could have been the same. It was M100, the first version of their maths 
foundation course, in 1976.

-- 
Regards,
Peter.






Re: [gentoo-user] Genlop wonky again

2024-01-07 Thread Wols Lists

On 07/01/2024 00:52, Peter Humphrey wrote:
> They seemed to say that the subject was founded on two
> basic principles; then they proceeded to define each of them in terms of the
> other.


I should add, I dug into this sort of stuff, and you do know the entire
edifice of Peano (i.e. number theory), thanks to Gödel, is built on the
premise that " true == false " :-) ?


Basically, no matter how hard you try, you cannot escape the Cretan Paradox.

To quote some famous mathematician - "If you define a religion as the 
irrational belief in the unprovable, then Mathematics is the only 
religion that can prove it is one".


That's why the Ancient Philosophers debated how many Angels can dance on 
the Head of a Pin. Set aside your prejudices, your beliefs that "that 
*must* be stupid", read Terry Pratchett's "The Science of Discworld", and
realise that it doesn't matter WHERE you start, the application of logic 
and reason will lead you down the Rabbit Hole into Wonderland.


And modern man is no better at avoiding that trap than the ancients.

Cheers,
Wol



Re: [gentoo-user] Genlop wonky again

2024-01-07 Thread Wols Lists

On 07/01/2024 00:52, Peter Humphrey wrote:
> On Saturday, 6 January 2024 19:28:05 GMT Wols Lists wrote:
>
> > Statistics is one of those areas where, if you don't know what you're
> > doing and you use the wrong maths, then you are going to get stupid results.
> >
> > "Statistics tell you how to get from A to B. What they don't tell you is
> > that you're all at C".
>
> I took a module on statistics in my Open University maths degree 40-odd years
> ago. I was bemused. They seemed to say that the subject was founded on two
> basic principles; then they proceeded to define each of them in terms of the
> other.

Weird! I took a module on statistics in my Open University (Chemistry)
degree 40-odd years ago. Probably the same one? I've still got the
modules as a reference work, though I probably couldn't lay my hands on
them easily now ...

> I'm still waiting for the entire edifice to come crashing down around our ears.
> :)

Nah - it's been abused for so long nobody's noticed it came down
centuries ago :-)


Cheers,
Wol



Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Peter Humphrey
On Saturday, 6 January 2024 19:28:05 GMT Wols Lists wrote:

> Statistics is one of those areas where, if you don't know what you're
> doing and you use the wrong maths, then you are going to get stupid results.
> 
> "Statistics tell you how to get from A to B. What they don't tell you is
> that you're all at C".

I took a module on statistics in my Open University maths degree 40-odd years 
ago. I was bemused. They seemed to say that the subject was founded on two 
basic principles; then they proceeded to define each of them in terms of the 
other.

I'm still waiting for the entire edifice to come crashing down around our ears.
:)

-- 
Regards,
Peter.






Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Wols Lists

On 06/01/2024 17:59, Peter Humphrey wrote:
> On Saturday, 6 January 2024 16:21:30 GMT Wols Lists wrote:
>
> > ... it's nothing to do with more power or whatever, it's down to simple
> > statistics. If genlop guesses the statistical spread wrongly, it's
> > going to mess up its estimates.
>
> Aren't you exaggerating genlop's complexity? I wasn't aware of any use of
> statistics in it, other than a simple arithmetic mean to estimate the time
> remaining. It certainly seems to do that, anyway.

Other than a simple arithmetic mean !!! Other than a simple arithmetic
mean !!!

If that's the case, you've just confirmed my statement - genlop is
almost certainly using the wrong statistics for the job !!!


If you take the average (arithmetic mean) of a heavily skewed distribution
(power-law or exponential decay), your results are going to be garbage.
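
A rough way to see the effect is to compare the mean and the median of a heavy-tailed sample. The sketch below is only an illustration under stated assumptions: the "build times" are made-up values generated from a Pareto (power-law) inverse CDF, the helper name pareto_quantiles and its parameters are hypothetical, and nothing here is taken from genlop.

```python
from statistics import mean, median

def pareto_quantiles(n, x_min=60.0, alpha=1.1):
    """Deterministic stand-in for n draws from a Pareto (power-law) distribution."""
    # Inverse CDF evaluated at the evenly spaced probabilities (i + 0.5) / n,
    # so the returned list is already sorted in ascending order.
    return [x_min / (1.0 - (i + 0.5) / n) ** (1.0 / alpha) for i in range(n)]

times = pareto_quantiles(1000)             # hypothetical "build times" in seconds
avg = mean(times)
mid = median(times)
top_share = sum(times[-10:]) / sum(times)  # share of total time in the longest 1%

print(f"mean   = {avg:8.1f} s")
print(f"median = {mid:8.1f} s")
print(f"share of total time in the longest 1% = {top_share:.0%}")
```

With a tail this heavy the mean sits well above the median because a handful of very long samples dominate the sum, so the mean is a poor guess for what a typical sample looks like.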


Statistics is one of those areas where, if you don't know what you're 
doing and you use the wrong maths, then you are going to get stupid results.


"Statistics tell you how to get from A to B. What they don't tell you is 
that you're all at C".


Cheers,
Wol



Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Peter Humphrey
On Saturday, 6 January 2024 16:26:49 GMT Daniel Pielmeier wrote:
> On 5 January 2024 23:51:39 UTC, Peter Humphrey wrote:
> >Hello list,
> >
> >I've just had some strange output from genlop on my 16-thread i5 box, thus:
> >
> ># genlop -t libreoffice | /bin/grep minute
> >
> >   merge time: 37 minutes and 38 seconds.
> >   merge time: 52 minutes and 59 seconds.
> >   merge time: 46 minutes and 17 seconds.
> >
> ># genlop -c
> >
> > Currently merging 11 out of 11
> > 
> > * app-office/libreoffice-7.5.9.2
> > 
> >   current merge time: 4 minutes and 3 seconds.
> >   ETA: 1 hour, 4 minutes and 24 seconds.
> >
> >### Then, once the update finished:
> >
> >#  genlop -t libreoffice | /bin/grep minute
> >
> >   merge time: 37 minutes and 38 seconds.
> >   merge time: 52 minutes and 59 seconds.
> >   merge time: 46 minutes and 17 seconds.
> >   merge time: 38 minutes and 40 seconds.
> >
> >I know genlop is, shall we say, not perfect, but how can it be so grossly
> >wrong as that?
> >
> >I have this in make.conf, and it hasn't changed since I built the machine:
> >
> >grep '\-j' /etc/portage/make.conf
> >EMERGE_DEFAULT_OPTS="--jobs --load-average=12
> >MAKEOPTS="-j12 -l12"
> 
> Are there, by chance, binary merges which took less than a minute? That
> might explain the differences.

That would skew the prediction downwards, not up.

> What is the output without the grep, or when filtering by merge time instead?

The same.

-- 
Regards,
Peter.






Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Peter Humphrey
On Saturday, 6 January 2024 16:21:30 GMT Wols Lists wrote:

> ... it's nothing to do with more power or whatever, it's down to simple
> statistics. If genlop guesses the statistical spread wrongly, it's
> going to mess up its estimates.

Aren't you exaggerating genlop's complexity? I wasn't aware of any use of 
statistics in it, other than a simple arithmetic mean to estimate the time 
remaining. It certainly seems to do that, anyway.

> If you have a double-peak distribution, with a large short-lived peak,
> and a small long-lived peak, you can get some weird results, especially
> if you have assumed a bell curve (almost always wrong) or an exponential
> decay (which is generally, NOT ALWAYS, a good choice).

I doubt it does any of that.

-- 
Regards,
Peter.






Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread gennaro amelio



Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Jack

On 1/6/24 11:21, Wols Lists wrote:
> On 06/01/2024 16:12, John Blinka wrote:
> > And it doesn’t actually take 2x longer - the new estimate is just
> > grossly wrong.
>
> I presume that the old estimate was also wrong.
>
> And it's nothing to do with more power or whatever, it's down to
> simple statistics. If genlop guesses the statistical spread wrongly,
> it's going to mess up its estimates.
>
> If you have a double-peak distribution, with a large short-lived peak,
> and a small long-lived peak, you can get some weird results,
> especially if you have assumed a bell curve (almost always wrong) or
> an exponential decay (which is generally, NOT ALWAYS, a good choice).
>
> Cheers,
> Wol

I think there is a slightly deeper question also involved. First, I'll
assume (safe or not) that genlop's estimate of total build time for a
package depends solely on the previous build times, with all the foibles
Wol implies in that.  However, that estimate then gets adjusted as the
build progresses.  Clearly, experience shows that the estimated
remaining time is NOT simply the estimated build time minus the time
spent so far, except possibly when an emerge is only for one package.
What else contributes to that estimate?  If that adjustment includes
using the number of other builds going on at the same time, and their
original and estimated build times, I can see lots of opportunity for
shenanigans.


Jack.




Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Daniel Pielmeier
On 5 January 2024 23:51:39 UTC, Peter Humphrey wrote:
>Hello list,
>
>I've just had some strange output from genlop on my 16-thread i5 box, thus:
>
># genlop -t libreoffice | /bin/grep minute
>   merge time: 37 minutes and 38 seconds.
>   merge time: 52 minutes and 59 seconds.
>   merge time: 46 minutes and 17 seconds.
>
># genlop -c
>
> Currently merging 11 out of 11
>
> * app-office/libreoffice-7.5.9.2
>
>   current merge time: 4 minutes and 3 seconds.
>   ETA: 1 hour, 4 minutes and 24 seconds.
>
>### Then, once the update finished:
>
>#  genlop -t libreoffice | /bin/grep minute
>   merge time: 37 minutes and 38 seconds.
>   merge time: 52 minutes and 59 seconds.
>   merge time: 46 minutes and 17 seconds.
>   merge time: 38 minutes and 40 seconds.
>
>I know genlop is, shall we say, not perfect, but how can it be so grossly 
>wrong as that?
>
>I have this in make.conf, and it hasn't changed since I built the machine:
>
>grep '\-j' /etc/portage/make.conf
>EMERGE_DEFAULT_OPTS="--jobs --load-average=12
>MAKEOPTS="-j12 -l12"
>

Are there, by chance, binary merges which took less than a minute? That might
explain the differences.
What is the output without the grep, or when filtering by merge time instead?

-- 
Best regards
Daniel



Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Wols Lists

On 06/01/2024 16:12, John Blinka wrote:
> And it doesn’t actually take 2x longer - the new estimate is just
> grossly wrong.

I presume that the old estimate was also wrong.

And it's nothing to do with more power or whatever, it's down to simple
statistics. If genlop guesses the statistical spread wrongly, it's
going to mess up its estimates.


If you have a double-peak distribution, with a large short-lived peak, 
and a small long-lived peak, you can get some weird results, especially 
if you have assumed a bell curve (almost always wrong) or an exponential 
decay (which is generally, NOT ALWAYS, a good choice).


Cheers,
Wol



Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread John Blinka
On Sat, Jan 6, 2024 at 3:56 AM Wols Lists wrote:

> On 06/01/2024 00:54, John Blinka wrote:
> > I’ve often found that it gives one estimate when multiple packages are
> > being built, then a much longer estimate for still-in-progress builds
> > once some of the builds have finished.
> >
> > That result defies common sense. Less remaining work has to take less,
> > not more (much more), time.
>
> Common sense isn't common and, well, often doesn't make sense.
>
> If there's a bunch of small builds skewing the "time per build" estimate
> down, as they drop off the list the estimated time per build will go up,
> and if the skew is serious enough it can even make the total estimated
> time go up ...


I don’t follow you. What is the source of this “skew”? Why should more
available processing power/less load cause builds to run more slowly? I’d
really like to understand your point.

I have observed what I reported above many times, often when there are 2
builds running, a long one and a shorter one. Once the shorter one ends,
the longer one’s time estimate via genlop increases, sometimes by 2x. And
it doesn’t actually take 2x longer - the new estimate is just grossly
wrong. Invoking skew or common sense being uncommon/wrong doesn’t change my
and the original poster’s observations that genlop sometimes gives really
bad time estimates. Something’s not right.

Respectfully

John



Re: [gentoo-user] Genlop wonky again

2024-01-06 Thread Wols Lists

On 06/01/2024 00:54, John Blinka wrote:
> I’ve often found that it gives one estimate when multiple packages are
> being built, then a much longer estimate for still-in-progress builds
> once some of the builds have finished.
>
> That result defies common sense. Less remaining work has to take less,
> not more (much more), time.


Common sense isn't common and, well, often doesn't make sense.

If there's a bunch of small builds skewing the "time per build" estimate 
down, as they drop off the list the estimated time per build will go up, 
and if the skew is serious enough it can even make the total estimated 
time go up ...
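
To make that mechanism concrete, here is a toy estimator that pools the expected durations of whatever builds are still running. It is a hypothetical model invented purely for illustration: the function pooled_eta, the build names and the durations are all assumptions, and there is no claim that genlop works this way.

```python
def pooled_eta(expected, elapsed):
    """Toy estimator (hypothetical, not genlop's): mean expected duration of the
    builds still running, minus the time elapsed so far, floored at zero."""
    pooled_mean = sum(expected.values()) / len(expected)
    return max(0.0, pooled_mean - elapsed)

# Made-up expected durations in seconds for two builds running side by side.
running = {"long-build": 3000.0, "short-build": 300.0}

eta_before = pooled_eta(running, elapsed=250.0)  # both builds still running
del running["short-build"]                       # the short build finishes and drops off
eta_after = pooled_eta(running, elapsed=300.0)   # only the long build remains

print(f"ETA with both builds running:   {eta_before:.0f} s")
print(f"ETA after the short build ends: {eta_after:.0f} s")
```

With these numbers the estimate goes from 1400 s to 2700 s once the short build drops out, even though less work remains. The short build was dragging the pooled average down, which is the kind of skew described above.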


Cheers,
Wol



Re: [gentoo-user] Genlop wonky again

2024-01-05 Thread John Blinka
On Fri, Jan 5, 2024 at 6:52 PM Peter Humphrey wrote:

> Hello list,
>
> I've just had some strange output from genlop on my 16-thread i5 box, thus:
>
> # genlop -t libreoffice | /bin/grep minute
>    merge time: 37 minutes and 38 seconds.
>    merge time: 52 minutes and 59 seconds.
>    merge time: 46 minutes and 17 seconds.
>
> # genlop -c
>
>  Currently merging 11 out of 11
>
>  * app-office/libreoffice-7.5.9.2
>
>    current merge time: 4 minutes and 3 seconds.
>    ETA: 1 hour, 4 minutes and 24 seconds.
>
> ### Then, once the update finished:
>
> #  genlop -t libreoffice | /bin/grep minute
>    merge time: 37 minutes and 38 seconds.
>    merge time: 52 minutes and 59 seconds.
>    merge time: 46 minutes and 17 seconds.
>    merge time: 38 minutes and 40 seconds.
>
> I know genlop is, shall we say, not perfect, but how can it be so grossly
> wrong as that?
>
> I have this in make.conf, and it hasn't changed since I built the machine:
>
> grep '\-j' /etc/portage/make.conf
> EMERGE_DEFAULT_OPTS="--jobs --load-average=12
> MAKEOPTS="-j12 -l12"
>
> --
> Regards,
> Peter.



I’ve often found that it gives one estimate when multiple packages are
being built, then a much longer estimate for still-in-progress builds once
some of the builds have finished.

That result defies common sense. Less remaining work has to take less, not
more (much more), time.

This observation tells me that the algorithm is very fundamentally broken.
The only way to answer how it can be so grossly wrong is to examine its
algorithm. That’s been on my to-do list for ages, but the thought of
debugging it has so far not risen to worth-the-effort status.

I use nearly the same build options as you, so perhaps we’re triggering the
same problem. But my less-work-implies-longer-time observations suggest to
me that the problem is more fundamental than details of jobs/threads/etc.

John Blinka



[gentoo-user] Genlop wonky again

2024-01-05 Thread Peter Humphrey
Hello list,

I've just had some strange output from genlop on my 16-thread i5 box, thus:

# genlop -t libreoffice | /bin/grep minute
   merge time: 37 minutes and 38 seconds.
   merge time: 52 minutes and 59 seconds.
   merge time: 46 minutes and 17 seconds.

# genlop -c

 Currently merging 11 out of 11

 * app-office/libreoffice-7.5.9.2

   current merge time: 4 minutes and 3 seconds.
   ETA: 1 hour, 4 minutes and 24 seconds.

### Then, once the update finished:

#  genlop -t libreoffice | /bin/grep minute
   merge time: 37 minutes and 38 seconds.
   merge time: 52 minutes and 59 seconds.
   merge time: 46 minutes and 17 seconds.
   merge time: 38 minutes and 40 seconds.

I know genlop is, shall we say, not perfect, but how can it be so grossly 
wrong as that?
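
For scale, the figures above can be put side by side with a few lines of arithmetic. This is only a sketch working from the posted output; the helper names to_seconds and fmt are made up for the example, and it says nothing about how genlop arrives at its ETA.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert an h/m/s reading, as printed by genlop, to seconds."""
    return hours * 3600 + minutes * 60 + seconds

def fmt(total):
    """Format a number of seconds as 'M min S s'."""
    minutes, seconds = divmod(round(total), 60)
    return f"{minutes} min {seconds} s"

# The three earlier libreoffice merge times from 'genlop -t'.
previous = [
    to_seconds(minutes=37, seconds=38),
    to_seconds(minutes=52, seconds=59),
    to_seconds(minutes=46, seconds=17),
]
mean_previous = sum(previous) / len(previous)

# Values reported by 'genlop -c' during the merge.
elapsed = to_seconds(minutes=4, seconds=3)
eta = to_seconds(hours=1, minutes=4, seconds=24)
implied_total = elapsed + eta

# The merge time recorded once the update finished.
actual = to_seconds(minutes=38, seconds=40)

print("mean of previous merges:      ", fmt(mean_previous))
print("implied total (elapsed + ETA):", fmt(implied_total))
print("implied total / mean:         ", implied_total / mean_previous)
print("actual merge time:            ", fmt(actual))
```

The mean of the three earlier merges is 45 min 38 s, while elapsed time plus ETA implies a total of 68 min 27 s, which is 1.5 times that mean. The merge actually took 38 min 40 s.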

I have this in make.conf, and it hasn't changed since I built the machine:

grep '\-j' /etc/portage/make.conf
EMERGE_DEFAULT_OPTS="--jobs --load-average=12
MAKEOPTS="-j12 -l12"

-- 
Regards,
Peter.