Re: [sqlite] Standard deviation last x entries

2019-10-20 Thread Bart Smissaert
This is getting far away from SQLite now and moving to medical statistics.
The thing is that the authors of this algorithm have taken a large amount
of patient
data and have calculated what the best predictors are to calculate the
chance for the
individual patient to have a heart attack or stroke in the next 10 years.
For blood pressure
they calculated that SD was the best one to get a figure for the
variability of systolic blood
pressure and they calculated that this SD was relevant.
All explained in the article mentioned in the mentioned link:
https://qrisk.org/three/

RBS

On Sun, Oct 20, 2019 at 2:04 PM Richard Damon 
wrote:

> In my experience max(x) - min(x) isn't a great measure for this sort of
> thing, as it is ultimately dependent on outliers. Something like
> 75%-tile to 25%-tile (or other similar values) might make more sense.
>
> Ultimately, taking a Standard Deviation (or Variance) of a set of
> reading for a set of persons seems a bit unusual as it is mixing two
> different statistics. One is that each person themselves have a mean and
> Standard Deviation to their readings, and then you have the statistics
> of the whole population, with things like the group mean (the mean of
> the individual means), the variation of the means ( the Standard
> Deviation of the Individual Means), the mean variation (the Mean of the
> individual Standard Deviations) and even the variability of the
> varibility (The Standard Deviation of the individual Standard deviations).
>
> Note that I would definitely treat the readings from an individual as a
> sample, unless the patients were being CONTINUALLY monitored, as
> periodic readings ARE samples (and presumably the patient is more
> important than the actual readings). I might also be tempted to treat
> your total group of patients as a sample, unless you think you are (and
> have) treating everyone that falls in the category.
>
> On 10/20/19 8:31 AM, Gabor Grothendieck wrote:
> > Another approach is to question whether you really need the
> > standard deviation or if any measure of variation would do.
> >
> > You could use the range, max(x) - min(x) , and just report it as the
> range
> > rather than the standard deviation.  Also range/4 approximates the
> > standard deviation (google the range rule)
> > although the approximation is not very good.
> >
> >
> >
> > On Fri, Oct 18, 2019 at 6:41 PM Gabor Grothendieck
> >  wrote:
> >> There is a stdev function for sqlite here:
> >> https://www.sqlite.org/contrib//download/extension-functions.c?get=25
> >>
> >> On Wed, Oct 16, 2019 at 7:57 PM Olaf Schmidt 
> wrote:
> >>> Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
>  Sorry, I forgot to tell that. It is date column with an integer
> number.
> 
>  ID xValue xDate
>  
>  1   130  40123
>  1   120  41232
>  1   140  40582
>  1   100  40888
>  1   110  42541
>  2   140  41225
>  2   130  41589
>  2   150  40872
> >>> Because SQLite on Android has at least CTEs available,
> >>> (since Android 5 or so), I thought I make an attempt using them...
> >>>
> >>> For your above sample-set I've created a table Test this way
> >>> (with slightly changed Column-Names + different IDs with "gaps"
> >>> + a few more sample-records, to cover especially n=1 sets):
> >>>
> >>> Create Table Test(ID, xVal, xDat);
> >>>
> >>> Insert Into Test Values(3, 130, 40123);
> >>> Insert Into Test Values(3, 120, 41232);
> >>> Insert Into Test Values(3, 140, 40582);
> >>> Insert Into Test Values(3, 100, 40888);
> >>> Insert Into Test Values(3, 110, 42541);
> >>>
> >>> Insert Into Test Values(5, 140, 41225);
> >>> Insert Into Test Values(5, 130, 41589);
> >>> Insert Into Test Values(5, 150, 40872);
> >>>
> >>> Insert Into Test Values(6, 110, 41225);
> >>> Insert Into Test Values(6, 115, 41227);
> >>>
> >>> Insert Into Test Values(9, 105, 41225);
> >>>
> >>> After the above Table-Creation + Inserts, one should
> >>> define a "Base-View" first, which is then able to select
> >>> appropriate aggregates (respecting your "max-4" condition):
> >>>
> >>> Create View vw_StdDev_Max4_Aggregates As
> >>> Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> >>> Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> >>> Limit 4)
> >>> Group By ID;
> >>>
> >>> Only this view above has to be adjusted, according to your
> >>> real table- and column-names, but the following two "derived views",
> >>> will work after these changes as well, as long as you leave the
> >>> above view-name - and the field-names: (ID, a1, a2, n) intact.
> >>>
> >>> So what remains, is two more view-defs (one for the population-StdDev -
> >>> and one with a few slight changes, to cover the sample-StdDev as well.
> >>>
> >>> Create View vw_StdDev_Samples As
> >>> Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
> >>>With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
> >>>  Select s2, 

Re: [sqlite] Standard deviation last x entries

2019-10-20 Thread Richard Damon
In my experience max(x) - min(x) isn't a great measure for this sort of
thing, as it is ultimately dependent on outliers. Something like
75%-tile to 25%-tile (or other similar values) might make more sense.

Ultimately, taking a Standard Deviation (or Variance) of a set of
reading for a set of persons seems a bit unusual as it is mixing two
different statistics. One is that each person themselves have a mean and
Standard Deviation to their readings, and then you have the statistics
of the whole population, with things like the group mean (the mean of
the individual means), the variation of the means ( the Standard
Deviation of the Individual Means), the mean variation (the Mean of the
individual Standard Deviations) and even the variability of the
varibility (The Standard Deviation of the individual Standard deviations).

Note that I would definitely treat the readings from an individual as a
sample, unless the patients were being CONTINUALLY monitored, as
periodic readings ARE samples (and presumably the patient is more
important than the actual readings). I might also be tempted to treat
your total group of patients as a sample, unless you think you are (and
have) treating everyone that falls in the category.

On 10/20/19 8:31 AM, Gabor Grothendieck wrote:
> Another approach is to question whether you really need the
> standard deviation or if any measure of variation would do.
>
> You could use the range, max(x) - min(x) , and just report it as the range
> rather than the standard deviation.  Also range/4 approximates the
> standard deviation (google the range rule)
> although the approximation is not very good.
>
>
>
> On Fri, Oct 18, 2019 at 6:41 PM Gabor Grothendieck
>  wrote:
>> There is a stdev function for sqlite here:
>> https://www.sqlite.org/contrib//download/extension-functions.c?get=25
>>
>> On Wed, Oct 16, 2019 at 7:57 PM Olaf Schmidt  wrote:
>>> Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
 Sorry, I forgot to tell that. It is date column with an integer number.

 ID xValue xDate
 
 1   130  40123
 1   120  41232
 1   140  40582
 1   100  40888
 1   110  42541
 2   140  41225
 2   130  41589
 2   150  40872
>>> Because SQLite on Android has at least CTEs available,
>>> (since Android 5 or so), I thought I make an attempt using them...
>>>
>>> For your above sample-set I've created a table Test this way
>>> (with slightly changed Column-Names + different IDs with "gaps"
>>> + a few more sample-records, to cover especially n=1 sets):
>>>
>>> Create Table Test(ID, xVal, xDat);
>>>
>>> Insert Into Test Values(3, 130, 40123);
>>> Insert Into Test Values(3, 120, 41232);
>>> Insert Into Test Values(3, 140, 40582);
>>> Insert Into Test Values(3, 100, 40888);
>>> Insert Into Test Values(3, 110, 42541);
>>>
>>> Insert Into Test Values(5, 140, 41225);
>>> Insert Into Test Values(5, 130, 41589);
>>> Insert Into Test Values(5, 150, 40872);
>>>
>>> Insert Into Test Values(6, 110, 41225);
>>> Insert Into Test Values(6, 115, 41227);
>>>
>>> Insert Into Test Values(9, 105, 41225);
>>>
>>> After the above Table-Creation + Inserts, one should
>>> define a "Base-View" first, which is then able to select
>>> appropriate aggregates (respecting your "max-4" condition):
>>>
>>> Create View vw_StdDev_Max4_Aggregates As
>>> Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
>>> Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
>>> Limit 4)
>>> Group By ID;
>>>
>>> Only this view above has to be adjusted, according to your
>>> real table- and column-names, but the following two "derived views",
>>> will work after these changes as well, as long as you leave the
>>> above view-name - and the field-names: (ID, a1, a2, n) intact.
>>>
>>> So what remains, is two more view-defs (one for the population-StdDev -
>>> and one with a few slight changes, to cover the sample-StdDev as well.
>>>
>>> Create View vw_StdDev_Samples As
>>> Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
>>>With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
>>>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
>>> Limit 32
>>>) Select s From r Order By i Desc Limit 1
>>> ) s From vw_StdDev_Max4_Aggregates Where n>1
>>> Union All
>>> Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
>>>
>>> Create View vw_StdDev_Populations As
>>> Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
>>>With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
>>>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
>>> Limit 32
>>>) Select s From r Order By i Desc Limit 1
>>> ) sigma From vw_StdDev_Max4_Aggregates Where n>1
>>> Union All
>>> Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
>>>
>>> 
>>> Ok, here the result I get on my test-table, when the 3 views
>>> are in place, and the following gets 

Re: [sqlite] Standard deviation last x entries

2019-10-20 Thread Bart Smissaert
Yes, could use something else than SD, but the mentioned calculator uses SD.
Got this all working fine now.

RBS

On Sun, 20 Oct 2019, 13:31 Gabor Grothendieck, 
wrote:

> Another approach is to question whether you really need the
> standard deviation or if any measure of variation would do.
>
> You could use the range, max(x) - min(x) , and just report it as the range
> rather than the standard deviation.  Also range/4 approximates the
> standard deviation (google the range rule)
> although the approximation is not very good.
>
>
>
> On Fri, Oct 18, 2019 at 6:41 PM Gabor Grothendieck
>  wrote:
> >
> > There is a stdev function for sqlite here:
> > https://www.sqlite.org/contrib//download/extension-functions.c?get=25
> >
> > On Wed, Oct 16, 2019 at 7:57 PM Olaf Schmidt 
> wrote:
> > >
> > > Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > > > Sorry, I forgot to tell that. It is date column with an integer
> number.
> > > >
> > > > ID xValue xDate
> > > > 
> > > > 1   130  40123
> > > > 1   120  41232
> > > > 1   140  40582
> > > > 1   100  40888
> > > > 1   110  42541
> > > > 2   140  41225
> > > > 2   130  41589
> > > > 2   150  40872
> > >
> > > Because SQLite on Android has at least CTEs available,
> > > (since Android 5 or so), I thought I make an attempt using them...
> > >
> > > For your above sample-set I've created a table Test this way
> > > (with slightly changed Column-Names + different IDs with "gaps"
> > > + a few more sample-records, to cover especially n=1 sets):
> > >
> > > Create Table Test(ID, xVal, xDat);
> > >
> > > Insert Into Test Values(3, 130, 40123);
> > > Insert Into Test Values(3, 120, 41232);
> > > Insert Into Test Values(3, 140, 40582);
> > > Insert Into Test Values(3, 100, 40888);
> > > Insert Into Test Values(3, 110, 42541);
> > >
> > > Insert Into Test Values(5, 140, 41225);
> > > Insert Into Test Values(5, 130, 41589);
> > > Insert Into Test Values(5, 150, 40872);
> > >
> > > Insert Into Test Values(6, 110, 41225);
> > > Insert Into Test Values(6, 115, 41227);
> > >
> > > Insert Into Test Values(9, 105, 41225);
> > >
> > > After the above Table-Creation + Inserts, one should
> > > define a "Base-View" first, which is then able to select
> > > appropriate aggregates (respecting your "max-4" condition):
> > >
> > > Create View vw_StdDev_Max4_Aggregates As
> > > Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> > > Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> > > Limit 4)
> > > Group By ID;
> > >
> > > Only this view above has to be adjusted, according to your
> > > real table- and column-names, but the following two "derived views",
> > > will work after these changes as well, as long as you leave the
> > > above view-name - and the field-names: (ID, a1, a2, n) intact.
> > >
> > > So what remains, is two more view-defs (one for the population-StdDev -
> > > and one with a few slight changes, to cover the sample-StdDev as well.
> > >
> > > Create View vw_StdDev_Samples As
> > > Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
> > >With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
> > >  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > > Limit 32
> > >) Select s From r Order By i Desc Limit 1
> > > ) s From vw_StdDev_Max4_Aggregates Where n>1
> > > Union All
> > > Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
> > >
> > > Create View vw_StdDev_Populations As
> > > Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
> > >With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
> > >  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > > Limit 32
> > >) Select s From r Order By i Desc Limit 1
> > > ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> > > Union All
> > > Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
> > >
> > > 
> > > Ok, here the result I get on my test-table, when the 3 views
> > > are in place, and the following gets executed:
> > >
> > > Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> > > ID  n   sigma
> > > 
> > > 3   4   14.7901994577491
> > > 5   3   8.16496580927733
> > > 6   2   2.5
> > > 9   1   0
> > >
> > >
> > > Select ID, n_SampleSize, s From vw_StdDev_Samples;
> > > ID  n   s
> > > 
> > > 3   4   17.0782512765993
> > > 5   3   10.1
> > > 6   2   3.53553390593274
> > > 9   1   null
> > >
> > > Please note the returned null (vs. the real 0) in the last result
> > > (done, because "unbiased estimates" are undefined for sample-size == 1)
> > >
> > > Finally for those interested, the CTE-based "iterated Square-Root",
> > > cut out for an isolated calculation of sqrt(2).
> > >
> > > With r(s2, s, i) As (Select 2, 1, 1 Union All
> > >Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > > Limit 32
> > > ) Select s 

Re: [sqlite] Standard deviation last x entries

2019-10-20 Thread Gabor Grothendieck
Another approach is to question whether you really need the
standard deviation or if any measure of variation would do.

You could use the range, max(x) - min(x) , and just report it as the range
rather than the standard deviation.  Also range/4 approximates the
standard deviation (google the range rule)
although the approximation is not very good.



On Fri, Oct 18, 2019 at 6:41 PM Gabor Grothendieck
 wrote:
>
> There is a stdev function for sqlite here:
> https://www.sqlite.org/contrib//download/extension-functions.c?get=25
>
> On Wed, Oct 16, 2019 at 7:57 PM Olaf Schmidt  wrote:
> >
> > Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > > Sorry, I forgot to tell that. It is date column with an integer number.
> > >
> > > ID xValue xDate
> > > 
> > > 1   130  40123
> > > 1   120  41232
> > > 1   140  40582
> > > 1   100  40888
> > > 1   110  42541
> > > 2   140  41225
> > > 2   130  41589
> > > 2   150  40872
> >
> > Because SQLite on Android has at least CTEs available,
> > (since Android 5 or so), I thought I make an attempt using them...
> >
> > For your above sample-set I've created a table Test this way
> > (with slightly changed Column-Names + different IDs with "gaps"
> > + a few more sample-records, to cover especially n=1 sets):
> >
> > Create Table Test(ID, xVal, xDat);
> >
> > Insert Into Test Values(3, 130, 40123);
> > Insert Into Test Values(3, 120, 41232);
> > Insert Into Test Values(3, 140, 40582);
> > Insert Into Test Values(3, 100, 40888);
> > Insert Into Test Values(3, 110, 42541);
> >
> > Insert Into Test Values(5, 140, 41225);
> > Insert Into Test Values(5, 130, 41589);
> > Insert Into Test Values(5, 150, 40872);
> >
> > Insert Into Test Values(6, 110, 41225);
> > Insert Into Test Values(6, 115, 41227);
> >
> > Insert Into Test Values(9, 105, 41225);
> >
> > After the above Table-Creation + Inserts, one should
> > define a "Base-View" first, which is then able to select
> > appropriate aggregates (respecting your "max-4" condition):
> >
> > Create View vw_StdDev_Max4_Aggregates As
> > Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> > Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> > Limit 4)
> > Group By ID;
> >
> > Only this view above has to be adjusted, according to your
> > real table- and column-names, but the following two "derived views",
> > will work after these changes as well, as long as you leave the
> > above view-name - and the field-names: (ID, a1, a2, n) intact.
> >
> > So what remains, is two more view-defs (one for the population-StdDev -
> > and one with a few slight changes, to cover the sample-StdDev as well.
> >
> > Create View vw_StdDev_Samples As
> > Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
> >With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
> >  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> >) Select s From r Order By i Desc Limit 1
> > ) s From vw_StdDev_Max4_Aggregates Where n>1
> > Union All
> > Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
> >
> > Create View vw_StdDev_Populations As
> > Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
> >With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
> >  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> >) Select s From r Order By i Desc Limit 1
> > ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> > Union All
> > Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
> >
> > 
> > Ok, here the result I get on my test-table, when the 3 views
> > are in place, and the following gets executed:
> >
> > Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> > ID  n   sigma
> > 
> > 3   4   14.7901994577491
> > 5   3   8.16496580927733
> > 6   2   2.5
> > 9   1   0
> >
> >
> > Select ID, n_SampleSize, s From vw_StdDev_Samples;
> > ID  n   s
> > 
> > 3   4   17.0782512765993
> > 5   3   10.1
> > 6   2   3.53553390593274
> > 9   1   null
> >
> > Please note the returned null (vs. the real 0) in the last result
> > (done, because "unbiased estimates" are undefined for sample-size == 1)
> >
> > Finally for those interested, the CTE-based "iterated Square-Root",
> > cut out for an isolated calculation of sqrt(2).
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select s From r Order By i Desc Limit 1
> >
> > The max. iterations are limited to 32 - though for input-values
> > in a "typical range", the iteration will usually exit earlier...
> > In case of the example above -> sqrt(2), the iteration-counter
> > i reached only 6 (as can be seen, when you include i alongside s
> > in the final Select.
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >Select s2, 

Re: [sqlite] Standard deviation last x entries

2019-10-20 Thread Bart Smissaert
Thanks for that and have tried this now (on Android app) and works fine.
It is fast as well, although slightly slower than the previous version.
I ran this on a column with 8000 values ranging from 0 to 1600 and this
took about 140 milli-seconds
on a fast Samsung S9 phone. Database is on a SD.

I think as square root it such a common thing it will be worth it to be
added to the standard SQLite functions.

RBS




On Sun, Oct 20, 2019 at 3:24 AM Keith Medcalf  wrote:

>
> Here is a recursive CTE that will calculate the square root to the best
> precision your processor math implementation is capable of.  It limits the
> recursion by prohibiting the insertion of duplicate guesses by using UNION
> rather than UNION ALL, which will cause pathological cases that oscillate
> to terminate.  It then scans those guesses looking for the one that is the
> "closest" approximation to the actual square root.  If you ask for the
> square root of a negative number, you get null, and if subnormal division
> gives a nan (NULL) that will also terminate the recursion.
>
> with guesses(findRootOf, guessRoot)
>   as (
>select ?1,
>   case when ?1 < 0 then null else ?1 / 2.0 end
>   union
>select findRootOf,
>   (guessRoot + findRootOf / guessRoot) / 2.0 as newGuessRoot
>  from guesses
> where guessRoot > 0.0
>  )
>   select guessRoot
> from guesses
> order by abs(guessRoot*guessRoot - findRootOf)
>limit 1;
>
> Note that the parameter is used twice.  Once to put in the table, and
> again to compute the first "guess" (which is really to prime the guess with
> NULL to prevent attempts to calculate the root of a negative number).
>
> guessRoot exactly matches the results of the sqrt function 76% of the
> time, and is within 1 ULP 100% of the time.
> guessRoot*guessRoot == findRootOf 48% of the time, within 1 ULP 94% of the
> time, and within 2 ULP 100% of the time.
> *based on 5,000,000 randomly generated positive real numbers
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-19 Thread Keith Medcalf

Here is a recursive CTE that will calculate the square root to the best 
precision your processor math implementation is capable of.  It limits the 
recursion by prohibiting the insertion of duplicate guesses by using UNION 
rather than UNION ALL, which will cause pathological cases that oscillate to 
terminate.  It then scans those guesses looking for the one that is the 
"closest" approximation to the actual square root.  If you ask for the square 
root of a negative number, you get null, and if subnormal division gives a nan 
(NULL) that will also terminate the recursion.

with guesses(findRootOf, guessRoot)
  as (
   select ?1,
  case when ?1 < 0 then null else ?1 / 2.0 end
  union
   select findRootOf,
  (guessRoot + findRootOf / guessRoot) / 2.0 as newGuessRoot
 from guesses
where guessRoot > 0.0
 )
  select guessRoot
from guesses
order by abs(guessRoot*guessRoot - findRootOf)
   limit 1;

Note that the parameter is used twice.  Once to put in the table, and again to 
compute the first "guess" (which is really to prime the guess with NULL to 
prevent attempts to calculate the root of a negative number).

guessRoot exactly matches the results of the sqrt function 76% of the time, and 
is within 1 ULP 100% of the time.
guessRoot*guessRoot == findRootOf 48% of the time, within 1 ULP 94% of the 
time, and within 2 ULP 100% of the time.
*based on 5,000,000 randomly generated positive real numbers

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume. 



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Keith Medcalf

Note that you might need to change the 2.22044604925031e-16 constant to 2 ULPs 
(4.44089209850063e-16) in order to get around some pathological rounding cases.

It probably will not be noticeable faster since you are only saving a few 
machine cycles per iteration.  It might be noticeable unless you are running 
the calculation thousands or millions of times. 

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Bart Smissaert
>Sent: Friday, 18 October, 2019 16:48
>To: SQLite mailing list 
>Subject: Re: [sqlite] Standard deviation last x entries
>
>> However, to optimize the calculation the following is more efficient
>
>I tested it, but didn't find it any faster.
>Naming makes it a lot clearer though.
>
>RBS
>
>On Fri, Oct 18, 2019 at 10:56 PM Keith Medcalf 
>wrote:
>
>>
>> We are calculating the square root using Newtons Method of successive
>> approximation
>>
>> https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/chap04/sqrt.html
>>
>> (x>0)*(y+x/y)/2 yi
>>
>> is computing the new guess of the square root of x based on the last
>guess
>> y and calling that calculation yi (the AS keyword is omitted).  The
>(x>0)
>> part is to make sure that x is always a positive number otherwise the
>new
>> guess will be 0.  This means that the attempt to find the square root
>of a
>> negative number (or 0) will always return 0 for each of two iterations
>thus
>> meeting the stop condition because abs(0 - 0) is less than any positive
>> number.
>>
>> To use a CTE with better named variables one would use:
>>
>> with guesses(findRootOf, guessRoot, iteration)
>>   as (
>>  select ?, 1.0, 1
>>   union all
>>  select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf
>/
>> guessRoot) / 2.0 as newGuess, iteration + 1
>>from guesses
>>   where abs(newGuess - guessRoot) > 1e-12
>>   limit 32
>>  )
>>   select guessRoot
>> from guesses
>> order by iteration desc
>>limit 1;
>>
>> However, to optimize the calculation the following is more efficient
>> (since it only calculates the new guess once each iteration, and
>computes
>> to the limit of precision).  We are substituting calculating the guess
>> twice and its difference from a constant (9 operations) with computing
>the
>> actual precision (6 operations):
>>
>> with guesses(findRootOf, guessRoot, iteration)
>>   as (
>>  select ?, 1.0, 1
>>   union all
>>  select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf
>/
>> guessRoot) / 2.0, iteration + 1
>>from guesses
>>   where abs(guessRoot*guessRoot - findRootOf) >
>> (2.22044604925031e-16 * FindRootOf)
>>   limit 32
>>  )
>>   select guessRoot
>> from guesses
>> order by iteration desc
>>limit 1;
>>
>> where 2.22044604925031e-16 is the epsilon of IEEE754 double precision
>> floating point numbers (the limit of computational precision in the
>last
>> binary place)
>>
>> --
>> The fact that there's a Highway to Hell but only a Stairway to Heaven
>says
>> a lot about anticipated traffic volume.
>>
>> >-Original Message-
>> >From: sqlite-users  On
>> >Behalf Of Bart Smissaert
>> >Sent: Friday, 18 October, 2019 14:55
>> >To: SQLite mailing list 
>> >Subject: Re: [sqlite] Standard deviation last x entries
>> >
>> >Hi Olaf,
>> >
>> >Could you tell me what this is doing:
>> >
>> >,(x>0)*(y+x/y)/2 yi
>> >
>> >Especially the yi after the 2
>> >How does the yi relate to the preceding bit?
>> >
>> >RBS
>> >
>> >
>> >
>> >On Fri, Oct 18, 2019 at 8:26 PM Olaf Schmidt 
>wrote:
>> >
>> >> Am 18.10.2019 um 19:45 schrieb Bart Smissaert:
>> >>
>> >> > Regarding:
>> >> >
>> >> > With r(s2, s, i) As (Select 2, 1, 1 Union All
>> >> > Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-
>s*s)>1e-12
>> >> > Limit 32
>> >> > ) Select s From r Order By i Desc Limit 1
>> >> >
>> >> > How would this work if I wanted to update all the values in a
>table
>> >> column
>> >> > to have the square root?
>> >>
>&g

Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Bart Smissaert
> However, to optimize the calculation the following is more efficient

I tested it, but didn't find it any faster.
Naming makes it a lot clearer though.

RBS

On Fri, Oct 18, 2019 at 10:56 PM Keith Medcalf  wrote:

>
> We are calculating the square root using Newtons Method of successive
> approximation
>
> https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/chap04/sqrt.html
>
> (x>0)*(y+x/y)/2 yi
>
> is computing the new guess of the square root of x based on the last guess
> y and calling that calculation yi (the AS keyword is omitted).  The (x>0)
> part is to make sure that x is always a positive number otherwise the new
> guess will be 0.  This means that the attempt to find the square root of a
> negative number (or 0) will always return 0 for each of two iterations thus
> meeting the stop condition because abs(0 - 0) is less than any positive
> number.
>
> To use a CTE with better named variables one would use:
>
> with guesses(findRootOf, guessRoot, iteration)
>   as (
>  select ?, 1.0, 1
>   union all
>  select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf /
> guessRoot) / 2.0 as newGuess, iteration + 1
>from guesses
>   where abs(newGuess - guessRoot) > 1e-12
>   limit 32
>  )
>   select guessRoot
> from guesses
> order by iteration desc
>limit 1;
>
> However, to optimize the calculation the following is more efficient
> (since it only calculates the new guess once each iteration, and computes
> to the limit of precision).  We are substituting calculating the guess
> twice and its difference from a constant (9 operations) with computing the
> actual precision (6 operations):
>
> with guesses(findRootOf, guessRoot, iteration)
>   as (
>  select ?, 1.0, 1
>   union all
>  select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf /
> guessRoot) / 2.0, iteration + 1
>from guesses
>   where abs(guessRoot*guessRoot - findRootOf) >
> (2.22044604925031e-16 * FindRootOf)
>   limit 32
>  )
>   select guessRoot
> from guesses
> order by iteration desc
>limit 1;
>
> where 2.22044604925031e-16 is the epsilon of IEEE754 double precision
> floating point numbers (the limit of computational precision in the last
> binary place)
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
> >-Original Message-
> >From: sqlite-users  On
> >Behalf Of Bart Smissaert
> >Sent: Friday, 18 October, 2019 14:55
> >To: SQLite mailing list 
> >Subject: Re: [sqlite] Standard deviation last x entries
> >
> >Hi Olaf,
> >
> >Could you tell me what this is doing:
> >
> >,(x>0)*(y+x/y)/2 yi
> >
> >Especially the yi after the 2
> >How does the yi relate to the preceding bit?
> >
> >RBS
> >
> >
> >
> >On Fri, Oct 18, 2019 at 8:26 PM Olaf Schmidt  wrote:
> >
> >> Am 18.10.2019 um 19:45 schrieb Bart Smissaert:
> >>
> >> > Regarding:
> >> >
> >> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >> > Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> >> > Limit 32
> >> > ) Select s From r Order By i Desc Limit 1
> >> >
> >> > How would this work if I wanted to update all the values in a table
> >> column
> >> > to have the square root?
> >>
> >> Well, as with any other (single-value-returning) Sub-Select
> >> (which was enclosed in parentheses)...
> >>
> >> E.g.
> >>   "Update MyTable Set MyCol = (Select 1)"
> >> would update MyCol with the value 1 across the whole table...
> >>
> >> Same thing basically (only "a bit larger") with a CTE-based
> >> "Single-Value-SubSelect"...
> >>
> >> BTW, I've updated and tuned the thing a bit (which should now
> >> offer more precision, and ~20% more performance as well):
> >>
> >> With  r (x, y, i) As (
> >>Select ?, 1, 1  Union All
> >>Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit
> >32
> >> ) Select y From r Order By i Desc Limit 1
> >>
> >> Note the question-marked "Parameter" for the "squared Input-Value"
> >> (in the first Select Statement of the CTEs triple).
> >>
> >> Integrated into an Update-Query it could look this way:
> >>
> >> Update MyTable Set MySquareRoot = ( -- S

Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Bart Smissaert
Thanks, I did see that, but can't use this on Android.

RBS

On Fri, Oct 18, 2019 at 11:41 PM Gabor Grothendieck 
wrote:

> There is a stdev function for sqlite here:
> https://www.sqlite.org/contrib//download/extension-functions.c?get=25
>
> On Wed, Oct 16, 2019 at 7:57 PM Olaf Schmidt  wrote:
> >
> > Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > > Sorry, I forgot to tell that. It is date column with an integer number.
> > >
> > > ID xValue xDate
> > > 
> > > 1   130  40123
> > > 1   120  41232
> > > 1   140  40582
> > > 1   100  40888
> > > 1   110  42541
> > > 2   140  41225
> > > 2   130  41589
> > > 2   150  40872
> >
> > Because SQLite on Android has at least CTEs available,
> > (since Android 5 or so), I thought I make an attempt using them...
> >
> > For your above sample-set I've created a table Test this way
> > (with slightly changed Column-Names + different IDs with "gaps"
> > + a few more sample-records, to cover especially n=1 sets):
> >
> > Create Table Test(ID, xVal, xDat);
> >
> > Insert Into Test Values(3, 130, 40123);
> > Insert Into Test Values(3, 120, 41232);
> > Insert Into Test Values(3, 140, 40582);
> > Insert Into Test Values(3, 100, 40888);
> > Insert Into Test Values(3, 110, 42541);
> >
> > Insert Into Test Values(5, 140, 41225);
> > Insert Into Test Values(5, 130, 41589);
> > Insert Into Test Values(5, 150, 40872);
> >
> > Insert Into Test Values(6, 110, 41225);
> > Insert Into Test Values(6, 115, 41227);
> >
> > Insert Into Test Values(9, 105, 41225);
> >
> > After the above Table-Creation + Inserts, one should
> > define a "Base-View" first, which is then able to select
> > appropriate aggregates (respecting your "max-4" condition):
> >
> > Create View vw_StdDev_Max4_Aggregates As
> > Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> > Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> > Limit 4)
> > Group By ID;
> >
> > Only this view above has to be adjusted, according to your
> > real table- and column-names, but the following two "derived views",
> > will work after these changes as well, as long as you leave the
> > above view-name - and the field-names: (ID, a1, a2, n) intact.
> >
> > So what remains, is two more view-defs (one for the population-StdDev -
> > and one with a few slight changes, to cover the sample-StdDev as well.
> >
> > Create View vw_StdDev_Samples As
> > Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
> >With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
> >  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> >) Select s From r Order By i Desc Limit 1
> > ) s From vw_StdDev_Max4_Aggregates Where n>1
> > Union All
> > Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
> >
> > Create View vw_StdDev_Populations As
> > Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
> >With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
> >  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> >) Select s From r Order By i Desc Limit 1
> > ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> > Union All
> > Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
> >
> > 
> > Ok, here the result I get on my test-table, when the 3 views
> > are in place, and the following gets executed:
> >
> > Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> > ID  n   sigma
> > 
> > 3   4   14.7901994577491
> > 5   3   8.16496580927733
> > 6   2   2.5
> > 9   1   0
> >
> >
> > Select ID, n_SampleSize, s From vw_StdDev_Samples;
> > ID  n   s
> > 
> > 3   4   17.0782512765993
> > 5   3   10.1
> > 6   2   3.53553390593274
> > 9   1   null
> >
> > Please note the returned null (vs. the real 0) in the last result
> > (done, because "unbiased estimates" are undefined for sample-size == 1)
> >
> > Finally for those interested, the CTE-based "iterated Square-Root",
> > cut out for an isolated calculation of sqrt(2).
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select s From r Order By i Desc Limit 1
> >
> > The max. iterations are limited to 32 - though for input-values
> > in a "typical range", the iteration will usually exit earlier...
> > In case of the example above -> sqrt(2), the iteration-counter
> > i reached only 6 (as can be seen, when you include i alongside s
> > in the final Select.
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select i, s From r Order By i Desc Limit 1
> >
> > HTH
> >
> > Olaf
> >
> >
> >
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > 

Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Gabor Grothendieck
There is a stdev function for sqlite here:
https://www.sqlite.org/contrib//download/extension-functions.c?get=25

On Wed, Oct 16, 2019 at 7:57 PM Olaf Schmidt  wrote:
>
> Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > Sorry, I forgot to tell that. It is date column with an integer number.
> >
> > ID xValue xDate
> > 
> > 1   130  40123
> > 1   120  41232
> > 1   140  40582
> > 1   100  40888
> > 1   110  42541
> > 2   140  41225
> > 2   130  41589
> > 2   150  40872
>
> Because SQLite on Android has at least CTEs available,
> (since Android 5 or so), I thought I make an attempt using them...
>
> For your above sample-set I've created a table Test this way
> (with slightly changed Column-Names + different IDs with "gaps"
> + a few more sample-records, to cover especially n=1 sets):
>
> Create Table Test(ID, xVal, xDat);
>
> Insert Into Test Values(3, 130, 40123);
> Insert Into Test Values(3, 120, 41232);
> Insert Into Test Values(3, 140, 40582);
> Insert Into Test Values(3, 100, 40888);
> Insert Into Test Values(3, 110, 42541);
>
> Insert Into Test Values(5, 140, 41225);
> Insert Into Test Values(5, 130, 41589);
> Insert Into Test Values(5, 150, 40872);
>
> Insert Into Test Values(6, 110, 41225);
> Insert Into Test Values(6, 115, 41227);
>
> Insert Into Test Values(9, 105, 41225);
>
> After the above Table-Creation + Inserts, one should
> define a "Base-View" first, which is then able to select
> appropriate aggregates (respecting your "max-4" condition):
>
> Create View vw_StdDev_Max4_Aggregates As
> Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> Limit 4)
> Group By ID;
>
> Only this view above has to be adjusted, according to your
> real table- and column-names, but the following two "derived views",
> will work after these changes as well, as long as you leave the
> above view-name - and the field-names: (ID, a1, a2, n) intact.
>
> So what remains, is two more view-defs (one for the population-StdDev -
> and one with a few slight changes, to cover the sample-StdDev as well.
>
> Create View vw_StdDev_Samples As
> Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
>With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) s From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
>
> Create View vw_StdDev_Populations As
> Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
>With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
>
> 
> Ok, here the result I get on my test-table, when the 3 views
> are in place, and the following gets executed:
>
> Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> ID  n   sigma
> 
> 3   4   14.7901994577491
> 5   3   8.16496580927733
> 6   2   2.5
> 9   1   0
>
>
> Select ID, n_SampleSize, s From vw_StdDev_Samples;
> ID  n   s
> 
> 3   4   17.0782512765993
> 5   3   10.1
> 6   2   3.53553390593274
> 9   1   null
>
> Please note the returned null (vs. the real 0) in the last result
> (done, because "unbiased estimates" are undefined for sample-size == 1)
>
> Finally for those interested, the CTE-based "iterated Square-Root",
> cut out for an isolated calculation of sqrt(2).
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select s From r Order By i Desc Limit 1
>
> The max. iterations are limited to 32 - though for input-values
> in a "typical range", the iteration will usually exit earlier...
> In case of the example above -> sqrt(2), the iteration-counter
> i reached only 6 (as can be seen, when you include i alongside s
> in the final Select.
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select i, s From r Order By i Desc Limit 1
>
> HTH
>
> Olaf
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Bart Smissaert
> and calling that calculation yi (the AS keyword is omitted).

Ah, indeed, simple!

Thanks for that explanation.
Will study this and try it out.

RBS

On Fri, Oct 18, 2019 at 10:56 PM Keith Medcalf  wrote:

>
> We are calculating the square root using Newtons Method of successive
> approximation
>
> https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/chap04/sqrt.html
>
> (x>0)*(y+x/y)/2 yi
>
> is computing the new guess of the square root of x based on the last guess
> y and calling that calculation yi (the AS keyword is omitted).  The (x>0)
> part is to make sure that x is always a positive number otherwise the new
> guess will be 0.  This means that the attempt to find the square root of a
> negative number (or 0) will always return 0 for each of two iterations thus
> meeting the stop condition because abs(0 - 0) is less than any positive
> number.
>
> To use a CTE with better named variables one would use:
>
> with guesses(findRootOf, guessRoot, iteration)
>   as (
>  select ?, 1.0, 1
>   union all
>  select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf /
> guessRoot) / 2.0 as newGuess, iteration + 1
>from guesses
>   where abs(newGuess - guessRoot) > 1e-12
>   limit 32
>  )
>   select guessRoot
> from guesses
> order by iteration desc
>limit 1;
>
> However, to optimize the calculation the following is more efficient
> (since it only calculates the new guess once each iteration, and computes
> to the limit of precision).  We are substituting calculating the guess
> twice and its difference from a constant (9 operations) with computing the
> actual precision (6 operations):
>
> with guesses(findRootOf, guessRoot, iteration)
>   as (
>  select ?, 1.0, 1
>   union all
>  select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf /
> guessRoot) / 2.0, iteration + 1
>from guesses
>   where abs(guessRoot*guessRoot - findRootOf) >
> (2.22044604925031e-16 * FindRootOf)
>   limit 32
>  )
>   select guessRoot
> from guesses
> order by iteration desc
>limit 1;
>
> where 2.22044604925031e-16 is the epsilon of IEEE754 double precision
> floating point numbers (the limit of computational precision in the last
> binary place)
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
> >-Original Message-
> >From: sqlite-users  On
> >Behalf Of Bart Smissaert
> >Sent: Friday, 18 October, 2019 14:55
> >To: SQLite mailing list 
> >Subject: Re: [sqlite] Standard deviation last x entries
> >
> >Hi Olaf,
> >
> >Could you tell me what this is doing:
> >
> >,(x>0)*(y+x/y)/2 yi
> >
> >Especially the yi after the 2
> >How does the yi relate to the preceding bit?
> >
> >RBS
> >
> >
> >
> >On Fri, Oct 18, 2019 at 8:26 PM Olaf Schmidt  wrote:
> >
> >> Am 18.10.2019 um 19:45 schrieb Bart Smissaert:
> >>
> >> > Regarding:
> >> >
> >> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >> > Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> >> > Limit 32
> >> > ) Select s From r Order By i Desc Limit 1
> >> >
> >> > How would this work if I wanted to update all the values in a table
> >> column
> >> > to have the square root?
> >>
> >> Well, as with any other (single-value-returning) Sub-Select
> >> (which was enclosed in parentheses)...
> >>
> >> E.g.
> >>   "Update MyTable Set MyCol = (Select 1)"
> >> would update MyCol with the value 1 across the whole table...
> >>
> >> Same thing basically (only "a bit larger") with a CTE-based
> >> "Single-Value-SubSelect"...
> >>
> >> BTW, I've updated and tuned the thing a bit (which should now
> >> offer more precision, and ~20% more performance as well):
> >>
> >> With  r (x, y, i) As (
> >>Select ?, 1, 1  Union All
> >>Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit
> >32
> >> ) Select y From r Order By i Desc Limit 1
> >>
> >> Note the question-marked "Parameter" for the "squared Input-Value"
> >> (in the first Select Statement of the CTEs triple).
> >>
> >> Integrated into an Update-Query it could look this way:
> >>
> >> Update MyTable Set MySquareRoot = ( -- SubSelect-Open

Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Keith Medcalf

We are calculating the square root using Newtons Method of successive 
approximation

https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/chap04/sqrt.html

(x>0)*(y+x/y)/2 yi  

is computing the new guess of the square root of x based on the last guess y 
and calling that calculation yi (the AS keyword is omitted).  The (x>0) part is 
to make sure that x is always a positive number otherwise the new guess will be 
0.  This means that the attempt to find the square root of a negative number 
(or 0) will always return 0 for each of two iterations thus meeting the stop 
condition because abs(0 - 0) is less than any positive number.  

To use a CTE with better named variables one would use:

with guesses(findRootOf, guessRoot, iteration)
  as (
 select ?, 1.0, 1
  union all
 select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf / 
guessRoot) / 2.0 as newGuess, iteration + 1
   from guesses
  where abs(newGuess - guessRoot) > 1e-12
  limit 32
 )
  select guessRoot
from guesses
order by iteration desc
   limit 1;

However, to optimize the calculation the following is more efficient (since it 
only calculates the new guess once each iteration, and computes to the limit of 
precision).  We are substituting calculating the guess twice and its difference 
from a constant (9 operations) with computing the actual precision (6 
operations):

with guesses(findRootOf, guessRoot, iteration)
  as (
 select ?, 1.0, 1
  union all
 select findRootOf, (findRootOf > 0) * (guessRoot + findRootOf / 
guessRoot) / 2.0, iteration + 1
   from guesses
  where abs(guessRoot*guessRoot - findRootOf) > (2.22044604925031e-16 * 
FindRootOf)
  limit 32
 )
  select guessRoot
from guesses
order by iteration desc
   limit 1;

where 2.22044604925031e-16 is the epsilon of IEEE754 double precision floating 
point numbers (the limit of computational precision in the last binary place)

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Bart Smissaert
>Sent: Friday, 18 October, 2019 14:55
>To: SQLite mailing list 
>Subject: Re: [sqlite] Standard deviation last x entries
>
>Hi Olaf,
>
>Could you tell me what this is doing:
>
>,(x>0)*(y+x/y)/2 yi
>
>Especially the yi after the 2
>How does the yi relate to the preceding bit?
>
>RBS
>
>
>
>On Fri, Oct 18, 2019 at 8:26 PM Olaf Schmidt  wrote:
>
>> Am 18.10.2019 um 19:45 schrieb Bart Smissaert:
>>
>> > Regarding:
>> >
>> > With r(s2, s, i) As (Select 2, 1, 1 Union All
>> > Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
>> > Limit 32
>> > ) Select s From r Order By i Desc Limit 1
>> >
>> > How would this work if I wanted to update all the values in a table
>> column
>> > to have the square root?
>>
>> Well, as with any other (single-value-returning) Sub-Select
>> (which was enclosed in parentheses)...
>>
>> E.g.
>>   "Update MyTable Set MyCol = (Select 1)"
>> would update MyCol with the value 1 across the whole table...
>>
>> Same thing basically (only "a bit larger") with a CTE-based
>> "Single-Value-SubSelect"...
>>
>> BTW, I've updated and tuned the thing a bit (which should now
>> offer more precision, and ~20% more performance as well):
>>
>> With  r (x, y, i) As (
>>Select ?, 1, 1  Union All
>>Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit
>32
>> ) Select y From r Order By i Desc Limit 1
>>
>> Note the question-marked "Parameter" for the "squared Input-Value"
>> (in the first Select Statement of the CTEs triple).
>>
>> Integrated into an Update-Query it could look this way:
>>
>> Update MyTable Set MySquareRoot = ( -- SubSelect-Opening-Paren...
>>   With  r (x, y, i) As (
>> Select MySquaredValue, 1, 1  Union All
>> Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit
>32
>>   ) Select y From r Order By i Desc Limit 1
>> ) -- SubSelect-Closing-Paren...
>>
>> HTH
>>
>> Olaf
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Bart Smissaert
Hi Olaf,

Could you tell me what this is doing:

,(x>0)*(y+x/y)/2 yi

Especially the yi after the 2
How does the yi relate to the preceding bit?

RBS



On Fri, Oct 18, 2019 at 8:26 PM Olaf Schmidt  wrote:

> Am 18.10.2019 um 19:45 schrieb Bart Smissaert:
>
> > Regarding:
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> > Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select s From r Order By i Desc Limit 1
> >
> > How would this work if I wanted to update all the values in a table
> column
> > to have the square root?
>
> Well, as with any other (single-value-returning) Sub-Select
> (which was enclosed in parentheses)...
>
> E.g.
>   "Update MyTable Set MyCol = (Select 1)"
> would update MyCol with the value 1 across the whole table...
>
> Same thing basically (only "a bit larger") with a CTE-based
> "Single-Value-SubSelect"...
>
> BTW, I've updated and tuned the thing a bit (which should now
> offer more precision, and ~20% more performance as well):
>
> With  r (x, y, i) As (
>Select ?, 1, 1  Union All
>Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
> ) Select y From r Order By i Desc Limit 1
>
> Note the question-marked "Parameter" for the "squared Input-Value"
> (in the first Select Statement of the CTEs triple).
>
> Integrated into an Update-Query it could look this way:
>
> Update MyTable Set MySquareRoot = ( -- SubSelect-Opening-Paren...
>   With  r (x, y, i) As (
> Select MySquaredValue, 1, 1  Union All
> Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
>   ) Select y From r Order By i Desc Limit 1
> ) -- SubSelect-Closing-Paren...
>
> HTH
>
> Olaf
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Bart Smissaert
Have tested this and CTE pure SQL method is a lot faster (about 4 x) than
my previous method in code:

strSQL = "SELECT ID, SD_BP FROM QR3PARAMS"
RS1 = cConn.SQL1.ExecQuery(strSQL)
strSQL = "UPDATE QR3PARAMS SET SD_BP = ? WHERE ID = ?"
cConn.BeginTransaction
Do While RS1.NextRow
cConn.SQL1.ExecNonQuery2(strSQL, Array As String(Sqrt(RS1.GetDouble2(1)),
RS1.GetInt2(0)))
Loop
cConn.EndTransaction

This is B4A code on Android with the SQLCipher driver.

RBS


On Fri, Oct 18, 2019 at 8:26 PM Olaf Schmidt  wrote:

> Am 18.10.2019 um 19:45 schrieb Bart Smissaert:
>
> > Regarding:
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> > Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select s From r Order By i Desc Limit 1
> >
> > How would this work if I wanted to update all the values in a table
> column
> > to have the square root?
>
> Well, as with any other (single-value-returning) Sub-Select
> (which was enclosed in parentheses)...
>
> E.g.
>   "Update MyTable Set MyCol = (Select 1)"
> would update MyCol with the value 1 across the whole table...
>
> Same thing basically (only "a bit larger") with a CTE-based
> "Single-Value-SubSelect"...
>
> BTW, I've updated and tuned the thing a bit (which should now
> offer more precision, and ~20% more performance as well):
>
> With  r (x, y, i) As (
>Select ?, 1, 1  Union All
>Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
> ) Select y From r Order By i Desc Limit 1
>
> Note the question-marked "Parameter" for the "squared Input-Value"
> (in the first Select Statement of the CTEs triple).
>
> Integrated into an Update-Query it could look this way:
>
> Update MyTable Set MySquareRoot = ( -- SubSelect-Opening-Paren...
>   With  r (x, y, i) As (
> Select MySquaredValue, 1, 1  Union All
> Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
>   ) Select y From r Order By i Desc Limit 1
> ) -- SubSelect-Closing-Paren...
>
> HTH
>
> Olaf
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Bart Smissaert
Hi Olaf,

OK, thanks, that works fine indeed.

> Well, as with any other (single-value-returning) Sub-Select
It is just that saw SQL's where the update part came after the CTE, eg:
https://stackoverflow.com/questions/27637278/how-to-remove-duplicate-rows-with-cte-when-partitioning-by-another-tables-colum

WITH cte AS
( SELECT ROW_NUMBER() OVER
(PARTITION BY GroupId ORDER BY Created) AS Rnk FROM
  @Item AS i JOIN @ItemType AS it ON i.ItemTypeId = it.Id )
DELETE FROM cte WHERE Rnk > 1;

RBS


On Fri, Oct 18, 2019 at 8:26 PM Olaf Schmidt  wrote:

> Am 18.10.2019 um 19:45 schrieb Bart Smissaert:
>
> > Regarding:
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> > Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select s From r Order By i Desc Limit 1
> >
> > How would this work if I wanted to update all the values in a table
> column
> > to have the square root?
>
> Well, as with any other (single-value-returning) Sub-Select
> (which was enclosed in parentheses)...
>
> E.g.
>   "Update MyTable Set MyCol = (Select 1)"
> would update MyCol with the value 1 across the whole table...
>
> Same thing basically (only "a bit larger") with a CTE-based
> "Single-Value-SubSelect"...
>
> BTW, I've updated and tuned the thing a bit (which should now
> offer more precision, and ~20% more performance as well):
>
> With  r (x, y, i) As (
>Select ?, 1, 1  Union All
>Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
> ) Select y From r Order By i Desc Limit 1
>
> Note the question-marked "Parameter" for the "squared Input-Value"
> (in the first Select Statement of the CTEs triple).
>
> Integrated into an Update-Query it could look this way:
>
> Update MyTable Set MySquareRoot = ( -- SubSelect-Opening-Paren...
>   With  r (x, y, i) As (
> Select MySquaredValue, 1, 1  Union All
> Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
>   ) Select y From r Order By i Desc Limit 1
> ) -- SubSelect-Closing-Paren...
>
> HTH
>
> Olaf
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Olaf Schmidt

Am 18.10.2019 um 19:45 schrieb Bart Smissaert:


Regarding:

With r(s2, s, i) As (Select 2, 1, 1 Union All
Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
Limit 32
) Select s From r Order By i Desc Limit 1

How would this work if I wanted to update all the values in a table column
to have the square root?


Well, as with any other (single-value-returning) Sub-Select
(which was enclosed in parentheses)...

E.g.
 "Update MyTable Set MyCol = (Select 1)"
would update MyCol with the value 1 across the whole table...

Same thing basically (only "a bit larger") with a CTE-based
"Single-Value-SubSelect"...

BTW, I've updated and tuned the thing a bit (which should now
offer more precision, and ~20% more performance as well):

With  r (x, y, i) As (
  Select ?, 1, 1  Union All
  Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
) Select y From r Order By i Desc Limit 1

Note the question-marked "Parameter" for the "squared Input-Value"
(in the first Select Statement of the CTEs triple).

Integrated into an Update-Query it could look this way:

Update MyTable Set MySquareRoot = ( -- SubSelect-Opening-Paren...
 With  r (x, y, i) As (
   Select MySquaredValue, 1, 1  Union All
   Select x,(x>0)*(y+x/y)/2 yi,i+1 From r Where Abs(yi-y)>1e-12 Limit 32
 ) Select y From r Order By i Desc Limit 1
) -- SubSelect-Closing-Paren...

HTH

Olaf

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-18 Thread Bart Smissaert
Hi Olaf,

Regarding:

With r(s2, s, i) As (Select 2, 1, 1 Union All
   Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
Limit 32
) Select s From r Order By i Desc Limit 1

How would this work if I wanted to update all the values in a table column
to have the square root?

RBS

On Thu, Oct 17, 2019 at 12:56 AM Olaf Schmidt  wrote:

> Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > Sorry, I forgot to tell that. It is date column with an integer number.
> >
> > ID xValue xDate
> > 
> > 1   130  40123
> > 1   120  41232
> > 1   140  40582
> > 1   100  40888
> > 1   110  42541
> > 2   140  41225
> > 2   130  41589
> > 2   150  40872
>
> Because SQLite on Android has at least CTEs available,
> (since Android 5 or so), I thought I make an attempt using them...
>
> For your above sample-set I've created a table Test this way
> (with slightly changed Column-Names + different IDs with "gaps"
> + a few more sample-records, to cover especially n=1 sets):
>
> Create Table Test(ID, xVal, xDat);
>
> Insert Into Test Values(3, 130, 40123);
> Insert Into Test Values(3, 120, 41232);
> Insert Into Test Values(3, 140, 40582);
> Insert Into Test Values(3, 100, 40888);
> Insert Into Test Values(3, 110, 42541);
>
> Insert Into Test Values(5, 140, 41225);
> Insert Into Test Values(5, 130, 41589);
> Insert Into Test Values(5, 150, 40872);
>
> Insert Into Test Values(6, 110, 41225);
> Insert Into Test Values(6, 115, 41227);
>
> Insert Into Test Values(9, 105, 41225);
>
> After the above Table-Creation + Inserts, one should
> define a "Base-View" first, which is then able to select
> appropriate aggregates (respecting your "max-4" condition):
>
> Create View vw_StdDev_Max4_Aggregates As
> Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> Limit 4)
> Group By ID;
>
> Only this view above has to be adjusted, according to your
> real table- and column-names, but the following two "derived views",
> will work after these changes as well, as long as you leave the
> above view-name - and the field-names: (ID, a1, a2, n) intact.
>
> So what remains, is two more view-defs (one for the population-StdDev -
> and one with a few slight changes, to cover the sample-StdDev as well.
>
> Create View vw_StdDev_Samples As
> Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
>With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) s From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
>
> Create View vw_StdDev_Populations As
> Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
>With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
>
> 
> Ok, here the result I get on my test-table, when the 3 views
> are in place, and the following gets executed:
>
> Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> ID  n   sigma
> 
> 3   4   14.7901994577491
> 5   3   8.16496580927733
> 6   2   2.5
> 9   1   0
>
>
> Select ID, n_SampleSize, s From vw_StdDev_Samples;
> ID  n   s
> 
> 3   4   17.0782512765993
> 5   3   10.1
> 6   2   3.53553390593274
> 9   1   null
>
> Please note the returned null (vs. the real 0) in the last result
> (done, because "unbiased estimates" are undefined for sample-size == 1)
>
> Finally for those interested, the CTE-based "iterated Square-Root",
> cut out for an isolated calculation of sqrt(2).
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select s From r Order By i Desc Limit 1
>
> The max. iterations are limited to 32 - though for input-values
> in a "typical range", the iteration will usually exit earlier...
> In case of the example above -> sqrt(2), the iteration-counter
> i reached only 6 (as can be seen, when you include i alongside s
> in the final Select.
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select i, s From r Order By i Desc Limit 1
>
> HTH
>
> Olaf
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org

Re: [sqlite] Standard deviation last x entries

2019-10-17 Thread Bart Smissaert
Hi Olaf,

Tested all on the Windows app and works perfectly fine indeed.
Thanks for that.

RBS

On Thu, Oct 17, 2019 at 12:56 AM Olaf Schmidt  wrote:

> Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > Sorry, I forgot to tell that. It is date column with an integer number.
> >
> > ID xValue xDate
> > 
> > 1   130  40123
> > 1   120  41232
> > 1   140  40582
> > 1   100  40888
> > 1   110  42541
> > 2   140  41225
> > 2   130  41589
> > 2   150  40872
>
> Because SQLite on Android has at least CTEs available,
> (since Android 5 or so), I thought I make an attempt using them...
>
> For your above sample-set I've created a table Test this way
> (with slightly changed Column-Names + different IDs with "gaps"
> + a few more sample-records, to cover especially n=1 sets):
>
> Create Table Test(ID, xVal, xDat);
>
> Insert Into Test Values(3, 130, 40123);
> Insert Into Test Values(3, 120, 41232);
> Insert Into Test Values(3, 140, 40582);
> Insert Into Test Values(3, 100, 40888);
> Insert Into Test Values(3, 110, 42541);
>
> Insert Into Test Values(5, 140, 41225);
> Insert Into Test Values(5, 130, 41589);
> Insert Into Test Values(5, 150, 40872);
>
> Insert Into Test Values(6, 110, 41225);
> Insert Into Test Values(6, 115, 41227);
>
> Insert Into Test Values(9, 105, 41225);
>
> After the above Table-Creation + Inserts, one should
> define a "Base-View" first, which is then able to select
> appropriate aggregates (respecting your "max-4" condition):
>
> Create View vw_StdDev_Max4_Aggregates As
> Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> Limit 4)
> Group By ID;
>
> Only this view above has to be adjusted, according to your
> real table- and column-names, but the following two "derived views",
> will work after these changes as well, as long as you leave the
> above view-name - and the field-names: (ID, a1, a2, n) intact.
>
> So what remains, is two more view-defs (one for the population-StdDev -
> and one with a few slight changes, to cover the sample-StdDev as well.
>
> Create View vw_StdDev_Samples As
> Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
>With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) s From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
>
> Create View vw_StdDev_Populations As
> Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
>With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
>
> 
> Ok, here the result I get on my test-table, when the 3 views
> are in place, and the following gets executed:
>
> Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> ID  n   sigma
> 
> 3   4   14.7901994577491
> 5   3   8.16496580927733
> 6   2   2.5
> 9   1   0
>
>
> Select ID, n_SampleSize, s From vw_StdDev_Samples;
> ID  n   s
> 
> 3   4   17.0782512765993
> 5   3   10.1
> 6   2   3.53553390593274
> 9   1   null
>
> Please note the returned null (vs. the real 0) in the last result
> (done, because "unbiased estimates" are undefined for sample-size == 1)
>
> Finally for those interested, the CTE-based "iterated Square-Root",
> cut out for an isolated calculation of sqrt(2).
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select s From r Order By i Desc Limit 1
>
> The max. iterations are limited to 32 - though for input-values
> in a "typical range", the iteration will usually exit earlier...
> In case of the example above -> sqrt(2), the iteration-counter
> i reached only 6 (as can be seen, when you include i alongside s
> in the final Select.
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select i, s From r Order By i Desc Limit 1
>
> HTH
>
> Olaf
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-17 Thread Bart Smissaert
Ignore the mentioned problem, must have been an e-mail artefact and view
creates fine.
Will test all later.

RBS

On Thu, Oct 17, 2019 at 11:02 AM Bart Smissaert 
wrote:

> Hi Olaf,
>
> Thanks, will try that out.
> I have a slight problem with your first create view on Android (fine in my
> Windows desktop app), objecting to:
> T.ID, highlighting the dot.
> I tried with giving an alias T2 to the table in the subselect, but that
> made no difference.
> Will try later.
>
> RBS
>
> On Thu, Oct 17, 2019 at 12:56 AM Olaf Schmidt  wrote:
>
>> Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
>> > Sorry, I forgot to tell that. It is date column with an integer number.
>> >
>> > ID xValue xDate
>> > 
>> > 1   130  40123
>> > 1   120  41232
>> > 1   140  40582
>> > 1   100  40888
>> > 1   110  42541
>> > 2   140  41225
>> > 2   130  41589
>> > 2   150  40872
>>
>> Because SQLite on Android has at least CTEs available,
>> (since Android 5 or so), I thought I make an attempt using them...
>>
>> For your above sample-set I've created a table Test this way
>> (with slightly changed Column-Names + different IDs with "gaps"
>> + a few more sample-records, to cover especially n=1 sets):
>>
>> Create Table Test(ID, xVal, xDat);
>>
>> Insert Into Test Values(3, 130, 40123);
>> Insert Into Test Values(3, 120, 41232);
>> Insert Into Test Values(3, 140, 40582);
>> Insert Into Test Values(3, 100, 40888);
>> Insert Into Test Values(3, 110, 42541);
>>
>> Insert Into Test Values(5, 140, 41225);
>> Insert Into Test Values(5, 130, 41589);
>> Insert Into Test Values(5, 150, 40872);
>>
>> Insert Into Test Values(6, 110, 41225);
>> Insert Into Test Values(6, 115, 41227);
>>
>> Insert Into Test Values(9, 105, 41225);
>>
>> After the above Table-Creation + Inserts, one should
>> define a "Base-View" first, which is then able to select
>> appropriate aggregates (respecting your "max-4" condition):
>>
>> Create View vw_StdDev_Max4_Aggregates As
>> Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
>> Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
>> Limit 4)
>> Group By ID;
>>
>> Only this view above has to be adjusted, according to your
>> real table- and column-names, but the following two "derived views",
>> will work after these changes as well, as long as you leave the
>> above view-name - and the field-names: (ID, a1, a2, n) intact.
>>
>> So what remains, is two more view-defs (one for the population-StdDev -
>> and one with a few slight changes, to cover the sample-StdDev as well.
>>
>> Create View vw_StdDev_Samples As
>> Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
>>With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
>>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
>> Limit 32
>>) Select s From r Order By i Desc Limit 1
>> ) s From vw_StdDev_Max4_Aggregates Where n>1
>> Union All
>> Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
>>
>> Create View vw_StdDev_Populations As
>> Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
>>With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
>>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
>> Limit 32
>>) Select s From r Order By i Desc Limit 1
>> ) sigma From vw_StdDev_Max4_Aggregates Where n>1
>> Union All
>> Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
>>
>> 
>> Ok, here the result I get on my test-table, when the 3 views
>> are in place, and the following gets executed:
>>
>> Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
>> ID  n   sigma
>> 
>> 3   4   14.7901994577491
>> 5   3   8.16496580927733
>> 6   2   2.5
>> 9   1   0
>>
>>
>> Select ID, n_SampleSize, s From vw_StdDev_Samples;
>> ID  n   s
>> 
>> 3   4   17.0782512765993
>> 5   3   10.1
>> 6   2   3.53553390593274
>> 9   1   null
>>
>> Please note the returned null (vs. the real 0) in the last result
>> (done, because "unbiased estimates" are undefined for sample-size == 1)
>>
>> Finally for those interested, the CTE-based "iterated Square-Root",
>> cut out for an isolated calculation of sqrt(2).
>>
>> With r(s2, s, i) As (Select 2, 1, 1 Union All
>>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
>> Limit 32
>> ) Select s From r Order By i Desc Limit 1
>>
>> The max. iterations are limited to 32 - though for input-values
>> in a "typical range", the iteration will usually exit earlier...
>> In case of the example above -> sqrt(2), the iteration-counter
>> i reached only 6 (as can be seen, when you include i alongside s
>> in the final Select.
>>
>> With r(s2, s, i) As (Select 2, 1, 1 Union All
>>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
>> Limit 32
>> ) Select i, s From r Order By i Desc Limit 1
>>
>> HTH
>>
>> Olaf
>>
>>
>>
>> 

Re: [sqlite] Standard deviation last x entries

2019-10-17 Thread Bart Smissaert
Hi Olaf,

Thanks, will try that out.
I have a slight problem with your first create view on Android (fine in my
Windows desktop app), objecting to:
T.ID, highlighting the dot.
I tried with giving an alias T2 to the table in the subselect, but that
made no difference.
Will try later.

RBS

On Thu, Oct 17, 2019 at 12:56 AM Olaf Schmidt  wrote:

> Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > Sorry, I forgot to tell that. It is date column with an integer number.
> >
> > ID xValue xDate
> > 
> > 1   130  40123
> > 1   120  41232
> > 1   140  40582
> > 1   100  40888
> > 1   110  42541
> > 2   140  41225
> > 2   130  41589
> > 2   150  40872
>
> Because SQLite on Android has at least CTEs available,
> (since Android 5 or so), I thought I make an attempt using them...
>
> For your above sample-set I've created a table Test this way
> (with slightly changed Column-Names + different IDs with "gaps"
> + a few more sample-records, to cover especially n=1 sets):
>
> Create Table Test(ID, xVal, xDat);
>
> Insert Into Test Values(3, 130, 40123);
> Insert Into Test Values(3, 120, 41232);
> Insert Into Test Values(3, 140, 40582);
> Insert Into Test Values(3, 100, 40888);
> Insert Into Test Values(3, 110, 42541);
>
> Insert Into Test Values(5, 140, 41225);
> Insert Into Test Values(5, 130, 41589);
> Insert Into Test Values(5, 150, 40872);
>
> Insert Into Test Values(6, 110, 41225);
> Insert Into Test Values(6, 115, 41227);
>
> Insert Into Test Values(9, 105, 41225);
>
> After the above Table-Creation + Inserts, one should
> define a "Base-View" first, which is then able to select
> appropriate aggregates (respecting your "max-4" condition):
>
> Create View vw_StdDev_Max4_Aggregates As
> Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> Limit 4)
> Group By ID;
>
> Only this view above has to be adjusted, according to your
> real table- and column-names, but the following two "derived views",
> will work after these changes as well, as long as you leave the
> above view-name - and the field-names: (ID, a1, a2, n) intact.
>
> So what remains, is two more view-defs (one for the population-StdDev -
> and one with a few slight changes, to cover the sample-StdDev as well.
>
> Create View vw_StdDev_Samples As
> Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
>With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) s From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
>
> Create View vw_StdDev_Populations As
> Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
>With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
>  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
>) Select s From r Order By i Desc Limit 1
> ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> Union All
> Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
>
> 
> Ok, here the result I get on my test-table, when the 3 views
> are in place, and the following gets executed:
>
> Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> ID  n   sigma
> 
> 3   4   14.7901994577491
> 5   3   8.16496580927733
> 6   2   2.5
> 9   1   0
>
>
> Select ID, n_SampleSize, s From vw_StdDev_Samples;
> ID  n   s
> 
> 3   4   17.0782512765993
> 5   3   10.1
> 6   2   3.53553390593274
> 9   1   null
>
> Please note the returned null (vs. the real 0) in the last result
> (done, because "unbiased estimates" are undefined for sample-size == 1)
>
> Finally for those interested, the CTE-based "iterated Square-Root",
> cut out for an isolated calculation of sqrt(2).
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select s From r Order By i Desc Limit 1
>
> The max. iterations are limited to 32 - though for input-values
> in a "typical range", the iteration will usually exit earlier...
> In case of the example above -> sqrt(2), the iteration-counter
> i reached only 6 (as can be seen, when you include i alongside s
> in the final Select.
>
> With r(s2, s, i) As (Select 2, 1, 1 Union All
>Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> Limit 32
> ) Select i, s From r Order By i Desc Limit 1
>
> HTH
>
> Olaf
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org

Re: [sqlite] Standard deviation last x entries

2019-10-16 Thread Olaf Schmidt

Am 12.10.2019 um 16:47 schrieb Bart Smissaert:

Sorry, I forgot to tell that. It is date column with an integer number.

ID xValue xDate

1   130  40123
1   120  41232
1   140  40582
1   100  40888
1   110  42541
2   140  41225
2   130  41589
2   150  40872


Because SQLite on Android has at least CTEs available,
(since Android 5 or so), I thought I make an attempt using them...

For your above sample-set I've created a table Test this way
(with slightly changed Column-Names + different IDs with "gaps"
+ a few more sample-records, to cover especially n=1 sets):

Create Table Test(ID, xVal, xDat);

Insert Into Test Values(3, 130, 40123);
Insert Into Test Values(3, 120, 41232);
Insert Into Test Values(3, 140, 40582);
Insert Into Test Values(3, 100, 40888);
Insert Into Test Values(3, 110, 42541);

Insert Into Test Values(5, 140, 41225);
Insert Into Test Values(5, 130, 41589);
Insert Into Test Values(5, 150, 40872);

Insert Into Test Values(6, 110, 41225);
Insert Into Test Values(6, 115, 41227);

Insert Into Test Values(9, 105, 41225);

After the above Table-Creation + Inserts, one should
define a "Base-View" first, which is then able to select
appropriate aggregates (respecting your "max-4" condition):

Create View vw_StdDev_Max4_Aggregates As
Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc 
Limit 4)

Group By ID;

Only this view above has to be adjusted, according to your
real table- and column-names, but the following two "derived views",
will work after these changes as well, as long as you leave the
above view-name - and the field-names: (ID, a1, a2, n) intact.

So what remains, is two more view-defs (one for the population-StdDev -
and one with a few slight changes, to cover the sample-StdDev as well.

Create View vw_StdDev_Samples As
Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
  With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12 
Limit 32

  ) Select s From r Order By i Desc Limit 1
) s From vw_StdDev_Max4_Aggregates Where n>1
Union All
Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;

Create View vw_StdDev_Populations As
Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
  With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12 
Limit 32

  ) Select s From r Order By i Desc Limit 1
) sigma From vw_StdDev_Max4_Aggregates Where n>1
Union All
Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;


Ok, here the result I get on my test-table, when the 3 views
are in place, and the following gets executed:

Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
ID  n   sigma

3   4   14.7901994577491
5   3   8.16496580927733
6   2   2.5
9   1   0


Select ID, n_SampleSize, s From vw_StdDev_Samples;
ID  n   s

3   4   17.0782512765993
5   3   10.1
6   2   3.53553390593274
9   1   null

Please note the returned null (vs. the real 0) in the last result
(done, because "unbiased estimates" are undefined for sample-size == 1)

Finally for those interested, the CTE-based "iterated Square-Root",
cut out for an isolated calculation of sqrt(2).

With r(s2, s, i) As (Select 2, 1, 1 Union All
  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12 
Limit 32

) Select s From r Order By i Desc Limit 1

The max. iterations are limited to 32 - though for input-values
in a "typical range", the iteration will usually exit earlier...
In case of the example above -> sqrt(2), the iteration-counter
i reached only 6 (as can be seen, when you include i alongside s
in the final Select.

With r(s2, s, i) As (Select 2, 1, 1 Union All
  Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12 
Limit 32

) Select i, s From r Order By i Desc Limit 1

HTH

Olaf



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Bart Smissaert
This is not a sample, but all the systolic blood pressure values of all of
our patients.
All this has to do with the calculation of the QRisk3 score:
https://qrisk.org/three/

RBS

On Sat, Oct 12, 2019 at 5:34 PM Richard Damon 
wrote:

> One thing to point out, it sounds like you are dealing with a sample,
> and I think you want to get the estimated standard deviation of the
> process, which says you really want to use the adjusted formula that
> uses N-1 in the denominator, as the expected value of the standard
> deviation of a sample is smaller than the standard deviation of the
> process/population.
>
> One quick test is to think what you would do if you had just a single
> point.  Does this mean you really have a Standard Deviation of Zero (as
> that is the SD of a population of one) or does it mean you don't have an
> idea of what the SD of the population is.
>
> On 10/12/19 12:13 PM, Bart Smissaert wrote:
> > Thanks, I do know how to calculate the SD in code, but I thought in this
> > particular case it might be faster to do this in SQL.
> > Only problem is the square root and for that reason I will test this in
> > code as well and see how it compares with SQL.
> >
> > I found a way to get the one from last step, so that is without doing the
> > final square root step.
> > Note that this has to do with a number of systolic blood pressure
> readings
> > and I am actually taking the last 6 and  that is useful as due
> > to treatment the SD often will be less for the later values:
> >
> > UPDATE QR3PARAMS SET AVG_BP =
> > (SELECT AVG(SYST) FROM (SELECT SYSTOLIC AS SYST FROM BP WHERE ID =
> > QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6))
> >
> > UPDATE QR3PARAMS SET DEV_BP =
> > (SELECT AVG(SYST * SYST) FROM (SELECT SYSTOLIC - QR3PARAMS.AVG_BP AS SYST
> > FROM BP WHERE ID = QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6))
> >
> > RBS
> >
> >
> >
> >
> > On Sat, Oct 12, 2019 at 4:21 PM Keith Medcalf 
> wrote:
> >
> >> The Standard Deviation of the Population is the Square Root of the mean
> of
> >> the second order differences.
> >>
> >> For your input values, you calculate the mean.
> >> Then you calculate the mean of the square of the difference between each
> >> value and the mean of the values.
> >> Then you take the square root of that.
> >>
> >> You can also compute it using a running calculation (generally more
> >> accurate on IEEE754 hardware).
> >>
> >> So just retrieve the values you are interested in and do the
> calculations
> >> in your application.  That is, if you only want FOUR values, then write
> a
> >> query that returns only those FOUR values, and use those to computer the
> >> answer.  While you can do all the calculations (including the Square
> Root)
> >> in SQL, it is likely to be horrendously slow.
> >>
> >> However, before you can do *ANYTHING* else, you need to be able to
> >> demonstrate that you can write a select statement that retrieves the
> values
> >> you want to work on, only the values you want to work on, and nothing
> but
> >> the values you want to work on.  Having 47,000 values and sauing that
> you
> >> would like to find the standard deviation of some set of four of them is
> >> entertaining, but not useful in any way.
> >>
> >> --
> >> The fact that there's a Highway to Hell but only a Stairway to Heaven
> says
> >> a lot about anticipated traffic volume.
> >>
> >>> -Original Message-
> >>> From: sqlite-users  On
> >>> Behalf Of Bart Smissaert
> >>> Sent: Saturday, 12 October, 2019 08:09
> >>> To: General Discussion of SQLite Database  >>> us...@mailinglists.sqlite.org>
> >>> Subject: [sqlite] Standard deviation last x entries
> >>>
> >>> How do I get the standard deviation of the last 4 entries (there could
> be
> >>> less than 4) of an integer column grouped by an integer ID entry in
> >>> another
> >>> column in the same table.
> >>>
> >>> So data could be like this:
> >>>
> >>> ID xValue
> >>> 
> >>> 1   130
> >>> 1   120
> >>> 1   140
> >>> 1   100
> >>> 1   110
> >>> 2   140
> >>> 2   130
> >>> 2   150
> >>>
> >>> I know that SQLite doesn't have the SQRT function (and I can't make a
> UDF
> >>> as I am doing this in an Android phone app) but I can do this last step
> >>> in
> >>> code.
> >>>
> >>> RBS
> >>> ___
> >>> sqlite-users mailing list
> >>> sqlite-users@mailinglists.sqlite.org
> >>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >>
> >>
> >> ___
> >> sqlite-users mailing list
> >> sqlite-users@mailinglists.sqlite.org
> >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >>
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> --
> Richard Damon
>
> ___
> sqlite-users mailing list
> 

Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Richard Damon
One thing to point out, it sounds like you are dealing with a sample,
and I think you want to get the estimated standard deviation of the
process, which says you really want to use the adjusted formula that
uses N-1 in the denominator, as the expected value of the standard
deviation of a sample is smaller than the standard deviation of the
process/population.

One quick test is to think what you would do if you had just a single
point.  Does this mean you really have a Standard Deviation of Zero (as
that is the SD of a population of one) or does it mean you don't have an
idea of what the SD of the population is.

On 10/12/19 12:13 PM, Bart Smissaert wrote:
> Thanks, I do know how to calculate the SD in code, but I thought in this
> particular case it might be faster to do this in SQL.
> Only problem is the square root and for that reason I will test this in
> code as well and see how it compares with SQL.
>
> I found a way to get the one from last step, so that is without doing the
> final square root step.
> Note that this has to do with a number of systolic blood pressure readings
> and I am actually taking the last 6 and  that is useful as due
> to treatment the SD often will be less for the later values:
>
> UPDATE QR3PARAMS SET AVG_BP =
> (SELECT AVG(SYST) FROM (SELECT SYSTOLIC AS SYST FROM BP WHERE ID =
> QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6))
>
> UPDATE QR3PARAMS SET DEV_BP =
> (SELECT AVG(SYST * SYST) FROM (SELECT SYSTOLIC - QR3PARAMS.AVG_BP AS SYST
> FROM BP WHERE ID = QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6))
>
> RBS
>
>
>
>
> On Sat, Oct 12, 2019 at 4:21 PM Keith Medcalf  wrote:
>
>> The Standard Deviation of the Population is the Square Root of the mean of
>> the second order differences.
>>
>> For your input values, you calculate the mean.
>> Then you calculate the mean of the square of the difference between each
>> value and the mean of the values.
>> Then you take the square root of that.
>>
>> You can also compute it using a running calculation (generally more
>> accurate on IEEE754 hardware).
>>
>> So just retrieve the values you are interested in and do the calculations
>> in your application.  That is, if you only want FOUR values, then write a
>> query that returns only those FOUR values, and use those to computer the
>> answer.  While you can do all the calculations (including the Square Root)
>> in SQL, it is likely to be horrendously slow.
>>
>> However, before you can do *ANYTHING* else, you need to be able to
>> demonstrate that you can write a select statement that retrieves the values
>> you want to work on, only the values you want to work on, and nothing but
>> the values you want to work on.  Having 47,000 values and sauing that you
>> would like to find the standard deviation of some set of four of them is
>> entertaining, but not useful in any way.
>>
>> --
>> The fact that there's a Highway to Hell but only a Stairway to Heaven says
>> a lot about anticipated traffic volume.
>>
>>> -Original Message-
>>> From: sqlite-users  On
>>> Behalf Of Bart Smissaert
>>> Sent: Saturday, 12 October, 2019 08:09
>>> To: General Discussion of SQLite Database >> us...@mailinglists.sqlite.org>
>>> Subject: [sqlite] Standard deviation last x entries
>>>
>>> How do I get the standard deviation of the last 4 entries (there could be
>>> less than 4) of an integer column grouped by an integer ID entry in
>>> another
>>> column in the same table.
>>>
>>> So data could be like this:
>>>
>>> ID xValue
>>> 
>>> 1   130
>>> 1   120
>>> 1   140
>>> 1   100
>>> 1   110
>>> 2   140
>>> 2   130
>>> 2   150
>>>
>>> I know that SQLite doesn't have the SQRT function (and I can't make a UDF
>>> as I am doing this in an Android phone app) but I can do this last step
>>> in
>>> code.
>>>
>>> RBS
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


-- 
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Bart Smissaert
Thanks, I do know how to calculate the SD in code, but I thought in this
particular case it might be faster to do this in SQL.
Only problem is the square root and for that reason I will test this in
code as well and see how it compares with SQL.

I found a way to get the one from last step, so that is without doing the
final square root step.
Note that this has to do with a number of systolic blood pressure readings
and I am actually taking the last 6 and  that is useful as due
to treatment the SD often will be less for the later values:

UPDATE QR3PARAMS SET AVG_BP =
(SELECT AVG(SYST) FROM (SELECT SYSTOLIC AS SYST FROM BP WHERE ID =
QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6))

UPDATE QR3PARAMS SET DEV_BP =
(SELECT AVG(SYST * SYST) FROM (SELECT SYSTOLIC - QR3PARAMS.AVG_BP AS SYST
FROM BP WHERE ID = QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6))

RBS




On Sat, Oct 12, 2019 at 4:21 PM Keith Medcalf  wrote:

>
> The Standard Deviation of the Population is the Square Root of the mean of
> the second order differences.
>
> For your input values, you calculate the mean.
> Then you calculate the mean of the square of the difference between each
> value and the mean of the values.
> Then you take the square root of that.
>
> You can also compute it using a running calculation (generally more
> accurate on IEEE754 hardware).
>
> So just retrieve the values you are interested in and do the calculations
> in your application.  That is, if you only want FOUR values, then write a
> query that returns only those FOUR values, and use those to computer the
> answer.  While you can do all the calculations (including the Square Root)
> in SQL, it is likely to be horrendously slow.
>
> However, before you can do *ANYTHING* else, you need to be able to
> demonstrate that you can write a select statement that retrieves the values
> you want to work on, only the values you want to work on, and nothing but
> the values you want to work on.  Having 47,000 values and sauing that you
> would like to find the standard deviation of some set of four of them is
> entertaining, but not useful in any way.
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
> >-Original Message-
> >From: sqlite-users  On
> >Behalf Of Bart Smissaert
> >Sent: Saturday, 12 October, 2019 08:09
> >To: General Discussion of SQLite Database  >us...@mailinglists.sqlite.org>
> >Subject: [sqlite] Standard deviation last x entries
> >
> >How do I get the standard deviation of the last 4 entries (there could be
> >less than 4) of an integer column grouped by an integer ID entry in
> >another
> >column in the same table.
> >
> >So data could be like this:
> >
> >ID xValue
> >
> >1   130
> >1   120
> >1   140
> >1   100
> >1   110
> >2   140
> >2   130
> >2   150
> >
> >I know that SQLite doesn't have the SQRT function (and I can't make a UDF
> >as I am doing this in an Android phone app) but I can do this last step
> >in
> >code.
> >
> >RBS
> >___
> >sqlite-users mailing list
> >sqlite-users@mailinglists.sqlite.org
> >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Keith Medcalf

#!python3
from __future__ import absolute_import, division, print_function, 
unicode_literals

import apsw

db = apsw.Connection()

db.executescript("""
create table x
(
value   integer not null
);
insert into x values (120), (130), (140), (110);
""")


# Method 1:  Using partial Running calc

sumx = 0.0
sumx2 = 0.0
count = 0

for row in db.execute('select value from x'):
sumx += row.value
sumx2 += row.value ** 2
count += 1

sd1 = (sumx2 / count - (sumx / count) ** 2) ** 0.5


# Method 2:  Using double retrieval brute force calc

avgx = 0.0
avgm = 0.0
count = 0

for row in db.execute('select avg(value) as avg from x'):
avgx = row.avg

for row in db.execute('select value from x'):
avgm += (row.value - avgx) ** 2
count += 1

sd2 = (avgm / count) ** 0.5


# Method 3:  Using my extension which does full running calc

for row in db.execute('select stdevp(value) as stdev from x'):
sd3 = row.stdev


# Method 4:  Using mostly pure SQL

for row in db.execute('select count(value) as count, sum(value) as sumx, 
sum(value*value) as sumx2 from x'):
sd4 = (row.sumx2 / row.count - (row.sumx / row.count) ** 2) ** 0.5


# Method 5:  Using even more calcs in SQL

for row in db.execute('''select sumx2 / count - (sumx / count) * (sumx / count) 
as var
   from (select count(value) as count, sum(value) as 
sumx, sum(value*value) as sumx2 from x)'''):
sd5 = row.var ** 0.5


print(sd1, sd2, sd3, sd4, sd5)

=>

11.180339887498949 11.180339887498949 11.180339887498949 11.180339887498949 
11.180339887498949

You could even calculate the square root in a recursive CTE if you wanted and 
not have your application do anything at all other than submit the query.  
However, Method 1 is likely the most efficient (and easiest) to do assuming 
that you do not have an extension which calculates the aggregate, and is also 
likely to execute the fastest.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Richard Damon
>Sent: Saturday, 12 October, 2019 09:33
>To: sqlite-users@mailinglists.sqlite.org
>Subject: Re: [sqlite] Standard deviation last x entries
>
>On 10/12/19 11:23 AM, Richard Damon wrote:
>> On 10/12/19 10:08 AM, Bart Smissaert wrote:
>>> How do I get the standard deviation of the last 4 entries (there could
>be
>>> less than 4) of an integer column grouped by an integer ID entry in
>another
>>> column in the same table.
>>>
>>> So data could be like this:
>>>
>>> ID xValue
>>> 
>>> 1   130
>>> 1   120
>>> 1   140
>>> 1   100
>>> 1   110
>>> 2   140
>>> 2   130
>>> 2   150
>>>
>>> I know that SQLite doesn't have the SQRT function (and I can't make a
>UDF
>>> as I am doing this in an Android phone app) but I can do this last
>step in
>>> code.
>>>
>>> RBS
>> I may not be the best at forming the SQL to do this, but here is the
>> general idea I would use. First use a query with ORDER BY DECR and
>LIMIT
>> to get the last 4 items, and select the value, and a computer column of
>> value2 = value * value.
>>
>> Then over this result, do a GROUP BY to compute N  = count(value), S =
>> sum(value), S2 =sum(value2)
>>
>> You can then compute the variance (standard deviation squared) from the
>> equation
>>
>> Variance = S2 / N - (S / N) * (S / N)
>>
>> This equation assumes that you are working with the full population and
>> not just a sample, if you data is to be considered a sample you need to
>> make a small adjustment in the formula to
>>
>> Estimated Variance of Popultion = S2 / (N-1) - S * S / (N * (N-1))
>>
>>
>(Sorry about the messed up subject, reposting to fix it)
>
>--
>Richard Damon
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Richard Damon
On 10/12/19 11:23 AM, Richard Damon wrote:
> On 10/12/19 10:08 AM, Bart Smissaert wrote:
>> How do I get the standard deviation of the last 4 entries (there could be
>> less than 4) of an integer column grouped by an integer ID entry in another
>> column in the same table.
>>
>> So data could be like this:
>>
>> ID xValue
>> 
>> 1   130
>> 1   120
>> 1   140
>> 1   100
>> 1   110
>> 2   140
>> 2   130
>> 2   150
>>
>> I know that SQLite doesn't have the SQRT function (and I can't make a UDF
>> as I am doing this in an Android phone app) but I can do this last step in
>> code.
>>
>> RBS
> I may not be the best at forming the SQL to do this, but here is the
> general idea I would use. First use a query with ORDER BY DECR and LIMIT
> to get the last 4 items, and select the value, and a computer column of
> value2 = value * value.
>
> Then over this result, do a GROUP BY to compute N  = count(value), S =
> sum(value), S2 =sum(value2)
>
> You can then compute the variance (standard deviation squared) from the
> equation
>
> Variance = S2 / N - (S / N) * (S / N)
>
> This equation assumes that you are working with the full population and
> not just a sample, if you data is to be considered a sample you need to
> make a small adjustment in the formula to
>
> Estimated Variance of Popultion = S2 / (N-1) - S * S / (N * (N-1))
>
>
(Sorry about the messed up subject, reposting to fix it)

-- 
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Keith Medcalf

The Standard Deviation of the Population is the Square Root of the mean of the 
second order differences.

For your input values, you calculate the mean.
Then you calculate the mean of the square of the difference between each value 
and the mean of the values.
Then you take the square root of that.

You can also compute it using a running calculation (generally more accurate on 
IEEE754 hardware).

So just retrieve the values you are interested in and do the calculations in 
your application.  That is, if you only want FOUR values, then write a query 
that returns only those FOUR values, and use those to computer the answer.  
While you can do all the calculations (including the Square Root) in SQL, it is 
likely to be horrendously slow.

However, before you can do *ANYTHING* else, you need to be able to demonstrate 
that you can write a select statement that retrieves the values you want to 
work on, only the values you want to work on, and nothing but the values you 
want to work on.  Having 47,000 values and sauing that you would like to find 
the standard deviation of some set of four of them is entertaining, but not 
useful in any way.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Bart Smissaert
>Sent: Saturday, 12 October, 2019 08:09
>To: General Discussion of SQLite Database us...@mailinglists.sqlite.org>
>Subject: [sqlite] Standard deviation last x entries
>
>How do I get the standard deviation of the last 4 entries (there could be
>less than 4) of an integer column grouped by an integer ID entry in
>another
>column in the same table.
>
>So data could be like this:
>
>ID xValue
>
>1   130
>1   120
>1   140
>1   100
>1   110
>2   140
>2   130
>2   150
>
>I know that SQLite doesn't have the SQRT function (and I can't make a UDF
>as I am doing this in an Android phone app) but I can do this last step
>in
>code.
>
>RBS
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Bart Smissaert
Sorry, I forgot to tell that. It is date column with an integer number.

ID xValue xDate

1   130  40123
1   120  41232
1   140  40582
1   100  40888
1   110  42541
2   140  41225
2   130  41589
2   150  40872

RBS

On Sat, Oct 12, 2019 at 3:18 PM Igor Tandetnik  wrote:

> On 10/12/2019 10:08 AM, Bart Smissaert wrote:
> > How do I get the standard deviation of the last 4 entries (there could be
> > less than 4) of an integer column grouped by an integer ID entry in
> another
> > column in the same table.
>
> What do you mean by "last 4 entries"? What determines the order? How does
> one tell which entry is the first and which is the last?
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Standard deviation last x entries

2019-10-12 Thread Igor Tandetnik

On 10/12/2019 10:08 AM, Bart Smissaert wrote:

How do I get the standard deviation of the last 4 entries (there could be
less than 4) of an integer column grouped by an integer ID entry in another
column in the same table.


What do you mean by "last 4 entries"? What determines the order? How does one 
tell which entry is the first and which is the last?
--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users