Another approach is to question whether you really need the
standard deviation or if any measure of variation would do.

You could use the range, max(x) - min(x) , and just report it as the range
rather than the standard deviation.  Also range/4 approximates the
standard deviation (google the range rule)
although the approximation is not very good.



On Fri, Oct 18, 2019 at 6:41 PM Gabor Grothendieck
<ggrothendi...@gmail.com> wrote:
>
> There is a stdev function for sqlite here:
> https://www.sqlite.org/contrib//download/extension-functions.c?get=25
>
> On Wed, Oct 16, 2019 at 7:57 PM Olaf Schmidt <n...@vbrichclient.com> wrote:
> >
> > Am 12.10.2019 um 16:47 schrieb Bart Smissaert:
> > > Sorry, I forgot to tell that. It is date column with an integer number.
> > >
> > > ID xValue xDate
> > > ----------------
> > > 1   130      40123
> > > 1   120      41232
> > > 1   140      40582
> > > 1   100      40888
> > > 1   110      42541
> > > 2   140      41225
> > > 2   130      41589
> > > 2   150      40872
> >
> > Because SQLite on Android has at least CTEs available,
> > (since Android 5 or so), I thought I make an attempt using them...
> >
> > For your above sample-set I've created a table Test this way
> > (with slightly changed Column-Names + different IDs with "gaps"
> > + a few more sample-records, to cover especially n=1 sets):
> >
> > Create Table Test(ID, xVal, xDat);
> >
> > Insert Into Test Values(3, 130, 40123);
> > Insert Into Test Values(3, 120, 41232);
> > Insert Into Test Values(3, 140, 40582);
> > Insert Into Test Values(3, 100, 40888);
> > Insert Into Test Values(3, 110, 42541);
> >
> > Insert Into Test Values(5, 140, 41225);
> > Insert Into Test Values(5, 130, 41589);
> > Insert Into Test Values(5, 150, 40872);
> >
> > Insert Into Test Values(6, 110, 41225);
> > Insert Into Test Values(6, 115, 41227);
> >
> > Insert Into Test Values(9, 105, 41225);
> >
> > After the above Table-Creation + Inserts, one should
> > define a "Base-View" first, which is then able to select
> > appropriate aggregates (respecting your "max-4" condition):
> >
> > Create View vw_StdDev_Max4_Aggregates As
> > Select ID, Avg(xVal) a1, Avg(xVal*xVal) a2, Count(*) n From Test T
> > Where xDat In (Select xDat From Test Where T.ID=ID Order By xDat Desc
> > Limit 4)
> > Group By ID;
> >
> > Only this view above has to be adjusted, according to your
> > real table- and column-names, but the following two "derived views",
> > will work after these changes as well, as long as you leave the
> > above view-name - and the field-names: (ID, a1, a2, n) intact.
> >
> > So what remains, is two more view-defs (one for the population-StdDev -
> > and one with a few slight changes, to cover the sample-StdDev as well.
> >
> > Create View vw_StdDev_Samples As
> > Select ID, n n_SampleSize, (a2-a1*a1)*n/(n-1) "s^2", (
> >    With r(s2, s, i) As (Select (a2-a1*a1)*n/(n-1), 1, 1 Union All
> >      Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> >    ) Select s From r Order By i Desc Limit 1
> > ) s From vw_StdDev_Max4_Aggregates Where n>1
> > Union All
> > Select ID, n, Null, Null From vw_StdDev_Max4_Aggregates Where n=1;
> >
> > Create View vw_StdDev_Populations As
> > Select ID, n n_PopulationSize, (a2-a1*a1) "sigma^2", (
> >    With r(s2, s, i) As (Select (a2-a1*a1), 1, 1 Union All
> >      Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> >    ) Select s From r Order By i Desc Limit 1
> > ) sigma From vw_StdDev_Max4_Aggregates Where n>1
> > Union All
> > Select ID, n, 0, 0 From vw_StdDev_Max4_Aggregates Where n=1;
> >
> > ------------------------------------
> > Ok, here the result I get on my test-table, when the 3 views
> > are in place, and the following gets executed:
> >
> > Select ID, n_PopulationSize, sigma From vw_StdDev_Populations;
> > ID  n   sigma
> > ------------------------
> > 3   4   14.7901994577491
> > 5   3   8.16496580927733
> > 6   2   2.5
> > 9   1   0
> >
> >
> > Select ID, n_SampleSize, s From vw_StdDev_Samples;
> > ID  n   s
> > ------------------------
> > 3   4   17.0782512765993
> > 5   3   10.0000000000001
> > 6   2   3.53553390593274
> > 9   1   null
> >
> > Please note the returned null (vs. the real 0) in the last result
> > (done, because "unbiased estimates" are undefined for sample-size == 1)
> >
> > Finally for those interested, the CTE-based "iterated Square-Root",
> > cut out for an isolated calculation of sqrt(2).
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >    Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select s From r Order By i Desc Limit 1
> >
> > The max. iterations are limited to 32 - though for input-values
> > in a "typical range", the iteration will usually exit earlier...
> > In case of the example above -> sqrt(2), the iteration-counter
> > i reached only 6 (as can be seen, when you include i alongside s
> > in the final Select.
> >
> > With r(s2, s, i) As (Select 2, 1, 1 Union All
> >    Select s2, (s2>0)*(s+s2/s)*.5, i+1 From r Where Abs(s2-s*s)>1e-12
> > Limit 32
> > ) Select i, s From r Order By i Desc Limit 1
> >
> > HTH
> >
> > Olaf
> >
> >
> >
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to