Re: [sqlite] rationale for private archives of the SQLite mailing list and suppressed reply-to
> On Oct 11, 2019, at 9:19 AM, Mike Bayer wrote: > > GNU Mailman is still very widely used and IMO does the job very well Its web interface is like something from 1997. In particular, it makes reading archives very painful, clicking through to one message at a time. I’d recommend groups.io — it hosts mailing lists with excellent browsable archives, or from another perspective, web forums with excellent email notifications. In my experience it pleases both mailing-list and forum fans. —Jens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] insertion of a single record taking more than 200 msec some times
> On Oct 11, 2019, at 7:43 AM, GopiKrishna Parisa > wrote: > > One weird thing is for some insertions, it's taking more than 200 msec for > single record where as others takes around 20 to 40 (avag 27 msec). Those sound like numbers I’d expect for committing a transaction. (The longer time is for a WAL flush.) But the major performance tip for SQLite is: minimize the number of transactions, because commits are expensive. In your case, try to insert as many records as you can in one transaction. If you’re copying them from a text file, just use a single transaction. If they’re live, you do want to commit transactions periodically for durability, maybe every few seconds. —Jens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Standard deviation last x entries
This is not a sample, but all the systolic blood pressure values of all of our patients. All this has to do with the calculation of the QRisk3 score: https://qrisk.org/three/ RBS On Sat, Oct 12, 2019 at 5:34 PM Richard Damon wrote: > One thing to point out, it sounds like you are dealing with a sample, > and I think you want to get the estimated standard deviation of the > process, which says you really want to use the adjusted formula that > uses N-1 in the denominator, as the expected value of the standard > deviation of a sample is smaller than the standard deviation of the > process/population. > > One quick test is to think what you would do if you had just a single > point. Does this mean you really have a Standard Deviation of Zero (as > that is the SD of a population of one) or does it mean you don't have an > idea of what the SD of the population is. > > On 10/12/19 12:13 PM, Bart Smissaert wrote: > > Thanks, I do know how to calculate the SD in code, but I thought in this > > particular case it might be faster to do this in SQL. > > Only problem is the square root and for that reason I will test this in > > code as well and see how it compares with SQL. > > > > I found a way to get the one from last step, so that is without doing the > > final square root step. > > Note that this has to do with a number of systolic blood pressure > readings > > and I am actually taking the last 6 and that is useful as due > > to treatment the SD often will be less for the later values: > > > > UPDATE QR3PARAMS SET AVG_BP = > > (SELECT AVG(SYST) FROM (SELECT SYSTOLIC AS SYST FROM BP WHERE ID = > > QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6)) > > > > UPDATE QR3PARAMS SET DEV_BP = > > (SELECT AVG(SYST * SYST) FROM (SELECT SYSTOLIC - QR3PARAMS.AVG_BP AS SYST > > FROM BP WHERE ID = QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6)) > > > > RBS > > > > > > > > > > On Sat, Oct 12, 2019 at 4:21 PM Keith Medcalf > wrote: > > > >> The Standard Deviation of the Population is the Square Root of the mean > of > >> the second order differences. > >> > >> For your input values, you calculate the mean. > >> Then you calculate the mean of the square of the difference between each > >> value and the mean of the values. > >> Then you take the square root of that. > >> > >> You can also compute it using a running calculation (generally more > >> accurate on IEEE754 hardware). > >> > >> So just retrieve the values you are interested in and do the > calculations > >> in your application. That is, if you only want FOUR values, then write > a > >> query that returns only those FOUR values, and use those to computer the > >> answer. While you can do all the calculations (including the Square > Root) > >> in SQL, it is likely to be horrendously slow. > >> > >> However, before you can do *ANYTHING* else, you need to be able to > >> demonstrate that you can write a select statement that retrieves the > values > >> you want to work on, only the values you want to work on, and nothing > but > >> the values you want to work on. Having 47,000 values and sauing that > you > >> would like to find the standard deviation of some set of four of them is > >> entertaining, but not useful in any way. > >> > >> -- > >> The fact that there's a Highway to Hell but only a Stairway to Heaven > says > >> a lot about anticipated traffic volume. > >> > >>> -Original Message- > >>> From: sqlite-users On > >>> Behalf Of Bart Smissaert > >>> Sent: Saturday, 12 October, 2019 08:09 > >>> To: General Discussion of SQLite Database >>> us...@mailinglists.sqlite.org> > >>> Subject: [sqlite] Standard deviation last x entries > >>> > >>> How do I get the standard deviation of the last 4 entries (there could > be > >>> less than 4) of an integer column grouped by an integer ID entry in > >>> another > >>> column in the same table. > >>> > >>> So data could be like this: > >>> > >>> ID xValue > >>> > >>> 1 130 > >>> 1 120 > >>> 1 140 > >>> 1 100 > >>> 1 110 > >>> 2 140 > >>> 2 130 > >>> 2 150 > >>> > >>> I know that SQLite doesn't have the SQRT function (and I can't make a > UDF > >>> as I am doing this in an Android phone app) but I can do this last step > >>> in > >>> code. > >>> > >>> RBS > >>> ___ > >>> sqlite-users mailing list > >>> sqlite-users@mailinglists.sqlite.org > >>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > >> > >> > >> ___ > >> sqlite-users mailing list > >> sqlite-users@mailinglists.sqlite.org > >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > >> > > ___ > > sqlite-users mailing list > > sqlite-users@mailinglists.sqlite.org > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > -- > Richard Damon > > ___ > sqlite-users mailing list >
Re: [sqlite] Standard deviation last x entries
One thing to point out, it sounds like you are dealing with a sample, and I think you want to get the estimated standard deviation of the process, which says you really want to use the adjusted formula that uses N-1 in the denominator, as the expected value of the standard deviation of a sample is smaller than the standard deviation of the process/population. One quick test is to think what you would do if you had just a single point. Does this mean you really have a Standard Deviation of Zero (as that is the SD of a population of one) or does it mean you don't have an idea of what the SD of the population is. On 10/12/19 12:13 PM, Bart Smissaert wrote: > Thanks, I do know how to calculate the SD in code, but I thought in this > particular case it might be faster to do this in SQL. > Only problem is the square root and for that reason I will test this in > code as well and see how it compares with SQL. > > I found a way to get the one from last step, so that is without doing the > final square root step. > Note that this has to do with a number of systolic blood pressure readings > and I am actually taking the last 6 and that is useful as due > to treatment the SD often will be less for the later values: > > UPDATE QR3PARAMS SET AVG_BP = > (SELECT AVG(SYST) FROM (SELECT SYSTOLIC AS SYST FROM BP WHERE ID = > QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6)) > > UPDATE QR3PARAMS SET DEV_BP = > (SELECT AVG(SYST * SYST) FROM (SELECT SYSTOLIC - QR3PARAMS.AVG_BP AS SYST > FROM BP WHERE ID = QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6)) > > RBS > > > > > On Sat, Oct 12, 2019 at 4:21 PM Keith Medcalf wrote: > >> The Standard Deviation of the Population is the Square Root of the mean of >> the second order differences. >> >> For your input values, you calculate the mean. >> Then you calculate the mean of the square of the difference between each >> value and the mean of the values. >> Then you take the square root of that. >> >> You can also compute it using a running calculation (generally more >> accurate on IEEE754 hardware). >> >> So just retrieve the values you are interested in and do the calculations >> in your application. That is, if you only want FOUR values, then write a >> query that returns only those FOUR values, and use those to computer the >> answer. While you can do all the calculations (including the Square Root) >> in SQL, it is likely to be horrendously slow. >> >> However, before you can do *ANYTHING* else, you need to be able to >> demonstrate that you can write a select statement that retrieves the values >> you want to work on, only the values you want to work on, and nothing but >> the values you want to work on. Having 47,000 values and sauing that you >> would like to find the standard deviation of some set of four of them is >> entertaining, but not useful in any way. >> >> -- >> The fact that there's a Highway to Hell but only a Stairway to Heaven says >> a lot about anticipated traffic volume. >> >>> -Original Message- >>> From: sqlite-users On >>> Behalf Of Bart Smissaert >>> Sent: Saturday, 12 October, 2019 08:09 >>> To: General Discussion of SQLite Database >> us...@mailinglists.sqlite.org> >>> Subject: [sqlite] Standard deviation last x entries >>> >>> How do I get the standard deviation of the last 4 entries (there could be >>> less than 4) of an integer column grouped by an integer ID entry in >>> another >>> column in the same table. >>> >>> So data could be like this: >>> >>> ID xValue >>> >>> 1 130 >>> 1 120 >>> 1 140 >>> 1 100 >>> 1 110 >>> 2 140 >>> 2 130 >>> 2 150 >>> >>> I know that SQLite doesn't have the SQRT function (and I can't make a UDF >>> as I am doing this in an Android phone app) but I can do this last step >>> in >>> code. >>> >>> RBS >>> ___ >>> sqlite-users mailing list >>> sqlite-users@mailinglists.sqlite.org >>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> >> >> ___ >> sqlite-users mailing list >> sqlite-users@mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users -- Richard Damon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Standard deviation last x entries
Thanks, I do know how to calculate the SD in code, but I thought in this particular case it might be faster to do this in SQL. Only problem is the square root and for that reason I will test this in code as well and see how it compares with SQL. I found a way to get the one from last step, so that is without doing the final square root step. Note that this has to do with a number of systolic blood pressure readings and I am actually taking the last 6 and that is useful as due to treatment the SD often will be less for the later values: UPDATE QR3PARAMS SET AVG_BP = (SELECT AVG(SYST) FROM (SELECT SYSTOLIC AS SYST FROM BP WHERE ID = QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6)) UPDATE QR3PARAMS SET DEV_BP = (SELECT AVG(SYST * SYST) FROM (SELECT SYSTOLIC - QR3PARAMS.AVG_BP AS SYST FROM BP WHERE ID = QR3PARAMS.ID ORDER BY ENTRY_DATE DESC LIMIT 6)) RBS On Sat, Oct 12, 2019 at 4:21 PM Keith Medcalf wrote: > > The Standard Deviation of the Population is the Square Root of the mean of > the second order differences. > > For your input values, you calculate the mean. > Then you calculate the mean of the square of the difference between each > value and the mean of the values. > Then you take the square root of that. > > You can also compute it using a running calculation (generally more > accurate on IEEE754 hardware). > > So just retrieve the values you are interested in and do the calculations > in your application. That is, if you only want FOUR values, then write a > query that returns only those FOUR values, and use those to computer the > answer. While you can do all the calculations (including the Square Root) > in SQL, it is likely to be horrendously slow. > > However, before you can do *ANYTHING* else, you need to be able to > demonstrate that you can write a select statement that retrieves the values > you want to work on, only the values you want to work on, and nothing but > the values you want to work on. Having 47,000 values and sauing that you > would like to find the standard deviation of some set of four of them is > entertaining, but not useful in any way. > > -- > The fact that there's a Highway to Hell but only a Stairway to Heaven says > a lot about anticipated traffic volume. > > >-Original Message- > >From: sqlite-users On > >Behalf Of Bart Smissaert > >Sent: Saturday, 12 October, 2019 08:09 > >To: General Discussion of SQLite Database >us...@mailinglists.sqlite.org> > >Subject: [sqlite] Standard deviation last x entries > > > >How do I get the standard deviation of the last 4 entries (there could be > >less than 4) of an integer column grouped by an integer ID entry in > >another > >column in the same table. > > > >So data could be like this: > > > >ID xValue > > > >1 130 > >1 120 > >1 140 > >1 100 > >1 110 > >2 140 > >2 130 > >2 150 > > > >I know that SQLite doesn't have the SQRT function (and I can't make a UDF > >as I am doing this in an Android phone app) but I can do this last step > >in > >code. > > > >RBS > >___ > >sqlite-users mailing list > >sqlite-users@mailinglists.sqlite.org > >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Standard deviation last x entries
#!python3 from __future__ import absolute_import, division, print_function, unicode_literals import apsw db = apsw.Connection() db.executescript(""" create table x ( value integer not null ); insert into x values (120), (130), (140), (110); """) # Method 1: Using partial Running calc sumx = 0.0 sumx2 = 0.0 count = 0 for row in db.execute('select value from x'): sumx += row.value sumx2 += row.value ** 2 count += 1 sd1 = (sumx2 / count - (sumx / count) ** 2) ** 0.5 # Method 2: Using double retrieval brute force calc avgx = 0.0 avgm = 0.0 count = 0 for row in db.execute('select avg(value) as avg from x'): avgx = row.avg for row in db.execute('select value from x'): avgm += (row.value - avgx) ** 2 count += 1 sd2 = (avgm / count) ** 0.5 # Method 3: Using my extension which does full running calc for row in db.execute('select stdevp(value) as stdev from x'): sd3 = row.stdev # Method 4: Using mostly pure SQL for row in db.execute('select count(value) as count, sum(value) as sumx, sum(value*value) as sumx2 from x'): sd4 = (row.sumx2 / row.count - (row.sumx / row.count) ** 2) ** 0.5 # Method 5: Using even more calcs in SQL for row in db.execute('''select sumx2 / count - (sumx / count) * (sumx / count) as var from (select count(value) as count, sum(value) as sumx, sum(value*value) as sumx2 from x)'''): sd5 = row.var ** 0.5 print(sd1, sd2, sd3, sd4, sd5) => 11.180339887498949 11.180339887498949 11.180339887498949 11.180339887498949 11.180339887498949 You could even calculate the square root in a recursive CTE if you wanted and not have your application do anything at all other than submit the query. However, Method 1 is likely the most efficient (and easiest) to do assuming that you do not have an extension which calculates the aggregate, and is also likely to execute the fastest. -- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. >-Original Message- >From: sqlite-users On >Behalf Of Richard Damon >Sent: Saturday, 12 October, 2019 09:33 >To: sqlite-users@mailinglists.sqlite.org >Subject: Re: [sqlite] Standard deviation last x entries > >On 10/12/19 11:23 AM, Richard Damon wrote: >> On 10/12/19 10:08 AM, Bart Smissaert wrote: >>> How do I get the standard deviation of the last 4 entries (there could >be >>> less than 4) of an integer column grouped by an integer ID entry in >another >>> column in the same table. >>> >>> So data could be like this: >>> >>> ID xValue >>> >>> 1 130 >>> 1 120 >>> 1 140 >>> 1 100 >>> 1 110 >>> 2 140 >>> 2 130 >>> 2 150 >>> >>> I know that SQLite doesn't have the SQRT function (and I can't make a >UDF >>> as I am doing this in an Android phone app) but I can do this last >step in >>> code. >>> >>> RBS >> I may not be the best at forming the SQL to do this, but here is the >> general idea I would use. First use a query with ORDER BY DECR and >LIMIT >> to get the last 4 items, and select the value, and a computer column of >> value2 = value * value. >> >> Then over this result, do a GROUP BY to compute N = count(value), S = >> sum(value), S2 =sum(value2) >> >> You can then compute the variance (standard deviation squared) from the >> equation >> >> Variance = S2 / N - (S / N) * (S / N) >> >> This equation assumes that you are working with the full population and >> not just a sample, if you data is to be considered a sample you need to >> make a small adjustment in the formula to >> >> Estimated Variance of Popultion = S2 / (N-1) - S * S / (N * (N-1)) >> >> >(Sorry about the messed up subject, reposting to fix it) > >-- >Richard Damon > >___ >sqlite-users mailing list >sqlite-users@mailinglists.sqlite.org >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Standard deviation last x entries
On 10/12/19 11:23 AM, Richard Damon wrote: > On 10/12/19 10:08 AM, Bart Smissaert wrote: >> How do I get the standard deviation of the last 4 entries (there could be >> less than 4) of an integer column grouped by an integer ID entry in another >> column in the same table. >> >> So data could be like this: >> >> ID xValue >> >> 1 130 >> 1 120 >> 1 140 >> 1 100 >> 1 110 >> 2 140 >> 2 130 >> 2 150 >> >> I know that SQLite doesn't have the SQRT function (and I can't make a UDF >> as I am doing this in an Android phone app) but I can do this last step in >> code. >> >> RBS > I may not be the best at forming the SQL to do this, but here is the > general idea I would use. First use a query with ORDER BY DECR and LIMIT > to get the last 4 items, and select the value, and a computer column of > value2 = value * value. > > Then over this result, do a GROUP BY to compute N = count(value), S = > sum(value), S2 =sum(value2) > > You can then compute the variance (standard deviation squared) from the > equation > > Variance = S2 / N - (S / N) * (S / N) > > This equation assumes that you are working with the full population and > not just a sample, if you data is to be considered a sample you need to > make a small adjustment in the formula to > > Estimated Variance of Popultion = S2 / (N-1) - S * S / (N * (N-1)) > > (Sorry about the messed up subject, reposting to fix it) -- Richard Damon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] /
On 10/12/19 10:08 AM, Bart Smissaert wrote: > How do I get the standard deviation of the last 4 entries (there could be > less than 4) of an integer column grouped by an integer ID entry in another > column in the same table. > > So data could be like this: > > ID xValue > > 1 130 > 1 120 > 1 140 > 1 100 > 1 110 > 2 140 > 2 130 > 2 150 > > I know that SQLite doesn't have the SQRT function (and I can't make a UDF > as I am doing this in an Android phone app) but I can do this last step in > code. > > RBS I may not be the best at forming the SQL to do this, but here is the general idea I would use. First use a query with ORDER BY DECR and LIMIT to get the last 4 items, and select the value, and a computer column of value2 = value * value. Then over this result, do a GROUP BY to compute N = count(value), S = sum(value), S2 =sum(value2) You can then compute the variance (standard deviation squared) from the equation Variance = S2 / N - (S / N) * (S / N) This equation assumes that you are working with the full population and not just a sample, if you data is to be considered a sample you need to make a small adjustment in the formula to Estimated Variance of Popultion = S2 / (N-1) - S * S / (N * (N-1)) -- Richard Damon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Standard deviation last x entries
The Standard Deviation of the Population is the Square Root of the mean of the second order differences. For your input values, you calculate the mean. Then you calculate the mean of the square of the difference between each value and the mean of the values. Then you take the square root of that. You can also compute it using a running calculation (generally more accurate on IEEE754 hardware). So just retrieve the values you are interested in and do the calculations in your application. That is, if you only want FOUR values, then write a query that returns only those FOUR values, and use those to computer the answer. While you can do all the calculations (including the Square Root) in SQL, it is likely to be horrendously slow. However, before you can do *ANYTHING* else, you need to be able to demonstrate that you can write a select statement that retrieves the values you want to work on, only the values you want to work on, and nothing but the values you want to work on. Having 47,000 values and sauing that you would like to find the standard deviation of some set of four of them is entertaining, but not useful in any way. -- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. >-Original Message- >From: sqlite-users On >Behalf Of Bart Smissaert >Sent: Saturday, 12 October, 2019 08:09 >To: General Discussion of SQLite Database us...@mailinglists.sqlite.org> >Subject: [sqlite] Standard deviation last x entries > >How do I get the standard deviation of the last 4 entries (there could be >less than 4) of an integer column grouped by an integer ID entry in >another >column in the same table. > >So data could be like this: > >ID xValue > >1 130 >1 120 >1 140 >1 100 >1 110 >2 140 >2 130 >2 150 > >I know that SQLite doesn't have the SQRT function (and I can't make a UDF >as I am doing this in an Android phone app) but I can do this last step >in >code. > >RBS >___ >sqlite-users mailing list >sqlite-users@mailinglists.sqlite.org >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Standard deviation last x entries
Sorry, I forgot to tell that. It is date column with an integer number. ID xValue xDate 1 130 40123 1 120 41232 1 140 40582 1 100 40888 1 110 42541 2 140 41225 2 130 41589 2 150 40872 RBS On Sat, Oct 12, 2019 at 3:18 PM Igor Tandetnik wrote: > On 10/12/2019 10:08 AM, Bart Smissaert wrote: > > How do I get the standard deviation of the last 4 entries (there could be > > less than 4) of an integer column grouped by an integer ID entry in > another > > column in the same table. > > What do you mean by "last 4 entries"? What determines the order? How does > one tell which entry is the first and which is the last? > -- > Igor Tandetnik > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Standard deviation last x entries
On 10/12/2019 10:08 AM, Bart Smissaert wrote: How do I get the standard deviation of the last 4 entries (there could be less than 4) of an integer column grouped by an integer ID entry in another column in the same table. What do you mean by "last 4 entries"? What determines the order? How does one tell which entry is the first and which is the last? -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Standard deviation last x entries
How do I get the standard deviation of the last 4 entries (there could be less than 4) of an integer column grouped by an integer ID entry in another column in the same table. So data could be like this: ID xValue 1 130 1 120 1 140 1 100 1 110 2 140 2 130 2 150 I know that SQLite doesn't have the SQRT function (and I can't make a UDF as I am doing this in an Android phone app) but I can do this last step in code. RBS ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users