Oh, one other tip: make sure you VACUUM <https://sqlite.org/lang_vacuum.html> both databases before comparing sizes.
On Tue, Oct 13, 2020 at 6:27 AM Tom Keffer <[email protected]> wrote: > How are you running the two benchmarks? In the same process? SQLite caches > pages, so the second query should be much faster. > > Try reversing the order. > > -tk > > On Tue, Oct 13, 2020 at 6:18 AM d k <[email protected]> wrote: > >> I never used SQLite before and my previous comments were based on mysql. >> I decided to do an experiment as it was raining yesterday. Read some of the >> SQLite documentation and installed it. Impressed by how lightweight it is. >> >> I have to agree with everyone that the savings from removing all the >> unused columns in sqlite is small. In my case 19.3% or from 762,440 kb to >> 615,352 kb = 147,088 kb with 5,480,150 rows. Not insignificant but it is a >> small difference. >> >> Since I had the two SQLite database files I decided to try a query. >> Expected a large time penalty from all the null columns. "SELECT * from >> archive WHERE datetime BETWEEN 1490563860 and 1546298910;" on both files. >> I'm doing this on machine other than what I run weewx on, it's much faster. >> >> When executed against the unmodified schema: >> Execution finished without errors. >> Result: 928677 rows returned in 729ms... >> >> When executed against the modified schema: >> Execution finished without errors. >> Result: 928677 rows returned in 127ms... >> >> That is without additional columns I need, which drove this to begin >> with. When they are added in as they are null for all of these records the >> execution goes to about 429ms for the modified schema. >> >> A savings of 602ms or a query that runs in 82.6% less time is, well huge >> when it's a simple select statement. >> >> I haven't tried replacing all the nulls with some value yet. Perhaps '-1' >> or some other value those columns would never hold. But the next rainy day >> I just might. >> >> If you aren't doing a lot of queries on the data this also isn't all that >> big a deal. Which was the reason for installing mysql. The nulls are the >> reason you can't normalize the data and have it work acceptable. >> >> On Sunday, October 11, 2020 at 9:09:23 PM UTC-4 bdf0506 wrote: >> >>> I've found that deleting columns out of the schema of the archive table >>> does more hard than good. I had a 3.x installation for a while, and i had >>> trimmed many columns that I didn't need, and also renamed some. While it >>> would work great for general manual querying of the data, many skins would >>> throw weird errors, mostly when they would expect schemas that weren't >>> there. I moved to 4.x recently, and decided to ditch my custom schema and >>> go with a fresh install with the extended schema. Exporting data from the >>> old column names to then importing to the new column names proved to be >>> trickier than I would have hoped, but eventually got it working and the had >>> to go and manually update many skins I would use. Overall it was a PITA and >>> I wish I would have just stuck with the original schema from the beginning. >>> >>> tl;dr - stick with the default schemas to save you a headache! >>> >>> On Saturday, October 10, 2020 at 12:40:53 AM UTC-4 [email protected] >>> wrote: >>> >>>> BTW I'm in awe of what you've done with this. It's an amazing effort >>>> and I really like what you've done. It works better than many comercial >>>> apps I've had to use. >>>> >>>> On Friday, October 9, 2020 at 6:17:00 PM UTC-4 [email protected] wrote: >>>> >>>>> Sorry. I should have prefaced my comments that they pertain to SQLite. >>>>> I have no experience with MySQL. >>>>> >>>>> On Fri, Oct 9, 2020 at 8:52 AM d k <[email protected]> wrote: >>>>> >>>>>> The size of the indexes on the archive table are <51mb in both cases. >>>>>> There is no difference here. I totally agree. >>>>>> >>>>>> I think the reason you don't see a difference in size is because of >>>>>> how null values are stored, I think in 1 byte but haven't found a >>>>>> reference. So yes even if you remove 20 unused types you only remove 20 >>>>>> bytes which as you point out is nothing. But the extra columns still >>>>>> affect >>>>>> read and write performance. Write isn't a big big deal as we don't do >>>>>> lots >>>>>> of writes anyway. But we might do lots of reads depending on what we are >>>>>> doing with our station data and we probably are all running this on >>>>>> inexpensive slow hardware. In my case a RPi but a new one which isn't all >>>>>> that slow other than if you're comparing it to something else that's new. >>>>>> But, for instance it still cut the time to make the daily summiers by >>>>>> more >>>>>> than half. Again not like we do that often so not a huge deal. >>>>>> >>>>>> This is where the real change probably came from. I also changed the >>>>>> data types of the observations from double (8 bytes) to float (4 bytes). >>>>>> Mysql made the sqllite data type doubles instead of floats. I don't have >>>>>> REAL_AS_FLOAT set and that's my fault. >>>>>> >>>>>> I am going to move to FLOAT(n) and set the precision on the columns >>>>>> next which won't change the row length, as the columns are all still 4 >>>>>> bytes, but to make things easier when I use other applications against >>>>>> this >>>>>> data set. >>>>>> >>>>>> In my case the length of the data went from ~1.1 gb to <650mb in this >>>>>> case. It also reduced the size of the binlogs, which get purged anyway. >>>>>> It >>>>>> also reduced the size of the *ib* files. It cut the time to and size of >>>>>> dumping the table almost by half, I haven't tried restoring yet but >>>>>> expect >>>>>> the same. Queries run faster. >>>>>> >>>>>> In my opinion there are other reasons to trim the schema to fit your >>>>>> needs other than the size of the data file. But yes it's more work and >>>>>> that >>>>>> depends on how you use your data if it's worth it or not. Obviously I >>>>>> think >>>>>> it's worth it and YMMV. >>>>>> >>>>>> -dk >>>>>> On Friday, October 9, 2020 at 9:01:49 AM UTC-4 [email protected] >>>>>> wrote: >>>>>> >>>>>>> Trimming the schema does not make as big a difference in database >>>>>>> size as you might think. >>>>>>> >>>>>>> For example, using my own database of 1.4M rows, trimming the schema >>>>>>> from 48 observation types to 27, reduces the size from 268MB to 201MB. >>>>>>> >>>>>>> The reason is that most of the space is taken up by the indexes, not >>>>>>> the column data. >>>>>>> >>>>>>> -tk >>>>>>> >>>>>>> On Thu, Oct 8, 2020 at 8:02 PM d k <[email protected]> wrote: >>>>>>> >>>>>>>> Yup.. I just found that and was about to report back I was trying >>>>>>>> it that was it. Just restarted the test system to see if it went away. >>>>>>>> I >>>>>>>> think I got rid of all of them now. >>>>>>>> >>>>>>>> Gary you are the best. Thanks so much. >>>>>>>> >>>>>>>> On Thursday, October 8, 2020 at 10:54:27 PM UTC-4 gjr80 wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> First up, thank you for not posting images of text, it’s makes >>>>>>>>> reading/searching logs a real pain. >>>>>>>>> >>>>>>>>> The error is due to a skin trying to generate a plot that involves >>>>>>>>> extraTemp1 and from the short log extract I would guess that this >>>>>>>>> is from the Seasons skin. If you look in the Seasons skin config >>>>>>>>> file (skins/Seasons/skin.conf) under [ImageGenerator] you will find >>>>>>>>> the >>>>>>>>> daytemp, weektemp, monthtemp and yeartemp plots use extraTemp1 (and >>>>>>>>> extraTemp2 and extraTemp3). Easiest fix is to comment out those >>>>>>>>> plots, eg: >>>>>>>>> >>>>>>>>> # [[[daytemp]]] >>>>>>>>> # yscale = None, None, 0.5 >>>>>>>>> # [[[[extraTemp1]]]] >>>>>>>>> # [[[[extraTemp2]]]] >>>>>>>>> # [[[[extraTemp3]]]] >>>>>>>>> >>>>>>>>> Save skin.conf and the error should go away on the next report >>>>>>>>> cycle. >>>>>>>>> >>>>>>>>> Gary >>>>>>>>> On Friday, 9 October 2020 at 12:29:14 UTC+10 [email protected] >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> I tried to post this as an image but it doesn't show. So here is >>>>>>>>>> the text. >>>>>>>>>> >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> Caught unrecoverable exception in generator >>>>>>>>>> 'weewx.imagegenerator.ImageGenerator' >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** extraTemp1 >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** Traceback (most recent call last): >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** File "/usr/share/weewx/weewx/reportengine.py", line >>>>>>>>>> 197, in >>>>>>>>>> run >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** obj.start() >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** File "/usr/share/weewx/weewx/reportengine.py", line >>>>>>>>>> 280, in >>>>>>>>>> start >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** self.run() >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** File "/usr/share/weewx/weewx/imagegenerator.py", >>>>>>>>>> line 41, >>>>>>>>>> in run >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** self.genImages(self.gen_ts) >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** File "/usr/share/weewx/weewx/imagegenerator.py", >>>>>>>>>> line 176, >>>>>>>>>> in genImages >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** start_vec_t, stop_vec_t ,data_vec_t = >>>>>>>>>> weewx.xtypes.get_series(var_type, >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** File "/usr/share/weewx/weewx/xtypes.py", line 91, in >>>>>>>>>> get_series >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** raise weewx.UnknownType(obs_type) >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** weewx.UnknownType: extraTemp1 >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine: >>>>>>>>>> **** Generator terminated >>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] DEBUG weewx.reportengine: >>>>>>>>>> Report 'SmartphoneReport' not enabled. Skipping. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thursday, October 8, 2020 at 9:39:14 PM UTC-4 Duane Kerzic >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> Thanks for all the help you provided last time around. Thanks in >>>>>>>>>>> advance this time for your help. >>>>>>>>>>> >>>>>>>>>>> I wanted to clean up weewx.archive table and make it a bit >>>>>>>>>>> smaller. So I deleted the columns I don't think I'll ever use. But >>>>>>>>>>> now I'm >>>>>>>>>>> getting this in the system log. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm guessing that extraTemp1 is coded into one of those files >>>>>>>>>>> but I haven't looked to find out yet. >>>>>>>>>>> >>>>>>>>>>> I've shortened the average row length of the archive table to >>>>>>>>>>> 126 from 217 bytes. Huge difference when you have 10 years of data. >>>>>>>>>>> >>>>>>>>>>> -dk >>>>>>>>>>> >>>>>>>>>> -- >>>>>>>> You received this message because you are subscribed to the Google >>>>>>>> Groups "weewx-user" group. >>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>> send an email to [email protected]. >>>>>>>> To view this discussion on the web visit >>>>>>>> https://groups.google.com/d/msgid/weewx-user/02d0a56e-c9fc-4e48-a74a-cdb6291474bbn%40googlegroups.com >>>>>>>> <https://groups.google.com/d/msgid/weewx-user/02d0a56e-c9fc-4e48-a74a-cdb6291474bbn%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>>> . >>>>>>>> >>>>>>> -- >>>>>> You received this message because you are subscribed to the Google >>>>>> Groups "weewx-user" group. >>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>> send an email to [email protected]. >>>>>> >>>>> To view this discussion on the web visit >>>>>> https://groups.google.com/d/msgid/weewx-user/853cbe79-ee93-49b3-90c3-6a9a02dab1b7n%40googlegroups.com >>>>>> <https://groups.google.com/d/msgid/weewx-user/853cbe79-ee93-49b3-90c3-6a9a02dab1b7n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>> . >>>>>> >>>>> -- >> You received this message because you are subscribed to the Google Groups >> "weewx-user" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/weewx-user/77e5e730-9cf0-4dbb-99c1-568d7fd32caen%40googlegroups.com >> <https://groups.google.com/d/msgid/weewx-user/77e5e730-9cf0-4dbb-99c1-568d7fd32caen%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "weewx-user" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-user/CAPq0zEBPz47mkg%3DEYSc3kqyDdrDfWhFPMrPsmy9MNayU4zTwiw%40mail.gmail.com.
