Oh, one other tip: make sure you VACUUM
<https://sqlite.org/lang_vacuum.html> both databases before comparing sizes.

On Tue, Oct 13, 2020 at 6:27 AM Tom Keffer <[email protected]> wrote:

> How are you running the two benchmarks? In the same process? SQLite caches
> pages, so the second query should be much faster.
>
> Try reversing the order.
>
> -tk
>
> On Tue, Oct 13, 2020 at 6:18 AM d k <[email protected]> wrote:
>
>> I never used SQLite before and my previous comments were based on mysql.
>> I decided to do an experiment as it was raining yesterday. Read some of the
>> SQLite documentation and installed it. Impressed by how lightweight it is.
>>
>> I have to agree with everyone that the savings from removing all the
>> unused columns in sqlite is small. In my case 19.3% or from 762,440 kb to
>> 615,352 kb = 147,088 kb with 5,480,150 rows. Not insignificant but it is a
>> small difference.
>>
>> Since I had the two SQLite database files I decided to try a query.
>> Expected a large time penalty from all the null columns. "SELECT * from
>> archive WHERE datetime BETWEEN 1490563860 and 1546298910;" on both files.
>> I'm doing this on machine other than what I run weewx on, it's much faster.
>>
>> When executed against the unmodified schema:
>> Execution finished without errors.
>> Result: 928677 rows returned in 729ms...
>>
>> When executed against the modified schema:
>> Execution finished without errors.
>> Result: 928677 rows returned in 127ms...
>>
>> That is without additional columns I need, which drove this to begin
>> with. When they are added in as they are null for all of these records the
>> execution goes to about 429ms for the modified schema.
>>
>>  A savings of 602ms or a query that runs in 82.6% less time is, well huge
>> when it's a simple select statement.
>>
>> I haven't tried replacing all the nulls with some value yet. Perhaps '-1'
>> or some other value those columns would never hold. But the next rainy day
>> I just might.
>>
>> If you aren't doing a lot of queries on the data this also isn't all that
>> big a deal. Which was the reason for installing mysql. The nulls are the
>> reason you can't normalize the data and have it work acceptable.
>>
>> On Sunday, October 11, 2020 at 9:09:23 PM UTC-4 bdf0506 wrote:
>>
>>> I've found that deleting columns out of the schema of the archive table
>>> does more hard than good. I had a 3.x installation for a while, and i had
>>> trimmed many columns that I didn't need, and also renamed some. While it
>>> would work great for general manual querying of the data, many skins would
>>> throw weird errors, mostly when they would expect schemas that weren't
>>> there. I moved to 4.x recently, and decided to ditch my custom schema and
>>> go with a fresh install with the extended schema. Exporting data from the
>>> old column names to then importing to the new column names proved to be
>>> trickier than I would have hoped, but eventually got it working and the had
>>> to go and manually update many skins I would use. Overall it was a PITA and
>>> I wish I would have just stuck with the original schema from the beginning.
>>>
>>> tl;dr - stick with the default schemas to save you a headache!
>>>
>>> On Saturday, October 10, 2020 at 12:40:53 AM UTC-4 [email protected]
>>> wrote:
>>>
>>>> BTW I'm in awe of what you've done with this. It's an amazing effort
>>>> and I really like what you've done. It works better than many comercial
>>>> apps I've had to use.
>>>>
>>>> On Friday, October 9, 2020 at 6:17:00 PM UTC-4 [email protected] wrote:
>>>>
>>>>> Sorry. I should have prefaced my comments that they pertain to SQLite.
>>>>> I have no experience with MySQL.
>>>>>
>>>>> On Fri, Oct 9, 2020 at 8:52 AM d k <[email protected]> wrote:
>>>>>
>>>>>> The size of the indexes on the archive table are <51mb in both cases.
>>>>>> There is no difference here. I totally agree.
>>>>>>
>>>>>> I think the reason you don't see a difference in size is because of
>>>>>> how null values are stored, I think in 1 byte but haven't found a
>>>>>> reference. So yes even if you remove 20 unused types you only remove 20
>>>>>> bytes which as you point out is nothing. But the extra columns still 
>>>>>> affect
>>>>>> read and write performance. Write isn't a big big deal as we don't do 
>>>>>> lots
>>>>>> of writes anyway. But we might do lots of reads depending on what we are
>>>>>> doing with our station data and we probably are all running this on
>>>>>> inexpensive slow hardware. In my case a RPi but a new one which isn't all
>>>>>> that slow other than if you're comparing it to something else that's new.
>>>>>> But, for instance it still cut the time to make the daily summiers by 
>>>>>> more
>>>>>> than half. Again not like we do that often so not a huge deal.
>>>>>>
>>>>>> This is where the real change probably came from. I also changed the
>>>>>> data types of the observations from double (8 bytes) to float (4 bytes).
>>>>>> Mysql made the sqllite data type doubles instead of floats. I don't have
>>>>>> REAL_AS_FLOAT set and that's my fault.
>>>>>>
>>>>>> I am going to move to FLOAT(n) and set the precision on the columns
>>>>>> next which won't change the row length, as the columns are all still 4
>>>>>> bytes, but to make things easier when I use other applications against 
>>>>>> this
>>>>>> data set.
>>>>>>
>>>>>> In my case the length of the data went from ~1.1 gb to <650mb in this
>>>>>> case. It also reduced the size of the binlogs, which get purged anyway. 
>>>>>> It
>>>>>> also reduced the size of the *ib* files. It cut the time to and size of
>>>>>> dumping the table almost by half, I haven't tried restoring yet but 
>>>>>> expect
>>>>>> the same. Queries run faster.
>>>>>>
>>>>>> In my opinion there are other reasons to trim the schema to fit your
>>>>>> needs other than the size of the data file. But yes it's more work and 
>>>>>> that
>>>>>> depends on how you use your data if it's worth it or not. Obviously I 
>>>>>> think
>>>>>> it's worth it and YMMV.
>>>>>>
>>>>>> -dk
>>>>>> On Friday, October 9, 2020 at 9:01:49 AM UTC-4 [email protected]
>>>>>> wrote:
>>>>>>
>>>>>>> Trimming the schema  does not make as big a difference in database
>>>>>>> size as you might think.
>>>>>>>
>>>>>>> For example, using my own database of 1.4M rows, trimming the schema
>>>>>>> from 48 observation types to 27, reduces the size from 268MB to 201MB.
>>>>>>>
>>>>>>> The reason is that most of the space is taken up by the indexes, not
>>>>>>> the column data.
>>>>>>>
>>>>>>> -tk
>>>>>>>
>>>>>>> On Thu, Oct 8, 2020 at 8:02 PM d k <[email protected]> wrote:
>>>>>>>
>>>>>>>> Yup.. I just found that and was about to report back I was trying
>>>>>>>> it that was it. Just restarted the test system to see if it went away. 
>>>>>>>> I
>>>>>>>> think I got rid of all of them now.
>>>>>>>>
>>>>>>>> Gary you are the best.  Thanks so much.
>>>>>>>>
>>>>>>>> On Thursday, October 8, 2020 at 10:54:27 PM UTC-4 gjr80 wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> First up, thank you for not posting images of text, it’s makes
>>>>>>>>> reading/searching logs a real pain.
>>>>>>>>>
>>>>>>>>> The error is due to a skin trying to generate a plot that involves
>>>>>>>>> extraTemp1 and from the short log extract I would guess that this
>>>>>>>>> is from the Seasons skin. If you look in the Seasons skin config
>>>>>>>>> file (skins/Seasons/skin.conf) under [ImageGenerator] you will find 
>>>>>>>>> the
>>>>>>>>> daytemp, weektemp, monthtemp and yeartemp plots use extraTemp1 (and
>>>>>>>>> extraTemp2 and extraTemp3). Easiest fix is to comment out those 
>>>>>>>>> plots, eg:
>>>>>>>>>
>>>>>>>>> #      [[[daytemp]]]
>>>>>>>>> #           yscale = None, None, 0.5
>>>>>>>>> #           [[[[extraTemp1]]]]
>>>>>>>>> #           [[[[extraTemp2]]]]
>>>>>>>>> #           [[[[extraTemp3]]]]
>>>>>>>>>
>>>>>>>>> Save skin.conf and the error should go away on the next report
>>>>>>>>> cycle.
>>>>>>>>>
>>>>>>>>> Gary
>>>>>>>>> On Friday, 9 October 2020 at 12:29:14 UTC+10 [email protected]
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I tried to post this as an image but it doesn't show. So here is
>>>>>>>>>> the text.
>>>>>>>>>>
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>> Caught unrecoverable exception in generator
>>>>>>>>>> 'weewx.imagegenerator.ImageGenerator'
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****  extraTemp1
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****  Traceback (most recent call last):
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****    File "/usr/share/weewx/weewx/reportengine.py", line 
>>>>>>>>>> 197, in
>>>>>>>>>> run
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****      obj.start()
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****    File "/usr/share/weewx/weewx/reportengine.py", line 
>>>>>>>>>> 280, in
>>>>>>>>>> start
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****      self.run()
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****    File "/usr/share/weewx/weewx/imagegenerator.py", 
>>>>>>>>>> line 41,
>>>>>>>>>> in run
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****      self.genImages(self.gen_ts)
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****    File "/usr/share/weewx/weewx/imagegenerator.py", 
>>>>>>>>>> line 176,
>>>>>>>>>> in genImages
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****      start_vec_t, stop_vec_t ,data_vec_t =
>>>>>>>>>> weewx.xtypes.get_series(var_type,
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****    File "/usr/share/weewx/weewx/xtypes.py", line 91, in
>>>>>>>>>> get_series
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****      raise weewx.UnknownType(obs_type)
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****  weewx.UnknownType: extraTemp1
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] ERROR weewx.reportengine:
>>>>>>>>>>         ****  Generator terminated
>>>>>>>>>> Oct  8 20:03:19 prometis weewx[271870] DEBUG weewx.reportengine:
>>>>>>>>>> Report 'SmartphoneReport' not enabled. Skipping.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thursday, October 8, 2020 at 9:39:14 PM UTC-4 Duane Kerzic
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for all the help you provided last time around. Thanks in
>>>>>>>>>>> advance this time for your help.
>>>>>>>>>>>
>>>>>>>>>>> I wanted to clean up weewx.archive table and make it a bit
>>>>>>>>>>> smaller. So I deleted the columns I don't think I'll ever use. But 
>>>>>>>>>>> now I'm
>>>>>>>>>>> getting this in the system log.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I'm guessing that extraTemp1 is coded into one of those files
>>>>>>>>>>> but I haven't looked to find out yet.
>>>>>>>>>>>
>>>>>>>>>>> I've shortened the average row length of the archive table to
>>>>>>>>>>> 126 from 217 bytes. Huge difference when you have 10 years of data.
>>>>>>>>>>>
>>>>>>>>>>> -dk
>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "weewx-user" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/weewx-user/02d0a56e-c9fc-4e48-a74a-cdb6291474bbn%40googlegroups.com
>>>>>>>> <https://groups.google.com/d/msgid/weewx-user/02d0a56e-c9fc-4e48-a74a-cdb6291474bbn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>
>>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "weewx-user" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>>
>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/weewx-user/853cbe79-ee93-49b3-90c3-6a9a02dab1b7n%40googlegroups.com
>>>>>> <https://groups.google.com/d/msgid/weewx-user/853cbe79-ee93-49b3-90c3-6a9a02dab1b7n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>> --
>> You received this message because you are subscribed to the Google Groups
>> "weewx-user" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/weewx-user/77e5e730-9cf0-4dbb-99c1-568d7fd32caen%40googlegroups.com
>> <https://groups.google.com/d/msgid/weewx-user/77e5e730-9cf0-4dbb-99c1-568d7fd32caen%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/weewx-user/CAPq0zEBPz47mkg%3DEYSc3kqyDdrDfWhFPMrPsmy9MNayU4zTwiw%40mail.gmail.com.

Reply via email to