All vacuumed up.
.../weewx_archive.db 770,588,672
.../weewx_elec_archive.db 630,120,488
Still 140,468 kb smaller or 18.2% Again today this isn't an issue. Rather
small amount of space when you can buy a HD for ~$0.04/gb or less and an SD
card for ~$0.4/gb or so.
This is a long way from when I started and a mb of HD was $2.79 ($2,790/gb)
and had an access time of > 20ms. I guess I old habits are hard to break.
On Tuesday, October 13, 2020 at 10:44:20 AM UTC-4 d k wrote:
> Maybe this belongs in -development rather than -user?
>
> The whole reason for this exercise is that I want to change the schema
> anyway to add other data from my ted6000 and perhaps a couple of other
> items like a sensaphone web600 in addition to the weather envoy. So I need
> columns that aren't in the wview_extended schema. I've figured out how to
> fix the reports and having the ability to do that makes this a lot easier.
>
> And I'm still learning about what tk did when he built this. So it's
> wonderful that he's an email away. You don't get that often.
>
> On Tuesday, October 13, 2020 at 10:27:18 AM UTC-4 d k wrote:
>
>> Thought I vacuumed them but doing it again to be sure. They are separate
>> files and opened in separate processes. I didn't use weewx to create them.
>> Use the following statements. I'll see what happens. They don't all have
>> the same data types. I used integers where that works.
>>
>> For the unmodified one in file .../weewx_archive.db:
>> CREATE TABLE "archive" (
>> "dateTime" int NOT NULL UNIQUE,
>> "usUnits" int NOT NULL,
>> "interval" int NOT NULL,
>> "barometer" REAL DEFAULT NULL,
>> "pressure" REAL DEFAULT NULL,
>> "altimeter" REAL DEFAULT NULL,
>> "inTemp" REAL DEFAULT NULL,
>> "outTemp" REAL DEFAULT NULL,
>> "inHumidity" REAL DEFAULT NULL,
>> "outHumidity" REAL DEFAULT NULL,
>> "windSpeed" REAL DEFAULT NULL,
>> "windDir" REAL DEFAULT NULL,
>> "windGust" REAL DEFAULT NULL,
>> "windGustDir" REAL DEFAULT NULL,
>> "rainRate" REAL DEFAULT NULL,
>> "rain" REAL DEFAULT NULL,
>> "dewpoint" REAL DEFAULT NULL,
>> "windchill" REAL DEFAULT NULL,
>> "heatindex" REAL DEFAULT NULL,
>> "ET" REAL REAL,
>> "radiation" REAL DEFAULT NULL,
>> "UV" REAL REAL,
>> "extraTemp1" REAL DEFAULT NULL,
>> "extraTemp2" REAL DEFAULT NULL,
>> "extraTemp3" REAL DEFAULT NULL,
>> "soilTemp1" REAL DEFAULT NULL,
>> "soilTemp2" REAL DEFAULT NULL,
>> "soilTemp3" REAL DEFAULT NULL,
>> "soilTemp4" REAL DEFAULT NULL,
>> "leafTemp1" REAL DEFAULT NULL,
>> "leafTemp2" REAL DEFAULT NULL,
>> "extraHumid1" REAL DEFAULT NULL,
>> "extraHumid2" REAL DEFAULT NULL,
>> "soilMoist1" REAL DEFAULT NULL,
>> "soilMoist2" REAL DEFAULT NULL,
>> "soilMoist3" REAL DEFAULT NULL,
>> "soilMoist4" REAL DEFAULT NULL,
>> "leafWet1" REAL DEFAULT NULL,
>> "leafWet2" REAL DEFAULT NULL,
>> "rxCheckPercent" REAL DEFAULT NULL,
>> "txBatteryStatus" REAL DEFAULT NULL,
>> "consBatteryVoltage" REAL DEFAULT NULL,
>> "hail" REAL DEFAULT NULL,
>> "hailRate" REAL DEFAULT NULL,
>> "heatingTemp" REAL DEFAULT NULL,
>> "heatingVoltage" REAL DEFAULT NULL,
>> "supplyVoltage" REAL DEFAULT NULL,
>> "referenceVoltage" REAL DEFAULT NULL,
>> "windBatteryStatus" REAL DEFAULT NULL,
>> "rainBatteryStatus" REAL DEFAULT NULL,
>> "outTempBatteryStatus" REAL DEFAULT NULL,
>> "inTempBatteryStatus" REAL DEFAULT NULL,
>> PRIMARY KEY("dateTime")
>> );
>>
>> For the modified one in file .../weewx_elec_archive.db:
>> CREATE TABLE "archive" (
>> "dateTime" INTEGER NOT NULL UNIQUE,
>> "usUnits" INTEGER DEFAULT NULL,
>> "interval" INTEGER DEFAULT NULL,
>> "barometer" REAL DEFAULT NULL,
>> "pressure" REAL DEFAULT NULL,
>> "altimeter" REAL DEFAULT NULL,
>> "inTemp" REAL DEFAULT NULL,
>> "outTemp" REAL DEFAULT NULL,
>> "inHumidity" INTEGER DEFAULT NULL,
>> "outHumidity" INTEGER DEFAULT NULL,
>> "windSpeed" REAL DEFAULT NULL,
>> "windDir" REAL DEFAULT NULL,
>> "windGust" REAL DEFAULT NULL,
>> "windGustDir" REAL DEFAULT NULL,
>> "rainRate" REAL DEFAULT NULL,
>> "rain" REAL DEFAULT NULL,
>> "dewpoint" REAL DEFAULT NULL,
>> "windchill" REAL DEFAULT NULL,
>> "heatindex" REAL DEFAULT NULL,
>> "ET" REAL DEFAULT NULL,
>> "radiation" INTEGER DEFAULT NULL,
>> "UV" REAL DEFAULT NULL,
>> "soilTemp1" REAL DEFAULT NULL,
>> "leafTemp1" REAL DEFAULT NULL,
>> "soilMoist1" INTEGER DEFAULT NULL,
>> "leafWet1" INTEGER DEFAULT NULL,
>> "rxCheckPercent" REAL DEFAULT NULL,
>> "txBatteryStatus" INTEGER DEFAULT NULL,
>> "consBatteryVoltage" REAL DEFAULT NULL,
>> PRIMARY KEY("dateTime")
>> );
>>
>>
>> On Tuesday, October 13, 2020 at 9:29:18 AM UTC-4 [email protected] wrote:
>>
>>> Oh, one other tip: make sure you VACUUM
>>> <https://sqlite.org/lang_vacuum.html> both databases before comparing
>>> sizes.
>>>
>>> On Tue, Oct 13, 2020 at 6:27 AM Tom Keffer <[email protected]> wrote:
>>>
>>>> How are you running the two benchmarks? In the same process? SQLite
>>>> caches pages, so the second query should be much faster.
>>>>
>>>> Try reversing the order.
>>>>
>>>> -tk
>>>>
>>>> On Tue, Oct 13, 2020 at 6:18 AM d k <[email protected]> wrote:
>>>>
>>>>> I never used SQLite before and my previous comments were based on
>>>>> mysql. I decided to do an experiment as it was raining yesterday. Read
>>>>> some
>>>>> of the SQLite documentation and installed it. Impressed by how
>>>>> lightweight
>>>>> it is.
>>>>>
>>>>> I have to agree with everyone that the savings from removing all the
>>>>> unused columns in sqlite is small. In my case 19.3% or from 762,440 kb to
>>>>> 615,352 kb = 147,088 kb with 5,480,150 rows. Not insignificant but it is
>>>>> a
>>>>> small difference.
>>>>>
>>>>> Since I had the two SQLite database files I decided to try a query.
>>>>> Expected a large time penalty from all the null columns. "SELECT * from
>>>>> archive WHERE datetime BETWEEN 1490563860 and 1546298910;" on both files.
>>>>> I'm doing this on machine other than what I run weewx on, it's much
>>>>> faster.
>>>>>
>>>>> When executed against the unmodified schema:
>>>>> Execution finished without errors.
>>>>> Result: 928677 rows returned in 729ms...
>>>>>
>>>>> When executed against the modified schema:
>>>>> Execution finished without errors.
>>>>> Result: 928677 rows returned in 127ms...
>>>>>
>>>>> That is without additional columns I need, which drove this to begin
>>>>> with. When they are added in as they are null for all of these records
>>>>> the
>>>>> execution goes to about 429ms for the modified schema.
>>>>>
>>>>> A savings of 602ms or a query that runs in 82.6% less time is, well
>>>>> huge when it's a simple select statement.
>>>>>
>>>>> I haven't tried replacing all the nulls with some value yet. Perhaps
>>>>> '-1' or some other value those columns would never hold. But the next
>>>>> rainy
>>>>> day I just might.
>>>>>
>>>>> If you aren't doing a lot of queries on the data this also isn't all
>>>>> that big a deal. Which was the reason for installing mysql. The nulls are
>>>>> the reason you can't normalize the data and have it work acceptable.
>>>>>
>>>>> On Sunday, October 11, 2020 at 9:09:23 PM UTC-4 bdf0506 wrote:
>>>>>
>>>>>> I've found that deleting columns out of the schema of the archive
>>>>>> table does more hard than good. I had a 3.x installation for a while,
>>>>>> and i
>>>>>> had trimmed many columns that I didn't need, and also renamed some.
>>>>>> While
>>>>>> it would work great for general manual querying of the data, many skins
>>>>>> would throw weird errors, mostly when they would expect schemas that
>>>>>> weren't there. I moved to 4.x recently, and decided to ditch my custom
>>>>>> schema and go with a fresh install with the extended schema. Exporting
>>>>>> data
>>>>>> from the old column names to then importing to the new column names
>>>>>> proved
>>>>>> to be trickier than I would have hoped, but eventually got it working
>>>>>> and
>>>>>> the had to go and manually update many skins I would use. Overall it was
>>>>>> a
>>>>>> PITA and I wish I would have just stuck with the original schema from
>>>>>> the
>>>>>> beginning.
>>>>>>
>>>>>> tl;dr - stick with the default schemas to save you a headache!
>>>>>>
>>>>>> On Saturday, October 10, 2020 at 12:40:53 AM UTC-4 [email protected]
>>>>>> wrote:
>>>>>>
>>>>>>> BTW I'm in awe of what you've done with this. It's an amazing effort
>>>>>>> and I really like what you've done. It works better than many comercial
>>>>>>> apps I've had to use.
>>>>>>>
>>>>>>> On Friday, October 9, 2020 at 6:17:00 PM UTC-4 [email protected]
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Sorry. I should have prefaced my comments that they pertain to
>>>>>>>> SQLite. I have no experience with MySQL.
>>>>>>>>
>>>>>>>> On Fri, Oct 9, 2020 at 8:52 AM d k <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> The size of the indexes on the archive table are <51mb in both
>>>>>>>>> cases. There is no difference here. I totally agree.
>>>>>>>>>
>>>>>>>>> I think the reason you don't see a difference in size is because
>>>>>>>>> of how null values are stored, I think in 1 byte but haven't found a
>>>>>>>>> reference. So yes even if you remove 20 unused types you only remove
>>>>>>>>> 20
>>>>>>>>> bytes which as you point out is nothing. But the extra columns still
>>>>>>>>> affect
>>>>>>>>> read and write performance. Write isn't a big big deal as we don't do
>>>>>>>>> lots
>>>>>>>>> of writes anyway. But we might do lots of reads depending on what we
>>>>>>>>> are
>>>>>>>>> doing with our station data and we probably are all running this on
>>>>>>>>> inexpensive slow hardware. In my case a RPi but a new one which isn't
>>>>>>>>> all
>>>>>>>>> that slow other than if you're comparing it to something else that's
>>>>>>>>> new.
>>>>>>>>> But, for instance it still cut the time to make the daily summiers by
>>>>>>>>> more
>>>>>>>>> than half. Again not like we do that often so not a huge deal.
>>>>>>>>>
>>>>>>>>> This is where the real change probably came from. I also changed
>>>>>>>>> the data types of the observations from double (8 bytes) to float (4
>>>>>>>>> bytes). Mysql made the sqllite data type doubles instead of floats. I
>>>>>>>>> don't
>>>>>>>>> have REAL_AS_FLOAT set and that's my fault.
>>>>>>>>>
>>>>>>>>> I am going to move to FLOAT(n) and set the precision on the
>>>>>>>>> columns next which won't change the row length, as the columns are
>>>>>>>>> all
>>>>>>>>> still 4 bytes, but to make things easier when I use other
>>>>>>>>> applications
>>>>>>>>> against this data set.
>>>>>>>>>
>>>>>>>>> In my case the length of the data went from ~1.1 gb to <650mb in
>>>>>>>>> this case. It also reduced the size of the binlogs, which get purged
>>>>>>>>> anyway. It also reduced the size of the *ib* files. It cut the time
>>>>>>>>> to and
>>>>>>>>> size of dumping the table almost by half, I haven't tried restoring
>>>>>>>>> yet but
>>>>>>>>> expect the same. Queries run faster.
>>>>>>>>>
>>>>>>>>> In my opinion there are other reasons to trim the schema to fit
>>>>>>>>> your needs other than the size of the data file. But yes it's more
>>>>>>>>> work and
>>>>>>>>> that depends on how you use your data if it's worth it or not.
>>>>>>>>> Obviously I
>>>>>>>>> think it's worth it and YMMV.
>>>>>>>>>
>>>>>>>>> -dk
>>>>>>>>> On Friday, October 9, 2020 at 9:01:49 AM UTC-4 [email protected]
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Trimming the schema does not make as big a difference in
>>>>>>>>>> database size as you might think.
>>>>>>>>>>
>>>>>>>>>> For example, using my own database of 1.4M rows, trimming the
>>>>>>>>>> schema from 48 observation types to 27, reduces the size from 268MB
>>>>>>>>>> to
>>>>>>>>>> 201MB.
>>>>>>>>>>
>>>>>>>>>> The reason is that most of the space is taken up by the indexes,
>>>>>>>>>> not the column data.
>>>>>>>>>>
>>>>>>>>>> -tk
>>>>>>>>>>
>>>>>>>>>> On Thu, Oct 8, 2020 at 8:02 PM d k <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yup.. I just found that and was about to report back I was
>>>>>>>>>>> trying it that was it. Just restarted the test system to see if it
>>>>>>>>>>> went
>>>>>>>>>>> away. I think I got rid of all of them now.
>>>>>>>>>>>
>>>>>>>>>>> Gary you are the best. Thanks so much.
>>>>>>>>>>>
>>>>>>>>>>> On Thursday, October 8, 2020 at 10:54:27 PM UTC-4 gjr80 wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> First up, thank you for not posting images of text, it’s makes
>>>>>>>>>>>> reading/searching logs a real pain.
>>>>>>>>>>>>
>>>>>>>>>>>> The error is due to a skin trying to generate a plot that
>>>>>>>>>>>> involves extraTemp1 and from the short log extract I would
>>>>>>>>>>>> guess that this is from the Seasons skin. If you look in the
>>>>>>>>>>>> Seasons skin config file (skins/Seasons/skin.conf) under
>>>>>>>>>>>> [ImageGenerator]
>>>>>>>>>>>> you will find the daytemp, weektemp, monthtemp and yeartemp plots
>>>>>>>>>>>> use
>>>>>>>>>>>> extraTemp1 (and extraTemp2 and extraTemp3). Easiest fix is to
>>>>>>>>>>>> comment out
>>>>>>>>>>>> those plots, eg:
>>>>>>>>>>>>
>>>>>>>>>>>> # [[[daytemp]]]
>>>>>>>>>>>> # yscale = None, None, 0.5
>>>>>>>>>>>> # [[[[extraTemp1]]]]
>>>>>>>>>>>> # [[[[extraTemp2]]]]
>>>>>>>>>>>> # [[[[extraTemp3]]]]
>>>>>>>>>>>>
>>>>>>>>>>>> Save skin.conf and the error should go away on the next report
>>>>>>>>>>>> cycle.
>>>>>>>>>>>>
>>>>>>>>>>>> Gary
>>>>>>>>>>>> On Friday, 9 October 2020 at 12:29:14 UTC+10 [email protected]
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I tried to post this as an image but it doesn't show. So here
>>>>>>>>>>>>> is the text.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: Caught unrecoverable exception in generator
>>>>>>>>>>>>> 'weewx.imagegenerator.ImageGenerator'
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** extraTemp1
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** Traceback (most recent call
>>>>>>>>>>>>> last):
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** File
>>>>>>>>>>>>> "/usr/share/weewx/weewx/reportengine.py", line 197, in run
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** obj.start()
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** File
>>>>>>>>>>>>> "/usr/share/weewx/weewx/reportengine.py", line 280, in start
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** self.run()
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** File
>>>>>>>>>>>>> "/usr/share/weewx/weewx/imagegenerator.py", line 41, in run
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** self.genImages(self.gen_ts)
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** File
>>>>>>>>>>>>> "/usr/share/weewx/weewx/imagegenerator.py", line 176, in
>>>>>>>>>>>>> genImages
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** start_vec_t, stop_vec_t
>>>>>>>>>>>>> ,data_vec_t =
>>>>>>>>>>>>> weewx.xtypes.get_series(var_type,
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** File
>>>>>>>>>>>>> "/usr/share/weewx/weewx/xtypes.py", line 91, in get_series
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** raise
>>>>>>>>>>>>> weewx.UnknownType(obs_type)
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** weewx.UnknownType: extraTemp1
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] ERROR
>>>>>>>>>>>>> weewx.reportengine: **** Generator terminated
>>>>>>>>>>>>> Oct 8 20:03:19 prometis weewx[271870] DEBUG
>>>>>>>>>>>>> weewx.reportengine: Report 'SmartphoneReport' not enabled.
>>>>>>>>>>>>> Skipping.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thursday, October 8, 2020 at 9:39:14 PM UTC-4 Duane Kerzic
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for all the help you provided last time around. Thanks
>>>>>>>>>>>>>> in advance this time for your help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I wanted to clean up weewx.archive table and make it a bit
>>>>>>>>>>>>>> smaller. So I deleted the columns I don't think I'll ever use.
>>>>>>>>>>>>>> But now I'm
>>>>>>>>>>>>>> getting this in the system log.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm guessing that extraTemp1 is coded into one of those files
>>>>>>>>>>>>>> but I haven't looked to find out yet.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've shortened the average row length of the archive table to
>>>>>>>>>>>>>> 126 from 217 bytes. Huge difference when you have 10 years of
>>>>>>>>>>>>>> data.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -dk
>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>> Google Groups "weewx-user" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>> https://groups.google.com/d/msgid/weewx-user/02d0a56e-c9fc-4e48-a74a-cdb6291474bbn%40googlegroups.com
>>>>>>>>>>>
>>>>>>>>>>> <https://groups.google.com/d/msgid/weewx-user/02d0a56e-c9fc-4e48-a74a-cdb6291474bbn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "weewx-user" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>>
>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/weewx-user/853cbe79-ee93-49b3-90c3-6a9a02dab1b7n%40googlegroups.com
>>>>>>>>>
>>>>>>>>> <https://groups.google.com/d/msgid/weewx-user/853cbe79-ee93-49b3-90c3-6a9a02dab1b7n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "weewx-user" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/weewx-user/77e5e730-9cf0-4dbb-99c1-568d7fd32caen%40googlegroups.com
>>>>>
>>>>> <https://groups.google.com/d/msgid/weewx-user/77e5e730-9cf0-4dbb-99c1-568d7fd32caen%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>
--
You received this message because you are subscribed to the Google Groups
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/weewx-user/a82db714-911b-45df-9f3b-caef8c7f0471n%40googlegroups.com.