Side note:
re-reading Vince's post, i had the process backwards - detailed event
records in Archive table; summary/special info in separate tables...
On Sunday, February 28, 2021 at 2:44:18 PM UTC-8 David Prellwitz wrote:
> Thanks for the quick reply! I apologize for any negative connotations
> (not intended), i guess my perceptions and assumptions are off-base! it's
> just in all my professional DB roles (25yrs +) i've had to use visual
> layout tools to understand existing data creation, utilization, ranges,
> applications and operational impacts. These visual tools allowed us to use
> as much third-normal form reduction steps as possible to simplify data
> architecture to the lowest possible level. My misjudgement of WeeWx layout
> and structure is just that, my misjudgement.
>
> Perhaps i am over-thinking things as i try to get basic things done.
> Habits of a long life in IT Management/DB Architecture/Programming. I just
> think i may be some ideas and observations that may improve WeeWx but, I
> can drop these distractions if it's an issue.
>
> Now, the issue I had was the lack of understanding the role of the
> additional tables - they didn't seem to have any relationship to the
> archive table; other than Epoch. If i'm reading your statement correctly,
> all the other tables are the detail records for each of the elements in the
> archive table - just related by element type ("Name"?) and epoch timestamp?
> So if i went looking for "Wind Gust Direction" instances, I'd look in the
> Archive table for "WindGustDir" and find an entry (which represents what, a
> summary, an average, ?), and one or more "Archive_day_windGustDir"
> entries, depending on the frequency of that element's recording? (Assuming
> archived wind-gust data is recorded for one-minute intervals, but
> "Archive_day_windGustDir" are recorded at 3second intervals for
> Wind-Gust-Dir table.) Knowledge at that level of detail should be known
> somewhere, right? Perhaps somewhere in GitHub?
>
> How is this useful? It appears to me (IMHO) that a good understanding of
> the data flow and mapping could help reduce event process time; reduce data
> storage needs; increase functionality; improve simplicity and make it a
> better product. Looking at the architecture of WeeWx, the process diagram
> shows a single thread for each event (i.e., UDP packet arrival); followed
> by event processing; followed by data-recording; followed by report
> generation. All of these appear to be serial. If the front end was to be
> multi-threaded to allow multiple device/station packet reception and placed
> in a triggered repository that the rest of WeeWx processed, wouldn't WeeWx
> be able to process multiple stations data without having multiple instances
> of WeeWx (and multiple SQL db's). A quick, multi-threaded front-end could
> be prioritized to allow for sub-second processing. Leaving more "leisured"
> approach for the rest of Weewx to process and create reports.
>
> I'm a bit rusty in data-mapping (did it from 1975 through 1998) so i'm
> sure any post-grad would be sharper at this then I am. Perhaps having some
> Data Science Professor create a post-doc project that would help (assuming
> i'm not too far off base here)! Being retired, i now have time on my hands
> to assist if anyone thinks that would be useful.
>
> On Saturday, February 27, 2021 at 5:43:09 PM UTC-8 vince wrote:
>
>> I'd suggest your commentary is more than a little bit unfair and
>> inaccurate, and I'll leave it at that.
>>
>> Weewx by default uses an underlying sqlite3 database and puts its
>> readings into an 'archive' table that has a large number of fields for the
>> actual weather measurements.
>>
>> CREATE TABLE archive (`dateTime` INTEGER NOT NULL UNIQUE PRIMARY KEY,
>> `usUnits` INTEGER NOT NULL, `interval` INTEGER NOT NULL, `altimeter` REAL,
>> `appTemp` REAL, `appTemp1` REAL, `barometer` REAL, `batteryStatus1` REAL,
>> `batteryStatus2` REAL, `batteryStatus3` REAL, `batteryStatus4` REAL,
>> `batteryStatus5` REAL, `batteryStatus6` REAL, `batteryStatus7` REAL,
>> `batteryStatus8` REAL, `cloudbase` REAL, `co` REAL, `co2` REAL,
>> `consBatteryVoltage` REAL, `dewpoint` REAL, `dewpoint1` REAL, `ET` REAL,
>> `extraHumid1` REAL, `extraHumid2` REAL, `extraHumid3` REAL, `extraHumid4`
>> REAL, `extraHumid5` REAL, `extraHumid6` REAL, `extraHumid7` REAL,
>> `extraHumid8` REAL, `extraTemp1` REAL, `extraTemp2` REAL, `extraTemp3`
>> REAL, `extraTemp4` REAL, `extraTemp5` REAL, `extraTemp6` REAL, `extraTemp7`
>> REAL, `extraTemp8` REAL, `forecast` REAL, `hail` REAL, `hailBatteryStatus`
>> REAL, `hailRate` REAL, `heatindex` REAL, `heatindex1` REAL, `heatingTemp`
>> REAL, `heatingVoltage` REAL, `humidex` REAL, `humidex1` REAL, `inDewpoint`
>> REAL, `inHumidity` REAL, `inTemp` REAL, `inTempBatteryStatus` REAL,
>> `leafTemp1` REAL, `leafTemp2` REAL, `leafWet1` REAL, `leafWet2` REAL,
>> `lightning_distance` REAL, `lightning_disturber_count` REAL,
>> `lightning_energy` REAL, `lightning_noise_count` REAL,
>> `lightning_strike_count` REAL, `luminosity` REAL, `maxSolarRad` REAL, `nh3`
>> REAL, `no2` REAL, `noise` REAL, `o3` REAL, `outHumidity` REAL, `outTemp`
>> REAL, `outTempBatteryStatus` REAL, `pb` REAL, `pm10_0` REAL, `pm1_0` REAL,
>> `pm2_5` REAL, `pressure` REAL, `radiation` REAL, `rain` REAL,
>> `rainBatteryStatus` REAL, `rainRate` REAL, `referenceVoltage` REAL,
>> `rxCheckPercent` REAL, `signal1` REAL, `signal2` REAL, `signal3` REAL,
>> `signal4` REAL, `signal5` REAL, `signal6` REAL, `signal7` REAL, `signal8`
>> REAL, `snow` REAL, `snowBatteryStatus` REAL, `snowDepth` REAL,
>> `snowMoisture` REAL, `snowRate` REAL, `so2` REAL, `soilMoist1` REAL,
>> `soilMoist2` REAL, `soilMoist3` REAL, `soilMoist4` REAL, `soilTemp1` REAL,
>> `soilTemp2` REAL, `soilTemp3` REAL, `soilTemp4` REAL, `supplyVoltage` REAL,
>> `txBatteryStatus` REAL, `UV` REAL, `uvBatteryStatus` REAL,
>> `windBatteryStatus` REAL, `windchill` REAL, `windDir` REAL, `windGust`
>> REAL, `windGustDir` REAL, `windrun` REAL, `windSpeed` REAL);
>>
>> If you look at the beginning of the line, you'll see the unique key is
>> the dateTime (seconds since the unix epoch).
>>
>> There are also a number of additional sqlite3 tables created, one per
>> item above, with the dateTime for the record, the max/min for that
>> measurement, when the max/min occurred, and also for things that are
>> accumulated a running total.
>>
>> CREATE TABLE archive_day_inTemp (dateTime INTEGER NOT NULL UNIQUE PRIMARY
>> KEY, min REAL, mintime INTEGER, max REAL, maxtime INTEGER, sum REAL, count
>> INTEGER, wsum REAL, sumtime INTEGER);
>>
>> These tables are not connected in any way at the database level, they're
>> individual tables. They're also using the dateTime of the record as the
>> primary key for the records in the table.
>>
>> Weewx handles the heavy lifting for keeping the summary tables up to date
>> from the data in the archive table. When you see people who ask how to
>> clear bad data from their (archive table in the) database, they are
>> generally told to 'drop daily' (which deletes the summary tables) and
>> 'rebuild daily' (which regenerates the summary tables from the
>> hand-modified archive table).
>>
>> That help any ?
>>
>> I guess I'm kinda speechless re: using a drawing tool to visualize a
>> database nor really what you're trying to do. Perhaps the tool you want
>> to use is google and going through the sqlite3 documentation online to
>> better understand how to reverse engineer and examine a sqlite3 database,
>> but I guess that's a bit harsh. I'll assume that you're overthinking
>> things a bit and looking for complexity where it doesn't exist.
>>
>>
>> - archive table has all the 'meat'
>> - summary archive_whatever tables have the pre-computed summaries of
>> max/min/sums of each element for later use
>>
>>
>>
>>
--
You received this message because you are subscribed to the Google Groups
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/weewx-user/10dc0d29-7899-4905-ac14-44d9f31fa964n%40googlegroups.com.