Yes, the other tables are "daily summaries." They are strictly an optimization. There's nothing in them that cannot be extracted from the main archive table and, indeed, they can be rebuilt from the latter using the tool wee_database.
WeeWX is multi-threaded, but not in the way you describe. Yes, there is a single "main" thread that blocks, waiting for data, then process it, then stores it in the database. A separate thread handles reporting. Still other threads upload through a RESTful facility. On the surface, multi-threading data acquisition (or, alternatively, using an async interface) sounds attractive. However, if the goal is to acquire data simultaneously from several sources, that can as easily be accomplished by launching multiple instances of WeeWX. If the goal is to intertwine the various sources and have them all stored in the same table in the same database, that is far more complicated. The simple architecture of WeeWX would be lost. However, if you have some concrete, specific, ideas on how to service more than one device simultaneously, then I'm all ears. I would really love to take advantage of the new "async" interfaces in modern versions of Python (unfortunately, the 3rd party libraries needed to use them have not caught up. Specifically, async versions of pyusb and pyserial). But, we are way beyond general ideas. What's needed is a working framework, even if it's merely a prototype. Through the years, these ideas have come up from time to time. Various alternative architectures have been considered. However, simplicity has always won out. I think the rich ecosystem of 3rd party extensions that surrounds WeeWX has shown that to be a good choice. -tk On Sun, Feb 28, 2021 at 2:44 PM David Prellwitz <[email protected]> wrote: > Thanks for the quick reply! I apologize for any negative connotations > (not intended), i guess my perceptions and assumptions are off-base! it's > just in all my professional DB roles (25yrs +) i've had to use visual > layout tools to understand existing data creation, utilization, ranges, > applications and operational impacts. These visual tools allowed us to use > as much third-normal form reduction steps as possible to simplify data > architecture to the lowest possible level. My misjudgement of WeeWx layout > and structure is just that, my misjudgement. > > Perhaps i am over-thinking things as i try to get basic things done. > Habits of a long life in IT Management/DB Architecture/Programming. I just > think i may be some ideas and observations that may improve WeeWx but, I > can drop these distractions if it's an issue. > > Now, the issue I had was the lack of understanding the role of the > additional tables - they didn't seem to have any relationship to the > archive table; other than Epoch. If i'm reading your statement correctly, > all the other tables are the detail records for each of the elements in the > archive table - just related by element type ("Name"?) and epoch timestamp? > So if i went looking for "Wind Gust Direction" instances, I'd look in the > Archive table for "WindGustDir" and find an entry (which represents what, a > summary, an average, ?), and one or more "Archive_day_windGustDir" > entries, depending on the frequency of that element's recording? (Assuming > archived wind-gust data is recorded for one-minute intervals, but > "Archive_day_windGustDir" are recorded at 3second intervals for > Wind-Gust-Dir table.) Knowledge at that level of detail should be known > somewhere, right? Perhaps somewhere in GitHub? > > How is this useful? It appears to me (IMHO) that a good understanding of > the data flow and mapping could help reduce event process time; reduce data > storage needs; increase functionality; improve simplicity and make it a > better product. Looking at the architecture of WeeWx, the process diagram > shows a single thread for each event (i.e., UDP packet arrival); followed > by event processing; followed by data-recording; followed by report > generation. All of these appear to be serial. If the front end was to be > multi-threaded to allow multiple device/station packet reception and placed > in a triggered repository that the rest of WeeWx processed, wouldn't WeeWx > be able to process multiple stations data without having multiple instances > of WeeWx (and multiple SQL db's). A quick, multi-threaded front-end could > be prioritized to allow for sub-second processing. Leaving more "leisured" > approach for the rest of Weewx to process and create reports. > > I'm a bit rusty in data-mapping (did it from 1975 through 1998) so i'm > sure any post-grad would be sharper at this then I am. Perhaps having some > Data Science Professor create a post-doc project that would help (assuming > i'm not too far off base here)! Being retired, i now have time on my hands > to assist if anyone thinks that would be useful. > > On Saturday, February 27, 2021 at 5:43:09 PM UTC-8 vince wrote: > >> I'd suggest your commentary is more than a little bit unfair and >> inaccurate, and I'll leave it at that. >> >> Weewx by default uses an underlying sqlite3 database and puts its >> readings into an 'archive' table that has a large number of fields for the >> actual weather measurements. >> >> CREATE TABLE archive (`dateTime` INTEGER NOT NULL UNIQUE PRIMARY KEY, >> `usUnits` INTEGER NOT NULL, `interval` INTEGER NOT NULL, `altimeter` REAL, >> `appTemp` REAL, `appTemp1` REAL, `barometer` REAL, `batteryStatus1` REAL, >> `batteryStatus2` REAL, `batteryStatus3` REAL, `batteryStatus4` REAL, >> `batteryStatus5` REAL, `batteryStatus6` REAL, `batteryStatus7` REAL, >> `batteryStatus8` REAL, `cloudbase` REAL, `co` REAL, `co2` REAL, >> `consBatteryVoltage` REAL, `dewpoint` REAL, `dewpoint1` REAL, `ET` REAL, >> `extraHumid1` REAL, `extraHumid2` REAL, `extraHumid3` REAL, `extraHumid4` >> REAL, `extraHumid5` REAL, `extraHumid6` REAL, `extraHumid7` REAL, >> `extraHumid8` REAL, `extraTemp1` REAL, `extraTemp2` REAL, `extraTemp3` >> REAL, `extraTemp4` REAL, `extraTemp5` REAL, `extraTemp6` REAL, `extraTemp7` >> REAL, `extraTemp8` REAL, `forecast` REAL, `hail` REAL, `hailBatteryStatus` >> REAL, `hailRate` REAL, `heatindex` REAL, `heatindex1` REAL, `heatingTemp` >> REAL, `heatingVoltage` REAL, `humidex` REAL, `humidex1` REAL, `inDewpoint` >> REAL, `inHumidity` REAL, `inTemp` REAL, `inTempBatteryStatus` REAL, >> `leafTemp1` REAL, `leafTemp2` REAL, `leafWet1` REAL, `leafWet2` REAL, >> `lightning_distance` REAL, `lightning_disturber_count` REAL, >> `lightning_energy` REAL, `lightning_noise_count` REAL, >> `lightning_strike_count` REAL, `luminosity` REAL, `maxSolarRad` REAL, `nh3` >> REAL, `no2` REAL, `noise` REAL, `o3` REAL, `outHumidity` REAL, `outTemp` >> REAL, `outTempBatteryStatus` REAL, `pb` REAL, `pm10_0` REAL, `pm1_0` REAL, >> `pm2_5` REAL, `pressure` REAL, `radiation` REAL, `rain` REAL, >> `rainBatteryStatus` REAL, `rainRate` REAL, `referenceVoltage` REAL, >> `rxCheckPercent` REAL, `signal1` REAL, `signal2` REAL, `signal3` REAL, >> `signal4` REAL, `signal5` REAL, `signal6` REAL, `signal7` REAL, `signal8` >> REAL, `snow` REAL, `snowBatteryStatus` REAL, `snowDepth` REAL, >> `snowMoisture` REAL, `snowRate` REAL, `so2` REAL, `soilMoist1` REAL, >> `soilMoist2` REAL, `soilMoist3` REAL, `soilMoist4` REAL, `soilTemp1` REAL, >> `soilTemp2` REAL, `soilTemp3` REAL, `soilTemp4` REAL, `supplyVoltage` REAL, >> `txBatteryStatus` REAL, `UV` REAL, `uvBatteryStatus` REAL, >> `windBatteryStatus` REAL, `windchill` REAL, `windDir` REAL, `windGust` >> REAL, `windGustDir` REAL, `windrun` REAL, `windSpeed` REAL); >> >> If you look at the beginning of the line, you'll see the unique key is >> the dateTime (seconds since the unix epoch). >> >> There are also a number of additional sqlite3 tables created, one per >> item above, with the dateTime for the record, the max/min for that >> measurement, when the max/min occurred, and also for things that are >> accumulated a running total. >> >> CREATE TABLE archive_day_inTemp (dateTime INTEGER NOT NULL UNIQUE PRIMARY >> KEY, min REAL, mintime INTEGER, max REAL, maxtime INTEGER, sum REAL, count >> INTEGER, wsum REAL, sumtime INTEGER); >> >> These tables are not connected in any way at the database level, they're >> individual tables. They're also using the dateTime of the record as the >> primary key for the records in the table. >> >> Weewx handles the heavy lifting for keeping the summary tables up to date >> from the data in the archive table. When you see people who ask how to >> clear bad data from their (archive table in the) database, they are >> generally told to 'drop daily' (which deletes the summary tables) and >> 'rebuild daily' (which regenerates the summary tables from the >> hand-modified archive table). >> >> That help any ? >> >> I guess I'm kinda speechless re: using a drawing tool to visualize a >> database nor really what you're trying to do. Perhaps the tool you want >> to use is google and going through the sqlite3 documentation online to >> better understand how to reverse engineer and examine a sqlite3 database, >> but I guess that's a bit harsh. I'll assume that you're overthinking >> things a bit and looking for complexity where it doesn't exist. >> >> >> - archive table has all the 'meat' >> - summary archive_whatever tables have the pre-computed summaries of >> max/min/sums of each element for later use >> >> >> >> -- > You received this message because you are subscribed to the Google Groups > "weewx-user" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/d/msgid/weewx-user/59c1ba34-84c8-42f7-8b01-249f9b25b350n%40googlegroups.com > <https://groups.google.com/d/msgid/weewx-user/59c1ba34-84c8-42f7-8b01-249f9b25b350n%40googlegroups.com?utm_medium=email&utm_source=footer> > . > -- You received this message because you are subscribed to the Google Groups "weewx-user" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/weewx-user/CAPq0zEBZmVxPhTB%2BrXxdp4OQ-jDWpdCLChzhBRZBySteKknfoA%40mail.gmail.com.
