Re: [weewx-user] Re: weeimport a smattering of missing records

gjr80 Sun, 23 Mar 2025 17:59:14 -0700

Thanks, I've a couple of days with not much free time, but when I have a 
chance I will run this through a test system and see if we can work out 
what's going on.


Gary
On Saturday, 22 March 2025 at 23:28:24 UTC+10 Andrew McGinnis wrote:

> Here they are. CSV is 2025-03-20T00:00 to 2025-03-21T00:00, with every 
> field in the schema, in the same order.  The .conf has every field_map 
> mapped 1:1 since they are the same. No units were specified beyond the 
> mandatory datetime field, as the documentation says units is not required 
> if the usUnits field is populated (it is). raw_datetime_format isn't 
> specified, as the documentation says it's ignored if the datetime field is 
> unix_epoch already.
>
> On Sat, Mar 22, 2025 at 5:02 AM gjr80 <[email protected]> wrote:
>
>> Let's take it from the top. Please post the csv data (preferably the 
>> file) and import config file that caused the original error. Unless you are 
>> talking about a fairly obvious error it is extremely difficult (and time 
>> consuming) to work through a failed weectl import session without those 
>> two pieces of the puzzle. 
>>
>> Gary
>> On Saturday, 22 March 2025 at 09:12:47 UTC+10 Andrew McGinnis wrote:
>>
>>> So odd thing happens when I try to weectl --import.  
>>>
>>> I dumped the data from the database I want to take it from (as well as 
>>> the production one) via 
>>> sudo sqlite3 -header -csv /var/lib/weewx/weewx.sdb 'select * from 
>>> archive where datetime >= 1742443200 and datetime <= 1742529600;' > 
>>> 20th_pi4b8.csv 
>>>
>>> I copied both over to my PC and ran both of the output files through 
>>> Excel to reconcile between the two, saving out a single 20th.csv with the 
>>> data cleaned up as I want it to be.  The schema/header and values are laid 
>>> out exactly as my database schema is.  The values are unmodified from the 
>>> sqlite output, beyond copy/paste in Excel.  
>>>
>>> Copied that cleaned 20th.csv back to the production weewx pi, and ran:
>>>
>>> pi@pi4b8:~/weeimport $ weectl import 
>>> --import-config=/home/pi/weeimport/20th.conf --update --dry-run
>>> Using configuration file /etc/weewx/weewx.conf
>>> This is a dry run. Nothing will actually be done.
>>> Starting weectl import...
>>> A CSV import from source file '/home/pi/weeimport/20th.csv' has been 
>>> requested.
>>> Using database binding 'wx_binding', which is bound to database 
>>> 'weewx.sdb'
>>> Destination table 'archive' unit system is '0x01' (US).
>>> Imported records will overwrite existing database records.
>>> All WeeWX UV fields will be set to None.
>>> All WeeWX radiation fields will be set to None.
>>> This is a dry run, imported data will not be saved to archive.
>>> Starting dry run import ...
>>> Traceback (most recent call last):
>>>   File "/usr/share/weewx/weeimport/weeimport.py", line 856, in 
>>> map_raw_data
>>>     _value = float(_row[self.map[_field]['source_field']].strip())
>>> ValueError: could not convert string to float: ''
>>>
>>> During handling of the above exception, another exception occurred:
>>>
>>> Traceback (most recent call last):
>>>   File "/usr/share/weewx/weectl.py", line 75, in <module>
>>>     main()
>>>   File "/usr/share/weewx/weectl.py", line 67, in main
>>>     namespace.func(namespace)
>>>   File "/usr/share/weewx/weectllib/__init__.py", line 90, in dispatch
>>>     namespace.action_func(config_dict, namespace)
>>>   File "/usr/share/weewx/weectllib/import_cmd.py", line 85, in 
>>> import_func
>>>     weectllib.import_actions.obs_import(config_dict,
>>>   File "/usr/share/weewx/weectllib/import_actions.py", line 58, in 
>>> obs_import
>>>     source_obj.run()
>>>   File "/usr/share/weewx/weeimport/weeimport.py", line 406, in run
>>>     _mapped_data = self.map_raw_data(_raw_data, self.archive_unit_sys)
>>>   File "/usr/share/weewx/weeimport/weeimport.py", line 904, in 
>>> map_raw_data
>>>     self.map[_field]['unit'] == 'degree_compass':
>>>   File "/usr/lib/python3/dist-packages/configobj.py", line 554, in 
>>> __getitem__
>>>     val = dict.__getitem__(self, key)
>>> KeyError: 'unit'
>>>
>>> I looked at the file in nano, and noticed it was saved in DOS format.  I 
>>> changed it to be not-DOS/Mac (...native then I guess?), but same result. 
>>>
>>> If I go in through 
>>> pi@pi4b8:~/weeimport $ sudo sqlite3 /var/lib/weewx/weewx.sdb 
>>> SQLite version 3.34.1 2021-01-20 14:10:07
>>> Enter ".help" for usage hints.
>>> sqlite> DELETE FROM archive WHERE datetime >= 1742443200 and datetime <= 
>>> 1742529600;
>>> sqlite> .mode csv archive
>>> sqlite> .import 20th.csv archive
>>> 20th.csv:1: INSERT failed: datatype mismatch
>>> 20th.csv:1443: INSERT failed: datatype mismatch
>>> sqlite> .quit
>>>
>>> The data is in there though (and in the correct columns- no offset 
>>> errors). Doing weectl database --rebuild-daily --date=2025-03-20 
>>> rebuilds successfully.  That said, my graphs look... off, where the data is 
>>> highly variable in it's 10-min interval that the graph uses for plot 
>>> points. You can see it's very smooth once I'm back to using directly 
>>> captured data.
>>> Current (trailing 24hr):
>>> [image: image.png] [image: image.png]
>>> Yesterday (00:00-23:59)
>>> [image: image.png] [image: image.png]
>>>
>>> The sqlite output for '*select datetime,outTemp from archive...;*' for 
>>> Mar-20 13:30-15:30 (imported data) and Mar-21 13:30-15:30 (direct) have a 
>>> similar number of missing records for the outTemp and the 
>>> extraTemp/extraHumid observations, but the imported data has '
>>> *epochtime,""*' in a csv output for the blank records, where the 
>>> direct's csv output is just '*epochtime,'*
>>>
>>> The graphs are being generated (with a 10-minute resolution) very 
>>> differently. If I take a 10-min period of records:
>>> -the graph point for the imported data is the sum of the 10 records, 
>>> divided by 10, even though 8 were actually recorded- averaging in every 
>>> blank as a 0
>>> -the graph point for my direct observations, the same period also had 
>>> blanks, but the average was taken of the records present.
>>>
>>> I imagine everything in the background is likewise factoring what are 
>>> meant to be blanks, as zeros. 
>>>
>>> So, the sqlite import added "" for blank records to the database it 
>>> would seem.  Before I go even more leeroy jenkins on my database, what do I 
>>> need to do? Go do it again, but add/insert *None* for every blank 
>>> observation? add/insert *""* for every blank? Or better, what am I not 
>>> seeing/missing in my weectl import that it's erroring out?
>>>
>>> Here is the first two lines of the csv file I'm trying to import for 
>>> context:
>>>
>>> pi@pi4b8:~/weeimport $ head -2 20th.csv 
>>>
>>> dateTime,usUnits,interval,altimeter,appTemp,appTemp1,barometer,batteryStatus1,batteryStatus2,batteryStatus3,batteryStatus4,batteryStatus5,batteryStatus6,batteryStatus7,batteryStatus8,cloudbase,co,co2,consBatteryVoltage,dewpoint,dewpoint1,ET,extraHumid1,extraHumid2,extraHumid3,extraHumid4,extraHumid5,extraHumid6,extraHumid7,extraHumid8,extraTemp1,extraTemp2,extraTemp3,extraTemp4,extraTemp5,extraTemp6,extraTemp7,extraTemp8,forecast,hail,hailBatteryStatus,hailRate,heatindex,heatindex1,heatingTemp,heatingVoltage,humidex,humidex1,inDewpoint,inHumidity,inTemp,inTempBatteryStatus,leafTemp1,leafTemp2,leafWet1,leafWet2,lightning_distance,lightning_disturber_count,lightning_energy,lightning_noise_count,lightning_strike_count,luminosity,maxSolarRad,nh3,no2,noise,o3,outHumidity,outTemp,outTempBatteryStatus,pb,pm10_0,pm1_0,pm2_5,pressure,radiation,rain,rainBatteryStatus,rainRate,referenceVoltage,rxCheckPercent,signal1,signal2,signal3,signal4,signal5,signal6,signal7,signal8,snow,snowBatteryStatus,snowDepth,snowMoisture,snowRate,so2,soilMoist1,soilMoist2,soilMoist3,soilMoist4,soilTemp1,soilTemp2,soilTemp3,soilTemp4,supplyVoltage,txBatteryStatus,UV,uvBatteryStatus,windBatteryStatus,windchill,windDir,windGust,windGustDir,windrun,windSpeed,pm4_0,pm4_0out,pm2_5out,pm1_0out,pm10_0out,co2out
>>>
>>> 1742443200,1,1,30.13063837,62.49347962,,30.12772479,0,1,1,1,1,,,,604.5454057,,1106,1500,58.65679821,,,59,57,58,40,51,,,,72.61,72.37,72.47727273,78.30909091,76.82,,,,,,,,61.1239978,,,,67.94440487,,56.89523489,54,74.66,1,,,,,,,,,,,0,,,,,92,60.999998,1,,2.8,0.6,1.6,30.04971646,,0,0.88889,0,,,,,,,,,,,20.12999882,1,,,,,45.88235294,,,,70.7,,,,,1,,,1,60.999998,202.5,2.678862144,202.5,0.038218048,2.293082881,2.4,,2,,2,
>>>
>>> On Sun, Mar 16, 2025 at 4:27 PM gjr80 <[email protected]> wrote:
>>>
>>>> That should be fine. No need for 'empty' fields to be included in the 
>>>> header line and data lines, but in this case it is probably the 
>>>> simplest/quickest approach. Just be very careful that you don't end up 
>>>> with 
>>>> any missing or extra commas as you will likely then put the wrong data in 
>>>> the wrong field.
>>>>
>>>> Gary
>>>> On Sunday, 16 March 2025 at 05:57:55 UTC+10 Andrew McGinnis wrote:
>>>>
>>>>> Via a dump of every column in the table that exists, whether I'm using 
>>>>> it or not, the csv content could be:
>>>>>
>>>>>
>>>>> "dateTime","usUnits","interval","altimeter","appTemp","appTemp1","barometer","batteryStatus1","batteryStatus2","batteryStatus3","batteryStatus4","batteryStatus5","batteryStatus6","batteryStatus7","batteryStatus8","cloudbase","co","co2","consBatteryVoltage","dewpoint","dewpoint1","ET","extraHumid1","extraHumid2","extraHumid3","extraHumid4","extraHumid5","extraHumid6","extraHumid7","extraHumid8","extraTemp1","extraTemp2","extraTemp3","extraTemp4","extraTemp5","extraTemp6","extraTemp7","extraTemp8","forecast","hail","hailBatteryStatus","hailRate","heatindex","heatindex1","heatingTemp","heatingVoltage","humidex","humidex1","inDewpoint","inHumidity","inTemp","inTempBatteryStatus","leafTemp1","leafTemp2","leafWet1","leafWet2","lightning_distance","lightning_disturber_count","lightning_energy","lightning_noise_count","lightning_strike_count","luminosity","maxSolarRad","nh3","no2","noise","o3","outHumidity","outTemp","outTempBatteryStatus","pb","pm10_0","pm1_0","pm2_5","pressure","radiation","rain","rainBatteryStatus","rainRate","referenceVoltage","rxCheckPercent","signal1","signal2","signal3","signal4","signal5","signal6","signal7","signal8","snow","snowBatteryStatus","snowDepth","snowMoisture","snowRate","so2","soilMoist1","soilMoist2","soilMoist3","soilMoist4","soilTemp1","soilTemp2","soilTemp3","soilTemp4","supplyVoltage","txBatteryStatus","UV","uvBatteryStatus","windBatteryStatus","windchill","windDir","windGust","windGustDir","windrun","windSpeed","pm4_0","pm4_0out","pm2_5out","pm1_0out","pm10_0out","co2out"
>>>>>
>>>>> 1741612440,1,1,,,,,1.0,,,1.0,,,,,,,,,,,,58.0,,,58.0,,,,,75.02,,,77.36,,,,,,,,,,,,,,,,,,,,,,,,,,,,,238.51020260006104,,,,,,,,,,,,,
>>>>> 0.00999881889763705
>>>>> ,,0.0199976377952741,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
>>>>>
>>>>> where the larger font size one is the rain amount that I'm inserting. 
>>>>> Then one observation line for each record period that I need to change 
>>>>> (that way, no change is required to the column name line for periods that 
>>>>> have a different mix of observations with data)?
>>>>>
>>>>> Only 8 observations got recorded in that period I have above (I have 
>>>>> more than 8 XXXbatteryStatus observations alone), but at this point I'm 
>>>>> just seeking to add the rain observation.
>>>>>
>>>>> On Fri, Mar 14, 2025 at 9:41 PM gjr80 <[email protected]> wrote:
>>>>>
>>>>>> See comments below.
>>>>>>
>>>>>> Gary
>>>>>>
>>>>>> On Saturday, 15 March 2025 at 03:47:33 UTC+10 Andrew McGinnis wrote:
>>>>>>
>>>>>> If you want to use weectl import to import data into existing 
>>>>>> archive records, each record in your import source data (in your case 
>>>>>> your 
>>>>>> CSV data) needs to include the existing data for all fields as well as 
>>>>>> the 
>>>>>> updated rainfall data. You then need to use the (as yet undocumented) 
>>>>>> --update command line option when running weectl import. 
>>>>>>
>>>>>>
>>>>>> So rather than just the records I'm *missing*, I need to go from the 
>>>>>> first missing record to the last missing record, inclusive of all 
>>>>>> in-between records? I can easily do that, just not 100% on whether it's 
>>>>>> required. 
>>>>>>
>>>>>>
>>>>>> No, you don't need 'in between records'. Your import data should only 
>>>>>> include the records you intend to update/alter, you do need the data 
>>>>>> from 
>>>>>> the other fields of the records you are updating/importing. Perhaps an 
>>>>>> example might make it clearer. Say your archive table contains the 
>>>>>> following:
>>>>>>
>>>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>>>> 1742000400  23.1     20.1     1.0    80
>>>>>> 1742000100  23.2     20.1     2.0    81
>>>>>> 1741999800  23.2     20.1     5.0    84
>>>>>> 1741999500  23.3     20.2     0.0    83
>>>>>> 1741999200  23.5     20.2     0.8    83
>>>>>> 1741998900  23.5     20.2     0.6    84
>>>>>> 1741998600  23.4     20.1     0.6    82
>>>>>> 1741998300  23.3     20.1     0.8    83
>>>>>> 1741998000  23.3     20.2     0.0    84
>>>>>> 1741997700  23.3     20.1     1.0    82
>>>>>>
>>>>>> Lets say you want to update the rain value in records timestamped 
>>>>>> 1741999500 and 1741998000 (ie the 0.0 values) and your new rain 
>>>>>> values (per archive period values) for these two records are 0.2 and 0.4 
>>>>>> respectively. So you might start with your import data as follows:
>>>>>>
>>>>>> dateTime      rain
>>>>>> 1741999500     0.2
>>>>>> 1741998000     0.4
>>>>>>
>>>>>> If you import this data using the --update command line option (or 
>>>>>> without the --update command line option if you deleted the two 
>>>>>> archive table records concerned) your archive table will be:
>>>>>>
>>>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>>>> 1742000400  23.1     20.1     1.0    80
>>>>>> 1742000100  23.2     20.1     2.0    81
>>>>>> 1741999800  23.2     20.1     5.0    84
>>>>>> 1741999500                    0.2    
>>>>>> 1741999200  23.5     20.2     0.8    83
>>>>>> 1741998900  23.5     20.2     0.6    84
>>>>>> 1741998600  23.4     20.1     0.6    82
>>>>>> 1741998300  23.3     20.1     0.8    83
>>>>>> 1741998000                    0.4    
>>>>>> 1741997700  23.3     20.1     1.0    82
>>>>>>
>>>>>> Each row in your import data is considered to be an archive record in 
>>>>>> itself and any fields that are not included in your import data will be 
>>>>>> empty. To keep the rest of the existing data in the records being 
>>>>>> updated 
>>>>>> (in this case outTemp, inTemp and outHumidity), your import data 
>>>>>> needs to include values for outTemp, inTemp and outHumidity. Your 
>>>>>> import data would be:
>>>>>>
>>>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>>>> 1741999500  23.3     20.2     0.2    83
>>>>>> 1741998000  23.3     20.2     0.4    84
>>>>>>
>>>>>> Importing this data will give you:
>>>>>>
>>>>>> dateTime    outTemp  inTemp  rain  outHumidity
>>>>>> 1742000400  23.1     20.1     1.0    80
>>>>>> 1742000100  23.2     20.1     2.0    81
>>>>>> 1741999800  23.2     20.1     5.0    84
>>>>>> 1741999500  23.3     20.2     0.2    83
>>>>>> 1741999200  23.5     20.2     0.8    83
>>>>>> 1741998900  23.5     20.2     0.6    84
>>>>>> 1741998600  23.4     20.1     0.6    82
>>>>>> 1741998300  23.3     20.1     0.8    83
>>>>>> 1741998000  23.3     20.2     0.4    84
>>>>>> 1741997700  23.3     20.1     1.0    82
>>>>>>
>>>>>> Which I suspect is the desired result. Of course this is a very 
>>>>>> simple and unrealistic example but hopefully you get the idea.
>>>>>>  
>>>>>>
>>>>>>
>>>>>> Alternatively, you could avoid using the --update command line 
>>>>>> option by first deleting the records to be imported from the WeeWX 
>>>>>> database 
>>>>>> and then importing your merged/updated CSV data.
>>>>>>
>>>>>>
>>>>>> If you'll save me the research, how do I delete specific 
>>>>>> records/periods, either a single period record, or a range? I just 
>>>>>> recently 
>>>>>> noticed that the soilMoist1 observation, on my weewx testing 
>>>>>> install, recorded about 84 hours of very wrong values, that my 
>>>>>> production 
>>>>>> weewx install didn't (both getting the data from the same mqtt topic). I 
>>>>>> know how to drop entire observation columns via weectl database 
>>>>>> drop-columns NAME, but in this case it's just range of periods 
>>>>>> within the column that need purging. Or is it as simple as, zeroing out 
>>>>>> the 
>>>>>> values in a range via sqlite> update archive set soilMoist1=0.0 
>>>>>> where dateTime > {startepochtime} and dateTime <= {endepochtime]; ?
>>>>>>
>>>>>>
>>>>>>  
>>>>>>  In terms of deleting records I use the sqlite3 utility to allow me 
>>>>>> to use SQL commands to interact with the database. Some people use GUI 
>>>>>> based sql editors to manipulate the database; I don't so I can't help 
>>>>>> you 
>>>>>> there. In terms of sqlite3 the following commands can be used to 
>>>>>> delete archive table records:
>>>>>>
>>>>>> $ sqlite3 /var/lib/weewx/weewx.sdb
>>>>>> sqlite> DELETE FROM archive WHERE dateTime=1741999500;
>>>>>> sqlite3> .q
>>>>>> $
>>>>>>
>>>>>> In this case we deleted the archive record timestamped 1741999500. 
>>>>>>
>>>>>> To delete a range of archive records by timestamp use something like:
>>>>>>
>>>>>> sqlite> DELETE FROM archive WHERE dateTime>=1741969500 AND 
>>>>>> dateTime<=1741999500;
>>>>>>
>>>>>> which would delete all archive records timestamped from 1741969500 
>>>>>> to 1741999500 inclusive.
>>>>>>
>>>>>> Note that you may need to install the sqlite3 utility on your system 
>>>>>> using something like (for debian based systems):
>>>>>>
>>>>>> $ sudo apt install sqlite3
>>>>>>
>>>>>> -- 
>>>>>> You received this message because you are subscribed to a topic in 
>>>>>> the Google Groups "weewx-user" group.
>>>>>> To unsubscribe from this topic, visit 
>>>>>> https://groups.google.com/d/topic/weewx-user/1KX3-8lbU34/unsubscribe.
>>>>>> To unsubscribe from this group and all its topics, send an email to 
>>>>>> [email protected].
>>>>>>
>>>>> To view this discussion visit 
>>>>>> https://groups.google.com/d/msgid/weewx-user/4506f570-fb2c-40f2-95ab-a98c02a59011n%40googlegroups.com
>>>>>>  
>>>>>> <https://groups.google.com/d/msgid/weewx-user/4506f570-fb2c-40f2-95ab-a98c02a59011n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>> -- 
>>>> You received this message because you are subscribed to a topic in the 
>>>> Google Groups "weewx-user" group.
>>>> To unsubscribe from this topic, visit 
>>>> https://groups.google.com/d/topic/weewx-user/1KX3-8lbU34/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to 
>>>> [email protected].
>>>>
>>> To view this discussion visit 
>>>> https://groups.google.com/d/msgid/weewx-user/6b7b89a9-5703-4ae9-8c01-bdaa310ff7can%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/weewx-user/6b7b89a9-5703-4ae9-8c01-bdaa310ff7can%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>> -- 
>> You received this message because you are subscribed to a topic in the 
>> Google Groups "weewx-user" group.
>> To unsubscribe from this topic, visit 
>> https://groups.google.com/d/topic/weewx-user/1KX3-8lbU34/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to 
>> [email protected].
>>
> To view this discussion visit 
>> https://groups.google.com/d/msgid/weewx-user/9a43dbf7-5658-49ef-819f-b00a1d5ba510n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/weewx-user/9a43dbf7-5658-49ef-819f-b00a1d5ba510n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/weewx-user/60ea0db7-f287-47b7-8336-eb04af1beb41n%40googlegroups.com.

Re: [weewx-user] Re: weeimport a smattering of missing records

Reply via email to