[weewx-user] Re: weectl import skip existing data?

vince Sat, 05 Apr 2025 10:55:13 -0700

FWIW, I think you guys are massively overthinking this….
On Thursday, March 20, 2025 at 11:48:20 AM UTC-7 [email protected] wrote:


> Attached is an updated query.  See the comments in it.  It can find 
> missing records by comparing two databases.  It reports the extra records 
> in the more complete database.  That's the same as saying it reports the 
> missing records in the less complete database.
>
> The situation with timestamps not matching up exactly between the two 
> databases raises an issue.  If you create a csv of the missing records to 
> import, do you need to correct the timestamps to match the interval Weewx 
> is expecting?  For example, if you determine that the 09:00:00 entry is 
> missing in weewx, but the 9 o'clock entry in the other database is actually 
> timestamped 09:00:10, do you need to fix that before importing it?  Or does 
> Weewx not really care about that?
>
> Walt
> On Wednesday, March 19, 2025 at 2:33:58 PM UTC-5 vince wrote:
>
>> oops - typo'd the field names in multiple places previously.  Sorry.  I 
>> was working off an alternate db with different field names than weewx. 
>>  This one matches the weewx schema.
>>
>> select dateTime,datetime(dateTime,'unixepoch','localtime') from archive 
>> where datetime(dateTime,'unixepoch','localtime') like "2024-12-28 09%" 
>> limit 2;
>> 1735405200|2024-12-28 09:00:00
>> 1735405500|2024-12-28 09:05:00
>>
>>
>> On Wednesday, March 19, 2025 at 12:23:53 PM UTC-7 vince wrote:
>>
>>> Ryan - here's an example.  It looks for the first two records from 
>>> 2024-12-18 starting with 9am localtime.
>>>
>>> select datetime,datetime(datetime,'unixepoch','localtime') from archive 
>>> where datetime(datetime,'unixepoch','localtime') like "2024-12-28 09%" 
>>> limit 2;
>>> 1735405200|2024-12-28 09:00:00
>>> 1735405500|2024-12-28 09:05:00
>>>
>>
>>> On Tuesday, March 18, 2025 at 9:16:49 PM UTC-7 vince wrote:
>>>
>>>>
>>>> … where datetime(dateTime,'unixepoch','localtime') … gets you a more 
>>>> human friendly datetime but you’ll have to work out the syntax for 
>>>> comparison to a date string that does what you want.
>>>>
>>>> Still not understanding why you’re working so hard on this since you 
>>>> know which days your network was down. Heck you can pretty close visually 
>>>> just by looking at the graphs. But see if the syntax above helps any.
>>>>
>>>> Away from computer so I can’t give you a complete example…
>>>>
>>>> On Tuesday, March 18, 2025 at 8:05:50 PM UTC-7 Ryan Stasel wrote:
>>>>
>>>>> I'm drawing a blank here... I'm fine altering my import data to not 
>>>>> include overlapping data... but how the heck do I figure out what's 
>>>>> missing? 
>>>>>
>>>>> While I can convert unixepoch to a readable date, is there an easy way 
>>>>> to just do something like
>>>>>
>>>>> select * from archive where datetime -like "2025-02-28";
>>>>>
>>>>> Then I can see what's missing and alter the import file.
>>>>>
>>>>> Thanks! 
>>>>> On Tuesday, March 18, 2025 at 5:11:07 AM UTC-7 [email protected] wrote:
>>>>>
>>>>>> Attached is a simple sqlite query for finding extra database rows.  
>>>>>> In it, both databases are weewx databases, so if you're trying to find 
>>>>>> differences between weewx and some other database, you'll have to modify 
>>>>>> the query, but the idea remains the same.
>>>>>>
>>>>>> Note that the order of the databases is important.  It reports rows 
>>>>>> that are IN db1 but NOT IN db2.  If you swapped the order of the 
>>>>>> databases, 
>>>>>> the counts would still be correct but it would print no rows because 
>>>>>> every 
>>>>>> record in the smaller database is in the larger database.
>>>>>>
>>>>>> Walt
>>>>>> On Monday, March 17, 2025 at 5:07:39 PM UTC-5 gjr80 wrote:
>>>>>>
>>>>>>> If you want to tell what data already exists you will need to get 
>>>>>>> your SQL hands dirty (you could use Cheetah produce a WeeWX report that 
>>>>>>> lists all the timestamps in some given period, but that's an awful lot 
>>>>>>> of 
>>>>>>> messing around for little gain). WeeWX archive record timestamps are 
>>>>>>> saved 
>>>>>>> as an integers, with the DBMS enforcing the unique constraint. So a 
>>>>>>> source 
>>>>>>> record with one second difference to an existing record would be 
>>>>>>> imported.
>>>>>>>
>>>>>>> The import process does report on what records were imported and 
>>>>>>> what records were not via the log; imported records will be logged as 
>>>>>>> being 
>>>>>>> added and records that were rejected due to duplicate timestamps will 
>>>>>>> be 
>>>>>>> reported with a unique key error. Arguably a little crude, but to do 
>>>>>>> any 
>>>>>>> more detailed user friendly analysis and reporting would have to 
>>>>>>> expensive 
>>>>>>> time wise. Of course the reporting is after the fact not before.
>>>>>>>
>>>>>>> Gary
>>>>>>>
>>>>>>> On Tuesday, 18 March 2025 at 02:14:24 UTC+10 [email protected] 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks gjr80. I guess the question is, how do I tell easily what 
>>>>>>>> data already exists (the time stamps will have to match, I assume). So 
>>>>>>>> "2025-02-28 08:15:00" existing (for example) I assume won't match 
>>>>>>>> against 
>>>>>>>> an import of "2025-02-28 08:16:00" (let alone if there's seconds other 
>>>>>>>> than 
>>>>>>>> 00). 
>>>>>>>>
>>>>>>>> Is there an "easy" way to tell? or am I stuck going into sqlite3 
>>>>>>>> and doing some selects (not hard, just was kind of hoping the import 
>>>>>>>> process WOULD report on records already existing). =)
>>>>>>>>
>>>>>>>> On Sunday, March 16, 2025 at 5:59:03 PM UTC-7 gjr80 wrote:
>>>>>>>>
>>>>>>>>> The default weectl import action is to *not* import records where 
>>>>>>>>> there is already an archive record in the database with the same 
>>>>>>>>> timestamp. weectl 
>>>>>>>>> import will appear to import such records but if you look at the 
>>>>>>>>> WeeWX log you will see entries similar to:
>>>>>>>>>
>>>>>>>>> 2023-11-04 15:33:01 weectl-import[3795]: ERROR weewx.manager: 
>>>>>>>>> Unable to add record 2018-09-04 04:20:00 AEST (1535998800) to 
>>>>>>>>> database 
>>>>>>>>> 'weewx.sdb': UNIQUE constraint failed: archive.dateTime
>>>>>>>>>
>>>>>>>>> indicating a record with the same timestamp (in this case 
>>>>>>>>> 2018-09-04 04:20:00 AEST (1535998800)) already exists in the database 
>>>>>>>>> and 
>>>>>>>>> the imported record was discarded. Long story, but it was too 
>>>>>>>>> inefficient 
>>>>>>>>> to check every record before attempting to add it to the archive, 
>>>>>>>>> hence the 
>>>>>>>>> somewhat cumbersome import and notification. Note that if you use the 
>>>>>>>>> (presently) undocumented --update command line option with weectl 
>>>>>>>>> import pre-existing records will be overwritten.
>>>>>>>>>
>>>>>>>>> Bottom line - just run weectl import as per the docs, and 
>>>>>>>>> whatever you do don't include --update on your weectl import 
>>>>>>>>> command line. And of course always make a backup of your database 
>>>>>>>>> before 
>>>>>>>>> importing.
>>>>>>>>>
>>>>>>>>> Gary
>>>>>>>>> On Monday, 17 March 2025 at 09:56:17 UTC+10 [email protected] 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I lost a few days of data due to a network issue, but I have the 
>>>>>>>>>> data in another system. Am trying to import from csv, but I think my 
>>>>>>>>>> csv 
>>>>>>>>>> overlaps a bit with what's already in the weewx database. 
>>>>>>>>>>
>>>>>>>>>> Doing a dryrun, it says it will import every line in the csv... 
>>>>>>>>>>
>>>>>>>>>> Is there a way to get weectl import to NOT import duplicate data, 
>>>>>>>>>> or is there an easy way to see what data is in those dates so I can 
>>>>>>>>>> modify 
>>>>>>>>>> csv to exclude those entries?
>>>>>>>>>>
>>>>>>>>>> Thanks! 
>>>>>>>>>>
>>>>>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/weewx-user/fd66ad09-f38d-40ef-bf1c-454af418ab19n%40googlegroups.com.

[weewx-user] Re: weectl import skip existing data?

Reply via email to