Well, make sure you back up your database before you try to delete anything
in case this goes wrong. I would
1. Stop weewx
2. Make a copy of the database
3. Execute the delete SQL
4. Run the original query again to see if things are OK
5. Rebuild weewx dailies
6. Start weewx
Hopefully I'm not forgetting anything.
I'm attaching SQL to delete the spiked records. You will have to change
the database name in the .open statement to the db you want to change. You
may need to run sqlite3 with sudo in order to delete records.
I thought a better solution would be to update the spiked temp with an
average of the readings surrounding the spike, but I'm having trouble
coming up with the SQL to do that.
Walt
On Thursday, September 30, 2021 at 9:41:17 AM UTC-5 [email protected] wrote:
> Thank you Walt, this script works fine! in fact it highlighted that spike
> in my graph:
>
> DT datetime temp temp_prior
> temp_next variance d_back d_forw
> ------------------- ---------- ---------------- ----------------
> ---------------- ---------- ------------------ ----------------
> 2021-09-30 04:15:00 1632968100 19.2970370370371 19.3363567649282
> 14.5003779289494 2.38 0.0393197278911508 4.79665910808766
>
> (I added two columns d_back = |t1-t2| and t_forw = |t1-t3|)
> Now, I need to delete the wrong record: in this case, it is t3 (d_forw =
> 4.8). How can I automatically find it and then remove? Unfortunately I
> don't know the sql language and I can't complete the script myself.
>
> Thank you very much,
> Andrea
> Il giorno giovedì 30 settembre 2021 alle 15:27:37 UTC+2 [email protected]
> ha scritto:
>
>> Here's a query using the "join a table to itself" method if you don't
>> have the LAG function. It prints observations that are outside the average
>> of the surrounding observations by 2+ degrees. I get about 30 observations
>> this year and I think it's legit. Although I don't get spikes in my data,
>> I do use a 15 minute interval and my station is closer to the ground than
>> it should be. A lot can happen in 15 minutes and my temp graph is always
>> kinda wiggly.
>>
>> Caveat Emptor on the attached query. You could modify it to delete or
>> update the offending rows instead of printing them out.
>>
>> Walt
>>
>>
>> On Thursday, September 30, 2021 at 7:52:41 AM UTC-5 [email protected]
>> wrote:
>>
>>> Yes, it's possible, but it's not a simple SELECT statement. For example,
>>> this query will return all rows where the temperature difference between
>>> adjacent rows is greater than 1 degree:
>>>
>>>
>>> *SELECT g.* FROM (SELECT dateTime,
>>> datetime(dateTime,'unixepoch','localtime'), LAG(outTemp) OVER (ORDER BY
>>> dateTime) AS prev_outTemp, outTemp FROM archive) AS g WHERE
>>> ABS(g.outTemp-g.prev_outTemp) > 1;*
>>>
>>> For my own database, this returns
>>>
>>>
>>>
>>>
>>>
>>>
>>> *1192323300|2007-10-13 17:55:00|64.099998|63.01192382100|2007-10-14
>>> 10:15:00|51.900002|53.01192756800|2007-10-18
>>> 18:20:00|44.700001|46.7000011192757100|2007-10-18
>>> 18:25:00|46.700001|49.51192757400|2007-10-18 18:30:00|49.5|47.400002*
>>> ...
>>>
>>> You would then need to arrange to delete those specific timestamps.
>>>
>>> The select statement will only work with modern versions of sqlite,
>>> which have the LAG() function.
>>>
>>>
>>>
>>>
>>> On Thu, Sep 30, 2021 at 3:35 AM [email protected] <[email protected]>
>>> wrote:
>>>
>>>> Hi there,
>>>> I need to remove some temperature spikes from my weewx.sdb database. I
>>>> know this has been already asked many times, but my case is different. I
>>>> need to delete spikes in which the values are climatically acceptable: so,
>>>> it is not sufficient to give (temperature is intended in °C)
>>>>
>>>> echo "delete from archive where outTemp < -6;" | sudo sqlite3
>>>> /home/weewx/archive/weewx.sdb
>>>>
>>>> for solving the problem. In my case, the spike was about 16°C when
>>>> temperature was about 20°C (see the graph below)
>>>> [image: daytempdew.png]
>>>> What I need to do is something like: DELETE if Δt = 300 AND |ΔT| > 5,
>>>> where
>>>> Δt is time difference (in seconds, I chose 300 because my archive
>>>> interval is 5 minutes) and ΔT is the temperature difference in that time
>>>> bin.
>>>> In other words I am thinking about using a differential approach. It
>>>> seems to me that this can be the only solution, is this possible with
>>>> sqlite3?
>>>> Thank you,
>>>> Andrea
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "weewx-user" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/weewx-user/32298c8c-6d93-4a80-8b9e-29fdcc181c8an%40googlegroups.com
>>>>
>>>> <https://groups.google.com/d/msgid/weewx-user/32298c8c-6d93-4a80-8b9e-29fdcc181c8an%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>
--
You received this message because you are subscribed to the Google Groups
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/weewx-user/55b237fa-c890-4f45-bbc1-68518587e89fn%40googlegroups.com.
-- DELETE THE ROWS WITH TEMPERATURE ANOMALIES
-- THIS ALTERS THE DATABASE!
-- MAKE A BACKUP BEFORE RUNNING THIS! NO GUARANTEES!
-- Wrap the 'find spikes' query in another WITH statement,
-- then delete all rows IN that result
-- find anomalies in temperature readings
--
-- my archive interval is 15 minutes, if yours is different (many people use 5
minutes):
-- change the "(15 * 60)" parts in two places in the from clause
.open /var/lib/weewx/weewx.sdb.20210930B
.changes on
.mode column
.headers on
with tx as (
with t0 as (
select t1.datetime
, t1.outtemp as temp
, t2.outtemp as temp_prior
, t3.outtemp as temp_next
, round(((t2.outtemp + t3.outtemp) / 2.0), 2) as temp_avg
, round(abs(t1.outtemp - ((t2.outtemp + t3.outtemp) / 2)), 2) as
variance
from archive as t1
, archive as t2 on t2.datetime = t1.datetime - (15 * 60)
, archive as t3 on t3.datetime = t1.datetime + (15 * 60)
where date(t1.datetime, 'unixepoch', 'localtime') >= '2021-01-01')
select datetime(datetime, 'unixepoch', 'localtime') as DT
, *
from t0
where variance >= 2.0)
-- just print the rows with spikes
--select * from tx;
-- delete the rows with spikes
delete from archive where datetime in (select datetime from tx);