Here's an updated query that does better checking and it can delete the 
rows or update the temperatures.  Test it out on a copy of your database 
first.  This works for sqlite.  Don't know about others.

Walt

On Sunday, October 3, 2021 at 6:48:23 AM UTC-5 [email protected] wrote:

> Exactly. Is there a way to consider only cases such as  +0, -10, +0 (i.e. 
> spikes) and not those like  +0, -6. -8 ?
> Meanwhile I thank you for your precious help.
> Andrea
>
> Il giorno giovedì 30 settembre 2021 alle 23:03:35 UTC+2 [email protected] 
> ha scritto:
>
>> Just a reminder, this delete query will delete all situations where the 
>> criteria is met.  Some of these situations are legit, temps can change 
>> rapidly.  You want to make sure it's going to delete only the records you 
>> want before running it.  There are probably some enhancements that can be 
>> made to the query to find more obvious spikes.  Something like +0, -10, +0 
>> might be a spike, whereas +0, -6. -8 is not, the temp is just falling fast.
>>
>> On Thursday, September 30, 2021 at 1:11:05 PM UTC-5 [email protected] 
>> wrote:
>>
>>> Well, make sure you back up your database before you try to delete 
>>> anything in case this goes wrong.  I would
>>>
>>> 1. Stop weewx
>>> 2. Make a copy of the database
>>> 3. Execute the delete SQL
>>> 4. Run the original query again to see if things are OK
>>> 5. Rebuild weewx dailies
>>> 6. Start weewx
>>>
>>> Hopefully I'm not forgetting anything.
>>>
>>> I'm attaching SQL to delete the spiked records.  You will have to change 
>>> the database name in the .open statement to the db you want to change.  You 
>>> may need to run sqlite3 with sudo in order to delete records.
>>>
>>> I thought a better solution would be to update the spiked temp with an 
>>> average of the readings surrounding the spike, but I'm having trouble 
>>> coming up with the SQL to do that.
>>>
>>> Walt
>>> On Thursday, September 30, 2021 at 9:41:17 AM UTC-5 [email protected] 
>>> wrote:
>>>
>>>> Thank you Walt, this script works fine! in fact it highlighted that 
>>>> spike in my graph:
>>>>
>>>> DT                   datetime    temp              temp_prior        
>>>> temp_next         variance    d_back              d_forw
>>>> -------------------  ----------  ----------------  ----------------  
>>>> ----------------  ----------  ------------------  ----------------
>>>> 2021-09-30 04:15:00  1632968100  19.2970370370371  19.3363567649282  
>>>> 14.5003779289494  2.38        0.0393197278911508  4.79665910808766
>>>>
>>>> (I added two columns d_back = |t1-t2| and t_forw = |t1-t3|)
>>>> Now, I need to delete the wrong record: in this case, it is t3 (d_forw 
>>>> = 4.8). How can I automatically find it and then remove? Unfortunately I 
>>>> don't know the sql language and I can't complete the script myself.
>>>>
>>>> Thank you very much,
>>>> Andrea
>>>> Il giorno giovedì 30 settembre 2021 alle 15:27:37 UTC+2 
>>>> [email protected] ha scritto:
>>>>
>>>>> Here's a query using the "join a table to itself" method if you don't 
>>>>> have the LAG function.  It prints observations that are outside the 
>>>>> average 
>>>>> of the surrounding observations by 2+ degrees.  I get about 30 
>>>>> observations 
>>>>> this year and I think it's legit.  Although I don't get spikes in my 
>>>>> data, 
>>>>> I do use a 15 minute interval and my station is closer to the ground than 
>>>>> it should be.  A lot can happen in 15 minutes and my temp graph is always 
>>>>> kinda wiggly.
>>>>>
>>>>> Caveat Emptor on the attached query.  You could modify it to delete or 
>>>>> update the offending rows instead of printing them out.
>>>>>
>>>>> Walt
>>>>>
>>>>>
>>>>> On Thursday, September 30, 2021 at 7:52:41 AM UTC-5 [email protected] 
>>>>> wrote:
>>>>>
>>>>>> Yes, it's possible, but it's not a simple SELECT statement. For 
>>>>>> example, this query will return all rows where the temperature 
>>>>>> difference 
>>>>>> between adjacent rows is greater than 1 degree:
>>>>>>
>>>>>>
>>>>>> *SELECT g.* FROM (SELECT dateTime, 
>>>>>> datetime(dateTime,'unixepoch','localtime'),  LAG(outTemp) OVER (ORDER BY 
>>>>>> dateTime) AS prev_outTemp, outTemp FROM archive) AS g WHERE 
>>>>>> ABS(g.outTemp-g.prev_outTemp) > 1;*
>>>>>>
>>>>>> For my own database, this returns
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *1192323300|2007-10-13 17:55:00|64.099998|63.01192382100|2007-10-14 
>>>>>> 10:15:00|51.900002|53.01192756800|2007-10-18 
>>>>>> 18:20:00|44.700001|46.7000011192757100|2007-10-18 
>>>>>> 18:25:00|46.700001|49.51192757400|2007-10-18 18:30:00|49.5|47.400002*
>>>>>> ...
>>>>>>
>>>>>> You would then need to arrange to delete those specific timestamps.
>>>>>>
>>>>>> The select statement will only work with modern versions of sqlite, 
>>>>>> which have the LAG() function.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 30, 2021 at 3:35 AM [email protected] <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>> I need to remove some temperature spikes from my weewx.sdb database. 
>>>>>>> I know this has been already asked many times, but my case is 
>>>>>>> different. I 
>>>>>>> need to delete spikes in which the values are climatically acceptable: 
>>>>>>> so, 
>>>>>>> it is not sufficient to give (temperature is intended in °C)
>>>>>>>
>>>>>>> echo "delete from archive where outTemp < -6;" | sudo sqlite3 
>>>>>>> /home/weewx/archive/weewx.sdb
>>>>>>>
>>>>>>> for solving the problem. In my case, the spike was about 16°C when 
>>>>>>> temperature was about 20°C (see the graph below)
>>>>>>> [image: daytempdew.png]
>>>>>>> What I need to do is something like: DELETE if Δt = 300 AND |ΔT| > 5, 
>>>>>>> where 
>>>>>>> Δt is time difference (in seconds, I chose 300 because my archive 
>>>>>>> interval is 5 minutes) and ΔT is the temperature difference in that 
>>>>>>> time 
>>>>>>> bin. 
>>>>>>> In other words I am thinking about using a differential approach. It 
>>>>>>> seems to me that this can be the only solution, is this possible with 
>>>>>>> sqlite3?
>>>>>>> Thank you,
>>>>>>> Andrea
>>>>>>>
>>>>>>> -- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "weewx-user" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit 
>>>>>>> https://groups.google.com/d/msgid/weewx-user/32298c8c-6d93-4a80-8b9e-29fdcc181c8an%40googlegroups.com
>>>>>>>  
>>>>>>> <https://groups.google.com/d/msgid/weewx-user/32298c8c-6d93-4a80-8b9e-29fdcc181c8an%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>> .
>>>>>>>
>>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"weewx-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/weewx-user/a3b5661f-58d6-4c17-8ce2-4c55ca620eb1n%40googlegroups.com.
-- find anomalies in temperature observations
-- optionally alter the temperature or 
-- delete the entire observation record
-- 
-- see comments below for:
--   naming the database to operate on,
--   setting your archive interval,
--   refining the search, 
--   what to do with the results (print, alter, or delete)

.mode columns
.headers on
.changes on
-- name the database to query/alter 
.open c:/users/wfs91/documents/weather/weewx2.sdb

create temp table result1 as
with t0 as (
   select t1.datetime
        , t2.outtemp as temp_prior
        , t1.outtemp as temp
        , t3.outtemp as temp_next
        , round(((t2.outtemp + t3.outtemp) / 2), 2) as temp_avg
        , round(abs(t1.outtemp - ((t2.outtemp + t3.outtemp) / 2)), 2) as 
temp_variance
   from archive as t1
-- set your archive interval in the next two lines;  my interval is 15 minutes 
but many people use 5
      , archive as t2 on t2.datetime = t1.datetime - (15 * 60)
      , archive as t3 on t3.datetime = t1.datetime + (15 * 60)
-- uncomment the following line and change dates to limit the search over time
-- where date(t1.datetime, 'unixepoch', 'localtime') between '2000-01-01' and 
'2030-12-31'
)
select datetime(datetime, 'unixepoch', 'localtime') as DT
     , *
from t0
-- change the values in this where clause to refine the search
-- here we select temps that:
--   vary a certain amount from the average of the adjacent observations
--   are either higher or lower than both of the adjacent observations
--   depart a certain amount from adjacent observations
--     (a small change at either end might indicate legit observations)
where temp_variance >= 2.0
      and
      ((temp > temp_prior and temp > temp_next) or (temp < temp_prior and temp 
< temp_next))
      and
      abs(temp_prior - temp) >= 1.0
      and
      abs(temp_next - temp) >= 1.0
;
-- print out the rows
select * from result1
;
-- to delete the rows found, uncomment the statement below
-- NB: this deletes the entire row, all observations (temp, humidity, wind, 
etc) are lost
--delete from archive where datetime in (select datetime from result1)
;
-- to update the temperature with the average of the adjacent observations
-- uncomment the next two lines
--update archive as ta set outtemp = (select temp_avg from result1 where 
datetime == ta.datetime)
--where datetime in (select datetime from result1)
;

Reply via email to