So one approach I have is this:


select ds, ts_win, opt_string, opt_id, max(eday) as max_eday from (
select a.ds, a.ts_win, a.opt_string, b.opt_id, b.eday, b.ts_win as
rpt_ts_win, b.ts
FROM
(
select distinct ds, opt_string, opt_id, concat(substr(ts, 1, 4), '0:00') as
ts_win
from dfs.prod.view_optdata where ds = '2018-12-02' order by ds, ts_win,
opt_string
) a JOIN
(
select opt_id, ds, opt_string, concat(substr(ts, 1, 4), '0:00') as ts_win,
ts, eday
from dfs.prod.view_optdata where ds = '2018-12-02'
) b ON a.ds = b.ds and a.opt_string = b.opt_string
WHERE b.ts <= a.ts_win
) c
group by ds, ts_win, opt_string, opt_id
order by ds, ts_win ASC, opt_string ASC, opt_id ASC

This breaks up my day into 10 minute intervals (based on string slicing)
and then shows me what's reported before that... I think :)

This is limited in that I can do it only in time intervals that can be
substringed out of a time field in  string format.  Still open to other
ideas :)


On Tue, Dec 4, 2018 at 10:05 AM John Omernik <j...@omernik.com> wrote:

> Time for a fun question: How to be clever with queries!
>
>
> I have a table that takes readings from an IoT type device
>
>
> opt_id         dt           ts          eday   opt_string
> 2.1.1       2018-12-01   10:43:43       12.5       1
> 2.1.2       2018-12-01   10:32:43       5.5        1
> 2.1.3       2018-12-01   10:55:02       20.5       1
> 2.2.1       2018-12-01   10:43:43       12.2       2
> 2.2.2       2018-12-01   10:12:56       43.7       2
> 2.2.3       2018-12-01   10:50:23       12.2       2
> 2.1.1       2018-12-01   13:43:43       45.5       1
> 2.1.2       2018-12-01   13:32:43       20.5       1
> 2.1.3       2018-12-01   13:55:02       45.5       1
> 2.2.1       2018-12-01   13:43:43       30.2       2
> 2.2.2       2018-12-01   13:12:56       81.7       2
> 2.2.3       2018-12-01   13:50:23       50.2       2
>
>
> Pretty simple. Essentially, each "device" (opt_id) reports throughout the
> day, what it's cumulative value (eday), at that time, for that day
>
> These devices belong to groups which is it's own column (opt_string)
>
>
> So if I run a query like this:
>
> select opt_id, ds, max(opt_string) as opt_string, max(ts) as max_ts,
> max(Eday) as max_Eday
> from `mydata`
> where ds = '2018-12-01'
> group by ds, opt_id, order by opt_id ASC
>
> I get:
>
> opt_id ds opt_string max_ts max_Eday
> 2.1.1 2018-12-02 1 15:56:33 181.000
> 2.1.3 2018-12-02 1 15:47:02 162.500
> 2.1.4 2018-12-02 1 15:54:19 122.250
> 2.1.5 2018-12-02 1 15:40:15 132.000
> 2.1.6 2018-12-02 1 15:45:04 183.250
> 2.1.7 2018-12-02 1 15:41:15 161.000
> 2.1.8 2018-12-02 1 15:42:10 166.250
> 2.1.9 2018-12-02 1 15:39:23 126.250
> 2.2.1 2018-12-02 2 16:02:54 163.250
> 2.2.2 2018-12-02 2 15:57:39 178.750
> 2.2.3 2018-12-02 2 15:46:53 148.250
> 2.2.4 2018-12-02 2 15:40:27 163.500
> 2.2.5 2018-12-02 2 15:50:49 147.500
> 2.2.6 2018-12-02 2 15:40:24 156.000
> 2.2.7 2018-12-02 2 15:43:00 104.750
> 2.2.8 2018-12-02 2 15:41:55 170.750
> 2.2.9 2018-12-02 2 15:41:04 127.250
>
> Which shows me the last time each device reported on a day, and what that
> report time was, Great.
>
>
> So what if I wanted to show the values through the day? Is there a way
> from a query perpective, to perhaps use Windowing or something to show
> results like the above raw table, but instead of just having the random
> times returned, group by opt_id and get the max_ts as I am doing, but do
> that for each say 5 minute, or 30 minute, or 60 minute block in a day?
>
> The best example to explain what I am looking for is using 3 groups of 1
> hour blocks: I could use a UNION like this:
>
> select * from (
> select opt_id, ds,  max(opt_string) as opt_string, '10:00:00' as max_ts,
> max(Eday) as max_Eday
> from dfs.prod.view_optdata where ds = '2018-12-02' and ts <= '10:00:00'
> group by ds, opt_id
> order by opt_id ASC
> )
> UNION
> (select opt_id, ds,  max(opt_string) as opt_string, '11:00:00' as max_ts,
> max(Eday) as max_Eday
> from dfs.prod.view_optdata where ds = '2018-12-02' and ts <= '11:00:00'
> group by ds, opt_id
> order by opt_id ASC)
> UNION
> (select opt_id, ds,  max(opt_string) as opt_string, '12:00:00' as max_ts,
> max(Eday) as max_Eday
> from dfs.prod.view_optdata where ds = '2018-12-02' and ts <= '12:00:00'
> group by ds, opt_id
> order by opt_id ASC)
>
>
> Where I would get something like:
>
> index opt_id ds opt_string max_ts max_Eday
> 0 2.1.1 12/2/18 1 10:00:00 18
> 1 2.1.3 12/2/18 1 10:00:00 15.75
> 2 2.1.4 12/2/18 1 10:00:00 12.5
> 3 2.1.5 12/2/18 1 10:00:00 12.75
> 4 2.1.6 12/2/18 1 10:00:00 18.75
> 5 2.1.7 12/2/18 1 10:00:00 7.25
> 6 2.1.8 12/2/18 1 10:00:00 16.25
> 7 2.1.9 12/2/18 1 10:00:00 9.75
> 8 2.2.1 12/2/18 2 10:00:00 18
> 9 2.2.2 12/2/18 2 10:00:00 15.25
> 10 2.2.3 12/2/18 2 10:00:00 2.75
> 11 2.2.4 12/2/18 2 10:00:00 6.5
> 12 2.2.5 12/2/18 2 10:00:00 16.25
> 13 2.2.6 12/2/18 2 10:00:00 11.25
> 14 2.2.7 12/2/18 2 10:00:00 13.5
> 15 2.2.8 12/2/18 2 10:00:00 13.75
> 16 2.2.9 12/2/18 2 10:00:00 14.5
> 17 2.1.1 12/2/18 1 11:00:00 41
> 18 2.1.3 12/2/18 1 11:00:00 32
> 19 2.1.4 12/2/18 1 11:00:00 25
> 20 2.1.5 12/2/18 1 11:00:00 26.75
> 21 2.1.6 12/2/18 1 11:00:00 42.25
> 22 2.1.7 12/2/18 1 11:00:00 27
> 23 2.1.8 12/2/18 1 11:00:00 35.75
> 24 2.1.9 12/2/18 1 11:00:00 22
> 25 2.2.1 12/2/18 2 11:00:00 34.5
> 26 2.2.2 12/2/18 2 11:00:00 42.5
> 27 2.2.3 12/2/18 2 11:00:00 19
> 28 2.2.4 12/2/18 2 11:00:00 29.25
> 29 2.2.5 12/2/18 2 11:00:00 32.25
> 30 2.2.6 12/2/18 2 11:00:00 30.25
> 31 2.2.7 12/2/18 2 11:00:00 26.75
> 32 2.2.8 12/2/18 2 11:00:00 33.5
> 33 2.2.9 12/2/18 2 11:00:00 28.5
> 34 2.1.1 12/2/18 1 12:00:00 81.5
> 35 2.1.3 12/2/18 1 12:00:00 62.75
> 36 2.1.4 12/2/18 1 12:00:00 48.5
> 37 2.1.5 12/2/18 1 12:00:00 48.25
> 38 2.1.6 12/2/18 1 12:00:00 80.5
> 39 2.1.7 12/2/18 1 12:00:00 62
> 40 2.1.8 12/2/18 1 12:00:00 66.25
> 41 2.1.9 12/2/18 1 12:00:00 43.75
> 42 2.2.1 12/2/18 2 12:00:00 62.75
> 43 2.2.2 12/2/18 2 12:00:00 83.5
> 44 2.2.3 12/2/18 2 12:00:00 49
> 45 2.2.4 12/2/18 2 12:00:00 66.25
> 46 2.2.5 12/2/18 2 12:00:00 56.5
> 47 2.2.6 12/2/18 2 12:00:00 68
> 48 2.2.7 12/2/18 2 12:00:00 43
> 49 2.2.8 12/2/18 2 12:00:00 72.5
> 50 2.2.9 12/2/18 2 12:00:00 49.25
>
>
> That's great, but if I wanted to say all the hours in a day, that Union
> would get ugly, or what about 30 minute blocks? 5 minute blocks?
>
> Is there a way to get these types of results without the Union and being
> explicit about the times?  Any advice, including "Look at X" would be
> helpful. Or if you need a diversion, dig on in!
>
> John
>
>

Reply via email to