If I have table timestamps:

hive> desc timestamps;

OK
ts      bigint  


hive> select ts from timestamps order by ts
OK

1
2
3
4
5
6
7
8
9
10
30
32
34
36
38
40
42
44
46
48
50
70
74
78
100
105
110
115

and I want to make groups of the values where splits between groups
occur where two time-consecutive entries have a difference greater
than 10.

Eg, above, the splits would be such that the numbers would be grouped
into these ranges:

0-10
30-50
70-78
100-115

because (30 - 10), (70 - 50), and (100 - 78) are each greater than 10.

I'd like the query to result in the following:

hive> select ...

0       7
0       9
0       6
0       3
0       10
0       1
0       4
0       5
0       8
0       2
30      34
30      44
30      40
30      38
30      36
30      32
30      46
30      42
30      48
30      50
30      30
70      74
70      70
70      78
100     100
100     105
100     110
100     115

What is the most efficient hive query that will do this ? Thanks,

pk

Reply via email to