If I have table timestamps: hive> desc timestamps;
OK ts bigint hive> select ts from timestamps order by ts OK 1 2 3 4 5 6 7 8 9 10 30 32 34 36 38 40 42 44 46 48 50 70 74 78 100 105 110 115 and I want to make groups of the values where splits between groups occur where two time-consecutive entries have a difference greater than 10. Eg, above, the splits would be such that the numbers would be grouped into these ranges: 0-10 30-50 70-78 100-115 because (30 - 10), (70 - 50), and (100 - 78) are each greater than 10. I'd like the query to result in the following: hive> select ... 0 7 0 9 0 6 0 3 0 10 0 1 0 4 0 5 0 8 0 2 30 34 30 44 30 40 30 38 30 36 30 32 30 46 30 42 30 48 30 50 30 30 70 74 70 70 70 78 100 100 100 105 100 110 100 115 What is the most efficient hive query that will do this ? Thanks, pk