* Bengt Lindholm
> I have a table where I need to group the content on a timestamp. Any
> record that is less than say 5 minutes from any other record needs to
> be grouped with that other record.
>
> ID timestamp
> 1 2004-02-02 12:00:00
> 2 2004-02-02 12:00:05
> 3 2004-02-02 12:05:20
> 4 2004-02-02 12:15:00
> 5 2004-02-02 12:17:15
> 6 2004-02-02 12:21:20
>
> With this recordset I would get:
> 1 & 2 should be grouped since there's less than 5 minutes between the
> records
> 3 will be in a group of its own
> 4, 5 & 6 is a third group since any record in the group has less than 5
> minutes to another record in the group
>
> Is this possible to do in MySQL, and how is it accomplished?
You could use mysql user variables:
mysql> set @a:=NULL,@b:=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select if(clicktime-interval 5 minute>[EMAIL PROTECTED],@b:[EMAIL
PROTECTED],@b) x,
-> @a:=clicktime from stats where clicktime between
-> "2003-11-24 02:00:00" and "2003-11-24 02:59:59"
-> order by clicktime;
+------+---------------------+
| x | @a:=clicktime |
+------+---------------------+
| 0 | 2003-11-24 02:09:57 |
| 0 | 2003-11-24 02:10:05 |
| 0 | 2003-11-24 02:10:09 |
| 0 | 2003-11-24 02:10:22 |
| 0 | 2003-11-24 02:10:24 |
:
| 0 | 2003-11-24 02:41:42 |
| 0 | 2003-11-24 02:41:43 |
| 0 | 2003-11-24 02:43:12 |
| 1 | 2003-11-24 02:49:19 |
| 1 | 2003-11-24 02:50:46 |
:
| 1 | 2003-11-24 02:56:28 |
| 1 | 2003-11-24 02:56:41 |
| 1 | 2003-11-24 02:56:50 |
+------+---------------------+
136 rows in set (0.01 sec)
As you can see, I only get two groups with my test data, and the result
seems to be correct, but when I apply a GROUP BY and a COUNT() I get a
different result. I get groups of all rows within five minutes of the
_first_ row of each group:
mysql> set @a:=NULL,@b:=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select if(clicktime-interval 5 minute>[EMAIL PROTECTED],@b:[EMAIL
PROTECTED],@b) x,
-> @a:=clicktime,count(*) from stats where clicktime between
-> "2003-11-24 02:00:00" and "2003-11-24 02:59:59"
-> group by x order by clicktime;
+------+---------------------+----------+
| x | @a:=clicktime | count(*) |
+------+---------------------+----------+
| 0 | 2003-11-24 02:09:57 | 22 |
| 2 | 2003-11-24 02:15:17 | 29 |
| 4 | 2003-11-24 02:20:55 | 17 |
| 6 | 2003-11-24 02:25:55 | 22 |
| 8 | 2003-11-24 02:31:12 | 4 |
| 10 | 2003-11-24 02:36:19 | 21 |
| 12 | 2003-11-24 02:41:42 | 3 |
| 14 | 2003-11-24 02:49:19 | 13 |
| 16 | 2003-11-24 02:56:12 | 5 |
+------+---------------------+----------+
9 rows in set (0.00 sec)
This seems to have something to do with how mysql handles GROUP BY queries,
the @a variable is not re-assigned for each row. Note that this is a
non-standard GROUP BY statement, as a column (clicktime) is used in the
field list but not in the GROUP BY clause. Using a group function (MIN() in
this case) on the assignment expression makes this a more standard compliant
GROUP BY, and it seems to give the result we want:
mysql> set @a:=NULL,@b:=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select if(clicktime-interval 5 minute>[EMAIL PROTECTED],@b:[EMAIL
PROTECTED],@b) x,
-> min(@a:=clicktime),count(*) from stats where clicktime between
-> "2003-11-24 02:00:00" and "2003-11-24 02:59:59"
-> group by x order by clicktime;
+------+---------------------+----------+
| x | min(@a:=clicktime) | count(*) |
+------+---------------------+----------+
| 0 | 2003-11-24 02:09:57 | 118 |
| 1 | 2003-11-24 02:49:19 | 18 |
+------+---------------------+----------+
2 rows in set (0.01 sec)
--
Roger
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]