http://bugzilla.spamassassin.org/show_bug.cgi?id=4019
------- Additional Comments From [EMAIL PROTECTED] 2004-12-06 12:17 -------
Subject: Re: BayesSQL token column type for MySQL may end up with semi-bogus data
On Mon, Dec 06, 2004 at 11:35:23AM -0800, [EMAIL PROTECTED] wrote:
> A quick look at SQL.pm...
>
> dump_db_toks() already uses SQL's RPAD() to re-pad the token.
>
Oh yeah. Sorry, my FIFO mind must have completely forgotten about
this, thanks for the pointer.
> backup_database(), on the other hand, does not do this and proceeds to unpack
> the tokens without re-padding them... which will result in a different value
> than unpacking the same string with trailing space.
>
> So, if I understand what the problem IS, changing "token" to
> "RPAD(token,5,' ')" when $sql is set in backup_database() should fix the
> problem... unless I'm missing something and the problem also exists when you
> dump a DB (using dump_db_toks).
This is obviously the fix for this piece.
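To make the quoted point concrete, here is a minimal sketch (in Python rather than the Perl of SQL.pm) of why re-padding matters. It assumes the backup code turns the raw 5-byte token into hex, and the token bytes are made up for illustration:

```python
# Hypothetical 5-byte token whose last byte happens to be 0x20 (a space).
# The byte values are invented for illustration only.
TOKEN_LEN = 5
original = bytes([0x9A, 0x3F, 0x01, 0xC4, 0x20])

# What a CHAR(5) column hands back: trailing spaces are stripped on retrieval.
retrieved = original.rstrip(b" ")

# Equivalent of SQL's RPAD(token, 5, ' '): pad back out to the full width.
repadded = retrieved.ljust(TOKEN_LEN, b" ")

print(original.hex())   # 9a3f01c420
print(retrieved.hex())  # 9a3f01c4
print(repadded.hex())   # 9a3f01c420
```

Without the re-pad, the dumped value is one byte short and no longer matches the token that was stored.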
I'm still troubled by the following:
mysql> desc bayes_token2;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| token | char(5) |      | PRI |         |       |
+-------+---------+------+-----+---------+-------+
1 row in set (0.00 sec)
mysql> insert into bayes_token2 values ('test ');
Query OK, 1 row affected (0.00 sec)
mysql> insert into bayes_token2 values ('test1');
Query OK, 1 row affected (0.00 sec)
mysql> insert into bayes_token2 values ('blah ');
Query OK, 1 row affected (0.00 sec)
mysql> insert into bayes_token2 values ('foo ');
Query OK, 1 row affected (0.00 sec)
mysql> select * from bayes_token2 where token = 'test ';
+-------+
| token |
+-------+
| test  |
+-------+
1 row in set (0.00 sec)
mysql> select * from bayes_token2 where token = 'test';
+-------+
| token |
+-------+
| test  |
+-------+
1 row in set (0.00 sec)
mysql> select * from bayes_token2 where token = 'blah';
+-------+
| token |
+-------+
| blah  |
+-------+
1 row in set (0.00 sec)
mysql> select * from bayes_token2 where token = 'foo';
+-------+
| token |
+-------+
| foo   |
+-------+
1 row in set (0.00 sec)
mysql> select * from bayes_token2 where token = 'foo ';
+-------+
| token |
+-------+
| foo   |
+-------+
1 row in set (0.00 sec)
mysql> select * from bayes_token2 where token = 'foo ';
+-------+
| token |
+-------+
| foo   |
+-------+
1 row in set (0.00 sec)
mysql> select token, length(token) from bayes_token2;
+-------+---------------+
| token | length(token) |
+-------+---------------+
| blah  |             4 |
| foo   |             3 |
| test  |             4 |
| test1 |             5 |
+-------+---------------+
4 rows in set (0.00 sec)
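So 'foo' == 'foo ' as far as MySQL is concerned: the CHAR column strips
trailing spaces on retrieval and ignores them in comparisons. That could
cause some sort of problem I think, but maybe I'm overanalyzing things.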
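For what it's worth, the behaviour in the session above can be modelled with a few lines of Python. This is only a toy model of a CHAR(n) primary key, assuming the trailing-space handling seen in the output; CharColumnModel and its methods are hypothetical names, not anything from MySQL or SpamAssassin:

```python
class CharColumnModel:
    """Toy model of a CHAR(n) primary key column; not MySQL itself."""

    def __init__(self, width):
        self.width = width
        self.rows = set()

    def insert(self, value):
        if len(value) > self.width:
            raise ValueError("value too long for column")
        # Trailing spaces are not significant, so strip them for the key.
        key = value.rstrip(" ")
        if key in self.rows:
            raise KeyError(f"duplicate entry {value!r}")
        self.rows.add(key)

    def select(self, value):
        # Comparison ignores trailing spaces; the result comes back stripped.
        key = value.rstrip(" ")
        return [key] if key in self.rows else []


col = CharColumnModel(5)
col.insert("foo  ")
print(col.select("foo"))    # ['foo']
print(col.select("foo  "))  # ['foo']

try:
    col.insert("foo")       # differs only in trailing spaces
    collided = False
except KeyError:
    collided = True
print(collided)             # True
```

In this model, two tokens that differ only in trailing spaces cannot both be stored, which is the sort of problem I have in mind.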
Michael