Hi Ken,

I think most of the space is being used by the tokens:

mysql> select count(*) from dspam_token_data;
+----------+
| count(*) |
+----------+
| 12948090 |
+----------+
1 row in set (2 min 11.29 sec)

mysql> select count(*) from dspam_signature_data;
+----------+
| count(*) |
+----------+
|   423411 |
+----------+
1 row in set (4.71 sec)
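
Row counts do not show bytes on disk directly, so a size-per-table query may answer the question more precisely. This is only a sketch: it assumes the schema is named `dspam` (an assumption, substitute the real database name), and for InnoDB tables the figures in information_schema are estimates.

```sql
-- Approximate on-disk size of each table in the dspam schema.
-- 'dspam' is an assumed schema name; replace it with the real one.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.tables
WHERE table_schema = 'dspam'
ORDER BY data_length + index_length DESC;
```

The largest table sorts first, which should show whether dspam_token_data or dspam_signature_data accounts for most of the 70GB.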

I can change the configuration, no problem.
What settings do you suggest?

----- Original message -----
From: k...@rice.edu
To: "Alfredo Saldanha" <asalda...@infolink.com.br>
Cc: dspam-user@lists.sourceforge.net
Sent: Thursday, June 20, 2013 15:09:43
Subject: Re: [Dspam-user] Huge MySQL Database

On Thu, Jun 20, 2013 at 02:28:09PM -0300, Alfredo Saldanha wrote: 
> Hi there, 
> 
> My setup has 8 load-balanced MX servers and 70 thousand email
> accounts, handling around 2.5 million messages per day.
> I have 3 MySQL slave servers serving dspam reads and one master taking
> dspam writes.
> On each MX server, dspam connects to a mysql-proxy pool on localhost,
> which then connects to my 3 MySQL slaves.
> My database crashed in the first week... =(
> I made a few adjustments and it worked for 1 month.
> Right now the database is at 70GB and it has crashed again. I cannot let it
> grow any more, and if it continues this way, where will it stop?
> 
> dspam.conf: http://dpaste.com/1257262/ 
> MySQL Master my.cnf: http://dpaste.com/1257265/ 
> 
> PS: I have a purge script running every night.
>
> It is starting to look almost impossible to run dspam in my setup.
> Should I give up?
> 
> Thanks. 
> 

Hi Alfredo, 

Where is the space being used in the database? The tokens or the signatures?
My guess is the signatures, so maybe reducing the length of time you keep them
or moving to a signature-less process would help. Also, are you using a group
to help minimize the size of the individual corpora? Why are you using the
CHAIN tokenizer instead of OSB? The latter is much more effective and needs
a smaller corpus for the same level of accuracy.

My two cents, 
Ken 

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user