I'm in dire need of suggestions for reconstruction of a database that
contains USENET posts from a few groups on the server.
This is the structure:
CREATE TABLE /*!32300 IF NOT EXISTS*/ nntp (
pid mediumint(10) NOT NULL auto_increment,
aid mediumint(10) NOT NULL DEFAULT '0' ,
from_name varchar(50) NOT NULL DEFAULT '' ,
subject varchar(100) NOT NULL DEFAULT '' ,
date int(11) NOT NULL DEFAULT '0' ,
body text NOT NULL DEFAULT '' ,
mid varchar(50) NOT NULL DEFAULT '' ,
reference varchar(50) NOT NULL DEFAULT '' ,
newsgroup varchar(50) NOT NULL DEFAULT '' ,
tid varchar(50) NOT NULL DEFAULT '' ,
viewed mediumint(10) NOT NULL DEFAULT '0' ,
replies mediumint(3) NOT NULL DEFAULT '0' ,
PRIMARY KEY (pid),
INDEX nntp_reference_newsgroup (reference,newsgroup),
INDEX nntp_mid_newsgroup_date (mid,newsgroup,date),
INDEX nntp_mid (mid),
INDEX nntp_date (date),
INDEX nntp_reference_newsgroup_tid_date (reference,newsgroup,tid,date),
INDEX tid (tid)
This table contains data from multiple groups on the server. Right now, the
amount of posts is reaching into 150K posts including full body. It's quite
slow with a simple query to count the total posts of a particular group from
a certain date:
SELECT count(pid) AS total FROM nntp WHERE reference = '' AND
newsgroup='macromedia.dreamweaver' AND date > '1008511669'
An Index was created on the WHERE clause so that it could find the info
fast, but it takes at least 30s to just query for the total row count.
So my question is what's the best way to reconstruct this table. One method
of handling was to seperate each group crawled into it's own table, but I am
thinking into teh future when I begin to have hundreds of USENET groups to
crawl and wouldn't want to maintain 100s of tables.
Another idea would be to seperate the body of the posts into a seperate
table. Any other ideas/suggestions?
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]