Thanks Shawn!!!!!!!!!!!!!!!!!!!!! Peter Brawley <[EMAIL PROTECTED]> wrote:Seena,
That data is for one date. To test Shawn's 2-stage query I'd need a data for several days. Without your tables and data, I tried Shawn's algorithm on a test database we use a lot, nwib, which is a MySQL port of the widely used (and copied) Microsoft Northwind database. It has customers, orders, orderdetails, payments etc, and it's populated. Your question, who are the top ten spammers per day? is logically equivalent, in nwib, to: what customers made the top ten number of orders for any given year? Translating Shawn's method to nwib.orders gives us: CREATE TABLE orderstats ( yr SMALLINT NOT NULL , rank INT AUTO_INCREMENT, customerID CHAR(5) NOT NULL, ordercount INT UNSIGNED NOT NULL, PRIMARY KEY(yr,rank) ); INSERT orderstats ( yr, customerID, ordercount ) SELECT YEAR(orderdate) AS yr, customerID, COUNT(customerID) AS ordercount FROM orders GROUP BY yr, customerID ORDER BY yr,ordercount DESC which, as Shawn says, auto-populates orderstats.rank for each group. After the above executes, SELECT * from orderstats WHERE rank <= 10 gives the top ten ordering customers per year, as expected. So I am unclear why the algorithm does not work for your spam table. PB ----- Seena Blace wrote: Peter here u go.just a few.You may see sample report. 2005-05-05 | 1108529 | 0 | 1 | 0 | 0 | | 2005-05-05 | 1108529 | 0 | 1 | 0 | 0 | | 2005-05-05 | 1108529 | 0 | 1 | 0 | 0 | | 2005-05-05 | 1108529 | 0 | 1 | 0 | 0 | | 2005-05-05 | 1108529 | 0 | 1 | 0 | 0 | | 2005-05-05 | 1108529 | 0 | 1 | 0 | 0 | | 2005-05-05 | 14140 | 0 | 0 | 0 | 0 | | 2005-05-05 | 46 | 1 | 0 | 1 | 0 | | 2005-05-05 | 10378 | 0 | 0 | 0 | 0 | | 2005-05-05 | 16068 | 0 | 0 | 0 | 0 | | 2005-05-05 | 110085 | 0 | 1 | 0 | 0 Peter Brawley <[EMAIL PROTECTED]> wrote: Seena, I'm curious how Shawn's elegant-looking solution fails. Would you mind sending me a small data sample so I can explore his solution? Thx. Peter Brawley ----- Seena Blace wrote: Shawan,Here you go.mysql> select * from spam_stats -> where rank <=10 ;+-------------+-------------------------+------+-----------+-------+-----------+| report_date | report_sender_domain_id | rank | processed | spam | suspected |+-------------+-------------------------+------+-----------+-------+-----------+| 2005-04-07 | 46 | 1 | 22054 | 19263 | 13 || 2005-04-06 | 46 | 2 | 20229 | 16998 | 37 || 2005-04-08 | 46 | 3 | 19493 | 16443 | 24 || 2005-04-05 | 46 | 4 | 19322 | 15921 | 158 || 2005-04-29 | 46 | 5 | 19241 | 15804 | 8 || 2005-05-02 | 46 | 6 | 47732 | 15287 | 82 || 2005-05-04 | 46 | 7 | 91907 | 14275 | 25 || 2005-04-29 | 52 | 8 | 15875 | 13422 | 4 || 2005-05-02 | 52 | 9 | 56530 | 13098 | 31 || 2005-05-04 | 52 | 10 | 129549 | 12566 | 18 |+-------------+-------------------------+------+-----------+-------+-----------+10 rows in set (0.00 sec)[EMAIL PROTECTED] wrote:Please try my solution before you tell me it's broken, OK? I know you want to see the top 10 spammers for EACH day. That's what I wrote for you. Please try my solution with your data and get back to me with the results and explain to me what's wrong so I can fix it.Thank you for your patience,Shawn GreenDatabase AdministratorUnimin Corporation - Spruce PineSeena Blace wrote on 05/16/2005 04:41:19 PM: Shawn,query SELECT * FROM spam_statsWHERE rank <= 10;will return all rows which I don't want.I need datewise top 10 spam domain.means condition would be serach those rows which are having top 10 spam (means highest) on each day and show the output like which I send [EMAIL PROTECTED] wrote:Seena Blace wrote on 05/16/2005 11:13:48 AM: shawnI think 2nd query will return only 10 rows.I want out like followingsdate domain spam05/05/05 hotmail.com 12005/05/05 yahoo.com 11005/05/05 abc.com 9905/05/05 def.com 8005/05/05 mnpo.net 79......................................... like that upto 10--------------------------------------------------05/06/05 yahoo.com 30005/06/05 def.com 25005/06/05 zer.com 200..................................like that upto 10Each day there are multiple entry from diffrent domains or same domain. I want each day whatever top 10 spam sender [EMAIL PROTECTED] wrote:Seena Blace wrote on 05/16/2005 10:08:15 AM: Any suggestion pl?Seena Blace wrote:hi,here is table descriptionreport1+-------------------------+------------------+------+-----+---------------------+----------------+| Field | Type | Null | Key | Default | Extra |+-------------------------+------------------+------+-----+---------------------+----------------+| id | int(10) unsigned | | PRI | NULL | auto_increment || host_id | int(10) unsigned | | MUL | 0 | || report_rcpt_domain_id | int(10) unsigned | YES | MUL | NULL | || report_sender_domain_id | int(10) unsigned | YES | MUL | NULL | || report_ipaddress_id | int(10) unsigned | YES | MUL | NULL | || time | datetime | | MUL | 0000-00-00 00:00:00 | || detected_spam | int(10) unsigned | | | 0 | || detected_virus | int(10) unsigned | | | 0 | || processed | int(10) unsigned | | | 0 | || allowed | int(10) unsigned | | | 0 | || suspected | int(10) unsigned | | | 0 | || blocked | int(10) unsigned | | | 0 | || spam | int(10) unsigned | | | 0 | || virus | int(10) unsigned | | | 0 | |I WANT REPORT LIKE FOLLOWINGSdate sender processed spam suspectedI want top 10 spam sender each day.QUery i'm using select date_format(time,'%Y-%d-%m'),report_sender_domain_id,processed ,spam from report1order by spam desc ,report_sender_domain_id,date_format(time,'%Y-%d-%m') limit 10;Please suggest.thanks It was the weekend. Not everyone lurks on their days off.Try thisselect date_format(time,'%Y-%d-%m'),report_sender_domain_id,processed ,spam ,suspectedfrom report1order by spam desc,report_sender_domain_id,date_format(time,'%Y-%d-%m') limit 10;or if there is more than one entry per spammer per daySELECT date_format(time,'%Y-%d-%m'),report_sender_domain_id,sum(processed) as processed,sum(spam) as spam ,sum(suspected) as suspectedFROM report1GROUP BY date_format(time,'%Y-%d-%m'),report_sender_domain_idORDER BY spam desc,report_sender_domain_id,date_format(time,'%Y-%d-%m') LIMIT 10;That will give you their total stats for each day.Shawn GreenDatabase AdministratorUnimin Corporation - Spruce Pine--------------------------------- OK, then what you want to do will take two steps and another table (unless you want to write application code) to do with MySQL;First step is to create a table where the intermediate results can be stored. What this table does is number each row for each day. Since it appears that you may want this information frequently (but only need to update it at the end of every day) I will make this a permanent table (not a temporary table).CREATE TABLE spam_stats (report_date date not null,report_sender_domain_id int unsigned not null,rank int unsigned auto_increment,processed int unsigned,spam int unsigned,suspected int unsigned,PRIMARY KEY (report_date, rank),UNIQUE KEY (report_sender_domain_id, report_date),KEY (rank));The UNIQUE key ensures that the same spammer cannot have more than one entry per day while the PRIMARY KEY allows for a groupwize autonumber (each entry per day gets it's own number, rank, starting at 1). The last key is optional but will seriously speed up the returns for the report you wanted. Next we need to populate our new table with the "spam" report data (filled in from most spam to least spam per domain per day).INSERT spam_stats (report_date, report_sender_domain_id, processed, spam, suspected)SELECT date(`time`),report_sender_domain_id,sum(processed) as processed,sum(spam) as spam ,sum(suspected) as suspectedFROM report1GROUP BY date(`time`),report_sender_domain_id ORDER BY spam desc,report_sender_domain_id,date(`time`);What will happen is that the rank column will be automatically filled in for each day's spam statistics. Then to get your "top ten" list you just run:SELECT * FROM spam_statsWHERE rank <= 10;Update spam_stats at the end of each new day and will only need to calculate those values once. If you create more tables like spam_stats for your other statistics (processed, suspected, blocked, etc.), you will be able to do rank-based reporting for those statistics, too ("top 20 sites blocked", for example).If you play around with the general idea of the groupwize numbering feature of auto_increment + primary key, I am sure you can solve a lot of your reporting issues.Shawn GreenDatabase AdministratorUnimin Corporation - Spruce PineYahoo! MailStay connected, organized, and protected. Take the tour ---------------------------------Yahoo! Mail Stay connected, organized, and protected. Take the tour --------------------------------- No virus found in this incoming message.Checked by AVG Anti-Virus.Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005 No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005 --------------------------------- Yahoo! Mail Mobile Take Yahoo! Mail with you! Check email on your mobile phone. --------------------------------- No virus found in this incoming message.Checked by AVG Anti-Virus.Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005 No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005 --------------------------------- Do you Yahoo!? Make Yahoo! your home page