Thanks Shawn!!!!!!!!!!!!!!!!!!!!!

Peter Brawley <[EMAIL PROTECTED]> wrote:Seena,

That data is for one date. To test Shawn's 2-stage query I'd need a data for 
several days.

Without your tables and data, I tried Shawn's algorithm on a test database we 
use a lot, nwib, which is a MySQL port of the widely used (and copied) 
Microsoft Northwind database.  It has customers, orders, orderdetails, payments 
etc, and it's populated. Your question, 
  who are the top ten spammers per day?
is logically equivalent, in nwib, to: 
  what customers made the top ten number of orders for any given year?
Translating Shawn's method to nwib.orders gives us:

CREATE TABLE orderstats (
  yr SMALLINT NOT NULL ,
  rank INT AUTO_INCREMENT,
  customerID CHAR(5) NOT NULL,
  ordercount INT UNSIGNED NOT NULL,
  PRIMARY KEY(yr,rank)
);

INSERT orderstats 
  ( yr, customerID, ordercount )
  SELECT 
    YEAR(orderdate) AS yr,
    customerID,
    COUNT(customerID) AS ordercount
  FROM orders
  GROUP BY yr, customerID
  ORDER BY yr,ordercount DESC

which, as Shawn says, auto-populates orderstats.rank for each group. After the 
above executes,
  SELECT * from orderstats WHERE rank <= 10 
gives the top ten ordering customers per year, as expected.

So I am unclear why the algorithm does not work for your spam table.

PB

-----

Seena Blace wrote: Peter 
here u go.just a few.You may see sample report.
 2005-05-05                   |                 1108529 |             0 |       
       1 |    0 |     0 |
| 2005-05-05                   |                 1108529 |             0 |      
        1 |    0 |     0 |
| 2005-05-05                   |                 1108529 |             0 |      
        1 |    0 |     0 |
| 2005-05-05                   |                 1108529 |             0 |      
        1 |    0 |     0 |
| 2005-05-05                   |                 1108529 |             0 |      
        1 |    0 |     0 |
| 2005-05-05                   |                 1108529 |             0 |      
        1 |    0 |     0 |
| 2005-05-05                   |                   14140 |             0 |      
        0 |    0 |     0 |
| 2005-05-05                   |                      46 |             1 |      
        0 |    1 |     0 |
| 2005-05-05                   |                   10378 |             0 |      
        0 |    0 |     0 |
| 2005-05-05                   |                   16068 |             0 |      
        0 |    0 |     0 |
| 2005-05-05                   |                  110085 |             0 |      
        1 |    0 |     0

Peter Brawley <[EMAIL PROTECTED]> wrote:
Seena,

I'm curious how Shawn's elegant-looking solution fails. Would you mind sending 
me a small data sample so I can explore his solution? Thx.

Peter Brawley

-----

Seena Blace wrote: 
Shawan,Here you go.mysql> select * from spam_stats    -> where rank <=10 
;+-------------+-------------------------+------+-----------+-------+-----------+|
 report_date | report_sender_domain_id | rank | processed | spam  | suspected 
|+-------------+-------------------------+------+-----------+-------+-----------+|
 2005-04-07  |                      46 |    1 |     22054 | 19263 |        13 
|| 2005-04-06  |                      46 |    2 |     20229 | 16998 |        37 
|| 2005-04-08  |                      46 |    3 |     19493 | 16443 |        24 
|| 2005-04-05  |                      46 |    4 |     19322 | 15921 |       158 
|| 2005-04-29  |                      46 |    5 |     19241 | 15804 |         8 
|| 2005-05-02  |                      46 |    6 |     47732 | 15287 |        82 
|| 2005-05-04  |                      46 |    7 |     91907 | 14275 |        25 
|| 2005-04-29  |                      52 |    8 |     15875 | 13422 |         4 
|| 2005-05-02  |                      52 |   
 9 |     56530 | 13098 |        31 || 2005-05-04  |                      52 |   
10 |    129549 | 12566 |        18 
|+-------------+-------------------------+------+-----------+-------+-----------+10
 rows in set (0.00 sec)[EMAIL PROTECTED] wrote:Please try my solution before 
you tell me it's broken, OK? I know you want to see the top 10 spammers for 
EACH day. That's what I wrote for you. Please try my solution with your data 
and get back to me with the results and explain to me what's wrong so I can fix 
it.Thank you for your patience,Shawn GreenDatabase AdministratorUnimin 
Corporation - Spruce PineSeena Blace wrote on 05/16/2005 04:41:19 PM:  

Shawn,query SELECT * FROM spam_statsWHERE rank <= 10;will return all rows which 
I don't want.I need datewise top 10 spam domain.means condition would be serach 
those rows which are having top 10 spam (means highest) on each day and show 
the output like which I send [EMAIL PROTECTED] wrote:Seena Blace wrote on 
05/16/2005 11:13:48 AM:    

shawnI think 2nd query will return only 10 rows.I want out like followingsdate 
domain spam05/05/05 hotmail.com 12005/05/05 yahoo.com 11005/05/05 abc.com 
9905/05/05 def.com 8005/05/05 mnpo.net 
79......................................... like that upto 
10--------------------------------------------------05/06/05 yahoo.com 
30005/06/05 def.com 25005/06/05 zer.com 
200..................................like that upto 10Each day there are 
multiple entry from diffrent domains or same       

domain.  

I want each day whatever top 10 spam sender [EMAIL PROTECTED] wrote:Seena Blace 
wrote on 05/16/2005 10:08:15 AM:      

Any suggestion pl?Seena Blace wrote:hi,here is table 
descriptionreport1+-------------------------+------------------+------+-----+---------------------+----------------+|
 Field | Type | Null | Key | Default | Extra 
|+-------------------------+------------------+------+-----+---------------------+----------------+|
 id | int(10) unsigned | | PRI | NULL | auto_increment || host_id | int(10) 
unsigned | | MUL | 0 | || report_rcpt_domain_id | int(10) unsigned | YES | MUL 
| NULL | || report_sender_domain_id | int(10) unsigned | YES | MUL | NULL | || 
report_ipaddress_id | int(10) unsigned | YES | MUL | NULL | || time | datetime 
| | MUL | 0000-00-00 00:00:00 | || detected_spam | int(10) unsigned | | | 0 | 
|| detected_virus | int(10) unsigned | | | 0 | || processed | int(10) unsigned 
| | | 0 | || allowed | int(10) unsigned | | | 0 | || suspected | int(10) 
unsigned | | | 0 | || blocked | int(10) unsigned | | | 0 | || spam | int(10) 
unsigned | | | 0 | || virus | int(10) unsigned | | | 0 | |I
 WANT REPORT LIKE FOLLOWINGSdate sender processed spam suspectedI want top 10 
spam sender each day.QUery i'm using select 
date_format(time,'%Y-%d-%m'),report_sender_domain_id,processed ,spam from 
report1order by spam desc ,report_sender_domain_id,date_format(time,'%Y-%d-%m') 
limit 10;Please suggest.thanks        

It was the weekend. Not everyone lurks on their days off.Try thisselect 
date_format(time,'%Y-%d-%m'),report_sender_domain_id,processed ,spam 
,suspectedfrom report1order by spam 
desc,report_sender_domain_id,date_format(time,'%Y-%d-%m') limit 10;or if there 
is more than one entry per spammer per daySELECT 
date_format(time,'%Y-%d-%m'),report_sender_domain_id,sum(processed) as 
processed,sum(spam) as spam ,sum(suspected) as suspectedFROM report1GROUP BY 
date_format(time,'%Y-%d-%m'),report_sender_domain_idORDER BY spam 
desc,report_sender_domain_id,date_format(time,'%Y-%d-%m') LIMIT 10;That will 
give you their total stats for each day.Shawn GreenDatabase AdministratorUnimin 
Corporation - Spruce Pine---------------------------------      

OK, then what you want to do will take two steps and another table     

(unless   

you want to write application code) to do with MySQL;First step is to create a 
table where the intermediate results can be stored. What this table does is 
number each row for each day. Since it appears that you may want this 
information frequently (but only need to update it at the end of every day) I 
will make this a permanent table     

(not   

a temporary table).CREATE TABLE spam_stats (report_date date not 
null,report_sender_domain_id int unsigned not null,rank int unsigned 
auto_increment,processed int unsigned,spam int unsigned,suspected int 
unsigned,PRIMARY KEY (report_date, rank),UNIQUE KEY (report_sender_domain_id, 
report_date),KEY (rank));The UNIQUE key ensures that the same spammer cannot 
have more than one entry per day while the PRIMARY KEY allows for a groupwize 
autonumber (each entry per day gets it's own number, rank, starting at 1). The 
last     

  

key is optional but will seriously speed up the returns for the report     

you   

wanted. Next we need to populate our new table with the "spam" report     

data   

(filled in from most spam to least spam per domain per day).INSERT spam_stats 
(report_date, report_sender_domain_id, processed,     

spam,   

suspected)SELECT date(`time`),report_sender_domain_id,sum(processed) as 
processed,sum(spam) as spam ,sum(suspected) as suspectedFROM report1GROUP BY 
date(`time`),report_sender_domain_id ORDER BY spam 
desc,report_sender_domain_id,date(`time`);What will happen is that the rank 
column will be automatically filled in     

  

for each day's spam statistics. Then to get your "top ten" list you just     

  

run:SELECT * FROM spam_statsWHERE rank <= 10;Update spam_stats at the end of 
each new day and will only need to calculate those values once. If you create 
more tables like spam_stats     

for   

your other statistics (processed, suspected, blocked, etc.), you will be     

  

able to do rank-based reporting for those statistics, too ("top 20 sites     

  

blocked", for example).If you play around with the general idea of the 
groupwize numbering feature of auto_increment + primary key, I am sure you can 
solve a lot     

of   

your reporting issues.Shawn GreenDatabase AdministratorUnimin Corporation - 
Spruce PineYahoo! MailStay connected, organized, and protected. Take the tour   
 

                ---------------------------------Yahoo! Mail Stay connected, 
organized, and protected. Take the tour  


---------------------------------
No virus found in this incoming message.Checked by AVG Anti-Virus.Version: 
7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005  
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005



---------------------------------
Yahoo! Mail Mobile
Take Yahoo! Mail with you! Check email on your mobile phone. 

---------------------------------
No virus found in this incoming message.Checked by AVG Anti-Virus.Version: 
7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005  
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005


                
---------------------------------
Do you Yahoo!?
 Make Yahoo! your home page   

Reply via email to