Thanks Shawn!!!!!!!!!!!!!!!!!!!!!
Peter Brawley <[EMAIL PROTECTED]> wrote:Seena,
That data is for one date. To test Shawn's 2-stage query I'd need a data for
several days.
Without your tables and data, I tried Shawn's algorithm on a test database we
use a lot, nwib, which is a MySQL port of the widely used (and copied)
Microsoft Northwind database. It has customers, orders, orderdetails, payments
etc, and it's populated. Your question,
who are the top ten spammers per day?
is logically equivalent, in nwib, to:
what customers made the top ten number of orders for any given year?
Translating Shawn's method to nwib.orders gives us:
CREATE TABLE orderstats (
yr SMALLINT NOT NULL ,
rank INT AUTO_INCREMENT,
customerID CHAR(5) NOT NULL,
ordercount INT UNSIGNED NOT NULL,
PRIMARY KEY(yr,rank)
);
INSERT orderstats
( yr, customerID, ordercount )
SELECT
YEAR(orderdate) AS yr,
customerID,
COUNT(customerID) AS ordercount
FROM orders
GROUP BY yr, customerID
ORDER BY yr,ordercount DESC
which, as Shawn says, auto-populates orderstats.rank for each group. After the
above executes,
SELECT * from orderstats WHERE rank <= 10
gives the top ten ordering customers per year, as expected.
So I am unclear why the algorithm does not work for your spam table.
PB
-----
Seena Blace wrote: Peter
here u go.just a few.You may see sample report.
2005-05-05 | 1108529 | 0 |
1 | 0 | 0 |
| 2005-05-05 | 1108529 | 0 |
1 | 0 | 0 |
| 2005-05-05 | 1108529 | 0 |
1 | 0 | 0 |
| 2005-05-05 | 1108529 | 0 |
1 | 0 | 0 |
| 2005-05-05 | 1108529 | 0 |
1 | 0 | 0 |
| 2005-05-05 | 1108529 | 0 |
1 | 0 | 0 |
| 2005-05-05 | 14140 | 0 |
0 | 0 | 0 |
| 2005-05-05 | 46 | 1 |
0 | 1 | 0 |
| 2005-05-05 | 10378 | 0 |
0 | 0 | 0 |
| 2005-05-05 | 16068 | 0 |
0 | 0 | 0 |
| 2005-05-05 | 110085 | 0 |
1 | 0 | 0
Peter Brawley <[EMAIL PROTECTED]> wrote:
Seena,
I'm curious how Shawn's elegant-looking solution fails. Would you mind sending
me a small data sample so I can explore his solution? Thx.
Peter Brawley
-----
Seena Blace wrote:
Shawan,Here you go.mysql> select * from spam_stats -> where rank <=10
;+-------------+-------------------------+------+-----------+-------+-----------+|
report_date | report_sender_domain_id | rank | processed | spam | suspected
|+-------------+-------------------------+------+-----------+-------+-----------+|
2005-04-07 | 46 | 1 | 22054 | 19263 | 13
|| 2005-04-06 | 46 | 2 | 20229 | 16998 | 37
|| 2005-04-08 | 46 | 3 | 19493 | 16443 | 24
|| 2005-04-05 | 46 | 4 | 19322 | 15921 | 158
|| 2005-04-29 | 46 | 5 | 19241 | 15804 | 8
|| 2005-05-02 | 46 | 6 | 47732 | 15287 | 82
|| 2005-05-04 | 46 | 7 | 91907 | 14275 | 25
|| 2005-04-29 | 52 | 8 | 15875 | 13422 | 4
|| 2005-05-02 | 52 |
9 | 56530 | 13098 | 31 || 2005-05-04 | 52 |
10 | 129549 | 12566 | 18
|+-------------+-------------------------+------+-----------+-------+-----------+10
rows in set (0.00 sec)[EMAIL PROTECTED] wrote:Please try my solution before
you tell me it's broken, OK? I know you want to see the top 10 spammers for
EACH day. That's what I wrote for you. Please try my solution with your data
and get back to me with the results and explain to me what's wrong so I can fix
it.Thank you for your patience,Shawn GreenDatabase AdministratorUnimin
Corporation - Spruce PineSeena Blace wrote on 05/16/2005 04:41:19 PM:
Shawn,query SELECT * FROM spam_statsWHERE rank <= 10;will return all rows which
I don't want.I need datewise top 10 spam domain.means condition would be serach
those rows which are having top 10 spam (means highest) on each day and show
the output like which I send [EMAIL PROTECTED] wrote:Seena Blace wrote on
05/16/2005 11:13:48 AM:
shawnI think 2nd query will return only 10 rows.I want out like followingsdate
domain spam05/05/05 hotmail.com 12005/05/05 yahoo.com 11005/05/05 abc.com
9905/05/05 def.com 8005/05/05 mnpo.net
79......................................... like that upto
10--------------------------------------------------05/06/05 yahoo.com
30005/06/05 def.com 25005/06/05 zer.com
200..................................like that upto 10Each day there are
multiple entry from diffrent domains or same
domain.
I want each day whatever top 10 spam sender [EMAIL PROTECTED] wrote:Seena Blace
wrote on 05/16/2005 10:08:15 AM:
Any suggestion pl?Seena Blace wrote:hi,here is table
descriptionreport1+-------------------------+------------------+------+-----+---------------------+----------------+|
Field | Type | Null | Key | Default | Extra
|+-------------------------+------------------+------+-----+---------------------+----------------+|
id | int(10) unsigned | | PRI | NULL | auto_increment || host_id | int(10)
unsigned | | MUL | 0 | || report_rcpt_domain_id | int(10) unsigned | YES | MUL
| NULL | || report_sender_domain_id | int(10) unsigned | YES | MUL | NULL | ||
report_ipaddress_id | int(10) unsigned | YES | MUL | NULL | || time | datetime
| | MUL | 0000-00-00 00:00:00 | || detected_spam | int(10) unsigned | | | 0 |
|| detected_virus | int(10) unsigned | | | 0 | || processed | int(10) unsigned
| | | 0 | || allowed | int(10) unsigned | | | 0 | || suspected | int(10)
unsigned | | | 0 | || blocked | int(10) unsigned | | | 0 | || spam | int(10)
unsigned | | | 0 | || virus | int(10) unsigned | | | 0 | |I
WANT REPORT LIKE FOLLOWINGSdate sender processed spam suspectedI want top 10
spam sender each day.QUery i'm using select
date_format(time,'%Y-%d-%m'),report_sender_domain_id,processed ,spam from
report1order by spam desc ,report_sender_domain_id,date_format(time,'%Y-%d-%m')
limit 10;Please suggest.thanks
It was the weekend. Not everyone lurks on their days off.Try thisselect
date_format(time,'%Y-%d-%m'),report_sender_domain_id,processed ,spam
,suspectedfrom report1order by spam
desc,report_sender_domain_id,date_format(time,'%Y-%d-%m') limit 10;or if there
is more than one entry per spammer per daySELECT
date_format(time,'%Y-%d-%m'),report_sender_domain_id,sum(processed) as
processed,sum(spam) as spam ,sum(suspected) as suspectedFROM report1GROUP BY
date_format(time,'%Y-%d-%m'),report_sender_domain_idORDER BY spam
desc,report_sender_domain_id,date_format(time,'%Y-%d-%m') LIMIT 10;That will
give you their total stats for each day.Shawn GreenDatabase AdministratorUnimin
Corporation - Spruce Pine---------------------------------
OK, then what you want to do will take two steps and another table
(unless
you want to write application code) to do with MySQL;First step is to create a
table where the intermediate results can be stored. What this table does is
number each row for each day. Since it appears that you may want this
information frequently (but only need to update it at the end of every day) I
will make this a permanent table
(not
a temporary table).CREATE TABLE spam_stats (report_date date not
null,report_sender_domain_id int unsigned not null,rank int unsigned
auto_increment,processed int unsigned,spam int unsigned,suspected int
unsigned,PRIMARY KEY (report_date, rank),UNIQUE KEY (report_sender_domain_id,
report_date),KEY (rank));The UNIQUE key ensures that the same spammer cannot
have more than one entry per day while the PRIMARY KEY allows for a groupwize
autonumber (each entry per day gets it's own number, rank, starting at 1). The
last
key is optional but will seriously speed up the returns for the report
you
wanted. Next we need to populate our new table with the "spam" report
data
(filled in from most spam to least spam per domain per day).INSERT spam_stats
(report_date, report_sender_domain_id, processed,
spam,
suspected)SELECT date(`time`),report_sender_domain_id,sum(processed) as
processed,sum(spam) as spam ,sum(suspected) as suspectedFROM report1GROUP BY
date(`time`),report_sender_domain_id ORDER BY spam
desc,report_sender_domain_id,date(`time`);What will happen is that the rank
column will be automatically filled in
for each day's spam statistics. Then to get your "top ten" list you just
run:SELECT * FROM spam_statsWHERE rank <= 10;Update spam_stats at the end of
each new day and will only need to calculate those values once. If you create
more tables like spam_stats
for
your other statistics (processed, suspected, blocked, etc.), you will be
able to do rank-based reporting for those statistics, too ("top 20 sites
blocked", for example).If you play around with the general idea of the
groupwize numbering feature of auto_increment + primary key, I am sure you can
solve a lot
of
your reporting issues.Shawn GreenDatabase AdministratorUnimin Corporation -
Spruce PineYahoo! MailStay connected, organized, and protected. Take the tour
---------------------------------Yahoo! Mail Stay connected,
organized, and protected. Take the tour
---------------------------------
No virus found in this incoming message.Checked by AVG Anti-Virus.Version:
7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005
---------------------------------
Yahoo! Mail Mobile
Take Yahoo! Mail with you! Check email on your mobile phone.
---------------------------------
No virus found in this incoming message.Checked by AVG Anti-Virus.Version:
7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.11.12 - Release Date: 5/17/2005
---------------------------------
Do you Yahoo!?
Make Yahoo! your home page