Let me give you a better dataset:

msg_sender_num | env_sender_num
----------------+----------------
            118 |          53003
            118 |          51778
            118 |          49679
            118 |          49457
            118 |          37434
            118 |          37389
            118 |          33644
            118 |          33609
            118 |          26043
            118 |          26004
            118 |          24288
            118 |          23357
            118 |          16246
            118 |          16103
            118 |          12967
            118 |          12140
            118 |           4191
            118 |            122
            118 |            860

with the SQL:

SELECT record_of_claims.msg_sender_num,
       count(DISTINCT record_of_claims.env_sender_num) AS env_from_claims_count
  FROM record_of_claims
 WHERE record_of_claims.msg_sender_num = 118
 GROUP BY record_of_claims.msg_sender_num;

You get:

msg_sender_num | env_from_claims_count
----------------+-----------------------
            118 |                    19

Which is correct for the following reason:

msg_sender_num |                envelope_from
----------------+---------------------------------------------
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]
            118 | [EMAIL PROTECTED]

19 different envelope-from addresses, all claiming to be the same message
From: sender 118 ("TopOffers [EMAIL PROTECTED]").

All of the above addresses would be added to the blacklist for 120 days.
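
Roughly speaking (the blacklist table and its columns below are just
illustrative, not my real schema), that insert looks something like:

-- Hypothetical blacklist table: (env_sender_num, expires).
INSERT INTO blacklist (env_sender_num, expires)
SELECT DISTINCT env_sender_num,
       now() + interval '120 days'
  FROM record_of_claims
 WHERE msg_sender_num = 118;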

If I create a new table every 15 minutes or so with the results of the SQL
view, it sort of solves my problem. The rebuild still takes forever to
process, but the next view that needs these results can do an index scan on
the resulting table instead of building the list all over again. In this way
I can cut my processing time to 6/10ths of a second. But I have to
create/drop/rename tables on a time interval, which is not the best solution.
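
For reference, a sketch of that rebuild-and-swap run from cron every 15
minutes (table and index names here are just illustrative):

BEGIN;
CREATE TABLE claims_summary_new AS
    SELECT msg_sender_num,
           count(DISTINCT env_sender_num) AS env_from_claims_count
      FROM record_of_claims
     GROUP BY msg_sender_num;
-- Dropping the old table also drops its index, so the names can be reused.
DROP TABLE claims_summary;
ALTER TABLE claims_summary_new RENAME TO claims_summary;
CREATE INDEX claims_summary_msg_idx ON claims_summary (msg_sender_num);
COMMIT;

The other views can then index-scan claims_summary instead of re-aggregating
record_of_claims every time.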

Jerry Wintrode
Network Administrator
Tripos, Inc.




