Re: Regexp / Sql-perl problem.

Bodo Eing Thu, 12 Jul 2001 05:49:40 -0700
Date sent:              Thu, 12 Jul 2001 13:01:42 +0200
From:                   Nicolas JOURDEN <[EMAIL PROTECTED]>
To:                     [EMAIL PROTECTED]
Subject:                Regexp / Sql-perl problem.

> Hi guys,
> 
> I have an SQL database with a lot of email addresses and names from users 
> who have registered.
> 
> I'd like to do statistics on the part of the mail address after the @, 
> you know, the 'domain'.
> 
> So I have this:
> Id - EMail :
> 1 - [EMAIL PROTECTED]
> 12 - [EMAIL PROTECTED]
> 11 - [EMAIL PROTECTED]
> 121 - [EMAIL PROTECTED]
> 52 - [EMAIL PROTECTED]
> 151 - [EMAIL PROTECTED]
> 2 - [EMAIL PROTECTED]
> 
> How should I extract tata.com, titi.com, and maybe count them?
> If you have an idea I'll be very happy!
> 
> The other way could be to select all the email addresses, regex them, do 
> a ++ count, and then display the result, but later on it'll take too much time :/

If you do not want to use a regex, you need the domain name directly 
from your database. Restructure your table to hold an additional 
column called domain, and put the domain name in there. Then you can 
say

"SELECT COUNT(domain) AS Number, domain FROM Emails GROUP BY 
domain";

Nevertheless, this may not be absolutely regex-free, because you must 
extract the domain somehow before inserting it into the table...
In addition, this solution may affect the rest of your app...
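
Filling the new column could look roughly like the sketch below. The 
helper name domain_of and the sample addresses are made up for 
illustration, and lowercasing the domain is my own assumption (so that 
Tata.com and tata.com end up in the same group); adapt it to your data.

```perl
use strict;
use warnings;

# Hypothetical helper: return the part after the last '@', lowercased,
# or undef if the address has no '@' followed by at least one character.
sub domain_of {
    my ($email) = @_;
    return undef unless defined $email;
    return $email =~ /\@([^\@\s]+)\s*$/ ? lc $1 : undef;
}

# Invented sample addresses; the result is what you would store in the
# domain column next to each address.
for my $email ('toto@tata.com', 'Titi@Titi.COM', 'no-at-sign') {
    my $domain = domain_of($email);
    printf "%-15s => %s\n", $email, defined $domain ? $domain : '(none)';
}
```

Addresses without a usable domain come back as undef, so you can decide 
separately whether to store NULL for them or to skip them.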

Another possibility is to dump out the whole table as a text file 
with your database's dumping facility and solve your problem with the 
classic approach (scanning linewise: chomp; $frequency{$1}++ if 
/\@(.*)$/;)
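
Spelled out, the linewise scan might look like this. It is a sketch 
under assumptions: the dump is taken to hold one address per line, the 
function name count_domains is invented, and an in-memory filehandle 
with made-up addresses stands in for the real dump file.

```perl
use strict;
use warnings;

# Count domains from any filehandle that yields one address per line.
sub count_domains {
    my ($fh) = @_;
    my %frequency;
    while (my $line = <$fh>) {
        chomp $line;
        # Count only lines that match, so a stale $1 is never reused.
        $frequency{$1}++ if $line =~ /\@(.*)$/;
    }
    return \%frequency;
}

# Stand-in for the dump file: an in-memory filehandle with invented data.
my $dump = "a\@tata.com\nb\@titi.com\nc\@tata.com\nnot an address\n";
open my $fh, '<', \$dump or die "cannot open in-memory dump: $!";
my $frequency = count_domains($fh);
close $fh;

print "$_: $frequency->{$_}\n" for sort keys %$frequency;
```

For a real dump you would open the file by name instead. If the dump 
carries other columns or trailing whitespace on each line, the pattern 
has to be tightened accordingly.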

Whether either of these approaches is faster than just using your 
database as is, saying

my $sql = "SELECT EMail FROM $table";

...prepare and execute...

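while ( my ($email) = $sth->fetchrow_array ) {

        # list assignment, so an empty or "0" value does not end the loop;
        # count only on a match, so a stale $1 is never reused
        $frequency{$1}++ if defined $email && $email =~ /\@(.*)$/;
}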

remains a matter of benchmarking...
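
Whichever way %frequency gets filled, displaying the counts is then a 
short sort. A minimal sketch, with invented counts standing in for the 
hash built by the loop above:

```perl
use strict;
use warnings;

# Invented counts standing in for the %frequency hash built by the loop.
my %frequency = ('tata.com' => 3, 'titi.com' => 2, 'toto.org' => 3);

# Most frequent domain first; ties are broken alphabetically.
my @ranked = sort { $frequency{$b} <=> $frequency{$a} or $a cmp $b }
             keys %frequency;

printf "%-20s %5d\n", $_, $frequency{$_} for @ranked;
```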

Bodo
[EMAIL PROTECTED]