Date sent: Thu, 12 Jul 2001 13:01:42 +0200
From: Nicolas JOURDEN <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Regexp / Sql-perl problem.
> Hi guys,
>
> I have an SQL database with a lot of emails and names from users who
> registered in it.
>
> I'd like to do statistics on the part of the mail address after the @,
> you know, the 'domain'.
>
> So I have this:
> Id - EMail :
> 1 - [EMAIL PROTECTED]
> 12 - [EMAIL PROTECTED]
> 11 - [EMAIL PROTECTED]
> 121 - [EMAIL PROTECTED]
> 52 - [EMAIL PROTECTED]
> 151 - [EMAIL PROTECTED]
> 2 - [EMAIL PROTECTED]
>
> How should I extract tata.com, titi.com, and maybe count them?
> If you have an idea, I'll be very happy!
>
> The other way could be to select all email addresses, regex them, do a ++
> count, and then display, but later it'll take too much time :/
If you do not want to use a regex, you need the domain name directly
from your database. Restructure your table to hold an additional
column called domain, and put the domain name in there. Then you can
say
"SELECT COUNT(domain) AS total, domain FROM Emails GROUP BY
domain";
Nevertheless, this may not be absolutely regex-free, because you must
extract the domain somehow before inserting it into the table...
In addition, this solution may affect the rest of your app...
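To fill such a domain column, the extraction has to happen once per address before the insert. A minimal sketch of that step follows. The helper name domain_of and the sample addresses are made up for illustration, and lowercasing the domain is my own assumption, so that Tata.com and tata.com land in the same group:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the lowercased
# domain of an address, or undef when there is no usable "@domain" part.
sub domain_of {
    my ($email) = @_;
    return undef unless defined $email;
    return $email =~ /\@([^\@\s]+)\s*$/ ? lc $1 : undef;
}

# Made-up sample addresses, for illustration only.
for my $email ('alice@Tata.com', 'bob@titi.com', 'no-at-sign') {
    my $domain = domain_of($email);
    print defined $domain ? "$email -> $domain\n" : "$email -> (no domain)\n";
}
```

The returned value would then be passed as a bind parameter to whatever INSERT or UPDATE statement maintains the domain column.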
Another possibility is to dump out the whole table as a text file
with your database's dumping facility and solve your problem with the
classic approach (scanning linewise: chomp; $frequency{$1}++ if
/\@([^\@\s]+)$/;)
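Spelled out, the linewise scan could look roughly like the sketch below. The sample data is invented and merely imitates the "Id - EMail" layout from the question. A real dump would be opened from a file and may well look different, so treat the pattern as an assumption to adapt:

```perl
use strict;
use warnings;

# Made-up sample standing in for a dump file; a real dump would be opened
# with open(my $fh, '<', $path) instead. The "Id - EMail" layout is assumed.
my $dump = <<'END';
1 - alice@tata.com
12 - bob@titi.com
11 - carol@tata.com
121 - broken-line-without-domain
END

open(my $fh, '<', \$dump) or die "Cannot open in-memory dump: $!";

my %frequency;
while (my $line = <$fh>) {
    chomp $line;
    # Count only when the match succeeds, so a stale $1 is never reused.
    $frequency{lc $1}++ if $line =~ /\@([^\@\s]+)\s*$/;
}
close $fh;

# Most frequent domain first; ties are broken alphabetically.
for my $domain (sort { $frequency{$b} <=> $frequency{$a} || $a cmp $b } keys %frequency) {
    printf "%-20s %d\n", $domain, $frequency{$domain};
}
```

Lines without an @ are simply skipped, which keeps one malformed row from being counted under the previous row's domain.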
Whether any of these approaches is faster than just using your database
as is, saying
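my $sql = "SELECT EMail FROM $table";
...prepare and execute...
my %frequency;
# list assignment, so an empty or "0" value does not end the loop early
while ( my ($email) = $sth->fetchrow_array ) {
  # count only on a successful match, so a stale $1 is never reused
  $frequency{$1}++ if defined $email and $email =~ /\@([^\@\s]+)$/;
}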
remains a matter of benchmarking...
Bodo
[EMAIL PROTECTED]