On Wed, Jul 07, 2004 at 11:49:47AM +0800, fooler wrote:
> 
> $ lynx -source
> 'http://lists.q-linux.com/pipermail/plug/2004-July/author.html'|grep
> "<I>"|tr -d "<I>"|uniq -c|sort -r|head -20

That's all right as long as you can tolerate undecoded HTML entities and
the I element never shares a line with other elements.
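
One more caveat with the quoted pipeline: tr -d "<I>" deletes every "<", "I"
and ">" character rather than the tag, and uniq -c only merges adjacent
lines. A minimal sketch of a variant that avoids both, assuming each name
sits on its own line right after the <I> tag (count_authors is just a name
made up for illustration):

```shell
# count_authors: read the author-index HTML on stdin, print the top 20 posters.
# sed removes the tag as a string, so capital I's inside names survive.
# sort before uniq -c groups identical names even if the input is unsorted.
# sort -rn orders the counts numerically, highest first.
count_authors() {
    grep '<I>' |
    sed 's/^.*<I>//' |
    sort |
    uniq -c |
    sort -rn |
    head -20
}
```

It would be fed the same way as above, i.e. lynx -source '...' | count_authors.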

$ lynx -source 'http://lists.q-linux.com/pipermail/plug/2004-July/author.html' | 
grep '<I>.*Vanni'
<I>Paolo Vanni M. Ve&#241;egas

$ lynx -source 'http://lists.q-linux.com/pipermail/plug/2004-July/author.html' |
perl -MXML::LibXML -e 'print XML::LibXML->new->parse_html_fh(*STDIN)
->findvalue(q!//i[contains(.,"Vanni")]!)'
Paolo Vanni M. Veñegas

Out of interest, I tried writing the equivalent of your pipeline in Perl,
but this is the shortest form I could come up with :-)

$ lynx -source 'http://lists.q-linux.com/pipermail/plug/2004-July/author.html' |
perl -lne 'END{print"$_{$_}\t$_"for(sort{$_{$b}<=>$_{$a}}keys%_)[0..20]}s/<I>//&&$_{$_}++'
33      Orlando Andico
33      Zak B. Elep
23      Andy Sy
12      Sacha Chua
10      Federico Sevilla III
9       JM Ibanez
8       Prem Vilas Fortran Rara
8       Holden Hao
7       stderr
7       dido at imperium.ph
7       Kelsey Hartigan Go
6       andrelst at mozcom.com
5       ramfree26 at softhome.net
5       Joebert Jacaba
5       Arshad Amade - EBS
4       Paolo Alexis Falcone
4       CG Haravata
3       Randy Ong
3       Eric Noel
3       Ariz C. Jacinto
3       Miguel A Paraz

-- 
$_=q:; # SHERWIN #
70;72;69;6e;74;20;
27;4a;75;73;74;20;
61;6e;6f;74;68;65;
72;20;50;65;72;6c;
20;6e;6f;76;69;63;
65;27;:;;s=~?(..);
?=pack q$C$,hex$1;
;;;=egg;;;;eval;;;
--
Philippine Linux Users' Group (PLUG) Mailing List
[EMAIL PROTECTED] (#PLUG @ irc.free.net.ph)
Official Website: http://plug.linux.org.ph
Searchable Archives: http://marc.free.net.ph
.
To leave, go to http://lists.q-linux.com/mailman/listinfo/plug
.
Are you a Linux newbie? To join the newbie list, go to
http://lists.q-linux.com/mailman/listinfo/ph-linux-newbie
