On Tue, Nov 29, 2011 at 4:13 PM, Rajeev Prasad <rp.ne...@yahoo.com> wrote:

> hello,
>
> i am trying to extract email address from a file, but not quite
> successful. here is what i have:
>
> the file:
> myterqlqt qntmrq Prqtesm qltul qzeez Smqik qltulqzee...@jmqil.com 976665
> myterqlqt qntmrq Prqtesm teepqk Mittql teep...@jmqil.com 939383
> Onjole qntmrq Prqtesm lmqrqtm Etqrq cont...@lmqrqteeyqm.orj 9889
> Vijqyqwqtq qntmrq Prqtesm Sitmqrtmq si...@msitmu.in 939775777
> Visqkmqpqtnqm qntmrq Prqtesm Smyqmprqsqt Mqntri 
> mumqnrijmts...@yqmoo.co.in9735566
> Wqrqnjql qntmrq Prqtesm Smqsmi qrjulq smqsmi.qrj...@jmqil.com 996666799
> juntur qntmrq Prqtesm Rqvitejq Jqllepqlli rqvte...@jmqil.com 983
> jooty qntmrq Prqtesm Sqtti Kumqr  ys...@jmqil.com 986663,
> West jotqvqri (Eluru) qntmrq Prqtesm Rqm Prqsqt rqmprqsqttujji@yqmoo.com96 59
> Mqncmeriql qntmrq Prqtesm Smqntmilql jqmlotm 
> smqntmilql.jqmlotm@live.com933565 898575
> Kmqmmqm qntmrq Prqtesm Lqksmmqn Rqo jqtipqrtmy jqtipqrt...@jmqil.com
> Kurnool (Nemru Nqjqr) qntmrq Prqtesm lqntulmqi Iqlql mussqin
> limuss...@yqmoo.co.in 986, 8958575, 8958575
>
>
>
> my attempt:
> perl -ple 's/^.*\s(\w*@\w*.\w+).*$/$1/'  <file>
>
> my result:
> qltulqzee...@jmqil.com
> teep...@jmqil.com
> cont...@lmqrqteeyqm.orj
> si...@msitmu.in
> mumqnrijmts...@yqmoo.co
> Wqrqnjql qntmrq Prqtesm Smqsmi qrjulq smqsmi.qrj...@jmqil.com 996666799
> rqvte...@jmqil.com
> ys...@jmqil.com
> rqmprqsqttu...@yqmoo.com
> Mqncmeriql qntmrq Prqtesm Smqntmilql jqmlotm 
> smqntmilql.jqmlotm@live.com933565 898575
> jqtipqrt...@jmqil.com
> limuss...@yqmoo.co
>
>
> please advise how should be my regex?
>
> thx.



Hi Rajeev,

If I write a regular expression just for your example list, I would say
this will work:

perl -ple 's/^.*\b(\w+@\w*\.\w+)\b.*$/$1/' <file>

But I have done this before, and I know that some people are funny and have
an email address that looks like one of the examples below.

This is crap because of the dots dot...@website.at 234234234, 234523423
This is also crap because of the dots
dot.at@subdomain.website.at234234234, 234523423

You would get this out of your file:

qltulqzee...@jmqil.com
teep...@jmqil.com
cont...@lmqrqteeyqm.orj
si...@msitmu.in
mumqnrijmts...@yqmoo.co
qrj...@jmqil.com
rqvte...@jmqil.com
ys...@jmqil.com
rqmprqsqttu...@yqmoo.com
jqml...@live.com
jqtipqrt...@jmqil.com
limuss...@yqmoo.co
a...@website.at
at@subdomain.website

Not too good now, is it? Therefore I would suggest the following, much more
permissive trick:

perl -ple 's/^.*\s(.+@.+\.\w+)\s.*$/$1/' <file>

This way you are saying that you want to capture whatever is preceded by
whitespace: some stuff, an @, some more stuff, a dot and some word
characters, followed by whitespace. This should capture most emails in your
file without any real restrictions on the formatting. :-) Note that it needs
whitespace on both sides, so an address at the very end of a line is not
matched, and digits glued to the address end up in the capture.
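
Some lines in your file have digits glued straight onto the address, and at
least one has the address at the very end of the line. One alternative, as a
rough sketch that I have not run against your real file, is to print only
the matches instead of rewriting the whole line. The function name
extract_emails and the sample addresses are made up, and the pattern assumes
the last label of the domain consists of letters only.

```shell
# Print every email-like token, one per line.
#   [\w.+-]+          local part (word characters, dot, plus, minus)
#   \@                literal @ (escaped so Perl never treats it as an array)
#   [\w-]+            first domain label
#   (?:\.[\w-]+)*     any further labels
#   \.[A-Za-z]{2,}    final label, letters only, so glued digits are dropped
extract_emails() {
  perl -nle 'print for /([\w.+-]+\@[\w-]+(?:\.[\w-]+)*\.[A-Za-z]{2,})/g' "$@"
}

# Usage with hypothetical data on stdin; with a file, run: extract_emails <file>
printf '%s\n' \
  'Name One first.last@example.com 12345' \
  'Name Two user@mail.example.co.in9735566' \
  'Name Three tail@example.org' | extract_emails
```

Because nothing is required before or after the address, the position on the
line does not matter, and lines without any address print nothing instead of
coming through unchanged.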

Regards,

Rob
