[PHP] Strip emails from a document

2013-01-26 Thread Tedd Sperling
Hi gang:

I thought I had a function to strip emails from a document, but I can't find it.

So, before I start writing a common script, do any of you have a simple script 
to do this?

Here's an example of the problem:

Before:

Will Alex ale...@cit.msu.edu;Moita Zact za...@cit.msu.edu;Bob Arms 
ar...@cit.msu.edu;Meia Terms term...@cit.msu.edu;

After:

ale...@cit.msu.edu
za...@cit.msu.edu
ar...@cit.msu.edu
term...@cit.msu.edu

Cheers,

tedd


_
t...@sperling.com
http://sperling.com


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Strip emails from a document

2013-01-26 Thread shiplu
I think you meant extract emails from a document, right?

I'd probably find each `@` and scan backward and forward until I hit a POSIX
punct or space character. But that will probably give some false matches, so
it's really hard to find 100% of the emails in arbitrary text. This is because
a valid email address can contain many different types of characters. According
to RFC 822, even a space is valid inside a quoted local part. So finding all
the valid emails is tough.
In a *trivial situation* emails are separated by whitespace. So find the @
first, then go backward and forward to the nearest whitespace. You'll get most
common emails. Something like this regex pattern
[^[:space:]@]+@[^[:space:]]+ would suffice.
But keep in mind, it'll work only on trivial cases, not on special cases.
A simple regular expression can't cover the special cases. Here is a full
RFC-822 compliant email matching regular expression:
http://ex-parrot.com/~pdw/Mail-RFC822-Address.html
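
A minimal sketch of the whitespace approach, to show where it holds up and
where it doesn't. The function names and sample addresses are made up for
illustration. When entries are packed together with semicolons, as in Tedd's
sample, the plain pattern drags the separator along, so the second function
assumes `;` is a delimiter and excludes it as well.

```php
<?php
// Sketch only: function names and sample data are hypothetical.

// The pattern from the message above: anything that is not whitespace
// or @, then an @, then anything that is not whitespace.
function find_by_whitespace(string $text): array
{
    preg_match_all('/[^[:space:]@]+@[^[:space:]]+/', $text, $matches);
    return $matches[0];
}

// Same idea, but treating ';' as a separator too (an assumption that
// fits semicolon-separated input, not arbitrary text).
function find_by_delimiters(string $text): array
{
    preg_match_all('/[^[:space:]@;]+@[^[:space:];]+/', $text, $matches);
    return $matches[0];
}

$packed = 'Alice alice@example.com;Bob bob@example.org;';

print_r(find_by_whitespace($packed));
// 'alice@example.com;Bob' and 'bob@example.org;' (separator included)

print_r(find_by_delimiters($packed));
// 'alice@example.com' and 'bob@example.org'
```

Neither pattern checks that what it found is a valid address; both only cut
the text at the chosen delimiters.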


More information can be found on
http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address





-- 
Shiplu.Mokadd.im
ImgSign.com | A dynamic signature machine
Innovation distinguishes between follower and leader


Re: [PHP] Strip emails from a document

2013-01-26 Thread Jonathan Sundquist
If you expect the email addresses to always share the same domain, with only
the first part differing, you can write a regular expression that matches
exactly that. Overall, a regular expression is going to be your best bet, as
shiplu suggested.
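
A rough sketch of the fixed-domain idea. The function name, the domain and
the addresses are placeholders, and the local-part character set is a narrow
assumption, not the full RFC grammar.

```php
<?php
// Sketch only: emails_at_domain() and the sample data are hypothetical.
function emails_at_domain(string $text, string $domain): array
{
    // preg_quote() escapes the dots in the domain so they match literally.
    // The trailing \b only requires that the domain is not followed
    // directly by a letter, digit or underscore.
    $pattern = '/[a-z0-9._%+\-]+@' . preg_quote($domain, '/') . '\b/i';
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$text = 'Alice alice@dept.example.edu;Bob bob@dept.example.edu;Eve eve@other.example.org;';

print_r(emails_at_domain($text, 'dept.example.edu'));
// 'alice@dept.example.edu' and 'bob@dept.example.edu'
```

Pinning the domain removes most of the guesswork about where an address
ends, which is the hard part in free text.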





Re: [PHP] Strip emails from a document

2013-01-26 Thread Tedd Sperling

On Jan 26, 2013, at 12:20 PM, Daniel Brown danbr...@php.net wrote:

 It's imperfect, but will work for the majority of emails:

 <?php
 function scrape_emails($input) {
     preg_match_all('/\b([a-z0-9%\._\+\-]+@[a-z0-9-\.]+\.[a-z]{2,6})\b/Ui', $input, $matches);
     return $matches;
 }
 ?>

It works imperfectly enough for me. :-)

Here's the result:

http://www.webbytedd.com/aa/strip-email/index.php

Thanks to all.

Cheers,

tedd

PS: Yes, 'extract' is what I meant; it's more correct than 'strip', as I said.
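
For reference, preg_match_all() fills $matches with one array per capture
group, so the quoted scrape_emails() returns nested arrays and not a plain
list. Below is a hedged variant that returns a flat, de-duplicated list. The
name extract_emails() and the sample data are made up. The sketch also leaves
out the U (ungreedy) modifier: it makes the quantifiers lazy, which as far as
I can tell can end a match early on a multi-label domain.

```php
<?php
// Sketch only: extract_emails() and the sample data are hypothetical.
// Same character classes as the quoted pattern, greedy quantifiers.
function extract_emails(string $input): array
{
    $pattern = '/\b[a-z0-9%._+\-]+@[a-z0-9.\-]+\.[a-z]{2,6}\b/i';
    preg_match_all($pattern, $input, $matches);

    // $matches[0] holds the full matches; drop repeats and reindex.
    return array_values(array_unique($matches[0]));
}

$before = 'Ann Lee ann.lee@dept.example.edu;Bo Ray bo_ray@dept.example.edu;'
        . 'Ann Lee ann.lee@dept.example.edu;';

// Prints one address per line, like the "After" listing in the first message.
echo implode("\n", extract_emails($before)), "\n";
```

The {2,6} limit on the last label is inherited from the quoted pattern and
will miss longer top-level domains.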




Re: [PHP] Strip emails from a document

2013-01-26 Thread shiplu
What is your input?








[PHP] Re: Strip emails from a document

2013-01-26 Thread Tim Streater
On 26 Jan 2013 at 16:24, Tedd Sperling t...@sperling.com wrote:

 I thought I had a function to strip emails from a document, but I can't find
 it.

 So, before I start writing a common script, do any of you have a simple script
 to do this?

I have a function that will take a comma separated string consisting of emails 
in these formats:

 Soap, Joe joe.s...@example.com
 Joe Soap  joe.s...@example.com
 (Joe Soap) joe.s...@example.com

and turn them into a list where all the above examples are converted to:

 Joe Soap  joe.s...@example.com

but it won't handle things like:

 joe@soap@example.com

which I understand is also valid.

You are welcome to it if you wish.
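
Tim's function isn't posted in the thread. The following is only a guess at
what that kind of normalization could look like for a single entry. It
assumes the three formats originally carried quotes and angle brackets
(which the archive appears to have stripped), and the name normalize_entry()
and the sample addresses are invented.

```php
<?php
// Sketch only: normalize_entry() is hypothetical, not Tim's code.
// Assumed input shapes:
//   "Soap, Joe" <joe@example.com>
//   Joe Soap <joe@example.com>
//   (Joe Soap) joe@example.com
function normalize_entry(string $entry): ?string
{
    // Take the first token that looks like an address.
    if (!preg_match('/[^\s<>()",;]+@[^\s<>()",;]+/', $entry, $m)) {
        return null;
    }
    $email = $m[0];

    // What remains, minus brackets, quotes and parentheses, is the name.
    $name = str_replace($email, '', $entry);
    $name = trim(preg_replace('/[<>()"]/', '', $name));

    // Turn "Last, First" into "First Last".
    if (preg_match('/^([^,]+),\s*(.+)$/', $name, $parts)) {
        $name = trim($parts[2]) . ' ' . trim($parts[1]);
    }
    $name = preg_replace('/\s+/', ' ', $name);

    return $name === '' ? $email : "$name <$email>";
}

echo normalize_entry('"Soap, Joe" <joe@example.com>'), "\n";
echo normalize_entry('(Joe Soap) joe@example.com'), "\n";
// Both lines print: Joe Soap <joe@example.com>
```

Like Tim's version, this does not handle an @ inside the local part, and it
leaves splitting the comma-separated list itself to the caller.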

Cheers,

--
Cheers  --  Tim


Re: [PHP] Strip emails from a document

2013-01-26 Thread Tedd Sperling

On Jan 26, 2013, at 12:48 PM, shiplu shiplu@gmail.com wrote:

 What is your input?
 


Check my first email in this thread.

Cheers,

tedd

