Re: [PHP] Regex Help for URL's [ANSWER]

2006-05-17 Thread Edward Vermillion


On May 16, 2006, at 7:53 PM, Chrome wrote:



Oh, that breaks accessibility standards! Complement the 'onclick's with
onkeydown at least :)

But still you get a solid onclick=... scenario

If these are visible in the source then they are fairly easy to pick out

Though you may need more than 1 regex ;)

My complaint here is, don't break accessibility :)



And don't forget the folks who have javascript turned off or are using
text-based browsers too.


Ed

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Robert Samuel White
In case any one is looking for a solution to a similar problem as me, here
is the answer.  I used the code from my original post as my guiding light,
and with some experimentation, I figured it out.

To get any URL, regardless of where it is located, use this:

preg_match_all("#\'http://(.*)\'#U", $content, $matches);

This matches anything similar to:

'http://www.domain.com/dir/dir/file.txt?query=blah'

This is useful if, for example, you have a tag like this one:

<A HREF="javascript:void(0);" ONCLICK="javascript:window.open =
'http://www.domain.com/dir/dir/file.txt?query=blah';">

Now, for URLs that are in double quotes rather than single quotes, just use:

preg_match_all("#\"http://(.*)\"#U", $content, $matches);


This is really only the first step.

In order to be useful, you need a way to process these urls according to
your own specific needs:

preg_match_all("#\'http://(.*)\'#U", $content, $matches);

$content = preg_replace("#\'http://(.*)\'#U", '###URL###', $content);

This will modify the $content variable to change all URLs to ###URL###

You can then go through them one at a time to process them:

for ($count = 0; $count < count($matches[1]); $count++)
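
One way the per-URL step could look, as a minimal sketch rather than the code from the original post: collect the single-quoted URLs, swap them for the ###URL### placeholder, then put a processed version back one at a time. The names `process_url()` and `process_quoted_urls()`, the `seen=1` parameter, and the a.example / b.example hosts are all made up for illustration.

```php
<?php
const PLACEHOLDER = '###URL###';

// Hypothetical per-URL step: here it only appends a marker parameter.
function process_url(string $url): string
{
    return $url . (strpos($url, '?') === false ? '?' : '&') . 'seen=1';
}

function process_quoted_urls(string $content): string
{
    $pattern = "#'http://(.*)'#U";

    // Step 1: collect the URLs, then swap each one for the placeholder.
    preg_match_all($pattern, $content, $matches);
    $content = preg_replace($pattern, PLACEHOLDER, $content);

    // Step 2: restore the placeholders one at a time, in document order.
    for ($count = 0; $count < count($matches[1]); $count++) {
        // The capture group excludes the scheme, so add it back.
        $url = 'http://' . $matches[1][$count];
        $pos = strpos($content, PLACEHOLDER);
        $content = substr_replace(
            $content,
            "'" . process_url($url) . "'",
            $pos,
            strlen(PLACEHOLDER)
        );
    }
    return $content;
}

echo process_quoted_urls(
    "<a href='http://a.example/x'>A</a> <a href='http://b.example/y?z=1'>B</a>"
), "\n";
```

substr_replace() is used for the second step instead of preg_replace() so that a `$` or `\` inside a URL is not read as a backreference. The sketch assumes the page never contains the literal text ###URL### on its own.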




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Robert Cummings
On Tue, 2006-05-16 at 18:49, Robert Samuel White wrote:
 In case any one is looking for a solution to a similar problem as me, here
 is the answer.  I used the code from my original post as my guiding light,
 and with some experimentation, I figured it out.
 
 To get any URL, regardless of where it is located, use this:
 
 preg_match_all("#\'http://(.*)\'#U", $content, $matches);
 
 This matches anything similar to:
 
 'http://www.domain.com/dir/dir/file.txt?query=blah'
 
 This is useful if, for example, you have a tag like this one:
 
 <A HREF="javascript:void(0);" ONCLICK="javascript:window.open =
 'http://www.domain.com/dir/dir/file.txt?query=blah';">
 
 Now, for URLs that are in double quotes rather than single quotes, just use:
 
 preg_match_all("#\"http://(.*)\"#U", $content, $matches);

I'd roll those two into one expression:

preg_match_all("#(\"|')http://(.*)(\"|')#U", $content, $matches);
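
A possible refinement, as a minimal sketch: with two independent quote groups, an opening ' can pair with a closing ". Capturing the opening quote and reusing it through a \1 backreference requires both ends to match. The function name `find_quoted_urls()` and the sample hosts are made up for illustration.

```php
<?php
// Return every http:// URL wrapped in a matching pair of quotes.
function find_quoted_urls(string $content): array
{
    // Group 1: the opening quote. Group 2: the URL.
    // \1 demands the same quote character again at the end.
    preg_match_all('#(["\'])(http://.*)\1#U', $content, $matches);

    // The extra quote group pushes the URLs into $matches[2].
    return $matches[2];
}

print_r(find_quoted_urls(
    "<a href=\"http://a.example/it's\">x</a> <tr onclick=\"location.href = 'http://b.example/'\">"
));
```

In the usage line the first URL contains an apostrophe; because the closing quote must equal the opening one, the match runs up to the closing double quote rather than stopping at the apostrophe.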

Cheers,
Rob.




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Richard Lynch
On Tue, May 16, 2006 6:21 pm, Robert Cummings wrote:
 On Tue, 2006-05-16 at 18:49, Robert Samuel White wrote:
 In case any one is looking for a solution to a similar problem as
 me, here
 preg_match_all("#(\"|')http://(.*)(\"|')#U", $content, $matches);

And it's missing the original requirement of matching https URLs, so
maybe make it be ...https?://...

Plus, http could be IN CAPS, so change the U to iU

And, actually, SOME old-school HTML pages will have neither ' nor "
around the URL, and are (or were) valid:
href=page2.html
was considered valid for HTML for a long long long time
So toss in (\"|')?
And then you may be finding URLs that are not actually linked but are
part of the visible content, so maybe you only want the ones that have
<a[^>]href=
in front of them.

If I can toss off 3 problems without even trying...

So I still think Google or searching the archives (as I suggested
off-list) will be the quickest route to a CORRECT answer, but here we
are again in this same thread we've been in every month or so for the
better part of a decade...

PS the (\"|') bit may move the URLs into $matches[2] instead of
$matches[1] or whatever.
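
Putting those suggestions together, as a rough sketch and not a vetted URL matcher: accept http or https, ignore case, allow a missing quote, and only look at URLs that follow href= inside an <a> tag. Because the closing quote is optional, the lazy (.*) is swapped for a character class that stops at whitespace, quotes, or >. The function name `find_link_hrefs()` and the sample hosts are assumptions for illustration.

```php
<?php
// Return absolute http/https URLs found in href attributes of <a> tags.
function find_link_hrefs(string $html): array
{
    // <a ... href= , then an optional quote (group 1), the URL (group 2),
    // and the same optional quote again (\1). The i flag ignores case.
    $pattern = '#<a\b[^>]*\shref\s*=\s*(["\']?)(https?://[^\s"\'>]+)\1#i';
    preg_match_all($pattern, $html, $matches);

    // The quote group comes first, so the URLs are in $matches[2].
    return $matches[2];
}

print_r(find_link_hrefs(
    '<A HREF="HTTPS://Example.com/a">x</A> http://plain.example/ '
    . '<a class=c href=http://example.com/b>y</a>'
));
```

The bare http://plain.example/ in the usage line is skipped because it is not inside an href. Relative links such as href=page2.html are skipped too, since the pattern still requires a scheme.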

-- 
Like Music?
http://l-i-e.com/artists.htm




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Robert Samuel White
All pages used by my content management system must be in a valid format.

Old-school style pages are never created so the solution I have come up with
is perfect for my needs.

Thank you.




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Chrome

 -Original Message-
 From: Robert Samuel White [mailto:[EMAIL PROTECTED]
 Sent: 17 May 2006 01:16
 To: php-general@lists.php.net
 Subject: RE: [PHP] Regex Help for URL's [ANSWER]
 
 All pages used by my content management system must be in a valid format.
 
 Old-school style pages are never created so the solution I have come up
 with is perfect for my needs.
 
 Thank you.

Doesn't that make it a proprietary solution? IMHO offering the regex may
create a false situation for people... So the answer may not be for everyone

Might be wrong :)

Dan




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Robert Samuel White
In my opinion, it is the most reasonable solution.  I have looked all over
the web for something else, but this works perfectly for me.  It's
impossible to tell where a URL starts and ends if you don't have it in
double or single quotes.  If someone really needs to find all the URLs in a
page, then they'll code their pages to make use of this limitation.

-Original Message-
From: Chrome [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, May 16, 2006 8:24 PM
To: 'Robert Samuel White'; php-general@lists.php.net
Subject: RE: [PHP] Regex Help for URL's [ANSWER]


Doesn't that make it a proprietary solution? IMHO offering the regex may
create a false situation for people... So the answer may not be for everyone

Might be wrong :)

Dan




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Chrome
 -Original Message-
 From: Robert Samuel White [mailto:[EMAIL PROTECTED]
 Sent: 17 May 2006 01:28
 To: php-general@lists.php.net
 Subject: RE: [PHP] Regex Help for URL's [ANSWER]
 
 In my opinion, it is the most reasonable solution.  I have looked all over
 the web for something else, but this works perfectly for me.  It's
 impossible to tell where a URL starts and ends if you don't have it in
 double or single quotes.  If someone really needs to find all the URLs in
 a page, then they'll code their pages to make use of this limitation.
 

If we are talking clickable links, why not focus on the <a> construct
itself? Otherwise URLs are just part of the page's textual content... Very
difficult to parse that

Dissecting an <a> tag isn't brain-meltingly difficult with a regex if you
put your mind to it... With or without quotes, be they single, double or
non-existent
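
For comparison with the regex route, a minimal sketch that uses PHP's DOM extension instead: the HTML parser copes with single, double, or missing quotes, so the code never has to. The function name `link_hrefs()` and the sample markup are made up for illustration, and the sketch assumes ext-dom is available.

```php
<?php
// Return the href value of every <a> element, however it was quoted.
function link_hrefs(string $html): array
{
    // Collect parser warnings quietly; real-world HTML is rarely clean.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }
    return $hrefs;
}

print_r(link_hrefs(
    "<a href=page2.html>x</a><a href='http://a.example/'>y</a><a href=\"b.html\">z</a>"
));
```

This only sees real href attributes; URLs buried in onclick handlers or in CSS would still need a separate pass.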


If I've misunderstood please chastise me :)

HTH

Dan




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Robert Samuel White

 If we are talking clickable links, why not focus on the <a> construct
 itself? Otherwise URLs are just part of the page's textual content... Very
 difficult to parse that

 Dissecting an <a> tag isn't brain-meltingly difficult with a regex if
 you put your mind to it... With or without quotes, be they single, double
 or non-existent

If I've misunderstood please chastise me :)

HTH

Dan


Dan,

That's what I was doing.  I was parsing A:HREF, IMG:SRC, etc.

But when I implemented a new feature on my network, where you could click on
a row and have it take you to another domain, I needed a better solution.

Go to http://www.enetwizard.ws and it might make more sense.

All the links on the left have an ONCLICK="location.href = ''" attribute in
the TR tag.

This solution allowed me to make sure those links included the session
information, just like the A:HREF links do.

It also had the advantage of updating the links in my CSS.
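
As a rough illustration of that idea, and not the code running on the site: the sketch below appends a session parameter to any URL assigned to location.href in single quotes. The function name `add_session_to_onclick()`, the PHPSESSID parameter name, and the sample markup are assumptions.

```php
<?php
// Append name=id to each single-quoted URL assigned to location.href.
function add_session_to_onclick(string $html, string $name, string $id): string
{
    return preg_replace_callback(
        "#(location\.href\s*=\s*')(https?://[^']*)(')#i",
        function (array $m) use ($name, $id): string {
            // $m[2] is the URL; use & if it already has a query string.
            $separator = (strpos($m[2], '?') === false) ? '?' : '&';
            return $m[1] . $m[2] . $separator
                . rawurlencode($name) . '=' . rawurlencode($id) . $m[3];
        },
        $html
    );
}

echo add_session_to_onclick(
    "<tr onclick=\"location.href = 'http://www.example.com/page'\">",
    'PHPSESSID',
    'abc123'
), "\n";
```

The sketch ignores #fragment parts and empty URLs such as location.href = '', both of which a production version would need to handle.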




RE: [PHP] Regex Help for URL's [ANSWER]

2006-05-16 Thread Chrome
 -Original Message-
 From: Robert Samuel White [mailto:[EMAIL PROTECTED]
 Sent: 17 May 2006 01:42
 To: php-general@lists.php.net
 Subject: RE: [PHP] Regex Help for URL's [ANSWER]
 
 
 Dan,
 
 That's what I was doing.  I was parsing A:HREF, IMG:SRC, etc.
 
 But when I implemented a new feature on my network, where you could click
 on a row and have it take you to another domain, I needed a better solution.
 
 Go to http://www.enetwizard.ws and it might make more sense.
 
 All the links on the left have an ONCLICK="location.href = ''" attribute in
 the TR tag.
 
 This solution allowed me to make sure those links included the session
 information, just like the A:HREF links do.
 
 It also had the advantage of updating the links in my CSS.

Oh, that breaks accessibility standards! Complement the 'onclick's with
onkeydown at least :)

But still you get a solid onclick=... scenario

If these are visible in the source then they are fairly easy to pick out

Though you may need more than 1 regex ;)

My complaint here is, don't break accessibility :)

Dan


-- 
http://chrome.me.uk
