Re: [gentoo-user] Regex question

2008-06-04 Thread tecnic5
Hi,

I just wanted to comment on Iain's suggestion:

'^\?.*(http|https|ftp)://'

If you add that '^', you're anchoring the match to the beginning of the
string (as you may already know). The thing is, I cannot see a case where
your URL starts with '?', followed by some characters, and then the protocol
and the rest of the URL. I can understand that string turning up somewhere
inside the URL, but I don't see it sitting right at the very beginning.
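
A minimal sketch of the point about '^', using Python's re module as a stand-in for the PCRE engine (an assumption; the two should agree on a pattern this simple). The sample URL is made up for illustration:

```python
import re

# Hypothetical request: path first, then a query string carrying a URL.
url = "/index.php?page=http://example.com/x.txt"

anchored = re.compile(r"^\?.*(http|https|ftp)://")
unanchored = re.compile(r"\?.*(http|https|ftp)://")

# '^' pins the match to position 0, so the '?' would have to be
# the very first character of the string being tested.
anchored_hit = anchored.search(url) is not None
unanchored_hit = unanchored.search(url) is not None

# Only a string that really begins with '?' satisfies the anchored form.
query_only_hit = anchored.search("?page=http://example.com/x.txt") is not None

print(anchored_hit, unanchored_hit, query_only_hit)
```

Only the unanchored form catches a '?' that appears after the path.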

Perhaps I missed something, by the way. Can you guys enlighten me?

Regards,

Abraham Marín Pérez [EMAIL PROTECTED] 
Head of R&D 
SILVANO CONSULTORES 
Tfno.: 93.412.79.12 -- Fax: 93.410.92.90 
http://www.silvanoc.com/ 






Iain Buchanan [EMAIL PROTECTED]
04/06/2008 01:33
Please reply to gentoo-user
 
To:      gentoo-user@lists.gentoo.org
cc: 
Subject: Re: [gentoo-user] Regex question

Hi,

On Tue, 2008-06-03 at 17:39 +1000, Adam Carter wrote:
 I want to filter the strings '? something http://', '? something
 https://' or '? something ftp://' from URLs in Apache. I know I need to
 escape ? but I'm not sure about /

/ needs to be escaped in perl if your regex delimiters are / as well,
but here you use ' so I would _hope_ that you don't need \/\/ sort of
syntax.  YMMV.
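
A quick way to see that the slash is only special when it doubles as the regex delimiter is to compare the escaped and unescaped spellings. This sketch uses Python's re module, which has no delimiters at all; it says nothing definite about how Apache parses its own quoted argument, and the sample string is made up:

```python
import re

sample = "?u=ftp://example.org/file"

# In a pattern that is not delimited by '/', the slash is an ordinary
# character, so '\/' and '/' describe the same thing.
escaped = re.compile(r"\?.*ftp:\/\/")
plain = re.compile(r"\?.*ftp://")

same_span = escaped.search(sample).span() == plain.search(sample).span()
print(same_span)
```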

  and i've used '(something|otherthing|whatever)' to make the 'or's
 work. 

I usually do perl, but it should be the same...  Actually on reading
further, Apache uses Perl Compatible Regular Expressions provided by
the PCRE library.  Neat.  You should be viewing this with a fixed width
font too :)
 
 <LocationMatch '(\?.*http:\/\/|\?.*https:\/\/|\?.*ftp\/\/)'>
                                           typo here==^ (the ':' after ftp is missing)

how about taking the common bits out of the ()
'\?.*(http|https|ftp)://'
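
As a rough check that factoring out the common prefix and suffix does not change what is matched, the two forms can be compared over a handful of strings. This is a sketch in Python's re module rather than Apache itself; the sample strings are invented, and the long form below has the missing ':' after ftp restored:

```python
import re

# The original alternation (with the ':' after ftp restored)
# and the factored form suggested in the message.
long_form = re.compile(r"(\?.*http://|\?.*https://|\?.*ftp://)")
factored = re.compile(r"\?.*(http|https|ftp)://")

# Made-up request strings for illustration.
samples = [
    "/a.php?x=http://example.com/",
    "/a.php?x=https://example.com/",
    "/a.php?x=ftp://example.com/",
    "/a.php?x=gopher://example.com/",
    "/a.php?x=1",
    "/http://no-question-mark",
]

long_hits = [bool(long_form.search(s)) for s in samples]
factored_hits = [bool(factored.search(s)) for s in samples]
print(long_hits == factored_hits, factored_hits)
```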

and it's good practice to use ^ if that's what you're expecting:
'^\?.*(http|https|ftp)://'

 Order allow,deny
 Deny from all
 </LocationMatch>
 
 Is that regex correct? Will egrep use the exact same regex syntax (so
 I can use it to check)?

I think they're essentially the same, with the exception of some classes
(like [:punct:] or [\d]).  egrep may need some extra escaping so as not
to confuse your shell.  The best way to test is to use the real program,
so see if you can get info out of your logs to help.
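
Short of testing in the real program, one offline check is to run the pattern over request lines pulled from a log. The sketch below is only that: the log lines are hypothetical, Python's re module stands in for PCRE, and it exercises the regex alone, not which part of the request Apache actually hands to LocationMatch:

```python
import re

pattern = re.compile(r"\?.*(http|https|ftp)://")

# Hypothetical request lines in the style of an access log; real logs
# will differ depending on the configured LogFormat.
log_lines = [
    "GET /index.php?page=http://example.com/shell.txt HTTP/1.1",
    "GET /index.php?page=about HTTP/1.1",
    "GET /download?src=ftp://example.org/f.tar HTTP/1.1",
    "GET /static/logo.png HTTP/1.1",
]

# Keep only the lines the pattern would flag.
flagged = [line for line in log_lines if pattern.search(line)]
for line in flagged:
    print(line)
```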

hth,
-- 
Iain Buchanan iaindb at netspace dot net dot au

  I tripped over a hole that was sticking up out of the ground.

-- 
gentoo-user@lists.gentoo.org mailing list






Re: [gentoo-user] Regex question

2008-06-04 Thread Iain Buchanan
On Wed, 2008-06-04 at 08:57 +0200, [EMAIL PROTECTED] wrote:
 Hi,
 
 I just wanted to comment on Iain's suggestion:
 
 '^\?.*(http|https|ftp)://'
 
 If you add that '^', you're anchoring the match to the beginning of the
 string (as you may already know). The thing is, I cannot see a case where
 your URL starts with '?', followed by some characters, and then the protocol
 and the rest of the URL. I can understand that string turning up somewhere
 inside the URL, but I don't see it sitting right at the very beginning.

I was indeed assuming ? was at the beginning when I added the ^...

 Perhaps I missed something, by the way. Can you guys enlighten me?
-- 
Iain Buchanan iaindb at netspace dot net dot au

The difference between a good haircut and a bad one is seven days.




RE: [gentoo-user] Regex question

2008-06-04 Thread Adam Carter
  I just wanted to comment on Iain's suggestion:
 
  '^\?.*(http|https|ftp)://'
 
  If you add that '^', you're anchoring the match to the beginning of the
  string (as you may already know). The thing is, I cannot see a case where
  your URL starts with '?', followed by some characters, and then the protocol
  and the rest of the URL. I can understand that string turning up somewhere
  inside the URL, but I don't see it sitting right at the very beginning.

Thanks guys. I know what ^ does and will omit it, as the ? is not at the
beginning of the string.

I'll try '\?.*(http|https|ftp)://'



Re: [gentoo-user] Regex question

2008-06-04 Thread Bruce Munro

Adam Carter wrote:



Thanks guys. I know what ^ does and will omit it, as the ? is not at the
beginning of the string.

I'll try '\?.*(http|https|ftp)://'


You can squeeze that up a bit more...

\?.*(https?|ftp)://

'https?' means 'http' followed by an optional (0 or 1) 's'.
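
To illustrate that the shorter form matches the same strings, here is a small comparison in Python's re module (a stand-in for PCRE, which is an assumption; the sample strings are made up):

```python
import re

verbose = re.compile(r"\?.*(http|https|ftp)://")
compact = re.compile(r"\?.*(https?|ftp)://")

# Invented request strings covering each scheme plus two non-matches.
samples = [
    "/p?u=http://example.com/",
    "/p?u=https://example.com/",
    "/p?u=ftp://example.com/",
    "/p?u=httpx://example.com/",
    "/p?u=plain",
]

# The two patterns should agree on every sample.
agree = all(
    bool(verbose.search(s)) == bool(compact.search(s)) for s in samples
)

# The optional 's' is picked up when present.
scheme = compact.search("/p?u=https://example.com/").group(1)
print(agree, scheme)
```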

Cheers,
-Bruce




Re: [gentoo-user] Regex question

2008-06-03 Thread Iain Buchanan
Hi,

On Tue, 2008-06-03 at 17:39 +1000, Adam Carter wrote:
 I want to filter the strings '? something http://', '? something
 https://' or '? something ftp://' from URLs in Apache. I know I need to
 escape ? but I'm not sure about /

/ needs to be escaped in perl if your regex delimiters are / as well,
but here you use ' so I would _hope_ that you don't need \/\/ sort of
syntax.  YMMV.

  and i've used '(something|otherthing|whatever)' to make the 'or's
 work. 

I usually do perl, but it should be the same...  Actually on reading
further, Apache uses Perl Compatible Regular Expressions provided by
the PCRE library.  Neat.  You should be viewing this with a fixed width
font too :)
 
 <LocationMatch '(\?.*http:\/\/|\?.*https:\/\/|\?.*ftp\/\/)'>
                                           typo here==^ (the ':' after ftp is missing)

how about taking the common bits out of the ()
'\?.*(http|https|ftp)://'

and it's good practice to use ^ if that's what you're expecting:
'^\?.*(http|https|ftp)://'

 Order allow,deny
 Deny from all
 </LocationMatch>
  
 Is that regex correct? Will egrep use the exact same regex syntax (so
 I can use it to check)?

I think they're essentially the same, with the exception of some classes
(like [:punct:] or [\d]).  egrep may need some extra escaping so as not
to confuse your shell.  The best way to test is to use the real program,
so see if you can get info out of your logs to help.

hth,
-- 
Iain Buchanan iaindb at netspace dot net dot au

  I tripped over a hole that was sticking up out of the ground.
