Re: [PHP] preg_match question...
bruce wrote:
> hi...
>
> trying to figure out the best approach to using preg_match to extract the
> number from the following type of line...
>
> " 131646 sometext follows.."
>
> basically, i want to extract the number, without the text, but i have to be
> able to match on the "text"
>
> i've been playing with different preg_match regexs.. but i'm missing
> something obvious!
>
> thoughts/comments..

Why don't you show us some of the attempts that you have tried that didn't work?

When you say that you need to match on some of the text, give us an example of the text that you are trying to match. And give us examples of the input text you are trying to match it against.

--
Jim Lucas

"Some men are born to greatness, some achieve greatness, and some have greatness thrust upon them."
Twelfth Night, Act II, Scene V, by William Shakespeare

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] preg_match question...
preg_match('/^\s*([0-9]+) (.+)/', $sString, $aMatches);

$aMatches will contain 1 => the number; 2 => the text.

The leading \s* lets the pattern match lines that start with whitespace, as the sample line does. The expression only matches if there is at least one character after the space. That character is not necessarily text; it might be another number or a special character.

2009/2/6 bruce
> trying to figure out the best approach to using preg_match to extract the
> number from the following type of line...
>
> " 131646 sometext follows.."
>
> basically, i want to extract the number, without the text, but i have to be
> able to match on the "text"

--
Alpar Torok
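A minimal sketch of the extraction discussed above, assuming the "text" to match on is known in advance. The helper name extractNumber() and its second parameter are hypothetical and not taken from the thread.

```php
<?php
// Hypothetical helper: return the leading number only when the required
// text follows it on the same line; return null otherwise.
function extractNumber(string $line, string $requiredText): ?string
{
    // \s*    optional leading whitespace (the sample line starts with a space)
    // (\d+)  the number, captured as group 1
    // \s+    the separator between number and text
    // preg_quote() makes the required text a literal, not a pattern
    $pattern = '/^\s*(\d+)\s+' . preg_quote($requiredText, '/') . '/';

    if (preg_match($pattern, $line, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

var_dump(extractNumber(' 131646 sometext follows..', 'sometext')); // string(6) "131646"
var_dump(extractNumber(' 131646 othertext', 'sometext'));          // NULL
```

Capturing only the number keeps the text out of the result while still requiring it for a match.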
[PHP] preg_match question...
hi...

trying to figure out the best approach to using preg_match to extract the number from the following type of line...

" 131646 sometext follows.."

basically, i want to extract the number, without the text, but i have to be able to match on the "text"

i've been playing with different preg_match regexs.. but i'm missing something obvious!

thoughts/comments..
Re: [PHP] preg_match question
Nicklas Bondesson wrote:
> How do I find an exact match of a string with preg_match?
>
> Example:
>
> String1: www.test.com/
> String2: www.test.com/somepage.php?param1=true
>
> How do you write the regexp to only return String1 and not String2 when you
> match against "www.test.com" ??

I've heard == works well. Or strcmp(), etc...

--
---John Holmes...
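A short sketch of the plain comparison suggested above. The function name isExactMatch() is made up for illustration. It uses === rather than ==, because == compares two numeric strings as numbers (for example "1e3" == "1000" is true), which is usually not what an exact string match means.

```php
<?php
// Hypothetical helper: exact, byte-for-byte string comparison without a regex.
function isExactMatch(string $candidate, string $wanted): bool
{
    // === does no type juggling; strcmp() returns 0 for identical strings.
    return $candidate === $wanted && strcmp($candidate, $wanted) === 0;
}

$wanted = 'www.test.com/';
foreach (['www.test.com/', 'www.test.com/somepage.php?param1=true'] as $candidate) {
    echo $candidate, ' => ', isExactMatch($candidate, $wanted) ? 'exact match' : 'no match', "\n";
}
```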
Re: [PHP] preg_match question
* Thus wrote Nicklas Bondesson:
> Actually I think I got it working now (without escaping the ".").

Because an unescaped . matches any character, wwwatestbcom will get matched as well.

Curt
--
First, let me assure you that this is not one of those shady pyramid schemes you've been hearing about. No, sir. Our model is the trapezoid!
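The point about the unescaped dot can be checked directly. This is a small sketch; the variable names are illustrative, and preg_quote() is used here as one way to escape the dots instead of typing the backslashes by hand.

```php
<?php
// Unescaped dots: each . matches any single character.
$loose = '/^www.test.com$/';

// preg_quote() escapes the dots, so they only match a literal ".".
$strict = '/^' . preg_quote('www.test.com', '/') . '$/';

var_dump(preg_match($loose, 'wwwatestbcom'));  // int(1)
var_dump(preg_match($strict, 'wwwatestbcom')); // int(0)
var_dump(preg_match($strict, 'www.test.com')); // int(1)
```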
RE: [PHP] preg_match question
Actually I think I got it working now (without escaping the "."). If I run into trouble I will post again.

Thanks!
Nicke

From: John Legg
Sent: 18 August 2004 18:15
To: "Nicklas Bondesson"
Subject: RE: [PHP] preg_match question

One thing you should do is escape the "." portion of the URL you are comparing. As "." matches any character except newline. So it should be:

$pattern = "/^www\.test\.com$/";
RE: [PHP] preg_match question
Hi Nicke

Difficult to say without knowing exactly what you are trying to achieve. If you are just comparing two strings for exactness, it would probably be better to do a direct string comparison (without regex). Regexes are generally more expensive operations.

One thing you should do is escape the "." portions of the URL you are comparing, because "." matches any character except newline. So it should be:

$pattern = "/^www\.test\.com$/";

If you tell me what you are trying to achieve, I might be able to advise further.

Rgds
John

---
Is it clever to use word boundary here? "/b"?

Nicke
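One detail worth knowing about the anchored pattern above: in PCRE, $ also matches just before a trailing newline unless the D modifier is set. A minimal sketch; the function name matchesHost() and its $strict flag are hypothetical.

```php
<?php
// Hypothetical helper: anchored match against the literal host name.
// With $strict = true the D modifier makes $ match only at the very end.
function matchesHost(string $subject, bool $strict): bool
{
    $pattern = $strict ? '/^www\.test\.com$/D' : '/^www\.test\.com$/';
    return preg_match($pattern, $subject) === 1;
}

var_dump(matchesHost("www.test.com\n", false)); // bool(true): $ matched before the newline
var_dump(matchesHost("www.test.com\n", true));  // bool(false)
var_dump(matchesHost('www.test.com', true));    // bool(true)
```

Whether the trailing newline matters depends on where the input comes from; a direct === comparison avoids the question entirely.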
RE: [PHP] preg_match question
Would it be a good idea to use a word boundary ("\b") here?

Nicke

From: John Legg
Sent: 18 August 2004 18:02
To: "Nicklas Bondesson"
Subject: Re: [PHP] preg_match question

Try using a pattern set to the following:

$pattern = "/^www.test.com$/";
Re: [PHP] preg_match question
On Wed, 18 Aug 2004 17:16:24 +0200, Nicklas Bondesson wrote:
> Hello,
>
> How do I find an exact match of a string with preg_match?
>
> Example:
>
> String1: www.test.com/
> String2: www.test.com/somepage.php?param1=true
>
> How do you write the regexp to only return String1 and not String2 when you
> match against "www.test.com" ??

$pattern = '/^www\.test\.com\/$/';
preg_match($pattern, YOUR STRING)
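The pattern above can be tried against both example strings. The sketch below also shows the same pattern with # as the delimiter, which is only a readability choice (the slash no longer needs escaping); the variable names are illustrative.

```php
<?php
$string1 = 'www.test.com/';
$string2 = 'www.test.com/somepage.php?param1=true';

// The pattern from the message: / as delimiter, so the literal slash is escaped.
$withSlashDelimiter = '/^www\.test\.com\/$/';

// Same pattern with # as delimiter; the slash can be written as-is.
$withHashDelimiter = '#^www\.test\.com/$#';

var_dump(preg_match($withSlashDelimiter, $string1)); // int(1)
var_dump(preg_match($withSlashDelimiter, $string2)); // int(0)
var_dump(preg_match($withHashDelimiter, $string1));  // int(1)
var_dump(preg_match($withHashDelimiter, $string2));  // int(0)
```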
Re: [PHP] preg_match question
Try using a pattern set to the following:

$pattern = "/^www.test.com$/";

and refer to: http://uk.php.net/manual/en/reference.pcre.pattern.syntax.php

Rgds
John

---
> How do I find an exact match of a string with preg_match?
>
> Example:
>
> String1: www.test.com/
> String2: www.test.com/somepage.php?param1=true
[PHP] preg_match question
Hello,

How do I find an exact match of a string with preg_match?

Example:

String1: www.test.com/
String2: www.test.com/somepage.php?param1=true

How do you write the regexp to only return String1 and not String2 when you match against "www.test.com" ??

Thanks in advance.

Nicke
Re: [PHP] preg_match question
Thank you both Jason and Curt...

Looks like I was pretty close... In fact, I found a fault in my logic, too, that meant only some but not all instances of $element were being validated, even after I got the preg function right!

But it's all working great now.

- john

> > As John has previously suggested the best way to go about this is to remove
> > anything which is not in the set of acceptable characters, which for the
> > above is:
> >
> > $new = preg_replace('/[^A-Za-z-\()\s]/', '', $old);
> >
> > Then check that $new is not empty().
>
> or if $new != $old, bad characters exist.
Re: [PHP] preg_match question
* Thus wrote Jason Wong:
> As John has previously suggested the best way to go about this is to remove
> anything which is not in the set of acceptable characters, which for the
> above is:
>
> $new = preg_replace('/[^A-Za-z-\()\s]/', '', $old);
>
> Then check that $new is not empty().

or if $new != $old, bad characters exist.

Curt
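The strip-and-compare idea can be sketched as follows. The function name hasOnlyAllowedChars() is hypothetical, and the character class is written with an escaped hyphen for readability; it is intended to cover the same set as the pattern quoted above (letters, hyphen, parentheses, whitespace).

```php
<?php
// Hypothetical helper: true when $old contains only allowed characters.
function hasOnlyAllowedChars(string $old): bool
{
    // Remove everything that is NOT a letter, hyphen, parenthesis or whitespace.
    $new = preg_replace('/[^A-Za-z\-()\s]/', '', $old);

    // If nothing was removed, no bad characters were present.
    return $new === $old;
}

var_dump(hasOnlyAllowedChars('Hello (world)')); // bool(true)
var_dump(hasOnlyAllowedChars('bad_char!'));     // bool(false)
```

Note that an empty string passes this check, since there is nothing to strip; an empty() test is still needed if empty input should be rejected.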
Re: [PHP] preg_match question
On Wednesday 21 July 2004 10:04, John Van Pelt wrote:
> function isWord($element) {
>     return !preg_match ("/[^A-Za-z\-\(\)\s]/", $element);
> }
>
> I want to test the string $element and make sure that it contains nothing
> but:
> - characters A-Z
> - characters a-z
> - hyphen
> - parentheses (open and close)
> - space character or its equivalent

As John has previously suggested the best way to go about this is to remove anything which is not in the set of acceptable characters, which for the above is:

$new = preg_replace('/[^A-Za-z-\()\s]/', '', $old);

Then check that $new is not empty().

--
Jason Wong -> Gremlins Associates -> www.gremlins.biz
[PHP] preg_match question
I am a semi-newbie at php and a complete newbie to regex.

What am I doing wrong here?

function isWord($element) {
    return !preg_match ("/[^A-Za-z\-\(\)\s]/", $element);
}

I want to test the string $element and make sure that it contains nothing but:
- characters A-Z
- characters a-z
- hyphen
- parentheses (open and close)
- space character or its equivalent

Another issue that may be related to the problem (in case my function above is correct, which I doubt), is that $element is being returned as a $_POST value from a form field... and it CAN be empty. Will my function choke on that? Do I need to test separately for ''?

Thanks in advance,
-john
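On the empty-string question: the function as posted does not choke on '', it simply returns true, because an empty string contains no disallowed character. A sketch of an anchored whitelist variant that rejects empty input follows; isWordOriginal() repeats the posted function under a different name, and isWordStrict() is a hypothetical alternative, not something proposed in the thread.

```php
<?php
// The function as posted, renamed for comparison.
function isWordOriginal($element)
{
    return !preg_match("/[^A-Za-z\-\(\)\s]/", $element);
}

// Hypothetical variant: the whole string must consist of one or more
// allowed characters, so the empty string fails.
function isWordStrict(string $element): bool
{
    return preg_match('/\A[A-Za-z\-()\s]+\z/', $element) === 1;
}

var_dump(isWordOriginal(''));               // bool(true)
var_dump(isWordStrict(''));                 // bool(false)
var_dump(isWordStrict('Smith-Jones (Jr)')); // bool(true)
var_dump(isWordStrict('abc123'));           // bool(false)
```

If an empty form field should be accepted, the original behaviour is already what is wanted and no separate test for '' is needed.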
Re: [PHP] Preg_match question
Great quote Jeroen.

>>Always code as if the guy who ends up maintaining your code will be a violent
>>psychopath who knows where you live.
>> -- Martin Golding
RE: [PHP] Preg_match question
> Is there a preg to find a "?" in a string since a "?" is used for
> calculations as I see it.

You just need to escape it with a backslash, e.g. \?

Good luck,

Marco
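A small sketch of the escaping described above. The helper names hasQuestionMark() and queryString() are made up for illustration; the strpos() call is included only to show that a plain substring test does not need a regex at all.

```php
<?php
// Hypothetical helper: does the string contain a literal "?" ?
function hasQuestionMark(string $s): bool
{
    // \? matches a literal question mark; unescaped, ? is a quantifier.
    return preg_match('/\?/', $s) === 1;
}

// Hypothetical helper: return everything after the first "?", or null.
function queryString(string $url): ?string
{
    return preg_match('/\?(.*)$/', $url, $m) === 1 ? $m[1] : null;
}

var_dump(hasQuestionMark('somepage.php?param1=true'));        // bool(true)
var_dump(strpos('somepage.php?param1=true', '?') !== false);  // bool(true)
var_dump(queryString('somepage.php?param1=true'));            // string(11) "param1=true"
var_dump(queryString('somepage.php'));                        // NULL
```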
RE: [PHP] Preg_match question
Thanks Marco,

I worked it out with trial and error, but I will refine what I have by reading your examples.

Is there a preg to find a "?" in a string, since a "?" is used for calculations as I see it?

Thank you and others for all the help with this.

Dave Carrera

-Original Message-
From: Marco Tabini
Sent: 29 May 2004 13:32
To: 'Dave Carrera'
Subject: RE: [PHP] Preg_match question

I guess the simplest would be to use preg_match_all on '//i'

Assuming that all your links are in that format, this will extract all the contents of the href portion of the links in your string.
RE: [PHP] Preg_match question
I guess the simplest would be to use preg_match_all on '//i'

Assuming that all your links are in that format, this will extract all the contents of the href portion of the links in your string. There was a regex series in our magazine that also covered more complex examples of this kind.

Cheers,

Marco

> -Original Message-
> From: Dave Carrera
> Sent: May 29, 2004 4:30 AM
> Subject: [PHP] Preg_match question
>
> I have managed to list files on my website using a combination of
> preg_match_all and str_replace in an array.
>
> The bit I want to play with is ./somepage.php.
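The pattern in the message above did not survive archiving, so the following is only a sketch of what a preg_match_all call for href values could look like. It assumes links of the form <a href="...">, with double-quoted attribute values; the function name extractHrefs() and the sample HTML are illustrative.

```php
<?php
// Hypothetical helper: collect the href value of every <a> tag.
// Assumes double-quoted attribute values; not a general HTML parser.
function extractHrefs(string $html): array
{
    // [^"]* stops at the closing quote, so each href is captured on its own.
    preg_match_all('/<a\s[^>]*href="([^"]*)"/i', $html, $matches);
    return $matches[1];
}

$html = '<a href="./somepage.php">Of to some page</a> <a href="./other.php">Other</a>';
print_r(extractHrefs($html));
```

For anything beyond simple, known markup, an HTML parser such as PHP's DOMDocument is usually the more robust choice.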
Re: [PHP] Preg_match question
On Sat, 29 May 2004, Dave Carrera wrote:
> Link returned = Of to some page
>
> The bit I want to play with is ./somepage.php.

$str = 'Of to some page';
if( preg_match("//Ui", $str, $matches) )
    echo $matches[1];
else
    echo "string didn't match";

--
Jeroen

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
 -- Martin Golding
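The HTML in the snippet above was lost in archiving, so this is a sketch under assumptions rather than the original code: it assumes the link looked like <a href="./somepage.php">Of to some page</a> and shows what the U (ungreedy) modifier changes. The helper name hrefOf() is hypothetical.

```php
<?php
// Hypothetical helper: capture the href of the first link, greedy or ungreedy.
function hrefOf(string $str, bool $ungreedy): ?string
{
    // With U, (.*) matches as little as possible and stops at the first ">.
    $pattern = $ungreedy ? '/<a href="(.*)">/Ui' : '/<a href="(.*)">/i';
    return preg_match($pattern, $str, $matches) === 1 ? $matches[1] : null;
}

$str = '<a href="./somepage.php">Of to some page</a> and <a href="./b.php">b</a>';

var_dump(hrefOf($str, true));  // string(14) "./somepage.php"
var_dump(hrefOf($str, false)); // runs on to the last "> in the string
```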
[PHP] Preg_match question
Hi List,

I have managed to list files on my website using a combination of preg_match_all and str_replace in an array.

What I would like to know is how do I lift out of an <a> tag the contents of href="".

Example:

Link returned = <a href="./somepage.php">Of to some page</a>

The bit I want to play with is ./somepage.php.

I think it might be something to do with another preg of some kind but I can not work it out, so I ask for help from the list.

Thank you in advance for any help or guidance you may give.

Dave C
Re: [PHP] preg_match question: locating unmatched HTML tags
Hi,

Saturday, February 22, 2003, 12:35:15 PM, you wrote:
AC> I'm trying to build a regexp that would parse user-supplied text and
AC> identify cases where HTML tags are left open or are not properly
AC> matched-e.g., tags without closing tags.

AC> $suspect_tags = "b|i|u|strong|em|font|a|ol|ul|blockquote ";
AC> $pattern = '/<(' . $suspect_tags . '[^>]*>)(.*)(?!<\/\1)/Ui';
AC> if (preg_match($pattern,$_POST['entry'],$matches)) {
AC>    //do something to report the unclosed tags
AC> } else {
AC>    echo 'Input looks fine. No unmatched tags.';
AC> }

Here is a function that will fix up simple tags like <b>; it will add in the missing </b> tag at the next start/end tag or at the end of the document.

function fix_mismatch($str){
    $match = array();
    $split = preg_split('!\<(.*?)\>!s', $str);  //text between the tags
    $c = count($split);
    $r = ($c == 1)? $str : '';  //no tags at all: return the input unchanged
    if($c > 1){
        $fix = '';
        preg_match_all('!\<(.*?)\>!s', $str, $match);
        for($x = 0; $x < $c; $x++){
            $out = $split[$x].$fix;  //add in text + any fixup end tag
            $fix = '';
            if(isset($match[0][$x])){
                $list = explode(' ', $match[1][$x]);  //split up a compound tag (one with attributes)
                $t = trim(strtolower($list[0]));  //get the tag name
                switch ($t){
                    //add tags to check/fix here
                    case 'b':
                    case 'div':
                    case 'i':
                    case 'textarea':
                        $st = '/'.$t;  //make an end tag to search for
                        $rest = array_slice($match[1], $x+1);  //get the remaining tags
                        $found = false;
                        //foreach replaces the original each() loop, which no longer exists in PHP 8
                        foreach($rest as $v){
                            $et = explode(' ', $v);
                            if($st == trim(strtolower($et[0]))){  //have we found it ?
                                $found = true;
                                break;
                            }
                        }
                        if(!$found){
                            $fix = '<'.$st.'>';  //create an html end tag
                        }
                        break;
                }
                $out .= $match[0][$x];  //add in tag
            }
            $r .= $out;  //build return string
        }
    }
    return $r;
}

//usage
$test1 = 'This is a bold word and another bold word end of test';
$test2 = 'frog';
echo fix_mismatch($test1);
echo '';
echo fix_mismatch($test2);

--
regards,
Tom
Re: [PHP] preg_match question: locating unmatched HTML tags
Hi,

Saturday, February 22, 2003, 12:35:15 PM, you wrote:
AC> I'm trying to build a regexp that would parse user-supplied text and
AC> identify cases where HTML tags are left open or are not properly
AC> matched-e.g., tags without closing tags. This is for a sort of
AC> message board type of application, and I'd like to allow users to use
AC> some HTML, but just would like to check to ensure that no stray tags are
AC> input that would screw up the rest of the page's display.

The simplest is just to add the closing tags directly to the end of their message. It may not be technically correct, but it won't do any harm either :)

--
regards,
Tom
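The "append closers at the end" idea could be sketched like this. The function appendMissingClosers() and its default tag list are assumptions for illustration; it counts opening and closing tags per name and appends whatever is missing.

```php
<?php
// Hypothetical helper: append a closing tag for every opener left unclosed.
function appendMissingClosers(string $html, array $tags = ['b', 'i', 'u']): string
{
    foreach ($tags as $tag) {
        $name = preg_quote($tag, '/');
        // \b keeps <b> from also counting <br> or <blockquote>.
        $opened = preg_match_all('/<' . $name . '\b[^>]*>/i', $html);
        $closed = preg_match_all('/<\/' . $name . '\s*>/i', $html);
        if ($opened > $closed) {
            $html .= str_repeat('</' . $tag . '>', $opened - $closed);
        }
    }
    return $html;
}

echo appendMissingClosers('a <b>bold word'), "\n"; // a <b>bold word</b>
echo appendMissingClosers('line<br>break'), "\n";  // line<br>break
```

As the message says, this is not technically correct: closers are appended in the order of the tag list, not in nesting order, so the result can be misnested even though nothing is left open.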
RE: [PHP] preg_match question: locating unmatched HTML tags
Good point, and I might end up doing just that if I can't find a solution. The problem is that I'm considering using for some forms a wysiwyg replacement (e.g., http://www.interactivetools.com/products/htmlarea/ or http://www.siteworkspro.com) that results in HTML output. And I wanted to check the output of that to make sure there aren't any extraneous tags.

Andy

> -Original Message-
> From: John W. Holmes
> Sent: Saturday, February 22, 2003 5:04 PM
> To: 'Andy Crain'
> Subject: RE: [PHP] preg_match question: locating unmatched HTML tags
>
> Honestly, if you've got a small group of people like you say then just
> teach them HTML so they don't make mistakes like this. Or provide a
> "preview" mode so they can double check their work.
RE: [PHP] preg_match question: locating unmatched HTML tags
Well, like someone else said, it's hard to look for and match stuff that isn't there. In addition to the security benefit, it's just easier to code something that looks for [b](.*)[/b] and replaces those tags with <b> and </b> (or <strong> and </strong> if you want to be technically correct).

Honestly, if you've got a small group of people like you say then just teach them HTML so they don't make mistakes like this. Or provide a "preview" mode so they can double check their work.

---John W. Holmes...

> -Original Message-
> From: Andy Crain
> Sent: Saturday, February 22, 2003 4:54 PM
> Subject: RE: [PHP] preg_match question: locating unmatched HTML tags
>
> John,
> Thanks. I'm considering that, but the application I'm working on is for
> a small intranet that will be for only a small group of supervised
> users, so vulnerability isn't such a large concern.
> Andy
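A minimal sketch of the BB-code approach described above, assuming only [b] and [i] are supported. The function name renderBbCode() is hypothetical. Everything is escaped with htmlentities() first, and only complete pairs are turned back into HTML.

```php
<?php
// Hypothetical helper: escape all input, then convert complete BB-code pairs.
function renderBbCode(string $text): string
{
    // Escape first so any raw HTML typed by the user is displayed, not interpreted.
    $safe = htmlentities($text, ENT_QUOTES, 'UTF-8');

    // Only a full [b]...[/b] or [i]...[/i] pair is replaced;
    // an unmatched [b] stays as plain text and cannot leave a tag open.
    $safe = preg_replace('/\[b\](.*?)\[\/b\]/is', '<strong>$1</strong>', $safe);
    $safe = preg_replace('/\[i\](.*?)\[\/i\]/is', '<em>$1</em>', $safe);

    return $safe;
}

echo renderBbCode('[b]hi[/b] <script>'), "\n"; // <strong>hi</strong> &lt;script&gt;
echo renderBbCode('[b]never closed'), "\n";    // [b]never closed
```

Because the replacement only ever emits matched pairs, the unclosed-tag problem does not arise for the supported codes.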
RE: [PHP] preg_match question: locating unmatched HTML tags
Ernest,

Thanks very much. This is pretty close to what I'm looking for. The only problem is that it doesn't catch nested tags. For example, "some text some text some text" makes it through without error since, I think, preg_match resumes matching at the after spotting and then checking its first match, at .

Andy

> -Original Message-
> From: Ernest E Vogelsinger
> Sent: Saturday, February 22, 2003 5:48 AM
> To: Andy Crain
> Subject: Re: [PHP] preg_match question: locating unmatched HTML tags
>
> I don't believe you can create a regular expression to look for something
> that's NOT there.
>
> I'd take this approach (tested with drawbacks, see below):
>
> function check_tags($text) {
> [snip]
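One way to handle the nesting problem raised above is to walk the tags in order and keep a stack of open ones. This is a sketch, not something proposed in the thread; the function name firstUnbalancedTag() and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: return the name of the first misnested or unclosed
// tag from $tags, or null when every checked tag is properly balanced.
function firstUnbalancedTag(string $html, array $tags): ?string
{
    $names = implode('|', array_map(function ($t) {
        return preg_quote($t, '/');
    }, $tags));

    // Group 1 is "/" for a closing tag, group 2 is the tag name.
    preg_match_all('/<\s*(\/?)\s*(' . $names . ')\b[^>]*>/i', $html, $found, PREG_SET_ORDER);

    $stack = [];
    foreach ($found as $m) {
        $name = strtolower($m[2]);
        if ($m[1] === '') {
            $stack[] = $name;                    // opening tag: remember it
        } elseif (array_pop($stack) !== $name) {
            return $name;                        // closer does not match the innermost opener
        }
    }

    // Anything still on the stack was never closed.
    return $stack ? end($stack) : null;
}

var_dump(firstUnbalancedTag('<b><i>x</b></i>', ['b', 'i'])); // string(1) "b"
var_dump(firstUnbalancedTag('<b><i>x</i></b>', ['b', 'i'])); // NULL
```

Like the other regex-based approaches in this thread, it targets honest typos and is not a defence against deliberately malformed markup.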
RE: [PHP] preg_match question: locating unmatched HTML tags
John,

Thanks. I'm considering that, but the application I'm working on is for a small intranet that will be for only a small group of supervised users, so vulnerability isn't such a large concern.

Andy

> -Original Message-
> From: John W. Holmes
> Sent: Saturday, February 22, 2003 1:06 AM
> To: 'Andy Crain'
> Subject: RE: [PHP] preg_match question: locating unmatched HTML tags
>
> Letting users enter HTML is a bad idea. Even if you only let them use
> <b> tags, they can still put ONCLICK and mouseover effects for the bold
> text to screw with your other users.
>
> Use a BB style code, such as [b] for bold, [i] for italics, etc. This
> way, you only match pairs and replace them with HTML and use
> htmlentities on anything else. This way an unmatched [b] tag won't be
> replaced with <b> and mess up your code.
>
> ---John W. Holmes...
Re: [PHP] preg_match question: locating unmatched HTML tags
At 03:35 22.02.2003, Andy Crain said:
[snip]
>My apologies in advance if this is too basic or there's a solution easily
>found out there, but after lots of searching, I'm still lost.
>
>I'm trying to build a regexp that would parse user-supplied text and
>identify cases where HTML tags are left open or are not properly
>matched, e.g., opening tags without matching closing tags. This is for a
>sort of message board type of application, and I'd like to allow users to
>use some HTML, but just would like to check to ensure that no stray tags
>are input that would screw up the rest of the page's display. I'm new to
>regular expressions, and the one below is as far as I've gotten. If
>anyone has any suggestions, they'd be very much appreciated.
>
>$suspect_tags = "b|i|u|strong|em|font|a|ol|ul|blockquote ";
>$pattern = '/<(' . $suspect_tags . '[^>]*>)(.*)(?!<\/\1)/Ui';
>if (preg_match($pattern,$_POST['entry'],$matches)) {
>    //do something to report the unclosed tags
>} else {
>    echo 'Input looks fine. No unmatched tags.';
>}
[snip]

Hi,

I don't believe you can create a regular expression to look for something that's NOT there. I'd take this approach (tested with drawbacks, see below):

function check_tags($text)
{
    $suspect_tags = "b|i|u|strong|em|font|a|ol|ul|blockquote";
    // \b keeps a short name such as "b" from matching inside "<blockquote" or "<br"
    $re_find = '/<\s*(' . $suspect_tags . ')\b.*?>(.*)/is';

    while (preg_match($re_find, $text, $matches)) {
        // a suspect tag was found, check if closed
        $suspect = $matches[1];
        $text = $matches[2];   // continue with the text after the opening tag
        $re_close = '/<\s*\/\s*' . $suspect . '\s*?>(.*)/is';
        if (preg_match($re_close, $text, $matches)) {
            // fine, found matching closer, continue loop after it
            $text = $matches[1];
        }
        else {
            // not closed - return to report it
            return $suspect;
        }
    }
    return null;
}

$text = <<<EOT
This text contains < font
size=+4 > an
unclosed suspect tag.
EOT;

$tag = check_tags($text);
if ($tag) echo "Unmatched: \"$tag\"\n";
else echo "Perfect!\n";

The drawbacks: This approach is softly targeted at unintended typos, such as in the example text.
It won't catch deliberate attacks, such as Blindtext http://www.vogelsinger.at/
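For readers who also need nesting checked, here is a minimal sketch of a different technique from the loop above: a stack-based balance check. It is not Ernest's code. The function name first_unbalanced_tag and its parameters are hypothetical, and the nullable return type assumes PHP 7.1 or later. Like the original, it only looks at tag names, so it does nothing about attributes such as onclick.

```php
<?php
// Hypothetical helper (not from the thread): returns the name of the first
// tag that breaks the nesting, or null when all listed tags are balanced.
function first_unbalanced_tag(string $text, array $tags): ?string
{
    $alternation = implode('|', array_map('preg_quote', $tags));

    // Group 1 is "/" for a closing tag and "" for an opening tag.
    // Group 2 is the tag name; \b stops "b" from matching "<blockquote" or "<br".
    $pattern = '/<\s*(\/?)\s*(' . $alternation . ')\b[^>]*>/i';
    preg_match_all($pattern, $text, $found, PREG_SET_ORDER);

    $open = []; // stack of tag names that are currently open
    foreach ($found as $m) {
        $name = strtolower($m[2]);
        if ($m[1] === '') {
            $open[] = $name;            // opening tag: push it
        } elseif (array_pop($open) !== $name) {
            return $name;               // closer does not match the innermost opener
        }
    }

    // Anything still on the stack was opened but never closed.
    return $open ? end($open) : null;
}
```

Calling first_unbalanced_tag($text, explode('|', 'b|i|u|strong|em|font|a|ol|ul|blockquote')) gives a drop-in replacement for check_tags($text) in the example above; a stack is used so that a closer is only accepted when it matches the innermost open tag.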
RE: [PHP] preg_match question: locating unmatched HTML tags
> I'm trying to build a regexp that would parse user-supplied text and
> identify cases where HTML tags are left open or are not properly
> matched, e.g., opening tags without matching closing tags. This is for a
> sort of message board type of application, and I'd like to allow users to
> use some HTML, but just would like to check to ensure that no stray tags
> are input that would screw up the rest of the page's display. I'm new to
> regular expressions, and the one below is as far as I've gotten. If
> anyone has any suggestions, they'd be very much appreciated.

Letting users enter HTML is a bad idea. Even if you only let them use <b> tags, they can still put ONCLICK and mouseover effects for the bold text to screw with your other users.

Use a BB style code, such as [b] for bold, [i] for italics, etc. This way, you only match pairs and replace them with HTML and use htmlentities on anything else. This way an unmatched [b] tag won't be replaced with <b> and mess up your code.

---John W. Holmes...

PHP Architect - A monthly magazine for PHP Professionals. Get your copy today. http://www.phparch.com/
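A minimal sketch of the BB-style idea described above, under a few assumptions: the helper name bb_to_html is hypothetical, only [b], [i] and [u] are handled, htmlspecialchars stands in for the htmlentities call mentioned in the message, and typed parameters assume PHP 7 or later. Everything is escaped first, then only complete, properly nested pairs are turned into HTML.

```php
<?php
// Hypothetical helper (not from the thread): converts matched BB pairs to HTML.
function bb_to_html(string $text): string
{
    // Escape first, so any raw HTML the user typed is displayed, not interpreted.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Match a pair whose content holds no further "[", i.e. the innermost pair.
    // The backreference \1 forces the closer to name the same tag as the opener.
    $pattern = '/\[(b|i|u)\]([^\[]*)\[\/\1\]/i';

    // Repeat until nothing changes, so outer pairs are converted after inner ones.
    do {
        $html = preg_replace($pattern, '<$1>$2</$1>', $html, -1, $count);
    } while ($count > 0);

    return $html;
}
```

With this sketch an unmatched [b] is left as literal text, and misnested input such as [b][i]x[/b][/i] is not converted at all. One trade-off of the innermost-first pattern is that a pair whose content contains a literal "[" is also left unconverted.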
[PHP] preg_match question: locating unmatched HTML tags
My apologies in advance if this is too basic or there's a solution easily found out there, but after lots of searching, I'm still lost.

I'm trying to build a regexp that would parse user-supplied text and identify cases where HTML tags are left open or are not properly matched, e.g., opening tags without matching closing tags. This is for a sort of message board type of application, and I'd like to allow users to use some HTML, but just would like to check to ensure that no stray tags are input that would screw up the rest of the page's display. I'm new to regular expressions, and the one below is as far as I've gotten. If anyone has any suggestions, they'd be very much appreciated.

Thanks,
Andy

$suspect_tags = "b|i|u|strong|em|font|a|ol|ul|blockquote ";
$pattern = '/<(' . $suspect_tags . '[^>]*>)(.*)(?!<\/\1)/Ui';
if (preg_match($pattern,$_POST['entry'],$matches)) {
    //do something to report the unclosed tags
} else {
    echo 'Input looks fine. No unmatched tags.';
}