Re: [PHP] html parsing
Ron Croonenberg wrote:
> I think the problem is that I read the lines in PHP, I read them with
> fgets and output them with printf. So the php interpreter never gets to
> see the line. the apache doesn't parse php output, so it doesn't happen
> there either. So.. I figured.. I either had to parse it in php myself OR
> convince the apache server to parse it for me.

This is exactly where the problem lies. Apache isn't serving the page to PHP; PHP is just reading the file from the filesystem.

What you might try is having PHP fopen() the URL of the file on the server. That way, Apache will parse the SSI and spit out what it needs, and then you can read the result any way you like.

Understand that PHP reading a file from the filesystem is different from your browser calling a server/webpage and having Apache serve the page back to the browser.

--
Jim Lucas

Perseverance is not a long race; it is many short races one after the other
    Walter Elliot

Some men are born to greatness, some achieve greatness, and some have greatness thrust upon them.
    Twelfth Night, Act II, Scene V
    by William Shakespeare

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
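Jim's suggestion of reading the page through the web server rather than from disk could be sketched roughly as below. The helper name `fetch_rendered` and the URL in the usage comment are assumptions for illustration only; the approach also assumes `allow_url_fopen` is enabled and that the template is reachable under a name for which Apache actually runs SSI.

```php
<?php
// Hypothetical helper: fetch a page by URL so that the web server,
// not PHP, produces the content (including any SSI expansion).
function fetch_rendered(string $url): ?string
{
    // For http:// URLs this needs allow_url_fopen=On in php.ini.
    $body = @file_get_contents($url);

    // file_get_contents() returns false on failure; map that to null.
    return $body === false ? null : $body;
}

// Usage (assumed URL, adjust to the real location of the template):
// $html = fetch_rendered('http://localhost/templates/page.shtml');
// if ($html !== null) {
//     echo $html;
// }
```

The trade-off is an extra HTTP request per page, and the template has to be publicly servable, which may or may not be acceptable for a template file.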
Re: [PHP] html parsing
Ron Croonenberg wrote:
> I think the problem is that I read the lines in PHP, I read them with
> fgets and output them with printf. So the php interpreter never gets to
> see the line. the apache doesn't parse php output, so it doesn't happen
> there either. So.. I figured.. I either had to parse it in php myself OR
> convince the apache server to parse it for me.

Absolutely. Why are you parsing it yourself anyway - what value are you adding?

/Per Jessen, Zürich
RE: [PHP] html parsing
On 08 November 2007 06:41, Ron Croonenberg wrote:

> ok I wrote something quick and dirty real quick:
> But somehow it doesn't seem to like recursion. Is there something
> special one needs to do in php ?

Recursive functions work just fine in PHP. What's the error message? As far as I can see, there's nothing wrong with what you've posted, except for some inefficiency due to not having found the right functions to use (see corrections below).

> here's the code snippet:
>
> function parsehtmlline($line)
> {
>     if (strlen(strstr($line, "#include")) == 0 && strlen(strstr($line, "<!--")) == 0) {

    if (strpos($line, "#include")===FALSE && strpos($line, "<!--")===FALSE) {

>         /* nothing to parse just print it */
>         print($line);
>     }
>     else {
>         /* extract the filename */
>         $ssi = extractHTMLssi($line);
>
>         /* open the file if it exists and output it */
>         /* else just print the string */
>         if (file_exists($ssi)) {
>             $incfile = fopen($ssi, "r");
>             while (!feof($incfile)) {
>                 $ssiline = fgets($incfile, 1024);
>                 // somehow PhP doesn't really like recursion, needs to be fixed
>                 // for now just print the line.
>                 // parsehtmlline($ssiline);
>                 print($ssiline);
>             }
>             fclose($incfile);
>         }
>         else
>             print($ssi);
>     }
> }
>
> function extractHTMLssi($line)
> {
>     $ssi = "";
>     $strptr = strstr($line, "\"");
>     if (strlen($strptr) == 0)
>         return $line;
>     else {
>         $ssi = $strptr;
>         $iss = strrev($ssi);
>         $strptr = strstr($iss, "\"");
>         $iss = substr($strptr, 1, -1);
>         $ssi = strrev($iss);
>     }
>     return $ssi;

I'd replace the whole of this function body with something like:

    $pos = strpos($line, '"');
    if ($pos===FALSE):
        return $line;
    else:
        // third argument is a length, so measure from the first quote
        return substr($line, $pos+1, strrpos($line, '"')-$pos-1);
    endif;

> }

Cheers!

Mike

--
Mike Ford, Electronic Information Services Adviser,
JG125, The Headingley Library,
James Graham Building, Leeds Metropolitan University,
Headingley Campus, LEEDS, LS6 3QS, United Kingdom
Email: [EMAIL PROTECTED]
Tel: +44 113 812 4730 Fax: +44 113 812 3211

To view the terms under which this email is distributed, please go to http://disclaimer.leedsmet.ac.uk/email.htm
Re: [PHP] html parsing
Hi Mike,

(I know it can probably be done better, or more elegantly.)

The problem is that the recursion doesn't seem to stop. Unless there are HTML SSIs that loop, there shouldn't be a problem. It basically just freezes up (probably the recursion not terminating / falling back).

However, the HTML displays fine when I don't use recursion, so it somewhat baffled me that the script doesn't terminate when I do use it. I did get to see some error messages about problems with file streams. Maybe it is running out of file handles?

thanks,

Ron

--
Ron Croonenberg
Lab Instructor
Technology Coordinator
Department of Computer Science
DePauw University
275 Julian Science Math Center
602 South College Ave.
Greencastle, IN 46135
Phone: 1 765 658 4761
Fax: 1 765 658 4732
e-mail: [EMAIL PROTECTED]
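The thread does not establish why the recursive version hangs, so the following is only a sketch of a more defensive variant, not a diagnosis. It guards against three things that can plausibly make such a loop run forever: an include chain that cycles (depth limit), a path that exists but is not a regular file (`is_file()` instead of `file_exists()`), and looping on `feof()` with a handle that failed to open (the PHP manual warns this can loop indefinitely). The function name `expand_ssi`, the depth limit of 10, and the choice to resolve `file=` relative to the including file are assumptions.

```php
<?php
// Sketch: expand <!--#include file="..." --> directives recursively.
// Handles at most one include directive per line.
function expand_ssi(string $path, int $depth = 0, int $maxDepth = 10): string
{
    // Stop on runaway include chains and on anything that is not a regular file.
    if ($depth > $maxDepth || !is_file($path)) {
        return '';
    }

    $handle = fopen($path, 'r');
    if ($handle === false) {
        return '';
    }

    $out = '';
    // Loop on the return value of fgets() rather than on feof().
    while (($line = fgets($handle)) !== false) {
        if (preg_match('/<!--#include\s+file="([^"]+)"\s*-->/', $line, $m)) {
            // Assumption: paths are relative to the directory of the current file.
            $included = dirname($path) . '/' . $m[1];
            $expanded = is_file($included)
                ? expand_ssi($included, $depth + 1, $maxDepth)
                : $m[0];                      // leave unknown includes untouched
            $out .= str_replace($m[0], $expanded, $line);
        } else {
            $out .= $line;
        }
    }
    fclose($handle);

    return $out;
}

// Usage (hypothetical file name):
// echo expand_ssi(__DIR__ . '/myhtml.dat');
```

Returning a string instead of printing inside the recursion keeps each file handle open only while its own file is being read, and makes the result easy to inspect when something goes wrong.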
Re: [PHP] html parsing
Hi Per,

the pages are templates. Depending on some variables, either one page or another shows up, without the links in the address bar changing, etc.

Ron

Per Jessen wrote:
> Absolutely. Why are you parsing it yourself anyway - what value are
> you adding?
>
> /Per Jessen, Zürich
Re: [PHP] html parsing
Can you just change it to
<?php include('include/header.sinc'); ?>
?

On Nov 7, 2007, at 7:24 PM, Ron Croonenberg [EMAIL PROTECTED] wrote:

> Hello,
>
> I have a script that adds data to an html template. However, when there
> is an include in the html like:
>
> <!--#include file="include/header.sinc" -->
>
> it is not processed, but just ends up as a string in the page.
> So I guess it needs to be parsed. Is there an easy way to do that?
>
> thanks,
>
> Ron
Re: [PHP] html parsing
ok I wrote something quick and dirty real quick:

But somehow it doesn't seem to like recursion. Is there something special one needs to do in php ?

here's the code snippet:

function parsehtmlline($line)
{
    if (strlen(strstr($line, "#include")) == 0 && strlen(strstr($line, "<!--")) == 0) {
        /* nothing to parse just print it */
        print($line);
    }
    else {
        /* extract the filename */
        $ssi = extractHTMLssi($line);

        /* open the file if it exists and output it */
        /* else just print the string */
        if (file_exists($ssi)) {
            $incfile = fopen($ssi, "r");
            while (!feof($incfile)) {
                $ssiline = fgets($incfile, 1024);
                // somehow PhP doesn't really like recursion, needs to be fixed
                // for now just print the line.
                // parsehtmlline($ssiline);
                print($ssiline);
            }
            fclose($incfile);
        }
        else
            print($ssi);
    }
}

function extractHTMLssi($line)
{
    $ssi = "";
    $strptr = strstr($line, "\"");
    if (strlen($strptr) == 0)
        return $line;
    else {
        $ssi = $strptr;
        $iss = strrev($ssi);
        $strptr = strstr($iss, "\"");
        $iss = substr($strptr, 1, -1);
        $ssi = strrev($iss);
    }
    return $ssi;
}
Re: [PHP] html parsing
Ron Croonenberg wrote:
> Hello,
>
> I have a script that adds data to an html template. However, when there
> is an include in the html like:
>
> <!--#include file="include/header.sinc" -->
>
> it is not processed, but just ends up as a string in the page.
> So I guess it needs to be parsed. Is there an easy way to do that?
>
> thanks,
>
> Ron

Do you have SSI enabled in Apache? You are running Apache, right? Do you have the file named .shtml? Or do you have Apache set up to run SSI through the HTML parser, allowing you to have the files named .html instead?

But I guess the other question is: why are you using an SSI command where you could easily use a PHP command, and thus not have to mix the two?
Re: [PHP] html parsing
Yes, I do have SSI enabled.

Here is what I am doing. I have a PHP script that reads a (basic) file with html in it (meaning that if it had an html/shtml extension it would just work in a browser).

Now I rename the file to myhtml.dat and let the script read it and print it to stdout. That works like a charm, until there's a server side include in the html page. (SSIs work in both html and in php.)

I tried to use <?php include("that.file"); ?> but that doesn't seem to work.
Re: [PHP] html parsing
I think the problem is that I read the lines in PHP: I read them with fgets and output them with printf. So the PHP interpreter never gets to see the line. Apache doesn't parse PHP output, so it doesn't happen there either.

So.. I figured.. I either had to parse it in PHP myself OR convince the Apache server to parse it for me.

Ron

Casey wrote:
> Can you just change it to
> <?php include('include/header.sinc'); ?>
> ?
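To illustrate the point above: text that is read with fgets and printed is never seen by the interpreter, whereas a file pulled in with `include` is. A minimal sketch of the second route, capturing the output with output buffering, might look like this. The function name `render_template` and the `$vars` parameter are hypothetical, and the sketch assumes the template is a trusted file, since `include` executes any PHP it contains.

```php
<?php
// Sketch: run a template through the PHP interpreter and capture its output,
// instead of copying it line by line with fgets()/printf().
function render_template(string $path, array $vars = []): string
{
    if (!is_file($path)) {
        throw new InvalidArgumentException("Template not found: $path");
    }

    // Expose the entries of $vars as local variables inside the template.
    // EXTR_SKIP keeps $path and $vars themselves from being overwritten.
    extract($vars, EXTR_SKIP);

    ob_start();
    include $path;          // PHP tags inside the file are executed here
    return ob_get_clean();  // everything the template printed
}

// Usage (hypothetical template that contains PHP include statements):
// echo render_template(__DIR__ . '/myhtml.dat', ['title' => 'Home']);
```

With this route the SSI directive in the template would have to be rewritten as a PHP include, as Casey suggests; Apache-style SSI comments are still just text to PHP.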
RE: [PHP] html parsing question
> I've got a little question. I am writing a page where I would like to
> parse my own invented HTML-looking tags (but want to keep the real HTML
> tags intact). I use a buffer for the output, and just before the end use
> ob_get_contents() to get the whole buffer which I want to check for
> those tags.
>
> (Something like:
>
>     $tag->content = ob_get_contents();
>     $output = $tag->interpret();
>     ob_end_clean();
>     echo $output;
> )
>
> Now my question is, what is the fastest way to search and replace
> through this file, whereby the interpretation of the tag can be somewhat
> complicated?
>
> I first just had a loop running character by character through this
> text, looking for tags, but that is obviously way too slow.
>
> Now I have something like:
>
>     preg_replace_callback ("/<(.+)>/", array($this, 'tag_callback'), $this->content);
>
> But then I don't know how to interpret different tags differently. Any
> suggestions?
>
> (A tag looks like: <CANTR REPLACE NAME=text> where then the whole tag
> has to be replaced by something called 'text' that has to be looked up
> in a table. So, <CANTR REPLACE NAME=text> has to be replaced with
> something else than <CANTR REPLACE NAME=main> - well, you get the idea.)

Here's an example of how to use preg_replace_callback. Within the callback() function, $matches[1] is going to contain whatever value was in the attribute of your CANTR tag (the example uses text= where your tag uses NAME=). Act according to it.

<?php

$text = 'Hello <cantr replace text="name">. You are on page <cantr replace text="main"> right now.';

function callback($matches)
{
    switch($matches[1])
    {
        case 'name':
            $retval = 'John';
            break;
        case 'main':
            $retval = 'Index.php';
            break;
        default:
            // unknown tag: leave it in the text unchanged
            $retval = $matches[0];
            break;
    }
    return $retval;
}

$new_text = preg_replace_callback('/<cantr replace text="([^"]+)">/i', 'callback', $text);

echo "<hr>";
echo $new_text;

?>

---John W. Holmes...

PHP Architect - A monthly magazine for PHP Professionals. Get your copy today. http://www.phparch.com/
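Since the question uses a method callback (`array($this, 'tag_callback')`) and a lookup table, the same idea can also be sketched as a small class. The class name `TagInterpreter`, the quoted `name="..."` attribute syntax, and the choice to leave unknown tags untouched are assumptions for illustration, not anything stated in the thread.

```php
<?php
// Sketch: replace <cantr replace name="..."> tags using a lookup table.
class TagInterpreter
{
    /** @var array<string,string> tag name => replacement text */
    private $table;

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    public function interpret(string $content): string
    {
        // Only the custom tag is matched, so real HTML tags are left alone.
        return preg_replace_callback(
            '/<cantr\s+replace\s+name="([^"]+)"\s*>/i',
            [$this, 'tagCallback'],
            $content
        );
    }

    public function tagCallback(array $matches): string
    {
        $key = strtolower($matches[1]);

        // Unknown names fall back to the original tag text.
        return $this->table[$key] ?? $matches[0];
    }
}

// Usage:
// $tags = new TagInterpreter(['name' => 'John', 'main' => 'Index.php']);
// echo $tags->interpret('Hello <cantr replace name="name">.');
```

A single pattern with one capture group keeps the regex engine doing the scanning, which is typically far faster than a character-by-character loop in PHP; more complicated tags can dispatch on the captured name inside the callback.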
Re: [PHP] html parsing question
You can output XHTML with your custom tags and use the XML parsing functions.

Jos Elkink wrote:
> Hello,
>
> I've got a little question. I am writing a page where I would like to
> parse my own invented HTML-looking tags (but want to keep the real HTML
> tags intact). I use a buffer for the output, and just before the end use
> ob_get_contents() to get the whole buffer which I want to check for
> those tags.
>
> (Something like:
>
>     $tag->content = ob_get_contents();
>     $output = $tag->interpret();
>     ob_end_clean();
>     echo $output;
> )
>
> Now my question is, what is the fastest way to search and replace
> through this file, whereby the interpretation of the tag can be somewhat
> complicated?
>
> I first just had a loop running character by character through this
> text, looking for tags, but that is obviously way too slow.
>
> Now I have something like:
>
>     preg_replace_callback ("/<(.+)>/", array($this, 'tag_callback'), $this->content);
>
> But then I don't know how to interpret different tags differently. Any
> suggestions?
>
> (A tag looks like: <CANTR REPLACE NAME=text> where then the whole tag
> has to be replaced by something called 'text' that has to be looked up
> in a table. So, <CANTR REPLACE NAME=text> has to be replaced with
> something else than <CANTR REPLACE NAME=main> - well, you get the idea.)
>
> Thanks in advance for any help!
>
> Jos
>
> --
> Jos Elkink
> Game Administration Council
> Cantr II
> http://www.cantr.net
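A rough sketch of the XML route, using PHP's DOM extension, is shown below. It only works if the buffered output is well-formed XML, and it assumes the custom tag is written as a proper element such as `<cantr name="main"/>`; that spelling, the function name `replace_custom_tags`, and the lookup table are all hypothetical and differ from the tag syntax in the question.

```php
<?php
// Sketch: replace hypothetical <cantr name="..."/> elements in well-formed XHTML.
// Requires the DOM extension (ext-dom).
function replace_custom_tags(string $xhtml, array $table): string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xhtml)) {
        // Not well-formed: give the input back unchanged.
        return $xhtml;
    }

    // getElementsByTagName() returns a live list, so copy it before
    // modifying the tree.
    $nodes = iterator_to_array($doc->getElementsByTagName('cantr'));

    foreach ($nodes as $node) {
        $name = $node->getAttribute('name');
        $text = $doc->createTextNode($table[$name] ?? '');
        $node->parentNode->replaceChild($text, $node);
    }

    // Serialize only the root element, without an XML declaration.
    return $doc->saveXML($doc->documentElement);
}

// Usage:
// echo replace_custom_tags(
//     '<p>Hello <cantr name="name"/>.</p>',
//     ['name' => 'John']
// );
```

Compared with a regular expression, a parser copes with attribute order and quoting variations for free, at the cost of requiring strictly well-formed input.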
RE: [PHP] html parsing question
Below is a snippet from a larger script used to capture a daily output graphic file from a site. I put your requirements into it and removed a lot of superfluous stuff. It is not set up to capture text spanning more than one line, but you should be able to modify it accordingly if you can follow the logic.

> on 8/2/01 5:30 PM, Chuck Barnett at [EMAIL PROTECTED] wrote:
>
> > Hi I have a question about parsing a page on another server to grab
> > some headlines.
> >
> > I want to search down the page until I find a string -headlines- then
> > I want to grab everything between the next pair of <table></table>
> > tags.
> >
> > Anyone have a quick solution?
>
> i would check out http://php.net/strpos

<?PHP
/* HTML Content Grabber
   no copyright */

$file_url = "http://www.somewhere.com/";
$grabbed_file = file($file_url);   // array of lines, or false on failure

if ($grabbed_file) {
    for ($i = 0; $i <= count($grabbed_file) - 1; $i++) {
        if ($returnstr = strstr($grabbed_file[$i], '-headlines-')) {
            // 7 is the length of '<table>'
            $trim = substr($returnstr,
                           strpos($returnstr, '<table>') + 7,
                           strpos($returnstr, '</table>') - (strpos($returnstr, '<table>') + 7));
        } else {
            # not in this line, keep searching
        }
    }
} else {
    # Something went wrong
}
?>

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]
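For the multi-line case that the snippet above leaves open, one option is to search the page as a single string and use the offset argument of strpos(). This is only a sketch: the function name `extract_table_after` is made up, it does not handle nested tables, and it returns null whenever the marker or either tag is missing.

```php
<?php
// Sketch: return everything between the first <table ...> and </table>
// that follow a marker string, even when they span several lines.
function extract_table_after(string $html, string $marker): ?string
{
    $start = strpos($html, $marker);
    if ($start === false) {
        return null;
    }

    // Find the opening tag after the marker; it may carry attributes.
    $open = stripos($html, '<table', $start);
    if ($open === false) {
        return null;
    }

    // Content begins just past the '>' that closes the opening tag.
    $contentStart = strpos($html, '>', $open);
    if ($contentStart === false) {
        return null;
    }
    $contentStart++;

    $close = stripos($html, '</table>', $contentStart);
    if ($close === false) {
        return null;
    }

    return substr($html, $contentStart, $close - $contentStart);
}

// Usage (same placeholder URL as the snippet above):
// $html = file_get_contents('http://www.somewhere.com/');
// if ($html !== false) {
//     $headlines = extract_table_after($html, '-headlines-');
// }
```

Reading the page with file_get_contents() instead of file() is what makes the multi-line case straightforward, because the table no longer has to sit on the same line as the marker.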
Re: [PHP] html parsing question
on 8/2/01 5:30 PM, Chuck Barnett at [EMAIL PROTECTED] wrote:

> Hi I have a question about parsing a page on another server to grab
> some headlines.
>
> I want to search down the page until I find a string -headlines- then I
> want to grab everything between the next pair of <table></table> tags.
>
> Anyone have a quick solution?

i would check out http://php.net/strpos

> Thanks,
> Chuck

-- mike cullerton