Re: [PHP] how do I use php://memory?
On Jan 29, 2010, at 10:57 PM, Mari Masuda wrote:
>
> On Jan 29, 2010, at 4:38 PM, Nathan Nobbe wrote:
>
>> On Fri, Jan 29, 2010 at 5:35 PM, Mari Masuda wrote:
>> Hello,
>>
>> I have a function that uses tidy to attempt to clean up a bunch of crappy
>> HTML that I inherited.  In order to use tidy, I write the crappy HTML to a
>> temporary file on disk, run tidy, and extract and return the clean(er) HTML.
>> The program itself works fine but with all of the disk access, it runs
>> quite slowly.
>>
>> why read from disk in the first place?
>>
>> http://us3.php.net/manual/en/tidy.parsestring.php
>>
>> -nathan
>
> Thank you, this looks like exactly what I need.  Unfortunately I cannot get
> it to work on my machine.

[snip]

So I figured it out... I was using the wrong command to compile libtidy.  (I am not a *nix geek so I had no idea I was messing it up.)  To get it working, I followed the instructions I found here: http://www.php.net/manual/en/ref.tidy.php#64281

My setup is slightly different from the person who wrote the directions in that I am running OS X 10.6.2 and PHP 5.2.12.  However, the only difference between the instructions I followed and what I actually had to do is that the line to comment out for me was line 525 instead of 508.  (The actual line is: typedef unsigned long ulong;)

Thank you everyone for your help!

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] how do I use php://memory?
On Jan 29, 2010, at 4:38 PM, Nathan Nobbe wrote:

> On Fri, Jan 29, 2010 at 5:35 PM, Mari Masuda wrote:
> Hello,
>
> I have a function that uses tidy to attempt to clean up a bunch of crappy
> HTML that I inherited.  In order to use tidy, I write the crappy HTML to a
> temporary file on disk, run tidy, and extract and return the clean(er) HTML.
> The program itself works fine but with all of the disk access, it runs quite
> slowly.
>
> why read from disk in the first place?
>
> http://us3.php.net/manual/en/tidy.parsestring.php
>
> -nathan

Thank you, this looks like exactly what I need.  Unfortunately I cannot get it to work on my machine.  I recompiled PHP with --with-tidy=/usr/local and this is the version and modules in use:

[Fri Jan 29 22:50:41] ~: php -v
PHP 5.2.12 (cli) (built: Jan 29 2010 22:35:24)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2009 Zend Technologies
[Fri Jan 29 22:52:30] ~: php -m
[PHP Modules]
ctype
date
dom
filter
gd
hash
iconv
json
libxml
mbstring
mysql
mysqli
pcre
PDO
pdo_mysql
pdo_sqlite
posix
Reflection
session
SimpleXML
SPL
SQLite
standard
tidy
tokenizer
xml
xmlreader
xmlwriter
zlib

[Zend Modules]

[Fri Jan 29 22:52:34] ~:

When I run this test code
=
<?php
$html = "blahhello";
$config = array('indent' => true,
                'wrap' => '0');

// Tidy
$tidy = new tidy();
var_dump($tidy);
$tidy->parseString($html, $config, 'utf8');
var_dump($tidy);
$tidy->cleanRepair();
var_dump($tidy);
echo tidy_get_output($tidy);
var_dump($tidy);
?>
=

I get this output:
=
object(tidy)#1 (2) {
  ["errorBuffer"]=>
  NULL
  ["value"]=>
  NULL
}
object(tidy)#1 (2) {
  ["errorBuffer"]=>
  NULL
  ["value"]=>
  NULL
}
object(tidy)#1 (2) {
  ["errorBuffer"]=>
  NULL
  ["value"]=>
  NULL
}
object(tidy)#1 (2) {
  ["errorBuffer"]=>
  NULL
  ["value"]=>
  NULL
}

I have no clue what I'm doing wrong...

Mari
Re: [PHP] how do I use php://memory?
On 1/30/10 1:35 AM, Mari Masuda wrote:
> Hello,
>
> I have a function that uses tidy to attempt to clean up a bunch of crappy
> HTML that I inherited.  In order to use tidy, I write the crappy HTML to a
> temporary file on disk, run tidy, and extract and return the clean(er) HTML.
> The program itself works fine but with all of the disk access, it runs quite
> slowly.  I saw on this page (http://www.php.net/manual/en/wrappers.php.php)
> that I could write to memory by using php://memory.  Unfortunately, I could
> not quite get it to work.  The problem is that in the below function, the
> code within the [[[if (file_exists($dirty_file_path))]]] does not get run if
> I change [[[$dirty_file_path]]] to "php://memory".  Has anyone ever
> successfully used php://memory before?  If so, what can I do to use it in my
> code?  Thank you.

what does it matter that it runs slowly? run it once and be done with it.

alternatively, use the php tidy extension and avoid the file system and shelling out altogether.

actually, I'd imagine shelling out from a webserver process is the bottleneck, not saving to and reading from the file system.

lastly, I don't suppose you've heard of /dev/shm?

and, er, no, I don't have experience with php://memory, but you might try searching for other people's code:

http://www.google.com/codesearch?q=php%3A%2F%2Fmemory&hl=en&btnG=Search+Code

> //==
> function cleanUpHtml($dirty_html, $enclose_text=true) {
>
>     $parent_dir = "/filesWrittenFromPHP/";
>     $now = time();
>     $random = rand();
>
>     //save dirty html to a file so tidy can process it
>     $dirty_file_path = $parent_dir . "dirty" . $now . "-" . $random . ".txt";
>     $dirty_handle = fopen($dirty_file_path, "w");
>     fwrite($dirty_handle, $dirty_html);
>     fclose($dirty_handle);
>
>     $cleaned_html = "";
>     $start = 0;
>     $end = 0;
>
>     if (file_exists($dirty_file_path)) {
>         exec("/usr/local/bin/tidy -miq -wrap 0 -asxhtml --doctype strict --preserve-entities yes --css-prefix \"tidy\" --tidy-mark no --char-encoding utf8 --drop-proprietary-attributes yes --fix-uri yes " . ($enclose_text ? "--enclose-text yes " : "") . $dirty_file_path . " 2> /dev/null");
>
>         $tidied_html = file_get_contents($dirty_file_path);
>
>         $start = strpos($tidied_html, "<body>") + 6;
>         $end = strpos($tidied_html, "</body>") - 1;
>
>         $cleaned_html = trim(substr($tidied_html, $start, ($end - $start)));
>     }
>
>     unlink($dirty_file_path);
>
>     return $cleaned_html;
> }
> //==
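
On the question of why the file_exists() branch never runs with "php://memory": that wrapper is not a named file. Each fopen() creates a fresh buffer that exists only behind its own stream handle inside the PHP process, so path-based checks fail and an external tidy process has nothing it can open. A minimal sketch of that behaviour, using a made-up $dirty_html sample (not from the thread):

```php
<?php
// Sketch: php://memory is a per-handle buffer, not a path on disk.

$dirty_html = "<p>unclosed paragraph";   // hypothetical sample input

// Write to an in-memory stream and read it back through the SAME handle.
$handle = fopen("php://memory", "w+");
fwrite($handle, $dirty_html);
rewind($handle);                          // seek back to the start before reading
$round_trip = stream_get_contents($handle);
fclose($handle);                          // the buffer is discarded here

// A second fopen() gets a brand-new, empty buffer.
$other = fopen("php://memory", "w+");
$second_read = stream_get_contents($other);
fclose($other);

var_dump($round_trip === $dirty_html);    // bool(true)
var_dump($second_read);                   // string(0) ""
var_dump(file_exists("php://memory"));    // bool(false): the wrapper has no stat() support
```

So php://memory only helps when the code that consumes the data can be handed the open stream handle; it cannot stand in for a temporary file passed to exec().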
Re: [PHP] how do I use php://memory?
On Fri, Jan 29, 2010 at 5:35 PM, Mari Masuda wrote:

> Hello,
>
> I have a function that uses tidy to attempt to clean up a bunch of crappy
> HTML that I inherited.  In order to use tidy, I write the crappy HTML to a
> temporary file on disk, run tidy, and extract and return the clean(er) HTML.
> The program itself works fine but with all of the disk access, it runs
> quite slowly.

why read from disk in the first place?

http://us3.php.net/manual/en/tidy.parsestring.php

-nathan
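
A rough sketch of what this suggestion could look like in place of the exec()-based function, assuming the tidy extension is compiled in (it should show up in `php -m`). The function name cleanUpHtmlInMemory and the mapping from the command-line flags to configuration keys are assumptions, not code from the thread, and the show-body-only option is swapped in for the strpos()/substr() extraction of the body contents:

```php
<?php
// Sketch only: in-memory cleanup with the tidy extension, no temp file, no exec().
// cleanUpHtmlInMemory is a hypothetical name; the option mapping is an assumption.

function cleanUpHtmlInMemory($dirty_html, $enclose_text = true) {
    // Configuration keys intended to mirror the command-line flags in the thread.
    $config = array(
        'indent'                      => true,      // -i
        'wrap'                        => 0,         // -wrap 0
        'output-xhtml'                => true,      // -asxhtml
        'doctype'                     => 'strict',
        'preserve-entities'           => true,
        'css-prefix'                  => 'tidy',
        'tidy-mark'                   => false,
        'drop-proprietary-attributes' => true,
        'fix-uri'                     => true,
        'enclose-text'                => $enclose_text,
        'show-body-only'              => true,      // emit only what is inside <body>
    );

    $tidy = new tidy();
    $tidy->parseString($dirty_html, $config, 'utf8');  // third argument replaces --char-encoding
    $tidy->cleanRepair();

    return trim(tidy_get_output($tidy));
}
```

Something like `echo cleanUpHtmlInMemory("<p>blah<br>hello");` would then return the repaired fragment as a string, without touching the file system.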