Jochem,

Thank you for responding, and for explaining more about regular expressions.

> yes but you wouldn't use preg_replace() but rather preg_match() or
> preg_match_all()
> which gives you back an array (via 3rd/4th[?] reference argument) which contains
> the texts that matched (and therefore want to keep).

I looked up preg_match_all() on php.net, and, in combination with what was said before, came up with this syntax:

preg_match_all( "#^<li[^>]*>(.*)<br[^>]*>#is", $response, $wordList, PREG_PATTERN_ORDER );
var_dump($wordList);

The idea is to catch all text between <li> and <br> tags.

Unfortunately, the result I get from var_dump is:

array(2) { [0]=> array(0) { } [1]=> array(0) { } }

In other words, it made no matches.
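
For reference, this is how I understand the shape of that dump: with PREG_PATTERN_ORDER, index 0 holds the full matches and index 1 holds the text captured by the first parenthesised group, so two empty arrays mean zero matches. Here is a minimal sketch on a made-up string (the $sample value and the simplified pattern are purely illustrative, not taken from the real page) showing what the array looks like when something does match:

```php
<?php
// Hypothetical sample input, not taken from the real page.
$sample = '<li>one<br><li>two<br>';

// preg_match_all() returns the number of full matches and fills $m by reference.
$count = preg_match_all('#<li>(.*?)<br>#', $sample, $m, PREG_PATTERN_ORDER);

// $m[0] = full matches, $m[1] = text captured by the first group.
var_dump($count, $m);
```

With this input, $m[0] should contain the two full "<li>...<br>" strings and $m[1] the two captured words.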

The text being searched is an entire web page which contains the following:
(Please note the following includes utf-8 encoded Japanese text. Apologies if it comes out as ASCII gibberish)

<FONT color="red">日本語</FONT>は<FONT color="red">簡単</FONT>だよ<br>
<ul><li> 日本語 【にほんご】 (n) Japanese language; (P); EP <br>
<li> 簡単 【かんたん】 (adj-na,n) simple; (P); EP <br>
</ul><p>

So, my preg_match_all search should have found:

日本語 【にほんご】 (n) Japanese language; (P); EP
簡単 【かんたん】 (adj-na,n) simple; (P); EP

I've checked and rechecked my syntax, and I can't see why it would fail.

Have I messed up the regular expression, or the use of preg_match_all?
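
One possible reading of the empty result, offered as a sketch rather than a confirmed diagnosis: without the m modifier, ^ matches only at the very start of $response, and the page does not begin with <li>. Separately, a greedy (.*) combined with the s modifier would run from the first <li> to the last <br> on the page. The variant below drops the anchor, makes the quantifier lazy, and adds the u modifier for UTF-8. The $response value here is a stand-in containing only the fragment quoted above, and the $words variable and the trim step are my own additions for illustration:

```php
<?php
// Stand-in for the fetched page: only the fragment quoted above.
$response = <<<HTML
<FONT color="red">日本語</FONT>は<FONT color="red">簡単</FONT>だよ<br>
<ul><li> 日本語 【にほんご】 (n) Japanese language; (P); EP <br>
<li> 簡単 【かんたん】 (adj-na,n) simple; (P); EP <br>
</ul><p>
HTML;

// No ^ anchor, lazy (.*?) so each match stops at the nearest <br>,
// and the u modifier so pattern and subject are treated as UTF-8.
preg_match_all('#<li[^>]*>(.*?)<br[^>]*>#isu', $response, $wordList, PREG_PATTERN_ORDER);

// $wordList[1] holds the captured text; trim the surrounding spaces.
$words = array_map('trim', $wordList[1]);
var_dump($words);
```

On a full page this pattern would also pick up any other <li> ... <br> pairs, so it may need narrowing to the relevant <ul> block.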

--
Dave M G

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php