nbsp means a non-breaking space. Most HTML renderers collapse runs of spaces 
into a single one (for historical reasons, as far as I know), so the nbsp is 
commonly used to force extra spaces. It can appear anywhere in a text, most 
often for basic indentation. However, filter only works on whole lines, so it 
is not helpful here. You should use 'replace' for individual occurrences of 
strings that can't appear in normal text:

replace "&nbsp;" with space in theHTML
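
A slightly fuller sketch of the same idea, in case the page mixes several 
forms of the non-breaking space. The handler name stripNbsp is made up for 
illustration, and the numToChar(160) line assumes the text is in an ISO 8859-1 
or Windows Latin-1 style encoding, where code 160 is the non-breaking space:

```livecode
-- Hypothetical helper (not from the thread): turn every form of the
-- non-breaking space into an ordinary space.
function stripNbsp pHTML
   -- the named entity as it appears in the page source
   replace "&nbsp;" with space in pHTML
   -- the numeric form of the same entity
   replace "&#160;" with space in pHTML
   -- the raw character; ASSUMES code 160 is the non-breaking space
   -- in the encoding the text is held in
   replace numToChar(160) with space in pHTML
   return pHTML
end stripNbsp
```

It could be called as: put stripNbsp(theHTML) into theHTML. By contrast, 
filter theHTML without "*&nbsp;*" would delete every line that contains the 
entity, because filter keeps or drops whole lines.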

On 12 Jun 2011, at 15:42, Keith Clarke wrote:

> Thanks for the insights Jim (and Stephen) - all very useful.
> A list of stuff is now emerging from the depths of the page. The only problem 
> I have now is some stubborn '&nbsp;' characters that don't respond to 
> filtering without "&nbsp;" or numToChar(160).
> Any ideas?
> Best,
> Keith..
> 
> On 12 Jun 2011, at 14:18, Jim Ault wrote:
> 
>> I forgot to mention the old frames style if you are looking into archives on 
>> old sites,
>> and <IFRAME> on newer sites, easy to detect, but now you have a second 
>> <head> </head> <body> </body>.
>> 
>> On Jun 12, 2011, at 4:14 AM, Keith Clarke wrote:
>> 
>>> I've got the HTML source into a reasonable shape for processing with line 
>>> and item chunk expressions by using:
>>> 
>>> put field "fld Page Source Code" into tHTML
>>> replace "/div>" with "/div>" & return in tHTML
>>> replace "/tr>" with "/tr>" & return in tHTML
>>> replace "/td>" with "/td>" & tab in tHTML
>>> filter tHTML with <strings that isolate only the interesting, data-laden 
>>> table rows>
>>> 
>>> So, I can now have line-level chunk expressions mapped to divs and table 
>>> row tags, together with item-level expressions for iterating through the 
>>> tags and their attributes within table rows. Nice!
>>> 
>>> Now the rich seams have been revealed, it's time to start digging out them 
>>> there nuggets! :-)
>> 
>> Jim Ault
>> Las Vegas
> 
> 
> _______________________________________________
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription 
> preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode

