Thanks for the insights, Jim (and Stephen) - all very useful. A list of stuff is now emerging from the depths of the page.

The only problem I have now is some stubborn '&nbsp;' characters that don't respond to filtering without "&nbsp;" or numToChar(160). Any ideas?

Best,
Keith..
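
One thing that may be relevant here: filter keeps or drops whole lines that match a pattern, so it cannot strip characters out of a line, whereas replace works on the characters themselves. A minimal sketch along those lines follows. The function name stripNbsp is made up for illustration, and the two-byte case assumes the page might be UTF-8 encoded, which the thread does not confirm.

```livecode
-- Hypothetical helper: swap non-breaking spaces for ordinary spaces.
function stripNbsp pHTML
   -- the HTML entity form, as it appears in the page source
   replace "&nbsp;" with space in pHTML
   -- ASSUMPTION: if the page is UTF-8, a non-breaking space arrives as
   -- the byte pair 194,160; handle the pair first so no stray byte is left
   replace numToChar(194) & numToChar(160) with space in pHTML
   -- the single-byte form (byte 160 in Latin-1 encoded pages)
   replace numToChar(160) with space in pHTML
   return pHTML
end stripNbsp
```

Usage would be something like: put stripNbsp(tHTML) into tHTML, run before the filter step so the patterns only have ordinary spaces to deal with.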

On 12 Jun 2011, at 14:18, Jim Ault wrote:

> I forgot to mention the old frames style if you are looking into archives on
> old sites,
> and <IFRAME> on newer sites, easy to detect, but now you have a second <head>
> </head> <body> </body>.
>
> On Jun 12, 2011, at 4:14 AM, Keith Clarke wrote:
>
>> I've got the HTML source into a reasonable shape for processing with line
>> and item chunk expressions by using:
>>
>> put field "fld Page Source Code" into tHTML
>> replace "/div>" with "/div>" & return in tHTML
>> replace "/tr>" with "/tr>" & return in tHTML
>> replace "/td>" with "/td>" & tab in tHTML
>> filter tHTML with <strings that isolate only the interesting, data-laden
>> table rows>
>>
>> So, I can now have line-level chunk expressions mapped to divs and table row
>> tags, together with item-level expressions for iterating through the tags
>> and their attributes within table rows. Nice!
>>
>> Now the rich seams have been revealed, it's time to start digging out them
>> there nuggets! :-)
>
> Jim Ault
> Las Vegas

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode