Ralf Hautkappe wrote:
>
> hi,
>
> i want to extract links and their tags out of html with gforth... i have got
> one solution with search ( a1 n1 s| href="| search...) .. but i feel its to
> complex., because i parse large files with different levels of links....
What are different levels of links?

> is there an other way?

You could use a general string matcher like FoSM by Gordon Charlton
(later maintained by Chris Jakeman), or a general parser/parser
generator like BNFparse by Brad Rodriguez or Gray by me. Or you could
use a general SGML/HTML/XML parser with an appropriate DTD, but I don't
know of one written in Forth, and real-world web documents don't conform
to DTDs anyway (I don't know how the usual parsers deal with that).

For your problem, I would probably stick with SEARCH, maybe with a
little SCAN, SKIP, and their backwards equivalents. I would not work a
line at a time, but a file at a time, because links can cross line
boundaries.

> maybe using forth´s interpreter?

I don't think the Forth interpreter can be used profitably without
major surgery.

- anton
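A minimal sketch of the SEARCH-plus-SCAN approach suggested above, assuming Gforth (for `s\"`, SCAN, and SLURP-FILE). The word names `href-start` and `.links` are made up for this illustration, and the sketch only handles the exact, case-sensitive form `href="..."`; single-quoted or unquoted attribute values are not recognized.

```forth
\ Hypothetical helper words; not part of Gforth or any library.

: href-start ( c-addr u -- c-addr' u' flag )
  \ Search for  href="  and, if found, step past it so that
  \ c-addr' points at the first character of the URL.
  \ flag is false (and the string is unchanged) if nothing is found.
  s\" href=\"" search  dup if  >r 6 /string r>  then ;

: .links ( c-addr u -- )
  \ Print every href value in the string, one per line.
  begin  href-start  while
    2dup [char] " scan       \ find the closing quote (or end of string)
    2swap 2 pick - type cr   \ print the text before it, keep the rest
  repeat  2drop ;
```

Since the whole file is held in memory as one string, a link that spans a line break is still found. Usage might look like this (the file name is a placeholder):

```forth
s" page.html" slurp-file .links
```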
