Markus Fischer wrote:
> > Could you give some details about your implementation? What do you
> > use for HTML parsing and does it work correctly with scripts,
> > comments, attributes?
>
> I'm not sure if I understand you correctly ... what I was talking about
> has nothing to do with parsing specific kind of contents. It's just
> strings and custom (sprintf-based) highlighting. The highlighting part
> doesn't care what the content is, actually.

Oh, I see.

Hint: you can use the default analyzer to tokenize the input string. It returns each term's original position in the text together with the term in normalized form. That is useful if you use stemming or any other term normalization (e.g. converting to lower case for case-insensitive matching).

I agree it would be good to have an option for a simple (but configurable) plain-text highlighting feature.

With best regards,
Alexander Veremyev.

> Our scenario is targeted against highlighting matches in text fragments,
> just like e.g. Google does. The text to match against and highlight in
> our cases has always been plain ASCII, whether it came from HTML, PDF or
> something else.
>
> If you take the existing highlighting part currently in Z_S_L, it is not
> much different, just that it is more flexible and doesn't use the DOM.
>
> Maybe our case is too specific for the general audience anyway and I got
> the wrong vision that it's a common case, oops?
>
> > I think highlighting templates with printf syntax is a good idea.
>
> Good start then :)
>
> - Markus
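
To make the hint above concrete, here is a minimal sketch of the idea in Python. It is not Zend_Search_Lucene code: `Token`, `tokenize` and `highlight` are hypothetical names, and the tokenizer is a simple stand-in that assumes plain ASCII text and lower-casing as the only normalization. It shows how a tokenizer that reports normalized terms together with their original offsets lets a printf-style template wrap the matched text exactly as it appeared in the input.

```python
import re
from typing import Iterable, List, NamedTuple


class Token(NamedTuple):
    term: str   # normalized form (lower-cased in this sketch)
    start: int  # offset of the first character in the original text
    end: int    # offset one past the last character


def tokenize(text: str) -> List[Token]:
    """Stand-in analyzer: split on ASCII letters/digits and lower-case each term."""
    return [
        Token(m.group().lower(), m.start(), m.end())
        for m in re.finditer(r"[A-Za-z0-9]+", text)
    ]


def highlight(text: str, query_terms: Iterable[str], template: str = "<b>%s</b>") -> str:
    """Wrap every token whose normalized form is a query term in `template`."""
    wanted = {t.lower() for t in query_terms}
    pieces = []
    pos = 0
    for tok in tokenize(text):
        if tok.term in wanted:
            pieces.append(text[pos:tok.start])                      # untouched text before the match
            pieces.append(template % (text[tok.start:tok.end],))    # original spelling, wrapped
            pos = tok.end
    pieces.append(text[pos:])                                       # remainder after the last match
    return "".join(pieces)


print(highlight("Lucene indexes text; LUCENE is fast.", ["lucene"]))
```

Because matching is done on the normalized term while the output is cut from the original offsets, "Lucene" and "LUCENE" are both highlighted with their original capitalization. The sketch does no HTML escaping, so it is only suitable as-is for plain-text fragments.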
