Activity report on *[JIRA] Bug SKER4947 - StringChopper will not handle cdata*
Scarab Link: http://sesat.no/scarab/issues/id/SKER4947
Module: Sesat > Kernel
Activity generated by Håvard Frøiland ([EMAIL PROTECTED]) at 11/14/2008 09:33

*Reasons for the changes*

*Comments*
- By Håvard Frøiland - 11/14/2008 09:33
---
"The status of this bug is that the problem at hand is much more involved than I first thought, and the chop function is also used quite differently from what I expected.

At the moment the chop function is a compromise between correct XML handling and plain-text handling. Escaped XML is not handled, and CDATA is left alone (not chopped). Characters like & are also just passed through.

After a discussion with Endre, we thought that creating a new implementation that chops XML correctly, handling CDATA and XML escaping properly, was the way to go. But when this chopper should also handle plain text (like URLs) and so on, the ambiguity in the data may become impossible to resolve. I have attached the xmlchopper code just for safekeeping.

The chop function is used extensively, and the performance penalty of doing too much work here is huge. I measured the speed and found that the new implementation (based on SAX) was about 30-40 times slower. This makes it useless, since chop is such a hotspot already.

So my current hunch is that we should not try to handle CDATA any more than we do today. But I would like to talk to someone about this (Endre or Mick)."
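The compromise described in the comment above can be illustrated with a minimal Java sketch. This is not the actual Sesat StringChopper code: the class name `ChopSketch`, the `chop` signature, the word-boundary rule and the "..." suffix are all assumptions made for illustration. It only shows the stated behaviour: input containing CDATA is left alone, characters such as & pass through untouched, and no XML parsing is done, which keeps the work per call small.

```java
// Illustrative sketch only. ChopSketch and chop(...) are hypothetical names,
// not the Sesat StringChopper API.
final class ChopSketch {

    private static final String CDATA_START = "<![CDATA[";

    private ChopSketch() {
    }

    /**
     * Chops plain text to at most maxLength characters (before the "..."
     * suffix is appended), preferring to cut at the last space at or
     * before maxLength.
     */
    static String chop(final String input, final int maxLength) {
        if (input == null || maxLength < 0) {
            throw new IllegalArgumentException("input must be non-null and maxLength >= 0");
        }
        // Short input needs no chopping. Input containing a CDATA section is
        // returned unchanged, mirroring "CDATA is left alone (not chopped)".
        if (input.length() <= maxLength || input.contains(CDATA_START)) {
            return input;
        }
        // Cut at a word boundary where possible. No escaping or unescaping
        // is attempted, so characters like '&' simply pass through.
        int cut = input.lastIndexOf(' ', maxLength);
        if (cut <= 0) {
            cut = maxLength; // no usable space found: hard cut
        }
        return input.substring(0, cut) + "...";
    }
}
```

Under these assumptions, `ChopSketch.chop("hello world foo", 8)` cuts at the space before "world", while any string containing `<![CDATA[` is returned as-is regardless of its length. A SAX-based chopper would instead have to parse every input, which is the extra work the comment measured as being far slower.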
_______________________________________________
Kernel-issues mailing list
[email protected]
http://sesat.no/mailman/listinfo/kernel-issues
