Activity report on

  *[JIRA] Bug SKER4947 - StringChopper will not handle cdata*

  Scarab Link: http://sesat.no/scarab/issues/id/SKER4947
  Module: Sesat> Kernel


  Activity generated by Håvard Frøiland ([EMAIL PROTECTED]) at 11/14/2008 09:33

  *Reasons for the changes*


  *Comments*
  - By Håvard Frøiland - 11/14/2008 09:33 ---
  "The status of this bug is that the problem at hand is much more involved than
I first thought. The chop function is also used quite differently from what I
had assumed.

At the moment the chop function is a compromise between correct XML handling
and plain-text handling. Escaped XML is not handled, and CDATA is left
alone (not chopped). Characters like & are also just passed through.

After a discussion with Endre, we thought that creating a new implementation
that chops XML correctly, and handles CDATA and XML escaping correctly, was the
way to go. But when this chopper should also handle plain text (like URLs) and
so on, the ambiguity in the data may become impossible to handle. I have
attached the xmlchopper code just for safekeeping.

The chop function is used extensively, and the performance penalty of doing too
much work here is huge. I measured the speed and found that the new
implementation (based on SAX) was about 30-40 times slower. This makes it
useless, since it is such a hotspot already.

So my current hunch is that we should not try to handle CDATA any more than we
do today. But I would like to talk to someone about this (Endre or Mick)."
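
For readers without the source at hand, the compromise described in the
comment above can be illustrated with a minimal sketch. This is not the
actual Sesat StringChopper: the class name ChopSketch, the chop signature,
and the "..." suffix are all assumptions made purely for illustration. It
only shows the three behaviours the comment mentions: plain text is cut,
CDATA is left alone, and characters like & are passed through without any
escaping.

```java
// Hypothetical sketch only; NOT the actual Sesat StringChopper implementation.
final class ChopSketch {

    private static final String CDATA_START = "<![CDATA[";

    private ChopSketch() {
    }

    /**
     * Cuts plain text down to maxLength characters and appends "..."
     * (the suffix is an assumption). Input containing a CDATA section
     * is returned unchanged.
     */
    static String chop(final String input, final int maxLength) {
        // Nothing to do for null or already-short input.
        if (input == null || input.length() <= maxLength) {
            return input;
        }
        // CDATA is left alone (not chopped): a cut inside the section
        // could drop the closing "]]>" and break the markup.
        if (input.contains(CDATA_START)) {
            return input;
        }
        // No XML parsing or (un)escaping happens here, so characters
        // such as '&' are simply passed through.
        return input.substring(0, maxLength) + "...";
    }
}
```

Because the sketch only does a length check, one substring search, and one
substring copy, it stays cheap on a hot path, which is the trade-off the
comment weighs against a full SAX-based parse.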
_______________________________________________
Kernel-issues mailing list
[email protected]
http://sesat.no/mailman/listinfo/kernel-issues
