Hi Stefan,

Thanks for taking the time to read the doc and for giving us your feedback.

> -1!
> Xsl is terrible slow!
> Xml will blow up memory and storage usage.

But there is still something I don't understand...
Regarding a previous discussion we had about using the OpenSearch API to
replace Servlet => HTML with Servlet => XML => HTML (using XSL), here is a
copy of one of my comments:

    In my opinion, it is the front-end "dreamed" architecture.
    But more pragmatically, I'm not sure it's a good idea.
    XSL transformation is a rather slow process!!
    And the Nutch front-end must be very responsive.

and then your response, and Doug's response too:

    Stefan: We already done experiments using XSLT. There are some ways
    to improve speed, however it is 20 ++ % slower then jsp.

    Doug: I don't think this would make a significant impact on overall
    Nutch search performance.

(The complete thread is available at
http://www.mail-archive.com/[email protected]/msg03811.html )

I'm a little bit confused... Why should the use of XSL be considered too
expensive in time and memory in the back-end process, but not in the
front-end?

> Dublin core may is good for semantic web, but not for a content storage.

It is not used as content storage, but only as an intermediate step: it is
the output of the XSL transformation, which is then indexed using the
standard Nutch APIs.
(Note that this XML file schema maps perfectly to the Parse and ParseData
objects.)

> In general the goal must be to minimalize memory usage and improve
> performance such a parser would increase memory usage and definitely
> slow down parsing.

And not to improve flexibility, extensibility and features?

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
