To: Xerces-C Developers List

International Characters, Inc. has been developing a high-performance XML parser, called icXML, based on the systematic restructuring of Xerces-C++ to incorporate Parabix (parallel bit stream) technology. We are now preparing to release this parser under the Apache License in the hope that it will ultimately be accepted as a Xerces subproject (with our continuing participation).

The performance improvements offered by icXML are dramatic. Our target is a 50% speed-up compared to Apache Xerces-C++, although we are measuring more than 100% speed-up (twice as fast) in some applications.

Parabix technology is the result of an ongoing research program at Simon Fraser University, where I am a professor of Computing Science. It takes advantage of the SIMD capabilities of modern processors and a novel transposition of character streams into parallel bit streams to process up to 256 characters at a time. icXML is based on the second-generation Parabix technology described in our papers appearing in the proceedings of EuroPar 2011 and HPCA 2012.

At present, our working stable version is icXML 0.6, and we are targeting icXML 0.7, which should be close to functionally complete for UTF-8 and UTF-16 inputs and the IGXML scanner. Once a few bugs are resolved, we hope to package it up for public access on an SVN server.

One thing that is not quite clear to me, though, is the best organization for keeping our code in a common framework with the existing Xerces code. We presently have some source subdirectories for our own newly created files, but we have also made edits, both major and minor, to many other Xerces source files in place. Is there any way that the autotools chain can be used to address these issues? Any advice on structuring would be highly appreciated.

Parabix and icXML are trademarks of International Characters, Inc.

Robert D. Cameron, Ph.D.
CTO, International Characters, Inc.
