OK, I've run into my first significant problem with the available StAX parsers. Woodstox does not handle DTDs properly. The StAX reference implementation handles them properly, but it interferes with the automatic character encoding sniffing. I haven't tried the Sun implementation yet. I'm looking for a list of the other StAX implementations that are available so I can test and debug the code against each one.
For now, I've disabled automatic charset sniffing so that the StAX reference implementation can be used out of the box. - James
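
For anyone who wants to reproduce this against a different parser, here is a rough sketch of reading a stream with an explicitly supplied encoding instead of relying on sniffing, using only the standard `javax.xml.stream` API. The class and method names (`ExplicitEncodingDemo`, `rootElementName`, `implementationName`) are made up for illustration and are not taken from the actual code; the sample document deliberately has no DTD.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the project discussed above.
public class ExplicitEncodingDemo {

    // Reports which StAX implementation the factory lookup picked up,
    // which is handy when swapping parsers on the classpath.
    static String implementationName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    // Returns the local name of the first start element, decoding the
    // stream with the caller-supplied encoding rather than a sniffed one.
    static String rootElementName(InputStream in, String encoding)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(in, encoding);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null; // no element found
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<greeting>hello</greeting>".getBytes(StandardCharsets.UTF_8);
        System.out.println("StAX implementation: " + implementationName());
        System.out.println("Root element: "
                + rootElementName(new ByteArrayInputStream(xml), "UTF-8"));
    }
}
```

Running it with a different StAX jar on the classpath should show a different implementation class name, which makes it easy to confirm which parser is actually being exercised.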
