This isn’t the fault of BaseX. The parser (TagSoup, if you choose HTML
parsing) inserts default values for attributes. You should be able
to suppress them by adding nodefaults=true to HTMLOPT.
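
Something along these lines might do it. This is only a sketch: it assumes nodefaults=true can be appended to the same comma-separated htmlopt list used in the original command, and I have not checked the resulting output. The shell variable HTMLOPT is just a local name for illustration, and the guard around basex is there so the snippet does nothing if basex is not installed.

```shell
# Recreate the sample input from the original report.
cat > bad.html <<\EOF
<html>
<ul>
<li>A<a href="o">z</a>
<li>B<br>
</ul>
</html>
EOF

# Same options as the original command, plus nodefaults=true
# (assumption: accepted in the same key=value list as nons=true).
HTMLOPT='method=html,nons=true,nodefaults=true'

# Run only if basex is on the PATH.
if command -v basex >/dev/null 2>&1; then
  basex -c "set parser html; set htmlopt $HTMLOPT; create db htmldb bad.html"
  basex -q "doc('htmldb')"
fi
```

If the option is picked up, the shape="rect" and clear="none" attributes should no longer appear in the stored document.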
On 2012-05-09 17:07, jida...@jidanni.org wrote:
"AH" == Alexander Holupirek <alexander.holupi...@uni-konstanz.de> writes:
AH> Please post a small snippet or example, so that we are able to test the
AH> problem.
Taking the example from the Debian basex man page, we add an innocent
<br> and <a>:
cat > bad.html <<\EOF
<html>
<ul>
<li>A<a href="o">z</a>
<li>B<br>
</ul>
</html>
EOF
basex -c 'set parser html; set htmlopt method=html,nons=true; create db htmldb bad.html'
basex -q "doc('htmldb')"
<html>
<body>
<ul>
<li>A<a shape="rect" href="o">z</a> HORRIBLE
</li>
<li>B<br clear="none"/> TERRIBLE
</li>
</ul>
</body>
</html>
How can I stop basex from insisting on adding such atrocious junk?
_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsi...@le-tex.de, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler