Hi All,
I think I've found a bit of a bug in the HTML formatting code for text.
If someone has deliberately put some HTML in a field (in our case a
pair of <sup> tags), the formatting code strips only part of the
tag(s) and leaves the rest behind.
In our case it stripped the first < and the closing </sup>, but left
sup> at the beginning of the text that was supposed to be superscripted.
For example, Art4<sup>tm1a(KOMP)Wtsi</sup> became Art4sup>tm1a(KOMP)Wtsi,
with some extra span tags making the latter part superscript.
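A minimal standalone sketch that seems to reproduce this, assuming the substitution involved is the "superscripting for emma mart" line from the diffs further down. The helper name `superscript_old` is hypothetical and not part of BioMart; it only wraps that one substitution so it can be run on a single value.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioMart): applies the same substitution
# as the existing "superscripting for emma mart" line to one value.
sub superscript_old {
    my ($value) = @_;
    # Greedy (.*) runs from the first '<' to the last '>' in the value.
    $value =~ s/\<(.*)\>/<span style="vertical-align:super;font-size:0.8em">$1<\/span>/;
    return $value;
}

print superscript_old('Art4<sup>tm1a(KOMP)Wtsi</sup>'), "\n";
# prints: Art4<span style="vertical-align:super;font-size:0.8em">sup>tm1a(KOMP)Wtsi</sup</span>
```

Because the capture is greedy, only the outermost `<` and `>` are removed, so `sup>` survives at the front. The leftover `</sup` at the end is presumably swallowed by the browser as a malformed tag, which would explain why it looked as though the closing tag had been stripped.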
I've had a look at the code and come up with a small fix that seems
to work here, so I thought I'd better let you guys know about it.
Here are the diffs:
Index: lib/BioMart/Formatter/HTML.pm
===================================================================
RCS file: /cvsroot/biomart/biomart-perl/lib/BioMart/Formatter/HTML.pm,v
retrieving revision 1.9
diff -r1.9 HTML.pm
115,117c115,121
< # superscripting for emma mart
< $$row[$$attribute_positions[$i]] =~ s/\<(.*)\>/<span style="vertical-align:super;font-size:0.8em">$1<\/span>/;
<
---
> # catch html formatting...
> if ( $$row[$$attribute_positions[$i]] =~ /\<.+\>.+\<\/.+\>/ ) {
> # Do nothing - leave the original formatting
> } else {
> # superscripting for emma mart
> $$row[$$attribute_positions[$i]] =~ s/\<(.*)\>/<span style="vertical-align:super;font-size:0.8em">$1<\/span>/;
> }
Index: lib/BioMart/Formatter/HTML_36.pm
===================================================================
RCS file: /cvsroot/biomart/biomart-perl/lib/BioMart/Formatter/HTML_36.pm,v
retrieving revision 1.5
diff -r1.5 HTML_36.pm
256,259c256,263
< # superscripting for emma mart
< $$row[$$attribute_positions[$i]] =~ s/\<(.*)\>/<span style="vertical-align:super;font-size:0.8em">$1<\/span>/;
<
<
---
> # catch html formatting...
> if ( $$row[$$attribute_positions[$i]] =~ /\<.+\>.+\<\/.+\>/ ) {
> # Do nothing - leave the original formatting
> } else {
> # superscripting for emma mart
> $$row[$$attribute_positions[$i]] =~ s/\<(.*)\>/<span style="vertical-align:super;font-size:0.8em">$1<\/span>/;
> }
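For anyone who wants to try the patched logic outside BioMart, here is a rough standalone sketch. The helper name `superscript_patched` and the second test value are hypothetical, made up for illustration; the two regexes are the ones from the diffs above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioMart) mirroring the patched logic.
sub superscript_patched {
    my ($value) = @_;
    # catch html formatting: an opening tag, some text, then a closing tag
    if ( $value =~ /\<.+\>.+\<\/.+\>/ ) {
        # Do nothing - leave the original formatting
        return $value;
    }
    # superscripting for emma mart
    $value =~ s/\<(.*)\>/<span style="vertical-align:super;font-size:0.8em">$1<\/span>/;
    return $value;
}

# A value with real HTML tags is passed through untouched.
print superscript_patched('Art4<sup>tm1a(KOMP)Wtsi</sup>'), "\n";
# prints: Art4<sup>tm1a(KOMP)Wtsi</sup>

# A made-up value using bare angle brackets still gets the span.
print superscript_patched('Art4<tm1a(KOMP)Wtsi>'), "\n";
# prints: Art4<span style="vertical-align:super;font-size:0.8em">tm1a(KOMP)Wtsi</span>
```

Note that the guard only recognises a paired opening and closing tag. A field containing a lone tag with no matching `</...>` would still go through the substitution, so this is a narrow fix for the paired-tag case rather than general HTML handling.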
Hope this is useful!
Cheers,
Darren
P.S. Cheers for making a great tool too!