I'm working with some users whose Mascot2XML runs produce a malformed 
pepXML file when a protein description contains special XML characters, 
such as < and >. The output pepXML file includes this text:

...
<alternative_protein protein="tr|A0N5G5|A0N5G5_HUMAN" 
protein_descr="Rheumatoid factor D5 light chain (Fragment) OS=Homo sapiens 
GN=V<kappa>3 PE=2 SV=1" num_tol_term="2" peptide_prev_aa="R" 
peptide_next_aa="A"/>
<kappa>
<kappa>
<search_score name="ionscore" value="46.68"/>
...

Note the "<kappa>" that is written, unencoded, inside the value of 
the protein_descr attribute.

I'm attaching a patch that applies the same encoding approach that has been 
used for the primary protein identification since revision 4877 (2010).
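
For anyone unfamiliar with the encoding in question, here is a minimal sketch 
of escaping a string for use inside an XML attribute value. It is not the 
TPP's XMLEscape() implementation, which may well differ; escapeXmlAttr is a 
hypothetical name used only for illustration. It replaces the five characters 
XML reserves with their predefined entity references.

```cpp
#include <string>

// Hypothetical helper for illustration only; not the TPP's XMLEscape().
// Replaces the five XML-reserved characters with their predefined entities
// so the result is safe inside a double- or single-quoted attribute value.
std::string escapeXmlAttr(const std::string& in) {
    std::string out;
    out.reserve(in.size());  // output is at least as long as the input
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;
        }
    }
    return out;
}
```

With escaping along these lines, the description above would be written as 
GN=V&lt;kappa&gt;3 rather than GN=V<kappa>3, and a parser would no longer 
treat it as the start of an element.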

However, I'm also worried about the unclosed <kappa> tags immediately 
afterwards. I assume they come from the modtags code a little later in 
that same function, but after a little poking around I've been unable 
to find the implementation of writeTraditional(). Does anyone have ideas 
on what might be going wrong here?

Thanks,
Josh

Index: MascotConverter.cxx
===================================================================
--- MascotConverter.cxx	(revision 7405)
+++ MascotConverter.cxx	(working copy)
@@ -3391,7 +3391,7 @@
         if (generate_description_) {
           proteinMap::const_iterator it = proteinDescriptionMap_.find (rank1_proteins_[rank1Index]);
           if (it != proteinDescriptionMap_.end()) {
-            fprintf(fout, " protein_descr=\"%s\"", it->second.c_str());
+            fprintf(fout, " protein_descr=\"%s\"", XMLEscape(it->second).c_str());
           }
           else {
             fprintf(fout, " protein_descr=\"NON_EXISTENT PROTEIN DESCRIPTION\"");
