Jimmy,

As you suggested, I replaced the value of the "protein" attribute with the first word of the "protein_descr" attribute in each search_hit entry.

For a non-decoy hit, it now looks like this:

  <search_hit hit_rank="1" peptide="YPSRPLPPPPPFGLGFVPPPPPPYGPGR" peptide_prev_aa="R" peptide_next_aa="I" protein="gi|73975149|ref|XP_862242.1|" num_tot_proteins="3" num_matched_ions="15" tot_num_ions="54" calc_neutral_pep_mass="2949.569" massdiff="0.59985990298992" is_rejected="0" protein_descr="gi|73975149|ref|XP_862242.1| PREDICTED: hypothetical protein XP_857149 isoform 2 [Canis familiaris]" num_tol_term="2" num_missed_cleavages="0">
    <alternative_protein protein="gi|73975151|ref|XP_862273.1|" protein_descr="gi|73975151|ref|XP_862273.1| PREDICTED: hypothetical protein XP_857180 isoform 3 [Canis familiaris]"/>
    <alternative_protein protein="gi|73975153|ref|XP_862298.1|" protein_descr="gi|73975153|ref|XP_862298.1| PREDICTED: hypothetical protein XP_857205 isoform 4 [Canis familiaris]"/>
    <search_score name="pvalue" value="0.000002382663488"/>
    <search_score name="expect" value="0.028320338214958"/>
  </search_hit>

For a decoy hit, it looks like this:

  <search_hit hit_rank="2" peptide="QESARYSAKVTVAGLEESATEAQQQIR" peptide_prev_aa="K" peptide_next_aa="S" protein="decoy_20982" num_tot_proteins="1" num_matched_ions="17" tot_num_ions="52" calc_neutral_pep_mass="2949.481" massdiff="0.972859907170005" is_rejected="0" protein_descr="decoy_20982">
    <search_score name="pvalue" value="0.000014221885248"/>
    <search_score name="expect" value="0.168173793062284"/>
  </search_hit>

Is there anything else I need to fix?

Thanks!

Ping

On Jul 2, 4:44 pm, Jimmy Eng <[email protected]> wrote:
> Ping,
>
> I just downloaded OMSSA 2.1.4 and tried the direct pep.xml export
> myself. I do see a problem with the pep.xml file that the "-op"
> option generates, and it is causing the problem you're seeing.
>
> The key error message in your output is this:
> WARNING: No decoys with label DECOY were found in this dataset.
>
> Looking at the generated pep.xml files, OMSSA seems to be placing some
> number in the protein="" attribute of each search_hit element, whereas
> PeptideProphet expects this protein attribute to contain a protein
> identifier that includes the DECOY string for decoy matches. In
> the converters we use, the value of the protein attribute is the first
> word of the protein definition line.
>
> As for a fix, we need someone at NCBI to address this, and hopefully
> someone here will contact them about it. For you in the short term,
> you're going to need a developer to modify your pep.xml files to replace
> the value in the "protein" attribute with the first word from the
> "protein_descr" attribute of each search_hit entry.
>
> - Jimmy
>
> Ping wrote:
> > Hi,
> >
> > I am trying to run xinteract on the OMSSA pep.xml output files. My
> > OMSSA version is 2.1.4 and my TPP version is 4.2.1, but I couldn't get
> > it to run through. I searched the old posts and found a similar one, but
> > that problem was solved by specifying the enzyme to xinteract.
> >
> > I tried that, but it is still not working. InteractParser went through, but
> > PeptideProphetParser stopped with a segmentation fault.
> >
> > Any help would be greatly appreciated!
> >
> > Many Thanks,
> >
> > Ping
> >
> > ***** output for interactParser and PeptideProphetParser
> >
> > InteractParser 'interact.pep.xml' 'omssa.pep.xml' -L'7' -E'trypsin' -C
> > -P
> > file 1: ParoSaliv_SHAM_03.pep.xml
> > processed altogether 2623 results
> >
> > PeptideProphetParser 'interact.pep.xml' DECOY=DECOY MINPROB=0
> > NONPARAM
> > Using Decoy Label "DECOY".
> > Using non-parametric distributions
> > (OMSSA) (minprob 0)
> > WARNING!! The discriminant function for OMSSA is not yet complete. It
> > is presented here to help facilitate trial and discussion. Reliance
> > on this code for publishable scientific results is not recommended.
> >
> > init with OMSSA Trypsin
> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
> >
> > PeptideProphet (TPP v4.2 JETSTREAM rev 1, Build 200905131510
> > (linux)) akel...@isb
> > read in 75 1+, 1790 2+, 749 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> > Initialising statistical models ...
> > WARNING: No decoys with label DECOY were found in this dataset.
> > reverting to fully unsupervised method.
> > Iterations: .........10.........20
> > Segmentation fault
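
For anyone hitting the same problem, here is a minimal Python sketch of the short-term fix discussed in this thread: copy the first whitespace-delimited word of "protein_descr" into the "protein" attribute. It is not an official TPP or OMSSA tool. The function names (fix_tag, fix_pepxml_text, fix_pepxml_file) are made up for illustration. It works on the raw text with regular expressions instead of an XML parser so the rest of the file is left byte-for-byte alone, and it assumes that attribute values are double-quoted and contain no literal ">" character. Rewriting alternative_protein elements as well as search_hit is also an assumption, based on the sample output earlier in this thread.

```python
import re

# Opening tags of the two element types that carry protein/protein_descr.
# Assumption: no attribute value contains a literal ">".
TAG_RE = re.compile(r'<(?:search_hit|alternative_protein)\b[^>]*>')
DESCR_RE = re.compile(r'\bprotein_descr="([^"]*)"')
# Matches only protein="...", not protein_descr="..." or num_tot_proteins="...".
PROTEIN_RE = re.compile(r'\bprotein="[^"]*"')


def fix_tag(match):
    """Return the tag with protein set to the first word of protein_descr."""
    tag = match.group(0)
    descr = DESCR_RE.search(tag)
    if descr is None:
        return tag  # no description to copy from; leave the tag untouched
    words = descr.group(1).split()
    if not words:
        return tag  # empty description; leave the tag untouched
    new_attr = 'protein="' + words[0] + '"'
    # A function replacement avoids backslash/group escaping surprises.
    return PROTEIN_RE.sub(lambda m: new_attr, tag, count=1)


def fix_pepxml_text(text):
    """Apply fix_tag to every matching opening tag in a pep.xml document."""
    return TAG_RE.sub(fix_tag, text)


def fix_pepxml_file(in_path, out_path):
    """Read in_path, rewrite the protein attributes, and write out_path."""
    with open(in_path, encoding="utf-8") as handle:
        text = handle.read()
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(fix_pepxml_text(text))
```

A call such as fix_pepxml_file("omssa.pep.xml", "omssa.fixed.pep.xml") would write a corrected copy next to the original (the output name is hypothetical). Note also that the decoy entries in the sample above start with lowercase "decoy_", while the quoted run used DECOY=DECOY. If the label match is case-sensitive, which I have not verified, the label passed to xinteract would need to match the prefix actually present in the file.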
