Hello Alastair,

When you ran PeptideProphet as follows:

PeptideProphetParser H1_EhN_cm2.pep.xml RT DECOY=XXX_ NONPARAM

here is what happened. You gave PeptideProphet the names of your decoys, but without the DECOYPROBS flag PeptideProphet treats every decoy hit as incorrect and assigns it a probability of 0. By default PeptideProphet also removes any result with a probability below 5% unless you set the flag MINPROB=0 (or the flag ZERO). So unless you apply DECOYPROBS you will not see decoys with non-zero probabilities. And if you use MINPROB=0 without DECOYPROBS, all your decoys will be reported with probability 0, which will not be helpful if you want to use these decoys later.

If you want to use these decoys later, you have to rerun the analysis with DECOYPROBS, and possibly with MINPROB=0 so you can count the number of decoy hits at low probabilities. That gives you the decoy rate among random matches, which you can compare to the decoy rate in your database (they should be similar).

By the way, in my testing I did not find it helpful to specify the same set of decoys for the iProphet classifier when you run iProphet after PeptideProphet and PeptideProphet has already used those decoys. Once the decoy information has been used at the PeptideProphet step, it doesn't seem to help the classification at the iProphet step. The decoys could still be useful as a second measure of FDR, however, when the PeptideProphet DECOYPROBS flag is enabled.

Hopefully this helps you understand the issue.

Cheers,
-David

On Fri, Mar 6, 2020 at 5:40 AM 'Alastair Skeffington' via spctools-discuss <[email protected]> wrote:

> Hello,
>
> I'm combining some Mascot and comet search results with TPP and have the
> problem that it doesn't seem to be identifying decoys correctly.
>
> I'm running TPP.5.1.0 on ubuntu 16.
>
> The comet and the mascot search are done with decoy search turned off, but
> I added decoys to the database with the prefix "XXX_"
>
> Then I continue to run TPP like this:
>
> InteractParser H1_EhN_cm2.pep.xml H1_ma_EhN.pep.xml H1_co_EhN2.pep.xml
> -D../../Databases/EhuxAllproteins_MCC_decoy.fasta -s -XH1 -Etrypsin
>
> DatabaseParser H1_EhN_cm2.pep.xml
>
> RefreshParser H1_EhN_cm2.pep.xml
> ../../Databases/EhuxAllproteins_MCC_decoy.fasta
>
> PeptideProphetParser H1_EhN_cm2.pep.xml RT DECOY=XXX_ NONPARAM
>
> ProphetModels.pl -i H1_EhN_cm2.pep.xml
>
> tpp_models.pl H1_EhN_cm2.pep.xml
>
> And get the following printed to stdout:
>
>
> ../../../../Databases/EhuxAllproteins_MCC_decoy.fasta,../../Databases/EhuxAllproteins_MCC_decoy.fasta
> - Building Commentz-Walter keyword tree...
> - Searching the tree...
> - Linking duplicate entries...
> - Printing results...
>
> using RT
> Using Decoy Label "XXX_".
> Using non-parametric distributions
> (MASCOT)
> error: -1.0 ion score
> Analyzing H1_EhN_cm2.pep.xml ...
> Parsing search results "/home/mpimp-golm.mpg.de/skeffington/winhome/proteomics2/CAP/CAPdata/mascot/Ehnew/H1 (MASCOT)"...
> => Found 23168 hits. (0 decoys, 0 excluded)
> => Total so far: 23168 hits. (0 decoys, 0 excluded)
> Parsing search results "/home/mpimp-golm.mpg.de/skeffington/winhome/proteomics2/CAP/CAPdata/H1 (Comet)"...
> => Found 21343 hits. (0 decoys, 0 excluded)
> => Total so far: 44511 hits. (0 decoys, 0 excluded)
> File: H1_EhN_cm2.pep.xml
> - in ms run: /home/mpimp-golm.mpg.de/skeffington/winhome/proteomics2/CAP/CAPdata/mascot/Ehnew/H1...
> - in ms run: /home/mpimp-golm.mpg.de/skeffington/winhome/proteomics2/CAP/CAPdata/H1...
>
> -------------------------------------------------------------------------------
> TPP DASHBOARD -- started at Fri Mar 6 14:18:51 2020
>
> -------------------------------------------------------------------------------
> File H1_EhN_cm2.pep.xml is pepxml
> --> Trying to write file H1_EhN_cm2.pep-MODELS.html
>
> -------------------------------------------------------------------------------
> Finished at Fri Mar 6 14:18:54 2020 with 0 errors.
>
> -------------------------------------------------------------------------------
>
>
> So I don't understand why it doesn't find the decoys when they are clearly
> there in the search results, when they are clearly there in the .pep.xml
> file:
>
> <spectrum_query spectrum="H1.27890.27890.4" start_scan="27890"
> end_scan="27890" precursor_neutral_mass="1899.7835" assumed_charge="4"
> index="13563" experiment_label="H1">
> <search_result>
> <search_hit hit_rank="1" peptide="SWHGEAASKTVDSLPR" peptide_prev_aa="R"
> peptide_next_aa="F" protein="XXX_44165" num_tot_proteins="1"
> num_matched_ions="6" tot_num_ions="60" calc_neutral_pep_mass="1899.7996"
> massdiff="-0.0160" num_tol_term="2" num_missed_cleavages="1"
> is_rejected="0">
> <modification_info modified_peptide="S[167]WHGEAASKTVDS[167]LPR">
> <mod_aminoacid_mass position="1" mass="166.998359"/>
> <mod_aminoacid_mass position="13" mass="166.998359"/>
> </modification_info>
> <search_score name="ionscore" value="15.77"/>
> <search_score name="identityscore" value="33.71"/>
> <search_score name="star" value="0"/>
> <search_score name="homologyscore" value="27.00"/>
> <search_score name="expect" value="3.1107"/>
> </search_hit>
> </search_result>
> </spectrum_query>
>
> Any ideas would be much appreciated!
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/spctools-discuss/90ecb5d5-8e5d-48ea-b19a-ed8fdbc84f59%40googlegroups.com.
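
The low-probability decoy check suggested in the reply at the top of this thread (count decoy hits among low-probability matches, then compare that rate with the decoy rate in the database) could be sketched roughly as below. This is only an illustration, not part of TPP: the function names, the 0.05 cutoff, and the choice to look only at top-ranked hits and their primary `protein` attribute are assumptions made for the example. It also assumes the pep.xml was produced with DECOYPROBS and MINPROB=0, so that decoy hits carry PeptideProphet probabilities and low-probability results are still present.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace prefix such as '{http://...}search_hit'."""
    return tag.rsplit("}", 1)[-1]


def count_low_prob_decoys(pepxml, decoy_prefix="XXX_", max_prob=0.05):
    """Return (decoys, total) for top-ranked hits with probability <= max_prob.

    `pepxml` is a file path or a binary file object. Only the primary
    `protein` attribute of each hit is checked (an assumption of this sketch);
    alternative proteins are ignored.
    """
    decoys = total = 0
    for _, elem in ET.iterparse(pepxml, events=("end",)):
        if _local(elem.tag) != "spectrum_query":
            continue
        for hit in elem.iter():
            if _local(hit.tag) != "search_hit" or hit.get("hit_rank") != "1":
                continue
            # PeptideProphet stores its probability on <peptideprophet_result>.
            probs = [n.get("probability") for n in hit.iter()
                     if _local(n.tag) == "peptideprophet_result"]
            if not probs or float(probs[0]) > max_prob:
                continue
            total += 1
            if hit.get("protein", "").startswith(decoy_prefix):
                decoys += 1
        elem.clear()  # release the finished spectrum_query to limit memory use
    return decoys, total


def fasta_decoy_fraction(lines, decoy_prefix="XXX_"):
    """Fraction of FASTA entries whose header starts with the decoy prefix."""
    headers = [ln for ln in lines if ln.startswith(">")]
    if not headers:
        return 0.0
    n_decoy = sum(h[1:].startswith(decoy_prefix) for h in headers)
    return n_decoy / len(headers)
```

For example, `count_low_prob_decoys("H1_EhN_cm2.pep.xml")` would return the decoy and total counts below the cutoff, and `fasta_decoy_fraction(open("EhuxAllproteins_MCC_decoy.fasta"))` the fraction of decoy entries in the database; per the reply above, the two rates should come out similar.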
