Dear David,
At APCF, we use multiple search engines (currently Mascot, X!tandem and
OMSSA, with Crux to follow in the near future) to process the same data. The
standard search results are passed to the TPP for postprocessing.
After PeptideProphetParser has run on each engine's results, we merge them
into one output file using your iProphet. The iProphet output is then
processed with ProteinProphet. We also plan to quantify the peptides
produced by iProphet with your tools ASAP, XPRESS and Libra before running
ProteinProphet.
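To make the order of steps concrete, here is a minimal sketch that only builds
the command lines (it does not run anything). The InteractParser,
PeptideProphetParser and ProteinProphet arguments are copied from the logs
quoted further down. The InterProphetParser name and its argument order, the
plain PeptideProphetParser call for the non-decoy engines, and all file names
are assumptions for illustration only.

```python
from typing import Dict, List

# Install directory as it appears in the logs quoted in this thread.
TPP_BIN = "/mnt/sanfs/APCF/APCF_WEB/tpp/bin"


def per_engine_commands(base: str, decoy_label: str = "") -> List[List[str]]:
    """Commands for one search engine's pepXML, in the order we run them."""
    interact = [
        f"{TPP_BIN}/InteractParser",
        f"{base}.pepprophet.xml",
        f"{base}.pep.xml",
        "-L7", "-Etrypsin", "-C", "-P",
    ]
    prophet = [f"{TPP_BIN}/PeptideProphetParser", f"{base}.pepprophet.xml"]
    if decoy_label:
        # Decoy-based, non-parametric modelling, as in our OMSSA runs.
        prophet += [f"DECOY={decoy_label}", "MINPROB=0", "NONPARAM"]
    # ASSUMPTION: non-decoy engines are run with no extra options.
    return [interact, prophet]


def pipeline(engines: Dict[str, str], merged: str) -> List[List[str]]:
    """engines maps a pepXML base name to its decoy label ('' = no decoys)."""
    cmds: List[List[str]] = []
    for base, label in engines.items():
        cmds += per_engine_commands(base, label)
    inputs = [f"{base}.pepprophet.xml" for base in engines]
    merged_pep = f"{merged}.iproph.pep.xml"  # hypothetical output name
    # ASSUMPTION: iProphet executable name and argument order (inputs, output).
    cmds.append([f"{TPP_BIN}/InterProphetParser", *inputs, merged_pep])
    # Quantitation (ASAP, XPRESS, Libra) would run here; it is left out
    # because its options do not appear anywhere in this thread.
    cmds.append([f"{TPP_BIN}/ProteinProphet", merged_pep, f"{merged}.prot.xml"])
    return cmds


if __name__ == "__main__":
    # Hypothetical base names: one non-decoy Mascot run, one decoy OMSSA run.
    for cmd in pipeline({"mascot_run": "", "omssa_run": "decoy"}, "merged"):
        print(" ".join(cmd))
```

The mixed case I am asking about is visible here: only the OMSSA entry carries
a decoy label, while the Mascot entry does not.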
I would like to know the overall impact on the results when we combine
non-decoy PeptideProphet results for Mascot and X!tandem with decoy-based
PeptideProphet results for OMSSA, both in iProphet and in the other TPP
postprocessing tools (e.g. quantitation and ProteinProphet). We are
keen to learn more about this.
If you think it is advisable to use a consistent decoy database across
all the algorithms (Mascot, X!tandem, OMSSA and Crux) whenever a user wants to
include OMSSA search results in the TPP postprocessing, could you please
advise us?
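If a shared decoy database is the way to go, a simple pre-flight check on the
FASTA file could catch the "Found 0 Decoys" situation from my earlier run
before PeptideProphetParser is started. The following is only a sketch: it
assumes that a decoy entry is one whose protein name starts with the decoy
label, and the function name and sample headers are made up for illustration.

```python
from typing import Iterable, Tuple


def count_decoys(fasta_lines: Iterable[str],
                 decoy_label: str = "decoy") -> Tuple[int, int]:
    """Return (decoy_count, target_count) over the FASTA header lines."""
    decoys = targets = 0
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        # The protein name is the first whitespace-delimited token after '>'.
        fields = line[1:].split(None, 1)
        name = fields[0] if fields else ""
        # ASSUMPTION: decoys are recognised by a name prefix equal to the label.
        if name.startswith(decoy_label):
            decoys += 1
        else:
            targets += 1
    return decoys, targets


if __name__ == "__main__":
    # Made-up headers, only to show the call.
    sample = [
        ">sp|P00001|EXAMPLE_HUMAN example protein",
        "MKVLAAGIT",
        ">decoy_sp|P00001|EXAMPLE_HUMAN example protein",
        "TIGAALVKM",
    ]
    decoys, targets = count_decoys(sample)
    if decoys == 0:
        print("No decoys found; DECOY=decoy would have nothing to learn from.")
    else:
        print(f"{decoys} decoy and {targets} target entries")
```

The same check could be run once per search engine's database to confirm that
all of them carry the same decoy label.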
with regards,
Jagan Kommineni
On Sat, Jul 17, 2010 at 1:01 AM, David Shteynberg <
[email protected]> wrote:
> Dear Jagan,
>
> The modelling for OMSSA (also Inspect and Myrimatch) is done with
> semi-parametric modeling which *requires* decoys to learn the shapes
> of the mixture model distributions. Without decoys in the database
> these search engines cannot be processed through the TPP. Why are you
> reluctant to include decoys in the model? We sometimes use two
> independent sets of decoys in the database, where one set is used for
> the semi-parametric modelling and the other to independently evaluate
> the model against another decoy set. Also, a match that is
> significant is not necessarily correct; decoy matches with significant
> scores are common.
>
> -David
>
> On Fri, Jul 16, 2010 at 12:03 AM, Jagan Kommineni
> <[email protected]> wrote:
> > Dear David,
> >
> > After increasing the e-value to 1e6, I ran the OMSSA search against the
> > standard FASTA database and against a decoy database (generated using
> > TPP's decoyFASTA), with identical input data and parameters.
> >
> > In the first case (non-decoy), 101 of 18,872 peptide matches were
> > significant; in the latter case (decoy), 89 of 18,012 peptide
> > matches were significant.
> >
> > I used the same input file, which contains 3,444 spectra, in both
> > experiments.
> >
> > When I ran PeptideProphetParser on the non-decoy results I got a
> > segmentation fault, even though the standard OMSSA search returned a
> > similar set of results (a similar set of false positives) in both cases.
> > Here is the STDOUT of the PeptideProphetParser run for the NON-DECOY
> > (standard FASTA database) case.
> >
> >
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > [r...@compute-3-0 2010-07-16]#
> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser
> 'jagan-J1229.pepprophet.xml'
> > 'jagan-J1229.pep.xml' -L'7' -E'trypsin' -C -P
> > file 1: jagan-J1229.pep.xml
> > processed altogether 3635 results
> >
> > results written to file
> > /mnt/sanfs/APCF/results/tpp/2010-07-16/jagan-J1229.pepprophet.shtml
> >
> > [r...@compute-3-0 2010-07-16]#
> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
> > 'jagan-J1229.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
> > Using Decoy Label "decoy".
> > Using non-parametric distributions
> > (OMSSA) (minprob 0)
> > WARNING!! The discriminant function for OMSSA is not yet complete. It is
> > presented here to help facilitate trial and discussion. Reliance on this
> > code for publishable scientific results is not recommended.
> > init with OMSSA trypsin
> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
> >
> > PeptideProphet (TPP v4.4 JETSTREAM (unstable development prerelease)
> rev
> > 0, Build 201007011135 (linux)) akel...@isb
> > read in 272 1+, 1490 2+, 1766 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> > Initialising statistical models ...
> > Found 0 Decoys, and 3528 Non-Decoys
> > WARNING: No decoys with label decoy were found in this dataset. reverting
> to
> > fully unsupervised method.
> > Iterations: .........10.........20
> > Segmentation fault
> > [r...@compute-3-0 2010-07-16]#
> >
> >
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > As mentioned, in the latter case, where I used the decoy database for the
> > OMSSA search, PeptideProphetParser issues only warning messages, and in
> > the end I am able to view the pepXML files without any trouble. However,
> > when I run the TPP pipeline on a similar input file after a standard
> > Mascot search, I see 0 hits for charges 4, 5, 6 and 7 but no warning
> > messages.
> >
> >
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > [r...@compute-3-0 2010-07-16]#
> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser
> 'jagan-J237.pepprophet.xml'
> > 'jagan-J237.pep.xml' -L'7' -E'trypsin' -C -P
> > file 1: jagan-J237.pep.xml
> > processed altogether 3417 results
> >
> > results written to file
> > /mnt/sanfs/APCF/results/tpp/2010-07-16/jagan-J237.pepprophet.shtml
> >
> > [r...@compute-3-0 2010-07-16]#
> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
> > 'jagan-J237.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
> > Using Decoy Label "decoy".
> > Using non-parametric distributions
> > (OMSSA) (minprob 0)
> > WARNING!! The discriminant function for OMSSA is not yet complete. It is
> > presented here to help facilitate trial and discussion. Reliance on this
> > code for publishable scientific results is not recommended.
> > init with OMSSA trypsin
> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
> >
> > PeptideProphet (TPP v4.4 JETSTREAM (unstable development prerelease)
> rev
> > 0, Build 201007011135 (linux)) akel...@isb
> > read in 213 1+, 1376 2+, 1766 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> > Initialising statistical models ...
> > Found 997 Decoys, and 2358 Non-Decoys
> > Iterations: .........10.........20.....
> > WARNING: Mixture model quality test failed for charge (1+).
> > WARNING: Mixture model quality test failed for charge (4+).
> > WARNING: Mixture model quality test failed for charge (5+).
> > WARNING: Mixture model quality test failed for charge (6+).
> > WARNING: Mixture model quality test failed for charge (7+).
> > model complete after 26 iterations
> > [r...@compute-3-0 2010-07-16]#
> >
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > I wonder whether there is any way to run the TPP on OMSSA results produced
> > with the standard FASTA database rather than a decoy database.
> >
> > I have put the OMSSA search result files on the APCF wiki; here is the
> > link ...
> >
> >
> >
> https://search.apcf.edu.au/wiki/index.php/Apcfwiki:Community_Portal#APCF__OMSSA_files
> >
> > OMSSA files: jagan-J1229.omx (non-decoy output), jagan-J237.omx (decoy
> > output) and O070512-01.mgf (input file).
> >
> > with regards,
> >
> >
> > Jagan Kommineni
> >
> >
> >
> > On Fri, Jul 9, 2010 at 7:18 AM, David Shteynberg
> > <[email protected]> wrote:
> >>
> >> Hi Jagan,
> >>
> >> It appears that the QC filters were triggered on the PeptideProphet
> >> MixtureModel. This is likely due to too few data points in the
> >> analysis for good stats: read in 0 1+, 82 2+, 44 3+, 0 4+, 0 5+, 0
> >> 6+, and 0 7+ spectra.
> >>
> >>
> >> With OMSSA this could be due to too low an e-value setting, which filters
> >> out many results that the model could use to better model the
> >> negative and positive distributions. Set your OMSSA e-value to a high
> >> value like 1e6 and this problem will likely go away, unless you don't
> >> have very many correct results due to wrong parameters, bad data or
> >> something else.
> >>
> >> Hope this helps.
> >>
> >> -David
> >>
> >>
> >> On Fri, Jul 2, 2010 at 1:02 AM, Jagan Kommineni
> >> <[email protected]> wrote:
> >> > Dear All,
> >> >
> >> > I have created a decoy database for the SwissProt database using
> >> > decoyFASTA from the TPP distribution and ran the following TPP commands
> >> > after the OMSSA search against the decoy database. Here is the output
> >> > on STDOUT ...
> >> >
> >> > ---------------------------------------
> >> > [r...@compute-3-0 run-on-compute]#
> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser
> >> > 'jagan-J128.pepprophet.xml'
> >> > 'jagan-J128.pep.xml'
> >> >
> >> >
> -D'/home/APCF/databases/SwissProt/uniprot_sprot_Jan2009/decoy/decoy_uniprot_sprot.fasta'
> >> > -L'7' -E'trypsin' -C -P
> >> > file 1: jagan-J128.pep.xml
> >> > processed altogether 126 results
> >> >
> >> >
> >> > results written to file
> >> >
> >> >
> /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.pepprophet.shtml
> >> >
> >> >
> >> >
> >> > [r...@compute-3-0 run-on-compute]#
> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
> >> > 'jagan-J128.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
> >> > Using Decoy Label "decoy".
> >> > Using non-parametric distributions
> >> > (OMSSA) (minprob 0)
> >> > WARNING!! The discriminant function for OMSSA is not yet complete. It
> >> > is
> >> > presented here to help facilitate trial and discussion. Reliance on
> >> > this
> >> > code for publishable scientific results is not recommended.
> >> > init with OMSSA Trypsin
> >> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
> >> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
> >> >
> >> > PeptideProphet (TPP v4.3 JETSTREAM rev 1, Build 201003241044
> (linux))
> >> > akel...@isb
> >> > read in 0 1+, 82 2+, 44 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> >> > Initialising statistical models ...
> >> > Iterations: .........10.........20.....
> >> > WARNING: Mixture model quality test failed for charge (1+).
> >> > WARNING: Mixture model quality test failed for charge (2+).
> >> > WARNING: Mixture model quality test failed for charge (4+).
> >> > WARNING: Mixture model quality test failed for charge (5+).
> >> > WARNING: Mixture model quality test failed for charge (6+).
> >> > WARNING: Mixture model quality test failed for charge (7+).
> >> > model complete after 26 iterations
> >> > [r...@compute-3-0 run-on-compute]#
> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/RefreshParser
> >> > 'jagan-J128.pepprophet.xml'
> >> >
> >> >
> '/home/APCF/databases/SwissProt/uniprot_sprot_Jan2009/decoy/decoy_uniprot_sprot.fasta'
> >> > - Building Commentz-Walter keyword tree... - Searching the tree...
> >> > - Linking duplicate entries... - Printing results...
> >> >
> >> > [r...@compute-3-0 run-on-compute]#
> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/ProteinProphet
> >> > 'jagan-J128.pepprophet.xml'
> >> > 'jagan-J128.prot.xml'
> >> > ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the
> >> > original Perl by A. Keller (TPP v4.3 JETSTREAM rev 1, Build
> 201003241044
> >> > (linux))
> >> > (xml input) (report Protein Length) (using degen pep info)
> >> > . . . reading in
> >> >
> >> >
> /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.pepprophet.xml.
> >> > . .
> >> > . . . read in 0 1+, 0 2+, 33 3+, 0 4+, 0 5+, 0 6+, 0 7+ spectra with
> min
> >> > prob 0.05
> >> > Could not find/open font when opening font "arial", using internal
> >> > non-scalable font
> >> > INFO: mu=6.3014e-09, db_size=584667857
> >> >
> >> > protein probabilities written to file
> >> >
> >> >
> /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.prot.xml
> >> > direct your browser to
> >> >
> >> >
> http://nfs//mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.prot.shtml
> >> >
> >> > [r...@compute-3-0 run-on-compute]#
> >> >
> >> > -------------------------------------
> >> >
> >> > I noticed some warning messages indicating that some tests failed.
> >> > How critical are these messages?
> >> >
> >> >
> >> > with regards,
> >> >
> >> >
> >> > Dr. Jagan Kommineni
> >> > Ludwig Institute for Cancer research
> >> > Pakville VIC 3145
> >> > Australia.
> >> >
> >> > --
> >> > You received this message because you are subscribed to the Google
> >> > Groups
> >> > "spctools-discuss" group.
> >> > To post to this group, send email to
> [email protected].
> >> > To unsubscribe from this group, send email to
> >> > [email protected]<spctools-discuss%[email protected]>
> .
> >> > For more options, visit this group at
> >> > http://groups.google.com/group/spctools-discuss?hl=en.
> >> >
> >>
> >>
> >
> >
> >
> > --
> > Dr. Jagan Kommineni
> > Ludwig Institute for Cancer research
> > Pakville VIC 3145
> > Australia.
> >
> >
>
>
>
--
Dr. Jagan Kommineni
Ludwig Institute for Cancer research
Pakville VIC 3145
Australia.