Dear Jagan,

The modelling for OMSSA (and also for InsPecT and MyriMatch) uses
semi-parametric mixture modelling, which *requires* decoys to learn the
shapes of the mixture model distributions.  Without decoys in the
database, results from these search engines cannot be processed through
the TPP.  Why are you reluctant to include decoys in the model?  We
sometimes use two independent sets of decoys in the database: one set
is used for the semi-parametric modelling and the other is used to
evaluate the model independently.  Also, a match that is significant is
not necessarily correct; decoy matches with significant scores are
common.
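
To illustrate that last point, here is a minimal Python sketch of how
one might count decoy matches among the "significant" hits and turn
that into a rough error estimate.  It is not TPP code: the Hit record,
the decoy_summary function and the example accessions are all
hypothetical, and it assumes decoy proteins can be recognised by a
name prefix and that the target and decoy halves of the database are
about the same size.

```python
from typing import Iterable, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical record for the top-ranked match of one spectrum."""
    protein: str   # protein name as reported by the search engine
    evalue: float  # search-engine e-value (smaller is better)


def decoy_summary(hits: Iterable[Hit], evalue_cutoff: float,
                  decoy_label: str = "decoy") -> Tuple[int, int, float]:
    """Return (targets, decoys, estimated FDR) among significant hits.

    Assumption: a hit is a decoy if its protein name starts with
    decoy_label, and the estimate decoys/targets is only meaningful
    when target and decoy databases are of similar size.
    """
    significant = [h for h in hits if h.evalue <= evalue_cutoff]
    decoys = sum(1 for h in significant if h.protein.startswith(decoy_label))
    targets = len(significant) - decoys
    fdr = decoys / targets if targets else float("nan")
    return targets, decoys, fdr


# Made-up example data, for illustration only.
example_hits = [
    Hit("sp|P00001", 0.001),
    Hit("decoy_sp|P00002", 0.01),   # a decoy with a "significant" score
    Hit("sp|P00003", 0.02),
    Hit("sp|P00004", 5.0),          # not significant at the cutoff below
]
print(decoy_summary(example_hits, evalue_cutoff=0.05))
```

With two independent decoy sets, the same count could be made using
only the label of the set that was held out from the modelling.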

-David

On Fri, Jul 16, 2010 at 12:03 AM, Jagan Kommineni
<[email protected]> wrote:
> Dear David,
>
> After increasing the e-value to 1e6, I ran the OMSSA search against both
> the standard fasta database and the decoy database (generated using TPP's
> decoyFASTA), with identical input data and parameters.
>
> In the first case (non-decoy), 101 of 18,872 peptide matches were
> significant; in the second case (decoy), 89 of 18,012 peptide matches
> were significant.
>
> I used the same input file, containing 3,444 spectra, in both experiments.
>
> When I ran PeptideProphetParser on the non-decoy results I got a
> segmentation fault, even though the standard OMSSA search returned a
> similar set of results (a similar set of false positives) in both cases.
> Here is the STDOUT of the PeptideProphetParser run for the NON-DECOY
> (standard fasta) database.
>
> ----------------------------------------------------------------------
>
> [r...@compute-3-0 2010-07-16]#
> /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser 'jagan-J1229.pepprophet.xml'
> 'jagan-J1229.pep.xml' -L'7'  -E'trypsin' -C -P
>  file 1: jagan-J1229.pep.xml
>  processed altogether 3635 results
>
> results written to file
> /mnt/sanfs/APCF/results/tpp/2010-07-16/jagan-J1229.pepprophet.shtml
>
> [r...@compute-3-0 2010-07-16]#
> /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
> 'jagan-J1229.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
> Using Decoy Label "decoy".
> Using non-parametric distributions
>  (OMSSA) (minprob 0)
> WARNING!! The discriminant function for OMSSA is not yet complete.  It is
> presented here to help facilitate trial and discussion.  Reliance on this
> code for publishable scientific results is not recommended.
> init with OMSSA trypsin
> MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
> UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>
>  PeptideProphet  (TPP v4.4 JETSTREAM (unstable development prerelease) rev
> 0, Build 201007011135 (linux)) akel...@isb
>  read in 272 1+, 1490 2+, 1766 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> Initialising statistical models ...
> Found 0 Decoys, and 3528 Non-Decoys
> WARNING: No decoys with label decoy were found in this dataset. reverting to
> fully unsupervised method.
> Iterations: .........10.........20
> Segmentation fault
> [r...@compute-3-0 2010-07-16]#
>
> ----------------------------------------------------------------------
>
> In the latter case, where I used the decoy database for the OMSSA search,
> PeptideProphetParser issues only warning messages, and I am able to view
> the pepXML files without any hassle. However, when I run the TPP pipeline
> on a similar input file after a standard Mascot search, I see 0 hits for
> charges 4, 5, 6 and 7 but no warning messages.
>
> ----------------------------------------------------------------------
>
> [r...@compute-3-0 2010-07-16]#
> /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser 'jagan-J237.pepprophet.xml'
> 'jagan-J237.pep.xml' -L'7'  -E'trypsin' -C -P
>  file 1: jagan-J237.pep.xml
>  processed altogether 3417 results
>
>  results written to file
> /mnt/sanfs/APCF/results/tpp/2010-07-16/jagan-J237.pepprophet.shtml
>
> [r...@compute-3-0 2010-07-16]#
> /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
> 'jagan-J237.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
> Using Decoy Label "decoy".
> Using non-parametric distributions
>  (OMSSA) (minprob 0)
> WARNING!! The discriminant function for OMSSA is not yet complete.  It is
> presented here to help facilitate trial and discussion.  Reliance on this
> code for publishable scientific results is not recommended.
> init with OMSSA trypsin
> MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
> UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>
>  PeptideProphet  (TPP v4.4 JETSTREAM (unstable development prerelease) rev
> 0, Build 201007011135 (linux)) akel...@isb
>  read in 213 1+, 1376 2+, 1766 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
> Initialising statistical models ...
> Found 997 Decoys, and 2358 Non-Decoys
> Iterations: .........10.........20.....
> WARNING: Mixture model quality test failed for charge (1+).
> WARNING: Mixture model quality test failed for charge (4+).
> WARNING: Mixture model quality test failed for charge (5+).
> WARNING: Mixture model quality test failed for charge (6+).
> WARNING: Mixture model quality test failed for charge (7+).
> model complete after 26 iterations
> [r...@compute-3-0 2010-07-16]#
> ----------------------------------------------------------------------
>
> I wonder whether there is any way to run the TPP on OMSSA results produced
> with the standard fasta database rather than the decoy database.
>
> I have put the OMSSA search result files on the APCF wiki; here is the link:
>
>
> https://search.apcf.edu.au/wiki/index.php/Apcfwiki:Community_Portal#APCF__OMSSA_files
>
> The OMSSA files are jagan-J1229.omx (non-decoy output), jagan-J237.omx
> (decoy output) and O070512-01.mgf (input file).
>
> with regards,
>
>
> Jagan Kommineni
>
>
>
> On Fri, Jul 9, 2010 at 7:18 AM, David Shteynberg
> <[email protected]> wrote:
>>
>> Hi Jagan,
>>
>> It appears that the QC filters were triggered on the PeptideProphet
>> MixtureModel.  This is likely due to too few data points in the
>> analysis for good stats:  read in 0 1+, 82 2+, 44 3+, 0 4+, 0 5+, 0
>> 6+, and 0 7+ spectra.
>>
>>
>> With OMSSA this could be due to too low an e-value setting, which
>> filters out many results that the model could use to better model the
>> negative and positive distributions.  Set your OMSSA e-value to a high
>> value like 1e6 and this problem will likely go away, unless you don't
>> have very many correct results due to wrong parameters, bad data, or
>> something else.
>>
>> Hope this helps.
>>
>> -David
>>
>>
>> On Fri, Jul 2, 2010 at 1:02 AM, Jagan Kommineni
>> <[email protected]> wrote:
>> > Dear All,
>> >
>> > I have created a decoy database for the SwissProt database using
>> > decoyFASTA from the TPP distribution and ran the following TPP commands
>> > after the OMSSA search with the decoy database. Here is the output on
>> > STDOUT ...
>> >
>> > ---------------------------------------
>> > [r...@compute-3-0 run-on-compute]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser
>> > 'jagan-J128.pepprophet.xml'
>> > 'jagan-J128.pep.xml'
>> >
>> > -D'/home/APCF/databases/SwissProt/uniprot_sprot_Jan2009/decoy/decoy_uniprot_sprot.fasta'
>> > -L'7'  -E'trypsin' -C -P
>> >  file 1: jagan-J128.pep.xml
>> >  processed altogether 126 results
>> >
>> >
>> >  results written to file
>> >
>> > /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.pepprophet.shtml
>> >
>> >
>> >
>> > [r...@compute-3-0 run-on-compute]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
>> > 'jagan-J128.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
>> > Using Decoy Label "decoy".
>> > Using non-parametric distributions
>> >  (OMSSA) (minprob 0)
>> > WARNING!! The discriminant function for OMSSA is not yet complete.  It
>> > is
>> > presented here to help facilitate trial and discussion.  Reliance on
>> > this
>> > code for publishable scientific results is not recommended.
>> > init with OMSSA Trypsin
>> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
>> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>> >
>> >  PeptideProphet  (TPP v4.3 JETSTREAM rev 1, Build 201003241044 (linux))
>> > akel...@isb
>> >  read in 0 1+, 82 2+, 44 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> > Initialising statistical models ...
>> > Iterations: .........10.........20.....
>> > WARNING: Mixture model quality test failed for charge (1+).
>> > WARNING: Mixture model quality test failed for charge (2+).
>> > WARNING: Mixture model quality test failed for charge (4+).
>> > WARNING: Mixture model quality test failed for charge (5+).
>> > WARNING: Mixture model quality test failed for charge (6+).
>> > WARNING: Mixture model quality test failed for charge (7+).
>> > model complete after 26 iterations
>> > [r...@compute-3-0 run-on-compute]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/RefreshParser
>> > 'jagan-J128.pepprophet.xml'
>> >
>> > '/home/APCF/databases/SwissProt/uniprot_sprot_Jan2009/decoy/decoy_uniprot_sprot.fasta'
>> >   - Building Commentz-Walter keyword tree...  - Searching the tree...
>> >   - Linking duplicate entries...  - Printing results...
>> >
>> > [r...@compute-3-0 run-on-compute]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/ProteinProphet
>> > 'jagan-J128.pepprophet.xml'
>> > 'jagan-J128.prot.xml'
>> > ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the
>> > original Perl by A. Keller (TPP v4.3 JETSTREAM rev 1, Build 201003241044
>> > (linux))
>> >  (xml input) (report Protein Length) (using degen pep info)
>> > . . . reading in
>> >
>> > /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.pepprophet.xml.
>> > . .
>> > . . . read in 0 1+, 0 2+, 33 3+, 0 4+, 0 5+, 0 6+, 0 7+ spectra with min
>> > prob 0.05
>> > Could not find/open font when opening font "arial", using internal
>> > non-scalable font
>> > INFO: mu=6.3014e-09, db_size=584667857
>> >
>> >  protein probabilities written to file
>> >
>> > /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.prot.xml
>> >  direct your browser to
>> >
>> > http://nfs//mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.prot.shtml
>> >
>> > [r...@compute-3-0 run-on-compute]#
>> >
>> > -------------------------------------
>> >
>> > I noticed some warning messages indicating that some tests failed.
>> > How critical are these messages?
>> >
>> >
>> > with regards,
>> >
>> >
>> > Dr. Jagan Kommineni
>> > Ludwig Institute for Cancer Research
>> > Parkville VIC 3145
>> > Australia.
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> > Groups
>> > "spctools-discuss" group.
>> > To post to this group, send email to [email protected].
>> > To unsubscribe from this group, send email to
>> > [email protected].
>> > For more options, visit this group at
>> > http://groups.google.com/group/spctools-discuss?hl=en.
>> >
>>
>
>
>
> --
> Dr. Jagan Kommineni
> Ludwig Institute for Cancer Research
> Parkville VIC 3145
> Australia.
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/spctools-discuss?hl=en.
>
