Hi Jagan,

My intuition is that it is advisable to use a consistent database and
similar search parameters wherever possible when combining search
engines with iProphet.  I have nothing to offer here beyond intuition,
since I have not actually tried this.  We now always include decoys in
the databases we use.  I suspect the statistical models used by
iProphet may not apply equally to spectral matches coming from the
different search engines in that case, which could bias the results.
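
As a rough illustration of the consistency check suggested above, the
sketch below counts decoy and non-decoy entries in each FASTA database
before the searches are combined.  It is only a sketch under stated
assumptions: the function names count_decoys and databases_consistent
are hypothetical, it is not part of the TPP, and it assumes decoy
entries are marked by an identifier prefix such as "decoy" (the label
passed as DECOY=decoy in the runs quoted below).

```python
from typing import Dict, Iterable, Tuple


def count_decoys(fasta_lines: Iterable[str],
                 decoy_prefix: str = "decoy") -> Tuple[int, int]:
    """Return (decoys, non_decoys) counted from FASTA header lines.

    Assumption: a decoy entry is one whose identifier (the first token
    after '>') starts with decoy_prefix.
    """
    decoys = 0
    targets = 0
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        fields = line[1:].split(None, 1)
        identifier = fields[0] if fields else ""
        if identifier.startswith(decoy_prefix):
            decoys += 1
        else:
            targets += 1
    return decoys, targets


def databases_consistent(counts: Dict[str, Tuple[int, int]]) -> bool:
    """True only if every engine's database has identical decoy/target
    counts and contains at least one decoy."""
    distinct = set(counts.values())
    return len(distinct) == 1 and all(d > 0 for d, _ in distinct)


if __name__ == "__main__":
    # Toy in-memory databases standing in for real FASTA files.
    with_decoys = [">sp|P1|A", "MKV", ">decoy_sp|P1|A", "VKM"]
    without_decoys = [">sp|P1|A", "MKV"]
    counts = {
        "OMSSA": count_decoys(with_decoys),
        "Mascot": count_decoys(without_decoys),
    }
    print(counts)
    print(databases_consistent(counts))
```

Comparing counts is a coarse check; it does not prove the databases are
identical, only that an obvious mismatch (such as one engine searched
without decoys) would be caught before running iProphet.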

-David

On Fri, Jul 16, 2010 at 9:42 PM, Jagan Kommineni
<[email protected]> wrote:
> Dear David,
>
>         At APCF, we are using multiple algorithms (currently Mascot,
> X!Tandem and OMSSA, with Crux planned for the near future) to process the
> same data. The standard search results are passed to the TPP for
> postprocessing. After the PeptideProphet parser runs, we merge the result
> files into one output file using your iProphet. The iProphet result file
> is then processed with the ProteinProphet parser.  We are also planning to
> quantify the peptides produced by iProphet using your tools ASAP, XPRESS
> and Libra before running the ProteinProphet parser.
>
>         I would like to know the overall impact on the results when we
> combine the non-decoy PeptideProphet results for Mascot and X!Tandem with
> the decoy-based PeptideProphet results for OMSSA when running iProphet and
> the other TPP postprocessing tools (e.g. quantitation and the
> ProteinProphet parser). We are really keen to learn more about this.
>
>         If a user wants to include OMSSA search results in the TPP
> postprocessing, would you advise using a consistent decoy database across
> all the algorithms (Mascot, X!Tandem, OMSSA and Crux)?
>
> with regards,
>
> Jagan Kommineni
>
>
> On Sat, Jul 17, 2010 at 1:01 AM, David Shteynberg
> <[email protected]> wrote:
>>
>> Dear Jagan,
>>
>> OMSSA results (also InsPecT and MyriMatch) are modelled
>> semi-parametrically, which *requires* decoys to learn the shapes of
>> the mixture model distributions.  Without decoys in the database,
>> results from these search engines cannot be processed through the TPP.
>> Why are you reluctant to include decoys in the model?  We sometimes use
>> two independent sets of decoys in the database, where one set is used
>> for the semi-parametric modelling and the other to evaluate the model
>> independently.  Also, a match that is significant is not necessarily
>> correct; decoy matches with significant scores are common.
>>
>> -David
>>
>> On Fri, Jul 16, 2010 at 12:03 AM, Jagan Kommineni
>> <[email protected]> wrote:
>> > Dear David,
>> >
>> >          After increasing the e-value to 1e6, I ran the OMSSA search
>> > against the standard FASTA database and against a decoy database
>> > (generated using TPP's decoyFASTA) with identical input data and
>> > parameters.
>> >
>> > In the first case (non-decoy), 101 of 18,872 peptide matches were
>> > significant; in the latter case (decoy), 89 of 18,012 peptide matches
>> > were significant.
>> >
>> > I used the same input file, which contains 3,444 spectra, in both
>> > experiments.
>> >
>> > When I ran PeptideProphetParser against the non-decoy database I got a
>> > segmentation fault, even though the standard OMSSA search returned a
>> > similar set of results (a similar set of false positives) in both
>> > cases. Here is the STDOUT of the PeptideProphetParser run for NON-DECOY
>> > (standard FASTA database).
>> >
>> >
>> > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> >
>> > [r...@compute-3-0 2010-07-16]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser
>> > 'jagan-J1229.pepprophet.xml'
>> > 'jagan-J1229.pep.xml' -L'7'  -E'trypsin' -C -P
>> >  file 1: jagan-J1229.pep.xml
>> >  processed altogether 3635 results
>> >
>> > results written to file
>> > /mnt/sanfs/APCF/results/tpp/2010-07-16/jagan-J1229.pepprophet.shtml
>> >
>> > [r...@compute-3-0 2010-07-16]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
>> > 'jagan-J1229.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
>> > Using Decoy Label "decoy".
>> > Using non-parametric distributions
>> >  (OMSSA) (minprob 0)
>> > WARNING!! The discriminant function for OMSSA is not yet complete.  It is
>> > presented here to help facilitate trial and discussion.  Reliance on this
>> > code for publishable scientific results is not recommended.
>> > init with OMSSA trypsin
>> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
>> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>> >
>> >  PeptideProphet  (TPP v4.4 JETSTREAM (unstable development prerelease) rev
>> > 0, Build 201007011135 (linux)) akel...@isb
>> >  read in 272 1+, 1490 2+, 1766 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> > Initialising statistical models ...
>> > Found 0 Decoys, and 3528 Non-Decoys
>> > WARNING: No decoys with label decoy were found in this dataset. reverting to
>> > fully unsupervised method.
>> > Iterations: .........10.........20
>> > Segmentation fault
>> > [r...@compute-3-0 2010-07-16]#
>> >
>> >
>> > --------------------------------------------------------------------------------------------------------------------------------------------------------------
>> >
>> > As mentioned, in the latter case, where I used the decoy database for
>> > the OMSSA search, PeptideProphetParser issued only warning messages and
>> > I was able to view the pepXML files without any hassle. But when I run
>> > the TPP pipeline on a similar input file after a standard Mascot search,
>> > I see 0 hits for charges 4, 5, 6 and 7 but no warning messages.
>> >
>> >
>> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> >
>> > [r...@compute-3-0 2010-07-16]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser
>> > 'jagan-J237.pepprophet.xml'
>> > 'jagan-J237.pep.xml' -L'7'  -E'trypsin' -C -P
>> >  file 1: jagan-J237.pep.xml
>> >  processed altogether 3417 results
>> >
>> >  results written to file
>> > /mnt/sanfs/APCF/results/tpp/2010-07-16/jagan-J237.pepprophet.shtml
>> >
>> > [r...@compute-3-0 2010-07-16]#
>> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
>> > 'jagan-J237.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
>> > Using Decoy Label "decoy".
>> > Using non-parametric distributions
>> >  (OMSSA) (minprob 0)
>> > WARNING!! The discriminant function for OMSSA is not yet complete.  It is
>> > presented here to help facilitate trial and discussion.  Reliance on this
>> > code for publishable scientific results is not recommended.
>> > init with OMSSA trypsin
>> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
>> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>> >
>> >  PeptideProphet  (TPP v4.4 JETSTREAM (unstable development prerelease) rev
>> > 0, Build 201007011135 (linux)) akel...@isb
>> >  read in 213 1+, 1376 2+, 1766 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> > Initialising statistical models ...
>> > Found 997 Decoys, and 2358 Non-Decoys
>> > Iterations: .........10.........20.....
>> > WARNING: Mixture model quality test failed for charge (1+).
>> > WARNING: Mixture model quality test failed for charge (4+).
>> > WARNING: Mixture model quality test failed for charge (5+).
>> > WARNING: Mixture model quality test failed for charge (6+).
>> > WARNING: Mixture model quality test failed for charge (7+).
>> > model complete after 26 iterations
>> > [r...@compute-3-0 2010-07-16]#
>> >
>> > -------------------------------------------------------------------------------------------------------------------------------------------------------------
>> >
>> > I wonder whether there is any way to run the TPP on OMSSA results
>> > produced with the standard FASTA database rather than the decoy
>> > database.
>> >
>> > I have put the OMSSA search result files on the APCF wiki; here is the
>> > link:
>> >
>> > https://search.apcf.edu.au/wiki/index.php/Apcfwiki:Community_Portal#APCF__OMSSA_files
>> >
>> > OMSSA files: jagan-J1229.omx (non-decoy output), jagan-J237.omx (decoy
>> > output) and O070512-01.mgf (input file).
>> >
>> > with regards,
>> >
>> >
>> > Jagan Kommineni
>> >
>> >
>> >
>> > On Fri, Jul 9, 2010 at 7:18 AM, David Shteynberg
>> > <[email protected]> wrote:
>> >>
>> >> Hi Jagan,
>> >>
>> >> It appears that the QC filters were triggered on the PeptideProphet
>> >> MixtureModel.  This is likely due to too few data points in the
>> >> analysis for good statistics:  read in 0 1+, 82 2+, 44 3+, 0 4+, 0 5+,
>> >> 0 6+, and 0 7+ spectra.
>> >>
>> >>
>> >> With OMSSA this could be due to an e-value setting that is too low,
>> >> which filters out many results that the model could use to better
>> >> model the negative and positive distributions.  Set your OMSSA e-value
>> >> to a high value like 1e6 and this problem will likely go away, unless
>> >> you don't have very many correct results due to wrong parameters, bad
>> >> data or something else.
>> >>
>> >> Hope this helps.
>> >>
>> >> -David
>> >>
>> >>
>> >> On Fri, Jul 2, 2010 at 1:02 AM, Jagan Kommineni
>> >> <[email protected]> wrote:
>> >> > Dear All,
>> >> >
>> >> > I have created a decoy database for the SwissProt database using
>> >> > decoyFASTA from the TPP distribution and ran the following TPP
>> >> > commands after the OMSSA search with the decoy database. Here is the
>> >> > output on STDOUT:
>> >> >
>> >> > ---------------------------------------
>> >> > [r...@compute-3-0 run-on-compute]#
>> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/InteractParser
>> >> > 'jagan-J128.pepprophet.xml'
>> >> > 'jagan-J128.pep.xml'
>> >> > -D'/home/APCF/databases/SwissProt/uniprot_sprot_Jan2009/decoy/decoy_uniprot_sprot.fasta'
>> >> > -L'7'  -E'trypsin' -C -P
>> >> >  file 1: jagan-J128.pep.xml
>> >> >  processed altogether 126 results
>> >> >
>> >> >  results written to file
>> >> > /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.pepprophet.shtml
>> >> >
>> >> >
>> >> >
>> >> > [r...@compute-3-0 run-on-compute]#
>> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/PeptideProphetParser
>> >> > 'jagan-J128.pepprophet.xml' DECOY=decoy MINPROB=0 NONPARAM
>> >> > Using Decoy Label "decoy".
>> >> > Using non-parametric distributions
>> >> >  (OMSSA) (minprob 0)
>> >> > WARNING!! The discriminant function for OMSSA is not yet complete.  It is
>> >> > presented here to help facilitate trial and discussion.  Reliance on this
>> >> > code for publishable scientific results is not recommended.
>> >> > init with OMSSA Trypsin
>> >> > MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization:
>> >> > UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN
>> >> >
>> >> >  PeptideProphet  (TPP v4.3 JETSTREAM rev 1, Build 201003241044 (linux))
>> >> > akel...@isb
>> >> >  read in 0 1+, 82 2+, 44 3+, 0 4+, 0 5+, 0 6+, and 0 7+ spectra.
>> >> > Initialising statistical models ...
>> >> > Iterations: .........10.........20.....
>> >> > WARNING: Mixture model quality test failed for charge (1+).
>> >> > WARNING: Mixture model quality test failed for charge (2+).
>> >> > WARNING: Mixture model quality test failed for charge (4+).
>> >> > WARNING: Mixture model quality test failed for charge (5+).
>> >> > WARNING: Mixture model quality test failed for charge (6+).
>> >> > WARNING: Mixture model quality test failed for charge (7+).
>> >> > model complete after 26 iterations
>> >> > [r...@compute-3-0 run-on-compute]#
>> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/RefreshParser
>> >> > 'jagan-J128.pepprophet.xml'
>> >> > '/home/APCF/databases/SwissProt/uniprot_sprot_Jan2009/decoy/decoy_uniprot_sprot.fasta'
>> >> >   - Building Commentz-Walter keyword tree...  - Searching the tree...
>> >> >   - Linking duplicate entries...  - Printing results...
>> >> >
>> >> > [r...@compute-3-0 run-on-compute]#
>> >> > /mnt/sanfs/APCF/APCF_WEB/tpp/bin/ProteinProphet
>> >> > 'jagan-J128.pepprophet.xml'
>> >> > 'jagan-J128.prot.xml'
>> >> > ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the
>> >> > original Perl by A. Keller (TPP v4.3 JETSTREAM rev 1, Build 201003241044
>> >> > (linux))
>> >> >  (xml input) (report Protein Length) (using degen pep info)
>> >> > . . . reading in
>> >> > /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.pepprophet.xml. . .
>> >> > . . . read in 0 1+, 0 2+, 33 3+, 0 4+, 0 5+, 0 6+, 0 7+ spectra with min
>> >> > prob 0.05
>> >> > Could not find/open font when opening font "arial", using internal
>> >> > non-scalable font
>> >> > INFO: mu=6.3014e-09, db_size=584667857
>> >> >
>> >> >  protein probabilities written to file
>> >> > /mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.prot.xml
>> >> >  direct your browser to
>> >> > http://nfs//mnt/sanfs/APCF/results/omssa/decoy_test_run/run-on-compute/jagan-J128.prot.shtml
>> >> >
>> >> > [r...@compute-3-0 run-on-compute]#
>> >> >
>> >> > -------------------------------------
>> >> >
>> >> > I noticed some warning messages indicating that some tests failed.
>> >> > How critical are these messages?
>> >> >
>> >> >
>> >> > with regards,
>> >> >
>> >> >
>> >> > Dr. Jagan Kommineni
>> >> > Ludwig Institute for Cancer research
>> >> > Pakville VIC 3145
>> >> > Australia.
>> >> >
>> >> > --
>> >> > You received this message because you are subscribed to the Google
>> >> > Groups
>> >> > "spctools-discuss" group.
>> >> > To post to this group, send email to
>> >> > [email protected].
>> >> > To unsubscribe from this group, send email to
>> >> > [email protected].
>> >> > For more options, visit this group at
>> >> > http://groups.google.com/group/spctools-discuss?hl=en.
>> >> >
>> >>
>> >>
>> >
>> >
>> >
>> >
>>
>>
>
>
>
>
