I like to refer to it as the "decoy rate" as it is the rate at which decoys
are acquired among matches drawn at random from the database.

-David


On Fri, Feb 7, 2014 at 1:59 PM, Eric Deutsch <[email protected]> wrote:

> Maybe "decoy fraction" is the right term for this concept?
>
>
>
>
>
> *From:* [email protected] [mailto:
> [email protected]] *On Behalf Of *Dave Trudgian
> *Sent:* Thursday, February 06, 2014 2:20 PM
> *To:* [email protected]
> *Subject:* Re: [spctools-discuss] NSP model in iProphet/ProteinProphet;
> model vs decoy based FDR in ProteinProphet
>
>
>
> David,
>
>
>
> Thanks for the pointer to the iProphet paper - very useful. I'd just been
> thinking over a coffee about r=1/3 if ProphetModels could ignore the first
> decoy set. Disabling DECOYPROBS on the DECOY1 set hadn't come into my head.
> I'd worried in the past about the degeneracy issue, but have just ignored
> it so far.
>
>
>
> I have been working off the decoy probs downstream to report estimated
> FDRs both at model fitting (DECOY1) and on the independent set (DECOY2),
> with the latter used for filtering, and the former just as info for the
> curious. I guess I can disable DECOYPROBS and just compute FDR on the
> independent set, or modify ProphetModels.pl so it can ignore specified
> (DECOY1) sequences in its computations. That way the ProphetModels.pl
> output is going to be consistent with the downstream stuff.
>
>
>
> I guess the only thing I'm left wondering is whether the ProphetModels.pl
> help statement might be confusing to others as well? I've always considered a
> 'ratio' to generally be between two distinct sets, i.e. target:decoy, rather
> than a subset vs. the total. Maybe it could be explicitly stated?
>
>
>
> -r <NUM>  -- Specify decoy ratio (decoy/total sequences). Will guess from
> P<0.001 hits if not specified.
>
>
>
> Thanks again.
>
>
>
> Dave T
>
>
>
>
>
> On Thursday, February 6, 2014 3:39:20 PM UTC-6, David Shteynberg wrote:
>
> Hi Dave,
>
>
>
> r is computed as Decoy / Total among hits with less than 2% probability.
> There is a detailed discussion of this in the iProphet paper.
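
A minimal sketch of that estimate, assuming each hit is available as a (probability, is_decoy) pair and that the 2% threshold is applied as p <= 0.02. The function name and the sample data are hypothetical; this is not the code in ProphetModels.pl.

```python
def estimate_decoy_fraction(hits, max_prob=0.02):
    """Estimate r = decoy / total among low-probability hits.

    hits: iterable of (probability, is_decoy) pairs (hypothetical layout).
    """
    low = [is_decoy for prob, is_decoy in hits if prob <= max_prob]
    if not low:
        raise ValueError("no hits at or below the probability threshold")
    # True counts as 1, so the sum is the number of decoy hits.
    return sum(low) / len(low)


# Made-up hits: four fall at or below p = 0.02, two of them decoys.
hits = [
    (0.001, True),
    (0.010, False),
    (0.015, True),
    (0.019, False),
    (0.950, False),
    (0.990, False),
]
r_est = estimate_decoy_fraction(hits)
print(r_est)
```

Only hits that are almost certainly wrong enter the estimate, on the idea that such hits land on targets and decoys in proportion to their share of the database.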
>
>
>
>
>
> If you have a DB of 50% target 50% decoy and none of the decoys are
> discarded (which is one way to use your 50%T 25%D1 25%D2) then r = 0.5
>
>
>
> If you discard half of the decoys, e.g. D1 is used for modelling and
> DECOYPROBS is disabled (in which case all D1 get probability 0), then all D1
> should be excluded from the analysis by ProphetModels.pl. The remaining
> decoys D2 will then constitute roughly 1/3 of the remaining database
> entries, and r will be roughly one third (25/75 = 0.3333). In fact, r is
> related not only to the protein counts but also to the distinct peptides in
> each set of database entries. Because the original database and the decoys
> may have degenerate (repeated) peptides, r will be only roughly that
> fraction and will vary depending on the database, how the decoys are
> constructed, and how independent D1's decoys are from D2's decoys.
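
The two cases above can be worked through as simple arithmetic. This sketch assumes r is just the share of decoy entries among the database entries kept in the analysis and ignores degenerate peptides, so real values will only be approximately these. The function name is hypothetical.

```python
def decoy_fraction(targets, decoys_kept):
    """Share of decoy entries among all entries kept in the analysis."""
    total = targets + decoys_kept
    if total <= 0:
        raise ValueError("database must contain at least one entry")
    return decoys_kept / total


# Database: 50% target, 25% D1, 25% D2 (counts given per 100 entries).
# Case 1: no decoys discarded, so both D1 and D2 count as decoys.
r_all = decoy_fraction(targets=50, decoys_kept=50)

# Case 2: D1 is excluded from the analysis, so only D2 remains.
r_d2_only = decoy_fraction(targets=50, decoys_kept=25)

print(r_all, round(r_d2_only, 4))
```

Under these assumptions neither case gives 0.25, because r is taken over the entries remaining in the analysis rather than over the whole original database.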
>
>
>
> The iProphet paper carries more info on this than I can put in an email,
> so that's a good reference for this.
>
>
>
> Cheers,
>
> -David
>
> On Thu, Feb 6, 2014 at 12:17 PM, Dave Trudgian <
> [email protected]> wrote:
>
> David,
>
>
>
> I just saw Rene's note about the -r 0.25 decoy ratio. I'm similarly using
> 2 decoy sets (50% target, 25% DECOY_1, 25% DECOY_2) but with -r 0.5. I had
> assumed the ratio was supposed to be specified as decoys_used/targets and
> there are twice as many targets as DECOY_2s in my case so -r = 0.5.
>
>
>
> Having looked in ProphetModels.pl I'm now not so sure.... the estimation
> if -r isn't supplied is pp_prob_array / pp_prob_array_decoy for hits with
> p<=0.02, but I'm not sure whether this is total/decoy or target/decoy.
>
>
>
> Can you confirm which approach is correct?
>
>
>
> Not a huge problem for me if -r 0.5 is wrong, as I am computing and using
> decoy stats elsewhere, external to TPP. It would just mean the plots from
> ProphetModels.pl that are being saved are wrong.
>
>
>
> Thanks,
>
>
>
> Dave Trudgian
>
>
>
> On Thursday, December 19, 2013 2:01:53 AM UTC-6, Rene B wrote:
>
> Hi David,
>
>
>
> Thank you for your quick reply and suggestions. The decoy ratio is set to
> 0.25 as I use two sets of decoys, one for modeling and the other for
> validation. Each decoy set corresponds to 25% of entries in the database.
>
>
>
> Kind regards,
>
>
> Rene
>
>
>
> On Wednesday, December 18, 2013 8:13:27 PM UTC+1, David Shteynberg wrote:
>
> Hello Rene
>
> Thanks for using the tools and double checking your work.
>
> In my tests I have found that applying the NSP model at the iProphet step
> greatly improves performance at the peptide level, and applying the NSP model
> at the ProteinProphet step improves performance at the protein level. The
> two models are somewhat different, since the ProteinProphet model considers
> grouping information while the iProphet model doesn't. I have not found the
> two to interfere.
>
> A safe approach would be to take the more conservative estimate, e.g. the
> ProteinProphet probability cutoff that gives 1% error with decoys or 1%
> error with the model, whichever is more conservative.
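
As a rough illustration of comparing the two error estimates at one probability cutoff: the model-based FDR here is the mean of (1 - p) over accepted target hits, and the decoy-based FDR scales the accepted decoy count by (1 - r) / r. Both formulas, the function name, and the sample data are assumptions made for this sketch, not necessarily what ProphetModels.pl or ProtProphModels.pl compute.

```python
def fdr_estimates(hits, cutoff, r):
    """Return (model_fdr, decoy_fdr) for hits with probability >= cutoff.

    hits: iterable of (probability, is_decoy) pairs (hypothetical layout).
    r: decoy fraction (decoy / total) among incorrect matches.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must be strictly between 0 and 1")
    accepted = [(p, d) for p, d in hits if p >= cutoff]
    target_probs = [p for p, d in accepted if not d]
    n_decoy = sum(1 for _, d in accepted if d)
    if not target_probs:
        raise ValueError("no target hits pass the cutoff")
    # Model-based: expected fraction of incorrect hits among accepted targets.
    model = sum(1.0 - p for p in target_probs) / len(target_probs)
    # Decoy-based: each accepted decoy implies (1 - r) / r incorrect targets.
    decoy = n_decoy * (1.0 - r) / (r * len(target_probs))
    return model, decoy


# Made-up hits: nine targets and one decoy pass a cutoff of 0.9.
hits = [(0.99, False)] * 8 + [(0.95, False), (0.95, True), (0.50, False)]
model, decoy = fdr_estimates(hits, cutoff=0.9, r=0.5)

# The larger of the two error estimates is the more conservative one.
conservative = max(model, decoy)
print(f"model={model:.4f} decoy={decoy:.4f} conservative={conservative:.4f}")
```

Repeating this over a range of cutoffs and keeping the lowest cutoff at which the larger estimate stays under the desired error rate gives a cutoff that satisfies both estimates.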
>
> When the model tends to underestimate error at the protein or peptide level,
> this usually stems from underestimation at the spectrum level by
> PeptideProphet. It can be controlled with the CLEVEL={value} option for
> PeptideProphetParser (-c{value} for xinteract). Setting this to a number
> greater than zero, like .5 or 1 or 2, will make the model more
> conservative overall; a negative value will have the opposite effect. Either
> way, the effect will carry through to the peptide and protein levels.
>
> Also I am curious why you set decoy rate to 0.25?
>
> Best,
> David
>
> On Dec 18, 2013 7:29 AM, "Rene B" <[email protected]> wrote:
>
> Hi all,
>
>
>
> I am running PeptideProphet, iProphet and ProteinProphet (TPP 4.6.3) on Q
> Exactive data searched with Comet, Myrimatch and OMSSA. I wondered if the
> NSP model should be disabled in ProteinProphet when it is enabled in
> iProphet? I got confused because it seems Petunia enables the NSP model
> both in iprophet and proteinprophet by default (ie. when xinteract runs
> with the -ip option).
>
>
>
> Another question: when I compare decoy-estimated protein FDRs to
> ProteinProphet modelled FDRs, ProteinProphet seems a bit optimistic (a
> decoy-based FDR of 0.1% corresponds to ~0.02% model FDR). This is with NSP
> enabled in iProphet and disabled in ProteinProphet. How should I deal with
> this discrepancy, i.e. should I take the decoy-based or the
> probability-based FDR to select a probability cutoff?
>
>
>
> I have attached some examples for a search with myrimatch only. These are
> the commands I used to generate the graphs:
>
>
>
> xinteract -Nmyrimatch.pep.xml -OAP -p0 -a%ExperimentFolder% -dDECOY0
> -E%ExperimentTag% *.pep.xml
>
> InterProphetParser myrimatch.pep.xml myrimatch.ipro.pep.xml
>
> ProphetModels.pl -i myrimatch.ipro.pep.xml -k -r 0.25 -d "DECOY1"
>
> ProteinProphet myrimatch.ipro.pep.xml myrimatch.prot.xml IPROPHET NONSP
>
> ProtProphModels.pl -k -r 0.25 -d DECOY1 -i myrimatch.prot.xml
>
>
>
> The graphs are:
>
>
>
> myrimatch_all.ipro.pep_FDR_10pc: PeptideProphet/iProphet decoy vs model
> FDR, all models enabled
>
> myrimatch_nonsp.ipro.pep_FDR_10pc: PeptideProphet/iProphet decoy vs model
> FDR, NSP model disabled in iProphet
>
> myrimatch_nonsp.prot_FDR_5pc: ProteinProphet decoy vs model FDR, NSP model
> disabled in ProteinProphet
>
> myrimatch_all.prot_FDR_5pc: ProteinProphet decoy vs model FDR, NSP model
> enabled in iProphet and ProteinProphet
>
>
>
> Thanks in advance!
>
>
>
> Kind regards,
>
>
>
> Rene
>
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/spctools-discuss.
> For more options, visit https://groups.google.com/groups/opt_out.

