I like to refer to it as the "decoy rate" as it is the rate at which decoys are acquired among matches drawn at random from the database.
-David

On Fri, Feb 7, 2014 at 1:59 PM, Eric Deutsch <[email protected]> wrote:

> Maybe "decoy fraction" is the right term for this concept?
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Dave Trudgian
> *Sent:* Thursday, February 06, 2014 2:20 PM
> *To:* [email protected]
> *Subject:* Re: [spctools-discuss] NSP model in iProphet/ProteinProphet; model vs decoy based FDR in ProteinProphet
>
> David,
>
> Thanks for the pointer to the iProphet paper - very useful. I'd just been thinking over a coffee about r=1/3 if ProphetModels could ignore the first decoy set. Disabling DECOYPROBS on the DECOY1 set hadn't come into my head. I'd worried in the past about the degeneracy issue, but have just ignored it so far.
>
> I have been working off the decoy probs downstream to report estimated FDRs both at model fitting (DECOY1) and on the independent set (DECOY2), with the latter used for filtering, and the former just as info for the curious. I guess I can disable DECOYPROBS and just compute FDR on the independent set, or modify ProphetModels.pl so it can ignore specified (DECOY1) sequences in its computations. That way the ProphetModels.pl output is going to be consistent with the downstream stuff.
>
> I guess the only thing I'm left wondering is whether the ProphetModels.pl help statement might be confusing to others as well? I've always considered a 'ratio' to generally be between two distinct sets, i.e. target:decoy rather than a subset vs total. Maybe it could be explicitly stated?
>
> -r <NUM> -- Specify decoy ratio (decoy/total sequences). Will guess from P<0.001 hits if not specified.
>
> Thanks again.
>
> Dave T
>
> On Thursday, February 6, 2014 3:39:20 PM UTC-6, David Shteynberg wrote:
>
> Hi Dave,
>
> r is computed as Decoy / Total among hits with less than 2% probability. There is a detailed discussion of this in the iProphet paper.
>
> If you have a DB of 50% target 50% decoy and none of the decoys are discarded (which is one way to use your 50%T 25%D1 25%D2) then r = 0.5
>
> If you discard half of the decoys, e.g. D1 is used for modelling and DECOYPROBS is disabled (in which case all D1 get probability 0), then all D1 should be excluded from the analysis by ProphetModels.pl. The remaining decoys D2 will then constitute roughly 1/3 of the remaining database entries and r will be roughly one third (25/75 = 0.3333). In fact, r is related not only to the protein counts but to the distinct peptides in each set of the database entries, and as the original database and the decoys may have degenerate (repeated) peptides, it will be only roughly that percentage and will vary depending on the database, how the decoys are constructed and how independent D1's decoys are from D2's decoys.
>
> The iProphet paper carries more info on this than I can put in an email, so that's a good reference for this.
>
> Cheers,
>
> -David
>
> On Thu, Feb 6, 2014 at 12:17 PM, Dave Trudgian <[email protected]> wrote:
>
> David,
>
> I just saw Rene's note about the -r 0.25 decoy ratio. I'm similarly using 2 decoy sets (50% target, 25% DECOY_1, 25% DECOY_2) but with -r 0.5. I had assumed the ratio was supposed to be specified as decoys_used/targets and there are twice as many targets as DECOY_2s in my case so -r = 0.5.
>
> Having looked in ProphetModels.pl I'm now not so sure.... the estimation if -r isn't supplied is pp_prob_array / pp_prob_array_decoy for hits with p<=0.02, but I'm not sure whether this is total/decoy or target/decoy.
>
> Can you confirm which approach is correct?
>
> Not a huge problem for me if -r 0.5 is wrong, as I am computing and using decoy stats elsewhere, external to TPP. It would just mean the plots from ProphetModels.pl that are being saved are wrong.
>
> Thanks,
>
> Dave Trudgian
>
> On Thursday, December 19, 2013 2:01:53 AM UTC-6, Rene B wrote:
>
> Hi David,
>
> Thank you for your quick reply and suggestions. The decoy ratio is set to 0.25 as I use two sets of decoys, one for modeling and the other for validation. Each decoy set corresponds to 25% of entries in the database.
>
> Kind regards,
>
> Rene
>
> On Wednesday, 18 December 2013 20:13:27 UTC+1, David Shteynberg wrote:
>
> Hello Rene
>
> Thanks for using the tools and double checking your work.
>
> In my tests I have found that applying the NSP model at the iProphet step greatly improves performance on peptide level. And applying the NSP model at the ProteinProphet step improves performance on the protein level. The two models are somewhat different since the ProteinProphet model considers grouping information while the iProphet model doesn't. I have not found the two to interfere.
>
> A safe and conservative approach would be to use the more conservative estimate, e.g. the ProteinProphet probability cutoff that gives 1% error with decoys or 1% error with the model, whichever is more conservative.
>
> When the model tends to underestimate error on protein or peptide level this usually stems from underestimation at the spectrum level by PeptideProphet and can be controlled by the CLEVEL={value} option for PeptideProphetParser (-c{value} for xinteract). Setting this to a number greater than zero like .5 or 1 or 2 will serve to make the model more conservative overall; a negative value will have the opposite effect, which will carry through to the peptide and protein levels.
>
> Also I am curious why you set decoy rate to 0.25?
>
> Best,
> David
>
> On Dec 18, 2013 7:29 AM, "Rene B" <[email protected]> wrote:
>
> Hi all,
>
> I am running PeptideProphet, iProphet and ProteinProphet (TPP 4.6.3) on Q Exactive data searched with Comet, Myrimatch and OMSSA.
> I wondered if the NSP model should be disabled in ProteinProphet when it is enabled in iProphet? I got confused because it seems Petunia enables the NSP model both in iProphet and ProteinProphet by default (i.e. when xinteract runs with the -ip option).
>
> Another question is that when I compare decoy estimated protein FDRs to ProteinProphet modelled FDRs, ProteinProphet seems a bit optimistic (decoy based FDR of 0.1% corresponds to ~0.02% model FDR). This is with NSP enabled in iProphet and disabled in ProteinProphet. How should I deal with this discrepancy, i.e. should I take the decoy or probability based FDR to select a probability cutoff?
>
> I have attached some examples for a search with myrimatch only. These are the commands I used to generate the graphs:
>
> xinteract -Nmyrimatch.pep.xml -OAP -p0 -a%ExperimentFolder% -dDECOY0 -E%ExperimentTag% *.pep.xml
>
> InterProphetParser myrimatch.pep.xml myrimatch.ipro.pep.xml
>
> ProphetModels.pl -i myrimatch.ipro.pep.xml -k -r 0.25 -d "DECOY1"
>
> ProteinProphet myrimatch.ipro.pep.xml myrimatch.prot.xml IPROPHET NONSP
>
> ProtProphModels.pl -k -r 0.25 -d DECOY1 -i myrimatch.prot.xml
>
> The graphs are:
>
> myrimatch_all.ipro.pep_FDR_10pc: PeptideProphet/iProphet decoy vs model FDR, all models enabled
>
> myrimatch_nonsp.ipro.pep_FDR_10pc: PeptideProphet/iProphet decoy vs model FDR, NSP model disabled in iProphet
>
> myrimatch_nonsp.prot_FDR_5pc: ProteinProphet decoy vs model FDR, NSP model disabled in ProteinProphet
>
> myrimatch_all.prot_FDR_5pc: ProteinProphet decoy vs model FDR, NSP model enabled in iProphet and ProteinProphet
>
> Thanks in advance!
>
> Kind regards,
>
> Rene
>
> --
> You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/spctools-discuss.
> For more options, visit https://groups.google.com/groups/opt_out.
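
For readers following the thread, the decoy-fraction arithmetic discussed above (r = counted decoys / total counted entries) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical and none of this is taken from ProphetModels.pl. The inputs are the database percentages quoted in the thread, whereas David notes that the real r depends on distinct peptides and will only roughly match these values. The last function additionally assumes the common decoy-counting model in which an incorrect match lands on a counted decoy with probability r.

```python
def decoy_fraction(targets: float, decoys_counted: float) -> float:
    """r as described in the thread: counted decoys / (targets + counted decoys).

    Decoys excluded from the analysis (e.g. a DECOY1 set used only for
    modelling) are left out of both numerator and denominator.
    """
    total = targets + decoys_counted
    if total <= 0:
        raise ValueError("need at least one counted database entry")
    return decoys_counted / total


def decoy_to_target_ratio(targets: float, decoys_counted: float) -> float:
    """The other reading of 'ratio' raised in the thread: decoys / targets."""
    if targets <= 0:
        raise ValueError("need at least one target entry")
    return decoys_counted / targets


def estimated_false_targets(decoy_hits: float, r: float) -> float:
    """Estimated incorrect target hits, given observed decoy hits.

    Assumption (not from the thread): incorrect matches hit counted decoys
    with probability r, so total incorrect matches ~ decoy_hits / r and the
    incorrect ones among targets are that total minus the decoy hits.
    """
    if not 0 < r < 1:
        raise ValueError("r must lie strictly between 0 and 1")
    return decoy_hits * (1 - r) / r


if __name__ == "__main__":
    # 50% target, 50% decoy, no decoys discarded.
    print("r, all decoys counted:", decoy_fraction(50, 50))
    # 50% target, 25% D1 (excluded), 25% D2 (counted).
    print("r, D1 excluded:", decoy_fraction(50, 25))
    # The decoys/targets reading gives a different number for the same setup.
    print("decoys/targets, D1 excluded:", decoy_to_target_ratio(50, 25))
```

The two conventions coincide only by accident: with D1 excluded, decoys/targets gives 0.5 while decoys/total gives about 1/3, which is the distinction the thread settles on.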
