Hi,

As far as I understand it, the TPP uses a mixture-model-based approach to compute posterior probabilities and then uses these probabilities to estimate the FDR. This can be done with or without a decoy database. However, some of the more sophisticated options (like non-parametric modelling) need a decoy database to help the modelling algorithms pin down the negative distribution. Also, when the data quality is not so good, the decoys can help to better separate good from bad identifications.
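To make the idea of "probabilities to FDR" a bit more concrete, here is a minimal Python sketch. It is not the TPP's actual code: the function name and the toy probabilities are made up for illustration, and it assumes the posterior probabilities are well calibrated, so that each accepted identification contributes (1 - p) expected errors.

```python
def estimate_fdr(probabilities, cutoff):
    """Estimate the FDR among identifications with probability >= cutoff.

    Assumption: the probabilities are calibrated posteriors, so the expected
    number of incorrect identifications is the sum of (1 - p) over the
    accepted ones. Returns (estimated_fdr, number_accepted).
    """
    accepted = [p for p in probabilities if p >= cutoff]
    if not accepted:
        # Nothing passes the cutoff, so there is nothing to be wrong about.
        return 0.0, 0
    expected_incorrect = sum(1.0 - p for p in accepted)
    return expected_incorrect / len(accepted), len(accepted)


# Toy probabilities, invented for illustration only.
probs = [0.99, 0.98, 0.95, 0.90, 0.60, 0.30, 0.05]
fdr, n_accepted = estimate_fdr(probs, cutoff=0.90)
print(f"accepted={n_accepted}, estimated FDR={fdr:.3f}")
```

Lowering the cutoff lets more identifications in but raises the estimated FDR, which is the trade-off you are choosing when you pick a probability threshold.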
After the analysis with ProteinProphet, you get something like this in the ProteinProphet window:

Prob   Sens    FPER    Corr   Incorr
0.00   1.000   0.930   217    2874
0.10   1.000   0.337   217    110
0.20   1.000   0.337   217    110
0.30   0.915   0.216   198    55
0.40   0.865   0.152   187    34
0.50   0.816   0.102   177    20
0.60   0.777   0.074   168    14
0.70   0.747   0.058   162    10
0.80   0.647   0.019   140    3
0.90   0.608   0.010   132    1
0.95   0.569   0.005   123    1
0.96   0.547   0.004   119    0
0.97   0.529   0.003   115    0
0.98   0.507   0.002   110    0
0.99   0.470   0.001   102    0
1.00   0.152   0.000   33     0

The FPER is your FDR, so if you decide to set it at 1%, you will see that this corresponds to a probability cutoff of 0.90. In your protein list, you then accept all proteins with a probability of 0.90 or higher, which is estimated to give 132 correct proteins and 1 incorrect one. Everything below that cutoff is discarded.

I found lots of info in the following papers (esp. the last paper). Hope this helps!

Cheers,
Bjorn

[1] Choi, H., Fermin, D., and Nesvizhskii, A. I. Significance analysis of spectral count data in label-free shotgun proteomics. Mol. Cell. Proteomics 7, 12 (2008), 2373–2385.
[2] Choi, H., Ghosh, D., and Nesvizhskii, A. I. Statistical validation of peptide identifications in large-scale proteomics using the target-decoy database search strategy and flexible mixture modeling. J. Proteome Res. 7, 1 (2008), 286–292.
[3] Choi, H., and Nesvizhskii, A. I. False discovery rates and related statistical concepts in mass spectrometry-based proteomics. J. Proteome Res. 7, 1 (2008), 47–50.
[4] Choi, H., and Nesvizhskii, A. I. Semisupervised model-based validation of peptide identifications in mass spectrometry-based proteomics. J. Proteome Res. 7, 1 (2008), 254–265.
[5] Deutsch, E. W., Mendoza, L., Shteynberg, D., Farrah, T., Lam, H., Tasman, N., Sun, Z., Nilsson, E., Pratt, B., Prazen, B., Eng, J. K., Martin, D. B., Nesvizhskii, A. I., and Aebersold, R. A guided tour of the Trans-Proteomic Pipeline. Proteomics 10, 6 (2010), 1150–1159.
[6] Keller, A., Nesvizhskii, A.
I., Kolker, E., and Aebersold, R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 74, 20 (2002), 5383–5392.
[7] Nesvizhskii, A. I. A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. J. Proteomics 73, 11 (2010), 2092–2123.
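
P.S. If you would rather look up the cutoff in a script than by eye, something like the following Python sketch should do. The rows are typed in by hand from the table above, and `cutoff_for_fdr` is just a name I made up (it is not a TPP function). It simply picks the lowest probability cutoff whose FPER does not exceed the FDR you are willing to accept.

```python
# Rows copied by hand from the ProteinProphet table above:
# (min_prob, sensitivity, fper, num_correct, num_incorrect)
ERROR_TABLE = [
    (0.00, 1.000, 0.930, 217, 2874),
    (0.10, 1.000, 0.337, 217, 110),
    (0.20, 1.000, 0.337, 217, 110),
    (0.30, 0.915, 0.216, 198, 55),
    (0.40, 0.865, 0.152, 187, 34),
    (0.50, 0.816, 0.102, 177, 20),
    (0.60, 0.777, 0.074, 168, 14),
    (0.70, 0.747, 0.058, 162, 10),
    (0.80, 0.647, 0.019, 140, 3),
    (0.90, 0.608, 0.010, 132, 1),
    (0.95, 0.569, 0.005, 123, 1),
    (0.96, 0.547, 0.004, 119, 0),
    (0.97, 0.529, 0.003, 115, 0),
    (0.98, 0.507, 0.002, 110, 0),
    (0.99, 0.470, 0.001, 102, 0),
    (1.00, 0.152, 0.000, 33, 0),
]


def cutoff_for_fdr(table, max_fdr):
    """Return the row with the lowest probability cutoff whose FPER <= max_fdr.

    Returns None if no row in the table meets the requested FDR.
    """
    passing = [row for row in table if row[2] <= max_fdr]
    if not passing:
        return None
    # The lowest passing cutoff keeps the most proteins (highest sensitivity).
    return min(passing, key=lambda row: row[0])


row = cutoff_for_fdr(ERROR_TABLE, max_fdr=0.01)
if row is not None:
    prob, sens, fper, correct, incorrect = row
    print(f"cutoff={prob:.2f} sens={sens:.3f} FPER={fper:.3f} "
          f"correct={correct} incorrect={incorrect}")
```

For a 1% FDR this picks the 0.90 row, the same one you would read off the table.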
