Hi David,

Thanks for the input. I am now running PeptideProphet with d added to the -O option (-Od) and with the minimum probability set to 0, which I guess is why it takes more time and produces bigger files, almost a 10x increase. There is a difference, but not much. However, I now understand a little better what you were referring to. When running like this, I also see on the model page in Petunia that the predicted and decoy ROC curves are similar, much like what you posted for Florian's data, so I guess I'm on the right path.

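As a rough illustration of the two error estimates being compared here (a minimal sketch, not TPP code): given a list of hits with their probabilities and a decoy flag, the model-estimated error at a probability cutoff is the mean of (1 - probability) over the accepted hits, and the decoy-estimated error is taken here as decoys divided by all accepted hits, which is the convention of the 18/1884 style figures quoted further down. The function name `error_estimates` and the example values are hypothetical, and other decoy conventions (for example decoys divided by targets) exist.

```python
from typing import Iterable, Tuple


def error_estimates(
    hits: Iterable[Tuple[float, bool]], min_prob: float
) -> Tuple[int, float, float]:
    """Return (accepted, model_error, decoy_error) at a probability cutoff.

    hits     -- (probability, is_decoy) pairs; hypothetical input format
    min_prob -- hits with probability >= min_prob are accepted
    """
    accepted = [(p, is_decoy) for p, is_decoy in hits if p >= min_prob]
    if not accepted:
        return 0, 0.0, 0.0
    n = len(accepted)
    # Model-based estimate: expected fraction of wrong hits among accepted ones.
    model_error = sum(1.0 - p for p, _ in accepted) / n
    # Decoy-based estimate (assumed convention: decoys / all accepted hits).
    decoy_error = sum(1 for _, is_decoy in accepted if is_decoy) / n
    return n, model_error, decoy_error


# Made-up example values, only to show the call.
example_hits = [(0.99, False), (0.95, False), (0.90, True), (0.40, False)]
n, model_err, decoy_err = error_estimates(example_hits, min_prob=0.9)
print(f"accepted={n} model={model_err:.3f} decoy={decoy_err:.3f}")
```

Scanning `min_prob` over a range of cutoffs and comparing the two estimates at each one gives a rough, hand-made counterpart to comparing the predicted and decoy-based curves.
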
The peptides I was referring to are not decoys; they belong to target proteins, and their spectra look fine. They have the same sequence but a different label, e.g. n[29]ELVISK[156] and n[35]ELVISK[162], both in the same fraction. Both get a high probability in PeptideProphet, but after running iProphet I only see one of them in Petunia. After looking into it, I think it might be a display issue in Petunia: if I check the pepXML manually I see them both, just not in the browser.

Alejandro

On Tuesday, December 5, 2017 at 4:09:33 PM UTC+1, David Shteynberg wrote:
>
> Hello Alejandro,
>
> If you have decoys in your database, the best comparison would look at the
> peptide/protein IDs at a set decoy-estimated error rate. I would suggest
> you compare the results using the Decoy Peptide Validation and Decoy Protein
> Validation tools to give yourself the most accurate comparison at the
> decoy-estimated error rate. I would also suggest you set your minimum
> PeptideProphet probability to 0 to allow the models in iProphet the most
> discriminating power between correct and incorrect results. Finally, there
> is no reason to expect your high-scoring PeptideProphet results to remain
> high scoring after iProphet (what if your high-scoring PeptideProphet
> results are decoys?). The goal of iProphet is to identify the correct
> peptide sequences; this entails pushing down the wrong high-scoring results
> at the PeptideProphet level. So, rerun the analysis using a minimum
> PeptideProphet probability of 0 and compare the results at the same
> decoy-estimated error rates at the spectrum, peptide and protein level. If
> you still have concerns, please link your data so I can download it and
> troubleshoot the analysis.
>
> -David
>
> On Tue, Dec 5, 2017 at 6:47 AM, Alejandro <[email protected]> wrote:
>
>> Dear all,
>>
>> I would like to reopen this discussion. I have been testing iProphet and
>> have experienced something similar to what Florian described.
>>
>> I am searching dimethyl-labeled samples, doing two static searches (heavy
>> and light) with either Comet or X!Tandem. I then combine both searches
>> (heavy and light) with PeptideProphet and run ProteinProphet, both "as is"
>> and using the MPT for a 0.01 error in ProteinProphet. Then I take the basic
>> PeptideProphet results (run with P0.05) of Comet and X!Tandem and combine
>> them using iProphet with default parameters and selecting ProteinProphet.
>> I expected the number of IDs to increase, or at least to stay the same as
>> with either search engine alone. However, this is not the case, e.g.:
>>
>> ProteinProphet results filtered to 0.01 error
>>
>> Comet
>> 1809 (238 single hits) = 1571
>>
>> Tandem
>> 1498 (152 single hits) = 1346
>>
>> Comet and Tandem combined with iProphet
>> 1623 (376) = 1247
>>
>> Comet and Tandem combined with iProphet without NSP model
>> 1717 (366) = 1351
>>
>> So the single-peptide hits appear to increase when combining, and in the
>> end there are fewer proteins identified with more than 1 peptide.
>>
>> When looking at the models of each search engine, there is a good
>> separation of both distributions.
>>
>> Furthermore, when looking at specific proteins I have found that peptides
>> with a PeptideProphet probability above 0.9 (above my MPT for 0.01) in both
>> Comet and X!Tandem are gone after combining with iProphet. Why could this
>> be happening? Shouldn't they get an even higher probability?
>>
>> I hope one of you can give me a hint on this.
>>
>> Cheers,
>>
>> Alejandro
>>
>>
>> On Friday, March 20, 2015 at 2:55:39 PM UTC+1, Florian wrote:
>>>
>>> Hej David,
>>>
>>> sorry for my late response, I wanted to do some proper testing before
>>> reporting back to you.
>>> I did the following tests now, always using the -Od option and checking
>>> manually for decoy hits:
>>>
>>> a) I reran the analysis that I posted above, with these results:
>>>
>>> 1) X!Tandem only without iProphet: 1884 (557), 1% model estimated error,
>>> 18/1884 = 0.95% Decoy estimated error
>>> 2) X!Tandem only with iProphet: 1760 (854), 0.9% model estimated error,
>>> 6/1760 = 0.34% Decoy estimated error
>>> 3) MSGF only without iProphet: 2138 (632), 1% model estimated error,
>>> 31/2138 = 1.4% Decoy estimated error
>>> 4) MSGF only with iProphet: 1975 (876), 0.8% model estimated error,
>>> 8/1975 = 0.4% Decoy estimated error
>>> 5) X!Tandem and MSGF without iProphet: 2176 (590), 0.5% model estimated
>>> error, 28/2176 = 1.2% Decoy estimated error
>>> 6) X!Tandem and MSGF with iProphet: 2154 (1057), 1% model estimated
>>> error, 17/2154 = 0.8% Decoy estimated error
>>>
>>> b) I included a 5% FDR on peptide level after PeptideProphet, based on
>>> the model estimate:
>>>
>>> 1) X!Tandem only without iProphet: 1847 (569), 1% model estimated error,
>>> 13/1847 = 0.7% Decoy estimated error
>>> 2) X!Tandem only with iProphet: 1756 (848), 0.9% model estimated error,
>>> 5/1756 = 0.3% Decoy estimated error
>>> 3) MSGF only without iProphet: 2125 (647), 1% model estimated error,
>>> 29/2125 = 1.3% Decoy estimated error
>>> 4) MSGF only with iProphet: 1975 (875), 0.8% model estimated error,
>>> 8/1975 = 0.4% Decoy estimated error
>>> 5) X!Tandem and MSGF without iProphet: 2216 (655), 0.9% model estimated
>>> error, 37/2216 = 1.6% Decoy estimated error
>>> 6) X!Tandem and MSGF with iProphet: 2157 (1065), 1% model estimated
>>> error, 20/2157 = 0.9% Decoy estimated error
>>>
>>> c) I used the less redundant Swiss-Prot database:
>>>
>>> 1) X!Tandem only without iProphet: 1812 (419), 0.7% model estimated
>>> error, 15/1812 = 0.8% Decoy estimated error
>>> 2) X!Tandem only with iProphet: 1671 (752), 0.7% model estimated error,
>>> 8/1671 = 0.5% Decoy estimated error
>>> 3) MSGF only without iProphet: 2077 (602), 0.8% model estimated error,
>>> 0/2077 = 0% Decoy estimated error
>>> 4) MSGF only with iProphet: 1945 (855), 0.8% model estimated error,
>>> 0/1945 = 0% Decoy estimated error
>>> 5) X!Tandem and MSGF without iProphet: 2238 (622), 1% model estimated
>>> error, 51/2187 = 2.3% Decoy estimated error
>>> 6) X!Tandem and MSGF with iProphet: 2147 (1031), 0.9% model estimated
>>> error, 21/2144 = 1% Decoy estimated error
>>>
>>> My conclusions:
>>> - Shutting off the IPROPHET option in ProteinProphet lets the model
>>> underestimate the error. However, it is not far off.
>>> - Enabling the IPROPHET option in ProteinProphet gives a good error
>>> estimate, but one loses a lot of peptides from proteins that were
>>> identified by more than one peptide. As I understand the iProphet
>>> algorithm, such peptides should rather get a higher probability in
>>> iProphet.
>>>
>>> I also uploaded the data to my Dropbox and will send you the link.
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "spctools-discuss" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/spctools-discuss.
>> For more options, visit https://groups.google.com/d/optout.
