Re: [spctools-discuss] Less protein IDs after running iProphet

Alejandro Tue, 05 Dec 2017 06:47:23 -0800

Dear all,

I would like to reopen this discussion. I have been testing iProphet and 
have run into the same issue as Florian.


I am searching dimethyl-labeled samples with two static searches (heavy 
and light), using either Comet or X!Tandem. I then combine both searches 
(heavy and light) with PeptideProphet and run ProteinProphet, both "as is" 
and using the MPT for a 0.01 error in ProteinProphet. Next, I take the basic 
PeptideProphet results (run with P0.05) from Comet and X!Tandem and combine 
them using iProphet with default parameters, selecting ProteinProphet. 
I expected the number of IDs to increase, or at least to match what either 
search engine gives on its own. However, this is not the case. For example:

ProteinProphet results filtered to 0.01 error

Comet
1809 (238 single hits) = 1571

Tandem
1498 (152 single hits) = 1346

Comet and Tandem combined with iProphet
1623 (376) = 1247

Comet and Tandem combined with iProphet without NSP model

1717 (366) = 1351

So the single-peptide hits appear to increase when combining, and in the 
end fewer proteins are identified with more than one peptide.
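
In the lists above, the number in brackets is the count of single-peptide hits and the value after "=" is the total minus those hits. A minimal Python sketch of that arithmetic, using only the counts quoted in this message (the dictionary labels and the function name are made up for illustration and are not TPP output fields):

```python
# Protein counts quoted in this message: (total proteins, single-peptide hits).
# The labels are illustrative names chosen for this sketch.
counts = {
    "Comet": (1809, 238),
    "Tandem": (1498, 152),
    "iProphet": (1623, 376),
    "iProphet, no NSP": (1717, 366),
}


def multi_peptide(total, single_hits):
    """Proteins supported by more than one peptide."""
    return total - single_hits


multi = {name: multi_peptide(t, s) for name, (t, s) in counts.items()}

for name, (total, single) in counts.items():
    share = single / total  # fraction of IDs that rest on a single peptide
    print(f"{name:18s} total={total} single={single} ({share:.1%}) multi={multi[name]}")
```

Seen this way, the single-hit share is roughly 13% for Comet and 10% for Tandem, but roughly 23% after combining with iProphet (about 21% without the NSP model).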

When looking at the models of each search engine, the two distributions 
are well separated.

Furthermore, when looking at specific proteins, I found that peptides with 
a PeptideProphet probability above 0.9 (above my MPT for 0.01) in both 
Comet and X!Tandem are gone after combining with iProphet. Why could this 
be happening? Shouldn't these peptides get an even higher probability?

I hope someone can give me a hint on this.

Cheers,

Alejandro


On Friday, March 20, 2015 at 2:55:39 PM UTC+1, Florian wrote:
>
> Hej David,
>
> Sorry for my late response; I wanted to do some proper testing before 
> reporting back to you. I have now run the following tests, always using 
> the -Od option and checking manually for decoy hits:
>
> a) I reran the analysis that I posted above with these results:
>
> 1) X!Tandem only without iProphet: 1884 (557), 1% model estimated error, 
> 18/1884 = 0.95% Decoy estimated error
> 2) X!Tandem only with iProphet: 1760 (854), 0.9% model estimated error, 
> 6/1760 = 0.34% Decoy estimated error
> 3) MSGF only without iProphet: 2138 (632), 1% model estimated error, 
> 31/2138 = 1.4% Decoy estimated error
> 4) MSGF only with iProphet: 1975 (876), 0.8% model estimated error, 8/1975 
> = 0.4% Decoy estimated error
> 5) X!Tandem and MSGF without iProphet: 2176 (590), 0.5% model estimated 
> error, 28/2176 = 1.2% Decoy estimated error
> 6) X!Tandem and MSGF with iProphet: 2154 (1057), 1% model estimated error, 
> 17/2154 = 0.8% Decoy estimated error
>
> b) I included a 5% FDR on peptide level after PeptideProphet, based on the 
> model estimate:
>
> 1) X!Tandem only without iProphet: 1847 (569), 1% model estimated error, 
> 13/1847 = 0.7% Decoy estimated error
> 2) X!Tandem only with iProphet: 1756 (848), 0.9% model estimated error, 
> 5/1756 = 0.3% Decoy estimated error
> 3) MSGF only without iProphet: 2125 (647), 1% model estimated error, 
> 29/2125 = 1.3% Decoy estimated error
> 4) MSGF only with iProphet: 1975 (875), 0.8% model estimated error, 8/1975 
> = 0.4% Decoy estimated error
> 5) X!Tandem and MSGF without iProphet: 2216 (655), 0.9% model estimated 
> error, 37/2216 = 1.6% Decoy estimated error
> 6) X!Tandem and MSGF with iProphet: 2157 (1065), 1% model estimated error, 
> 20/2157 = 0.9% Decoy estimated error
>
> c) I used the less redundant Swiss-Prot database:
>
> 1) X!Tandem only without iProphet: 1812 (419), 0.7% model estimated error, 
> 15/1812 = 0.8% Decoy estimated error
> 2) X!Tandem only with iProphet: 1671 (752), 0.7% model estimated error, 
> 8/1671 = 0.5% Decoy estimated error
> 3) MSGF only without iProphet: 2077 (602), 0.8% model estimated error, 
> 0/2077 = 0% Decoy estimated error
> 4) MSGF only with iProphet: 1945 (855), 0.8% model estimated error, 0/1945 
> = 0% Decoy estimated error
> 5) X!Tandem and MSGF without iProphet: 2238 (622), 1% model estimated 
> error, 51/2187 = 2.3% Decoy estimated error
> 6) X!Tandem and MSGF with iProphet: 2147 (1031), 0.9% model estimated 
> error, 21/2144 = 1% Decoy estimated error
>
> My conclusions:
> - Shutting off the IPROPHET option in ProteinProphet causes the model to 
> underestimate the error. However, it is not far off.
> - Enabling the IPROPHET option in ProteinProphet gives a good error 
> estimate, but one loses a lot of peptides from proteins that were 
> identified by more than one peptide. As I understand the iProphet 
> algorithm, such peptides should instead get a higher probability in 
> iProphet.
>
> I also uploaded the data to my dropbox and will send you the link.
>
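
The decoy-based estimates in the quoted message are the number of decoy hits divided by the number of accepted proteins. A small Python sketch of that ratio for test series (a), with the figures copied from the quote (the function and variable names are illustrative assumptions, and this plain ratio applies no correction for database composition):

```python
def decoy_error(decoys, total):
    """Fraction of accepted protein IDs that are decoy hits."""
    if total <= 0:
        raise ValueError("total must be positive")
    return decoys / total


# (label, decoy hits, accepted proteins) from series (a) in the quoted message.
series_a = [
    ("X!Tandem, no iProphet", 18, 1884),
    ("X!Tandem, iProphet", 6, 1760),
    ("MSGF, no iProphet", 31, 2138),
    ("MSGF, iProphet", 8, 1975),
    ("X!Tandem+MSGF, no iProphet", 28, 2176),
    ("X!Tandem+MSGF, iProphet", 17, 2154),
]

rates = {label: decoy_error(d, n) for label, d, n in series_a}

for label, rate in rates.items():
    print(f"{label:28s} {rate:.2%}")
```

Putting each ratio next to the model-estimated error quoted for the same run makes it easier to see where the model under- or overestimates.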

