Valeriia,

That PXD003594 experiment has 76 raw files associated with it.  Was there a
subset of raw files that you analyzed here, or does your analysis include
all 76 runs?  Just for a quick test, I downloaded 4 raw files from that
experiment and searched them with Comet against the UniProt human database.
Here's a very basic summary:

   - b1369p080_sample_01_a.raw:  high res MS/MS, almost no IDs (less than
     100 positive PSMs)
   - b1369p601_DMSO_G1_B2_S10.RAW:  ion trap MS/MS, ~6000 PSMs at 1% error
     rate
   - b1369p601_GDC_G2_B3_S23.RAW:  ion trap MS/MS, ~6000 PSMs at 1% error
     rate
   - b1369p65_PP4_R.RAW:  ion trap MS/MS, ~500 PSMs at 1% error rate

So there are runs with thousands of good PSM IDs in them.  Note that the
4 raw files that I sampled contain a mix of high-res and low-res MS/MS
spectra, so hopefully you adjusted the fragment ion settings appropriately
for each raw file; my suggested parameter settings are shown below.  If
the fragment ion settings aren't the issue, feel free to follow up and
attach the contents of your comet.params file.

high-res:
   fragment_bin_tol = 0.02
   fragment_bin_offset = 0.0
   theoretical_fragment_ions = 0

low-res:
   fragment_bin_tol = 1.0005
   fragment_bin_offset = 0.4
   theoretical_fragment_ions = 1
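
Since the runs mix high-res and low-res MS/MS, one way to avoid hand-editing is to generate a params file per resolution class from a single template. The following is only a minimal sketch: it assumes comet.params consists of `name = value` lines with optional `#` comments, and the helper name `apply_fragment_settings` and the template text are hypothetical, not part of Comet.

```python
import re

# Fragment ion settings copied from the two blocks above.
FRAGMENT_SETTINGS = {
    "high-res": {
        "fragment_bin_tol": "0.02",
        "fragment_bin_offset": "0.0",
        "theoretical_fragment_ions": "0",
    },
    "low-res": {
        "fragment_bin_tol": "1.0005",
        "fragment_bin_offset": "0.4",
        "theoretical_fragment_ions": "1",
    },
}


def apply_fragment_settings(params_text, resolution):
    """Return params_text with the three fragment settings replaced.

    Assumes 'name = value' lines with optional trailing '#' comments
    (hypothetical helper; it does not validate any other parameter).
    """
    settings = FRAGMENT_SETTINGS[resolution]
    seen = set()
    out_lines = []
    for line in params_text.splitlines():
        match = re.match(r"^\s*(\w+)\s*=", line)
        if match and match.group(1) in settings:
            name = match.group(1)
            # Preserve any trailing comment on the original line.
            comment = "  #" + line.split("#", 1)[1] if "#" in line else ""
            line = f"{name} = {settings[name]}{comment}"
            seen.add(name)
        out_lines.append(line)
    # Append any of the three settings the template did not contain.
    for name, value in settings.items():
        if name not in seen:
            out_lines.append(f"{name} = {value}")
    return "\n".join(out_lines) + "\n"


# Made-up template fragment, for illustration only.
template = (
    "fragment_bin_tol = 1.0005   # binning\n"
    "fragment_bin_offset = 0.4\n"
    "peptide_mass_tolerance = 20.0\n"
)
high_res_params = apply_fragment_settings(template, "high-res")
```

Which raw files count as high-res or low-res still has to be determined per file; the sketch only swaps the three values once that is known.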

Jimmy

On Thu, Mar 13, 2025 at 5:02 PM Valeriia Vasylieva <vvs.ha...@gmail.com>
wrote:

> Hi.
> I ran Comet and PeptideProphet on two public datasets with TDC against
> UniProt. I calculated the q-value in Python from the fval distribution and
> filtered the data at a 1% threshold, like this:
> # df - PeptideProphet output
> df = df.sort_values(by='fval', ascending=False).reset_index(drop=True)
> # Calculate cumulative counts of targets and decoys
> df['cum_targets'] = (df['database'] == 'T').cumsum()
> df['cum_decoys'] = (df['database'] == 'D').cumsum()
> # Calculate FDR
> df['FDR'] = df['cum_decoys'] / df['cum_targets']
> # Cumulative minimum from bottom to top
> df['q-value'] = df['FDR'][::-1].cummin()[::-1]
>
> Below you can see the proportion of PSMs annotated as targets and decoys
> that passed the q-value threshold or not.  One of the datasets (PXD003594)
> has a very low number of identifications. It also has a wide distribution of
> decoys (on the graph the raw files are plotted together). Could anyone
> suggest what could have happened here? I used default parameters, only
> changing the peptide length range to 7-30 aa and the peptide mass range to
> 500.0-6000.0, and enabling methionine clipping.
> Thanks!
>
> [image: Capture9.PNG][image: Capture8.PNG]
>
>
> [image: Capture6.PNG][image: Capture7.PNG]
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to spctools-discuss+unsubscr...@googlegroups.com.
> To view this discussion visit
> https://groups.google.com/d/msgid/spctools-discuss/e8201f2e-0fd1-4c36-ac34-dbeada186d13n%40googlegroups.com
> <https://groups.google.com/d/msgid/spctools-discuss/e8201f2e-0fd1-4c36-ac34-dbeada186d13n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
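
For checking the q-value calculation quoted above by hand, here is the same logic written out step by step in plain Python. It is a sketch rather than a reference implementation: the function name `q_values` and the example scores are made up, and the FDR estimate is the simple decoys/targets ratio used in the quoted snippet, with one added guard against dividing by zero when the top-scoring hits are all decoys.

```python
def q_values(psms):
    """Compute FDR and q-value for (score, label) pairs.

    psms: list of (score, label) tuples, label 'T' (target) or 'D' (decoy).
    Returns (score, label, fdr, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    targets = decoys = 0
    fdrs = []
    for _, label in ranked:
        if label == "T":
            targets += 1
        elif label == "D":
            decoys += 1
        # Guard (not in the quoted snippet): avoid division by zero
        # while no target has been seen yet.
        fdrs.append(decoys / max(targets, 1))

    # Walk from the lowest score upward, carrying the running minimum FDR,
    # so each q-value is the smallest FDR at this rank or any lower rank.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(s, l, f, qv) for (s, l), f, qv in zip(ranked, fdrs, q)]


# Made-up example: five PSMs, two of them decoys.
rows = q_values([(5.0, "T"), (4.0, "T"), (3.0, "D"), (2.0, "T"), (1.0, "D")])
accepted = [r for r in rows if r[1] == "T" and r[3] <= 0.01]
```

In this toy input only the two top-scoring targets have a q-value of 0 and pass a 1% threshold; the remaining rows sit at or above 1/3.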
