Thanks for the explanation.
If I understand correctly, the protein_group probability is computed using
all the peptides from the subgroups, while the protein subgroup probability
uses only the peptides from that subgroup (both unique and non-unique).

However, what about the case where the protein_group probability is 1.0
even though all the individual subgroup probabilities are 0?
In this case the sibling groups have several entries in the
"unique_stripped_peptides" attribute but total_number_peptides="0".
Any explanation?
In general, which is better: filtering by the protein_group
probability or by the probability of each subgroup?
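
For anyone who wants to look for such cases, here is a minimal sketch (not
part of the TPP) that lists each protein_group with its subgroup
probabilities and flags groups whose own probability passes a cutoff while
none of their subgroups do. Only the element and attribute names are taken
from the prot.xml excerpt quoted below; the sample document, the names
PROT_A/PROT_B/PROT_C, the functions summarize_groups and
flag_discrepancies, and the 0.9 cutoff are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real prot.xml has many more attributes and elements.
SAMPLE = """
<protein_summary>
  <protein_group group_number="1" probability="1.0000">
    <protein protein_name="PROT_A" probability="1.0000"
             group_sibling_id="a" total_number_peptides="8"/>
  </protein_group>
  <protein_group group_number="2" probability="1.0000">
    <protein protein_name="PROT_B" probability="0.0000"
             group_sibling_id="a" total_number_peptides="0"/>
    <protein protein_name="PROT_C" probability="0.0000"
             group_sibling_id="b" total_number_peptides="0"/>
  </protein_group>
</protein_summary>
"""


def local(tag):
    """Drop an XML namespace prefix such as '{uri}protein' -> 'protein'."""
    return tag.rsplit("}", 1)[-1]


def summarize_groups(xml_text):
    """Return (group_number, group_probability, {sibling_id: probability})."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        if local(elem.tag) != "protein_group":
            continue
        group_p = float(elem.get("probability", "0"))
        subs = {}
        for child in elem:
            if local(child.tag) == "protein":
                sibling = child.get("group_sibling_id")
                subs[sibling] = float(child.get("probability", "0"))
        rows.append((elem.get("group_number"), group_p, subs))
    return rows


def flag_discrepancies(rows, threshold=0.9):
    """Group numbers that pass the cutoff although no subgroup does."""
    return [
        number
        for number, group_p, subs in rows
        if group_p >= threshold and subs and max(subs.values()) < threshold
    ]


rows = summarize_groups(SAMPLE)
for number, group_p, subs in rows:
    print(number, group_p, subs)
print("flagged:", flag_discrepancies(rows))
```

Filtering on the group probability would keep both groups in this sample,
while filtering on the subgroup probabilities would keep only group 1, so
the flagged list is exactly where the two choices disagree.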

Thanks.
LW


On Aug 13, 10:02 am, GATTACA <[email protected]> wrote:
> So in these cases, all the proteins in the protein group share at
> least one peptide.
> The different sub groups occur because certain "clusters" of proteins
> share peptides that are specific to the cluster.
>
> As an example, imagine a group that consists of 3 sibling groups: a, b,
> and c. All of the protein identifiers in the group correspond to
> Histones. Sibling group 'a' contains peptides that are unique to
> Histone2A, while sibling group 'b' contains Histone3 and sibling group
> 'c' has Histone4A.
>
> All 3 sibling groups share at least some peptides in common, but each
> sibling group also has some peptides unique to itself.
>
> Because peptide probabilities in ProteinProphet are adjusted based
> upon the number of sibling peptides (nsp) and how the peptides are
> shared among various proteins (wt), the probability for a sibling group
> can be different from the probability of the group as a whole.
>
> I don't know how clear that is, but that's my attempt at explaining
> it.
>
> On Aug 12, 7:40 pm, LW <[email protected]> wrote:
>
> > Hi,
>
> > I have a question on the prot.xml. It seems like each protein group
> > has a probability and each
> > subgroup (those with group_sibling_id="a", "b", etc.) also has a
> > probability.
>
> > <protein_group group_number="1" probability="1.0000">
> > <protein protein_name="DECOY_40330" n_indistinguishable_proteins="1"
> > probability="1.0000" percent_coverage="2.9"
> > unique_stripped_peptides="LMVSNQFK+NMMTIETNSSTSVVSPRASTAR"
> > group_sibling_id="a" total_number_peptides="8"
> > pct_spectrum_ids="0.019" confidence="0.004">
>
> > How is the probability for the protein_group determined? I came across
> > cases where all the
> > subgroup probabilities are 0 but the protein_group probability is
> > 1.0.
> > How do I explain this?
>
> > Thanks,
> > LW
