[spctools-discuss] Re: Tools for search results comparisons

alastair.skeffington via spctools-discuss Wed, 29 Aug 2018 02:14:59 -0700

I've written some scripts to perform various sorts of comparisons directly 
on the pep.xml files without recourse to other tools. If there's any 
interest in these, please let me know and I can make them more 
user-friendly. 
BW,
Alastair
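
For illustration only (this is not the scripts mentioned above), a minimal sketch of one comparison done directly on pep.xml files: counting, per spectrum, how many of the top N peptide hits two searches share. It assumes the usual pepXML layout of spectrum_query elements containing search_hit elements with hit_rank and peptide attributes, and that both searches name their spectra identically. The function names top_hits and shared_top_n are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def top_hits(source, n=5):
    """Map spectrum name -> set of peptides among hits with hit_rank <= n.

    `source` is a pep.xml file name or a binary file object.
    Assumption: hits without a hit_rank attribute are treated as rank 1.
    """
    hits = {}
    for _, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "spectrum_query":
            continue
        peptides = set()
        for child in elem.iter():
            if local_name(child.tag) != "search_hit":
                continue
            if int(child.get("hit_rank", "1")) <= n:
                peptides.add(child.get("peptide"))
        hits[elem.get("spectrum")] = peptides
        elem.clear()  # drop processed children; pep.xml files can be large
    return hits


def shared_top_n(hits_a, hits_b):
    """For spectra present in both searches, count the shared top-N peptides."""
    common = hits_a.keys() & hits_b.keys()
    return {s: len(hits_a[s] & hits_b[s]) for s in common}
```

Calling top_hits on each file with the same n and passing both results to shared_top_n gives one count per spectrum. The peptide attribute holds the unmodified sequence, so this sketch ignores modifications.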


On Friday, 20 July 2018 at 17:28:52 UTC+2, 
[email protected] wrote:
>
> Hello,
>
> I'm working with some large protein databases derived from PacBio 
> sequencing projects which contain quite a lot of redundancy (different 
> isoforms and variants of the same gene / protein). I've noticed that the 
> protein level identifications I get are heavily dependent on the search 
> engine or combination of search engines I use. It concerns me, for example, 
> when a protein identified rather confidently (e.g. tens of PSMs, >5 peptides) 
> with one search engine doesn't appear in the list at all with another 
> search engine - even as a subset protein in a group. 
>
> To understand better how the particular nature of these databases is 
> affecting the protein identifications, I want to compare the search results 
> in various ways. Before I invest time writing my own solutions, I wanted to 
> know if anyone knows of any software that can compute metrics such as the 
> following when comparing search results:
>
> 1) For each spectrum in the PSM searches, how many of the top N PSMs are 
> shared between the searches?
> 2) Proportion of all lead proteins (top protein from a group) that are the 
> same
> 3) Proportion of IDs that are the same when considering all group members
> 4) More complex measures of group similarity, e.g. the proportion of groups 
> where >X% of members are grouped together in the other search, and 
> how many of these equivalent groups contain the lead protein from the other 
> group
> 5) Other things you can think of that might be useful
>
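
Metrics 2 to 4 above can be sketched on plain Python data once the protein groups have been extracted. This is a minimal sketch under stated assumptions, not an established tool: each search is assumed to be a list of non-empty groups, each group a list of protein IDs with the lead protein first; "proportion that are the same" is interpreted as a Jaccard index; and for metric 4 a group counts as matched when more than min_fraction of its members fall in a single group of the other search. All function names are hypothetical.

```python
def jaccard(set_a, set_b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def lead_overlap(groups_a, groups_b):
    """Metric 2: overlap of lead proteins (first member of each group)."""
    return jaccard({g[0] for g in groups_a}, {g[0] for g in groups_b})


def member_overlap(groups_a, groups_b):
    """Metric 3: overlap of all proteins, counting every group member."""
    all_a = {p for g in groups_a for p in g}
    all_b = {p for g in groups_b for p in g}
    return jaccard(all_a, all_b)


def matched_groups(groups_a, groups_b, min_fraction=0.5):
    """Metric 4: (proportion of A groups matched in B, matched groups with B's lead).

    An A group is matched when more than `min_fraction` of its members sit
    together in one B group. The second value counts matched A groups that
    contain the lead protein of that B group.
    """
    if not groups_a or not groups_b:
        return 0.0, 0
    sets_b = [(set(g), g[0]) for g in groups_b]
    matched = 0
    with_lead = 0
    for group in groups_a:
        members = set(group)
        # B group sharing the most members with this A group
        best, best_lead = max(sets_b, key=lambda item: len(members & item[0]))
        if len(members & best) / len(members) > min_fraction:
            matched += 1
            if best_lead in members:
                with_lead += 1
    return matched / len(groups_a), with_lead
```

matched_groups is asymmetric, so running it in both directions (A against B and B against A) gives the fuller picture.
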
> I saw one tool called compid 
> (https://pubs.acs.org/doi/pdf/10.1021/pr100824w), but it only works with 
> MASCOT and PARAGON results, and the download is broken anyway.
>
> Are there any other solutions around?
>
> To work with the data myself I'd need to convert the pep.xml and prot.xml 
> data to tsv format. Obviously Petunia does this beautifully, but I'd rather 
> be able to process multiple files on the Linux command line if possible. I 
> saw an old post in this group about this issue and they suggested RpepXML 
> (I don't know if it will work for prot.xml) or tppXMLparser (which isn't 
> flexible in terms of the fields it can output). Are these the most current 
> solutions to the problem or is there something more recent that I've missed?
>
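
For the pep.xml part of the conversion, a minimal sketch using only the Python standard library is shown here. It is not a replacement for RpepXML or tppXMLparser. It assumes the usual pepXML layout (spectrum_query elements containing search_hit elements, with an optional peptideprophet_result carrying a probability attribute); the column selection and the function name pepxml_to_tsv are hypothetical. prot.xml is not handled.

```python
import csv
import xml.etree.ElementTree as ET

# Assumed column selection; extend as needed.
COLUMNS = ["spectrum", "assumed_charge", "hit_rank", "peptide", "protein", "probability"]


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def pepxml_to_tsv(source, out):
    """Write one tab-separated row per search_hit; return the number of rows.

    `source` is a pep.xml file name or binary file object, `out` a text stream.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    rows = 0
    for _, query in ET.iterparse(source, events=("end",)):
        if local_name(query.tag) != "spectrum_query":
            continue
        for hit in query.iter():
            if local_name(hit.tag) != "search_hit":
                continue
            probability = ""  # stays empty if PeptideProphet was not run
            for node in hit.iter():
                if local_name(node.tag) == "peptideprophet_result":
                    probability = node.get("probability", "")
            writer.writerow([
                query.get("spectrum", ""),
                query.get("assumed_charge", ""),
                hit.get("hit_rank", ""),
                hit.get("peptide", ""),
                hit.get("protein", ""),
                probability,
            ])
            rows += 1
        query.clear()  # free memory once a spectrum has been written
    return rows
```

To process several files from the command line, one option is to loop over the file names in sys.argv and call pepxml_to_tsv once per file, writing each result to its own .tsv.
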
> Any ideas / thoughts / help would be very much appreciated!!
>
> Thanks,
>
> Alastair
>

