For info, I've reworked the .pep.xml parsing into a single pass, and on a very small test dataset the ASAPRatioProteinRatioParser run-time is down from 19.8s to 0.8s. I'd expect the speed-up on a large dataset to be greater, due to the compound effect of more protein groups and a bigger .pep.xml.

Will tidy up the code and test thoroughly next week, and then share it.

Cheers,

DT

Brian Pratt wrote:
Nothing to add!  Dave has the right idea for a performance fix, and Natalie is 
correct about TPP being addicted to Boost already.

Brian

On Thu, Feb 4, 2010 at 2:37 PM, Natalie Tasman <[email protected]> wrote:
Hi Dave, Brian,

Just jumping in to comment on Boost.  Yes, it is welcome and in fact already 
used in the TPP (as well as the TPP-included ProteoWizard project); however, 
because the Boost API and the process of building the Boost libraries have not 
been particularly stable, we've found it necessary to fix the version of Boost 
that we work with.  Our current version is 1.39.0.  As long as you can test 
against that, your contributions would be fine, and no doubt very welcome.

-Natalie



On 2/4/10 2:26 PM, Dave Trudgian wrote:
Brian,

Yup. I just discovered this too, as per the other post. On our servers it's not 
disk-bound, as the 1.6GB .pep.xml is fully cached, but the continued repeated 
parsing still slows things down.

Re. solutions for this: is Boost code welcome in the main TPP tools? I think 
I can rewrite it using a single-pass parse of the .pep.xml into a 
Boost.MultiIndex hash of structs/classes containing the required peptide info.
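
For illustration only, the single-pass idea could look roughly like the sketch below. It is not the actual TPP change. It uses std::unordered_multimap from the standard library as a stand-in for Boost.MultiIndex, and it reads a simplified tab-separated input ("protein<TAB>peptide<TAB>ratio", one hit per line) rather than real pepXML. The PeptideHit struct, its fields, and the function names build_index and hits_for are all hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record holding only the fields a protein-level step might need.
struct PeptideHit {
    std::string protein;
    std::string peptide;
    double ratio;
};

// Hash index from protein name to every peptide hit assigned to that protein.
using PeptideIndex = std::unordered_multimap<std::string, PeptideHit>;

// Read the input exactly once and index each hit by protein name.
// Assumption: one hit per line, "protein<TAB>peptide<TAB>ratio".
// Lines that do not match this layout are skipped.
PeptideIndex build_index(std::istream& in) {
    PeptideIndex index;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        PeptideHit hit;
        if (std::getline(fields, hit.protein, '\t') &&
            std::getline(fields, hit.peptide, '\t') &&
            (fields >> hit.ratio)) {
            index.emplace(hit.protein, hit);
        }
    }
    return index;
}

// Collect all hits for one protein from the in-memory index,
// without touching the input file again.
std::vector<PeptideHit> hits_for(const PeptideIndex& index,
                                 const std::string& protein) {
    std::vector<PeptideHit> out;
    auto range = index.equal_range(protein);
    for (auto it = range.first; it != range.second; ++it) {
        out.push_back(it->second);
    }
    return out;
}
```

With an index like this, each protein group costs one hash lookup instead of another scan of the file, so the file is read once in total rather than once per group.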

DT

Brian Pratt wrote:
Looking at the code I can see where this would easily become disk-bound for 
large data sets - it reads and rereads the same pepXML files repeatedly, but 
the effect is probably masked by disk caching up to a certain point.  Somebody 
would need to write some logic for caching the file contents to speed this up 
properly.

Brian

On Thu, Feb 4, 2010 at 12:09 PM, Jake W <[email protected]> wrote:

   I've seen the same thing.  On datasets where
   ASAPRatioPeptideRatioParser completes in a few minutes,
   ASAPRatioProteinRatioParser can take an hour or so.  This is on a
   Windows machine running TPP ver. 4.3.0.

   Jake

   --
   You received this message because you are subscribed to the Google
   Groups "spctools-discuss" group.
   To post to this group, send email to
   [email protected].
   To unsubscribe from this group, send email to
   [email protected].
   For more options, visit this group at
   http://groups.google.com/group/spctools-discuss?hl=en.


--
Dr. David Trudgian
Bioinformatician in Proteomics
University of Oxford

Mon-Thu: CCMP, Roosevelt Drive
Tel: (+44) (01865 2)87784

Friday : Dunn School of Pathology, S. Parks Rd.
Tel: (+44) (01865 2)75557


