Hi Joe,
Msconvert and Trapper use the same Agilent API. But Trapper is a
dedicated converter, so it doesn't need to support random access. That
means it can skip the initial enumeration, which accounts for the large
difference in run time. Pwiz is not just about conversion though.
Pwiz's SeeMS tool can directly view raw spectra like MassHunter does
(except for that blasted initialization time on profile data!).
-Matt
Joe Slagel wrote:
Matt,
Thanks for the explanation. Does this mean that trapper isn't using
the Agilent API? (Asking the naive trapper user question)
-Joe
On Tue, Feb 16, 2010 at 1:26 PM, Matthew Chambers wrote:
The delay in start up time for Agilent is a known issue.
Unfortunately the current Agilent API doesn't provide a way to get
the list of scanIds in a data file, nor a way to get a spectrum's
scanId without getting its data arrays. ProteoWizard is designed
to support random access by nativeID and by index to all of the
data formats it supports, so it has to enumerate all the spectra
up front in order to get each of their scanIds. On profile data,
that takes a frustrating amount of time.
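To illustrate the trade-off described above, here is a minimal Python
sketch. It is not ProteoWizard, Trapper, or Agilent code:
FakeVendorReader, load, convert_streaming, build_index, and
convert_indexed are hypothetical names. The only property modeled is
the one stated above, that a scanId can only be obtained by loading
the whole spectrum. The two-pass structure of convert_indexed is an
assumption made for illustration.

```python
from typing import Dict, Iterator, List, NamedTuple


class Spectrum(NamedTuple):
    scan_id: int
    mz: List[float]
    intensity: List[float]


class FakeVendorReader:
    """Hypothetical stand-in for a vendor reader: a scan ID is only
    available by loading the whole spectrum, data arrays included."""

    def __init__(self, scan_ids: List[int]) -> None:
        self._scan_ids = scan_ids
        self.loads = 0  # counts full spectrum loads (the expensive step)

    def __len__(self) -> int:
        return len(self._scan_ids)

    def load(self, row: int) -> Spectrum:
        self.loads += 1
        # Dummy data arrays; a real profile spectrum would be far larger.
        return Spectrum(self._scan_ids[row], [100.0, 200.0], [1.0, 2.0])


def convert_streaming(reader: FakeVendorReader) -> Iterator[Spectrum]:
    """Sequential conversion: one load per spectrum, so output can
    start with the very first spectrum."""
    for row in range(len(reader)):
        yield reader.load(row)


def build_index(reader: FakeVendorReader) -> Dict[int, int]:
    """Random-access support: load every spectrum once up front just
    to learn its scan ID, and map scan ID -> row."""
    return {reader.load(row).scan_id: row for row in range(len(reader))}


def convert_indexed(reader: FakeVendorReader) -> Iterator[Spectrum]:
    """Conversion on top of the index: nothing is yielded until the
    whole file has been enumerated."""
    index = build_index(reader)      # the start-up delay
    for row in index.values():       # second pass produces the output
        yield reader.load(row)
```

In this sketch the streaming path performs one load per spectrum,
while the indexed path performs a full pass before the first spectrum
comes out and then loads each spectrum again to write it.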
I have a feature request pending with Agilent to get a function
which provides either a list of scanIds or a spectrum without
metadata.
Thanks for doing the comparison. I agree with Dave that the
conversion time sounds pretty long in both cases and I suspect a
network share.
-Matt
bio.x2y wrote:
Hi,
I independently used both Trapper (4.3.1) and Msconvert (pwiz 1.6.0)
to convert a 1.4 GB Agilent MassHunter ".d" file containing ~16000
spectra to mzXML.
Parameters:
$msconvert --mzXML --verbose large.d
$trapper --mzXML -v large.d large.mzXML
Msconvert took 5 hr 38 min to complete, generating a 57 GB file.
Trapper took 1 hr 26 min to complete, generating a 28 GB file.
I can understand the differences in size, given the differences in
structure and precision. However, the time difference still appears
quite high.
One interesting observation is that msconvert does not start writing
to the output file until 1hr 20min has elapsed. At that point, the
file begins filling and the progress messages start appearing in the
output. Trapper, on the other hand, starts filling the output file and
reporting progress immediately.
I have seen this occur now for two runs on two different days, so I
don't think it's related to other activity on the machine.
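The timings reported above can be broken down with a little
arithmetic. The short Python sketch below uses only the figures quoted
in this message (total run times, the delay before msconvert's first
write, and the output sizes, taken as gigabytes). The variable names
and the helper minutes() are illustrative, and the breakdown assumes
the whole delay before the first write is start-up work.

```python
def minutes(hours: int, mins: int) -> int:
    """Convert an hours/minutes duration to whole minutes."""
    return hours * 60 + mins


msconvert_total = minutes(5, 38)     # 338 min, full msconvert run
msconvert_startup = minutes(1, 20)   # 80 min before the first write
trapper_total = minutes(1, 26)       # 86 min, full Trapper run

# Time msconvert spends after output starts appearing.
msconvert_writing = msconvert_total - msconvert_startup   # 258 min

# Fraction of the msconvert run spent before any output.
startup_share = msconvert_startup / msconvert_total

# Writing-phase time relative to the whole Trapper run.
writing_ratio = msconvert_writing / trapper_total

# The output files differ in size, so also compare minutes per GB.
msconvert_min_per_gb = msconvert_writing / 57
trapper_min_per_gb = trapper_total / 28

print(f"start-up share of msconvert run: {startup_share:.0%}")
print(f"msconvert writing phase vs Trapper total: {writing_ratio:.1f}x")
print(f"minutes per GB written: msconvert {msconvert_min_per_gb:.1f}, "
      f"Trapper {trapper_min_per_gb:.1f}")
```

These numbers come from a single pair of runs, so they are only a
rough guide to where the time goes.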
Perhaps msconvert is engaging in some preprocessing that isn't
strictly necessary, for Agilent ".d" files at least?
Thanks,
bio.x2y
--
You received this message because you are subscribed to the Google Groups
"spctools-discuss" group.