Thibault,
I have a question about how you are generating the mzML files for
searching. What tool are you using for this task, and do you have the raw
file available for me to test?
I have noticed that your files seem to be missing the MS1 scans. That
alone is fine, but to pull up the correct spectrum from the file we need
either its scan number or its index. It appears that the scan numbers are
not listed in the index at the end of your mzML file. The tail end of an
mzML file should look like this:
<offset idRef="controllerType=0 controllerNumber=1 scan=51126">481306970</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51127">481314790</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51128">481322971</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51129">481330551</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51130">481339052</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51131">481347643</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51132">481355682</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51133">481363528</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51134">481370856</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51135">481379107</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51136">481386761</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51137">481395077</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51138">481402922</offset>
<offset idRef="controllerType=0 controllerNumber=1 scan=51139">481410705</offset>
</index>
<index name="chromatogram">
<offset idRef="TIC">481418736</offset>
</index>
</indexList>
<indexListOffset>482010351</indexListOffset>
<fileChecksum>567fb08dd79b57d085a11680b48d666bdeacce73</fileChecksum>
</indexedmzML>
The tail end of your mzML file looks like:
<offset idRef="index=26193">437068879</offset>
<offset idRef="index=26194">437074577</offset>
<offset idRef="index=26195">437079014</offset>
<offset idRef="index=26196">437084952</offset>
<offset idRef="index=26197">437091489</offset>
<offset idRef="index=26198">437096988</offset>
<offset idRef="index=26199">437103093</offset>
<offset idRef="index=26200">437108018</offset>
<offset idRef="index=26201">437113514</offset>
<offset idRef="index=26202">437120031</offset>
<offset idRef="index=26203">437126116</offset>
<offset idRef="index=26204">437131854</offset>
<offset idRef="index=26205">437137939</offset>
<offset idRef="index=26206">437144269</offset>
<offset idRef="index=26207">437151494</offset>
<offset idRef="index=26208">437156976</offset>
<offset idRef="index=26209">437161858</offset>
</index>
<index name="chromatogram">
</index>
</indexList>
<indexListOffset>437167166</indexListOffset>
<fileChecksum>1fdc7999912fcb3e83a93d74c06f03dc3695005c</fileChecksum>
</indexedmzML>
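If you want to check other files for the same difference, here is a minimal Python sketch. It is not part of the TPP; the function names (`classify_id_refs`, `read_tail`) and the 64 KB tail size are my own assumptions for illustration. It only counts how many `<offset>` entries in the tail of a file carry a `scan=` number, how many are plain `index=` references, and how many are something else (such as the `TIC` chromatogram entry).

```python
import os
import re

# One <offset idRef="...">N</offset> entry; the idRef may wrap across lines.
OFFSET_RE = re.compile(r'<offset\s+idRef="([^"]*)"\s*>(\d+)</offset>')
SCAN_RE = re.compile(r'\bscan=(\d+)')
INDEX_RE = re.compile(r'index=\d+')


def classify_id_refs(tail_text):
    """Count offset entries by the kind of idRef they carry."""
    counts = {"scan": 0, "index": 0, "other": 0}
    for match in OFFSET_RE.finditer(tail_text):
        id_ref = match.group(1)
        if SCAN_RE.search(id_ref):
            counts["scan"] += 1
        elif INDEX_RE.fullmatch(id_ref.strip()):
            counts["index"] += 1
        else:
            counts["other"] += 1
    return counts


def read_tail(path, n_bytes=65536):
    """Read the last n_bytes of a file as text (hypothetical helper)."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        size = handle.tell()
        handle.seek(max(0, size - n_bytes))
        return handle.read().decode("utf-8", errors="replace")
```

Calling `classify_id_refs(read_tail("your_file.mzML"))` on a file like yours should report only `index` entries, while a file with scan numbers in its index should report `scan` entries plus the chromatogram entry under `other`.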
In your mzML file, the TPP can only identify your spectra correctly by their
index (1-based). This suggests that even the old Jackhammer version of the
X!Tandem pipeline will not be able to extract the correct scan from the
mzML file.
For example, Tandem refers to the spectrum scan=9401 with a zero-based index
of 5056.
In the Jackhammer pepXML this spectrum is encoded as:
<spectrum_query spectrum="20131226_HeLa_bRP01_120min.09401.09401.2"
start_scan="9401" end_scan="9401" precursor_neutral_mass="903.3941"
assumed_charge="2" index="5057" retention_time_sec="1966.72">
Note that the 1-based index is 5057, i.e. 5056+1.
To be able to extract this scan from your mzML file as it stands,
this spectrum should be encoded in pepXML with scan number 5057, i.e.:
<spectrum_query spectrum="20131226_HeLa_bRP01_120min.05057.05057.2"
start_scan="5057" end_scan="5057" precursor_neutral_mass="903.3941"
assumed_charge="2" index="1" retention_time_sec="1966.722">
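To make the arithmetic explicit, here is a small Python sketch of how the second form of the spectrum name follows from the zero-based index. This is not the Tandem2XML code; the function names are hypothetical, and the five-digit zero padding is only inferred from the two `spectrum_query` examples above.

```python
def one_based_index(zero_based_index):
    """Convert a zero-based spectrum index to the 1-based one."""
    return zero_based_index + 1


def spectrum_name(base_name, scan, charge):
    """Build a 'base.start.end.charge' spectrum name.

    Assumption: start and end scan are equal and zero-padded to
    five digits, as in the examples in this message.
    """
    return f"{base_name}.{scan:05d}.{scan:05d}.{charge}"


# Name built from the index (what your mzML file can currently support).
by_index = spectrum_name("20131226_HeLa_bRP01_120min", one_based_index(5056), 2)

# Name built from the real scan number (what Jackhammer wrote).
by_scan = spectrum_name("20131226_HeLa_bRP01_120min", 9401, 2)
```

Here `by_index` corresponds to the second `spectrum_query` above and `by_scan` to the first.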
I have added some new code to Tandem2XML that lets it refer to the
spectrum in either the first or the second form, depending on options the
user sets, but I want to first address the issue with the mzML file. Once I
understand how your mzML file came to be, I can recommend the best way
forward for properly processing this type of data.
Thanks,
-David
On Wed, Jan 31, 2018 at 11:31 PM, Thibault Robin <[email protected]> wrote:
> Dear David,
>
> Here is the tandem file produced using the tpp version of X!Tandem:
> https://www.dropbox.com/s/nm3lgs540urpl6o/20131226_HeLa_bRP01_120min.tandem?dl=0
>
> Hope it helps and thank you for your time.
>
> Cheers,
>
> Thibault
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/spctools-discuss.
> For more options, visit https://groups.google.com/d/optout.
>