Hello Brian, I think I see the reason for this XML output. Some SEQUEST *.out files have an uncommon format: the first hit spans multiple lines, so the second hit's information starts only after several lines of first-hit information.
I may be wrong, but your code probably assumes that the line directly below
the first hit is the second hit. Please see the attached SEQUEST output file,
which produced the XML output below (also attached). Note that the XML
output also shows wrong information for the second hit.
Please see the attachments for more details.
*************************
<spectrum_query spectrum="2009_0813_04.11113.11113.1" start_scan="11113"
end_scan="11113" precursor_neutral_mass="604.8877" assumed_charge="1"
index="1123">
<search_result>
<search_hit hit_rank="1" peptide="SSEER" peptide_prev_aa="-"
peptide_next_aa="-" protein="gi|91206454|ref|NP_001035146.1|"
num_tot_proteins="13" num_matched_ions="5" tot_num_ions="8"
calc_neutral_pep_mass="606.2609" massdiff="-1.373190" num_tol_term="2"
num_missed_cleavages="0" is_rejected="0">
<search_score name="xcorr" value="0.300"/>
<search_score name="deltacn" value="0.050"/>
<search_score name="deltacnstar" value="0.000"/>
<search_score name="spscore" value="43.9"/>
<search_score name="sprank" value="6"/>
</search_hit>
<search_hit hit_rank="2" peptide="SSEER" peptide_prev_aa="
is_rejected="0">
<search_score name="xcorr" value="0.000"/>
<search_score name="deltacn" value="0.050"/>
<search_score name="deltacnstar" value="0.000"/>
<search_score name="spscore" value="0.0"/>
<search_score name="sprank" value="0"/>
</search_hit>
</search_result>
</spectrum_query>
*****************************************************
Thanks,
~Nikhil Garge.
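
For reference, the substitution Brian suggests in the quoted message below could also be done in Python instead of sed. This is only a sketch: the function name repair_pepxml_text is hypothetical, and it assumes the only defect is an unterminated peptide_prev_aa attribute followed directly by is_rejected, as in the output above. It makes the file parseable again but does not restore the missing second-hit data.

```python
import re
import xml.etree.ElementTree as ET

# Matches peptide_prev_aa=" followed only by whitespace (possibly a line
# break) and then is_rejected, i.e. the truncated form seen in the output.
BROKEN_ATTR = re.compile(r'peptide_prev_aa="(\s+)is_rejected')


def repair_pepxml_text(text):
    """Close the empty peptide_prev_aa attribute, keeping the whitespace."""
    return BROKEN_ATTR.sub(r'peptide_prev_aa=""\1is_rejected', text)


if __name__ == "__main__":
    broken = (
        '<search_hit hit_rank="2" peptide="SSEER" peptide_prev_aa="\n'
        'is_rejected="0">\n'
        '</search_hit>'
    )
    fixed = repair_pepxml_text(broken)
    # The repaired element should now parse without a ParseError.
    hit = ET.fromstring(fixed)
    print(hit.attrib)
```

Well-formed hits such as peptide_prev_aa="-" are left untouched, because the pattern only matches when the opening quote is followed by whitespace and is_rejected.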
On Fri, Sep 4, 2009 at 12:08 PM, Brian Pratt <[email protected]> wrote:
> You could work around that with sed, replacing all 'peptide_prev_aa="
> is_rejected' with 'peptide_prev_aa="" is_rejected'.
>
> But it's very strange that you see this at all. I don't see anywhere in
> the code that would write peptide_prev_aa but not peptide_next_aa and the
> other attributes.
>
> On Fri, Sep 4, 2009 at 7:23 AM, nik <[email protected]> wrote:
>
>>
>> Hello,
>>
>> I generated *.pep.xml files from SEQUEST out files using OUT2XML.exe.
>> I generate output for the first two hits. However, for some search hits, I
>> see the problem explained below.
>>
>> For one spectrum it generated the following line for hit 2:
>>
>> <search_hit hit_rank="2" peptide="DNIQGITKPAIRR" peptide_prev_aa="
>> is_rejected="0">
>>
>> Because of such incorrect lines the XML reader cannot read the files
>> correctly (through Java or C#). Do you know how to fix such error
>> lines?
>>
>> Thanks,
>> ~Nikhil.
>>
>> Complete output as shown below:
>>
>> *************************
>> <spectrum_query spectrum="2009_0813_04.10119.10119.2"
>> start_scan="10119" end_scan="10119" precursor_neutral_mass="1481.4557"
>> assumed_charge="2" index="139">
>> <search_result>
>> <search_hit hit_rank="1" peptide="DNIQGITKPAIRR"
>> peptide_prev_aa="-" peptide_next_aa="-" protein="gi|28173560|ref|
>> NP_778224.1|" num_tot_proteins="14" num_matched_ions="14"
>> tot_num_ions="24" calc_neutral_pep_mass="1480.8475"
>> massdiff="+0.608260" num_tol_term="2" num_missed_cleavages="0"
>> is_rejected="0">
>> <search_score name="xcorr" value="0.937"/>
>> <search_score name="deltacn" value="0.079"/>
>> <search_score name="deltacnstar" value="0.000"/>
>> <search_score name="spscore" value="249.4"/>
>> <search_score name="sprank" value="19"/>
>> </search_hit>
>> <search_hit hit_rank="2" peptide="DNIQGITKPAIRR"
>> peptide_prev_aa=" is_rejected="0">
>> <search_score name="xcorr" value="0.000"/>
>> <search_score name="deltacn" value="0.079"/>
>> <search_score name="deltacnstar" value="0.000"/>
>> <search_score name="spscore" value="0.0"/>
>> <search_score name="sprank" value="0"/>
>> </search_hit>
>> </search_result>
>> </spectrum_query>
>> *****************************************************
>>
>>
>>
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"spctools-discuss" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/spctools-discuss?hl=en
-~----------~----~----~----~------~----~------~--~---
