Hi Andreas,

I suspect this problem affected only your sequest search, because of
the offending spectrum I pointed out.  I've corrected the PeptideProphet
source so that in the future it will simply print a warning and
ignore spectra where the encoded charge doesn't match the assumed charge.
For now you can get your sequest analysis to go through correctly by
removing the offending spectrum.  I still need to debug the pepXML tool
that generated the file to solve the underlying converter issue.  Do you
have the .out files available?
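
To look for other entries of this kind before rerunning, something along
these lines may help.  It is only a minimal sketch and not part of the TPP:
the function names are made up for illustration, the sample XML is a
cut-down fragment built from the spectrum names in this thread (the
assumed_charge="3" on the first entry is an assumption), and it assumes the
charge is the all-digit token after the last '.' in the spectrum name.

```python
import io
import xml.etree.ElementTree as ET


def encoded_charge(spectrum_name):
    """Return the charge encoded after the last '.' of a spectrum name, or None."""
    suffix = spectrum_name.rsplit(".", 1)[-1]
    return int(suffix) if suffix.isdigit() else None


def find_charge_mismatches(source):
    """Yield (spectrum, assumed_charge, encoded_charge) for suspicious entries.

    `source` is a file path or a binary file-like object containing pepXML.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Compare on the local tag name so the check works whether or not
        # the document declares an XML namespace.
        if elem.tag.rsplit("}", 1)[-1] != "spectrum_query":
            continue
        name = elem.get("spectrum", "")
        assumed = elem.get("assumed_charge", "")
        assumed_z = int(assumed) if assumed.isdigit() else None
        encoded_z = encoded_charge(name)
        if assumed_z is None or encoded_z is None or assumed_z != encoded_z:
            yield name, assumed_z, encoded_z
        elem.clear()  # release memory when scanning large files


# Hypothetical, heavily trimmed sample; real pepXML files carry much more.
SAMPLE = b"""<msms_run_summary>
  <spectrum_query spectrum="20100422_01_control_04.c.07416.07416.3"
                  start_scan="7416" end_scan="7416" assumed_charge="3"/>
  <spectrum_query spectrum="20100422_04_control_07.c.10165.10165.10"
                  start_scan="10165" end_scan="10165" assumed_charge="1"/>
</msms_run_summary>"""

if __name__ == "__main__":
    for name, assumed, encoded in find_charge_mismatches(io.BytesIO(SAMPLE)):
        print(f"{name}: assumed_charge={assumed}, encoded charge={encoded}")
```

Passing the path of a real pep.xml file instead of the sample should list
the entries to inspect or remove by hand; the sketch does not modify the
file.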

Thanks,
-David



On Wed, Jun 2, 2010 at 3:37 PM, Andreas Quandt <[email protected]> wrote:
> hey david,
> there is another question i would like to ask: when trying to trace the
> problem we also analyzed these mzxml files with other programs such as
> xtandem, mascot, and omssa. looking at their results, we did not observe
> any suspicious behavior. hence, do you think the problem you described
> occurs only when using sequest, or is it a more general problem?
> cheers,
> andreas
>
> On Thu, Jun 3, 2010 at 12:24 AM, Andreas Quandt <[email protected]>
> wrote:
>>
>> hey david,
>> many thanks for looking into it!
>> i was wondering if you could point me to a source where conditions such as
>> not using a '.' character in the file's basename are described because i was
>> not aware of that.
>> many thanks in advance,
>> andreas
>>
>> On Wed, Jun 2, 2010 at 12:17 AM, David Shteynberg
>> <[email protected]> wrote:
>>>
>>> Hello Ludovic,
>>>
>>> I think the problem might be in the pepXML generated in your pipeline.
>>>  The offending entry is in the file 20100422_04_control_07.c.pep.xml :
>>>
>>>
>>>  <spectrum_query spectrum="20100422_04_control_07.c.10165.10165.10"
>>> start_scan="10165" end_scan="10165" precursor_neutral_mass="3602.2759"
>>> assumed_charge="1" index="307" retention_time_sec="5514.3">
>>>
>>>
>>> Looks like the assumed charge of 1+ doesn't match the encoded charge
>>> in the spectrum name (the number after the last dot, "10" in this
>>> case).
>>>
>>> If you simply delete this entry from the file the fvalues and
>>> probabilities should line up again.  I will change the code to error
>>> out when this scenario is encountered in the future.
>>>
>>>
>>> To address the underlying issue, it looks like the conversion to pepXML
>>> failed on this entry for some reason.  How were the pepXML files
>>> generated?  Can you send me the original .out files so I can test the
>>> Out2XML sequest pepXML converter tool?  This could also be caused by
>>> the '.' character in the basename of your files.  Traditionally, only
>>> alphanumeric and underscore characters are allowed there.
>>>
>>> Thanks,
>>> -David
>>>
>>> On Mon, May 31, 2010 at 3:37 PM, David Shteynberg
>>> <[email protected]> wrote:
>>> > Hello Ludovic,
>>> >
>>> > Yes, I see that there is a difference in the output files.  I think
>>> > the problem is that the output of probabilities and fvals is
>>> > misaligned from the spectra in the pep.xml file.  You'll see that the
>>> > next spectrum has the correct value.  This definitely points to a bug
>>> > in the 4.3 version that is quite dated now.  We are close to releasing
>>> > a new version.  Are you able to test the latest SVN trunk version of
>>> > the software on your system to check if the bug is still present
>>> > there?
>>> >
>>> > Thanks,
>>> > -David
>>> >
>>> > On Mon, May 31, 2010 at 1:53 AM, lgillet <[email protected]>
>>> > wrote:
>>> >> Hi David,
>>> >>
>>> >> No, I confirm again that I do find a difference upon running the
>>> >> xinteract command with the files in different orders (I confirm I also
>>> >> see those differences on TPP 4.3.1 installed on Unix and on
>>> >> WindowsXP).
>>> >> I have re-run the command on the same folder, on the same files, to
>>> >> avoid any confusion about file names or so. The result of the first
>>> >> command was named NormalOrder, and the second ScrambledOrder.
>>> >> I have made another zip file (xinteract-output.zip) with the output of
>>> >> my commands (such that you can check if I do anything wrong by running
>>> >> the command). Also at the end of the text file, I have copy-pasted
>>> >> some lines from the summary visualisation from the web interface, with
>>> >> different filter criteria.
>>> >> I also attached in the zip the output of the 2 interact.pep.xml.
>>> >> If you look at a diff between both files, you will find plenty of
>>> >> differences on some spectra. For example, you can have a look at:
>>> >>
>>> >> 20100422_01_control_04.c.07416.07416.3 => fval = 4.88 Vs. 0.058 (in
>>> >> NormalOrder Vs. ScrambledOrder resp.)
>>> >> 20100422_01_control_05.c.08545.08545.3 => fval = 5.36 Vs. completely
>>> >> absent (in NormalOrder Vs. ScrambledOrder resp.)
>>> >>
>>> >> Finally, could you please run those 4 pep.xml files on your server on
>>> >> TPP 4.0 and TPP 4.3 by yourself?
>>> >> You may then see for yourself that there are not only "some
>>> >> differences"; the differences (especially if you look at the decoy
>>> >> protein hits) are terrible.
>>> >>
>>> >> Thanks and let me know if something is still not clear.
>>> >>
>>> >> Ludovic
>>> >>
>>> >> On May 27, 7:33 pm, David Shteynberg <[email protected]>
>>> >> wrote:
>>> >>> Hi Ludovic,
>>> >>>
>>> >>> It is completely normal to expect some differences in the results
>>> >>> between versions of the software, since the models may be slightly
>>> >>> different in a new version due to optimization, bug fixes and the
>>> >>> like.  Hopefully the new analysis is able to increase your correct
>>> >>> identifications at a set error rate.
>>> >>>
>>> >>> When I run your data through our 4.3.1 pipeline I get your result in
>>> >>> the scrambled analysis (regardless of the order in which I specify my
>>> >>> input pepxml files).  The difference in *your* two analyses is due to
>>> >>> the difference in your input files.  Here is the relevant info from
>>> >>> your two 4.3.1 analyses:
>>> >>>
>>> >>> interact-TPP-V4.3.pep.xml has 8931 spectra in charge 2+ that it
>>> >>> models:
>>> >>>
>>> >>> <mixture_model precursor_ion_charge="2" comments="using no. tolerable
>>> >>> trypsin term. [ntt] 0 data as pseudonegatives"
>>> >>> prior_probability="0.427" est_tot_correct="3830.1"
>>> >>> tot_num_spectra="8931" num_iterations="28">
>>> >>>
>>> >>> interact_TPP-V4.3_scrambled.pep.xml has 8929 spectra in charge 2+
>>> >>> that it models:
>>> >>>
>>> >>> <mixture_model precursor_ion_charge="2" comments="using no. tolerable
>>> >>> trypsin term. [ntt] 0 data as pseudonegatives"
>>> >>> prior_probability="0.427" est_tot_correct="3829.1"
>>> >>> tot_num_spectra="8929" num_iterations="28">
>>> >>>
>>> >>> Since the inputs are different in the two analyses, the results will
>>> >>> be different.  Please verify that the inputs you are giving to the
>>> >>> two analyses in different order are *not identical*.  Can you verify
>>> >>> this?
>>> >>>
>>> >>> Thanks,
>>> >>> -David
>>> >>>
>>> >>> On Thu, May 27, 2010 at 8:12 AM, lgillet <[email protected]>
>>> >>> wrote:
>>> >>> > Hi David,
>>> >>>
>>> >>> > TPP is installed on different servers in our Institute. I have
>>> >>> > re-uploaded a new file (lgillet_interact-again.zip) for which the TPP
>>> >>> > xinteract was performed on the same server and with different
>>> >>> > versions of the TPP. You can see that the results are still very
>>> >>> > different, even in the scrambled case.
>>> >>> > Note that I used the version TPP v4.3 JETSTREAM rev 1, Build
>>> >>> > 201004201202 (linux); I do not know whether it is the same as the
>>> >>> > SVN TPP that you mentioned or a "nightly build".
>>> >>> > - Can you re-confirm my results using your installation of TPP with
>>> >>> > my 4 pep.xml files?
>>> >>> > - Can you re-confirm the differences in decoy % using your
>>> >>> > installation of TPP between TPP V4.0 and V4.3 with my 4 pep.xml
>>> >>> > files?
>>> >>> > Thanks again,
>>> >>> > Ludovic
>>> >>>
>>> >>> > On May 26, 9:01 pm, David Shteynberg
>>> >>> > <[email protected]>
>>> >>> > wrote:
>>> >>> >> Hi Ludovic,
>>> >>>
>>> >>> >> I was unable to duplicate the different results on different order
>>> >>> >> of input using the latest version of SVN tpp or version 4.3.1.  I
>>> >>> >> noticed that your two analyses point to different locations.  Are
>>> >>> >> you sure that the files at these locations are identical?
>>> >>>
>>> >>> >> Thanks,
>>> >>> >> -David
>>> >>>
>>> >>> >> On Wed, May 26, 2010 at 10:47 AM, lgillet
>>> >>> >> <[email protected]> wrote:
>>> >>> >> > Hi David,
>>> >>>
>>> >>> >> > all my apologies, the rar file probably got corrupted during the
>>> >>> >> > upload (the original on my HD was fine).
>>> >>> >> > I have uploaded it again, as a zip file this time:
>>> >>> >> > lgillet_pepxml-again2.zip
>>> >>> >> > I hope that works this time (after download, I can decompress it
>>> >>> >> > back).
>>> >>> >> > Thanks for having a look at this issue.
>>> >>> >> > Best,
>>> >>> >> > Ludovic
>>> >>>
>>> >>> >> > On May 25, 7:24 pm, David Shteynberg
>>> >>> >> > <[email protected]>
>>> >>> >> > wrote:
>>> >>> >> >> Hi Ludovic,
>>> >>>
>>> >>> >> >> It seems the file you uploaded, lgillet_pepxml_for_TPP4.3.rar, is
>>> >>> >> >> corrupted.  At least I am unable to open it.  Please upload again.
>>> >>>
>>> >>> >> >> Thanks,
>>> >>> >> >> -David
>>> >>>
>>> >>> >> >> On Wed, May 19, 2010 at 2:54 AM, lgillet
>>> >>> >> >> <[email protected]> wrote:
>>> >>> >> >> > Hi David, Hi Natalie,
>>> >>>
>>> >>> >> >> > I just posted the 4 pepxml files which give me the most striking
>>> >>> >> >> > differences in results between TPP-V4.0 and TPP-V4.3:
>>> >>> >> >> > lgillet_pepxml_for_TPP4.3.rar. I also posted the results
>>> >>> >> >> > (interact.pep.xml) which I obtain from running TPP-V4.0, TPP-V4.3
>>> >>> >> >> > and TPP-V4.3 on scrambled file order (file #4>#3>#2>#1):
>>> >>> >> >> > lgillet_interact-results.rar.
>>> >>> >> >> > I really tried my best to figure out what the problem could be.
>>> >>> >> >> > Maybe you could re-run the same analyses (TPP-V4.0, TPP-V4.3,
>>> >>> >> >> > TPP-V4.3 with the scrambled file order) and let me know if you
>>> >>> >> >> > confirm my results or if there is something wrong with the
>>> >>> >> >> > compiled version we have on our server (could still be a
>>> >>> >> >> > possibility).
>>> >>> >> >> > Finally, to answer Natalie's question, the differences between
>>> >>> >> >> > V4.0 and V4.3 are quite dramatic (in my opinion). I would not
>>> >>> >> >> > have worried about 1-2% differences in IDs, but here I am going
>>> >>> >> >> > from 1% decoy hits (V4.0) to 23% decoy hits (V4.3) at the same
>>> >>> >> >> > prob > 0.9. Also, the number of unique peptides reported by V4.0
>>> >>> >> >> > and V4.3 is quite different (2150 and 3161 resp.). Finally, many
>>> >>> >> >> > decoy hits pulled up in V4.3 with a prob > 0.9 actually have a
>>> >>> >> >> > very bad MS/MS spectrum and a very low prob < 0.01 on V4.0 (only
>>> >>> >> >> > reported if you use the -p0 option).
>>> >>>
>>> >>> >> >> > Have a look at those MS/MS spectra for example:
>>> >>>
>>> >>> >> >> > 20100422_04_control_07.c.07700.07700.4
>>> >>> >> >> > 20100422_04_control_07.c.02864.02864.3
>>> >>>
>>> >>> >> >> > Let me know if you need any extra information.
>>> >>>
>>> >>> >> >> > Thanks a lot for your help on that.
>>> >>>
>>> >>> >> >> > Best,
>>> >>>
>>> >>> >> >> > Ludovic
>>> >>>
>>> >>> >> >> > On May 18, 11:21 pm, Natalie Tasman
>>> >>> >> >> > <[email protected]>
>>> >>> >> >> > wrote:
>>> >>> >> >> >> Ludovic,
>>> >>>
>>> >>> >> >> >> Go ahead and post the files to the newsgroup's file area
>>> >>> >> >> >> (http://groups.google.com/group/spctools-discuss/files), and
>>> >>> >> >> >> hopefully one of the validation experts will take a look.
>>> >>>
>>> >>> >> >> >> I will point out that PeptideProphet uses random initialization
>>> >>> >> >> >> for its curve fitting (EM algorithm).  So it's not out of the
>>> >>> >> >> >> question that you'd see some small differences between runs on
>>> >>> >> >> >> the same data files, regardless of the order.  Can you provide
>>> >>> >> >> >> some measure of the differences between runs for the reordered
>>> >>> >> >> >> datasets?
>>> >>>
>>> >>> >> >> >> -Natalie
>>> >>>
>>> >>> >> >> >> On Tue, May 18, 2010 at 4:35 AM, lgillet
>>> >>> >> >> >> <[email protected]> wrote:
>>> >>> >> >> >> > Hi everybody,
>>> >>> >> >> >> > I think I recently encountered a "bug" when people in my lab
>>> >>> >> >> >> > installed the newest TPP (v4.3 JETSTREAM rev 1, Build
>>> >>> >> >> >> > 201004201202 (linux)), especially when I compare the results
>>> >>> >> >> >> > with v4.0, which was our former "benchmark" version.
>>> >>> >> >> >> > When running the same 4 pep.xml files through v4.0 and v4.3, I
>>> >>> >> >> >> > get an incredible difference in the number of decoy hits. For
>>> >>> >> >> >> > example, with v4.0, p>0.9, I would get my "regular" 1% decoys,
>>> >>> >> >> >> > while with v4.3, p>0.9, I get above 25% decoys?!??
>>> >>> >> >> >> > All the interact runs use the following options:
>>> >>> >> >> >> > xinteract -OApld -ddecoy *.pep.xml
>>> >>> >> >> >> > I could narrow the "problem" down to the PeptideProphetParser,
>>> >>> >> >> >> > which behaves very differently between v4.0 and v4.3, while
>>> >>> >> >> >> > InteractParser (which introduces the "is_rejected=1" tags) and
>>> >>> >> >> >> > RefreshParser do not influence the results.
>>> >>> >> >> >> > But at the moment, I do not know whether it is an issue with
>>> >>> >> >> >> > the decoy statistical distribution of prophet or not...
>>> >>>
>>> >>> >> >> >> > One more thing that makes me even more suspicious is the fact
>>> >>> >> >> >> > that, only with TPP version 4.3, if you process those files in
>>> >>> >> >> >> > a different order (let's say: xinteract file1 file2 file3 vs.
>>> >>> >> >> >> > xinteract file3 file2 file1), you get differences in the
>>> >>> >> >> >> > results as well?!?
>>> >>>
>>> >>> >> >> >> > I am willing to send the 4 pepxml where those observations are
>>> >>> >> >> >> > the most critical to David or Luis or anybody interested, but
>>> >>> >> >> >> > I truly believe that there might be something going wrong with
>>> >>> >> >> >> > the TPP v4.3.
>>> >>>
>>> >>> >> >> >> > Let me know to whom I should post the files.
>>> >>>
>>> >>> >> >> >> > Best regards,
>>> >>>
>>> >>> >> >> >> > Ludovic
>>> >>>
>>> >>> >> >> >> > --
>>> >>> >> >> >> > You received this message because you are subscribed to
>>> >>> >> >> >> > the Google Groups "spctools-discuss" group.
>>> >>> >> >> >> > To post to this group, send email to
>>> >>> >> >> >> > [email protected].
>>> >>> >> >> >> > To unsubscribe from this group, send email to
>>> >>> >> >> >> > [email protected].
>>> >>> >> >> >> > For more options, visit this group at
>>> >>> >> >> >> > http://groups.google.com/group/spctools-discuss?hl=en.