hey david,

many thanks for looking into it!
i was wondering if you could point me to a source where conditions such as
not using a '.' character in the file's basename are described because i was
not aware of that.

many thanks in advance,
andreas


On Wed, Jun 2, 2010 at 12:17 AM, David Shteynberg <[email protected]> wrote:

> Hello Ludovic,
>
> I think the problem might be in the pepXML generated in your pipeline.
>  The offending entry is in the file 20100422_04_control_07.c.pep.xml :
>
>
>  <spectrum_query spectrum="20100422_04_control_07.c.10165.10165.10"
> start_scan="10165" end_scan="10165" precursor_neutral_mass="3602.2759"
> assumed_charge="1" index="307" retention_time_sec="5514.3">
>
>
> Looks like the assumed charge of 1+ doesn't match the encoded charge
> in the spectrum name (the number after the last dot "10" in this
> case).
>
> If you simply delete this entry from the file the fvalues and
> probabilities should line up again.  I will change the code to error
> out when this scenario is encountered in the future.
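
The check described above (assumed_charge versus the number after the
last dot in the spectrum name) can be run over a pepXML file before it
is handed to xinteract. A minimal sketch in Python, assuming the charge
is always the last dot-separated field of the spectrum attribute; the
function names are hypothetical, and the second entry in the sample is
made up for illustration (only its spectrum name comes from this thread).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def find_charge_mismatches(pepxml_text):
    """Return (spectrum, encoded_charge, assumed_charge) for every
    spectrum_query whose name suffix disagrees with assumed_charge."""
    mismatches = []
    root = ET.fromstring(pepxml_text)
    for elem in root.iter():
        if local_name(elem.tag) != "spectrum_query":
            continue
        spectrum = elem.get("spectrum", "")
        assumed = elem.get("assumed_charge", "")
        # Assumption: the charge is the text after the last '.' in the name.
        encoded = spectrum.rsplit(".", 1)[-1]
        if encoded != assumed:
            mismatches.append((spectrum, encoded, assumed))
    return mismatches


# First entry: the offending one quoted above. Second entry: illustrative only.
sample = """<msms_run_summary>
<spectrum_query spectrum="20100422_04_control_07.c.10165.10165.10"
 start_scan="10165" end_scan="10165" precursor_neutral_mass="3602.2759"
 assumed_charge="1" index="307" retention_time_sec="5514.3"/>
<spectrum_query spectrum="20100422_04_control_07.c.07700.07700.4"
 assumed_charge="4"/>
</msms_run_summary>"""

for entry in find_charge_mismatches(sample):
    print(entry)
```

For a real file, the text would be read from disk first; very large
pepXML files may be better served by a streaming parser.
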
>
>
> To address the underlying issue it looks like the conversion to pepXML
> failed on this entry for some reason.  How were the pepXML files
> generated?  Can you send me the original .out files so I can test the
> Out2XML sequest pepXML converter tool?  This could also be caused by
> the '.' character in the basename of your files.  Only alphanumeric
> and underscore characters are allowed there traditionally.
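
The basename rule quoted just above (only alphanumeric and underscore
characters) is easy to check before a run. A minimal sketch in Python,
assuming the basename is the file name with the .pep.xml suffix removed;
that definition and the function name offending_characters are
assumptions made for illustration, not taken from the TPP documentation.

```python
import os
import re


def offending_characters(path, suffix=".pep.xml"):
    """Return the sorted set of characters in the basename that are
    neither alphanumeric (ASCII) nor underscore."""
    name = os.path.basename(path)
    # Assumption: the basename excludes the trailing '.pep.xml'.
    if name.endswith(suffix):
        name = name[: -len(suffix)]
    return sorted(set(re.findall(r"[^A-Za-z0-9_]", name)))


# The file named in this thread has a '.' before the 'c'.
print(offending_characters("20100422_04_control_07.c.pep.xml"))
```

An empty list means the name satisfies the rule as stated in the
message above.
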
>
> Thanks,
> -David
>
> On Mon, May 31, 2010 at 3:37 PM, David Shteynberg
> <[email protected]> wrote:
> > Hello Ludovic,
> >
> > Yes, I see that there is a difference in the output files.  I think
> > the problem is that the output of probabilities and fvals is
> > misaligned from the spectra in the pep.xml file.  You'll see that the
> > next spectrum has the correct value.  This definitely points to a bug
> > in the 4.3 version that is quite dated now.  We are close to releasing
> > a new version.  Are you able to test the latest SVN trunk version of
> > the software on your system to check if the bug is still present
> > there?
> >
> > Thanks,
> > -David
> >
> > On Mon, May 31, 2010 at 1:53 AM, lgillet <[email protected]> wrote:
> >> Hi David,
> >>
> >> No, I confirm again that I do find a difference upon running the
> >> xinteract command with the files in different orders (I confirm I also
> >> see those differences on TPP 4.3.1 installed on Unix and on
> >> WindowsXP).
> >> I have re-run the command on the same folder, on the same files, to
> >> avoid any confusion about file names or so. The result of the first
> >> command was named NormalOrder, and the second ScrambledOrder.
> >> I have made another zip file (xinteract-output.zip) with the output of
> >> my commands (such that you can check if I do anything wrong by running
> >> the command). Also at the end of the text file, I have copy-pasted
> >> some line from the summary visualisation from the web interface, with
> >> different filter criteria.
> >> I also attached in the zip the output of the 2 interact.pep.xml.
> >> If you look at a diff between both files, you will find plenty of
> >> differences on some spectra. For example, you can have a look at:
> >>
> >> 20100422_01_control_04.c.07416.07416.3 => fval = 4.88 Vs. 0.058 (in
> >> NormalOrder Vs. ScrambledOrder resp.)
> >> 20100422_01_control_05.c.08545.08545.3 => fval = 5.36 Vs. completely
> >> absent (in NormalOrder Vs. ScrambledOrder resp.)
> >>
> >> Finally, could you please run those 4 pep.xml files on your server on
> >> TPP 4.0 and TPP 4.3 by yourself?
> >> You may then see for yourself that these are not just "some
> >> differences": the differences (especially if you look at the decoy
> >> protein hits) are terrible.
> >>
> >> Thanks and let me know if something is still not clear.
> >>
> >> Ludovic
> >>
> >> On May 27, 7:33 pm, David Shteynberg <[email protected]>
> >> wrote:
> >>> Hi Ludovic,
> >>>
> >>> It is completely normal to expect some difference in the results
> >>> between versions of the software, since the models may be slightly
> >>> different in a new version due to optimization, bug fixes and the
> >>> like.  Hopefully the new analysis is able to increase your correct
> >>> identifications at a set error rate.
> >>>
> >>> When I run your data through our 4.3.1 pipeline I get your result in
> >>> the scrambled analysis (regardless of the order in which I specify my
> >>> input pepxml files).  The difference in *your* two analyses is due to
> >>> the difference in your input files.  Here is the relevant info from
> >>> your two 4.3.1 analyses:
> >>>
> >>> interact-TPP-V4.3.pep.xml has 8931 spectra in charge 2+ that it models:
> >>>
> >>> <mixture_model precursor_ion_charge="2" comments="using no. tolerable
> >>> trypsin term. [ntt] 0 data as pseudonegatives"
> >>> prior_probability="0.427" est_tot_correct="3830.1"
> >>> tot_num_spectra="8931" num_iterations="28">
> >>>
> >>> interact_TPP-V4.3_scrambled.pep.xml has 8929 spectra in charge 2+ that
> >>> it models:
> >>>
> >>> <mixture_model precursor_ion_charge="2" comments="using no. tolerable
> >>> trypsin term. [ntt] 0 data as pseudonegatives"
> >>> prior_probability="0.427" est_tot_correct="3829.1"
> >>> tot_num_spectra="8929" num_iterations="28">
> >>>
> >>> Since the inputs are different in the two analyses the results will be
> >>> different.  Please verify that the inputs you are giving to the two
> >>> analyses in different order are *not identical*.  Can you verify this?
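
One way to settle whether two sets of input files are byte-for-byte the
same, independent of the order in which they are listed, is to compare
checksums. A minimal sketch in Python using SHA-256; the function names
and the example paths in the comment are hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_inputs(paths_a, paths_b):
    """True if both lists contain files with identical contents,
    regardless of file names and of the order within each list."""
    digests_a = sorted(file_digest(p) for p in paths_a)
    digests_b = sorted(file_digest(p) for p in paths_b)
    return digests_a == digests_b


# Usage (hypothetical paths):
# same_inputs(["run1/a.pep.xml", "run1/b.pep.xml"],
#             ["run2/b.pep.xml", "run2/a.pep.xml"])
```

If this returns False for the two analyses, the difference in
tot_num_spectra shown above would be explained by the inputs rather
than by the order.
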
> >>>
> >>> Thanks,
> >>> -David
> >>>
> >>> On Thu, May 27, 2010 at 8:12 AM, lgillet <[email protected]> wrote:
> >>> > Hi David,
> >>>
> >>> > TPP is installed on different servers in our Institute. I have
> >>> > re-uploaded a new file (lgillet_interact-again.zip) for which the
> >>> > TPP xinteract was performed on the same server and with different
> >>> > versions of the TPP. You can see that the results are still very
> >>> > different, even in the scrambled case.
> >>> > Note that I used the version TPP v4.3 JETSTREAM rev 1, Build
> >>> > 201004201202 (linux); I do not know whether it is the same as the
> >>> > SVN TPP that you mentioned or a "nightly build".
> >>> > - Can you re-confirm my results using your installation of TPP with
> >>> > my 4 pep.xml files?
> >>> > - Can you re-confirm the differences in decoy % using your
> >>> > installation of TPP between TPP V4.0 and V4.3 with my 4 pep.xml
> >>> > files?
> >>> > Thanks again,
> >>> > Ludovic
> >>>
> >>> > On May 26, 9:01 pm, David Shteynberg <[email protected]>
> >>> > wrote:
> >>> >> Hi Ludovic,
> >>>
> >>> >> I was unable to duplicate the different results on different order
> >>> >> of input using the latest version of SVN tpp or version 4.3.1.  I
> >>> >> noticed that your two analyses point to different locations.  Are
> >>> >> you sure that the files at these locations are identical?
> >>>
> >>> >> Thanks,
> >>> >> -David
> >>>
> >>> >> On Wed, May 26, 2010 at 10:47 AM, lgillet <[email protected]> wrote:
> >>> >> > Hi David,
> >>>
> >>> >> > all my apologies, the rar file probably got corrupted during the
> >>> >> > upload (the original on my HD was fine).
> >>> >> > I have uploaded again, a zip file this time:
> >>> >> > lgillet_pepxml-again2.zip
> >>> >> > I hope that works this time (after download, I can decompress it
> >>> >> > back).
> >>> >> > Thanks for having a look at this issue.
> >>> >> > Best,
> >>> >> > Ludovic
> >>>
> >>> >> > On May 25, 7:24 pm, David Shteynberg <[email protected]>
> >>> >> > wrote:
> >>> >> >> Hi Ludovic,
> >>>
> >>> >> >> It seems the file you uploaded lgillet_pepxml_for_TPP4.3.rar is
> >>> >> >> corrupted.  At least I am unable to open it. Please upload again.
> >>>
> >>> >> >> Thanks,
> >>> >> >> -David
> >>>
> >>> >> >> On Wed, May 19, 2010 at 2:54 AM, lgillet <[email protected]> wrote:
> >>> >> >> > Hi David, Hi Natalie,
> >>>
> >>> >> >> > I just posted the 4 pepxml files which give me the most striking
> >>> >> >> > differences in results between TPP-V4.0 and TPP-V4.3:
> >>> >> >> > lgillet_pepxml_for_TPP4.3.rar. I also posted the results
> >>> >> >> > (interact.pep.xml) which I obtain from running TPP-V4.0, TPP-V4.3
> >>> >> >> > and TPP-V4.3 on scrambled file order (file #4>#3>#2>#1):
> >>> >> >> > lgillet_interact-results.rar.
> >>> >> >> > I really tried my best to figure out what the problem could be.
> >>> >> >> > Maybe you could re-run the same analyses (TPP-V4.0, TPP-V4.3,
> >>> >> >> > TPP-V4.3 with the scrambled file order) and let me know if you
> >>> >> >> > confirm my results or if there is something wrong maybe with the
> >>> >> >> > compiled version we have on our server (could still be a
> >>> >> >> > possibility).
> >>> >> >> > Finally, to answer Natalie's question, the differences are quite
> >>> >> >> > dramatic (in my opinion) between V4.0 and V4.3 (I would not have
> >>> >> >> > worried about 1-2% differences in IDs), but here, I am going from
> >>> >> >> > 1% decoy (V4.0) to 23% decoy (V4.3) hits (at the same proba >
> >>> >> >> > 0.9). Also the number of unique peptides reported by V4.0 and V4.3
> >>> >> >> > is quite different (2150 and 3161 resp.). Finally, many decoy hits
> >>> >> >> > pulled up in V4.3 with a prob>0.9 actually have a very bad MS/MS
> >>> >> >> > spectrum and a very low prob<0.01 (only reported if you use the
> >>> >> >> > -p0 option) on V4.0.
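
The decoy percentage quoted above (decoy hits among hits with
probability > 0.9) can be recomputed directly from an interact.pep.xml.
A minimal sketch in Python, assuming each search_hit element carries a
protein attribute and contains a peptideprophet_result element with a
probability attribute, and that decoy proteins are those whose name
starts with "decoy" (the tag given to xinteract in this thread). The
function names and the sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def decoy_fraction(pepxml_text, min_prob=0.9, decoy_prefix="decoy"):
    """Fraction of search hits above min_prob whose protein name starts
    with decoy_prefix. Returns 0.0 if no hit passes the threshold."""
    root = ET.fromstring(pepxml_text)
    total = 0
    decoys = 0
    for hit in root.iter():
        if local_name(hit.tag) != "search_hit":
            continue
        prob = None
        for child in hit.iter():
            if local_name(child.tag) == "peptideprophet_result":
                prob = float(child.get("probability", "nan"))
                break
        # Skip hits without a probability or at/below the threshold.
        if prob is None or not prob > min_prob:
            continue
        total += 1
        if hit.get("protein", "").startswith(decoy_prefix):
            decoys += 1
    return decoys / total if total else 0.0


def _hit(protein, probability):
    """Build one illustrative search_hit element as text."""
    return (
        '<search_hit hit_rank="1" protein="%s">'
        '<analysis_result analysis="peptideprophet">'
        '<peptideprophet_result probability="%s"/>'
        "</analysis_result></search_hit>" % (protein, probability)
    )


# Invented sample: three hits pass p > 0.9, one of them is a decoy.
sample_hits = "<msms_run_summary>%s</msms_run_summary>" % "".join(
    [
        _hit("protA", "0.99"),
        _hit("decoy_protB", "0.95"),
        _hit("decoy_protC", "0.50"),
        _hit("protD", "0.92"),
    ]
)

print(decoy_fraction(sample_hits))
```

Running the same function on the V4.0 and V4.3 outputs would give the
two percentages being compared here under identical counting rules.
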
> >>>
> >>> >> >> > Have a look at those MS/MS spectra for example:
> >>>
> >>> >> >> > 20100422_04_control_07.c.07700.07700.4
> >>> >> >> > 20100422_04_control_07.c.02864.02864.3
> >>>
> >>> >> >> > Let me know if you need any extra information.
> >>>
> >>> >> >> > Thanks a lot for your help on that.
> >>>
> >>> >> >> > Best,
> >>>
> >>> >> >> > Ludovic
> >>>
> >>> >> >> > On May 18, 11:21 pm, Natalie Tasman <[email protected]>
> >>> >> >> > wrote:
> >>> >> >> >> Ludovic,
> >>>
> >>> >> >> >> Go ahead and post the files to the newsgroup's file area
> >>> >> >> >> (http://groups.google.com/group/spctools-discuss/files), and
> >>> >> >> >> hopefully one of the validation experts will take a look.
> >>>
> >>> >> >> >> I will point out that PeptideProphet uses random initialization
> >>> >> >> >> for its curve fitting (EM algorithm).  So it's not out of the
> >>> >> >> >> question that you'd see some small differences between runs on
> >>> >> >> >> the same data files, regardless of the order.  Can you provide
> >>> >> >> >> some measure of the differences between runs for the reordered
> >>> >> >> >> datasets?
> >>>
> >>> >> >> >> -Natalie
> >>>
> >>> >> >> >> On Tue, May 18, 2010 at 4:35 AM, lgillet <[email protected]> wrote:
> >>> >> >> >> > Hi everybody,
> >>> >> >> >> > I think I recently encountered a "bug" when people in my lab
> >>> >> >> >> > installed the newest TPP (v4.3 JETSTREAM rev 1, Build
> >>> >> >> >> > 201004201202 (linux)), especially when I try to compare the
> >>> >> >> >> > results to v4.0, which was our former "benchmark" version.
> >>> >> >> >> > When searching the same 4 pep.xml files with v4.0 and v4.3, I
> >>> >> >> >> > get an incredible difference in the number of decoy hits. For
> >>> >> >> >> > example, with v4.0, p>0.9, I would get my "regular" 1% decoy,
> >>> >> >> >> > while with v4.3, p>0.9, I get above 25% of decoys?!??
> >>> >> >> >> > All the interact runs use the following options:
> >>> >> >> >> > xinteract -OApld -ddecoy *.pep.xml
> >>> >> >> >> > I could nail down the "problem" to the PeptideProphetParser,
> >>> >> >> >> > which behaves very differently between v4.0 and v4.3, while
> >>> >> >> >> > InteractParser (which introduces the "is_rejected=1" tags) and
> >>> >> >> >> > RefreshParser do not influence the results.
> >>> >> >> >> > But at the moment, I do not know if it is an issue of the decoy
> >>> >> >> >> > statistical distribution of prophet or not...
> >>>
> >>> >> >> >> > One more thing that makes me even more suspicious is the fact
> >>> >> >> >> > that, only with TPP version 4.3, if you search those files in a
> >>> >> >> >> > different order (let's say: xinteract file1 file2 file3 Vs
> >>> >> >> >> > xinteract file3 file2 file1), you do get differences in the
> >>> >> >> >> > results as well?!?
> >>>
> >>> >> >> >> > I am willing to send the 4 pepxml where those observations are
> >>> >> >> >> > the most critical to David or Luis or anybody interested, but I
> >>> >> >> >> > truly believe that there might be something going wrong with
> >>> >> >> >> > the TPP v4.3.
> >>>
> >>> >> >> >> > Let me know to whom I should post the files.
> >>>
> >>> >> >> >> > Best regards,
> >>>
> >>> >> >> >> > Ludovic
> >>>
> >>> >> >> >> > --
> >>> >> >> >> > You received this message because you are subscribed to the
> >>> >> >> >> > Google Groups "spctools-discuss" group.
> >>> >> >> >> > To post to this group, send email to [email protected].
> >>> >> >> >> > To unsubscribe from this group, send email to
> >>> >> >> >> > [email protected].
> >>> >> >> >> > For more options, visit this group at
> >>> >> >> >> > http://groups.google.com/group/spctools-discuss?hl=en.
> >>
> >>
> >>
> >
>
>
>

