Re: [spctools-discuss] Re: "base_name" constraint in pepXML

Greg Bowersock Fri, 20 Nov 2009 11:34:16 -0800

David, just a quick reply to part of your message. Normally, I make one
directory per experiment and process the Mascot, SEQUEST, and possibly
X!Tandem data from each mzXML file in that same directory. I append the
name of the search engine to the TPP files, so I can tell which search
engine the data came from. These data are sometimes combined (not always
with iProphet, since I only recently implemented it), so I would violate
your "separate directory for each search engine" rule. I've never seen any
documentation saying that either layout is required, so I doubt I am the
only one processing data this way.


Greg
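
For illustration only: the quoted reply below says that iProphet matches
results from different searches by experiment_label plus spectrum name,
not by base_name. A minimal Python sketch of that idea follows. The record
layout, field values, and the group_hits helper are all hypothetical; this
is not iProphet's actual code.

```python
from collections import defaultdict

# Hypothetical, hand-written search hits. In practice these would be read
# from pepXML files; every value below is made up for illustration.
HITS = [
    {"engine": "Mascot", "experiment_label": "exp1",
     "spectrum": "run01.00100.00100.2", "base_name": "/data/exp1/run01"},
    {"engine": "SEQUEST", "experiment_label": "exp1",
     "spectrum": "run01.00100.00100.2", "base_name": "/data/exp1/run01"},
    {"engine": "X!Tandem", "experiment_label": "exp1",
     "spectrum": "run01.00100.00100.2", "base_name": "/mnt/copy/exp1/run01"},
    {"engine": "Mascot", "experiment_label": "exp2",
     "spectrum": "run01.00100.00100.2", "base_name": "/data/exp2/run01"},
]


def group_hits(hits):
    """Group search engines by (experiment_label, spectrum), ignoring base_name."""
    grouped = defaultdict(list)
    for hit in hits:
        key = (hit["experiment_label"], hit["spectrum"])
        grouped[key].append(hit["engine"])
    return dict(grouped)


if __name__ == "__main__":
    for key, engines in group_hits(HITS).items():
        print(key, engines)
```

With this key, the three exp1 hits end up together even though one of them
lists a different path to the same data, while the exp2 hit stays separate.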

On Fri, Nov 20, 2009 at 12:51 PM, David Shteynberg <
[email protected]> wrote:

> I will try to reason this through from the original authors' point of
> view.  The idea is that different searches of the same data would happen
> in separate directories, and the base_name (full path to the data file)
> would identify one search of an mzXML file representing one msms_run;
> more than one search of the same file would never happen in one
> directory.  Also, it is natural to keep all results from one search
> engine run on an mzXML file in one place in the pepXML file.  However,
> you could have more than one search that references the same data, and
> these don't necessarily have to be placed together in the pepXML file.
> The problem with this is that you could have different paths to the same
> data, and these would all be listed.  In the iProphet tool (which
> combines results from multiple searches of the same data), I don't look
> at either base_name but rather at the spectrum names themselves, in
> combination with experiment_label, a user-specified parameter that
> identifies data from the same experiment.  The idea is that the
> combination of experiment_label and spectrum name will uniquely identify
> a searched spectrum.  I hope this is helpful.  Let us know if you have
> other questions.
>
> -David
>
>
>
> On Fri, Nov 20, 2009 at 10:30 AM, David Shteynberg
>  <[email protected]> wrote:
> > OK I take that back.  I see where the unique constraint is listed.  I
> > will have to consider your questions further.
> >
> > -David
> >
> > On Fri, Nov 20, 2009 at 10:27 AM, David Shteynberg
> > <[email protected]> wrote:
> >> Hi Hendrik,
> >>
> >> The element msms_pipeline_analysis/msms_run_summary has an attribute
> >> base_name to specify the path to the data file.  In case the searched
> >> file specified is different from the original data file, there is
> >> another base_name entry in the element
> >> msms_pipeline_analysis/msms_run_summary/search_summary.
> >> As far as I know, there is nothing in the schema that requires these
> >> to be unique in the pepXML file. Can you point me to where this
> >> constraint is specified in the schema?  I checked version 1.8.
> >>
> >> -David
> >>
> >> On Fri, Nov 13, 2009 at 12:31 AM, Eric Deutsch
> >> <[email protected]> wrote:
> >>>
> >>>
> >>> Hi Hendrik, I think we need to get an authoritative answer from David
> >>> on this one. And he is currently traveling in the Land of the Finns. We
> >>> will let/ask him to answer when he is next able.
> >>>
> >>> Regards,
> >>> Eric
> >>>
> >>>
> >>>> From: [email protected] [mailto:spctools-
> >>>> [email protected]] On Behalf Of Hendrik Weisser
> >>>>
> >>>> Hi!
> >>>>
> >>>> I'm working on the pepXML parser in OpenMS. I've been confronted with
> >>>> a type of pepXML file I hadn't seen before, where search results from
> >>>> different search engines - but for the same experiment - were
> >>>> collected in one file (with one "msms_run_summary" per search engine).
> >>>> I've added (maybe prematurely) support for this to the OpenMS parser,
> >>>> and then wanted to construct a simple pepXML file for testing
> >>>> purposes.
> >>>>
> >>>> In doing so, I've now come across a constraint in the pepXML schema
> >>>> (at least from v1.8 on) that says values of the "base_name" attribute
> >>>> (supposed to contain the full path to the searched mzXML file) in the
> >>>> "search_summary" element have to be unique within the document.
> >>>> What is the rationale behind this constraint? Is it supposed to
> >>>> prevent the above case, where different searches of the same
> >>>> experiment end up in one file? Why would that be desirable/necessary?
> >>>> (Also note that I can construct a valid and parseable pepXML file from
> >>>> two different search runs of the same file if I change the path in
> >>>> "base_name"...)
> >>>>
> >>>> In an earlier discussion
> >>>> (http://groups.google.com/group/spctools-discuss/msg/7760dcda02877922?hl=en),
> >>>> it was mentioned that
> >>>> "base_name"s in "msms_run_summary" elements had to be unique in the
> >>>> document - however, as per the schema, that's not true. Also, the
> >>>> "base_name" of an "msms_run_summary" is not tied to the "base_name" in
> >>>> subordinate "search_summary"s. If there were such a constraint, it
> >>>> would be impossible to have more than one "search_summary" under an
> >>>> "msms_run_summary" - however, this is allowed in the schema.
> >>>> When does it make sense to have different "base_name"s in an
> >>>> "msms_run_summary" and its subordinate "search_summary"(s)? Judging
> >>>> from the schema documentation and the files I've seen, it seems that
> >>>> the values should be the same. On the other hand, why have the
> >>>> attribute in both elements then?
> >>>>
> >>>> All this adds to my confusion about the appropriate use of
> >>>> "base_name"...
> >>>>
> >>>> I would be happy if someone could clear things up for me.
> >>>>
> >>>>
> >>>> Best regards
> >>>>
> >>>> Hendrik
> >>>>
> >>>>
> >>>
> >>>
> >>> --~--~---------~--~----~------------~-------~--~----~
> >>> You received this message because you are subscribed to the Google
> >>> Groups "spctools-discuss" group.
> >>> To post to this group, send email to [email protected]
> >>> To unsubscribe from this group, send email to
> >>> [email protected]<spctools-discuss%[email protected]>
> >>> For more options, visit this group at
> >>> http://groups.google.com/group/spctools-discuss?hl=en
> >>> -~----------~----~----~----~------~----~------~--~---
> >>>
> >>>
> >>
> >
>
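
For anyone who wants to check a file against the constraint Hendrik
describes (base_name values of search_summary elements being unique within
one document), here is a minimal Python sketch. The sample XML is a
stripped-down, hypothetical fragment and not a schema-valid pepXML file,
and duplicate_search_base_names is an illustrative helper, not part of TPP
or OpenMS. The helper ignores any XML namespace on the element names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified fragment: two searches of the same data
# file stored in one document, one msms_run_summary per search engine.
SAMPLE = """<msms_pipeline_analysis>
  <msms_run_summary base_name="/data/exp1/run01">
    <search_summary base_name="/data/exp1/run01" search_engine="MASCOT"/>
  </msms_run_summary>
  <msms_run_summary base_name="/data/exp1/run01">
    <search_summary base_name="/data/exp1/run01" search_engine="SEQUEST"/>
  </msms_run_summary>
</msms_pipeline_analysis>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def duplicate_search_base_names(xml_text):
    """Return the sorted base_name values used by more than one search_summary."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        elem.get("base_name")
        for elem in root.iter()
        if local_name(elem.tag) == "search_summary"
        and elem.get("base_name") is not None
    )
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(duplicate_search_base_names(SAMPLE))
```

A non-empty result means the document repeats a search_summary base_name.
Changing one of the paths, as Hendrik notes, makes the check pass even
though both searches still refer to the same data.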

