David, just a quick reply to part of your message. Normally, I make a directory for each experiment and process the Mascot, SEQUEST, and possibly X!Tandem data from each mzXML file in that same directory. I append the name of the search to the TPP files, so I can tell which search engine the data came from. These data are sometimes combined (not always with iProphet, since I only recently implemented it), so I would violate your separate-directory-per-search-engine rule. I've never seen any documentation saying that either approach is required, so I doubt I am the only one processing data this way.
Greg

On Fri, Nov 20, 2009 at 12:51 PM, David Shteynberg <[email protected]> wrote:

> I will try to reason this through by trying to think from the
> original author's point of view. The idea is that different searches
> of the same data would happen in separate directories and the
> base_name (full path to the data file) would identify one search of an
> mzXML file representing one msms_run, and more than one search would
> never happen in one directory on the same file. Also it is natural to
> keep all results from one search engine run on an mzXML file in one
> place in the mzXML file. However, you could have more than one search
> that references the same data and these don't necessarily have to be
> placed together in the pepXML file. Although the problem with this is
> that you could have different paths to the same data and these would
> all be listed. In the iProphet tool (which combines results from
> multiple searches of the same data), I don't look at either base_name
> but rather the spectrum names themselves, with the combination of
> experiment_label, which is a user specified parameter that identifies
> data from the same experiment. The idea is that the combination of
> experiment_label and spectrum name will uniquely identify a spectrum
> searched. I hope this is helpful. Let us know if you have other
> questions.
>
> -David
>
> On Fri, Nov 20, 2009 at 10:30 AM, David Shteynberg
> <[email protected]> wrote:
> > OK I take that back. I see where the unique constraint is listed. I
> > will have to consider your questions further.
> >
> > -David
> >
> > On Fri, Nov 20, 2009 at 10:27 AM, David Shteynberg
> > <[email protected]> wrote:
> >> Hi Hendrik,
> >>
> >> The element msms_pipeline_analysis/msms_run_summary has an attribute
> >> base_name to specify the path to the datafile. In case the searched
> >> file specified is different from the original data file there is
> >> another entry in the element
> >> msms_pipeline_analysis/msms_run_summary/search_summary for base_name.
> >> As far as I know, there is nothing in the schema that requires these
> >> to be unique in the pepXML file. Can you point me to where this
> >> constraint is specified in the schema? I checked version 1.8.
> >>
> >> -David
> >>
> >> On Fri, Nov 13, 2009 at 12:31 AM, Eric Deutsch
> >> <[email protected]> wrote:
> >>>
> >>> Hi Hendrik, I think we need to get an authoritative answer from David on
> >>> this one. And he is currently traveling in the Land of the Finns. We will
> >>> let/ask him to answer when he is next able.
> >>>
> >>> Regards,
> >>> Eric
> >>>
> >>>> From: [email protected] [mailto:spctools-
> >>>> [email protected]] On Behalf Of Hendrik Weisser
> >>>>
> >>>> Hi!
> >>>>
> >>>> I'm working on the pepXML parser in OpenMS. I've been confronted with
> >>>> a type of pepXML file I hadn't seen before, where search results from
> >>>> different search engines - but for the same experiment - were
> >>>> collected in one file (with one "msms_run_summary" per search engine).
> >>>> I've added (maybe prematurely) support for this to the OpenMS parser,
> >>>> and then wanted to construct a simple pepXML file for testing
> >>>> purposes.
> >>>>
> >>>> In doing so, I've now come across a constraint in the pepXML schema
> >>>> (at least from v1.8 on) that says values of the "base_name" attribute
> >>>> (supposed to contain the full path to the searched mzXML file) in the
> >>>> "search_summary" element have to be unique within the document.
> >>>> What is the rationale behind this constraint? Is it supposed to
> >>>> prevent the above case, where different searches of the same
> >>>> experiment end up in one file? Why would that be desirable/necessary?
> >>>> (Also note that I can construct a valid and parseable pepXML file from
> >>>> two different search runs of the same file if I change the path in
> >>>> "base_name"...)
> >>>>
> >>>> In an earlier discussion
> >>>> (http://groups.google.com/group/spctools-discuss/msg/7760dcda02877922?hl=en),
> >>>> it was mentioned that
> >>>> "base_name"s in "msms_run_summary" elements had to be unique in the
> >>>> document - however, as per the schema, that's not true. Also, the
> >>>> "base_name" of an "msms_run_summary" is not tied to the "base_name" in
> >>>> subordinate "search_summary"s. If there were such a constraint, it
> >>>> would be impossible to have more than one "search_summary" under an
> >>>> "msms_run_summary" - however, this is allowed in the schema.
> >>>> When does it make sense to have different "base_name"s in an
> >>>> "msms_run_summary" and its subordinate "search_summary"(s)? Judging
> >>>> from the schema documentation and the files I've seen, it seems that
> >>>> the values should be the same. On the other hand, why have the
> >>>> attribute in both elements then?
> >>>>
> >>>> All this adds to my confusion about the appropriate use of
> >>>> "base_name"...
> >>>>
> >>>> I would be happy if someone could clear things up for me.
> >>>>
> >>>>
> >>>> Best regards
> >>>>
> >>>> Hendrik
> >>>>
> >>>
> >>> --~--~---------~--~----~------------~-------~--~----~
> >>> You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
> >>> To post to this group, send email to [email protected]
> >>> To unsubscribe from this group, send email to [email protected]<spctools-discuss%[email protected]>
> >>> For more options, visit this group at http://groups.google.com/group/spctools-discuss?hl=en
> >>> -~----------~----~----~----~------~----~------~--~---
> >>>
> >>
> >
> > --
> > You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
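For anyone wanting to check how a given file behaves with respect to the points raised in this thread, here is a minimal Python sketch. It counts how often each "base_name" value occurs across "search_summary" elements, and groups spectra by the (experiment_label, spectrum name) pair that David describes for iProphet. Assumptions: the function names, the sample paths, and the sample spectrum name are made up for illustration; experiment_label is simply passed in by the caller, since the thread does not say where it is stored; the inline sample is a stripped-down fragment, not a schema-valid pepXML file; and this is not how iProphet or the OpenMS parser is actually implemented.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}search_summary'."""
    return tag.rsplit("}", 1)[-1]


def children(element, name):
    """Direct children of element whose local tag name equals name."""
    return [c for c in element if local_name(c.tag) == name]


def summarize(root, experiment_label):
    """Return (repeated search_summary base_names, engines per spectrum key).

    experiment_label is supplied by the caller (an assumption of this sketch).
    """
    base_names = Counter()
    engines_by_spectrum = defaultdict(set)
    runs = [e for e in root.iter() if local_name(e.tag) == "msms_run_summary"]
    for run in runs:
        searches = children(run, "search_summary")
        for search in searches:
            base_names[search.get("base_name", "")] += 1
        engines = {search.get("search_engine") for search in searches}
        for query in children(run, "spectrum_query"):
            # Key on (experiment_label, spectrum name), not on base_name.
            key = (experiment_label, query.get("spectrum"))
            engines_by_spectrum[key] |= engines
    repeated = sorted(name for name, count in base_names.items() if count > 1)
    return repeated, dict(engines_by_spectrum)


# Hypothetical fragment: two searches of the same run in one file.
SAMPLE = """<msms_pipeline_analysis>
  <msms_run_summary base_name="/data/exp1/run01">
    <search_summary base_name="/data/exp1/run01" search_engine="MASCOT"/>
    <spectrum_query spectrum="run01.00010.00010.2"/>
  </msms_run_summary>
  <msms_run_summary base_name="/data/exp1/run01">
    <search_summary base_name="/data/exp1/run01" search_engine="SEQUEST"/>
    <spectrum_query spectrum="run01.00010.00010.2"/>
  </msms_run_summary>
</msms_pipeline_analysis>"""

repeated, engines = summarize(ET.fromstring(SAMPLE), "exp1")
print("repeated search_summary base_names:", repeated)
print("engines per (experiment_label, spectrum):", engines)
```

On a real file, the root element would come from ET.parse(path).getroot() instead of the inline sample. A non-empty list of repeated base_names marks a file that would run into the uniqueness constraint Hendrik describes, while the second result shows that the spectra from both searches can still be matched up without looking at base_name at all.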
