On Thu, Jul 31, 2008 at 10:04 AM, Edward d'Auvergne <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Thank you for answering all my questions. I think that clearly covers most things for me to start thinking about how this can be implemented (although I'm still unsure about how to input the bond lengths used in the calculation of the dipolar constants). For adding BMRB NMR-STAR v3.1 file format reading and writing capabilities, I've now created a branch of the relax 1.3 development line which is viewable at http://svn.gna.org/viewcvs/relax/. I think that it would be beneficial to add, in addition to the creation of STAR files for BMRB, reading capabilities simultaneously so that data from the BMRB can easily be read by relax and then a new or extended analysis performed (relax can also create input for Modelfree4 and Dasha, as well as control these programs).
>
> So the major difficulty in implementing this, as I see it, is the support for generic STAR formatted files or the specific NMR-STAR v3.1 file format. I have done extensive searches and although Python perfectly supports XML reading and writing, I haven't been able to find any Python packages for generic STAR format support. Would anyone know of a STAR or NMR-STAR 3.1 Dictionary reader/writer for Python? I could write a STAR format parser and writer, but that would take a lot of time. It would be easier if a Python package for this could be found or recycled. However the major issue with using a preexisting package would be legal issues with the copyright licensing. Ideally the STAR format parser and writer would be appropriately licensed, for example maybe using http://www.python.org/download/releases/2.4.2/license/, to allow incorporation into the standard Python modules (sitting alongside the XML reader/writer) so that all NMR programs with a Python interface, which is quite a few nowadays, could have much easier access to the BMRB data.
>
> I have found PyCIFRW (http://anbf2.kek.jp/CIF/) and this also includes PySTARRW which could be useful. However these have licensing issues which clash with the open source GPL licence of relax. So unfortunately I can't use these files. The only other Python STAR reader/writer I've found is that used in the CCPN data model (http://www.ccpn.ac.uk/). This has the ability to convert NMR STAR format to the CCPN data model format through the file 'ccpnmr/ccpnmr1.0/python/ccpnmr/format/converters/NmrStarFormat.py'. The copyright licensing should be ok, but unfortunately this is not a generic reader/writer but something which is tightly integrated into CCPN. Hence it would be too difficult to incorporate this file into relax. I would like to have relax interface with the CCPN data model (https://mail.gna.org/public/relax-devel/2007-11/msg00037.html), but this would be far into the future and a model-free analysis may not be fully supported by CCPN yet (https://mail.gna.org/public/relax-devel/2007-12/msg00002.html). One other thing I noticed at CCPN was a comment that a STAR reader/writer written in Python by Jurgen Doreleijers (http://tang.bmrb.wisc.edu/~jurgen/) was incorporated into their software. Do you know anything about this Python module?
>
> Once a usable STAR reader/writer is accessible by relax, then creating and reading BMRB deposition files should be relatively straightforward.
>
> Regards,
>
> Edward
>
>
> On Tue, Jul 29, 2008 at 7:16 PM, Eldon Ulrich <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Thank you for the quick response and feedback. I will try to answer as many of your comments and questions as I can below. We are converting all of our data from NMR-STAR v2.1 to NMR-STAR v3.1.
> > Examples of the v3.1 files can be found on the BMRB ftp site at
> >
> > ftp://ftp.bmrb.wisc.edu/pub/data/nmr-star-v3/
> >
> > These are early beta files and may have serious problems.
> >
> > For the purposes of this discussion, I will be referring to v3.1 tags. Descriptions for these tags can be found at this URL:
> >
> > http://www.bmrb.wisc.edu/formats.html
> >
> > Files containing a fake NMR-STAR v3.1 file (nmrstar3_fake.txt) and other information on the dictionary in its 'working' form are available from the BMRB ftp site:
> >
> > ftp://ftp.bmrb.wisc.edu/pub/data/nmr-star_dict/dictionary_files
> >
> > We are very open to suggestions from the community on how to model and capture relaxation data and are quite excited about this discussion. I am sure I have not addressed all of your questions, but I hope this is a start.
> >
> > Cheers,
> > Eldon
> >
> >
> > Edward d'Auvergne wrote:
> >>
> >> Hi,
> >>
> >> I've had a look at the fields and have a few questions as to how these should be implemented. I'm assuming that these are the fields for simply depositing R1 relaxation data into the BMRB, is this correct?
> >
> > The Excel file contains the tags for the fields in the ADIT-NMR deposition system that are mandatory. These fields represent for the most part the meta information about the molecule, sample, sample conditions, spectrometers, etc. The T1 fields were included as an example for one kind of relaxation data and the mandatory fields that would need to be entered in ADIT-NMR. The actual tables of data would be uploaded at the time of deposition.
> >>
> >> So the first question I have has to do with Rx versus Tx. Almost all theories for the interpretation of the T1 relaxation times are dependent upon this being in the R1 rate form (with units of rad.s^-1).
> >> relax (http://nmr-relax.com), Art Palmer's curvefit (http://cpmcnet.columbia.edu/dept/gsas/biochem/labs/palmer/software.html), David Fushman's RELAXFIT (http://gandalf.umd.edu/FushmanLab/pdsw.html), and almost all other programs calculate the Rx relaxation rate errors and not relaxation time errors via Monte Carlo simulation. Then the programs relax (http://nmr-relax.com), modelfree4 (http://cpmcnet.columbia.edu/dept/gsas/biochem/labs/palmer/software.html), dasha (http://www.nmr.ru/dasha.html), DYNAMICS (http://gandalf.umd.edu/FushmanLab/pdsw.html), Tensor2 (http://www.ibs.fr/ext/labos/LRMN/softs/welcome.htm), etc. all work with the rates and not the times. So the storage of relaxation times and their errors may not be very useful. Is it possible to deposit rates and their errors rather than the antiquated relaxation times and their errors?
> >
> > Yes, you can deposit rates and the appropriate error and not the times. The T1.Val and T1.Val_err tags can have units appropriate for either times or rates (i.e., s or s-1). In the header to the table of T1 values is a tag _Heteronucl_T1_list.T1_val_units. The value of this tag defines whether the T1 data have been expressed as times or rates.
> >
> > The terminology used for relaxation studies in NMR has been quite diverse. At the time these tags were constructed, the term 'T1' still seemed to be the most commonly used. But, we realized capturing the data as rates was extremely important and so we allowed for the units for the values to actually determine if the values were times or rates.
> >
> >> Also, conversion of the Rx relaxation rate errors to the Tx time errors would require full Monte Carlo simulation to be accurate, and I'm not sure if anyone would have done this properly.
> >> I could be wrong (anyone on this list who knows otherwise, please correct me), but I don't think there are any programs that use the Tx times or that properly convert Rx errors to Tx errors and vice versa.
> >>
> >> The second question I have has to do with the integration of relax with the BMRB deposition and automating the process. Can all data for a model-free analysis be deposited at once? For example if relax was to create a STAR formatted file with the ADIT-NMR fields with the R1, R2, and NOE values and errors at multiple fields, with the S2, S2f, S2s, te, ts, tf, and Rex parameters and errors, the selected model information (model name or parameters of the model), parameters such as the CSA value used and bond length, and global parameters such as the diffusion tensor, could this file be accepted? Or will this require multiple small files for multiple deposition?
> >>
> > All of the data can be uploaded as one file. The NMR-STAR format is modular and a single file can contain any number of modules (saveframes) of the same or different type, with a few exceptions. A module or saveframe begins with the key term 'save_somestring' and ends with the key term 'save_'. A file can contain as many R1, R2, and NOE modules as needed. Within each of the modules there is a header tag that takes as a value the field strength of the spectrometer used to collect the data in that module as well as the NMR experiment. It is important that the experiment used for the data be defined uniquely.
> >
> > The following list of tags contains most of the values you mention, S2, S2f, S2s, te, ts, Rex all with errors, and type of model fit. It is missing the tf, but this can be easily added. The units for te and ts are provided in the header tags _Order_parameter_list.Tau_e_val_units and _Order_parameter_list.Tau_s_val_units.
> > For the order parameter data, it is important to include the experiments used to collect the underlying data. In this way the order parameters are linked to the R1, R2, etc data used in doing the fitting. It is possible to include in the file a description of the software used and the 'method' or parameter file.
> >
> >
> > _Order_param.Order_param_val
> > _Order_param.Order_param_val_fit_err
> > _Order_param.Tau_e_val
> > _Order_param.Tau_e_val_fit_err
> > _Order_param.Rex_val
> > _Order_param.Rex_val_fit_err
> > _Order_param.Model_free_sum_squared_errs
> > _Order_param.Model_fit
> > _Order_param.Sf2_val
> > _Order_param.Sf2_val_fit_err
> > _Order_param.Ss2_val
> > _Order_param.Ss2_val_fit_err
> > _Order_param.Tau_s_val
> > _Order_param.Tau_s_val_fit_err
> > _Order_param.SH2_val
> > _Order_param.SH2_val_fit_err
> > _Order_param.SN2_val
> > _Order_param.SN2_val_fit_err
> >
> > The CSA data would be included in a separate module, but the same file.
> >
> >
> >> I've also noticed from some of the deposited data (e.g. http://www.bmrb.wisc.edu/data_library/gen_saveframe.php?accNum=6470&saveframe=T1_relaxation) that all the data is identified by residue number. For supporting analyses using nucleic acids, small biomolecules, or proteins where more than just the backbone NH relaxation has been studied, would it be possible to additionally have an atom or spin numerical code and textual label? If an analysis is done on a molecular complex, is the deposition of data for multiple molecules supported as well?
> >
> > The header tag of the type '_Heteronucl_T1_list.T1_coherence_type' is intended to provide an idea of the coherence being measured. In addition, the following set of tags or similar set for other kinds of data are provided for every row in a data value table.
> > The values for these tags allow an atom within a molecular assembly of almost any complexity (including ones that are undergoing chemical or conformational exchange) to be defined.
> >
> > _T1.Entity_assembly_ID
> > _T1.Entity_ID
> > _T1.Comp_index_ID
> > _T1.Seq_ID
> > _T1.Comp_ID
> > _T1.Atom_ID
> > _T1.Atom_type
> > _T1.Atom_isotope_number
> >
> > The data that is available from BMRB has been supplied by authors for the most part and the quality and how well the data are described is variable and in all cases out of our control as authors do not respond to our requests for better descriptions and more complete data sets.
> >
> >>
> >> I still have many questions about the fields, their format in the STAR file to deposit, which are compulsory, and which fields do not yet exist for deposition of all model-free data (much of this data can be seen in the relax results file http://svn.gna.org/viewcvs/relax/1.3/test_suite/shared_data/model_free/OMP/final_results_trunc_1.3.bz2). For example most of the STAR tags in http://www.bmrb.wisc.edu/data_library/gen_saveframe.php?accNum=5841&saveframe=S2_parameters are not in the Excel spreadsheet. And why are order parameters and their errors input using the STAR format tags '_S2_value' and '_S2_error' whereas the T1 fields are called '_T1_value' and '_T1_value_error' and the effective model-free internal correlation time te filed under '_Tau_e_value' and '_Tau_e_value_fit_error'?
> >
> > When working on an almost 5000 tag dictionary over many years, inconsistencies creep into the tag names. We have tried to eliminate these inconsistencies as much as possible in the NMR-STAR v3 dictionary, but I would guess there are still at least a few.
> >
> >> Would you have an example deposition text file formatted correctly using the ADIT-NMR tags in the Excel file?
> >> Or is this unmodified, for example is http://www.bmrb.wisc.edu/cgi-bin/explore.cgi?format=raw&bmrbId=5841 the same file as the one that the authors deposited?
> >
> > I do not have a full relaxation example file. For example files you should look in the directory on the ftp site listed above. We are working to clean up these files as quickly as possible.
> >
> >> And how is the field strength dependent data handled, e.g. in http://www.bmrb.wisc.edu/cgi-bin/explore.cgi?format=raw&bmrbId=4970 there are 2 spectrometers declared to be a 600 and 750, yet there is relaxation data at 500, 600 and 750 present in the file?
> >
> > As mentioned above, for each module containing data that are field strength dependent there should be a tag that takes as a value the field strength of the spectrometer used to collect the data. For data like order parameters that are derived from different sets of data, currently the experiment list is used to trace back to the input data and spectrometer field strength.
> >
> >>
> >> Cheers,
> >>
> >> Edward
> >>
> >>
> >> P.S. For reference, this message will soon appear at https://mail.gna.org/public/relax-devel/.
> >>
> >>
> >>
> >> On Mon, Jul 28, 2008 at 6:01 PM, Eldon Ulrich <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Hi Edward,
> >>>
> >>> Sorry for the delay in providing a list of the required ADIT-NMR fields. An Excel file with the information, compiled by one of our students, is attached. The table provides a fairly complete description of the field and where appropriate the dependencies on other fields. In terms of the experimental data, only the fields required for T1 relaxation data were included. The required fields may vary slightly depending on the kinds of data being deposited.
> >>>
> >>> I hope this information helps.
> >>> If you have any questions or need additional information, please let me know.
> >>>
> >>> All the best,
> >>> Eldon
> >>>
> >>> _______________________________________________
> >>> relax (http://nmr-relax.com)
> >>>
> >>> This is the relax-devel mailing list
> >>> [email protected]
> >>>
> >>> To unsubscribe from this list, get a password
> >>> reminder, or change your subscription options,
> >>> visit the list information page at
> >>> https://mail.gna.org/listinfo/relax-devel

Hi Ed

some alternatives

1. stardom (gpl; ignore what it says on the first web page and just look at the license) converts STAR files to an xml format
   http://www.pasteur.fr/recherche/unites/Binfs/stardom

2. ccpn format converters come in two parts (I have helped write one for import of data from xplor-marvin). I would have a look at ccpnmr1.0/python/ccp/format/nmrStar which is a basic star file reader framework...

3. I can assist here (my structure calculation stuff is now done [mostly so I am heading back to dynamics]) ;-)

regards
gary
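
For reference, the saveframe layout Eldon describes in the quoted text (a block opened by 'save_somestring' and closed by 'save_', holding header tags plus tables of data) can be read with fairly little code. The following is only a rough sketch in Python and not a substitute for stardom or the CCPN reader. The function name read_saveframes, the saveframe name T1_600 and all numbers in the example table are made up for illustration. The loop_/stop_ keywords come from the STAR format itself rather than from this thread. The sketch assumes one table row per line, no quoted or multi-line values, and that '#' only ever starts a comment.

```python
def read_saveframes(text):
    """Collect the saveframes found in STAR-formatted text.

    Returns {name: {"tags": {tag: value},
                    "loops": [{"tags": [...], "rows": [...]}]}}.
    """
    frames = {}
    frame = None  # saveframe currently being filled, if any
    loop = None   # loop (table) currently being filled, if any
    for raw in text.splitlines():
        # Strip comments, then split on whitespace (assumption: no quoted values).
        tokens = raw.split("#", 1)[0].split()
        if not tokens:
            continue
        head = tokens[0]
        if head.startswith("save_"):
            loop = None
            if head == "save_":      # closing keyword
                frame = None
            else:                    # 'save_somestring' opens a new saveframe
                frame = {"tags": {}, "loops": []}
                frames[head[len("save_"):]] = frame
        elif frame is None:
            continue                 # e.g. the 'data_...' line outside any saveframe
        elif head == "loop_":
            loop = {"tags": [], "rows": []}
            frame["loops"].append(loop)
        elif head == "stop_":
            loop = None
        elif loop is not None:
            if head.startswith("_"):
                loop["tags"].append(head)    # column name
            else:
                loop["rows"].append(tokens)  # one table row per line
        elif head.startswith("_") and len(tokens) > 1:
            frame["tags"][head] = " ".join(tokens[1:])  # header tag and value
    return frames


# Hypothetical input: tag names are taken from the thread, values are invented.
EXAMPLE = """
data_example
save_T1_600
   _Heteronucl_T1_list.T1_val_units   s-1
   loop_
      _T1.Comp_index_ID
      _T1.Comp_ID
      _T1.Atom_ID
      _T1.Val
      _T1.Val_err
      1  GLY  N  1.50  0.03
      2  ALA  N  1.62  0.04
   stop_
save_
"""

frames = read_saveframes(EXAMPLE)
t1 = frames["T1_600"]
units = t1["tags"]["_Heteronucl_T1_list.T1_val_units"]
for row in t1["loops"][0]["rows"]:
    record = dict(zip(t1["loops"][0]["tags"], row))
    print(record["_T1.Comp_ID"], record["_T1.Val"], record["_T1.Val_err"], units)
```

All values are kept as strings here. Whether a number is a rate or a time would then be decided from the units tag, as described in the quoted reply, before converting anything to floats.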

