Hi Edward,

I agree with you. I had not thought about all the other files in relax since I was so focused on the 'generic' file... but your idea of implementing this at a higher level is quite logical and would give much more flexibility to relax. Maybe we could create a branch for this, as you proposed...
Concerning the generic format, what do you think we should do? Should we just introduce variables so the user tells relax which column holds what? The user could also state that the file is in the generic format, as well as how many header lines there are, etc. In fact, for now (until we get the automatic recognition stuff working), the user could specify everything about the formatting of the file...

What do you think? Lots of work ahead!

Cheers,

Séb :)

Edward d'Auvergne wrote:
> Hi,
>
> I think this is a great idea. But I don't think it's the right time or
> place to implement this. There are many output and input files in relax
> formatted with the 5 mol, res, and spin name and number columns. For
> example the sequence data reading and writing, the value reading and
> writing, the generic intensity reading (and possibly writing in the
> future), the relaxation data reading and writing, the spin deselection
> file reading, and the RDC and PCS reading and writing. So, as you can
> see, the idea you propose covers all of these file types and touches a
> lot of code. I think to implement this, we need a new branch.
>
> The way I see this being implemented is quite complex. I think we
> should have one special object, maybe in generic_fns.mol_res_spin, which
> parses the file and converts it into other special objects (all
> contained within the main object). There could be a method in this
> object called data_loop() which returns the spin_id, and an array with
> the data from the remaining columns. Maybe also a method which returns
> the column indices to help deconvolute the data array. Then this file
> parsing object can have the abilities of accepting the mol_name_col,
> etc. values to allow header-less or other non-standard formats to be
> supported. If none of the mol, res, and spin name or num values are
> given (maybe the user function default), then the text 'mol_name_col',
> etc. can be searched for.
> And if neither works, then time for a RelaxError.
>
> This object can also have a write() method to generate these files. We
> just need to create another method to feed in the mol name, res name and
> num, spin name and num, and finally the other data to be written.
>
> Obviously if this is implemented (well, actually 'when' is more
> appropriate here), then all the code mentioned in the first paragraph
> will have to change. This will require major surgery, although once the
> object is in place, the various parts of relax can be converted bit by
> bit and separately. So I think your idea should go into relax, but at a
> much higher level. What do you think of these ideas?
>
> Regards,
>
> Edward
>
>
> On Fri, 2008-12-05 at 11:56 -0500, Sébastien Morin wrote:
>
>> Hi Ed,
>>
>> What about making the code automatically recognize which columns are what?
>>
>> We could, for example, have the code determine the number of fields and
>> then search the header for strings such as 'res_num' or 'res_name', etc.,
>> and once all searched fields are recognized, assume that the remaining
>> fields are intensities to extract... The absent fields could be given a
>> default value such as 'None'. For this, we would need to have the header
>> sent to the intensity_generic() function (from the autodetect_format()
>> function).
>>
>> I think this would be great because it could allow users not to input
>> column numbers and have their files automatically parsed, in whatever
>> fields the data is...
>>
>> What do you think of this approach? Do you see any problem with it?
>>
>> Let me know what you think.
>>
>> Regards,
>>
>>
>> Séb :)
>>
>>
>> Edward d'Auvergne wrote:
>>
>>> Sorry, this task of the generic formatted file is far more complicated
>>> than I thought. Its structure should be modelled after the
>>> generic_fns.value.read() function, as this takes a similarly formatted
>>> file.
>>> Flexibility here is key - any int arguments for the
>>> mol_name_col, res_num_col, res_name_col, spin_num_col, spin_name_col
>>> should be acceptable. I.e. you can put this information at the end of
>>> the file if you are crazy enough. But most of the code in
>>> generic_fns.value.read() can be used. It just needs to be shifted
>>> into functions of generic_fns.spectrum such as
>>> number_of_header_lines() and intensity_generic().
>>>
>>> In the future I might write some functions in generic_fns.mol_res_spin
>>> to parse any spin specific but generically formatted file. But for
>>> now, the generic_fns.value.read() function needs to be mimicked. This
>>> is an insanely complex task, considering the additional flexibility I
>>> talked about in
>>> https://mail.gna.org/public/relax-devel/2008-12/msg00016.html
>>> especially the automatic reading with the spin specific columns being
>>> allowed to be anywhere. So if you think this is too much, I can take
>>> over at any point.
>>>
>>> Regards,
>>>
>>> Edward
>>>
>>>
>>> On Thu, Dec 4, 2008 at 7:30 PM, <[EMAIL PROTECTED]> wrote:
>>>
>>>> Author: semor
>>>> Date: Thu Dec 4 19:30:11 2008
>>>> New Revision: 8138
>>>>
>>>> URL: http://svn.gna.org/viewcvs/relax?rev=8138&view=rev
>>>> Log:
>>>> Modified the autodetection code for the generic format.
>>>>
>>>> This now recognizes the most generic format as in
>>>> 'test_suite/shared_data/peak_lists/generic_intensity2.txt'.
>>>>
>>>>
>>>> Modified:
>>>> 1.3/generic_fns/spectrum.py
>>>>
>>>> Modified: 1.3/generic_fns/spectrum.py
>>>> URL:
>>>> http://svn.gna.org/viewcvs/relax/1.3/generic_fns/spectrum.py?rev=8138&r1=8137&r2=8138&view=diff
>>>> ==============================================================================
>>>> --- 1.3/generic_fns/spectrum.py (original)
>>>> +++ 1.3/generic_fns/spectrum.py Thu Dec 4 19:30:11 2008
>>>> @@ -254,7 +254,7 @@
>>>> break
>>>>
>>>> # Generic format.
>>>> - if line[0] in ['mol_name', 'res_num', 'res_name', 'spin_num', 'spin_name']:
>>>> + if line[0] in ['mol_name', 'res_num', 'res_name', 'spin_num', 'spin_name'] or line[0] in ['Num', 'Name']:
>>>> return 'generic'
>>>>
>>>> # Sparky format.
>>>>
>>>>
>>>> _______________________________________________
>>>> relax (http://nmr-relax.com)
>>>>
>>>> This is the relax-commits mailing list
>>>> [EMAIL PROTECTED]
>>>>
>>>> To unsubscribe from this list, get a password
>>>> reminder, or change your subscription options,
>>>> visit the list information page at
>>>> https://mail.gna.org/listinfo/relax-commits
>>>>
>>> _______________________________________________
>>> relax (http://nmr-relax.com)
>>>
>>> This is the relax-devel mailing list
>>> [email protected]
>>>
>>> To unsubscribe from this list, get a password
>>> reminder, or change your subscription options,
>>> visit the list information page at
>>> https://mail.gna.org/listinfo/relax-devel
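
For reference, the column handling discussed in this thread (explicit *_col arguments first, then a header search, then an error if neither works) could be sketched roughly as below. This is only an illustration and not relax code: detect_columns() and this data_loop() are hypothetical names, column indices are assumed to be zero-based, ValueError stands in for RelaxError, and the loop yields a plain dict of the spin fields rather than a real spin_id string.

```python
# Hypothetical sketch, not part of relax: all names here are illustrative.

SPIN_COLS = ('mol_name', 'res_num', 'res_name', 'spin_num', 'spin_name')


def detect_columns(header, **col_args):
    """Map each spin column name to a zero-based index, or None if absent.

    Explicit arguments such as res_num_col=1 take priority, which allows
    header-less files. Otherwise the header line is searched for the name.
    """
    fields = header.split() if header else []
    cols = {}
    for name in SPIN_COLS:
        given = col_args.get(name + '_col')
        if given is not None:
            cols[name] = given
        elif name in fields:
            cols[name] = fields.index(name)
        else:
            cols[name] = None

    # Neither the arguments nor the header identified any spin column.
    if all(index is None for index in cols.values()):
        raise ValueError("no spin identification columns could be found")
    return cols


def data_loop(lines, cols):
    """Yield (spin field dict, list of remaining column values) per row."""
    used = {index for index in cols.values() if index is not None}
    for line in lines:
        row = line.split()
        if not row:
            continue  # skip blank lines
        info = {name: (row[index] if index is not None else None)
                for name, index in cols.items()}
        data = [value for i, value in enumerate(row) if i not in used]
        yield info, data


if __name__ == '__main__':
    # Made-up file content, for illustration only.
    lines = [
        "mol_name res_num res_name spin_num spin_name 0.01 0.05",
        "None 20 GLY None N 1.0e6 8.0e5",
    ]
    cols = detect_columns(lines[0])
    for info, data in data_loop(lines[1:], cols):
        print(info['res_num'], info['spin_name'], data)
```

With a header-less file the same loop would be driven by the arguments alone, e.g. detect_columns(None, res_num_col=0, spin_name_col=1). Everything not claimed by a spin column is handed back untouched, so the caller decides whether the remaining values are intensities, relaxation data, or something else.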

