Re: [bug #21598] Considering molecule numbers when writing pyMol (or Molmol) macros

Martin Ballaschk Thu, 11 Dec 2014 06:47:11 -0800

Hi Edward,

sorry for responding so late – I totally forgot about your message and found it 
now again only by accident.


To answer your question, I think it is a good idea to leave it up to the user 
to number the chains accordingly. In the end, it seems like the user is the 
only one competent enough to judge which name the chains should have – relax 
would have to guess, as you explained, especially if one loads several "A 
chains" from different sources. Some kind of user interaction is necessary. 

Wouldn't be the simples solution to hand over a string array of IDs to the 
pymol macro, which in turn would assign these to the chains? The length of the 
array would have to match the number of chains. Which ID is attributed to which 
chain would be probably up to the order of the loaded molecules. 

It would certainly help already if there was *any* chain ID mentioned in the 
macros, which could be replaced by e.g. sed afterwards, if necessary. Now I 
just add the "and chain A" bit more or less manually for every line.

Cheers,
Martin


> On 11.11.2014, at 17:35, Edward d'Auvergne <edw...@nmr-relax.com> wrote:
> 
> Hi Martin,
> 
> I've been looking at old bug reports and was wondering if you still
> had interest in this PyMOL/MOLMOL macro issue?  Would your problem be
> solved if we added the chain ID as an argument to the
> pymol.macro_create user function?  If we added software specific
> selections to the pymol and molmol user functions, then maybe these
> can be re-run after a model-free analysis finishes and macros
> including this additional selection information could be manually
> created.  Such a solution essentially pushes the structure mapping
> problem back onto the user.  What do you think?
> 
> Cheers,
> 
> Edward
> 
> 
> 
> On 10 February 2014 08:48, Edward d'Auvergne <edw...@nmr-relax.com> wrote:
>> Hi Martin,
>> 
>> I've been thinking about this one for quite a while and there are some
>> complications.  The problem is with the mapping of molecules in the
>> relax data store and the original PDB chain IDs.  For example, if you
>> start with a PDB file with 3 chains but only load the structure with
>> chain ID B, then there will only be one structure in the internal
>> structural object.  Using relax's current logic, this would map to
>> chain ID A so the result would be a very confusing for user.
>> 
>> For relax to create valid PDB files, the original chain IDs have to be
>> forgotten (though they are stored in the cdp.structure object).  You
>> cannot load in a structure with chain ID B and write out its atoms
>> with chain ID B, because you could load into the relax data store a
>> second PDB file with a molecule with the chain ID of B as well.
>> Therefore the current relax logic is that each loaded molecule is
>> reassigned a chain ID from A to Z (followed by 0-9 and then a-z, as
>> specified at http://www.wwpdb.org/procedure.html#toc_4) - this is
>> mandated by the PDB standard.  So if you load in molecules from random
>> sources, they will be given the IDs A, B, C, etc. when using the
>> structure.write_pdb user function.  It is important to note that with
>> the relax structure user functions, you can convert one PDB file with
>> X models into one model with X structures in relax, or you can convert
>> X PDB structures into X models of one structure in relax.  This
>> important ability really complicates the molecule to molecule mapping.
>> 
>> The macros themselves are molecule independent - they will operate on
>> which ever structure is loaded.  So maybe what is needed is to access
>> the original chain IDs which are already stored.  This itself is
>> difficult.  Firstly, you can see the original IDs by looking at the
>> code.  Have a look at the add_atom() method in the
>> lib/structure/internal/molecule.py file.  This is called from the
>> fill_object_from_pdb() method.  In the add_atom() method, you will see
>> that the original chain IDs are stored as the molecule container
>> 'chain_id' list.  So it might be best for the macros to access these
>> original IDs rather than using the new IDs, however this is the
>> problem.
>> 
>> The challenge would be to pull out these original IDs.  For the
>> model-free analysis, the macro creation is via the module
>> specific_analyses/model_free/macro_base.py, using the classic_style()
>> method.  There you will see that for each parameter converted into a
>> macro, there is a spin_loop() call.  This returns the molecule name.
>> But note that information in relax is stored in a different place to
>> the structural data (cdp.mol verses cdp.structure).  This separation
>> is deliberate as cdp.mol must be able to handle the case of no
>> structural information and it is also a complex structure holding lots
>> of NMR information, whereas cdp.structure object is highly optimised
>> for speed and handling huge quantities of models and/or molecules and
>> only holds basic structural information.
>> 
>> The problem is that there is no link between cdp.mol and cdp.structure
>> to from one to the other.  This difficult issue would need to be
>> solved.  Then that would allow you to follow from the molecule name in
>> cdp.mol to the molecule in cdp.structure.structural_data[0] (this is
>> the first model), to the original chain IDs of each atom.  To find the
>> correct atom, this will also be a challenge as residue names and
>> numbers and spin names and numbers can be changed.  It may in fact be
>> easier to change the code generating the macros to use the
>> cdp.structure.atom_loop() function rather than the spin_loop().  Then
>> the pipe_control.mol_res_spin.generate_spin_id_unique() function could
>> be used to create a spin ID to try to match.  This will work quite
>> well for obtaining the residue and spin info.  But again here you have
>> the molecule matching problem, as the molecule names in cdp.mol and
>> cdp.structure are different, and the name in cdp.mol can be changed by
>> the user at any time.
>> 
>> So I will have to think about this issue even more.  Feel free to look
>> at the code I mentioned and see if you can see an easy solution!
>> 
>> Regards,
>> 
>> Edward
>> 
>> 
>> 
>> On 5 February 2014 11:16, Martin Ballaschk
>> <no-reply.invalid-addr...@gna.org> wrote:
>>> URL:
>>>  <http://gna.org/bugs/?21598>
>>> 
>>>                 Summary: Considering molecule numbers when writing pyMol (or
>>> Molmol) macros
>>>                 Project: relax
>>>            Submitted by: mab
>>>            Submitted on: Wed 05 Feb 2014 10:16:04 AM GMT
>>>                Category: relax's source code
>>> Specific analysis category: Model-free analysis
>>>                Priority: 5 - Normal
>>>                Severity: 1 - Wish
>>>                  Status: None
>>>             Assigned to: None
>>>         Originator Name:
>>>        Originator Email:
>>>             Open/Closed: Open
>>>                 Release: 3.1.5
>>>         Discussion Lock: Any
>>>        Operating System: All systems
>>> 
>>>    _______________________________________________________
>>> 
>>> Details:
>>> 
>>> Hi Edward.
>>> 
>>> Relax automatically creates mappings of the various model-free parameters 
>>> onto
>>> the loaded structure by generatiing .pml (or Molmol .mac) scripts/macros. To
>>> use them, the PDB structure has to be opened in PyMol, and then the mapping
>>> script has to be run.
>>> 
>>> The problem: when loading the mapping script, all chains of the current
>>> molecule are treated the same, i.e. the values are not only mapped to chain 
>>> A
>>> of my multimer, but also onto chain B, C, etc.
>>> 
>>> The reason is that these mappings are based on residue numbers only. To make
>>> one's life easier, all present molecules should be treated individually.
>>> 
>>> E.g., Instead of:
>>> select pept_bond, (name ca,n AND resi 2) or (name ca,c AND resi 1)
>>> 
>>> it should read:
>>> select pept_bond, (name ca,n AND resi 2 AND chain A) or (name ca,c AND resi 
>>> 1
>>> AND chain A)
>>> 
>>> in the pymol script.
>>> 
>>> To fix this, one just would have take into account the molecule number that 
>>> is
>>> read by relax:
>>> 
>>> structure.read_pdb(file='XYZ.pdb', read_mol=1)
>>> 
>>> "Molecule 1" would translate into "chain A", "molecule 2" to "chain B" , etc
>>> in the PyMol script, by looping over the molecules present, assuming all
>>> present molecules have been loaded from the same pdb. If the different
>>> molecules are loaded from different files, the molecule-chain mapping would
>>> make little sense. One way to circumvent this problem could be something 
>>> like
>>> a "multimer" flag to tell relax to specify molecule/chain numbers.
>>> 
>>> I don't know what the scripts would look like if there are several molecules
>>> loaded into relax at the same time. If there is no seperate treatment for
>>> them, a fix like this would probably help.
>>> 
>>> Cheers,
>>> Martin
>>> 
>>> 
>>> 
>>> 
>>> 
>>>    _______________________________________________________
>>> 
>>> Reply to this item at:
>>> 
>>>  <http://gna.org/bugs/?21598>
>>> 
>>> _______________________________________________
>>>  Message sent via/by Gna!
>>>  http://gna.org/
>>> 
>>> 
>>> _______________________________________________
>>> relax (http://www.nmr-relax.com)
>>> 
>>> This is the relax-devel mailing list
>>> relax-devel@gna.org
>>> 
>>> To unsubscribe from this list, get a password
>>> reminder, or change your subscription options,
>>> visit the list information page at
>>> https://mail.gna.org/listinfo/relax-devel


_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
relax-devel@gna.org

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel

Re: [bug #21598] Considering molecule numbers when writing pyMol (or Molmol) macros

Reply via email to