-----Original Message-----
From: Charlie
Sent: Wednesday, February 18, 2004 8:56 AM
To: Warren DeLano
Subject: Re: [PyMOL] selecting multiple atoms ie oxygen
Warren DeLano wrote:
John,
color red, 5paa and elem o
The problem with using asterisks as wildcards in atom names is that
some ill-conceived PDB files actually use them in atom names.
However, PyMOL does support the use of a terminal wildcard in some
cases, such as with the delete command...
create obj01, none
create obj02, none
delete obj*
And with residue names
color red, as*
color blue, gl*
color pink, hi*
Cheers,
Warren
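The terminal-wildcard behaviour described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions, not
PyMOL's actual implementation: the function name matches_terminal_wildcard
is hypothetical, and the case-insensitive comparison is an assumption made
so that "as*" can match "ASP".

```python
def matches_terminal_wildcard(pattern: str, name: str) -> bool:
    """Treat only a trailing '*' as a wildcard; everything else is literal."""
    # Assumption: names are compared case-insensitively.
    pattern, name = pattern.lower(), name.lower()
    if pattern.endswith("*"):
        # Trailing asterisk: match any name that starts with the prefix.
        return name.startswith(pattern[:-1])
    return name == pattern


objects = ["obj01", "obj02", "5paa"]
selected = [o for o in objects if matches_terminal_wildcard("obj*", o)]
print(selected)

# The ambiguity for atom names: "C1*" read as a pattern also matches "C10".
print(matches_terminal_wildcard("C1*", "C10"))
```

The last call shows why the same rule is awkward for atom names: once a
trailing asterisk is a wildcard, the literal atom name C1* can no longer be
told apart from "anything starting with C1".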
Hi Warren,
I've struggled with this a couple of times.
Would it be worth finding another way round, so that * has a
consistent role?
Might it not be better to require users to somehow escape *'s
in atom names and then allow * as a wildcard? Although this
might be less clear for newbies, the lack of * as a wildcard
in atom names is itself confusing at present. I don't know much about
Python, but the Unix convention of escaping special chars with a
backslash would be an option; then one could
select foo, (bar//A/50/*1\*)
to get atoms O1* and C1* from the molecule.
You've probably been through this and have a perfectly sound
reason for not doing it!
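A rough Python sketch of the backslash-escaping idea follows. It is a
hypothetical illustration, not something PyMOL provides: the helper
compile_atom_pattern and the sample atom names are made up for the example.
An unescaped * becomes "match anything", while \* stands for a literal
asterisk.

```python
import re


def compile_atom_pattern(pattern: str) -> re.Pattern:
    """Translate a pattern where '*' is a wildcard and '\\*' is a literal '*'."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Backslash escape: take the next character literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
        elif ch == "*":
            # Unescaped asterisk: match any run of characters.
            parts.append(".*")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


# Hypothetical atom names, as might be found in one nucleotide residue.
atom_names = ["O1*", "C1*", "C1", "N1", "C2*"]
rx = compile_atom_pattern(r"*1\*")
hits = [n for n in atom_names if rx.fullmatch(n)]
print(hits)
```

Under these assumptions the pattern *1\* picks out O1* and C1* and skips C1,
N1 and C2*, which is the behaviour the example selection is after.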
The newbie issue is what troubled me. Unix hacks know how to edit PDB files
to replace asterisks with something more benign, and they know to escape
common wildcards -- but the ordinary person does not.
Clearly we can't make everyone happy, so perhaps consistency should be the
guide? But on the other hand, we are talking about a huge portion of the
PDB. Nearly every nucleic acid structure seems to suffer from this
unfortunate naming convention (~5000 PDB entries contain C1\* according to
grep of a recent copy of the PDB).
It is true that well-established conventions already exist for handling
asterisks, but I don't believe in following conventions blindly,
particularly when so many people would be negatively affected.
I welcome further discussion on this point. Guidance from the community
will be crucial, since I don't have a good solution in mind yet. Some food
for thought:
1) Are atom name wildcards really needed when a more precise way of
   selecting by element symbol already exists?
2) If so, then what are the proposals?
   a. Escape non-wildcard asterisks with a backslash? (regexp convention,
      but would trip up newbies, break current PyMOL scripts, and
      inconvenience a whole field of research)
   b. Escape wildcard asterisks in atom names with a backslash? (that
      would be backwards from the standard convention and create further
      confusion)
   c. Add a configurable atom_name_wildcard toggle?
   d. Support alternative wildcards for atom names?
      Would . or .* work?
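To make options (c) and (d) concrete, here is a small Python sketch of
matching with a configurable wildcard string. Everything in it is
hypothetical: match_atom_name and its wildcard parameter only stand in for
the proposed atom_name_wildcard setting, and nothing here reflects existing
PyMOL behaviour. It also assumes the wildcard matches any run of characters,
including none.

```python
import re
from typing import Optional


def match_atom_name(pattern: str, name: str, wildcard: Optional[str] = None) -> bool:
    """Match name against pattern using a configurable wildcard string.

    wildcard=None (or "") disables wildcards, so '*' stays a literal character.
    """
    if not wildcard:
        return pattern == name
    # Split on the wildcard, escape the literal pieces, and rejoin with '.*'.
    pieces = [re.escape(piece) for piece in pattern.split(wildcard)]
    return re.fullmatch(".*".join(pieces), name) is not None


# Wildcards off: "C1*" means exactly the atom named C1*.
print(match_atom_name("C1*", "C1*"))
print(match_atom_name("C1*", "C10"))

# Alternative wildcard ".": asterisks in atom names stay literal.
print(match_atom_name(".1*", "O1*", wildcard="."))
print(match_atom_name(".1*", "O1", wildcard="."))
```

The trade-off is that a script's meaning then depends on a global setting,
so the same selection could behave differently from one session to another.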
Also note that currently PyMOL doesn't have a regexp engine, and it doesn't
support full wildcards -- just terminal asterisks in a few situations. Full
regexp matching (and thus full convention adherence) would be a nice
addition in the future, but it will need to be configured somehow as well,
probably via some global setting like regexp_based_matching.
Cheers,
Warren