Re: [ccp4bb] problem of conventions

2011-04-06 Thread Ian Tickle
Bernard,

> Well, it *IS* broke.

As they say, "it works for me", so either you're using a different set
of programs from me, or you're using the same programs but in a
different way.  Perhaps you could be more specific as to which
program(s) appear to be broken?  If possible please post the
logfile(s) on this forum, then someone might recognise the problem(s).
Did you try reporting it to CCP4 (assuming of course we're talking
about CCP4 programs)?  You're the 2nd person in this thread to claim
that the space-group handling for the alternate settings is broken, so
it would be nice to get to the bottom of it!


> If you are running some type of process, as you
> implied in referring to LIMS, then there is a step in which you move from
> the crystal system and point group to the actual space group. So, at that
> point you identify P22121. The next clear step, automatically by software,
> is to convert to P21212, and move on. That doesn't take an enormous amount
> of code writing, and you have a clear trail on how you got there.

I'm puzzled why I need a workaround for a bug that only you and
possibly James have experienced: AFAIK no-one else has reported
problems with this recently.  Wouldn't it make more sense to fix
the bug(s)? - that way, everyone benefits and I don't need to do
anything!  Anyway, to respond to your suggestion: I've spent some time
looking into this (so I hope you'll forgive the delay in replying!),
and unfortunately it's not as simple as you think.  I can see 3 main
steps that would be required for a workaround:

Step 1 (create new crystal form entry):  First I would have to make a
copy of the entry for the old crystal form in the PROTEINS table,
giving it a new unique ID.  Then I would perform the
re-indexing/re-orientation operations on the reference & free-R MTZ
files and the PDB file for the refined structure, and change the
filename entries in the row of the PROTEINS table just created to point to
them.  This row also contains the parameters for MR, rigid-body
refinement, TLS and binding site definitions but these won't need to
be changed.  The user interface would need to be modified to give
users the option of implementing this change, since I know some
(most?) users who won't be happy to do it!

One problem I foresee is confusing the users with a multiplicity of
unit cells, since we already work with potentially 2 different cells
per crystal: first the 'canonical' unit cell for the crystal form from
the reference MTZ file header; then there's the unit cell for the
isomorphous crystal as found by the indexing software.  Users
understand that the indexing program won't necessarily choose the
reference cell, particularly in the situation you indicate below where
2 cell lengths are almost equal.  Now you want me to add a 3rd
possibly different unit cell, i.e. that after a second run of
re-indexing to the 'standard setting'; the users won't understand the
need for this.

Next comes a tricky bit: for tracking purposes I would somehow need to
make a link from the new crystal form to the old one, my guess is with
a self-referencing foreign key.  All the database applications for
doing searches & reports would need to be modified to recognise this
change.  This doesn't look trivial to me!  I would need to hand this
task over to the database administrator & programmers, since I'm not
involved with administration of the database.  Getting a clear trail
doesn't happen automatically, it has to be programmed!  I anticipate
some searching questions from all the users and the db admin, such as
"why do we need to do this?", "what bad things will happen if we
don't?" and "why haven't we seen these bad things happening before?".
I'm hoping that you will be able to provide convincing answers to
these questions - because I can't!
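
To make the "self-referencing foreign key" idea concrete, here is a minimal
sketch using Python's built-in sqlite3 module.  Only the table name PROTEINS
comes from the message above; the column names (space_group, reindexed_from)
and the helper functions are hypothetical, and a real LIMS schema will
certainly differ.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a self-referencing PROTEINS table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        """
        CREATE TABLE proteins (
            id             INTEGER PRIMARY KEY,
            space_group    TEXT NOT NULL,
            -- hypothetical column: points at the entry this one was re-indexed from
            reindexed_from INTEGER REFERENCES proteins(id)
        )
        """
    )
    return conn


def add_form(conn, space_group, reindexed_from=None):
    """Insert a crystal-form entry and return its new unique ID."""
    cur = conn.execute(
        "INSERT INTO proteins (space_group, reindexed_from) VALUES (?, ?)",
        (space_group, reindexed_from),
    )
    return cur.lastrowid


def trail(conn, form_id):
    """Follow reindexed_from links back to the original crystal-form entry."""
    chain = []
    while form_id is not None:
        row = conn.execute(
            "SELECT space_group, reindexed_from FROM proteins WHERE id = ?",
            (form_id,),
        ).fetchone()
        chain.append(row[0])
        form_id = row[1]
    return chain


conn = make_db()
old_id = add_form(conn, "P 2 21 21")
new_id = add_form(conn, "P 21 21 2", reindexed_from=old_id)
print(trail(conn, new_id))
```

The schema itself is short; the cost described above lies in every search and
report application that would have to learn to follow the extra link.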

Step 2 (re-index historical data): Then I would need to copy each
entry for the historical datasets that were previously added to the
database for the old crystal form to the new crystal form (of course
it's actually the _same_ crystal form, but we're fooling the LIMS into
treating it as though it were a new one).  This is so that we can
continue to track the data using the new crystal form ID.  All
datasets for a given crystal form must be indexed in the same way
since the LIMS interface allows you to mix & match PDB, MTZ & MAP
files for the crystal form without the need to do superpositions (of
course superpositions can be done if needed, but then you lose the
symmetry info).  These 'historical' datasets are all the ones
generated in the process of getting and optimising the crystal form,
i.e. from all the different constructs made (typically ~ 30 +/- 20),
the purifications and crystallisation trials, optimising the
cryobuffer & DMSO concentration for soaking ligands, then the datasets
used during the structure determination (MR/MAD/SAD etc).  This may
run to 100-150 datasets, but the actual number is immaterial since
it's just as easy to write the database application for many as for
one.  So a 

Re: [ccp4bb] problem of conventions

2011-04-01 Thread Ian Tickle
On Fri, Apr 1, 2011 at 5:30 AM, Santarsiero, Bernard D. b...@uic.edu wrote:
> Ian,
>
> I think it's amazing that we can program computers to resolve a < b < c
> but it would be a major undertaking to store the matrix transformations
> for 22121 to 21212 and reindex a cell to a standard setting.

I think you misunderstood the point I was making.  Multiply your one
by the several hundred datasets we sometimes collect for the various
clones and crystallisation conditions needed to optimise the crystal
form for soaking - that's what I mean by 'major undertaking'.  As I
explained, all the datasets collected for a given crystal form have to
be indexed the same way (even if only for archival purposes) before we
can store them in the database (otherwise we would end up in an awful
muddle!).  I don't have a batch script to filter all the relevant
datasets from the database, re-index each one (that's the easy part!),
and re-register them all as a new crystal form.  Why should I? -
no-one has given me a cogent reason to re-index them in the first
place which would justify the resulting downtime of the project (OK
call me lazy!).  I hope you see that doing each one manually is a
non-starter: the project would have to be locked during the period of
the operation so no new datasets could be down- or uploaded (which
would further cause the upstream pipeline to back up).  Operations that
appear trivial when you only have to do them once suddenly become big
problems when they have to be performed on an industrial scale!

> I was also
> told that I was lazy to not reindex to the standard setting when I was a
> grad student. Now it takes less than a minute to enter a transformation
> and re-index.

They told you wrong!  The conventional cell is the convention (by
definition!), and the standard setting doesn't always correspond to
the conventional cell (though in most cases it does).  There's a
reason for the distinction between meanings of 'standard' and
'conventional' - the meanings are very precise and
non-interchangeable.

> The orthorhombic rule of a < b < c makes sense in 222 or 212121, but when
> there is a standard setting of the 2-fold along the c-axis, then why not
> adopt that?

As I explained, sometimes we don't know the true space group (in terms
of assigning the screw axes) until further along the pipeline (e.g.
after MR or refinement), or at least it's always safer to be
non-committal beyond P222 - why commit oneself to an irrevocable
decision before it's absolutely necessary?  You don't need to know the
exact space group just to screen crystals for diffracting power!
Adopting the standard setting would in the particular case of SGs 5,
17 & 18 require later re-indexing & I hope you see why for us that's a
non-starter.

I'm not a believer in conventions for their own sake - a convention is
merely a default set of rules which you apply when you have no sound
basis on which to make a choice - the convention makes what is
effectively a totally arbitrary choice for you.  Conventions do have
the advantage that if other people follow them then they will make the
same decisions as you.  The moment I have sufficient justification
(e.g. as I said isomorphism overrides convention) to break with
convention then I would have no hesitation in doing so.  The fact that
the standard setting has a 2-fold along c is merely an arbitrary
choice and doesn't seem to me to be a good enough reason to break with
the unit-cell convention.

-- Ian


> On Thu, March 31, 2011 5:48 pm, Ian Tickle wrote:
>> On Thu, Mar 31, 2011 at 10:43 PM, James Holton jmhol...@lbl.gov wrote:
>>> I have the 2002 edition, and indeed it only contains space group
>>> numbers up to 230.  The page numbers quoted by Ian contain space group
>>> numbers 17 and 18.
>>
>> You need to distinguish the 'IT space group number' which indeed goes
>> up to 230 (i.e. the number of unique settings), from the 'CCP4 space
>> group number' which, peculiar to CCP4 (which is why I called it
>> 'CCP4-ese'), adds a multiple of 1000 to get a unique number for the
>> alternate settings as used in the API.  The pages I mentioned show the
>> diagrams for IT SG #18 P22121 (CCP4 #3018), P21221 (CCP4 #2018) and
>> P21212 (CCP4 #18), so they certainly are all there!
>>
>>> Although I am all for program authors building in support for the
>>> screwy orthorhombics (as I call them), I should admit that my
>>> fuddy-duddy strategy for dealing with them remains simply to use space
>>> groups 17 and 18, and permute the cell edges around with REINDEX to
>>> put the unique (screw or non-screw) axis on the c position.
>>
>> Re-indexing is not an option for us (indeed if there were no
>> alternative, it would be a major undertaking), because the integrity
>> of our LIMS database requires that all protein-ligand structures from
>> the same target & crystal form are indexed with the same (or nearly
>> the same) cell and space group (and it makes life so much easier!).
>> With space-groups such as P22121 it can happen (indeed it has
>> happened) that it was not possible to define the

[ccp4bb] problem of conventions

2011-04-01 Thread Boaz Shaanan
Excuse my naive (perhaps ignorant) question: when was the
a<b<c rule/convention/standard/whatever introduced? None of the
textbooks I came across mentions it as far as I could see (not that this is a
reason for or against this rule of course).

    Thanks,

   Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel
Phone: 972-8-647-2220 ; Fax: 646-1710
Skype: boaz.shaanan


Re: [ccp4bb] problem of conventions

2011-04-01 Thread Gerard Bricogne
Dear Boaz,

 I think you are the one who is finally asking the essential question. 
 
 The classification we all know about, which goes back to the 19th
century, is not into 230 space groups, but 230 space-group *types*, i.e.
classes where every form of equivalencing (esp. by choice of setting) has
been applied to the enumeration of the classes and the choice of a unique
representative for each of them. This process of maximum reduction leaves
very little room for introducing conventions like a certain ordering
of the lengths of cell parameters. This seems to me to be a major mess-up in
the field - a sort of second-hand mathematics by (IUCr) committee which
has remained so ill-understood as to generate all these confusions. The work
on the derivation of the classes of 4-dimensional space groups explained the
steps of this classification beautifully (arithmetic classes -> extension by
non-primitive translations -> equivalencing under the action of the
normaliser), the last step being the choice of a privileged setting *in
terms of the group itself* in choosing the representative of each class.
The extra convention a<b<c leads to choosing that representative in a way
that depends on the metric properties of the sample instead of once and for
all (how about that for a brilliant step backward!). Software providers then
have to de-standardise the set of 230 space group *types* (where each
representative is uniquely defined once you give the space group (*type*)
number) to accommodate all alternative choices of settings that might be
randomly thrown at them by the metric properties of e.g. everyone's
orthorhombic crystals. Mathematically, what one then needs to return to is
the step before taking out the action of the normaliser, but this picture
gets drowned in clerical disputes about low-level software issues.

 My own take on this (when I was writing symmetry-reduction routines for
my NCS-averaging programs, along with space-group specific FFT routines in
the dark ages) was: once you have a complete mathematical classification
that is engraved in stone (i.e. in the old International Tables and in 
crystallographic software as we knew it), then stick to it and re-index back
and forth to/from the unique representative listed under the IT number, as
needed - don't try and extend group-theoretic Tables to re-introduce
incidental metrical properties that had been so neatly factored out from the
final symmetry picture. Otherwise you get a dog's dinner.


 So much for my 0.02 Euro.
 
 
 With best wishes,
 
  Gerard.


-- 

 ===
 * *
 * Gerard Bricogne g...@globalphasing.com  *
 * *
 * Global Phasing Ltd. *
 * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
 * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
 * *
 ===


Re: [ccp4bb] problem of conventions

2011-04-01 Thread Ian Tickle
Dear Gerard,

The theory's fine as long as the space group can be unambiguously
determined from the diffraction pattern.  However practice is
frequently just like the ugly fact that destroys the beautiful theory,
which means that a decision on the choice of unit cell may have to be
made on the basis of incomplete or imperfect information (i.e.
mis-identification of the systematic absences).  The 'conservative'
choice (particularly if it's not necessary to make a choice at that
time!) is to choose the space group without screw axes (i.e. P222 for
orthorhombic).  Then if it turns out later that you were wrong it's
easy to throw away the systematic absences and change the space group
symbol.  If you make any other choice and it turns out you were wrong
you might find it hard sometime later to recover the reflections you
threw away!  This of course implies that the unit-cell choice
automatically conforms to the IT convention; this convention is of
course completely arbitrary but you have to make a choice and that one
is as good as any.
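
As an illustration of why P222 is the 'conservative' choice, the sketch below
flags the axial reflections that would be treated as systematically absent for
a given assignment of 2_1 screw axes in a primitive orthorhombic cell (a 2_1
axis along a allows h00 only for even h, and likewise for b and c).  The
function name and the example indices are hypothetical; it only shows which
reflections a particular space-group choice would discard.

```python
def is_screw_absence(hkl, screw_axes):
    """Return True if hkl is an axial reflection extinguished by a 2_1 screw.

    hkl        -- tuple of integer Miller indices (h, k, l)
    screw_axes -- set of axis names ('a', 'b', 'c') assumed to carry a 2_1 screw
    """
    h, k, l = hkl
    # For each axis: (index along that axis, whether the reflection lies on it)
    axial = {
        "a": (h, k == 0 and l == 0),
        "b": (k, h == 0 and l == 0),
        "c": (l, h == 0 and k == 0),
    }
    for axis in screw_axes:
        index, on_axis = axial[axis]
        if on_axis and index % 2 != 0:
            return True
    return False


reflections = [(3, 0, 0), (0, 3, 0), (0, 0, 4), (1, 2, 3)]

# P222: no screw axes assumed, so nothing is discarded.
print([r for r in reflections if is_screw_absence(r, set())])

# P22121: screws along b and c, so odd 0k0 and 00l would be discarded.
print([r for r in reflections if is_screw_absence(r, {"b", "c"})])
```

Keeping the data in P222 retains every axial reflection, so a later change of
symbol loses nothing; committing early to screw axes that turn out to be wrong
means the discarded reflections have to be recovered from somewhere.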

So at that point let's say this is the 1970s and you know it might be
several years before your graduate student is able to collect the
high-res data and do the model-building and refinement, so you publish
the unit cell and tentative space group, and everyone starts making
use of your data.  Some years later the structure solution and
refinement is completed and the space group can now be assigned
unambiguously.  The question is do you then revise your previous
choice of unit cell, risking the possibility of confusing everyone
including yourself, just in order that the space-group setting
complies with a completely arbitrary 'standard' (and the unit cell
becomes non-conventional), and requiring a re-index of your data (and
permutation of the co-ordinate datasets)?  Or do you stick with the IT
unit cell convention and leave it as it is?  For me the choice is easy
('if it ain't broke then don't fix it!').

Cheers

-- Ian


Re: [ccp4bb] problem of conventions

2011-04-01 Thread Santarsiero, Bernard D.
Dear Ian,

Well, it *IS* broke. If you are running some type of process, as you
implied in referring to LIMS, then there is a step in which you move from
the crystal system and point group to the actual space group. So, at that
point you identify P22121. The next clear step, automatically by software,
is to convert to P21212, and move on. That doesn't take an enormous amount
of code writing, and you have a clear trail on how you got there.
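
A minimal sketch of the conversion step described above, assuming a cyclic
permutation of the axes is used to move the non-screw 2-fold of P22121 from a
to c.  The function name and the example cell are hypothetical, and this only
covers the cell and the Miller indices, not the bookkeeping around them.

```python
def reindex_p22121_to_p21212(cell, reflections):
    """Cyclically permute axes so the 2-fold along a ends up along c.

    cell        -- (a, b, c) cell lengths of the P22121 setting
    reflections -- iterable of (h, k, l) Miller indices
    Returns the permuted cell (b, c, a) and the indices as (k, l, h).
    A cyclic permutation is even, so the axes stay right-handed.
    """
    a, b, c = cell
    new_cell = (b, c, a)
    new_reflections = [(k, l, h) for (h, k, l) in reflections]
    return new_cell, new_reflections


# Hypothetical cell indexed with a < b < c in P22121.
cell, hkl = reindex_p22121_to_p21212((40.0, 60.0, 80.0), [(1, 2, 3), (4, 0, 0)])
print(cell)  # (60.0, 80.0, 40.0)
print(hkl)   # [(2, 3, 1), (0, 0, 4)]
```

The same permutation would have to be applied to fractional coordinates,
(x, y, z) -> (y, z, x), for any model that goes with the data.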

To be even more intrusive, what if you had cell parameters of 51.100,
51.101, and 51.102, and it's orthorhombic, P21212? For other co-crystals,
soaks, mutants, etc., you might have both experimental errors and real
differences in the unit cell, so you're telling me that you would process
according to the a < b < c rule in P222 to average and scale, and then it
might turn out to be P22121, P21221, or P21212 later on? When you wish to
compare coordinates, you then have to re-assign one coordinate set to match
the other by using superposition, rather than taking the earlier step of
just using the conventional space group of P21212?

Again, while I see use of the a < b < c rule when there isn't an
overriding reason to assign it otherwise, as in P222 or P212121, there
*is* a reason to stick to the convention of one standard setting. That's
the rationale on using P21/n sometimes vs. P21/c, or I2 vs C2, to avoid a
large beta angle, and adopt a non-standard setting.

Finally, if you think it's fine to use P22121, then can I assume that you
also allow the use of space group A2 and B2?

Bernie



[ccp4bb] problem of conventions

2011-03-31 Thread Anita Lewit-Bentley

Dear all,

I would like to share my experience with a rather unexpected problem
of indexing conventions. Perhaps I can save people some time.

I have a crystal in the more unusual P21212 space-group (No 18). Its
unit cell lengths are b>a>c (please note). I systematically use XDS
for data integration, since so far it was able to handle even the most
horrible-looking spots.

Now XDS indexed my data in space-group 18, but with the axes order
a<b<c! It had, in fact, invented a space-group P22121, which does
not exist. I did not realise this until I had spent a couple of weeks
with beautiful peaks in rotation functions, but hopeless results in
translation functions. It wasn't until I looked more closely into the
definition of the screw axes that I realised the problem.

POINTLESS does not allow a reindexing of reflexions within the same
space-group, but fortunately REINDEX did the trick at the level of
intensities, because I like to use SCALA for careful scaling of my data.

So, basically, beyond just warning people who might encounter similar
problems, I was wondering if XDS could perhaps reindex reflexions
according to Int. Table conventions once the screw axes of a crystal
system have been identified?


With best wishes,

Anita


Anita Lewit-Bentley
Unité d'Immunologie Structurale
CNRS URA 2185
Département de Biologie Structurale & Chimie
Institut Pasteur
25 rue du Dr. Roux
75724 Paris cedex 15
FRANCE

Tel: 33- (0)1 45 68 88 95
FAX: 33-(0)1 40 61 30 74
email: ale...@pasteur.fr



Re: [ccp4bb] problem of conventions

2011-03-31 Thread Santarsiero, Bernard D.
If you are using CCP4, it can accommodate P22121. However, just reindex in
CCP4 to the correct setting with P21212.

Bernie Santarsiero






Re: [ccp4bb] problem of conventions

2011-03-31 Thread Phil Evans
The IUCr standard is to make a<b<c, see

http://nvl.nist.gov/pub/nistpubs/jres/107/4/j74mig.pdf
http://nvl.nist.gov/pub/nistpubs/jres/106/6/j66mig.pdf

and P 2 21 21 is a perfectly valid space group

Pointless will reindex within the same point group, or you can choose the a<b<c
convention (SETTING CELL_BASED) or the reference setting P 21 21 2 (SETTING
SYMMETRY_BASED), so you have a choice
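
To illustrate the difference between the two choices, here is a small sketch
of what a cell-based setting amounts to for a primitive orthorhombic crystal:
the axes are sorted by length and each axis keeps its own screw / non-screw
character.  This is not Pointless code; the function name and cell lengths are
hypothetical, and it deals only with the labels (an odd permutation of the
axes would also need a sign change on the indices to keep the cell
right-handed, which is not handled here).

```python
def cell_based_setting(cell, screw_flags):
    """Sort axes so that a <= b <= c, carrying the screw labels along.

    cell        -- (a, b, c) cell lengths in the current setting
    screw_flags -- (bool, bool, bool); True where that axis is a 2_1 screw
    Returns (sorted_cell, symbol) for the cell-based setting.
    """
    order = sorted(range(3), key=lambda i: cell[i])
    sorted_cell = tuple(cell[i] for i in order)
    symbol = "P " + " ".join("21" if screw_flags[i] else "2" for i in order)
    return sorted_cell, symbol


# Reference setting P 21 21 2 with b > a > c: the 2-fold lies on the shortest axis.
print(cell_based_setting((60.0, 80.0, 40.0), (True, True, False)))
# ((40.0, 60.0, 80.0), 'P 2 21 21')

# If c already happens to be the longest axis, the two settings coincide.
print(cell_based_setting((40.0, 60.0, 80.0), (True, True, False)))
# ((40.0, 60.0, 80.0), 'P 21 21 2')
```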

Most programs are perfectly happy with space group P 2 21 21 (I'm not sure 
about [auto]Sharp)

Phil



Re: [ccp4bb] problem of conventions

2011-03-31 Thread Tim Gruene
Dear Anita,

I happen to have a very similar problem today.

Does XDS use the desired setting if you provide it with the correct cell and
space group during the IDXREF step? You can otherwise re-index in CORRECT.


To comment on Phil:
I fed the mtz-file from pointless into ctruncate (or maybe it was scala) which
left the space group string (P2 21 21) but turned the space group number 18 into
3018 - this does screw up autosharp and maybe also other programs which use the
space group number/ symbol and not the symmetry operators.

Tim

 

-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

phone: +49 (0)551 39 22149

GPG Key ID = A46BEE1A





Re: [ccp4bb] problem of conventions

2011-03-31 Thread Ian Tickle
 To comment on Phil:
 I fed the mtz-file from pointless into ctruncate (or maybe it was scala) which
 left the space group string (P2 21 21) but turned the space group number 18 into
 3018 - this does screw up autosharp and maybe also other programs which use the
 space group number/symbol and not the symmetry operators.

If that's the case autosharp should be fixed so it recognises the
correct space group (in this case P22121, or #3018 in CCP4-ese)!  It has
been fixed in autoBuster so maybe you're using an old version of
autosharp?

-- Ian


Re: [ccp4bb] problem of conventions

2011-03-31 Thread Ian Tickle
 I would like to share my experience with a rather unexpected problem of
 indexing conventions. Perhaps I can save people some time.

 I have a crystal in the more unusual P21212 space-group (No 18). Its unit
 cell lengths are b>a>c (please note). I systematically use XDS for data
 integration, since so far it was able to handle even the most
 horrible-looking spots.

 Now XDS indexed my data in space-group 18, but with the axes order a<b<c!
 It had, in fact, invented a space-group P22121, which does not exist. I did
 not realise this until I had spent a couple of weeks with beautiful peaks
 in rotation functions, but hopeless results in translation functions. It
 wasn't until I looked more closely into the definition of the screw axes
 that I realised the problem.

 POINTLESS does not allow a reindexing of reflexions within the same
 space-group, but fortunately REINDEX did the trick at the level of
 intensities, because I like to use SCALA for careful scaling of my data.

 I was wondering if XDS could perhaps reindex reflexions according
 to Int. Table conventions once the screw axes of a crystal system have been
 identified?

The International Tables / IUCr / NIST convention _is_ a<=b<=c for
orthorhombic so no re-indexing is necessary or desirable.  See IT vol.
A 5th ed. (2002), table 9.3.4.1 (p. 758 in my edition) for all the
conventional cells.  The problem may be that some programs are not
sticking to the agreed convention - but then the obvious solution is
to fix the program (or use a different one).  Is the problem that XDS
is indexing it correctly as P22121 but calling it SG #18 (i.e. instead
of the correct #3018)?  That would certainly confuse all CCP4 programs
which generally tend to use the space-group number first if it's
available.
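
To make the cell-first convention concrete, here is a minimal Python
sketch. It assumes a primitive orthorhombic cell in which each axis is
labelled as either a 21 screw axis or a pure twofold; the function name
and the example cell lengths are hypothetical. It only derives the
symbol for the a<=b<=c setting and does not touch any reflection data.

```python
def conventional_setting(cell_lengths, is_screw):
    """Sort a primitive orthorhombic cell to a<=b<=c and name the setting.

    cell_lengths: (a, b, c) in the setting the cell is currently described in.
    is_screw:     (bool, bool, bool), True where that axis is a 21 screw axis.
    Returns the sorted cell lengths and the symbol for that axis order.
    """
    # Indices of the current axes, ordered from shortest to longest edge.
    order = sorted(range(3), key=lambda i: cell_lengths[i])
    sorted_cell = tuple(cell_lengths[i] for i in order)
    # The symbol simply follows the axes: "21" for a screw, "2" otherwise.
    symbol = "P" + "".join("21" if is_screw[i] else "2" for i in order)
    return sorted_cell, symbol


# Hypothetical cell described in the standard P21212 setting (pure twofold
# along c), where c happens to be the shortest edge.
cell, symbol = conventional_setting((50.0, 60.0, 40.0), (True, True, False))
print(cell, symbol)
```

In this sketch the symbol is a consequence of the sorted cell, not an
input, which is the order of reasoning described above.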

I'm not clear what you mean when you say P22121 doesn't exist?  It's
clearly shown in my edition of IT (p. 202).  Maybe your lab needs to
invest in the most recent edition of IT?

Cheers

-- Ian


Re: [ccp4bb] problem of conventions

2011-03-31 Thread Santarsiero, Bernard D.
Interesting. My IT, both volume I and volume A (1983) only have P21212 for
space group #18. Do I have to purchase a new volume A every year to keep
up with the new conventions?

Cheers,

Bernie



Re: [ccp4bb] problem of conventions

2011-03-31 Thread Ian Tickle
There are no 'new' conventions to keep up with: recent editions of the
old volume 1 or new A do not disagree on the question of the unit cell
conventions (except for minor details which don't affect the majority
of the common space groups), where by recent I mean going back ~ 70
years.  So it's certainly not the case that the conventions are
changing every year (that would be silly!) - they have been defined
exactly once in the last 100 years!  I believe the unit cell
conventions currently in use were actually first defined by the 1952
edition of International Tables, so both the 1969 edition (volume '1')
and the 1983 edition (1st of volume 'A') will certainly describe them.
 I have only the 2002 edition (the 5th) so I can't tell you exactly
where to find the relevant info in the older editions.  The very first
edition of IT (1935 I believe) did not define the unit cell
conventions, only the space groups, so I wouldn't recommend that!

The older editions did not include information on alternate
space-group settings simply in order to save paper: the 1952 edition
was published in the years following WW2 when there was a paper
shortage, so this was an important consideration!  Only one setting
(the 'standard setting') of each space group, chosen arbitrarily, was
described and the crystallographer was expected to permute it to get
the desired setting.  If you need to see all the alternate settings
laid out explicitly then you need to get hold of a recent (e.g. the
5th printed or 1st online) edition; failing that you have to work them
out yourself!  I thought the alternate settings were first described
(though possibly without the diagrams) in the 1st (1983) edition of
volume A, but I'm relying on memory and could well be wrong.  The
setting was often chosen to be consistent with a pre-existing
isomorphous structure (i.e. generally isomorphism overrides
convention); if there was none either the setting was defined by the
unit cell convention, or often it was simply easiest to use the
standard setting.  Of course not everyone followed the conventions: it
was common to write programs that could handle only the standard
settings (and it still is!).  Wiser programmers allowed space-groups
to be defined arbitrarily by the equivalent positions instead of the
number or symbol, so then it was straightforward to select any desired
alternate setting.
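
As an illustration of defining a setting by its equivalent positions,
the following Python sketch derives the general positions of P22121
from those of the standard setting P21212 by relabelling the axes. It
is a sketch under stated assumptions: the operators are restricted to
the diagonal form found in these groups, and the names (P21212_OPS,
permute_axes, format_op) are hypothetical, not taken from any CCP4
library.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ZERO = Fraction(0)

# General positions of P21212 in its standard setting, written as one
# (sign, shift) pair per coordinate:
#   x,y,z   -x,-y,z   -x+1/2,y+1/2,-z   x+1/2,-y+1/2,-z
P21212_OPS = [
    ((1, ZERO), (1, ZERO), (1, ZERO)),
    ((-1, ZERO), (-1, ZERO), (1, ZERO)),
    ((-1, HALF), (1, HALF), (-1, ZERO)),
    ((1, HALF), (-1, HALF), (-1, ZERO)),
]


def permute_axes(ops, perm):
    """Relabel axes: new axis i takes the role of old axis perm[i].

    Only valid for diagonal operators such as the ones listed above.
    """
    return [tuple(op[i] for i in perm) for op in ops]


def format_op(op):
    """Render an operator as a string such as '-x,y+1/2,-z+1/2'."""
    parts = []
    for axis, (sign, shift) in zip("xyz", op):
        term = ("-" if sign < 0 else "") + axis
        if shift:
            term += "+" + str(shift)
        parts.append(term)
    return ",".join(parts)


# Cyclic relabelling that moves the pure twofold from c to a.
P22121_OPS = permute_axes(P21212_OPS, (2, 0, 1))
for op in P22121_OPS:
    print(format_op(op))
```

A program that accepts such a list of operators directly has no need
to recognise the number or the symbol of the setting at all.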

Note that the convention describes the unit cells, from which the
space-group symbols are then derived, not the other way around.  The
rationale behind this is simple: there was a time not so long ago
(which I remember!) when data collection and structure solution for
even routine structures was actually non-trivial (I'm not implying
that it's always trivial even nowadays!).  However it was possible
relatively straightforwardly to obtain the unit cell from precession
photos (i.e. the indexing).  It used to be common practice to publish
an initial communication giving the unit cell and possibly a tentative
space group; this would be followed up (often several years later!) by
structures determined to successively higher resolution as more data
was collected.  Of course it was not possible to be 100% certain of
the space-group assignment from the precession photos (and for several
space-groups there is of course no unique space-group determinable
from the systematic absences alone); final space-group assignment
often had to wait several years for the structure determination.
Hence it made sense to define the setting from the unit cell, not the
space group.

I recommend the 2 papers from the US National Institute of Standards &
Technology (see Phil's posting) for more on this: the NIST conventions
are the same as the IUCr ones, i.e. based on the unit cell (in fact
Alan Mighell when he was at NIST wrote much of the unit-cell convention
material in IT).

-- Ian


Re: [ccp4bb] problem of conventions

2011-03-31 Thread James Holton
I have the 2002 edition, and indeed it only contains space group
numbers up to 230.  The page numbers quoted by Ian contain space group
numbers 17 and 18.

Although I am all for program authors building in support for the
screwy orthorhombics (as I call them), I should admit that my
fuddy-duddy strategy for dealing with them remains simply to use space
groups 17 and 18, and permute the cell edges around with REINDEX to
put the unique (screw or non-screw) axis on the c position.  I have
yet to encounter a program that gets broken when presented with data
that doesn't have a<b<c, but there are many non-CCP4 programs out
there that still don't seem to understand P22121, P21221, P2122 and
P2212.

This is not the only space group convention issue out there!  The
R3x vs H3x business continues to be annoying to this day!

-James Holton
MAD Scientist



Re: [ccp4bb] problem of conventions

2011-03-31 Thread Ian Tickle
On Thu, Mar 31, 2011 at 10:43 PM, James Holton jmhol...@lbl.gov wrote:
 I have the 2002 edition, and indeed it only contains space group
 numbers up to 230.  The page numbers quoted by Ian contain space group
 numbers 17 and 18.

You need to distinguish the 'IT space group number' which indeed goes
up to 230 (i.e. the number of unique settings), from the 'CCP4 space
group number' which, peculiar to CCP4 (which is why I called it
'CCP4-ese'), adds a multiple of 1000 to get a unique number for the
alternate settings as used in the API.  The page I mentioned shows the
diagrams for IT SG #18 P22121 (CCP4 #3018), P21221 (CCP4 #2018) and
P21212 (CCP4 #18), so they certainly are all there!
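
The numbering scheme just described can be sketched in a few lines of
Python. This is an illustration only: the function name and the small
lookup table are hypothetical, the table holds just the three settings
of IT #18 quoted above, and nothing here is the csymlib API.

```python
# Settings of IT space group 18 as quoted in this thread.
CCP4_SG18_SETTINGS = {18: "P21212", 2018: "P21221", 3018: "P22121"}


def split_ccp4_number(ccp4_number):
    """Split a CCP4-style number into (setting_index, it_number).

    Assumes the scheme described above: the IT number (1 to 230) plus a
    multiple of 1000 that labels an alternate setting.
    """
    setting_index, it_number = divmod(ccp4_number, 1000)
    if not 1 <= it_number <= 230:
        raise ValueError(f"{ccp4_number} does not map to an IT number")
    return setting_index, it_number


for number, symbol in sorted(CCP4_SG18_SETTINGS.items()):
    print(number, symbol, split_ccp4_number(number))
```

A program that keeps only the last three digits would treat all three
entries as plain #18, which is one way a file labelled 3018 could be
mishandled.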

 Although I am all for program authors building in support for the
 screwy orthorhombics (as I call them), I should admit that my
 fuddy-duddy strategy for dealing with them remains simply to use space
 groups 17 and 18, and permute the cell edges around with REINDEX to
 put the unique (screw or non-screw) axis on the c position.

Re-indexing is not an option for us (indeed if there were no
alternative, it would be a major undertaking), because the integrity
of our LIMS database requires that all protein-ligand structures from
the same target & crystal form are indexed with the same (or nearly
the same) cell and space group (and it makes life so much easier!).
With space-groups such as P22121 it can happen (indeed it has
happened) that it was not possible to define the space group correctly
at the processing stage due to ambiguous absences; indeed it was only
after using the SGALternative ALL option in Phaser and refining each
TF solution that we identified the space group correctly as P22121.
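
For reference, the candidates that such a search has to cover for a
primitive cell in point group 222 can be enumerated directly. The
Python sketch below is only the combinatorics (each axis is either a
pure twofold or a 21 screw); it is not Phaser's code, and the function
name is hypothetical.

```python
from itertools import product

# IT number by the count of screw axes: P222, P2221, P21212, P212121.
IT_NUMBER_BY_SCREW_COUNT = {0: 16, 1: 17, 2: 18, 3: 19}


def primitive_222_candidates():
    """Return (symbol, IT number) for every twofold/screw combination."""
    candidates = []
    for screws in product((False, True), repeat=3):
        symbol = "P" + "".join("21" if s else "2" for s in screws)
        candidates.append((symbol, IT_NUMBER_BY_SCREW_COUNT[sum(screws)]))
    return candidates


for symbol, it_number in primitive_222_candidates():
    print(it_number, symbol)
```

There are eight candidates in total, three of them settings of IT #17
and three of IT #18, so with a fixed a<b<c cell the alternate settings
cannot be skipped.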

Having learnt the lesson the hard way, we routinely use P222 for all
processing of orthorhombics, which of course always gives the
conventional a<b<c setting, and only assign the space group well down
the pipeline and only when we are 100% confident; by that time it's
too late to re-index (indeed why on earth would we want to give
ourselves all that trouble?).  This is therefore totally analogous to
the scenario of yesteryear that I described where it was common to see
a 'unit cell' communication followed some years later by the structure
paper (though we have compressed the gap somewhat!), and we base the
setting on the unit cell convention for exactly the same reason.

It's only if you're doing 1 structure at a time that you can afford
the luxury of re-indexing - and also the pain: many times I've seen
even experienced people getting their files mixed up and trying to
refine with differently indexed MTZ & PDB files (why is my R factor so
high?)!  My advice would be - _never_ re-index!

-- Ian


  I have
 yet to encounter a program that gets broken when presented with data
 that doesn't have a<b<c, but there are many non-CCP4 programs out
 there that still don't seem to understand P22121, P21221, P2122 and
 P2212.

I find that surprising!  Exactly which 'many' programs are those?  You
really should report them to CCP4 (or to me if it's one of mine) so
they can be fixed!  We've been using CCP4 programs as integral
components of our processing pipeline (from data processing through to
validation) for the last 10 years and I've never come across one
that's broken in the way you describe (I've found many broken for
other reasons and either fixed it myself or reported it - you should
do the same!).  Any program which uses csymlib with syminfo.lib can
automatically handle all space groups defined in syminfo, which
includes all the common alternates you mentioned (and others such as
I2).  The only program I'm aware of that's limited to the standard
settings is sftools (because it has its own internal space group table
- it would be nice to see it updated to use syminfo!).

 This is not the only space group convention issue out there!  The
 R3x vs H3x business continues to be annoying to this day!

Yeah to that!  H centring was defined in IT long ago (look it up) and
it has nothing to do with the R setting!

-- Ian


Re: [ccp4bb] problem of conventions

2011-03-31 Thread Santarsiero, Bernard D.
Ian,

I think it's amazing that we can program computers to resolve a < b < c
but it would be a major undertaking to store the matrix transformations
for 22121 to 21212 and reindex a cell to a standard setting. I was also
told that I was lazy to not reindex to the standard setting when I was a
grad student. Now it takes less than a minute to enter a transformation
and re-index.
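
The transformation in question can be sketched as a relabelling of the
axes applied to both the Miller indices and the cell. The Python below
is a minimal illustration with hypothetical names and a hypothetical
cell; it is not the REINDEX program, and it assumes that only cyclic
(even) permutations are wanted, since swapping two axes on its own
would invert the hand.

```python
def is_even_permutation(perm):
    """True if perm has an even number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def reindex(hkl, cell_lengths, perm):
    """Relabel axes so that new axis i is old axis perm[i]."""
    if sorted(perm) != [0, 1, 2]:
        raise ValueError("perm must be a permutation of (0, 1, 2)")
    if not is_even_permutation(perm):
        raise ValueError("odd permutation would invert the hand")
    new_hkl = tuple(hkl[i] for i in perm)
    new_cell = tuple(cell_lengths[i] for i in perm)
    return new_hkl, new_cell


# P22121 (pure twofold along a) to P21212 (pure twofold along c):
# new a = old b, new b = old c, new c = old a, i.e. h,k,l -> k,l,h.
TO_C_UNIQUE = (1, 2, 0)
print(reindex((1, 2, 3), (40.0, 50.0, 60.0), TO_C_UNIQUE))
```

The same permutation would also have to be applied to the atomic
coordinates of any model refined against the data, otherwise the
reflection file and the model no longer agree.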

The orthorhombic rule of a < b < c makes sense in 222 or 212121, but when
there is a standard setting of the 2-fold along the c-axis, then why not
adopt that? Often we chose a non-standard setting when there was a historical
precedent, as in the comparison of one structure to another, e.g., P21/c
with beta greater than 120deg vs. P21/n, etc. That is no more difficult
with modern computing than dragging along three space groups for #18.
There was a compactness to 230, and only 230 space groups. (I cheat, since
I agree there is both the rhombohedral and hexagonal cell settings for
R3bar.)

Bernie


