date:20201204

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Dale Tronrud

On 12/4/2020 12:15 PM, Marcin Wojdyr wrote:
> On Fri, 4 Dec 2020 at 19:16, Dale Tronrud  wrote:
>> learn the sequence you have to go to the mmCIF records that define the
>> connectivity between residues.  It is entirely possible that "3" comes
>> before "1" because these indexes don't contain any information, other
>> than being unique within the chain.
>
> In mmCIF you have label_seq_id that must be both unique and
> sequential. So 3 is always the third residue wrt to the full sequence.
>

   It is very important not to read more meaning into a data tag than 
is actually defined in the mmCIF spec.  _atom_site.label_seq_id is defined

http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_atom_site.label_seq_id.html

as a pointer into the _entity_poly_seq table.  It has to be a signed 
integer (although I'm not clear on what a negative value for a pointer 
means).  In that table there is a data item _entity_poly_seq.num,

http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_entity_poly_seq.num.html

which is not a pointer, not an ID, but a name for that particular 
_entity_poly_seq row.  It must be a number that is unique and 
sequential, and presumably indicates a "sequence number".  Note that the 
rows in _entity_poly_seq can be listed in the loop_ in any order.  You 
can't assume that the order they are listed in the mmCIF says anything 
about connectivity.  You get the order of the "things" in the sequence 
from _entity_poly_seq.num.

   This means that the _atom_site.label_seq_id could be "3", pointing 
to the third entry in _entity_poly_seq which happens to have its .num 
equal to "1".  You may not think that someone would choose to do this, 
but if the first .num is -15 you can't avoid a mismatch.  In either case 
the mmCIF is perfectly acceptable and the meaning is absolutely clear.

   Pulling up one of my favorite PDB entries I get

loop_
_entity_poly_seq.entity_id
_entity_poly_seq.num
_entity_poly_seq.mon_id
_entity_poly_seq.hetero
1 1   ILE n
1 2   THR n
1 3   GLY n
1 4   THR n
1 5   SER n
1 6   THR n
1 7   VAL n

These rows are listed in order of their .num item, and all the 
_atom_site.label_seq_id's will be equal to the _entity_poly_seq.num, but 
nothing in the spec forces that to be the case, and your software should 
not, ever, make that assumption.  Your software should also never assume 
that successive rows in _entity_poly_seq are chemically linked.  The 
order is arbitrary.  You also can't assume that the row with 
_entity_poly_seq.num equal to "3" is chemically linked to the one with 
.num equal to "2", much less the chemical nature of such a link. 
_entity_poly_seq is not a data table that defines chemistry, only 
"sequence".
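
   A minimal sketch of the row-order point, in Python.  The rows are 
made-up data listed out of order on purpose, and sequence_for_entity 
is a hypothetical helper, not part of any mmCIF library.  It takes 
the order from _entity_poly_seq.num and never from the position of a 
row in the loop_.

```python
# Rows of a hypothetical _entity_poly_seq loop, deliberately out of order.
# Each tuple is (entity_id, num, mon_id, hetero).
rows = [
    ("1", 3, "GLY", "n"),
    ("1", 1, "ILE", "n"),
    ("1", 4, "THR", "n"),
    ("1", 2, "THR", "n"),
]


def sequence_for_entity(rows, entity_id):
    """Return monomer names ordered by .num, ignoring file order."""
    selected = [row for row in rows if row[0] == entity_id]
    selected.sort(key=lambda row: row[1])  # order comes from .num only
    return [mon_id for _, _, mon_id, _ in selected]


print(sequence_for_entity(rows, "1"))
```

Note that this only recovers "sequence"; it says nothing about which 
residues are chemically linked.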

   The whole point of a proper data base structure is that you don't 
assume anything!  All information has to be specifically encoded in the 
tables of the data base.  If your software makes use of a particular 
tag, you should go to the definition of that tag and use it, and not 
make additional extrapolations about it.

   I'm not saying that the data tag definitions of mmCIF are perfect, 
far from it.  But the foundation of CIF is sound and you have to stick 
with that formal structure, based in data base theory, if you are going 
to get the benefit of a proper data base.

   We have been used to the slap-dash world of PDB format for decades, 
where we try to make it work by stuffing extra characters on the end of 
the line or in a little gap whose real purpose has been forgotten. 
This has led to nothing but grief.  When I was writing my refinement 
program I can tell you that the most complex and difficult subroutine 
system was the one trying to read PDB files.  There were PDB files that 
had the number of electrons in the atom written in the occupancy column! 
 Some had the name of a calcium atom shifted to the left and some did 
not, making them indistinguishable from Calpha atoms.  The PDB format is 
an insane mess and is completely unworkable.  Please, let it die!

   The problem with Dr. Croll's suggestion "Using chain A as an 
example, perhaps the glycans could become Ag1, Ag2, etc.?" is that it 
loads connectivity information into names.  How can one write a standard 
database validation script to verify the correctness of this 
information?  You have defined a meaning to the characters in a "name" 
which is not defined in the data schema.  On the other hand, the data in 
the mmCIF, as currently defined is certainly complete enough that his 
software could generate names of this style for display to his users. 
His user interface is not limited by mmCIF in any way, and "value added" 
features like this might make his software even more successful.
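
   As a rough sketch of what such a display layer could look like: 
the link records below are hand-written stand-ins for what a program 
would read from the connectivity records (e.g. struct_conn), and the 
key names, the display_labels helper and the choice to order glycans 
by attachment residue number are all my assumptions.  The labels are 
derived for display only and are never written back into the data.

```python
from collections import defaultdict

# Hypothetical link records: each glycan chain is covalently attached
# to a protein chain at a given residue number.
links = [
    {"glycan_asym": "E", "protein_asym": "A", "protein_seq_id": 152},
    {"glycan_asym": "D", "protein_asym": "A", "protein_seq_id": 40},
    {"glycan_asym": "F", "protein_asym": "B", "protein_seq_id": 40},
]


def display_labels(links):
    """Map each glycan chain ID to a label such as 'Ag1' (display only)."""
    by_parent = defaultdict(list)
    for link in links:
        by_parent[link["protein_asym"]].append(link)

    labels = {}
    for parent, attached in by_parent.items():
        # Ordering by attachment residue is a presentation choice only.
        attached.sort(key=lambda link: link["protein_seq_id"])
        for index, link in enumerate(attached, start=1):
            labels[link["glycan_asym"]] = f"{parent}g{index}"
    return labels


print(display_labels(links))
```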

   I certainly agree that the names chosen by the authors are of 
considerable value when examining their model in the light of their 
paper and understanding.  My understanding is that mmCIF has places for 
these names.  I do find it distressing that the PDB has chosen to

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Marcin Wojdyr

On Fri, 4 Dec 2020 at 19:16, Dale Tronrud  wrote:
>
> Creating meaning in the chain names "A, B, C, Ag1, Ag2, Ag3" is
> exactly the problem.

It's not about "creating meaning" but about consistent naming. For humans.

> "chain names" ( or "entity identifiers" if I
> recall the mmCIF terminology correctly) are simply database "indexes".

No, entity is a somewhat different thing (multiple chains can point to
the same entity). entity_id is specified in addition to label_asym_id
and auth_asym_id.
asym = "structural element in the asymmetric unit" (so-called chain).
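
A small Python sketch of that distinction, using made-up
(asym id, entity id) pairs of the kind found in a struct_asym loop;
chains_per_entity is a hypothetical helper, not a library function.

```python
from collections import defaultdict

# Hypothetical (asym_id, entity_id) pairs: chains A and B are two copies
# of entity 1, chain C is entity 2, and chains D and E share entity 3.
struct_asym = [("A", "1"), ("B", "1"), ("C", "2"), ("D", "3"), ("E", "3")]


def chains_per_entity(pairs):
    """Group chain (asym) IDs by the entity they are instances of."""
    groups = defaultdict(list)
    for asym_id, entity_id in pairs:
        groups[entity_id].append(asym_id)
    return dict(groups)


print(chains_per_entity(struct_asym))
```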

> The values of indices are meaningless in themselves, they are just
> unique values that can be used to unambiguously identify a record. In
> principle, you could just assign random ISO characters (I don't think
> mmCIF allows unicode) and the mmCIF would be considered identical.

And then you'd use this random string also in a publication when
referring to the chain, and in the user interface?

> You are trying to force meaning to the characters with an index, and
> that puts multiple types of information in a single field. As Robbie
> said already exists, if you want to encode connectivity into the data
> base you have to add records that define that connectivity.  That places
> the connectivity information explicitly in the data models and allows
> standard data base tools to track and validate.

No one was proposing to replace connectivity with names.
It was about naming that will be easier to work with for people.

> learn the sequence you have to go to the mmCIF records that define the
> connectivity between residues.  It is entirely possible that "3" comes
> before "1" because these indexes don't contain any information, other
> than being unique within the chain.

In mmCIF you have label_seq_id that must be both unique and
sequential. So 3 is always the third residue with respect to the full sequence.

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Dale Tronrud

   I agree that the user experience is very important, but that is not 
the purpose of a data base design.  The data schema is designed for the 
storage and manipulation of data by software in a clear and unambiguous 
way. The presentation of the data to a user is the job of the 
application developer, such as yourself.  As anyone who has looked 
inside a mmCIF will tell you, it was not designed for human reading or 
editing.  Yes, it can be manually edited and read, and that is handy for 
people like you and me, but the average human protein modeler shouldn't 
be in there.


   According to Robbie, the information is present in the mmCIF to 
allow you to code a tool that will allow your users to navigate the 
model.  Maybe we can discuss off-line ideas for how this can be done.


   Anyway, I agree with you that representing glycans with one sugar 
differently than poly-glycans is not the best solution.  The PDB has 
shown little interest in my opinions on such matters in the past so I'm 
not getting involved in that argument.  I just jumped in to defend the 
adherence of mmCIF to formal data base theory, and suggest that the 
software developers reading mmCIF also stick to those rules, and not 
make unwarranted assumptions about the meaning of data items.


Dale Tronrud

On 12/4/2020 10:37 AM, Tristan Croll wrote:
OK, I understand your point more clearly now - but I'm not sure I fully 
agree, for the simple reason that people aren't computers. You're right 
that for the purposes of software validation tools the chain IDs are 
essentially arbitrary - as long as they're unique, nothing else really 
matters. But to a human simply wanting to /explore/ a model in their 
favourite visualisation program this makes everything just that bit less 
intuitive - if they want to, say, go to the first glycan attached to 
chain A they have no way of doing so short of tracing through from the 
N-terminus until they find it, unless the program provides a tool that 
already understands the concept of "first glycan attached to chain A". 
So if we go forward with the "chain IDs are entirely arbitrary, 
therefore it doesn't matter what they are" approach, then every existing 
visualisation tool gets a little bit more difficult to use with glycans 
until their authors take the time to write new task-specific code.


In the grand scheme of things it's a minor issue, I suppose - but in my 
opinion it really is important to keep the experience of the end user in 
mind when making decisions like this.


*From:* Dale Tronrud 
*Sent:* 04 December 2020 18:16
*To:* Tristan Croll ; CCP4BB@JISCMAIL.AC.UK 

*Subject:* Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at 
the PDB -- N-glycans are now separate chains if more than one residue


     Creating meaning in the chain names "A, B, C, Ag1, Ag2, Ag3" is
exactly the problem.  "chain names" ( or "entity identifiers" if I
recall the mmCIF terminology correctly) are simply database "indexes".
The values of indices are meaningless in themselves, they are just
unique values that can be used to unambiguously identify a record. In
principle, you could just assign random ISO characters (I don't think
mmCIF allows unicode) and the mmCIF would be considered identical.

     You are trying to force meaning to the characters with an index, and
that puts multiple types of information in a single field.  As Robbie
said already exists, if you want to encode connectivity into the data
base you have to add records that define that connectivity.  That places
the connectivity information explicitly in the data models and allows
standard data base tools to track and validate.

     The idioms of the PDB cause problems that lead people to these
mistakes.  The PDB assigns the indices "1", "2", and "3" to residues in
a chain.  A person could be misled into thinking that "2" comes between
"1" and "3" in the sequence.  This is not necessarily true at all.  To
learn the sequence you have to go to the mmCIF records that define the
connectivity between residues.  It is entirely possible that "3" comes
before "1" because these indexes don't contain any information, other
than being unique within the chain.

Dale Tronrud

On 12/4/2020 9:46 AM, Tristan Croll wrote:

 This suggestion violates a basic principle of data base theory.  A
 single data item cannot encode two pieces of information.

I'm sorry if I was unclear, but I don't believe I was suggesting 
anything of the sort. Hopefully this example should make it more clear - 
I'm just suggesting a slight variation on the existing system, no more:


If we start with model containing 3 protein chains A-C, with chain A 
containing amino acid residues 1-200, and 3 N-linked glycans with 
residues numbered, say, 1000-1005, 1020-1026 and 1040-1043 (a fairly 
common approach I've seen taken to the problem in the past, and one I've 
taken myself), then if I understand correctly after

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Robbie Joosten

Ah yes, polymer connectivity depends on the order of appearance not the 
numbering. On top of that, the connectivity is implicit. There are structures 
where some chains are numbered in reverse order, especially in double helices. 
How convenient is it that each base pair has residues of the same number!  This 
is a proper code breaker for many programs and indeed some of our code has this 
issue as well (we discovered this very recently and are fixing this). 

What is even worse is that residues that are sequentially numbered are not 
necessarily connected. You can have residues 100 , 100A, 100B, 100C, 101 (yes, 
with insertion codes) and residues 100A, 100B and 100C were not modelled (or 
were deleted) because they did not have proper density. Many programs will 
gladly connect residue 100 to 101 and rip the structure to shreds when you do 
some sort of refinement. Well, at least in the olden days. When you interpret a 
structure model, you really have to check whether a possible peptide bond has a 
sensible length. Of course, you get away with not checking 99.99% of the time. 
But the PDB is pretty big nowadays...
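
A minimal Python sketch of such a check. The coordinates are made up, and the 
1.5 A cutoff is an assumed value chosen for illustration (a typical peptide 
C-N bond is about 1.33 A); is_linked is a hypothetical helper.

```python
import math

# Assumed cutoff in angstroms; a typical peptide C-N bond is about 1.33.
MAX_PEPTIDE_BOND = 1.5


def is_linked(c_xyz, n_xyz, cutoff=MAX_PEPTIDE_BOND):
    """True if the C of one residue is within bonding distance of the next N."""
    return math.dist(c_xyz, n_xyz) <= cutoff


# Made-up coordinates: residue 100 is followed by a real neighbour in one
# case, and by residue 101 across an unmodelled gap in the other.
c_of_100 = (0.0, 0.0, 0.0)
n_of_neighbour = (1.33, 0.0, 0.0)
n_of_101 = (9.8, 0.0, 0.0)

print(is_linked(c_of_100, n_of_neighbour))
print(is_linked(c_of_100, n_of_101))
```

The point is only that the decision to connect two residues comes from the 
geometry (or from explicit link records), not from the residue numbers.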

Cheers,
Robbie

> -Original Message-
> From: CCP4 bulletin board  On Behalf Of Dale
> Tronrud
> Sent: Friday, December 4, 2020 19:16
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
> PDB -- N-glycans are now separate chains if more than one residue
> 
> Creating meaning in the chain names "A, B, C, Ag1, Ag2, Ag3" is exactly 
> the
> problem.  "chain names" ( or "entity identifiers" if I recall the mmCIF
> terminology correctly) are simply database "indexes".
> The values of indices are meaningless in themselves, they are just unique
> values that can be used to unambiguously identify a record. In principle, you
> could just assign random ISO characters (I don't think mmCIF allows unicode)
> and the mmCIF would be considered identical.
> 
> You are trying to force meaning to the characters with an index, and that
> puts multiple types of information in a single field.  As Robbie said already
> exists, if you want to encode connectivity into the data base you have to add
> records that define that connectivity.  That places the connectivity
> information explicitly in the data models and allows standard data base tools
> to track and validate.
> 
> The idioms of the PDB cause problems that lead people to these mistakes.
> The PDB assigns the indices "1", "2", and "3" to residues in a chain.  A 
> person
> could be misled into thinking that "2" comes between "1" and "3" in the
> sequence.  This is not necessarily true at all.  To learn the sequence you 
> have
> to go to the mmCIF records that define the connectivity between residues.
> It is entirely possible that "3" comes before "1" because these indexes don't
> contain any information, other than being unique within the chain.
> 
> Dale Tronrud
> 
> On 12/4/2020 9:46 AM, Tristan Croll wrote:
> > This suggestion violates a basic principle of data base theory.  A
> > single data item cannot encode two pieces of information.
> >
> > I'm sorry if I was unclear, but I don't believe I was suggesting
> > anything of the sort. Hopefully this example should make it more clear
> > - I'm just suggesting a slight variation on the existing system, no more:
> >
> > If we start with model containing 3 protein chains A-C, with chain A
> > containing amino acid residues 1-200, and 3 N-linked glycans with
> > residues numbered, say, 1000-1005, 1020-1026 and 1040-1043 (a fairly
> > common approach I've seen taken to the problem in the past, and one
> > I've taken myself), then if I understand correctly after remediation
> > we'll have a model with protein chains A-C and glycan chains D-F. The
> > problem is, unless and until all the available visualisation software
> > updates to automatically associate chains D-F to chain A based on
> > linkage, the user just has to remember that chains D-F are actually the
> chain A glycans.
> > This is a simple case, but things quickly become far more messy when
> > you have multiple glycosylated species each with multiple glycans per
> chain.
> > If, instead, the new chain assignments were something like "A, B, C,
> > Ag1, Ag2, Ag3", then we have something that is far more immediately
> > accessible to the user.
> >
> > --
> > --
> > *From:* Dale Tronrud 
> > *Sent:* 04 December 2020 17:01
> > *To:* Tristan Croll ; CCP4BB@JISCMAIL.AC.UK
> > 
> > *Subject:* Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at
> > the PDB -- N-glycans are now separate chains if more than one residue
> >
> >      This suggestion violates a basic principle of data base theory.
> > A single data item cannot encode two pieces of information.  The whole
> > structure of CIF falls apart if this is done.
> >
> >      Does the new PDB convention contain a CIF record of the link that
> > bridges

Re: [ccp4bb] pdb-l: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Tristan Croll

No - they're changing the auth_asym_id. See 
https://www.wwpdb.org/documentation/carbohydrate-remediation:

Oligosaccharide molecules are classified as a new entity type, branched, 
assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF category 
introduced to define the type of branching (_pdbx_entity_branch.type) .


From: Greg Couch 
Sent: 04 December 2020 18:51
To: Luca Jovine ; Mailing List CCP4 ; 
Mailing List PDB, 
Cc: Tristan Croll 
Subject: pdb-l: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the 
PDB -- N-glycans are now separate chains if more than one residue

mmCIF has two different chain ids.  One is the label_asym_id, which is
used for internal consistency with the entities.  The other is the
auth_asym_id which is whatever the author chooses. If the glycans are
separate entities, then the label_asym_id HAS to be different for each
instance of an entity.  But the auth_asym_id could be the same for all
covalently bonded units.

In ChimeraX, we use the label_asym_id for the chain id when constructing
the molecule and we use the auth_asym_id for the chain id in the user
interface.
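
As a rough illustration of the two-id idea (a sketch with made-up data, not
how any particular program implements it; the Chain class and display_name
helper are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    label_asym_id: str  # unique per instance, used for internal bookkeeping
    auth_asym_id: str   # author-chosen name, shown to the user


# Made-up example: a protein and two glycans the author calls chain "A".
chains = [Chain("A", "A"), Chain("B", "A"), Chain("C", "A")]

# Internal lookups are keyed on the unique label_asym_id.
by_label = {chain.label_asym_id: chain for chain in chains}


def display_name(label_asym_id):
    """Return the author's chain name for an internal chain id."""
    return by_label[label_asym_id].auth_asym_id


print(len(by_label), display_name("B"))
```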

 -- Greg





Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Robbie Joosten

Hi Luca,

Your point remains completely valid and I agree that residues that can belong 
to a longer chain should be treated as such. The same problem exists with peptide 
ligands (at least in PDB times): if they consist of three residues they would 
get their own chains, with 2 residues they would not. It's spectacular how much 
code breaks on that. 

At the same time you have to understand that the PDB makes design choices and 
that some experimentalist will find an exception. Yes, there are PDB entries 
where they add amino acids as crystallisation additives. As an interesting 
exception, this works better for nucleic acids: A loose nucleotide, say AMP, 
has a different name than a nucleotide that is part of a polymer. 

Anyway, we should appreciate the work that goes into setting up something like 
the PDB, which is, by all means, a triumph in biological databases and an 
example to other fields.

Cheers,
Robbie



> -Original Message-
> From: CCP4 bulletin board  On Behalf Of Luca
> Jovine
> Sent: Friday, December 4, 2020 18:45
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
> PDB -- N-glycans are now separate chains if more than one residue
> 
> Dear Dale and Robbie,
> 
> I agree with your comments! But may I steer the discussion back to the
> original issue, which is that one-residue N-glycans are now treated
> differently from multi-residue N-glycans (although they are both covalently
> linked to a protein chain)? This inconsistency is independent of the file
> format…
> 
> Best, Luca
> 
> 
>   On 4 Dec 2020, at 18:30, Robbie Joosten wrote:
> 
>   Dear Dale,
> 
>   Yes, good point. Let's stop bending over backwards to come up with
> faux PDB compatibility and focus on making mmCIF better.
> 
>   There are struct_conn records that describe the linkages. This is
> enough to reconstruct the connectivity. There is an ongoing debate on how
> to capture the restraints for such linkages. But at least this can in 
> principle be
> captured in mmCIF whereas this is pretty much undoable in PDB format.
> 
>   Cheers,
>   Robbie
> 
> 
>   On 4 Dec 2020 18:01, Dale Tronrud wrote:
> 
> 
> 
>   This suggestion violates a basic principle of data base theory.  A
>   single data item cannot encode two pieces of information.  The whole
>   structure of CIF falls apart if this is done.
> 
>   Does the new PDB convention contain a CIF record of the link that
>   bridges between the protein chain and the, now separated, glycan chain?
>   If not, I think this is the principal failing of their new scheme.
> 
>   Dale Tronrud
> 
>   On 12/4/2020 12:06 AM, Tristan Croll wrote:
>   > To go one step further: in large, heavily glycosylated multi-
> chain complexes the assignment of a random new chain ID to each glycan
> will lead to headaches for people building visualisations using existing
> viewers, because it loses the easy name-based association of glycan to
> parent protein chain. A suggestion: why not take full advantage of the
> mmCIF capability for multi-character chain IDs, and name them by
> appending characters to the parent chain ID? Using chain A as an example,
> perhaps the glycans could become Ag1, Ag2, etc.?
>   >
>   >> On 4 Dec 2020, at 07:48, Luca Jovine wrote:
>   >>
>   >> CC: pdb-l
>   >>
>   >> Dear Zhijie and Robbie,
>   >>
>   >> I agree with both of you that the new carbohydrate chain
> assignment convention that has been recently adopted by PDB introduces
> confusion, not just for PDB-REDO but also - and especially - for end users.
>   >>
>   >> Could we kindly ask PDB to improve consistency by either
> assigning a separate chain to all covalently attached carbohydrates
> (regardless of whether one or more residues have been traced), or reverting
> to the old system (where N-/O-glycans inherited the same chain ID of the
> protein to which they are attached)? The current hybrid solution hardly
> seems optimal...
>   >>
>   >> Best regards,
>   >>
>   >> Luca
>   >>
>   >>> On 3 Dec 2020, at 20:17, Robbie Joosten wrote:
>   >>>
>   >>> Dear Zhijie,
>   >>>
>   >>> In general I like the treatment of carbohydrates now as
> branched polymers. I didn't realise there was an exception. It makes sense
> for unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as
> these might change during model building or, in my case, carbohydrate
> rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.
>

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Tristan Croll

OK, I understand your point more clearly now - but I'm not sure I fully agree, 
for the simple reason that people aren't computers. You're right that for the 
purposes of software validation tools the chain IDs are essentially arbitrary - 
as long as they're unique, nothing else really matters. But to a human simply 
wanting to explore a model in their favourite visualisation program this makes 
everything just that bit less intuitive - if they want to, say, go to the first 
glycan attached to chain A they have no way of doing so short of tracing 
through from the N-terminus until they find it, unless the program provides a 
tool that already understands the concept of "first glycan attached to chain 
A". So if we go forward with the "chain IDs are entirely arbitrary, therefore 
it doesn't matter what they are" approach, then every existing visualisation 
tool gets a little bit more difficult to use with glycans until their authors 
take the time to write new task-specific code.

In the grand scheme of things it's a minor issue, I suppose - but in my opinion 
it really is important to keep the experience of the end user in mind when 
making decisions like this.

From: Dale Tronrud 
Sent: 04 December 2020 18:16
To: Tristan Croll ; CCP4BB@JISCMAIL.AC.UK 

Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- 
N-glycans are now separate chains if more than one residue

Creating meaning in the chain names "A, B, C, Ag1, Ag2, Ag3" is
exactly the problem.  "chain names" ( or "entity identifiers" if I
recall the mmCIF terminology correctly) are simply database "indexes".
The values of indices are meaningless in themselves, they are just
unique values that can be used to unambiguously identify a record. In
principle, you could just assign random ISO characters (I don't think
mmCIF allows unicode) and the mmCIF would be considered identical.

You are trying to force meaning to the characters with an index, and
that puts multiple types of information in a single field.  As Robbie
said already exists, if you want to encode connectivity into the data
base you have to add records that define that connectivity.  That places
the connectivity information explicitly in the data models and allows
standard data base tools to track and validate.

The idioms of the PDB cause problems that lead people to these
mistakes.  The PDB assigns the indices "1", "2", and "3" to residues in
a chain.  A person could be misled into thinking that "2" comes between
"1" and "3" in the sequence.  This is not necessarily true at all.  To
learn the sequence you have to go to the mmCIF records that define the
connectivity between residues.  It is entirely possible that "3" comes
before "1" because these indexes don't contain any information, other
than being unique within the chain.

Dale Tronrud

On 12/4/2020 9:46 AM, Tristan Croll wrote:
> This suggestion violates a basic principle of data base theory.  A
> single data item cannot encode two pieces of information.
>
> I'm sorry if I was unclear, but I don't believe I was suggesting
> anything of the sort. Hopefully this example should make it more clear -
> I'm just suggesting a slight variation on the existing system, no more:
>
> If we start with model containing 3 protein chains A-C, with chain A
> containing amino acid residues 1-200, and 3 N-linked glycans with
> residues numbered, say, 1000-1005, 1020-1026 and 1040-1043 (a fairly
> common approach I've seen taken to the problem in the past, and one I've
> taken myself), then if I understand correctly after remediation we'll
> have a model with protein chains A-C and glycan chains D-F. The problem
> is, unless and until all the available visualisation software updates to
> automatically associate chains D-F to chain A based on linkage, the user
> just has to remember that chains D-F are actually the chain A glycans.
> This is a simple case, but things quickly become far more messy when you
> have multiple glycosylated species each with multiple glycans per chain.
> If, instead, the new chain assignments were something like "A, B, C,
> Ag1, Ag2, Ag3", then we have something that is far more immediately
> accessible to the user.
>
> 
> *From:* Dale Tronrud 
> *Sent:* 04 December 2020 17:01
> *To:* Tristan Croll ; CCP4BB@JISCMAIL.AC.UK
> 
> *Subject:* Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at
> the PDB -- N-glycans are now separate chains if more than one residue
>
>  This suggestion violates a basic principle of data base theory.  A
> single data item cannot encode two pieces of information.  The whole
> structure of CIF falls apart if this is done.
>
>  Does the new PDB convention contain a CIF record of the link that
> bridges between the protein chain and the, now separated, glycan chain?
> If not, I think this is the principal failing of their new scheme.
>
>

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Dale Tronrud

Creating meaning in the chain names "A, B, C, Ag1, Ag2, Ag3" is
exactly the problem. "chain names" ( or "entity identifiers" if I
recall the mmCIF terminology correctly) are simply database "indexes".
The values of indices are meaningless in themselves, they are just
unique values that can be used to unambiguously identify a record. In
principle, you could just assign random ISO characters (I don't think
mmCIF allows unicode) and the mmCIF would be considered identical.

You are trying to force meaning onto the characters of an index, and
that puts multiple types of information in a single field. As Robbie
said, the mechanism already exists: if you want to encode connectivity
into the data base you have to add records that define that connectivity.
That places the connectivity information explicitly in the data model and
allows standard data base tools to track and validate it.

The idioms of the PDB cause problems that lead people to these
mistakes. The PDB assigns the indices "1", "2", and "3" to residues in
a chain. A person could be misled into thinking that "2" comes between
"1" and "3" in the sequence. This is not necessarily true at all. To
learn the sequence you have to go to the mmCIF records that define the
connectivity between residues. It is entirely possible that "3" comes
before "1" because these indexes don't contain any information, other
than being unique within the chain.

Dale Tronrud

On 12/4/2020 9:46 AM, Tristan Croll wrote:

This suggestion violates a basic principle of data base theory. A
single data item cannot encode two pieces of information.

I'm sorry if I was unclear, but I don't believe I was suggesting
anything of the sort. Hopefully this example should make it more clear -
I'm just suggesting a slight variation on the existing system, no more:

If we start with model containing 3 protein chains A-C, with chain A
containing amino acid residues 1-200, and 3 N-linked glycans with
residues numbered, say, 1000-1005, 1020-1026 and 1040-1043 (a fairly
common approach I've seen taken to the problem in the past, and one I've
taken myself), then if I understand correctly after remediation we'll
have a model with protein chains A-C and glycan chains D-F. The problem
is, unless and until all the available visualisation software updates to
automatically associate chains D-F to chain A based on linkage, the user
just has to remember that chains D-F are actually the chain A glycans.
This is a simple case, but things quickly become far more messy when you
have multiple glycosylated species each with multiple glycans per chain.
If, instead, the new chain assignments were something like "A, B, C,
Ag1, Ag2, Ag3", then we have something that is far more immediately
accessible to the user.

*From:* Dale Tronrud
*Sent:* 04 December 2020 17:01
*To:* Tristan Croll ; CCP4BB@JISCMAIL.AC.UK

*Subject:* Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at
the PDB -- N-glycans are now separate chains if more than one residue

This suggestion violates a basic principle of data base theory. A
single data item cannot encode two pieces of information. The whole
structure of CIF falls apart if this is done.

Does the new PDB convention contain a CIF record of the link that
bridges between the protein chain and the, now separated, glycan chain?
If not, I think this is the principal failing of their new scheme.

Dale Tronrud

On 12/4/2020 12:06 AM, Tristan Croll wrote:
To go one step further: in large, heavily glycosylated multi-chain
complexes the assignment of a random new chain ID to each glycan will
lead to headaches for people building visualisations using existing
viewers, because it loses the easy name-based association of glycan to
parent protein chain. A suggestion: why not take full advantage of the
mmCIF capability for multi-character chain IDs, and name them by
appending characters to the parent chain ID? Using chain A as an
example, perhaps the glycans could become Ag1, Ag2, etc.?

On 4 Dec 2020, at 07:48, Luca Jovine wrote:

CC: pdb-l

Dear Zhijie and Robbie,

I agree with both of you that the new carbohydrate chain assignment convention
that has been recently adopted by PDB introduces confusion, not just for
PDB-REDO but also - and especially - for end users.

Could we kindly ask PDB to improve consistency by either assigning a
separate chain to all covalently attached carbohydrates (regardless of
whether one or more residues have been traced), or reverting to the old
system (where N-/O-glycans inherited the same chain ID of the protein to
which they are attached)? The current hybrid solution hardly seems
optimal...

Best regards,

Luca

On 3 Dec 2020, at 20:17, Robbie Joosten wrote:

Dear Zhijie,

In general I like the treatment of carbohydrates now as branched polymers. I didn't realise there was an exception. It makes sense for

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread radu

Hi Tristan,

I fully subscribe to your idea! I was quite surprised to see our model revised
with different glycan chain IDs upon PDB annotation. I imagine there must have
been some "administrative" reasoning behind this decision, but it's just a
nightmare for subsequent visualisation. And, to me at least, this change makes
no sense. Protein chains with covalently attached glycans are one biochemical,
structural and functional unit.

Best wishes,

Radu

> This suggestion violates a basic principle of data base theory.  A
> single data item cannot encode two pieces of information.
>
> I'm sorry if I was unclear, but I don't believe I was suggesting anything of
> the sort. Hopefully this example should make it more clear - I'm just
> suggesting a slight variation on the existing system, no more:
>
> If we start with a model containing 3 protein chains A-C, with chain A
> containing amino acid residues 1-200, and 3 N-linked glycans with residues
> numbered, say, 1000-1005, 1020-1026 and 1040-1043 (a fairly common approach
> I've seen taken to the problem in the past, and one I've taken myself), then
> if I understand correctly after remediation we'll have a model with protein
> chains A-C and glycan chains D-F. The problem is, unless and until all the
> available visualisation software updates to automatically associate chains D-F
> to chain A based on linkage, the user just has to remember that chains D-F are
> actually the chain A glycans. This is a simple case, but things quickly become
> far more messy when you have multiple glycosylated species each with multiple
> glycans per chain. If, instead, the new chain assignments were something like
> "A, B, C, Ag1, Ag2, Ag3", then we have something that is far more immediately
> accessible to the user.
>
> 
> From: Dale Tronrud 
> Sent: 04 December 2020 17:01
> To: Tristan Croll ; CCP4BB@JISCMAIL.AC.UK
> 
> Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB --
> N-glycans are now separate chains if more than one residue
>
>
> This suggestion violates a basic principle of data base theory.  A
> single data item cannot encode two pieces of information.  The whole
> structure of CIF falls apart if this is done.
>
> Does the new PDB convention contain a CIF record of the link that
> bridges between the protein chain and the, now separated, glycan chain?
>   If not, I think this is the principal failing of their new scheme.
>
> Dale Tronrud
>
> On 12/4/2020 12:06 AM, Tristan Croll wrote:
>> To go one step further: in large, heavily glycosylated multi-chain complexes
>> the assignment of a random new chain ID to each glycan will lead to
>> headaches for people building visualisations using existing viewers, because
>> it loses the easy name-based association of glycan to parent protein chain.
>> A suggestion: why not take full advantage of the mmCIF capability for
>> multi-character chain IDs, and name them by appending characters to the
>> parent chain ID? Using chain A as an example, perhaps the glycans could
>> become Ag1, Ag2, etc.?
>>
>>> On 4 Dec 2020, at 07:48, Luca Jovine  wrote:
>>>
>>> CC: pdb-l
>>>
>>> Dear Zhijie and Robbie,
>>>
>>> I agree with both of you that the new carbohydrate chain assignment
>>> convention that has been recently adopted by PDB introduces confusion, not
>>> just for PDB-REDO but also - and especially - for end users.
>>>
>>> Could we kindly ask PDB to improve consistency by either assigning a
>>> separate chain to all covalently attached carbohydrates (regardless of
>>> whether one or more residues have been traced), or reverting to the old
>>> system (where N-/O-glycans inherited the same chain ID of the protein to
>>> which they are attached)? The current hybrid solution hardly seems
>>> optimal...
>>>
>>> Best regards,
>>>
>>> Luca
>>>
 On 3 Dec 2020, at 20:17, Robbie Joosten 
 wrote:

 Dear Zhijie,

 In general I like the treatment of carbohydrates now as branched
 polymers. I didn't realise there was an exception. It makes sense for
 unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as
 these might change during model building or, in my case, carbohydrate
 rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.

 Cheers,
 Robbie

> -Original Message-
> From: CCP4 bulletin board  On Behalf Of Zhijie Li
> Sent: Thursday, December 3, 2020 19:52
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
> PDB -- N-glycans are now separate chains if more than one residue
>
> Hi all,
>
> I was confused when I saw mysterious new glycan chains emerging during
> PDB deposition and spent quite some time trying to find out what was
> wrong with my coordinates.  Then it occurred to me that a lot of recent
> structures also had tens of N-glycan chains.  Finally I realized that
>

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Tristan Croll

This suggestion violates a basic principle of data base theory.  A
single data item cannot encode two pieces of information.

I'm sorry if I was unclear, but I don't believe I was suggesting anything of 
the sort. Hopefully this example should make it more clear - I'm just 
suggesting a slight variation on the existing system, no more:

If we start with a model containing 3 protein chains A-C, with chain A containing 
amino acid residues 1-200, and 3 N-linked glycans with residues numbered, say, 
1000-1005, 1020-1026 and 1040-1043 (a fairly common approach I've seen taken to 
the problem in the past, and one I've taken myself), then if I understand 
correctly after remediation we'll have a model with protein chains A-C and 
glycan chains D-F. The problem is, unless and until all the available 
visualisation software updates to automatically associate chains D-F to chain A 
based on linkage, the user just has to remember that chains D-F are actually 
the chain A glycans. This is a simple case, but things quickly become far more 
messy when you have multiple glycosylated species each with multiple glycans 
per chain. If, instead, the new chain assignments were something like "A, B, C, 
Ag1, Ag2, Ag3", then we have something that is far more immediately accessible 
to the user.
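
The naming scheme suggested above can be sketched as a small Python
function. This is a minimal illustration, not anything the PDB does: the
input mapping from glycan chain ID to parent protein chain ID is assumed
to be already known, and glycan_labels is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict


def glycan_labels(parent_of: Dict[str, str]) -> Dict[str, str]:
    """Map each glycan chain ID to a label of the form '<parent>g<n>'.

    parent_of maps a glycan chain ID to the protein chain it is attached to.
    Glycans are numbered per parent in sorted order of their chain IDs.
    """
    counters: Dict[str, int] = defaultdict(int)
    labels: Dict[str, str] = {}
    for glycan in sorted(parent_of):
        parent = parent_of[glycan]
        counters[parent] += 1
        labels[glycan] = f"{parent}g{counters[parent]}"
    return labels


# The example from the text: glycan chains D-F all belong to protein chain A.
print(glycan_labels({"D": "A", "E": "A", "F": "A"}))
# {'D': 'Ag1', 'E': 'Ag2', 'F': 'Ag3'}
```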

From: Dale Tronrud 
Sent: 04 December 2020 17:01
To: Tristan Croll ; CCP4BB@JISCMAIL.AC.UK 

Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- 
N-glycans are now separate chains if more than one residue

This suggestion violates a basic principle of data base theory.  A
single data item cannot encode two pieces of information.  The whole
structure of CIF falls apart if this is done.

Does the new PDB convention contain a CIF record of the link that
bridges between the protein chain and the, now separated, glycan chain?
  If not, I think this is the principal failing of their new scheme.

Dale Tronrud

On 12/4/2020 12:06 AM, Tristan Croll wrote:
> To go one step further: in large, heavily glycosylated multi-chain complexes 
> the assignment of a random new chain ID to each glycan will lead to headaches 
> for people building visualisations using existing viewers, because it loses 
> the easy name-based association of glycan to parent protein chain. A 
> suggestion: why not take full advantage of the mmCIF capability for 
> multi-character chain IDs, and name them by appending characters to the 
> parent chain ID? Using chain A as an example, perhaps the glycans could 
> become Ag1, Ag2, etc.?
>
>> On 4 Dec 2020, at 07:48, Luca Jovine  wrote:
>>
>> CC: pdb-l
>>
>> Dear Zhijie and Robbie,
>>
>> I agree with both of you that the new carbohydrate chain assignment 
>> convention that has been recently adopted by PDB introduces confusion, not 
>> just for PDB-REDO but also - and especially - for end users.
>>
>> Could we kindly ask PDB to improve consistency by either assigning a 
>> separate chain to all covalently attached carbohydrates (regardless of 
>> whether one or more residues have been traced), or reverting to the old 
>> system (where N-/O-glycans inherited the same chain ID of the protein to 
>> which they are attached)? The current hybrid solution hardly seems optimal...
>>
>> Best regards,
>>
>> Luca
>>
>>> On 3 Dec 2020, at 20:17, Robbie Joosten  wrote:
>>>
>>> Dear Zhijie,
>>>
>>> In general I like the treatment of carbohydrates now as branched 
>>> polymers. I didn't realise there was an exception. It makes sense for 
>>> unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as 
>>> these might change during model building or, in my case, carbohydrate 
>>> rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.
>>>
>>> Cheers,
>>> Robbie
>>>
 -Original Message-
 From: CCP4 bulletin board  On Behalf Of Zhijie Li
 Sent: Thursday, December 3, 2020 19:52
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
 PDB -- N-glycans are now separate chains if more than one residue

 Hi all,

 I was confused when I saw mysterious new glycan chains emerging during
 PDB deposition and spent quite some time trying to find out what was
 wrong with my coordinates.  Then it occurred to me that a lot of recent
 structures also had tens of N-glycan chains.  Finally I realized that this
 phenomenon is a consequence of this PDB policy announced here in July.

 For future depositors who might also get puzzled, let's put it in a short
 sentence:  O- and N-glycans are now separate chains if they contain more
 than one residue; single residues remain with the protein chain.
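
The one-sentence rule quoted above can be written down as a tiny Python
function. This is only a paraphrase of the rule as summarised in this
thread, not the wwPDB's actual implementation; the function name and its
parameters are hypothetical.

```python
def glycan_chain_id(n_residues: int, protein_chain: str, new_chain: str) -> str:
    """Return the chain ID a covalently attached glycan ends up with.

    Rule as summarised in the thread: more than one residue gives the
    glycan its own chain; a single residue stays with the protein chain.
    """
    if n_residues < 1:
        raise ValueError("a glycan must contain at least one residue")
    return new_chain if n_residues > 1 else protein_chain


# A single NAG on chain A stays in A; a NAG-NAG on the same protein becomes D.
print(glycan_chain_id(1, "A", "D"))  # A
print(glycan_chain_id(2, "A", "D"))  # D
```

Two glycans on the same protein can therefore end up under different
chain IDs purely because of how many residues were built.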

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Luca Jovine

Dear Dale and Robbie,

I agree with your comments! But may I steer the discussion back to the original 
issue, which is that one-residue N-glycans are now treated differently from 
multi-residue N-glycans (although they are both covalently linked to a protein 
chain)? This inconsistency is independent of the file format…

Best, Luca

On 4 Dec 2020, at 18:30, Robbie Joosten wrote:

Dear Dale,

Yes, good point. Let's stop bending over backwards to come up with faux PDB 
compatibility and focus on making mmCIF better.

There are struct_conn records that describe the linkages. This is enough to 
reconstruct the connectivity. There is an ongoing debate on how to capture the 
restraints for such linkages. But at least this can in principle be captured in 
mmCIF whereas this is pretty much undoable in PDB format.
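
As a rough sketch of how the struct_conn records can be used to recover
which glycan chain belongs to which protein chain, here is a short Python
example. The rows are made-up data held in plain dictionaries; only the
item names (conn_type_id, ptnr1_auth_asym_id, ptnr2_auth_asym_id) are
taken from the mmCIF struct_conn category. Reading them from a real file
is left out, and glycan_parents is a hypothetical helper.

```python
from typing import Dict, Iterable, Set


def glycan_parents(struct_conn: Iterable[Dict[str, str]],
                   protein_chains: Set[str]) -> Dict[str, str]:
    """Map each non-protein chain to the protein chain it is covalently linked to."""
    parents: Dict[str, str] = {}
    for row in struct_conn:
        if row["conn_type_id"] != "covale":
            continue  # skip disulfides, metal coordination, etc.
        a = row["ptnr1_auth_asym_id"]
        b = row["ptnr2_auth_asym_id"]
        if a in protein_chains and b not in protein_chains:
            parents[b] = a
        elif b in protein_chains and a not in protein_chains:
            parents[a] = b
        # Links within one chain (e.g. sugar to sugar) carry no parent info.
    return parents


# Hypothetical struct_conn rows for proteins A and B with glycans D, E, F.
rows = [
    {"conn_type_id": "covale", "ptnr1_auth_asym_id": "A", "ptnr2_auth_asym_id": "D"},
    {"conn_type_id": "covale", "ptnr1_auth_asym_id": "D", "ptnr2_auth_asym_id": "D"},
    {"conn_type_id": "covale", "ptnr1_auth_asym_id": "E", "ptnr2_auth_asym_id": "A"},
    {"conn_type_id": "covale", "ptnr1_auth_asym_id": "B", "ptnr2_auth_asym_id": "F"},
    {"conn_type_id": "disulf", "ptnr1_auth_asym_id": "A", "ptnr2_auth_asym_id": "A"},
]
print(glycan_parents(rows, {"A", "B"}))  # {'D': 'A', 'E': 'A', 'F': 'B'}
```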

Cheers,
Robbie

On 4 Dec 2020 18:01, Dale Tronrud wrote:

This suggestion violates a basic principle of data base theory.  A
single data item cannot encode two pieces of information.  The whole
structure of CIF falls apart if this is done.

Does the new PDB convention contain a CIF record of the link that
bridges between the protein chain and the, now separated, glycan chain?
  If not, I think this is the principal failing of their new scheme.

Dale Tronrud

On 12/4/2020 12:06 AM, Tristan Croll wrote:
> To go one step further: in large, heavily glycosylated multi-chain complexes 
> the assignment of a random new chain ID to each glycan will lead to headaches 
> for people building visualisations using existing viewers, because it loses 
> the easy name-based association of glycan to parent protein chain. A 
> suggestion: why not take full advantage of the mmCIF capability for 
> multi-character chain IDs, and name them by appending characters to the 
> parent chain ID? Using chain A as an example, perhaps the glycans could 
> become Ag1, Ag2, etc.?
>
>> On 4 Dec 2020, at 07:48, Luca Jovine wrote:
>>
>> CC: pdb-l
>>
>> Dear Zhijie and Robbie,
>>
>> I agree with both of you that the new carbohydrate chain assignment 
>> convention that has been recently adopted by PDB introduces confusion, not 
>> just for PDB-REDO but also - and especially - for end users.
>>
>> Could we kindly ask PDB to improve consistency by either assigning a 
>> separate chain to all covalently attached carbohydrates (regardless of 
>> whether one or more residues have been traced), or reverting to the old 
>> system (where N-/O-glycans inherited the same chain ID of the protein to 
>> which they are attached)? The current hybrid solution hardly seems optimal...
>>
>> Best regards,
>>
>> Luca
>>
>>> On 3 Dec 2020, at 20:17, Robbie Joosten wrote:
>>>
>>> Dear Zhijie,
>>>
>>> In general I like the treatment of carbohydrates now as branched 
>>> polymers. I didn't realise there was an exception. It makes sense for 
>>> unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as 
>>> these might change during model building or, in my case, carbohydrate 
>>> rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.
>>>
>>> Cheers,
>>> Robbie
>>>
 -Original Message-
 From: CCP4 bulletin board On Behalf Of Zhijie Li
 Sent: Thursday, December 3, 2020 19:52
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
 PDB -- N-glycans are now separate chains if more than one residue

 Hi all,

 I was confused when I saw mysterious new glycan chains emerging during
 PDB deposition and spent quite some time trying to find out what was
 wrong with my coordinates.  Then it occurred to me that a lot of recent
 structures also had tens of N-glycan chains.  Finally I realized that this
 phenomenon is a consequence of this PDB policy announced here in July.

 For future depositors who might also get puzzled, let's put it in a short
 sentence:  O- and N-glycans are now separate chains if they contain more
 than one residue; single residues remain with the protein chain.

 https://www.wwpdb.org/documentation/carbohydrate-remediation

 "Oligosaccharide molecules are classified as a new entity type, branched,
 assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF
 category introduced to define the type of branching
 (_pdbx_entity_branch.type) . "

 I found the

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Nave, Colin (DLSLtd,RAL,LSCI)

Michel
Yes, a good point, and relevant to the difference between AlphaGo and 
AlphaFold2. My understanding is that AlphaGo did begin with information about 
previous games, but after this it played against itself and became 
significantly better. AlphaFold2 relied perhaps completely on knowledge of 
previous "games" but didn't have an opponent to play against.

There is a difference between the intrinsic nature of the folding problem and 
the successful implementation, using additional information,  of AlphaFold2. I 
was really asking about the intrinsic nature of the folding problem (and Chess, 
Go) but, in practice, the question is probably not particularly relevant.

It might be true, for single isolated proteins that "all the information 
required for the 3D structure is in the sequence." However, many proteins can 
and do form amyloids. I think it was Chris Dobson who pointed out that most 
sequences would form amyloids and only a small number of sequences, tuned by 
natural selection, would form useful folds. Even these could easily revert to 
amyloids (otherwise known as the precipitant in the crystallisation well). 
Chaperones get involved and there is the issue of kinetic rather than 
thermodynamic control. See also James Holton's comments about energy 
minimisation. All this just indicates that the problem would be very hard 
without known structures. However, the advantage for predicting structure from 
sequence is that one can assume that the vast majority of sequences people are 
interested in will fold into something useful, rather than an amyloid. Of 
course spider silk forms amyloid fibres and they are structurally useful.

All interesting issues
  Colin


From: CCP4 bulletin board  On Behalf Of Michel Fodje
Sent: 04 December 2020 15:58
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

I think the results from AlphaFold2, although exciting and a breakthrough, are 
being exaggerated just a bit.  We know that all the information required for 
the 3D structure is in the sequence. The protein folding problem is simply how 
to go from a sequence to the 3D structure. This is not a complex problem in the 
sense that cells solve it deterministically.  Thus the problem is due to lack 
of understanding and not due to complexity.  AlphaFold and all the others 
trying to solve this problem are "cheating" in that they are not just using the 
sequence, they are using other sequences like it (multiple-sequence 
alignments), and they are using all the structural information contained in the 
PDB.  All of this information is not used by the cells.   In short, unless 
AlphaFold2 now allows us to understand how exactly a single protein sequence 
produces a particular 3D structure, the protein folding problem is hardly 
solved in a theoretical sense. The only reason we know how well AlphaFold2 did 
is because the structures were solved and we could compare with the 
predictions, which means verification is lacking.

The protein folding problem will be solved when we understand how to go from a 
sequence to a structure, and can verify a given structure to be correct without 
experimental data. Even if AlphaFold2 got 99% of structures right, your next 
interesting target protein might be the 1%. How would you know?   Until then, 
what AlphaFold2 is telling us right now is that all (most) of the information 
present in the sequence that determines the 3D structure can be gleaned in bits 
and pieces scattered between homologous sequences, multiple-sequence 
alignments, and other protein 3D structures in the PDB.  Deep Learning allows a 
huge amount of data to be thrown at a problem and the back-propagation of the 
networks then allows careful fine-tuning of weights which determine how 
relevant different pieces of information are to the prediction.  The networks 
used here are humongous and a detailed look at the weights (if at all feasible) 
may point us in the right direction.


From: CCP4 bulletin board On Behalf Of Nave, Colin (DLSLtd,RAL,LSCI)
Sent: December 4, 2020 9:14 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

The subject line for Isabel's email is very good.

I do have a question (more a request) for the more computer scientist oriented 
people. I think it is relevant for where this technology will be going. It 
comes from trying to understand whether problems addressed by Alpha are NP, 
NP-hard, NP-complete etc. My understanding is that the previous successes of Alpha 
were for complete information games such as Chess and Go. Both the rules and 
the present position were available to both sides. The folding problem might be 
in a different category. It would be nice if someone could explain the 
difference (if any) between Go and the protein folding problem perhaps using 
the NP type categories.

Colin

Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Georg Mlynek

Hi Jack, was the dewar shipped on 1st April and just arrived? You can
buy the CX100 also as CXR100 and there you can replace the adsorbent
material. Looks like this.

https://www.google.com/search?q=replaceable+adsorbent+material+kits=1C1CHBF_deAT848AT848=ALeKk01N9aWj7Kw07AMDN1oGMnke1spe8A:1607102862749=lnms=isch=X=2ahUKEwj5r4Ha7LTtAhUR7eAKHbaZB6IQ_AUoAnoECAUQBA=1920=880#imgrc=o0UXuuQwacE4DM

Br, Georg.

On 2020-12-04 at 5:42 PM, Nukri Sanishvili wrote:

Hi John,
I think I know what might have happened:
Many of the MX beamlines at the APS use some sort of filler in the
containers where the LN2 is dumped. If I remember correctly, one of
the beamlines is using fine gravel for this purpose. Also, it is
required that before shipping, the dewars are emptied - i.e. don't
contain liquid. Now, imagine somebody dumping the liquid into the
gravel-filled container without removing the blue cap and without
holding the dewar in the air - i.e. the top of the dewar with the cap
on is slightly buried into the gravel. Upon straightening the dewar
up, the blue cap would scoop up a little bit of the gravel.
Distribution of the pebbles on your picture is also noteworthy. It
suggests the side where the pebbles are was the side dipped into the
gravel.

You might want to discuss this with your beamline host.
Best,
Nukri

On Fri, Dec 4, 2020 at 10:05 AM Tanner, John J. wrote:

When we opened our CX100 shipping dewar returned from APS via
FedEx this week, we observed what appears to be tiny rocks on the
rim below the foam neck core:

https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0

Has anyone seen this before? Is this perhaps the absorbent
material from the inside of the dewar?

Thanks,

Jack

John J. Tanner
Professor of Biochemistry and Chemistry
Associate Chair of Biochemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: tanne...@missouri.edu
https://cafnrfaculty.missouri.edu/tannerlab/

Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list
hosted by www.jiscmail.ac.uk, terms & conditions are available at
https://www.jiscmail.ac.uk/policyandsecurity/

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Robbie Joosten

Dear Dale,

Yes, good point. Let's stop bending over backwards to come up with faux PDB
compatibility and focus on making mmCIF better.

There are struct_conn records that describe the linkages. This is enough to
reconstruct the connectivity. There is an ongoing debate on how to capture the
restraints for such linkages. But at least this can in principle be captured in
mmCIF whereas this is pretty much undoable in PDB format.

Cheers,
Robbie

On 4 Dec 2020 18:01, Dale Tronrud wrote:

    This suggestion violates a basic principle of data base theory.  A
single data item cannot encode two pieces of information.  The whole
structure of CIF falls apart if this is done.

    Does the new PDB convention contain a CIF record of the link that
bridges between the protein chain and the, now separated, glycan chain?
  If not, I think this is the principal failing of their new scheme.

Dale Tronrud

On 12/4/2020 12:06 AM, Tristan Croll wrote:
> To go one step further: in large, heavily glycosylated multi-chain complexes
> the assignment of a random new chain ID to each glycan will lead to headaches
> for people building visualisations using existing viewers, because it loses
> the easy name-based association of glycan to parent protein chain. A
> suggestion: why not take full advantage of the mmCIF capability for
> multi-character chain IDs, and name them by appending characters to the
> parent chain ID? Using chain A as an example, perhaps the glycans could
> become Ag1, Ag2, etc.?
>
>> On 4 Dec 2020, at 07:48, Luca Jovine  wrote:
>>
>> CC: pdb-l
>>
>> Dear Zhijie and Robbie,
>>
>> I agree with both of you that the new carbohydrate chain assignment
>> convention that has been recently adopted by PDB introduces confusion, not
>> just for PDB-REDO but also - and especially - for end users.
>>
>> Could we kindly ask PDB to improve consistency by either assigning a
>> separate chain to all covalently attached carbohydrates (regardless of
>> whether one or more residues have been traced), or reverting to the old
>> system (where N-/O-glycans inherited the same chain ID of the protein to
>> which they are attached)? The current hybrid solution hardly seems optimal...
>>
>> Best regards,
>>
>> Luca
>>
>>> On 3 Dec 2020, at 20:17, Robbie Joosten  wrote:
>>>
>>> Dear Zhijie,
>>>
>>> In general I like the treatment of carbohydrates now as branched
>>> polymers. I didn't realise there was an exception. It makes sense for
>>> unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as
>>> these might change during model building or, in my case, carbohydrate
>>> rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.
>>>
>>> Cheers,
>>> Robbie
>>>
 -Original Message-
 From: CCP4 bulletin board  On Behalf Of Zhijie Li
 Sent: Thursday, December 3, 2020 19:52
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
 PDB -- N-glycans are now separate chains if more than one residue

 Hi all,

 I was confused when I saw mysterious new glycan chains emerging during
 PDB deposition and spent quite some time trying to find out what was
 wrong with my coordinates.  Then it occurred to me that a lot of recent
 structures also had tens of N-glycan chains.  Finally I realized that this
 phenomenon is a consequence of this PDB policy announced here in July.

 For future depositors who might also get puzzled, let's put it in a short
 sentence:  O- and N-glycans are now separate chains if they contain more
 than one residue; single residues remain with the protein chain.

 https://www.wwpdb.org/documentation/carbohydrate-remediation

 "Oligosaccharide molecules are classified as a new entity type, branched,
 assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF
 category introduced to define the type of branching
 (_pdbx_entity_branch.type) . "

 I found the differential treatment of single-residue glycans and multi-residue
 glycans not only a bit lacking in aesthetics but also misleading.  When a
 structure contains both NAG-NAG... and single NAG on N-glycosylation sites, it
 might be because of lack of density for building more residues, or because
 some of the glycosylation sites are now indeed single NAGs (endoH etc.)
 while some others are not cleaved due to accessibility issues.    Leaving NAGs
 on the protein chain while assigning NAG-NAG... to a new chain feels like
 suggesting something about their true oligomeric state.

 For example, for cryoEM structures, building only a single NAG at a
 site does not necessarily mean that the protein was treated by endoH. In
 fact all sites are extended to at least tri-Man in most cases. Then why
 keep some sites associated with the protein chain while others are kicked
 out?

Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread David Schuller

We use aquarium gravel for this purpose. It does not have the fine 
dust-like particles that normal construction gravel might.


On 12/4/20 11:42 AM, Nukri Sanishvili wrote:

Hi John,
I think I know what might have happened:
Many of the MX beamlines at the APS use some sort of filler in the 
containers where the LN2 is dumped. If I remember correctly, one of 
the beamlines is using fine gravel for this purpose. Also, it is 
required that before shipping, the dewars are emptied - i.e. don't 
contain liquid. Now, imagine somebody dumping the liquid into the 
gravel-filled container without removing the blue cap and without 
holding the dewar in the air - i.e. the top of the dewar with the cap 
on is slightly buried into the gravel. Upon straightening the dewar 
up, the blue cap would scoop up a little bit of the gravel. 
Distribution of the pebbles on your picture is also noteworthy. It 
suggests the side where the pebbles are was the side dipped into the 
gravel.

You might want to discuss this with your beamline host.
Best,
Nukri

On Fri, Dec 4, 2020 at 10:05 AM Tanner, John J. wrote:


When we opened our CX100 shipping dewar returned from APS via
FedEx this week, we observed what appears to be tiny rocks on the
rim below the foam neck core:

https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0


Has anyone seen this before? Is this perhaps the absorbent
material from the inside of the dewar?

Thanks,

Jack

John J. Tanner
Professor of Biochemistry and Chemistry
Associate Chair of Biochemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: tanne...@missouri.edu 
https://cafnrfaculty.missouri.edu/tannerlab/

Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A




--
===
All Things Serve the Beam
===
   David J. Schuller
   modern man in a post-modern world
   MacCHESS, Cornell University
   schul...@cornell.edu





[ccp4bb] AW: Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Hughes, Jonathan

hey jack,
our experience with these "logistics" people is that you'll probably need to 
buy a new dewar ;-)
cheers
jon

From: CCP4 bulletin board On Behalf Of Tanner, John J.
Sent: Friday, 4 December 2020 17:04
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Tiny rocks on my CX100 shipping dewar

When we opened our CX100 shipping dewar returned from APS via FedEx this week, 
we observed what appears to be tiny rocks on the rim below the foam neck core:

https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0

Has anyone seen this before? Is this perhaps the absorbent material from the 
inside of the dewar?

Thanks,

Jack

John J. Tanner
Professor of Biochemistry and Chemistry
Associate Chair of Biochemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: tanne...@missouri.edu
https://cafnrfaculty.missouri.edu/tannerlab/
Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A




Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Dale Tronrud



   This suggestion violates a basic principle of database theory.  A 
single data item cannot encode two pieces of information.  The whole 
structure of CIF falls apart if this is done.


   Does the new PDB convention contain a CIF record of the link that 
bridges between the protein chain and the now-separated glycan chain? 
If not, I think this is the principal failing of their new scheme.


Dale Tronrud

On 12/4/2020 12:06 AM, Tristan Croll wrote:

To go one step further: in large, heavily glycosylated multi-chain complexes 
the assignment of a random new chain ID to each glycan will lead to headaches 
for people building visualisations using existing viewers, because it loses the 
easy name-based association of glycan to parent protein chain. A suggestion: 
why not take full advantage of the mmCIF capability for multi-character chain 
IDs, and name them by appending characters to the parent chain ID? Using chain 
A as an example, perhaps the glycans could become Ag1, Ag2, etc.?


On 4 Dec 2020, at 07:48, Luca Jovine  wrote:

CC: pdb-l

Dear Zhijie and Robbie,

I agree with both of you that the new carbohydrate chain assignment convention 
that has been recently adopted by PDB introduces confusion, not just for 
PDB-REDO but also - and especially - for end users.

Could we kindly ask PDB to improve consistency by either assigning a separate 
chain to all covalently attached carbohydrates (regardless of whether one or 
more residues have been traced), or reverting to the old system (where 
N-/O-glycans inherited the same chain ID of the protein to which they are 
attached)? The current hybrid solution hardly seems optimal...

Best regards,

Luca


On 3 Dec 2020, at 20:17, Robbie Joosten  wrote:

Dear Zhijie,

In general, I like the treatment of carbohydrates now as branched polymers. I 
didn't realise there was an exception. It makes sense for unlinked carbohydrate 
ligands, but not for N- or O-glycosylation sites as these might change during 
model building or, in my case, carbohydrate rebuilding in PDB-REDO powered by 
Coot. Thanks for pointing this out.

Cheers,
Robbie


-Original Message-
From: CCP4 bulletin board  On Behalf Of Zhijie Li
Sent: Thursday, December 3, 2020 19:52
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
PDB -- N-glycans are now separate chains if more than one residue

Hi all,

I was confused when I saw mysterious new glycan chains emerging during
PDB deposition and spent quite some time trying to find out what was
wrong with my coordinates.  Then it occurred to me that a lot of recent
structures also had tens of N-glycan chains.  Finally I realized that this
phenomenon is a consequence of this PDB policy announced here in July.


For future depositors who might also get puzzled, let's put it in a short
sentence:  O- and N-glycans are now separate chains if they contain more
than one residue; single residues remain with the protein chain.
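
The rule in the sentence above can be written out as a small Python sketch. This is only an illustration of the rule as stated in this thread, not the wwPDB's actual procedure: the function name, the input layout and the way new chain IDs are picked (next unused capital letter) are all assumptions made for the example.

```python
from string import ascii_uppercase


def assign_glycan_chains(protein_chains, glycans):
    """Apply the rule summarised above to a list of glycans.

    protein_chains: chain IDs already used by proteins, e.g. ["A", "B"].
    glycans: list of (parent_chain, residue_names) tuples.
    Returns one chain ID per glycan, in input order.
    """
    used = set(protein_chains)
    # Hypothetical allocator: hand out the next unused capital letter.
    free_ids = (c for c in ascii_uppercase if c not in used)
    assigned = []
    for parent_chain, residues in glycans:
        if len(residues) > 1:
            # Multi-residue glycan: becomes its own (branched) chain.
            assigned.append(next(free_ids))
        else:
            # Single residue: stays on the protein chain it is attached to.
            assigned.append(parent_chain)
    return assigned


glycans = [
    ("A", ["NAG", "NAG", "BMA"]),  # multi-residue glycan on chain A
    ("A", ["NAG"]),                # single NAG on chain A
    ("B", ["NAG", "NAG"]),         # multi-residue glycan on chain B
]
print(assign_glycan_chains(["A", "B"], glycans))
```

In this toy input the single NAG keeps chain A while the two longer glycans are moved to new chains, which is exactly the mixed outcome questioned further down.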


https://www.wwpdb.org/documentation/carbohydrate-remediation

"Oligosaccharide molecules are classified as a new entity type, branched,
assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF
category introduced to define the type of branching
(_pdbx_entity_branch.type) . "





I found the differential treatment of single-residue glycans and multi-residue
glycans not only somewhat unaesthetic but also misleading.  When a structure
contains both NAG-NAG... and single NAG on N-glycosylation sites, it might
be because of a lack of density for building more residues, or because
some of the glycosylation sites are now indeed single NAGs (endoH etc.)
while some others are not cleaved due to accessibility issues.  Leaving NAGs
on the protein chain while assigning NAG-NAG... to a new chain feels like
suggesting something about their true oligomeric state.


For example, for cryoEM structures, building only a single NAG at a
site does not necessarily mean that the protein was treated with endoH. In
fact, all sites are extended to at least tri-Man in most cases. Then why
keep some sites associated with the protein chain while others are kicked
out?

Zhijie





From: CCP4 bulletin board  on behalf of John
Berrisford 
Sent: Thursday, July 9, 2020 4:39 AM
To: CCP4BB@JISCMAIL.AC.UK 
Subject: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB


Dear CCP4BB

PDB data will shortly incorporate a new data representation for
carbohydrates in PDB entries and reference data that improves the
Findability and Interoperability of these molecules in macromolecular
structures. In order

[ccp4bb] 2x Postdoc positions and Lab Manager position

2020-12-04 Thread Gordon Joyce

The Joyce Laboratory, Henry M. Jackson Foundation supporting the Emerging 
Infectious Diseases Branch, Walter Reed Army Institute of Research have 
openings for two postdoctoral positions and a Lab Manager position in 
Structural Biology of Emerging Infectious Diseases. One postdoctoral position 
involves the structural and biochemical characterization of small molecule 
inhibitors against Emerging Infectious pathogens including SARS-CoV-2, as part 
of an integrated drug development pipeline at WRAIR. One postdoctoral position 
involves the structural and biochemical characterization of viral glycoproteins 
from Emerging Infectious pathogens including SARS-CoV-2 to enable vaccine 
development, antibody therapeutics and small molecule inhibitors. The Lab 
manager position will oversee day-to-day activities in the lab including lab 
inventory, ordering and routine mammalian cell culture and transient 
transfection. These positions offer a unique opportunity to gain extensive 
structural biology experience focused on vaccine and therapeutic development 
for pandemic prevention. Interested candidates should apply at the sites 
indicated below, or with a cover letter and a current CV directly to Gordon 
Joyce gjo...@eidresearch.org

Postdoctoral Fellow: 
https://recruiting.ultipro.com/HEN1006HMJ/JobBoard/3a6861f3-0883-4466-8b7d-35e87635b33d/OpportunityDetail?opportunityId=badea1fa-6ab9-4be5-8ddb-57530404a901
Postdoctoral Fellow: 
https://recruiting.ultipro.com/HEN1006HMJ/JobBoard/3a6861f3-0883-4466-8b7d-35e87635b33d/OpportunityDetail?opportunityId=01d3c013-4b96-4f68-8d58-5ca1b631ba2e
Lab Manager: 
https://recruiting.ultipro.com/HEN1006HMJ/JobBoard/3a6861f3-0883-4466-8b7d-35e87635b33d/OpportunityDetail?opportunityId=a7a0f9d2-dd49-4009-96b6-df951fce9218

M. Gordon Joyce, Ph.D.
Chief, Structural Biology
The Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc. 
(HJF)
In support of the Emerging Infectious Diseases Branch (EIDB) and the US 
Military HIV Research Program (MHRP)
Walter Reed Army Institute of Research (WRAIR)
1N33, 503 Robert Grant Ave,
Silver Spring, MD 20910
Office: 301-319-7528
Cell: 240-672-4311
https://eidresearch.org/our-team/m-gordon-joyce-phd
gjo...@eidresearch.org


Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Diana Tomchick

Interesting observation. In our home X-ray lab, we constructed a liquid 
nitrogen dumping station from a large garbage can filled with sand to about 12 
inches. To keep the sand out of the dewars we topped the sand with a large wad 
of screen-door netting. All items can easily be purchased from a local home 
improvement store, and the screen-door netting is super cheap (plus it makes it 
easy to remove the stray bits of garbage that mistakenly find their way into 
the nitrogen dump/garbage can; you can put a sign that states “not for 
garbage” on a garbage can, but you can’t get 100% compliance, LOL).

Diana

**
Diana R. Tomchick
Professor
Departments of Biophysics and Biochemistry
UT Southwestern Medical Center
5323 Harry Hines Blvd.
Rm. ND10.214A
Dallas, TX 75390-8816
diana.tomch...@utsouthwestern.edu
(214) 645-6383 (phone)
(214) 645-6353 (fax)




On Dec 4, 2020, at 10:42 AM, Nukri Sanishvili <sannu...@gmail.com> wrote:



Hi John,
I think I know what might have happened:
Many of the MX beamlines at the APS use some sort of filler in the containers 
where the LN2 is dumped. If I remember correctly, one of the beamlines is using 
fine gravel for this purpose. Also, it is required that before shipping, the 
dewars are emptied - i.e. don't contain liquid. Now, imagine somebody dumping 
the liquid into the gravel-filled container without removing the blue cap and 
without holding the dewar in the air - i.e. the top of the dewar with the cap 
on is slightly buried in the gravel. Upon straightening the dewar up, the 
blue cap would scoop up a little bit of the gravel. The distribution of the pebbles 
in your picture is also noteworthy. It suggests that the side where the pebbles are 
was the side dipped into the gravel.
You might want to discuss this with your beamline host.
Best,
Nukri

On Fri, Dec 4, 2020 at 10:05 AM Tanner, John J. <tanne...@missouri.edu> wrote:
When we opened our CX100 shipping dewar returned from APS via FedEx this week, 
we observed what appears to be tiny rocks on the rim below the foam neck core:

https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0

Has anyone seen this before? Is this perhaps the absorbent material from the 
inside of the dewar?

Thanks,

Jack

John J. Tanner
Professor of Biochemistry and Chemistry
Associate Chair of Biochemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: tanne...@missouri.edu
https://cafnrfaculty.missouri.edu/tannerlab/
Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A





Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread James Holton

Run it for more cycles.  Doesn't take long to drift far enough for it to 
not find its way back when you turn x-ray back on.


This isn't just a problem in refmac, or phenix, or x-plor, or even MD 
programs like AMBER. The problem is that in order to make a structure 
fit into density you have to distort the geometry.  Turn the geometry 
weight up too high and your R factors blow up.  Turn the X-ray weight up 
too high and you get badly distorted geometry. I think we've all 
experienced that?
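
The weighting trade-off described above can be made concrete with a one-coordinate toy model in Python. It is a sketch under strong assumptions and is not how refmac, phenix or any other program works internally: the target is taken to be a weighted sum of two quadratic terms, one pulling a single bond length towards the value the data prefer and one pulling it towards the ideal geometry value. The function name and the numbers are made up for illustration, and the toy only shows the balance between the two weights, not the drift seen in a real refinement run without X-ray data.

```python
def refine_1d(x_data, x_ideal, w_xray, w_geom):
    """Minimise w_xray*(x - x_data)**2 + w_geom*(x - x_ideal)**2.

    Setting the derivative with respect to x to zero gives the
    weighted mean of the two preferred values.
    """
    if w_xray + w_geom == 0:
        raise ValueError("at least one weight must be non-zero")
    return (w_xray * x_data + w_geom * x_ideal) / (w_xray + w_geom)


# Hypothetical numbers: density suggests 1.60 A, the restraint says 1.53 A.
for w_xray, w_geom in [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]:
    x = refine_1d(1.60, 1.53, w_xray, w_geom)
    print(f"w_xray={w_xray} w_geom={w_geom} -> bond length {x:.3f}")
```

With all the weight on the X-ray term the bond follows the data and the geometry is distorted; with all the weight on geometry the bond is ideal and the fit to the data is lost.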


-James Holton
MAD Scientist

On 12/3/2020 8:29 PM, Jon Cooper wrote:
Hello James, that's really strange - I've used refmac et al., to do 
poor man's energy minimizations of models and they've generally come 
out fine, unless the restraints, etc, are wildly off-target. I wasn't 
playing with X-ray weights though, since there never was a dataset, of 
course.


Cheers, Jon.C.

Sent from ProtonMail mobile



 Original Message 
On 4 Dec 2020, 01:34, James Holton < jmhol...@lbl.gov> wrote:


It is a major leap forward for structure prediction for sure.  A
hearty congratulations to all those teams over all those years.

The part I don't understand is the accuracy.  If we understand
what holds molecules together so well, then why is it that when I
refine an X-ray structure and turn the X-ray weight term down to
zero ... the molecule blows up in my face?

-James Holton
MAD Scientist


On 12/3/2020 3:17 AM, Isabel Garcia-Saez wrote:

Dear all,

Just commenting that after the stunning performance of AlphaFold,
which uses AI from Google, maybe some of us could dedicate
ourselves to the noble art of gardening, baking, doing Chinese
Calligraphy, watching the clouds pass or all of these together
(just in case I have already prepared my subscription to Netflix).

https://www.nature.com/articles/d41586-020-03348-4


Well, I suppose that we still have the structures of complexes
(at the moment). I am wondering how the labs will have access to
this technology in the future (would it be for free coming from
the company DeepMind - Google?). It seems that they have already
published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des Martyrs
CS 10090
38044 Grenoble Cedex 9
France
Tel.: 00 33 (0) 457 42 86 15
e-mail: isabel.gar...@ibs.fr 
FAX: 00 33 (0) 476 50 18 90
http://www.ibs.fr/





Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Nukri Sanishvili

Hi John,
I think I know what might have happened:
Many of the MX beamlines at the APS use some sort of filler in the
containers where the LN2 is dumped. If I remember correctly, one of the
beamlines is using fine gravel for this purpose. Also, it is required that
before shipping, the dewars are emptied - i.e. don't contain liquid. Now,
imagine somebody dumping the liquid into the gravel-filled container without
removing the blue cap and without holding the dewar in the air - i.e. the
top of the dewar with the cap on is slightly buried in the gravel. Upon
straightening the dewar up, the blue cap would scoop up a little bit of the
gravel. The distribution of the pebbles in your picture is also noteworthy. It
suggests that the side where the pebbles are was the side dipped into the gravel.
You might want to discuss this with your beamline host.
Best,
Nukri

On Fri, Dec 4, 2020 at 10:05 AM Tanner, John J. 
wrote:

> When we opened our CX100 shipping dewar returned from APS via FedEx this
> week, we observed what appears to be tiny rocks on the rim below the foam
> neck core:
>
> https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0
>
> Has anyone seen this before? Is this perhaps the absorbent material from
> the inside of the dewar?
>
> Thanks,
>
> Jack
>
> John J. Tanner
> Professor of Biochemistry and Chemistry
> Associate Chair of Biochemistry
> Department of Biochemistry
> University of Missouri
> 117 Schweitzer Hall
> 503 S College Avenue
> Columbia, MO 65211
> Phone: 573-884-1280
> Email: tanne...@missouri.edu 
> https://cafnrfaculty.missouri.edu/tannerlab/
> Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
> Office: Schlundt Annex 203A
>
> --
>

Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Pedro Matias

It does not look like absorbent material and why should it be outside 
the dewar and not all over the pucks?


I concur with Jürgen - it does look like gravel.

At 16:18 on 04/12/2020, Jurgen Bosch wrote:
That looks like gravel from the driveway where the dewar was tossed 
around

Jürgen

On Dec 4, 2020, at 11:04 AM, Tanner, John J. wrote:


When we opened our CX100 shipping dewar returned from APS via FedEx 
this week, we observed what appears to be tiny rocks on the rim below 
the foam neck core:


https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0 



Has anyone seen this before? Is this perhaps the absorbent material 
from the inside of the dewar?


Thanks,

Jack

John J. Tanner
Professor of Biochemistry and Chemistry
Associate Chair of Biochemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: tanne...@missouri.edu 
https://cafnrfaculty.missouri.edu/tannerlab/ 


Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A







--

Industry and Medicine Applied Crystallography
Macromolecular Crystallography Unit
___
Phones : (351-21) 446-9100 Ext. 1669
 (351-21) 446-9669 (direct)
 Fax   : (351-21) 441-1277 or 443-3644

email : mat...@itqb.unl.pt

http://www.itqb.unl.pt/research/biological-chemistry/industry-and-medicine-applied-crystallography
http://www.itqb.unl.pt/labs/macromolecular-crystallography-unit

Mailing address :
Instituto de Tecnologia Quimica e Biologica António Xavier
Universidade Nova de Lisboa
Av. da República
2780-157 Oeiras
PORTUGAL

ITQB NOVA, a great choice for your PhD
https://youtu.be/de6j-aaTWNQ

Master Programme in Biochemistry for Health
https://youtu.be/UKstDCFjYI8





Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Fischmann, Thierry

We’ve had issues with the adsorbent material inside a dewar peeling out as 
small particles but they were white in color. Not sure it’s the same.

If the dewar is defective the stuff will ultimately find its way to and gum up 
the goniohead motors, so it’s important to address this.

Thierry

From: CCP4 bulletin board  On Behalf Of Jurgen Bosch
Sent: Friday, December 4, 2020 11:18 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

That looks like gravel from the driveway where the dewar was tossed around
Jürgen


On Dec 4, 2020, at 11:04 AM, Tanner, John J. <tanne...@missouri.edu> wrote:

When we opened our CX100 shipping dewar returned from APS via FedEx this week, 
we observed what appears to be tiny rocks on the rim below the foam neck core:

https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0

Has anyone seen this before? Is this perhaps the absorbent material from the 
inside of the dewar?

Thanks,

Jack

John J. Tanner
Professor of Biochemistry and Chemistry
Associate Chair of Biochemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: tanne...@missouri.edu
https://cafnrfaculty.missouri.edu/tannerlab/
Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A



Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Nukri Sanishvili

Hi John,
These look like real pebbles. If so, they would not work as an absorbent.
It looks more like a bad joke. Was the FedEx driver expecting some kind of
Thanksgiving gift from you? Wait till Christmas time then to see the real
surprises...
Best,
Nukri

On Fri, Dec 4, 2020 at 10:05 AM Tanner, John J. 
wrote:

> When we opened our CX100 shipping dewar returned from APS via FedEx this
> week, we observed what appears to be tiny rocks on the rim below the foam
> neck core:
>
> https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0
>
> Has anyone seen this before? Is this perhaps the absorbent material from
> the inside of the dewar?
>
> Thanks,
>
> Jack
>
> John J. Tanner
> Professor of Biochemistry and Chemistry
> Associate Chair of Biochemistry
> Department of Biochemistry
> University of Missouri
> 117 Schweitzer Hall
> 503 S College Avenue
> Columbia, MO 65211
> Phone: 573-884-1280
> Email: tanne...@missouri.edu 
> https://cafnrfaculty.missouri.edu/tannerlab/
> Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
> Office: Schlundt Annex 203A
>
> --
>

Re: [ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Jurgen Bosch

That looks like gravel from the driveway where the dewar was tossed around
Jürgen 

> On Dec 4, 2020, at 11:04 AM, Tanner, John J.  wrote:
> 
> When we opened our CX100 shipping dewar returned from APS via FedEx this 
> week, we observed what appears to be tiny rocks on the rim below the foam 
> neck core:
> 
> https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0 
> 
> 
> Has anyone seen this before? Is this perhaps the absorbent material from the 
> inside of the dewar? 
> 
> Thanks,
> 
> Jack
> 
> John J. Tanner
> Professor of Biochemistry and Chemistry
> Associate Chair of Biochemistry
> Department of Biochemistry
> University of Missouri
> 117 Schweitzer Hall
> 503 S College Avenue
> Columbia, MO 65211
> Phone: 573-884-1280
> Email: tanne...@missouri.edu 
> https://cafnrfaculty.missouri.edu/tannerlab/
> Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
> Office: Schlundt Annex 203A
> 

[ccp4bb] Tiny rocks on my CX100 shipping dewar

2020-12-04 Thread Tanner, John J.

When we opened our CX100 shipping dewar returned from APS via FedEx this week, 
we observed what appears to be tiny rocks on the rim below the foam neck core:

https://www.dropbox.com/s/ky09a1vbm9t0mrl/CX100withrocks.png?dl=0

Has anyone seen this before? Is this perhaps the absorbent material from the 
inside of the dewar?

Thanks,

Jack

John J. Tanner
Professor of Biochemistry and Chemistry
Associate Chair of Biochemistry
Department of Biochemistry
University of Missouri
117 Schweitzer Hall
503 S College Avenue
Columbia, MO 65211
Phone: 573-884-1280
Email: tanne...@missouri.edu
https://cafnrfaculty.missouri.edu/tannerlab/
Lab: Schlundt Annex rooms 3,6,9, 203B, 203C
Office: Schlundt Annex 203A




Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Marcin Wojdyr

On Fri, 4 Dec 2020 at 09:21, Luca Jovine  wrote:
>
> Yes Tristan, that would be even better - also because such an Ag1, Ag2,… 
> system could conveniently fall back on a single-character chain A, when 
> generating legacy PDB format files from the mmCIF ones.

mmCIF already has two sets of identifiers, including two separate
chain names (and both are present in all files from the wwPDB). The
one discussed here is auth_asym_id (author's chain ID - used in pdb
files), the other one is label_asym_id (let's call it label chain ID).
Having two sets of identifiers is obviously confusing (as can be seen
by just looking at any mmCIF file) and it'd help a lot if the two IDs
were consistent: the label chain ID could start with the corresponding
author's chain ID. For example, similarly to what Tristan proposed,
author's chain A could be split into label chains Amain, Ag1, Ag2,
etc.
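
The naming idea in the paragraph above can be sketched in a few lines of Python. This is purely an illustration of the proposal, not anything the wwPDB does: the function name is made up, and the suffixes "main", "g1", "g2" are simply the ones used as examples above. The sketch keeps an explicit label-to-author lookup table rather than relying on parsing the name, so the parent chain never has to be recovered from the string itself.

```python
def propose_label_chains(auth_chain, n_glycans):
    """Build label chain IDs that start with the author's chain ID.

    Returns a dict mapping each proposed label chain ID to the
    author's chain ID it belongs to.
    """
    labels = [f"{auth_chain}main"]
    labels += [f"{auth_chain}g{i}" for i in range(1, n_glycans + 1)]
    # Explicit lookup table: label chain ID -> author's chain ID.
    return {label: auth_chain for label in labels}


mapping = propose_label_chains("A", 2)
for label, auth in mapping.items():
    print(f"label_asym_id={label:6s} auth_asym_id={auth}")
```

For author's chain A with two glycans this lists Amain, Ag1 and Ag2, all pointing back to A.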

Marcin


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Michel Fodje

I think the results from AlphaFold2, although exciting and a breakthrough, are 
being exaggerated just a bit.  We know that all the information required for 
the 3D structure is in the sequence. The protein folding problem is simply how 
to go from a sequence to the 3D structure. This is not a complex problem in the 
sense that cells solve it deterministically.  Thus the problem is due to lack 
of understanding and not due to complexity.  AlphaFold and all the others 
trying to solve this problem are "cheating" in that they are not just using the 
sequence, they are using other sequences like it (multiple-sequence 
alignments), and they are using all the structural information contained in the 
PDB.  None of this information is used by the cells.  In short, unless 
AlphaFold2 now allows us to understand how exactly a single protein sequence 
produces a particular 3D structure, the protein folding problem is hardly 
solved in a theoretical sense. The only reason we know how well AlphaFold2 did 
is because the structures were solved and we could compare with the 
predictions, which means verification is lacking.

The protein folding problem will be solved when we understand how to go from a 
sequence to a structure, and can verify a given structure to be correct without 
experimental data. Even if AlphaFold2 got 99% of structures right, your next 
interesting target protein might be the 1%. How would you know?   Until then, 
what AlphaFold2 is telling us right now is that all (most) of the information 
present in the sequence that determines the 3D structure can be gleaned in bits 
and pieces scattered between homologous sequences, multiple-sequence 
alignments, and other protein 3D structures in the PDB.  Deep Learning allows a 
huge amount of data to be thrown at a problem and the back-propagation of the 
networks then allows careful fine-tuning of weights which determine how 
relevant different pieces of information are to the prediction.  The networks 
used here are humongous and a detailed look at the weights (if at all feasible) 
may point us in the right direction.



[ccp4bb] cryo-EM postdoc position in Klaholz lab CBI / IGBMC, Strasbourg

2020-12-04 Thread Bruno KLAHOLZ



Dear all,



for those still interested in experimentally determined structures (maybe to 
complement predictions? ;-)

we have a post-doctoral researcher position to join the Klaholz group at the 
Centre for Integrative Biology, IGBMC, Illkirch, France:

https://instruct-eric.eu/jobs/postdoctoral-researcher/

http://www.igbmc.fr/igbmc/recrutement/job_offer/524/

https://www.linkedin.com/posts/bruno-klaholz-35893a125_we-have-an-open-position-for-a-post-doctoral-activity-6733065823482646528-tvhP



We are studying the structure and function of large nucleoprotein complexes 
(ribosome complexes, chromatin complexes, viruses) through integrated 
structural biology, including biochemistry, crystallography, high resolution 
cryo-EM, tomography, super-resolution imaging & software developments.

We are looking for a scientist with relevant expertise in structural biology, 
ideally with strong expertise in cryo electron microscopy or tomography. For 
further details please see below.



Please contact me if you are interested.



With best regards,



Bruno Klaholz







For ongoing projects and full publication list of the team see 
http://igbmc.fr/Klaholz



Our group is located at the Centre for Integrative Biology (CBI) at IGBMC, 
Illkirch/Strasbourg, France, which comprises cutting-edge cryo-EM facilities:

The CBI provides a state-of-the-art scientific and technological environment in 
integrated structural biology to address the structure and function of 
biological systems, notably on gene expression, from the atomic, molecular to 
the tissue scales. The CBI http://www.igbmc.fr/grandesstructures/cbi/ hosts the 
French and European Infrastructures for Integrated Structural Biology, FRISBI 
http://frisbi.eu/, Instruct-ERIC https://www.structuralbiology.eu/ and 
iNext-Discovery https://inext-discovery.eu/network/inext-d/home which comprises 
advanced electron microscopy facilities equipped with cutting-edge 
instrumentation such as cryo electron microscopes including a Titan Krios, a 
Glacios (under installation), a Polara, cryo Focused Ion Beam Scanning Electron 
Microscope (cryo-FIB/SEM) and super-resolution fluorescence microscopy, see 
http://frisbi.eu/centers/instruct-center-france-1-igbmc/cryo-electron-microscopy/.
The Titan Krios microscope is equipped with a K3 camera, a Falcon 3 camera, a GIF 
energy filter and a phase plate. In addition, the EM facility has a suite of 
associated equipment for sample preparation and dedicated computing resources 
for image processing and 3D reconstruction by single particle cryo-EM and cryo 
electron tomography.



Highly-motivated candidates with a recent PhD or MD/PhD in structural biology 
and specifically in cryo-EM or cryo-ET are encouraged to apply.

Candidates should have strong expertise in protein biochemistry, in cryo-ET 
and/or in cryo-EM, image processing & atomic model refinement with various 
programs; computational skills are a strong plus. The candidate should have 
excellent writing and communication skills in English, be able to work 
independently and have a strong team spirit.

We are studying the structure and function of protein complexes (ribosome 
complexes, chromatin complexes, viruses) through integrated structural biology, 
including biochemical and biophysical methods, crystallography, high resolution 
cryo-EM, tomography, super-resolution imaging & software developments. Several 
projects are available for outstanding young scientists who would like to 
combine these approaches.

Applications should be sent via email to 
klah...@igbmc.fr including CV, list of publications, 
names of 3 referees and motivation letter. Deadline for application: Jan 8th 
2021.


###
Bruno P. Klaholz
Centre for Integrative Biology
Department of Integrated Structural Biology
Institute of Genetics and of Molecular and Cellular Biology
IGBMC - UMR 7104 - U 1258
1, rue Laurent Fries
BP 10142
67404 ILLKIRCH CEDEX
FRANCE
Tel. from abroad: 0033.369.48.52.78
Tel. inside France: 03.69.48.52.78
websites:
http://www.igbmc.fr/research/department/3/team/36/
http://www.igbmc.fr/grandesstructures/cbi
http://frisbi.eu
http://instruct-eric.eu
https://inext-discovery.eu




To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1




This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/

Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Nave, Colin (DLSLtd,RAL,LSCI)

The subject line for Isabel's email is very good.

I do have a question (more a request) for the more computer scientist oriented 
people. I think it is relevant for where this technology will be going. It 
comes from trying to understand whether problems addressed by Alpha are NP, NP 
hard, NP complete etc. My understanding is that the previous successes of Alpha 
were for complete information games such as Chess and Go. Both the rules and 
the present position were available to both sides. The folding problem might be 
in a different category. It would be nice if someone could explain the 
difference (if any) between Go and the protein folding problem perhaps using 
the NP type categories.

Colin



From: CCP4 bulletin board  On Behalf Of Isabel 
Garcia-Saez
Sent: 03 December 2020 11:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Dear all,

Just commenting that, after the stunning performance of AlphaFold, which uses AI 
from Google, maybe some of us could dedicate ourselves to the noble arts of 
gardening, baking, doing Chinese calligraphy, watching the clouds pass, or 
everything together (just in case I have already prepared my subscription to 
Netflix).

https://www.nature.com/articles/d41586-020-03348-4

Well, I suppose that we still have the structures of complexes (at the moment). 
I am wondering how the labs will have access to this technology in the future 
(would it be for free coming from the company DeepMind - Google?). It seems 
that they have already published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez  PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des Martyrs
CS 10090
38044 Grenoble Cedex 9
France
Tel.: 00 33 (0) 457 42 86 15
e-mail: isabel.gar...@ibs.fr
FAX: 00 33 (0) 476 50 18 90
http://www.ibs.fr/





-- 
This e-mail and any attachments may contain confidential, copyright and or 
privileged material, and are for the use of the intended addressee only. If you 
are not the intended addressee or an authorised recipient of the addressee 
please notify us of receipt by returning the e-mail and do not use, copy, 
retain, distribute or disclose the information in or attached to the e-mail.
Any opinions expressed within this e-mail are those of the individual and not 
necessarily of Diamond Light Source Ltd. 
Diamond Light Source Ltd. cannot guarantee that this e-mail or any attachments 
are free from viruses and we cannot accept liability for any damage which you 
may sustain as a result of software viruses which may be transmitted in or with 
the message.
Diamond Light Source Limited (company no. 4375679). Registered in England and 
Wales with its registered office at Diamond House, Harwell Science and 
Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom





Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Boaz Shaanan

Hi all,
Can someone point me to cases of glycoprotein structures in the PDB for which the old (traditional?) system of naming N- or O-linked chains was found inadequate? Thanks.
Stay safe,
Boaz

Boaz Shaanan, Ph.D.
Department of Life Sciences
Ben Gurion University of the Negev
Beer Sheva
Israel


Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Randy John Read

Hi Frank,

Yes, until CASP7 (back in 2006), I used to like saying that there are many more 
ways to make a homology model worse than the starting template than to make it 
better, and that homology modelling programs were very good at finding them!  
After seeing that at least some models (e.g. from Rosetta) were actually better 
in CASP7, I had to stop saying that!

It’s not just anecdotal.  Even in CASP7, most models were still worse for MR 
than the best template someone could have found.

Randy

> On 4 Dec 2020, at 12:22, Frank von Delft  wrote:
> 
> I guess that also means that AlphaFold has learnt the crystal-structure-ness 
> that older homology methods never achieved - which is why (anecdotally?) a 
> "better" homology model tended to give worse MR performance than the "worse" 
> template?
> 
> (Or something like that, I'm parroting what I remember people (maybe Randy?) 
> saying long ago about the problems with homology models in MR.)
> 
> 

-
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research    Tel: +44 1223 336500
The Keith Peters Building                   Fax: +44 1223 336827
Hills Road                                  E-mail: rj...@cam.ac.uk
Cambridge CB2 0XY, U.K.                     www-structmed.cimr.cam.ac.uk





[ccp4bb] Workshop on electron crystallography Dec 17th 2020

2020-12-04 Thread Tim Gruene

Dear all,

on behalf of the organisers, I would like to draw your attention to a
workshop on electron crystallography, taking place December 17th
2020. Its purpose is to update developers and users in the field
on the current state of research.

You will find more information and a link to the registration at the
workshop URL
https://www.uni-ulm.de/en/einrichtungen/hrem/christmas-elec-crystall/christmas-elec-crystall/

The workshop program will soon be available.

Best regards,

TG on behalf of Tatiana Gorelik, Mauro Gemmi, Lukas Palatinus, and
Stephanie Kodjikian

-- 
--
Tim Gruene
Head of the Centre for X-ray Structure Analysis
Faculty of Chemistry
University of Vienna

Phone: +43-1-4277-70202

GPG Key ID = A46BEE1A




Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Zhijie Li

Hi Tristan and all,

I totally agree that randomly naming the glycan chains is going to give users 
headaches.  But using more than 2 letters would make the entry incompatible 
with the PDB format, which I hope will remain a download option for as long 
as possible.  
How about restricting the length of chain IDs to two characters? Then they can 
be fitted in columns 21 and 22 of PDB files. For example, all N-glycan chains 
associated with chain A can be named 0A...9A, AA, BA, CA ... zA. That’s a 
maximum of 62 N-glycans on a chain. If we also use the printable symbols, that’s 
another 20-30 chains.  
On top of this, if the residues are still each given a residue number shift and 
the “major” letter of the chain ID (the one that indicates the protein chain) 
remains in column 22, the new system might even look the same as the 
old system to software that strictly sticks to the PDB format 
specification, as the only change is an extra letter in column 21, which was 
unused.  I think some of the tools are already treating both columns 21 and 22 
as chain IDs. Even the PDB deposition system seems to start giving out 2-letter 
IDs when it hits ‘z’.  So this is natural to them and will cause them no 
problem at all.  Does the new carbohydrate standard allow starting a glycan 
chain at residue 5021? 

Zhijie 
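
The two-character naming idea above can be sketched in a few lines of Python. This is only an illustration of the proposal, not anything the PDB has adopted; the function names `glycan_chain_ids` and `set_chain_columns` are made up for this example, and the prefix alphabet (digits, then upper case, then lower case) is an assumption about the intended ordering.

```python
import string

# 10 digits + 26 upper-case + 26 lower-case letters = 62 possible prefixes.
PREFIXES = string.digits + string.ascii_uppercase + string.ascii_lowercase


def glycan_chain_ids(parent_chain):
    """Return the 62 two-character IDs for glycans attached to parent_chain.

    The parent chain letter stays in the second position, so it lands in
    column 22 of a PDB record, where the one-letter chain ID normally sits.
    """
    if len(parent_chain) != 1:
        raise ValueError("parent chain ID must be a single character")
    return [prefix + parent_chain for prefix in PREFIXES]


def set_chain_columns(record, chain_id):
    """Write a two-character chain ID into columns 21-22 of an ATOM/HETATM line."""
    if len(chain_id) != 2:
        raise ValueError("chain ID must be exactly two characters")
    line = record.ljust(80)
    # PDB columns are 1-based, so columns 21-22 are string indexes 20 and 21.
    return (line[:20] + chain_id + line[22:]).rstrip()


if __name__ == "__main__":
    ids = glycan_chain_ids("A")
    print(len(ids), ids[0], ids[10], ids[-1])
    print(set_chain_columns("HETATM 5000  C1  NAG A5021", ids[0]))
```

A program that reads only column 22 would still see chain A for such a record, which is the backward-compatibility argument made in the message.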

> On Dec 4, 2020, at 3:06 AM, Tristan Croll  wrote:
> 
> To go one step further: in large, heavily glycosylated multi-chain complexes 
> the assignment of a random new chain ID to each glycan will lead to headaches 
> for people building visualisations using existing viewers, because it loses 
> the easy name-based association of glycan to parent protein chain. A 
> suggestion: why not take full advantage of the mmCIF capability for 
> multi-character chain IDs, and name them by appending characters to the 
> parent chain ID? Using chain A as an example, perhaps the glycans could 
> become Ag1, Ag2, etc.?
> 
>> On 4 Dec 2020, at 07:48, Luca Jovine  wrote:
>> 
>> CC: pdb-l
>> 
>> Dear Zhijie and Robbie,
>> 
>> I agree with both of you that the new carbohydrate chain assignment 
>> convention that has been recently adopted by PDB introduces confusion, not 
>> just for PDB-REDO but also - and especially - for end users.
>> 
>> Could we kindly ask PDB to improve consistency by either assigning a 
>> separate chain to all covalently attached carbohydrates (regardless of 
>> whether one or more residues have been traced), or reverting to the old 
>> system (where N-/O-glycans inherited the same chain ID of the protein to 
>> which they are attached)? The current hybrid solution hardly seems optimal...
>> 
>> Best regards,
>> 
>> Luca
>> 
>>> On 3 Dec 2020, at 20:17, Robbie Joosten wrote:
>>> 
>>> Dear Zhijie,
>>> 
>>> In general I like the treatment of carbohydrates now as branched 
>>> polymers. I didn't realise there was an exception. It makes sense for 
>>> unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as 
>>> these might change during model building or, in my case, carbohydrate 
>>> rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.
>>> 
>>> Cheers,
>>> Robbie
>>> 
>>>> -----Original Message-----
>>>> From: CCP4 bulletin board On Behalf Of Zhijie Li
>>>> Sent: Thursday, December 3, 2020 19:52
>>>> To: CCP4BB@JISCMAIL.AC.UK
>>>> Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
>>>> PDB -- N-glycans are now separate chains if more than one residue
>>>>
>>>> Hi all,
>>>>
>>>> I was confused when I saw mysterious new glycan chains emerging during
>>>> PDB deposition and spent quite some time trying to find out what was
>>>> wrong with my coordinates.  Then it occurred to me that a lot of recent
>>>> structures also had tens of N-glycan chains.  Finally I realized that this
>>>> phenomenon is a consequence of this PDB policy announced here in July.
>>>>
>>>> For future depositors who might also get puzzled, let's put it in a short
>>>> sentence:  O- and N-glycans are now separate chains if they contain more
>>>> than one residue; single residues remain with the protein chain.
>>>>
>>>> https://www.wwpdb.org/documentation/carbohydrate-remediation
>>>>
>>>> "Oligosaccharide molecules are classified as a new entity type, branched,
>>>> assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF
>>>> category introduced to define the type of branching
>>>> (_pdbx_entity_branch.type) . "
>>>>
>>>> I found the differential treatment of single-residue glycans and
>>>> multi-residue
>>>> glycans not only bit

Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Frank von Delft

I guess that also means that AlphaFold has learnt the 
crystal-structure-ness that older homology methods never achieved - 
which is why (anecdotally?) a "better" homology model tended to give 
worse MR performance than the "worse" template?


(Or something like that, I'm parroting what I remember people (maybe 
Randy?) saying long ago about the problems with homology models in MR.)




Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Adam Simpkin

I thought I might be able to add a little to this conversation as I performed 
some MR runs as part of the CASP14 High Accuracy analysis. There were 30 
targets with reflection data. Of these, AlphaFold2 models could be used to 
directly solve 24 structures after converting 
RMS error predictions to simulated B-factors to aid the MR 
(10.1002/prot.25800). 
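
As a rough illustration of the conversion mentioned above, here is a minimal Python sketch. It assumes the usual isotropic relation between a B-factor and a mean-square positional error, B = (8π²/3)·Δ², with Δ the per-atom RMS error in Å. Whether the CASP14 analysis used exactly this form is not stated in this message, and the function name and the optional cap `b_max` are hypothetical.

```python
import math


def rms_error_to_b_factor(rms_error, b_max=None):
    """Convert a predicted RMS coordinate error (in Angstroms) to a B-factor.

    Assumes the isotropic relation B = (8 * pi^2 / 3) * rms_error^2.
    b_max is an optional, purely illustrative upper limit.
    """
    if rms_error < 0:
        raise ValueError("RMS error cannot be negative")
    b = (8.0 * math.pi ** 2 / 3.0) * rms_error ** 2
    if b_max is not None:
        b = min(b, b_max)
    return b


if __name__ == "__main__":
    for err in (0.5, 1.0, 2.0):
        print(f"{err:.1f} A -> B = {rms_error_to_b_factor(err):.1f} A^2")
```

Under this relation a 1 Å predicted error maps to a B-factor of about 26 Å², so poorly predicted regions are down-weighted in the MR search rather than removed.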

Some of the models did contain sufficient local errors to impede MR. However, 
we were able to obtain a further 3 solutions by using AMPLE to truncate the 
models based on the per-residue RMS error predictions provided. In fact, a 
moderate truncation in AMPLE improved the quality of the MR solution in ~78% 
of the cases that succeeded, by removing the few incorrectly modelled loops 
(typically at lattice interfaces). 
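
The truncation idea can be sketched as follows. This is not AMPLE's actual algorithm, only an illustration of the principle: rank residues by predicted error and keep the best fraction. The function name `truncate_by_error` and the `keep_fraction` parameter are assumptions made for this example.

```python
def truncate_by_error(residue_errors, keep_fraction):
    """Return residue numbers to keep, dropping the least reliable ones.

    residue_errors maps residue number -> predicted RMS error (Angstroms).
    keep_fraction is the share of residues to retain, in the range (0, 1].
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(len(residue_errors) * keep_fraction))
    # Rank residues from lowest to highest predicted error.
    ranked = sorted(residue_errors, key=residue_errors.get)
    # Return the survivors in sequence order.
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    predicted = {1: 0.5, 2: 0.7, 3: 4.0, 4: 0.6, 5: 3.5}
    print(truncate_by_error(predicted, keep_fraction=0.6))
```

In the toy input, residues 3 and 5 stand in for a badly predicted loop and are the ones removed.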

A final thing to note was that the 3 structures that didn’t work still provided 
high quality model predictions (GDT_TS of 69, 84 & 83). These targets all 
contained multiple chains in the ASU and one was fairly low resolution (>3 
Angstroms). Overall, though, I think the take-home message is clear: these 
models are really good, and when the method or something similar is more 
publicly available I think it will definitely simplify MR for troublesome targets. 

Best wishes, 

Adam




Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Ioannis Vakonakis

In defence of experimental science, may I also suggest that the models 
AlphaFold2 and other predictors worked on were derived from sequences for which 
they knew a well-defined structure is possible. Much of the work we do is 
exactly on taking unwieldy sequences with disordered elements, multiple domains 
etc, and optimising these into constructs that produce well-diffracting 
crystals. Though NMR and CryoEM do not need crystals, construct optimization is 
often just as important for them too. Perhaps AlphaFold3 will be able to 
predict structures from gene sequences without that implicit optimisation, but 
that's not the case just yet.

Best

John

From: CCP4 bulletin board  on behalf of Jan Löwe 

Sent: 04 December 2020 10:33
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

AlphaFold2 and its performance are indeed a true breakthrough of potentially 
seismic proportions - and will accelerate much of what we are trying to achieve.

But I feel it is also important to point out that the method principally relies 
on experimental data: lots of protein sequences that through evolution have 
evolved to generate very similar folds and structures. Machine learning is used 
to transform the resulting (experiment-derived) co-evolution matrices (can be 
thought of as images for ML training) into distances between atoms. The 
distances are then used to generate models (minimiser in AlphaFold 1, 
something ML in 2, as far as I could figure out). Note that contacts between 
proteins can also generate evolutionary couplings (as already used by the 
pioneers of the field, such as Debora Marks and Chris Sander), so something 
like AlphaFold will be able to make inroads there as well and that application 
might well have a greater impact.
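
To make the "co-evolution matrix as an image" picture concrete, here is a small Python sketch that builds such a matrix from a toy alignment. It uses mutual information between alignment columns, one classic coupling measure, purely as a stand-in; it is not a description of what AlphaFold computes. The function names and the toy sequences are invented for the example.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment."""
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


def coevolution_matrix(msa):
    """L x L matrix of pairwise column couplings for equal-length sequences."""
    length = len(msa[0])
    return [[column_mutual_information(msa, i, j) for j in range(length)]
            for i in range(length)]


if __name__ == "__main__":
    # Toy alignment: columns 0 and 2 change together, column 1 varies freely.
    toy_msa = ["AKL", "AEL", "GKV", "GEV"]
    for row in coevolution_matrix(toy_msa):
        print(" ".join(f"{value:.2f}" for value in row))
```

In the toy alignment, the strongly coupled pair (columns 0 and 2) stands out as a bright off-diagonal pixel, while the uncoupled pairs score zero.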

This leaves three important goals remaining if I may add: 1) a way to do this 
for any single sequence, without alignment, or indeed any folding polymer (for 
when we will be able to make coded polymers that are not made from amino 
acids). 2) a method to obtain accurate numbers, such as binding energies and 
rates. 3) the inverse: predicting sequences that have a particular 
function/fold.

I would like to suggest that all three will require looking at how we can use 
more of the physics of the problem, but might well involve more machine 
learning.

Jan

On 04/12/2020 00:49, Paul Adams wrote:

I agree completely Tom. Having been recently involved in some efforts to 
identify interesting compounds against SARS-CoV-2, I can say that the current 
AI/ML methods for docking/predicting small molecule binding have very very low 
success rates (I’m being generous here), even when you are working with the 
experimental protein structure! Maybe this is the next frontier for the 
prediction methods (after they’ve solved the protein/protein complex problem of 
course), but it seems there is a long way to go.

Given that many structures are solved to look at their interaction with other 
proteins or small molecules I think that experimental structural biology is 
here to stay for a while - past Tom’s retirement even! However, will these 
fairly accurate protein predictions make experimental phasing a thing of the 
past?


On Dec 3, 2020, at 4:16 PM, Peat, Tom (Manufacturing, Parkville) 
<tom.p...@csiro.au> wrote:

Although they can now get the fold correct, I don't think they have all the 
side chain placement so perfect as to be able to predict the fold and how a 
compound or another protein binds, so we can still do complexes. I don't know 
what others end up spending their time doing, but much of my work has been 
trying to fit ligands into density, which may take another few years of 
algorithm development, which is fine for me as I can retire!
cheers, tom

Tom Peat, PhD
Proteins Group
Biomedical Program, CSIRO
343 Royal Parade
Parkville, VIC, 3052
+613 9662 7304
+614 57 539 419
tom.p...@csiro.au


From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
on behalf of Jon Cooper 
<488a26d62010-dmarc-requ...@jiscmail.ac.uk>
Sent: Friday, December 4, 2020 9:55 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Thanks all, very interesting, so our methods are just needed to identify the 
crystallization impurities, when the trays have been thrown away ;-

Cheers, Jon.C.

Sent from ProtonMail mobile



 Original Message 
On 3 Dec 2020, 22:31, Anastassis Perrakis <a.perra...@nki.nl> wrote:

AlphaFold - or similar ideas that will surface up sooner or later - will beyond 
doubt have major impact. The accuracy it demonstrated compared to others is 
excellent.

“Our” target (T1068) that was not solvable by MR with the homologous search 
structure

Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Randy John Read

Hi James,

One really interesting (and to me surprising) aspect of how well AlphaFold2 
does is that it does really well without actually understanding chemistry and 
physics.  (John Jumper from DeepMind talked about choices of deep learning 
model types and how they affect the “inductive bias” that allows things like 
chemistry to be learned indirectly, but it’s not programmed in.)  The best 
example is the fantastic models they made of monomers from trimeric proteins.  
The monomers can be assembled into trimers that look very much like the real 
thing, but they really modelled just monomers — somehow the machine learning 
algorithm implicitly knows about the trimers from the existence of distant 
homologues in the PDB.  As Joana said, AlphaFold2 did very well even on targets 
with no identifiable homologues, but I suspect that targets like these trimers 
will still require the presence of homologues.  Anyway, the modelled monomer 
makes no sense as a monomer, and any sensible force field would much prefer 
something else that buries more surface area!

Following up on some other comments, AlphaFold2 is a pretty complete 
reinvention compared to the original AlphaFold from 2 years ago.  AlphaFold 
followed a two-step process, where probability distributions for distances were 
learned in the first step (similar to the co-evolution constraints inferred by 
algorithms like the ones from Marks & Sander), and then those distance 
distributions were used in a minimisation step to fold the protein.  If I 
recall, the first step used a convolutional deep neural network.  In 
AlphaFold2, it’s all done in one end-to-end process going from sequence (and 
multiple sequence alignments) to xyz coordinates.  The model type has changed 
to something called an attention module, which John Jumper said acts to 
implicitly and iteratively learn a graph representing atoms and their 
interactions.

Once this algorithm or others like it are available to the community, it is 
indeed going to change the focus of what we do as structural biologists, but 
importantly it’s going to allow us to do more and to focus more on the 
biological questions than the technology.  (How and when it will become 
available is not entirely clear: John Jumper mentioned “internal discussions” 
in DeepMind about how to share with the community, and said there would be more 
news on that next year.)

Best wishes,

Randy Read

> On 4 Dec 2020, at 01:34, James Holton  wrote:
> 
> It is a major leap forward for structure prediction for sure.  A hearty 
> congratulations to all those teams over all those years.
> 
> The part I don't understand is the accuracy.  If we understand what holds 
> molecules together so well, then why is it that when I refine an X-ray 
> structure and turn the X-ray weight term down to zero ... the molecule blows 
> up in my face?
> 
> -James Holton
> MAD Scientist
> 
> 
> On 12/3/2020 3:17 AM, Isabel Garcia-Saez wrote:
>> Dear all,
>> 
>> Just commenting that after the stunning performance of AlphaFold, which uses 
>> AI from Google, maybe some of us could dedicate ourselves to the noble arts 
>> of gardening, baking, Chinese calligraphy, watching the clouds pass, or 
>> all of them together (just in case, I have already prepared my subscription 
>> to Netflix).
>> 
>> https://www.nature.com/articles/d41586-020-03348-4
>> 
>> Well, I suppose that we still have the structures of complexes (at the 
>> moment). I am wondering how the labs will have access to this technology in 
>> the future (would it be for free coming from the company DeepMind - 
>> Google?). It seems that they have already published some code. Well, 
>> exciting times. 
>> 
>> Cheers,
>> 
>> Isabel
>> 
>> 
>> Isabel Garcia-Saez   PhD
>> Institut de Biologie Structurale
>> Viral Infection and Cancer Group (VIC)-Cell Division Team
>> 71, Avenue des Martyrs
>> CS 10090
>> 38044 Grenoble Cedex 9
>> France
>> Tel.: 00 33 (0) 457 42 86 15
>> e-mail: isabel.gar...@ibs.fr
>> FAX: 00 33 (0) 476 50 18 90
>> http://www.ibs.fr/
>> 
>> 
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1
>> 
> 
> 
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1
> 

-
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research
The Keith Peters Building
Hills Road
Cambridge CB2 0XY, U.K.
Tel: +44 1223 336500
Fax: +44 1223 336827
E-mail: rj...@cam.ac.uk
www-structmed.cimr.cam.ac.uk

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list

Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Jan Löwe

AlphaFold2 and its performance are indeed a true breakthrough of 
potentially seismic proportion - and will accelerate much of what we are 
trying to achieve.

But I feel it is also important to point out that the method principally 
relies on experimental data: lots of protein sequences that through 
evolution have evolved to generate very similar folds and structures. 
Machine learning is used to transform the resulting (experiment-derived) 
co-evolution matrices (can be thought of as images for ML training) into 
distances between atoms. The distances are then used to generate models 
(minimiser in Alpha Fold 1, something ML in 2, as far as I could figure 
out). Note that contacts between proteins can also generate evolutionary 
couplings (as already used by the pioneers of the field, such as Debora 
Marks and Chris Sander), so something like AlphaFold will be able to 
make inroads there as well and that application might well have a 
greater impact.

This leaves three important goals remaining if I may add: 1) a way to do 
this for any single sequence, without alignment, or indeed any folding 
polymer (for when we will be able to make coded polymers that are not 
made from amino acids). 2) a method to obtain accurate numbers, such as 
binding energies and rates. 3) the inverse: predicting sequences that 
have a particular function/fold.

I would like to suggest that all three will require looking at how we 
can use more of the physics of the problem, but might well involve more 
machine learning.

Jan

On 04/12/2020 00:49, Paul Adams wrote:

I agree completely Tom. Having been recently involved in some efforts 
to identify interesting compounds against SARS-CoV-2, I can say that 
the current AI/ML methods for docking/predicting small molecule 
binding have very very low success rates (I’m being generous here), 
even when you are working with the experimental protein structure! 
Maybe this is the next frontier for the prediction methods (after 
they’ve solved the protein/protein complex problem of course), but it 
seems there is a long way to go.

Given that many structures are solved to look at their interaction 
with other proteins or small molecules I think that experimental 
structural biology is here to stay for a while - past Tom’s retirement 
even! However, will these fairly accurate protein predictions make 
experimental phasing a thing of the past?

On Dec 3, 2020, at 4:16 PM, Peat, Tom (Manufacturing, Parkville) 
<tom.p...@csiro.au> wrote:

Although they can now get the fold correct, I don't think they have 
all the side chain placement so perfect as to be able to predict the 
fold and how a compound or another protein binds, so we can still do 
complexes. I don't know what others end up spending their time doing, 
but much of my work has been trying to fit ligands into density, 
which may take another few years of algorithm development, which is 
fine for me as I can retire!

cheers, tom

Tom Peat, PhD
Proteins Group
Biomedical Program, CSIRO
343 Royal Parade
Parkville, VIC, 3052
+613 9662 7304
+614 57 539 419
tom.p...@csiro.au 

From: CCP4 bulletin board on behalf of Jon Cooper 
<488a26d62010-dmarc-requ...@jiscmail.ac.uk>
Sent: Friday, December 4, 2020 9:55 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Thanks all, very interesting, so our methods are just needed to 
identify the crystallization impurities, when the trays have been 
thrown away ;-

Cheers, Jon.C.

Sent from ProtonMail mobile

 Original Message 
On 3 Dec 2020, 22:31, Anastassis Perrakis wrote:

AlphaFold - or similar ideas that will surface up sooner or later
- will beyond doubt have major impact. The accuracy it
demonstrated compared to others is excellent.

“Our” target (T1068) that was not solvable by MR with the
homologous search structure or a homology model (it was phased
with Arcimboldo, rather easily), is easily solvable with
the AlphaFold model as a search model. In PHASER I get Rotation
Z-score 17.9, translation Z-score 26.0, using defaults.

imho what remains to be seen is:

a. how and when will a prediction server be available?
b. even if training needs computing that will surely be inaccessible
to most, will there be code that can be installed on a
“reasonable” number of GPUs and how fast will it be?
c. how do model quality metrics (that do not compare with the
known answer) correlate with the expected RMSD? AlphaFold, no
matter how impressive, still gets things wrong.
d. will the AI efforts now gear to ligand (fragment?) prediction
with similarly impressive performance?

Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread THOMPSON Andrew

Just thinking out loud and following up Tom's post  - Could prediction be a 
guide to sample preparation for detailed binding studies?
Andy

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Luca Pellegrini 
[lp...@cam.ac.uk]
Sent: Friday, 4 December 2020 10:15
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Exciting times, indeed. I haven’t looked through the results myself, but it 
does look like an extraordinary advance. I wonder though how this advance 
correlates with ‘understanding’ how proteins fold. Can these outstanding 
results be distilled into a set of improved principles for how proteins fold? To 
put it another way, should we invite the AlphaFold programmers to deliver the 
conclusive lecture on the theory of protein folding? Or maybe we should invite 
the algorithm to present its results…

Best wishes,
Luca

Luca Pellegrini, PhD
Department of Biochemistry
University of Cambridge
Cambridge CB2 1GA
UK



On 3 Dec 2020, at 11:17, Isabel Garcia-Saez 
<isabel.gar...@ibs.fr> wrote:

Dear all,

Just commenting that after the stunning performance of AlphaFold, which uses AI 
from Google, maybe some of us could dedicate ourselves to the noble arts of 
gardening, baking, Chinese calligraphy, watching the clouds pass, or 
all of them together (just in case, I have already prepared my subscription to 
Netflix).

https://www.nature.com/articles/d41586-020-03348-4

Well, I suppose that we still have the structures of complexes (at the moment). 
I am wondering how the labs will have access to this technology in the future 
(would it be for free coming from the company DeepMind - Google?). It seems 
that they have already published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des Martyrs
CS 10090
38044 Grenoble Cedex 9
France
Tel.: 00 33 (0) 457 42 86 15
e-mail: isabel.gar...@ibs.fr
FAX: 00 33 (0) 476 50 18 90
http://www.ibs.fr/





Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Luca Pellegrini

Exciting times, indeed. I haven’t looked through the results myself, but it 
does look like an extraordinary advance. I wonder though how this advance 
correlates with ‘understanding’ how proteins fold. Can these outstanding 
results be distilled into a set of improved principles for how proteins fold? To 
put it another way, should we invite the AlphaFold programmers to deliver the 
conclusive lecture on the theory of protein folding? Or maybe we should invite 
the algorithm to present its results…

Best wishes,
Luca

Luca Pellegrini, PhD
Department of Biochemistry
University of Cambridge
Cambridge CB2 1GA
UK



> On 3 Dec 2020, at 11:17, Isabel Garcia-Saez  wrote:
> 
> Dear all,
> 
> Just commenting that after the stunning performance of AlphaFold, which uses AI 
> from Google, maybe some of us could dedicate ourselves to the noble arts of 
> gardening, baking, Chinese calligraphy, watching the clouds pass, or 
> all of them together (just in case, I have already prepared my subscription to 
> Netflix).
> 
> https://www.nature.com/articles/d41586-020-03348-4 
> 
> 
> Well, I suppose that we still have the structures of complexes (at the 
> moment). I am wondering how the labs will have access to this technology in 
> the future (would it be for free coming from the company DeepMind - Google?). 
> It seems that they have already published some code. Well, exciting times. 
> 
> Cheers,
> 
> Isabel
> 
> 
> Isabel Garcia-Saez PhD
> Institut de Biologie Structurale
> Viral Infection and Cancer Group (VIC)-Cell Division Team
> 71, Avenue des Martyrs
> CS 10090
> 38044 Grenoble Cedex 9
> France
> Tel.: 00 33 (0) 457 42 86 15
> e-mail: isabel.gar...@ibs.fr 
> FAX: 00 33 (0) 476 50 18 90
> http://www.ibs.fr/
> 
> 

[ccp4bb] Postdoctoral position at CNIO (Madrid, SPAIN)

2020-12-04 Thread Munoz.Ines

Dear all,
We are looking for a highly motivated structural biologist to join the 
Crystallography and Protein Engineering Unit at the Spanish National Cancer 
Research Centre (CNIO) as a postdoctoral fellow, on a 20-month contract.
The work will focus on developing therapeutic molecules against 
SARS-CoV-2. The successful applicant will use interdisciplinary approaches 
including structural biology techniques (X-ray crystallography and cryoEM), 
molecular biology and biochemistry.
Feel free to contact me for informal inquiries.
Best,
Inés.

Inés G. Muñoz
Head of Crystallography & Protein Engineering Unit
Structural Biology Programme
imu...@cnio.es
+34 91 732 8000 (ext 3020)

Melchor Fernández Almagro, 3
28029 Madrid, Spain
www.cnio.es





**LEGAL NOTICE**: This email and any attached files may contain protected 
information for the sole use of its intended recipient or addressee. Anyone 
other than the intended recipient or addressee is strictly prohibited from 
distributing, reproducing or transmitting the email and its contents in any 
way. If you receive this email in error, please notify the sender and delete 
the message.
Pursuant to the provisions of EU Regulation 2016/679 regarding the protection 
of personal data, any personal information you provide through this email will 
be registered by the CNIO Foundation in order to deal with content of this 
email. Your personal data must be processed in order to be able to deal with 
the content and purpose of this message. Your personal details will not be 
passed on to anyone else unless you authorise us to do so or we are required to 
do so by law. You may exercise your rights regarding access, rectification, 
suppression, limitation of processing, portability and opposition by writing to 
the following address: c/Melchor Fernandez Almagro 3, 28029 (Madrid). You may 
contact the Data Protection Delegate (Delegado de Protección de Datos) at: 
delegado_l...@cnio.es. If you require further information about the processing 
of your personal data, go to the following link on our webpage: 
https://www.cnio.es/es/privacidad/index.asp




Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Anastassis Perrakis

Dear Boaz,

The Arcimboldo model gives rotation Z-score 8.1, translation Z-score 13.8.

Not sure this matters, as it lacks a few loops that even good old ARP/wARP can 
fill in within ten minutes ;-)

A.

On Dec 4, 2020, at 0:40, Boaz Shaanan 
<bshaa...@bgu.ac.il> wrote:

Just curious, how does the result of the Phaser run  with the Alphafold model 
compare with a Phaser run using the Arcimboldo phased model as a probe?
Boaz

Boaz Shaanan, Ph.D.
Department of Life Sciences
Ben Gurion University of the Negev
Beer Sheva
Israel

On Dec 4, 2020 00:32, Anastassis Perrakis 
<a.perra...@nki.nl> wrote:
AlphaFold - or similar ideas that will surface up sooner or later - will beyond 
doubt have major impact. The accuracy it demonstrated compared to others is 
excellent.

“Our” target (T1068) that was not solvable by MR with the homologous search 
structure or a homology model (it was phased with Arcimboldo, rather easily), 
is easily solvable with the AlphaFold model as a search model. In PHASER I get 
Rotation Z-score 17.9, translation Z-score 26.0, using defaults.


imho what remains to be seen is:

a. how and when will a prediction server be available?
b. even if training needs computing that will surely be inaccessible to most, will 
there be code that can be installed on a “reasonable” number of GPUs and how 
fast will it be?
c. how do model quality metrics (that do not compare with the known answer) 
correlate with the expected RMSD? AlphaFold, no matter how impressive, still 
gets things wrong.
d. will the AI efforts now gear to ligand (fragment?) prediction with similarly 
impressive performance?

Exciting times.

A.




On 3 Dec 2020, at 21:55, Jon Cooper 
<488a26d62010-dmarc-requ...@jiscmail.ac.uk>
 wrote:

Hello. A quick look suggests that a lot of the test structures were solved by 
phaser or molrep, suggesting it is a very welcome improvement on homology 
modelling. It would be interesting to know how it performs with structures of 
new or uncertain fold, if there are any left these days. Without resorting to 
jokes about artificial intelligence, I couldn't make that out from the CASP14 
website or the many excellent articles that have appeared. Best wishes, Jon 
Cooper.


Sent from ProtonMail mobile



 Original Message 
On 3 Dec 2020, 11:17, Isabel Garcia-Saez <isabel.gar...@ibs.fr> wrote:

Dear all,

Just commenting that after the stunning performance of AlphaFold, which uses AI 
from Google, maybe some of us could dedicate ourselves to the noble arts of 
gardening, baking, Chinese calligraphy, watching the clouds pass, or 
all of them together (just in case, I have already prepared my subscription to 
Netflix).

https://www.nature.com/articles/d41586-020-03348-4

Well, I suppose that we still have the structures of complexes (at the moment). 
I am wondering how the labs will have access to this technology in the future 
(would it be for free coming from the company DeepMind - Google?). It seems 
that they have already published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des Martyrs
CS 10090
38044 Grenoble Cedex 9
France
Tel.: 00 33 (0) 457 42 86 15
e-mail: isabel.gar...@ibs.fr
FAX: 00 33 (0) 476 50 18 90
http://www.ibs.fr/





Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Luca Jovine

Yes Tristan, that would be even better - also because such an Ag1, Ag2,… system 
could conveniently fall back on a single-character chain A, when generating 
legacy PDB format files from the mmCIF ones.
Exactly for the reason that you pointed out, personally I do not understand the 
logic of assigning a different chain ID to covalently attached glycans (or any 
other covalent post-translational modification, for that matter). Aren't 
residue IDs enough to make it clear that such residues are not amino acids?
-Luca
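
For illustration only, the naming scheme discussed above could be sketched in Python as follows. Nothing here is wwPDB policy or an existing library API: the function names, the "g" infix and the fallback rule are all assumptions taken from the Ag1, Ag2 suggestion in this thread.

```python
import re

# Hypothetical pattern: parent chain ID, then "g", then a glycan counter (e.g. "Ag1").
_GLYCAN_ID = re.compile(r"^(?P<parent>.+?)g(?P<index>\d+)$")


def glycan_chain_id(parent_chain: str, index: int) -> str:
    """Build a glycan chain ID such as 'Ag1' from the parent protein chain 'A'."""
    if index < 1:
        raise ValueError("glycan index must be >= 1")
    return f"{parent_chain}g{index}"


def legacy_chain_id(chain_id: str) -> str:
    """Map 'Ag1' back to 'A' for legacy PDB output; other IDs pass through unchanged."""
    match = _GLYCAN_ID.match(chain_id)
    return match.group("parent") if match else chain_id


if __name__ == "__main__":
    for cid in ("A", glycan_chain_id("A", 1), glycan_chain_id("A", 2)):
        print(cid, "->", legacy_chain_id(cid))
```

The fallback only yields a valid legacy PDB chain ID when the parent ID is itself a single character; multi-character parents would still need separate handling.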

> On 4 Dec 2020, at 09:06, Tristan Croll  wrote:
> 
> To go one step further: in large, heavily glycosylated multi-chain complexes 
> the assignment of a random new chain ID to each glycan will lead to headaches 
> for people building visualisations using existing viewers, because it loses 
> the easy name-based association of glycan to parent protein chain. A 
> suggestion: why not take full advantage of the mmCIF capability for 
> multi-character chain IDs, and name them by appending characters to the 
> parent chain ID? Using chain A as an example, perhaps the glycans could 
> become Ag1, Ag2, etc.?
> 
>> On 4 Dec 2020, at 07:48, Luca Jovine  wrote:
>> 
>> CC: pdb-l
>> 
>> Dear Zhijie and Robbie,
>> 
>> I agree with both of you that the new carbohydrate chain assignment 
>> convention that has been recently adopted by PDB introduces confusion, not 
>> just for PDB-REDO but also - and especially - for end users.
>> 
>> Could we kindly ask PDB to improve consistency by either assigning a 
>> separate chain to all covalently attached carbohydrates (regardless of 
>> whether one or more residues have been traced), or reverting to the old 
>> system (where N-/O-glycans inherited the same chain ID of the protein to 
>> which they are attached)? The current hybrid solution hardly seems optimal...
>> 
>> Best regards,
>> 
>> Luca
>> 
>>> On 3 Dec 2020, at 20:17, Robbie Joosten  wrote:
>>> 
>>> Dear Zhijie,
>>> 
>>> In general I like the treatment of carbohydrates now as branched 
>>> polymers. I didn't realise there was an exception. It makes sense for 
>>> unlinked carbohydrate ligands, but not for N- or O-glycosylation sites as 
>>> these might change during model building or, in my case, carbohydrate 
>>> rebuilding in PDB-REDO powered by Coot. Thanks for pointing this out.
>>> 
>>> Cheers,
>>> Robbie
>>> 
 -Original Message-
 From: CCP4 bulletin board  On Behalf Of Zhijie Li
 Sent: Thursday, December 3, 2020 19:52
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
 PDB -- N-glycans are now separate chains if more than one residue

 Hi all,

 I was confused when I saw mysterious new glycan chains emerging during
 PDB deposition and spent quite some time trying to find out what was
 wrong with my coordinates.  Then it occurred to me that a lot of recent
 structures also had tens of N-glycan chains.  Finally I realized that this
 phenomenon is a consequence of this PDB policy announced here in July.

 For future depositors who might also get puzzled, let's put it in a short
 sentence:  O- and N-glycans are now separate chains if they contain more
 than one residue; single residues remain with the protein chain.

 https://www.wwpdb.org/documentation/carbohydrate-remediation

 "Oligosaccharide molecules are classified as a new entity type, branched,
 assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF
 category introduced to define the type of branching
 (_pdbx_entity_branch.type) . "

 I found the differential treatment of single-residue glycans and multi-residue
 glycans not only a bit lacking in aesthetics but also misleading.  When a structure
 contains both NAG-NAG... and single NAG on N-glycosylation sites, it might
 be because of lack of density for building more residues, or because
 some of the glycosylation sites are now indeed single NAGs (endoH etc.)
 while some others are not cleaved due to accessibility issues.  Leaving NAGs
 on the protein chain while assigning NAG-NAG... to a new chain feels like
 suggesting something about their true oligomeric state.

 For example, for cryoEM structures, building only a single NAG at a
 site does not necessarily mean that the protein was treated with endoH. In
 fact all sites are extended to at least tri-Man in most cases. Then why
 keep some sites associated with the protein chain while

Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Anastassis Perrakis

btw, if anyone has any leverage with the people making the CASP14 pages, having 
info on acronyms (e.g. GDT) accessible by a simple “mouse over” instead of 
re-directing to the explanation page would be handy.

In any case, the CASP web pages in general leave quite a bit to be desired for 
the average user - they seem more like an "API for humans” and less concerned 
about modern design principles, to put it mildly.

Tassos

On Dec 4, 2020, at 8:53, Joana Pereira 
<joana.pere...@tuebingen.mpg.de> wrote:

Hi everybody,

As one of the people playing with the CASP14 data before all the news came out, I 
can answer some of the questions raised in this thread.

- "Does anyone know how AlphaFold performs on sequences with little 
conservation?"
One of the things we looked at was how the accuracy of the models depended 
on the Neff (number of effective sequences, which relates to how deep alignments are 
for that sequence and, thus, to the number of homologs and the conservation of 
the sequence). What we could see is that, basically, in CASP14 the accuracy no 
longer depends on Neff, and that (near-)singleton sequences could be modeled 
with pretty good accuracy.
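
As a rough illustration of what Neff measures, here is a minimal Python sketch of one common definition: each aligned sequence is down-weighted by the number of sequences (itself included) that are at least 80% identical to it, and the weights are summed. The exact definition used by the CASP14 assessors is not stated in this thread, so the 80% threshold and the function names are assumptions.

```python
from typing import Sequence


def fractional_identity(a: str, b: str) -> float:
    """Fraction of aligned columns at which two equal-length alignment rows agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("aligned sequences must not be empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def neff(alignment: Sequence[str], threshold: float = 0.8) -> float:
    """Sum of per-sequence weights 1/n_i, where n_i counts rows >= threshold identical to row i."""
    total = 0.0
    for seq in alignment:
        # Every row is identical to itself, so the count is always at least 1.
        neighbours = sum(
            fractional_identity(seq, other) >= threshold for other in alignment
        )
        total += 1.0 / neighbours
    return total


if __name__ == "__main__":
    # Two identical rows count as one effective sequence; the third is unrelated.
    print(neff(["ACDEFGHIKL", "ACDEFGHIKL", "MNPQRSTVWY"]))
```

A singleton sequence with no detectable homologs gives a Neff close to 1, which is the regime the answer above refers to.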

- "It would be interesting to know how it performs with structures of new or 
uncertain fold."
It does pretty well! Similarly to the Neff relationship, we also see a 
basically flat line at a GDT of 70-80 at any level of target difficulty. Of 
course the accuracy is slightly higher for easy targets (those for which there 
are templates in the PDB), but to have a GDT of around 70 in Free-Modelling, 
hard targets, is quite impressive.

- "I don't think they have all the side chain placement so perfect as to be 
able to predict the fold and how a compound or another protein binds"
Yep, sidechains remain the most poorly modeled parts. Still, those modeled by 
AlphaFold were the closest to the "reality" of the target...

- "I'm curious how well AlphaFold would do on an Intrinsically Disordered 
Protein (IDP)"
Oh yes, that is a super good point and I have been thinking about it too. Maybe 
one should start throwing some IDPs into CASP too :) There's the CAID 
experiment but, in its current state, it would not be possible to test 
AlphaFold there.
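
Since GDT figures are quoted above without a definition, here is a minimal sketch of how a GDT_TS-style score is formed: the average, over the cutoffs 1, 2, 4 and 8 Å, of the percentage of C-alpha atoms lying within that cutoff of the reference structure. This is a simplification and not the CASP implementation: it assumes the model has already been superposed on the target and that the per-residue distances are given, whereas the real procedure searches for the best superposition at each cutoff. The function name and input format are illustrative only.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS from per-residue C-alpha distances (in Angstrom).

    Assumption: `distances` come from a single, fixed superposition of
    model on target; the real GDT optimises the superposition per cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    # Average the fractions and express the result on a 0-100 scale.
    return 100.0 * sum(fractions) / len(cutoffs)


# Four residues deviating by 0.5, 1.5, 3.0 and 9.0 Angstrom.
print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```

On this scale, a score of 70-80 means that, averaged over the four cutoffs, roughly three quarters of the residues fall within the cutoff distance of their reference positions.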

Best wishes
Joana

---
Dr. Joana Pereira
Postdoctoral Researcher
Department of Protein Evolution

Max Planck Institute for Developmental Biology
Max-Planck-Ring 5
72076 Tübingen
GERMANY


On 03.12.20 23:46, Reza Khayat wrote:
Does anyone know how AlphaFold performs on sequences with little conservation? 
Virus and phage proteins are like this. Their structures are homologous, but 
sequence identity can be less than 10%.

Reza

Reza Khayat, PhD
Associate Professor
City College of New York
Department of Chemistry and Biochemistry
New York, NY 10031

From: CCP4 bulletin board  
on behalf of Anastassis Perrakis 
Sent: Thursday, December 3, 2020 5:31 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [EXTERNAL] Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

AlphaFold - or similar ideas that will surface sooner or later - will beyond 
doubt have a major impact. The accuracy it demonstrated compared to others is 
excellent.

“Our” target (T1068), which was not solvable by MR with the homologous search 
structure or a homology model (it was phased with ARCIMBOLDO, rather easily), 
is easily solvable with the AlphaFold model as a search model. In PHASER I get 
Rotation Z-score 17.9, translation Z-score 26.0, using defaults.


imho what remains to be seen is:

a. how and when will a prediction server be available?
b. even if training needs computing that will surely be inaccessible to most, 
will there be code that can be installed on a “reasonable” number of GPUs and 
how fast will it be?
c. how do model quality metrics (that do not compare with the known answer) 
correlate with the expected RMSD? AlphaFold, no matter how impressive, still 
gets things wrong.
d. will the AI efforts now gear to ligand (fragment?) prediction with similarly 
impressive performance?

Exciting times.

A.




On 3 Dec 2020, at 21:55, Jon Cooper 
<488a26d62010-dmarc-requ...@jiscmail.ac.uk>
 wrote:

Hello. A quick look suggests that a lot of the test structures were solved by 
phaser or molrep, suggesting it is a very welcome improvement on homology 
modelling. It would be interesting to know how it performs with structures of 
new or uncertain fold, if there are any left these days. Without resorting to 
jokes about artificial intelligence, I couldn't make that out from the CASP14 
website or the many excellent articles that have appeared. Best wishes, Jon 
Cooper.


Sent from ProtonMail mobile



 Original Message 
On 3 Dec 2020, 11:17, Isabel Garcia-Saez <isabel.gar...@ibs.fr> wrote:

Dear all,

Just commenting that after the stunning performance of

Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB -- N-glycans are now separate chains if more than one residue

2020-12-04 Thread Tristan Croll

To go one step further: in large, heavily glycosylated multi-chain complexes 
the assignment of a random new chain ID to each glycan will lead to headaches 
for people building visualisations using existing viewers, because it loses the 
easy name-based association of glycan to parent protein chain. A suggestion: 
why not take full advantage of the mmCIF capability for multi-character chain 
IDs, and name them by appending characters to the parent chain ID? Using chain 
A as an example, perhaps the glycans could become Ag1, Ag2, etc.?
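
A minimal sketch of the naming scheme suggested above, in case it helps make the idea concrete. It assumes that each glycan has already been associated with its parent protein chain (for example from the covalent link records); the input format, the function name and the "g" infix are illustrative only and not part of any PDB or mmCIF convention.

```python
from collections import defaultdict


def name_glycan_chains(glycan_parents):
    """Derive glycan chain IDs from the parent protein chain ID.

    glycan_parents: list of (glycan_key, parent_chain_id) pairs, given in
    the order in which the glycans should be numbered (hypothetical format).
    Returns a dict mapping glycan_key -> new chain ID such as "Ag1".
    """
    counters = defaultdict(int)  # glycans seen so far on each parent chain
    names = {}
    for glycan_key, parent in glycan_parents:
        counters[parent] += 1
        names[glycan_key] = f"{parent}g{counters[parent]}"
    return names


# Two glycans on chain A and one on chain B (made-up site labels).
example = [("A:N61", "A"), ("B:N61", "B"), ("A:N122", "A")]
print(name_glycan_chains(example))
```

A real implementation would also need to check that a generated ID does not collide with an existing chain ID in the entry, which this sketch does not do.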

> On 4 Dec 2020, at 07:48, Luca Jovine  wrote:
> 
> CC: pdb-l
> 
> Dear Zhijie and Robbie,
> 
> I agree with both of you that the new carbohydrate chain assignment 
> convention that has been recently adopted by PDB introduces confusion, not 
> just for PDB-REDO but also - and especially - for end users.
> 
> Could we kindly ask PDB to improve consistency by either assigning a separate 
> chain to all covalently attached carbohydrates (regardless of whether one or 
> more residues have been traced), or reverting to the old system (where 
> N-/O-glycans inherited the same chain ID of the protein to which they are 
> attached)? The current hybrid solution hardly seems optimal...
> 
> Best regards,
> 
> Luca
> 
>> On 3 Dec 2020, at 20:17, Robbie Joosten  wrote:
>> 
>> Dear Zhijie,
>> 
>> In general I like the treatment of carbohydrates now as branched polymers. 
>> I didn't realise there was an exception. It makes sense for unlinked 
>> carbohydrate ligands, but not for N- or O-glycosylation sites as these might 
>> change during model building or, in my case, carbohydrate rebuilding in 
>> PDB-REDO powered by Coot. Thanks for pointing this out.
>> 
>> Cheers,
>> Robbie
>> 
>>> -Original Message-
>>> From: CCP4 bulletin board  On Behalf Of Zhijie Li
>>> Sent: Thursday, December 3, 2020 19:52
>>> To: CCP4BB@JISCMAIL.AC.UK
>>> Subject: Re: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the
>>> PDB -- N-glycans are now separate chains if more than one residue
>>> 
>>> Hi all,
>>> 
>>> I was confused when I saw mysterious new glycan chains emerging during
>>> PDB deposition and spent quite some time trying to find out what was
>>> wrong with my coordinates.  Then it occurred to me that a lot of recent
>>> structures also had tens of N-glycan chains.  Finally I realized that this
>>> phenomenon is a consequence of this PDB policy announced here in July.
>>> 
>>> 
>>> For future depositors who might also get puzzled, let's put it in a short
>>> sentence:  O- and N-glycans are now separate chains if they contain more
>>> than one residue; single residues remain with the protein chain.
>>> 
>>> 
>>> https://www.wwpdb.org/documentation/carbohydrate-remediation
>>> 
>>> "Oligosaccharide molecules are classified as a new entity type, branched,
>>> assigned a unique chain ID (_atom_site.auth_asym_id) and a new mmCIF
>>> category introduced to define the type of branching
>>> (_pdbx_entity_branch.type) . "
>>> 
>>> 
>>> 
>>> 
>>> 
>>> I found the differential treatment of single-residue glycans and
>>> multi-residue glycans not only somewhat unaesthetic but also misleading.
>>> When a structure contains both NAG-NAG... and single NAG on
>>> N-glycosylation sites, it might be because of a lack of density for
>>> building more residues, or because some of the glycosylation sites are
>>> now indeed single NAGs (endoH etc.) while some others are not cleaved due
>>> to accessibility issues. Leaving NAGs on the protein chain while assigning
>>> NAG-NAG... to a new chain feels like suggesting something about their
>>> true oligomeric state.
>>> 
>>> 
>>> For example, for cryoEM structures, building only a single NAG at a
>>> site does not necessarily mean that the protein was treated by endoH. In
>>> fact all sites are extended to at least tri-Man in most cases. Then why
>>> keep some sites associated with the protein chain while others are kicked
>>> out?
>>> 
>>> Zhijie
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: CCP4 bulletin board  on behalf of John
>>> Berrisford 
>>> Sent: Thursday, July 9, 2020 4:39 AM
>>> To: CCP4BB@JISCMAIL.AC.UK 
>>> Subject: [ccp4bb] Coming July 29: Improved Carbohydrate Data at the PDB
>>> 
>>> 
>>> Dear CCP4BB
>>> 
>>> PDB data will shortly incorporate a new data representation for
>>> carbohydrates in PDB entries and reference data that improves the
>>> Findability and Interoperability of these molecules in macromolecular
>>> structures. In order to remediate and improve the representation of
>>> carbohydrates across the archive, the wwPDB has:
>>>
