Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-08-03 Thread Jon Cooper
Thank you very much for all the helpful replies. I have summarised the 
discussion here:

https://justpaste.it/9cfl9

Best wishes, Jon Cooper.
jon.b.coo...@protonmail.com

Sent with [Proton Mail](https://proton.me/) secure email.



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-07-29 Thread David J. Schuller
I don't know how much effort I would put into that, given how easy nucleic acid 
sequencing has become.

===
 All Things Serve the Beam
 ===
 David J. Schuller
 modern man in a post-modern world
 MacCHESS, Cornell University
 schul...@cornell.edu

From: CCP4 bulletin board  on behalf of Robbie Joosten 

Sent: Friday, July 29, 2022 9:56 AM
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] Checking X-ray sequence (no more protein).

Hi Jon,

There are placeholders for ASP/ASN and GLU/GLN ambiguities: ASX and GLX 
respectively. You can just use those. AFAICT there is no such thing for VAL/THR 
ambiguities. You could look for the most likely candidate based on multiple 
sequence alignments. Refinement of both alternatives can give hints in 
B-factors and, if you are lucky, in difference density. But if hydrogen bonding 
gives no hints, then the residues are also not in a place where the identity 
really matters. You can give your best guess with a CAVEAT record or use the 
name UNK to indicate that you do not know what the residue is. You would lose 
the knowledge that it is either VAL or THR in that case.

Cheers,
Robbie

> -Original Message-
> From: CCP4 bulletin board  On Behalf Of Jon
> Cooper
> Sent: Friday, July 29, 2022 12:14
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: [ccp4bb] Checking X-ray sequence (no more protein).
>
> Hello, I am looking for suggestions of ways to check a 1.7 Angstrom X-ray
> sequence for a protein where it is impractical to do experimental sequencing,
> protein or DNA. The structure refines to publishable R/R-free and the main
> ambiguities seem to be Thr/Val, Asp/Asn and Glu/Gln where alternative H-
> bonding networks are possible. Running alpha-fold seems an interesting
> option? Any suggestions much appreciated.
>
> Cheers, Jon.C.
>
> Sent from ProtonMail mobile
>


Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-07-29 Thread Robbie Joosten
Hi Jon,

There are placeholders for ASP/ASN and GLU/GLN ambiguities: ASX and GLX 
respectively. You can just use those. AFAICT there is no such thing for VAL/THR 
ambiguities. You could look for the most likely candidate based on multiple 
sequence alignments. Refinement of both alternatives can give hints in 
B-factors and, if you are lucky, in difference density. But if hydrogen bonding 
gives no hints, then the residues are also not in a place where the identity 
really matters. You can give your best guess with a CAVEAT record or use the 
name UNK to indicate that you do not know what the residue is. You would lose 
the knowledge that it is either VAL or THR in that case.

Cheers,
Robbie
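
As a rough illustration of the placeholder naming described above, the following Python sketch maps a set of candidate residue identities to ASX, GLX or UNK. The helper name placeholder_for and the lookup table are assumptions made for this example only; they are not part of any CCP4 or PDB tool.

```python
# Residue pairs that density alone often cannot separate, with their placeholders.
# ASX and GLX are the codes named in the message; VAL/THR has no such code,
# so this sketch falls back to UNK for that pair (and for any other mixture).
AMBIGUITY_CODES = {
    frozenset({"ASP", "ASN"}): "ASX",
    frozenset({"GLU", "GLN"}): "GLX",
}


def placeholder_for(candidates):
    """Return a residue name for a set of candidate identities (hypothetical helper)."""
    names = frozenset(name.upper() for name in candidates)
    if len(names) == 1:
        # Only one candidate: the residue is not ambiguous.
        return next(iter(names))
    return AMBIGUITY_CODES.get(names, "UNK")


if __name__ == "__main__":
    for cands in (["ASP", "ASN"], ["GLU", "GLN"], ["VAL", "THR"], ["SER"]):
        print(cands, "->", placeholder_for(cands))
```

Falling back to UNK loses the information that the residue is either VAL or THR, which is the trade-off noted in the message.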

> -Original Message-
> From: CCP4 bulletin board  On Behalf Of Jon
> Cooper
> Sent: Friday, July 29, 2022 12:14
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: [ccp4bb] Checking X-ray sequence (no more protein).
> 
> Hello, I am looking for suggestions of ways to check a 1.7 Angstrom X-ray
> sequence for a protein where it is impractical to do experimental sequencing,
> protein or DNA. The structure refines to publishable R/R-free and the main
> ambiguities seem to be Thr/Val, Asp/Asn and Glu/Gln where alternative H-
> bonding networks are possible. Running alpha-fold seems an interesting
> option? Any suggestions much appreciated.
> 
> Cheers, Jon.C.
> 
> Sent from ProtonMail mobile


Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-07-29 Thread Jon Cooper
Thank you, yes, threading-style tools to assess the likelihood of having a 
given amino acid at a certain position in the fold would be a good approach. I 
have tried one but it wasn't hugely informative, in my hands anyway. All 
suggestions are very welcome, but big-database science is a bit outside my 
skill set. Of course, sequence conservation in the family helps a lot with the 
assignment, but there are bigger ambiguities in less conserved surface regions. 
For example, a disordered Lys can refine well as a Ser, Ala or even Gly. Nearby 
conserved acidic groups might suggest the presence of a basic amino acid that 
could salt-bridge with them, but why then would it be so disordered? 
Tangles we weave...
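
One simple way to use family conservation for an ambiguous position is to count how often each candidate occurs in the matching column of a multiple sequence alignment. The sketch below assumes the column has already been extracted as a string of one-letter codes; the function name rank_candidates and the example column are hypothetical, and this is only a frequency count, not a threading or profile method.

```python
from collections import Counter


def rank_candidates(column, candidates):
    """Rank candidate one-letter residues by their frequency in an alignment column."""
    # Ignore gap characters when counting residues in the column.
    counts = Counter(res.upper() for res in column if res not in "-.")
    total = sum(counts.values())
    ranked = []
    for cand in candidates:
        freq = counts[cand.upper()] / total if total else 0.0
        ranked.append((cand.upper(), freq))
    # Most frequent candidate first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical column: the residue at one position in eight homologues.
    column = "TTVTT-ST"
    for residue, freq in rank_candidates(column, ["V", "T"]):
        print(f"{residue}: {freq:.2f}")
```

For poorly conserved surface positions the frequencies will be close together, so the count would give little guidance there.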

Sent from ProtonMail mobile

 Original Message 
On 29 Jul 2022, 14:17, Clemens Vonrhein wrote:

> Maybe a crazy idea, but couldn't one use various model/geometry
> validation tools to figure out some of those ambiguities? As a test
> one could take a very good 1.7A structure and do some random ASN->ASP,
> THR->VAL etc mutations followed by refinement (including
> hydrogens). Wouldn't some validation tool pick up unfavourable
> conformations, poor rotamers and/or hydrogen clashes and poor
> H-networks (compared to the initial, correct sequence)?
>
> Maybe there is some kind of "fingerprint" in validation results for
> such incorrect residue assignments that can distinguish correct from
> incorrect sequences ... Or put another way, if model validation can
> not pick up such sequence errors: should we be worried about the
> reliability of our validation criteria?
>
> A large scale re-refinement of deposited structures with (1) the
> current/correct sequence and (2) those ASN/ASP, THR/VAL etc ambiguities
> artificially introduced, could provide a clever algorithm (AI?) with
> the data basis to figure out those "fingerprints". Maybe even for the
> ASN/GLN/HIS side-chain orientations when the sequence is actually
> correct.
>
> Cheers
>
> Clemens
>
> On Fri, Jul 29, 2022 at 12:08:58PM +, Jon Cooper wrote:
> > Thank you so much for your replies. I apologise for being unclear. The
> > protein is purified from a plant that hasn't had its genome sequence
> > determined. We know the enzyme family of the protein and therefore the
> > structure was originally solved by MR. The 'X-ray sequence' we have is just
> > determined from looking at the 1.7 Angstrom density, which is good, over
> > several refinement and rebuilding rounds. The resulting sequence has been run
> > through blast and it is up to 58% identical with other family members. To me
> > this seemed low but that degree of identity is typical of other family
> > members. The postgrad who did the work did obtain some peptide sequences and
> > prior to that about 20% of the sequence was determined by the Edman method
> > with the usual Asp/Asn and Glu/Gln ambiguity. However, there isn't any
> > prospect of us doing further experimental work, sorry, but that's the way it
> > is!!
> >
> > Best wishes, Jon.C.
> >
> > Sent from ProtonMail mobile
> >
> >  Original Message 
> > On 29 Jul 2022, 12:23, Jan Dohnalek wrote:
> >
> > > If you know at least something about your protein, organism, type of
> > > molecule, ..., you could try mass spectrometry peptide mapping to known
> > > sequences, this may give you some answers for the ambiguities you might be
> > > seeing, if nothing else ..
> > >
> > > Jan
> > >
> > > On Fri, Jul 29, 2022 at 12:15 PM Jon Cooper wrote:
> > >
> > >> Hello, I am looking for suggestions of ways to check a 1.7 Angstrom X-ray
> > >> sequence for a protein where it is impractical to do experimental
> > >> sequencing, protein or DNA. The structure refines to publishable R/R-free
> > >> and the main ambiguities seem to be Thr/Val, Asp/Asn and Glu/Gln where
> > >> alternative H-bonding networks are possible. Running alpha-fold seems an
> > >> interesting option? Any suggestions much appreciated.
> > >>
> > >> Cheers, Jon.C.
> > >>
> > >> Sent from ProtonMail mobile
> > >
> > > --
> > >
> > > Jan Dohnalek, Ph.D
> > > Institute of Biotechnology
> > >
> > > Academy of Sciences of the Czech Republic
> > > Biocev
> > > Prumyslova 595
> > > 252 50 Vestec near Prague
> > > Czech Republic
> > > Tel. +420 325 873 758
>
> --
> * Clemens Vonrhein, Ph.D. vonrhein AT GlobalPhasing DOT com
> * Global Phasing Ltd.,

Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-07-29 Thread Clemens Vonrhein
Maybe a crazy idea, but couldn't one use various model/geometry
validation tools to figure out some of those ambiguities? As a test
one could take a very good 1.7A structure and do some random ASN->ASP,
THR->VAL etc mutations followed by refinement (including
hydrogens). Wouldn't some validation tool pick up unfavourable
conformations, poor rotamers and/or hydrogen clashes and poor
H-networks (compared to the initial, correct sequence)?

Maybe there is some kind of "fingerprint" in validation results for
such incorrect residue assignments that can distinguish correct from
incorrect sequences ... Or put another way, if model validation can
not pick up such sequence errors: should we be worried about the
reliability of our validation criteria?

A large scale re-refinement of deposited structures with (1) the
current/correct sequence and (2) those ASN/ASP, THR/VAL etc ambiguities
artificially introduced, could provide a clever algorithm (AI?) with
the data basis to figure out those "fingerprints". Maybe even for the
ASN/GLN/HIS side-chain orientations when the sequence is actually
correct.

Cheers

Clemens
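
The first step of the test proposed above (introducing random near-isosteric swaps into a known-correct sequence) could be sketched as follows. This only chooses positions and records the swaps; editing the coordinate model, refinement and validation would be done in crystallographic software and are not shown. The function name introduce_swaps, the fraction and seed parameters, and the inclusion of the GLU/GLN pair (taken from elsewhere in the thread) are assumptions for illustration.

```python
import random

# Near-isosteric swaps discussed in the thread (one-letter codes).
SWAPS = {"N": "D", "D": "N", "Q": "E", "E": "Q", "T": "V", "V": "T"}


def introduce_swaps(sequence, fraction=0.1, seed=0):
    """Return (mutated_sequence, log) with a random subset of eligible residues swapped.

    log is a list of (1-based position, original residue, new residue).
    """
    rng = random.Random(seed)  # fixed seed so the test set is reproducible
    eligible = [i for i, res in enumerate(sequence) if res in SWAPS]
    n_swaps = max(1, round(fraction * len(eligible))) if eligible else 0
    chosen = sorted(rng.sample(eligible, n_swaps))
    mutated = list(sequence)
    log = []
    for i in chosen:
        old = mutated[i]
        mutated[i] = SWAPS[old]
        log.append((i + 1, old, mutated[i]))
    return "".join(mutated), log


if __name__ == "__main__":
    mutated, log = introduce_swaps("MNDTVEQKLSTNV", fraction=0.3, seed=42)
    print(mutated)
    for position, old, new in log:
        print(f"residue {position}: {old} -> {new}")
```

Keeping the log of swapped positions is what would allow validation scores at the altered residues to be compared with those at the untouched ones.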


On Fri, Jul 29, 2022 at 12:08:58PM +, Jon Cooper wrote:
> Thank you so much for your replies. I apologise for being unclear. The 
> protein is purified from a plant that hasn't had its genome sequence 
> determined. We know the enzyme family of the protein and therefore the 
> structure was originally solved by MR. The 'X-ray sequence' we have is just 
> determined from looking at the 1.7 Angstrom density, which is good, over 
> several refinement and rebuilding rounds. The resulting sequence has been run 
> through blast and it is up to 58% identical with other family members. To me 
> this seemed low but that degree of identity is typical of other family 
> members. The postgrad who did the work did obtain some peptide sequences and 
> prior to that about 20% of the sequence was determined by the Edman method 
> with the usual Asp/Asn and Glu/Gln ambiguity. However, there isn't any 
> prospect of us doing further experimental work, sorry, but that's the way it 
> is!!
> 
> Best wishes, Jon.C.
> 
> Sent from ProtonMail mobile
> 
>  Original Message 
> On 29 Jul 2022, 12:23, Jan Dohnalek wrote:
> 
> > If you know at least something about your protein, organism, type of 
> > molecule, ..., you could try mass spectrometry peptide mapping to known 
> > sequences, this may give you some answers for the ambiguities you might be 
> > seeing, if nothing else ..
> >
> > Jan
> >
> > On Fri, Jul 29, 2022 at 12:15 PM Jon Cooper 
> > <488a26d62010-dmarc-requ...@jiscmail.ac.uk> wrote:
> >
> >> Hello, I am looking for suggestions of ways to check a 1.7 Angstrom X-ray 
> >> sequence for a protein where it is impractical to do experimental 
> >> sequencing, protein or DNA. The structure refines to publishable R/R-free 
> >> and the main ambiguities seem to be Thr/Val, Asp/Asn and Glu/Gln where 
> >> alternative H-bonding networks are possible. Running alpha-fold seems an 
> >> interesting option? Any suggestions much appreciated.
> >>
> >> Cheers, Jon.C.
> >>
> >> Sent from ProtonMail mobile
> >>
> >
> > --
> >
> > Jan Dohnalek, Ph.D
> > Institute of Biotechnology
> >
> > Academy of Sciences of the Czech Republic
> > Biocev
> > Prumyslova 595
> > 252 50 Vestec near Prague
> > Czech Republic
> > Tel. +420 325 873 758
> 
> 
> 

-- 

*--
* Clemens Vonrhein, Ph.D. vonrhein AT GlobalPhasing DOT com
* Global Phasing Ltd., Sheraton House, Castle Park 
* Cambridge CB3 0AX, UK   www.globalphasing.com
*--





Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-07-29 Thread Natesh Ramanathan
Dear Jon,
 We did exactly the same back in 1999.
1) Natesh R, Bhanumoorthy P, Vithayathil PJ, Sekar K, Ramakumar S,
Viswamitra MA. (1999). Crystal structure at 1.8 Å resolution and proposed
amino acid sequence of a thermostable xylanase from *Thermoascus
aurantiacus*. J. Mol. Biol., 288, 999-1012.


and then updated the 1.8 A crystal structure derived sequence with ultra
high resolution electron density map.  We were also able to model
additional terminal residues in ultra high resolution map (reported in 2003
paper below).

2) Natesh R, Manikandan K, Bhanumoorthy P, Viswamitra MA, Ramakumar S.
(2003). Thermostable xylanase from *Thermoascus aurantiacus* at ultrahigh
resolution (0.89 Å) at 100 K and atomic resolution (1.11 Å) at 293 K
refined anisotropically to small-molecule accuracy. Acta Crystallogr., D
59,105-117. 



I will be happy to help off the egroup if you need more
information/help.

Best regards,
Natesh

--
--
"Live Simply and do Serious Things .. "
- Dorothy Mary Crowfoot Hodgkin OM, FRS

"In Science truth always wins"
- Max Ferdinand Perutz OM FRS
--
Dr. Ramanathan Natesh
Assistant Professor,
Founding President and President Cryo Electron Microscopy and 3 Dimensional
Image Processing Society of India (CEM3DIPSI)
http://cem3dipsi.iisertvm.ac.in/
School of Biology,
Indian Institute of Science Education and Research Thiruvananthapuram
(IISER-TVM),
Maruthamala P.O., Vithura,
Thiruvananthapuram,  695551, Kerala, India

nat...@iisertvm.ac.in
http://faculty.iisertvm.ac.in/natesh

*Researcher ID*: http://www.researcherid.com/rid/C-4488-2008
*ORCID*: http://orcid.org/-0002-1145-5962
Vidwan-ID : 94134: http://iisertvm.irins.org/profile/94134
*PUBLONS*: https://publons.com/author/1520837/ramanathan-natesh#profile

Office Ph. 0091- 471-2778087

On Fri, 29 Jul 2022 at 15:45, Jon Cooper <
488a26d62010-dmarc-requ...@jiscmail.ac.uk> wrote:

> Hello, I am looking for suggestions of ways to check a 1.7 Angstrom X-ray
> sequence for a protein where it is impractical to do experimental
> sequencing, protein or DNA. The structure refines to publishable R/R-free
> and the main ambiguities seem to be Thr/Val, Asp/Asn and Glu/Gln where
> alternative H-bonding networks are possible. Running alpha-fold seems an
> interesting option? Any suggestions much appreciated.
>
> Cheers, Jon.C.
>
> Sent from ProtonMail mobile
>
>
>
>





Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-07-29 Thread Jon Cooper
Thank you so much for your replies. I apologise for being unclear. The protein 
is purified from a plant that hasn't had its genome sequence determined. We 
know the enzyme family of the protein and therefore the structure was 
originally solved by MR. The 'X-ray sequence' we have is just determined from 
looking at the 1.7 Angstrom density, which is good, over several refinement and 
rebuilding rounds. The resulting sequence has been run through blast and it is 
up to 58% identical with other family members. To me this seemed low but that 
degree of identity is typical of other family members. The postgrad who did the 
work did obtain some peptide sequences and prior to that about 20% of the 
sequence was determined by the Edman method with the usual Asp/Asn and Glu/Gln 
ambiguity. However, there isn't any prospect of us doing further experimental 
work, sorry, but that's the way it is!!
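
For reference, a percent-identity figure like the one quoted above can be computed from a pairwise alignment as in the sketch below. It assumes two already-aligned strings of equal length with '-' for gaps and counts only columns where both sequences have a residue; BLAST reports its own identity figure and may use a different denominator, so the numbers need not agree exactly. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(f"{percent_identity('MKT-AYV', 'MRTLAYI'):.1f}% identical")
```

With an X-ray-derived sequence, positions assigned as one member of an ambiguous pair (for example Asp where the homologue has Asn) would count as mismatches here, so the true identity could be somewhat higher than the computed value.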

Best wishes, Jon.C.

Sent from ProtonMail mobile

 Original Message 
On 29 Jul 2022, 12:23, Jan Dohnalek wrote:

> If you know at least something about your protein, organism, type of 
> molecule, ..., you could try mass spectrometry peptide mapping to known 
> sequences, this may give you some answers for the ambiguities you might be 
> seeing, if nothing else ..
>
> Jan
>
> On Fri, Jul 29, 2022 at 12:15 PM Jon Cooper 
> <488a26d62010-dmarc-requ...@jiscmail.ac.uk> wrote:
>
>> Hello, I am looking for suggestions of ways to check a 1.7 Angstrom X-ray 
>> sequence for a protein where it is impractical to do experimental 
>> sequencing, protein or DNA. The structure refines to publishable R/R-free 
>> and the main ambiguities seem to be Thr/Val, Asp/Asn and Glu/Gln where 
>> alternative H-bonding networks are possible. Running alpha-fold seems an 
>> interesting option? Any suggestions much appreciated.
>>
>> Cheers, Jon.C.
>>
>> Sent from ProtonMail mobile
>>
>
> --
>
> Jan Dohnalek, Ph.D
> Institute of Biotechnology
>
> Academy of Sciences of the Czech Republic
> Biocev
> Prumyslova 595
> 252 50 Vestec near Prague
> Czech Republic
> Tel. +420 325 873 758





Re: [ccp4bb] Checking X-ray sequence (no more protein).

2022-07-29 Thread Jan Dohnalek
If you know at least something about your protein, organism, type of
molecule, ..., you could try mass spectrometry peptide mapping to known
sequences, this may give you some answers for the ambiguities you might be
seeing, if nothing else ..

Jan
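
To show why peptide masses can in principle separate the pairs that density cannot, the sketch below computes the mass difference for each ambiguous pair from standard monoisotopic residue masses. Whether a given instrument resolves these shifts depends on its mass accuracy, which is not addressed here; the function name mass_shift is an assumption for this example.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "D": 115.02694, "N": 114.04293,
    "E": 129.04259, "Q": 128.05858,
    "T": 101.04768, "V": 99.06841,
}

# The ambiguous pairs named in the original question.
AMBIGUOUS_PAIRS = [("D", "N"), ("E", "Q"), ("T", "V")]


def mass_shift(res_a, res_b):
    """Mass change (Da) of a peptide when res_b is replaced by res_a at one position."""
    return RESIDUE_MASS[res_a] - RESIDUE_MASS[res_b]


if __name__ == "__main__":
    for a, b in AMBIGUOUS_PAIRS:
        print(f"{a} vs {b}: {mass_shift(a, b):+.5f} Da")
```

The Asp/Asn and Glu/Gln pairs each differ by roughly 1 Da and Thr/Val by roughly 2 Da per substituted position.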


On Fri, Jul 29, 2022 at 12:15 PM Jon Cooper <
488a26d62010-dmarc-requ...@jiscmail.ac.uk> wrote:

> Hello, I am looking for suggestions of ways to check a 1.7 Angstrom X-ray
> sequence for a protein where it is impractical to do experimental
> sequencing, protein or DNA. The structure refines to publishable R/R-free
> and the main ambiguities seem to be Thr/Val, Asp/Asn and Glu/Gln where
> alternative H-bonding networks are possible. Running alpha-fold seems an
> interesting option? Any suggestions much appreciated.
>
> Cheers, Jon.C.
>
> Sent from ProtonMail mobile
>
>
>
>


-- 
Jan Dohnalek, Ph.D
Institute of Biotechnology
Academy of Sciences of the Czech Republic
Biocev
Prumyslova 595
252 50 Vestec near Prague
Czech Republic

Tel. +420 325 873 758


