2 days ago :-)

https://github.com/cdk/cdk/commit/dfbc32822e7d471bfb5a60aaf39a701371541280

On Fri, 26 Nov 2021 at 19:11, Andres Fernando Bernal Escobar <
andresf.bern...@utadeo.edu.co> wrote:

> Hi John,
> Thanks for the fix. When will it be merged into the main branch?
> Andrés
>
> On Fri, 26 Nov 2021 at 03:21, John Mayfield (
> john.wilkinson...@gmail.com) wrote:
>
>> Hi Andres,
>>
>> Excellent analysis - thank you. Good to see that the recent Phenol change
>> should bring things more in agreement.
>>
>> John
>>
>> On Wed, 24 Nov 2021 at 21:29, Andres Fernando Bernal Escobar <
>> andresf.bern...@utadeo.edu.co> wrote:
>>
>>> Hello John, thanks for your answer. I ran a quick comparison between CDK
>>> and PubChem, with a few hand-picked molecules. These are the results:
>>> https://docs.google.com/spreadsheets/d/1yl3b05W319ZQW5K9TZf0iMYHbPoJyP5BV8QMLhf1kLE/edit?usp=sharing
>>>
>>> I split the molecules into four subsets. The first comprises seemingly
>>> non-problematic molecules: carboxylic acids, amines, aliphatic esters,
>>> aliphatic ethers. In these cases CDK, PubChem and my own intuition are all
>>> in agreement.
>>>
>>> The second subset comprises molecules where I think CDK is wrong and
>>> PubChem is correct: phenols. This is due to the issue that you corrected in
>>> the branch you linked.
>>>
>>> The third subset comprises molecules where I think CDK is correct and
>>> PubChem is wrong: aromatic ethers, amides, nitro compounds. In the case of
>>> aromatic ethers, we know CDK explicitly introduces a correction to exclude
>>> aromatic ether oxygens from the HB acceptors count. I am not a specialist,
>>> but I understand there are sound reasons to make this exception. PubChem
>>> doesn't seem to implement it. In the case of amides and nitro compounds I
>>> don't quite understand what is going on with PubChem, but CDK's answer
>>> seems the correct one to me.
>>>
>>> The last subset comprises aromatic esters (acyloxy substituents). I
>>> honestly don't know what is correct in this case. Are oxygen atoms from
>>> aromatic esters also an exception, just as those from aromatic ethers? That
>>> would mean CDK is right. Otherwise, another correction is needed to make
>>> sure CDK excludes no oxygens on aromatic rings other than those of ethers.
>>>
>>> On Tue, 23 Nov 2021 at 04:27, John Mayfield (
>>> john.wilkinson...@gmail.com) wrote:
>>>
>>>> Thanks for your email. I've always thought the CDK HBond acceptor/donor
>>>> code is a little wonky and needs investigating. I don't have time to look
>>>> deeply at it, but yes, my reading is that it doesn't check for the ether
>>>> oxygen correctly. If someone were so inclined, checking CDK's (and RDKit's)
>>>> values against PubChem would be a quick project that may provide some
>>>> insight into missed cases and disagreements.
>>>>
>>>> I've made a change here to get the correct value for phenol:
>>>> https://github.com/cdk/cdk/compare/bug/hbondacceptor?expand=1
>>>>
>>>> On Fri, 15 Oct 2021 at 11:27, Guillermo Restrepo <
>>>> guillermo.restr...@mis.mpg.de> wrote:
>>>>
>>>>> We are working with some descriptors taken from the Reaxys database, which
>>>>> according to its owner are computed using your CDK library. We found
>>>>> something unexpected and would very much appreciate it if you could help
>>>>> us to understand.
>>>>>
>>>>> We noted that some phenols are reported as having 0 hydrogen bond
>>>>> acceptors, whereas we expected them to have at least one. We checked the
>>>>> CDK source code and found this comment in
>>>>> HBondAcceptorCountDescriptor.java:
>>>>>
>>>>> The following groups are counted as hydrogen bond acceptors:
>>>>> - any oxygen where the formal charge of the oxygen is non-positive
>>>>>   (i.e. formal charge <= 0) except
>>>>>     - an aromatic ether oxygen (i.e. an ether oxygen that is adjacent
>>>>>       to at least one aromatic carbon)
>>>>>     - an oxygen that is adjacent to a nitrogen
>>>>> - any nitrogen where the formal charge of the nitrogen is non-positive
>>>>>   (i.e. formal charge <= 0) except
>>>>>     - a nitrogen that is adjacent to an oxygen
>>>>>
>>>>> The way we understood it, this means that phenols should have at least
>>>>> one hydrogen bond acceptor. But further down in the same file, these
>>>>> lines seem to specify otherwise:
>>>>>
>>>>> // looking for suitable oxygen atoms
>>>>> else if (atom.getAtomicNumber() == IElement.O &&
>>>>>          atom.getFormalCharge() <= 0) {
>>>>>     // excluding oxygens that are adjacent to a nitrogen or to an
>>>>>     // aromatic carbon
>>>>>     List<IBond> neighbours = ac.getConnectedBondsList(atom);
>>>>>     for (IBond bond : neighbours) {
>>>>>         IAtom neighbor = bond.getOther(atom);
>>>>>         if (neighbor.getAtomicNumber() == IElement.N ||
>>>>>             (neighbor.getAtomicNumber() == IElement.C &&
>>>>>              neighbor.isAromatic() &&
>>>>>              bond.getOrder() != IBond.Order.DOUBLE))
>>>>>             continue atomloop;
>>>>>     }
>>>>>     hBondAcceptors++;
>>>>> }
>>>>>
>>>>> Is this intended, or is it a bug, or are we misunderstanding something?
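
For anyone following the thread, the gap between the documented rule and the
quoted loop can be reproduced with a small self-contained sketch. It uses toy
Atom and Bond classes (hypothetical stand-ins, not the CDK API), covers only
the oxygen branch, and assumes that "ether oxygen" means an oxygen with exactly
two carbon neighbours, which does not distinguish ethers from esters:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model for illustration only; none of these types belong to CDK. */
public class HBondAcceptorSketch {

    public static final class Atom {
        final String symbol;
        final int formalCharge;
        final boolean aromatic;
        final List<Bond> bonds = new ArrayList<>();

        Atom(String symbol, int formalCharge, boolean aromatic) {
            this.symbol = symbol;
            this.formalCharge = formalCharge;
            this.aromatic = aromatic;
        }
    }

    public static final class Bond {
        final Atom a, b;
        final boolean isDouble;

        Bond(Atom a, Atom b, boolean isDouble) {
            this.a = a;
            this.b = b;
            this.isDouble = isDouble;
            a.bonds.add(this);
            b.bonds.add(this);
        }

        Atom other(Atom atom) {
            return atom == a ? b : a;
        }
    }

    /** Oxygen rule as written in the quoted loop. */
    public static int countOxygensAsQuoted(List<Atom> atoms) {
        int count = 0;
        atomloop:
        for (Atom atom : atoms) {
            if (!atom.symbol.equals("O") || atom.formalCharge > 0) continue;
            for (Bond bond : atom.bonds) {
                Atom nbr = bond.other(atom);
                // Any N neighbour, or any aromatic C via a non-double bond, excludes the oxygen.
                if (nbr.symbol.equals("N")
                        || (nbr.symbol.equals("C") && nbr.aromatic && !bond.isDouble)) {
                    continue atomloop;
                }
            }
            count++;
        }
        return count;
    }

    /** Oxygen rule as the quoted doc comment describes it. */
    public static int countOxygensAsDocumented(List<Atom> atoms) {
        int count = 0;
        for (Atom atom : atoms) {
            if (!atom.symbol.equals("O") || atom.formalCharge > 0) continue;
            int carbonNeighbours = 0;
            boolean nextToNitrogen = false;
            boolean nextToAromaticCarbon = false;
            for (Bond bond : atom.bonds) {
                Atom nbr = bond.other(atom);
                if (nbr.symbol.equals("N")) nextToNitrogen = true;
                if (nbr.symbol.equals("C")) {
                    carbonNeighbours++;
                    if (nbr.aromatic) nextToAromaticCarbon = true;
                }
            }
            // Assumption: an "ether oxygen" has exactly two carbon neighbours.
            boolean aromaticEther = carbonNeighbours == 2 && nextToAromaticCarbon;
            if (!nextToNitrogen && !aromaticEther) count++;
        }
        return count;
    }

    /** Builds Ar-OH (phenol) or Ar-O-CH3 (anisole); hydrogens are implicit. */
    public static List<Atom> arylOxygen(boolean methylated) {
        List<Atom> atoms = new ArrayList<>();
        for (int i = 0; i < 6; i++) atoms.add(new Atom("C", 0, true));
        for (int i = 0; i < 6; i++) new Bond(atoms.get(i), atoms.get((i + 1) % 6), false);
        Atom oxygen = new Atom("O", 0, false);
        atoms.add(oxygen);
        new Bond(atoms.get(0), oxygen, false);
        if (methylated) {
            Atom methyl = new Atom("C", 0, false);
            atoms.add(methyl);
            new Bond(oxygen, methyl, false);
        }
        return atoms;
    }

    public static void main(String[] args) {
        List<Atom> phenol = arylOxygen(false);
        List<Atom> anisole = arylOxygen(true);
        System.out.println("phenol:  quoted=" + countOxygensAsQuoted(phenol)
                + " documented=" + countOxygensAsDocumented(phenol));
        System.out.println("anisole: quoted=" + countOxygensAsQuoted(anisole)
                + " documented=" + countOxygensAsDocumented(anisole));
    }
}
```

Under these assumptions the quoted loop gives 0 for both phenol and anisole,
while the documented rule gives 1 for phenol and 0 for anisole, which matches
the discrepancy described above.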
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Cdk-user mailing list
>>>>> Cdk-user@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>> *Andrés Bernal*
>>> *Área de Ciencias Básicas y Modelado*
>>> *Associate Professor*
>>> Ext. 1705
>>> andresf.bern...@utadeo.edu.co
>>> Utadeo address: Carrera 4 # 22-61
>>>
>>>
>>>
>>> *ADVERTENCIA SOBRE CONFIDENCIALIDAD*
>>>
>>> Las opiniones expresadas en el presente mensaje no representan
>>> necesariamente la opinión oficial de La Universidad de Bogotá Jorge Tadeo
>>> lozano. La información contenida en este correo electrónico, incluyendo sus
>>> anexos, está dirigida exclusivamente a su destinatario y puede contener
>>> datos de carácter confidencial protegidos por la ley. Si usted no es el
>>> destinatario de este mensaje por favor infórmenos y elimínelo a la mayor
>>> brevedad. Cualquier retención, difusión, distribución, divulgación o copia
>>> de éste mensaje es prohibida y será sancionada por la ley.
>>>
>>> Este mensaje ha sido sometido a programas antivirus. No obstante, La
>>> Universidad de Bogotá Jorge Tadeo lozano no asume ninguna responsabilidad
>>> por eventuales daños generados por el recibo y uso de este material, siendo
>>> responsabilidad del destinatario verificar con sus propios medios de la
>>> existencia de virus u otros defectos.
>>>
>>>  *WARNING ABOUT CONFIDENTIAL INFORMATION*
>>>
>>> The opinions expressed herein do not necessarily reflect the positions
>>> of the Universidad de Bogotá Jorge Tadeo Lozano. The information contained
>>> in this electronic mail and attachments is confidential and intended only
>>> for the use of the individual or entity to whom it is addressed and may
>>> have confidential data. If you are not the intended recipient, you are
>>> hereby notified that any disclosure, copying, distribution, or any other
>>> use of the information is strictly prohibited and has legal repercussions.
>>> Therefore, if you have received this document by mistake, please notify the
>>> sender immediately and destroy this document and attachments without making
>>> any copy of any kind.
>>>
>>> This message has been tested by antivirus software. Nonetheless, the
>>> Universidad de Bogotá Jorge Tadeo Lozano assumes no liability for any
>>> damages or loss of any kind that might arise from the use of, misuse of, or
>>> the inability to use the materials contained on this electronic message. It
>>> is the responsibility of the recipient to verify by his own means the
>>> presence of a virus or any other harmful components, defects or errors.
>>>
>>
>>
>