Hi John,

Thanks for the fix. When will it be merged into the main branch?

Andrés

On Fri, 26 Nov 2021 at 03:21, John Mayfield (john.wilkinson...@gmail.com) wrote:

> Hi Andres,
>
> Excellent analysis - thank you. Good to see that the recent Phenol change
> should bring things more in agreement.
>
> John
>
> On Wed, 24 Nov 2021 at 21:29, Andres Fernando Bernal Escobar
> <andresf.bern...@utadeo.edu.co> wrote:
>
>> Hello John, thanks for your answer. I ran a quick comparison between CDK
>> and PubChem, with a few hand-picked molecules. These are the results:
>> https://docs.google.com/spreadsheets/d/1yl3b05W319ZQW5K9TZf0iMYHbPoJyP5BV8QMLhf1kLE/edit?usp=sharing
>>
>> I split the molecules into four subsets. The first comprises seemingly
>> non-problematic molecules: carboxylic acids, amines, aliphatic esters,
>> aliphatic ethers. In these cases CDK, PubChem and my own intuition are all
>> in agreement.
>>
>> The second subset comprises molecules where I think CDK is wrong and
>> PubChem is correct: phenols. This is due to the issue that you corrected in
>> the branch you linked.
>>
>> The third subset comprises molecules where I think CDK is correct and
>> PubChem is wrong: aromatic ethers, amides, nitro compounds. In the case of
>> aromatic ethers, we know CDK explicitly introduces a correction to exclude
>> aromatic ether oxygens from the HB acceptor count. I am not a specialist,
>> but I understand there are sound reasons to make this exception. PubChem
>> doesn't seem to implement it. In the case of amides and nitro compounds I
>> don't quite understand what is going on with PubChem, but CDK's answer
>> seems the correct one to me.
>>
>> The last subset comprises aromatic esters (acyloxy substituents). I
>> honestly don't know what is correct in this case. Are oxygen atoms from
>> aromatic esters also an exception, just as those from aromatic ethers? That
>> would mean CDK is right. Otherwise, another correction is needed to make
>> sure CDK excludes no oxygens on aromatic rings other than those of ethers.
>>
>> On Tue, 23 Nov 2021 at 04:27, John Mayfield (john.wilkinson...@gmail.com)
>> wrote:
>>
>>> Thanks for your email. I've always thought the CDK HBond acceptor/donor
>>> code is a little wonky and needs investigating. I don't have time to look
>>> deeply at it, but yes, my reading of this is that it doesn't check for the
>>> ether oxygen correctly. If someone was inclined, checking CDK's (and
>>> RDKit's) values against PubChem would be a quick project that may provide
>>> some insight into missed cases and disagreements.
>>>
>>> I've made a change here to get the correct value for phenol:
>>> https://github.com/cdk/cdk/compare/bug/hbondacceptor?expand=1
>>>
>>> On Fri, 15 Oct 2021 at 11:27, Guillermo Restrepo
>>> <guillermo.restr...@mis.mpg.de> wrote:
>>>
>>>> We are working with some descriptors taken from the Reaxys database,
>>>> which according to its owner are computed using your CDK library. We
>>>> found something unexpected and would very much appreciate it if you
>>>> could help us to understand.
>>>>
>>>> We noted that some phenols are reported as having 0 hydrogen bond
>>>> acceptors, whereas we expected them to have at least one. We checked
>>>> the CDK source code and found this comment in
>>>> HBondAcceptorCountDescriptor.java:
>>>>
>>>> The following groups are counted as hydrogen bond acceptors:
>>>> - any oxygen where the formal charge of the oxygen is non-positive
>>>>   (i.e. formal charge <= 0) except
>>>>     - an aromatic ether oxygen (i.e. an ether oxygen that is adjacent
>>>>       to at least one aromatic carbon)
>>>>     - an oxygen that is adjacent to a nitrogen
>>>> - any nitrogen where the formal charge of the nitrogen is non-positive
>>>>   (i.e. formal charge <= 0) except
>>>>     - a nitrogen that is adjacent to an oxygen
>>>>
>>>> The way we understood it, this means that phenols should have at least
>>>> one hydrogen bond acceptor.
>>>> But further down in the same file, these lines seem to specify otherwise:
>>>>
>>>>     // looking for suitable oxygen atoms
>>>>     else if (atom.getAtomicNumber() == IElement.O &&
>>>>              atom.getFormalCharge() <= 0) {
>>>>         // excluding oxygens that are adjacent to a nitrogen or to an aromatic carbon
>>>>         List<IBond> neighbours = ac.getConnectedBondsList(atom);
>>>>         for (IBond bond : neighbours) {
>>>>             IAtom neighbor = bond.getOther(atom);
>>>>             if (neighbor.getAtomicNumber() == IElement.N ||
>>>>                 (neighbor.getAtomicNumber() == IElement.C &&
>>>>                  neighbor.isAromatic() &&
>>>>                  bond.getOrder() != IBond.Order.DOUBLE))
>>>>                 continue atomloop;;
>>>>         }
>>>>         hBondAcceptors++;
>>>>     }
>>>>
>>>> Is this intended, or is it a bug, or are we misunderstanding something?
>>
>> --
>> *Andrés Bernal*
>> *Área de Ciencias Básicas y Modelado*
>> *Associate Professor*
>> Ext. 1705
>> andresf.bern...@utadeo.edu.co
>> Utadeo address: Carrera 4 # 22-61

--
**WARNING ABOUT CONFIDENTIAL INFORMATION**

The opinions expressed herein do not necessarily reflect the positions of
the Universidad de Bogotá Jorge Tadeo Lozano. The information contained in
this electronic mail and attachments is confidential and intended only for
the use of the individual or entity to whom it is addressed and may have
confidential data. If you are not the intended recipient, you are hereby
notified that any disclosure, copying, distribution, or any other use of
the information is strictly prohibited and has legal repercussions.
Therefore, if you have received this document by mistake, please notify the
sender immediately and destroy this document and attachments without making
any copy of any kind.

This message has been tested by antivirus software. Nonetheless, the
Universidad de Bogotá Jorge Tadeo Lozano assumes no liability for any
damages or loss of any kind that might arise from the use of, misuse of, or
the inability to use the materials contained on this electronic message. It
is the responsibility of the recipient to verify by his own means the
presence of a virus or any other harmful components, defects or errors.

_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user
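
For anyone following the thread, here is a minimal sketch of the discrepancy it discusses, under stated assumptions. It does not use the CDK API: HBondAcceptorSketch, Atom, Bond, connect and arylOxygen are hypothetical names for a toy model with implicit hydrogens. It covers only the oxygen part of the rule, and it assumes that "ether oxygen" means an oxygen with two single-bonded carbon neighbours, which does not separate esters from ethers (the question left open above).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model for illustration only; none of these types belong to CDK. */
final class HBondAcceptorSketch {

    /** Minimal atom: element symbol, formal charge, aromatic flag, bonds. */
    static final class Atom {
        final String symbol;
        final int charge;
        final boolean aromatic;
        final List<Bond> bonds = new ArrayList<>();

        Atom(String symbol, int charge, boolean aromatic) {
            this.symbol = symbol;
            this.charge = charge;
            this.aromatic = aromatic;
        }
    }

    /** Minimal bond; order 1 = single, 2 = double. */
    static final class Bond {
        final Atom a;
        final Atom b;
        final int order;

        Bond(Atom a, Atom b, int order) {
            this.a = a;
            this.b = b;
            this.order = order;
        }

        Atom other(Atom x) {
            return x == a ? b : a;
        }
    }

    /** Creates a bond and registers it on both atoms. */
    static void connect(Atom a, Atom b, int order) {
        Bond bond = new Bond(a, b, order);
        a.bonds.add(bond);
        b.bonds.add(bond);
    }

    /** Oxygen rule as read from the quoted snippet: any N neighbour, or any
     *  aromatic C neighbour not double-bonded to the oxygen, excludes it. */
    static int countOxygenAcceptorsAsCoded(List<Atom> atoms) {
        int count = 0;
        atomloop:
        for (Atom atom : atoms) {
            if (!atom.symbol.equals("O") || atom.charge > 0) continue;
            for (Bond bond : atom.bonds) {
                Atom nbr = bond.other(atom);
                if (nbr.symbol.equals("N")
                        || (nbr.symbol.equals("C") && nbr.aromatic && bond.order != 2)) {
                    continue atomloop;
                }
            }
            count++;
        }
        return count;
    }

    /** Oxygen rule as read from the quoted comment: only ether oxygens on an
     *  aromatic carbon, and oxygens next to a nitrogen, are excluded. */
    static int countOxygenAcceptorsAsDocumented(List<Atom> atoms) {
        int count = 0;
        for (Atom atom : atoms) {
            if (!atom.symbol.equals("O") || atom.charge > 0) continue;
            boolean nextToNitrogen = false;
            boolean nextToAromaticCarbon = false;
            int singleBondedCarbons = 0;
            for (Bond bond : atom.bonds) {
                Atom nbr = bond.other(atom);
                if (nbr.symbol.equals("N")) nextToNitrogen = true;
                if (nbr.symbol.equals("C")) {
                    if (bond.order == 1) singleBondedCarbons++;
                    if (nbr.aromatic) nextToAromaticCarbon = true;
                }
            }
            // Assumption: two single-bonded carbon neighbours means "ether".
            boolean ether = singleBondedCarbons == 2;
            if (nextToNitrogen || (ether && nextToAromaticCarbon)) continue;
            count++;
        }
        return count;
    }

    /** Builds phenol (methylated = false) or anisole (methylated = true). */
    static List<Atom> arylOxygen(boolean methylated) {
        List<Atom> atoms = new ArrayList<>();
        for (int i = 0; i < 6; i++) atoms.add(new Atom("C", 0, true));
        // Ring bonds are stored as single; aromaticity lives on the atom flag.
        for (int i = 0; i < 6; i++) connect(atoms.get(i), atoms.get((i + 1) % 6), 1);
        Atom oxygen = new Atom("O", 0, false);
        atoms.add(oxygen);
        connect(atoms.get(0), oxygen, 1);
        if (methylated) {
            Atom methyl = new Atom("C", 0, false);
            atoms.add(methyl);
            connect(oxygen, methyl, 1);
        }
        return atoms;
    }

    public static void main(String[] args) {
        List<Atom> phenol = arylOxygen(false);
        List<Atom> anisole = arylOxygen(true);
        System.out.println("phenol:  coded=" + countOxygenAcceptorsAsCoded(phenol)
                + " documented=" + countOxygenAcceptorsAsDocumented(phenol));
        System.out.println("anisole: coded=" + countOxygenAcceptorsAsCoded(anisole)
                + " documented=" + countOxygenAcceptorsAsDocumented(anisole));
    }
}
```

Under these assumptions the two readings agree on anisole (0 acceptors either way) and disagree on phenol (0 from the snippet's logic, 1 from the comment's wording), which matches the mismatch reported in the original message. Aromatic esters would still be excluded by this sketch's ether test, so it says nothing about which answer is right for them.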