Hi guys,
I just had the chance to test the corrected version. In general the
inconsistencies are resolved, but I still have one doubt.
Molecules from Row2 to Row8 now all have the correct values for
LargestChainDescriptor. However, I think that the Row0 and Row1 molecules
should have (given the descriptor's definition) a LargestChainDescriptor = 1,
because they do have one atom in a non-cyclic system. In my opinion
LargestChainDescriptor should distinguish between these cases and those where
no atoms are present in a non-cyclic system. With the current implementation
it seems that a LargestChainDescriptor = 1 is not possible.
In my mind:
1. "Cc1nn(c(N)c1)-c2nc3c(s2)cccc3" should have a LargestChainDescriptor = 1
2. "s1c2c(cccc2)nc1-n3cccn3" should have a LargestChainDescriptor = 0
What do you think about this? Am I missing something?
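
To make the proposal concrete, here is a minimal sketch in Python of the
counting convention I have in mind. It is not the CDK code: the function names
(adjacency, ring_atoms, largest_chain) are made up for illustration, the
molecules are hand-encoded as bond lists read off the SMILES, and aromaticity
is ignored because every aromatic atom in these examples is also a ring atom.
The value returned is the number of atoms in the longest path made only of
non-ring atoms, so a single non-ring atom gives 1 and no non-ring atoms gives 0.

```python
from collections import defaultdict

def adjacency(bonds):
    """Map each atom index to the set of its bonded neighbours."""
    adj = defaultdict(set)
    for u, v in bonds:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def ring_atoms(bonds):
    """Atoms that have at least one bond lying on a cycle."""
    adj = adjacency(bonds)
    in_ring = set()
    for u, v in bonds:
        # Walk from u without using the bond u-v itself;
        # if v is still reachable, the bond u-v is part of a ring.
        seen, stack = {u}, [u]
        while stack:
            a = stack.pop()
            for b in adj[a]:
                if {a, b} == {u, v} or b in seen:
                    continue
                seen.add(b)
                stack.append(b)
        if v in seen:
            in_ring.update((u, v))
    return in_ring

def largest_chain(n_atoms, bonds):
    """Atom count of the longest path through non-ring atoms (0 if none)."""
    adj = adjacency(bonds)
    chain_atoms = set(range(n_atoms)) - ring_atoms(bonds)

    def longest_from(atom, visited):
        best = 1  # the atom itself counts as a chain of one
        for nbr in adj[atom]:
            if nbr in chain_atoms and nbr not in visited:
                best = max(best, 1 + longest_from(nbr, visited | {nbr}))
        return best

    return max((longest_from(a, {a}) for a in chain_atoms), default=0)

# "Cc1nn(c(N)c1)-c2nc3c(s2)cccc3": 16 heavy atoms, numbered in SMILES order
EXAMPLE_1 = (16, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (6, 1),
                  (3, 7), (7, 8), (8, 9), (9, 10), (10, 11), (11, 7),
                  (10, 12), (12, 13), (13, 14), (14, 15), (15, 9)])

# "s1c2c(cccc2)nc1-n3cccn3": 14 heavy atoms, all of them in rings
EXAMPLE_2 = (14, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
                  (2, 7), (7, 8), (8, 0), (8, 9), (9, 10), (10, 11),
                  (11, 12), (12, 13), (13, 9)])

# Row2, "Nc1c(cn[nH]1)C#N": 8 heavy atoms, the C#N group is the longest chain
ROW_2 = (8, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (2, 6), (6, 7)])

print(largest_chain(*EXAMPLE_1))  # 1
print(largest_chain(*EXAMPLE_2))  # 0
print(largest_chain(*ROW_2))      # 2
```

Under this convention the two molecules above are told apart (1 versus 0),
and Row2 still gives 2 for the C#N group.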
Thanks in advance for your help and support.
Gio
On Thu, Feb 11, 2016 at 4:47 PM, Rajarshi Guha <rajarshi.g...@gmail.com>
wrote:
> Yes, it does appear to be a bug.
>
> Inspecting the code, the LargestChainDescriptor is looking for the longest
> path that contains non-aromatic, non-ring atoms.
>
> For Row0, there are only two non-ring, non-aromatic atoms and so there
> are two possible chains with a single atom, hence each chain has the value
> of 0.
>
> For Row2, the longest chain according to the definition is the C#N
> substructure, hence the value should be 2
>
> For Row5 and Row6, the values should also be 2.
>
> I've attached a patch against master that fixes this (I wasn't sure how to
> make a pull request from a local branch)
>
>
> On Thu, Feb 11, 2016 at 7:50 AM, Giovanni Cincilla <gcinci...@gmail.com>
> wrote:
>
>> Thank you for your quick reply.
>> Examples are provided in the original KNIME forum thread (
>> https://tech.knime.org/forum/cdk/cdk-largest-chain-descriptor-inconsistencies#comment-41502),
>> anyway I can report those here:
>>
>> 2. In some cases terminal atoms seem not to be counted as part of the
>> largest chain LC (LC Row0 = 0; LC Row2 = 1), while in other cases they are
>> (LC Row5 = 2; LC Row6 = 2).
>> 3. In some cases iso terminal groups are counted as 2 (Row4), while in
>> other cases they are counted as 3 (Row8).
>>
>> *Molecules:*
>> "Row0","Cc1nn(c(c1)N)c1nc2c(s1)cccc2"
>> "Row1","Clc1cnn(c(=O)c1Cl)Cc1[nH]c(=O)c2c(n1)c1ccccc1o2"
>> "Row2","Nc1c(cn[nH]1)C#N"
>> "Row3","Fc1ccc(cc1)C(=O)c1ccc(cc1)Oc1ncc(cn1)Br"
>> "Row4","[O-][n+]1ccc(cc1)[N+](=O)[O-]"
>> "Row5","OCc1ccccc1CN"
>> "Row6","COc1ccc(cc1)c1noc(c1)Cn1nc(C)c(c(c1=O)C#N)C"
>> "Row7","Cc1nc(SCc2nc3ccsc3c(=O)[nH]2)c2c(n1)scc2c1cccs1"
>> "Row8","CC(=O)c1cccc(c1)Nc1nc(nc2c1cccc2)c1cccnc1"
>>
>> I hope the examples are quite clear.
>>
>> On Thu, Feb 11, 2016 at 1:18 PM, Rajarshi Guha <rajarshi.g...@gmail.com>
>> wrote:
>>
>>> Could you provide an example where issues 2 & 3 show up?
>>>
>>> The descriptor is meant to compute the length of the longest aliphatic
>>> chain in a molecule
>>>
>>> On Thu, Feb 11, 2016 at 4:29 AM, Giovanni Cincilla <gcinci...@gmail.com>
>>> wrote:
>>>
>>>> Dear all,
>>>> I use CDK mainly through KNIME and I found some apparent
>>>> inconsistencies using the LargestChainDescriptor. I originally posted my
>>>> question in the KNIME-CDK forum, where I also provided examples:
>>>>
>>>>
>>>> https://tech.knime.org/forum/cdk/cdk-largest-chain-descriptor-inconsistencies#comment-41502
>>>>
>>>> I'm not sure about the purpose of this descriptor. Essentially my
>>>> doubts are the following:
>>>>
>>>> 1. The LargestChainDescriptor counts atoms that belong to aliphatic
>>>> rings. Is that correct, or is it a bug? The word "chain", in my opinion,
>>>> seems opposed to the word "ring".
>>>> 2. In some cases terminal atoms seem not to be counted as part of
>>>> the largest chain, while in other cases they are.
>>>> 3. In some cases iso terminal groups are counted as 2, while in
>>>> other cases they are counted as 3.
>>>> 4. Atoms between rings in some cases seem to be correctly counted, while
>>>> in other cases they are not.
>>>>
>>>> Please, can anybody provide some clarification about these issues?
>>>> Thanks,
>>>> Gio
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance
>>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
>>>> Monitor end-to-end web transactions and take corrective actions now
>>>> Troubleshoot faster and improve end-user experience. Signup Now!
>>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
>>>> _______________________________________________
>>>> Cdk-user mailing list
>>>> Cdk-user@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>
>>>>
>>>
>>>
>>> --
>>> Rajarshi Guha | http://blog.rguha.net
>>> NIH Center for Advancing Translational Science
>>>
>>
>>
>
>
> --
> Rajarshi Guha | http://blog.rguha.net
> NIH Center for Advancing Translational Science
>