Re: [Rdkit-discuss] A couple of questions about CoordGen in RDKit

2018-06-02 Thread Jason Biggs
Nicola,
Thanks for the example, I can definitely see the improvement in the diagram
from the template.  Is it mainly then these complicated bridged ring
systems that use the templates?  I do like the diagrams from Coordgen very
much, even when it doesn't use a template.

How difficult will it be to add more templates to the templates.mae file -
as I find examples of molecules that don't do well in the default method?
Can this be done with the rdkit, or would it need something else from the
Schrodinger repo?


This would be a question for both Greg and Nicola:  What would be good
words to describe the layout methods used by coordgen vs rdkit?  I want to
have an option for the user,  Molecule[ , DiagramLayout ->
"MethodName"], but just using "RDKit" and "CoordGen" isn't right because
those names don't describe the underlying algorithms, just the libraries
that implement them.

Would it be wrong to call the rdkit method "DistanceGeometry"?  What is the
main distinction between the two methods?

Best,

Jason


(the image from Nicola's email didn't come through to me, showing the
example with template on the bottom, without template on the top - big
improvement)






On Sat, Jun 2, 2018 at 7:08 AM, Nicola Zonta wrote:

> Hello,
>
> yes, I don’t think we check for the existence of the directory (I got rid
> of that code when we released cause it was using a proprietary lib and
> never replaced it). It’s surprising that you get the same results though.
>
> here’s the smiles I use (or you can use any molecule in templates.mae (I
> am not sure if maeparser has been integrated with RDKit yet?) )
>
> C12CC3CC(C1)CC(C2)C3
>
> which should look something like this if the templates are used
>
>
> weird about the unstable coordinates, I think looking at the structure it
> has to do with the minimisation but I have no quick solution for it
>
>
> On 02 Jun 2018, at 12:35, Greg Landrum wrote:
>
> Hi Jason,
>
> That's a great question. I can also confirm that it seems that setting the
> parameter file location to a bogus value seems to have no effect.
> @Nic: can you help us out here? I figure you can probably answer the
> question quicker than I can dig through the code. :-)
>
> -greg
>
>
> On Thu, May 31, 2018 at 10:22 PM Jason Biggs wrote:
>
>> I recently switched to the 2018_03_1 release, and I am trying out the new
>> 2D coordinate generating functions.  The diagrams look good, but I can't
>> seem to figure out what the role of the template file is.
>>
>> I find that I can set the templateFileDir parameter either to a real
>> directory with the templates.mae file in it, or to an almost-empty string
>> " ", and it has no effect.  Is there an example SMILES where using the
>> template file changes the returned diagram?
>>
>> Another thing I notice is the conformer generated by CoordGen isn't
>> always reproducible.  I find that if I run the following code multiple
>> times, I will get different results:
>>
>>
>> from rdkit import Chem
>> from rdkit.Chem import rdCoordGen
>>
>> m = Chem.MolFromSmiles('CO[C@H]1[C@]2(O)C(=O)N3C=CC(C)(C)c4c(C=C3C(=O)N2[C@@]23[C@@]1(O)c1c1N3C([C@H]2C)(C)C)c1c1[nH]4')
>> Chem.rdCoordGen.AddCoords(m)
>> m.GetConformer(0).GetPositions()[0]
>>
>> will sometimes output
>>
>> array([-1.2795,  1.35720001,  0.])
>>
>>
>> but other times outputs
>>
>> array([-1.28240005,  1.365 ,  0.])
>>
>>
>> Obviously it's a small difference, but I would prefer to always return
>> the same values for the same input.
>>
>> Best,
>> Jason
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org !
>> http://sdm.link/slashdot___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
>


Re: [Rdkit-discuss] A couple of questions about CoordGen in RDKit

2018-06-02 Thread Greg Landrum
Hi Jason,

That's a great question. I can also confirm that setting the parameter
file location to a bogus value seems to have no effect.
@Nic: can you help us out here? I figure you can probably answer the
question quicker than I can dig through the code. :-)

-greg


On Thu, May 31, 2018 at 10:22 PM Jason Biggs wrote:

> I recently switched to the 2018_03_1 release, and I am trying out the new
> 2D coordinate generating functions.  The diagrams look good, but I can't
> seem to figure out what the role of the template file is.
>
> I find that I can set the templateFileDir parameter either to a real
> directory with the templates.mae file in it, or to an almost-empty string
> " ", and it has no effect.  Is there an example SMILES where using the
> template file changes the returned diagram?
>
> Another thing I notice is the conformer generated by CoordGen isn't always
> reproducible.  I find that if I run the following code multiple times, I
> will get different results:
>
>
> from rdkit import Chem
> from rdkit.Chem import rdCoordGen
>
> m = Chem.MolFromSmiles('CO[C@H]1[C@]2(O)C(=O)N3C=CC(C)(C)c4c(C=C3C(=O)N2[C@@]23[C@@]1(O)c1c1N3C([C@H]2C)(C)C)c1c1[nH]4')
> Chem.rdCoordGen.AddCoords(m)
> m.GetConformer(0).GetPositions()[0]
>
> will sometimes output
>
> array([-1.2795,  1.35720001,  0.])
>
>
> but other times outputs
>
> array([-1.28240005,  1.365 ,  0.])
>
>
> Obviously it's a small difference, but I would prefer to always return the
> same values for the same input.
>
> Best,
> Jason
>


Re: [Rdkit-discuss] A couple of questions about CoordGen in RDKit

2018-06-02 Thread Nicola Zonta
Hi,

the core of the coordgen method is based on this paper
https://pubs.acs.org/doi/abs/10.1021/ci050550m 


> On 2 Jun 2018, at 15:59, Greg Landrum wrote:
> 
> Hi Jason,
> 
> On Sat, Jun 2, 2018 at 3:41 PM Jason Biggs wrote:
> 
>> This would be a question for both Greg and Nicola:  What would be good words
>> to describe the layout methods used by coordgen vs rdkit?  I want to have an
>> option for the user,  Molecule[ , DiagramLayout -> "MethodName"],
>> but just using "RDKit" and "CoordGen" isn't right because those names don't
>> describe the underlying algorithms, just the libraries that implement them.
>> 
>> Would it be wrong to call the rdkit method "DistanceGeometry"?  What is the
>> main distinction between the two methods?
> 
> The RDKit uses distance geometry for 3D conformations. For 2D coordinates 
> it's using an internal algorithm that hasn't really been described anywhere. 
> I think it's fine to call it the "RDKit depiction algorithm" or "RDKit layout 
> algorithm" if you prefer.
> 
> -greg
> 
> 



Re: [Rdkit-discuss] A couple of questions about CoordGen in RDKit

2018-06-02 Thread Nicola Zonta
Hello,

yes, I don’t think we check for the existence of the directory (I got rid of 
that code when we released cause it was using a proprietary lib and never 
replaced it). It’s surprising that you get the same results though.

here’s the smiles I use (or you can use any molecule in templates.mae (I am not 
sure if maeparser has been integrated with RDKit yet?) )

C12CC3CC(C1)CC(C2)C3

which should look something like this if the templates are used
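
To reproduce the comparison locally, here is a minimal sketch. It assumes that CoordGenParams exposes the templateFileDir attribute discussed in this thread and that AddCoords accepts the params object as its second argument. The helper names coordgen_positions and max_displacement are hypothetical, and any template directory you pass is your own path. Since the directory is not checked for existence, a wrong path would probably just give the untemplated layout without an error.

```python
import math


def coordgen_positions(smiles, template_dir=None):
    """Return CoordGen 2D coordinates for a SMILES as a list of [x, y, z].

    template_dir is passed to CoordGenParams.templateFileDir when given
    (assumption: the attribute behaves as described in the thread).
    """
    # Imported lazily so the pure-Python helper below works without RDKit.
    from rdkit import Chem
    from rdkit.Chem import rdCoordGen

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    params = rdCoordGen.CoordGenParams()
    if template_dir is not None:
        params.templateFileDir = template_dir
    rdCoordGen.AddCoords(mol, params)
    return mol.GetConformer().GetPositions().tolist()


def max_displacement(coords_a, coords_b):
    """Largest per-atom Euclidean distance between two coordinate sets."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets have different atom counts")
    return max(math.dist(a, b) for a, b in zip(coords_a, coords_b))
```

Calling coordgen_positions on the SMILES above once without and once with a template directory, then passing both results to max_displacement, gives a rough number for how much the layout changed. A value near zero would suggest the templates were not picked up.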



weird about the unstable coordinates, I think looking at the structure it has 
to do with the minimisation but I have no quick solution for it 
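
Until the cause is found, the instability can at least be measured. The following is a rough sketch, not a fix: it regenerates the coordinates several times from a fresh molecule and counts the runs that differ from the first one by more than a tolerance. The names count_unstable_runs, max_abs_difference and coordgen_generator are hypothetical, and the tolerance is an arbitrary choice. The SMILES from the original report is line-wrapped in the archive and may not parse as shown, so the sketch takes any SMILES.

```python
def max_abs_difference(coords_a, coords_b):
    """Largest absolute difference over all coordinate components."""
    return max(
        abs(x - y)
        for point_a, point_b in zip(coords_a, coords_b)
        for x, y in zip(point_a, point_b)
    )


def count_unstable_runs(generate, n_runs=10, tol=1e-6):
    """Call generate() n_runs times; count runs that differ from the first.

    generate is any zero-argument callable returning a list of [x, y, z].
    """
    reference = generate()
    unstable = 0
    for _ in range(n_runs - 1):
        if max_abs_difference(reference, generate()) > tol:
            unstable += 1
    return unstable


def coordgen_generator(smiles):
    """Build a generate() callable that runs CoordGen on a fresh molecule."""
    # Imported lazily so the helpers above work without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import rdCoordGen

    def generate():
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            raise ValueError("could not parse SMILES: %r" % smiles)
        rdCoordGen.AddCoords(mol)
        return mol.GetConformer().GetPositions().tolist()

    return generate
```

For example, count_unstable_runs(coordgen_generator(smi), n_runs=20) returns 0 when every run reproduces the first result within the tolerance.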


> On 02 Jun 2018, at 12:35, Greg Landrum wrote:
> 
> Hi Jason,
> 
> That's a great question. I can also confirm that it seems that setting the 
> parameter file location to a bogus value seems to have no effect.
> @Nic: can you help us out here? I figure you can probably answer the question 
> quicker than I can dig through the code. :-)
> 
> -greg
> 
> 
> On Thu, May 31, 2018 at 10:22 PM Jason Biggs wrote:
>> I recently switched to the 2018_03_1 release, and I am trying out the new 2D
>> coordinate generating functions.  The diagrams look good, but I can't seem to
>> figure out what the role of the template file is.
>>
>> I find that I can set the templateFileDir parameter either to a real
>> directory with the templates.mae file in it, or to an almost-empty string
>> " ", and it has no effect.  Is there an example SMILES where using the template
>> file changes the returned diagram?
>>
>> Another thing I notice is the conformer generated by CoordGen isn't always
>> reproducible.  I find that if I run the following code multiple times, I will
>> get different results:
>>
>>
>> from rdkit import Chem
>> from rdkit.Chem import rdCoordGen
>>
>> m = Chem.MolFromSmiles('CO[C@H]1[C@]2(O)C(=O)N3C=CC(C)(C)c4c(C=C3C(=O)N2[C@@]23[C@@]1(O)c1c1N3C([C@H]2C)(C)C)c1c1[nH]4')
>> Chem.rdCoordGen.AddCoords(m)
>> m.GetConformer(0).GetPositions()[0]
>>
>> will sometimes output
>>
>> array([-1.2795,  1.35720001,  0.])
>>
>> but other times outputs
>>
>> array([-1.28240005,  1.365 ,  0.])
>>
>> Obviously it's a small difference, but I would prefer to always return the
>> same values for the same input.
>>
>> Best,
>> Jason


Re: [Rdkit-discuss] A couple of questions about CoordGen in RDKit

2018-06-02 Thread Greg Landrum
Hi Jason,

On Sat, Jun 2, 2018 at 3:41 PM Jason Biggs wrote:

>
> This would be a question for both Greg and Nicola:  What would be good
> words to describe the layout methods used by coordgen vs rdkit?  I want to
> have an option for the user,  Molecule[ , DiagramLayout ->
> "MethodName"], but just using "RDKit" and "CoordGen" isn't right because
> those names don't describe the underlying algorithms, just the libraries
> that implement them.
>
> Would it be wrong to call the rdkit method "DistanceGeometry"?  What is
> the main distinction between the two methods?
>

The RDKit uses distance geometry for 3D conformations. For 2D coordinates
it's using an internal algorithm that hasn't really been described
anywhere.
I think it's fine to call it the "RDKit depiction algorithm" or "RDKit
layout algorithm" if you prefer.

-greg
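
In code, the distinction might look like the sketch below. It uses rdDepictor.Compute2DCoords for the RDKit 2D layout, rdCoordGen.AddCoords for CoordGen, and AllChem.EmbedMolecule for distance-geometry 3D embedding. The method labels "RDKitDepiction" and "CoordGen" and the function names add_2d_coords and add_3d_coords are hypothetical choices for illustration, not names used by either library.

```python
# Hypothetical user-facing labels for the two 2D layout methods.
LAYOUT_METHODS = {
    "RDKitDepiction": "RDKit's internal 2D depiction algorithm",
    "CoordGen": "2D layout from the CoordGen library",
}


def add_2d_coords(mol, method="RDKitDepiction"):
    """Add a 2D conformer to mol in place with the chosen layout method."""
    if method not in LAYOUT_METHODS:
        raise ValueError(
            "unknown layout method %r; expected one of %s"
            % (method, sorted(LAYOUT_METHODS))
        )
    # RDKit is imported lazily, after the method name has been validated.
    if method == "CoordGen":
        from rdkit.Chem import rdCoordGen
        rdCoordGen.AddCoords(mol)
    else:
        from rdkit.Chem import rdDepictor
        rdDepictor.Compute2DCoords(mol)
    return mol


def add_3d_coords(mol, seed=42):
    """Return a copy of mol with hydrogens and a 3D conformer.

    This is where distance geometry is used: EmbedMolecule builds 3D
    coordinates, which is separate from either 2D layout method.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol_h = Chem.AddHs(mol)
    # EmbedMolecule returns the new conformer id, or -1 on failure.
    if AllChem.EmbedMolecule(mol_h, randomSeed=seed) == -1:
        raise RuntimeError("3D embedding failed")
    return mol_h
```

Keeping "DistanceGeometry" out of the 2D option names avoids confusing the depiction code with the 3D embedding path.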