Dear all,
this may seem trivial, but it is also worth checking the sanity of the original 
melting point data. Crystallographers sometimes enter the melting point in 
degrees Celsius when degrees Fahrenheit are expected, so cross-checking against 
the crystallization temperature etc. can be quite useful.
Best wishes,
Maria
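
The cross-check described above could be sketched roughly as follows. This is 
only an illustration under stated assumptions: the record keys `id`, `mp_f` and 
`cryst_c` are hypothetical, the melting point field is assumed to expect degrees 
Fahrenheit, and the crystallization temperature is assumed to be recorded in 
degrees Celsius. The idea is that a crystal would not normally be grown above 
its own melting point, so a melting point that converts to a value at or below 
the crystallization temperature deserves a second look.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def flag_suspect_melting_points(records):
    """Return the ids of records whose melting point looks implausible.

    Each record is a dict with hypothetical keys:
      'id'      - any identifier
      'mp_f'    - melting point as stored, assumed to be in degrees Fahrenheit
      'cryst_c' - crystallization temperature in degrees Celsius
    """
    suspects = []
    for rec in records:
        mp_c = fahrenheit_to_celsius(rec["mp_f"])
        # A melting point at or below the crystallization temperature
        # suggests the value may have been entered in the wrong unit.
        if mp_c <= rec["cryst_c"]:
            suspects.append(rec["id"])
    return suspects
```

A flagged record is not necessarily wrong; it is only a candidate for manual 
inspection against the original source.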

    On Wednesday, 10 October 2018, 15:28:06 BST, Michal Krompiec 
<michal.kromp...@gmail.com> wrote:  
 
 Dear All,
Thank you all very much for your feedback! Actually, the number of 
collisions didn't decrease when I increased the bit length, though increasing 
the radius to 3 did help a bit. Overall, it is good to know that great results 
are not to be expected.
Best wishes,
Michal
On Wed, 10 Oct 2018 at 13:31, Chris Earnshaw <cgearns...@gmail.com> wrote:

Hi
It sounds to me like you're already getting better results than you could 
reasonably expect.
Prediction of melting point is a phenomenally difficult thing to do; you're 
trying to find the temperature at which a (generally undefined) solid 
crystalline phase is in equilibrium with a (probably even less defined) liquid 
phase. You also need to consider that the crystalline form of your solid phase 
is not necessarily truly constant - what polymorph is involved? Melting points 
of alternative polymorphs can be radically different and this is one of the 
real bugbears of pharmaceutical and agrochemical development. If you haven't 
found the most stable form early in the development process there can be very 
nasty surprises downstream.
Expecting to handle all these challenges with a descriptor as simple as a 
molecular fingerprint, regardless of bit length, collisions etc., is probably 
over-optimistic...
Regards,
Chris Earnshaw

On Wed, 10 Oct 2018 at 13:16, Michal Krompiec <michal.kromp...@gmail.com> wrote:

Hi Thomas,
Radius 2, 2048 bits, 5200 data points.
On Wed, 10 Oct 2018 at 13:13, Thomas Evangelidis <teva...@gmail.com> wrote:

What's your bitvector length and radius? How many training samples do you have?
On Wed, 10 Oct 2018 at 13:51, Michal Krompiec <michal.kromp...@gmail.com> wrote:

Hi all,
I have a slightly off-topic question. I'm trying to train a neural 
network on a dataset of small molecules and their melting points. I did get a 
not-so-bad accuracy with Morgan fingerprints, but I've realised that regardless 
of FP radius and bitvector length, several dozen molecules have the same 
fingerprints but wildly different melting points. I am pretty sure this is a 
"solved problem" so I don't want to reinvent the wheel. What is the 
recommended/usual way of dealing with this?
Thanks,
Michal
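
One possible way to at least locate and inspect such cases is to group the 
molecules by fingerprint and look at the spread of melting points within each 
group. The sketch below is an illustration only, not a method proposed in this 
thread. It assumes each fingerprint is available as a sequence of on-bit 
indices, and the function names and the `max_spread` threshold are 
hypothetical. Collapsing duplicates to the median is just one choice among 
several; it does not resolve the underlying ambiguity.

```python
from collections import defaultdict
from statistics import median


def group_by_fingerprint(fingerprints, melting_points):
    """Map each distinct fingerprint to the melting points observed for it.

    `fingerprints` is a sequence of iterables of on-bit indices (assumed
    representation); `melting_points` is the matching sequence of values.
    """
    groups = defaultdict(list)
    for fp, mp in zip(fingerprints, melting_points):
        groups[tuple(fp)].append(mp)
    return dict(groups)


def conflicting_groups(groups, max_spread=20.0):
    """Return groups with more than one molecule whose melting points
    differ by more than `max_spread` (hypothetical threshold, same unit
    as the melting points)."""
    return {
        fp: mps
        for fp, mps in groups.items()
        if len(mps) > 1 and max(mps) - min(mps) > max_spread
    }


def collapse_duplicates(groups):
    """Replace each group by a single target value, here the median."""
    return {fp: median(mps) for fp, mps in groups.items()}
```

Inspecting the output of `conflicting_groups` by hand may show whether the 
duplicates are different compounds that the fingerprint cannot tell apart or 
repeated measurements of the same compound.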

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



-- 
 
======================================================================

Dr Thomas Evangelidis

Research Scientist

IOCB - Institute of Organic Chemistry and Biochemistry of the Czech Academy of 
Sciences
Prague, Czech Republic  & CEITEC - Central European Institute of Technology
Brno, Czech Republic 

email: teva...@gmail.com

website: https://sites.google.com/site/thomasevangelidishomepage/




