Hi

It sounds to me like you're already getting better results than you could
reasonably expect.

Prediction of melting point is a phenomenally difficult thing to do; you're
trying to find the temperature at which a (generally undefined) solid
crystalline phase is in equilibrium with a (probably even less defined)
liquid phase. You also need to consider that the crystalline form of your
solid phase is not necessarily truly constant - what polymorph is involved?
Melting points of alternative polymorphs can be radically different and
this is one of the real bugbears of pharmaceutical and agrochemical
development. If you haven't found the most stable form early in the
development process there can be very nasty surprises downstream.

Expecting to handle all these challenges with a descriptor as simple as a
molecular fingerprint (regardless of bit length, collisions etc.) is
probably over-optimistic...
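
One rough way to see how much of the error is built into the descriptor
itself is to group the molecules by identical fingerprint and look at the
spread of melting points within each group. The sketch below is only an
illustration, not a recommended workflow: it assumes the fingerprints have
already been computed and serialised to strings (for instance with RDKit's
ToBitString()), and the function names and the data values are made up for
the example. Since a model that sees only the fingerprint must give every
molecule in a group the same prediction, the best it can do in the
least-squares sense is the group mean, which puts a floor under the RMSE.

```python
import math
from collections import defaultdict


def collision_groups(records):
    """Group melting points by fingerprint; keep groups with >1 molecule."""
    groups = defaultdict(list)
    for fingerprint, melting_point in records:
        groups[fingerprint].append(melting_point)
    return {fp: mps for fp, mps in groups.items() if len(mps) > 1}


def rmse_floor(records):
    """Lowest RMSE reachable by any model that sees only the fingerprint.

    Each molecule is compared with the mean melting point of the molecules
    sharing its fingerprint, so unique fingerprints contribute zero error.
    """
    groups = defaultdict(list)
    for fingerprint, melting_point in records:
        groups[fingerprint].append(melting_point)

    squared_error = 0.0
    for mps in groups.values():
        group_mean = sum(mps) / len(mps)
        squared_error += sum((mp - group_mean) ** 2 for mp in mps)
    return math.sqrt(squared_error / len(records))


# Hypothetical data: (fingerprint bit string, melting point in deg C).
records = [
    ("1010", 50.0),
    ("1010", 150.0),
    ("0110", 80.0),
    ("0011", 20.0),
    ("0011", 30.0),
]

for fp, mps in collision_groups(records).items():
    print(fp, "n =", len(mps), "range =", max(mps) - min(mps))
print("RMSE floor:", round(rmse_floor(records), 2))
```

If the floor is already close to the error you are seeing, changing the
network will not help much; the information simply isn't in the descriptor.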

Regards,
Chris Earnshaw

On Wed, 10 Oct 2018 at 13:16, Michal Krompiec <michal.kromp...@gmail.com>
wrote:

> Hi Thomas,
> Radius 2, 2048 bits, 5200 data points.
>
> On Wed, 10 Oct 2018 at 13:13, Thomas Evangelidis <teva...@gmail.com>
> wrote:
>
>> What's your bitvector length and radius? How many training samples do you
>> have?
>>
>> On Wed, 10 Oct 2018 at 13:51, Michal Krompiec <michal.kromp...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>> I have a slightly off-topic question. I'm trying to train a neural
>>> network on a dataset of small molecules and their melting points. I did get
>>> a not-so-bad accuracy with Morgan fingerprints, but I've realised that
>>> regardless of FP radius and bitvector length, several dozen molecules have
>>> the same fingerprints but wildly different melting points. I am pretty sure
>>> this is a "solved problem" so I don't want to reinvent the wheel. What is
>>> the recommended/usual way of dealing with this?
>>> Thanks,
>>> Michal
>>>
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>
>>
>> --
>>
>> ======================================================================
>>
>> Dr Thomas Evangelidis
>>
>> Research Scientist
>>
>> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
>> Academy of Sciences <https://www.uochb.cz/web/structure/31.html?lang=en>
>> Prague, Czech Republic
>>   &
>> CEITEC - Central European Institute of Technology
>> <https://www.ceitec.eu/>
>> Brno, Czech Republic
>>
>> email: teva...@gmail.com
>>
>> website: https://sites.google.com/site/thomasevangelidishomepage/
>>
>>