Hi Dr. Guillaume,

I played around with mapping a set of fragments onto molecules a couple of
months ago. The results of my experiments are here:
https://github.com/coleb/fragment_mapper

You give it a set of molecules and a set of fragments you would like to
have mapped. It tries to find the smallest set of fragments with a greedy
algorithm that tries the largest fragments first. It does fairly well at
finding the largest alkyl chain that satisfies part of the molecule, but
it is entirely dependent on which fragments are in the input set. I was
interested in using this to determine how well fragment collections cover
sets of molecules.
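
A minimal sketch of that largest-first greedy idea follows. It is not the
code from the repository. It assumes the substructure matches have already
been computed (for example with RDKit's GetSubstructMatches) and are passed
in as atom-index tuples. The name greedy_cover and the hand-written matches
for pentylamine (CCCCCN, atoms 0-4 carbon, atom 5 nitrogen) are hypothetical
and only there for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Match = Tuple[int, ...]


def greedy_cover(
    n_atoms: int, matches: Dict[str, Sequence[Match]]
) -> Tuple[List[Tuple[str, Match]], List[int]]:
    """Pick non-overlapping fragment matches, largest fragments first.

    Returns the chosen (fragment name, atom indices) pairs and the sorted
    list of atom indices that no chosen fragment covers.
    """
    # Flatten to (name, atoms) candidates; sort by size (descending), then
    # by name and atom indices so the result is deterministic.
    candidates = sorted(
        ((name, tuple(m)) for name, ms in matches.items() for m in ms),
        key=lambda c: (-len(c[1]), c[0], c[1]),
    )
    covered = set()
    chosen = []
    for name, atoms in candidates:
        # Accept a match only if none of its atoms is already claimed.
        if covered.isdisjoint(atoms):
            chosen.append((name, atoms))
            covered.update(atoms)
    missing = sorted(set(range(n_atoms)) - covered)
    return chosen, missing


# Hypothetical, hand-written matches for pentylamine (CCCCCN).
matches = {
    "Bu": [(0, 1, 2, 3)],
    "Pr": [(0, 1, 2)],
    "Et": [(0, 1)],
    "CH2": [(1,), (2,), (3,), (4,)],
    "NH2": [(5,)],
}
chosen, missing = greedy_cover(6, matches)
counts = Counter(name for name, _ in chosen)
print(dict(counts), missing)
```

For this input the sketch picks Bu, then CH2, then NH2 and leaves no atom
uncovered, which agrees with the pentylamine example in the quoted message.
A non-empty list of missing atoms is what a "not fully covered" report
would be built from. Being greedy, it is not guaranteed to find the
smallest possible set of fragments.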

The scripts output reports of which fragments are mapped (or, conversely,
what is missing). I am attaching example PDFs of those reports.

Let me know if you find it useful. The major drawbacks I've noticed in my
experimenting are that it gets tripped up by tautomer changes between the
fragment and the molecule (I've been playing with a way to work around that
by trying out what Roger presented at the UGM). Also, it doesn't check the
bond orders between the fragments, which matters for my use case but
doesn't look like it does for yours.

Cheers,
Brian

On Thu, Dec 8, 2016 at 2:43 AM, Guillaume GODIN <
guillaume.go...@firmenich.com> wrote:

> Dear all,
>
>
> I would like to know if you have an idea of how to determine the "real"
> fragment count in a molecule. I mean: find one fragment by priority,
> remove it from the molecule, and continue until the molecule is empty.
>
>
> The complex part is the proper enumeration of linear or
> branched alkane substituents:
>
>
> iso_Bu, iso_Pr, ter_Bu, 2_Bu, CH2, CH2CH2, CH2CH2CH2, CH2CH2CH2CH2, CH3,
> CH3, Et, Pr, Bu
>
> Here are a few examples:
>
> Pentylamine, CCCCCN => CH2:1 & Bu:1 & NH2:1
>
> Isopropyl Palmitate, CCCCCCCCCCCCCCCC(=O)OC(C)C => Bu:1 & iso_Pr:1 &
> CH2CH2CH2:1 & COO:1 & CH2CH2CH2CH2:2
>
> Di-2-Ethylhexyl Ether, CCCCC(CC)COCC(CC)CCCC => CH2:2 & CH:2 & Bu:2
> & Et:2 & O:1
>
>
> Any ideas?
>
> *Dr. Guillaume GODIN*
> Principal Scientist
> Chemoinformatic & Datamining
> Innovation
> CORPORATE R&D DIVISION
> DIRECT LINE +41 (0)22 780 3645
> MOBILE          +41 (0)79 536 1039
>         Firmenich SA
>         RUE DES JEUNES 1 | CASE POSTALE 239 | CH-1211 GENEVE 8
>
>
> ------------------------------------------------------------
> ------------------
> Developer Access Program for Intel Xeon Phi Processors
> Access to Intel Xeon Phi processor-based developer platforms.
> With one year of Intel Parallel Studio XE.
> Training and support from Colfax.
> Order your platform today.http://sdm.link/xeonphi
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>

Attachment: MappingNotFound.pdf
Description: Adobe PDF document

Attachment: NotFullyCovered.pdf
Description: Adobe PDF document

Attachment: Success.pdf
Description: Adobe PDF document
