Hi,

I'd use a text-based version of the structure, such as InChIKey or canonical
SMILES; it then becomes an easy task to do the comparison in Python.
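
As a rough sketch of that idea (this is not the Vortex script linked below, just
an illustration), each structure can be reduced to a canonical string and
duplicates dropped with a plain set lookup. The function names here are my own,
and the default key assumes RDKit is installed; an InChIKey-based key could be
swapped in, e.g. via Chem.MolToInchiKey if your RDKit build has InChI support.

```python
def canonical_smiles_key(smiles):
    """Return RDKit canonical isomeric SMILES, or None if parsing fails."""
    # Imported lazily so the generic dedup logic below works without RDKit.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Chem.MolToSmiles(mol, isomericSmiles=True)


def deduplicate(smiles_iter, key_func=canonical_smiles_key):
    """Keep the first input seen for each distinct key.

    Returns a list of (key, original_input) tuples in input order.
    Inputs whose key is None (e.g. unparsable SMILES) are skipped.
    """
    seen = set()
    unique = []
    for smi in smiles_iter:
        key = key_func(smi)
        if key is None or key in seen:
            continue
        seen.add(key)
        unique.append((key, smi))
    return unique
```

With RDKit available, deduplicate(["OCC", "CCO", "C(C)O"]) should collapse to a
single entry, since all three strings describe ethanol. In a database, the same
check can be left to a unique constraint on the key column rather than done in
Python. Note that tautomers or different salt forms will still get different
keys unless the structures are standardised first.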

I wrote a script to do this in Vortex but it should be easy to modify.
https://www.macinchem.org/reviews/vortex/tut28/scripting_vortex28.php


Cheers

Chris
> 
> 
> Today's Topics:
> 
>   1. Non-redundant database of molecules (Wandré)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 13 Sep 2017 07:13:56 -0300
> From: Wandré <wandrevel...@gmail.com>
> To: rdkit-discuss@lists.sourceforge.net
> Subject: [Rdkit-discuss] Non-redundant database of molecules
> Message-ID:
>       <caemzefdrr5vsh1ohmm1vwd7g8xkdmtoukfsfdqnx4zyobla...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Hi,
> 
> My name is Wandré and I'm from Brazil.
> I'm trying to build a large database of molecules, but I want to eliminate all
> the redundant molecules before inserting them into the database.
> I want to know what the best method is to identify a molecule in RDKit.
> Is it SMILES ("Chem.MolToSmiles(mol,isomericSmiles=True)"), or will I need to
> compare all molecules, one by one, before inserting them into the database
> (using Tanimoto)?
> This could be hard to do because my database will have many millions of
> molecules, so is comparing one by one before each insert the only answer?
> Checking whether a SMILES has already been inserted is easy (text comparison),
> but comparing molecular fingerprints...
> 
> If I really need to compare molecular fingerprints, how can I store this
> data in PostgreSQL without using the cartridge? I would generate the
> fingerprint (AtomPair, for example), store it in the database, and compare
> all the fingerprints, one by one, before inserting a new molecule. This
> fingerprint (AtomPair) has a lot of features, so storing it in a relational
> database is expensive.
> Is it possible?
> 
> Thanks!
> 
> --
> Wandré Nunes de Pinho Veloso
> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG
> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
> Inteligência Computacional - UNIFEI
> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
> 
> ------------------------------
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> 
> ------------------------------
> 
> Subject: Digest Footer
> 
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> 
> 
> ------------------------------
> 
> End of Rdkit-discuss Digest, Vol 119, Issue 20
> **********************************************
