Thanks, Malitha.
I chose these descriptors because I will store them in my database, so it
will be fast to compare a molecule against the stored ones before inserting it.
My worry now is whether RDKit will generate different SMILES or InChI for the
same molecule in different SDF files, or identical ones for different molecules
(molecules from RCSB PDB, PubChem, and ChEMBL, for example).
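
A minimal sketch of this check, assuming RDKit is installed with InChI
support. The helper name identity_keys and the three test molecules are
illustrative and not from the thread; molecules read from SDF files (for
example with Chem.SDMolSupplier) could be passed through the same function.
It compares benzene written in Kekulé and aromatic form, with toluene as a
contrasting molecule.

```python
from rdkit import Chem


def identity_keys(mol):
    """Return (canonical isomeric SMILES, InChI) for an RDKit molecule.

    identity_keys is a hypothetical helper, not part of RDKit.
    """
    smiles = Chem.MolToSmiles(mol, isomericSmiles=True)
    inchi = Chem.MolToInchi(mol)
    return smiles, inchi


# The same molecule written two ways (Kekule vs. aromatic benzene),
# plus a different molecule (toluene) for contrast.
benzene_kekule = Chem.MolFromSmiles("C1=CC=CC=C1")
benzene_aromatic = Chem.MolFromSmiles("c1ccccc1")
toluene = Chem.MolFromSmiles("Cc1ccccc1")

# Both benzene inputs should map to one (SMILES, InChI) pair.
same = identity_keys(benzene_kekule) == identity_keys(benzene_aromatic)
# Toluene should map to a different pair.
different = identity_keys(benzene_kekule) != identity_keys(toluene)
print(same, different)
```

Canonical SMILES are only comparable when generated by the same toolkit
(ideally the same RDKit version), and records that differ in charge state,
tautomer, or stereo annotation between databases may still yield different
strings for what you consider one compound.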

--
Wandré Nunes de Pinho Veloso
Assistant Professor - Unifei - Campus Avançado de Itabira-MG
PhD candidate in Bioinformatics - Universidade Federal de Minas Gerais - UFMG
Researcher at INSILICO - Grupo Interdisciplinar em Simulação e
Inteligência Computacional - UNIFEI
Member of the Assinaturas Biológicas research group at FIOCRUZ
Member of the Bioinformática Estrutural research group at UFMG
Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG

2017-09-13 16:22 GMT-03:00 Malitha Kabir <malitha12...@gmail.com>:

> Hi Wandré,
>
> It seems you have already done intensive research on this. Kindly accept my
> comments as an addition to your idea (not the answer you are trying to
> find). In my view, categorizing molecules by their descriptors should reduce
> computation time. RDKit currently offers calculation of about 200
> descriptors! So a careful look at those makes a lot of sense to me.
> Conceptually, descriptor matching should follow a sequence (I don't know
> what sequence would be ideal): for example, MolWt should match first (H
> contribution and ions should be taken into consideration here), followed by
> subsequent matching of other descriptors (this might be different when
> writing programs). There are a few reading materials on molecular
> fingerprints and database schemas. You may want to have a look at those.
>
> The links are from Daylight. I am neither involved with the company nor
> their product.
> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html
> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html
>
> Best regards,
> - malitha
>
>
> On Thu, Sep 14, 2017 at 12:43 AM, Wandré <wandrevel...@gmail.com> wrote:
>
>> Thanks for all the answers.
>>
>> Reading all the answers, I thought of something different... If SMILES
>> (Chem.MolToSmiles(mol,isomericSmiles=True)) and InChI
>> (Chem.MolToInchi(mol)) can generate the same value for different molecules,
>> I will generate other descriptors (NumHDonors, NumHAcceptors,
>> RingCount, GetNumAtoms, TPSA, pyLabuteASA, MolWt, CalcNumRotatableBonds
>> and MolLogP) to compare all the molecules whose SMILES and InChI are the
>> same.
>> If all these data are the same, I will generate a fingerprint (AtomPair,
>> for example) and use the Tanimoto coefficient; if this value is 1 when I
>> compare two molecules, the molecules are the same.
>>
>> Where is my mistake (I think there are one or more mistakes)?
>>
>> Thanks!
>>
>>
>> 2017-09-13 14:19 GMT-03:00 Dimitri Maziuk <dmaz...@bmrb.wisc.edu>:
>>
>>> On 09/13/2017 11:46 AM, Markus Sitzmann wrote:
>>> > The case that you have 3D information available for a molecule dataset
>>> > is rare; if you want it trustworthy, it gets even worse than that. And
>>> > what is the point then of generating the configuration of a molecule
>>> > first if you cannot trust that either?
>>>
>>> Veering further off topic, do you even care in the first place? E.g. if
>>> your molecule always exists as a mixture of isomers, except in some
>>> megabuck-per-microgram painstakingly created reference samples, a
>>> 3D-based system will represent it as two distinct molecules. Whereas you
>>> want it represented as one.
>>>
>>> Last I looked PDB Ligand Expo had two different benzenes. Their software
>>> doesn't (didn't?) do the circle version so they don't have the third one.
>>>
>>> --
>>> Dimitri Maziuk
>>> Programmer/sysadmin
>>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>>
>>>
>>> ------------------------------------------------------------
>>> ------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>>
>>
>