Hi Wandré,

1) apt-get installs an RDKit build from 2013 (link below), so please
install it through conda instead, as Markus suggested.
https://packages.ubuntu.com/trusty/python/python-rdkit
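
After reinstalling, the flag Wandré mentioned can be checked again. The lines
below are only a minimal sketch: the helper name inchi_status is made up for
illustration, and the only RDKit name used is the INCHI_AVAILABLE flag from
rdkit.Chem.inchi that appears earlier in this thread.

```python
import importlib


def inchi_status():
    """Return (rdkit_installed, inchi_available) for the current Python environment."""
    try:
        # rdkit.Chem.inchi is the module that exposes the INCHI_AVAILABLE flag.
        inchi = importlib.import_module("rdkit.Chem.inchi")
    except ImportError:
        # RDKit itself cannot be imported here.
        return (False, False)
    return (True, bool(inchi.INCHI_AVAILABLE))


installed, has_inchi = inchi_status()
if not installed:
    print("RDKit is not importable in this environment")
elif has_inchi:
    print("RDKit was built with InChI support")
else:
    print("RDKit is installed, but without InChI support")
```

If the last message is printed, the interpreter is most likely still picking
up the apt-get build rather than the conda one.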

2) I am not familiar with cases of wrong SMILES generation, but the link
below goes into more detail that I think you need to know.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3495655/

3) As you are trying to store data, it would be good to consider (in my
opinion) whether you are storing energy-minimized molecules or not. Surface
area related descriptors will yield different results in the two cases,
while bond connectivity related descriptors will yield the same result.

4) Sharing my personal experience: during my undergraduate studies, part of
my final-year project got bogged down in conceptual questions. For lack of
time, I failed to make use of the benefits of more advanced development.
The experience afterwards was not so good.

Please keep in mind that we can generate a non-redundant database with a few
molecules, but for millions of molecules it should be quite a tough task.
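
To make the bookkeeping part of that concrete, here is a rough sketch of a
duplicate check before insertion. Everything in it is an assumption for
illustration: the helper names inchikey_of and insert_unique are made up, a
plain dict stands in for the real database, and using the InChIKey as the
key is just one option. Whether any single key really identifies a molecule
is exactly the open question in this thread.

```python
def inchikey_of(molblock):
    """Hypothetical key function: InChIKey of an SDF/mol block via RDKit.

    RDKit is imported lazily, so the rest of this sketch works without it.
    Needs an RDKit build with InChI support.
    """
    from rdkit import Chem

    mol = Chem.MolFromMolBlock(molblock)
    if mol is None:  # RDKit could not parse the record
        return None
    return Chem.InchiToInchiKey(Chem.MolToInchi(mol))


def insert_unique(index, key, record):
    """Store record under key unless key is missing or already present.

    Returns True when the record was stored, False otherwise.
    """
    if key is None or key in index:
        return False
    index[key] = record
    return True


# Toy run with made-up keys standing in for real identifiers.
index = {}
for key, source in [("KEY-A", "PubChem"), ("KEY-B", "ChEMBL"), ("KEY-A", "RCSB PDB")]:
    stored = insert_unique(index, key, source)
    print(key, source, "stored" if stored else "skipped (duplicate)")
```

In a real database the dict would be replaced by a unique index on the key
column, so the lookup stays cheap as the table grows. The hard part remains
choosing a key that can be trusted across sources.
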
Have a great day!

- malitha




On Thu, Sep 14, 2017 at 2:05 AM, Markus Sitzmann <markus.sitzm...@gmail.com>
wrote:

> PS. The conda version has InChI support
>
> On Wed, Sep 13, 2017 at 10:04 PM, Markus Sitzmann <
> markus.sitzm...@gmail.com> wrote:
>
>> Strong recommendation: use the conda version:
>>
>> http://www.rdkit.org/docs/Install.html
>>
>> On Wed, Sep 13, 2017 at 9:58 PM, Wandré <wandrevel...@gmail.com> wrote:
>>
>>> I just ran sudo apt-get install python-rdkit librdkit1 rdkit-data 😁
>>> I'm trying to solve this with this link:
>>> http://www.blopig.com/blog/2013/02/how-to-install-rdkit-on-ubuntu-12-04/
>>>
>>> --
>>> Wandré Nunes de Pinho Veloso
>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais -
>>> UFMG
>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>> Inteligência Computacional - UNIFEI
>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>
>>> 2017-09-13 16:55 GMT-03:00 Markus Sitzmann <markus.sitzm...@gmail.com>:
>>>
>>>> How did you install rdkit so far? And where? Is it the conda/anaconda
>>>> version?
>>>>
>>>> On Wed, Sep 13, 2017 at 9:39 PM, Wandré <wandrevel...@gmail.com> wrote:
>>>>
>>>>> How to install RDKit with InChI?
>>>>> When I run Chem.inchi.INCHI_AVAILABLE, the result is False
>>>>>
>>>>> --
>>>>> Wandré Nunes de Pinho Veloso
>>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais -
>>>>> UFMG
>>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>>>> Inteligência Computacional - UNIFEI
>>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>>>
>>>>> 2017-09-13 16:30 GMT-03:00 Wandré <wandrevel...@gmail.com>:
>>>>>
>>>>>> Thanks Malitha.
>>>>>> I chose these descriptors because I will store them in my database,
>>>>>> so it will be fast to compare a molecule before inserting it into the
>>>>>> database. My worry now is whether RDKit will generate different SMILES
>>>>>> or InChI for the same SDF molecule, or equal ones for different
>>>>>> molecules (molecules from RCSB PDB, PubChem, ChEMBL, for example).
>>>>>>
>>>>>> --
>>>>>> Wandré Nunes de Pinho Veloso
>>>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>>>>> Doutorando em Bioinformática - Universidade Federal de Minas Gerais -
>>>>>> UFMG
>>>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>>>>> Inteligência Computacional - UNIFEI
>>>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>>>>
>>>>>> 2017-09-13 16:22 GMT-03:00 Malitha Kabir <malitha12...@gmail.com>:
>>>>>>
>>>>>>> Hi Wandré,
>>>>>>>
>>>>>>> It seems you have already done intensive research on this. Kindly accept
>>>>>>> my comments as an addition to your idea (not the answer you are trying to
>>>>>>> find). In my view, categorizing molecules by their descriptors should
>>>>>>> reduce computation time. RDKit currently offers calculation of about 200
>>>>>>> descriptors! So a careful look at those makes a lot of sense to me.
>>>>>>> Conceptually, descriptor matching should follow a sequence (I don't know
>>>>>>> what sequence would be ideal): for example, MolWt should match first (H
>>>>>>> contribution and ions should be taken into consideration here), followed
>>>>>>> by matching of the other descriptors (this might differ when writing
>>>>>>> programs). There is some reading material on molecular fingerprints and
>>>>>>> database schemas. You may want to have a look at it.
>>>>>>>
>>>>>>> The links are from Daylight. I am neither involved with the company
>>>>>>> nor their product.
>>>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html
>>>>>>> http://www.daylight.com/dayhtml/doc/theory/theory.thor.html
>>>>>>>
>>>>>>> Best regards,
>>>>>>> - malitha
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 14, 2017 at 12:43 AM, Wandré <wandrevel...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for all the answers.
>>>>>>>>
>>>>>>>> Reading all the answers, I thought of something different... If the
>>>>>>>> SMILES (Chem.MolToSmiles(mol,isomericSmiles=True)) and InChI
>>>>>>>> (Chem.MolToInchi(mol)) can come out the same for different molecules,
>>>>>>>> I will generate other descriptors (NumHDonors, NumHAcceptors,
>>>>>>>> RingCount, GetNumAtoms, TPSA, pyLabuteASA, MolWt, CalcNumRotatableBonds
>>>>>>>> and MolLogP) to compare all the molecules whose SMILES and InChI are
>>>>>>>> the same.
>>>>>>>> If all these data are the same, I will generate the fingerprint
>>>>>>>> (AtomPair, for example) and use the Tanimoto coefficient; if this value
>>>>>>>> is 1 when I compare two molecules, the molecules are the same.
>>>>>>>>
>>>>>>>> Where is my mistake (I think there are one or more mistakes)?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> --
>>>>>>>> Wandré Nunes de Pinho Veloso
>>>>>>>> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
>>>>>>>> Doutorando em Bioinformática - Universidade Federal de Minas
>>>>>>>> Gerais - UFMG
>>>>>>>> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
>>>>>>>> Inteligência Computacional - UNIFEI
>>>>>>>> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
>>>>>>>> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
>>>>>>>> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>>>>>>>>
>>>>>>>> 2017-09-13 14:19 GMT-03:00 Dimitri Maziuk <dmaz...@bmrb.wisc.edu>:
>>>>>>>>
>>>>>>>>> On 09/13/2017 11:46 AM, Markus Sitzmann wrote:
>>>>>>>>> > The case that you have 3D information available for a molecule
>>>>>>>>> > dataset is rare, if you want it trustworthy it gets even worse than
>>>>>>>>> > that. And what is the point then to generate the configuration of a
>>>>>>>>> > molecule first if you can not trust that either?
>>>>>>>>>
>>>>>>>>> Veering further off topic, do you even care in the first place? E.g. if
>>>>>>>>> your molecule always exists as a mixture of isomers, except in some
>>>>>>>>> megabuck-per-microgram painstakingly created reference samples, a
>>>>>>>>> 3D-based system will represent it as two distinct molecules. Whereas you
>>>>>>>>> want it represented as one.
>>>>>>>>>
>>>>>>>>> Last I looked PDB Ligand Expo had two different benzenes. Their software
>>>>>>>>> doesn't (didn't?) do the circle version so they don't have the third one.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Dimitri Maziuk
>>>>>>>>> Programmer/sysadmin
>>>>>>>>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------
>>>>>>>>> ------------------
>>>>>>>>> Check out the vibrant tech community on one of the world's most
>>>>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>>>>>> _______________________________________________
>>>>>>>>> Rdkit-discuss mailing list
>>>>>>>>> Rdkit-discuss@lists.sourceforge.net
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
