Hello all,
First and foremost, please excuse any inaccuracies, as I am new to the world of
cheminformatics. I'll start with some background on my issue. I've got a MySQL
database with chemical information such as CASRNs, Annotation Class, and SMILES
strings. I have a web app currently in development that takes a list of CASRNs
as input and performs an enrichment analysis on those chemicals. The goal now is
to take a SMILES string as input and query the rdkit postgresql database (which
will consist of a table whose rows hold CASRNs and SMILES strings) to find
any chemicals whose similarity to the input SMILES string exceeds a set
threshold. These chemicals' CASRNs will then become the input for enrichment,
and the rest of the app will be agnostic toward how the CASRNs were obtained.
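A minimal sketch of the similarity lookup described above, assuming the cartridge is installed and working. The table name "chemicals" and the column names "casrn" and "smiles" are hypothetical; mol_from_smiles, morganbv_fp and tanimoto_sml are functions provided by the RDKit cartridge. The function only builds a parameterized query string and does not talk to a database itself.

```python
import re

# Conservative pattern for table/column names, since identifiers cannot be
# passed as query parameters and are interpolated into the SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_similarity_query(table="chemicals", casrn_col="casrn", smiles_col="smiles"):
    """Build a query returning CASRNs whose Tanimoto similarity to an input
    SMILES string is at or above a threshold.

    The table and column names are hypothetical defaults. The query expects
    two named parameters: "query" (the input SMILES) and "threshold".
    """
    for name in (table, casrn_col, smiles_col):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    return (
        "SELECT casrn, similarity FROM ("
        f" SELECT {casrn_col} AS casrn,"
        f" tanimoto_sml(morganbv_fp(mol_from_smiles({smiles_col}::cstring)),"
        " morganbv_fp(mol_from_smiles(%(query)s::cstring))) AS similarity"
        f" FROM {table}"
        ") AS scored"
        " WHERE similarity >= %(threshold)s"
        " ORDER BY similarity DESC"
    )
```

With a driver that uses this named-placeholder style (psycopg2 is one, though that choice is an assumption here), the call would look like cursor.execute(build_similarity_query(), {"query": "CCO", "threshold": 0.6}). Computing fingerprints on the fly like this is simple but likely slow on a large table; storing precomputed fingerprints in an indexed column is the usual alternative.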
I suppose I should say this: if the description above isn't something rdkit is
suited for and I'm totally off base, please let me know! As I stated earlier,
I'm new to this realm.
Anyway, the issues I'm facing currently involve the installation of rdkit and
the postgresql cartridge. I should say, I've tried this on both Windows 7 and
Windows 10 (my work desktop runs 7 and my laptop runs 10). My first question:
when I run the command
conda install -c https://conda.binstar.org/rdkit rdkit-postgresql
I'm faced with a "PackageNotFoundError". Conda then suggests maybe I meant
"rdkit-postgresql: postgresql". So, I run
conda install -c https://conda.binstar.org/rdkit postgresql
which seems to install just fine. My question is simply: is this OK? Do I
specifically need the "rdkit-postgresql" package? I searched anaconda.org for
the rdkit-postgresql package and found it. It said it could be installed with
conda install -c rdkit rdkit-postgresql=2016.03.4
but this resulted in a similar error. I was wondering if it was possibly a
platform difference, because anaconda.org shows the package as "linux-64".
I'd also like to mention a few other issues I was (seemingly) able to work
around, as they may provide context for my next question. The path to the
rdkit bin folder for me is
"C:\Users\Larson\Anaconda3\envs\my-rdkit-env\Library\bin" while the
documentation specifies it should follow the "[conda
folder]/envs/my-rdkit-env/bin" convention. Perhaps it's a change in the
directory structure that hasn't been updated in the documentation? Next, to
actually start the postgresql server the command I had to use was
C:\Users\Larson\Anaconda3\envs\my-rdkit-env\Library\bin\pg_ctl -D path/to/db/data -l logfile start
as opposed to
[conda folder]/envs/my-rdkit-env/bin/postgres -D /folder/where/data/should/be/stored
as outlined in the documentation.
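The path difference above matches how conda lays out environments: on Windows, native executables generally go under Library\bin, while on Linux and macOS they go under bin. A small sketch that checks both locations, assuming only that layout (the helper name find_pg_tool is made up for illustration):

```python
from pathlib import Path


def find_pg_tool(env_prefix, tool="pg_ctl"):
    """Return the path to a PostgreSQL tool inside a conda environment,
    or None if it is not found in either expected location."""
    prefix = Path(env_prefix)
    candidates = [
        prefix / "Library" / "bin" / f"{tool}.exe",  # conda layout on Windows
        prefix / "bin" / tool,  # conda layout on Linux/macOS
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

For example, find_pg_tool(r"C:\Users\Larson\Anaconda3\envs\my-rdkit-env") would be expected to return the Library\bin\pg_ctl.exe path on the setup described here.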
And I believe that brings us to my current issue/question. In the
documentation, it specifies to create a database you should do the following:
createdb my_rdkit_db
psql my_rdkit_db
# create extension rdkit;
The first two lines I can get through just fine, but the "create extension"
command gives me a file not found error, with the file in question being
"rdkit.control". I searched the rdkit github page and found one such file and
copied it to the appropriate location
("C:\Users\Larson\Anaconda3\envs\my-rdkit-env\Library\share\extension", I
believe). After doing so I got a new error that I can't reproduce at the
moment, but it was another file not found error, and I'm fairly sure the file
it was looking for was "rdkit--3.5.sql". I could not find any such file on
github, which is what prompted me to create this post.
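For context on the two errors: when "create extension" runs, PostgreSQL reads [name].control from its share/extension directory and then looks for a script named [name]--[version].sql in the same place, where the version comes from the default_version line of the control file. A minimal sketch that reports which of those two files is missing (the helper name check_extension_files is made up for illustration):

```python
import re
from pathlib import Path


def check_extension_files(extension_dir, name="rdkit"):
    """Return the names of the extension files missing from extension_dir.

    Checks the control file first, then the install script for the
    default_version declared inside it.
    """
    ext_dir = Path(extension_dir)
    control = ext_dir / f"{name}.control"
    if not control.is_file():
        return [control.name]
    match = re.search(
        r"^\s*default_version\s*=\s*'([^']+)'", control.read_text(), re.MULTILINE
    )
    if match is None:
        return []  # no default_version declared, so no script name to check
    script = ext_dir / f"{name}--{match.group(1)}.sql"
    return [] if script.is_file() else [script.name]
```

Even with both files in place, an extension like this one typically also needs its compiled library in PostgreSQL's lib directory, so copying the control file from the source repository is unlikely to be sufficient on its own.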
Please let me know if I'm doing anything incorrectly or anything else that
might help. I will also do my best to provide any clarification if necessary.
Thanks very much for your time,
Larson Danes
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss