Re: [Rdkit-discuss] want advice for good teaching data set

2018-08-29 Thread TJ O'Donnell
Hi Andrew
ChEMBL 24 stores compound properties in the table compound_properties.  I
think the alogp column is computed using (Crippen) atom types and the
acd_logp column uses ACD Labs methods.
TJ
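
A minimal sketch of reading those two columns with Python's built-in sqlite3 module. The column names (alogp and acd_logp in compound_properties, canonical_smiles in compound_structures, joined on molregno) reflect my understanding of the ChEMBL 24 schema and should be checked against the actual release. The in-memory database and its two rows are invented stand-ins so that the snippet runs without the real file.

```python
import sqlite3

# Stand-in for sqlite3.connect("chembl_24.db"); that file name is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE compound_structures (
        molregno INTEGER PRIMARY KEY, canonical_smiles TEXT);
    CREATE TABLE compound_properties (
        molregno INTEGER PRIMARY KEY, alogp REAL, acd_logp REAL);
    -- Illustrative rows only; the values are invented, not ChEMBL data.
    INSERT INTO compound_structures VALUES (1, 'CCO'), (2, 'c1ccccc1');
    INSERT INTO compound_properties VALUES (1, -0.1, -0.2), (2, 1.7, NULL);
""")


def fetch_logp(conn, column="alogp"):
    """Return (smiles, value) pairs for rows where the chosen logP column is set."""
    # The column name is interpolated into the SQL, so restrict it to known names.
    if column not in ("alogp", "acd_logp"):
        raise ValueError(f"unexpected column: {column}")
    query = (
        f"SELECT s.canonical_smiles, p.{column} "
        "FROM compound_properties AS p "
        "JOIN compound_structures AS s ON s.molregno = p.molregno "
        f"WHERE p.{column} IS NOT NULL "
        "ORDER BY s.molregno"
    )
    return conn.execute(query).fetchall()


rows = fetch_logp(conn, "alogp")
```

The (smiles, value) pairs can then be turned into RDKit molecules and descriptors for a scikit-learn model.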

On Wed, Aug 29, 2018 at 5:52 AM Andrew Dalke 
wrote:

> Hi all,
>
>   I am starting to put together materials for the Python/RDKit training
> course I'm giving just before the RDKit UGM next month.
>
> I would like to structure part of it around the SQLite release of the
> ChEMBL data set. More specifically, I plan to include examples of machine
> learning with scikit-learn, using RDKit descriptors and values from ChEMBL
> 24 (and making sure to use the new schema).
>
> Two problems. First, I'm not a computational chemist and I don't know what
> would constitute a good example to use. "Good" in this case means one whose
> outlines are well-known to likely students. Second, I don't have much
> experience with the ChEMBL data.
>
> My thought is to make a logP model. The easiest would be to base it on
> atom types. For this option, can anyone suggest where I can find logP data
> from ChEMBL?
>
> Another possibility is to use a pre-existing model, like the notebook
> George Papadatos did for Ligand-based Target Prediction at
> http://nbviewer.jupyter.org/gist/madgpap/10457778 .
>
> Perhaps someone here could point me to other existing resources along
> similar lines?
>
> Best regards,
>
> Andrew
> da...@dalkescientific.com
>
>
>
>
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


Re: [Rdkit-discuss] SMARTS for an amide in an aromatic ring

2017-09-18 Thread TJ O'Donnell
Try either of these:

[N,n](C)-,:[C,c](=[O])

C[N,n]-,:[C,c](=[O])

TJ O'Donnell
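
A short sketch for checking the first suggestion against the molecule from the question, assuming RDKit is installed. The extra (C) requires an aliphatic carbon neighbour on the nitrogen, which only the N-methyl nitrogen has. The dictionary keys are illustrative names, not part of any API.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CN1C=CC(N)=NC1=O")

patterns = {
    # Pattern from the question, reported there to hit both N-C(=O) groups.
    "original": "[N,n]-,:[C,c](=[O])",
    # Suggested pattern: the nitrogen must also carry an aliphatic carbon.
    "with_methyl": "[N,n](C)-,:[C,c](=[O])",
}

matches = {
    name: mol.GetSubstructMatches(Chem.MolFromSmarts(smarts))
    for name, smarts in patterns.items()
}

for name, hits in matches.items():
    # Each hit is a tuple of atom indices, ordered like the atoms in the SMARTS.
    print(name, len(hits), hits)
```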

On Mon, Sep 18, 2017 at 4:26 PM, James T. Metz via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> Hello,
>
> Given the following aromatic structure
>
> m = Chem.MolFromSmiles("CN1C=CC(N)=NC1=O")
>
> I would like to construct a SMARTS pattern to
> recognize the aromatic amide (nitrogen attached to
> the exocyclic methyl group) and not recognize the other
> NCO group of atoms.
>
>
> I have tried
>
> pattern = Chem.MolFromSmarts('[N,n]-,:[C,c](=[O])')
>
> but, this matches *both* NCO groups of atoms which
> I do not want.
>
>
> The completely "aliphatic version"
>
> pattern = Chem.MolFromSmarts('[N]-[C](=[O])')
>
> does not match either NCO group of atoms.
>
> I am stumped.  I have also tried several recursive
> SMARTS expressions, but I can't get the syntax
> right.
>
> I would appreciate any suggestions.  Thank you.
>
>
> Regards,
> Jim Metz


Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread TJ O'Donnell
Let the database do the work for you.  Create a canonical SMILES column
and/or InChI column and declare them to be unique.  As you insert new
rows, postgres will let you know if there is already a row with the same
SMILES or InChI.
Here's some help on how to handle that.
https://www.postgresql.org/docs/9.5/static/sql-insert.html#SQL-ON-CONFLICT

TJ O'Donnell
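
A small sketch of the unique-column idea. To stay self-contained it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so INSERT OR IGNORE here plays the role that INSERT ... ON CONFLICT DO NOTHING plays in the PostgreSQL page linked above. The table name, the helper name and the sample strings are assumptions; the strings are taken to be canonical SMILES already, for example from Chem.MolToSmiles(mol).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint makes the database itself reject repeated structures.
conn.execute(
    "CREATE TABLE molecule (id INTEGER PRIMARY KEY, smiles TEXT NOT NULL UNIQUE)"
)


def insert_unique(conn, canonical_smiles):
    """Insert one canonical SMILES; return True if it was new, False if a duplicate."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO molecule (smiles) VALUES (?)", (canonical_smiles,)
    )
    # rowcount is 0 when the row was skipped because of the UNIQUE constraint.
    return cur.rowcount == 1


incoming = ["CCO", "c1ccccc1", "CCO"]  # the third entry repeats the first
results = [insert_unique(conn, smi) for smi in incoming]
count = conn.execute("SELECT COUNT(*) FROM molecule").fetchone()[0]
```

With this approach no pairwise fingerprint comparison is needed; the unique index does the lookup.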

On Wed, Sep 13, 2017 at 3:13 AM, Wandré <wandrevel...@gmail.com> wrote:

> Hi,
>
> My name is Wandré and I'm from Brazil.
> I'm trying to build a large database of molecules, and I want to eliminate
> all redundant molecules before inserting them into the database.
> What is the best way to identify a molecule in RDKit? Is a SMILES
> ("Chem.MolToSmiles(mol,isomericSmiles=True)") enough, or will I need to
> compare all molecules, one by one, before inserting them into the database
> (using Tanimoto)?
> That could be hard because my database will hold many millions of
> molecules. Is comparing one by one before each insert the only answer?
> Checking whether a SMILES has already been inserted is easy (a text
> comparison), but comparing molecular fingerprints...
>
> If I really need to compare fingerprints, how can I store this data in
> PostgreSQL without using the cartridge? I would generate the fingerprint
> (Atompair, for example), store it in the database, and compare against all
> the stored fingerprints, one by one, before inserting a new molecule. This
> fingerprint (Atompair) has a lot of features, so storing it in a relational
> database is expensive.
> Is it possible?
>
> Thanks!
>
> --
> Wandré Nunes de Pinho Veloso
> Professor Assistente - Unifei - Campus Avançado de Itabira-MG
> Doutorando em Bioinformática - Universidade Federal de Minas Gerais - UFMG
> Pesquisador do INSILICO - Grupo Interdisciplinar em Simulação e
> Inteligência Computacional - UNIFEI
> Membro do Grupo de Pesquisa Assinaturas Biológicas da FIOCRUZ
> Membro do Grupo de Pesquisa Bioinformática Estrutural da UFMG
> Laboratório de Bioinformática e Sistemas - LBS, DCC, UFMG
>


Re: [Rdkit-discuss] Fwd: Need SMARTS to distinguish 6-ring vs macrocyclic ether oxygens

2017-09-06 Thread TJ O'Donnell
I verified that r6 does the trick.  Using my rdchord cartridge, I get

tjo=> select
rd.list_matches(rd.rdmol('OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O'),
'[O;H0;D2;r6]',0,1);
  list_matches

 {{4},{11},{18},{25},{32},{39}}
(1 row)

tjo=> select
rd.list_matches(rd.rdmol('OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O'),
'[O;H0;D2;!r6]',0,1);
  list_matches

 {{6},{13},{20},{27},{34},{41}}
(1 row)

Here's an image showing the atom numbers corresponding to the list_matches
output.

TJ

[image: Inline image 2]

On Wed, Sep 6, 2017 at 6:04 PM, TJ O'Donnell <t...@acm.org> wrote:

> Try using [O;H0;D2;r6] lower-case r.  Sorry I'm not at a computer to
> check this.
> R6 means in 6 rings.
> r6 means in ring of size 6.
>
> http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html
>
> TJ O'Donnell
>
> On Wed, Sep 6, 2017 at 4:34 PM, James T. Metz via Rdkit-discuss <
> rdkit-discuss@lists.sourceforge.net> wrote:
>
>> Hello,
>>
>> Given the following SMILES for a macrocyclic hexaose
>>
>> OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O
>>
>> can anyone suggest a SMARTS pattern that will distinguish ether oxygens
>> in the smaller 6-membered rings versus the ethers in the larger
>> macrocyclic structure?
>>
>> For example, using RDkit, I have tried (e.g., pattern =
>> Chem.MolFromSmarts('[O;H0;D2]') )
>>
>> [O;H0;D2]  ===>  gives 12 matches (all ether oxygens)
>>
>> [O;H0;D2;R]  ===>  gives 12 matches (all ether oxygens)
>>
>> [O;H0;D2;!R]  ===>  gives 0 matches
>>
>> [O;H0;D2;R6]  ===>  gives 0 matches
>>
>>
>> I am stumped.  Any ideas?
>>
>> If it is necessary to write more complicated PYTHON/RDkit/SMARTS
>> code, I am certainly willing to try that.
>>
>> Thanks!
>>
>> Regards,
>> Jim Metz
>> Northwestern University


Re: [Rdkit-discuss] Fwd: Need SMARTS to distinguish 6-ring vs macrocyclic ether oxygens

2017-09-06 Thread TJ O'Donnell
Try using [O;H0;D2;r6] lower-case r.  Sorry I'm not at a computer to check
this.
R6 means in 6 rings.
r6 means in ring of size 6.

http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html

TJ O'Donnell
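
A quick sketch for checking the R6 versus r6 distinction in RDKit itself, assuming RDKit is installed. It counts matches for the patterns from the question plus the lower-case variants; count_matches is just a helper name chosen for this example.

```python
from rdkit import Chem

smiles = (
    "OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)"
    "OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O"
)
mol = Chem.MolFromSmiles(smiles)


def count_matches(mol, smarts):
    """Number of substructure matches of a SMARTS pattern in mol."""
    return len(mol.GetSubstructMatches(Chem.MolFromSmarts(smarts)))


counts = {
    smarts: count_matches(mol, smarts)
    for smarts in (
        "[O;H0;D2]",      # every ether oxygen
        "[O;H0;D2;R6]",   # oxygen that is a member of six rings
        "[O;H0;D2;r6]",   # oxygen in a ring of size 6
        "[O;H0;D2;!r6]",  # ether oxygen not in a ring of size 6
    )
}

for smarts, n in counts.items():
    print(f"{smarts:16s} {n}")
```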

On Wed, Sep 6, 2017 at 4:34 PM, James T. Metz via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> Hello,
>
> Given the following SMILES for a macrocyclic hexaose
>
> OCC1OC2OC3C(CO)OC(OC4C(CO)OC(OC5C(CO)OC(OC6C(CO)OC(OC7C(CO)OC(OC1C(O)C2O)C(O)C7O)C(O)C6O)C(O)C5O)C(O)C4O)C(O)C3O
>
> can anyone suggest a SMARTS pattern that will distinguish ether oxygens
> in the smaller 6-membered rings versus the ethers in the larger macrocyclic
> structure?
>
> For example, using RDkit, I have tried (e.g., pattern =
> Chem.MolFromSmarts('[O;H0;D2]') )
>
> [O;H0;D2]  ===>  gives 12 matches (all ether oxygens)
>
> [O;H0;D2;R]  ===>  gives 12 matches (all ether oxygens)
>
> [O;H0;D2;!R]  ===>  gives 0 matches
>
> [O;H0;D2;R6]  ===>  gives 0 matches
>
>
> I am stumped.  Any ideas?
>
> If it is necessary to write more complicated PYTHON/RDkit/SMARTS code,
> I am certainly willing to try that.
>
> Thanks!
>
> Regards,
> Jim Metz
> Northwestern University


Re: [Rdkit-discuss] connecting to postgres in rdkit environment

2017-02-25 Thread TJ O'Donnell
The server itself must be told to allow remote connections.
You might check these two things.
1.  You can edit the postgresql.conf file (not sure where that is on your
system).

https://www.postgresql.org/docs/9.2/static/runtime-config-connection.html
 Uncomment or add the line listen_addresses='*'. You can
 tailor that to be more specific, but try this first.

2.  The file pg_hba.conf also controls access.  Look at this:
  https://www.postgresql.org/docs/9.3/static/auth-pg-hba-conf.html

Be sure to restart the server after you make changes to these files.

Hope this helps,
TJ O'Donnell
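
After restarting the server, a plain TCP probe can separate "nothing is listening or the firewall blocks it" from "PostgreSQL is reachable but rejects the login". The sketch below uses only the Python standard library; can_connect is a helper name made up for this example, and the host name in the usage comment is hypothetical.

```python
import socket


def can_connect(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (hypothetical host name):
#   can_connect("db.example.org")
```

A True result only shows that the port is reachable. The rules in pg_hba.conf can still reject the client when it tries to authenticate.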


On Sat, Feb 25, 2017 at 12:34 PM, <nbell8...@yahoo.com> wrote:

> Hi,
> I've installed rdkit on a CentOS machine using anaconda python and set up
> a postgresql compound database in the rdkit environment. It works great on
> the machine's console.
> I now want to access it remotely and I'm trying to set up a jdbc postgres
> driver to access it from a windows client but this is not working. If I
> test the driver on the server it tells me that the connection is refused
> and I should check that the machine is accepting TCP requests.
>
> I have opened the standard port that postgres uses
> -A INPUT -m state --state NEW -m tcp -p tcp --dport 5432 -j ACCEPT
>
> iptables -L returns
> ACCEPT     tcp  --  anywhere             anywhere             state NEW tcp dpt:postgres
>
> this is where I don't know what to check next. A few things that might be
> relevant. If I "ps -eaf | grep post" I see four postgres processes running
> under my username (not postgres), so I think there is a server working.
> There is also a "system" postgresql (version 9.2) which I have connected to
> previously a long time ago. This connection no longer works either and I
> don't really care about that but could be an interfering factor.
>
> If anyone has suggestions about what to check next or solve this I'd be
> grateful
>
> thanks,
> Neil
>


Re: [Rdkit-discuss] Struggling with apache + rdkit + django

2016-06-21 Thread TJ O'Donnell
I would suggest setting PYTHONPATH in the config or ini files for
Apache, Django, or uwsgi. I'm not sure which one is required.
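
If setting PYTHONPATH in the server configuration turns out to be awkward, one alternative is to extend the module search path at the top of the WSGI entry script, before anything imports rdkit. This is only a sketch: the /opt/rdkit location and the prepend_path helper are assumptions for illustration, not something confirmed for this setup.

```python
import os
import sys

# Hypothetical install location; use the directory that contains the rdkit package.
RDKIT_ROOT = "/opt/rdkit"


def prepend_path(path, search_path=sys.path):
    """Put path at the front of search_path unless it is already listed."""
    if path not in search_path:
        search_path.insert(0, path)
    return search_path


prepend_path(RDKIT_ROOT)
# Set RDBASE only if the server environment has not already defined it.
os.environ.setdefault("RDBASE", RDKIT_ROOT)
```

As far as I know, LD_LIBRARY_PATH cannot be fixed this way, because the dynamic loader reads it when the process starts. It has to be set in the environment of the Apache or uwsgi process itself.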

On Tue, Jun 21, 2016 at 11:15 AM, Téletchéa Stéphane <
stephane.teletc...@univ-nantes.fr> wrote:

> Le 21/06/2016 20:05, Bennion, Brian a écrit :
> > What is the actual problem that is occurring?  You have listed what you
> have tried to do to fix a problem.
> >
> > Brian
>
> Dear Brian,
>
> I get a 500 error meaning something is not working properly, but no
> trace in logs (either apache or django),
> so I can only "assume" it comes from there since in the "developer"
> mode there is no problem (everything works as expected).
>
> Sorry for the confusion,
>
> Stéphane
>
> --
> Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein
> Design In Silico
> UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322
> Nantes cedex 03, France
> Tél : +33 251 125 636 / Fax : +33 251 125 632
> http://www.ufip.univ-nantes.fr/ - http://www.steletch.org


Re: [Rdkit-discuss] molecule standardization in cartridge search

2015-09-25 Thread TJ O'Donnell
Tim,

I have a set of postgres python (PL/Python) functions using rdkit.
It is available at
https://github.com/tjod/rdchord
and some docs at
https://github.com/tjod/rdchord/wiki

TJ O'Donnell

On Fri, Sep 25, 2015 at 6:54 AM, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> Jan,
>
> thanks for that. I'll give it a try.
> Are there any examples of writing RDKit functions and procedures for
> postgres in python?
> I see this general postgres docs:
> http://www.postgresql.org/docs/9.4/static/plpython.html
> but wondered if there are any RDKit specific examples anywhere?
>
> Tim
>
> On 25/09/2015 08:30, Jan Holst Jensen wrote:
> > On 2015-09-24 16:22, Tim Dudgeon wrote:
> >> I'm trying to get to grips with using the RDKit cartridge, and so far
> >> it's going well.
> >> One thing I'm concerned about is molecule standardization, along the
> >> lines of the ChemAxon Standardizer that allows substructure searches to
> >> be done in a way that is largely independent of the quirks of structure
> >> representation. The classic example would be how nitro groups are
> >> represented, so that it didn't matter which nitro representation was in
> >> the query or target structures, because both were converted to a
> >> canonical form.
> >>
> >> My initial thoughts are that this would be done by:
> >> 1. loading the "raw" structures into a source column that would never be
> >> changed
> >> 2. defining a function that performed the necessary transform to
> >> generate the canonical form of a molecule.
> >> 3. generating a "canonical" structure column that was the result of
> >> passing the raw structures through that function
> >> 4. building the SSS index on that canonical column
> >> 5. executing queries using that function to canonicalize the query
> >> structure
> >>
> >> The problem I'm finding is that there do not seem to be postgres
> >> functions defined for doing molecular transforms (essentially a reaction
> >> transform) and doing things like removing explicit hydrogens. At least
> >> not in the functions listed on this page:
> >> http://rdkit.org/docs/Cartridge.html#functions
> >>
> >> Am I missing something here, or might I be barking up completely the
> >> wrong tree?
> >>
> >> Tim
> >
> > Hi Tim,
> >
> > We have about the same situation and we're adding standardization
> > (beyond what RDKit implicitly does when it sanitizes the molecule)
> > through Python stored procedures. You will need to build and maintain
> > a normal Python-enabled RDKit installation in parallel to the
> > cartridge. The Python stored procedures can access the normal RDKit
> > installation and then run whatever Python code is necessary to do
> > additional molecule cleanup.
> >
> > You will need to tweak your Postgres environment so the Python stored
> > procedures can load RDKit. This is what I have defined in an
> > environment file on CentOS:
> >
> > RDBASE=/opt/rdkit
> > LD_LIBRARY_PATH=/opt/rdkit/lib
> > PYTHONPATH=/opt/rdkit
> >
> > On Ubuntu this would go into /etc/postgresql/9.x/main/environment (in
> > a slightly different format where the values have to be single-quoted).
> >
> > Cheers
> > -- Jan, Biochemfusion
>
>
>
> --
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] https://sourceforge.net

2015-03-23 Thread TJ O'Donnell
How about including a link on sourceforge to this:
https://help.github.com/articles/support-for-subversion-clients/
so that folks without git clients can get started.

TJ

On Fri, Mar 20, 2015 at 9:48 PM, Greg Landrum greg.land...@gmail.com
wrote:

 The mailing lists and one form of the downloads are hosted there. It's a
 very good point that having the trackers still active on sourceforge is
 confusing. I just deleted them.

 We should also do something about the svn repo that's there, just to make
 clear that it's no longer active.

 Does anyone see a problem with me doing a commit there that removes all
 the code and just leaves a "look in github" readme?


 On Fri, Mar 20, 2015 at 7:44 PM, Soren Wacker swac...@ucalgary.ca wrote:

 Hi,

 rdkit has moved to github, but there is still the repository on
 sourceforge.net.
 However, if you google 'rdkit bugs' the sourceforge page comes up first.
 I find that confusing. Is there a reason to keep the sourceforge.net
 stuff?
 If not, why don't you remove the sourceforge repository?

 kind regards
 Soren



Re: [Rdkit-discuss] Oracle, pypl and rdkit

2015-03-12 Thread TJ O'Donnell
I've implemented a suite of rdkit functions
for postgres using plpython
https://github.com/tjod/rdchord
and the overhead is minimal
since most of the heavy lifting of substructure searching
is done by rdkit.

I think the same would be true of oracle.
---
TJ O'Donnell

On Thu, Mar 12, 2015 at 4:24 PM, Michal Krompiec michal.kromp...@gmail.com
wrote:

 Hello, has anybody tried to implement substructure searching in an Oracle
 database using PYPL and RDKit? Is it just a matter of writing a wrapper
 function for molecule.HasSubstructMatch(pattern) or is the overhead of
 calling pypl each time too costly timewise? Do consecutive pypl calls
 always share the same interpreter?
 Best wishes,
 Michal




Re: [Rdkit-discuss] autodock vina pdbqt file to mol2

2014-05-09 Thread TJ O'Donnell
Babel can read and write both pdbqt and mol2 files. I'm not sure how the
atom ordering might be accomplished though.

TJ
On May 9, 2014 2:43 PM, Jan Domanski jan...@gmail.com wrote:

 Thanks for the quick reply Christos!

 I found the pdbqt_to_pdb script that you mentioned, but a Google search for
 a pdbqt-to-mol2 converter yields nothing (other than this thread). The pdbqt_to_pdb
 converter is very crude: it retains only the best pose from _out.pdbqt and
 it basically just strips the BRANCH and ROOT tags deposited by autodock
 (which I was doing anyway with the sed).

 The main problems remaining are atom order (I can fix that) and missing
 hydrogens (can't fix that). There is a mode where I can prevent the
 prepare_ligand4.py from removing the hydrogens – but the output poses then
 have really weird geometry.

 But let's refocus a little bit: this is not an autodock vina question
 (although many folks here are knowledgeable enough to help me). This is a
 question on a mol2 file to which it should be possible to add Hs with rdkit
 and it's somehow not happening (at least not in my hands). My mol2 could be
 somehow malformatted.





 On 9 May 2014 20:57, Christos Kannas chriskan...@gmail.com wrote:

 Hi Jan,

 AutoDock has a set of tools (MGLTools) that have tools to convert pdb to
 pdbqt and vice-versa.
 If I recall correctly, it can also convert pdbqt to mol2. See this discussion
 http://autodock.1369657.n2.nabble.com/ADL-pdbqt-to-mol2-td6755769.html

 Best,

 Christos

 Christos Kannas

 Researcher
 Ph.D Student

 Mob (UK): +44 (0) 7447700937
 Mob (Cyprus): +357 99530608



 On 9 May 2014 20:17, Jan Domanski jan...@gmail.com wrote:

  Hi guys,

 I'm really stuck here: I have some output from autodock vina in a rather
 obscure pdbqt format. It's a little bit like pdb but not quite. I'm trying
 to get back a mol2 file.

 The autodock pdbqt file has only the polar hydrogens in it – part of the
 trick is to re-add the hydrogens.

 Example autodock vina output is attached (it's a conformer of the ACE
 native ligand DUDE).

 First of all, I convert that to a PDB file by doing a simple sed,
 sed -e '/ROOT/d' -e '/BRANCH/d'
 Then I reorder the atoms to match those of the original
 crystal_ligand.mol2 (because autodock re-orders the atoms duh).

 Finally, I save a mol2 file out (attached) ordered as the original
 crystal_ligand and with polar hydrogens (for each pose of a conformer).

 Let's go to rdkit and try to add hydrogens:

 from rdkit import Chem
 from rdkit.Chem import AllChem

 # output is the path to the mol2 file saved in the previous step
 mol = Chem.MolFromMol2File(output, removeHs=False)
 mol2 = AllChem.AddHs(mol, addCoords=True)
 print(mol.GetNumAtoms(), mol2.GetNumAtoms())
 44 44

 So, only the implicit hydrogens are present. Calling AddHs doesn't raise
 an error and it doesn't really change the number of hydrogens...

 Now this may not be the best way of doing things: what I care for is to
 get a mol2 from autodock vina that I can compare to the original mol2 from
 DUD (same atom order, same number of atoms). Maybe there are other ways to
 achieve this: one idea would be to inject the docked pose coordinates into
 the original mol2 atoms (heavy and polar hydrogens) and somehow adjust
 the non-polar hydrogens.

 Thanks,

 - Jan





Re: [Rdkit-discuss] RDKit cartridge - opposite of mol_from_ctab() would be nice.

2014-02-24 Thread TJ O'Donnell
Hi All

I would like to announce the availability of a somewhat different
rdkit-based postgresql extension.  This uses rdkit for all the basic
cheminformatics functions (canonical smiles, molfile handling, smarts
matching, fingerprints, etc.) but is based on the use of postgres'
plpython language.
This does not use the existing rdkit postgres cartridge, although I have
demonstrated that the two can be used side-by-side (via the use of
rdkit pickled mol objects).

I hope this use of python might make it easier to extend postgres even
further with additional functions based on rdkit.  The code can be checked
out from sourceforge using this:

svn checkout svn://svn.code.sf.net/p/sci3d/code/trunk/openchord/src/rdkitchord

This is a work in progress, so I would appreciate any feedback.  There are
still some wrinkles that need to be ironed out.  I plan to document
the installation and usage better, probably using github.

TJ O'Donnell



On Sat, Feb 22, 2014 at 10:53 PM, Greg Landrum greg.land...@gmail.com wrote:


 On Fri, Feb 21, 2014 at 5:45 PM, Jan Holst Jensen 
 j...@biochemfusion.com wrote:

  Hi Greg,

 It would be great to gain the experience. I am working on a registration
 project where we will likely need to surface additional functions in the
 cartridge, just to try them out. So, knowing how to do that in a way where
 things that turn out useful can be contributed back cleanly would be great.


 Sounds good.



  if structures don't have conformers

 Ah, yes; good question. Decisions, decisions... I'll dodge the question
 :-) and say it sounds like a perfect fit for an optional parameter, e.g.

 mol_to_ctab(m mol, add_depiction_if_missing bool default true)

 I would go for default true because I believe that is the general
 preference.


 Having the optional argument that defaults to true make sense to me.

 Here's an attempt to briefly summarize what needs to be changed in order
 to add the new functionality:

 - Add mol_to_ctab to rdkit_io.c
 - Add molToCtabText (or some such thing) to adapter.cpp and rdkit.h
 - Add mol_to_ctab() definitions to rdkit.sql91.in and, if you want to
 support older versions of postgres, rdkit.sql.in
 - Update link dependencies in Makefile if necessary (will be necessary if
 you add depictions)
 - Add tests to one of the files in sql/ (the most logical place is
 probably rdkit-91.sql and rdkit-pre91.sql if you are supporting older
 versions) and the corresponding output file in expected/


 I think that's it.

 -greg



  Cheers
 -- Jan


 On 2014-02-21 16:47, Greg Landrum wrote:

 Hi Jan,

  Great idea. I'd be happy to add it, but I can also talk you through
 it if you want to gain the experience.

  One important question: if structures don't have conformers (if they
 are loaded from SMILES, for example), should ctabs with all zero
 coordinates be generated or should depictions be generated?

  -greg



 On Fri, Feb 21, 2014 at 2:23 PM, Jan Holst Jensen 
 j...@biochemfusion.com wrote:

  Hi Greg,

 Are there any plans for a mol_*to*_ctab() function in the PG cartridge
 ? Would make SD file export from the database a bit easier.

 If there are no immediate plans, I can take a stab at adding it myself.

 * Looks like rdkit_io.c is the place to add it ?
 * Should I manually define the new SQL function in rdkit.sql.in, or is
 there some higher-level place I should add it instead ?

 Cheers
 -- Jan




Re: [Rdkit-discuss] PDB reader and bond perception

2014-01-13 Thread TJ O'Donnell
Hi JP

I use this file from PDB Europe:
ftp://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/files/pdb.tar.gz
Useful links followed from
http://www.ebi.ac.uk/pdbe-srv/pdbechem/

The pdb.tar.gz file has the standard residues and LOTS of others
with specific CONECT records.

TJ
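
As a rough illustration of how the CONECT records mentioned above could be
turned into a bond list, here is a minimal sketch in plain Python. It assumes
the standard fixed-width PDB layout (atom serial in columns 7-11, bonded
serials in the following five-character fields). The function name
bonds_from_conect is made up for this example, and the sketch recovers
connectivity only, not bond orders.

```python
def bonds_from_conect(pdb_text):
    """Collect unique atom-serial pairs from the CONECT records of a PDB text."""
    bonds = set()
    for line in pdb_text.splitlines():
        if not line.startswith("CONECT"):
            continue
        # Fixed-width fields: columns 7-11, 12-16, 17-21, 22-26, 27-31.
        fields = [line[i:i + 5].strip() for i in range(6, 31, 5)]
        serials = [int(f) for f in fields if f]
        if len(serials) < 2:
            continue
        atom, partners = serials[0], serials[1:]
        for partner in partners:
            # Store each bond once, whichever atom's record lists it.
            bonds.add((min(atom, partner), max(atom, partner)))
    return sorted(bonds)


# Hypothetical three-atom fragment: atom 1 is bonded to atoms 2 and 3.
example = "\n".join([
    "CONECT    1    2    3",
    "CONECT    2    1",
    "CONECT    3    1",
])
print(bonds_from_conect(example))
```

Each bond normally appears twice in a file (once per atom), which is why the
pairs are normalised and stored in a set.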



On Mon, Jan 13, 2014 at 9:54 AM, JP jeanpaul.ebe...@inhibox.com wrote:

 RDKitters!

 Finally back on the mailing list!

 I am sure we've been through this at the UGM (my mind must have wandered
 off!), but a quick question about the PDB reader and bond perception.  Is
 this supported with the current PDB reader?  I remember that someone
 (PaulE, perhaps?) was saying bond perception was painful, but there was
 some dictionary for PDB ligands which helps (any idea the name of this
 dictionary?).

 To the technical details.

 I am reading in the following PDB file with a simple MolFromPDBFile() call:

 HETATM    1  O1P 84T A1862     -27.016   9.387 -72.564  1.00 20.81           O
 HETATM    2  P   84T A1862     -27.282   9.818 -73.968  1.00 19.65           P
 HETATM    3  O2P 84T A1862     -27.881  11.176 -74.182  1.00 21.49           O
 HETATM    4  N   84T A1862     -25.869   9.583 -74.813  1.00 19.78           N
 HETATM    5  C   84T A1862     -25.759  10.010 -76.075  1.00 19.97           C
 HETATM    6  CA  84T A1862     -24.493   9.748 -76.807  1.00 19.75           C
 HETATM    7  CB  84T A1862     -24.794   8.678 -77.847  1.00 19.73           C
 HETATM    8  CG  84T A1862     -23.571   8.324 -78.681  1.00 19.70           C
 HETATM    9  CD2 84T A1862     -23.309   9.519 -79.611  1.00 18.49           C
 HETATM   10  CD1 84T A1862     -23.863   6.932 -79.305  1.00 18.60           C
 HETATM   11  OHB 84T A1862     -25.210   7.467 -77.223  1.00 19.17           O
 HETATM   12  OH  84T A1862     -23.549   9.127 -75.984  1.00 20.33           O
 HETATM   13  O   84T A1862     -26.672  10.517 -76.692  1.00 20.26           O
 HETATM   14  O5' 84T A1862     -28.377   8.861 -74.619  1.00 19.39           O
 HETATM   15  C5' 84T A1862     -28.002   7.536 -74.954  1.00 18.47           C
 HETATM   16  C4' 84T A1862     -28.909   7.000 -76.012  1.00 18.24           C
 HETATM   17  C3' 84T A1862     -28.901   7.826 -77.298  1.00 18.28           C
 HETATM   18  C2' 84T A1862     -30.318   7.610 -77.768  1.00 18.69           C
 HETATM   19  O2' 84T A1862     -30.789   8.641 -78.581  1.00 19.64           O
 HETATM   20  O4' 84T A1862     -30.262   6.951 -75.529  1.00 18.80           O
 HETATM   21  C1' 84T A1862     -31.152   7.470 -76.521  1.00 19.01           C
 HETATM   22  N9  84T A1862     -31.753   8.732 -76.009  1.00 20.08           N
 HETATM   23  C4  84T A1862     -33.033   9.013 -76.158  1.00 21.10           C
 HETATM   24  N3  84T A1862     -34.018   8.339 -76.786  1.00 21.58           N
 HETATM   25  C2  84T A1862     -35.263   8.846 -76.830  1.00 21.95           C
 HETATM   26  C8  84T A1862     -31.223   9.701 -75.291  1.00 20.27           C
 HETATM   27  N7  84T A1862     -32.173  10.618 -75.019  1.00 21.28           N
 HETATM   28  C5  84T A1862     -33.315  10.213 -75.563  1.00 21.81           C
 HETATM   29  C6  84T A1862     -34.624  10.702 -75.627  1.00 22.85           C
 HETATM   30  N1  84T A1862     -35.550  10.010 -76.285  1.00 22.44           N
 HETATM   31  N6  84T A1862     -35.008  11.862 -75.052  1.00 23.86           N
 TER
 END

 But I am losing all the double bond (and aromatic) information:

 m = Chem.MolFromPDBFile(sys.argv[1])
 print(Chem.MolToSmiles(m))

 Gives me:

 CC(C)C(O)C(O)C(O)NP(O)(O)OCC1CC(O)C(N2CNC3C2NCNC3N)O1

 As usual, many thanks for your time,

 -
 Jean-Paul Ebejer
 Early Stage Researcher






[Rdkit-discuss] incorrect stereochemistry

2012-10-25 Thread TJ O'Donnell
Hi All

In a recent list of about 100,000 smiles, I ran into 512 that caused
some problems.
Basically, the stereochemistry of the canonicalized (isomericSmiles=True) smiles
gets reversed.  I saw some discussion of this topic a while back, but it seems
it had not been resolved.
[15:07:50] Warning: ring stereochemistry detected. The output SMILES
is not canonical.
 Any help or input on this?
Some offending smiles are below along with the code
I used to test this.  I can provide a file of 512 if you'd like.
I'm using 2012.09.1, freshly compiled from svn
and passing all tests

TJ O'Donnell

---
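from rdkit import Chem
import sys

# Read one SMILES per line from stdin (first whitespace-separated token).
for line in sys.stdin:
  if not line.strip():
    continue  # skip blank lines
  smi = line.split(None, 1)[0]
  mol = Chem.MolFromSmiles(smi)
  if mol:
    print(smi)
    print(Chem.MolToSmiles(mol, isomericSmiles=True))
  else:
    print("can't parse smiles")
-- my truncated input --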
CC1(c2cc(C(F)(F)F)cc(C(F)(F)F)c2)CCN([C@@]2(c3c3)CC[C@H](N3CCN(c4c4Cl)C(=O)C3)CC2)C1=O
Fc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
Fc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
Fc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
Fc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56c5OCCO6)CC4)CC3)c2c1
c1ccc(CCN[C@H]2CC[C@H](Nc34cnccc43)CC2)cc1
c1ccc(CCN[C@H]2CC[C@H](Nc34cnccc43)CC2)cc1
c1ccc(CCN[C@@H]2CC[C@H](Nc34cnccc43)CC2)cc1
c1ccc(CCN[C@@H]2CC[C@H](Nc34cnccc43)CC2)cc1
CCCn1c2[nH]c(C3CCC(NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c(C3CCC(NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@H]3CC[C@H](NC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
O=C(O)[C@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
O=C(O)[C@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
O=C(O)[C@@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
O=C(O)[C@@H]1CC[C@H](Oc2(Sc3ccc(/C=C/C(=O)N4CCOCC4)c(C(F)(F)F)c3C(F)(F)F)c2)CC1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
O=C(NC1CCC(CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(N[C@@H]1CC[C@@H](CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
Cn1ccc2ccc3c4[nH]c5c(5CCN[C@H]5CC[C@H](O)CC5)c4c4c(c3c21)C(=O)NC4=O
N=C(N)Nc1ccc(CNC(=O)N2CCN(C(=O)O[C@@H]3CCC[C@H](OC(=O)N4CCN(C(=O)n5ccnc5)CC4)CCC3)CC2)cc1
CC(C)c1cc(C(C)C)c(S(=O)(=O)NC[C@H]2CC[C@H](C(=O)NNC(=O)c3cc4c4s3)CC2)c(C(C)C)c1
O=C(CCC[C@@H]1OO[C@H]((=O)c2c2)OO1)c1c1
-- my truncated output ; input smiles/output smiles pairs of lines --
N#Cc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@@H]3CC[C@@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
N#Cc1ccc2[nH]cc([C@H]3CC[C@H](N4CCN(c56nccnc65)CC4)CC3)c2c1
O=C(NC1CCC(CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(NC1CCC(CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(N[C@@H]1CC[C@@H](CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
O=C(N[C@H]1CC[C@H](CCN2CCN(c3(Cl)c3Cl)CC2)CC1)c1cccs1
Cn1ccc2ccc3c4[nH]c5c(5CCN[C@H]5CC[C@H](O)CC5)c4c4c(c3c21)C(=O)NC4=O
Cn1ccc2ccc3c4[nH]c5c(5CCN[C@@H]5CC[C@@H](O)CC5)c4c4c(c3c21)C(=O)NC4=O
N=C(N)Nc1ccc(CNC(=O)N2CCN(C(=O)O[C@@H]3CCC[C@H](OC(=O)N4CCN(C(=O)n5ccnc5)CC4)CCC3)CC2)cc1
N=C(N)Nc1ccc(CNC(=O)N2CCN(C(=O)O[C@H]3CCC[C@@H](OC(=O)N4CCN(C(=O)n5ccnc5)CC4)CCC3)CC2)cc1
CC(C)c1cc(C(C)C)c(S(=O)(=O)NC[C@H]2CC[C@H](C(=O)NNC(=O)c3cc4c4s3)CC2)c(C(C)C)c1
CC(C)c1cc(C(C)C)c(S(=O)(=O)NC[C@@H]2CC[C@@H](C(=O)NNC(=O)c3cc4c4s3)CC2)c(C(C)C)c1
O=C(CCC[C@@H]1OO[C@H]((=O)c2c2)OO1)c1c1
O=C(CCC[C@@H]1OO[C@H]((=O)c2c2)OO1)c1c1
O=C(CCC[C@@H]1OO[C@@H]((=O)c2c2)OO1)c1c1
O=C(CCC[C@H]1OO[C@H]((=O)c2c2)OO1)c1c1
CCCn1c2[nH]c([C@@H]3CC[C@@H](CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c([C@H]3CC[C@H](CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c(C3CCC(CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
CCCn1c2[nH]c(C3CCC(CNC(C)=O)CC3)nc2c(=O)n(CCC)c1=O
c1cc2c(2N2CCN([C@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@@H]3CC[C@@H](c4c[nH]c5c54)CC3)CC2)[nH]1
c1cc2c(2N2CCN([C@H]3CC[C@H](c4c[nH]c5c54)CC3)CC2)[nH]1
C(=O)N[C@@]1(C(=O)N[C@H
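
One possible way to triage input/output pairs like the ones above is to check
whether the two strings are identical except that every @ has become @@ and
vice versa. Flipping every tetrahedral tag describes the mirror image, which
for an achiral molecule is the same compound, so such pairs may deserve a
separate look. The sketch below is plain string handling, not RDKit code. The
function names and the example SMILES are made up for illustration.

```python
import re


def swap_chirality(smiles):
    """Return the SMILES text with every @ and @@ tag exchanged."""
    # Match @@ before @ so a double tag is not read as two single tags.
    return re.sub(r"@@|@", lambda m: "@" if m.group(0) == "@@" else "@@", smiles)


def classify_pair(smi_in, smi_out):
    """Label a pair as 'same', 'all-inverted' or 'different' by text comparison."""
    if smi_in == smi_out:
        return "same"
    if swap_chirality(smi_in) == smi_out:
        return "all-inverted"
    return "different"


# Hypothetical 1,4-disubstituted cyclohexane written with both tag choices.
print(classify_pair("C[C@H]1CC[C@@H](O)CC1", "C[C@@H]1CC[C@H](O)CC1"))
```

This only compares text, so it will miss pairs where the canonicalizer also
reordered atoms. Those would still need a structure-level comparison.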

Re: [Rdkit-discuss] postgresql

2011-06-01 Thread TJ O'Donnell
There are binaries available at http://www.postgresql.org/download/
and a nice wiki at http://wiki.postgresql.org/wiki/Main_Page
The postgres community is great - check out the mailing lists
at http://www.postgresql.org/community/

TJ
-
TJ O'Donnell, Ph.D.
President, gNova Inc.
t...@gnova.com


On Wed, Jun 1, 2011 at 6:33 AM, Peter Schmidtke pschmid...@ub.edu wrote:
 Hey Paul,

 hope you are fine ;)

 What system/architecture are you using?

 ++

 Peter


 On 01/06/2011, at 15:31, paul.czodrow...@merck.de wrote:


 dear rdkitters,

 i would like to install postgresql/sqlite. could anyone point to a good
 tutorial on how to set-up such a system? i know how to use google, but
 maybe you guys are faster... :)


 paul


 Peter Schmidtke

 -
 PhD Student
 Department of Physical Chemistry
 School of Pharmacy
 University of Barcelona
 Barcelona, Spain







Re: [Rdkit-discuss] Contractors working with the RDKit?

2011-03-20 Thread TJ O'Donnell
I am willing and able to do consulting and contract programming using RDKit,
in either Python or C++.

http://gnova.com

TJ

TJ O'Donnell, Ph.D.
President, gNova, Inc.
t...@gnova.com

On Sat, Mar 19, 2011 at 10:46 PM, Greg Landrum greg.land...@gmail.com wrote:
 Dear all,

 I was recently asked if there was anyone out there who was able to do
 contract development work with or on the RDKit. It's a good question,
 but I didn't have a good answer handy. So I'm asking here.

 If you currently do, or are willing to do, contract development work
 either extending the RDKit or developing new tools based on the RDKit,
 please reply to this thread. It would be helpful if you indicate your
 comfort level on both the C++ and Python sides. If there's sufficient
 interest/response, I'd be happy to include a section either on
 rdkit.org or on the wiki with names/links.

 Thanks,
 -greg





Re: [Rdkit-discuss] can't kekulize smiles generated by Chem.MolToSmiles

2011-01-10 Thread TJ O'Donnell
Hi Greg,

As usual, thanks for your quick response.  Yes, these were big molecules.
Let me know if you'd like me to try out any changes.  I can recompile
changes from subversion easily now.  I discovered these four examples
using 1/10 of the chembl database and can try any new code changes
on the entire set of 600K molecules.

TJ


On Sun, Jan 9, 2011 at 10:51 PM, Greg Landrum greg.land...@gmail.com wrote:
 Hi TJ,

 On Mon, Jan 10, 2011 at 2:37 AM, TJ O'Donnell t...@acm.org wrote:
 Thanks Greg.  I compiled in the changes and that molfile works fine
 now, but...
 Here are four new examples of molfiles that convert to mol and smiles just
 fine, but the resulting smiles won't parse properly back to a mol.

 Can you take a look?

 Thanks for finding another good bug. The problem here is caused, as
 you probably guessed, by the size of the molecules (specifically by
 the fact that more than 50 rings were open at one point during the
 generation of the SMILES). I will get it fixed for the release.

 Best Regards,
 -greg
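
 To get a feel for the "rings open at one point" figure mentioned above, the
 following plain-Python sketch counts how many ring-closure labels are open
 simultaneously while scanning a SMILES string left to right. It is only an
 illustration written for this note, not the RDKit logic. The function name
 max_open_rings is hypothetical, and the scan handles single digits and
 two-digit %nn labels only.

```python
def max_open_rings(smiles):
    """Return the largest number of ring closures open at once in a SMILES string."""
    open_labels = set()
    max_open = 0
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Digits inside a bracket atom are isotopes, H counts or charges.
            i = smiles.index("]", i) + 1
            continue
        if ch == "%":
            # Two-digit ring-closure label such as %12.
            label = int(smiles[i + 1:i + 3])
            i += 3
        elif ch.isdigit():
            label = int(ch)
            i += 1
        else:
            i += 1
            continue
        if label in open_labels:
            open_labels.remove(label)  # second occurrence closes the ring
        else:
            open_labels.add(label)     # first occurrence opens it
            max_open = max(max_open, len(open_labels))
    return max_open


print(max_open_rings("c1ccccc1"))   # benzene: one label open at a time
print(max_open_rings("C12CC1CC2"))  # fused bicycle: two labels open together
```

 Running this over the SMILES generated for a large set would show which
 molecules come close to whatever limit a given writer or parser imposes.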




[Rdkit-discuss] can't kekulize smiles generated by Chem.MolToSmiles

2011-01-06 Thread TJ O'Donnell
I've stumbled onto a molfile which is read properly (MolFromMolBlock) and
produces a proper smiles (MolToSmiles).  But the smiles generated fails
on Chem.MolFromSmiles.  Can you help figure this one out?
I've attached the molfile in question.

Here is a simple script I used to show this issue.

from rdkit import rdBase
from rdkit import Chem
import sys

print(rdBase.boostVersion)
print(rdBase.rdkitVersion)
mb = sys.stdin.read()
mol = Chem.MolFromMolBlock(mb)
if mol:
  smi = Chem.MolToSmiles(mol, isomericSmiles=True)
  print(smi)
  # Round trip: parsing the generated SMILES is the step that fails.
  newmol = Chem.MolFromSmiles(smi)

and the result I get
python rdmol.py < 254080.mol
1_40
2010.12.1
[Cl-].CC(C)(C)c1[Te+]c(C(C)(C)C)cc(/C=C/C=C2C=C(C(C)(C)C)OC(C(C)(C)C)=C2)c1
[18:13:52] Can't kekulize mol

I just rebuilt from subversion source - not sure why this version
shows as 2010.12.1
URL: https://rdkit.svn.sourceforge.net/svnroot/rdkit/trunk
Repository Root: https://rdkit.svn.sourceforge.net/svnroot/rdkit
Repository UUID: 19320e9b-7711-0410-929e-f4fff3a11e9f
Revision: 1611
Node Kind: directory
Schedule: normal
Last Changed Author: glandrum
Last Changed Rev: 1611
Last Changed Date: 2011-01-05 00:45:35 -0800 (Wed, 05 Jan 2011)


Thanks,
TJ O'Donnell


254080.mol
Description: Binary data


Re: [Rdkit-discuss] Question: modifying default parameters for the RDKit fingerprint?

2010-12-29 Thread TJ O'Donnell
Hi Greg-

No objection here.  I've been using 1024 with 2 bits here.
Are you still using 2048 for the default size?

TJ O'Donnell


On Tue, Dec 28, 2010 at 11:33 PM, Greg Landrum greg.land...@gmail.com wrote:
 Dear all,

 As I mentioned in an earlier message
 (http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01430.html),
 the default parameters for the RDKit fingerprint end up setting far
 too many bits for drug-like molecules. The result of this is
 similarity values that are in general too high and more frequent
 occurrences of molecules that are similar to each other only due to
 bit collisions.

 The easy solution to this problem is to decrease the number of bits
 set per path found (the nBitsPerHash parameter) from 4 to 2. I propose
 doing this for the Q4 2010 release of the RDKit. The downside is that
 the fingerprints generated with that release will not be compatible
 with fingerprints from earlier releases unless you specify
 nBitsPerHash=4 on your own. The upside is a much more useful
 similarity fingerprint.

 Any objections to me making this change?

 -greg
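
 The effect described above can be estimated with a back-of-the-envelope
 calculation. The sketch below assumes that each path sets nBitsPerHash bits
 chosen uniformly and independently at random, which is a simplification and
 not a description of the actual RDKit hashing. The function name and the
 figure of 300 paths are made up for illustration; 2048 and 1024 are the
 fingerprint sizes mentioned in this thread.

```python
def expected_bit_density(n_paths, n_bits_per_hash, fp_size):
    """Expected fraction of bits set when each path sets n_bits_per_hash random bits."""
    n_draws = n_paths * n_bits_per_hash
    # Probability that a given bit is missed by every draw, then the complement.
    return 1.0 - (1.0 - 1.0 / fp_size) ** n_draws


# Hypothetical molecule with 300 distinct paths.
for bits in (4, 2):
    for size in (2048, 1024):
        density = expected_bit_density(300, bits, size)
        print("nBitsPerHash=%d fpSize=%d density=%.2f" % (bits, size, density))
```

 Under this model, halving nBitsPerHash lowers the density noticeably, and a
 lower density means fewer chance collisions between unrelated molecules.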





[Rdkit-discuss] reading tag data from string, not file

2010-12-29 Thread TJ O'Donnell
I can see how to read an SD file using SDMolSupplier and then use mol.GetProp()
to get the tag data from the file.
But I have each molblock (the chunk of lines between $$$$ delimiters in an SD file)
in a separate string.  I don't see a way to get properties from that
molblock string or, even better, from mol=Chem.MolFromMolBlock(molblock).
E.g. mol.GetPropNames() returns an empty array (or just the private and
computed props if I call mol.GetPropNames(True,True)).
Can you give me some hints on how I might get the property tag data
from a string molblock?

TJ O'Donnell
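
One workaround for the question above is to pull the data items out of the
record text directly. The sketch below is plain Python and assumes the usual
SD layout: the molblock ends with "M  END", each data item starts with a
header line such as "> <TAG>", and a blank line ends its value. The function
name sd_tags_from_record and the example record are made up for illustration.

```python
def sd_tags_from_record(record):
    """Return {tag: value} for the data items that follow 'M  END' in one SD record."""
    tags = {}
    name = None
    value_lines = []
    in_data = False
    for line in record.splitlines():
        if not in_data:
            # Everything up to and including 'M  END' is the molblock itself.
            if line.startswith("M  END"):
                in_data = True
            continue
        if line.startswith("$$$$"):
            break
        if line.startswith(">"):
            # Header line such as "> <TAG>": the tag name sits between < and >.
            if name is not None:
                tags[name] = "\n".join(value_lines)
            start = line.find("<")
            end = line.find(">", start + 1) if start != -1 else -1
            name = line[start + 1:end] if end != -1 else None
            value_lines = []
        elif not line.strip():
            # A blank line ends the current data item.
            if name is not None:
                tags[name] = "\n".join(value_lines)
                name = None
        elif name is not None:
            value_lines.append(line)
    if name is not None:
        tags[name] = "\n".join(value_lines)
    return tags


# Hypothetical record with an empty molblock and two data items.
record = "\n".join([
    "mol1", "  sketch", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <ID>", "42", "",
    "> <NAME>", "ethanol", "",
    "$$$$",
])
print(sd_tags_from_record(record))
```

If RDKit should do the parsing instead, SDMolSupplier also has a SetData()
method that, as far as I know, accepts SD text rather than a file name. The
properties would then be available through mol.GetProp() as usual.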



Re: [Rdkit-discuss] MolFromMolBlock never returns

2010-12-27 Thread TJ O'Donnell
Hi Greg

Thanks for the quick reply.  Sure enough, the latest version of rdkit
fixes the problem I was having.  I should have tried that first!
Now that I have the build issues worked out, an svn update
and make install is pretty quick.

TJ

On Sun, Dec 26, 2010 at 8:31 PM, Greg Landrum greg.land...@gmail.com wrote:
 Hi TJ,

 2010/12/24 TJ O'Donnell t...@acm.org:
 I have a mol file that causes MolFromMolBlock to get stuck.
 I reproduced this problem with this simple python script (below).
 I've attached the problem input molfile.  I got the file
 from the chembl08 download.  Another large molfile finishes
 in seconds, but I stopped this one after about 1 minute.
 Can you see what might be the problem?

 I'm afraid I am not using the most recent version, but
 one I built last July.

 There have been some fixes related to handling of large molecules
 since July. Certainly the current state of the code from svn (and
 probably the last release, though I haven't tried this) handles your
 SD file without problems or huge delays (less than half a second on my
 machine).

 Best Regards,
 -greg




[Rdkit-discuss] MolFromMolBlock never returns

2010-12-26 Thread TJ O'Donnell

I have a mol file that causes MolFromMolBlock to get stuck.
I reproduced this problem with this simple python script (below).
I've attached the problem input molfile.  I got the file
from the chembl08 download.  Another large molfile finishes
in seconds, but I stopped this one after about 1 minute.
Can you see what might be the problem?

I'm afraid I am not using the most recent version, but
one I built last July.
From subversion:
URL: https://rdkit.svn.sourceforge.net/svnroot/rdkit/trunk
Repository Root: https://rdkit.svn.sourceforge.net/svnroot/rdkit
Repository UUID: 19320e9b-7711-0410-929e-f4fff3a11e9f
Revision: 1450
Node Kind: directory
Schedule: normal
Last Changed Author: glandrum
Last Changed Rev: 1450
Last Changed Date: 2010-07-08 21:15:09 -0700 (Thu, 08 Jul 2010)

Thanks,
TJ O'Donnell, Ph.D.
President, gNova, Inc.

from rdkit import Chem
import sys

# Read the whole molfile from stdin and parse it.
mb = sys.stdin.read()
mol = Chem.MolFromMolBlock(mb)
if mol:
  print(len(mb), mol)
  print(Chem.MolToSmiles(mol, isomericSmiles=False))
 
  CDK3/26/10,13:38

469504  0  0  0  0  0  0  0  0999 V2000
   10.8216  -24.66950. N   0  0  0  0  0  0  0  0  0  0  0  0
   10.8142  -30.38920. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.2281  -29.67280. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.0536  -29.67410. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.4664  -30.39150. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.0474  -31.10860. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.2233  -31.10370. C   0  0  0  0  0  0  0  0  0  0  0  0
   10.8074  -31.81630. C   0  0  0  0  0  0  0  0  0  0  0  0
9.9824  -31.81230. N   0  0  0  0  0  0  0  0  0  0  0  0
9.5730  -31.09550. C   0  0  0  0  0  0  0  0  0  0  0  0
8.7480  -31.09160. C   0  0  0  0  0  0  0  0  0  0  0  0
9.9889  -30.38300. O   0  0  0  0  0  0  0  0  0  0  0  0
8.3321  -31.80410. C   0  0  0  0  0  0  0  0  0  0  0  0
8.3389  -30.37520. N   0  0  0  0  0  0  0  0  0  0  0  0
7.5139  -30.37130. C   0  0  0  0  0  0  0  0  0  0  0  0
7.1048  -29.65480. C   0  0  0  0  0  0  0  0  0  0  0  0
7.0980  -31.08380. O   0  0  0  0  0  0  0  0  0  0  0  0
8.7415  -32.52090. C   0  0  0  0  0  0  0  0  0  0  0  0
9.7280  -33.41670. N   0  0  0  0  0  0  0  0  0  0  0  0
9.0115  -33.82580. C   0  0  0  0  0  0  0  0  0  0  0  0
8.4011  -33.27080. N   0  0  0  0  0  0  0  0  0  0  0  0
9.5665  -32.52490. C   0  0  0  0  0  0  0  0  0  0  0  0
   10.4016  -32.51210. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.2308  -32.50580. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.2914  -30.39420. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7019  -29.67400. N   0  0  0  0  0  0  0  0  0  0  0  0
   14.1221  -30.37920. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7017  -28.85570. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7238  -31.10640. C   0  0  0  0  0  0  0  0  0  0  0  0
   11.6317  -31.78480. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.8988  -31.11770. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.4962  -31.83780. N   0  0  0  0  0  0  0  0  0  0  0  0
   12.8916  -32.56190. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.7164  -32.58150. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.4622  -33.26630. O   0  0  0  0  0  0  0  0  0  0  0  0
   14.1457  -31.87700. N   0  0  0  0  0  0  0  0  0  0  0  0
   14.1117  -33.30560. C   0  0  0  0  0  0  0  0  0  0  0  0
   14.9705  -31.89670. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.3999  -31.19220. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.3659  -32.62070. O   0  0  0  0  0  0  0  0  0  0  0  0
   13.6824  -34.01000. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.9991  -34.77120. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.3731  -35.30850. N   0  0  0  0  0  0  0  0  0  0  0  0
   12.6686  -34.87910. C   0  0  0  0  0  0  0  0  0  0  0  0
   12.8593  -34.07650. N   0  0  0  0  0  0  0  0  0  0  0  0
   12.9851  -28.44690. O   0  0  0  0  0  0  0  0  0  0  0  0
   14.4140  -28.43950. C   0  0  0  0  0  0  0  0  0  0  0  0
   14.4097  -27.61450. N   0  0  0  0  0  0  0  0  0  0  0  0
   15.1309  -28.84870. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.8432  -28.43250. C   0  0  0  0  0  0  0  0  0  0  0  0
   15.9284  -27.61510. N   0  0  0  0  0  0  0  0  0  0  0  0
   16.7345  -27.43940. C   0  0  0  0  0  0  0  0  0  0  0  0
   17.1510  -28.15220. N   0  0  0  0  0  0  0  0  0  0  0  0
   16.6021  -28.76800. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.6931  -27.20570. C   0  0  0  0  0  0  0  0  0  0  0  0
   13.6888  -26.38070. C   0  0  0  0  0  0  0  0  0  0  0  0

[Rdkit-discuss] compile issues

2010-07-08 Thread TJ O'Donnell
Hi Greg

I'm trying to build rdkit on a 64-bit redhat system.

g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)

I built boost 1.43, the latest flex, and got up to this point
building rdkit

[ 82%] Building CXX object Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/lex.yysln.cpp.o
Linking CXX shared library libSLNParse.so
/usr/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libboost_regex.a(cpp_regex_traits.o): relocation R_X86_64_32S against `std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_empty_rep_storage' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/libboost_regex.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[2]: *** [Code/GraphMol/SLNParse/libSLNParse.so] Error 1
make[1]: *** [Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/all] Error 2
make: *** [all] Error 2

Can you help?

Thanks,
TJ O'Donnell
