Re: [Rdkit-discuss] Clustering - visualization?

2016-05-14 Thread Robert DeLisle
Thanks, Curt!  I'll give those a look.  It'll give me a very good reason to
start digging into SciPy a bit more and exploit the added functionality
that will bring.

Regarding my original question and for anyone else that might be
interested...

I did indeed find an answer through a lot of code dredging.  I found the
Murtagh.ClusterData() function in RDKit, and was able to generate clusters
from that.  The function returns a single member list, that single member
being a Cluster object.  I can feed that object to ClusterVis.ClusterToImg
to get the dendrogram I wanted.  Here's a short code snip showing the
pieces.

...
c_tree = Murtagh.ClusterData(dists,nfps,Murtagh.WARDS,isDistData=True)
...
rdkit.ML.Cluster.ClusterVis.ClusterToImg(c_tree[0], size=(500,500),
fileName='test.png')
...

I can then break the cluster tree into subtrees:

...
rdkit.ML.Cluster.ClusterUtils.SplitIntoNClusters(c_tree[0], 5)
...

And I've written a short function to extract out the individual structure
memberships for each group:

...

groups = ClusterUtils.SplitIntoNClusters(c_tree[0], 5)

def GetGroupMembers( grp, memberlist=None ):
    # create the list here; a list default would be shared between calls
    if memberlist is None:
        memberlist = []
    for child in grp.GetChildren():
        if (child.GetData() is None ):
            GetGroupMembers( child, memberlist )
        else:
            memberlist.append( child.GetData() )

    return memberlist

print(GetGroupMembers(groups[0]))
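
The recursive walk can be tried without RDKit installed. The following is a minimal sketch, assuming a hypothetical stand-in class `FakeCluster` that mimics only the `GetChildren()` and `GetData()` methods used in this message; it is not the RDKit `Cluster` class. The accumulator list is created inside the function, because a list used as a default argument value is created once and would be shared across calls.

```python
class FakeCluster(object):
    """Hypothetical stand-in exposing only GetChildren() and GetData()."""

    def __init__(self, data=None, children=None):
        self._data = data
        self._children = children or []

    def GetChildren(self):
        return self._children

    def GetData(self):
        return self._data


def get_group_members(grp, memberlist=None):
    """Collect the data of every leaf below grp, depth first."""
    if memberlist is None:
        memberlist = []  # fresh list for each top-level call
    for child in grp.GetChildren():
        if child.GetData() is None:
            get_group_members(child, memberlist)  # internal node: recurse
        else:
            memberlist.append(child.GetData())  # leaf: record its data
    return memberlist


# A small tree: root -> (leaf 0, inner -> (leaf 1, leaf 2))
tree = FakeCluster(children=[
    FakeCluster(data=0),
    FakeCluster(children=[FakeCluster(data=1), FakeCluster(data=2)]),
])

first = get_group_members(tree)
second = get_group_members(tree)  # a second call starts from an empty list
print(first, second)
```

With a shared default list, the second call would return the members twice; here both calls return the same three leaves.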




On Sat, May 14, 2016 at 11:21 AM, Curt Fischer <curt.r.fisc...@gmail.com>
wrote:

> Hi Robert,
>
> For the number of molecules you are interested in, it's viable to use
> SciPy / NumPy clustering functions instead of rdkit's built in C-linked
> functions.  This approach will probably not be as fast as rdkit's built-in
> clustering functionalities, and will probably not scale to tens of
> thousands of molecules as well as rdkit's functions, but if you use SciPy
> or NumPy in other types of technical computing, this approach may be more
> transparent, generalizable, and easier to use.
>
> I have an example Jupyter notebook in GitHub that describes what I mean;
> here are the GitHub and nbviewer links:
>
>
> https://github.com/tentrillion/ipython_notebooks/blob/master/chemical_similarity_in_python.ipynb
>
> https://nbviewer.jupyter.org/github/tentrillion/ipython_notebooks/blob/master/chemical_similarity_in_python.ipynb
>
> Here are some of the most important parts of the code for generating a
> dendrogram.
>
> 1. Generate a numpy fingerprint matrix from a list of rdkit Molecules.
>
> for smiles in smiles_list:
>     mol = Chem.MolFromSmiles(smiles)
>     mols.append(mol)
> # np.vstack needs a sequence of arrays, so build a list
> fingerprint_mat = np.vstack(
>     [np.asarray(rdmolops.RDKFingerprint(mol, fpSize=2048), dtype='bool')
>      for mol in mols])
>
>
> 2. Generate the distance matrix.  *pdist* and *squareform* are from
> *scipy.spatial.distance*.
>
> dist_mat = pdist(fingerprint_mat, 'jaccard')
> dist_df = pd.DataFrame(squareform(dist_mat), index=smiles_list,
>                        columns=smiles_list)
>
> As far as I can tell, the Jaccard distance is equivalent to one minus the
> Tanimoto similarity.
>
> 3. Perform hierarchical clustering on the distance matrix and show the
> dendrogram (see the github notebook for the plot). *hc* is
> *scipy.cluster.hierarchy*.
>
> z = hc.linkage(dist_mat)
> dendrogram = hc.dendrogram(z, labels=dist_df.columns, leaf_rotation=90)
> plt.show()
>
>
> A helpful page for dendrograms using SciPy is this one:
> https://joernhees.de/blog/2015/08/26/scipy-hierarchical-clustering-and-dendrogram-tutorial/
>
> Good luck!
>
> Curt
>
> On Sat, May 14, 2016 at 9:11 AM, Robert DeLisle <rkdeli...@gmail.com>
> wrote:
>
>> Next up is clustering...
>>
>> I've got about 350 structures to cluster and I've worked through the
>> example code from the RDKit Cookbook (
>> http://www.rdkit.org/docs/Cookbook.html#clustering-molecules).  All
>> seems well and good there, but I would like to see the dendrogram.  I see
>> that there is a ClusterVis module to generate images, PDF, and SVG, but all
>> require a Cluster object as input.  I don't find anywhere a description of
>> acquiring or building that object based upon the results of clustering.
>>
>> Any tips?
>>
>> -Kirk
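
Curt's remark that the Jaccard distance equals one minus the Tanimoto similarity can be checked by hand on small bit vectors. This is a minimal pure-Python sketch; the helper names `tanimoto` and `jaccard_distance` are illustrative and are not RDKit or SciPy functions. It assumes the usual boolean definition of the Jaccard dissimilarity: among the positions where at least one vector is set, the fraction in which the two vectors disagree. The handling of two all-zero vectors is a convention chosen here for the sketch.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two equal-length 0/1 sequences."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention for this sketch: two all-zero vectors count as identical.
    return both / float(either) if either else 1.0


def jaccard_distance(a, b):
    """Boolean Jaccard dissimilarity of two equal-length 0/1 sequences."""
    differ = sum(1 for x, y in zip(a, b) if bool(x) != bool(y))
    either = sum(1 for x, y in zip(a, b) if x or y)
    return differ / float(either) if either else 0.0


fp1 = [1, 1, 0, 1, 0, 0, 1, 0]
fp2 = [1, 0, 0, 1, 1, 0, 1, 0]

similarity = tanimoto(fp1, fp2)        # 3 shared bits out of 5 set in either
distance = jaccard_distance(fp1, fp2)  # 2 differing bits out of 5
print(similarity, distance, 1.0 - similarity)
```

The two quantities are computed from independent counts, so agreement between `distance` and `1.0 - similarity` is a genuine check rather than a restatement.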

[Rdkit-discuss] Clustering - visualization?

2016-05-14 Thread Robert DeLisle
Next up is clustering...

I've got about 350 structures to cluster and I've worked through the
example code from the RDKit Cookbook (
http://www.rdkit.org/docs/Cookbook.html#clustering-molecules).  All seems
well and good there, but I would like to see the dendrogram.  I see that
there is a ClusterVis module to generate images, PDF, and SVG, but all
require a Cluster object as input.  I don't find anywhere a description of
acquiring or building that object based upon the results of clustering.

Any tips?

-Kirk
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] GetSubstructMatch vs MMFFOptimize

2016-05-14 Thread Robert DeLisle
RDKitters,

I'm working on a project in which I want to align a collection of
structures with their most similar structures and display the results in
PyMOL.  To accomplish this, I've built a Python script similar to the one
attached here in which I start with pairs of structures, find the MCS of
those structures, create a template based on the MCS and a 3D conformation
of the structure of interest, and then generate a constrained conformation
of a query structure.  I tried to comment the attached code enough to lead
you through the process.

What I find is that quite often, the ConstrainedEmbed() function fails with
the error "molecule doesn't match the core" which seems very odd since the
pairs for which it fails are very similar.  The attached .png shows one
such pair and their MCS.

What I've found is that when I generate a 3D conformation for the first
structure and optimize it with MMFF (MMFFOptimize), this often causes
GetSubstructMatch to fail finding the MCS within the structure.  If instead
I used UFFOptimize, everything seems to work OK most of the time.

In my code, I've noted where the error occurs and flanked it with some
print statements to show what happens.  Specifically, at line 36 I have the
MMFFOptimize line, and at 37 the UFFOptimize line.   I've also attached a
set of structures for which MMFF fails.

While using UFFOptimize produces great results, I'm curious regarding why
MMFFOptimize creates a problem.  And, whether this is a bug which should be
fixed, or just a glitch related to atom typing and other parameterizations
that occur with MMFF.

Thanks for any explanation or ideas.

-Kirk
Struct1 Cc1cc(NC(=O)CSc2ccc3nnc(CCNC(=O)c4c4)n3n2)no1
Struct2 CCOc1c1NC(=O)CSc1ccc2nnc(CCNC(=O)c3ccc(C)cc3)n2n1
from copy import deepcopy

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw
from rdkit.Chem import rdFMCS

from rdkit.Chem import PyMol
pymol=PyMol.MolViewer()

#get the structures
mols = []
with open('Substruct.txt', 'r') as fin:
    for l in fin:
        arr = l.strip().split('\t')
        mols.append(Chem.MolFromSmiles(arr[1]))

#find the maximum common substructure
mcs = rdFMCS.FindMCS( mols, completeRingsOnly=True, 
 ringMatchesRingOnly=True )
mcs_mol = Chem.MolFromSmarts(mcs.smartsString)

#check the mcs - looks reasonable
z = [ AllChem.Compute2DCoords(m) for m in mols + [mcs_mol] ]
img = Draw.MolsToGridImage( mols + [mcs_mol], subImgSize=(300,300),
  legends = ['Struct1', 'Struct2', 'MCS'] )
img.save('Substruct.png')


#here's where the error occurs
#before MMFF optimization, GetSubstructMatch is correct
print(mols[0].GetSubstructMatch(mcs_mol))

#create a 3D structure for the first 
AllChem.EmbedMolecule(mols[0])
AllChem.MMFFOptimizeMolecule(mols[0])
#AllChem.UFFOptimizeMolecule(mols[0]) #UFF works!

#after MMFF optimization, substruct match no longer correct
print(mols[0].GetSubstructMatch(mcs_mol))

#create a template from the mcs and structure 1
mcs_match = mols[0].GetSubstructMatch(mcs_mol)
template = deepcopy(mols[0])
for i,a in enumerate( template.GetAtoms() ):
    if (i not in mcs_match):
        template.GetAtomWithIdx(i).SetAtomicNum(0)

template = Chem.DeleteSubstructs(template, Chem.MolFromSmarts('[#0]'))

#create a 3d structure for the second constrained to the mcs
mols[1] = AllChem.ConstrainedEmbed(mols[1], template)

#show the results in PyMOL
pymol.ShowMol(mols[0], name='Struct1')
pymol.Zoom('Struct1')
pymol.SetDisplayStyle('Struct1', 'sticks')
pymol.ShowMol(mols[1], name='Struct2', showOnly=False)





Re: [Rdkit-discuss] PyMOL from RDKit? (Resurrection)

2016-04-23 Thread Robert DeLisle
Paolo,

Thank you!  That works perfectly.  I'm not sure what step I was missing
before, but your script does the trick.


In working with this, I found myself wanting to save some PyMOL files
programmatically, but I see that there is not a Save option in the RDKit
PyMOL code.  I added the snip below to the MolViewer class and it seems to
work nicely.  I don't know if it is generally useful or if it should be
added to the code base - I'll let Greg make that decision.

-Kirk


  def SaveFile(self, filename):
    id = self.server.save(filename)
    return id



I've attached my modified PyMol.py file as well.




On Fri, Apr 22, 2016 at 3:08 PM, Paolo Tosco <paolo.to...@unito.it> wrote:

> Dear Robert,
>
> I have just built the latest PyMOL 1.8.2.0 on CentOS 7, I started it:
>
> pymol -R
>
> and then I ran the following Python script:
>
> #!/usr/bin/env python
>
> import os
> import rdkit
> from rdkit import Chem
> from rdkit.Chem import PyMol
> from rdkit.Chem import AllChem
>
> s = PyMol.MolViewer()
> mol = Chem.MolFromSmiles \
>   ('CCOCCn1c(C2CC[NH+](CCc3ccc(C(C)(C)C(=O)[O-])cc3)CC2)nc2ccccc21')
> mol = AllChem.AddHs(mol)
> AllChem.EmbedMolecule(mol)
> AllChem.MMFFOptimizeMolecule(mol)
> s.ShowMol(mol, name = 'bilastine', showOnly = False)
> s.Zoom('bilastine')
> s.SetDisplayStyle('bilastine', 'sticks')
>
> I obtained the expected display:
>
>
>
> Cheers,
> p.
>
>
> On 04/22/2016 09:09 PM, Robert DeLisle wrote:
>
> Back again!
>
> I apologize for resurrecting an old topic, but I'm once again trying to
> work with PyMOL through RDKit.  I've been following the approach in this
> thread (
> http://www.mail-archive.com/rdkit-discuss%40lists.sourceforge.net/msg00325.html)
> but it seems not to work any longer.  I'm using PyMOL 1.8 on Fedora and I
> see that the xml-rpc file is current, so that's no longer a problem.  When
> I step through the process and hit this step:
>
> s.ShowMol(m,name='ligand',showOnly=False)
>
>
> nothing happens in the PyMOL viewer.  It just remains blank.
>
> Any updates on operating with PyMOL?
>
> -Kirk
# $Id$
#
# Copyright (C) 2004-2012 Greg Landrum and Rational Discovery LLC
#
#   @@ All Rights Reserved @@
#  This file is part of the RDKit.
#  The contents are covered by the terms of the BSD license
#  which is included in the file license.txt, found at the root
#  of the RDKit source tree.
#
""" uses pymol to interact with molecules

"""
from rdkit import Chem
import os, tempfile

# Python3 compatibility
try:
  from xmlrpclib import Server
except ImportError:
  from xmlrpc.client import Server


_server=None
class MolViewer(object):
  def __init__(self,host=None,port=9123,force=0,**kwargs):
    global _server
    if not force and _server is not None:
      self.server=_server
    else:
      if not host:
        host=os.environ.get('PYMOL_RPCHOST','localhost')
      _server=None
      serv = Server('http://%s:%d'%(host,port))
      serv.ping()
      _server = serv
      self.server=serv
    self.InitializePyMol()

  def InitializePyMol(self):
    """ does some initializations to set up PyMol according to our
    tastes

    """
    self.server.do('set valence,1')
    self.server.do('set stick_rad,0.15')
    self.server.do('set mouse_selection_mode,0')
    self.server.do('set line_width,2')
    self.server.do('set selection_width,10')
    self.server.do('set auto_zoom,0')


  def DeleteAll(self):
    " blows out everything in the viewer "
    self.server.deleteAll()

  def DeleteAllExcept(self,excludes):
    " deletes everything except the items in the provided list of arguments "
    allNames = self.server.getNames('*',False)
    for nm in allNames:
      if nm not in excludes:
        self.server.deleteObject(nm)

  def LoadFile(self,filename,name,showOnly=False):
    """ calls pymol's "load" command on the given filename; the loaded object
    is assigned the name "name"
    """
    if showOnly:
      self.DeleteAll()
    id = self.server.loadFile(filename,name)
    return id

  def SaveFile(self, filename):
    id = self.server.save(filename)
    return id

[Rdkit-discuss] Aligning in 3D

2016-04-22 Thread Robert DeLisle
In working with RDKit I've been able to align 2D structures based upon a
common core of MCS using

AllChem.GenerateDepictionMatching2DStructure(m,p)

The next step for me is to generate 3D structures and align them based upon
that same common core.  Obviously this leads to multiple steps, not the
least of which is generating conformations that are consistent across the
common core for the various molecules.  I seem to recall the ability to
generate a conformation and minimize it (either UFF or MMFF) and apply
constraints based upon an input substructure, but I cannot find the details.

Any tips to accomplish this one?

-Kirk


[Rdkit-discuss] PyMOL from RDKit? (Resurrection)

2016-04-22 Thread Robert DeLisle
Back again!

I apologize for resurrecting an old topic, but I'm once again trying to
work with PyMOL through RDKit.  I've been following the approach in this
thread (
http://www.mail-archive.com/rdkit-discuss%40lists.sourceforge.net/msg00325.html)
but it seems not to work any longer.  I'm using PyMOL 1.8 on Fedora and I
see that the xml-rpc file is current, so that's no longer a problem.  When
I step through the process and hit this step:

s.ShowMol(m,name='ligand',showOnly=False)

nothing happens in the PyMOL viewer.  It just remains blank.

Any updates on operating with PyMOL?

-Kirk


Re: [Rdkit-discuss] cairoCanvas.py errors?

2016-04-18 Thread Robert DeLisle
Even better!  Thanks, Greg.

On Mon, Apr 18, 2016, 12:37 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Kirk,
>
> Welcome back!
> Those were fixed for the 2015.09 release:
> https://github.com/rdkit/rdkit/pull/644
>
> Best,
> -greg
>
>
> On Mon, Apr 18, 2016 at 1:11 AM, Robert DeLisle <rkdeli...@gmail.com>
> wrote:
>
>> Long time no message!
>>
>> Anywho, I've been working today with RDKit 2015.03.01 and in the process
>> of generating a grid of molecule depictions (Draw.MolsToGridImage()), I
>> received the error message below.
>>
>> From the last line, it seems there has been an API change that changes
>> tostring() to tobytes().  I also found that fromstring() needs to change to
>> frombytes().
>>
>> When I made these changes and saved the results, everything works fine.
>> I thought it might be useful to know given the upcoming release.
>>
>> -Kirk
>>
>>
>>
>> Traceback (most recent call last):
>>   File "GenerateStructFigures.py", line 57, in <module>
>>     legends = lbls)
>>   File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/__init__.py", line 316, in MolsToGridImage
>>     **kwargs),(col*subImgSize[0],row*subImgSize[1]))
>>   File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/__init__.py", line 94, in MolToImage
>>     img,canvas=_createCanvas(size)
>>   File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/__init__.py", line 50, in _createCanvas
>>     canvas = Canvas(img)
>>   File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/cairoCanvas.py", line 67, in __init__
>>     imgd = image.tostring("raw","BGRA")
>>   File "/usr/lib64/python2.7/site-packages/PIL/Image.py", line 686, in tostring
>>     "Please call tobytes() instead.")
>> Exception: tostring() has been removed. Please call tobytes() instead.
>>


[Rdkit-discuss] cairoCanvas.py errors?

2016-04-17 Thread Robert DeLisle
Long time no message!

Anywho, I've been working today with RDKit 2015.03.01 and in the process of
generating a grid of molecule depictions (Draw.MolsToGridImage()), I
received the error message below.

From the last line, it seems there has been an API change that changes
tostring() to tobytes().  I also found that fromstring() needs to change to
frombytes().

When I made these changes and saved the results, everything works fine.  I
thought it might be useful to know given the upcoming release.

-Kirk



Traceback (most recent call last):
  File "GenerateStructFigures.py", line 57, in <module>
    legends = lbls)
  File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/__init__.py", line 316, in MolsToGridImage
    **kwargs),(col*subImgSize[0],row*subImgSize[1]))
  File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/__init__.py", line 94, in MolToImage
    img,canvas=_createCanvas(size)
  File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/__init__.py", line 50, in _createCanvas
    canvas = Canvas(img)
  File "/storage/software/RDKit/RDKit_current/rdkit/Chem/Draw/cairoCanvas.py", line 67, in __init__
    imgd = image.tostring("raw","BGRA")
  File "/usr/lib64/python2.7/site-packages/PIL/Image.py", line 686, in tostring
    "Please call tobytes() instead.")
Exception: tostring() has been removed. Please call tobytes() instead.
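
Code that has to run against both older and newer imaging libraries can cope with the rename by calling whichever method the image object provides. The sketch below is one assumed way to write such a shim and is not the fix that was applied in RDKit; `image_to_bytes` and the two stand-in classes are hypothetical names, and the stand-ins exist only so the example runs without PIL or Pillow installed.

```python
def image_to_bytes(image, *args):
    """Serialize an image with tobytes() when available, else tostring()."""
    method = getattr(image, "tobytes", None)
    if method is None:
        method = image.tostring  # older releases only offer tostring()
    return method(*args)


class OldStyleImage(object):
    """Stand-in for an image object that only has tostring()."""

    def tostring(self, *args):
        return b"old:" + ",".join(args).encode("ascii")


class NewStyleImage(object):
    """Stand-in for an image object where tostring() now raises."""

    def tobytes(self, *args):
        return b"new:" + ",".join(args).encode("ascii")

    def tostring(self, *args):
        raise Exception("tostring() has been removed. "
                        "Please call tobytes() instead.")


old_result = image_to_bytes(OldStyleImage(), "raw", "BGRA")
new_result = image_to_bytes(NewStyleImage(), "raw", "BGRA")
print(old_result, new_result)
```

Checking for `tobytes` first matters: on the newer object `tostring` still exists as an attribute, so testing for `tostring` first would pick the method that raises.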


[Rdkit-discuss] RDKit from Java

2014-11-30 Thread Robert DeLisle
RDkit-ers,

I've been working with RDKit from Java for a while now and I'm spinning my
wheels due to being too new to Java.  I'm very comfortable with RDKit from
Python, but Java is a new animal for me.  I've downloaded the RDKit Java
binaries and I have this:

boost_system-vc100-mt-1_51.dll
GraphMolWrap.dll
org.RDKit.jar
org.RDKitDoc.jar

The two DLLs are most likely C++ libraries that are compiled into .dll so
that the code can use them.  I know this is true for boost and I'm guessing
the GraphMolWrap.dll is similar.

The two .jar files are the pieces of interest, but I cannot seem to find
any documentation on RDKit from Java to get started.  I can find some other
examples - mostly the KNIME nodes for RDKit - that give me some clues
toward function names, etc. but I'm stuck as to how to even get started.

I did find this:  https://code.google.com/p/rdkit/wiki/SwigExperiment

At the bottom I see a Jython console session, but I'm just not able to
convert this into a .java file which I can compile with javac and then
actually run.

Any tips on how to import libraries into a very simple chunk of Java code?
Or better yet, 5-10 lines of a .java file that does something mindlessly
simple would be great to help me get started.

-Kirk


Re: [Rdkit-discuss] RDKit from Java

2014-11-30 Thread Robert DeLisle
By the way, I'm looking through the Java wrapper and I'm not seeing any
functions that would provide access to the 2D depiction code from Java.
Does that exist and I'm just not seeing it?

-Kirk



On Sun, Nov 30, 2014 at 9:37 PM, Robert DeLisle <rkdeli...@gmail.com> wrote:

> Thanks, Greg!  That helps a lot.  I think I'm on the right track now, and
> if I can wrap my head around how to get all the dependencies to talk to
> each other appropriately (DLLs, etc.), I should be on my way.  Any tips on
> configuration are always appreciated.  (This is what I get for venturing
> into Java world, right?)
>
> As for the archaeology, like I tell everyone - I'm thorough.  8^D
>
> -Kirk
>
> On Sun, Nov 30, 2014 at 8:32 PM, Greg Landrum <greg.land...@gmail.com>
> wrote:
>
>> Hi Kirk,
>>
>> On Mon, Dec 1, 2014 at 2:14 AM, Robert DeLisle <rkdeli...@gmail.com>
>> wrote:
>>
>>> I've been working with RDKit from Java for a while now and I'm spinning
>>> my wheels due to being too new to Java.  I'm very comfortable with RDKit
>>> from Python, but Java is a new animal for me.  I've downloaded the RDKit
>>> Java binaries and I have this:
>>>
>>> boost_system-vc100-mt-1_51.dll
>>> GraphMolWrap.dll
>>> org.RDKit.jar
>>> org.RDKitDoc.jar
>>>
>>> The two DLLs are most likely C++ libraries that are compiled into .dll
>>> so that the code can use them.  I know this is true for boost and I'm
>>> guessing the GraphMolWrap.dll is similar.
>>
>> Exactly.
>>
>>> The two .jar files are the pieces of interest, but I cannot seem to find
>>> any documentation on RDKit from Java to get started.  I can find some other
>>> examples - mostly the KNIME nodes for RDKit - that give me some clues
>>> toward function names, etc. but I'm stuck as to how to even get started.
>>
>> Yeah, the code for the knime nodes has too much knime and not enough
>> RDKit to be useful as a place to learn.
>>
>>> I did find this:  https://code.google.com/p/rdkit/wiki/SwigExperiment
>>>
>>> At the bottom I see a Jython console session, but I'm just not able to
>>> convert this into a .java file which I can compile with javac and then
>>> actually run.
>>
>> Wow; that's ancient. Nice archaeology to find it. :-)
>>
>>> Any tips on how to import libraries into a very simple chunk of Java
>>> code? Or better yet, 5-10 lines of a .java file that does something
>>> mindlessly simple would be great to help me get started.
>>
>> The Java (and C#) wrappers are under-documented and there's very little
>> sample code out there.
>> Probably your best bet is the testing code for the java wrapper:
>>
>> https://github.com/rdkit/rdkit/tree/master/Code/JavaWrappers/gmwrapper/src-test/org/RDKit
>> This isn't comprehensive, but it does contain at least a starting point
>> for most of the functionality.
>>
>> Best,
>> -greg







Re: [Rdkit-discuss] RDKit from Java

2014-11-30 Thread Robert DeLisle
Thanks again, Greg!

I can imagine 2D depiction is a tricky bit of code.  I'll leave that one to
the experts.

All the best!

Kirk

On Nov 30, 2014 9:47 PM, Greg Landrum <greg.land...@gmail.com> wrote:

> On Mon, Dec 1, 2014 at 5:39 AM, Robert DeLisle <rkdeli...@gmail.com>
> wrote:
>
>> By the way, I'm looking through the Java wrapper and I'm not seeing any
>> functions that would provide access to the 2D depiction code from Java.
>> Does that exist and I'm just not seeing it?
>
> The only thing currently available is the ToSVG() method that's on the
> ROMol class. This is what is used within Knime.
>
> I do really hope to have better depiction options available for the next
> release -- Dave Cosgrove submitted an excellent starting point earlier this
> year --  but that's a time-consuming bit to get right.
>
> -greg




[Rdkit-discuss] RDKit on Win7 - DLL load failed

2014-11-09 Thread Robert DeLisle
Hi again, all!

I'm trying to install RDKit on a 64-bit Windows 7 instance (in
VirtualBox).  I've done the following:

installed Python 2.7 (32-bit)
installed NumPy (for Python 2.7 32-bit)
installed PIL (for Python 2.7 32-bit)

environment variables are:

RDBASE = c:\RDKit_2014_09_1
PYTHONPATH = %RDBASE%
PATH = %PATH%;%RDBASE%\lib


From a Python instance, I get this:

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import rdkit
>>> from rdkit import Chem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\RDKit_2014_09_1\rdkit\Chem\__init__.py", line 18, in <module>
    from rdkit import rdBase
ImportError: DLL load failed: %1 is not a valid Win32 application.



I've searched the discuss archives and found details about making sure the
VC++ redistributables are present - they are.  I see in RDKit\lib there are
two files named *vc100*.dll, so I assume msvcp100.dll and msvcr100.dll are
the correct versions.  I've tried moving them to the RDKit\lib folder - no
luck.  I've also tried renaming any/all of them without the *100* version
stamp - again, no luck.

Running dependency walker, PYTHON27.DLL


Re: [Rdkit-discuss] RDKit on Win7 - DLL load failed

2014-11-09 Thread Robert DeLisle
OOPS!



Re: [Rdkit-discuss] RDKit on Win7 - DLL load failed

2014-11-09 Thread Robert DeLisle
Let's try this one last time.  Somehow I got two early sends of that
e-mail.  I apologize for the now triple post!

As I was saying...

I'm trying to install RDKit on a 64-bit Windows 7 instance (in
VirtualBox).  I've done the following:

installed Python 2.7 (32-bit)
installed NumPy (for Python 2.7 32-bit)
installed PIL (for Python 2.7 32-bit)

environment variables are:

RDBASE = c:\RDKit_2014_09_1
PYTHONPATH = %RDBASE%
PATH = %PATH%;%RDBASE%\lib
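
The three settings above can be sanity-checked from Python before attempting the import. This is a minimal sketch using only the standard library; `check_rdkit_env` is a hypothetical helper written for this note, not part of RDKit, and it only verifies that the variables are consistent with each other, not that the DLLs themselves load. It uses `ntpath` so that Windows-style paths are compared the same way on any platform.

```python
import ntpath
import os


def _norm(path):
    """Normalize a Windows path for case-insensitive comparison."""
    return ntpath.normcase(ntpath.normpath(path))


def check_rdkit_env(env):
    """Return a list of problems found in a Windows-style environment mapping."""
    rdbase = env.get("RDBASE")
    if not rdbase:
        return ["RDBASE is not set"]
    problems = []
    python_path = [_norm(p) for p in env.get("PYTHONPATH", "").split(";") if p]
    if _norm(rdbase) not in python_path:
        problems.append("PYTHONPATH does not contain RDBASE")
    search_path = [_norm(p) for p in env.get("PATH", "").split(";") if p]
    if _norm(ntpath.join(rdbase, "lib")) not in search_path:
        problems.append("PATH does not contain RDBASE\\lib")
    return problems


# Check the environment of the running process (meaningful on Windows).
print(check_rdkit_env(dict(os.environ)))
```

An empty list means the variables agree with the layout described in this message; it says nothing about 32-bit versus 64-bit mismatches.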


From a Python instance, I get this:

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import rdkit
>>> from rdkit import Chem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\RDKit_2014_09_1\rdkit\Chem\__init__.py", line 18, in <module>
    from rdkit import rdBase
ImportError: DLL load failed: %1 is not a valid Win32 application.



I've searched the discuss archives and found details (links below) about
making sure the VC++ redistributables are present - they are.  I see in
RDKit\lib there are two files named *vc100*.dll, so I assume msvcp100.dll
and msvcr100.dll are the correct versions.  I've tried moving them to the
RDKit\lib folder - no luck.  I've also tried renaming any/all of them
without the *100* version stamp - again, no luck.

Running Dependency Walker, PYTHON27.DLL is marked as not found.  I do see
it in C:\Windows\SysWOW64, however.  It also appears that all of the marked
DLLs have a 64 next to them, suggesting everything has been compiled for
64-bit.  I've double-checked that I am indeed using the 32-bit version of
RDKit.  (I tried going to 64-bit, but I find that NumPy isn't available for
64-bit Python.)
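
The "%1 is not a valid Win32 application" error usually points to a 32-bit/64-bit mismatch between the interpreter and a DLL it tries to load. A minimal sketch of how that could be checked from Python, using only the standard library: it reports the pointer size of the running interpreter and reads the machine field from a DLL's PE header. The helper names (python_bits, pe_machine, describe) are hypothetical; the header offsets follow the PE/COFF layout.

```python
import struct
import sys

# Machine-type codes from the PE/COFF header.
MACHINE_NAMES = {0x014C: "32-bit (x86)", 0x8664: "64-bit (x64)"}


def python_bits():
    """Pointer size of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


def pe_machine(path):
    """Return the machine field from the COFF header of a PE file."""
    with open(path, "rb") as fh:
        if fh.read(2) != b"MZ":
            raise ValueError("not a PE file: missing MZ signature")
        fh.seek(0x3C)  # offset of the PE header is stored here
        (pe_offset,) = struct.unpack("<I", fh.read(4))
        fh.seek(pe_offset)
        if fh.read(4) != b"PE\0\0":
            raise ValueError("not a PE file: missing PE signature")
        (machine,) = struct.unpack("<H", fh.read(2))
    return machine


def describe(path):
    """Human-readable bitness of a DLL or EXE."""
    machine = pe_machine(path)
    return MACHINE_NAMES.get(machine, "unknown (0x%04x)" % machine)


if __name__ == "__main__":
    print("Python is %d-bit" % python_bits())
    for dll in sys.argv[1:]:
        print("%s: %s" % (dll, describe(dll)))
```

Run it with the paths of the DLLs in RDKit\lib as arguments; any file reported as 64-bit cannot be loaded by a 32-bit Python.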

The last thing I notice that seems off is that within the
RDKit_2014_09_1.win32.py27.zip file, the compressed directory is actually
titled RDKit_2014_03_1.  I assume this is just a typo, but is it the right
version?

Any help is greatly appreciated.

-Kirk


RDKit-discuss archive links:
http://www.mail-archive.com/rdkit-discuss%40lists.sourceforge.net/msg02558.html
http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02381.html






Re: [Rdkit-discuss] RDKit on Win7 - DLL load failed

2014-11-09 Thread Robert DeLisle
Done and done!

Downloaded, unzipped, and working!  Thank you, Greg.  And, no worries at
all.  If I had a nickel for every silly mistake I've made... well... how
many stars are there in the universe?

Thanks also for the links to 64-bit NumPy.  I'll definitely give those a go.

-Kirk





On Sun, Nov 9, 2014 at 9:26 PM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Kirk,

 It looks like I made a stupid mistake when creating the win32 binaries and
 zipped the wrong directory. :-(

 I just replaced the win32 binaries on both github and sf.net with new
 versions that should be correct.

 FYI, you can get win64 python binaries for numpy and many other useful
 packages here: http://www.lfd.uci.edu/~gohlke/pythonlibs/

 -greg





[Rdkit-discuss] RDKit from Java

2014-08-29 Thread Robert DeLisle
Hello, all.  Long time, no see.

I have a project in which an application is being developed in Java and I
would like to use some of the RDKit functionality to enhance it.  I can
easily write the Python code to do what I need, but I need to get that into
a form that can be accessed from Java.

The only solution I've come up with is to use something akin to py2exe
which has the nice feature of not requiring the full Python and RDKit
installation on the target machine, but would require some type of
intermediate step (probably a file process) to pass data between Java and
the .exe.  Ideally, it would be nice to pass the results through
interfaces, but that's being quite hopeful.


I've searched through the RDKit-discuss archives for this type of thing,
but I haven't seen anything that really answers my question.  Also, I know
there are RDKit KNIME nodes, so surely there's a direct way to do this that
I'm not aware of.
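
One way the intermediate step could look, sketched under assumptions: rather than exchanging files, the Python side runs as a long-lived worker that reads one SMILES per line on standard input and answers with one JSON object per line on standard output. A Java program could start it with ProcessBuilder and talk to it over the process streams. The names describe, serve, and main are hypothetical, and the two properties returned are only placeholders for whatever RDKit calculation is actually needed.

```python
import json
import sys


def describe(smiles):
    """Compute a few example properties with RDKit (imported lazily)."""
    from rdkit import Chem
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES")
    return {"canonical": Chem.MolToSmiles(mol), "num_atoms": mol.GetNumAtoms()}


def serve(lines, out, func=describe):
    """Answer each non-empty input line with one JSON object on its own line."""
    for line in lines:
        query = line.strip()
        if not query:
            continue
        try:
            reply = {"input": query, "ok": True, "result": func(query)}
        except Exception as exc:  # report the failure instead of dying
            reply = {"input": query, "ok": False, "error": str(exc)}
        out.write(json.dumps(reply, sort_keys=True) + "\n")
        out.flush()  # let the calling process see each reply immediately


def main():
    """Entry point when the file is run as a worker script."""
    serve(sys.stdin, sys.stdout)
```

To use it as a script, call main() under the usual `if __name__ == "__main__":` guard. Because serve takes the calculation as a parameter, the protocol can be tried out without RDKit installed by passing any function that maps a string to a dictionary.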

Any suggestions or tips are greatly appreciated!

-Kirk


Re: [Rdkit-discuss] Editable molecule confusion

2013-05-31 Thread Robert DeLisle
Another attempt - see Code Block 3, below.  In this case, I construct the
ring systems from the ground up using an EditableMol.  Once again it fails
during sanitization, but now I think I know why.  The original structure
has an indole with a substituted nitrogen.  During building, that nitrogen
does not have a hydrogen attached, so the valence is not satisfied and
sanitization fails.  If I change this to an indane, it works just fine.

The problem is, I cannot add hydrogens to the nitrogen until after the
EditableMol is converted to a Mol, but I cannot convert it to a Mol until
hydrogens are added.  All of this would require some fairly sophisticated
logic about the nitrogen which I'm not sure I want to include for this
simple task.




Code Block 3:
from rdkit import Chem
from rdkit.Chem import AllChem

sdin = Chem.SDMolSupplier('test.sdf')
sdout = Chem.SDWriter('rings.sdf')

for m in sdin:

  em = Chem.EditableMol(Chem.Mol())
  indexmap = {}   # original atom index -> index in the new molecule

  # copy ring atoms only
  for a in m.GetAtoms():
    if ( a.IsInRing() ):
      indexmap[a.GetIdx()] = em.AddAtom(Chem.Atom(a.GetAtomicNum()))

  # copy ring bonds only, translating the atom indices
  for b in m.GetBonds():
    if ( b.IsInRing() ):
      em.AddBond( indexmap[b.GetBeginAtomIdx()],
                  indexmap[b.GetEndAtomIdx()],
                  b.GetBondType() )

  # each disconnected fragment is one ring system
  for nm in Chem.GetMolFrags(em.GetMol(), asMols=True):
    AllChem.Compute2DCoords(nm)
    sdout.write(nm)


On Fri, May 31, 2013 at 2:41 PM, Robert DeLisle rkdeli...@gmail.com wrote:

 I am attempting to reduce a molecule (attached SDF) to just its ring
 systems using Code Block 1 at the bottom.

 The problem is that when I get through the loops removing non-ring
 atoms/bonds, and convert the EditableMol back to a Mol, I end up with 7
 disjoint sets of atoms:

 ((0, 1, 2, 3, 4, 5, 6, 7, 8), (9, 10, 11, 12, 13, 14), (15,), (16,), (17,
 20), (18,), (19,))

 It appears that when I remove an atom from the EditableMol by index, the
 indices are reassigned.  I tried to test this with the inelegant code in
 Code Block 2, which gives me the expected sets of atom indices with respect
 to number and size:

 ((0, 1, 2, 3, 4, 5, 6, 7, 8), (9, 10, 11, 12, 13, 14), (15, 16, 17, 18,
 19, 20))

 - but it still fails to sanitize when I convert back to a Mol.
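
 The index shift suspected above can be illustrated with a plain Python list standing in for the atom list. This is only an analogy and assumes that removing an atom renumbers the atoms after it; the function name remove_by_index is hypothetical. Deleting in ascending order uses indices that are already stale, while deleting from the highest index down leaves the lower indices valid.

```python
def remove_by_index(items, doomed, descending):
    """Delete the given positions from a copy of items, one at a time."""
    result = list(items)
    for idx in sorted(doomed, reverse=descending):
        del result[idx]
    return result


atoms = ["C0", "C1", "O2", "C3", "N4", "C5"]
doomed = [2, 4]  # positions recorded before any deletion

# Ascending order: after O2 is gone, index 4 points at C5 instead of N4.
print(remove_by_index(atoms, doomed, descending=False))  # ['C0', 'C1', 'C3', 'N4']

# Descending order: lower indices are untouched by earlier deletions.
print(remove_by_index(atoms, doomed, descending=True))   # ['C0', 'C1', 'C3', 'C5']
```

 Under that assumption, collecting the indices of non-ring atoms first and removing them in descending order would avoid rebuilding the EditableMol after every removal, as Code Block 2 does.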

 What am I missing here?  Also, is there an easier (ie, existing) way to do
 this?  I'm just looking to reduce the molecule to its ring systems and
 write those to an SD file.

 -Kirk


 Code block 1:
 from rdkit import Chem

 sdin = Chem.SDMolSupplier('test.sdf')
 sdout = Chem.SDWriter('rings.sdf')

 for m in sdin:

   print len(m.GetBonds()),len(m.GetAtoms())
   em = Chem.EditableMol(m)

   for a in m.GetAtoms():
     if ( not a.IsInRing() ):
       em.RemoveAtom(a.GetIdx())
       print a.GetIdx(), m.GetAtomWithIdx(a.GetIdx()).GetSymbol()

   for b in m.GetBonds():
     if ( not b.IsInRing() ):
       a1 = b.GetBeginAtomIdx()
       a2 = b.GetEndAtomIdx()
       em.RemoveBond(a1,a2)

   m3 = em.GetMol()
   print len(m3.GetBonds()), len(m3.GetAtoms())

   f = Chem.GetMolFrags(em.GetMol())
   print f

   #for f in Chem.GetMolFrags(m3,asMols = True):
     #sdout.write(f)

 Code block 2:
 from rdkit import Chem

 sdin = Chem.SDMolSupplier('test.sdf')
 sdout = Chem.SDWriter('rings.sdf')

 for m in sdin:

   print len(m.GetBonds()),len(m.GetAtoms())
   em = Chem.EditableMol(m)

   active = True

   while ( active == True ):
     active = False
     for a in m.GetAtoms():
       if ( not a.IsInRing() ):
         print a.GetIdx(), m.GetAtomWithIdx(a.GetIdx()).GetSymbol()
         em.RemoveAtom(a.GetIdx())
         active = True
         m=em.GetMol()
         em = Chem.EditableMol(m)
         break

   active = True
   while ( active == True):
     active = False
     for b in m.GetBonds():
       if ( not b.IsInRing() ):
         active = True
         a1 = b.GetBeginAtomIdx()
         a2 = b.GetEndAtomIdx()
         em.RemoveBond(a1,a2)
         m=em.GetMol()
         em = Chem.EditableMol(m)
         break

   m3 = em.GetMol()
   print len(m3.GetBonds()), len(m3.GetAtoms())

   f = Chem.GetMolFrags(em.GetMol())
   print f

   #for f in Chem.GetMolFrags(m3,asMols = True):
     #sdout.write(f)





[Rdkit-discuss] RDKit_2012_09_1 build errors

2012-12-11 Thread Robert DeLisle
Long time no email.

I'm attempting to build RDKit on CentOS 5.8 and I'm getting the following
error:

In file included from
/usr/local/include/boost/fusion/include/std_pair.hpp:10:0,
 from /usr/local/include/boost/math/tools/tuple.hpp:90,
 from
/usr/local/include/boost/math/special_functions/detail/igamma_inverse.hpp:13,
 from
/usr/local/include/boost/math/special_functions/gamma.hpp:1543,
 from
/usr/local/include/boost/math/special_functions/detail/bessel_jy.hpp:14,
 from
/usr/local/include/boost/math/special_functions/bessel.hpp:17,
 from
/usr/local/include/boost/math/special_functions.hpp:18,
 from
/usr/local/include/boost/random/generate_canonical.hpp:22,
 from /usr/local/include/boost/random.hpp:52,
 from /opt/RDKit_current/Code/RDGeneral/utils.h:17,
 from /opt/RDKit_current/Code/RDGeneral/utils.cpp:11:
/usr/local/include/boost/fusion/adapted/std_pair.hpp:17:1: error: ‘access’
is not a class or namespace
/usr/local/include/boost/fusion/adapted/std_pair.hpp:17:1: error: expected
unqualified-id before ‘’ token
/usr/local/include/boost/fusion/adapted/std_pair.hpp:17:1: error: ‘access’
is not a class or namespace
/usr/local/include/boost/fusion/adapted/std_pair.hpp:17:1: error: expected
unqualified-id before ‘’ token


Clearly seems to be a boost problem, but I'm just not able to track it
down.  I followed these instructions:
http://code.google.com/p/rdkit/wiki/BuildingOnCentOS57, and it appears
boost 1.48 built OK.

Any tips?

-Kirk


[Rdkit-discuss] Fwd: Building on CentOS 5.8: Python-related tests fail

2012-06-22 Thread Robert DeLisle
Just my $0.02.  You may realize and have tried all of this already, but...

I've spent a lot of time getting RDKit built on CentOS since version
5.4.  The newer versions make this much easier with updated CMake,
GCC, etc.  One problem that I've had is trying to build while CentOS'
standard (i.e., old) Boost libraries are still installed.  I know that
CMake has some flags for setting the Boost library location, but I
could never get them to work: the build would not see the newly built
Boost library while the system standard was present.  The only thing
that worked for me was to remove the system Boost and build my own.
The make system then finds the custom build without a problem.

-Kirk




On Fri, Jun 22, 2012 at 9:46 AM, Leonardo Trabuco ltrab...@gmail.com wrote:
 Hi Greg,

 Thanks for following up. Below is the output you asked for. Looks like an
 import error in the boost library. Any ideas?

 Thanks again,
 Leo

 UpdateCTestConfiguration  from
 :/net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/DartConfiguration.tcl
 Start processing tests
 UpdateCTestConfiguration  from
 :/net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/DartConfiguration.tcl
 Test project
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build
 Constructing a list of tests
 Done constructing a list of tests
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/RDGeneral
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/DataStructs
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/DataStructs/Wrap
   3/ 76 Testing pyBV
 Test command: /usr/bin/python2.6
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/Code/DataStructs/Wrap/testBV.py
 Test timeout computed to be: 9.99988e+06
 Traceback (most recent call last):
   File "/net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/Code/DataStructs/Wrap/testBV.py", line 1, in <module>
     from rdkit import DataStructs
   File "/net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/rdkit/DataStructs/__init__.py", line 11, in <module>
     from rdkit import rdBase
 ImportError: /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/boost/lib/libboost_python.so.1.49.0: undefined symbol: Py_InitModule4
 -- Process completed
 ***Failed
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/Geometry
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/Geometry/Wrap
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/Numerics
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/Numerics/Alignment
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/Numerics/Alignment/Wrap
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/Numerics/Optimizer
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/ForceField
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/DistGeom
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/DistGeom/Wrap
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/Depictor
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/Depictor/Wrap
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/SmilesParse
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/FileParsers
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/Substruct
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/ChemReactions
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/ChemReactions/Wrap
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/ChemTransforms
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/Subgraphs
 Changing directory into
 /net/netfile2/ag-russell/install/CentOS-5.8-x86_64/RDKit_2012_03_1/build/Code/GraphMol/FragCatalog
 Changing directory into
 

[Rdkit-discuss] Giant SD file with RDKit

2011-11-21 Thread Robert DeLisle
RDKit-sters,

I'm working with a huge SD file that, by every way I can measure it, contains
~5,050,000 structures.  (This is an eMolecules dataset.)  In processing the
file, I've run into an odd error.  Even with the following very simple
code, the file seems to be bottomless.  I let it run overnight and I saw
numbers as high as 42,000,000.

Any ideas?

-Kirk



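from rdkit import Chem

# placeholder path; the filename was missing from the original post
sdin = Chem.SDMolSupplier('input.sdf')

for i, m in enumerate(sdin):
    # report progress every 10 structures
    if i % 10 == 0:
        print('Structure #' + str(i))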


Re: [Rdkit-discuss] Giant SD file with RDKit

2011-11-21 Thread Robert DeLisle
Eddie,

Thanks for the quick response.

I checked the file as you suggested and I get this:

000 2424 2424 000a
005

So it appears to end with (0x0a), correct?

Getting the file to you might be a trick as it is over 4 GB compressed.

My intention was to partition the file into multiple, smaller files, but
this weird error occurred.

-Kirk





On Mon, Nov 21, 2011 at 11:42 AM, Eddie Cao eddie@me.com wrote:

 Hi Robert,

 It might help to create a small SD file consisting only of the last few
 structures in the SD file to make sure the error is not caused by the file
 not ending properly. Specifically, the latest RDKit release has a bug
 that causes it to get stuck if the file does not end with a line-feed
 character (0x0a). An easy way to check is to run `tail -1 INPUT.sdf | hexdump`.
 If the last character is not 0a, then you are a victim of this bug. The
 following example uses a bad SDF that ends with character 24:

 $ tail -1 test.sdf | hexdump
 000 24 24 24 24
 004
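
 The same check can be sketched in Python for systems without tail and hexdump. It reads only the final byte, so it stays cheap on a multi-gigabyte file. The function name ends_with_newline is hypothetical.

```python
import os


def ends_with_newline(path):
    """Return True if the file's final byte is a line feed (0x0a)."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() == 0:
            return False  # an empty file has no final byte
        fh.seek(-1, os.SEEK_END)  # step back one byte from the end
        return fh.read(1) == b"\n"
```

 A file whose last line is a bare `$$$$` with no line feed after it would return False here.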


 If you provide a link to the SD file, I can also help you check.

 Eddie




Re: [Rdkit-discuss] Giant SD file with RDKit

2011-11-21 Thread Robert DeLisle
Andrew,

Good catch!  I had wondered if there might be a size problem but couldn't
make the connection that you made.  I'll find another method to partition
the file.
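
One possible way to partition the file without going through SDMolSupplier at all is to stream it line by line and cut at the `$$$$` record delimiter. This is a minimal sketch, not the method actually used here; the function names, the output naming scheme, and the per_file parameter are all assumptions. It never seeks, so file offsets beyond 32 bits do not matter, and it works on bytes so unusual characters in data fields pass through untouched.

```python
def iter_sd_records(lines):
    """Yield each SD record as a list of byte lines, ending with its $$$$ line."""
    record = []
    for line in lines:
        record.append(line)
        if line.rstrip(b"\r\n") == b"$$$$":
            yield record
            record = []
    if any(l.strip() for l in record):
        yield record  # trailing record that lacks a terminator


def split_sd(path, per_file, prefix):
    """Write consecutive groups of per_file records to prefix_0001.sdf, ..."""
    written = []
    out = None
    try:
        with open(path, "rb") as src:
            for count, record in enumerate(iter_sd_records(src)):
                if count % per_file == 0:
                    # start a new output file every per_file records
                    if out is not None:
                        out.close()
                    name = "%s_%04d.sdf" % (prefix, count // per_file + 1)
                    out = open(name, "wb")
                    written.append(name)
                out.writelines(record)
    finally:
        if out is not None:
            out.close()
    return written
```

For example, split_sd('big.sdf', 100000, 'part') would produce part_0001.sdf, part_0002.sdf, and so on, each small enough for the regular supplier.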

-Kirk



On Mon, Nov 21, 2011 at 12:01 PM, Andrew Dalke da...@dalkescientific.com wrote:

 On Nov 21, 2011, at 7:47 PM, Robert DeLisle wrote:
  Getting the file to you might be a trick as it is over 4 GB compressed.

 I think that's a clue.

 RDKit uses tell/seek operations on the underlying file stream, like this:


  ROMol *SDMolSupplier::next() {
    PRECONDITION(dp_inStream,"no stream");
    // set the stream to the current position
    dp_inStream->seekg(d_molpos[d_last]);


 d_molpos contains std::streampos elements,

 MolSupplier.h:    std::vector<std::streampos> d_molpos; // vector of positions in the file for molecules


 and I can't tell if that's a 32-bit or 64-bit value, but there's
 code which assumes it's an unsigned 32-bit integer:

  std::string SDMolSupplier::getItemText(unsigned int idx){
    PRECONDITION(dp_inStream,"no stream");
    unsigned int holder=d_last;
    moveTo(idx);
    unsigned int begP=d_molpos[idx];
    unsigned int endP;
    try {


 My guess is that there's an overflow in this code, causing it to
 loop from 2**32 back to 0.


Andrew
da...@dalkescientific.com






Re: [Rdkit-discuss] Giant SD file with RDKit

2011-11-21 Thread Robert DeLisle
Andrew,

In thinking about this, an unsigned 32-bit integer should give me over 4
billion values, and a signed 32-bit gives 2 billion.  I know that the file
has slightly over 5 million structures and ~300 million lines.  Neither of
these is over the limit, so I wouldn't expect an overflow.


-Kirk





Re: [Rdkit-discuss] Giant SD file with RDKit

2011-11-21 Thread Robert DeLisle
Andrew - thank you for the clarification.  Obviously a character offset
into the file makes much more sense than a line offset.  oops.  8^)

Greg - thanks for the link. I may give that a try.  I have a different
approach in place now, so this file is taken care of.  I genuinely hope I
don't have to process this many structures too often 8^)  but I'll
certainly give the ForwardSDMolSupplier a try just in case I do.





On Mon, Nov 21, 2011 at 1:00 PM, Greg Landrum greg.land...@gmail.com wrote:

 Kirk,

 On Mon, Nov 21, 2011 at 8:42 PM, Robert DeLisle rkdeli...@gmail.com
 wrote:
 
  In thinking about this, an unsigned 32-bit integer should give me over 4
  billion values, and a signed 32-bit gives 2 billion.  I know that the file
  has slightly over 5 million structures and ~300 million lines.  Neither of
  these is over the limit, so I wouldn't expect an overflow.

 The determining factor is, unfortunately, the file size, not the
 number of lines.
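
 The arithmetic behind this can be sketched as follows. It assumes, as guessed earlier in the thread, that a byte offset ends up stored in an unsigned 32-bit integer; the helper name as_uint32 is hypothetical. Any offset at or beyond 4 GiB (2**32 bytes) wraps around to a position near the start of the file, so a seek to that stored value lands on molecules that were already read.

```python
UINT32_MODULUS = 2 ** 32  # number of distinct values in an unsigned 32-bit integer
GIB = 2 ** 30


def as_uint32(offset):
    """Value that survives when a byte offset is stored in 32 unsigned bits."""
    return offset % UINT32_MODULUS


# true byte offset -> offset after truncation to 32 bits
for true_offset in (3 * GIB, 4 * GIB - 1, 4 * GIB, 4 * GIB + 1000, 5 * GIB):
    print("%11d -> %10d" % (true_offset, as_uint32(true_offset)))
```

 Offsets below 4 GiB come through unchanged, while 4 GiB itself becomes 0 and 4 GiB + 1000 becomes 1000. A file that is over 4 GB even when compressed is well past that limit once uncompressed, regardless of how many structures or lines it holds.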

 If you're willing to live on the bleeding edge for a bit, there's an
 RDKit branch that contains a new way of working with SD files that is
 well suited to dealing with large files:

 https://rdkit.svn.sourceforge.net/svnroot/rdkit/branches/StreambufSupport_18Nov2011

 The new feature is the ForwardSDMolSupplier, this can be initialized
 from a filename:
 In [3]: suppl = Chem.ForwardSDMolSupplier('PubChemBackground.sdf')

 or a python file-like object:
 In [4]: suppl2 = Chem.ForwardSDMolSupplier(file('PubChemBackground.sdf'))

 You can read out molecules by looping over the supplier:
 In [5]: for mol in suppl2:
    ...:     if mol is None: continue
    ...:     print mol.GetNumAtoms()
    ...:
 24
 17
  

 Since these work using file-like objects, you can directly read from
 compressed files:

 In [6]: suppl3  = Chem.ForwardSDMolSupplier(gzip.open('bigfile.sdf.gz'))

 Differences from the standard SDMolSupplier:
  - the ForwardSDMolSupplier is not random access; you cannot ask for
    a particular item
  - there's no reset method; if you want to go through the molecules
    more than once, you have to create the supplier from scratch.

 Coincidentally, this was inspired by some suggestions Andrew has made
 in the last week or so.

 I will be merging this branch back into the trunk sometime in the next
 week, but the code is there, mostly tested, and usable now.

 -greg



Re: [Rdkit-discuss] [Rdkit-devel] Beta of Q2 2011 Release Available

2011-07-05 Thread Robert DeLisle
It works here on CentOS 5.6.  Testing with my code goes fine, but the test
step (ctest from the build directory) results in 72/76 tests failed.
Problem with the test DB?





On Fri, Jul 1, 2011 at 5:40 AM, Greg Landrum greg.land...@gmail.com wrote:

 Dear all,

 This morning I tagged the beta for the Q2 2011 (2011.06 in the new
 numbering) release in svn:
 http://rdkit.svn.sourceforge.net/viewvc/rdkit/tags/Release_2011_06_1beta1/

 and uploaded a source distribution to the google code site:

 http://code.google.com/p/rdkit/downloads/detail?name=RDKit_2011_06_1beta1.tgz
 If there's demand for it, I will also put up a windows binary.

 As usual: if no show-stopper bugs appear, I will do the release itself
 in about a week.

 Excerpts from the release notes are below.

 One highlight I will call your attention to is that, thanks to some
 nice work from Eddie Cao, it is now possible to generate InChI codes
 from within the RDKit :

 In [2]: inchi = Chem.MolToInchi(Chem.MolFromSmiles('c1ccccc1C(=O)O'))
 In [3]: print inchi
 InChI=1S/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)

 and then convert the InChIs to InChI keys:

 In [4]: print Chem.InchiToInchiKey(inchi)
 WPYMKLBDIGXBTP-UHFFFAOYSA-N

 There is also experimental and partial support for converting InChI
 back into a molecule:

 In [5]: m2 = Chem.MolFromInchi(inchi)
 In [6]: print Chem.MolToSmiles(m2)
 O=C(O)c1ccccc1

 Note that this last bit is not something InChI is actually designed
 for, so it's probably not a good idea to rely on it.
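
 For readers who have not looked inside an InChI before, the string above can be pulled apart with a few lines of plain Python. This is only a sketch of the layered layout (version, formula, then prefixed layers such as c and h); the helper names are hypothetical, it does not use the RDKit, and it does not handle multi-component formulas. (Python 3 syntax.)

```python
import re


def inchi_layers(inchi):
    """Split an InChI string into version, formula and prefixed layers."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi[len("InChI="):].split("/")
    version, formula = parts[0], parts[1]
    # Remaining layers start with a one-letter prefix, e.g. 'c' or 'h'.
    layers = {part[0]: part[1:] for part in parts[2:]}
    return version, formula, layers


def formula_counts(formula):
    """Count atoms in a simple formula such as 'C7H6O2'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


version, formula, layers = inchi_layers(
    "InChI=1S/C7H6O2/c8-7(9)6-4-2-1-3-5-6/h1-5H,(H,8,9)")
print(version, formula, formula_counts(formula), sorted(layers))
```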



 Best Regards,
 -greg


 **  Release_2011.06.1 ***
 (Changes relative to Release_2011.03.2)

 Acknowledgements:
  - Eddie Cao, Andrew Dalke, James Davidson, JP Ebejer, Bernd Wiswedel

 Bug Fixes:
  - A problem with similarity values between SparseIntVects that
   contain negative values was fixed. (Issue 3295215)
  - An edge case in SmilesMolSupplier.GetItemText() was fixed. (Issue
   3299878)
  - The drawing code now uses dashed lines for aromatic bonds without
   kekulization. (Issue 3305420)
  - AllChem.ConstrainedEmbed works again. (Issue 3305420)
  - atomic RGP values from mol files are accessible from python (Issue
   3313539)
  - M RGP blocks are now written to mol files. (Issue 3313540)
  - Atom.GetSymbol() for R atoms read from mol files is now
   correct. (Issue 3316600)
  - The handling of isotope specifications is more robust.
  - A thread-safety problem in SmilesWrite::GetAtomSmiles() was fixed.
  - some of the MACCS keys definitions have been corrected

 New Features:
  - The smiles, smarts, and reaction smarts parsers all now take an
 additional
   argument, replacements, that carries out string substitutions
 pre-parsing.
  - There is now optional support for generating InChI codes and keys
   for molecules.
  - the atom pair and topological torsion fingerprint generators now
   take an optional ignoreAtoms argument
  - a function to calculate exact molecular weight was added.
  - new java wrappers are now available in $RDBASE/Code/JavaWrappers
  - the methods getMostCommonIsotope() and getMostCommonIsotopeMass()
   have been added to the PeriodicTable class.
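
 The "replacements" item in the New Features list boils down to string substitution before the parser sees the text. A minimal sketch of that idea in plain Python follows; the function name apply_replacements and the {Ph} placeholder are invented for illustration, and this is not the RDKit implementation. (Python 3 syntax.)

```python
def apply_replacements(text, replacements):
    """Return text with every key of `replacements` substituted by its value."""
    for key, value in replacements.items():
        text = text.replace(key, value)
    return text


# '{Ph}' is a made-up placeholder; any unambiguous token would do.
template = "{Ph}C(=O)O"
smiles = apply_replacements(template, {"{Ph}": "c1ccccc1"})
print(smiles)
```

 With the RDKit itself the same dictionary would presumably be handed to the parser through the replacements argument named in the release notes, so no separate helper should be needed.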

 New Database Cartridge Features:

 Deprecated modules (to be removed in next release):
  - The original SWIG wrappers in $RDBASE/Code/Demos/SWIG are deprecated

 Removed modules:

 Other:
  - The quality of the drawings produced by both the python molecule drawing
   code and $RDBASE/Code/Demos/RDKit/Draw is better.
  - the python molecule drawing code will now use superscripts and
   subscripts appropriately when using the aggdraw or cairo canvases
   (cairo canvas requires pango for this to work).
  - $RDBASE/Code/Demos/RDKit/Draw now includes an example using cairo
  - A lot of compiler warnings were cleaned up.
  - The error reporting in the SMILES, SMARTS, and SLN parsers was improved.
  - the code for calculating molecular formula is now in C++
   (Descriptors::calcMolFormula())



Re: [Rdkit-discuss] Friday evening problem... Centos RDKit -- again

2011-06-10 Thread Robert DeLisle
I will defer to Greg's expertise for a more accurate answer, but I would
suspect that the problem is the difference in using the system version of
Python and a version of RDKit that is built with a newer version of GCC.
You may be getting stuck in  dependency confusion between the two versions.

You should be able to build and install Python 2.7 without disturbing the
system's Python 2.4.3.

-Kirk






On Fri, Jun 10, 2011 at 12:00 PM, JP jeanpaul.ebe...@inhibox.com wrote:

 I am installing the brand new RDKit (2011_03_2) on CentOS (lol!) on a
 Friday evening (6.54pm here in Oxford)...  So I probably deserve the misery
 of the following.

 I have already gone through the whole RDKit on Centos installation
 procedure and pain on other machines and I now am undaunted by it.  Bring it
 on.
 Still I installed everything (almost) according to the book (
 http://code.google.com/p/rdkit/wiki/BuildingOnCentOS) with the exception
 that I stuck to Python 2.4.3 (Python 2.7, doesn't play nicely with Rocks)

 And I get this anti-fancy error message

  >>> from rdkit import Chem
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/share/apps/RDKit_2011_03_2/rdkit/Chem/__init__.py", line 18, in ?
     from rdkit import rdBase
 ImportError: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found
 (required by /share/apps/RDKit_2011_03_2/rdkit/rdBase.so)

 Any ideas?

 [jp@xxx build]$ echo $LD_LIBRARY_PATH

 /share/apps/RDKit_2011_03_2/lib:/share/apps/boost_1_46_1/lib:/opt/gridengine/lib/lx26-amd64:/share/apps/openbabel/lib:/usr/local/lib:/share/apps/openbabel/lib:

 [jp@xxx build]$ echo $PYTHONPATH
 :/share/apps/RDKit_2011_03_2

 [jp@xxx build]$ echo $RDBASE
 /share/apps/RDKit_2011_03_2

 Any sympathy will be greatly appreciated.

 Cheers
 JP




Re: [Rdkit-discuss] Friday evening problem... Centos RDKit -- again

2011-06-10 Thread Robert DeLisle
Check the version of gcc for that working build.  It may not match yours.
Not that it fixes your problem, unfortunately.
On Jun 10, 2011 1:14 PM, JP jeanpaul.ebe...@inhibox.com wrote:
 It seems someone got it to work with python 2.4 on Centos
 (at least according to http://code.google.com/p/rdkit/wiki/WorkingBuilds).
 But even this is god knows how many permutations (gcc / boost / mpfr / gmp
 / bison / flex etc) away from mine...

 I'd be interested in Greg's take on supported platforms.

 What a start to the weekend!


 On 10 June 2011 20:04, Robert DeLisle rkdeli...@gmail.com wrote:

 I can't blame you there. One ring to bind them would be preferred.

 Have you searched the RDKit discussion list archives regarding Python
 version compatibility? I vaguely remember something about older versions
of
 Python in general, but I don't know if it applies to this case.






 On Fri, Jun 10, 2011 at 1:00 PM, JP jeanpaul.ebe...@inhibox.com wrote:


 Hi there Kirk,

 Your suggestion was interesting to tinker with -- but it doesn't help my
 specific case.

 If I set the environment to work with python 2.7 (and RDKit), I break
 ROCKs functionality which I need from time to time.
 I do not want to stay switching between p2.4 and p2.7 in the same
 session...




 On 10 June 2011 19:20, Robert DeLisle rkdeli...@gmail.com wrote:

 I will defer to Greg's expertise for a more accurate answer, but I
would
 suspect that the problem is the difference in using the system version
of
 Python and a version of RDKit that is built with a newer version of
GCC.
 You may be getting stuck in dependency confusion between the two
versions.

 You should be able to build and install Python 2.7 without disturbing
the
 system's Python 2.4.3.

 -Kirk






 On Fri, Jun 10, 2011 at 12:00 PM, JP jeanpaul.ebe...@inhibox.com
wrote:

 I am installing the brand new RDKit (2011_03_2) on CentOS (lol!) on a
 Friday evening (6.54pm here in Oxford)... So I probably deserve the
misery
 of the following.

 I have already gone through the whole RDKit on Centos installation
 procedure and pain on other machines and I now am undaunted by it.
Bring it
 on.
 Still I installed everything (almost) according to the book (
 http://code.google.com/p/rdkit/wiki/BuildingOnCentOS) with the
 exception that I stuck to Python 2.4.3 (Python 2.7, doesn't play
nicely with
 Rocks)

 And I get this anti-fancy error message

  >>> from rdkit import Chem
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/share/apps/RDKit_2011_03_2/rdkit/Chem/__init__.py", line 18, in ?
     from rdkit import rdBase
 ImportError: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found
 (required by /share/apps/RDKit_2011_03_2/rdkit/rdBase.so)

 Any ideas?

 [jp@xxx build]$ echo $LD_LIBRARY_PATH


/share/apps/RDKit_2011_03_2/lib:/share/apps/boost_1_46_1/lib:/opt/gridengine/lib/lx26-amd64:/share/apps/openbabel/lib:/usr/local/lib:/share/apps/openbabel/lib:

 [jp@xxx build]$ echo $PYTHONPATH
 :/share/apps/RDKit_2011_03_2

 [jp@xxx build]$ echo $RDBASE
 /share/apps/RDKit_2011_03_2

 Any sympathy will be greatly appreciated.

 Cheers
 JP





Re: [Rdkit-discuss] Friday evening problem... Centos RDKit -- again

2011-06-10 Thread Robert DeLisle
Greg and JP,

For my own education, could this be related to having upgraded GCC through
the CentOS install instructions, but using the old Python?  The new version
of RDKit would have been built with the newer GCC but the old Python may not
refer to the correct libraries?

Or am I mixing concepts here?

-Kirk





On Fri, Jun 10, 2011 at 1:30 PM, Greg Landrum greg.land...@gmail.com wrote:

 Hi

 On Friday, June 10, 2011, JP jeanpaul.ebe...@inhibox.com wrote:
  I am installing the brand new RDKit (2011_03_2) on CentOS (lol!) on a
 Friday evening (6.54pm here in Oxford)...  So I probably deserve the misery
 of the following.

 Nobody deserves the misery of working with Centos. ;-)

  I have already gone through the whole RDKit on Centos installation
 procedure and pain on other machines and I now am undaunted by it.  Bring it
 on.

 Good attitude!

 
  Still I installed everything (almost) according to the book (
 http://code.google.com/p/rdkit/wiki/BuildingOnCentOS) with the exception
 that I stuck to Python 2.4.3 (Python 2.7, doesn't play nicely with Rocks)
 
 
  And I get this anti-fancy error message
  >>> from rdkit import Chem
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "/share/apps/RDKit_2011_03_2/rdkit/Chem/__init__.py", line 18, in ?
      from rdkit import rdBase
  ImportError: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found
  (required by /share/apps/RDKit_2011_03_2/rdkit/rdBase.so)

 That is a glibc problem. It means that you are using something that
 has been built with a version of g++ that is more modern than the
 version of libstdc++ (I think) that is being found. You might want to
 google around a little bit for the error message in combination with
 centos and see what you can find. Believe it or not, this doesn't have
 much to do with the rdkit.
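
 One way to see what Greg describes is to list the GLIBCXX version tags that the libstdc++ being picked up actually carries, which is what "strings libstdc++.so.6 | grep GLIBCXX" does on the command line. Below is a small Python sketch of the same check. The function names are made up, the demo runs on synthetic bytes rather than a real library, and scanning raw bytes is only a heuristic, not a proper ELF parse. (Python 3 syntax.)

```python
import re


def glibcxx_versions(data):
    """Return the sorted GLIBCXX_x.y[.z] version tags found in raw bytes."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", data))
    return sorted(tuple(int(p) for p in tag.decode().split(".")) for tag in tags)


def provides(data, wanted):
    """True if the version string `wanted` (e.g. '3.4.9') is among the tags."""
    return tuple(int(p) for p in wanted.split(".")) in glibcxx_versions(data)


def library_provides(path, wanted):
    """Read a shared library from disk and test for one version tag."""
    with open(path, "rb") as handle:
        return provides(handle.read(), wanted)


# Synthetic stand-in for an older libstdc++ that stops at 3.4.8.
fake_library = b"\x00GLIBCXX_3.4\x00GLIBCXX_3.4.8\x00GLIBCXX_FORCE_NEW\x00"
print(glibcxx_versions(fake_library), provides(fake_library, "3.4.9"))
```

 On the machine in question one would call library_provides("/usr/lib64/libstdc++.so.6", "3.4.9") and compare the answer with the libstdc++ that came with the newer gcc.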

 -greg

 
 
  Any ideas?
  [jp@xxx build]$ echo $LD_LIBRARY_PATH
  /share/apps/RDKit_2011_03_2/lib:/share/apps/boost_1_46_1/lib:/opt/gridengine/lib/lx26-amd64:/share/apps/openbabel/lib:/usr/local/lib:/share/apps/openbabel/lib:

  [jp@xxx build]$ echo $PYTHONPATH
  :/share/apps/RDKit_2011_03_2

  [jp@xxx build]$ echo $RDBASE
  /share/apps/RDKit_2011_03_2
 
 
  Any sympathy will be greatly appreciated.
  CheersJP
 




[Rdkit-discuss] Fwd: Re: Re: Re: Bug in MolToFile?

2011-04-19 Thread Robert DeLisle
Keeping everyone on the thread..
-- Forwarded message --
From: Robert DeLisle rkdeli...@gmail.com
Date: Apr 19, 2011 3:42 PM
Subject: Re: Re: Re: [Rdkit-discuss] Bug in MolToFile?
To: Greg Landrum greg.land...@gmail.com

Greg,

Thank you again for the off-line assistance.

Just to update the status for the others out there, the new Draw code does
work in my hands.  And, it is as simple as downloading just the
/rdkit/Chem/Draw directory from the SourceForge svn trunk and copying it
into the existing source tree.

Downloading that same directory from the Google Code trunk isn't very useful
- oops.

-Kirk








On Tue, Apr 19, 2011 at 12:43 PM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Kirk,

 On Tue, Apr 19, 2011 at 6:46 PM,  rkdeli...@gmail.com wrote:
  H
 
  I just repeated the process - I copied the most recent release to a new
  directory, copied in the rdkit/Chem/Draw directory from SVN, no build
 step
  this time - I get the same error:
 
  Traceback (most recent call last):
    File "SOMtoHTML_101203.py", line 227, in <module>
      create_2D_depiction()
    File "SOMtoHTML_101203.py", line 50, in create_2D_depiction
      Draw.MolToFile(m, picture_parent_folder+'/'+name+'.png', (picture_size,
      picture_size) )
    File "/opt/RDKit_2011_03_1_up1/rdkit/Chem/Draw/__init__.py", line 56, in
    MolToFile
      import cairo
  ImportError: No module named cairo
 
 
  I looked at the __init__.py file from the SVN set and I see this:
 
  def MolToFile(mol,fileName,size=(300,300),kekulize=True,
 wedgeBonds=True):
  # original contribution from Uwe Hoffmann
  import cairo
 
  Line 56 is import cairo

 I'm really confused...
 Here:

 http://rdkit.svn.sourceforge.net/viewvc/rdkit/trunk/rdkit/Chem/Draw/__init__.py?revision=1712&view=markup
 it's different

 If you go to that directory and do: svn info what do you see?

 
  def MolToImageFile occurs on line 100.
 
  What have I done wrong here? Has anyone else out there tested it?

 I haven't heard back from anyone yet.

 -greg
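
 Since the traceback above comes down to "import cairo" failing, a quick way to see which drawing-related modules a given Python can import is sketched below. The candidate names (cairo, aggdraw, PIL) are the ones mentioned on this list; the helper name is made up, and this says nothing about which backend the Draw code actually prefers. (Python 3 syntax.)

```python
import importlib.util


def available_backends(candidates=("cairo", "aggdraw", "PIL")):
    """Return the candidate top-level modules that can be imported here."""
    found = []
    for name in candidates:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is not None:
            found.append(name)
    return found


print(available_backends())
```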



Re: [Rdkit-discuss] Installation driving me mad (RDKit on Centos 5.4 final)

2011-02-23 Thread Robert DeLisle
JP and George - Excellent!  I'm glad to hear my CentOS install walk-through
is going to good use.  8^)

Greg - I'm reasonably sure that when I tried to build NumPy in the absence
of the linear algebra libraries, I received an error and the build failed.
Unfortunately, I didn't bother to try them one at a time to verify which was
needed.  I'll leave that as an exercise for the reader.  8^)  I edited the
Wiki stating that the libraries may or may not be necessary.

Regarding repos necessary, I see that all but atlas can be found in the
base repo, which I assume is the CentOS default.  I'm puzzled as to why
blas* and atlas* aren't there.  I did put in a link to the epel repository
RPM, which should cure the yum install problem for all linear algebra
libraries.

-Kirk







On Wed, Feb 23, 2011 at 6:04 AM, George Papadatos gpapada...@gmail.com wrote:

 Fair enough, I did not know that! However, according to the same
 documentation, these packages are highly recommended for NumPy and required
 for SciPy:
 http://scipy.org/Installing_SciPy/Linux#head-9cf6f4b7fe9ba63fc228203c4f28554a74970847

 In any case, here is a repository for CentOS 5/RHEL 5 with the necessary rpms
 (for those who can't access yum):
 http://download.opensuse.org/repositories/home:/ashigabou/

 After that, Kirk's walk-through has been most helpful.

 George


 On 23 February 2011 11:12, Greg Landrum greg.land...@gmail.com wrote:

 Let me elaborate on that... from the numpy installation page
 (http://docs.scipy.org/doc/numpy/user/install.html):
 NumPy does not require any external linear algebra libraries to be
 installed. However, if these are available, NumPy’s setup script can
 detect them and use them for building. A number of different LAPACK
 library setups can be used, including optimized LAPACK libraries such
 as ATLAS, MKL or the Accelerate/vecLib framework on OS X.

 Best,
 -greg




 On Wed, Feb 23, 2011 at 12:10 PM, Greg Landrum greg.land...@gmail.com
 wrote:
  I'm not convinced of that. I'm pretty sure that I have built numpy on
  redhat and ubuntu systems without ever installing lapack.
 
  -greg
 
 
  On Wed, Feb 23, 2011 at 12:06 PM, George Papadatos 
 gpapada...@gmail.com wrote:
  ...yet you need them to build Numpy...
  George
 
  On 23 February 2011 11:03, Greg Landrum greg.land...@gmail.com
 wrote:
 
  To be very clear: you do not need *any* of these packages to install
 the
  RDKit.
 
  -greg
 
 
  On Wed, Feb 23, 2011 at 10:53 AM, JP jeanpaul.ebe...@inhibox.com
 wrote:
   Great wiki - I wonder how I missed that.
   But the first instruction
   sudo yum install atlas, atlas-devel, blas blas-devel lapack
 lapack-devel
  
   Gives me the following error:
   No package atlas, available.
   No package atlas-devel, available.
   No package blas available.
   No package lapack available.
   Is there a repos I have to add to /etc/yum.repos.d/ ?
  
  
   On 22 February 2011 18:41, Robert DeLisle rkdeli...@gmail.com
 wrote:
  
   What are your environment settings?  You should have at minimum,
 these:
  
   $RDBASE = the directory where you have installed the RDKit code
  
  
   $LD_LIBRARY_PATH = /usr/local/lib:$RDBASE/lib
  
   $PYTHONPATH = $RDBASE
  
  
   At least this worked for me for a CentOS installation, detailed
 here -
   http://code.google.com/p/rdkit/wiki/BuildingOnCentOS
  
  
  
   Another possibility is your PATH variable.  Make sure that
 /usr/local
   pathnames precede any /usr options.
   This will ensure looking into /usr/local first.
  
   There also may be options for cmake that will force it into the
 correct
   directory.  I've found in the past that even though
  
  
   it says in the initial output that is looking in the correct
 location
   for
   boost and python, it doesn't necessarily follow its
   own advice.
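
   The settings listed above can be checked mechanically. Here is a minimal sketch that takes an environment mapping and reports what looks off. The function name check_rdkit_env is hypothetical, the rules are just the ones stated in this message (RDBASE set, $RDBASE/lib on LD_LIBRARY_PATH, $RDBASE on PYTHONPATH, /usr/local entries ahead of /usr entries in PATH), and the sample values are copied from JP's post. (Python 3 syntax.)

```python
def _split(value):
    """Split a colon-separated path list, dropping empty entries."""
    return [part for part in value.split(":") if part]


def check_rdkit_env(env):
    """Return a list of human-readable problems found in `env` (a dict)."""
    problems = []
    rdbase = env.get("RDBASE")
    if not rdbase:
        return ["RDBASE is not set"]
    if rdbase.rstrip("/") + "/lib" not in _split(env.get("LD_LIBRARY_PATH", "")):
        problems.append("LD_LIBRARY_PATH lacks $RDBASE/lib")
    if rdbase not in _split(env.get("PYTHONPATH", "")):
        problems.append("PYTHONPATH lacks $RDBASE")
    path = _split(env.get("PATH", ""))
    local = [i for i, p in enumerate(path) if p.startswith("/usr/local")]
    system = [i for i, p in enumerate(path)
              if (p == "/usr" or p.startswith("/usr/"))
              and not p.startswith("/usr/local")]
    if local and system and min(system) < max(local):
        problems.append("a /usr entry precedes a /usr/local entry in PATH")
    return problems


sample = {
    "RDBASE": "/share/apps/RDKit_2011_03_2",
    "LD_LIBRARY_PATH": "/share/apps/RDKit_2011_03_2/lib:/share/apps/boost_1_46_1/lib:/usr/local/lib:",
    "PYTHONPATH": ":/share/apps/RDKit_2011_03_2",
    "PATH": "/usr/local/bin:/usr/bin:/bin",
}
print(check_rdkit_env(sample))
```

   To check the live shell instead of a sample, pass dict(os.environ) after importing os.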
  
   -Kirk
  
  
  
  
   On Tue, Feb 22, 2011 at 9:44 AM, JP jeanpaul.ebe...@inhibox.com
   wrote:
  
   I ended up not using yum to install Numpy - I installed it from
   source,
   which was only slightly painful.
import platform; print platform.python_version()
   # /usr/local/lib/python2.7/platform.pyc matches
   /usr/local/lib/python2.7/platform.py
   import platform # precompiled from
   /usr/local/lib/python2.7/platform.pyc
   2.7.0
import numpy as N
a=N.random.randn(10, 10)
   
   In /usr/lib64/ I can find some libpython2.4.so
 , libpython2.4.so.1.0
   What should I do?
  
   On 22 February 2011 16:23, rkdeli...@gmail.com wrote:
  
   Are you sure that your NumPy installation is going to the correct
   Python
   instance? I see from the logs that you have Python 2.7 installed,
 or
   at
   least that is what cmake is finding at /usr/local/lib. You use
 yum to
   install NumPy, but the standard installation of Python on CentOS
 5.x
   is 2.4
   and it is located in /usr/lib. Which version of Python has NumPy?
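
   A quick way to answer that question is to run the same few lines under each interpreter and compare. The sketch below reports which executable is running and where (if anywhere) it finds a module; the function name is made up and nothing here is specific to CentOS. (Python 3 syntax; the interpreters discussed in this thread are Python 2, where the same idea applies with "import numpy; print numpy.__file__".)

```python
import importlib.util
import sys


def describe_interpreter(module="numpy"):
    """Report the running interpreter and where it would load `module` from."""
    spec = importlib.util.find_spec(module)
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "module": module,
        "module_location": spec.origin if spec is not None else None,
    }


for key, value in describe_interpreter().items():
    print(key, "=", value)
```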
  
  
   -Kirk

Re: [Rdkit-discuss] RDKit on CentOs 5

2011-01-06 Thread Robert DeLisle
 was up and
  running.
 
  If you run into any problems, please post them so that we can
  (hopefully) help and others can benefit in the future.
 
  -Kirk
 
 
 
  On Jan 6, 2011 10:11am, Igor Filippov [Contr] ig...@helix.nih.gov
  wrote:
   Dear Kirk,
  
  
  
   Thank you so much! I'm in the process of compiling gcc-4.5.1 right
  now,
  
   having got gmp, mpc, and mpfr built with the older version of gcc.
  
   Your instructions have to be preserved for the others, I can't
  believe
  
   I'm the only one using CentOs/RHEL on a server/compute node.
  
  
  
   Greg, don't take it as a slam but compiling the Linux kernel is a
  walk
  
   in a park compared to a recent RDkit. I'm working on it second day
  and
  
   I'm barely half-way through the process of installing dependencies.
  Even
  
   without python the version of gcc which comes with CentOs 5 (4.1.2)
  
   cannot compile RDKit.
  
   On the other hand the RPM packages for Fedora have been painless to
  
   install, how nice it would be to have the RDKit RPMs for CentOs!
  
  
  
   Best,
  
   Igor
  
  
  
   On Wed, 2011-01-05 at 17:41 -0500, Robert DeLisle wrote:
  
I have been able to reproducibly build RDKit on CentOS 5.5, but it
  
required a significant amount of updating of the build components.
  
The attached walk-through script should get you there.  I do not
  
recall ever seeing that particular error, however.
  
   
  
-Kirk
  
   
  
   
  
   
  
   
  
On Wed, Jan 5, 2011 at 1:15 PM, Igor Filippov [Contr]
  
ig...@helix.nih.gov wrote:
  
Dear All,
  
   
  
Has anyone successfully compiled RDKit on CentOs 5? I'm
  
running into the
  
following error message:
  
   [ 15%] Building CXX object
  
   
  
   
  Code/Numerics/Alignment/Wrap/CMakeFiles/rdAlignment.dir/rdAlignment.cpp.o
  
   
  
   
 
  /root/RDKit_2010_09_1/Code/Numerics/Alignment/Wrap/rdAlignment.cpp:14:31:
 error: numpy/arrayobject.h: No such file or directory
  
   
  
On CentOs 5 arrayobject.h is part of python-numeric
  package
  
and it's
  
located in:
  
   /usr/include/python2.4/Numeric/arrayobject.h
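
   Note that the header the build asks for is numpy/arrayobject.h, which ships with NumPy, not with the older Numeric package. Assuming NumPy is installed for the Python being used, the directory to put on the include path can be found as sketched below; numpy.get_include() is a real NumPy call, while the two helper names are made up. (Python 3 syntax.)

```python
import os


def numpy_header_dir():
    """Return NumPy's C header directory, or None if NumPy is not importable."""
    try:
        import numpy
    except ImportError:
        return None
    return numpy.get_include()


def has_arrayobject_header(include_dir):
    """True if include_dir contains numpy/arrayobject.h."""
    return include_dir is not None and os.path.isfile(
        os.path.join(include_dir, "numpy", "arrayobject.h"))


header_dir = numpy_header_dir()
print(header_dir, has_arrayobject_header(header_dir))
```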
  
   
  
I'm attempting to compile RDKit_2010_09_1, using boost
  version
  
1.39.0,
  
x86_64 system.
  
   
  
Regards,
  
Igor
  
   
  
   
  
   
  
   
  
   
  
   
 
  
   
  
  
  
  
  





[Rdkit-discuss] RDKit build on Fedora 14

2010-11-19 Thread Robert DeLisle
Greg,

After finalizing my build of RDKit on CentOS (as per my previous message
thread), I decided to give it a shot on Fedora 14.  I'm happy to report that
this build goes incredibly smoothly without even a hiccup.  The details for
Fedora 14's standard install are:

GCC 4.5.1-4
Boost 1.44.0
Python 2.7
NumPy 1.4.1
cmake 2.8.2
flex 2.5.35
bison 2.4.3

I then moved on to imaging and found the standard PIL (v1.1.7) installation
did not have Freetype support, but the Freetype2 (v2.4.2) libraries are
installed.  I rebuilt PIL with the following modification to its setup.py
file in order to direct it to the correct libraries and include files.

change line 40 from

FREETYPE_ROOT = None

to

FREETYPE_ROOT = "/usr/lib64", "/usr/include"

I was also able to get aggdraw in place by modifying its setup.py in a
similar manner, but you don't have the benefit of decomposing the library
and include directories as with PIL.

change line 21 from

FREETYPE_ROOT = "../../kits/freetype-2.1.10"

to

FREETYPE_ROOT = "/usr"

and, change line 56 from

library_dirs.append(os.path.join(FREETYPE_ROOT, "lib"))

to

library_dirs.append(os.path.join(FREETYPE_ROOT, "lib64"))


That's it.  Easy.  8^)
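
For anyone who would rather not hard-code lib versus lib64, the choice can be made at build time. The following is only a sketch of that idea with a made-up helper name; it is not what PIL's or aggdraw's setup.py actually does. (Python 3 syntax.)

```python
import os
import tempfile


def pick_library_dir(root, candidates=("lib64", "lib")):
    """Return the first existing library directory under root, or None."""
    for name in candidates:
        path = os.path.join(root, name)
        if os.path.isdir(path):
            return path
    return None


# Demo on a throwaway directory tree instead of the real /usr.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "lib"))
    only_lib = pick_library_dir(root)
    os.mkdir(os.path.join(root, "lib64"))
    with_lib64 = pick_library_dir(root)
    print(os.path.basename(only_lib), os.path.basename(with_lib64))
```

In a setup.py one would then write something like library_dirs.append(pick_library_dir(FREETYPE_ROOT)), guarding against a None result.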

-Kirk


Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-17 Thread Robert DeLisle
Peter - This is great!  I've only browsed through the script you have, but I
do see a few differences.  I'll give it a shot now and report back.  Thank
you so much for posting this.

Greg - I ran ldd on rdBase.so, and here's the output:

libRDGeneral.so.1 => /opt/RDKit_svn_20101115/lib/libRDGeneral.so.1 (0x2ad72659b000)
libRDBoost.so.1 => /opt/RDKit_svn_20101115/lib/libRDBoost.so.1 (0x2ad7267d6000)
libboost_python.so.1.44.0 => /usr/local/lib/libboost_python.so.1.44.0 (0x2ad726ba5000)
libstdc++.so.6 => /usr/local/lib64/libstdc++.so.6 (0x2ad726df7000)
libm.so.6 => /lib64/libm.so.6 (0x2ad72712f000)
libgcc_s.so.1 => /usr/local/lib64/libgcc_s.so.1 (0x2ad7273b2000)
libc.so.6 => /lib64/libc.so.6 (0x2ad7275c9000)
libutil.so.1 => /lib64/libutil.so.1 (0x2ad72792)
libpthread.so.0 => /lib64/libpthread.so.0 (0x2ad727b23000)
libdl.so.2 => /lib64/libdl.so.2 (0x2ad727d3f000)
librt.so.1 => /lib64/librt.so.1 (0x2ad727f43000)
/lib64/ld-linux-x86-64.so.2 (0x0031f700)

It does look like it is referring to the correct instances of what I've
built.  There are a few system level C/C++ library references, but I'm not
seeing anything odd here.  What's your take on it?
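
For longer listings, eyeballing gets tedious, so here is a small sketch that parses ldd output and flags anything resolved outside a set of expected prefixes. It assumes ldd's usual "name => path (address)" layout; the function names and the choice of prefixes are made up for illustration, and the sample is a shortened copy of the listing above. (Python 3 syntax.)

```python
def parse_ldd(output):
    """Map library name -> resolved path from ldd output lines."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue                      # e.g. the dynamic loader line
        name, _, rest = line.partition("=>")
        target = rest.split("(")[0].strip()
        resolved[name.strip()] = target or None
    return resolved


def outside_prefixes(resolved, prefixes):
    """Names whose resolved path does not start with any expected prefix."""
    return sorted(name for name, path in resolved.items()
                  if path is None or not path.startswith(tuple(prefixes)))


sample = """\
libRDGeneral.so.1 => /opt/RDKit_svn_20101115/lib/libRDGeneral.so.1 (0x2ad72659b000)
libstdc++.so.6 => /usr/local/lib64/libstdc++.so.6 (0x2ad726df7000)
libm.so.6 => /lib64/libm.so.6 (0x2ad72712f000)
libc.so.6 => /lib64/libc.so.6 (0x2ad7275c9000)
/lib64/ld-linux-x86-64.so.2 (0x0031f700)
"""

resolved = parse_ldd(sample)
print(outside_prefixes(resolved, ("/opt/RDKit_svn_20101115/", "/usr/local/")))
```

Libraries such as libm and libc coming from /lib64 are expected; the interesting case is libstdc++ or libgcc_s resolving to the system copy instead of the newer one under /usr/local.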

-Kirk





On Wed, Nov 17, 2010 at 6:34 AM, Peter Schmidtke pschmid...@ub.edu wrote:

 Hey Greg,

 yep that would be great, as right now they are only on a group internal
 blog ;) I saw that you recently changed you linux build instructions
 (concerning database things, boost numerical bindings etc...), but I did
 this before this came out ;)

 First lets see if Robert comes through the install process without major
 problems and then you can post it on your wiki (I might have forgotten some
 stuff).
 Some things are based on installing pycuda on those machines, this is why
 signals and things like that are compiled with boost (might be worth to
 mention somehow in case people need both).

 ++

 Peter

 On 17/11/2010, at 14:25, Greg Landrum wrote:

  Dear Peter,
 
  Thanks for posting these very detailed instructions. Do you mind if I
  post them to the wiki (with credit of course) to make them easier to
  find?
 
  I made a few comments and suggestions below:
 
  On Wed, Nov 17, 2010 at 11:19 AM, Peter Schmidtke pschmid...@ub.edu
 wrote:
  Dear Robert,
  I recently ran also into several problems while installing rdkit on a
 fresh
  Centos 5.3. It's a real headache. Anyway, this time I've written up a
 guide
  of how to do it step by step, I hope I didn't forget anything in the
 end.
  However, now it works just fine on our Centos machines. Here's the step
 by
  step installing guide :
 
  Centos is a stable but not very userfriendly OS. This becomes obvious
 when
  one wants to install python packages like pycuda etc...Centos comes with
 a
  very old python version, 2.4, but lots of newer features, like pycuda
  require a newer python version. Lets start the lengthy install process
 under
  Centos :
 
  Installing Python 2.6 or newer
 
  If you already have python2.7 installed, please check that it was
 installed
  with --enable-shared. If this is the case you should have
 libpython2.7.so
  in /usr/local/lib. If not, you should have libpython2.7.a. If the second
 is
  the case, you have to install python2.7 with the following way :
 
  Download the current version from python (source code). Like with 2.6 or
 2.7
  (don't grab the 3.x for now) :
 
  wget http://www.python.org/ftp/python/2.7/Python-2.7.tgz
 
  Next untar and unzip the file, go to Python-2.7 directory and issue :
 
  ./configure --enable-shared; make; sudo make install
 
  This installs python in the /usr/local/ directory.
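
  Whether an existing Python was configured with --enable-shared can also be asked of the interpreter itself rather than by looking for libpython files. A minimal sketch using the standard sysconfig module follows; the function name is made up, and the config variables may be missing on some platforms, in which case the values come back as False or None. (Python 3 syntax; on Python 2.7 the sysconfig module offers the same get_config_var call.)

```python
import sysconfig


def shared_build_info():
    """Report whether this interpreter was built as a shared library."""
    return {
        "enable_shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        "libdir": sysconfig.get_config_var("LIBDIR"),
        "library": sysconfig.get_config_var("LDLIBRARY"),
    }


for key, value in shared_build_info().items():
    print(key, "=", value)
```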
 
  Add the RPMForge repo to yum :
 
  wget
 
 http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm
 
  su -c 'rpm -Uvh rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm'
 
  Then install atlas, lapack, blas :
 
  yum install atlas-c++.x86_64 atlas-c++-devel.x86_64 lapack.x86_64
  lapack-devel.x86_64 blas.x86_64 blas-devel.x86_64
 
  Now we can install fftw3 :
 
  yum install fftw3.x86_64 fftw3-devel.x86_64
 
 
 
  Now we could potentially install numpy 1.3 or 1.4, but as python2.7 is
 brand
  new there are some problems. I downloaded :
 
  wget
 http://sourceforge.net/projects/numpy/files/NumPy/1.3.0/numpy-1.3.0.tar.gz/download
 
  then untar and unzip this whole thing and go to the numpy directory
 
  Download the following patch :
 
  wget
 http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/dev-python/numpy/files/numpy-1.4.0-python-2.7.patch
 
  and apply it in this directory using :
 
  patch -p0 < numpy-1.4.0-python-2.7.patch
 
  Now build numpy using python setup.py build; python setup.py install
 
  Numpy should now be accessible from python2.7, simply try a import numpy
  after launching python to check.
 
  First we need to install the boost libraries and their python bindings.
  Download boost to your downloads 

Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-17 Thread Robert DeLisle
:
http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01107.html

In /agg2/include/agg_array.h
change line #523 from this:

unsigned align = (alignment - unsigned(ptr) %  alignment) % alignment;

to this:

unsigned align = (alignment - (unsigned long)(ptr) % alignment) % alignment;
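
Some background on why the cast matters: on 64-bit Linux an unsigned is 32 bits wide while a pointer and an unsigned long are 64 bits, so the original cast throws away the upper half of the address, and newer gcc versions complain about it. The Python sketch below models the same padding arithmetic with a configurable integer width. The function name and the sample address are made up, and it is an illustration of the arithmetic only, not of AGG itself. (Python 3 syntax.)

```python
def padding(ptr, alignment, bits=64):
    """Bytes to add to ptr to reach the next multiple of alignment.

    `bits` models the width of the integer type the pointer is cast to.
    """
    value = ptr & ((1 << bits) - 1)       # what survives the cast
    return (alignment - value % alignment) % alignment


ptr = 0x2AD72659B001                      # made-up 64-bit address
for alignment in (8, 12):
    print(alignment,
          padding(ptr, alignment, bits=32),
          padding(ptr, alignment, bits=64))
```

For power-of-two alignments the truncated and full-width results agree, because only the low bits matter; for other alignments they can differ, which is one more reason to prefer the full-width cast.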


I can then get aggdraw to build, but running the selftest.py gives a
segmentation fault.  If I go ahead and install, it seems to work just fine
and the images produced from RDKit are much improved.  The build process is:

export CFLAGS=-fpermissive
python setup.py build_ext -i
python setup.py install


































On Wed, Nov 17, 2010 at 9:51 AM, Robert DeLisle rkdeli...@gmail.com wrote:

 Peter - This is great!  I've only browsed through the script you have, but
 I do see a few differences.  I'll give it a shot now and report back.  Thank
 you so much for posting this.

 Greg - I ran ldd on rdBase.so, and here's the output:

 libRDGeneral.so.1 => /opt/RDKit_svn_20101115/lib/libRDGeneral.so.1 (0x2ad72659b000)
 libRDBoost.so.1 => /opt/RDKit_svn_20101115/lib/libRDBoost.so.1 (0x2ad7267d6000)
 libboost_python.so.1.44.0 => /usr/local/lib/libboost_python.so.1.44.0 (0x2ad726ba5000)
 libstdc++.so.6 => /usr/local/lib64/libstdc++.so.6 (0x2ad726df7000)
 libm.so.6 => /lib64/libm.so.6 (0x2ad72712f000)
 libgcc_s.so.1 => /usr/local/lib64/libgcc_s.so.1 (0x2ad7273b2000)
 libc.so.6 => /lib64/libc.so.6 (0x2ad7275c9000)
 libutil.so.1 => /lib64/libutil.so.1 (0x2ad72792)
 libpthread.so.0 => /lib64/libpthread.so.0 (0x2ad727b23000)
 libdl.so.2 => /lib64/libdl.so.2 (0x2ad727d3f000)
 librt.so.1 => /lib64/librt.so.1 (0x2ad727f43000)
 /lib64/ld-linux-x86-64.so.2 (0x0031f700)

 It does look like it is referring to the correct instances of what I've
 built.  There are a few system level C/C++ library references, but I'm not
 seeing anything odd here.  What's your take on it?
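
Scanning ldd output by eye gets tedious, so here is a small sketch that flags any library resolved outside a list of expected prefixes. The function name and the prefix list are assumptions of mine; ldd normally prints "name => path (address)", and the pattern also accepts a bare "=" in case the ">" was lost in transit.

```python
import re

# Matches "name => /path (0xADDR)"; the ">" is optional so that output
# pasted through a mail archive that strips it still parses.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>?\s+(/\S+)")


def libraries_outside(ldd_output, allowed_prefixes):
    """Return (name, path) pairs resolved outside the allowed prefixes."""
    unexpected = []
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if not match:
            continue  # e.g. the dynamic loader line, which has no "=>"
        name, path = match.groups()
        if not path.startswith(tuple(allowed_prefixes)):
            unexpected.append((name, path))
    return unexpected


sample = """
libRDBoost.so.1 => /opt/RDKit_svn_20101115/lib/libRDBoost.so.1 (0x2ad7267d6000)
libboost_python.so.1.44.0 => /usr/local/lib/libboost_python.so.1.44.0 (0x2ad726ba5000)
libm.so.6 => /lib64/libm.so.6 (0x2ad72712f000)
"""
print(libraries_outside(sample, ["/opt/RDKit_svn_20101115/", "/usr/local/"]))
```

System libraries such as libm under /lib64 are expected to show up; the interesting case would be a Boost or Python library resolving to /usr/lib64.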

 -Kirk






 On Wed, Nov 17, 2010 at 6:34 AM, Peter Schmidtke pschmid...@ub.edu wrote:

 Hey Greg,

 yep that would be great, as right now they are only on a group internal
 blog ;) I saw that you recently changed your Linux build instructions
 (concerning database things, boost numerical bindings etc...), but I did
 this before this came out ;)

 First let's see if Robert gets through the install process without major
 problems and then you can post it on your wiki (I might have forgotten some
 stuff).
 Some things are based on installing pycuda on those machines, which is why
 signals and things like that are compiled with boost (might be worth
 mentioning somehow in case people need both).

 ++

 Peter

 On 17/11/2010, at 14:25, Greg Landrum wrote:

  Dear Peter,
 
  Thanks for posting these very detailed instructions. Do you mind if I
  post them to the wiki (with credit of course) to make them easier to
  find?
 
  I made a few comments and suggestions below:
 
  On Wed, Nov 17, 2010 at 11:19 AM, Peter Schmidtke pschmid...@ub.edu
 wrote:
  Dear Robert,
  I also recently ran into several problems while installing rdkit on a
  fresh Centos 5.3. It's a real headache. Anyway, this time I've written up
  a guide on how to do it step by step; I hope I didn't forget anything in
  the end. However, now it works just fine on our Centos machines. Here's
  the step by step installation guide :
 
  Centos is a stable but not very user-friendly OS. This becomes obvious
  when one wants to install python packages like pycuda etc... Centos comes
  with a very old python version, 2.4, but lots of newer features, like
  pycuda, require a newer python version. Let's start the lengthy install
  process under Centos :
 
  Installing Python 2.6 or newer
 
  If you already have python2.7 installed, please check that it was
  installed with --enable-shared. If this is the case you should have
  libpython2.7.so in /usr/local/lib. If not, you should have
  libpython2.7.a. In the second case, you have to install python2.7 in
  the following way :
 
  Download the current version of python (source code), e.g. 2.6 or 2.7
  (don't grab the 3.x for now) :
 
  wget http://www.python.org/ftp/python/2.7/Python-2.7.tgz
 
  Next untar and unzip the file, go to Python-2.7 directory and issue :
 
  ./configure --enable-shared; make; sudo make install
 
  This installs python in the /usr/local/ directory.
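
As an alternative to looking for libpython2.7.so by hand, the interpreter can report how it was configured. The sketch below uses the standard sysconfig module (available from Python 2.7 on); the helper name shared_build_report is made up, and the values shown depend entirely on the interpreter that runs it.

```python
import sys
import sysconfig


def shared_build_report():
    """Summarise whether this interpreter was configured with --enable-shared."""
    enabled = bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))
    return {
        "executable": sys.executable,
        "enable_shared": enabled,
        # Name of the Python library the build produced, e.g. a .so or .a file.
        "library": sysconfig.get_config_var("LDLIBRARY"),
    }


for key, value in sorted(shared_build_report().items()):
    print("%-14s %s" % (key, value))
```

Run it with the interpreter you intend to build against; if enable_shared comes back False, the rebuild described above is probably needed.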
 
  Add the RPMForge repo to yum :
 
  wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm
 
  su -c 'rpm -Uvh rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm'
 
  Then install atlas, lapack, blas :
 
  yum install atlas-c++.x86_64 atlas-c++-devel.x86_64 lapack.x86_64
  lapack-devel.x86_64 blas.x86_64 blas-devel.x86_64
 
  Now we can install fftw3 :
 
  yum install fftw3.x86_64 fftw3-devel.x86_64
 
 
 
  Now we could potentially install numpy 1.3 or 1.4, but as python2.7 is
 brand
  new there are some

Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-16 Thread Robert DeLisle
I'm sorry for the slow response.  Busy day.

For this install, I started on a CentOS 5.5 system that was up to date with
all package upgrades.  Following is what I've done so far:

installed blas, blas-devel, lapack, lapack-devel through yum

I had problems in the past with the standard GCC package on CentOS which is
version 4.1.2, so I rebuilt the GCC 4.4.5 package and included mpfr 2.4.1

Installed cmake 2.8.2
Installed flex 2.5.35

CentOS's Python installation is v2.4.1, so I built and installed 2.7.  Due
to errors found later in the process, I built this with the -fPIC switch and
also enabled Unicode UCS4 support

./configure CFLAGS=-fPIC --enable-unicode=ucs4

Built and installed NumPy 1.5.0

Boost on CentOS 5.5 is v1.33, so I built and installed boost 1.44 with the
following commands:

./bootstrap.sh --with-libraries=python,regex
./bjam address-model=64 install


Finally, with RDKit I have $LD_LIBRARY_PATH








On Mon, Nov 15, 2010 at 10:11 PM, rkdeli...@gmail.com wrote:

 No, I made sure to include the address-model=64 switch to bjam.

 Tomorrow when I get in I'll update the thread with all the steps I've
 followed.

 -Kirk




 On Nov 15, 2010 9:52pm, Greg Landrum greg.land...@gmail.com wrote:
  Kirk,
 
 
 
  On Tue, Nov 16, 2010 at 12:38 AM, Robert DeLisle rkdeli...@gmail.com
 wrote:
 
   Yes, that is also true.
 
  
 
   The error in my most recent messages stems from the default build of
 Python
 
   supports Unicode UCS2, but apparently boost expects UCS4.  A rebuild of
 
   Python with UCS4 enabled fixed that problem.
 
  
 
   Now I get a similar error related to Py_InitModule4 not being defined.
 From
 
   what I can find, this is a 32-bit - 64-bit problem in which this was
 defined
 
   as Py_InitModule4_64 in the 64-bit Python libraries but that change may
 not
 
   have cascaded to all necessary parts of the build process.  Most of the
 
   changes involve some substantial changes to the accessing code, but I'm
 
   still looking for a better option.
 
 
 
  Could it be that the boost libraries you are using were not built in
 
  64bit mode? I've managed to force a 64bit build in the past with the
 
  following command line:
 
  ./bjam address-model=64 cflags=-fPIC cxxflags=-fPIC install
 
 
 
  Best Regards,
 
  -greg
 

--
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2  L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today
http://p.sf.net/sfu/msIE9-sfdev2dev___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-16 Thread Robert DeLisle
-- Forwarded message --
From: Robert DeLisle rkdeli...@gmail.com
Date: Tue, Nov 16, 2010 at 3:03 PM
Subject: Re: Re: [Rdkit-discuss] Building RDKit on CentOS 5
To: Greg Landrum greg.land...@gmail.com, Robert DeLisle 
rkdeli...@gmail.com


I'm sorry for the slow response.  Busy day.

For this install, I started on a CentOS 5.5 system that was up to date with
all package upgrades.  Following is what I've done so far:

installed blas, blas-devel, lapack, lapack-devel through yum

I had problems in the past with the standard GCC package on CentOS which is
version 4.1.2, so I rebuilt the GCC 4.4.5 package and included mpfr 2.4.1

Installed cmake 2.8.2
Installed flex 2.5.35

CentOS's Python installation is v2.4.1, so I built and installed 2.7.  Due
to errors found later in the process, I built this with the -fPIC switch and
also enabled Unicode UCS4 support

./configure CFLAGS=-fPIC --enable-unicode=ucs4

Built and installed NumPy 1.5.0

Boost on CentOS 5.5 is v1.33, so I built and installed boost 1.44 with the
following commands:

./bootstrap.sh --with-libraries=python,regex
./bjam address-model=64 install


Finally, with RDKit I have $LD_LIBRARY_PATH set with /usr/local/lib first to
avoid conflicts with the system packages.  GCC and Python are both in
/usr/local and these are the instances referred to by my user and root.  For
RDKit, the following commands were done:

cmake -DBoost_USE_STATIC_LIBS=OFF -DBOOST_ROOT=/usr/local ..
make
make install



I have also installed FreeType2 and PIL - both seem fine with Python 2.7.  I
attempted aggdraw, but the self-test seems to always give me a Segmentation
Fault.  I found that I can build aggdraw using the code as-is as long as I
include CFLAGS=-fpermissive, or there is a one line code change that makes
the compiler happy on 64-bit.  Either way I still get the seg fault upon
testing.


Regarding RDKit, the first group of errors I received consisted of the one
requiring that Python be built with -fPIC and what seems to be the typical
USE_STATIC_LIBS error.

Initially, an -fPIC error would occur around 87% which was not cured by the
Python rebuild or any other modification.  I found that by switching to the
SVN code, the problem was solved.  Upon inspecting the error logs, it
appeared that the build process was always referring to the system Boost
install and not my new install despite having set -DBOOST_ROOT correctly.

Currently, the build goes to completion but upon issuing 'from rdkit import
Chem' within Python 2.7, I get an error related to Py_InitModule4 not being
defined.  From a little Google searching for Py_InitModule4 the only thing
I've seen thus far is a conflict in various packages on code built on 32-bit
or 64-bit systems.  It seems that this symbol has been renamed to
Py_InitModule4_64 on 64-bit systems but that change may not be reflected in
all code necessary.  It seemed a widespread problem and not specific to any
one application or library, which makes me think it is something in a Python
include file.
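
Some background that may help here: as far as I know, from Python 2.5 onward the modsupport.h header defines Py_InitModule4 as Py_InitModule4_64 whenever size_t is wider than int, so a 64-bit Python 2.7 no longer exports the plain name, while a library compiled against Python 2.4 headers still asks for it. The sketch below just collects the two facts that decide which name applies; the helper name is invented and pointer width is used as a stand-in for the size_t comparison.

```python
import struct
import sys


def interpreter_summary():
    """Collect the facts that decide which Py_InitModule4 name applies."""
    pointer_bits = struct.calcsize("P") * 8
    version = sys.version_info[:2]
    return {
        "version": "%d.%d" % version,
        "pointer_bits": pointer_bits,
        # Assumption: the rename applies to 64-bit builds of Python 2.5-2.7.
        "expects_suffix_64": pointer_bits == 64 and (2, 5) <= version < (3, 0),
    }


for key, value in sorted(interpreter_summary().items()):
    print("%-18s %s" % (key, value))
```

If this reports a 64-bit 2.7 interpreter, any extension library still requesting Py_InitModule4 was most likely built against an older Python, which would point at a stale Boost.Python library being picked up.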

I appreciate any help that anyone can provide.  Please let me know if I need
to clarify or add any details.

-Kirk






On Mon, Nov 15, 2010 at 10:11 PM, rkdeli...@gmail.com wrote:

 No, I made sure to include the address-model=64 switch to bjam.

 Tomorrow when I get in I'll update the thread with all the steps I've
 followed.

 -Kirk




 On Nov 15, 2010 9:52pm, Greg Landrum greg.land...@gmail.com wrote:
  Kirk,
 
 
 
  On Tue, Nov 16, 2010 at 12:38 AM, Robert DeLisle rkdeli...@gmail.com
 wrote:
 
   Yes, that is also true.
 
  
 
   The error in my most recent messages stems from the default build of
 Python
 
   supports Unicode UCS2, but apparently boost expects UCS4.  A rebuild of
 
   Python with UCS4 enabled fixed that problem.
 
  
 
   Now I get a similar error related to Py_InitModule4 not being defined.
 From
 
   what I can find, this is a 32-bit - 64-bit problem in which this was
 defined
 
   as Py_InitModule4_64 in the 64-bit Python libraries but that change may
 not
 
   have cascaded to all necessary parts of the build process.  Most of the
 
   changes involve some substantial changes to the accessing code, but I'm
 
   still looking for a better option.
 
 
 
  Could it be that the boost libraries you are using were not built in
 
  64bit mode? I've managed to force a 64bit build in the past with the
 
  following command line:
 
  ./bjam address-model=64 cflags=-fPIC cxxflags=-fPIC install
 
 
 
  Best Regards,
 
  -greg
 


Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-16 Thread Robert DeLisle
I apologize for that double post before - itchy send finger.

Here's the specific error I'm getting after the build process has otherwise
succeeded.

>>> from rdkit import Chem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/RDKit_svn_20101115/rdkit/Chem/__init__.py", line 18, in <module>
    from rdkit import rdBase
ImportError: /usr/lib64/libboost_python.so.2: undefined symbol: Py_InitModule4
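
The useful detail in that message is the path: the loader picked up libboost_python from /usr/lib64, not from /usr/local/lib. A small sketch for pulling that out of the exception text follows; the function name and the default prefix are assumptions for illustration only.

```python
import re

_UNDEFINED = re.compile(r"^(?P<library>/\S+): undefined symbol: (?P<symbol>\S+)$")


def explain_import_error(message, expected_prefix="/usr/local/lib"):
    """Split a loader message into (library, symbol, loaded_from_expected_prefix)."""
    match = _UNDEFINED.match(message.strip())
    if match is None:
        return None  # not an "undefined symbol" message
    library = match.group("library")
    return library, match.group("symbol"), library.startswith(expected_prefix)


message = "/usr/lib64/libboost_python.so.2: undefined symbol: Py_InitModule4"
print(explain_import_error(message))
```

In practice one would wrap the import in try/except ImportError and pass str(exc) to the helper; a False in the last position means the library came from somewhere other than the custom install.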







On Tue, Nov 16, 2010 at 3:04 PM, Robert DeLisle rkdeli...@gmail.com wrote:



 -- Forwarded message --
 From: Robert DeLisle rkdeli...@gmail.com
 Date: Tue, Nov 16, 2010 at 3:03 PM
 Subject: Re: Re: [Rdkit-discuss] Building RDKit on CentOS 5
 To: Greg Landrum greg.land...@gmail.com, Robert DeLisle 
 rkdeli...@gmail.com


 I'm sorry for the slow response.  Busy day.

 For this install, I started on a CentOS 5.5 system that was up to date with
 all package upgrades.  Following is what I've done so far:

 installed blas, blas-devel, lapack, lapack-devel through yum

 I had problems in the past with the standard GCC package on CentOS which is
 version 4.1.2, so I rebuilt the GCC 4.4.5 package and included mpfr 2.4.1

 Installed cmake 2.8.2
 Installed flex 2.5.35

 CentOS's Python installation is v2.4.1, so I built and installed 2.7.  Due
 to errors found later in the process, I built this with the -fPIC switch and
 also enabled Unicode UCS4 support

 ./configure CFLAGS=-fPIC --enable-unicode=ucs4

 Built and installed NumPy 1.5.0

 Boost on CentOS 5.5 is v1.33, so I built and installed boost 1.44 with the
 following commands:

 ./bootstrap.sh --with-libraries=python,regex
 ./bjam address-model=64 install


 Finally, with RDKit I have $LD_LIBRARY_PATH set with /usr/local/lib first
 to avoid conflicts with the system packages.  GCC and Python are both in
 /usr/local and these are the instances referred to by my user and root.  For
 RDKit, the following commands were done:

 cmake -DBoost_USE_STATIC_LIBS=OFF -DBOOST_ROOT=/usr/local ..
 make
 make install



 I have also installed FreeType2 and PIL - both seem fine with Python 2.7.
 I attempted aggdraw, but the self-test seems to always give me a Segmentation
 Fault.  I found that I can build aggdraw using the code as-is as long as I
 include CFLAGS=-fpermissive, or there is a one line code change that makes
 the compiler happy on 64-bit.  Either way I still get the seg fault upon
 testing.


 Regarding RDKit, the first group of errors I received consisted of the one
 requiring that Python be built with -fPIC and what seems to be the typical
 USE_STATIC_LIBS error.

 Initially, an -fPIC error would occur around 87% which was not cured by the
 Python rebuild or any other modification.  I found that by switching to the
 SVN code, the problem was solved.  Upon inspecting the error logs, it
 appeared that the build process was always referring to the system Boost
 install and not my new install despite having set -DBOOST_ROOT correctly.

 Currently, the build goes to completion but upon issuing 'from rdkit import
 Chem' within Python 2.7, I get an error related to Py_InitModule4 not being
 defined.  From a little Google searching for Py_InitModule4 the only thing
 I've seen thus far is a conflict in various packages on code built on 32-bit
 or 64-bit systems.  It seems that this symbol has been renamed to
 Py_InitModule4_64 on 64-bit systems but that change may not be reflected in
 all code necessary.  It seemed a widespread problem and not specific to any
 one application or library, which makes me think it is something in a Python
 include file.

 I appreciate any help that anyone can provide.  Please let me know if I
 need to clarify or add any details.

 -Kirk






 On Mon, Nov 15, 2010 at 10:11 PM, rkdeli...@gmail.com wrote:

 No, I made sure to include the address-model=64 switch to bjam.

 Tomorrow when I get in I'll update the thread with all the steps I've
 followed.

 -Kirk




 On Nov 15, 2010 9:52pm, Greg Landrum greg.land...@gmail.com wrote:
  Kirk,
 
 
 
  On Tue, Nov 16, 2010 at 12:38 AM, Robert DeLisle rkdeli...@gmail.com
 wrote:
 
   Yes, that is also true.
 
  
 
   The error in my most recent messages stems from the default build of
 Python
 
   supports Unicode UCS2, but apparently boost expects UCS4.  A rebuild
 of
 
   Python with UCS4 enabled fixed that problem.
 
  
 
   Now I get a similar error related to Py_InitModule4 not being
 defined.  From
 
   what I can find, this is a 32-bit - 64-bit problem in which this was
 defined
 
   as Py_InitModule4_64 in the 64-bit Python libraries but that change
 may not
 
   have cascaded to all necessary parts of the build process.  Most of
 the
 
   changes involve some substantial changes to the accessing code, but
 I'm
 
   still looking for a better option.
 
 
 
  Could it be that the boost libraries you are using were not built in
 
  64bit mode? I've managed to force a 64bit build in the past with the
 
  following command line:
 
  ./bjam address-model=64 cflags=-fPIC

[Rdkit-discuss] Building RDKit on CentOS 5

2010-11-15 Thread Robert DeLisle
I've been working to build RDKit on Centos 5, and I'm hitting a very common
error.  Unfortunately, none of the standard fixes have helped.

Details:

The error that I'm seeing is this:

[ 82%] Building CXX object
Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNParse.cpp.o
[ 83%] Building CXX object
Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNAttribs.cpp.o
[ 83%] Building CXX object
Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/sln.tab.cpp.o
[ 84%] Building CXX object
Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/lex.yysln.cpp.o
Linking CXX shared library libSLNParse.so
/usr/bin/ld: /usr/lib/../lib64/libboost_regex.a(instances.o): relocation
R_X86_64_32 against
`boost::object_cacheboost::re_detail::cpp_regex_traits_basechar,
boost::re_detail::cpp_regex_traits_implementationchar
::do_get(boost::re_detail::cpp_regex_traits_basechar const, unsigned
long)::s_data' can not be used when making a shared object; recompile with
-fPIC
/usr/lib/../lib64/libboost_regex.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[2]: *** [Code/GraphMol/SLNParse/libSLNParse.so] Error 1
make[1]: *** [Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/all] Error 2
make: *** [all] Error 2


I've taken the standard steps of building Python (v2.7) with the -fPIC
flag.  Specifically, I attached CFLAGS=-fPIC to configure in the Python
build.  This solved the first instance of this type of error occurring at
about 3%.

I've also tried the two fixes for Boost with the following command line to
build RDKit:

cmake -DBOOST_ROOT=/usr/local -DBoost_USE_STATIC_LIBS=OFF ..


I still get this error, and I notice that the Boost libraries that are being
referred to are actually the system installation in /usr/lib64 and not those
that I've built in /usr/local/lib.  It seems that I can't force
make to look in the right location.
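
One thing worth noting in the error above is that the linker names a static archive (libboost_regex.a) under /usr/lib64. When the build log is long, a few lines of Python can list every archive that triggers a relocation error. This is a sketch only: the function name is made up and the pattern assumes the GNU ld wording shown in the log above.

```python
import re

# "/usr/bin/ld: <archive>(<member>): relocation ..." as printed by GNU ld.
_LD_ARCHIVE = re.compile(r"ld: (\S+\.a)\(([^)]+)\): relocation")


def static_archives_in_error(log_text):
    """Return the distinct static archives named in ld relocation errors."""
    seen = []
    for archive, _member in _LD_ARCHIVE.findall(log_text):
        if archive not in seen:
            seen.append(archive)
    return seen


log = (
    "Linking CXX shared library libSLNParse.so\n"
    "/usr/bin/ld: /usr/lib/../lib64/libboost_regex.a(instances.o): relocation\n"
    "R_X86_64_32 against a symbol can not be used when making a shared object\n"
)
print(static_archives_in_error(log))
```

Any archive reported outside /usr/local/lib is a hint that the link line was generated with the system Boost and not the custom build.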

Any tips are greatly appreciated.

-Kirk
--
Centralized Desktop Delivery: Dell and VMware Reference Architecture
Simplifying enterprise desktop deployment and management using
Dell EqualLogic storage and VMware View: A highly scalable, end-to-end
client virtualization framework. Read more!
http://p.sf.net/sfu/dell-eql-dev2dev___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-15 Thread Robert DeLisle
Yep, I've definitely done that.  I've even gone so far as to wipe out the
directory entirely and start with a fresh RDKit directory.  I also looked
into the cache file and saw that the library directories appear to be set
as /usr/local/lib and /usr/local/lib64, but once the error occurs, it refers
to /usr/lib64.  I can't seem to find any reason for this.

-Kirk




On Mon, Nov 15, 2010 at 3:26 PM, Eddie Cao cao.yi...@gmail.com wrote:

 Have you tried removing the CMake cache file before rerunning cmake?

 rm -f CMakeCache.txt

 After rerunning cmake, take a look at that file again and make sure things like
 Boost_INCLUDE_DIR and Boost_LIBRARY_DIRS all point to /usr/local/include
 and /usr/local/lib, etc.
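
Checking the cache by eye is easy to get wrong, so here is a sketch that pulls the Boost entries out of CMakeCache.txt and lists the ones pointing outside a given prefix. The helper names are mine, the sample text is invented, and the only thing assumed about the file is its NAME:TYPE=VALUE line format.

```python
def boost_cache_entries(cache_text):
    """Return {name: value} for CMakeCache.txt entries whose name starts with Boost."""
    entries = {}
    for line in cache_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue  # blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name = key.split(":", 1)[0]  # drop the ":TYPE" suffix
        if name.lower().startswith("boost"):
            entries[name] = value
    return entries


def outside_prefix(entries, prefix="/usr/local"):
    """Names of path-valued entries that do not live under `prefix`."""
    return sorted(
        name for name, value in entries.items()
        if value.startswith("/") and not value.startswith(prefix)
    )


# Invented sample; a real cache is read with open("CMakeCache.txt").read().
sample = """
// Boost include directory
Boost_INCLUDE_DIR:PATH=/usr/local/include
Boost_REGEX_LIBRARY:FILEPATH=/usr/lib64/libboost_regex.a
CMAKE_BUILD_TYPE:STRING=Release
"""
print(outside_prefix(boost_cache_entries(sample)))
```

An empty list means every Boost path in the cache sits under the prefix; anything else names the variable to override on the cmake command line.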

 Eddie


 On Nov 15, 2010, at 12:45 PM, Robert DeLisle wrote:

  I've been working to build RDKit on Centos 5, and I'm hitting a very
 common error.  Unfortunately, none of the standard fixes have helped.
 
  Details:
 
  The error that I'm seeing is this:
 
  [ 82%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNParse.cpp.o
  [ 83%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNAttribs.cpp.o
  [ 83%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/sln.tab.cpp.o
  [ 84%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/lex.yysln.cpp.o
  Linking CXX shared library libSLNParse.so
  /usr/bin/ld: /usr/lib/../lib64/libboost_regex.a(instances.o): relocation
 R_X86_64_32 against
 `boost::object_cacheboost::re_detail::cpp_regex_traits_basechar,
 boost::re_detail::cpp_regex_traits_implementationchar
 ::do_get(boost::re_detail::cpp_regex_traits_basechar const, unsigned
 long)::s_data' can not be used when making a shared object; recompile with
 -fPIC
  /usr/lib/../lib64/libboost_regex.a: could not read symbols: Bad value
  collect2: ld returned 1 exit status
  make[2]: *** [Code/GraphMol/SLNParse/libSLNParse.so] Error 1
  make[1]: *** [Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/all] Error 2
  make: *** [all] Error 2
 
 
  I've taken the standard steps of building Python (v2.7) with the -fPIC
 flag.  Specifically, I attached CFLAGS=-fPIC to configure in the Python
 build.  This solved the first instance of this type of error occurring at
 about 3%.
 
  I've also tried the two fixes for Boost with the following command line
 to build RDKit:
 
  cmake -DBOOST_ROOT=/usr/local -DBoost_USE_STATIC_LIBS=OFF ..
 
 
  I still get this error, and I notice that the Boost libraries that are
 being referred to are actually the system installation in /usr/lib64 and not
 those that I've built in /usr/local/lib.  It seems that I can't force
 make to look in the right location.
 
  Any tips are greatly appreciated.
 
  -Kirk
 
 
 
 




Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-15 Thread Robert DeLisle
It must be something in the release version of RDKit.  I just grabbed the
SVN version, put it in the same location, followed the same procedures, and
it has just compiled fine without any other changes on my part.

Greg - any ideas what the difference is here?  Not that it matters given
that the SVN is working, but just for curiosity's sake.

Sadly, now I get this from within Python:


>>> from rdkit import Chem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/RDKit_svn_20101115/rdkit/Chem/__init__.py", line 18, in <module>
    from rdkit import rdBase
ImportError: /usr/lib64/libboost_python.so.2: undefined symbol: PyUnicodeUCS4_FromEncodedObject



-Kirk





On Mon, Nov 15, 2010 at 3:28 PM, Robert DeLisle rkdeli...@gmail.com wrote:

 Yep, I've definitely done that.  I've even gone so far as to wipe out the
 directory entirely and start with a fresh RDKit directory.  I also looked
 into the cache file and saw that the library directories appear to be set
 as /usr/local/lib and /usr/local/lib64, but once the error occurs, it refers
 to /usr/lib64.  I can't seem to find any reason for this.

 -Kirk





 On Mon, Nov 15, 2010 at 3:26 PM, Eddie Cao cao.yi...@gmail.com wrote:

 Have you tried removing the CMake cache file before rerunning cmake?

 rm -f CMakeCache.txt

 After rerunning cmake, take a look at that file again and make sure things
 like Boost_INCLUDE_DIR and Boost_LIBRARY_DIRS all point to
  /usr/local/include and /usr/local/lib, etc.

 Eddie


 On Nov 15, 2010, at 12:45 PM, Robert DeLisle wrote:

  I've been working to build RDKit on Centos 5, and I'm hitting a very
 common error.  Unfortunately, none of the standard fixes have helped.
 
  Details:
 
  The error that I'm seeing is this:
 
  [ 82%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNParse.cpp.o
  [ 83%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNAttribs.cpp.o
  [ 83%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/sln.tab.cpp.o
  [ 84%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/lex.yysln.cpp.o
  Linking CXX shared library libSLNParse.so
  /usr/bin/ld: /usr/lib/../lib64/libboost_regex.a(instances.o): relocation
 R_X86_64_32 against
 `boost::object_cacheboost::re_detail::cpp_regex_traits_basechar,
 boost::re_detail::cpp_regex_traits_implementationchar
 ::do_get(boost::re_detail::cpp_regex_traits_basechar const, unsigned
 long)::s_data' can not be used when making a shared object; recompile with
 -fPIC
  /usr/lib/../lib64/libboost_regex.a: could not read symbols: Bad value
  collect2: ld returned 1 exit status
  make[2]: *** [Code/GraphMol/SLNParse/libSLNParse.so] Error 1
  make[1]: *** [Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/all] Error
 2
  make: *** [all] Error 2
 
 
  I've taken the standard steps of building Python (v2.7) with the -fPIC
 flag.  Specifically, I attached CFLAGS=-fPIC to configure in the Python
 build.  This solved the first instance of this type of error occurring at
 about 3%.
 
  I've also tried the two fixes for Boost with the following command line
 to build RDKit:
 
  cmake -DBOOST_ROOT=/usr/local -DBoost_USE_STATIC_LIBS=OFF ..
 
 
  I still get this error, and I notice that the Boost libraries that are
 being referred to are actually the system installation in /usr/lib64 and not
 those that I've built in /usr/local/lib.  It seems that I can't force
 make to look in the right location.
 
  Any tips are greatly appreciated.
 
  -Kirk
 
 
 
 





Re: [Rdkit-discuss] Building RDKit on CentOS 5

2010-11-15 Thread Robert DeLisle
Yes, that is also true.

The error in my most recent messages stems from the fact that the default
build of Python supports Unicode UCS2, but apparently boost expects UCS4.
A rebuild of Python with UCS4 enabled fixed that problem.
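
For anyone who needs to check which flavour a given Python 2 build is: sys.maxunicode is 0xFFFF on a narrow (UCS2) build and 0x10FFFF on a wide (UCS4) build. The helper below is just a sketch with a made-up name; on Python 3.3 and later the distinction no longer exists and the wide value is always reported.

```python
import sys


def unicode_build(maxunicode=None):
    """Classify a Python build as 'UCS2' or 'UCS4' from sys.maxunicode."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "UCS4" if maxunicode > 0xFFFF else "UCS2"


print(unicode_build())          # the interpreter running this script
print(unicode_build(0xFFFF))    # what a narrow (UCS2) build reports
print(unicode_build(0x10FFFF))  # what a wide (UCS4) build reports
```

An extension library asking for PyUnicodeUCS4_* symbols needs an interpreter for which this prints UCS4.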

Now I get a similar error related to Py_InitModule4 not being defined.  From
what I can find, this is a 32-bit vs. 64-bit problem: the symbol is defined
as Py_InitModule4_64 in the 64-bit Python libraries, but that change may not
have cascaded to all necessary parts of the build process.  Most of the
fixes I've found involve substantial changes to the calling code, but I'm
still looking for a better option.







On Mon, Nov 15, 2010 at 4:14 PM, Eddie Cao cao.yi...@gmail.com wrote:

 Make sure /usr/local/lib appears before /usr/lib64 in your LD_LIBRARY_PATH.
 It seems python import loads the system boost rather than your custom boost.
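
A quick way to verify the ordering is to split the variable and compare positions. The function below is a sketch with an invented name; it only inspects the string and does not ask the dynamic loader anything.

```python
import os


def first_wins(search_path, preferred, other):
    """True if `preferred` is listed before `other` (or `other` is absent)."""
    dirs = [d.rstrip("/") for d in search_path.split(":") if d]
    if preferred not in dirs:
        return False
    if other not in dirs:
        return True
    return dirs.index(preferred) < dirs.index(other)


current = os.environ.get("LD_LIBRARY_PATH", "")
print("LD_LIBRARY_PATH = %r" % current)
print(first_wins(current, "/usr/local/lib", "/usr/lib64"))
```

Since /usr/lib64 is normally searched by default anyway, having /usr/local/lib present at all in LD_LIBRARY_PATH should already put it ahead of the system copy.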

 -Eddie

 On Nov 15, 2010, at 2:44 PM, Robert DeLisle wrote:

 It must be something in the release version of RDKit.  I just grabbed the
 SVN version, put it in the same location, followed the same procedures, and
 it has just compiled fine without any other changes on my part.

 Greg - any ideas what the difference is here?  Not that it matters given
 that the SVN is working, but just for curiosity's sake.

 Sadly, now I get this from within Python:


 >>> from rdkit import Chem
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/opt/RDKit_svn_20101115/rdkit/Chem/__init__.py", line 18, in <module>
     from rdkit import rdBase
 ImportError: /usr/lib64/libboost_python.so.2: undefined symbol: PyUnicodeUCS4_FromEncodedObject



 -Kirk





 On Mon, Nov 15, 2010 at 3:28 PM, Robert DeLisle rkdeli...@gmail.com wrote:

 Yep, I've definitely done that.  I've even gone so far as to wipe out the
 directory entirely and start with a fresh RDKit directory.  I also looked
 into the cache file and saw that the library directories appear to be set
 as /usr/local/lib and /usr/local/lib64, but once the error occurs, it refers
 to /usr/lib64.  I can't seem to find any reason for this.

 -Kirk





 On Mon, Nov 15, 2010 at 3:26 PM, Eddie Cao cao.yi...@gmail.com wrote:

 Have you tried removing the CMake cache file before rerunning cmake?

 rm -f CMakeCache.txt

 After rerunning cmake, take a look at that file again and make sure things
 like Boost_INCLUDE_DIR and Boost_LIBRARY_DIRS all point to
  /usr/local/include and /usr/local/lib, etc.

 Eddie


 On Nov 15, 2010, at 12:45 PM, Robert DeLisle wrote:

  I've been working to build RDKit on Centos 5, and I'm hitting a very
 common error.  Unfortunately, none of the standard fixes have helped.
 
  Details:
 
  The error that I'm seeing is this:
 
  [ 82%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNParse.cpp.o
  [ 83%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/SLNAttribs.cpp.o
  [ 83%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/sln.tab.cpp.o
  [ 84%] Building CXX object
 Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/lex.yysln.cpp.o
  Linking CXX shared library libSLNParse.so
  /usr/bin/ld: /usr/lib/../lib64/libboost_regex.a(instances.o):
 relocation R_X86_64_32 against
 `boost::object_cacheboost::re_detail::cpp_regex_traits_basechar,
 boost::re_detail::cpp_regex_traits_implementationchar
 ::do_get(boost::re_detail::cpp_regex_traits_basechar const, unsigned
 long)::s_data' can not be used when making a shared object; recompile with
 -fPIC
  /usr/lib/../lib64/libboost_regex.a: could not read symbols: Bad value
  collect2: ld returned 1 exit status
  make[2]: *** [Code/GraphMol/SLNParse/libSLNParse.so] Error 1
  make[1]: *** [Code/GraphMol/SLNParse/CMakeFiles/SLNParse.dir/all] Error
 2
  make: *** [all] Error 2
 
 
  I've taken the standard steps of building Python (v2.7) with the -fPIC
 flag.  Specifically, I attached CFLAGS=-fPIC to configure in the Python
 build.  This solved the first instance of this type of error occurring at
 about 3%.
 
  I've also tried the two fixes for Boost with the following command line
 to build RDKit:
 
  cmake -DBOOST_ROOT=/usr/local -DBoost_USE_STATIC_LIBS=OFF ..
 
 
  I still get this error, and I notice that the Boost libraries that are
 being referred to are actually the system installation in /usr/lib64 and not
 those that I've built in /usr/local/lib.  It seems that I can't force
 make to look in the right location.
 
  Any tips are greatly appreciated.
 
  -Kirk
 
 
 
 

Re: [Rdkit-discuss] How to trap the exceptions in RDKit?

2010-10-22 Thread Robert DeLisle
The easiest trap is simply this:

if (m is None):
  #error handling code

The problem that I have had is that this will effectively skip bad
molecules, but in a large SD file, it is difficult to find out which
molecules they were.

from rdkit import Chem

sd = Chem.SDMolSupplier('test.sdf')

for m in sd:
    if m is None:
        # how do I get more information about the broken molecules?
        pass
    else:
        # do the normal stuff.
        pass
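
One way to answer the question in the comment above is to keep the record index while iterating. This is only a minimal sketch, assuming the supplier yields None for records that fail to parse (as in the loop above). The names index_failures and report_bad_records are made up for illustration, and GetItemText is used on the assumption that your RDKit version returns the raw SD record text for a given index:

```python
def index_failures(supplier):
    """Split an iterable of molecules into parsed ones and failed indices."""
    good, bad = [], []
    for idx, mol in enumerate(supplier):
        if mol is None:
            bad.append(idx)      # remember which record failed
        else:
            good.append(mol)
    return good, bad


def report_bad_records(sdf_path):
    # RDKit is imported here so index_failures() stays usable without it.
    from rdkit import Chem

    suppl = Chem.SDMolSupplier(sdf_path)
    good, bad = index_failures(suppl)
    for idx in bad:
        print("record %d failed to parse:" % idx)
        print(suppl.GetItemText(idx))   # raw text of the offending record
    return good, bad
```

Calling report_bad_records('test.sdf') would then print each broken record so it can be inspected or fixed by hand.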









On Fri, Oct 22, 2010 at 8:51 AM, nikolaus.sti...@novartis.com wrote:


 Hi,

 your molecule itself is not ok - the Sn has a larger valence than
 permitted. There was a similar case recently involving phosphorus. I would guess
 that you should fix your molecule first.

 In your code example below your molecule m is None and hence the rest will
 not work.

 Hope that helps

 Cheers
 Nik



 From: sridhar.kuntamukk...@thomsonreuters.com
 Date: 10/22/2010 04:41 PM
 To: rdkit-discuss@lists.sourceforge.net
 Subject: [Rdkit-discuss] How to trap the exceptions in RDKit?




 Hi,
 I have the following code which raises an exception because the molecule is
 not up to its expectations. But I can’t find a way to trap the exception.
 Can someone suggest one, please?

 from rdkit import Chem
 from rdkit.Chem import AvailDescriptors
 from rdkit.Chem import Crippen


 m=Chem.MolFromSmiles('c1ccc2c(c1)/C=N/c3c3S[Ti]O2.[CH]1[CH][CH][CH][CH]1.Cl[Sn-](Cl)(Cl)(Cl)Cl')
 # I tried this way
 molog = Crippen.MolLogP(m)
 print molog
 # and also this way first
 if AvailDescriptors.descDict['MolLogP'](m):
     mollogp = AvailDescriptors.descDict['MolLogP'](m)

 I also wanted to calc. NumHDonors and NumHAcceptors. But if it failed on
 one descriptor, will it fail on other descriptors as well?
 Any suggestions?
 Thanks
 Sridhar

 p.s. to Eddie. Turned out my server has the 11g client and the installation
 on the server works fine. I guess I must have missed a line or two of
 instructions that the client must be 11g’s and not the DB itself. My PC had
 oracle 9 and 10 clients.

 From: Eddie Cao [mailto:cao.yi...@gmail.com]
 Sent: Wednesday, October 20, 2010 4:26 PM
 To: Kuntamukkula, Sridhar (HlthcrScience)
 Cc: rdkit-disc...@lists.sourceforge.net
 Subject: Re: [Rdkit-discuss] [Rdkit-devel] How to build the RDKit?

 Hi,

 Being not an Oracle user, I cannot give you a concrete answer, but a quick
 Google search indicates that it might be a version inconsistency between the
 client and the server. Are you sure you are connecting to 11g? Please
 contact your database administrator for problems regarding Oracle database
 or ask the folks on the cx_Oracle mailing list:

 http://lists.sourceforge.net/lists/listinfo/cx-oracle-users

 -Eddie


 On Oct 20, 2010, at 11:03 AM, sridhar.kuntamukk...@thomsonreuters.com wrote:


 Hi,
 I have downloaded the "Windows x86 Installer (Oracle 11g, Python 2.5)"
 (http://prdownloads.sourceforge.net/cx-oracle/cx_Oracle-5.0.4-11g.win32-py2.5.msi?download)
 to my PC and installed it.
 From Python command-line, when I try to connect to oracle, I get the
 following error.

 C:\RDkit>python
 Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import cx_Oracle
 >>> con = cx_Oracle.connect("user", "pwd", "chemdev")
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 cx_Oracle.DatabaseError: ORA-24315: illegal attribute type

 I have originally used the connstr=user/p...@chemdev' and this gave me the
 same error as above.
 Then I found the above syntax in the cx_oracle_doc folder’s readme.txt and
 am lost now.

 MY PC has Windows XP, Oracle is on a different server (with a TNS entry
 “chemdev”) and I just added TNS_ADMIN Registry_entry and the path of
 TNS_ADMIN to the PATH env. Variable.
 From a command prompt, sqlplus user/p...@chemdev works fine.

 Any thoughts?

 Many thanks
 Sridhar

 From: nikolaus.sti...@novartis.com [mailto:nikolaus.sti...@novartis.com]
 Sent: Wednesday, October 20, 2010 2:07 AM
 To: Eddie Cao
 Cc: rdkit-de...@lists.sourceforge.net; Kuntamukkula, Sridhar (HlthcrScience); rdkit-discuss@lists.sourceforge.net
 Subject: Re: [Rdkit-devel] How to build the RDKit?


 Hi Sridhar,

 congrats on getting things working. One more comment - maybe you want to
 post these kind of questions rather in the discuss than the devel list. It
 is much more populated and you will for sure get replies quicker.

 Cheers
 Nik

 From: Eddie Cao cao.yi...@gmail.com
 Date: 10/19/2010 11:49 PM
 To: sridhar.kuntamukk...@thomsonreuters.com
 Cc: rdkit-de...@lists.sourceforge.net
 Subject: Re: [Rdkit-devel] How to build the RDKit?








 Hi Sridhar,

 Congratulations! If 

Re: [Rdkit-discuss] Error depicting a smiles string

2010-05-03 Thread Robert DeLisle
Greg,

I found the files of interest and ran a few tests.  The files resulting from
the tests are in the attached archive and here are the details.

The structures in question came from the non-aggregators set of Shoichet
which were available on his web page.  My original intent was to convert the
SMILES files from the Shoichet set to SDF.  This went smoothly enough until
I had to process the SDF for a different purpose.  Four structures were
found to cause problems.

In the attached archive, each offending structure has 5 associated files
named according to the NGC ID associated with the original SMILES:

.smi - The original SMILES.

.sdf - The result I had found in my SMILES to SDF conversion having nan as
the atom coordinates.

.mol - Generated manually today by:
from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles('offending SMILES')
AllChem.Compute2DCoords(m)
print >>file('blah.mol','w+'), Chem.MolToMolBlock(m)

_fix.smi - This is the RDKit generated SMILES for the structure.

_fix.mol - The result of the following after the code snip above:
m = Chem.MolFromSmiles(Chem.MolToSmiles(m))
AllChem.Compute2DCoords(m)
print >>file('blah_fix.mol','w+'), Chem.MolToMolBlock(m)


Only 14662 did not result in a fixed mol file.  Interestingly, the first bad
conversion only has nan for coordinates of the platinum hexachloride.  After
the SMILES round-trip, all coordinates are nan.
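
For anyone wanting to screen a larger set for this problem, here is a rough sketch of how the nan coordinates could be detected directly from the mol block text. It assumes V2000 mol blocks (atom count in the first three characters of the fourth line, x/y/z in three 10-character fields); nan_atom_indices is a made-up helper name, not an RDKit function:

```python
def nan_atom_indices(molblock):
    """Return 0-based indices of atoms whose V2000 coordinates are not finite."""
    lines = molblock.splitlines()
    n_atoms = int(lines[3][0:3])            # counts line: first field is the atom count
    bad = []
    for i in range(n_atoms):
        atom_line = lines[4 + i]
        for start in (0, 10, 20):           # x, y, z are 10-character fields
            field = atom_line[start:start + 10]
            try:
                value = float(field)
            except ValueError:              # e.g. Windows-style "-1.#IND"
                value = float("nan")
            # nan is the only value that differs from itself
            if value != value or value in (float("inf"), float("-inf")):
                bad.append(i)
                break
    return bad
```

With the snippets above, nan_atom_indices(Chem.MolToMolBlock(m)) would list the atoms that ended up without usable coordinates; an empty list means the depiction looks fine.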

Please let me know if you need any further details.

-Kirk








On Sat, May 1, 2010 at 10:24 PM, Greg Landrum greg.land...@gmail.com wrote:

 On Fri, Apr 30, 2010 at 12:56 PM, Greg Landrum greg.land...@gmail.com
 wrote:
 
  I don't see any problems in your script, so I have to assume that it's
  a problem with the binary you're using. I'm travelling and don't have
  a windows machine handy, so this will have to wait until I'm back home
  this weekend.

 Ok, I was able to reproduce this on my windows box. It's clearly a
 problem with the windows build:

 In [29]: m = Chem.MolFromSmiles('OC(=O)C1CCCC1')

 In [30]: AllChem.Compute2DCoords(m)
 Out[30]: 0

 In [31]: print Chem.MolToMolBlock(m)
 --- print(Chem.MolToMolBlock(m))

 RDKit  2D

  8  8  0  0  0  0  0  0  0  0999 V2000
   -1.#IND1.#QNB0. O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.#IND1.#QNB0. C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.#IND1.#QNB0. O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.#IND1.#QNB0. C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.#IND1.#QNB0. C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.#IND1.#QNB0. C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.#IND1.#QNB0. C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.#IND1.#QNB0. C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  2  3  2  3
  2  4  1  0
  4  5  1  0
  5  6  1  0
  6  7  1  0
  7  8  1  0
  8  4  1  0
 M  END

 I will look into this and see where the problem lies.

 Note: whatever is going on here doesn't affect every depiction; other
 molecules do end up with correct coordinates.

 Best Regards,
 -greg





nan.tgz
Description: GNU Zip compressed data


Re: [Rdkit-discuss] RDKit and PIL

2010-04-03 Thread Robert DeLisle
Oops!  I meant Q4_2009.

I was working on a Windows system with Python 2.6.  I did not build PIL from
source - I simply downloaded the Windows installer.  Reverting to the 1.1.6
version was the only change I made that fixed the font error.

I have another LINUX system that has the same font error that I want to
check.  I don't remember which version of PIL was installed on that one, but
I'm reasonably sure it was 1.1.7.  Once I test that one, I'll let you know
if I get a similar result.  I'll also capture the complete error message on
the Windows system and send that to you.

I hope your vacation was good.

-Kirk






On Sat, Apr 3, 2010 at 8:19 AM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Kirk,

 On Fri, Apr 2, 2010 at 2:47 PM,  rkdeli...@gmail.com wrote:
  While updating various parts of my system, I found that the latest version
  of RDKit (Q3_2009) combined with the latest version of PIL (1.1.7) leads to
  errors when trying to execute Draw.MolToImageFile(...)

 FYI: there is a Q4_2009 version.

  The error stated
  "cannot load font" and no image was generated. When I downgraded to PIL
  v1.1.6, all was fine.
 
  Nothing mission critical here, just an FYI.

 Thanks for the information. Which system was this on? Did you build
 PIL both times yourself or get it in binary form? The reason I ask is
 that 1.1.7 works fine for me on the Mac:
 [3] m = Chem.MolFromSmiles('c1n1')
 [4] Draw.MolToImageFile(m,'blah.png')
 [5] import Image
 [6] Image.VERSION
 Out[6]: '1.1.7'

 The "cannot load font" type errors are, I think, generally related to
 freetype, so I'm a bit surprised that one version would work and the
 other not.

 Best regards,
 -greg



Re: [Rdkit-discuss] beta of Q4 2009 release up

2010-01-20 Thread Robert DeLisle
Gianluca,

Interesting.  My failed Fedora 12 (32-bit) did not have any issues with
PYTHON_INCLUDE_DIR but rather the CMake - Boost library issue that Greg
describes.

-Kirk





On Wed, Jan 20, 2010 at 2:48 AM, Gianluca Sforna gia...@gmail.com wrote:

 On Wed, Jan 20, 2010 at 6:03 AM, Greg Landrum greg.land...@gmail.com
 wrote:
  Please note that the new release supports cmake-based builds and is,
  consequently, much easier to build than before. Notes on how to do
  builds on linux/mac are here:
  http://code.google.com/p/rdkit/wiki/BuildingWithCmake
  Windows instructions will be here (I'm still working on these):
  http://code.google.com/p/rdkit/wiki/BuildingOnWindows_2009Q4

 I tried a build in my Fedora 12 x86_64, but a simple "mkdir build; cd
 build; cmake ..; make" failed because PYTHON_INCLUDE_DIR was not
 correctly set.
 Is there a reason why we are doing that hackery to find python libs?

 BTW, the following patch fixed my build:

 diff --git a/CMakeLists.txt b/CMakeLists.txt
 index 2f5be8f..f42198b 100644
 --- a/CMakeLists.txt
 +++ b/CMakeLists.txt
 @@ -24,39 +24,10 @@ set(RDKit_PythonDir ${CMAKE_SOURCE_DIR}/rdkit)
  # defines macros: rdkit_python_extension, rdkit_test
  include(RDKitUtils)

 -#---
 -# pull in python:
 -# start with a bit of hackery to allow the user to provide their own
 -# path to python:
 -if(PYTHON_LIBRARIES)
 -  set(oPYTHON_LIBRARIES ${PYTHON_LIBRARIES})
 -endif(PYTHON_LIBRARIES)
 -if(PYTHON_INCLUDE_DIR)
 -  set(oPYTHON_INCLUDE_DIR ${PYTHON_INCLUDE_DIR})
 -endif(PYTHON_INCLUDE_DIR)
 -find_package(PythonLibs)
 -if(oPYTHON_LIBRARIES)
 -  set(PYTHON_LIBRARIES ${oPYTHON_LIBRARIES})
 -endif(oPYTHON_LIBRARIES)
 -if(oPYTHON_INCLUDE_DIR)
 -  set(PYTHON_INCLUDE_DIR ${oPYTHON_INCLUDE_DIR})
 -endif(oPYTHON_INCLUDE_DIR)
 -
 -
 -
 -if(NOT PYTHON_LIBRARIES AND NOT PYTHON_INCLUDE_DIR)
 -  set(PYTHON_FOUND NO)
 -else(NOT PYTHON_LIBRARIES AND NOT PYTHON_INCLUDE_DIR)
 -  set(PYTHON_FOUND YES)
 -endif(NOT PYTHON_LIBRARIES AND NOT PYTHON_INCLUDE_DIR)
 -if(PYTHON_FOUND)
 -  MESSAGE(STATUS Found Python libraries in ${PYTHON_INCLUDE_DIR} as
 ${PYTHON_LIBRARIES})
 -else(PYTHON_FOUND)
 -  MESSAGE(FATAL_ERROR Python libraries not found)
 -endif(PYTHON_FOUND)
 -
 -
 -include_directories(${PYTHON_INCLUDE_DIR})
 +find_package(PythonLibs REQUIRED)
 +include_directories(${PYTHON_INCLUDE_PATH})
  link_directories(${PYTHON_LIBRARIES})
 +
  find_package(NumPy REQUIRED)
  include_directories(${PYTHON_NUMPY_INCLUDE_PATH})

 Cheers

 G.


 --
 Gianluca Sforna

 http://morefedora.blogspot.com
 http://www.linkedin.com/in/gianlucasforna




Re: [Rdkit-discuss] SearchDb functionality in Q32009

2009-11-30 Thread Robert DeLisle
Greg,

Yes, that did the trick - thank you.  Interestingly, my previous version
didn't seem to have that dependency.  Odd.

Now if I can just get my apache web server to recognize it as well.  8^)
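
For the web server case, a minimal sketch that might help, assuming the problem is simply that the server process does not inherit PYTHONPATH: extend sys.path inside the script before importing SearchDb. The Projects/DbCLI location comes from Greg's note below; the function name add_dbcli_to_path is made up for illustration:

```python
import os
import sys


def add_dbcli_to_path(rdbase=None):
    """Append $RDBASE/Projects/DbCLI to sys.path (once) and return that directory."""
    rdbase = rdbase or os.environ.get("RDBASE")
    if not rdbase:
        raise RuntimeError("RDBASE is not set and no path was given")
    dbcli = os.path.join(rdbase, "Projects", "DbCLI")
    if dbcli not in sys.path:
        sys.path.append(dbcli)
    return dbcli
```

After calling add_dbcli_to_path() at the top of the script, "import SearchDb" should resolve the same way it does from a shell with PYTHONPATH exported.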

-Kirk



On Wed, Nov 25, 2009 at 9:47 PM, Greg Landrum greg.land...@gmail.com wrote:

 Dear Kirk,

 On Wed, Nov 25, 2009 at 10:06 PM,  rkdeli...@gmail.com wrote:
  With a previous version of RDKit I had been able to do this:
 
  import SearchDb
  from SearchDb import parser
 
  What are the new namespaces to get this back up?

 SearchDb.py is in $RDBASE/Projects/DbCLI. This isn't in the usual
 recommended PYTHONPATH, so you'll have to add that directory
 explicitly, something like:
 export PYTHONPATH=$PYTHONPATH:$RDBASE/Projects/DbCLI

 hope this helps,
 -greg



Re: [Rdkit-discuss] Compiling on Red Hat linux

2009-03-27 Thread Robert DeLisle
Can't you get a 32-bit source file and rebuild it for 64?



On Fri, Mar 27, 2009 at 2:36 PM, George Oakman oakm...@hotmail.com wrote:

  Thank you all for the great comments; what I need to do is becoming
 clearer.

 I just can't find an x86_64 lapack RPM for RHEL4, but I'll keep looking.
 If one of you knows where I could find it, that'd be great.

 George.

  From: ig...@helix.nih.gov
  To: greg.land...@gmail.com
  Date: Fri, 27 Mar 2009 15:46:10 -0400
  CC: rdkit-discuss@lists.sourceforge.net
  Subject: Re: [Rdkit-discuss] Compiling on Red Hat linux
 
 
   How about now?
   http://code.google.com/p/rdkit/wiki/BuildingOnLinux
   or
   http://code.google.com/p/rdkit/wiki/NewLinuxBuild
  
  Ah, this is much better! Would it be possible to add a bullet point with
  options for python-less build?
 
even (?!) the default?
  
   That's definitely possible, but I wonder how advisable it is.
   What fraction of active linux boxes are running 64bit?
 
  I switched all of the linux computers I control to 64 bit for a couple
  of years now. Out of 60+ systems I think I have 32-bit installed on 2,
  one being my old home PC...
  There is simply no good reason to run 32-bit linux on 64-bit hardware.
  All of the old 32-bit programs work fine for me, and there is a bonus
  that I don't have the ancient 2/4 Gb filesize/RAM limits.
 
  Igor
 
 
 




Re: [Rdkit-discuss] RDKit - DbCLI

2008-11-24 Thread Robert DeLisle
Fantastic!  Thanks, Greg!

After I got things working I've been able to generate a database and do some
preliminary searches.  I'm impressed at how quickly I can search ~100,000
compounds with SMARTS patterns.  I have a feeling this one is going to get a
lot of use.

-Kirk



On Mon, Nov 24, 2008 at 12:45 AM, Greg Landrum greg.land...@gmail.com wrote:

 Dear Kirk,

 On Fri, Nov 21, 2008 at 12:38 AM, Robert DeLisle rkdeli...@gmail.com
 wrote:
 
  After running through the process with exception handling in place I was
  able to isolate 10 structures that were being problematic.  All of them
 had
  at least one bond designated as 0 order in the SD file - much as you
 found
  for some of the other structures previously.  I assume that these passed
 the
  initial import step but are failing upon descriptor generation for
 obvious
  reasons.
 
  I suppose the only request that I have is for more graceful error
 handling.
  I've attached my (admittedly sloppy) version of CreateDB.py showing what
 I
  did to isolate the errors.

 The problem here was in the mol file parser: it was not correctly
 setting up bonds that have order 0. Now it generates a warning (order
 0 isn't technically allowed by the ctab spec) and sets the bond up
 correctly. I also added some error checking to handle other bogus bond
 orders.

 This was entered as issue 2337369
 (https://sourceforge.net/tracker2/?func=detail&aid=2337369&group_id=160139&atid=814650)
 and fixed in rev892.

 -greg



Re: [Rdkit-discuss] RDKit - DbCLI

2008-11-20 Thread Robert DeLisle
Greg,

After running through the process with exception handling in place I was
able to isolate 10 structures that were being problematic.  All of them had
at least one bond designated as 0 order in the SD file - much as you found
for some of the other structures previously.  I assume that these passed the
initial import step but are failing upon descriptor generation for obvious
reasons.

I suppose the only request that I have is for more graceful error handling.
I've attached my (admittedly sloppy) version of CreateDB.py showing what I
did to isolate the errors.
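
The general pattern I used can be sketched as follows. This is a simplified illustration rather than the attached CreateDB.py itself; process_with_isolation is a made-up name and the process argument stands in for whatever per-molecule step (fingerprints, descriptors, etc.) is failing:

```python
def process_with_isolation(records, process):
    """Apply process() to each record, collecting failures instead of stopping."""
    results, failures = [], []
    for idx, record in enumerate(records):
        try:
            results.append(process(record))
        except Exception as exc:    # broad on purpose: the goal is to find the culprit
            failures.append((idx, record, repr(exc)))
    return results, failures
```

The failures list then gives the position and content of each offending record, which is enough to pull the structures out of the SD file for a closer look.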

-Kirk





On Thu, Nov 20, 2008 at 1:33 PM, rkdeli...@gmail.com wrote:

 Indeed I can. Luckily I had a console window open with the error in place
 just as I saw your message:



 [13:21:16] INFO: Done: 54500
 Traceback (most recent call last):
   File "C:\RDKit_Q32008_1\Projects\dbcli\CreateDB.py", line 222, in <module>
     mol = Chem.Mol(str(pkl))
 RuntimeError: Unknown exception


 I've just wrapped this one in a try-catch block as well.




 On Nov 20, 2008 1:17pm, Greg Landrum greg.land...@gmail.com wrote:
  Can you send me the console output without disclosing things you
  oughtn't to disclose?

  FYI: the deprecation warnings ought not to be causing the problem.
  There ought to be a bug report filed against this already, but it
  looks like I forgot to submit it. grn.

  -greg

  On Thu, Nov 20, 2008 at 9:06 PM,   wrote:
   Greg,

   Thanks for the quick response.

   In reading my original question I realize I didn't explain myself well.
   Sorry about that. 8^)

   I'm trying to set up a database of ~100,000 structures which will be
   queried by very few structures at a time. While running CreateDB.py I
   get to the step that gives an output of:

   'Generating fingerprints and descriptors:'

   In reading the output more closely I see that there are some deprecation
   warnings that mention a distance matrix - that's where my original
   question regarding a pairwise computation step came from. Regardless,
   after around 50,000 structures, I get a 'Runtime: unexpected exception'
   message and Python stops. Having done a bit more research I see that each
   molecule is passed through Atom Pair, Fingerprint, and Descriptor
   generation. I assume it is failing somewhere within those steps, but I
   haven't yet identified where or why. I have just wrapped all of those
   procedures in try-catch blocks in hopes of finding the offending
   structure. Once I have it, I'll do some tests on it and send it your way.

   -Kirk

   On Nov 20, 2008 12:41pm, Greg Landrum wrote:
   [moving a general-interest question to the mailing list]

   Hi Kirk,

   On Thu, Nov 20, 2008 at 6:03 PM,   wrote:
    I have another question on DbCLI. After getting rid of problematic
    structures, I was able to get DbCLI to the pairwise comparison step,
    but my

   I'm not sure what the pairwise comparison step is with the DbCLI stuff.
   Step one is loading the database with CreateDb.py, step 2 is doing
   searches with SearchDb.py. What are you asking about?

    dataset has on the order of 100,000 structures. After about 50,000
    structures Python issued an Unexpected error response and stopped. Is
    this likely due to the enormous size of a pairwise distance table for
    this dataset? Have you had problems with very large datasets in the past
    or has this typically worked smoothly?

   I must admit that I've never queried with that number of structures.
   My typical use case is to have a large database (10^5-10^6 compounds)
   and query that with a few (~10) structures. The code hasn't really
   been written to deal with giant query sets. That is doable, but it
   would require some reworking. Probably the best bet would be to
   support loading the queries from a database as well; that way you
   wouldn't have to reprocess the queries every time and could pretty
   easily handle the "only loading a few at a time" problem.

   It's an interesting thing to think about.

   -greg


# $Id: CreateDb.py 665 2008-05-15 04:33:40Z glandrum $
#
#  Copyright (c) 2007, Novartis Institutes for BioMedical Research Inc.
#  All rights reserved.
# 
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: 
#
# * Redistributions of source code must retain the above copyright 
#   notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above
#   copyright notice, this list of conditions and the following 
#   disclaimer in the documentation and/or 

Re: [Rdkit-discuss] H-bond Acceptor problem

2008-10-28 Thread Robert DeLisle
I agree with Nik, for an additional 2 pence.  In fact, while reading Greg's
original note, my thoughts were essentially identical to Nik's comments.

-Kirk



On Tue, Oct 28, 2008 at 2:40 AM, nikolaus.sti...@novartis.com wrote:


 Hi Greg,

 maybe some comments on your suggestions.

  1) Should the renaming mentioned above (i.e. the NumHAcceptor and
  NumHDonor descriptors start returning the official Lipinski values
  and the existing functions are renamed to NumHAcceptorAlt and
  NumHDonorAlt) be done?

 Personally, I would guess that most people would not expect to receive an
 N/O count if they are asking for H-donors and acceptors. Hence, I would
 probably use a different naming convention that includes the Lipinski
 specification (e.g. LipNumHAcc or similar). That way people will not get
 confused by very high counts for those values.

  2) Is the above SMARTS reasonable for the more detailed HAcceptor
 definition?

 As you say - they are very basic but to me they look reasonable. If you
 actually want to tune them at a low level then I would probably change the F
 definition to fluorines attached to aromatic rings only (I know there are a
 lot of papers out there that discuss this issue) but that's only me and I
 would guess that over time people should fine-tune these definitions to
 their own liking anyway.

 My 2 pence
 Nik




 From: Greg Landrum greg.land...@gmail.com
 Date: 28.10.2008 06:55
 To: rdkit-discuss@lists.sourceforge.net
 Subject: Re: [Rdkit-discuss] H-bond Acceptor problem




 I wanted to make one more post on this topic, ask a couple questions
 (at the bottom of the post), and give people a few days to comment
 before I regenerate the regression test data and commit a change for
 this bug.

 On Wed, Oct 15, 2008 at 8:19 PM, Hans Purkey hans.pur...@gmail.com
 wrote:
  If the intention is to follow Lipinski's definitions of Hbond acceptors,
  then it should be a simple N+O count (look back at the original paper
  and that is how he defined it for simplicity).

 For those who are coming to this late, this is the NOCount()
 descriptor, which is already present in the RDKit.

  However, if the descriptor is intended to match a more
 intuitive/realistic
  definition of HBA, then N-H shouldn't be a part of it.

 I don't think I agree with this. There are plenty of cases of
 nitrogens with attached Hs that act as H-bond acceptors (I did a CCD
 search yesterday to be sure), but that's a side topic.

 Back to the main topic: since these descriptors are all defined in a
 module named Lipinski, and since this is all qualitative anyway, I'd
 propose the following change:
 The existing NumHDonors and NumHAcceptors (with fixes, discussed
 below) be renamed to NumHDonorsAlt and NumHAcceptorsAlt and NOCount
 and NHOHCount be aliased to NumHAcceptors and NumHDonors. I'd then
 deprecate NOCount and NHOHCount (they will generate warnings when used
 in the next release and then be completely removed in the release
 after that).

 For the purposes of fixing the more complex HAcceptor descriptor I
 propose the following SMARTS:

 HAcceptorSmarts = Chem.MolFromSmarts('[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),\
 $([O,S;H0;v2]),$([O,S;-]),\
 $([N;v3;!$(n-...@[o,N,P,S])]),\
 $([nH0,o,s;+0]),\
 $([F;!$(F-*-F)])]')d

 There are two changes here: the third line and the last one.
 The third line includes nitrogens that have three neighbors and that
 are not connected to another atom that has a non-ring double bond to
 O, N, P, or S.
 The last line includes Fs that are not connected to another atom that
 has more than one F attached (to exclude CF3 and CF2).

 I realize these are not highly tuned, very detailed definitions like
 those in the fdef file discussed elsewhere on this thread, but are
 they acceptable for a qualitative descriptor?

 So, the two questions:
 1) Should the renaming mentioned above (i.e. the NumHAcceptor and
 NumHDonor descriptors start returning the official Lipinski values
 and the existing functions are renamed to NumHAcceptorAlt and
 NumHDonorAlt) be done?
 2) Is the above SMARTS reasonable for the more detailed HAcceptor
 definition?

 Thanks for any feedback,
 -greg


[Rdkit-discuss] RDKit to PyMol

2008-10-18 Thread Robert DeLisle
Greg,

I notice within the Python API that RDKit has the ability to communicate
with PyMol.  I have not, however, been able to find an example and haven't
quite figured it out on my own.  Could you provide an example of opening a
file in PyMol through RDKit, please?

-Kirk


Re: [Rdkit-discuss] H-bond Acceptor problem

2008-10-15 Thread Robert DeLisle
Good point, Hans.

I see that within the available descriptors there are NHOHCount and NOCount,
which I assume are equivalent to Lipinski's Donors and Acceptors.  Also
there are NumHAcceptors and NumHDonors which I would expect to differentiate
themselves from the Lipinski versions in some way.
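
To see how the two pairs of descriptors differ on a given structure, something like the following could be used. It is only a sketch: lipinski_vs_detailed is a made-up helper name, the four descriptor functions are the ones named above from the Lipinski module, and the values returned by NumHAcceptors and NumHDonors will depend on the definitions in whichever RDKit version is installed:

```python
def lipinski_vs_detailed(smiles):
    """Return the simple Lipinski counts and the SMARTS-based counts for one SMILES."""
    # Imported inside the function so this file loads even where RDKit is absent.
    from rdkit import Chem
    from rdkit.Chem import Lipinski

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    return {
        "NOCount": Lipinski.NOCount(mol),              # simple N + O atom count
        "NHOHCount": Lipinski.NHOHCount(mol),          # simple NH / OH count
        "NumHAcceptors": Lipinski.NumHAcceptors(mol),  # SMARTS-based definition
        "NumHDonors": Lipinski.NumHDonors(mol),        # SMARTS-based definition
    }
```

Running it on the amide from Greg's message below, lipinski_vs_detailed('CNC(=O)c1c[nH]cc1'), puts the two kinds of count side by side for that structure.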

-Kirk




On Wed, Oct 15, 2008 at 1:19 PM, Hans Purkey hans.pur...@gmail.com wrote:

 If the intention is to follow Lipinski's definitions of Hbond acceptors,
 then it should be a simple N+O count (look back at the original paper and
 that is how he defined it for simplicity).

 However, if the descriptor is intended to match a more intuitive/realistic
 definition of HBA, then N-H shouldn't be a part of it.

 Hans


 On Oct 15, 2008, at 11:50 AM, Greg Landrum wrote:

  [heh, worse than sending a message without an attachment is hitting
 send before the message is done and sending a message without text...
 sorry]

 On Wed, Oct 15, 2008 at 7:59 PM, Robert DeLisle rkdeli...@gmail.com
 wrote:


 As you know, I've been working with descriptors in RDKit, and I think I've
 found a bug in the calculation of H-bond Acceptors.  Attached is an example
 structure, N-methyl-1H-indole-6-carboxamide.  When I calculate NumHAcceptors
 for this structure, I get 3.  I've looked at numerous other structures and it
 seems that nitrogens are always counted.  I went into the code and found the
 definitions used for HAcceptors:


 Here's a simpler case showing the same behavior:
 [15]  m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')

 [16]  Lipinski.NumHAcceptors(m2)
 Out[16]: 3

 so that confirms the wrong count


 $([O,S;H1;v2]-[!$(*=[O,N,P,S])])
 $([O,S;H0;v2])
 $([O,S;-])
 $([Nv3;H1,H2]-[!$(*=[O,N,P,S])])
 $([N;v3;H0])
 $([n,o,s;+0])
 F

 Unless I'm misinterpreting the SMARTS (a very good possibility), both NH
 groups are being counted as an acceptor due to matching
 $([Nv3;H1,H2]-[!$(*=[O,N,P,S])]), but shouldn't the amide NH be excluded
 according to this same definition?


 [20] 
 m2.GetSubstructMatches(Chem.MolFromSmarts('[$([Nv3;H1,H2]-[!$(*=[O,N,P,S])])]'))
 Out[20]: ((1,),)

 Only matches one nitrogen... the amide nitrogen. The aromatic N
 matches the second but last definition:
 [29]  m2.GetSubstructMatches(Chem.MolFromSmarts('[$([n,o,s;+0])]'))
 Out[29]: ((6,),)

 The problem is that the first definition matches an N that is single
 bonded to an atom that isn't doubly bonded to O,N,P, or S. It does not
 exclude Ns that are single bonded to an atom that is doubly bonded to
 O,N,P, or S. So your amide with a secondary N matches. The problem
 isn't the matcher, it's the definition.

 Is that clear?

 I agree that this is a bug in the definition and will fix it. Would
 you mind entering the bug at sf.net or should I do it?

 -greg






Re: [Rdkit-discuss] RDKit Descriptors

2008-09-18 Thread Robert DeLisle
Greg,

Thank you for the response.

I was able to get PEOE_VSA1 through PEOE_VSA14, SMR_VSA1 through SMR_VSA10,
and EState_VSA1 through EState_VSA11 working.  Are these the correct limits
on the vector components?

I was unable, however, to get Slogp_VSA or VSA_EState working with any
integer suffix between 1 and 10.

I've also done a correlation analysis on all the descriptors that I've
gotten working.  After computing descriptors for some 24,000 compounds I
removed those with less than 10% variance and limited correlations between
variables to a maximum of 0.85 (using KNIME).  I'm happy to send a list of
the resulting descriptors or a correlation matrix if you or anyone else is
interested.
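
For anyone wanting to try the correlation step outside KNIME, a greedy
filter along the following lines is one possibility. This is only a sketch:
the function names, the keep-the-first-seen rule, and the sample data are
assumptions, and KNIME's own correlation filter may choose which columns to
drop differently.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column has no defined correlation
    return sxy / sqrt(sxx * syy)


def filter_correlated(columns, threshold=0.85):
    """Keep a column only if |r| <= threshold against every kept column."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical descriptor table: "b" is nearly a multiple of "a".
descriptors = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
print(filter_correlated(descriptors))
```

Because the filter is greedy, the result depends on column order: the first
member of a correlated pair is the one that survives.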



On Wed, Sep 17, 2008 at 11:36 PM, Greg Landrum greg.land...@gmail.com wrote:

 Dear Kirk,

 On Thu, Sep 18, 2008 at 12:58 AM, Robert DeLisle rkdeli...@gmail.com
 wrote:
  I've finally found time to start using RDKit and started with descriptor
  calculation.  Following the examples on the wiki
  (http://code.google.com/p/rdkit/wiki/DescriptorsInTheRDKit), I get a
  KeyError any time I attempt to obtain HeavyAtomCount, RingCount,

 HeavyAtomCount and RingCount were introduced after the May release, so
 they're in the subversion version of the code. They will be in the Q3
 release (which will happen sometime in the next couple of weeks,
 hopefully).

  PEOE_VSA,
  SMR_VSA, Slogp_VSA, EState_VSA, and VSA_EState.

 The various X_VSA descriptors are vector-valued and you access them by
 element, so you could ask for PEOE_VSA4 or Slogp_VSA10.

  (BTW, what is the
  difference between the last two VSA descriptors?)

 The standard VSA descriptors sum atomic VSA contributions into bins
 determined by the other property. So, for example, SMR_VSA sums atomic
 contributions to the VSA into bins determined by atomic contributions to
 the SMR. EState_VSA is the same, except that the bins are determined by
 atomic EState values. VSA_EState is reversed: atomic EState values are put
 into bins determined by the VSA contributions.
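
The binning scheme described above can be sketched in a few lines of plain
Python. This is an illustration only, not RDKit's implementation: the
binned_sum helper, the bin edges, and the per-atom numbers are all made up
for the example.

```python
from bisect import bisect_right


def binned_sum(values, keys, edges):
    """Sum `values` into bins selected by `keys`.

    Bin 0 holds keys < edges[0], bin i holds edges[i-1] <= key < edges[i],
    and the last bin holds keys >= edges[-1].
    """
    bins = [0.0] * (len(edges) + 1)
    for value, key in zip(values, keys):
        bins[bisect_right(edges, key)] += value
    return bins


# Hypothetical per-atom contributions for a four-atom molecule.
vsa = [10.0, 5.0, 8.0, 12.0]
estate = [1.5, -0.5, 3.0, 0.2]

# EState_VSA-style: VSA summed into bins chosen by EState value.
estate_vsa = binned_sum(vsa, estate, edges=[0.0, 1.0, 2.0])

# VSA_EState-style: EState summed into bins chosen by VSA contribution.
vsa_estate = binned_sum(estate, vsa, edges=[6.0, 11.0])

print(estate_vsa)
print(vsa_estate)
```

Swapping the first two arguments is the whole difference between the two
descriptor families: one list supplies what is summed, the other decides
where each atom's contribution lands.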

 Best Regards,
 -greg



[Rdkit-discuss] RDKit Descriptors

2008-09-17 Thread Robert DeLisle
I've finally found time to start using RDKit and started with descriptor
calculation.  Following the examples on the wiki (
http://code.google.com/p/rdkit/wiki/DescriptorsInTheRDKit), I get a KeyError
any time I attempt to obtain HeavyAtomCount, RingCount, PEOE_VSA, SMR_VSA,
Slogp_VSA, EState_VSA, and VSA_EState.  (BTW, what is the difference between
the last two VSA descriptors?)

-Kirk DeLisle