Re: [Rdkit-discuss] compile py wrappers with vcpkg boost 1.72
I have now figured out that the problem is with statically built Boost libraries. It works fine with dynamically built Boost libraries.

Why does the Python wrapping not work with statically built Boost libraries, and is it possible to make it work with static libraries?

Regards
Rasmus

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
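A likely explanation, which nobody in this thread confirms: Boost.Python keeps its type-converter registry inside the Boost.Python library itself. When that library is shared, every extension module sees the same registry. When it is linked statically, each extension module carries a private copy, so a class registered by one module (for example Mol) is unknown to a function exposed by another (for example AddHs). The toy model below only illustrates that idea in plain Python. The class names `ConverterRegistry` and `ExtensionModule` are hypothetical and are not part of Boost.Python or RDKit.

```python
class ConverterRegistry:
    """Toy stand-in for a registry of known C++ <-> Python type converters."""

    def __init__(self):
        self._known_types = set()

    def register(self, type_name):
        self._known_types.add(type_name)

    def can_convert(self, type_name):
        return type_name in self._known_types


class ExtensionModule:
    """Toy stand-in for one compiled extension module (hypothetical)."""

    def __init__(self, name, registry):
        self.name = name
        self.registry = registry

    def expose_class(self, type_name):
        # Exposing a class registers its converter in this module's registry.
        self.registry.register(type_name)

    def call(self, func_name, arg_type):
        # A wrapped function can only accept types its registry knows about.
        if not self.registry.can_convert(arg_type):
            raise TypeError(
                f"Python argument types in {self.name}.{func_name}({arg_type}) "
                "did not match C++ signature"
            )
        return "ok"


def load_modules(shared_library):
    """Return (rdchem, rdmolops) toy modules for a shared or static build."""
    common = ConverterRegistry()
    # Shared build: one registry for all modules. Static build: one each.
    rdchem = ExtensionModule("rdchem", common if shared_library else ConverterRegistry())
    rdmolops = ExtensionModule("rdmolops", common if shared_library else ConverterRegistry())
    rdchem.expose_class("Mol")  # Mol is exposed by rdchem only
    return rdchem, rdmolops


if __name__ == "__main__":
    for shared in (True, False):
        _, rdmolops = load_modules(shared)
        try:
            print(shared, rdmolops.call("AddHs", "Mol"))
        except TypeError as err:
            print(shared, err)
```

If this is indeed the cause, it would also explain why methods on the Mol object itself (GetNumAtoms, Debug) work while functions from other modules fail.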
[Rdkit-discuss] compile py wrappers with vcpkg boost 1.72
Dear all,

I am trying to compile RDKit with Boost from vcpkg, which by default uses the latest version (Boost 1.72), and to link against Python 3.7.

All non-Python tests pass fine, but most Python tests keep failing for me with something like the "did not match C++ signature" error. Here is a minimal example that fails:

In [7]: mol = Chem.MolFromSmiles('')
In [8]: mol.GetNumAtoms()
Out[8]: 4
In [9]: mol.Debug()
Atoms:
    0 6 C chg: 0 deg: 1 exp: 1 imp: 3 hyb: 4 arom?: 0 chi: 0
    1 6 C chg: 0 deg: 2 exp: 2 imp: 2 hyb: 4 arom?: 0 chi: 0
    2 6 C chg: 0 deg: 2 exp: 2 imp: 2 hyb: 4 arom?: 0 chi: 0
    3 6 C chg: 0 deg: 1 exp: 1 imp: 3 hyb: 4 arom?: 0 chi: 0
Bonds:
    0 0->1 order: 1 conj?: 0 aromatic?: 0
    1 1->2 order: 1 conj?: 0 aromatic?: 0
    2 2->3 order: 1 conj?: 0 aromatic?: 0
In [10]: molh = Chem.AddHs(mol)
---
ArgumentError Traceback (most recent call last)
in
> 1 molh = Chem.AddHs(mol)

ArgumentError: Python argument types in
    rdkit.Chem.rdmolops.AddHs(Mol)
did not match C++ signature:
    AddHs(RDKit::ROMol mol, bool explicitOnly=False, bool addCoords=False, boost::python::api::object onlyOnAtoms=None, bool addResidueInfo=False)

Do you have any idea why that is? My only suspicion right now is that it happens because the build links against the numpy that CMake found (which is in a conda environment I am using for testing). If that numpy uses another Boost version, could that be the cause?

My cmake call:

cmake -DPy_ENABLE_SHARED=1 \
  -DPYTHON_EXECUTABLE=/home/hafnium/miniconda3/envs/py37/bin/python3 \
  -DRDK_BUILD_COORDGEN_SUPPORT=OFF \
  -DRDK_INSTALL_INTREE=OFF \
  -DRDK_INSTALL_STATIC_LIBS=OFF \
  -DRDK_BUILD_CPP_TESTS=ON \
  -DRDK_USE_BOOST_IOSTREAMS=OFF \
  -DBoost_NO_SYSTEM_PATHS=ON \
  -DCMAKE_INSTALL_PREFIX=/home/hafnium/Hflabs-sync/HFgit/External/rdkit/rdkit-install \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_TOOLCHAIN_FILE=/home/hafnium/Hflabs-sync/vcpkg/scripts/buildsystems/vcpkg.cmake \
  ..

The following tests FAILED:
     17 - pyAlignment (Failed)
     21 - pyForceFieldConstraints (Failed)
     41 - pyDepictor (Failed)
     62 - pyChemReactions (Failed)
     63 - pyChemReactionEnumerations (Failed)
     64 - pyChemReactionSanitize (Failed)
     70 - pyFilterCatalog (Failed)
     72 - pyFragCatalog (Failed)
     87 - pyMolDescriptors (Child aborted)
     88 - pyMolDescriptors3D (Failed)
     93 - pyTestGenerator (Failed)
     94 - pyTestMHFP (Failed)
     96 - pyPartialCharges (Failed)
     98 - pyMolTransforms (Failed)
    103 - pyForceFieldHelpers (Failed)
    105 - pyDistGeomHelpers (Failed)
    107 - pyMolAlign (Failed)
    109 - pyChemicalFeatures (Failed)
    111 - pyShapeHelpers (Failed)
    113 - pyMolCatalog (Failed)
    117 - pyMolDraw2D (Failed)
    119 - pyFMCS (Failed)
    123 - pyMolHash (Failed)
    125 - pyMMPA (Failed)
    127 - pyReducedGraphs (Failed)
    130 - pySubstructLibrary (Failed)
    132 - pyRGroupDecomposition (Failed)
    134 - pyMolInterchange (Failed)
    137 - pyGraphMolWrap (Failed)
    138 - pyTestConformerWrap (Failed)
    139 - pyTestTrajectory (Failed)
    140 - pyTestSGroups (Failed)
    142 - pyTestPropertyLists (Failed)
    151 - pyMolStandardize (Failed)
    153 - pyScaffoldNetwork (Failed)
    157 - pyMatCalc (Failed)
    160 - pySimDivPickers (Failed)
    161 - pyRanker (Failed)
    163 - pyFeatures (Failed)
    164 - pythonTestDbCLI (Failed)
    165 - pythonTestDirML (Failed)
    166 - pythonTestDirDataStructs (Failed)
    168 - pythonTestDirSimDivFilters (Failed)
    169 - pythonTestDirVLib (Failed)
    170 - pythonTestDirChem (Failed)
Re: [Rdkit-discuss] compiling C++examples from "Release_2019_09_2"
Found a solution, but I guess the CMakeLists.txt should be updated to work with the current state of the project.

Adding "maeparser coordgen RingDecomposerLib DataStructs" to the end of the COMPONENTS list adds the correct .so files to RDKIT_LIBRARIES, and then it compiles.

/Rasmus
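The fix above was found by reading the linker warnings by hand. As a rough sketch of how the same step could be automated, the helper below extracts the component names to append to COMPONENTS from linker output. It assumes the warnings have the form quoted in this thread ("libRDKit<name>.so.N, needed by ..., not found") and that the component name is the part after "libRDKit". The function name `missing_components` is hypothetical.

```python
import re
from typing import List

# Matches warnings such as:
#   /bin/ld: warning: libRDKitmaeparser.so.1, needed by /path/libRDKitFileParsers.so, not found
_MISSING_LIB = re.compile(
    r"warning: libRDKit(?P<component>\w+)\.so(?:\.\d+)*, needed by \S+, not found"
)


def missing_components(linker_output: str) -> List[str]:
    """Return RDKit component names reported as missing, in first-seen order."""
    components: List[str] = []
    for match in _MISSING_LIB.finditer(linker_output):
        name = match.group("component")
        if name not in components:  # skip repeated warnings for the same library
            components.append(name)
    return components


if __name__ == "__main__":
    sample = (
        "/bin/ld: warning: libRDKitmaeparser.so.1, needed by "
        "/home/termo/HFlabs/rdkit/lib/libRDKitFileParsers.so, not found"
    )
    print(" ".join(missing_components(sample)))
```

The printed names can then be appended to the COMPONENTS list in CMakeLists.txt, as described in the message above.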
Re: [Rdkit-discuss] compiling C++examples from "Release_2019_09_2"
Hi Dan,

Yes, they are in the $RDBASE/lib directory, and LD_LIBRARY_PATH is set to point to this as well. My first guess was also that LD_LIBRARY_PATH was not set correctly.

mkdir -p build && cd build
export RDBASE=/home/termo/HFlabs/rdkit
export LD_LIBRARY_PATH=$RDBASE/lib:/home/termo/miniconda3/envs/rdkit-dev/include

echo ""
echo "LD_LIBRARY_PATH:" $LD_LIBRARY_PATH
ls $RDBASE/lib/libRDKitmaeparser.so.1
echo ""

cmake -DBOOST_ROOT="/home/termo/miniconda3/envs/rdkit-dev" ..
make VERBOSE=1
cd ..

This gives the attached output (the same as before, but now also showing that the file exists and the value of LD_LIBRARY_PATH).

On Fri, Jan 3, 2020 at 12:36 AM Dan Nealschneider <dan.nealschnei...@schrodinger.com> wrote:

> Looks like you're missing libRDKitmaeparser.so.1, libRDKitcoordgen.so.1, libRDKitRingDecomposerLib.so.1, and libRDKitDataStructs.so.1. Are they in a directory pointed to by your LD_LIBRARY_PATH?
>
> *dan nealschneider* | senior developer

LD_LIBRARY_PATH: /home/termo/HFlabs/rdkit/lib:/home/termo/miniconda3/envs/rdkit-dev/include
/home/termo/HFlabs/rdkit/lib/libRDKitmaeparser.so.1

-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-cc
-- Check for working C compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-c++
-- Check for working CXX compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Boost: /home/termo/miniconda3/envs/rdkit-dev/include (found version "1.67.0") found components: iostreams filesystem system regex
Looking for RDKit component ChemReactions
MyRDKit_LIBRARY_ChemReactions : /home/termo/HFlabs/rdkit/lib/libRDKitChemReactions.so
Looking for RDKit component FileParsers
MyRDKit_LIBRARY_FileParsers : /home/termo/HFlabs/rdkit/lib/libRDKitFileParsers.so
Looking for RDKit component SmilesParse
MyRDKit_LIBRARY_SmilesParse : /home/termo/HFlabs/rdkit/lib/libRDKitSmilesParse.so
Looking for RDKit component Depictor
MyRDKit_LIBRARY_Depictor : /home/termo/HFlabs/rdkit/lib/libRDKitDepictor.so
Looking for RDKit component RDGeometryLib
MyRDKit_LIBRARY_RDGeometryLib : /home/termo/HFlabs/rdkit/lib/libRDKitRDGeometryLib.so
Looking for RDKit component RDGeneral
MyRDKit_LIBRARY_RDGeneral : /home/termo/HFlabs/rdkit/lib/libRDKitRDGeneral.so
Looking for RDKit component SubstructMatch
MyRDKit_LIBRARY_SubstructMatch : /home/termo/HFlabs/rdkit/lib/libRDKitSubstructMatch.so
Looking for RDKit component Subgraphs
MyRDKit_LIBRARY_Subgraphs : /home/termo/HFlabs/rdkit/lib/libRDKitSubgraphs.so
Looking for RDKit component MolDraw2D
MyRDKit_LIBRARY_MolDraw2D : /home/termo/HFlabs/rdkit/lib/libRDKitMolDraw2D.so
Looking for RDKit component GraphMol
MyRDKit_LIBRARY_GraphMol : /home/termo/HFlabs/rdkit/lib/libRDKitGraphMol.so
Looking for RDKit component DistGeometry
MyRDKit_LIBRARY_DistGeometry : /home/termo/HFlabs/rdkit/lib/libRDKitDistGeometry.so
Looking for RDKit component
[Rdkit-discuss] compiling C++examples from "Release_2019_09_2"
I have compiled RDKit as suggested in the docs, using a conda environment for C++ and Boost.

I would like to move from Python to C++ with my RDKit work, and I thought I would start with the C++ examples in Docs/Book, but I'm having some problems getting the minimal C++ examples to link with the current CMake files there.

Attached is the output from the make command, where I have set CMakeLists.txt to build only "example1.cpp".

I guess the problem is this warning:

"/bin/ld: warning: libRDKitmaeparser.so.1, needed by /home/termo/HFlabs/rdkit/lib/libRDKitFileParsers.so, not found"

I have set RDBASE and LD_LIBRARY_PATH, and as far as I can see, with the "-Wl,-rpath,/home/termo/HFlabs/rdkit/lib" part in the linking command it should find the needed .so file (which is there).

Any idea why it fails to find the .so files?

/Rasmus

-- The C compiler identification is GNU 7.3.0
-- The CXX compiler identification is GNU 7.3.0
-- Check for working C compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-cc
-- Check for working C compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-c++
-- Check for working CXX compiler: /home/termo/miniconda3/envs/rdkit-dev/bin/x86_64-conda_cos6-linux-gnu-c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Boost: /home/termo/miniconda3/envs/rdkit-dev/include (found version "1.67.0") found components: iostreams filesystem system regex
Looking for RDKit component ChemReactions
MyRDKit_LIBRARY_ChemReactions : /home/termo/HFlabs/rdkit/lib/libRDKitChemReactions.so
Looking for RDKit component FileParsers
MyRDKit_LIBRARY_FileParsers : /home/termo/HFlabs/rdkit/lib/libRDKitFileParsers.so
Looking for RDKit component SmilesParse
MyRDKit_LIBRARY_SmilesParse : /home/termo/HFlabs/rdkit/lib/libRDKitSmilesParse.so
Looking for RDKit component Depictor
MyRDKit_LIBRARY_Depictor : /home/termo/HFlabs/rdkit/lib/libRDKitDepictor.so
Looking for RDKit component RDGeometryLib
MyRDKit_LIBRARY_RDGeometryLib : /home/termo/HFlabs/rdkit/lib/libRDKitRDGeometryLib.so
Looking for RDKit component RDGeneral
MyRDKit_LIBRARY_RDGeneral : /home/termo/HFlabs/rdkit/lib/libRDKitRDGeneral.so
Looking for RDKit component SubstructMatch
MyRDKit_LIBRARY_SubstructMatch : /home/termo/HFlabs/rdkit/lib/libRDKitSubstructMatch.so
Looking for RDKit component Subgraphs
MyRDKit_LIBRARY_Subgraphs : /home/termo/HFlabs/rdkit/lib/libRDKitSubgraphs.so
Looking for RDKit component MolDraw2D
MyRDKit_LIBRARY_MolDraw2D : /home/termo/HFlabs/rdkit/lib/libRDKitMolDraw2D.so
Looking for RDKit component GraphMol
MyRDKit_LIBRARY_GraphMol : /home/termo/HFlabs/rdkit/lib/libRDKitGraphMol.so
Looking for RDKit component DistGeometry
MyRDKit_LIBRARY_DistGeometry : /home/termo/HFlabs/rdkit/lib/libRDKitDistGeometry.so
Looking for RDKit component DistGeomHelpers
MyRDKit_LIBRARY_DistGeomHelpers : /home/termo/HFlabs/rdkit/lib/libRDKitDistGeomHelpers.so
Looking for RDKit component MolAlign
MyRDKit_LIBRARY_MolAlign : /home/termo/HFlabs/rdkit/lib/libRDKitMolAlign.so
Looking for RDKit component Optimizer
MyRDKit_LIBRARY_Optimizer : /home/termo/HFlabs/rdkit/lib/libRDKitOptimizer.so
Looking for RDKit component ForceField
MyRDKit_LIBRARY_ForceField : /home/termo/HFlabs/rdkit/lib/libRDKitForceField.so
Looking for RDKit component ForceFieldHelpers
MyRDKit_LIBRARY_ForceFieldHelpers : /home/termo/HFlabs/rdkit/lib/libRDKitForceFieldHelpers.so
Looking for RDKit component Alignment
MyRDKit_LIBRARY_Alignment : /home/termo/HFlabs/rdkit/lib/libRDKitAlignment.so
Looking for RDKit component ForceField
MyRDKit_LIBRARY_ForceField : /home/termo/HFlabs/rdkit/lib/libRDKitForceField.so
Looking for RDKit component MolTransforms
MyRDKit_LIBRARY_MolTransforms : /home/termo/HFlabs/rdkit/lib/libRDKitMolTransforms.so
Looking for RDKit component EigenSolvers
MyRDKit_LIBRARY_EigenSolvers : /home/termo/HFlabs/rdkit/lib/libRDKitEigenSolvers.so
RDKIT_INCLUDE_DIR : /home/termo/HFlabs/rdkit//Code
RDKIT_LIBRARIES :
Re: [Rdkit-discuss] handling of stereo information from mol files when not sanitizing
Hi Greg,

Thanks for your gist. I guess the line:

    nbrs = [(x.GetOtherAtomIdx(1),x.GetIdx()) for x in atom.GetBonds()]

should be:

    nbrs = [(x.GetOtherAtomIdx(atom.GetIdx()),x.GetIdx()) for x in atom.GetBonds()]

On Tue, Dec 3, 2019 at 5:38 PM Greg Landrum wrote:

> What's going on here is that the RDKit defines stereochemistry based on the ordering of bonds, not atom indices.
> This has come up on the list multiple times; a relatively recent instance is here:
> https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08955.html
>
> Here's a gist that I have lying around that may help here: [1]
> https://gist.github.com/greglandrum/9f0e068e53171174b6797348eca64b3e
>
> -greg
> [1] Now if only I could find *why* I have that gist lying around...
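Greg's point that the chiral tag is defined relative to an ordering of the neighbours (in RDKit, the order of the bonds) can be illustrated without RDKit. A clockwise/counterclockwise label only means something together with the order in which the neighbours are listed: swapping two neighbours in that list (an odd permutation) describes the mirror image unless the label is flipped as well. The sketch below is a toy model, not RDKit code. The function names and the "CW"/"CCW" strings are made up for illustration.

```python
from typing import Hashable, Sequence


def permutation_parity(reference: Sequence[Hashable], observed: Sequence[Hashable]) -> int:
    """Return 0 if `observed` is an even permutation of `reference`, 1 if odd."""
    position = {item: i for i, item in enumerate(reference)}
    perm = [position[item] for item in observed]
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def equivalent_tag(tag: str, reference: Sequence[Hashable], observed: Sequence[Hashable]) -> str:
    """Tag describing the same geometry when neighbours are listed in `observed` order."""
    if permutation_parity(reference, observed) == 0:
        return tag  # even reordering: label unchanged
    return "CCW" if tag == "CW" else "CW"  # odd reordering: label flips


if __name__ == "__main__":
    neighbours = ["N", "C_methyl", "C_acid", "H"]
    swapped = ["C_methyl", "N", "C_acid", "H"]  # two neighbours exchanged
    print(equivalent_tag("CW", neighbours, swapped))
```

This is why copying a parity value straight onto the chiral tag can give different SMILES for files that differ only in bond or atom order: the same tag is being interpreted against a different neighbour ordering.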
Re: [Rdkit-discuss] handling of stereo information from mol files when not sanitizing
Hi Paolo,

Thank you for the heads-up that removeHs is not honored when not sanitizing (and that RemoveHs has to be called separately to solve that issue here).

I have now tried the same molecule with the order of the atoms changed as well (attached as Ran1_neworder.sdf), and here I still get a different isomeric SMILES, even though the chiral tag is the same:

for f in ['Ran1.sdf', 'Ran2.sdf', 'Ran1_neworder.sdf']:
    m = Chem.MolFromMolFile(f, sanitize=False)
    m = Chem.RemoveHs(m, sanitize=False)
    print(Chem.MolToSmiles(set_correct_Chiral_flags(m), isomericSmiles=True))

C[C@@H](N)C(=O)O
C[C@@H](N)C(=O)O
C[C@H](N)C(=O)O

On Tue, Dec 3, 2019 at 2:58 PM Paolo Tosco wrote:

> Hi Rasmus,
>
> the problem is that, as stated in the rdmolfiles.MolFromMolFile() docs, the removeHs option is only honored when sanitize is True.
>
> So to obtain sensible results without sanitizing you should rather do something like:
>
> m1 = Chem.MolFromMolFile('Ran1.sdf', sanitize=False)
> m1 = Chem.RemoveHs(m1, sanitize=False)
> print(Chem.MolToSmiles(set_correct_Chiral_flags(m1), isomericSmiles=True))
> m2 = Chem.MolFromMolFile('Ran2.sdf', sanitize=False)
> m2 = Chem.RemoveHs(m2, sanitize=False)
> print(Chem.MolToSmiles(set_correct_Chiral_flags(m2), isomericSmiles=True))
>
> You may check the individual sanitization operations here:
>
> https://www.rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=rdmolops%20sanitizeflags#rdkit.Chem.rdmolops.SanitizeFlags
>
> Cheers,
> p.

Ran1_neworder.sdf Description: StarMath document
[Rdkit-discuss] handling of stereo information from mol files when not sanitizing
Hi all,

I would like to avoid sanitizing the SDF files, as the information in these files should be seen as the ground truth.

However, I have some problems figuring out how to read and set chiral information from the file and have RDKit behave the same way every time. Attached are two SDF files with no 3D information and only stereo information in the atoms section, for R-alanine. The only difference, as far as I can see, is the order of the lines of the bond information. Even so, I get two different SMILES back with isomeric information when not sanitizing.

Attached is also the minimal Python code, which for me at least outputs:

not setting chiral flags
CC(N)C(=O)O
CC(N)C(=O)O

setting chiral flags
[H]OC(=O)[C@]([H])(N([H])[H])C([H])([H])[H]
[H]OC(=O)[C@@]([H])(N([H])[H])C([H])([H])[H]

setting chiral flags and sanitize
C[C@@H](N)C(=O)O
C[C@@H](N)C(=O)O

Any ideas why this happens and how I can handle it strictly? Also, what exactly does the sanitizing do?

Regards
Rasmus

Ran2.sdf Description: Binary data
Ran1.sdf Description: Binary data

from rdkit import Chem

def set_correct_Chiral_flags(mol, debug=False):
    # Copy the parity read from the mol file atom block onto the chiral tag.
    for a in mol.GetAtoms():
        if a.HasProp("molParity"):
            try:
                parity = int(a.GetProp("molParity"))
            except ValueError:
                parity = None
            if parity and debug:
                print(a.GetSymbol(), a.GetIdx(), parity)
            if parity == 1:
                a.SetChiralTag(Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CW)
            elif parity == 2:
                a.SetChiralTag(Chem.rdchem.ChiralType.CHI_TETRAHEDRAL_CCW)
            elif parity == 3:
                a.SetChiralTag(Chem.rdchem.ChiralType.CHI_UNSPECIFIED)
    return mol

print('\nnot setting chiral flags and sanitizing')
print( Chem.MolToSmiles(Chem.MolFromMolFile('Ran1.sdf', removeHs=True, sanitize=True), isomericSmiles=True) )
print( Chem.MolToSmiles(Chem.MolFromMolFile('Ran2.sdf', removeHs=True, sanitize=True), isomericSmiles=True) )

print('\nsetting chiral flags')
print( Chem.MolToSmiles(set_correct_Chiral_flags(Chem.MolFromMolFile('Ran1.sdf', removeHs=True, sanitize=False)), isomericSmiles=True) )
print( Chem.MolToSmiles(set_correct_Chiral_flags(Chem.MolFromMolFile('Ran2.sdf', removeHs=True, sanitize=False)), isomericSmiles=True) )

print('\nsetting chiral flags and sanitize')
print( Chem.MolToSmiles(set_correct_Chiral_flags(Chem.MolFromMolFile('Ran1.sdf', removeHs=True, sanitize=True)), isomericSmiles=True) )
print( Chem.MolToSmiles(set_correct_Chiral_flags(Chem.MolFromMolFile('Ran2.sdf', removeHs=True, sanitize=True)), isomericSmiles=True) )
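The script above relies on the parity value from the atoms section of the mol file. As a small sketch of where that value comes from, the helper below reads the parity field directly from a V2000 atom line. It assumes the fixed-width V2000 layout, in which the three-character stereo parity field sits at 0-based columns 39 to 41 (after x, y, z, the atom symbol, the mass difference and the charge). The names `atom_parity` and `PARITY_MEANING` are hypothetical and not part of RDKit.

```python
# Meaning of the V2000 atom stereo parity values (assumed from the CTfile format).
PARITY_MEANING = {
    0: "not stereo",
    1: "odd",
    2: "even",
    3: "either or unmarked",
}


def atom_parity(atom_line: str) -> int:
    """Return the stereo parity field of one V2000 atom-block line.

    Assumes fixed-width columns: x, y, z (10 characters each), a space,
    a 3-character symbol, a 2-character mass difference, a 3-character
    charge, and then the 3-character parity field at columns 39-41.
    A missing or blank field is treated as 0.
    """
    field = atom_line[39:42].strip()
    return int(field) if field else 0


if __name__ == "__main__":
    # Example atom line with the parity field set to 1.
    line = "    0.0000    0.0000    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0"
    parity = atom_parity(line)
    print(parity, PARITY_MEANING[parity])
```

Reading the field this way only recovers the number stored in the file. How that number relates to a clockwise or counterclockwise tag still depends on the neighbour ordering, which is the issue discussed in the replies.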