Re: [Rdkit-discuss] Request for Assistance: Understanding InChI to Mol Conversion Issue in RDKit

2023-12-12 Thread Jan Holst Jensen
You can also cross-check with standard InChI to see if this is an RDKit 
issue or a more general InChI issue. To convert InChI strings (and 
optionally AuxInfo) to SDF format with the standard inchi-1 executable, 
put the InChI string and AuxInfo into a text file and convert it like this.


P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>*type test.txt*
InChI=1/Ca.2H
AuxInfo=1/0/N:1;2;3/rA:3Ca0H0H0/rB:;;/rC:;;;

P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>*inchi-1.exe 
/InChI2Struct /OutputSDF test.txt*

InChI version 1, Software v. 1.06 (inchi-1 executable)
Windows 32-bit Build (MS VS 2015) of Dec 18 2020 20:45:14

Opened log file 'test.txt.log'
Opened input file 'test.txt'
Opened output file 'test.txt.txt'
Opened problem file 'test.txt.prb'
The command line used:
"inchi-1.exe /InChI2Struct /OutputSDF test.txt"
Converting InChI(s) to structure(s) in MOL format
Output SDfile only without stereochemical information and atom coordinates
Input format: InChI (plain identifier)
Output format: SDfile only (without stereochemical info and atom 
coordinates)

Timeout per structure: 6 msec
Up to 1024 atoms per structure


Finished processing 1 structure: 0 errors, processing time 0:00:00.00

Elapsed walltime: 15 msec.

P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>type test.txt.txt
Structure #1.
  InChIV10

  3  0  0  0  0  0  0  0  0  0  1 V2000
    0.    0.    0. Ca  0  0  0 0 15  0  0  0  0
    0.    0.    0. H   0  0  0 0 15  0  0  0  0
    0.    0.    0. H   0  0  0 0 15  0  0  0  0
M  END


P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>
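
If you have many strings to check, the same steps can be scripted. A minimal sketch in Python, assuming the inchi-1 executable is on your PATH and accepts the Windows-style options used above (other platforms may use a different option prefix); the helper names are made up for illustration:

```python
import subprocess
import tempfile
from pathlib import Path


def write_inchi_input(path, inchi, auxinfo=None):
    """Write the InChI (and optional AuxInfo) on separate lines, as in test.txt."""
    lines = [inchi] if auxinfo is None else [inchi, auxinfo]
    Path(path).write_text("\n".join(lines) + "\n")


def inchi_to_sdf_command(input_name, executable="inchi-1.exe"):
    """Build the command line from the session above (Windows-style options)."""
    return [executable, "/InChI2Struct", "/OutputSDF", str(input_name)]


def convert(input_path, executable="inchi-1.exe"):
    """Run the converter in the input file's directory and return the SD output.

    Assumes the output is written to '<input>.txt', as the log above reports.
    """
    input_path = Path(input_path)
    subprocess.run(inchi_to_sdf_command(input_path.name, executable),
                   cwd=input_path.parent, check=True)
    return Path(str(input_path) + ".txt").read_text()


# Prepare the input file and show the command without running the executable.
with tempfile.TemporaryDirectory() as tmp:
    input_file = Path(tmp) / "test.txt"
    write_inchi_input(input_file, "InChI=1/Ca.2H",
                      "AuxInfo=1/0/N:1;2;3/rA:3Ca0H0H0/rB:;;/rC:;;;")
    written = input_file.read_text()
    command = inchi_to_sdf_command(input_file.name)

print(written, end="")
print(" ".join(command))
```

Call convert() on each input file once the executable is available; a string that also fails here is more likely an InChI problem than an RDKit problem.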

Cheers
-- Jan

On 2023-12-12 07:59, S Joshua Swamidass wrote:

Perhaps provide some examples where this failure happens.

Sent from Gmail Mobile


On Tue, Nov 28, 2023 at 7:35 PM 李大舟  wrote:

Dear RDKit Developers and Maintainers,

I hope this email finds you well. My name is Dr. Dazhou Li, and I
am a researcher working on the development of a tool for
extracting chemical compound structures recognized by OCR (Optical
Character Recognition) technology. I have been using the RDKit
library for a crucial step in this process, specifically the
rdkit.Chem.inchi.MolFromInchi() function, to convert InChI-format
strings into Mol format representations.

Firstly, I would like to express my gratitude for the excellent
work you have done in developing and maintaining the RDKit
library, which has been an invaluable resource in my research. The
library has consistently delivered high-quality results in various
aspects of chemical informatics, and I appreciate your dedication
to its development.

However, I have encountered a specific issue with the
rdkit.Chem.inchi.MolFromInchi() function that I hope you can help
me understand and resolve. When attempting to convert InChI-format
strings generated by my tool, some of them fail with an error
message reporting "NaN." Since the rdkit.Chem.inchi.MolFromInchi()
function calls C++ code, I am unable to directly inspect its
execution or source code to diagnose the issue.

My primary request is for assistance in understanding the internal
workings of the rdkit.Chem.inchi.MolFromInchi() function,
specifically the checking process or generation step that leads to
the "NaN" error when certain InChI-format strings are processed.
It is crucial for my research to determine at which point in the
execution of this function my generated InChI-formatted strings
are considered unreasonable, as this information will help me
refine my tool's output to be compatible with RDKit.

I understand that the RDKit library is a complex and comprehensive
toolkit, and I appreciate the complexity involved in diagnosing
such issues. However, any insights or guidance you can provide
regarding the problematic cases and the internal processes of the
rdkit.Chem.inchi.MolFromInchi() function would be immensely
valuable to me and would help me ensure the compatibility of my
tool with RDKit.

If possible, I would be grateful for access to relevant
documentation or insights into the specific error conditions that
may lead to the "NaN" result. Additionally, any suggestions or
best practices for generating InChI-format strings that are more
likely to be successfully processed by RDKit would be greatly
appreciated.

Thank you for your time and consideration. I look forward to your
response and hope that we can collaborate to resolve this issue
and enhance the compatibility of my tool with the RDKit library.

Please feel free to reach out to me if you require any additional
information or if there are specific details about my tool or the
InChI-format strings that would aid in diagnosing the issue.

Best regards,

Dr. Dazhou Li
Shenyang University of Chemical Technology

Re: [Rdkit-discuss] SDMolSupplier warning 2023.9.2

2023-12-12 Thread Jan Holst Jensen
> I could not figure out how Rdkit is guessing it as 2D structure, as 
there is no such information in SDF.



Ah, but there is, just a little hidden :-). The source-and-timestamp 
line of each molfile in the SDF contains that information. The line is 
the second line of the molfile and you can find the 2D/3D tag as the 
last two characters of that line. An example:


:MOLFILE_BEGINS:

  Mol2Comp0618061807*2D*

  1  0  0  0  0  0  0  0  0  0999 V2000
   10.2967   -1.5283    0.0000 Ar  0  0  0  0  0  0  0  0  0  0  0  0
M  END
:MOLFILE_ENDS:
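
You can pull the tag out yourself to see what each record in your SDF claims. A minimal sketch, assuming the header line follows the CTfile layout (2 characters of user initials, 8 of program name, 10 of date/time, then the 2-character dimension code, i.e. columns 21-22); molfile_dimension_tag is a made-up helper name, not an RDKit function:

```python
def molfile_dimension_tag(molblock):
    """Return the 2D/3D code from the second line of a molfile ('' if absent)."""
    lines = molblock.splitlines()
    if len(lines) < 2:
        return ""
    # Columns 21-22 of the source-and-timestamp line hold the dimension code.
    return lines[1][20:22].strip()


example = "\n".join([
    "",
    "  Mol2Comp06180618072D",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "   10.2967   -1.5283    0.0000 Ar  0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
])
tag = molfile_dimension_tag(example)
print(tag)
```

If the docking program writes "2D" there while the Z coordinates are non-zero, that mismatch is what the warning reports.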

Cheers
-- Jan

On 2023-12-13 03:39, Mandar Kulkarni wrote:

Hello,

I am using RDKit 2023.9.2's  SDMolSupplier to read docked SDF files 
(V2000 formats, suppl  = Chem.SDMolSupplier(sdf_file);) and getting a 
warning as:


Warning: molecule is tagged as 2D, but at least one Z coordinate is not zero. 
Marking the mol as 3D.

I could not figure out how Rdkit is guessing it as 2D structure, as 
there is no such information in SDF.
Is there any additional information I need to give SDMolSupplier so that it 
understands it's a 3D molecule?

I look forward to hearing your suggestions.
Thanks in advance,
Mandar Kulkarni


smime.p7s
Description: S/MIME Cryptographic Signature
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] mol_from_ctab doesn't preserve coordinates

2020-05-10 Thread Jan Holst Jensen

Hi Sharang (adding the list as I missed reply-all previously),

Glad to hear that RDKit does what it is supposed to.

I am unfortunately not familiar with how Datawarrior processes 
molecules, so I don't know if it does its own layout. To remove RDKit 
from the equation - if you can feed Datawarrior a molfile directly from 
disk, you can see if Datawarrior preserves the layout of the molfile or not.


Cheers
-- Jan

On 2020-05-09 16:32, Sharang Phatak wrote:

Hi Jan,

Thank you for your message. I did go back and check with mol_to_ctab 
and found out the coordinates are indeed identical. I am using 
Datawarrior to visualize these structures using Datawarrior's native 
SQL integration tool. The structures then are displayed a bit 
differently, flipped / rotated within this tool. Perhaps it's a 
different issue unrelated to RDKit. If you've experienced such issues 
and share what you did to overcome them, it would be much appreciated.


Cheers,
Sharang

On Thu, May 7, 2020 at 3:22 AM Jan Holst Jensen <j...@biochemfusion.com> wrote:


Hi Sharang,

Are you perhaps using a very old version of RDKit?

When I call mol_from_ctab() the way you do, it does preserve
coordinates for me.

select mol_to_ctab(mol_from_ctab('


  4  3  0  0  0  0  0  0  1  0999 V2000
    5.0000    5.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.0000    4.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.0000    3.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    4.0000    3.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  1  0  0  0
  2  3  1  0  0  0  0
  2  4  1  0  0  0  0
M  END
'::cstring, true));


     RDKit          3D

  4  3  0  0  0  0  0  0  0  0999 V2000
    5.0000    5.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.0000    4.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.0000    3.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    4.0000    3.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  1
  2  3  1  0
  2  4  1  0
M  END


If I leave out the second optional boolean parameter, which defaults to
false, the coordinates are re-generated by RDKit.

select mol_to_ctab(mol_from_ctab('


  4  3  0  0  0  0  0  0  1  0999 V2000
    5.0000    5.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.0000    4.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    6.0000    3.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    4.0000    3.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  1  0  0  0
  2  3  1  0  0  0  0
  2  4  1  0  0  0  0
M  END
'::cstring));


     RDKit          2D

  4  3  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.2990    0.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.5981   -0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.2990    2.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  6
  2  3  1  0
  2  4  1  0
M  END

This is on a fairly old RDKit 2016_09_4 on Postgres 9.6. Earlier
versions would ignore the second parameter - that was fixed around the
2016_09 release if I recall correctly.
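
To compare the ctab you put in with the one you get back, outside of any viewer, you can read the coordinates straight from the fixed-width atom lines. A minimal sketch in Python, assuming well-formed V2000 ctabs (three header lines, counts line, then one atom line per atom with x, y, z in three 10-character fields); the function names and make_block test helper are made up for illustration:

```python
def v2000_coordinates(molblock):
    """Return [(x, y, z), ...] read from the atom block of a V2000 ctab."""
    lines = molblock.splitlines()
    atom_count = int(lines[3][0:3])  # first field of the counts line
    return [(float(line[0:10]), float(line[10:20]), float(line[20:30]))
            for line in lines[4:4 + atom_count]]


def same_coordinates(block_a, block_b, tol=1e-4):
    """True if both ctabs have the same number of atoms at the same positions."""
    a, b = v2000_coordinates(block_a), v2000_coordinates(block_b)
    return len(a) == len(b) and all(
        abs(p - q) <= tol for u, v in zip(a, b) for p, q in zip(u, v))


def make_block(points):
    """Test helper: build a bond-less ctab of carbon atoms at the given points."""
    header = ["", "  sketch", "", "%3d  0  0  0  0  0  0  0  0  0999 V2000" % len(points)]
    atoms = ["%10.4f%10.4f%10.4f C   0  0  0  0  0  0  0  0  0  0  0  0" % p
             for p in points]
    return "\n".join(header + atoms + ["M  END"])


original = make_block([(5, 5, 0), (5, 4, 0)])
relaid = make_block([(0, 0, 0), (1.299, 0.75, 0)])
print(same_coordinates(original, original), same_coordinates(original, relaid))
```

Feeding it the input molfile and the mol_to_ctab() output tells you whether the layout changed in the database or only later, in the tool that displays it.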

Cheers
    -- Jan Holst Jensen

On 2020-05-07 00:21, Sharang Phatak wrote:
> Hi,
>
> I am following the documentation for postgres / rdkit. I have a table
> with valid molfiles as confirmed from is_valid_ctab(). I am then
> trying to insert into a table 'mols' using
> mol_from_ctab(molfile::cstring,true).
>
> However, the coordinates are not preserved. Is there something I am
> missing?
>
> Thank you,
> Sharang







Re: [Rdkit-discuss] Different InChI: RDKit Knime Vs RDKit Python

2019-11-12 Thread Jan Holst Jensen

Hi Christos,

The differences are as far as I can see all in the /b layer of the InChI 
(/b = stereo parity of double bonds), so my guess is that differing 2D 
coordinates are the cause. Do you also see the difference if you run the 
output SDF back through KNIME? That is: are the coordinates written to 
the SDF different from the earlier conformations used to calculate the 
InChI?


I don't know if subtle RDKit version differences between KNIME and conda 
could cause it - I am not familiar enough with the KNIME nodes to be 
able to judge that.
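
To find out quickly which layers differ between two InChI strings, a plain string comparison per layer is enough. A minimal sketch in Python, assuming standard InChIs where each layer after the formula starts with a one-letter prefix (if a prefix occurs twice, this simplification keeps the last one); the two 2-butene strings are illustrative and not taken from the data in question:

```python
def inchi_layers(inchi):
    """Split an InChI string into {layer prefix: layer content}."""
    body = inchi.split("=", 1)[1]
    parts = body.split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            layers[part[0]] = part[1:]
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return the sorted names of layers that differ or exist in only one InChI."""
    a, b = inchi_layers(inchi_a), inchi_layers(inchi_b)
    return sorted(key for key in set(a) | set(b) if a.get(key) != b.get(key))


trans_butene = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3+"
cis_butene = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3-"
changed = differing_layers(trans_butene, cis_butene)
print(changed)
```

If only 'b' shows up for the affected compounds, the double bond geometry (and hence the coordinates) is the place to look.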


Cheers
-- Jan Holst Jensen

On 2019-11-12 13:14, Christos Kannas wrote:

Dear RDKiters,

I'm having the following problem.
I have a workflow that standardises compounds and as part of the 
process it generates standard InChI and InChIkey for the compound. The 
output is stored in an SDF.
If I parse the SDF to a dataframe in a Jupyter notebook, then use the 
mol object to generate standard InChI, for a small number of compounds 
the new standard InChI is slightly different from the one generated in 
the KNIME environment.


Environments Details:
- RDKit Knime Nodes: 3.8.0v201906261723
- RDKit Python (conda): 2018.09.3, 2019.03.4, 2019.09.1

See image: https://imgur.com/a/EnYoHWG

Best,

Christos

Christos Kannas

Scientific Software Developer (Cheminformatics)







Re: [Rdkit-discuss] Question on chirality

2019-09-13 Thread Jan Holst Jensen

Hi Navid,

I am not familiar with the paper you mention, but I believe that the 
problem is caused by non-isomeric input SMILES.


Below is an Alanine read in from molfile, with coordinates. It has a 
chiral center with "S" configuration. When you output it as non-isomeric 
SMILES and read it back in, the chiral information is lost because the 
molecule no longer has a conformation:


>>> mol = Chem.MolFromMolBlock("""
   ...   BIOCHEMF09131911262D
   ...
   ...   7  6  0  0  1  0  0  0  0  0999 V2000
   ... 0.    0.    0. N   0  0  0  0  0  0  0 0  0  0  0  0
   ... 0.7145    0.4125    0. C   0  0  0  0  0  0  0 0  0  0  0  0
   ... 1.4290    0.    0. C   0  0  0  0  0  0  0 0  0  0  0  0
   ... 1.4209   -0.8208    0. O   0  0  0  0  0  0  0 0  0  0  0  0
   ... 0.7084    1.2417    0. C   0  0  0  0  0  0  0 0  0  0  0  0
   ...    -1.    0.    0. H   0  0  0  0  0  0  0 0  0  0  0  0
   ... 2.4290    0.    0. O   0  0  0  0  0  0  0 0  0  0  0  0
   ...   2  3  1  0  0  0  0
   ...   1  2  1  0  0  0  0
   ...   3  4  2  0  0  0  0
   ...   2  5  1  1  0  0  0
   ...   1  6  1  0  0  0  0
   ...   3  7  1  0  0  0  0
   ... M  END
   ... """)
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
   [(1, 'S')]
>>> Chem.MolToSmiles(mol)
   'CC(N)C(=O)O'
>>> mol = Chem.MolFromSmiles("CC(N)C(=O)O")
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
   []
>>>


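You can generate a conformation that produces chiral information by 3D 
embedding the molecule. Note that the non-isomeric SMILES no longer 
specifies the configuration, so the embedding is free to produce either 
enantiomer; the result is not guaranteed to match the original molecule.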


>>> from rdkit.Chem import AllChem
>>> AllChem.EmbedMolecule(mol)
   0
>>> Chem.AssignAtomChiralTagsFromStructure(mol)
>>> Chem.FindMolChiralCenters(mol)
   [(1, 'S')]
>>>


Another way would be if you can get isomeric SMILES as input. Then the 
chiral information is right there.


>>> Chem.MolToSmiles(mol, isomericSmiles = True)
   'C[C@H](N)C(=O)O'
>>> mol = Chem.MolFromSmiles("C[C@H](N)C(=O)O")
>>> Chem.FindMolChiralCenters(mol)
   [(1, 'S')]
>>>


Cheers
-- Jan Holst Jensen


On 2019-09-12 04:44, Navid Shervani-Tabar wrote:

Hello,

In the paper: "Graph Networks as a Universal Machine Learning
Framework for Molecules and Crystals", authors introduce chirality as 
an atom feature input to analyze the QM9 dataset. I was trying to recreate 
this atom feature as follows:


> Chirality: (categorical) R, S, or not a Chiral center (one-hot encoded).

The code I used is:

    from chainer_chemistry import datasets
    from chainer_chemistry.dataset.preprocessors.ggnn_preprocessor import GGNNPreprocessor

    from rdkit import Chem
    import numpy as np

    dataset, dataset_smiles = datasets.get_qm9(GGNNPreprocessor(), return_smiles=True)


    for i in range(len(dataset_smiles)):
        mol = Chem.MolFromSmiles(dataset_smiles[i])
        Chem.AssignAtomChiralTagsFromStructure(mol)
        chiral_cc = Chem.FindMolChiralCenters(mol)

        if not len(chiral_cc) == 0:
            print(chiral_cc)

The output shows no Chiral centers for this dataset. When I use 
`includeUnassigned=True`, code gives a list of tuples, but instead of 
"R/S", I get "?". I was wondering if there is a mistake in my 
implementation. If this is expected, any thoughts on how chirality was 
assigned in the above paper? Thanks.


Sincerely,
Navid






Re: [Rdkit-discuss] Generating .rxn and/or .rdf files from reaction smiles

2019-09-03 Thread Jan Holst Jensen

Hi Markus,

You can generate the RXN file output from the rxn object via 
ReactionToRxnBlock, like this Python code:


>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> rxn = AllChem.ReactionFromSmarts('CC(=O)O.OCC>>CC(=O)OCC')
>>> AllChem.ReactionToRxnBlock(rxn)
   '$RXN\n\n  RDKit\n\n  2  1\n$MOL\n\n RDKit \n\n  4  3  0  0 
   0  0  0  0  0  0999 V2000\n    0. 0.    0. C   0  0  0 
   0  0  0  0  0  0  0  0  0\n 0.    0.    0. C   0  0  0 
   0  0  0  0  0  0  0 0  0\n    0.    0.    0. O   0  0 
   0  0  0  0  0 0  0  0  0  0\n    0.    0.    0. O   0 
   0  0  0 0  0  0  0  0  0  0  0\n  1  2  6  0\n  2  3  2  0\n  2  4 
   6 0\nM  END\n$MOL\n\n RDKit  \n\n  3  2  0  0  0  0 0 
   0  0  0999 V2000\n    0.    0.    0. O   0  0 0  0  0 
   0  0  0  0  0  0  0\n    0.    0.    0. C   0  0  0  0 
   0  0  0  0  0  0  0  0\n    0. 0.    0. C   0  0  0  0 
   0  0  0  0  0  0  0  0\n  1 2  6  0\n  2  3  6  0\nM 
   END\n$MOL\n\n RDKit \n\n  6  5  0  0  0  0  0  0  0  0999
   V2000\n    0. 0.    0. C   0  0  0  0  0  0  0  0  0  0 
   0  0\n 0.    0.    0. C   0  0  0  0  0  0  0  0  0  0
   0  0\n    0.    0.    0. O   0  0  0  0  0  0  0 0  0 
   0  0  0\n    0.    0.    0. O   0  0  0  0 0  0  0  0 
   0  0  0  0\n    0.    0.    0. C   0 0  0  0  0  0  0 
   0  0  0  0  0\n    0.    0. 0. C   0  0  0  0  0  0  0 
   0  0  0  0  0\n  1  2  6  0\n 2  3  2  0\n  2  4  6  0\n  4  5  6 
   0\n  5  6  6  0\nM END\n'
>>>

Writing an RDF file is quite like writing an SD file, except that the 
molfile entry is replaced with an RXN file entry. The syntax for data 
fields is a bit different too. Data fields can be used to describe 
agents, conditions, etc. but there is no fixed vocabulary (that I know 
of) for e.g. specifying "this is an agent". Hence, exchanging and 
parsing RDF files can be ... interesting.
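
As a rough illustration of the RDF layout: a minimal sketch in Python that wraps RXN blocks (such as the ReactionToRxnBlock output above) into one RD file. It assumes the usual CTfile conventions of a "$RDFILE 1" / "$DATM" header, one "$RFMT" line per reaction and "$DTYPE" / "$DATUM" pairs for data; build_rdf and the "agent" field name are made up for illustration, and the placeholder RXN block stands in for real output:

```python
from datetime import datetime


def build_rdf(rxn_blocks, data=None, timestamp=None):
    """Assemble RXN blocks and optional per-reaction data fields into RDF text."""
    timestamp = timestamp or datetime.now()
    parts = ["$RDFILE 1", "$DATM    " + timestamp.strftime("%m/%d/%y %H:%M")]
    data = data or [{} for _ in rxn_blocks]
    for rxn_block, fields in zip(rxn_blocks, data):
        parts.append("$RFMT")                    # start of one reaction record
        parts.append(rxn_block.rstrip("\n"))     # the RXN file entry itself
        for name, value in fields.items():
            parts.append("$DTYPE " + name)       # data field name (no fixed vocabulary)
            parts.append("$DATUM " + str(value))
    return "\n".join(parts) + "\n"


# Placeholder standing in for AllChem.ReactionToRxnBlock(rxn) output.
placeholder = "$RXN\n\n  placeholder\n\n  0  0\n"
rdf = build_rdf([placeholder, placeholder],
                [{"agent": "H2SO4"}, {}],
                timestamp=datetime(2019, 9, 3, 11, 59))
print(rdf)
```

Writing individual .rxn files is simpler still: write each ReactionToRxnBlock() string to its own file. Check the result against the reader you intend to use, since RDF consumers differ in what they accept.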


Cheers
-- Jan Holst Jensen

On 2019-09-03 11:59, Markus Grimm via Rdkit-discuss wrote:

Dear all,

I'm trying to generate .rxn files or one single .rdf file from a csv 
file where I have a bunch of reaction SMILES stored.
I know how to instantiate a rxn object but I have no idea how to save this 
to several .rxn files or an RDF file.


I already asked this question last Friday but unfortunately I did not 
receive any answer. So I'm wondering is it even possible?


Thank you very much for your support.

Best regards,
Markus






Re: [Rdkit-discuss] The RDKit database cartridge install problem

2019-02-06 Thread Jan Holst Jensen

On 2019-02-05 20:38, tech tech wrote:

Hi, all,

I tried to install the cartridge for Postgres.
I am using ubuntu 18.04 and postgres 9.6
However, when I typed this command:
  /home/hat/anaconda3/envs/my-rdkit-env/bin/initdb -D /var/www/rdkit
I saw following errors. I don't know how to fix it. Does anyone have a
correct input file to init the database? Thanks

Tom


The files belonging to this database system will be owned by user "hat".
[...]



performing post-bootstrap initialization ... FATAL:  column
pol.polpermissive does not exist at character 169
STATEMENT:  CREATE VIEW pg_policies AS
 SELECT
 N.nspname AS schemaname,
 C.relname AS tablename,
[...]



 FROM pg_catalog.pg_policy pol
 JOIN pg_catalog.pg_class C ON (C.oid = pol.polrelid)
 LEFT JOIN pg_catalog.pg_namespace N ON (N.oid = C.relnamespace);
child process exited with exit code 1
initdb: removing contents of data directory "/var/www/rdkit"


Hi Tom,

The polpermissive column is not available in Postgres 9.6. On a Postgres 
9.6 database the pg_catalog.pg_policy table has the columns


polname
polrelid
polcmd
polroles
polqual
polwithcheck

and on a Postgres 11.1 database the table has these columns:

polname
polrelid
polcmd
*polpermissive*
polroles
polqual
polwithcheck

Don't know if you have the option of using a newer version of Postgres ?

Cheers
-- Jan




[Rdkit-discuss] C++ - how do I create a copy of an RWMol ?

2018-11-15 Thread Jan Holst Jensen

Hi,

I have an issue with the cartridge - mol_to_svg() changes the input 
molecule, so I want MolGetSVG() to work on a copy of the input 
molecule instead.


In Code/PgSQL/rdkit/adapter.cpp:

   extern "C" char *MolGetSVG(CROMol i, unsigned int w, unsigned int h,
   const char *legend, const char *params) {
  // This is the original code that makes a read-write
  // "im" that is passed to the drawing routines where
  // it gets changed.
  RWMol *im = (RWMol *)i;

  // SVG routines need an RWMol since they change the
  // molecule as they prepare it for drawing. We don't
  // want a plain SQL function (mol_to_svg) to have
  // unexpected side effects, so take a copy and render
  // (and change) that instead.
  RWMol *input_mol = (RWMol *)i;
  RWMol input_copy = *input_mol;
  RWMol *im = &input_copy;

  ...

My construction of the "input_copy" object is obviously wrong, 'cause 
the database crashes. What is the correct way of copying an RWMol object ?


Cheers
-- Jan




Re: [Rdkit-discuss] number of significant digits in molblock?

2018-10-05 Thread Jan Holst Jensen

Hi Michal,

V2000 format is restricted by its specification to fixed format with 4 
decimals. V3000 output is not restricted to a fixed format, but the 
current code still rounds it in practice as seen below.


To get extra precision you could change the formatting of x, y, and z 
coordinate output in Code/GraphMol/FileParsers/MolFileWriter.cpp, 
function GetV3000MolFileAtomLine(), look for the


    ss << " " << x << " " << y << " " << z;

line. Adding extra digits to the X, Y, and Z coordinates *should* not 
cause issues for compliant V3000 readers.


Cheers
-- Jan

>>> import rdkit
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> m = Chem.MolFromSmiles('CC')
>>> AllChem.Compute2DCoords(m)
0
>>> m.GetConformer(0).SetAtomPosition(0, rdkit.Geometry.Point3D(0.123456789, 0.2, 0.3))

>>> print(Chem.MolToMolBlock(m))

     RDKit          2D

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.1235    0.2000    0.3000 C   0  0  0  0  0  0  0  0  0  0  0  0    <== 4 decimal digits
    0.7500   -0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
M  END

>>> print(Chem.MolToMolBlock(m, forceV3000=True))

 RDKit  2D

  0  0  0  0  0  0  0  0  0  0999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 2 1 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C 0.123457 0.2 0.3 0    <== 6 significant digits
M  V30 2 C 0.75 -5.55112e-17 0 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 1 2
M  V30 END BOND
M  V30 END CTAB
M  END

>>>
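
The two roundings seen above can be reproduced without RDKit. A minimal sketch in Python, assuming the V2000 writer uses a fixed "%10.4f" field and the V3000 writer streams the doubles with the default C++ precision of 6 significant digits; that is consistent with the output shown, but it has not been checked against the writer source here:

```python
x = 0.123456789

# V2000: fixed-width field, always 4 decimals.
v2000_field = "%10.4f" % x

# V3000 as currently written: 6 significant digits, like a default C++ stream.
stream_style = format(x, ".6g")
tiny = format(-5.551115123125783e-17, ".6g")

# What a patched V3000 writer could emit if 10 decimals were requested.
ten_decimals = format(x, ".10f")

print(repr(v2000_field), stream_style, tiny, ten_decimals)
```

The third value shows why the V3000 output contains -5.55112e-17: no fixed number of decimals is applied, only a cap on significant digits.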

On 2018-10-05 11:42, Michal Krompiec wrote:

Hello,
Is it possible to control the number of significant digits of XYZ 
coordinates? I am modifying coordinates of my molecules 
using SetAtomPosition but when I save them into an SDF it seems that 
the precision is limited to 4 digits after the decimal point (I'd like 
10 instead...).

Best wishes,
Michal






Re: [Rdkit-discuss] Strange behaviour of the substructure search

2017-11-30 Thread Jan Holst Jensen

On 2017-11-30 11:45, Lionel Colliandre wrote:
1- for polymers (brackets with n label) , the ctab is not considered 
as valid and the mol_from_ctab function is not working (example of a 
ctab at the end of the email). I think that it is the "M  STY 1   1 
SRU" block that is problematic. To the best of my knowledge no 
cartridge is able to search directly a polymer but I would like simply 
to be able to search the monomeric motif. Even with big warning, is 
there a way to read and search such polymeric molecules with RDKit?


Hi Lionel,

I have the exact same challenge in a molecule database. We decided to go 
with a convention that end users will register polymers with exactly 3 
repeats so you can do sensible SSS searches.


In the registered polymers we use MUL (multiple-group) S-group brackets 
instead of the SRU bracket type which RDKit (most sensibly) refuses to 
process. We have various checks in place that ensure that polymers get 
registered this way. The checks are implemented through regular 
expressions - I know, processing molfiles via regexp parsing is crazy 
:-) but it works OK.


If you want a simple dirty hack you can try to process the polymers this 
way:


1) Replace "SRU" S-group types with "MUL".
2) Change the "n" label to "1".

That will change your example molfile into:

"

  Mrv1718011301710072D

  4  3  0  0  0  0    999 V2000
   -6.3839    2.3661    0. O   0  0  0  0  0  0  0  0 0  0 0  0
   -5.7428    1.8469    0. C   0  0  0  0  0  0  0  0 0  0 0  0
   -4.9726    2.1425    0. O   0  0  0  0  0  0  0  0 0  0 0  0
   -4.3314    1.6233    0. H   0  0  0  0  0  0  0  0 0  0 0  0
  1  2  1  0  0  0  0
  2  3  1  0  0  0  0
  3  4  1  0  0  0  0
M  STY  1   1 *MUL*
M  SCN  1   1 HT
M  SAL   1  2   2   3
M  SDI   1  4   -4.3971    2.3134   -5.0201    1.5441
M  SDI   1  4   -6.3183    1.6760   -5.6953    2.4454
M  SBL   1  2   1   3
M  SMT   1 *1*
M  END
"

which RDKit will process and you get the single monomer unit.
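
The two replacement steps can be scripted roughly like this. A minimal sketch in Python, not our production checks: sru_to_mul is a made-up name, and the sketch assumes that every S-group labelled exactly "n" is one of the former SRU groups:

```python
import re


def sru_to_mul(molblock):
    """Rewrite SRU S-groups as MUL groups and turn an 'n' label into '1'."""
    lines = []
    for line in molblock.splitlines():
        if line.startswith("M  STY"):
            # S-group type line: swap every SRU type token for MUL.
            line = re.sub(r"\bSRU\b", "MUL", line)
        elif line.startswith("M  SMT"):
            # S-group label line: replace a bare "n" label with "1".
            line = re.sub(r"^(M  SMT\s+\d+\s+)n\s*$", r"\g<1>1", line)
        lines.append(line)
    return "\n".join(lines)


example = "\n".join([
    "M  STY  1   1 SRU",
    "M  SCN  1   1 HT",
    "M  SMT   1 n",
    "M  END",
])
converted = sru_to_mul(example)
print(converted)
```

Only the property lines are shown in the example; the atom and bond blocks pass through unchanged.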

Cheers
-- Jan Holst Jensen


Re: [Rdkit-discuss] Masking groups as atoms in RDKit

2017-09-29 Thread Jan Holst Jensen

Hi Kovas,

Greg has precisely pointed out the major problem of collapsing fragments 
into single atoms: Searching and comparing structures.


With that warning in mind: I use pseudo atoms (e.g. "Ala", "Arg",...) to 
good effect to represent amino acids in peptides and proteins. My 
colleague Esben Bjerrum has done custom builds of RDKit where the 
atomic_data.cpp file was changed to add the 22 natural amino acids.


The rest of RDKit handles the new atoms surprisingly well. The new atoms 
can also be used in SMARTS queries as long as you reference them by 
atomic number (and Greg's caution about searching applies doubly in that 
case).


So, yes, that's one way of doing it. Just don't expect anyone else to be 
able to interpret your molfiles reliably :-).



You write that you want to mask away the macromolecule part since you 
are not going to interact with it. In that case it sounds like it is OK 
to throw away the underlying chemistry of the macromolecule and 
substitute a label for depiction. I would then go with Greg's suggestion 
to use dummy atoms and labels, e.g.


   import rdkit
   from rdkit import Chem
   from rdkit.Chem import Draw

   m = Chem.MolFromSmiles('CC[*:1]')
   # Put a molfile label on the star atom.
   m.GetAtoms()[2].SetProp("molFileAlias", "Macromol-section")

   print(Chem.MolToMolBlock(m))

   PRINT OUTPUT:

 RDKit

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.    0.    0. C   0  0  0  0  0  0  0  0 0  0  0  0
    0.    0.    0. C   0  0  0  0  0  0  0  0 0  0  0  0
    0.    0.    0. R   0  0  0  0  0  1  0  0 0  1  0  0
  1  2  1  0
  2  3  1  0
A    3
Macromol-section
M  END

If you paste that molfile into MarvinSketch you see this (different 
tools will show labels in different ways):




I am very much a molfile guy, so I don't know if labels can be carried 
over to RDKit SMILES strings.


Cheers
-- Jan

On 2017-09-28 08:00, Kovas Palunas wrote:
The way I was thinking about it, the SMARTS OCC would not match 
O[But] because [But] is a totally new atom that is not related to 
carbon at all.  This doesn't really make sense in this example, but it 
does (I think) for most of my purposes (where I want to mask away a 
biological macromolecule that I do not want to interact with).


There are probably still edge cases I'm not seeing... but maybe it's 
still worth a try?  I saw there was a periodic table module in RDKit.  
Is it possible to add these atoms there?


- Kovas


From: Greg Landrum
Sent: Wednesday, September 27, 10:13 PM
Subject: Re: [Rdkit-discuss] Masking groups as atoms in RDKit
To: Kovas Palunas
Cc: rdkit-discuss@lists.sourceforge.net



I'm afraid that there's likely to be rather a lot of devil hiding in 
the details (as is so often the case).


A simple example of one problem: let's take your [But]O case. Suppose 
you do a substructure search for the molecule defined by the SMARTS 
"OCC". Does that match "[But]O"?  What does it return when I ask for 
the substructure matches (this function, if you aren't familiar with 
it, returns the indices of the matching atoms)? What about the SMARTS 
"CC"?


One solution to this that works with substructure searching is to have 
the molecule contain all the atoms - "O" in your example - but to 
have the four C atoms marked as a group so that drawings of the 
molecule display "[But]O". Supporting this type of functionality is on 
the To Do list (it's part of supporting S Groups from Mol files).


If you just want to indicate that there is a [But] group there but not 
really do anything with the group's structure, there are probably 
already ways to handle this using dummy atoms and custom labels.


-greg




On Wed, Sep 27, 2017 at 9:26 PM, Kovas Palunas wrote:
Ideally, I'd like to treat these pseudoatoms as similarly to normal 
atoms as possible.  I would mostly want to use them for substructure 
matching, running reactions, and also display purposes.  Also, basic 
atom queries, such as getting a mapping number or a atom symbol.


I was thinking that maybe this could be done by just defining the CoA 
atom type (for example) just as the carbon or oxygen atom types are 
defined (setting atomic weight, valences, etc.).


Does this make sense?

 - Kovas

From: Greg Landrum
Sent: Wednesday, September 27, 2017 2:27:04 AM
To: Kovas Palunas
Cc: rdkit-disc...@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Masking groups as atoms in RDKit

Where would you want to use this?
Is it for depiction (i.e. drawing molecules) or something else?

-greg


On Tue, Sep 26, 2017 at 10:12 PM, Kovas Palunas wrote:

Hi all,

Has anyone tried implementing or using a group to atom 

[Rdkit-discuss] RDKit for Excel - Python3-ready.

2017-09-25 Thread Jan Holst Jensen

Hi RDKitters,

I was able to use the RDKit Hackathon to work on getting RDKit for Excel 
ready for Python 3. The current version now supports both Python 2 and 3.


The documentation has also been extended so I hope that it is now 
relatively straightforward to get it up and running.


https://github.com/janholstjensen/rdkit4excel


Cheers
-- Jan Holst Jensen


Re: [Rdkit-discuss] Announce: RDKit for Excel.

2017-06-19 Thread Jan Holst Jensen

Hi Curt,

The add-in is based on COM automation services, so it is Windows-only to 
the best of my knowledge. We chose to implement it as a COM service 
since the Python code was then extremely simple and very easy to extend.


I don't have access to a Mac so I am having a hard time figuring out if 
MS Office on Mac implements enough of the COM subsystem to support COM 
automation servers. It might be that both Mac and Windows could be 
supported if the add-in was written in C++ as an XLL. My gut feeling is 
that it would then be harder to extend dynamically, but I could be 
wrong. One thing I do know is that it would be more complex to write.


To support both Windows and Mac you can also take the approach that 
xlwings has taken where a DLL is loaded by VBA code and talks to the 
Python code:

https://www.xlwings.org/

We initially tried to base our add-in on xlwings but got into trouble 
sharing Excel worksheets with User Defined Functions. The UDFs in 
xlwings are made available to Excel through VBA auto-generated from the 
Python code. Smart concept, but gets us into trouble using UDFs in 
sheets that are passed around or copied.


Python 3 - yes, we should go there. If the COM support that the add-in 
relies on is available in Python 3 it should be straightforward (famous 
last words :-) ). Hopefully I will have some time to check this out over 
the summer.


Cheers
-- Jan

On 2017-06-19 18:40, Curt Fischer wrote:

This is great!  Thanks Jan.

I haven't tried it yet, but based on the GitHub README it looks like 
this is only for Excel on a Windows box.  Is that right, or can Mac 
versions of Excel also work?  And since it's come up on the mailing 
list recently, is there a plan to expand to/move to Python 3 at any 
point soon?


Curt

On Sun, Jun 18, 2017 at 11:06 AM, Jan Holst Jensen 
<j...@biochemfusion.com <mailto:j...@biochemfusion.com>> wrote:


Hi RDKitters,

I am happy to announce an open source Excel add-in that gives easy
access to the RDKit Python API. The add-in is BSD-licensed like RDKit.
https://github.com/janholstjensen/rdkit4excel

Screenshot of the add-in running in Excel 2016 (note: molecule
rendering requires additional 3rd party software):


The add-in is easily extendable via pure Python scripting. A new
Excel function is added by adding a function to the CRDKitXL
Python class and annotating the new function's input/output
parameter types through structured comments. For example, adding
an "rdkit_SmilesToMolBlock()" function that has a single "smiles"
string input parameter:

#RDKITXL: in:smiles:str, out:str
def rdkit_SmilesToMolBlock(self, smiles):
    # Python function implementation follows here...


Many thanks to Esben Jannik Bjerrum who did the implementation of
this first version.

Cheers
-- Jan Holst Jensen








[Rdkit-discuss] Announce: RDKit for Excel.

2017-06-18 Thread Jan Holst Jensen

Hi RDKitters,

I am happy to announce an open source Excel add-in that gives easy 
access to the RDKit Python API. The add-in is BSD-licensed like RDKit.

https://github.com/janholstjensen/rdkit4excel

Screenshot of the add-in running in Excel 2016 (note: molecule rendering 
requires additional 3rd party software):



The add-in is easily extendable via pure Python scripting. A new Excel 
function is added by adding a function to the CRDKitXL Python class and 
annotating the new function's input/output parameter types through 
structured comments. For example, adding an "rdkit_SmilesToMolBlock()" 
function that has a single "smiles" string input parameter:


#RDKITXL: in:smiles:str, out:str
def rdkit_SmilesToMolBlock(self, smiles):
    # Python function implementation follows here...
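
To illustrate the annotation format, here is a minimal sketch of how such a structured comment could be parsed. It is not the add-in's actual parser: parse_rdkitxl_annotation is a hypothetical helper, and the grammar (comma-separated "in:name:type" entries plus one "out:type" entry) is an assumption inferred only from the single example above.

```python
import re


def parse_rdkitxl_annotation(comment):
    """Parse '#RDKITXL: in:smiles:str, out:str' into (inputs, output_type).

    Hypothetical helper; the grammar is an assumption based on one example.
    """
    match = re.match(r"\s*#RDKITXL:\s*(.*)$", comment)
    if match is None:
        raise ValueError("not an RDKITXL annotation: %r" % comment)
    inputs, output_type = [], None
    for entry in match.group(1).split(","):
        parts = [p.strip() for p in entry.strip().split(":")]
        if parts[0] == "in" and len(parts) == 3:
            # in:<parameter name>:<type>
            inputs.append((parts[1], parts[2]))
        elif parts[0] == "out" and len(parts) == 2:
            # out:<type>
            output_type = parts[1]
        else:
            raise ValueError("unrecognized entry: %r" % entry)
    return inputs, output_type


print(parse_rdkitxl_annotation("#RDKITXL: in:smiles:str, out:str"))
```

The result pairs each input parameter name with its declared type, which is the information a wrapper would need to expose the function to Excel.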


Many thanks to Esben Jannik Bjerrum who did the implementation of this 
first version.


Cheers
-- Jan Holst Jensen


Re: [Rdkit-discuss] RDKit Windows binaries - exists ? Where ?

2017-05-05 Thread Jan Holst Jensen

Hi Greg,

> Using conda makes the installation a lot simpler and saves me a fair 
amount of time while doing the builds.


Sounds like a good deal.

> Is using the anaconda python distribution a problem for you?

Actually not. I was concerned about its support for other packages, but 
it comes with pywin32 built-in so we can create COM services (think 
Excel...) and it does indeed make it much easier to install RDKit.


Thanks for the help.

Cheers
-- Jan

On 2017-05-05 04:48, Greg Landrum wrote:

Hi Jan,

I've stopped creating standalone windows binaries now that the conda 
packages are available.
Using conda makes the installation a lot simpler and saves me a fair 
amount of time while doing the builds.


Is using the anaconda python distribution a problem for you?

On Thu, May 4, 2017 at 5:41 PM, Jan Holst Jensen 
<j...@biochemfusion.com <mailto:j...@biochemfusion.com>> wrote:


Hi,

I was trying to find Python Windows binaries for the latest RDKit
release. I could find them for the older 2016_03_1 release on
SourceForge but had no luck finding anything newer on SourceForge
or Github.

According to GitHub they should be there:
https://github.com/rdkit/rdkit/tree/Release_2017_03_1
<https://github.com/rdkit/rdkit/tree/Release_2017_03_1>

Under "Installation":
"Windows binaries are available with each release." - but where :-) ?

Cheers
-- Jan





[Rdkit-discuss] RDKit Windows binaries - exists ? Where ?

2017-05-04 Thread Jan Holst Jensen
Hi,

I was trying to find Python Windows binaries for the latest RDKit 
release. I could find them for the older 2016_03_1 release on 
SourceForge but had no luck finding anything newer on SourceForge or Github.

According to GitHub they should be there:
https://github.com/rdkit/rdkit/tree/Release_2017_03_1

Under "Installation":
"Windows binaries are available with each release." - but where :-) ?

Cheers
-- Jan



Re: [Rdkit-discuss] SD file read error

2017-01-11 Thread Jan Holst Jensen
rceforge.net/p/rdkit/mailman/message/30036309/

In principle it should be possible to fix these errors on the fly by 
loading the molecules without sanitization, detect the valence issues 
and somehow automagically correct them, then sanitize and calculate mol 
weights. In this case it looks like it would be easier and faster to 
locate the relatively few molecules with valence errors and fix them 
manually.


Cheers
-- Jan Holst Jensen, Biochemfusion


Re: [Rdkit-discuss] InChI support PostgreSQL cartridge

2016-03-03 Thread Jan Holst Jensen

On 2016-03-03 15:23, Max Kretzschmar wrote:

Dear all,

I have a problem regarding the postgres cartridge. Everything is 
working fine except the Inchi functions:


select * from mol_inchi('c1c1'::mol);
  mol_inchi
-
 InChI not available
(1 Zeile)


RDKit was built from source with InChI support, and the InChI functions are 
available within my Python environment. The RDKit version I use is 
2014_03_01, the cartridge version is 0.72.0. Did I miss something 
here? Is there a parameter to set when building the cartridge?


I appreciate any help you can provide.

Max


Hi Max,

Indeed there is a parameter. You need to modify 
Code/PgSQL/rdkit/Makefile. It starts with


# -
# Variables used and default values:
# USE_INCHI enables InChI functions; requires rdkit built with inchi support:
#USE_INCHI=0
# USE_AVALON enables the avalon fingerprint; requires rdkit built with avalon support:
#USE_AVALON=0
# USE_THREADS links against boost.system; required with non-ancient boost versions if inchi is enabled or the rdkit is built with threadsafe SSS:
#USE_THREADS=0
# STATIC_LINK links against the static RDKit libraries:
#STATIC_LINK=1
# -


Uncomment the #USE_INCHI=0 line and change it so it reads:

USE_INCHI=1


and then rebuild and reinstall the cartridge. That should do it :-).
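
If the change has to be applied repeatedly (for example in a build script), the edit described above can be sketched in a few lines of Python. This is only an illustration: enable_inchi is a hypothetical helper, the sample text is abbreviated from the Makefile header quoted above, and editing the file by hand works just as well.

```python
import re


def enable_inchi(makefile_text):
    """Return makefile_text with the USE_INCHI default switched on."""
    # Matches '#USE_INCHI=0' or 'USE_INCHI=0' on a line of its own, but not
    # the explanatory comment line that starts with '# USE_INCHI enables'.
    pattern = re.compile(r"^#?USE_INCHI=\d+[ \t]*$", re.MULTILINE)
    new_text, count = pattern.subn("USE_INCHI=1", makefile_text)
    if count == 0:
        raise ValueError("no USE_INCHI setting found")
    return new_text


sample = (
    "# USE_INCHI enables InChI functions; requires rdkit built with inchi support:\n"
    "#USE_INCHI=0\n"
    "# USE_AVALON enables the avalon fingerprint; requires rdkit built with avalon support:\n"
    "#USE_AVALON=0\n"
)
print(enable_inchi(sample))
```

Only the USE_INCHI line is touched; the other defaults stay commented out.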

Cheers
-- Jan Holst Jensen



Re: [Rdkit-discuss] RDKit Valences

2016-01-29 Thread Jan Holst Jensen

On 2016-01-29 17:02, Rich Lewis wrote:

Dear all,

I tried to load a periodate from its smiles (I think these are 
correct), but it failed as according to rdkit, Iodine cannot have a 
valence of 7.  However, it can when stabilised with oxygens (at least 
from what I recall from my undergraduate course!).

In [23]: Chem.MolFromSmiles('[O-]I(=O)(=O)=O')
[15:29:04] Explicit valence for atom # 1 I, 7, is greater than permitted

In [24]: Chem.MolFromSmiles('OI(=O)(=O)=O')
[15:24:21] Explicit valence for atom # 1 I, 7, is greater than permitted

In [25]: Chem.MolFromSmiles('[H]OI(=O)(=O)=O')
[15:24:26] Explicit valence for atom # 1 I, 7, is greater than permitted

I don’t know if the file rdkit/Code/GraphMol/atomic_data.cpp is where 
the valences are configured for rdkit, but in the file, I saw that the 
allowed valence states are +1, +2(?) and +5 but not +7.  A few of the 
other elements are also similarly conservative, such as Br only 
allowing +1. Should these be expanded, or is this deliberate?




Hi Rich,

The file rdkit/Code/GraphMol/atomic_data.cpp is indeed where the 
valences are configured. There was a similar discussion in 2015 that 
resulted in additional valence data being added to P.


http://sourceforge.net/p/rdkit/mailman/message/34131420/

You can try out the valence change locally by adding it to 
atomic_data.cpp (TAB-delimited format) and rebuilding RDKit.


Cheers
-- Jan Holst Jensen


Re: [Rdkit-discuss] molecule standardization in cartridge search

2015-09-26 Thread Jan Holst Jensen

Hi Tim,

Soren (cc:ed) wrote me and asked about molvs. Thanks to Soren for 
reminding me that the original question was about standardization more than 
calling Python code from Postgres :-).

http://molvs.readthedocs.org/en/latest/

Take a look at molvs - it's got lots of functionality that you will 
need. We also use molvs as the backbone for much of our standardization.


Cheers
-- Jan


Hi Tim,

A simple getting-started example is:

   CREATE FUNCTION smiles2molfile(smiles text) RETURNS text
   LANGUAGE plpythonu AS $$
   import rdkit
   from rdkit import Chem

   mol = Chem.MolFromSmiles(smiles)
   return Chem.MolToMolBlock(mol)
   $$;


and you can then

select smiles2molfile('CC');

and get back a molfile.

For more advanced usage it is worth taking a look at the rdchord project 
that TJ has sent links to.


Cheers
-- Jan

On 2015-09-25 15:54, Tim Dudgeon wrote:

Jan,

thanks for that. I'll give it a try.
Are there any examples of writing RDKit functions and procedures for
postgres in python?
I see this general postgres docs:
http://www.postgresql.org/docs/9.4/static/plpython.html
but wondered if there are any RDKit specific examples anywhere?

Tim

On 25/09/2015 08:30, Jan Holst Jensen wrote:

On 2015-09-24 16:22, Tim Dudgeon wrote:

I'm trying to get to grips with using the RDKit cartridge, and so far
it's going well.
One thing I'm concerned about is molecule standardization, along the
lines of the ChemAxon Standardizer that allows substructure searches to
be done in a way that is largely independent of the quirks of structure
representation. The classic example would be how nitro groups are
represented, so that it didn't matter which nitro representation was in
the query or target structures, because both were converted to a
canonical form.

My initial thoughts are that this would be done by:
1. loading the "raw" structures into a source column that would never be
changed
2. defining a function that performed the necessary transform to
generate the canonical form of a molecule.
3. generating a "canonical" structure column that was the result of
passing the raw structures through that function
4. building the SSS index on that canonical column
5. executing queries using that function to canonicalize the query
structure

The problem I'm finding is that there do not seem to be postgres
functions defined for doing molecular transforms (essentially a reaction
transform) and doing things like removing explicit hydrogens. At least
not in the functions listed on this page:
http://rdkit.org/docs/Cartridge.html#functions

Am I missing something here, or might I be barking up completely the
wrong tree?

Tim

Hi Tim,

We have about the same situation and we're adding standardization
(beyond what RDKit implicitly does when it sanitizes the molecule)
through Python stored procedures. You will need to build and maintain
a normal Python-enabled RDKit installation in parallel to the
cartridge. The Python stored procedures can access the normal RDKit
installation and then run whatever Python code is necessary to do
additional molecule cleanup.

You will need to tweak your Postgres environment so the Python stored
procedures can load RDKit. This is what I have defined in an
environment file on CentOS:

RDBASE=/opt/rdkit
LD_LIBRARY_PATH=/opt/rdkit/lib
PYTHONPATH=/opt/rdkit

On Ubuntu this would go into /etc/postgresql/9.x/main/environment (in
a slightly different format where the values have to be single-quoted).

Cheers
-- Jan, Biochemfusion




Re: [Rdkit-discuss] molecule standardization in cartridge search

2015-09-25 Thread Jan Holst Jensen
On 2015-09-24 16:22, Tim Dudgeon wrote:
> I'm trying to get to grips with using the RDKit cartridge, and so far
> it's going well.
> One thing I'm concerned about is molecule standardization, along the
> lines of the ChemAxon Standardizer that allows substructure searches to
> be done in a way that is largely independent of the quirks of structure
> representation. The classic example would be how nitro groups are
> represented, so that it didn't matter which nitro representation was in
> the query or target structures, because both were converted to a
> canonical form.
>
> My initial thoughts are that this would be done by:
> 1. loading the "raw" structures into a source column that would never be
> changed
> 2. defining a function that performed the necessary transform to
> generate the canonical form of a molecule.
> 3. generating a "canonical" structure column that was the result of
> passing the raw structures through that function
> 4. building the SSS index on that canonical column
> 5. executing queries using that function to canonicalize the query structure
>
> The problem I'm finding is that there do not seem to be postgres
> functions defined for doing molecular transforms (essentially a reaction
> transform) and doing things like removing explicit hydrogens. At least
> not in the functions listed on this page:
> http://rdkit.org/docs/Cartridge.html#functions
>
> Am I missing something here, or might I be barking up completely the
> wrong tree?
>
> Tim

Hi Tim,

We have about the same situation and we're adding standardization 
(beyond what RDKit implicitly does when it sanitizes the molecule) 
through Python stored procedures. You will need to build and maintain a 
normal Python-enabled RDKit installation in parallel to the cartridge. 
The Python stored procedures can access the normal RDKit installation 
and then run whatever Python code is necessary to do additional molecule 
cleanup.

You will need to tweak your Postgres environment so the Python stored 
procedures can load RDKit. This is what I have defined in an environment 
file on CentOS:

RDBASE=/opt/rdkit
LD_LIBRARY_PATH=/opt/rdkit/lib
PYTHONPATH=/opt/rdkit

On Ubuntu this would go into /etc/postgresql/9.x/main/environment (in a 
slightly different format where the values have to be single-quoted).
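
As a rough sketch of the format difference, the snippet below turns the three CentOS-style settings into single-quoted entries. The exact target layout ("NAME = 'value'") is an assumption based on the description above, so check it against the comments in your own environment file; to_ubuntu_environment is a hypothetical helper and does not escape quotes inside values.

```python
def to_ubuntu_environment(lines):
    """Convert KEY=value lines to KEY = 'value' lines (assumed target format)."""
    converted = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        converted.append("%s = '%s'" % (key.strip(), value.strip()))
    return converted


centos_settings = [
    "RDBASE=/opt/rdkit",
    "LD_LIBRARY_PATH=/opt/rdkit/lib",
    "PYTHONPATH=/opt/rdkit",
]
for entry in to_ubuntu_environment(centos_settings):
    print(entry)
```

After changing the environment file, the Postgres server has to be restarted before Python stored procedures see the new variables.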

Cheers
-- Jan, Biochemfusion



Re: [Rdkit-discuss] Save files with new atom properties and read again

2015-07-06 Thread Jan Holst Jensen

Hi Hitesh,

Great to hear, and glad that it worked for you out-of-the-box :-).

Cheers
-- Jan

On 2015-07-06 15:57, Hitesh Patel wrote:

Hi Jan and Greg,
I have tried the way you explained. It works fine for me. I am also 
using version 2015.03.1, so the additional function was not required.

Thanks.



On Sun, Jul 5, 2015 at 11:30 PM Hitesh Patel hiteshpatel...@gmail.com 
mailto:hiteshpatel...@gmail.com wrote:


Hi Jan and Greg,

Thanks for nice suggestion. I think that will serve the purpose. I
will do it tomorrow asap and let you know.


Regards,

Dr. Hitesh Patel
Post-Doctoral Fellow,
Technische Universität Dortmund,
Chemische Biologie,
Otto-Hahn-Straße 6,
44227, Dortmund,
Germany
Room: C1-05-181
Work: 0231 755-4740
Mob.: +49 (0)176 5544 6467

Email: hitesh.pa...@tu-dortmund.de
mailto:hitesh.pa...@tu-dortmund.de


On Sun, Jul 5, 2015 at 10:49 AM, Jan Holst Jensen
j...@biochemfusion.com mailto:j...@biochemfusion.com wrote:

Hi Greg,

I was running with both RDKit 2014.09.2 and 2013.09.1. V   lines
accepted and read in input, but no V   lines produced in mol
block output.

I downloaded 2015.03.1 and that produces V   lines in mol block
output. Another good reason to upgrade :-).

2014.09.2:

>>> from rdkit import rdBase
>>> rdBase.rdkitVersion
'2014.09.2'
>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('CO')
>>> m.GetAtomWithIdx(0).SetProp('molFileValue','a1')
>>> print(Chem.MolToMolBlock(m))

  RDKit

   2  1  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0 0  0
 0.0.0. O   0  0  0  0  0  0  0  0  0  0 0  0

   1  2  1  0
M  END

 


2015.03.1:

>>> from rdkit import rdBase
>>> rdBase.rdkitVersion
'2015.03.1'
>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('CO')
>>> m.GetAtomWithIdx(0).SetProp('molFileValue','a1')
>>> print(Chem.MolToMolBlock(m))

  RDKit

   2  1  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0 0  0
 0.0.0. O   0  0  0  0  0  0  0  0  0  0 0  0

   1  2  1  0
V1 a1
M  END

 

Cheers
-- Jan

On 2015-07-05 04:59, Greg Landrum wrote:
 Hmm, that's strange. Unless that atom also has a query associated with
 it, the atom values definitely should be written. It certainly works
 at least some of the time:

 In [5]: m = Chem.MolFromSmiles('CO')

 In [6]: m.GetAtomWithIdx(0).SetProp('molFileValue','a1')

 In [7]: mb = Chem.MolToMolBlock(m)

 In [8]: print(mb)

  RDKit

   2  1  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
 0.0.0. O   0  0  0  0  0  0  0  0  0  0  0  0

   1  2  1  0
 V1 a1
 M  END

 Jan: what version of the RDKit are you using?

 -greg



 On Sat, Jul 4, 2015 at 8:08 PM, Jan Holst Jensen
 j...@biochemfusion.com wrote:

 Hi Hitesh,

 The V2000 molfile format has a feature that can be used to set a
 simple text value for an atom by adding V  lines to the
 molfile. The RDKit molfile *reader* supports this feature as seen
 below (I have seen this feature used to e.g. tag reactive centers
 in a molecule when doing RDKit reaction-based enumeration).

 >>> from rdkit import Chem
 >>> molfile_with_values = "".join(open("C:/temp/cns-with-values.mol").readlines())
 >>> print(molfile_with_values)

   -ISIS-  07041519212D

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0958   -2.68330. C   0  0  0  0  0  0  0  0  0  0  0 0
 0.8083   -2.27080. N   0  0  0  0  0  0  0  0  0  0  0 0
 1.5208   -2.67920. S   0  0  0  0  0  0  0  0  0  0  0 0
   1  2  1  0  0  0  0
   2  3  1  0  0  0  0
 V1 Carbs
 V3 Sulfuric
 M  END

 >>> m = Chem.MolFromMolBlock(molfile_with_values)
 >>> m.GetAtoms()[0].GetProp('molFileValue')
 'Carbs

Re: [Rdkit-discuss] Save files with new atom properties and read again

2015-07-05 Thread Jan Holst Jensen
Hi Greg,

I was running with both RDKit 2014.09.2 and 2013.09.1. V   lines 
accepted and read in input, but no V   lines produced in mol block output.

I downloaded 2015.03.1 and that produces V   lines in mol block 
output. Another good reason to upgrade :-).

2014.09.2:

>>> from rdkit import rdBase
>>> rdBase.rdkitVersion
'2014.09.2'
>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('CO')
>>> m.GetAtomWithIdx(0).SetProp('molFileValue','a1')
>>> print(Chem.MolToMolBlock(m))

  RDKit

   2  1  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0 0  0
 0.0.0. O   0  0  0  0  0  0  0  0  0  0 0  0
   1  2  1  0
M  END

 


2015.03.1:

>>> from rdkit import rdBase
>>> rdBase.rdkitVersion
'2015.03.1'
>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('CO')
>>> m.GetAtomWithIdx(0).SetProp('molFileValue','a1')
>>> print(Chem.MolToMolBlock(m))

  RDKit

   2  1  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0 0  0
 0.0.0. O   0  0  0  0  0  0  0  0  0  0 0  0
   1  2  1  0
V1 a1
M  END

 

Cheers
-- Jan

On 2015-07-05 04:59, Greg Landrum wrote:
 Hmm, that's strange. Unless that atom also has a query associated with 
 it, the atom values definitely should be written. It certainly works 
 at least some of the time:

 In [5]: m = Chem.MolFromSmiles('CO')

 In [6]: m.GetAtomWithIdx(0).SetProp('molFileValue','a1')

 In [7]: mb = Chem.MolToMolBlock(m)

 In [8]: print(mb)

  RDKit

   2  1  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
 0.0.0. O   0  0  0  0  0  0  0  0  0  0  0  0
   1  2  1  0
 V1 a1
 M  END

 Jan: what version of the RDKit are you using?

 -greg



 On Sat, Jul 4, 2015 at 8:08 PM, Jan Holst Jensen 
 j...@biochemfusion.com mailto:j...@biochemfusion.com wrote:

 Hi Hitesh,

 The V2000 molfile format has a feature that can be used to set a
 simple text value for an atom by adding V   lines to the
 molfile. The RDKit molfile *reader* supports this feature as seen
 below (I have seen this feature used to e.g. tag reactive centers
 in a molecule when doing RDKit reaction-based enumeration).

 >>> from rdkit import Chem
 >>> molfile_with_values = "".join(open("C:/temp/cns-with-values.mol").readlines())
 >>> print(molfile_with_values)

   -ISIS-  07041519212D

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0958   -2.68330. C   0  0  0  0  0  0 0  0  0  0  0 0
 0.8083   -2.27080. N   0  0  0  0  0  0 0  0  0  0  0 0
 1.5208   -2.67920. S   0  0  0  0  0  0 0  0  0  0  0 0
   1  2  1  0  0  0  0
   2  3  1  0  0  0  0
 V1 Carbs
 V3 Sulfuric
 M  END

 >>> m = Chem.MolFromMolBlock(molfile_with_values)
 >>> m.GetAtoms()[0].GetProp('molFileValue')
 'Carbs'
 >>> m.GetAtoms()[1].GetProp('molFileValue')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 KeyError: 'molFileValue'
 >>> m.GetAtoms()[2].GetProp('molFileValue')
 'Sulfuric'



 As you can see, the V   lines in the molfile are put into RDKit
 atom molFileValue properties.

 Unfortunately, the atom values are not written when RDKit outputs
 a molfile:

 >>> print(Chem.MolToMolBlock(m))


  RDKit  2D

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0958   -2.68330. C   0  0  0  0  0  0 0  0  0  0  0 0
 0.8083   -2.27080. N   0  0  0  0  0  0 0  0  0  0  0 0
 1.5208   -2.67920. S   0  0  0  0  0  0 0  0  0  0  0 0
   1  2  1  0
   2  3  1  0
 M  END



 But, it is fairly easy to add them with this function:

 def MolToMolBlock_WithAtomValues(mol):
     mol_block = Chem.MolToMolBlock(mol).split("\n")
     # Delete the M  END line (and the empty string after the final newline).
     mol_block = mol_block[:-2]
     # Add appropriate V lines.
     for atom in mol.GetAtoms():
         if atom.HasProp("molFileValue"):
             mol_block.append("V  %3d %s" % (atom.GetIdx() + 1,
                                             atom.GetProp("molFileValue")))

     mol_block.append("M  END")
     return "\n".join(mol_block)

 This lets you persist atom text values. Disclaimer: I have no idea
 if this will break in the presence of other property lines, e.g.
 M CHG etc., but ... it's a start.

 As an example, let's first create the CNS molecule without atom
 values.

 >>> m = Chem.MolFromSmiles("CNS")
 >>> print(MolToMolBlock_WithAtomValues(m))

  RDKit

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0 0  0  0  0  0 0
 0.0.0. N   0  0  0  0  0  0 0  0  0  0  0 0
 0.0.0. S   0  0  0  0  0  0 0  0  0

Re: [Rdkit-discuss] Save files with new atom properties and read again

2015-07-04 Thread Jan Holst Jensen
Hi Hitesh,

The V2000 molfile format has a feature that can be used to set a simple 
text value for an atom by adding V   lines to the molfile. The RDKit 
molfile *reader* supports this feature as seen below (I have seen this 
feature used to e.g. tag reactive centers in a molecule when doing RDKit 
reaction-based enumeration).

>>> from rdkit import Chem
>>> molfile_with_values = "".join(open("C:/temp/cns-with-values.mol").readlines())
>>> print(molfile_with_values)

   -ISIS-  07041519212D

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0958   -2.68330. C   0  0  0  0  0  0  0  0  0  0  0 0
 0.8083   -2.27080. N   0  0  0  0  0  0  0  0  0  0  0 0
 1.5208   -2.67920. S   0  0  0  0  0  0  0  0  0  0  0 0
   1  2  1  0  0  0  0
   2  3  1  0  0  0  0
V1 Carbs
V3 Sulfuric
M  END

>>> m = Chem.MolFromMolBlock(molfile_with_values)
>>> m.GetAtoms()[0].GetProp('molFileValue')
'Carbs'
>>> m.GetAtoms()[1].GetProp('molFileValue')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'molFileValue'
>>> m.GetAtoms()[2].GetProp('molFileValue')
'Sulfuric'


As you can see, the V   lines in the molfile are put into RDKit atom 
molFileValue properties.

Unfortunately, the atom values are not written when RDKit outputs a molfile:

>>> print(Chem.MolToMolBlock(m))

  RDKit  2D

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0958   -2.68330. C   0  0  0  0  0  0  0  0  0  0  0 0
 0.8083   -2.27080. N   0  0  0  0  0  0  0  0  0  0  0 0
 1.5208   -2.67920. S   0  0  0  0  0  0  0  0  0  0  0 0
   1  2  1  0
   2  3  1  0
M  END



But, it is fairly easy to add them with this function:

def MolToMolBlock_WithAtomValues(mol):
    mol_block = Chem.MolToMolBlock(mol).split("\n")
    # Delete the M  END line (and the empty string after the final newline).
    mol_block = mol_block[:-2]
    # Add appropriate V lines.
    for atom in mol.GetAtoms():
        if atom.HasProp("molFileValue"):
            mol_block.append("V  %3d %s" % (atom.GetIdx() + 1,
                                            atom.GetProp("molFileValue")))

    mol_block.append("M  END")
    return "\n".join(mol_block)

This lets you persist atom text values. Disclaimer: I have no idea if 
this will break in the presence of other property lines, e.g. M CHG 
etc., but ... it's a start.
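
For cases where only the mol block text is at hand, the same idea can be sketched without RDKit by inserting the V lines directly before the final M  END line, so that existing property lines such as M  CHG are left untouched. This is a minimal sketch under assumptions: add_atom_values is a hypothetical helper, the sample block is hand-written for illustration, and whether every molfile reader accepts V lines placed after M lines has not been verified here.

```python
def add_atom_values(mol_block, atom_values):
    """Insert 'V' lines into a V2000 mol block just before 'M  END'.

    atom_values maps 1-based atom numbers to text values.
    """
    lines = mol_block.rstrip("\n").split("\n")
    if lines[-1].strip() != "M  END":
        raise ValueError("mol block does not end with 'M  END'")
    value_lines = ["V  %3d %s" % (idx, text)
                   for idx, text in sorted(atom_values.items())]
    # Everything before 'M  END' (including other property lines) is kept.
    return "\n".join(lines[:-1] + value_lines + ["M  END"]) + "\n"


# Hand-written single-atom block with a charge line, for illustration only.
sample = "\n".join([
    "",
    "  sketch",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 N   0  3  0  0  0  0  0  0  0  0  0  0",
    "M  CHG  1   1   1",
    "M  END",
]) + "\n"
print(add_atom_values(sample, {1: "charged N"}))
```

The V line uses the same "V  %3d %s" layout as the function above, so the output can be compared directly with what that function produces.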

As an example, let's first create the CNS molecule without atom values.

>>> m = Chem.MolFromSmiles("CNS")
>>> print(MolToMolBlock_WithAtomValues(m))

  RDKit

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0 0
 0.0.0. N   0  0  0  0  0  0  0  0  0  0  0 0
 0.0.0. S   0  0  0  0  0  0  0  0  0  0  0 0
   1  2  1  0
   2  3  1  0
M  END


Add two atom values and use the new function to persist the atom values 
to the V2000 molfile output:

>>> m.GetAtoms()[0].SetProp("molFileValue", "C-atom")
>>> m.GetAtoms()[2].SetProp("molFileValue", "Here is an S-atom")
>>> print(MolToMolBlock_WithAtomValues(m))

  RDKit

   3  2  0  0  0  0  0  0  0  0999 V2000
 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0 0
 0.0.0. N   0  0  0  0  0  0  0  0  0  0  0 0
 0.0.0. S   0  0  0  0  0  0  0  0  0  0  0 0
   1  2  1  0
   2  3  1  0
V1 C-atom
V3 Here is an S-atom
M  END


Check that the output can be read back in:

>>> molblock_test = MolToMolBlock_WithAtomValues(m)
>>> m_test = Chem.MolFromMolBlock(molblock_test)
>>> m_test.GetAtoms()[0].GetProp("molFileValue")
'C-atom'
>>> m_test.GetAtoms()[1].GetProp("molFileValue")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'molFileValue'
>>> m_test.GetAtoms()[2].GetProp("molFileValue")
'Here is an S-atom'


If you have multiple properties they would have to be encoded into the 
text value as e.g. key-value pairs. The text values are in principle 
limited to max. 70-80 characters (72 ?) by the MDL molfile 
specification, but RDKit probably accepts longer strings (I would guess, 
but have not tried).
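
One possible way to pack several properties into a single atom text value is sketched below. The "key=value;key=value" layout, the helper names encode_props and decode_props, and the 72-character limit are all assumptions made for illustration; the message above is itself unsure about the exact limit.

```python
MAX_VALUE_LENGTH = 72  # assumed limit, not taken from the specification


def encode_props(props):
    """Pack a dict of short string properties into one 'k=v;k=v' text value."""
    for key, value in props.items():
        if any(ch in key + value for ch in "=;\n"):
            raise ValueError("keys and values must not contain '=', ';' or newlines")
    text = ";".join("%s=%s" % (k, v) for k, v in sorted(props.items()))
    if len(text) > MAX_VALUE_LENGTH:
        raise ValueError("encoded value is %d characters long" % len(text))
    return text


def decode_props(text):
    """Inverse of encode_props."""
    if not text:
        return {}
    return dict(item.split("=", 1) for item in text.split(";"))


encoded = encode_props({"role": "reactive_center", "label": "a1"})
print(encoded)
print(decode_props(encoded))
```

The encoded string can then be stored with SetProp("molFileValue", ...) and unpacked again after reading the molfile back in.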

A more generic solution would be to map RDKit atom and bond properties 
to molfile S-group data - but that's a bit more involved and is not 
supported at the moment.

Cheers
-- Jan Holst Jensen

On 2015-07-04 18:28, Greg Landrum wrote:
 Hi,
 
  On Friday, July 3, 2015, Hitesh Patel hiteshpatel...@gmail.com
  mailto:hiteshpatel...@gmail.com wrote:
 
  Hi Greg,
 
  At first priority, I will use mol2 format. As shown in mol2 format
  explanation, we can set user specified atom attributes. I copied the
  text below for your convenience. See the bold text.
 
 
  The rdkit does not yet have a mol2 writer, so that isn't an option.
 
 
  For second priority, I can use  mol files. There I have to set
  Properties block:
 
  * |M ALS| - atom list and exclusive list * |M APO| - Rgroup
  attachment point * |M CHG| - charge * .
 
  But, I am not sure, whether the user defined property block is
  allowed or not.
 
 
  M CHG and M ALS are already used by the rdkit when atoms have charges

Re: [Rdkit-discuss] @= operator doesn't work as expected with the latest RDKit

2015-06-18 Thread Jan Holst Jensen

Hi Michael,

I don't know what's wrong, but I can confirm the behavior on an RDKit 
build based on 2013_09_2:


select id, molecule
  from parents$calc_props
 where molecule operator(rdkit.@>) 
'CC[C@H](C)C(=O)O[C@H]1C[C@@H](C)C=C2C=C[C@H](C)[C@H](CC[C@@H]3C[C@@H](O)CC(=O)O3)[C@H]21'

limit 10

id  molecule
56 
CC[C@H](C)C(=O)O[C@H]1C[C@@H](C)C=C2C=C[C@H](C)[C@H](CC[C@@H]3C[C@@H](O)CC(=O)O3)[C@H]21 




select id, molecule
  from parents$calc_props
 where molecule operator(rdkit.@) 
'CC[C@H](C)C(=O)O[C@H]1C[C@@H](C)C=C2C=C[C@H](C)[C@H](CC[C@@H]3C[C@@H](O)CC(=O)O3)[C@H]21'

limit 10

same result as above - record with ID = 56

select id, molecule
  from parents$calc_props
 where molecule operator(rdkit.@=) 
'CC[C@H](C)C(=O)O[C@H]1C[C@@H](C)C=C2C=C[C@H](C)[C@H](CC[C@@H]3C[C@@H](O)CC(=O)O3)[C@H]21'

limit 10

no results

The operator should work according to docs

http://www.rdkit.org/docs/Cartridge.html


   Substructure and exact structure search

 * @> : substructure search operator. Returns whether or not the mol or
   qmol on the right is a substructure of the mol on the left.
 * <@ : substructure search operator. Returns whether or not the mol or
   qmol on the left is a substructure of the mol on the right.
 * @= : returns whether or not two molecules are the same.


A workaround is this:

select id, molecule
  from parents$calc_props
 where molecule operator(rdkit.@>) 
'CC[C@H](C)C(=O)O[C@H]1C[C@@H](C)C=C2C=C[C@H](C)[C@H](CC[C@@H]3C[C@@H](O)CC(=O)O3)[C@H]21'
   and molecule operator(rdkit.<@) 
'CC[C@H](C)C(=O)O[C@H]1C[C@@H](C)C=C2C=C[C@H](C)[C@H](CC[C@@H]3C[C@@H](O)CC(=O)O3)[C@H]21'

limit 10

which returns the molecule as intended. It was suggested by Greg in an 
ancient post from 2010:


http://sourceforge.net/p/rdkit/mailman/message/26662092/

Cheers
-- Jan

On 2015-06-18 18:01, Michał Nowotka wrote:

Hi,

I'm working on mychembl_20 and I have latest stable RDKit version
(Release_2015_03_1) installed and compiled as postgres cartridge.

Some facts:

1. All tests went fine, including tests for postgres extension.
2. Substructure (@>) and similarity (tanimoto_sml) searches are working fine.

Now I'm trying to execute this SQL statement:

SELECT COUNT(*)
FROM mols_rdkit
WHERE (m@='C');

This runs without errors but returns 0. It returns 0 for any SMILES
string, even if I know that the structure exists in the database.

Table mols_rdkit exists, is not empty and has column m (as I said,
substructure and similarity are working fine).

Am I doing something wrong? Is there anything I can do to verify if
the problem is related to RDKit or something else?

Kind regards,

Michał Nowotka


--
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Jan Holst Jensen
Actually, you want to send your loving thoughts to MDL (now: Biovia). 
They defined the SDF format :-).


Cheers
-- Jan

On 2015-04-29 13:26, Nicholas Firth wrote:

Ahh ok… Interesting way to format a file! Got to love ChemAxon...

Best,
Nick

Nicholas C. Firth | PhD Student | Cancer Therapeutics
The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton | Surrey | SM2 5NG



On 29 Apr 2015, at 12:23, Paolo Tosco <paolo.to...@unito.it> wrote:


Hi Nick,

newlines in data properties are fine, but they should not include 
blank lines (i.e., multiple newlines).

For example, in:

>  <my_property1>
1

2

3

4

>  <my_property2>
1234

>  <my_property3>
5678

my_property1 will be truncated to just 1. Based on the 
specifications, if you want to include a blank line, it should 
actually be either a " " or a \t, rather than being completely blank.
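
To make the rule concrete, here is a minimal Python sketch of a data-item reader that follows it: a header line starts with ">", and the first completely blank line ends the item. It is a simplified illustration of the rule, not RDKit's actual parser, and parse_sd_data is a made-up name.

```python
import re


def parse_sd_data(lines):
    """Collect SD data items; a completely blank line ends the current item."""
    props = {}
    name = None  # name of the item currently being read, if any
    for line in lines:
        if line.startswith(">"):
            match = re.search(r"<([^>]+)>", line)
            name = match.group(1) if match else None
            if name is not None:
                props[name] = []
        elif line.strip("\r\n") == "":
            name = None  # blank line: the data item is finished
        elif name is not None:
            props[name].append(line.rstrip("\r\n"))
        # lines that follow a blank line without a new header are dropped
    return {key: "\n".join(value) for key, value in props.items()}


sd_data = """>  <my_property1>
1

2

>  <my_property2>
1234
""".splitlines()

print(parse_sd_data(sd_data))
```

In this sketch my_property1 comes back as just "1", while a line holding a single space is kept as data because it is not completely blank.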


Cheers,
Paolo

On 04/29/15 12:16, Nicholas Firth wrote:
I use SD files with new lines in the properties quite frequently 
(inherited from Pipeline Pilot's merge function) and I've never had a 
problem reading them. I've attached an SD file that works fine for me.


In [2]: suppl = Chem.SDMolSupplier('/Volumes/nfirth/tempf.sdf')

In [3]: m = suppl[0]

In [4]: t = m.GetProp('genNum')

In [5]: print t
1
2
3
4

In [6]: print t.split('\n')
['1', '2', '3', '4']


So I guess the problem is in the writer?

Best,
Nick




Re: [Rdkit-discuss] XAMPP server No module named RDkit

2015-03-05 Thread Jan Holst Jensen

 I don't have the Python and RDKit paths in my XAMPP environment variable path,
 but the Python and RDKit paths are set on my system (Windows Server 2003).
 Could you please suggest how to add the Python and RDKit paths to the XAMPP 
environment variables?


Hi Sujit,

An issue that could cause this behavior, is that services (and I assume 
that XAMPP is running as a service on your Windows system) do not 
inherit the environment of the process that starts the service. The only 
way to get services to see the new environment on a Windows 2003 server 
is to reboot the server.


Unless you can find a way to set the environment for the XAMPP service 
via a file as JP can on Linux, then you need to reboot the server. 
And... I guess you have done this, but don't forget to set the 
environment variables as SYSTEM variables so the service can see them.
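
One way to check what the service really sees is to have the web server launch a tiny Python script that reports its own environment. A rough sketch, assuming Python can be started at all from XAMPP (for example via your shell_exec call); describe_environment is a made-up name, and RDBASE is only listed because RDKit builds commonly set it.

```python
import os
import sys


def describe_environment(environ=None):
    """Return a text report of the settings that decide whether 'import rdkit' works."""
    if environ is None:
        environ = os.environ
    lines = ["python executable: %s" % sys.executable]
    for name in ("PATH", "PYTHONPATH", "RDBASE"):
        lines.append("%s = %s" % (name, environ.get(name, "(not set)")))
    lines.append("sys.path:")
    for entry in sys.path:
        lines.append("  %s" % entry)
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe_environment())
```

Run it once from a command prompt and once through the web server; if the two reports differ, the service has not picked up the new variables yet.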


Cheers
-- Jan Holst Jensen

On 2015-03-05 11:13, JP wrote:
I don't have access to a Windows OS to play with this.  Sorry I cannot 
be of more help.


On my system (Apache2 and Ubuntu) - you hardcode the env variables in 
a file called envvars in /etc/apache2.


You should be able to use the SetEnv directive in the apache conf 
file or in the sites conf files.
This is explained here: 
http://httpd.apache.org/docs/current/mod/mod_env.html


A really terrible way to do this would be directly in PHP before your 
shell_exec call (http://php.net/manual/en/function.putenv.php). This 
is bound to bite you later.




-
Jean-Paul Ebejer
Early Stage Researcher

On 5 March 2015 at 10:25, Sujit Tangadpalliwar 
<sujit.tangadpalli...@gmail.com> wrote:


Dear Ebejer,

Thanks for your reply.
I don't have the Python and RDKit paths in my XAMPP environment variable
path, but the Python and RDKit paths are set on my system (Windows Server 2003).
Could you please suggest how to add the Python and RDKit paths to the
XAMPP environment variables?

Thanks in advance

Regards
Sujit





Re: [Rdkit-discuss] Inchi installation in postgresql database driving me mad

2015-02-12 Thread Jan Holst Jensen

On 2015-02-12 17:50, JP wrote:

My Makefile now looks:


# -
# Variables used and default values:
USE_INCHI=1  # enables InChI functions; requires rdkit built with inchi support
# USE_AVALON=0 # enables avalon fingerprint; requires rdkit built with avalon support
USE_POPCOUNT=1   # enables use of the CPU's popcount instruction
# USE_THREADS=0# links against boost.system; required with non-ancient boost versions if inchi is enabled or the rdkit is built with threadsafe SSS
# STATIC_LINK=1# link against the static RDKit libraries
# 



Hi JP,

You were almost there. I had this problem too. The USE_INCHI line should 
read


USE_INCHI=1

and not

USE_INCHI=1  # enables InChI functions; requires rdkit built with inchi support


I guess the comment gets included into the USE_INCHI variable and then 
the check for USE_INCHI == 1 in the makefile fails.
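
As far as I know, GNU make drops the # comment itself but keeps the whitespace in front of it as part of the value, so the variable ends up as "1" followed by two spaces. The sketch below models only that one rule in Python. It is not a make parser, make_value is a made-up name, and I am assuming the makefile compares the variable against a bare 1.

```python
def make_value(line):
    """Very simplified model of how GNU make reads 'NAME=value  # comment'."""
    _, _, rest = line.partition("=")
    value = rest.split("#", 1)[0]   # the comment itself is dropped...
    return value.lstrip()           # ...but trailing whitespace is kept


with_comment = make_value("USE_INCHI=1  # enables InChI functions")
without_comment = make_value("USE_INCHI=1")

print(repr(with_comment))     # '1  '
print(repr(without_comment))  # '1'
print(with_comment == "1", without_comment == "1")  # False True
```

Either way the fix is the same: keep the assignment on a line of its own and put the comment on a separate line.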


Cheers
-- Jan


[Rdkit-discuss] Building RDKit with static runtime on Windows.

2015-02-06 Thread Jan Holst Jensen

Hi,

I have a couple of plugins (DLLs) that are based on RDKit. To minimize 
deployment issues I build the DLLs with /MT instead of the default /MD 
so the MSVC runtime libraries will be linked statically into the DLL. 
And so I won't have to worry about the presence or absence of MSVC 
runtime DLLs on the target system.


I had to hack a little to get RDKit to build with a static runtime and 
link it correctly to boost. I am fine with hacking it, as I might be the 
only one building it this way. But if there is already a way to make 
cmake build RDKit with a static runtime I'd love to hear about it.


RDKit source files in C:\RDKit_2014_09_1\. InChI sources downloaded. 
Using boost binaries from 
http://sourceforge.net/projects/boost/files/boost-binaries/. Doing a 
32-bit build.


   set RDBASE=C:\RDKit_2014_09_1
   set PATH=%PATH%;%RDBASE%\lib;c:\local\boost_1_57_0\lib32-msvc-10.0
   cd C:\RDKit_2014_09_1\build
   # Add to cmake command line: -DBoost_USE_STATIC_LIBS=ON and
   # -DBoost_USE_STATIC_RUNTIME=ON
   cmake -DBOOST_ROOT=C:/local/boost_1_57_0/ -DBoost_USE_STATIC_LIBS=ON -DBoost_USE_STATIC_RUNTIME=ON -DRDK_BUILD_PYTHON_WRAPPERS= -DRDK_BUILD_INCHI_SUPPORT=ON -G"Visual Studio 10" ..
   # Switch to a Cygwin prompt.
   cd /cygdrive/C/RDKit_2014_09_1/build/
   # Force compiler to compile with /MT instead of /MD in Release builds.
   find . -name "*.vcxproj" -type f -exec sed -i "s/<RuntimeLibrary>MultiThreadedDLL</<RuntimeLibrary>MultiThreaded</" '{}' \;
   # Remove the BOOST_ALL_DYN_LINK define which will otherwise cause a
   # link error in boost's auto_link.hpp.
   find . -name "*.vcxproj" -type f -exec sed -i "s/BOOST_ALL_DYN_LINK;//" '{}' \;

   # Switch back to DOS prompt to build.
   c:/Windows/Microsoft.NET/Framework64/v4.0.30319/MSBuild.exe /m:4 /p:Configuration=Release INSTALL.vcxproj
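
If you would rather not switch to a Cygwin prompt, the two find/sed edits could probably be done from Python instead. This is a sketch under that assumption: patch_vcxproj_files is a made-up helper, and unlike sed without the g flag it replaces every occurrence on a line, which should not matter for these two patterns.

```python
import os

REPLACEMENTS = [
    # /MD -> /MT: link the MSVC runtime statically
    ("<RuntimeLibrary>MultiThreadedDLL<", "<RuntimeLibrary>MultiThreaded<"),
    # avoid the link error in boost's auto_link.hpp
    ("BOOST_ALL_DYN_LINK;", ""),
]


def patch_vcxproj_files(build_dir):
    """Apply both textual edits to every .vcxproj below build_dir.

    Returns the list of files that were actually changed.
    """
    changed = []
    for root, _dirs, files in os.walk(build_dir):
        for filename in files:
            if not filename.endswith(".vcxproj"):
                continue
            path = os.path.join(root, filename)
            # newline="" keeps the original line endings untouched
            with open(path, "r", encoding="utf-8", newline="") as handle:
                original = handle.read()
            patched = original
            for old, new in REPLACEMENTS:
                patched = patched.replace(old, new)
            if patched != original:
                with open(path, "w", encoding="utf-8", newline="") as handle:
                    handle.write(patched)
                changed.append(path)
    return changed
```

Calling patch_vcxproj_files on the build directory after the cmake step and before MSBuild would take the place of the Cygwin step.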


Cheers
-- Jan Holst Jensen


Re: [Rdkit-discuss] portable PostgreSQL + RDKit cartridge?

2014-08-28 Thread Jan Holst Jensen
On 2014-08-28 14:34, Michal Krompiec wrote:
 Hello, has anybody tried to compile a portable Windows binary of
 PostgreSQL with RDKit cartridge? There is a portable PostgreSQL at
 http://sourceforge.net/projects/postgresqlportable/ and I wonder if it
 is possible to use it with the cartridge.
 Best regards,
 Michal


Hi Michal,

I got through building a Windows version of the RDKit cartridge a while 
back, but I didn't end up using it for real. I would think that the 
instructions still mostly apply:

http://sourceforge.net/p/rdkit/mailman/message/30127487/

If the resulting DLL and the extension control files are put in the 
right place in the portable image, I guess you should be able to get it 
working.

Cheers
-- Jan



Re: [Rdkit-discuss] cartridge build fail

2014-08-26 Thread Jan Holst Jensen

On 2014-08-26 17:34, Andrew Pannifer wrote:

Hi,
I have just installed rdkit on a RHEL machine (from source) together 
with postgresql 9.0 (from the repo on the website). rdkit is doing 
fine on the ctest results and modules import fine while postgresql is 
also OK.


In contrast, attempts to build the cartridge are not so good (results 
below). I wasn't sure if the devel package was in there so installed 
again via yum and no errors there but still get the same result. Any 
help here to solve this would be appreciated.


make && make install && make installcheck
make: pg_config: Command not found
/bin/sh: pg_config: command not found
make: Nothing to be done for `all'.
make: pg_config: Command not found
/bin/sh: pg_config: command not found
make: *** No rule to make target `install'.  Stop.

thanks,
Andrew


Hi Andrew,

Red Hat is a bit peculiar about how it packages PostgreSQL - well, at 
least it differs from Debian/Ubuntu. Even when you have installed the 
devel package, pg_config may not be on your standard PATH.


I am on CentOS 6.5 with PostgreSQL 9.3 at the moment and to build the 
RDKit cartridge I do this first:


export PATH=$PATH:/usr/pgsql-9.3/bin/

and then it builds for me. So something similar may apply to your setup ?

When you can run 'pg_config' from the command line before you build, you 
should be good to go.
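
If you prefer to script that check, something like the following would do. It is a small sketch using only the Python standard library; find_pg_config is a made-up name and /usr/pgsql-9.3/bin is just the location from my own setup.

```python
import os
import shutil


def find_pg_config(extra_dirs=()):
    """Return the full path of pg_config, or None if it cannot be found."""
    dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    dirs.extend(extra_dirs)
    return shutil.which("pg_config", path=os.pathsep.join(dirs))


if __name__ == "__main__":
    # Adjust the extra directory to match your PostgreSQL version.
    location = find_pg_config(extra_dirs=["/usr/pgsql-9.3/bin"])
    print(location or "pg_config not found; add its directory to PATH before running make")
```

If it only finds pg_config through the extra directory, that directory is the one to add to PATH before building.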


Cheers
-- Jan


Re: [Rdkit-discuss] getting stereo info from bonds

2014-08-22 Thread Jan Holst Jensen

On 2014-08-22 10:38, Michał Nowotka wrote:

A question I have is why you want to access the bond wedging.

This is very good question so I will begin with answering this. I'm
writing a module, which converts *mrv files to molfiles, both ways. In
my case, the original mrv file looks like this:

[...] and down so '1' and '6' ('W' and 'H' in marvin terms). So what about
other values which this field can have, If for example I have this
molfile:

[...] So 4 instead of 1, how I will get this information from RDKit?


Hi Michal,

If you don't already have it you should grab a copy of the molfile 
specification, ctfile.pdf - you can get it here:


https://community.accelrys.com/docs/DOC-3451

Valid values for bond type are:

   bond type

   1 = Single, 2 = Double,
   3 = Triple, 4 = Aromatic,
   5 = Single or Double,
   6 = Single or Aromatic,
   7 = Double or Aromatic, 8 = Any

   [Query] Values 4 through 8
   are for SSS queries only.


So a 4 is a query bond feature. I don't know the corresponding RDKit 
property, but I would expect it to be query-related.


Cheers
-- Jan


Re: [Rdkit-discuss] getting stereo info from bonds

2014-08-22 Thread Jan Holst Jensen
Hi Michal,

Apologies.

Cheers
-- Jan

On 2014-08-22 14:43, Michał Nowotka wrote:
 Hi Jan,

 First of all, please note, that in my message you are citing I'm
 referring to the same document you are suggesting me to read.
 Secondly, after reading my message carefully, you will probably find
 out that '4' value in my question refers to the 'bond stereo'
 property, not 'bond type'.

 On Fri, Aug 22, 2014 at 1:32 PM, Jan Holst Jensen j...@biochemfusion.com 
 wrote:
 On 2014-08-22 10:38, Michał Nowotka wrote:

 A question I have is why you want to access the bond wedging.

 This is very good question so I will begin with answering this. I'm
 writing a module, which converts *mrv files to molfiles, both ways. In
 my case, the original mrv file looks like this:

 [...] and down so '1' and '6' ('W' and 'H' in marvin terms). So what about
 other values which this field can have, If for example I have this
 molfile:

 [...] So 4 instead of 1, how I will get this information from RDKit?


 Hi Michal,

 If you don't already have it you should grab a copy of the molfile
 specification, ctfile.pdf - you can get it here:

 https://community.accelrys.com/docs/DOC-3451

 Valid values for bond type are:

 bond type

 1 = Single, 2 = Double,
 3 = Triple, 4 = Aromatic,
 5 = Single or Double,
 6 = Single or Aromatic,
 7 = Double or Aromatic, 8 = Any

 [Query] Values 4 through 8
 are for SSS queries only.


 So a 4 is a query bond feature. I don't know the corresponding RDKit
 property, but I would expect it to be query-related.

 Cheers
 -- Jan




Re: [Rdkit-discuss] getting stereo info from bonds

2014-08-22 Thread Jan Holst Jensen

On 2014-08-22 10:38, Michał Nowotka wrote:

A question I have is why you want to access the bond wedging.

[...] Now imagine I only have this molfile and I want to convert it back to
*mrv. I don't want to write my own parser for molfiles when I know
that RDKit can already parse it. But I need to extract this 'bond
stereo' information from within RDKit somehow.

Now when you say that this '1' or 'W' value corresponds to bond
direction, I'm guessing that 'direction' can store only two values: up
and down so '1' and '6' ('W' and 'H' in marvin terms). So what about
other values which this field can have, If for example I have this
molfile:



 10 10  0  0  0  0  0  0  0  0999 V2000
   -1.6741   -0.2687    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.3885   -0.6812    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.3885   -1.5063    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.6741   -1.9188    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.9596   -1.5063    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.9596   -0.6812    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2451   -0.2686    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2451    0.5563    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.4692   -0.6811    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4692   -1.5061    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
  6  7  1  0  0  0  0
  7  9  1  0  0  0  0
  9 10  1  0  0  0  0
  7  8  1  4  0  0  0
M  END

So 4 instead of 1, how I will get this information from RDKit?




Hi Michal,

I took a look at the C++ code in GraphMol/FileParsers/MolFileParser.cpp.

ParseMolFileBondLine() for parsing V2000 molfiles sets the BondDir to 
UNKNOWN (case 4, bond stereo type = 4):


  stereo = FileParserUtils::toInt(text.substr(9,3));
  switch(stereo){
  case 0:
    res->setBondDir(Bond::NONE);
    break;
  case 1:
    res->setBondDir(Bond::BEGINWEDGE);
    break;
  case 6:
    res->setBondDir(Bond::BEGINDASH);
    break;
  case 3: // either double bond
    res->setBondDir(Bond::EITHERDOUBLE);
    res->setStereo(Bond::STEREOANY);
    break;
  case 4: // either single bond
    res->setBondDir(Bond::UNKNOWN);
    break;
  }

In ParseV3000BondBlock() for V3000 molfiles the same thing happens, so 
they agree (case 2, CFG=2, bond type = single (1)):


  if(prop=="CFG"){
    unsigned int cfg=atoi(val.c_str());
    switch(cfg){
    case 0: break;
    case 1:
      bond->setBondDir(Bond::BEGINWEDGE);
      chiralityPossible=true;
      break;
    case 2:
      if(bType==1) bond->setBondDir(Bond::UNKNOWN);
      else if(bType==2){
        bond->setBondDir(Bond::EITHERDOUBLE);
        bond->setStereo(Bond::STEREOANY);
      }
      break;
    case 3:
      bond->setBondDir(Bond::BEGINDASH);
      chiralityPossible=true;
      break;
    default:
      errout << "bad bond CFG " << val << "' on line " << line;
      throw FileParseException(errout.str()) ;
    }
  } else if(prop=="TOPO"){

The bonds will therefore be assigned a BondDir value of Bond::UNKNOWN 
for single either bonds and Bond::EITHERDOUBLE for double either bonds.
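
For a converter it may help to have the two mappings side by side. Here is a plain-Python restatement of the two switch statements above. It is my own sketch, not RDKit code: the function names are made up, and returning "NONE" for the cases the C++ leaves untouched is an assumption about the default.

```python
# V2000 'bond stereo' code -> BondDir name, following ParseMolFileBondLine()
V2000_STEREO_TO_BONDDIR = {
    0: "NONE",
    1: "BEGINWEDGE",
    6: "BEGINDASH",
    3: "EITHERDOUBLE",  # "either" double bond (the parser also sets STEREOANY)
    4: "UNKNOWN",       # "either" single bond
}


def v2000_bond_stereo(bond_line):
    """Return the integer bond stereo field, i.e. text.substr(9,3) in the C++."""
    return int(bond_line[9:12])


def v3000_cfg_to_bonddir(cfg, bond_type):
    """BondDir name for a V3000 CFG value, following ParseV3000BondBlock()."""
    if cfg == 0:
        return "NONE"
    if cfg == 1:
        return "BEGINWEDGE"
    if cfg == 2:
        if bond_type == 1:
            return "UNKNOWN"
        if bond_type == 2:
            return "EITHERDOUBLE"
        return "NONE"  # assumption: other bond types are left at the default
    if cfg == 3:
        return "BEGINDASH"
    raise ValueError("bad bond CFG %r" % (cfg,))


line = "  7  8  1  4  0  0  0"  # the last bond of the molfile in question
print(V2000_STEREO_TO_BONDDIR[v2000_bond_stereo(line)])  # UNKNOWN
```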


I read in a V2000 molfile where the second bond is a single either bond 
(stereo bond type of 4) and the third bond is a double either bond 
(stereo bond type of 3).


   >>> from rdkit import Chem
   >>> m = Chem.MolFromMolFile("C:/temp/either.mol", sanitize=False,
   ...                         removeHs=False)
   >>> for b in m.GetBonds(): print b.GetBondDir()
   ...
   NONE
   UNKNOWN
   5
   NONE
   NONE
   NONE



Only slight surprise is that Python returns a 5 instead of an 
EITHERDOUBLE string.


   >>> Chem.rdchem.BondDir.values
   {0: rdkit.Chem.rdchem.BondDir.NONE,
    1: rdkit.Chem.rdchem.BondDir.BEGINWEDGE,
    2: rdkit.Chem.rdchem.BondDir.BEGINDASH,
    3: rdkit.Chem.rdchem.BondDir.ENDDOWNRIGHT,
    4: rdkit.Chem.rdchem.BondDir.ENDUPRIGHT,
    6: rdkit.Chem.rdchem.BondDir.UNKNOWN}



For some reason Python does not map the BondDir value 5 to a name. But 
the value does match EITHERDOUBLE's implicit ordinal value defined in 
GraphMol/Bond.h, so it matches what I expect from reading the parser code:


//! the bond's direction (for chirality)
typedef enum {
  NONE=0,         //!< no special style
  BEGINWEDGE,     //!< wedged: narrow at begin
  BEGINDASH,      //!< dashed: narrow at begin
  // FIX: this may not really be adequate
  ENDDOWNRIGHT,   //!< for cis/trans
  ENDUPRIGHT,     //!<  ditto
  EITHERDOUBLE,   //!< a crossed double bond
  UNKNOWN,        //!< intentionally unspecified stereochemistry
} BondDir;
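
Until the Python wrapper knows the name, a bare 5 can be translated on the Python side. The sketch below rebuilds the implicit C++ numbering with the standard enum module; the local BondDir here is my own mirror of the header, not the rdkit class, and bond_dir_name is a made-up helper.

```python
from enum import IntEnum

# Same order as the C++ enum; implicit numbering starts at NONE=0.
_NAMES = ["NONE", "BEGINWEDGE", "BEGINDASH", "ENDDOWNRIGHT",
          "ENDUPRIGHT", "EITHERDOUBLE", "UNKNOWN"]
BondDir = IntEnum("BondDir", _NAMES, start=0)


def bond_dir_name(value):
    """Translate a raw value such as the bare 5 printed above into its enum name."""
    return BondDir(int(value)).name


print(bond_dir_name(5))  # EITHERDOUBLE
print(bond_dir_name(6))  # UNKNOWN
```

I would expect int(b.GetBondDir()) to give a value you can feed into it, but I have not checked that on every RDKit version.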

So the information is 

Re: [Rdkit-discuss] RDKit database cartridge question

2014-07-29 Thread Jan Holst Jensen

On 2014-07-29 12:17, acanada wrote:

Hello,

I have a postgres table with compound names and other info. I also want to save 
.mol or .sdf information associated with these compounds. I'm getting the 
structure information from the ChEBI web service and I assume that I have to 
save this info in some way in order to enable substructure search and 
other search options.

I have searched the RDKit database cartridge documentation but I cannot find an 
explanation for my purposes. I'm looking for a type for saving structures, and a way to 
tell the table some options (tautomers, repeated molecules, etc.) when saving the data.

Can anybody tell me where to find information, a how-to or a tutorial for what I'm 
trying to do?
My apologies if this is too simple a question.

Thank you very much,
Andrés


Hi Andrés,

The type you are looking for is mol - an RDKit molecule type column 
which can be indexed for substructure searches.


To convert a molfile to the mol type, use the cartridge 
mol_from_ctab() function.


So e.g. create a table like this:

   create table mols (
  id serial,
  molecule mol
   );


and to insert molecules into it:

   insert into mols (molecule) values ('CCN');


SMILES text will be automatically converted to mol type. Assuming that 
you have a variable molfile that holds a molfile as text do this:


   insert into mols (molecule) values (mol_from_ctab(molfile::cstring));


Note that the molfile string has to be cast to cstring. You probably 
also want to register it like this:


   insert into mols (molecule) values (mol_from_ctab(molfile::cstring,
   true));


The additional boolean parameter controls whether the coordinates of the 
original molfile are retained in the database. The default is false 
meaning that molfile coordinates are not retained. If you want to be 
able to export the structures as-they-were you want to set the parameter 
to true.


To get back the molfiles, use the mol_to_ctab() function:

   select id, mol_to_ctab(molecule) from mols;
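
If you load the ChEBI molfiles from Python, the same statements could be issued through any DB-API cursor. A rough sketch, assuming a psycopg2-style driver (hence the %s placeholders) and the mols table from above; store_molfile and fetch_molfiles are made-up names.

```python
INSERT_MOLFILE = "insert into mols (molecule) values (mol_from_ctab(%s::cstring, %s))"
SELECT_MOLFILES = "select id, mol_to_ctab(molecule) from mols"


def store_molfile(cursor, molfile_text, keep_coordinates=True):
    """Insert one molfile; keep_coordinates=True retains the original layout."""
    cursor.execute(INSERT_MOLFILE, (molfile_text, keep_coordinates))


def fetch_molfiles(cursor):
    """Return (id, molfile text) rows for everything in the table."""
    cursor.execute(SELECT_MOLFILES)
    return cursor.fetchall()
```

Passing the molfile as a parameter instead of pasting it into the SQL text also spares you from escaping the quotes and newlines yourself.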


Hope this gets you started.

Cheers
-- Jan


Re: [Rdkit-discuss] Fw: how can I test if a molecule already has 2D coordinates?

2014-05-07 Thread Jan Holst Jensen

On 2014-05-07 17:26, chemis...@gmx.de wrote:

Sent: Wednesday, 07 May 2014 at 15:56
From: chemis...@gmx.de
To: rdkit-discuss@lists.sourceforge.net
Subject: how can I test if a molecule already has 2D coordinates?
for a given molecule, how can I test if it already has 2D coordinates?
I would like to calculate these coordinates only for molecules that do 
not yet have them.

Many thanks in advance.
Kind regards,
Axel
I think I found a way: if I call
mol.GetConformer()
I get a ValueError, if no 2D coordinates are present. This error I can 
catch.

Kind regards,
Axel


Hi Axel,

And even more straightforward: mol.GetNumConformers().

>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> mol = Chem.MolFromSmiles('CCF')
>>> mol.GetNumConformers()
0
>>> AllChem.Compute2DCoords(mol)
0
>>> mol.GetNumConformers()
1
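
Wrapped up as a helper, that check could look something like this. It is a sketch: ensure_2d_coords is a made-up name, and it treats any existing conformer (2D or 3D) as "already has coordinates".

```python
def ensure_2d_coords(mol, compute=None):
    """Compute 2D coordinates only if mol has no conformer yet.

    Returns True if coordinates were computed, False if mol already had some.
    """
    if mol.GetNumConformers() > 0:
        return False
    if compute is None:
        # Imported lazily so the helper can be tried out without RDKit installed.
        from rdkit.Chem import AllChem
        compute = AllChem.Compute2DCoords
    compute(mol)
    return True
```

In normal use you would call ensure_2d_coords(mol) on each molecule; the compute argument is only there so a different coordinate generator can be swapped in.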


Cheers
-- Jan


[Rdkit-discuss] ReducedGraphs - now, where is this code ?

2014-04-24 Thread Jan Holst Jensen
Hi Rdkitters,

A colleague of mine has stumbled upon this page:

http://www.rdkit.org/Python_Docs/rdkit.Chem.rdReducedGraphs-module.html

And so he asked me how to use it, because he couldn't find it. Neither 
can I, not even in the latest git version. Is this a special secret 
feature :-) ?

Cheers
-- Jan



Re: [Rdkit-discuss] Inconsistency between Cartridge and Python

2014-04-23 Thread Jan Holst Jensen

On 2014-04-23 16:01, Daniel Moser wrote:


Hi all,

I think I found some inconsistent behaviour between the cartridge and 
python. At least for NCO the cartridge function mol_hbd returns a 
different value than Descriptors.NumHDonors.


[...]

I'm seeing this behaviour both with the 2013.09.2 as well as the 
2012.12.1 cartridge version. Can anyone confirm this?




I can confirm this difference between Python and the cartridge using a 
fairly recent git version.


The cartridge mol_hbd() function calls calcLipinskiHBD:

MOLDESCR(HBD,RDKit::Descriptors::calcLipinskiHBD,int)

which is

unsigned int calcLipinskiHBD(const ROMol &mol){
  unsigned int res=0;
  for(ROMol::ConstAtomIterator iter=mol.beginAtoms();
      iter!=mol.endAtoms();++iter){
    if( ((*iter)->getAtomicNum()==7 || (*iter)->getAtomicNum()==8) ) {
      res += (*iter)->getTotalNumHs(true);
    }
  }
  return res;
}

while the Python NumHDonors() is using a SMARTS-based approach as far as 
I can tell.


The function that the cartridge uses can be accessed via AllChem and 
then I can also demonstrate the difference in Python:


>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('NCO')
>>> from rdkit.Chem import Descriptors
>>> print Descriptors.NumHDonors(m)
2
>>> from rdkit.Chem import AllChem
>>> print AllChem.CalcNumLipinskiHBD(m)
3
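
The arithmetic behind the two numbers can be shown without RDKit. In the sketch below NCO is written out by hand as (element, attached hydrogens) pairs; lipinski_hbd restates the C++ loop above, while donor_atom_count is only my simplified guess at what an atom-based SMARTS count amounts to, not the actual pattern used by NumHDonors.

```python
# NCO is H2N-CH2-OH: each atom as (element symbol, number of attached hydrogens).
NCO_ATOMS = [("N", 2), ("C", 2), ("O", 1)]


def lipinski_hbd(atoms):
    """Restates calcLipinskiHBD: sum the hydrogens on every N and O."""
    return sum(h_count for symbol, h_count in atoms if symbol in ("N", "O"))


def donor_atom_count(atoms):
    """Count N and O atoms carrying at least one hydrogen (one hit per atom)."""
    return sum(1 for symbol, h_count in atoms
               if symbol in ("N", "O") and h_count > 0)


print(lipinski_hbd(NCO_ATOMS))      # 3
print(donor_atom_count(NCO_ATOMS))  # 2
```

So the difference comes down to counting N-H and O-H hydrogens (3) versus counting the atoms that carry them (2).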


Cheers
-- Jan


Re: [Rdkit-discuss] rdkit RPM for CentOS 5.3 ?

2014-03-21 Thread Jan Holst Jensen
I have added a comment to the Wiki that describes how I built RDKit on a 
CentOS 5.3 server, in case others also suddenly find themselves on an 
old OS.


https://code.google.com/p/rdkit/wiki/BuildingWithCmake

Thanks again to Gianluca and Greg - the EPEL repository does save the day.

Cheers
-- Jan

On 2014-02-26 11:26, Jan Holst Jensen wrote:

On 2014-02-26 11:04, Greg Landrum wrote:



On Wed, Feb 26, 2014 at 10:56 AM, Gianluca Sforna <gia...@gmail.com> wrote:


On Tue, Feb 25, 2014 at 9:40 PM, Jan Holst Jensen
<j...@biochemfusion.com> wrote:

 I am testing if rdkit can be used from Oracle on a customer (test!)
 database. And said database runs on a CentOS 5.3 server - no OS upgrade
 in the near future. First step is to get rdkit working in Python 2.4 on
 that server.

For a start, you can make your life easier by adding the EPEL
repository and pulling python26 package from there.


Yeah, I am pretty sure that there is python code that is not going to 
work with 2.4


IIRC the only real blocker that prevented me from building RDKit
against CentOS5 + EPEL was flex


Fortunately, flex is no longer required, so that one is gone.

-greg


Thanks, Gianluca and Greg. The EPEL repository looks like it could 
save the day, giving me a tool chain so I can build rdkit. I'll give 
it a try!


Cheers
-- Jan




Re: [Rdkit-discuss] Three tests failing on CentOS 5.3 - important ?

2014-03-19 Thread Jan Holst Jensen

On 2014-03-19 05:54, Greg Landrum wrote:


On Tue, Mar 18, 2014 at 4:59 PM, Jan Holst Jensen 
<j...@biochemfusion.com> wrote:


Hi RDKitters,

I managed to get RDKit 2013_09_2 built on CentOS 5.3. Will post a
short recipe later.


Wow; that's an ancient version.


Yup. Approaching archaeology-region here.



Right now, I am still left with three tests that fail, but I think
(hope) that I can live with that ? Failing tests are:

72:pythonTestDbCLI
73:pythonTestDirML
78:pythonTestDirChem

The test log shows that test 72 won't run because of missing
SQLite support.

72/78 Testing: pythonTestDbCLI
72/78 Test: pythonTestDbCLI
Command: /usr/bin/python26
/u01/software/RDKit_2013_09_2/Projects/test_list.py
--testDir /u01/software/RDKit_2013_09_2/Projects
Directory: /u01/software/RDKit_2013_09_2/build/Projects
pythonTestDbCLI start time: Mar 05 11:07 CET
Output:
--
Traceback (most recent call last):
  File "TestDbCLI.py", line 9, in ?
    from rdkit.Dbase.DbConnection import DbConnect
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbConnection.py", line 21, in ?
    from rdkit.Dbase import DbUtils,DbInfo
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbUtils.py", line 17, in ?
    from rdkit.Dbase.DbResultSet import DbResultSet,RandomAccessDbResultSet
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbResultSet.py", line 12, in ?
    from rdkit.Dbase import DbInfo
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbInfo.py", line 12, in ?
    import DbModule
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbModule.py", line 61, in ?
    raise ImportError,"Neither sqlite nor PgSQL support found."
ImportError: Neither sqlite nor PgSQL support found.


A bit puzzling, since I can import sqlite3 just fine from Python
when run interactively. As far as I can understand RDConfig.py a
successful import of sqlite3 should make it report that SQLite
support is available ? For my purposes, failing this test is
probably fine - I don't expect I need sqlite support on this machine.


It's probably not important unless you are planning on using the DbCLI 
code.
If you want to try and track it down: can you do from rdkit.Dbase 
import DbModule?


Goes just fine interactively.

   [oracle@localhost ~]$ python26
   Python 2.6.8 (unknown, Nov  7 2012, 14:47:45)
   [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
   Type "help", "copyright", "credits" or "license" for more information.
   >>> from rdkit.Dbase import DbModule
   >>> quit()
   [oracle@localhost ~]$

Well, let's leave it at that. Since I am not planning to use the DbCLI 
code at the moment I am OK with it.




Test 73 has this in the log:

Traceback (most recent call last):
  File "UnitTestBuildComposite.py", line 16, in ?
    from rdkit.ML import BuildComposite
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/BuildComposite.py", line 203, in ?
    from rdkit.ML.Composite import Composite,BayesComposite
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/Composite/Composite.py", line 25, in ?
    from rdkit.ML.Data import DataUtils
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/DataUtils.py", line 57, in ?
    from rdkit.ML.Data import MLData
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/MLData.py", line 8, in ?
    import numpy
ImportError: No module named numpy
...

and test 78:

Output:
--
  File "UnitTestInchi.py", line 187
    except InchiReadWriteError as inst:
                                ^
SyntaxError: invalid syntax
  File "PandasTools.py", line 100
    except Exception as e:
                      ^
SyntaxError: invalid syntax
Traceback (most recent call last):
  File "UnitTestEState.py", line 17, in ?
    import numpy
ImportError: No module named numpy
Traceback (most recent call last):
  File "UnitTestFingerprints.py", line 17, in ?
    import numpy
ImportError: No module named numpy
...


The numpy module loads fine when run interactively. So maybe it is
something else that is wrong - just that the error reported from
Python is a bit misleading (?).


That's a strange one, but if you can import numpy and rdkit.Chem, then 
I wouldn't be concerned about it.
Again, if you're interested in trying to track it down, there are some 
experiments we can do.



I haven't run into stuff that doesn't

Re: [Rdkit-discuss] Three tests failing on CentOS 5.3 - important ?

2014-03-19 Thread Jan Holst Jensen

Hi Markus,

It is Python 2.6.8. Old, but apparently not *that* old :-) :

   [oracle@localhost ~]$ cat test.py
   try:
       raise Exception, "Hello, I am an error"
   except Exception as e:
       print "An exception was raised: " + str(e) + "."
   [oracle@localhost ~]$ python26 test.py
   An exception was raised: Hello, I am an error.
   [oracle@localhost ~]$


Cheers
-- Jan

On 2014-03-19 09:35, Markus Sitzmann wrote:

I think the syntax "except Exception as e:" didn't exist before Python
2.6 ... are you running this on an older version? :-)

Cheers,
Markus

On Wed, Mar 19, 2014 at 7:54 AM, Jan Holst Jensen j...@biochemfusion.com 
wrote:

On 2014-03-19 05:54, Greg Landrum wrote:


On Tue, Mar 18, 2014 at 4:59 PM, Jan Holst Jensen j...@biochemfusion.com
wrote:

Hi RDKitters,

I managed to get RDKit 2013_09_2 built on CentOS 5.3. Will post a short
recipe later.


Wow; that's an ancient version.


Yup. Approaching archaeology-region here.




Right now, I am still left with three tests that fail, but I think (hope)
that I can live with that ? Failing tests are:

 72:pythonTestDbCLI
 73:pythonTestDirML
 78:pythonTestDirChem

The test log shows that test 72 won't run because of missing SQLite
support.

72/78 Testing: pythonTestDbCLI
72/78 Test: pythonTestDbCLI
Command: /usr/bin/python26
/u01/software/RDKit_2013_09_2/Projects/test_list.py --testDir
/u01/software/RDKit_2013_09_2/Projects
Directory: /u01/software/RDKit_2013_09_2/build/Projects
pythonTestDbCLI start time: Mar 05 11:07 CET
Output:
--
Traceback (most recent call last):
  File "TestDbCLI.py", line 9, in ?
    from rdkit.Dbase.DbConnection import DbConnect
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbConnection.py", line 21, in ?
    from rdkit.Dbase import DbUtils,DbInfo
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbUtils.py", line 17, in ?
    from rdkit.Dbase.DbResultSet import DbResultSet,RandomAccessDbResultSet
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbResultSet.py", line 12, in ?
    from rdkit.Dbase import DbInfo
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbInfo.py", line 12, in ?
    import DbModule
  File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbModule.py", line 61, in ?
    raise ImportError,"Neither sqlite nor PgSQL support found."
ImportError: Neither sqlite nor PgSQL support found.


A bit puzzling, since I can import sqlite3 just fine from Python when
run interactively. As far as I can understand RDConfig.py, a successful
import of sqlite3 should make it report that SQLite support is available ?
For my purposes, failing this test is probably fine - I don't expect I need
sqlite support on this machine.


It's probably not important unless you are planning on using the DbCLI code.
If you want to try and track it down: can you do from rdkit.Dbase import
DbModule?


Goes just fine interactively.

[oracle@localhost ~]$ python26
Python 2.6.8 (unknown, Nov  7 2012, 14:47:45)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit.Dbase import DbModule
>>> quit()
[oracle@localhost ~]$

Well, let's leave it at that. Since I am not planning to use the DbCLI code
at the moment I am OK with it.





Test 73 has this in the log:

Traceback (most recent call last):
  File "UnitTestBuildComposite.py", line 16, in ?
    from rdkit.ML import BuildComposite
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/BuildComposite.py", line 203, in ?
    from rdkit.ML.Composite import Composite,BayesComposite
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/Composite/Composite.py", line 25, in ?
    from rdkit.ML.Data import DataUtils
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/DataUtils.py", line 57, in ?
    from rdkit.ML.Data import MLData
  File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/MLData.py", line 8, in ?
    import numpy
ImportError: No module named numpy
...

and test 78:

Output:
--
  File "UnitTestInchi.py", line 187
    except InchiReadWriteError as inst:
                                ^
SyntaxError: invalid syntax
  File "PandasTools.py", line 100
    except Exception as e:
                      ^
SyntaxError: invalid syntax
Traceback (most recent call last):
  File "UnitTestEState.py", line 17, in ?
    import numpy
ImportError: No module named numpy
Traceback (most recent call last):
  File "UnitTestFingerprints.py", line 17, in ?
    import numpy
ImportError: No module named numpy
...


The numpy module loads fine when run interactively. So maybe it is
something else that is wrong - just that the error reported from Python is a
bit misleading (?).


That's a strange one, but if you can import numpy and rdkit.Chem, then I
wouldn't be concerned about it.
Again, if you're interested in trying to track it down, there are some
experiments we can do.



I haven't run into stuff that doesn't work yet because of these test

[Rdkit-discuss] Three tests failing on CentOS 5.3 - important ?

2014-03-18 Thread Jan Holst Jensen

Hi RDKitters,

I managed to get RDKit 2013_09_2 built on CentOS 5.3. Will post a short 
recipe later.


Right now, I am still left with three tests that fail, but I think 
(hope) that I can live with that ? Failing tests are:


72:pythonTestDbCLI
73:pythonTestDirML
78:pythonTestDirChem

The test log shows that test 72 won't run because of missing SQLite support.

   72/78 Testing: pythonTestDbCLI
   72/78 Test: pythonTestDbCLI
   Command: /usr/bin/python26
   /u01/software/RDKit_2013_09_2/Projects/test_list.py --testDir
   /u01/software/RDKit_2013_09_2/Projects
   Directory: /u01/software/RDKit_2013_09_2/build/Projects
   pythonTestDbCLI start time: Mar 05 11:07 CET
   Output:
   --
   Traceback (most recent call last):
     File "TestDbCLI.py", line 9, in ?
       from rdkit.Dbase.DbConnection import DbConnect
     File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbConnection.py", line 21, in ?
       from rdkit.Dbase import DbUtils,DbInfo
     File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbUtils.py", line 17, in ?
       from rdkit.Dbase.DbResultSet import DbResultSet,RandomAccessDbResultSet
     File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbResultSet.py", line 12, in ?
       from rdkit.Dbase import DbInfo
     File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbInfo.py", line 12, in ?
       import DbModule
     File "/u01/software/RDKit_2013_09_2/rdkit/Dbase/DbModule.py", line 61, in ?
       raise ImportError,"Neither sqlite nor PgSQL support found."
   ImportError: Neither sqlite nor PgSQL support found.


A bit puzzling, since I can import sqlite3 just fine from Python when 
run interactively. As far as I can understand RDConfig.py, a successful 
import of sqlite3 should make it report that SQLite support is available 
? For my purposes, failing this test is probably fine - I don't expect I 
need sqlite support on this machine.


Test 73 has this in the log:

   Traceback (most recent call last):
     File "UnitTestBuildComposite.py", line 16, in ?
       from rdkit.ML import BuildComposite
     File "/u01/software/RDKit_2013_09_2/rdkit/ML/BuildComposite.py", line 203, in ?
       from rdkit.ML.Composite import Composite,BayesComposite
     File "/u01/software/RDKit_2013_09_2/rdkit/ML/Composite/Composite.py", line 25, in ?
       from rdkit.ML.Data import DataUtils
     File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/DataUtils.py", line 57, in ?
       from rdkit.ML.Data import MLData
     File "/u01/software/RDKit_2013_09_2/rdkit/ML/Data/MLData.py", line 8, in ?
       import numpy
   ImportError: No module named numpy
   ...

and test 78:

   Output:
   --
     File "UnitTestInchi.py", line 187
       except InchiReadWriteError as inst:
                                   ^
   SyntaxError: invalid syntax
     File "PandasTools.py", line 100
       except Exception as e:
                         ^
   SyntaxError: invalid syntax
   Traceback (most recent call last):
     File "UnitTestEState.py", line 17, in ?
       import numpy
   ImportError: No module named numpy
   Traceback (most recent call last):
     File "UnitTestFingerprints.py", line 17, in ?
       import numpy
   ImportError: No module named numpy
   ...


The numpy module loads fine when run interactively. So maybe it is 
something else that is wrong - just that the error reported from Python 
is a bit misleading (?).
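
One possibility worth ruling out, offered as a guess rather than a diagnosis: the test driver may start the individual test scripts with a different python than the /usr/bin/python26 used interactively. The "in ?" frames in the tracebacks above are, as far as I know, what Python 2.4 and older print where later versions print "in <module>", and "except ... as" only parses from Python 2.6 on. A minimal sketch of a check script (the helper names are made up for illustration) that reports which interpreter is actually running it:

```python
import sys


def describe_interpreter():
    """Return the path and (major, minor, micro) version of the running interpreter."""
    return sys.executable, tuple(sys.version_info[:3])


def supports_except_as(version):
    """'except X as e' is accepted by the parser from Python 2.6 onwards."""
    return tuple(version[:2]) >= (2, 6)


if __name__ == "__main__":
    exe, version = describe_interpreter()
    # Single-argument print calls keep this runnable on old Python 2 as well.
    print("interpreter: %s" % exe)
    print("version: %d.%d.%d" % version)
    print("except-as supported: %s" % supports_except_as(version))
```

Running it once interactively and once through whatever command the test harness uses to start a test script would show whether the two agree.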


I haven't run into stuff that doesn't work yet because of these test 
failures, so I think that I can get by without them passing. But it 
would be nice to know if it is potentially critical or not.


Cheers
-- Jan
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Pg cartridge - mol_to_ctab() and trouble with conformers.

2014-03-05 Thread Jan Holst Jensen

Hi,

About ready to push a changeset for implementing mol_to_ctab(), but I 
would like it to play nice and preserve input depictions.


Ideally I would like the following

select mol_to_ctab(mol_from_ctab(input-molfile));

to output a molfile where the coordinates of input-molfile are preserved.

If I do that in Python it works:

   >>> from rdkit import Chem
   >>> m = Chem.MolFromMolBlock("""chiral1.mol
   ...   ChemDraw04200416412D
   ...
   ...   5  4  0  0  0  0  0  0  0  0999 V2000
   ...    -0.0141    0.0553    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   ...     0.8109    0.0553    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
   ...    -0.4266    0.7697    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0
   ...    -0.0141   -0.7697    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
   ...    -0.8109   -0.1583    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   ...   1  2  1  0
   ...   1  3  1  0
   ...   1  4  1  1
   ...   1  5  1  0
   ... M  END""")
   >>> m
   <rdkit.Chem.rdchem.Mol object at 0x1240980>
   >>> m.GetNumConformers()
   1
   >>> Chem.MolToMolBlock(m)
   'chiral1.mol\n     RDKit          2D\n\n  5  4  0  0  0  0  0  0  0  0999 V2000\n   -0.0141    0.0553    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n    0.8109    0.0553    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0\n   -0.4266    0.7697    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0\n   -0.0141   -0.7697    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0\n   -0.8109   -0.1583    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n  1  2  1  6\n  1  3  1  0\n  1  4  1  0\n  1  5  1  0\nM  END\n'
   >>> quit()


In the PG cartridge I lose the conformer of the input. My implementation 
looks like this:


rdkit_io.c:

   PG_FUNCTION_INFO_V1(mol_to_ctab);
   Datum   mol_to_ctab(PG_FUNCTION_ARGS);
   Datum
   mol_to_ctab(PG_FUNCTION_ARGS) {
     CROMol  mol;
     char    *str;
     int     len;

     fcinfo->flinfo->fn_extra = SearchMolCache(
                                   fcinfo->flinfo->fn_extra,
                                   fcinfo->flinfo->fn_mcxt,
                                   PG_GETARG_DATUM(0),
                                   NULL, &mol, NULL);

     bool createDepictionIfMissing = PG_GETARG_BOOL(1);
     str = makeCtabText(mol, &len, createDepictionIfMissing);

     PG_RETURN_CSTRING( pnstrdup(str, len) );
   }


adapter.cpp:

   extern "C" char *
   makeCtabText(CROMol data, int *len, bool createDepictionIfMissing) {
     ROMol *mol = (ROMol*)data;

     try {
       ereport(NOTICE,
               (errcode(ERRCODE_SUCCESSFUL_COMPLETION),
                errmsg("mol conformer count = %d",
                       mol->getNumConformers())));

       if (createDepictionIfMissing && mol->getNumConformers() == 0) {
         RDDepict::compute2DCoords(*mol);
       }
       StringData = MolToMolBlock(*mol);
     } catch (...) {
       ereport(WARNING,
               (errcode(ERRCODE_WARNING),
                errmsg("makeCtabText: problems converting molecule to CTAB")));
       StringData="";
     }

     *len = StringData.size();
     return (char*)StringData.c_str();
   }


If I run the Python example equivalent from psql:

   postgres=# select mol_to_ctab(mol_from_ctab('chiral1.mol
     ChemDraw04200416412D

     5  4  0  0  0  0  0  0  0  0999 V2000
      -0.0141    0.0553    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       0.8109    0.0553    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
      -0.4266    0.7697    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0
      -0.0141   -0.7697    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0
      -0.8109   -0.1583    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
     1  2  1  0
     1  3  1  0
     1  4  1  1
     1  5  1  0
   M  END', false));
   NOTICE:  mol conformer count = 0
                                 mol_to_ctab
   -----------------------------------------------------------------------
                                                                         +
         RDKit          2D                                               +
                                                                         +
      5  4  0  0  0  0  0  0  0  0999 V2000                              +
        0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0+
       -1.5000    0.0000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0+
       -0.0000   -1.5000    0.0000 Br  0  0  0  0  0  0  0  0  0  0  0  0+
        0.0000    1.5000    0.0000 Cl  0  0  0  0  0  0  0  0  0  0  0  0+
        1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0+
      1  2  1  6                                                         +
      1  3  1  0                                                         +
      1  4  1  0                                                         +
      1  5  1  0                                                         +
    M  END                                                               +

   (1 row)

   postgres=#

Something I missed about querying a mol for conformers ? As of now I 
lose the input conformer and the code will always output a 
calculated-from-scratch depiction.
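
A small sketch of how such a round trip could be checked without eyeballing the output: read the atom block of two V2000 molfiles by fixed columns and compare the coordinates. The function names and the tolerance are assumptions made for illustration; the coordinates are the ones from the example above. The helper that builds the demo molblocks writes atoms only and leaves out the bond block.

```python
def make_molblock(name, atoms):
    """Build a minimal V2000 molblock (atoms only, no bonds) for the demo."""
    lines = [name, "  sketch", "",
             "%3d%3d  0  0  0  0  0  0  0  0999 V2000" % (len(atoms), 0)]
    for symbol, x, y, z in atoms:
        lines.append("%10.4f%10.4f%10.4f %-3s 0  0  0  0  0  0  0  0  0  0  0  0"
                     % (x, y, z, symbol))
    lines.append("M  END")
    return "\n".join(lines) + "\n"


def atom_coords(molblock):
    """Return [(symbol, x, y, z), ...] read from the V2000 atom block."""
    lines = molblock.splitlines()
    n_atoms = int(lines[3][0:3])          # counts line: first 3 columns
    coords = []
    for line in lines[4:4 + n_atoms]:
        # x, y, z occupy three 10-column fields; the symbol sits in columns 32-34
        coords.append((line[31:34].strip(),
                       float(line[0:10]), float(line[10:20]), float(line[20:30])))
    return coords


def same_coords(block_a, block_b, tol=1e-4):
    """True if both molblocks have the same atoms at the same positions."""
    a, b = atom_coords(block_a), atom_coords(block_b)
    if len(a) != len(b):
        return False
    for (sym_a, xa, ya, za), (sym_b, xb, yb, zb) in zip(a, b):
        if sym_a != sym_b:
            return False
        if max(abs(xa - xb), abs(ya - yb), abs(za - zb)) > tol:
            return False
    return True


# Coordinates from the input molfile and from the psql output shown above.
ORIGINAL = [("C", -0.0141, 0.0553, 0.0), ("F", 0.8109, 0.0553, 0.0),
            ("Br", -0.4266, 0.7697, 0.0), ("Cl", -0.0141, -0.7697, 0.0),
            ("C", -0.8109, -0.1583, 0.0)]
REDRAWN = [("C", 0.0, 0.0, 0.0), ("F", -1.5, 0.0, 0.0),
           ("Br", 0.0, -1.5, 0.0), ("Cl", 0.0, 1.5, 0.0),
           ("C", 1.5, 0.0, 0.0)]

if __name__ == "__main__":
    original = make_molblock("chiral1.mol", ORIGINAL)
    redrawn = make_molblock("", REDRAWN)
    print(same_coords(original, original))
    print(same_coords(original, redrawn))
```

A check like this could go into a cartridge regression test by feeding it the input molfile and the text returned by mol_to_ctab().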


Cheers
-- Jan


Re: [Rdkit-discuss] Pg cartridge - mol_to_ctab() and trouble with conformers.

2014-03-05 Thread Jan Holst Jensen

Hi Greg,

Thanks for the explanation.

I added this to rdkit_io.c in mol_from_ctab():

+  bool keepConformer = PG_GETARG_BOOL(1);
-  mol = parseMolCTAB(data,false,true);
+  mol = parseMolCTAB(data,keepConformer,true);

and then I can get the expected behavior and have my tests complete 
successfully. Yes :-).


I will go ahead and create a pull request for mol_to_ctab(). The tests 
for mol_to_ctab() will assume that mol_from_ctab() uses the optional 
parameter to keep the conformer.


Cheers
-- Jan

On 2014-03-05 13:25, Greg Landrum wrote:

Hi Jan,

The below behavior is the result of a bug 
(https://github.com/rdkit/rdkit/issues/229).
mol_from_ctab() takes an (undocumented) optional argument that is 
supposed to determine whether or not the molecule's conformation is 
stored in the database. The default is to not store the conformation; 
this reduces the size of the database and speeds up depickling of 
molecules. The bug is that even if you try to keep the 
conformation, the argument is ignored and the conformation is discarded.


I'll get this fixed tomorrow morning. Alternatively, if you want to 
fix it now, the change just needs to be made in the definition of 
mol_from_ctab() in rdkit_io.c


-greg





[Rdkit-discuss] Announce: RDKit in Oracle - via pypl cartridge.

2014-02-27 Thread Jan Holst Jensen

Hi RDKitters,

I have dabbled with calling RDKit from Oracle and have succeeded. It is 
done via an Oracle cartridge that makes it possible to call Python 
scripts from Oracle. The cartridge is not nearly as sophisticated as 
Postgres' support for Python, but it gets the job done.


The cartridge is open source, BSD-licensed, and can be downloaded from here:
http://biochemfusion.com/downloads/pypl_2014-02-27.zip - Yes, it's only 
an 8.3 KB download.


The cartridge was created on a Linux machine with CentOS 6.2 and Oracle 
11g. It shouldn't be too hard to make it compile on Windows too, but I 
haven't had the need yet.


With the cartridge we can create Oracle functions like below that 
operate on MDL molfiles, returning SMILES and LogP values:


create function mol_to_smiles(molfile in clob) return varchar2
is
begin
  return pypl.run_script(
    'from rdkit import Chem' || Chr(10) ||
    'from rdkit.Chem import Descriptors' || Chr(10) ||
    'molfile = """' || molfile || '"""' || Chr(10) ||
    'result = Chem.MolToSmiles(Chem.MolFromMolBlock(molfile))', 'result');
end;
/

create function mol_logp(molfile in clob) return number
is
begin
  return to_number(pypl.run_script(
    'from rdkit import Chem' || Chr(10) ||
    'from rdkit.Chem import Descriptors' || Chr(10) ||
    'molfile = """' || molfile || '"""' || Chr(10) ||
    'result = str(Descriptors.MolLogP(Chem.MolFromMolBlock(molfile)))',
    'result'));
end;
/
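
To make the string concatenation above easier to follow, here is a rough Python sketch of the script text that ends up being handed to the interpreter. It assumes the molfile is wrapped in a triple-quoted literal; build_script_safely() is a hypothetical variant, not part of the cartridge, that lets repr() produce the literal so that a molfile containing quote characters or backslashes cannot break the generated script.

```python
def build_script(molfile):
    """Mirror the PL/SQL concatenation: embed the molfile in a triple-quoted literal."""
    return "\n".join([
        "from rdkit import Chem",
        "from rdkit.Chem import Descriptors",
        'molfile = """' + molfile + '"""',
        "result = Chem.MolToSmiles(Chem.MolFromMolBlock(molfile))",
    ])


def build_script_safely(molfile):
    """Hypothetical variant: repr() yields a valid Python literal for any input text."""
    return "\n".join([
        "from rdkit import Chem",
        "from rdkit.Chem import Descriptors",
        "molfile = " + repr(molfile),
        "result = Chem.MolToSmiles(Chem.MolFromMolBlock(molfile))",
    ])


def compiles(script):
    """Check only that the generated text is syntactically valid Python."""
    try:
        compile(script, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print(compiles(build_script("name\n  header\n\nM  END\n")))
    print(compiles(build_script('odd """ name\nM  END\n')))
    print(compiles(build_script_safely('odd """ name\nM  END\n')))
```

The sketch only compiles the generated text; actually running it would need RDKit installed.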

If we have a COMPOUNDS table where the STRUCTURE column has the molecule 
stored in Accelrys Direct format, we can then do the following select:


select
    id,
    mol_to_smiles(molfile(structure)) as smiles,
    mol_logp(molfile(structure)) as logp
  from compounds
 where id <= 3;

ID  SMILES               LOGP
1   O=C(O)c1ccccc1       1.3848
2   CCC(=O)OCCOc1ccccc1  2.0186
3   CC1=CC(=O)C=CC1=O    0.6407


[The molfile() function is an Accelrys Direct function that produces a 
molfile CLOB from the STRUCTURE BLOB column.]


There you have it - RDKit functionality directly in an Oracle database. 
Hope that you will find this useful.


Cheers
-- Jan


Re: [Rdkit-discuss] rdkit RPM for CentOS 5.3 ?

2014-02-26 Thread Jan Holst Jensen

  
  
On 2014-02-26 11:04, Greg Landrum wrote:

On Wed, Feb 26, 2014 at 10:56 AM, Gianluca Sforna gia...@gmail.com wrote:

On Tue, Feb 25, 2014 at 9:40 PM, Jan Holst Jensen j...@biochemfusion.com
wrote:

I am testing if rdkit can be used from Oracle on a customer (test!)
database. And said database runs on a CentOS 5.3 server - no OS upgrade
in the near future. First step is to get rdkit working in Python 2.4 on
that server.

For a start, you can make your life easier by adding the EPEL
repository and pulling python26 package from there.

Yeah, I am pretty sure that there is python code that is not going to
work with 2.4

IIRC the only real blocker that prevented me from building RDKit
against CentOS5 + EPEL was flex

Fortunately, flex is no longer required, so that one is gone.

-greg

  

  


Thanks, Gianluca and Greg. The EPEL repository looks like it could
save the day, giving me a tool chain so I can build rdkit. I'll give
it a try!

Cheers
-- Jan

-- 
Biochemfusion ApS, CVR(VAT) No. DK 32 05 74 46
by Jan Holst Jensen
Lindegårdsvej 44, 2. TV
DK-2920 Charlottenlund, Copenhagen
e-mail: j...@biochemfusion.com
Web: http://biochemfusion.com
Phone: +45 30 48 00 50



Re: [Rdkit-discuss] RDKit cartridge - opposite of mol_from_ctab() would be nice.

2014-02-25 Thread Jan Holst Jensen

Hi TJ,

Interesting approach. I have also used plpython to get to rdkit 
functionality that was not in the postgres cartridge. Very useful.


Cheers
-- Jan

On 2014-02-24 21:41, TJ O'Donnell wrote:

Hi All

I would like to announce the availability of a somewhat different rdkit-based
postgresql extension.  This uses rdkit for all the basic cheminformatics
functions (canonical smiles, molfile handling, smarts matching, fingerprints,
etc.) but is based on the use of postgres' plpython language.
This does not use the existing rdkit postgres cartridge, although I have
demonstrated that the two can be used side-by-side (via the use of
rdkit pickled mol objects).

I hope this use of python might make it easier to extend postgres even further
with additional functions based on rdkit.  The code can be checked out from
sourceforge using this:

svn checkout svn://svn.code.sf.net/p/sci3d/code/trunk/openchord/src/rdkit chord

This is a work in progress, so I would appreciate any feedback.  There are
still some wrinkles that need to be ironed out.   I plan to document
the installation and usage better, probably using github.

TJ O'Donnell



On Sat, Feb 22, 2014 at 10:53 PM, Greg Landrum greg.land...@gmail.com wrote:



On Fri, Feb 21, 2014 at 5:45 PM, Jan Holst Jensen
j...@biochemfusion.com wrote:

Hi Greg,

It would be great to gain the experience. I am working on a
registration project where we will likely need to surface
additional functions in the cartridge, just to try them out.
So, knowing how to do that in a way where things that turn out
useful can be contributed back cleanly would be great.


Sounds good.


 if structures don't have conformers

Ah, yes; good question. Decisions, decisions... I'll dodge the
question :-) and say it sounds like a perfect fit for an
optional parameter, e.g.

mol_to_ctab(m mol, add_depiction_if_missing bool default true)

I would go for default true because I believe that is the
general preference.


Having the optional argument that defaults to true makes sense to me.

Here's an attempt to briefly summarize what needs to be changed in
order to add the new functionality:

- Add mol_to_ctab to rdkit_io.c
- Add molToCtabText (or some such thing) to adapter.cpp and rdkit.h
- Add mol_to_ctab() definitions to rdkit.sql91.in and, if you want to
support older versions of postgres, rdkit.sql.in
- Update link dependencies in Makefile if necessary (will be
necessary if you add depictions)
- Add tests to one of the files in sql/ (the most logical place is
probably rdkit-91.sql and rdkit-pre91.sql if you are supporting
older versions) and the corresponding output file in expected/


I think that's it.

-greg

Cheers
-- Jan


On 2014-02-21 16:47, Greg Landrum wrote:

Hi Jan,

Great idea. I'd be happy to add it, but I can also talk you
through it if you want to gain the experience.

One important question: if structures don't have conformers
(if they are loaded from SMILES, for example), should ctabs
with all zero coordinates be generated or should depictions
be generated?

-greg


On Fri, Feb 21, 2014 at 2:23 PM, Jan Holst Jensen
j...@biochemfusion.com wrote:

Hi Greg,

Are there any plans for a mol_to_ctab() function in the
PG cartridge ? Would make SD file export from the
database a bit easier.

If there are no immediate plans, I can take a stab at
adding it myself.

* Looks like rdkit_io.c is the place to add it ?
* Should I manually define the new SQL function in
rdkit.sql.in, or is there some
higher-level place I should add it instead ?

Cheers
-- Jan



[Rdkit-discuss] rdkit RPM for CentOS 5.3 ?

2014-02-25 Thread Jan Holst Jensen
Hi RDKitters,

OK - CentOS 5.3 is getting ancient, but: Does anyone have an rdkit RPM 
or just a binary build for CentOS 5.3 that works with its standard 
Python 2.4 ? The build process doesn't look so fun on such an old OS.

I am testing if rdkit can be used from Oracle on a customer (test!) 
database. And said database runs on a CentOS 5.3 server - no OS upgrade 
in the near future. First step is to get rdkit working in Python 2.4 on 
that server.

Cheers
-- Jan



[Rdkit-discuss] Pg cartridge: Linking issue when adding depiction functionality.

2014-02-23 Thread Jan Holst Jensen

Hi Greg,

I now have a working version of mol_to_ctab() - except when I add in 
depiction generation. There is a linking issue I don't understand.


In adapter.cpp I have added this function that implements the C++ part 
of mol_to_ctab():


   extern "C" char *
   makeCtabText(CROMol data, int *len, bool createDepictionIfMissing) {
     ROMol   *mol = (ROMol*)data;

     try {
       if (createDepictionIfMissing && mol->getNumConformers() == 0) {
         RDDepict::compute2DCoords(*mol);
       }
       StringData = MolToMolBlock(*mol);
     } catch (...) {
       ereport(WARNING,
               (errcode(ERRCODE_WARNING),
                errmsg("makeCtabText: problems converting molecule to CTAB")));
       StringData="";
     }

     *len = StringData.size();
     return (char*)StringData.c_str();
   }


But, postgres then won't start (I preload the 'rdkit' library) and in 
the postgres startup log I see a complaint about unresolved symbols. 
Those are caused by the RDDepict::compute2DCoords() call. If I compile 
without the RDDepict::... line, all is well, and the cartridge loads.


I check the built rdkit.so (with the depiction call):

   [jhje@bcfregbuild rdkit]$ nm -D rdkit.so | grep " U" | grep RD
                    U _ZN6RDGeom11Transform2D12SetTransformERKNS_7Point2DES3_S3_S3_
                    U _ZN6RDGeom11Transform2D12SetTransformERKNS_7Point2DEd
                    U _ZNK6RDGeom11Transform2D14TransformPointERNS_7Point2DE
   [jhje@bcfregbuild rdkit]$
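
The same filtering can be done on captured nm output in a script, which may be handy when checking a build repeatedly. This is only a sketch: the function name is invented, and the sample text reuses the three symbols listed above plus one made-up defined symbol so that there is something to filter out.

```python
def undefined_symbols(nm_output, contains=""):
    """Return names of undefined ('U') symbols from `nm` output text.

    Undefined entries have no address column, so they split into exactly
    two fields: the type letter and the symbol name.
    """
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "U" and contains in parts[1]:
            found.append(parts[1])
    return found


SAMPLE = """\
                 U _ZN6RDGeom11Transform2D12SetTransformERKNS_7Point2DES3_S3_S3_
                 U _ZN6RDGeom11Transform2D12SetTransformERKNS_7Point2DEd
                 U _ZNK6RDGeom11Transform2D14TransformPointERNS_7Point2DE
                 U malloc
0000000000012340 T example_defined_symbol
"""

if __name__ == "__main__":
    for name in undefined_symbols(SAMPLE, contains="RD"):
        print(name)
```

Note that a plain text search through the .a files matches archives that merely reference a symbol as well as the one that defines it; with GNU nm, the --defined-only option is one way to narrow that down.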


So I do a naive search for those unresolved symbols in rdkit's lib 
directory to see what libs are needed:


   [jhje@bcfregbuild lib]$ grep
   _ZN6RDGeom11Transform2D12SetTransformERKNS_7Point2DES3_S3_S3_ *.a
   Binary file libDepictor_static.a matches
   Binary file libRDGeometryLib_static.a matches
   [jhje@bcfregbuild lib]$ grep
   _ZN6RDGeom11Transform2D12SetTransformERKNS_7Point2DEd *.a
   Binary file libDepictor_static.a matches
   Binary file libRDGeometryLib_static.a matches
   [jhje@bcfregbuild lib]$ grep
   _ZNK6RDGeom11Transform2D14TransformPointERNS_7Point2DE *.a
   Binary file libDepictor_static.a matches
   Binary file libRDGeometryLib_static.a matches
   [jhje@bcfregbuild lib]$


I have added Depictor to the RDKLIBS in the Makefile - RDGeometryLib was 
already there:


RDKLIBS   = ${INCHILIBS} [...] -lDataStructs -lRDGeometryLib 
-lRDGeneral *-lDepictor*


So I would have thought that all was well. But I still have these three 
unlinked symbols after adding the Depictor lib and rebuilding. Some 
secondary dependencies ... ? I hope that I can slap my forehead and say 
DUH when I am told the reason why...


Cheers
-- Jan


Re: [Rdkit-discuss] Pg cartridge: Linking issue when adding depiction functionality.

2014-02-23 Thread Jan Holst Jensen

RDKLIBS changed:

  - ... -lRDGeometryLib -lRDGeneral -lDepictor
  + ... -lDepictor -lRDGeometryLib -lRDGeneral

And lo, there was much rejoicing and slapping of foreheads - that did 
it. Thanks!


Cheers
-- Jan

On 2014-02-23 12:56, Greg Landrum wrote:

Jan,

It is probably the order of the libraries (static linking can 
introduce that).

Try moving the depiction lib to the beginning of the libs list
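
For readers wondering why the order matters: a traditional linker scans static archives once, left to right, and pulls in an archive member only if it defines something that is still undefined at that moment. The toy model below illustrates this under simplifying assumptions; it is not how any real linker is implemented, and the symbol names are shortened stand-ins for the mangled ones.

```python
def link(object_needs, archives):
    """Return the symbols left unresolved after a single left-to-right pass.

    archives is a list of (name, members); each member is a
    (defined_symbols, needed_symbols) pair of sets.
    """
    defined = set()
    undefined = set(object_needs)
    for _name, members in archives:
        pulled = set()
        progress = True
        while progress:  # rescan this archive until no further member is pulled in
            progress = False
            for index, (defs, needs) in enumerate(members):
                if index not in pulled and defs & undefined:
                    pulled.add(index)
                    defined |= defs
                    undefined = (undefined | needs) - defined
                    progress = True
    return undefined


# Simplified stand-ins: the depiction code needs Transform2D from the geometry lib.
DEPICTOR = ("Depictor", [({"compute2DCoords"}, {"Transform2D::SetTransform"})])
GEOMETRY = ("RDGeometryLib", [({"Point2D::length"}, set()),
                              ({"Transform2D::SetTransform"}, set())])
ADAPTER_NEEDS = {"compute2DCoords", "Point2D::length"}

if __name__ == "__main__":
    # Geometry first: its Transform2D member is skipped because nothing needs it yet.
    print(sorted(link(ADAPTER_NEEDS, [GEOMETRY, DEPICTOR])))
    # Depictor first: the later geometry archive can satisfy the new references.
    print(sorted(link(ADAPTER_NEEDS, [DEPICTOR, GEOMETRY])))
```

Because undefined symbols are normally tolerated when building a shared object, a leftover reference like this tends to show up only when the library is loaded, which fits the symptom described here.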

-greg

--
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] RDKit nodes in KNIME stopped working suddenly

2014-02-21 Thread Jan Holst Jensen
On 2014-02-21 11:04, Michal Krompiec wrote:
 Hello,
 We've been using the RDKit nodes in KNIME for quite a while without
 any problems. But suddenly they ceased to work on some computers,
 while still working on other ones. Tried with a fresh KNIME
 installation with latest RDKit nodes - same problem. What could be
 wrong? I pasted the warning/error messages below:

 WARN   RDKitTypesPluginActivator  Library file
 GraphMolWrap.dll found:
 C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\GraphMolWrap.dll

 ERROR  RDKitTypesPluginActivator  Loading of library
 GraphMolWrap.dll failed (possibly a subsequent error):
 C:\Temp\knime_2.9.1\configuration\org.eclipse.osgi\bundles\libtemp\224_0\GraphMolWrap.dll:
 Can't find dependent libraries

 ERROR  RDKitTypesPluginActivator  The library GraphMolWrap.dll
 has dependency issues. Please run a dependency walker on this file to
 find out what is missing.

 ERROR  RDKitTypesPluginActivator  Suggestion for fix: Please
 correct your system libraries based on the outcome of the dependency
 walker.

 WARN   Histogram  2 columns without a valid
 domain will be ignored. In order to calculate the domain use the
 Nominal Values or Domain Calculator node.

 WARN   RDKit From Molecule  Could not load native RDKit
 library:
 C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\boost_system-vc100-mt-1_51.dll:
 Can't find dependent libraries

 WARN   RDKit From Molecule  Could not load native RDKit
 library:
 C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\boost_system-vc100-mt-1_51.dll:
 Can't find dependent libraries

 WARN   RDKit From Molecule  Could not load native RDKit
 library:
 C:\Temp\knime_2.9.1\plugins\org.rdkit.knime.bin.win32.x86_2.4.0.201402061135\os\win32\x86\boost_system-vc100-mt-1_51.dll:
 Can't find dependent libraries

 Thanks in advance,

 Michal


Hello Michal,

Could it be missing VC++ runtime DLLs on those machines ? Do you have 
MSVCP100.DLL and MSVCR100.DLL in your path ?
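
A quick way to answer that question on each machine is to scan the PATH directories for the two files. The sketch below uses only the Python standard library; `find_on_path` is a hypothetical helper, and note that PATH is only one of the places Windows searches for DLLs, so a "not found" here is a hint rather than proof.

```python
import os


def find_on_path(filenames, search_path=None):
    """Map each filename to the first PATH directory entry that contains it.

    Matching is case-insensitive, as on Windows. Files that are not found
    map to None.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    wanted = {name.lower(): name for name in filenames}
    found = {name: None for name in filenames}
    for directory in search_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        try:
            entries = os.listdir(directory)
        except OSError:
            continue  # unreadable directory: skip it
        for entry in entries:
            original = wanted.get(entry.lower())
            if original is not None and found[original] is None:
                found[original] = os.path.join(directory, entry)
    return found


if __name__ == "__main__":
    results = find_on_path(["MSVCP100.DLL", "MSVCR100.DLL"])
    for name, location in results.items():
        print(name, "->", location or "NOT FOUND on PATH")
```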

Cheers
-- Jan



[Rdkit-discuss] RDKit cartridge - opposite of mol_from_ctab() would be nice.

2014-02-21 Thread Jan Holst Jensen

Hi Greg,

Are there any plans for a mol_*to*_ctab() function in the PG cartridge ? 
Would make SD file export from the database a bit easier.


If there are no immediate plans, I can take a stab at adding it myself.

* Looks like rdkit_io.c is the place to add it ?
* Should I manually define the new SQL function in rdkit.sql.in, or is 
there some higher-level place I should add it instead ?


Cheers
-- Jan


Re: [Rdkit-discuss] RDKit cartridge - opposite of mol_from_ctab() would be nice.

2014-02-21 Thread Jan Holst Jensen

Hi Greg,

It would be great to gain the experience. I am working on a registration 
project where we will likely need to surface additional functions in the 
cartridge, just to try them out. So, knowing how to do that in a way 
where things that turn out useful can be contributed back cleanly would 
be great.


 if structures don't have conformers

Ah, yes; good question. Decisions, decisions... I'll dodge the question 
:-) and say it sounds like a perfect fit for an optional parameter, e.g.


mol_to_ctab(m mol, add_depiction_if_missing bool default true)

I would go for default true because I believe that is the general 
preference.


Cheers
-- Jan

On 2014-02-21 16:47, Greg Landrum wrote:

Hi Jan,

Great idea. I'd be happy to add it, but I can also talk you through 
it if you want to gain the experience.


One important question: if structures don't have conformers (if they 
are loaded from SMILES, for example), should ctabs with all zero 
coordinates be generated or should depictions be generated?


-greg








Re: [Rdkit-discuss] flexmatch in RDKit cartridge?

2014-02-20 Thread Jan Holst Jensen

Hi George et al,

flexmatch(... 'all') is the most strict exact match that the 
Symyx/Accelrys cartridge has. You can relax the matching behavior to 
varying degrees by passing it different options, e.g. using 'tau' 
instead of 'all' will make the identity check tautomer-agnostic (to the 
extent that the cartridge will perceive tautomers correctly - an 
interesting discussion topic in itself).


The various options to flexmatch() are well documented in the Accelrys 
documentation for the cartridge, but I don't know if that is publicly 
available.


The short answer in my opinion: Yes, @= should be the equivalent of 
flexmatch(m1, m2, 'all'). To emulate flexmatch(..., 'all') with rdkit, I 
find a small gotcha with regards to chiral matching:


-- Clearly not identical.
postgres=# select mol('CCC') @= mol('CCF');
 ?column?
--
 f
(1 row)

-- Clearly identical.
postgres=# select mol('CCC') @= mol('CCC');
 ?column?
--
 t
(1 row)

-- Ala versus dAla - should *not* be identical ?
postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@@H](N)C(=O)O');
 ?column?
--
 t
(1 row)

To get the expected behavior of @= you need to turn on chiral matching. 
Even though the parameter name says that it controls SSS behavior, it 
apparently also has an effect on exact matching:


postgres=# set rdkit.do_chiral_sss=true;
SET
-- Ala versus dAla - no longer identical.
postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@@H](N)C(=O)O');
 ?column?
--
 f
(1 row)

-- Ala versus Ala - phew, identical.
postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@H](N)C(=O)O');
 ?column?
--
 t
(1 row)

Cheers
-- Jan


On 2014-02-20 13:46, George Papadatos wrote:

Hi there,
Wouldn't that be (at least partly) possible with an exact structure 
search?


  * @= : returns whether or not two molecules are the same.

Cheers,
George


On 20 February 2014 11:59, Greg Landrum <greg.land...@gmail.com> wrote:


Sounds interesting. Can anyone provide a pointer to a doc with
more specific info about what this actually does?


On Thursday, February 20, 2014, Michał Nowotka <mmm...@gmail.com> wrote:

Hi,

The Symyx cartridge defines something called flexmatch: "Finds
records that are an exact match of the 2D or 3D structure that
you specify in the query."
Is there anything similar in the RDKit cartridge? I looked in the
documentation and couldn't find this feature.

Regards,
Michal Nowotka










Re: [Rdkit-discuss] Chem.MolToMolBlock() does not switch to V3000 format as expected, when number of atoms or bonds > 999.

2013-12-20 Thread Jan Holst Jensen

On 2013-12-20 04:57, Greg Landrum wrote:

Hi Jan


On Thu, Dec 19, 2013 at 11:32 PM, Jan Holst Jensen <j...@biochemfusion.com> wrote:



The first lines of C:/temp/big.mol are:

HERCEPTIN FAB (ANTIBODY) - LIGHT CHAIN
 RDKit  3D

78908023  0  0  0  0  0  0  0  0999 V2000
   34.5390   88.7520   88.2980 N   0  0  0  0  0  0  0  0  0  0  0  0
   34.7910   87.4610   89.0350 C   0  0  2  0  0  0  0  0  0  0  0  0
   35.9070   86.7790   88.2540 C   0  0  0  0  0  0  0  0  0  0  0  0



which is unreadable since there are 7890 atoms and 8023 bonds and
V2000 format supports max. 999 atoms/bonds. I would have expected
RDKit to switch to V3000 format automatically (?). I don't see any
parameter in MolToMolBlock() that can force V3000 output ?


That's a perfectly reasonable expectation, but, unfortunately, the 
RDKit has a small hole in the functionality it provides: there is no 
V3000 mol block writer. I just added the feature request in github and 
will try to get it in for the next release.


Sorry for the inconvenience.
-greg



Ah, OK. No worries; sounds like I am the only one who has run into it 
so far. I'll just use a different output format for now.
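
The unreadable header comes straight from the fixed-width layout: the V2000 counts line reserves three characters each for the atom and bond counts, so four-digit values simply run together. The sketch below reproduces that effect with plain string formatting; it is an illustration of the format limit, not RDKit's writer code, and both function names are hypothetical.

```python
V2000_MAX_COUNT = 999  # largest value that fits a 3-character count field


def v2000_counts_line(n_atoms, n_bonds):
    """Format a V2000 counts line the naive fixed-width way.

    '%3d' pads short numbers but does not truncate long ones, so counts
    above 999 overflow their fields and fuse with the neighbouring field.
    """
    return "%3d%3d  0  0  0  0  0  0  0  0999 V2000" % (n_atoms, n_bonds)


def fits_v2000(n_atoms, n_bonds):
    """True if both counts can be represented in a V2000 counts line."""
    return n_atoms <= V2000_MAX_COUNT and n_bonds <= V2000_MAX_COUNT


if __name__ == "__main__":
    print(v2000_counts_line(107, 107))
    print(v2000_counts_line(7890, 8023))
    print(fits_v2000(7890, 8023))
```

A caller could use a check like `fits_v2000` to pick another output format before writing a large structure.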


Cheers
-- Jan


Re: [Rdkit-discuss] install error

2013-06-05 Thread Jan Holst Jensen

Hi Yingfeng,

Looks like what happened to me, when I had forgotten to set the 
environment. Did you set LD_LIBRARY_PATH to $RDBASE/lib before running 
the tests ?


Kind regards
-- Jan Holst Jensen

On 2013-06-05 16:09, Yingfeng Wang wrote:

Yes. I tried again. This time, after running download-inchi.sh, I ran:

mkdir build
cd build
sudo cmake .. -DRDK_BUILD_INCHI_SUPPORT=ON
sudo make
sudo make install

So far, no errors were reported.

But when I ran sudo ctest, I got:

The following tests FAILED:
 1 - testInchi (OTHER_FAULT)
 3 - testDataStructs (OTHER_FAULT)
 4 - pyBV (Failed)
 5 - pyDiscreteValueVect (Failed)
 6 - pySparseIntVect (Failed)
 8 - testGrid (OTHER_FAULT)
 9 - testPyGeometry (Failed)
12 - pyAlignment (Failed)
16 - pyDistGeom (Failed)
20 - graphmolMolOpsTest (SEGFAULT)
22 - graphmoltestChirality (OTHER_FAULT)
23 - graphmoltestPickler (OTHER_FAULT)
25 - testDepictor (OTHER_FAULT)
26 - pyDepictor (Failed)
29 - fileParsersTest1 (OTHER_FAULT)
30 - testMolSupplier (OTHER_FAULT)
31 - testMolWriter (OTHER_FAULT)
32 - testTplParser (OTHER_FAULT)
33 - testMol2ToMol (OTHER_FAULT)
35 - testReaction (OTHER_FAULT)
36 - pyChemReactions (Failed)
37 - testChemTransforms (OTHER_FAULT)
40 - testFragCatalog (OTHER_FAULT)
41 - pyFragCatalog (Failed)
42 - testDescriptors (OTHER_FAULT)
43 - pyMolDescriptors (Failed)
44 - testFingerprints (OTHER_FAULT)
45 - pyPartialCharges (Failed)
46 - testMolTransforms (OTHER_FAULT)
47 - pyMolTransforms (Failed)
48 - testForceFieldHelpers (OTHER_FAULT)
49 - pyForceFieldHelpers (Failed)
50 - testDistGeomHelpers (OTHER_FAULT)
51 - pyDistGeom (Failed)
52 - testMolAlign (OTHER_FAULT)
53 - pyMolAlign (Failed)
54 - testFeatures (OTHER_FAULT)
55 - pyChemicalFeatures (Failed)
56 - testShapeHelpers (OTHER_FAULT)
57 - pyShapeHelpers (Failed)
59 - pyMolCatalog (Failed)
61 - pySLNParse (Failed)
62 - pyGraphMolWrap (Failed)
63 - pyTestConformerWrap (Failed)
66 - pyMatCalc (Failed)
67 - pyCMIM (Failed)
68 - pyRanker (Failed)
70 - pyFeatures (Failed)
71 - pythonTestDbCLI (Failed)
72 - pythonTestDirML (Failed)
77 - pythonTestDirChem (Failed)
Errors while running CTest

Thanks.



On Wed, Jun 5, 2013 at 5:54 AM, JP <jeanpaul.ebe...@inhibox.com> wrote:


Before running cmake - have you
run ./External/INCHI-API/download-inchi.sh ?




On 5 June 2013 04:13, Yingfeng Wang <ywang...@gmail.com> wrote:

After getting the latest code from git, I installed RDKit on
Ubuntu 12.04.

During the make install step, I got

CMake Error at External/INCHI-API/cmake_install.cmake:124 (FILE):

It seems the file $RDBASE/lib/libRDInchiLib.so.1.2013.06.1pre
can't be found. Please note that I have turned on the InChI
flag.

Thanks.

Yingfeng





Re: [Rdkit-discuss] install error

2013-06-05 Thread Jan Holst Jensen
OK, that looks fine. And I assume you build directly in 
/opt/RDKit_latest/latest/, so 'make install' puts files in 
/opt/RDKit_latest/latest/lib/.


The only other thing that comes to my mind is if your boost libraries 
are included in LD_LIBRARY_PATH.


Cheers
-- Jan

On 2013-06-05 16:55, Yingfeng Wang wrote:

Yes, I did.

export RDBASE=/opt/RDKit_latest/latest
export LD_LIBRARY_PATH=$RDBASE/lib:$LD_LIBRARY_PATH
export PYTHONPATH=$RDBASE:$PYTHONPATH

I put them on ~/.bashrc










Re: [Rdkit-discuss] InChI strings from molfiles - stereo perception.

2013-05-01 Thread Jan Holst Jensen
Dear Greg,

On 2013-05-01 09:58, Greg Landrum wrote:
 Dear Jan,

 On Mon, Apr 29, 2013 at 8:03 AM, Jan Holst Jensen j...@biochemfusion.com 
 wrote:
 Hi RDKitters,

 I wonder why the InChI strings generated by RDKit differ from the ones
 generated by the standard IUPAC inchi-1 executable.
 At least some were due to an RDKit bug that has been fixed for a while
 (it's in the 2013.03 release). The fix isn't reflected in the knime
 nodes because we haven't done an update of the knime binaries in a
 while; that's coming in the next day or so.

Ah - sounds wonderful. Thanks.

Out of sheer laziness my Python-enabled RDKit builds have been without 
InChI support so I couldn't compare with the KNIME nodes - just assumed 
that they behaved identically. Well... as they say Assumption is the 
mother of all sc***-ups :-).

 OK, now those InChI samples look like they are heavy on fringe cases and
 perhaps thus likely to really stress toolkits.
 These are the best kind. :-)

Indeed :-).

 So I took something more peaceful and ran a peptide from PubChem through
 (pubchem_71296070.mol - attached).


 In [4]: Chem.MolToInchi(Chem.MolFromMolFile('pubchem_71296070.mol'))
 Out[4]: 
 'InChI=1S/C33H55N9O10/c1-18(43)26(37)31(49)40-23(16-20-10-4-3-5-11-20)30(48)39-21(12-6-8-14-34)28(46)38-22(13-7-9-15-35)29(47)42-27(19(2)44)32(50)41-24(33(51)52)17-25(36)45/h3-5,10-11,18-19,21-24,26-27,43-44H,6-9,12-17,34-35,37H2,1-2H3,(H2,36,45)(H,38,46)(H,39,48)(H,40,49)(H,41,50)(H,42,47)(H,51,52)/t18-,19-,21+,22+,23+,24+,26+,27+/m1/s1'

 also looks fine.

Yep. Everything should be in order then.

Cheers
-- Jan



[Rdkit-discuss] InChI strings from molfiles - stereo perception.

2013-04-29 Thread Jan Holst Jensen

Hi RDKitters,

I wonder why the InChI strings generated by RDKit differ from the ones 
generated by the standard IUPAC inchi-1 executable.


I have used the IUPAC inchi-1 executable from a command line to generate 
IUPAC InChI strings (the executable that comes pre-built with the InChI 
1.04 binary download).


RDKit InChI strings were generated with the RDKit KNIME nodes, this version:

 RDKit KNIME integration    2.1.0.201302211506

I constructed a KNIME workflow that reads in an SD-file, uses the 
Molecule to RDKit node and then the RDKit To InChI node with default 
options to generate RDKit InChI strings.


I ran the standard InChI example file Samples.sdf through the KNIME 
workflow and compared with the InChIs generated from the IUPAC 
executable. A number of InChI strings are different; it seems to be 
almost all stereo-related.


For example: InChI strings generated for spiro.mol (spiro.mol - attached):

IUPAC: 
InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3/t2*7-,8-,9-/m10/s1

RDKit: InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3

and stertaut.mol (stertaut.mol - attached):

IUPAC: 
InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11/h1,3-4,7H,(H,8,9)(H,10,11)/b2-1-/t3-,4+/m0/s1
RDKit: 
InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11/h1,3-4,7H,(H,8,9)(H,10,11)/t3-,4-/m1/s1


OK, now those InChI samples look like they are heavy on fringe cases and 
perhaps thus likely to really stress toolkits.


So I took something more peaceful and ran a peptide from PubChem 
through (pubchem_71296070.mol - attached).


IUPAC: 
InChI=1S/C33H55N9O10/c1-18(43)26(37)31(49)40-23(16-20-10-4-3-5-11-20)30(48)39-21(12-6-8-14-34)28(46)38-22(13-7-9-15-35)29(47)42-27(19(2)44)32(50)41-24(33(51)52)17-25(36)45/h3-5,10-11,18-19,21-24,26-27,43-44H,6-9,12-17,34-35,37H2,1-2H3,(H2,36,45)(H,38,46)(H,39,48)(H,40,49)(H,41,50)(H,42,47)(H,51,52)/t18-,19-,21+,22+,23+,24+,26+,27+/m1/s1
RDKit: 
InChI=1S/C33H55N9O10/c1-18(43)26(37)31(49)40-23(16-20-10-4-3-5-11-20)30(48)39-21(12-6-8-14-34)28(46)38-22(13-7-9-15-35)29(47)42-27(19(2)44)32(50)41-24(33(51)52)17-25(36)45/h3-5,10-11,18-19,21-24,26-27,43-44H,6-9,12-17,34-35,37H2,1-2H3,(H2,36,45)(H,38,46)(H,39,48)(H,40,49)(H,41,50)(H,42,47)(H,51,52)/t18-,19-,21+,22+,23+,24+,26+,27+/m0/s1


The only difference in this case is that IUPAC outputs an InChI string 
with /m1 and RDKit an InChI with /m0. As far as I can understand from 
the InChI FAQ, the /m0 versus /m1 difference indicates that these are 
different enantiomers.
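
Spotting such differences by eye in long strings is error-prone, so a small layer-by-layer comparison can help. The sketch below is a simplification that assumes each layer prefix occurs at most once (true for the standard InChIs shown here, not for strings with repeated stereo layers in isotopic or fixed-H sections); the function names are hypothetical.

```python
def inchi_layers(inchi):
    """Split an InChI string into {layer prefix: content}.

    The first two segments (version and formula) have no prefix letter and
    are stored under the keys 'version' and 'formula'.
    """
    _, _, body = inchi.partition("=")
    parts = body.split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        layers[part[0]] = part[1:]
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return {layer: (value in a, value in b)} for layers that differ."""
    a, b = inchi_layers(inchi_a), inchi_layers(inchi_b)
    return {key: (a.get(key), b.get(key))
            for key in sorted(set(a) | set(b))
            if a.get(key) != b.get(key)}


if __name__ == "__main__":
    iupac = ("InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11/"
             "h1,3-4,7H,(H,8,9)(H,10,11)/b2-1-/t3-,4+/m0/s1")
    rdkit = ("InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11/"
             "h1,3-4,7H,(H,8,9)(H,10,11)/t3-,4-/m1/s1")
    for layer, (left, right) in differing_layers(iupac, rdkit).items():
        print("/%s: IUPAC=%r RDKit=%r" % (layer, left, right))
```

Run on the stertaut.mol pair above, this reports the missing /b layer together with the differing /t and /m layers.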


I converted the InChIs back to molecule with the InChI to RDKit KNIME 
node. The molecule generated from the IUPAC InChI (from-iupac-inchi.mol 
- attached) faithfully reconstructs the original PubChem molecule. When 
I construct a molecule from the RDKit InChI (from-rdkit-inchi.mol - 
attached), all the stereo centers have been inverted (as expected - 
different enantiomer).


Is there a good explanation for this ?

Cheers
-- Jan

71296070
  -OEChem-04201305312D

107107  0     1  0  0  0  0  0999 V2000
    8.0622    1.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   10.6603   -1.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   13.2583   -1.7500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    5.4641   -1.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   13.2583    1.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.8660    1.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.8660    4.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   16.7224    0.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   15.8564   -1.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   16.7224    1.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    8.9282   -0.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    6.3301    0.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   11.5263    0.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    4.5981    1.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   14.1244   -0.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   11.5263    4.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    8.9282   -4.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    4.5981    3.2500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   15.8564    2.7500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    7.1962   -0.2500    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
    9.7942    0.2500    0.0000 C   0  0  1  0  0  0  0  0  0  0  0  0
    7.1962   -1.2500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    9.7942    1.2500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    8.0622   -1.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   10.6603    1.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    8.0622    0.2500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   10.6603   -0.2500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.5981    0.2500    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
   10.6603    2.7500    0.0000 C   0  

[Rdkit-discuss] PG cartridge - Strange SearchMolCache() usage in rdkit_op.c

2012-11-24 Thread Jan Holst Jensen
Hi RDKitters,

I now have the Postgres cartridge code compiling on 64-bit Windows. I 
had to add a few explicit casts to dodge C2440 compiler errors but 
nothing strange there. What puzzles me are the compile errors I saw when 
compiling rdkit_op.c.

rdkit.h defines SearchMolCache() like this:

   void* SearchMolCache( void *cache, struct MemoryContextData * ctx, Datum a,
                         Mol **m, CROMol *mol, bytea **sign);

and it is called with 6 parameters all over the place - except in 
rdkit_op.c where an extra seventh NULL parameter is consistently added. 
That results in a:

 rdkit_op.c(75) : error C2660: 'SearchMolCache' : function does not take 7 arguments

which makes good sense to me.

Why don't I see this error when compiling on Linux ? I first thought that 
it was because I am compiling in C++ mode (C mode is pretty limited when 
using MSVC) but compiling rdkit_op.c in C mode still gives me:

 rdkit_op.c(75) : warning C4020: 'SearchMolCache' : too many actual parameters

Is it because gcc is more laid-back than MSVC and thinks "hey, an 
extra parameter on the stack - no worries", or what's happening here ? 
Disclaimer: I don't have much C experience, only a couple of years 
programming C++.

Anyway, I removed the extra NULL parameter and now it compiles. I will 
post patches when I get the linking sorted out and the cartridge runs.

Cheers
-- Jan



Re: [Rdkit-discuss] RDKit pgSQL cartridge on 64-bit Windows, anyone ?

2012-11-21 Thread Jan Holst Jensen
On 2012-11-14 10:30, paul.czodrow...@merckgroup.com wrote:
 Dear Jan,

 nope - but this reminds me of one UGM topic:
 Could anyone provide a 64-bit Win7 build?




Well, a couple of notes along the way to building the PostgreSQL 
cartridge for Windows. This one is for building 64-bit RDKit on Windows.

I am building on a Windows Server 2008 R2 x64 machine with Visual Studio 
2010, cygwin, boost 1.51 (both x86 and x64 present), and cmake. Note 
that I have built without Python wrappers and without Java wrappers.

I downloaded RDKit 2012_09_1 and generated a VS solution file with 
cmake. Started Visual Studio and changed the build configuration to 
Release/x64.

Only half of the projects build, the others fail with a linker error. I 
found three reasons for this:


1) cmake is apparently a little too smart when it locates boost 
libraries. I had changed my PATH to include the x64 boost libs instead 
of the x86 libs but still cmake was clever enough to find the x86 libs 
(perhaps because they were installed in the default installation path). 
I renamed the directory name of the x86 boost libs and set BOOST_ROOT to 
point to the x64 boost libs - not sure which one did the trick but it 
helped.


2) cmake generates project files that link EXEs with a /machine:X86 
flag. If you look in $RDBASE/build/CMakeCache.txt you can find this line 
which must be the one responsible:

 CMAKE_EXE_LINKER_FLAGS:STRING=' /STACK:1000 /machine:X86 '

I could not really figure out where this came from. Perhaps cmake just 
adds it by default when no machine type is explicitly specified (?). But 
I had cygwin installed, so I did this (my $RDBASE is C:\RDKit_2012_09_1\):

 cd /cygdrive/C/RDKit_2012_09_1/build/
 find . -name "*.vcxproj" -type f -exec sed -i "s/machine:X86/machine:X64/" '{}' \;
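
For machines without cygwin, the same rewrite can be sketched in Python. This is a rough equivalent under stated assumptions: `retarget_projects` is a hypothetical helper, it replaces every occurrence in a file (the sed command above replaces the first occurrence on each line), and it works on raw bytes so line endings and any byte-order mark are left untouched.

```python
from pathlib import Path


def retarget_projects(build_dir, old=b"machine:X86", new=b"machine:X64"):
    """Rewrite the linker machine flag in every .vcxproj under build_dir.

    Returns the list of project files that were actually modified.
    """
    changed = []
    for project in sorted(Path(build_dir).rglob("*.vcxproj")):
        if not project.is_file():
            continue
        data = project.read_bytes()
        if old in data:
            project.write_bytes(data.replace(old, new))
            changed.append(project)
    return changed


if __name__ == "__main__":
    import sys
    # Pass the build directory as the only argument, e.g. the cmake build dir.
    for path in retarget_projects(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("patched", path)
```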


3) After the above changes an ALL_BUILD build still gives me 50 
succeeded, 31 failed. The failed ones have the linker error LNK2001: 
unresolved external symbol class boost::system::error_category. I did 
apply the patches Greg sent for the boost::thread_resource_error build 
problem 2012-11-18, but it needs a couple of additional fixes.

In $RDBASE/Code/GraphMol/CMakeLists.txt all the rdkit_test() 
expressions - 8 in total - need to link in ${Boost_SYSTEM_LIBRARY}, so e.g.

 rdkit_test(graphmolTest1 test1.cpp LINK_LIBRARIES GraphMol RDGeometryLib RDGeneral)

becomes

 rdkit_test(graphmolTest1 test1.cpp LINK_LIBRARIES ${Boost_SYSTEM_LIBRARY} GraphMol RDGeometryLib RDGeneral)

And that also needs to be fixed in all the CMakeLists.txt files in the 
following subdirectories of GraphMol/ :

ChemReactions
ChemTransforms
Depictor
DistGeomHelpers
FileParsers
Fingerprints
ForceFieldHelpers
FragCatalog
MolAlign
MolCatalog
MolChemicalFeatures
MolTransforms
ShapeHelpers
SLNParse
SmilesParse
SubGraphs
Substruct
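
Editing that many CMakeLists.txt files by hand is tedious, so the change described above could be scripted. The following is a minimal sketch, assuming every affected call has the form rdkit_test(... LINK_LIBRARIES ...) with no closing parenthesis before LINK_LIBRARIES; `add_boost_system` is a hypothetical helper and should be checked against a diff before trusting it on a real tree.

```python
import re

# Match 'rdkit_test(' up to its LINK_LIBRARIES keyword, unless the Boost
# system library already follows (which keeps the rewrite idempotent).
_RDKIT_TEST = re.compile(
    r"(rdkit_test\([^)]*?\bLINK_LIBRARIES\b)"
    r"(?!\s+\$\{Boost_SYSTEM_LIBRARY\})"
)


def add_boost_system(cmake_text):
    """Insert ${Boost_SYSTEM_LIBRARY} right after LINK_LIBRARIES in rdkit_test() calls."""
    return _RDKIT_TEST.sub(
        lambda match: match.group(1) + " ${Boost_SYSTEM_LIBRARY}", cmake_text
    )


if __name__ == "__main__":
    before = ("rdkit_test(graphmolTest1 test1.cpp LINK_LIBRARIES "
              "GraphMol RDGeometryLib RDGeneral)")
    print(add_boost_system(before))
```

Applying the function to the text of each CMakeLists.txt in the listed subdirectories, and writing the result back, would cover the whole list; running it twice leaves already patched lines alone.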


Once those things were fixed I got a 64-bit build of RDKit. And all 44 
tests pass :-).

The build generates tons of warnings, but they are mostly harmless ones 
about conversion from 'size_t' to 'unsigned int', possible loss of data.

Next: Cartridge building. Stay tuned...

Cheers
-- Jan



Re: [Rdkit-discuss] Yet another build question - boost::thread_resource_error not found when building 2012_09_1.

2012-11-16 Thread Jan Holst Jensen
Hi Riccardo,

On 2012-11-16 13:29, Riccardo Vianello wrote:
 Hi Jan,

 On Fri, Nov 16, 2012 at 11:54 AM, Jan Holst Jensen
 j...@biochemfusion.com wrote:
 [...]
 ../../lib/libSmilesParse.so.1.2012.09.1: undefined reference to
 `boost::thread_resource_error::~thread_resource_error()'
 collect2: ld returned 1 exit status
 make[2]: *** [Code/GraphMol/graphmolIterTest] Error 1
 make[1]: *** [Code/GraphMol/CMakeFiles/graphmolIterTest.dir/all] Error 2
 make: *** [all] Error 2
 I encountered a similar link problem in building rdkit on an oldish
 linux x86_64 machine with boost 1.39 (that, according to the cmake
 build files, is the minimum required version), but I couldn't yet find
 the time to investigate the details. For the time being I worked
 around it with the addition of

 find_package(Boost 1.39.0 COMPONENTS thread REQUIRED)

 in $RDBASE/CMakeLists.txt (after the assignment of
 Boost_ADDITIONAL_VERSIONS and before if(RDK_BUILD_PYTHON_WRAPPERS))
 and with the addition of ${Boost_THREAD_LIBRARY} to the GraphMol's
 LINK_LIBRARIES in $RDBASE/Code/GraphMol/CMakeLists.txt

Excellent! I added such a line to $RDBASE/CMakeLists.txt:

 if(MSVC)
   SET(Boost_ADDITIONAL_VERSIONS 1.48 1.48.0 1.45 1.45.0 1.44 1.44.0
       1.43 1.43.0 1.42 1.42.0 1.41 1.41.0 1.40 1.40.0)
 endif(MSVC)

+   find_package(Boost 1.40.0 COMPONENTS thread REQUIRED)

 if(RDK_BUILD_PYTHON_WRAPPERS)
   #---


and added ${Boost_THREAD_LIBRARY} after each occurrence of 
LINK_LIBRARIES in $RDBASE/Code/GraphMol/CMakeLists.txt. I now have a 
build of 2012_09_1 that passes all tests. Thanks :-) !!

Cheers
-- Jan



Re: [Rdkit-discuss] Yet another build question - boost::thread_resource_error not found when building 2012_09_1.

2012-11-16 Thread Jan Holst Jensen
Quick follow-up:

For the postgres cartridge I also needed to add this line to the Makefile:

 SHLIB_LINK += -pthread
+   SHLIB_LINK += -lboost_thread
 ifndef BOOSTHOME
   BOOSTHOME=/usr/local/include
 endif

Don't know if it is more appropriate to link boost_thread statically for 
the cartridge, but this works OK for me - on that machine at least.

Cheers
-- Jan


[Rdkit-discuss] RDKit pgSQL cartridge on 64-bit Windows, anyone ?

2012-11-13 Thread Jan Holst Jensen
Hi RDKitters,

Before I embark on this journey - has anyone else attempted compiling 
and running the RDKit pgSQL cartridge on 64-bit Windows ? Gotchas, 
success stories, and fiascos/war stories would be equally appreciated :-).

Cheers
-- Jan Holst Jensen



Re: [Rdkit-discuss] [Rdkit-devel] 2012.09 (Q3 2012) RDKit release

2012-10-21 Thread Jan Holst Jensen
On 2012-10-21 21:15, Gianluca Sforna wrote:
 On Sun, Oct 21, 2012 at 5:12 AM, Greg Landrum greg.land...@gmail.com wrote:
 I'm very happy to announce that the next version of the RDKit --
 2012.09 (a.k.a Q3 2012) -- is released.
 Hi Greg,
 sorry, I didn't notice it before, but trying to rebuild the RPMs I'm getting a
 test error:

 76/76 Test #76: pythonTestDirChem ***Failed   29.80 sec

 99% tests passed, 1 tests failed out of 76

 Total Test time (real) = 106.14 sec

 The following tests FAILED:
76 - pythonTestDirChem (Failed)
 Errors while running CTest
 make: *** [test] Error 8

 How do I find which of the tests in there is failing? Some expect a
 failure so I can't readily tell from the log which ones are
 problematic.


Hi Gianluca,

ctest --output-on-failure should give you a lot more information so 
you can pinpoint the cause. At least it has helped me in the past.

Cheers
-- Jan



Re: [Rdkit-discuss] C++ - Strange error message thrown by MolDataStreamToMol.

2012-09-15 Thread Jan Holst Jensen

On 2012-09-15 05:40, Greg Landrum wrote:

Hi Jan,

On Sat, Sep 15, 2012 at 12:47 AM, Jan Holst Jensen
j...@biochemfusion.com wrote:

I have now managed to produce a nice little self-contained DLL that uses
RDKit to do substructure mapping. It receives a query and a target molecule
as two MDL molfile strings. When the inputs are valid it behaves well and
returns the atom mapping to me. Yes :-).

Excellent!


However, when I give it an invalid MDL molfile as input, e.g. a text file
with regular prose, MolDataStreamToMol() throws a highly unexpected error
message.

This piece of my test code:

 ifstream query_file ("C:\\SubstructData\\results.txt");

 auto_ptr<RWMol> query (NULL);
 unsigned int line = 0;

 try {
   query.reset( MolDataStreamToMol(query_file, line, false, false, false) );
 } catch (std::exception &e) {
   cout << "STD-ERROR: " << e.what() << endl;
   return 1;
 } catch (...) {
   cout << "UNK-ERROR: Unknown exception." << endl;
   return 1;
 }

produces this output:

[00:13:24] CTAB version string invalid at line 4
STD-ERROR: Unknown exception

In MolFileParser.cpp line 1881 I can locate the reported error message:

 std::ostringstream errout;
 errout << "CTAB version string invalid at line " << line;
 if(strictParsing){
   if(res) delete res;
   throw FileParseException(errout.str());
 } else {
   BOOST_LOG(rdWarningLog) << errout.str() << std::endl;
 }

Since I run with strictParsing==false the BOOST_LOG(rdWarningLog) call must
be the one that emits the first line of output, and I assume that the boost
warning log does not throw an exception. I haven't been able to figure out
where the exception gets thrown, but that must be later in the code.

If I switch to strict parsing (last parameter in MolDataStreamToMol() set to
true) I don't see the boost log message anymore, only the single line
STD-ERROR: Unknown exception. So strict parsing actually yields less error
message context.

I hope the exception error message can be changed to something more
meaningful ?

I would guess that the exception being thrown is a FileParseException.
These don't do anything to override the what() method, so you get the
unknown exception text. I will fix that.
Here's the code from the python wrapper for constructing a molecule
from a mol block
($RDBASE/Code/GraphMol/Wrap/rdmolfiles.cpp:MolFromMolFile()). It
catches the exceptions that are likely to arise and returns a NULL if
something fails. This might be useful:
 RWMol *newM=0;
 try {
   newM = MolFileToMol(molFilename, sanitize,removeHs,strictParsing);
 } catch (RDKit::BadFileException &e) {
   PyErr_SetString(PyExc_IOError,e.message());
   throw python::error_already_set();
 } catch (RDKit::FileParseException &e) {
   BOOST_LOG(rdWarningLog) << e.message() << std::endl;
 } catch (...) {

 }
 return static_cast<ROMol *>(newM);

-greg


Hi Greg,

Ah, so the actual message is in .message() instead of .what() - that 
explains it. I changed my test code to:


try {
  query.reset( MolDataStreamToMol(query_file, line, false, false, true) );
// new: RDKit-specific handlers that read message() instead of what()
} catch (RDKit::BadFileException &e) {
  cout << "RDK-ERROR: " << e.message() << endl;
  return 1;
} catch (RDKit::FileParseException &e) {
  cout << "RDK-ERROR: " << e.message() << endl;
  return 1;
} catch (std::exception &e) {
  cout << "STD-ERROR: " << e.what() << endl;
  return 1;
} catch (...) {
  cout << "UNK-ERROR: Unknown exception." << endl;
  return 1;
}


and now I get this output:

   RDK-ERROR: CTAB version string invalid at line 4


Problem solved for now - thanks :-). Having the actual message available 
in .what() as well in future will be nice though.
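
The difference between message() and what() discussed in this thread can be reproduced without RDKit. The sketch below is an illustration only: the two exception classes are hypothetical stand-ins, not the actual RDKit classes, and the helper textSeenByGenericHandler is invented for the example. The first class stores its text but leaves what() as inherited from std::exception, so a generic handler sees only the implementation's default text; the second overrides what() and the generic handler gets the real message.

```cpp
#include <exception>
#include <string>

// Hypothetical stand-in for an exception that only offers message();
// what() is inherited unchanged from std::exception.
class MessageOnlyError : public std::exception {
 public:
  explicit MessageOnlyError(const std::string &msg) : d_msg(msg) {}
  const char *message() const { return d_msg.c_str(); }

 private:
  std::string d_msg;
};

// Same data, but what() is overridden so generic handlers see the text too.
class DescriptiveError : public std::exception {
 public:
  explicit DescriptiveError(const std::string &msg) : d_msg(msg) {}
  const char *message() const { return d_msg.c_str(); }
  const char *what() const noexcept override { return d_msg.c_str(); }

 private:
  std::string d_msg;
};

// Throws an E carrying `msg` and returns the text that a generic
// `catch (const std::exception &)` handler obtains from what().
template <typename E>
std::string textSeenByGenericHandler(const std::string &msg) {
  try {
    throw E(msg);
  } catch (const std::exception &e) {
    return e.what();
  }
}
```

With MessageOnlyError the returned text is whatever the standard library uses as its default (the exact wording is implementation-defined), while DescriptiveError returns the message that was passed in.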


Cheers
-- Jan


[Rdkit-discuss] C++ - Strange error message thrown by MolDataStreamToMol.

2012-09-14 Thread Jan Holst Jensen

Hi,

I have now managed to produce a nice little self-contained DLL that uses 
RDKit to do substructure mapping. It receives a query and a target 
molecule as two MDL molfile strings. When the inputs are valid it 
behaves well and returns the atom mapping to me. Yes :-).


However, when I give it an invalid MDL molfile as input, e.g. a text 
file with regular prose, MolDataStreamToMol() throws a highly unexpected 
error message.


This piece of my test code:

ifstream query_file ("C:\\SubstructData\\results.txt");

auto_ptr<RWMol> query (NULL);
unsigned int line = 0;

try {
  query.reset( MolDataStreamToMol(query_file, line, false, false, false) );
} catch (std::exception &e) {
  cout << "STD-ERROR: " << e.what() << endl;
  return 1;
} catch (...) {
  cout << "UNK-ERROR: Unknown exception." << endl;
  return 1;
}

produces this output:

   [00:13:24] CTAB version string invalid at line 4
   STD-ERROR: *Unknown exception*

In MolFileParser.cpp line 1881 I can locate the reported error message:

std::ostringstream errout;
errout << "CTAB version string invalid at line " << line;
if(strictParsing){
  if(res) delete res;
  throw FileParseException(errout.str());
} else {
  BOOST_LOG(rdWarningLog) << errout.str() << std::endl;
}

Since I run with strictParsing==false the BOOST_LOG(rdWarningLog) call 
must be the one that emits the first line of output, and I assume that 
the boost warning log does not throw an exception. I haven't been able 
to figure out where the exception gets thrown, but that must be later in 
the code.


If I switch to strict parsing (last parameter in MolDataStreamToMol() 
set to true) I don't see the boost log message anymore, only the single 
line STD-ERROR: Unknown exception. So strict parsing actually yields 
less error message context.


I hope the exception error message can be changed to something more 
meaningful ?


Cheers
-- Jan


[Rdkit-discuss] Windows build - build succeeds, half the tests fail with SEGFAULT.

2012-09-12 Thread Jan Holst Jensen
Hi,

I have attempted a build of RDKit_2012_06_1 on Windows. The build seems 
to go fine, but I get a SEGFAULT in half the tests when I run the test 
suite. Below follows what I have done. Any clues appreciated.


Followed http://code.google.com/p/rdkit/wiki/BuildingOnWindows on a 
Windows 2008 R2 x64 server with VS2010 installed. Note: This will be a 
C++-only build, no Python installed.

Install binary boost 1.51 - for VS2010 only, select all library types, 
install into C:\boost\ (delete the Program Files (x86) path bit).
Install cmake - all defaults chosen (does not add cmake to PATH).
Install Cygwin - add 'flex' and 'bison' to package selection, otherwise 
use defaults.

Create a user env. var; 
PATH=C:\RDKit_2012_06_1\lib;C:\boost\boost_1_51\lib;C:\cygwin\bin

Start cmake GUI.
Point it to C:\RDKit_2012_06_1\ for source code location and 
C:\RDKit_2012_06_1\build\ for output location.
Click Configure
 Select your compiler - Visual Studio 10, Use default native 
compilers.
This will produce a handful of red error entries.
 Uncheck RDK_BUILD_PYTHON_WRAPPERS.
Click Configure again.
 Still red entries, but apparently this is some caching done by cmake ?
Click Configure again.
 We now have all-white entries shown. OK.
Click Generate.
Close cmake GUI.

Start Visual Studio 2010.
Open the RDKit.sln solution in C:\RDKit_2012_06_1\build\.
Use the Configuration Manager to change the build to a Release Win32 
build. It will be set per default to Debug Win32 by cmake.
Build the ALL_BUILD target.
 Takes a while and generates warnings, but no errors. OK.
Build the INSTALL target. OK - no issues.

Test the build:
 Open a command prompt.
 cd to C:\RDKit_2012_06_1\build\ and run ctest
 C:\RDKit_2012_06_1\build>path
 PATH=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\RDKit_2012_06_1\lib;C:\boost\boost_1_51\lib;C:\cygwin\bin
 C:\RDKit_2012_06_1\build>"c:\Program Files (x86)\CMake 2.8\bin\ctest.exe"

*

50% tests passed, 22 tests failed out of 44

Total Test time (real) =  31.89 sec

The following tests FAILED:
   2 - testDataStructs (SEGFAULT)
   4 - testGrid (SEGFAULT)
  13 - graphmolMolOpsTest (SEGFAULT)
  15 - graphmoltestChirality (SEGFAULT)
  16 - graphmoltestPickler (SEGFAULT)
  18 - testDepictor (SEGFAULT)
  21 - fileParsersTest1 (SEGFAULT)
  22 - testMolSupplier (SEGFAULT)
  23 - testMolWriter (SEGFAULT)
  24 - testTplParser (SEGFAULT)
  25 - testMol2ToMol (SEGFAULT)
  27 - testReaction (SEGFAULT)
  28 - testChemTransforms (SEGFAULT)
  31 - testFragCatalog (SEGFAULT)
  32 - testDescriptors (SEGFAULT)
  33 - testFingerprints (SEGFAULT)
  34 - testMolTransforms (SEGFAULT)
  35 - testForceFieldHelpers (SEGFAULT)
  36 - testDistGeomHelpers (SEGFAULT)
  37 - testMolAlign (SEGFAULT)
  38 - testFeatures (SEGFAULT)
  39 - testShapeHelpers (SEGFAULT)
Errors while running CTest
_

Any clues as to a common denominator for these failures ? I did check 
that the boost DLLs I have installed in C:\boost\boost_1_51\lib\ are x86 
(32-bit).

Cheers
-- Jan Holst Jensen



[Rdkit-discuss] C-interface to RDKit libraries on Windows ?

2012-09-11 Thread Jan Holst Jensen
Hi all,

I would like to use RDKit from a Pascal (Delphi) program on Windows and 
for that a DLL with a C-style interface would be easiest to work with. 
At least that is an approach that always works.

I downloaded RDKit_2012_06_1.win32.py27.zip but that only contains 
Python-specific binaries - am I correct here ? And there is no standard 
C-style interface to RDKit functionality ?

The option that seems most likely to succeed is then that I program what 
I need in C++; build that as a DLL, and expose my own API for the subset 
of functionality I want as C-functions. Not a problem at all, but I 
would like to avoid setting up the Windows build process if I can - 
unless a talk or tutorial at the upcoming UGM will show me that it is 
quite easy :-).
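
The approach described above (C++ inside, plain C functions at the DLL boundary) might look roughly like the following sketch. Everything in it is an assumption made for illustration: the exported name demo_count_matches, the error codes, and the stand-in countOccurrences, which only counts substring hits so that the example compiles without RDKit. The point is the shape of the wrapper: extern "C" linkage, only C types in the signature, and no C++ exception allowed to cross the boundary.

```cpp
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

// Placeholder for the real C++ work (for instance a substructure search).
// Here it merely counts occurrences of `query` in `target`.
static int countOccurrences(const std::string &query,
                            const std::string &target) {
  if (query.empty()) throw std::invalid_argument("empty query");
  int count = 0;
  std::string::size_type pos = target.find(query);
  while (pos != std::string::npos) {
    ++count;
    pos = target.find(query, pos + 1);
  }
  return count;
}

// C-callable entry point (hypothetical name and error codes).
// Returns the match count (>= 0), or a negative code on failure:
//   -1 null argument, -2 C++ exception (text copied to errbuf), -3 unknown.
extern "C" int demo_count_matches(const char *query, const char *target,
                                  char *errbuf, int errbuf_size) {
  try {
    if (!query || !target) return -1;
    return countOccurrences(query, target);
  } catch (const std::exception &e) {
    if (errbuf && errbuf_size > 0) {
      std::strncpy(errbuf, e.what(), errbuf_size - 1);
      errbuf[errbuf_size - 1] = '\0';
    }
    return -2;
  } catch (...) {
    return -3;
  }
}
```

When building this as a Windows DLL with MSVC, the function would additionally need to be exported, for example with __declspec(dllexport) or a .def file, and the calling convention has to match what the Delphi side declares.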

Any smarter alternatives to this ?

Cheers
-- Jan Holst Jensen



Re: [Rdkit-discuss] C-interface to RDKit libraries on Windows ?

2012-09-11 Thread Jan Holst Jensen
On 2012-09-11 16:13, Greg Landrum wrote:
 Hi,

 On Tue, Sep 11, 2012 at 2:50 PM, Jan Holst Jensen j...@biochemfusion.com 
 wrote:
 Hi all,

 I would like to use RDKit from a Pascal (Delphi) program on Windows and
 for that a DLL with a C-style interface would be easiest to work with.
 At least that is an approach that always works.

 I downloaded RDKit_2012_06_1.win32.py27.zip but that only contains
 Python-specific binaries - am I correct here ? And there is no standard
 C-style interface to RDKit functionality ?
 Correct: there's no standard C-style interface.
 The windows binaries really just include enough to use the python
 wrappers; they don't include DLLs.

 The option that seems most likely to succeed is then that I program what
 I need in C++; build that as a DLL, and expose my own API for the subset
 of functionality I want as C-functions. Not a problem at all, but I
 would like to avoid setting up the Windows build process if I can -
 unless a talk or tutorial at the upcoming UGM will show me that it is
 quite easy :-).
 There's nothing along those lines planned, but it's still pretty easy. :-)
 This wiki page has instructions for doing a windows build:
 http://code.google.com/p/rdkit/wiki/BuildingOnWindows
 Since you aren't interested in doing a Python build, you can ignore
 everything python-related. When you run the cmake GUI, just uncheck
 the RDK_BUILD_PYTHON_WRAPPERS variable after you run configure for the
 first time, then re-run configure.

 -greg


Hi Greg,

Thanks for the info. I'll give the Windows build a try :-).

Cheers
-- Jan



Re: [Rdkit-discuss] windows binary install

2012-08-09 Thread Jan Holst Jensen

On 2012-08-09 00:13, stanley5101 wrote:

Hi,
I'm trying to install the Windows binary of RDKit but having a 
problem.  After

from rdkit import Chem
I get
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    from rdkit import Chem
  File "C:\RDKit_2012_03_1\rdkit\Chem\__init__.py", line 18, in <module>
    from rdkit import rdBase
ImportError: DLL load failed: %1 is not a valid Win32 application.
I am using the Enthought Python distribution (EPD 7.0-2 32 bit python 
2.7.1) on Windows 7 64 bit.  I have the following environment variables:

PYTHONPATH %RDBASE%
RDBASE C:/RDKit_2012_03_1
Path ;%RDBASE%/lib; (amongst other things)
I'd be grateful for any suggestions as to what is going wrong and a 
fix.   Is the 64 bit OS an issue?

thanks,
Alan


Hi Alan,

It sounds like you are trying to load a 64-bit RDKit library in a 32-bit 
Python environment. Is that the case ? The RDKit bitness must be the 
same as the Python version you are using, which is 32-bit (EPD 7.0-2 
*32 bit python* 2.7.1).
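
One way to check the bitness of a binary directly is to look at its PE header: the 4-byte value at file offset 0x3C points to the "PE\0\0" signature, and the 2-byte Machine field that follows it is 0x014C for 32-bit x86 and 0x8664 for x64. The sketch below applies that to a byte buffer; the function names are made up for this example, and reading the file into the buffer is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads an n-byte little-endian unsigned value at offset `off`.
// Returns false if the buffer is too short.
static bool readLE(const std::vector<unsigned char> &b, std::size_t off,
                   std::size_t n, std::uint32_t &out) {
  if (off > b.size() || n > b.size() - off) return false;
  out = 0;
  for (std::size_t i = 0; i < n; ++i) {
    out |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
  }
  return true;
}

// Classifies the leading bytes of a Windows PE file (.dll, .pyd, .exe).
std::string peBitness(const std::vector<unsigned char> &bytes) {
  std::uint32_t peOffset = 0, signature = 0, machine = 0;
  if (bytes.size() < 2 || bytes[0] != 'M' || bytes[1] != 'Z') {
    return "not a PE file";
  }
  if (!readLE(bytes, 0x3C, 4, peOffset)) return "not a PE file";
  const std::size_t pe = static_cast<std::size_t>(peOffset);
  // "PE\0\0" read as a little-endian 32-bit value is 0x00004550.
  if (!readLE(bytes, pe, 4, signature) || signature != 0x00004550u) {
    return "not a PE file";
  }
  if (!readLE(bytes, pe + 4, 2, machine)) return "not a PE file";
  if (machine == 0x014C) return "32-bit (x86)";
  if (machine == 0x8664) return "64-bit (x64)";
  return "other";
}
```

Feeding it the first few hundred bytes of the extension module and of python.exe and comparing the two answers would show whether they match.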


Kind regards
-- Jan Holst Jensen


Re: [Rdkit-discuss] cartridge problems

2012-05-31 Thread Jan Holst Jensen
On 2012-05-31 07:52, Greg Landrum wrote:
 On Wed, May 30, 2012 at 4:50 PM, Greg Landrum <greg.land...@gmail.com> wrote:
 On Wed, May 30, 2012 at 4:13 PM, Jan Holst Jensen <j...@biochemfusion.com> wrote:
 My failing Linux Mint is 32-bit like George's 12.04. Don't know if it is
 significant but it could be that the problem only occurs on 32-bit.

 Greg mentioned that he has successfully built and tested on Ubuntu 12.04 -
 was that 64-bit or 32-bit ?
 64bit. I'll try it on a 32bit VM tonight.
 This morning I installed the rdkit on a clean 32bit Ubuntu 12.04 VM.
 While installing the cartridge, I was able to reproduce the problems
 you guys observed. I was also able to find the problem and a
 provisional fix for it. If you edit guc.c and replace the calls to
 DefineCustomRealVariable() with the following:

Hi Greg,

Did the changes to guc.c on my previously-failing Linux Mint VM. Did a 
make clean; make; sudo make install and lo and behold:

postgres=# create extension rdkit;
CREATE EXTENSION
postgres=#

And there was much joy in 32-bit land :-). Many thanks :-).

I think we can then remove the "Linux Mint behaves badly with rdkit" 
observation from the books.

Cheers
-- Jan



Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread Jan Holst Jensen

On 2012-05-30 11:28, George Papadatos wrote:

Hi Jan,

I followed your advice and I added the new repo, however the problem 
still persists:


[...]



Again, _all_ the tests fail as do the create extension attempts.

I even tried explicit postgresql-9.1 and postgresql-9.2 (beta version) 
but with the same sad results.


Do I do something wrong here, like still installing the default 
postgresql packages and not the good ones?


Hi George,

Sorry for leading you on a wild goose chase here, but I wasn't sure 
whether you were using custom postgres packages or the standard Ubuntu 
package. As Adrian Schreyer has pointed out, if you are using the 
Ubuntu-supplied packages then you are already using Martin Pitt's build 
so it actually shouldn't make a difference.



 On 12.04 you do not need to add the PPA, PostgreSQL 9.1 is the
 official package there. In addition, the official packages are also
 provided by Martin Pitt, so the packages in the PPA and in the distro
 are actually the same. It only make sense on 10.04 (and those that do
 not ship with 9.1).


I would go with Adrian's suggestion to check if the PG_VERSION_NUM is 
somehow mis-reported on your system. In fact, I will see if I still have 
the Linux Mint VM (that also has this behavior) somewhere and check it 
on that system too.


Cheers
-- Jan


Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread Jan Holst Jensen
On 2012-05-30 13:24, George Papadatos wrote:
 Thanks to both of you.

 I do not know how to check for the PG_VERSION_NUM.

Hi George,

I think the snippets below should do it. It shows what my Linux Mint 
machine thinks. It seems to be reporting the correct version though, 
even though it fails to load the cartridge (I have postgresql 9.1.3 
installed from the standard Ubuntu 11.10 repository).

jhje@linux-dev-x86 ~ $ cat test.c
#include "postgres.h"

int main() {
   int version_no = PG_VERSION_NUM;
   printf("%d\n", version_no);
   return 0;
}
jhje@linux-dev-x86 ~ $ cat make.sh
PG_INCLUDE_DIR=`pg_config | grep "INCLUDEDIR-SERVER = " | sed "s/INCLUDEDIR-SERVER = //"`
echo "Include dir found via 'pg_config' = $PG_INCLUDE_DIR"

gcc test.c -I$PG_INCLUDE_DIR

jhje@linux-dev-x86 ~ $ . make.sh; ./a.out
Include dir found via 'pg_config' = /usr/include/postgresql/9.1/server
90103
jhje@linux-dev-x86 ~ $
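
For reading the printed number: in the 9.x series PG_VERSION_NUM is composed as major * 10000 + minor * 100 + patch, so 90103 corresponds to 9.1.3. The small sketch below decodes that scheme and mirrors the ">= 90100" comparison used in guc.c. The struct and function names are invented for this example and are not part of PostgreSQL.

```cpp
// Parts of a 9.x-style PostgreSQL version number (hypothetical helper type).
struct PgVersion {
  int majorPart;
  int minorPart;
  int patchPart;
};

// Decodes MAJOR*10000 + MINOR*100 + PATCH, e.g. 90103 -> 9, 1, 3.
PgVersion decodePgVersionNum(int versionNum) {
  PgVersion v;
  v.majorPart = versionNum / 10000;
  v.minorPart = (versionNum / 100) % 100;
  v.patchPart = versionNum % 100;
  return v;
}

// True if versionNum is at least majorPart.minorPart.0, the same kind of
// test as "#if PG_VERSION_NUM >= 90100" performs at compile time.
bool isAtLeast(int versionNum, int majorPart, int minorPart) {
  return versionNum >= majorPart * 10000 + minorPart * 100;
}
```

So a header reporting 90103 passes a 9.1 check, which matches the observation that the version itself is reported correctly on this machine.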

Cheers
-- Jan



Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread Jan Holst Jensen
On 2012-05-30 13:24, George Papadatos wrote:
 Thanks to both of you.

 I do not know how to check for the PG_VERSION_NUM. I tried to edit to 
 guc.c by removing the conditional check of the PG_VERSION but with the 
 same results:
 Adrian, is this what you meant?

   DefineCustomRealVariable(
     "rdkit.tanimoto_threshold",
     "Lower threshold of Tanimoto similarity",
     "Molecules with similarity lower than threshold are not similar by % operation",
     &rdkit_tanimoto_smlar_limit,
     0.5,
     0.0,
     1.0,
     PGC_USERSET,
     0,
     (GucRealCheckHook)TanimotoLimitAssign,
     NULL,
     NULL
   );

   DefineCustomRealVariable(
     "rdkit.dice_threshold",
     "Lower threshold of Dice similarity",
     "Molecules with similarity lower than threshold are not similar by # operation",
     &rdkit_dice_smlar_limit,
     0.5,
     0.0,
     1.0,
     PGC_USERSET,
     0,
     (GucRealCheckHook)DiceLimitAssign,
     NULL,
     NULL
   );

 Regards,

 George

Hi George,

I just tried the same on my VM, with no change for the better either. My 
version of guc.c now looks like this:

static void
initRDKitGUC()
{
   if (rdkit_guc_inited)
 return;

   DefineCustomRealVariable(
     "rdkit.tanimoto_threshold",
     "Lower threshold of Tanimoto similarity",
     "Molecules with similarity lower than threshold are not similar by % operation",
     &rdkit_tanimoto_smlar_limit,
     0.5,
     0.0,
     1.0,
     PGC_USERSET,
     0,
//if PG_VERSION_NUM >= 90100
     (GucRealCheckHook)TanimotoLimitAssign,
     NULL,
//else
//   TanimotoLimitAssign,
//endif
     NULL
   );

   DefineCustomRealVariable(
     "rdkit.dice_threshold",
     "Lower threshold of Dice similarity",
     "Molecules with similarity lower than threshold are not similar by # operation",
     &rdkit_dice_smlar_limit,
     0.5,
     0.0,
     1.0,
     PGC_USERSET,
     0,
//if PG_VERSION_NUM >= 90100
     (GucRealCheckHook)DiceLimitAssign,
     NULL,
//else
//   DiceLimitAssign,
//endif
     NULL
   );

   rdkit_guc_inited = true;
}

Did a cartridge make clean, make, sudo make install, and it still 
fails for me with

postgres=# create extension rdkit;
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
The connection to the server was lost. Attempting reset: Succeeded.
postgres=#

Cheers
-- Jan



Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread Jan Holst Jensen
How odd. Adrian, are you using a 32-bit or 64-bit version of Ubuntu 12.04 ?

Cheers
-- Jan

On 2012-05-30 15:26, Adrian Schreyer wrote:
 Yes, I could build and install the cartridge without problems
 (Release_2012.03.1) on 12.04.

 On Wed, May 30, 2012 at 2:23 PM, George Papadatos <gpapada...@gmail.com> wrote:
 Hi Jan,

 Mine is exactly the same:
 gcc test.c -I/usr/include/postgresql/9.1/server;./a.out
 90103

 So, I am back to square 1!

 I am starting to get a bit desperate here, has anyone ever successfully
 built the cartridge from the trunk on a plain Ubuntu 12.04?

 Many thanks for your help,

 George




Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread Jan Holst Jensen
My failing Linux Mint is 32-bit like George's 12.04. Don't know if it is 
significant but it could be that the problem only occurs on 32-bit.

Greg mentioned that he has successfully built and tested on Ubuntu 12.04 
- was that 64-bit or 32-bit ?

Cheers
-- Jan

On 2012-05-30 15:49, Adrian Schreyer wrote:
 64-bit, PostgreSQL packages are from the official archive. I simply do
 'make' followed by 'sudo make install' and then

 create schema rdkit;
 create extension rdkit with schema rdkit;

 and that's it.

 $ make installcheck
 /usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
 --inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
 --dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91
 sfpgist slfpgist fps
 (using postmaster on Unix socket, default port)
 == dropping database contrib_regression ==
 DROP DATABASE
 == creating database contrib_regression ==
 CREATE DATABASE
 ALTER DATABASE
 == running regression test queries==
 test rdkit-91 ... ok
 test props... ok
 test btree... ok
 test molgist  ... ok
 test bfpgist-91   ... ok
 test sfpgist  ... ok
 test slfpgist ... ok
 test fps  ... ok

 =
   All 8 tests passed.
 =

 On Wed, May 30, 2012 at 2:43 PM, Jan Holst Jensen <j...@biochemfusion.com> wrote:
 How odd. Adrian, are you using a 32-bit or 64-bit version of Ubuntu 12.04 ?

 Cheers
 -- Jan


 On 2012-05-30 15:26, Adrian Schreyer wrote:
 Yes, I could build and install the cartridge without problems
 (Release_2012.03.1) on 12.04.

 On Wed, May 30, 2012 at 2:23 PM, George Papadatos <gpapada...@gmail.com> wrote:
 Hi Jan,

 Mine is exactly the same:
 gcc test.c -I/usr/include/postgresql/9.1/server;./a.out
 90103

 So, I am back to square 1!

 I am starting to get a bit desperate here, has anyone ever successfully
 built the cartridge from the trunk on a plain Ubuntu 12.04?

 Many thanks for your help,

 George
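
For reference, the 90103 printed above looks like a PG_VERSION_NUM value. That is an assumption, since test.c itself is not shown in this message. For PostgreSQL releases before 10, that number packs major, minor and patch level as two decimal digits each, so a minimal Python sketch to decode it could be:

```python
def decode_pg_version_num(num):
    """Decode a pre-10 PG_VERSION_NUM such as 90103 into (major, minor, patch)."""
    major, rest = divmod(num, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


print(decode_pg_version_num(90103))  # (9, 1, 3)
```

Read this way, the value corresponds to server headers for 9.1.3, the same version that psql reports elsewhere in this thread.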





Re: [Rdkit-discuss] Molecule with no atoms, so is it valid?

2012-04-30 Thread Jan Holst Jensen
On 2012-04-30 13:41, Paul Emsley wrote:
 On 30/04/12 01:03, Eddie Cao wrote:
 Hi Andrew,

 I also prefer #2. #1 is not quite sensible because many readers like 
 MolFromSmiles will return None on failure and it will be hard to distinguish 
 bad input from an empty one if we choose to do #1. Semantically, in many 
 RDKit use cases, None and Empty Mol are as different as a webpage not found 
 (HTTP 404) and a blank web page.
 FWIW, I agree to #2 too.

 Paul.

My humble opinion would also be that #2 is the best option. In line with 
the analogy given by Eddie, treating an empty molecule as an error is 
equivalent to saying that a blank string is equal to NULL - which would 
be unfortunate, since there is a big semantic difference.

Also, imagine that you delete all atoms and bonds in a molecule in order 
to replace them. Should enforcing the semantics of "empty molecules are 
invalid" then cause your code to break half-way through the process?
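
The point can be sketched in a few lines of Python. ToyMol and parse below are hypothetical stand-ins written only for this illustration and are not the RDKit API; the sketch just shows why "no result" (None) and "a result with zero atoms" are worth keeping apart:

```python
class ToyMol:
    """Hypothetical stand-in for a molecule object; not the RDKit Mol API."""

    def __init__(self, atoms=()):
        self.atoms = list(atoms)

    def clear(self):
        # Deleting every atom leaves a valid, empty molecule.
        self.atoms = []

    def add_atom(self, symbol):
        self.atoms.append(symbol)


def parse(text):
    """Toy reader: None signals bad input, an empty ToyMol signals empty input."""
    if text is None or any(not tok.isalpha() for tok in text.split()):
        return None
    return ToyMol(text.split())


mol = parse("C C O")
mol.clear()                    # intermediate state: zero atoms, still a ToyMol
for symbol in ("N", "H", "H", "H"):
    mol.add_atom(symbol)       # rebuild in place

empty = parse("")              # empty input -> empty molecule
bad = parse("C 1 O")           # malformed input -> None
print(len(mol.atoms), empty is not None, bad is None)
```

If the empty state were treated as an error, the clear-then-rebuild sequence would fail right after clear(), which is the half-way breakage described above.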

Cheers
-- Jan Holst Jensen



Re: [Rdkit-discuss] Trouble install 2012_03_1's PgSQL cartridge in Postgres 9.1.

2012-04-26 Thread Jan Holst Jensen
On 2012-04-26 05:32, Greg Landrum wrote:
 I continue to be unable to reproduce these problems.

 Running on a 64bit Ubuntu 11.10 VM I installed the following packages
 using apt-get:
sudo apt-get install postgresql
sudo apt-get install postgresql-contrib postgresql-server-dev-all

 I build the cartridge:

 glandrum@ubuntu:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make

 Install it (notice the sudo):

 glandrum@ubuntu:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ sudo make install

 Create a postgresql account for my user:

 glandrum@ubuntu:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ sudo su postgres
 postgres@ubuntu:/home/glandrum/RDKit_2012_03_1/Code/PgSQL/rdkit$
 createuser glandrum
 Shall the new role be a superuser? (y/n) y

 And then run the tests:

 glandrum@ubuntu:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make installcheck
 /usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
 --inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
 --dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91
 sfpgist slfpgist fps
 (using postmaster on Unix socket, default port)
 == dropping database contrib_regression ==
 NOTICE:  database contrib_regression does not exist, skipping
 DROP DATABASE
 == creating database contrib_regression ==
 CREATE DATABASE
 ALTER DATABASE
 == running regression test queries==
 test rdkit-91 ... ok
 test props... ok
 test btree... ok
 test molgist  ... ok
 test bfpgist-91   ... ok
 test sfpgist  ... ok
 test slfpgist ... ok
 test fps  ... ok

 =
   All 8 tests passed.
 =

 Can you please check your configuration to make sure that the packages
 are all installed and the appropriate accounts are there?

 -greg


Dear Greg,

On my Ubuntu 10.04 I can build against and install into Postgres 9.0 
without any issues. I ran "createuser jhje" as postgres and then ran 
the test suite.

jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make installcheck
/opt/postgres/9.0/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress
 
--inputdir=. --psqldir=/opt/postgres/9.0/bin --dbname=contrib_regression 
rdkit-pre91 props btree molgist bfpgist-pre91 sfpgist slfpgist fps
(using postmaster on Unix socket, port 5432)
== dropping database contrib_regression ==
DROP DATABASE
== creating database contrib_regression ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-pre91  ... ok
test props... ok
test btree... ok
test molgist  ... ok
test bfpgist-pre91... ok
test sfpgist  ... ok
test slfpgist ... ok
test fps  ... ok

=
  All 8 tests passed.
=

jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$

After having installed the cartridge as the postgres user with psql < 
/opt/postgres/9.0/share/postgresql/contrib/rdkit.sql:

postgres@ubuntu-lynet-test:~$ psql
Password:
psql (9.0.4)
Type "help" for help.

postgres=# select mol_amw('');
  mol_amw
-
   58.124
(1 row)

postgres=# show rdkit.tanimoto_threshold;
  rdkit.tanimoto_threshold
--
  0.5
(1 row)

postgres=#

But against postgres 9.1 both the test suite and the attempt to run 
"create extension rdkit" fail with

FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5

On the Linux Mint VM I found that I didn't have the postgresql-contrib 
package installed. I added it but it didn't make a difference.

Both of the VMs I have used are 32-bit and you are using a 64-bit VM. I 
wonder if that is significant. I don't have a 64-bit Linux VM ready for 
test building at the moment but let me see if I can establish one.

Kind regards
-- Jan Holst Jensen



Re: [Rdkit-discuss] Trouble install 2012_03_1's PgSQL cartridge in Postgres 9.1.

2012-04-26 Thread Jan Holst Jensen
On 2012-04-26 08:37, Gianluca Sforna wrote:
 On Thu, Apr 26, 2012 at 8:26 AM, Jan Holst Jensen <j...@biochemfusion.com>
 wrote:
 jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$

 After having installed the cartridge as the postgres user with psql
 /opt/postgres/9.0/share/postgresql/contrib/rdkit.sql:
 I am not able to dig details now, but IIRC there are two different
 cartridges, one for 8.x and another one for 9.x.
 Just to be sure, I'd double check you are installing the right one in there.



Dear Gianluca,

The cartridge is indeed postgres-version specific. Which one is built 
depends on your postgres environment when you run "make". On one machine 
I have both postgres 9.0 and 9.1 installed, so if I set up the 9.0 
environment and run "make", it will build the 9.0 cartridge. And vice versa.

Like this (the third line in each block sources the version-specific 
postgres environment):


jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ export 
RDBASE=/home/jhje/RDKit_2012_03_1
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ export 
LD_LIBRARY_PATH=/home/jhje/RDKit_2012_03_1/lib/
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ . 
/opt/postgres/9.1/pg91-openscg.env
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make clean
rm -f rdkit.so   librdkit.a
rm -f rdkit_io.o mol_op.o bfp_op.o sfp_op.o rdkit_gist.o low_gist.o 
guc.o cache.o adapter.o
rm -f rdkit--3.1.sql
rm -rf results/ regression.diffs regression.out tmp_check/ log/
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make
cp rdkit.sql91.in rdkit--3.1.sql
gcc -I/usr/local/include -I/home/jhje/RDKit_2012_03_1/Code 
-DRDKITVER='004100'  -I. -I. 
-I/opt/postgres/9.1/include/postgresql/server 
-I/opt/postgres/9.1/include/postgresql/internal -D_GNU_SOURCE 
-I/usr/local/include/libxml2  -I/usr/local/openssl/include 
-I/usr/local/ldap-2.4.23/include -I/usr/local/include -fPIC -c -o 
rdkit_io.o rdkit_io.c
...


jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ export 
RDBASE=/home/jhje/RDKit_2012_03_1
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ export 
LD_LIBRARY_PATH=/home/jhje/RDKit_2012_03_1/lib/
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ . 
/opt/postgres/9.0/pg90-openscg.env
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make clean
rm -f rdkit.so   librdkit.a
rm -f rdkit.sql
rm -f rdkit_io.o mol_op.o bfp_op.o sfp_op.o rdkit_gist.o low_gist.o 
guc.o cache.o adapter.o
rm -rf results tmp_check log
rm -f regression.diffs regression.out regress.out run_check.out
jhje@ubuntu-lynet-test:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make
sed 's,MODULE_PATHNAME,$libdir/rdkit,g' rdkit.sql.in >rdkit.sql
gcc -I/usr/local/include -I/home/jhje/RDKit_2012_03_1/Code 
-DRDKITVER='004100'  -I. -I. 
-I/opt/postgres/9.0/include/postgresql/server 
-I/opt/postgres/9.0/include/postgresql/internal -D_GNU_SOURCE 
-I/usr/local/include/libxml2  -I/usr/local/include -fPIC -c -o 
rdkit_io.o rdkit_io.c
...


You also have to take care that, when you run "make install" as root, 
the environment is set up to point to the correct postgres version. 
You can verify this by checking the output of "make install" to see 
which directories the library and SQL scripts are installed into, 
e.g. here I am installing into postgres 9.0:

root@ubuntu-lynet-test:~# cd /home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit/
root@ubuntu-lynet-test:/home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit# . 
/opt/postgres/9.0/pg90-openscg.env
root@ubuntu-lynet-test:/home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit# make 
install
/bin/mkdir -p '/opt/postgres/9.0/lib/postgresql'
/bin/mkdir -p '/opt/postgres/9.0/share/postgresql/contrib'
/bin/sh 
/opt/postgres/9.0/lib/postgresql/pgxs/src/makefiles/../../config/install-sh 
-c -m 755  rdkit.so '/opt/postgres/9.0/lib/postgresql/rdkit.so'
/bin/sh 
/opt/postgres/9.0/lib/postgresql/pgxs/src/makefiles/../../config/install-sh 
-c -m 644 ./uninstall_rdkit.sql '/opt/postgres/9.0/share/postgresql/contrib'
/bin/sh 
/opt/postgres/9.0/lib/postgresql/pgxs/src/makefiles/../../config/install-sh 
-c -m 644 rdkit.sql '/opt/postgres/9.0/share/postgresql/contrib'
root@ubuntu-lynet-test:/home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit#

So yes, it can certainly get mixed up, but if you only have a single 
postgres version installed and you build and test on that same machine 
this should not be an issue.
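
The check described above (reading the install destinations off the "make install" output) could be automated roughly as follows. This is a sketch only: the function names are made up for this example, and it assumes the destinations appear in single quotes with a version-numbered path component, as in the output shown here.

```python
import re


def install_targets(make_output):
    """Return the set of single-quoted absolute paths found in `make install` output."""
    return set(re.findall(r"'(/[^']+)'", make_output))


def postgres_versions(paths):
    """Collect path components that look like a version, e.g. 9.0 or 9.1.

    Assumes a directory layout like /opt/postgres/9.0/... as shown above.
    """
    versions = set()
    for path in paths:
        versions.update(re.findall(r"/(\d+\.\d+)(?=/)", path))
    return versions


sample = """\
/bin/mkdir -p '/opt/postgres/9.0/lib/postgresql'
/bin/mkdir -p '/opt/postgres/9.0/share/postgresql/contrib'
-c -m 755  rdkit.so '/opt/postgres/9.0/lib/postgresql/rdkit.so'
"""
print(postgres_versions(install_targets(sample)))
```

More than one version in the result, or a version other than the one you built against, would point to a mixed-up environment.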

I haven't seen any differences in behavior between the 8.4 and 9.0 
series. I think the build script only changes when you go to 9.1 (?). I 
have had the cartridge running fine in postgres 9.0 for a while now, and I 
also recently got it installed on postgres 8.4.5 in an emulated 
Raspberry Pi (ARM architecture) without any issues.

Kind regards
-- Jan


[Rdkit-discuss] Trouble installing 2012_03_1's PgSQL cartridge in Postgres 9.1. [Follow-up #2- on 64-bit Ubuntu 10.04.2]

2012-04-26 Thread Jan Holst Jensen
Dear Greg,

I can again reproduce the Postgres 9.1 issue, this time in a brand new 
64-bit VM.

Ubuntu 10.04.2 x86_64 server installed from scratch. 
gnome-desktop-environment installed, VMware tools installed.


jhje@ubuntu-x64-build:~$ cat /etc/issue
Ubuntu 10.04.2 LTS \n \l

jhje@ubuntu-x64-build:~$ uname -a
Linux ubuntu-x64-build 2.6.32-28-server #55-Ubuntu SMP Mon Jan 10 
23:57:16 UTC 2011 x86_64 GNU/Linux
jhje@ubuntu-x64-build:~$


Packages build-essential, bison, flex, boost libraries, numpy, cmake 
installed.

RDKit_2012_03_1 builds (after the needed one-line patch to Wrap.h) and 
all tests pass.

Installed OpenSCG postgresql 9.1.3 package (the latest postgres packaged 
for Ubuntu 10.04.2 is Pg 8.4, so not interesting).

Set trust authentication for local connections in pg_hba.conf so rdkit 
postgres test suite can run without prompting for password.


postgres@ubuntu-x64-build:~$ createuser jhje
Shall the new role be a superuser? (y/n) y
postgres@ubuntu-x64-build:~$

jhje@ubuntu-x64-build:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ . 
/opt/postgres/9.1/pg91-openscg.env
jhje@ubuntu-x64-build:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make
... builds with the usual 3-4 warnings ...
jhje@ubuntu-x64-build:~/RDKit_2012_03_1/Code/PgSQL/rdkit$

jhje@ubuntu-x64-build:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ sudo su -
[sudo] password for jhje:
root@ubuntu-x64-build:~# cd /home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit/
root@ubuntu-x64-build:/home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit# . 
/opt/postgres/9.1/pg91-openscg.env
root@ubuntu-x64-build:/home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit# make 
install
/bin/mkdir -p '/opt/postgres/9.1/lib/postgresql'
/bin/mkdir -p '/opt/postgres/9.1/share/postgresql/extension'
/bin/sh 
/opt/postgres/9.1/lib/postgresql/pgxs/src/makefiles/../../config/install-sh 
-c -m 755  rdkit.so '/opt/postgres/9.1/lib/postgresql/rdkit.so'
/bin/sh 
/opt/postgres/9.1/lib/postgresql/pgxs/src/makefiles/../../config/install-sh 
-c -m 644 ./rdkit.control '/opt/postgres/9.1/share/postgresql/extension/'
/bin/sh 
/opt/postgres/9.1/lib/postgresql/pgxs/src/makefiles/../../config/install-sh 
-c -m 644 ./rdkit--3.1.sql  '/opt/postgres/9.1/share/postgresql/extension/'
root@ubuntu-x64-build:/home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit#

jhje@ubuntu-x64-build:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ make installcheck
/opt/postgres/9.1/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress
 
--inputdir=. --psqldir='/opt/postgres/9.1/bin'   
--dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91 
sfpgist slfpgist fps
(using postmaster on Unix socket, port 5432)
== dropping database contrib_regression ==
NOTICE:  database contrib_regression does not exist, skipping
DROP DATABASE
== creating database contrib_regression ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... FAILED (test process exited with exit 
code 2)
test props... FAILED
test btree... FAILED
test molgist  ... FAILED
test bfpgist-91   ... FAILED
test sfpgist  ... FAILED
test slfpgist ... FAILED
test fps  ... FAILED

==
  8 of 8 tests failed.
==

The differences that caused some tests to fail can be viewed in the
file /home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit/regression.diffs.  A 
copy of the test summary that you see
above is saved in the file 
/home/jhje/RDKit_2012_03_1/Code/PgSQL/rdkit/regression.out.

make: *** [installcheck] Error 1
jhje@ubuntu-x64-build:~/RDKit_2012_03_1/Code/PgSQL/rdkit$


jhje@ubuntu-x64-build:~/RDKit_2012_03_1/Code/PgSQL/rdkit$ less 
regression.diffs
[... ...]
--- 4,9 
   --
   SET client_min_messages = warning;
   \set ECHO none
! FATAL:  failed to initialize rdkit.dice_threshold to 0.5
! FATAL:  failed to initialize rdkit.dice_threshold to 0.5
! connection to server was lost

OK, this time around it is the dice_threshold instead of the 
tanimoto_threshold that cannot be initialized, but still...
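
To compare runs like this one with the earlier reports, the failing settings can be pulled out of regression.diffs mechanically. A minimal Python sketch, assuming the FATAL lines keep the exact wording shown above (the function name is made up for this example):

```python
import re

# Matches diff lines such as "! FATAL:  failed to initialize rdkit.dice_threshold to 0.5".
FATAL_RE = re.compile(
    r"^\s*!\s*FATAL:\s+failed to initialize (\S+) to (\S+)", re.MULTILINE
)


def failed_settings(diff_text):
    """Return distinct (setting, value) pairs in order of first appearance."""
    seen = []
    for match in FATAL_RE.finditer(diff_text):
        pair = (match.group(1), match.group(2))
        if pair not in seen:
            seen.append(pair)
    return seen


sample = """\
  SET client_min_messages = warning;
! FATAL:  failed to initialize rdkit.dice_threshold to 0.5
! FATAL:  failed to initialize rdkit.dice_threshold to 0.5
! connection to server was lost
"""
print(failed_settings(sample))
```

On the excerpt above this reports rdkit.dice_threshold once; on the earlier 32-bit runs it would report rdkit.tanimoto_threshold instead.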

Hmmm... 'less' is reporting a linking issue, but only when the postgres 
environment has been loaded:

jhje@ubuntu-x64-build:~$ less
Missing filename (less --help for help)
jhje@ubuntu-x64-build:~$ . /opt/postgres/9.1/pg91-openscg.env
jhje@ubuntu-x64-build:~$ less
less: Symbol `ospeed' has different size in shared object, consider 
re-linking
Missing filename (less --help for help)
jhje@ubuntu-x64-build:~$ echo $LD_LIBRARY_PATH
/opt/postgres/9.1/lib:
jhje@ubuntu-x64-build:~$

Could a library conflict be what causes rdkit to abort ?
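
One way to chase that suspicion is to list which library names exist both in the directory the env file prepends and in the system directories. The Python sketch below is deliberately simplified (it ignores rpath/runpath and the ld.so cache), and the directory contents in it are hypothetical placeholders, not what is actually installed on this VM:

```python
def search_order(ld_library_path, default_dirs=("/lib", "/usr/lib")):
    """Directories in lookup order: LD_LIBRARY_PATH entries first, then defaults.

    An empty entry (for example from a trailing colon) is treated by the
    dynamic loader as the current directory, shown here as ".".
    """
    entries = []
    if ld_library_path:
        entries = [entry or "." for entry in ld_library_path.split(":")]
    return entries + list(default_dirs)


def shadowed(listing, order):
    """Map library name -> providing directories, for names found more than once."""
    found = {}
    for directory in order:
        for name in listing.get(directory, ()):
            found.setdefault(name, []).append(directory)
    return {name: dirs for name, dirs in found.items() if len(dirs) > 1}


order = search_order("/opt/postgres/9.1/lib:")
# Hypothetical directory contents, for illustration only.
listing = {
    "/opt/postgres/9.1/lib": ["libpq.so.5", "libexample.so.1"],
    "/usr/lib": ["libexample.so.1"],
}
print(order)
print(shadowed(listing, order))
```

Any name reported here is resolved from the first directory in the list, so a system program such as less may pick up a differently built copy once the environment file has been sourced. Note also that the trailing colon in the LD_LIBRARY_PATH shown above adds the current directory to the search path.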

Kind regards
-- Jan


Re: [Rdkit-discuss] Trouble install 2012_03_1's PgSQL cartridge in Postgres 9.1.

2012-04-25 Thread Jan Holst Jensen
Dear Greg,

 The same thing happens when I try to enable the 'rdkit' extension directly.

postgres@linux-dev-x86 ~ $ psql
psql (9.1.3)
Type "help" for help.

postgres=# create extension rdkit;
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
The connection to the server was lost. Attempting reset: Succeeded.
postgres=#

 This is an error message that's been reported before, but we never
 really did find a root cause.
 I will start from scratch on my VM with Ubuntu 11.10 and take careful
 notes as I'm doing the installation, then post those on the wiki. I
 don't have time to do it this morning, but I'll try and get to it
 tonight or tomorrow a.m.

 Best,
 -greg

Thanks.

Meanwhile, I got RDKit_2012_03_1 running fine in Postgres 9.0.4 (the 
OpenSCG version) on an Ubuntu 10.04 build server. I will try to install 
Postgres 9.1 on that server and see if the error can be reproduced with 
that setup.

Kind regards
-- Jan



[Rdkit-discuss] One-line patch for 2012_03_1 when compiling with boost 1.40.0 on Ubuntu 10.04.2.

2012-04-25 Thread Jan Holst Jensen

Dear Greg,

When compiling 2012_03_1 on Ubuntu 10.04.2 with boost 1.40.0 I get the 
following make error:


   [  4%] Building CXX object
   Code/RDBoost/CMakeFiles/RDBoost.dir/Wrap.cpp.o
   In file included from
   /home/jhje/RDKit_2012_03_1/Code/RDBoost/Wrap.cpp:16:
   /home/jhje/RDKit_2012_03_1/Code/RDBoost/Wrap.h: In function
   ‘std::vector<T, std::allocator<_CharT> >*
   pythonObjectToVect(boost::python::api::object, T)’:
   /home/jhje/RDKit_2012_03_1/Code/RDBoost/Wrap.h:108: error:
   ‘uint32_t’ is not a member of ‘boost’
   /home/jhje/RDKit_2012_03_1/Code/RDBoost/Wrap.h:108: error: expected
   ‘;’ before ‘v’
   /home/jhje/RDKit_2012_03_1/Code/RDBoost/Wrap.h:109: error:
   ‘v’ was not declared in this scope
   /home/jhje/RDKit_2012_03_1/Code/RDBoost/Wrap.h:112: error:
   ‘v’ was not declared in this scope
   make[2]: *** [Code/RDBoost/CMakeFiles/RDBoost.dir/Wrap.cpp.o] Error 1
   make[1]: *** [Code/RDBoost/CMakeFiles/RDBoost.dir/all] Error 2
   make: *** [all] Error 2


I found that this change to line 44 of Code/RDBoost/Wrap.h fixes the issue:

#ifndef RDKIT_WRAP_DECL
#define RDKIT_WRAP_DECL
#endif
#include <boost/python/suite/indexing/vector_indexing_suite.hpp>
   +   #include <boost/cstdint.hpp>
#include "list_indexing_suite.hpp"
#include <vector>

'make test' subsequently passes OK - all 76 tests. I guess the change 
should be harmless to implement in general?


Cheers
-- Jan Holst Jensen


Re: [Rdkit-discuss] Trouble install 2012_03_1's PgSQL cartridge in Postgres 9.1. [Follow-up - testing on Ubuntu 10.04]

2012-04-25 Thread Jan Holst Jensen
  Meanwhile, I got RDKit_2012_03_1 running fine in Postgres
  9.0.4 (the OpenSCG version) on an Ubuntu 10.04 build server.
  I will try to install Postgres 9.1 on that server and see
  if the error can be reproduced with that setup.

Dear Greg,

I have now installed Postgres 9.1 on the mentioned build server. 
Postgres 9.1 was installed via the DEB package from
http://www.openscg.com/se/postgresql/packages.jsp

I got RDKit_2012_03_1 compiled after implementing the one-line patch I 
sent a little earlier. The Postgres install error seems reproducible as 
I also get it here, which is a configuration quite different from the 
Linux Mint 12 developer VM I originally saw the issue on.

   postgres@ubuntu-lynet-test:~$ cat /etc/issue
   Ubuntu 10.04.2 LTS \n \l

   postgres@ubuntu-lynet-test:~$ uname -a
   Linux ubuntu-lynet-test 2.6.32-28-generic-pae #55-Ubuntu SMP Mon Jan 
10 22:34:08 UTC 2011 i686 GNU/Linux
   postgres@ubuntu-lynet-test:~$ psql
   Password:
   psql.bin (9.1.3)
   Type "help" for help.

   postgres=# create extension rdkit;
   FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
   FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
   The connection to the server was lost. Attempting reset: Succeeded.
   postgres=#


Cheers
-- Jan



[Rdkit-discuss] Trouble install 2012_03_1's PgSQL cartridge in Postgres 9.1.

2012-04-24 Thread Jan Holst Jensen
Hi,

I am having problems installing the Postgres cartridge in Postgres 9.1 on 
Linux Mint 12 (which is based on Ubuntu 11.10).

RDKit 2012_03_1 builds without any issues and all tests pass.

The PgSQL cartridge builds fine too, and installs fine as seen below.

   jhje@linux-dev-x86 ~/RDKit_2012_03_1/Code/PgSQL/rdkit $ sudo make install
   /bin/mkdir -p '/usr/lib/postgresql/9.1/lib'
   /bin/mkdir -p '/usr/share/postgresql/9.1/extension'
   /bin/sh 
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../config/install-sh 
-c -m 755  rdkit.so '/usr/lib/postgresql/9.1/lib/rdkit.so'
   /bin/sh 
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../config/install-sh 
-c -m 644 ./rdkit.control '/usr/share/postgresql/9.1/extension/'
   /bin/sh 
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../config/install-sh 
-c -m 644 ./rdkit--3.1.sql  '/usr/share/postgresql/9.1/extension/'
   jhje@linux-dev-x86 ~/RDKit_2012_03_1/Code/PgSQL/rdkit $

However, the "make installcheck" step fails. In regression.diffs I see:

   [... tons of lines ...]
   --- 4,9 
 --
 SET client_min_messages = warning;
 \set ECHO none
   ! FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
   ! FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
   ! connection to server was lost

The same thing happens when I try to enable the 'rdkit' extension directly.

   postgres@linux-dev-x86 ~ $ psql
   psql (9.1.3)
   Type "help" for help.

   postgres=# create extension rdkit;
   FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
   FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
   The connection to the server was lost. Attempting reset: Succeeded.
   postgres=#


/var/log/postgresql/postgres-9.1-main-log doesn't tell me anything extra:

   2012-04-24 18:23:32 CEST LOG:  invalid value for parameter 
rdkit.tanimoto_threshold: 0.5
   2012-04-24 18:23:32 CEST STATEMENT:  create extension rdkit;
   2012-04-24 18:23:32 CEST FATAL:  failed to initialize 
rdkit.tanimoto_threshold to 0.5
   2012-04-24 18:23:32 CEST STATEMENT:  create extension rdkit;

Being in Denmark, I suspected regional settings, but apparently it is 
not the issue (?):

   postgres=# select 7.0/3;
 ?column?
   
2.
   (1 row)

   postgres=# \q

The decimal separator used in Postgres is a period, not a comma as I feared.
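
For context, the failure mode I was worried about can be sketched with a toy parser in Python. This is purely illustrative: parse_real is a made-up function, not how Postgres or the cartridge actually parses settings, and the check above suggests the locale is apparently not the cause here.

```python
def parse_real(text, decimal_point="."):
    """Toy number parser that only accepts `decimal_point` as the separator.

    It mimics how a locale-aware parser could reject "0.5" when the active
    locale expects a decimal comma.
    """
    whole, sep, frac = text.partition(decimal_point)
    if not whole.isdigit() or (sep and not frac.isdigit()):
        raise ValueError(f"invalid value: {text!r}")
    return int(whole) + (int(frac) / 10 ** len(frac) if sep else 0)


print(parse_real("0.5"))
try:
    parse_real("0.5", decimal_point=",")
except ValueError as exc:
    print(exc)
```

With a period as separator the value parses to 0.5; with a comma expected, the same string is rejected, which is the kind of "invalid value for parameter" symptom that made regional settings a suspect.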

Any help appreciated :-).

Cheers
-- Jan Holst Jensen
