Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-16 Thread George Papadatos
Same here. I would also add the standardisation work done by Francis Atkinson 
at the EBI as an additional starting point. 

George. 

Sent from my giPhone

> On 16 Jan 2018, at 17:19, JP  wrote:
> 
> Joining the fray, +1 for MolVS
> 
>> On 16 January 2018 at 16:00, Brian Cole  wrote:
>> +1 to the MolVS project as well. 
>> 
>> Perhaps an easy bite-size project is to incorporate the open source mae 
>> parser code into core RDKit: https://github.com/schrodinger/maeparser
>> 
>> 
>>> On Mon, Jan 15, 2018 at 9:08 PM, Francois BERENGER 
>>>  wrote:
>>> On 01/16/2018 05:51 AM, Tim Dudgeon wrote:
>>> > Incorporating and "industrialising" Matt's MolVS tautomer and
>>> > standardizer code?
>>> > http://molvs.readthedocs.io/en/latest/index.html
>>> 
>>> If we can vote, I would vote for this one.
>>> 
>>> > On 15/01/18 07:09, Greg Landrum wrote:
>>> >> Dear all,
>>> >>
>>> >> We've been invited again to participate in the OpenChemistry
>>> >> application for Google Summer of Code.
>>> >>
>>> >> In order to participate we need ideas for projects and mentors to go
>>> >> along with them.
>>> >>
>>> >> The current list of RDKit ideas is being maintained here:
>>> >> http://wiki.openchemistry.org/GSoC_Ideas_2018#RDKit_Project_Ideas
>>> >>
>>> >> (Note: at the point that I'm pressing "send", that's still a copy of
>>> >> last year's project ideas).
>>> >>
>>> >> If you're willing to be a mentor (please ask me about the ~5
>>> >> hours/week required here) or have ideas, please reply to this thread.
>>> >>
>>> >> Best,
>>> >> -greg
>>> >>
>>> >>
>>> >> --
>>> >> Check out the vibrant tech community on one of the world's most
>>> >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> >>
>>> >>
>>> >> ___
>>> >> Rdkit-discuss mailing list
>>> >> Rdkit-discuss@lists.sourceforge.net
>>> >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>> >
>>> >
>>> >
>>> >
>>> 
>> 
>> 
>> 
> 


Re: [Rdkit-discuss] Default behavior of certain calls

2017-10-12 Thread George Papadatos
Great example of functools.partial. 
For those who like functional programming, it can also be used with map and 
imap when a function needs more than one parameter. 
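
A minimal sketch of that idea, written for Python 3 (where the built-in map is already lazy, as itertools.imap was in Python 2). The function `scale` and its arguments are hypothetical and chosen only for illustration:

```python
import functools


def scale(value, factor, offset=0):
    """Toy function with extra parameters; hypothetical, for illustration only."""
    return value * factor + offset


# Fix the extra parameters so that map() only has to supply `value`.
double_plus_one = functools.partial(scale, factor=2, offset=1)

results = list(map(double_plus_one, [1, 2, 3]))
print(results)  # [3, 5, 7]
```

The original callable stays reachable as `double_plus_one.func`, which is the same attribute used in the quoted IPython session to get back to the unmodified Chem.SDMolSupplier.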

George. 

Sent from my giPhone

> On 12 Oct 2017, at 19:04, Andy Jennings  wrote:
> 
> Hi Paolo,
> 
> That's outstanding - thanks very much.
> 
> Best,
> Andy
> 
>> On Thu, Oct 12, 2017 at 10:27 AM, Paolo Tosco  wrote:
>> Dear Andy,
>> 
>> you may accomplish that within the scope of a Python script using 
>> functools.partial:
>> 
>> In [1]: from rdkit import Chem
>> 
>> In [2]: import functools
>> 
>> In [3]: # redefine Chem.SDMolSupplier to include a custom default parameter
>> 
>> In [4]: Chem.SDMolSupplier = functools.partial(Chem.SDMolSupplier, removeHs=False)
>> 
>> In [5]: suppl = Chem.SDMolSupplier('/home/paolo/sdf/bilastine.sdf')
>> 
>> In [6]: # hydrogens have not been stripped
>> 
>> In [7]: suppl[0].GetNumAtoms()
>> Out[7]: 71
>> 
>> In [8]: # If you wish to invoke the original function with the original default parameter:
>> 
>> In [9]: suppl = Chem.SDMolSupplier.func('/home/paolo/sdf/bilastine.sdf')
>> 
>> In [10]: # hydrogens have been stripped as the original function was invoked
>> 
>> In [11]: suppl[0].GetNumAtoms()
>> Out[11]: 34
>> HTH, cheers
>> p.
>>> On 10/12/17 18:09, Andy Jennings wrote:
>>> Hi,
>>> 
>>> First off: great work on the RDKit - a great resource for those of us that 
>>> like to cook up our own solutions to problems.
>>> 
>>> The default behavior of certain calls (e.g. Chem.SDMolSupplier, 
>>> Chem.MolToSmiles) is the opposite of what I would generally want. For 
>>> instance, I might be processing docking files and want to keep those pesky 
>>> hydrogens, or I might want to keep the stereochemical information when I 
>>> dump a SMILES string.
>>> 
>>> I can understand why the current defaults might have been arrived at so I'm 
>>> not advocating the change in default behavior. Rather, I'm curious if one 
>>> could set the default behavior for an entire script (I write mostly 
>>> python). It may be lazy of me, but every so often I get caught out and 
>>> have to backtrack through a workflow.
>>> 
>>> Best,
>>> Andy
>>> 
>>> 
>> 
> 


Re: [Rdkit-discuss] RDKit-fingerprints set all bits for complex molecules?

2017-06-01 Thread George Papadatos
Example: https://www.surechembl.org/chemical/SCHEMBL1895
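
On the question in the quoted message of whether a fully set fingerprint is plausible at all, a rough back-of-the-envelope sketch may help. It assumes, as an idealisation and not as a description of RDKit's actual hashing, that every enumerated path sets a fixed number of bits chosen uniformly at random. The function name and the example path counts are hypothetical:

```python
def expected_bit_density(n_paths, fp_size=2048, bits_per_path=1):
    """Expected fraction of bits set if every path sets `bits_per_path`
    bits uniformly at random (an idealised model, not RDKit's real hash)."""
    p_bit_unset = (1.0 - 1.0 / fp_size) ** (n_paths * bits_per_path)
    return 1.0 - p_bit_unset


# Hypothetical path counts, from a small molecule to a very large one.
for n_paths in (100, 1_000, 10_000, 100_000):
    print(n_paths, round(expected_bit_density(n_paths), 4))
```

Under this model the density climbs towards 1 once the number of distinct paths is several times the fingerprint size, so very large molecules could plausibly come close to saturating a 1024- or 2048-bit fingerprint without any bug. Whether every single bit ends up set for the structures listed would still have to be checked against the actual molecules.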

George. 

Sent from my giPhone

> On 1 Jun 2017, at 17:05, Greg Landrum  wrote:
> 
> Hi Nils,
> 
> Can you please send me the SMILES for those structures (or point me to an 
> easy way to lookup a SCHEMBL id)?
> 
> I will take a look at these, but I don't currently have a convenient copy of 
> SCHEMBL.
> 
> -greg
> 
> 
> 
>> On Thu, Jun 1, 2017 at 4:28 PM, Nils Weskamp  wrote:
>> Dear RDKitters,
>> 
>> I just calculated RDKit "Daylight-like" fingerprints for a number of public 
>> compound databases and found quite a number of examples where the resulting 
>> fingerprints have *all* bits set to 1. This happens in both KNIME 3.2.1 
>> (1024/1/7) and also via the command line (2048/1/7/4) for RDKit 2016.03. 
>> 
>> Examples include (from SureChEMBL):
>> 
>> SCHEMBL5141968
>> SCHEMBL13916889
>> SCHEMBL16257315
>> SCHEMBL16257310
>> SCHEMBL16257297
>> SCHEMBL16257215
>> SCHEMBL16257169
>> SCHEMBL8232906
>> SCHEMBL16257312
>> SCHEMBL13011081
>> SCHEMBL12570100
>> SCHEMBL14524878
>> SCHEMBL6370886
>> SCHEMBL15305169
>> SCHEMBL16912871
>> SCHEMBL13290179
>> 
>> Now, these are obviously some very large and complex molecules, so I would 
>> expect that they contain many features and thus set many bits - but all of 
>> them?
>> 
>> So, in short: Are these compounds so ugly that it is normal for the 
>> fingerprints to have all bits set or are they so ugly that they trigger some 
>> rare bug in RDKit?
>> 
>> Any ideas / suggestions / comments?
>> 
>> Thanks a lot,
>> Nils
>> 
>> 
> 


Re: [Rdkit-discuss] substructure of a fingerprint position

2017-02-01 Thread George Papadatos
https://iwatobipen.wordpress.com/2017/01/08/get-bit-information-with-rdkit/
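
A minimal sketch of one way to do this with RDKit's bitInfo argument; it may or may not match the linked post in detail. The helper name `bit_substructures` and the example molecule are assumptions made for illustration. bitInfo records, for each bit that is set, the (central atom, radius) pairs that produced it, and the atom environment can then be cut out of the molecule:

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def bit_substructures(mol, radius=2, n_bits=2048):
    """Map each set Morgan bit to the fragment SMILES of its environments."""
    info = {}
    AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits, bitInfo=info)
    result = {}
    for bit, environments in info.items():
        fragments = set()
        for atom_idx, env_radius in environments:
            if env_radius == 0:
                # A radius-0 environment is just the central atom itself.
                fragments.add(mol.GetAtomWithIdx(atom_idx).GetSymbol())
                continue
            # Bond indices of the circular environment around the atom.
            bond_ids = Chem.FindAtomEnvironmentOfRadiusN(mol, env_radius, atom_idx)
            submol = Chem.PathToSubmol(mol, bond_ids)
            fragments.add(Chem.MolToSmiles(submol))
        result[bit] = sorted(fragments)
    return result


mol = Chem.MolFromSmiles("c1ccccc1O")  # phenol, an arbitrary example
for bit, fragments in sorted(bit_substructures(mol).items()):
    print(bit, fragments)
```

With a hashed (folded) fingerprint several different environments can collide on the same bit, which is why each bit maps to a list of fragments here.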

George. 

Sent from my giPhone

> On 26 Jan 2017, at 11:02, Gonzalo Colmenarejo  
> wrote:
> 
> Hi,
> 
> is there a way in RDKit to retrieve the substructure(s) corresponding to a 
> (hashed or unhashed) Morgan fingerprint position? 
> 
> Thanks a lot in advance
> 
> Gonzalo


Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread George Papadatos
:)

George. 

Sent from my giPhone

> On 2 Dec 2016, at 22:11, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
> 
>> On 12/02/2016 03:12 PM, George Papadatos wrote:
>> Here's a pragmatic idea:
>> ... would it not be safe to
>> assume that *any* word containing more than 4 'C' or 'c' characters would
>> only be a SMILES string?
> 
> pneumonoultramicroscopicsilicovolcanoconiosis
> 
> 
> -- 
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
> 



Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread George Papadatos
Here's a pragmatic idea:

If Alexis wants to search for valid SMILES strings representing typical
*organic* molecules among text of plain English words, would it not be safe to
assume that *any* word containing more than 4 'C' or 'c' characters would
only be a SMILES string?
This simple filter (word.lower().count('c')>=4) would quickly eliminate all
normal English words, leaving only SMILES to parse. No need for regexes,
unless you really care for ISIS or IOPS molecules. :)
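
A small sketch of that filter, using the expression exactly as written above (`>= 4`). The function name and the word list are assumptions for illustration; the words themselves are taken from elsewhere in this thread. It also shows where the heuristic goes wrong in both directions:

```python
def passes_c_filter(word, min_c=4):
    """Heuristic from the message above: keep words with at least
    `min_c` 'C' or 'c' characters as SMILES candidates."""
    return word.lower().count("c") >= min_c


words = [
    "CC12CCC3C(CCC4=CC(O)CCC34C)C1CCC2",  # SMILES from the quoted output: kept
    "submarine",                           # ordinary word: dropped
    "CC(=O)[O-]",                          # valid SMILES with only two C: dropped
    "pneumonoultramicroscopicsilicovolcanoconiosis",  # English word: kept
]
candidates = [w for w in words if passes_c_filter(w)]
print(candidates)
```

Words that pass would still need to go through a real SMILES parser, so the filter is only a cheap pre-screen and not a validity check.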

George

On 2 December 2016 at 19:36, Andrew Dalke  wrote:

> On Dec 2, 2016, at 11:11 AM, Greg Landrum wrote:
> > An initial start on some regexps that match SMILES is here:
> https://gist.github.com/lsauer/1312860/264ae813c2bd2c27a769d261c8c6b38da34e22fb
> >
> > that may also be useful
>
>
> I've put together a more gnarly regular expression to find possible SMILES
> strings. It's configured for at least 4 atom terms, but that's easy to
> change (there's a "{3,}" which can be changed as desired.)
>
> It follows the SMILES specification a bit more closely, which means
> there should be fewer false positives than the regular expression Greg
> pointed out.
>
> The file which constructs the regular expression, and an example driver,
> is attached. Here's what the output looks like:
>
>
>
>
> % python detect_smiles.py ~/talks/*.txt
> /Users/dalke/talks/ICCS_2014_paper.txt:528:532 'IOPS'
> /Users/dalke/talks/ICCS_2014_paper.txt:30150:30183
> 'CC12CCC3C(CCC4=CC(O)CCC34C)C1CCC2'
> /Users/dalke/talks/ICCS_2014_paper2.txt:3270:3274 'CBCC'
> /Users/dalke/talks/ICCS_2014_paper2.txt:10229:10239 'CC(=O)[O-]'
> /Users/dalke/talks/ICCS_2014_paper2.txt:32766:32770 'ISIS'
> /Users/dalke/talks/Sheffield2013.txt:25002:25013 'C1=CC=CC=C1'
> /Users/dalke/talks/Sheffield2013.txt:25039:25047 'c1c1'
> /Users/dalke/talks/Sheffield_2016.txt:2767:2771 'CBCC'
> /Users/dalke/talks/Sheffield_2016.txt:10295:10301 'O0'
> /Users/dalke/talks/Sheffield_2016_talk.txt:7302:7306 'CBCC'
> /Users/dalke/talks/Sheffield_2016_talk.txt:7564:7568 'CBCC'
> /Users/dalke/talks/Sheffield_2016_talk.txt:7716:7720 'CBCC'
> /Users/dalke/talks/Sheffield_2016_v2.txt:2874:2878 'soon'
> /Users/dalke/talks/Sheffield_2016_v2.txt:7312:7317 'O'
> /Users/dalke/talks/Sheffield_2016_v2.txt:22770:22774 'ICCS'
> /Users/dalke/talks/Sheffield_2016_v3.txt:2982:2986 'soon'
> /Users/dalke/talks/Sheffield_2016_v3.txt:7627:7632 'O'
> /Users/dalke/talks/Sheffield_2016_v3.txt:24546:24550 'ICCS'
> /Users/dalke/talks/tdd_part_2.txt:7547:7551 'scop'
>
> You can also modify the code for line-by-line processing rather than an
> entire block of text like I did.
>
>
> As others have pointed out, this is a well-trodden path. Follow their
> warnings and advice.
>
> Also, I didn't fully test it.
>
>
>
> Andrew
> da...@dalkescientific.com
>
>
> P.S.
>
> Here's the regular expression:
>
> (? term
>
> (
>
> (
> (
>  Cl? | # Cl and Br are part of the organic subset
>  Br? |
>  [NOSPFIbcnosp*] |  # as are these single-letter elements
>
>  # bracket atom
>  \[\d*  # optional atomic mass
>(# valid element names
> C[laroudsemf]? |
> Os?|N[eaibdpos]? |
> S[icernbmg]? |
> P[drmtboau]? |
> H[eofgas]? |
> c|n|o|s|p |
> A[lrsgutcm] |
> B[eraik]? |
> Dy|E[urs] |
> F[erm]? |
> G[aed] |
> I[nr]? |
> Kr? |
> L[iaur] |
> M[gnodt] |
> R[buhenaf] |
> T[icebmalh] |
> U|V|W|Xe |
> Yb?|Z[nr]
>)
>[^]]*   # ignore anything up to the ']'
> \]
> )
># allow 0 or more closures directly after any atom
> (
>   [-=#$/\\]?  # optional bond type
>   (
> [0-9] |# single digit closure
> (%[0-9][0-9])  # two digit closure
>   )
> ) *
> )
>
> (
>
> (
>  (
>   \( [-=#$/\\]?   # a '(', which can have an optional bond (no dot)
>  ) | (
>\)*   # any number of close parens, followed by
>(
>  ( \( [-=#$/\\]? ) |  # an open parens and optional bond (no dot)
>  [.-=#$/\\]?  # or a dot disconnect or bond
>)
>  )
> )
> ?
>
> (
> (
>  Cl? | # Cl and Br are part of the organic subset
>  Br? |
>  [NOSPFIbcnosp*] |  # as are these single-letter elements
>
>  # bracket atom
>  \[\d*  # optional atomic mass
>(# valid element names
> C[laroudsemf]? |
> Os?|N[eaibdpos]? |
> S[icernbmg]? |
> P[drmtboau]? |
> H[eofgas]? |
> c|n|o|s|p |
> A[lrsgutcm] |
> B[eraik]? |
> Dy|E[urs] |
> F[erm]? |
> G[aed] |
> I[nr]? |
> Kr? |
> L[iaur] |
> M[gnodt] |
> R[buhenaf] |
> T[icebmalh] |
> U|V|W|Xe |
> Yb?|Z[nr]
>)
>[^]]*   # ignore anything up to the ']'
> \]
> )
># allow 0 or more closures directly after any atom
> (
>   [-=#$/\\]?  # optional bond type
>   (
> [0-9] |# single digit closure
> (%[0-9][0-9])  # two digit closure
>   )
> ) *
> )
>
> 

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread George Papadatos
I think Alexis was referring to converting actual SMILES strings found in
random text. Chemical entity recognition and name to structure conversion
is another story altogether and nowadays one can quickly go a long way with
open tools such as OSCAR + OPSIN in KNIME or with something like this:
http://chemdataextractor.org/docs/intro
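
For the original task of pulling literal SMILES strings out of text, the suggestions in the quoted messages below (test only the unique words of each document, and remember the verdict for every word already tested) can be sketched roughly as follows. All names here are hypothetical, and the validator is a stand-in so that the sketch runs without RDKit:

```python
def find_smiles_per_document(documents, is_valid_smiles):
    """Map each document name to the set of words accepted by the validator.

    `documents` maps a name to its raw text; `is_valid_smiles` is any
    callable returning True or False for a single word.
    """
    cache = {}  # word -> verdict, shared across documents
    hits = {}
    for name, text in documents.items():
        hits[name] = set()
        for word in set(text.split()):  # unique words within this document
            if word not in cache:
                cache[word] = is_valid_smiles(word)
            if cache[word]:
                hits[name].add(word)
    return hits


# Stand-in validator (an assumption); with RDKit one might instead use
# something like: lambda w: Chem.MolFromSmiles(w) is not None
calls = []


def toy_validator(word):
    calls.append(word)
    return word in {"CCO", "c1ccccc1"}


docs = {
    "doc_a.txt": "ethanol CCO is ethanol",
    "doc_b.txt": "benzene c1ccccc1 is not CCO",
}
result = find_smiles_per_document(docs, toy_validator)
print(result)
print(len(calls))  # each distinct word is validated only once
```

Keeping one cache across documents while collecting hits per document preserves the link between each SMILES and its source file.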

George

On 2 December 2016 at 17:35, Brian Kelley  wrote:

> This was why they started using the dictionary lookup as I recall :). The
> iupac system they ended up using was Roger's when at OpenEye.
>
> 
> Brian Kelley
>
> On Dec 2, 2016, at 12:33 PM, Igor Filippov 
> wrote:
>
> I could be wrong but I believe IBM system had a preprocessing step which
> removed all known dictionary words - which would get rid of "submarine" etc.
> I also believe this problem has been solved multiple times in the past,
> NextMove software comes to mind, chemical tagger -
> http://chemicaltagger.ch.cam.ac.uk/, etc.
>
> my 2 cents,
> Igor
>
>
>
>
> On Fri, Dec 2, 2016 at 11:46 AM, Brian Kelley 
> wrote:
>
>> I hacked a version of RDKit's smiles parser to compute heavy atom count,
>> perhaps some version of this could be used to check smiles validity without
>> making the actual molecule.
>>
>> From a fun historical perspective:  IBM had an expert system to find
>> IUPAC names in documents.  They ended up finding things like "submarine"
>> which was amusing.  It turned out that just parsing all words with the
>> IUPAC parser was by far the fastest and best solution.  I expect the same
>> will be true for finding smiles.
>>
>> It would be interesting to put the common OCR errors into the parser as
>> well (l's and 1's are hard for instance).
>>
>>
>> On Fri, Dec 2, 2016 at 10:46 AM, Peter Gedeck 
>> wrote:
>>
>>> Hello Alexis,
>>>
>>> Depending on the size of your document, you could consider limiting the
>>> cache of already tested strings by word length and memoizing only shorter words.
>>> SMILES tend to be longer, so everything above a given number of characters
>>> has a higher probability of being a SMILES. Large words probably also
>>> contain a lot of chemical names. They often contain commas (,), so they are
>>> easy to remove quickly.
>>>
>>> Best,
>>>
>>> Peter
>>>
>>>
>>> On Fri, Dec 2, 2016 at 5:43 AM Alexis Parenty <
>>> alexis.parenty.h...@gmail.com> wrote:
>>>
 Dear Pavel And Greg,



 Thanks Greg for the regexps link. I’ll use that too.


 Pavel, I need to track which document each SMILES comes from,
 but I will indeed make a set of unique words for each document before
 looping. Thanks!

 Best,

 Alexis

 On 2 December 2016 at 11:21, Pavel  wrote:

 Hi, Alexis,

   if you do not need to track which document each SMILES comes from, you may just
 combine all words from all documents in a list, take only the unique words and
 test them. That way, you do not need to store and check valid/non-valid
 strings. That would reduce problem complexity as well.

 Pavel.
 On 12/02/2016 11:11 AM, Greg Landrum wrote:

 An initial start on some regexps that match SMILES is here:
 https://gist.github.com/lsauer/1312860/264ae813c2bd2c27a769d261c8c6b38da34e22fb

 that may also be useful

 On Fri, Dec 2, 2016 at 11:07 AM, Alexis Parenty <
 alexis.parenty.h...@gmail.com> wrote:

 Hi Markus,


 Yes! I might discover novel compounds that way!! It would be interesting
 to see what they look like…


 Good suggestion to also store the words that were correctly identified
 as SMILES. I’ll add that to the script.


 I also like your “distribution of word” idea. I could safely skip any
 words that occur more than 1% of the time and could try to play around with
 the threshold to find an optimum.


 I will try every suggestion and will time it to see what is best. I’ll
 keep everyone in the loop and will share the script and results.


 Thanks,


 Alexis

 On 2 December 2016 at 10:47, Markus Sitzmann  wrote:

 Hi Alexis,

 you may also find some "novel" compounds with this approach :-).

 Whether your tuple solution improves performance strongly depends on
 the content of your text documents and how often they repeat the same words
 again - but my guess would be it will help. Probably the best way is even
 to look at the distribution of words before you feed them to RDKit. You
 should also "memorize" those ones that successfully generated a structure,
 doesn't make sense to do it again, then.

 Markus

 On Fri, Dec 2, 2016 at 10:21 AM, Maciek Wójcikowski <
 mac...@wojcikowski.pl> wrote:

 Hi Alexis,

 You may want to filter with some regex 

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-12-01 Thread George Papadatos
Hi Stephen,

Further to Greg's excellent reply, see this paper on how InChI strings and
keys can be used in practice to map together tautomer (ones covered by
InChI at least), isotope, stereo and parent-salt variants.
http://rd.springer.com/article/10.1186/s13321-014-0043-5

Francis (cc'ed) has a nice notebook somewhere illustrating these nice InChI
splits to find these variants.
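
A rough sketch of what such a split can look like, using plain string handling. The helper name and the layer grouping are assumptions for illustration, not the method from the paper or the notebook. It relies only on the fact that InChI layers are separated by "/" and start with a one-letter prefix (b, t, m and s for stereo, i for isotopes):

```python
STEREO_LAYERS = "btms"  # double-bond, tetrahedral and stereo-type layers
ISOTOPE_LAYERS = "i"


def strip_inchi_layers(inchi, prefixes):
    """Drop every InChI layer whose first character is in `prefixes`.

    The result is meant as a grouping key, not as a valid InChI.
    """
    head, *layers = inchi.split("/")
    kept = [layer for layer in layers if layer[:1] not in prefixes]
    return "/".join([head] + kept)


l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
d_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

key_l = strip_inchi_layers(l_alanine, STEREO_LAYERS)
key_d = strip_inchi_layers(d_alanine, STEREO_LAYERS)
print(key_l)
print(key_l == key_d)  # the two enantiomers share a stereo-free key
```

Grouping two tables on such a key, while keeping the full InChI alongside it, flags rows that differ only in the stripped layers. Isotopic layers can carry stereo sub-layers of their own, so real data needs more care than this sketch takes.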

For educational purposes, there have been other approaches like the NCI's
identifiers - discussion here:
http://acscinf.org/docs/meetings/237nm/presentations/237nm17.pdf

For pure structure standardization using RDKit see here:
https://github.com/flatkinson/standardiser
and
https://github.com/mcs07/MolVS


Cheers,

George




On 29 November 2016 at 17:02, Greg Landrum  wrote:

> Wow, this is a great question and quite a fun thread.
>
> It's hard to really make much of a contribution here without writing a
> book/review article (something that I'm really not willing to do!), but I
> have a few thoughts. Most of this is repeating/rephrasing things others
> have already said.
>
> I'm going to propose some things as facts. I think that these won't be
> controversial:
> fact 1: if the structures are coming from different sources, they need to
> be standardized/normalized before you compare them. This is true regardless
> of how you want to compare them. The details of the standardization process
> are not incredibly important, but it does need to take care of the things
> you care about when comparing molecules. For example, if you don't care
> about differences between salts, it should strip salts. If you don't care
> about differences between tautomers, it should normalize tautomers.
> fact 2: The InChI algorithm includes a standardization step that
> normalizes some tautomers, but does not remove salts.
> fact 3: The InChI representation contains a number of layers defining the
> structure in increasing detail (this isn't strictly true, because some of
> the choices about how layers are ordered are arbitrary, but it's close).
> fact 4: canonicalization, the way I define it, produces a canonical atom
> numbering for a given structure, but it does *not* standardize
> fact 5: the RDKit has essentially no well-documented standardization code
>
> fact X: we don't have any standard, broadly accepted approach for
> standardization, canonicalization or representation that is fool-proof or
> that works for even all of organic chemistry, never mind organometallics.
> InChI, useful as it is for some things, completely fails to handle things
> like atropisomers (they are working on this kind of thing, but it's not out
> yet).
>
> Given all of this, if I wanted to have flexible duplicate checking *right*
> now, I think I would use the AvalonTools struchk functionality that the
> RDKit provides (the new pure-RDKit version still needs a bit more testing)
> to handle basic standardization and salt stripping and then produce a table
> that includes the InChI in a couple of different forms. I'd want to be able
> to recognize molecules that differ only by stereochemistry, molecules that
> differ only by location of tautomeric Hs, and molecules that differ only by
> the location of isotopic labels. You can do this with various clever splits
> of the InChI (how to do it is left as an exercise for the reader and/or a
> future RDKit blog post).
>
> I think there's something fun to be done here with SMILES variants,
> borrowing heavily from some of the things that Roger has written about:
> https://nextmovesoftware.com/blog/2013/04/25/finding-all-types-of-every-mer/
> here's a more recent application of that from Noel:
> https://nextmovesoftware.com/blog/2016/06/22/fishing-for-matched-series-in-a-sea-of-structure-representations/
>
> If I didn't really care about details and just wanted something that I
> could explain easily to others, I'd skip all the complication and just use
> InChIs (or InChI keys) to recognize duplicates. There would be times when
> that would be the wrong answer, but it would be a broadly accepted kind of
> wrong.[1]
>
> Regardless of the approach, I would not, under most any circumstances,
> discard the original input structures that I had. It's really good to be
> able to figure out what the original data looked like later.
>
> -greg
> [1] I'm crying as I write this...
>
>
>
>
> On Mon, Nov 28, 2016 at 5:25 PM, Stephen O'hagan wrote:
>
>> Has anyone come up with fool-proof way of matching structurally
>> equivalent molecules?
>>
>>
>>
>> Unique SMILES or InChI string comparisons don’t appear to work, presumably
>> because there are different but equivalent structures, e.g. explicit vs
>> non-explicit H’s, Kekule vs aromatic, isomeric vs non-isomeric forms,
>> tautomers etc.
>>
>>
>>
>> I also expect that comparing InChI strings might need something more than
>> just a simple string comparison, such as masking off stereo information
>> when you don’t care about stereo 

Re: [Rdkit-discuss] Fingerprints_calculation

2016-10-02 Thread George Papadatos
Hi Sahil,

You'll find the same documentation as a Jupyter Notebook here:
http://nbviewer.jupyter.org/github/chembl/mychembl/blob/master/ipython_notebooks/02_myChEMBL_RDKit_tutorial.ipynb#Morgan-Fingerprints-(Circular-Fingerprints)
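
As a quick illustration of the two parameters asked about in the quoted message, here is a minimal sketch; the helper name and the example molecule are arbitrary assumptions. The radius sets how many bonds away from each atom the circular environments extend, and useFeatures=True swaps the default connectivity-based atom invariants for pharmacophore-like feature invariants (FCFP-like instead of ECFP-like fingerprints):

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("c1ccccc1CC(=O)O")  # arbitrary example molecule


def n_environments(mol, radius, use_features=False):
    """Number of distinct atom environments in the unhashed Morgan fingerprint."""
    fp = AllChem.GetMorganFingerprint(mol, radius, useFeatures=use_features)
    return len(fp.GetNonzeroElements())


for radius in (0, 1, 2, 3):
    print(radius,
          n_environments(mol, radius),
          n_environments(mol, radius, use_features=True))
```

A larger radius adds larger environments on top of the smaller ones, so the counts can only stay the same or grow. A radius of 2 corresponds roughly to the ECFP4 convention.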

Cheers,

George

On 1 October 2016 at 04:17, Greg Landrum  wrote:

> Hi Sahil,
>
> The documentation includes some detail about the calculation of the Morgan
> fingerprints, along with a pointer to the original publication describing
> the method:
> http://rdkit.org/docs/GettingStartedInPython.html#morgan-fingerprints-circular-fingerprints
>
> Does the information there answer your question?
> -greg
>
>
>
> On Fri, Sep 30, 2016 at 6:37 AM, Sahil Kharangarh <
> sahilkharang...@gmail.com> wrote:
>
>>
>> I am having trouble understanding the calculation of Morgan (circular)
>> fingerprints: on what basis does the RDKit calculate the fingerprints once
>> we choose the radius?
>> How should the radius be chosen for circular fingerprints?
>> And what does the option useFeatures=True mean?
>>
>>
>>
>>
>> *With Warm Regards,*
>> *SAHIL*
>> *M.S. Research Scholar, *
>> *Department of Pharmacoinformatics,*
>> *National Institute of Pharmaceutical Education and Research (NIPER), *
>> *sector-67, S.A.S Nagar, Mohali,*
>> *Punjab- 160062, INDIA*
>> *contact no: +917508142749, +919813153122*
>> *email: sahilkharang...@gmail.com *
>>
>>
>> 
>>
>>
>
> 
>
>


Re: [Rdkit-discuss] OCEAN: Our Target Prediction Paper (including Source Code)

2016-09-27 Thread George Papadatos
Hi guys,

Congrats - great use of ChEMBL and myChEMBL too :)


George



On 27 September 2016 at 05:13, Paul Czodrowski <
paul.czodrow...@merckgroup.com> wrote:

> Dear RDKitters,
>
>
>
> Our target prediction method – fully based on RDKit – is now online:
>
> OCEAN: *O*ptimized *C*ross r*EA*ctivity estimatio*N*
>
> http://pubs.acs.org/doi/abs/10.1021/acs.jcim.6b00067
>
>
>
> The source code can be found here:
>
> https://github.com/rdkit/OCEAN
>
>
>
> We will give a talk as well as a hands-on workshop at the upcoming RDKit UGM
> at the end of October.
>
>
>
> Cheers,
>
> Guido & Paul
>
>
>
>
> 
>
>


Re: [Rdkit-discuss] RDKit Tools for the IPython Notebook

2015-07-02 Thread George Papadatos
Axel, this is seriously cool!
Many thanks!

George

On 2 July 2015 at 13:31, Axel Pahl axelp...@gmx.de wrote:

  Dear fellow RDKitters,

 the RDKit community is always so helpful that I wanted to share back two
 functions that I use in the IPython Notebook and that I thought
 could be of use to others as well.

 - show_table:
 Display a list of molecules in a table with molecule properties as
 columns.
 When an ID property is given, the table becomes interactive and compounds
 can be selected.
 I know that this can be also done with PandasTools but that might be
 overkill in some situations. Also the table from Pandas is not interactive
 to my knowledge.

 - jsme:
 Display Peter Ertl's JavaScript Molecule Editor to enter a molecule
 directly in the IPython notebook (how cool is that??)

 If you are interested, please have a look at the GitHub
 https://github.com/apahl/rdkit_ipynb_tools repo and the example
 http://nbviewer.ipython.org/github/apahl/rdkit_ipynb_tools/blob/master/rdkit_ipynb_tools.ipynb
 notebook.

 Kind regards,
 Axel






Re: [Rdkit-discuss] Molecular dis / similarity using fingerprints

2015-05-26 Thread George Papadatos
Hi JP,

Aha, so you're looking for a threshold that will exhibit the optimal
balance between the false positives and false negatives in the *biological*
*activity* space. This threshold varies depending on the fingerprint and
the dataset of course.
See here for some generalised insights:

(1) Papadatos, G.; Cooper, A. W. J.; Kadirkamanathan, V.; Macdonald, S. J.
F.; McLay, I. M.; Pickett, S. D.; Pritchard, J. M.; Willett, P.; Gillet, V.
J. Analysis of Neighborhood Behavior in Lead Optimization and Array Design. *J.
Chem. Inf. Model.* *2009*, *49*, 195–208.

especially Figure 17, and

(2) Muchmore, S. W.; Debe, D. A.; Metz, J. T.; Brown, S. P.; Martin, Y. C.;
Hajduk, P. J. Application of Belief Theory to Similarity Data Fusion for
Use in Analog Searching and Lead Hopping. *J. Chem. Inf. Model.* *2008*,
*48*, 941–948.

and also Greg's blog post:

http://rdkit.blogspot.co.uk/2013/10/fingerprint-thresholds.html


The TL;DR version is that for ECFP_4, this threshold should be around
0.45-0.55.
Wrt methodology, are you trying to score/rank the
intra-diversity/heterogeneity for different structure sets?


Cheers,

George



On 26 May 2015 at 11:59, JP jeanpaul.ebe...@inhibox.com wrote:


 On 25 May 2015 at 22:23, Tim Dudgeon tdudgeon...@gmail.com wrote:

 Maybe a clustering approach would work? Something like sphere exclusion
 clustering, counting the number of clusters at 0.8-0.9 similarity?
 With 30K structures it sounds computationally tractable.


 Thanks Tim for this idea.  I hadn't heard of sphere exclusion.  The
 problem is we still need a distance / similarity function (which using ECFP
 with high similarity 0.8-0.9 would result in very few compounds being
 thrown out).  I think the real issue here is selecting a sensible
 similarity threshold which defines my idea of similarity.  But that is a
 tricky number to get right - too high and you remove nothing, too low and
 you start catching different molecules.  I guess the best thing is to try a
 few values (0.5, 0.6, 0.7, 0.8, 0.9) and have a visual look at the
 remaining compounds.
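
 For anyone who wants to try the "few values" idea, here is a minimal sketch
 of greedy sphere exclusion in plain Python. It assumes each fingerprint has
 already been reduced to a set of on-bit indices; the toy sets and the
 function names below are made up for illustration and are not an RDKit API.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient between two sets of on-bit indices."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def sphere_exclusion(fps: Sequence[FrozenSet[int]], threshold: float) -> List[List[int]]:
    """Greedy sphere exclusion: take the first unassigned compound as a
    centre, absorb every unassigned compound whose similarity to the centre
    is at least `threshold`, and repeat until nothing is left."""
    unassigned = list(range(len(fps)))
    clusters: List[List[int]] = []
    while unassigned:
        centre = unassigned[0]
        members = [i for i in unassigned if tanimoto(fps[centre], fps[i]) >= threshold]
        clusters.append(members)
        taken = set(members)
        unassigned = [i for i in unassigned if i not in taken]
    return clusters


# Toy "fingerprints" (hypothetical on-bit sets, not real ECFP output).
toy_fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({1, 2, 3, 4, 6}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 13}),
]

# Number of clusters found at each candidate threshold.
cluster_counts = {t: len(sphere_exclusion(toy_fps, t)) for t in (0.5, 0.6, 0.7, 0.8, 0.9)}
print(cluster_counts)
```

 The cluster count rising as the threshold goes up is the behaviour described
 above: at high thresholds almost every compound ends up as its own cluster,
 so very little would be thrown out. Note that the result of this greedy
 variant depends on the input order.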

 -
 JP




Re: [Rdkit-discuss] generating scaffold trees

2015-05-22 Thread George Papadatos
Hi all,

Coincidentally, we had a chat about this with James the other day.
Maybe the good colleagues at the ICR have implemented this already with
RDKit? Nick?

Cheers,

g


On 22 May 2015 at 13:38, Axel Pahl axelp...@gmx.de wrote:

 Dear RDKitters,

 has someone used the RDKit to generate scaffold trees from molecules as
 described in this paper:
 Schuffenhauer, A., Ertl, P., Roggo, S., Wetzel, S., Koch, M. A.,
 Waldmann, H., J. Chem. Inf. Model. 2007, 47, 47-58

 I know that this is possible with ScaffoldHunter and that there is a
 Pipeline Pilot component for it, but being able to do it in RDKit would
 fit especially well in my workflow...

 Kind regards and have a nice weekend,
 Axel





Re: [Rdkit-discuss] rdkit.version()

2014-12-11 Thread George Papadatos
Hi Soren,

from rdkit import rdBase
print(rdBase.rdkitVersion)

Cheers,

George


On 11 December 2014 at 18:22, Soren Wacker swac...@ucalgary.ca wrote:

  Hi,

 I would like to find out which RDKit version is currently installed on my
 system. However, I cannot find a version string in RDKit. Something like
 rdkit.version() would be nice. Is there something like this implemented?

 kind regards
 Soren

  --
 *From:* James Davidson [j.david...@vernalis.com]
 *Sent:* Wednesday, December 10, 2014 10:48 AM
 *To:* greg.land...@gmail.com
 *Cc:* rdkit-discuss@lists.sourceforge.net
 *Subject:* Re: [Rdkit-discuss] Avalon test failing(?)

   Hi Greg,



  The new version of the test code is targeting the 1.2 avalon toolkit
  version.
  Here's the commit that did that.
  https://github.com/rdkit/rdkit/commit/42dab414ee6fbe5489078e5e52046608bbf785cb

  As an FYI, to make these tests pass on windows, you need to edit the code
  to fix a bug:
  you need to comment out line 1446 of reaccsio.c:
  //MyFree((char *)tempdir);



 Following your advice, I downloaded the 1.2 source from Sourceforge (
 http://sourceforge.net/projects/avalontoolkit/files/AvalonToolkit_1.2/);
 commented-out the line in reaccsio.c; and then reconfigured in cmake and
 rebuilt in VS.  The tests pass now – thanks!



 Kind regards



 James





[Rdkit-discuss] myChEMBL

2014-06-12 Thread George Papadatos
Hi all,

Further to Michał's announcement earlier, let me use this opportunity to
announce the new release of myChEMBL, as I'm sure it will be relevant to
many of you.

myChEMBL is an open platform which consists of a Linux (Ubuntu) Virtual
Machine featuring a PostgreSQL schema with the latest version of the ChEMBL
database, the latest RDKit toolkit and cartridge, along with several Python
tools and libraries for scientific computing and data mining.

myChEMBL offers several ways to interact with ChEMBL data locally and
provides a free and secure environment for application development,
teaching and learning.

More information here:
http://chembl.blogspot.co.uk/2014/06/mychembl-launchpadlaunched.html


Cheers,

George


Re: [Rdkit-discuss] flexmatch in RDKit cartridge?

2014-02-21 Thread George Papadatos
Many thanks Jan, that's very helpful.

Cheers,

George



On 20 February 2014 21:32, Jan Holst Jensen j...@biochemfusion.com wrote:

  Hi George et al,

 flexmatch(... 'all') is the most strict exact match that the
 Symyx/Accelrys cartridge has. You can relax the matching behavior to
 varying degrees by passing it different options, e.g. using 'tau' instead
 of 'all' will make the identity check tautomer-agnostic (to the extent that
 the cartridge will perceive tautomers correctly - an interesting
 discussion topic in itself).

 The various options to flexmatch() are well documented in the Accelrys
 documentation for the cartridge, but I don't know if that is publicly
 available.

 The short answer in my opinion: Yes, @= should be the equivalent of
 flexmatch(m1, m2, 'all'). To emulate flexmatch(..., 'all') with rdkit, I
 find a small gotcha with regards to chiral matching:

 -- Clearly not identical.
 postgres=# select mol('CCC') @= mol('CCF');
  ?column?
 --
  f
 (1 row)

 -- Clearly identical.
 postgres=# select mol('CCC') @= mol('CCC');
  ?column?
 --
  t
 (1 row)

 -- Ala versus dAla - should *not* be identical ?
 postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@@H](N)C(=O)O');
  ?column?
 --
  t
 (1 row)

 To get the expected behavior of @= you need to turn on chiral matching.
 Even though the parameter name says that it controls SSS behavior, it
 apparently also affects exact matching:

 postgres=# set rdkit.do_chiral_sss=true;
 SET
 -- Ala versus dAla - no longer identical.
 postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@@H](N)C(=O)O');
  ?column?
 --
  f
 (1 row)

 -- Ala versus Ala - phew, identical.
 postgres=# select mol('C[C@H](N)C(=O)O') @= mol('C[C@H](N)C(=O)O');
  ?column?
 --
  t
 (1 row)

 Cheers
 -- Jan



 On 2014-02-20 13:46, George Papadatos wrote:

 Hi there,
 Wouldn't that be (at least partly) possible with an exact structure search?

- @= : returns whether or not two molecules are the same.

 Cheers,
 George


 On 20 February 2014 11:59, Greg Landrum greg.land...@gmail.com wrote:

 Sounds interesting. Can anyone provide a pointer to a doc with more
 specific info about what this actually does?


 On Thursday, February 20, 2014, Michał Nowotka mmm...@gmail.com wrote:

   Hi,

  Symyx cartridge defines something called flexmatch - Finds records
 that are an exact match of the 2D or 3D structure that you specify in the
 query.
  Is there anything similar in RDKit cartridge? I looked into
 documentation and couldn't find this feature.

  Regards,
  Michal Nowotka





Re: [Rdkit-discuss] flexmatch in RDKit cartridge?

2014-02-20 Thread George Papadatos
Hi there,
Wouldn't that be (at least partly) possible with an exact structure search?

   - @= : returns whether or not two molecules are the same.

Cheers,
George


On 20 February 2014 11:59, Greg Landrum greg.land...@gmail.com wrote:

 Sounds interesting. Can anyone provide a pointer to a doc with more
 specific info about what this actually does?


 On Thursday, February 20, 2014, Michał Nowotka mmm...@gmail.com wrote:

 Hi,

 Symyx cartridge defines something called flexmatch - Finds records that
 are an exact match of the 2D or 3D structure that you specify in the query.
 
 Is there anything similar in RDKit cartridge? I looked into documentation
 and couldn't find this feature.

 Regards,
 Michal Nowotka





Re: [Rdkit-discuss] InChI roundtrip

2014-01-30 Thread George Papadatos
OK just to add some fuel to this fire: A colleague of mine and I looked at
the inchi roundtrip using KNIME 2.9 and the latest versions of indigo and
rdkit nodes. We used ~90,000 inchis from chembl_17, converted them to mols
(sanitise + remove Hs), removed the ones that fail to convert, and then we
converted back to inchis (standard ones, no extra parameters). We assessed
the discrepancies between indigo and rdkit inchis compared to the original
input inchis that are stored in chembl.
Rdkit had 10 times more discrepancies with 200 failures as opposed to 21
from indigo. This rate (~0.2%) was also confirmed using ~1 million inchis.
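
For reference, the bookkeeping behind those numbers can be sketched in a few
lines of plain Python. This is only an illustration of the comparison step,
not the KNIME workflow itself: the helper names are made up, and the
"regenerated" list is toy data in which the last entry has deliberately been
swapped for a different molecule.

```python
from typing import List, Sequence, Tuple


def roundtrip_discrepancies(
    original: Sequence[str], regenerated: Sequence[str]
) -> List[Tuple[int, str, str]]:
    """Return (index, original, regenerated) for every pair that differs."""
    if len(original) != len(regenerated):
        raise ValueError("both lists must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(original, regenerated))
        if a != b
    ]


def percent(count: int, total: int) -> float:
    """Express count as a percentage of total."""
    return 100.0 * count / total


# Toy data: methane, water, ethanol as stored; the last "regenerated"
# entry is methanol, standing in for a failed round trip.
stored = [
    "InChI=1S/CH4/h1H4",
    "InChI=1S/H2O/h1H2",
    "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
]
regenerated = [
    "InChI=1S/CH4/h1H4",
    "InChI=1S/H2O/h1H2",
    "InChI=1S/CH4O/c1-2/h2H,1H3",
]
mismatches = roundtrip_discrepancies(stored, regenerated)

# Failure rates for the counts quoted above (200 and 21 out of ~90,000).
rdkit_rate = percent(200, 90000)
indigo_rate = percent(21, 90000)
print(len(mismatches), round(rdkit_rate, 2), round(indigo_rate, 3))
```

Comparing strings directly is the strictest possible check; any difference in
any InChI layer counts as a discrepancy.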

I had a closer look at a couple of cases here:
http://nbviewer.ipython.org/gist/madgpap/8715974

It seems that there is more than one reason for the failure. I totally
understand Greg's caution about the inchi2mol conversion, but given the
difference between rdkit and indigo, there might be room for improvement. Any
insights would be very much appreciated.

Btw, the KNIME workflow and full list of fails are available to you.

Cheers,

George



On 30 January 2014 04:11, Greg Landrum greg.land...@gmail.com wrote:

 Yeah, I have been tempted several times to remove the InChI-RDKit
 functionality entirely



 On Thu, Jan 30, 2014 at 5:05 AM, Igor Filippov igor.v.filip...@gmail.com
 wrote:

 Thank you, Greg!
 Very nice explanation and I think this issue has confused people before
 me as well. I am going to have to keep reminding myself about it as the
 subject comes up every now and then.

 Igor
 On Jan 29, 2014 10:59 PM, Greg Landrum greg.land...@gmail.com wrote:

 Hi Igor,

 On Wed, Jan 29, 2014 at 2:04 PM, Igor Filippov 
 igor.v.filip...@gmail.com wrote:

 Greg et al,

 Here is a little script that demonstrates a problem with fingerprints
 after the roundtrip through InChI.
 My input mol file is also attached.
 As you can see the similarity between before and after is not 1 in
 45 out of 100 cases.
 In one case it is as low as 0.29. Could someone take a look and tell me
 what I'm doing wrong?


 Ah! Now I see what you're doing and understand the problem.

 It's really important when using InChI to remember that InChI is
 designed to be an identifier, not an interchange format. The InChI
 algorithm modifies the molecule as part of its canonicalization step. This
 modification includes standardizing tautomers.

 Here's an example of the type of substructure modification that happens
 in your molecules:
 input smiles c1c1C(=O)Nc1c1 on being converted to InChI and back
 yields: OC(=Nc1c1)c1c1

 Basically: If you think you know what your molecules are, you probably
 should be building them from SMILES or CTAB, not InChI.

 Apologies that I didn't think of this before; I was just focusing on the
 stereochemistry.

 -greg






Re: [Rdkit-discuss] InChI roundtrip

2014-01-30 Thread George Papadatos
Hi Igor,
Thanks for the quick reply.
I just did that in my workflow. The number of discrepancies increased from 200
to 950 :(
George


On 30 January 2014 19:19, Igor Filippov igor.v.filip...@gmail.com wrote:

 George,

 Have you added coordinates to the mols converted from InChI?
 It made a huge difference for the examples I've tried.

 Igor



[Rdkit-discuss] MDS using RDKit, SciKit and Pandas

2014-01-21 Thread George Papadatos
Hi RDKitters,

This is not a question, more like an FYI.
Inspired by Noel's related post:
http://baoilleach.blogspot.co.uk/2014/01/convert-distance-matrix-to-2d.html,
I've put together an iPython Notebook example that performs MDS on a bunch
of ChEMBL compounds (i.e. visualises their chemical space in 2D).

http://nbviewer.ipython.org/gist/madgpap/8538507

Enjoy,

George


EMBL-EBI


[Rdkit-discuss] postgres FPs in python

2013-11-08 Thread George Papadatos
Hi there,
DB-related question again:
When I retrieve fps from a postgres db, they look like this:
\x020c00102204810001040001981408420180400040048088c020800423a192001814002021044200092400040208

Is there a way to convert them to RDKit bitvector fingerprint objects or
at least bitvector strings in python?
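
A minimal sketch of the "at least bitvector strings" part, in plain Python.
Values like the one above look like PostgreSQL's hex text form of binary data
(`\x` followed by hex digits). The bit order used below (bit 0 is the least
significant bit of the first byte) is an assumption and should be checked
against a fingerprint whose on bits are known; the helper names and the
two-byte example value are made up. For real RDKit objects, I believe the
cartridge's bfp_to_binary_text() together with
DataStructs.CreateFromBinaryText() in Python is the intended route, but that
is not what this sketch does.

```python
from typing import List


def bytea_hex_to_bytes(value: str) -> bytes:
    """Decode PostgreSQL's hex text form ('\\x' followed by hex digits)."""
    if not value.startswith("\\x"):
        raise ValueError("expected a value starting with \\x")
    return bytes.fromhex(value[2:])


def on_bits(raw: bytes) -> List[int]:
    """Indices of the set bits, ASSUMING bit 0 is the least significant
    bit of the first byte."""
    return [
        byte_index * 8 + bit
        for byte_index, byte in enumerate(raw)
        for bit in range(8)
        if (byte >> bit) & 1
    ]


def to_bitstring(raw: bytes) -> str:
    """Render the bytes as a string of '0'/'1' using the same bit order."""
    set_bits = set(on_bits(raw))
    return "".join("1" if i in set_bits else "0" for i in range(len(raw) * 8))


# Made-up two-byte value, not a real fingerprint.
example = "\\x0281"
raw = bytea_hex_to_bytes(example)
bits = on_bits(raw)
bitstring = to_bitstring(raw)
print(bits, bitstring)
```

With the on-bit indices in hand, Tanimoto similarity can be computed directly
on Python sets if a full RDKit bit vector is not needed.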


Thanks,

George


[Rdkit-discuss] Friday pandas q

2013-10-25 Thread George Papadatos
Question to rdkit pandas users (pandaskitters?):

I managed to have the mol_send(m) object in a pandas frame:
[image: Inline images 1]
if I do this: data['mol'].map(str).map(Chem.Mol)
I get the mol in base64 PNG:

[image: Inline images 2]

How do I display the column as rendered images (and keep them internally as
a Series of rdmols) ?

PandasTools.ChangeMoleculeRendering seems relevant but I can't get it to
display the mols

Cheers,

George


Re: [Rdkit-discuss] Friday pandas q

2013-10-25 Thread George Papadatos
It worked! Many thanks!
g


On 25 October 2013 16:18, Greg Landrum greg.land...@gmail.com wrote:

 Hi George,

 Nikolas is really the expert here, but this just worked for me:

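 # Assumes `curs` is an open psycopg2 cursor, and that pandas (pd),
 # rdkit.Chem (Chem) and rdkit.Chem.PandasTools have been imported.
 curs.execute('select molregno,mol_send(m) from rdk.mols where m@>%s',('c12c1nncc2',))

 d = curs.fetchall()

 df2 = pd.DataFrame(d,columns=('molregno','pkl'))

 # bytes(...) behaves like the original str(...) on Python 2 and also works on Python 3
 df2['romol']=df2.apply(lambda x:Chem.Mol(bytes(x['pkl'])),axis=1)

 PandasTools.RenderImagesInAllDataFrames()
 del df2['pkl']
 df2.head(2)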

 -greg





Re: [Rdkit-discuss] Notes from the 2013 UGM

2013-10-23 Thread George Papadatos
Hi all,

I'd also like to thank you all for attending this UGM at the EBI and
contributing to its success.
And, of course, a big thanks to Greg for, well, you know.
:)

See you all next year - or sooner!

George



On 22 October 2013 09:09, Greg Landrum greg.land...@gmail.com wrote:

 Hi,

 Looks like I'm never going to have time to do a really thorough write up
 of the UGM. In the interests of getting something out there, I guess I will
 do something short.

 From my point of view, the UGM was a great success. George did a great job
 of getting everything organized, and everything went very smoothly. We had
 an interesting set of talks, some good questions and discussions during the
 talks, and a couple of very nice social activities at the pub.

 The slides and ipython notebooks for many of the talks are available in
 github:
 https://github.com/rdkit/UGM_2013

 A few things to note from the talks:
 1) The code for PDB handling, MMFF94, and Open3DAlign is now all on the
 trunk. It will be in the upcoming release.
 2) Jameed updated the MMPA code in Contrib; the new version is definitely
 worth checking out, as is Jameed's tutorial on how to use it (part of the
 materials linked to above).
 3) Jameed (and his employer) also contributed an implementation of the
 Fraggle similarity algorithm described in his talk. The command line tools
 are now in Contrib and the main similarity code is in
 $RDBASE/rdkit/Chem/Fraggle. This will be in the upcoming release.

 The roundtable produced a long list of ideas for future features/changes.
 Some of these are already done, the rest will land in github as I manage to
 find time.

 We also had a discussion about the frequency of RDKit releases. It seems
 that the quarterly release cycle creates extra work for the community as
 well as me, so we're going to switch to doing releases every six months. If
 a critical bug is found (and fixed!) I'll do a patch release, but new
 features and improvements will only be released twice a year. Anyone who
 wants to stay on the bleeding edge can, of course, track the version of
 the code in github. That doesn't get checked in without passing tests on at
 least one platform. If this slower release cycle ends up creating problems,
 we can always go back to three or four times a year.

 Many many thanks to everyone for participating; in particular everyone who
 did a presentation or tutorial and George for the organization. I'm already
 looking forward to next year!

 -greg






[Rdkit-discuss] rdkit mol objects from sql

2013-10-23 Thread George Papadatos
Hi RDKitters,
I must have seen this in an ipython notebook but can't find it right now:
If I have a table of rdkit mols generated by the cartridge, is there a way
to retrieve them using a psycopg2 connection within python - ideally inside
a pandas dataframe?

I've got this snippet:
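import pandas as pd
import psycopg2
# psycopg2 takes the connection settings as a single DSN string
conn = psycopg2.connect("port=5432 user=chembl dbname=chembl_17")
# `sql` is the query string to run (not shown here)
data = pd.read_sql(sql, conn)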

...but I'm missing the step where I retrieve rdkit mol objects somehow
instead of smiles.

Many thanks in advance,
George


Re: [Rdkit-discuss] rdkit mol objects from sql

2013-10-23 Thread George Papadatos
Yes it does; many thanks!
I've just found the notebook I mentioned:
http://nbviewer.ipython.org/4316426/
(Scroll to bottom)
I prefer Greg's first solution though, as it avoids the conversion from smiles 
completely. 

Best, 

George 

Sent from my gPad

 On 23 Oct 2013, at 20:39, JP jeanpaul.ebe...@inhibox.com wrote:
 
 Does the following help you george?
 http://comments.gmane.org/gmane.science.chemistry.rdkit.user/860
 
 
 
 On 23 October 2013 17:11, George Papadatos gpapada...@gmail.com wrote:
 Hi RDKitters,
 I must have seen this in an ipython notebook but can't find it right now:
 If I have a table of rdkit mols generated by the cartridge, is there a way 
 to retrieve them using a psycopg2 connection within python - ideally inside 
 a pandas dataframe?
 
 I've got this snippet:
 import pandas as pd
 import psycopg2
 conn = psycopg2.connect("port=5432 user=chembl dbname=chembl_17")
 data = pd.read_sql(sql, conn)
 
 ...but I'm missing the step where I retrieve rdkit mol objects somehow 
 instead of smiles. 
 
 Many thanks in advance,
 George
 
 


Re: [Rdkit-discuss] RDKit UGM 2013 - a few pictures

2013-10-16 Thread George Papadatos
Lovely pics Paul.
Many thanks,
g


On 14 October 2013 16:56, Paul Emsley pems...@mrc-lmb.cam.ac.uk wrote:


 https://www.dropbox.com/sh/a3s55kmxa37yx7e/vLC5uea1xP

 Paul.






Re: [Rdkit-discuss] Problem drawing molecules under windows

2013-09-26 Thread George Papadatos
Hi Andrea,
Seems like a font problem to me, which could indicate the lack of
cairo/pango libraries.
George
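
A quick way to test the missing-library guess is to ask Python which of the relevant bindings it can find. This is only a sketch: the module names are assumptions (pycairo installs "cairo", Pillow installs "PIL"), the helper name is hypothetical, and an importable module does not prove that RDKit's drawing code actually uses it.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # An empty list means both bindings are at least importable.
    print(missing_modules(["cairo", "PIL"]))
```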


On 26 September 2013 17:29, Greg Landrum greg.land...@gmail.com wrote:

 Hi Andrea,

 On Thu, Sep 26, 2013 at 10:59 AM, Andrea Volkamer volka...@bio.mx wrote:




 I am relatively new to rdkit, and just started using IPython notebook
 under Windows.

 I installed WinPython-64bit-2.7.5.3 and RDKit_2013_06_1 as well as
 Pillow-2.1.0.win-amd64-py2.7 to do so.

 Anyhow, I have some trouble drawing molecules:

 For some reason, drawing this molecule Chem.MolFromSmiles('C11CC')
 works, adding, e.g.,  a nitrogen (Chem.MolFromSmiles('C11CCN')) doesn’t
 work (rdkit.Chem.rdchem.Mol at 0x7a6e8d0)?

 This happens for many other examples as well. 

 Any suggestions?


 Does it happen exclusively for molecules with heteroatoms?

 -greg





[Rdkit-discuss] failed tests for github master version on ubuntu

2013-09-25 Thread George Papadatos
Hi there,
I tried to install RDKit on a fresh Ubuntu 13.04 VM today.
I checked out the source from GitHub master but I got the following errors
after ctest:
65% tests passed, 28 tests failed out of 79

Total Test time (real) =  23.19 sec

The following tests FAILED:
  4 - pyBV (Failed)
  5 - pyDiscreteValueVect (Failed)
  6 - pySparseIntVect (Failed)
  9 - testPyGeometry (Failed)
 12 - pyAlignment (Failed)
 17 - pyDistGeom (Failed)
 27 - pyDepictor (Failed)
 37 - pyChemReactions (Failed)
 42 - pyFragCatalog (Failed)
 44 - pyMolDescriptors (Failed)
 46 - pyPartialCharges (Failed)
 48 - pyMolTransforms (Failed)
 51 - pyForceFieldHelpers (Failed)
 53 - pyDistGeom (Failed)
 55 - pyMolAlign (Failed)
 57 - pyChemicalFeatures (Failed)
 59 - pyShapeHelpers (Failed)
 61 - pyMolCatalog (Failed)
 63 - pySLNParse (Failed)
 64 - pyGraphMolWrap (Failed)
 65 - pyTestConformerWrap (Failed)
 68 - pyMatCalc (Failed)
 69 - pyCMIM (Failed)
 70 - pyRanker (Failed)
 72 - pyFeatures (Failed)
 73 - pythonTestDbCLI (Failed)
 74 - pythonTestDirML (Failed)
 79 - pythonTestDirChem (Failed)
Errors while running CTest

Any ideas why?

Thanks in advance,

George


Re: [Rdkit-discuss] failed tests for github master version on ubuntu

2013-09-25 Thread George Papadatos
Hello again,
Sorry for the false alarm - that was me messing up with the env variables
and not having enough coffee to realise it earlier!
However, there is still one fail:
*99% tests passed, 1 tests failed out of 79*

Total Test time (real) =  88.37 sec

The following tests FAILED:
 79 - pythonTestDirChem (Failed)
Errors while running CTest

Do you have any ideas why that might be? Is it safe to ignore it?

George



On 25 September 2013 10:04, George Papadatos gpapada...@gmail.com wrote:

 Hi there,
 I tried to install RDKit on a fresh Ubuntu 13.04 VM today.
 I checked out the source from GitHub master but I got the following
 errors after ctest:
 65% tests passed, 28 tests failed out of 79

 Total Test time (real) =  23.19 sec

 The following tests FAILED:
   4 - pyBV (Failed)
   5 - pyDiscreteValueVect (Failed)
   6 - pySparseIntVect (Failed)
   9 - testPyGeometry (Failed)
  12 - pyAlignment (Failed)
  17 - pyDistGeom (Failed)
  27 - pyDepictor (Failed)
  37 - pyChemReactions (Failed)
  42 - pyFragCatalog (Failed)
  44 - pyMolDescriptors (Failed)
  46 - pyPartialCharges (Failed)
  48 - pyMolTransforms (Failed)
  51 - pyForceFieldHelpers (Failed)
  53 - pyDistGeom (Failed)
  55 - pyMolAlign (Failed)
  57 - pyChemicalFeatures (Failed)
  59 - pyShapeHelpers (Failed)
  61 - pyMolCatalog (Failed)
  63 - pySLNParse (Failed)
  64 - pyGraphMolWrap (Failed)
  65 - pyTestConformerWrap (Failed)
  68 - pyMatCalc (Failed)
  69 - pyCMIM (Failed)
  70 - pyRanker (Failed)
  72 - pyFeatures (Failed)
  73 - pythonTestDbCLI (Failed)
  74 - pythonTestDirML (Failed)
  79 - pythonTestDirChem (Failed)
 Errors while running CTest

 Any ideas why?

 Thanks in advance,

 George




Re: [Rdkit-discuss] failed tests for github master version on ubuntu

2013-09-25 Thread George Papadatos
Hi Paolo,
Thanks a lot for the quick response and the tips.
My problem was actually tk and the lack of the $DISPLAY variable.
Now everything has passed.
See you next week!
George
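
The two causes that came up in this thread (a missing PIL module and an unset $DISPLAY for Tk) can be checked before running ctest. A minimal sketch, with a hypothetical helper name; the module list is an assumption, and on Python 2 the Tk module is spelled "Tkinter" rather than "tkinter".

```python
import importlib.util
import os


def gui_prereq_problems(environ=None, modules=("PIL", "tkinter")):
    """List likely reasons for the drawing-related tests to fail."""
    environ = os.environ if environ is None else environ
    problems = []
    # Tk needs an X display; on a headless VM this is usually unset.
    if not environ.get("DISPLAY"):
        problems.append("DISPLAY is not set")
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append("module not importable: " + name)
    return problems


if __name__ == "__main__":
    for problem in gui_prereq_problems():
        print(problem)
```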



On 25 September 2013 10:36, Paolo Tosco paolo.to...@unito.it wrote:

  Dear George,

 that test depends on the PIL Python module, which was not present in my
 Linux distro (Scientific Linux 6); once I installed it, the test ran fine.

 Regarding Ubuntu 13.04 and PIL, I just googled this thread:
 https://plus.google.com/112555004333838485342/posts/H8iRnbmdv7a

 Maybe this is your case too.

 Anyway, I suggest trying

 $ cd rdkit/Chem
 $ python test_list.py

 and see what is actually failing.

 HTH,
 Paolo


 On 09/25/2013 11:27 AM, George Papadatos wrote:

 Hello again,
 Sorry for the false alarm - that was me messing up with the env variables
 and not having enough coffee to realise it earlier!
 However, there is still one fail:
  *99% tests passed, 1 tests failed out of 79*

  Total Test time (real) =  88.37 sec

  The following tests FAILED:
  79 - pythonTestDirChem (Failed)
 Errors while running CTest

  Do you have any ideas why that might be? Is it safe to ignore it?

  George



 On 25 September 2013 10:04, George Papadatos gpapada...@gmail.com wrote:

 Hi there,
 I tried to install RDKit on a fresh Ubuntu 13.04 VM today.
 I checked out the source from GitHub master but I got the following
 errors after ctest:
  65% tests passed, 28 tests failed out of 79

  Total Test time (real) =  23.19 sec

  The following tests FAILED:
   4 - pyBV (Failed)
   5 - pyDiscreteValueVect (Failed)
   6 - pySparseIntVect (Failed)
   9 - testPyGeometry (Failed)
  12 - pyAlignment (Failed)
  17 - pyDistGeom (Failed)
  27 - pyDepictor (Failed)
  37 - pyChemReactions (Failed)
  42 - pyFragCatalog (Failed)
  44 - pyMolDescriptors (Failed)
  46 - pyPartialCharges (Failed)
  48 - pyMolTransforms (Failed)
  51 - pyForceFieldHelpers (Failed)
  53 - pyDistGeom (Failed)
  55 - pyMolAlign (Failed)
  57 - pyChemicalFeatures (Failed)
  59 - pyShapeHelpers (Failed)
  61 - pyMolCatalog (Failed)
  63 - pySLNParse (Failed)
  64 - pyGraphMolWrap (Failed)
  65 - pyTestConformerWrap (Failed)
  68 - pyMatCalc (Failed)
  69 - pyCMIM (Failed)
  70 - pyRanker (Failed)
  72 - pyFeatures (Failed)
  73 - pythonTestDbCLI (Failed)
  74 - pythonTestDirML (Failed)
  79 - pythonTestDirChem (Failed)
 Errors while running CTest

  Any ideas why?

  Thanks in advance,

  George







 --
 ==
 Paolo Tosco, Ph.D.
 Department of Drug Science and Technology
 Via Pietro Giuria, 9 - 10125 Torino (Italy)
 Tel: +39 011 670 7680 | Mob: +39 348 5537206
 Fax: +39 011 670 7687 | E-mail: paolo.tosco@unito.it
 http://open3dqsar.org | http://open3dalign.org
 ==





Re: [Rdkit-discuss] name generator

2013-08-27 Thread George Papadatos
I think this is not an actual structure-to-name converter but a look-up service 
based on a predefined dictionary. 
If this is true, then it won't return anything for novel/unseen structures. 
Give it a try and let us know. 

George. 

Sent from my giPhone
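
One way to try this is to query the resolver for a known and a novel structure and compare. The sketch below follows the URL pattern quoted later in this thread; the helper names are hypothetical, and treating an HTTP error as "no name available" is an assumption about how the service reports a miss.

```python
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

BASE = "http://cactus.nci.nih.gov/chemical/structure"


def resolver_url(smiles, representation="iupac_name"):
    """Build the resolver URL, percent-encoding the SMILES string."""
    return "{0}/{1}/{2}".format(BASE, quote(smiles, safe=""), representation)


def lookup_name(smiles, timeout=10):
    """Return the resolver's name for a structure, or None on an HTTP error."""
    try:
        with urlopen(resolver_url(smiles), timeout=timeout) as response:
            return response.read().decode("utf-8").strip()
    except HTTPError:
        # Assumed to mean the service has no name for this structure.
        return None
```

Calling lookup_name() with a SMILES string for a compound that is unlikely to be in any dictionary would show whether names are generated or only looked up.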

On 27 Aug 2013, at 18:39, David Hall li...@cowsandmilk.net wrote:

 Not sure what software is behind it, but the NCI's Chemical Identifier 
 Resolver may suit your needs.
 
 For your example, the URL:
 
 http://cactus.nci.nih.gov/chemical/structure/CC(C)O/iupac_name
 
 returns Propan-2-ol
 
 -David
 
 On Aug 27, 2013, at 11:54 AM, Sergio Martinez Cuesta sermar...@gmail.com 
 wrote:
 
 thanks Greg,
 
 indeed, I only found commercial software for it
 
 http://www.chemaxon.com/marvin/help/applications/molconvert.html
 
 cheers
 Sergio
 
 
 On 27 August 2013 16:45, Greg Landrum greg.land...@gmail.com wrote:
 Dear Sergio,
 
 
 On Tue, Aug 27, 2013 at 5:21 PM, Sergio Martinez Cuesta 
 sermar...@gmail.com wrote:
 is there any IUPAC name generator in RDKit?
 
 e.g. for transforming CC(C)O into propan-2-ol ?
 
 There is not. In fact, I'm not aware of any open source structure-name 
 converters.
 
 -greg
 


Re: [Rdkit-discuss] MMP analysis - active vs. inactive compounds

2013-05-03 Thread George Papadatos
Hi Paul,

I guess you first have to generate the list of MMPs as per Jameed's code,
then join your property values for MolID1 and MolID2, and finally
calculate the property difference/ratio for each MMP.

Best regards,

George
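
The three steps above can be sketched in plain Python. This assumes the MMP output has already been parsed into (MolID1, MolID2, transformation) tuples and that the property is on a log scale such as pIC50, so a difference is the natural comparison; the function name, IDs, and values are hypothetical.

```python
def mmp_property_deltas(pairs, prop):
    """Attach a property difference to each matched molecular pair.

    pairs: iterable of (id1, id2, transformation) tuples
    prop:  dict mapping compound ID to a property value
    Pairs with a missing property value on either side are skipped.
    """
    results = []
    for id1, id2, transformation in pairs:
        if id1 not in prop or id2 not in prop:
            continue
        results.append((id1, id2, transformation, prop[id2] - prop[id1]))
    return results


if __name__ == "__main__":
    # Made-up illustration data, not real measurements.
    pairs = [("A", "B", "[*:1]H>>[*:1]F"), ("A", "C", "[*:1]H>>[*:1]Cl")]
    pic50 = {"A": 6.0, "B": 7.5}
    print(mmp_property_deltas(pairs, pic50))
```

For a property on a linear scale, the subtraction could be swapped for a ratio. Actives and inactives do not need separate input files: both go through the fragmentation together and are told apart by the joined property afterwards.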


On 3 May 2013 12:10, paul.czodrow...@merckgroup.com wrote:

 Dear RDKitters,

 has anyone applied Jameed's great code to the following scenario:
 - Perform a MMP analysis with respect to a particular property (e.g.
 activity)

 Given the current code, I do not see any chance to consider any property
 besides the compound ID.

 It is also not possible to provide 2 files (one for the active compounds,
 one for the inactive compounds) - or am I wrong?


 Cheers & Thanks,
 Paul






[Rdkit-discuss] Cartridge problems (again) Ubuntu 64-bit

2013-03-21 Thread George Papadatos
Hi RDKitters,

So I've successfully installed RDKit from the *svn trunk* on a brand new
Ubuntu Server 12.10 64-bit VM.
All 77/77 tests passed. Yay.

When I tried to build the cartridge against psql 9.1.8, 4/8 tests failed:

## Build RDKit Cartridge
cd $RDBASE/Code/PgSQL/rdkit
make
sudo make install
make installcheck

== dropping database contrib_regression ==
NOTICE:  database contrib_regression does not exist, skipping
DROP DATABASE
== creating database contrib_regression ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... FAILED
test props... ok
test btree... FAILED
test molgist  ... ok
test bfpgist-91   ... FAILED
test sfpgist  ... ok
test slfpgist ... ok
test fps  ... FAILED

==
 4 of 8 tests failed.
==

However, the following works:

createdb test
psql test

psql (9.1.8)
Type "help" for help.

test=# create extension rdkit;
CREATE EXTENSION
test=# show rdkit.tanimoto_threshold;
 rdkit.tanimoto_threshold
--------------------------
 0.5
(1 row)

test=# select 'c1c1O'::mol;
mol
---
 Oc1c1
(1 row)

Any ideas?

Many thanks in advance,

George
EMBL-EBI


Re: [Rdkit-discuss] Cartridge problems (again) Ubuntu 64-bit

2013-03-21 Thread George Papadatos
Ah, many thanks for the clarification Greg.
Are these changes related to the problematic phenanthrene substructure
query?
When is the new release scheduled for?

Cheers,

George
EMBL-EBI



On 21 March 2013 17:58, greg landrum greg.land...@gmail.com wrote:

 These are due to some ongoing changes in the rdkit fingerprint. Don't
 worry about them.
 I will fix those tests after the fingerprint changes settle down,
 definitely before the next release.

 -greg

 On Mar 21, 2013, at 1:22 PM, George Papadatos gpapada...@gmail.com
 wrote:

 Hi RDKitters,

 So I've successfully installed RDKit from the *svn trunk* on a brand new
 Ubuntu Server 12.10 64-bit VM.
 All 77/77 tests passed. Yay.

 When I tried to build the cartridge against psql 9.1.8, 4/8 tests failed:

 ## Build RDKit Cartridge
 cd $RDBASE/Code/PgSQL/rdkit
 make
 sudo make install
 make installcheck

 == dropping database contrib_regression ==
 NOTICE:  database contrib_regression does not exist, skipping
 DROP DATABASE
 == creating database contrib_regression ==
 CREATE DATABASE
 ALTER DATABASE
 == running regression test queries==
 test rdkit-91 ... FAILED
 test props... ok
 test btree... FAILED
 test molgist  ... ok
 test bfpgist-91   ... FAILED
 test sfpgist  ... ok
 test slfpgist ... ok
 test fps  ... FAILED

 ==
  4 of 8 tests failed.
 ==

 However, the following works:

 createdb test
 psql test

 psql (9.1.8)
 Type help for help.

 test=# create extension rdkit;
 CREATE EXTENSION
 test=# show rdkit.tanimoto_threshold;
  rdkit.tanimoto_threshold
 --
  0.5
 (1 row)

 test=# select 'c1c1O'::mol;
 mol
 ---
  Oc1c1
 (1 row)

 Any ideas?

 Many thanks in advance,

 George
 EMBL-EBI






Re: [Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-10 Thread George Papadatos
Hi Greg,

I built boost 1.49 from source and tried again. There is now a similar
error but elsewhere:

[ 44%] Building CXX object Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/SmartsWrite.cpp.o
Linking CXX shared library ../../../lib/libSmilesParse.dylib
ld: warning: path '//Library/Frameworks/Python.framework/Versions/2.7/Python' following -L not a directory
Undefined symbols for architecture x86_64:
  "yysmarts_parse(char const*, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >*, void*)", referenced from:
      RDKit::(anonymous namespace)::smarts_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "yysmiles_parse(char const*, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >*, std::list<unsigned int, std::allocator<unsigned int> >*, void*)", referenced from:
      RDKit::(anonymous namespace)::smiles_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "yysmarts_lex_init(void**)", referenced from:
      RDKit::(anonymous namespace)::smarts_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "yysmiles_lex_init(void**)", referenced from:
      RDKit::(anonymous namespace)::smiles_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "setup_smarts_string(std::string const&, void*)", referenced from:
      RDKit::(anonymous namespace)::smarts_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "setup_smiles_string(std::string const&, void*)", referenced from:
      RDKit::(anonymous namespace)::smiles_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "yysmarts_lex_destroy(void*)", referenced from:
      RDKit::(anonymous namespace)::smarts_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "yysmiles_lex_destroy(void*)", referenced from:
      RDKit::(anonymous namespace)::smiles_parse(std::string const&, std::vector<RDKit::RWMol*, std::allocator<RDKit::RWMol*> >&) in SmilesParse.cpp.o
  "_yysmarts_debug", referenced from:
      RDKit::SmartsToMol(std::string, int, bool, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >*) in SmilesParse.cpp.o
  "_yysmiles_debug", referenced from:
      RDKit::SmilesToMol(std::string, int, bool, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >*) in SmilesParse.cpp.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [lib/libSmilesParse.2012.09.1beta.dylib] Error 1
make[1]: *** [Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/all] Error 2
make: *** [all] Error 2


On 10 October 2012 02:35, Greg Landrum greg.land...@gmail.com wrote:

 Just an FYI, not sure if it's relevant or not: I have not yet done an
 rdkit build with boost 1.51, so I am not sure that the problem isn't there

 -greg



 On Tuesday, October 9, 2012, George Papadatos wrote:

 Hi James,

 You're right. I checked out the true HEAD which is 2234 but it still
 failed!
  This is the make log:

 MS-Verdun:build georgep$ cmake -D PYTHON_LIBRARY=/${PYTHON_ROOT}/Python
 -DPYTHON_INCLUDE_DIR=${PYTHON_ROOT}/Headers .. 2>&1 | tee cmake.log
 -- The C compiler identification is GNU 4.2.1
 -- The CXX compiler identification is Clang 4.1.0
 -- Checking whether C compiler has -isysroot
 -- Checking whether C compiler has -isysroot - yes
 -- Checking whether C compiler supports OSX deployment target flag
 -- Checking whether C compiler supports OSX deployment target flag - yes
 -- Check for working C compiler: /usr/bin/gcc
 -- Check for working C compiler: /usr/bin/gcc -- works
 -- Detecting C compiler ABI info
 -- Detecting C compiler ABI info - done
 -- Check for working CXX compiler: /usr/bin/c++
 -- Check for working CXX compiler: /usr/bin/c++ -- works
 -- Detecting CXX compiler ABI info
 -- Detecting CXX compiler ABI info - done
 -- Check if the system is big endian
 -- Searching 16 bit integer
 -- Looking for sys/types.h
 -- Looking for sys/types.h - found
 -- Looking for stdint.h
 -- Looking for stdint.h - found
 -- Looking for stddef.h
 -- Looking for stddef.h - found
 -- Check size of unsigned short
 -- Check size of unsigned short - done
 -- Using unsigned short
 -- Check if the system is big endian - little endian
 -- Found PythonLibs:
 //Library/Frameworks/Python.framework/Versions/2.7/Python (found version
 2.7.3)
 -- Found PythonInterp:
 /Library/Frameworks/Python.framework/Versions/2.7/bin/python (found version
 2.7.3)
 -- Boost version: 1.51.0
 -- Found the following Boost libraries:
 --   python
 -- Found BISON: /usr/bin/bison
 -- Found FLEX: /usr/bin/flex
 -- Looking for include file pthread.h
 -- Looking for include file pthread.h - found
 -- Looking

Re: [Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-10 Thread George Papadatos
Hello again,

Success at last! I managed to build rdkit using brew and boost 1.49.
I think the cause of the problem was a strange combination of Mountain
Lion, boost 1.51 and not up-to-date rdkit svn repo in HomeBrew.

So to summarise:

brew update
brew uninstall boost
brew versions boost
cd /usr/local
git checkout e40bc41 /usr/local/Library/Formula/boost.rb #version 1.49.0
cd

brew install boost --build-from-source

brew untap edc/homebrew-rdkit
brew tap edc/homebrew-rdkit
brew uninstall rdkit
brew install --HEAD rdkit

Thanks for all the tips,

George



On 10 October 2012 12:32, George Papadatos gpapada...@gmail.com wrote:

 Hi Greg,

 I built boost 1.49 from source and tried again. There is now a similar
 error but elsewhere:

  44%] Building CXX object
 Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/SmartsWrite.cpp.o
 Linking CXX shared library ../../../lib/libSmilesParse.dylib
 ld: warning: path
 '//Library/Frameworks/Python.framework/Versions/2.7/Python' following -L
 not a directory
 Undefined symbols for architecture x86_64:
   yysmarts_parse(char const*, std::vectorRDKit::RWMol*,
 std::allocatorRDKit::RWMol* *, void*), referenced from:
   RDKit::(anonymous namespace)::smarts_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   yysmiles_parse(char const*, std::vectorRDKit::RWMol*,
 std::allocatorRDKit::RWMol* *, std::listunsigned int,
 std::allocatorunsigned int *, void*), referenced from:
   RDKit::(anonymous namespace)::smiles_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   yysmarts_lex_init(void**), referenced from:
   RDKit::(anonymous namespace)::smarts_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   yysmiles_lex_init(void**), referenced from:
   RDKit::(anonymous namespace)::smiles_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   setup_smarts_string(std::string const, void*), referenced from:
   RDKit::(anonymous namespace)::smarts_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   setup_smiles_string(std::string const, void*), referenced from:
   RDKit::(anonymous namespace)::smiles_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   yysmarts_lex_destroy(void*), referenced from:
   RDKit::(anonymous namespace)::smarts_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   yysmiles_lex_destroy(void*), referenced from:
   RDKit::(anonymous namespace)::smiles_parse(std::string const,
 std::vectorRDKit::RWMol*, std::allocatorRDKit::RWMol* ) in
 SmilesParse.cpp.o
   _yysmarts_debug, referenced from:
   RDKit::SmartsToMol(std::string, int, bool, std::mapstd::string,
 std::string, std::lessstd::string, std::allocatorstd::pairstd::string
 const, std::string  *) in SmilesParse.cpp.o
   _yysmiles_debug, referenced from:
   RDKit::SmilesToMol(std::string, int, bool, std::mapstd::string,
 std::string, std::lessstd::string, std::allocatorstd::pairstd::string
 const, std::string  *) in SmilesParse.cpp.o
 ld: symbol(s) not found for architecture x86_64
 clang: error: linker command failed with exit code 1 (use -v to see
 invocation)
 make[2]: *** [lib/libSmilesParse.2012.09.1beta.dylib] Error 1
 make[1]: *** [Code/GraphMol/SmilesParse/CMakeFiles/SmilesParse.dir/all]
 Error 2
 make: *** [all] Error 2


 On 10 October 2012 02:35, Greg Landrum greg.land...@gmail.com wrote:

 Just an FYI, not sure if it's relevant or not: I have not yet done an
 rdkit build with boost 1.51, so I am not sure that the problem isn't there

 -greg



 On Tuesday, October 9, 2012, George Papadatos wrote:

 Hi James,

 You're right. I checked out the true HEAD which is 2234 but it still
 failed!
  This is the make log:

 MS-Verdun:build georgep$ cmake -D PYTHON_LIBRARY=/${PYTHON_ROOT}/Python
 -DPYTHON_INCLUDE_DIR=${PYTHON_ROOT}/Headers .. 2>&1 | tee cmake.log
 -- The C compiler identification is GNU 4.2.1
 -- The CXX compiler identification is Clang 4.1.0
 -- Checking whether C compiler has -isysroot
 -- Checking whether C compiler has -isysroot - yes
 -- Checking whether C compiler supports OSX deployment target flag
 -- Checking whether C compiler supports OSX deployment target flag - yes
 -- Check for working C compiler: /usr/bin/gcc
 -- Check for working C compiler: /usr/bin/gcc -- works
 -- Detecting C compiler ABI info
 -- Detecting C compiler ABI info - done
 -- Check for working CXX compiler: /usr/bin/c++
 -- Check for working CXX compiler: /usr/bin/c++ -- works
 -- Detecting CXX compiler ABI info
 -- Detecting CXX compiler ABI info - done
 -- Check if the system is big endian
 -- Searching 16 bit integer
 -- Looking for sys/types.h
 -- Looking for sys/types.h - found
 -- Looking for stdint.h

Re: [Rdkit-discuss] problems with RDKit and Mountain Lion

2012-10-09 Thread George Papadatos
Hi James,

You're right. I checked out the true HEAD which is 2234 but it still
failed!
 This is the make log:

MS-Verdun:build georgep$ cmake -D PYTHON_LIBRARY=/${PYTHON_ROOT}/Python
-DPYTHON_INCLUDE_DIR=${PYTHON_ROOT}/Headers .. 2>&1 | tee cmake.log
-- The C compiler identification is GNU 4.2.1
-- The CXX compiler identification is Clang 4.1.0
-- Checking whether C compiler has -isysroot
-- Checking whether C compiler has -isysroot - yes
-- Checking whether C compiler supports OSX deployment target flag
-- Checking whether C compiler supports OSX deployment target flag - yes
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Found PythonLibs:
//Library/Frameworks/Python.framework/Versions/2.7/Python (found version
2.7.3)
-- Found PythonInterp:
/Library/Frameworks/Python.framework/Versions/2.7/bin/python (found version
2.7.3)
-- Boost version: 1.51.0
-- Found the following Boost libraries:
--   python
-- Found BISON: /usr/bin/bison
-- Found FLEX: /usr/bin/flex
-- Looking for include file pthread.h
-- Looking for include file pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - found
-- Found Threads: TRUE
-- Boost version: 1.51.0
-- Found the following Boost libraries:
--   regex
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/georgep/rdkit/rdkit-code/build

*and the error is:*

Linking CXX shared library ../../lib/libGraphMol.dylib
ld: warning: path
'//Library/Frameworks/Python.framework/Versions/2.7/Python' following -L
not a directory
Undefined symbols for architecture x86_64:
  boost::system::system_category(), referenced from:
  __GLOBAL__I_a in QueryAtom.cpp.o
  __GLOBAL__I_a in QueryBond.cpp.o
  __GLOBAL__I_a in ROMol.cpp.o
  __GLOBAL__I_a in QueryOps.cpp.o
  boost::mutex::mutex() in MolPickler.cpp.o
  __GLOBAL__I_a in MolPickler.cpp.o
  __GLOBAL__I_a in AtomIterators.cpp.o
  ...
  boost::system::generic_category(), referenced from:
  __GLOBAL__I_a in QueryAtom.cpp.o
  __GLOBAL__I_a in QueryBond.cpp.o
  __GLOBAL__I_a in ROMol.cpp.o
  __GLOBAL__I_a in QueryOps.cpp.o
  __GLOBAL__I_a in MolPickler.cpp.o
  __GLOBAL__I_a in AtomIterators.cpp.o
  __GLOBAL__I_a in AddHs.cpp.o
  ...
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
make[2]: *** [lib/libGraphMol.2012.09.1beta.dylib] Error 1
make[1]: *** [Code/GraphMol/CMakeFiles/GraphMol.dir/all] Error 2
make: *** [all] Error 2

Any more ideas?

Regards,

George




On 9 October 2012 22:33, James Swetnam jswet...@gmail.com wrote:

 George-

 My templating fix was submitted as 2155, and HEAD in SVN is at 2234.  I'm
 not terribly familiar with homebrew, or why it thinks 2148 is HEAD

 James


 On Tue, Oct 9, 2012 at 2:27 PM, George Papadatos gpapada...@gmail.com wrote:

 Hi James,

 Many thanks for the quick answer. I'm afraid I'm already using the trunk:
 brew install -v --HEAD rdkit --with-inchi
 (revision 2148)

 Regards,

 George


 On 9 October 2012 21:30, James Swetnam jswet...@gmail.com wrote:

 George-

 I believe you're running into an issue that was raised on the developer
 list.  I submitted a patch for this issue, which has been applied in the
 SVN trunk.  If you install from trunk you should be fine.

 Best
 James

 On Tue, Oct 9, 2012 at 12:52 PM, George Papadatos 
 gpapada...@gmail.com wrote:

 Hi RDKitters,

 I get compilation errors when I try to build RDKit on a new Mountain
 Lion Mac OS machine.
 I've tried both Eddie's brew formula and manual installation with
 cmake. I also tried both the beta 2012_09 versions and the 2012_06 one.
 Apart from the system python, I use the python.org version (2.7.3)
 I also used brew to build boost from source. I copied the error I get
 at the bottom of this message.

 Has anyone had a similar problem? Any ideas for troubleshooting?

 Many thanks,

 George


 Linking CXX shared library ../../lib/libGraphMol.dylib
 cd /tmp/rdkit-urlC/Code/GraphMol 
 /usr/local/Cellar/cmake/2.8.9/bin/cmake -E cmake_link_script
 CMakeFiles/GraphMol.dir/link.txt --verbose=1
 /usr/local/Library/ENV/4.3/c++   -shared   -compatibility_version 1.0.0
 -current_version 2012.9.1 -o ../../lib/libGraphMol.2012.09.1pre.dylib
 -install_name /tmp/rdkit-urlC/lib

Re: [Rdkit-discuss] parallel conformation generation

2012-10-05 Thread George Papadatos
Hi Andrew,

Thanks for this. I didn't know about the futures and progressbar modules.

You wrote:
---
*I have to use the zip because map(f, iterable, [chunksize=None]) only
takes a single iterable. This also means I need to change the
generateconformations
function so that it takes a single element as input, which is a 2-element
tuple of the molecule and the count.*
---

For such cases, there is a more elegant and pythonic way: functools.partial
http://docs.python.org/library/functools.html#functools.partial
It just freezes some of the arguments of a function, so you can use map
with a single argument.

In your case:
from functools import partial
newfunc = partial(generateconformations, n=n)
map(newfunc, mols)
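
A self-contained sketch of the same idea follows. It uses a stand-in function, `generate_ids`, in place of generateconformations (the name and the placeholder molecules are hypothetical), so it runs without RDKit. The keyword passed to partial has to match the parameter name of the wrapped function.

```python
from functools import partial


def generate_ids(mol, n):
    """Stand-in for generateconformations(m, n): returns the input and n fake IDs."""
    return mol, list(range(n))


# Freeze the count so that the resulting callable takes a single argument.
generate_five = partial(generate_ids, n=5)

mols = ["mol_a", "mol_b"]  # placeholders for RDKit molecules
results = list(map(generate_five, mols))
```

As far as I know, a partial object can also be handed to a process pool's map, provided the wrapped function is defined at module level so that it can be pickled.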


Best regards,

George P.



On 4 October 2012 22:47, Andrew Dalke da...@dalkescientific.com wrote:

 Hi again,

  Greg asked why I used the concurrent.futures module rather than
 the multiprocessing module which is standard with Python 2.6.


 There are a few differences in the API which makes the futures
 module more interesting. First off, here's how you could write
 the same process pool part using the existing multiprocessing module:


 from multiprocessing import Pool
 p = Pool(5)
 for mol, ids in p.map(generateconformations, zip(suppl, [n]*len(suppl))):
     for id in ids:
         writer.write(mol, confId=id)

 I have to use the zip because map(f, iterable, [chunksize=None]) only
 takes a single iterable. This also means I need to change the
 generateconformations
 function so that it takes a single element as input, which is a 2-element
 tuple of the molecule and the count. (That is, change from

 def generateconformations(m, n):
   ...

 to

 def generateconformations((m, n)):
   ...

 ).

 That's a touch uglier, but doable.

 Now, when I posted the code yesterday, I should have posted the simplest
 version of the code, which is:

 with futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
     for mol, ids in executor.map(generateconformations, suppl,
                                  [n]*len(suppl)):
         for id in ids:
             writer.write(mol, confId=id)


 Then Greg wouldn't have asked me about how complex my code was. ;)


 This is the easiest to understand. You can see that this API supports
 multiple iterators. I used [n]*len(suppl) to make a new list containing
 repeats of the count, so I could have the twin iterators of the molecules
 and the count. This is a bit simpler than the multiprocessing code.

 In addition, the with statement knows how to work with an executor. Here
 it means that all submitted jobs must finish before leaving the with block,
 and the process pool will be shut down; even if there's an exception.
 With the multiprocessing module, you need to manage that yourself, or
 trust in the memory manager.
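
The multiple-iterable form of executor.map can be tried without RDKit. The sketch below is only an illustration: `generate_ids` and the string "molecules" are hypothetical stand-ins, and it uses ThreadPoolExecutor so that it runs without a __main__ guard (ProcessPoolExecutor offers the same map method). itertools.repeat supplies the count instead of building [n]*len(suppl).

```python
from concurrent import futures
from itertools import repeat


def generate_ids(mol, n):
    """Stand-in for generateconformations(m, n)."""
    return mol, list(range(n))


mols = ["mol_a", "mol_b", "mol_c"]  # placeholders for molecules from a supplier
written = []

# Leaving the with block waits for outstanding work and shuts the pool down.
with futures.ThreadPoolExecutor(max_workers=2) as executor:
    # map() takes several iterables and stops at the shortest one,
    # so the endless repeat(3) is cut off after len(mols) calls.
    for mol, ids in executor.map(generate_ids, mols, repeat(3)):
        for conf_id in ids:
            written.append((mol, conf_id))
```

Results come back in submission order, which is why the loop can write them out directly.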


 But yesterday I wrote something more like this:

     # Submit a set of asynchronous jobs
     jobs = []
     for mol in suppl:
         if mol:
             job = executor.submit(generateconformations, mol, n)
             jobs.append(job)

     # Process the job results (in submission order) and save the conformers.
     for job in jobs:
         mol, ids = job.result()
         for id in ids:
             writer.write(mol, confId=id)


 The submit immediately returns a 'future' object, which is called a
 promise in some other languages. You can ask for its .result() to
 get its result. That call will block (up to a timeout) if the result
 isn't there. You can also check to see if there is a result.
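
A minimal sketch of the future object described above, using a trivial stand-in job (`square` is hypothetical) and a thread pool so that it runs anywhere:

```python
from concurrent import futures


def square(x):
    """Trivial stand-in for a real job."""
    return x * x


with futures.ThreadPoolExecutor(max_workers=2) as executor:
    job = executor.submit(square, 7)  # returns a Future immediately
    value = job.result(timeout=10)    # blocks until the result is ready, or times out
    finished = job.done()             # True once the call has completed
```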

 The reason I did this is because I usually 1) show a progress bar
 and 2) have enough memory to store all the results in memory.

 I've enjoyed using the 'progressbar' module, from
  http://pypi.python.org/pypi/progressbar/

 I have code which looks like this:

     with futures.ProcessPoolExecutor(max_workers=4) as executor:
         for (collection, first_id, last_id) in blocks:
             jobs.append(executor.submit(process_block, tmpdir, config,
                                         collection, first_id, last_id))

         widgets = ["Fingerprinting ", progressbar.Percentage(), " ",
                    progressbar.ETA(), " ", progressbar.Bar()]
         pbar = progressbar.ProgressBar(widgets=widgets, maxval=len(jobs))
         for job in pbar(futures.as_completed(jobs)):
             job.result()


 This submits all of the fingerprinting jobs to the process pool.
 The futures.as_completed() function takes an iterable of jobs
 and returns each one as they become available, no matter what the
 order is. Then the ProgressBar sees the new item, updates the
 terminal display to show progress information and an ETA, only
 to return the original object itself as an iterator. Finally,
 I call job.result() in the loop, since .result() will forward
 any exceptions if one had happened during the original call.
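
The as_completed pattern, including the way .result() forwards exceptions, can be sketched without the progress bar. Everything here is a stand-in: `process_block` is a hypothetical job that fails for one input, and a thread pool replaces the process pool so that the snippet runs as-is.

```python
from concurrent import futures


def process_block(block_id):
    """Stand-in job: fails for one block to show exception forwarding."""
    if block_id == 2:
        raise ValueError("bad block %d" % block_id)
    return block_id * 10


results, errors = [], []
with futures.ThreadPoolExecutor(max_workers=4) as executor:
    jobs = [executor.submit(process_block, i) for i in range(4)]
    # as_completed yields each future as soon as it finishes, in any order.
    for job in futures.as_completed(jobs):
        try:
            results.append(job.result())
        except ValueError as err:
            # result() re-raises whatever the worker raised.
            errors.append(str(err))

results.sort()  # completion order is not deterministic
```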

 Then if I want the results I iterate over them again:

     for job in jobs:
         ... do something with job.result() ...



 BTW, you don't need to keep things around in memory. You can also do
 things purely asynchronously, should the output order not matter.
 In that case, the 

Re: [Rdkit-discuss] Reading files (SmilesMolSupplier, SDMolSupplier

2012-09-07 Thread George Papadatos
Hi Fabian,

The first one is easy: the function expects a header in the file by default. 
There is a parameter that toggles this but I don't have access to a computer 
right now. There is an example in the documentation. 
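
For reference, the parameter in question should be titleLine, which defaults to True. The sketch below assumes a working RDKit installation; the temporary file and its three records are made up for illustration.

```python
import os
import tempfile

from rdkit import Chem

# Three records and no header line, similar to the file in the question.
path = os.path.join(tempfile.mkdtemp(), "test.smi")
with open(path, "w") as handle:
    handle.write("C mola\nCC molb\nCCC molc\n")

# Default (titleLine=True): the first line is consumed as a header.
with_header = [m.GetProp("_Name") for m in Chem.SmilesMolSupplier(path)]

# titleLine=False: every line is parsed as a molecule.
no_header = [m.GetProp("_Name")
             for m in Chem.SmilesMolSupplier(path, titleLine=False)]
```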

Best regards, 

George 

Sent from my gPad

On 7 Sep 2012, at 13:34, Fabian Dey fabian...@gmail.com wrote:

 Hi 
 
 I found two issues when reading files:
 
 1)  I might be getting something wrong here, but it seems as if 
 SmilesMolSupplier misses the very first Smiles:
 
 input smiles file test.smi:
 C mola
 CC molb
 CCC molc
  mold
 
 
 # python script
 from rdkit import Chem
 suppl = Chem.SmilesMolSupplier('test.smi')
 
 print("TEST-1 : %s %s" % (Chem.MolToSmiles(suppl[0]), suppl[0].GetProp("_Name")))
 print("")
 
 for mol in suppl:
     print("TEST-2 : %s %s" % (Chem.MolToSmiles(mol), mol.GetProp("_Name")))
 
 print("")
 
 for i, mol in enumerate(suppl):
     print("TEST-3 : %s %s" % (Chem.MolToSmiles(mol), mol.GetProp("_Name")))
 
 
 #output 
 TEST-1 : CC molb
 
 TEST-2 : CC molb
 TEST-2 : CCC molc
 TEST-2 :  mold
 
 TEST-3 : CC molb
 TEST-3 : CCC molc
 TEST-3 :  mold
 
 The first molecule mola is not available through the supplier (also happens 
 with other smiles files).
 
 
 2) SDMolSupplier  : I have a script which calculates properties from SDfiles 
 read in through the corresponding supplier
 and RDKIT occasionally reported the following errors:
 
 
 Pre-condition Violation
 Atomic number not found
 Violation occurred on line 56 in file 
 /home/dey/Downloads/RDKit_2012_06_1/Code/GraphMol/PeriodicTable.h
 Failed Expression: atomicNumber<byanum.size()
 
 
 [12:25:23] Unexpected error hit on line 6
 [12:25:23] ERROR: moving to the begining of the next molecule
 ERROR for molecule at position 0
 
 
 It turned out that for the corresponding SD-file the atom elements were 
 written in all capital letters  (e.g. CL) - if these
 were changed to the proper format (Cl) RDKIT passed without throwing an 
 error. Although I can preprocess the SD-files
 with a script, it would be nice if RDKIT could handle these cases internally.
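
A preprocessing step of the kind mentioned above could look like the sketch below. It is an assumption-laden illustration, not an RDKit feature: `fix_element_case` is a hypothetical helper that handles one V2000 mol block at a time (an SD file would first be split on its $$$$ record separators), and it relies on the fixed-column layout where the counts line is the fourth line and the atom symbol sits in columns 32-34. Non-element symbols such as LP would also be rewritten, so treat it with care.

```python
def fix_element_case(molblock):
    """Rewrite atom symbols such as 'CL' as 'Cl' in one V2000 mol block."""
    lines = molblock.split("\n")
    n_atoms = int(lines[3][0:3])  # counts line: columns 1-3 hold the atom count
    for i in range(4, 4 + n_atoms):  # the atom block follows the counts line
        line = lines[i]
        symbol = line[31:34].strip()  # columns 32-34 hold the atom symbol
        lines[i] = line[:31] + symbol.capitalize().ljust(3) + line[34:]
    return "\n".join(lines)
```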
 
 Best
 Fabian
 
 
 
 
 --
 Live Security Virtual Conference
 Exclusive live event will cover all the ways today's security and 
 threat landscape has changed and how IT managers can respond. Discussions 
 will include endpoint security, mobile security and the latest in malware 
 threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
 ___
 Rdkit-discuss mailing list
 Rdkit-discuss@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Faulty valence for nitrogen in aromatic ring

2012-08-16 Thread George Papadatos
Wow, this almost makes me want to rewrite my thesis in LaTeX. Almost!
:)

George


On 15 August 2012 16:26, Greg Landrum greg.land...@gmail.com wrote:

 On Wed, Aug 15, 2012 at 5:10 PM, Michael Palmer mpal...@uwaterloo.ca
 wrote:
 
  Now that I've at least tried to clear up what is going on, maybe I can
  be more helpful: was there a specific question you were trying to
  answer that led you to your discovery that the RDKit behaves strangely
  in this special case?
 
 
  What I'm trying to do can be inspected here:
 
  http://chimpsky.uwaterloo.ca/mol2chemfig/index
 
  Briefly, I'm building a program for converting molecular structures from
  smiles or molfile format to TeX code, using the syntax defined by the
  chemfig package as the target.

 , coool!

  rdkit does all the heavy lifting. I was using
  the GetImplicitHs method to determine how many hydrogens to attach to
  carbons and heteroatoms and then noticed that the number of hydrogens on
  nitrogen in rings was off.
 
  From your answer, it seems I should be using GetTotalNumHs. However, I
 would
  still like to be able to distinguish between hydrogens that were
 specified
  in a molfile, with coordinates, and those that weren't.

 the answer to this isn't super straightforward, so it probably won't
 come until tomorrow.

 
  Another question I ran into was accessing the coordinates of an atom,
 either
  loaded from molfile or, with smiles, computed with
 AllChem.Compute2DCoords.
  Does the atom object have a method to get at those? Right now, I'm using
  some embarrassing workaround.

 This one I can answer quickly. You need the molecule's conformer:
 In [7]: AllChem.Compute2DCoords(m)
 Out[7]: 0

 In [8]: conf = m.GetConformer()

 In [9]: for atom in m.GetAtoms():
    ...:     aid = atom.GetIdx()
    ...:     print aid,list(conf.GetAtomPosition(aid))
    ...:
 0 [0.15858546683951269, -1.1294387542967057, 0.0]
 1 [-1.3046720119188515, -1.4594047386916416, 0.0]
 2 [-2.3220596761687866, -0.35716958200679838, 0.0]
 3 [-1.8761898616603592, 1.07503155907298, 0.0]
 4 [-0.41293238290199596, 1.4049975434679163, 0.0]
 5 [0.60445528134793969, 0.30276238678307321, 0.0]
 6 [2.0677127601063026, 0.63272837117800962, 0.0]
 7 [3.0851004243562379, -0.46950678550683356, 0.0]

 -greg




Re: [Rdkit-discuss] windows binary install

2012-08-11 Thread George Papadatos
Hi Alan,

You're almost there but it seems now that you need to upgrade your numpy from 
1.4 to 1.6. 

Regards,

George P. 

Sent from my gPhone

On 11 Aug 2012, at 21:38, stanley5101 stanley5...@yahoo.co.uk wrote:

 I've gone to this archived message and installed the library mentioned.  It 
 just installed itself rather than asking me where I wanted to put it.  
 However, rdkit seems to be recognising it as I now get a new error (pasted 
 below) .  Do I have to go to an older rdkit which likes numpy version 4? 
  
  
 Python 2.7.1 |EPD 7.0-2 (32-bit)| (r271:86832, Dec  2 2010, 10:35:02) [MSC 
 v.1500 32 bit (Intel)] on win32
 Type "copyright", "credits" or "license()" for more information.
 >>> import rdkit
 >>> from rdkit import Chem
 RuntimeError: module compiled against API version 6 but this version of numpy 
 is 4
 RuntimeError: module compiled against API version 6 but this version of numpy 
 is 4
 
 From: James Davidson j.david...@vernalis.com
 To: stanley5...@yahoo.co.uk 
 Cc: rdkit-discuss@lists.sourceforge.net 
 Sent: Saturday, 11 August 2012, 7:58
 Subject: Re: [Rdkit-discuss] windows binary install
 
 Hi Alan,
  
 My guess is that your problem is missing DLLs, available in the MS C++ 
 Redistributable package – solution described by George for a very similar 
 problem:  
 http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg02381.html.
  
 I now tend to explicitly just put a copy of these two DLLs into the RDKit lib 
 folder when installing for others, and I can reproduce your error if I remove 
 one of these DLLs from there on my system.
  
 Cheers
  
 James
  
 
 
 


Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
 packages...
  en_au
  en_gb
  en_us
  en_za
Setting up postgresql-9.1 (9.1.3-2) ...
Creating new cluster (configuration: /etc/postgresql/9.1/main, data:
/var/lib/postgresql/9.1/main)...
Moving configuration file /var/lib/postgresql/9.1/main/postgresql.conf to
/etc/postgresql/9.1/main...
Moving configuration file /var/lib/postgresql/9.1/main/pg_hba.conf to
/etc/postgresql/9.1/main...
Moving configuration file /var/lib/postgresql/9.1/main/pg_ident.conf to
/etc/postgresql/9.1/main...
Configuring postgresql.conf to use port 5432...
update-alternatives: using
/usr/share/postgresql/9.1/man/man1/postmaster.1.gz to provide
/usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode.
 * Starting PostgreSQL 9.1 database server

   [ OK ]
Setting up postgresql (9.1+130~precise) ...
Setting up postgresql-server-dev-9.1 (9.1.3-2) ...
Setting up postgresql-server-dev-all (130~precise) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place

Again, *all* the tests fail as do the create extension attempts.

I even tried explicit postgresql-9.1 and postgresql-9.2 (beta version) but
with the same sad results.

Am I doing something wrong here, like still installing the default postgresql
packages and not the good ones?

Regards,

George

On 29 May 2012 23:27, George Papadatos gpapada...@gmail.com wrote:

 Hi Jan,

 Many thanks for the reply.
 Yes, I used apt-get and the default repositories to install postgresql on
 a  Ubuntu 12.04.
 I'll follow your guidelines and the new repos tomorrow and I'll let you
 know.

 Many thanks again,

 George


 On 29 May 2012 23:15, Jan Holst Jensen j...@biochemfusion.com wrote:

  On 2012-05-29 17:45, George Papadatos wrote:

 Hi RDKitters,

  Today I tried to install the RDKit and cartridge to a brand new Ubuntu
 12.04 32-bit running on a Virtual Box.

  [...]

  Then when I tried to install the extension:
  georgep@george-VB:~$ psql -c 'CREATE EXTENSION rdkit' gpdb
 FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
 FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
 connection to server was lost

  or even:
 georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ psql gpdb
 psql (9.1.3)
 Type "help" for help.

  gpdb=# create extension rdkit;
 FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
 FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
 The connection to the server was lost. Attempting reset: Succeeded.
 gpdb=# show rdkit.tanimoto_threshold;
 ERROR:  unrecognized configuration parameter "rdkit.tanimoto_threshold"


  Any ideas would be much appreciated!


 Hi George,

 Sounds exactly like the behavior I described in this thread:


 http://sourceforge.net/mailarchive/forum.php?thread_name=CAD4fdRRHdpqDRCWd5AjEzDWJia5WM6zsq%3DosvVmb%3DYHe%3DpmR7A%40mail.gmail.com&forum_name=rdkit-discuss

 My issue seemed to be related to the OpenSCG version of PostgreSQL and
 for my purposes the issue was solved by using Martin Pitt's postgres
 packages instead. However, on one machine, a Linux Mint box, I never got it
 working. Are you using all plain vanilla packages from Ubuntu 12.04 ?

 Cheers
 -- Jan





Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
Hi Jan,

Mine is exactly the same:
gcc test.c -I/usr/include/postgresql/9.1/server;./a.out
90103
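
Assuming test.c simply prints the PG_VERSION_NUM macro from the server headers (the file itself was not posted), the value can be decoded as follows. The sketch uses the numbering scheme of PostgreSQL releases before version 10 (major*10000 + minor*100 + patch); `decode_pg_version_num` is a hypothetical helper, and the >= 90100 comparison is my reading of the check in guc.c.

```python
def decode_pg_version_num(num):
    """Split a pre-10 PG_VERSION_NUM (e.g. 90103) into (major, minor, patch)."""
    major, rest = divmod(num, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


version = decode_pg_version_num(90103)
# Assumed form of the condition tested in guc.c.
takes_91_branch = 90103 >= 90100
```

So 90103 corresponds to 9.1.3, the same version that psql reports, and the 9.1 branch of the conditional would be the one compiled in.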

So, I am back to square 1!

I am starting to get a bit desperate here, has anyone ever successfully
built the cartridge from the trunk on a plain Ubuntu 12.04?

Many thanks for your help,

George


On 30 May 2012 12:48, Jan Holst Jensen j...@biochemfusion.com wrote:

 On 2012-05-30 13:24, George Papadatos wrote:

 Thanks to both of you.

 I do not know how to check for the PG_VERSION_NUM. I tried editing
 guc.c by removing the conditional check on the PG_VERSION, but with the same
 results.
 Adrian, is this what you meant?

  DefineCustomRealVariable(
      "rdkit.tanimoto_threshold",
      "Lower threshold of Tanimoto similarity",
      "Molecules with similarity lower than threshold are not similar by % operation",
      &rdkit_tanimoto_smlar_limit,
      0.5,
      0.0,
      1.0,
      PGC_USERSET,
      0,
      (GucRealCheckHook)TanimotoLimitAssign,
      NULL,
      NULL
      );

  DefineCustomRealVariable(
      "rdkit.dice_threshold",
      "Lower threshold of Dice similarity",
      "Molecules with similarity lower than threshold are not similar by # operation",
      &rdkit_dice_smlar_limit,
      0.5,
      0.0,
      1.0,
      PGC_USERSET,
      0,
      (GucRealCheckHook)DiceLimitAssign,
      NULL,
      NULL
      );

 Regards,

 George


 Hi George,

 I just tried the same on my VM, with no change for the better either. My
 version of guc.c now looks like this:

 static void
 initRDKitGUC()
 {
   if (rdkit_guc_inited)
     return;


   DefineCustomRealVariable(
       "rdkit.tanimoto_threshold",
       "Lower threshold of Tanimoto similarity",
       "Molecules with similarity lower than threshold are not similar by % operation",
       &rdkit_tanimoto_smlar_limit,
       0.5,
       0.0,
       1.0,
       PGC_USERSET,
       0,
 //if PG_VERSION_NUM >= 90100
       (GucRealCheckHook)TanimotoLimitAssign,
       NULL,
 //else
 //   TanimotoLimitAssign,
 //endif

       NULL
       );

   DefineCustomRealVariable(
       "rdkit.dice_threshold",
       "Lower threshold of Dice similarity",
       "Molecules with similarity lower than threshold are not similar by # operation",
       &rdkit_dice_smlar_limit,
       0.5,
       0.0,
       1.0,
       PGC_USERSET,
       0,
 //if PG_VERSION_NUM >= 90100
       (GucRealCheckHook)DiceLimitAssign,
       NULL,
 //else
 //   DiceLimitAssign,
 //endif
       NULL
       );

   rdkit_guc_inited = true;
 }

 Did a cartridge make clean, make, sudo make install, and it still
 fails for me with

 postgres=# create extension rdkit;

 FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
 FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
 The connection to the server was lost. Attempting reset: Succeeded.
 postgres=#

 Cheers
 -- Jan



Re: [Rdkit-discuss] cartridge problems

2012-05-30 Thread George Papadatos
Compare it with this one:

georgep@george-VB:~/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit$ make
installcheck
/usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
--inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
--dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91 sfpgist
slfpgist fps
(using postmaster on Unix socket, default port)
== dropping database contrib_regression ==
DROP DATABASE
== creating database contrib_regression ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test rdkit-91 ... FAILED
test props... FAILED
test btree... FAILED
test molgist  ... FAILED
test bfpgist-91   ... FAILED
test sfpgist  ... FAILED
test slfpgist ... FAILED
test fps  ... FAILED

==
 8 of 8 tests failed.
==

The differences that caused some tests to fail can be viewed in the
file
/home/georgep/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit/regression.diffs.
 A copy of the test summary that you see
above is saved in the file
/home/georgep/local/rdkit/rdkit_trunk/Code/PgSQL/rdkit/regression.out.

make: *** [installcheck] Error 1

gpdb=# create extension rdkit with schema rdkit;
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
FATAL:  failed to initialize rdkit.tanimoto_threshold to 0.5
The connection to the server was lost. Attempting reset: Succeeded.
gpdb=#


On 30 May 2012 14:49, Adrian Schreyer ams...@cam.ac.uk wrote:

 64-bit, PostgreSQL packages are from the official archive. I simply do
 'make' followed by 'sudo make install' and then

 create schema rdkit;
 create extension rdkit with schema rdkit;

 and that's it.

 $ make installcheck

 /usr/lib/postgresql/9.1/lib/pgxs/src/makefiles/../../src/test/regress/pg_regress
 --inputdir=. --psqldir='/usr/lib/postgresql/9.1/bin'
 --dbname=contrib_regression rdkit-91 props btree molgist bfpgist-91
 sfpgist slfpgist fps
 (using postmaster on Unix socket, default port)
 == dropping database contrib_regression ==
 DROP DATABASE
 == creating database contrib_regression ==
 CREATE DATABASE
 ALTER DATABASE
 == running regression test queries==
 test rdkit-91 ... ok
 test props... ok
 test btree... ok
 test molgist  ... ok
 test bfpgist-91   ... ok
 test sfpgist  ... ok
 test slfpgist ... ok
 test fps  ... ok

 =
  All 8 tests passed.
 =

 On Wed, May 30, 2012 at 2:43 PM, Jan Holst Jensen j...@biochemfusion.com
 wrote:
  How odd. Adrian, are you using a 32-bit or 64-bit version of Ubuntu
 12.04 ?
 
  Cheers
  -- Jan
 
 
  On 2012-05-30 15:26, Adrian Schreyer wrote:
 
  Yes, I could build and install the cartridge without problems
  (Release_2012.03.1) on 12.04.
 
  On Wed, May 30, 2012 at 2:23 PM, George Papadatosgpapada...@gmail.com
   wrote:
 
  Hi Jan,
 
  Mine is exactly the same:
  gcc test.c -I/usr/include/postgresql/9.1/server;./a.out
  90103
 
  So, I am back to square 1!
 
  I am starting to get a bit desperate here, has anyone ever successfully
  built the cartridge from the trunk on a plain Ubuntu 12.04?
 
  Many thanks for your help,
 
  George
 
 



[Rdkit-discuss] Python 2.7 binaries for WinXP VirtualBox - FYI

2012-05-10 Thread George Papadatos
Hello RDKiters,

Just a quick thing to say that I had some problems installing the latest
2012_03 binaries on a Virtual WinXP Pro machine.
The error was deceptively familiar:

In [1]: from rdkit.Chem import AllChem as Chem
---
ImportError                               Traceback (most recent call last)
C:\Documents and Settings\georgep\<ipython-input-1-395511f74b21> in <module>()
----> 1 from rdkit.Chem import AllChem as Chem

C:\RDKit_2012_03_1\rdkit\Chem\__init__.py in <module>()
     16
     17
---> 18 from rdkit import rdBase
     19 from rdkit import RDConfig
     20

ImportError: DLL load failed: The specified module could not be found.

...and it is usually attributed to not setting the PATH properly. After
making sure that this was fine, I had to use the Dependency Walker against
rdBase.pyd, which pointed out that a couple of DLLs were missing
(msvcp100 and msvcr100). Everything was solved after installing the MS
C++ redist package I found here:
http://www.microsoft.com/en-us/download/details.aspx?id=

I hope this will prevent somebody else from wasting their morning with
troubleshooting!

Regards,

George Papadatos


Re: [Rdkit-discuss] Failed Expression: pick = 0

2012-05-02 Thread George Papadatos
Hi Andrew,

This is probably not going to solve the problem at hand but it may be
useful to you or others in the future:
ChEMBLdb maintains a molecular hierarchy table where you can retrieve the
parent (=desalted - using Pipeline Pilot) structures for each molecular
entity.
You may try something like this:

select distinct cs.molregno, cs.molfile, cs.canonical_smiles
from compound_structures cs, molecule_hierarchy mh
where cs.molregno = mh.parent_molregno

This will give you all the *unique* desalted structures in ChEMBL.
In case you also want to keep track of the molregnos of the salt forms for
each parent structure, try (MySQL-specific):

select cs.molregno, group_concat(mh.molregno), cs.molfile,
cs.canonical_smiles
from compound_structures cs, molecule_hierarchy mh
where cs.molregno = mh.parent_molregno
group by cs.molregno
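
To make the effect of the grouped query concrete, here is a small Python sketch that runs it against an in-memory SQLite database (SQLite happens to provide group_concat as well). The table and column names follow the queries above, but the rows are invented toy data rather than real ChEMBL records, and the molfile column is left out to keep it short.

```python
import sqlite3

# Toy data only: the molregnos and SMILES below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table compound_structures (molregno integer primary key,
                                      canonical_smiles text);
    create table molecule_hierarchy (molregno integer primary key,
                                     parent_molregno integer);
    insert into compound_structures values (1, 'CCN'), (2, 'CCN.Cl'), (3, 'CCO');
    -- 1 is its own parent, 2 is a salt form of 1, 3 has no salt forms
    insert into molecule_hierarchy values (1, 1), (2, 1), (3, 3);
""")

query = """
    select cs.molregno, group_concat(mh.molregno), cs.canonical_smiles
    from compound_structures cs, molecule_hierarchy mh
    where cs.molregno = mh.parent_molregno
    group by cs.molregno
    order by cs.molregno
"""
rows = conn.execute(query).fetchall()

# Map each parent molregno to the sorted molregnos of all its forms.
# group_concat does not guarantee an order, hence the explicit sort.
forms_by_parent = {
    parent: sorted(int(m) for m in members.split(","))
    for parent, members, _smiles in rows
}
print(forms_by_parent)
```

Each parent appears once, with the molregnos of the parent itself and of its salt forms collected into one list.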

I hope it helps.

Best regards,

George Papadatos
EMBL-EBI


On 30 April 2012 21:32, Andrew Dalke da...@dalkescientific.com wrote:

 I'm desalting the ChEMBL data set and generating the corresponding
 de-salted SD and SMILES files. I found a problem in the conversion step,
 and found that the problem has nothing to do with the de-salting.

 My code failed with CHEMBL1269997, which is record ~750,200 out of
 1,142,974. (In other words, it took a while to get to this point.) Here's a
 reproducible:

 >>> from rdkit import Chem
 >>> writer = Chem.SDWriter("/dev/stdout")
 >>> for mol in Chem.ForwardSDMolSupplier("CHEMBL1269997.sdf"):
 ...   writer.write(mol)
 ...
 [22:11:05]

 Invariant Violation

 Violation occurred on line 388 in file
 /tmp/homebrew-rdkit-HEAD-Ebdo/Code/GraphMol/FileParsers/MolFileStereochem.cpp
 Failed Expression: pick = 0

 Traceback (most recent call last):
   File "<stdin>", line 2, in <module>
 RuntimeError: Invariant Violation
 >>> Chem.MolToSmiles(mol)
 'OCC1=CC2OC(CC(C)C)(CC(C)C)C3C4CCCC56C(OC(C)(C)O5)C1(O)C46C23'
 >>> Chem.MolToSmiles(mol, isomericSmiles=True)
 'OCC1=C[C@@H]2OC(CC(C)C)(CC(C)C)[C@@H]3[C@H]4CCC[C@@]56[C@@H](OC(C)(C)O5)[C@]1(O)[C@]46[C@H]23'
 

 You can see that the molecule was read in, is not None, and can be used to
 generate a SMILES.

 The CHEMBL1269997.sdf is attached.

 This error was previously reported in the thread JP started, titled
 Invariant violation..., dated July 6, 2011. Greg replied:

  Wow that is certainly an error I never expected to see. From the code,
  I guess the molecule has a stereocenter that is surrounded by other
  stereocenters and something extremely unfortunate is happening with
  the way decisions are being made about which bonds to wedge. As Eddie
  requested in an earlier message, it would be helpful to have the input
  that produced the error so that it can be added to the test cases (and
  so that I can be sure the problem is fixed once I figure out how to).

 but I see no posting of a failing structure. I hope the attached structure
 helps resolve this problem.



Andrew
da...@dalkescientific.com







[Rdkit-discuss] Strange SMILES behaviour

2012-03-12 Thread George Papadatos
Hello all,

Could anyone please explain this:

In [21]: Chem.CanonSmiles('C1=CC=C2C(=C1)NC=S2')
Out[21]: 'c1[nH]c2ccccc2s1'

In [22]: Chem.MolFromSmiles(Out[21])
[16:47:14] Can't kekulize mol

In other words, how is it possible that a valid RDKit SMILES output fails
to be converted to molecule again?
I'm sure this has to do with aromaticity and kekulization for benzothiazole
but still


Many thanks in advance,
George
--
Try before you buy = See our experts in action!
The most comprehensive online learning library for Microsoft developers
is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
Metro Style Apps, more. Free future releases when you subscribe now!
http://p.sf.net/sfu/learndevnow-dev2___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Strange SMILES behaviour

2012-03-12 Thread George Papadatos
Thanks for the prompt reply Greg; this is what I suspected too!

Regards,

George

On 12 March 2012 17:22, Greg Landrum greg.land...@gmail.com wrote:

 Hi George,

 On Mon, Mar 12, 2012 at 5:58 PM, George Papadatos gpapada...@gmail.com
 wrote:
 
  Could anyone please explain this:
 
  In [21]: Chem.CanonSmiles('C1=CC=C2C(=C1)NC=S2')
  Out[21]: 'c1[nH]c2ccccc2s1'
 
  In [22]: Chem.MolFromSmiles(Out[21])
  [16:47:14] Can't kekulize mol
 
  In other words, how is it possible that a valid RDKit SMILES output
  fails to be converted to molecule again?

 I'm sure the general answer isn't a surprise: it's a bug

 It may actually be more than one bug.

 The SMILES 'C1=CC=C2C(=C1)NC=S2' probably shouldn't produce a legal
 molecule. It certainly shouldn't produce one with an aromatic ring.
 That's not really a valid/reasonable resonance structure for
 benzothiazole. This would be ok: S1C=NC2=CC=CC=C12 o

 The output smiles:  'c1[nH]c2ccccc2s1' is also not a reasonable
 molecule, which the RDKit recognizes when it tries to read it back in.

 I'm going to have to think about where the right place to fix this is.

 -greg



Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release

2011-10-16 Thread George Papadatos
Hi James,

This looks like there are DLLs missing from the lib folder. The easiest
solution would be to copy all the files from the lib folder of the previous
(working) RDKit version and paste them into the lib folder of the current one
(without overwriting existing files).

Regards,

George


On 16 October 2011 15:56, James Davidson j.david...@vernalis.com wrote:

 Hi Greg,

 I probably should have picked this up in the beta (but didn't...)  When I
 try to import AllChem, I see the following:

 >>> from rdkit import Chem
 >>> from rdkit.Chem import AllChem

 Traceback (most recent call last):
   File "<pyshell#6>", line 1, in <module>
     from rdkit.Chem import AllChem
   File "C:\Python27\RDKit_2011_09_1\rdkit\Chem\AllChem.py", line 28, in <module>
     from rdkit.Chem.rdSLNParse import *
 ImportError: DLL load failed: The specified module could not be found.

 Any advice?

 Kind regards

 James

 __
 PLEASE READ: This email is confidential and may be privileged. It is
 intended for the named addressee(s) only and access to it by anyone else is
 unauthorised. If you are not an addressee, any disclosure or copying of the
 contents of this email or any action taken (or not taken) in reliance on it
 is unauthorised and may be unlawful. If you have received this email in
 error, please notify the sender or postmas...@vernalis.com. Email is not a
 secure method of communication and the Company cannot accept responsibility
 for the accuracy or completeness of this message or any attachment(s).
 Please check this email for virus infection for which the Company accepts no
 responsibility. If verification of this email is sought then please request
 a hard copy. Unless otherwise stated, any views or opinions presented are
 solely those of the author and do not represent those of the Company.

 The Vernalis Group of Companies
 Oakdene Court
 613 Reading Road
 Winnersh, Berkshire
 RG41 5UA.
 Tel: +44 118 977 3133

 To access trading company registration and address details, please go to
 the Vernalis website at www.vernalis.com and click on the Company address
 and registration details link at the bottom of the page..
 __


 --
 All the data continuously generated in your IT infrastructure contains a
 definitive record of customers, application performance, security
 threats, fraudulent activity and more. Splunk takes this data and makes
 sense of it. Business sense. IT sense. Common sense.
 http://p.sf.net/sfu/splunk-d2d-oct
 ___
 Rdkit-discuss mailing list
 Rdkit-discuss@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/rdkit-discuss




Re: [Rdkit-discuss] 2011.09 (Q3 2011) RDKit release

2011-10-16 Thread George Papadatos
Hi Greg,

This should work - this is how I solved a similar problem with the latest
RDKit version for Windows.

Cheers,

George

On 16 October 2011 16:32, Greg Landrum greg.land...@gmail.com wrote:

 I'm traveling and don't have access to the machine where I normally do
 windows builds, but I tried to create an alternate binary using dlls
 from an older RDKit distribution.

 Please give this a try:

 http://code.google.com/p/rdkit/downloads/detail?name=RDKit_2011_09_1.win32.py27.pkg2.zip
 and let me know if it works. If so I will go ahead and replace the
 current binaries with this one.

 Sorry for the hassle and thanks for the help,
 -greg





Re: [Rdkit-discuss] Partial/Rooted/Anchored Morgan Fingerprint

2011-10-06 Thread George Papadatos
Hi Greg,

That's great, thanks a lot for your help.

Regards,

George

On 30 September 2011 07:01, Greg Landrum greg.land...@gmail.com wrote:

 Hi George,

 On Thu, Sep 29, 2011 at 1:11 PM, George Papadatos gpapada...@gmail.com
 wrote:
  I'd like to calculate the *rooted* Morgan fingerprint for a set of
  molecules. By rooted I mean the subset of the whole-molecule fingerprint
  which contains just the bits which correspond to circular atom layers (up
  to N bond lengths) that include a specific atom.
  So let's say that there is a single Uranium atom in each molecule. What I
  want to calculate is the subset of the Morgan fingerprint (let's say with
  a radius of 3) which contains the bits set on by layers including my U
  atom.
  This should include not only the bits where U was the root of the layer,
  but also the bits where U was in the layer of neighboring atoms, up to 3
  bonds away.

 A minor point: I wouldn't call this the rooted fingerprint since it
 includes bits that are set by layers that are not centered at your U
 atom.

  After checking the super-helpful "Getting Started with the RDKit in
  Python" (Q2 2011) tutorial, section 5.4.1, I can see one way of doing this:
  calculating the Morgan fp and then enumerating all the sub-molecules (or
  layers) that set the corresponding bits on and then checking if U is in
  any one of these submolecules. If it is then the corresponding bit is part
  of the root Morgan fp.
  Is there any other more efficient way???

 If you only want the bits that are set by a particular atom (i.e.
 those that are centered at that atom), you can use the fromAtoms
 argument:
 >>> from rdkit import Chem
 >>> from rdkit.Chem import rdMolDescriptors
 >>> m1 = Chem.MolFromSmiles('Cc1ccccc1')
 >>> m2 = Chem.MolFromSmiles('Cc1c(C)cccc1')
 >>> rdMolDescriptors.GetMorganFingerprint(m1,1,fromAtoms=[0]).GetNonzeroElements()
 {2246728737: 1, 422715066: 1}
 >>> rdMolDescriptors.GetMorganFingerprint(m1,2,fromAtoms=[0]).GetNonzeroElements()
 {2246728737: 1, 422715066: 1, 2218109011: 1}
 >>> rdMolDescriptors.GetMorganFingerprint(m2,1,fromAtoms=[0]).GetNonzeroElements()
 {2246728737: 1, 422715066: 1}
 >>> rdMolDescriptors.GetMorganFingerprint(m2,2,fromAtoms=[0]).GetNonzeroElements()
 {2246728737: 1, 422715066: 1, 2368203427: 1}

 Note that I just fixed a bug that was leading to missing bits in the
 morgan fingerprints generated with a fromAtoms argument.

 If you want all bits that the atom is involved in, I would suggest
 using the fromAtoms argument, but also including all the atoms that
 are within the appropriate radius of your atom. You can find these
 atoms using the molecule's distance matrix:
 >>> m1 = Chem.MolFromSmiles('Cc1ccccc1')
 >>> dm = Chem.GetDistanceMatrix(m1)
 >>> dm
 array([[ 0.,  1.,  2.,  3.,  4.,  3.,  2.],
        [ 1.,  0.,  1.,  2.,  3.,  2.,  1.],
        [ 2.,  1.,  0.,  1.,  2.,  3.,  2.],
        [ 3.,  2.,  1.,  0.,  1.,  2.,  3.],
        [ 4.,  3.,  2.,  1.,  0.,  1.,  2.],
        [ 3.,  2.,  3.,  2.,  1.,  0.,  1.],
        [ 2.,  1.,  2.,  3.,  2.,  1.,  0.]])
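
As a sketch of that last step, the atoms within a given number of bonds of a chosen atom can be read straight off one row of the distance matrix. The example copies the matrix printed above into a plain nested list so that it runs without the RDKit; atoms_within_radius is a hypothetical helper name, not an RDKit function. With the RDKit available, the resulting indices would be the list to pass as the fromAtoms argument.

```python
# Topological distance matrix for the 7-atom molecule shown above,
# copied as a plain list of lists (no RDKit or NumPy needed here).
dm = [
    [0, 1, 2, 3, 4, 3, 2],
    [1, 0, 1, 2, 3, 2, 1],
    [2, 1, 0, 1, 2, 3, 2],
    [3, 2, 1, 0, 1, 2, 3],
    [4, 3, 2, 1, 0, 1, 2],
    [3, 2, 3, 2, 1, 0, 1],
    [2, 1, 2, 3, 2, 1, 0],
]


def atoms_within_radius(dist_matrix, atom_idx, radius):
    """Indices of atoms at most `radius` bonds from atom_idx (itself included)."""
    return [j for j, d in enumerate(dist_matrix[atom_idx]) if d <= radius]


# Atoms within 1 and within 2 bonds of atom 0.
print(atoms_within_radius(dm, 0, 1))
print(atoms_within_radius(dm, 0, 2))
```

The same helper works on the array returned by Chem.GetDistanceMatrix, since it only indexes a row and compares values.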


 I hope this helps,
 -greg

--
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] label properties on 2D depiction

2011-07-12 Thread George Papadatos
Hi all,

On a related topic, is it possible to depict an arbitrary string on a cairo
canvas? I am thinking particularly depicting the name or ID of a molecule
below its structure.

Regards,

George

On 12 July 2011 06:36, Peter Schmidtke pschmid...@ub.edu wrote:

 Thanks Greg,

 I'll give it a try ;)

 ++

 Peter

 On 07/11/2011 06:47 PM, Greg Landrum wrote:
  Hi Peter,
 
  On Mon, Jul 11, 2011 at 1:35 PM, Peter Schmidtke pschmid...@ub.edu
 wrote:
 
  I wondered if it was possible and easy to show some numerical properties
  or strings or whatever for each atom on a 2d representation of a
 molecule.
 
  Is something like that implemented (didn't really see it right now)?
 
  There's nothing like that built in, but pretty much everything you
  need to be able to annotate the drawing yourself after the molecule
  has been drawn is already there.
 
  Take a look at the code in $RDBASE/rdkit/Chem/Draw/__init__.py:MolToImage
  After line 70 executes, you have a canvas (either cairo, aggdraw, or
  sping, depending on which system you have installed) that contains the
  molecule drawing as well as MolDrawing instance named drawer. drawer
  has a data element atomPs that can be used to get the position of
  atoms in canvas coordinates : drawer.atomPs[mol][atomIdx]. The code
  for the individual canvases shows how to do something with these
  coordinates.
 
   -greg
 


 --

 Peter Schmidtke
 PhD Student
 Dept. Physical Chemistry
 Faculty of Pharmacy
 University of Barcelona



 --
 All of the data generated in your IT infrastructure is seriously valuable.
 Why? It contains a definitive record of application performance, security
 threats, fraudulent activity, and more. Splunk takes this data and makes
 sense of it. IT sense. And common sense.
 http://p.sf.net/sfu/splunk-d2d-c2
 ___
 Rdkit-discuss mailing list
 Rdkit-discuss@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

--
AppSumo Presents a FREE Video for the SourceForge Community by Eric 
Ries, the creator of the Lean Startup Methodology on Lean Startup 
Secrets Revealed. This video shows you how to validate your ideas, 
optimize your ideas and identify your business strategy.
http://p.sf.net/sfu/appsumosfdev2dev___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] random forest in RDKit

2011-05-02 Thread George Papadatos
Hi guys,

I'd also be interested in some ML examples.

Regards,

George


On 2 May 2011 20:52, Igor Filippov igor.v.filip...@gmail.com wrote:

 Hi Greg,

 Yes, actually for this project I'm interested in Python specifically!
 Time to learn me some new tricks :)
 I was looking through the docs online but I cannot figure it out :(

 Best regards,
 Igor

 On Mon, 2011-05-02 at 21:45 +0200, Greg Landrum wrote:
  Hi Igor,
 
  On Mon, May 2, 2011 at 9:08 PM, Igor Filippov igor.v.filip...@gmail.com
 wrote:
  
   Can anybody point me in the right direction (some simple code snippets
   would be best) how to use machine learning methods in RDkit? I am
   especially interested in RandomForest implementation.
  
 
  The machine learning code is mostly written in Python. I know you're
  primarily a C++ user, are you still interested?
 
  -greg




 --
 WhatsUp Gold - Download Free Network Management Software
 The most intuitive, comprehensive, and cost-effective network
 management toolset available today.  Delivers lowest initial
 acquisition cost and overall TCO of any competing solution.
 http://p.sf.net/sfu/whatsupgold-sd
 ___
 Rdkit-discuss mailing list
 Rdkit-discuss@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Python 2.7 binaries for win32

2011-04-19 Thread George Papadatos
Hello,

FYI, it seems there is an inconsistency in the RDKit binaries for Windows
Python 2.7, as Dependency Walker indicated: rdBase.pyd looks for
boost_python-vc-mt-1_44.dll in %RDBASE%/lib, whereas the actual name of the
DLL is boost_python-vc90-mt-1_44.dll.
This is probably what caused the problem for me.

Removing the '90' from the names of the 2 DLLs in the lib folder seems to do
the trick.
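
If you would rather script the workaround, a rough Python sketch follows. It assumes the affected files all contain "-vc90-" in their names, and it copies each one to the shorter name instead of renaming it, so the originals stay in place; both choices are assumptions of this sketch, and copy_without_vc90 is a made-up helper name.

```python
import os
import shutil


def copy_without_vc90(lib_dir):
    """Copy each '*-vc90-*.dll' in lib_dir to the same name with 'vc90' -> 'vc'.

    Copies instead of renaming so the original files are left untouched.
    Existing targets are never overwritten.
    Returns the list of new file names that were created.
    """
    created = []
    for name in sorted(os.listdir(lib_dir)):
        if name.lower().endswith(".dll") and "-vc90-" in name:
            new_name = name.replace("-vc90-", "-vc-")
            target = os.path.join(lib_dir, new_name)
            if not os.path.exists(target):
                shutil.copyfile(os.path.join(lib_dir, name), target)
                created.append(new_name)
    return created
```

Calling copy_without_vc90(r"C:\RDKit_2011_03_1\lib") would be the equivalent of the manual fix for the install location used in this thread.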

Regards,

George



On 18 April 2011 08:17, George Papadatos gpapada...@gmail.com wrote:

 Hi Uwe,

  Thanks for the reply. Perhaps I did not make it clear but what I meant is
 that I appended %RDBASE%\lib to the PATH variable.

 Regards,

 George

 Sent from my gPhone

 On 18 Apr 2011, at 07:49, Uwe Hoffmann chemis...@uwe-hoffmann.de wrote:

  Hi George,
  Am 17.04.2011 12:03, schrieb George Papadatos:
  So...
  I've copied the binaries folder to C:\RDKit_2011_03_1
  I've added the variables:
  RDBASE = C:\RDKit_2011_03_1
  PYTHONPATH = %RDBASE%
  PATH = %RDBASE%\lib
  This seems to be problematic because you overwrite the whole PATH
  environment variable.
 
  ImportError: DLL load failed: The specified module could not be found.
 
  regards,
 
Uwe
 
 
 
 --
  Benefiting from Server Virtualization: Beyond Initial Workload
  Consolidation -- Increasing the use of server virtualization is a top
  priority.Virtualization can reduce costs, simplify management, and
 improve
  application availability and disaster protection. Learn more about
 boosting
  the value of server virtualization. http://p.sf.net/sfu/vmware-sfdev2dev
  ___
  Rdkit-discuss mailing list
  Rdkit-discuss@lists.sourceforge.net
  https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Re: [Rdkit-discuss] Python 2.7 binaries for win32

2011-04-18 Thread George Papadatos
Hi Uwe,

 Thanks for the reply. Perhaps I did not make it clear but what I meant is that 
I appended %RDBASE%\lib to the PATH variable. 
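
To spell out the difference between appending and overwriting, here is a short Python sketch. The existing PATH value is a hypothetical example, and append_to_path is just an illustrative helper, not something shipped with the RDKit.

```python
import os


def append_to_path(current_path, new_dir, sep=os.pathsep):
    """Return current_path with new_dir appended, unless it is already listed."""
    entries = [e for e in current_path.split(sep) if e]
    if new_dir not in entries:
        entries.append(new_dir)
    return sep.join(entries)


# Windows-style example values (the existing PATH here is hypothetical).
existing = r"C:\Windows\system32;C:\Python27"
print(append_to_path(existing, r"C:\RDKit_2011_03_1\lib", sep=";"))
```

The earlier entries survive and the RDKit lib folder is added at the end, which is what appending to PATH is meant to achieve.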

Regards,

George

Sent from my gPhone

On 18 Apr 2011, at 07:49, Uwe Hoffmann chemis...@uwe-hoffmann.de wrote:

 Hi George,
 Am 17.04.2011 12:03, schrieb George Papadatos:
 So...
 I've copied the binaries folder to C:\RDKit_2011_03_1
 I've added the variables:
 RDBASE = C:\RDKit_2011_03_1
 PYTHONPATH = %RDBASE%
 PATH = %RDBASE%\lib
 This seems to be problematic because you overwrite the whole PATH 
 environment variable.
 
 ImportError: DLL load failed: The specified module could not be found.
 
 regards,
 
   Uwe
 
 



Re: [Rdkit-discuss] Python 2.7 binaries for win32

2011-04-17 Thread George Papadatos
Cheers, Greg.

George

On 17 April 2011 06:14, Greg Landrum greg.land...@gmail.com wrote:

 Dear all,

 After a couple of requests, I just uploaded a win32 build of the
 2011.03 release that supports Python 2.7 to both the google code and
 sourceforge download sites.

 Best Regards,
 -greg





Re: [Rdkit-discuss] Installation driving me mad (RDKit on Centos 5.4 final)

2011-02-23 Thread George Papadatos
Fair enough, I did not know that! However, according to the same
documentation, these packages are highly recommended for NumPy and required
for SciPy:
http://scipy.org/Installing_SciPy/Linux#head-9cf6f4b7fe9ba63fc228203c4f28554a74970847

In any case, here is a repository for CentOS 5/RHEL 5 with the necessary rpms
(for those who can't access yum):
http://download.opensuse.org/repositories/home:/ashigabou/

After that, Kirk's walkthrough has been most helpful.

George


On 23 February 2011 11:12, Greg Landrum greg.land...@gmail.com wrote:

 Let me elaborate on that... from the numpy installation page
 (http://docs.scipy.org/doc/numpy/user/install.html):
 NumPy does not require any external linear algebra libraries to be
 installed. However, if these are available, NumPy’s setup script can
 detect them and use them for building. A number of different LAPACK
 library setups can be used, including optimized LAPACK libraries such
 as ATLAS, MKL or the Accelerate/vecLib framework on OS X.

 Best,
 -greg




 On Wed, Feb 23, 2011 at 12:10 PM, Greg Landrum greg.land...@gmail.com
 wrote:
  I'm not convinced of that. I'm pretty sure that I have built numpy on
  redhat and ubuntu systems without ever installing lapack.
 
  -greg
 
 
  On Wed, Feb 23, 2011 at 12:06 PM, George Papadatos gpapada...@gmail.com
 wrote:
  ...yet you need them to build Numpy...
  George
 
  On 23 February 2011 11:03, Greg Landrum greg.land...@gmail.com wrote:
 
  To be very clear: you do not need *any* of these packages to install
 the
  RDKit.
 
  -greg
 
 
  On Wed, Feb 23, 2011 at 10:53 AM, JP jeanpaul.ebe...@inhibox.com
 wrote:
   Great wiki - I wonder how I missed that.
   But the first instruction
   sudo yum install atlas, atlas-devel, blas blas-devel lapack
 lapack-devel
  
   Gives me the following error:
   No package atlas, available.
   No package atlas-devel, available.
   No package blas available.
   No package lapack available.
   Is there a repos I have to add to /etc/yum.repos.d/ ?
  
  
   On 22 February 2011 18:41, Robert DeLisle rkdeli...@gmail.com
 wrote:
  
   What are your environment settings?  You should have at minimum,
 these:
  
   $RDBASE = the directory where you have installed the RDKit code
  
  
   $LD_LIBRARY_PATH = /usr/local/lib:/$RDBASE/lib
  
   $PYTHONPATH = $RDBASE
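
A quick way to check these three settings from Python is sketched below. It only verifies that RDBASE is set and that LD_LIBRARY_PATH and PYTHONPATH mention the expected directories; check_rdkit_env is a hypothetical helper name and the checks are deliberately simple (no attempt to confirm the directories exist).

```python
import os


def _entries(value):
    """Split a path-list variable into normalised, non-empty entries."""
    return [os.path.normpath(p) for p in value.split(os.pathsep) if p]


def check_rdkit_env(env):
    """Return a list of human-readable problems with the settings listed above."""
    rdbase = env.get("RDBASE")
    if not rdbase:
        return ["RDBASE is not set"]
    problems = []
    lib_dir = os.path.normpath(os.path.join(rdbase, "lib"))
    if lib_dir not in _entries(env.get("LD_LIBRARY_PATH", "")):
        problems.append("LD_LIBRARY_PATH does not contain " + lib_dir)
    if os.path.normpath(rdbase) not in _entries(env.get("PYTHONPATH", "")):
        problems.append("PYTHONPATH does not contain " + rdbase)
    return problems


if __name__ == "__main__":
    # An empty list means the three variables look consistent.
    print(check_rdkit_env(os.environ))
```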
  
  
   At least this worked for me for a CentOS installation, detailed here
 -
   http://code.google.com/p/rdkit/wiki/BuildingOnCentOS
  
  
  
   Another possibility is your PATH variable.  Make sure that
 /usr/local
   pathnames precede any /usr options.
   This will ensure looking into /usr/local first.
  
   There also may be options for cmake that will force it into the
 correct
   directory.  I've found in the past that even though
  
  
   it says in the initial output that is looking in the correct
 location
   for
   boost and python, it doesn't necessarily follow its
   own advice.
  
   -Kirk
  
  
  
  
   On Tue, Feb 22, 2011 at 9:44 AM, JP jeanpaul.ebe...@inhibox.com
   wrote:
  
   I ended up not using yum to install Numpy - I installed it from
   source,
   which was only slightly painful.
import platform; print platform.python_version()
   # /usr/local/lib/python2.7/platform.pyc matches
   /usr/local/lib/python2.7/platform.py
   import platform # precompiled from
   /usr/local/lib/python2.7/platform.pyc
   2.7.0
import numpy as N
a=N.random.randn(10, 10)
   
   In /usr/lib64/ I can find some libpython2.4.so
 , libpython2.4.so.1.0
   What should I do?
  
   On 22 February 2011 16:23, rkdeli...@gmail.com wrote:
  
   Are you sure that your NumPy installation is going to the correct
   Python
   instance? I see from the logs that you have Python 2.7 installed,
 or
   at
   least that is what cmake is finding at /usr/local/lib. You use yum
 to
   install NumPy, but the standard installation of Python on CentOS
 5.x
   is 2.4
   and it is located in /usr/lib. Which version of Python has NumPy?
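
One way to answer that question is to ask each interpreter directly. The sketch below is written for a current Python 3 interpreter (not the 2.4/2.7 installs discussed in this thread) and uses only the standard library; describe_interpreter is a hypothetical helper name. Run it once with each python binary on the machine and compare the output.

```python
import importlib.util
import sys


def describe_interpreter(module_name="numpy"):
    """Report which interpreter is running and where it would import module_name from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "module": module_name,
        # None means this interpreter cannot find the module at all.
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key, "=", value)
```

A location of None for one interpreter and a site-packages path for another would point to the mismatch Kirk describes.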
  
  
   -Kirk
  
  
  
  
  
  
   On Feb 22, 2011 9:14am, JP jeanpaul.ebe...@inhibox.com wrote:
 I've installed Atlas, Numpy, Boost and everything works fine until I
 try:
   
   
   
cmake .. -DBOOST_ROOT=/share/apps/boost_1_45_0/
sudo make VERBOSE=1
   
   
   
At which point everything fails as follows::
   
   
   
[  3%] Building CXX object
Code/RDBoost/CMakeFiles/RDBoost.dir/Wrap.cpp.o
   
cd /share/apps/RDKit_2010_12_1/build/Code/RDBoost 
 /usr/bin/c++
-DRDBoost_EXPORTS -O3 -DNDEBUG -fPIC
 -I/usr/local/include/python2.7
-I/usr/local/lib/python2.7/site-packages/numpy/core/include
-I/share/apps/boost_1_45_0/include
-I/share/apps/RDKit_2010_12_1/Code
-Wno-deprecated -Wno-unused-function -fno-strict-aliasing -fPIC
 -o
CMakeFiles/RDBoost.dir/Wrap.cpp.o -c
/share/apps/RDKit_2010_12_1/Code/RDBoost/Wrap.cpp
   
Linking CXX

[Rdkit-discuss] KNIME + Java RDKit library

2010-11-19 Thread George Papadatos
Hi guys,

I installed the RDKit nodes for KNIME (by copying the plugins folder
manually, as I too had problems with the 'update from file' feature).
Inspired by the source code that was bundled with the nodes, I tried to use
the RDKit libraries in KNIME/Eclipse in order to develop my own nodes based
on the RDKit toolkit.

For example:

import org.RDKit.RDKFuncs;
import org.RDKit.ROMol;

public class RDKitTest {
    public static void main(String[] args) throws Exception {
        ROMol mol = null;
        String smi = "c1ccccc1N";
        mol = RDKFuncs.MolFromSmiles(smi);
        System.out.println(mol.getNumAtoms());
    }
}


However, this script throws the following runtime error:

Exception in thread "main" java.lang.UnsatisfiedLinkError:
org.RDKit.RDKFuncsJNI.MolFromSmiles(Ljava/lang/String;)J
    at org.RDKit.RDKFuncsJNI.MolFromSmiles(Native Method)
    at org.RDKit.RDKFuncs.MolFromSmiles(RDKFuncs.java:65)


In the Eclipse lib folder, I included all the .jar files and the
RDKFuncs.dll.

Any ideas???


Regards,

George Papadatos
--
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2  L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today
http://p.sf.net/sfu/msIE9-sfdev2dev___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Antwort: Installation fails for KNIME nodes

2010-11-19 Thread George Papadatos
Hi Paul,

No worries! :)


Regards,

George


[Rdkit-discuss] KNIME + Java RDKit library problem

2010-11-19 Thread George Papadatos
Hi guys,

I installed the RDKit nodes for KNIME (by copying the plugins folder
manually, as I too had problems with the 'update from file' feature).
Inspired by the source code that was bundled with the nodes, I tried to use
the RDKit libraries in KNIME/Eclipse in order to develop my own nodes based
on the RDKit toolkit.

For example:

import org.RDKit.RDKFuncs;
import org.RDKit.ROMol;

public class RDKitTest {
    public static void main(String[] args) throws Exception {
        ROMol mol = null;
        String smi = "c1c1N";
        mol = RDKFuncs.MolFromSmiles(smi);
        System.out.println(mol.getNumAtoms());
    }
}


However, this script throws the following runtime error:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.RDKit.RDKFuncsJNI.MolFromSmiles(Ljava/lang/String;)J
    at org.RDKit.RDKFuncsJNI.MolFromSmiles(Native Method)
    at org.RDKit.RDKFuncs.MolFromSmiles(RDKFuncs.java:65)


In the Eclipse lib folder, I included all the .jar files and the
RDKFuncs.dll.

Any ideas???


Regards,

George Papadatos


Re: [Rdkit-discuss] KNIME + Java RDKit library problem

2010-11-17 Thread George Papadatos
Hi Thorsten and Greg,

Many thanks for your replies.

On 16 November 2010 20:16, Thorsten Meinl thorsten.me...@uni-konstanz.de wrote:

> Hi George,
>
>> I installed the RDKit nodes for KNIME (by copying the plugins folder
>> manually, as I too had problems with the 'update from file' feature).
>> Inspired by the source code that was bundled with the nodes,
> Did you have the same problems as Paul, i.e. KNIME complaining about
> some osgi.bundles not being found?


This is the error I get when I use the local update site:
Cannot complete the request.  See the details.
Unsatisfied dependency: [org.rdkit.knime.source.feature.feature.group
0.9.0.0027626] requiredCapability:
org.eclipse.equinox.p2.iu/org.rdkit.knime.bin.macosx.x86_64/[0.9.0.0027589,0.9.0.0027589]
Unsatisfied dependency: [org.rdkit.knime.source.feature.feature.group
0.9.0.0027626] requiredCapability:
org.eclipse.equinox.p2.iu/org.rdkit.knime.bin.linux.x86/[0.9.0.0027561,0.9.0.0027561]
Unsatisfied dependency: [org.rdkit.knime.source.feature.feature.group
0.9.0.0027626] requiredCapability:
org.eclipse.equinox.p2.iu/org.rdkit.knime.bin.linux.x86_64/[1.0.0.0027615,1.0.0.0027615]


>> However, this script throws the following runtime error:
>> Exception in thread "main" java.lang.UnsatisfiedLinkError:
>> org.RDKit.RDKFuncsJNI.MolFromSmiles(Ljava/lang/String;)J
>> at org.RDKit.RDKFuncsJNI.MolFromSmiles(Native Method)
>> at org.RDKit.RDKFuncs.MolFromSmiles(RDKFuncs.java:65)
>>
>> In the Eclipse lib folder, I included all the .jar files and the
>> RDKFuncs.dll.
>> Any ideas???
> In order to use code from native libraries, Java needs to be told where to
> look for them. This is usually done by defining -Djava.library.path
> appropriately. If the application consists of Eclipse plugins (i.e. not
> just a bunch of jars), then there is some magic that loads the native
> libraries without needing to specify them explicitly. This is what happens
> with the KNIME plugins. So you either need to set the Java property or
> put your code in a plugin that depends on org.rdkit.knime.types, and
> run an Eclipse application.

Thanks to your tip and my colleague Nico Fechner, it is working now.
For those with the same problem, you need this line at the beginning of your
code: System.load("Path//to//the//dll//RDKFuncs.dll");
Alternatively, as you suggested, you can set the VM arguments
appropriately in Eclipse, i.e. -Djava.library.path=Path//to//the//dll// and
then add this line in the code: System.loadLibrary("RDKFuncs");
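
In case it helps anyone debugging the same UnsatisfiedLinkError, here is a
minimal sketch of how one might check where Java is looking before calling
into RDKit. The class and method names (NativeLoader, findOnLibraryPath,
tryLoad) are made up for illustration and are not part of RDKit or KNIME; the
only real calls are System.loadLibrary, System.mapLibraryName and the
java.library.path system property.

```java
import java.io.File;

// Hypothetical helper, not part of RDKit: reports whether a native library
// can be found on java.library.path and whether loading it succeeds.
class NativeLoader {

    // Returns the first matching file on java.library.path, or null if none.
    static File findOnLibraryPath(String libName) {
        // Platform-specific file name, e.g. "RDKFuncs.dll" on Windows.
        String fileName = System.mapLibraryName(libName);
        String searchPath = System.getProperty("java.library.path", "");
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    // Tries System.loadLibrary and returns false instead of throwing.
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load " + libName + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // "RDKFuncs" is the library name from this thread; pass another name to test.
        String lib = args.length > 0 ? args[0] : "RDKFuncs";
        System.out.println("Expected file name: " + System.mapLibraryName(lib));
        System.out.println("Found on java.library.path: " + findOnLibraryPath(lib));
        System.out.println("Loaded: " + tryLoad(lib));
    }
}
```

Run it with the same VM arguments as your own class, e.g.
java -Djava.library.path=Path//to//the//dll// NativeLoader
If the second line prints null, the path setting is the problem and not the
RDKit jars.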

Thanks again,

George