Re: [Rdkit-discuss] ModuleNotFoundError: No module named 'rdkit'

2021-04-14 Thread Christos Kannas
Hi Andres,

Maybe Spyder runs on the base conda environment.
Do you run Spyder from your activated environment?
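
One way to check this is to ask the interpreter itself where it lives. The snippet below is only a sketch (the helper name describe_interpreter is made up for illustration): run it once in the Spyder console and once in the ipython session that works, then compare the paths.

```python
import importlib.util
import sys


def describe_interpreter(module_name="rdkit"):
    """Report which Python is running and whether it can see a module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # root of the active environment
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the Spyder console reports a prefix of C:\Anaconda rather than C:\Anaconda\envs\foodpains, Spyder is using the base environment and not the one where rdkit is installed.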

Kind regards,

Christos

On Wed, Apr 14, 2021, 17:52 Andrés Sánchez Ruiz <
andressanchezrui...@gmail.com> wrote:

> Dear Greg,
>
> It works! It seems I can call functions of RDKit from this console,
> however, when I start spyder and try to run them I still get the
> error. Could it be something related to the spyder interpreter?
>
> Best regards,
>
> Andrés
>
> On Wed, 14 Apr 2021 at 17:38, Greg Landrum wrote:
> >
> > That looks good so far.
> > So what happens in that exact same shell if you then start ipython
> > and do "import rdkit"?
> >
> > -greg
> >
> >
> > On Wed, Apr 14, 2021 at 5:33 PM Andrés Sánchez Ruiz <
> andressanchezrui...@gmail.com> wrote:
> >>
> >> Dear Greg,
> >>
> >> After activating my environment (foodpains) I ran the command
> >> ipython -c 'import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)'
> >> and, right after getting the output, ran "where ipython". This is what I get:
> >>
> >> (foodpains) C:\Users\Andres Sanchez>ipython -c "import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)"
> >> C:\Anaconda\envs\foodpains\lib\site-packages\IPython\__init__.py
> >> C:\Anaconda\envs\foodpains\lib\site-packages\rdkit\__init__.py
> >>
> >> (foodpains) C:\Users\Andres Sanchez>where ipython
> >> C:\Anaconda\envs\foodpains\Scripts\ipython.exe
> >> C:\Anaconda\Scripts\ipython.exe
> >>
> >> Best regards,
> >>
> >> Andrés
> >>
> >> On Wed, 14 Apr 2021 at 17:06, Greg Landrum wrote:
> >> >
> >> > That looks good. Please send the output of:
> >> > ipython -c 'import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)'
> >> >
> >> > and we also need to figure out exactly which version of ipython you
> >> > are running.
> >> >
> >> > If you are running these commands in the command shell, that's:
> >> > where ipython
> >> >
> >> > in powershell:
> >> > gcm ipython
> >> >
> >> > if you're using a bash shell:
> >> > which ipython
> >> >
> >> > Please run the ipython -c and which/where/gcm commands directly after
> >> > each other and paste in both the command you executed and its output.
> >> >
> >> > -greg
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Apr 14, 2021 at 4:46 PM Andrés Sánchez Ruiz <
> andressanchezrui...@gmail.com> wrote:
> >> >>
> >> >> Dear Greg,
> >> >>
> >> >> This is what I see after activating my environment (foodpains) and
> >> >> running your command:
> >> >>
> >> >> C:\Anaconda\envs\foodpains\lib\site-packages\IPython\__init__.py
> >> >> C:\Anaconda\envs\foodpains\lib\site-packages\rdkit\__init__.py
> >> >>
> >> >> Best regards,
> >> >>
> >> >> Andrés
> >> >>
> >> >> On Wed, 14 Apr 2021 at 15:42, Greg Landrum wrote:
> >> >> >
> >> >> > What do you see when you execute this quick test to ensure that
> >> >> > ipython and the rdkit are both really installed?
> >> >> >
> >> >> > python -c 'import IPython;import rdkit;print(IPython.__file__,rdkit.__file__)'
> >> >> >
> >> >> > -greg
> >> >> >
> >> >> > On Wed, Apr 14, 2021 at 2:58 PM Andrés Sánchez Ruiz <
> andressanchezrui...@gmail.com> wrote:
> >> >> >>
> >> >> >> Hello,
> >> >> >>
> >> >> >> I have not been able to solve the issue yet, even after installing
> >> >> >> ipython in the same environment in which I have RDKit.
> >> >> >>
> >> >> >> ipython            7.22.0     py39hd4e2768_0
> >> >> >> ipython_genutils   0.2.0      pyhd3eb1b0_1
> >> >> >> .
> >> >> >> .
> >> >> >> .
> >> >> >> rdkit              2021.03.1  py39hfadf033_0  conda-forge
> >> >> >>
> >> >> >> From this environment I can import pandas (for example) but not
> >> >> >> RDKit. What is still not working?
> >> >> >>
> >> >> >> Best regards,
> >> >> >>
> >> >> >> Andrés
> >> >> >>
> >> >> >>
> >> >> >> ___
> >> >> >> Rdkit-discuss mailing list
> >> >> >> Rdkit-discuss@lists.sourceforge.net
> >> >> >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] HasSubstructMatch & GetSubstructMatches hang when useChirality is True

2021-03-29 Thread Christos Kannas
Thanks a lot Paolo.

Christos

Christos Kannas

Research Software Engineer (Cheminformatics)



On Fri, 26 Mar 2021 at 16:01, Paolo Tosco 
wrote:

> Hi Christos,
>
> this is a possible workaround that will address your current problem:
>
> https://gist.github.com/ptosco/863cb55ace485c6664c21c244b2ca10a
>
> A better solution would be to implement a callback or timeout in the C++
> layer, as is done for MCS and other potentially time-consuming
> operations.
>
> Cheers,
> p.
>
> On Fri, Mar 26, 2021 at 1:15 PM Christos Kannas 
> wrote:
>
>> Hi,
>>
>> Long story short, I'm using rdChiral to extract reaction templates from
>> Reaction SMILES.
>>
>> I've found an issue with substructure matching: when the molecule and the
>> pattern are both large and have lots of chiral centers, and
>> HasSubstructMatch or GetSubstructMatches is called with useChirality set
>> to True, the process hangs.
>>
>> Here is a notebook showing the issue
>> https://nbviewer.jupyter.org/gist/CKannas/d54bb5ab0fa3c964086c75f18250ddac
>>
>> Is there any workaround for this?
>> Looking for a solution to stop the computation in a graceful manner.
>>
>> Thanks,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Research Software Engineer (Cheminformatics)
>>


[Rdkit-discuss] HasSubstructMatch & GetSubstructMatches hang when useChirality is True

2021-03-26 Thread Christos Kannas
Hi,

Long story short, I'm using rdChiral to extract reaction templates from
Reaction SMILES.

I've found an issue with substructure matching: when the molecule and the
pattern are both large and have lots of chiral centers, and HasSubstructMatch
or GetSubstructMatches is called with useChirality set to True, the process hangs.

Here is a notebook showing the issue
https://nbviewer.jupyter.org/gist/CKannas/d54bb5ab0fa3c964086c75f18250ddac

Is there any workaround for this?
Looking for a solution to stop the computation in a graceful manner.
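
One generic way to stop a runaway computation, independent of RDKit, is to run it in a child process and kill that process after a timeout. The sketch below only illustrates that idea; run_python_with_timeout is a hypothetical helper, not an RDKit function.

```python
import subprocess
import sys


def run_python_with_timeout(code, timeout_seconds):
    """Run a Python snippet in a child process; return its stdout, or None on timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
            check=True,               # raise if the child exits with an error
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout.strip()
```

For the substructure case, the snippet passed in would build the molecule and the pattern and print the result of HasSubstructMatch with useChirality=True; a return value of None then means the match was abandoned after the timeout. Starting a new interpreter per call is slow, so this only makes sense for the few molecules that are known to be problematic.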

Thanks,

Christos

Christos Kannas

Research Software Engineer (Cheminformatics)



Re: [Rdkit-discuss] conda install rdkit

2020-10-16 Thread Christos Kannas
Hi Ling,

Maybe you should switch to the conda-forge channel: replace "-c rdkit" with
"-c conda-forge".
At least that's what I'm using personally, and I have had no problems so far.
The latest version of rdkit there is 2020.03.6.
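
To confirm which release an environment actually ended up with, something like the following can be run inside it. This is a rough sketch: parse_release and installed_rdkit_version are names made up here, and the comparison assumes the year-based release numbering seen in this thread.

```python
import re


def parse_release(version):
    """Turn a release string such as '2020.03.6' into a tuple of ints."""
    return tuple(int(number) for number in re.findall(r"\d+", version))


def installed_rdkit_version():
    """Return the version of the importable rdkit package, or None if absent."""
    try:
        import rdkit
    except ImportError:
        return None
    return rdkit.__version__


if __name__ == "__main__":
    found = installed_rdkit_version()
    print("rdkit version:", found)
    if found is not None and parse_release(found) < parse_release("2020.03.6"):
        print("older than 2020.03.6; check which channel the package came from")
```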

Best,

Christos

Christos Kannas

Scientific Software Developer (Cheminformatics)


On Fri, 16 Oct 2020 at 21:00, Ling Chan  wrote:

> Thank you Drew for your suggestion. I tried it, but it did not help.
>
> I also did a "conda clean -a" on top of that. Still, when I do
> conda create -c rdkit -n rdkenv rdkit
> it stubbornly points to
> rdkit  rdkit/linux-64::rdkit-2018.03.2.0-py36h6bb024c_1
>
> and as I wrote before, I don't have a ".condarc" file in my home
> directory. Moreover, there is no "pinned" file found in any of my
> conda-meta directories. And even "conda update rdkit --no-pin" does not
> update my rdkit.
>
> I have no idea where it picks up that 2018 rdkit version and sticks to it.
>
> Ling
>
> On Mon, 12 Oct 2020 at 04:35, Drew Gibson wrote:
>
>> Hi,
>>
>> I had a similar issue in the past.  Try updating conda...
>>
>> conda update conda
>>
>> then try creating your RDKit environment again.
>>
>> Drew
>>
>> On Sat, 10 Oct 2020 at 23:47, Ling Chan  wrote:
>>
>>> Dear colleagues,
>>>
>>> I am trying to install RDKit using conda. According to the manual at
>>> https://www.rdkit.org/docs/Install.html#how-to-install-rdkit-with-conda
>>>
>>> it is very simple. I just need to do
>>>
>>> conda create -c rdkit -n my-rdkit-env rdkit
>>>
>>> It used to work. Now when I try it again, things do not work.
>>> When I investigated, it turned out that the 2018.03.2.0 version of
>>> rdkit was installed instead of the most current one. It seems to me that I
>>> have screwed up my conda setup. What have I screwed up, and how can
>>> I repair it?
>>>
>>> One hint may be in the output of conda create: the line
>>> for rdkit looks different from the other lines, as shown below.
>>> Unfortunately I still could not figure it out.
>>>
>>> Thank you for your insight.
>>>
>>> Ling
>>>
>>>
>>> =
>>>
>>> > conda create -c rdkit -n tempenv rdkit
>>>
>>> The following NEW packages will be INSTALLED:
>>>
>>>   _libgcc_mutex    pkgs/main/linux-64::_libgcc_mutex-0.1-main
>>>   blas             pkgs/main/linux-64::blas-1.0-mkl
>>>   bzip2            pkgs/main/linux-64::bzip2-1.0.8-h7b6447c_0
>>>
>>>   python           pkgs/main/linux-64::python-3.6.12-hcff3b4d_2
>>>   python-dateutil  pkgs/main/noarch::python-dateutil-2.8.1-py_0
>>>   pytz             pkgs/main/noarch::pytz-2020.1-py_0
>>>   rdkit            rdkit/linux-64::rdkit-2018.03.2.0-py36h6bb024c_1
>>>   readline         pkgs/main/linux-64::readline-8.0-h7b6447c_0
>>>   setuptools       pkgs/main/linux-64::setuptools-50.3.0-py36hb0f4dca_1
>>>   six              pkgs/main/noarch::six-1.15.0-py_0
>>>  
>>>
>>>


[Rdkit-discuss] Different InChI: RDKit Knime Vs RDKit Python

2019-11-12 Thread Christos Kannas
Dear RDKiters,

I'm having the following problem.
I have a workflow that standardises compounds and as part of the process it
generates standard InChI and InChIkey for the compound. The output is
stored in an SDF.
If I parse the SDF to a dataframe in jupyter notebook, then use the mol
object to generate standard inchi, for a small number of compounds the new
standard InChI is slightly different than the one generated in Knime
environment.

Environments Details:
- RDKit Knime Nodes: 3.8.0v201906261723
- RDKit Python (conda): 2018.09.3, 2019.03.4, 2019.09.1

See image: https://imgur.com/a/EnYoHWG

Best,

Christos

Christos Kannas

Scientific Software Developer (Cheminformatics)


Re: [Rdkit-discuss] Adding molecules to pandas dataframe

2019-07-25 Thread Christos Kannas
Hi Gianluca,

Yes you can do that.

You create a list of molecule objects from the mol2 files and then you
assign this list to a new column in your dataframe.
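For example (a Python sketch, assuming mol2_files is a list of file paths):

from rdkit import Chem

mols = []
for mol2 in mol2_files:
    # MolFromMol2File reads one molecule from a .mol2 file (None on failure)
    mol = Chem.MolFromMol2File(mol2)
    mols.append(mol)

df["Molecule"] = mols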

Best,

Christos

Christos Kannas

Scientific Software Developer (Cheminformatics)


On Thu, 25 Jul 2019 at 15:45, Gianluca Sforna  wrote:

> Hi all,
> is it possible to manually add molecules to a pandas dataframe? I am
> reading a bunch of mol2 files, adding some properties (including some
> atom highlighting), then I'd like to add the resulting molecule to the
> dataframe in order to show its depiction along with the data.
> However, API docs and examples I found around always assume you have a
> SMILES string to start with.
>
> Any pointers?
>
> --
> Gianluca Sforna
>
>


Re: [Rdkit-discuss] Using RdKit in Parallel

2019-02-20 Thread Christos Kannas
Hi Stamatia,

Yes, SDMolSupplier is not thread safe.
My guess is that this is due to the nature of the SDF format: a molecule record
spans multiple lines and you do not know a priori the number of lines per
molecule, so the file cannot simply be split between threads/processes.

Given that, your proposed approach is the preferred one:
process each SDF file in a separate process and return the matched molecules.

I would advise using the concurrent.futures package (
https://docs.python.org/3/library/concurrent.futures.html) instead
of multiprocessing,
as it provides an abstraction layer on top of multiprocessing.
See the example on ProcessPoolExecutor.

One important thing to remember when returning the list of matched
molecules: make sure you preserve the molecule objects (
https://rdkit.org/docs/GettingStartedInPython.html#preserving-molecules), as
transferring data between processes in Python requires the data to be
picklable.
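
A rough sketch of the per-file approach with ProcessPoolExecutor follows. The function names are made up for illustration, and wrapping the hits in PropertyMol (so that SD properties survive pickling) is one option among several; returning SMILES or mol blocks would work as well.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def list_sdf_files(directory):
    """Return the full paths of all .sdf files in a directory, sorted by name."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".sdf")
    )


def search_file(job):
    """Worker: return every molecule in one SDF file that matches the SMARTS."""
    # RDKit is imported inside the worker so each process loads it on its own.
    from rdkit import Chem
    from rdkit.Chem.PropertyMol import PropertyMol

    path, smarts = job
    patt = Chem.MolFromSmarts(smarts)
    hits = []
    for mol in Chem.SDMolSupplier(path):
        if mol is not None and mol.HasSubstructMatch(patt):
            # PropertyMol keeps the SD properties when the result is pickled.
            hits.append(PropertyMol(mol))
    return hits


def search_directory(directory, smarts, max_workers=None):
    """Search each SDF file in its own process and merge the matches."""
    jobs = [(path, smarts) for path in list_sdf_files(directory)]
    matches = []
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        for hits in pool.map(search_file, jobs):
            matches.extend(hits)
    return matches
```

With your data this would be called as search_directory(r'E:\Data', 'NC(NNC=O)=O'). On Windows the call has to sit under an if __name__ == "__main__": guard, because the worker processes import the script again when they start.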

Best,

Christos

Christos Kannas

Cheminformatics Researcher & Software Developer



On Wed, 20 Feb 2019 at 11:28, Stamatia Zavitsanou <
stamatia.zavitsa...@oriel.ox.ac.uk> wrote:

> Hello everyone,
>
>
> We have been writing a script that searches through a large number of
> molecules within different files for a common substructure. To speed this
> up we have been attempting to run the script in parallel (see the scripts
> below). However, the online tutorial notes refer to problems with
> using SDMolSupplier in parallel. We were wondering what the issue is
> and how we could circumvent it to speed up some of our calculations.
>
>
> Non-parallel
>
>
> from __future__ import print_function
> from rdkit import Chem
> import os
> from progressbar import ProgressBar
>
> pbar = ProgressBar()
> matches = []
> directory = r'Q:\Data2'
> patt = Chem.MolFromSmarts('NC(NNC=O)=O')
>
> # collect every molecule in every .sdf file that matches the pattern
> for file in pbar(os.listdir(directory)):
>     filename = os.fsdecode(file)
>     if filename.endswith(".sdf"):
>         f = os.path.join(directory, filename)
>         suppl = Chem.SDMolSupplier(f)
>         for mol in suppl:
>             if mol is None: continue
>             if mol.HasSubstructMatch(patt):
>                 matches.append(mol)
>
> w = Chem.SDWriter(r'C:\Users\tom.watts\Desktop\datasmarts4c.sdf')
> for m in matches: w.write(m)
> w.close()  # flush the output file
> print(filename)
>
>
>
> Parallel
>
>
> # imports as in the non-parallel script, plus:
> import multiprocessing
> from joblib import Parallel, delayed
>
> pbar = ProgressBar()
> directory = r'E:\Data'
> patt = Chem.MolFromSmarts('NC(NNC=O)=O')
> w = Chem.SDWriter(r'C:\Users\tom.watts\Desktop\SearchDataNonly.sdf')
>
> # build the list of .sdf files to search
> l = []
> for file in pbar(os.listdir(directory)):
>     filename = os.fsdecode(file)
>     if filename.endswith(".sdf"):
>         f = os.path.join(directory, filename)
>         l.append(f)
>
> num_cores = multiprocessing.cpu_count()
> print(num_cores)
> lock = multiprocessing.Lock()
>
> def Search(i):
>     matches = []  # local list, so each call returns only its own hits
>     suppl = Chem.SDMolSupplier(i)
>     for mol in suppl:
>         if mol is None: continue
>         if mol.HasSubstructMatch(patt):
>             matches.append(mol)
>     return matches
>
> results = Parallel(n_jobs=20)(delayed(Search)(i) for i in l)
>
>
>
> We also wish to use a second script  that opens one SDF file and then
> runs a loop over each molecule in the file. This is currently done
> serially and we were wondering if it could be made parallel.
>
>
>
> suppl = Chem.SDMolSupplier('Red3.sdf')
> for mol in suppl:
>     patt = Chem.MolFromSmarts('NC(N)=O')
>     num = mol.GetSubstructMatches(patt)
>     logger.debug(Chem.MolToSmiles(mol))
>     h = len(num)
>     m3 = Chem.AddHs(mol)
>     cids = AllChem.EmbedMultipleConfs(m3, numConfs)
>
>
>
> Any comments can be useful.
>
>
> Thanks a lot,
>
> Stamatia Zavitsanou


Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Christos Kannas
I would call it a "feature"...
I guess running conformer optimization (e.g. ETKDG, UFF, MMFF94) after the
embedding would be good practice...

> I think I do vaguely remember that InChI gives precedence to 3D
> coordinates if present over anything else for the determination of
> stereochemistry.

I guess that's why there are inconsistencies [sometimes] when the molecule
has been generated from a SMILES instead of from a MOL block with 2D or 3D
coordinates...

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer



On Wed, 19 Dec 2018 at 01:12, Markus Sitzmann 
wrote:

> I think I do vaguely remember that InChI gives precedence to 3D
> coordinates if present over anything else for the determination of
> stereochemistry. And I think that is what happens here: the AllChem
> embedding of the molecule adds 3D coordinates which are not present for the
> original molecule created straight from InChI. Probably the minimization of
> the structure during the embedding is “turning around” the stereochemistry
> (probably you could have a long discussion whether this is a bug or a
> feature).
>
> Markus
>
> -
> |  Markus Sitzmann
> |  markus.sitzm...@gmail.com
>
> On 18. Dec 2018, at 19:43, Jason Biggs  wrote:
>
> see https://github.com/rdkit/rdkit/issues/1852, and
> https://sourceforge.net/p/rdkit/mailman/message/36309813/
>
> You can see it in the smiles if you remove stereo after embedding, then
> re-detect stereo from the conformation.
>
> inchi1 =
> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> m3 = Chem.Mol(m2)
> Chem.rdmolops.RemoveStereochemistry(m3)
> Chem.rdmolops.AssignStereochemistryFrom3D(m3)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> sm3 = Chem.MolToSmiles(m3)
> print(sm1 == sm2)  # returns true
> print(sm2 == sm3) # returns false
>
>
> The difference between sm2 and sm3 is just swapping a \ for a /,
> confirming what Christos was able to read from the InChI.
>
> Why does the inchi reflect the 3D bond stereo but the smiles doesn't until
> you remove and re-detect the stereo?  Does the InChI code go to the 3D
> structure when present and ignore stereo information in the mol object?
>
> Jason Biggs
>
>
> On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas 
> wrote:
>
>> Hi Jean-Marc,
>>
>> The difference is due to bond orientation (if my inchi analysis skills
>> are correct).
>> See the bold bond layer below (14-7+ vs 14-7-).
>>
>>
>> m1 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1
>>
>> m2 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1
>>
>>
>> Not sure why it happens, but I've seen it multiple times...
>>
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>>
>>
>> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <
>> jm.nuzill...@univ-reims.fr> wrote:
>>
>>> Thank you for your answer but alatis might not be adapted to my current
>>> problem.
>>>
>>> Attempting to understand what was changed by the embedding step I wrote:
>>>
>>> inchi1 =
>>>
>>> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>>> m1 = Chem.MolFromInchi(inchi1)
>>> m1 = Chem.AddHs(m1)
>>> m2 = Chem.Mol(m1)
>>> AllChem.EmbedMolecule(m2)
>>> sm1 = Chem.MolToSmiles(m1)
>>> sm2 = Chem.MolToSmiles(m2)
>>> print(sm1)
>>> print(sm2)
>>> print(sm1 == sm2)
>>> inc1 = Chem.MolToInchi(m1)
>>> inc2 = Chem.MolToInchi(m2)
>>> print(inc1)
>>> print(inc2)
>>> print(inc1 == inc2)
>>>
>>> Molecules m1 and m2 have identical SMILES representations
>>> but different InChI representations, which I find odd.
>>>
>>> All the best,
>>>
>>> Jean-Mar

Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Christos Kannas
Hi Jean-Marc,

The difference is due to bond orientation (if my inchi analysis skills
are correct).
See the bold bond layer below (14-7+ vs 14-7-).


m1 -> 
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1

m2 -> 
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1


Not sure why it happens, but I've seen it multiple times...
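
For anyone who would rather not compare the strings by eye, the layers can be compared mechanically. This is just a small string-handling sketch (the helper names are mine, and it assumes a standard InChI in which each layer prefix occurs only once):

```python
def split_inchi_layers(inchi):
    """Map each InChI layer to its content, keyed by the layer's prefix letter."""
    parts = inchi.split("/")
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        layers[part[0]] = part[1:]
    return layers


def diff_inchi_layers(inchi_a, inchi_b):
    """List (layer, value_a, value_b) for every layer that differs."""
    a, b = split_inchi_layers(inchi_a), split_inchi_layers(inchi_b)
    keys = list(a) + [key for key in b if key not in a]
    return [(key, a.get(key), b.get(key)) for key in keys if a.get(key) != b.get(key)]


m1_inchi = ("InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)"
            "9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3"
            "/b13-6-,14-7+/t17-,19-/m1/s1")
m2_inchi = m1_inchi.replace("14-7+", "14-7-")  # the m2 string quoted above
print(diff_inchi_layers(m1_inchi, m2_inchi))
# [('b', '13-6-,14-7+', '13-6-,14-7-')]
```

Only the /b (double bond stereo) layer differs; the /t, /m and /s layers are identical for m1 and m2.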


Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer



On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <
jm.nuzill...@univ-reims.fr> wrote:

> Thank you for your answer but alatis might not be adapted to my current
> problem.
>
> Attempting to understand what was changed by the embedding step I wrote:
>
> inchi1 =
>
> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> print(sm1)
> print(sm2)
> print(sm1 == sm2)
> inc1 = Chem.MolToInchi(m1)
> inc2 = Chem.MolToInchi(m2)
> print(inc1)
> print(inc2)
> print(inc1 == inc2)
>
> Molecules m1 and m2 have identical SMILES representations
> but different InChI representations, which I find odd.
>
> All the best,
>
> Jean-Marc
>
>
>
>
> On 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss wrote:
> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
> >> Is there any more deterministic procedure than the one of trying until
> >> success is obtained?
> >>
> >> How do I determine the InChI string of a conformer obtained after
> >> multiple embedding?
> >
> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
> >
> > Generally speaking the problem with InChI is that the only *required*
> > layer is the formula. Therefore *an* InChI string cannot be used to
> > differentiate conformers, you need the InChI string with all the
> > relevant layers and all the protons.
> >
> > https://www.nature.com/articles/sdata201773
> >


Re: [Rdkit-discuss] Matching Generalized Compounds

2018-08-23 Thread Christos Kannas
Hi Kovas,

You have two fuzzy compounds that you are trying to match against each other, because
intuition says that the any-atom notation [*:1] from m1 should match the
Fluorine [F:11] in m2 and the any-atom [*:14] in m2 should match the Carbon [CH3:4]
in m1.
The issue here is that you create two query compounds from m1 and m2, each of which
will match only its own specific substructures. Query-to-query matching is not
trivial.

In order to do what you want you need a query compound that combines their
characteristics, which is what Paolo showed.
Using MCS and modifying atom properties, Paolo created that query compound:
'[*:1]-[CH2:2]-[C:3](-[*:4])=[CH2:5]' or
'[*:1]-[CH2X4:2]-[CX3:3](-[*:4])=[CH2X3:5]'
Also bear in mind that Paolo's approach changes the starting compounds, as
they now resemble the generic query compound that combines their fuzzy
atoms.

https://gist.github.com/CKannas/ac1a4791dec909552d7c8899cfaff030

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer



On Thu, 23 Aug 2018 at 12:36, Paolo Tosco 
wrote:

> Dear Kovas,
>
> It looks like GetSubstructMatch() only finds a match if the dummy atom is
> in the query, not if it is in the molecule that you are matching the query
> against.
>
> This notebook presents a possible solution off the top of my head:
>
> https://gist.github.com/ptosco/a35ac28a14103b47096f6d6af1aec831
>
> which does not involve changes to the C++ layer, even though it is
> computationally more expensive and will fail with disconnected fragments as
> it uses FindMCS(). There may be better solutions - this is what I came up
> with last night in the little time I had available.
>
> Cheers,
> P.
>
> On 08/22/18 19:34, Kovas Palunas wrote:
>
> Hi All,
>
>
>
> I’m interested in having GetSubstructMatches return non-“null” results in
> the following example.  The results should lead to a match where atom 1
> maps to atom 11, 2 to 12, etc.
>
>
>
> m1 = Chem.MolFromSmiles('[*:1][CH2:2][C:3]([CH3:4])=[CH2:5]')
> m2 = Chem.MolFromSmiles('[F:11][CH2:12][C:13]([*:14])=[CH2:15]')
>
> ### do something here so that the mols will match ###
> qp = Chem.AdjustQueryParameters()
> qp.makeDummiesQueries = True
> m1 = Chem.AdjustQueryProperties(m1, qp)
> m2 = Chem.AdjustQueryProperties(m2, qp)
>
> # I’d like both of the following to return results
> m1.GetSubstructMatches(m2)
> m2.GetSubstructMatches(m1)
>
>
>
> My understanding of why these mols currently do not match is as follows:
>  because only the dummy atoms are made queries (based on my query parameter
> adjustment), when one mol is matched to another dummy 1 may match to F:11,
> but dummy 14 will then not match to methyl:14.  This is because (as I
> understand), normal atoms can only be matched by queries, and cannot match
> them themselves.
>
>
>
> Potential ideas to make this work as I’d like:
>
>    1. Override atom.Match in the Python code. I'm not sure this would
>    work, since the C++ version of this function is what would be called during
>    GetSubstructMatches.
>    2. Override atom.Match in the C++ code. I'm not quite sure how to do
>    this, or what side effects it might have. Ideally the changes I make would
>    only affect this example (and other similar ones).
>    3. Make all atoms in both molecules QueryAtoms, but otherwise leave
>    them unchanged. I'm not quite sure how to do this!
>
>
>
> Does anyone have any ideas for what the best approach here would be, or
> knows if there is already built in functionality for something like this?
> I’d prefer to not use SMARTS to construct my molecules if possible, since I
> don’t really think of them as queries, just as other molecules in the
> system that happen to not be fully specified.
>
>
>
> - Kovas
>
>
>
>


Re: [Rdkit-discuss] RDKit PostgreSQL

2018-05-03 Thread Christos Kannas
Hi Paolo,

Thanks, I forgot to mention that I got it working on a linux box that I use
as dev machine.

Eventually I plan to deploy it on a Linux server.

When I have to build the RDKit cartridge, will I need to build just the
cartridge, or rebuild both RDKit and the cartridge?

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608



On Thu, 3 May 2018 at 09:47, Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:

> Hi Christos,
>
> you may use an existing PostgreSQL installation, but then you will need to
> build the cartridge from source. You may rebuild the conda RDKit cartridge
> against your system PostgreSQL - I have done it on Linux and I think it
> should be doable on macOS too. I'll give it a go tonight and get back to
> you.
>
> Cheers,
> Paolo
>
> On 05/03/18 08:16, Christos Kannas wrote:
>
> Hi Paolo,
>
> Will do, thanks!
>
> I was playing with it yesterday and I managed to have it in a working
> manner.
> I'm going through the ChEMBL example.
>
> Questions:
>
>    1. The conda recipe for rdkit-postgresql95 works only with PostgreSQL
>    9.5 and rdkit_2017.09.03, given that those are in the conda environment. If
>    I need to connect it to an existing PostgreSQL installation I will have to
>    build RDKit and the cartridge from source, won't I?
>    2. Will the cartridge work with the latest RDKit 2018 version, if I
>    build it from source?
>
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Chem[o]informatics Researcher & Software Developer
>
> Mob (UK): +44 (0) 7447700937
> Mob (Cyprus): +357 99530608
>
>
>
> On Thu, 3 May 2018 at 08:05, Paolo Tosco <paolo.tosco.m...@gmail.com>
> wrote:
>
>> Hi Christos,
>>
>> It was definitely possible last time I tried, but it was some time ago.
>> Give it a try and get back to me directly (i.e., off-list) if you have
>> problems.
>>
>> Cheers,
>> p.
>>
>> On 1 May 2018, at 16:31, Christos Kannas <chriskan...@gmail.com> wrote:
>>
>> Hi RDKiters,
>>
>> Is it possible to install RDKit PostgreSQL cartridge via Anaconda on
>> MacOS?
>> Or would be better to try with a Linux VM?
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>> Mob (UK): +44 (0) 7447700937
>> Mob (Cyprus): +357 99530608
>>


Re: [Rdkit-discuss] RDKit PostgreSQL

2018-05-03 Thread Christos Kannas
Hi Paolo,

Will do, thanks!

I was playing with it yesterday and managed to get it working.
I'm going through the ChEMBL example.

Questions:

   1. The conda recipe for rdkit-postgresql95 works only with PostgreSQL
   9.5 and rdkit_2017.09.03, given that those are in the conda environment.
   If I need to connect it to an existing PostgreSQL installation, I will
   have to build RDKit and the cartridge from source, won't I?
   2. Will the cartridge work with the latest RDKit 2018 version if I build
   it from source?


Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>


On Thu, 3 May 2018 at 08:05, Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:

> Hi Christos,
>
> It was definitely possible last time I tried, but it was some time ago.
> Give it a try and get back to me directly (i.e., off-list) if you have
> problems.
>
> Cheers,
> p.
>
> On 1 May 2018, at 16:31, Christos Kannas <chriskan...@gmail.com> wrote:
>
> Hi RDKiters,
>
> Is it possible to install the RDKit PostgreSQL cartridge via Anaconda on MacOS?
> Or would it be better to try with a Linux VM?
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Chem[o]informatics Researcher & Software Developer
>
> Mob (UK): +44 (0) 7447700937
> Mob (Cyprus): +357 99530608
>
> [image: View Christos Kannas's profile on LinkedIn]
> <http://cy.linkedin.com/in/christoskannas>
>
>
>
>


[Rdkit-discuss] RDKit PostgreSQL

2018-05-02 Thread Christos Kannas
Hi RDKiters,

Is it possible to install the RDKit PostgreSQL cartridge via Anaconda on MacOS?
Or would it be better to try with a Linux VM?

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>


Re: [Rdkit-discuss] seg fault when importing Chem on OS-X 10.12

2018-04-16 Thread Christos Kannas
Hi Patrick,

I had a similar problem with RDKit 2017.09.03 on MacOS, using rdkit channel
in anaconda.
Using the conda-forge channel with python 3.5.5 and ipython 6.2 works fine.

I can post my env tomorrow from work.

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 16 April 2018 at 18:57, Brian Cole <col...@gmail.com> wrote:

> Pat, the beta for 2018.03 seems to work fine:
>
> conda install -c rdkit/label/beta rdkit
>
> My current guess is that there is a boost python dependency problem with
> RDKit 2017 for Python 3. I've twiddled between a few different boost
> versions in the conda environment, but to no success in getting 2017
> working.
>
> -Brian
>
>
> On Mon, Apr 16, 2018 at 1:20 PM, Brian Cole <col...@gmail.com> wrote:
>
>> I can reproduce the problem, and the issue does appear to be different
>> than the previous issue. Reproducible with the following on OSX:
>>
>> $ conda create -c rdkit -n rdkit_2017 rdkit python=3.5
>> $ source activate rdkit_2017
>> $ python -c 'import rdkit.rdBase'
>> Segmentation fault: 11
>>
>> $ lldb -- python -c 'import rdkit.rdBase'
>> (lldb) target create "python"
>> Current executable set to 'python' (x86_64).
>> (lldb) settings set -- target.run-args  "-c" "import rdkit.rdBase"
>> (lldb) run
>> Process 14929 launched: '/Users/coleb/anaconda2/envs/rdkit_2017/bin/python'
>> (x86_64)
>> Process 14929 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>> EXC_BAD_ACCESS (code=1, address=0xa9)
>> frame #0: 0x0001001c301d python`visit_decref + 13
>> python`visit_decref:
>> ->  0x1001c301d <+13>: testb  $0x40, 0xa9(%rax)
>> 0x1001c3024 <+20>: jne0x1001c302f   ; <+31>
>> 0x1001c3026 <+22>: xorl   %eax, %eax
>> 0x1001c3028 <+24>: addq   $0x8, %rsp
>> Target 0: (python) stopped.
>> (lldb) bt
>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>> EXC_BAD_ACCESS (code=1, address=0xa9)
>>   * frame #0: 0x0001001c301d python`visit_decref + 13
>> frame #1: 0x0001000a07a7 python`tupletraverse + 55
>> frame #2: 0x0001001c1eb3 python`collect + 291
>> frame #3: 0x000100196972 python`Py_Finalize + 226
>> frame #4: 0x0001001bfbc8 python`Py_Main + 3096
>> frame #5: 0x00011301 python`main + 497
>> frame #6: 0x7fff5fe23015 libdyld.dylib`start + 1
>> frame #7: 0x7fff5fe23015 libdyld.dylib`start + 1
>> (lldb) info threads
>>
>>
>>
>> On Mon, Apr 16, 2018 at 1:11 PM, Brian Cole <col...@gmail.com> wrote:
>>
>>> An issue like this was fixed in the past:
>>> https://github.com/rdkit/rdkit/commit/009dd580527caa662de8bac5ad0c60f1e9bc90cd
>>>
>>> Will see if I can reproduce this.
>>>
>>> -Brian
>>>
>>> On Mon, Apr 16, 2018 at 12:09 PM, Patrick Walters <wpwalt...@gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I installed the latest RDKit using conda
>>>>
>>>> conda create -c rdkit -n rdkit_2017 rdkit
>>>>
>>>> When I import Chem I get a seg fault
>>>>
>>>> ➜  ~ source activate rdkit_2017
>>>> (rdkit_2017) ➜  ~ python
>>>> Python 3.5.5 |Anaconda, Inc.| (default, Mar 12 2018, 16:25:05)
>>>> [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>> >>> from rdkit import Chem
>>>> [1]85097 segmentation fault  python
>>>>
>>>> Has anyone else encountered this?
>>>>
>>>> Thanks,
>>>>
>>>> Pat
>>>>
>>>>
>>>> 
>>>>
>>>>
>>>
>>
>
> 
>
>


Re: [Rdkit-discuss] Exhaustive Library Enumeration

2018-01-17 Thread Christos Kannas
Hi Andy,

A better option is to sanitize the products of a reaction enumeration
before using them as reactants.
Look at this example from RDKit "Getting Started" documentation.

Note that the molecules that are produced by the chemical reaction
processing code are not sanitized, as this artificial reaction demonstrates:

>>> rxn = AllChem.ReactionFromSmarts('[C:1]=[C:2][C:3]=[C:4].[C:5]=[C:6]>>[C:1]1=[C:2][C:3]=[C:4][C:5]=[C:6]1')
>>> ps = rxn.RunReactants((Chem.MolFromSmiles('C=CC=C'), Chem.MolFromSmiles('C=C')))
>>> Chem.MolToSmiles(ps[0][0])
'C1=CC=CC=C1'
>>> p0 = ps[0][0]
>>> Chem.SanitizeMol(p0)
rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>>> Chem.MolToSmiles(p0)
'c1ccccc1'

PS: I forgot that the results of a reaction enumeration were not
sanitised, until I saw the error on the command line.
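
A minimal sketch of sanitizing reaction products before reusing them, assuming the same artificial reaction as in the documentation excerpt. The helper name sanitized_products is made up for illustration and is not part of RDKit:

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def sanitized_products(rxn, reactants):
    """Run rxn on reactants and return the sorted unique SMILES of all
    products that pass sanitization (hypothetical helper)."""
    smiles = set()
    for product_set in rxn.RunReactants(reactants):
        for prod in product_set:
            try:
                # Products of RunReactants are not sanitized automatically
                Chem.SanitizeMol(prod)
            except Exception:
                # Skip products RDKit cannot sanitize (e.g. bad valences)
                continue
            smiles.add(Chem.MolToSmiles(prod))
    return sorted(smiles)


rxn = AllChem.ReactionFromSmarts(
    '[C:1]=[C:2][C:3]=[C:4].[C:5]=[C:6]>>[C:1]1=[C:2][C:3]=[C:4][C:5]=[C:6]1')
reactants = (Chem.MolFromSmiles('C=CC=C'), Chem.MolFromSmiles('C=C'))
print(sanitized_products(rxn, reactants))
```

Products that survive this step can be passed on as reactants for the next round; the ones that fail are simply dropped here, which may or may not be what you want for your library.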

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 18 January 2018 at 00:07, Andy Jennings <andy.j.jenni...@gmail.com>
wrote:

> Hi Christos,
>
> Many thanks for the reply. I hadn't appreciated that the presence of a
> single invalid reagent would bring the entire thing crashing down, rather
> than issuing a warning/error and moving onto other molecules in the set.
> Good to know, and I'll have to be less lazy in my code ;-)
>
> Best,
> Andy
>
> On Wed, Jan 17, 2018 at 1:56 PM, Christos Kannas <chriskan...@gmail.com>
> wrote:
>
>> Hi Andy,
>>
>> The reason that your code breaks is that the second product of the third
>> iteration ('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') is not a valid molecule.
>> Calling Chem.MolFromSmiles('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') therefore
>> returns None.
>> So you have to filter out the molecules that are not valid.
>>
>> See this Jupyter Notebook
>> <https://gist.github.com/CKannas/11bb9bcaa9435dd18a0bb969501219b2> at
>> cell 5 the 1st line in the while loop.
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>> [image: View Christos Kannas's profile on LinkedIn]
>> <http://cy.linkedin.com/in/christoskannas>
>>
>> On 17 January 2018 at 18:16, Andy Jennings <andy.j.jenni...@gmail.com>
>> wrote:
>>
>>> Hi RDKitters,
>>>
>>> I have a question and an observation on the topic of library enumeration.
>>>
>>> First, the question: is there a call within RDKit to trigger the
>>> exhaustive reaction of reagents? For example, if I have two reagents - a
>>> primary amine and an alkyl chloride - can I tell RDKit to enumerate the
>>> reaction as though there were an excess of each reagent? In my case here
>>> the reaction would continue until the alkylation can no longer occur
>>> because there are no more valences available on the amine and I would
>>> either be tri-alkylated for a neutral product or quat-alkylated for a
>>> positively charged product
>>> e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R
>>>
>>> This brings me to my observation. When I try to attempt exactly this by
>>> repeatedly exposing the product to the reagent again I am able to drive it
>>> to exhaustion *in some cases*.
>>>
>>> For example, in the example above where RCl is benzyl chloride and my
>>> smirks is:
>>> [#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]
>>> I do drive the final product to be exclusively the tri-alkylated amine.
>>> Success.
>>>
>>> However, when I attempt the same thing with an amine with more than one
>>> reactive nitrogen (e.g. NN) I don't get a single product with 6
>>> alkylations, I get two unique products, each with three alkylations. One
>>> product has two alkylations on the first nitrogen and one on the second,
>>> the other product has three alkylations on the first nitrogen and none on
>>> the second. Attempting to drive the reaction once again leads to a
>>> 'reaction called with None reactants' ValueError. My dreadful code is below
>>> and the output is
>>> Reaction 1: ['NNCc1ccccc1']
>>> Reaction 2: ['NN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNNCc2ccccc2)cc1']
>>> Reaction 3: ['c1ccc(CNN(Cc2ccccc2)Cc2ccccc2)cc1',
>>> 'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
>>> Reaction 4: ValueError
>>>
>>> Any pointers would be gre

Re: [Rdkit-discuss] Exhaustive Library Enumeration

2018-01-17 Thread Christos Kannas
Hi Andy,

The reason that your code breaks is that the second product of the third
iteration ('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') is not a valid molecule.
Calling Chem.MolFromSmiles('NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1') therefore
returns None.
So you have to filter out the molecules that are not valid.

See this Jupyter Notebook
<https://gist.github.com/CKannas/11bb9bcaa9435dd18a0bb969501219b2> at cell
5 the 1st line in the while loop.
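
For readers without access to the notebook, the filtering step can be sketched roughly as follows. This is an illustration rather than the notebook's exact code, and parse_valid is a hypothetical helper name:

```python
from rdkit import Chem


def parse_valid(smiles_list):
    """Parse SMILES strings and keep only those RDKit accepts
    (hypothetical helper, not an RDKit function)."""
    mols = []
    for smi in smiles_list:
        # MolFromSmiles returns None when parsing or sanitization fails
        mol = Chem.MolFromSmiles(smi)
        if mol is not None:
            mols.append(mol)
    return mols


candidates = [
    'NNCc1ccccc1',                        # benzylhydrazine, valid
    'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1',  # neutral N with four bonds, invalid
]
valid = parse_valid(candidates)
print(len(valid), 'of', len(candidates), 'SMILES are valid')
```

Only the valid molecules should then be handed to EnumerateLibraryFromReaction in the next iteration.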

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 17 January 2018 at 18:16, Andy Jennings <andy.j.jenni...@gmail.com>
wrote:

> Hi RDKitters,
>
> I have a question and an observation on the topic of library enumeration.
>
> First, the question: is there a call within RDKit to trigger the
> exhaustive reaction of reagents? For example, if I have two reagents - a
> primary amine and an alkyl chloride - can I tell RDKit to enumerate the
> reaction as though there were an excess of each reagent? In my case here
> the reaction would continue until the alkylation can no longer occur
> because there are no more valences available on the amine and I would
> either be tri-alkylated for a neutral product or quat-alkylated for a
> positively charged product
> e.g. CCN + RCl -> CCN(R)(R)R or CC[N+](R)(R)(R)R
>
> This brings me to my observation. When I try to attempt exactly this by
> repeatedly exposing the product to the reagent again I am able to drive it
> to exhaustion *in some cases*.
>
> For example, in the example above where RCl is benzyl chloride and my
> smirks is:
> [#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]
> I do drive the final product to be exclusively the tri-alkylated amine.
> Success.
>
> However, when I attempt the same thing with an amine with more than one
> reactive nitrogen (e.g. NN) I don't get a single product with 6
> alkylations, I get two unique products, each with three alkylations. One
> product has two alkylations on the first nitrogen and one on the second,
> the other product has three alkylations on the first nitrogen and none on
> the second. Attempting to drive the reaction once again leads to a
> 'reaction called with None reactants' ValueError. My dreadful code is below
> and the output is
> Reaction 1: ['NNCc1ccccc1']
> Reaction 2: ['NN(Cc1ccccc1)Cc1ccccc1', 'c1ccc(CNNCc2ccccc2)cc1']
> Reaction 3: ['c1ccc(CNN(Cc2ccccc2)Cc2ccccc2)cc1',
> 'NN(Cc1ccccc1)(Cc1ccccc1)Cc1ccccc1']
> Reaction 4: ValueError
>
> Any pointers would be great, as would any pre-existing library enumeration
> code. The examples I've found shipped with RDKit don't appear to allow me
> to name the products using a combination of the reagent names (useful for
> tracking library content).
>
> Best,
> Andy
>
>  Code snippet 
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> amine = Chem.MolFromSmiles('NN')
> acyl = Chem.MolFromSmiles('c1ccccc1CCl')  # benzyl chloride
> rxn = AllChem.ReactionFromSmarts('[#7:1].[#6:2][Cl:3]>>[#6:2][#7:1].[Cl:3]')
>
> # First reaction
> reactantListMols = [amine, acyl]
> prods = AllChem.EnumerateLibraryFromReaction(rxn, [reactantListMols, reactantListMols])
> prods = list(prods)
> smis = list(set([Chem.MolToSmiles(x[0], isomericSmiles=True) for x in prods]))
> print(smis)
> # ['NNCc1ccccc1']
>
> # Now repeat until doom
> for i in range(0, 10):
>     oldproducts = [Chem.MolFromSmiles(x) for x in smis]
>     reactantListMols = oldproducts + [acyl]
>     prods = AllChem.EnumerateLibraryFromReaction(rxn, [reactantListMols, reactantListMols])
>     prods = list(prods)
>     smis = list(set([Chem.MolToSmiles(x[0], isomericSmiles=True) for x in prods]))
>     print(smis)
>
>  End Code 
>
>
>
> 
>
>


Re: [Rdkit-discuss] adding fragment to existing molecule

2017-08-07 Thread Christos Kannas
Hi Per,

I can think of 2 approaches to solve this.

The 1st is to have fragments of molecules that have an explicit connection
point, e.g. OH[*] and C[*], and use RDKit's functionality for combining
fragments.
The 2nd is to define a reaction for this using SMIRKS or Reaction
SMILES, e.g. [OH-].C>>COH, and use RDKit's reaction functionality
to perform the reaction on your molecules.
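
A rough sketch of a third, more manual route (combine, bond, sanitize, then add hydrogens and embed), which avoids the "RingInfo not initialized" error because the molecule is sanitized before the geometry is edited. The choice of pentane, the attachment atom index and the 2.0 Å bond length are assumptions made only for this illustration:

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms as rdmt

# Heavy-atom fragments without explicit hydrogens (illustrative choices)
pentane = Chem.MolFromSmiles('CCCCC')
oxygen = Chem.MolFromSmiles('O')  # becomes an OH group once bonded

# Atoms of the second fragment are appended after those of the first
combo = Chem.RWMol(Chem.CombineMols(pentane, oxygen))
carbon_idx = 2
oxygen_idx = pentane.GetNumAtoms()
combo.AddBond(carbon_idx, oxygen_idx, Chem.BondType.SINGLE)

mol = combo.GetMol()
Chem.SanitizeMol(mol)      # recomputes valences and ring information
mol_h = Chem.AddHs(mol)    # hydrogens are added only after the bond exists
AllChem.EmbedMolecule(mol_h, randomSeed=42)

conf = mol_h.GetConformer()
rdmt.SetBondLength(conf, carbon_idx, oxygen_idx, 2.0)
print(Chem.MolToSmiles(mol), rdmt.GetBondLength(conf, carbon_idx, oxygen_idx))
```

For a model that deliberately violates normal valence rules, full sanitization will fail; in that case calling Chem.FastFindRings on the molecule may be enough to initialise the ring information, though that is untested for this particular case.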

Hope this was a bit helpful.

Regards,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 7 August 2017 at 18:46, Per Jr. Greisen <pgrei...@gmail.com> wrote:

> Hi Nikolaus and Ling,
>
> Thanks for your help (the atom number shouldn't be 43, but it still gives
> the error; I will clarify). Yes, Nikolaus, you are right, it is a
> sanitization issue. In this case I am trying to use it as a molecular
> editor to build a model molecule (a transition state model, to be exact).
> I would normally do this by calling some other script, but it would be
> very nice to do all of it in the framework of RDKit. Can this be done?
> Thanks
>
> On Mon, Aug 7, 2017 at 12:05 PM, Stiefl, Nikolaus <
> nikolaus.sti...@novartis.com> wrote:
>
>> Hi Per
>>
>> Just by looking at your code I would assume you have a sanitization
>> issue. You create your pentane molecule and then add H’s. This will
>> saturate each single carbon. When you then add a bond between the two
>> fragments your atom 3 will have a valence of 5 and this causes issues.
>>
>> Maybe do the fragment combination first and then add the H’s? Or do an
>> explicit handling of the correct carbon you link to upfront.
>>
>> Hope this helps
>>
>> Nik
>>
>>
>>
>>
>>
>> *From: *"Per Jr. Greisen" <pgrei...@gmail.com>
>> *Date: *Sunday 6 August 2017 at 19:55
>> *To: *RDKit <rdkit-discuss@lists.sourceforge.net>
>> *Subject: *[Rdkit-discuss] adding fragment to existing molecule
>>
>>
>>
>> Hi all,
>>
>>
>>
>> I am trying to add a fragment to an existing molecule using RDkit - I
>> start by generating the desired molecules I would like to combine:
>>
>>
>>
>> oh = '[OH-]'
>> ohh = Chem.MolFromSmiles(oh)
>> oh = Chem.AddHs(ohh)
>> oh.SetProp("_Name","OH-")
>> AllChem.EmbedMolecule(oh, AllChem.ETKDG())
>>
>> smiles_ = 'C'
>> m = Chem.MolFromSmiles(smiles_)
>> m_h = Chem.AddHs(m)
>> m_h.SetProp("_Name","XP")
>> AllChem.EmbedMolecule(m_h, AllChem.ETKDG())
>>
>> I combine them which works fine:
>>
>> combo = Chem.CombineMols(m_h,oh)
>>
>> and I can add the bond between the desired atoms:
>>
>> edcombo = Chem.EditableMol(combo)
>> edcombo.AddBond(3,1,order=Chem.rdchem.BondType.SINGLE)
>> back = edcombo.GetMol()
>>
>> The problem arises when I want to edit the geometry between the two:
>>
>> from rdkit.Chem import rdMolTransforms as rdmt
>> conf = back.GetConformer(0)
>> rdmt.SetBondLength(conf,3,43,10)
>>
>> writer3 = Chem.SDWriter('out_long.sdf')
>> writer3.write(back,confId=0)
>>
>>
>>
>>
>>
>>
>>
>> RuntimeError  Traceback (most recent call last)
>>  in ()
>>       2 conf = back.GetConformer(0)
>>       3
>> ----> 4 rdmt.SetBondLength(conf,3,43,10)
>>       5
>>       6 writer3 = Chem.SDWriter('out_long.sdf')
>>
>> RuntimeError: Pre-condition Violation
>> RingInfo not initialized
>> Violation occurred on line 66 in file Code/GraphMol/RingInfo.cpp
>> Failed Expression: df_init
>> RDKIT: 2017.03.3
>> BOOST: 1_56
>>
>> So I am not sure how fix - thanks in advance
>>
>>
>>
>>
>>
>> --
>>
>> With kind regards
>>
>>
>> Per
>>
>
>
>
> --
> With kind regards
>
> Per
>
> 
>
>


Re: [Rdkit-discuss] Using RDKit in PyCharm and Anaconda on Windows

2017-05-31 Thread Christos Kannas
Hi Richard,

I use rdkit with Miniconda and PyCharm as you do, which makes things easier
as far as autocomplete in the IDE is concerned.
But I do not use PyCharm's Python console, for that same reason; instead I
have a cmd window with my Python environment activated to run tests.

I think Greg's proposed solution of a script is the best approach for
having different Python environments activated along with additional
instances of PyCharm.
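
To check which environment a given console or IDE is really using, a small diagnostic along these lines may help. It is only a sketch: describe_environment is a made-up helper, and it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation:

```python
import importlib.util
import os
import sys


def describe_environment(module_name="rdkit"):
    """Collect hints about the running interpreter (hypothetical helper)."""
    spec = importlib.util.find_spec(module_name)
    path_entries = os.environ.get("PATH", "").split(os.pathsep)
    prefix = os.path.normcase(sys.prefix)
    return {
        # Interpreter that is actually executing this code
        "executable": sys.executable,
        # Root directory of the Python installation or environment
        "prefix": sys.prefix,
        # Set by conda when an environment has been activated
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # True if the package can be located on sys.path
        "module_found": spec is not None,
        # False suggests the environment was not activated before launch
        "prefix_on_path": any(
            os.path.normcase(p).startswith(prefix) for p in path_entries if p),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the module is found but prefix_on_path is False, that would be consistent with the "DLL load failed" situation, where the interpreter is called directly without activating the environment first.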

Regards,

Christos

Christos Kannas

Researcher
Ph.D Student

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 31 May 2017 at 09:56, Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Richard,
>
> I'm glad you found something that works. I'm not enough of a Windows
> expert to be able to provide an example of how to fix it, but here's what
> you need to do:
>
> When you activate a conda environment a bunch of stuff happens behind the
> scenes, it's not generally sufficient to just call the corresponding
> interpreter directly (as you've seen). So you really need to activate the
> environment before invoking PyCharm. I would guess that the easiest way to
> do this is to create a batch file that first activates the environment
> and then invokes PyCharm. That way you only have one command that you need
> to execute.
>
> A more "fixing the problem with a big hammer" solution, and probably not
> something that would be recommended, would be to install the rdkit into the
> root conda environment (i.e. do `conda install -c rdkit rdkit` without
> activating an environment first). This should allow things to work directly.
>
> -greg
>
>
>
> On Tue, May 30, 2017 at 10:10 PM, West, Richard <r.w...@northeastern.edu>
> wrote:
>
>> We're having trouble getting RDKit to work in a PyCharm project using an
>> Anaconda interpreter (Python 2.7), on Windows 8.1.
>> Has anyone had success with this and can guide us?
>> The trouble is we get an
>>
>>   ImportError: DLL load failed: The specified module could not be found.
>>
>> when trying to import rdkit (or rdBase).
>>
>> We have tried many variations of the following, but here is a basic
>> recipe of what does/doesn't work:
>> 1. Make a new conda environment (called 'eg1'), install rdkit ('conda
>> install -c rdkit rdkit')
>> 2. From a cmd.exe prompt, use this environment ('activate eg1') load
>> python ('python') and import rdkit ('import rdkit') it works fine.
>> 3. From PyCharm, create a Project Interpreter (pointing to
>> 'C:\Anaconda2\envs\eg1\python.exe'), and use this to run a script or
>> create a new Python Console in which you 'import rdkit', leading to the
>> "DLL load failed" message.
>> 4. We have tried manually adding a bunch of things to the "Interpreter
>> Paths" in PyCharm, but without success (perhaps we just didn't add the
>> right thing).
>>
>>
>> 
>>
>> Update: just before I hit "send" on this request for help, we stumbled
>> across this posting of the same problem, and solution, from Christian
>> Ribeaud:
>> https://intellij-support.jetbrains.com/hc/en-us/community/posts/115000244450-DLL-load-failed
>>
>> It seems that if we open cmd.exe, activate the environment, and then
>> launch PyCharm exe from there, it works.
>> I'm sharing this here because it took us a while to find the other post,
>> but also to ask: is there a "better" way?
>>
>> Cheers,
>> Richard
>>
>>
>> --
>> Richard H. West, Ph.D.
>> Assistant Professor, Department of Chemical Engineering,
>> Northeastern University, 360 Huntington Ave, Boston, MA 02115
>> http://northeastern.edu/comocheng
>> Phone: 617-373-5163
>>
>>
>> 
>>
>
>
> 
>
>


Re: [Rdkit-discuss] How to transform SMARTS of aromatic structures so that their aromatic atoms could be any?

2017-05-19 Thread Christos Kannas
Hi Alexis,

In SMARTS you can define an aromatic atom with "a".
So I'm thinking that something like the following might produce more
correct generalised SMARTS patterns.

https://gist.github.com/CKannas/7a9e2768461260461155257fd30c2152

*Note: Please check if the chemistry is correct.*
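
In case the gist is not reachable, the idea of swapping aromatic atoms for "a" can be sketched with a regular expression. This is not the code from the gist, only an assumption-laden illustration: bracketed atoms such as [nH] are left untouched, and recursive SMARTS are not handled:

```python
import re

# Tokens are tried in this order at each position: a whole bracket atom,
# a two-letter halogen, or a single unbracketed aromatic atom symbol.
_TOKEN = re.compile(r"(\[[^\]]*\])|(Cl|Br)|([cnops])")


def generalise_aromatic_atoms(smarts):
    """Replace unbracketed aromatic atoms with the SMARTS wildcard 'a'
    (hypothetical helper)."""
    def _swap(match):
        # Only the third group is an aromatic atom; keep everything else
        return "a" if match.group(3) else match.group(0)
    return _TOKEN.sub(_swap, smarts)


print(generalise_aromatic_atoms("[*]c1coc(Cl)n1"))
print(generalise_aromatic_atoms("[*]c1coc(Cl)c1Cc2ccccc2"))
```

Because "a" already implies aromatic bonds between neighbouring aromatic atoms, no ":" bookkeeping at the ring closures should be needed, but the resulting patterns are worth checking with Chem.MolFromSmarts.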

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 19 May 2017 at 12:52, Alexis Parenty <alexis.parenty.h...@gmail.com>
wrote:

> Hi everyone,
>
>
> I need a function that could generalize any aromatic rings from a SMARTS:
>
> [image: Inline images 1]
>
>
> I have noticed that it is possible to rearrange most of SMARTS strings
> into a general aromatic SMARTS strings by following those simple rules:
>
> 1. Exchange any lower-case letter of a SMARTS string with “:[*]”.
>
> 2. Catch the two cycle junctions of the SMARTS:
>
>    a. Where a number (1-9) appears for the first time in the string, insert
>       a colon after the digit (for example “[*]1” to “[*]1:”).
>
>    b. Where the same number appears a second time, move the colon
>       before the digit (for example “[*]1:” to “[*]:1”).
>
>
> I have written a function (see below) that works fine with any SMARTS
> containing a single aromatic ring. But it does get buggy when I have a
> SMARTS with more than one aromatic ring:
>
>
>
> [image: Inline images 2]
>
>
>
> def get_aromatic_generalised_smarts(smarts):
>     for arom_atom in ("c", "o", "n", "s"):
>         smarts = smarts.replace(arom_atom, "x")
>     # take care of explicit hydrogen atoms
>     smarts = smarts.replace("[xH]", "x")
>
>     for char in smarts:
>         if char == 'x':
>             smarts = smarts.replace(char, ":[*]")
>
>     for char in smarts:
>         if char.isdigit():
>             if ("[*]" + char) in smarts:
>                 for cycle_junction in ("[*]1", "[*]2", "[*]3", "[*]4", "[*]5",
>                                        "[*]6", "[*]7", "[*]8", "[*]9"):
>                     # this makes the second cycle junction OK but introduces an
>                     # error in the first cycle junction, corrected just below
>                     smarts = smarts.replace(cycle_junction, "[*]:" + cycle_junction[-1])
>                 # correct the first cycle junction
>                 smarts = smarts.replace(":[*]:" + char, "[*]" + char, 1)
>                 break
>     return smarts
>
>
> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)n1"))
> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1"))
>
> print(get_aromatic_generalised_smarts("[*]c1coc(Cl)c1Cc2ccccc2"))
>
>
> Am I heading in the right direction? I can't get my head around SMARTS
> with more than one aromatic ring...
>
> Maybe regular expressions would be more appropriate? Maybe there is an
> RDKit function that does the trick from a mol object?
>
>
> Thanks,
>
>
> Alexis
>
>
>
>
>
> 
>
>


Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Christos Kannas
Hi Steve,

I think it would be better to use a similarity metric based on fingerprints.
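
As a rough illustration of what that could look like (the choice of the RDKit topological fingerprint is only an assumption, other fingerprints would work the same way, and tanimoto is a made-up helper name):

```python
from rdkit import Chem, DataStructs


def tanimoto(smiles_a, smiles_b):
    """Tanimoto similarity of two SMILES using RDKit topological
    fingerprints (hypothetical helper)."""
    mol_a = Chem.MolFromSmiles(smiles_a)
    mol_b = Chem.MolFromSmiles(smiles_b)
    if mol_a is None or mol_b is None:
        raise ValueError("could not parse one of the SMILES")
    fp_a = Chem.RDKFingerprint(mol_a)
    fp_b = Chem.RDKFingerprint(mol_b)
    return DataStructs.TanimotoSimilarity(fp_a, fp_b)


# Kekule and aromatic forms of benzene are the same molecule to RDKit
print(tanimoto('C1=CC=CC=C1', 'c1ccccc1'))
# Benzene against ethanol
print(tanimoto('c1ccccc1', 'CCO'))
```

A similarity of 1.0 is a strong hint but not proof that two records describe the same structure, so an exact check (for example comparing canonical SMILES of the parsed molecules) is probably still worth doing on the top hits.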

Regards,

Christos

Christos Kannas

Researcher
Ph.D Student

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 28 November 2016 at 18:25, Stephen O'hagan <soha...@manchester.ac.uk>
wrote:

> Has anyone come up with fool-proof way of matching structurally equivalent
> molecules?
>
>
>
> Unique SMILES or InChI string comparisons don’t appear to work, presumably
> because there are different but equivalent structures, e.g. explicit vs
> non-explicit H’s, Kekulé vs aromatic, isomeric vs non-isomeric forms,
> tautomers etc.
>
>
>
> I also expect that comparing InChI strings might need something more than
> just a simple string comparison, such as masking off stereo information
> when you don’t care about stereo isomers.
>
>
>
> I assume there are suitable tools within RDKit that can do this?
>
>
>
> N.B. I need to collate tables from several sources that have a mix of
> smiles / InChI / sdf molecular representations.
>
>
>
> I usually use RDKit via Python and/or Knime.
>
>
>
> Cheers,
>
> Steve.
>
>
>
> 
>
>


Re: [Rdkit-discuss] Count carbon atoms

2015-10-07 Thread Christos Kannas
Hi Joss,

Yes there is an easier way, by using substructure search, i.e. do a
substructure search for [C] and then get the number of matches.

I hope this example is readable and answers your question.

In [1]:
from rdkit import rdBase
print(rdBase.rdkitVersion)

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole

2014.09.2

In [2]:
m = Chem.MolFromSmiles("c1c1")
m

Out[2]:
[image: Inline images 1]

In [3]:
patt = Chem.MolFromSmarts("[C]")
print(Chem.MolToSmarts(patt))

C

In [4]:
pm = m.GetSubstructMatches(patt)
print(len(pm))

8
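
One caveat worth a short sketch: in SMARTS, [C] only matches aliphatic carbon, while [#6] matches any carbon, aromatic or not. Toluene is used here purely as an illustrative molecule, and count_matches is a hypothetical helper:

```python
from rdkit import Chem


def count_matches(mol, smarts):
    """Number of substructure matches of a SMARTS pattern in mol
    (hypothetical helper)."""
    patt = Chem.MolFromSmarts(smarts)
    return len(mol.GetSubstructMatches(patt))


toluene = Chem.MolFromSmiles('Cc1ccccc1')
print(count_matches(toluene, '[C]'))   # aliphatic carbons only
print(count_matches(toluene, '[#6]'))  # all carbons, aromatic included

# Equivalent count without SMARTS, via atomic numbers
print(sum(1 for atom in toluene.GetAtoms() if atom.GetAtomicNum() == 6))
```

For very large molecules the match list can be capped by the maxMatches argument of GetSubstructMatches, so the atomic-number loop may be the safer way to count.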


Regards,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
<http://cy.linkedin.com/in/christoskannas>

On 7 October 2015 at 10:12, Joos Kiener <joos.kie...@gmail.com> wrote:

> Hi all,
>
> is there an easy way I'm missing to get the number of C-Atoms in a
> molecule?
>
> Currently I iterate all atoms and check if it's symbol is C. Doesn't seem
> very efficient.
>
> Best Regards,
>
> Joos Kiener
>
>
>
>


Re: [Rdkit-discuss] install rdkit cartridge

2015-04-24 Thread Christos Kannas
Hi Tim,

I'm not sure about this, but I think that in order to use the cartridge you
have to build RDKit and cartridge from source.

Regards,

Christos Kannas

Sent from my Galaxy Note 4.
On 24 Apr 2015 13:47, Tim Dudgeon tdudgeon...@gmail.com wrote:

 Is it possible when using the packages?
 I'm trying to get a reproducible build process so prefer not to have to
 build from sources.

 Tim

 On 24/04/2015 13:29, Axel Pahl wrote:
  The instructions below apply to an RDKit installation from source
  and are probably not applicable to the Debian package; sorry for
  causing confusion.
 
  Kind regards,
  Axel
 
  On 04/24/2015 02:24 PM, Axel Pahl wrote:
  Dear Tim,
 
  please take a look at this README in your installation:
  your RDKit folder/Code/PgSQL/rdkit/README
 
  It essentially boils down to this:
  $ cd $RDBASE/Code/PgSQL/rdkit
  $ make
  $ sudo make install
  $ make installcheck

  (only the "make install" step has to be performed as root)
 
  Kind regards,
  Axel
 
  On 04/24/2015 01:33 PM, Tim Dudgeon wrote:
  How to install RDKit cartridge?
  The instructions here show how to use it, but not how to install it.
  http://www.rdkit.org/docs/Cartridge.html
 
  I have RDKit and PostgreSQL installed from the corresponding Debian
  packages, but alas no cartridge :-(
 
  Tim
 
 
 
 
 
 
 






Re: [Rdkit-discuss] Rdkit-discuss Digest, Vol 88, Issue 13

2015-02-13 Thread Christos Kannas
Hi Samuel,

The problem is that x is a tuple in your case not an rdkit molecule object.
Check your code to see what mols list actually has inside. I'm guessing
that x is a tuple containing a molecule object plus some other info.
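
A minimal sketch of the fix, assuming that each entry of mols is a (molecule, name) tuple. The data and the query below are hypothetical placeholders, not the real PAINS filters from the script.

```python
from rdkit import Chem

# Hypothetical data: each entry is a (molecule, name) tuple, not a bare Mol.
mols = [
    (Chem.MolFromSmiles("Oc1ccccc1O"), "catechol"),
    (Chem.MolFromSmiles("CCO"), "ethanol"),
]

# Placeholder query standing in for a PAINS pattern.
pains = [Chem.MolFromSmarts("c1ccccc1[OH]")]

for p in pains:
    # Unpack each tuple so that HasSubstructMatch is called on the Mol itself.
    match = [name for mol, name in mols if mol.HasSubstructMatch(p)]
    print(len(match), match)
```

If the tuples have a different layout, only the unpacking in the list comprehension needs to change.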

Best.

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
http://cy.linkedin.com/in/christoskannas

On 13 February 2015 at 12:14, segie...@sanbi.ac.za wrote:

 Kindly look through this aspect of my script below.

 I keep getting the error:

 AttributeError: 'tuple' object has no attribute 'HasSubstructMatch'


 #!/usr/bin/python
 from rdkit import Chem
 from rdkit.Chem import Draw
 from rdkit.Chem import AllChem

 pains = []  # Contains my PAINS query molecules


 # Loop through data with PAINS query

 for p in pains:
     match = [x for x in mols if x.HasSubstructMatch(p)]
     a = len(match)
     print(a)

 Thank you

 Samuel Ayodele Egieyeh
 South African National Bioinformatics Institute
 University of the Western Cape




 
  Today's Topics:
 
 1. Re: Chem.Draw darker colors (Soren Wacker)
 2. Re: Chem.Draw darker colors (David Hall)
 3. Re: Inchi installation in postgresql database driving me mad
(Jan Holst Jensen)
 4. Re: Docker images (Greg Landrum)
 5. Re: Inchi installation in postgresql database driving me mad (JP)
 
 
  --
 
  Message: 1
  Date: Thu, 12 Feb 2015 17:58:42 +
  From: Soren Wacker swac...@ucalgary.ca
  Subject: Re: [Rdkit-discuss] Chem.Draw darker colors
  To: Greg Landrum greg.land...@gmail.com
  Cc: RDKit Discuss rdkit-discuss@lists.sourceforge.net
  Message-ID:

 cf4e4cc78f22f44bb773c76c066a3dd109255...@itcimexch03.uc.ucalgary.ca
  Content-Type: text/plain; charset=us-ascii
 
  but how?
  Soren
  
  From: Greg Landrum [greg.land...@gmail.com]
  Sent: Wednesday, February 11, 2015 10:27 PM
  To: Soren Wacker
  Cc: RDKit Discuss
  Subject: Re: [Rdkit-discuss] Chem.Draw darker colors
 
  My other answer about using the DrawingOptions object applies here too.
  Instead of setting elemDict to be a defaultDict, you would just change the
  colors for S and F to whatever you prefer.
 
  -greg
 
 
  On Thu, Feb 12, 2015 at 12:13 AM, Soren Wacker
  swac...@ucalgary.ca wrote:
 
  Hi,
 
  I printed some molecules with the Draw module of rdkit and generated some
  useful figures.
  I noticed that for some elements the contrast to white is very low.
  Therefore, I suggest changing the default colors to the darker versions.
 
  E.g. darkyellow instead of yellow for sulfur
  and darkcyan instead of cyan for fluorine
 
  kind regards
  Soren
 
 
 
 
 
 
  --
 
  Message: 2
  Date: Thu, 12 Feb 2015 13:14:35 -0500
  From: David Hall li...@cowsandmilk.net
  Subject: Re: [Rdkit-discuss] Chem.Draw darker colors
  To: Soren Wacker swac...@ucalgary.ca
  Cc: RDKit Discuss rdkit-discuss@lists.sourceforge.net,
      Greg Landrum greg.land...@gmail.com
  Message-ID: 1a6e481e-8a31-46cc-b719-6fd1a4e2c...@cowsandmilk.net
  Content-Type: text/plain; charset=us-ascii
 
 
  In [6]: opt.elemDict
  Out[6]:
  {0: (0.5, 0.5, 0.5),
   1: (0.55, 0.55, 0.55),
   7: (0, 0, 1),
   8: (1, 0, 0),
   9: (0.2, 0.8, 0.8),
   15: (1, 0.5, 0),
   16: (0.8, 0.8, 0),
   17: (0, 0.8, 0),
   35: (0.5, 0.3, 0.1)}
 
  presumably, you set fluorine and sulfur by changing the values of 9 and
  16.
 
  -David
 
  On Feb 12, 2015, at 12:58 PM, Soren Wacker swac...@ucalgary.ca wrote:
 
  but how?
  Soren
  
  From: Greg Landrum [greg.land...@gmail.com]
  Sent

Re: [Rdkit-discuss] Modified Mol objects with concurrent.futures

2015-02-02 Thread Christos Kannas
Hi Michael,

The problem occurs because child processes return their results using
pickle, and an ordinary RDKit molecule object loses its properties when it
is pickled.
A solution that I use is to convert the molecule objects to PropertyMol
objects, which retain their properties.
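
A minimal sketch of that workaround. It uses a plain pickle round trip to stand in for what the process pool does when it returns a result. In the worker function one would return PropertyMol(mol) instead of mol.

```python
import pickle

from rdkit import Chem
from rdkit.Chem.PropertyMol import PropertyMol

mol = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
mol.SetProp("Name", "Alanine")

# Wrap the molecule before it crosses a process boundary.
wrapped = PropertyMol(mol)

# The pickle round trip mimics what concurrent.futures does with results.
restored = pickle.loads(pickle.dumps(wrapped))

print(restored.GetProp("Name"))
```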

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
http://cy.linkedin.com/in/christoskannas

On 2 February 2015 at 09:03, Reutlinger, Michael 
michael.reutlin...@roche.com wrote:

 Hi all,

 I am currently trying to parallelize part of a script using RDKIT and
 concurrent.futures. The function that is executed in parallel returns
 processed molecules as RDKIT Mol objects.

 Without parallelization everything is fine and the Mol objects keep all
 the properties that they had before the processing. When using
 concurrent.futures, the returned molecules lose all properties and seem to
 be created from scratch maybe with unknown side-effects.

 I am wondering if anyone experienced the same issue and knows how to
 circumvent this. I attached a ipython notebook with a small script
 demonstrating the issue.

 Best,
 Michael




 Example Code:

 from concurrent import futures
 from rdkit import Chem
 from rdkit.Chem import AllChem
 from rdkit.Chem.Draw import IPythonConsole

 def process(mol):
     if "Name" not in mol.GetPropNames():
         print("Processing: Name missing")
     mol.SetProp("Processed", "True")
     return mol

 mol = Chem.MolFromSmiles("N[C@@H](C)C(=O)O")
 mol.SetProp("Name", "Alanine")

 with futures.ProcessPoolExecutor(max_workers=1) as pool:
     future = pool.submit(process, mol)
     molOut = future.result()
     if "Name" not in molOut.GetPropNames():
         print("Result: Name missing")
     if "Processed" not in molOut.GetPropNames():
         print("Result: Processed missing")








Re: [Rdkit-discuss] Can't kelulize

2014-12-09 Thread Christos Kannas
Sergio,

You have to use GetSubstructMatches.
Look at my sample here
http://nbviewer.ipython.org/gist/CKannas/5a762b97c52e389d492e.
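
A minimal sketch of the difference, assuming biphenyl as a stand-in molecule (it is not the ligand from the mol2 file). GetSubstructMatch returns only the first match, while GetSubstructMatches returns one tuple per unique match.

```python
from rdkit import Chem

# Biphenyl, used only for illustration: two separate aromatic six-membered rings.
mol = Chem.MolFromSmiles("c1ccccc1-c1ccccc1")
pattern6 = Chem.MolFromSmarts("[c,n]1[c,n][c,n][c,n][c,n][c,n]1")

first = mol.GetSubstructMatch(pattern6)    # a single tuple of atom indices
every = mol.GetSubstructMatches(pattern6)  # a tuple of tuples, one per ring

print(first)
print(every)
```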

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
http://cy.linkedin.com/in/christoskannas

On 9 December 2014 at 18:48, Wong, Sergio E. wong...@llnl.gov wrote:

  Dear Ling;

Thank you for pointing out the issue with the lactam ring.  I manually
 changed the bond types in the mol2 file and now the error is gone.  The
 MolFromMol2File function can sanitize the molecule.  However, I still have
 a problem with the output.  Again, my code is:

mol = Chem.MolFromMol2File("%s.lig.%d.mol2" % (name, i), sanitize=True,
                           removeHs=False)

aromatic_6 = "[c,n]1[c,n][c,n][c,n][c,n][c,n]1"
aromatic_5 = "[c,n]1[c,n][c,n][c,n][c,n]1"

pattern6 = Chem.MolFromSmarts(aromatic_6)
pattern5 = Chem.MolFromSmarts(aromatic_5)

print("Pattern 6")
lar = mol.GetSubstructMatch(pattern6)
print(lar)
print("Pattern 5")
lar = mol.GetSubstructMatch(pattern5)
print(lar)

 The output is:

Pattern 6
(0, 1, 3, 5, 7, 9)
Pattern 5
(30, 31, 32, 41, 42)

 So for some reason, the pattern match for an aromatic six-membered ring
 returns the conjugated lactam ring, but fails to recognize the other two
 (all-carbon) aromatic rings in the system.  Interestingly, it correctly
 recognizes the five-membered ring system.  Do you have any ideas on how to
 address the issue?  I am attaching the hand-edited mol2 file.

 Thanks!
 -Sergio



 *From:* S.L. Chan [slch...@yahoo.com]
  *Sent:* Monday, December 08, 2014 8:26 PM
 *To:* Wong, Sergio E.; rdkit-discuss@lists.sourceforge.net

 *Subject:* Re: [Rdkit-discuss] Can't kelulize

   Dear Sergio,
  The lactam ring (atoms 1 2 4 6 8 10) is not really aromatic. The bonds
 4-6, 6-8, 8-10 should all be single rather than aromatic in the mol2 file.
 The remaining three bonds in the ring should be double or single rather
 than aromatic.

  Ling

   --
 *From:* Wong, Sergio E. wong...@llnl.gov
 *To:* rdkit-discuss@lists.sourceforge.net 
 rdkit-discuss@lists.sourceforge.net
 *Sent:* Monday, December 8, 2014 3:27 PM
 *Subject:* Re: [Rdkit-discuss] Can't kelulize

   Dear RDKit users:

 I tried reading a mol2 file using the function MolFromMol2File().  The goal
 of my script is to read the molecule and find 5- or 6-membered aromatic
 rings.  First I got the following error:

 Can't Kekulize mol

 The code I used is as follows:

 mol = Chem.MolFromMol2File("%s.lig.%d.mol2" % (name, i), sanitize=True,
                            removeHs=False)

 As a work-around I tried removing the sanitize flag and did the following


mol = Chem.MolFromMol2File("%s.lig.%d.mol2" % (name, i), sanitize=False,
                           removeHs=False)

aromatic_6 = "[c,n]1[c,n][c,n][c,n][c,n][c,n]1"
aromatic_5 = "[c,n]1[c,n][c,n][c,n][c,n]1"

pattern6 = Chem.MolFromSmarts(aromatic_6)
pattern5 = Chem.MolFromSmarts(aromatic_5)

print("Pattern 6")
lar = mol.GetSubstructMatch(pattern6)
print(lar)
print("Pattern 5")
lar = mol.GetSubstructMatch(pattern5)
print(lar)

 The output should list 3 aromatic six-membered rings and 1 aromatic
 five-membered ring.  Instead I get only the first six-membered ring and no
 listing of the five-membered ring:

Pattern 6
(0, 1, 3, 5, 7, 9)
Pattern 5
()

 So basically, I cannot get around the kekulize function.  I looked at the
 mol2 file (attached) and it correctly lists the bond types as aromatic for
 all of the rings.  Is there a way to use the bond information from the mol2
 file to assign aromaticity?

 Thanks!!

 -Sergio











[Rdkit-discuss] UGM Update

2014-10-24 Thread Christos Kannas
Hi RDKiters,

How is UGM going?
Is there a tweet feed to follow?

Hope you are having a nice and interesting time!

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

[image: View Christos Kannas's profile on LinkedIn]
http://cy.linkedin.com/in/christoskannas


Re: [Rdkit-discuss] Installation of RDKit 2014 on Centos 5.10 (Final)

2014-07-24 Thread Christos Kannas
Hi Enrico,

The latest version of RDKit does not require flex or bison, thankfully.

In the attached file I list the commands that I used to build CMake
(2.8.12.2), Boost libraries (1.55) and RDKit (2014_03_1) on a CentOS 5.10
VM.
In my case I was using an Anaconda Python environment so I didn't have to
install NumPy, since it is bundled with it.

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
http://cy.linkedin.com/in/christoskannas


On 24 July 2014 10:05, Enrico Perspicace e.perspic...@mx.uni-saarland.de
wrote:


 Dear all,

 I would like to install RDKit 2014 on Centos 5.10 (Final) but I did not
 succeed!

 I follow Instructions for Installation on RDKIT website but I got an error
 when I used cmake command line...

 Indeed, cmake is not able to find boost_python library.

 I installed: Python 2.7, atlas, lapack, blas, fftw3, numpy 1.8 via canopy
 1.4.1 (and is working with python import numpy), boost 1.55, flex 2.5.35
 and bison 3.0.2 before performing the RDKit installation.

 I followed the procedure described here:
 https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01376.html

 Please find attached the files that describe my problem.

 Thanks a lot for your help.

 Best regards,

 Enrico Perspicace




RDKit Installation on CentOS 5.10
=================================
First create some directories:
mkdir devel
cd devel/
mkdir boost
mkdir cmake
mkdir RDKit


Install CMake
-------------
cd cmake
wget http://www.cmake.org/files/v2.8/cmake-2.8.12.2.tar.gz
tar -xzvf cmake-2.8.12.2.tar.gz
cd cmake-2.8.12.2
./bootstrap
make
sudo make install
ctest


Install Boost
-------------
cd ~/devel/boost
wget -O boost_1_55_0.tar.bz2 http://sourceforge.net/projects/boost/files/boost/1.55.0/boost_1_55_0.tar.bz2/download
tar -xjvf boost_1_55_0.tar.bz2
cd boost_1_55_0
./bootstrap.sh --with-libraries=python,regex

Note: 64-bit OS

./b2 address-model=64 cflags=-fPIC cxxflags=-fPIC


Install RDKit
-------------
cd ~/devel/RDKit/
wget http://downloads.sourceforge.net/project/rdkit/rdkit/Q1_2014/RDKit_2014_03_1.tgz
tar -xzvf RDKit_2014_03_1.tgz
cd RDKit_2014_03_1
mkdir build
cd build/

Note: I was using an Anaconda Python environment

export PATH=~/anaconda/bin:$PATH
export RDBASE=~/devel/RDKit/RDKit_2014_03_1
export LD_LIBRARY_PATH=$RDBASE/lib:~/devel/boost/boost_1_55_0/stage/lib:$LD_LIBRARY_PATH
export PYTHONPATH=$RDBASE:$PYTHONPATH
cmake -D PYTHON_LIBRARY=~/anaconda/lib/python2.7/config/libpython2.7.a \
      -D PYTHON_INCLUDE_DIR=~/anaconda/include/python2.7/ \
      -D PYTHON_EXECUTABLE=~/anaconda/bin/python \
      -D BOOST_ROOT=~/devel/boost/boost_1_55_0 ..
make
make install
ctest



[Rdkit-discuss] Aldehyde functional group does not identify Formaldehyde as an aldehyde

2014-07-16 Thread Christos Kannas
Hi Greg and RDKiters,

I'm doing some functional group (FG) substructure matching in some
molecules and I noticed that the SMARTS pattern used to define the aldehyde
FG identifies acetaldehyde and all larger aldehydes but fails to identify
formaldehyde as part of the family.

Is this something that should happen?
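
A minimal sketch of the behaviour. It assumes an aldehyde pattern of the form "[CX3H1](=O)[#6]", which may differ from the exact definition used in the attached notebook. The broader pattern is a hypothetical alternative, not an RDKit definition.

```python
from rdkit import Chem

# Assumed aldehyde pattern: needs exactly one H and a carbon neighbour.
narrow = Chem.MolFromSmarts("[CX3H1](=O)[#6]")

# Hypothetical extension that also accepts the CH2=O carbon of formaldehyde.
broad = Chem.MolFromSmarts("[$([CX3H1](=O)[#6]),$([CX3H2]=O)]")

examples = {
    "formaldehyde": "C=O",
    "acetaldehyde": "CC=O",
    "acetone": "CC(C)=O",
    "formic acid": "OC=O",
}

results = {}
for name, smiles in examples.items():
    mol = Chem.MolFromSmiles(smiles)
    results[name] = (mol.HasSubstructMatch(narrow), mol.HasSubstructMatch(broad))
    print(name, results[name])
```

Formaldehyde has two hydrogens and no carbon neighbour on the carbonyl carbon, so a pattern that requires both H1 and a [#6] neighbour cannot match it.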

Attached is an ipython notebook showing this behaviour.

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on LinkedIn]
http://cy.linkedin.com/in/christoskannas


Aldehyde FG.ipynb
Description: Binary data


Re: [Rdkit-discuss] autodock vina pdbqt file to mol2

2014-05-09 Thread Christos Kannas
Hi Jan,

AutoDock comes with a set of utilities (MGLTools) that can convert pdb to
pdbqt and vice versa.
If I recall correctly, they can also convert pdbqt to mol2. See this discussion
http://autodock.1369657.n2.nabble.com/ADL-pdbqt-to-mol2-td6755769.html

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on
LinkedIn]http://cy.linkedin.com/in/christoskannas


On 9 May 2014 20:17, Jan Domanski jan...@gmail.com wrote:

 Hi guys,

 I'm really stuck here: I have some output from autodock vina in a rather
 obscure pdbqt format. It's a little bit like pdb but not quite. I'm trying
 to get back a mol2 file.

 The autodock pdbqt file has only the polar hydrogens in it – part of the
 trick is to re-add the hydrogens.

 Example autodock vina output is attached (it's a conformer of the ACE
 native ligand DUDE).

 First of all, I convert that to a PDB file by doing a simple sed,
 sed -e '/ROOT/d' -e '/BRANCH/d'
 Then I reorder the atoms to match those of the original
 crystal_ligand.mol2 (because autodock re-orders the atoms duh).

 Finally, I save a mol2 file out (attached) ordered as the original
 crystal_ligand and with polar hydrogens (for each pose of a conformer).

 Let's go to rdkit and try to add hydrogens:

 mol = Chem.MolFromMol2File(output, removeHs=False)
 mol2 = AllChem.AddHs(mol, addCoords=True)
 print mol.GetNumAtoms(), mol2.GetNumAtoms()
 44 44

 So, only the implicit hydrogens are present. Calling AddHs doesn't raise
 an error and it doesn't really change the number of hydrogens...

 Now this may not be the best way of doing things: what I care for is to
 get a mol2 from autodock vina that I can compare to the original mol2 from
 DUD (same atom order, same number of atoms). Maybe there are other ways to
 achieve this: one idea would be to inject the docked pose coordinates into
 the original mol2 atoms (heavy and polar hydrogens) and somehow adjust
 the non-polar hydrogens.

 Thanks,

 - Jan







[Rdkit-discuss] Sanitization Errors

2014-04-24 Thread Christos Kannas
Hi all,

I have a dozen compounds, some of which contain a charged atom
(see the attached SMILES file).

When I parse the file I get sanitization errors on the compounds with the
charged atoms.
But when I view them with MarvinView 6.2.0 all goes fine.

I'm using an RDKit build from github, version 2014.03.1pre.

In order to see what sanitization error occurs in each case I did the
following:

1. To parse all compounds without sanitization

suppl = Chem.SmilesMolSupplier('data/SurfactantTestCompounds.smi',
                               titleLine=True, sanitize=False)
molsList = [x for x in suppl if x is not None]
print(len(molsList))

2. Sanitize the compounds and catch specific errors

for m in molsList:
    error = Chem.SanitizeMol(m, catchErrors=True)
    if error:
        print(m.GetProp("_Name"), Chem.MolToSmiles(m), error)

2.1 the output is as follows

NaLAS ()c1ccc(S(=O)(=O)O[Na+])cc1 SANITIZE_PROPERTIES
NaOLAS ()C1=CC=CC=C1S(=O)(=O)O[Na+] SANITIZE_PROPERTIES
SLES3EO OCCOCCOCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
SLES2EO OCCOCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
SLES1EO OCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
SDS OS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
DTAC [N+](C)(C)(C)[Cl-] SANITIZE_PROPERTIES
Sdoc (=O)O[Na+] SANITIZE_PROPERTIES

3. Visualize compounds

Draw.MolsToGridImage(molsList, molsPerRow=5, legends=[x.GetProp('_Name')
for x in molsList], kekulize=True)

For visualized output check
http://nbviewer.ipython.org/gist/anonymous/11248962/Sanitization_Errors.ipynb

Is this an expected behaviour?
Is there something I can do as a fix?

Regards,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on
LinkedIn]http://cy.linkedin.com/in/christoskannas


SurfactantTestCompounds.smi
Description: application/smil


Re: [Rdkit-discuss] Sanitization Errors

2014-04-24 Thread Christos Kannas
Hi Patrick,

Thanks.

So the correct representation is that sodium should not have an explicit
bond to the oxygen.
Instead of O=S(c1ccc(C(CCC))cc1)(O-[Na+])=O I should
have O=S(c1ccc(C(CCC))cc1)([O-])=O.[Na+]

The same applies to the rest of my compounds.

As for the nitrogen, it already has 4 bonds to carbons, so the chloride
should be disconnected:
[N+]([Cl-])(C)(C)C -> [N+](C)(C)C.[Cl-]
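
A minimal sketch of how that disconnection could be automated for the sodium salts. The helper name disconnect_sodium and the dodecyl sulfate SMILES are hypothetical (the SMILES in the attached file are not reproduced here), and the sketch handles only bonds to [Na+].

```python
from rdkit import Chem


def disconnect_sodium(smiles):
    """Break bonds to [Na+] and move the negative charge onto the neighbour."""
    # Parse without sanitization so the covalently bound Na+ is accepted.
    rw = Chem.RWMol(Chem.MolFromSmiles(smiles, sanitize=False))

    # Collect the bonds first; the molecule is not edited while iterating.
    to_break = []
    for atom in rw.GetAtoms():
        if atom.GetSymbol() == "Na" and atom.GetFormalCharge() == 1:
            for nbr in atom.GetNeighbors():
                to_break.append((atom.GetIdx(), nbr.GetIdx()))

    for na_idx, nbr_idx in to_break:
        rw.RemoveBond(na_idx, nbr_idx)
        nbr = rw.GetAtomWithIdx(nbr_idx)
        nbr.SetFormalCharge(nbr.GetFormalCharge() - 1)

    # Sanitization should now succeed on the ionic form.
    Chem.SanitizeMol(rw)
    return Chem.MolToSmiles(rw)


# Hypothetical input: sodium dodecyl sulfate drawn with a covalent O-Na bond.
fixed = disconnect_sodium("CCCCCCCCCCCCOS(=O)(=O)O[Na+]")
print(fixed)
```

The same idea would apply to the chloride of DTAC, with the bond to [Cl-] removed and the nitrogen left as [N+].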

Regards,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on
LinkedIn]http://cy.linkedin.com/in/christoskannas


On 24 April 2014 11:37, Patrick Walters wpwalt...@gmail.com wrote:

 It looks like the problem here is a covalent bond to the counter ion.

 Pat


 On Thu, Apr 24, 2014 at 6:04 AM, Christos Kannas chriskan...@gmail.com wrote:

 Hi all,

 I have a dozen compounds, some of which contain a charged atom
 (see the attached SMILES file).

 When I parse the file I get sanitization errors on the compounds with the
 charged atoms.
 But when I view them with MarvinView 6.2.0 all goes fine.

 I'm using an RDKit build from github, version 2014.03.1pre.

 In order to see what sanitization error occurs in each case I did the
 following:

 1. To parse all compounds without sanitization

 suppl = Chem.SmilesMolSupplier('data/SurfactantTestCompounds.smi',
                                titleLine=True, sanitize=False)
 molsList = [x for x in suppl if x is not None]
 print(len(molsList))

 2. Sanitize the compounds and catch specific errors

 for m in molsList:
     error = Chem.SanitizeMol(m, catchErrors=True)
     if error:
         print(m.GetProp("_Name"), Chem.MolToSmiles(m), error)

 2.1 the output is as follows

 NaLAS ()c1ccc(S(=O)(=O)O[Na+])cc1 SANITIZE_PROPERTIES
 NaOLAS ()C1=CC=CC=C1S(=O)(=O)O[Na+] SANITIZE_PROPERTIES
 SLES3EO OCCOCCOCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
 SLES2EO OCCOCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
 SLES1EO OCCOS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
 SDS OS(=O)(=O)O[Na+] SANITIZE_PROPERTIES
 DTAC [N+](C)(C)(C)[Cl-] SANITIZE_PROPERTIES
 Sdoc (=O)O[Na+] SANITIZE_PROPERTIES

 3. Visualize compounds

 Draw.MolsToGridImage(molsList, molsPerRow=5, legends=[x.GetProp('_Name')
 for x in molsList], kekulize=True)

 For visualized output check
 http://nbviewer.ipython.org/gist/anonymous/11248962/Sanitization_Errors.ipynb

 Is this an expected behaviour?
 Is there something I can do as a fix?

 Regards,

 Christos

 Christos Kannas

 Researcher
 Ph.D Student

 Mob (UK): +44 (0) 7447700937
 Mob (Cyprus): +357 99530608

 [image: View Christos Kannas's profile on 
 LinkedIn]http://cy.linkedin.com/in/christoskannas







Re: [Rdkit-discuss] Opposite of GetSubstructureMatches()

2014-04-17 Thread Christos Kannas
Hi JP,

Well, I've noticed the same thing in my tests. The only reason I can think
of is that it adds either the preceding or the following atom to the tuple
of atom indices.
I hope Greg or someone else can shed some light on it.

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on
LinkedIn]http://cy.linkedin.com/in/christoskannas


On 17 April 2014 09:32, JP jeanpaul.ebe...@inhibox.com wrote:


 On 16 April 2014 19:13, Christos Kannas chriskan...@gmail.com wrote:

 Chem.PathToSubmol(mol, path)


 Hi there Christos,

 Many thanks for your reply (and idea of using nbviewer)

 There is still something strange happening which I cannot figure out - my
 atom index is a tuple with six elements - and in the resulting submol I
 get seven atoms.  Also the ring is opened in a chain (so some of the
 properties are changing).

 A simple example here:
 http://nbviewer.ipython.org/gist/anonymous/10964449

 Any ideas?


 -
 Jean-Paul Ebejer
 Early Stage Researcher



Re: [Rdkit-discuss] Opposite of GetSubstructureMatches()

2014-04-17 Thread Christos Kannas
Hi Greg,

That's why I had that strange bug with my hydrophobic - hydrophilic
fragmentation: I was passing PathToSubmol a list of atom indices too.
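
A minimal sketch of the approach Greg describes in the quoted message below: collect the bonds whose two end atoms are both in the match and pass those bond indices to PathToSubmol. The helper name submol_from_match and the toluene example are assumptions made for illustration.

```python
from rdkit import Chem


def submol_from_match(mol, atom_ids):
    """Build a submolecule from matched atoms by way of their connecting bonds."""
    atom_set = set(atom_ids)
    bond_ids = [
        bond.GetIdx()
        for bond in mol.GetBonds()
        if bond.GetBeginAtomIdx() in atom_set and bond.GetEndAtomIdx() in atom_set
    ]
    # PathToSubmol expects bond indices, not atom indices.
    return Chem.PathToSubmol(mol, bond_ids)


# Toluene, used only for illustration.
mol = Chem.MolFromSmiles("Cc1ccccc1")
match = mol.GetSubstructMatch(Chem.MolFromSmarts("c1ccccc1"))
sub = submol_from_match(mol, match)

print(sub.GetNumAtoms(), sub.GetNumBonds())
```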

And the solution I've actually found is, as Greg said, iterate through the
atoms of the molecule and find the bonds that connect my query pattern,
hydrophobic atoms, to atoms that are not part of it, aka are hydrophilic.
Then I used Chem.FragmentOnBonds to break the molecule on that list of
bonds.

Here is the IPython Notebook that shows what I'm doing
http://nbviewer.ipython.org/gist/CKannas/10975497
I do not break rings, if some ring atoms are mapped as hydrophobic, and I
also keep terminal carbons (CH3) connected to the adjacent hydrophobic
group, if any. I hope these assumptions are chemically correct.
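
A minimal sketch of the fragmentation step, much simpler than the notebook. The hydrophobic SMARTS (any carbon with no heteroatom neighbour) and the 1-octanol example are assumptions, and the ring and terminal CH3 rules described above are not implemented here.

```python
from rdkit import Chem

# 1-octanol, used only for illustration.
mol = Chem.MolFromSmiles("CCCCCCCCO")

# Assumed definition of "hydrophobic": a carbon with no heteroatom neighbour.
hydrophobic = Chem.MolFromSmarts("[#6;!$([#6]~[!#6])]")
core = {idx for (idx,) in mol.GetSubstructMatches(hydrophobic)}

# Bonds with exactly one end atom in the hydrophobic set cross the boundary.
cut = [
    bond.GetIdx()
    for bond in mol.GetBonds()
    if (bond.GetBeginAtomIdx() in core) != (bond.GetEndAtomIdx() in core)
]

# Break the molecule on those bonds; dummy atoms mark the attachment points.
pieces = Chem.FragmentOnBonds(mol, cut, addDummies=True)
frags = Chem.GetMolFrags(pieces, asMols=True)

for frag in frags:
    print(Chem.MolToSmiles(frag))
```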

Best,

Christos

Christos Kannas

Researcher
Ph.D Student

Mob (UK): +44 (0) 7447700937
Mob (Cyprus): +357 99530608

[image: View Christos Kannas's profile on
LinkedIn]http://cy.linkedin.com/in/christoskannas


On 17 April 2014 10:18, Greg Landrum greg.land...@gmail.com wrote:


 On Thu, Apr 17, 2014 at 10:32 AM, JP jeanpaul.ebe...@inhibox.com wrote:


 On 16 April 2014 19:13, Christos Kannas chriskan...@gmail.com wrote:

 Chem.PathToSubmol(mol, path)


 Hi there Christos,

 Many thanks for your reply (and idea of using nbviewer)

 There is still something strange happening which I cannot figure out - my
 atom index is a tuple with six elements - and in the resulting submol I
 get seven atoms.  Also the ring is opened in a chain (so some of the
 properties are changing).

 A simple example here:
 http://nbviewer.ipython.org/gist/anonymous/10964449

 Any ideas?


 PathToSubmol is underdocumented. It's expecting a list/tuple of bond
 indices; not atom indices.

 What you need to do is loop over the atoms in your match and find all the
 bonds that they are involved in that go to other atoms in the match. Pass
 that tuple/list to PathToSubmol and you should get what you want.

 If you're ok having dummies marking attachment points (which I suspect you
 aren't), you could use Chem.ReplaceSidechains(), but otherwise I don't
 think there's an easier way to do this.

 -greg
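
A minimal sketch of the loop described above, assuming the atom indices come from a substructure match. submol_from_atoms is a hypothetical helper name, and the ethylbenzene example is only a toy input.

```python
from rdkit import Chem

def submol_from_atoms(mol, atom_ids):
    """Build a sub-molecule from atom indices by passing PathToSubmol
    the bonds whose two ends are both in atom_ids (hypothetical helper)."""
    atom_set = set(atom_ids)
    bond_ids = [
        b.GetIdx() for b in mol.GetBonds()
        if b.GetBeginAtomIdx() in atom_set and b.GetEndAtomIdx() in atom_set
    ]
    return Chem.PathToSubmol(mol, bond_ids)

mol = Chem.MolFromSmiles('CCc1ccccc1')
match = mol.GetSubstructMatch(Chem.MolFromSmarts('c1ccccc1'))
sub = submol_from_atoms(mol, match)
print(sub.GetNumAtoms(), sub.GetNumBonds())
```

Because the ring-closing bond is included in the bond list, the ring should stay closed in the resulting submol.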







Re: [Rdkit-discuss] Opposite of GetSubstructureMatches()

2014-04-16 Thread Christos Kannas
Hi JP,

I think smol = Chem.PathToSubmol(mol, path), where path is a list of
indices, will do what you want.

Best,

Christos



On 16 April 2014 16:44, JP jeanpaul.ebe...@inhibox.com wrote:


 Hi there RDKitters,

 This is probably an easy one, but I cannot find anything in the docs or
 the mailing list.

 I have a tuple of atom Ids (e.g. 21,22,24,26,27) and a mol and I would
 like to extract the substructure (molecule) which matches those indices.
 Note that in my case this will be a connected subgraph of the molecule (no
 fragmentation).

 This is pretty much the opposite of the GetSubstruct family of methods, which
 give Mol -> Indices.  I want Indices -> Mol.

 Is there a convenience method to do this?

 -
 Jean-Paul Ebejer
 Early Stage Researcher






[Rdkit-discuss] Fragmentation Bug?

2014-04-09 Thread Christos Kannas
Hi Greg,

I'm writing a script that identifies the hydrophobic parts of
compounds using the pharmacophore features and then fragments the compounds
around the hydrophobic regions.

While I was doing this I noticed that some compounds were raising
ValueError exceptions during the fragmentation process.

I've created a sample IPython Notebook that shows this.
http://nbviewer.ipython.org/gist/CKannas/10255625/Fragmentation%20Bug%20Maybe.ipynb

Check cell 6, you will notice the ValueError print out.

For example for the first compound :

COc1cc(C=CC(=O)OC2OC[C@@H](O)[C@H](O)[C@H]2O)ccc1O

the hydrophobe region is:

CC=Cc

but when the compound is broken up into side chains and core (hydrophobe
region) the SMILES are:

SideChains: [1*]:cc(OC)c(O)cc:[4*].[2*]=O.[3*]OC1OC[C@@H](O)[C@H](O)[C@H]1O
Hydrophobe: [1*]c([2*])C=CC(=[3*])[4*]

which I think has some errors in the attachment points in the hydrophobe
region core.

Is this the normal behaviour, or a bug?

Also if I merge the two families of Hydrophobe features then I do not get
the error.

Regards,

Christos



Re: [Rdkit-discuss] SMARTS/SMARTS and SMILES/SMARTS substructure matching

2014-03-05 Thread Christos Kannas
Hi Greg,

Thanks a lot for the explanation.
It makes things clearer now.
Well, the reason I'm doing a SMARTS-SMARTS match is that I would like to
match functional groups with the reactants in reactions.

Regards,

Christos



On 5 March 2014 04:44, Greg Landrum greg.land...@gmail.com wrote:

 Hi Christos,


 On Tue, Mar 4, 2014 at 3:46 PM, Christos Kannas chriskan...@gmail.com wrote:

 Hi all,

 Why does the following happen?

 In [1]: from rdkit import Chem
 In [2]: from rdkit.Chem import AllChem
 In [3]: from rdkit.Chem import Draw

 In [4]: patt = Chem.MolFromSmarts('[CH;D2;!$(C-[!#6;!#1])]=O')

 In [5]: z2 = Chem.MolFromSmarts('[*]-C-C([H])(=O)', 1)
 In [6]: print(Chem.MolToSmiles(z2))
 [*]CC=O
 In [7]: print(Chem.MolToSmarts(z2))
 *-C-[C!H0]=O
 In [9]: z2.HasSubstructMatch(patt)
 Out[9]: False

 In [10]: z3 = Chem.MolFromSmiles(Chem.MolToSmiles(z2))
 In [11]: print(Chem.MolToSmiles(z3))
 [*]CC=O
 In [12]: print(Chem.MolToSmarts(z3))
 [*]-[#6]-[#6]=[#8]
 In [13]: z3.HasSubstructMatch(patt)
 Out[13]: True

 Shouldn't z2 and z3 contain the same information?


 SMARTS/SMARTS matches are handled differently from SMARTS/SMILES
 matches.
 The short answer is that when doing a SMARTS/SMARTS match, the RDKit
 compares the queries to each other; when doing a SMARTS/SMILES match, on
 the other hand, it checks whether the atoms in the SMILES molecule match
 the queries in the SMARTS molecule.

 A slightly longer answer:
 Molecules built using MolFromSmiles contain Atoms; molecules built using
 MolFromSmarts contain QueryAtoms. Both Atoms and QueryAtoms have a Match()
 method that takes another Atom or QueryAtom as an argument and returns
 whether or not the two match.
 The substructure matching code makes heavy use of this Match() method.
 QueryAtom.Match(Atom) checks whether the Atom satisfies the query.
 QueryAtom.Match(QueryAtom) checks whether the queries on the two atoms are
 the same. This uses a crude approach that is easy to fool, but I assume
 that a SMARTS/SMARTS match is not something people want to do often.
 Query-query matching is also not a particularly easy problem to solve in a
 general way.

 -greg





Re: [Rdkit-discuss] sanitization removes Hs - is this expected?

2014-02-24 Thread Christos Kannas
Hi Michal,

It doesn't actually remove them; to be more precise, it hides them.
You can define a hydrogen explicitly as [H], but if you omit it, it
still exists implicitly.
You can use Chem.AddHs(...) to add the hydrogens to a molecule and
Chem.RemoveHs(...) to hide them.
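
A short sketch of that round trip, using the thiophene SMILES from the question. The variable names are mine; only the atom counts are shown, since they are the point here.

```python
from rdkit import Chem

# Default parsing: the explicit [H] atoms are folded into implicit counts.
mol = Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]')
n_graph_atoms = mol.GetNumAtoms()
n_implicit_h = sum(atom.GetTotalNumHs() for atom in mol.GetAtoms())

# AddHs turns the hydrogens into real atoms in the graph.
mol_h = Chem.AddHs(mol)
n_with_h = mol_h.GetNumAtoms()

# RemoveHs folds them back into implicit counts.
mol_back = Chem.RemoveHs(mol_h)

print(n_graph_atoms, n_implicit_h, n_with_h, mol_back.GetNumAtoms())
```

The hydrogen count is still available on each heavy atom even when the hydrogens are not in the graph.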

Best,
Christos


On 24 February 2014 15:48, Michal Krompiec michal.kromp...@gmail.com wrote:

 Hello, I have just noticed this:
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]'))
 'c1ccsc1'
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False))
 '[H]c1sc([H])c([H])c1[H]'
 >>> Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]c1c([H])sc([H])c1[H]',sanitize=False)))
 'c1ccsc1'
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]'))
 'c1ccsc1'
 >>> Chem.MolToSmiles(Chem.MolFromSmiles('[H]c1cscc1[H]',sanitize=False))
 '[H]c1cscc1[H]'

 Is it the expected behaviour? Why does sanitization remove hydrogens?
 Is it controlled by any of the SanitizeFlags?

 Best wishes,
 Michal








[Rdkit-discuss] SMARTS Substructure matching

2014-02-19 Thread Christos Kannas
Hi all,

In my current project I'm working on reaction-based multiobjective de novo
design.
I have a set of reactions that I have converted into SMIRKS and
reaction SMARTS.

The problem is that when a reactant pattern in SMARTS, as
required by SMIRKS, has explicit mapped hydrogens that play a role in the
reaction, and I run a substructure search against a compound that
contains the substructure in question, no match is found. But when I change
the pattern so that it has no explicit mapped hydrogens, the substructure
search is successful.
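
A minimal reproduction of the behaviour, not taken from the notebook. Acetaldehyde and both patterns are toy inputs I made up, and I write the hydrogen as [#1] so there is no ambiguity about it being a separate atom in the pattern.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CC=O')

# Pattern with an explicit, mapped hydrogen atom.
patt_explicit_h = Chem.MolFromSmarts('[C:1](-[#1:2])=O')
# Same idea expressed as a hydrogen count on the carbon instead.
patt_h_count = Chem.MolFromSmarts('[C;!H0:1]=O')

# The parsed molecule has no hydrogen atoms in its graph, so the
# explicit-H pattern has nothing to map its second atom onto.
match_implicit = mol.HasSubstructMatch(patt_explicit_h)

# After AddHs the hydrogens are real atoms and can be matched.
match_with_hs = Chem.AddHs(mol).HasSubstructMatch(patt_explicit_h)

# The H-count version matches the molecule as parsed.
match_count = mol.HasSubstructMatch(patt_h_count)

print(match_implicit, match_with_hs, match_count)
```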

To help you understand I've created this small IPython Notebook
http://nbviewer.ipython.org/gist/CKannas/9089271

Can you give me the reasons why this happens?

Best,
Christos



Re: [Rdkit-discuss] Best Practice: Git model for code submissions to the RDkit

2013-10-24 Thread Christos Kannas
Hi JP,

It is best to use branches for committing changes.
Once the changes are in a state to share, you merge the branch
with master and open a pull request.
Also try to use short, meaningful names for branches.

Regards,
Christos


On Thu, Oct 24, 2013 at 1:06 PM, JP jeanpaul.ebe...@inhibox.com wrote:

 Hi RDkit-Devs,

 ** Disclaimer: Just used git as svn till now **

 What are the best practices for submitting code changes to the RDKit
 codebase via git?

 Right now I do the following:

 0. Fork the rdkit repository (upstream)
 1. Make my changes on the master
 2. Send a pull request to original RDKit repo

 I have local commits I do not want to send in the pull request (e.g. a
 .gitignore file which ignores all build files).  I also have some
 erroneous commits in my forked repo which I would not like to send over.

 The solution probably lies in using branches - but what is the best
 practice to do this? Should all commits which I want to send be in the
 branch and the commits I want to keep private be on the master (or on
 another branch).  How do you do it?

 Perhaps I am thinking too much in terms of SVN.

 Cheers
 JP

 [small note:  By mistake, I sent this email from another address to the
 mailing list and I got the "Waiting for moderator approval" message ...
 just pointing this out, as perhaps there are other messages stuck in that queue]






-- 

Christos Kannas
Researcher
Ph.D Student

e-Health Laboratory http://www.medinfo.cs.ucy.ac.cy/
kannas.chris...@ucy.ac.cy
kannas.chris...@cs.ucy.ac.cy
chriskan...@gmail.com

Mob: (+357) 99530608


[Rdkit-discuss] [RDKit-Discuss]: Aromatic Heavy Atoms

2013-07-26 Thread Christos Kannas
Dear RDKiters,

I'm creating a descriptor for estimating water solubility (clogSw) based on
the following article by Delaney (doi:10.1021/ci034243x).

J. S. Delaney, “ESOL: Estimating Aqueous Solubility Directly from Molecular
 Structure,” *Journal of Chemical Information and Modeling*, vol. 44, no.
 3, pp. 1000–1005, May 2004.


In this paper he proposes an equation to estimate the
water solubility of molecules based on physicochemical descriptors.

One of the descriptors used is Aromatic Proportion, that is, the proportion
of heavy atoms of the molecule that are in an aromatic ring.

So in order to find the aromatic heavy atoms I use GetSubstructMatches(...)
with query SMARTS '[a]'. Is that the correct way to find all the aromatic
atoms of a molecule? If not what is the correct SMARTS to use?
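
For reference, a minimal sketch of the calculation as described. aromatic_proportion is a hypothetical helper name and toluene is just a test input. In SMARTS, 'a' is the primitive for any aromatic atom, and the count can be cross-checked against the per-atom aromaticity flag.

```python
from rdkit import Chem

def aromatic_proportion(mol):
    """Fraction of heavy atoms that are aromatic (hypothetical helper)."""
    n_heavy = mol.GetNumHeavyAtoms()
    if n_heavy == 0:
        return 0.0
    aromatic = mol.GetSubstructMatches(Chem.MolFromSmarts('[a]'))
    return len(aromatic) / float(n_heavy)

toluene = Chem.MolFromSmiles('Cc1ccccc1')

# Cross-check: count atoms flagged aromatic by the RDKit directly.
n_flagged = sum(1 for atom in toluene.GetAtoms() if atom.GetIsAromatic())

print(aromatic_proportion(toluene), n_flagged)
```

For very large molecules it may be worth checking the maxMatches argument of GetSubstructMatches, or simply using the per-atom flag instead.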

@Greg: When I complete this, can we look into adding it as a new
descriptor, clogSw (like clogP), within the RDKit distribution?

Kind Regards,
Christos



Re: [Rdkit-discuss] Domain of applicability

2013-03-19 Thread Christos Kannas
Hi Paul,

We are using RDKit along with R instead of scikit-learn, though the
general idea is the same.

Using a good dataset of compounds with a known biological property, you can
train a model using an algorithm such as Random Forests.
This model can later be used on a new dataset to predict the
aforementioned biological property.
What you are actually doing is finding correlations between the chemical
properties and a known biological property of the compounds.

This can be applied to any problem where there is the need to predict a
biological property.

Hope this helps a bit.

Kind regards,
Christos

--
Christos Kannas

On Mar 19, 2013 3:42 PM, paul.czodrow...@merckgroup.com wrote:

 Dear RDKitters,

 has anyone worked with RDKit (data processing & descriptor calculation) &
 scikit-learn (train Random Forests) and could share some experiences with
 setting up/defining a domain of applicability?

 Cheers & Thanks so far,
 Paul


 P.S.: Just resent this mail, since the last mail contained typos which
 might make future searches for keywords in the mailing list quite
 challenging... ;)





Re: [Rdkit-discuss] how to install rdkit systemwide on Ubuntu 12.04

2012-11-08 Thread Christos Kannas
Hi Michal,

I don't think it will work, at least with Python, since it requires the
RDKit directory in PYTHONPATH, and RDKit/lib is required in LD_LIBRARY_PATH
so that Python can find the library files created when building
RDKit.
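
A rough way to check those two variables from Python, assuming a source build where RDBASE points at the RDKit directory. check_rdkit_env is a hypothetical helper, not part of RDKit, and it only inspects the environment; it does not try to import anything.

```python
import os

def check_rdkit_env(environ):
    """Return a list of problems found in a source-build RDKit environment.

    Hypothetical helper: assumes RDBASE is the RDKit directory, that it
    must be on PYTHONPATH, and that RDBASE/lib must be on LD_LIBRARY_PATH.
    """
    rdbase = environ.get("RDBASE")
    if not rdbase:
        return ["RDBASE is not set"]
    problems = []
    py_paths = environ.get("PYTHONPATH", "").split(os.pathsep)
    ld_paths = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    if rdbase not in py_paths:
        problems.append("RDBASE missing from PYTHONPATH")
    if os.path.join(rdbase, "lib") not in ld_paths:
        problems.append("RDBASE/lib missing from LD_LIBRARY_PATH")
    return problems

for problem in check_rdkit_env(dict(os.environ)):
    print(problem)
```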

Regards,
Christos Kannas

On Nov 8, 2012 7:21 PM, Michał Nowotka mmm...@gmail.com wrote:

 Hello,
 I would like to install rdkit in such a way that I don't have to append
 anything to LD_LIBRARY_PATH.
 If I do:

 export RDBASE=/usr/lib

 and run cmake && make && make install
 would that do the trick?

 Kind regards,
 Michal Nowotka




