Re: [Cdk-user] Writing V3000 format

2017-02-06 Thread John Mayfield
IIRC the output is internally buffered, so calling flush on your StringWriter
won't do anything. Try closing the writer to force the flush.

try (MDLV3000Writer mdlw = new MDLV3000Writer(System.out)) {
  mdlw.write(mol);
}


You can also manually buffer your output:

StringWriter writer2 = new StringWriter();
BufferedWriter bwriter2 = new BufferedWriter(writer2);
MDLV3000Writer mdl = new MDLV3000Writer(bwriter2);
mdl.write(mol);
bwriter2.flush();
String v3000 = writer2.toString();
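
To make the buffering point concrete without needing CDK on the classpath, here is a small stdlib-only sketch. The class and method names are made up for illustration, and BufferedWriter merely stands in for whatever buffering the V3000 writer does internally (an assumption about the mechanism, not a copy of the CDK code). It shows that flushing the underlying StringWriter leaves the text stuck in the buffer, while flushing or closing the buffering layer pushes it through.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

// Hypothetical demo class: BufferedWriter plays the role of the internal buffer.
class BufferFlushDemo {

    /** Flushes only the underlying StringWriter; short text stays in the buffer. */
    static String flushSinkOnly(String text) {
        StringWriter sink = new StringWriter();
        BufferedWriter buffered = new BufferedWriter(sink);
        try {
            buffered.write(text);
            sink.flush(); // does nothing for the buffering layer above it
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    /** Flushes the buffering layer itself, which pushes the text to the sink. */
    static String flushBuffer(String text) {
        StringWriter sink = new StringWriter();
        BufferedWriter buffered = new BufferedWriter(sink);
        try {
            buffered.write(text);
            buffered.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    /** Closing the buffering layer (try-with-resources) also forces the flush. */
    static String closeBuffer(String text) {
        StringWriter sink = new StringWriter();
        try (BufferedWriter buffered = new BufferedWriter(sink)) {
            buffered.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        String text = "M  V30 BEGIN CTAB\n"; // shorter than the default buffer size
        System.out.println(flushSinkOnly(text).length());
        System.out.println(flushBuffer(text).length());
        System.out.println(closeBuffer(text).length());
    }
}
```

This matches the symptom in the original question: the V3000 String is empty because only the StringWriter was flushed.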

Also - use SilentChemObjectBuilder! I really want to just remove
DefaultChemObjectBuilder, there is so much misuse.

John

On 6 February 2017 at 17:43, Tim Dudgeon  wrote:

> I'm finding that the MDLV3000Writer class does not seem to output anything.
> In the example below the V2000 output is fine, but the V3000 output is an
> empty String.
>
> Is this a bug?
>
> Thanks
> Tim
>
>
> SmilesParser smilesParser = new 
> SmilesParser(DefaultChemObjectBuilder.getInstance());
> IAtomContainer mol = smilesParser.parseSmiles("CCO");
>
> StringWriter writer1 = new StringWriter();
> MDLV2000Writer v2000writer = new MDLV2000Writer(writer1);
> v2000writer.write(mol);
> writer1.flush();
> String v2000 = writer1.toString();
>
> StringWriter writer2 = new StringWriter();
> MDLV3000Writer mdl = new MDLV3000Writer(writer2);
> mdl.write(mol);
> writer2.flush();
> String v3000 = writer2.toString();
>
--
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user


Re: [Cdk-user] 'neutralizing' a molecule

2017-02-17 Thread John Mayfield
In general you shouldn't use InChI for storing structures (i.e. reading them
back in); it's unfortunate they made it possible. It is an identifier, not an
exchange format, see the talk from Steve
<http://www.hellers.com/steve/pub-talks/toronto-7-14.pdf>, Slide 8.

That being said, you can strip the salts and push it back through InChI
(which has a neutralization algorithm - notice the formula is for the
neutral form in your example), cut off the charge layer and then read it
back. A disadvantage is that since the layers are dependent you have to
drop the stereo as well. In this case it's actually okay to splice out the
charge but in general that's not true.

Charged Component:
InChI=1S/C16H25N5O15P2/c17-13-7-14(19-3-18-13)21(4-20-7)15-11(26)9(24)6(33-15)2-32-37(28,29)36-38(30,31)35-16-12(27)10(25)8(23)5(1-22)34-16/h3-6,8-12,15-16,22-27H,1-2H2,(H,28,29)(H,30,31)(H2,17,18,19)/p-2/t5-,6-,8-,9-,10+,11-,12-,15-,16?/m1/s1

Neutral Component:
InChI=1S/C16H25N5O15P2/c17-13-7-14(19-3-18-13)21(4-20-7)15-11(26)9(24)6(33-15)2-32-37(28,29)36-38(30,31)35-16-12(27)10(25)8(23)5(1-22)34-16/h3-6,8-12,15-16,22-27H,1-2H2,(H,28,29)(H,30,31)(H2,17,18,19)/
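
The splice itself is just string surgery, so here is a minimal stdlib-only sketch of it. The class and method names are hypothetical, and it assumes a single-component standard InChI (salts already stripped) in which the proton, charge and stereo layers start with /p, /q, /b, /t, /m and /s. It does not check whether dropping those layers is chemically valid, and as said above that is not true in general.

```java
import java.util.StringJoiner;

// Hypothetical helper: purely textual, performs no chemistry validation.
class InchiLayerSplice {

    /**
     * Drops the proton (/p), charge (/q) and stereo (/b, /t, /m, /s) layers
     * from an InChI string and keeps every other layer in its original order.
     */
    static String dropChargeAndStereo(String inchi) {
        String[] layers = inchi.split("/");
        StringJoiner kept = new StringJoiner("/");
        for (int i = 0; i < layers.length; i++) {
            String layer = layers[i];
            // layers[0] is the "InChI=1S" prefix and layers[1] the formula: always kept
            boolean droppable = i >= 2
                    && !layer.isEmpty()
                    && "pqbtms".indexOf(layer.charAt(0)) >= 0;
            if (!droppable) {
                kept.add(layer);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // acetate: the /p-1 layer is removed
        System.out.println(dropChargeAndStereo(
                "InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)/p-1"));
    }
}
```

Unlike the hand-edited neutral example above, this sketch does not leave a trailing slash. The result still has to be read back through an InChI reader to get a structure.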

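You'd also be surprised how far a very simple approach gets you:
Neutralize.java
<https://github.com/johnmay/mdk/blob/develop-1.5/tool/search-tree/src/main/java/org/openscience/cdk/isomorphism/Neutralise.java>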


John

On 17 February 2017 at 14:00, Egon Willighagen 
wrote:

>
> John suggested this list of SMARTS recently:
>
> http://www.daylight.com/meetings/emug00/Sayle/pkapredict.html
>
> And Nina mentioned code in AMBIT for SMIRKS to do the job...
>
> Egon
>
>
>
>
> On Fri, Feb 17, 2017 at 2:51 PM, Rajarshi Guha 
> wrote:
>
>> Hi, I have a situation where I start from an InChI for a molecule in
>> salt form, and after stripping the salt, I would like to obtain the neutral
>> form.
>>
>> An example is starting from
>>
>> InChI=1S/C16H25N5O15P2.2Na/c17-13-7-14(19-3-18-13)21(4-20-7)
>> 15-11(26)9(24)6(33-15)2-32-37(28,29)36-38(30,31)35-16-12(27)
>> 10(25)8(23)5(1-22)34-16;;/h3-6,8-12,15-16,22-27H,1-2H2,(H,
>> 28,29)(H,30,31)(H2,17,18,19);;/q;2*+1/p-2/t5-,6-,8-,9-,10+,
>> 11-,12-,15-,16?;;/m1../s1
>>
>> I can use ConnectivityChecker to get the largest component. But this has
>> a charge of -2, with the phosphate groups missing a proton.
>>
>> Is there a convenience method to neutralize this molecule by adding
>> protons appropriately?
>>
>> --
>> Rajarshi Guha | http://blog.rguha.net
>> NIH Center for Advancing Translational Science
>>
>
>
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> LinkedIn: http://se.linkedin.com/in/egonw
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
> ORCID: -0001-7542-0286
> ImpactStory: https://impactstory.org/u/egonwillighagen
>


Re: [Cdk-user] 'neutralizing' a molecule

2017-02-18 Thread John Mayfield
It's under LGPL, just a quick thing I wrote during my thesis. I might include
it in CDK proper as there are a few more things you can do.

SMIRKS does let you customise these easily, but actually they're not too bad
and it's more efficient to inline them into code.

[Cl-1:1]>>[Cl+0:1]
[NH0+1:1]=[C+0:2][N+0:3]([H])>>[NH0+0:1][C+0:2]=[N+0:3]

The second one handles cases like this:
C[N+](C)=CN

etc..

On 17 February 2017 at 23:10, Rajarshi Guha <rajarshi.g...@gmail.com> wrote:

> Indeed - thanks.
>
> This was actually asked by a user of the rcdk package who is dealing with
> InChI's.
>
> The Java code you linked to is handy - what license is it available under?
> if feasible, I'd like to include it in the rcdk package
>
> On Fri, Feb 17, 2017 at 5:38 PM, John Mayfield <
> john.wilkinson...@gmail.com> wrote:
>
>> In general you shouldn't use InChI for storing (i.e. reading structures)
>> it's unfortunate they made it possible. It is an identifier != exchange
>> format, see Talk form Steve
>> <http://www.hellers.com/steve/pub-talks/toronto-7-14.pdf>, Slide 8.
>>
>> That being said you can strip the salts and push it back through InChI
>> (which has a neutralization algorithm - notice the formula is for the
>> neutral form in your example), cut of the charge layer and then read it
>> back. A disadvantage is since the layers are dependant you have to also
>> drop the stereo as well. In this case it's actually okay to splice out the
>> charge but in general that's not true.
>>
>> Charged Component:
>> InChI=1S/C16H25N5O15P2/c17-13-7-14(19-3-18-13)21(4-20-7)15-1
>> 1(26)9(24)6(33-15)2-32-37(28,29)36-38(30,31)35-16-12(27)10(2
>> 5)8(23)5(1-22)34-16/h3-6,8-12,15-16,22-27H,1-2H2,(H,28,29)(
>> H,30,31)(H2,17,18,19)/p-2/t5-,6-,8-,9-,10+,11-,12-,15-,16?/m1/s1
>>
>> Neutral Component:
>> InChI=1S/C16H25N5O15P2/c17-13-7-14(19-3-18-13)21(4-20-7)15-1
>> 1(26)9(24)6(33-15)2-32-37(28,29)36-38(30,31)35-16-12(27)10(2
>> 5)8(23)5(1-22)34-16/h3-6,8-12,15-16,22-27H,1-2H2,(H,28,29)(
>> H,30,31)(H2,17,18,19)/
>>
>> You'd also be surprised how far a very simple approach gets you:
>> Neutralize.java
>> <https://github.com/johnmay/mdk/blob/develop-1.5/tool/search-tree/src/main/java/org/openscience/cdk/isomorphism/Neutralise.java>
>>
>> John
>>
>> On 17 February 2017 at 14:00, Egon Willighagen <
>> egon.willigha...@gmail.com> wrote:
>>
>>>
>>> John suggested this list of SMARTS recently:
>>>
>>> http://www.daylight.com/meetings/emug00/Sayle/pkapredict.html
>>>
>>> And Nina mentioned code in AMBIT for SMIRKS to do the job...
>>>
>>> Egon
>>>
>>>
>>>
>>>
>>> On Fri, Feb 17, 2017 at 2:51 PM, Rajarshi Guha <rajarshi.g...@gmail.com>
>>> wrote:
>>>
>>>> Hi, I have a situation where I start from an InChI for a molecule in
>>>> salt form, and after stripping the salt, I would like to obtain the neutral
>>>> form.
>>>>
>>>> An example is starting from
>>>>
>>>> InChI=1S/C16H25N5O15P2.2Na/c17-13-7-14(19-3-18-13)21(4-20-7)
>>>> 15-11(26)9(24)6(33-15)2-32-37(28,29)36-38(30,31)35-16-12(27)
>>>> 10(25)8(23)5(1-22)34-16;;/h3-6,8-12,15-16,22-27H,1-2H2,(H,28
>>>> ,29)(H,30,31)(H2,17,18,19);;/q;2*+1/p-2/t5-,6-,8-,9-,10+,11-
>>>> ,12-,15-,16?;;/m1../s1
>>>>
>>>> I can use ConnectivityChecker to get the largest component. But this
>>>> has a charge of -2, with the phosphate groups missing a proton.
>>>>
>>>> Is there a convenience method to neutralize this molecule by adding
>>>> protons appropriately?
>>>>
>>>> --
>>>> Rajarshi Guha | http://blog.rguha.net
>>>> NIH Center for Advancing Translational Science
>>>>
>>>
>>>
>>> --
>>> E.L. Willighagen
>>> Department of Bioinformatics - BiGCaT
>>> Maastricht University (http://www.bigcat.unimaas.nl/)
>>> Homepage: http://egonw.github.com/

Re: [Cdk-user] 'neutralizing' a molecule

2017-02-18 Thread John Mayfield
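Oh, and you need to watch out for cases like this one (a nitro group carries
formal charges but is neutral overall, so it should be left alone):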

*[N+](=O)[O-]

J

On 18 February 2017 at 09:43, Egon Willighagen <egon.willigha...@gmail.com>
wrote:

>
> I am also interested in implementing this in Bioclipse...
>
> Egon
>

Re: [Cdk-user] MurckoFragmenter infinity loop

2017-01-11 Thread John Mayfield
Well, it's not that big a molecule. I can't remember the exact number but we
can write SMILES for proteins of up to around 1000 AAs. With the correct
algorithm it should take linear time to get the framework... however the
fragments and uniquifying of those would take a while. Unfortunately it
looks like even just requesting the framework takes forever.

Rewriting the code with a better algorithm should do it - a first glance
shows it's doing lots it doesn't need to. Curiously you don't need ring
perception to get the framework, just iteratively delete degree 1 vertices.
I'll try and find time; can you add a feature request/issue on GitHub please?
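
To give an idea of the "iteratively delete degree 1 vertices" approach, here is a rough stdlib-only sketch on a plain graph. The class name and the bond-list representation are assumptions for illustration; this is not the MurckoFragmenter code, and a real implementation would work on an IAtomContainer and may treat some terminal atoms specially.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: atoms are indices 0..numAtoms-1, bonds are index pairs.
class FrameworkPruner {

    /**
     * Repeatedly removes atoms with at most one remaining neighbour.
     * What survives is the set of rings plus the linkers between them.
     *
     * @return flags, true for every atom that is part of the framework
     */
    static boolean[] prune(int numAtoms, int[][] bonds) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < numAtoms; i++) adj.add(new ArrayList<>());
        int[] degree = new int[numAtoms];
        for (int[] b : bonds) {
            adj.get(b[0]).add(b[1]);
            adj.get(b[1]).add(b[0]);
            degree[b[0]]++;
            degree[b[1]]++;
        }

        boolean[] kept = new boolean[numAtoms];
        Deque<Integer> queue = new ArrayDeque<>();
        for (int v = 0; v < numAtoms; v++) {
            kept[v] = true;
            if (degree[v] <= 1) queue.add(v); // terminal or isolated atom
        }

        while (!queue.isEmpty()) {
            int v = queue.poll();
            if (!kept[v]) continue;
            kept[v] = false;
            for (int w : adj.get(v)) {
                // a neighbour that just became terminal is removed in turn
                if (kept[w] && --degree[w] == 1) queue.add(w);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // toluene: six ring atoms (0-5) plus a methyl (6) on atom 0
        int[][] toluene = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 0}, {0, 6}};
        System.out.println(java.util.Arrays.toString(prune(7, toluene)));
    }
}
```

Each atom and bond is visited a bounded number of times, so the work grows with the number of atoms plus bonds and no ring perception is needed.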

John

On 11 January 2017 at 12:06, Nikolas Glaser  wrote:

> Hello guys,
> one question to the murcko fragmenter.
> The fragmenter worked for 15 hours on the following Chembl Compound and
> did not find a result. Is this a bug or is it not possible to get the
> murcko scaffold out of a compound like this?
> I am just fit in the informatics and have not a lot of knowledge for
> chemical compounds. I try to find out if there is a bug in the
> MurckoFragmenter or if it is necessary to change the fragmenter for my
> needs? Or is it just stupid to try fragmentation for a compound like this
> and I should put a filter in front of the fragmenter?
>
> Compound: https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL529226
>
> Thanks for reply.
>
> Best regards
> Niko
>


Re: [Cdk-user] Count fragment in molecule

2017-01-05 Thread John Mayfield
You can find more documentation on that class here:
http://cdk.github.io/cdk/1.5/docs/api/index.html

On 5 January 2017 at 14:57, Nikolas Glaser <glase...@gmail.com> wrote:

> Ah great, that was what i hoped to find.
> Thanks a lot for the fast answer!
> Niko
>
> John Mayfield <john.wilkinson...@gmail.com> schrieb am Do., 5. Jan. 2017
> um 15:52 Uhr:
>
>> Niko,
>>
>> I do not need a complete code, just some information what i can use for
>> it. I want to try it by myself to learn more about cdk.
>>
>>
>> It's a one liner... so I'm just going to give the answer:
>>
>> int count = Pattern.findSubstructure(frag).matchAll(mol).countUnique();
>>
>>
>> John
>>
>> On 5 January 2017 at 14:42, Nikolas Glaser <glase...@gmail.com> wrote:
>>
>> Hello everyone,
>> is there a possibility to count a fragment (e.g. created witch
>> murcko-fragmenter) in the molecule?
>> E.g. i have a 6-Ring and want to know how often he is present in the
>> fragmented molecule.
>>
>> If there is no tool, maybe someone can give me a hint how to implement it
>> by myself? I am very new to CDK so i do not have a lot of experience.
>> I do not need a complete code, just some information what i can use for
>> it. I want to try it by myself to learn more about cdk.
>>
>> Thanks a lot!
>>
>> Niko
>>


Re: [Cdk-user] Count fragment in molecule

2017-01-05 Thread John Mayfield
Niko,

> I do not need a complete code, just some information what i can use for it.
> I want to try it by myself to learn more about cdk.


It's a one liner... so I'm just going to give the answer:

int count = Pattern.findSubstructure(frag).matchAll(mol).countUnique();
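
A note on why "unique" matters here. My understanding (an assumption, please check the Mappings Javadoc) is that countUnique treats all matches covering the same set of target atoms as one, so a 6-ring query that maps onto a single ring in several symmetric ways is counted once. The stdlib-only sketch below mimics that idea on raw index arrays; the class name and the input format are made up for illustration.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: each int[] lists the target atom indices of one match.
class UniqueMatchCounter {

    /** Counts matches, treating matches over the same atom set as duplicates. */
    static int countUnique(List<int[]> mappings) {
        Set<BitSet> seen = new HashSet<>();
        for (int[] mapping : mappings) {
            BitSet atoms = new BitSet();
            for (int idx : mapping) {
                atoms.set(idx); // order of the indices is deliberately ignored
            }
            seen.add(atoms);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<int[]> mappings = List.of(
                new int[]{0, 1, 2},
                new int[]{2, 1, 0},  // same atoms, different order
                new int[]{3, 4, 5});
        System.out.println(countUnique(mappings));
    }
}
```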


John

On 5 January 2017 at 14:42, Nikolas Glaser  wrote:

> Hello everyone,
> is there a possibility to count a fragment (e.g. created witch
> murcko-fragmenter) in the molecule?
> E.g. i have a 6-Ring and want to know how often he is present in the
> fragmented molecule.
>
> If there is no tool, maybe someone can give me a hint how to implement it
> by myself? I am very new to CDK so i do not have a lot of experience.
> I do not need a complete code, just some information what i can use for
> it. I want to try it by myself to learn more about cdk.
>
> Thanks a lot!
>
> Niko
>


Re: [Cdk-user] Questions about the CDK 1.5.14 (JDK 1.8.0_121, maven 3.3.9) and the source code of calculateBCUT.java

2017-03-13 Thread John Mayfield
Hi,

Yes your changes look correct. Unfortunately the API changed a lot between
1.4 and 1.5.

The true/false specifies whether to calculate aromaticity:
https://github.com/cdk/cdk/blob/master/descriptor/qsarmolecular/src/main/java/org/openscience/cdk/qsar/descriptors/molecular/BCUTDescriptor.java#L154

It depends on the descriptor, but you should either set it to true or,
preferably, calculate the aromaticity yourself.

Aromaticity arom = new Aromaticity(ElectronDonation.daylight(),
                                   Cycles.or(Cycles.all(), Cycles.all(6)));
while (mdl.hasNext()) {
  IAtomContainer mol = mdl.next();
  arom.apply(mol);
}


There are lots of options for which aromaticity model to use, see:
http://cdk.github.io/cdk/2.0/docs/api/index.html


On 11 March 2017 at 09:51, 丁雷  wrote:

> Dear John
>
> Thanks for your contribution to the CDK project. I am a newcomer to the
> CDK and Java, so I have followed the guidance on the GitHub webpage to
> build CDK and create the JavaDoc from the source code, which was downloaded
> via the SourceForge webpage. But I failed to build CDK unless I chose to
> skip the tests, and I failed to create the JavaDoc either. So I had to
> directly download the .jar files from SourceForge. *I don't know whether
> the reason lies in the version of CDK and JDK I have used.*
>
> Then I typed the source code of calculateBCUT.java (on the webpage
> http://www.redbrick.dcu.ie/~noel/CDKJython.html) into Netbeans IDE 8.2 to
> test CDK and my Java environment. Unfortunately, there are some errors in
> the source code which are pointed out by Netbeans, such as not
> finding the Descriptor, IteratingMDLReader and Molecule classes. So I
> modified the source code by reading the JavaDoc to make it run. Following
> is the modified code:
>
> import org.openscience.cdk.*;
> import org.openscience.cdk.qsar.*;
> import org.openscience.cdk.io.iterator.IteratingSDFReader;
> import org.openscience.cdk.exception.CDKException;
> import org.openscience.cdk.qsar.result.*;
> import org.openscience.cdk.qsar.descriptors.molecular.BCUTDescriptor;
> import java.io.*;
>
> /**
>  *
>  * @author NanoFate
>  */
> public class CalculateBCUT {
>
>     /**
>      * @param args the command line arguments
>      */
>     public static void main(String[] args) throws CDKException {
>         FileReader sdfile = null;
>         try {
>             sdfile = new FileReader(new File(args[0]));
>         } catch (FileNotFoundException e) {
>             System.err.println("File not found: " + args[0]);
>             System.exit(1);
>         } catch (ArrayIndexOutOfBoundsException e) {
>             System.err.println("You need to give the name of an .sd file.");
>             System.exit(1);
>         }
>         // Descriptor bcut=new BCUTDescriptor();
>         BCUTDescriptor bcut = new BCUTDescriptor();
>         // bcut.setParameters(new Object[] {1,1});
>         bcut.setParameters(new Object[] {1, 1, false});
>         // IteratingMDLReader myiter=new IteratingMDLReader(sdfile);
>         IteratingSDFReader myiter = new IteratingSDFReader(sdfile,
>                 DefaultChemObjectBuilder.getInstance());
>         // Molecule mol=null;
>         AtomContainer mol = null;
>         while (myiter.hasNext()) {
>             // mol=(Molecule)myiter.next();
>             mol = (AtomContainer) myiter.next();
>             DoubleArrayResult BCUTvalue = (DoubleArrayResult)
>                     bcut.calculate(mol).getValue();
>             System.out.print(BCUTvalue.get(0));
>             for (int i = 1; i < 6; i++) {
>                 System.out.print("\t" + BCUTvalue.get(i));
>             }
>             System.out.print("\n");
>         }
>     }
> }
>
> The text behind the "//" symbol is the original code. *I would like to
> know whether my modification is correct.*
>
> What's more, the result of the original code shows on the webpage is
> "11.996163377263015   15.998263783541644   -0.41661438592771444
> 0.08657868420569534   5.618366876046048   11.845146625965969". The result
> generated from my modified code is "11.996163377277325   15.99826380352772
>   -0.4172826456748383   0.08593815726116673   5.42156116627
> 11.691340132916682". And if I change the statement bcut.setParameters to
> "bcut.setParameters(new Object[] {1, 1, true});", the result changes to
> "11.996163377277325   15.99826380352772   -0.4172826456748383
> 0.08593815726116673   5.03756330128007   11.421637440625194".
>
> *So I would like to know the reason of the deviation of the result and the
> difference between true and false in the statement bcut.setParameters.*
>
> Thank you very much!
>
> Your sincerely
>
> Lei Ding
>


Re: [Cdk-user] General Question to method getAtomNumber(IAtom atom) of IAtomContainer

2017-04-07 Thread John Mayfield
Hi Niko,

getAtomNumber works like an array indexOf and uses reference equality.
I'm a bit confused by your description but I think what you're trying to do
is find a hydroxy in CHEMBL501674? In general you need to find a
substructure, https://en.wikipedia.org/wiki/Subgraph_isomorphism_problem.
The easiest way to do this is with a SMARTS pattern match.

String smi=
"CO[C@@H]1[C@H](O)[C@@H](C)O[C@@H](OC[C@@H]2[C@@H](C)OC(=O)\\C=C\\[C@H](C)[C@H](CC[C@@H](C)C(=O)\\C=C\\[C@H]3O[C@@H]23)O[C@@H]4O[C@H](C)C[C@H](O)[C@H]4O)[C@@H]1OC";
SmilesParser   smipar = new SmilesParser(SilentChemObjectBuilder.getInstance());
IAtomContainer target = smipar.parseSmiles(smi);

Pattern ptrn = SmartsPattern.create("*[OH]");
for (int[] mapping : ptrn.matchAll(target)) {
  IAtom a = target.getAtom(mapping[1]); // 0=*, 1=[OH]
  System.out.println(target.getAtomNumber(a));
}


4
40
42


These are indexes, so add one to get the numbers in the depiction (5, 41, 43):
CDK Depict
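
On the reference-equality point, this is why an atom taken from a separately parsed fragment comes back as -1 even though the target contains plenty of oxygens. A tiny stdlib-only sketch, where the Atom class is a hypothetical stand-in for IAtom and indexOf only imitates the lookup behaviour described above:

```java
// Hypothetical stand-ins: not CDK classes, just enough to show the behaviour.
class ReferenceIndexDemo {

    static final class Atom {
        final String symbol;

        Atom(String symbol) {
            this.symbol = symbol;
        }
    }

    /** Returns the position of exactly this object, or -1 if it is not stored. */
    static int indexOf(Atom[] atoms, Atom query) {
        for (int i = 0; i < atoms.length; i++) {
            if (atoms[i] == query) { // identity comparison, the symbol is never inspected
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Atom[] target = {new Atom("C"), new Atom("O")};
        Atom fromFragment = new Atom("O"); // same element, different object

        System.out.println(indexOf(target, target[1]));
        System.out.println(indexOf(target, fromFragment));
    }
}
```

The substructure match is what translates "an O like this one" into actual atoms of the target, and only those can be looked up by index.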




On 7 April 2017 at 17:28, Nikolas Glaser  wrote:

> Hey guys,
> I have a general question to the working of the method getAtomNumber().
> What are the criteria an atom is chosen in the IAtomContainer?
>
> For example I have following test code:
> Created a AtomContainer from following smiles: [Z]O // in this
> case the [Z] is a placeholder for the link position of the fragment in the
> main substance
> So I take the Atom at index position 1 in the Fragment. When I print the
> Symbol it is the O.
>
> Now I want to get the Index in the complete substance and it is sure that
> O is in the substance present. I get the index -1 so I think the method
> cannot find the Atom.
> The complete substance I use for my test is CHEMBL501674.
> If I do getAtom(0) on the substance it returns an O Atom.
>
> So the general Question is, is it impossible to get the Atomposition in
> this way in the original substance with a fragment created from a smiles
> string?
> Or how does the method correctly work, on the documentation I was not able
> to find a detailed information and I had not the time to watch in the code,
> also I thought if it is the wrong way you can tell it me much more faster.
>
> I want to realise the idea of John to get the starting position of the
> fragment in the specific molecule on this way.
>
> Thanks for your help again.
>
> Cheers
> Niko
>
>


[Cdk-user] NCDK .NET Port of CDK

2017-07-10 Thread John Mayfield
Hi All,

If you're interested in a .NET port of CDK please check out NCDK (
https://github.com/kazuyaujihara/NCDK). It's also now linked from the
homepage. The author has done a great job and found some issues with the
Java code, which have been fed back and improved the Java version too.

Cheers,
John

Related: https://github.com/cdk/cdk/issues/345


Re: [Cdk-user] Atom Signatures with stereo-info

2017-07-21 Thread John Mayfield
Yes,

Use the CircularFingerprinter, it encodes stereochemistry. The relevant
method is CircularFingerprinter.getFP(), which will give you the atoms
involved and the hashed value. IIRC the first atom in the list is the 'root'.

John

On 21 July 2017 at 09:39, Staffan Arvidsson 
wrote:

> Hi all,
>
> I wonder if there is any way of producing atom signatures with
> stereoinformation? Currently we're using
>
> String signature = new AtomSignature(atom, height,
> molecule).toCanonicalString();
>
> to produce the signatures.
>
>
> Best,
> Staffan
>


Re: [Cdk-user] Atom Signatures with stereo-info

2017-07-21 Thread John Mayfield
>
>  Although this produces bit-fingerprints and not any String-representation
> of the signatures if I'm reading this correctly?


Yes, but notice it also gives you the atom indexes, which is much more
powerful than just giving the String. We actually have a utility to get the
SMARTS for the atoms. It won't give you stereo but it's pretty easy to make it
do that if you were so inclined; it would be easy to output stereo as SMILES
instead of SMARTS:


SmilesParser   smipar = new SmilesParser(SilentChemObjectBuilder.getInstance());
IAtomContainer mol = smipar.parseSmiles("CC[C@H](C)CO");
CircularFingerprinter fp = new CircularFingerprinter(CircularFingerprinter.CLASS_ECFP6);
fp.calculate(mol);

SmartsFragmentExtractor smafrag = new SmartsFragmentExtractor(mol);

for (int i = 0; i < fp.getFPCount(); i++)
  System.out.println(smafrag.generate(fp.getFP(i).atoms));

Result:

[CH3v4X4+0]
> [CH2v4X4+0]
> [CH2v4X4+0]
> [CH2v4X4+0]
> [CH2v4X4+0]
> [CH2v4X4+0]
> [CH1v4X4+0]
> [CH3v4X4+0]
> [CH2v4X4+0]
> [OH1v2X2+0]
> [CH3v4X4+0][CH2v4X4+0]
> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]
> [CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0]
> [CH1v4X4+0][CH3v4X4+0]
> [CH1v4X4+0][CH2v4X4+0][OH1v2X2+0]
> [CH2v4X4+0][OH1v2X2+0]
> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0]
> [CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]
> [CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]
> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>
> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]
>
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0]
>
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]
>
> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]



> However, I have done some experiments comparing the circular fingerprints
> of enantiomers and also diastereomers, and they turn out to have 1.0
> tanimoto scores.
> What am I doing wrong?


Unfortunately, the way it was written you currently need 2D coordinates.
It's an easy fix if you want to submit the patch, you just need to pull the
tetrahedral rubric out of the IStereoElements - note the IStereoElements
are created automatically when 2D/3D coordinates are present.

SmilesParser smipar = new SmilesParser(SilentChemObjectBuilder.getInstance());
IAtomContainer mol1 = smipar.parseSmiles("CC[C@H](C)CO");
IAtomContainer mol2 = smipar.parseSmiles("CC[C@@H](C)CO");
CircularFingerprinter fp = new CircularFingerprinter(CircularFingerprinter.CLASS_ECFP6);
System.out.println(Tanimoto.calculate(fp.getFingerprint(mol1),
                                      fp.getFingerprint(mol2)));
// 1.0
StructureDiagramGenerator sdg = new StructureDiagramGenerator();
sdg.generateCoordinates(mol1);
sdg.generateCoordinates(mol2);
System.out.println(Tanimoto.calculate(fp.getFingerprint(mol1),
                                      fp.getFingerprint(mol2)));
// 0.77
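
For reference, a score of exactly 1.0 just means the two bit sets are identical, i.e. no stereo-dependent bit was set for either enantiomer. Below is a minimal stdlib-only sketch of the usual Tanimoto definition on bit fingerprints (shared bits divided by bits set in either). The class name is made up and this is not the CDK Tanimoto class; in particular, returning NaN for two empty fingerprints is my own choice here.

```java
import java.util.BitSet;

// Hypothetical sketch of the standard bit-fingerprint Tanimoto coefficient.
class TanimotoSketch {

    /** |A and B| / |A or B|; NaN when neither fingerprint has a bit set. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        if (either.isEmpty()) {
            return Double.NaN; // undefined for two empty fingerprints
        }
        return (double) common.cardinality() / either.cardinality();
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1); a.set(2); a.set(3);
        BitSet b = new BitSet();
        b.set(2); b.set(3); b.set(4);
        System.out.println(tanimoto(a, b)); // 2 shared bits out of 4 distinct bits
    }
}
```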



On 21 July 2017 at 12:25, Christoph Steinbeck <
christoph.steinb...@uni-jena.de> wrote:

> CircularFingerprinter.getBitFingerprint().asBitString().toString();
>
> or
>
> Integer.toString(CircularFingerprinter.getFP())
>
> Did not test this.
>
> Kind regards,
>
> Chris
>
>
> —
> Prof. Dr. Christoph Steinbeck
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> Phone Secretariat: +49-3641-948171
> http://orcid.org/-0001-6966-0814
>
> What is man but that lofty spirit - that sense of enterprise.
> ... Kirk, "I, Mudd," stardate 4513.3..
>
> > On 21 Jul 02017, at 13:09, Staffan Arvidsson <
> staffan.arvids...@gmail.com> wrote:
> >
> > OK thanks! Although this produces bit-fingerprints and not any
> String-representation of the signatures if I'm reading this correctly?
> Currently all our code requires the Signatures to be Strings. Would require
> a large rewrite to get this to work for us. Because the javadoc says that
> method getRawFingerprint is not correct so I should not use it? (Even
> though this would be something more like what we want)
> >
> > Best,
> > Staffan
> >
> > 2017-07-21 11:59 GMT+02:00 John Mayfield <john.wilkinson...@gmail.com>:
> > Yes,
> >
> > Use 

Re: [Cdk-user] Atom Signatures with stereo-info

2017-07-21 Thread John Mayfield
Here's how you can convert the atom indices to a SMILES with stereo.
2.1-SNAPSHOT cleans up the stereo API, avoids the cast, and actually makes
this a lot easier. It's done quick and dirty here but you get the idea.

public static String toSmiles(CircularFingerprinter.FP fp,
                              IAtomContainer mol) throws CDKException
{
  IAtomContainer part = mol.getBuilder().newAtomContainer();
  // atoms covered by this fingerprint feature
  Set<IAtom> aset = new HashSet<>();
  for (int idx : fp.atoms) {
    aset.add(mol.getAtom(idx));
    part.addAtom(mol.getAtom(idx));
  }
  // keep only the bonds with both ends inside the feature
  for (IBond bond : mol.bonds()) {
    if (aset.contains(bond.getBegin()) &&
        aset.contains(bond.getEnd()))
      part.addBond(bond);
  }
  // copy tetrahedral centres whose focus and all four ligands are present
  for (IStereoElement se : mol.stereoElements()) {
    if (se instanceof ITetrahedralChirality) {
      ITetrahedralChirality tc = (ITetrahedralChirality) se;
      if (aset.contains(tc.getChiralAtom()) &&
          aset.contains(tc.getLigands()[0]) &&
          aset.contains(tc.getLigands()[1]) &&
          aset.contains(tc.getLigands()[2]) &&
          aset.contains(tc.getLigands()[3]))
        part.addStereoElement(tc);
    }
  }
  return SmilesGenerator.isomeric().create(part);
}


On 21 July 2017 at 13:12, John Mayfield <john.wilkinson...@gmail.com> wrote:

>  Although this produces bit-fingerprints and not any String-representation
>> of the signatures if I'm reading this correctly?
>
>
> Yes but notice it also gives you the atom indexes, this is much more
> powerful than just giving the String. We actually have a utility to get the
> SMARTS for the atoms. Won't give you stereo but it's pretty easy to make it
> do that if you were so inclined, would be easy to output stereo as SMILES
> instead of SMARTS:
>
>
>> SmilesParser   smipar = new SmilesParser(SilentChemObjectBuilder.
>> getInstance());
>> IAtomContainer mol = smipar.parseSmiles("CC[C@H](C)CO");
>> CircularFingerprinter fp = new CircularFingerprinter(
>> CircularFingerprinter.CLASS_ECFP6);
>> fp.calculate(mol);
>>
> SmartsFragmentExtractor smafrag = new SmartsFragmentExtractor(mol);
>
> for (int i = 0; i < fp.getFPCount(); i++)
>>   System.out.println(smafrag.generate(fp.getFP(i).atoms));
>
>
> Result:
>
> [CH3v4X4+0]
>> [CH2v4X4+0]
>> [CH2v4X4+0]
>> [CH2v4X4+0]
>> [CH2v4X4+0]
>> [CH2v4X4+0]
>> [CH1v4X4+0]
>> [CH3v4X4+0]
>> [CH2v4X4+0]
>> [OH1v2X2+0]
>> [CH3v4X4+0][CH2v4X4+0]
>> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]
>> [CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0]
>> [CH1v4X4+0][CH3v4X4+0]
>> [CH1v4X4+0][CH2v4X4+0][OH1v2X2+0]
>> [CH2v4X4+0][OH1v2X2+0]
>> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]
>> [CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]
>> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0]
>> [CH3v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]
>> [CH2v4X4+0][CH2v4X4+0][CH2v4X4+0][CH1v4X4+0]([CH3v4X4+0])[CH2v4X4+0][OH1v2X2+0]
>
>
>
> However, I have done some experiments comparing the circular fingerprints
>> of enantiomers and also diastereomers, and they turn out to have 1.0
>> tanimoto scores.
>> What am I doing wrong?
>
>
> Unfortunately the way it was written you currently need 2D coordinates.
> It's an easy fix if you want to submit the patch, just need to pull the
> tetrahedral rubric out of the IStereoElements - note the IStereoElement's
> are created automatically on 2D/3D.
>
> SmilesParser  smipar = new SmilesParser(SilentChemObjectBuilder.
>> getInstance());
>> IAtomContainer mol1 = smipar.parseSmiles("CC[C@H](C)CO");
>> IAtomContainer mol2 = smipar.parseSmiles("CC[C@@H](C)CO");
>> CircularFingerprinter fp = new CircularFingerprinter(
>> CircularFingerprinter.CLASS_ECFP6);
>> System.out.println(Tanimoto.calculate(fp.getFingerprint(mol1),
>> fp.getFingerprint(mol2)));
>> // 1.0
>> StructureDiagramGenerator sdg = new StructureDiagramG

Re: [Cdk-user] Bug in IteratingSDFReader in 2.0 cdk?

2017-07-24 Thread John Mayfield
Which version did you generate it with? There should be a blank line after
M END?

John

On 24 July 2017 at 11:03, Staffan Arvidsson 
wrote:

> I just bumped the CDK version in our code repo (from 1.5.13) and found
> that some of my tests are now failing. The issue is reading v3000 SDF files
> where properties are not set on the molecules. I attached the file that I've
> used in one of the tests. Basic code for showing things are not working:
>
> IteratingSDFReader iter = new IteratingSDFReader(new
> FileInputStream(file), SilentChemObjectBuilder.getInstance());
>
> while(iter.hasNext()){
>
> System.out.println(iter.next().getProperties());
> }
>
> The output is simply an empty map. Or did I miss something here?
>
> Also, I've found that the keys in properties are somehow 'cleaned' by
> switching out space, dots, equal signs to lower case character? Any reason
> behind this?
>
>
> Best,
> Staffan
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user


Re: [Cdk-user] Bug in IteratingSDFReader in 2.0 cdk?

2017-07-24 Thread John Mayfield
>
> There should be a blank line after M END?


Just checked and no, there should not be a blank line. The current writer
does not generate one, and after removing the erroneous blank lines I can
round trip it.

try (SDFWriter sdfw = new SDFWriter(System.out)) {
> sdfw.setAlwaysV3000(true);
> while (sdfr.hasNext()) {
> sdfw.write(sdfr.next());
> }
> }
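
If the input file cannot be regenerated, the cleanup step mentioned above could be sketched roughly as follows. This is plain JDK code and not part of CDK; the class and method names are hypothetical, and it assumes the file has already been read into a list of lines and that only blank lines directly following "M  END" are the erroneous ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SdfBlankLineFix {
    /** Drops blank lines that directly follow an "M  END" line; all other lines are kept. */
    static List<String> dropBlankAfterMEnd(List<String> lines) {
        List<String> out = new ArrayList<>();
        boolean afterMEnd = false;
        for (String line : lines) {
            if (afterMEnd && line.trim().isEmpty())
                continue;                 // still directly after M  END: skip it
            out.add(line);
            afterMEnd = line.startsWith("M  END");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> record = Arrays.asList(
            "M  V30 END CTAB", "M  END", "", "> <ID>", "id1", "", "$$$$");
        for (String line : dropBlankAfterMEnd(record))
            System.out.println(line);
    }
}
```

The blank line that terminates each data item is left alone, since only the position right after "M  END" is treated as wrong here.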


Result:


>
>   CDK 0724171132
>
>   0  0  0 0  0999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 10 10 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C .0 .0 0 0
> M  V30 2 C .0 .0 0 0
> M  V30 3 C .0 .0 0 0
> M  V30 4 C .0 .0 0 0
> M  V30 5 C .0 .0 0 0
> M  V30 6 C .0 .0 0 0
> M  V30 7 N .0 .0 0 0
> M  V30 8 C .0 .0 0 0
> M  V30 9 C .0 .0 0 0
> M  V30 10 C .0 .0 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 1 2
> M  V30 2 1 2 3
> M  V30 3 1 3 4
> M  V30 4 1 4 5
> M  V30 5 1 5 6
> M  V30 6 1 6 7
> M  V30 7 2 3 8
> M  V30 8 1 8 9
> M  V30 9 2 9 10
> M  V30 10 1 1 10
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> > 
> id1
>
> 
>
>   CDK 0724171132
>
>   0  0  0 0  0999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 11 11 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C .0 .0 0 0
> M  V30 2 C .0 .0 0 0
> M  V30 3 S .0 .0 0 0
> M  V30 4 O .0 .0 0 0
> M  V30 5 O .0 .0 0 0
> M  V30 6 C .0 .0 0 0
> M  V30 7 C .0 .0 0 0
> M  V30 8 C .0 .0 0 0
> M  V30 9 C .0 .0 0 0
> M  V30 10 C .0 .0 0 0
> M  V30 11 C .0 .0 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 1 1 2
> M  V30 2 1 2 3
> M  V30 3 2 3 4
> M  V30 4 2 3 5
> M  V30 5 1 2 6
> M  V30 6 2 6 7
> M  V30 7 1 7 8
> M  V30 8 2 8 9
> M  V30 9 1 9 10
> M  V30 10 2 10 11
> M  V30 11 1 6 11
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> > 
> id2
>
> 


On 24 July 2017 at 11:15, John Mayfield <john.wilkinson...@gmail.com> wrote:

> Which version did you generate it with? There should be a blank line after
> M END?
>
> John
>
> On 24 July 2017 at 11:03, Staffan Arvidsson <staffan.arvids...@gmail.com>
> wrote:
>
>> I just bumped the CDK version in our code repo (from 1.5.13) and found
>> that some of my tests are now failing. The issue is reading v3000 SDF files
>> where properties are not set on the molecules. I attached the file that I've
>> used in one of the tests. Basic code for showing things are not working:
>>
>> IteratingSDFReader iter = new IteratingSDFReader(new
>> FileInputStream(file), SilentChemObjectBuilder.getInstance());
>>
>> while(iter.hasNext()){
>>
>> System.out.println(iter.next().getProperties());
>> }
>>
>> The output is simply an empty map. Or did I miss something here?
>>
>> Also, I've found that the keys in properties are somehow 'cleaned' by
>> switching out space, dots, equal signs to lower case character? Any reason
>> behind this?
>>
>>
>> Best,
>> Staffan
>>


Re: [Cdk-user] Strange behaviour of AtomContainer.getConnectedAtomsList()

2017-04-26 Thread John Mayfield
How are you generating the depiction (p.s. you're losing stereochemistry
somewhere)?

Speculating - but if you generate a SMILES the output order may (normally
will) be different from the input. When generating a SMILES you can access
the atom output order but it's easier just to label the atoms (e.g. with
atom maps...)

For storage you should use an isomeric, non-canonical SmiFlavor, but
ignoring that for now, take a look at this example:

SmilesParser smipar = new SmilesParser(SilentChemObjectBuilder.getInstance());

SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Canonical |
                                             SmiFlavor.AtomAtomMap);
for (String smi : new String[]{"CCO", "C(C)O", "OCC"}) {
    IAtomContainer mol = smipar.parseSmiles(smi);
    // label each atom with its storage index (1-based) as an atom map
    for (int i = 0; i < mol.getAtomCount(); i++)
        mol.getAtom(i).setProperty(CDKConstants.ATOM_ATOM_MAPPING, (i + 1));
    System.out.println(smigen.create(mol));
}


[OH:3][CH2:2][CH3:1]
[OH:3][CH2:1][CH3:2]
[OH:1][CH2:2][CH3:3]

All the oxygens are now atom 1... but the map index tells us what the
original storage index was.
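
To read those map labels back programmatically, a rough sketch using only the JDK could look like this. The class and method names are hypothetical, and it assumes every atom in the output carries a map number that was set to (storage index + 1) as in the example above; atoms without a map would simply be skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AtomMapOrder {
    // matches the ":n" atom-map suffix just before a closing bracket
    private static final Pattern MAP = Pattern.compile(":(\\d+)\\]");

    /** For each mapped atom in output order, returns its original 0-based storage index. */
    static List<Integer> originalIndices(String smiles) {
        List<Integer> idxs = new ArrayList<>();
        Matcher m = MAP.matcher(smiles);
        while (m.find())
            idxs.add(Integer.parseInt(m.group(1)) - 1); // maps were set to (i + 1)
        return idxs;
    }

    public static void main(String[] args) {
        System.out.println(originalIndices("[OH:3][CH2:2][CH3:1]")); // [2, 1, 0]
        System.out.println(originalIndices("[OH:1][CH2:2][CH3:3]")); // [0, 1, 2]
    }
}
```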

John

On 26 April 2017 at 13:49, Nikolas Glaser  wrote:

> Hello everyone,
>
> I have a part in my code where I want to get all connected atoms of one
> given.
> Now the result of AtomNumbers of the connected atoms does not make sense
> in comparison with the atom numbers of my printed substance.
>
> You can see the code, result and substance in the attached files.
>
> Is there a difference between the AtomNumber in the program to the Number
> in the printed image? In the program there exists Atomnumber 0 but not in
> image so do I have to add 1 to every Atomnumber to compare correctly?
>
> If there is the difference the result does not make sense either, cause
> the „connected atoms“ of AtomNumber 39 should be in the image number 40 and
> 37 also if I make -1 to every atom number the only connected atoms should
> be only two not three.
>
>
> Thanks for your reply.
>
> Cheers
> Niko
>
>
>
>


[Cdk-user] n way-bonds

2017-04-27 Thread John Mayfield
Hi All,

I was wondering does anyone use n-way bonds in CDK (a bond with more than
two atoms)?

IAtom atom1 = object.getBuilder().newInstance(IAtom.class, "C");
> IAtom atom2 = object.getBuilder().newInstance(IAtom.class, "O");
> IAtom atom3 = object.getBuilder().newInstance(IAtom.class, "C");
> IAtom atom4 = object.getBuilder().newInstance(IAtom.class, "C");
> IAtom atom5 = object.getBuilder().newInstance(IAtom.class, "C");
>
> IBond bond1 = new Bond(new IAtom[]{atom1, atom2, atom3, atom4, atom5});
>
I don't deny they exist, but I'm not sure how useful they are as they don't
fit into traditional valence bond theory. Many useful algorithms break down
(e.g. simple traversals, substructure matching, and canonical labelling).
Rich Apodaca talks about it a little in FlexMol
<http://depth-first.com/articles/2006/12/20/a-molecular-language-for-modern-chemistry-getting-started-with-flexmol/>
and shows it for doing Pi bonding - nice but again not really practical. I
seem to recall Egon saying CML can "store" them but I'm struggling to see if
it's anything more than a nicety.

I believe a better way to handle (if you wanted to) would be with a higher
level description as sets/collections of atoms/bonds over a wrong but
acceptable valence bond graph:

Three-center one-electron valence bond graph (via CDK Depict)
[image: Inline images 2]
Using multicentre Sgroups (via CDK Depict)
[image: Inline images 3]
- John
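
As a rough illustration of the "higher level description" idea above, here is a sketch in plain Java. It is not the CDK Sgroup API; every name in it is hypothetical. The graph keeps ordinary two-atom bonds, and multicentre information is stored separately as sets of atom indices, so traversals only ever see the two-atom bonds.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MulticentreSketch {
    final List<int[]> bonds = new ArrayList<>();         // ordinary two-atom bonds
    final List<Set<Integer>> groups = new ArrayList<>(); // multicentre annotations

    void addBond(int a, int b) {
        bonds.add(new int[]{a, b});
    }

    /** Records that the given atoms take part in one multicentre interaction. */
    void addGroup(int... atoms) {
        Set<Integer> g = new LinkedHashSet<>();
        for (int a : atoms) g.add(a);
        groups.add(g);
    }

    /** Neighbours come from the two-atom bonds only, so simple traversals ignore groups. */
    List<Integer> neighbours(int atom) {
        List<Integer> nbrs = new ArrayList<>();
        for (int[] b : bonds) {
            if (b[0] == atom) nbrs.add(b[1]);
            else if (b[1] == atom) nbrs.add(b[0]);
        }
        return nbrs;
    }
}
```

Algorithms that care about the multicentre description can consult the groups list; everything else keeps working on the plain graph.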


Re: [Cdk-user] n way-bonds

2017-04-27 Thread John Mayfield
>
> It's a chicken and egg problem... people don't use it, and even can't use
> it, because there are no good tools yet; not making good tools also ensures
> no one uses it.


I somewhat agree but MDL/ChemDraw support it and it's still not used.


> So, rather than trunking the CDK, I would suggest, let's make work of
> convincing the cheminformatics community of these advanced features, just
> like others are doing to the PTM-protein sequence mashups... let's be a
> leader, rather than a follower.


At the moment it makes the general case (with 99.99% of uses) much
slower for a case that isn't used (yet). I think it should be possible to
do the exotic but not at the cost of the regular. As a library I would push
for getting the basics of the model right before moving on to extra bits, I
quote Jurassic park in jest :p:

*Your scientists were so preoccupied with whether or not they could, they
> didn’t stop to think if they should*


By argument
John

On 27 April 2017 at 09:36, Egon Willighagen <egon.willigha...@gmail.com>
wrote:

>
>
> On Thu, Apr 27, 2017 at 9:54 AM, John Mayfield <
> john.wilkinson...@gmail.com> wrote:
>
>> I was wondering does anyone use n-way bonds in CDK (a bond with more than
>> two atoms)?
>>
>> IAtom atom1 = object.getBuilder().newInstance(IAtom.class, "C");
>>> IAtom atom2 = object.getBuilder().newInstance(IAtom.class, "O");
>>> IAtom atom3 = object.getBuilder().newInstance(IAtom.class, "C");
>>> IAtom atom4 = object.getBuilder().newInstance(IAtom.class, "C");
>>> IAtom atom5 = object.getBuilder().newInstance(IAtom.class, "C");
>>>
>>> IBond bond1 = new Bond(new IAtom[]{atom1, atom2, atom3, atom4, atom5});
>>>
>>> I don't deny they exists but not sure how useful they are as they don't
>> fit into traditional valence bond theory. Many useful algorithms break down
>> (e.g. simple traversals, substructure matching, and canonical labelling).
>> Rich Apodaca talks about it a little in FlexMol
>> <http://depth-first.com/articles/2006/12/20/a-molecular-language-for-modern-chemistry-getting-started-with-flexmol/>
>>  and
>> shows it for doing Pi bonding - nice but again not really practical. I seem
>> to recall Egon saying CML can "store" them but I'm struggling to see if
>> it's anything more than nicety.
>>
>
> It's a chicken and egg problem... people don't use it, and even can't use
> it, because there are no good tools yet; not making good tools also ensures
> no one uses it.
>
> One of the reasons why we have incorrect data, is because at some point
> someone took a shortcut in some tool, because it covered most of the issues.
>
>
>> I believe a better way to handle (if you wanted to) would be with a
>> higher level description as sets/collections of atoms/bonds over a wrong
>> but acceptable valence bond graph:
>>
>
> We don't do much with transition states, but one of the key advantages of
> the CDK (over RDKit, OpenBabel) is that you actually can... Fairly, I did
> not track with literature citing the CDK actually uses these features...
>
> So, besides the obvious representation (and viz) use cases below (just
> think of what mess some PubChem/ChemSpider entries are), these kind of
> features are essential for reaction mechanisms, so the first place to check
> would be the MaCIE work and the metabolite identification cheminformatics
> work from Chris' lab in the past.
>
> Other use cases, one that I never found time for, is (also obvious from
> the below) cheminformatics for organometallics... there is a good bit of
> literature doing VS of such substances, typically severely limited in the
> cheminformatics they use...
>
>
>> Three-center one-electron valence bond graph (via CDK Depict)
>> [image: Inline images 2]
>> Using multicentre Sgroups (via CDK Depict)
>> [image: Inline images 3]
>>
>
> So, rather than trunking the CDK, I would suggest, let's make work of
> convincing the cheminformatics community of these advanced features, just
> like others are doing to the PTM-protein sequence mashups... let's be a
> leader, rather than a follower.
>
> Egon
>
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> LinkedIn: http://se.linkedin.com/in/egonw
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers
> ORCID: -0001-7542-0286
> ImpactStory: https://impactstory.org/u/egonwillighagen
>


Re: [Cdk-user] ungraceful fail of MDE descriptor for long alkyl chains

2017-08-06 Thread John Mayfield
In fact even better please submit a PR since you've identified and fixed
the problem.

On 6 August 2017 at 16:43, John Mayfield <john.wilkinson...@gmail.com>
wrote:

> Please post an issue: github.com/cdk/cdk
>
>
> On 5 August 2017 at 01:50, Andrei Kazakov <and4li...@gmail.com> wrote:
>
>> Hello,
>>
>> The "bugs" list seems to be deserted so I am posting it here. It appears
>> to be an overflow issue in MDEDescriptor.java that produces zero values for
>> long chains (the code is unchanged across 1.4, 1.5, and 2.0 series).
>> Specifically, MDEC-22 becomes zero starting from C23 (tricosane). It is
>> caused by geometric mean evaluation via direct product; replacing product
>> with the sum of logs fixes the issue:
>>
>> private double evalCValue(int[][] distmat, int[][] codemat,
>>                           int type1, int type2) {
>>     /* double lambda = 1; */
>>     double lambda = 0;   // now accumulates the sum of log(distance)
>>     double n = 0;
>>
>>     List<Integer> v1 = new ArrayList<>();
>>     List<Integer> v2 = new ArrayList<>();
>>     for (int i = 0; i < codemat.length; i++) {
>>         if (codemat[i][0] == type1) v1.add(codemat[i][1]);
>>         if (codemat[i][0] == type2) v2.add(codemat[i][1]);
>>     }
>>
>>     for (int i = 0; i < v1.size(); i++) {
>>         for (int j = 0; j < v2.size(); j++) {
>>             int a = v1.get(i);
>>             int b = v2.get(j);
>>             if (a == b) continue;
>>             double distance = distmat[a][b];
>>             /* lambda = lambda * distance; */
>>             lambda += Math.log(distance);
>>             n++;
>>         }
>>     }
>>
>>     if (type1 == type2) {
>>         /* lambda = Math.sqrt(lambda); */
>>         lambda /= 2;
>>         n = n / 2;
>>     }
>>     if (n == 0) return 0.0;
>>     else
>>         /* return n / Math.pow(Math.pow(lambda, 1.0 / (2.0 * n)), 2); */
>>         return n / Math.exp(lambda / n);
>> }
>>
>>
>> --Andrei
>>
>
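
The overflow described in the quoted report can be reproduced without CDK. The sketch below is a simplified stand-in, not the MDEDescriptor code: it leaves out the type1 == type2 halving and just compares a geometric mean taken from a running product with one taken from a sum of logs. Class and method names are made up for illustration.

```java
import java.util.Arrays;

class GeometricMeanSketch {
    /** n / geometric mean, using a running product (overflows to Infinity for long inputs). */
    static double viaProduct(double[] distances) {
        int n = distances.length;
        if (n == 0) return 0.0;
        double lambda = 1;
        for (double d : distances) lambda *= d;
        return n / Math.pow(lambda, 1.0 / n);
    }

    /** n / geometric mean, using the sum of logs (stays in range). */
    static double viaLogs(double[] distances) {
        int n = distances.length;
        if (n == 0) return 0.0;
        double logSum = 0;
        for (double d : distances) logSum += Math.log(d);
        return n / Math.exp(logSum / n);
    }

    public static void main(String[] args) {
        double[] large = new double[400];
        Arrays.fill(large, 10.0);                // product would be 10^400
        System.out.println(viaProduct(large));   // 0.0, the product overflowed to Infinity
        System.out.println(viaLogs(large));      // close to 40
    }
}
```

Once the product exceeds the double range it becomes Infinity, and dividing n by Infinity gives exactly the zero values reported for long chains.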


[Cdk-user] CDK 1.5.15

2017-05-02 Thread John Mayfield
Hi All,

I've done a CDK 1.5.15 release ahead of the imminent v2.0 release. Release
notes should follow later this week.

There was a regression in 1.5.14 when passing options to the InChI
generation; this has been resolved, along with some decent performance gains
in SMILES/SDF reading. The cdk-group module also has some updates.

This upgrade should be seamless (please let me know ASAP if not as it might
be something we can fix for v2.0).

Get it via Maven (central should sync tonight) or GitHub. Full release notes
available soon.

John


Re: [Cdk-user] Title line for MDL formats

2017-06-26 Thread John Mayfield
Use version 2.0 :-)

On 26 June 2017 at 08:31, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> Here's what I see:
>
> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
> SmilesParser parser = new SmilesParser(bldr);
> IAtomContainer mol1 = parser.parseSmiles("CCO");
> IAtomContainer mol2 = parser.parseSmiles("CCO ethanol");
> SDFWriter sdf = new SDFWriter(System.out);
> sdf.write(mol1);
> sdf.write(mol2);
>
> Outputs:
>
>
>   CDK 0626170823
>
>   3 2 0 0 0 0 0 0 0 0999 V2000
>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>   0. 0. 0. O 0 0 0 0 0 0 0 0 0 0 0 0
>   1 2 1 0 0 0 0
>   2 3 1 0 0 0 0
> M END
> > <cdk:Title>
> null
>
> $$$$
> ethanol
>   CDK 0626170823
>
>   3 2 0 0 0 0 0 0 0 0999 V2000
>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>   0. 0. 0. O 0 0 0 0 0 0 0 0 0 0 0 0
>   1 2 1 0 0 0 0
>   2 3 1 0 0 0 0
> M END
> > <cdk:Title>
> ethanol
>
> $$$$
>
> Note that for mol1 no title is set but a property named cdk:Title is
> output with a value of "null".
> For mol2 the title is set and output correctly, but the property cdk:Title
> is also output.
> This is with version 1.5.14.
>
> Tim
>
>
>
> On 25/06/2017 15:40, John Mayfield wrote:
>
> I'm planning on rewriting the CTab readers to fix issues like the
> cdk:Remark nastiness but the title lines should still be working.
>
> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>> SmilesParser smipar = new SmilesParser(bldr);
>> IAtomContainer mol = smipar.parseSmiles("CCO ethanol");
>> new MDLV2000Writer(System.out).write(mol);
>> System.out.println("//");
>> try (SDFWriter sdf = new SDFWriter(System.out)) {
>> sdf.write(mol);
>> }
>
>
> Result:
>
> ethanol
>>
>>   CDK 0625171540
>>
>>
>>>   3  2  0  0  0  0  0  0  0  0999 V2000
>>
>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>
>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>
>> 0.0.0. O   0  0  0  0  0  0  0  0  0  0  0  0
>>
>>   1  2  1  0  0  0  0
>>
>>   2  3  1  0  0  0  0
>>
>> M  END
>>
>> //
>>
>> ethanol
>>
>>   CDK 0625171540
>>
>>
>>>   3  2  0  0  0  0  0  0  0  0999 V2000
>>
>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>
>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>
>> 0.0.0. O   0  0  0  0  0  0  0  0  0  0  0  0
>>
>>   1  2  1  0  0  0  0
>>
>>   2  3  1  0  0  0  0
>>
>> M  END
>>
>> 
>>
>>
> How are you using them Tim?
>
> John
>
> On 24 June 2017 at 17:19, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>
>> One possibility might be to use the value of IChemObject.getID() for the
>> title line.
>> Not sure if that would be a good or a bad idea. But I tried it and it
>> doesn't work.
>>
>> Another thing I noticed is that CDK has a bad habit of adding an empty
>> cdk:Remark property for no particular reason.
>> But if you know about this then you can remove it before exporting so its
>> not a major problem.
>>
>> Tim
>>
>>
>>
>> On 24/06/2017 15:45, Egon Willighagen wrote:
>>
>>> Mmm... I'd consider that a regression, as that was the intended
>>> behavior...
>>>
>>> John, do you agree we should restore that behavior, or do you have a
>>> better solution?
>>>
>>> Egon
>>>
>>>
>>> On Sat, Jun 24, 2017 at 4:23 PM, Tim Dudgeon <tdudgeon...@gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I'm needing to write the title line (the first line in the record) for
>>>> MDL
>>>> formats.
>>>> I'm using SDFWriter.write(mol) to write out the SD file.
>>>> I've found that you can do this be setting the property
>>>> CDKConstants.TITLE
>>>> (which has a value of "cdk:Title") of the IAtomContainer to the value
>>>> you
>>>> want in the title line , but you also get this written as a SD file
>>>> property, which is not what I want.
>>>>
>>>> Is there a way of better controlling this?
>>>>
>>>> Tim
>>>>
>>>

Re: [Cdk-user] Title line for MDL formats

2017-06-26 Thread John Mayfield
Relevant commit:
https://github.com/cdk/cdk/commit/20d57ed9e92e187e3ef3f356d4236534bff0bce2

You can actually control that set from an option, IIRC, so a workaround in
1.5.14 might be to add those to the ignore list. I can't remember though
whether this might also remove the title line - there were some other
commits around then, for example null is now blank.

John

On 26 June 2017 at 09:51, John Mayfield <john.wilkinson...@gmail.com> wrote:

> Use version 2.0 :-)
>
> On 26 June 2017 at 08:31, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>
>> Here's what I see:
>>
>> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>> SmilesParser parser = new SmilesParser(bldr);
>> IAtomContainer mol1 = parser.parseSmiles("CCO");
>> IAtomContainer mol2 = parser.parseSmiles("CCO ethanol");
>> SDFWriter sdf = new SDFWriter(System.out);
>> sdf.write(mol1);
>> sdf.write(mol2);
>>
>> Outputs:
>>
>>
>>   CDK 0626170823
>>
>>   3 2 0 0 0 0 0 0 0 0999 V2000
>>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>>   0. 0. 0. O 0 0 0 0 0 0 0 0 0 0 0 0
>>   1 2 1 0 0 0 0
>>   2 3 1 0 0 0 0
>> M END
>> > <cdk:Title>
>> null
>>
>> $$$$
>> ethanol
>>   CDK 0626170823
>>
>>   3 2 0 0 0 0 0 0 0 0999 V2000
>>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>>   0. 0. 0. C 0 0 0 0 0 0 0 0 0 0 0 0
>>   0. 0. 0. O 0 0 0 0 0 0 0 0 0 0 0 0
>>   1 2 1 0 0 0 0
>>   2 3 1 0 0 0 0
>> M END
>> > <cdk:Title>
>> ethanol
>>
>> $$$$
>>
>> Note that for mol1 no title is set but a property named cdk:Title is
>> output with a value of "null".
>> For mol2 the title is set and output correctly, but the property
>> cdk:Title is also output.
>> This is with version 1.5.14.
>>
>> Tim
>>
>>
>>
>> On 25/06/2017 15:40, John Mayfield wrote:
>>
>> I'm planning on rewriting the CTab readers to fix issues like the
>> cdk:Remark nastiness but the title lines should still be working.
>>
>> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>>> SmilesParser smipar = new SmilesParser(bldr);
>>> IAtomContainer mol = smipar.parseSmiles("CCO ethanol");
>>> new MDLV2000Writer(System.out).write(mol);
>>> System.out.println("//");
>>> try (SDFWriter sdf = new SDFWriter(System.out)) {
>>> sdf.write(mol);
>>> }
>>
>>
>> Result:
>>
>> ethanol
>>>
>>>   CDK 0625171540
>>>
>>>
>>>>   3  2  0  0  0  0  0  0  0  0999 V2000
>>>
>>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>>
>>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>>
>>> 0.0.0. O   0  0  0  0  0  0  0  0  0  0  0  0
>>>
>>>   1  2  1  0  0  0  0
>>>
>>>   2  3  1  0  0  0  0
>>>
>>> M  END
>>>
>>> //
>>>
>>> ethanol
>>>
>>>   CDK 0625171540
>>>
>>>
>>>>   3  2  0  0  0  0  0  0  0  0999 V2000
>>>
>>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>>
>>> 0.0.0. C   0  0  0  0  0  0  0  0  0  0  0  0
>>>
>>> 0.0.0. O   0  0  0  0  0  0  0  0  0  0  0  0
>>>
>>>   1  2  1  0  0  0  0
>>>
>>>   2  3  1  0  0  0  0
>>>
>>> M  END
>>>
>>> 
>>>
>>>
>> How are you using them Tim?
>>
>> John
>>
>> On 24 June 2017 at 17:19, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>>
>>> One possibility might be to use the value of IChemObject.getID() for the
>>> title line.
>>> Not sure if that would be a good or a bad idea. But I tried it and it
>>> doesn't work.
>>>
>>> Another thing I noticed is that CDK has a bad habit of adding an empty
>>> cdk:Remark property for no particular reason.
>>> But if you know about this then you can remove it before exporting so
>>> its not a major problem.
>>>
>>> Tim
>>>
>>>
>>>
>>> On 24/06/2017 15:45, Egon Willighagen wrote:
>>>
>>>> Mmm... I'd consider that a regression, as that was the intended
>>>> behavior...
>>>>
>>>> John, do you agree 

Re: [Cdk-user] Maven Artifact Overview

2017-05-23 Thread John Mayfield
>
> I currently try to get an overview over the existing different maven
> artifacts that the cdk provides. However, I am a bit overwhelmed by the
> sheer amount of different artifacts. Is there a good overview or
> introduction that explains, which features are found in which artifact and
> where to start with?


Unfortunately not, I'll write one at some point but the truth is that it's
a bit of a mess in some places. Because it was originally one single source
tree it's taken a long tedious time to tease functionality apart and
sometimes it just wasn't possible to decouple or put things in the right
place. Typically CDK modules are organised by their dependencies rather
than their functionality. For example:

 > If you want all IFingerprinter implementations you must include
cdk-fingerprint, cdk-signature, and cdk-smiles (might be another one I'm
forgetting).
 > If you want to read CML you only need (cdk-io), to write cml you need
(cdk-libiocml).

If you go to the GitHub repo (https://github.com/cdk/cdk/) and press the
"t" key you can type in a filename and it will quickly tell you where it is
located.

I do not like to use the bundle, as it is messing up my logger
> configuration.


Hmm.. that might be via JNI-InChI. Anyway, let me know how it's messing
things up as we should fix it; I seem to remember I had problems beforehand
when using Log4J.

John

On 23 May 2017 at 13:04, Till Schäfer  wrote:

> Hi,
> I currently try to get an overview over the existing different maven
> artifacts that the cdk provides. However, I am a bit overwhelmed by the
> sheer amount of different artifacts. Is there a good overview or
> introduction that explains, which features are found in which artifact and
> where to start with? I do not like to use the bundle, as it is messing up
> my logger configuration.
>
> Regards,
> Till
> --
> Dipl.-Inf. Till Schäfer
> TU Dortmund University
> Chair 11 - Algorithm Engineering
> Otto-Hahn-Str. 14 / Room 237
> 44227 Dortmund, Germany
>
> e-mail: till.schae...@cs.tu-dortmund.de
> phone: +49(231)755-7706
> fax: +49(231)755-7740
> web: http://ls11-www.cs.uni-dortmund.de/staff/schaefer
> pgp: https://keyserver2.pgp.com/vkd/SubmitSearch.event?&;
> SearchCriteria=0xD84DED79
>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user


Re: [Cdk-user] CDK - issue with stereochemistry

2017-06-07 Thread John Mayfield
I tried MarvinSketch before I posted that and all the same there too.. let
me know if you still think there is a problem.

On 7 June 2017 at 18:13, Yannick .Djoumbou <y.djoum...@gmail.com> wrote:

> Hi John,
>
> Thank you for the fast response. I am using another viewer, which could have
> caused the confusion. This just means the issue in my code might come from
> somewhere else.
>
> Thanks,
> Yannick
>
> On Tue, Jun 6, 2017 at 3:25 AM, John Mayfield <john.wilkinson...@gmail.com
> > wrote:
>
>> Hi Yannick,
>>
>> Please, please, please don't "add hydrogens" like this! The hydrogens are
>> already there and don't need to be added - this is just wasted effort and,
>> worst of all, the atom typing *might* change your structure's valence!! If
>> for some reason you want to make hydrogens explicit just call this method
>> directly:
>>
>> AtomContainerManipulator.convertImplicitToExplicitHydrogens(molecule);
>>
>>
>> Anyways why do you think the stereochemistry is different? All your
>> SMILES are the same and the stereochemistry is preserved correctly - paste
>> them in here to see: https://cdkdepict-openchem.rhcloud.com/depict.html
>>
>>
>> [image: Inline images 1]
>>
>
>


Re: [Cdk-user] CDK - issue with stereochemistry

2017-06-06 Thread John Mayfield
Hi Yannick,

Please, please, please don't "add hydrogens" like this! The hydrogens are
already there and don't need to be added - this is just wasted effort and,
worst of all, the atom typing *might* change your structure's valence!! If
for some reason you want to make hydrogens explicit just call this method
directly:

AtomContainerManipulator.convertImplicitToExplicitHydrogens(molecule);


Anyways why do you think the stereochemistry is different? All your SMILES
are the same and the stereochemistry is preserved correctly - paste them in
here to see: https://cdkdepict-openchem.rhcloud.com/depict.html


[image: Inline images 1]


[Cdk-user] CDK 2.0 Release

2017-06-06 Thread John Mayfield
Hi All,

It has been live for almost two weeks, but 2.0 is now released. As always, it
is available on Maven, GitHub, and SF.

Release (GitHub Bundle Jar)
Release Notes


John


[Cdk-user] Request for comment: AtomContainer2

2017-09-14 Thread John Mayfield
Hi All,

I've been working on a patch since March that greatly enhances the CDK
structure manipulation both in terms of usability and performance.

It is essentially backwards compatible with the current API (barring some
minor caveats), but to minimise issues I am planning a gradual introduction
whereby it is initially turned on and off using a system/environment flag,
with the flag and the current implementation removed in a 3.0 release.

In summary you can now do this:

> IAtomContainer mol = ...;
> for (IAtom atom : mol.atoms()) {
>   int aidx = atom.getIndex(); // O(1) constant time!
>   int deg = atom.getBondCount(); // O(1) constant time!
>   // atoms know about their bonds O(|degree|)
>   for (IBond bond : atom.bonds()) {
>     int bidx = bond.getIndex(); // O(1) constant time!
>   }
> }
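
To illustrate why these lookups can be constant time, here is a minimal toy
model in plain Java. It is only a sketch and is not the CDK implementation:
the ToyGraph, Atom and Bond classes are hypothetical names, and the only
assumption is that each atom caches its own index and keeps a list of its
incident bonds.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; these classes are illustrative and are not CDK types.
final class ToyGraph {

    static final class Atom {
        private final int index;                            // cached position in the molecule
        private final List<Bond> bonds = new ArrayList<>(); // incident bonds kept on the atom

        Atom(int index) { this.index = index; }

        int getIndex()     { return index; }        // O(1): no scan of the atom list
        int getBondCount() { return bonds.size(); } // O(1): no scan of the bond list
        List<Bond> bonds() { return bonds; }        // iterating costs O(degree)
    }

    static final class Bond {
        private final int index;
        final Atom begin;
        final Atom end;

        Bond(int index, Atom begin, Atom end) {
            this.index = index;
            this.begin = begin;
            this.end = end;
        }

        int getIndex() { return index; }
    }

    private final List<Atom> atoms = new ArrayList<>();
    private final List<Bond> bondList = new ArrayList<>();

    Atom addAtom() {
        Atom atom = new Atom(atoms.size());
        atoms.add(atom);
        return atom;
    }

    Bond addBond(int i, int j) {
        Bond bond = new Bond(bondList.size(), atoms.get(i), atoms.get(j));
        bondList.add(bond);
        bond.begin.bonds.add(bond); // register the bond on both end atoms
        bond.end.bonds.add(bond);
        return bond;
    }

    Atom getAtom(int i) { return atoms.get(i); }

    // Builds a linear chain of n atoms, e.g. n = 3 for a propane skeleton.
    static ToyGraph chain(int n) {
        ToyGraph g = new ToyGraph();
        for (int i = 0; i < n; i++) g.addAtom();
        for (int i = 0; i + 1 < n; i++) g.addBond(i, i + 1);
        return g;
    }
}
```

The trade-off in a design like this is that the cached indices and per-atom
bond lists must be kept in sync whenever atoms or bonds are removed, which is
one plausible source of the "minor caveats" mentioned above.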


I believe it should be relatively pain-free to migrate; however, only testing
will tell. This is where we need your help! Please review the patch and,
once it is applied, please test your code with the new implementation (see
info below).

Further details/benchmarks: https://github.com/cdk/cdk/wiki/AtomContainer2
Patch: https://github.com/cdk/cdk/pull/368

Many Thanks,
John


Re: [Cdk-user] It is possible to set timeout in search?

2017-10-11 Thread John Mayfield
Hi Charo,

> It is possible to avoid this?

InChI or canonical SMILES is the correct way to check identity.

> It is possible to set a timeout in the search?

I've contemplated setting a timeout (or more realistically a state
counter) but I've not encountered any molecules where it's needed. If you
can provide the molecules I can assess the best course of action.
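
On timeout versus state counter: a state counter caps the number of search
states visited, so the cut-off is reproducible regardless of machine load.
Below is a minimal sketch of the idea on a toy backtracking problem (subset
sum). It is not CDK's matcher, and the BoundedSearch class and its method
names are hypothetical.

```java
// Toy backtracking search with a state budget; not part of the CDK.
final class BoundedSearch {

    enum Outcome { FOUND, NOT_FOUND, ABORTED }

    private final int[] values;
    private final long maxStates;
    private long states;
    private boolean aborted;

    BoundedSearch(int[] values, long maxStates) {
        this.values = values.clone();
        this.maxStates = maxStates;
    }

    // Is there a subset of 'values' that sums to 'target'?
    Outcome search(int target) {
        states = 0;
        aborted = false;
        boolean found = dfs(0, target);
        if (found) return Outcome.FOUND;
        return aborted ? Outcome.ABORTED : Outcome.NOT_FOUND;
    }

    private boolean dfs(int i, int remaining) {
        if (aborted) return false;          // unwind quickly once over budget
        if (++states > maxStates) {         // count every visited state
            aborted = true;
            return false;
        }
        if (remaining == 0) return true;
        if (i == values.length) return false;
        // branch: take values[i], or skip it
        return dfs(i + 1, remaining - values[i]) || dfs(i + 1, remaining);
    }
}
```

Returning a distinct ABORTED outcome matters: a caller should not treat a
search that ran out of budget as a definite "no match".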

John


[Cdk-user] CDK Depict is Moving

2017-09-26 Thread John Mayfield
Hi All,

If you are using the publicly hosted CDK Depict
http://cdkdepict-openchem.rhcloud.com please be aware that the service will
no longer be available from the 30th Sep 2017. I have set up a new instance
on my own website, available at http://www.simolecule.com/cdkdepict.
Alternatively, if you have an internal service dependence on it, the best
option for latency is to host it locally.

The reason for this change is that the application was previously hosted on
the free tier of OpenShift v2, which is being replaced by OpenShift v3. In
OpenShift v3 the application URLs are not the same (e.g.
http://cdkdepict-openchem.1d35.starter-us-east-1.openshiftapps.com) and the
free tier must now sleep (wherein it is unavailable) for 25% of the
time (or pay 50 USD/month). I'm now hosting elsewhere for a reasonable
price (https://www.linode.com) and have the option to migrate portably if
required.

Apologies for any inconvenience and the short notice. My plan was to keep the
existing one up via OpenShift v3 with the resource hibernation and a
message pointing to the better http://www.simolecule.com/cdkdepict. However,
since the URLs will be different there's no point doing this.

Thanks,
John


[Cdk-user] CDK Release 2.1

2017-12-15 Thread John Mayfield
Hi All,

CDK v2.1 has been released and includes improved AtomContainer and
Stereochemistry implementations/APIs:
https://github.com/cdk/cdk/wiki/2.1-Release-Notes

*Important:* the new AtomContainer APIs may have some unexpected behaviour
on existing codebases (see
https://github.com/cdk/cdk/wiki/AtomContainer2#gotchas). To offer a period
of transition the new implementation (*internally named* *AtomContainer2*) is
disabled by default. *Now* is the time to try it out and report any
problems. If no critical issues beyond those covered by the gotchas are
identified, it will become the new default implementation, and the old one
will then be removed in future.

 - v2.1 CdkUseLegacyAtomContainer=true (AtomContainer is default)
 - v2.2 CdkUseLegacyAtomContainer=false (AtomContainer2 is default) -
pushed down if issues identified by community
 - v2.x ...
 - v3.0 (original AtomContainer removed)
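
As a rough illustration of how a flag such as CdkUseLegacyAtomContainer could
be resolved from a system property or an environment variable, here is a
minimal sketch. The lookup order (system property first, then environment,
then the release default) is an assumption made for this example and the
LegacyFlag class is hypothetical; check the wiki page above for the actual
behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not the CDK's own lookup code.
final class LegacyFlag {

    static final String NAME = "CdkUseLegacyAtomContainer";

    // Assumed order: system property wins, then environment, then the default.
    static boolean resolve(Map<String, String> props,
                           Map<String, String> env,
                           boolean releaseDefault) {
        String value = props.get(NAME);
        if (value == null) value = env.get(NAME);
        return value == null ? releaseDefault : Boolean.parseBoolean(value.trim());
    }

    // Convenience overload that reads the real JVM property and environment.
    static boolean resolve(boolean releaseDefault) {
        Map<String, String> props = new HashMap<>();
        String prop = System.getProperty(NAME);
        if (prop != null) props.put(NAME, prop);
        return resolve(props, System.getenv(), releaseDefault);
    }
}
```

With a scheme like this, passing -DCdkUseLegacyAtomContainer=false on the
java command line would opt in to the new implementation on v2.1.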


Re: [Cdk-user] Reg: Editing linear Chain alkane (Heptane)

2017-10-30 Thread John Mayfield
Right, the issue here is possibly that you need to fix your valences... you
have two 5 valent carbons.

When you delete/add bonds you want to adjust the hydrogen counts such that
the valence is valid. Or if you want a quick and dirty fix, try adding this:

AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(mol);

CDKHydrogenAdder.getInstance(mol.getBuilder()).addImplicitHydrogens(mol);
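
To spell out the arithmetic behind "5 valent": in the SMILES quoted below,
CC[CH3]C[CH3]CC, each [CH3] atom has two single bonds to neighbouring carbons
plus three hydrogens, so 2 + 3 = 5. A minimal plain-Java sketch of the manual
adjustment follows; it is a toy calculation for neutral carbon only, and the
CarbonValence class is hypothetical, not a CDK API.

```java
// Toy valence arithmetic for neutral carbon; not a CDK class.
final class CarbonValence {

    static final int CARBON_VALENCE = 4;

    // Valence = sum of bond orders to explicit neighbours + attached hydrogens.
    static int valence(int bondOrderSum, int hydrogenCount) {
        return bondOrderSum + hydrogenCount;
    }

    // Hydrogen count that makes a neutral carbon four-valent (never negative).
    static int fixedHydrogenCount(int bondOrderSum) {
        return Math.max(0, CARBON_VALENCE - bondOrderSum);
    }
}
```

For the two offending atoms, fixedHydrogenCount(2) gives 2, i.e. they should
be CH2 rather than CH3 once the new bonds are in place.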

[image: Inline images 1]

On 30 October 2017 at 11:07, Vinothkumar Mohanakrishnan <kmvin...@gmail.com>
wrote:

> Oh yes. It was said in the documentation that 3D Model Builder is having
> issues, but I thought it should work for small molecules?
>
> The SMILES for my assembled molecules (NewHeptane) = *CC[CH3]C[CH3]CC*
>
> Thank you for your interest in this post.
>
> On Mon, Oct 30, 2017 at 11:39 AM, John Mayfield <
> john.wilkinson...@gmail.com> wrote:
>
>> Oh right, yeah sorry I missed that.
>>
>> The 3D coordinate generation in the CDK is really bad, it kind of works
>> but not really. Been meaning up to update, anyways this simple molecule
>> should be okay... what does the SMILES look like for your assembled
>> molecule?
>>
>> John
>>
>> On 30 October 2017 at 09:05, Vinothkumar Mohanakrishnan <
>> kmvin...@gmail.com> wrote:
>>
>>> Dear John
>>>
>>> Thanks for your valuable insight.
>>>
>>> I messed up with the indices in my first post where I numbered atoms
>>> sequentially for clarity purpose of the question, but I should have
>>> mentioned about CDK indices and renumbering in the NewHeptane atom
>>> container, which I am quite aware of, (thanks a lot for your clarity in
>>> explaining it)
>>>
>>> To me, it seems that the real issue is with the model builder. (After
>>> generating SMILES)
>>>
>>> When I tried creating a 3d model of the NewHeptane using model3dbuilder
>>> I get an exception
>>>
>>> *org.openscience.cdk.exception.NoSuchAtomTypeException: Atom is unkown:
>>> Symbol:C does not MATCH AtomType. HoseCode:C-5;CC(C,C/C,/)*
>>>
>>> By the way, I checked with 
>>> AtomContainerManipulator.perceiveatomsandconfigure()
>>> and all atoms in the NewHeptane Iatomcontainer has C.Sp3 hybridized.
>>>
>>>
>>> // Below is the code with CDK atom indices
>>>
>>> NewHeptane.add(Propane)
>>>
>>>
>>> *NewHeptane.addBond(1, // second atom in heptane
>>>   4, // first atom in propane*
>>>
>>> *  IBond.Order.Single);*
>>>
>>>
>>> *NewHeptane.addBond(2, // third atom in heptane (was sixth before we
>>> deleted some)*
>>> * 6, // third atom in propane*
>>> * IBond.Order.Single);*
>>>
>>> //Generate smiles
>>> SmilesGenerator sg = SmilesGenerator.absolute();
>>> String smi = sg.create(NewHeptane);
>>>
>>> //Parse smiles
>>> SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
>>> IAtomContainer m   = sp.parseSmiles(smi);
>>>
>>> // Build 3d model
>>> ModelBuilder3D builder3d = ModelBuilder3D.getInstance(m.getBuilder());
>>> IAtomContainer newHeptane = builder3d.generate3DCoordinates(m, false);
>>>
>>> Any insights in the right direction are highly appreciated. Thank you.
>>>
>>>
>>>
>>> On Sat, Oct 28, 2017 at 5:38 PM, John Mayfield <
>>> john.wilkinson...@gmail.com> wrote:
>>>
>>>> Can't run the code without more context but I think I know what's
>>>> happening.
>>>>
>>>> Firstly atom indices start at 0, so to remove atom numbers 3, 4, 5 (as
>>>> in picture) you remove 2, 3, 4. Secondly, when you delete atoms the indices
>>>> are renumbered/repacked. So the new indices will always be 0, 1, 2
>>>> (previously index 5 - atom number 6).
>>>>
>>>> [image: Inline images 1]
>>>>
>>>> Now this wouldn't actually cause the error because you're adding
>>>> pentane... but 5 atoms from pentane + the 3 from the fragmented hexane
>>>> means the maximum index as 7 but you've requested atom 10...
>>>>
>>>> NewHeptane.addBond(6,10, IBond.Order.Single)
>>>>
>>>>
>>>> There are a couple of ways to do this for example tagging atoms, map
>>>> indices, or use the atom object to create the bonds (most efficient).
>>>

Re: [Cdk-user] Reg: Editing linear Chain alkane (Heptane)

2017-10-30 Thread John Mayfield
Oh right, yeah sorry I missed that.

The 3D coordinate generation in the CDK is really bad; it kind of works but
not really. I've been meaning to update it. Anyway, this simple molecule should
be okay... what does the SMILES look like for your assembled molecule?

John

On 30 October 2017 at 09:05, Vinothkumar Mohanakrishnan <kmvin...@gmail.com>
wrote:

> Dear John
>
> Thanks for your valuable insight.
>
> I messed up with the indices in my first post where I numbered atoms
> sequentially for clarity purpose of the question, but I should have
> mentioned about CDK indices and renumbering in the NewHeptane atom
> container, which I am quite aware of, (thanks a lot for your clarity in
> explaining it)
>
> To me, it seems that the real issue is with the model builder. (After
> generating SMILES)
>
> When I tried creating a 3d model of the NewHeptane using model3dbuilder I
> get an exception
>
> *org.openscience.cdk.exception.NoSuchAtomTypeException: Atom is unkown:
> Symbol:C does not MATCH AtomType. HoseCode:C-5;CC(C,C/C,/)*
>
> By the way, I checked with 
> AtomContainerManipulator.perceiveatomsandconfigure()
> and all atoms in the NewHeptane Iatomcontainer has C.Sp3 hybridized.
>
>
> // Below is the code with CDK atom indices
>
> NewHeptane.add(Propane)
>
>
> *NewHeptane.addBond(1, // second atom in heptane
> 4, // first atom in propane*
>
> *  IBond.Order.Single);*
>
>
> *NewHeptane.addBond(2, // third atom in heptane (was sixth before we
> deleted some)*
> * 6, // third atom in propane*
> * IBond.Order.Single);*
>
> //Generate smiles
> SmilesGenerator sg = SmilesGenerator.absolute();
> String smi = sg.create(NewHeptane);
>
> //Parse smiles
> SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
> IAtomContainer m   = sp.parseSmiles(smi);
>
> // Build 3d model
> ModelBuilder3D builder3d = ModelBuilder3D.getInstance(m.getBuilder());
> IAtomContainer newHeptane = builder3d.generate3DCoordinates(m, false);
>
> Any insights in the right direction are highly appreciated. Thank you.
>
>
>
> On Sat, Oct 28, 2017 at 5:38 PM, John Mayfield <
> john.wilkinson...@gmail.com> wrote:
>
>> Can't run the code without more context but I think I know what's
>> happening.
>>
>> Firstly atom indices start at 0, so to remove atom numbers 3, 4, 5 (as in
>> picture) you remove 2, 3, 4. Secondly, when you delete atoms the indices
>> are renumbered/repacked. So the new indices will always be 0, 1, 2
>> (previously index 5 - atom number 6).
>>
>> [image: Inline images 1]
>>
>> Now this wouldn't actually cause the error because you're adding
>> pentane... but 5 atoms from pentane + the 3 from the fragmented hexane
>> means the maximum index as 7 but you've requested atom 10...
>>
>> NewHeptane.addBond(6,10, IBond.Order.Single)
>>
>>
>> There are a couple of ways to do this for example tagging atoms, map
>> indices, or use the atom object to create the bonds (most efficient).
>>
>> NewHeptane.add(Propane)
>> //adding bonds
>> NewHeptane.addBond(1, // second atom in heptane
>>                    4, // was second atom in propane... 3+1
>>                    IBond.Order.Single);
>> NewHeptane.addBond(2, // third atom in heptane (was sixth before we deleted some)
>>                    6, // was fourth atom in propane 3+3
>>                    IBond.Order.Single)
>>
>>
>> Using the addBond() API you can get the index in the new molecule as
>> follows... which makes it easier
>>
>> NewHeptane.add(Propane)
>> //adding bonds
>> NewHeptane.addBond(1, // second atom in heptane
>>                    NewHeptane.indexOf(Propane.getAtom(1)),
>>                    // second atom in propane -> find index in the combined mol
>>                    IBond.Order.Single);
>> NewHeptane.addBond(2, // third atom in heptane (was sixth before we deleted some)
>>                    NewHeptane.indexOf(Propane.getAtom(3)),
>>                    // fourth atom in propane -> find index in the combined mol
>>                    IBond.Order.Single)
>>
>>
>> Hope that helps,
>> John
>>
>> On 25 October 2017 at 12:53, Vinothkumar Mohanakrishnan <
>> kmvin...@gmail.

Re: [Cdk-user] Question about 3D SDF file from SMILES string

2018-05-20 Thread John Mayfield
There is some support in the *builder3d* modules but there are better
options out there. One day I'll find some time to update that code but,
depending on how many structures you need to convert, the NCI resolver uses
Corina (via CACTVS) IIRC.

https://cactus.nci.nih.gov/chemical/structure/{smiles}/sdf

John

On 20 May 2018 at 06:08, Xuan Cao  wrote:

> Hi everyone!
>
> I am just wondering does CDK have the feature that convert SMILES string
> to 3D SDF file, or convert 2D SDF file to 3D SDF file?
>
> I see that there are some stand-alone programs that do the convert (2D to
> 3D), but I prefer using CDK if it has the feature.
>
> Thank you so much!
>
> Xuan
>


Re: [Cdk-user] Question about 3D SDF file from SMILES string

2018-05-20 Thread John Mayfield
Sorry, intended that you replace "{smiles}" with your structure :-)
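
For anyone scripting this, a minimal sketch of the substitution in Java. The
URL template is the one given in this thread; percent-encoding the SMILES with
URLEncoder is an assumption on my part (characters such as '#' would otherwise
be misread as part of the URL), and the NciResolverUrl class is a hypothetical
helper.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that fills the {smiles} placeholder of the resolver URL.
final class NciResolverUrl {

    static final String TEMPLATE =
        "https://cactus.nci.nih.gov/chemical/structure/{smiles}/sdf";

    static String forSmiles(String smiles) {
        // Assumption: percent-encode characters that are special in URLs.
        String encoded = URLEncoder.encode(smiles, StandardCharsets.UTF_8);
        return TEMPLATE.replace("{smiles}", encoded);
    }
}
```

For example, forSmiles("CCC") gives the same URL that Egon reports working
below.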

On 20 May 2018 at 19:45, Egon Willighagen <egon.willigha...@gmail.com>
wrote:

>
> Maybe a temporary glitch?
> https://cactus.nci.nih.gov/chemical/structure/CCC/sdf works for me right
> now...
>
> Egon
>
> On Sun, May 20, 2018 at 4:20 PM Xuan Cao <danis.cao.x...@gmail.com> wrote:
>
>> Hi John,
>>
>> Thank you for your reply! I will look into that!
>>
>> BTW, the link is not working (404).
>>
>> Xuan
>>
>> On 20 May 2018 at 02:16, John Mayfield <john.wilkinson...@gmail.com>
>> wrote:
>>
>>> There is some support in the *builder3d *modules but there are better
>>> options out there. One day I'll find some time to update that code but
>>> depending on how many structure you need to convert the NCI resolver uses
>>> Corina (via CACTVS) IIRC.
>>>
>>> https://cactus.nci.nih.gov/chemical/structure/{smiles}/sdf
>>>
>>> John
>>>
>>> On 20 May 2018 at 06:08, Xuan Cao <danis.cao.x...@gmail.com> wrote:
>>>
>>>> Hi everyone!
>>>>
>>>> I am just wondering does CDK have the feature that convert SMILES
>>>> string to 3D SDF file, or convert 2D SDF file to 3D SDF file?
>>>>
>>>> I see that there are some stand-alone programs that do the convert (2D
>>>> to 3D), but I prefer using CDK if it has the feature.
>>>>
>>>> Thank you so much!
>>>>
>>>> Xuan
>>>>
>>>
>
>
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> LinkedIn: http://se.linkedin.com/in/egonw
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: -0001-7542-0286 <http://orcid.org/-0001-7542-0286>
> ImpactStory: https://impactstory.org/u/egonwillighagen
>


Re: [Cdk-user] aromatic bonds depicted as any bonds

2018-01-03 Thread John Mayfield
I'll answer these back to front as the second one is much simpler to answer:

> 1. Why the inconsistency in how the different parsers/readers behave? Is
> this documented anywhere?
> 2. Is it possible to have the aromatic bonds depicted as proper aromatic
> bonds?


Answer 2: I don't think the donuts are useful for plain old structures,
only query structures. The circles also do not scale well to all cases
(porphyrin is a classic). The dashed bond in the depiction is not really an
'any' bond as you say but rather "your input was junk/had missing
information" (see the next answer for why that is). Since, as I said, you need
the 'delocalised' bond depiction for query structures, I've updated the
renderer accordingly. For now I've just done an offset dash but will try
to find time to add in the donuts: https://github.com/cdk/cdk/pull/403

Answer 1: The short answer is CDK matches behaviour to what Daylight does
for SMILES and what MDL/Symyx/Accelrys/BIOVIA do for molfile. You can
safely use aromatic bond types in SMILES and not in CTfiles.

In CDK aromaticity is a bond property and not a type/order, that is to say
the bond order is independent of the aromatic status of the bond. The
"normal form" of a molecule in the CDK is to have all the hydrogen counts
and bond orders set - if this is not so you will get warnings/exceptions all
over the place. A molecule can be in an inconsistent state if an input
format was invalid or you create it that way manually. As I'm sure you
know, bond type = 4 in CTfiles is a query feature, if you use it to
represent a discrete structure there is no way to know what the original
representation was. If I try to read your structure with BIOVIA I get an
error:

> ORA-20100: MDL-1919: Molecule failed registration check:
> Error: (root) No query features allowed for registration
> MDL-0633: Unable to convert molfile string to binary molecule ctab
> ORA-06512: at "C$DIRECT2017.MDLAUXOP", line 359
> ORA-06512: at "C$DIRECT2017.MDLAUXOP", line 352
> ORA-06512: at "C$DIRECT2017.MDLAUXOP", line 335


I've written a wiki section to help explain why the problem exists:
https://github.com/cdk/cdk/wiki/CTfile-Reading#aromatic-query-bonds. Rather
than reject molfiles with aromatic bonds outright we leave the molecule in
an inconsistent state, as a user knows their data better than us and may be
able to correct it. SMILES will automatically kekulize input because it can
safely do so.
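
If you want to detect such files before they reach a toolkit, here is a minimal
plain-Java sketch that scans a V2000 bond block for query bond types (4 =
aromatic, through 8 = any). It assumes a well-formed single-molecule V2000
molfile with the usual fixed-width columns and no error handling; the
QueryBondCheck class is a hypothetical helper, not part of the CDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-check for V2000 molfiles; not part of the CDK.
final class QueryBondCheck {

    // Returns the 1-based positions in the bond block whose type is 4..8.
    static List<Integer> queryBonds(String molfile) {
        String[] lines = molfile.split("\\r?\\n", -1);
        String counts = lines[3];                                  // 4th line: counts line
        int nAtoms = Integer.parseInt(counts.substring(0, 3).trim());
        int nBonds = Integer.parseInt(counts.substring(3, 6).trim());

        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < nBonds; i++) {
            String bond = lines[4 + nAtoms + i];                   // bond block follows atoms
            int type = Integer.parseInt(bond.substring(6, 9).trim()); // 3rd 3-char field
            if (type >= 4 && type <= 8) result.add(i + 1);
        }
        return result;
    }
}
```

A non-empty result means the file carries query bond types, and the bond
orders would need to be assigned (or the file regenerated in Kekulé form)
before it is treated as a discrete structure.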

Hope that helps,
John


On 26 December 2017 at 14:46, Tim Dudgeon  wrote:

> I've noticed that if you try to depict a structure in molfile format that
> has bonds in rings defined as aromatic type then they are depicted as any
> bonds (dashed), not aromatic (donuts). For example take this molfile:
>
>
>   Mrv17a0 10061711272D
>
>  14 15  0  0  0  0999 V2000
> 0.54200.23230. C   0  0  0  0  0  0  0  0  0  0  0  0
> 1.2564   -0.18020. C   0  0  0  0  0  0  0  0  0  0  0  0
> 1.2564   -1.00520. N   0  0  0  0  0  0  0  0  0  0  0  0
> 1.9239   -1.49010. C   0  0  0  0  0  0  0  0  0  0  0  0
> 2.7085   -1.23520. S   0  0  0  0  0  0  0  0  0  0  0  0
> 3.3216   -1.78720. C   0  0  0  0  0  0  0  0  0  0  0  0
> 1.6689   -2.27480. N   0  0  0  0  0  0  0  0  0  0  0  0
> 0.8439   -2.27480. N   0  0  0  0  0  0  0  0  0  0  0  0
> 0.5890   -1.49010. C   0  0  0  0  0  0  0  0  0  0  0  0
>-0.1956   -1.23520. C   0  0  0  0  0  0  0  0  0  0  0  0
>-0.8631   -1.72010. C   0  0  0  0  0  0  0  0  0  0  0  0
>-1.5305   -1.23520. C   0  0  0  0  0  0  0  0  0  0  0  0
>-1.2756   -0.45060. C   0  0  0  0  0  0  0  0  0  0  0  0
>-0.4506   -0.45060. S   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0  0  0  0
>   2  3  1  0  0  0  0
>   3  4  4  0  0  0  0
>   4  5  1  0  0  0  0
>   5  6  1  0  0  0  0
>   4  7  4  0  0  0  0
>   7  8  4  0  0  0  0
>   8  9  4  0  0  0  0
>   3  9  4  0  0  0  0
>   9 10  1  0  0  0  0
>  10 11  4  0  0  0  0
>  11 12  4  0  0  0  0
>  12 13  4  0  0  0  0
>  13 14  4  0  0  0  0
>  10 14  4  0  0  0  0
> M  END
>
> Some of the bonds are clearly aromatic (4 in the 3rd column of the bond
> block). But when rendering with code like this you get those bonds depicted
> as dashed bonds:
>
> String mol = ...
> DepictionGenerator dg = new DepictionGenerator()
> .withTerminalCarbons()
> .withSize(500d, 400d)
> .withFillToFit()
>
> MDLV2000Reader v2000Parser = new MDLV2000Reader(new
> ByteArrayInputStream(mol.getBytes()))
> IAtomContainer atomContainer = v2000Parser.read(new AtomContainer())
> Depiction depiction = dg.depict(atomContainer)
> depiction.writeTo("png", "/tmp/mol.png")
>
> This is using either CDK 2.0 or 2.1.
>
> If you try a similar thing with the same molecule in smiles format the
> behaviour is a bit different.
>
> String mol2 = 

Re: [Cdk-user] InChiKey changing Upon simple conversion of implicit hydrogens

2018-02-07 Thread John Mayfield
Which version are you using? It works okay for me on 2.2-SNAPSHOT, and not
much should have changed there in the last few years:

public static void main(String[] args) throws CDKException {
    String smi = "C[C@]12CC[C@H]3[C@@H](CCC4=CC(O)=CC=C34)[C@@H]1CC[C@@]2(O)C#C";
    IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
    SmilesParser smipar = new SmilesParser(bldr);
    IAtomContainer mol = smipar.parseSmiles(smi);

    System.out.println(InChIGeneratorFactory.getInstance().getInChIGenerator(mol).getInchiKey());
    // not needed
    // AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(mol);
    AtomContainerManipulator.convertImplicitToExplicitHydrogens(mol);

    System.out.println(InChIGeneratorFactory.getInstance().getInChIGenerator(mol).getInchiKey());
}

You don't need to assign the atom types, but I tried with and without and the
answer was the same.

BFPYWIDHMRZLRN-SLHNCBLASA-N
BFPYWIDHMRZLRN-SLHNCBLASA-N

John

On 7 February 2018 at 22:06, Yannick .Djoumbou  wrote:

> Hi all,
>
> I have been in deep debugging mode for a few hours, only to find that
> the InChIKey of a molecule seems to change when I use the
> AtomContainerManipulator to convert implicit hydrogens into explicit ones.
>
> For instance
>
> I have the following 17-Ethinylestradiol molecule:
>
> C[C@]12CC[C@H]3[C@@H](CCC4=CC(O)=CC=C34)[C@@H]1CC[C@@]2(O)C#C
>
> PubChem CID: 5991
>
>
> If I use
>
>
> InChIGeneratorFactory inchiGenFactory = InChIGeneratorFactory.
> getInstance();
>
> System.out.println("INCHIKEY BEFORE ADDING EXPLICIT HYDROGEN: ") +
> inchiGenFactory.getInChIGenerator(molClone).getInchiKey());
>
>
> AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms("
> 17-Ethinylestradiol");
>
> AtomContainerManipulator.convertImplicitToExplicitHydrogens("
> 17-Ethinylestradiol");
>
>
> System.out.println("INCHIKEY AFTER ADDING EXPLICIT HYDROGEN: ") +
> inchiGenFactory.getInChIGenerator(molClone).getInchiKey());
>
>
> I get the following output:
>
>
> INCHIKEY BEFORE ADDING EXPLICIT HYDROGEN: BFPYWIDHMRZLRN-SLHNCBLASA-N
>
> INCHIKEY AFTER ADDING EXPLICIT HYDROGEN: BFPYWIDHMRZLRN-OQPPHWFISA-N
>
> This causes a problem down the path for me.
>
>
> An interesting point is that I used the structures generated (via isomeric
> SMILES) upon conversion of implicit hydrogens by CDK and visualized them
> using Marvin Sketch (see attached file), and they look identical to me. I
> also used the isomeric SMILES strings before and after conversion to search
> for same stereoisitopes in PubChem, and found the exact same structure. So
> I am confused as to why CDK would return different InChIKeys.
>
>
> Is there an explanation for this, or may be some other steps I should take
> to avoid this?
>
>
> Thank you for your consideration.
>
>
> Best,
>
>
> Yannick
>
>
>
>
>
>


Re: [Cdk-user] CDK Release 2.1

2017-12-28 Thread John Mayfield
Doh! That'll teach me to rush before Christmas. I will push out a 2.1.1 patch
release to fix that.

On 26 December 2017 at 12:39, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> Just for the record, this 2.1 release cannot be used using the Maven
> Central or JCenter repositories as there is a dependency on
> uk.ac.ebi.beam:beam-core:1.1-SNAPSHOT that is not present in those repos.
>
> To run a maven or gradle based build using CDK 2.1 you should also include
> the OSSRH repo: https://oss.sonatype.org/content/repositories/snapshots
> Tim
>
>
> On 15/12/17 21:43, John Mayfield wrote:
>
> Hi All,
>
> CDK v2.1 has been released and includes improved AtomContainer and
> Stereochemistry Implementations/APIs:
> https://github.com/cdk/cdk/wiki/2.1-Release-Notes
>
> *Important:* the new AtomContainer APIs may have some unexpected
> behaviour on existing codebases (see
> https://github.com/cdk/cdk/wiki/AtomContainer2#gotchas). To offer a period
> of transition the new
> implementation (*internally named* *AtomContainer2*) is disabled by
> default. *Now* is the time to try it out and report any problems. If no
> critical issues outside of that covered by the gotchas are identified it
> will be the new default implementation, the old will then be removed in
> future.
>
>  - v2.1 CdkUseLegacyAtomContainer=true (AtomContainer is default)
>  - v2.2 CdkUseLegacyAtomContainer=false (AtomContainer2 is default) -
> pushed down if issues identified by community
>  - v2.x ...
>  - v3.0 (original AtomContainer removed)
>
>


Re: [Cdk-user] CDK Release 2.1

2017-12-28 Thread John Mayfield
Okay I've done a 2.1.1 patch release to rectify it. Deployed and should be
synced with central shortly.

On 28 December 2017 at 18:48, John Mayfield <john.wilkinson...@gmail.com>
wrote:

> Doh, That'll teach me to rush before Christmas, Will push out a 2.1.1
> patch release to fix that.
>
> On 26 December 2017 at 12:39, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>
>> Just for the record, this 2.1 release cannot be used using the Maven
>> Central or JCenter repositories as there is a dependency on
>> uk.ac.ebi.beam:beam-core:1.1-SNAPSHOT that is not present in those repos.
>>
>> To run a maven or gradle based build using CDK 2.1 you should also
>> include the OSSRH repo:
>> https://oss.sonatype.org/content/repositories/snapshots
>> Tim
>>
>>
>> On 15/12/17 21:43, John Mayfield wrote:
>>
>> Hi All,
>>
>> CDK v2.1 has been released and includes improved AtomContainer and
>> Stereochemistry Implementations/APIs:
>> https://github.com/cdk/cdk/wiki/2.1-Release-Notes
>>
>> *Important:* the new AtomContainer APIs may have some unexpected
>> behaviour on existing codebases (see
>> https://github.com/cdk/cdk/wiki/AtomContainer2#gotchas). To offer a period
>> of transition the new
>> implementation (*internally named* *AtomContainer2*) is disabled by
>> default. *Now* is the time to try it out and report any problems. If no
>> critical issues outside of that covered by the gotchas are identified it
>> will be the new default implementation, the old will then be removed in
>> future.
>>
>>  - v2.1 CdkUseLegacyAtomContainer=true (AtomContainer is default)
>>  - v2.2 CdkUseLegacyAtomContainer=false (AtomContainer2 is default) -
>> pushed down if issues identified by community
>>  - v2.x ...
>>  - v3.0 (original AtomContainer removed)
>>
>>


Re: [Cdk-user] aromatic bonds depicted as any bonds

2018-01-03 Thread John Mayfield
Incidentally, I believe ChemAxon is the only one producing these molfiles
with aromatic bonds. Certainly CDK/RDKit/OpenBabel/OEChem don't, I think
Indigo used to generate them in older versions.

$ obabel -ismi -:'c1ccccc1' -omol

 OpenBabel01031816072D

  6  6  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  6  1  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
M  END
1 molecule converted
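
A rough way to spot molfiles that use the aromatic bond type before handing them to a toolkit is to scan the bond block for type 4. The sketch below is plain Java rather than CDK code; the class and method names are made up for illustration, and it assumes a single V2000 record with the usual three header lines and fixed-width columns.

```java
import java.util.ArrayList;
import java.util.List;

final class AromaticBondCheck {

    /**
     * Returns the 1-based indices of bonds whose V2000 type field is 4.
     * Assumes: three header lines, then the counts line, then the atom
     * block, then the bond block, all in fixed-width V2000 layout.
     */
    static List<Integer> aromaticBondIndices(String molfile) {
        String[] lines = molfile.split("\r?\n", -1);
        String counts = lines[3];
        int nAtoms = Integer.parseInt(counts.substring(0, 3).trim());
        int nBonds = Integer.parseInt(counts.substring(3, 6).trim());
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < nBonds; i++) {
            String bond = lines[4 + nAtoms + i];
            // columns 7-9 of a bond line hold the bond type
            int type = Integer.parseInt(bond.substring(6, 9).trim());
            if (type == 4) {
                hits.add(i + 1);
            }
        }
        return hits;
    }
}
```

A kekulized file that only contains bond types 1 and 2 gives an empty list; anything else is worth a closer look before treating the record as a discrete structure.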


On 3 January 2018 at 15:58, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> John,
>
> Thanks for the detailed response. I think it will be useful to be able to
> depict aromatic bonds, and, as you mention, the main proper use for this
> will be for query structures and fragments. However, many structures out
> there in the wild do use aromatic bonds, so I think it's useful to have it
> for normal structures too.
>
> Tim
>
> p.s. when I referred to the dotted bond as 'ANY' bond, this is the
> notation that ChemAxon uses to depict this type of query bond. But I guess
> that's not an absolute standard.
>
> On 03/01/18 14:03, John Mayfield wrote:
>
> I'll answer these back to front as the second one is much simpler to
> answer:
>
>> 1. Why the inconsistency in how the different parsers/readers behave? Is
>> this documented anywhere?
>> 2. Is it possible to have the aromatic bonds depicted as proper aromatic
>> bonds?
>
>
> Answer 2: I don't think the donuts are useful for plain old structures,
> only query structures. The circles also do not scale well to all cases
> (porphyrin is a classic). The dashed bond in the depiction is not really
> 'any' bond as you say but rather "your input was junk/had missing
> information" (see the next answer on why that is). Since, as I said, for
> query structures you need the 'delocalised' bond depiction, I've updated the
> renderer accordingly. For now I've just done an offset dash but will try
> and find time to add in the donuts: https://github.com/cdk/cdk/pull/403
>
> Answer 1: The short answer is CDK matches behaviour to what Daylight does
> for SMILES and what MDL/Symyx/Accelrys/BIOVIA do for molfile. You can
> safely use aromatic bond types in SMILES and not in CTfiles.
>
> In CDK aromaticity is a bond property and not a type/order, that is to say
> the bond order is independent of the aromatic status of the bond. The
> "normal form" of a molecule in the CDK is to have all the hydrogen counts
> and bond orders set - if this is not so you will get warnings/exceptions all
> over the place. A molecule can be in an inconsistent state if an input
> format was invalid or you create it that way manually. As I'm sure you
> know, bond type = 4 in CTfiles is a query feature, if you use it to
> represent a discrete structure there is no way to know what the original
> representation was. If I try to read your structure with BIOVIA I get an
> error:
>
> ORA-20100: MDL-1919: Molecule failed registration check:
>> Error: (root) No query features allowed for registration
>> MDL-0633: Unable to convert molfile string to binary molecule ctab
>> ORA-06512: at "C$DIRECT2017.MDLAUXOP", line 359
>> ORA-06512: at "C$DIRECT2017.MDLAUXOP", line 352
>> ORA-06512: at "C$DIRECT2017.MDLAUXOP", line 335
>
>
> I've written a wiki section to help explain why the problem exists:
> https://github.com/cdk/cdk/wiki/CTfile-Reading#aromatic-query-bonds.
> Rather than reject molfiles with aromatic bonds outright, we leave the
> molecule in an inconsistent state, as a user knows their data better than us
> and may be able to correct it. SMILES will automatically kekulize input
> because it can safely do so.
>
> Hope that helps,
> John
>
>
> On 26 December 2017 at 14:46, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
>
>> I've noticed that if you try to depict a structure in molfile format that
>> has bonds in rings defined as aromatic type then they are depicted as any
>> bonds (dashed), not aromatic (donuts). For example take this molfile:
>>
>>
>>   Mrv17a0 10061711272D
>>
>>  14 15  0  0  0  0999 V2000
>> 0.54200.23230. C   0  0  0  0  0  0  0  0  0  0  0  0
>> 1.2564   -0.18020. C   0  0  0  0  0  

Re: [Cdk-user] lots of debug output when using CDK 2.2-SNAPSHOT

2018-08-06 Thread John Mayfield
The output is coming from *DynamicFactory*. Are you including log4j on your
classpath (watch out for cdk-bundle)? If so, make sure it picks up your
config and not one from another project (e.g. jena/jni-inchi). You can
either exclude *log4j* or configure it to suit your needs.

On Sat, 4 Aug 2018 at 05:38, Rajarshi Guha  wrote:

> Hi, I was rebuilding the rcdk version with the latest CDK (2.2-SNAPSHOT).
> On loading the R package, I'm seeing a lot of DEBUG output of the form
>
> 41   [main] DEBUG org.openscience.cdk.DynamicFactory  - registered
> 'IChemModel' with 'ChemModel' implementation
>
> Given that the R code is just making some calls to constructors, I'm not
> sure where this output is coming from.
>
> Any suggestions on how to suppress it?
>
>
> --
> Rajarshi Guha | http://blog.rguha.net | @rguha 
>
>


Re: [Cdk-user] lots of debug output when using CDK 2.2-SNAPSHOT

2018-08-06 Thread John Mayfield
I've removed *cdk-log4j* from the bundle, see if that has any impact.

https://github.com/cdk/cdk/commit/6af1ad94638dc11874fa85006302b92a786f5d7b

On Mon, 6 Aug 2018 at 08:24, John Mayfield 
wrote:

> The output is coming from *DynamicFactory*. Are you including log4j on
> your classpath (watch out for cdk-bundle)? If so, make sure it picks up your
> config and not one from another project (e.g. jena/jni-inchi). You can
> either exclude *log4j* or configure it to suit your needs.
>
> On Sat, 4 Aug 2018 at 05:38, Rajarshi Guha 
> wrote:
>
>> Hi, I was rebuilding the rcdk version with the latest CDK (2.2-SNAPSHOT).
>> On loading the R package, I'm seeing a lot of DEBUG output of the form
>>
>> 41   [main] DEBUG org.openscience.cdk.DynamicFactory  - registered
>> 'IChemModel' with 'ChemModel' implementation
>>
>> Given that the R code is just making some calls to constructors, I'm not
>> sure where this output is coming from.
>>
>> Any suggestions on how to suppress it?
>>
>>
>> --
>> Rajarshi Guha | http://blog.rguha.net | @rguha
>> <https://twitter.com/rguha>
>>
>>


Re: [Cdk-user] SMARTSQueryTool aromaticity

2018-08-27 Thread John Mayfield
This is because you are using the CDK aromaticity model, whereas in this case
the Daylight model is the correct one. It's "correct" because Daylight
created SMARTS, so anything to do with SMARTS should preferably use
this model. If all toolkits did this, SMARTS patterns (and SMIRKS
transforms) would be a lot more portable. Daylight says this molecule
*is* aromatic (see screenshot).

The *SMARTSQueryTool* is unofficially deprecated; the newer class is
*SmartsPattern*, and it uses the Daylight model. You may also find the
*SmartsFragmentExtractor* useful: writing good SMARTS is somewhat of an art
form, and this class helps. If you want to match patterns with a different
aromaticity model you need to use the base *Pattern* class and provide it
a query molecule.

John

[image: image.png]

On Fri, 24 Aug 2018 at 06:18, Iris E.  wrote:

>
>
>
> Hey,
>
> I am currently working on a project using CDK, where I want to generate
> aromatic fragment SMARTS from an initial molecule.
> Later in the project I want to locate these fragments again in
> their initial molecules.
>
> For the fragment SMARTS generation I aromatize the initial molecules using
> the Aromaticity class:
> Aromaticity aromaticity = new Aromaticity(ElectronDonation.cdk(),
>                                           Cycles.all());
>
> For the relocation I use the SMARTSQueryTool class:
> SMARTSQueryTool sqt = new SMARTSQueryTool(fragmentSmarts,
>     DefaultChemObjectBuilder.getInstance());
> System.out.println(sqt.matches(molecule));
>
> By doing this, I discovered the following problem.
>
> Example:
> * initial molecule SMILES: CCNC1=NC(=O)N=C(NC(C)(C)C)N1
> * fragment SMARTS: C(=NC=O)N, CCNC=N
>
> When I aromatize the molecule as described above, I get a
> non-aromatic ring within the molecule, which is reflected by the
> fragments found.
> When I try to find the fragment CCNC=N in the initial molecule via the
> SMARTSQueryTool, the method returns false for not finding it.
> If I change the fragment SMARTS to CCNcn, the method returns true.
>
> I reproduced this problem by using this online tool, too:
> http://www.simolecule.com/cdkdepict/depict.html
>
> This leads me to the question of whether the
> SMARTSQueryTool internally aromatizes the molecule. The Aromaticity class
> seems to aromatize the molecule correctly by flagging the ring as
> non-aromatic, because this molecule is not an aromatic compound. The ring
> features an SP2-hybridised double bond characteristic, but the molecule is
> not planar and its electrons are not delocalized. On request, I can provide
> further examples.
>
> Now I am wondering why the
> SMARTSQueryTool applies its own aromaticity to the molecule, and under
> which conditions?
>
> I hope you can help me with this issue.
>
>
> Kind regards,
>
> Iris
>


Re: [Cdk-user] Aligning Reaction using Fixed Substructure

2018-09-07 Thread John Mayfield
Yes, reactions are aligned left to right automatically when atom-maps are
present.

For example:

[image: image.png]
(CDK Depict)

[image: image.png]
(CDK Depict)

// StructureDiagramGenerator sdg;
sdg.setAlignMappedReaction(false);

If you want to *also* align to a substructure you can lay out the product
first, then the whole reaction. The API is deliberately low-level to be
flexible but basically you can generate a layout keeping certain
atoms/bonds in place (atoms are more important, the bond set stops them
being stretched/flipped). With this you can transfer across the coordinates
first then "fix" these in place.

For example, something like this:

IAtomContainer query; // already laid out
IAtomContainer mol;   // to be laid out
StructureDiagramGenerator sdg = new StructureDiagramGenerator();
for (Map<IAtom, IAtom> map : Pattern.findSubstructure(query)
                                    .matchAll(mol)
                                    .toAtomMap()) {
  // copy the query coordinates onto the matched atoms
  for (Map.Entry<IAtom, IAtom> e : map.entrySet())
    e.getValue().setPoint2d(new Point2d(e.getKey().getPoint2d()));
  // fix the matched atoms in place and lay out the rest
  sdg.setMolecule(mol, false, new HashSet<>(map.values()), Collections.emptySet());
  sdg.generateCoordinates();
  break; // first mapping only
}


StructureDiagramGenerator API:

> public void setMolecule(IAtomContainer mol,
>                         boolean clone,
>                         Set<IAtom> afix,
>                         Set<IBond> bfix)
> Assigns a molecule to be laid out. After setting the molecule, call
> generateCoordinates() to assign 2D coordinates. An optional set of
> atoms/bonds can be passed in to allow partial layout; these will be 'fixed'
> in place. This only applies to non-cloned molecules, and only atoms with
> coordinates can be fixed.
> Parameters:
> mol - the molecule for which coordinates are to be generated.
> clone - Should the whole process be performed with a cloned copy?
> afix - Atoms that should be fixed in place, coordinates are not changed.
> bfix - Bonds that should be fixed in place, they will not be flipped,
> bent, or stretched.



On Fri, 7 Sep 2018 at 10:32, Ross West  wrote:

> Hi,
>
>
>
> I'm wondering if it's possible to align a Reaction using a fixed
> substructure similar to the methods covered here:
> http://ctr.wikia.com/wiki/Align_the_depiction_using_a_fixed_substructure
>
>
>
> Is this possible using CDK?
>
>
>
> Thanks in advance,
>
> Ross
>


Re: [Cdk-user] InChi and Docker

2018-09-10 Thread John Mayfield
Unlikely to work in alpine, have you tried ubuntu/debian slim?

John

On Mon, 10 Sep 2018 at 18:16, Maria Sorokina 
wrote:

> Hi,
>
> I developed an app using CDK, and in particular its InChI generator. It
> runs perfectly with IntelliJ, and as a jar on macOS and on CentOS. I wanted
> to make it run within a Docker container (openjdk:8-jdk-alpine with
> additional installation of gcc). The image is created without any problem,
> but when I run the container, I get the following error just once:
>
>
> npls-db-filler_1  | 0[main] INFO
> net.sf.jnati.deploy.artefact.ConfigManager  - Loading global configuration
> npls-db-filler_1  | 4[main] DEBUG
> net.sf.jnati.deploy.artefact.ConfigManager  - Loading defaults:
> jar:file:/app.jar!/BOOT-INF/lib/jnati-deploy-0.3.jar!/META-INF/jnati/jnati.default-properties
> npls-db-filler_1  | 5[main] INFO
> net.sf.jnati.deploy.artefact.ConfigManager  - Loading artefact
> configuration: jniinchi-1.03_1
> npls-db-filler_1  | 6[main] DEBUG
> net.sf.jnati.deploy.artefact.ConfigManager  - Loading instance defaults:
> jar:file:/app.jar!/BOOT-INF/lib/jnati-deploy-0.3.jar!/META-INF/jnati/jnati.instance.default-properties
> npls-db-filler_1  | 9[main] INFO
> net.sf.jnati.deploy.repository.ClasspathRepository  - Searching classpath
> for: jniinchi-1.03_1-LINUX-AMD64
> npls-db-filler_1  | 14   [main] INFO
> net.sf.jnati.deploy.repository.LocalRepository  - Searching local
> repository for: jniinchi-1.03_1-LINUX-AMD64
> npls-db-filler_1  | 16   [main] DEBUG
> net.sf.jnati.deploy.repository.LocalRepository  - Artefact path:
> /root/.jnati/repo/jniinchi/1.03_1/LINUX-AMD64
> npls-db-filler_1  | 16   [main] INFO
> net.sf.jnati.deploy.repository.LocalRepository  - Creating artefact:
> /root/.jnati/repo/jniinchi/1.03_1/LINUX-AMD64
> npls-db-filler_1  | 18   [main] DEBUG
> net.sf.jnati.deploy.source.JarSource  - Opening jar: /app.jar
> npls-db-filler_1  | 19   [main] WARN
> net.sf.jnati.deploy.NativeArtefactLocator  - Error resolving artefact to
> local repository
> npls-db-filler_1  | java.io.FileNotFoundException: File not found:
> MANIFEST.xml
> npls-db-filler_1  | at
> net.sf.jnati.deploy.source.JarSource.openFile(JarSource.java:67)
> npls-db-filler_1  | at
> net.sf.jnati.deploy.source.ArtefactSource.loadManifest(ArtefactSource.java:41)
>
>
> And this  every time the InChiGenerator is called in my code:
>
>
>  Error loading JNI InChI native code.
> npls-db-filler_1  | You may need to compile the native code for your
> platform.
> npls-db-filler_1  | See http://jni-inchi.sourceforge.net for instructions.
> npls-db-filler_1  |
> npls-db-filler_1  | 1033 [main] INFO
> net.sf.jnati.deploy.artefact.ConfigManager  - Loading artefact
> configuration: jniinchi-1.03_1
> npls-db-filler_1  | 1037 [main] DEBUG
> net.sf.jnati.deploy.artefact.ConfigManager  - Loading instance defaults:
> jar:file:/app.jar!/BOOT-INF/lib/jnati-deploy-0.3.jar!/META-INF/jnati/jnati.instance.default-properties
> npls-db-filler_1  | 1039 [main] INFO
> net.sf.jnati.deploy.repository.ClasspathRepository  - Searching classpath
> for: jniinchi-1.03_1-LINUX-AMD64
> npls-db-filler_1  | 1039 [main] INFO
> net.sf.jnati.deploy.repository.LocalRepository  - Searching local
> repository for: jniinchi-1.03_1-LINUX-AMD64
> npls-db-filler_1  | 1039 [main] DEBUG
> net.sf.jnati.deploy.repository.LocalRepository  - Artefact path:
> /root/.jnati/repo/jniinchi/1.03_1/LINUX-AMD64
> npls-db-filler_1  | 1040 [main] INFO
> net.sf.jnati.deploy.repository.LocalRepository  - Creating artefact:
> /root/.jnati/repo/jniinchi/1.03_1/LINUX-AMD64
> npls-db-filler_1  | 1040 [main] DEBUG
> net.sf.jnati.deploy.source.JarSource  - Opening jar: /app.jar
> npls-db-filler_1  | 1041 [main] WARN
> net.sf.jnati.deploy.NativeArtefactLocator  - Error resolving artefact to
> local repository
> npls-db-filler_1  | java.io.FileNotFoundException: File not found:
> MANIFEST.xml
> npls-db-filler_1  | at
> net.sf.jnati.deploy.source.JarSource.openFile(JarSource.java:67)
>
>
>
> Has anybody tried and succeeded in using InChiGenerator within a Docker
> container? Which package/library/software should I add, either to my Docker
> image or to my Maven repo, to make it work within this container?
>
> Thank you for your help!
>
> Kind regards,
>
> Maria Sorokina, PhD
> Steinbeck Research Group
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> http://cheminf.uni-jena.de
>


Re: [Cdk-user] InChi and Docker

2018-09-11 Thread John Mayfield
Right, alpine was probably a red herring. Looking again I presume you build
app.jar yourself? It looks like you've removed the MANIFEST.xml that is
needed by JNI InChI to locate the native dependency. How are you building
the JAR?
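
One way to check what actually ended up in the built JAR is to list the entries named MANIFEST.xml, including those inside nested JARs. This is a minimal sketch using only java.util.zip; JarEntryFinder is a made-up name, and where JNI InChI expects the file to live is not something this code knows or verifies.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class JarEntryFinder {

    /** Lists entries whose name ends with 'suffix', descending into nested .jar entries. */
    static List<String> find(byte[] jarBytes, String suffix) {
        List<String> hits = new ArrayList<>();
        try {
            scan(new ByteArrayInputStream(jarBytes), suffix, "", hits);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    private static void scan(InputStream in, String suffix, String prefix,
                             List<String> hits) throws IOException {
        // Not closed on purpose: closing a nested stream would also close the outer one.
        ZipInputStream zip = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            String path = prefix + entry.getName();
            if (entry.getName().endsWith(suffix)) {
                hits.add(path);
            } else if (entry.getName().endsWith(".jar")) {
                // read the nested JAR straight from the current entry's data
                scan(zip, suffix, path + "!/", hits);
            }
        }
    }
}
```

For example, JarEntryFinder.find(Files.readAllBytes(Paths.get("app.jar")), "MANIFEST.xml") prints nothing useful if the list is empty, which would suggest the file was dropped during packaging.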

Unfortunately v1.03 of InChI is all we have for JNI ATM. My former
colleague has been looking at a JNA InChI version (
https://github.com/dan2097/jna-inchi) but this is very different to the JNI
bindings and not a drop-in replacement.

John

On Tue, 11 Sep 2018 at 09:48, Maria Sorokina 
wrote:

> I tried the Ubuntu 16.04 based image with Java 8 added, and I get exactly the
> same error, only at the InChI generation. I have the feeling that something
> is missing in these minimal Linux images for the InChI generator to run
> correctly, but I cannot find what.
>
> I also noticed that it searches for an old version of jni-inchi (1.03, and the
> latest and only available is 1.6):
> npls-db-filler_1  | 2181 [main] INFO
> net.sf.jnati.deploy.repository.RemoteRepository  - Searching remote
> repository for: jniinchi-1.03_1-LINUX-AMD64 (
> http://jnati.sourceforge.net/jnati-repo)
>
> What am I missing in my configuration ? Knowing that everything works fine
> outside of Docker?
>
> The Dockerfile using Ubuntu:
>
> FROM ubuntu:16.04
> LABEL maintainer="maria.ssorok...@gmail.com"
>
> RUN apt-get update && \
>     apt-get upgrade -y && \
>     apt-get install -y software-properties-common && \
>     add-apt-repository ppa:webupd8team/java -y && \
>     apt-get update && \
>     echo oracle-java7-installer shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections && \
>     apt-get install -y oracle-java8-installer && \
>     apt-get clean
>
> EXPOSE 8080
> VOLUME /tmp
> ARG JAR_FILE
> COPY ${JAR_FILE} app.jar
> ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/app.jar"]
>
> Thank you for your help!
>
> Kind regards,
>
>
> Maria Sorokina, PhD
> Steinbeck Research Group
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> http://cheminf.uni-jena.de
>

Re: [Cdk-user] CDK functionality in Docker images (was: InChi and Docker)

2018-09-11 Thread John Mayfield
It doesn't use InChI :-).

On Tue, 11 Sep 2018 at 16:43, Egon Willighagen 
wrote:

> Anyway, I'm quite excited about the idea of a Docker for CDK
> functionality... (well, excited it triggered here not so much I am entirely
> sure this is a fantastic platform, but in our OpenRiskNet we use it for
> workflows, and being able to replace some stuff with CDK functionality in
> that cloud *is* exciting :)
>
> John, how is your CDK Depict docker [0] set up then?
>
> Egon
>
> 0.https://hub.docker.com/r/simolecule/cdkdepict/
>
>
> On Tue, Sep 11, 2018 at 5:03 PM Maria Sorokina 
> wrote:
>
>> I am building my jar with Maven, but I don’t think that it is the
>> problem, as I tried to run the jar on a CentOS 7 VM having just Java and it
>> worked without any problem.
>> I also desperately tried the dirty solution of downloading the manifest
>> and the jniinchi-1.03_1-LINUX-AMD64.so files that jni-inchi seems to
>> search for and putting them where it seems to look for them within the
>> container, but of course it failed.
>> I’ll try to search for a solution, but for now I’ll just do my thing
>> without containerizing this part of my app. If I find a solution, I’ll post
>> it here. Meanwhile, I’m still very open to any suggestion!
>>
>
>
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: -0001-7542-0286 
> ImpactStory: https://impactstory.org/u/egonwillighagen
>


Re: [Cdk-user] InChi and Docker

2018-09-11 Thread John Mayfield
I'll have a poke around in an alpine container this evening, does seem odd.

A third option is the nested-vm version; we used this in JChemPaint when it
was an Applet (see inchi-nestedvm in https://github.com/JChemPaint/jchempaint).

On Tue, 11 Sep 2018 at 16:03, Maria Sorokina 
wrote:

> I am building my jar with Maven, but I don’t think that it is the problem,
> as I tried to run the jar on a CentOS 7 VM having just Java and it worked
> without any problem.
> I also desperately tried the dirty solution of downloading the manifest
> and the jniinchi-1.03_1-LINUX-AMD64.so files that jni-inchi seems to
> search for and putting them where it seems to look for them within the
> container, but of course it failed.
>
> I’ll try to search for a solution, but for now I’ll just do my thing
> without containerizing this part of my app. If I find a solution, I’ll post
> it here. Meanwhile, I’m still very open to any suggestion!
>
> Kind regards,
>
> Maria Sorokina, PhD
> Steinbeck Research Group
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> http://cheminf.uni-jena.de
>

Re: [Cdk-user] Aligning Reaction using Fixed Substructure

2018-10-11 Thread John Mayfield
Nice,

For historical reasons, CDK uses unit bond length 1.5 instead of 1.
Something to do with C-C bonds but that really only makes sense for 3D.
Rescale it like this:

GeometryUtil.scaleMolecule(fixedSubstructure,
    1.5 / GeometryUtil.getBondLengthMedian(fixedSubstructure));
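
The arithmetic behind that rescale can be sketched without CDK: take the median bond length of the current layout and multiply every coordinate by target/median. The class below is a stand-alone illustration with made-up names, not what GeometryUtil does internally; in particular it scales about the origin and, for an even number of bonds, takes the upper middle value as the median, both of which are assumptions.

```java
import java.util.Arrays;

final class BondLengthScale {

    /** Median length of the bonds, where each bond is a pair of atom indices into xy. */
    static double medianBondLength(double[][] xy, int[][] bonds) {
        double[] lengths = new double[bonds.length];
        for (int i = 0; i < bonds.length; i++) {
            double[] a = xy[bonds[i][0]];
            double[] b = xy[bonds[i][1]];
            lengths[i] = Math.hypot(a[0] - b[0], a[1] - b[1]);
        }
        Arrays.sort(lengths);
        return lengths[lengths.length / 2];
    }

    /** Multiplies every coordinate by the factor (scaling about the origin). */
    static void scale(double[][] xy, double factor) {
        for (double[] p : xy) {
            p[0] *= factor;
            p[1] *= factor;
        }
    }

    /** Rescales in place so the median bond length becomes 'target' (1.5 for CDK layouts). */
    static void rescaleTo(double[][] xy, int[][] bonds, double target) {
        scale(xy, target / medianBondLength(xy, bonds));
    }
}
```

A drawing with unit bond length 1 is therefore stretched by a factor of 1.5, and one already at 1.5 is left unchanged.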

Also I presume you want *findSubstructure* rather than *findIdentical* (exact
match). You should also avoid the *count() > 1* check; this is very wasteful if
there are a lot of automorphisms in the query. Basically it says: find them
all, count them, and then re-find the first one for alignment. You can
completely remove that check as shown, but for the benefit of the mailing
list the correct way to write that if-condition is:


> *if (mappings.atLeast(1)) {}*
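
To make the cost difference concrete, here is a small stand-alone sketch of a lazily generated sequence of matches. It is not the CDK Mappings class; LazyMatches and its methods are hypothetical stand-ins, and the counter only exists to show how many matches each approach forces to be generated.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;

final class LazyMatches {

    /** Returns true as soon as n items have been seen; the rest are never generated. */
    static <T> boolean atLeast(Iterable<T> items, int n) {
        if (n <= 0) {
            return true;
        }
        int seen = 0;
        for (T ignored : items) {
            if (++seen >= n) {
                return true;
            }
        }
        return false;
    }

    /** Counts every item, which forces the whole sequence to be generated. */
    static <T> int count(Iterable<T> items) {
        int n = 0;
        for (T ignored : items) {
            n++;
        }
        return n;
    }

    /** Hypothetical lazy source of 'total' matches; 'generated' records the work done. */
    static Iterable<Integer> matches(int total, AtomicInteger generated) {
        return () -> new Iterator<Integer>() {
            private int produced = 0;

            @Override
            public boolean hasNext() {
                return produced < total;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                generated.incrementAndGet();
                return produced++;
            }
        };
    }
}
```

With 1000 possible matches, atLeast(..., 1) generates one of them while count(...) generates all 1000, which is the saving when a symmetric query matches in many equivalent ways.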


Now the tricky part is working out when to/not to align atoms in generic
queries, for example: *C~C~O* matches both C=C=O and CCO; the first
should not be bent when laid out. Anyway, it's an open problem and for most
queries it will be fine.

Here's the final function; you probably also want the highlighting done at
the same time, but I have omitted that here:

https://gist.github.com/johnmay/12797a89f4186bc7da881f1f4a706671

public static void alignMoleculeToSubstructure(IAtomContainer mol,
                                               IAtomContainer sub,
                                               boolean fixBonds) throws CDKException {
    Pattern substructurePattern = Pattern.findSubstructure(sub);
    Mappings mappings = substructurePattern.matchAll(mol);
    Set<IAtom> fixedAtoms = new HashSet<>();
    Set<IBond> fixedBonds = new HashSet<>();
    for (Map<IChemObject, IChemObject> map : mappings.toAtomBondMap()) {
        // rescale the substructure to CDK's unit bond length of 1.5
        GeometryUtil.scaleMolecule(sub, 1.5 / GeometryUtil.getBondLengthMedian(sub));
        for (IChemObject substructureObject : map.keySet()) {
            IChemObject targetObject = map.get(substructureObject);
            if (targetObject instanceof IAtom) {
                // set the target atom's position to that of the
                // substructure atom and add it to the fixed atom list
                IAtom targetAtom = (IAtom) targetObject;
                IAtom substructureAtom = (IAtom) substructureObject;
                targetAtom.setPoint2d(new Point2d(substructureAtom.getPoint2d()));
                fixedAtoms.add(targetAtom);
            } else if (fixBonds && targetObject instanceof IBond) {
                // only check bonds if needed; add the target bond to the fixed bond list
                fixedBonds.add((IBond) targetObject);
            }
        }
        // only align to the first matching substructure
        break;
    }
    // generate coordinates for the molecule
    StructureDiagramGenerator sdg = new StructureDiagramGenerator();
    sdg.setMolecule(mol, false, fixedAtoms, fixedBonds);
    sdg.generateCoordinates();
}


John
___
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user


[Cdk-user] CDK v2.2

2018-10-30 Thread John Mayfield
Dear CDK users,

CDK 2.2 is now released and can be obtained from Maven central or the GitHub
<https://github.com/cdk/cdk/releases/tag/cdk-2.2> site. The full release
notes <https://github.com/cdk/cdk/wiki/2.2-Release-Notes> provide details
on the new features and changes.

As noted in v2.1/v2.1.1 the AtomContainer2
<https://github.com/cdk/cdk/wiki/AtomContainer2> is now the default so if
you didn't try running v2.1/v2.1.1 with the flag
CdkUseLegacyAtomContainer=false you may see some breakages. As highlighted on
the wiki page <https://github.com/cdk/cdk/wiki/AtomContainer2> this
normally just requires addAtom/Bond statements be reordered and using
``Objects.equals(container1, container2)`` instead of reference comparison
(container1 == container2).

- John


Re: [Cdk-user] CDK v2.2

2018-10-30 Thread John Mayfield
1) You can just include cdk-legacy and use the existing method, but the
functionality was just a convenience, the same as getMinMax(container) but
without the Java AWT dependency, which caused problems for Android/SWT. IIRC
this was the only place AWT was used in the core package. If you just want
the width/height use: get2DDimension. Note this was 4+ years ago :-)
https://github.com/cdk/cdk/commit/214785ce18e2d06f1ba7d9fddc82c0ea9753a385#diff-9a119f1ec045c70b21aa694d01bbc773
2) Looks like a bug, but you really really really should not be using
clone.

On Tue, 30 Oct 2018 at 15:51, Syed Asad Rahman  wrote:

> Thanks.
>
>
>
> I have started to play with the new release.
>
> Got few regression which is fine with API changes.
>
>
>
> Any pointer please?
>
>
>
> Q1) What is the equivalent of GeometryTools.getRectangle2D in the new
> GeometryUtil?
>
> import static org.openscience.cdk.geometry.GeometryTools.getRectangle2D;
>
> Q2)
>
> IAtomContainer clone = org.clone();
>
>
>
> Throws java.lang.ClassCastException: java.util.ArrayList cannot be cast to
> org.openscience.cdk.sgroup.SgroupBracket
>
> java.lang.ClassCastException: java.util.ArrayList cannot be cast to
> org.openscience.cdk.sgroup.SgroupBracket
>
> at
> org.openscience.cdk.tools.manipulator.SgroupManipulator.copy(SgroupManipulator.java:108)
>
>     at
> org.openscience.cdk.AtomContainer.clone(AtomContainer.java:1408)
>
>
>
> *From: *John Mayfield 
> *Date: *Tuesday, 30 October 2018 at 15:39
> *To: *Syed Asad Rahman 
> *Cc: *cdkuser 
> *Subject: *Re: [Cdk-user] CDK v2.2
>
>
>
> Yes, but there has been no changes to SilentChemObjectBuilder...?
>
>
>
> - I did try testing RDT with the new AtomContainer APIs but your tests
> took too long to run and I have stuff to do :-).
>
> - Likewise I did the same with Ambit but there are integration tests with
> dependencies on DBs etc so difficult to see if there is actually anything
> wrong. Bigger problem in Ambit is extended type (e.g. SuppleAtomContainer)
> which needs to implement some new methods. Often trivial stuff convenience
> stuff that could be resolved with Java 8 and default method implementations
> on interfaces, but we're currently on Java 7 so not an option.
>
>
>
> John
>
>
>
> On Tue, 30 Oct 2018 at 14:47, Syed Asad Rahman  wrote:
>
> Thank you John and Developers!
>
>
>
> This is a fantastic news!
>
> One quick question before I pull it into SMSD and RDT - Is 
> SilentChemObjectBuilder thread safe?
>
> Best wishes,
>
> -Asad
>
>
>
> *From: *John Mayfield 
> *Date: *Tuesday, 30 October 2018 at 12:01
> *To: *cdkuser 
> *Subject: *[Cdk-user] CDK v2.2
>
>
>
> Dear CDK users,
>
>
>
> CDK 2.2 is now released and can be obtained from Maven central or the
> GitHub <https://github.com/cdk/cdk/releases/tag/cdk-2.2> site. The full 
> release
> notes <https://github.com/cdk/cdk/wiki/2.2-Release-Notes> provide details
> on the new features and changes.
>
>
>
> As noted in v2.1/v2.1.1 the AtomContainer2
> <https://github.com/cdk/cdk/wiki/AtomContainer2> is now the default so if
> you didn't try running v2.1/v2.1.1 with the flag
> CdkUseLegacyAtomContainer=false you may see some breakages. As highlighted on
> the wiki page <https://github.com/cdk/cdk/wiki/AtomContainer2> this
> normally just requires addAtom/Bond statements be reordered and using
> ``Objects.equals(container1, container2)`` instead of reference comparison
> (container1 == container2).
>
>
>
> - John
>
>
>
>


Re: [Cdk-user] CDK v2.2

2018-10-30 Thread John Mayfield
On (2) you can also just remove all the Sgroup info, likely you're not even
using it.

mol.setProperty(CDKConstants.CTAB_SGROUPS, null);



Re: [Cdk-user] CDK v2.2

2018-10-30 Thread John Mayfield
Yes, but there have been no changes to SilentChemObjectBuilder...?

- I did try testing RDT with the new AtomContainer APIs but your tests took
too long to run and I have stuff to do :-).
- Likewise I did the same with Ambit but there are integration tests with
dependencies on DBs etc so it is difficult to see if there is actually anything
wrong. Bigger problem in Ambit is extended types (e.g. SuppleAtomContainer)
which need to implement some new methods. Often trivial convenience
stuff that could be resolved with Java 8 and default method implementations
on interfaces, but we're currently on Java 7 so not an option.

John

On Tue, 30 Oct 2018 at 14:47, Syed Asad Rahman  wrote:

> Thank you John and Developers!
>
>
>
> This is a fantastic news!
>
> One quick question before I pull it into SMSD and RDT - Is 
> SilentChemObjectBuilder thread safe?
>
> Best wishes,
>
> -Asad
>
>
>
> *From: *John Mayfield 
> *Date: *Tuesday, 30 October 2018 at 12:01
> *To: *cdkuser 
> *Subject: *[Cdk-user] CDK v2.2
>
>
>
> Dear CDK users,
>
>
>
> CDK 2.2 is now released and can be obtained from Maven central or the
> GitHub <https://github.com/cdk/cdk/releases/tag/cdk-2.2> site. The full 
> release
> notes <https://github.com/cdk/cdk/wiki/2.2-Release-Notes> provide details
> on the new features and changes.
>
>
>
> As noted in v2.1/v2.1.1 the AtomContainer2
> <https://github.com/cdk/cdk/wiki/AtomContainer2> is now the default so if
> you didn't try running v2.1/v2.1.1 with the flag
> CdkUseLegacyAtomContainer=false you may see some breakages. As highlighted on
> the wiki page <https://github.com/cdk/cdk/wiki/AtomContainer2> this
> normally just requires addAtom/Bond statements be reordered and using
> ``Objects.equals(container1, container2)`` instead of reference comparison
> (container1 == container2).
>
>
>
> - John
>
>
>


Re: [Cdk-user] CDK v2.2

2018-10-30 Thread John Mayfield
It's more that clone() is an indication of bad style. Unfortunately a lot
of the CDK (particularly the QSAR code) is built on the premise this is
easy and cheap. Also all our *clone()* implementations throw
*CloneNotSupportedException* when they shouldn't; this leads to ugly
try/catch downstream.

It is true there isn't a viable alternative for a deep copy and having an
*AtomContainerManipulator.copy()* for example would help; ideally the copy
constructor should have been a deep copy (it's currently shallow).
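
For anyone unsure about the shallow/deep distinction, here is a toy sketch in plain Java. ToyContainer and ToyAtom are hypothetical classes made up for this example (they are not CDK types, and this is not how the CDK implements copying); the point is only that a shallow copy shares its elements while a deep copy duplicates them, and that a copy method need not declare a checked exception.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a toy container, not the CDK IAtomContainer. */
final class ToyContainer {

    /** A mutable element standing in for an atom. */
    static final class ToyAtom {
        String symbol;

        ToyAtom(String symbol) {
            this.symbol = symbol;
        }
    }

    final List<ToyAtom> atoms = new ArrayList<>();

    ToyContainer() {
    }

    /** Shallow copy constructor: the new list shares the same ToyAtom instances. */
    ToyContainer(ToyContainer other) {
        atoms.addAll(other.atoms);
    }

    /** Deep copy: every atom is duplicated, and no checked exception is declared. */
    ToyContainer deepCopy() {
        ToyContainer copy = new ToyContainer();
        for (ToyAtom atom : atoms) {
            copy.atoms.add(new ToyAtom(atom.symbol));
        }
        return copy;
    }
}
```

Changing an atom in the original is visible through the shallow copy but not through the deep copy.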

John


On Tue, 30 Oct 2018 at 16:50, Christoph Steinbeck <
christoph.steinb...@uni-jena.de> wrote:

>
> > On 30. Oct 2018, at 17:15, John Mayfield 
> wrote:
> >
> > 2) Looks like a bug, but you really really really should not be using
> clone.
>
> You also say that in https://github.com/cdk/cdk/wiki/AtomContainer2,
> which is a great guidance, apart from indicating alternatives cloning for
> the less enlightened :)
>
> The only alternative to cloning is to write code which comes down to a
> custom clone method, or not? i.e. you create a new AtomContainer and copy
> over the objects that you need for your algorithm, by, say, instantiating
> new atoms with identical properties, and adding them to the AC.
>
> Is your argument that clone does a lot more work than one might need in
> one's specific case?
>
> Kind regards,
>
> Chris
>
> —
> Prof. Dr. Christoph Steinbeck
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> Phone Secretariat: +49-3641-948171
> http://cheminf.uni-jena.de
> http://orcid.org/-0001-6966-0814
>
> What is man but that lofty spirit - that sense of enterprise.
> ... Kirk, "I, Mudd," stardate 4513.3..
>
>


Re: [Cdk-user] Question on CDK and a small documentation

2018-10-29 Thread John Mayfield
Please use the *cdk-user* mailing list (cc'd) for such questions in future.
People other than me can help, and if someone has the same question they'll
get to see the answer too (it's also archived).

> Is there any possibilities to turn off the counter ion depiction in CDK and
> can we use a just lines for all types of bonds, rather than wedged bonds
> and dashed style bonds?


Yes and yes. But I'd like a bit more details, was there something wrong
that needs fixing?

> Under Bond Count Descriptor the documentation says that you can use
> parameter “a” for aromatic bond counts, but after CDK 2.0 There is a new
> class called AromaticBondCountdescriptor made available. So now a developer
> cannot use that parameter. Also now a developer can use “q” for quadruple
> bonds, which should be added in the API documentation.


Please feel free to add a patch via GitHub that updates the JavaDoc.

John


On Mon, 29 Oct 2018 at 14:39, Kohulan Rajan 
wrote:

> Dear John,
>
>
>
> Hope you are doing well. I am Kohulan Rajan , currently a Ph.D student
> working Under Prof.C.Steinbeck.
>
>
>
> This is regarding a question regarding CDK and a small documentation in
> the new release.
>
>
>
> Question.
>
> Is there any possibilities to turn off the counter ion
> depiction in CDK, and can we use a just lines for all types of bonds,
> rather than wedged bonds and dashed style bonds?
>
>
>
> Update,
>
>
>
> Under Bond Count Descriptor the documentation says that you can use
> parameter “a” for aromatic bond counts, but after CDK 2.0 There is a new
> class called AromaticBondCountdescriptor made available. So now a developer
> cannot use that parameter.
>
> Also now a developer can use “q” for quadruple bonds, which should be
> added in the API documentation.
>
>
>
>
>
> Awaiting for your reply.
>
>
>
> Kind Regards,
> ~Kohulan.R
>
> ___
>
> Kohulan Rajan
> PhD Student
> Faculty of Chemistry and Geosciences
> Institute of Inorganic and Analytical Chemistry - Cheminformatics and
> Chemometrics
> Friedrich-Schiller-University
> Lessingstraße 8, 07743 Jena , Germany
>
> http://cheminf.uni-jena.de
> Phone : +49 3641 948783
>
> “It is our choices that show what we truly are, far more than our
> abilities.” - Albus Dumbledore
>
>
>


Re: [Cdk-user] Question on CDK and a small documentation

2018-10-29 Thread John Mayfield
I am still curious why; this kind of dubious manipulation is how errors
start propagating. Anyway I can only ask twice :-). These are not
depiction options but you can achieve it by modifying the molecule.

1) I think what you're asking is to get the "parent" molecule. You should
define a list of the salts (counter ions) you want to remove and just
remove them from the IAtomContainer. A common hack is to remove everything
but the largest component (OEChem has the amusingly named
OETheFunctionFormerlyKnownAsStripSalts). In CDK you would use the
ConnectivityChecker, sort by size and take the largest one. Note you may
end up with a non-neutral form if care is not taken, and if there are two
components of the same size you should decide what to do.

2) Remove the stereochemistry,
*container.setStereoElements(Collections.emptyList())*; prior to generating
coordinates. Or if you already have coordinates, iterate over the bonds and
*setStereo(IBond.Stereo.NONE)*. Depending on what you're asking you may
want to set the bond order to single (*setOrder(IBond.Order.SINGLE)*) too;
note this will then mess up valence and so you'd have radicals, so you'd
have to sort that out etc. Again, removing information from the molecule is
not a good idea.
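
The "take the largest component" hack from point 1, including the tie case, can be sketched without any CDK types as below. LargestComponent and its pick method are hypothetical names, and components are represented only by their atom counts; in real code the counts would come from whatever partitioning you use.

```java
import java.util.List;

/** Illustrative only: components are given as atom counts, not CDK containers. */
final class LargestComponent {

    /**
     * Returns the index of the single largest component, or -1 when the input
     * is empty or when several components tie for largest and the caller has
     * to decide what to do.
     */
    static int pick(List<Integer> atomCounts) {
        int best = -1;
        boolean tie = false;
        for (int i = 0; i < atomCounts.size(); i++) {
            if (best < 0 || atomCounts.get(i) > atomCounts.get(best)) {
                best = i;
                tie = false;
            } else if (atomCounts.get(i).equals(atomCounts.get(best))) {
                tie = true;
            }
        }
        return tie ? -1 : best;
    }
}
```

Returning -1 on a tie is just one possible design; the important part is that the tie is detected rather than silently resolved by iteration order.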


Re: [Cdk-user] InChi and Docker [solved]

2018-09-14 Thread John Mayfield
Knew it was the JAR packaging. I did try to send the JAR file I had working
in debian-slim but Gmail blocks JAR attachments.

Interestingly I stopped using Spring Boot (for another reason I can't
remember) a few years ago and use the Tomcat (exec-war) plugin instead.
Jetty also has a plugin which is another option. I think the best option
here is to use the Tomcat DockerHub images and deploy it as a WAR file.

https://github.com/cdk/depict/blob/master/pom.xml#L88-L109


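<plugin>
  <groupId>org.apache.tomcat.maven</groupId>
  <artifactId>tomcat7-maven-plugin</artifactId>
  <version>2.1</version>
  <executions>
    <execution>
      <id>tomcat-run</id>
      <goals>
        <goal>exec-war-only</goal>
      </goals>
      <phase>package</phase>
      <configuration>
        <finalName>${project.name}-${project.version}.jar</finalName>
        <path>/</path>
        <attachArtifactClassifier>boot</attachArtifactClassifier>
        <attachArtifactClassifierType>jar</attachArtifactClassifierType>
      </configuration>
    </execution>
  </executions>
</plugin>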


On Fri, 14 Sep 2018 at 07:12, Egon Willighagen 
wrote:

>
> Oh, very happy to hear that! Are you going to put it up on DockerHub?
>
> Egon
>
> On Thu, Sep 13, 2018 at 10:42 PM Maria Sorokina 
> wrote:
>
>> I managed to make my app function.
>>
>> The problem comes from the way a Spring Boot application packages the jar
>> - internally, the jar is different from plain old Java, so the deployment
>> of artefacts is also different. The Jnati library that allows the wrapping
>> of the InChi factory is not adapted to that special deployment and is
>> simply lost and doesn’t find the artifacts in correct places *if it never
>> ran on the system before* (case of a brand new minimal image).
>> The trick I found to make a Spring Boot application using InChis run in
>> Docker is not very elegant, but functional. In the container, during the
>> compilation, or at the execution of the container, before running the main
>> Spring Boot jar, run the simples possible mini POJO jar containing the
>> InChiGeneratorFactory, that will create the correct environment.
>> Now everything works!
>>
>> This is my Dockerfile:
>>
>> FROM openjdk:8u171-slim
>> EXPOSE 8080
>> VOLUME /tmp
>> ARG JAR_FILE
>> COPY ${JAR_FILE} app.jar
>> COPY inchiPet.jar /
>>
>> RUN java -jar inchiPet.jar
>>
>> ENTRYPOINT 
>> ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/app.jar"]
>>
>> Where « inchiPet.jar » is the tiny regular jar with only a call to an
>> inchiGeneratorFactory, and « app.jar » is my main app.
>>
>> However, I used a bunch of libraries from CDK and until now, only the
>> InChi generator caused this problem within the Spring Boot - Docker
>> ecosystem.
>>
>> Hope this solution could help anybody encountering this kind of problem
>> in the future!
>>
>> Maria Sorokina, PhD
>> Steinbeck Research Group
>> Analytical Chemistry - Cheminformatics and Chemometrics
>> Friedrich-Schiller-University Jena, Germany
>> http://cheminf.uni-jena.de
>>
>
>
> --
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: -0001-7542-0286 
> ImpactStory: https://impactstory.org/u/egonwillighagen
>


Re: [Cdk-user] InChi and Docker

2018-09-11 Thread John Mayfield
Do you have a different machine to test on? Could also be a factor, alpine
gave me a segfault (see attached) but slim works OK.

[john@toaster jni-inchi-docker]$ more Dockerfile
> FROM openjdk:8u171-slim
> COPY smi2inchi.jar .


John

On Tue, 11 Sep 2018 at 16:23, John Mayfield 
wrote:

> I'll have a poke around in an alpine container this evening, does seem odd.
>
> A third option is nested-vm version, we used this in JChemPaint when it
> was an Applet - (see inchi-nestedvm in
> https://github.com/JChemPaint/jchempaint).
>
> On Tue, 11 Sep 2018 at 16:03, Maria Sorokina 
> wrote:
>
>> I am building my jar with Maven, but I don’t think that it is the
>> problem, as I tried tu run the jar on a Centos7 VM having just Java and it
>> worked without any problem.
>> I also desperately tried the dirty solution of downloading the manifest
>> and the jniinchi-1.03_1-LINUX-AMD64.so files the ini-inchi seems to
>> search for and put them where it seems to search them within the container,
>> but, of course it failed.
>>
>> I’ll try to search for a solution, but for now I’ll just do my thing
>> without containerizing this part of my app. If I find a solution, I’ll post
>> it here. Meanwhile, I’m still very open to any suggestion!
>>
>> Kind regards,
>>
>> Maria Sorokina, PhD
>> Steinbeck Research Group
>> Analytical Chemistry - Cheminformatics and Chemometrics
>> Friedrich-Schiller-University Jena, Germany
>> http://cheminf.uni-jena.de
>>
>> Le 11 sept. 2018 à 14:51, John Mayfield  a
>> écrit :
>>
>> Right, alpine was probably a red herring. Looking again I presume you
>> build app.jar yourself? It looks like you've removed the MANIFEST.xml that
>> is needed by JNI InChI to locate the native dependency. How are you
>> building the JAR?
>>
>> Unfortunately v1.03 of InChI is all we have for JNI ATM. My former
>> colleague has been looking at JNA InChI version (
>> https://github.com/dan2097/jna-inchi) but this is very different to the
>> JNI bindings and not a drop in replacement.
>>
>> John
>>
>> On Tue, 11 Sep 2018 at 09:48, Maria Sorokina 
>> wrote:
>>
>>> I tried the ubuntu 16.04 based image with Java 8 added, I get exactly
>>> the same error, only at the InChi generation. I have the feeling that
>>> something is missing in this minimal linux images for the InChi generator
>>> to run correctly, but I cannot find what.
>>>
>>> I also noticed that for an old version of ini-inchi (1.03, and the
>>> latest and only available is 1.6):
>>> npls-db-filler_1  | 2181 [main] INFO
>>> net.sf.jnati.deploy.repository.RemoteRepository  - Searching remote
>>> repository for: jniinchi-1.03_1-LINUX-AMD64 (
>>> http://jnati.sourceforge.net/jnati-repo)
>>>
>>> What am I missing in my configuration ? Knowing that everything works
>>> fine outside of Docker?
>>>
>>> The Dockerfile using Ubuntu:
>>>
>>> FROM ubuntu:16.04
>>> LABEL maintainer="maria.ssorok...@gmail.com"
>>>
>>>
>>> RUN apt-get update && \
>>> apt-get upgrade -y && \
>>> apt-get install -y  software-properties-common && \
>>> add-apt-repository ppa:webupd8team/java -y && \
>>> apt-get update && \
>>> echo oracle-java7-installer shared/accepted-oracle-license-v1-1 select 
>>> true | /usr/bin/debconf-set-selections && \
>>> apt-get install -y oracle-java8-installer && \
>>> apt-get clean
>>>
>>> EXPOSE 8080
>>> VOLUME /tmp
>>> ARG JAR_FILE
>>> COPY ${JAR_FILE} app.jar
>>> ENTRYPOINT 
>>> ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/app.jar"]
>>>
>>> Thank you for your help!
>>>
>>> Kind regards,
>>>
>>>
>>> Maria Sorokina, PhD
>>> Steinbeck Research Group
>>> Analytical Chemistry - Cheminformatics and Chemometrics
>>> Friedrich-Schiller-University Jena, Germany
>>> http://cheminf.uni-jena.de
>>>
>>> Le 11 sept. 2018 à 00:13, John Mayfield  a
>>> écrit :
>>>
>>> Unlikely to work in alpine, have you tried ubuntu/debian slim?
>>>
>>> John
>>>
>>> On Mon, 10 Sep 2018 at 18:16, Maria Sorokina 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I developed an app using CDK, and in particular it’s InChi Generator.
>>>> It runs pe

Re: [Cdk-user] can't get total exact mass after atom typing

2018-09-17 Thread John Mayfield
No but the isotopes are assigned by the isotope factory not the atom types.
Always been this way.

Isotopes.getInstance().configure(mol);


However, it depends on what you want: *AtomContainerManipulator.getMolecularWeight(mol);*
does not need atom typing or isotope configuring. It will be a different number
to total exact mass though (see previous discussions).
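
To show why the two numbers differ, here is the arithmetic for a CxHy hydrocarbon as a plain-Java sketch. It makes no CDK calls; BenzeneMasses is a hypothetical class and the masses for C and H are simply hard-coded (abridged standard atomic weights for the average, most abundant isotope masses for the monoisotopic value).

```java
/** Illustrative arithmetic only: no CDK calls, masses hard-coded for C and H. */
final class BenzeneMasses {

    // Abridged standard atomic weights (natural isotope mixture).
    static final double C_AVERAGE = 12.011;
    static final double H_AVERAGE = 1.008;

    // Masses of the most abundant isotopes: 12C (exact by definition) and 1H.
    static final double C12 = 12.0;
    static final double H1 = 1.00782503223;

    /** Average molecular weight of a hydrocarbon with nC carbons and nH hydrogens. */
    static double molecularWeight(int nC, int nH) {
        return nC * C_AVERAGE + nH * H_AVERAGE;
    }

    /** Monoisotopic ("exact") mass of the same hydrocarbon. */
    static double monoisotopicMass(int nC, int nH) {
        return nC * C12 + nH * H1;
    }
}
```

For benzene (C6H6) this gives a molecular weight of about 78.11 and a monoisotopic mass of about 78.047, so the two values should not be expected to agree.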

J

On Sun, 16 Sep 2018 at 21:21, Rajarshi Guha  wrote:

> Has atom typing changed with CDK 2.2?  The following code fails with a NPE
> in getTotalExactMass
>
> String smiles = "c1c1";
> SmilesParser sp = new
> SmilesParser(SilentChemObjectBuilder.getInstance());
> IAtomContainer mol = sp.parseSmiles(smiles);
>
>
> AtomContainerManipulator.percieveAtomTypesAndConfigureUnsetProperties(mol);
>
> System.out.println("AtomContainerManipulator.getTotalExactMass() =
> " +
> AtomContainerManipulator.getTotalExactMass(mol));
>
> --
> Rajarshi Guha | http://blog.rguha.net | @rguha 
>


Re: [Cdk-user] MCS detection

2018-09-17 Thread John Mayfield
A couple of options,

a) Use the newer standalone version of SMSD, this is why the package is
deprecated. We did try to integrate the newer version but it proved
difficult and there were some test regressions. You can still use the
deprecated one.
b) Edmund Duesbury has some updated MCS algorithms based on CDK for his PhD.

John

On Sun, 16 Sep 2018 at 22:40, Rajarshi Guha  wrote:

> Looking at the 2.0 docs indicates that the SMSD classes for MCS detection
> have been deprecated.
>
> What is the recommended way to identify MCS's in 2.0?
>
> --
> Rajarshi Guha | http://blog.rguha.net | @rguha 
>


Re: [Cdk-user] CDK2.0 with python

2019-04-08 Thread John Mayfield
I thought there was a Cinfony version using CDK 2+... but can't find it now
or even the repo as Google Code no longer exists. CC'ing Noel.

On Mon, 8 Apr 2019 at 07:52, Ganapati Natarajan  wrote:

> Dear all,
>
> I wish to use the CDK 2.0 from python. I noticed on the cinfony website
> that the CDK 1.4 version can be used with cinfony. Please advise on how to
> use CDK2.0 with python.
>
> Thanks in advance,
>
> Ganapati


Re: [Cdk-user] CDK2.0 with python

2019-04-08 Thread John Mayfield
Just asked Noel and the main code base is up to date with CDK 2+; it's just
the doc/release which is out of date. If you pull the *master* branch you
should be able to use CDK 2+ no problem:
https://github.com/cinfony/cinfony

On Mon, 8 Apr 2019 at 09:20, Ganapati Natarajan  wrote:

> Thanks.
>
> Ganapati
>
> On Mon, 8 Apr 2019 at 12:59, John Mayfield 
> wrote:
>
>> I thought there was a Cinfony version suing CDK 2+... but can't find it
>> now or even the repo as google code no longer exists. CC'ing Noel.
>>
>> On Mon, 8 Apr 2019 at 07:52, Ganapati Natarajan 
>> wrote:
>>
>>> Dear all,
>>>
>>> I wish to use the CDK 2.0 from python. I noticed on the cinfony website
>>> that the CDK 1.4 version can be used with cinfony. Please advise on how to
>>> use CDK2.0 with python.
>>>
>>> Thanks in advance,
>>>
>>> Ganapati


Re: [Cdk-user] How to add hydrogens with 3D coordinate?

2019-02-23 Thread John Mayfield
Unfortunately there is no easy way to do this ATM other than regenerating
3D coordinates, and that support is pretty limited in CDK. Of course the
question is then do you want minimised hydrogens or any old reasonably
valid positions. As a first approximation you can set the hydrogen
coordinates to the same as the atom they are attached to.

However I think it's a reasonable thing to do if a 3D (or 2D) molecule
comes in and will add something like OESet3DHydrogenGeom.
Please can you add a GitHub issue for this.

Thanks,
John

On Sat, 16 Feb 2019 at 12:38, love_software0 via Cdk-user <
cdk-user@lists.sourceforge.net> wrote:

> Dear all,
>
> The Chemistry Development Kit(CDK) is very useful for my current work.
> However, I meet a problem: when I read and add hydrogens on the sybyl mol2
> format 3D molecule by using CDK, it seems the added hydrogens have no
> coordinates. I use the code as below:
>
> "AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(mol);
> CDKHydrogenAdder.getInstance(mol.getBuilder()).addImplicitHydrogens(mol);
> AtomContainerManipulator.convertImplicitToExplicitHydrogens(mol);
> "
> The added hydrogens with coordinates is very important for my current
> project. I had checked the API of CDK, but I can not find the suitable way
> to solve this problem.
>
> So could anyone gives me some codes or suggestions on adding hydrogens
> with 3D coordinates on molecules?
>
> Thanks for your help.
>
> Sincerely,
>
> Qifeng
>


Re: [Cdk-user] Tow problems about the calculation of molecular weight

2019-02-14 Thread John Mayfield
As an aside we are thinking of simplifying this to a single API point
*getMass(mol, opt)* where the option lets you choose what you want.

The existing API points will still be valid but defer to this method.

John



Re: [Cdk-user] Tow problems about the calculation of molecular weight

2019-02-14 Thread John Mayfield
Please note the correct way to get Molecular Weight is:

AtomContainerManipulator.getMolecularWeight(mol);


We are aware of the issue with the MolecularWeight descriptor - please see
the issue tracker.

On Thu, 14 Feb 2019 at 08:44, Stesycki, Manuel 
wrote:

> Dear love_software0,
>
> i am running CDK Version 2.2.
>
> As a test structure i used Benzene (CAS 71-43-2)
>
> 1) I calculate the mass by using:
> *double mw = AtomContainerManipulator.getMolecularWeight(mol);*
>
> 2) To calculate the monoIsotopicMass i use:
>
> *IMolecularFormula form =
> MolecularFormulaManipulator.getMolecularFormula(mol);*
> *double mw = MolecularFormulaManipulator.getTotalExactMass(form);*
>
> The methods from 1) and 2) calculate the following results:
>
> 1) 78.11205990368276
> 2)  78.046950192
> your code) 78.04695024
>
> I attached 2 screenshots. One is from SciFinder, which states an mw of 78.11.
> The other one is from ChemDraw V18. There the exact mass is equal to your
> result and the molecular weight (Mol.Wt.) matches the SciFinder value.
>
> Best regards,
>Manuel Stesycki
>
> IT
>0208 / 306-2146
>Physikbau, Büro 117
>stesy...@mpi-muelheim.mpg.de
>
> Max-Planck-Institut für Kohlenforschung
>Kaiser-Wilhelm-Platz 1
>D-45470 Mülheim an der Ruhr
>http://www.kofo.mpg.de/de


Re: [Cdk-user] Enantiomer generator?

2019-05-08 Thread John Mayfield
No, there isn't, and I'm struggling to think of a use-case, so I'll first ask
what your actual end goal is, as there is likely a more efficient approach.
For example, testing if two compounds are enantiomers does not require
enumeration.

But if you really want to enumerate, I would just flip them as you
suggested and think about whether you need to worry about sterically hindered
cases. If you care enough you can handle the common ones, e.g. bicyclo, with a
simple check. IIRC Greg had some routines in RDKit to enumerate and then filter
them out by doing some 3D geometry calc; personally I feel this is too
expensive for the large numbers that can be generated, but horses for courses.
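
To illustrate why the enantiomer test needs no enumeration, here is a rough plain-Java sketch. It is not CDK code and rests on strong assumptions: both molecules have the same constitution, their atoms have already been matched index by index, and each tetrahedral centre has been reduced to a parity of +1 or -1 in a common reference frame (0 means no stereo). The class and method names are made up. It ignores meso compounds, symmetry-equivalent atom mappings and double-bond stereo, so treat it as the idea only.

```java
public class EnantiomerCheck {

    /**
     * Returns true when every defined centre in a is inverted in b.
     * Parities: +1 / -1 for a defined tetrahedral centre, 0 for none.
     */
    public static boolean areMirrorImages(int[] a, int[] b) {
        if (a.length != b.length) {
            return false;
        }
        int centres = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == 0 && b[i] == 0) {
                continue; // not a stereocentre in either molecule
            }
            if (a[i] != -b[i]) {
                return false; // same parity, or defined on one side only
            }
            centres++;
        }
        // Without at least one centre the two inputs are simply identical.
        return centres > 0;
    }

    public static void main(String[] args) {
        int[] mol = {1, 0, -1};
        int[] allFlipped = {-1, 0, 1};
        int[] oneFlipped = {-1, 0, -1};
        System.out.println(areMirrorImages(mol, allFlipped));
        System.out.println(areMirrorImages(mol, oneFlipped));
    }
}
```

The check is a single pass over the centres, whereas enumeration grows as 2^n in the number of centres.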

John

On Wed, 8 May 2019 at 18:44, Daniel Katzel  wrote:

> Hello all
>   Does CDK have a way to generate enantiomers of a given IAtomContainer? I
> guess one could go to each stereo center and flip up and down bonds but I'm
> sure that get hairy very quickly if some are connected to multiple chiral
> atoms.
>
> Thanks


Re: [Cdk-user] Stereochemistry is disregarded when creating SMILES from MOL-File

2019-07-16 Thread John Mayfield
Hi Sebastian,

> I am using CDK version 1.4.17

That is a very old version and does not convert stereochemistry correctly.
Essentially there weren't data structures to represent it, so it was stored
differently for 0D (e.g. SMILES) vs 2D vs 3D. This was all fixed about 7
years ago :-). Latest release is 2.2 BTW -
https://github.com/cdk/cdk/releases

John

On Tue, 16 Jul 2019 at 15:04, Wehner, Sebastian via Cdk-user <
cdk-user@lists.sourceforge.net> wrote:

> Hi,
>
>
>
> I could use your help! I am using CDK version 1.4.17 and want to build a
> SMILES string from a mol-file (SDF from lipid maps). The molecule has two
> isomers, but this information should be included in the mol-file, should it
> not?
>
>
>
> Anyways, I pass the mol-file as ByteArrayInputStream into MDLV2000Reader and
> then create a new AtomContainer from this. Iterating over each atom of
> the AtomContainer, using a CDKAtomTypeMatcher to get the IAtomType of the
> atom which I then use to configure this atom with via AtomTypeManipulator
> .configure().
>
>
>
> Finally adding implicit hydrogens to the AtomContainer, creating a
> SmilesGenerator and returning the smiles of the AtomContainer with
> smilesGenerator.createSMILES(AtomContainer).
>
>
>
> However this produces a SMILES of the molecule which disregards the
> stereochemistry. The documentation states, that stereochemistry is taken
> into account (
> http://cdk.github.io/cdk/1.4/docs/api/org/openscience/cdk/smiles/SmilesGenerator.html).
> Am I missing something?
>
>
>
> Would be great if someone could help! Added the link to lipid maps for the
> example: http://www.lipidmaps.org/data/LMSDRecord.php?LMID=LMGL02010378.
>
>
>
>
>
> Best regards,
>
> Sebastian Wehner
>
>


Re: [Cdk-user] controlling hydrogen display with DepictionGenerator

2019-08-02 Thread John Mayfield
>
> 1. explicit hydrogens are also not rendered


Er... I showed the example with the explicit hydrogens being rendered.
Let's clarify, what does "explicit hydrogen" mean to you?


> 2. Carbon symbols are displayed (as if the withCarbonSymbols() method had
> been called).

The "Visibility" option controls this, but the other answer is just don't
set to zero on carbons?

On Fri, 2 Aug 2019 at 12:09, Egon Willighagen 
wrote:

>
>
> On Fri, Aug 2, 2019 at 11:28 AM John Mayfield 
> wrote:
>
>> Other option You can also use the old *BasicAtomGenerator* which never
>> puts hydrogens on anything...
>>
>
> Documentation on the old generator stack can be found in this copy of the
> Groovy CDK book, "Depiction" chapter:
>
>
> https://figshare.com/articles/Edition_1_4_1_0_of_Groovy_Cheminformatics_with_the_Chemistry_Development_Kit/2057790
>
> Egon
>
> --
> Hi, do you like citation networks? Already 51% of all citations are
> available <https://i4oc.org/> available for innovative new uses
> <https://twitter.com/hashtag/acs2ioc>. Join me in asking the American
> Chemical Society to join the Initiative for Open Citations too
> <https://www.change.org/p/asking-the-american-chemical-society-to-join-the-initiative-for-open-citations>.
>  SpringerNature,
> the RSC and many others already did <https://i4oc.org/#publishers>.
>
> -
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: -0001-7542-0286 <http://orcid.org/-0001-7542-0286>
> ImpactStory: https://impactstory.org/u/egonwillighagen
>


Re: [Cdk-user] controlling hydrogen display with DepictionGenerator

2019-08-02 Thread John Mayfield
I think you probably want this:

new DepictionGenerator().withParam(StandardGenerator.Visibility.class,
    new SymbolVisibility() {
        @Override
        public boolean visible(IAtom atom, List<IBond> neighbors, RendererModel model) {
            // hide the symbol on carbons, show it on every other element
            return atom.getAtomicNumber() != 6;
        }
    });
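
If terminal carbons need to be treated differently from chain carbons, the classification Tim describes in the quoted text below (a carbon with only one bond to a non-hydrogen atom) can be sketched in plain Java. This deliberately avoids CDK types: the molecule is just an array of atomic numbers plus neighbour index lists, and the class and method names are invented for the example. What you then do with the answer (show the symbol, keep or zero the hydrogen count) is up to you.

```java
public class TerminalCarbon {

    /**
     * @param atom          index of the atom to classify
     * @param atomicNumbers atomic number of each atom, by index
     * @param neighbours    indices of the bonded atoms, by atom index
     * @return true for a carbon with fewer than two heavy-atom neighbours
     */
    public static boolean isTerminalCarbon(int atom, int[] atomicNumbers, int[][] neighbours) {
        if (atomicNumbers[atom] != 6) {
            return false; // only carbons are of interest here
        }
        int heavy = 0;
        for (int nbr : neighbours[atom]) {
            if (atomicNumbers[nbr] > 1) {
                heavy++; // anything heavier than hydrogen counts
            }
        }
        // Note: an isolated carbon (no heavy neighbours) also counts as terminal.
        return heavy < 2;
    }

    public static void main(String[] args) {
        // Ethanol as a heavy-atom graph: C0 - C1 - O2
        int[] atomicNumbers = {6, 6, 8};
        int[][] neighbours = {{1}, {0, 2}, {1}};
        for (int i = 0; i < atomicNumbers.length; i++) {
            System.out.println(i + " terminal carbon: "
                    + isTerminalCarbon(i, atomicNumbers, neighbours));
        }
    }
}
```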


On Fri, 2 Aug 2019 at 16:26, Tim Dudgeon  wrote:

> On 02/08/2019 15:42, John Mayfield wrote:
>
> 1. explicit hydrogens are also not rendered
>
>
> Er... I showed the example with the explicit hydrogens being rendered.
> Let's clarify, what does "explicit hydrogen" mean to you?
>
> Sorry, my mistake. The explicit H is displayed. I was getting mixed up
> with too many examples!
>
>
>
>
>> 2. Carbon symbols are displayed (as if the withCarbonSymbols() method had
>> been called).
>
> The "Visibility" option controls this, but the other answer is just don't
> set to zero on carbons?
>
> Is there an example of how to use this "Visibility" option? Doesn't seem
> to be an option of DepictionGenerator.
>
> Not setting to zero on carbons is not so straight forward as you would
> still want this on terminal carbons.
> I suppose these would be carbons with only one bond to a non-hydrogen
> atom. Does this look right?
>
> for (IAtom atom : mol.atoms()) {
>     if (atom.getAtomicNumber() == 6) {
>         // count the number of connections that are heavy atoms
>         int numHeavy = 0;
>         for (IBond bond : atom.bonds()) {
>             IAtom other = bond.getOther(atom);
>             if (other.getAtomicNumber() > 1) {
>                 numHeavy++;
>             }
>         }
>         // if only one then this is a terminal carbon so we need to leave the Hs in place
>         if (numHeavy < 2) {
>             atom.setImplicitHydrogenCount(0);
>         }
>     } else { // non-carbon atoms
>         atom.setImplicitHydrogenCount(0);
>     }
> }
>
> Tim
>
>
> On Fri, 2 Aug 2019 at 12:09, Egon Willighagen 
> wrote:
>
>>
>>
>> On Fri, Aug 2, 2019 at 11:28 AM John Mayfield <
>> john.wilkinson...@gmail.com> wrote:
>>
>>> Other option You can also use the old *BasicAtomGenerator* which never
>>> puts hydrogens on anything...
>>>
>>
>> Documentation on the old generator stack can be found in this copy of the
>> Groovy CDK book, "Depiction" chapter:
>>
>>
>> https://figshare.com/articles/Edition_1_4_1_0_of_Groovy_Cheminformatics_with_the_Chemistry_Development_Kit/2057790
>>
>> Egon
>>
>> --
>> Hi, do you like citation networks? Already 51% of all citations are
>> available <https://i4oc.org/> available for innovative new uses
>> <https://twitter.com/hashtag/acs2ioc>. Join me in asking the American
>> Chemical Society to join the Initiative for Open Citations too
>> <https://www.change.org/p/asking-the-american-chemical-society-to-join-the-initiative-for-open-citations>.
>>  SpringerNature,
>> the RSC and many others already did <https://i4oc.org/#publishers>.
>>
>> -
>> E.L. Willighagen
>> Department of Bioinformatics - BiGCaT
>> Maastricht University (http://www.bigcat.unimaas.nl/)
>> Homepage: http://egonw.github.com/
>> Blog: http://chem-bla-ics.blogspot.com/
>> PubList: https://www.zotero.org/egonw
>> ORCID: -0001-7542-0286 <http://orcid.org/-0001-7542-0286>
>> ImpactStory: https://impactstory.org/u/egonwillighagen
>>
>


Re: [Cdk-user] MCS and alignment

2019-08-15 Thread John Mayfield
In that case include the cdk-legacy module and use that version of SMSD.

Here's the GIST I previously wrote to align to a subgraph:

https://gist.github.com/johnmay/12797a89f4186bc7da881f1f4a706671

On Wed, 14 Aug 2019 at 18:21, Tim Dudgeon  wrote:

> Hi John,
>
> Thanks for that info. I did look into SMSD, but found some problems using
> it with the latest CDK [1,2].
>
> Also, the maven version has not been updated since Jun 2016 so I wonder if
> it's still active?
>
> Let me know if you want help with the utility function you mention. Happy
> to help, but not sure right now how to approach it.
>
> CDK rendering is so beautiful!
>
> Tim
>
> 1. https://github.com/asad/SMSD/issues/9
> 2. https://github.com/asad/SMSD/issues/10
>
>
> On 14/08/2019 16:05, John Mayfield wrote:
>
> 2. Use SMSD or Edmund Duesbury's MCS code. SMSD is a separate library now
> as we couldn't smoothly integrate the updates and had tests failing.
>
> 3. You can fix atoms in place with the *Set afix* option of the
> layout. So you copy the coords from MCS you got, fix these in place whilst
> you lay out the rest.
>
> One day I will get around to adding a utility function for this but there
> is some example code on the mailing list for No. 3, look for emails from
> someone at Dotmatics albeit with a substructure match.
>
> On Tue, 13 Aug 2019 at 17:01, Tim Dudgeon  wrote:
>
>> I'm wanting to depict molecules that have been aligned to the MCS of a
>> query molecule, and highlight the MCS.
>> Are there any examples of this? Seems like some of the relevant CDK code
>> is deprecated, but it's not clear what should be used.
>>
>> As an example:
>>
>> 1. I have a query molecule and a target molecule and they share
>> significant MCS, often with the query being a complete subgraph of the
>> target.
>>
>> 2. I identify the MCS
>>
>> 3. Using that MCS I layout the target molecule (e.g. generate 2D
>> coordinates) fixing the parts that are in the MCS to the coordinates
>> from the query structure.
>>
>> 4. I then depict that laid-out molecule, colouring the MCS.
>>
>> I know how to do #4 - it's steps #2 and #3 that I'm unsure about.
>>
>>
>>


Re: [Cdk-user] depiction without stereochemistry

2019-08-20 Thread John Mayfield
Are you using the very latest release? I think it's an oversight. Try

bond.setDisplay(IBond.Display.Solid);


On Tue, 20 Aug 2019 at 16:40, Tim Dudgeon  wrote:

> Hi, I'm wanting to depict a molecule without stereochemistry, but the
> DepictionGenerator stubbornly seems to add it back.
> What is the way to do this? In the example below a squiggle bond is
> displayed even though all bonds have been set to Stereo.NONE and I have
> already laid out the molecule.
>
> void "no stereo"() {
>
> IAtomContainer mol = ChemUtils.readSmiles("c1ccc(N=C2SCCN2c2c2)cc1")
> StructureDiagramGenerator sdg = new StructureDiagramGenerator();
> sdg.setMolecule(mol);
> sdg.generateCoordinates(new Vector2d(0, 1));
> mol = sdg.getMolecule();
> for (IBond bond : mol.bonds()) {
> bond.setStereo(IBond.Stereo.NONE);
> }
> DepictionGenerator g = new DepictionGenerator()
> Depiction d = g.depict(mol)
>
> when:
> def img = d.toImg()
> ByteArrayOutputStream out = new ByteArrayOutputStream();
> ImageIO.write(img, "png", out);
> out.close();
> byte[] png = out.toByteArray();
> Files.write(java.nio.file.Paths.get("/tmp/myimage5.png"), png)
>
> then:
> png != null
> png.length > 0
> }
>
>


Re: [Cdk-user] Bug in MCS determination?

2019-08-28 Thread John Mayfield
Okay, the code is likely adding atoms/bonds in the wrong order; will fix it.

On Tue, 27 Aug 2019 at 18:18, Tim Dudgeon  wrote:

> Hi John,
>
> Yes, turning off AtomContainer2 avoids the error.
>
>
> On 27/08/2019 16:31, John Mayfield wrote:
>
> Hmm odd, in legacy so expected but tests seem okay. Can you try turning
> off AtomContainer2, https://github.com/cdk/cdk/wiki/AtomContainer2
>
> On Tue, 27 Aug 2019 at 14:19, Tim Dudgeon  wrote:
>
>> Hi folks,
>>
>> I'm getting a NPE from AtomAtomMapping.getCommonFragmentAsSMILES() in
>> certain cases.
>> An example is below - the two structures differ only for a Cl <-> Br
>> change.
>>
>> This is using the org.openscience.smsd.AtomAtomMapping,
>> org.openscience.smsd.Isomorphism and
>> org.openscience.smsd.tools.ExtAtomContainerManipulator classes.
>> Not sure if those guys are active on this list?
>>
>>
>> IAtomContainer query =  smilesParser.parseSmiles('BrC1CCC(Cc2c2)C1')
>> IAtomContainer target = smilesParser.parseSmiles('ClC1CCC(Cc2c2)C1')
>>
>> StructureDiagramGenerator sdg = new StructureDiagramGenerator()
>> sdg.generateCoordinates(query)
>>
>> ExtAtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(query)
>> ExtAtomContainerManipulator.aromatizeMolecule(query)
>>
>> ExtAtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(target)
>> ExtAtomContainerManipulator.aromatizeMolecule(target)
>>
>> Isomorphism comparison = new Isomorphism(query, target, Algorithm.DEFAULT, 
>> true, false, false)
>> AtomAtomMapping mapping = comparison.getFirstAtomMapping()
>> String mcsSmiles = mapping.getCommonFragmentAsSMILES()
>>
>> The error I get is:
>>
>> java.lang.NullPointerException
>> at
>> org.openscience.cdk.silent.AtomContainer2.getAtomRefUnsafe(AtomContainer2.java:172)
>> at
>> org.openscience.cdk.silent.AtomContainer2.getBond(AtomContainer2.java:612)
>> at
>> org.openscience.smsd.AtomAtomMapping.getCommonFragment(AtomAtomMapping.java:332)
>> at
>> org.openscience.smsd.AtomAtomMapping.getCommonFragmentAsSMILES(AtomAtomMapping.java:371)
>> at
>> org.squonk.fragnet.depict.ChemUtilsSpec.alignMolecule2(ChemUtilsSpec.groovy:81)
>>
>>
>>
>>


Re: [Cdk-user] Bug in MCS determination?

2019-08-28 Thread John Mayfield
Which SMSD are you using? I don't have control over the downstream one.

On Wed, 28 Aug 2019 at 10:16, Tim Dudgeon  wrote:

> Unfortunately other parts of my code are using new features such as
> IAtomContainer.atoms() so whilst switching to the legacy IAtomContainer
> avoids the alignment problem it looks to be a no go as a solution.
>
> Would switching to the legacy classes in the org.openscience.cdk.smsd
> package be an option or do I just need to wait for the problem to be fixed?
>
>
> On 28/08/2019 08:15, John Mayfield wrote:
>
> Okay code is likely adding atoms/bonds in the wrong order, will fix it.
>
> On Tue, 27 Aug 2019 at 18:18, Tim Dudgeon  wrote:
>
>> Hi John,
>>
>> Yes, turning off AtomContainer2 avoids the error.
>>
>>
>> On 27/08/2019 16:31, John Mayfield wrote:
>>
>> Hmm odd, in legacy so expected but tests seem okay. Can you try turning
>> off AtomContainer2, https://github.com/cdk/cdk/wiki/AtomContainer2
>>
>> On Tue, 27 Aug 2019 at 14:19, Tim Dudgeon  wrote:
>>
>>> Hi folks,
>>>
>>> I'm getting a NPE from AtomAtomMapping.getCommonFragmentAsSMILES() in
>>> certain cases.
>>> An example is below - the two structures differ only for a Cl <-> Br
>>> change.
>>>
>>> This is using the org.openscience.smsd.AtomAtomMapping,
>>> org.openscience.smsd.Isomorphism and
>>> org.openscience.smsd.tools.ExtAtomContainerManipulator classes.
>>> Not sure if those guys are active on this list?
>>>
>>>
>>> IAtomContainer query =  smilesParser.parseSmiles('BrC1CCC(Cc2c2)C1')
>>> IAtomContainer target = smilesParser.parseSmiles('ClC1CCC(Cc2c2)C1')
>>>
>>> StructureDiagramGenerator sdg = new StructureDiagramGenerator()
>>> sdg.generateCoordinates(query)
>>>
>>> ExtAtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(query)
>>> ExtAtomContainerManipulator.aromatizeMolecule(query)
>>>
>>> ExtAtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(target)
>>> ExtAtomContainerManipulator.aromatizeMolecule(target)
>>>
>>> Isomorphism comparison = new Isomorphism(query, target, Algorithm.DEFAULT, 
>>> true, false, false)
>>> AtomAtomMapping mapping = comparison.getFirstAtomMapping()
>>> String mcsSmiles = mapping.getCommonFragmentAsSMILES()
>>>
>>> The error I get is:
>>>
>>> java.lang.NullPointerException
>>> at
>>> org.openscience.cdk.silent.AtomContainer2.getAtomRefUnsafe(AtomContainer2.java:172)
>>> at
>>> org.openscience.cdk.silent.AtomContainer2.getBond(AtomContainer2.java:612)
>>> at
>>> org.openscience.smsd.AtomAtomMapping.getCommonFragment(AtomAtomMapping.java:332)
>>> at
>>> org.openscience.smsd.AtomAtomMapping.getCommonFragmentAsSMILES(AtomAtomMapping.java:371)
>>> at
>>> org.squonk.fragnet.depict.ChemUtilsSpec.alignMolecule2(ChemUtilsSpec.groovy:81)
>>>
>>>
>>>
>>>


Re: [Cdk-user] depiction without stereochemistry

2019-08-21 Thread John Mayfield
Technically 2.3 isn't officially released yet as I've not had time to do the
patch notes ;-), but feel free to use it.

You can just remove the stereo information? The wedge/hatch info is display
only; the actual information is stored as a list of stereo elements. You
can clear this:

> mol.setStereoElements(new ArrayList<>());


Again, if you've already generated a layout while the stereo was there you will
need to clear the bond display. This is actually one of the reasons I added
the "bond display": to emphasise it really is only a display option and the
stereo info isn't stored there.

On Wed, 21 Aug 2019 at 14:51, Tim Dudgeon  wrote:

> I'm using the 2.3 release.
>
> Using bond.setDisplay(IBond.Display.Solid) does work, but there's a big
> gotcha. If the molecule still needs to be laid out (e.g. no 2D
> coordinates) then when the DepictionGenerator does the layout the bond's
> display property gets reset to the chiral representation.
>
> The workaround is to make sure that 2D coordinates are present (e.g. using
> StructureDiagramGenerator) and then set the display property.
>
> Tim
>
>
> On 20/08/2019 17:31, John Mayfield wrote:
>
> Are you using the very latest release? I think it's an oversight. Try
>
> bond.setDisplay(IBond.Display.Solid);
>
>
> On Tue, 20 Aug 2019 at 16:40, Tim Dudgeon  wrote:
>
>> Hi, I'm wanting to depict a molecule without stereochemistry, but the
>> DepictionGenerator stubbornly seems to add it back.
>> What is the way to do this? In the example below a squiggle bond is
>> displayed even though all bonds have been set to Stereo.NONE and I have
>> already laid out the molecule.
>>
>> void "no stereo"() {
>>
>> IAtomContainer mol = ChemUtils.readSmiles("c1ccc(N=C2SCCN2c2c2)cc1")
>> StructureDiagramGenerator sdg = new StructureDiagramGenerator();
>> sdg.setMolecule(mol);
>> sdg.generateCoordinates(new Vector2d(0, 1));
>> mol = sdg.getMolecule();
>> for (IBond bond : mol.bonds()) {
>> bond.setStereo(IBond.Stereo.NONE);
>> }
>> DepictionGenerator g = new DepictionGenerator()
>> Depiction d = g.depict(mol)
>>
>> when:
>> def img = d.toImg()
>> ByteArrayOutputStream out = new ByteArrayOutputStream();
>> ImageIO.write(img, "png", out);
>> out.close();
>> byte[] png = out.toByteArray();
>> Files.write(java.nio.file.Paths.get("/tmp/myimage5.png"), png)
>>
>> then:
>> png != null
>> png.length > 0
>> }
>>
>>


Re: [Cdk-user] Stereochemistry resolution

2019-07-30 Thread John Mayfield
No, it wasn't possible. They used different data structures so you could go
to/from SMILES and to/from 2D Mol/CML but not from Mol to Smi or Smi to
Mol. Please don't use 1.1.5 (I presume as there is no 1.15 version) it's
10+ years old.

John

On Tue, 30 Jul 2019 at 08:49, Wehner, Sebastian via Cdk-user <
cdk-user@lists.sourceforge.net> wrote:

> Hello,
>
>
>
> I am trying to produce a smiles string from a molfile, via AtomContainer
> as an intermediate. Is it possible to properly resolve stereochemistry in
> CDK version 1.15?
>
>
>
> In some detail:
>
> I tried some approaches with known stereo-isotopes for which I both had
> the molfile (SDF). The molfile was parsed via MDLV2000Reader to an
> AtomContainer which in turn was passed to the SmilesGenerator for parsing
> to smiles. Sadly both molfiles resulted in the same smiles string.
>
> I read the paper for CDK v2.0 (
> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0220-4)
> which states that in this version the stereochemistry is standardized. But
> it does not convey whether it wasn’t possible before. So is there a way to
> resolve stereochemistry in v1.15? And if so, can anyone provide some code
> examples?
>
>
>
> Hope someone can shed some light on this,
>
> Best Regards
>
> Sebastian
>
>


Re: [Cdk-user] Stereochemistry resolution

2019-07-30 Thread John Mayfield
The improvements came, I believe, in 1.5.4 and later:

https://github.com/cdk/cdk/wiki/1.5.4-Release-Notes#stereochemistry-

On Tue, 30 Jul 2019 at 14:19, John Mayfield 
wrote:

> Ah okay, in some versions of 1.5 it's supported. Which subversion are you
> using?
>
> On Tue, 30 Jul 2019 at 09:34, Wehner, Sebastian <
> sebastian.weh...@bruker.com> wrote:
>
>> Thanks for the clarification. And sorry, it was a typo I meant version
>> 1.5. I assume your explanation still holds true for that as well?
>>
>>
>>
>> Sebastian
>>
>>
>>
>> *From:* John Mayfield 
>> *Sent:* Tuesday, July 30, 2019 10:27 AM
>> *To:* Wehner, Sebastian 
>> *Cc:* cdk-user@lists.sourceforge.net
>> *Subject:* Re: [Cdk-user] Stereochemistry resolution
>>
>>
>>
>> No, it wasn't possible. They used different data structures so you could
>> go to/from SMILES and to/from 2D Mol/CML but not from Mol to Smi or Smi to
>> Mol. Please don't use 1.1.5 (I presume as there is no 1.15 version) it's
>> 10+ years old.
>>
>>
>>
>> John
>>
>>
>>
>> On Tue, 30 Jul 2019 at 08:49, Wehner, Sebastian via Cdk-user <
>> cdk-user@lists.sourceforge.net> wrote:
>>
>> Hello,
>>
>>
>>
>> I am trying to produce a smiles string from a molfile, via AtomContainer
>> as an intermediate. Is it possible to properly resolve stereochemistry in
>> CDK version 1.15?
>>
>>
>>
>> In some detail:
>>
>> I tried some approaches with known stereo-isotopes for which I both had
>> the molfile (SDF). The molfile was parsed via MDLV2000Reader to an
>> AtomContainer which in turn was passed to the SmilesGenerator for parsing
>> to smiles. Sadly both molfiles resulted in the same smiles string.
>>
>> I read the paper for CDK v2.0 (
>> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0220-4)
>> which states that in this version the stereochemistry is standardized. But
>> it does not convey whether it wasn’t possible before. So is there a way to
>> resolve stereochemistry in v1.15? And if so, can anyone provide some code
>> examples?
>>
>>
>>
>> Hope someone can shed some light on this,
>>
>> Best Regards
>>
>> Sebastian
>>
>>
>>


Re: [Cdk-user] Stereochemistry resolution

2019-07-30 Thread John Mayfield
Ah okay, in some versions of 1.5 it's supported. Which subversion are you
using?

On Tue, 30 Jul 2019 at 09:34, Wehner, Sebastian 
wrote:

> Thanks for the clarification. And sorry, it was a typo I meant version
> 1.5. I assume your explanation still holds true for that as well?
>
>
>
> Sebastian
>
>
>
> *From:* John Mayfield 
> *Sent:* Tuesday, July 30, 2019 10:27 AM
> *To:* Wehner, Sebastian 
> *Cc:* cdk-user@lists.sourceforge.net
> *Subject:* Re: [Cdk-user] Stereochemistry resolution
>
>
>
> No, it wasn't possible. They used different data structures so you could
> go to/from SMILES and to/from 2D Mol/CML but not from Mol to Smi or Smi to
> Mol. Please don't use 1.1.5 (I presume as there is no 1.15 version) it's
> 10+ years old.
>
>
>
> John
>
>
>
> On Tue, 30 Jul 2019 at 08:49, Wehner, Sebastian via Cdk-user <
> cdk-user@lists.sourceforge.net> wrote:
>
> Hello,
>
>
>
> I am trying to produce a smiles string from a molfile, via AtomContainer
> as an intermediate. Is it possible to properly resolve stereochemistry in
> CDK version 1.15?
>
>
>
> In some detail:
>
> I tried some approaches with known stereo-isotopes for which I both had
> the molfile (SDF). The molfile was parsed via MDLV2000Reader to an
> AtomContainer which in turn was passed to the SmilesGenerator for parsing
> to smiles. Sadly both molfiles resulted in the same smiles string.
>
> I read the paper for CDK v2.0 (
> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0220-4)
> which states that in this version the stereochemistry is standardized. But
> it does not convey whether it wasn’t possible before. So is there a way to
> resolve stereochemistry in v1.15? And if so, can anyone provide some code
> examples?
>
>
>
> Hope someone can shed some light on this,
>
> Best Regards
>
> Sebastian
>
>
>


Re: [Cdk-user] controlling hydrogen display with DepictionGenerator

2019-08-02 Thread John Mayfield
It's not an option, and I would be hesitant to add it as it's "at best
ambiguous" (quoting Brecher's IUPAC :-)). Symyx/Accelrys/BioVia Draw like
to hide the hydrogens by default, which is where I think the acceptability
crept in from. You're right that for queries there's a use-case, but the
depiction isn't really set up for rendering queries.

Anyways, if you want to do it just add a helper routine that sets the implicit
hydrogen counts to 0; for non-aromatics you can put them

C1C[N]CN([H])C1


[image: image.png]
The radicals there are specific to the WebApp so you would just get a plain
N - I may actually make that an option as I don't like it (Noel talked me into
it).

Another option: you can also use the old *BasicAtomGenerator* which never puts
hydrogens on anything...

John

On Thu, 1 Aug 2019 at 18:12, Tim Dudgeon  wrote:

> Can someone point to examples of how to control the display of hydrogens
> when depicting using DepictionGenerator?
>
> It looks like implicit hydrogens are displayed on terminal and hetero
> atoms which is not unreasonable, but what if I ONLY want explicit
> hydrogens to be displayed (e.g. when depicting a query structure) or I
> don't want any hydrogens to be displayed?
>
>
>


Re: [Cdk-user] Stereochemistry is disregarded when creating SMILES from MOL-File

2019-07-17 Thread John Mayfield
Why were you using that version?

On Wed, 17 Jul 2019 at 06:37, Wehner, Sebastian 
wrote:

> Hi John,
>
>
>
> That is what I suspected. I had some hope that there was still a way to
> properly convert stereochemistry in this version…
>
> Anyways, thanks for your quick answer and explanations.
>
>
>
> Sebastian
>
>
>
> *From:* John Mayfield 
> *Sent:* Tuesday, July 16, 2019 6:05 PM
> *To:* Wehner, Sebastian 
> *Cc:* cdk-user@lists.sourceforge.net
> *Subject:* Re: [Cdk-user] Stereochemistry is disregarded when creating
> SMILES from MOL-File
>
>
>
> Hi Sebastian,
>
>
>
> > I am using CDK version 1.4.17
>
>
>
> That is a very old version and does not convert stereochemistry
> correctly. Essentially there weren't data structures to represent it, so it
> was stored differently for 0D (e.g. SMILES) vs 2D vs 3D. This was all fixed
> about 7 years ago :-). Latest release is 2.2 BTW -
> https://github.com/cdk/cdk/releases
>
>
>
> John
>
>
>
> On Tue, 16 Jul 2019 at 15:04, Wehner, Sebastian via Cdk-user <
> cdk-user@lists.sourceforge.net> wrote:
>
> Hi,
>
>
>
> I could use your help! I am using CDK version 1.4.17 and want to build a
> SMILES string from a mol-file (SDF from lipid maps). The molecule has two
> isomers, but this information should be included in the mol-file, should it
> not?
>
>
>
> Anyways, I pass the mol-file as ByteArrayInputStream into MDLV2000Reader and
> then create a new AtomContainer from this. Iterating over each atom of
> the AtomContainer, using a CDKAtomTypeMatcher to get the IAtomType of the
> atom which I then use to configure this atom with via AtomTypeManipulator
> .configure().
>
>
>
> Finally adding implicit hydrogens to the AtomContainer, creating a
> SmilesGenerator and returning the smiles of the AtomContainer with
> smilesGenerator.createSMILES(AtomContainer).
>
>
>
> However this produces a SMILES of the molecule which disregards the
> stereochemistry. The documentation states, that stereochemistry is taken
> into account (
> http://cdk.github.io/cdk/1.4/docs/api/org/openscience/cdk/smiles/SmilesGenerator.html).
> Am I missing something?
>
>
>
> Would be great if someone could help! Added the link to lipid maps for the
> example: http://www.lipidmaps.org/data/LMSDRecord.php?LMID=LMGL02010378.
>
>
>
>
>
> Best regards,
>
> Sebastian Wehner
>
>
>


Re: [Cdk-user] Reg: Reading Jmol generated SDF file using Iterating SDF reader cdk 1.5.8

2019-11-17 Thread John Mayfield
IIRC JMol was generating them incorrectly, Bob (JMol dev) patched it and we
also updated our code to be more tolerant. Please try CDK 2.3 and if there
is still an issue report via GitHub Issues.

On Sun, 17 Nov 2019 at 09:50, Vinothkumar Mohanakrishnan 
wrote:

> Dear Users,
>
> I would like to read a SDF file using IteratingSDFReader. The SDF file is 
> generated by jmol (see below).
>
> U:/research/project_opas/Code/MVC_OPAS/build/check.sdf
> __Jmol-14_11161922023D 1   1.0 0.0 0
> Jmol version 14.9.1  2017-02-18 13:47 EXTRACT: ({0:43})
>  22 22  0  0  0  0  1 V2000
>  -10.26842  21.30587  -2.02430 N   0  0  0  0  0  0
>  -11.14109  20.23272  -2.39821 C   0  0  0  0  0  0
>  -10.84018  18.85777  -1.92734 C   0  0  0  0  0  0
>  -11.60011  17.94142  -2.23013 O   0  0  0  0  0  0
>  -12.38167  20.49262  -3.21797 C   0  0  0  0  0  0
>  -12.66761  21.93940  -3.49472 C   0  0  0  0  0  0
>  -13.73838  22.56509  -3.02287 C   0  0  0  0  0  0
>  -11.88128  22.81522  -4.30956 N   0  0  0  0  0  0
>  -12.65747  24.01784  -4.21934 C   0  0  0  0  0  0
>  -13.70560  23.88527  -3.49217 N   0  0  0  0  0  0
>   -9.64144  18.61510  -1.15438 N   0  0  0  0  0  0
>   -6.47461  15.46647   0.53840 C   0  0  0  0  0  0
>   -5.16207  15.52031  -0.21484 C   0  0  0  0  0  0
>   -4.06313  15.41673   0.72449 N   0  0  0  0  0  0
>   -9.80464  17.44463  -0.29800 C   0  0  0  0  0  0
>   -8.64600  16.48725  -0.41087 C   0  0  0  0  0  0
>   -8.90795  15.33022  -1.01563 C   0  0  0  0  0  0
>   -7.33993  16.68411   0.28862 C   0  0  0  0  0  0
>   -7.04245  17.86578   0.93854 O   0  0  0  0  0  0
>   -6.07225  18.51085   0.19546 C   0  0  0  0  0  0
>   -6.39116  19.21376  -0.74531 O   0  0  0  0  0  0
>   -4.69541  18.41611   0.56612 N   0  0  0  0  0  0
>   2  1  1  0  0  0
>   2  3  1  0  0  0
>   3 11  1  0  0  0
>   4  3  2  0  0  0
>   5  2  1  0  0  0
>   6  5  1  0  0  0
>   6  7  2  0  0  0
>   8  9  1  0  0  0
>   8  6  1  0  0  0
>   9 10  2  0  0  0
>  10  7  1  0  0  0
>  13 12  1  0  0  0
>  13 14  1  0  0  0
>  15 16  1  0  0  0
>  16 17  2  0  0  0
>  16 18  1  0  0  0
>  18 19  1  0  0  0
>  19 20  1  0  0  0
>  20 21  2  0  0  0
>  20 22  1  0  0  0
>  11 15  1  0  0  0
>  12 18  1  0  0  0
> M  END
> 
> U:/research/project_opas/Code/MVC_OPAS/build/check.sdf
> __Jmol-14_11161922023D 1   1.0 0.0 0
> Jmol version 14.9.1  2017-02-18 13:47 EXTRACT: ({0:43})
>  22 22  0  0  0  0  1 V2000
>  -10.34994  21.23775  -2.14993 N   0  0  0  0  0  0
>  -11.34250  20.30777  -2.59501 C   0  0  0  0  0  0
>  -11.13198  18.85018  -2.50356 C   0  0  0  0  0  0
>  -12.03183  18.12532  -2.90109 O   0  0  0  0  0  0
>  -12.62336  20.72709  -3.26233 C   0  0  0  0  0  0
>  -12.81469  22.18660  -3.49445 C   0  0  0  0  0  0
>  -13.85760  22.84610  -3.01911 C   0  0  0  0  0  0
>  -11.99766  23.01098  -4.32672 N   0  0  0  0  0  0
>  -12.72729  24.24473  -4.23861 C   0  0  0  0  0  0
>  -13.77078  24.16132  -3.50354 N   0  0  0  0  0  0
>   -9.86955  18.26965  -2.15583 N   0  0  0  0  0  0
>   -7.30122  16.52601   0.75455 C   0  0  0  0  0  0
>   -5.96259  17.25681   0.80966 C   0  0  0  0  0  0
>   -4.87734  16.30636   0.66950 N   0  0  0  0  0  0
>  -12.83689  15.52634  -0.85527 C   0  0  0  0  0  0
>  -11.62686  16.04439  -1.46833 N   0  0  0  0  0  0
>  -11.00919  14.97258  -2.22952 C   0  0  0  0  0  0
>  -10.70605  16.47306  -0.42756 C   0  0  0  0  0  0
>   -9.83868  17.64207  -0.85262 C   0  0  0  0  0  0
>   -8.72671  18.07421   0.02162 C   0  0  0  0  0  0
>   -8.09045  19.06942  -0.28779 O   0  0  0  0  0  0
>   -8.35385  17.36976   1.15010 O   0  0  0  0  0  0
>   2  1  1  0  0  0
>   2  3  1  0  0  0
>   3 11  1  0  0  0
>   4  3  2  0  0  0
>   5  2  1  0  0  0
>   6  5  1  0  0  0
>   6  7  2  0  0  0
>   8  9  1  0  0  0
>   8  6  1  0  0  0
>   9 10  2  0  0  0
>  10  7  1  0  0  0
>  13 12  1  0  0  0
>  13 14  1  0  0  0
>  15 16  1  0  0  0
>  16 17  1  0  0  0
>  16 18  1  0  0  0
>  18 19  1  0  0  0
>  19 20  1  0  0  0
>  20 21  2  0  0  0
>  20 22  1  0  0  0
>  11 19  1  0  0  0
>  12 22  1  0  0  0
> M  END
> 
>
> I am using CDK 1.5.8 (I have to stick to this version for compatibility 
> issues). I am trying to read the SDF file using the below snippet
>
> public static List<IAtomContainer> readFragments(String fileName)
>         throws IOException, CDKException {
>
>     List<IAtomContainer> frags = new ArrayList<>();
>     File sdfFile = new File(fileName);
>
>     IteratingSDFReader sdfReader = new IteratingSDFReader(
>             new FileInputStream(sdfFile), DefaultChemObjectBuilder.getInstance());
>
>     // collect every record the reader yields
>     while (sdfReader.hasNext()) {
>         IAtomContainer molecule = (IAtomContainer) sdfReader.next();
>         frags.add(molecule);
>     }
>     sdfReader.close();
>
>     return frags;
> }
>
> The function works perfectly fine for SDF files generated by CDK and Openbabel 
> and returns null for Jmol generated 

Re: [Cdk-user] Smarts cast exception

2019-11-08 Thread John Mayfield
No problem,

So essentially there are "molecules" and  "queries". Molecules are things,
queries match things. We can convert a molecule to a query by telling it
what things we want to match.

John

On Fri, 8 Nov 2019 at 09:27, Stesycki, Manuel 
wrote:

> Ok i looked up the Test class and did the following:
>
> public static String createSMARTS(AtomContainer ac) {
>
> String ret;
>
> try {
>
> QueryAtomContainer qac = QueryAtomContainer.create(ac,
> Expr.Type.ALIPHATIC_ELEMENT,
> Expr.Type.AROMATIC_ELEMENT,
> Expr.Type.SINGLE_OR_AROMATIC,
> Expr.Type.ALIPHATIC_ORDER,
> Expr.Type.ISOTOPE,
> Expr.Type.RING_BOND_COUNT
> );
>
> ret = Smarts.generate(qac);
>
> } catch (Exception e) {
> ret = "";
> }
>
> return ret;
> }
>
> This works for me.
>
> Sorry to bother you,
>Manuel Stesycki
>
> IT
>0208 / 306-2146
>Physikbau, Büro 117
>stesy...@mpi-muelheim.mpg.de
>
> Max-Planck-Institut für Kohlenforschung
>Kaiser-Wilhelm-Platz 1
>D-45470 Mülheim an der Ruhr
>http://www.kofo.mpg.de/de
>
> On 07.11.2019 at 14:34, Stesycki, Manuel <
> stesy...@mpi-muelheim.mpg.de> wrote:
>
> Dear all,
>
> I am trying to create a SMARTS pattern for a structure.
> The structure is an AtomContainer and read from an MDL file.
> If I try to call Smarts.generate( AtomContainer )
> I get this error message:
>
> java.lang.ClassCastException: org.openscience.cdk.Bond cannot be cast to
> org.openscience.cdk.isomorphism.matchers.QueryBond
>
> Does anyone have a clue where my mistake is?
>
> Many thanks,
>Manuel Stesycki
>
> IT
>0208 / 306-2146
>Physikbau, Büro 117
>stesy...@mpi-muelheim.mpg.de
>
> Max-Planck-Institut für Kohlenforschung
>Kaiser-Wilhelm-Platz 1
>D-45470 Mülheim an der Ruhr
>http://www.kofo.mpg.de/de
>
>


Re: [Cdk-user] Questions about the function "Addring"

2019-11-20 Thread John Mayfield
It's calculated differently because without it the rings get laid out on
top of each other. As an example, you can revert this commit:

https://github.com/cdk/cdk/commit/6533533a95b5e9ca0d55d0d37ab5f048a25e88f7#diff-da65f1759b150e9510a643e017112b3f

And see how it lays out the following.

C1CO[Fe]234(O1)OCCO2.C(CO3)O4

[image: image.png]

The old code would generate this:

[image: image.png]

because the bond vector was pointing towards the centre of the ring.

John

On Wed, 20 Nov 2019 at 16:36, Christoph Steinbeck <
christoph.steinb...@uni-jena.de> wrote:

> This is very old code and it seems that others changed it (for the better
> :)) since I wrote it a long time ago.
> If I understand you correctly, there is actually no bug, just an apparent
> inconsistency that you are reporting.
> The simplest thing would be for you to remove the case distinction and see
> what happens.
> Maybe that reveals the reason for the distinction.
>
> I’d love to dig into this but I lack the time for such fun these days
> Very sad. :D
>
> All the best,
>
> Chris
>
> —
> Prof. Dr. Christoph Steinbeck
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> Phone Secretariat: +49-3641-948171
> http://cheminf.uni-jena.de
> http://orcid.org/-0001-6966-0814
>
> What is man but that lofty spirit - that sense of enterprise.
> ... Kirk, "I, Mudd," stardate 4513.3..
>
> > On 18. Nov 2019, at 16:24, 努力努力 <843982...@qq.com> wrote:
> >
> > Thanks for your reply.
> > I want to implement a JChemPaint in C#. In fact, I used a CDK port
named NCDK, which is a C# implementation of the Chemistry
Development Kit: https://github.com/kazuyaujihara/NCDK.
> > But I have got some bugs. I want to read the source code of CDK and fix
them.
> > The bug is like this:
> > When we addRing, we need to calculate the position of the virtual ring.
From the code, I understand that the position of the new ring needs to be
calculated from some variables, including the new ring center, startAngle,
addAngle and radius.
> > The code in function "placeSpiroRing" is
> > atomPlacer.populatePolygonCorners(atomsToDraw, ringCenter, startAngle,
addAngle, radius);
> > The variable ringCenter depends on the variable ringCenterVector,
which is a vector pointing to the center of the new ring.
> >
> > For example, when I want to add a triangle to a shared atom, it
satisfies numPlaced == 2. It seems that the sharedAtom's position will be
changed.
> > The code in function "placeSpiroRing" is
> > if (numPlaced == 2) {
> > // nudge the shared atom such that bond lengths will be
> > // equal
> > startAtom.getPoint2d().add(ringCenterVector);
> > sharedAtomsCenter.add(ringCenterVector);
> > }
> > And when degree == 4 and degree != 4, ringCenterVector is recalculated
differently. Why?
> > The code in function "placeSpiroRing" is
> > if (degree == 4) {
> > ringCenterVector.normalize();
> > ringCenterVector.scale(radius);
> > } else {
> > // spread things out a little for multiple spiro centres
> > ringCenterVector.normalize();
> > ringCenterVector.scale(2*radius);
> > }
> > I'm confused. Or I understand it wrong.
> >
> > Thank you for taking your time to read this letter again.
> >
> > -- Original message --
> > From: "Christoph Steinbeck";
> > Sent: Monday, 18 November 2019, 6:55 PM
> > To: "努力努力"<843982...@qq.com>;
> > Cc: "cdk-user";
> > Subject: Re: [Cdk-user] Questions about the function "Addring"
> >
> > Can you comment on what you try to achieve?
> > The method that you are referring to is a quite specialised method for
> structure diagram layout.
> > Are you trying to create 2D drawings of some molecule or fragment, or
> maybe something else?
> >
> > Kind regards,   Chris
> >
> > —
> > Prof. Dr. Christoph Steinbeck
> > Analytical Chemistry - Cheminformatics and Chemometrics
> > Friedrich-Schiller-University Jena, Germany
> > Phone Secretariat: +49-3641-948171
> > http://cheminf.uni-jena.de
> > http://orcid.org/-0001-6966-0814
> >
> > What is man but that lofty spirit - that sense of enterprise.
> > ... Kirk, "I, Mudd," stardate 4513.3..
> >
> > > On 18. Nov 2019, at 11:32, 努力努力 <843982...@qq.com> wrote:
> > >
> > > Dear all,
> > > I want to understand how to add rings to an atom. In the function
"Addring", I find the code "ringPlacer.PlaceSpiroRing" and then jump to the
function "placeSpiroRing". I have some questions about this function. Why
do we have special treatment when degree==4 and numplace==2? In my
understanding, "degree" is the number of bonds connected to sharedAtoms,
and numPlaced is the number of other ring atoms apart from sharedAtoms.
> > > Looking forward to your reply. Thank you!
> > >
> > > The source code from CDK is here:
> > > public void placeSpiroRing(IRing ring, IAtomContainer sharedAtoms,
> Point2d 

Re: [Cdk-user] Questions about the function "placeSpiroRing" in RingPlacer.jave

2019-11-30 Thread John Mayfield
I already explained: you can have "spirodegree > 2" (i.e. degree > 4), in
which case it lays things out on top of each other (incorrect). The
numPlaced == 2 "nudge" is to make the bond lengths longer.

Try commenting out the if conditions and use the SMILES I gave last time to
see the effect:

C1CO[Fe]234(O1)OCCO2.C(CO3)O4
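
To make the arithmetic in the quoted placeSpiroRing code concrete, here is a small stand-alone sketch in plain Java with no CDK classes. The class and method names are illustrative only and are not part of CDK; the two expressions are copied from the method quoted below, including the integer division in degree / 2.

```java
public class SpiroAngleSketch {

    // Mirrors: double theta = Math.PI - (2 * Math.PI / (degree / 2));
    // Only meaningful for degree >= 2. The quoted code applies it when degree != 4.
    static double rotationAngle(int degree) {
        return Math.PI - (2 * Math.PI / (degree / 2)); // degree / 2 is integer division
    }

    // Mirrors the if/else that scales ringCenterVector after normalising it:
    // one radius for a simple spiro atom (degree 4), two radii otherwise.
    static double centreDistance(int degree, double radius) {
        return degree == 4 ? radius : 2 * radius;
    }

    public static void main(String[] args) {
        double radius = 1.5; // arbitrary ring radius, for illustration only
        for (int degree : new int[] {4, 6, 8}) {
            System.out.printf("degree=%d rotation=%.1f deg distance=%.2f%n",
                    degree,
                    Math.toDegrees(rotationAngle(degree)),
                    centreDistance(degree, radius));
        }
    }
}
```

For the iron atom in the SMILES above (six bonds, so degree 6), theta works out to pi/3 (60 degrees) and the ring centre is placed two ring radii away from the shared atom instead of one.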



On Sat, 30 Nov 2019 at 09:30, 努力努力 <843982...@qq.com> wrote:

> Why do we have special treatment when degree==4 and numplace==2?
> In my understanding, "degree" is the number of bonds connected to
> sharedAtoms, and numPlaced is the number of other Atoms ring except
> sharedAtoms.
> Looking forward to your reply. Thank you!
>
> The source code from CDK is here:
> public void placeSpiroRing(IRing ring, IAtomContainer sharedAtoms, Point2d
> sharedAtomsCenter, Vector2d ringCenterVector, double bondLength) {
>
> IAtom startAtom = sharedAtoms.getAtom(0);
> List<IBond> mBonds =
> molecule.getConnectedBondsList(sharedAtoms.getAtom(0));
> final int degree = mBonds.size();
> logger.debug("placeSpiroRing: D=", degree);
>
> // recalculate the ringCentreVector
> if (degree != 4) {
>
> int numPlaced = 0;
> for (IBond bond : mBonds) {
> IAtom nbr = bond.getOther(sharedAtoms.getAtom(0));
> if (!nbr.getFlag(CDKConstants.ISPLACED))
> continue;
> numPlaced++;
> }
>
> if (numPlaced == 2) {
> // nudge the shared atom such that bond lengths will be
> // equal
> startAtom.getPoint2d().add(ringCenterVector);
> sharedAtomsCenter.add(ringCenterVector);
> }
>
> double theta = Math.PI-(2 * Math.PI / (degree / 2));
> rotate(ringCenterVector, theta);
> }
>
> double radius = getNativeRingRadius(ring, bondLength);
> Point2d ringCenter = new Point2d(sharedAtomsCenter);
> if (degree == 4) {
> ringCenterVector.normalize();
> ringCenterVector.scale(radius);
> } else {
> // spread things out a little for multiple spiro centres
> ringCenterVector.normalize();
> ringCenterVector.scale(2*radius);
> }
> ringCenter.add(ringCenterVector);
> double addAngle = 2 * Math.PI / ring.getRingSize();
>
> IAtom currentAtom = startAtom;
> double startAngle = GeometryUtil.getAngle(startAtom.getPoint2d().x
> - ringCenter.x,
>   startAtom.getPoint2d().y
> - ringCenter.y);
>
> /*
>  * Get one bond connected to the spiro bridge atom. It doesn't
> matter in
>  * which direction we draw.
>  */
> List<IBond> rBonds = ring.getConnectedBondsList(startAtom);
>
> IBond currentBond = (IBond) rBonds.get(0);
>
> Vector<IAtom> atomsToDraw = new Vector<IAtom>();
> /*
>  * Store all atoms to draw in consecutive order relative to the
> chosen
>  * bond.
>  */
> for (int i = 0; i < ring.getBondCount(); i++) {
> currentBond = ring.getNextBond(currentBond, currentAtom);
> currentAtom = currentBond.getOther(currentAtom);
> if (!currentAtom.equals(startAtom))
> atomsToDraw.addElement(currentAtom);
> }
> logger.debug("currentAtom  " + currentAtom);
> logger.debug("startAtom  " + startAtom);
>
> atomPlacer.populatePolygonCorners(atomsToDraw, ringCenter,
> startAngle, addAngle, radius);
>
> }


Re: [Cdk-user] How can I use CDK on ia64 server using HP UX OS?

2019-12-02 Thread John Mayfield
Hi,

Provided you don't try to get an InChI from OPSIN/CDK, everything
else will work. Do you really need an InChI?

If you really do need it then I think the best option would be to build the
InChI library/executable yourself and then call out to it via a system exec,
passing the structure as a Molfile:

 Runtime.getRuntime().exec("inchi input.mol"); // etc
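
A slightly fuller sketch of that call-out, using only the Java standard library. The executable name "inchi" and the single Molfile argument are simply taken from the one-liner above and are assumptions: the real name and command-line options depend on how the InChI executable was built. The class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InchiExecSketch {

    // Builds the command line: executable followed by the Molfile path.
    static List<String> command(String executable, Path molfile) {
        return Arrays.asList(executable, molfile.toString());
    }

    // Runs the command and returns every line it printed (stdout and stderr merged).
    static List<String> run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);
        Process process = builder.start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        int status = process.waitFor();
        if (status != 0) {
            throw new IOException("Command exited with status " + status + ": " + command);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: InchiExecSketch <executable> <molfile>");
            return;
        }
        for (String line : run(command(args[0], Paths.get(args[1])))) {
            System.out.println(line);
        }
    }
}
```

Reading the merged output before calling waitFor() avoids the child process blocking on a full pipe; how the InChI string is then picked out of the output depends on the executable's output format.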


You could also rebuild JNI-InChI but this is more complicated.

John

On Sun, 1 Dec 2019 at 12:55, 강신원  wrote:

> Hi, all.
>
> I'm making a simple chemical substructure using CDK and OPSIN library.
>
> I recently got to know that my program should run on ia64 server using HP
> UX OS, but jni-inchi in the CDK and OPSIN does not support that platform.
>
> Is there anyone who has a binary jni-inchi library for that platform or
> knows how to get it?
>
> Help, please.
>


Re: [Cdk-user] Raw fingerprints impossible to calculate

2020-02-25 Thread John Mayfield
Okay,

I'm going to presume you want to search the data, i.e. to retrieve similar
compounds or substructures. If not then just store the hexadecimal
fingerprint.

It's not impossible to do searching in MongoDB, see a talk from Matt Swain
<https://matt-swain.com/blog/2014-06-03-chemical-similarity-search-in-mongodb>,
... and my follow ups:
http://efficientbits.blogspot.com/2014/11/memory-mapped-fingerprint-index-part-i.html
,
http://efficientbits.blogspot.com/2014/12/memory-mapped-fingerprint-index-part-ii.html
.

However my view is (as I make clear in those blog posts) that MongoDB is the
wrong technology for this, but you could convert the binary
fingerprint to a vector. In fact *toString* works well:

System.out.println(new Fingerprinter().getBitFingerprint(mol).asBitSet().toString());


{43, 46, 51, 60, 65, 70, 72, 86, 95, 99, 111, 114, 123, 128, 144, 157, 158,
161, 166, 174, 185, 188, 204, 213, 222, 253, 271, 275, 278, 311, 315, 320,
335, 364, 371, 379, 390, 409, 446, 449, 463, 486, 498, 520, 523, 535, 540,
565, 574, 586, 588, 611, 628, 632, 637, 647, 649, 655, 667, 725, 742, 756,
770, 793, 845, 859, 865, 918, 951, 954, 959, 1015}

You could then use and/or queries to find fingerprint subsets or compute
Tanimotos etc.
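
As a rough illustration of what can be done with that vector outside the database, the sketch below parses the *toString* form back into a java.util.BitSet and computes a Tanimoto coefficient (shared bits divided by bits set in either fingerprint). It uses only the Java standard library; the class and method names are hypothetical and not part of CDK.

```java
import java.util.BitSet;

public class BitSetSimilaritySketch {

    // Parses the "{43, 46, 51}" form produced by BitSet.toString().
    static BitSet parse(String text) {
        BitSet bits = new BitSet();
        String body = text.trim();
        body = body.substring(1, body.length() - 1).trim(); // drop '{' and '}'
        if (body.isEmpty()) {
            return bits;
        }
        for (String token : body.split(",")) {
            bits.set(Integer.parseInt(token.trim()));
        }
        return bits;
    }

    // Tanimoto coefficient: |A and B| / |A or B|; defined here as 0 for two empty sets.
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        int union = either.cardinality();
        return union == 0 ? 0.0 : (double) both.cardinality() / union;
    }

    public static void main(String[] args) {
        BitSet a = parse("{1, 2, 3}");
        BitSet b = parse("{2, 3, 4}");
        System.out.println(tanimoto(a, b)); // 2 shared bits out of 4 distinct bits
    }
}
```

Storing the set-bit indices as an integer array keeps the document JSON-friendly, and the same intersection/union counts are what an and/or query on the database side would have to reproduce.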

John

On Mon, 24 Feb 2020 at 13:44, Maria Sorokina 
wrote:

> I see the problem.
>
> Well, originally, I wanted to check out what the raw fingerprints look like.
> I am storing all the data (and the fingerprints) in MongoDB, and I am still
> not sure whether, if I save the BitFingerprints directly in there (which is
> possible when the field has an Object type), they will be parseable by
> the mongo engine as fingerprints (without retrieving them to be read with
> CDK). So this is why I wanted to check the raw fingerprints, as they should
> be a more JSON-friendly format, and the mongo engine would be able to read those
> integers and strings for further similarity search.
>
> Kind regards,
> Maria
>
>
> Dr. Maria Sorokina
> Steinbeck Research Group
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> http://cheminf.uni-jena.de
>
> On 21 Feb 2020 at 19:31, John Mayfield wrote:
>
> Okay looking at it the Substructure fingerprint would be easy to adapt...
> but it's not hard to just count the substructures. Utility code like that
> is difficult to justify, every line is more to maintain.
>
> The other problem is I don't like the fingerprint APIs so it's a toss-up
> between using effort to implement something I (or hopefully someone else)
> will ultimately rewrite in future. "Deprecated on arrival" I believe Egon
> has said before.
>
> On Fri, 21 Feb 2020 at 18:25, John Mayfield 
> wrote:
>
>> What do you think the "raw" fingerprint is? Why would you expect it for
>> the Substructure one?
>>
>> On Fri, 21 Feb 2020 at 09:47, Maria Sorokina 
>> wrote:
>>
>>> I tried in total 7 fingerprinters (PubChem, Substructure, MACCS,
>>> KlekotaRoth, Circular, ShortestPath and Hybridization) and none worked. For
>>> some, I’m not surprised, but I was really expecting to have the raw
>>> fingerprints for the Substructure one
>>>
>>>
>>> Dr. Maria Sorokina
>>> Steinbeck Research Group
>>> Analytical Chemistry - Cheminformatics and Chemometrics
>>> Friedrich-Schiller-University Jena, Germany
>>> http://cheminf.uni-jena.de
>>>
>>> On 21 Feb 2020 at 10:39, John Mayfield wrote:
>>>
>>> ... I do have some patches for an updated fingerprint API stack that
>>> would also add this in to more places. Essentially it was added to the
>>> public API but only implemented in a few places and left as a "ToDo"
>>> elsewhere. Might be something for the hack-a-thon.
>>>
>>> I should say PubChem fingerprints are binary in nature though, so you would
>>> probably never want the RAW version. *getBitFingerprint()* is
>>> always implemented.
>>>
>>> John
>>>
>>> On Fri, 21 Feb 2020 at 09:34, John Mayfield 
>>> wrote:
>>>
>>>> Hi Maria,
>>>>
>>>> Not all fingerprints support the "RAW" and Count options.
>>>>
>>>> John
>>>>
>>>> On Fri, 21 Feb 2020 at 09:31, Maria Sorokina 
>>>> wrote:
>>>>
>>>>> Dear community,
>>>>>
>>>>> It is decidedly substructure search and fingerprinting period of the
>>>>> year!
>>>>>
>>>>> I want to create (to store) raw fingerprints of a range of different
>>>>> fingerprint 

Re: [Cdk-user] Substructure search using ShortestPathFingerprinter

2020-02-25 Thread John Mayfield
Yes good idea, I added a comment at the bottom but it does explicitly say
that at the top.

On Tue, 25 Feb 2020 at 08:43, nicepeopleproject 
wrote:

> Thank you!
> The documentation for the ShortestPathFingerprinter class says "Fingerprints
> allow for a fast screening step to exclude candidates for a substructure
> search in a database. They are also a means for determining the similarity
> of chemical structures.". Perhaps it’s worth removing so that there are
> no contradictions.
>
> On Thu, 20 Feb 2020 at 18:28, John Mayfield wrote:
>
>> I've added a warning in the doc, there was already a warning on MACCS 166
>> keys.
>>
>> https://github.com/cdk/cdk/commit/82cb4f8d49283e117696f40d09538c70790a18fd
>>
>> On Thu, 20 Feb 2020 at 15:20, John Mayfield 
>> wrote:
>>
>>> *wrote :-)
>>>
>>> On Thu, 20 Feb 2020 at 15:20, John Mayfield 
>>> wrote:
>>>
>>>> Only *Fingerprinter* or *ExtendedFingerprint* obey this transitivity
>>>> property.
>>>>
>>>> Relevant post I wrong in 2015:
>>>> https://nextmovesoftware.com/blog/2015/02/16/for-every-fingerprint-optimisation-there-is-an-equal-and-opposite-fingerprint-deterioration/
>>>>
>>>> On Thu, 20 Feb 2020 at 10:44, nicepeopleproject <
>>>> nicepeopleproj...@gmail.com> wrote:
>>>>
>>>>> Hello!
>>>>> I'm trying to implement substructure search. As I understand, the
>>>>> ShortestPathFingerprinter is suitable for this. I ran into the following
>>>>> problem. I attach two files (in molecules.zip). When using butane.mol as the
>>>>> query, it should find ciclopentane.mol. When I computed the BitSets I got:
>>>>> {115, 503, 540, 653, 893} - for butane,
>>>>> {115, 503, 542, 653, 893} - for ciclopentane.
>>>>> So I cannot find ciclopentane. Is there a way to make it work?
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Николаев Артём
>>>>> ___
>>>>> Cdk-user mailing list
>>>>> Cdk-user@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>>
>>>>
>
> --
> Best regards,
> Николаев Артём
>


Re: [Cdk-user] Substructure search using ShortestPathFingerprinter

2020-02-20 Thread John Mayfield
I've added a warning in the doc, there was already a warning on MACCS 166
keys.

https://github.com/cdk/cdk/commit/82cb4f8d49283e117696f40d09538c70790a18fd

On Thu, 20 Feb 2020 at 15:20, John Mayfield 
wrote:

> *wrote :-)
>
> On Thu, 20 Feb 2020 at 15:20, John Mayfield 
> wrote:
>
>> Only *Fingerprinter* or *ExtendedFingerprint* obey this transitivity
>> property.
>>
>> Relevant post I wrong in 2015:
>> https://nextmovesoftware.com/blog/2015/02/16/for-every-fingerprint-optimisation-there-is-an-equal-and-opposite-fingerprint-deterioration/
>>
>> On Thu, 20 Feb 2020 at 10:44, nicepeopleproject <
>> nicepeopleproj...@gmail.com> wrote:
>>
>>> Hello!
>>> I'm trying to implement substructure search. As I understand, the
>>> ShortestPathFingerprinter is suitable for this. I ran into the following
>>> problem. I attach two files (in molecules.zip). When using butane.mol as the
>>> query, it should find ciclopentane.mol. When I computed the BitSets I got:
>>> {115, 503, 540, 653, 893} - for butane,
>>> {115, 503, 542, 653, 893} - for ciclopentane.
>>> So I cannot find ciclopentane. Is there a way to make it work?
>>>
>>> --
>>> Best regards,
>>> Николаев Артём


Re: [Cdk-user] Substructure search using ShortestPathFingerprinter

2020-02-20 Thread John Mayfield
Only *Fingerprinter* or *ExtendedFingerprinter* obey this transitivity
property (every bit set for the query is also set for any molecule that
contains it).

Relevant post I wrong in 2015:
https://nextmovesoftware.com/blog/2015/02/16/for-every-fingerprint-optimisation-there-is-an-equal-and-opposite-fingerprint-deterioration/
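
To see why the screen fails for the two BitSets quoted below, here is a minimal stand-alone sketch of the subset test a fingerprint pre-screen relies on: a target can only be kept if every bit set for the query is also set for the target. It uses java.util.BitSet only; the class and method names are illustrative and not CDK API.

```java
import java.util.BitSet;

public class FingerprintScreenSketch {

    // Convenience helper: builds a BitSet with the given bit indices set.
    static BitSet bits(int... indices) {
        BitSet set = new BitSet();
        for (int index : indices) {
            set.set(index);
        }
        return set;
    }

    // Bits that are set for the query but absent from the target.
    static BitSet missingBits(BitSet query, BitSet target) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target);
        return missing;
    }

    // The screen passes only when the query bits are a subset of the target bits.
    static boolean passesScreen(BitSet query, BitSet target) {
        return missingBits(query, target).isEmpty();
    }

    public static void main(String[] args) {
        BitSet butane = bits(115, 503, 540, 653, 893);
        BitSet cyclopentane = bits(115, 503, 542, 653, 893);
        System.out.println("passes screen: " + passesScreen(butane, cyclopentane));
        System.out.println("missing bits:  " + missingBits(butane, cyclopentane));
    }
}
```

Bit 540 is set for butane but not for cyclopentane, so the screen rejects cyclopentane even though butane is a substructure of it; a fingerprint that lacks the subset property cannot be used as a substructure pre-filter.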

On Thu, 20 Feb 2020 at 10:44, nicepeopleproject 
wrote:

> Hello!
> I'm trying to implement substructure search. As I understand, the
> ShortestPathFingerprinter is suitable for this. I ran into the following
> problem. I attach two files (in molecules.zip). When using butane.mol as the
> query, it should find ciclopentane.mol. When I computed the BitSets I got:
> {115, 503, 540, 653, 893} - for butane,
> {115, 503, 542, 653, 893} - for ciclopentane.
> So I cannot find ciclopentane. Is there a way to make it work?
>
> --
> Best regards,
> Николаев Артём


Re: [Cdk-user] Raw fingerprints impossible to calculate

2020-02-21 Thread John Mayfield
... I do have some patches for an updated fingerprint API stack that would
also add this in to more places. Essentially it was added to the public API
but only implemented in a few places and left as a "ToDo" elsewhere. Might
be something for the hack-a-thon.

I should say PubChem fingerprints are binary in nature though, so you would
probably never want the RAW version. *getBitFingerprint()* is always
implemented.

John

On Fri, 21 Feb 2020 at 09:34, John Mayfield 
wrote:

> Hi Maria,
>
> Not all fingerprints support the "RAW" and Count options.
>
> John
>
> On Fri, 21 Feb 2020 at 09:31, Maria Sorokina 
> wrote:
>
>> Dear community,
>>
>> It is decidedly substructure search and fingerprinting period of the year!
>>
>> I want to create (to store) raw fingerprints of a range of different
>> fingerprint types for a big number of complex molecules (natural products).
>>
>> For example this:
>>
>> PubchemFingerprinter pubchemFingerprinter = new PubchemFingerprinter( 
>> SilentChemObjectBuilder.getInstance() );
>>
>> System.out.println(pubchemFingerprinter.getRawFingerprint(myAtomContainer));
>>
>> For all my molecules I am getting an "UnsupportedOperationException",
>> which according to the documentation reflects only the fact that the 
>> fingerprinter
>> cannot produce the raw fingerprint.
>> I am using the latest (2.3) version of the CDK.
>> Can anybody help me with this issue?
>>
>>
>> Kind regards,
>> Maria
>>
>>
>> Dr. Maria Sorokina
>> Steinbeck Research Group
>> Analytical Chemistry - Cheminformatics and Chemometrics
>> Friedrich-Schiller-University Jena, Germany
>> http://cheminf.uni-jena.de
>>


Re: [Cdk-user] Raw fingerprints impossible to calculate

2020-02-21 Thread John Mayfield
Hi Maria,

Not all fingerprints support the "RAW" and Count options.

John

On Fri, 21 Feb 2020 at 09:31, Maria Sorokina 
wrote:

> Dear community,
>
> It is decidedly substructure search and fingerprinting period of the year!
>
> I want to create (to store) raw fingerprints of a range of different
> fingerprint types for a big number of complex molecules (natural products).
>
> For example this:
>
> PubchemFingerprinter pubchemFingerprinter = new PubchemFingerprinter( 
> SilentChemObjectBuilder.getInstance() );
>
> System.out.println(pubchemFingerprinter.getRawFingerprint(myAtomContainer));
>
> For all my molecules I am getting an "UnsupportedOperationException",
> which according to the documentation reflects only the fact that the 
> fingerprinter
> cannot produce the raw fingerprint.
> I am using the latest (2.3) version of the CDK.
> Can anybody help me with this issue?
>
>
> Kind regards,
> Maria
>
>
> Dr. Maria Sorokina
> Steinbeck Research Group
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> http://cheminf.uni-jena.de
>


Re: [Cdk-user] Raw fingerprints impossible to calculate

2020-02-21 Thread John Mayfield
What do you think the "raw" fingerprint is? Why would you expect it for the
Substructure one?

On Fri, 21 Feb 2020 at 09:47, Maria Sorokina 
wrote:

> I tried in total 7 fingerprinters (PubChem, Substructure, MACCS,
> KlekotaRoth, Circular, ShortestPath and Hybridization) and none worked. For
> some, I’m not surprised, but I was really expecting to have the raw
> fingerprints for the Substructure one
>
>
> Dr. Maria Sorokina
> Steinbeck Research Group
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> http://cheminf.uni-jena.de
>
> On 21 Feb 2020 at 10:39, John Mayfield wrote:
>
> ... I do have some patches for an updated fingerprint API stack that would
> also add this in to more places. Essentially it was added to the public API
> but only implemented in a few places and left as a "ToDo" elsewhere. Might
> be something for the hack-a-thon.
>
> I should say PubChem fingerprints are binary in nature though, so you would
> probably never want the RAW version. *getBitFingerprint()* is always
> implemented.
>
> John
>
> On Fri, 21 Feb 2020 at 09:34, John Mayfield 
> wrote:
>
>> Hi Maria,
>>
>> Not all fingerprints support the "RAW" and Count options.
>>
>> John
>>
>> On Fri, 21 Feb 2020 at 09:31, Maria Sorokina 
>> wrote:
>>
>>> Dear community,
>>>
>>> It is decidedly substructure search and fingerprinting period of the
>>> year!
>>>
>>> I want to create (to store) raw fingerprints of a range of different
>>> fingerprint types for a big number of complex molecules (natural products).
>>>
>>> For example this:
>>>
>>> PubchemFingerprinter pubchemFingerprinter = new PubchemFingerprinter( 
>>> SilentChemObjectBuilder.getInstance() );
>>>
>>> System.out.println(pubchemFingerprinter.getRawFingerprint(myAtomContainer));
>>>
>>> For all my molecules I am getting an "UnsupportedOperationException",
>>> which according to the documentation reflects only the fact that the 
>>> fingerprinter
>>> cannot produce the raw fingerprint.
>>> I am using the latest (2.3) version of the CDK.
>>> Can anybody help me with this issue?
>>>
>>>
>>> Kind regards,
>>> Maria
>>>
>>>
>>> Dr. Maria Sorokina
>>> Steinbeck Research Group
>>> Analytical Chemistry - Cheminformatics and Chemometrics
>>> Friedrich-Schiller-University Jena, Germany
>>> http://cheminf.uni-jena.de
>>>


Re: [Cdk-user] Raw fingerprints impossible to calculate

2020-02-21 Thread John Mayfield
Okay, looking at it, the Substructure fingerprint would be easy to adapt...
but it's not hard to just count the substructures yourself. Utility code like
that is difficult to justify; every line is one more to maintain.

The other problem is that I don't like the fingerprint APIs, so it's a toss-up
whether to spend effort implementing something that I (or hopefully someone
else) will ultimately rewrite in the future. "Deprecated on arrival", as I
believe Egon has said before.
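
For readers following the thread, the difference between a "raw" (count)
fingerprint and a bit fingerprint can be sketched in plain Java. This is only
an illustration under stated assumptions, not CDK code: the class name, the
SMARTS keys and the counts are all made up, and in practice the counts would
come from whatever substructure matcher you use.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class RawVsBitSketch {

    /**
     * Collapse per-pattern counts into a presence/absence bit set.
     * Bit i is set when the i-th pattern (in map iteration order) has count > 0.
     */
    static BitSet toBits(Map<String, Integer> counts) {
        BitSet bits = new BitSet(counts.size());
        int i = 0;
        for (int count : counts.values()) {
            if (count > 0) {
                bits.set(i);
            }
            i++;
        }
        return bits;
    }

    public static void main(String[] args) {
        // Hypothetical counts; LinkedHashMap keeps the pattern order stable.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("[OX2H]", 1);
        counts.put("c1ccccc1", 0);
        counts.put("[CX4]", 2);

        System.out.println("counts: " + counts);
        System.out.println("bits:   " + toBits(counts));
    }
}
```

The bit form throws the counts away, which is why a fingerprint that is binary
by nature has nothing extra to offer in a raw version.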

On Fri, 21 Feb 2020 at 18:25, John Mayfield 
wrote:

> What do you think the "raw" fingerprint is? Why would you expect it for
> the Substructure one?
>
> On Fri, 21 Feb 2020 at 09:47, Maria Sorokina 
> wrote:
>
>> I tried in total 7 fingerprinters (PubChem, Substructure, MACCS,
>> KlekotaRoth, Circular, ShortestPath and Hybridization) and none worked. For
>> some, I’m not surprised, but I was really expecting to have the raw
>> fingerprints for the Substructure one.
>>
>>
>> Dr. Maria Sorokina
>> Steinbeck Research Group
>> Analytical Chemistry - Cheminformatics and Chemometrics
>> Friedrich-Schiller-University Jena, Germany
>> http://cheminf.uni-jena.de
>>
>> On 21 Feb 2020 at 10:39, John Mayfield wrote:
>>
>> ... I do have some patches for an updated fingerprint API stack that
>> would also add this in to more places. Essentially it was added to the
>> public API but only implemented in a few places and left as a "ToDo"
>> elsewhere. Might be something for the hack-a-thon.
>>
>> I should say PubChem fingerprints are binary in nature though, so you would
>> probably never want the RAW version. *getBitFingerprint()* is always
>> implemented.
>>
>> John
>>
>> On Fri, 21 Feb 2020 at 09:34, John Mayfield 
>> wrote:
>>
>>> Hi Maria,
>>>
>>> Not all fingerprinters support the "RAW" and Count options.
>>>
>>> John
>>>
>>> On Fri, 21 Feb 2020 at 09:31, Maria Sorokina 
>>> wrote:
>>>
>>>> Dear community,
>>>>
>>>> It is decidedly substructure search and fingerprinting period of the
>>>> year!
>>>>
>>>> I want to create (to store) raw fingerprints of a range of different
>>>> fingerprint types for a big number of complex molecules (natural products).
>>>>
>>>> For example this:
>>>>
>>>> PubchemFingerprinter pubchemFingerprinter = new PubchemFingerprinter( 
>>>> SilentChemObjectBuilder.getInstance() );
>>>>
>>>> System.out.println(pubchemFingerprinter.getRawFingerprint(myAtomContainer));
>>>>
>>>> For all my molecules I am getting an "UnsupportedOperationException",
>>>> which according to the documentation reflects only the fact that the 
>>>> fingerprinter
>>>> cannot produce the raw fingerprint.
>>>> I am using the latest (2.3) version of the CDK.
>>>> Can anybody help me with this issue?
>>>>
>>>>
>>>> Kind regards,
>>>> Maria
>>>>
>>>>
>>>> Dr. Maria Sorokina
>>>> Steinbeck Research Group
>>>> Analytical Chemistry - Cheminformatics and Chemometrics
>>>> Friedrich-Schiller-University Jena, Germany
>>>> http://cheminf.uni-jena.de
>>>>
>>>> ___
>>>> Cdk-user mailing list
>>>> Cdk-user@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>
>>>
>>


Re: [Cdk-user] How can I save a wavy bond in a file?

2020-04-01 Thread John Mayfield
Yes, wavy bonds are definitely round-tripped by the MDL reader/writer (as
BondStereo.UP_OR_DOWN). It could be a JChemPaint issue though, and that's no
longer actively developed.

It's a corner case, but are you using a JChemPaint release or did you build
one yourself? If you've mixed in a new version of CDK it may be tripped up by
the new Bond "Display" property.

https://github.com/cdk/cdk/blob/master/base/interfaces/src/main/java/org/openscience/cdk/interfaces/IBond.java#L137

On Wed, 1 Apr 2020 at 04:31, Shao Frankro  wrote:

> Dear all,
> I am reading the source code of JChemPaint and writing a molecule
> editor based on CDK. I found that wavy bonds become solid bonds when I
> save them to an MDL/CML format file and reopen it in JChemPaint.
> So I want to know if there's any file format or any way to save
> all the information in CDK's memory model to a file so that I could reload
> it (maybe like serialization and deserialization). If not, what is the
> easiest way to achieve this?
>
> Thanks for your help!
>
>
> PS: I found that the CDKSourceCodeWriter may save the information in CDK's
> memory model, but I don't know how to use that code and it also
> loses stereo information.
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>


Re: [Cdk-user] How can I save a wavy bond in a file?

2020-04-02 Thread John Mayfield
In that case it should be stored as the "up_or_down" and looking at NCDK it
looks correct to me:

MDL reader

https://github.com/kazuyaujihara/NCDK/blob/master/NCDK/IO/MDLV2000Reader.cs#L1359

MDL writer

https://github.com/kazuyaujihara/NCDK/blob/master/NCDK/IO/MDLV2000Writer.cs#L577

Again this could be a JChemPaint issue using the old "MDLReader/Writer".

On Wed, 1 Apr 2020 at 14:17, Shao Frankro  wrote:

> Actually I am using C# with NCDK <https://github.com/kazuyaujihara/NCDK>,
> but I couldn't find the Bond "Display" property in it; maybe it hasn't been
> updated yet.
> Anyway, must I write my own file format writer? I thought there would be a
> serialization method in CDK/NCDK.   : )
>
> --
>  John Mayfield 
>  2020-4-1 20:00
>  Re: [Cdk-user] How can I save a wavy bond in a file?
>
> Yes they are round tripped by MDL (BondStereo.UP_OR_DOWN) for sure,
> however could be a JChemPaint issue - and that's no longer actively
> developed.
>
> Corner-case but are you using a JChemPaint release or built one yourself?
> If you've mixed in a new version of CDK it may be tripped up with the new
> Bond "Display" property.
>
>
> https://github.com/cdk/cdk/blob/master/base/interfaces/src/main/java/org/openscience/cdk/interfaces/IBond.java#L137
>
> On Wed, 1 Apr 2020 at 04:31, Shao Frankro  wrote:
>
> Dear all,
> I am reading the source code of JChemPaint and writing a molecule
> editor based on CDK. I found that wavy bonds become solid bonds when I
> save them to an MDL/CML format file and reopen it in JChemPaint.
> So I want to know if there's any file format or any way to save
> all the information in CDK's memory model to a file so that I could reload
> it (maybe like serialization and deserialization). If not, what is the
> easiest way to achieve this?
>
> Thanks for your help!
>
>
> PS: I found that the CDKSourceCodeWriter may save the information in CDK's
> memory model, but I don't know how to use that code and it also
> loses stereo information.
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>


Re: [Cdk-user] atom typing without atom type name

2020-09-21 Thread John Mayfield
Neither the SMILES parser nor the other IO classes (except maybe CML) will
assign atom types for you - you need to do this yourself with:

AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(mol);

(The method name really is spelled "percieve" in the CDK API.)

Atom types are an annotation on top of a molecule. There are different atom
types we could assign - CDK atom types are just one set, ALOGP is a
different set (for example). Before CDK 1.4 basically everything was built
on the assumption that CDK atom types were present - this is no longer the
case.

On Sun, 20 Sep 2020 at 22:44, Rajarshi Guha  wrote:

> Hi, the following code is failing because the parsed molecule has no atom
> type names. The calculate() method tries to identify atom types from the
> atoms type name, but this seems circular. Unless I assign atom types, where
> does the type name come from?
>
> public class CDKVolumeTest {
>     public static void main(String[] args) throws CDKException {
>         SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
>         IAtomContainer mol = sp.parseSmiles("CCO");
>
>         double vol = VABCVolume.calculate(mol);
>     }
> }
>
>
> --
> Rajarshi Guha | http://blog.rguha.net | @rguha 
>
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>


Re: [Cdk-user] CDK in PyCharm IDE

2020-07-05 Thread John Mayfield
Hi Stuart,

Sorry for the late reply

> Finally, while I was hoping it would be easy to use CDK right in
> PyCharm/Python I now see that I will have to install it separately and use
> the command line.
> I guess I don’t know, is there command line functionality built into the
> CDK jar file?


You don't need to go via the command line. IDEA
<https://www.jetbrains.com/idea/> is the Java equivalent of PyCharm; in fact I
think it came first and I believe PyCharm reuses most of the UI :-). We
don't have any command line utilities, and what you're trying to do is quite
specific, so I couldn't imagine having a utility for it even if we did provide
those. It's best to think of CDK like NumPy - it's a set of tools for you
to build something with. Since you sound more familiar with Python, why not
use RDKit or Open Babel?

> So, I have a project where I am converting a lot of data to RDF and as a
> result I am adding compounds to a graph DB with lots of descriptors and
> identifiers.
> Thus, the first thing I want to do with CDK is to get descriptors for
> compounds, preferably using the InChIKey (but other identifiers if needed).


I presume you mean Resource Description Framework (RDF), but note that the
CTfile Reaction Data File (RDF)
<http://help.accelrysonline.com/ulm/onelab/1.0/content/ulm_pdfs/direct/reference/ctfileformats2016.pdf>
is a common cheminformatics format, so this can get confusing. Egon knows more
about RDF but I believe Apache Jena <https://jena.apache.org/> will help
here - I think we actually use it in the CDK already.

> Thus, the first thing I want to do with CDK is to get descriptors for
> compounds, preferably using the InChIKey (but other identifiers if needed).


You can't read a structure from an InChIKey (it is a hash of the InChI), but
perhaps I misunderstand.

> Second, I want to store the atoms and bonds of the molecular graph in the
> data (as RDF).  I am therefore really interested in either:
> - getting access to the InChI canonicalization algorithm (if CDK has a
> version of that outside of the InChI code) OR
> - obtaining a .mol file where the connection layer atoms are labelled with
> the atom numbers from the canonicalization routine
> My idea is to see if I can generate the InChI string from the molecular
> graph using semantic inferencing


We call into the standard InChI native code; however, we do provide
convenient access to the InChI atom numbers:
http://cdk.github.io/cdk/latest/docs/api/index.html?org/openscience/cdk/graph/invariant/InChINumbersTools.html
It's not too hard to pull these out and set them on a MOLfile or SMILES.

John


On Thu, 25 Jun 2020 at 18:08, Chalk, Stuart  wrote:

> John
>
> Thanks for reply.
>
> So, I have a project where I am converting a lot of data to RDF and as a
> result I am adding compounds to a graph DB with lots of descriptors and
> identifiers.
> Thus, the first thing I want to do with CDK is to get descriptors for
> compounds, preferably using the InChIKey (but other identifiers if needed).
>
> Second, I want to store the atoms and bonds of the molecular graph in the
> data (as RDF).  I am therefore really interested in either:
> - getting access to the InChI canonicalization algorithm (if CDK has a
> version of that outside of the InChI code) OR
> - obtaining a .mol file where the connection layer atoms are labelled with
> the atom numbers from the canonicalization routine
> My idea is to see if I can generate the InChI string from the molecular
> graph using semantic inferencing
>
> Finally, while I was hoping it would be easy to use CDK right in
> PyCharm/Python I now see that I will have to install it separately and use
> the command line.
> I guess I don’t know, is there command line functionality built into the
> CDK jar file?
>
> Any advice much appreciated…
> Stuart
>
> On Jun 24, 2020, at 7:26 PM, John Mayfield 
> wrote:
>
> Hi Stuart,
>
> We have some small snippets here (
> https://github.com/cdk/cdk/wiki/Toolkit-Rosetta)
> but most of our doc is geared towards having at least some familiarity with
> writing and using Java libraries. Saying you don't see a plugin in the IDE
> for CDK is like saying you don't see a petrol cap on an electric car.
> Removing the Python vs JAVA issues - CDK is a chemistry toolkit (distinctly
> not an application) - you link in the JAR file to your own code and use
> it's components. IDE plugins help you code, formatting, syntax highlighting
> etc.
>
> Now skipping over a lot of details you can link a Java JAR in different
> ways either manually via the classpath or more commonly via

Re: [Cdk-user] CDK in PyCharm IDE

2020-06-24 Thread John Mayfield
Hi Stuart,

We have some small snippets here (
https://github.com/cdk/cdk/wiki/Toolkit-Rosetta) but most of our docs are
geared towards readers with at least some familiarity with writing and using
Java libraries. Saying you don't see a plugin in the IDE for CDK is like saying
you don't see a petrol cap on an electric car. Setting aside the Python vs Java
issues - CDK is a chemistry toolkit (distinctly not an application) - you
link the JAR file into your own code and use its components. IDE plugins
help you write code: formatting, syntax highlighting, etc.

Now, skipping over a lot of details, you can link a Java JAR in different
ways, either manually via the classpath or more commonly via a build tool
(e.g. maven/gradle/ant). Before going further I think it would be better to
start from what you hope to do, i.e. why CDK within the PyCharm IDE? Do
you just want to have a play around, or was there a task you wanted
to accomplish?

John

On Wed, 24 Jun 2020 at 23:26, Chalk, Stuart  wrote:

> Markus
>
> Thanks for that great idea!
>
> Sadly, I don’t find CDK in the Plugin marketplace for PyCharm.
> All the plugins are coding related of course...
>
> Regards,
> Stuart
>
> On Jun 24, 2020, at 5:35 PM, Markus Sitzmann 
> wrote:
>
> Hi Stuart,
>
> PyCharm is a specialized version of the IntelliJ IDE for Python. IntelliJ
> itself (and PyCharm) is written in Java, so the solution for CDK should be
> using IntelliJ. I am not sure if you can get Java extensions for PyCharm
> (the professional versions of PyCharm and IntelliJ even require separate
> licenses, but there is a community version of both).
>
> Markus
>
> ---
> Markus Sitzmann
>
>
> On 24. Jun 2020, at 23:20, Chalk, Stuart  wrote:
>
>  I am interested in using CDK within the PyCharm IDE.  I see that the
> recommendation on the CDK website is to use Cinfony, however the package on
> PyPi does not work and there does not seem to be a version after 2012.
> If anyone has expertise/advice/suggestions please let me know…
>
> I hope everyone out there in CDK land is doing OK given the current
> situation...
>
> Stuart Chalk, Ph.D.
> Professor of Chemistry
> Department of Chemistry, Building 50, Room 3514,
> University of North Florida
> 1 UNF Drive, Jacksonville, FL 32224 USA
> ORCID: -0002-0703-7776
> P: 904-620-1938
> F: 904-620-3535
> E: sch...@unf.edu
> W: http://www.unf.edu/coas/chemistry/
> 
> faculty/Stuart_Chalk.aspx
> 
>
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>


Re: [Cdk-user] Wrong molecular formula?

2020-12-03 Thread John Mayfield
Hi Manuel,

Chris is right, unfortunately the ChemDraw export isn't quite correct. It
is actually possible to represent multi-attach in V3000 but it's not used
here. The more common problem is that there is simply a random bond into
the middle of a ring. I've done a fair bit of work on ChemDraw processing (
https://nextmovesoftware.com/blog/2016/07/28/sketchy-sketches/); the
biggest issue is the ChemDraw chemical formula/abbreviation parsing, for
example K2CO3 has a peroxide, HATU is a "[H]*[3H][U]", etc. (I show more
examples in the poster).

NextMove has a commercial tool to generate CXSMILES; for your example note
the *m:* part on the end that captures the positional variation.

[john@harbinger:Praline]% java -jar exec/target/praline.jar convert
> ~/Downloads/structure.cdx --cxsmi
> [Ru]([P](CCC1=CC=CC=C1)(C2C2)C3C3)(Cl)(Cl)*.C1(=CC=C(C=C1)C(C)C)C
> |m:24:25.26.27.28.29.30| structure Molecule/Specific/High/+PVar


CDK can read and handle this, we actually do get the formula wrong still
though (will fix that).

Open Babel has a FOSS ChemDraw parser, so one option could be to modify that
and parse your examples to get the info and then generate the
MOLfile/CXSMILES. The parsing is easy: *NodeType="MultipleAttach"
Attachments="{id1} {id2} .."* where the ids are node ids. Unfortunately I
don't think they have the data structures to represent it, so it would be a
fair bit of work beyond handling these fields.

All the best,
John
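
Chris's check (open the molfile and count the carbon atoms) is easy to script
without any toolkit. The following is a minimal sketch, assuming a V2000
molfile with the counts line on line 4 and the element symbol in columns 32-34
of each atom line; the class and method names are invented for illustration.
Implicit hydrogens are not listed in the atom block, so only the atoms
actually written in the file are counted.

```java
import java.util.Map;
import java.util.TreeMap;

class MolfileElementCount {

    /** Count element symbols in the atom block of a V2000 molfile. */
    static Map<String, Integer> countElements(String molfile) {
        String[] lines = molfile.split("\r?\n", -1);
        if (lines.length < 4 || lines[3].length() < 3) {
            throw new IllegalArgumentException("no V2000 counts line found");
        }
        // The first three characters of the counts line hold the atom count.
        int nAtoms = Integer.parseInt(lines[3].substring(0, 3).trim());
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < nAtoms; i++) {
            int idx = 4 + i;
            if (idx >= lines.length || lines[idx].length() < 32) {
                throw new IllegalArgumentException("truncated atom block at line " + (idx + 1));
            }
            String line = lines[idx];
            // x, y, z take 10 characters each, then a space, then the symbol.
            String symbol = line.substring(31, Math.min(34, line.length())).trim();
            counts.merge(symbol, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Tiny two-atom example (C-O), written out by hand for the sketch.
        String mol = "\n  sketch\n\n"
                + "  2  1  0  0  0  0  0  0  0  0999 V2000\n"
                + "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
                + "    1.4000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0\n"
                + "  1  2  1  0\n"
                + "M  END\n";
        System.out.println(countElements(mol));
    }
}
```

Running this over the exported file should show directly whether the extra
carbon at the end of the multi-center bond is present in the atom block.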

On Wed, 2 Dec 2020 at 15:05, Christoph Steinbeck <
christoph.steinb...@uni-jena.de> wrote:

> Dear Manuel,
>
> if you open the mol file in a text editor, there are clearly 31 C atoms in
> the file.
> So the CDK is “right”. I also opened the file in Marvin Sketch and it
> output the analysis below.
>
> ChemDraw uses a fishy trick, as it seems, to create the illusion of a
> multi-center attachment. Clearly, they focus on publication-ready drawing
> of chemical structures and not on creating correct file representations of
> the chemistry. Fact is that the end of the line to the center of the
> benzene ring is a carbon atom and nothing else.
>
> Kind regards,
>
> Chris
>
> —
> Prof. Dr. Christoph Steinbeck
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> Phone Secretariat: +49-3641-948171
> http://cheminf.uni-jena.de
> http://orcid.org/-0001-6966-0814
>
> What is man but that lofty spirit - that sense of enterprise.
> ... Kirk, "I, Mudd," stardate 4513.3..
>
>
>
>
>
> > On 2. Dec 2020, at 14:38, Stesycki, Manuel 
> wrote:
> >
> > Dear CDK users,
> >
> > we are using CDK version 2.3 in our application.
> > As a user tried to add a structure (see attachment) we found a
> difference in the molecular formula of the structure.
> >
> > The original structure was drawn with ChemDraw 18.
> > A multi-center attachment was added to the structure and ChemDraw shows
> this molecular formula: C30H46Cl2PRu
> >
> > Whereas our application takes the mol-version of the cdx-file and
> computes this formula: C31H49Cl2PRu
> > To get the formula we use this piece of code:
> >
> > IMolecularFormula form =
> MolecularFormulaManipulator.getMolecularFormula(mol);
> > sumFormula = MolecularFormulaManipulator.getString(form);
> >
> > Did we miss something when creating the AtomContainer?
> > We create the atomcontainer directly by parsing the mol-file:
> > try (StringReader sr = new StringReader(molFile); MDLV2000Reader mr =
> new MDLV2000Reader(sr, mode)) {
> >
> > AtomContainer mol = new AtomContainer();
> > AtomContainer ac = mr.read(mol);
> > }
> >
> > Maybe someone can give us a hint, what we are doing wrong.
> >
> > Best regards,
> >Manuel Stesycki
> >
> > IT
> >0208 / 306-2146
> >Physikbau, Büro 117
> >stesy...@mpi-muelheim.mpg.de
> >
> > Max-Planck-Institut für Kohlenforschung
> >Kaiser-Wilhelm-Platz 1
> >D-45470 Mülheim an der Ruhr
> >http://www.kofo.mpg.de/de
> >
> > ___
> > Cdk-user mailing list
> > Cdk-user@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>
>
>
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>


Re: [Cdk-user] How to install Chemistry development kit after installing apache-maven-3.6.3

2020-12-30 Thread John Mayfield
Sorry, I misread your email and missed that you already had the JAR
downloaded. You do not need Maven unless your project will use Maven to
build. You just need to add the JAR to the classpath, together with the
directory holding your own compiled class (the separator is ; on Windows
and : on Linux/macOS):

java -cp "cdk-2.0.jar;." YourClassName

or in an IDE (e.g. Eclipse/IntelliJ) you would configure this from a menu
option.
On Wed, 30 Dec 2020 at 09:27, Winod Dhamnekar 
wrote:

> Hello,
>
> John May,
>
> Sir,
>
> What should be the contents of pom.xml file in cdk directory? What
> are the modelversion and snapshot? If you know it, please guide me in this
> regard.
>
> Cdk beginner user,
>
> WMD
>
>
>
> Sent from Mail  for
> Windows 10
>
>
>
> *From: *John May 
> *Sent: *30 December 2020 13:50
> *To: *Winod Dhamnekar 
> *Cc: *cdk-u...@lists.sf.net
> *Subject: *Re: [Cdk-user] How to install Chemistry development kit after
> installing apache-maven-3.6.3
>
>
>
> You need to run mvn install from the CDK directory, the install just
> builds the code and puts the JAR files in the maven repo directory
> (~/.m2/repository on Linux not sure where it is on Windows).
>
>
>
> If you just want to use the CDK you can actually just download the release
> jar from GitHub or let maven download them for you.
>
> Also note that CDK is a programming library and not an application.
>
>
>
> - John
>
>
>
> On 30 Dec 2020, at 06:25, Winod Dhamnekar 
> wrote:
>
> 
>
> Hello,
> I have java , java development kit 32 bit and 64 bit installed on my
> laptop. I have installed apache maven 3.6.3 and its path is C: \Program
> Files\apache-maven-3.6.3. I have downloaded chemistry development kit
> cdk-2.0.jar and it is Program Files directory.
>
> On my laptop, JAVA_HOME environment variable is set to C:\Program
> Files\Java\jdk-15.0.1. But at the time of installation of cdk-2.0.jar by
> giving command mvn install at the command prompt, following screen appears.
>
> <55C2E9251DE648ABBED995B9CC4CFC30.png>
>
>   How to overcome this difficulty?
>
>
>
> Cdk beginner user,
>
>
>
> WMD
>
>
>
> Sent from Mail  for
> Windows 10
>
>
>
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>
>


Re: [Cdk-user] Mol2 file to SMILES

2021-03-28 Thread John Mayfield
You should use the *CircularFingerprinter* for similarity.

On Sun, 28 Mar 2021 at 08:39, Sub Jae Shin  wrote:

> To John Mayfield
>
> Hi, I found the drugbank id property from AtomContainer's getproperties
> method, so that I could specify which atom container indicates which drug.
>
> I think my goal to get drug-drug similarity has been achieved in my guess.
>
> package com.company;
> import org.openscience.cdk.ChemFile;
> import org.openscience.cdk.exception.CDKException;
> import org.openscience.cdk.fingerprint.Fingerprinter;
> import org.openscience.cdk.fingerprint.IBitFingerprint;
> import org.openscience.cdk.fingerprint.IFingerprinter;
> import org.openscience.cdk.interfaces.IAtomContainer;
> import org.openscience.cdk.interfaces.IChemFile;
> import org.openscience.cdk.io.MDLV2000Reader;
> import org.openscience.cdk.similarity.Tanimoto;
> import org.openscience.cdk.tools.manipulator.ChemFileManipulator;
>
> import java.io.*;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Map;
>
> public class Main {
>
>     public static void main(String[] args) {
>         try {
>
>             InputStream structures = new FileInputStream("../data/drugbank/structures.sdf");
>             MDLV2000Reader reader = new MDLV2000Reader(structures);
>             IChemFile file = reader.read(new ChemFile());
>             //Where can I find drugbank id?
>
>             Fingerprinter finger = new Fingerprinter();
>             List<IAtomContainer> AtomData = ChemFileManipulator.getAllAtomContainers(file);
>             int count = AtomData.size();
>             ArrayList<ArrayList<Object>> df = new ArrayList<>();
>
>             for (int i = 0; i < count; ++i) {
>                 ArrayList<Object> list = new ArrayList<>();
>                 IAtomContainer acReference = AtomData.get(i);
>                 Map<Object, Object> refProperties = acReference.getProperties();
>                 list.add(refProperties.get("DATABASE_ID"));
>                 for (int j = 0; j < count; ++j) {
>                     IAtomContainer acStructure = AtomData.get(j);
>                     Map<Object, Object> structProperties = acStructure.getProperties();
>                     System.out.println("REF DATABASE_ID : " + refProperties.get("DATABASE_ID") +
>                             "-" + "COMP DATABASE_ID" + structProperties.get("DATABASE_ID") +
>                             " similarity is now calculating");
>                     double similarity = cdkCalculateTanimotoCoef(finger, acReference, acStructure);
>                     list.add(similarity);
>                 }
>                 df.add(list);
>             }
>             FileWriter result_csv = new FileWriter("../data/drugbank/drug_drug_sim.csv");
>
>             // one CSV row per reference drug: id, then one similarity per drug
>             for (ArrayList<Object> a : df) {
>                 String row = "";
>                 for (int i = 0; i < a.size(); ++i) {
>                     if (i == a.size() - 1) {
>                         row = row + a.get(i).toString() + "\n";
>                     }
>                     else {
>                         row = row + a.get(i).toString() + ",";
>                     }
>                 }
>                 // System.out.println(row);
>                 result_csv.write(row);
>             }
>
>             result_csv.close();
>
>             //System.out.println(acReference.toString());
>
>
>         } catch (FileNotFoundException | CDKException e) {
>             System.out.println(e.getMessage());
>         } catch (IOException e) {
>             e.printStackTrace();
>         }
>     }
>
>     public static double cdkCalculateTanimotoCoef(IFingerprinter fingerprinter,
>             IAtomContainer acReference, IAtomContainer acStructure) {
>
>         double ret = 0.0;
>
>         try {
>
>             IBitFingerprint fpReference = fingerprinter.getBitFingerprint(acReference);
>
>             //Tanimoto-score
>             IBitFingerprint fpStructure = fingerprinter.getBitFingerprint(acStructure);
>             ret = Tanimoto.calculate(fpReference, fpStructure);
>
>         } catch (Exception ex) {
>             //...
>         }
>
>         return ret;
>     }
> }
>
>
> I hope this code result matches with my goal.
>
> I always thank you all, cdk developers.
>
> Sincerely
> Seopjae Shin
>
>
> On Fri, Mar 26, 2021 at 6:36 PM John Mayfield 
> wrote:
>
>> Do you have a mol2 file or a SMILES file? It's not clear. Mol2 support
>> isn't great in the CDK mainly because it's 

Re: [Cdk-user] Mol2 file to SMILES

2021-03-26 Thread John Mayfield
Do you have a mol2 file or a SMILES file? It's not clear. Mol2 support
isn't great in the CDK, mainly because it's more a compchem/modelling format
than a cheminformatics one; cheminformatics primarily uses SMILES or MOLfile.

Presuming you know how to read a file line by line, here is an example
starting from SMILES:

IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
// load from SMILES and compute the ECFP (circular) fingerprint
IFingerprinter fpr = new CircularFingerprinter();
SmilesParser smipar = new SmilesParser(bldr);
List<String> smiles = Arrays.asList("Clc1c1",
                                    "Fc1c1",
                                    "Ic1c1",
                                    "Clc1n1");
List<BitSet> fps = new ArrayList<>();
for (String smi : smiles) {
    IAtomContainer mol = smipar.parseSmiles(smi);
    fps.add(fpr.getBitFingerprint(mol).asBitSet());
}
// print N^2 comparison table
for (int j = 0; j < fps.size(); j++)
    System.out.print("," + smiles.get(j));
System.out.print('\n');
for (int i = 0; i < fps.size(); i++) {
    System.out.print(smiles.get(i));
    for (int j = 0; j < fps.size(); j++) {
        System.out.printf(",%.3f", Tanimoto.calculate(fps.get(i), fps.get(j)));
    }
    System.out.print('\n');
}


,Clc1c1,Fc1c1,Ic1c1,Clc1n1
Clc1c1,1.000,0.368,0.368,0.292
Fc1c1,0.368,1.000,0.368,0.192
Ic1c1,0.368,0.368,1.000,0.192
Clc1n1,0.292,0.192,0.192,1.000

There are far more efficient ways of doing this, and for a large comparison
table use chemfp: https://chemfp.com/.
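
One cheap saving for the all-against-all table is that Tanimoto is symmetric,
so each pair only needs to be computed once. Below is a plain-Java sketch of
the coefficient itself and of a half-matrix fill. It is an illustration under
assumptions, not CDK's implementation: the class and method names are made up,
the toy bit sets stand in for real fingerprints, and scoring two empty
fingerprints as 0 is a convention chosen here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class TanimotoSketch {

    /** Tanimoto coefficient |A AND B| / |A OR B| of two bit fingerprints. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet and = (BitSet) a.clone();
        and.and(b);
        int common = and.cardinality();
        int union = a.cardinality() + b.cardinality() - common;
        // Convention for this sketch: two empty fingerprints score 0.
        return union == 0 ? 0.0 : (double) common / union;
    }

    /** Fill a symmetric similarity matrix, computing each pair only once. */
    static double[][] similarityMatrix(List<BitSet> fps) {
        int n = fps.size();
        double[][] sim = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double t = tanimoto(fps.get(i), fps.get(j));
                sim[i][j] = t;
                sim[j][i] = t; // mirror into the lower triangle
            }
        }
        return sim;
    }

    static BitSet bitsOf(int... positions) {
        BitSet bs = new BitSet();
        for (int p : positions) {
            bs.set(p);
        }
        return bs;
    }

    public static void main(String[] args) {
        // Toy fingerprints; real ones would come from a fingerprinter.
        List<BitSet> fps = new ArrayList<>();
        fps.add(bitsOf(1, 2, 3));
        fps.add(bitsOf(2, 3, 4));
        fps.add(bitsOf(7, 8));
        for (double[] row : similarityMatrix(fps)) {
            StringBuilder sb = new StringBuilder();
            for (double v : row) {
                sb.append(String.format("%.3f ", v));
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

This halves the number of comparisons but is still quadratic, which is why a
dedicated tool is the better choice once the table gets large.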

On Wed, 24 Mar 2021 at 06:42, Stesycki, Manuel 
wrote:

> Good morning,
>
> Use this class for Tanimoto calculations:
>  org.openscience.cdk.similarity.Tanimoto (see doc:
> http://cdk.github.io/cdk/latest/docs/api/index.html)
>
> you could do something like this to calculate your tanimoto score:
>
> public static double cdkCalculateTanimotoCoef(IFingerprinter
> fingerprinter, IAtomContainer acReference, IAtomContainer acStructure ) {
>
> double ret = 0.0;
>
> try {
>
> IBitFingerprint fpReference = fingerprinter.getBitFingerprint(
> acReference);
>
> //Tanimoto-score
> IBitFingerprint fpStructure = fingerprinter.getBitFingerprint(
> acStructure);
> ret = Tanimoto.calculate(fpReference, fpStructure);
>
> } catch (Exception ex) {
> //...
> }
>
> return ret;
> }
>
>
>
> Best regards,
>Manuel Stesycki
>
> IT
>0208 / 306-2146
>Physikbau, Büro 117
>stesy...@mpi-muelheim.mpg.de
>
> Max-Planck-Institut für Kohlenforschung
>Kaiser-Wilhelm-Platz 1
>D-45470 Mülheim an der Ruhr
>http://www.kofo.mpg.de/de
>
> On 24.03.2021 at 04:55, Sub Jae Shin wrote:
>
> To CDK developers.
>
> Hello, I'm trying to get drug-drug similarity by Tanimoto score.
>
> I'm a beginner of cdk and java, so I'm stuck in the process of changing
> smiles file to Tanimoto score's calculate method's variable.
>
> package com.company;
> import org.openscience.cdk.ChemFile;
> import org.openscience.cdk.exception.CDKException;
> import org.openscience.cdk.interfaces.IChemFile;
> import org.openscience.cdk.io.SMILESReader;
> import java.io.*;
>
> public class Main {
>
> public static void main(String[] args) {
> try {
>
> InputStream mol2DataStream = new 
> FileInputStream("../data/drugbank/structure.smiles");
> SMILESReader reader = new SMILESReader(mol2DataStream);
> IChemFile file = reader.read(new ChemFile());
>
> } catch (FileNotFoundException | CDKException e) {
> System.out.println(e.getMessage());
> }
> }
> }
>
> Sincerely
> Seopjae Shin.
>
>
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
>
> ___
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>

