Re: [Rdkit-discuss] non-element elements

2021-02-03 Thread Francois Berenger

On 04/02/2021 00:35, Brian Peterson wrote:

> Hello RDKit people,
>
> Is it possible to modify the properties of elements in the periodic
> table or to create new ones?  Use case: Suppose one had some molecules
> defined in terms of functional groups or united atoms or some other
> entities that are not pure elemental atoms. Could one map these things
> on to unused elements (e.g. my_functional_group --> U) and fix up the
> properties of U so that it had the appropriate valence etc. and could
> be present both in a molecule and in SMARTS patterns so that one could
> do substructure matches within RDKit?

Maybe you can use the isotope number to encode some special meaning
for an atom.

Cf. http://www.rdkit.org/docs/GettingStartedInPython.html

"Other fragmentation approaches"
[...] attachment points are labelled (using isotopes) [...]

Those are preserved in the output SMILES.

> Thanks,
> Brian
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss





Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java

2021-02-03 Thread Tim Dudgeon
Steve,
The crash happens whether the code runs in multiple threads or in a single thread.
Tim

On Wed, Feb 3, 2021 at 2:36 PM Stephen Roughley wrote:

> Hi Tim,
>
> You mentioned the calculation is done using Java streams. Have you tried
> calling #sequential() on your stream to force it to run single threaded
> from the Java side?
>
> Steve
>


[Rdkit-discuss] non-element elements

2021-02-03 Thread Brian Peterson


Hello RDKit people,

Is it possible to modify the properties of elements in the periodic 
table or to create new ones?  Use case: Suppose one had some molecules 
defined in terms of functional groups or united atoms or some other 
entities that are not pure elemental atoms. Could one map these things 
on to unused elements (e.g. my_functional_group --> U) and fix up the 
properties of U so that it had the appropriate valence etc. and could be 
present both in a molecule and in SMARTS patterns so that one could do 
substructure matches within RDKit?


Thanks,
Brian



Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java

2021-02-03 Thread Tim Dudgeon
Hi Greg,

The problem seems to happen whether or not the code runs in a single thread.
It does not seem to depend on the molecule; as far as I can tell it always happens.
The molecules are created from SMILES using RWMol.MolFromSmiles(smiles),
which I believe sanitizes by default.

Tim

On Wed, Feb 3, 2021 at 12:36 PM Greg Landrum wrote:

> Hi Tim,
>
> I haven't seen this particular problem myself, nor have we gotten any
> reports of crashes from the Morgan fingerprinting code.
> Comparing the fingerprinting code itself across the 2019.09 and 2020.09
> branches I also don't see anything which is likely to cause problems, but
> one never knows.
>
> One thing that might help to know is how you construct the molecules
> you're generating fingerprints for: are these from one of the RDKit file
> parsers? Have they been sanitized?
>
> Another thing you might have already tried, but it's worth checking
> anyway: can you force your web app to only run a single thread at a time?
> Threading shouldn't be a problem with the Morgan fingerprinting code, but
> it's still worth the experiment.
>
> -greg
>
>


Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java

2021-02-03 Thread Stephen Roughley via Rdkit-discuss
Hi Tim,

You mentioned the calculation is done using Java streams. Have you tried
calling #sequential() on your stream to force it to run single threaded
from the Java side?

Steve
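
A minimal sketch of that suggestion, not taken from Tim's webapp: calling
sequential() switches the whole stream pipeline to single-threaded execution,
even when the stream was created as a parallel one. The names
SequentialStreamSketch, placeholderFingerprint and threadsSeen are
hypothetical, and the placeholder only stands in for the
RDKFuncs.MorganFingerprintMol(mol, 2) call so that the snippet runs without
the RDKit wrappers.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

class SequentialStreamSketch {

    // Names of the threads that executed the placeholder, kept for inspection.
    static final Set<String> threadsSeen = ConcurrentHashMap.newKeySet();

    // Hypothetical stand-in for the native call; in the real webapp this
    // would be RDKFuncs.MorganFingerprintMol(mol, 2).
    static int placeholderFingerprint(String smiles) {
        return smiles.hashCode();
    }

    static List<Integer> fingerprintAll(List<String> smiles) {
        return smiles.parallelStream()  // simulate a pipeline that was made parallel
                .sequential()           // the last mode switch wins: run on the calling thread
                .map(s -> {
                    threadsSeen.add(Thread.currentThread().getName());
                    return placeholderFingerprint(s);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> out = fingerprintAll(List.of("CCO", "c1ccccc1", "CC(=O)O"));
        System.out.println(out.size() + " results computed on threads " + threadsSeen);
    }
}
```

If the crash still occurs with the pipeline forced to run sequentially, the
stream's own parallelism is unlikely to be the trigger, although Tomcat may
still be serving several requests on different threads at the same time.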

On Wed, 3 Feb 2021, 12:58 Greg Landrum wrote:

> Given the fun that threading is, this isn't necessarily conclusive, but I
> just created a small C++ multi-threading test for the Morgan fingerprinting
> code and everything looks fine. That remains true when the code is run
> under Valgrind (which is quite good at picking up the usual types of memory
> corruption that cause threading issues).
>
> -greg


Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java

2021-02-03 Thread Greg Landrum
Given the fun that threading is, this isn't necessarily conclusive, but I
just created a small C++ multi-threading test for the Morgan fingerprinting
code and everything looks fine. That remains true when the code is run
under Valgrind (which is quite good at picking up the usual types of memory
corruption that cause threading issues).

-greg
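
The C++ test itself is not shown in the thread. A rough Java analogue of the
same idea, assuming the function under test is deterministic, is to compute
reference results on one thread and then call the function from several
threads at once, counting disagreements. ThreadStressSketch,
placeholderFingerprint and countMismatches are hypothetical names; the
placeholder stands in for the real fingerprint call so the sketch runs
without RDKit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ThreadStressSketch {

    // Hypothetical stand-in for the fingerprint call under test.
    static Integer placeholderFingerprint(String smiles) {
        return smiles.hashCode();
    }

    // Calls fn on every input from nThreads workers at once and returns how
    // many results differ from a single-threaded reference run.
    static <T, R> int countMismatches(Function<T, R> fn, List<T> inputs, int nThreads) {
        List<R> reference = new ArrayList<>();
        for (T input : inputs) {
            reference.add(fn.apply(input));
        }
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int t = 0; t < nThreads; t++) {
            tasks.add(() -> {
                int mismatches = 0;
                for (int i = 0; i < inputs.size(); i++) {
                    if (!reference.get(i).equals(fn.apply(inputs.get(i)))) {
                        mismatches++;
                    }
                }
                return mismatches;
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            int total = 0;
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("stress run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> smiles = List.of("CCO", "c1ccccc1", "CC(=O)O");
        int bad = countMismatches(ThreadStressSketch::placeholderFingerprint, smiles, 8);
        System.out.println("mismatches: " + bad);
    }
}
```

A native crash such as the SIGSEGV in the report would abort the JVM rather
than show up as a mismatch, so a harness like this mainly tells you whether
concurrent calls alone are enough to trigger the problem.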


On Wed, Feb 3, 2021 at 1:36 PM Greg Landrum wrote:

> Hi Tim,
>
> I haven't seen this particular problem myself, nor have we gotten any
> reports of crashes from the Morgan fingerprinting code.
> Comparing the fingerprinting code itself across the 2019.09 and 2020.09
> branches I also don't see anything which is likely to cause problems, but
> one never knows.
>
> One thing that might help to know is how you construct the molecules
> you're generating fingerprints for: are these from one of the RDKit file
> parsers? Have they been sanitized?
>
> Another thing you might have already tried, but it's worth checking
> anyway: can you force your web app to only run a single thread at a time?
> Threading shouldn't be a problem with the Morgan fingerprinting code, but
> it's still worth the experiment.
>
> -greg
>
>


Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java

2021-02-03 Thread Greg Landrum
Hi Tim,

I haven't seen this particular problem myself, nor have we gotten any
reports of crashes from the Morgan fingerprinting code.
Comparing the fingerprinting code itself across the 2019.09 and 2020.09
branches I also don't see anything which is likely to cause problems, but
one never knows.

One thing that might help to know is how you construct the molecules
you're generating fingerprints for: are these from one of the RDKit file
parsers? Have they been sanitized?

Another thing you might have already tried, but it's worth checking anyway:
can you force your web app to only run a single thread at a time? Threading
shouldn't be a problem with the Morgan fingerprinting code, but it's still
worth the experiment.

-greg
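
One way to try that experiment from the application side is to funnel every
call through a single lock, so that at most one thread is inside the guarded
code at any time. This is a minimal sketch under stated assumptions, not a
description of Tim's webapp: SerializedCallSketch and maxObservedConcurrency
are hypothetical names, and the lambda returning s.hashCode() stands in for
the RDKFuncs.MorganFingerprintMol(mol, 2) call.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

class SerializedCallSketch<T, R> {

    private final ReentrantLock lock = new ReentrantLock();
    private final Function<T, R> delegate;

    SerializedCallSketch(Function<T, R> delegate) {
        this.delegate = delegate;
    }

    // Only one caller at a time gets past the lock and into the delegate.
    R apply(T input) {
        lock.lock();
        try {
            return delegate.apply(input);
        } finally {
            lock.unlock();
        }
    }

    // Self-check: hammer a guarded placeholder from several threads and
    // report the largest number of callers seen inside it simultaneously.
    static int maxObservedConcurrency(int nThreads, int callsPerThread) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        SerializedCallSketch<String, Integer> guarded =
                new SerializedCallSketch<String, Integer>(s -> {
                    int now = inFlight.incrementAndGet();
                    maxSeen.accumulateAndGet(now, Math::max);
                    try {
                        return s.hashCode(); // hypothetical stand-in for the native call
                    } finally {
                        inFlight.decrementAndGet();
                    }
                });
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    guarded.apply("CCO");
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent callers: " + maxObservedConcurrency(8, 1000));
    }
}
```

The lock serializes the calls but does not pin them to one particular thread.
If the crash depends on which thread makes the native call rather than on
overlap between calls, this experiment would not rule that out.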


On Tue, Feb 2, 2021 at 7:14 PM Tim Dudgeon wrote:

> Wondering if anyone had any thoughts on this core dump from Java.
> What other info would be useful?
>
> Tim
>
> On Tue, Jan 12, 2021 at 12:55 PM Tim Dudgeon wrote:
>
>> I'm struggling to work out a strange core dump I'm getting when
>> calculating Morgan fingerprints from Java. This seems to happen with the
>> Release_2020_09 releases but not with the Release_2019_09 ones. It does not
>> happen when calculating RDKit fingerprints. The exact Java code involved is:
>>
>> RDKFuncs.MorganFingerprintMol(mol, 2);
>>
>> More precisely, this happens inside a Docker container that runs the code
>> as a Tomcat webapp. A simple test that calls the same function directly
>> from Java inside the container (i.e. not in Tomcat) works and does not
>> core dump. Building an otherwise identical container with the
>> Release_2019_09 code does not core dump from Tomcat.
>>
>> The core dump looks like this:
>>
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x7ff9edc00518, pid=1, tid=111
>> #
>> # JRE version: OpenJDK Runtime Environment (11.0.9.1+1) (build
>> 11.0.9.1+1-post-Debian-1deb10u2)
>> # Java VM: OpenJDK 64-Bit Server VM (11.0.9.1+1-post-Debian-1deb10u2,
>> mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # [thread 145 also had an error]
>> [thread 149 also had an error]
>> [thread 113 also had an error]
>> [thread 117 also had an error]
>> C  [libGraphMolWrap.so+0xa20518]  void
>> RDKit::MorganFingerprints::calcFingerprint int> >(RDKit::ROMol const&, unsigned int,
>> std::vector std::allocator >*, std::vector std::allocator > const*,
>> bool, bool, bool, bool,
>> std::map,
>> std::allocator > >,
>> std::less, std::allocator std::vector,
>> std::allocator > > > > >*, bool,
>> RDKit::SparseIntVect&)+0x148
>>
>> It's difficult to know what's wrong, but I thought it worth asking whether
>> anything in the Morgan fingerprint code changed over that timeframe.
>> It might be related to threading, as the fingerprints are generated
>> inside Java streams.
>>
>> Tim
>>
>>
>>