Re: [Rdkit-discuss] non-element elements
On 04/02/2021 00:35, Brian Peterson wrote:
> Hello RDKit people,
>
> Is it possible to modify the properties of elements in the periodic table or to create new ones?
>
> Use case: Suppose one had some molecules defined in terms of functional groups or united atoms or some other entities that are not pure elemental atoms. Could one map these things on to unused elements (e.g. my_functional_group --> U) and fix up the properties of U so that it had the appropriate valence etc. and could be present both in a molecule and in SMARTS patterns so that one could do substructure matches within RDKit?

Maybe you can use the isotope number to encode some special meaning for an atom. Cf. http://www.rdkit.org/docs/GettingStartedInPython.html, "Other fragmentation approaches":

  [...] attachment points are labelled (using isotopes) [...]

Those are preserved in the output SMILES.

> Thanks,
> Brian

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java
Steve,

It happens whether running in multiple threads or a single thread.

Tim

On Wed, Feb 3, 2021 at 2:36 PM Stephen Roughley wrote:
> Hi Tim,
>
> You mentioned the calculation is done using Java streams. Have you tried
> calling #sequential() on your stream to force it to run single threaded
> from the Java side?
>
> Steve
[Rdkit-discuss] non-element elements
Hello RDKit people,

Is it possible to modify the properties of elements in the periodic table or to create new ones?

Use case: Suppose one had some molecules defined in terms of functional groups or united atoms or some other entities that are not pure elemental atoms. Could one map these things on to unused elements (e.g. my_functional_group --> U) and fix up the properties of U so that it had the appropriate valence etc. and could be present both in a molecule and in SMARTS patterns so that one could do substructure matches within RDKit?

Thanks,
Brian
Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java
Hi Greg,

The problem seems to happen whether running in a single thread or not. It does not seem to depend on the molecule; as far as I can tell it always happens. The molecules are created from SMILES using RWMol.MolFromSmiles(smiles), which I believe sanitizes by default.

Tim

On Wed, Feb 3, 2021 at 12:36 PM Greg Landrum wrote:
> Hi Tim,
>
> I haven't seen this particular problem myself, nor have we gotten any
> reports of crashes from the Morgan fingerprinting code.
> Comparing the fingerprinting code itself across the 2019.09 and 2020.09
> branches I also don't see anything which is likely to cause problems, but
> one never knows.
>
> One thing that might help to know is how you construct the molecules
> you're generating fingerprints for: are these from one of the RDKit file
> parsers? Have they been sanitized?
>
> Another thing you might have already tried, but it's worth checking
> anyway: can you force your web app to only run a single thread at a time?
> That shouldn't be a problem with the Morgan fingerprinting code, but it's
> still worth the experiment.
>
> -greg
Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java
Hi Tim,

You mentioned the calculation is done using Java streams. Have you tried calling #sequential() on your stream to force it to run single-threaded from the Java side?

Steve

On Wed, 3 Feb 2021, 12:58 Greg Landrum wrote:
> Given the fun that threading is, this isn't necessarily conclusive, but I
> just created a small C++ multi-threading test for the Morgan fingerprinting
> code and everything looks fine. That remains true when the code is run
> under valgrind (which is quite good at picking up the usual types of memory
> corruption that cause threading issues).
>
> -greg
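For anyone wanting to try the #sequential() experiment, here is a minimal sketch of what it does on the Java side. It does not call RDKit at all: `fakeFingerprint` is a hypothetical stand-in for the native `RDKFuncs.MorganFingerprintMol(mol, 2)` call from the thread, and the class and method names are made up for illustration. The only thing it shows is that a stream switched to sequential mode does all of its work on the calling thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

class SequentialStreamSketch {

    // Hypothetical stand-in for the native fingerprint call; not an RDKit API.
    static int fakeFingerprint(String smiles) {
        return smiles.hashCode();
    }

    // Processes every input and returns the names of the threads that did the work.
    static Set<String> threadsUsed(List<String> smiles, boolean forceSequential) {
        Set<String> threads = ConcurrentHashMap.newKeySet();
        Stream<String> stream = smiles.parallelStream();
        if (forceSequential) {
            // sequential() switches the whole pipeline to the calling thread.
            stream = stream.sequential();
        }
        stream.forEach(s -> {
            threads.add(Thread.currentThread().getName());
            fakeFingerprint(s);
        });
        return threads;
    }

    public static void main(String[] args) {
        List<String> smiles = Arrays.asList("CCO", "c1ccccc1", "CC(=O)O", "CCN");
        System.out.println("parallel:   " + threadsUsed(smiles, false));
        System.out.println("sequential: " + threadsUsed(smiles, true));
    }
}
```

If the crash still occurs with the sequential variant, concurrent access from the stream itself is unlikely to be the trigger, although other threads in the web container could still be involved.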
Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java
Given the fun that threading is, this isn't necessarily conclusive, but I just created a small C++ multi-threading test for the Morgan fingerprinting code and everything looks fine. That remains true when the code is run under valgrind (which is quite good at picking up the usual types of memory corruption that cause threading issues).

-greg

On Wed, Feb 3, 2021 at 1:36 PM Greg Landrum wrote:
> Hi Tim,
>
> I haven't seen this particular problem myself, nor have we gotten any
> reports of crashes from the Morgan fingerprinting code.
> Comparing the fingerprinting code itself across the 2019.09 and 2020.09
> branches I also don't see anything which is likely to cause problems, but
> one never knows.
>
> One thing that might help to know is how you construct the molecules
> you're generating fingerprints for: are these from one of the RDKit file
> parsers? Have they been sanitized?
>
> Another thing you might have already tried, but it's worth checking
> anyway: can you force your web app to only run a single thread at a time?
> That shouldn't be a problem with the Morgan fingerprinting code, but it's
> still worth the experiment.
>
> -greg
Re: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java
Hi Tim,

I haven't seen this particular problem myself, nor have we gotten any reports of crashes from the Morgan fingerprinting code. Comparing the fingerprinting code itself across the 2019.09 and 2020.09 branches, I also don't see anything which is likely to cause problems, but one never knows.

One thing that might help to know is how you construct the molecules you're generating fingerprints for: are these from one of the RDKit file parsers? Have they been sanitized?

Another thing you might have already tried, but it's worth checking anyway: can you force your web app to only run a single thread at a time? That shouldn't be a problem with the Morgan fingerprinting code, but it's still worth the experiment.

-greg

On Tue, Feb 2, 2021 at 7:14 PM Tim Dudgeon wrote:
> Wondering if anyone had any thoughts on this core dump from Java.
> What other info would be useful?
>
> Tim
>
> On Tue, Jan 12, 2021 at 12:55 PM Tim Dudgeon wrote:
>> I'm struggling to work out a strange core dump I'm getting when
>> calculating Morgan fingerprints from Java. This seems to happen with the
>> Release_2020_09 releases but not with the Release_2019_09 ones. It does not
>> happen when calculating RDKit fingerprints. The exact Java code involved is:
>>
>> RDKFuncs.MorganFingerprintMol(mol, 2);
>>
>> More precisely, this is happening when running inside a Docker container
>> which is running the code as a Tomcat webapp, but a simple test of running
>> that same function inside the container directly from Java (i.e. not when
>> running in Tomcat) works OK and does not core dump.
>> Building an otherwise identical container with the Release_2019_09 code
>> does not core dump from Tomcat.
>>
>> The core dump looks like this:
>>
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # SIGSEGV (0xb) at pc=0x7ff9edc00518, pid=1, tid=111
>> #
>> # JRE version: OpenJDK Runtime Environment (11.0.9.1+1) (build 11.0.9.1+1-post-Debian-1deb10u2)
>> # Java VM: OpenJDK 64-Bit Server VM (11.0.9.1+1-post-Debian-1deb10u2, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # [thread 145 also had an error]
>> [thread 149 also had an error]
>> [thread 113 also had an error]
>> [thread 117 also had an error]
>> C [libGraphMolWrap.so+0xa20518] void RDKit::MorganFingerprints::calcFingerprint<RDKit::SparseIntVect<unsigned int> >(RDKit::ROMol const&, unsigned int, std::vector<unsigned int, std::allocator<unsigned int> >*, std::vector<unsigned int, std::allocator<unsigned int> > const*, bool, bool, bool, bool, std::map<unsigned int, std::vector<std::pair<unsigned int, unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> > >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, std::vector<std::pair<unsigned int, unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> > > > > >*, bool, RDKit::SparseIntVect<unsigned int>&)+0x148
>>
>> It's difficult to know what's wrong, but thought it might be worth asking
>> if anything in the Morgan fingerprint code has changed over that timeframe?
>> It might be related to threading as the fingerprint generation is being
>> done inside Java streams.
>>
>> Tim
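One possible way to run the "only a single thread at a time" experiment in a web app is to funnel every call through a single worker thread. The sketch below is only an illustration under assumptions: `SingleThreadGate` and `workerThreadsSeen` are hypothetical names, nothing here touches RDKit, and the task that would wrap the native fingerprint call is replaced by one that just reports which thread ran it.

```java
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

class SingleThreadGate implements AutoCloseable {
    // All submitted tasks run one after another on this single worker thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Runs the task on the worker thread and blocks the caller until it finishes.
    <T> T call(Callable<T> task) {
        try {
            return worker.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }

    @Override
    public void close() {
        worker.shutdown();
    }

    // Calls the gate from several caller threads and collects the worker thread names seen.
    static Set<String> workerThreadsSeen(int callers) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        try (SingleThreadGate gate = new SingleThreadGate()) {
            IntStream.range(0, callers).parallel()
                     .forEach(i -> seen.add(gate.call(() -> Thread.currentThread().getName())));
        }
        return seen;
    }

    public static void main(String[] args) {
        // In a real experiment the task would wrap the native fingerprint call instead.
        System.out.println(workerThreadsSeen(8));
    }
}
```

Because every request thread blocks on the same worker, this serializes the work and will cost throughput, so it is meant as a diagnostic step rather than a fix.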