Re: [Rdkit-discuss] Question about ECFP fingerprints when using multiprocessing and chirality

2020-05-19 Thread Greg Landrum
Hi Hao, Good question! I had to do a bit of digging to figure that out. Here's what's going on: The Morgan fingerprint code uses CIP codes when you set useChirality=True. Atomic CIP codes are stored as an atomic property. When you use the multiprocessing module everything ends up being pickled and

[Rdkit-discuss] Comparing sets of conformers

2020-05-19 Thread Othman Al Bahri
Hello, I have two sets of conformers (ca. 2000 each) of a molecule. I'd like to find the relative complement of set A in set B (i.e., the conformers in set B that are not in set A). I'm thinking of calculating the distance matrix of each conformer, then looping through all conformers to find
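
The approach described in this message can be sketched in plain Python. This is only a rough illustration under several assumptions that are not in the original post: each conformer is a list of (x, y, z) tuples with the same atom ordering, two conformers count as "the same" when the RMS difference between their interatomic distance matrices is at most a hypothetical `threshold`, and the function names are made up for this sketch. Note that distance matrices are unchanged by rotation, translation and mirroring, so mirror-image conformers would be treated as identical.

```python
import math
from itertools import combinations


def distance_matrix(coords):
    """Return the upper triangle of the interatomic distance matrix as a flat list."""
    return [math.dist(p, q) for p, q in combinations(coords, 2)]


def dmat_rms(d1, d2):
    """Root-mean-square difference between two flattened distance matrices."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)) / len(d1))


def relative_complement(set_a, set_b, threshold=0.1):
    """Indices of conformers in set_b that have no match in set_a.

    `threshold` (same length unit as the coordinates) is an illustrative
    value, not a recommendation.
    """
    dms_a = [distance_matrix(conf) for conf in set_a]  # computed once
    unique = []
    for idx, conf in enumerate(set_b):
        dm_b = distance_matrix(conf)
        if all(dmat_rms(dm_b, dm_a) > threshold for dm_a in dms_a):
            unique.append(idx)
    return unique


# Toy three-atom "conformers" (hypothetical coordinates)
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
bent_shifted = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (6.0, 6.0, 5.0)]  # same shape, translated
linear = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

set_a = [bent]
set_b = [bent_shifted, linear]
print(relative_complement(set_a, set_b))  # → [1]
```

With about 2000 conformers per set this is roughly four million matrix comparisons, so precomputing the matrices for set A, as done here, avoids repeating that work inside the loop. The sketch does not remove duplicates within set B itself.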

[Rdkit-discuss] Question about ECFP fingerprints when using multiprocessing and chirality

2020-05-19 Thread Hao
Hello, This was a very strange bug that I saw. I was getting inconsistent fingerprints from GetMorganFingerprint with useChirality=True when I used multiprocessing versus when I ran serially, on RDKit 2017.09.1 and 2018.03.2. It seems to have been fixed in the latest version. Woo! I was just

Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

2020-05-19 Thread Paolo Tosco
Hi Theo, I don't think the RDKit version should make a difference; did you notice that rdmolops.AdjustQueryProperties() does not modify the molecule in place, but rather returns a modified copy? pattern_generic_bonds = Chem.AdjustQueryProperties(pattern, query_params) That might be the
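
The point about the return value can be illustrated with a small sketch. The butadiene pattern, the benzene target and the `makeBondsGeneric` setting are assumptions chosen for illustration; the actual pattern and query parameters used in the thread are not shown in this snippet.

```python
from rdkit import Chem

# Hypothetical pattern: butadiene, with explicit single and double bonds
pattern = Chem.MolFromSmiles("C=CC=C")

# Illustrative parameters: ask for bond queries that match any bond order
query_params = Chem.AdjustQueryParameters()
query_params.makeBondsGeneric = True

# AdjustQueryProperties() returns a modified copy, so the result must be captured
pattern_generic_bonds = Chem.AdjustQueryProperties(pattern, query_params)

# The original molecule keeps its bond types
original_bond_types = [b.GetBondType() for b in pattern.GetBonds()]
print(original_bond_types)
print([b.GetBondType() for b in pattern_generic_bonds.GetBonds()])

# Compare how the two patterns behave against an aromatic target
benzene = Chem.MolFromSmiles("c1ccccc1")
print(benzene.HasSubstructMatch(pattern))
print(benzene.HasSubstructMatch(pattern_generic_bonds))
```

If the call is made without assigning its result, `pattern` is left untouched and later searches still use the original SINGLE and DOUBLE bonds.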

Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

2020-05-19 Thread theozh
Hi Paolo, thank you very much for your detailed answer. I tried to reproduce your last suggestion (but I don't have Jupyter Notebook). However, my bonds are still SINGLE and DOUBLE instead of UNSPECIFIED. Could this depend on the RDKit version? I have 2019.03... Maybe I should update and

Re: [Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

2020-05-19 Thread Paolo Tosco
Hi Theo, the lack of match is due to different aromaticity flags on atoms and bonds in the larger molecule. This gist provides some explanation and a possible solution: https://gist.github.com/ptosco/e410e45278b94e8f047ff224193d7788 Cheers, p. On 19/05/2020 14:13, theozh wrote: Dear

Re: [Rdkit-discuss] performance issues with PandasTools LoadSDF

2020-05-19 Thread Mario Lovrić
The original file is around 25 MB. I changed the content in the dataframe (from sdf) and wrote: PandasTools.WriteSDF(dataframe, 'path', properties=dataframe.columns, idName='ID') Thanks, Mario On Tue, May 19, 2020 at 9:13 AM Greg Landrum wrote: > Hi

[Rdkit-discuss] Substructure search issue with aliphatic/aromatic bonds

2020-05-19 Thread theozh
Dear RDKit-users, I would like to do a very simple substructure search. Chapter 3.5, "Substructure Searching", in the RDKit documentation (2019.09.1) is pretty short and doesn't point to a solution. So far, I've learned that you can create your search pattern via Chem.MolFromSmiles() or

Re: [Rdkit-discuss] performance issues with PandasTools LoadSDF

2020-05-19 Thread Greg Landrum
Hi Mario, how big is the file? Did you *add* properties to it or just modify existing values? -greg On Fri, May 15, 2020 at 11:34 AM Mario Lovrić wrote: > Dear all, I have loaded an SDF file (let's call it file1) with PandasTools, corrected some properties and wrote it with PandasTools