Re: [Rdkit-discuss] atom mapping in reaction searches
Hi Sebastian,

I'm a bit mystified by this and am going to have to dig around a bit to see if I can figure out what's going on.

-greg

On Wed, May 9, 2018 at 9:59 PM Sebastian Wandernoth wrote:

> Hey guys,
>
> any chance to get an answer on my issue? Even if the answer is that this
> feature is currently not included in RDKit, it would still be helpful :-)
>
> Best regards
> Sebastian
>
> *Sent:* Tuesday, 24 April 2018 at 09:33
> *From:* "Sebastian Wandernoth"
> *To:* rdkit-discuss@lists.sourceforge.net
> *Subject:* [Rdkit-discuss] atom mapping in reaction searches
>
> Hey guys,
>
> I'm still working on my search engine for reactions and I'm a bit puzzled
> as to what RDKit does with atom mapping information.
> I'm still working with the PostgreSQL cartridge version 0.73.0, which
> should correspond to the release 2017.9.3.
>
> I'm starting off with this example reaction, which is fully mapped:
>
> [S:1]1[C:3]([Cl:4])=[N:5][CH:6]=[CH:2]1>>S(=O)(=O)1OCC([C:2]2[S:1][C:3]([Cl:4])=[N:5][CH:6]=2)=N1
>
> If I use a completely unmapped reaction as the query, I expect to find this
> one. So the following should return TRUE:
>
> SELECT
> reaction_from_smarts('[S:1]1[C:3]([Cl:4])=[N:5][CH:6]=[CH:2]1>>S(=O)(=O)1OCC([C:2]2[S:1][C:3]([Cl:4])=[N:5][CH:6]=2)=N1')
> @>
> reaction_from_smarts('S1C(Cl)=NC=C1>>S(=O)(=O)1OCC(C2SC(Cl)=NC=2)=N1');
>
> ... which it does.
>
> The next step is to map one atom correctly in the query and try again. I
> still expect this to return TRUE:
>
> SELECT
> reaction_from_smarts('[S:1]1[C:3]([Cl:4])=[N:5][CH:6]=[CH:2]1>>S(=O)(=O)1OCC([C:2]2[S:1][C:3]([Cl:4])=[N:5][CH:6]=2)=N1')
> @>
> reaction_from_smarts('[S:1]1C(Cl)=NC=C1>>S(=O)(=O)1OCC(C2[S:1]C(Cl)=NC=2)=N1');
>
> ... which it doesn't.
>
> With two atoms mapped correctly in the query, I wouldn't expect to get
> different results from the previous try:
>
> SELECT
> reaction_from_smarts('[S:1]1[C:3]([Cl:4])=[N:5][CH:6]=[CH:2]1>>S(=O)(=O)1OCC([C:2]2[S:1][C:3]([Cl:4])=[N:5][CH:6]=2)=N1')
> @>
> reaction_from_smarts('[S:1]1C(Cl)=NC=[CH:2]1>>S(=O)(=O)1OCC([C:2]2[S:1]C(Cl)=NC=2)=N1');
>
> ... this one, however, returns TRUE again.
>
> The final try was to include a wrong mapping in the query. I definitely
> would expect to get back FALSE here (I'm mapping one sulfur atom to a
> carbon atom and a nitrogen to an oxygen):
>
> SELECT
> reaction_from_smarts('[S:1]1[C:3]([Cl:4])=[N:5][CH:6]=[CH:2]1>>S(=O)(=O)1OCC([C:2]2[S:1][C:3]([Cl:4])=[N:5][CH:6]=2)=N1')
> @>
> reaction_from_smarts('[S:1]1C(Cl)=[N:2]C=C1>>S(=[O:2])(=O)1O[CH2:1]C(C2SC(Cl)=NC=2)=N1');
>
> ... however, this returns TRUE yet again.
>
> Playing around with it a bit more, I found that whichever single atom I map
> in the query, I always get back FALSE, and if I map more than one atom, I
> always get back TRUE...
> Does this have something to do with the parameter
> 'rdkit.threshold_unmapped_reactant_atoms'? My suspicion is that RDKit only
> counts how many atoms are mapped and does not compare them to the correct
> mapping. Can you confirm this?
> Is there any way at all to include atom mapping in the query to filter the
> reactions the way I want to?
>
> I hope you guys can help me here. Sorry for the lengthy question, but I
> wanted to include as much information as possible for you to pinpoint the
> issue.
>
> Best regards
> Sebastian

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
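Editorial sketch for the thread above: the last query deliberately maps a sulfur onto a carbon and a nitrogen onto an oxygen. The snippet below is a minimal, standard-library-only way to check a reaction string for that kind of inconsistency before sending it to the database. It is not how RDKit or the cartridge evaluates atom maps; the names `map_elements` and `inconsistent_maps` are hypothetical, and the regular expression is a simplification that assumes bracket atoms without isotope labels and treats two-letter aromatic symbols only approximately.

```python
import re

# Bracket atom carrying an atom-map number, e.g. "[S:1]", "[CH:6]", "[CH2:1]".
# Group 1: element symbol, group 2: map number. Simplified on purpose:
# isotope prefixes such as "[13C:1]" are not recognised.
MAPPED_ATOM = re.compile(r"\[([A-Z][a-z]?|[a-z])[^\]:]*:(\d+)\]")


def map_elements(side):
    """Return {map number: element symbol} for one side of a reaction string."""
    return {int(num): sym.capitalize() for sym, num in MAPPED_ATOM.findall(side)}


def inconsistent_maps(rxn):
    """Return map numbers whose element differs between reactants and products."""
    reactants, products = rxn.split(">>")
    left, right = map_elements(reactants), map_elements(products)
    return {
        n: (left[n], right[n])
        for n in sorted(left.keys() & right.keys())
        if left[n] != right[n]
    }


stored = ("[S:1]1[C:3]([Cl:4])=[N:5][CH:6]=[CH:2]1>>"
          "S(=O)(=O)1OCC([C:2]2[S:1][C:3]([Cl:4])=[N:5][CH:6]=2)=N1")
wrong_query = ("[S:1]1C(Cl)=[N:2]C=C1>>"
               "S(=[O:2])(=O)1O[CH2:1]C(C2SC(Cl)=NC=2)=N1")

print(inconsistent_maps(stored))       # {}
print(inconsistent_maps(wrong_query))  # {1: ('S', 'C'), 2: ('N', 'O')}
```

A check like this only flags element mismatches inside one reaction string; it says nothing about whether the query's mapping agrees with the mapping of the stored reaction, which is the open question in the thread.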
[Rdkit-discuss] question on rdRGroupDecomposition
Hi All,

I'm hoping someone can help me with rdRGroupDecomposition. I'd like to be able to specify specific R-group locations AND match cases where R=H. The example below illustrates what I'm talking about. When RGroupDecompositionParameters.onlyMatchAtRGroups = True, cases where R == H are skipped. I tried putting an explicit hydrogen on the core to block a position, but it appears that the explicit hydrogen is ignored.

    from rdkit import Chem
    from rdkit.Chem.rdRGroupDecomposition import RGroupDecomposition, RGroupDecompositionParameters

    # run an RGroupDecomposition on a set of molecules
    def process_r_groups(core_mol, rg_params, mols):
        rg = RGroupDecomposition(core_mol, rg_params)
        # iterate over the argument, not the global mol_list
        for mol in mols:
            rg.Add(mol)
        rg.Process()
        return list(rg.GetRGroupsAsRows(asSmiles=True))

    buff = """CCc1ccnc(C)n1
    Cc1ncccn1
    Cc1cnc(C)nc1"""

    # strip() drops any indentation picked up inside the triple-quoted string
    mol_list = [Chem.MolFromSmiles(x.strip()) for x in buff.split("\n")]
    core = Chem.MolFromSmiles("[H]c1cc([2*])nc([1*])n1")

    # default parameters, note that 3 R-groups are returned, the
    # explicit hydrogen is ignored
    params_1 = RGroupDecompositionParameters()
    for row in process_r_groups(core, params_1, mol_list):
        print(row)

    print()

    # run with the onlyMatchAtRGroups parameter
    # now only one row is returned
    params_2 = RGroupDecompositionParameters()
    params_2.onlyMatchAtRGroups = True
    for row in process_r_groups(core, params_2, mol_list):
        print(row)

The output from the script above is

    {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]', 'R2': '[H][*:2]', 'R3': '[H]C([H])([H])C([H])([H])[*:3]'}
    {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]', 'R2': '[H][*:2]', 'R3': '[H][*:3]'}
    {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]', 'R2': '[H]C([H])([H])[*:2]', 'R3': '[H][*:3]'}

    {'Core': 'c1cc([*:2])nc([*:1])n1', 'R1': '[H]C([H])([H])[*:1]', 'R2': '[H]C([H])([H])C([H])([H])[*:2]'}

I'd like to figure out how I can get only the substituents at the labeled positions, but still have it match where R1 == H or R2 == H.

Thanks in advance,
Pat
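Editorial sketch for the question above: it does not make onlyMatchAtRGroups accept R == H, and it is not an RDKit feature. It only shows, using the standard library, how the rows printed by the default-parameter run could be inspected to see which R-groups are a bare hydrogen. The helper name `hydrogen_r_groups` is hypothetical, and the rows are copied from the output posted in the message.

```python
import re

# An R-group that is just a hydrogen on its attachment point, e.g. "[H][*:2]".
HYDROGEN_ONLY = re.compile(r"^\[H\]\[\*:\d+\]$")


def hydrogen_r_groups(row):
    """Return the sorted R-group labels in a row whose substituent is only H."""
    return sorted(
        label
        for label, smiles in row.items()
        if label != "Core" and HYDROGEN_ONLY.match(smiles)
    )


# Rows as printed by the default-parameter run in the message above.
rows = [
    {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
     'R2': '[H][*:2]', 'R3': '[H]C([H])([H])C([H])([H])[*:3]'},
    {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
     'R2': '[H][*:2]', 'R3': '[H][*:3]'},
    {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
     'R2': '[H]C([H])([H])[*:2]', 'R3': '[H][*:3]'},
]

for row in rows:
    print(hydrogen_r_groups(row))
```

Note that the R labels in the default output are assigned by the decomposition and need not line up with the [1*] and [2*] labels on the input core, so any filtering built on this would have to account for that renumbering.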
Re: [Rdkit-discuss] Chembience
Hello,

I have released Chembience 0.2.0: it includes an update to RDKit 2018.03 and also provides Jupyter as a new base App container type.

https://github.com/chembience/chembience

(So, assuming you have Docker and docker-compose installed on your computer, you are a few easy commands away from your personal Jupyter notebook server with all RDKit 2018.03 goodness readily available.)

Best,
Markus

On Tue, Apr 24, 2018 at 10:44 AM Markus Sitzmann wrote:

> Hello,
>
> since it includes RDKit as one of its major components I am happy to
> announce the first release of my new open-source project Chembience:
>
> A Docker-based, cloudable platform for the development of
> chemoinformatics-centric web applications and microservices.
>
> https://github.com/chembience/chembience
>
> (unfortunately it is still on RDKit 2017.09_3, I failed releasing it
> before 2018.03 :-) ).
>
> Best,
> Markus