I'm trying to use the Java wrapper to do a basic molecule
fragmentation, greedily matching against anything that fits the SMARTS
"[*;R]-;!@[*]" (effectively any single, non-ring bond that has a ring
atom at one end).

As a test case, I'm using a structure with a few obvious bonds that
can be broken, namely (in SMILES):
OC1CCC2CCC(OCCCCC3CCC(C4CCCC(Cl)C4Cl)CC3)CC2C1
As the Java wrapper's function names differ slightly from those of the
regular runtimes, I have included a toy example:

                Set<String> output = new HashSet<String>();

                RWMol mol2 = RWMol.MolFromSmiles(input, 0, true);
                ROMol patt = RWMol.MolFromSmarts("[*;R]-;!@[*]");

                Match_Vect_Vect matches = mol2.getSubstructMatches(patt);

                for (int i = 0; i < matches.size(); i++)
                {
                        Match_Vect match = matches.get(i);
                        for (int j = 0; j < match.size(); j++)
                        {
                                //lazy cloning
                                RWMol mol_to_manipulate = RWMol.MolFromSmiles(input, 0, true);
                                Int_Pair pair = match.get(j);

                                // Find bond and remove
                                mol_to_manipulate.removeBond(pair.getFirst(), pair.getSecond());

                                System.out.println(mol_to_manipulate.MolToSmiles());
                                mol_to_manipulate.canonicalizeMol();
                                String[] temp = mol_to_manipulate.MolToSmiles().split("\\.");

                                for (String tmp : temp)
                                {
                                        if (tmp.length() > 1)
                                        {
                                                if (!tmp.equals(input))
                                                {
                                                        output.add(tmp);
                                                }
                                        }
                                }
                        }
                }

                return output;
        }

When I run the selection on this molecule, I see the bond between
atoms 0 and 1 (the terminal OH) from both directions (0,1 and 1,0).
However, as the iteration continues, the pairs no longer seem to
correspond to real bonds (0,7; 1,8; 0,13; 1,12; 0,16; 1,17; etc.), and
attempts to remove them fail because no bond connects those atoms.
Am I making a mistake in how I handle the query, or is something else
going on with the matches? I make sure to load a fresh molecule each
time before removing a bond, but I'm struggling to see how the mapping
follows.
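
One reading that would fit the numbers above, assuming the Java
Int_Pair mirrors the C++ match format of (query atom index, molecule
atom index): each Match_Vect is one complete match of the two-atom
pattern, not a list of bonds, so its two pairs together name a single
bond. The sketch below uses plain int arrays in place of the wrapper
types, and the class and method names (MatchPairs, bondAtoms) are
hypothetical. It only illustrates the bookkeeping; it does not call
RDKit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MatchPairs {

    // Hypothetical helper. Each int[] stands in for one Int_Pair and is
    // ASSUMED to hold {queryAtomIdx, molAtomIdx}. The result is indexed by
    // query atom, so for a two-atom pattern result[0] and result[1] are the
    // two molecule atoms joined by the matched bond.
    static int[] bondAtoms(List<int[]> match) {
        int[] molIdx = new int[match.size()];
        for (int[] pair : match) {
            molIdx[pair[0]] = pair[1];
        }
        return molIdx;
    }

    public static void main(String[] args) {
        // The pairs reported above, regrouped as one match per bond.
        List<List<int[]>> matches = new ArrayList<>();
        matches.add(Arrays.asList(new int[]{0, 1}, new int[]{1, 0}));
        matches.add(Arrays.asList(new int[]{0, 7}, new int[]{1, 8}));
        matches.add(Arrays.asList(new int[]{0, 13}, new int[]{1, 12}));
        matches.add(Arrays.asList(new int[]{0, 16}, new int[]{1, 17}));

        for (List<int[]> match : matches) {
            int[] bond = bondAtoms(match);
            System.out.println("bond between molecule atoms "
                    + bond[0] + " and " + bond[1]);
        }
    }
}
```

Under that assumption, the loop over j would go away: removeBond would
be called once per match, with the second member of match.get(0) and
the second member of match.get(1), rather than once per pair with the
pair's own two members.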

