I'm trying to use the Java wrapper to do a basic molecule fragmentation, greedily matching against anything that fits the SMARTS "[*;R]-;!@[*]" (effectively any single, non-ring bond attached to a ring atom).
As a test case, I'm using a structure with a few obvious bonds that can be broken, namely (in SMILES) OC1CCC2CCC(OCCCCC3CCC(C4CCCC(Cl)C4Cl)CC3)CC2C1.

Since the Java wrapper's function names differ slightly from the usual ones, I have included a toy example:

    import java.util.HashSet;
    import java.util.Set;
    import org.RDKit.*;

    // Enclosing method header was not part of the original snippet;
    // the name "fragment" is a placeholder. "input" is the SMILES above.
    static Set<String> fragment(String input) {
        Set<String> output = new HashSet<String>();
        RWMol mol2 = RWMol.MolFromSmiles(input, 0, true);
        ROMol patt = RWMol.MolFromSmarts("[*;R]-;!@[*]");
        Match_Vect_Vect matches = mol2.getSubstructMatches(patt);
        for (int i = 0; i < matches.size(); i++) {
            Match_Vect match = matches.get(i);
            for (int j = 0; j < match.size(); j++) {
                // lazy cloning: re-parse the input for every removal
                RWMol mol_to_manipulate = RWMol.MolFromSmiles(input, 0, true);
                Int_Pair pair = match.get(j);
                // Find bond and remove
                mol_to_manipulate.removeBond(pair.getFirst(), pair.getSecond());
                System.out.println(mol_to_manipulate.MolToSmiles());
                mol_to_manipulate.canonicalizeMol();
                // Split into fragments; keep anything longer than one
                // character that is not the unchanged input
                String[] temp = mol_to_manipulate.MolToSmiles().split("\\.");
                for (String tmp : temp) {
                    if (tmp.length() > 1) {
                        if (!tmp.equals(input)) {
                            output.add(tmp);
                        }
                    }
                }
            }
        }
        return output;
    }

When I run the selection on this molecule, I see the bond between atoms 0 and 1 (the terminal OH) from both directions (0,1 and 1,0). However, as the iteration continues, the matches no longer seem to correspond to meaningful bonds ((0,7), (1,8), (0,13), (1,12), (0,16), (1,17), etc.), and an attempt to remove them doesn't work because there is no bond linking those atoms.

Am I making some mistake in my handling of the query, or is there something going on with the matches? I'm making sure to load a new molecule each time for the bond removal, but I'm struggling to see how the mapping follows.

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
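
One possible reading of the index pairs reported above, sketched in plain Java with no dependency on the wrapper: assume each Int_Pair holds (query atom index, molecule atom index) rather than the two ends of a bond. Under that assumption, two consecutive pairs such as (0,7) and (1,8) would belong to a single match describing the bond between molecule atoms 7 and 8. The class and method names below are hypothetical, and the hard-coded pairs are simply the ones quoted in the post.

```java
class MatchPairSketch {
    // Assumption: each pair is {queryAtomIdx, molAtomIdx}, and the query
    // "[*;R]-;!@[*]" has exactly two atoms (query indices 0 and 1).
    // Returns the two molecule atom indices, ordered by query index.
    static int[] bondAtoms(int[][] match) {
        int[] atoms = new int[2];
        for (int[] pair : match) {
            atoms[pair[0]] = pair[1];
        }
        return atoms;
    }

    public static void main(String[] args) {
        // Pairs as reported in the post, grouped two per match.
        int[][][] matches = {
            {{0, 1}, {1, 0}},
            {{0, 7}, {1, 8}},
            {{0, 13}, {1, 12}},
            {{0, 16}, {1, 17}},
        };
        for (int[][] match : matches) {
            int[] atoms = bondAtoms(match);
            System.out.println("bond " + atoms[0] + "-" + atoms[1]);
        }
    }
}
```

If that reading holds, the wrapper code would call removeBond once per match, passing match.get(0).getSecond() and match.get(1).getSecond(), rather than once per pair with getFirst() and getSecond().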