Dear All,

Please could you help with the following problem (I could not find answers in 
discussion list) ?

pattern='C~C~C(~C)~C'

smiles='O[C@H]1C[C@H]2C([C@@]1(C)CC2)(C)C'


pat = Chem.MolFromSmiles(pattern)
mol = Chem.MolFromSmiles(smiles)
res = mol.GetSubstructMatches(pat, uniquify=True)


The results are:

((1, 2, 3, 4, 8), (1, 5, 4, 3, 9), (1, 5, 4, 3, 10), (1, 5, 4, 9, 10), (2, 1, 
5, 4, 6), (2, 1, 5, 4, 7), (2, 1, 5, 6, 7), (2, 3, 4, 5, 9), (2, 3, 4, 5, 10), 
(2, 3, 4, 9, 10), (3, 4, 5, 1, 6), (3, 4, 5, 1, 7), (3, 4, 5, 6, 7), (5, 4, 3, 
2, 8), (6, 5, 4, 3, 9), (6, 5, 4, 3, 10), (6, 5, 4, 9, 10), (7, 5, 4, 3, 9), 
(7, 5, 4, 3, 10), (7, 5, 4, 9, 10), (7, 8, 3, 2, 4), (8, 3, 4, 5, 9), (8, 3, 4, 
5, 10), (8, 3, 4, 9, 10), (8, 7, 5, 1, 4), (8, 7, 5, 1, 6), (8, 7, 5, 4, 6), 
(9, 4, 3, 2, 8), (9, 4, 5, 1, 6), (9, 4, 5, 1, 7), (9, 4, 5, 6, 7), (10, 4, 3, 
2, 8), (10, 4, 5, 1, 6), (10, 4, 5, 1, 7), (10, 4, 5, 6, 7))


I expect to have only 2 matches with uniquify=True as I only have 2 units of 
the pattern. Furthermore, with or without uniquify, I have the same answers. I 
also expected that there should be 2 "independent" lists but here, there is 
always at least one common atom between each list.

Is there something misunderstood or misused?

Thanks in advance for your help and explanations.

Best regards,

Quoc-Tuan
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to