On Nov 8, 2017, at 21:00, Chenyang Shi <cs3...@columbia.edu> wrote:
> =C= : [CH0;A;X2;!R](=[$(*)])=[$(*)] 

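The recursive SMARTS notation, which is the term inside of the [$(...)], tests 
whether the entire inner pattern matches starting at the current atom. Only 
that first atom is returned as part of the match; the rest of the inner pattern 
acts as a condition on its environment.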

> For example, if I search "C=C=O" using "[CH0;A;X2;!R](=[$(*)])=[$(*)]", 
> >>> from rdkit import Chem
> >>> m = Chem.MolFromSmiles('C=C=O')
> >>> m.GetSubstructMatches(Chem.MolFromSmarts("[CH0;A;X2;!R](=[$(*)])=[$(*)]"))
> ((1, 0, 2),)
> it prints out atomic positions 1, 0, 2--three positions. But I would expect 
> only one position for the Carbon in the middle.

Each $(*) matches its inner pattern, which is just "*" (any atom), and returns 
that atom. In this case those are the two terminal atoms, the carbon and the 
oxygen. The substructure search returns 3 positions because your SMARTS has 
three query atoms: the first is [CH0;A;X2;!R], the second is the first atom of 
one "*", and the third is the first atom of the other "*".
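
To make that mapping concrete, here is a small sketch (the variable names are 
mine, not part of your code) that prints which query atom each reported 
position belongs to:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("C=C=O")
pat = Chem.MolFromSmarts("[CH0;A;X2;!R](=[$(*)])=[$(*)]")

# Each [$(...)] term still counts as one query atom, so the pattern has three.
num_query_atoms = pat.GetNumAtoms()
print("query atoms:", num_query_atoms)

matches = mol.GetSubstructMatches(pat)
for match in matches:
    # Position i in a match tuple is the molecule atom matched by query atom i.
    for query_idx, mol_idx in enumerate(match):
        symbol = mol.GetAtomWithIdx(mol_idx).GetSymbol()
        print("query atom", query_idx, "-> mol atom", mol_idx, symbol)
```

Every match tuple has one entry per query atom, which is why you see three 
positions rather than one.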

If you only want the first atom of the entire pattern, then put the entire 
pattern in a recursive SMARTS, as in:


>>> pat = Chem.MolFromSmarts("[$([CH0;A;X2;!R](=*)=*)]")
>>> mol = Chem.MolFromSmiles('C=C=O')
>>> mol.GetSubstructMatches(pat)
((1,),)

> Similarly, if I search "C#C" using "[CH1;A;X2;!R]#[$(*)]", 
> >>> from rdkit import Chem
> >>> m = Chem.MolFromSmiles('C#C')
> >>> m.GetSubstructMatches(Chem.MolFromSmarts("[CH1;A;X2;!R]#[$(*)]"))
> ((0, 1),)
> I would expect two separate positions such as (0,), (1,), indicating there 
> are two carbon triple bonds (with a hydrogen).

Since you want each match to report only a single atom, try putting the entire 
pattern in a recursive SMARTS, as in:


>>> mol = Chem.MolFromSmiles("C#C")
>>> pat = Chem.MolFromSmarts("[$([CH1;A;X2;!R]#*)]")
>>> mol.GetSubstructMatches(pat)
((0,), (1,))
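
If you do this often, the wrapping can be automated. This is only a sketch: 
first_atom_matches is a hypothetical helper name of my own, not an RDKit 
function, and it assumes the SMARTS you pass in is an ordinary single-component 
pattern.

```python
from rdkit import Chem


def first_atom_matches(smiles, smarts):
    """Hypothetical helper: indices of atoms that match the first atom of `smarts`."""
    mol = Chem.MolFromSmiles(smiles)
    # Wrap the whole pattern in [$(...)] so it collapses to a single query atom.
    pat = Chem.MolFromSmarts("[$(" + smarts + ")]")
    if mol is None or pat is None:
        raise ValueError("could not parse the SMILES or the SMARTS")
    # Every match is a 1-tuple, so unpack it into a flat list of atom indices.
    return [idx for (idx,) in mol.GetSubstructMatches(pat)]


acetylene_hits = first_atom_matches("C#C", "[CH1;A;X2;!R]#*")
ketene_hits = first_atom_matches("C=C=O", "[CH0;A;X2;!R](=*)=*")
print(acetylene_hits, ketene_hits)
```

Returning a flat list instead of a tuple of 1-tuples is just a convenience for 
counting or indexing the matched atoms.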

> Then if I search "CC#CC" using " [CH0;A;X2;!R]#[$(*)]", 

I believe you want "[$([CH0;A;X2;!R]#*)]", which reports each hydrogen-free, 
triple-bonded carbon as its own one-atom match.
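
A quick way to check that pattern against your "CC#CC" example is sketched 
below; the variable names are mine.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CC#CC")
pat = Chem.MolFromSmarts("[$([CH0;A;X2;!R]#*)]")

matches = mol.GetSubstructMatches(pat)
for (idx,) in matches:
    atom = mol.GetAtomWithIdx(idx)
    # Print the atom index, element symbol, and attached hydrogen count.
    print(idx, atom.GetSymbol(), atom.GetTotalNumHs())
```

Only the two alkyne carbons should be reported. The methyl carbons carry three 
hydrogens and have four connections, so they fail both CH0 and X2.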

Thank you for your clear description of what you expected.



