Hi,
If you only need the symbol with "H" and the H count appended, by far the
easiest way is to build that string yourself from the atom's symbol. See the
code block below:

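from rdkit import Chem

def get_symbol_with_Hs(a):
    # Element symbol, e.g. "C"; GetSymbol() does not lowercase aromatic atoms
    symbol = a.GetSymbol()
    charge = a.GetFormalCharge()
    # Total H count: implicit plus explicit hydrogens
    hcount = a.GetTotalNumHs()
    if hcount > 0:
        symbol += "H"
        if hcount > 1:
            symbol += str(hcount)
    if charge == 1:
        symbol += "+"
    elif charge == -1:
        symbol += "-"
    elif charge > 1:
        symbol += f"(+{charge})"
    elif charge < -1:
        # charge is already negative, so it carries its own sign, e.g. "(-2)"
        symbol += f"({charge})"
    return symbol

mol = Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
for i, atom in enumerate(mol.GetAtoms()):
    atom.SetProp('molAtomMapNumber', str(i))
    print(i, get_symbol_with_Hs(atom))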
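
The labelling rules themselves can be tried out without RDKit. The sketch
below is only an illustration: format_atom_label and its aromatic flag are
hypothetical names, not part of RDKit. The aromatic flag is an addition that
lowercases the symbol (the SMILES convention) so that aromatic carbons come out
as "cH"; with RDKit atoms the four values would presumably come from
atom.GetSymbol(), atom.GetTotalNumHs(), atom.GetFormalCharge() and
atom.GetIsAromatic().

```python
def format_atom_label(symbol, hcount, charge=0, aromatic=False):
    """Build a label such as 'cH', 'NH2' or 'Fe(+2)' from plain values.

    Hypothetical helper for illustration; not an RDKit function.
    """
    # Lowercase aromatic symbols, following the SMILES convention.
    label = symbol.lower() if aromatic else symbol
    if hcount > 0:
        label += "H"
        if hcount > 1:
            label += str(hcount)
    if charge == 1:
        label += "+"
    elif charge == -1:
        label += "-"
    elif charge != 0:
        # The "+d" format always prints the sign, e.g. (+2) or (-2).
        label += f"({charge:+d})"
    return label


print(format_atom_label("C", 1, aromatic=True))  # cH
print(format_atom_label("N", 2))                 # NH2
print(format_atom_label("O", 0, charge=-1))      # O-
print(format_atom_label("Fe", 0, charge=2))      # Fe(+2)
```

Keeping the formatting separate from the RDKit calls makes the charge cases
easy to check by hand before running it on a real molecule.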

-----
Another way I would recommend is to use SMILES with explicit (i.e. bracketed)
hydrogens instead. For your use case I would imagine it as follows:

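from rdkit import Chem

mol = Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
# Turn implicit hydrogens into real atoms in the graph
mol = Chem.AddHs(mol)
rwmol = Chem.RWMol(mol)
# Collect the heavy-atom/heavy-atom bonds first, then remove them,
# so the molecule is not modified while its bonds are being iterated
heavy_bonds = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx())
               for b in rwmol.GetBonds()
               if b.GetBeginAtom().GetAtomicNum() != 1
               and b.GetEndAtom().GetAtomicNum() != 1]
for begin_idx, end_idx in heavy_bonds:
    rwmol.RemoveBond(begin_idx, end_idx)
# Each heavy atom plus its hydrogens is now a separate fragment;
# sanitizeFrags=False skips sanitization of the isolated fragments
frags = Chem.GetMolFrags(rwmol, asMols=True, sanitizeFrags=False)
for i, f in enumerate(frags):
    print(i, Chem.MolToSmiles(f))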

This would output:

0 [H]c
1 [H]c
2 [H]c
3 [H]c
4 [H]c
5 c
6 C
7 [H]N[H]
8 O


I hope that helps.

Best wishes,
Wim

On Tue, May 9, 2023 at 7:58 AM Haijun Feng <haijun20230...@gmail.com> wrote:

>
> Hi All,
>
> I am trying to add atom numbers to a SMILES string, as below:
>
>     from rdkit import Chem
>     mol=Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
>     for i, atom in enumerate(mol.GetAtoms()):
>       atom.SetProp('molAtomMapNumber',str(i))
>     smi=Chem.MolToSmiles(mol)
>     print(smi)
>
> the output is: [cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]
>
> Then I want to split the SMILES into atoms. I did it like this:
>
>     from rdkit import Chem
>     mol=Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
>     for i, atom in enumerate(mol.GetAtoms()):
>       atom.SetProp('molAtomMapNumber',str(i))
>       print(i,atom.GetSymbol())
>
> the output is:
>
> 0 C
> 1 C
> 2 C
> 3 C
> 4 C
> 5 C
> 6 C
> 7 N
> 8 O
>
> *But what I do want is something like this (with fragments instead of
> atoms):*
>
> 0 cH
> 1 CH
> ...
> 7 NH2
> 8 O
>
> Can anyone help me figure out how to get each atom with its hydrogens from
> the SMILES, as above? Thanks so much!
>
>
> best,
>
>
> Hal
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>