[Rdkit-discuss] Keeping only parts of a molecule (a set of atom ids)

2012-03-22 Thread JP
Hi there at RDKit,

I have a set of atom indices from a molecule I want to keep, and any atom
which is not in this list I want to discard.

I thought of implementing this as follows:

#!/usr/bin/env python

from rdkit import Chem

mol = Chem.MolFromSmiles("CCC1CNCC1CC")

keep_atoms = [2, 3, 4]  # assume these exist in the above as an example, you can print the atom ids to check

edit_mol = Chem.EditableMol(mol)
for atom in mol.GetAtoms():
    if atom.GetIdx() not in keep_atoms:
        edit_mol.RemoveAtom(atom.GetIdx())

I am not sure this is the best implementation (also because it does not
work), but it's a try.

The end result should be an sdf file with only atoms 2,3,4 from the
original molecule.

When I run the above I get:

[15:45:50]


Range Error
idx
Violation occurred on line 143 in file /opt/RDKit_2011_12_1/Code/GraphMol/ROMol.cpp
Failed Expression: 0 <= 6 <= 5


Traceback (most recent call last):
  File "./test.py", line 12, in <module>
    edit_mol.RemoveAtom(atom.GetIdx())
RuntimeError: Range Error

I cannot quite understand this error.  Can anyone shed some light?
I mean this is related to me deleting 3 atoms from the molecule, so it
somehow expects the range to be 0 <= x <= 5 instead of 0 <= x <= 8...
but why is there this check in place?

Many Thanks

-
Jean-Paul Ebejer
Early Stage Researcher
--
This SF email is sponsosred by:
Try Windows Azure free for 90 days Click Here 
http://p.sf.net/sfu/sfd2d-msazure___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Keeping only parts of a molecule (a set of atom ids)

2012-03-22 Thread JP
And as a follow up - running this:

#!/usr/bin/env python

from rdkit import Chem

mol = Chem.MolFromSmiles("CCC1CNCC1CC")
edit_mol = Chem.EditableMol(mol)
edit_mol.RemoveAtom(0)

for atom in edit_mol.GetMol().GetAtoms():
    print(atom.GetIdx())

gives seg fault...

jp@xxx:~/tmp$ test.py
Segmentation fault


-
Jean-Paul Ebejer
Early Stage Researcher




Re: [Rdkit-discuss] Keeping only parts of a molecule (a set of atom ids)

2012-03-22 Thread Sarah Langdon
I had a similar problem when removing atoms from a molecule. When you
remove an atom, the atom IDs change, which results in a seg fault or in
your atom IDs no longer being in the correct range.
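
The Range Error from the original post can be reproduced without RDKit. The sketch below is only a model: a plain Python list stands in for the molecule's atoms, on the assumption that removing an atom renumbers every later atom down by one. The helper name `remove_in_order` is made up for this illustration.

```python
def remove_in_order(n_atoms, to_remove):
    """Model atom removal where later atoms are renumbered after each removal.

    Returns (remaining_original_indices, failed_index); failed_index is None
    when every removal succeeded.
    """
    atoms = list(range(n_atoms))  # original atom indices, in current order
    for idx in to_remove:
        if not 0 <= idx <= len(atoms) - 1:
            # Same kind of check as "Failed Expression: 0 <= 6 <= 5"
            return atoms, idx
        del atoms[idx]  # everything after idx shifts down by one
    return atoms, None


keep = {2, 3, 4}
ascending = [i for i in range(9) if i not in keep]  # [0, 1, 5, 6, 7, 8]

remaining, failed = remove_in_order(9, ascending)
print(remaining, failed)  # prints: [1, 3, 4, 5, 6, 8] 6
```

In this model the loop stops at index 6 with six atoms left, which matches the 0 <= 6 <= 5 in the error message. By then the second and third removals have already deleted original atoms 2 and 7 rather than 1 and 5.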

The way I got around this was to sort the IDs of the atoms I wanted to
remove from highest to lowest, so that the atoms with higher IDs are
removed first. This does not affect the IDs of the atoms with lower IDs.
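
A minimal sketch of that approach, applied to the molecule from the original post. The function name `keep_only` is made up for this example; the RDKit calls are the ones already used earlier in the thread.

```python
from rdkit import Chem


def keep_only(mol, keep_atoms):
    """Return a new molecule containing only the atoms whose indices are in keep_atoms."""
    keep = set(keep_atoms)
    to_remove = [a.GetIdx() for a in mol.GetAtoms() if a.GetIdx() not in keep]
    edit_mol = Chem.EditableMol(mol)
    # Highest index first, so the indices still waiting to be removed stay valid
    for idx in sorted(to_remove, reverse=True):
        edit_mol.RemoveAtom(idx)
    return edit_mol.GetMol()


mol = Chem.MolFromSmiles("CCC1CNCC1CC")
fragment = keep_only(mol, [2, 3, 4])
print(fragment.GetNumAtoms())
print([atom.GetSymbol() for atom in fragment.GetAtoms()])
```

The returned molecule is not re-sanitized here; depending on which atoms were removed, it may need Chem.SanitizeMol() before further use.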

Here is the relevant discussion 
http://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg01937.html
Thanks,

Sarah


The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company 
Limited by Guarantee, Registered in England under Company No. 534147 with its 
Registered Office at 123 Old Brompton Road, London SW7 3RP.

This e-mail message is confidential and for use by the addressee only.  If the 
message is received by anyone other than the addressee, please return the 
message to the sender by replying to it and then delete the message from your 
computer and network.


Re: [Rdkit-discuss] Keeping only parts of a molecule (a set of atom ids)

2012-03-22 Thread Eddie Cao
Hi JP,

Sarah was right about the trick of deleting atoms in descending order of atom
index. Regarding the segmentation fault, this is very likely a memory
management issue. Make sure you assign the result of the GetMol() call to a
temporary variable, so that the underlying object is not released prematurely.

>>> from rdkit import Chem
>>>
>>> mol = Chem.MolFromSmiles("CCC1CNCC1CC")
>>> edit_mol = Chem.EditableMol(mol)
>>> edit_mol.RemoveAtom(0)
>>> m = edit_mol.GetMol()
>>> for atom in m.GetAtoms():
...     print(atom.GetIdx())
...
0
1
2
3
4
5
6
7
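
Putting both suggestions together for the SD file JP was after, a rough sketch might look like the following. The output file name `fragment.sdf` is just a placeholder, and the Chem.SanitizeMol() call is an addition that was not discussed in the thread.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCC1CNCC1CC")
keep_atoms = {2, 3, 4}

edit_mol = Chem.EditableMol(mol)
# Remove unwanted atoms from the highest index down
for idx in reversed(range(mol.GetNumAtoms())):
    if idx not in keep_atoms:
        edit_mol.RemoveAtom(idx)

# Hold on to the result of GetMol() in a variable before using it
fragment = edit_mol.GetMol()
Chem.SanitizeMol(fragment)  # recompute valences and ring info after the edit

writer = Chem.SDWriter("fragment.sdf")  # placeholder file name
writer.write(fragment)
writer.close()
```

The fragment has no coordinates, since it was built from SMILES, so the atoms in the written file all sit at the origin unless a conformer is generated first.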

Regards,
Eddie



Re: [Rdkit-discuss] Keeping only parts of a molecule (a set of atom ids)

2012-03-22 Thread JP
Thanks to both of you, nice trick...
-
Jean-Paul Ebejer
Early Stage Researcher

