On Dec 18, 2016, at 6:32 PM, Brian Kelley wrote:
> >>> m.GetProp("_smilesAtomOutputOrder")
> '[3,2,1,0,]'
>
> Note that this returns the list as a string which is sub-optimal.
> GetPropsAsDict will convert these to proper python objects, however, this is
> considered a private member so you need to return these as well:
>
> >>> list(m.GetPropsAsDict(True,True)["_smilesAtomOutputOrder"])
> [3, 2, 1, 0]

For fun, here are a few timing numbers:

  # Common setup
  from rdkit import Chem
  mol = Chem.MolFromSmiles("c1ccccc1Oc1ccccc1")
  Chem.MolToSmiles(mol)
  import json
  import ujson  # third-party JSON decoder
  import re
  integer_pat = re.compile("[0-9]+")

  # Get the string (gives a lower bound)
  mol.GetProp("_smilesAtomOutputOrder")
10000 loops, best of 3: 31.3 usec per loop

Here are variations for how to get that information as a list of integers:

  # Using Python's "eval()" to decode the list (this is generally UNSAFE!)
  eval(mol.GetProp("_smilesAtomOutputOrder"))
10000 loops, best of 3: 157 usec per loop

  # Use the built-in json module (need to remove the terminal ",")
  json.loads(mol.GetProp("_smilesAtomOutputOrder")[:-2]+"]")
10000 loops, best of 3: 66.5 usec per loop

  # Use the third-party "ujson" package, which is faster than json.
  ujson.loads(mol.GetProp("_smilesAtomOutputOrder")[:-2]+"]")
10000 loops, best of 3: 41.2 usec per loop
("cjson" takes 49.7 usec per loop)

  # Use the properties dictionary
  mol.GetPropsAsDict(True,True)["_smilesAtomOutputOrder"]
1000 loops, best of 3: 462 usec per loop

  # Parse it more directly
  map(int, integer_pat.findall(mol.GetProp("_smilesAtomOutputOrder")))
10000 loops, best of 3: 89 usec per loop

				Andrew
				da...@dalkescientific.com
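
The string-decoding variations in this message can be compared without RDKit installed by running them on a fixed sample string. What follows is only a minimal sketch under stated assumptions, not the setup used for the numbers above: the sample value is copied from the quoted example, the helper names (parse_with_json, parse_with_regex, parse_with_literal_eval, time_parsers) are made up for illustration, and ast.literal_eval is swapped in as a safer stand-in for eval(). Because no GetProp() call is included, the absolute times will not match the figures in this message; only the relative cost of the decoders is comparable.

```python
import ast
import json
import re
import timeit

# Sample value copied from the quoted message. In real use it would come from
# mol.GetProp("_smilesAtomOutputOrder") after calling Chem.MolToSmiles(mol).
ORDER_STRING = "[3,2,1,0,]"

INTEGER_PAT = re.compile(r"[0-9]+")


def parse_with_json(text):
    # Assumes the text ends with ",]" as in the sample; swap that for "]"
    # so the result is valid JSON.
    return json.loads(text[:-2] + "]")


def parse_with_regex(text):
    # list() is needed on Python 3, where map() returns a lazy iterator.
    return list(map(int, INTEGER_PAT.findall(text)))


def parse_with_literal_eval(text):
    # ast.literal_eval only accepts Python literals, unlike eval().
    return ast.literal_eval(text)


# Hypothetical registry of the decoders to compare.
PARSERS = {
    "json": parse_with_json,
    "regex": parse_with_regex,
    "literal_eval": parse_with_literal_eval,
}


def time_parsers(text=ORDER_STRING, number=10000, repeat=3):
    """Return the best per-call time in microseconds for each parser."""
    results = {}
    for name, func in PARSERS.items():
        best = min(timeit.repeat(lambda: func(text), number=number, repeat=repeat))
        results[name] = best / number * 1e6
    return results


if __name__ == "__main__":
    for name, usec in time_parsers().items():
        print(f"{name}: {usec:.2f} usec per loop")
```

Wrapping map() in list() matters on Python 3: without it the timing would only measure the creation of an iterator, not the conversion of the integers.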