Hello all,

While reading the source code for ASKCOS (https://github.com/connorcoley/ASKCOS/blob/master/makeit/utilities/io/draw.py), I noticed this code snippet (line 216 on GitHub):

    reactants, agents, products = [mols_from_smiles_list(x) for x in [mols.split('.') for mols in rxn_string.split('>')]]

When the above code is applied to a reaction SMILES string, it unpacks the reactant, agent, and product mol objects into their respective variables, with pretty good accuracy. The function mols_from_smiles_list essentially just applies Chem.MolFromSmiles over a list of SMILES.

I think this code snippet is really cool, but I cannot find any documentation on how it works. Searching this mailing list, I came across a thread (https://sourceforge.net/p/rdkit/mailman/message/36316849/) where the labeling of reactants, agents, and products seems to be determined by threshold_unmapped_reactant_atoms, as explained in this quote from the linked message:

> Here's what's going on: By default the cartridge code does an extra step
> after reading a reaction from SMILES/SMARTS: it looks at all the reactants
> and moves any that don't have a sufficient fraction of mapped atoms to the
> agents. We do this by default because the reactions that we found "in the
> wild" often have agents, solvents, etc. mixed in with the reactants. The
> key parameter used there is threshold_unmapped_reactant_atoms, which
> defaults to 0.2.

The only further reading I can find is Greg's paper (https://pubs.acs.org/doi/10.1021/ci5006614).

I have two main questions:

1. Where in the code is this atom mapping being applied? I cannot tell when this method is applied or where the metadata is saved. Applying the code snippet above to a reaction SMILES string results in lists of rdkit.Chem.rdchem.Mol objects. When I inspect a mol object using help() in a Python terminal, I cannot find any static method or attribute specifying whether it is a reactant, agent, or product.

2. How can I change the values of the variables threshold_unmapped_reactant_atoms and move_unmmapped_reactants_to_agents?
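
For reference, the string handling in the snippet can be reproduced without RDKit, which suggests the reactant/agent/product roles there come purely from position in the reaction SMILES (reactants>agents>products) rather than from anything stored on the mol objects. Below is a minimal sketch under that assumption. The helper name split_reaction_smiles and the example reaction are hypothetical (neither is from ASKCOS or RDKit), and unlike the original one-liner it returns an empty list for an empty role.

```python
def split_reaction_smiles(rxn_string):
    """Split 'reactants>agents>products' into three lists of SMILES strings.

    Hypothetical helper for illustration only; not part of ASKCOS or RDKit.
    """
    # A reaction SMILES has exactly two '>' separators, giving three roles.
    parts = rxn_string.split('>')
    if len(parts) != 3:
        raise ValueError("expected 'reactants>agents>products', got %r" % rxn_string)
    # Within each role, '.' separates the individual molecules.
    # Assumption of this sketch: an empty role (as in 'A>>B') becomes [].
    return [part.split('.') if part else [] for part in parts]


# Made-up example: esterification with an acid agent.
reactants, agents, products = split_reaction_smiles('CC(=O)O.OCC>[H+]>CC(=O)OCC.O')
print(reactants)  # SMILES left of the first '>'
print(agents)     # SMILES between the two '>'
print(products)   # SMILES right of the second '>'
```

In the ASKCOS snippet each of these three lists would then be passed through Chem.MolFromSmiles, so the role of a molecule is known only from which list it ends up in.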

I am using RDKit version 2019.03.4 in an Anaconda environment, and I want to experiment with changing the mapping threshold.

Very Respectfully,
Benjamin
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss