Hi Axel,

GetAtomicWeightsForModel() calculates, for each atom, the difference between 
the probability value of the complete molecule (“base probability”) and the 
probability value when the bits of that atom are deleted. 
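
To make that calculation concrete, here is a toy sketch in plain Python. It is 
not the RDKit source: the predict_active() model, the atom_bits mapping and the 
atomic_weights() helper are all made up for illustration, and the only thing 
taken from the description above is "base probability minus probability 
without the atom's bits".

```python
# Toy sketch with made-up names; this is not the RDKit implementation.

def predict_active(bits):
    """Hypothetical model: each bit present pushes the molecule toward 'inactive'."""
    bit_effect = {1: -0.20, 2: -0.20, 3: -0.05}
    proba = 0.5 + sum(bit_effect[b] for b in bits)
    # Clip to a valid probability
    return min(1.0, max(0.0, proba))


def atomic_weights(atom_bits, predict):
    """Return the base probability and, per atom, base minus 'atom removed'."""
    all_bits = set().union(*atom_bits.values())
    base_proba = predict(all_bits)
    weights = {}
    for atom_id, bits in atom_bits.items():
        # Assumption: atom_bits lists exactly the bits that vanish with the atom
        remaining = all_bits - bits
        weights[atom_id] = base_proba - predict(remaining)
    return base_proba, weights


# Three atoms, each responsible for one bit (purely illustrative)
atom_bits = {0: {1}, 1: {2}, 2: {3}}
base_proba, weights = atomic_weights(atom_bits, predict_active)
print(round(base_proba, 2))
print({atom: round(w, 2) for atom, w in weights.items()})
```

In this toy case the base probability for the active class is low, and 
removing any single atom raises it, but never above 50%.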

In the cookbook (and based on a quick glance also in your code), the 
probability of the active class is used as the measure for the similarity maps 
(that’s defined in the getProba() helper function). This means that any atom 
whose missing bits lead to an increase in the probability to be active is 
colored green. If it leads to a decrease, it gets colored pink. 

Now if you have an inactive molecule, then your base probability for the active 
class is close to zero. In your cases it looks like nearly all of the atoms in 
the molecule are necessary for these molecules to be considered inactive. In 
other words, deleting any of the green-colored atoms results in a higher 
probability to be active, although it might still be below 50% (note that the 
color range is not standardized globally but is based on the largest difference 
observed in the molecule).
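
The per-molecule color range can be sketched like this. It is a minimal 
illustration under my own assumptions: scale_by_max_abs() is a hypothetical 
helper and the two weight lists are invented numbers, chosen only to show that 
small and large probability differences can end up with the same colors once 
each molecule is scaled by its own largest absolute difference.

```python
# Hypothetical helper, for illustration only.

def scale_by_max_abs(weights):
    """Scale weights into [-1, 1] using the largest absolute value in the list."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        # Nothing to scale if no atom changes the probability
        return list(weights)
    return [w / max_abs for w in weights]


# Invented probability differences for two molecules
mol_a = [-0.20, -0.10, -0.05]    # changes of up to 20 percentage points
mol_b = [-0.02, -0.01, -0.005]   # changes ten times smaller

print(scale_by_max_abs(mol_a))
print(scale_by_max_abs(mol_b))
```

Both lists scale to the same relative pattern, so the two maps would look 
equally intense even though the underlying probability changes differ by a 
factor of ten.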

I hope this helps.

Best,
Sereina 


> On 22 Aug 2019, at 11:38, Axel Pahl <axelp...@gmx.de> wrote:
> 
> Dear fellow RDKitters,
> 
> I am experimenting with the classification example from the Cookbook [1] 
> using a RandomForestClassifier and Similarity Maps for visualization.
> I need, however, some help with the interpretation of the coloring in the 
> similarity map.
> In the attached example, the compounds were correctly predicted ("AC_Pred") 
> as being inactive ("0") with a high probability.
> But the corresponding similarity maps show mainly green areas, indicating (in 
> my understanding) a positive contribution to the activity class, which should 
> have led to a different prediction.
> 
> What would be the correct interpretation of the coloring?
> Many thanks in advance for any help.
> 
> Kind regards,
> Axel
> 
> P.S.: The code is available in a repo [2], an example notebook can be found 
> in the tutorials folder.
> 
> [1] http://www.rdkit.org/docs/Cookbook.html#using-scikit-learn-with-rdkit
> [2] https://github.com/apahl/mol_frame
> 
> <similarity_map.png>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
