Hi all,

I recently discovered the “GenerateMoleculeHashString” function in
rdMolHash. It has several attractive features, including being
faster than InChIKey calculation and accepting wildcard atoms. It seems
like a better option than my current approach of hashing the canonical
SMILES, but I couldn’t find anything about it in the mailing list archives,
and I’d like to understand it better before I incorporate it into my work.


I’ve determined that a hash (e.g.
100-10-9-koXJdQ-VrNVKw-Srh2xg-7kztBA-2qU33A-Vr7YHA) has the following
structure: <Hash Version>-<# of Atoms>-<# of Bonds>-<CRC32 of Molecular
Formula>-<NonChiralAtomsHash>-<NonChiralBondsHash>-<ChiralAtomsHash>-<ChiralBondsHash>-<ChiralityHash>.
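
To make that layout concrete, here is a minimal Python sketch that splits such a string into its nine fields. It does not call RDKit at all. The names MolHashFields and split_mol_hash and the field names are my own labels (hypothetical, not identifiers from the RDKit source), and it assumes that no block ever contains a "-" itself, which I have not verified.

```python
from typing import NamedTuple


class MolHashFields(NamedTuple):
    """Fields of a molecule hash string, in the order given above.

    The field names are illustrative labels, not RDKit identifiers.
    """
    version: int
    num_atoms: int
    num_bonds: int
    formula_crc32: str
    non_chiral_atoms: str
    non_chiral_bonds: str
    chiral_atoms: str
    chiral_bonds: str
    chirality: str


def split_mol_hash(hash_string: str) -> MolHashFields:
    """Split a hash string on '-' into its nine fields.

    Assumption: the encoded blocks never contain '-' themselves.
    """
    parts = hash_string.split("-")
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}")
    version, num_atoms, num_bonds, *blocks = parts
    # The first three fields are integers; the remaining six stay as text.
    return MolHashFields(int(version), int(num_atoms), int(num_bonds), *blocks)


fields = split_mol_hash("100-10-9-koXJdQ-VrNVKw-Srh2xg-7kztBA-2qU33A-Vr7YHA")
print(fields.num_atoms, fields.num_bonds, fields.formula_crc32)
# prints: 10 9 koXJdQ
```

If this reading is right, comparing only the two NonChiral blocks would be one way to match molecules while ignoring stereochemistry, but that is a guess on my part.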


However, I don’t quite understand how the “computeMorganCodeHash” function
is used to calculate each of the blocks. Is there a reference describing
this method?


Thanks,

James


------------------------------------------------------------------

James G Jeffryes

Doctoral Candidate

Tyo lab, Chemical & Biological Engineering

Northwestern University

Mathematics and Computer Science Division

Argonne National Laboratory
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
