On 07/12/2018 19:22, Noel O'Boyle wrote:
An ECFP4 implementation could use a single bit or a million bits. The
actual information being encoded is an element of a set with billions
of members or more (I forget the details), so it's hashed to
something manageable. The shorter the length, the more bit collisions
(with a single bit, everything collides). Open Babel uses 4096. I
would regard this as the minimum.

Just FYI, in rdkit, 2048 bits is the default length.

When converting from hex, you could concatenate the binaries. Or you
could use pybel, which does the conversion for you:
pybel.readstring("smi", "c1ccccc1C(=O)Cl").calcfp("ecfp4").bits
[556, 1348, 1509, 1547, 1993, 2078, 2089, 2378, 2487, 2531, 2700,
3017, 3023, 3117, 3324, 3395, 3599, 4036]

These are the bits that are set. If you use "len", you can get the
number of them.
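
If a string of ones and zeros is what's wanted, the list of set bits can be expanded into one. The following is only a minimal sketch: the helper name bits_to_bitstring is made up for illustration, the length of 4096 is taken from the Open Babel figure mentioned above, and it assumes the indices in .bits count from 1 (pass first_index=0 if they turn out to count from 0).

```python
def bits_to_bitstring(set_bits, length, first_index=1):
    """Return a '0'/'1' string of the given length with the listed bits set.

    first_index=1 is an assumption about how the indices are numbered.
    """
    chars = ["0"] * length
    for b in set_bits:
        pos = b - first_index
        if not 0 <= pos < length:
            raise ValueError(f"bit {b} does not fit in {length} bits")
        chars[pos] = "1"
    return "".join(chars)


# The set bits from the pybel example in this message.
bits = [556, 1348, 1509, 1547, 1993, 2078, 2089, 2378, 2487, 2531, 2700,
        3017, 3023, 3117, 3324, 3395, 3599, 4036]

bitstring = bits_to_bitstring(bits, 4096)
print(len(bitstring), bitstring.count("1"))  # 4096 18
```

The number of ones in the string matches len(bits), which is a quick sanity check on the conversion.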

Regards,

- Noel

On Fri, 7 Dec 2018 at 09:49, I. Camps <ica...@gmail.com> wrote:

@Geoff

I use Python.

I already made a script to convert hex to binary but, as I wrote
previously, the fingerprint (fp) from OpenBabel comes as a set of
hex numbers. I converted each one to binary and then concatenated
all the binaries. Is that okay?
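
For reference, the convert-and-concatenate approach described here might look like the sketch below. The function name hex_words_to_bitstring and the sample words are invented for illustration. It assumes each hex number stands for a 32-bit word and that words and bits are read in the order printed. Neither assumption is confirmed in this thread, so the bit positions may not line up with those pybel reports. The one real pitfall it guards against is that bin() drops leading zeros, which would silently shift every later bit.

```python
def hex_words_to_bitstring(hex_words, bits_per_word=32):
    """Concatenate hex words into one '0'/'1' string.

    bits_per_word=32 is an assumption about the word size in the file.
    """
    parts = []
    for word in hex_words:
        value = int(word, 16)
        # format() pads with leading zeros to a fixed width;
        # bin(value)[2:] would not, and the string would come out too short.
        parts.append(format(value, f"0{bits_per_word}b"))
    return "".join(parts)


# Made-up input, not real Open Babel output.
example = ["00000001", "80000000"]
bitstring = hex_words_to_bitstring(example)
print(len(bitstring), bitstring.count("1"))  # 64 2
```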

If it is okay, the second problem is that the fp is much longer
(6040) than the RDKit one (1024). I really do not understand the
"folded" issue, because everything I have read about ECFP4 talks
about a 1024-bit string and nothing longer.

@Francois

I certainly will take a look!

thank you both.

Camps

On Fri, Dec 7, 2018 at 1:59 AM Geoffrey Hutchison
<geoff.hutchi...@gmail.com> wrote:

Using OpenBabel, I got a file with the information that the
fingerprint is a 6040 bits set and got hexadecimal numbers.
Using PyBioMed, which is based on RDKit, I got a binary string of
1024 bits, very different from that obtained with OpenBabel.

The RDKit binary string will be "folded" down to 1024 bits, so of
course they will be very different bit strings.
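
To make "folded" concrete: one common way to shorten a fingerprint is to map every set bit onto a shorter length by taking its index modulo the new length, so that distinct bits may land on the same position. The sketch below illustrates only that idea. The name fold_bits is hypothetical, 1-based indices are assumed, and nothing in this thread confirms that RDKit builds its 1024-bit string this way or that the two toolkits hash the same features, so a folded Open Babel fingerprint should not be expected to equal the RDKit one.

```python
def fold_bits(set_bits, new_length, first_index=1):
    """Fold set-bit indices onto a shorter fingerprint via modulo.

    Bits that differ by a multiple of new_length collide on one position.
    """
    folded = {(b - first_index) % new_length + first_index for b in set_bits}
    return sorted(folded)


# 556 and 1580 differ by exactly 1024, so they collide after folding.
print(fold_bits([556, 1580], 1024))  # [556]
```

The shorter the target length, the more such collisions occur, which is why folded fingerprints carry less information than unfolded ones.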

2-) How can I convert the ECFP4 obtained from OpenBabel in
hexadecimal form to a bit string with only ones and zeros?

What programming language are you using? For example in Python, a
quick search on StackExchange:
https://stackoverflow.com/questions/1425493/convert-hex-to-binary
[1]

Hope that helps,
-Geoff
 _______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss [2]


Links:
------
[1] https://stackoverflow.com/questions/1425493/convert-hex-to-binary
[2] https://lists.sourceforge.net/lists/listinfo/openbabel-discuss


