An ECFP4 implementation could use a single bit or a million bits. The
actual information being encoded is an element of a set with billions of
members or more (I forget the details), so it's hashed down to something
manageable. The shorter the length, the more bit collisions (with a single
bit, everything collides, for example). Open Babel uses 4096. I would
regard this as the minimum.
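
To make the collision point concrete, here is a minimal sketch of folding.
It assumes folding is done by taking each feature identifier modulo the
fingerprint length, which is one common approach and not necessarily what
Open Babel or RDKit do internally. The feature identifiers are made up for
illustration.

```python
def fold_bits(feature_ids, n_bits):
    """Map feature identifiers onto n_bits positions using modulo folding.

    Identifiers that land on the same position collide, so the result can
    contain fewer bits than there were features.
    """
    return sorted({f % n_bits for f in feature_ids})


# Hypothetical 32-bit feature identifiers (not real ECFP4 values).
features = [98513984, 2245384272, 3217380708, 3218693969, 864662311]

for n_bits in (4096, 1024, 1):
    folded = fold_bits(features, n_bits)
    # The count of distinct bits can only stay the same or shrink as
    # n_bits gets smaller.
    print(n_bits, len(folded), folded)
```

With a length of 1, every feature lands on bit 0, which is the "everything
will collide" case.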

When converting from hex, you could concatenate the binaries. Or you could
use pybel, which does the conversion for you:
>>> pybel.readstring("smi", "c1ccccc1C(=O)Cl").calcfp("ecfp4").bits
[556, 1348, 1509, 1547, 1993, 2078, 2089, 2378, 2487, 2531, 2700, 3017,
3023, 3117, 3324, 3395, 3599, 4036]

These are the bits that are set. If you use "len", you can get the number
of them.
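
If you do want the manual route (hex to binary, then concatenate), a rough
sketch follows. It assumes each hex number is one 32-bit word, that the
first word holds the lowest bits, and that positions are counted from 0.
Those are assumptions, as are the hex values, which are made up; check the
result against the pybel ".bits" list for the same molecule before relying
on it.

```python
def words_to_bits(words, bits_per_word=32):
    """Return 0-based positions of set bits; words[0] holds the lowest bits."""
    on_bits = []
    for word_index, word in enumerate(words):
        for offset in range(bits_per_word):
            if (word >> offset) & 1:
                on_bits.append(word_index * bits_per_word + offset)
    return on_bits


def bits_to_string(on_bits, n_bits):
    """Build a string of ones and zeros with position 0 as the first character."""
    chars = ["0"] * n_bits
    for b in on_bits:
        chars[b] = "1"
    return "".join(chars)


# Made-up hex words, not output from Open Babel.
hex_words = ["00000001", "80000000"]
words = [int(h, 16) for h in hex_words]

on_bits = words_to_bits(words)
print(on_bits)        # [0, 63]
print(len(on_bits))   # 2
print(bits_to_string(on_bits, 32 * len(words)))
```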

Regards,
- Noel


On Fri, 7 Dec 2018 at 09:49, I. Camps <ica...@gmail.com> wrote:

> @Geoff
> I use Python.
> I already made a script to convert hex to binary but, as I wrote
> previously, the fingerprint (fp) from OpenBabel is in the form of a set of
> hex numbers. I converted each one to binary and then concatenated all the
> binaries. Is that okay?
> If it is okay, the second problem is that the fp is much longer (6040)
> than the RDKit one (1024). I really do not understand the "folded" issue,
> because everything I read about ECFP4 talks about a 1024-bit string and
> not a longer one.
>
> @Francois
> I certainly will take a look!
>
> thank you both.
>
> Camps
>
>
> On Fri, Dec 7, 2018 at 1:59 AM Geoffrey Hutchison <
> geoff.hutchi...@gmail.com> wrote:
>
>> Using OpenBabel, I got a file with the information that the fingerprint
>> is a 6040 bits set and got hexadecimal numbers.
>> Using PyBioMed, which is based on RDKit, I got a binary string of 1024
>> bits, very different from that obtained with OpenBabel.
>>
>>
>> The RDKit binary string will be "folded" down to 1024 bits, so of course
>> they will be very different bit strings.
>>
>> 2-) How can I convert the ECFP4 obtained from OpenBabel in hexadecimal
>> form to a bit string with only ones and zeros?
>>
>>
>> What programming language are you using? For example in Python, a quick
>> search on StackExchange:
>>  https://stackoverflow.com/questions/1425493/convert-hex-to-binary
>>
>> Hope that helps,
>> -Geoff
>>
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>