An ECFP4 implementation could use a single bit or a million bits. The information actually being encoded is an element of a set with billions of members or more (I forget the details), so it's hashed down to something manageable. The shorter the length, the more bit collisions (with a single bit, everything collides, for example). Open Babel uses 4096. I would regard this as the minimum.
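
To make the collision point concrete, here is a minimal sketch. It assumes folding by simple modulo, which is only an illustration: the feature identifiers are made up, fold_bits is a hypothetical helper, and real toolkits may hash or fold differently.

```python
def fold_bits(feature_ids, n_bits):
    """Map each feature identifier onto a bit index in a vector of length n_bits.

    Assumption: plain modulo folding, used here only to show how collisions arise.
    """
    return sorted({fid % n_bits for fid in feature_ids})

# Made-up feature identifiers (not real ECFP4 output).
features = [3, 1027, 2051, 5000]

print(fold_bits(features, 4096))  # [3, 904, 1027, 2051] -> 4 distinct bits
print(fold_bits(features, 1024))  # [3, 904] -> three features collide on bit 3
print(fold_bits(features, 1))     # [0] -> everything collides
```

With four features, the 4096-bit version keeps four distinct bits, the 1024-bit version keeps only two, and the single-bit version keeps one.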
When converting from hex, you could concatenate the binaries. Or you could use pybel, which does the conversion for you:

>>> pybel.readstring("smi", "c1ccccc1C(=O)Cl").calcfp("ecfp4").bits
[556, 1348, 1509, 1547, 1993, 2078, 2089, 2378, 2487, 2531, 2700, 3017, 3023, 3117, 3324, 3395, 3599, 4036]

These are the bits that are set. If you use "len", you can get the number of them.

Regards,
- Noel

On Fri, 7 Dec 2018 at 09:49, I. Camps <ica...@gmail.com> wrote:
> @Geoff
> I use Python.
> I already made an script to convert hex to binary, but as I wrote
> previously, the fingerprint (fp) from OpenBabel is in the form of a set of
> hex numbers. I converted each one to binary and then concatenate all the
> binaries. Is it that okay?
> If it is okay, the second problem is that the fp is much longer (6040)
> than the RDKit (1024). I really do not understand the "folded" issue
> because any read about ECFP4 talk about a 1024 bit string and not higher.
>
> @Francois
> I certainly will take a look!
>
> thank you both.
>
> Camps
>
>
> On Fri, Dec 7, 2018 at 1:59 AM Geoffrey Hutchison <
> geoff.hutchi...@gmail.com> wrote:
>
>> Using OpenBabel, I got a file with the information that the fingerprint
>> is a 6040 bits set and got hexadecimal numbers.
>> Using PyBioMed, which is based in RDKIT, I got a binary string of 1024
>> bits, very different from that obtained with OpenBabel.
>>
>>
>> The RDKit binary string will be "folded" down to 1024 bits, so of course
>> they will be very different bit strings.
>>
>> 2-) How can I convert the ECFP4 obtained from OpenBabel in hexadecimal
>> form to a bit string with only ones and zeros?
>>
>>
>> What programming language are you using? For example in Python, a quick
>> search on StackExchange:
>> https://stackoverflow.com/questions/1425493/convert-hex-to-binary
>>
>> Hope that helps,
>> -Geoff
>>
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
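
For the "convert each hex number and concatenate" approach discussed in the thread, a minimal sketch in plain Python might look like the following. It assumes the fingerprint arrives as whitespace-separated 32-bit hex words. The function names and the example words are hypothetical, and the order of words (and of bits within a word) in Open Babel's output is not confirmed here, so individual bit positions may not line up with pybel's. The count of set bits does not depend on ordering, so it is a reasonable sanity check against the "len" of the bits list.

```python
def hex_words_to_bitstring(hex_text, bits_per_word=32):
    """Concatenate the zero-padded binary form of each hex word, in the order given."""
    words = hex_text.split()
    return "".join(format(int(word, 16), "0{}b".format(bits_per_word)) for word in words)

def count_set_bits(hex_text):
    """Count the 1 bits across all hex words; independent of word or bit order."""
    return sum(bin(int(word, 16)).count("1") for word in hex_text.split())

# Made-up hex words for illustration (not real Open Babel output).
example = "00000001 80000000 00000000"

bitstring = hex_words_to_bitstring(example)
print(len(bitstring))           # 96
print(bitstring.count("1"))     # 2
print(count_set_bits(example))  # 2
```

Zero-padding each word matters: without it, leading zeros are dropped and the concatenated string comes out shorter than the real fingerprint length.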