Dear Helpdesk,
I am using CDK (version 2.7) to generate FCFP4 and FCFP6 fingerprints for the
compounds butyramide
(https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL1231396.sdf) and ethanol
(https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL545.sdf) from their
MolFiles, which I downloaded from ChEMBL. This is the code I used (shown here
for FCFP4 of ethanol):
---------------------------------------------------------------------------------------------------------------
package ecfp;

import java.io.FileInputStream;
import java.io.IOException;

import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.fingerprint.CircularFingerprinter;
import org.openscience.cdk.fingerprint.ICountFingerprint;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IChemObjectBuilder;
import org.openscience.cdk.io.MDLV2000Reader;
import org.openscience.cdk.silent.SilentChemObjectBuilder;

public class Main {
    public static void main(String[] args) throws CDKException, IOException {
        String filename = "C:\\Users\\NGWO0001\\Downloads\\CHEMBL545.sdf.txt";
        IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();

        // Read the molecule; try-with-resources closes the reader afterwards.
        IAtomContainer mol;
        try (MDLV2000Reader reader = new MDLV2000Reader(new FileInputStream(filename))) {
            mol = reader.read(bldr.newAtomContainer());
        }

        CircularFingerprinter fingerprinter0 =
                new CircularFingerprinter(CircularFingerprinter.CLASS_FCFP4);

        // Print each populated bin as "<hash> <count>".
        System.out.println("FCFP4 Ethanol:");
        ICountFingerprint result0 = fingerprinter0.getCountFingerprint(mol);
        for (int k = 0, n = result0.numOfPopulatedbins(); k < n; ++k) {
            System.out.printf("%d %d%n", result0.getHash(k), result0.getCount(k));
        }
    }
}
---------------------------------------------------------------------------------------------------------------
The results I got were:
FCFP4 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1
FCFP6 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1
FCFP4 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1
FCFP6 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1
I suspect these results are not right. I thought a fingerprint is supposed to
be a series of hashes, so the values ought to be fixed-length integers.
However, in the results above (for example, FCFP6 for ethanol), one value is
10 digits long, two are 9 digits long, and the others are single digits.
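To check my own assumption, here is a small sketch in plain Java (no CDK
needed; the class name HashWidthDemo and the helper toFixedWidthHex are my own
and not part of CDK). It prints the ethanol hash values from above as
zero-padded 8-digit hexadecimal. Since a Java int is always 32 bits, every
value has the same width in that form, so the varying number of digits may
only be an effect of printing in decimal:

```java
public class HashWidthDemo {
    // Hypothetical helper: format a 32-bit int as exactly 8 hex digits.
    // Negative values are shown in their two's-complement form.
    static String toFixedWidthHex(int hash) {
        return String.format("%08x", hash);
    }

    public static void main(String[] args) {
        // Hash values copied from the FCFP6 ethanol output above.
        int[] hashes = {-1212393386, 0, 3, 629394235, 824716024};
        for (int h : hashes) {
            // Decimal on the left, fixed-width hex on the right.
            System.out.printf("%11d -> %s%n", h, toFixedWidthHex(h));
        }
    }
}
```

This does not tell me whether small values such as 0, 2 and 3 are plausible
hash codes in the first place.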
Can you please tell me what I am doing wrong?
Thank you in advance for your assistance and time.
Best regards,
Woon Yee