Thank you, Sir, for your reply. The RDKit version I am using is 2020.03.4.
I have included each SDF section below, together with the error I receive for it.

*ERROR: Problems encountered parsing Mol data, M END missing around line 16739*

> <DSSTox_Compound_id>
DTXCID701169

> <DSSTox_Substance_id>
DTXSID6021169

> <CASRN>
61477-94-9

> <QC_Level>
DSSTox_High

> <Preferred_name>
Pirmenol hydrochloride

> <Mol_Weight>
374.9500000000

> <Mol_Formula>
C22H31ClN2O

> <Monoisotopic_Mass>
374.2124913000

> <Dashboard_URL>
https://comptox.epa.gov/dashboard/DTXSID6021169

$$$$
DTXCID601285170
  Mrv1805 05101813452D

  0  0  0     0  0            999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 22 23 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C 3.5184 1.3335 0 0
M  V30 2 C 5.0584 1.3335 0 0
M  V30 3 C 5.8282 0 0 0
M  V30 4 C 5.0584 -1.3335 0 0
M  V30 5 C 3.5184 -1.3335 0 0
M  V30 6 C 2.7484 0 0 0
M  V30 7 C 1.2084 0 0 0
M  V30 8 C 0.4386 -1.3335 0 0
M  V30 9 C -1.1014 -1.3335 0 0
M  V30 10 C -1.8714 0 0 0
M  V30 11 C -1.1014 1.3335 0 0
M  V30 12 C 0.4386 1.3335 0 0
M  V30 13 R# -1.8714 2.6671 0 0 RGROUPS=(1 1)
M  V30 14 R# -3.4114 0 0 0 RGROUPS=(1 1)
M  V30 15 R# -1.8712 -2.6671 0 0 RGROUPS=(1 1)
M  V30 16 R# 1.2084 -2.6671 0 0 RGROUPS=(1 1)
M  V30 17 R# 2.7486 -2.6671 0 0 RGROUPS=(1 1)
M  V30 18 R# 1.2086 2.6671 0 0 RGROUPS=(1 1)
M  V30 19 R# 2.7484 2.6671 0 0 RGROUPS=(1 1)
M  V30 20 R# 5.8284 2.6671 0 0 RGROUPS=(1 1)
M  V30 21 R# 7.3682 0 0 0 RGROUPS=(1 1)
M  V30 22 R# 5.8282 -2.6671 0 0 RGROUPS=(1 1)
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 2 1 2
M  V30 2 1 2 3
M  V30 3 2 3 4
M  V30 4 1 4 5
M  V30 5 2 5 6
M  V30 6 1 6 1
M  V30 7 1 6 7
M  V30 8 1 8 9
M  V30 9 2 9 10
M  V30 10 1 10 11
M  V30 11 2 11 12
M  V30 12 2 7 8
M  V30 13 1 12 7
M  V30 14 1 9 15
M  V30 15 1 8 16
M  V30 16 1 5 17
M  V30 17 1 4 22
M  V30 18 1 3 21
M  V30 19 1 2 20
M  V30 20 1 1 19
M  V30 21 1 12 18
M  V30 22 1 11 13
M  V30 23 1 10 14
M  V30 END BOND
M  V30 END CTAB
M  V30 BEGIN RGROUP 1
M  V30 RLOGIC 0 1 >0
M  V30 BEGIN CTAB
M  V30 COUNTS 1 0 0 0 0
M  V30 BEGIN ATOM
M  V30 1 Br -7.3682 -1.2499 0 0 ATTCHPT=1
M  V30 END ATOM
M  V30 END CTAB
M  V30 END RGROUP
M  END

*ERROR: Could not sanitize molecule ending on line 78558*

> <DSSTox_Compound_id>
DTXCID501446

> <DSSTox_Substance_id>
DTXSID6026298

> <CASRN>
108-38-3

> <QC_Level>
DSSTox_High

> <Preferred_name>
m-Xylene

> <Mol_Weight>
106.1680000000

> <Mol_Formula>
C8H10

> <Monoisotopic_Mass>
106.0782503220

> <Dashboard_URL>
https://comptox.epa.gov/dashboard/DTXSID6026298

$$$$
DTXCID90820451
  Mrv1611104121614362D

  0  0  0     0  0            999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 17 20 0 0 0
M  V30 BEGIN ATOM
M  V30 1 O -0.7801 -1.2459 0 0 CHG=-1
M  V30 2 C -2.2448 0.77 0 0
M  V30 3 N -2.2448 -0.77 0 0
M  V30 4 C -3.5784 1.54 0 0
M  V30 5 C -3.5784 -1.54 0 0
M  V30 6 C -4.9121 0.77 0 0
M  V30 7 C -4.9121 -0.77 0 0
M  V30 8 S -0.7801 1.2459 0 0
M  V30 9 Zn 0.1251 0 0 0 CHG=2
M  V30 10 O 0.7801 1.2459 0 0 CHG=-1
M  V30 11 C 2.2448 -0.77 0 0
M  V30 12 N 2.2448 0.77 0 0
M  V30 13 C 3.5784 -1.54 0 0
M  V30 14 C 3.5784 1.54 0 0
M  V30 15 C 4.9121 -0.77 0 0
M  V30 16 C 4.9121 0.77 0 0
M  V30 17 S 0.7801 -1.2459 0 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 3 1
M  V30 2 1 3 2
M  V30 3 1 4 2
M  V30 4 2 8 2
M  V30 5 1 5 3
M  V30 6 2 6 4
M  V30 7 2 7 5
M  V30 8 1 7 6
M  V30 9 1 9 8
M  V30 10 1 17 9
M  V30 11 1 12 10
M  V30 12 1 12 11
M  V30 13 1 13 11
M  V30 14 2 17 11
M  V30 15 1 14 12
M  V30 16 2 15 13
M  V30 17 2 16 14
M  V30 18 1 16 15
M  V30 19 1 9 1
M  V30 20 1 9 10
M  V30 END BOND
M  V30 END CTAB
M  END

On Thu, Aug 6, 2020 at 3:51 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi,
>
> Without seeing the SDF itself it's hard to be specific, but here's what
> the error messages are telling you, in general:
>
> The first one normally indicates a badly formed record in the SDF. If you
> look around that line in the file you will, hopefully, see a malformed
> record.
> The next one, "Explicit valence", indicates that the molecule has an atom
> (in this case an "O") that has the equivalent of three bonds to it. That's
> not chemically reasonable, so the software complains.
> The error about "Alkyl" is self-explanatory: there's a molecule in the SDF
> which has an atom with symbol "Alkyl".
> The rest are warnings.
>
> In order to provide more specific help, we'll need to see the SDF you're
> using (or at least the SDF for the molecules that are failing) as well as
> information about which version of the RDKit you're using.
>
> -greg
>
>
>
> On Wed, Aug 5, 2020 at 11:43 PM Pitanti Chalowa <ch1...@gmail.com> wrote:
>
>> Respected Altruistic Researcher,
>> While converting an SDF file to fingerprints, I am facing several errors.
>>
>> My code
>>
>> from rdkit import Chem
>>
>> suppl = Chem.SDMolSupplier('1.sdf')
>> for mol in suppl:
>>     if mol is None: continue
>>     # print(mol.GetNumAtoms())
>>
>> fps = [Chem.RDKFingerprint(x) for x in suppl if x is not None]
>>
>> I am facing many errors:
>>
>> ERROR: Problems encountered parsing Mol data, M END missing around line
>> 16739...
>> ERROR: Explicit valence for atom # 0 O, 3, is greater than permitted...
>> ERROR: Could not sanitize molecule ending on line 78558...
>> ERROR: Post-condition Violation
>> RDKit ERROR: Element 'Alkyl' not found
>> RDKit ERROR: Violation occurred on line 91 in file
>> /home/conda/feedstock_root/build_artifacts/rdkit_1593788763912/work/Code/GraphMol/PeriodicTable.h
>> RDKit ERROR: Failed Expression: anum > -1
>> ...
>> WARNING: not removing hydrogen atom without neighbors
>>
>> RDKit WARNING: atom 0 has specified valence (4) smaller than the drawn
>> valence 6.
>>
>> Please direct me to the references. How can I correct them?
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
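The advice in the quoted reply, to look at the record around the line number an error reports, could be sketched roughly as follows. This is only an illustration using the Python standard library, not part of the RDKit: the helper names find_record and flag_query_features are made up here, and the idea that R# atoms or an RGROUP block are what upsets the parser is an assumption that nobody in this thread has confirmed.

```python
def find_record(lines, target_line):
    """Return (first_line, last_line, record_lines) for the SDF record that
    contains target_line (1-based), or None if the line is past the end."""
    first, last, record = 1, 0, []
    for last, line in enumerate(lines, start=1):
        record.append(line.rstrip("\n"))
        if line.strip() == "$$$$":              # SDF record terminator
            if first <= target_line <= last:
                return first, last, record
            first, record = last + 1, []
    if record and first <= target_line <= last:
        return first, last, record              # final record without "$$$$"
    return None


def flag_query_features(record):
    """List V3000 lines with R# atoms or RGROUP blocks.
    Assumption: these are worth inspecting; not a confirmed cause."""
    flags = []
    for line in record:
        tokens = line.split()
        if tokens[:2] != ["M", "V30"]:
            continue
        if len(tokens) > 3 and tokens[3] == "R#":
            flags.append("R# atom: " + line.strip())
        elif tokens[2:4] == ["BEGIN", "RGROUP"]:
            flags.append("RGROUP block: " + line.strip())
    return flags


# Tiny made-up two-record file, only to show the calls.
sample_lines = [
    "mol1", "  demo", "", "M  END", "$$$$",
    "mol2", "  demo", "", "M  V30 1 R# 0 0 0 0 RGROUPS=(1 1)",
    "M  V30 BEGIN RGROUP 1", "M  END", "$$$$",
]
hit = find_record(sample_lines, 9)
if hit is not None:
    first, last, record = hit
    print("record", record[0], "spans lines", first, "to", last)
    for flag in flag_query_features(record):
        print("  ", flag)
```

With a real file, the same function could be given an open file handle, for example find_record(open('1.sdf'), 16739), since it only iterates over lines.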