Thank you, sir, for your reply.

The RDKit version I am using is 2020.03.4.

Below I have included each SDF section together with the error I receive for it.


*ERROR: Problems encountered parsing Mol data, M  END missing around line 16739*

>  <DSSTox_Compound_id>
DTXCID701169

>  <DSSTox_Substance_id>
DTXSID6021169

>  <CASRN>
61477-94-9

>  <QC_Level>
DSSTox_High

>  <Preferred_name>
Pirmenol hydrochloride

>  <Mol_Weight>
374.9500000000

>  <Mol_Formula>
C22H31ClN2O

>  <Monoisotopic_Mass>
374.2124913000

>  <Dashboard_URL>
https://comptox.epa.gov/dashboard/DTXSID6021169

$$$$
DTXCID601285170
  Mrv1805 05101813452D

  0  0  0     0  0            999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 22 23 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C 3.5184 1.3335 0 0
M  V30 2 C 5.0584 1.3335 0 0
M  V30 3 C 5.8282 0 0 0
M  V30 4 C 5.0584 -1.3335 0 0
M  V30 5 C 3.5184 -1.3335 0 0
M  V30 6 C 2.7484 0 0 0
M  V30 7 C 1.2084 0 0 0
M  V30 8 C 0.4386 -1.3335 0 0
M  V30 9 C -1.1014 -1.3335 0 0
M  V30 10 C -1.8714 0 0 0
M  V30 11 C -1.1014 1.3335 0 0
M  V30 12 C 0.4386 1.3335 0 0
M  V30 13 R# -1.8714 2.6671 0 0 RGROUPS=(1 1)
M  V30 14 R# -3.4114 0 0 0 RGROUPS=(1 1)
M  V30 15 R# -1.8712 -2.6671 0 0 RGROUPS=(1 1)
M  V30 16 R# 1.2084 -2.6671 0 0 RGROUPS=(1 1)
M  V30 17 R# 2.7486 -2.6671 0 0 RGROUPS=(1 1)
M  V30 18 R# 1.2086 2.6671 0 0 RGROUPS=(1 1)
M  V30 19 R# 2.7484 2.6671 0 0 RGROUPS=(1 1)
M  V30 20 R# 5.8284 2.6671 0 0 RGROUPS=(1 1)
M  V30 21 R# 7.3682 0 0 0 RGROUPS=(1 1)
M  V30 22 R# 5.8282 -2.6671 0 0 RGROUPS=(1 1)
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 2 1 2
M  V30 2 1 2 3
M  V30 3 2 3 4
M  V30 4 1 4 5
M  V30 5 2 5 6
M  V30 6 1 6 1
M  V30 7 1 6 7
M  V30 8 1 8 9
M  V30 9 2 9 10
M  V30 10 1 10 11
M  V30 11 2 11 12
M  V30 12 2 7 8
M  V30 13 1 12 7
M  V30 14 1 9 15
M  V30 15 1 8 16
M  V30 16 1 5 17
M  V30 17 1 4 22
M  V30 18 1 3 21
M  V30 19 1 2 20
M  V30 20 1 1 19
M  V30 21 1 12 18
M  V30 22 1 11 13
M  V30 23 1 10 14
M  V30 END BOND
M  V30 END CTAB
M  V30 BEGIN RGROUP 1
M  V30 RLOGIC 0 1 >0
M  V30 BEGIN CTAB
M  V30 COUNTS 1 0 0 0 0
M  V30 BEGIN ATOM
M  V30 1 Br -7.3682 -1.2499 0 0 ATTCHPT=1
M  V30 END ATOM
M  V30 END CTAB
M  V30 END RGROUP
M  END
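
The error message gives only a line number (16739), not a record name. A minimal sketch, using only the Python standard library, of how such a line number could be mapped back to the record that contains it, and of how to list the V3000 blocks that record opens. The helper names `find_record` and `v3000_blocks` and the toy `demo` data are hypothetical, not part of RDKit. The record above opens a `BEGIN RGROUP` block after `END CTAB`; whether RDKit 2020.03.4 reads such blocks is an assumption not confirmed in this thread, but an unread block at that position would be consistent with a "M  END missing" message.

```python
def find_record(sdf_lines, target_line):
    """Return (first_line, last_line, title) of the SDF record containing target_line.

    Line numbers are 1-based, like the numbers in the error messages.
    Returns None if the line number is outside the file.
    """
    start = 1  # first line of the record currently being scanned
    for lineno, line in enumerate(sdf_lines, start=1):
        if line.strip() == "$$$$":  # SDF record delimiter
            if start <= target_line <= lineno:
                return start, lineno, sdf_lines[start - 1].strip()
            start = lineno + 1
    # A trailing record that is not closed by $$$$
    if start <= target_line <= len(sdf_lines):
        return start, len(sdf_lines), sdf_lines[start - 1].strip()
    return None


def v3000_blocks(record_lines):
    """List the V3000 block names opened in a record, e.g. CTAB, ATOM, RGROUP."""
    names = []
    for line in record_lines:
        parts = line.split()
        if parts[:3] == ["M", "V30", "BEGIN"] and len(parts) > 3:
            names.append(parts[3])
    return names


# Toy two-record file (hypothetical data, only to exercise the helpers)
demo = [
    "mol_A", "  toy", "", "M  END", "$$$$",
    "mol_B", "  toy", "", "M  V30 BEGIN CTAB", "M  V30 END CTAB",
    "M  V30 BEGIN RGROUP 1", "M  V30 END RGROUP", "M  END", "$$$$",
]
first, last, title = find_record(demo, 11)
print(title, first, last, v3000_blocks(demo[first - 1:last]))
```

With the real file, the lines could be read with `open('1.sdf').read().splitlines()` and passed to `find_record(lines, 16739)`.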

*ERROR: Could not sanitize molecule ending on line 78558*

>  <DSSTox_Compound_id>
DTXCID501446

>  <DSSTox_Substance_id>
DTXSID6026298

>  <CASRN>
108-38-3

>  <QC_Level>
DSSTox_High

>  <Preferred_name>
m-Xylene

>  <Mol_Weight>
106.1680000000

>  <Mol_Formula>
C8H10

>  <Monoisotopic_Mass>
106.0782503220

>  <Dashboard_URL>
https://comptox.epa.gov/dashboard/DTXSID6026298

$$$$
DTXCID90820451
  Mrv1611104121614362D

  0  0  0     0  0            999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 17 20 0 0 0
M  V30 BEGIN ATOM
M  V30 1 O -0.7801 -1.2459 0 0 CHG=-1
M  V30 2 C -2.2448 0.77 0 0
M  V30 3 N -2.2448 -0.77 0 0
M  V30 4 C -3.5784 1.54 0 0
M  V30 5 C -3.5784 -1.54 0 0
M  V30 6 C -4.9121 0.77 0 0
M  V30 7 C -4.9121 -0.77 0 0
M  V30 8 S -0.7801 1.2459 0 0
M  V30 9 Zn 0.1251 0 0 0 CHG=2
M  V30 10 O 0.7801 1.2459 0 0 CHG=-1
M  V30 11 C 2.2448 -0.77 0 0
M  V30 12 N 2.2448 0.77 0 0
M  V30 13 C 3.5784 -1.54 0 0
M  V30 14 C 3.5784 1.54 0 0
M  V30 15 C 4.9121 -0.77 0 0
M  V30 16 C 4.9121 0.77 0 0
M  V30 17 S 0.7801 -1.2459 0 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 3 1
M  V30 2 1 3 2
M  V30 3 1 4 2
M  V30 4 2 8 2
M  V30 5 1 5 3
M  V30 6 2 6 4
M  V30 7 2 7 5
M  V30 8 1 7 6
M  V30 9 1 9 8
M  V30 10 1 17 9
M  V30 11 1 12 10
M  V30 12 1 12 11
M  V30 13 1 13 11
M  V30 14 2 17 11
M  V30 15 1 14 12
M  V30 16 2 15 13
M  V30 17 2 16 14
M  V30 18 1 16 15
M  V30 19 1 9 1
M  V30 20 1 9 10
M  V30 END BOND
M  V30 END CTAB
M  END
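
A sanitization failure of this kind can be explored without RDKit by summing the bond orders drawn on each atom. The sketch below is a simplified, standard-library-only reader for the first CTAB of a V3000 block; the function name `bond_order_sums` is hypothetical, and aromatic or query bond types are not handled specially. In the record above, each oxygen carrying `CHG=-1` is drawn with two single bonds (one to N, one to Zn), whereas a negatively charged oxygen normally carries one. That this is what the valence check objects to is an interpretation, not something confirmed in the thread.

```python
def bond_order_sums(molblock_lines):
    """Sum the drawn bond orders on each atom of the first V3000 CTAB.

    Returns {atom_index: (symbol, charge, bond_order_sum)}.
    """
    atoms, sums = {}, {}
    section = None
    for line in molblock_lines:
        parts = line.split()
        if parts[:2] != ["M", "V30"] or len(parts) < 4:
            continue
        key, rest = parts[2], parts[3:]
        if key == "BEGIN":
            section = rest[0]
        elif key == "END":
            if rest[0] == "CTAB":
                break  # stop after the first connection table
            section = None
        elif section == "ATOM":
            charge = 0
            for token in rest:
                if token.startswith("CHG="):
                    charge = int(token[4:])
            atoms[int(key)] = (rest[0], charge)
            sums[int(key)] = 0
        elif section == "BOND":
            # bond line: index, type, first atom, second atom
            order, a1, a2 = int(rest[0]), int(rest[1]), int(rest[2])
            sums[a1] += order
            sums[a2] += order
    return {i: (sym, chg, sums[i]) for i, (sym, chg) in atoms.items()}


# Three atoms taken from the record above and renumbered for brevity
# (O-, N, Zn with the two bonds drawn on that oxygen).
excerpt = """\
M  V30 BEGIN CTAB
M  V30 COUNTS 3 2 0 0 0
M  V30 BEGIN ATOM
M  V30 1 O -0.7801 -1.2459 0 0 CHG=-1
M  V30 2 N -2.2448 -0.77 0 0
M  V30 3 Zn 0.1251 0 0 0 CHG=2
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 2 1
M  V30 2 1 3 1
M  V30 END BOND
M  V30 END CTAB
""".splitlines()

result = bond_order_sums(excerpt)
for idx, (symbol, charge, total) in sorted(result.items()):
    print(idx, symbol, charge, total)
```

Running the same function over the full record would show the bond-order total for every atom, which makes unusual combinations of charge and bond count easier to spot.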









On Thu, Aug 6, 2020 at 3:51 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi,
>
> Without seeing the SDF itself it's hard to be specific, but here's what
> the error messages are telling you, in general:
>
> The first one normally indicates a badly formed record in the SDF. If you
> look around that line in the file you will, hopefully, see a malformed
> record.
> The next one, "Explicit valence", indicates that the molecule has an atom
> (in this case an "O") that has the equivalent of three bonds to it. That's
> not chemically reasonable, so the software complains.
> The error about "Alkyl" is self-explanatory: there's a molecule in the SDF
> which has an atom with symbol "Alkyl".
> The rest are warnings.
>
> In order to provide more specific help, we'll need to see the SDF you're
> using (or at least the SDF for the molecules that are failing) as well as
> information about which version of the RDKit you're using.
>
> -greg
>
>
>
> On Wed, Aug 5, 2020 at 11:43 PM Pitanti Chalowa <ch1...@gmail.com> wrote:
>
>> Respected Altruistic Researcher,
>> While converting an SDF file to fingerprints, I am getting several errors.
>>
>> My code
>>
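>> from rdkit import Chem
>>
>> suppl = Chem.SDMolSupplier('1.sdf')
>> for mol in suppl:
>>   if mol is None: continue  # skip records that failed to parse
>>   # print(mol.GetNumAtoms())
>>
>> # skip unparsable records here too; RDKFingerprint cannot take None
>> fps = [Chem.RDKFingerprint(x) for x in suppl if x is not None]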
>>
>> I am facing many errors
>>
>> ERROR: Problems encountered parsing Mol data, M  END missing around line 16739...
>> ERROR: Explicit valence for atom # 0 O, 3, is greater than permitted...
>> ERROR: Could not sanitize molecule ending on line 78558...
>> ERROR: Post-condition Violation
>> RDKit ERROR: Element 'Alkyl' not found
>> RDKit ERROR: Violation occurred on line 91 in file /home/conda/feedstock_root/build_artifacts/rdkit_1593788763912/work/Code/GraphMol/PeriodicTable.h
>> RDKit ERROR: Failed Expression: anum > -1
>> ...
>> WARNING: not removing hydrogen atom without neighbors
>>
>> RDKit WARNING: atom 0 has specified valence (4) smaller than the drawn valence 6.
>>
>> Please direct me to the references. How can I correct them?
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
