Re: [Rdkit-discuss] Request for Assistance: Understanding InChI to Mol Conversion Issue in RDKit

2023-12-12 Thread Jan Holst Jensen
You can also cross-check against the reference InChI software to see whether
this is an RDKit issue or a more general InChI issue. To convert InChI strings
(and optionally AuxInfo) to SDF format with the standard inchi-1 executable,
put the InChI string and AuxInfo into a text file and convert it like this:


P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>type test.txt
InChI=1/Ca.2H
AuxInfo=1/0/N:1;2;3/rA:3Ca0H0H0/rB:;;/rC:;;;

P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>inchi-1.exe /InChI2Struct /OutputSDF test.txt

InChI version 1, Software v. 1.06 (inchi-1 executable)
Windows 32-bit Build (MS VS 2015) of Dec 18 2020 20:45:14

Opened log file 'test.txt.log'
Opened input file 'test.txt'
Opened output file 'test.txt.txt'
Opened problem file 'test.txt.prb'
The command line used:
"inchi-1.exe /InChI2Struct /OutputSDF test.txt"
Converting InChI(s) to structure(s) in MOL format
Output SDfile only without stereochemical information and atom coordinates
Input format: InChI (plain identifier)
Output format: SDfile only (without stereochemical info and atom coordinates)

Timeout per structure: 6 msec
Up to 1024 atoms per structure


Finished processing 1 structure: 0 errors, processing time 0:00:00.00

Elapsed walltime: 15 msec.

P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>type test.txt.txt
Structure #1.
  InChIV10

  3  0  0  0  0  0  0  0  0  0  1 V2000
    0.    0.    0. Ca  0  0  0 0 15  0  0  0  0
    0.    0.    0. H   0  0  0 0 15  0  0  0  0
    0.    0.    0. H   0  0  0 0 15  0  0  0  0
M  END


P:\Projects\RInChI\INCHI-1-BIN__V1.06\windows\32bit>
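
For more than a handful of strings, the manual steps in the transcript above could be scripted. The following is only a minimal sketch: the helper names (write_inchi_input, build_command, convert) are hypothetical, the executable is assumed to be on PATH, and the "-" flag prefix for non-Windows builds is an assumption that should be checked against your copy of inchi-1.

```python
import subprocess
from pathlib import Path


def write_inchi_input(path, inchi_string, aux_info=None):
    """Write the InChI line (and the optional AuxInfo line) to a text file."""
    lines = [inchi_string]
    if aux_info:
        lines.append(aux_info)
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def build_command(input_path, executable="inchi-1.exe", flag_prefix="/"):
    """Build the same command line as in the transcript.

    flag_prefix="/" matches the Windows transcript; "-" is assumed
    (not verified here) for other platforms.
    """
    return [
        executable,
        f"{flag_prefix}InChI2Struct",
        f"{flag_prefix}OutputSDF",
        str(input_path),
    ]


def convert(input_path, **kwargs):
    """Run inchi-1; per the transcript, results land in '<input>.txt'."""
    subprocess.run(build_command(input_path, **kwargs), check=True)
    return Path(str(input_path) + ".txt")
```

Calling write_inchi_input("test.txt", "InChI=1/Ca.2H", "AuxInfo=1/0/N:1;2;3/rA:3Ca0H0H0/rB:;;/rC:;;;") followed by convert("test.txt") should reproduce the run shown above, provided inchi-1.exe can be found.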

Cheers
-- Jan


Re: [Rdkit-discuss] Request for Assistance: Understanding InChI to Mol Conversion Issue in RDKit

2023-12-11 Thread S Joshua Swamidass
Perhaps provide some examples where this failure happens.

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Request for Assistance: Understanding InChI to Mol Conversion Issue in RDKit

2023-11-28 Thread 李大舟
Dear RDKit Developers and Maintainers,


I hope this email finds you well. My name is Dr. Dazhou Li, and I am a 
researcher working on the development of a tool for extracting chemical 
compound structures recognized by OCR (Optical Character Recognition) 
technology. I have been using the RDKit library for a crucial step in this 
process, specifically the rdkit.Chem.inchi.MolFromInchi() function, to convert 
InChI-format strings into Mol format representations.


Firstly, I would like to express my gratitude for the excellent work you have 
done in developing and maintaining the RDKit library, which has been an 
invaluable resource in my research. The library has consistently delivered 
high-quality results in various aspects of chemical informatics, and I 
appreciate your dedication to its development.


However, I have encountered a specific issue with the 
rdkit.Chem.inchi.MolFromInchi() function that I hope you can help me understand 
and resolve. When attempting to convert InChI-format strings generated by my 
tool, some of them fail with an error message reporting "NaN." Since the 
rdkit.Chem.inchi.MolFromInchi() function calls C++ code, I am unable to 
directly inspect its execution or source code to diagnose the issue.


My primary request is for assistance in understanding the internal workings of 
the rdkit.Chem.inchi.MolFromInchi() function, specifically the checking process 
or generation step that leads to the "NaN" error when certain InChI-format 
strings are processed. It is crucial for my research to determine at which 
point in the execution of this function my generated InChI-formatted strings 
are considered unreasonable, as this information will help me refine my tool's 
output to be compatible with RDKit.
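
One rough way to see at which stage a given string is rejected is to retry the conversion with sanitization switched off. The following is only a sketch, not a description of RDKit's internals: it assumes that rdkit.Chem.inchi.MolFromInchi() returns None when a conversion fails and accepts a sanitize keyword, and the names diagnose_inchi and converter are hypothetical.

```python
def _rdkit_converter(inchi_string, sanitize):
    # Imported lazily so the helper below can also be exercised with a stub.
    from rdkit.Chem import inchi
    return inchi.MolFromInchi(inchi_string, sanitize=sanitize)


def diagnose_inchi(inchi_string, converter=_rdkit_converter):
    """Classify one InChI string as 'ok', 'sanitization', or 'parse'.

    'sanitization': a molecule is built only when sanitize=False, so the
                    string was read but the resulting structure was rejected.
    'parse':        no molecule is returned even without sanitization.
    """
    if converter(inchi_string, True) is not None:
        return "ok"
    if converter(inchi_string, False) is not None:
        return "sanitization"
    return "parse"
```

Running diagnose_inchi() over the failing strings and grouping them by result would show whether the problem lies mostly in the strings themselves or in the structures they describe, and the failing strings could then be posted to the list as examples.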


I understand that the RDKit library is a complex and comprehensive toolkit, and 
I appreciate the complexity involved in diagnosing such issues. However, any 
insights or guidance you can provide regarding the problematic cases and the 
internal processes of the rdkit.Chem.inchi.MolFromInchi() function would be 
immensely valuable to me and would help me ensure the compatibility of my tool 
with RDKit.


If possible, I would be grateful for access to relevant documentation or 
insights into the specific error conditions that may lead to the "NaN" result. 
Additionally, any suggestions or best practices for generating InChI-format 
strings that are more likely to be successfully processed by RDKit would be 
greatly appreciated.


Thank you for your time and consideration. I look forward to your response and 
hope that we can collaborate to resolve this issue and enhance the 
compatibility of my tool with the RDKit library.


Please feel free to reach out to me if you require any additional information 
or if there are specific details about my tool or the InChI-format strings that 
would aid in diagnosing the issue.


Best regards,


Dr. Dazhou Li
Shenyang University of Chemical Technology