Re: [Rdkit-discuss] SDF tags and -

2015-04-30 Thread Dimitri Maziuk
On 2015-04-29 23:08, Greg Landrum wrote: Here are my thoughts on this: The RDKit is usually strict while parsing molecules from SDF, SMILES, or other formats. My point was that given ''' my_property2 1234 my_property3 ''' a lexer shouldn't have a problem recognizing the 2 tags. A
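
A minimal sketch of the kind of lexer described here, assuming the two tags in the example were written as standard SD data headers such as `> <my_property2>`. The function name `lex_data_items` is hypothetical and this is not RDKit code. It starts a new data item only at a header line, so a blank line inside a value does not end the item:

```python
import re

# An SD data header line starts with ">" and carries the field name in
# angle brackets, e.g. "> <my_property2>".
HEADER = re.compile(r"^>.*<([^>]+)>")


def lex_data_items(block):
    """Split the data-item part of one SD record into {name: value}.

    Hypothetical helper: items are delimited by header lines only, so
    blank lines inside a value are kept rather than treated as the end
    of the item.
    """
    items = {}
    name = None
    lines = []
    for line in block.splitlines():
        if line.startswith("$$$$"):  # end-of-record delimiter
            break
        match = HEADER.match(line)
        if match:
            if name is not None:
                items[name] = "\n".join(lines).strip("\n")
            name = match.group(1)
            lines = []
        elif name is not None:
            lines.append(line)
    if name is not None:
        items[name] = "\n".join(lines).strip("\n")
    return items
```

The trade-off is that a value line which itself looks like a header (starts with ">" and contains a name in angle brackets) would be misread as the start of a new item, which is one reason a strict parser may prefer to reject such input.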

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Nicholas Firth
Ahh ok… Interesting way to format a file! Got to love ChemAxon... Best, Nick Nicholas C. Firth | PhD Student | Cancer Therapeutics The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton | Surrey | SM2 5NG T 020 8722 4033 | E

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Paolo Tosco
From: riccardo.viane...@gmail.com Date: Wed, 29 Apr 2015 12:08:48 +0200 Subject: Re: [Rdkit-discuss] SDF tags and - To: tkall...@live.com Hi Tuomo, yes, I agree the behavior seems a bit inconsistent. I suppose that if the correctness of the parser

[Rdkit-discuss] SDF tags and -

2015-04-29 Thread Tuomo Kalliokoski
Hello all, I have got a bunch of SDF files with molecules and some long descriptions in their SDF tags that include stuff like - inside. These files have been produced by ChemAxon's software and are handled fine by their software. Such files can also be written out from RDKit 2014_09_02, but

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Andrew Dalke
Riccardo Vianello: I suppose that if the correctness of the parser is confirmed, then a change could be suggested for the writer, consisting in raising an error if blank lines are present inside the data item. Yes, the SD tag data is not a general purpose data field. It's not possible,
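
The writer-side check suggested above could look roughly like the following. This is a sketch under stated assumptions, not the RDKit writer: `check_data_item_value` is a hypothetical helper that a caller would run on each property value before writing it to an SD file.

```python
def check_data_item_value(name, value):
    """Return value unchanged, or raise ValueError if it holds a blank line.

    Hypothetical helper, not an RDKit function. In an SD file a blank
    line terminates a data item, so a value with an embedded blank line
    cannot be read back as written.
    """
    if value == "":
        return value  # assumption: an empty value is acceptable
    for number, line in enumerate(value.split("\n"), start=1):
        # A whitespace-only line is treated as blank too, which also
        # rejects a value that ends in a newline.
        if line.strip() == "":
            raise ValueError(
                f"data item {name!r}: blank line at line {number} of the value"
            )
    return value
```

Failing at write time keeps the parser strict while telling the user which property needs cleaning up before the file is produced.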

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Tuomo Kalliokoski
Hello Riccardo, That sounds like a very reasonable solution to the issue. [I replied to rdkit-discuss to bring this thread back onto the list] Best regards, Tuomo From: riccardo.viane...@gmail.com Date: Wed, 29 Apr 2015 12:08:48 +0200 Subject: Re: [Rdkit-discuss] SDF tags and - To: tkall

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
On 04/29/2015 01:47 PM, Andrew Dalke wrote: Postel's Robustness principle is a mistake. See RFC 3117 for elaboration, ... Or from http://cacm.acm.org/magazines/2011/8/114933-the-robustness-principle-reconsidered/fulltext : There is a difference between ACM members writing network

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
On 04/29/2015 07:54 AM, Andrew Dalke wrote: I don't have a good solution. Were it me, I would have the writer fail should any unsupported value be present in the output, including those which are allowed by the SD specification but will cause problems in practice, like embedded \0 and

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
On 04/29/2015 05:32 PM, Andrew Dalke wrote: On Apr 29, 2015, at 9:19 PM, Dimitri Maziuk wrote: There is a difference between ACM members writing network protocols and domain people writing junk. I think that you are saying that the MDL connection table file formats are junk. I do not

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Jan Holst Jensen
Actually, you want to send your loving thoughts to MDL (now: Biovia). They defined the SDF format :-). Cheers -- Jan On 2015-04-29 13:26, Nicholas Firth wrote: Ahh ok… Interesting way to format a file! Got to love ChemAxon... Best, Nick *Nicholas C. Firth*| PhD Student | Cancer Therapeutics

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Greg Landrum
Here are my thoughts on this: The RDKit is usually strict while parsing molecules from SDF, SMILES, or other formats. This is done for one simple reason: it tends to be difficult/impossible to recover from syntax errors in input in a way that doesn't result in a significant chance of producing a