Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Nicholas Firth
Ahh ok… Interesting way to format a file! Got to love ChemAxon... Best, Nick Nicholas C. Firth | PhD Student | Cancer Therapeutics The Institute of Cancer Research | 15 Cotswold Road | Belmont | Sutton | Surrey | SM2 5NG T 020 8722 4033 | E

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Paolo Tosco
Dear all, Indeed, as Riccardo mentions, according to the specifications in CTfile.pdf a property should be truncated after the first blank line. This is also what other SDF parsers I have tried actually do. What I noticed is that other SDF parsers are tolerant of spurious lines not starting
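
To illustrate the truncation rule Paolo describes, here is a minimal sketch of a data-item reader in which the first blank line ends the value. The function name `read_data_items` is hypothetical, and skipping stray lines outside a data item is an assumption made for illustration; it is not a description of RDKit or of any other particular parser.

```python
def read_data_items(lines):
    """Collect SD data items from the lines that follow 'M  END'.

    Sketch only: a value ends at the first blank line, so any text
    after that blank line is not part of the value.
    """
    items = {}
    name = None  # name of the data item currently being read, if any
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("$$$$"):
            break  # end of the record
        if name is None:
            # A data header starts with '>' and holds the field name in <...>.
            start = line.find("<")
            end = line.find(">", start + 1)
            if line.startswith(">") and start != -1 and end != -1:
                name = line[start + 1:end]
                items[name] = []
            # Assumption: any other line outside a data item is skipped.
        elif line.strip() == "":
            name = None  # the first blank line terminates the value
        else:
            items[name].append(line)
    return {key: "\n".join(value) for key, value in items.items()}


record_tail = [
    "> <DESC>",
    "first line",
    "",
    "second line",  # follows a blank line, so it is not part of DESC
    "",
    "> <ID>",
    "42",
    "",
    "$$$$",
]
print(read_data_items(record_tail))
```

With this input the value of DESC is cut off after "first line", which is the behaviour Tuomo's long descriptions run into.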

[Rdkit-discuss] SDF tags and -

2015-04-29 Thread Tuomo Kalliokoski
Hello all, I have got a bunch of SDF files with molecules and some long descriptions in their SDF tags that include stuff like - inside. These files have been produced by ChemAxon's software and are handled fine by their software. Such files can also be written out from RDKit 2014_09_02, but

Re: [Rdkit-discuss] Java wrappers

2015-04-29 Thread Gianluca Sforna
On Tue, Apr 28, 2015 at 5:59 AM, Greg Landrum greg.land...@gmail.com wrote: There's certainly no reason why the java wrappers shouldn't work if they are dynamically linked; I just tried it on my ubuntu box and they work fine. I don't normally do it because it would make distributing the

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Andrew Dalke
Riccardo Vianello: I suppose that if the correctness of the parser is confirmed, then a change could be suggested for the writer, consisting in raising an error if blank lines are present inside the data item. Yes, the SD tag data is not a general purpose data field. It's not possible,

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Tuomo Kalliokoski
Hello Riccardo, That sounds like a very reasonable solution to the issue. [I replied to rdkit-discuss to bring this thread back onto the list] Best regards, Tuomo From: riccardo.viane...@gmail.com Date: Wed, 29 Apr 2015 12:08:48 +0200 Subject: Re: [Rdkit-discuss] SDF tags and - To:

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
On 04/29/2015 01:47 PM, Andrew Dalke wrote: Postel's Robustness principle is a mistake. See RFC 3117 for elaboration, ... Or from http://cacm.acm.org/magazines/2011/8/114933-the-robustness-principle-reconsidered/fulltext : There is a difference between ACM members writing network

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
On 04/29/2015 07:54 AM, Andrew Dalke wrote: I don't have a good solution. Were it me, I would have the writer fail should any unsupported value be present in the output, including those which are allowed by the SD specification but will cause problems in practice, like embedded \0 and
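
The writer-side check discussed in this thread (fail rather than emit a value that will not round-trip) could look roughly like the sketch below. It is not RDKit code: `check_sd_value` is a hypothetical helper, and the two conditions it rejects are only the ones named in the thread, namely blank lines inside a data item and an embedded \0.

```python
def check_sd_value(name, value):
    """Return value unchanged, or raise ValueError if it is unsafe to write.

    Sketch only: rejects embedded NUL characters and blank lines, since a
    blank line would end the data item early when the file is read back.
    """
    if "\0" in value:
        raise ValueError(f"data item {name!r} contains an embedded NUL")
    for number, line in enumerate(value.splitlines(), start=1):
        if line.strip() == "":
            raise ValueError(
                f"data item {name!r} has a blank line at line {number}"
            )
    return value


# A value spanning several non-blank lines is accepted.
print(check_sd_value("DESC", "line one\nline two"))

# A value with a blank line in the middle is rejected.
try:
    check_sd_value("DESC", "line one\n\nline two")
except ValueError as err:
    print("rejected:", err)
```

Raising at write time puts the failure next to the code that produced the bad value, instead of leaving a silently truncated property for whoever reads the file later.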

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
On 04/29/2015 05:32 PM, Andrew Dalke wrote: On Apr 29, 2015, at 9:19 PM, Dimitri Maziuk wrote: There is a difference between ACM members writing network protocols and domain people writing junk. I think that you are saying that the MDL connection table file formats are junk. I do not

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Jan Holst Jensen
Actually, you want to send your loving thoughts to MDL (now: Biovia). They defined the SDF format :-). Cheers -- Jan On 2015-04-29 13:26, Nicholas Firth wrote: Ahh ok… Interesting way to format a file! Got to love ChemAxon... Best, Nick *Nicholas C. Firth*| PhD Student | Cancer Therapeutics

Re: [Rdkit-discuss] Java wrappers

2015-04-29 Thread Greg Landrum
On Wed, Apr 29, 2015 at 9:48 AM, Gianluca Sforna gia...@gmail.com wrote: On Tue, Apr 28, 2015 at 5:59 AM, Greg Landrum greg.land...@gmail.com wrote: There's certainly no reason why the java wrappers shouldn't work if they are dynamically linked; I just tried it on my ubuntu box and they

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Greg Landrum
Here are my thoughts on this: The RDKit is usually strict while parsing molecules from SDF, SMILES, or other formats. This is done for one simple reason: it tends to be difficult/impossible to recover from syntax errors in input in a way that doesn't result in a significant chance of producing a