Re: [Rdkit-discuss] SVG BUG (Re: Fwd: 2D drawing with atoms labeled by index)

2016-10-26 Thread Greg Landrum
Since I've been busy at the UGM I haven't had the time to really pay
attention to this thread (or the longer one that spawned it). Apologies for
that.

First: I agree that the current situation is "a bit" ugly.

There is certainly a reason the RDKit generates SVG in the form that it
does, but I (not that surprisingly) don't remember what that reason was. At
the time I did this I was testing the output SVG using several different
display systems (multiple browsers, the ipython notebook, the preview tool
in ubuntu, the java-based renderer in KNIME, likely others) and what you
currently see seemed, at the time, to be the best compromise. Revisiting
this is no problem, but I'm going to balk if it breaks rendering with any
of the endpoints I care about.
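
A minimal sketch of the workaround discussed in the quoted messages below, assuming the SVG text has the form reported in this thread (tags prefixed with "svg:" and the namespace bound via "xmlns:svg="). The function name fix_svg_namespace and the sample string are illustrative and not part of the RDKit. The plain string replacement would also alter any "svg:" that happens to occur in text content, so treat it as a stopgap.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def fix_svg_namespace(svg_text):
    """Make SVG the default namespace and drop the 'svg:' tag prefixes."""
    # Bind the SVG namespace as the default instead of to the prefix 'svg'.
    fixed = svg_text.replace("xmlns:svg=", "xmlns=")
    # Remove the 'svg:' prefix from every tag so they use the default namespace.
    return fixed.replace("svg:", "")


# Hypothetical stand-in for the text returned by drawer.GetDrawingText().
sample = (
    "<svg:svg xmlns:svg='http://www.w3.org/2000/svg' width='10' height='10'>"
    "<svg:rect width='10' height='10'/>"
    "</svg:svg>"
)

fixed = fix_svg_namespace(sample)
root = ET.fromstring(fixed)  # raises ParseError if the result is not well-formed
print(fixed)
print(root.tag)
```

In the notebook code quoted below, the same function would be applied to the result of drawer.GetDrawingText() before passing it to display(SVG(...)).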


On Wed, Oct 26, 2016 at 3:07 AM, Peter S. Shenkin wrote:

> Indeed, when the file under discussion most recently named "svg2.html" is
> modified so that "xmlns:svg=" is replaced with "xmlns=", and the file is
> renamed "svg2.svg", double-clicking it opens it and correctly displays
> the image in the browser.
>
> But trying this in the Jupyter notebook fails. The original code had the
> lines:
>
> svg = drawer.GetDrawingText().replace('svg:','')
> display(SVG(svg))
>
> This succeeded. If I add Dimitri's latest suggestion:
>
> svg = drawer.GetDrawingText().replace('svg:','').replace('xmlns:svg=','xmlns=')
> display(SVG(svg))
>
> this also succeeds. If I only carry out the second replacement, this fails
> with an error several levels down.
>
> So apparently, SVG() can create an svg object out of the contents of a
> correctly formed svg file, but is insensitive to some constructs that make
> such a file invalid for direct use in a browser.
>
> I'm still not sure why GetDrawingText() doesn't return a properly
> formatted svg string. Is there some use its output can be put to without
> these .replacements?
>
> -P
>
> On Tue, Oct 25, 2016 at 1:35 PM, Dimitri Maziuk 
> wrote:
>
>> On 10/25/2016 11:21 AM, Peter S. Shenkin wrote:
>> > Hi, Hongbin,
>> >
>> > Thanks. Indeed. svg2.svg, when renamed to svg2.html, shows the correct
>> > image in Chrome. svg.html shows garbage.
>> >
>> > Still, it would be good to be able to create a real .svg file from
>> RDKit.
>>
>> OK, you made me look and I learned something today.
>>
>> Mozilla claims valid SVG must include the namespace declarations
>> (https://developer.mozilla.org/en-US/docs/Web/SVG/FAQ) citing this
>> document: https://jwatt.org/svg/authoring/#namespace-binding
>>
>> There it states
>> """
>> <svg xmlns="http://www.w3.org/2000/svg">
>> ...
>> Be careful not to type xmlns:svg instead of just xmlns when you bind the
>> SVG namespace. This is an easy mistake to make, but one that can break
>> everything. Instead of making SVG the default namespace, it binds it to
>> the namespace prefix 'svg', and this is almost certainly not what you
>> want to do in an SVG file. A standards compliant browser will then fail
>> to recognise any tags and attributes that don't have an explicit
>> namespace prefix (probably most if not all of them) and fail to render
>> your document as SVG.
>> """
>>
>> Sure enough, rdkit's files start with
>> """
>>   xmlns:svg='http://www.w3.org/2000/svg'
>> ...
>> """
>>
>> With that declaration any standards-compliant viewer should only
>> recognize tags with "svg:" prefix, and removing svg:'s results in a
>> technically invalid file. Anything that displays it as an image is what
>> we "it professionals" call b0rk3d.
>>
>> According to this, what RDKit writes out is wrong: you actually *want
>> to* remove :svg from the root tag's "xmlns" attribute, then you *may*
>> remove the svg: prefixes from all tags (including the root one).
>>
>> Of course, that was last edited in 2007, maybe something changed in the
>> 10 years since.
>>
>> HTH,
>> --
>> Dimitri Maziuk
>> Programmer/sysadmin
>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>
>>
>> 
>> --
>> The Command Line: Reinvented for Modern Developers
>> Did the resurgence of CLI tooling catch you by surprise?
>> Reconnect with the command line and become more productive.
>> Learn the new .NET and ASP.NET CLI. Get your free copy!
>> http://sdm.link/telerik
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>

Re: [Rdkit-discuss] reading multiple conformers from file

2016-10-26 Thread Greg Landrum
Hi Thomas,

You're right, reading multiple conformations out of an SDF does seem like
one of those common operations. Unfortunately the RDKit does not currently
support it in an easy way.

A Python implementation of this would be a good topic for Friday's UGM
hackathon; we can see if anyone finds it interesting enough to work on.

-greg
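
A rough sketch of the grouping step described in the quoted question below, written in plain Python so that it does not depend on any particular RDKit version. It assumes V2000 SD records and uses the key suggested in the question (title line, atom count, bond count) to decide which records belong together; that key is a heuristic, and the function names are illustrative only.

```python
from collections import defaultdict


def split_sdf_records(text):
    """Yield each SD record as a list of lines, without the $$$$ delimiter."""
    record = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if record:
                yield record
            record = []
        else:
            record.append(line)
    # A final record without a trailing $$$$ is still returned.
    if any(entry.strip() for entry in record):
        yield record


def parse_record(lines):
    """Return ((title, n_atoms, n_bonds), coords) for one V2000 record."""
    name = lines[0].strip()
    counts = lines[3]  # counts line: atom count in columns 1-3, bonds in 4-6
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    coords = []
    for line in lines[4:4 + n_atoms]:
        # x, y and z occupy three fixed-width fields of ten characters each.
        coords.append((float(line[0:10]), float(line[10:20]), float(line[20:30])))
    return (name, n_atoms, n_bonds), coords


def group_conformers(sdf_text):
    """Map each (title, n_atoms, n_bonds) key to its list of coordinate sets."""
    groups = defaultdict(list)
    for lines in split_sdf_records(sdf_text):
        key, coords = parse_record(lines)
        groups[key].append(coords)
    return dict(groups)
```

With the RDKit itself, the same grouping could be applied to the molecules returned by Chem.SDMolSupplier, attaching each further set of coordinates to the first molecule of a group with Mol.AddConformer. That only makes sense if the atom ordering is identical across the records of a group, which this sketch does not check.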


On Tue, Oct 25, 2016 at 2:16 AM, Thomas Evangelidis 
wrote:

> Hello everyone,
>
> I am a new user of RDKit and I was looking in the documentation for an
> easy way to load multiple conformers from a structure file like .sdf. The
> code must 1) distinguish between different protonation states of the same
> molecule,  2) create a new Mol() object for each protonation state and load
> into it the respective conformers.
>
> Apparently I can work out a solution for 1) using mol.GetProp('_Name'),
> mol.GetNumAtoms, mol.GetNumBonds and other properties, but I was
> wondering if there is any more straightforward way to do it.
> For 2) I guess I must iterate over all molecules in the input file, create
> new Mol() objects (one for each protonation state of each ligand) and add
> conformers to these new Mol() objects. Again this sounds easily
> programmable, but sounds like a very common operation, thus I was wondering
> if it has been implemented in a function.
>
> thanks in advance
> Thomas
>
>
> --
>
> ==
>
> Thomas Evangelidis
>
> Research Specialist
> CEITEC - Central European Institute of Technology
> Masaryk University
> Kamenice 5/A35/1S081,
> 62500 Brno, Czech Republic
>
> email: tev...@pharm.uoa.gr
>
>   teva...@gmail.com
>
>
> website: https://sites.google.com/site/thomasevangelidishomepage/


Re: [Rdkit-discuss] SVG BUG (Re: Fwd: 2D drawing with atoms labeled by index)

2016-10-26 Thread Peter S. Shenkin
Hey, by the way, my agenda is trying to understand all this. I'm ignorant
about the general area and have learned something. But don't worry -- not
enough to be dangerous. :-) If something comes out of the discussion that's
generally useful, great!

By the way, when you post your UGM Jupyter notebook on github, could you
post the URL to the list? As I mentioned at the Cambridge UGM, that talk
was the best introduction to RDKit that I've seen, and I think many will
find it useful.

-P.


Re: [Rdkit-discuss] reading multiple conformers from file

2016-10-26 Thread David Cosgrove
I've been wondering if, now that you can get decent conformations from
RDKit, it would be worth devising a multi-conformation file format to make
reading multi-conf molecules faster for virtual screening (VS) purposes. In
my experience, pulling all the conformers out of an ASCII file such as an
SDF can become the rate-determining step (RDS) for pharmacophore searching.
Something to think about at the hackathon maybe, and certainly something
that deserves a new email thread.

Dave


On Thursday, 27 October 2016, Greg Landrum wrote:

> Hi Thomas,
>
> You're right, reading multiple conformations out of an SDF does seem like
> one of those common operations. Unfortunately the RDKit does not currently
> support it in an easy way.
>
> A Python implementation of this would be a good topic for Friday's UGM
> hackathon; we can see if anyone finds it interesting enough to work on.
>
> -greg
>
>
> On Tue, Oct 25, 2016 at 2:16 AM, Thomas Evangelidis wrote:
>
>> Hello everyone,
>>
>> I am a new user of RDKit and I was looking in the documentation for an
>> easy way to load multiple conformers from a structure file like .sdf. The
>> code must 1) distinguish between different protonation states of the same
>> molecule,  2) create a new Mol() object for each protonation state and load
>> into it the respective conformers.
>>
>> Apparently I can work out a solution for 1) using mol.GetProp('_Name'),
>> mol.GetNumAtoms, mol.GetNumBonds and other properties, but I was
>> wondering if there is any more straightforward way to do it.
>> For 2) I guess I must iterate over all molecules in the input file,
>> create new Mol() objects (one for each protonation state of each ligand)
>> and add conformers to these new Mol() objects. Again this sounds easily
>> programmable, but sounds like a very common operation, thus I was wondering
>> if it has been implemented in a function.
>>
>> thanks in advance
>> Thomas


Re: [Rdkit-discuss] reading multiple conformers from file

2016-10-26 Thread Greg Landrum
The RDKit has support for the TPL format, an old BioCad/MSI/Accelrys format.
It's easy to imagine something better, but this is at least already there
and there could be other software that speaks it:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/FileParsers/test_data/cmpd2.tpl

I'd still like to do a decent JSON format, and adding multi-confs to that
would be logical.
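
To make the JSON idea concrete, here is a purely hypothetical sketch of what a multi-conformer record could look like. It is not an existing RDKit format: the keys ("name", "atoms", "bonds", "conformers") and both function names are assumptions made for illustration. The point it shows is that the topology is stored once and only the coordinates are repeated per conformer, so a reader would not need to re-parse atoms and bonds for every conformation.

```python
import json


def to_multiconf_json(name, atoms, bonds, conformers):
    """Serialize one molecule with several coordinate sets to a JSON string.

    atoms:      list of element symbols, e.g. ["C", "O"]
    bonds:      list of (begin_index, end_index, order) triples
    conformers: list of coordinate sets, one (x, y, z) triple per atom
    """
    for coords in conformers:
        if len(coords) != len(atoms):
            raise ValueError("each conformer needs one (x, y, z) triple per atom")
    record = {
        "name": name,
        "atoms": list(atoms),
        "bonds": [list(bond) for bond in bonds],
        # Only this part grows with the number of conformers.
        "conformers": [[list(xyz) for xyz in coords] for coords in conformers],
    }
    return json.dumps(record)


def from_multiconf_json(text):
    """Parse a record written by to_multiconf_json back into a dict."""
    return json.loads(text)
```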

On Thu, Oct 27, 2016 at 6:58 AM, David Cosgrove 
wrote:

> I've been wondering if, now that you can get decent conformations from
> RDKit, it would be worth devising a multi-conformation file format to make
> reading multi-conf molecules faster for virtual screening (VS) purposes. In
> my experience, pulling all the conformers out of an ASCII file such as an
> SDF can become the rate-determining step (RDS) for pharmacophore searching.
> Something to think about at the hackathon maybe, and certainly something
> that deserves a new email thread.
>
> Dave
>
>
> On Thursday, 27 October 2016, Greg Landrum wrote:
>
>> Hi Thomas,
>>
>> You're right, reading multiple conformations out of an SDF does seem like
>> one of those common operations. Unfortunately the RDKit does not currently
>> support it in an easy way.
>>
>> A Python implementation of this would be a good topic for Friday's UGM
>> hackathon; we can see if anyone finds it interesting enough to work on.
>>
>> -greg
>>
>>
>> On Tue, Oct 25, 2016 at 2:16 AM, Thomas Evangelidis 
>> wrote:
>>
>>> Hello everyone,
>>>
>>> I am a new user of RDKit and I was looking in the documentation for an
>>> easy way to load multiple conformers from a structure file like .sdf. The
>>> code must 1) distinguish between different protonation states of the same
>>> molecule,  2) create a new Mol() object for each protonation state and load
>>> into it the respective conformers.
>>>
>>> Apparently I can work out a solution for 1) using mol.GetProp('_Name'),
>>> mol.GetNumAtoms, mol.GetNumBonds and other properties, but I was
>>> wondering if there is any more straightforward way to do it.
>>> For 2) I guess I must iterate over all molecules in the input file,
>>> create new Mol() objects (one for each protonation state of each ligand)
>>> and add conformers to these new Mol() objects. Again this sounds easily
>>> programmable, but sounds like a very common operation, thus I was wondering
>>> if it has been implemented in a function.
>>>
>>> thanks in advance
>>> Thomas