[Rdkit-discuss] mmpdb 3.0b1

2022-02-09 Thread Andrew Dalke
Hi all,

  The combination of crowd-funding and contract work for me, and methods + 
software development by Mahendra Awale, has resulted in a new version of mmpdb.

More specifically, version 3.0 beta 1 is available on GitHub at:

  https://github.com/adalke/mmpdb/tree/v3-dev

The CHANGELOG summary is at the bottom of the email. For many people the 
biggest improvement is probably the support for large data sets, and the switch 
to a more human-understandable SMARTS/"pseudo-SMILES" for the environment 
fingerprints.

The documentation is available through the program, starting with 'mmpdb 
--help'.

Try it out, kick the tires, and let me know what fell off!

Cheers,

Andrew
da...@dalkescientific.com


A large number of changes to merge three different development tracks
and add new features.

The "fragments" file format has been replaced with a SQLite-based
"fragdb" file format. This makes it much easier to develop tools to
work on fragment data sets instead of processing a JSON-Lines file.
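
Since a fragdb file is an SQLite database, generic SQLite tooling should be able to inspect it. Below is a minimal sketch that relies only on that fact: it lists the tables in a file without assuming anything about mmpdb's schema. The helper name `list_tables` is hypothetical and not part of mmpdb.

```python
import sqlite3
from pathlib import Path


def list_tables(path):
    """Return the table names found in the SQLite file at `path`.

    Hypothetical helper for illustration; no mmpdb table names are assumed.
    """
    # Open read-only so that inspecting a data set cannot modify it.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `list_tables()` on a fragdb file returns whatever tables it actually contains, which is a reasonable starting point before writing queries against it.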

New functionality to create an MMP data set in a distributed compute
environment. Some of the features are:

- split a SMILES file into a set of smaller SMILES files
- the default "fragment" file output is now based on the input name
- fragment files can be re-partitioned by constant fragments:
  - the "fragdb_constants" command generates fragment information
  - the "fragdb_partition" command creates re-partitioned fragdb files
- the default "index" file output is now based on the input name
- there are tools to merge fragdb and mmpdb files into one

As a result, mmpdb can now handle significantly larger data sets.
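
To illustrate the first step in the list above (splitting a SMILES file into smaller files that can be fragmented independently), here is a rough sketch in plain Python. It is not mmpdb's implementation: the function name and the numbered output naming scheme are assumptions made for this example, and it treats each line as one record.

```python
from itertools import islice
from pathlib import Path


def split_smiles_file(path, records_per_file):
    """Split a SMILES file into numbered pieces and return the new paths.

    Hypothetical helper; the "<stem>.<NNNN><suffix>" naming is an assumption.
    """
    if records_per_file < 1:
        raise ValueError("records_per_file must be at least 1")
    src = Path(path)
    outputs = []
    with src.open() as infile:
        index = 0
        while True:
            # Read the next block of lines; an empty block means end of file.
            chunk = list(islice(infile, records_per_file))
            if not chunk:
                break
            out = src.with_name(f"{src.stem}.{index:04d}{src.suffix}")
            out.write_text("".join(chunk))
            outputs.append(out)
            index += 1
    return outputs
```

Each output file keeps the input's base name, in the spirit of the changelog's "output is now based on the input name", so the pieces stay easy to match up after processing.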

Added support for Postgres for direct index database creation. (The
new distributed compute tools require SQLite.)

Added a new "generate" command to apply 1-cut transforms to a
structure, using MMP rules as a playbook.

Replaced the SHA256-based Morgan fingerprint signature with a
canonical SMARTS representing the Morgan fingerprint environment. This
is difficult to understand or depict, so it also includes a "pseudo"
SMILES that can be parsed by RDKit (if sanitize is disabled) and
drawn. The new environment fingerprint also includes the SMARTS of its
parent, that is, the SMARTS with a smaller radius.

Switched to 'click' for command-line parsing, removed the vendored
version of the peewee ORM, and switched to a modern "pyproject.toml"
project configuration with a setup.cfg which declares its dependencies.

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Font size when drawing molecules

2022-02-09 Thread David Cosgrove
To help further, I'm just implementing an option
drawOptions().fixedFontSize to allow you to insist on a font size, in
pixels.  I will remember to expose it to Python!
Layout is a different department, I'm afraid I can't help there.  It would
probably be better to start a new thread with that, so as to catch the
attention of the right people.


-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk


Re: [Rdkit-discuss] Font size when drawing molecules

2022-02-09 Thread Tim Dudgeon
Thanks Dave. Understood.
A related question - is it possible to make the layout aware of the amount
of space that is available? I'm stuck with a very wide and short aspect
ratio and it would be helpful if the layout engine could optimise the
layout to fit in that unconventional space.
Tim



Re: [Rdkit-discuss] Problem with depicting reaction SMARTS

2022-02-09 Thread Mark Mackey via Rdkit-discuss
Thanks Paolo – we’ll give that a try.

Regards,
Mark

Dr Mark Mackey
Chief Scientific Officer
Cresset
New Cambridge House, Bassingbourn Road, Litlington, Cambridgeshire, SG8 0SS, UK
Tel: +44 (0)1223 858890   Mobile: +44 (0)7595099165
Email: m...@cresset-group.com   Web: www.cresset-group.com



From: Paolo Tosco 
Sent: 08 February 2022 21:48
To: Mark Mackey 
Cc: rdkit-discuss@lists.sourceforge.net
Subject: Re: [Rdkit-discuss] Problem with depicting reaction SMARTS

Hi Mark,

I believe the bug is caused by the fact that isAtomListQuery() returns true for 
a query that is actually a complex query, and that subsequently 
getAtomListQueryVals() (called by getAtomListText()) fails to parse.
The following patch seems to solve the problem:

$ git diff
diff --git a/Code/GraphMol/QueryOps.cpp b/Code/GraphMol/QueryOps.cpp
index a80d8f5..ea28ad3 100644
--- a/Code/GraphMol/QueryOps.cpp
+++ b/Code/GraphMol/QueryOps.cpp
@@ -812,8 +812,9 @@ bool _atomListQueryHelper(const T query) {
 return false;
   }
 }
+return true;
   }
-  return true;
+  return false;
 }
 }  // namespace
 bool isAtomListQuery(const Atom *a) {

Cheers,
p.

On Tue, Feb 8, 2022 at 8:29 PM Mark Mackey via Rdkit-discuss
<rdkit-discuss@lists.sourceforge.net> wrote:
Hi all,

I’m trying to generate an SVG from reaction SMARTS. The code looks like this:

QByteArray ChemicalReactionsCalculation::toSvg(const QString &reaction, int width, int height)
{
    try
    {
        std::string text = reaction.toStdString();

        // Parse the reaction SMARTS; the unique_ptr takes ownership of the result.
        std::unique_ptr<RDKit::ChemicalReaction> rxn(RDKit::RxnSmartsToChemicalReaction(text));
        if (rxn)
        {
            // Render the reaction into an SVG canvas of the requested size.
            RDKit::MolDraw2DSVG drawer(width, height);
            drawer.drawReaction(*rxn);
            drawer.finishDrawing();

            auto svg = drawer.getDrawingText();
            return QByteArray::fromStdString(svg);
        }
    }
    catch (const std::exception &e)
    {
        qWarning() << "Exception in ChemicalReactionsCalculation::toSvg on" << reaction << ":" << e.what();
    }
    // Return an empty array if parsing or drawing failed.
    return QByteArray();
}

This works fine for most reaction SMARTS strings, but throws an exception at 
“drawer.drawReaction” when the SMARTS string has a combination of Boolean 
operators in it. For example, “[n:1]>>[o:1]” works fine, as does 
“[c,n:1]>>[O:1]”, but doing “[c,n:1]>>[o:1]” fails (as does the equivalent 
“[c,nH1:1]>>[o:1]”):

Exception in ChemicalReactionsCalculation::toSvg on "[c,n:1]>>[o:1]" : bad 
query type1

This happens for pretty much any combination of “,” and “&” operators inside 
the square brackets. Any ideas?

Regards,
Mark



Re: [Rdkit-discuss] Font size when drawing molecules

2022-02-09 Thread David Cosgrove
Hi Tim,
Sorry, the font size setting both within the code and in the public
interface has been a fraught matter since Freetype was introduced for the
font drawing and it isn't currently as controllable as one might wish.  The
font is chosen based on baseFontSize and the drawing scale.  The size of
the font relative to the bond lengths is therefore fixed, unless it hits
the minFontSize or maxFontSize.  So for a large molecule in a small canvas,
it is likely that the font size will be larger relative to the bonds as
minFontSize has an effect, and vice versa with a small molecule in a large
canvas.  To achieve what you want, you need to increase baseFontSize,
which, as Paolo mentioned, isn't currently exposed to Python.  Apologies
for that, which was an oversight.  It does work with the current release,
though, so if you don't mind rebuilding RDKit you can use it now.
Add
```
  .def_readwrite(
      "baseFontSize", &RDKit::MolDrawOptions::baseFontSize,
      "relative size of font.  Defaults to 0.6.  -1 means use default.")
```
to $RDBASE/Code/GraphMol/MolDraw2D/Wrap/rdMolDraw2D.cpp immediately after
the analogous minFontSize entry
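
As a rough illustration of the rule described above (font size derived from baseFontSize and the drawing scale, then limited by minFontSize and maxFontSize), here is a toy Python model. It is only a sketch and not RDKit's actual code: the function name, the simple product form, and the use of `None` for "no limit" are all assumptions made for the example.

```python
def effective_font_size(base_font_size, scale, min_font_size=None, max_font_size=None):
    """Toy model of the font size rule, not RDKit's implementation.

    Assumes the size is base_font_size * scale, clamped to the given limits.
    """
    # The size follows the drawing scale, so it stays fixed relative to the bonds.
    size = base_font_size * scale
    # The limits only matter when the scaled size falls outside them.
    if max_font_size is not None:
        size = min(size, max_font_size)
    if min_font_size is not None:
        size = max(size, min_font_size)
    return size


# Limits of 10 and 14 with a base of 0.6, at three hypothetical drawing scales.
for scale in (10, 20, 40):
    print(scale, effective_font_size(0.6, scale, 10, 14))
```

In this model a large molecule (small scale) is pinned at the minimum and a small molecule (large scale) at the maximum. Raising the base value only changes the result for scales in between, which is why the min and max settings alone cannot make the font "a bit bigger than the default".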
HTH,
Dave



-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk


Re: [Rdkit-discuss] Font size when drawing molecules

2022-02-09 Thread Tim Dudgeon
OK, thanks. That's great to hear.
In the meantime could someone explain how the font is currently chosen?
e.g. if I specify 10 as min and 14 as max what is actually used?
Tim



Re: [Rdkit-discuss] Font size when drawing molecules

2022-02-09 Thread Paolo Tosco
Hi Tim,

Dave Cosgrove is currently working on a PR which, among other things,
addresses exactly the need that you describe through the baseFontSize
parameter, which is currently not exposed to Python. The PR is almost ready
for merging and it should become part of the March release.

Cheers,
p.



[Rdkit-discuss] Font size when drawing molecules

2022-02-09 Thread Tim Dudgeon
I'm confused over how the font is chosen when drawing molecules.
There are MolDrawOptions.minFontSize and MolDrawOptions.maxFontSize
properties, and if I set them to the same value then that sized font is
used. But if I set max to a larger size than min then it's not clear what
font size will be used.
I'm wanting the font size to adapt to the amount the molecule is scaled to
fit the space (larger molecules needing a smaller font) but I want the font
size that is used to be a bit bigger than the default that would be used if
I don't set anything.
How do I go about this?
Thanks
Tim