[Rdkit-discuss] mol_from_ctab doesn't preserve coordinates

2020-05-06 Thread Sharang Phatak
Hi,

I am following the documentation for the RDKit PostgreSQL cartridge. I have a
table with valid molfiles, as confirmed by is_valid_ctab(). I am then trying to
insert them into a table 'mols' using mol_from_ctab(molfile::cstring,true).

However, the coordinates are not preserved. Is there something I am missing?

Thank you,
Sharang
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Usage of CombineFeatMaps

2020-05-06 Thread Greg Landrum
On Wed, May 6, 2020 at 6:15 PM Tim Dudgeon wrote:

> What's not clear is the usage of mergeMethod as this does not seem to be
> specifiable in CombineFeatMaps(), but you say that that method uses it when
> calling MergeFeatPoints(). Why isn't it in the signature of the former?
>

Because CombineFeatMaps() is just a convenience function, and in the end we only
ever used WeightedAverage.

-greg




> Tim
> On 06/05/2020 16:55, Greg Landrum wrote:
>
> We're deep in the underdocumented space now. Sorry about that.
>
> Here's an attempt to provide at least some sense of what these things
> mean. This is going from (old) memory and a quick skim of the code...
> hopefully I don't make any egregious errors.
>
> mergeMetric is the metric used to determine whether or not two feature
> points will be combined into a single point. The metrics use mergeTol as
> the threshold:
> - NoMerge means don't merge points at all
> - Distance means combine points that are less than mergeTol apart
> - Overlap means combine points that have an overlap of more than
> mergeTol (the feature map itself computes this with GetFeatFeatScore())
>
> dirMergeMode is not used (thankfully)
>
> mergeMethod is a parameter to MergeFeatPoints, which CombineFeatMaps
> calls.
> The default value is to use the weighted average of the features to
> determine the location and weight of the replacement feature.
> The other two possibilities are Average (does a non-weighted average) and
> UseLarger which just keeps the location and weight of the feature point
> with the larger weight.
>
> does that help?
>
> -greg
>
>
>
>
> On Wed, May 6, 2020 at 5:27 PM Tim Dudgeon  wrote:
>
>> I'm trying to use the FeatureMaps functionality in RDKit (described here
>> http://rdkit.blogspot.com/2017/11/using-feature-maps.html) and have a
>> question on the parameters for CombineFeatMaps.
>>
>> See here:
>>
>> http://rdkit.org/docs/source/rdkit.Chem.FeatMaps.FeatMapUtils.html?highlight=combinefeatmaps#rdkit.Chem.FeatMaps.FeatMapUtils.CombineFeatMaps
>>
>> In particular what is the meaning of the 'mergeMetric' param (Distance
>> vs. Overlap) and the 'dirMergeMode' param (NoMerge vs. Sum). And there
>> is a class for MergeMethod that seems relevant but that is not part of
>> the parameters for CombineFeatMaps.
>>
>> Thanks
>> Tim
>>
>>
>>


Re: [Rdkit-discuss] Usage of CombineFeatMaps

2020-05-06 Thread Tim Dudgeon

Hi Greg, yes that mostly makes sense and is a great help.

What's not clear is the usage of mergeMethod as this does not seem to be 
specifiable in CombineFeatMaps(), but you say that that method uses it 
when calling MergeFeatPoints(). Why isn't it in the signature of the former?


Tim

On 06/05/2020 16:55, Greg Landrum wrote:

> We're deep in the underdocumented space now. Sorry about that.
>
> Here's an attempt to provide at least some sense of what these things
> mean. This is going from (old) memory and a quick skim of the code...
> hopefully I don't make any egregious errors.
>
> mergeMetric is the metric used to determine whether or not two feature
> points will be combined into a single point. The metrics use mergeTol as
> the threshold:
> - NoMerge means don't merge points at all
> - Distance means combine points that are less than mergeTol apart
> - Overlap means combine points that have an overlap of more than
> mergeTol (the feature map itself computes this with GetFeatFeatScore())
>
> dirMergeMode is not used (thankfully)
>
> mergeMethod is a parameter to MergeFeatPoints, which CombineFeatMaps
> calls.
> The default value is to use the weighted average of the features to
> determine the location and weight of the replacement feature.
> The other two possibilities are Average (does a non-weighted average)
> and UseLarger, which just keeps the location and weight of the feature
> point with the larger weight.
>
> does that help?
>
> -greg
>
> On Wed, May 6, 2020 at 5:27 PM Tim Dudgeon wrote:
>
>> I'm trying to use the FeatureMaps functionality in RDKit (described here
>> http://rdkit.blogspot.com/2017/11/using-feature-maps.html) and have a
>> question on the parameters for CombineFeatMaps.
>>
>> See here:
>>
>> http://rdkit.org/docs/source/rdkit.Chem.FeatMaps.FeatMapUtils.html?highlight=combinefeatmaps#rdkit.Chem.FeatMaps.FeatMapUtils.CombineFeatMaps
>>
>> In particular what is the meaning of the 'mergeMetric' param (Distance
>> vs. Overlap) and the 'dirMergeMode' param (NoMerge vs. Sum). And there
>> is a class for MergeMethod that seems relevant but that is not part of
>> the parameters for CombineFeatMaps.
>>
>> Thanks
>> Tim


Re: [Rdkit-discuss] Usage of CombineFeatMaps

2020-05-06 Thread Greg Landrum
We're deep in the underdocumented space now. Sorry about that.

Here's an attempt to provide at least some sense of what these things mean.
This is going from (old) memory and a quick skim of the code... hopefully I
don't make any egregious errors.

mergeMetric is the metric used to determine whether or not two feature
points will be combined into a single point. The metrics use mergeTol as
the threshold:
- NoMerge means don't merge points at all
- Distance means combine points that are less than mergeTol apart
- Overlap means combine points that have an overlap of more than mergeTol (the
feature map itself computes this with GetFeatFeatScore())

dirMergeMode is not used (thankfully)

mergeMethod is a parameter to MergeFeatPoints, which CombineFeatMaps calls.
The default value is to use the weighted average of the features to
determine the location and weight of the replacement feature.
The other two possibilities are Average (does a non-weighted average) and
UseLarger which just keeps the location and weight of the feature point
with the larger weight.
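
The three merge methods described above can be sketched in plain Python. This is an illustration only, not the RDKit implementation: FeatPoint, should_merge and merge are hypothetical names, the 1.5 tolerance is an arbitrary default for the sketch, and taking the mean of the two weights as the new weight is an assumption.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class FeatPoint:
    """Hypothetical stand-in for a feature point: a 3D position and a weight."""
    pos: tuple
    weight: float


def should_merge(a, b, merge_tol=1.5):
    """Distance metric: merge when the points are less than merge_tol apart."""
    return dist(a.pos, b.pos) < merge_tol


def merge(a, b, method="WeightedAverage"):
    """Return the replacement point for a and b under the given merge method."""
    if method == "WeightedAverage":
        total = a.weight + b.weight
        pos = tuple((x * a.weight + y * b.weight) / total
                    for x, y in zip(a.pos, b.pos))
        # Assumption: the new weight is the mean of the two weights.
        return FeatPoint(pos, total / 2)
    if method == "Average":
        pos = tuple((x + y) / 2 for x, y in zip(a.pos, b.pos))
        return FeatPoint(pos, (a.weight + b.weight) / 2)
    if method == "UseLarger":
        # Keep the location and weight of the heavier point unchanged.
        return a if a.weight > b.weight else b
    raise ValueError(f"unknown merge method: {method}")


p1 = FeatPoint((0.0, 0.0, 0.0), 1.0)
p2 = FeatPoint((1.0, 0.0, 0.0), 3.0)
if should_merge(p1, p2):
    print(merge(p1, p2))               # position pulled toward the heavier point
    print(merge(p1, p2, "Average"))    # plain midpoint
    print(merge(p1, p2, "UseLarger"))  # p2 returned as-is
```

With these two points the weighted average lands closer to p2, while Average ignores the weights when placing the merged point.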

does that help?

-greg




On Wed, May 6, 2020 at 5:27 PM Tim Dudgeon wrote:

> I'm trying to use the FeatureMaps functionality in RDKit (described here
> http://rdkit.blogspot.com/2017/11/using-feature-maps.html) and have a
> question on the parameters for CombineFeatMaps.
>
> See here:
>
> http://rdkit.org/docs/source/rdkit.Chem.FeatMaps.FeatMapUtils.html?highlight=combinefeatmaps#rdkit.Chem.FeatMaps.FeatMapUtils.CombineFeatMaps
>
> In particular what is the meaning of the 'mergeMetric' param (Distance
> vs. Overlap) and the 'dirMergeMode' param (NoMerge vs. Sum). And there
> is a class for MergeMethod that seems relevant but that is not part of
> the parameters for CombineFeatMaps.
>
> Thanks
> Tim
>
>
>


[Rdkit-discuss] Usage of CombineFeatMaps

2020-05-06 Thread Tim Dudgeon
I'm trying to use the FeatureMaps functionality in RDKit (described here 
http://rdkit.blogspot.com/2017/11/using-feature-maps.html) and have a 
question on the parameters for CombineFeatMaps.


See here: 
http://rdkit.org/docs/source/rdkit.Chem.FeatMaps.FeatMapUtils.html?highlight=combinefeatmaps#rdkit.Chem.FeatMaps.FeatMapUtils.CombineFeatMaps


In particular what is the meaning of the 'mergeMetric' param (Distance 
vs. Overlap) and the 'dirMergeMode' param (NoMerge vs. Sum). And there 
is a class for MergeMethod that seems relevant but that is not part of 
the parameters for CombineFeatMaps.


Thanks
Tim





[Rdkit-discuss] Unexpected hybridization state of oxygens

2020-05-06 Thread Jean-Marc Nuzillard

Dear all,

from:

from rdkit import Chem

m = Chem.MolFromSmiles("C(=O)OC")
for x in m.GetAtoms():
    if x.GetSymbol() == 'O':  # oxygens only
        print(repr(x.GetHybridization()))


I obtained:

rdkit.Chem.rdchem.HybridizationType.SP2
rdkit.Chem.rdchem.HybridizationType.SP2

which is confusing because in methyl formate, one of the two oxygens is sp2
and the other one is sp3.
The carbons are OK.

Is there a better way than repr(x.GetHybridization()) to access the value
of the hybridization state?

Best,

Jean-Marc

--
Jean-Marc Nuzillard
Directeur de Recherches au CNRS

Institut de Chimie Moléculaire de Reims
CNRS UMR 7312
Moulin de la Housse
CPCBAI, Bâtiment 18
BP 1039
51687 REIMS Cedex 2
France

Tel : 03 26 91 82 10
Fax : 03 26 91 31 66
http://www.univ-reims.fr/icmr
http://eos.univ-reims.fr/LSD/CSNteam.html

http://www.univ-reims.fr/LSD/
http://www.univ-reims.fr/LSD/JmnSoft/





Re: [Rdkit-discuss] write sdf properties without position number

2020-05-06 Thread Nicolas Bosc
It’s for an old internal tool that we are updating. It apparently has some
strict parsing rules at the moment.

> On 6 May 2020, at 10:37, Greg Landrum wrote:
> 
> Out of curiosity, which software are you using that can't read those index
> values?
> 
> On Wed, May 6, 2020 at 11:25 AM Nicolas Bosc wrote:
> Hi Paolo,
> 
> Thank you. You spoonfed me!
> 
> Nicolas
> 
>> On 5 May 2020, at 17:25, Paolo Tosco wrote:
>> 
>> Hi Nicolas,
>> 
>> quick and dirty solution: strip it with a regex, e.g.
>> 
>> sed 's|^\(>  <.*>\) *([0-9]*)|\1|'
>> 
>> HTH,
>> p.
>> 
>> On 05/05/2020 16:35, Nicolas Bosc wrote:
>>> Hi RDKit users,
>>> 
>>> Writing molecules to an SDF with properties automatically adds a number after
>>> the property name, which is the position of the associated molecule in the
>>> file:
>>> >(1) 
>>> CHEMBL123
>>> 
>>> How can I change this so there is no number? The program that I am using to 
>>> read the sdf file fails because of this...
>>> 
>>> Thanks,
>>> 
>>> Nicolas
>>> 
>>> 
>>> 


Re: [Rdkit-discuss] write sdf properties without position number

2020-05-06 Thread Greg Landrum
Out of curiosity, which software are you using that can't read those index
values?

On Wed, May 6, 2020 at 11:25 AM Nicolas Bosc wrote:

> Hi Paolo,
>
> Thank you. You spoonfed me!
>
> Nicolas
>
> On 5 May 2020, at 17:25, Paolo Tosco wrote:
>
> Hi Nicolas,
>
> quick and dirty solution: strip it with a regex, e.g.
>
> sed 's|^\(>  <.*>\) *([0-9]*)|\1|'
>
> HTH,
> p.
> On 05/05/2020 16:35, Nicolas Bosc wrote:
>
> Hi RDKit users,
>
> Writing molecules to an SDF with properties automatically adds a number
> after the property name, which is the position of the associated molecule in
> the file:
> >   * (1) *
> CHEMBL123
>
> How can I change this so there is no number? The program that I am using
> to read the sdf file fails because of this...
>
> Thanks,
>
> Nicolas
>
>


Re: [Rdkit-discuss] write sdf properties without position number

2020-05-06 Thread Nicolas Bosc
Hi Paolo,

Thank you. You spoonfed me!

Nicolas

> On 5 May 2020, at 17:25, Paolo Tosco wrote:
> 
> Hi Nicolas,
> 
> quick and dirty solution: strip it with a regex, e.g.
> 
> sed 's|^\(>  <.*>\) *([0-9]*)|\1|'
> 
> HTH,
> p.
> 
> On 05/05/2020 16:35, Nicolas Bosc wrote:
>> Hi RDKit users,
>> 
>> Writing molecules to an SDF with properties automatically adds a number after
>> the property name, which is the position of the associated molecule in the
>> file:
>> >(1) 
>> CHEMBL123
>> 
>> How can I change this so there is no number? The program that I am using to 
>> read the sdf file fails because of this...
>> 
>> Thanks,
>> 
>> Nicolas
>> 
>> 
>> 
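
The sed one-liner quoted above can also be written with Python's re module, which may be handier when the SDF text is already in memory. This is a minimal sketch of the same regex, not an RDKit feature; strip_record_numbers is a hypothetical helper name and "chembl_id" is a made-up property name used only for the example.

```python
import re

# Matches an SD data header such as ">  <name>  (1)" and captures the part
# before the parenthesised record number, mirroring the sed expression.
HEADER_INDEX = re.compile(r"^(>  <.*>) *\([0-9]*\)", re.MULTILINE)


def strip_record_numbers(sdf_text):
    """Remove the "(n)" record number from every property header line."""
    return HEADER_INDEX.sub(r"\1", sdf_text)


# "chembl_id" is a hypothetical property name used only for this example.
sample = ">  <chembl_id>  (1)\nCHEMBL123\n\n$$$$\n"
print(strip_record_numbers(sample))
```

Like the sed version, it only touches lines that start with ">  <", so property values and the mol block are left alone.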


Re: [Rdkit-discuss] GetSubstructMatches and unique match

2020-05-06 Thread Quoc-Tuan DO

Dear Paolo,

Thank you again for your code, and sorry for bothering you again. It works
fine for monoterpenes, but not for diterpenes, sesquiterpenes, or
triterpenes.


pattern: C~C~C(~C)~C

mol1: CC(=O)O[C@H]1CC[C@]2([C@H](C1(C)C)CC=C([C@@H]2CC/C(=C/C(=O)O)/C)C)C

=> ((17, 18, 19, 20, 23), (16, 24, 13, 14, 15), (8, 9, 4, 12, 7))

It should find 4 distinct units.

mol2: OCC12CCC(C2C2C(CC1)(C)C1(C)CCC3C(C1CC2)(C)CCC(C3(C)C)O)C(=C)C

=> ((16, 25, 27, 17, 15), (18, 19, 12, 13, 14), (1, 2, 5, 6, 7))

It should find 6 distinct units.

I tried a SMARTS version of the pattern, [#6]~[#6]~[#6](~[#6])~[#6],
but got the same results as with SMILES.


What do you think? Is there something missing in the query?
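
One possible cause, offered as an assumption since it depends on how the matches are filtered: if non-overlapping matches are picked greedily in the order they are returned, an early pick can block two later ones, so the result is smaller than the largest possible set. The sketch below compares a greedy pass with an exhaustive search on toy atom-index tuples. Both function names are hypothetical, and the exhaustive search grows exponentially with the number of matches, so it only suits small inputs.

```python
def greedy_disjoint(matches):
    """Keep each match, in order, unless it shares an atom with one already kept."""
    used, kept = set(), []
    for match in matches:
        if used.isdisjoint(match):
            kept.append(match)
            used.update(match)
    return kept


def max_disjoint(matches):
    """Exhaustively search for a largest set of mutually disjoint matches."""
    best = []

    def search(start, used, chosen):
        nonlocal best
        if len(chosen) > len(best):
            best = list(chosen)
        for i in range(start, len(matches)):
            if used.isdisjoint(matches[i]):
                chosen.append(matches[i])
                search(i + 1, used | set(matches[i]), chosen)
                chosen.pop()

    search(0, frozenset(), [])
    return best


# Toy atom-index tuples, not taken from the molecules above.
toy = [(1, 2), (0, 1), (2, 3)]
print(greedy_disjoint(toy))  # [(1, 2)]
print(max_disjoint(toy))     # [(0, 1), (2, 3)]
```

Here the greedy pass keeps one match because (1, 2) overlaps both of the others, while the exhaustive search finds the two that can coexist.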

Thanks for your time,

Best regards,

QT



On 05/05/2020 at 14:52, Paolo Tosco wrote:

> Dear Quoc-Tuan,
>
> this should do what you need:
>
> https://gist.github.com/ptosco/dc4d27153e6e8e45aed654761e4d7409
>
> Cheers,
> p.




