Re: [Rdkit-discuss] Saving mol file

2018-09-26 Thread GALLY Jose Manuel
Dear Colin,
this is a specific problem I stumbled upon some time ago.[1]

I also mentioned it to the rDock mailing list.[2]

Maybe there is a better work-around, but in the meantime I wrote the
attached function.

It takes a Mol Block as input; in my case the Mol Blocks are stored in a dataframe.

Hope that helps!

Cheers,
Jose Manuel

Refs:
[1] https://sourceforge.net/p/rdkit/mailman/message/34740124/
[2] https://sourceforge.net/p/rdock/mailman/message/34741112/

2018-09-25 17:27 GMT+02:00 Colin Bournez :

> Well yes, I do have this line; I left out the rest of the file for
> clarity. The thing is, tools such as MOE and PyMOL read it without a
> problem, but rDock, for example, can't read it properly and returns a
> neutral N, which is not the case. And if I open it with PyMOL and save it
> back in mol format, the 3 appears on the N line and rDock has no trouble
> anymore...
> I was just wondering if there was a trick in RDKit to also save it this
> way.
>
>
> On 25/09/18 17:18, Greg Landrum wrote:
>
> Hi Colin,
> RDKit outputs charge information to mol blocks using the CHG line:
>
> In [3]: m = Chem.MolFromSmiles('C[NH3+]')
>
> In [4]: print(Chem.MolToMolBlock(m))
>
>      RDKit          2D
>
>   2  1  0  0  0  0  0  0  0  0999 V2000
>     0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
>     1.2990    0.7500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
> M  CHG  1   2   1
> M  END
>
>
> I expect that you will find one of those in your mol file and that it
> should be properly read in by other tools.
> Is this not the case for you?
>
> Best,
> -greg
>
>
>
> On Tue, Sep 25, 2018 at 4:39 PM Colin Bournez <
> colin.bour...@univ-orleans.fr> wrote:
>
>> Hey everyone,
>>
>> I have a question concerning the Chem.MolToMolFile() function.
>> When I open a file containing an N+ (here is the corresponding line in
>> the mol file):
>>
>>    11.3700    3.4360  -11.8300 N   0  3  0  0  0  0  0  0  0  0  0  0
>>
>> and just save it back without any modification, the line becomes:
>>
>>    11.3700    3.4360  -11.8300 N   0  0  0  0  0  0  0  0  0  0  0  0
>>
>> The problem is that for some software this mol file causes trouble and
>> the N+ is transformed to N with 4 bonds.
>> I tried several tricks but I was not able to save it as the original
>> line. Does anyone have a suggestion?
>>
>> Thanks,
>>
>> --
>> *Colin Bournez*
>> PhD Student, Structural Bioinformatics & Chemoinformatics
>> Institut de Chimie Organique et Analytique (ICOA), UMR CNRS-Université
>> d'Orléans 7311
>> Rue de Chartres, 45067 Orléans, France
>> T. +33 238 494 577
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
>


-- 
José-Manuel Gally
PhD Student
Structural Bioinformatics & Chemoinformatics
Institut de Chimie Organique et Analytique (ICOA)
UMR CNRS-Université d'Orléans 7311
Université d'Orléans
Rue de Chartres
F-45067 Orléans
phone: +33 238 494 577

def UpdateChargeFlagInAtomBlock(mb):
    """
    Take a Mol Block and update the charge flags in its atom block
    with the charges listed on the "M  CHG" line.
    """
    f = "{:>10s}"*3 + "{:>2}{:>4s}" + "{:>3s}"*11
    chgs = []  # list of (atom index, charge) tuples
    lines = mb.split("\n")
    if lines[0] == "":  # mb starts with a newline: drop the empty first line
        del lines[0]
    CTAB = lines[2]
    atomCount = int(CTAB.split()[0])
    # parse mb line by line
    for l in lines:
        # look for the M  CHG property
        if l[0:6] == "M  CHG":
            # "M  CHG X" is not needed for parsing, the info we want comes afterwards
            records = l.split()[3:]
            # record each charge into a list
            for i in range(0, len(records), 2):
                idx = records[i]
                chg = records[i+1]
                chgs.append((int(idx), int(chg)))
            break  # stop iterating

    # sort by idx in order to parse the molblock only once more
    chgs = sorted(chgs, key=lambda x: x[0])

    # now that we have a list for the current molblock, assign each charge
    for chg in chgs:
        i = 3
        while i < 3+atomCount:  # do not read from beginning each time, rather continue parsing mb!
            # when finding the idx of the atom we want to update, extract all fields and rewrite whole sequence
            if i-2 == chg[0]:  # -4 to take into account the CTAB headers, +1 
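
The attached function is cut off at this point in the archive. What follows is not the original attachment but a minimal, self-contained sketch of the same idea, assuming a V2000 Mol Block that still has its three header lines (so the counts line is the fourth line). It copies every charge found on the `M  CHG` lines into the charge field of the matching atom line (columns 37-39), using the V2000 charge codes (for example, 3 for +1). The names `charges_to_atom_block` and `CHARGE_TO_CODE` are hypothetical and exist only for this illustration.

```python
# Hypothetical helper, not the truncated attachment above.
# V2000 atom-block charge codes: formal charge -> code (4 is a doublet radical).
CHARGE_TO_CODE = {0: 0, 3: 1, 2: 2, 1: 3, -1: 5, -2: 6, -3: 7}


def charges_to_atom_block(mol_block):
    """Copy "M  CHG" charges into the atom-block charge field of a V2000 Mol Block."""
    lines = mol_block.split("\n")
    # Assumption: three header lines, then the counts line at index 3.
    n_atoms = int(lines[3][0:3])  # atom count is a fixed-width field, columns 1-3

    # Collect {atom index: charge} from every "M  CHG" line.
    charges = {}
    for line in lines:
        if line.startswith("M  CHG"):
            fields = line.split()[3:]  # skip "M", "CHG" and the entry count
            for idx, chg in zip(fields[0::2], fields[1::2]):
                charges[int(idx)] = int(chg)

    for idx, chg in charges.items():
        if not 1 <= idx <= n_atoms or chg not in CHARGE_TO_CODE:
            continue  # leave anything unexpected untouched
        pos = 3 + idx  # atom 1 sits on line index 4
        atom_line = lines[pos].ljust(39)
        # The charge field occupies string indices 36..38 (columns 37-39).
        lines[pos] = atom_line[:36] + "{:>3d}".format(CHARGE_TO_CODE[chg]) + atom_line[39:]
    return "\n".join(lines)


example = "\n".join([
    "",
    "     RDKit          2D",
    "",
    "  2  1  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.2990    0.7500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "M  CHG  1   2   1",
    "M  END",
])
print(charges_to_atom_block(example))
```

Applied to the `C[NH3+]` Mol Block from Greg's reply, this changes the charge field on the N line from 0 to 3 and leaves the `M  CHG` line in place.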

Re: [Rdkit-discuss] Saving mol file

2018-09-25 Thread Colin Bournez
Well yes, I do have this line; I left out the rest of the file for
clarity. The thing is, tools such as MOE and PyMOL read it without a
problem, but rDock, for example, can't read it properly and returns a
neutral N, which is not the case. And if I open it with PyMOL and save it
back in mol format, the 3 appears on the N line and rDock has no trouble
anymore...
I was just wondering if there was a trick in RDKit to also save it this
way.
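
For reference, the "3" on the N line is the V2000 atom-block charge code for a formal charge of +1 (the codes are 1 = +3, 2 = +2, 3 = +1, 5 = -1, 6 = -2, 7 = -3, with 4 marking a doublet radical). A small sketch that decodes this field from the fixed-width columns of an atom line; the helper name `atom_line_charge` is hypothetical:

```python
# V2000 atom-block charge codes: code -> formal charge.
# Code 4 (doublet radical) is deliberately left out and decodes to None.
CODE_TO_CHARGE = {0: 0, 1: 3, 2: 2, 3: 1, 5: -1, 6: -2, 7: -3}


def atom_line_charge(atom_line):
    """Return (element symbol, formal charge) read from one V2000 atom line."""
    symbol = atom_line[31:34].strip()  # columns 32-34
    code = int(atom_line[36:39])       # columns 37-39
    return symbol, CODE_TO_CHARGE.get(code)


original = "   11.3700    3.4360  -11.8300 N   0  3  0  0  0  0  0  0  0  0  0  0"
resaved = "   11.3700    3.4360  -11.8300 N   0  0  0  0  0  0  0  0  0  0  0  0"
print(atom_line_charge(original))
print(atom_line_charge(resaved))
```

A reader that only looks at this field, and ignores the `M  CHG` line, would therefore see the re-saved atom as neutral.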



On 25/09/18 17:18, Greg Landrum wrote:

Hi Colin,
RDKit outputs charge information to mol blocks using the CHG line:

In [3]: m = Chem.MolFromSmiles('C[NH3+]')

In [4]: print(Chem.MolToMolBlock(m))

     RDKit          2D

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.2990    0.7500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
M  CHG  1   2   1
M  END


I expect that you will find one of those in your mol file and that it 
should be properly read in by other tools.

Is this not the case for you?

Best,
-greg



On Tue, Sep 25, 2018 at 4:39 PM Colin Bournez
<colin.bour...@univ-orleans.fr> wrote:


Hey everyone,

I have a question concerning the Chem.MolToMolFile() function.
When I open a file containing an N+ (here is the corresponding line
in the mol file):

   11.3700    3.4360  -11.8300 N   0  3  0  0  0  0  0  0  0  0  0  0

and just save it back without any modification, the line becomes:

   11.3700    3.4360  -11.8300 N   0  0  0  0  0  0  0  0  0  0  0  0

The problem is that for some software this mol file causes trouble
and the N+ is transformed to N with 4 bonds.
I tried several tricks but I was not able to save it as the
original line. Does anyone have a suggestion?

Thanks,



Re: [Rdkit-discuss] Saving mol file

2018-09-25 Thread Greg Landrum
Hi Colin,
RDKit outputs charge information to mol blocks using the CHG line:

In [3]: m = Chem.MolFromSmiles('C[NH3+]')

In [4]: print(Chem.MolToMolBlock(m))

     RDKit          2D

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.2990    0.7500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
M  CHG  1   2   1
M  END


I expect that you will find one of those in your mol file and that it
should be properly read in by other tools.
Is this not the case for you?
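
One way to check whether a given mol file carries such a line is to collect the `M  CHG` entries directly. A minimal sketch in plain Python, assuming a V2000 Mol Block whose entries are separated by whitespace; `read_chg_entries` is a hypothetical name, not an RDKit function:

```python
def read_chg_entries(mol_block):
    """Return {atom index (1-based): formal charge} from all "M  CHG" lines."""
    charges = {}
    for line in mol_block.splitlines():
        if not line.startswith("M  CHG"):
            continue
        fields = line[6:].split()
        count, pairs = int(fields[0]), fields[1:]
        # The first number states how many (atom, charge) pairs follow.
        if len(pairs) != 2 * count:
            raise ValueError("malformed M  CHG line: " + line)
        for atom, charge in zip(pairs[0::2], pairs[1::2]):
            charges[int(atom)] = int(charge)
    return charges


mol_block = "\n".join([
    "",
    "     RDKit          2D",
    "",
    "  2  1  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.2990    0.7500    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "M  CHG  1   2   1",
    "M  END",
])
print(read_chg_entries(mol_block))
```

An empty result means the file has no `M  CHG` line at all, so any charge information would have to come from the atom block.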

Best,
-greg



On Tue, Sep 25, 2018 at 4:39 PM Colin Bournez wrote:

> Hey everyone,
>
> I have a question concerning the Chem.MolToMolFile() function.
> When I open a file containing an N+ (here is the corresponding line in
> the mol file):
>
>    11.3700    3.4360  -11.8300 N   0  3  0  0  0  0  0  0  0  0  0  0
>
> and just save it back without any modification, the line becomes:
>
>    11.3700    3.4360  -11.8300 N   0  0  0  0  0  0  0  0  0  0  0  0
>
> The problem is that for some software this mol file causes trouble and the
> N+ is transformed to N with 4 bonds.
> I tried several tricks but I was not able to save it as the original line.
> Does anyone have a suggestion?
>
> Thanks,
>


[Rdkit-discuss] Saving mol file

2018-09-25 Thread Colin Bournez

Hey everyone,

I have a question concerning the Chem.MolToMolFile() function.
When I open a file containing an N+ (here is the corresponding line in
the mol file):

   11.3700    3.4360  -11.8300 N   0  3  0  0  0  0  0  0  0  0  0  0

and just save it back without any modification, the line becomes:

   11.3700    3.4360  -11.8300 N   0  0  0  0  0  0  0  0  0  0  0  0

The problem is that for some software this mol file causes trouble and
the N+ is transformed to N with 4 bonds.
I tried several tricks but I was not able to save it as the original
line. Does anyone have a suggestion?


Thanks,
