[Rdkit-discuss] Senior Software Engineer in Cheminformatics job opening at Zymergen

2020-07-13 Thread Luke Zulauf via Rdkit-discuss
Hello all,

We are seeking an experienced software engineer with a degree in Chemistry
to join our Cheminformatics team. Your expertise will guide decisions on
appropriate representations of chemicals and their properties, selection of
existing tools to incorporate, and evaluation of public data sets. You will
work side-by-side with computational chemists, data scientists and other
scientists to build the foundation on which key scientific and business
decisions are made. This role provides an opportunity to make a significant
impact on Zymergen in a high-priority project.

If you have experience with RDKit or other cheminformatics toolkits and
familiarity with various molecule representation formats, chemical
fingerprinting, and scalable cloud infrastructure, we'd love to talk to you!

More information can be found here:
https://www.zymergen.com/careers/?gh_jid=2238408#senior-software-engineer-cheminformatics

Luke Zulauf
Staff Software Engineer

ZYMERGEN | WE MAKE TOMORROW | Zymergen.com

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] changes in chirality in rdkit?

2020-07-13 Thread Bennion, Brian via Rdkit-discuss
Hello,
I am "translating" SMILES strings, written to a CSV file by another program,
into RDKit canonical SMILES with the code below.
If I am doing something incorrectly, I would appreciate your input.
Thanks,
Brian Bennion


The original SMILES string:

"OC[C@@H]1O[C@H](Cn2c(N33)nnc2[C@@H]2OCCC2)CC1"


After conversion with RDKit:

"OC[C@H]1CC[C@@H](Cn2c([C@H]3CCCO3)nnc2N22)O1"

My code is below.

   protn_pat = re.compile(r'\[([IBnN])\+(@*)(H[1234]*)*\]')

   line = inFile.readline()
   while len(line) != 0:
       fields = line.replace('","',' ').split()
       mol_name = fields[2]
       molMOE = fields[3].replace('"','')
       mol1check = protn_pat.search(molMOE)
       if mol1check is not None:
           print("Found crazy MOE string",mol1check,molMOE)
           mol1 = protn_pat.sub(r'[\1\3\2+]',molMOE)
       else:
           mol1 = molMOE
       try:
           mol = Chem.MolFromSmiles(mol1)
       except:
           mol = None
       if mol is None:
           print('mol failed:'+molMOE+' '+mol1+' '+str(count)+'\n')
       else:
           rdkitsmichiout.write('\"'+Chem.MolToSmiles(mol, isomericSmiles=True)+'\",')
           rdkitsmichiout.write('\"'+Chem.inchi.MolToInchi(mol,options='/FixedH')+'\",')
           rdkitsmichiout.write('\"'+(Chem.inchi.InchiToInchiKey(Chem.inchi.MolToInchi(mol,options='/FixedH')))+'\"\n')
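
One thing that may be worth checking for the question above: the @ and @@ tags
in SMILES are defined relative to the order in which the neighbours of a
stereocentre are written, so a canonical SMILES that reorders the atoms can
show different tags while still encoding the same configuration. The sketch
below illustrates this with alanine as a stand-in (it is not the molecule from
the post), and same_stereoisomer is a hypothetical helper name, not an RDKit
function.

```python
from rdkit import Chem


def same_stereoisomer(smiles_a, smiles_b):
    """Return True if both inputs give the same canonical isomeric SMILES."""
    mol_a = Chem.MolFromSmiles(smiles_a)
    mol_b = Chem.MolFromSmiles(smiles_b)
    if mol_a is None or mol_b is None:
        raise ValueError("could not parse one of the SMILES strings")
    canon_a = Chem.MolToSmiles(mol_a, isomericSmiles=True)
    canon_b = Chem.MolToSmiles(mol_b, isomericSmiles=True)
    return canon_a == canon_b


# Two spellings of L-alanine: the tag differs (@ vs @@) only because the
# neighbours of the stereocentre are listed in a different order.
l_ala_1 = "C[C@H](N)C(=O)O"
l_ala_2 = "N[C@@H](C)C(=O)O"
# D-alanine: same atom order as l_ala_2 but with the opposite tag.
d_ala = "N[C@H](C)C(=O)O"

print(same_stereoisomer(l_ala_1, l_ala_2))
print(same_stereoisomer(l_ala_2, d_ala))
```

Running the same comparison on an input string and its converted output would
show whether the stereochemistry was actually altered or only rewritten.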



Re: [Rdkit-discuss] Problem with AllChem.EmbedMolecule and/or MMFFOptimizeMolecule

2020-07-13 Thread Wojtek Plonka
Dear Jan,

I was using 2020.03.3.
I can't manage to install version 2020.03.4, which you are using.
I work with command-line Anaconda on openSUSE 15.1 Linux, and installing
with:
conda install -q -y -c conda-forge rdkit
gives me version 2018.09,

while installing with:
conda install -c rdkit rdkit
gets me version 2020.03.3.
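
When different channels resolve to different releases, it can help to confirm
from inside Python which RDKit the active environment actually imports. A
minimal sketch follows; parse_rdkit_version is a hypothetical helper, and the
2020.03.3 threshold is only the version mentioned in this message.

```python
import re


def parse_rdkit_version(version):
    """Turn a release string such as '2020.03.3' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component without a leading number
        parts.append(int(match.group()))
    return tuple(parts)


try:
    import rdkit
    installed = rdkit.__version__
except ImportError:
    installed = None  # RDKit is not importable in this environment

print("installed RDKit:", installed)
if installed is not None:
    is_recent = parse_rdkit_version(installed) >= (2020, 3, 3)
    print("at least 2020.03.3:", is_recent)
```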

I think I will try reinstalling Anaconda from scratch and see what happens
then.
The structure in your notebook looks good to me.

Thank you so much!

Wojtek Plonka
+48885756652
wojtekplonka.com 
fb.com/wojtek.plonka



On Sun, Jul 12, 2020 at 3:40 PM Jan Halborg Jensen wrote:

> The 3D structure of the first molecule looks fine to me:
>
> https://colab.research.google.com/drive/1V-KkS4tMfbD5UNIs5tyQewZnYtq7yQ7i?usp=sharing
>
> What version of RDKit are you using?
>
> On 12 Jul 2020, at 07.00, Wojtek Plonka  wrote:
>
> Dear Greg, All,
>
> (I tried sending the message some time ago, but I think it did not go
> through)
>
> I'm trying to convert some molecules, which I have only as SMILES strings,
> to 3D.
> I use a methodology similar to the script below, except that this example
> saves to SDF at different stages of conversion for test purposes.
>
> What happens is that, for some molecules, I get very bad 3D structures: CH3
> groups with badly distorted geometry and unreasonable bond lengths between
> heavy atoms, even though EmbedMolecule and MMFFOptimizeMolecule report
> success. The problems seem to go away when I remove the chirality data from
> the SMILES (as far as my limited understanding of, and liking for, SMILES
> goes :) ). The script has 3 molecules to process. I'd greatly appreciate it
> if any of you could take a look, with any 3D molecule viewer, at the SDF
> file it produces. The first and second molecules are processed; the third
> fails, but this is OK. The problem is the geometry I get for the first
> molecule.
>
> Any suggestions as to what I might be doing wrong?
>
> I tried playing with the parameters of EmbedMolecule and
> MMFFOptimizeMolecule, and also tried UFF optimization, with no success. I
> can fix my molecules by running MM3 calculations in external software, but
> I'd love to avoid that.
>
> Here is the code:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
>
> myuglymols = [
>     'C[C@@]12OC(=O)[C@]3(O)CC[C@H]4[C@@H](C[C@@H](O)[C@@]5(O)CC=CC(=O)[C@]45C)[C@@]45O[C@@]13[C@@H](C4=O)[C@]1(C)C[C@H]2OC(=O)[C@@H]1CO5',
>     'C[C]12OC(=O)[C]3(O)CC[CH]4[CH](C[CH](O)[C]5(O)CC=CC(=O)[C]45C)[C]45O[C]13[CH](C4=O)[C]1(C)C[CH]2OC(=O)[CH]1CO5',
>     'CC1=CC(=O)[C@@H](O)[C@]2(C)[C@H]3[C@]4(O)OC[C@@]33[C@@H](C[C@@H]12)OC(=O)C[C@H]3C(=C)[C@H]4O'
> ]
>
> w = Chem.SDWriter('uglymols.sdf')
>
> for smiles in myuglymols:
>     m = Chem.MolFromSmiles(smiles)
>     if (m):
>         mold = m
>         m.SetProp('State','MolFromSmiles')
>         w.write(m)
>         Chem.SanitizeMol(m)
>         m.SetProp('State','SanitizeMol')
>         w.write(m)
>         try:
>             print (smiles)
>             m= Chem.AddHs(m)
>
>             # print(AllChem.EmbedMolecule(m,randomSeed=42,useRandomCoords=True,useSmallRingTorsions=True,
>             #       useMacrocycleTorsions=True))
>             print(AllChem.EmbedMolecule(m))
>             m.SetProp('State','SanitizeMol')
>             w.write(m)
>             opt = AllChem.MMFFOptimizeMolecule(m,maxIters=1,
>                 ignoreInterfragInteractions=False,mmffVariant='MMFF94')
>             m.SetProp('State','MMFFOptimizeMolecule')
>             m.SetProp('Optimized',str(1-opt))
>             w.write(m)
>             print(opt)
>         except Exception as e:
>             print ('Failed')
>             m = mold
>             m.SetProp('Optimized','Error')
>             w.write(m)
>     w.flush()
> w.close()
>
> Thank you very much!
>
> Wojtek Plonka
> +48885756652
> wojtekplonka.com
> 
> fb.com/wojtek.plonka
> 
>
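
For reading the status codes in the quoted script: AllChem.EmbedMolecule
returns the new conformer ID, or -1 if embedding failed, and
AllChem.MMFFOptimizeMolecule returns 0 when the minimisation converged, 1 when
more iterations are needed, and -1 when the force field could not be set up.
The quoted script passes maxIters=1, so a return value of 1 would be expected
there. The sketch below checks each step and repeats the optimisation until it
converges; embed_and_optimize, the round limit and the alanine input are
illustrative assumptions, not part of the original script.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_and_optimize(smiles, seed=42, max_rounds=10, iters_per_round=200):
    """Return (mol, status_text); mol is None if parsing or embedding fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None, "parse failed"
    mol = Chem.AddHs(mol)
    # EmbedMolecule returns the conformer ID, or -1 on failure.
    if AllChem.EmbedMolecule(mol, randomSeed=seed) < 0:
        return None, "embedding failed"
    if not AllChem.MMFFHasAllMoleculeParams(mol):
        return mol, "no MMFF parameters"
    for _ in range(max_rounds):
        # 0 = converged, 1 = more iterations needed, -1 = setup failed.
        status = AllChem.MMFFOptimizeMolecule(mol, maxIters=iters_per_round)
        if status == 0:
            return mol, "converged"
        if status < 0:
            return mol, "force field setup failed"
    return mol, "not converged"


mol, status = embed_and_optimize("C[C@H](N)C(=O)O")
print(status)
```

A structure written out while the status is still "not converged" has not been
fully minimised, which is one possible source of odd-looking geometry.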


[Rdkit-discuss] Chiral flag when writing molfile

2020-07-13 Thread Tim Dudgeon
I've noticed that when writing a 3D molfile for a molecule that was generated
by RDKit (in my case using AllChem.EmbedMolecule) and has chirality present,
the chiral flag is not set. At least it is not always set; I can't be
exactly sure. My exact scenario is to read a chiral SMILES, convert it to 3D
using EmbedMolecule, and then write it to SDF.

I also note that there is a magical '_MolFileChiralFlag' property which, if
set to 1, seems to result in the chiral flag being set.

So the question is: is it the responsibility of the developer to set this
'_MolFileChiralFlag' property when the molecule is known to be chiral
(e.g. using the Chem.FindMolChiralCenters() method), or am I missing
something here?

Thanks
Tim
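
A minimal sketch of the scenario described above, assuming that setting the
'_MolFileChiralFlag' property by hand is acceptable for the data at hand. The
helper name molblock_with_chiral_flag and the rule "flag any molecule with at
least one assigned stereocentre" are illustrative choices, not RDKit policy.
In the CTfile format the flag is generally read as a statement that the
configuration is absolute, so whether to set it depends on what is known about
the sample.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def molblock_with_chiral_flag(smiles, seed=42):
    """Embed a molecule in 3D and write a mol block with the chiral flag."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES")
    mol = Chem.AddHs(mol)
    if AllChem.EmbedMolecule(mol, randomSeed=seed) < 0:
        raise ValueError("embedding failed")
    # Assumption: mark the molecule as chiral when it has at least one
    # assigned stereocentre.
    if Chem.FindMolChiralCenters(mol):
        mol.SetIntProp("_MolFileChiralFlag", 1)
    return Chem.MolToMolBlock(mol)


block = molblock_with_chiral_flag("C[C@H](N)C(=O)O")
counts_line = block.splitlines()[3]  # 4th line of a V2000 mol block
print(counts_line)
# The chiral flag is the fifth 3-character field of the counts line.
print("chiral flag field:", counts_line[12:15].strip())
```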