[Rdkit-discuss] svg with all atoms and their numbers

2013-11-20 Thread Dimitri Maziuk
, -- Dimitri Maziuk Programmer/sysadmin BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu

Re: [Rdkit-discuss] rdkit version from python

2013-12-07 Thread Dimitri Maziuk
On 12/06/2013 07:55 PM, Greg Landrum wrote: Hi Dimitri, The correct invocation is: from rdkit import rdBase print rdBase.rdkitVersion D'oh. Should've thought of that. It was Friday, that's my excuse. It works, thanks
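
The invocation quoted above uses the Python 2 print statement. A minimal Python 3 sketch of the same check, assuming RDKit is installed and importable (the printed value depends on the installed build):

```python
from rdkit import rdBase

# rdkitVersion is a plain string; its exact value depends on the build.
version = rdBase.rdkitVersion
print(version)
```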

Re: [Rdkit-discuss] rdkit.Chem.rdMolDescriptors._CalcMolWt()

2013-12-16 Thread Dimitri Maziuk
On 12/12/2013 9:50 PM, Greg Landrum wrote: If the numbers are being used for the interpretation of mass spec results, you almost certainly should be using exact masses, not the average molecular weight: In [9]: Descriptors.ExactMolWt(m) Out[9]: 46.041864812 The ExactMolWt calculation uses
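
To illustrate the distinction drawn in the quoted reply, here is a short sketch comparing the two descriptors. It assumes the molecule in the quoted session was ethanol (the quoted value matches C2H6O); the SMILES is an assumption, not taken from the post.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

mol = Chem.MolFromSmiles("CCO")  # assumed: ethanol, C2H6O

average_mw = Descriptors.MolWt(mol)     # abundance-weighted average masses
exact_mw = Descriptors.ExactMolWt(mol)  # most abundant isotope of each element

print(f"average: {average_mw:.3f}  exact: {exact_mw:.6f}")
```

For mass spec work the second number is the one to compare against an observed monoisotopic peak.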

Re: [Rdkit-discuss] rdkit.Chem.rdMolDescriptors._CalcMolWt()

2013-12-16 Thread Dimitri Maziuk
suppose I should've looked there as well...

Re: [Rdkit-discuss] InChI roundtrip

2014-01-30 Thread Dimitri Maziuk
is the only layer that's required. Everything else is optional. Let's say you read in InChI=1/C2H6O and print out InChI=1/C2H6O/c1-2-3.
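
The round-trip problem in this snippet can be sketched with plain string handling. The helper name `inchi_layers` is hypothetical, and the sketch assumes the truncated subject of the sentence is the formula layer; it only shows that two strings can agree on that layer and still fail a naive equality test.

```python
def inchi_layers(inchi):
    """Split an InChI string into its '/'-separated layers (prefix dropped)."""
    prefix, _, rest = inchi.partition("/")
    if not prefix.startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    return rest.split("/")

read_in = "InChI=1/C2H6O"
written_out = "InChI=1/C2H6O/c1-2-3"

# The full strings differ, but the first (formula) layer is identical.
same_string = read_in == written_out
same_formula = inchi_layers(read_in)[0] == inchi_layers(written_out)[0]
print(same_string, same_formula)
```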

Re: [Rdkit-discuss] cite?

2014-04-16 Thread Dimitri Maziuk
On 4/16/2014 1:55 AM, Greg Landrum wrote: On Wed, Apr 16, 2014 at 8:16 AM, paul.czodrow...@merckgroup.com wrote: I have used this citation: RDKit, Open-Source Cheminformatics. http://www.rdkit.org. There has not (yet) been an RDKit paper

[Rdkit-discuss] SVG with atom labels

2014-06-03 Thread Dimitri Maziuk
current rpms)? And the follow-up question: can I renumber them from 1? -- I've no problem with 0-based indexing but my users for some reason do. TIA

Re: [Rdkit-discuss] Leaky Memory?

2014-06-10 Thread Dimitri Maziuk
will be heterogeneous). I am wondering if anyone knows where my leak might be. Nitpick: it's only a leak if the process never terminates, otherwise it's deferred cleanup. ;) My first question would be are the other processes actually doing anything?

Re: [Rdkit-discuss] Chem.AddHs() doesn't care about compound layout

2014-08-20 Thread Dimitri Maziuk
like what is supposed to happen. I think that it should just add coords for the Hs that it adds. Unless your backbone is laid out so there's room around it for all those protons, expect the result to not look good.

[Rdkit-discuss] SDF and stereo config

2014-09-29 Thread Dimitri Maziuk
: N O2 : N N3 : N C4 : S C5 : N C6 : N -- with or without calling rdkit.Chem.FindMolChiralCenters( mol ) Is this what's supposed to happen? TIA

Re: [Rdkit-discuss] SDF and stereo config

2014-09-30 Thread Dimitri Maziuk
and I can do a longer (but still brief) description. No, what I need is to read the fine manual more carefully. Specifically, the part where Chem.FindMolChiralCenters(m) returns a list instead of modifying its argument in place. Thanks and apologies for the noise
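
A minimal sketch of the point made in this post: the chiral centers come back as the return value, so the result has to be captured. The SMILES is the L-alanine example from the 2016-09-09 post in this listing, for which that post reports "C2 : S"; the 1-based label format is borrowed from the same thread.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("C[C@@H](C(=O)O)N")  # L-alanine

# FindMolChiralCenters returns a list of (atom_index, label) tuples.
centers = Chem.FindMolChiralCenters(mol)

for idx, label in centers:
    # Build 1-based labels such as "C2" from the 0-based atom index.
    print(mol.GetAtomWithIdx(idx).GetSymbol() + str(idx + 1), ":", label)
```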

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Dimitri Maziuk
, then you can have a different definition of unique as an English word. In this version of English, go buy a dictionary.

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Dimitri Maziuk
On 2015-02-19 05:58, Greg Landrum wrote: On Thu, Feb 19, 2015 at 10:11 AM, Markus Sitzmann markus.sitzm...@gmail.com wrote: Well, at least you said something important: conversion of InChI to molecules is something that's not in general guaranteed to

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Dimitri Maziuk
' faces and other disciplines have nothing to do with it. (Note, however, the face recognition people actually get simplistic integer numbers so their unique keys tend to be based on well-defined metrics and factor in isometries and other fun math stuff. Unlike IUPAC and InChI.)

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Dimitri Maziuk
On 2015-02-19 07:27, Markus Sitzmann wrote: No, a chemical structure must calculate a unique InChI, but an InChI might cover more than one chemical structure Heh. I could swear last time I read the description it specifically mentioned databases. In the database context 'unique' has a specific

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-19 Thread Dimitri Maziuk
of applications. It's your database. What you can't do is redefine unique to mean two things at once: it's not your discrete math. Sorry.

[Rdkit-discuss] Bad conformer ID?

2015-02-09 Thread Dimitri Maziuk
rdkit.Chem.AllChem.UFFOptimizeMolecule( m ) ValueError: Bad Conformer Id on the attached molecule. I wonder if it's triggering a bug in RDKit somewhere. (I did try RemoveHs() instead of add, it made no difference.) It's not a nice molecule...

Re: [Rdkit-discuss] FYI: google code shutting down

2015-03-13 Thread Dimitri Maziuk
On 2015-03-13 01:25, Greg Landrum wrote: Github does offer the option of setting up a wiki for a project, I haven't done this for the RDKit since it doesn't seem that necessary (and it seems that the information in wikis has a tendency to rot) but if anyone has strong opinion otherwise, we

Re: [Rdkit-discuss] RDKit, Inchi, Stereochemistry !

2015-02-23 Thread Dimitri Maziuk
of the same software...

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
lexer can't decide what's what.

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
- if a record contains forbidden values, stop writing to the file, with an error. With the reader it looks like you can't help it if someone makes a value like 55 or . With that caveat, you should be able to find tags and read everything in between as a value.

Re: [Rdkit-discuss] SDF tags and -

2015-04-30 Thread Dimitri Maziuk
On 2015-04-29 23:08, Greg Landrum wrote: Here are my thoughts on this: The RDKit is usually strict while parsing molecules from SDF, SMILES, or other formats. My point was that given ''' my_property2 1234 my_property3 ''' a lexer shouldn't have a problem recognizing the 2 tags. A
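
The "find the tags and read everything in between as a value" idea from this thread can be sketched as below. The angle brackets in the archived example appear to have been stripped, so the sample assumes the usual SD data header form `> <name>`; the function name `read_sd_properties` is hypothetical and this is not how RDKit's own reader is implemented.

```python
import re

# An SD data header line starts with ">" and carries the field name in
# angle brackets, e.g. "> <my_property2>".
TAG_LINE = re.compile(r"^>.*<([^<>]+)>")

def read_sd_properties(lines):
    """Collect {tag: value}; a value is every non-blank line up to the next tag."""
    props = {}
    tag = None
    for line in lines:
        match = TAG_LINE.match(line)
        if match:
            tag = match.group(1)
            props[tag] = []
        elif tag is not None and line.strip():
            props[tag].append(line.rstrip("\n"))
    return {name: "\n".join(value) for name, value in props.items()}

sample = [
    "> <my_property2>",
    "1234",
    "",
    "> <my_property3>",
    "",
]
properties = read_sd_properties(sample)
print(properties)
```

The caveat from the thread still applies: a value line that itself looks like a data header would be taken for a new tag.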

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-01 Thread Dimitri Maziuk
. Without a valid mol block you don't have a molecule and you shouldn't be making one up. As in conservative in what you produce.

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-04 Thread Dimitri Maziuk
On 2015-05-03 15:06, Michael Reutlinger wrote: Well... I think my proposal should enable us to put more strict, robust QC in place, but I guess you are missing this point. My definition of strict and robust is: if the input is bad, what comes out is an out-of-band error signal. Such that

Re: [Rdkit-discuss] SDF tags and -

2015-04-29 Thread Dimitri Maziuk
On 04/29/2015 05:32 PM, Andrew Dalke wrote: On Apr 29, 2015, at 9:19 PM, Dimitri Maziuk wrote: There is a difference between ACM members writing network protocols and domain people writing junk. I think that you are saying that the MDL connection table file formats are junk. I do

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Dimitri Maziuk
in a binary computer, you'll have to have 2 distinctly different canonical benzenes. That's just how a binary computer works.

Re: [Rdkit-discuss] SMILES: Why are rings consisting of wildcards assumed to be aromatic?

2015-06-15 Thread Dimitri Maziuk
RDKit does it because Greg said so. HTH

Re: [Rdkit-discuss] Clustering 1M molecules

2015-08-23 Thread Dimitri Maziuk
and you only intend to run it a handful of times, N^2 is not worth worrying about. Otherwise try to split into smaller batches that you can run in parallel on a cluster of computers. FWIW

Re: [Rdkit-discuss] The Chlorine molfile question

2016-01-21 Thread Dimitri Maziuk
e left trying to guess whether a given "CA" stands for C-alpha or calcium.

Re: [Rdkit-discuss] The Chlorine molfile question

2016-01-20 Thread Dimitri Maziuk
d busted PDB format gone, they are not offering a usable alternative that I know of. That's exactly what we've been doing at BMRB, too, and then complaining about low rate of adoption of NMR-STAR by the NMR community.

Re: [Rdkit-discuss] The Chlorine molfile question

2016-01-20 Thread Dimitri Maziuk
On 01/20/2016 04:57 PM, Peter S. Shenkin wrote: > On Wed, Jan 20, 2016 at 5:33 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> > wrote: >> JSON encodes a single string. That is a problem for sending larger files >> over the net, say, an NMR structure of a larger molecule with 1

[Rdkit-discuss] rdMolDraw2D drawing code

2016-09-02 Thread Dimitri Maziuk
? TIA

Re: [Rdkit-discuss] rdMolDraw2D drawing code

2016-09-06 Thread Dimitri Maziuk
).GetSymbol() + str ((i+1))

Re: [Rdkit-discuss] rdMolDraw2D drawing code

2016-09-06 Thread Dimitri Maziuk
hat that layout might look better with all the Hs and numbers added, than the one I get (the other 3 pictures).

Re: [Rdkit-discuss] drawing code take 2

2016-09-08 Thread Dimitri Maziuk
ike someone will sit in front of Marvin and play with options until they get a perfect picture for their paper.

Re: [Rdkit-discuss] AddHs()

2016-09-09 Thread Dimitri Maziuk
hat's what I expected and "removeHs = False" works, thanks. And I was kidding about histidine of course.

Re: [Rdkit-discuss] AddHs()

2016-09-13 Thread Dimitri Maziuk
ull > and I'm sure he would appreciate the help. GitHub does have a wiki. One has to become a "collaborator" to get edit permissions, AFAIK it doesn't do fine-grained, but what it does should be good enough. The wiki is a git repo itself so it could be pulled and integrated into relea

Re: [Rdkit-discuss] FindChiralCenters in MOL/SDF files howto

2016-09-14 Thread Dimitri Maziuk
On 09/14/2016 02:23 PM, Dimitri Maziuk wrote: > lbl=mol.GetAtomWithIdx(s[0]).GetSymbol() + str(s[0]+1) > print label, ":", s[1] ^^^ Should be lbl

[Rdkit-discuss] FindChiralCenters in MOL/SDF files howto

2016-09-14 Thread Dimitri Maziuk
"DL-Alanine" describing *either* D- or L-Alanine. In this case "unspecified" is the correct value for chirality tag. (And in the case of "2D" SDF it will be; unfortunately PubChem software will generate a "3D" SDF for CID 602 and it will have a single conformer: L-Alanine.

Re: [Rdkit-discuss] AddHs()

2016-09-08 Thread Dimitri Maziuk
s in the source file. Which might matter in the case of e.g. stereospecifically assigned methylene protons. (Or so they tell me ;)

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
Oops. AssignAtomChiralTagsFromStructure() does indeed work. >> If your file has 3D coordinates, AssignAtomChrialTagsFromStructure Good to see I'm not the only one with lysdexic fnigers. Apologies for the noise: ;)

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
do have atom parity set even though there's no 3D coordinates. So I'll probably go with your solution instead of TagsFromStructure b/c it'll work for both 2D and 3D MOL files. (elif p == 3 -> rdkit.Chem.rdchem.ChiralType.CHI_UNSPECIFIED, of course)

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
ore, if the MOL file lacks the bond stereo information chirality >> won't be set. GetProp( "molParity" ) does work, thank you, but as I understand it's based on atom ordering in the CTAB and not on CIP rules. So it's just as good as OB's stereo "feature" for my purposes: e

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
On 09/10/2016 04:34 PM, David Cosgrove wrote: ... > Also, the atoms in a molecule should have the property _CIPRank set, you > might be able to do something with that. Possibly, but since the non-typo'ed function seems to do the trick, that's good enough for me. Thanks

Re: [Rdkit-discuss] Has3D?

2016-09-13 Thread Dimitri Maziuk
a single conformer with ID 0, with the coordinates from the file. `Conformer`s have a `Is3D` method, which *should* do what you want. It does. "There's conformer[0]" is the bit I was missing. It seems to be there for 2D MOLs as well with Is3D() -> False. Thank you.
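
A small sketch of the conformer check described here. To stay self-contained it builds the coordinates in memory rather than reading a MOL file, which is an assumption of this example; a molecule read from a file would be queried the same way through its conformer.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# 2D case: a depiction conformer, stored with ID 0.
mol2d = Chem.MolFromSmiles("CCO")
AllChem.Compute2DCoords(mol2d)
flat = mol2d.GetConformer(0).Is3D()

# 3D case: embed coordinates after adding explicit hydrogens.
mol3d = Chem.AddHs(Chem.MolFromSmiles("CCO"))
AllChem.EmbedMolecule(mol3d, randomSeed=42)
embedded = mol3d.GetConformer().Is3D()

print(flat, embedded)
```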

[Rdkit-discuss] Has3D?

2016-09-13 Thread Dimitri Maziuk
t's one or the other? TIA,

[Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-09 Thread Dimitri Maziuk
s( 'C[C@@H](C(=O)O)N' ) instead, the output is C2 : S (this is L-ALA from the same PubChem record as the SDF). So it looks like MOL reader ignores chirality, is that the case? Thanks,

Re: [Rdkit-discuss] The RDKit and modern C++

2016-09-24 Thread Dimitri Maziuk
On 2016-09-24 01:25, Greg Landrum wrote: > https://medium.com/@greg.landrum_t5/the-rdkit-and-modern-c-48206b966218?source=linkShare-d698b3fa9f7-1474698147 > > This is a big and important change and I'd love to hear whatever > feedback members of the community may have. Please comment either on

[Rdkit-discuss] drawing code take 3

2016-09-26 Thread Dimitri Maziuk
On the plus side, when drawing PubChem CID 5057 from a 3D SDF before and after our canonicalization, RDKit draws a mirror image, but otherwise the same 2D structure. OB's "after" version is attached: enjoy the 7-bond carbon in the ring. ;)

Re: [Rdkit-discuss] The RDKit and modern C++

2016-09-29 Thread Dimitri Maziuk
On 2016-09-29 00:57, Markus Sitzmann wrote: > I get the feeling, RH/Centos 6 becomes the next XP kind of story - too > many legacies that make the update impossible or very hard. Also docker, > a great technology that could mitigate this problem, is very painful > under RH/Centos 6. systemd,

Re: [Rdkit-discuss] The RDKit and modern C++

2016-09-29 Thread Dimitri Maziuk
c++-14 code with c++-03 compilers.

[Rdkit-discuss] drawing code take 2

2016-09-07 Thread Dimitri Maziuk
should have. For reference, CID260719.ob.svg is the other toolkit's rendering of the same file with (atom indexes changed to green from OB's default red).

Re: [Rdkit-discuss] drawing code take 3

2016-09-27 Thread Dimitri Maziuk
On 2016-09-26 18:19, Peter S. Shenkin wrote: > 2D drawing code is tough. The 90/10 rule applies: the last 10% of > I think for the present purposes what we need is something correct, > robust and legible, and of course the example shown does not exhibit > that. (But I don't know what the starting

Re: [Rdkit-discuss] drawing code take 3

2016-09-26 Thread Dimitri Maziuk
one(s) with least clashes/overlaps.

Re: [Rdkit-discuss] 2D drawing with atoms labeled by index

2016-10-24 Thread Dimitri Maziuk
it either way, unfortunately my target viewer is firefox (it's a web application and the user's default browser is firefox) and firefox isn't one of them. Without svg:'s it'll show the file as xml text instead of the image. HTH

Re: [Rdkit-discuss] Fwd: 2D drawing with atoms labeled by index

2016-10-24 Thread Dimitri Maziuk
On 2016-10-24 19:04, Peter S. Shenkin wrote: > My second conclusion (based on the .svg-file experiments) is that it's > not an iPython problem and, since you see the same thing on Firefox, > it's unlikely to be a Chrome problem. Well, what I got it from (Greg's I think) tutorial that if you

[Rdkit-discuss] SVG BUG (Re: Fwd: 2D drawing with atoms labeled by index)

2016-10-25 Thread Dimitri Maziuk
On 10/25/2016 11:21 AM, Peter S. Shenkin wrote: > Hi, Hongbin, > > Thanks. Indeed. svg2.svg, when renamed to svg2.html, shows the correct > image in Chrome. svg.html shows garbage. > > Still, it would be good to be able to create a real .svg file from RDKit. OK, you made me look and I learned

Re: [Rdkit-discuss] 2D drawing with atoms labeled by index

2016-10-24 Thread Dimitri Maziuk
=rdkit.Chem.Draw.rdMolDraw2D.MolDraw2DSVG(800,800) dr.SetFontSize(0.3) op = dr.drawOptions() for i in range(mol.GetNumAtoms()) : op.atomLabels[i]=mol.GetAtomWithIdx(i).GetSymbol() + str((i+1)) rdkit.Chem.AllChem.Compute2DCoords(mol) dr.DrawMolecule(mol) dr.FinishDrawing() svg=dr.GetDrawingText()

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Dimitri Maziuk
We have no proof that it's 100% correct, but all duplicates it found in the PDB ligand expo at the time were genuine. Enjoy,

Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)

2016-11-17 Thread Dimitri Maziuk
537 comes close. Marvin doesn't do much better on this one even if you don't turn on all the labels.

Re: [Rdkit-discuss] SVG BUG (Re: Fwd: 2D drawing with atoms labeled by index)

2016-10-27 Thread Dimitri Maziuk
On 2016-10-26 23:39, Peter S. Shenkin wrote: > Hey, by the way, my agenda is trying to understand all this. (Using python syntax instead of ML) Recommended by TFM: from "http://www.w3.org/2000/svg" import * All svg names should work with or without package qualifier: point(), line(), etc., as

Re: [Rdkit-discuss] How to find the idx of hydrogens bonded to a specific atom

2016-10-13 Thread Dimitri Maziuk
ere all hydrogens are explicitly present and indexed. I wonder if they stay that way throughout the steps leading to (and past) the smarts match.

Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Dimitri Maziuk
y, waiting 5 sec for a page refresh wouldn't be great. Maybe not, but depending how the browser lays out the grid, it may take 5 seconds anyway. My recommendation for that use case would be to pre-generate the images and store the URLs in that database. Which is what we do here. ;)

Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Dimitri Maziuk
On 12/15/2016 02:53 PM, Peter S. Shenkin wrote: > Looks good, but maybe too slow for production use... (?) I wonder what kind of production use would require sub-second wall clock time for this.

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-29 Thread Dimitri Maziuk
rdized representation of all the properties you consider relevant and produce a unique hash of that. Doesn't matter if it's a SHA-1 string or some graph-based magic or a matrix voodoo. (String comparison is of course easier.)
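
The hash-of-a-standardized-representation idea can be sketched with the standard library alone. The property names and the `structure_key` helper are hypothetical; which properties count as relevant, and how they are standardized, is exactly the decision the post leaves to the database owner.

```python
import hashlib

def structure_key(properties):
    """Hash an order-independent rendering of the chosen properties."""
    canonical = "\n".join(
        "%s=%s" % (name, properties[name]) for name in sorted(properties)
    )
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

a = {"inchi": "InChI=1/C2H6O/c1-2-3", "charge": 0}
b = {"charge": 0, "inchi": "InChI=1/C2H6O/c1-2-3"}  # same content, other order
c = {"inchi": "InChI=1/C2H6O", "charge": 0}         # different content

same_ab = structure_key(a) == structure_key(b)
same_ac = structure_key(a) == structure_key(c)
print(same_ab, same_ac)
```

Two records then compare equal exactly when their keys do, which reduces duplicate detection to string comparison.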

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread Dimitri Maziuk
On 12/02/2016 03:12 PM, George Papadatos wrote: > Here's a pragmatic idea: ... would it not be safe to > assume that *any *word containing more than 4 'C' or 'c' characters would > only be a SMILES string? pneumonoultramicroscopicsilicovolcanoconiosis
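
The counterexample can be checked mechanically. The function below is a hypothetical rendering of the proposed heuristic, not code from the thread.

```python
def looks_like_smiles(word, threshold=4):
    """Proposed heuristic: more than `threshold` 'C' or 'c' characters."""
    return sum(ch in "Cc" for ch in word) > threshold

word = "pneumonoultramicroscopicsilicovolcanoconiosis"
carbon_like = sum(ch in "Cc" for ch in word)

false_positive = looks_like_smiles(word)   # an English word is accepted
false_negative = looks_like_smiles("CCO")  # a real SMILES (ethanol) is rejected
print(carbon_like, false_positive, false_negative)
```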

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
can guarantee you that a) it's much more than $20, and b) hiring a competent programmer will cost you more than buying a "better computer" and is not guaranteed to result in any appreciable speed-up.

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
we need a billion depictions all at once" implies that you have a billion users looking at them all at once. If you don't, then rapid response is a very interesting academic exercise but its practical usefulness might be somewhat questionable.

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
On 2016-12-29 07:19, John M wrote: > For why you need sub-second depiction consider these times for 92877507 > structures (current size PubChem Compound): > > 1s per structure = 1074 days (~3 years) > 100 ms per structure = 107 days > 1ms per structure = 25 hours The Dilbert answer is buy a
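
The quoted figures follow from simple arithmetic, reproduced here as a sanity check. The structure count is the one given in the post; single-threaded processing is assumed.

```python
SECONDS_PER_DAY = 86400
N_STRUCTURES = 92877507  # PubChem Compound size quoted in the post

def wall_clock(seconds_per_structure):
    """Return (whole days, whole hours) of single-threaded wall-clock time."""
    total = N_STRUCTURES * seconds_per_structure
    return int(total // SECONDS_PER_DAY), int(total // 3600)

for label, cost in [("1 s", 1.0), ("100 ms", 0.1), ("1 ms", 0.001)]:
    days, hours = wall_clock(cost)
    print("%6s per structure: %d days (%d hours)" % (label, days, hours))
```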

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
n a regular basis is a realistic use case, either.

Re: [Rdkit-discuss] Molecule representation

2017-03-08 Thread Dimitri Maziuk
;, "on" ) cmd.png( outfile, width = 800, dpi = 300, ray = 1 ) while threading.active_count() > 2 : time.sleep( 2 ) cmd.quit() HTH,

Re: [Rdkit-discuss] PBF precision is to high to determine good planarity

2017-03-02 Thread Dimitri Maziuk
On 2017-03-02 04:37, Guillaume GODIN wrote: > Based on the precision of the coordinates (in rdkit sdf files it's 4 > digits) can we infer the precision on the PBF value based on that ? Only if you *know* the values are actually accurate to 4 digits and not e.g. were printed as "%.4f" just
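
A two-line illustration of the caveat in the reply: the number of printed digits is a property of the format string, not of the measurement. The value is hypothetical.

```python
stored = 1.2  # hypothetical coordinate known to one decimal place only
written = "%.4f" % stored
digits_after_point = len(written.split(".")[1])
print(written, digits_after_point)
```

The file shows four decimal places either way, so the precision of a derived quantity cannot be inferred from the file alone.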

Re: [Rdkit-discuss] Is there a Ubuntu ppa or some repository with the latest rdkit release as .deb ?

2017-06-22 Thread Dimitri Maziuk
On 2017-06-22 01:36, Francois BERENGER wrote: make deb # in rdkit source tree Some people might ask for a make rpm target also. You'd have to track any changes that redhat, canonical, suse, and whoever else's out there might make to e.g. filesystem layout, linked libraries, python and so

Re: [Rdkit-discuss] Memory issue when storing more than 300K mol in a list

2017-06-09 Thread Dimitri Maziuk
On 2017-06-09 08:12, Alexis Parenty wrote: Dear Greg and Brian, Many thanks for your response. I was also thinking of your streaming approach! I think the RAM of most machines would deal with lists of 100K mol so we could put the threshold higher than 1000. Actually, I was thinking to monitor
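
The streaming approach mentioned here amounts to holding only one batch in memory at a time. A generic sketch, with placeholder strings standing in for molecules and a hypothetical `batches` helper:

```python
from itertools import islice

def batches(iterable, size):
    """Yield lists of at most `size` items without loading the whole input."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

records = ("mol-%d" % i for i in range(10))  # stand-in for a molecule supplier
sizes = [len(chunk) for chunk in batches(records, 4)]
print(sizes)
```

The batch size is the threshold being discussed: memory use scales with it rather than with the size of the input.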

Re: [Rdkit-discuss] atom indexes and order of atoms in the input file

2017-06-15 Thread Dimitri Maziuk
t back up today, will output a MOL file with atoms ordered as per the article. The downside is it only works on 3D MOLs.

Re: [Rdkit-discuss] atom indexes and order of atoms in the input file

2017-06-15 Thread Dimitri Maziuk
vers doing the crunching are currently down.

Re: [Rdkit-discuss] Memory issue when storing more than 300K mol in a list

2017-06-10 Thread Dimitri Maziuk
On 2017-06-10 07:42, Chris Swain wrote: This sounds like the situation where a database might be a better option, tuned to store fingerprints in RAM? The issue is how much programming time it will take, how much that time is worth, and how many times the solution will be reused. A clever

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
On 2017-09-13 09:56, TJ O'Donnell wrote: Let the database do the work for you.  Create a canonical SMILES column and/or InChI column and declare them to be unique.  As you insert new rows, postgres will let  you know if there is already a row with the same SMILES or InChI. Here's some help on
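
The "declare the column unique and let the database complain" approach can be tried without a server. This sketch uses the standard-library sqlite3 module as a stand-in for postgres, which is an assumption of the example; the table name, helper name, and identifier strings are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE molecules (id INTEGER PRIMARY KEY, inchi TEXT UNIQUE)")

def insert_if_new(inchi):
    """Insert a row; return False if the unique constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO molecules (inchi) VALUES (?)", (inchi,))
        return True
    except sqlite3.IntegrityError:
        return False

incoming = ["InChI=1/C2H6O/c1-2-3", "InChI=1/CH4/h1H4", "InChI=1/C2H6O/c1-2-3"]
results = [insert_if_new(s) for s in incoming]
count = conn.execute("SELECT COUNT(*) FROM molecules").fetchone()[0]
print(results, count)
```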

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
eas you want it represented as one. Last I looked PDB Ligand Expo had two different benzenes. Their software doesn't (didn't?) do the circle version so they don't have the third one.

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
On 2017-09-13 10:17, Markus Sitzmann wrote: Canonical SMILES are only a very rough approximation for "unique molecule" as they usually don't work well for tautomeric forms of compound. InChI or Standard InChI is much better although also not perfect. ALATIS I linked to above does impose a

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
On 09/14/2017 03:04 PM, Markus Sitzmann wrote: > Not on Centos 6 - Docker requires Centos 7 for the host system. You can't win... :(

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
idea to update. Just FYI: python 2.6 is the system python on (at least) RHEL-6 family of linux distros that will be officially with us until June 30, 2024.

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
centos-sclo-rh > python27-python-pip.noarch 8.1.2-1.el6 > centos-sclo-rh ... Any guesses as to how many things will break in my infrastructure manglement setup (saltstack) if I enable Software Collections and some

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
On 09/14/2017 02:58 PM, Andrew Dalke wrote: > If only Greg got as much money for long term RDKit support as Red Hat > gets for long term RHEL support. :) Yep. But an rdkit docker container might be feasible.

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
On 2017-09-13 05:13, Wandré wrote: Checking whether a SMILES has already been inserted is easy (text comparison), but comparing the fingerprint of a molecule... Here's one option: http://alatis.nmrfam.wisc.edu/ -- you can use string comparison on the resulting inchi string. Dima

Re: [Rdkit-discuss] RDkit and Pubchem

2017-12-01 Thread Dimitri Maziuk
es, except where it doesn't.

Re: [Rdkit-discuss] Transparent background for 2D molecule images

2017-11-20 Thread Dimitri Maziuk
On 11/20/2017 04:45 PM, Markus Metz wrote: > opts.clearBackground=False > > or > > opts.setBackgroundColour((1,1,0)) > > are not working for me. What's your output format?

Re: [Rdkit-discuss] Atom mapping

2018-05-09 Thread Dimitri Maziuk
the exact same ligand*. (It's the *substructure* bit that I'm not entirely sure about.)

Re: [Rdkit-discuss] Issue with the latest RDKit DB build

2017-12-29 Thread Dimitri Maziuk
PS the real question is why you're trying to run psql built with a newer toolset when there's 2 perfectly good ones available: one from the distro vendor and one from postgres repos.

Re: [Rdkit-discuss] Issue with the latest RDKit DB build

2017-12-29 Thread Dimitri Maziuk
5 /usr/lib64/libpanel.so.5.9 /usr/lib64/libpanelw.so.5 /usr/lib64/libpanelw.so.5.9 /usr/lib64/libtic.so.5 /usr/lib64/libtic.so.5.9 /usr/lib64/libtinfo.so.5 /usr/lib64/libtinfo.so.5.9

Re: [Rdkit-discuss] mol file parsing, 3D or 2D

2018-01-17 Thread Dimitri Maziuk
On 2018-01-16 22:46, Greg Landrum wrote: It might be worth thinking about adding an option to the aromaticity perception code to maintain the original bond types and just set the "isAromatic" flag on the bonds. This is how it's modeled in mmCIF chem. comp. It may or may not come from

Re: [Rdkit-discuss] mol file parsing, 3D or 2D

2018-01-17 Thread Dimitri Maziuk
On 2018-01-17 10:25, Jason Biggs wrote: For the case in question, I find that if I read in a mol file containing 2D coordinates, and I skip the sanitization step altogether, then the 3D embedding algorithms fail. Well, yes, as I mentioned in the other thread: the only way you can get it to

Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-15 Thread Dimitri Maziuk
n that I don't know if there's anything that works on not proteins, or can predict disordered regions well etc. If anyone's counting votes, pretty 2D depictions get mine.

Re: [Rdkit-discuss] convert a smiles file to a xyz file

2018-05-23 Thread Dimitri Maziuk via Rdkit-discuss
On 5/23/2018 10:23 AM, Chenyang Shi wrote: A separate question: is the molecular structure converted from SMILES the same as that taken from a crystal structure? Provided there's no undefined/different stereochemistry on SMILES side, no quirks with added protons, and so on and so

Re: [Rdkit-discuss] Can't import Chem from rdkit in Anaconda Python 3.6.5

2018-06-13 Thread Dimitri Maziuk via Rdkit-discuss
On 6/13/2018 10:06 AM, Greg Landrum wrote: Note that my answer assumes that there is a reason that you don't have X11 installed on your linux box. If that's not the case, you should be able to fix things "more easily" by installing X Quite frankly, this is rapidly becoming unusable as a

Re: [Rdkit-discuss] Can't import Chem from rdkit in Anaconda Python 3.6.5

2018-06-13 Thread Dimitri Maziuk via Rdkit-discuss
er container, but compiling it on my desktop is simply not worth my time. (And our compute nodes are the same or older as my desktop, so if it doesn't work on my box, we can't deploy it anywhere.)

Re: [Rdkit-discuss] Can't import Chem from rdkit in Anaconda Python 3.6.5

2018-06-13 Thread Dimitri Maziuk via Rdkit-discuss
nows they're doing and watches the whole thing very carefully. No, it's not your problem, you're doing the best you can, and thank you for that. But the end result is that ready-made builds are getting increasingly too bloated to be of use, and custom builds are too "non-trivial" to attemp

Re: [Rdkit-discuss] Creating Mol Object From SD File

2018-08-29 Thread Dimitri Maziuk via Rdkit-discuss
Also, it seems that SDMolSupplier.next() does not work anymore? if sys.version_info[0] == 2 : next() elif sys.version_info[0] == 3 : __next__() else : raise Exception( "Go! is looking better every day" )
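
The version switch in this post can usually be avoided: the built-in `next()` dispatches to whichever method the interpreter expects. The class below is a hypothetical stand-in, assuming only that the real supplier follows the standard iterator protocol.

```python
class Supplier(object):
    """Stand-in iterator exposing both protocol spellings."""

    def __init__(self, items):
        self._items = iter(items)

    def __iter__(self):
        return self

    def __next__(self):      # Python 3 protocol
        return next(self._items)

    next = __next__          # Python 2 spelling

supplier = Supplier(["mol-a", "mol-b"])

# The built-in works unchanged on Python 2.6+ and Python 3.
first = next(supplier)
second = next(supplier)
print(first, second)
```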

Re: [Rdkit-discuss] Compilation Errors on RHEL7

2018-10-24 Thread Dimitri Maziuk via Rdkit-discuss
rdkit container instead. I strongly recommend doing that, or finding a singularity version of the same.

Re: [Rdkit-discuss] Compilation Errors on RHEL7

2018-10-24 Thread Dimitri Maziuk via Rdkit-discuss
On 10/24/2018 12:10 PM, Dimitri Maziuk via Rdkit-discuss wrote: > Yes. I once spent a couple of hours trying and ended up installing docer docker
