Re: [Rdkit-discuss] Code efficiency improvement

2019-12-20 Thread Dimitri Maziuk via Rdkit-discuss
luster and/or are willing to pay for spinning up a bunch of VMs on amazon etc. Otherwise the best you can hope for is to run maybe two per CPU core. -- Dimitri Maziuk Programmer/sysadmin BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu

Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand

2019-12-16 Thread Dimitri Maziuk via Rdkit-discuss
iles haven't been processed by them, all bets are off. And so on

Re: [Rdkit-discuss] Constructing a mol object from a PDB ligand

2019-12-16 Thread Dimitri Maziuk via Rdkit-discuss
On 12/16/2019 10:07 AM, Illimar Hugo Rekand wrote: Would it be viable to create a function where you could create a mol object from specific lines within a pdb-file? PDB file is simple text. There's any number of utilities to extract the lines you want, incl. a plain text editor, why spend
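A minimal sketch of the "just extract the lines you want" approach, assuming standard fixed-column PDB records with the residue name in columns 18-20. The names `extract_residue_lines` and `make_hetatm` and the residue name `LIG` are hypothetical, chosen only for illustration; the extracted text block could then be handed to a PDB reader such as RDKit's `Chem.MolFromPDBBlock`.

```python
def extract_residue_lines(pdb_text, res_name):
    """Return ATOM/HETATM lines whose residue name (columns 18-20) matches."""
    wanted = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and line[17:20].strip() == res_name:
            wanted.append(line)
    return wanted


def make_hetatm(serial, atom_name, res_name, chain, res_seq):
    """Hypothetical helper: build the leading fixed-width columns of a HETATM record."""
    return "{:<6}{:>5} {:<4} {:>3} {}{:>4}".format(
        "HETATM", serial, atom_name, res_name, chain, res_seq)


# Toy input: two ligand atoms and one water (made-up records, not real data).
sample = "\n".join([
    make_hetatm(1, " C1", "LIG", "A", 301),
    make_hetatm(2, " O1", "LIG", "A", 301),
    make_hetatm(3, " O", "HOH", "A", 401),
])

ligand_block = "\n".join(extract_residue_lines(sample, "LIG"))
print(ligand_block)
```

Filtering on a fixed column slice rather than on whitespace-split fields matters here, because PDB fields can run together without separating spaces.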

Re: [Rdkit-discuss] Saving chains from PDB file

2019-10-05 Thread Dimitri Maziuk via Rdkit-discuss
On 10/5/2019 10:34 AM, Maciek Wójcikowski wrote: Paolo and Chris, There actually is Rdkit function to do this very task: SplitMolByPDBChainId Why, though? -- It's a punch-card format with chain id in specific column, you just read the lines and sort them into buckets on line[X]. Unless you
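The "sort them into buckets on line[X]" idea could look roughly like the sketch below. It assumes the chain ID sits in column 22 (index 21) of ATOM/HETATM records, as in the standard PDB layout; `split_by_chain` and `make_atom` are hypothetical names, and the records are made-up toy input.

```python
from collections import defaultdict


def split_by_chain(lines):
    """Sort ATOM/HETATM records into per-chain buckets keyed on column 22."""
    chains = defaultdict(list)
    for line in lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains[line[21]].append(line)
    return dict(chains)


def make_atom(serial, atom_name, res_name, chain, res_seq):
    """Hypothetical helper: build the leading fixed-width columns of an ATOM record."""
    return "{:<6}{:>5} {:<4} {:>3} {}{:>4}".format(
        "ATOM", serial, atom_name, res_name, chain, res_seq)


records = [
    make_atom(1, " N", "ALA", "A", 1),
    make_atom(2, " CA", "ALA", "A", 1),
    make_atom(3, " N", "GLY", "B", 1),
]

buckets = split_by_chain(records)
for chain_id in sorted(buckets):
    print(chain_id, len(buckets[chain_id]))
```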

Re: [Rdkit-discuss] drawing code

2019-08-14 Thread Dimitri Maziuk via Rdkit-discuss
s with labels, I am painfully aware of how crowded the image becomes. Perhaps some day I'll manage to persuade my spectroscopist to use a 3D image instead -- her argument is that having to move the mouse to the other screen to rotate that thing all the time is too distracting.)

Re: [Rdkit-discuss] drawing code

2019-08-13 Thread Dimitri Maziuk via Rdkit-discuss
PS I played with it a bit: the least ugly version is if you MMFF94-optimize it after rdkit.Chem.rdCoordGen.AddCoords(). It's still far from perfect.

[Rdkit-discuss] drawing code

2019-08-13 Thread Dimitri Maziuk via Rdkit-discuss
algorithms. Here's our latest one, for example. The thing about this one is, the molecule itself is not that bad; it's not clear why the picture isn't any better. Enjoy. (Try it in OB if you think RDKit's pix is bad. ;)

Re: [Rdkit-discuss] High-quality matplotlib drawing?

2019-08-09 Thread Dimitri Maziuk via Rdkit-discuss
ut you're right: getting that to work in an interactive display in a notebook would be a hassle.

Re: [Rdkit-discuss] High-quality matplotlib drawing?

2019-08-08 Thread Dimitri Maziuk via Rdkit-discuss
On 8/7/2019 7:20 PM, Wout Bittremieux wrote: ... Unfortunately the quality of the molecule drawing is rather poor (see attachment; nonsensical spectrum and molecule). This seems to be true for non-SVG drawing in general, and unfortunately it's not really possible to combine SVG output with

Re: [Rdkit-discuss] Read only first model of a pdb-file

2019-05-29 Thread Dimitri Maziuk via Rdkit-discuss
IA : REMARK 210 REMARK 210 REMARK 210 BEST REPRESENTATIVE CONFORMER IN THIS ENSEMBLE : """ (the last one is the one I mentioned earlier) You would typically have "lowest energy" as selection criteria and "best representative" is the minimized average of those submit

Re: [Rdkit-discuss] Read only first model of a pdb-file

2019-05-29 Thread Dimitri Maziuk via Rdkit-discuss
multiple conformers in one file, there is at least one in PDB. (It's been a while since I did this so I don't remember the remark number, nor the multi-conformer entry id off the top of my head.)
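For the thread's original question (keeping only the first model), a plain-text approach is one option. The sketch below assumes the file uses the standard MODEL/ENDMDL records to delimit models; `first_model_lines` is a hypothetical name and the sample lines are made up.

```python
def first_model_lines(lines):
    """Keep everything up to and including the first ENDMDL record.

    A file without ENDMDL (a single-model entry) is returned whole.
    """
    kept = []
    for line in lines:
        kept.append(line)
        if line.startswith("ENDMDL"):
            break
    return kept


multi_model = [
    "REMARK 210 toy header line",
    "MODEL        1",
    "ATOM  (coordinates of model 1)",
    "ENDMDL",
    "MODEL        2",
    "ATOM  (coordinates of model 2)",
    "ENDMDL",
]

print("\n".join(first_model_lines(multi_model)))
```

Note that this takes the first model, which is not necessarily the "best representative" one named in the REMARK records.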

Re: [Rdkit-discuss] RDKit Release 2018.09.2 available

2019-02-22 Thread Dimitri Maziuk via Rdkit-discuss
ific version of another package that depends on... turtles all the way down.

Re: [Rdkit-discuss] Warning as error

2019-01-21 Thread Dimitri Maziuk via Rdkit-discuss
ide of the try/catch block and ignore ones not followed by the warning message. (And run with #!/usr/bin/python -u and/or flush sys.stdout/stderr on every iteration for good measure.)

Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Dimitri Maziuk via Rdkit-discuss
ame 3D structure with the same atom labels on round-trip, *as long as you don't removeH/addH/recalculate conformers etc.* (At least on all molecules they tried and I think that includes the entire PubChem.)

Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Dimitri Maziuk via Rdkit-discuss
gh any cheminformatics program, there's a 50% chance you'll get a different molecule out. That's how chemistry works when it meets compsci.

Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-17 Thread Dimitri Maziuk via Rdkit-discuss
les/sdata201773

Re: [Rdkit-discuss] 回复: 回复: Help: How to set timeout for the function namedRunReactants

2018-11-15 Thread Dimitri Maziuk via Rdkit-discuss
- boost - swig - python is not something I'd want to ever become familiar with...

Re: [Rdkit-discuss] svg: next question

2018-11-02 Thread Dimitri Maziuk via Rdkit-discuss
On 11/02/2018 12:19 AM, Greg Landrum wrote: > On Fri, Nov 2, 2018 at 12:32 AM Dimitri Maziuk via Rdkit-discuss < > rdkit-discuss@lists.sourceforge.net> wrote: > >> Does anyone know where TH does >> >> >> >> come from? -- > > > assuming y

Re: [Rdkit-discuss] Plotting values next to atoms

2018-11-02 Thread Dimitri Maziuk via Rdkit-discuss
s draws atom labels: op = dr.drawOptions() for i in range( self._mol.GetNumAtoms() ) : op.atomLabels[i] = self._mol.GetAtomWithIdx( i ).GetSymbol() + str( (i + 1) ) HTH,

[Rdkit-discuss] svg: next question

2018-11-01 Thread Dimitri Maziuk via Rdkit-discuss
not cp1252. Any ideas?

[Rdkit-discuss] svg transparent background

2018-11-01 Thread Dimitri Maziuk via Rdkit-discuss
, -- Dimitri Maziuk Programmer/sysadmin BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu signature.asc Description: OpenPGP digital signature ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo

Re: [Rdkit-discuss] Compilation Errors on RHEL7

2018-10-24 Thread Dimitri Maziuk via Rdkit-discuss
On 10/24/2018 12:10 PM, Dimitri Maziuk via Rdkit-discuss wrote: > Yes. I once spent a couple of hours trying and ended up installing docer docker

Re: [Rdkit-discuss] Compilation Errors on RHEL7

2018-10-24 Thread Dimitri Maziuk via Rdkit-discuss
rdkit container instead. I strongly recommend doing that, or finding a singularity version of the same.

Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-03 Thread Dimitri Maziuk via Rdkit-discuss
"same molecule" for your purposes, then it doesn't matter. If it does mater, then alatis (the link I sent earlier) is the best option that I know of. -- Dimitri Maziuk Programmer/sysadmin BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu signature.asc Description: OpenPGP d

Re: [Rdkit-discuss] Are atom and bond indexes deterministic?

2018-10-02 Thread Dimitri Maziuk via Rdkit-discuss
tion? https://www.nature.com/articles/sdata201773

Re: [Rdkit-discuss] Creating Mol Object From SD File

2018-08-29 Thread Dimitri Maziuk via Rdkit-discuss
Also, it seems that SDMolSupplier.next() does not work anymore? if sys.version_info[0] == 2 : next() elif sys.version_info[0] == 3 : __next__() else : raise Exception( "Go! is looking better every day" )
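One way to sidestep the version check is the built-in `next()`, which works on any iterator in both Python 2.6+ and Python 3. The sketch below uses a hypothetical stand-in class, `FakeSupplier`, rather than a real `SDMolSupplier`; it assumes only that the supplier follows the iterator protocol.

```python
class FakeSupplier(object):
    """Hypothetical stand-in for an iterator-style supplier (not RDKit code)."""

    def __init__(self, items):
        self._items = iter(items)

    def __iter__(self):
        return self

    def __next__(self):          # Python 3 spelling
        return next(self._items)

    next = __next__              # Python 2 spelling, same behaviour


supplier = FakeSupplier(["mol-1", "mol-2"])

# The built-in next() dispatches to whichever method the interpreter expects.
first = next(supplier)
print(first)
```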

Re: [Rdkit-discuss] Can't import Chem from rdkit in Anaconda Python 3.6.5

2018-06-13 Thread Dimitri Maziuk via Rdkit-discuss
nows they're doing and watches the whole thing very carefully. No, it's not your problem, you're doing the best you can, and thank you for that. But the end result is that ready-made builds are getting increasingly too bloated to be of use, and custom builds are too "non-trivial" to attemp

Re: [Rdkit-discuss] Can't import Chem from rdkit in Anaconda Python 3.6.5

2018-06-13 Thread Dimitri Maziuk via Rdkit-discuss
er container, but compiling it on my desktop is simply not worth my time. (And our compute nodes are the same as or older than my desktop, so if it doesn't work on my box, we can't deploy it anywhere.)

Re: [Rdkit-discuss] Can't import Chem from rdkit in Anaconda Python 3.6.5

2018-06-13 Thread Dimitri Maziuk via Rdkit-discuss
On 6/13/2018 10:06 AM, Greg Landrum wrote: Note that my answer assumes that there is a reason that you don't have X11 installed on your linux box. If that's not the case, you should be able to fix things "more easily" by installing X Quite frankly, this is rapidly becoming unusable as a

Re: [Rdkit-discuss] convert a smiles file to a xyz file

2018-05-23 Thread Dimitri Maziuk via Rdkit-discuss
On 5/23/2018 10:23 AM, Chenyang Shi wrote: A separate question is that is the converted molecular structure from SMILES the same as that taken from a crystal structure? Provided there's no undefined/different stereochemistry on SMILES side, no quirks with added protons, and so on and so

Re: [Rdkit-discuss] Atom mapping

2018-05-09 Thread Dimitri Maziuk
the exact same ligand*. (It's the *substructure* bit that I'm not entirely sure about.)

Re: [Rdkit-discuss] mol file parsing, 3D or 2D

2018-01-17 Thread Dimitri Maziuk
On 2018-01-17 10:25, Jason Biggs wrote: For the case in question, I find that if I read in a mol file containing 2D coordinates, and I skip the sanitization step altogether, then the 3D embedding algorithms fail. Well, yes, as I mentioned in the other thread: the only way you can get it to

Re: [Rdkit-discuss] mol file parsing, 3D or 2D

2018-01-17 Thread Dimitri Maziuk
On 2018-01-16 22:46, Greg Landrum wrote: It might be worth thinking about adding an option to the aromaticity perception code to maintain the original bond types and just set the "isAromatic" flag on the bonds. This is how it's modeled in mmCIF chem. comp. It may or may not come from

Re: [Rdkit-discuss] RDKit and Google Summer of Code 2018

2018-01-15 Thread Dimitri Maziuk
n that I don't know if there's anything that works on non-proteins, or can predict disordered regions well etc. If anyone's counting votes, pretty 2D depictions get mine.

Re: [Rdkit-discuss] Issue with the latest RDKit DB build

2017-12-29 Thread Dimitri Maziuk
PS the real question is why you're trying to run psql built with a newer toolset when there's 2 perfectly good ones available: one from the distro vendor and one from postgres repos.

Re: [Rdkit-discuss] Issue with the latest RDKit DB build

2017-12-29 Thread Dimitri Maziuk
5 /usr/lib64/libpanel.so.5.9 /usr/lib64/libpanelw.so.5 /usr/lib64/libpanelw.so.5.9 /usr/lib64/libtic.so.5 /usr/lib64/libtic.so.5.9 /usr/lib64/libtinfo.so.5 /usr/lib64/libtinfo.so.5.9

Re: [Rdkit-discuss] RDkit and Pubchem

2017-12-01 Thread Dimitri Maziuk
es, except where it doesn't.

Re: [Rdkit-discuss] Transparent background for 2D molecule images

2017-11-20 Thread Dimitri Maziuk
On 11/20/2017 04:45 PM, Markus Metz wrote: > opts.clearBackground=False > > or > > opts.setBackgroundColour((1,1,0)) > > are not working for me. What's your output format?

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
On 09/14/2017 03:04 PM, Markus Sitzmann wrote: > Not on Centos 6 - Docker requires Centos 7 for the host system. You can't win... :(

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
On 09/14/2017 02:58 PM, Andrew Dalke wrote: > If only Greg got as much money for long term RDKit support as Red Hat > gets for long term RHEL support. :) Yep. But an rdkit docker container might be feasible.

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
centos-sclo-rh > python27-python-pip.noarch 8.1.2-1.el6 > centos-sclo-rh ... Any guesses as to how many things will break in my infrastructure manglement setup (saltstack) if I enable Software Collections and some

Re: [Rdkit-discuss] ImportError: No module named rdkit

2017-09-14 Thread Dimitri Maziuk
idea to update. Just FYI: python 2.6 is the system python on (at least) RHEL-6 family of linux distros that will be officially with us until June 30, 2024.

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
eas you want it represented as one. Last I looked PDB Ligand Expo had two different benzenes. Their software doesn't (didn't?) do the circle version so they don't have the third one.

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
On 2017-09-13 10:17, Markus Sitzmann wrote: Canonical SMILES are only a very rough approximation for "unique molecule" as they usually don't work well for tautomeric forms of compound. InChI or Standard InChI is much better although also not perfect. ALATIS I linked to above does impose a

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
On 2017-09-13 09:56, TJ O'Donnell wrote: Let the database do the work for you.  Create a canonical SMILES column and/or InChI column and declare them to be unique.  As you insert new rows, postgres will let  you know if there is already a row with the same SMILES or InChI. Here's some help on
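The "declare the column unique and let the database complain" idea can be tried out quickly. The quoted post talks about postgres; the sketch below swaps in Python's built-in `sqlite3` purely so that it runs without a server, and the table name `molecules` and the helpers `make_db` and `insert_if_new` are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory table with a UNIQUE constraint on the InChI column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecules ("
        " id INTEGER PRIMARY KEY,"
        " inchi TEXT UNIQUE NOT NULL)"
    )
    return conn


def insert_if_new(conn, inchi):
    """Insert a row; return False if the database rejects it as a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO molecules (inchi) VALUES (?)", (inchi,))
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
print(insert_if_new(conn, "InChI=1S/H2O/h1H2"))   # first insert of water
print(insert_if_new(conn, "InChI=1S/H2O/h1H2"))   # same string again
```

The uniqueness is only as good as the string: two different strings for what you consider the same molecule will both be accepted.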

Re: [Rdkit-discuss] Non-redundant database of molecules

2017-09-13 Thread Dimitri Maziuk
On 2017-09-13 05:13, Wandré wrote: Compare if the SMILES as already inserted is easy (text compare), but, compare fingerprint of molecule... Here's one option: http://alatis.nmrfam.wisc.edu/ -- you can use string comparison on the resulting inchi string. Dima

Re: [Rdkit-discuss] Is there a Ubuntu ppa or some repository with the latest rdkit release as .deb ?

2017-06-22 Thread Dimitri Maziuk
On 2017-06-22 01:36, Francois BERENGER wrote: make deb # in rdkit source tree Some people might ask for a make rpm target also. You'd have to track any changes that redhat, canonical, suse, and whoever else's out there might make to e.g. filesystem layout, linked libraries, python and so

Re: [Rdkit-discuss] atom indexes and order of atoms in the input file

2017-06-15 Thread Dimitri Maziuk
t back up today, will output a MOL file with atoms ordered as per the article. The downside is it only works on 3D MOLs.

Re: [Rdkit-discuss] atom indexes and order of atoms in the input file

2017-06-15 Thread Dimitri Maziuk
vers doing the crunching are currently down.

Re: [Rdkit-discuss] Memory issue when storing more than 300K mol in a list

2017-06-10 Thread Dimitri Maziuk
On 2017-06-10 07:42, Chris Swain wrote: This sounds like the situation where a database might be a better option, tuned to store fingerprints in RAM? The issue is how much programming time it will take, how much that time is worth, and how many times the solution will be reused. A clever

Re: [Rdkit-discuss] Memory issue when storing more than 300K mol in a list

2017-06-09 Thread Dimitri Maziuk
On 2017-06-09 08:12, Alexis Parenty wrote: Dear Greg and Brian, Many thanks for your response. I was also thinking of your streaming approach! I think the RAM of most machine would deal with lists of 100K mol so we could put the threshold higher than 1000. Actually, I was thinking to monitor
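The streaming approach with a size threshold mentioned in the quoted message might be sketched as below: instead of holding every molecule in one list, process fixed-size batches. `in_batches` is a hypothetical helper, and plain integers stand in for molecule objects.

```python
from itertools import islice


def in_batches(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in data: integers instead of molecules, a tiny threshold instead of 100K.
for batch in in_batches(range(5), 2):
    print(len(batch), batch)
```

Only one batch is alive at a time, so peak memory is bounded by the threshold rather than by the size of the input.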

Re: [Rdkit-discuss] Molecule representation

2017-03-08 Thread Dimitri Maziuk
", "on" ) cmd.png( outfile, width = 800, dpi = 300, ray = 1 ) while threading.active_count() > 2 : time.sleep( 2 ) cmd.quit() HTH,

Re: [Rdkit-discuss] PBF precision is to high to determine good planarity

2017-03-02 Thread Dimitri Maziuk
On 2017-03-02 04:37, Guillaume GODIN wrote: > Based on the precision of the coordinates (in rdkit sdf files it's 4 > digits) can we infer the precision on the PBF value based on that ? Only if you *know* the values are actually accurate to 4 digits and not e.g. were printed as "%.4f" just
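A small illustration of the "%.4f" point: the number of printed digits is a property of the format string, not of the measurement. `as_written` is a hypothetical helper and the values are arbitrary.

```python
def as_written(value):
    """Format a coordinate the way a writer using "%.4f" would."""
    return "%.4f" % value


# A value known to only one decimal place still comes out with four digits.
print(as_written(12.3))
# A value with many more digits is silently rounded to four.
print(as_written(1.0 / 3.0))
```

Both outputs look equally precise on the page, which is why the file alone cannot tell you how accurate the coordinates are.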

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
n a regular basis is a realistic use case, either.

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
we need a billion depictions all at once" implies that you have a billion users looking at them all at once. If you don't, then rapid response is a very interesting academic exercise but its practical usefulness might be somewhat questionable.

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
can guarantee you that a) it's much more than $20, and b) hiring a competent programmer will cost you more than buying a "better computer" and is not guaranteed to result in any appreciable speed-up.

Re: [Rdkit-discuss] drawing code take 3

2016-12-29 Thread Dimitri Maziuk
On 2016-12-29 07:19, John M wrote: > For why you need sub-second depiction consider these times for 92877507 > structures (current size PubChem Compound): > > 1s per structure = 1074 days (~3 years) > 100 ms per structure = 107 days > 1ms per structure = 25 hours The Dilbert answer is buy a
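The quoted timings can be checked with a few lines of arithmetic, assuming the structures are processed one at a time on a single core. The structure count is the one given in the quoted message; the function names are hypothetical.

```python
N_STRUCTURES = 92877507  # PubChem Compound size quoted in the message


def total_seconds(seconds_per_structure, n=N_STRUCTURES):
    """Wall-clock seconds to process n structures serially."""
    return n * seconds_per_structure


def as_days(seconds):
    return seconds / 86400.0


def as_hours(seconds):
    return seconds / 3600.0


print(int(as_days(total_seconds(1.0))))     # whole days at 1 s per structure
print(int(as_days(total_seconds(0.1))))     # whole days at 100 ms per structure
print(int(as_hours(total_seconds(0.001))))  # whole hours at 1 ms per structure
```

Dividing any of these by the number of cores or machines available gives the parallel estimate, which is the point of the "buy a better computer" reply.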

Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Dimitri Maziuk
y, waiting 5 sec for a page refresh wouldn't be great. Maybe not, but depending on how the browser lays out the grid, it may take 5 seconds anyway. My recommendation for that use case would be to pre-generate the images and store the URLs in that database. Which is what we do here. ;)

Re: [Rdkit-discuss] drawing code take 3

2016-12-15 Thread Dimitri Maziuk
On 12/15/2016 02:53 PM, Peter S. Shenkin wrote: > Looks good, but maybe too slow for production use... (?) I wonder what kind of production use would require sub-second wall clock time for this.

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-02 Thread Dimitri Maziuk
On 12/02/2016 03:12 PM, George Papadatos wrote: > Here's a pragmatic idea: ... would it not be safe to > assume that *any *word containing more than 4 'C' or 'c' characters would > only be a SMILES string? pneumonoultramicroscopicsilicovolcanoconiosis
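The proposed heuristic and the counterexample can be put side by side. This is only a sketch of the rule as quoted ("more than 4 'C' or 'c' characters"); `looks_like_smiles` is a hypothetical name.

```python
def count_c(word):
    """Count upper- and lower-case 'c' characters in a word."""
    return sum(1 for ch in word if ch in "Cc")


def looks_like_smiles(word, threshold=4):
    """The quoted rule: more than `threshold` C/c characters means SMILES."""
    return count_c(word) > threshold


for word in ("c1ccccc1", "cat", "pneumonoultramicroscopicsilicovolcanoconiosis"):
    print(word, count_c(word), looks_like_smiles(word))
```

The last word passes the test without being a SMILES string, so the rule would need at least a second check (for example, whether a SMILES parser accepts the word).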

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-29 Thread Dimitri Maziuk
rdized representation of all the properties you consider relevant and produce a unique hash of that. Doesn't matter if it's a SHA-1 string or some graph-based magic or a matrix voodoo. (String comparison is of course easier.)
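A sketch of the hash-and-compare idea, assuming you already have some standardized string for each molecule (the InChI strings below are just placeholders for whatever representation you settle on). `structure_key` is a hypothetical name; SHA-1 is used because the post mentions it, and any stable hash would do for this purpose.

```python
import hashlib


def structure_key(standardized):
    """Return a fixed-length hex key for a standardized representation string."""
    return hashlib.sha1(standardized.encode("utf-8")).hexdigest()


table_a = ["InChI=1S/H2O/h1H2", "InChI=1S/CH4/h1H4"]
table_b = ["InChI=1S/CH4/h1H4"]

keys_a = set(structure_key(s) for s in table_a)
keys_b = set(structure_key(s) for s in table_b)

# Molecules present in both tables, found by plain string (key) comparison.
print(len(keys_a & keys_b))
```

The hash adds nothing chemically; all the hard work is in producing the standardized string in the first place.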

Re: [Rdkit-discuss] comparing two or more tables of molecules

2016-11-28 Thread Dimitri Maziuk
We have no proof that it's 100% correct, but all duplicates it found in the PDB ligand expo at the time were genuine. Enjoy,

Re: [Rdkit-discuss] GenerateDepictionMatching[23]DStructure (a bit off-topic)

2016-11-17 Thread Dimitri Maziuk
537 comes close. Marvin doesn't do much better on this one even if you don't turn on all the labels.

Re: [Rdkit-discuss] SVG BUG (Re: Fwd: 2D drawing with atoms labeled by index)

2016-10-27 Thread Dimitri Maziuk
On 2016-10-26 23:39, Peter S. Shenkin wrote: > Hey, by the way, my agenda is trying to understand all this. (Using python syntax instead of ML) Recommended by TFM: from "http://www.w3.org/2000/svg" import * All svg names should work with or without package qualifier: point(), line(), etc., as

[Rdkit-discuss] SVG BUG (Re: Fwd: 2D drawing with atoms labeled by index)

2016-10-25 Thread Dimitri Maziuk
On 10/25/2016 11:21 AM, Peter S. Shenkin wrote: > Hi, Hongbin, > > Thanks. Indeed. svg2.svg, when renamed to svg2.html, shows the correct > image in Chrome. svg.html shows garbage. > > Still, it would be good to be able to create a real .svg file from RDKit. OK, you made me look and I learned

Re: [Rdkit-discuss] Fwd: 2D drawing with atoms labeled by index

2016-10-24 Thread Dimitri Maziuk
On 2016-10-24 19:04, Peter S. Shenkin wrote: > My second conclusion (based on the .svg-file experiments) is that it's > not an iPython problem and, since you see the same thing on Firefox, > it's unlikely to be a Chrome problem. Well, what I got it from (Greg's I think) tutorial that if you

Re: [Rdkit-discuss] 2D drawing with atoms labeled by index

2016-10-24 Thread Dimitri Maziuk
it either way, unfortunately my target viewer is firefox (it's a web application and the user's default browser is firefox) and firefox isn't one of them. Without svg:'s it'll show the file as xml text instead of the image. HTH

Re: [Rdkit-discuss] 2D drawing with atoms labeled by index

2016-10-24 Thread Dimitri Maziuk
=rdkit.Chem.Draw.rdMolDraw2D.MolDraw2DSVG(800,800) dr.SetFontSize(0.3) op = dr.drawOptions() for i in range(mol.GetNumAtoms()) : op.atomLabels[i]=mol.GetAtomWithIdx(i).GetSymbol() + str((i+1)) rdkit.Chem.AllChem.Compute2DCoords(mol) dr.DrawMolecule(mol) dr.FinishDrawing() svg=dr.GetDrawingText()

Re: [Rdkit-discuss] How to find the idx of hydrogens bonded to a specific atom

2016-10-13 Thread Dimitri Maziuk
ere all hydrogens are explicitly present and indexed. I wonder if they stay that way throughout the steps leading to (and past) the smarts match.

Re: [Rdkit-discuss] The RDKit and modern C++

2016-09-29 Thread Dimitri Maziuk
c++-14 code with c++-03 compilers.

Re: [Rdkit-discuss] The RDKit and modern C++

2016-09-29 Thread Dimitri Maziuk
On 2016-09-29 00:57, Markus Sitzmann wrote: > I get the feeling, RH/Centos 6 becomes the next XP kind of story - too > many legacies that make the update impossible or very hard. Also docker, > a great technology that could mitigate this problem, is very painful > under RH/Centos 6. systemd,

Re: [Rdkit-discuss] drawing code take 3

2016-09-27 Thread Dimitri Maziuk
On 2016-09-26 18:19, Peter S. Shenkin wrote: > 2D drawing code is tough. The 90/10 rule applies: the last 10% of > I think for the present purposes what we need is something correct, > robust and legible, and of course the example shown does not exhibit > that. (But I don't know what the starting

Re: [Rdkit-discuss] drawing code take 3

2016-09-26 Thread Dimitri Maziuk
one(s) with least clashes/overlaps.

[Rdkit-discuss] drawing code take 3

2016-09-26 Thread Dimitri Maziuk
On the plus side, when drawing PubChem CID 5057 from a 3D SDF before and after our canonicalization, RDKit draws a mirror image, but otherwise the same 2D structure. OB's "after" version is attached: enjoy the 7-bond carbon in the ring. ;)

Re: [Rdkit-discuss] The RDKit and modern C++

2016-09-24 Thread Dimitri Maziuk
On 2016-09-24 01:25, Greg Landrum wrote: > https://medium.com/@greg.landrum_t5/the-rdkit-and-modern-c-48206b966218?source=linkShare-d698b3fa9f7-1474698147 > > This is a big and important change and I'd love to hear whatever > feedback members of the community may have. Please comment either on

Re: [Rdkit-discuss] FindChiralCenters in MOL/SDF files howto

2016-09-14 Thread Dimitri Maziuk
On 09/14/2016 02:23 PM, Dimitri Maziuk wrote: > lbl=mol.GetAtomWithIdx(s[0]).GetSymbol() + str(s[0]+1) > print label, ":", s[1] ^^^ Should be lbl

[Rdkit-discuss] FindChiralCenters in MOL/SDF files howto

2016-09-14 Thread Dimitri Maziuk
"DL-Alanine" describing *either* D- or L-Alanine. In this case "unspecified" is the correct value for chirality tag. (And in the case of "2D" SDF it will be; unfortunately PubChem software will generate a "3D" SDF for CID 602 and it will have a single conformer: L-Alanine.

Re: [Rdkit-discuss] Has3D?

2016-09-13 Thread Dimitri Maziuk
a single conformer with ID 0, with the coordinates from the file. `Conformer`s have a `Is3D` method, which *should* do what you want. It does. "There's conformer[0]" is the bit I was missing. It seems to be there for 2D MOLs as well with Is3D() -> False. Thank you.

[Rdkit-discuss] Has3D?

2016-09-13 Thread Dimitri Maziuk
t's one or the other? TIA,

Re: [Rdkit-discuss] AddHs()

2016-09-13 Thread Dimitri Maziuk
ull > and I'm sure he would appreciate the help. GitHub does have a wiki. One has to become a "collaborator" to get edit permissions, AFAIK it doesn't do fine-grained, but what it does should be good enough. The wiki is a git repo itself so it could be pulled and integrated into relea

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
do have atom parity set even though there's no 3D coordinates. So I'll probably go with your solution instead of TagsFromStructure b/c it'll work for both 2D and 3D MOL files. (elif p == 3 -> rdkit.Chem.rdchem.ChiralType.CHI_UNSPECIFIED, of course)

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
On 09/10/2016 04:34 PM, David Cosgrove wrote: ... > Also, the atoms in a molecule should have the property _CIPRank set, you > might be able to do something with that. Possibly, but since the non-typo'ed function seems to do the trick, that's good enough for me. Thanks

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
Oops. AssignAtomChiralTagsFromStructure() does indeed work. >> If your file has 3D coordinates, AssignAtomChrialTagsFromStructure Good to see I'm not the only one with lysdexic fnigers. Apologies for the noise: ;)

Re: [Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-10 Thread Dimitri Maziuk
ore, if the MOL file lacks the bond stereo information chirality >> won't be set. GetProp( "molParity" ) does work, thank you, but as I understand it's based on atom ordering in the CTAB and not on CIP rules. So it's just as good as OB's stereo "feature" for my purposes: e

[Rdkit-discuss] SDF and FindMolChiralCenters()

2016-09-09 Thread Dimitri Maziuk
s( 'C[C@@H](C(=O)O)N' ) instead, the output is C2 : S (this is L-ALA from the same PubChem record as the SDF). So it looks like MOL reader ignores chirality, is that the case? Thanks,

Re: [Rdkit-discuss] AddHs()

2016-09-09 Thread Dimitri Maziuk
hat's what I expected and "removeHs = False" works, thanks. And I was kidding about histidine of course.

Re: [Rdkit-discuss] AddHs()

2016-09-08 Thread Dimitri Maziuk
s in the source file. Which might matter in the case of e.g. stereospecifically assigned methylene protons. (Or so they tell me ;)

Re: [Rdkit-discuss] drawing code take 2

2016-09-08 Thread Dimitri Maziuk
ike someone will sit in front of Marvin and play with options until they get a perfect picture for their paper.

[Rdkit-discuss] drawing code take 2

2016-09-07 Thread Dimitri Maziuk
should have. For reference, CID260719.ob.svg is the other toolkit's rendering of the same file (with atom indexes changed to green from OB's default red).

Re: [Rdkit-discuss] rdMolDraw2D drawing code

2016-09-06 Thread Dimitri Maziuk
).GetSymbol() + str ((i+1))
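The fragment above appears to build an atom label from the element symbol and a one-based index. A self-contained sketch of that idea follows; the ethanol input and the variable names are assumptions, and how the labels are then passed to the drawer is left out because that part of the message is not shown.

```python
from rdkit import Chem

# Hypothetical input; the molecule from the thread is not shown.
mol = Chem.MolFromSmiles("CCO")

# Label each atom as element symbol + one-based index, e.g. "C1".
labels = [
    mol.GetAtomWithIdx(i).GetSymbol() + str(i + 1)
    for i in range(mol.GetNumAtoms())
]
print(labels)
```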

Re: [Rdkit-discuss] rdMolDraw2D drawing code

2016-09-06 Thread Dimitri Maziuk
hat that layout might look better with all the Hs and numbers added, than the one I get (the other 3 pictures).

[Rdkit-discuss] rdMolDraw2D drawing code

2016-09-02 Thread Dimitri Maziuk
? TIA

Re: [Rdkit-discuss] The Chlorine molfile question

2016-01-21 Thread Dimitri Maziuk
e left trying to guess whether a given "CA" stands for C-alpha or calcium.

Re: [Rdkit-discuss] The Chlorine molfile question

2016-01-20 Thread Dimitri Maziuk
d busted PDB format gone, they are not offering a usable alternative that I know of. That's exactly what we've been doing at BMRB, too, and then complaining about low rate of adoption of NMR-STAR by the NMR community.

Re: [Rdkit-discuss] The Chlorine molfile question

2016-01-20 Thread Dimitri Maziuk
On 01/20/2016 04:57 PM, Peter S. Shenkin wrote: > On Wed, Jan 20, 2016 at 5:33 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> > wrote: >> JSON encodes a single string. That is a problem for sending larger files >> over the net, say, an NMR structure of a larger molecule with 1

Re: [Rdkit-discuss] Clustering 1M molecules

2015-08-23 Thread Dimitri Maziuk
and you only intend to run it a handful of times, N^2 is not worth worrying about. Otherwise try to split into smaller batches that you can run in parallel on a cluster of computers. FWIW
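A rough illustration of the batching suggestion, in plain Python. The helper names and the batch size are assumptions, not anything from the thread. Note that pairs falling in different batches are never compared, so some merge step (not shown) would be needed to combine per-batch clusters.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def pair_count(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Small demonstration of the splitting itself.
demo = list(batches(range(10), 4))

# Pairwise comparisons for 1M molecules, all at once vs. 100 batches of 10k.
all_pairs = pair_count(1_000_000)
batched_pairs = 100 * pair_count(10_000)
print(demo, all_pairs, batched_pairs)
```

Each batch is independent, so the batches can be handed to separate machines or processes.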

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Dimitri Maziuk
in a binary computer, you'll have to have 2 distinctly different canonical benzenes. That's just how a binary computer works.

Re: [Rdkit-discuss] SMILES: Why are rings consisting of wildcards assumed to be aromatic?

2015-06-15 Thread Dimitri Maziuk
RDKit does it because Greg said so. HTH

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-04 Thread Dimitri Maziuk
On 2015-05-03 15:06, Michael Reutlinger wrote: Well... I think my proposal should enable us to put more strict, robust QC in place, but I guess you are missing this point. My definition of strict and robust is if the input is bad, what comes out is an out-of-band error signal. Such that

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-01 Thread Dimitri Maziuk
. Without a valid mol block you don't have a molecule and you shouldn't be making one up. As in conservative in what you produce.
