Hi everyone,
We have released mmpdb 3.1, which you can get from
https://github.com/rdkit/mmpdb .
mmpdb 3.0, released May 2023, merged three development tracks:
- create and query 1-cut med chem transformations as described in Awale et al.,
The Playbooks of Medicinal Chemistry Design Moves,
Hi Mandar,
> On Dec 13, 2023, at 03:39, Mandar Kulkarni
> wrote:
> I could not figure out how Rdkit is guessing it as 2D structure, as there is
> no such information in SDF.
Line 2 of the SDF record looks something like:
RDKit 2D
This line has the format (quoting from the
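As a rough sketch of how that field can be read, assuming the fixed-column layout of the molfile header line (two characters of user initials, eight of program name, ten of date/time, then the two-character dimension code). The helper name `dimension_code` is made up for this example and is not an RDKit function.

```python
def dimension_code(record):
    """Return the dimension field ("2D", "3D", or "") from line 2 of a molfile/SDF record."""
    lines = record.splitlines()
    if len(lines) < 2:
        return ""
    header = lines[1]
    # Columns 21-22 (0-based slice 20:22) are assumed to hold the dimension code.
    return header[20:22].strip()


# Build a header line the way the layout above describes it:
# 2 blank initials, program name right-justified in 8 columns, 10 blank date columns.
header_line = "  " + "RDKit".rjust(8) + " " * 10 + "2D"
record = "example\n" + header_line + "\n\nM  END\n"
print(dimension_code(record))
```

If the field is blank or the line is short, the sketch returns an empty string rather than guessing.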
On Sep 26, 2023, at 01:17, Ling Chan wrote:
> >(1)
> 4.099
..
> Just wonder what was the rationale behind this extra "(1)" on the property
> field lines (pKa and logP in the above example)?
>
> And is there a way to get rid of these? I am not sure if this extra "(1)" is
> part of
On Jun 16, 2023, at 03:15, S Joshua Swamidass wrote:
> In graph theory, a planar graph is a graph that can be embedded in the plane,
> i.e., it can be drawn on the plane in such a way that its edges intersect
> only at their endpoints. In other words, it can be drawn in such a way that
> no
On Jun 15, 2023, at 20:49, S Joshua Swamidass wrote:
>
> And what (generally speaking) is the algorithm used by rdkit? Do we know its
> complexity?
https://pubs.acs.org/doi/abs/10.1021/acs.jcim.5b00543
"Get Your Atoms in Order—An Open-Source Implementation of a Novel and Robust
Molecular
On Jun 15, 2023, at 18:20, S Joshua Swamidass wrote:
> It's well known that the graph-isomorphism problem is NP
While P is contained in NP, I don't think that's the NP you mean.
I suspect you may be thinking of subgraph isomorphism, which is NP-hard. Graph
isomorphism may be quasi-polynomial
Hi everyone,
I've just released chemfp 4.1. To install the pre-compiled package for
Linux-based OSes do:
python -m pip install chemfp -i https://chemfp.com/packages/
For a detailed description of what's new, see:
https://chemfp.readthedocs.io/en/latest/whats_new_in_41.html
As a summary,
On May 17, 2023, at 02:31, Vincent Scalfani wrote:
> I thought that this might also be the case for bond indices, but that does
> not appear to be correct (see example below). Is it possible to get a bond
> index in the order of the SMILES?
This may help you understand why that's a difficult
On May 9, 2023, at 07:55, Haijun Feng wrote:
> Can anyone help me figure out how to get each atom with H from the smiles as
> above. Thanks so much!
Try using Chem.MolFragmentToSmiles to get the SMILES for each atom, with all
hydrogens explicit, then strip off the leading and trailing []s.
wildcards, like:
[*:1]c1ncc([*:2])cn1
where [*:1] is the attachment point for R1, [*:2] is the attachment
point for R2, and [*:3] is the attachment point for R3.
The R-group SMILES must have a single unlabeled "*" wildcard, like:
CO*
-or-
C(*)CO
The program is used like this:
python enu
Hi all,
I've recently released chemfp 4.0, with support for several diversity
selection algorithms, and an improved API for interactive use in a notebook
environment.
Chemfp is an analytics package for cheminformatics fingerprints. It contains
command-line tools and an extensive Python
On Apr 14, 2022, at 12:57, Ivan Tubert-Brohman
wrote:
> How about splitting the file on lines consisting of "$$$$", and then parsing
> each record? If the parsing fails, you can write out the bad record for
> future inspection. (This addresses the basic use case, but not the "even
> better"
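A minimal sketch of that suggestion, assuming records end with a line containing only "$$$$" (the SDF record terminator). The names `iter_sdf_records` and `split_good_and_bad` are hypothetical, and `parse` is a placeholder for whatever toolkit call you use to turn record text into a molecule.

```python
def iter_sdf_records(lines):
    """Yield each SDF record as one string, including its "$$$$" terminator line."""
    record = []
    for line in lines:
        record.append(line)
        if line.rstrip("\r\n") == "$$$$":
            yield "".join(record)
            record = []
    if record:
        # Trailing text without a terminator; pass it along so nothing is lost.
        yield "".join(record)


def split_good_and_bad(lines, parse):
    """Parse each record; keep the raw text of records that raise or return None."""
    good, bad = [], []
    for text in iter_sdf_records(lines):
        try:
            result = parse(text)
        except Exception:
            result = None
        if result is None:
            bad.append(text)
        else:
            good.append(result)
    return good, bad
```

The bad records can then be written to a separate file for later inspection. Note that this does not handle a data field whose value is itself a line of "$$$$".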
On Apr 14, 2022, at 09:16, Gyro Funch wrote:
> I don't know the sdf format well, so please excuse my ignorance, but instead
> of a custom parser, would it be possible to write a preprocessor to eliminate
> the offending information? Perhaps something using regular expressions in
> python,
Hi all,
The combination of crowd-funding and contract work for me, and methods +
software development by Mahendra Awale, has resulted in a new version of mmpdb.
More specifically, version 3.0 beta 1 is available on GitHub at:
https://github.com/adalke/mmpdb/tree/v3-dev
The CHANGELOG
Hi Gyro,
> On Dec 8, 2021, at 11:02, Gyro Funch wrote:
>
> My work is in the area of toxicology and I am interested in generating SMILES
> for molecules referred to as 'short chain chlorinated paraffins' (SCCP).
>
> A general definition that is sometimes used is that an SCCP is given by the
Hi Tim,
You might also consider using chemfp, which has this sort of functionality
available through its toolkit wrapper API:
from chemfp import rdkit_toolkit as T
import itertools
with T.read_ids_and_molecules("chembl_28.sdf.gz") as reader:
loc = reader.location
for id, mol in
Hi Ling,
If there are symmetries then a substructure search will only give you
one mapping, and that might not be the canonical mapping.
What you're looking for is the special property _smilesAtomOutputOrder
>>> from rdkit import Chem
>>> mol =
> On Oct 21, 2021, at 04:50, Ling Chan wrote:
>
> I got the attached sdf. When I did a MolToSmiles, it gives me the following.
>
> >>> for m in Chem.SDMolSupplier("pdb_structures/1q6k_ligand.sdf"):
> ... print (Chem.MolToSmiles(m))
> ...
>
On Jul 23, 2021, at 06:42, Andrew Dalke wrote:
>
> No, there's no way to do that.
>
> The best I can suggest is to go back to the original Python implementation
> and change the code leading up to
Alternatively, since your template is small, you can brute-force enumerat
On Jul 23, 2021, at 01:01, Gustavo Seabra wrote:
> I actually want the sulfone to be found, if it is there. My problem is that I
> also want flexibility to change the ring atoms and still find the ring as a
> match, while considering a match on the sulfone only if it really is there.
> (e.g.,
Hi Gustavo,
> template =
> Chem.MolFromSmarts('[a]1(-[S](-*)(=[O])=[O]):[a]:[a]:[a]:[a]:[a]:1')
Unless things have changed since I last looked at the algorithm, you can't
meaningfully pass a SMARTS-based query molecule into the MCS program, outside
of a few simple cases.
It generates a
> On Jun 30, 2021, at 04:20, Francois Berenger wrote:
>
> On 29/06/2021 12:26, Greg Landrum wrote:
>> Hi Leon,
>> You can convert the tanimoto distance to similarity, but the formula
>> is:
>> Similarity = 1 - Distance
>
> In other words:
>
> Tanimoto_distance = 1.0 - Tanimoto_score
As a
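To make the relationship concrete, here is a small sketch using Python sets of "on" bit positions as stand-in fingerprints. The function names are illustrative only, and returning 0.0 for two empty fingerprints is an assumption of this sketch, not something stated in the thread.

```python
def tanimoto_similarity(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit positions."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / union


def tanimoto_distance(a, b):
    """Distance is simply one minus the similarity."""
    return 1.0 - tanimoto_similarity(a, b)


fp1 = {1, 5, 9, 12}
fp2 = {1, 5, 7}
print(tanimoto_similarity(fp1, fp2), tanimoto_distance(fp1, fp2))
```

Here the two sets share 2 bits out of 5 in the union, so the similarity is 2/5 and the distance is 3/5.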
Hi all,
I'm excited about a tool I developed for the Open Force Field Initiative and
thought to share a bit about it here.
It's called "off_coverage", currently in a pull-request at
https://github.com/openforcefield/cheminformatics-toolkit-equivalence and also
available from my Sourcehut
On May 20, 2021, at 03:17, Francois Berenger wrote:
> Weren't the path-based FPs formally described somewhere?
What does "formally" mean?
Daylight rarely participated in the academic literature tradition.
They instead preferred to publish their information directly, as Pat mentions:
On
On May 12, 2021, at 05:08, Francois Berenger wrote:
> Or, more generally, flag a given atom in a molecule
> and ask rdkit to not start the corresponding SMILES with
> this atom, any unflagged atom being fine.
Perhaps do the opposite and use rootedAtAtom to have RDKit start with a
specific atom
Hi Ling,
> On Apr 2, 2021, at 16:23, Ling Chan wrote:
>
> Thank you Francois, I took a look at your code and borrowed parts of it to
> rejoin two molecules. It seems like my problem is solved. I eventually
> arrived at something like example 4 in
>
On Mar 31, 2021, at 21:55, Ling Chan wrote:
> I am trying to do something that I think is quite simple, but I have not
> figured out a simple way. Don't know if I am missing something. I am sure
> that ultimately I can figure it out, but I wonder if there is a good way.
If you can work in
> On Mar 13, 2021, at 20:29, Marawan Hussien via Rdkit-discuss
> wrote:
> my question is if this is the valid approach of comparison, particularly if
> the class sizes vary widely and the average similarity will be inevitably
> affected by the size of each item in each pair. As a check, it
Hi all,
I've just released chemfp 3.5.1 with support for "licensed
FPB files". These are fingerprint datasets which can be used
under the terms of chemfp's base license agreement even without
a chemfp license key or source code distribution.
As the first (and so far only) data set, I've
625 0.640 0.500
CHEBI:1895 9 9 9 8 0.409 0.333 0.409 0.320
...
Finally, nearly all of the MCS parameters can be configured on the command-line.
This program was written by Andrew Dalke .
"""
import sys
import argparse
from rdkit import Chem
from rdkit.Chem import rdF
On Oct 26, 2020, at 17:41, Cyrus Maher wrote:
> I’m wondering if there is an easy way to retrieve the atom numbers that the
> morgan fingerprints algorithm assigns as its first step.
Many of the fingerprint functions support an optional "bitInfo" parameter. If
it's a dictionary then the keys
S strings
before and after the conversion.
Best regards,
Andrew
da...@dalkescientific.com
# Copy charges from the "M CHG" data lines to the atom block
# Written by Andrew Dalke, 2 October 2020
import argparse
import sys
import gzip
# This requires
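The script itself is cut off here. As a separate, hedged sketch of the core idea (not the code from the post), the following reads the charges from an "M  CHG" property line and maps each to the charge code used in the V2000 atom block. The function names are hypothetical; the code table assumes the V2000 convention (1 = +3, 2 = +2, 3 = +1, 5 = -1, 6 = -2, 7 = -3, 0 = uncharged).

```python
# Atom-block charge codes, assuming the V2000 molfile convention.
CHARGE_TO_CODE = {3: 1, 2: 2, 1: 3, -1: 5, -2: 6, -3: 7}


def parse_chg_line(line):
    """Return {atom_index: charge} from one "M  CHG" line (atom indices are 1-based)."""
    fields = line.split()
    if fields[:2] != ["M", "CHG"]:
        raise ValueError("not an M  CHG line: %r" % (line,))
    count = int(fields[2])
    values = [int(x) for x in fields[3:]]
    if len(values) != 2 * count:
        raise ValueError("entry count does not match: %r" % (line,))
    # Values alternate: atom index, charge, atom index, charge, ...
    return dict(zip(values[0::2], values[1::2]))


def charge_code(charge):
    """Map a formal charge to its atom-block code; 0 if it has no code."""
    return CHARGE_TO_CODE.get(charge, 0)


print(parse_chg_line("M  CHG  2   1   1   4  -1"))
```

Charges outside the -3 to +3 range have no atom-block code, so the sketch falls back to 0 for them.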
On Sep 9, 2020, at 04:00, Lewis Martin wrote:
> I'd like to keep it FOSS since it's for academic publication and hopefully to
> be re-used. Chemfp is amazing but brute-forcing 100million by 100million
> would surely still take a long time compared with an approximate nearest
> neighbor
On Sep 8, 2020, at 14:30, Mike Mazanetz wrote:
> Does anyone know whether it’s possible to obtain not just the fingerprint keys
> for MACCS (binary values) but the number of occurrences of the keys,
> particularly these details:
The SMARTS patterns for most of the MACCS keys are available by:
On Jun 25, 2020, at 16:27, Andrew Dalke wrote:
>
> See https://chemfp.com/license/ for details, or to get started:
>
> python -m pip install chemfp -i https://chemp.com/packages/ --upgrade
That should be
python -m pip install chemfp -i https://chemfp.com/packages/ --u
Hi RDKit'ers,
I've just released new versions of chemfp. Version 1.6 is the no-cost/open
source version, and 3.4 is the commercial version.
The goal of chemfp 1.6 is to provide a good performance baseline for evaluating
new Tanimoto search programs. This release is about 10-20% faster than
On May 31, 2020, at 15:23, Chris Swain via Rdkit-discuss
wrote:
> I’d like to include the number of sp3 atoms, is there an easy way to do this?
I don't easily see a function for that. There's
rdMolDescriptors.CalcFractionCSP3() which "returns the fraction of C atoms that
are SP3 hybridized".
On Feb 8, 2020, at 17:55, Janusz Petkowski wrote:
>
> If not how can I match cases where in a given position there can be C or H
> with rdkit?
I believe you should use #1 instead of H.
>>> from rdkit import Chem
>>> mols = [Chem.MolFromSmiles(s) for s in ["C(=O)OC", "C(=O)OCC", "C(=O)OCCC"]]
Hi all,
This is the last email I'll send asking for people and organizations to join
the current mmpdb crowdsourcing effort.
I've discussed it several times before here. In summary, I'm looking for
crowdfunding for the matched molecular pair program 'mmpdb'. This is part of a
test to find
On Jan 22, 2020, at 14:12, Greg Landrum wrote:
> As an aside: it's not particularly relevant to this discussion, but I don't
> understand why the wikipedia page says that the compound is anti-aromatic. I
> think the standard definition of anti-aromaticity (agrees with the one linked
> to from
Hi all,
Could someone explain the following, which uses the SMILES from
https://en.wikipedia.org/wiki/Acepentalene :
>>> from rdkit import Chem
>>> Chem.CanonSmiles("C1=CC2=CC=C3C2=C1C=C3")
'c1cc2ccc3ccc1-c=3-2'
>>> import rdkit
>>> rdkit.__version__
'2019.09.1'
I don't understand the aromatic
On Dec 12, 2019, at 17:39, Rafal Roszak wrote:
> I also had situation when I need to generate smiles with either
> isotopes or stereochemistry but not both. Maybe it is worth to add two
> options to ChemMolToSmiles function:
>
> dontIncludeStereochemistry=True/False
>
Hi all,
Is there any way to assign all bond directions (E/Z stereochemistry) to the
output SMILES string?
For example, here's a structure:
>>> mol = Chem.MolFromSmiles(r"F/C(Cl)=C(O)/N")
>>> Chem.MolToSmiles(mol)
'N/C(O)=C(/F)Cl'
It's a minimal definition, in that I could have specified the
On Nov 18, 2019, at 17:40, David Cosgrove wrote:
>
> Point taken. I don’t think you’d be able to get RDKit to spit such SMILES
> strings out unless you tortured it pretty hard, however.
Did someone mention one of my favorite things to do? :) See:
Hi all,
The end of the year is coming up. Perhaps there's extra money in your budget
which can go to support open source development in cheminformatics?
As many of you know, I started a crowdfunding effort to fund improvements to
the matched molecular pair program "mmpdb". I want to see if
Dear Stéphane,
> On Oct 16, 2019, at 19:39, Téletchéa Stéphane
> wrote:
> Did you 'by chance' transmit your presentation in PDF?
Yes, I exported my Keynote.app presentation to PDF.
However, I also sent the specific commands in email as plain text, as part of
the process of trying to
Hi all,
I wasn't able to give my RDKit training session at the last UGM, so I passed
out the presentation materials to the students who signed up. One of them wrote
to me asking why the following didn't display an error message in the notebook.
from rdkit import Chem
from rdkit.Chem.Draw
On Oct 3, 2019, at 20:34, Ondrej Gutten via Rdkit-discuss
wrote:
> # MCS is a benzene
> my_mcs = Chem.MolFromSmiles(res.smarts)
The res.smarts (or res.smartsString if you use the rdFMCS module) returns a
SMARTS string, not a SMILES string. You should be using Chem.MolFromSmarts() in
the
Hi all,
In August I sent a pre-announcement email about my mmpdb crowdfunding project.
The project is now live, at http://mmpdb.dalkescientific.com/ .
The basic idea is that I can commit to developing a few features for mmpdb.
• Postgres support, as an alternative to the existing SQLite
Hi Jan,
The GetMorganFingerprint() returns count fingerprints, and the Tanimoto
calculation does the full Jaccard similarity, including the counts.
The GetMorganFingerprintAsBitVect() version only uses the keys (that is, it
treats all non-zero values as being 1) when computing the Tanimoto.
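The difference can be illustrated without RDKit. In this sketch a fingerprint is a plain mapping from feature key to count; the function names and the example keys are made up, and the folding to a fixed number of bits that the bit-vector version also performs is ignored.

```python
def count_tanimoto(a, b):
    """Jaccard similarity on counts: sum of minimums over sum of maximums."""
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    total = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    return shared / total if total else 0.0


def binary_tanimoto(a, b):
    """Treat every non-zero count as 1 and compare only the sets of keys."""
    on_a = {k for k, v in a.items() if v}
    on_b = {k for k, v in b.items() if v}
    union = on_a | on_b
    return len(on_a & on_b) / len(union) if union else 0.0


fp1 = {101: 3, 202: 1, 303: 2}
fp2 = {101: 1, 202: 1, 404: 1}
print(count_tanimoto(fp1, fp2), binary_tanimoto(fp1, fp2))
```

With these made-up counts the count-based score is 2/7 while the key-only score is 2/4, which is why the two functions can give different values for the same pair of molecules.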
Hi Jameed,
I don't think your approach will work, which means I likely didn't explain
myself well enough.
Let's say I start with:
Cc1cc2c2c(=O)o1 -
https://cactus.nci.nih.gov/chemical/structure/Cc1cc2c2c(=O)o1/image
I want to break the aromatic bond between the aromatic 'c'
On Aug 21, 2019, at 03:42, Francois Berenger wrote:
> Unless rdkit has something, I think graph edit distance is the kind
> of things for which you have to rely on a good graph library.
Do you know of any (non-chemical) graph library which can handle edits
involving the breaking of aromatic
Hi all,
Someone asked me recently about finding the graph edit distance of two small
(<= 14 atom) fragments.
I figured this was something that could be brute forced. Following SmallWorld's
example at https://cisrg.shef.ac.uk/shef2016/talks/oral13.pdf , given a
fragment, incrementally delete
On Aug 7, 2019, at 13:08, Paolo Tosco wrote:
> You can use
>
> Chem.MolFragmentToSmiles(mol, match)
>
> where match is a tuple of atom indices returned by GetSubstructMatch().
Note however that if only the atom indices are given then
Chem.MolFragmentToSmiles() will include all bonds which
or consulting work.
3) Where should I send questions and suggestions?
Right now, private email to me is the best. I'll set up a mailing list and
project web page if I get preliminary feedback that it's worth my time to go
further with this trial.
Thanks for reading to the end!
On Mar 27, 2019, at 13:26, Chris Swain via Rdkit-discuss
wrote:
> This is an interesting discussion and suspect this does not only apply to
> open-source software developers, there are similar challenges for small
> independent software companies.
My points were focused on the disadvantages
On Mar 27, 2019, at 16:44, Bennion, Brian via Rdkit-discuss
wrote:
> One of the goals of ATOM is to fund work that will be open sourced. I think
> any of the partners can choose to hire consultants for the work.
>
> https://atomscience.org/
> Atom
> atomscience.org
I think there are only
On Mar 27, 2019, at 08:24, Francois Berenger wrote:
> As an open-source project, I feel rdkit is quite successful.
> So, the user community is not so small.
> Some people who cannot contribute time could contribute money to the project
> (especially if it is tax-deductible, I guess).
I think the
On Mar 25, 2019, at 04:05, Francois Berenger wrote:
> Sometimes, I wish there was a rdkit consortium/NPO (so that donations are tax
> deductible), so that rdkit could be massively funded by all its commercial
> users, and even accepting individual donations.
Setting up such an organization is
Hi RDKit users,
This week I submitted a paper about chemfp for publication. I also submitted
a preprint on ChemRxiv, which was just accepted.
For those interested, it's at
https://chemrxiv.org/articles/The_Chemfp_Project/7877846 .
It's a rather long paper as it covers many aspects about the
On Nov 19, 2018, at 04:17, Rajarshi Guha wrote:
> Hi, I check out the latest RDKit sources from master and I'm trying to
> compile the PBF. However, the compilation fails reporting that
> RDGeneral/export.h is missing:
While this doesn't answer the question, it seems to be coupled to
On Sep 26, 2018, at 20:26, Peter S. Shenkin wrote:
> Ah, David, but how do you define a "real" singleton?
There can be many different definitions of what a '"real" singleton' might be,
but we are specifically talking about Butina clustering.
The Butina paper defines the term "false singleton",
On Sep 25, 2018, at 17:13, Peter S. Shenkin wrote:
> FWIW, in work on conformational clustering, I used the “most representative”
> molecule; that is, the real molecule closest to the mathematical centroid.
> This would probably be the best way of displaying a single molecule that
> typifies
On Sep 21, 2018, at 14:53, Philipp Thiel
wrote:
> you probably read about the Tanimoto being a proper metric in case of having
> binary data
> in Leach and Gillet 'Introduction to Chemoinformatics' chapter 5.3.1 in the
> revised edition.
What we call Tanimoto is more broadly known as the
On Sep 7, 2018, at 22:22, Alexey Orlov wrote:
> I'm trying to calculate the number of equivalent/nonequivalent neighbor
> heteroatoms for each atom i of molecule m.
>
> For example, the third carbon atom of molecule CC(OH)CC has two
> nonequivalent neighbors: one carbon atom connected to OH
Hi all,
As you may know, I will be offering a free Python/RDKit training session
before the UGM in a couple of weeks. This is a beginner level course for people
with some programming experience. It will cover the basics of Python, RDKit,
JupyterLab, Pandas, Scikit-Learn and more. The goal is
On Aug 31, 2018, at 15:27, Axel Pahl wrote:
> on Linux, using Anaconda, RDKit and Python 3.6, I always need to additionally
> install cairocffi via pip:
Thanks Axel.
When I tried it out, I figured out the more likely problem - I was using
jupyter from a non-conda virtualenv.
My problem
On Aug 31, 2018, at 11:58, Andrew Dalke wrote:
> I am unable to see an inline structure depiction in the Jupyter notebook, nor
> in the JupyterLab notebook, tested with both the Python 2 and Python 3
> kernels, and rdkit.__version__ '2018.03.1'.
I've narrowed it down to the Cairo code
Hi all,
I am unable to see an inline structure depiction in the Jupyter notebook, nor
in the JupyterLab notebook, tested with both the Python 2 and Python 3 kernels,
and rdkit.__version__ '2018.03.1'.
I installed miniconda and RDKit on my Mac using:
curl -O
On Aug 31, 2018, at 07:41, Paolo Tosco wrote:
> this gist should do what you need:
Unless I misinterpreted what Jim is looking for, I don't think that returns the
contiguous rotatable bonds in a small molecule.
In the following there are only two rotatable bonds:
>>> mol =
Thanks for the responses. I'll merge them into one reply:
On Aug 29, 2018, at 16:56, Eloy Félix wrote:
> If you want to build model I guess that what you want is to get experimental
> logp values.
>
> This should give you something to start with:
>
> select ACTIVITY_ID, MOLREGNO,
Hi all,
I am starting to put together materials for the Python/RDKit training course
I'm giving just before the RDKit UGM next month.
I would like to structure part of it around the SQLite release of the ChEMBL
data set. More specifically, I plan to include examples of machine learning
with
On Aug 23, 2018, at 07:18, Roman Bolzern wrote:
> Dear RDKittens,
I would prefer to not be called a 'kitten'.
> https://www.rdkit.org/docs/Cartridge.html#license, and at the bottom it says
> “This work is licensed under the Creative Commons Attribution-ShareAlike 4.0
> License”,
...
> Is
On Jun 29, 2018, at 02:43, 藤秀義 wrote:
> Although not strictly based on the number of atoms, but on the length of
> SMILES string, the simplest way is using Python built-in functions as follows:
>
> smiles = 'CCC.CC'
> fragment = max(smiles.split('.'), key=len)
> print (fragment)
The mmpdb
On Jun 28, 2018, at 22:08, Paolo Tosco wrote:
> if you wish to keep only the largest disconnected fragment you may try the
> following:
>
> mols = list(rdmolops.GetMolFrags(mol, asMols = True))
> if (mols):
> mols.sort(reverse = True, key = lambda m: m.GetNumAtoms())
> mol = mols[0]
A
Hi Андрей,
The GetMorganFingerprint function takes additional parameters. From
http://rdkit.org/Python_Docs/rdkit.Chem.rdMolDescriptors-module.html#GetMorganFingerprint
GetMorganFingerprint( (Mol)mol, (int)radius
[, (AtomPairsParameters)invariants=[]
[, (AtomPairsParameters)fromAtoms=[]
> On Jun 17, 2018, at 21:04, Raghuram Srinivas
> wrote:
> Is there a way to convert a bit string of 2048 bits back to the SMILES /
> BitVector representation of the molecule? Any help /pointers in this
> direction will be much appreciated .
That topic came up on this list in April of this
On Jun 12, 2018, at 18:00, Bennion, Brian via Rdkit-discuss
wrote:
> Does RDkit support boron in SMILES strings? We have a number of compounds
> for which rdkit parsing is not successful. The commonality is that there is
> a B or b listed in the string.
RDKit supports boron, including
On May 18, 2018, at 17:48, Jennifer Hemmerich
wrote:
> I really liked the idea and I implemented it as follows:
> df = pd.DataFrame(columns=counts.keys())
> for i,fp in enumerate(allfps):
> logger.debug('appending %s', str(i))
>
And I have uploaded a source tar.gz and a binary wheel to PyPI.
That means you can do "pip install mmpdb" to install this most recent version.
Andrew
da...@dalkescientific.com
> On May 9, 2018, at 18:04, Kramer, Christian
Dear Marco,
> On May 7, 2018, at 23:59, Marco Stenta wrote:
> I had some time to set an environment for it and test it: it works fine, as
> far as my tests go. I will switch to this version and to the latest RDKIT now.
Thanks for the feedback. Someone else sent me a
On Apr 27, 2018, at 00:20, Andrew Dalke <da...@dalkescientific.com> wrote:
> Please try out:
> http://dalkescientific.com/mmpdb-2.1b1.tar.gz
>
> or my fork at:
> https://github.com/adalke/mmpdb
>
> and let me know of any problems.
Has anyone downloaded and tes
On Apr 27, 2018, at 00:20, Andrew Dalke <da...@dalkescientific.com> wrote:
> It does not appear that the .fragment files also need to be redone, so
> rebuilding the .mmpdb file is mostly a matter of re-running the index step.
I no longer think that is correct. While indexi
On Apr 26, 2018, at 12:38, Andrew Dalke <da...@dalkescientific.com> wrote:
> The automated mmpdb test suite isn't that good, so I still need to do some
> manual testing. I won't be able to get to this until (hopefully) this evening.
I did that, and tracked down one more bug.
Pl
On Apr 26, 2018, at 10:09, Marco Stenta wrote:
>
> Dear Colleagues,
> I just installed on conda env the new rdkit version
> and wanted to try mmpdb but upon testing I got the error below
> reverting back to rdkit=2017.09.3.0 it works fine (I still get some errors
> but
On Apr 25, 2018, at 01:31, David Hall wrote:
> You need to turn off RDK_INSTALL_INTREE
Thanks! I've put that in my build notes for the next time I compile RDKit.
BTW, a quick benchmark of the new release shows that it's almost 15% faster at
parsing SMILES strings than
> On Apr 23, 2018, at 10:43, Greg Landrum wrote:
>
> I'm pleased to announce that the next version of the RDKit - 2018.03 - is
> released. The release notes are below.
...
> Please let me know if you find any problems with the release or have
> suggestions for the
On Apr 23, 2018, at 14:54, Brian Cole wrote:
> Unfortunately it doesn't work on circular/ECFP-like fingerprints.
To be fair, you didn't mention that was a requirement. ;)
> It has the requirement that the fingerprint be a substructure fingerprint as
> you described.
Could
On Apr 22, 2018, at 20:22, Nils Weskamp wrote:
> Actually, I *was* also thinking about your use cases 2 and 3 since you
> also need some form of hash function to map substructures to bit
> numbers. This is normally a rather simple function / pseudo random
> generator,
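For readers unfamiliar with that step, here is a generic sketch of mapping a substructure key to a bit number. It uses SHA-256 from the standard library purely for illustration; it is not how RDKit or any particular toolkit hashes its features, and the function names and the default of 2048 bits are assumptions of this example.

```python
import hashlib


def substructure_to_bit(key, num_bits=2048):
    """Map a substructure key (e.g. a canonical fragment string) to a bit number."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and fold it into the bit range.
    return int.from_bytes(digest[:8], "big") % num_bits


def fingerprint(keys, num_bits=2048):
    """Return the set of on-bit positions for a collection of substructure keys."""
    return {substructure_to_bit(k, num_bits) for k in keys}


print(sorted(fingerprint(["CCO", "CC", "CO"])))
```

Because many keys can land on the same bit, the mapping is lossy, which is part of what makes going from a fingerprint back to a structure hard.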
On Apr 22, 2018, at 08:42, Nils Weskamp wrote:
> Nice work. If brute-force approaches like this (or methods based on
> genetic algorithms etc.) are the only way to reverse a fingerprint, one
> could probably come up with a fingerprint that allows for pretty secure
>
On Apr 21, 2018, at 01:55, Andrew Dalke <da...@dalkescientific.com> wrote:
> Hand-waving sketch: start with a carbon. Generate fingerprint. It should pass
> the screening test. If not, the structure contains no carbons, so repeat with
> other elements until you find an at
On Apr 20, 2018, at 19:03, jeff godden wrote:
>
> Long ago molecular fingerprints were referred to in the literature as
> molecular hash functions. (y'know, those crazy mathematical algorithms which
> permitted rapid lookup of some string in a lookup table)
Do you have a
On Apr 16, 2018, at 16:29, Guillaume GODIN
wrote:
> And for this one C[C@@]12CC[C@@](C)(CC1)O2O any idea
>
> Cause your tool failed too.
It's true that smiview failed, in the sense that it shouldn't have tried to do
further analysis with a molecule that RDKit
If you try this out with my smiview package, available from
https://bitbucket.org/dalke/smiview/downloads/ , it reports:
% smiview 'C\(C(C)C)=N/O'
Cannot parse --smiles: Unexpected term
C\(C(C)C)=N/O
^ Tokenizing stopped here
A bond must be followed by an atom, closure.
That is, the bond
On Apr 16, 2018, at 05:37, Patrick Walters wrote:
>
> Thanks Andrew, the SMILES approach seemed to have quite a few edge cases so I
> wrote something to work directly on a molecule.
That's the approach I started with, until I figured out that it doesn't
preserve
Hi Pat,
I wrote something like this for mmpdb, which is the MMPA code I helped
develop, at https://github.com/rdkit/mmpdb .
It has one restriction, which I'll get to in a moment.
The general idea is to convert the attachment points to closures, join them
with a ".", and canonicalize:
>>>
On Apr 7, 2018, at 07:13, Greg Landrum <greg.land...@gmail.com> wrote:
> Andrew Dalke (Dalke Scientific) will offer a course on Python and the RDKit
I need to finalize what I'm going to cover. I've been going between two
approaches.
1) Python programming for cheminformatics
This
About 10 days ago I posted a prototype program called 'smiview', which displays
information about the structure of a SMILES string.
Thanks to feedback from a couple of users, and a deep urge to explore the idea,
I've just released smiview 1.2, available from
Over the last few days I've developed a command-line tool that I call "smiview".
It's a SMILES viewer. It isn't a depiction tool where the input is in SMILES
but rather a tool to highlight different aspects of the SMILES string.
I'll put some examples at the end. If you want to try it out you