Your understanding is correct, though this is not widely appreciated. Many more papers discuss improving the fingerprint filtering, but whether the screen leaves 30% or 20% of the database as candidates matters little, given the orders-of-magnitude differences in speed between subgraph matching algorithms.
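To make the two stages concrete, here is a toy sketch in plain Python. It is not how the paper or any toolkit implements this: the "fingerprint" is just a set of element and bonded-pair features, the matcher is naive backtracking rather than VF2, and all names (fingerprint, has_substructure, substructure_search, DATABASE) are made up for illustration.

```python
# Toy molecules: atoms as element symbols, bonds as (i, j) index pairs.
# Bond orders and hydrogens are ignored to keep the sketch short.
DATABASE = {
    "ethanol": (["C", "C", "O"], [(0, 1), (1, 2)]),
    "propane": (["C", "C", "C"], [(0, 1), (1, 2)]),
    "acetic acid": (["C", "C", "O", "O"], [(0, 1), (1, 2), (1, 3)]),
}


def fingerprint(mol):
    """Feature set: each element present, plus each bonded element pair."""
    atoms, bonds = mol
    features = set(atoms)
    for i, j in bonds:
        features.add("-".join(sorted((atoms[i], atoms[j]))))
    return frozenset(features)


def adjacency(mol):
    """Map each atom index to the set of its bonded neighbours."""
    atoms, bonds = mol
    adj = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def has_substructure(query, target):
    """Naive backtracking subgraph match (an assumption, not VF2)."""
    q_atoms, t_atoms = query[0], target[0]
    q_adj, t_adj = adjacency(query), adjacency(target)

    def extend(mapping):
        q = len(mapping)  # query atoms are placed in index order
        if q == len(q_atoms):
            return True
        for t in range(len(t_atoms)):
            if t in mapping.values() or t_atoms[t] != q_atoms[q]:
                continue
            # Every already-placed neighbour of q must be bonded to t.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})


def substructure_search(query, database):
    """Stage 1: cheap fingerprint screen. Stage 2: exact match per candidate."""
    q_fp = fingerprint(query)
    candidates = [n for n, mol in database.items() if q_fp <= fingerprint(mol)]
    hits = [n for n in candidates if has_substructure(query, database[n])]
    return candidates, hits


candidates, hits = substructure_search(DATABASE["propane"], DATABASE)
print(candidates, hits)
```

With a three-carbon chain as the query, the screen lets all three molecules through because each contains a C-C bond, and only the exact matching step discards the two false positives. That second stage runs once per candidate, which is why its speed dominates.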
Most cheminformatics tools represent molecules as graphs internally. It should not be difficult to add a new output format to Open Babel, for example, or to write a Python script that generates whatever graph format you require. You just need a "for" loop over the bonds.

On Thu, 3 Dec 2020 at 11:09, Steve Vestal <steve.ves...@adventiumlabs.com> wrote:

> Thanks, this was an interesting paper. I am in fact curious about the
> substructure search problem.
>
> I would appreciate a sanity-check on my understanding of this paper. My
> impression was that partially ordered fingerprints are used in an initial
> relational database comparison query to obtain a modestly sized set of
> candidate structures, after which a subgraph matching algorithm (e.g., a
> VF2 variant) is applied sequentially to each element of that set to get an
> exact answer. Is that the general approach? I got the vague impression
> the sequential subgraph matching, not the fingerprint comparison query, is
> the performance bottleneck -- is that generally true in this approach?
>
> To answer the earlier question, I am interested in seeing if graph
> database and description logic technologies can be applied to structure
> queries. To play around with that, I would want a true graph database
> representation of structure. I looked at ChEBI, like PubChem also
> available in RDF format, and like PubChemRDF also encodes structure using
> SMILES strings rather than RDF graphs.
>
> Does anyone know of any structure database that uses an attributed graph
> rather than string representation?
>
> Does anyone know of an open source software package that can convert
> SMILES strings into RDF (brass ring) or any sort of attributed graph data
> structure? What about open source tools to generate graphical
> visualizations from SMILES strings? I assume those would have this
> capability buried inside them. The CDK page cites a few export formats,
> SMILES, SDF, InChI, Mol2, CML, *and others*. Are any of the formats
> attributed graph data structures?
>
> On 12/2/2020 12:09 PM, Egon Willighagen wrote:
>
> Please have a look at:
> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0282-y
>
> On Tue, Dec 1, 2020 at 4:04 PM Steve Vestal <
> steve.ves...@adventiumlabs.com> wrote:
>
>> Does anyone know of a structure database that can be queried using an
>> RDF query language like SPARQL? PubChemRDF can be accessed in RDF
>> format, but it encodes structures as SMILES strings, which cannot be
>> queried in this way.
>>
>> If not, can anyone suggest open source software that might be used to
>> construct a modest RDF dataset from an existing structure database for
>> the purpose of experimenting? For example, software that can translate
>> SMILES strings into an annotated graph data structure of some sort?
>>
>> Thanks in advance for any suggestions.
>
> --
> Have you heard about Wikidata already? "Use Scholia and Wikidata to find
> scientific literature" is a new tutorial from my colleague Lauren Dupuis.
> https://laurendupuis.github.io/Scholia_tutorial/
>
> -----
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286>
> ImpactStory: https://impactstory.org/u/egonwillighagen
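Returning to the "for" loop over the bonds suggested at the top of this reply: a minimal sketch of what such a script could look like, emitting N-Triples. The atoms and bonds are hard-coded here so it runs without any toolkit; in practice they would come from whatever library parsed the SMILES string. The example.org namespace and the predicate names (element, hasAtom, order) are invented for this sketch and are not taken from any existing ontology.

```python
# Hypothetical namespaces and predicates, chosen only for this sketch.
BASE = "http://example.org/mol/"
CHEM = "http://example.org/chem#"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"


def molecule_to_ntriples(mol_id, atoms, bonds):
    """Return N-Triples lines for a molecule.

    atoms: list of element symbols, indexed by position.
    bonds: list of (begin_index, end_index, bond_order) tuples.
    """
    def atom_iri(i):
        return f"<{BASE}{mol_id}/atom{i}>"

    lines = []
    # One triple per atom, recording its element as a plain literal.
    for i, symbol in enumerate(atoms):
        lines.append(f'{atom_iri(i)} <{CHEM}element> "{symbol}" .')
    # The loop over the bonds: each bond becomes a node linked to its
    # two atoms, with the bond order as a typed literal attribute.
    for k, (i, j, order) in enumerate(bonds):
        bond = f"<{BASE}{mol_id}/bond{k}>"
        lines.append(f"{bond} <{CHEM}hasAtom> {atom_iri(i)} .")
        lines.append(f"{bond} <{CHEM}hasAtom> {atom_iri(j)} .")
        lines.append(f'{bond} <{CHEM}order> "{order}"^^<{XSD_INT}> .')
    return lines


# Ethanol heavy atoms (CCO), hydrogens omitted.
triples = molecule_to_ntriples("ethanol", ["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
print("\n".join(triples))
```

Modelling each bond as its own node, rather than as a direct atom-to-atom triple, is one possible design choice: it leaves room to attach attributes such as order or aromaticity to the bond.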
_______________________________________________
Blueobelisk-discuss mailing list
Blueobelisk-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss