I looked around and experimented a bit, and it seems pretty straightforward to use Apache Jena and Open Babel to build an RDF graph representation of each structure from the existing SMILES strings and insert it into chebi.owl.
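A minimal sketch of that kind of SMILES-to-RDF translation, in plain Python rather than the Jena/Open Babel combination itself. It handles only a small SMILES subset (unbracketed atoms, explicit bond symbols, branches, single-digit ring closures), and the `http://example.org/chem#` namespace and the terms `Molecule`, `Atom`, `Bond`, `hasAtom`, `hasBond`, `connects`, `symbol` and `order` are placeholders invented for illustration, not an existing vocabulary.

```python
# Sketch only: a tiny SMILES subset -> N-Triples lines.
# All terms under EX are hypothetical placeholders, not a published vocabulary.

EX = "http://example.org/chem#"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
TWO_LETTER = ("Cl", "Br")


def parse_smiles(smiles):
    """Return (atoms, bonds) for a small SMILES subset.

    Supported: unbracketed atoms, the bond symbols - = #, branches, and
    single-digit ring closures. Aromatic (lowercase) atoms, charges,
    stereochemistry and '.' are rejected with ValueError.
    """
    atoms, bonds = [], []   # bonds are (atom_index, atom_index, order)
    prev = None             # index of the atom the next atom attaches to
    pending = 1             # bond order to use for the next connection
    stack = []              # branch points
    rings = {}              # open ring-closure digits
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch in BOND_ORDER:
            pending = BOND_ORDER[ch]
        elif ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch.isdigit():
            if ch in rings:
                other, order = rings.pop(ch)
                bonds.append((other, prev, max(order, pending)))
            else:
                rings[ch] = (prev, pending)
            pending = 1
        elif ch.isupper():
            symbol = smiles[i:i + 2] if smiles[i:i + 2] in TWO_LETTER else ch
            i += len(symbol) - 1
            atoms.append(symbol)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, pending))
            prev, pending = idx, 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
        i += 1
    return atoms, bonds


def to_ntriples(mol_id, smiles):
    """Serialize the parsed structure as a list of N-Triples lines."""
    atoms, bonds = parse_smiles(smiles)
    mol = f"<{EX}{mol_id}>"
    lines = [f"{mol} <{RDF_TYPE}> <{EX}Molecule> ."]
    for i, symbol in enumerate(atoms):
        atom = f"<{EX}{mol_id}_a{i}>"
        lines.append(f"{mol} <{EX}hasAtom> {atom} .")
        lines.append(f"{atom} <{RDF_TYPE}> <{EX}Atom> .")
        lines.append(f'{atom} <{EX}symbol> "{symbol}" .')
    for j, (a, b, order) in enumerate(bonds):
        bond = f"<{EX}{mol_id}_b{j}>"
        lines.append(f"{mol} <{EX}hasBond> {bond} .")
        lines.append(f"{bond} <{RDF_TYPE}> <{EX}Bond> .")
        lines.append(f"{bond} <{EX}connects> <{EX}{mol_id}_a{a}> .")
        lines.append(f"{bond} <{EX}connects> <{EX}{mol_id}_a{b}> .")
        lines.append(f'{bond} <{EX}order> "{order}"^^<{XSD_INT}> .')
    return lines


if __name__ == "__main__":
    # "mol1" is an arbitrary identifier; CC(=O)O is acetic acid.
    print("\n".join(to_ntriples("mol1", "CC(=O)O")))
```

With atoms and bonds as first-class nodes like this, a structural pattern becomes an ordinary graph pattern that SPARQL can match. A real toolkit would also supply aromaticity, implicit hydrogens and charges, which this sketch ignores.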

Can someone suggest a widely used existing vocabulary of concepts (rdf:types, owl:classes) and associated properties that are reasonably synonymous with the primary things obtained from an OBMol object, such as OBBond, OBAtom, etc.?  I would prefer an OWL-DL vocabulary over an OWL-FULL one, and OWL-FULL over RDFS, but being widely used is the more important criterion.

On 12/4/2020 9:45 AM, christian.pil...@web.de wrote:
Dear all,
Currently, our "work horse" is a database of 50k chemical reactions extracted from US patents. Chemical structures are assigned by our neo4j/RDKit plugin as node attributes and can be queried by (sub-)structure queries through the neo4j query language Cypher. Please find attached a short PDF outlining the opportunities of the plugin. Let me know if you find it useful. We recently updated to the latest releases of RDKit (2020.09.01) and neo4j (4.2.1) - testing is intended to be finalized over the upcoming weekend. I'm happy to provide the aforementioned sample database and - if needed - detailed instructions on how to load it into a neo4j instance.
Best regards,
Christian
*Sent:* Thursday, 3 December 2020 at 18:15
*From:* "Greg Landrum" <greg.land...@gmail.com>
*To:* "Steve Vestal" <steve.ves...@adventiumlabs.com>, christian.pil...@web.de
*Cc:* "BlueObelisk-Discuss" <blueobelisk-discuss@lists.sourceforge.net>
*Subject:* Re: [BlueObelisk-discuss] Structure database that can be queried by SPARQL?
I'm going to bring in Christian to answer the questions about the RDKit/Neo4J integration. Christian defined the use cases we're implementing, has done all the testing, and coordinated the development work.
-greg
On Thu, Dec 3, 2020 at 3:54 PM Steve Vestal <steve.ves...@adventiumlabs.com> wrote:

    Is there a sample database to look at?  Neo4j can export in RDF
    format.  Do you have a paper or tech report?

    On 12/3/2020 8:35 AM, Greg Landrum wrote:

        Not quite what you're asking for, but: if you're willing to
        use neo4j to store the graph (which rules out SPARQL I guess)
        you can use the RDKit neo4j plugin:
        https://github.com/rdkit/neo4j-rdkit
        That gets you efficient substructure search and similarity search.
        -greg
        On Tue, Dec 1, 2020 at 4:04 PM Steve Vestal
        <steve.ves...@adventiumlabs.com> wrote:

            Does anyone know of a structure database that can be
            queried using an RDF query language like SPARQL?
            PubChemRDF can be accessed in RDF format, but it encodes
            structures as SMILES strings, which cannot be queried in
            this way.

            If not, can anyone suggest open source software that might
            be used to construct a modest RDF dataset from an existing
            structure database for the purpose of experimenting?  For
            example, software that can translate SMILES strings into
            an annotated graph data structure of some sort?

            Thanks in advance for any suggestions.




            _______________________________________________
            Blueobelisk-discuss mailing list
            Blueobelisk-discuss@lists.sourceforge.net
            https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss

