Hello Andy,
Thank you so much for your response. I would be very interested in
making a new implementation of DatasetGraph, although I would have to
learn about the issues involved in SPARQL query optimization, as I have
not studied those issues. I would also have to learn more about
parallel programming. Well, maybe it is late to be applying for GSoC.
I would just like to get involved in an open source Semantic Web
project, in any case.
Thank you also for the references to related work. I shall have to look
at them in more detail. I also need to explore the connection with JSON-LD.
I have had a really large vision about how we can enhance all our
object-oriented technologies with the Semantic Web technologies. Running
SPARQL on object-oriented data is part of it, and I have thought ARQ
would be best for that purpose. Another part is that we can enhance the
object-oriented data model with many elements of the OWL data model,
including much of the reasoning, anywhere the object-oriented data model
is used. The intention is to let people use the OWL data model in all
the object-oriented programs they write, instead of just the
object-oriented data model as it is.
What I would really like, though, would be if we could get more data on
the Semantic Web and make it larger, with all the object-oriented data
in the world that people are willing to post. I have thought what we
really want to do is set up SPARQL endpoints on object databases.
Another source of object-oriented data in the world is object-relational
mapping. I'm not entirely sure, but I have thought it might also be
possible to set up SPARQL endpoints on data sources of object-relational
mapping, by treating the data as object-oriented data.
I do have in mind how to implement at least large parts of the
functionality of the Graph, Node, and Triple interfaces backed by
object-oriented data, and I have a lot of code working toward that
purpose in Java. My understanding of ARQ is that, to run SPARQL
SELECT, CONSTRUCT, and Update queries on object-oriented data, it would
be sufficient to implement the Model interface, or related interfaces,
backed by object-oriented data. Is that correct?
As I have in mind, there would be one implementation of Model for each
piece of object database software, or maybe for each piece of
object-relational mapping software, although the implementations could
have a lot of code in common. There would also be a Model for an
arbitrary Java Collection of Java objects that the user would supply.
Additionally, there would be a Model to use in Java programs that would
consist of all the Java objects in memory that have not been garbage
collected, which we could use to run SPARQL on all the objects in
memory. (I have means of accessing main memory in Java with AspectJ.)
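As a rough illustration of the "Collection of Java objects" case, here
is a minimal sketch in plain Java. It does not implement Jena's Model or
Graph; the class CollectionGraphSketch, its Statement type, and the Book
example class are all hypothetical names made up for this message. The
find method only imitates the shape of a triple-pattern lookup (null is
used as the wildcard here), reading fields by reflection each time it is
called.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical sketch: triple-pattern lookup over a user-supplied Collection. */
final class CollectionGraphSketch {

    /** A plain (subject, predicate, object) record; not a Jena class. */
    static final class Statement {
        final Object subject;
        final String predicate;
        final Object object;

        Statement(Object subject, String predicate, Object object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    private final Collection<?> objects;

    CollectionGraphSketch(Collection<?> objects) {
        this.objects = objects;
    }

    /**
     * Returns every (object, fieldName, fieldValue) statement matching the
     * pattern. A null argument matches anything. Assumes user-defined classes;
     * only fields declared on the runtime class itself are inspected.
     */
    List<Statement> find(Object s, String p, Object o) {
        List<Statement> hits = new ArrayList<>();
        for (Object candidate : objects) {
            if (candidate == null) continue;
            if (s != null && s != candidate) continue;      // subjects match by identity
            for (Field f : candidate.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                if (p != null && !p.equals(f.getName())) continue;
                Object value;
                try {
                    f.setAccessible(true);
                    value = f.get(candidate);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
                if (value == null) continue;                // null fields yield no statement
                if (o != null && !o.equals(value)) continue; // objects match by equals()
                hits.add(new Statement(candidate, f.getName(), value));
            }
        }
        return hits;
    }

    /** Example domain class, invented for illustration. */
    static final class Book {
        final String title;
        final int year;

        Book(String title, int year) {
            this.title = title;
            this.year = year;
        }
    }

    public static void main(String[] args) {
        List<Book> shelf = Arrays.asList(new Book("Dune", 1965), new Book("Emma", 1815));
        CollectionGraphSketch graph = new CollectionGraphSketch(shelf);
        // Pattern: (?book, year, 1965)
        for (Statement st : graph.find(null, "year", 1965)) {
            System.out.println(((Book) st.subject).title);
        }
    }
}
```

Because nothing is copied out of the collection, a lookup always sees
the current state of the objects. A real implementation would still have
to map objects and field names to RDF nodes and URIs, which this sketch
leaves out.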
Well, as I say, maybe it is late to be applying for GSoC. I have just
been hoping that I can make a contribution to the Semantic Web with
these ideas. I need to find a conference to which to submit my
article. Thank you very much again for your response.
Tim Armstrong
On 03/18/2014 09:49 AM, Andy Seaborne wrote:
Hi Tim,
The idea of this project wasn't to implement the Model interface; it
was to implement the storage-level DatasetGraph interface. Jena has an
implementation of Model in memory (actually, of Graph: Model is a
presentation of Graph, and Graph, Node, and Triple are the key
abstractions).
Aside from GSoC:
Your ideas for relating RDF access to object-oriented data sound
interesting - do you have a particular source of object-oriented data
in mind?
I don't know of any closely related work, which isn't to say there
isn't any. Does the work on CumulusRDF, which stores RDF molecules
(if I remember correctly), have any relevance? Or Haystacks (MIT), which
used adjacency lists on nodes to store RDF, a different style from
the "traditional" triple storage style?
I suspect the W3C "CSV on the Web" Working Group might be connected -
there, data is assumed to be in regular table structures, which can
be viewed as a low-level object-oriented data format.
Andy
On 18/03/14 01:27, Timothy Armstrong wrote:
Hello,
I'm interested in contributing to Jena in Google Summer of Code 2014.
I'm a computer science Ph.D. student at North Carolina State
University. I have studied the Semantic Web very passionately, as I
feel it is a wonderful vision. I have taken a course in it, worked as a
research assistant on the Protein Ontology project (
http://pir.georgetown.edu/pro/pro.shtml ), and developed some open
source software for it. I have used Jena a lot.
I have some ideas for JENA-624 (
https://issues.apache.org/jira/browse/JENA-624 ), although I am very
interested in directions you see for it, and I would be glad to work on
other issues. There are a lot of ideas I have had for my Semantic Web
software that are related to Jena. I would be very glad to contribute
to the Jena project in GSoC, but I would also be glad to contribute
anything in my existing software that would be useful to Jena. Well, I
realize that I am a bit late posting here for GSoC, and I am hurrying to
get my software's web site and article in a presentable form.
I came up with a very simple interpretation of object-oriented
programming, similar to connections other people have made, that treats
all object-oriented data as triples in RDF. It means in part that we
can run SPARQL queries on any object-oriented data. I have thought it
would be very good if we could use ARQ to run SPARQL on main memory in
object-oriented programs and on object databases. I found that we can
post object-oriented data directly on the Semantic Web without having to
write any sort of mapping like D2RQ: either by translating
object-oriented data into an existing Semantic Web format, or by setting
up SPARQL endpoints on object databases. Well, I would be very
interested to know whether any of this has been done before.
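To make the interpretation concrete, here is a minimal sketch in plain
Java of reading an object's fields as triples of the form (object,
field name, field value). It is only one possible reading of the idea,
not the actual software; the names ObjectTriples, Triple, triplesOf,
and Person are hypothetical, and Triple here is a plain class, not
Jena's.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: view the fields of a Java object as triples. */
final class ObjectTriples {

    /** A plain (subject, predicate, object) triple; not Jena's Triple. */
    static final class Triple {
        final Object subject;
        final String predicate;
        final Object object;

        Triple(Object subject, String predicate, Object object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + predicate + ", " + object + ")";
        }
    }

    /**
     * One triple per non-static, non-null field, including inherited fields:
     * (obj, "DeclaringClass.fieldName", value). Assumes user-defined classes.
     */
    static List<Triple> triplesOf(Object obj) {
        List<Triple> out = new ArrayList<>();
        for (Class<?> c = obj.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                try {
                    f.setAccessible(true);
                    Object value = f.get(obj);
                    if (value != null) {
                        out.add(new Triple(obj, c.getSimpleName() + "." + f.getName(), value));
                    }
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return out;
    }

    /** Example domain class, invented for illustration. */
    static final class Person {
        final String name;
        final Person knows;

        Person(String name, Person knows) {
            this.name = name;
            this.knows = knows;
        }
    }

    public static void main(String[] args) {
        Person bob = new Person("Bob", null);
        Person alice = new Person("Alice", bob);
        for (Triple t : triplesOf(alice)) {
            System.out.println(t);
        }
    }
}
```

In this reading a reference field such as knows links two subjects, much
like an object property, while a field such as name carries a literal
value. How objects and fields would be given URIs is left open here.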
Regarding JENA-624, I have in mind how to create implementations of the
Jena Model interface (com.hp.hpl.jena.rdf.model.Model) backed by Java
data. I have been thinking that it might help to run SPARQL on Java
data with ARQ if we could implement Model backed by Java data. I am
wondering if you think it would be applicable to JENA-624, or to any
other issues, if we could create implementations of Model in this
manner. There could be both in-memory models with Java data, and disk
models with object databases.
So, I would be very glad to contribute.
Thanks,
Tim Armstrong