[jira] [Commented] (JENA-624) Develop a new in-memory RDF Dataset implementation

ASF GitHub Bot (JIRA) Wed, 14 Oct 2015 11:56:29 -0700

    [ https://issues.apache.org/jira/browse/JENA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14957519#comment-14957519 ]


ASF GitHub Bot commented on JENA-624:
-------------------------------------

Github user ajs6f commented on the pull request:

    https://github.com/apache/jena/pull/94#issuecomment-148155899
  
    Okay, @afs, I've squashed everything into two commits. They are basically 
orthogonal, although there is a tiny leak from the first into the second (in 
`DatasetGraphWithRecord`), because I found a clearer way of typing mutations 
later in the summer. The first is the dataset-with-operation-record that I did 
earlier, which provides a real abort function to any wrapped MRSW dataset. The 
second is the persistent-data-structure-based MR+SW dataset impl. Let me know 
if you would prefer these to be broken out into two separate PRs; that's no 
problem, I just didn't want to fire off multiple PRs at once without clearing 
it first. I can easily remedy the small leak and build two PRs.
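
    For readers unfamiliar with the operation-record idea, here is a minimal 
sketch of the general technique: log an inverse for every mutation, then replay 
the inverses newest-first on abort. It is an illustration only, not the code in 
the PR. `RecordingSet` and all of its methods are hypothetical names, and a 
plain `java.util.Set` stands in for the wrapped dataset.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: none of these names come from the pull request.
final class RecordingSet<T> {
    private final Set<T> wrapped;
    // Inverse operations for the current write transaction, most recent first.
    private final Deque<Runnable> undoLog = new ArrayDeque<>();
    private boolean inWrite = false;

    RecordingSet(Set<T> wrapped) {
        this.wrapped = wrapped;
    }

    void begin() {
        if (inWrite) throw new IllegalStateException("already in a write transaction");
        inWrite = true;
    }

    void add(T item) {
        requireWrite();
        // Record an inverse only if the call actually changed the set.
        if (wrapped.add(item)) undoLog.push(() -> wrapped.remove(item));
    }

    void remove(T item) {
        requireWrite();
        if (wrapped.remove(item)) undoLog.push(() -> wrapped.add(item));
    }

    void commit() {
        requireWrite();
        undoLog.clear(); // changes are kept, so the inverses are no longer needed
        inWrite = false;
    }

    void abort() {
        requireWrite();
        // Replay the inverses newest-first to restore the pre-transaction state.
        while (!undoLog.isEmpty()) undoLog.pop().run();
        inWrite = false;
    }

    // Defensive copy so callers cannot bypass the record.
    Set<T> snapshot() {
        return new HashSet<>(wrapped);
    }

    private void requireWrite() {
        if (!inWrite) throw new IllegalStateException("no write transaction");
    }
}
```

    Calling `begin()`, making some changes, and then calling `abort()` leaves 
the wrapped set exactly as it was before `begin()`, which is the behaviour 
"a real abort function" refers to.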
    
    You are right that, apart from that split, there aren't really natural 
boundaries inside each commit. That's a little unfortunate, because I know it 
will make review more of a pain in the neck, but at least I can promise that 
none of the extant code into which this is being flung is touched, apart from 
adding dependencies to POM files. This is all new stuff and shouldn't bother 
anything else.
    
    Now back to ISWC!


> Develop a new in-memory RDF Dataset implementation
> --------------------------------------------------
>
>                 Key: JENA-624
>                 URL: https://issues.apache.org/jira/browse/JENA-624
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>              Labels: gsoc, gsoc2015, java, linked_data, rdf
>
> The current (Jan 2014) Jena in-memory dataset uses a general-purpose 
> container that works for any storage technology for graphs together with 
> in-memory graphs.  
> This project would develop a new implementation design specifically for RDF 
> datasets (triples and quads) and efficient SPARQL execution, for example, 
> using multi-core parallel operations and/or multi-version concurrent 
> data structures to maximise true parallel operation.
> This is a system project suitable for someone interested in database 
> implementation, data structure design and implementation, operating systems or 
> distributed systems.
> Note that TDB can operate in-memory using a simulated disk with 
> copy-in/copy-out semantics for disk-level operations.  It is for faithful 
> testing of TDB infrastructure and is not designed for performance, general 
> in-memory use or use at scale.  While lessons may be learnt from that system, 
> TDB in-memory is not the answer here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
