[jira] [Commented] (JENA-624) Develop a new in-memory RDF Dataset implementation

ASF GitHub Bot (JIRA) Wed, 21 Oct 2015 03:04:58 -0700

    [ 
https://issues.apache.org/jira/browse/JENA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966576#comment-14966576
 ]


ASF GitHub Bot commented on JENA-624:
-------------------------------------

Github user afs commented on the pull request:

    https://github.com/apache/jena/pull/94#issuecomment-149844143
  
    This comment mostly reviews the Memory Dataset design, not the 
Journaling part.
    
    #### Performance Status
    
    @ajs6f, you [mentioned that using Java8 streams](http://mail-archives.apache.org/mod_mbox/jena-dev/201510.mbox/%3CBC838704-0486-4094-90C6-D8C8DA044351%40email.virginia.edu%3E)
 was leading to overhead costs.  Is it the streams themselves, or the fact 
that the datastructure has to be traversed and is less efficient than the 
array bunches in the memory graph?
    
    #### DatasetFactory
    
    I don't see factory operations or, what will be important, javadoc.  
What's the status here?
    
    See discussion on dev@ [(part of this email)](http://mail-archives.apache.org/mod_mbox/jena-dev/201510.mbox/%3C5610F36D.8090602%40apache.org%3E).
    
    #### Assembler
    Ditto.
    
    #### Documentation
    
    A couple of paragraphs covering:
    * what it is
    * how to use it
    * the choice between general and txnmem datasets
    
    Maybe best to integrate with other documentation - pull out "transactions" 
from TDB and modify it.
    
    #### Persistent datastructures
    
    These could go in their own package.
    
    Some basic tests needed.
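
    As a rough sketch of what one such basic test might check (illustrative 
only, not code from the pull request): the defining property of a persistent 
structure is that an update returns a new version and leaves the old one 
untouched.  `PList` below is a made-up minimal cons list used purely as a 
stand-in; it is an assumption and not one of the classes under review.

```java
public class PersistenceTestSketch {

    /** Hypothetical immutable cons list; stands in for a persistent structure. */
    static final class PList<T> {
        private final T head;
        private final PList<T> tail;
        private final int size;

        private PList(T head, PList<T> tail, int size) {
            this.head = head;
            this.tail = tail;
            this.size = size;
        }

        static <T> PList<T> empty() {
            return new PList<>(null, null, 0);
        }

        /** Returns a new version sharing this one as its tail; this one is not modified. */
        PList<T> plus(T item) {
            return new PList<>(item, this, size + 1);
        }

        boolean contains(T item) {
            for (PList<T> p = this; p.size > 0; p = p.tail) {
                if (p.head.equals(item)) {
                    return true;
                }
            }
            return false;
        }

        int size() {
            return size;
        }
    }

    /** Update a version, then check the old version is unchanged and the new one sees both items. */
    static boolean oldVersionSurvivesUpdate() {
        PList<String> v1 = PList.<String>empty().plus("a");
        PList<String> v2 = v1.plus("b");
        boolean oldUnchanged = v1.size() == 1 && v1.contains("a") && !v1.contains("b");
        boolean newComplete = v2.size() == 2 && v2.contains("a") && v2.contains("b");
        return oldUnchanged && newComplete;
    }

    public static void main(String[] args) {
        System.out.println(oldVersionSurvivesUpdate());
    }
}
```

    The same shape of check (keep a reference to the old version, update, 
compare both) would apply to whatever map or set types the package ends up 
exposing.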
    
    #### Dependency management
    
    The jena-arq POM changes should have version management carried up in 
jena-parent.
    
    mockito already has some version in jena-parent and is used in jena-core.
    
    Add mockito and awaitility to jena-arq/DEPENDENCIES for accounting.
    
    #### Mocking in tests
    
    (minor) Any reason to not just do the action and test for an effect?
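
    To illustrate what is meant by "do the action and test for an effect", 
here is a minimal sketch (not code from the pull request).  `TinyStore` is a 
hypothetical stand-in for a dataset, not a Jena class; the point is only that 
the test performs the operation on a real object and checks the observable 
state, rather than verifying calls on a mock.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EffectTestSketch {

    /** Hypothetical in-memory quad store, used only for illustration. */
    static final class TinyStore {
        private final Set<List<String>> quads = new HashSet<>();

        void add(String g, String s, String p, String o) {
            quads.add(Arrays.asList(g, s, p, o));
        }

        void delete(String g, String s, String p, String o) {
            quads.remove(Arrays.asList(g, s, p, o));
        }

        boolean contains(String g, String s, String p, String o) {
            return quads.contains(Arrays.asList(g, s, p, o));
        }

        int size() {
            return quads.size();
        }
    }

    /** Perform the actions on a real store and check the resulting state; no mocks involved. */
    static boolean addThenDeleteLeavesStoreEmpty() {
        TinyStore store = new TinyStore();
        store.add("g", "s", "p", "o");
        boolean presentAfterAdd = store.contains("g", "s", "p", "o");
        store.delete("g", "s", "p", "o");
        boolean absentAfterDelete = !store.contains("g", "s", "p", "o");
        return presentAfterAdd && absentAfterDelete && store.size() == 0;
    }

    public static void main(String[] args) {
        System.out.println(addThenDeleteLeavesStoreEmpty());
    }
}
```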
    
    #### Warnings
    
    Remove the warnings, or suppress them if necessary
    (unnecessary semicolons, a couple of casts, one javadoc error).
    
    ### Journaling
    
    Even though not used, I'm minded to include it in the codebase with the 
understanding it is "subject to change". I can imagine it could help jena-text 
for example.
    
    LockMRPlusSW : tests?



> Develop a new in-memory RDF Dataset implementation
> --------------------------------------------------
>
>                 Key: JENA-624
>                 URL: https://issues.apache.org/jira/browse/JENA-624
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Assignee: A. Soroka
>              Labels: gsoc, gsoc2015, java, linked_data, rdf
>
> The current (Jan 2014) Jena in-memory dataset uses a general purpose 
> container that works for any storage technology for graphs together with 
> in-memory graphs.  
> This project would develop a new implementation design specifically for RDF 
> datasets (triples and quads) and efficient SPARQL execution, for example, 
> using multi-core parallel operations and/or multi-version concurrent 
> datastructures to maximise true parallel operation.
> This is a system project suitable for someone interested in database 
> implementation, datastructure design and implementation, operating systems or 
> distributed systems.
> Note that TDB can operate in-memory using a simulated disk with 
> copy-in/copy-out semantics for disk-level operations.  It is for faithful 
> testing of TDB infrastructure and is not designed for performance, general 
> in-memory use or use at scale.  While lessons may be learnt from that 
> system, TDB in-memory is not the answer here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
