[ https://issues.apache.org/jira/browse/RYA-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096897#comment-16096897 ]

ASF GitHub Bot commented on RYA-307:
------------------------------------

GitHub user ejwhite922 opened a pull request:

    https://github.com/apache/incubator-rya/pull/181

    Rya-307 MongoDB Rya DAO Batch Writer

    ## Description
    Improved Rya MongoDB ingest of statements through the Sail Layer and Rya
    DAO by queueing up multiple inserts so they can be written as a single
    batch. If a queued batch has not been written within a set time limit, its
    statements are flushed to the datastore. The size of the batch and the
    time limit are configurable.
    
    ### Tests
    Unit Tests/Integration Tests
    
    ### Links
    [Jira](https://issues.apache.org/jira/browse/RYA-307)
    
    ### Checklist
    - [ ] Code Review
    - [ ] Squash Commits
    
    #### People To Review
    @amihalik 
    @meiercaleb 
    @DLotts
    @jessehatfield 
    @isper3at 


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ejwhite922/incubator-rya RYA-307_MongoIngest

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-rya/pull/181.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #181
    
----
commit 08596dd004c5f35867c7a6780a800d2387e05676
Author: eric.white <[email protected]>
Date:   2017-07-19T13:08:32Z

    RYA-307 Improved Rya MongoDB ingest of statements through the Sail Layer
    and Rya DAO by queueing up multiple inserts so they can be written as a
    single batch. If a queued batch has not been written within a set time
    limit, its statements are flushed to the datastore. The size of the batch
    and the time limit are configurable.

commit ec0ccc4ce1cc329edd02cfc5d09d543447ca59af
Author: eric.white <[email protected]>
Date:   2017-07-20T20:22:47Z

    Rya-307 Commit #2. Added config options for flushing MongoDB batch writer.

commit 154e589a082b91c4556079be4fabd55d37480360
Author: eric.white <[email protected]>
Date:   2017-07-21T21:15:19Z

    RYA_307 Commit #3. Fixed integration tests. Made BatchWriter compatible
    with MongoCollection.

----


> MongoDB Bulk Load methods should use Secondary Indexer Bulk Loading
> -------------------------------------------------------------------
>
>                 Key: RYA-307
>                 URL: https://issues.apache.org/jira/browse/RYA-307
>             Project: Rya
>          Issue Type: Improvement
>          Components: dao
>            Reporter: Aaron Mihalik
>            Assignee: Eric White
>
> The MongoDB secondary indexers *really* slow down inserts made through the
> bulk load methods of the DAO. The DAO should use the bulk load methods on
> the secondary indexers. Here is where the call is made in the DAO [1].
>
> [1] https://github.com/apache/incubator-rya/blob/master/dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBRyaDAO.java#L158
>
> Some version of a BatchWriter should be created for Mongo and used in the
> DAO and any secondary indexers.
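
For readers unfamiliar with the pattern, the following is a minimal sketch of the kind of batch writer discussed above: items are queued, written as one batch once a configurable size is reached, and flushed on a timer otherwise. It is not the code from the pull request. The class name `TimedBatchWriter`, its constructor parameters, and the `sink` callback are hypothetical; a real implementation would presumably hand each batch to the MongoDB driver and to the secondary indexers. The timer here fires at a fixed interval, which is a simplification of "flush after a set time limit".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical sketch of a size- and time-bounded batch writer.
 * Not the BatchWriter class from the pull request.
 */
final class TimedBatchWriter<T> implements AutoCloseable {
    private final int batchSize;
    private final Consumer<List<T>> sink; // assumed stand-in for the datastore write
    private final List<T> buffer = new ArrayList<>();
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    TimedBatchWriter(int batchSize, long flushMillis, Consumer<List<T>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
        // Periodically flush whatever is queued, even if the batch is not full.
        // Note: if the sink throws, the executor cancels later timer runs.
        timer.scheduleWithFixedDelay(
                this::flush, flushMillis, flushMillis, TimeUnit.MILLISECONDS);
    }

    /** Queues one item and writes the batch once it reaches batchSize. */
    synchronized void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Writes all queued items as a single batch; does nothing if empty. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        sink.accept(batch);
    }

    /** Stops the timer and writes any remaining items. */
    @Override
    public void close() {
        timer.shutdownNow();
        flush();
    }
}
```

With a batch size of 3, for example, adding three items triggers one call to the sink with all three, and closing the writer flushes any remainder. Keeping the sink as a callback means the same queueing logic could, in principle, be shared by the DAO and the secondary indexers, which is the reuse the issue asks for.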



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
