[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116021#comment-16116021 ]

ASF GitHub Bot commented on RYA-313:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-rya/pull/196

> Rya Mongo Blows up on Large result sets
> ---
>
> Key: RYA-313
> URL: https://issues.apache.org/jira/browse/RYA-313
> Project: Rya
> Issue Type: Bug
> Components: dao
> Affects Versions: 3.2.10
> Environment: Mongo DB with Rya 3.2.11-SNAPSHOT with a lot of data in Rya
> Reporter: Aaron Mihalik
> Assignee: Andrew Smith
>
> Simple queries that return a lot of results fail because mongo is trying to send all of the results back at once. For instance, if I have a lot of data and run something like:
> {noformat}
> SELECT * WHERE
> {
>   ?s a ?t.
> }
> {noformat}
> I will get this exception.
> {noformat}
> Caused by: com.mongodb.MongoCommandException: Command failed with error 16389: 'aggregation result exceeds maximum document size (16MB)' on server localhost:27017. The full response is { "ok" : 0.0, "errmsg" : "aggregation result exceeds maximum document size (16MB)", "code" : 16389 }
> {noformat}
> I think we need to toss in a "AggregationOptions with Batch = 1000", but I couldn't get that to work immediately. Somebody with more mongo experience needs to look at this.
> [Here is the line of code|https://github.com/apache/incubator-rya/blob/master/dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java#L114]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
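The failure described in the issue (one reply carrying the entire result set) can be illustrated without a MongoDB server. What follows is a minimal Java sketch under stated assumptions, not Rya's code and not the MongoDB driver API: `BatchedCursorSketch`, `page`, and the 2500-row result set are hypothetical names and numbers chosen only for illustration. It shows a consumer pulling results in batches of 1000, so that no single reply has to hold everything.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/**
 * Hypothetical stand-in for a server-side cursor: rows are pulled in
 * fixed-size batches instead of arriving as one oversized reply.
 */
final class BatchedCursorSketch<T> implements Iterator<T> {
    // (offset, limit) -> at most `limit` rows starting at `offset`
    private final BiFunction<Integer, Integer, List<T>> fetchBatch;
    private final int batchSize;
    private Iterator<T> current = Collections.emptyIterator();
    private int offset = 0;
    private boolean exhausted = false;
    int batchesFetched = 0;

    BatchedCursorSketch(BiFunction<Integer, Integer, List<T>> fetchBatch, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.fetchBatch = fetchBatch;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next batch only once the current one is used up.
        while (!current.hasNext() && !exhausted) {
            List<T> batch = fetchBatch.apply(offset, batchSize);
            batchesFetched++;
            offset += batch.size();
            // A short (or empty) batch means the source has no more rows.
            exhausted = batch.size() < batchSize;
            current = batch.iterator();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    /** Simulated result set of `total` rows, served by offset and limit. */
    static List<Integer> page(int total, int offset, int limit) {
        List<Integer> rows = new ArrayList<>();
        for (int i = offset; i < Math.min(total, offset + limit); i++) {
            rows.add(i);
        }
        return rows;
    }

    public static void main(String[] args) {
        BatchedCursorSketch<Integer> cursor =
                new BatchedCursorSketch<>((off, lim) -> page(2500, off, lim), 1000);
        int count = 0;
        while (cursor.hasNext()) {
            cursor.next();
            count++;
        }
        System.out.println(count + " rows in " + cursor.batchesFetched + " batches");
    }
}
```

In the MongoDB Java driver the corresponding setting is, as far as I know, the batch size on the aggregate call itself (for example `AggregateIterable.batchSize(int)` in the 3.x API); check it against the driver version actually in use.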
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114998#comment-16114998 ]

ASF GitHub Bot commented on RYA-313:

Github user isper3at commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/196#discussion_r131490821

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
    @@ -118,7 +124,8 @@ public MongoDBRdfConfiguration getConf() {
             }

             // TODO not sure what to do about regex ranges?
    -        final RyaStatementBindingSetCursorIterator iterator = new RyaStatementBindingSetCursorIterator(coll, rangeMap, strategy, conf.getAuthorizations());
    +        final RyaStatementBindingSetCursorIterator iterator = new RyaStatementBindingSetCursorIterator(
    +                getCollection(conf), rangeMap, strategy, conf.getAuthorizations());
    --- End diff --

    I think other parts of this class still use the legacy DBCollection coll.
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114879#comment-16114879 ]

ASF GitHub Bot commented on RYA-313:

Github user asfgit commented on the issue:

    https://github.com/apache/incubator-rya/pull/196

    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/358/
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114846#comment-16114846 ]

ASF GitHub Bot commented on RYA-313:

Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/196#discussion_r131464691

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/MongoDBQueryEngine.java ---
    @@ -118,7 +124,8 @@ public MongoDBRdfConfiguration getConf() {
             }

             // TODO not sure what to do about regex ranges?
    -        final RyaStatementBindingSetCursorIterator iterator = new RyaStatementBindingSetCursorIterator(coll, rangeMap, strategy, conf.getAuthorizations());
    +        final RyaStatementBindingSetCursorIterator iterator = new RyaStatementBindingSetCursorIterator(
    +                getCollection(conf), rangeMap, strategy, conf.getAuthorizations());
    --- End diff --

    Since you call getCollection(conf), I believe that coll is now an unused field. Can you remove it?
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114820#comment-16114820 ]

ASF GitHub Bot commented on RYA-313:

Github user meiercaleb commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/196#discussion_r131461300

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/iter/RyaStatementBindingSetCursorIterator.java ---
    @@ -106,12 +110,16 @@ private void findNextValidResultCursor() {
                 currentBindingSetCollection = rangeMap.get(currentQuery);
                 // Executing redact aggregation to only return documents the user
                 // has access to.
    -            final List pipeline = new ArrayList<>();
    -            pipeline.add(new BasicDBObject("$match", currentQuery));
    +            final List pipeline = new ArrayList<>();
    +            pipeline.add(new Document("$match", currentQuery));
                 pipeline.addAll(AggregationUtil.createRedactPipeline(auths));
                 log.debug(pipeline);
    -            final AggregationOutput output = coll.aggregate(pipeline);
    -            resultsIterator = output.results().iterator();
    +
    +            final AggregationOptions opts = AggregationOptions.builder()
    --- End diff --

    Where's this being used? Do you need to apply this to your iterator?
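The review question above (an options object is built, but is it ever applied?) can be made concrete with a small model. This is a hedged sketch, not the MongoDB driver and not the code in the pull request: `UnusedOptionsSketch`, its nested `Options` builder, `MAX_INLINE_RESULTS`, and both `aggregate` overloads are hypothetical, with a row-count cap standing in for the 16MB reply limit. It shows that building options changes nothing unless they are passed to the call that produces the iterator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical model: options only take effect when passed to the call. */
final class UnusedOptionsSketch {
    /** Simulated cap on a single reply, standing in for the 16MB limit. */
    static final int MAX_INLINE_RESULTS = 1000;

    /** Minimal options holder with a builder (illustrative only). */
    static final class Options {
        final int batchSize;
        private Options(int batchSize) { this.batchSize = batchSize; }
        static Builder builder() { return new Builder(); }
        static final class Builder {
            private int batchSize = MAX_INLINE_RESULTS;
            Builder batchSize(int size) {
                if (size <= 0) throw new IllegalArgumentException("size must be positive");
                this.batchSize = size;
                return this;
            }
            Options build() { return new Options(batchSize); }
        }
    }

    /** Inline mode: the whole result must fit in one reply or the call fails. */
    static Iterator<Integer> aggregate(List<Integer> data) {
        if (data.size() > MAX_INLINE_RESULTS) {
            throw new IllegalStateException("aggregation result exceeds maximum size");
        }
        return data.iterator();
    }

    /** Cursor mode: reached only when the caller actually passes the options. */
    static Iterator<Integer> aggregate(final List<Integer> data, final Options opts) {
        return new Iterator<Integer>() {
            private int start = 0; // index where the next batch begins
            private Iterator<Integer> batch = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                if (!batch.hasNext() && start < data.size()) {
                    int end = Math.min(data.size(), start + opts.batchSize);
                    batch = data.subList(start, end).iterator();
                    start = end;
                }
                return batch.hasNext();
            }

            @Override
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return batch.next();
            }
        };
    }

    static List<Integer> range(int n) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) rows.add(i);
        return rows;
    }

    static int count(Iterator<Integer> it) {
        int n = 0;
        while (it.hasNext()) { it.next(); n++; }
        return n;
    }

    /** True when the inline call rejects the data as too large. */
    static boolean failsInline(List<Integer> data) {
        try {
            aggregate(data);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> data = range(2500);
        Options opts = Options.builder().batchSize(1000).build();
        // Building opts alone does not help: the one-argument call still fails.
        System.out.println("inline call fails: " + failsInline(data));
        // Passing opts switches to the batched path.
        System.out.println("with options: " + count(aggregate(data, opts)) + " results");
    }
}
```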
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114658#comment-16114658 ]

ASF GitHub Bot commented on RYA-313:

Github user amihalik commented on the issue:

    https://github.com/apache/incubator-rya/pull/196

    Looks good @isper3at. I'm going to test this out and wait for one more before I merge.
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114656#comment-16114656 ]

ASF GitHub Bot commented on RYA-313:

Github user asfgit commented on the issue:

    https://github.com/apache/incubator-rya/pull/196

    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/354/
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113707#comment-16113707 ]

ASF GitHub Bot commented on RYA-313:

Github user asfgit commented on the issue:

    https://github.com/apache/incubator-rya/pull/196

    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/353/

    Failed Tests: 2
      incubator-rya-master-with-optionals-pull-requests/org.apache.rya:mongodb.rya: 1
        org.apache.rya.mongodb.MongoDBQueryEngineTest.statementQuery
      incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.indexing.example: 1
        ExamplesTest.MongoRyaDirectExampleTest
[jira] [Commented] (RYA-313) Rya Mongo Blows up on Large result sets
    [ https://issues.apache.org/jira/browse/RYA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113665#comment-16113665 ]

ASF GitHub Bot commented on RYA-313:

GitHub user isper3at opened a pull request:

    https://github.com/apache/incubator-rya/pull/196

    RYA-313 Aggregation now is performed over batches of 1000.

    ## Description
    >What Changed?

    Changed the iterator to use the non-deprecated Mongo API and enabled the aggregation framework to work properly. This could be a very good reason to address: [Jira mongo api ticket](https://issues.apache.org/jira/browse/RYA-302)

    ### Tests
    >Coverage?

    No tests; I just ran it locally and confirmed that no document-size exception was thrown.

    ### Links
    [Jira](https://issues.apache.org/jira/browse/RYA-313)

    ### Checklist
    - [ ] Code Review
    - [ ] Squash Commits

    People To Review
    @amihalik

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/isper3at/incubator-rya RYA-313

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-rya/pull/196.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #196

commit 3b279972198d34a2bcda775bc6ec4f9a87ade8ba
Author: isper3at
Date: 2017-08-03T23:03:11Z

    RYA-313 Aggregation now is performed over batches of 1000.