[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308635#comment-16308635 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on the issue: https://github.com/apache/incubator-rya/pull/254 Just a heads up, this is going to conflict with RYA-414 because of changes to MongoDBRdfConfiguration.java. > Add the ability use the MongoDB aggregation pipeline to evaluate simple > SPARQL expressions > -- > > Key: RYA-416 > URL: https://issues.apache.org/jira/browse/RYA-416 > Project: Rya > Issue Type: New Feature >Reporter: Jesse Hatfield >Assignee: Jesse Hatfield > > MongoDB provides the [aggregation pipeline > framework|https://docs.mongodb.com/manual/core/aggregation-pipeline/] for > multi-stage data processing. Currently, the query engine invokes this > framework to apply individual statement patterns (using a "$match" expression > for each and iterating through the results), then applies higher-level query > operations (join, filter, select, project, etc) client-side. > In principle, those high-level query operations could be rewritten as > aggregation pipeline stages as well ($group, $match, $project, etc). This > would allow more query evaluation logic to be executed by the MongoDB server > itself, enabling server-side optimization. This could be used as a general > query optimization, but would additionally be useful for any tool that only > needed to write query results back to the server: adding a write step to the > end of the resulting pipeline could obviate the need to communicate > individual results to the client at all. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
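As a rough illustration of the idea in the issue description above (expressing query operations as pipeline stages and optionally appending a write step), here is a minimal sketch. It models stages as plain Java maps so that it runs without the MongoDB driver. The class name `PipelineSketch`, the helper names, and the field names `predicate`, `subject`, and `object` are assumptions made for illustration, not Rya's actual schema or API; `$match`, `$project`, and `$out` are standard MongoDB stage operators.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative only: models aggregation pipeline stages as plain maps.
 * Field and collection names are hypothetical, not Rya's real schema.
 */
final class PipelineSketch {

    /** Builds a single-key stage document such as {"$match": {...}}. */
    static Map<String, Object> stage(String operator, Object spec) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put(operator, spec);
        return doc;
    }

    /**
     * Builds a pipeline for one statement pattern (?s predicate ?o),
     * a projection, and an optional server-side write step.
     */
    static List<Map<String, Object>> buildPipeline(String predicate, String outputCollection) {
        List<Map<String, Object>> pipeline = new ArrayList<>();

        // Statement pattern: keep only triples with the given predicate.
        Map<String, Object> match = new LinkedHashMap<>();
        match.put("predicate", predicate);
        pipeline.add(stage("$match", match));

        // Projection: expose subject and object under variable names s and o.
        Map<String, Object> project = new LinkedHashMap<>();
        project.put("_id", 0);
        project.put("s", "$subject");
        project.put("o", "$object");
        pipeline.add(stage("$project", project));

        // Optional write step: results stay on the server instead of
        // being streamed back to the client.
        if (outputCollection != null) {
            pipeline.add(stage("$out", outputCollection));
        }
        return pipeline;
    }

    public static void main(String[] args) {
        buildPipeline("urn:knows", "results").forEach(System.out::println);
    }
}
```

Passing `null` as the output collection leaves off the write step, which corresponds to the general-optimization case where results are still returned to the client.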
[GitHub] incubator-rya issue #258: [WIP] RYA-104 Mongo Rya Shell Integration
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/258 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/585/ ---
[jira] [Commented] (RYA-104) Mongo Support in Rya Administrative Shell
[ https://issues.apache.org/jira/browse/RYA-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308598#comment-16308598 ] ASF GitHub Bot commented on RYA-104: Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/258 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/585/ > Mongo Support in Rya Administrative Shell > - > > Key: RYA-104 > URL: https://issues.apache.org/jira/browse/RYA-104 > Project: Rya > Issue Type: New Feature >Affects Versions: 3.2.9 >Reporter: Andrew Smith >Assignee: David W. Lotts > Fix For: 3.2.10 > > > Implement mongo support for the admin shell. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] incubator-rya issue #254: RYA-416 Optionally invoke aggregation pipeline to ...
Github user kchilton2 commented on the issue: https://github.com/apache/incubator-rya/pull/254 Just a heads up, this is going to conflict with RYA-414 because of changes to MongoDBRdfConfiguration.java. ---
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159306588 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryOptimizer.java --- @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import org.apache.hadoop.conf.Configurable; +import org.apache.hadoop.conf.Configuration; +import org.apache.log4j.Logger; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.openrdf.query.BindingSet; +import org.openrdf.query.Dataset; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.algebra.evaluation.QueryOptimizer; + +public class AggregationPipelineQueryOptimizer implements QueryOptimizer, Configurable { --- End diff -- Docs. ---
[GitHub] incubator-rya issue #172: RYA-303 Mongo PCJ Support
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/172 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/588/ ---
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308917#comment-16308917 ] ASF GitHub Bot commented on RYA-303: Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/172 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/588/ > Mongo PCJ indexer support > - > > Key: RYA-303 > URL: https://issues.apache.org/jira/browse/RYA-303 > Project: Rya > Issue Type: Improvement >Reporter: Andrew Smith >Assignee: Andrew Smith > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (RYA-104) Mongo Support in Rya Administrative Shell
[ https://issues.apache.org/jira/browse/RYA-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308508#comment-16308508 ] ASF GitHub Bot commented on RYA-104: GitHub user kchilton2 opened a pull request: https://github.com/apache/incubator-rya/pull/258 [WIP] RYA-104 Mongo Rya Shell Integration ## Description Added Mongo support to the Rya Shell. ### Tests Added tests. ### Links https://issues.apache.org/jira/browse/RYA-104 You can merge this pull request into a Git repository by running: $ git pull https://github.com/kchilton2/incubator-rya RYA-104 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-rya/pull/258.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #258 commit 767349dac9822cd13e92f9b117d1b5d2dad13e3d Author: kchilton2 Date: 2017-12-23T01:52:27Z RYA-414 Introduced the stateful mongo configuration object so that it is the arbiter of MongoDB state within a Sail object. commit 17cebae3328916bd80fbe5447da5ccb660539556 Author: Andrew Smith Date: 2017-12-26T19:30:32Z RYA-414 Removed mongo connection factory; addressed the indexers that used the factory; addressed the geo project commit 157c06491cd814a1d6e445ebfe77fc63226e5739 Author: kchilton2 Date: 2017-12-26T23:42:33Z RYA-414 Fixing broken tests, cleaning up documentation, cleaning up whitespace. commit a0d3b02e9883d300be9c60bb87b6a4e4e4a09bff Author: David W. Lotts Date: 2017-12-13T22:34:55Z RYA-104 most interactors complete. commit ac8fc3d040a68d39e2911d9fc2c96341eeb15193 Author: kchilton2 Date: 2017-12-13T23:14:40Z RYA-104 Updated the RyaClient to optionally include features that are required for the Accumulo Client, but not the Mongo Client. commit 7a09ab338d402204eb3ebef73d5e193f54a22751 Author: David W. Lotts Date: 2017-12-15T22:19:45Z RYA-104 LoadStatementFile is not reading for some reason, todo: MongoExecSparql.
commit 49aadaf2970c39522985c74c61d12fa38a0fb27d Author: kchilton2 Date: 2017-12-17T19:56:03Z Making the build pass. commit 1cea9b416e5e7168e5696dd5b9efb84b4da05058 Author: kchilton2 Date: 2017-12-17T20:23:41Z RYA-104 Implemented the Rya Shell integration with the Mongo DB interactors. commit 777c1d394858d95cd10c6ba965d53475067852cf Author: David W. Lotts Date: 2017-12-19T20:39:31Z RYA-104 commit 3774f4b28b168454b9c6146b00ba7b0ddff9b89d Author: isper3at Date: 2017-12-27T18:35:36Z Post-mongo change rebase commit 2a68e37c119b1d0f48423ff21c5e204a4b66fbb7 Author: kchilton2 Date: 2017-12-28T00:50:57Z RYA-104 Removed the MongoCommand class. commit 1944ea9260bf25035ad95d2e67cabcb150875ce8 Author: kchilton2 Date: 2017-12-28T23:21:44Z RYA-104 Closing resources and fixing tests for some of the Mongo interactors. commit 23977111ac7469072bcd49e2395a8717ffa73778 Author: kchilton2 Date: 2017-12-29T18:27:27Z RYA-104 Updated the Rya Shell to prompt for Mongo Specific install configurations. commit e8e43a11a81088b2902d3d952f40286d2bf5e6b9 Author: kchilton2 Date: 2017-12-29T22:56:48Z RYA-104 Starting integration tests with Mongo for the shell. > Mongo Support in Rya Administrative Shell > - > > Key: RYA-104 > URL: https://issues.apache.org/jira/browse/RYA-104 > Project: Rya > Issue Type: New Feature >Affects Versions: 3.2.9 >Reporter: Andrew Smith >Assignee: David W. Lotts > Fix For: 3.2.10 > > > Implement mongo support for the admin shell. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308650#comment-16308650 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159306746 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryOptimizer.java --- @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import org.apache.hadoop.conf.Configurable; +import org.apache.hadoop.conf.Configuration; +import org.apache.log4j.Logger; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.openrdf.query.BindingSet; +import org.openrdf.query.Dataset; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.algebra.evaluation.QueryOptimizer; + +public class AggregationPipelineQueryOptimizer implements QueryOptimizer, Configurable { +private Configuration conf; +private Logger logger = Logger.getLogger(getClass()); + +@Override +public void optimize(TupleExpr tupleExpr, Dataset dataset, BindingSet bindings) { +if (conf instanceof MongoDBRdfConfiguration) { --- End diff -- This will likely change into a StatefulMongoDBRdfConfiguration in 414 (which is a MongoDBRdfConfiguration). Is anything about your optimizer stateful? If so, you don't want to make more than one instance of it, so it should be shared within that config. > Add the ability use the MongoDB aggregation pipeline to evaluate simple > SPARQL expressions > -- > > Key: RYA-416 > URL: https://issues.apache.org/jira/browse/RYA-416 > Project: Rya > Issue Type: New Feature >Reporter: Jesse Hatfield >Assignee: Jesse Hatfield > > MongoDB provides the [aggregation pipeline > framework|https://docs.mongodb.com/manual/core/aggregation-pipeline/] for > multi-stage data processing. Currently, the query engine invokes this > framework to apply individual statement patterns (using a "$match" expression > for each and iterating through the results), then applies higher-level query > operations (join, filter, select, project, etc) client-side. > In principle, those high-level query operations could be rewritten as > aggregation pipeline stages as well ($group, $match, $project, etc). This > would allow more query evaluation logic to be executed by the MongoDB server > itself, enabling server-side optimization. 
This could be used as a general > query optimization, but would additionally be useful for any tool that only > needed to write query results back to the server: adding a write step to the > end of the resulting pipeline could obviate the need to communicate > individual results to the client at all. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
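The review comment above asks whether the optimizer is stateful and, if so, suggests sharing a single instance through the configuration object. A minimal sketch of that arrangement follows. Every name in it (`SharedOptimizerSketch`, `StatefulOptimizer`, `StatefulConfig`, `getOptimizer`) is hypothetical and chosen only for illustration. It does not reflect the actual `StatefulMongoDBRdfConfiguration` API, and whether the real optimizer holds state is exactly the open question in the review.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: a configuration that owns one shared, stateful optimizer. */
final class SharedOptimizerSketch {

    /** Hypothetical optimizer whose state is a count of rewritten queries. */
    static final class StatefulOptimizer {
        private final AtomicInteger rewrites = new AtomicInteger();

        /** Pretends to optimize a query; returns how many calls were made so far. */
        int optimize(String query) {
            return rewrites.incrementAndGet();
        }
    }

    /** Hypothetical stateful configuration: lazily creates exactly one optimizer. */
    static final class StatefulConfig {
        private StatefulOptimizer optimizer;

        synchronized StatefulOptimizer getOptimizer() {
            if (optimizer == null) {
                optimizer = new StatefulOptimizer();
            }
            return optimizer;
        }
    }

    public static void main(String[] args) {
        StatefulConfig conf = new StatefulConfig();
        conf.getOptimizer().optimize("query 1");
        // Same instance is returned, so its state carries over between callers.
        System.out.println(conf.getOptimizer().optimize("query 2"));
    }
}
```

If the optimizer turns out to be stateless, constructing a new one per query would be equally safe and the shared accessor would be unnecessary.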
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308649#comment-16308649 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159306588 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryOptimizer.java --- @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import org.apache.hadoop.conf.Configurable; +import org.apache.hadoop.conf.Configuration; +import org.apache.log4j.Logger; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.openrdf.query.BindingSet; +import org.openrdf.query.Dataset; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.algebra.evaluation.QueryOptimizer; + +public class AggregationPipelineQueryOptimizer implements QueryOptimizer, Configurable { --- End diff -- Docs. 
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159307808 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -63,6 +63,10 @@ public static final String STATEMENT_METADATA = "statementMetadata"; public static final String DOCUMENT_VISIBILITY = "documentVisibility"; +public static String hash(String value) { --- End diff -- Docs. ---
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308658#comment-16308658 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159307808 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -63,6 +63,10 @@ public static final String STATEMENT_METADATA = "statementMetadata"; public static final String DOCUMENT_VISIBILITY = "documentVisibility"; +public static String hash(String value) { --- End diff -- Docs.
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159307607 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/document/util/DocumentVisibilityUtil.java --- @@ -328,4 +328,22 @@ public static boolean doesUserHaveDocumentAccess(final Authorizations authorizat } return list.toArray(new Object[0]); } + +/** + * Converts a {@link List} into an array of {@link Object}s. --- End diff -- It might be worth mentioning that it flattens lists that are found within inputList into the returned array. What does this do to other data structures though? You'll leave a Set or Map as such in the array? Are other collections just allowed to be in the returned array? ---
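To make the reviewer's question concrete, the sketch below shows one possible behavior for such a method: nested `List`s are flattened recursively into the returned array, while any other collection (a `Set`, a `Map`) is kept as a single element. This is an assumption about what the method might do, not the actual `DocumentVisibilityUtil` implementation, which is not visible in the diff excerpt. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative only: flattens nested Lists, leaves other collections intact. */
final class FlattenSketch {

    /** Converts a list into an Object[], expanding any nested Lists in place. */
    static Object[] flatten(List<?> inputList) {
        List<Object> out = new ArrayList<>();
        collect(inputList, out);
        return out.toArray(new Object[0]);
    }

    /** Depth-first walk: recurse into Lists, append everything else as-is. */
    private static void collect(List<?> list, List<Object> out) {
        for (Object element : list) {
            if (element instanceof List) {
                collect((List<?>) element, out);
            } else {
                // Sets, Maps, and scalars are not expanded.
                out.add(element);
            }
        }
    }

    public static void main(String[] args) {
        List<Object> nested = Arrays.asList(
                "a",
                Arrays.asList("b", Arrays.asList("c")),
                Collections.singleton("d"));
        // prints [a, b, c, [d]]
        System.out.println(Arrays.toString(flatten(nested)));
    }
}
```

Whatever the real method does with non-`List` collections, stating it in the Javadoc would answer the question raised in the review.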
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308688#comment-16308688 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159311494 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.CONTEXT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.DOCUMENT_VISIBILITY; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_TYPE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.STATEMENT_METADATA; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.TIMESTAMP; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; +import java.util.NavigableSet; +import java.util.Set; +import java.util.UUID; +import java.util.concurrent.ConcurrentSkipListSet; +import java.util.function.Function; + +import org.apache.rya.api.domain.RyaStatement; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.domain.RyaURI; +import org.apache.rya.api.domain.StatementMetadata; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.mongodb.MongoDbRdfConstants; +import org.apache.rya.mongodb.dao.MongoDBStorageStrategy; +import org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy; +import org.apache.rya.mongodb.document.operators.query.ConditionalOperators; +import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.Literal; +import 
org.openrdf.model.Resource; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Compare; +import org.openrdf.query.algebra.ExtensionElem; +import org.openrdf.query.algebra.ProjectionElem; +import org.openrdf.query.algebra.ProjectionElemList; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.ValueConstant; +import org.openrdf.query.algebra.ValueExpr; +import org.openrdf.query.algebra.Var; +import org.openrdf.query.algebra.evaluation.impl.ExternalSet; + +import com.google.common.base.Preconditions; +import com.google.common.collect.BiMap; +import com.google.common.collect.HashBiMap; +import com.mongodb.BasicDBObject; +import com.mongodb.DBObject; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.model.Aggregates; +import com.mongodb.client.model.BsonField; +import com.mongodb.client.model.Filters; +import com.mongodb.client.model.Projections; + +import info.aduna.iteration.CloseableIteration; + +/** + * Represents a portion of a query tree as MongoDB aggregation pipeline. Should + * be built bottom-up: start with a statement pattern implemented as a
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308687#comment-16308687 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159311056 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.CONTEXT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.DOCUMENT_VISIBILITY; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_TYPE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.STATEMENT_METADATA; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.TIMESTAMP; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; +import java.util.NavigableSet; +import java.util.Set; +import java.util.UUID; +import java.util.concurrent.ConcurrentSkipListSet; +import java.util.function.Function; + +import org.apache.rya.api.domain.RyaStatement; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.domain.RyaURI; +import org.apache.rya.api.domain.StatementMetadata; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.mongodb.MongoDbRdfConstants; +import org.apache.rya.mongodb.dao.MongoDBStorageStrategy; +import org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy; +import org.apache.rya.mongodb.document.operators.query.ConditionalOperators; +import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.Literal; +import 
org.openrdf.model.Resource; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Compare; +import org.openrdf.query.algebra.ExtensionElem; +import org.openrdf.query.algebra.ProjectionElem; +import org.openrdf.query.algebra.ProjectionElemList; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.ValueConstant; +import org.openrdf.query.algebra.ValueExpr; +import org.openrdf.query.algebra.Var; +import org.openrdf.query.algebra.evaluation.impl.ExternalSet; + +import com.google.common.base.Preconditions; +import com.google.common.collect.BiMap; +import com.google.common.collect.HashBiMap; +import com.mongodb.BasicDBObject; +import com.mongodb.DBObject; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.model.Aggregates; +import com.mongodb.client.model.BsonField; +import com.mongodb.client.model.Filters; +import com.mongodb.client.model.Projections; + +import info.aduna.iteration.CloseableIteration; + +/** + * Represents a portion of a query tree as MongoDB aggregation pipeline. Should + * be built bottom-up: start with a statement pattern implemented as a
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308689#comment-16308689 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159310844 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.CONTEXT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.DOCUMENT_VISIBILITY; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_TYPE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.STATEMENT_METADATA; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.TIMESTAMP; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; +import java.util.NavigableSet; +import java.util.Set; +import java.util.UUID; +import java.util.concurrent.ConcurrentSkipListSet; +import java.util.function.Function; + +import org.apache.rya.api.domain.RyaStatement; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.domain.RyaURI; +import org.apache.rya.api.domain.StatementMetadata; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.mongodb.MongoDbRdfConstants; +import org.apache.rya.mongodb.dao.MongoDBStorageStrategy; +import org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy; +import org.apache.rya.mongodb.document.operators.query.ConditionalOperators; +import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.Literal; +import 
org.openrdf.model.Resource; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Compare; +import org.openrdf.query.algebra.ExtensionElem; +import org.openrdf.query.algebra.ProjectionElem; +import org.openrdf.query.algebra.ProjectionElemList; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.ValueConstant; +import org.openrdf.query.algebra.ValueExpr; +import org.openrdf.query.algebra.Var; +import org.openrdf.query.algebra.evaluation.impl.ExternalSet; + +import com.google.common.base.Preconditions; +import com.google.common.collect.BiMap; +import com.google.common.collect.HashBiMap; +import com.mongodb.BasicDBObject; +import com.mongodb.DBObject; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.model.Aggregates; +import com.mongodb.client.model.BsonField; +import com.mongodb.client.model.Filters; +import com.mongodb.client.model.Projections; + +import info.aduna.iteration.CloseableIteration; + +/** + * Represents a portion of a query tree as MongoDB aggregation pipeline. Should + * be built bottom-up: start with a statement pattern implemented as a
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308686#comment-16308686 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159307900 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/SparqlToPipelineTransformVisitor.java --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Arrays; + +import org.apache.rya.mongodb.MongoConnectorFactory; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.bson.Document; +import org.openrdf.query.algebra.Distinct; +import org.openrdf.query.algebra.Extension; +import org.openrdf.query.algebra.Filter; +import org.openrdf.query.algebra.Join; +import org.openrdf.query.algebra.MultiProjection; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.Reduced; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.helpers.QueryModelVisitorBase; + +import com.mongodb.MongoClient; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.MongoDatabase; + +public class SparqlToPipelineTransformVisitor extends QueryModelVisitorBase { --- End diff -- Docs. > Add the ability use the MongoDB aggregation pipeline to evaluate simple > SPARQL expressions > -- > > Key: RYA-416 > URL: https://issues.apache.org/jira/browse/RYA-416 > Project: Rya > Issue Type: New Feature >Reporter: Jesse Hatfield >Assignee: Jesse Hatfield > > MongoDB provides the [aggregation pipeline > framework|https://docs.mongodb.com/manual/core/aggregation-pipeline/] for > multi-stage data processing. Currently, the query engine invokes this > framework to apply individual statement patterns (using a "$match" expression > for each and iterating through the results), then applies higher-level query > operations (join, filter, select, project, etc) client-side. > In principle, those high-level query operations could be rewritten as > aggregation pipeline stages as well ($group, $match, $project, etc). This > would allow more query evaluation logic to be executed by the MongoDB server > itself, enabling server-side optimization. 
This could be used as a general > query optimization, but would additionally be useful for any tool that only > needed to write query results back to the server: adding a write step to the > end of the resulting pipeline could obviate the need to communicate > individual results to the client at all. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308684#comment-16308684 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159308088 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/SparqlToPipelineTransformVisitor.java --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Arrays; + +import org.apache.rya.mongodb.MongoConnectorFactory; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.bson.Document; +import org.openrdf.query.algebra.Distinct; +import org.openrdf.query.algebra.Extension; +import org.openrdf.query.algebra.Filter; +import org.openrdf.query.algebra.Join; +import org.openrdf.query.algebra.MultiProjection; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.Reduced; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.helpers.QueryModelVisitorBase; + +import com.mongodb.MongoClient; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.MongoDatabase; + +public class SparqlToPipelineTransformVisitor extends QueryModelVisitorBase { +private MongoCollection inputCollection; + +public SparqlToPipelineTransformVisitor(MongoCollection inputCollection) { +this.inputCollection = inputCollection; +} + +public SparqlToPipelineTransformVisitor(MongoDBRdfConfiguration conf) { +MongoClient mongo = MongoConnectorFactory.getMongoClient(conf); --- End diff -- MongoConnectorFactory no longer exists after RYA-414. You will receive a stateful configuration object that contains the MongoClient.
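The review comment above describes replacing a static client factory with a configuration object that carries the client. A minimal sketch of that shape follows. It is an illustration only: StubClient, StatefulConfigSketch and VisitorSketch are hypothetical stand-ins invented here so the example compiles without the MongoDB driver, and they are not the classes introduced by RYA-414.

```java
import java.util.Objects;

// Hypothetical stand-in for com.mongodb.MongoClient (assumption, not the real driver class).
final class StubClient {
    private final String address;

    StubClient(String address) {
        this.address = address;
    }

    String getAddress() {
        return address;
    }
}

// Hypothetical stand-in for a stateful configuration: it holds the shared client
// itself, so no static factory or static holder is involved.
final class StatefulConfigSketch {
    private final StubClient client;

    StatefulConfigSketch(StubClient client) {
        this.client = Objects.requireNonNull(client, "client");
    }

    StubClient getClient() {
        return client;
    }
}

// Hypothetical visitor-like class that takes its client from the configuration it is handed.
final class VisitorSketch {
    private final StubClient client;

    VisitorSketch(StatefulConfigSketch conf) {
        this.client = Objects.requireNonNull(conf, "conf").getClient();
    }

    String describe() {
        return "using " + client.getAddress();
    }
}

class StatefulConfigDemo {
    public static void main(String[] args) {
        // Two independent configurations can coexist in one JVM, each with its own client.
        VisitorSketch a = new VisitorSketch(new StatefulConfigSketch(new StubClient("hostA:27017")));
        VisitorSketch b = new VisitorSketch(new StatefulConfigSketch(new StubClient("hostB:27017")));
        System.out.println(a.describe());
        System.out.println(b.describe());
    }
}
```

Because the client travels with the configuration instance, nothing in this sketch ties every consumer in the JVM to one shared connection, which is the limitation the RYA-414 description attributes to the static holder.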
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159311494 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.CONTEXT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.DOCUMENT_VISIBILITY; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_TYPE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.STATEMENT_METADATA; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.TIMESTAMP; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; +import java.util.NavigableSet; +import java.util.Set; +import java.util.UUID; +import java.util.concurrent.ConcurrentSkipListSet; +import java.util.function.Function; + +import org.apache.rya.api.domain.RyaStatement; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.domain.RyaURI; +import org.apache.rya.api.domain.StatementMetadata; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.mongodb.MongoDbRdfConstants; +import org.apache.rya.mongodb.dao.MongoDBStorageStrategy; +import org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy; +import org.apache.rya.mongodb.document.operators.query.ConditionalOperators; +import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.Literal; +import 
org.openrdf.model.Resource; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Compare; +import org.openrdf.query.algebra.ExtensionElem; +import org.openrdf.query.algebra.ProjectionElem; +import org.openrdf.query.algebra.ProjectionElemList; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.ValueConstant; +import org.openrdf.query.algebra.ValueExpr; +import org.openrdf.query.algebra.Var; +import org.openrdf.query.algebra.evaluation.impl.ExternalSet; + +import com.google.common.base.Preconditions; +import com.google.common.collect.BiMap; +import com.google.common.collect.HashBiMap; +import com.mongodb.BasicDBObject; +import com.mongodb.DBObject; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.model.Aggregates; +import com.mongodb.client.model.BsonField; +import com.mongodb.client.model.Filters; +import com.mongodb.client.model.Projections; + +import info.aduna.iteration.CloseableIteration; + +/** + * Represents a portion of a query tree as MongoDB aggregation pipeline. Should + * be built bottom-up: start with a statement pattern implemented as a $match + * step, then add steps to the pipeline to handle higher levels of the query + * tree. Methods are provided to add certain supported query operations to the + * end of the internal pipeline. In some cases,
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308685#comment-16308685 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159307946 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/SparqlToPipelineTransformVisitor.java --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Arrays; + +import org.apache.rya.mongodb.MongoConnectorFactory; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.bson.Document; +import org.openrdf.query.algebra.Distinct; +import org.openrdf.query.algebra.Extension; +import org.openrdf.query.algebra.Filter; +import org.openrdf.query.algebra.Join; +import org.openrdf.query.algebra.MultiProjection; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.Reduced; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.helpers.QueryModelVisitorBase; + +import com.mongodb.MongoClient; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.MongoDatabase; + +public class SparqlToPipelineTransformVisitor extends QueryModelVisitorBase { +private MongoCollection inputCollection; + +public SparqlToPipelineTransformVisitor(MongoCollection inputCollection) { +this.inputCollection = inputCollection; --- End diff -- Is this allowed to be null?
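If the answer to the question above is "no", one common way to make that explicit is to reject null in the constructor. The sketch below assumes that answer and uses java.util.Objects from the standard library (Guava's Preconditions.checkNotNull would serve the same purpose). NullCheckedVisitor is a hypothetical stand-in, and the collection type is a type parameter only so that the example does not need the MongoDB driver.

```java
import java.util.Objects;

// Hypothetical stand-in for the visitor under review (assumption, not the PR's class).
final class NullCheckedVisitor<C> {
    private final C inputCollection;

    NullCheckedVisitor(C inputCollection) {
        // Fail fast at construction time instead of later, during query evaluation.
        this.inputCollection =
                Objects.requireNonNull(inputCollection, "inputCollection must not be null");
    }

    C getInputCollection() {
        return inputCollection;
    }
}

class NullCheckDemo {
    public static void main(String[] args) {
        System.out.println(new NullCheckedVisitor<>("triples").getInputCollection());
        try {
            new NullCheckedVisitor<String>(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If null is in fact a legal value, the alternative would be to document that in the constructor's javadoc rather than add the check.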
[jira] [Commented] (RYA-104) Mongo Support in Rya Administrative Shell
[ https://issues.apache.org/jira/browse/RYA-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308692#comment-16308692 ] ASF GitHub Bot commented on RYA-104: Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/258 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/587/
Failed Tests: 3
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3
org.apache.rya.prospector.mr.ProspectorTest.testCount
org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount
org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount
> Mongo Support in Rya Administrative Shell > - > > Key: RYA-104 > URL: https://issues.apache.org/jira/browse/RYA-104 > Project: Rya > Issue Type: New Feature >Affects Versions: 3.2.9 >Reporter: Andrew Smith >Assignee: David W. Lotts > Fix For: 3.2.10 > > > Implement mongo support for the admin shell. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (RYA-414) Fix inconsistent method of using MongoClient within a Sail object
[ https://issues.apache.org/jira/browse/RYA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308745#comment-16308745 ] ASF GitHub Bot commented on RYA-414: Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159309851 --- Diff: dao/mongodb.rya/src/test/java/org/apache/rya/mongodb/EmbeddedMongoSingleton.java --- @@ -19,20 +19,43 @@ package org.apache.rya.mongodb; import java.io.IOException; +import java.net.UnknownHostException; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import com.mongodb.MongoClient; +import com.mongodb.MongoException; + +import de.flapdoodle.embed.mongo.config.IMongodConfig; /** * To be used for tests. Creates a singleton {@link MongoClient} to be used * throughout all of the MongoDB related tests. Without the singleton, the * embedded mongo factory ends up orphaning processes, consuming resources. */ public class EmbeddedMongoSingleton { -public static MongoClient getInstance() { -return InstanceHolder.SINGLETON.instance; + +public static MongoClient getNewMongoClient() throws UnknownHostException, MongoException { + final MongoClient client = InstanceHolder.SINGLETON.factory.newMongoClient(); + +Runtime.getRuntime().addShutdownHook(new Thread() { +@Override +public void run() { +try { +client.close(); +} catch (final Throwable t) { +// logging frameworks will likely be shut down +t.printStackTrace(System.err); +} +} +}); + +return client; +} + +public static IMongodConfig getMongodConfig() { --- End diff -- javadocs > Fix inconsistent method of using MongoClient within a Sail object > - > > Key: RYA-414 > URL: https://issues.apache.org/jira/browse/RYA-414 > Project: Rya > Issue Type: Bug >Reporter: Andrew Smith >Assignee: Kevin Chilton >Priority: Blocker > > We've decided to introduce a stateful configuration object that is used to > hold onto objects that needs to be shared across components within a Sail > object in favor of 
having a static holder. The static holder limited our > ability to use more than a single Sail object within a single JVM because all > of those Sail objects would all have to access the same Mongo DB repository. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (RYA-414) Fix inconsistent method of using MongoClient within a Sail object
[ https://issues.apache.org/jira/browse/RYA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308747#comment-16308747 ] ASF GitHub Bot commented on RYA-414: Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159313589 --- Diff: extras/rya.geoindexing/geo.common/src/main/java/org/apache/rya/indexing/geotemporal/GeoTemporalIndexer.java --- @@ -33,15 +32,15 @@ /** * initialize after setting configuration. */ -public void init(); +@Override + public void init(); --- End diff -- spacing got thrown off
[jira] [Commented] (RYA-414) Fix inconsistent method of using MongoClient within a Sail object
[ https://issues.apache.org/jira/browse/RYA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308748#comment-16308748 ] ASF GitHub Bot commented on RYA-414: Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159308100 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/EmbeddedMongoFactory.java --- @@ -79,19 +79,27 @@ private int findRandomOpenPortOnAllLocalInterfaces() throws IOException { /** * Creates a new Mongo connection. - * + * * @throws MongoException * @throws UnknownHostException */ public MongoClient newMongoClient() throws UnknownHostException, MongoException { return new MongoClient(new ServerAddress(mongodProcess.getConfig().net().getServerAddress(), mongodProcess.getConfig().net().getPort())); } +/** + * Gives access to the process configuration. --- End diff -- add @return
[jira] [Commented] (RYA-414) Fix inconsistent method of using MongoClient within a Sail object
[ https://issues.apache.org/jira/browse/RYA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308749#comment-16308749 ] ASF GitHub Bot commented on RYA-414: Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159311393 --- Diff: dao/mongodb.rya/src/test/java/org/apache/rya/mongodb/MongoTestBase.java --- @@ -33,27 +34,43 @@ */ public class MongoTestBase { -private static MongoClient mongoClient = null; -protected static MongoDBRdfConfiguration conf; +private MongoClient mongoClient = null; +protected StatefulMongoDBRdfConfiguration conf; @Before public void setupTest() throws Exception { -conf = new MongoDBRdfConfiguration( new Configuration() ); +// Setup the configuration that will be used within the test. +final MongoDBRdfConfiguration conf = new MongoDBRdfConfiguration( new Configuration() ); conf.setBoolean("sc.useMongo", true); conf.setTablePrefix("test_"); conf.setMongoDBName("testDB"); -mongoClient = EmbeddedMongoSingleton.getInstance(); -conf.setMongoClient(mongoClient); -} + conf.setMongoHostname(EmbeddedMongoSingleton.getMongodConfig().net().getServerAddress().getHostAddress()); + conf.setMongoPort(Integer.toString(EmbeddedMongoSingleton.getMongodConfig().net().getPort())); + +// Let tests update the configuration. +updateConfiguration(conf); + +// Create the stateful configuration object. +mongoClient = EmbeddedMongoSingleton.getNewMongoClient(); +final List indexers = conf.getInstances("ac.additional.indexers", MongoSecondaryIndex.class); +this.conf = new StatefulMongoDBRdfConfiguration(conf, mongoClient, indexers); -@After -public void cleanupTest() { -// Remove any DBs that were created by the test. +// Remove any DBs that were created by previous tests. 
for(final String dbName : mongoClient.listDatabaseNames()) { mongoClient.dropDatabase(dbName); } } +/** + * Override this method if you would like to augment the configuration object that + * will be used to initialize indexers and create the mongo client prior to running a test. + * + * @param conf - The configuration object that may be updated. (not null) + */ +protected void updateConfiguration(final MongoDBRdfConfiguration conf) { --- End diff -- Do any classes not override this? (I haven't looked to verify) If most or all classes end up overriding this then this should be changed to an abstract method in an abstract class.
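The two designs weighed in the comment above can be sketched side by side. Everything here is hypothetical and simplified: ConfSketch stands in for MongoDBRdfConfiguration, the class names are invented for illustration, and the "tablePrefix" key is an assumed example rather than a real Rya configuration key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for MongoDBRdfConfiguration; a key/value store is enough here.
final class ConfSketch {
    final Map<String, String> values = new HashMap<>();
}

// Shape used in the diff: a concrete base class whose hook does nothing unless overridden.
class OptionalHookBase {
    protected void updateConfiguration(ConfSketch conf) {
        // Intentionally empty: subclasses may override, but are not forced to.
    }

    final ConfSketch setup() {
        ConfSketch conf = new ConfSketch();
        conf.values.put("sc.useMongo", "true");
        updateConfiguration(conf); // let the subclass adjust the configuration
        return conf;
    }
}

// Shape suggested in the comment: every subclass must supply the hook.
abstract class RequiredHookBase {
    protected abstract void updateConfiguration(ConfSketch conf);

    final ConfSketch setup() {
        ConfSketch conf = new ConfSketch();
        conf.values.put("sc.useMongo", "true");
        updateConfiguration(conf);
        return conf;
    }
}

// A test that needs no extra configuration can simply inherit the empty hook.
final class PlainTest extends OptionalHookBase {
}

// With the abstract variant the compiler rejects a subclass that omits the hook.
final class PrefixedTest extends RequiredHookBase {
    @Override
    protected void updateConfiguration(ConfSketch conf) {
        conf.values.put("tablePrefix", "test_"); // assumed key, for illustration only
    }
}

class HookDemo {
    public static void main(String[] args) {
        System.out.println(new PlainTest().setup().values);
        System.out.println(new PrefixedTest().setup().values);
    }
}
```

The optional hook keeps tests that need no extra configuration free of boilerplate. The abstract variant makes the hook impossible to forget, which only pays off if nearly every subclass would override it anyway, as the comment notes.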
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159306746 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryOptimizer.java --- @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import org.apache.hadoop.conf.Configurable; +import org.apache.hadoop.conf.Configuration; +import org.apache.log4j.Logger; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.openrdf.query.BindingSet; +import org.openrdf.query.Dataset; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.algebra.evaluation.QueryOptimizer; + +public class AggregationPipelineQueryOptimizer implements QueryOptimizer, Configurable { +private Configuration conf; +private Logger logger = Logger.getLogger(getClass()); + +@Override +public void optimize(TupleExpr tupleExpr, Dataset dataset, BindingSet bindings) { +if (conf instanceof MongoDBRdfConfiguration) { --- End diff -- This will likely change into a StatefulMongoDBRdfConfiguration in 414 (which is a MongoDBRdfConfiguration). Is anything about your optimizer stateful? 
If so, you don't want to make more than one instance of it, so it should be shared within that config. ---
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308657#comment-16308657 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159307607 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/document/util/DocumentVisibilityUtil.java --- @@ -328,4 +328,22 @@ public static boolean doesUserHaveDocumentAccess(final Authorizations authorizat } return list.toArray(new Object[0]); } + +/** + * Converts a {@link List} into an array of {@link Object}s. --- End diff -- It might be worth mentioning that it flattens lists that are found within inputList into the returned array. What does this do to other data structures though? You'll leave a Set or Map as such in the array? Are other collections just allowed to be in the returned array?
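The behaviour the comment above asks about can be made concrete with a small example. The method body from the PR is not shown in this message, so the sketch below is not that method. It is a hypothetical implementation (FlattenSketch.toFlatArray) of the behaviour as the reviewer describes it: nested Lists are flattened into the result, while any other collection, such as a Set or Map, is kept as a single element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of "flatten nested Lists only"; not the method under review.
final class FlattenSketch {

    static Object[] toFlatArray(List<?> input) {
        List<Object> out = new ArrayList<>();
        collect(input, out);
        return out.toArray(new Object[0]);
    }

    private static void collect(List<?> input, List<Object> out) {
        for (Object item : input) {
            if (item instanceof List) {
                // Nested lists are expanded in place, at any depth.
                collect((List<?>) item, out);
            } else {
                // Everything else, including Sets and Maps, is added unchanged.
                out.add(item);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>(Collections.singletonList("x"));
        Object[] result =
                toFlatArray(Arrays.asList("a", Arrays.asList("b", Arrays.asList("c")), set));
        // The elements are "a", "b", "c" and then the Set itself as one element.
        System.out.println(Arrays.toString(result));
    }
}
```

Whatever the real method does with non-List collections, stating it in the javadoc, as the comment suggests, would spare callers from having to read the implementation.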
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159306671 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryOptimizer.java --- @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import org.apache.hadoop.conf.Configurable; +import org.apache.hadoop.conf.Configuration; +import org.apache.log4j.Logger; --- End diff -- We're using slf4j instead of log4j. ---
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159310844 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.CONTEXT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.DOCUMENT_VISIBILITY; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_TYPE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.STATEMENT_METADATA; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT_HASH; +import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.TIMESTAMP; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; +import java.util.NavigableSet; +import java.util.Set; +import java.util.UUID; +import java.util.concurrent.ConcurrentSkipListSet; +import java.util.function.Function; + +import org.apache.rya.api.domain.RyaStatement; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.domain.RyaURI; +import org.apache.rya.api.domain.StatementMetadata; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.mongodb.MongoDbRdfConstants; +import org.apache.rya.mongodb.dao.MongoDBStorageStrategy; +import org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy; +import org.apache.rya.mongodb.document.operators.query.ConditionalOperators; +import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.Literal; +import 
org.openrdf.model.Resource; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Compare; +import org.openrdf.query.algebra.ExtensionElem; +import org.openrdf.query.algebra.ProjectionElem; +import org.openrdf.query.algebra.ProjectionElemList; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.ValueConstant; +import org.openrdf.query.algebra.ValueExpr; +import org.openrdf.query.algebra.Var; +import org.openrdf.query.algebra.evaluation.impl.ExternalSet; + +import com.google.common.base.Preconditions; +import com.google.common.collect.BiMap; +import com.google.common.collect.HashBiMap; +import com.mongodb.BasicDBObject; +import com.mongodb.DBObject; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.model.Aggregates; +import com.mongodb.client.model.BsonField; +import com.mongodb.client.model.Filters; +import com.mongodb.client.model.Projections; + +import info.aduna.iteration.CloseableIteration; + +/** + * Represents a portion of a query tree as MongoDB aggregation pipeline. Should + * be built bottom-up: start with a statement pattern implemented as a $match + * step, then add steps to the pipeline to handle higher levels of the query + * tree. Methods are provided to add certain supported query operations to the + * end of the internal pipeline. In some cases,
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159308088 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/SparqlToPipelineTransformVisitor.java --- @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Arrays; + +import org.apache.rya.mongodb.MongoConnectorFactory; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.bson.Document; +import org.openrdf.query.algebra.Distinct; +import org.openrdf.query.algebra.Extension; +import org.openrdf.query.algebra.Filter; +import org.openrdf.query.algebra.Join; +import org.openrdf.query.algebra.MultiProjection; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.Reduced; +import org.openrdf.query.algebra.StatementPattern; +import org.openrdf.query.algebra.helpers.QueryModelVisitorBase; + +import com.mongodb.MongoClient; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.MongoDatabase; + +public class SparqlToPipelineTransformVisitor extends QueryModelVisitorBase { +private MongoCollection inputCollection; + +public SparqlToPipelineTransformVisitor(MongoCollection inputCollection) { +this.inputCollection = inputCollection; +} + +public SparqlToPipelineTransformVisitor(MongoDBRdfConfiguration conf) { +MongoClient mongo = MongoConnectorFactory.getMongoClient(conf); --- End diff -- MongoConnectorFactory no longer exists after RYA-414. You will receive a stateful configuration object that contains the MongoClient. ---
[jira] [Commented] (RYA-104) Mongo Support in Rya Administrative Shell
[ https://issues.apache.org/jira/browse/RYA-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308682#comment-16308682 ] ASF GitHub Bot commented on RYA-104: Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/258 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/586/ > Mongo Support in Rya Administrative Shell > - > > Key: RYA-104 > URL: https://issues.apache.org/jira/browse/RYA-104 > Project: Rya > Issue Type: New Feature >Affects Versions: 3.2.9 >Reporter: Andrew Smith >Assignee: David W. Lotts > Fix For: 3.2.10 > > > Implement mongo support for the admin shell. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] incubator-rya issue #258: RYA-104 Mongo Rya Shell Integration
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/258 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/587/ Failed Tests: 3. incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3 (org.apache.rya.prospector.mr.ProspectorTest.testCount; org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount; org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount) ---
[GitHub] incubator-rya pull request #256: RYA-414
Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159313589 --- Diff: extras/rya.geoindexing/geo.common/src/main/java/org/apache/rya/indexing/geotemporal/GeoTemporalIndexer.java --- @@ -33,15 +32,15 @@ /** * initialize after setting configuration. */ -public void init(); +@Override + public void init(); --- End diff -- spacing got thrown off ---
[GitHub] incubator-rya pull request #256: RYA-414
Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159311393 --- Diff: dao/mongodb.rya/src/test/java/org/apache/rya/mongodb/MongoTestBase.java --- @@ -33,27 +34,43 @@ */ public class MongoTestBase { -private static MongoClient mongoClient = null; -protected static MongoDBRdfConfiguration conf; +private MongoClient mongoClient = null; +protected StatefulMongoDBRdfConfiguration conf; @Before public void setupTest() throws Exception { -conf = new MongoDBRdfConfiguration( new Configuration() ); +// Setup the configuration that will be used within the test. +final MongoDBRdfConfiguration conf = new MongoDBRdfConfiguration( new Configuration() ); conf.setBoolean("sc.useMongo", true); conf.setTablePrefix("test_"); conf.setMongoDBName("testDB"); -mongoClient = EmbeddedMongoSingleton.getInstance(); -conf.setMongoClient(mongoClient); -} + conf.setMongoHostname(EmbeddedMongoSingleton.getMongodConfig().net().getServerAddress().getHostAddress()); + conf.setMongoPort(Integer.toString(EmbeddedMongoSingleton.getMongodConfig().net().getPort())); + +// Let tests update the configuration. +updateConfiguration(conf); + +// Create the stateful configuration object. +mongoClient = EmbeddedMongoSingleton.getNewMongoClient(); +final List indexers = conf.getInstances("ac.additional.indexers", MongoSecondaryIndex.class); +this.conf = new StatefulMongoDBRdfConfiguration(conf, mongoClient, indexers); -@After -public void cleanupTest() { -// Remove any DBs that were created by the test. +// Remove any DBs that were created by previous tests. for(final String dbName : mongoClient.listDatabaseNames()) { mongoClient.dropDatabase(dbName); } } +/** + * Override this method if you would like to augment the configuration object that + * will be used to initialize indexers and create the mongo client prior to running a test. + * + * @param conf - The configuration object that may be updated. 
(not null) + */ +protected void updateConfiguration(final MongoDBRdfConfiguration conf) { --- End diff -- Do any classes not override this? (I haven't looked to verify) If most or all classes end up overriding this then this should be changed to an abstract method in an abstract class. ---
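The trade-off raised in this comment can be sketched as follows. This is an illustration only: `HookSketch`, `TestBase`, `PlainTest` and `CustomTest` are hypothetical names, `java.util.Properties` stands in for the real configuration class, and `example.extra.setting` is an invented key.

```java
import java.util.Properties;

public class HookSketch {

    /** Stand-in for a shared test base class. */
    public static class TestBase {
        private Properties conf;

        /** Builds the defaults, lets the subclass adjust them, then stores the result. */
        public final void setupTest() {
            final Properties draft = new Properties();
            draft.setProperty("sc.useMongo", "true");
            updateConfiguration(draft);
            this.conf = draft;
        }

        /** No-op hook: subclasses override it only when they need extra settings. */
        protected void updateConfiguration(final Properties conf) {
            // Default: nothing to add.
        }

        public Properties getConf() {
            return conf;
        }
    }

    /** A test that is happy with the defaults does not have to write anything. */
    public static class PlainTest extends TestBase {
    }

    /** A test that needs more configuration overrides the hook. */
    public static class CustomTest extends TestBase {
        @Override
        protected void updateConfiguration(final Properties conf) {
            conf.setProperty("example.extra.setting", "enabled"); // hypothetical key
        }
    }

    public static void main(final String[] args) {
        final PlainTest plain = new PlainTest();
        plain.setupTest();
        final CustomTest custom = new CustomTest();
        custom.setupTest();
        System.out.println(plain.getConf().size() + " " + custom.getConf().size());
    }
}
```

If every subclass turned out to look like `CustomTest`, declaring the base class abstract with `protected abstract void updateConfiguration(...)` would force each test to state its configuration explicitly; while classes like `PlainTest` exist, the no-op default saves them an empty override.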
[GitHub] incubator-rya pull request #256: RYA-414
Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159308100 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/EmbeddedMongoFactory.java --- @@ -79,19 +79,27 @@ private int findRandomOpenPortOnAllLocalInterfaces() throws IOException { /** * Creates a new Mongo connection. - * + * * @throws MongoException * @throws UnknownHostException */ public MongoClient newMongoClient() throws UnknownHostException, MongoException { return new MongoClient(new ServerAddress(mongodProcess.getConfig().net().getServerAddress(), mongodProcess.getConfig().net().getPort())); } +/** + * Gives access to the process configuration. --- End diff -- add @return ---
[jira] [Commented] (RYA-414) Fix inconsistent method of using MongoClient within a Sail object
[ https://issues.apache.org/jira/browse/RYA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308744#comment-16308744 ] ASF GitHub Bot commented on RYA-414: Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159309681 --- Diff: dao/mongodb.rya/src/test/java/org/apache/rya/mongodb/MongoDBRyaDAOIT.java --- @@ -503,8 +526,11 @@ public void testVisibility() throws RyaDAOException, MongoException, IOException * in the collection. {@code false} otherwise. * @throws RyaDAOException */ -private boolean testVisibilityStatement(final String documentVisibility, final Authorizations userAuthorizations) throws RyaDAOException { -final MongoDatabase db = client.getDatabase(conf.get(MongoDBRdfConfiguration.MONGO_DB_NAME)); +private boolean testVisibilityStatement( --- End diff -- Update javadocs with new @param > Fix inconsistent method of using MongoClient within a Sail object > - > > Key: RYA-414 > URL: https://issues.apache.org/jira/browse/RYA-414 > Project: Rya > Issue Type: Bug >Reporter: Andrew Smith >Assignee: Kevin Chilton >Priority: Blocker > > We've decided to introduce a stateful configuration object that holds the > objects that need to be shared across components within a Sail object, > instead of using a static holder. The static holder limited our ability to > use more than a single Sail object within a single JVM because all of those > Sail objects would have to access the same MongoDB repository. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
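The limitation described in the issue can be illustrated with a small sketch. Nothing here is Rya or MongoDB driver code: `Client`, `StaticHolder` and `StatefulConfig` are hypothetical stand-ins, and the repository names are invented. The point is only that a JVM-wide static holder hands every caller the same client, while a configuration object that carries its own client lets two Sail-like components point at different repositories.

```java
import java.util.Objects;

public class StatefulConfigSketch {

    /** Hypothetical stand-in for a database client. */
    public static final class Client {
        private final String repository;

        public Client(final String repository) {
            this.repository = repository;
        }

        public String getRepository() {
            return repository;
        }
    }

    /** Static holder: the first caller decides which client the whole JVM uses. */
    public static final class StaticHolder {
        private static Client client;

        public static synchronized Client getClient(final String repository) {
            if (client == null) {
                client = new Client(repository);
            }
            // Later callers receive the first client, whatever they asked for.
            return client;
        }
    }

    /** Stateful configuration: each component is handed its own client. */
    public static final class StatefulConfig {
        private final Client client;

        public StatefulConfig(final Client client) {
            this.client = Objects.requireNonNull(client, "client must not be null");
        }

        public Client getClient() {
            return client;
        }
    }

    public static void main(final String[] args) {
        StaticHolder.getClient("repoA");
        System.out.println(StaticHolder.getClient("repoB").getRepository()); // prints repoA

        final StatefulConfig first = new StatefulConfig(new Client("repoA"));
        final StatefulConfig second = new StatefulConfig(new Client("repoB"));
        System.out.println(first.getClient().getRepository() + " " + second.getClient().getRepository());
    }
}
```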
[jira] [Commented] (RYA-414) Fix inconsistent method of using MongoClient within a Sail object
[ https://issues.apache.org/jira/browse/RYA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308750#comment-16308750 ] ASF GitHub Bot commented on RYA-414: Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159308896 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBNamespaceManager.java --- @@ -37,147 +37,145 @@ public class SimpleMongoDBNamespaceManager implements MongoDBNamespaceManager { - public class NamespaceImplementation implements Namespace { +public class NamespaceImplementation implements Namespace { - private final String namespace; - private final String prefix; +private final String namespace; +private final String prefix; - public NamespaceImplementation(final String namespace, final String prefix) { - this.namespace = namespace; - this.prefix = prefix; - } +public NamespaceImplementation(final String namespace, final String prefix) { +this.namespace = namespace; +this.prefix = prefix; +} - @Override - public int compareTo(final Namespace o) { - if (!namespace.equalsIgnoreCase(o.getName())) { +@Override +public int compareTo(final Namespace o) { +if (!namespace.equalsIgnoreCase(o.getName())) { return namespace.compareTo(o.getName()); } - if (!prefix.equalsIgnoreCase(o.getPrefix())) { +if (!prefix.equalsIgnoreCase(o.getPrefix())) { return prefix.compareTo(o.getPrefix()); } - return 0; - } - - @Override - public String getName() { - return namespace; - } - - @Override - public String getPrefix() { - return prefix; - } - - } - - public class MongoCursorIteration implements - CloseableIteration{ - private final DBCursor cursor; - - public MongoCursorIteration(final DBCursor cursor2) { - this.cursor = cursor2; - } - - @Override - public boolean hasNext() throws RyaDAOException { - return cursor.hasNext(); - } - - @Override - public Namespace next() throws RyaDAOException { - final DBObject ns = cursor.next(); - final Map 
values = ns.toMap(); - final String namespace = (String) values.get(NAMESPACE); - final String prefix = (String) values.get(PREFIX); - - final Namespace temp = new NamespaceImplementation(namespace, prefix); - return temp; - } - - @Override - public void remove() throws RyaDAOException { - next(); - } - - @Override - public void close() throws RyaDAOException { - cursor.close(); - } - - } - - private static final String ID = "_id"; - private static final String PREFIX = "prefix"; - private static final String NAMESPACE = "namespace"; - private MongoDBRdfConfiguration conf; - private final DBCollection nsColl; - - - public SimpleMongoDBNamespaceManager(final DBCollection nameSpaceCollection) { - nsColl = nameSpaceCollection; - } - - @Override - public void createIndices(final DBCollection coll){ - coll.createIndex(PREFIX); - coll.createIndex(NAMESPACE); - } - - - @Override - public void setConf(final MongoDBRdfConfiguration paramC) { - this.conf = paramC; - } - - @Override - public MongoDBRdfConfiguration getConf() { - // TODO Auto-generated method stub - return conf; - } - - @Override - public void addNamespace(final String prefix, final String namespace) - throws RyaDAOException { - final String id = prefix; - byte[] bytes = id.getBytes(StandardCharsets.UTF_8); - try { - final MessageDigest digest = MessageDigest.getInstance("SHA-1"); - bytes = digest.digest(bytes); - } catch (final NoSuchAlgorithmException e) { -
[jira] [Commented] (RYA-414) Fix inconsistent method of using MongoClient within a Sail object
[ https://issues.apache.org/jira/browse/RYA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308746#comment-16308746 ] ASF GitHub Bot commented on RYA-414: Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159312925 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/statement/metadata/matching/RyaQueryEngineFactory.java --- @@ -64,12 +60,13 @@ throw new RuntimeException(e); } return (RyaQueryEngine) new AccumuloRyaQueryEngine(conn, aConf); -} else if(conf instanceof MongoDBRdfConfiguration && conf.getBoolean("sc.useMongo", false)) { -MongoClient client = MongoConnectorFactory.getMongoClient(conf); -return (RyaQueryEngine) new MongoDBQueryEngine((MongoDBRdfConfiguration) conf, client); +} else if(conf instanceof StatefulMongoDBRdfConfiguration && conf.getBoolean("sc.useMongo", false)) { --- End diff -- There's a constant for "sc.useMongo" inside ConfigUtils. Maybe use that here or move that constant to MongoDBRdfConfiguration and use it.
[GitHub] incubator-rya pull request #256: RYA-414
Github user ejwhite922 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/256#discussion_r159309851 --- Diff: dao/mongodb.rya/src/test/java/org/apache/rya/mongodb/EmbeddedMongoSingleton.java --- @@ -19,20 +19,43 @@ package org.apache.rya.mongodb; import java.io.IOException; +import java.net.UnknownHostException; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import com.mongodb.MongoClient; +import com.mongodb.MongoException; + +import de.flapdoodle.embed.mongo.config.IMongodConfig; /** * To be used for tests. Creates a singleton {@link MongoClient} to be used * throughout all of the MongoDB related tests. Without the singleton, the * embedded mongo factory ends up orphaning processes, consuming resources. */ public class EmbeddedMongoSingleton { -public static MongoClient getInstance() { -return InstanceHolder.SINGLETON.instance; + +public static MongoClient getNewMongoClient() throws UnknownHostException, MongoException { + final MongoClient client = InstanceHolder.SINGLETON.factory.newMongoClient(); + +Runtime.getRuntime().addShutdownHook(new Thread() { +@Override +public void run() { +try { +client.close(); +} catch (final Throwable t) { +// logging frameworks will likely be shut down +t.printStackTrace(System.err); +} +} +}); + +return client; +} + +public static IMongodConfig getMongodConfig() { --- End diff -- javadocs ---
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159332716 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/PipelineResultIteration.java --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Map; + +import org.bson.Document; +import org.openrdf.model.Value; +import org.openrdf.model.ValueFactory; +import org.openrdf.model.impl.ValueFactoryImpl; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.Binding; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.evaluation.QueryBindingSet; + +import com.mongodb.client.AggregateIterable; +import com.mongodb.client.MongoCursor; + +import info.aduna.iteration.CloseableIteration; + +/** + * An iterator that converts the documents resulting from an + * {@link AggregationPipelineQueryNode} into {@link BindingSet}s. 
+ */ +public class PipelineResultIteration implements CloseableIteration{ +private static final int BATCH_SIZE = 1000; +private static final ValueFactory VF = ValueFactoryImpl.getInstance(); + +private final MongoCursor cursor; +private final Map varToOriginalName; +private final BindingSet bindings; +private BindingSet nextSolution = null; + +/** + * Constructor. + * @param aggIter Iterator of documents in AggregationPipelineQueryNode's + * intermediate solution representation. + * @param varToOriginalName A mapping from field names in the pipeline + * result documents to equivalent variable names in the original query. + * Where an entry does not exist for a field, the field name and variable + * name are assumed to be the same. + * @param bindings A partial solution. May be empty. + */ +public PipelineResultIteration(AggregateIterable aggIter, +Map varToOriginalName, +BindingSet bindings) { +aggIter.batchSize(BATCH_SIZE); +this.cursor = aggIter.iterator(); +this.varToOriginalName = varToOriginalName; +this.bindings = bindings; +lookahead(); +} + +private void lookahead() { +while (nextSolution == null && cursor.hasNext()) { +nextSolution = docToBindingSet(cursor.next()); +} +} + +@Override +public boolean hasNext() throws QueryEvaluationException { +lookahead(); +return nextSolution != null; +} + +@Override +public BindingSet next() throws QueryEvaluationException { +lookahead(); +BindingSet solution = nextSolution; +nextSolution = null; +return solution; +} + +@Override +public void remove() throws QueryEvaluationException { --- End diff -- This should throw an unsupported exception. Remove means you're removing the previously returned object from the underlying collection. This doesn't do that. ---
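A minimal sketch of what the reviewer is asking for, written against `java.util.Iterator` rather than the `CloseableIteration` interface used in the pull request so that it stays self-contained. `LookaheadIterator` and its converter function are hypothetical; throwing `NoSuchElementException` from `next()` follows the `Iterator` contract and is not something the original diff does.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Read-only iterator that converts source elements and skips those that convert to null. */
public class LookaheadIterator<S, T> implements Iterator<T> {
    private final Iterator<S> source;
    private final Function<S, T> convert;
    private T nextSolution = null;

    public LookaheadIterator(final Iterator<S> source, final Function<S, T> convert) {
        this.source = source;
        this.convert = convert;
    }

    /** Advances the source until a convertible element is buffered or the source is exhausted. */
    private void lookahead() {
        while (nextSolution == null && source.hasNext()) {
            nextSolution = convert.apply(source.next());
        }
    }

    @Override
    public boolean hasNext() {
        lookahead();
        return nextSolution != null;
    }

    @Override
    public T next() {
        lookahead();
        if (nextSolution == null) {
            throw new NoSuchElementException();
        }
        final T solution = nextSolution;
        nextSolution = null;
        return solution;
    }

    /** Nothing is removed from the underlying collection, so the operation is rejected. */
    @Override
    public void remove() {
        throw new UnsupportedOperationException("remove() is not supported by this read-only iterator");
    }

    public static void main(final String[] args) {
        final Iterator<Integer> it = new LookaheadIterator<String, Integer>(
                Arrays.asList("1", "x", "2").iterator(),
                s -> s.matches("\\d+") ? Integer.valueOf(s) : null);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Silently discarding the buffered element, as the quoted `remove()` does, would look to a caller like a successful removal even though the stored data is unchanged; failing loudly avoids that.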
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159333249 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/PipelineResultIteration.java --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Map; + +import org.bson.Document; +import org.openrdf.model.Value; +import org.openrdf.model.ValueFactory; +import org.openrdf.model.impl.ValueFactoryImpl; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.Binding; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.evaluation.QueryBindingSet; + +import com.mongodb.client.AggregateIterable; +import com.mongodb.client.MongoCursor; + +import info.aduna.iteration.CloseableIteration; + +/** + * An iterator that converts the documents resulting from an + * {@link AggregationPipelineQueryNode} into {@link BindingSet}s. 
+ */ +public class PipelineResultIteration implements CloseableIteration{ +private static final int BATCH_SIZE = 1000; +private static final ValueFactory VF = ValueFactoryImpl.getInstance(); + +private final MongoCursor cursor; +private final Map varToOriginalName; +private final BindingSet bindings; +private BindingSet nextSolution = null; + +/** + * Constructor. + * @param aggIter Iterator of documents in AggregationPipelineQueryNode's + * intermediate solution representation. + * @param varToOriginalName A mapping from field names in the pipeline + * result documents to equivalent variable names in the original query. + * Where an entry does not exist for a field, the field name and variable + * name are assumed to be the same. + * @param bindings A partial solution. May be empty. + */ +public PipelineResultIteration(AggregateIterable aggIter, +Map varToOriginalName, +BindingSet bindings) { +aggIter.batchSize(BATCH_SIZE); +this.cursor = aggIter.iterator(); +this.varToOriginalName = varToOriginalName; +this.bindings = bindings; +lookahead(); +} + +private void lookahead() { +while (nextSolution == null && cursor.hasNext()) { +nextSolution = docToBindingSet(cursor.next()); +} +} + +@Override +public boolean hasNext() throws QueryEvaluationException { +lookahead(); +return nextSolution != null; +} + +@Override +public BindingSet next() throws QueryEvaluationException { +lookahead(); +BindingSet solution = nextSolution; +nextSolution = null; +return solution; +} + +@Override +public void remove() throws QueryEvaluationException { +lookahead(); +nextSolution = null; +} + +@Override +public void close() throws QueryEvaluationException { +cursor.close(); +} + +private QueryBindingSet docToBindingSet(Document result) { +QueryBindingSet bindingSet = new QueryBindingSet(bindings); +Document valueSet = result.get(AggregationPipelineQueryNode.VALUES, Document.class); +Document typeSet = result.get(AggregationPipelineQueryNode.TYPES, Document.class); +if (valueSet != null) { 
+for (Map.Entry entry : valueSet.entrySet()) { +String fieldName = entry.getKey(); +String valueString = entry.getValue().toString(); +String typeString = typeSet == null ? null : typeSet.getString(fieldName); +String
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/254#discussion_r159331857

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/PipelineResultIteration.java ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.mongodb.aggregation;
+
+import java.util.Map;
+
+import org.bson.Document;
+import org.openrdf.model.Value;
+import org.openrdf.model.ValueFactory;
+import org.openrdf.model.impl.ValueFactoryImpl;
+import org.openrdf.model.vocabulary.XMLSchema;
+import org.openrdf.query.Binding;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.evaluation.QueryBindingSet;
+
+import com.mongodb.client.AggregateIterable;
+import com.mongodb.client.MongoCursor;
+
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * An iterator that converts the documents resulting from an
+ * {@link AggregationPipelineQueryNode} into {@link BindingSet}s.
+ */
+public class PipelineResultIteration implements CloseableIteration<BindingSet, QueryEvaluationException> {
+    private static final int BATCH_SIZE = 1000;
+    private static final ValueFactory VF = ValueFactoryImpl.getInstance();
+
+    private final MongoCursor<Document> cursor;
+    private final Map<String, String> varToOriginalName;
+    private final BindingSet bindings;
+    private BindingSet nextSolution = null;
+
+    /**
+     * Constructor.
+     * @param aggIter Iterator of documents in AggregationPipelineQueryNode's
+     *      intermediate solution representation.
+     * @param varToOriginalName A mapping from field names in the pipeline
+     *      result documents to equivalent variable names in the original query.
+     *      Where an entry does not exist for a field, the field name and variable
+     *      name are assumed to be the same.
+     * @param bindings A partial solution. May be empty.
+     */
+    public PipelineResultIteration(AggregateIterable<Document> aggIter,
+            Map<String, String> varToOriginalName,
+            BindingSet bindings) {
+        aggIter.batchSize(BATCH_SIZE);
+        this.cursor = aggIter.iterator();
--- End diff --

All of these need to be null checked.

---
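The null checks requested in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name NullCheckedIteration is hypothetical, and plain Java collections stand in for the driver's AggregateIterable and openrdf's BindingSet so that the example compiles on its own. It uses java.util.Objects.requireNonNull; Guava's Preconditions.checkNotNull would serve the same purpose.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Objects;

/**
 * Hypothetical, simplified stand-in for PipelineResultIteration.
 * Plain Java types replace AggregateIterable<Document> and BindingSet.
 */
class NullCheckedIteration {
    private final Iterator<Map<String, Object>> cursor;
    private final Map<String, String> varToOriginalName;
    private final Map<String, Object> bindings;

    NullCheckedIteration(Iterable<Map<String, Object>> aggIter,
            Map<String, String> varToOriginalName,
            Map<String, Object> bindings) {
        // Reject null arguments up front, with a message naming the offender,
        // so the failure surfaces here rather than deep inside a later call.
        Objects.requireNonNull(aggIter, "aggIter must not be null");
        this.varToOriginalName =
                Objects.requireNonNull(varToOriginalName, "varToOriginalName must not be null");
        this.bindings = Objects.requireNonNull(bindings, "bindings must not be null");
        this.cursor = aggIter.iterator();
    }

    boolean hasNext() {
        return cursor.hasNext();
    }
}
```

Checking in the constructor means a caller that passes null gets an exception pointing at the construction site, not at the first hasNext() or next() call.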
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/254#discussion_r159332382

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/PipelineResultIteration.java ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.mongodb.aggregation;
+
+import java.util.Map;
+
+import org.bson.Document;
+import org.openrdf.model.Value;
+import org.openrdf.model.ValueFactory;
+import org.openrdf.model.impl.ValueFactoryImpl;
+import org.openrdf.model.vocabulary.XMLSchema;
+import org.openrdf.query.Binding;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.evaluation.QueryBindingSet;
+
+import com.mongodb.client.AggregateIterable;
+import com.mongodb.client.MongoCursor;
+
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * An iterator that converts the documents resulting from an
+ * {@link AggregationPipelineQueryNode} into {@link BindingSet}s.
+ */
+public class PipelineResultIteration implements CloseableIteration<BindingSet, QueryEvaluationException> {
+    private static final int BATCH_SIZE = 1000;
+    private static final ValueFactory VF = ValueFactoryImpl.getInstance();
+
+    private final MongoCursor<Document> cursor;
+    private final Map<String, String> varToOriginalName;
+    private final BindingSet bindings;
+    private BindingSet nextSolution = null;
+
+    /**
+     * Constructor.
+     * @param aggIter Iterator of documents in AggregationPipelineQueryNode's
+     *      intermediate solution representation.
+     * @param varToOriginalName A mapping from field names in the pipeline
+     *      result documents to equivalent variable names in the original query.
+     *      Where an entry does not exist for a field, the field name and variable
+     *      name are assumed to be the same.
+     * @param bindings A partial solution. May be empty.
+     */
+    public PipelineResultIteration(AggregateIterable<Document> aggIter,
+            Map<String, String> varToOriginalName,
+            BindingSet bindings) {
+        aggIter.batchSize(BATCH_SIZE);
+        this.cursor = aggIter.iterator();
+        this.varToOriginalName = varToOriginalName;
+        this.bindings = bindings;
+        lookahead();
--- End diff --

This should be removed. All of your methods perform lookahead(). In general, a constructor shouldn't be invoking methods on something that communicates on a network since it may take a long time to complete when things aren't working properly.

---
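To illustrate the point about keeping the constructor free of I/O, here is a small sketch of the lazy-lookahead pattern. The names are hypothetical: LazyLookaheadIterator is not part of the pull request, and a generic java.util.Iterator with a conversion function stands in for the MongoCursor and docToBindingSet. Throwing NoSuchElementException from next() when the source is exhausted is an addition of this sketch; the quoted code returns null in that case.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

/**
 * Hypothetical sketch: wraps a source iterator, converts each element,
 * and skips elements whose conversion yields null.
 */
class LazyLookaheadIterator<S, T> implements Iterator<T> {
    private final Iterator<S> source;
    private final Function<S, T> convert;
    private T nextSolution = null;

    LazyLookaheadIterator(Iterator<S> source, Function<S, T> convert) {
        // Only store references here. Nothing is read from the source,
        // so construction cannot block on a slow or failing connection.
        this.source = source;
        this.convert = convert;
    }

    /** Advances the source until a non-null converted element is buffered. */
    private void lookahead() {
        while (nextSolution == null && source.hasNext()) {
            nextSolution = convert.apply(source.next());
        }
    }

    @Override
    public boolean hasNext() {
        lookahead();
        return nextSolution != null;
    }

    @Override
    public T next() {
        lookahead();
        if (nextSolution == null) {
            throw new NoSuchElementException();
        }
        T solution = nextSolution;
        nextSolution = null;
        return solution;
    }
}
```

Because hasNext() and next() both call lookahead() themselves, dropping the call from the constructor does not change what callers observe; it only defers the first read until a result is actually requested.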
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159333105 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/PipelineResultIteration.java --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Map; + +import org.bson.Document; +import org.openrdf.model.Value; +import org.openrdf.model.ValueFactory; +import org.openrdf.model.impl.ValueFactoryImpl; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.Binding; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.evaluation.QueryBindingSet; + +import com.mongodb.client.AggregateIterable; +import com.mongodb.client.MongoCursor; + +import info.aduna.iteration.CloseableIteration; + +/** + * An iterator that converts the documents resulting from an + * {@link AggregationPipelineQueryNode} into {@link BindingSet}s. 
+ */ +public class PipelineResultIteration implements CloseableIteration{ +private static final int BATCH_SIZE = 1000; +private static final ValueFactory VF = ValueFactoryImpl.getInstance(); + +private final MongoCursor cursor; +private final Map varToOriginalName; +private final BindingSet bindings; +private BindingSet nextSolution = null; + +/** + * Constructor. + * @param aggIter Iterator of documents in AggregationPipelineQueryNode's + * intermediate solution representation. + * @param varToOriginalName A mapping from field names in the pipeline + * result documents to equivalent variable names in the original query. + * Where an entry does not exist for a field, the field name and variable + * name are assumed to be the same. + * @param bindings A partial solution. May be empty. + */ +public PipelineResultIteration(AggregateIterable aggIter, +Map varToOriginalName, +BindingSet bindings) { +aggIter.batchSize(BATCH_SIZE); +this.cursor = aggIter.iterator(); +this.varToOriginalName = varToOriginalName; +this.bindings = bindings; +lookahead(); +} + +private void lookahead() { +while (nextSolution == null && cursor.hasNext()) { +nextSolution = docToBindingSet(cursor.next()); +} +} + +@Override +public boolean hasNext() throws QueryEvaluationException { +lookahead(); +return nextSolution != null; +} + +@Override +public BindingSet next() throws QueryEvaluationException { +lookahead(); +BindingSet solution = nextSolution; +nextSolution = null; +return solution; +} + +@Override +public void remove() throws QueryEvaluationException { +lookahead(); +nextSolution = null; +} + +@Override +public void close() throws QueryEvaluationException { +cursor.close(); +} + +private QueryBindingSet docToBindingSet(Document result) { +QueryBindingSet bindingSet = new QueryBindingSet(bindings); +Document valueSet = result.get(AggregationPipelineQueryNode.VALUES, Document.class); +Document typeSet = result.get(AggregationPipelineQueryNode.TYPES, Document.class); +if (valueSet != null) { 
+for (Map.Entry entry : valueSet.entrySet()) { +String fieldName = entry.getKey(); +String valueString = entry.getValue().toString(); +String typeString = typeSet == null ? null : typeSet.getString(fieldName); +String
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308840#comment-16308840 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159333249 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/PipelineResultIteration.java --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.mongodb.aggregation; + +import java.util.Map; + +import org.bson.Document; +import org.openrdf.model.Value; +import org.openrdf.model.ValueFactory; +import org.openrdf.model.impl.ValueFactoryImpl; +import org.openrdf.model.vocabulary.XMLSchema; +import org.openrdf.query.Binding; +import org.openrdf.query.BindingSet; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.evaluation.QueryBindingSet; + +import com.mongodb.client.AggregateIterable; +import com.mongodb.client.MongoCursor; + +import info.aduna.iteration.CloseableIteration; + +/** + * An iterator that converts the documents resulting from an + * {@link AggregationPipelineQueryNode} into {@link BindingSet}s. 
+ */ +public class PipelineResultIteration implements CloseableIteration{ +private static final int BATCH_SIZE = 1000; +private static final ValueFactory VF = ValueFactoryImpl.getInstance(); + +private final MongoCursor cursor; +private final Map varToOriginalName; +private final BindingSet bindings; +private BindingSet nextSolution = null; + +/** + * Constructor. + * @param aggIter Iterator of documents in AggregationPipelineQueryNode's + * intermediate solution representation. + * @param varToOriginalName A mapping from field names in the pipeline + * result documents to equivalent variable names in the original query. + * Where an entry does not exist for a field, the field name and variable + * name are assumed to be the same. + * @param bindings A partial solution. May be empty. + */ +public PipelineResultIteration(AggregateIterable aggIter, +Map varToOriginalName, +BindingSet bindings) { +aggIter.batchSize(BATCH_SIZE); +this.cursor = aggIter.iterator(); +this.varToOriginalName = varToOriginalName; +this.bindings = bindings; +lookahead(); +} + +private void lookahead() { +while (nextSolution == null && cursor.hasNext()) { +nextSolution = docToBindingSet(cursor.next()); +} +} + +@Override +public boolean hasNext() throws QueryEvaluationException { +lookahead(); +return nextSolution != null; +} + +@Override +public BindingSet next() throws QueryEvaluationException { +lookahead(); +BindingSet solution = nextSolution; +nextSolution = null; +return solution; +} + +@Override +public void remove() throws QueryEvaluationException { +lookahead(); +nextSolution = null; +} + +@Override +public void close() throws QueryEvaluationException { +cursor.close(); +} + +private QueryBindingSet docToBindingSet(Document result) { +QueryBindingSet bindingSet = new QueryBindingSet(bindings); +Document valueSet = result.get(AggregationPipelineQueryNode.VALUES, Document.class); +Document typeSet = result.get(AggregationPipelineQueryNode.TYPES, Document.class); +if (valueSet != null) { 
+for (Map.Entry entry : valueSet.entrySet()) { +
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308837#comment-16308837 ]

ASF GitHub Bot commented on RYA-416:

Github user kchilton2 commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/254#discussion_r159332716

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/PipelineResultIteration.java ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.mongodb.aggregation;
+
+import java.util.Map;
+
+import org.bson.Document;
+import org.openrdf.model.Value;
+import org.openrdf.model.ValueFactory;
+import org.openrdf.model.impl.ValueFactoryImpl;
+import org.openrdf.model.vocabulary.XMLSchema;
+import org.openrdf.query.Binding;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.evaluation.QueryBindingSet;
+
+import com.mongodb.client.AggregateIterable;
+import com.mongodb.client.MongoCursor;
+
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * An iterator that converts the documents resulting from an
+ * {@link AggregationPipelineQueryNode} into {@link BindingSet}s.
+ */
+public class PipelineResultIteration implements CloseableIteration<BindingSet, QueryEvaluationException> {
+    private static final int BATCH_SIZE = 1000;
+    private static final ValueFactory VF = ValueFactoryImpl.getInstance();
+
+    private final MongoCursor<Document> cursor;
+    private final Map<String, String> varToOriginalName;
+    private final BindingSet bindings;
+    private BindingSet nextSolution = null;
+
+    /**
+     * Constructor.
+     * @param aggIter Iterator of documents in AggregationPipelineQueryNode's
+     *      intermediate solution representation.
+     * @param varToOriginalName A mapping from field names in the pipeline
+     *      result documents to equivalent variable names in the original query.
+     *      Where an entry does not exist for a field, the field name and variable
+     *      name are assumed to be the same.
+     * @param bindings A partial solution. May be empty.
+     */
+    public PipelineResultIteration(AggregateIterable<Document> aggIter,
+            Map<String, String> varToOriginalName,
+            BindingSet bindings) {
+        aggIter.batchSize(BATCH_SIZE);
+        this.cursor = aggIter.iterator();
+        this.varToOriginalName = varToOriginalName;
+        this.bindings = bindings;
+        lookahead();
+    }
+
+    private void lookahead() {
+        while (nextSolution == null && cursor.hasNext()) {
+            nextSolution = docToBindingSet(cursor.next());
+        }
+    }
+
+    @Override
+    public boolean hasNext() throws QueryEvaluationException {
+        lookahead();
+        return nextSolution != null;
+    }
+
+    @Override
+    public BindingSet next() throws QueryEvaluationException {
+        lookahead();
+        BindingSet solution = nextSolution;
+        nextSolution = null;
+        return solution;
+    }
+
+    @Override
+    public void remove() throws QueryEvaluationException {
--- End diff --

This should throw an unsupported exception. Remove means you're removing the previously returned object from the underlying collection. This doesn't do that.
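A rough sketch of the suggested behaviour follows, assuming the "unsupported exception" meant here is java.lang.UnsupportedOperationException. The wrapper class ReadOnlyResults is hypothetical and uses java.util.Iterator in place of CloseableIteration so that it compiles without the openrdf dependency.

```java
import java.util.Iterator;

/**
 * Hypothetical read-only view over query results: elements can be
 * iterated, but nothing can be removed from the underlying source.
 */
class ReadOnlyResults<T> implements Iterator<T> {
    private final Iterator<T> delegate;

    ReadOnlyResults(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        return delegate.next();
    }

    @Override
    public void remove() {
        // remove() is specified to delete the element last returned by next()
        // from the underlying collection. A query result stream cannot do
        // that, so the call is refused instead of silently doing something else.
        throw new UnsupportedOperationException("remove() is not supported on query results");
    }
}
```

Refusing the call also avoids a subtle problem with the version under review, where remove() discards the next pending solution, so a caller would silently lose a result it had not yet seen.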
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request:

https://github.com/apache/incubator-rya/pull/254#discussion_r159335982

--- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java ---
@@ -0,0 +1,882 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.mongodb.aggregation;
+
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.CONTEXT;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.DOCUMENT_VISIBILITY;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_HASH;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.OBJECT_TYPE;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.PREDICATE_HASH;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.STATEMENT_METADATA;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.SUBJECT_HASH;
+import static org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy.TIMESTAMP;
+
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+import java.util.UUID;
+import java.util.concurrent.ConcurrentSkipListSet;
+import java.util.function.Function;
+
+import org.apache.rya.api.domain.RyaStatement;
+import org.apache.rya.api.domain.RyaType;
+import org.apache.rya.api.domain.RyaURI;
+import org.apache.rya.api.domain.StatementMetadata;
+import org.apache.rya.api.resolver.RdfToRyaConversions;
+import org.apache.rya.mongodb.MongoDbRdfConstants;
+import org.apache.rya.mongodb.dao.MongoDBStorageStrategy;
+import org.apache.rya.mongodb.dao.SimpleMongoDBStorageStrategy;
+import org.apache.rya.mongodb.document.operators.query.ConditionalOperators;
+import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter;
+import org.bson.Document;
+import org.bson.conversions.Bson;
+import org.openrdf.model.Literal;
+import org.openrdf.model.Resource;
+import org.openrdf.model.URI;
+import org.openrdf.model.Value;
+import org.openrdf.model.vocabulary.XMLSchema;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.Compare;
+import org.openrdf.query.algebra.ExtensionElem;
+import org.openrdf.query.algebra.ProjectionElem;
+import org.openrdf.query.algebra.ProjectionElemList;
+import org.openrdf.query.algebra.StatementPattern;
+import org.openrdf.query.algebra.ValueConstant;
+import org.openrdf.query.algebra.ValueExpr;
+import org.openrdf.query.algebra.Var;
+import org.openrdf.query.algebra.evaluation.impl.ExternalSet;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.BiMap;
+import com.google.common.collect.HashBiMap;
+import com.mongodb.BasicDBObject;
+import com.mongodb.DBObject;
+import com.mongodb.client.MongoCollection;
+import com.mongodb.client.model.Aggregates;
+import com.mongodb.client.model.BsonField;
+import com.mongodb.client.model.Filters;
+import com.mongodb.client.model.Projections;
+
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * Represents a portion of a query tree as MongoDB aggregation pipeline. Should
+ * be built bottom-up: start with a statement pattern implemented as a $match
+ * step, then add steps to the pipeline to handle higher levels of the query
+ * tree. Methods are provided to add certain supported query operations to the
+ * end of the internal pipeline. In some cases,
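The class comment quoted above describes a bottom-up construction: a statement pattern becomes an initial $match stage, and higher-level query operations are appended as further stages. A minimal sketch of that idea follows. It is not the PR's implementation: the class name `PipelineSketch`, its methods, and the field names `subject`, `predicate`, and `object` are assumptions made for illustration, and plain `java.util` maps stand in for `org.bson.Document` so the sketch runs without the MongoDB driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a pipeline-backed query node; not the PR's class. */
public class PipelineSketch {
    // Each stage is a single-key map such as {"$match": {...}}, mirroring the
    // shape of a MongoDB aggregation stage document.
    private final List<Map<String, Object>> stages = new ArrayList<>();

    /** Bottom of the tree: a statement pattern becomes the initial $match stage. */
    public PipelineSketch(String predicateValue) {
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("predicate", predicateValue); // assumed field name
        stages.add(stage("$match", filter));
    }

    /** Higher-level operation: append a $project stage keeping the named fields. */
    public PipelineSketch project(String... fields) {
        Map<String, Object> spec = new LinkedHashMap<>();
        for (String field : fields) {
            spec.put(field, 1);
        }
        stages.add(stage("$project", spec));
        return this;
    }

    /** Returns a copy of the stages in the order they would be sent to the server. */
    public List<Map<String, Object>> getStages() {
        return new ArrayList<>(stages);
    }

    private static Map<String, Object> stage(String operator, Object body) {
        Map<String, Object> s = new LinkedHashMap<>();
        s.put(operator, body);
        return s;
    }

    public static void main(String[] args) {
        PipelineSketch node = new PipelineSketch("urn:example:knows")
                .project("subject", "object");
        System.out.println(node.getStages());
    }
}
```

With the real driver, the same stages would presumably be built with the `Aggregates`, `Filters`, and `Projections` helpers that the quoted file imports, and passed to `MongoCollection.aggregate`.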
[GitHub] incubator-rya pull request #254: RYA-416 Optionally invoke aggregation pipel...
Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159336327 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308874#comment-16308874 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159336327 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@
[jira] [Commented] (RYA-416) Add the ability use the MongoDB aggregation pipeline to evaluate simple SPARQL expressions
[ https://issues.apache.org/jira/browse/RYA-416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308873#comment-16308873 ] ASF GitHub Bot commented on RYA-416: Github user kchilton2 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/254#discussion_r159335982 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/aggregation/AggregationPipelineQueryNode.java --- @@ -0,0 +1,882 @@