[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/381/

Build result: ABORTED
[...truncated 2.26 MB...]
[INFO]
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ rya.pcj.fluo.demo ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.demo/src/main/resources
[INFO] Copying 3 resources
Build was aborted
[WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy: java.util.concurrent.ExecutionException: Invalid object ID 14 iota=59
[INFO]
[INFO] --- maven-compiler-plugin:3.2:compile (default-compile) @ rya.pcj.fluo.demo ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 3 source files to /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.demo/target/classes
channel stopped
[WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy: hudson.remoting.Channel$OrderlyShutdown
[INFO]
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ rya.pcj.fluo.demo ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.demo/src/test/resources
[INFO] Copying 3 resources
[WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy: java.io.IOException: Backing channel 'channel' is disconnected.
[INFO]
[INFO] --- maven-compiler-plugin:3.2:testCompile (default-testCompile) @ rya.pcj.fluo.demo ---
Setting status of 256e5f53434801c7b6325c80020ac8080b6636e0 to FAILURE with url https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/381/ and message: 'FAILURE'
Using context: Jenkins: clean package -Pgeoindexing

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/380/

Build result: FAILURE
[...truncated 4.03 MB...]
[INFO] Apache Rya Spark Support ... SKIPPED
[INFO] Apache Rya Web Projects SKIPPED
[INFO] Apache Rya Web Implementation .. SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 01:04 h
[INFO] Finished at: 2017-08-10T01:35:39+00:00
[INFO] Final Memory: 494M/2861M
[INFO]
[ERROR] Failed to execute goal org.codehaus.mojo:jaxb2-maven-plugin:2.3.1:xjc (xjc) on project rya.benchmark: MojoExecutionException: NoSchemasException -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :rya.benchmark
channel stopped
Setting status of 1fc939493ccac56a1f2ea8d8f7ac3fe42aa190d2 to FAILURE with url https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/380/ and message: 'FAILURE'
Using context: Jenkins: clean package -Pgeoindexing
[jira] [Commented] (RYA-250) Smart URI avoid data duplication
[ https://issues.apache.org/jira/browse/RYA-250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120786#comment-16120786 ] ASF GitHub Bot commented on RYA-250:

Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/153 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/379/

> Smart URI avoid data duplication
> --------------------------------
>
>              Key: RYA-250
>              URL: https://issues.apache.org/jira/browse/RYA-250
>          Project: Rya
>       Issue Type: Task
>       Components: dao
> Affects Versions: 3.2.10
>         Reporter: Eric White
>         Assignee: Eric White
>          Fix For: 3.2.10
>
> Implement Smart URI methods for avoiding data duplication.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] incubator-rya issue #153: RYA-250 Smart URI avoiding data duplication
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/153 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/379/
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/378/

Failed Tests: 5
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
  org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.integration: 1
  org.apache.rya.indexing.pcj.fluo.integration.KafkaExportIT.newResultsExportedTest
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3
  org.apache.rya.prospector.mr.ProspectorTest.testCount
  org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount
  org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/377/

Failed Tests: 3
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
  org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.integration: 1
  org.apache.rya.indexing.pcj.fluo.integration.KafkaExportIT.newResultsExportedTest
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.test.base: 1
  org.apache.rya.pcj.fluo.test.base.KafkaExportITBaseIT.embeddedKafkaTest
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120519#comment-16120519 ] ASF GitHub Bot commented on RYA-303:

Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132279789

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashSet;
+
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.utils.IteratorWrapper;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil;
+import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities;
+import org.apache.rya.indexing.pcj.storage.PcjException;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.MalformedQueryException;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.Projection;
+import org.openrdf.query.algebra.TupleExpr;
+import org.openrdf.query.parser.ParsedTupleQuery;
+import org.openrdf.query.parser.sparql.SPARQLParser;
+import org.openrdf.sail.SailException;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * Indexing Node for PCJs expressions to be inserted into execution plan to
+ * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     *
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkArgu
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120520#comment-16120520 ] ASF GitHub Bot commented on RYA-303:

Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132280543

--- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjDocuments.java ---
@@ -0,0 +1,418 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.pcj.storage.mongo;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+import static java.util.Objects.requireNonNull;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.rya.api.domain.RyaType;
+import org.apache.rya.api.resolver.RdfToRyaConversions;
+import org.apache.rya.api.resolver.RyaToRdfConversions;
+import org.apache.rya.indexing.pcj.storage.PcjMetadata;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
+import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet;
+import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder;
+import org.bson.Document;
+import org.bson.conversions.Bson;
+import org.openrdf.model.URI;
+import org.openrdf.model.Value;
+import org.openrdf.model.impl.URIImpl;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.MalformedQueryException;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.QueryLanguage;
+import org.openrdf.query.TupleQuery;
+import org.openrdf.query.TupleQueryResult;
+import org.openrdf.query.impl.MapBindingSet;
+import org.openrdf.repository.RepositoryConnection;
+import org.openrdf.repository.RepositoryException;
+
+import com.mongodb.MongoClient;
+import com.mongodb.client.FindIterable;
+import com.mongodb.client.MongoCollection;
+import com.mongodb.util.JSON;
+
+/**
+ * Creates and modifies PCJs in MongoDB. PCJs are stored as follows:
+ *
+ * - PCJ Metadata Doc -
+ * {
+ *   _id: [table_name]_METADATA,
+ *   sparql: [sparql query to match results],
+ *   cardinality: [number of results]
+ * }
+ *
+ * - PCJ Results Doc -
+ * {
+ *   pcjName: [table_name],
+ *   auths: [auths]
+ *   [binding_var1]: {
+ *     uri: [type_uri],
+ *     value: value
+ *   }
+ *   .
+ *   .
+ *   .
+ *   [binding_varn]: {

--- End diff --

n is just showing that it's the nth var. the dots are showing any number in between

> Mongo PCJ indexer support
> -------------------------
>
>         Key: RYA-303
>         URL: https://issues.apache.org/jira/browse/RYA-303
>     Project: Rya
>  Issue Type: Improvement
>    Reporter: Andrew Smith
>    Assignee: Andrew Smith
>
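The document layout quoted in that javadoc can be made concrete with a small sketch. This is not Rya's MongoPcjDocuments implementation: it builds plain java.util.Map objects shaped like the metadata and results documents described above (in real code these would be org.bson.Document instances), and both helper names here are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PcjDocSketch {
    // Hypothetical helper: the metadata document, keyed by "[table_name]_METADATA".
    static Map<String, Object> metadataDoc(final String tableName, final String sparql, final long cardinality) {
        final Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", tableName + "_METADATA");
        doc.put("sparql", sparql);           // the SPARQL query whose results the PCJ caches
        doc.put("cardinality", cardinality); // number of cached results
        return doc;
    }

    // Hypothetical helper: one results document, with a {uri, value} sub-document
    // per binding variable, as in the javadoc's [binding_var1]..[binding_varn].
    static Map<String, Object> resultDoc(final String tableName, final String auths,
            final Map<String, String[]> bindings) {
        final Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("pcjName", tableName);
        doc.put("auths", auths);
        for (final Map.Entry<String, String[]> e : bindings.entrySet()) {
            final Map<String, Object> binding = new LinkedHashMap<>();
            binding.put("uri", e.getValue()[0]);   // type URI of the bound value
            binding.put("value", e.getValue()[1]); // the bound value itself
            doc.put(e.getKey(), binding);
        }
        return doc;
    }

    public static void main(final String[] args) {
        System.out.println(metadataDoc("myPcj", "SELECT ?x WHERE { ?x <urn:p> ?y }", 2).get("_id")); // myPcj_METADATA
    }
}
```

Giving the metadata document a distinct `_id` suffix lets the two document shapes share one collection while staying distinguishable by key.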
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132279789

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashSet;
+
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.utils.IteratorWrapper;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil;
+import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities;
+import org.apache.rya.indexing.pcj.storage.PcjException;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.MalformedQueryException;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.Projection;
+import org.openrdf.query.algebra.TupleExpr;
+import org.openrdf.query.parser.ParsedTupleQuery;
+import org.openrdf.query.parser.sparql.SPARQLParser;
+import org.openrdf.sail.SailException;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * Indexing Node for PCJs expressions to be inserted into execution plan to
+ * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     *
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkArgument(PCJOptimizerUtilities.isPCJValid(te), "TupleExpr is an invalid PCJ.");
+
+        final Optional<Projection> projection = new ParsedQueryUtil().findProjection(pq);
+        if (!projection.isPresent()) {
+            throw new Malformed
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user jdasch commented on the issue: https://github.com/apache/incubator-rya/pull/202 asfbot build
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/376/

Failed Tests: 2
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
  org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.integration: 1
  org.apache.rya.indexing.pcj.fluo.integration.KafkaExportIT.newResultsExportedTest
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132251517

--- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryBuilderVisitorBase.java ---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.pcj.fluo.app.query;
+
+import org.apache.rya.indexing.pcj.fluo.app.NodeType;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+
+/**
+ * Base visitor class for navigating a {@link FluoQuery.Builder}.
+ * The visit methods in this class provide the basic functionality
+ * for navigating between the Builders that make up the FluoQuery.Builder.
+ *
+ */
+public abstract class QueryBuilderVisitorBase {

--- End diff --

you can ignore this, I'm talking to you in person for clarification
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132250708

--- Diff: extras/rya.pcj.fluo/pcj.fluo.integration/src/test/java/org/apache/rya/indexing/pcj/fluo/integration/KafkaExportIT.java ---
@@ -425,6 +432,160 @@ public void groupByManyBindings_avaerages() throws Exception {
         assertEquals(expectedResults, results);
     }

+    @Test
+    public void nestedGroupByManyBindings_averages() throws Exception {
+        // A query that groups what is aggregated by two of the keys.
+        final String sparql =
+                "SELECT ?type ?location ?averagePrice {" +
+                    "FILTER(?averagePrice > 4) " +
+                    "{SELECT ?type ?location (avg(?price) as ?averagePrice) {" +
+                        "?id <urn:type> ?type . " +
+                        "?id <urn:location> ?location ." +
+                        "?id <urn:price> ?price ." +
+                    "} " +
+                    "GROUP BY ?type ?location }}";
+
+        // Create the Statements that will be loaded into Rya.
+        final ValueFactory vf = new ValueFactoryImpl();
+        final Collection<Statement> statements = Sets.newHashSet(
+                // American items that will be averaged.
+                vf.createStatement(vf.createURI("urn:1"), vf.createURI("urn:type"), vf.createLiteral("apple")),
+                vf.createStatement(vf.createURI("urn:1"), vf.createURI("urn:location"), vf.createLiteral("USA")),
+                vf.createStatement(vf.createURI("urn:1"), vf.createURI("urn:price"), vf.createLiteral(2.50)),
+
+                vf.createStatement(vf.createURI("urn:2"), vf.createURI("urn:type"), vf.createLiteral("cheese")),
+                vf.createStatement(vf.createURI("urn:2"), vf.createURI("urn:location"), vf.createLiteral("USA")),
+                vf.createStatement(vf.createURI("urn:2"), vf.createURI("urn:price"), vf.createLiteral(4.25)),
+
+                vf.createStatement(vf.createURI("urn:3"), vf.createURI("urn:type"), vf.createLiteral("cheese")),
+                vf.createStatement(vf.createURI("urn:3"), vf.createURI("urn:location"), vf.createLiteral("USA")),
+                vf.createStatement(vf.createURI("urn:3"), vf.createURI("urn:price"), vf.createLiteral(5.25)),
+
+                // French items that will be averaged.
+                vf.createStatement(vf.createURI("urn:4"), vf.createURI("urn:type"), vf.createLiteral("cheese")),
+                vf.createStatement(vf.createURI("urn:4"), vf.createURI("urn:location"), vf.createLiteral("France")),
+                vf.createStatement(vf.createURI("urn:4"), vf.createURI("urn:price"), vf.createLiteral(8.5)),
+
+                vf.createStatement(vf.createURI("urn:5"), vf.createURI("urn:type"), vf.createLiteral("cigarettes")),
+                vf.createStatement(vf.createURI("urn:5"), vf.createURI("urn:location"), vf.createLiteral("France")),
+                vf.createStatement(vf.createURI("urn:5"), vf.createURI("urn:price"), vf.createLiteral(3.99)),
+
+                vf.createStatement(vf.createURI("urn:6"), vf.createURI("urn:type"), vf.createLiteral("cigarettes")),
+                vf.createStatement(vf.createURI("urn:6"), vf.createURI("urn:location"), vf.createLiteral("France")),
+                vf.createStatement(vf.createURI("urn:6"), vf.createURI("urn:price"), vf.createLiteral(4.99)));
+
+        // Create the PCJ in Fluo and load the statements into Rya.
+        final String pcjId = loadData(sparql, statements);
+
+        // Create the expected results of the SPARQL query once the PCJ has been computed.
+        final Set<VisibilityBindingSet> expectedResults = new HashSet<>();
+
+        MapBindingSet bs = new MapBindingSet();
+        bs.addBinding("type", vf.createLiteral("cheese", XMLSchema.STRING));
+        bs.addBinding("location", vf.createLiteral("France", XMLSchema.STRING));
+        bs.addBinding("averagePrice", vf.createLiteral("8.5", XMLSchema.DECIMAL));
+        expectedResults.add(new VisibilityBindingSet(bs));
+
+        bs = new MapBindingSet();
+        bs.addBinding("type", vf.createLiteral("cigarettes", XMLSchema.STRING));
+        bs.addBinding("location", vf.createLiteral("France", XMLSchema.STRING));
+        bs.addBinding("averagePrice", vf.createLiteral("4.49", XMLSchema.DECIMAL));
+        expectedResults.add(new VisibilityBindingSet(bs));
+
+        bs = new MapBindingSet();
+        bs.addBinding("type", vf.createLiteral("cheese", XMLSchema.STRING));
+        bs.addBinding("location", vf.createLiteral("USA", XMLSchema.STRING));
+        bs.addBinding("averagePrice", vf.createLiteral("4.75", XMLSchema.DECIMAL));
+        expectedResults.add(new VisibilityBindingSet(bs));
+
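The expected bindings in that test follow from simple grouped averaging: France/cheese stays at 8.5, France/cigarettes averages (3.99 + 4.99) / 2 = 4.49, USA/cheese averages (4.25 + 5.25) / 2 = 4.75, and USA/apple at 2.50 is dropped by FILTER(?averagePrice > 4). A dependency-free sketch (not the Rya/Fluo/Kafka machinery) that reproduces the arithmetic:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByAverages {
    static final class Item {
        final String type; final String location; final double price;
        Item(final String type, final String location, final double price) {
            this.type = type; this.location = location; this.price = price;
        }
    }

    // Average price per (type, location) group, then drop groups at or below `min`,
    // the same shape as the nested SELECT ... GROUP BY plus the outer FILTER.
    static Map<String, Double> filteredAverages(final List<Item> items, final double min) {
        final Map<String, Double> averages = items.stream().collect(Collectors.groupingBy(
                i -> i.type + "/" + i.location,
                Collectors.averagingDouble(i -> i.price)));
        final Map<String, Double> filtered = new HashMap<>();
        averages.forEach((group, avg) -> { if (avg > min) filtered.put(group, avg); });
        return filtered;
    }

    public static void main(final String[] args) {
        final List<Item> items = List.of(
                new Item("apple", "USA", 2.50),
                new Item("cheese", "USA", 4.25), new Item("cheese", "USA", 5.25),
                new Item("cheese", "France", 8.5),
                new Item("cigarettes", "France", 3.99), new Item("cigarettes", "France", 4.99));
        // cheese/France ~ 8.5, cigarettes/France ~ 4.49, cheese/USA ~ 4.75; apple/USA is filtered out.
        System.out.println(filteredAverages(items, 4));
    }
}
```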
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132244653 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryBuilderVisitorBase.java --- @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.indexing.pcj.fluo.app.query; + +import org.apache.rya.indexing.pcj.fluo.app.NodeType; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +/** + * Base visitor class for navigating a {@link FluoQuery.Builder}. + * The visit methods in this class provide the basic functionality + * for navigating between the Builders that make up the FluoQuery.Builder. + * + */ +public abstract class QueryBuilderVisitorBase { --- End diff -- should this extend some openRDF visitor? or are you just mimicking the visitor pattern? Why are you visiting on the builders rather than what they build? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
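For readers following the thread, the pattern in question (visiting a tree of builders with manually dispatched visit methods, rather than extending an openRDF visitor) can be sketched roughly as follows. All class and method names here are illustrative stand-ins, not the actual Rya API:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a tree of metadata builders.
abstract class NodeBuilder {
    private final List<NodeBuilder> children = new ArrayList<>();
    List<NodeBuilder> children() { return children; }
    NodeBuilder addChild(NodeBuilder child) { children.add(child); return this; }
}

class QueryBuilder extends NodeBuilder { }
class JoinBuilder extends NodeBuilder { }

// Base visitor: dispatch on the concrete builder type, then recurse into
// children. This is manual double dispatch; the builders themselves carry
// no accept() method.
abstract class BuilderVisitorBase {
    private final NodeBuilder root;
    BuilderVisitorBase(NodeBuilder root) { this.root = root; }

    void visit() { visit(root); }

    void visit(NodeBuilder b) {
        if (b instanceof QueryBuilder) visit((QueryBuilder) b);
        else if (b instanceof JoinBuilder) visit((JoinBuilder) b);
        for (NodeBuilder child : b.children()) visit(child);
    }

    // Subclasses override only the typed overloads they care about.
    void visit(QueryBuilder b) { }
    void visit(JoinBuilder b) { }
}
```

A subclass overrides only the typed visit overloads it needs; the base class owns the traversal order.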
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132234446 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/CommonNodeMetadata.java --- @@ -99,4 +99,18 @@ public String toString() { .append("}") .toString(); } + +/** + * Base interface for all metadata Builders. Using this type def + * allows for the implementation of a Builder visitor for navigating + * the Builder tree. + * + */ +public static interface Builder { --- End diff -- there's no build function?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132231558 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/JoinResultUpdater.java --- @@ -160,8 +183,55 @@ public void updateJoinResults( public static enum Side { LEFT, RIGHT; } + + +/** + * Fetches batch to be processed by scanning over the Span specified by the + * {@link JoinBatchInformation}. The number of results is less than or equal + * to the batch size specified by the JoinBatchInformation. + * + * @param tx - Fluo transaction in which batch operation is performed + * @param siblingSpan - span of sibling to retrieve elements to join with + * @param bsSet - set that batch results are added to + * @return Set - containing results of sibling scan. + * @throws Exception + */ +private Optional fillSiblingBatch(TransactionBase tx, Span siblingSpan, Column siblingColumn, Set bsSet, int batchSize) throws Exception { + +RowScanner rs = tx.scanner().over(siblingSpan).fetch(siblingColumn).byRow().build(); +Iterator&lt;ColumnScanner&gt; colScannerIter = rs.iterator(); + +boolean batchLimitMet = false; +Bytes row = siblingSpan.getStart().getRow(); +while (colScannerIter.hasNext() && !batchLimitMet) { +ColumnScanner colScanner = colScannerIter.next(); +row = colScanner.getRow(); +Iterator iter = colScanner.iterator(); +while (iter.hasNext()) { --- End diff -- should this also check batchLimitMet? the flag can't be set to true on the first pass, so you can just do the size check after adding the first bindingSet, then you don't need a break.
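The restructuring suggested in that comment can be sketched with plain collections standing in for Fluo's RowScanner/ColumnScanner: checking the size limit in both the outer (row) loop and the inner (column) loop removes the need for a flag and a break. This is an illustrative simplification, not the actual JoinResultUpdater code:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class BatchScan {
    // Collect at most batchSize elements from a nested scan. The size check
    // appears in both loop conditions, so the scan stops as soon as the
    // batch is full, with no batchLimitMet flag and no break statement.
    static <T> List<T> fillBatch(Iterable<? extends Iterable<T>> rows, int batchSize) {
        List<T> batch = new ArrayList<>();
        Iterator<? extends Iterable<T>> rowIter = rows.iterator();
        while (rowIter.hasNext() && batch.size() < batchSize) {
            Iterator<T> colIter = rowIter.next().iterator();
            while (colIter.hasNext() && batch.size() < batchSize) {
                batch.add(colIter.next());
            }
        }
        return batch;
    }
}
```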
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132250247 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/test/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryBuilderVisitorTest.java --- @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.fluo.app.query; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.apache.rya.indexing.pcj.fluo.app.NodeType; +import org.junit.Assert; +import org.junit.Test; + +public class QueryBuilderVisitorTest { + +@Test +public void builderTest() { + +FluoQuery.Builder fluoBuilder = FluoQuery.builder(); + +String queryId = NodeType.generateNewFluoIdForType(NodeType.QUERY); +String projectionId = NodeType.generateNewFluoIdForType(NodeType.PROJECTION); +String joinId = NodeType.generateNewFluoIdForType(NodeType.JOIN); +String leftSp = NodeType.generateNewFluoIdForType(NodeType.STATEMENT_PATTERN); +String rightSp = NodeType.generateNewFluoIdForType(NodeType.STATEMENT_PATTERN); + +List&lt;String&gt; expected = Arrays.asList(queryId, projectionId, joinId, leftSp, rightSp); + +QueryMetadata.Builder queryBuilder = QueryMetadata.builder(queryId); +queryBuilder.setChildNodeId(projectionId); + +ProjectionMetadata.Builder projectionBuilder = ProjectionMetadata.builder(projectionId); +projectionBuilder.setChildNodeId(joinId); + +JoinMetadata.Builder joinBuilder = JoinMetadata.builder(joinId); +joinBuilder.setLeftChildNodeId(leftSp); +joinBuilder.setRightChildNodeId(rightSp); + +StatementPatternMetadata.Builder left = StatementPatternMetadata.builder(leftSp); +StatementPatternMetadata.Builder right = StatementPatternMetadata.builder(rightSp); + +fluoBuilder.setQueryMetadata(queryBuilder); +fluoBuilder.addProjectionBuilder(projectionBuilder); +fluoBuilder.addJoinMetadata(joinBuilder); +fluoBuilder.addStatementPatternBuilder(left); +fluoBuilder.addStatementPatternBuilder(right); + +QueryBuilderPrinter printer = new QueryBuilderPrinter(fluoBuilder); +printer.visit(); +Assert.assertEquals(expected, printer.getIds()); +} + + +public static class QueryBuilderPrinter extends QueryBuilderVisitorBase { + +private List&lt;String&gt; ids = new ArrayList<>(); + +public List&lt;String&gt; getIds() { +return ids; +} + +public
QueryBuilderPrinter(FluoQuery.Builder builder) { +super(builder); +} + +public void visit(QueryMetadata.Builder queryBuilder) { +System.out.println(queryBuilder.getNodeId()); --- End diff -- why do you need to print during a test?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132245408 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryMetadataVisitorBase.java --- @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.indexing.pcj.fluo.app.query; + +import org.apache.rya.indexing.pcj.fluo.app.NodeType; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +public abstract class QueryMetadataVisitorBase { --- End diff -- confused: why are you visiting on the builders as well as the metadata built?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132234031 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/observers/QueryResultObserver.java --- @@ -107,7 +107,7 @@ public void process(final TransactionBase tx, final Bytes brow, final Column col // Read the Child Binding Set that will be exported. final Bytes valueBytes = tx.get(brow, col); final VisibilityBindingSet result = BS_SERDE.deserialize(valueBytes); - + --- End diff -- this class only has this whitespace change, can it be removed from the commits?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132233795 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/batch/JoinBatchInformation.java --- @@ -149,12 +137,12 @@ public boolean equals(Object other) { JoinBatchInformation batch = (JoinBatchInformation) other; return super.equals(other) && Objects.equals(this.bs, batch.bs) && Objects.equals(this.join, batch.join) --- End diff -- you should be able to just Objects.equals().equals().equals();
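For context, the null-safe equality pattern under review looks roughly like the following. The field names are taken from the diff, but the class body is a simplified stand-in (the real method also chains a super.equals call):

```java
import java.util.Objects;

class JoinBatchInformation {
    // Simplified stand-ins for the real field types in the PR.
    private final String bs;
    private final String join;

    JoinBatchInformation(String bs, String join) {
        this.bs = bs;
        this.join = join;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof JoinBatchInformation)) return false;
        JoinBatchInformation batch = (JoinBatchInformation) other;
        // Objects.equals is null-safe, so each field comparison stays one clause.
        return Objects.equals(this.bs, batch.bs)
            && Objects.equals(this.join, batch.join);
    }

    @Override
    public int hashCode() {
        return Objects.hash(bs, join);
    }
}
```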
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user jdasch commented on the issue: https://github.com/apache/incubator-rya/pull/202 asfbot build
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120301#comment-16120301 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132249498 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- I'm pretty sure Mongo further condenses the data, so I'm not sure hashing is necessary in order for it to store in memory. You're adding a lot of overhead to query. I'm ok with adding it now if you think it's necessary. > Long LineStrings break MongoDB ingest > - > > Key: RYA-316 > URL: https://issues.apache.org/jira/browse/RYA-316 > Project: Rya > Issue Type: Bug > Components: dao >Reporter: Aaron Mihalik >Assignee: Andrew Smith > > MongoDB will reject statements that contain very long linestrings. > Basically, the mongodb index key is limited to 1024 chars, so the insert will > fail if the literal is longer. > [Here is some example > code|https://github.com/amihalik/rya-mongo-debugging/blob/master/src/main/java/com/github/amihalik/rya/mongo/debugging/linestring/LoadLineString.java]. > I think the inserts will work if you use 10 points, but fail if you use > linestrings with 100 points. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132249498 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- I'm pretty sure Mongo further condenses the data, so I'm not sure hashing is necessary in order for it to store in memory. You're adding a lot of overhead to query. I'm ok with adding it now if you think it's necessary.
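The approach being debated here (store a fixed-length hash of each statement component for the index and keep the original value in a separate field, since the linked JIRA notes MongoDB limits index keys to about 1024 characters) can be sketched as below. The digest algorithm and helper name are assumptions for illustration, not what the PR actually uses:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ValueHasher {
    // Hash an arbitrarily long literal (e.g. a LineString WKT) down to a
    // fixed-length hex string that always fits under the index-key limit.
    // The original value would be stored unhashed in a sibling field such
    // as "object_original" for retrieval.
    static String hash(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();  // 64 hex chars regardless of input length
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off raised in the thread applies directly: the index stays small and bounded, but every exact-match query must hash its argument first, and range queries over the original value are no longer served by the hashed index.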
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120257#comment-16120257 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132245115 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 thanks. @isper3at clearly a bug. please add OBJECT_HASH, OBJECT_TYPE_HASH to the first index.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132245115 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 thanks. @isper3at clearly a bug. please add OBJECT_HASH, OBJECT_TYPE_HASH to the first index.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120245#comment-16120245 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132243060 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- When the Mongo db backend was first implemented, you could only do indices over two fields-- the first is the primary index, the second the secondary index. That may have changed since. The indices we originally had were subject, predicate, object, and then subject/predicate, predicate/object, and object/subject. Not including object type might be a bug, but I had thought that was addressed at some point. Also one could argue that the single field indices were redundant-- I had wanted to test to see but never got around to it. If you can now index over more than two fields, then we might want to revisit this.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132243060 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- When the Mongo db backend was first implemented, you could only do indices over two fields-- the first is the primary index, the second the secondary index. That may have changed since. The indices we originally had were subject, predicate, object, and then subject/predicate, predicate/object, and object/subject. Not including object type might be a bug, but I had thought that was addressed at some point. Also one could argue that the single field indices were redundant-- I had wanted to test to see but never got around to it. If you can now index over more than two fields, then we might want to revisit this.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120235#comment-16120235 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132242206 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- @pujav65 I'm concerned about index size. please hash everything. If you want another ticket for "please hash everything" I'm fine with that, but let's knock that out while @isper3at is cleaning this stuff up. Key thing with mongo is to get the index to fit in memory, so let's do that.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132242206 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- @pujav65 I'm concerned about index size. please hash everything. If you want another ticket for "please hash everything" I'm fine with that, but let's knock that out while @isper3at is cleaning this stuff up. Key thing with mongo is to get the index to fit in memory, so let's do that.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120195#comment-16120195 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235261 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); -doc = new BasicDBObject(PREDICATE, 1); -doc.put(OBJECT, 1); +doc = new BasicDBObject(PREDICATE_HASH, 1); +doc.put(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); coll.createIndex(doc); -doc = new BasicDBObject(OBJECT, 1); +doc = new BasicDBObject(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); doc.put(SUBJECT, 1); --- End diff -- SUBJECT_HASH
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120194#comment-16120194 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235567 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 Looking over this index creation code... this seems like a bug... where's the SPO index? I think this first index should be SUBJECT_HASH, PREDICATE_HASH, OBJECT_HASH, OBJECT_TYPE_HASH
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235261 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); -doc = new BasicDBObject(PREDICATE, 1); -doc.put(OBJECT, 1); +doc = new BasicDBObject(PREDICATE_HASH, 1); +doc.put(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); coll.createIndex(doc); -doc = new BasicDBObject(OBJECT, 1); +doc = new BasicDBObject(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); doc.put(SUBJECT, 1); --- End diff -- SUBJECT_HASH
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235567 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 Looking over this index creation code... this seems like a bug... where's the SPO index? I think this first index should be SUBJECT_HASH, PREDICATE_HASH, OBJECT_HASH, OBJECT_TYPE_HASH
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120185#comment-16120185 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235041 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- yep, might as well hash context and object type as well.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120183#comment-16120183 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on the issue: https://github.com/apache/incubator-rya/pull/199 looks good to me. aaron's different hashing suggestion can be done later -- add it to jira to track if you don't want to do it now.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120178#comment-16120178 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132234315 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- hey i don't think we need to hash predicates and subjects - just objects. objects are possibly literals which means they can have unspecified length (and in practice are likely to be very long -- sometimes people literally put books into comments which are object values). theoretically predicates and subjects are URIs which means that to be valid they are limited in length. no harm in doing it, it just adds a layer of indirection at query time.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120179#comment-16120179 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132234427 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) { final RyaURI context = stmt.getContext(); final BasicDBObject query = new BasicDBObject(); if (subject != null){ -query.append(SUBJECT, subject.getData()); +query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData())); --- End diff -- I'll do some testing on this, but I'm guessing PRO: (1) smaller index size and (2) smaller messages over the wire. CON: Need to take care when println'ing the query.
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120158#comment-16120158 ] ASF GitHub Bot commented on RYA-293: Github user jessehatfield commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132230928 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any +// instances of c1, c2, ... or cn are also instances of c, meaning c is a superclass +// of all the rest. +// (In principle, an instance of c is likewise implied to be at least one of the other +// types, but this fact is ignored for now to avoid nondeterministic reasoning.) +iter = RyaDAOHelper.query(ryaDAO, null, OWL.UNIONOF, null, conf); +try { +while (iter.hasNext()) { +Statement st = iter.next(); +Value unionType = st.getSubject(); +// Traverse the list of types constituting the union +Value current = st.getObject(); +while (current instanceof Resource && !RDF.NIL.equals(current)) { +Resource listNode = (Resource) current; +CloseableIteration listIter = RyaDAOHelper.query(ryaDAO, +listNode, RDF.FIRST, null, conf); +try { +if (listIter.hasNext()) { +Statement firstStatement = listIter.next(); +if (firstStatement.getObject() instanceof Resource) { +Resource subclass = (Resource) firstStatement.getObject(); +Statement subclassStatement = vf.createStatement(subclass, RDFS.SUBCLASSOF, unionType); +addStatementEdge(graph, RDFS.SUBCLASSOF.stringValue(), subclassStatement); +} +} +} finally { +listIter.close(); +} +listIter = RyaDAOHelper.query(ryaDAO, listNode, RDF.REST, null, conf); +try { +if (listIter.hasNext()) { +current = listIter.next().getObject(); --- End diff -- Yep, a union is given as a linked list so we just walk down adding subclass statements until we get to 
a node with no rdf:rest or with rdf:rest equal to rdf:nil. If the list is poorly-formed or someone tries to express the union using the wrong collection type ([describes unionOf expression](https://www.w3.org/TR/owl2-rdf-based-semantics/#Semantic_Conditions_for_Boolean_Connectives) | [gives interpretation of sequence](https://www.w3.org/TR/owl2-rdf-based-semantics/#Semantic_Conditions)) then it won't work. In most cases that just means the intended union isn't fully represented, but I suppose if it were somehow a cyclical list, we'd end up in an infinite loop. > Implement owl:unionOf inference > --- > > Key: RYA-293 > URL: https://issues.apache.org/jira/browse/RYA-293 > Project: Rya > Issue Type: Sub-task > Components: sail > Reporter: Jesse Hatfield > Assignee: Jesse Hatfield > > An *{{owl:unionOf}}* expression defines one type to be equivalent to the union of another set of types. If the ontology states that {{:Parent}} is the union of {{:Mother}} and {{:Father}}, then the inference engine should rewrite statement patterns of the form {{?x rdf:type :Parent}} to check for resources that are stated to be any of the types {{:Mother}}, {{:Father}}, or {{:Parent}}.
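The list traversal described above (walking rdf:first/rdf:rest until rdf:nil), including a guard against the cyclical-list case mentioned at the end, might look like this in isolation. `UnionListWalk` is a toy sketch: a plain map of list nodes stands in for the `RyaDAOHelper` queries in the real code, and the node/class names are invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class UnionListWalk {

    /**
     * Walks an RDF collection represented as a map from list node to a
     * two-element array {rdf:first value, rdf:rest node}, collecting the
     * members until rdf:nil, a malformed node, or a revisited node.
     */
    static List<String> membersOf(String head, Map<String, String[]> lists) {
        List<String> members = new ArrayList<>();
        Set<String> visited = new HashSet<>(); // guards against cyclical lists
        String current = head;
        while (current != null && !"rdf:nil".equals(current) && visited.add(current)) {
            String[] node = lists.get(current);
            if (node == null) {
                break; // poorly-formed list: node lacks rdf:first/rdf:rest
            }
            members.add(node[0]); // rdf:first -> one member of the union
            current = node[1];    // rdf:rest  -> next list node
        }
        return members;
    }

    public static void main(String[] args) {
        // :Parent owl:unionOf (:Mother :Father) => each member becomes a
        // subclass of :Parent in the inference engine's subclass graph.
        Map<String, String[]> lists = new HashMap<>();
        lists.put("_:n1", new String[]{":Mother", "_:n2"});
        lists.put("_:n2", new String[]{":Father", "rdf:nil"});
        System.out.println(membersOf("_:n1", lists)); // [:Mother, :Father]
    }
}
```

The `visited` set makes the walk terminate on a cyclical list instead of looping forever; the code under review in the PR relies on the list being well-formed.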
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120155#comment-16120155 ] ASF GitHub Bot commented on RYA-316: Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132230502 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- woops. I'll make it just object. did you want a hash for context as well?
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120156#comment-16120156 ] ASF GitHub Bot commented on RYA-316: Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132230655 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) { final RyaURI context = stmt.getContext(); final BasicDBObject query = new BasicDBObject(); if (subject != null){ -query.append(SUBJECT, subject.getData()); +query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData())); --- End diff -- I can store as either. Not really sure if there are any pros-cons between the two
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user jdasch commented on the issue: https://github.com/apache/incubator-rya/pull/202 asfbot build
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120123#comment-16120123 ] ASF GitHub Bot commented on RYA-293: Github user jessehatfield commented on the issue: https://github.com/apache/incubator-rya/pull/180 So this is a case where a few different schema terms (rdfs:subClassOf, owl:unionOf, and eventually owl:equivalentClass) end up being represented by just one term internally (subclass/superclass relationships). We could plausibly organize by either; my intuition is to use the internal representation, since the internal graph being complete and accurate can matter to the other rules. That would mean pulling the logic introduced here, plus the logic introduced in [PR 184](https://github.com/apache/incubator-rya/pull/184), plus the logic for rdfs:subClassOf (which 184 incidentally simplifies to a method call anyway) into some "refreshSubClassGraph" method. Thoughts on that approach? Would probably do the same with subPropertyOfGraph for consistency (though it's simpler because there's no equivalent to union).
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/375/
Failed Tests: 1
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120060#comment-16120060 ] ASF GitHub Bot commented on RYA-293: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132213500 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any +// instances of c1, c2, ... or cn are also instances of c, meaning c is a superclass +// of all the rest. +// (In principle, an instance of c is likewise implied to be at least one of the other +// types, but this fact is ignored for now to avoid nondeterministic reasoning.) +iter = RyaDAOHelper.query(ryaDAO, null, OWL.UNIONOF, null, conf); +try { +while (iter.hasNext()) { +Statement st = iter.next(); +Value unionType = st.getSubject(); +// Traverse the list of types constituting the union +Value current = st.getObject(); +while (current instanceof Resource && !RDF.NIL.equals(current)) { +Resource listNode = (Resource) current; +CloseableIteration listIter = RyaDAOHelper.query(ryaDAO, +listNode, RDF.FIRST, null, conf); +try { +if (listIter.hasNext()) { +Statement firstStatement = listIter.next(); +if (firstStatement.getObject() instanceof Resource) { +Resource subclass = (Resource) firstStatement.getObject(); +Statement subclassStatement = vf.createStatement(subclass, RDFS.SUBCLASSOF, unionType); +addStatementEdge(graph, RDFS.SUBCLASSOF.stringValue(), subclassStatement); +} +} +} finally { +listIter.close(); +} +listIter = RyaDAOHelper.query(ryaDAO, listNode, RDF.REST, null, conf); +try { +if (listIter.hasNext()) { +current = listIter.next().getObject(); --- End diff -- Trying to follow the general logic here: Each list has an RDF.FIRST and RDF.REST property, where FIRST is a resource and REST is of type list. So if the list has more than one element, current is set to the list obtained by the RDF.REST query and we go through the loop again. Is this how all of the SUBCLASSOF statements are created for the union?
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120059#comment-16120059 ] ASF GitHub Bot commented on RYA-293: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132207923 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any --- End diff -- I thought that you were going to start breaking out any new logic that you added to the refreshGraph() method into methods that were specific to the given rule.
[GitHub] incubator-rya pull request #180: RYA-293 Added owl:unionOf inference
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132213500 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any +// instances of c1, c2, ... or cn are also instances of c, meaning c is a superclass +// of all the rest. +// (In principle, an instance of c is likewise implied to be at least one of the other +// types, but this fact is ignored for now to avoid nondeterministic reasoning.) +iter = RyaDAOHelper.query(ryaDAO, null, OWL.UNIONOF, null, conf); +try { +while (iter.hasNext()) { +Statement st = iter.next(); +Value unionType = st.getSubject(); +// Traverse the list of types constituting the union +Value current = st.getObject(); +while (current instanceof Resource && !RDF.NIL.equals(current)) { +Resource listNode = (Resource) current; +CloseableIteration listIter = RyaDAOHelper.query(ryaDAO, +listNode, RDF.FIRST, null, conf); +try { +if (listIter.hasNext()) { +Statement firstStatement = listIter.next(); +if (firstStatement.getObject() instanceof Resource) { +Resource subclass = (Resource) firstStatement.getObject(); +Statement subclassStatement = vf.createStatement(subclass, RDFS.SUBCLASSOF, unionType); +addStatementEdge(graph, RDFS.SUBCLASSOF.stringValue(), subclassStatement); +} +} +} finally { +listIter.close(); +} +listIter = RyaDAOHelper.query(ryaDAO, listNode, RDF.REST, null, conf); +try { +if (listIter.hasNext()) { +current = listIter.next().getObject(); --- End diff -- Trying to follow the general logic here: Each list has an RDF.FIRST and RDF.RESET property, where FIRST is a resource and REST is of type list. So if the list has more than one element, current is set to the list obtained by the RDF.REST query and we go through the loop again. 
Is this how all of the SUBCLASSOF statements are created for the union?
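The traversal the reviewer is describing can be sketched in isolation. The following is a toy stand-in, not the Rya code: a `Map`-based triple store replaces the `RyaDAOHelper.query` calls, and all names (`RdfListTraversal`, `collectUnionMembers`, the `ex:` IRIs) are invented for illustration. It walks `rdf:rest` links from the head of an RDF collection until `rdf:nil`, collecting each `rdf:first` value, which is exactly where `InferenceEngine` would assert the `rdfs:subClassOf` edge.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the rdf:first/rdf:rest list walk in the diff above. */
public class RdfListTraversal {
    static final String FIRST = "rdf:first";
    static final String REST = "rdf:rest";
    static final String NIL = "rdf:nil";

    /** subject -> (predicate -> object); one object per predicate for simplicity. */
    public static Map<String, Map<String, String>> sampleTriples() {
        Map<String, Map<String, String>> t = new HashMap<>();
        // LIST(ex:c1, ex:c2) encoded as two list nodes.
        t.put("_:l1", Map.of(FIRST, "ex:c1", REST, "_:l2"));
        t.put("_:l2", Map.of(FIRST, "ex:c2", REST, NIL));
        return t;
    }

    /** Follows rdf:rest links from 'head' until rdf:nil, collecting each rdf:first. */
    public static List<String> collectUnionMembers(Map<String, Map<String, String>> triples, String head) {
        List<String> members = new ArrayList<>();
        String current = head;
        while (current != null && !NIL.equals(current)) {
            Map<String, String> props = triples.getOrDefault(current, Map.of());
            String first = props.get(FIRST);
            if (first != null) {
                // In InferenceEngine this is where 'first rdfs:subClassOf unionType' is added.
                members.add(first);
            }
            current = props.get(REST); // one element left? REST is rdf:nil and the loop ends
        }
        return members;
    }

    public static void main(String[] args) {
        System.out.println(collectUnionMembers(sampleTriples(), "_:l1")); // [ex:c1, ex:c2]
    }
}
```

So yes, under this reading every member of the union contributes one SUBCLASSOF statement, one per iteration of the outer while loop.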
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120003#comment-16120003 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132192625 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkAr
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120009#comment-16120009 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132195020 --- Diff: extras/rya.benchmark/src/main/gen/org/apache/rya/benchmark/query/Rya.java --- @@ -20,7 +20,7 @@ // This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.11 // See http://java.sun.com/xml/jaxb";>http://java.sun.com/xml/jaxb // Any modifications to this file will be lost upon recompilation of the source schema. -// Generated on: 2016.12.16 at 01:22:14 PM PST +// Generated on: 2017.07.06 at 03:13:11 PM EDT --- End diff -- This and the above source gen files should be fixed by Eric's latest PR. > Mongo PCJ indexer support > - > > Key: RYA-303 > URL: https://issues.apache.org/jira/browse/RYA-303 > Project: Rya > Issue Type: Improvement >Reporter: Andrew Smith >Assignee: Andrew Smith > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120005#comment-16120005 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132182885 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/BasePcjIndexer.java --- @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static com.google.common.base.Preconditions.checkState;
+import static java.util.Collections.singleton;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.groupingBy;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicReference;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.domain.RyaStatement;
+import org.apache.rya.api.domain.RyaURI;
+import org.apache.rya.indexing.entity.model.Entity;
+import org.apache.rya.indexing.entity.storage.EntityStorage;
+import org.apache.rya.indexing.entity.storage.EntityStorage.EntityStorageException;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.mongodb.MongoDBRdfConfiguration;
+import org.apache.rya.mongodb.MongoSecondaryIndex;
+import org.openrdf.model.URI;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+
+/**
+ * A base class that may be used to update an {@link EntityStorage} as new
+ * {@link RyaStatement}s are added to/removed from the Rya instance.
+ */
+@DefaultAnnotation(NonNull.class)
+public abstract class BasePcjIndexer implements PcjIndexer, MongoSecondaryIndex {
--- End diff --
I'm not sure what this class is interacting with. The basic components of our PCJ framework are the matcher framework for query optimization, the storage layer, and the indexer layer. It seems like this is related to the indexer layer. But the indexer layer is meant to interact with the updater (whatever observer framework we use to maintain the PCJs). Given that there is currently no updater in place, what is the purpose of BasePcjIndexer, PcjIndexer, and MongoPrecomputedJoinIndexer?
I can understand including abstract classes and interfaces just to have them in place when an updater is incorporated, but some of these are concrete implementations. So what are they interacting with?
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120011#comment-16120011 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132198496 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjDocuments.java --- @@ -0,0 +1,418 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.storage.mongo; + +import static com.google.common.base.Preconditions.checkNotNull; +import static java.util.Objects.requireNonNull; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashSet; +import java.util.Iterator; +import java.util.List; +import java.util.Set; + +import org.apache.accumulo.core.security.Authorizations; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.api.resolver.RyaToRdfConversions; +import org.apache.rya.indexing.pcj.storage.PcjMetadata; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet; +import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.impl.URIImpl; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.QueryLanguage; +import org.openrdf.query.TupleQuery; +import org.openrdf.query.TupleQueryResult; +import org.openrdf.query.impl.MapBindingSet; +import org.openrdf.repository.RepositoryConnection; +import org.openrdf.repository.RepositoryException; + +import com.mongodb.MongoClient; +import com.mongodb.client.FindIterable; +import com.mongodb.client.MongoCollection; +import com.mongodb.util.JSON; + +/** + * Creates and modifies PCJs in MongoDB. 
PCJs are stored as follows:
+ *
+ * - PCJ Metadata Doc -
+ * {
+ *   _id: [table_name]_METADATA,
+ *   sparql: [sparql query to match results],
+ *   cardinality: [number of results]
+ * }
+ *
+ * - PCJ Results Doc -
+ * {
+ *   pcjName: [table_name],
+ *   auths: [auths]
+ *   [binding_var1]: {
+ *     uri: [type_uri],
+ *     value: value
+ *   }
+ *   .
+ *   .
+ *   .
+ *   [binding_varn]: {
--- End diff --
binding_var2
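To make the schema in that javadoc concrete, here is what a hypothetical metadata document and one results document might look like once the placeholders are filled in. All names and values below are invented for illustration, not taken from the PR:

```json
{
  "_id": "myPcj_METADATA",
  "sparql": "SELECT ?name ?age WHERE { ?name <urn:hasAge> ?age }",
  "cardinality": 2
}

{
  "pcjName": "myPcj",
  "auths": "U",
  "name": { "uri": "http://www.w3.org/2001/XMLSchema#anyURI", "value": "urn:Alice" },
  "age": { "uri": "http://www.w3.org/2001/XMLSchema#integer", "value": "14" }
}
```

One results document per binding set, with one `{uri, value}` sub-document per bound variable.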
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1611#comment-1611 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132187057 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
--- End diff --
I think that you should make this name more Mongo-centric. PcjQueryNode sounds very general purpose and makes it seem like this class is DB agnostic. We should have similar naming conventions for the two PCJ nodes. Currently the other node is AccumuloIndexSet. If you don't like MongoIndexSet, we can rename that to AccumuloPcjNode and rename this class to MongoPcjNode.
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120008#comment-16120008 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132193846 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/pcj/matching/PCJOptimizer.java --- @@ -90,9 +97,19 @@ public final void setConf(final Configuration conf) {
         if (!init) {
             try {
                 this.conf = conf;
-                this.useOptimal = ConfigUtils.getUseOptimalPCJ(conf);
-                provider = new AccumuloIndexSetProvider(conf);
-            } catch (Exception e) {
+                useOptimal = ConfigUtils.getUseOptimalPCJ(conf);
--- End diff --
Ugh, I hate that we have to do this and we can't use an Interface. Stupid setConf() init.
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119997#comment-16119997 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132190366 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkAr
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119996#comment-16119996 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132175735 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static java.util.Objects.requireNonNull;
+
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
+import org.apache.rya.api.instance.RyaDetailsRepository;
+import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
+import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
+import org.apache.rya.mongodb.MongoDBRdfConfiguration;
+import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.mongodb.MongoClient;
+
+/**
+ * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
+ */
+public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
+    private final MongoClient client;
+    private final MongoDBRdfConfiguration mongoConf;
+
+    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
+        super(conf);
+        this.client = client;
+        mongoConf = new MongoDBRdfConfiguration(conf);
+    }
+
+    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
+        super(conf, indices);
+        this.client = client;
--- End diff --
Preconditions
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120001#comment-16120001 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132175692 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static java.util.Objects.requireNonNull;
+
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
+import org.apache.rya.api.instance.RyaDetailsRepository;
+import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
+import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
+import org.apache.rya.mongodb.MongoDBRdfConfiguration;
+import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.mongodb.MongoClient;
+
+/**
+ * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
+ */
+public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
+    private final MongoClient client;
+    private final MongoDBRdfConfiguration mongoConf;
+
+    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
+        super(conf);
+        this.client = client;
--- End diff --
Preconditions
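The one-word "Preconditions" comments above presumably ask for fail-fast null checks on the injected constructor arguments. A minimal sketch of that pattern, with an invented class name and a stand-in for the Mongo client type (this is not the PR's actual code):

```java
import static java.util.Objects.requireNonNull;

/** Sketch of the requested guard: reject a null dependency in the constructor
 *  instead of deferring a NullPointerException to first use. */
public class GuardedProvider {
    /** Stand-in for com.mongodb.MongoClient so the sketch is self-contained. */
    interface MongoClientLike {}

    private final MongoClientLike client;

    public GuardedProvider(final MongoClientLike client) {
        // Guava's Preconditions.checkNotNull would work identically here.
        this.client = requireNonNull(client, "client may not be null");
    }

    public static void main(String[] args) {
        try {
            new GuardedProvider(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Both constructors in the quoted diff assign `this.client = client;` without a check, so the guard would go in each.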
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120007#comment-16120007 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132201588 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoVisibilityBindingSetBsonConverter.java --- @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.pcj.storage.mongo;
+
+import static org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter.DOCUMENT_VISIBILITY_KEY;
+
+import org.apache.rya.api.domain.RyaType;
+import org.apache.rya.api.resolver.RdfToRyaConversions;
+import org.apache.rya.api.resolver.RyaToRdfConversions;
+import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet;
+import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder;
+import org.apache.rya.mongodb.document.visibility.DocumentVisibility;
+import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter;
+import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter.MalformedDocumentVisibilityException;
+import org.bson.BsonArray;
+import org.bson.BsonDocument;
+import org.bson.BsonString;
+import org.bson.Document;
+import org.openrdf.model.Value;
+import org.openrdf.model.impl.URIImpl;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.impl.MapBindingSet;
+
+import com.mongodb.DBObject;
+import com.mongodb.MongoClient;
+import com.mongodb.util.JSON;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+
+/**
+ * Converts {@link BindingSet}s to Strings and back again. The Strings do not
+ * include the binding names and are ordered with a {@link VariableOrder}.
+ */
+@DefaultAnnotation(NonNull.class)
+public class MongoVisibilityBindingSetBsonConverter {/* implements MongoBindingSetConverter {
--- End diff --
What's going on here? Why is this commented out?
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120004#comment-16120004 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132188383

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@

    +/**
    + * Indexing Node for PCJs expressions to be inserted into execution plan to
    + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
    + */
    +@DefaultAnnotation(NonNull.class)
    +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
    +    private final String tablename;
    +    private final PcjIndexer indexer;
    +    private final MongoPcjDocuments pcjDocs;
    +
    +    /**
    +     * @param sparql - name of sparql query whose results will be stored in PCJ table
    +     * @param conf - Rya Configuration
    +     * @param tablename - name of an existing PCJ table
    +     */
    +    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
    +            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
    +            AccumuloException, AccumuloSecurityException, PCJStorageException {
    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();

--- End diff --

So, the MongoPrecomputedJoinIndexer and the PcjQueryNode do not need to talk to each other. Again, the PrecomputedJoinIndexer is for ingesting data into the Updater, while the PcjQueryNode is a placeholder for the sub query that the PCJ match
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120006#comment-16120006 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132202883

--- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjAdapter.java ---
@@ -0,0 +1,87 @@

    +/**
    + * Converts a Pcj for storage in mongoDB or retrieval from mongoDB.
    + */
    +public class MongoPcjAdapter implements BindingSetConverter {

--- End diff --

Where is this class used? It doesn't appear that this class or the converter above are used anywhere.
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1612#comment-1612 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132183937

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPrecomputedJoinIndexer.java ---
@@ -0,0 +1,89 @@

    +/**
    + * Updates the state of the Precomputed Join indices that are used by Rya.
    + */
    +@DefaultAnnotation(NonNull.class)
    +public class MongoPrecomputedJoinIndexer extends BasePcjIndexer {

--- End diff --

It appears this class does nothing. Is this just a stub class for when an observer framework is in place?
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120002#comment-16120002 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132189129

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@

    +    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
    +            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
    +            AccumuloException, AccumuloSecurityException, PCJStorageException {
    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();
    +        this.tablename = tablename;
    +        final SPARQLParser sp = new SPARQLParser();
    +        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
    +        final TupleExpr te = pq.getTupleExpr();
    +        Preconditions.checkAr
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119998#comment-16119998 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132189025

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@

    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();
    +        this.tablename = tablename;

--- End diff --

Preconditions
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119995#comment-16119995 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132173821

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java ---
@@ -0,0 +1,125 @@

    +/**
    + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
    + */
    +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
    +    private final MongoClient client;
    +    private final MongoDBRdfConfiguration mongoConf;
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
    +        super(conf);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
    +        super(conf, indices);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    @Override
    +    protected List getIndices() throws Exception {
    +        requireNonNull(conf);
    +        final MongoPcjDocuments pcjTables = new MongoPcjDocuments(client, mongoConf.getMongoDBName());
    +        final String pcjPrefix = requireNonNull(conf.get(RdfCloudTripleStoreConfiguration.CONF_TBL_PREFIX));
    +        List tables = null;
    +
    +        tables = mongoConf.getPcjTables();
    +        // this map associates pcj table name with pcj sparql query
    +        final Map indexTables = Maps.newLinkedHashMap();
    +
    +        try (final PrecomputedJoinStorage storage = new MongoPcjStorage(client, mongoConf.getMongoInstance(), null)) {
    +            final PcjTableNameFactory pcjFactory = new PcjTableNameFactory();
    +
    +            final boolean tablesProvided = tables != null && !tables.isEmpty();
    +
    +            if (tablesProvided) {
    +                // if tables provided, associate table name with sparql
    +                for (final String table : tables) {
    +                    indexTables.put(table, storage.getPcjMetadata(pcjFactory.getPcjId(table)).getSparql());
    +                }
    +            } else if (hasRyaDetails(mongoConf.getMongoDBName())) {
    +                // If this is a newer install of Rya, and it has PCJ Details, then use those.
    +                final List ids = storage.listPcjs();
    +                for (final String id : ids) {
    +                    indexTables.put(pcjFactory.makeTableName(pcjPrefix, id), storage.getPcjMetadata(id)
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132193846

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/pcj/matching/PCJOptimizer.java ---

    @@ -90,9 +97,19 @@ public final void setConf(final Configuration conf) {
             if (!init) {
                 try {
                     this.conf = conf;
    -                this.useOptimal = ConfigUtils.getUseOptimalPCJ(conf);
    -                provider = new AccumuloIndexSetProvider(conf);
    -            } catch (Exception e) {
    +                useOptimal = ConfigUtils.getUseOptimalPCJ(conf);

--- End diff --

Ugh, I hate that we have to do this and we can't use an Interface. Stupid setConf() init.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
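The complaint above is about Hadoop's Configurable contract: objects are built with a no-arg constructor and the Configuration is injected afterwards via setConf(), so fields that would naturally be final must instead be set behind a guard flag. A minimal standalone sketch of that pattern, with a tiny stand-in for org.apache.hadoop.conf.Configuration and a made-up property key, since the real class would pull in Hadoop:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the setConf() lazy-initialization pattern the quoted diff uses.
public class SetConfSketch {
    // Minimal stand-in for org.apache.hadoop.conf.Configuration.
    static class Configuration {
        private final Map<String, String> props = new HashMap<>();

        void set(final String key, final String value) {
            props.put(key, value);
        }

        boolean getBoolean(final String key, final boolean defaultValue) {
            final String v = props.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    private boolean init = false;  // guard: repeated setConf() calls are no-ops
    private boolean useOptimal;

    // The framework constructs this object with no arguments, then injects
    // configuration afterwards -- hence mutable fields instead of final ones.
    public void setConf(final Configuration conf) {
        if (!init) {
            useOptimal = conf.getBoolean("pcj.use.optimal", false);
            init = true;
        }
    }

    public boolean getUseOptimal() {
        return useOptimal;
    }

    public static void main(final String[] args) {
        final SetConfSketch optimizer = new SetConfSketch();  // no-arg construction
        final Configuration conf = new Configuration();
        conf.set("pcj.use.optimal", "true");
        optimizer.setConf(conf);                        // first call initializes
        System.out.println(optimizer.getUseOptimal());  // prints "true"
        optimizer.setConf(new Configuration());         // ignored: already initialized
        System.out.println(optimizer.getUseOptimal());  // still "true"
    }
}
```

The guard flag is what makes post-construction injection safe to call more than once, at the cost of the immutability an interface-plus-constructor design would give.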
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132182885 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/BasePcjIndexer.java --- @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkState; +import static java.util.Collections.singleton; +import static java.util.Objects.requireNonNull; +import static java.util.stream.Collectors.groupingBy; + +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Map; +import java.util.Map.Entry; +import java.util.Set; +import java.util.concurrent.atomic.AtomicReference; + +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.domain.RyaStatement; +import org.apache.rya.api.domain.RyaURI; +import org.apache.rya.indexing.entity.model.Entity; +import org.apache.rya.indexing.entity.storage.EntityStorage; +import org.apache.rya.indexing.entity.storage.EntityStorage.EntityStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.apache.rya.mongodb.MongoSecondaryIndex; +import org.openrdf.model.URI; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; + +/** + * A base class that may be used to update an {@link EntityStorage} as new + * {@link RyaStatement}s are added to/removed from the Rya instance. + */ +@DefaultAnnotation(NonNull.class) +public abstract class BasePcjIndexer implements PcjIndexer, MongoSecondaryIndex { --- End diff -- I'm not sure what this class is interacting with. The basic components of our PCJ framework are the matcher framework for query optimization, the storage layer, and the indexer layer. It seems like this is related to the indexer layer. But the indexer layer is meant to interact with the updater (whatever observer framework we use to maintain the PCJs). Given that there is currently no updater in place, what is the purpose of BasePcjIndexer, PcjIndexer, and MongoPrecomputedJoinIndexer? 
I can understand including abstract classes and interfaces just to have them in place for when an updater is incorporated, but some of these are concrete implementations. So what are they interacting with? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
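[Editor's note: a minimal sketch of the layering the comment describes — the indexer layer is only ever driven by an updater, so a concrete indexer with no updater in place has no caller. All names here (PcjUpdater, RecordingIndexer, Statement, storeStatement) are illustrative stand-ins, not types from the Rya codebase.]

```java
import java.util.ArrayList;
import java.util.List;

public class IndexerLayerSketch {
    /** Stand-in for a RyaStatement triple. */
    record Statement(String subject, String predicate, String object) {}

    /** The indexer layer: maintains PCJ results as statements change. */
    interface PcjIndexer {
        void storeStatement(Statement s);
        void deleteStatement(Statement s);
    }

    /** The updater layer (the observer framework) that would drive the indexer. */
    static final class PcjUpdater {
        private final PcjIndexer indexer;
        PcjUpdater(PcjIndexer indexer) { this.indexer = indexer; }
        void onAdd(Statement s) { indexer.storeStatement(s); }     // ingest path
        void onRemove(Statement s) { indexer.deleteStatement(s); } // delete path
    }

    /** A concrete indexer that just records calls, for the sketch. */
    static final class RecordingIndexer implements PcjIndexer {
        final List<Statement> stored = new ArrayList<>();
        public void storeStatement(Statement s) { stored.add(s); }
        public void deleteStatement(Statement s) { stored.remove(s); }
    }

    public static void main(String[] args) {
        RecordingIndexer indexer = new RecordingIndexer();
        PcjUpdater updater = new PcjUpdater(indexer);
        // Without an updater, nothing invokes the indexer; with one, ingest flows through it.
        updater.onAdd(new Statement("urn:alice", "urn:knows", "urn:bob"));
        if (indexer.stored.size() != 1) throw new AssertionError("indexer was not driven by the updater");
    }
}
```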
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132195020 --- Diff: extras/rya.benchmark/src/main/gen/org/apache/rya/benchmark/query/Rya.java --- @@ -20,7 +20,7 @@ // This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.11 // See http://java.sun.com/xml/jaxb // Any modifications to this file will be lost upon recompilation of the source schema. -// Generated on: 2016.12.16 at 01:22:14 PM PST +// Generated on: 2017.07.06 at 03:13:11 PM EDT --- End diff -- This and the above source gen files should be fixed by Eric's latest PR.
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132187057 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */ +@DefaultAnnotation(NonNull.class) +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator { --- End diff -- I think that you should make this name more Mongo-centric. PcjQueryNode sounds very general-purpose and makes it seem like this class is DB agnostic. We should have similar naming conventions for the two PCJ nodes. Currently the other node is AccumuloIndexSet. If you don't like MongoIndexSet, we can rename that to AccumuloPcjNode and rename this class to MongoPcjNode.
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132175692 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static java.util.Objects.requireNonNull; + +import java.util.List; +import java.util.Map; + +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.RdfCloudTripleStoreConfiguration; +import org.apache.rya.api.instance.RyaDetailsRepository; +import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage; +import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository; + +import com.google.common.collect.Lists; +import com.google.common.collect.Maps; +import com.mongodb.MongoClient; + +/** + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB. + */ +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider { +private final MongoClient client; +private final MongoDBRdfConfiguration mongoConf; + +public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) { +super(conf); +this.client = client; --- End diff -- Preconditions
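[Editor's note: the one-word review comment asks for a precondition check on the constructor argument. A minimal sketch of the fix using the JDK's java.util.Objects.requireNonNull (Guava's Preconditions.checkNotNull would be equivalent); the MongoClient field is replaced with a plain Object stand-in so the sketch needs no driver dependency.]

```java
import java.util.Objects;

public class ProviderPreconditionSketch {
    private final Object client; // stand-in for com.mongodb.MongoClient

    public ProviderPreconditionSketch(final Object client) {
        // Fail fast with a clear message instead of a NullPointerException later on.
        this.client = Objects.requireNonNull(client, "client must not be null");
    }

    public static void main(String[] args) {
        new ProviderPreconditionSketch(new Object()); // valid argument accepted
        try {
            new ProviderPreconditionSketch(null);
            throw new AssertionError("expected NullPointerException");
        } catch (NullPointerException expected) {
            // the precondition rejected the null argument at construction time
        }
    }
}
```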
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132198496 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjDocuments.java --- @@ -0,0 +1,418 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.storage.mongo; + +import static com.google.common.base.Preconditions.checkNotNull; +import static java.util.Objects.requireNonNull; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashSet; +import java.util.Iterator; +import java.util.List; +import java.util.Set; + +import org.apache.accumulo.core.security.Authorizations; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.api.resolver.RyaToRdfConversions; +import org.apache.rya.indexing.pcj.storage.PcjMetadata; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet; +import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.impl.URIImpl; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.QueryLanguage; +import org.openrdf.query.TupleQuery; +import org.openrdf.query.TupleQueryResult; +import org.openrdf.query.impl.MapBindingSet; +import org.openrdf.repository.RepositoryConnection; +import org.openrdf.repository.RepositoryException; + +import com.mongodb.MongoClient; +import com.mongodb.client.FindIterable; +import com.mongodb.client.MongoCollection; +import com.mongodb.util.JSON; + +/** + * Creates and modifies PCJs in MongoDB. 
PCJs are stored as follows:
+ *
+ * - PCJ Metadata Doc -
+ * {
+ *   _id: [table_name]_METADATA,
+ *   sparql: [sparql query to match results],
+ *   cardinality: [number of results]
+ * }
+ *
+ * - PCJ Results Doc -
+ * {
+ *   pcjName: [table_name],
+ *   auths: [auths]
+ *   [binding_var1]: {
+ *     uri: [type_uri],
+ *     value: value
+ *   }
+ *   .
+ *   .
+ *   .
+ *   [binding_varn]: { --- End diff -- binding_var2
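[Editor's note: a sketch of the two document shapes described in the javadoc above, built with plain maps so it runs without the MongoDB driver (in MongoPcjDocuments itself these would be org.bson.Document instances). Field names follow the quoted javadoc; the helper names and sample values are illustrative.]

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PcjDocumentSketch {
    /** Builds the "PCJ Metadata Doc" shape: _id, sparql, cardinality. */
    static Map<String, Object> metadataDoc(String tableName, String sparql, long cardinality) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", tableName + "_METADATA");
        doc.put("sparql", sparql);
        doc.put("cardinality", cardinality);
        return doc;
    }

    /** Builds a "PCJ Results Doc" shape with one binding variable sub-document. */
    static Map<String, Object> resultDoc(String tableName, String auths,
            String bindingVar, String typeUri, String value) {
        Map<String, Object> binding = new LinkedHashMap<>();
        binding.put("uri", typeUri);   // the value's type URI
        binding.put("value", value);   // the bound value itself
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("pcjName", tableName);
        doc.put("auths", auths);
        doc.put(bindingVar, binding);  // one entry per binding variable
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> meta = metadataDoc("pcj_1", "SELECT ?x WHERE { ?x <urn:knows> ?y }", 0L);
        if (!"pcj_1_METADATA".equals(meta.get("_id"))) throw new AssertionError();
        Map<String, Object> result = resultDoc("pcj_1", "U", "x",
                "http://www.w3.org/2001/XMLSchema#anyURI", "urn:alice");
        if (!result.containsKey("x")) throw new AssertionError();
    }
}
```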
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132189129 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132202883 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjAdapter.java --- @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.storage.mongo; + +import static com.google.common.base.Preconditions.checkNotNull; + +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.api.resolver.RyaToRdfConversions; +import org.apache.rya.indexing.pcj.storage.accumulo.BindingSetConverter; +import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder; +import org.bson.BsonArray; +import org.bson.BsonDocument; +import org.bson.BsonString; +import org.bson.BsonValue; +import org.bson.Document; +import org.bson.codecs.DocumentCodec; +import org.bson.codecs.configuration.CodecRegistries; +import org.bson.codecs.configuration.CodecRegistry; +import org.bson.conversions.Bson; +import org.openrdf.model.impl.URIImpl; +import org.openrdf.query.BindingSet; +import org.openrdf.query.algebra.evaluation.QueryBindingSet; + +/** + * Converts a Pcj for storage in mongoDB or retrieval from mongoDB. + */ +public class MongoPcjAdapter implements BindingSetConverter { --- End diff -- Where is this class used? It doesn't appear that this class or the converter above are used anywhere.
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132188383 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */ +@DefaultAnnotation(NonNull.class) +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator { +private final String tablename; +private final PcjIndexer indexer; +private final MongoPcjDocuments pcjDocs; + +/** + * + * @param sparql - name of sparql query whose results will be stored in PCJ table + * @param conf - Rya Configuration + * @param tablename - name of an existing PCJ table + * @throws MalformedQueryException + * @throws SailException + * @throws QueryEvaluationException + * @throws TableNotFoundException + * @throws AccumuloSecurityException + * @throws AccumuloException + * @throws PCJStorageException + */ +public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs) +throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException, +AccumuloException, AccumuloSecurityException, PCJStorageException { +this.pcjDocs = checkNotNull(pcjDocs); +indexer = new MongoPrecomputedJoinIndexer(); --- End diff -- So, the MongoPrecomputedJoinIndexer and the PcjQueryNode do not need to talk to each other. Again, the PrecomputedJoinIndexer is for ingesting data into the Updater, while the PcjQueryNode is a placeholder for the sub query that the PCJ matches.
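[Editor's note: a sketch of the separation the comment asks for — the query node only reads already-stored PCJ results and never constructs or holds an indexer, since ingest belongs to the updater path. The names PcjDocumentsReader and QueryNodeSketch are illustrative stand-ins (PcjDocumentsReader plays the role of MongoPcjDocuments).]

```java
import java.util.List;

public class QueryNodeSketch {
    /** Read-only view of stored PCJ results (stand-in for MongoPcjDocuments). */
    interface PcjDocumentsReader {
        List<String> results(String pcjTableName);
    }

    private final String tableName;
    private final PcjDocumentsReader pcjDocs; // query path only: no indexer field

    QueryNodeSketch(String tableName, PcjDocumentsReader pcjDocs) {
        this.tableName = tableName;
        this.pcjDocs = pcjDocs;
    }

    /** Evaluation just delegates to the stored results for this PCJ. */
    List<String> evaluate() {
        return pcjDocs.results(tableName);
    }

    public static void main(String[] args) {
        QueryNodeSketch node = new QueryNodeSketch("pcj_1", table -> List.of(table + ":bs1"));
        if (!node.evaluate().equals(List.of("pcj_1:bs1"))) throw new AssertionError();
    }
}
```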
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132192625 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132190366 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/172#discussion_r132189025

    --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
    @@ -0,0 +1,184 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.rya.indexing.mongodb.pcj;
    +
    +import static com.google.common.base.Preconditions.checkNotNull;
    +
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.HashSet;
    +
    +import org.apache.accumulo.core.client.AccumuloException;
    +import org.apache.accumulo.core.client.AccumuloSecurityException;
    +import org.apache.accumulo.core.client.TableNotFoundException;
    +import org.apache.accumulo.core.security.Authorizations;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.rya.api.utils.IteratorWrapper;
    +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
    +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil;
    +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities;
    +import org.apache.rya.indexing.pcj.storage.PcjException;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
    +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator;
    +import org.openrdf.query.BindingSet;
    +import org.openrdf.query.MalformedQueryException;
    +import org.openrdf.query.QueryEvaluationException;
    +import org.openrdf.query.algebra.Projection;
    +import org.openrdf.query.algebra.TupleExpr;
    +import org.openrdf.query.parser.ParsedTupleQuery;
    +import org.openrdf.query.parser.sparql.SPARQLParser;
    +import org.openrdf.sail.SailException;
    +
    +import com.google.common.base.Optional;
    +import com.google.common.base.Preconditions;
    +
    +import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
    +import edu.umd.cs.findbugs.annotations.NonNull;
    +import info.aduna.iteration.CloseableIteration;
    +
    +/**
    + * Indexing Node for PCJs expressions to be inserted into execution plan to
    + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
    + */
    +@DefaultAnnotation(NonNull.class)
    +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
    +    private final String tablename;
    +    private final PcjIndexer indexer;
    +    private final MongoPcjDocuments pcjDocs;
    +
    +    /**
    +     *
    +     * @param sparql - name of sparql query whose results will be stored in PCJ table
    +     * @param conf - Rya Configuration
    +     * @param tablename - name of an existing PCJ table
    +     * @throws MalformedQueryException
    +     * @throws SailException
    +     * @throws QueryEvaluationException
    +     * @throws TableNotFoundException
    +     * @throws AccumuloSecurityException
    +     * @throws AccumuloException
    +     * @throws PCJStorageException
    +     */
    +    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
    +            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
    +            AccumuloException, AccumuloSecurityException, PCJStorageException {
    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();
    +        this.tablename = tablename;

--- End diff --

Preconditions

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
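The one-word review note "Preconditions" presumably asks for the remaining constructor arguments (`sparql`, `tablename`) to be null-checked eagerly, the way `pcjDocs` already is. A minimal, dependency-free sketch of that fail-fast pattern follows; it uses the JDK's `java.util.Objects.requireNonNull`, which behaves like Guava's `Preconditions.checkNotNull`, and the class name `PcjTableNode` is a hypothetical stand-in rather than Rya's actual `PcjQueryNode`:

```java
import static java.util.Objects.requireNonNull;

public class PcjTableNode {
    private final String sparql;
    private final String tablename;

    public PcjTableNode(final String sparql, final String tablename) {
        // Fail fast with a descriptive message instead of a bare field
        // assignment, so a null surfaces at construction time rather than
        // later during query evaluation.
        this.sparql = requireNonNull(sparql, "sparql must not be null");
        this.tablename = requireNonNull(tablename, "tablename must not be null");
    }

    public String getSparql() {
        return sparql;
    }

    public String getTablename() {
        return tablename;
    }
}
```

With this in place a null argument throws a `NullPointerException` carrying the message immediately, which is the behavior the reviewer appears to be asking for.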
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/172#discussion_r132173821

    --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java ---
    @@ -0,0 +1,125 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.rya.indexing.mongodb.pcj;
    +
    +import static java.util.Objects.requireNonNull;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
    +import org.apache.rya.api.instance.RyaDetailsRepository;
    +import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
    +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
    +import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
    +import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
    +import org.apache.rya.mongodb.MongoDBRdfConfiguration;
    +import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
    +
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Maps;
    +import com.mongodb.MongoClient;
    +
    +/**
    + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
    + */
    +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
    +    private final MongoClient client;
    +    private final MongoDBRdfConfiguration mongoConf;
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
    +        super(conf);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
    +        super(conf, indices);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    @Override
    +    protected List getIndices() throws Exception {
    +        requireNonNull(conf);
    +        final MongoPcjDocuments pcjTables = new MongoPcjDocuments(client, mongoConf.getMongoDBName());
    +        final String pcjPrefix = requireNonNull(conf.get(RdfCloudTripleStoreConfiguration.CONF_TBL_PREFIX));
    +        List tables = null;
    +
    +        tables = mongoConf.getPcjTables();
    +        // this maps associates pcj table name with pcj sparql query
    +        final Map indexTables = Maps.newLinkedHashMap();
    +
    +        try (final PrecomputedJoinStorage storage = new MongoPcjStorage(client, mongoConf.getMongoInstance(), null)) {
    +            final PcjTableNameFactory pcjFactory = new PcjTableNameFactory();
    +
    +            final boolean tablesProvided = tables != null && !tables.isEmpty();
    +
    +            if (tablesProvided) {
    +                // if tables provided, associate table name with sparql
    +                for (final String table : tables) {
    +                    indexTables.put(table, storage.getPcjMetadata(pcjFactory.getPcjId(table)).getSparql());
    +                }
    +            } else if (hasRyaDetails(mongoConf.getMongoDBName())) {
    +                // If this is a newer install of Rya, and it has PCJ Details,
    +                // then use those.
    +                final List ids = storage.listPcjs();
    +                for (final String id : ids) {
    +                    indexTables.put(pcjFactory.makeTableName(pcjPrefix, id), storage.getPcjMetadata(id).getSparql());
    +                }
    +            } else {
    +                // Otherwise figure it out by getting document IDs.
    +                tables = pcjTables.listPcjDocuments();
    +                for (final String table : tab
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/172#discussion_r132175735

    --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java ---
    @@ -0,0 +1,125 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.rya.indexing.mongodb.pcj;
    +
    +import static java.util.Objects.requireNonNull;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
    +import org.apache.rya.api.instance.RyaDetailsRepository;
    +import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
    +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
    +import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
    +import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
    +import org.apache.rya.mongodb.MongoDBRdfConfiguration;
    +import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
    +
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Maps;
    +import com.mongodb.MongoClient;
    +
    +/**
    + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
    + */
    +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
    +    private final MongoClient client;
    +    private final MongoDBRdfConfiguration mongoConf;
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
    +        super(conf);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
    +        super(conf, indices);
    +        this.client = client;

--- End diff --

Preconditions
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132191684

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -53,8 +54,11 @@
         public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue();
         public static final String CONTEXT = "context";
         public static final String PREDICATE = "predicate";
    -    public static final String OBJECT = "object";
    +    public static final String PREDICATE_HASH = "predicate_hash";
    +    public static final String OBJECT = "object_original";

--- End diff --

Can you change this to just "object" or change "context" "predicate" "subject" to "xxx_original"
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119949#comment-16119949 ]

ASF GitHub Bot commented on RYA-316:
------------------------------------

Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132191684

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -53,8 +54,11 @@
         public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue();
         public static final String CONTEXT = "context";
         public static final String PREDICATE = "predicate";
    -    public static final String OBJECT = "object";
    +    public static final String PREDICATE_HASH = "predicate_hash";
    +    public static final String OBJECT = "object_original";

--- End diff --

Can you change this to just "object" or change "context" "predicate" "subject" to "xxx_original"

> Long LineStrings break MongoDB ingest
> -------------------------------------
>
>                 Key: RYA-316
>                 URL: https://issues.apache.org/jira/browse/RYA-316
>             Project: Rya
>          Issue Type: Bug
>          Components: dao
>            Reporter: Aaron Mihalik
>            Assignee: Andrew Smith
>
> MongoDB will reject statements that contain very long linestrings.
> Basically, the mongodb index key is limited to 1024 chars, so the insert will fail if the literal is longer.
> [Here is some example code|https://github.com/amihalik/rya-mongo-debugging/blob/master/src/main/java/com/github/amihalik/rya/mongo/debugging/linestring/LoadLineString.java].
> I think the inserts will work if you use 10 points, but fail if you use linestrings with 100 points.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119948#comment-16119948 ]

ASF GitHub Bot commented on RYA-316:
------------------------------------

Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132192953

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) {
             final RyaURI context = stmt.getContext();
             final BasicDBObject query = new BasicDBObject();
             if (subject != null){
    -            query.append(SUBJECT, subject.getData());
    +            query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData()));

--- End diff --

Can we store/query in binary (32 bytes) vs hex string (64 bytes)?

> Long LineStrings break MongoDB ingest
> -------------------------------------
>
>                 Key: RYA-316
>                 URL: https://issues.apache.org/jira/browse/RYA-316
>             Project: Rya
>          Issue Type: Bug
>          Components: dao
>            Reporter: Aaron Mihalik
>            Assignee: Andrew Smith
>
> MongoDB will reject statements that contain very long linestrings.
> Basically, the mongodb index key is limited to 1024 chars, so the insert will fail if the literal is longer.
> [Here is some example code|https://github.com/amihalik/rya-mongo-debugging/blob/master/src/main/java/com/github/amihalik/rya/mongo/debugging/linestring/LoadLineString.java].
> I think the inserts will work if you use 10 points, but fail if you use linestrings with 100 points.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132192953

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) {
             final RyaURI context = stmt.getContext();
             final BasicDBObject query = new BasicDBObject();
             if (subject != null){
    -            query.append(SUBJECT, subject.getData());
    +            query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData()));

--- End diff --

Can we store/query in binary (32 bytes) vs hex string (64 bytes)?
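The byte-count arithmetic behind the review question: a SHA-256 digest is 32 raw bytes, and hex encoding maps each byte to two characters, so `DigestUtils.sha256Hex` yields a 64-character string while storing the raw digest halves the key size. The sketch below illustrates the difference using the JDK's `MessageDigest` instead of commons-codec, so it is dependency-free; the class and method names are illustrative, not Rya code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashSizes {
    /** Raw 32-byte SHA-256 digest, as a binary Mongo field would store it. */
    static byte[] sha256(final String value) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (final NoSuchAlgorithmException e) {
            // Every JRE is required to ship SHA-256, so this cannot happen.
            throw new IllegalStateException(e);
        }
    }

    /** Hex encoding of the digest, mirroring DigestUtils.sha256Hex: 64 chars. */
    static String sha256Hex(final String value) {
        final StringBuilder sb = new StringBuilder();
        for (final byte b : sha256(value)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Either form keeps the indexed key well under MongoDB's 1024-byte index-key limit regardless of how long the original linestring literal is; the binary form simply does it in half the space.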
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue:

    https://github.com/apache/incubator-rya/pull/202

    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/374/

    Build result: FAILURE
    [...truncated 1.21 MB...]
    [INFO] Apache Rya Spark Support ... SKIPPED
    [INFO] Apache Rya Web Projects ... SKIPPED
    [INFO] Apache Rya Web Implementation ... SKIPPED
    [INFO]
    [INFO] BUILD FAILURE
    [INFO]
    [INFO] Total time: 18:57 min
    [INFO] Finished at: 2017-08-09T13:20:24+00:00
    [INFO] Final Memory: 226M/2679M
    [INFO]
    [ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.11:check (check-licenses) on project rya.pcj.fluo.test.base: Too many files with unapproved license: 4 See RAT report in: /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.test.base/target/rat.txt -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please read the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
    [ERROR]
    [ERROR] After correcting the problems, you can resume the build with the command
    [ERROR]   mvn -rf :rya.pcj.fluo.test.base

    channel stopped
    Setting status of 56c2eec09ebbed482d4dee192dc9e150bca7186d to FAILURE with url https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/374/ and message: 'FAILURE'
    Using context: Jenkins: clean package -Pgeoindexing
[GitHub] incubator-rya pull request #202: [WIP] Rya 331
GitHub user jdasch opened a pull request:

    https://github.com/apache/incubator-rya/pull/202

    [WIP] Rya 331

    ## Description
    > What Changed?
    Don't bother reviewing this right now. This PR is just a mechanism for exercising the build servers.

    ### Tests
    > Coverage?
    [Description of what tests were written]

    ### Links
    [Jira](https://issues.apache.org/jira/browse/RYA-NUMBER)

    ### Checklist
    - [ ] Code Review
    - [ ] Squash Commits

    People To Review
    [Add those who should review this]

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jdasch/incubator-rya RYA-331

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-rya/pull/202.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #202

commit 5ca17c7e8a832b040a33ee2ce8aa0aabaf5d083e  Author: jdasch  Date: 2017-08-03T14:34:52Z  Improved IntegrationTest stability.
commit f4f4f02db73362bc9eb1997b9a391f5c0c1c5854  Author: jdasch  Date: 2017-08-04T13:44:13Z  stash
commit ddeea3c56d6329a2215c4cc25fba58ddeb7ce4a0  Author: jdasch  Date: 2017-08-04T19:04:49Z  stash
commit ef647ebf56c60d942bf30c3f9eeab43124f47dbb  Author: jdasch  Date: 2017-08-07T12:50:17Z  stash
commit 3072ce496184a6c6b8c141f24b85e39379ef4973  Author: jdasch  Date: 2017-08-07T12:51:12Z  stash
commit 1ac702dd3e5c01b766718d16c2efd726f43e4a14  Author: jdasch  Date: 2017-08-07T15:55:00Z  stash
commit 7d828bd253fe7bf912cfde3b9fadaf74ce920794  Author: jdasch  Date: 2017-08-07T17:34:28Z  added close
commit 10fce8cec5b3a2379b49bf93a2520f335705c183  Author: jdasch  Date: 2017-08-07T17:47:01Z  stash
commit 24eb7cc17dfdec825fa732a4f993da3a0e561c74  Author: jdasch  Date: 2017-08-07T20:28:02Z  stash
commit 26a12828d18d21a682ef0f121a3fb6a6420e6b7c  Author: jdasch  Date: 2017-08-08T15:18:55Z  stash
commit 2b6f9412b3d72a22fee296293011992bbc8200a5  Author: jdasch  Date: 2017-08-08T15:38:16Z  prevented blocking
commit fefc906c1dbfdd90969add65fc5c31b266cd4b9e  Author: jdasch  Date: 2017-08-08T19:41:30Z  stash
commit d69b7db18ed95e7f57a5a182cfad915cb2641cda  Author: jdasch  Date: 2017-08-08T21:00:17Z  stash - needs to be cleaned up.
commit d556f060e9e68f072a5feff394a797ca8e419c54  Author: jdasch  Date: 2017-08-09T03:30:15Z  stash
commit 56c2eec09ebbed482d4dee192dc9e150bca7186d  Author: jdasch  Date: 2017-08-09T12:20:26Z  ignored failing tests
[jira] [Created] (RYA-337) Batch Queries to MongoDB
Aaron Mihalik created RYA-337:
---------------------------------

             Summary: Batch Queries to MongoDB
                 Key: RYA-337
                 URL: https://issues.apache.org/jira/browse/RYA-337
             Project: Rya
          Issue Type: Improvement
          Components: dao
            Reporter: Aaron Mihalik
            Assignee: Aaron Mihalik

Currently the MongoDB DAO sends one query at a time to Mongo. Instead, the DAO should send a batch of queries and perform a client side hash join (like the Accumulo DAO)
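A client-side hash join of the kind the issue describes works in two passes: build a hash table over the batched query results, then probe it with the local binding sets. The sketch below is illustrative only, using plain `String[]` rows as stand-ins for Rya's binding-set and Mongo-result types; it is not the Accumulo DAO's implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ClientHashJoin {
    /**
     * Joins batched query results against local bindings on a shared key.
     * Each input row is a two-element {key, value} array; each output row is
     * {key, localValue, remoteValue}.
     */
    static List<String[]> hashJoin(final List<String[]> results, final List<String[]> bindings) {
        // Build side: one pass over the batched results, grouping values by key.
        final Map<String, List<String>> built = new HashMap<>();
        for (final String[] row : results) {
            built.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        // Probe side: one pass over the local binding sets.
        final List<String[]> joined = new ArrayList<>();
        for (final String[] binding : bindings) {
            for (final String value : built.getOrDefault(binding[0], Collections.emptyList())) {
                joined.add(new String[] { binding[0], binding[1], value });
            }
        }
        return joined;
    }
}
```

The win over one-query-at-a-time is round trips: a single batched query feeds the build side, and the join itself runs in memory on the client, which is the trade the issue attributes to the Accumulo DAO.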