[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/381/

Build result: ABORTED
[...truncated 2.26 MB...]
[INFO]
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ rya.pcj.fluo.demo ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.demo/src/main/resources
[INFO] Copying 3 resources
Build was aborted
[WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy: java.util.concurrent.ExecutionException: Invalid object ID 14 iota=59
[INFO]
[INFO] --- maven-compiler-plugin:3.2:compile (default-compile) @ rya.pcj.fluo.demo ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 3 source files to /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.demo/target/classes
channel stopped
[WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy: hudson.remoting.Channel$OrderlyShutdown
[INFO]
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ rya.pcj.fluo.demo ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.demo/src/test/resources
[INFO] Copying 3 resources
[WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy: java.io.IOException: Backing channel 'channel' is disconnected.
[INFO]
[INFO] --- maven-compiler-plugin:3.2:testCompile (default-testCompile) @ rya.pcj.fluo.demo ---
Setting status of 256e5f53434801c7b6325c80020ac8080b6636e0 to FAILURE with url https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/381/ and message: 'FAILURE'
Using context: Jenkins: clean package -Pgeoindexing

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/380/

Build result: FAILURE
[...truncated 4.03 MB...]
[INFO] Apache Rya Spark Support ... SKIPPED
[INFO] Apache Rya Web Projects SKIPPED
[INFO] Apache Rya Web Implementation .. SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 01:04 h
[INFO] Finished at: 2017-08-10T01:35:39+00:00
[INFO] Final Memory: 494M/2861M
[INFO]
[ERROR] Failed to execute goal org.codehaus.mojo:jaxb2-maven-plugin:2.3.1:xjc (xjc) on project rya.benchmark: MojoExecutionException: NoSchemasException -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :rya.benchmark
channel stopped
Setting status of 1fc939493ccac56a1f2ea8d8f7ac3fe42aa190d2 to FAILURE with url https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/380/ and message: 'FAILURE'
Using context: Jenkins: clean package -Pgeoindexing
[jira] [Commented] (RYA-250) Smart URI avoid data duplication
[ https://issues.apache.org/jira/browse/RYA-250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120786#comment-16120786 ] ASF GitHub Bot commented on RYA-250:

Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/153 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/379/

> Smart URI avoid data duplication
> --------------------------------
>
>              Key: RYA-250
>              URL: https://issues.apache.org/jira/browse/RYA-250
>          Project: Rya
>       Issue Type: Task
>       Components: dao
> Affects Versions: 3.2.10
>         Reporter: Eric White
>         Assignee: Eric White
>          Fix For: 3.2.10
>
> Implement Smart URI methods for avoiding data duplication.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] incubator-rya issue #153: RYA-250 Smart URI avoiding data duplication
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/153 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/379/
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/378/

Failed Tests: 5
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
  org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.integration: 1
  org.apache.rya.indexing.pcj.fluo.integration.KafkaExportIT.newResultsExportedTest
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.prospector: 3
  org.apache.rya.prospector.mr.ProspectorTest.testCount
  org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testCount
  org.apache.rya.prospector.service.ProspectorServiceEvalStatsDAOTest.testNoAuthsCount
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/377/

Failed Tests: 3
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
  org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.integration: 1
  org.apache.rya.indexing.pcj.fluo.integration.KafkaExportIT.newResultsExportedTest
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.test.base: 1
  org.apache.rya.pcj.fluo.test.base.KafkaExportITBaseIT.embeddedKafkaTest
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120519#comment-16120519 ] ASF GitHub Bot commented on RYA-303:

Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132279789

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashSet;
+
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.utils.IteratorWrapper;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil;
+import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities;
+import org.apache.rya.indexing.pcj.storage.PcjException;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.MalformedQueryException;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.Projection;
+import org.openrdf.query.algebra.TupleExpr;
+import org.openrdf.query.parser.ParsedTupleQuery;
+import org.openrdf.query.parser.sparql.SPARQLParser;
+import org.openrdf.sail.SailException;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * Indexing Node for PCJs expressions to be inserted into execution plan to
+ * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     *
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkArgu
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120520#comment-16120520 ] ASF GitHub Bot commented on RYA-303:

Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132280543

--- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjDocuments.java ---
@@ -0,0 +1,418 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.pcj.storage.mongo;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+import static java.util.Objects.requireNonNull;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.rya.api.domain.RyaType;
+import org.apache.rya.api.resolver.RdfToRyaConversions;
+import org.apache.rya.api.resolver.RyaToRdfConversions;
+import org.apache.rya.indexing.pcj.storage.PcjMetadata;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
+import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet;
+import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder;
+import org.bson.Document;
+import org.bson.conversions.Bson;
+import org.openrdf.model.URI;
+import org.openrdf.model.Value;
+import org.openrdf.model.impl.URIImpl;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.MalformedQueryException;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.QueryLanguage;
+import org.openrdf.query.TupleQuery;
+import org.openrdf.query.TupleQueryResult;
+import org.openrdf.query.impl.MapBindingSet;
+import org.openrdf.repository.RepositoryConnection;
+import org.openrdf.repository.RepositoryException;
+
+import com.mongodb.MongoClient;
+import com.mongodb.client.FindIterable;
+import com.mongodb.client.MongoCollection;
+import com.mongodb.util.JSON;
+
+/**
+ * Creates and modifies PCJs in MongoDB. PCJs are stored as follows:
+ *
+ * - PCJ Metadata Doc -
+ * {
+ *   _id: [table_name]_METADATA,
+ *   sparql: [sparql query to match results],
+ *   cardinality: [number of results]
+ * }
+ *
+ * - PCJ Results Doc -
+ * {
+ *   pcjName: [table_name],
+ *   auths: [auths]
+ *   [binding_var1]: {
+ *     uri: [type_uri],
+ *     value: value
+ *   }
+ *   .
+ *   .
+ *   .
+ *   [binding_varn]: {

--- End diff --

n is just showing that it's the nth var. the dots are showing any number in between

> Mongo PCJ indexer support
> -------------------------
>
>         Key: RYA-303
>         URL: https://issues.apache.org/jira/browse/RYA-303
>     Project: Rya
>  Issue Type: Improvement
>    Reporter: Andrew Smith
>    Assignee: Andrew Smith
>
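The document layout quoted in that javadoc can be made concrete with a small sketch. This is not Rya's MongoPcjDocuments implementation: it builds plain java.util.Map objects shaped like the metadata and results documents described above (in real code these would be org.bson.Document instances), and both helper names here are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PcjDocSketch {
    // Hypothetical helper: the metadata document, keyed by "[table_name]_METADATA".
    static Map<String, Object> metadataDoc(final String tableName, final String sparql, final long cardinality) {
        final Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", tableName + "_METADATA");
        doc.put("sparql", sparql);           // the SPARQL query whose results the PCJ caches
        doc.put("cardinality", cardinality); // number of cached results
        return doc;
    }

    // Hypothetical helper: one results document, with a {uri, value} sub-document
    // per binding variable, as in the javadoc's [binding_var1]..[binding_varn].
    static Map<String, Object> resultDoc(final String tableName, final String auths,
            final Map<String, String[]> bindings) {
        final Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("pcjName", tableName);
        doc.put("auths", auths);
        for (final Map.Entry<String, String[]> e : bindings.entrySet()) {
            final Map<String, Object> binding = new LinkedHashMap<>();
            binding.put("uri", e.getValue()[0]);   // type URI of the bound value
            binding.put("value", e.getValue()[1]); // the bound value itself
            doc.put(e.getKey(), binding);
        }
        return doc;
    }

    public static void main(final String[] args) {
        System.out.println(metadataDoc("myPcj", "SELECT ?x WHERE { ?x <urn:p> ?y }", 2).get("_id")); // myPcj_METADATA
    }
}
```

Giving the metadata document a distinct `_id` suffix lets the two document shapes share one collection while staying distinguishable by key.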
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132279789

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashSet;
+
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.utils.IteratorWrapper;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil;
+import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities;
+import org.apache.rya.indexing.pcj.storage.PcjException;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.MalformedQueryException;
+import org.openrdf.query.QueryEvaluationException;
+import org.openrdf.query.algebra.Projection;
+import org.openrdf.query.algebra.TupleExpr;
+import org.openrdf.query.parser.ParsedTupleQuery;
+import org.openrdf.query.parser.sparql.SPARQLParser;
+import org.openrdf.sail.SailException;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+import info.aduna.iteration.CloseableIteration;
+
+/**
+ * Indexing Node for PCJs expressions to be inserted into execution plan to
+ * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     *
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkArgument(PCJOptimizerUtilities.isPCJValid(te), "TupleExpr is an invalid PCJ.");
+
+        final Optional<Projection> projection = new ParsedQueryUtil().findProjection(pq);
+        if (!projection.isPresent()) {
+            throw new Malformed
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user jdasch commented on the issue: https://github.com/apache/incubator-rya/pull/202 asfbot build
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/376/

Failed Tests: 2
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
  org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.pcj.fluo.integration: 1
  org.apache.rya.indexing.pcj.fluo.integration.KafkaExportIT.newResultsExportedTest
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132251517

--- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryBuilderVisitorBase.java ---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.rya.indexing.pcj.fluo.app.query;
+
+import org.apache.rya.indexing.pcj.fluo.app.NodeType;
+
+import com.google.common.base.Optional;
+import com.google.common.base.Preconditions;
+
+/**
+ * Base visitor class for navigating a {@link FluoQuery.Builder}.
+ * The visit methods in this class provide the basic functionality
+ * for navigating between the Builders that make up the FluoQuery.Builder.
+ *
+ */
+public abstract class QueryBuilderVisitorBase {

--- End diff --

you can ignore this, I'm talking to you in person for clarification
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132250708

--- Diff: extras/rya.pcj.fluo/pcj.fluo.integration/src/test/java/org/apache/rya/indexing/pcj/fluo/integration/KafkaExportIT.java ---
@@ -425,6 +432,160 @@ public void groupByManyBindings_avaerages() throws Exception {
         assertEquals(expectedResults, results);
     }

+    @Test
+    public void nestedGroupByManyBindings_averages() throws Exception {
+        // A query that groups what is aggregated by two of the keys.
+        final String sparql =
+                "SELECT ?type ?location ?averagePrice {" +
+                    "FILTER(?averagePrice > 4) " +
+                    "{SELECT ?type ?location (avg(?price) as ?averagePrice) {" +
+                        "?id <urn:type> ?type . " +
+                        "?id <urn:location> ?location ." +
+                        "?id <urn:price> ?price ." +
+                    "} " +
+                    "GROUP BY ?type ?location }}";
+
+        // Create the Statements that will be loaded into Rya.
+        final ValueFactory vf = new ValueFactoryImpl();
+        final Collection<Statement> statements = Sets.newHashSet(
+                // American items that will be averaged.
+                vf.createStatement(vf.createURI("urn:1"), vf.createURI("urn:type"), vf.createLiteral("apple")),
+                vf.createStatement(vf.createURI("urn:1"), vf.createURI("urn:location"), vf.createLiteral("USA")),
+                vf.createStatement(vf.createURI("urn:1"), vf.createURI("urn:price"), vf.createLiteral(2.50)),
+
+                vf.createStatement(vf.createURI("urn:2"), vf.createURI("urn:type"), vf.createLiteral("cheese")),
+                vf.createStatement(vf.createURI("urn:2"), vf.createURI("urn:location"), vf.createLiteral("USA")),
+                vf.createStatement(vf.createURI("urn:2"), vf.createURI("urn:price"), vf.createLiteral(4.25)),
+
+                vf.createStatement(vf.createURI("urn:3"), vf.createURI("urn:type"), vf.createLiteral("cheese")),
+                vf.createStatement(vf.createURI("urn:3"), vf.createURI("urn:location"), vf.createLiteral("USA")),
+                vf.createStatement(vf.createURI("urn:3"), vf.createURI("urn:price"), vf.createLiteral(5.25)),
+
+                // French items that will be averaged.
+                vf.createStatement(vf.createURI("urn:4"), vf.createURI("urn:type"), vf.createLiteral("cheese")),
+                vf.createStatement(vf.createURI("urn:4"), vf.createURI("urn:location"), vf.createLiteral("France")),
+                vf.createStatement(vf.createURI("urn:4"), vf.createURI("urn:price"), vf.createLiteral(8.5)),
+
+                vf.createStatement(vf.createURI("urn:5"), vf.createURI("urn:type"), vf.createLiteral("cigarettes")),
+                vf.createStatement(vf.createURI("urn:5"), vf.createURI("urn:location"), vf.createLiteral("France")),
+                vf.createStatement(vf.createURI("urn:5"), vf.createURI("urn:price"), vf.createLiteral(3.99)),
+
+                vf.createStatement(vf.createURI("urn:6"), vf.createURI("urn:type"), vf.createLiteral("cigarettes")),
+                vf.createStatement(vf.createURI("urn:6"), vf.createURI("urn:location"), vf.createLiteral("France")),
+                vf.createStatement(vf.createURI("urn:6"), vf.createURI("urn:price"), vf.createLiteral(4.99)));
+
+        // Create the PCJ in Fluo and load the statements into Rya.
+        final String pcjId = loadData(sparql, statements);
+
+        // Create the expected results of the SPARQL query once the PCJ has been computed.
+        final Set<VisibilityBindingSet> expectedResults = new HashSet<>();
+
+        MapBindingSet bs = new MapBindingSet();
+        bs.addBinding("type", vf.createLiteral("cheese", XMLSchema.STRING));
+        bs.addBinding("location", vf.createLiteral("France", XMLSchema.STRING));
+        bs.addBinding("averagePrice", vf.createLiteral("8.5", XMLSchema.DECIMAL));
+        expectedResults.add(new VisibilityBindingSet(bs));
+
+        bs = new MapBindingSet();
+        bs.addBinding("type", vf.createLiteral("cigarettes", XMLSchema.STRING));
+        bs.addBinding("location", vf.createLiteral("France", XMLSchema.STRING));
+        bs.addBinding("averagePrice", vf.createLiteral("4.49", XMLSchema.DECIMAL));
+        expectedResults.add(new VisibilityBindingSet(bs));
+
+        bs = new MapBindingSet();
+        bs.addBinding("type", vf.createLiteral("cheese", XMLSchema.STRING));
+        bs.addBinding("location", vf.createLiteral("USA", XMLSchema.STRING));
+        bs.addBinding("averagePrice", vf.createLiteral("4.75", XMLSchema.DECIMAL));
+        expectedResults.add(new VisibilityBindingSet(bs));
+
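The expected bindings in that test follow from simple grouped averaging: France/cheese stays at 8.5, France/cigarettes averages (3.99 + 4.99) / 2 = 4.49, USA/cheese averages (4.25 + 5.25) / 2 = 4.75, and USA/apple at 2.50 is dropped by FILTER(?averagePrice > 4). A dependency-free sketch (not the Rya/Fluo/Kafka machinery) that reproduces the arithmetic:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByAverages {
    static final class Item {
        final String type; final String location; final double price;
        Item(final String type, final String location, final double price) {
            this.type = type; this.location = location; this.price = price;
        }
    }

    // Average price per (type, location) group, then drop groups at or below `min`,
    // the same shape as the nested SELECT ... GROUP BY plus the outer FILTER.
    static Map<String, Double> filteredAverages(final List<Item> items, final double min) {
        final Map<String, Double> averages = items.stream().collect(Collectors.groupingBy(
                i -> i.type + "/" + i.location,
                Collectors.averagingDouble(i -> i.price)));
        final Map<String, Double> filtered = new HashMap<>();
        averages.forEach((group, avg) -> { if (avg > min) filtered.put(group, avg); });
        return filtered;
    }

    public static void main(final String[] args) {
        final List<Item> items = List.of(
                new Item("apple", "USA", 2.50),
                new Item("cheese", "USA", 4.25), new Item("cheese", "USA", 5.25),
                new Item("cheese", "France", 8.5),
                new Item("cigarettes", "France", 3.99), new Item("cigarettes", "France", 4.99));
        // cheese/France ~ 8.5, cigarettes/France ~ 4.49, cheese/USA ~ 4.75; apple/USA is filtered out.
        System.out.println(filteredAverages(items, 4));
    }
}
```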
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132244653 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryBuilderVisitorBase.java --- @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.indexing.pcj.fluo.app.query; + +import org.apache.rya.indexing.pcj.fluo.app.NodeType; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +/** + * Base visitor class for navigating a {@link FluoQuery.Builder}. + * The visit methods in this class provide the basic functionality + * for navigating between the Builders that make up the FluoQuery.Builder. + * + */ +public abstract class QueryBuilderVisitorBase { --- End diff -- should this extend some openRDF visitor? or are you just mimicking the visitor pattern? Why are you visiting on the builders rather than what they build? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
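For readers following the thread, the pattern in question (visiting a tree of builders with manually dispatched visit methods, rather than extending an openRDF visitor) can be sketched roughly as follows. All class and method names here are illustrative stand-ins, not the actual Rya API:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a tree of metadata builders.
abstract class NodeBuilder {
    private final List<NodeBuilder> children = new ArrayList<>();
    List<NodeBuilder> children() { return children; }
    NodeBuilder addChild(NodeBuilder child) { children.add(child); return this; }
}

class QueryBuilder extends NodeBuilder { }
class JoinBuilder extends NodeBuilder { }

// Base visitor: dispatch on the concrete builder type, then recurse into
// children. This is manual double dispatch; the builders themselves carry
// no accept() method.
abstract class BuilderVisitorBase {
    private final NodeBuilder root;
    BuilderVisitorBase(NodeBuilder root) { this.root = root; }

    void visit() { visit(root); }

    void visit(NodeBuilder b) {
        if (b instanceof QueryBuilder) visit((QueryBuilder) b);
        else if (b instanceof JoinBuilder) visit((JoinBuilder) b);
        for (NodeBuilder child : b.children()) visit(child);
    }

    // Subclasses override only the typed overloads they care about.
    void visit(QueryBuilder b) { }
    void visit(JoinBuilder b) { }
}
```

A subclass overrides only the typed visit overloads it needs; the base class owns the traversal order.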
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132234446 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/CommonNodeMetadata.java --- @@ -99,4 +99,18 @@ public String toString() { .append("}") .toString(); } + +/** + * Base interface for all metadata Builders. Using this type def + * allows for the implementation of a Builder visitor for navigating + * the Builder tree. + * + */ +public static interface Builder { --- End diff -- there's no build function?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132231558 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/JoinResultUpdater.java --- @@ -160,8 +183,55 @@ public void updateJoinResults( public static enum Side { LEFT, RIGHT; } + + +/** + * Fetches batch to be processed by scanning over the Span specified by the + * {@link JoinBatchInformation}. The number of results is less than or equal + * to the batch size specified by the JoinBatchInformation. + * + * @param tx - Fluo transaction in which batch operation is performed + * @param siblingSpan - span of sibling to retrieve elements to join with + * @param bsSet - set that batch results are added to + * @return Set - containing results of sibling scan. + * @throws Exception + */ +private Optional fillSiblingBatch(TransactionBase tx, Span siblingSpan, Column siblingColumn, Set bsSet, int batchSize) throws Exception { + +RowScanner rs = tx.scanner().over(siblingSpan).fetch(siblingColumn).byRow().build(); +Iterator&lt;ColumnScanner&gt; colScannerIter = rs.iterator(); + +boolean batchLimitMet = false; +Bytes row = siblingSpan.getStart().getRow(); +while (colScannerIter.hasNext() && !batchLimitMet) { +ColumnScanner colScanner = colScannerIter.next(); +row = colScanner.getRow(); +Iterator iter = colScanner.iterator(); +while (iter.hasNext()) { --- End diff -- should this also check batchLimitMet? the flag can't be set to true on the first pass, so you can just do the size check after adding the first bindingSet, then you don't need a break.
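The restructuring suggested in that comment can be sketched with plain collections standing in for Fluo's RowScanner/ColumnScanner: checking the size limit in both the outer (row) loop and the inner (column) loop removes the need for a flag and a break. This is an illustrative simplification, not the actual JoinResultUpdater code:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class BatchScan {
    // Collect at most batchSize elements from a nested scan. The size check
    // appears in both loop conditions, so the scan stops as soon as the
    // batch is full, with no batchLimitMet flag and no break statement.
    static <T> List<T> fillBatch(Iterable<? extends Iterable<T>> rows, int batchSize) {
        List<T> batch = new ArrayList<>();
        Iterator<? extends Iterable<T>> rowIter = rows.iterator();
        while (rowIter.hasNext() && batch.size() < batchSize) {
            Iterator<T> colIter = rowIter.next().iterator();
            while (colIter.hasNext() && batch.size() < batchSize) {
                batch.add(colIter.next());
            }
        }
        return batch;
    }
}
```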
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132250247 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/test/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryBuilderVisitorTest.java --- @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.fluo.app.query; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.apache.rya.indexing.pcj.fluo.app.NodeType; +import org.junit.Assert; +import org.junit.Test; + +public class QueryBuilderVisitorTest { + +@Test +public void builderTest() { + +FluoQuery.Builder fluoBuilder = FluoQuery.builder(); + +String queryId = NodeType.generateNewFluoIdForType(NodeType.QUERY); +String projectionId = NodeType.generateNewFluoIdForType(NodeType.PROJECTION); +String joinId = NodeType.generateNewFluoIdForType(NodeType.JOIN); +String leftSp = NodeType.generateNewFluoIdForType(NodeType.STATEMENT_PATTERN); +String rightSp = NodeType.generateNewFluoIdForType(NodeType.STATEMENT_PATTERN); + +List&lt;String&gt; expected = Arrays.asList(queryId, projectionId, joinId, leftSp, rightSp); + +QueryMetadata.Builder queryBuilder = QueryMetadata.builder(queryId); +queryBuilder.setChildNodeId(projectionId); + +ProjectionMetadata.Builder projectionBuilder = ProjectionMetadata.builder(projectionId); +projectionBuilder.setChildNodeId(joinId); + +JoinMetadata.Builder joinBuilder = JoinMetadata.builder(joinId); +joinBuilder.setLeftChildNodeId(leftSp); +joinBuilder.setRightChildNodeId(rightSp); + +StatementPatternMetadata.Builder left = StatementPatternMetadata.builder(leftSp); +StatementPatternMetadata.Builder right = StatementPatternMetadata.builder(rightSp); + +fluoBuilder.setQueryMetadata(queryBuilder); +fluoBuilder.addProjectionBuilder(projectionBuilder); +fluoBuilder.addJoinMetadata(joinBuilder); +fluoBuilder.addStatementPatternBuilder(left); +fluoBuilder.addStatementPatternBuilder(right); + +QueryBuilderPrinter printer = new QueryBuilderPrinter(fluoBuilder); +printer.visit(); +Assert.assertEquals(expected, printer.getIds()); +} + + +public static class QueryBuilderPrinter extends QueryBuilderVisitorBase { + +private List&lt;String&gt; ids = new ArrayList<>(); + +public List&lt;String&gt; getIds() { +return ids; +} + +public
QueryBuilderPrinter(FluoQuery.Builder builder) { +super(builder); +} + +public void visit(QueryMetadata.Builder queryBuilder) { +System.out.println(queryBuilder.getNodeId()); --- End diff -- why do you need to print during a test?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132245408 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/query/QueryMetadataVisitorBase.java --- @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.rya.indexing.pcj.fluo.app.query; + +import org.apache.rya.indexing.pcj.fluo.app.NodeType; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +public abstract class QueryMetadataVisitorBase { --- End diff -- confused: why are you visiting on the builders as well as the metadata built?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132234031 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/observers/QueryResultObserver.java --- @@ -107,7 +107,7 @@ public void process(final TransactionBase tx, final Bytes brow, final Column col // Read the Child Binding Set that will be exported. final Bytes valueBytes = tx.get(brow, col); final VisibilityBindingSet result = BS_SERDE.deserialize(valueBytes); - + --- End diff -- this class only has this whitespace change, can it be removed from the commits?
[GitHub] incubator-rya pull request #198: Rya 283
Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/198#discussion_r132233795 --- Diff: extras/rya.pcj.fluo/pcj.fluo.app/src/main/java/org/apache/rya/indexing/pcj/fluo/app/batch/JoinBatchInformation.java --- @@ -149,12 +137,12 @@ public boolean equals(Object other) { JoinBatchInformation batch = (JoinBatchInformation) other; return super.equals(other) && Objects.equals(this.bs, batch.bs) && Objects.equals(this.join, batch.join) --- End diff -- you should be able to just Objects.equals().equals().equals();
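For context, the null-safe equality pattern under review looks roughly like the following. The field names are taken from the diff, but the class body is a simplified stand-in (the real method also chains a super.equals call):

```java
import java.util.Objects;

class JoinBatchInformation {
    // Simplified stand-ins for the real field types in the PR.
    private final String bs;
    private final String join;

    JoinBatchInformation(String bs, String join) {
        this.bs = bs;
        this.join = join;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof JoinBatchInformation)) return false;
        JoinBatchInformation batch = (JoinBatchInformation) other;
        // Objects.equals is null-safe, so each field comparison stays one clause.
        return Objects.equals(this.bs, batch.bs)
            && Objects.equals(this.join, batch.join);
    }

    @Override
    public int hashCode() {
        return Objects.hash(bs, join);
    }
}
```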
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user jdasch commented on the issue: https://github.com/apache/incubator-rya/pull/202 asfbot build
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120301#comment-16120301 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132249498 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- I'm pretty sure Mongo further condenses the data, so I'm not sure hashing is necessary in order for it to store in memory. You're adding a lot of overhead to query. I'm ok with adding it now if you think it's necessary. > Long LineStrings break MongoDB ingest > - > > Key: RYA-316 > URL: https://issues.apache.org/jira/browse/RYA-316 > Project: Rya > Issue Type: Bug > Components: dao >Reporter: Aaron Mihalik >Assignee: Andrew Smith > > MongoDB will reject statements that contain very long linestrings. > Basically, the mongodb index key is limited to 1024 chars, so the insert will > fail if the literal is longer. > [Here is some example > code|https://github.com/amihalik/rya-mongo-debugging/blob/master/src/main/java/com/github/amihalik/rya/mongo/debugging/linestring/LoadLineString.java]. > I think the inserts will work if you use 10 points, but fail if you use > linestrings with 100 points. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132249498 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- I'm pretty sure Mongo further condenses the data, so I'm not sure hashing is necessary in order for it to store in memory. You're adding a lot of overhead to query. I'm ok with adding it now if you think it's necessary.
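The approach being debated here (store a fixed-length hash of each statement component for the index and keep the original value in a separate field, since the linked JIRA notes MongoDB limits index keys to about 1024 characters) can be sketched as below. The digest algorithm and helper name are assumptions for illustration, not what the PR actually uses:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ValueHasher {
    // Hash an arbitrarily long literal (e.g. a LineString WKT) down to a
    // fixed-length hex string that always fits under the index-key limit.
    // The original value would be stored unhashed in a sibling field such
    // as "object_original" for retrieval.
    static String hash(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();  // 64 hex chars regardless of input length
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off raised in the thread applies directly: the index stays small and bounded, but every exact-match query must hash its argument first, and range queries over the original value are no longer served by the hashed index.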
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120257#comment-16120257 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132245115 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 thanks. @isper3at clearly a bug. please add OBJECT_HASH, OBJECT_TYPE_HASH to the first index.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132245115 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 thanks. @isper3at clearly a bug. please add OBJECT_HASH, OBJECT_TYPE_HASH to the first index.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120245#comment-16120245 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132243060 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- When the Mongo db backend was first implemented, you could only do indices over two fields-- the first is the primary index, the second the secondary index. That may have changed since. The indices we originally had were subject, predicate, object, and then subject/predicate, predicate/object, and object/subject. Not including object type might be a bug, but I had thought that was addressed at some point. Also one could argue that the single field indices were redundant-- I had wanted to test to see but never got around to it. If you can now index over more than two fields, then we might want to revisit this.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132243060 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- When the Mongo db backend was first implemented, you could only do indices over two fields-- the first is the primary index, the second the secondary index. That may have changed since. The indices we originally had were subject, predicate, object, and then subject/predicate, predicate/object, and object/subject. Not including object type might be a bug, but I had thought that was addressed at some point. Also one could argue that the single field indices were redundant-- I had wanted to test to see but never got around to it. If you can now index over more than two fields, then we might want to revisit this.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120235#comment-16120235 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132242206 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- @pujav65 I'm concerned about index size. please hash everything. If you want another ticket for "please hash everything" I'm fine with that, but let's knock that out while @isper3at is cleaning this stuff up. Key thing with mongo is to get the index to fit in memory, so let's do that.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132242206 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- @pujav65 I'm concerned about index size. please hash everything. If you want another ticket for "please hash everything" I'm fine with that, but let's knock that out while @isper3at is cleaning this stuff up. Key thing with mongo is to get the index to fit in memory, so let's do that.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120195#comment-16120195 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235261 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); -doc = new BasicDBObject(PREDICATE, 1); -doc.put(OBJECT, 1); +doc = new BasicDBObject(PREDICATE_HASH, 1); +doc.put(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); coll.createIndex(doc); -doc = new BasicDBObject(OBJECT, 1); +doc = new BasicDBObject(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); doc.put(SUBJECT, 1); --- End diff -- SUBJECT_HASH
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120194#comment-16120194 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235567 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 Looking over this index creation code... this seems like a bug... where's the SPO index? I think this first index should be SUBJECT_HASH, PREDICATE_HASH, OBJECT_HASH, OBJECT_TYPE_HASH
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235261 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); -doc = new BasicDBObject(PREDICATE, 1); -doc.put(OBJECT, 1); +doc = new BasicDBObject(PREDICATE_HASH, 1); +doc.put(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); coll.createIndex(doc); -doc = new BasicDBObject(OBJECT, 1); +doc = new BasicDBObject(OBJECT_HASH, 1); doc.put(OBJECT_TYPE, 1); doc.put(SUBJECT, 1); --- End diff -- SUBJECT_HASH
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235567 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -64,14 +68,14 @@ @Override public void createIndices(final DBCollection coll){ BasicDBObject doc = new BasicDBObject(); -doc.put(SUBJECT, 1); -doc.put(PREDICATE, 1); +doc.put(SUBJECT_HASH, 1); +doc.put(PREDICATE_HASH, 1); coll.createIndex(doc); --- End diff -- @pujav65 Looking over this index creation code... this seems like a bug... where's the SPO index? I think this first index should be SUBJECT_HASH, PREDICATE_HASH, OBJECT_HASH, OBJECT_TYPE_HASH
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120185#comment-16120185 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132235041 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- yep, might as well hash context and object type as well.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120183#comment-16120183 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on the issue: https://github.com/apache/incubator-rya/pull/199 looks good to me. aaron's different hashing suggestion can be done later -- add it to jira to track if you don't want to do it now.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120178#comment-16120178 ] ASF GitHub Bot commented on RYA-316: Github user pujav65 commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132234315 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- hey i don't think we need to hash predicates and subjects - just objects. objects are possibly literals which means they can have unspecified length (and in practice are likely to be very long -- sometimes people literally put books into comments which are object values). theoretically predicates and subjects are URIs which means that to be valid they are limited in length. no harm in doing it, it just adds a layer of indirection at query time.
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120179#comment-16120179 ] ASF GitHub Bot commented on RYA-316: Github user amihalik commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132234427 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) { final RyaURI context = stmt.getContext(); final BasicDBObject query = new BasicDBObject(); if (subject != null){ -query.append(SUBJECT, subject.getData()); +query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData())); --- End diff -- I'll do some testing on this, but I'm guessing PRO: (1) smaller index size and (2) smaller messages over the wire. CON: Need to take care when println'ing the query.
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120158#comment-16120158 ] ASF GitHub Bot commented on RYA-293: Github user jessehatfield commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132230928 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any +// instances of c1, c2, ... or cn are also instances of c, meaning c is a superclass +// of all the rest. +// (In principle, an instance of c is likewise implied to be at least one of the other +// types, but this fact is ignored for now to avoid nondeterministic reasoning.) +iter = RyaDAOHelper.query(ryaDAO, null, OWL.UNIONOF, null, conf); +try { +while (iter.hasNext()) { +Statement st = iter.next(); +Value unionType = st.getSubject(); +// Traverse the list of types constituting the union +Value current = st.getObject(); +while (current instanceof Resource && !RDF.NIL.equals(current)) { +Resource listNode = (Resource) current; +CloseableIteration listIter = RyaDAOHelper.query(ryaDAO, +listNode, RDF.FIRST, null, conf); +try { +if (listIter.hasNext()) { +Statement firstStatement = listIter.next(); +if (firstStatement.getObject() instanceof Resource) { +Resource subclass = (Resource) firstStatement.getObject(); +Statement subclassStatement = vf.createStatement(subclass, RDFS.SUBCLASSOF, unionType); +addStatementEdge(graph, RDFS.SUBCLASSOF.stringValue(), subclassStatement); +} +} +} finally { +listIter.close(); +} +listIter = RyaDAOHelper.query(ryaDAO, listNode, RDF.REST, null, conf); +try { +if (listIter.hasNext()) { +current = listIter.next().getObject(); --- End diff -- Yep, a union is given as a linked list so we just walk down adding subclass statements until we get to 
a node with no rdf:rest or with rdf:rest equal to rdf:nil. If the list is poorly-formed or someone tries to express the union using the wrong collection type ([describes unionOf expression](https://www.w3.org/TR/owl2-rdf-based-semantics/#Semantic_Conditions_for_Boolean_Connectives) | [gives interpretation of sequence](https://www.w3.org/TR/owl2-rdf-based-semantics/#Semantic_Conditions)) then it won't work. In most cases that just means the intended union isn't fully represented, but I suppose if it were somehow a cyclical list, we'd end up in an infinite loop. > Implement owl:unionOf inference > --- > > Key: RYA-293 > URL: https://issues.apache.org/jira/browse/RYA-293 > Project: Rya > Issue Type: Sub-task > Components: sail > Reporter: Jesse Hatfield > Assignee: Jesse Hatfield > > An *{{owl:unionOf}}* expression defines one type to be equivalent to the union of another set of types. If the ontology states that {{:Parent}} is the union of {{:Mother}} and {{:Father}}, then the inference engine should rewrite statement patterns of the form {{?x rdf:type :Parent}} to check for resources that are stated to be any of the types {{:Mother}}, {{:Father}}, or {{:Parent}}.
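The list traversal described above (walking rdf:first/rdf:rest until rdf:nil), including a guard against the cyclical-list case mentioned at the end, might look like this in isolation. `UnionListWalk` is a toy sketch: a plain map of list nodes stands in for the `RyaDAOHelper` queries in the real code, and the node/class names are invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class UnionListWalk {

    /**
     * Walks an RDF collection represented as a map from list node to a
     * two-element array {rdf:first value, rdf:rest node}, collecting the
     * members until rdf:nil, a malformed node, or a revisited node.
     */
    static List<String> membersOf(String head, Map<String, String[]> lists) {
        List<String> members = new ArrayList<>();
        Set<String> visited = new HashSet<>(); // guards against cyclical lists
        String current = head;
        while (current != null && !"rdf:nil".equals(current) && visited.add(current)) {
            String[] node = lists.get(current);
            if (node == null) {
                break; // poorly-formed list: node lacks rdf:first/rdf:rest
            }
            members.add(node[0]); // rdf:first -> one member of the union
            current = node[1];    // rdf:rest  -> next list node
        }
        return members;
    }

    public static void main(String[] args) {
        // :Parent owl:unionOf (:Mother :Father) => each member becomes a
        // subclass of :Parent in the inference engine's subclass graph.
        Map<String, String[]> lists = new HashMap<>();
        lists.put("_:n1", new String[]{":Mother", "_:n2"});
        lists.put("_:n2", new String[]{":Father", "rdf:nil"});
        System.out.println(membersOf("_:n1", lists)); // [:Mother, :Father]
    }
}
```

The `visited` set makes the walk terminate on a cyclical list instead of looping forever; the code under review in the PR relies on the list being well-formed.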
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120155#comment-16120155 ] ASF GitHub Bot commented on RYA-316: Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132230502 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -53,8 +54,11 @@ public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue(); public static final String CONTEXT = "context"; public static final String PREDICATE = "predicate"; -public static final String OBJECT = "object"; +public static final String PREDICATE_HASH = "predicate_hash"; +public static final String OBJECT = "object_original"; --- End diff -- woops. I'll make it just object. did you want a hash for context as well?
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120156#comment-16120156 ] ASF GitHub Bot commented on RYA-316: Github user isper3at commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/199#discussion_r132230655 --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java --- @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) { final RyaURI context = stmt.getContext(); final BasicDBObject query = new BasicDBObject(); if (subject != null){ -query.append(SUBJECT, subject.getData()); +query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData())); --- End diff -- I can store as either. Not really sure if there are any pros-cons between the two
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user jdasch commented on the issue: https://github.com/apache/incubator-rya/pull/202 asfbot build
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120123#comment-16120123 ] ASF GitHub Bot commented on RYA-293: Github user jessehatfield commented on the issue: https://github.com/apache/incubator-rya/pull/180 So this is a case where a few different schema terms (rdfs:subClassOf, owl:unionOf, and eventually owl:equivalentClass) end up being represented by just one term internally (subclass/superclass relationships). We could plausibly organize by either; my intuition is to use the internal representation, since the internal graph being complete and accurate can matter to the other rules. That would mean pulling the logic introduced here, plus the logic introduced in [PR 184](https://github.com/apache/incubator-rya/pull/184), plus the logic for rdfs:subClassOf (which 184 incidentally simplifies to a method call anyway) into some "refreshSubClassGraph" method. Thoughts on that approach? Would probably do the same with subPropertyOfGraph for consistency (though it's simpler because there's no equivalent to union).
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue: https://github.com/apache/incubator-rya/pull/202 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/375/
Failed Tests: 1
incubator-rya-master-with-optionals-pull-requests/org.apache.rya:rya.benchmark: 1
org.apache.rya.benchmark.query.QueriesBenchmarkConfReaderIT.load
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120060#comment-16120060 ] ASF GitHub Bot commented on RYA-293: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132213500 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any +// instances of c1, c2, ... or cn are also instances of c, meaning c is a superclass +// of all the rest. +// (In principle, an instance of c is likewise implied to be at least one of the other +// types, but this fact is ignored for now to avoid nondeterministic reasoning.) +iter = RyaDAOHelper.query(ryaDAO, null, OWL.UNIONOF, null, conf); +try { +while (iter.hasNext()) { +Statement st = iter.next(); +Value unionType = st.getSubject(); +// Traverse the list of types constituting the union +Value current = st.getObject(); +while (current instanceof Resource && !RDF.NIL.equals(current)) { +Resource listNode = (Resource) current; +CloseableIteration listIter = RyaDAOHelper.query(ryaDAO, +listNode, RDF.FIRST, null, conf); +try { +if (listIter.hasNext()) { +Statement firstStatement = listIter.next(); +if (firstStatement.getObject() instanceof Resource) { +Resource subclass = (Resource) firstStatement.getObject(); +Statement subclassStatement = vf.createStatement(subclass, RDFS.SUBCLASSOF, unionType); +addStatementEdge(graph, RDFS.SUBCLASSOF.stringValue(), subclassStatement); +} +} +} finally { +listIter.close(); +} +listIter = RyaDAOHelper.query(ryaDAO, listNode, RDF.REST, null, conf); +try { +if (listIter.hasNext()) { +current = listIter.next().getObject(); --- End diff -- Trying to follow the general logic here: Each list has an RDF.FIRST and RDF.REST property, where FIRST is a resource and REST is of type list. So if the list has more than one element, current is set to the list obtained by the RDF.REST query and we go through the loop again. Is this how all of the SUBCLASSOF statements are created for the union?
[jira] [Commented] (RYA-293) Implement owl:unionOf inference
[ https://issues.apache.org/jira/browse/RYA-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120059#comment-16120059 ] ASF GitHub Bot commented on RYA-293: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132207923 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any --- End diff -- I thought that you were going to start breaking out any new logic that you added to the refreshGraph() method into methods that were specific to the given rule.
[GitHub] incubator-rya pull request #180: RYA-293 Added owl:unionOf inference
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/180#discussion_r132213500 --- Diff: sail/src/main/java/org/apache/rya/rdftriplestore/inference/InferenceEngine.java --- @@ -142,6 +143,53 @@ public void refreshGraph() throws InferenceEngineException { } } +// Add unions to the subclass graph: if c owl:unionOf LIST(c1, c2, ... cn), then any +// instances of c1, c2, ... or cn are also instances of c, meaning c is a superclass +// of all the rest. +// (In principle, an instance of c is likewise implied to be at least one of the other +// types, but this fact is ignored for now to avoid nondeterministic reasoning.) +iter = RyaDAOHelper.query(ryaDAO, null, OWL.UNIONOF, null, conf); +try { +while (iter.hasNext()) { +Statement st = iter.next(); +Value unionType = st.getSubject(); +// Traverse the list of types constituting the union +Value current = st.getObject(); +while (current instanceof Resource && !RDF.NIL.equals(current)) { +Resource listNode = (Resource) current; +CloseableIteration listIter = RyaDAOHelper.query(ryaDAO, +listNode, RDF.FIRST, null, conf); +try { +if (listIter.hasNext()) { +Statement firstStatement = listIter.next(); +if (firstStatement.getObject() instanceof Resource) { +Resource subclass = (Resource) firstStatement.getObject(); +Statement subclassStatement = vf.createStatement(subclass, RDFS.SUBCLASSOF, unionType); +addStatementEdge(graph, RDFS.SUBCLASSOF.stringValue(), subclassStatement); +} +} +} finally { +listIter.close(); +} +listIter = RyaDAOHelper.query(ryaDAO, listNode, RDF.REST, null, conf); +try { +if (listIter.hasNext()) { +current = listIter.next().getObject(); --- End diff -- Trying to follow the general logic here: Each list has an RDF.FIRST and RDF.RESET property, where FIRST is a resource and REST is of type list. So if the list has more than one element, current is set to the list obtained by the RDF.REST query and we go through the loop again. 
Is this how all of the SUBCLASSOF statements are created for the union?
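The traversal the reviewer is describing can be sketched in isolation. The following is a toy stand-in, not the Rya code: a `Map`-based triple store replaces the `RyaDAOHelper.query` calls, and all names (`RdfListTraversal`, `collectUnionMembers`, the `ex:` IRIs) are invented for illustration. It walks `rdf:rest` links from the head of an RDF collection until `rdf:nil`, collecting each `rdf:first` value, which is exactly where `InferenceEngine` would assert the `rdfs:subClassOf` edge.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the rdf:first/rdf:rest list walk in the diff above. */
public class RdfListTraversal {
    static final String FIRST = "rdf:first";
    static final String REST = "rdf:rest";
    static final String NIL = "rdf:nil";

    /** subject -> (predicate -> object); one object per predicate for simplicity. */
    public static Map<String, Map<String, String>> sampleTriples() {
        Map<String, Map<String, String>> t = new HashMap<>();
        // LIST(ex:c1, ex:c2) encoded as two list nodes.
        t.put("_:l1", Map.of(FIRST, "ex:c1", REST, "_:l2"));
        t.put("_:l2", Map.of(FIRST, "ex:c2", REST, NIL));
        return t;
    }

    /** Follows rdf:rest links from 'head' until rdf:nil, collecting each rdf:first. */
    public static List<String> collectUnionMembers(Map<String, Map<String, String>> triples, String head) {
        List<String> members = new ArrayList<>();
        String current = head;
        while (current != null && !NIL.equals(current)) {
            Map<String, String> props = triples.getOrDefault(current, Map.of());
            String first = props.get(FIRST);
            if (first != null) {
                // In InferenceEngine this is where 'first rdfs:subClassOf unionType' is added.
                members.add(first);
            }
            current = props.get(REST); // one element left? REST is rdf:nil and the loop ends
        }
        return members;
    }

    public static void main(String[] args) {
        System.out.println(collectUnionMembers(sampleTriples(), "_:l1")); // [ex:c1, ex:c2]
    }
}
```

So yes, under this reading every member of the union contributes one SUBCLASSOF statement, one per iteration of the outer while loop.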
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120003#comment-16120003 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132192625 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkAr
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120009#comment-16120009 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132195020 --- Diff: extras/rya.benchmark/src/main/gen/org/apache/rya/benchmark/query/Rya.java --- @@ -20,7 +20,7 @@ // This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.11 // See http://java.sun.com/xml/jaxb";>http://java.sun.com/xml/jaxb // Any modifications to this file will be lost upon recompilation of the source schema. -// Generated on: 2016.12.16 at 01:22:14 PM PST +// Generated on: 2017.07.06 at 03:13:11 PM EDT --- End diff -- This and the above source gen files should be fixed by Eric's latest PR. > Mongo PCJ indexer support > - > > Key: RYA-303 > URL: https://issues.apache.org/jira/browse/RYA-303 > Project: Rya > Issue Type: Improvement >Reporter: Andrew Smith >Assignee: Andrew Smith > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120005#comment-16120005 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132182885 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/BasePcjIndexer.java --- @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static com.google.common.base.Preconditions.checkState;
+import static java.util.Collections.singleton;
+import static java.util.Objects.requireNonNull;
+import static java.util.stream.Collectors.groupingBy;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicReference;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.domain.RyaStatement;
+import org.apache.rya.api.domain.RyaURI;
+import org.apache.rya.indexing.entity.model.Entity;
+import org.apache.rya.indexing.entity.storage.EntityStorage;
+import org.apache.rya.indexing.entity.storage.EntityStorage.EntityStorageException;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.mongodb.MongoDBRdfConfiguration;
+import org.apache.rya.mongodb.MongoSecondaryIndex;
+import org.openrdf.model.URI;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+
+/**
+ * A base class that may be used to update an {@link EntityStorage} as new
+ * {@link RyaStatement}s are added to/removed from the Rya instance.
+ */
+@DefaultAnnotation(NonNull.class)
+public abstract class BasePcjIndexer implements PcjIndexer, MongoSecondaryIndex {
--- End diff --
I'm not sure what this class is interacting with. The basic components of our PCJ framework are the matcher framework for query optimization, the storage layer, and the indexer layer. It seems like this is related to the indexer layer. But the indexer layer is meant to interact with the updater (whatever observer framework we use to maintain the PCJs). Given that there is currently no updater in place, what is the purpose of BasePcjIndexer, PcjIndexer, and MongoPrecomputedJoinIndexer?
I can understand including abstract classes and interfaces just to have them in place when an updater is incorporated, but some of these are concrete implementations. So what are they interacting with?
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120011#comment-16120011 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132198496 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjDocuments.java --- @@ -0,0 +1,418 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.storage.mongo; + +import static com.google.common.base.Preconditions.checkNotNull; +import static java.util.Objects.requireNonNull; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashSet; +import java.util.Iterator; +import java.util.List; +import java.util.Set; + +import org.apache.accumulo.core.security.Authorizations; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.api.resolver.RyaToRdfConversions; +import org.apache.rya.indexing.pcj.storage.PcjMetadata; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet; +import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.impl.URIImpl; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.QueryLanguage; +import org.openrdf.query.TupleQuery; +import org.openrdf.query.TupleQueryResult; +import org.openrdf.query.impl.MapBindingSet; +import org.openrdf.repository.RepositoryConnection; +import org.openrdf.repository.RepositoryException; + +import com.mongodb.MongoClient; +import com.mongodb.client.FindIterable; +import com.mongodb.client.MongoCollection; +import com.mongodb.util.JSON; + +/** + * Creates and modifies PCJs in MongoDB. 
PCJs are stored as follows:
+ *
+ * - PCJ Metadata Doc -
+ * {
+ *   _id: [table_name]_METADATA,
+ *   sparql: [sparql query to match results],
+ *   cardinality: [number of results]
+ * }
+ *
+ * - PCJ Results Doc -
+ * {
+ *   pcjName: [table_name],
+ *   auths: [auths]
+ *   [binding_var1]: {
+ *     uri: [type_uri],
+ *     value: value
+ *   }
+ *   .
+ *   .
+ *   .
+ *   [binding_varn]: {
--- End diff --
binding_var2
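To make the schema in that javadoc concrete, here is what a hypothetical metadata document and one results document might look like once the placeholders are filled in. All names and values below are invented for illustration, not taken from the PR:

```json
{
  "_id": "myPcj_METADATA",
  "sparql": "SELECT ?name ?age WHERE { ?name <urn:hasAge> ?age }",
  "cardinality": 2
}

{
  "pcjName": "myPcj",
  "auths": "U",
  "name": { "uri": "http://www.w3.org/2001/XMLSchema#anyURI", "value": "urn:Alice" },
  "age": { "uri": "http://www.w3.org/2001/XMLSchema#integer", "value": "14" }
}
```

One results document per binding set, with one `{uri, value}` sub-document per bound variable.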
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1611#comment-1611 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132187057 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
--- End diff --
I think that you should make this name more Mongo-centric. PcjQueryNode sounds very general purpose and makes it seem like this class is DB agnostic. We should have similar naming conventions for the two PCJ nodes. Currently the other node is AccumuloIndexSet. If you don't like MongoIndexSet, we can rename that to AccumuloPcjNode and rename this class to MongoPcjNode.
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120008#comment-16120008 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132193846 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/pcj/matching/PCJOptimizer.java --- @@ -90,9 +97,19 @@ public final void setConf(final Configuration conf) {
         if (!init) {
             try {
                 this.conf = conf;
-                this.useOptimal = ConfigUtils.getUseOptimalPCJ(conf);
-                provider = new AccumuloIndexSetProvider(conf);
-            } catch (Exception e) {
+                useOptimal = ConfigUtils.getUseOptimalPCJ(conf);
--- End diff --
Ugh, I hate that we have to do this and we can't use an Interface. Stupid setConf() init.
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119997#comment-16119997 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132190366 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */
+@DefaultAnnotation(NonNull.class)
+public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
+    private final String tablename;
+    private final PcjIndexer indexer;
+    private final MongoPcjDocuments pcjDocs;
+
+    /**
+     * @param sparql - name of sparql query whose results will be stored in PCJ table
+     * @param conf - Rya Configuration
+     * @param tablename - name of an existing PCJ table
+     * @throws MalformedQueryException
+     * @throws SailException
+     * @throws QueryEvaluationException
+     * @throws TableNotFoundException
+     * @throws AccumuloSecurityException
+     * @throws AccumuloException
+     * @throws PCJStorageException
+     */
+    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
+            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
+            AccumuloException, AccumuloSecurityException, PCJStorageException {
+        this.pcjDocs = checkNotNull(pcjDocs);
+        indexer = new MongoPrecomputedJoinIndexer();
+        this.tablename = tablename;
+        final SPARQLParser sp = new SPARQLParser();
+        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
+        final TupleExpr te = pq.getTupleExpr();
+        Preconditions.checkAr
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119996#comment-16119996 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132175735 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static java.util.Objects.requireNonNull;
+
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
+import org.apache.rya.api.instance.RyaDetailsRepository;
+import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
+import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
+import org.apache.rya.mongodb.MongoDBRdfConfiguration;
+import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.mongodb.MongoClient;
+
+/**
+ * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
+ */
+public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
+    private final MongoClient client;
+    private final MongoDBRdfConfiguration mongoConf;
+
+    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
+        super(conf);
+        this.client = client;
+        mongoConf = new MongoDBRdfConfiguration(conf);
+    }
+
+    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
+        super(conf, indices);
+        this.client = client;
--- End diff --
Preconditions
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120001#comment-16120001 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132175692 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.mongodb.pcj;
+
+import static java.util.Objects.requireNonNull;
+
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
+import org.apache.rya.api.instance.RyaDetailsRepository;
+import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
+import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
+import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
+import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
+import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
+import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
+import org.apache.rya.mongodb.MongoDBRdfConfiguration;
+import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import com.mongodb.MongoClient;
+
+/**
+ * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
+ */
+public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
+    private final MongoClient client;
+    private final MongoDBRdfConfiguration mongoConf;
+
+    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
+        super(conf);
+        this.client = client;
--- End diff --
Preconditions
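The one-word "Preconditions" comments above presumably ask for fail-fast null checks on the injected constructor arguments. A minimal sketch of that pattern, with an invented class name and a stand-in for the Mongo client type (this is not the PR's actual code):

```java
import static java.util.Objects.requireNonNull;

/** Sketch of the requested guard: reject a null dependency in the constructor
 *  instead of deferring a NullPointerException to first use. */
public class GuardedProvider {
    /** Stand-in for com.mongodb.MongoClient so the sketch is self-contained. */
    interface MongoClientLike {}

    private final MongoClientLike client;

    public GuardedProvider(final MongoClientLike client) {
        // Guava's Preconditions.checkNotNull would work identically here.
        this.client = requireNonNull(client, "client may not be null");
    }

    public static void main(String[] args) {
        try {
            new GuardedProvider(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Both constructors in the quoted diff assign `this.client = client;` without a check, so the guard would go in each.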
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120007#comment-16120007 ] ASF GitHub Bot commented on RYA-303: Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132201588 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoVisibilityBindingSetBsonConverter.java --- @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */
+package org.apache.rya.indexing.pcj.storage.mongo;
+
+import static org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter.DOCUMENT_VISIBILITY_KEY;
+
+import org.apache.rya.api.domain.RyaType;
+import org.apache.rya.api.resolver.RdfToRyaConversions;
+import org.apache.rya.api.resolver.RyaToRdfConversions;
+import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet;
+import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder;
+import org.apache.rya.mongodb.document.visibility.DocumentVisibility;
+import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter;
+import org.apache.rya.mongodb.document.visibility.DocumentVisibilityAdapter.MalformedDocumentVisibilityException;
+import org.bson.BsonArray;
+import org.bson.BsonDocument;
+import org.bson.BsonString;
+import org.bson.Document;
+import org.openrdf.model.Value;
+import org.openrdf.model.impl.URIImpl;
+import org.openrdf.query.BindingSet;
+import org.openrdf.query.impl.MapBindingSet;
+
+import com.mongodb.DBObject;
+import com.mongodb.MongoClient;
+import com.mongodb.util.JSON;
+
+import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
+import edu.umd.cs.findbugs.annotations.NonNull;
+
+/**
+ * Converts {@link BindingSet}s to Strings and back again. The Strings do not
+ * include the binding names and are ordered with a {@link VariableOrder}.
+ */
+@DefaultAnnotation(NonNull.class)
+public class MongoVisibilityBindingSetBsonConverter {/* implements MongoBindingSetConverter {
--- End diff --
What's going on here? Why is this commented out?
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120004#comment-16120004 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132188383

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@

    +/**
    + * Indexing Node for PCJs expressions to be inserted into execution plan to
    + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
    + */
    +@DefaultAnnotation(NonNull.class)
    +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
    +    private final String tablename;
    +    private final PcjIndexer indexer;
    +    private final MongoPcjDocuments pcjDocs;
    +
    +    /**
    +     * @param sparql - name of sparql query whose results will be stored in PCJ table
    +     * @param conf - Rya Configuration
    +     * @param tablename - name of an existing PCJ table
    +     */
    +    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
    +            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
    +            AccumuloException, AccumuloSecurityException, PCJStorageException {
    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();

--- End diff --

So, the MongoPrecomputedJoinIndexer and the PcjQueryNode do not need to talk to each other. Again, the PrecomputedJoinIndexer is for ingesting data into the Updater, while the PcjQueryNode is a placeholder for the sub query that the PCJ match
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120006#comment-16120006 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132202883

--- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjAdapter.java ---
@@ -0,0 +1,87 @@

    +/**
    + * Converts a Pcj for storage in mongoDB or retrieval from mongoDB.
    + */
    +public class MongoPcjAdapter implements BindingSetConverter {

--- End diff --

Where is this class used? It doesn't appear that this class or the converter above are used anywhere.
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1612#comment-1612 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132183937

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPrecomputedJoinIndexer.java ---
@@ -0,0 +1,89 @@

    +/**
    + * Updates the state of the Precomputed Join indices that are used by Rya.
    + */
    +@DefaultAnnotation(NonNull.class)
    +public class MongoPrecomputedJoinIndexer extends BasePcjIndexer {

--- End diff --

It appears this class does nothing. Is this just a stub class for when an observer framework is in place?
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120002#comment-16120002 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132189129

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@

    +    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
    +            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
    +            AccumuloException, AccumuloSecurityException, PCJStorageException {
    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();
    +        this.tablename = tablename;
    +        final SPARQLParser sp = new SPARQLParser();
    +        final ParsedTupleQuery pq = (ParsedTupleQuery) sp.parseQuery(sparql, null);
    +        final TupleExpr te = pq.getTupleExpr();
    +        Preconditions.checkAr
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119998#comment-16119998 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132189025

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
@@ -0,0 +1,184 @@

    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();
    +        this.tablename = tablename;

--- End diff --

Preconditions
[jira] [Commented] (RYA-303) Mongo PCJ indexer support
[ https://issues.apache.org/jira/browse/RYA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119995#comment-16119995 ]

ASF GitHub Bot commented on RYA-303:

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132173821

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java ---
@@ -0,0 +1,125 @@

    +/**
    + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
    + */
    +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
    +    private final MongoClient client;
    +    private final MongoDBRdfConfiguration mongoConf;
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
    +        super(conf);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
    +        super(conf, indices);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    @Override
    +    protected List getIndices() throws Exception {
    +        requireNonNull(conf);
    +        final MongoPcjDocuments pcjTables = new MongoPcjDocuments(client, mongoConf.getMongoDBName());
    +        final String pcjPrefix = requireNonNull(conf.get(RdfCloudTripleStoreConfiguration.CONF_TBL_PREFIX));
    +        List tables = null;
    +
    +        tables = mongoConf.getPcjTables();
    +        // this map associates pcj table name with pcj sparql query
    +        final Map indexTables = Maps.newLinkedHashMap();
    +
    +        try (final PrecomputedJoinStorage storage = new MongoPcjStorage(client, mongoConf.getMongoInstance(), null)) {
    +            final PcjTableNameFactory pcjFactory = new PcjTableNameFactory();
    +
    +            final boolean tablesProvided = tables != null && !tables.isEmpty();
    +
    +            if (tablesProvided) {
    +                // if tables provided, associate table name with sparql
    +                for (final String table : tables) {
    +                    indexTables.put(table, storage.getPcjMetadata(pcjFactory.getPcjId(table)).getSparql());
    +                }
    +            } else if (hasRyaDetails(mongoConf.getMongoDBName())) {
    +                // If this is a newer install of Rya, and it has PCJ Details, then use those.
    +                final List ids = storage.listPcjs();
    +                for (final String id : ids) {
    +                    indexTables.put(pcjFactory.makeTableName(pcjPrefix, id), storage.getPcjMetadata(id)
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support

Github user meiercaleb commented on a diff in the pull request:
https://github.com/apache/incubator-rya/pull/172#discussion_r132193846

--- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/pcj/matching/PCJOptimizer.java ---

    @@ -90,9 +97,19 @@ public final void setConf(final Configuration conf) {
             if (!init) {
                 try {
                     this.conf = conf;
    -                this.useOptimal = ConfigUtils.getUseOptimalPCJ(conf);
    -                provider = new AccumuloIndexSetProvider(conf);
    -            } catch (Exception e) {
    +                useOptimal = ConfigUtils.getUseOptimalPCJ(conf);

--- End diff --

Ugh, I hate that we have to do this and we can't use an Interface. Stupid setConf() init.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
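The complaint above is about Hadoop's Configurable contract: objects are built with a no-arg constructor and the Configuration is injected afterwards via setConf(), so fields that would naturally be final must instead be set behind a guard flag. A minimal standalone sketch of that pattern, with a tiny stand-in for org.apache.hadoop.conf.Configuration and a made-up property key, since the real class would pull in Hadoop:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the setConf() lazy-initialization pattern the quoted diff uses.
public class SetConfSketch {
    // Minimal stand-in for org.apache.hadoop.conf.Configuration.
    static class Configuration {
        private final Map<String, String> props = new HashMap<>();

        void set(final String key, final String value) {
            props.put(key, value);
        }

        boolean getBoolean(final String key, final boolean defaultValue) {
            final String v = props.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    private boolean init = false;  // guard: repeated setConf() calls are no-ops
    private boolean useOptimal;

    // The framework constructs this object with no arguments, then injects
    // configuration afterwards -- hence mutable fields instead of final ones.
    public void setConf(final Configuration conf) {
        if (!init) {
            useOptimal = conf.getBoolean("pcj.use.optimal", false);
            init = true;
        }
    }

    public boolean getUseOptimal() {
        return useOptimal;
    }

    public static void main(final String[] args) {
        final SetConfSketch optimizer = new SetConfSketch();  // no-arg construction
        final Configuration conf = new Configuration();
        conf.set("pcj.use.optimal", "true");
        optimizer.setConf(conf);                        // first call initializes
        System.out.println(optimizer.getUseOptimal());  // prints "true"
        optimizer.setConf(new Configuration());         // ignored: already initialized
        System.out.println(optimizer.getUseOptimal());  // still "true"
    }
}
```

The guard flag is what makes post-construction injection safe to call more than once, at the cost of the immutability an interface-plus-constructor design would give.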
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132182885 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/BasePcjIndexer.java --- @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkState; +import static java.util.Collections.singleton; +import static java.util.Objects.requireNonNull; +import static java.util.stream.Collectors.groupingBy; + +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Map; +import java.util.Map.Entry; +import java.util.Set; +import java.util.concurrent.atomic.AtomicReference; + +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.domain.RyaStatement; +import org.apache.rya.api.domain.RyaURI; +import org.apache.rya.indexing.entity.model.Entity; +import org.apache.rya.indexing.entity.storage.EntityStorage; +import org.apache.rya.indexing.entity.storage.EntityStorage.EntityStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.apache.rya.mongodb.MongoSecondaryIndex; +import org.openrdf.model.URI; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; + +/** + * A base class that may be used to update an {@link EntityStorage} as new + * {@link RyaStatement}s are added to/removed from the Rya instance. + */ +@DefaultAnnotation(NonNull.class) +public abstract class BasePcjIndexer implements PcjIndexer, MongoSecondaryIndex { --- End diff -- I'm not sure what this class is interacting with. The basic components of our PCJ framework are the matcher framework for query optimization, the storage layer, and the indexer layer. It seems like this is related to the indexer layer. But the indexer layer is meant to interact with the updater (whatever observer framework we use to maintain the PCJs). Given that there is currently no updater in place, what is the purpose of BasePcjIndexer, PcjIndexer, and MongoPrecomputedJoinIndexer? 
I can understand including abstract classes and interfaces just to have them in place for when an updater is incorporated, but some of these are concrete implementations. So what are they interacting with? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
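[Editor's note: a minimal sketch of the layering the comment describes — the indexer layer is only ever driven by an updater, so a concrete indexer with no updater in place has no caller. All names here (PcjUpdater, RecordingIndexer, Statement, storeStatement) are illustrative stand-ins, not types from the Rya codebase.]

```java
import java.util.ArrayList;
import java.util.List;

public class IndexerLayerSketch {
    /** Stand-in for a RyaStatement triple. */
    record Statement(String subject, String predicate, String object) {}

    /** The indexer layer: maintains PCJ results as statements change. */
    interface PcjIndexer {
        void storeStatement(Statement s);
        void deleteStatement(Statement s);
    }

    /** The updater layer (the observer framework) that would drive the indexer. */
    static final class PcjUpdater {
        private final PcjIndexer indexer;
        PcjUpdater(PcjIndexer indexer) { this.indexer = indexer; }
        void onAdd(Statement s) { indexer.storeStatement(s); }     // ingest path
        void onRemove(Statement s) { indexer.deleteStatement(s); } // delete path
    }

    /** A concrete indexer that just records calls, for the sketch. */
    static final class RecordingIndexer implements PcjIndexer {
        final List<Statement> stored = new ArrayList<>();
        public void storeStatement(Statement s) { stored.add(s); }
        public void deleteStatement(Statement s) { stored.remove(s); }
    }

    public static void main(String[] args) {
        RecordingIndexer indexer = new RecordingIndexer();
        PcjUpdater updater = new PcjUpdater(indexer);
        // Without an updater, nothing invokes the indexer; with one, ingest flows through it.
        updater.onAdd(new Statement("urn:alice", "urn:knows", "urn:bob"));
        if (indexer.stored.size() != 1) throw new AssertionError("indexer was not driven by the updater");
    }
}
```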
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132195020 --- Diff: extras/rya.benchmark/src/main/gen/org/apache/rya/benchmark/query/Rya.java --- @@ -20,7 +20,7 @@ // This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.11 // See http://java.sun.com/xml/jaxb // Any modifications to this file will be lost upon recompilation of the source schema. -// Generated on: 2016.12.16 at 01:22:14 PM PST +// Generated on: 2017.07.06 at 03:13:11 PM EDT --- End diff -- This and the above source gen files should be fixed by Eric's latest PR.
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132187057 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */ +@DefaultAnnotation(NonNull.class) +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator { --- End diff -- I think that you should make this name more Mongo-centric. PcjQueryNode sounds very general-purpose and makes it seem like this class is DB agnostic. We should have similar naming conventions for the two PCJ nodes. Currently the other node is AccumuloIndexSet. If you don't like MongoIndexSet, we can rename that to AccumuloPcjNode and rename this class to MongoPcjNode.
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132175692 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static java.util.Objects.requireNonNull; + +import java.util.List; +import java.util.Map; + +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.RdfCloudTripleStoreConfiguration; +import org.apache.rya.api.instance.RyaDetailsRepository; +import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage; +import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage; +import org.apache.rya.mongodb.MongoDBRdfConfiguration; +import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository; + +import com.google.common.collect.Lists; +import com.google.common.collect.Maps; +import com.mongodb.MongoClient; + +/** + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB. + */ +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider { +private final MongoClient client; +private final MongoDBRdfConfiguration mongoConf; + +public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) { +super(conf); +this.client = client; --- End diff -- Preconditions
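[Editor's note: the one-word review comment asks for a precondition check on the constructor argument. A minimal sketch of the fix using the JDK's java.util.Objects.requireNonNull (Guava's Preconditions.checkNotNull would be equivalent); the MongoClient field is replaced with a plain Object stand-in so the sketch needs no driver dependency.]

```java
import java.util.Objects;

public class ProviderPreconditionSketch {
    private final Object client; // stand-in for com.mongodb.MongoClient

    public ProviderPreconditionSketch(final Object client) {
        // Fail fast with a clear message instead of a NullPointerException later on.
        this.client = Objects.requireNonNull(client, "client must not be null");
    }

    public static void main(String[] args) {
        new ProviderPreconditionSketch(new Object()); // valid argument accepted
        try {
            new ProviderPreconditionSketch(null);
            throw new AssertionError("expected NullPointerException");
        } catch (NullPointerException expected) {
            // the precondition rejected the null argument at construction time
        }
    }
}
```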
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132198496 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjDocuments.java --- @@ -0,0 +1,418 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.storage.mongo; + +import static com.google.common.base.Preconditions.checkNotNull; +import static java.util.Objects.requireNonNull; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashSet; +import java.util.Iterator; +import java.util.List; +import java.util.Set; + +import org.apache.accumulo.core.security.Authorizations; +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.api.resolver.RyaToRdfConversions; +import org.apache.rya.indexing.pcj.storage.PcjMetadata; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.VisibilityBindingSet; +import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder; +import org.bson.Document; +import org.bson.conversions.Bson; +import org.openrdf.model.URI; +import org.openrdf.model.Value; +import org.openrdf.model.impl.URIImpl; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.QueryLanguage; +import org.openrdf.query.TupleQuery; +import org.openrdf.query.TupleQueryResult; +import org.openrdf.query.impl.MapBindingSet; +import org.openrdf.repository.RepositoryConnection; +import org.openrdf.repository.RepositoryException; + +import com.mongodb.MongoClient; +import com.mongodb.client.FindIterable; +import com.mongodb.client.MongoCollection; +import com.mongodb.util.JSON; + +/** + * Creates and modifies PCJs in MongoDB. 
PCJs are stored as follows:
+ *
+ * - PCJ Metadata Doc -
+ * {
+ *   _id: [table_name]_METADATA,
+ *   sparql: [sparql query to match results],
+ *   cardinality: [number of results]
+ * }
+ *
+ * - PCJ Results Doc -
+ * {
+ *   pcjName: [table_name],
+ *   auths: [auths]
+ *   [binding_var1]: {
+ *     uri: [type_uri],
+ *     value: value
+ *   }
+ *   .
+ *   .
+ *   .
+ *   [binding_varn]: { --- End diff -- binding_var2
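[Editor's note: a sketch of the two document shapes described in the javadoc above, built with plain maps so it runs without the MongoDB driver (in MongoPcjDocuments itself these would be org.bson.Document instances). Field names follow the quoted javadoc; the helper names and sample values are illustrative.]

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PcjDocumentSketch {
    /** Builds the "PCJ Metadata Doc" shape: _id, sparql, cardinality. */
    static Map<String, Object> metadataDoc(String tableName, String sparql, long cardinality) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", tableName + "_METADATA");
        doc.put("sparql", sparql);
        doc.put("cardinality", cardinality);
        return doc;
    }

    /** Builds a "PCJ Results Doc" shape with one binding variable sub-document. */
    static Map<String, Object> resultDoc(String tableName, String auths,
            String bindingVar, String typeUri, String value) {
        Map<String, Object> binding = new LinkedHashMap<>();
        binding.put("uri", typeUri);   // the value's type URI
        binding.put("value", value);   // the bound value itself
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("pcjName", tableName);
        doc.put("auths", auths);
        doc.put(bindingVar, binding);  // one entry per binding variable
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> meta = metadataDoc("pcj_1", "SELECT ?x WHERE { ?x <urn:knows> ?y }", 0L);
        if (!"pcj_1_METADATA".equals(meta.get("_id"))) throw new AssertionError();
        Map<String, Object> result = resultDoc("pcj_1", "U", "x",
                "http://www.w3.org/2001/XMLSchema#anyURI", "urn:alice");
        if (!result.containsKey("x")) throw new AssertionError();
    }
}
```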
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132189129 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132202883 --- Diff: extras/rya.indexing.pcj/src/main/java/org/apache/rya/indexing/pcj/storage/mongo/MongoPcjAdapter.java --- @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.pcj.storage.mongo; + +import static com.google.common.base.Preconditions.checkNotNull; + +import org.apache.rya.api.domain.RyaType; +import org.apache.rya.api.resolver.RdfToRyaConversions; +import org.apache.rya.api.resolver.RyaToRdfConversions; +import org.apache.rya.indexing.pcj.storage.accumulo.BindingSetConverter; +import org.apache.rya.indexing.pcj.storage.accumulo.VariableOrder; +import org.bson.BsonArray; +import org.bson.BsonDocument; +import org.bson.BsonString; +import org.bson.BsonValue; +import org.bson.Document; +import org.bson.codecs.DocumentCodec; +import org.bson.codecs.configuration.CodecRegistries; +import org.bson.codecs.configuration.CodecRegistry; +import org.bson.conversions.Bson; +import org.openrdf.model.impl.URIImpl; +import org.openrdf.query.BindingSet; +import org.openrdf.query.algebra.evaluation.QueryBindingSet; + +/** + * Converts a Pcj for storage in mongoDB or retrieval from mongoDB. + */ +public class MongoPcjAdapter implements BindingSetConverter { --- End diff -- Where is this class used? It doesn't appear that this class or the converter above are used anywhere.
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132188383 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.rya.indexing.mongodb.pcj; + +import static com.google.common.base.Preconditions.checkNotNull; + +import java.util.Collection; +import java.util.Collections; +import java.util.HashSet; + +import org.apache.accumulo.core.client.AccumuloException; +import org.apache.accumulo.core.client.AccumuloSecurityException; +import org.apache.accumulo.core.client.TableNotFoundException; +import org.apache.accumulo.core.security.Authorizations; +import org.apache.hadoop.conf.Configuration; +import org.apache.rya.api.utils.IteratorWrapper; +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet; +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil; +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities; +import org.apache.rya.indexing.pcj.storage.PcjException; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator; +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException; +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments; +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator; +import org.openrdf.query.BindingSet; +import org.openrdf.query.MalformedQueryException; +import org.openrdf.query.QueryEvaluationException; +import org.openrdf.query.algebra.Projection; +import org.openrdf.query.algebra.TupleExpr; +import org.openrdf.query.parser.ParsedTupleQuery; +import org.openrdf.query.parser.sparql.SPARQLParser; +import org.openrdf.sail.SailException; + +import com.google.common.base.Optional; +import com.google.common.base.Preconditions; + +import edu.umd.cs.findbugs.annotations.DefaultAnnotation; +import edu.umd.cs.findbugs.annotations.NonNull; +import info.aduna.iteration.CloseableIteration; + +/** + * Indexing Node for PCJs expressions to be inserted into execution plan to + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}. 
+ */ +@DefaultAnnotation(NonNull.class) +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator { +private final String tablename; +private final PcjIndexer indexer; +private final MongoPcjDocuments pcjDocs; + +/** + * + * @param sparql - name of sparql query whose results will be stored in PCJ table + * @param conf - Rya Configuration + * @param tablename - name of an existing PCJ table + * @throws MalformedQueryException + * @throws SailException + * @throws QueryEvaluationException + * @throws TableNotFoundException + * @throws AccumuloSecurityException + * @throws AccumuloException + * @throws PCJStorageException + */ +public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs) +throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException, +AccumuloException, AccumuloSecurityException, PCJStorageException { +this.pcjDocs = checkNotNull(pcjDocs); +indexer = new MongoPrecomputedJoinIndexer(); --- End diff -- So, the MongoPrecomputedJoinIndexer and the PcjQueryNode do not need to talk to each other. Again, the PrecomputedJoinIndexer is for ingesting data into the Updater, while the PcjQueryNode is a placeholder for the sub query that the PCJ matches.
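[Editor's note: a sketch of the separation the comment asks for — the query node only reads already-stored PCJ results and never constructs or holds an indexer, since ingest belongs to the updater path. The names PcjDocumentsReader and QueryNodeSketch are illustrative stand-ins (PcjDocumentsReader plays the role of MongoPcjDocuments).]

```java
import java.util.List;

public class QueryNodeSketch {
    /** Read-only view of stored PCJ results (stand-in for MongoPcjDocuments). */
    interface PcjDocumentsReader {
        List<String> results(String pcjTableName);
    }

    private final String tableName;
    private final PcjDocumentsReader pcjDocs; // query path only: no indexer field

    QueryNodeSketch(String tableName, PcjDocumentsReader pcjDocs) {
        this.tableName = tableName;
        this.pcjDocs = pcjDocs;
    }

    /** Evaluation just delegates to the stored results for this PCJ. */
    List<String> evaluate() {
        return pcjDocs.results(tableName);
    }

    public static void main(String[] args) {
        QueryNodeSketch node = new QueryNodeSketch("pcj_1", table -> List.of(table + ":bs1"));
        if (!node.evaluate().equals(List.of("pcj_1:bs1"))) throw new AssertionError();
    }
}
```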
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132192625 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request: https://github.com/apache/incubator-rya/pull/172#discussion_r132190366 --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/172#discussion_r132189025

    --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/PcjQueryNode.java ---
    @@ -0,0 +1,184 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.rya.indexing.mongodb.pcj;
    +
    +import static com.google.common.base.Preconditions.checkNotNull;
    +
    +import java.util.Collection;
    +import java.util.Collections;
    +import java.util.HashSet;
    +
    +import org.apache.accumulo.core.client.AccumuloException;
    +import org.apache.accumulo.core.client.AccumuloSecurityException;
    +import org.apache.accumulo.core.client.TableNotFoundException;
    +import org.apache.accumulo.core.security.Authorizations;
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.rya.api.utils.IteratorWrapper;
    +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
    +import org.apache.rya.indexing.external.tupleSet.ParsedQueryUtil;
    +import org.apache.rya.indexing.pcj.matching.PCJOptimizerUtilities;
    +import org.apache.rya.indexing.pcj.storage.PcjException;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.CloseableIterator;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage.PCJStorageException;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
    +import org.apache.rya.rdftriplestore.evaluation.ExternalBatchingIterator;
    +import org.openrdf.query.BindingSet;
    +import org.openrdf.query.MalformedQueryException;
    +import org.openrdf.query.QueryEvaluationException;
    +import org.openrdf.query.algebra.Projection;
    +import org.openrdf.query.algebra.TupleExpr;
    +import org.openrdf.query.parser.ParsedTupleQuery;
    +import org.openrdf.query.parser.sparql.SPARQLParser;
    +import org.openrdf.sail.SailException;
    +
    +import com.google.common.base.Optional;
    +import com.google.common.base.Preconditions;
    +
    +import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
    +import edu.umd.cs.findbugs.annotations.NonNull;
    +import info.aduna.iteration.CloseableIteration;
    +
    +/**
    + * Indexing Node for PCJs expressions to be inserted into execution plan to
    + * delegate entity portion of query to {@link MongoPrecomputedJoinIndexer}.
    + */
    +@DefaultAnnotation(NonNull.class)
    +public class PcjQueryNode extends ExternalTupleSet implements ExternalBatchingIterator {
    +    private final String tablename;
    +    private final PcjIndexer indexer;
    +    private final MongoPcjDocuments pcjDocs;
    +
    +    /**
    +     *
    +     * @param sparql - name of sparql query whose results will be stored in PCJ table
    +     * @param conf - Rya Configuration
    +     * @param tablename - name of an existing PCJ table
    +     * @throws MalformedQueryException
    +     * @throws SailException
    +     * @throws QueryEvaluationException
    +     * @throws TableNotFoundException
    +     * @throws AccumuloSecurityException
    +     * @throws AccumuloException
    +     * @throws PCJStorageException
    +     */
    +    public PcjQueryNode(final String sparql, final String tablename, final MongoPcjDocuments pcjDocs)
    +            throws MalformedQueryException, SailException, QueryEvaluationException, TableNotFoundException,
    +            AccumuloException, AccumuloSecurityException, PCJStorageException {
    +        this.pcjDocs = checkNotNull(pcjDocs);
    +        indexer = new MongoPrecomputedJoinIndexer();
    +        this.tablename = tablename;

--- End diff --

Preconditions

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
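The one-word review note "Preconditions" presumably asks for the remaining constructor arguments (`sparql`, `tablename`) to be null-checked eagerly, the way `pcjDocs` already is. A minimal, dependency-free sketch of that fail-fast pattern follows; it uses the JDK's `java.util.Objects.requireNonNull`, which behaves like Guava's `Preconditions.checkNotNull`, and the class name `PcjTableNode` is a hypothetical stand-in rather than Rya's actual `PcjQueryNode`:

```java
import static java.util.Objects.requireNonNull;

public class PcjTableNode {
    private final String sparql;
    private final String tablename;

    public PcjTableNode(final String sparql, final String tablename) {
        // Fail fast with a descriptive message instead of a bare field
        // assignment, so a null surfaces at construction time rather than
        // later during query evaluation.
        this.sparql = requireNonNull(sparql, "sparql must not be null");
        this.tablename = requireNonNull(tablename, "tablename must not be null");
    }

    public String getSparql() {
        return sparql;
    }

    public String getTablename() {
        return tablename;
    }
}
```

With this in place a null argument throws a `NullPointerException` carrying the message immediately, which is the behavior the reviewer appears to be asking for.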
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/172#discussion_r132173821

    --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java ---
    @@ -0,0 +1,125 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.rya.indexing.mongodb.pcj;
    +
    +import static java.util.Objects.requireNonNull;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
    +import org.apache.rya.api.instance.RyaDetailsRepository;
    +import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
    +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
    +import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
    +import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
    +import org.apache.rya.mongodb.MongoDBRdfConfiguration;
    +import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
    +
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Maps;
    +import com.mongodb.MongoClient;
    +
    +/**
    + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
    + */
    +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
    +    private final MongoClient client;
    +    private final MongoDBRdfConfiguration mongoConf;
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
    +        super(conf);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
    +        super(conf, indices);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    @Override
    +    protected List getIndices() throws Exception {
    +        requireNonNull(conf);
    +        final MongoPcjDocuments pcjTables = new MongoPcjDocuments(client, mongoConf.getMongoDBName());
    +        final String pcjPrefix = requireNonNull(conf.get(RdfCloudTripleStoreConfiguration.CONF_TBL_PREFIX));
    +        List tables = null;
    +
    +        tables = mongoConf.getPcjTables();
    +        // this maps associates pcj table name with pcj sparql query
    +        final Map indexTables = Maps.newLinkedHashMap();
    +
    +        try (final PrecomputedJoinStorage storage = new MongoPcjStorage(client, mongoConf.getMongoInstance(), null)) {
    +            final PcjTableNameFactory pcjFactory = new PcjTableNameFactory();
    +
    +            final boolean tablesProvided = tables != null && !tables.isEmpty();
    +
    +            if (tablesProvided) {
    +                // if tables provided, associate table name with sparql
    +                for (final String table : tables) {
    +                    indexTables.put(table, storage.getPcjMetadata(pcjFactory.getPcjId(table)).getSparql());
    +                }
    +            } else if (hasRyaDetails(mongoConf.getMongoDBName())) {
    +                // If this is a newer install of Rya, and it has PCJ Details,
    +                // then use those.
    +                final List ids = storage.listPcjs();
    +                for (final String id : ids) {
    +                    indexTables.put(pcjFactory.makeTableName(pcjPrefix, id), storage.getPcjMetadata(id).getSparql());
    +                }
    +            } else {
    +                // Otherwise figure it out by getting document IDs.
    +                tables = pcjTables.listPcjDocuments();
    +                for (final String table : tab
[GitHub] incubator-rya pull request #172: RYA-303 Mongo PCJ Support
Github user meiercaleb commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/172#discussion_r132175735

    --- Diff: extras/indexing/src/main/java/org/apache/rya/indexing/mongodb/pcj/MongoPcjIndexSetProvider.java ---
    @@ -0,0 +1,125 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.rya.indexing.mongodb.pcj;
    +
    +import static java.util.Objects.requireNonNull;
    +
    +import java.util.List;
    +import java.util.Map;
    +
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.rya.api.RdfCloudTripleStoreConfiguration;
    +import org.apache.rya.api.instance.RyaDetailsRepository;
    +import org.apache.rya.api.instance.RyaDetailsRepository.RyaDetailsRepositoryException;
    +import org.apache.rya.indexing.external.tupleSet.ExternalTupleSet;
    +import org.apache.rya.indexing.pcj.matching.provider.AbstractPcjIndexSetProvider;
    +import org.apache.rya.indexing.pcj.storage.PrecomputedJoinStorage;
    +import org.apache.rya.indexing.pcj.storage.accumulo.PcjTableNameFactory;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjDocuments;
    +import org.apache.rya.indexing.pcj.storage.mongo.MongoPcjStorage;
    +import org.apache.rya.mongodb.MongoDBRdfConfiguration;
    +import org.apache.rya.mongodb.instance.MongoRyaInstanceDetailsRepository;
    +
    +import com.google.common.collect.Lists;
    +import com.google.common.collect.Maps;
    +import com.mongodb.MongoClient;
    +
    +/**
    + * Implementation of {@link AbstractPcjIndexSetProvider} for MongoDB.
    + */
    +public class MongoPcjIndexSetProvider extends AbstractPcjIndexSetProvider {
    +    private final MongoClient client;
    +    private final MongoDBRdfConfiguration mongoConf;
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final MongoClient client) {
    +        super(conf);
    +        this.client = client;
    +        mongoConf = new MongoDBRdfConfiguration(conf);
    +    }
    +
    +    public MongoPcjIndexSetProvider(final Configuration conf, final List indices, final MongoClient client) {
    +        super(conf, indices);
    +        this.client = client;

--- End diff --

Preconditions
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132191684

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -53,8 +54,11 @@
         public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue();
         public static final String CONTEXT = "context";
         public static final String PREDICATE = "predicate";
    -    public static final String OBJECT = "object";
    +    public static final String PREDICATE_HASH = "predicate_hash";
    +    public static final String OBJECT = "object_original";

--- End diff --

Can you change this to just "object" or change "context" "predicate" "subject" to "xxx_original"
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119949#comment-16119949 ]

ASF GitHub Bot commented on RYA-316:
------------------------------------

Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132191684

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -53,8 +54,11 @@
         public static final String OBJECT_TYPE_VALUE = XMLSchema.ANYURI.stringValue();
         public static final String CONTEXT = "context";
         public static final String PREDICATE = "predicate";
    -    public static final String OBJECT = "object";
    +    public static final String PREDICATE_HASH = "predicate_hash";
    +    public static final String OBJECT = "object_original";

--- End diff --

Can you change this to just "object" or change "context" "predicate" "subject" to "xxx_original"

> Long LineStrings break MongoDB ingest
> -------------------------------------
>
>                 Key: RYA-316
>                 URL: https://issues.apache.org/jira/browse/RYA-316
>             Project: Rya
>          Issue Type: Bug
>          Components: dao
>            Reporter: Aaron Mihalik
>            Assignee: Andrew Smith
>
> MongoDB will reject statements that contain very long linestrings.
> Basically, the mongodb index key is limited to 1024 chars, so the insert will fail if the literal is longer.
> [Here is some example code|https://github.com/amihalik/rya-mongo-debugging/blob/master/src/main/java/com/github/amihalik/rya/mongo/debugging/linestring/LoadLineString.java].
> I think the inserts will work if you use 10 points, but fail if you use linestrings with 100 points.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (RYA-316) Long LineStrings break MongoDB ingest
[ https://issues.apache.org/jira/browse/RYA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119948#comment-16119948 ]

ASF GitHub Bot commented on RYA-316:
------------------------------------

Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132192953

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) {
             final RyaURI context = stmt.getContext();
             final BasicDBObject query = new BasicDBObject();
             if (subject != null){
    -            query.append(SUBJECT, subject.getData());
    +            query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData()));

--- End diff --

Can we store/query in binary (32 bytes) vs hex string (64 bytes)?

> Long LineStrings break MongoDB ingest
> -------------------------------------
>
>                 Key: RYA-316
>                 URL: https://issues.apache.org/jira/browse/RYA-316
>             Project: Rya
>          Issue Type: Bug
>          Components: dao
>            Reporter: Aaron Mihalik
>            Assignee: Andrew Smith
>
> MongoDB will reject statements that contain very long linestrings.
> Basically, the mongodb index key is limited to 1024 chars, so the insert will fail if the literal is longer.
> [Here is some example code|https://github.com/amihalik/rya-mongo-debugging/blob/master/src/main/java/com/github/amihalik/rya/mongo/debugging/linestring/LoadLineString.java].
> I think the inserts will work if you use 10 points, but fail if you use linestrings with 100 points.
[GitHub] incubator-rya pull request #199: RYA-316 Long OBJ string
Github user amihalik commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/199#discussion_r132192953

    --- Diff: dao/mongodb.rya/src/main/java/org/apache/rya/mongodb/dao/SimpleMongoDBStorageStrategy.java ---
    @@ -85,14 +89,14 @@ public DBObject getQuery(final RyaStatement stmt) {
             final RyaURI context = stmt.getContext();
             final BasicDBObject query = new BasicDBObject();
             if (subject != null){
    -            query.append(SUBJECT, subject.getData());
    +            query.append(SUBJECT_HASH, DigestUtils.sha256Hex(subject.getData()));

--- End diff --

Can we store/query in binary (32 bytes) vs hex string (64 bytes)?
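The byte-count arithmetic behind the review question: a SHA-256 digest is 32 raw bytes, and hex encoding maps each byte to two characters, so `DigestUtils.sha256Hex` yields a 64-character string while storing the raw digest halves the key size. The sketch below illustrates the difference using the JDK's `MessageDigest` instead of commons-codec, so it is dependency-free; the class and method names are illustrative, not Rya code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashSizes {
    /** Raw 32-byte SHA-256 digest, as a binary Mongo field would store it. */
    static byte[] sha256(final String value) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (final NoSuchAlgorithmException e) {
            // Every JRE is required to ship SHA-256, so this cannot happen.
            throw new IllegalStateException(e);
        }
    }

    /** Hex encoding of the digest, mirroring DigestUtils.sha256Hex: 64 chars. */
    static String sha256Hex(final String value) {
        final StringBuilder sb = new StringBuilder();
        for (final byte b : sha256(value)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Either form keeps the indexed key well under MongoDB's 1024-byte index-key limit regardless of how long the original linestring literal is; the binary form simply does it in half the space.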
[GitHub] incubator-rya issue #202: [WIP] Rya 331
Github user asfgit commented on the issue:

    https://github.com/apache/incubator-rya/pull/202

    Refer to this link for build results (access rights to CI server needed):
    https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/374/

    Build result: FAILURE
    [...truncated 1.21 MB...]
    [INFO] Apache Rya Spark Support ... SKIPPED
    [INFO] Apache Rya Web Projects ... SKIPPED
    [INFO] Apache Rya Web Implementation ... SKIPPED
    [INFO]
    [INFO] BUILD FAILURE
    [INFO]
    [INFO] Total time: 18:57 min
    [INFO] Finished at: 2017-08-09T13:20:24+00:00
    [INFO] Final Memory: 226M/2679M
    [INFO]
    [ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.11:check (check-licenses) on project rya.pcj.fluo.test.base: Too many files with unapproved license: 4 See RAT report in: /home/jenkins/jenkins-slave/workspace/incubator-rya-master-with-optionals-pull-requests/extras/rya.pcj.fluo/pcj.fluo.test.base/target/rat.txt -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please read the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
    [ERROR]
    [ERROR] After correcting the problems, you can resume the build with the command
    [ERROR]   mvn -rf :rya.pcj.fluo.test.base

    channel stopped
    Setting status of 56c2eec09ebbed482d4dee192dc9e150bca7186d to FAILURE with url https://builds.apache.org/job/incubator-rya-master-with-optionals-pull-requests/374/ and message: 'FAILURE'
    Using context: Jenkins: clean package -Pgeoindexing
[GitHub] incubator-rya pull request #202: [WIP] Rya 331
GitHub user jdasch opened a pull request:

    https://github.com/apache/incubator-rya/pull/202

    [WIP] Rya 331

    ## Description
    > What Changed?
    Don't bother reviewing this right now. This PR is just a mechanism for exercising the build servers.

    ### Tests
    > Coverage?
    [Description of what tests were written]

    ### Links
    [Jira](https://issues.apache.org/jira/browse/RYA-NUMBER)

    ### Checklist
    - [ ] Code Review
    - [ ] Squash Commits

    People To Review
    [Add those who should review this]

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jdasch/incubator-rya RYA-331

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-rya/pull/202.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #202

commit 5ca17c7e8a832b040a33ee2ce8aa0aabaf5d083e  Author: jdasch  Date: 2017-08-03T14:34:52Z  Improved IntegrationTest stability.
commit f4f4f02db73362bc9eb1997b9a391f5c0c1c5854  Author: jdasch  Date: 2017-08-04T13:44:13Z  stash
commit ddeea3c56d6329a2215c4cc25fba58ddeb7ce4a0  Author: jdasch  Date: 2017-08-04T19:04:49Z  stash
commit ef647ebf56c60d942bf30c3f9eeab43124f47dbb  Author: jdasch  Date: 2017-08-07T12:50:17Z  stash
commit 3072ce496184a6c6b8c141f24b85e39379ef4973  Author: jdasch  Date: 2017-08-07T12:51:12Z  stash
commit 1ac702dd3e5c01b766718d16c2efd726f43e4a14  Author: jdasch  Date: 2017-08-07T15:55:00Z  stash
commit 7d828bd253fe7bf912cfde3b9fadaf74ce920794  Author: jdasch  Date: 2017-08-07T17:34:28Z  added close
commit 10fce8cec5b3a2379b49bf93a2520f335705c183  Author: jdasch  Date: 2017-08-07T17:47:01Z  stash
commit 24eb7cc17dfdec825fa732a4f993da3a0e561c74  Author: jdasch  Date: 2017-08-07T20:28:02Z  stash
commit 26a12828d18d21a682ef0f121a3fb6a6420e6b7c  Author: jdasch  Date: 2017-08-08T15:18:55Z  stash
commit 2b6f9412b3d72a22fee296293011992bbc8200a5  Author: jdasch  Date: 2017-08-08T15:38:16Z  prevented blocking
commit fefc906c1dbfdd90969add65fc5c31b266cd4b9e  Author: jdasch  Date: 2017-08-08T19:41:30Z  stash
commit d69b7db18ed95e7f57a5a182cfad915cb2641cda  Author: jdasch  Date: 2017-08-08T21:00:17Z  stash - needs to be cleaned up.
commit d556f060e9e68f072a5feff394a797ca8e419c54  Author: jdasch  Date: 2017-08-09T03:30:15Z  stash
commit 56c2eec09ebbed482d4dee192dc9e150bca7186d  Author: jdasch  Date: 2017-08-09T12:20:26Z  ignored failing tests
[jira] [Created] (RYA-337) Batch Queries to MongoDB
Aaron Mihalik created RYA-337:
---------------------------------

             Summary: Batch Queries to MongoDB
                 Key: RYA-337
                 URL: https://issues.apache.org/jira/browse/RYA-337
             Project: Rya
          Issue Type: Improvement
          Components: dao
            Reporter: Aaron Mihalik
            Assignee: Aaron Mihalik

Currently the MongoDB DAO sends one query at a time to Mongo. Instead, the DAO should send a batch of queries and perform a client side hash join (like the Accumulo DAO)
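A client-side hash join of the kind the issue describes works in two passes: build a hash table over the batched query results, then probe it with the local binding sets. The sketch below is illustrative only, using plain `String[]` rows as stand-ins for Rya's binding-set and Mongo-result types; it is not the Accumulo DAO's implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ClientHashJoin {
    /**
     * Joins batched query results against local bindings on a shared key.
     * Each input row is a two-element {key, value} array; each output row is
     * {key, localValue, remoteValue}.
     */
    static List<String[]> hashJoin(final List<String[]> results, final List<String[]> bindings) {
        // Build side: one pass over the batched results, grouping values by key.
        final Map<String, List<String>> built = new HashMap<>();
        for (final String[] row : results) {
            built.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        // Probe side: one pass over the local binding sets.
        final List<String[]> joined = new ArrayList<>();
        for (final String[] binding : bindings) {
            for (final String value : built.getOrDefault(binding[0], Collections.emptyList())) {
                joined.add(new String[] { binding[0], binding[1], value });
            }
        }
        return joined;
    }
}
```

The win over one-query-at-a-time is round trips: a single batched query feeds the build side, and the join itself runs in memory on the client, which is the trade the issue attributes to the Accumulo DAO.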