brushworth opened a new pull request #317: URL: https://github.com/apache/rya/pull/317
…consistently to a well-defined standard. This commit begins to tidy the Accumulo config (MongoDB to come), but more work is required.

## Description
>What Changed?

This pull request is a DRAFT seeking feedback from the community.

The current structure and documentation of the environment.properties and Spring XML configuration seem to be lacking. Getting advanced features working in Tomcat, for example the AccumuloSelectivityEvalDAO or the various indexing strategies, is a very hard slog, particularly for newcomers to the project.

I've started to tidy up the Accumulo and extension Spring XML files, and I've tested both on a test cluster. I'm not 100% sure the AccumuloSelectivityEvalDAO is working fully, but it seems to be running. I have a different branch of Rya that contains a bunch of debug logging, which I will try tomorrow.

I propose we add default properties files to the project that work out of the box with Fluo Muchos (or similar), so that a development cluster can be spun up easily, for example on AWS or Azure, to give Rya a whirl.
People can then easily edit them to suit their environment, rather than having to reverse engineer which parameters are available and which default values are in use.

I'm happy to do the MongoDB configuration too, but I don't have a test cluster running at present.

I'm after feedback from more experienced Rya developers about whether these changes are heading in the right direction. For example, I've replaced some of the configuration XML that calls setter methods, like this:

```
<bean id="conf" class="org.apache.rya.accumulo.AccumuloRdfConfiguration">
    <!-- Calls setter method name -->
    <property name="tablePrefix" value="${rya.tableprefix}"/>
    <property name="displayQueryPlan" value="${rya.displayqueryplan}"/>
    <property name="useStats" value="false"/>
    <property name="useStats" value="${rya.usestats}"/>
    <property name="useSelectivity" value="${rya.useselectivity}"/>
    <property name="useStatementMetadata" value="${rya.usestatementmetadata}"/>
    <property name="numThreads" value="${rya.querythreads}"/>
    <property name="batchSize" value="${rya.batchsize}"/>
    <property name="dataWaveEdge" value="${rya.datawaveedge}"/>
    <property name="dataType" value="org.eclipse.rdf4j.model.Statement"/>
    <!--
    <property name="useEntity" value="${sc.use_entity}"/>
    <property name="useGeo" value="${sc.use_geo}"/>
    <property name="useFreeText" value="${sc.use_freetext}"/>
    <property name="useTemporal" value="${sc.use_temporal}"/>
    -->
</bean>
```

with

```
<bean id="conf" class="org.apache.rya.accumulo.AccumuloRdfConfiguration" factory-method="fromProperties">
    <constructor-arg ref="properties"/>
</bean>
```

Additionally, there is a lot of duplication in the properties space, and I'm trying to understand why it exists and what the differences are. For example, the environment properties `accumulo.instance`, `instance.name` and `sc.cloudbase.instancename` appear in different places, all apparently referring to the same kind of thing. Is this redundancy deliberate, or can I start consolidating it?
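
To make the consolidation question concrete, here is a minimal sketch of one possible approach, not anything that exists in Rya: a small helper that reads whichever of the three instance-name keys is set, so that older properties files would keep working while a single canonical key is phased in. The class `PropertyAliases`, the method `firstDefined`, the order of preference among the keys, and the value `dev` are all hypothetical; only the three key names come from the configuration discussed above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper, not part of Rya: resolves one setting from several
// legacy property names that appear to mean the same thing.
class PropertyAliases {

    // The three key names quoted above, in an ASSUMED order of preference.
    static final List<String> INSTANCE_KEYS = Arrays.asList(
            "accumulo.instance", "instance.name", "sc.cloudbase.instancename");

    // Returns the trimmed value of the first key in `keys` that is set to a
    // non-blank value, or an empty Optional if none of them is set.
    static Optional<String> firstDefined(Properties props, List<String> keys) {
        for (String key : keys) {
            String value = props.getProperty(key);
            if (value != null && !value.trim().isEmpty()) {
                return Optional.of(value.trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("instance.name", "dev"); // placeholder value
        System.out.println(firstDefined(props, INSTANCE_KEYS).orElse("<unset>"));
    }
}
```

A lookup like this would only be a migration aid. If the three keys turn out to mean different things in different components, they should stay separate and be documented instead.
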
Can we get down to a single Rya configuration properties file for an entire Rya installation (e.g. ingest jobs, Tomcat, Fluo, etc.)? I'm still a little lost in the details here, and any background or advice would be much appreciated.

### Tests
>Coverage?

N/A

### Links
[Jira RYA-70](https://issues.apache.org/jira/browse/RYA-70)

### Checklist
- [ ] Code Review
- [ ] Squash Commits

#### People To Review
[Add those who should review this]

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org