[GitHub] jena issue #406: JENA-1532 | Added support for escaping special characters i...
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/406 I agree with you @osma. I have made the changes to use QueryParserBase instead of custom logic. Let me know if it is OK. ---
[GitHub] jena pull request #406: JENA-1532 | Added support for escaping special chara...
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/406#discussion_r183642709 --- Diff: jena-text-es/src/main/java/org/apache/jena/query/text/es/TextIndexES.java --- @@ -422,6 +422,27 @@ public EntityDefinition getDocDef() { } private String parse(String fieldName, String qs, String lang) { +//Escape special characters if any in the query string +qs = qs.replaceAll("\\:", ":") +.replaceAll("\\+", "+") +.replaceAll("\\-", "-") +.replaceAll("\\=", "=") +.replaceAll("\\&", "&") +.replaceAll("\\|", "|") +.replaceAll("\\>", ">") +.replaceAll("\\<", "<") +.replaceAll("\\!", "!") +.replaceAll("\\(", "(") +.replaceAll("\\)", ")") +.replaceAll("\\{", "{") +.replaceAll("\\}", "}") +.replaceAll("\\]", "]") +.replaceAll("\\[", "[") +.replaceAll("\\^", "^") +.replaceAll("\\~", "~") +.replaceAll("\\?", "?"); + --- End diff -- As I understand it, backslashes need to be escaped before they reach the Jena ES query method, just like double quotes, so I haven't added a check for them. I can add a unit test that demonstrates escaping a backslash in the query string itself. Let me know if that would help. ---
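The chain of `replaceAll` calls quoted above could also be written as a single pass over the query string. This is a minimal sketch, assuming the intent is to prefix each listed character with a backslash; the quoted diff appears to have lost its backslashes in this archive, so the exact replacement strings are an assumption. `EscapeSketch`, `escape` and `SPECIAL` are hypothetical names, not part of Jena or Lucene, and the backslash itself is deliberately left out, as in the comment above.

```java
final class EscapeSketch {
    // Characters taken from the replaceAll chain in the quoted diff (assumed list).
    private static final String SPECIAL = ":+-=&|><!(){}[]^~?";

    // Prefix every special character with a backslash in a single pass.
    static String escape(String qs) {
        StringBuilder sb = new StringBuilder(qs.length() * 2);
        for (int i = 0; i < qs.length(); i++) {
            char c = qs.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A date-like search string, as in the JENA-1532 use case.
        System.out.println(escape("2018-04-23T08:42:13Z"));
    }
}
```

A single loop avoids compiling eighteen regular expressions per query, although for short query strings the difference is probably negligible.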
[GitHub] jena pull request #406: JENA-1532 | Added support for escaping special chara...
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/406#discussion_r183512768 --- Diff: jena-text-es/src/main/java/org/apache/jena/query/text/es/TextIndexES.java --- @@ -422,6 +422,27 @@ public EntityDefinition getDocDef() { } private String parse(String fieldName, String qs, String lang) { +//Escape special characters if any in the query string +qs = qs.replaceAll("\\:", ":") --- End diff -- Awesome. I will wait for it to be merged into master. ---
[GitHub] jena pull request #406: JENA-1532 | Added support for escaping special chara...
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/406#discussion_r183507712 --- Diff: jena-text-es/src/main/java/org/apache/jena/query/text/es/TextIndexES.java --- @@ -422,6 +422,27 @@ public EntityDefinition getDocDef() { } private String parse(String fieldName, String qs, String lang) { +//Escape special characters if any in the query string +qs = qs.replaceAll("\\:", ":") --- End diff -- So this is indeed a bit of a style question. But keeping that aside, under the hood, `str.replaceAll()` is equivalent to `Pattern.compile(regex).matcher(str).replaceAll(repl)`. I can rewrite the `str.replaceAll` calls like that, but I genuinely do not see any major gains. You get to make the final call. :) Changing it is quite trivial on my side. It's more the logistics that I would like to avoid, if possible. ---
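The equivalence mentioned in this comment is documented in the Javadoc for `String.replaceAll`. A minimal sketch of the two forms side by side follows; the replacement string (a backslash followed by the colon) is an assumption about the intended escaping, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class ReplaceAllSketch {
    // Compiled once; String.replaceAll compiles the regex again on every call.
    private static final Pattern COLON = Pattern.compile(":");

    // The form used in the quoted diff.
    static String viaString(String s) {
        return s.replaceAll(":", "\\\\:");
    }

    // The expanded form with a precompiled pattern.
    static String viaPattern(String s) {
        return COLON.matcher(s).replaceAll("\\\\:");
    }

    public static void main(String[] args) {
        String input = "08:42:13";
        System.out.println(viaString(input));
        System.out.println(viaPattern(input));
    }
}
```

Both methods return the same string; the only practical difference is that the precompiled `Pattern` skips recompilation on repeated calls.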
[GitHub] jena pull request #406: Jena Text ES | Added support for escaping special ch...
GitHub user anujgandharv opened a pull request: https://github.com/apache/jena/pull/406 Jena Text ES | Added support for escaping special characters in search strings Added support for escaping special characters in search strings You can merge this pull request into a Git repository by running: $ git pull https://github.com/EaseTech/jena master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/406.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #406 commit c2a3bac147a786210b1ba4125a4be9edc66719c4 Author: Anuj Kumar <akumar1@...> Date: 2018-04-23T08:42:13Z JENA-1532 | fix for date based text query search commit 7f4915ee6c94b1044155110fe4612147a9e8de91 Author: Anuj Kumar <akumar1@...> Date: 2018-04-23T18:23:38Z JENA-1532 | fix for date based text query search ---
[GitHub] jena pull request #405: JENA-1532 | fix for date based text query search
Github user anujgandharv closed the pull request at: https://github.com/apache/jena/pull/405 ---
[GitHub] jena pull request #405: JENA-1532 | fix for date based text query search
GitHub user anujgandharv opened a pull request: https://github.com/apache/jena/pull/405 JENA-1532 | fix for date based text query search Added fix for Date based searching in Jena Elasticsearch text Index. You can merge this pull request into a Git repository by running: $ git pull https://github.com/EaseTech/jena master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/405.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #405 commit c2a3bac147a786210b1ba4125a4be9edc66719c4 Author: Anuj Kumar <akumar1@...> Date: 2018-04-23T08:42:13Z JENA-1532 | fix for date based text query search ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Thanks Osma for incorporating the changes into master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Thanks @osma. Can you point me to the SNAPSHOT repo, please? ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 @osma I have merged the changes from master into my branch. I am fine with merging the code on Monday/Tuesday. Can you also let me know when 3.3.0 will be released? So that we are not blocked from using the ES functionality, I am currently maintaining a local branch of Jena into which I have merged the changes from this branch. Obviously, I want to get rid of it ASAP, and for that I need 3.3.0 from Apache Jena's Maven repo. Is there a release planned anytime soon? ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Cool. ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 @osma @ajs6f I have made the necessary changes to the ES TextIndex based on the changes in #226. I want to bring one thing to your attention: if the query string is `?s text:query ('word' 'lang:en' )`, then the query method receives the following arguments: `*null*, "word", null, "en" `, and NOT `RDFS.label.asNode(), "word", null, "en"` ---
[GitHub] jena issue #226: Refactor TextIndex and TextQueryPF in jena-text (preparatio...
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/226 Thanks @osma. I learned from you that: _It's always good to have unit tests for the changes to make sure we are not missing any corner cases :)_ So just to clarify, if the SPARQL query string is: ``` SELECT ?s { ?s text:query (rdfs:label 'word' 10) ; rdfs:label ?label } ``` then the query method will receive the following arguments: `RDFS.label.asNode(), "word", null, null, 10` And if the query string is: `?s text:query (rdfs:label 'word' 'lang:en' )`, then the query method will receive the following arguments: `RDFS.label.asNode(), "word", null, "en"` And finally, if the query string is: `?s text:query ('word' 'lang:en' )`, then the query method will receive the following arguments: `RDFS.label.asNode(), "word", null, "en"`, where the Node property is the default Node defined at configuration time. Can you confirm all the scenarios above, please? I will modify my test cases and logic accordingly. NOTE: I say NULL for the graph argument mainly because I don't care about it in the ES implementation. ---
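In the scenarios above, the `'lang:en'` argument of `text:query` is expected to reach the query method as the bare tag `"en"`. The sketch below illustrates that normalisation, assuming a plain `lang:` prefix. `LangArgSketch` and `extractLang` are hypothetical names, and this is not the actual jena-text argument parsing code.

```java
final class LangArgSketch {
    private static final String PREFIX = "lang:";

    // Returns the bare language tag for an argument such as "lang:en",
    // or null when the argument is not a (non-empty) language argument.
    static String extractLang(String arg) {
        if (arg == null || !arg.startsWith(PREFIX) || arg.length() == PREFIX.length()) {
            return null;
        }
        return arg.substring(PREFIX.length());
    }

    public static void main(String[] args) {
        System.out.println(extractLang("lang:en"));
        System.out.println(extractLang("word"));
    }
}
```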
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107701838 --- Diff: jena-text/pom.xml --- @@ -112,11 +141,73 @@ org.apache.maven.plugins maven-surefire-plugin - -**/TS_*.java - + +true + + + + unit-tests + test + + test + + + false + + **/TS_*.java + + + **/*IT.java + + + + + integration-tests + integration-test + + test + + + false + + **/*IT.java + + + + + +com.github.alexcojocaru +elasticsearch-maven-plugin + +5.2 + +elasticsearch +9300 --- End diff -- I just found a bug in the Maven ES plugin: the TCP port always defaults to 9300, whether or not you specify it in the configuration. I am therefore reverting the TCP port to 9300. ---
[GitHub] jena issue #226: Refactor TextIndex and TextQueryPF in jena-text (preparatio...
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/226 @osma I do not see any unit/integration tests updated for Lucene for the changes you have made in this PR. Do they come in a separate PR? In the meantime, can you give me some examples of how the information will flow into the query endpoint? Will the lang be like `en` or `lang:en`? I assume it would be `en`. Is that the correct assumption? ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 @osma Let me try to merge your changes from #226 into my code and see if I can turn it around today. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107681374 --- Diff: jena-text/src/main/java/examples/JenaESTextExample.java --- @@ -0,0 +1,94 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package examples; + +import org.apache.jena.atlas.lib.StrUtils; +import org.apache.jena.query.*; +import org.apache.jena.sparql.util.QueryExecUtils; + +/** + * Simple example class to test the {@link org.apache.jena.query.text.assembler.TextIndexESAssembler} + * For this class to work properly, an elasticsearch node should be up and running, otherwise it will fail. + * You can find the details of downloading and running an ElasticSearch version here: https://www.elastic.co/downloads/past-releases/elasticsearch-5-2-1 + * Unzip the file in your favourite directory and then execute the appropriate file under the bin directory. + * It will take less than a minute. + * In order to visualize what is written in ElasticSearch, you need to download and run Kibana: https://www.elastic.co/downloads/kibana + * To run kibana, just go to the bin directory and execute the appropriate file. 
+ * We need to resort to this mechanism as ElasticSearch has stopped supporting embedded ElasticSearch. + * + * In addition we cant have it in the test package because ElasticSearch + * detects the thread origin and stops us from instantiating a client. + */ +public class JenaESTextExample { + +public static void main(String[] args) { + +queryData(loadData(createAssembler())); +} + + +private static Dataset createAssembler() { +String assemblerFile = "text-config-es.ttl"; +Dataset ds = DatasetFactory.assemble(assemblerFile, +"http://localhost/jena_example/#text_dataset") ; +return ds; +} + +private static Dataset loadData(Dataset ds) { +JenaTextExample1.loadData(ds, "data-es.ttl"); +return ds; +} + +/** + * Query Data + * @param ds + */ +private static void queryData(Dataset ds) { +//JenaTextExample1.queryData(ds); --- End diff -- It's actually something I comment and uncomment when testing different SPARQL queries, so I would prefer to keep it, if that's OK. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107681126 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/TextIndexESAssembler.java --- @@ -0,0 +1,129 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text.assembler; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.text.*; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; +import org.apache.jena.rdf.model.Statement; +import org.apache.jena.sparql.util.graph.GraphUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashMap; +import java.util.Map; + +import static org.apache.jena.query.text.assembler.TextVocab.*; + +public class TextIndexESAssembler extends AssemblerBase { + +private static Logger LOGGER = LoggerFactory.getLogger(TextIndexESAssembler.class) ; + +protected static final String COMMA = ","; +protected static final String COLON = ":"; +/* +<#index> a :TextIndexES ; +text:serverList "127.0.0.1:9300,127.0.0.2:9400,127.0.0.3:9500" ; #Comma separated list of hosts:ports +text:clusterName "elasticsearch" +text:shards "1" +text:replicas "1" +text:entityMap <#endMap> ; +. +*/ + +@SuppressWarnings("resource") +@Override +public TextIndex open(Assembler a, Resource root, Mode mode) { +try { +String listOfHostsAndPorts = GraphUtils.getAsStringValue(root, pServerList) ; +if(listOfHostsAndPorts == null || listOfHostsAndPorts.isEmpty()) { +throw new TextIndexException("Mandatory property text:serverList (containing the comma-separated list of host:port) property is not specified. " + +"An example value for the property: 127.0.0.1:9300"); +} +String[] hosts = listOfHostsAndPorts.split(COMMA); +Map<String,Integer> hostAndPortMapping = new HashMap<>(); +for(String host : hosts) { +String[] hostAndPort = host.split(COLON); +if(hostAndPort.length < 2) { +LOGGER.error("Either the host or the port value is missing.Please specify the property in host:port format. " + +"Both parts are mandatory. Ignoring this value. 
Moving to the next one."); +continue; +} +hostAndPortMapping.put(hostAndPort[0], Integer.valueOf(hostAndPort[1])); +} + +String clusterName = GraphUtils.getAsStringValue(root, pClusterName); +if(clusterName == null || clusterName.isEmpty()) { +LOGGER.warn("ClusterName property is not specified. Defaulting to 'elasticsearch'"); +clusterName = "elasticsearch"; +} + +String numberOfShards = GraphUtils.getAsStringValue(root, pShards); +if(numberOfShards == null || numberOfShards.isEmpty()) { +LOGGER.warn("shards property is not specified. Defaulting to '1'"); +numberOfShards = "1"; +} + +String replicationFactor = GraphUtils.getAsStringValue(root, pReplicas); +if(replicationFactor == null || replicationFactor.isEmpty()) { +LOGGER.warn("replicas property is not specified. Defaulting to '1'"); +replicationFactor = "1"; +} +
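The `text:serverList` handling in the quoted (and truncated) diff splits a comma-separated list of `host:port` values into a map. A self-contained sketch of that idea follows, assuming the same skip-on-missing-port behaviour. `ServerListSketch` and `parse` are hypothetical names, and the `trim()` call is an addition that is not in the quoted code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ServerListSketch {
    // Parse "host1:port1,host2:port2" into an insertion-ordered host -> port map.
    static Map<String, Integer> parse(String serverList) {
        Map<String, Integer> hostAndPortMapping = new LinkedHashMap<>();
        for (String entry : serverList.split(",")) {
            String[] hostAndPort = entry.trim().split(":");
            if (hostAndPort.length < 2) {
                // Host or port is missing: skip this entry, as the quoted code does.
                continue;
            }
            // Integer.valueOf throws NumberFormatException for a non-numeric port.
            hostAndPortMapping.put(hostAndPort[0], Integer.valueOf(hostAndPort[1]));
        }
        return hostAndPortMapping;
    }

    public static void main(String[] args) {
        System.out.println(parse("127.0.0.1:9300, 127.0.0.2:9400"));
    }
}
```

Because the map is keyed by host name alone, two entries for the same host on different ports would overwrite each other. That may or may not matter for a real cluster configuration.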
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107681054 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,435 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.engine.DocumentMissingException; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'jena-text' + */ +private final String indexName; + +/** + * The parameter representing the cluster name key + */ +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +/** + * The parameter representing the number of shards key + */ +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +/** + * The parameter representing the number of replicas key + */ +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +private static final String DASH = "-"; + +private static final String UNDERSCORE = "_"; + +private static final String COLON = ":"; + +private static final String ASTREIX = "*"; --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107681154 --- Diff: jena-text/src/main/resources/text-config-es.ttl --- @@ -0,0 +1,65 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + ## Example of a TDB dataset and text index for ElasticSearch + +@prefix : <http://localhost/jena_example/#> . +@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . +@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . +@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> . +@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> . +@prefix text: <http://jena.apache.org/text#> . + +# TDB +[] ja:loadClass "org.apache.jena.tdb.TDB" . +tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset . +tdb:GraphTDB rdfs:subClassOf ja:Model . + +# Text +[] ja:loadClass "org.apache.jena.query.text.TextQuery" . +text:TextDataset rdfs:subClassOf ja:RDFDataset . +text:TextIndexES rdfs:subClassOf text:TextIndex . + +## --- +## This URI must be fixed - it's used to assemble the text dataset. + +:text_dataset rdf:type text:TextDataset ; +text:dataset <#dataset> ; +text:index <#indexES> ; +. + +<#dataset> rdf:type tdb:DatasetTDB ; +tdb:location "--mem--" ; +. 
+ +<#indexES> a text:TextIndexES ; +text:serverList "127.0.0.1:9300" ; # A comma-separated list of Host:Port values of the ElasticSearch Cluster nodes. +text:clusterName "elasticsearch" ; # Name of the ElasticSearch Cluster. If not specified defaults to 'elasticsearch' +text:shards "1" ; # The number of shards for the index. Defaults to 1 +text:replicas "1" ;# The number of replicas for the index. Defaults to 1 +text:indexName "jena-text" ; # Name of the Index. defaults to jena-text +text:multilingualSupport true ; --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107681019 --- Diff: jena-text/pom.xml --- @@ -112,11 +141,73 @@ org.apache.maven.plugins maven-surefire-plugin - -**/TS_*.java - + +true + + + + unit-tests + test + + test + + + false + + **/TS_*.java + + + **/*IT.java + + + + + integration-tests + integration-test + + test + + + false + + **/*IT.java + + + + + +com.github.alexcojocaru +elasticsearch-maven-plugin + +5.2 + +elasticsearch +9300 --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107681104 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/TextIndexESAssembler.java --- @@ -0,0 +1,129 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text.assembler; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.text.*; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; +import org.apache.jena.rdf.model.Statement; +import org.apache.jena.sparql.util.graph.GraphUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashMap; +import java.util.Map; + +import static org.apache.jena.query.text.assembler.TextVocab.*; + +public class TextIndexESAssembler extends AssemblerBase { + +private static Logger LOGGER = LoggerFactory.getLogger(TextIndexESAssembler.class) ; + +protected static final String COMMA = ","; +protected static final String COLON = ":"; +/* +<#index> a :TextIndexES ; +text:serverList "127.0.0.1:9300,127.0.0.2:9400,127.0.0.3:9500" ; #Comma separated list of hosts:ports +text:clusterName "elasticsearch" +text:shards "1" +text:replicas "1" +text:entityMap <#endMap> ; +. +*/ + +@SuppressWarnings("resource") +@Override +public TextIndex open(Assembler a, Resource root, Mode mode) { +try { +String listOfHostsAndPorts = GraphUtils.getAsStringValue(root, pServerList) ; +if(listOfHostsAndPorts == null || listOfHostsAndPorts.isEmpty()) { +throw new TextIndexException("Mandatory property text:serverList (containing the comma-separated list of host:port) property is not specified. " + +"An example value for the property: 127.0.0.1:9300"); +} +String[] hosts = listOfHostsAndPorts.split(COMMA); +Map<String,Integer> hostAndPortMapping = new HashMap<>(); +for(String host : hosts) { +String[] hostAndPort = host.split(COLON); +if(hostAndPort.length < 2) { +LOGGER.error("Either the host or the port value is missing.Please specify the property in host:port format. " + +"Both parts are mandatory. Ignoring this value. 
Moving to the next one."); +continue; +} +hostAndPortMapping.put(hostAndPort[0], Integer.valueOf(hostAndPort[1])); +} + +String clusterName = GraphUtils.getAsStringValue(root, pClusterName); +if(clusterName == null || clusterName.isEmpty()) { +LOGGER.warn("ClusterName property is not specified. Defaulting to 'elasticsearch'"); +clusterName = "elasticsearch"; +} + +String numberOfShards = GraphUtils.getAsStringValue(root, pShards); +if(numberOfShards == null || numberOfShards.isEmpty()) { +LOGGER.warn("shards property is not specified. Defaulting to '1'"); +numberOfShards = "1"; +} + +String replicationFactor = GraphUtils.getAsStringValue(root, pReplicas); +if(replicationFactor == null || replicationFactor.isEmpty()) { +LOGGER.warn("replicas property is not specified. Defaulting to '1'"); +replicationFactor = "1"; +} +
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107680990 --- Diff: jena-text/pom.xml --- @@ -81,6 +81,35 @@ lucene-queryparser + + org.elasticsearch + elasticsearch + + + + org.elasticsearch.client + transport + + + --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107680962 --- Diff: jena-parent/pom.xml --- @@ -275,6 +276,27 @@ ${ver.spatial4j} + + +org.elasticsearch +elasticsearch +${ver.elasticsearch} + + + +org.elasticsearch.client +transport +${ver.elasticsearch} + + + + --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r107437003 --- Diff: jena-text/pom.xml --- @@ -112,11 +141,77 @@ org.apache.maven.plugins maven-surefire-plugin - -**/TS_*.java - + +true + + +-Dtests.security.manager=false --- End diff -- Good catch. I will remove it. ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 @osma @ajs6f I have added integration tests for the ES-based indexing strategy. Could you please review them and let me know whether they are fine or whether I missed anything? I have no more pending tasks for ES-based indexing, unless I missed a review comment. Let me know what you think. ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Great. Thanks @osma. I misunderstood your previous comment. I will implement the integration tests for the above scenarios. ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Thanks @osma and @ajs6f for your inputs. Can I then suggest that, instead of moving TestTextIndexES to the integration tests module, we get rid of it completely and have the same tests, as well as more complex ones, built with the Maven ES Plugin? Also, can you provide some test scenarios that I can work on? I will make sure to include the `Berlin` removal example. Any others? ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 This is the error I am getting:
```
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 11.253 sec <<< FAILURE! - in org.apache.jena.query.text.it.TextIndexESIT
org.apache.jena.query.text.it.TextIndexESIT Time elapsed: 11.253 sec <<< ERROR!
java.lang.IllegalStateException: running tests but failed to invoke RandomizedContext#getRandom
Caused by: java.lang.reflect.InvocationTargetException
Caused by: java.lang.IllegalStateException: No context information for thread: Thread[id=1, name=main, state=RUNNABLE, group=main]. Is this thread running under a class com.carrotsearch.randomizedtesting.RandomizedRunner runner context? Add @RunWith(class com.carrotsearch.randomizedtesting.RandomizedRunner.class) to your test class. Make sure your code accesses random contexts within @BeforeClass and @AfterClass boundary (for example, static test class initializers are not permitted to access random contexts).
```
Just to rule out any local interference on my side, I have checked in a simple test based on the ES Maven Plugin. Can you try it out on your side, @osma, and see if you get the same error as I do? ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 @osma The spring-elasticsearch IT throws exactly the same error that I am getting on my setup. What they have done is ignore all the errors and assume there are none, so the tests in the [BaseTest.java](https://github.com/dadoonet/spring-elasticsearch/blob/master/src/test/java/fr/pilato/spring/elasticsearch/it/BaseTest.java#L52) classes are skipped. They use the Spring-specific [ESBeanFactory](https://github.com/dadoonet/spring-elasticsearch/blob/master/src/test/java/fr/pilato/spring/elasticsearch/it/annotation/AppConfig.java#L32) to instantiate an Elasticsearch client. Personally, I do not want to introduce a Spring dependency unnecessarily in Apache Jena, because Jena is not based on the Spring Framework. What are your thoughts? ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 @osma @ajs6f I have tried the ES Maven plugin, but it is throwing lots of errors. I suspect that it is just a wrapper around the embedded Elasticsearch. I asked the creator to provide a more comprehensive example of using the plugin and am still waiting for his answer. In the meantime, in the interest of time and of getting this functionality in, I would "please" suggest that we keep the tests we currently have, which cover basic testing of the functionality, and use the JenaESTextExample.java class, which implements more comprehensive testing. In any case, we require a running instance of ElasticSearch. Until we have a mechanism to start and stop an ElasticSearch instance automatically, we can start and stop it manually. I have documented how to do that in JenaESTextExample.java. The rest of the review comments have been incorporated (unless I missed something again). Do you agree? And can we merge it in? Thanks, Anuj Kumar ---
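Since the tests discussed above depend on a manually started ElasticSearch instance, one option is to check whether anything is listening on the configured host and port before running them. The following is a minimal sketch using only the JDK; `EsAvailability` is a hypothetical helper, not part of the pull request, and the port 9300 used in the note below is simply the one from the `text:serverList` example. A successful TCP connect only shows that the port is open, not that the listener is actually ElasticSearch.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper for illustration only; not part of the pull request. */
final class EsAvailability {

    private EsAvailability() {}

    /**
     * Returns true if a TCP connection to host:port succeeds within the timeout.
     * Invalid arguments and every I/O failure are reported as "not reachable".
     */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        if (host == null || host.trim().isEmpty()
                || port < 1 || port > 65535 || timeoutMillis <= 0) {
            return false; // a timeout of 0 would mean "wait forever", so it is rejected here
        }
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host.trim(), port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // connection refused, timed out, or host could not be resolved
            return false;
        }
    }
}
```

A test class could call `EsAvailability.isReachable("127.0.0.1", 9300, 1000)` in its setup and skip itself when the result is false, so that a build without a running instance does not fail.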
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106870357 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String indexName; + +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +/** + * Number of maximum results to return in case no limit is specified on the search operation + */ +static final Integer MAX_RESULTS = 1; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) { + +this.indexName = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +try { +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME_PARAM, esSettings.getClusterName()).build(); +List addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocke
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106869941 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String indexName; + +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +/** + * Number of maximum results to return in case no limit is specified on the search operation + */ +static final Integer MAX_RESULTS = 1; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) { + +this.indexName = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +try { +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME_PARAM, esSettings.getClusterName()).build(); +List addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocke
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106869330 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String indexName; + +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +/** + * Number of maximum results to return in case no limit is specified on the search operation + */ +static final Integer MAX_RESULTS = 1; + +private boolean isMultilingual ; --- End diff -- Removed Multilingual checks from the latest commit ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106641289 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String indexName; + +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +/** + * Number of maximum results to return in case no limit is specified on the search operation + */ +static final Integer MAX_RESULTS = 1; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) { + +this.indexName = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +try { +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME_PARAM, esSettings.getClusterName()).build(); +List addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocke
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106636373 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String indexName; + +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +/** + * Number of maximum results to return in case no limit is specified on the search operation + */ +static final Integer MAX_RESULTS = 1; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) { + +this.indexName = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +try { +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME_PARAM, esSettings.getClusterName()).build(); +List addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocke
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106624906 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String indexName; + +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +/** + * Number of maximum results to return in case no limit is specified on the search operation + */ +static final Integer MAX_RESULTS = 1; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) { + +this.indexName = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +try { +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME_PARAM, esSettings.getClusterName()).build(); +List addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocke
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Thanks @ajs6f for the Maven ElasticSearch Plugin link. It looks like this would enable us to spin up a fully functional single-node ES for our integration tests. Can you shed some more light on how I can reuse my test as an integration test in Jena? Is there something Jena-specific, or do I use the standard Maven way of executing integration tests? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106431068 --- Diff: jena-text/text-config.ttl --- @@ -50,6 +50,7 @@ text:TextIndexLucene rdfs:subClassOf text:TextIndex . <#indexLucene> a text:TextIndexLucene ; #text:directory ; text:directory "mem" ; +text:multilingualSupport true ; --- End diff -- Same as above ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106431036 --- Diff: jena-text/testing/TextQuery/text-config.ttl --- @@ -31,6 +31,7 @@ text:TextIndexLucene rdfs:subClassOf text:TextIndex . <#indexLucene> a text:TextIndexLucene ; text:directory "mem" ; +text:multilingualSupport true ; --- End diff -- Well, I was testing how multilingual works and since making this change did not break anything, I left it there. I can revert it back. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106430285 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String indexName; + +static final String CLUSTER_NAME_PARAM = "cluster.name"; + +static final String NUM_OF_SHARDS_PARAM = "number_of_shards"; + +static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas"; + +/** + * Number of maximum results to return in case no limit is specified on the search operation + */ +static final Integer MAX_RESULTS = 1; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) { + +this.indexName = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +try { +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME_PARAM, esSettings.getClusterName()).build(); +List<InetSocketTransportAddress> addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocke
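As a side note on the quoted constructor: it walks the host-to-port mapping with keySet() and then calls get(host) for every key. A minimal sketch of the same loop over entrySet() is below. It is only an illustration under stated assumptions, not the code in the pull request: the class name HostPortAddresses and the method fromMapping are hypothetical, the mapping is assumed to be a Map<String, Integer>, and the JDK's InetSocketAddress stands in for Elasticsearch's InetSocketTransportAddress so that the example runs without the ES client on the classpath.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Jena or Elasticsearch.
final class HostPortAddresses {

    /** Turns a host -> port mapping into a list of socket addresses. */
    public static List<InetSocketAddress> fromMapping(Map<String, Integer> hostToPort) {
        List<InetSocketAddress> addresses = new ArrayList<>();
        // entrySet() yields key and value together, so no second lookup per host.
        for (Map.Entry<String, Integer> entry : hostToPort.entrySet()) {
            // createUnresolved skips the DNS lookup; the quoted code resolves
            // each host eagerly with InetAddress.getByName instead.
            addresses.add(InetSocketAddress.createUnresolved(entry.getKey(), entry.getValue()));
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Example values only; 9300 is the usual transport port.
        Map<String, Integer> mapping = new LinkedHashMap<>();
        mapping.put("localhost", 9300);
        mapping.put("es-node-2", 9301);
        for (InetSocketAddress address : fromMapping(mapping)) {
            System.out.println(address.getHostString() + ":" + address.getPort());
        }
    }
}
```

Whether eager resolution is preferable depends on how the client is expected to behave when a host name cannot be resolved at start-up; the sketch leaves that choice open.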
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106429946 --- Diff: jena-text/pom.xml --- @@ -81,6 +81,51 @@ lucene-queryparser + + org.elasticsearch + elasticsearch + + + + org.elasticsearch.client + transport + + + --- End diff -- Will test and remove if unnecessary ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106429891 --- Diff: jena-parent/pom.xml --- @@ -275,6 +276,75 @@ ${ver.spatial4j} + + +org.elasticsearch +elasticsearch +${ver.elasticsearch} + + +commons-logging +commons-logging + + +org.hamcrest +hamcrest-core + + + + + + +org.elasticsearch.client +transport +${ver.elasticsearch} + + +commons-logging +commons-logging + + +org.hamcrest +hamcrest-core + + + + + + --- End diff -- Will test it at my end and if not required, will remove them ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106429678 --- Diff: jena-text/pom.xml --- @@ -115,6 +160,7 @@ **/TS_*.java +-Dtests.security.manager=false --- End diff -- Will do ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106429631 --- Diff: jena-text/src/main/java/examples/JenaESTextExample.java --- @@ -0,0 +1,64 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package examples; + +import org.apache.jena.query.Dataset; +import org.apache.jena.query.DatasetFactory; + +/** + * Simple example class to test the {@link org.apache.jena.query.text.assembler.TextIndexESAssembler} + * For this class to work properly, an elasticsearch node should be up and running, otherwise it will fail. + * You can find the details of downloading and running an ElasticSearch version here: https://www.elastic.co/downloads/past-releases/elasticsearch-5-2-1 + * Unzip the file in your favourite directory and then execute the appropriate file under the bin directory. + * It will take less than a minute. + * In order to visualize what is written in ElasticSearch, you need to download and run Kibana: https://www.elastic.co/downloads/kibana + * To run kibana, just go to the bin directory and execute the appropriate file. + * We need to resort to this mechanism as ElasticSearch has stopped supporting embedded ElasticSearch. 
+ * + * In addition we cant have it in the test package because ElasticSearch + * detects the thread origin and stops us from instantiating a client. + */ +public class JenaESTextExample { --- End diff -- Oh, I didn't realise that. I haven't renamed it; I think it got deleted by mistake. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106429441 --- Diff: jena-text/src/main/java/examples/JenaESTextExample.java --- @@ -0,0 +1,64 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package examples; + +import org.apache.jena.query.Dataset; +import org.apache.jena.query.DatasetFactory; + +/** + * Simple example class to test the {@link org.apache.jena.query.text.assembler.TextIndexESAssembler} + * For this class to work properly, an elasticsearch node should be up and running, otherwise it will fail. + * You can find the details of downloading and running an ElasticSearch version here: https://www.elastic.co/downloads/past-releases/elasticsearch-5-2-1 + * Unzip the file in your favourite directory and then execute the appropriate file under the bin directory. + * It will take less than a minute. + * In order to visualize what is written in ElasticSearch, you need to download and run Kibana: https://www.elastic.co/downloads/kibana + * To run kibana, just go to the bin directory and execute the appropriate file. + * We need to resort to this mechanism as ElasticSearch has stopped supporting embedded ElasticSearch. 
+ * + * In addition we cant have it in the test package because ElasticSearch + * detects the thread origin and stops us from instantiating a client. + */ +public class JenaESTextExample { + +public static void main(String[] args) { + +queryData(loadData(createAssembler())); +} + + +private static Dataset createAssembler() { +String assemblerFile = "text-config-es.ttl"; +Dataset ds = DatasetFactory.assemble(assemblerFile, +"http://localhost/jena_example/#text_dataset") ; +return ds; +} + +private static Dataset loadData(Dataset ds) { +JenaTextExample1.loadData(ds, "data-es.ttl"); +return ds; +} + +/** + * Query Data + * @param ds + */ +private static void queryData(Dataset ds) { +JenaTextExample1.queryData(ds); --- End diff -- Actually, since I am loading the ES-specific assembler and loading the data into that dataset, this is fine. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106428753 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106428563 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106428124 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106428015 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106427299 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,394 @@
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106420753 --- Diff: jena-text/src/test/java/org/apache/jena/query/text/TestTextIndexES.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.jena.query.text; + + + +import org.apache.jena.graph.Node; +import org.apache.jena.vocabulary.RDFS; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.test.ESIntegTestCase; +import org.junit.Assert; +import org.junit.Ignore; +import org.junit.Test; + +import java.util.List; +import java.util.Map; +import java.util.concurrent.ExecutionException; + +/** + * + * Integration test for {@link TextIndexES} class + * ES Integration test depends on security policies that may sometime not be loaded properly. 
+ * If you find any issues regarding security set the following VM argument to resolve the issue: + * -Dtests.security.manager=false + * + */ +@ESIntegTestCase.ClusterScope() +public class TestTextIndexES extends ESIntegTestCase { --- End diff -- The main issue is that embedded ElasticSearch does not come with the "painless" plugin, and ElasticSearch has stopped releasing the painless plugin as a Maven artifact. Therefore I cannot have tests that rely on the script portion being executed. In order to do extensive testing, I need this plugin. That is also the reason why the delete tests are currently ignored. Let me see if I can add some more unit tests, although IMO they would still be simple ones. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Thanks @osma. I think the Index will become much simpler if we remove the unused methods. ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 Hi @osma, I need one more favour from you. I need to understand the scenario in which the 'get' method of TextIndex gets called. Can you provide me an example SPARQL query which I can run from my JenaESTextExample.java class that would result in calling the 'get' method? ---
[GitHub] jena issue #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on the issue: https://github.com/apache/jena/pull/227 @osma Can you review now? I have made the changes to the Add and Delete API so that they are executed as a single REST call. I think we already have consensus on the Multilingual aspect. If not, please let me know and we can have a discussion around it. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106198915 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String INDEX_NAME; + +static final String CLUSTER_NAME = "cluster.name"; + +static final String NUM_OF_SHARDS = "number_of_shards"; + +static final String NUM_OF_REPLICAS = "number_of_replicas"; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) throws Exception{ + +this.INDEX_NAME = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME, esSettings.getClusterName()).build(); +List addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocketTransportAddress socketAddresses[] = new InetSocketTransportAddress[addresses.size()]; +client = new PreBuiltTransportClient(settings).addTransportAddresses(addresses.toArray(s
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106199137 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106199017 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106158114 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106156794 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106156573 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String INDEX_NAME; + +static final String CLUSTER_NAME = "cluster.name"; + +static final String NUM_OF_SHARDS = "number_of_shards"; + +static final String NUM_OF_REPLICAS = "number_of_replicas"; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) throws Exception{ + +this.INDEX_NAME = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME, esSettings.getClusterName()).build(); +List<InetSocketTransportAddress> addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocketTransportAddress socketAddresses[] = new InetSocketTransportAddress[addresses.size()]; +client = new PreBuiltTransportClient(settings).addTransportAddresses(addresses.toArray(s
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106155260 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String INDEX_NAME; + +static final String CLUSTER_NAME = "cluster.name"; + +static final String NUM_OF_SHARDS = "number_of_shards"; + +static final String NUM_OF_REPLICAS = "number_of_replicas"; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) throws Exception{ + +this.INDEX_NAME = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME, esSettings.getClusterName()).build(); +List<InetSocketTransportAddress> addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocketTransportAddress socketAddresses[] = new InetSocketTransportAddress[addresses.size()]; +client = new PreBuiltTransportClient(settings).addTransportAddresses(addresses.toArray(s
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106153769 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. Defaults to 'test' --- End diff -- done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106154731 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String INDEX_NAME; + +static final String CLUSTER_NAME = "cluster.name"; + +static final String NUM_OF_SHARDS = "number_of_shards"; + +static final String NUM_OF_REPLICAS = "number_of_replicas"; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) throws Exception{ + +this.INDEX_NAME = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME, esSettings.getClusterName()).build(); +List<InetSocketTransportAddress> addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocketTransportAddress socketAddresses[] = new InetSocketTransportAddress[addresses.size()]; +client = new PreBuiltTransportClient(settings).addTransportAddresses(addresses.toArray(s
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106154642 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String INDEX_NAME; + +static final String CLUSTER_NAME = "cluster.name"; + +static final String NUM_OF_SHARDS = "number_of_shards"; + +static final String NUM_OF_REPLICAS = "number_of_replicas"; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) throws Exception{ + +this.INDEX_NAME = esSettings.getIndexName(); +this.docDef = config.getEntDef(); + + +this.isMultilingual = config.isMultilingualSupport(); +if (this.isMultilingual && config.getEntDef().getLangField() == null) { +//multilingual index cannot work without lang field +docDef.setLangField("lang"); +} +if(client == null) { + +LOGGER.debug("Initializing the Elastic Search Java Client with settings: " + esSettings); +Settings settings = Settings.builder() +.put(CLUSTER_NAME, esSettings.getClusterName()).build(); +List<InetSocketTransportAddress> addresses = new ArrayList<>(); +for(String host: esSettings.getHostToPortMapping().keySet()) { +InetSocketTransportAddress addr = new InetSocketTransportAddress(InetAddress.getByName(host), esSettings.getHostToPortMapping().get(host)); +addresses.add(addr); +} + +InetSocketTransportAddress socketAddresses[] = new InetSocketTransportAddress[addresses.size()]; +client = new PreBuiltTransportClient(settings).addTransportAddresses(addresses.toArray(s
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106154483 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String INDEX_NAME; + +static final String CLUSTER_NAME = "cluster.name"; --- End diff -- Renamed so that each of them ends with _PARAM. ---
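The rename described in this reply could look roughly like the sketch below. Only the `_PARAM` suffix and the three key strings come from the thread; the exact constant names, the `EsParamNames` class, and the `indexSettings` helper are assumptions made for illustration and are not taken from the pull request. A plain `Map` stands in for the Elasticsearch `Settings` builder so that the sketch stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the renamed setting keys; not the actual TextIndexES code. */
final class EsParamNames {
    // The string values are the keys quoted in the diff; the constant names are assumed.
    static final String CLUSTER_NAME_PARAM = "cluster.name";
    static final String NUM_OF_SHARDS_PARAM = "number_of_shards";
    static final String NUM_OF_REPLICAS_PARAM = "number_of_replicas";

    private EsParamNames() {
        // Constants only; no instances.
    }

    /** Illustrative helper: collects index-level settings under the renamed keys. */
    static Map<String, Object> indexSettings(int shards, int replicas) {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put(NUM_OF_SHARDS_PARAM, shards);
        settings.put(NUM_OF_REPLICAS_PARAM, replicas);
        return settings;
    }
}
```

The suffix makes it visible at the call site that the constant is the name of a setting rather than its value, which is presumably why the rename was requested.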
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106154421 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,427 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. Defaults to 'test' + */ +private final String INDEX_NAME; --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106152321 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/TextAssembler.java --- @@ -29,14 +29,15 @@ public static void init() AssemblerUtils.registerDataset(TextVocab.textDataset, new TextDatasetAssembler()) ; Assembler.general.implementWith(TextVocab.entityMap,new EntityDefinitionAssembler()) ; -Assembler.general.implementWith(TextVocab.textIndexSolr,new TextIndexSolrAssembler()) ; Assembler.general.implementWith(TextVocab.textIndexLucene, new TextIndexLuceneAssembler()) ; Assembler.general.implementWith(TextVocab.standardAnalyzer, new StandardAnalyzerAssembler()) ; Assembler.general.implementWith(TextVocab.simpleAnalyzer, new SimpleAnalyzerAssembler()) ; Assembler.general.implementWith(TextVocab.keywordAnalyzer, new KeywordAnalyzerAssembler()) ; Assembler.general.implementWith(TextVocab.lowerCaseKeywordAnalyzer, new LowerCaseKeywordAnalyzerAssembler()) ; Assembler.general.implementWith(TextVocab.localizedAnalyzer, new LocalizedAnalyzerAssembler()) ; Assembler.general.implementWith(TextVocab.configurableAnalyzer, new ConfigurableAnalyzerAssembler()) ; +Assembler.general.implementWith(TextVocab.textIndexES, new TextIndexESAssembler()) ; --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106151647 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/ESSettings.java --- @@ -0,0 +1,177 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.jena.query.text; + +import java.util.HashMap; +import java.util.Map; + +/** + * Settings for ElasticSearch based indexing + */ +public class ESSettings { + +/** + * Map of hosts and ports. The host could also be an IP Address + */ +private Map<String,Integer> hostToPortMapping; + +/** + * Name of the Cluster. Defaults to 'elasticsearch' + */ +private String clusterName; + +/** + * Number of shards. Defaults to '1' + */ +private Integer shards; + +/** + * Number of replicas. Defaults to '1' + */ +private Integer replicas; + +/** + * Name of the index. 
Defaults to 'test' + */ +private String indexName; + + +public Map<String, Integer> getHostToPortMapping() { +return hostToPortMapping; +} + +public void setHostToPortMapping(Map<String, Integer> hostToPortMapping) { +this.hostToPortMapping = hostToPortMapping; +} + +public ESSettings.Builder builder() { +return new ESSettings.Builder(); +} + +/** + * Convenient builder class for building ESSettings + */ +public static class Builder { + +ESSettings settings; + +public Builder() { +this.settings = new ESSettings(); +this.settings.setClusterName("elasticsearch"); +this.settings.setShards(1); +this.settings.setReplicas(1); +this.settings.setHostToPortMapping(new HashMap<>()); +this.settings.setIndexName("test"); --- End diff -- Done ---
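For readers following the builder discussion, here is a minimal, self-contained sketch of a settings object with the defaults documented in the quoted Javadoc (cluster `elasticsearch`, one shard, one replica, index `test`). The class name `EsSettingsSketch` and the method names `hostAndPort`, `clusterName` and `indexName` are hypothetical, and the thread does not say what the reply "Done" changed, so the defaults in the merged code may differ.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified variant of the settings class quoted in the diff. */
final class EsSettingsSketch {
    private final Map<String, Integer> hostToPortMapping;
    private final String clusterName;
    private final int shards;
    private final int replicas;
    private final String indexName;

    private EsSettingsSketch(Builder b) {
        // Copy the map so later changes to the builder do not leak into the settings.
        this.hostToPortMapping = Collections.unmodifiableMap(new HashMap<>(b.hostToPortMapping));
        this.clusterName = b.clusterName;
        this.shards = b.shards;
        this.replicas = b.replicas;
        this.indexName = b.indexName;
    }

    /** Static factory, so no existing settings instance is needed to obtain a builder. */
    static Builder builder() {
        return new Builder();
    }

    Map<String, Integer> getHostToPortMapping() { return hostToPortMapping; }
    String getClusterName() { return clusterName; }
    int getShards() { return shards; }
    int getReplicas() { return replicas; }
    String getIndexName() { return indexName; }

    static final class Builder {
        // Defaults as documented in the quoted Javadoc.
        private final Map<String, Integer> hostToPortMapping = new HashMap<>();
        private String clusterName = "elasticsearch";
        private int shards = 1;
        private int replicas = 1;
        private String indexName = "test";

        Builder hostAndPort(String host, int port) {
            hostToPortMapping.put(host, port);
            return this;
        }

        Builder clusterName(String name) {
            this.clusterName = name;
            return this;
        }

        Builder indexName(String name) {
            this.indexName = name;
            return this;
        }

        EsSettingsSketch build() {
            return new EsSettingsSketch(this);
        }
    }
}
```

The quoted diff declares `builder()` as an instance method, which requires an `ESSettings` object to exist before a builder can be obtained. The sketch uses a static factory instead; that is a design choice of the sketch, not something the pull request is known to have adopted.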
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106151289 --- Diff: jena-text/src/main/java/examples/JenaESTextExample.java --- @@ -0,0 +1,65 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package examples; + +import org.apache.jena.query.Dataset; +import org.apache.jena.query.DatasetFactory; + +/** + * Simple example class to test the {@link org.apache.jena.query.text.assembler.TextIndexESAssembler} + * For this class to work properly, an elasticsearch node should be up and running, otherwise it will fail. + * You can find the details of downloading and running an ElasticSearch version here: https://www.elastic.co/downloads/past-releases/elasticsearch-5-2-1 + * Unzip the file in your favourite directory and then execute the appropriate file under the bin directory. + * It will take less than a minute. + * In order to visualize what is written in ElasticSearch, you need to download and run Kibana: https://www.elastic.co/downloads/kibana + * To run kibana, just go to the bin directory and execute the appropriate file. + * We need to resort to this mechanism as ElasticSearch has stopped supporting embedded ElasticSearch. 
+ * + * In addition we cant have it in the test package because ElasticSearch + * detects the thread origin and stops us from instantiating a client. + */ +public class JenaESTextExample { + +public static void main(String[] args) { + +queryData(loadData(createAssembler())); +} + + +private static Dataset createAssembler() { +String assemblerFile = "text-config-es.ttl"; +Dataset ds = DatasetFactory.assemble(assemblerFile, +"http://localhost/jena_example/#text_dataset") ; +return ds; +} + +private static Dataset loadData(Dataset ds) { +JenaTextExample1.loadData(ds, "data-es.ttl"); +return ds; +} + +/** + * The data being queried from ElasticSearch is proper but what is getting printed is wrong. --- End diff -- Yes, fixed. ---
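Because the example class fails when no Elasticsearch node is running, a caller may want to check reachability first and report a clearer error. The sketch below uses only the JDK; the class `EsNodeProbe` and its method are hypothetical and are not part of the pull request.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether something accepts TCP connections on host:port. */
final class EsNodeProbe {
    private EsNodeProbe() {
        // Static helper only.
    }

    /**
     * Returns true if a TCP connection to host:port succeeds within timeoutMillis.
     * This only shows that a listener exists, not that it is a healthy Elasticsearch node.
     */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }
}
```

A caller might run `EsNodeProbe.isReachable("localhost", 9300, 2000)` before assembling the dataset. 9300 is Elasticsearch's default transport port, but the port actually configured in `text-config-es.ttl` is not shown in this thread.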
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106153318 --- Diff: jena-text/src/test/java/org/apache/jena/query/text/TestTextIndexES.java --- @@ -0,0 +1,184 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.jena.query.text; + + + +import org.apache.jena.graph.Node; +import org.apache.jena.vocabulary.RDFS; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.test.ESIntegTestCase; +import org.junit.Assert; +import org.junit.Ignore; +import org.junit.Test; + +import java.util.List; +import java.util.Map; +import java.util.concurrent.ExecutionException; + +/** + * + * Integration test for {@link TextIndexES} class + * ES Integration test depends on security policies that may sometime not be loaded properly. 
+ * If you find any issues regarding security set the following VM argument to resolve the issue: + * -Dtests.security.manager=false + * + */ +@ESIntegTestCase.ClusterScope() +public class TestTextIndexES extends ESIntegTestCase { --- End diff -- Unfortunately, currently I do not have/know a mechanism to suppress the logs. But I will dig a bit deeper. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106151436 --- Diff: jena-text/src/main/java/examples/JenaTextExample1.java --- @@ -41,9 +41,9 @@ public static void main(String ... argv) { -Dataset ds = createCode() ; -//Dataset ds = createAssembler() ; -loadData(ds , "data.ttl") ; +//Dataset ds = createCode() ; +Dataset ds = createAssembler() ; --- End diff -- Reverted changes to Lucene Example. It was an accidental checkin. ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106151951 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextDatasetFactory.java --- @@ -27,7 +27,7 @@ import org.apache.jena.system.JenaSystem ; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.store.Directory ; -import org.apache.solr.client.solrj.SolrServer ; +import org.elasticsearch.indices.IndexCreationException; --- End diff -- Agree. Made the changes ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106152482 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/assembler/TextIndexESAssembler.java --- @@ -0,0 +1,129 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text.assembler; + +import org.apache.jena.assembler.Assembler; +import org.apache.jena.assembler.Mode; +import org.apache.jena.assembler.assemblers.AssemblerBase; +import org.apache.jena.query.text.*; +import org.apache.jena.rdf.model.RDFNode; +import org.apache.jena.rdf.model.Resource; +import org.apache.jena.rdf.model.Statement; +import org.apache.jena.sparql.util.graph.GraphUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashMap; +import java.util.Map; + +import static org.apache.jena.query.text.assembler.TextVocab.*; + +public class TextIndexESAssembler extends AssemblerBase { + +private static Logger LOGGER = LoggerFactory.getLogger(TextIndexESAssembler.class) ; + +protected static final String COMMA = ","; +protected static final String COLON = ":"; +/* +<#index> a :TextIndexES ; +text:serverList "127.0.0.1:9300,127.0.0.2:9400,127.0.0.3:9500" ; #Comma separated list of hosts:ports +text:clusterName "elasticsearch" +text:shards "1" +text:replicas "1" +text:entityMap <#endMap> ; +. +*/ + +@SuppressWarnings("resource") +@Override +public TextIndex open(Assembler a, Resource root, Mode mode) { +try { +String listOfHostsAndPorts = GraphUtils.getAsStringValue(root, pServerList) ; +if(listOfHostsAndPorts == null || listOfHostsAndPorts.isEmpty()) { +throw new TextIndexException("Mandatory property text:serverList (containing the comma-separated list of host:port) property is not specified. " + +"An example value for the property: 127.0.0.1:9300"); +} +String[] hosts = listOfHostsAndPorts.split(COMMA); +Map<String,Integer> hostAndPortMapping = new HashMap<>(); +for(String host : hosts) { +String[] hostAndPort = host.split(COLON); +if(hostAndPort.length < 2) { +LOGGER.error("Either the host or the port value is missing.Please specify the property in host:port format. " + +"Both parts are mandatory. Ignoring this value. 
Moving to the next one."); +continue; +} +hostAndPortMapping.put(hostAndPort[0], Integer.valueOf(hostAndPort[1])); +} + +String clusterName = GraphUtils.getAsStringValue(root, pClusterName); +if(clusterName == null || clusterName.isEmpty()) { +LOGGER.warn("ClusterName property is not specified. Defaulting to 'elasticsearch'"); +clusterName = "elasticsearch"; +} + +String numberOfShards = GraphUtils.getAsStringValue(root, pShards); +if(numberOfShards == null || numberOfShards.isEmpty()) { +LOGGER.warn("shards property is not specified. Defaulting to '1'"); +numberOfShards = "1"; +} + +String replicationFactor = GraphUtils.getAsStringValue(root, pReplicas); +if(replicationFactor == null || replicationFactor.isEmpty()) { +LOGGER.warn("replicas property is not specified. Defaulting to '1'"); +replicationFactor = "1"; +} +
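The host:port handling in the quoted diff can be illustrated in isolation. The following is a minimal sketch, not the code from the pull request: `ServerListParser` and its `parse` method are hypothetical names, and unlike the quoted code (which passes the port straight to `Integer.valueOf`) it is assumed here that entries with a non-numeric or out-of-range port should be skipped rather than abort the whole list. IPv6 literals are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Jena): parses "host:port,host:port" strings. */
final class ServerListParser {

    private ServerListParser() {}

    /**
     * Splits a comma-separated server list into a host-to-port map.
     * Malformed entries are skipped; an empty list is rejected outright.
     */
    static Map<String, Integer> parse(String serverList) {
        if (serverList == null || serverList.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "serverList must contain at least one host:port entry, e.g. 127.0.0.1:9300");
        }
        // LinkedHashMap keeps the hosts in the order they were listed.
        Map<String, Integer> hostToPort = new LinkedHashMap<>();
        for (String entry : serverList.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2 || parts[0].isEmpty()) {
                continue; // host or port missing: skip this entry
            }
            try {
                int port = Integer.parseInt(parts[1].trim());
                if (port < 1 || port > 65535) {
                    continue; // not a usable TCP port
                }
                hostToPort.put(parts[0].trim(), port);
            } catch (NumberFormatException e) {
                // Assumption of this sketch: a non-numeric port is skipped, not fatal.
            }
        }
        return hostToPort;
    }

    public static void main(String[] args) {
        System.out.println(parse("127.0.0.1:9300, 127.0.0.2:9400,badentry,127.0.0.3:notaport"));
    }
}
```

Whether a bad entry should be skipped or should fail the assembler is a design choice; skipping matches the "Ignoring this value" log message in the diff, while failing fast would surface configuration typos earlier.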
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106153611 --- Diff: jena-text/pom.xml --- @@ -81,39 +81,50 @@ lucene-queryparser - - - solr-solrj - org.apache.solr - - - + + org.elasticsearch + elasticsearch + + + + org.elasticsearch.client + transport + + + + org.apache.lucene + lucene-test-framework + + + + org.elasticsearch.test + framework + + + + + junit + junit + + + org.hamcrest + hamcrest-core + + + + + --- End diff -- One is core and one is api dependency ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106151618 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/ESSettings.java --- @@ -0,0 +1,177 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.jena.query.text; + +import java.util.HashMap; +import java.util.Map; + +/** + * Settings for ElasticSearch based indexing + */ +public class ESSettings { + +/** + * Map of hosts and ports. The host could also be an IP Address + */ +private Map<String,Integer> hostToPortMapping; + +/** + * Name of the Cluster. Defaults to 'elasticsearch' + */ +private String clusterName; + +/** + * Number of shards. Defaults to '1' + */ +private Integer shards; + +/** + * Number of replicas. Defaults to '1' + */ +private Integer replicas; + +/** + * Name of the index. Defaults to 'test' --- End diff -- Done. ---
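The defaults documented in the Javadoc above can be made concrete with a small sketch. This is an illustration only, not the `ESSettings` class from the pull request: `EsSettingsSketch` is a hypothetical name, the default values simply mirror the Javadoc in the quoted diff revision (they may differ from what was eventually merged), and the range checks are an assumption of this sketch.

```java
/** Hypothetical settings holder (not Jena's ESSettings): resolves defaults once, at construction. */
final class EsSettingsSketch {

    // Defaults as stated in the Javadoc of the quoted diff.
    static final String DEFAULT_CLUSTER_NAME = "elasticsearch";
    static final int DEFAULT_SHARDS = 1;
    static final int DEFAULT_REPLICAS = 1;
    static final String DEFAULT_INDEX_NAME = "test";

    final String clusterName;
    final int shards;
    final int replicas;
    final String indexName;

    /** Any argument may be null (or blank, for strings) to request the default. */
    EsSettingsSketch(String clusterName, Integer shards, Integer replicas, String indexName) {
        this.clusterName = isBlank(clusterName) ? DEFAULT_CLUSTER_NAME : clusterName;
        this.shards = (shards == null) ? DEFAULT_SHARDS : shards;
        this.replicas = (replicas == null) ? DEFAULT_REPLICAS : replicas;
        this.indexName = isBlank(indexName) ? DEFAULT_INDEX_NAME : indexName;
        // Assumption of this sketch: reject values that cannot describe a usable index.
        if (this.shards < 1) {
            throw new IllegalArgumentException("shards must be at least 1");
        }
        if (this.replicas < 0) {
            throw new IllegalArgumentException("replicas must not be negative");
        }
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        EsSettingsSketch s = new EsSettingsSketch(null, null, null, null);
        System.out.println(s.clusterName + " " + s.shards + " " + s.replicas + " " + s.indexName);
    }
}
```

Resolving the defaults in one place keeps callers such as an assembler from repeating the null-or-empty checks for each property.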
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
Github user anujgandharv commented on a diff in the pull request: https://github.com/apache/jena/pull/227#discussion_r106145497 --- Diff: jena-text/src/main/java/org/apache/jena/query/text/TextIndexES.java --- @@ -0,0 +1,425 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.jena.query.text; + +import org.apache.jena.graph.Node; +import org.apache.jena.graph.NodeFactory; +import org.apache.jena.sparql.util.NodeFactoryExtra; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsRequest; +import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; +import org.elasticsearch.action.get.GetResponse; +import org.elasticsearch.action.index.IndexRequest; +import org.elasticsearch.action.search.SearchResponse; +import org.elasticsearch.action.update.UpdateRequest; +import org.elasticsearch.action.update.UpdateResponse; +import org.elasticsearch.client.Client; +import org.elasticsearch.client.transport.TransportClient; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.common.transport.InetSocketTransportAddress; +import org.elasticsearch.common.xcontent.XContentBuilder; +import org.elasticsearch.index.get.GetField; +import org.elasticsearch.index.query.QueryBuilders; +import org.elasticsearch.script.Script; +import org.elasticsearch.search.SearchHit; +import org.elasticsearch.transport.client.PreBuiltTransportClient; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.net.InetAddress; +import java.util.*; + +import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder; + +/** + * Elastic Search Implementation of {@link TextIndex} + * + */ +public class TextIndexES implements TextIndex { + +/** + * The definition of the Entity we are trying to Index + */ +private final EntityDefinition docDef ; + +/** + * Thread safe ElasticSearch Java Client to perform Index operations + */ +private static Client client; + +/** + * The name of the index. 
Defaults to 'test' + */ +private final String INDEX_NAME; + +static final String CLUSTER_NAME = "cluster.name"; + +static final String NUM_OF_SHARDS = "number_of_shards"; + +static final String NUM_OF_REPLICAS = "number_of_replicas"; + +private boolean isMultilingual ; + +private static final Logger LOGGER = LoggerFactory.getLogger(TextIndexES.class) ; + +public TextIndexES(TextIndexConfig config, ESSettings esSettings) throws Exception{ --- End diff -- Done ---
[GitHub] jena pull request #227: JENA-1305 | Elastic search support for Jena Text
GitHub user anujgandharv opened a pull request: https://github.com/apache/jena/pull/227 JENA-1305 | Elastic search support for Jena Text Implemented ES support for Jena Text Indexing capability You can merge this pull request into a Git repository by running: $ git pull https://github.com/EaseTech/jena jena-1301-es-support Alternatively you can review and apply these changes as the patch at: https://github.com/apache/jena/pull/227.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #227 ---