BTW, I have one more question: how do I add more than one field to be indexed in my index? Basically, if I want to index both rdfs:label and rdfs:comment in the same index document, how do I do it?
I tried:

    EntityDefinition entDef = new EntityDefinition(DOC_TYPE, FIELD_TO_SEARCH);
    entDef.setPrimaryPredicate(RDFS.label);
    entDef.setGraphField(GRAPH_FIELD_NAME);
    entDef.set("comment", RDFS.comment.asNode());

But it doesn't work. Could you please point me to a way to do it? This is an important piece of functionality that I need.

Thanks,
Anuj Kumar

On Wed, Mar 1, 2017 at 3:59 PM, anuj kumar <anuj.gandh...@gmail.com> wrote:

> I personally have no preference as to how the code in Jena should be structured, as long as I am able to use it :).
> I have a personal preference for doing it in a specific way because, IMO, it is modular, which makes it much easier to maintain in the long run. But again, it may not be the quickest one.
>
> I have already been given a deadline by the company to have the ES extension implemented in the next 15 days :). What this means is that I will be maintaining the ES code extension to Jena Text, at least locally, for the coming period of time. I would be more than happy to contribute to the Jena community whatever is required to have a proper ElasticSearch implementation in place, whether within the jena-text module or as a separate module. Until Lucene and Solr are upgraded to the latest version, I will have to maintain a separate module for jena-text-es.
>
> Cheers!
> Anuj Kumar
>
> On Wed, Mar 1, 2017 at 3:36 PM, A. Soroka <aj...@virginia.edu> wrote:
>
>> Osma--
>>
>> The short answer is that yes, given the right tools you _can_ have different versions of code accessible in different ways. The longer answer is that it's probably not a viable alternative for Jena for this problem, at least not without a lot of other change.
>>
>> You are right to point to the classloader mechanism as being at the heart of this question, but I must alter your remark just slightly.
>> From "the Java classloader only sees a single, flat package/class namespace and a set of compiled classes" to "ANY GIVEN Java classloader only sees a single, flat package/class namespace and a set of compiled classes".
>>
>> This is the fact that OSGi uses to make it possible to maintain strict module boundaries (and even dynamic module relationships at run-time). Each OSGi bundle sees its own classloader, and the framework is responsible for connecting bundles up to ensure that every bundle has what it needs in the way of types to function, based on metadata that the bundles provide to the framework. It's an incredibly powerful system (I use it every day and enjoy it enormously) but it's also very "heavy" and requires a good deal of investment to use. In particular, it's probably too large to put _inside_ Jena. (I frequently put Jena inside an OSGi instance, on the other hand.)
>>
>> Java 9 Jigsaw [1] offers some possibility for strong modularization of this kind, but it's really meant for the JDK itself, not application libraries. In theory, we could "roll our own" classloader management for this problem. That sounds like more than a bit of a rabbit hole to me. There might be another, more lightweight, toolkit out there to this purpose, but I'm not aware of any myself.
>>
>> Otherwise, yes, you get into shading and the like. We have to do that for Guava for now because of HADOOP-10101 (grumble grumble) but it's hardly a thing we want to do any more of than needed, I don't think.
>>
>> ---
>> A. Soroka
>> The University of Virginia Library
>>
>> [1] http://openjdk.java.net/projects/jigsaw/
>>
>> > On Mar 1, 2017, at 9:03 AM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
>> >
>> > Hi Anuj!
>> >
>> > Thanks for the clarification.
>> >
>> > However, I'm still not sure I understand the situation completely.
>> > I know Maven can perform a lot of tricks, but Maven modules are just convenient ways to structure a Java project. Maven cannot change the fact that at runtime, module divisions don't really matter (except that they usually correspond to package sub-namespaces) and the Java classloader only sees a single, flat package/class namespace and a set of compiled classes (usually within JARs) in the classpath that it needs to check to find the right classes, and if there are two versions of the same library (e.g. Lucene) with overlapping class names, that's going to cause trouble. The only way around that is to shade some of the libraries, i.e. rename them so that they end up in another, non-conflicting namespace. Apparently Elasticsearch also did some of that in the past [1] but nowadays tries to avoid it.
>> >
>> > Does your assumption 1 ("At a given point in time, only a single Indexing Technology is used") imply that in the assembler configuration, you cannot have ja:loadClass declarations for both Lucene and ES backends? Or how do you run something like Fuseki that contains (in a single big JAR) both the jena-text and jena-text-es modules with all their dependencies, one of which requires the Lucene 4.x classes and the other one the Lucene 6.4.1 classes? How do you ensure that only one of them is used at a time, and that the Java classloader, even though it has access to both versions of Lucene, only loads classes from the single, correct one and not the other? Or do you need to have separate "Fuseki-Lucene" and "Fuseki-ES" packages, so that you don't end up with two Lucene versions within the same Fuseki JAR?
>> >
>> > -Osma
>> >
>> > [1] https://www.elastic.co/blog/to-shade-or-not-to-shade
>> >
>> > 01.03.2017, 11:03, anuj kumar wrote:
>> >> Hi Osma,
>> >>
>> >> I understand what you are saying.
>> >> There are ways to mitigate risks and balance the refactoring without affecting the existing modules. But I will not delve into those now. I am not an expert in Jena to convincingly say that it is possible, without any hiccups. But I can take a guess and say that it is indeed possible :)
>> >>
>> >> For the question: "is it even possible to mix modules that depend on different versions of the Lucene libraries within the same project?"
>> >>
>> >> I actually do not understand what you mean by mixing modules. I assume you mean having jena-text and jena-text-es as dependencies in a build without causing the build to conflict. If that is what you mean, then the answer is yes, it is possible and quite simple as well. Let me explain how it is possible. But before that, some assumptions which I want to call out explicitly.
>> >>
>> >> *Assumptions:*
>> >> 1. At a given point in time, only a single Indexing Technology is used for text based indexing and searching via Jena. What this means is that we will either use the Lucene Implementation OR the Solr Implementation OR the ES Implementation at any given point in time.
>> >> 2. The Fuseki build does not depend on any Lucene 4.9.1 specific classes but only on jena-text classes, if at all.
>> >>
>> >> Based on these assumptions it is possible to create a build that contains jena-text based common classes + ES specific classes without any compatibility issues. And it is in fact quite simple. I did it in the current jena-text-es module and ran the entire build, which succeeded.
>> >> The key is to include the latest Lucene dependencies at the very beginning in the pom and then include the jena-text dependency. Maven will then automatically resolve the dependency issues by including the Lucene libraries that we included in our ES specific pom.
>> >> Have a look at the pom of the jena-text-es module here to see how it can be done:
>> >> https://github.com/EaseTech/jena/blob/master/jena-text-es/pom.xml
>> >>
>> >> Thanks,
>> >> Anuj Kumar
>> >>
>> >> On Wed, Mar 1, 2017 at 7:27 AM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
>> >>
>> >>> Hi Anuj,
>> >>>
>> >>> I understand your concerns. However, we also need to balance between the needs of individual modules/features and the whole codebase. I'm willing to put in the effort to keep the other modules up to date with newer Lucene versions. Lucene upgrade requirements are well documented; the only hitches seen in JENA-1250 were related to how jena-text (ab)used some Lucene features that were dropped from newer versions.
>> >>>
>> >>> A perhaps stupid question to more experienced Java developers: is it even possible to mix modules that depend on different versions of the Lucene libraries within the same project? In my (quite limited) understanding of Java projects and libraries, this requires special arrangements (e.g. shading) as the Java package/class namespace is shared by all the code running within the same JVM.
>> >>>
>> >>> So can you create, say, a Fuseki build that contains the current jena-text module (depending on Lucene 4.x) and the new jena-text-es module (depending on Lucene 6.4.1) without any compatibility issues?
>> >>>
>> >>> -Osma
>> >>>
>> >>> 01.03.2017, 00:47, anuj kumar wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> My 2 Cents :
>> >>>>
>> >>>> The reason I proposed to have separate modules for Lucene, Solr and ES is exactly for avoiding the "All or Nothing" approach we need to take if we club them all together.
>> >>>> If they stay together and if in the near future I want to upgrade ES to another version, I also need to again upgrade Lucene and Solr and possibly another implementation that may have been added during the time. As we all know, this means weeks of work if not months to get the changes released. This will personally de-motivate me to do anything and I will probably start maintaining my version of Jena-Text, as that would be much simpler to do than to upgrade and test and in the process own (read: fix bugs in) the upgrade for each and every technology.
>> >>>>
>> >>>> If they are developed as separate modules, they can evolve independently of each other and we can avoid situations where we can't upgrade to the latest version of Lucene because we do not know what effect it will have on the Solr Implementation.
>> >>>>
>> >>>> We can start with having a separate Module for Jena Text ES and see how things go. If they go well, we could extract Solr and Lucene out of Jena Text.
>> >>>>
>> >>>> Again, this is just a suggestion based on my limited industry experience.
>> >>>>
>> >>>> Thanks,
>> >>>> Anuj Kumar
>> >>>>
>> >>>> On Tue, Feb 28, 2017 at 5:23 PM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
>> >>>>
>> >>>>> 28.02.2017, 17:12, A. Soroka wrote:
>> >>>>>
>> >>>>>> https://lists.apache.org/thread.html/dce0d502b11891c28e57bbcbb0cdef27d8374d58d9634076b8ef4cd7@1431107516@%3Cdev.jena.apache.org%3E
>> >>>>>> ? In other words, might it be better to factor out between -text and -spatial and _then_ try to upgrade the Lucene version?
>> >>>>>
>> >>>>> I certainly wouldn't object to that, but somebody has to volunteer to do the actual work!
>> >>>>>
>> >>>>>> I don't use the Solr component now, but I could easily see so doing...
>> >>>>>> that's pretty vague, I know, and I'm not in a position to do any work to maintain it, so consider that just a very small and blurry data point. :)
>> >>>>>
>> >>>>> Last time I tried it (it was a while ago) I couldn't figure out how to get it running... If you could just try that with some toy data, then your data point would be a lot less blurry :) I haven't used Solr for anything, so I'm not very familiar with how to set it up, and the jena-text instructions are pretty vague unfortunately.
>> >>>>>
>> >>>>> -Osma
>> >>>>>
>> >>>>> --
>> >>>>> Osma Suominen
>> >>>>> D.Sc. (Tech), Information Systems Specialist
>> >>>>> National Library of Finland
>> >>>>> P.O. Box 26 (Kaikukatu 4)
>> >>>>> 00014 HELSINGIN YLIOPISTO
>> >>>>> Tel. +358 50 3199529
>> >>>>> osma.suomi...@helsinki.fi
>> >>>>> http://www.nationallibrary.fi
>
> --
> *Anuj Kumar*

--
*Anuj Kumar*
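
A side note on the classloader point in the quoted thread above ("ANY GIVEN Java classloader only sees a single, flat package/class namespace"): the following is only a minimal sketch of that property, not anything taken from Jena, Lucene, or OSGi. It assumes a JDK is available at run time, because it uses the javax.tools compiler to produce a throwaway class, and the class name Widget and the temp-directory prefix are hypothetical placeholders. It loads the same class name through two sibling loaders and checks that each loader ends up with its own Class object.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class LoaderIsolationDemo {

    /** Compiles a tiny placeholder class (hypothetical name "Widget") into a fresh temp directory. */
    static Path compileWidget() throws IOException {
        Path dir = Files.createTempDirectory("loader-demo");
        Path src = dir.resolve("Widget.java");
        Files.write(src, "public class Widget { }".getBytes(StandardCharsets.UTF_8));

        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("A JDK (not just a JRE) is required for this sketch");
        }
        // Passing null for the streams uses System.in/out/err; a non-zero status means failure.
        int status = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed with status " + status);
        }
        return dir; // The temp directory is left behind; clean-up is omitted for brevity.
    }

    /** Loads "Widget" through two sibling loaders and reports whether they defined distinct classes. */
    static boolean loadsDistinctClasses() {
        try {
            URL[] classpath = { compileWidget().toUri().toURL() };
            // Parent is null, so each loader delegates only to the bootstrap loader
            // and has to define Widget itself.
            try (URLClassLoader first = new URLClassLoader(classpath, null);
                 URLClassLoader second = new URLClassLoader(classpath, null)) {
                Class<?> a = first.loadClass("Widget");
                Class<?> b = second.loadClass("Widget");
                // Same fully qualified name, but two separate Class objects.
                return a != b && a.getName().equals(b.getName());
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Distinct classes with the same name: " + loadsDistinctClasses());
    }
}
```

This kind of per-loader isolation is what the quoted message attributes to OSGi. A single flat classpath, such as one big Fuseki JAR, goes through one loader and so does not get this separation for free.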