I agree, Osma. If Lucene is upgraded to 6.4.1, it would be much easier for me
to integrate the Elasticsearch implementation.

But I am still waiting for someone to provide me a hint as to how I can
index multiple predicate values. This is the most pressing issue for me
currently.

Thanks,
Anuj Kumar

On 1 Mar 2017 19:27, "Osma Suominen" <[email protected]> wrote:

> Hi Anuj!
>
> I have nothing against modularity in general. However, I cannot see how
> your proposal could work in practice for the Fuseki build, due to the
> reasons I mentioned in my previous message (and Adam seemed to concur).
>
> In any case, I'll see what I can do to get the Lucene upgrade moving
> again. If all current Jena modules (ie jena-text and jena-spatial) were
> upgraded to Lucene 6.4.1, then you could just add your ES classes to
> jena-text, right? I think that would be better for everyone than having to
> maintain your own separate module.
>
> -Osma
>
> 01.03.2017, 16:59, anuj kumar wrote:
>
>> I personally have no preference as to how the code in Jena should be
>> structured, as long as I am able to use it :).
>> I do have a personal preference for doing it in a specific way because,
>> IMO, it is modular, which makes it much easier to maintain in the long run.
>> But again, it may not be the quickest one.
>>
>> I have already been given a deadline by the company to have the ES
>> extension implemented in the next 15 days :). What this means is that I
>> will be maintaining the ES code extension to Jena Text, at least locally,
>> for the coming period. I would be more than happy to contribute to the Jena
>> community whatever is required to have a proper Elasticsearch
>> implementation in place, whether within the jena-text module or as a
>> separate module. Until Lucene and Solr are upgraded to the latest
>> version, I will have to maintain a separate module for jena-text-es.
>>
>> Cheers!
>> Anuj Kumar
>>
>>
>> On Wed, Mar 1, 2017 at 3:36 PM, A. Soroka <[email protected]> wrote:
>>
>>> Osma--
>>>
>>> The short answer is that yes, given the right tools you _can_ have
>>> different versions of code accessible in different ways. The longer
>>> answer is that it's probably not a viable alternative for Jena for this
>>> problem, at least not without a lot of other change.
>>>
>>> You are right to point to the classloader mechanism as being at the heart
>>> of this question, but I must alter your remark just slightly. From "the
>>> Java classloader only sees a single, flat package/class namespace and a
>>> set of compiled classes" to "ANY GIVEN Java classloader only sees a single,
>>> flat package/class namespace and a set of compiled classes".
>>>
>>> This is the fact that OSGi uses to make it possible to maintain strict
>>> module boundaries (and even dynamic module relationships at run-time).
>>> Each OSGi bundle sees its own classloader, and the framework is
>>> responsible for connecting bundles up to ensure that every bundle has what
>>> it needs in the way of types to function, based on metadata that the
>>> bundles provide to the framework. It's an incredibly powerful system (I
>>> use it every day and enjoy it enormously) but it's also very "heavy" and
>>> requires a good deal of investment to use. In particular, it's probably
>>> too large to put _inside_ Jena. (I frequently put Jena inside an OSGi
>>> instance, on the other hand.)
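
The point that any given classloader has its own namespace can be
sketched in plain Java, without OSGi. This is only an illustration under
stated assumptions: LoaderIsolationDemo, Marker and IsolatingLoader are
hypothetical names, Marker stands in for a library class such as a Lucene
type, and the sketch assumes the compiled .class files can be read as
classpath resources. It is not how Jena, Fuseki or OSGi actually load
classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class LoaderIsolationDemo {

    /** Hypothetical stand-in for a library class (think of one Lucene type). */
    public static class Marker { }

    /** Child-first for one class name: defines it locally, delegates everything else. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // usual parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length); // defined by THIS loader
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the raw class file through the parent loader's resource lookup. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Marker through a brand-new isolating loader. */
    public static Class<?> loadIsolated() {
        String name = Marker.class.getName();
        ClassLoader parent = LoaderIsolationDemo.class.getClassLoader();
        try {
            return new IsolatingLoader(parent, name).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> a = loadIsolated();
        Class<?> b = loadIsolated();
        // Same fully qualified name in both loaders.
        System.out.println("same name:  " + a.getName().equals(b.getName()));
        // Identity comparison of the two Class objects defined by different loaders.
        System.out.println("same class: " + (a == b));
        // Identity comparison with the copy the application loader defined.
        System.out.println("same as application copy: " + (a == Marker.class));
    }
}
```

Each loader defines its own copy of the class under the same name, which
is the property a framework can build on to keep two versions of one
library apart. The cost is that objects of the two copies are not
interchangeable unless they share a type loaded by a common parent.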
>>>
>>> Java 9 Jigsaw [1] offers some possibility for strong modularization of
>>> this kind, but it's really meant for the JDK itself, not application
>>> libraries. In theory, we could "roll our own" classloader management for
>>> this problem. That sounds like more than a bit of a rabbit hole to me.
>>> There might be another, more lightweight, toolkit out there to this
>>> purpose, but I'm not aware of any myself.
>>>
>>> Otherwise, yes, you get into shading and the like. We have to do that for
>>> Guava for now because of HADOOP-10101 (grumble grumble) but it's hardly a
>>> thing we want to do any more of than needed, I don't think.
>>>
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>>
>>> [1] http://openjdk.java.net/projects/jigsaw/
>>>
>>> On Mar 1, 2017, at 9:03 AM, Osma Suominen <[email protected]> wrote:
>>>>
>>>> Hi Anuj!
>>>>
>>>> Thanks for the clarification.
>>>>
>>>> However, I'm still not sure I understand the situation completely. I
>>>> know Maven can perform a lot of tricks, but Maven modules are just
>>>> convenient ways to structure a Java project. Maven cannot change the fact
>>>> that at runtime, module divisions don't really matter (except that they
>>>> usually correspond to package sub-namespaces) and the Java classloader
>>>> only sees a single, flat package/class namespace and a set of compiled
>>>> classes (usually within JARs) in the classpath that it needs to check to
>>>> find the right classes. If there are two versions of the same library
>>>> (e.g. Lucene) with overlapping class names, that's going to cause trouble.
>>>> The only way around that is to shade some of the libraries, i.e. rename
>>>> them so that they end up in another, non-conflicting namespace. Apparently
>>>> Elasticsearch also did some of that in the past [1] but nowadays tries to
>>>> avoid it.
>>>
>>>>
>>>> Does your assumption 1 ("At a given point in time, only a single
>>>> Indexing Technology is used") imply that in the assembler configuration,
>>>> you cannot have ja:loadClass declarations for both Lucene and ES
>>>> backends? Or how do you run something like Fuseki that contains (in a
>>>> single big JAR) both the jena-text and jena-text-es modules with all
>>>> their dependencies, one of which requires the Lucene 4.x classes and the
>>>> other one the Lucene 6.4.1 classes? How do you ensure that only one of
>>>> them is used at a time, and that the Java classloader, even though it has
>>>> access to both versions of Lucene, only loads classes from the single,
>>>> correct one and not the other? Or do you need to have separate
>>>> "Fuseki-Lucene" and "Fuseki-ES" packages, so that you don't end up with
>>>> two Lucene versions within the same Fuseki JAR?
>>>
>>>>
>>>> -Osma
>>>>
>>>> [1] https://www.elastic.co/blog/to-shade-or-not-to-shade
>>>>
>>>> 01.03.2017, 11:03, anuj kumar wrote:
>>>>
>>>>> Hi Osma,
>>>>>
>>>>> I understand what you are saying. There are ways to mitigate risks and
>>>>> balance the refactoring without affecting the existing modules. But I
>>>>> will not delve into those now. I am not an expert in Jena to convincingly
>>>>> say that it is possible, without any hiccups. But I can take a guess and
>>>>> say that it is indeed possible :)
>>>>>
>>>>> For the question: "is it even possible to mix modules that depend on
>>>>> different versions of the Lucene libraries within the same project?"
>>>>>
>>>>> I actually do not understand what you mean by mixing modules. I assume
>>>>> you mean having jena-text and jena-text-es as dependencies in a build
>>>>> without causing the build to conflict. If that is what you mean, then the
>>>>> answer is yes, it is possible, and quite simple as well. Let me explain
>>>>> how. But before that, some assumptions which I want to call out
>>>>> explicitly.
>>>>>
>>>>> *Assumptions:*
>>>>> 1. At a given point in time, only a single Indexing Technology is used
>>>>> for text based indexing and searching via Jena. What this means is that
>>>>> we will either use Lucene Implementation OR Solr Implementation OR ES
>>>>> Implementation at any given point in time.
>>>>> 2. Fuseki build does not depend on any Lucene 4.9.1 specific classes
>>>>> but only on jena-text classes, if at all.
>>>>>
>>>>> Based on these assumptions it is possible to create a build that
>>>>> contains jena-text based common classes + ES specific classes without any
>>>>> compatibility issues. And it is in fact quite simple. I did it in the
>>>>> current jena-text-es module and ran the entire build, which succeeded.
>>>>> The key is to include the latest Lucene dependencies at the very
>>>>> beginning in the pom and then include the jena-text dependency. Maven
>>>>> will then automatically resolve the dependency issues by including the
>>>>> Lucene libraries that we included in our ES specific pom. Have a look at
>>>>> the pom of the jena-text-es module here to see how it can be done:
>>>>> https://github.com/EaseTech/jena/blob/master/jena-text-es/pom.xml
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Anuj Kumar
>>>>>
>>>>>
>>>>> On Wed, Mar 1, 2017 at 7:27 AM, Osma Suominen <[email protected]> wrote:
>>>>>
>>>>>> Hi Anuj,
>>>>>>
>>>>>> I understand your concerns. However, we also need to balance between
>>>>>> the needs of individual modules/features and the whole codebase. I'm
>>>>>> willing to put in the effort to keep the other modules up to date with
>>>>>> newer Lucene versions. Lucene upgrade requirements are well documented,
>>>>>> the only hitches seen in JENA-1250 were related to how jena-text
>>>>>> (ab)used some Lucene features that were dropped from newer versions.
>>>>>>
>>>>>> A perhaps stupid question to more experienced Java developers: is it
>>>>>> even possible to mix modules that depend on different versions of the
>>>>>> Lucene libraries within the same project? In my (quite limited)
>>>>>> understanding of Java projects and libraries, this requires special
>>>>>> arrangements (e.g. shading) as the Java package/class namespace is
>>>>>> shared by all the code running within the same JVM.
>>>>>>
>>>>>> So can you create, say, a Fuseki build that contains the current
>>>>>> jena-text module (depending on Lucene 4.x) and the new jena-text-es
>>>>>> module (depending on Lucene 6.4.1) without any compatibility issues?
>>>>>>
>>>>>> -Osma
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 01.03.2017, 00:47, anuj kumar wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> My 2 Cents :
>>>>>>>
>>>>>>> The reason I proposed to have separate modules for Lucene, Solr and
>>>>>>> ES is exactly to avoid the "All or Nothing" approach we need to take
>>>>>>> if we club them all together. If they stay together and if in the near
>>>>>>> future I want to upgrade ES to another version, I also need to again
>>>>>>> upgrade Lucene and Solr and possibly another implementation that may
>>>>>>> have been added during the time. As we all know, this means weeks of
>>>>>>> work, if not months, to get the changes released. This will personally
>>>>>>> demotivate me from doing anything, and I will probably start
>>>>>>> maintaining my own version of Jena-Text, as that would be much simpler
>>>>>>> than to upgrade and test and, in the process, own (read: fix bugs in)
>>>>>>> the upgrade for each and every technology.
>>>>>>>
>>>>>>> If they are developed as separate modules, they can evolve
>>>>>>> independently of each other and we can avoid situations where we can't
>>>>>>> upgrade to the latest version of Lucene because we do not know what
>>>>>>> effect it will have on the Solr implementation.
>>>>>>>
>>>>>>> We can start with having a separate module for Jena Text ES and see
>>>>>>> how things go. If they go well, we could extract Solr and Lucene out
>>>>>>> of Jena Text.
>>>>>>>
>>>>>>> Again this is just a suggestion based on my limited industry
>>>>>>> experience.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Anuj Kumar
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 28, 2017 at 5:23 PM, Osma Suominen <[email protected]> wrote:
>>>>>>>
>>>>>>>> 28.02.2017, 17:12, A. Soroka wrote:
>>>>>>>>
>>>>>>>>> https://lists.apache.org/thread.html/dce0d502b11891c28e57bbcbb0cdef27d8374d58d9634076b8ef4cd7@1431107516@%3Cdev.jena.apache.org%3E
>>>>>>>>> ? In other words, might it be better to factor out between -text and
>>>>>>>>> -spatial and _then_ try to upgrade the Lucene version?
>>>>>>>>
>>>>>>>> I certainly wouldn't object to that, but somebody has to volunteer to
>>>>>>>> do the actual work!
>>>>>>>>
>>>>>>>>> I don't use the Solr component now, but I could easily see so
>>>>>>>>> doing... that's pretty vague, I know, and I'm not in a position to do
>>>>>>>>> any work to maintain it, so consider that just a very small and
>>>>>>>>> blurry data point. :)
>>>>>>>>
>>>>>>>> Last time I tried it (it was a while ago) I couldn't figure out how to
>>>>>>>> get it running... If you could just try that with some toy data, then
>>>>>>>> your data point would be a lot less blurry :) I haven't used Solr for
>>>>>>>> anything, so I'm not very familiar with how to set it up, and the
>>>>>>>> jena-text instructions are pretty vague unfortunately.
>>>>>>>>
>>>>>>>>
>>>>>>>> -Osma
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Osma Suominen
>>>>>>>> D.Sc. (Tech), Information Systems Specialist
>>>>>>>> National Library of Finland
>>>>>>>> P.O. Box 26 (Kaikukatu 4)
>>>>>>>> 00014 HELSINGIN YLIOPISTO
>>>>>>>> Tel. +358 50 3199529
>>>>>>>> [email protected]
>>>>>>>> http://www.nationallibrary.fi
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
