Hi Osma,
I briefly looked at the pull request. I believe we need to upgrade Lucene
and Solr in one go, don't we? The reason is that Solr 4.9.1 depends on
Lucene 4.9.1.

Also, how do I log in to issues.apache.org, and where do I file this issue?

Thanks,
Anuj Kumar

On Fri, Mar 3, 2017 at 11:22 AM, Osma Suominen <osma.suomi...@helsinki.fi>
wrote:

> Hi Anuj,
>
> It's great that we found agreement over this!
>
> I've restarted the Lucene upgrade effort (JENA-1250) that had stalled and
> made a PR [1] that implements the upgrade up to version 6.4.1 (with 5.5.4
> as an intermediate step). I'll wait for comments on the PR and if people
> think it's OK I will merge it soon to Jena master. Meanwhile, you can
> already base your ES implementation on that branch [2] if you like.
>
> Could you please open a JIRA issue on issues.apache.org explaining the
> Elasticsearch support feature, so that we have a place for tracking this
> work, requesting comments, etc.?
>
> Also I suggest we move the discussion around this to the developers' list (
> d...@jena.apache.org) where it's more appropriate.
>
> -Osma
>
> [1] https://github.com/apache/jena/pull/219
>
> [2] https://github.com/osma/jena/tree/jena-1250-lucene6
>
>
> 03.03.2017, 02:45, anuj kumar wrote:
>
>> I second that. I am now finalising the integration of ES and should have a
>> good production-quality implementation ready in a week's time. At that
>> time I would like you guys to have a look at the implementation and provide
>> feedback. Once you have upgraded Lucene to 6.4.1, I can merge the
>> code into the jena-text module and do a round of testing.
>>
>> Thanks,
>> Anuj Kumar
>>
>> On 2 Mar 2017 22:28, "A. Soroka" <aj...@virginia.edu> wrote:
>>
>>> I do agree that trying to juggle different versions of Lucene libraries is
>>> probably not a realistic option right now. Luckily (if I understand the
>>> conversation thus far correctly) we have a solid alternative; getting our
>>> current Lucene dependency upgraded should allow us to (eventually) merge
>>> Anuj's work into the mainstream of development. Someone please tell me if
>>> I have that wrong! :grin:
>>>
>>> Let me reiterate that this seems like very good work and speaking for
>>> myself, I certainly want to get it included into Jena. It's just a question
>>> of fitting it in correctly, which might take a bit of time.
>>>
>>> ---
>>> A. Soroka
>>> The University of Virginia Library
>>>
>>> On Mar 1, 2017, at 1:27 PM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
>>>
>>>> Hi Anuj!
>>>>
>>>> I have nothing against modularity in general. However, I cannot see how
>>>> your proposal could work in practice for the Fuseki build, due to the
>>>> reasons I mentioned in my previous message (and Adam seemed to concur).
>>>>
>>>> In any case, I'll see what I can do to get the Lucene upgrade moving
>>>> again. If all current Jena modules (i.e. jena-text and jena-spatial) were
>>>> upgraded to Lucene 6.4.1, then you could just add your ES classes to
>>>> jena-text, right? I think that would be better for everyone than having to
>>>> maintain your own separate module.
>>>>
>>>> -Osma
>>>>
>>>> 01.03.2017, 16:59, anuj kumar wrote:
>>>>
>>>>> I personally have no preference as to how the code in Jena should be
>>>>> structured, as long as I am able to use it :).
>>>>> I prefer doing it in a specific way because, IMO, it is modular, which
>>>>> makes it much easier to maintain in the long run. But again, it may not
>>>>> be the quickest one.
>>>>>
>>>>> I have already been given a deadline by the company to have the ES
>>>>> extension implemented in the next 15 days :). What this means is that I
>>>>> will be maintaining the ES code extension to Jena Text at least locally
>>>>> for the coming period of time. I would be more than happy to contribute to
>>>>> the Jena community whatever is required to have a proper Elasticsearch
>>>>> implementation in place, whether within the jena-text module or as a
>>>>> separate module. Until Lucene and Solr are upgraded to the latest
>>>>> version, I will have to maintain a separate module for jena-text-es.
>>>>>
>>>>> Cheers!
>>>>> Anuj Kumar
>>>>>
>>>>>
>>>>> On Wed, Mar 1, 2017 at 3:36 PM, A. Soroka <aj...@virginia.edu> wrote:
>>>>>
>>>>>> Osma--
>>>>>>
>>>>>> The short answer is that yes, given the right tools you _can_ have
>>>>>> different versions of code accessible in different ways. The longer
>>>>>> answer is that it's probably not a viable alternative for Jena for this
>>>>>> problem, at least not without a lot of other change.
>>>>>>
>>>>>> You are right to point to the classloader mechanism as being at the
>>>>>> heart of this question, but I must alter your remark just slightly. From
>>>>>> "the Java classloader only sees a single, flat package/class namespace
>>>>>> and a set of compiled classes" to "ANY GIVEN Java classloader only sees a
>>>>>> single, flat package/class namespace and a set of compiled classes".
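
A minimal Java sketch of the "ANY GIVEN classloader" point, not taken from Jena or OSGi: two sibling loaders each define their own copy of a class with the same fully qualified name. The names `ClassLoaderIsolationSketch`, `Library` and `IsolatingLoader` are hypothetical, and `Library` merely stands in for a class that exists in two Lucene versions. The sketch assumes Java 9 or later (for `InputStream.readAllBytes`) and that the compiled class file can be read back as a resource.

```java
import java.io.IOException;
import java.io.InputStream;

class ClassLoaderIsolationSketch {

    /** Hypothetical stand-in for a library class that two modules both need. */
    public static class Library { }

    /** Defines the one named class itself; everything else is delegated to the parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(String isolatedName, ClassLoader parent) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // usual parent-first lookup
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the class file bytes through the parent loader's resource lookup. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Library through two independent loaders and returns both Class objects. */
    static Class<?>[] loadTwice() {
        String name = Library.class.getName();
        ClassLoader parent = Library.class.getClassLoader();
        try {
            Class<?> a = new IsolatingLoader(name, parent).loadClass(name);
            Class<?> b = new IsolatingLoader(name, parent).loadClass(name);
            return new Class<?>[] { a, b };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?>[] pair = loadTwice();
        System.out.println("same name:   " + pair[0].getName().equals(pair[1].getName()));
        System.out.println("same class:  " + (pair[0] == pair[1]));
        System.out.println("same loader: " + (pair[0].getClassLoader() == pair[1].getClassLoader()));
    }
}
```

The two `Class` objects share a name but are distinct types, each visible only through its own loader. That is the property a per-module loader scheme relies on; the default application classpath has a single loader and so gets no such separation.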
>>>>>>
>>>>>> This is the fact that OSGi uses to make it possible to maintain strict
>>>>>> module boundaries (and even dynamic module relationships at run-time).
>>>>>> Each OSGi bundle sees its own classloader, and the framework is
>>>>>> responsible for connecting bundles up to ensure that every bundle has
>>>>>> what it needs in the way of types to function, based on metadata that the
>>>>>> bundles provide to the framework. It's an incredibly powerful system (I
>>>>>> use it every day and enjoy it enormously) but it's also very "heavy" and
>>>>>> requires a good deal of investment to use. In particular, it's probably
>>>>>> too large to put _inside_ Jena. (I frequently put Jena inside an OSGi
>>>>>> instance, on the other hand.)
>>>>>>
>>>>>> Java 9 Jigsaw [1] offers some possibility for strong modularization of
>>>>>> this kind, but it's really meant for the JDK itself, not application
>>>>>> libraries. In theory, we could "roll our own" classloader management for
>>>>>> this problem. That sounds like more than a bit of a rabbit hole to me.
>>>>>> There might be another, more lightweight, toolkit out there to this
>>>>>> purpose, but I'm not aware of any myself.
>>>>>>
>>>>>> Otherwise, yes, you get into shading and the like. We have to do that for
>>>>>> Guava for now because of HADOOP-10101 (grumble grumble) but it's hardly a
>>>>>> thing we want to do any more of than needed, I don't think.
>>>>>>
>>>>>> ---
>>>>>> A. Soroka
>>>>>> The University of Virginia Library
>>>>>>
>>>>>> [1] http://openjdk.java.net/projects/jigsaw/
>>>>>>
>>>>>> On Mar 1, 2017, at 9:03 AM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
>>>>>>
>>>>>>> Hi Anuj!
>>>>>>>
>>>>>>> Thanks for the clarification.
>>>>>>>
>>>>>>> However, I'm still not sure I understand the situation completely. I
>>>>>>> know Maven can perform a lot of tricks, but Maven modules are just
>>>>>>> convenient ways to structure a Java project. Maven cannot change the
>>>>>>> fact that at runtime, module divisions don't really matter (except that
>>>>>>> they usually correspond to package sub-namespaces) and the Java
>>>>>>> classloader only sees a single, flat package/class namespace and a set
>>>>>>> of compiled classes (usually within JARs) in the classpath that it needs
>>>>>>> to check to find the right classes, and if there are two versions of the
>>>>>>> same library (e.g. Lucene) with overlapping class names, that's going to
>>>>>>> cause trouble. The only way around that is to shade some of the
>>>>>>> libraries, i.e. rename them so that they end up in another,
>>>>>>> non-conflicting namespace. Apparently Elasticsearch also did some of
>>>>>>> that in the past [1] but nowadays tries to avoid it.
>>>>>>>
>>>>>>> Does your assumption 1 ("At a given point in time, only a single
>>>>>>> Indexing Technology is used") imply that in the assembler configuration,
>>>>>>> you cannot have ja:loadClass declarations for both Lucene and ES
>>>>>>> backends? Or how do you run something like Fuseki that contains (in a
>>>>>>> single big JAR) both the jena-text and jena-text-es modules with all
>>>>>>> their dependencies, one of which requires the Lucene 4.x classes and the
>>>>>>> other one the Lucene 6.4.1 classes? How do you ensure that only one of
>>>>>>> them is used at a time, and that the Java classloader, even though it
>>>>>>> has access to both versions of Lucene, only loads classes from the
>>>>>>> single, correct one and not the other? Or do you need to have separate
>>>>>>> "Fuseki-Lucene" and "Fuseki-ES" packages, so that you don't end up with
>>>>>>> two Lucene versions within the same Fuseki JAR?
>>>>>>
>>>>>>>
>>>>>>> -Osma
>>>>>>>
>>>>>>> [1] https://www.elastic.co/blog/to-shade-or-not-to-shade
>>>>>>>
>>>>>>> 01.03.2017, 11:03, anuj kumar wrote:
>>>>>>>
>>>>>>>> Hi Osma,
>>>>>>>>
>>>>>>>> I understand what you are saying. There are ways to mitigate risks and
>>>>>>>> balance the refactoring without affecting the existing modules. But I
>>>>>>>> will not delve into those now. I am not enough of a Jena expert to say
>>>>>>>> convincingly that it is possible without any hiccups. But I can take a
>>>>>>>> guess and say that it is indeed possible :)
>>>>>>>>
>>>>>>>> For the question: "is it even possible to mix modules that depend on
>>>>>>>> different versions of the Lucene libraries within the same project?"
>>>>>>>>
>>>>>>>> I actually do not understand what you mean by mixing modules. I assume
>>>>>>>> you mean having jena-text and jena-text-es as dependencies in a build
>>>>>>>> without causing the build to conflict. If that is what you mean, then
>>>>>>>> the answer is yes, it is possible and quite simple as well. Let me
>>>>>>>> explain how it is possible. But before that, some assumptions which I
>>>>>>>> want to call out explicitly.
>>>>>>>>
>>>>>>>> *Assumptions:*
>>>>>>>> 1. At a given point in time, only a single Indexing Technology is used
>>>>>>>> for text-based indexing and searching via Jena. What this means is that
>>>>>>>> we will use either the Lucene implementation OR the Solr implementation
>>>>>>>> OR the ES implementation at any given point in time.
>>>>>>>> 2. The Fuseki build does not depend on any Lucene 4.9.1 specific classes
>>>>>>>> but only on jena-text classes, if at all.
>>>>>>>>
>>>>>>>> Based on these assumptions it is possible to create a build that
>>>>>>>> contains jena-text based common classes + ES specific classes without
>>>>>>>> any compatibility issues. And it is in fact quite simple. I did it in
>>>>>>>> the current jena-text-es module and ran the entire build, which
>>>>>>>> succeeded.
>>>>>>>> The key is to include the latest Lucene dependencies at the very
>>>>>>>> beginning in the pom and then include the jena-text dependency. Maven
>>>>>>>> will then automatically resolve the dependency issues by including the
>>>>>>>> Lucene libraries that we included in our ES-specific pom. Have a look at
>>>>>>>> the pom of the jena-text-es module here to see how it can be done:
>>>>>>>> https://github.com/EaseTech/jena/blob/master/jena-text-es/pom.xml
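
As a rough illustration of why the build resolves to a single Lucene version, here is a toy Java model of Maven's documented mediation rule ("nearest definition wins", with the first declaration winning at equal depth). It is a sketch under stated assumptions, not Maven's actual resolver: the class and method names are hypothetical, only `lucene-core` is modeled, and the jena-text version string is a placeholder rather than a real release.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NearestWinsSketch {

    /** One node of a dependency tree: artifact, version, and its own dependencies. */
    static final class Dep {
        final String artifact;
        final String version;
        final List<Dep> children;

        Dep(String artifact, String version, Dep... children) {
            this.artifact = artifact;
            this.version = version;
            this.children = Arrays.asList(children);
        }
    }

    /**
     * Breadth-first walk in declaration order: the first version recorded for an
     * artifact is the shallowest one, and at equal depth the earliest declared.
     */
    static Map<String, String> resolve(List<Dep> directDependencies) {
        Map<String, String> chosen = new LinkedHashMap<>();
        Deque<Dep> queue = new ArrayDeque<>(directDependencies);
        while (!queue.isEmpty()) {
            Dep dep = queue.removeFirst();
            if (chosen.containsKey(dep.artifact)) {
                continue; // a nearer or earlier declaration already won; drop this subtree
            }
            chosen.put(dep.artifact, dep.version);
            queue.addAll(dep.children);
        }
        return chosen;
    }

    /** Lucene version picked when the pom declares both lucene-core 6.4.1 and jena-text. */
    static String luceneVersion(boolean luceneDeclaredFirst) {
        Dep lucene6 = new Dep("lucene-core", "6.4.1");
        // "placeholder" is not a real jena-text version; 4.9.1 is the Lucene it pulls in.
        Dep jenaText = new Dep("jena-text", "placeholder", new Dep("lucene-core", "4.9.1"));
        List<Dep> direct = luceneDeclaredFirst
                ? Arrays.asList(lucene6, jenaText)
                : Arrays.asList(jenaText, lucene6);
        return resolve(direct).get("lucene-core");
    }

    public static void main(String[] args) {
        System.out.println(luceneVersion(true));
        System.out.println(luceneVersion(false));
    }
}
```

Under this model the directly declared 6.4.1 wins in either order, because it sits one level nearer than the 4.9.1 that jena-text brings in; position in the pom only matters between declarations at the same depth. This settles which Lucene JAR the build picks, not whether jena-text code compiled against 4.9.1 works with the 6.4.1 classes at runtime.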
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Anuj Kumar
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 1, 2017 at 7:27 AM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
>>>>>>>>
>>>>>>>>> Hi Anuj,
>>>>>>>>>
>>>>>>>>> I understand your concerns. However, we also need to balance between
>>>>>>>>> the needs of individual modules/features and the whole codebase. I'm
>>>>>>>>> willing to put in the effort to keep the other modules up to date with
>>>>>>>>> newer Lucene versions. Lucene upgrade requirements are well
>>>>>>>>> documented, the only hitches seen in JENA-1250 were related to how
>>>>>>>>> jena-text (ab)used some Lucene features that were dropped from newer
>>>>>>>>> versions.
>>>>>>>>>
>>>>>>>>> A perhaps stupid question to more experienced Java developers: is it
>>>>>>>>> even possible to mix modules that depend on different versions of the
>>>>>>>>> Lucene libraries within the same project? In my (quite limited)
>>>>>>>>> understanding of Java projects and libraries, this requires special
>>>>>>>>> arrangements (e.g. shading) as the Java package/class namespace is
>>>>>>>>> shared by all the code running within the same JVM.
>>>>>>>>>
>>>>>>>>> So can you create, say, a Fuseki build that contains the current
>>>>>>>>> jena-text module (depending on Lucene 4.x) and the new jena-text-es
>>>>>>>>> module (depending on Lucene 6.4.1) without any compatibility issues?
>>>>>>>>>
>>>>>>>>> -Osma
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 01.03.2017, 00:47, anuj kumar wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> My 2 cents:
>>>>>>>>>>
>>>>>>>>>> The reason I proposed to have separate modules for Lucene, Solr and ES
>>>>>>>>>> is exactly to avoid the "All or Nothing" approach we need to take if
>>>>>>>>>> we club them all together. If they stay together and in the near
>>>>>>>>>> future I want to upgrade ES to another version, I also need to upgrade
>>>>>>>>>> Lucene and Solr again, and possibly another implementation that may
>>>>>>>>>> have been added in the meantime. As we all know, this means weeks of
>>>>>>>>>> work, if not months, to get the changes released. This will personally
>>>>>>>>>> demotivate me from doing anything, and I will probably start
>>>>>>>>>> maintaining my own version of Jena-Text, as that would be much simpler
>>>>>>>>>> than to upgrade and test and, in the process, own (read: fix bugs in)
>>>>>>>>>> the upgrade for each and every technology.
>>>>>>>>>>
>>>>>>>>>> If they are developed as separate modules, they can evolve
>>>>>>>>>> independently of each other and we can avoid situations where we can't
>>>>>>>>>> upgrade to the latest version of Lucene because we do not know what
>>>>>>>>>> effect it will have on the Solr implementation.
>>>>>>>>>>
>>>>>>>>>> We can start with having a separate module for Jena Text ES and see
>>>>>>>>>> how things go. If they go well, we could extract Solr and Lucene out
>>>>>>>>>> of Jena Text.
>>>>>>>>>>
>>>>>>>>>> Again, this is just a suggestion based on my limited industry
>>>>>>>>>> experience.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Anuj Kumar
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 28, 2017 at 5:23 PM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
>>>>>>>>>>
>>>>>>>>>>> 28.02.2017, 17:12, A. Soroka wrote:
>>>>>>>>>>>
>>>>>>>>>>>> https://lists.apache.org/thread.html/dce0d502b11891c28e57bbcbb0cdef27d8374d58d9634076b8ef4cd7@1431107516@%3Cdev.jena.apache.org%3E
>>>>>>>>>>>> ? In other words, might it be better to factor out between -text and
>>>>>>>>>>>> -spatial and _then_ try to upgrade the Lucene version?
>>>>>>>>>>>
>>>>>>>>>>> I certainly wouldn't object to that, but somebody has to volunteer to
>>>>>>>>>>> do the actual work!
>>>>>>>>>>>
>>>>>>>>>>>> I don't use the Solr component now, but I could easily see so
>>>>>>>>>>>> doing... that's pretty vague, I know, and I'm not in a position to do
>>>>>>>>>>>> any work to maintain it, so consider that just a very small and
>>>>>>>>>>>> blurry data point. :)
>>>>>>>>>>>
>>>>>>>>>>> Last time I tried it (it was a while ago) I couldn't figure out how to
>>>>>>>>>>> get it running... If you could just try that with some toy data, then
>>>>>>>>>>> your data point would be a lot less blurry :) I haven't used Solr for
>>>>>>>>>>> anything, so I'm not very familiar with how to set it up, and the
>>>>>>>>>>> jena-text instructions are pretty vague unfortunately.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -Osma
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Osma Suominen
>>>>>>>>>>> D.Sc. (Tech), Information Systems Specialist
>>>>>>>>>>> National Library of Finland
>>>>>>>>>>> P.O. Box 26 (Kaikukatu 4)
>>>>>>>>>>> 00014 HELSINGIN YLIOPISTO
>>>>>>>>>>> Tel. +358 50 3199529
>>>>>>>>>>> osma.suomi...@helsinki.fi
>>>>>>>>>>> http://www.nationallibrary.fi



-- 
*Anuj Kumar*
