On Mar 6, 2009, at 4:58 PM, Marvin Humphrey wrote:

Grant,

I am currently employed by Eventful, Inc, in San Diego, CA. They are paying
me to work full-time on KinoSearch and Lucy.

I went out of my way when we negotiated the terms of my employment to ensure that there was no way my contract could hamper or compromise progress towards Lucy. The actual document is confidential, of course, but I feel comfortable saying that first, our lawyers hammered out the legal nuts and bolts to my satisfaction, and second, Eventful is fully on board with regard to Lucy. By way of illustration, my boss regularly hassles me about publishing a Lucy C API, even though, since Eventful uses the Perl bindings, the benefits would be indirect.


This just further underscores my point. Lucy cannot be just about you (and your employer) contributing code that you develop in-house at Eventful. A project must be able to survive any single committer leaving the project and, simply put, Lucy does not meet that criterion. In the early stages, yes, often one committer gets things going, but Lucy's been around for a fairly long time on life support, and you only seem to pop up on the list when nudged by the PMC.


In my opinion, it is not in the best interests of the Apache Lucene project to
make it more difficult for my employer and myself to contribute.

I agree, but unfortunately, it is Lucy that has languished for a good long time.



It is fairly apparent to me that the Lucy project is not making any
progress community-wise or code-wise.  Neither Marvin, Dave, nor Doug
is active at all on it, and that accounts for all three committers.
There has been very little mailing list traffic,

You may have noticed that up until about three weeks ago (when I dove back into the code cave), I was quite active on the java-dev mailing list and in
the Lucene JIRA forums.  Significant design innovations were realized,
particularly in the area of real-time search.

In the past, many designs have been hashed out cooperatively on the KinoSearch and Lucy mailing lists: the Schema class, revisions to QueryParser and the boolean Query hierarchy, the implementation of human-readable index metadata, C configuration probing, the OO model, index designs which exploit memory
mapping, and so on.

In this particular case, however, I was assigned the task of solving real-time search, for which the Lucy and KinoSearch forums were not ideal. There is a
very limited number of people who have both the familiarity with the
Lucene/Lucy segment-based inverted index model and the interest to discuss real-time search at the level I desired, where concepts like "segment-centric search" could be bandied about. Basically, I needed Mike McCandless -- so I
went to where he could be found.

The conversations that we had in JIRA and on java-dev were beneficial to both Lucene and Lucy; should I have posted to the Lucy dev list instead simply to demonstrate activity, which would have been less useful to Mike, to me, to Lucy, and to Lucene? To my mind, the Lucene community is also part of the Lucy community. Mike's insights were welcome and useful, and it didn't seem important to me which specific mailing list they wound up on -- they're all under the domain lucene.apache.org, after all. Weren't we all moving forward
together, and wouldn't that be apparent to members of the PMC such as
yourself?

Or is this a zero-sum game where design innovations which help Lucy don't
count as "progress" if they also help Lucene?

That's all fine, but none of it adds up to people looking at Lucy and saying, "Gee, I want to contribute to Lucy."




Furthermore, I have my doubts about the development process being employed, which seems to rest on the notion that KinoSearch is going to be donated by Marvin at some point in the future [1]. That would only work if it were to go through the Software Grant or Incubation process (which I would be happy to support), or at least that is how I understand the process to be when
code is developed outside of the ASF.

I understand why you might have thought that, but that's not how things will
play out, and it's a misreading of the post that you cite.
(<http://www.lucidimagination.com/search/document/152a1a9d00b7d08a/is_there_anybody_here>)

As you note, simply importing KinoSearch wholesale into the Lucy repository with cosmetic changes would violate the terms of the project. But even if that were possible, it would represent a *horrendous missed opportunity*.

A KinoSearch 1.0 release, with permanent API and file format backwards
compatibility guarantees -- i.e. "there will never be a KinoSearch 2.0" -- will be very beneficial for Lucy's development. Imposing such discipline allows library users to proceed with maximum confidence. For instance, it allows Peter Karman, who has long planned to build a KS backend for Swish, to move forward without having to worry about the upstream library pulling the
rug out from underneath his users.

Going that route will maximize our ability to learn the limitations and weaknesses of the design. Using the knowledge we gain, we can then forge ahead as we have in the past: chunk by chunk, class by class. And even though I am very pleased with how pluggable index components, C API user interface improvements, "OS-as-JVM" file format changes, and so on are coming along, I anticipate lots of healthy debate and major discrepancies between what ends up
in KS 1.0 and what ends up in Lucy.

Even if KS were the plan, in looking at KS, it seems there is not much
community activity there, either.

This is largely due to the fact that it has been a long time since I released any significant public updates. I choose to release significant updates infrequently because breaking backwards compatibility has severe consequences for CPAN modules: as soon as the install completes, live apps start crashing.

Since there is no sane deprecation mechanism for dynamically loaded Perl modules, minimizing backwards compatibility problems is a responsibility I
take seriously.

On the flip side, one might ask what's the harm in letting it stand as
is?  Admittedly, not much, other than I think it confuses people b/c
they think there is a C port of Lucene and then they go and find it is
dead.

Indeed. It's not like Lucy in its present form causes harm to the bottom line
of Lucid Imagination, Inc. ;)

What's that got to do with anything? Give me a break. I'm not attacking you. I'm just stating that Lucy has not had any code or any community built for over three years.



Therefore, it is with some hesitation that I suggest we mothball
Lucy.  Mostly, I hesitate, because I hate to see any project be
archived on the hope that someone will come in and pick it up.

However, I just don't see that happening.  If Marvin wishes to
resurrect it, he can donate KS (or whatever core part of it is Lucy)
and go through incubation and prove there is a community and then we
can turn it back on.

Please give me two to three months to make the next dev release of KinoSearch. FWIW, if I can't get a release out within that time frame, I'm going to have
to answer to Eventful. :)

This release will introduce real-time search, improved subclassing support, an mmap-friendly index file format, and pluggable indexing components. I suspect aspects of it may be of interest to the Java Lucene dev community -- but if
that's the case, I won't hold it against you. ;)


Again, this is all great, but it just further demonstrates that you are doing this on your own and not as a part of the Lucy community (or really, even the Lucene community). It's not a judgment of you or of KS. I really like what you are doing. It's merely a statement that this is not how Apache works. There are plenty of other places to host code that do not have these requirements.

