For better or worse, Apache demands / expects an active community in
its projects and sub-projects. The health of the community is far
more important than individuals on the project, or the code itself, or
net innovation/progress to the world, for Apache. It's not clear this
is the "right" way, but it is the Apache way, for better or worse.
Fundamentally, the Apache way is simply not congruent with the "lone
innovator" way. I know Lucy/KS are making awesome innovations, and
that's thanks to your relentless passion & drive. Lucy/KS innovates
at a much faster pace than Lucene and I for one am very much looking
forward to its eventual realization (plus I want to see how well mmap
works ;).
There's nothing "wrong" with either approach. They are simply
different. It's just like some investors prefer seed or early stage
investments, but others invest much later when the business is more
clearly "established".
I completely agree that Lucy/KS and Lucene have all moved forward
through very healthy cross-fertilization, carried out on java-dev.
That has clearly resulted in awesome innovations, for both Lucene and
Lucy/KS. Lucene finally in 3.0 will finish the transition to wired
autoCommit=false for IndexWriter, something KS has had from the start,
for example (there are many more examples).
I also feel achieving a strong C core, friendly to dynamic languages,
is very important for Lucene's future. Some day more people may use
it than Lucene (religion: I think dynamic languages, not Java, are the
eventual future).
Lucy/KS living elsewhere should not change that cross-fertilization,
and Lucy/KS will clearly live on even if in the short term it's not
under the Apache Lucene umbrella.
Mike
Marvin Humphrey wrote:
On Sat, Mar 07, 2009 at 03:02:22AM +0100, Jukka Zitting wrote:
Please give me two to three months to make the next dev release of
KinoSearch.
What will happen then?
The next dev release of KS will present real world implementations of
many designs that have been discussed in Lucene and Lucy forums over
the last year.
Some might see that as "progress". ;)
When and how is Lucy development going to start?
It *is* actively progressing. It's just that neither you nor Grant are
willing to acknowledge that any of the design work I just did (in happy
collaboration with Java Lucene devs) applies to Lucy.
Please go read <https://issues.apache.org/jira/browse/LUCENE-1458> and
see if you can still assert after you read it that no work is being
done on Lucy. I warn you, it is a long thread. :)
You mention that in many cases other forums have been better for
discussing related design issues. What's the benefit of keeping the
Lucy project alive if there's next to no code or even discussion
there?
The proposal remains sound, and there is a deep hunger out there for a
solid C IR library similar to Lucene. The KS-then-Lucy progression is
the fastest and best way to get there.
Things would have gone more smoothly and quickly if Dave Balmain had
been able to contribute more, but even with that setback, we will still
reach the finish.
I'm sure that everyone here would love to see Lucy become more active.
How could we help make that happen?
Help Mike McCandless and Jason Rutherglen finish up their work on the
designs we've all been discussing. This is a multi-way collaboration,
and Lucy benefits when I'm able to study alternate implementations,
just as Java Lucene benefits from being able to see what other projects
have done.
Cross-pollination has worked very well in the past. The indexing
speedups a while back started with McCandless riffing on the KinoSearch
merge model. (He followed that up with plenty of interesting innovating
on his own.)
As a wild idea: would there be interest in bringing the KinoSearch
codebase over to Apache through incubation?
My main reservation is that I really want to see KS and Lucy play out
sequentially, because I want Lucy to benefit from having seen how the
features now in KS work in the real world. There's no sane versioning
under Perl/CPAN. You can't move from Lucy version 1 to Lucy version 2
without screwing over your users, and therefore I don't want to merge
the two projects into one namespace. If we did that, the unified
project has to stay as an "alpha" for that much longer, and it never
really gets the benefit of seeing how a real-world release goes.
If, then, we're proceeding sequentially as I recommend, I don't see how
putting KS through incubation does anything but slow us down. All we're
doing is adding extra hoops to jump through. It might be politically
expedient, but the engineer in me rebels at the waste, as does the
loyal employee.
From my perspective, what we have is an optics problem. I'm working
full time, and I've been plenty active in the Lucene forums, but you
and Grant only see a big fat zero. :(
Marvin Humphrey