I didn't know the bulk API was so important. Which bulk API (e.g. the
postings one or the terms dict)?

On Mon, Aug 15, 2011 at 11:17 PM, Robert Muir <[email protected]> wrote:
> On Mon, Aug 15, 2011 at 10:49 PM, Mark Miller <[email protected]> wrote:
>> Just throwing this out there, but:
>>
>> I think it would be really cool if we could get 4.0 out by the end of the 
>> year.
>>
>> With such a large release, I think it would also make a lot of sense if we 
>> tried a more formal beta release, just to increase the amount of usage 
>> before we officially sign off on a final 4.0.
>>
>
> I agree with the beta idea; I think it's really necessary, actually: we
> are just being honest at that point that it's a real point-zero
> release.
>
> On the other hand, besides the GSoC stuff, I think we should
> accomplish a few things first to ensure we can actually make the 4.x
> release useful and issue minor releases off of it:
> * fix the bulk API: otherwise we only have "flexible indexing, as long
> as you don't mind flexible == slower". This is really important. I
> don't think we have to implement a bunch of new compression algorithms,
> but the whole postings APIs are suboptimal and biased towards
> Lucene's current format: the bulk APIs aren't low-level enough to give
> good performance, the payloads APIs assume you can ask for a payload
> at any time (they basically assume that you are going to 'steal bits'
> from the positions like we do today), etc.
> * round out docvalues, especially merging with different docvalues
> types and things like that. Arguably these are nocommits... I think
> you will get an exception during merge? I also think it's bad we still
> don't use docvalues for norms or in the faceting module; fixing these
> kinds of real-world uses is probably a great way to round this out.
> * figure out the packaging system for modules such that things like
> clover/hudson/javadocs etc all work across them (not quite today). We
> also need to look at all the minor things like CHANGES.txt and such...
> there are too many of these. Furthermore, I at least wanted the
> analyzers modularization to move forward to a point where we can
> remove the Version crap and you just use the old jar file; I don't
> feel like we are even close to that.
> * fix codec naming: I think it's silly to name a codec "Standard" and
> use the codec header for backwards compatibility; it's easier to name the
> codec "Standard40" and just package this codec in the next release for
> backwards compatibility, e.g. if we want to introduce a new index
> format we make it "Standard42". This is just my opinion though; it's
> not the only way to solve index backwards compat here, but I think
> it's the easiest.
>
> I have a ton more pet peeves, but I think these are the biggest. It
> probably sounds like a lot, but I think it's totally stupid to release
> 4.0 if we cannot "grow" the 4.x branch with 4.1, 4.2, etc. while we
> work on 5.x. Otherwise we are just jumping from 4.0 to 5.0, and that's a
> sign we just shouldn't have released at all.
>
> --
> lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>
