Re: [GSoC]About some general information

Michael McCandless Wed, 21 Mar 2012 10:19:39 -0700

Hello!  Answers below:

On Wed, Mar 21, 2012 at 11:03 AM, Han Jiang wrote:
> Hi All,
>
> I'm Billy, a senior undergraduate student at Peking University. I'm working
> in the area of Information Retrieval and Web Mining. When going through the
> idea list, I became quite interested in LUCENE-3892 and LUCENE-3069. I am
> very proficient in Java, and have been using Lucene for about one year. I am
> looking forward to making a contribution to this project.


Awesome.

> I have a few questions about Lucene:
>
> First of all, which version of Lucene should we use as a starting point? The
> trunk or 3.5?

I think both of these issues will be trunk only: both are far
easier to do with the Codec API in 4.0.

> Is there any demo code that shows the idea of Codecs?

Maybe the simplest demo would be to look at the SimpleText codec?  It
roughly "tries" to have simple source code as well as a simple (text
only, human readable) on-disk format.
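
To illustrate what a "text only, human readable" postings format means,
here is a minimal sketch in plain Java. It is not the SimpleText codec and
uses no Lucene classes: the class name TextPostingsSketch and the
"term"/"doc"/"END" line layout are assumptions invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only: NOT Lucene API and NOT the real SimpleText layout.
final class TextPostingsSketch {

    /** Serializes term -> doc IDs as plain text, one entry per line. */
    static String write(SortedMap<String, List<Integer>> postings) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
            sb.append("term ").append(e.getKey()).append('\n');
            for (int doc : e.getValue()) {
                sb.append("  doc ").append(doc).append('\n');
            }
        }
        sb.append("END\n"); // explicit end marker, so truncation is easy to spot
        return sb.toString();
    }

    /** Parses the text produced by write() back into term -> doc IDs. */
    static SortedMap<String, List<Integer>> read(String text) {
        SortedMap<String, List<Integer>> postings = new TreeMap<>();
        List<Integer> current = null;
        for (String line : text.split("\n")) {
            if (line.equals("END")) {
                break;
            } else if (line.startsWith("term ")) {
                current = new ArrayList<>();
                postings.put(line.substring("term ".length()), current);
            } else if (line.startsWith("  doc ") && current != null) {
                current.add(Integer.parseInt(line.substring("  doc ".length())));
            }
        }
        return postings;
    }
}
```

The point of such a format is that the file can be opened in any text
editor and checked by eye, at the cost of size and speed.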

> How many postings formats are supposed to be implemented for project
> LUCENE-3892?

This can be worked out when scoping the project... but I think getting
one postings format working well would be awesome :)  If somehow
that's too easy, then add more!

> Is there any further documentation for LUCENE-3069?

Not that I know of... but I suspect the approach can be very similar
to the MemoryPostingsFormat we already have, just that it'd only be
the terms data stored in the FST, while the postings
(docs/freqs/positions/offsets) are written to a file.

Ideally, it would just act like a different terms dictionary
implementation, ie so that we can then plug in any PostingsBaseFormat
(even the one from LUCENE-3892!).
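
The split described above (terms held in memory, postings stored
separately) can be sketched in plain Java. This is only an illustration
under stated assumptions: a TreeMap stands in for the FST, a ByteBuffer
stands in for the postings file, only doc IDs are stored (no
freqs/positions/offsets), and the class name TermsDictSketch is
hypothetical. None of it is Lucene API.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only: NOT Lucene API.
final class TermsDictSketch {
    // In-memory terms dictionary (stand-in for the FST): term -> offset into the postings.
    private final TreeMap<String, Integer> termToOffset = new TreeMap<>();
    // Stand-in for the on-disk postings file.
    private final ByteBuffer postingsFile;

    /** Builds the dictionary and postings from term -> ascending doc IDs. */
    TermsDictSketch(SortedMap<String, int[]> postings) {
        int size = 0;
        for (int[] docs : postings.values()) {
            size += Integer.BYTES * (1 + docs.length); // count + one int per doc
        }
        postingsFile = ByteBuffer.allocate(size);
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            // The dictionary stores only where this term's postings start.
            termToOffset.put(e.getKey(), postingsFile.position());
            int[] docs = e.getValue();
            postingsFile.putInt(docs.length);
            int prev = 0;
            for (int doc : docs) {
                postingsFile.putInt(doc - prev); // delta from the previous doc ID
                prev = doc;
            }
        }
    }

    /** Looks the term up in memory, then decodes its postings from the "file". */
    int[] docs(String term) {
        Integer offset = termToOffset.get(term);
        if (offset == null) {
            return new int[0]; // unknown term: the postings are never touched
        }
        int pos = offset;
        int count = postingsFile.getInt(pos);
        int[] docs = new int[count];
        int prev = 0;
        for (int i = 0; i < count; i++) {
            pos += Integer.BYTES;
            prev += postingsFile.getInt(pos); // undo the delta encoding
            docs[i] = prev;
        }
        return docs;
    }
}
```

Because the dictionary only hands out offsets, the postings encoding can
be swapped without touching the lookup side, which is the pluggability
the paragraph above is after.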

> Thank you!

You're welcome, and welcome to Lucene/Solr!

Mike McCandless

http://blog.mikemccandless.com

