Re: [lucy-dev] Getting Lucy fit for summer

Marvin Humphrey Thu, 20 Jun 2013 14:39:10 -0700

On Thu, Jun 20, 2013 at 1:08 AM, Nathan Kurz wrote:
> I've been working on some things that I'm hoping to get into Lucy
> sometime this year.   Daniel Lemire published a paper last year about
> fast integer compression/decompression:
>
> http://lemire.me/blog/archives/2012/09/12/fast-integer-compression-decoding-billions-of-integers-per-second/
>
> I've been working with him on doing it even faster.  I think we're on
> target for something twice as fast as the last paper.    We're also
> working on fast intersections of posting lists, and should have good
> numbers there too.   I'd like to use Lucy as the test case for the
> code to show some real world application in the paper.


Sounds exciting!  Here's one possible way that you could do this:

1.  Build arbitrary data structures using a custom DataWriter/DataReader pair.
2.  Create a Query/Compiler/Matcher subclass trio which, instead of going
    through PostingsReader, accesses your custom index files.

Rather than explain everything at once, I'll start off with a sample script
which generates minimal segment files and illustrates the use of
Lucy::Plan::Architecture to control index components.  Try it like so:

    perl custom_arch.pl INDEX_LOCATION
    hexdump -C INDEX_LOCATION/seg_1/cf.dat

I'll attempt to attach the script to this email, but I don't recall whether our
dev list strips attachments.  If the attachment doesn't survive, I'll follow up
with another post containing the sample code inlined in the message body.

Marvin Humphrey
