Random thoughts:

Maybe something like a freely available dictionary would work, with
the key as the word, and the value as the definition.

You could grab git commits from the Linux kernel and make the key the
SHA, and the value the patch.
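If it helps, here's a rough sketch of pulling those pairs out, assuming a local kernel checkout (the paths and helper names are just illustrative, not anything tabled provides):

```python
import subprocess

def commit_patches(repo_path, limit=100):
    """Yield (sha, patch_text) pairs from a git repository.

    Runs `git log -p` once and splits the stream on "commit " headers;
    the SHA becomes the key and the full patch text the value.
    """
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "-p", "-n", str(limit)],
        capture_output=True, text=True, check=True,
    ).stdout
    return split_commits(out)

def split_commits(log_text):
    """Split `git log -p` output into a list of (sha, patch) pairs."""
    pairs = []
    sha, body = None, []
    for line in log_text.splitlines():
        if line.startswith("commit "):
            if sha is not None:
                pairs.append((sha, "\n".join(body)))
            sha, body = line.split()[1], []
        else:
            body.append(line)
    if sha is not None:
        pairs.append((sha, "\n".join(body)))
    return pairs
```

A full kernel history would easily get you past a million keys, and the value sizes vary a lot, which is probably more realistic than uniform 16K blobs.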

There's a lot of text in Project Gutenberg. I guess you'd have to
decide what you want your average key/value lengths to be -- I think
most books there are longer than 16K. Maybe you could make the key
(book, page_number).
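For the (book, page_number) idea, the chunking could look something like this -- a minimal sketch, assuming you're happy just slicing each book's text into fixed-size pages (the 16K page size and the naming are arbitrary):

```python
def paginate(book_id, text, page_size=16 * 1024):
    """Split a book's text into fixed-size "pages" and yield
    ((book_id, page_number), page_text) key/value pairs."""
    for page_number, start in enumerate(range(0, len(text), page_size)):
        yield (book_id, page_number), text[start:start + page_size]
```

With pages that size, a few thousand Gutenberg books would get you into the million-key range.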

Colin

P.S. I've been meaning to set up a bigger tabled installation myself,
as soon as I get some time.


On Fri, Mar 5, 2010 at 10:33 AM, Jeff Garzik <[email protected]> wrote:
> On 03/05/2010 10:31 AM, Jeff Garzik wrote:
>>
>> Can anybody suggest a good test dataset for tabled?
>>
>> Hopefully something with a million or more keys, where the values are
>> large.
>>
>> I can certainly generate something like that artificially, but a
>> real-world dataset would be nice.
>
> Still looking for a good, real-world data set.
>
> A synthetic store+retrieve test of 1m keys @ 16K values worked without a
> hitch.  I documented this on
> http://hail.wiki.kernel.org/index.php/Extended_status
>
>        Jeff
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe hail-devel" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>