Re: Using Hadoop for Record storage

2007-04-12 Thread Otis Gospodnetic
I'm curious what others will say about Hadoop. I'll just recommend BDB, as I have had good experience combining Lucene indices in which only the id field is stored with BDBs that store and retrieve the data for the set of ids in a given search result. Otis

Re: Using Hadoop for Record storage

2007-04-12 Thread Doug Cutting
Andy Liu wrote: I'm exploring the possibility of using the Hadoop records framework to store these document records on disk. Here are my questions: 1. Is this a good application of the Hadoop records framework, keeping in mind that my goals are speed and scalability? I'm assuming the answer

methods and data into SequenceFile

2007-04-12 Thread Peter W.
Hi, This is my first post to the Hadoop list, and I have not yet written a program using the framework. I'm querying several large Lucene indexes and generating about 30 text files of 1-3 MB each. These files contain metadata about the indexed documents and a corresponding MD5 key. This unique key