Hi Jeff,

Thanks for your comments. I read this book earlier. I use MapFile to store my web pages for random access. At first I considered the SequenceFile conversion as a solution; however, the problem is that I need to append new pages to the MapFile every minute or even every second, so I don't think the SequenceFile conversion can handle this. Could you give me some suggestions? Thank you very much!
Best wishes.

On 10/28/09, Jeff Zhang <[email protected]> wrote:
> I do not know why you need to use MapFile. Could you use SequenceFile instead?
>
> The advantage of MapFile is its read performance, because it builds an index
> on its keys. So its keys must be in order.
>
> If you really want to use MapFile, you can first write your data to a
> SequenceFile and then convert it to a MapFile.
>
> How to convert a SequenceFile to a MapFile:
> 1. Sort the SequenceFile using the sort program in the Hadoop examples.
> 2. Create an index for the output of the above step. You then have both
>    the data file and the index file.
>
> You can refer to Tom White's book "Hadoop: The Definitive Guide" for details
> on how to convert a SequenceFile into a MapFile.
>
> Jeff Zhang
>
> On Wed, Oct 28, 2009 at 4:47 PM, lei wang <[email protected]> wrote:
>
>> But now "url" is not in order. Must the key be IntWritable? Should it be
>> comparable?
>> How do I make sure the keys are in order? Sort them first?
>> I just want to insert the pages for random access by "url".
>>
>> On Wed, Oct 28, 2009 at 4:26 PM, Jeff Zhang <[email protected]> wrote:
>>
>> > Hi Wang,
>> >
>> > The keys of a MapFile should be in order, so when you add records to a
>> > MapFile, you should make sure you insert them in order.
>> >
>> > Best Regards,
>> >
>> > Jeff Zhang
>> >
>> > On Wed, Oct 28, 2009 at 4:14 PM, lei wang <[email protected]> wrote:
>> >
>> > > Hi, friends
>> > > I need to store web pages (a huge set) in a Hadoop MapFile, so I
>> > > used the url as the key, and its type is "Text". When writing the
>> > > records into the MapFile, it gives an "out of order" error. Which
>> > > type should I choose to represent the key "url"? Can anyone give me
>> > > a detailed answer? Thanks for your help.
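
For readers following the thread, the ordering requirement can be illustrated with a small sketch. It uses only the Java standard library and does not call Hadoop: `OrderCheckingWriter` and `writeSorted` are hypothetical stand-ins for an append-only writer that rejects a key smaller than the previous one, which is the behaviour behind the "out of order" error described above. It also assumes plain ASCII URLs, so that `String.compareTo` orders them the same way a byte-wise comparison would.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedWriteSketch {

    /** Hypothetical stand-in for an order-checking, append-only writer (not a Hadoop class). */
    static class OrderCheckingWriter {
        private String lastKey = null;
        final List<String> keys = new ArrayList<>();

        void append(String key, String value) {
            // Reject any key that sorts before the previously appended key.
            if (lastKey != null && key.compareTo(lastKey) < 0) {
                throw new IllegalStateException(
                        "key out of order: " + key + " after " + lastKey);
            }
            keys.add(key); // the value would be written to the data file here
            lastKey = key;
        }
    }

    /** Buffers the batch in a TreeMap so keys reach the writer in sorted order. */
    static List<String> writeSorted(Map<String, String> pages) {
        OrderCheckingWriter writer = new OrderCheckingWriter();
        for (Map.Entry<String, String> e : new TreeMap<>(pages).entrySet()) {
            writer.append(e.getKey(), e.getValue());
        }
        return writer.keys;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new HashMap<>();
        pages.put("http://b.example/", "page B");
        pages.put("http://a.example/", "page A");
        // Keys are appended in sorted order regardless of insertion order.
        System.out.println(writeSorted(pages));
    }
}
```

Sorting only helps within one batch. A page that arrives later with a smaller url than the last key written would still be rejected, which is the append problem raised at the top of this message.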
