Keyword schema

2011-02-02 Thread Peter Haidinyak
Hi all, I was just tasked to take the keywords used for a search and put them in HBase so we can slice and dice them. They are interested in standard stuff like highest-frequency word, word pairs, etc. I know I'm not the first to do this, so does anyone have a recommendation on
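For reference, the "highest-frequency word" and "word pairs" counts asked about above can be sketched in a few lines. This is only an illustration, not something from the thread: the function name `count_terms` is made up, queries are assumed to be whitespace-separated strings, and a "word pair" is assumed to mean two adjacent words in the same query.

```python
from collections import Counter

def count_terms(queries):
    """Count single words and adjacent word pairs across search queries."""
    words = Counter()
    pairs = Counter()
    for query in queries:
        # Assumption: case-insensitive, whitespace-delimited keywords.
        tokens = query.lower().split()
        words.update(tokens)
        # Adjacent pairs only; zip stops when the shorter sequence ends.
        pairs.update(zip(tokens, tokens[1:]))
    return words, pairs
```

The resulting counters could then be written out with the word (or the joined pair) as the row key, though the thread itself does not settle on a key design.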

Re: Keyword schema

2011-02-02 Thread Jean-Daniel Cryans
I don't think HBase is really needed here, unless you somehow need random read/write to those search queries. J-D

Re: Keyword schema

2011-02-02 Thread Pete Haidinyak
I will be updating the keywords and their frequencies every X minutes, so I don't believe M/R would work well, but I could be wrong. I've been using this approach with other data and have been getting sub-1-second response times on my queries with two overtaxed servers. I figured this problem has been solved
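A rough sketch of the "update every X minutes" idea described above, assuming each interval produces a small mapping of keyword to count that is added to running totals. The dict `table` is only a stand-in for an HBase table keyed by keyword, and `apply_batch` is a hypothetical helper name; against a real table, an atomic counter increment per keyword would presumably play the same role.

```python
def apply_batch(table, batch_counts):
    """Add one batch of keyword counts to the running totals in `table`."""
    for keyword, delta in batch_counts.items():
        # Keywords not seen before start from zero.
        table[keyword] = table.get(keyword, 0) + delta
    return table
```

Each call only touches the keywords that appeared in the latest interval, which is why small incremental batches stay cheap.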

Re: Keyword schema

2011-02-02 Thread Ted Dunning
A small map-reduce program could do updates to HBase, or, if your incremental data is relatively small, you can do the updates one by one. This can work fine, but it doesn't really solve the top-100 term problem. For that, it may be nice to have an occasional MR program that over-produces a list of
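The message above is cut off, so the following is not necessarily the approach Ted goes on to describe. It is just a minimal sketch of why the top-100 question is separate from the incremental updates: answering it needs a pass over all the counts. Here `table` is assumed to be an in-memory mapping of keyword to count, and `top_terms` is a made-up name.

```python
import heapq

def top_terms(table, n=100):
    """Return the n (keyword, count) pairs with the highest counts."""
    # nlargest scans every entry once while keeping only n candidates in memory.
    return heapq.nlargest(n, table.items(), key=lambda item: item[1])
```

Run occasionally as a batch job, a pass like this could refresh a small "top terms" list while the per-keyword counts keep being updated incrementally.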