Lars George wrote:
> Hi,
> I have two questions for the folks behind the mapred BuildTableIndex classes.
> First, if I have 40 servers with about 32 regions per server, what
> should I set the number of mappers and reducers to?
Coarsely, make as many maps as you have total regions (assuming
TableInputFormat is in the mix; it splits on table regions), and make the
number of reducers equal to the number of index shards you want out the
other end. For example, you could have just one reducer produce one
index for all table content if the table is small.
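
As a rough sketch of the sizing rule above, using the numbers from the question (40 servers, about 32 regions each) and assuming a single index shard. The class and method names are hypothetical and not part of HBase or Hadoop. In a real job, values like these would typically be handed to the job configuration (for example via JobConf.setNumMapTasks and JobConf.setNumReduceTasks in the old mapred API, where the map count is only a hint because the input format decides the actual splits).

```java
// Sizing sketch for an index-building MapReduce job over an HBase table.
// IndexJobSizing and its methods are hypothetical helpers for illustration only.
final class IndexJobSizing {

    // One map per region, assuming the input format splits on region boundaries.
    static int mapTasks(int regionServers, int regionsPerServer) {
        if (regionServers < 0 || regionsPerServer < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // multiplyExact throws on int overflow instead of silently wrapping.
        return Math.multiplyExact(regionServers, regionsPerServer);
    }

    // One reducer per index shard you want out the other end.
    static int reduceTasks(int indexShards) {
        if (indexShards < 1) {
            throw new IllegalArgumentException("need at least one index shard");
        }
        return indexShards;
    }

    public static void main(String[] args) {
        int maps = mapTasks(40, 32);   // 40 servers x ~32 regions each
        int reduces = reduceTasks(1);  // small table: a single index
        System.out.println("maps=" + maps + " reduces=" + reduces);
        // prints "maps=1280 reduces=1"
    }
}
```

The region count per server is approximate in the question, so the map count here is an estimate too; raising the shard count raises the reducer count one for one.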
> Second, is it allowed to add new column values during the
> process? For example, if I read all rows and the column "contents:A"
> (say, row123.contents:A), analyze the data, and then write
> the result to "row123.contents:B", is that OK to do?
You mean add new content while indexing? Yes, as long as you don't mind
some of the added content ending up in the index.
St.Ack
> Thanks,
> Lars
> ---
> Lars George, CTO
> WorldLingo