Yes, I fully agree with that, but as I understand it there is a limitation: bulk load works with only one column family. That is not my case. Can you advise a workaround for using bulk load with multiple column families?
Thanks,
Oleg

> Are these insertions the output of the MR jobs?
>
> If so, I would strongly recommend the bulk load functionality. It is
> somewhere between 10x and 100x more efficient than direct API usage.
>
> > 3) Run 10-20 scans per day (scanning about 20 regions in a table).
> > All this should run in parallel.
> > Our current configuration can't cope with this load and we are having many
> > stability issues.
> >
> > This is what we have in mind:
> > 1. Master machine - 32 GB, 4 TB, two quad-core CPUs.
> > 2. Name node - 16 GB, 2 TB, two quad-core CPUs.
> > We plan to have up to 20 name servers (starting with 5).
> >
> > We already read
> > http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
> >
> > We would appreciate your feedback on our proposed configuration.
> >
> > Regards
> > Oleg & Lior
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
