Hi,

Schema design matters a lot here, row key design in particular. At a minimum, have a look at this section of the reference guide:
http://hbase.apache.org/book.html#rowkey.design

For the configuration, could you pastebin the HDFS and HBase config files you used?
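
To make the row key point concrete, here is a minimal sketch of one common technique, salting, which spreads sequential keys over several regions instead of sending every write to the same one. This is only an illustration: your actual row key layout is not known, and the names salted_row_key and NUM_BUCKETS, the bucket count, and the "|" separator are all assumptions made up for the example.

```python
import hashlib

# Hypothetical bucket count; in practice it would be chosen relative to the
# number of region servers and pre-split regions.
NUM_BUCKETS = 16


def salted_row_key(natural_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Prefix a key with a stable bucket number derived from a hash of the key."""
    # Hash the natural key so the bucket is deterministic for a given key.
    digest = hashlib.md5(natural_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    # Zero-padded prefix keeps the buckets in lexicographic order.
    return f"{bucket:02d}|{natural_key}"


if __name__ == "__main__":
    # Sequential keys end up under different prefixes.
    for i in range(5):
        print(salted_row_key(f"row{i:08d}"))
```

The trade-off is that a range scan over the natural key then has to be run once per bucket, so this only helps if the load is write-heavy or reads are mostly single-row gets.
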
N.

On Tue, Oct 23, 2012 at 5:48 PM, Nick maillard <[email protected]> wrote:

> Hi everyone
>
> I'm starting with HBase and testing it for our needs. I have set up a
> Hadoop cluster of three machines and an HBase cluster on top of the same
> three machines, one master and two slaves.
>
> I am testing the import of a 5GB CSV file with the importTsv tool. I put
> the file in HDFS and use the importTsv tool to import it into HBase.
>
> Right now it takes a little over an hour to complete. It creates around 2
> million entries in one table with a single family.
> If I use bulk uploading it goes down to 20 minutes.
>
> My Hadoop job has 21 map tasks, but they all seem to take a very long time
> to finish, and many tasks end up timing out.
>
> I am wondering what I have missed in my configuration. I have followed the
> different prerequisites in the documentation, but I am really unsure what
> is causing this slowdown. If I run the wordcount example on the same file,
> it takes only minutes to complete, so I am guessing the issue lies in my
> HBase configuration.
>
> Any help or pointers would be appreciated
