Hi everyone, I'm starting out with HBase and testing it for our needs. I have set up a Hadoop cluster of three machines and an HBase cluster on top of the same three machines: one master and two slaves.
I am testing the import of a 5 GB CSV file with the importTsv tool. I copy the file into HDFS and then use importTsv to load it into HBase. Right now it takes a little over an hour to complete, and it creates around 2 million entries in one table with a single column family. If I use bulk loading, it goes down to 20 minutes. The job runs 21 map tasks, but they all seem to take a very long time to finish, and many of them end up timing out. I am wondering what I have missed in my configuration. I have followed the prerequisites in the documentation, but I am really unsure what is causing this slowdown. If I run the wordcount example on the same file, it takes only minutes to complete, so I am guessing the issue lies in my HBase configuration. Any help or pointers would be appreciated.
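
For scale, here is a rough back-of-the-envelope calculation of what those timings mean in throughput. It is only a sketch: it assumes the file is exactly 5 GiB, the table ends up with exactly 2,000,000 rows, and the two runs take exactly 60 and 20 minutes, whereas the real figures are approximate. The names in the script are illustrative only.

```python
# Rough throughput estimate from the figures quoted in the post.
# Assumptions (not exact measurements): 5 GiB input, 2,000,000 rows,
# 60 minutes for the plain importTsv run, 20 minutes for the bulk load.
FILE_BYTES = 5 * 1024**3
ROWS = 2_000_000


def throughput(seconds):
    """Return (rows per second, MiB per second) for a run of this length."""
    return ROWS / seconds, FILE_BYTES / 1024**2 / seconds


runs = {"importTsv (puts)": 60 * 60, "bulk load": 20 * 60}

for name, seconds in runs.items():
    rows_per_s, mib_per_s = throughput(seconds)
    print(f"{name}: {rows_per_s:.0f} rows/s, {mib_per_s:.2f} MiB/s")

# Average size of one input row under the same assumptions.
print(f"average row size: {FILE_BYTES / ROWS:.0f} bytes")
```

Under these assumptions the plain import works out to roughly 556 rows/s (about 1.4 MiB/s) and the bulk load to roughly 1,667 rows/s (about 4.3 MiB/s), with an average row of about 2.7 KB. Since wordcount reads the same file in minutes, reading the input does not look like the limiting factor, which fits the guess that the slowdown is on the HBase write side.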
