Hi Lars,

Thanks as well for your help. My machines each have a single disk.
I am trying to get a feel for both reads and writes, hence my tests. If I use a buffered import, the time goes down to 20 minutes, which is acceptable compared to one hour. I am wondering whether this is in the realm of what should be expected from a correctly functioning 3-machine cluster. I can stress test; I'm just not sure how to analyse the results.

In terms of reads through Hive, I am not using the row key, so I expect it to take more time. I am wondering whether 4 minutes for a 1,400,000-entry table is a reasonable time. If I program a multithreaded environment for a file with the same entries I get better performance, but it would not scale as well as Hadoop or HBase. So my dataset might not be large enough for a relevant test.

If you have time and need more info, I have opened the cluster:
http://91.121.69.14:50030/jobtracker.jsp
You can look at the retired task select... to see exactly how it went.
Or, for the HBase cluster:
http://91.121.69.14:60030/rs-status

Thanks a lot to you and Kevin for your time, and for any advice or processing-time tables I could compare against to check my cluster implementation and try to improve it.
