Hi, I'm using the HBase Bulk Loader <http://archive.cloudera.com/cdh/3/hbase/bulk-loads.html> with 0.89. It is very easy to use. I have a few questions:
1) It seems importtsv will only accept one column family at a time. It shows some sort of security access error if I give it a column list with columns from different families. Is this a limitation of the bulk loader, or a consequence of some security configuration somewhere?

2) Does the bulk load process respect the HBase column family's compression setting? If not, is there a way to trigger the compression after the fact (a major compaction, for example)?

3) Am I correct in thinking that the importtsv step can run on a separate cluster from the HBase cluster (assuming you have an HBase client config and libraries)? If so, will I need to manually copy the output of importtsv to the HBase cluster's HDFS for the completebulkload step, or can I give completebulkload a remote HDFS path, or even an S3 path? (I've sketched the commands I'm running in the P.S. below.)

Thanks for providing this tool.

Marc
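P.S. In case it helps, here is roughly what I'm running. The table name, column list, and paths are placeholders, and the jar name will vary with the HBase version:

  # Generate HFiles from a TSV input directory instead of writing directly to the table
  hadoop jar hbase-VERSION.jar importtsv \
    -Dimporttsv.columns=HBASE_ROW_KEY,f1:c1,f1:c2 \
    -Dimporttsv.bulk.output=/user/marc/hfile-output \
    mytable /user/marc/input

  # Move the generated HFiles into the live table
  hadoop jar hbase-VERSION.jar completebulkload /user/marc/hfile-output mytable

And, for question 2, the command I would run in the hbase shell to rewrite the store files afterwards, if that is what it takes to apply compression:

  major_compact 'mytable'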
