Gabriel,

Thanks for the tip, I will retry with the SALT_BUCKETS option.

Regards,
- kiru

From: Gabriel Reid <gabriel.r...@gmail.com>
To: user@phoenix.apache.org; Kiru Pakkirisamy <kirupakkiris...@yahoo.com>
Sent: Thursday, April 23, 2015 11:57 PM
Subject: Re: CsvBulkLoadTool question

Hi Kiru,

The CSV bulk loader won't automatically create multiple regions for you; it simply loads data into the existing regions of the table. In your case, this means that all of the data has been loaded into a single region (as you're seeing), so any operation that scans over a large number of rows (such as a "select count") will be very slow.

I would recommend pre-splitting your table before running the bulk load tool. If you're creating the table directly in Phoenix, you can supply the SALT_BUCKETS table option [1] when creating the table.

- Gabriel

1. http://phoenix.apache.org/language/index.html#options
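For illustration, a minimal sketch of the pre-split approach Gabriel describes, assuming a hypothetical table and columns (LOAD_TEST, ID, NAME, VALUE are made up for this example) and an arbitrary bucket count of 16, which would normally be sized to the number of region servers:

    -- Create the table pre-split into 16 salt buckets so the bulk load
    -- is spread across 16 regions instead of a single one.
    -- (Table name, columns, and bucket count are placeholders.)
    CREATE TABLE LOAD_TEST (
        ID BIGINT NOT NULL PRIMARY KEY,
        NAME VARCHAR,
        VALUE DOUBLE
    ) SALT_BUCKETS = 16;

With the table created this way, the CsvBulkLoadTool writes output for each salt bucket's region rather than funnelling everything into one region, so both the load and subsequent full scans are distributed across the cluster.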
On Fri, Apr 24, 2015 at 2:15 AM Kiru Pakkirisamy <kirupakkiris...@yahoo.com> wrote:

Hi,

We are trying to load a large number of rows (100-200M) into a table and benchmark it against Hive. We pretty much used the CsvBulkLoadTool as documented, but now, after the load completed, HBase has been running 'minor compaction' for quite a number of hours. (Also, we see only one region in the table.) A select count on this table does not seem to complete. Any ideas on how to proceed?

Regards,
- kiru