Hi all,

I have run into various issues with big tables while experimenting over the last couple of weeks.

The impression I am left with is that HBase (+ Phoenix) only works when the cluster is fairly powerful: say, half the data fits into the combined memory of the region servers, and the disks are fast (SSDs?) as well. It does not seem to cope once tables are twice as large as the memory allocated to the region servers (frankly, I think the threshold is even lower).

Things that constantly fail:

- Non-trivial queries on large tables (group by, counts, joins) fail with region server out-of-memory errors, or the region servers crash without any obvious reason, even with an Xmx of 4G or 8G.
- Index creation on the same big tables. These always seem to fail around the point where HBase has to flush its memstores to disk, and I could not find a solution.
- Spark jobs fail unless they are throttled to feed HBase only as much data as it can take. Is there no backpressure? (A rough sketch of the kind of throttling I mean follows below.)
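
To be concrete about the Spark throttling, this is roughly the kind of thing I mean (a sketch only: it assumes the phoenix-spark connector is on the classpath, and the table name, ZooKeeper quorum, input path and partition count are all placeholders):

    # Sketch: throttle a Phoenix load by reducing the number of partitions,
    # so fewer executors write to the region servers at the same time.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("throttled-phoenix-load").getOrCreate()

    df = spark.read.parquet("/data/big_input")  # placeholder input

    (df.coalesce(8)                       # fewer partitions -> fewer concurrent writers
       .write
       .format("org.apache.phoenix.spark")
       .mode("overwrite")                 # SaveMode expected by the connector; rows are upserted
       .option("table", "MY_BIG_TABLE")   # placeholder table name
       .option("zkUrl", "zk1:2181")       # placeholder ZooKeeper quorum
       .save())

That keeps HBase alive, but it only works because I am throttling on the producer side by hand rather than the sink pushing back.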

There were no replies to my earlier emails about these issues, which makes me think there are no solutions (or the solutions are hard to find and not many people know them).

So after 21 tweaks to the default config, I am still not able to operate it as a normal database.
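
To give an idea of what I mean, the tweaks are mostly memory-related hbase-site.xml properties along these lines (the values below are purely illustrative, not a recommendation and not my exact settings):

    <!-- hbase-site.xml: illustrative values only -->
    <property>
      <name>hbase.regionserver.global.memstore.size</name>
      <value>0.4</value>        <!-- fraction of RS heap shared by all memstores -->
    </property>
    <property>
      <name>hfile.block.cache.size</name>
      <value>0.4</value>        <!-- fraction of RS heap for the block cache -->
    </property>
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>134217728</value>  <!-- per-region flush threshold (128 MB) -->
    </property>
    <property>
      <name>phoenix.query.maxGlobalMemoryPercentage</name>
      <value>15</value>         <!-- % of heap Phoenix may use for query processing -->
    </property>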

Should I start believing that my configuration is all wrong, or that HBase + Phoenix only works if the cluster is powerful enough for the data?

I believe it is a great project and the functionality is really useful. What is lacking are sample configurations for, say, three clusters of different strengths.

Thanks
