We are in the midst of a complete migration from MySQL to HBase. From what I have read, it does not really matter whether you use incrementing strings or a binary value as the row id. I have been reading a lot about hotspots in the cluster, and I was hoping someone could shed some light on the dos and don'ts of this decision. Hive 0.9.x plays nicely with binary, but prior to that it expected strings.
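To make the two options concrete, here is a minimal sketch in plain Java (no HBase dependency). It assumes row keys are compared lexicographically as unsigned bytes, which is my understanding of how HBase orders them. The class name and the helpers `stringKey`, `paddedStringKey`, and `binaryKey` are made up for illustration, and the 10-digit padding width is an arbitrary choice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RowKeyOrder {
    // Lexicographic comparison of two byte arrays, treating each byte as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Hypothetical string key: the decimal text of the id, e.g. "10".
    static byte[] stringKey(long id) {
        return Long.toString(id).getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical zero-padded string key, e.g. "0000000010".
    static byte[] paddedStringKey(long id) {
        return String.format("%010d", id).getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical binary key: the id as 8 big-endian bytes.
    // Only non-negative ids keep numeric order under unsigned comparison.
    static byte[] binaryKey(long id) {
        return ByteBuffer.allocate(8).putLong(id).array();
    }

    public static void main(String[] args) {
        // Unpadded strings: "10" sorts before "2".
        System.out.println(compareBytes(stringKey(10), stringKey(2)) < 0);
        // Zero-padded strings: 2 sorts before 10.
        System.out.println(compareBytes(paddedStringKey(2), paddedStringKey(10)) < 0);
        // Binary longs: 2 sorts before 10, in a fixed 8 bytes per key.
        System.out.println(compareBytes(binaryKey(2), binaryKey(10)) < 0);
    }
}
```

So an unpadded string counter does not sort numerically, while a zero-padded string or an 8-byte binary long does. Either way the keys are monotonically increasing, so this sketch says nothing about the hotspot question.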
What are the long-term advantages of using a binary row id compared to a string row id? Rows sort accordingly no matter the type in HBase, but what about coprocessor aggregations? Do they only expect long datatypes? Forgive me if this topic has already been hashed out. Thanks in advance. /tom
