Running the DataNode inside an HBase process seems like it could be a good option to enable.
Specifically, it would reduce the number of processes on an HBase instance. I think one of the barriers to adoption for HBase in general is having to manage multiple processes. Are there any known issues with doing this? In addition to the DataNode, one could auto-specify which servers should run ZooKeeper and start ZK inside the HBase process(es).
