On Sat, Apr 20, 2013 at 2:39 PM, David Alves <[email protected]> wrote:
> Hi
>
> I'm porting the region level HBase SE to the new SE iface and I
> have a couple of questions.
>
> 1- about the method: public ListMultimap<ReadEntry,
> DrillbitEndpoint> getReadLocations(Collection<ReadEntry> entries)
>
> when does it happen that a read entry gets assigned more than one
> drillbit?
> in terms of hbase I can see the case where multiple read entries
> get assigned to the same drillbit (co-located regions) but I can't envision
> a case where the same read entry (usually corresponding to a shard or
> partition) gets assigned to multiple drillbits. when can that happen?
>

The best example is probably block replica locations in HDFS: a single block
has multiple possible endpoints.

>
> 2- with regard to off-heap storage and underlying SE co-location
>
> this is not really a doubt, just checking that my reasoning is
> correct before.
>
> for co-located underlying SE and Drillbit's we should use
> off-heap, shared memory for IPC when possible, correct?
> Specifically I'm investigating the possibility of having HBase
> store region scan data directly off heap and making the results from hbase
> contain a set of references to aligned shared memory locations.
> I'm not sure I'll be implementing this immediately but I'd like to
> design accounting for it if that is the idea.
> Also this means that SE's must work in two modes: co-located with
> shared memory and remote with sockets. We'd then have the
> Jacques: I'm sure you've put some thought to the underlying
> mechanics on how to accomplish this, could you share some quick
> ideas/references?
>

The challenge is that separate JVMs don't have a nice way to share memory.
The simplest way is probably an mmap'd file on tmpfs. We'd have to evaluate
the performance benefit against the added complexity. I think the Java
Chronicle, HugeCollections or VanillaJava stuff by Peter Lawrey has played
with this. There isn't a lot of prior work in this space. Other interesting
info:
http://javaforu.blogspot.com/2011/09/offloading-data-from-jvm-heap-little.html

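To make the mmap'd tmpfs idea concrete, here is a rough sketch using only
java.nio. Everything in it is an assumption for illustration: the class name,
the /dev/shm path and the 4096-byte region size are made up, and both mappings
live in one JVM only so the example can run on its own. In the real case the
writer would be the SE process and the reader the Drillbit, and we would still
need an agreed layout and some form of synchronization on top.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not Drill or HBase code.
class SharedRegionSketch {

    /** Maps the first `size` bytes of `file` read/write, creating the file if needed. */
    static MappedByteBuffer map(Path file, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // A mapping stays valid after its channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed location: a file on a tmpfs mount such as /dev/shm on Linux.
        Path file = Paths.get(args.length > 0 ? args[0] : "/dev/shm/drill-se-sketch.bin");

        // Stand-in for the storage engine side: write a value off heap.
        MappedByteBuffer writer = map(file, 4096);
        writer.putLong(0, 42L);

        // Stand-in for the Drillbit side: a second mapping of the same file.
        MappedByteBuffer reader = map(file, 4096);
        System.out.println(reader.getLong(0));
    }
}
```

The remote/socket path would have to sit next to this as the fallback whenever
the two processes are not on the same host.
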
Yes, this does mean that an SE may need to use two different mechanisms to
interact: one local and one remote/fallback.

J
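
On question 1, a small sketch of the HDFS case may help. It is illustrative
only: plain strings stand in for ReadEntry and DrillbitEndpoint, a
Map<String, List<String>> stands in for Guava's ListMultimap, and the
replicaHosts lookup is a made-up input rather than anything in the SE
interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the actual storage engine interface.
class ReadLocationsSketch {

    /**
     * For each read entry (here an HDFS block id), returns every endpoint
     * that holds a replica, so one entry can map to several endpoints.
     */
    static Map<String, List<String>> getReadLocations(
            Collection<String> entries, Map<String, List<String>> replicaHosts) {
        Map<String, List<String>> locations = new LinkedHashMap<>();
        for (String entry : entries) {
            List<String> hosts = replicaHosts.getOrDefault(entry, Collections.<String>emptyList());
            locations.put(entry, new ArrayList<>(hosts));
        }
        return locations;
    }

    public static void main(String[] args) {
        Map<String, List<String>> replicaHosts = new HashMap<>();
        // One block replicated on three nodes: three candidate endpoints.
        replicaHosts.put("block-1", Arrays.asList("node-a", "node-b", "node-c"));
        // A single-homed entry, closer to the HBase region case.
        replicaHosts.put("block-2", Arrays.asList("node-b"));

        System.out.println(getReadLocations(Arrays.asList("block-1", "block-2"), replicaHosts));
    }
}
```

For HBase each region would normally come back with a single endpoint, which
is the single-homed case above.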
