Couldn't we do this in the 1.6 line as an optimization when we meet the constraints on scanners?
That would let us avoid exposing TabletLocator and get something out sooner. -- Sean On Feb 16, 2015 2:48 PM, "Josh Elser" <[email protected]> wrote: > Eugene, > > First off, thanks so much for writing this up. This is definitely a "hot > topic" that comes up for users and appears to have a lot of relevance to > people right now. > > I think the first thing that needs to happen is that we "lift" > TabletLocator (or some class which serves the purpose that TabletLocator > currently fulfills) into the public API. TabletLocator is currently treated > as "internal implementation" meaning that you > don't have any guarantees on its use. > > I think step 1 would be to add a TabletLocator class into the public API > (and hide the implementation in a TabletLocatorImpl). We could only do this > for 1.7.0 given our adoption of semver. You are more than welcome to look > at this and try to work on a PR. > > Feel free to open an issue on JIRA as well (I can make sure it gets > assigned to you after you do), and we can work with you to get a good > design in place. > > - JOsh > > Eugene Cheipesh wrote: > >> Hello, >> >> This is more of a use-case report and a request for comment. >> >> I am using Accumulo as a source for Spark RDDs through >> AccumuloInputFormat. My index is based on a z-order space filing curve. >> When I decompose a bounding box into index ranges I can end up with a >> large number of Ranges, 3k+ is not too unusual. Getting a fast response >> from Accumulo is not at all an issue. It would be possible to generate >> approximate ranges and use a Filter to refine them on server side but >> that only delays the problem. >> >> The ideal scenario is for Spark executors to be co-located with Accumulo >> tservers and number of splits per server to be roughly equal to the >> number of cores on the machine. >> >> However, AccumuloInputFormat maps each range to a Split and Spark maps >> every split to a Task. It is nature of z-order curve that some of these >> ranges contain only a few tiles while others contain a pretty big chunk. >> Having significantly more splits than cores prevents good concurrency on >> fetches. This is a problem that BatchScanner is designed to fix but it’s >> not used in AccumuloInputFormat. >> >> I noticed that TabletLocator which is used by AccumuloInputFormat >> returns a structure that looks like it breaks down ranges by host and >> then by tablet. I hacked together an InputFormat that generates a split >> per tablet and a Reader that uses a BatchScanner. The performance for >> spark use-case was orders of magnitude better. I end up with about 50 >> splits for the same dataset. I can’t give exact numbers because I gave >> up timing the original source. This seems is a pretty good compromise >> since the number of splits can be dynamically controlled to tune the >> distribution and granularity of calculation batches. >> >> A drawback is that most modes can not support this operation directly: >> client side, offline, and isolated scans require a single range >> iterator. So some additional code would be required for juggling them. >> >> What are your thoughts on this use case and its requirements? Is this a >> legitimate use of TabletLocator? >> >> It would be nice if AccumuloInputFormat was able to use BatchScanner, >> perhaps as an option. Accumulo is designed to crunch through large >> number of ranges so I would guess this to be a common issue. I’d be >> willing to take a stab at a PR if there is agreement on that. >> >> Thanks, >> -- >> Eugene Cheipesh >> >
