Jim Klucar created ACCUMULO-763:
-----------------------------------
Summary: Manage table sharding/partitioning within Accumulo
Key: ACCUMULO-763
URL: https://issues.apache.org/jira/browse/ACCUMULO-763
Project: Accumulo
Issue Type: Wish
Affects Versions: 1.5.0
Reporter: Jim Klucar
Priority: Minor
When ingesting a large amount of data into a single table, it is common to
include a shard id in the row id to distribute the rows among the tservers.
This practice is so prevalent that I suggest Accumulo handle table sharding
internally.
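
For readers unfamiliar with the pattern, here is a minimal sketch of what such
client-side sharding typically looks like. It is an illustration only, not
Accumulo API: the names NUM_SHARDS, shard_id, sharded_row_id and split_points
are hypothetical, and the choice of CRC32 as the hash is an assumption.

```python
import zlib

NUM_SHARDS = 16  # hypothetical shard count, chosen when the table is created


def shard_id(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard with a stable hash (CRC32 is an arbitrary choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def sharded_row_id(key: str, num_shards: int = NUM_SHARDS) -> str:
    """Prefix the key with a zero-padded shard id so rows sort by shard first."""
    width = len(str(num_shards - 1))
    return f"{shard_id(key, num_shards):0{width}d}_{key}"


def split_points(num_shards: int = NUM_SHARDS) -> list[str]:
    """Split points that leave each shard prefix in its own tablet."""
    width = len(str(num_shards - 1))
    return [f"{i:0{width}d}" for i in range(1, num_shards)]
```

A writer would use sharded_row_id(key) as the row of each mutation, and the
table would be pre-split on split_points(). A lookup by key recomputes the
prefix; a query that cannot compute the prefix has to scan every shard, which
is the part most private libraries end up reimplementing.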
I'm not sure exactly how this would be implemented, but I'd like to start a
discussion about the pros and cons of doing this. Many users have written
private libraries to handle ingesting into and querying a sharded table. It
would be nice to have one robust, supported solution so that developers don't
have to build their own. Perhaps sharding could be an option at table
creation: splits would be created automatically and the tablets distributed
automatically among the tservers. Accumulo could also implement a consistent
hashing scheme that would allow more shards to be added with a minimum amount
of work.
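
To make the consistent hashing idea concrete, here is a rough sketch of a hash
ring with virtual nodes. It is not a proposal for how Accumulo should implement
this: the HashRing class, the vnodes parameter and the use of MD5 are all
assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit hash of a string (MD5 is an arbitrary, non-security choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping keys to shard names."""

    def __init__(self, shards, vnodes: int = 64):
        self._vnodes = vnodes
        self._points = []  # sorted list of (hash, shard) ring positions
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard: str) -> None:
        # Each shard is placed on the ring at several pseudo-random positions.
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{shard}#{i}"), shard))

    def shard_for(self, key: str) -> str:
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, (_hash(key), ""))
        return self._points[idx % len(self._points)][1]
```

The property that matters here is that add_shard only inserts new ring
positions, so a key either stays where it was or moves to the newly added
shard. Keys never move between existing shards, which is what keeps the work
of adding a shard small compared with rehashing everything modulo a new shard
count.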