https://github.com/medined/D4M_Schema describes one way to handle secondary indexes and provides some prototype Java code to experiment with. There are other projects that take on the same problem, for example https://github.com/joshelser/cosmos.
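
A rough sketch of the two-lookup approach described in the quoted message below may make the idea more concrete. This is only a toy model under stated assumptions: plain java.util.TreeMap instances stand in for the main table and the index table, and the class and method names (SecondaryIndexSketch, put, rowsForValue, entriesForRow) are made up for illustration. None of it is Accumulo API, and it is not the schema used by either project linked above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: TreeMaps stand in for Accumulo tables.
// All names here are hypothetical, not part of the Accumulo API.
class SecondaryIndexSketch {
    // Separator assumed never to appear in rows, columns, or values.
    private static final char SEP = '\u0000';

    // "Main table": key is row + SEP + family:qualifier, sorted by row.
    private final TreeMap<String, String> main = new TreeMap<>();
    // "Index table": key is value + SEP + row, sorted by value.
    private final TreeMap<String, String> index = new TreeMap<>();

    // Write the entry to the main table and a pointer to the index table.
    // (Overwriting a cell would leave a stale index entry; ignored here.)
    void put(String row, String column, String value) {
        main.put(row + SEP + column, value);
        index.put(value + SEP + row, "");
    }

    // Lookup 1: range scan over the index for one exact value.
    List<String> rowsForValue(String value) {
        String start = value + SEP;
        String end = value + (char) (SEP + 1);
        List<String> rows = new ArrayList<>();
        for (String key : index.subMap(start, end).keySet()) {
            rows.add(key.substring(start.length()));
        }
        return rows;
    }

    // Lookup 2: range scan over the main table for one row.
    SortedMap<String, String> entriesForRow(String row) {
        String start = row + SEP;
        String end = row + (char) (SEP + 1);
        SortedMap<String, String> entries = new TreeMap<>();
        for (Map.Entry<String, String> e : main.subMap(start, end).entrySet()) {
            entries.put(e.getKey().substring(start.length()), e.getValue());
        }
        return entries;
    }

    public static void main(String[] args) {
        SecondaryIndexSketch table = new SecondaryIndexSketch();
        table.put("ID1", "info:name", "JhonSmith");
        table.put("ID1", "info:birth", "1988-06-26");
        table.put("ID1", "study:university", "ComputerEngineering");
        table.put("ID1", "study:graduated", "Yes");
        table.put("ID2", "info:name", "GeorgeDuff");
        table.put("ID2", "info:birth", "1984-01-29");

        // Find the row(s) by value, then fetch everything stored in each row.
        for (String row : table.rowsForValue("JhonSmith")) {
            System.out.println(row + " " + table.entriesForRow(row));
        }
    }
}
```

Both lookups are range scans over sorted keys, which is the point of the index: neither one has to read the whole data set. In a real deployment the two maps would be two Accumulo tables (or families), and the scans would go through the Accumulo client API instead.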

On Wed, May 6, 2015 at 5:08 PM, Christopher <[email protected]> wrote:
> Since Accumulo is essentially a big sorted map, it is most efficient
> searching by the row. When you search by other fields, you are
> searching the entire data set, and filtering. That is usually not very
> efficient. The API provides a way to do this relatively easily by
> specifying family or family:qualifier, but it does not (as you've
> observed) make it easy to do this by Value.
>
> There are a few options:
>
> 1. You can configure the RegExFilter as a scan-time iterator. (This is
>    going to be terribly inefficient.)
> 2. You can adopt a secondary indexing strategy.
>
> I would do option #2. As you've described, your data is indexed by ID.
> If you need an index on whatever you're storing in the Value, you
> should make a new table (or new family/locality group) which stores
> your data sorted by that instead of ID. You can either just store the
> ID in this secondary index, and do two lookups (the secondary index to
> find the ID, then the main data once you have the ID), or you can
> store all the data a second time, ordered by the contents of your
> Value (this trades space for performance).
>
> There are more complex strategies, but these are the basics.
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, May 6, 2015 at 10:10 AM, Revan1988 <[email protected]> wrote:
>> Hi,
>> I've got another question about using Accumulo.
>>
>> My table is something like this:
>>
>> ID1 info:name JhonSmith
>> ID1 info:birth 1988-06-26
>> ID1 study:university ComputerEngineering
>> ID1 study:graduated Yes
>>
>> ID2 info:name GeorgeDuff
>> ID2 info:birth 1984-01-29
>> ID2 study:university Math
>> ID2 study:graduated Yes
>>
>> ...
>>
>>
>> I want all info about JhonSmith, but with the Java API I've only found
>> methods to search by row, family or family:qualifier ...
>>
>> I need to search by Value and then use its row (IDx) to find all other
>> entries that have the same row (IDx).
>>
>> For example, I need all info about JhonSmith (birth, university, graduated
>> ...).
>>
>> I hope I have explained my problem.
>> Sorry again for my bad English.
>>
>> ...and once again:
>> Thank you!!!
>>
>>
>>
>> -----
>> Andrea Leoni
>> Italy
>> Computer Engineer
>> --
>> View this message in context:
>> http://apache-accumulo.1065345.n5.nabble.com/Search-function-tp14030.html
>> Sent from the Developers mailing list archive at Nabble.com.
