Corey,

Sure, your proposed solution should work well. After finding a document, if you can construct a Range that encompasses many documents, it would be trivial to write code that aggregates many documents instead of just one.
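
To make the range idea concrete, here is a rough sketch in plain Java. It is not Accumulo code: a TreeMap stands in for the sorted table, and the key layout (doc id, a NUL separator, then the field name) and the names DocRangeSketch, table and aggregate are all made up for illustration. With a real Scanner you would build a Range over the same span of keys.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class DocRangeSketch {
    // Hypothetical separator between doc id and field name in the key.
    static final char SEP = '\0';

    // Stand-in for a sorted table: key = docId + SEP + field, value = field value.
    static SortedMap<String, String> table() {
        SortedMap<String, String> t = new TreeMap<>();
        t.put("doc1" + SEP + "hairColor", "Brown");
        t.put("doc1" + SEP + "name", "Paul");
        t.put("doc2" + SEP + "hairColor", "Black");
        t.put("doc2" + SEP + "name", "Gary");
        t.put("doc3" + SEP + "name", "Lee");
        return t;
    }

    // One ranged scan that gathers the fields of every document
    // whose id falls between firstDoc and lastDoc (both inclusive).
    static Map<String, Map<String, String>> aggregate(
            SortedMap<String, String> table, String firstDoc, String lastDoc) {
        Map<String, Map<String, String>> docs = new TreeMap<>();
        String start = firstDoc + SEP;             // inclusive lower bound
        String end = lastDoc + (char) (SEP + 1);   // exclusive bound just past lastDoc's keys
        for (Map.Entry<String, String> e : table.subMap(start, end).entrySet()) {
            int cut = e.getKey().indexOf(SEP);
            String docId = e.getKey().substring(0, cut);
            String field = e.getKey().substring(cut + 1);
            docs.computeIfAbsent(docId, k -> new TreeMap<>()).put(field, e.getValue());
        }
        return docs;
    }

    public static void main(String[] args) {
        // Aggregates doc1 and doc2 in one pass; doc3 lies outside the range.
        System.out.println(aggregate(table(), "doc1", "doc2"));
    }
}
```

Because every field is its own entry, each one could carry its own visibility label in a real table, which is the property you were after.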

Have you taken a look at the wikisearch example? It lets you specify arbitrary boolean expressions by wrapping multiple Intersecting and Or iterators. The wikisearch code is now stored in contrib, a directory above trunk in Subversion. Eric Newton composed a write-up: http://accumulo.apache.org/example/wikisearch.html
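
For the query in your message, (name=Paul | name=Gary | name=Lee) & hairColor=Brown, the evaluation boils down to a union of the name terms followed by an intersection with the hair color term. Below is a minimal client-side sketch of that logic in plain Java. It does not use the wikisearch iterators or any Accumulo API; the in-memory index, the doc ids and the names BooleanQuerySketch, index, or and and are assumptions made only to show the shape of the computation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class BooleanQuerySketch {
    // Hypothetical inverted index: term -> ids of the documents containing it.
    static Map<String, Set<String>> index() {
        Map<String, Set<String>> idx = new HashMap<>();
        idx.put("name=Paul", new TreeSet<>(Arrays.asList("doc1", "doc4")));
        idx.put("name=Gary", new TreeSet<>(Arrays.asList("doc2")));
        idx.put("name=Lee", new TreeSet<>(Arrays.asList("doc3")));
        idx.put("hairColor=Brown", new TreeSet<>(Arrays.asList("doc1", "doc3", "doc5")));
        return idx;
    }

    // OR: union of the doc ids for each term (unknown terms contribute nothing).
    static Set<String> or(Map<String, Set<String>> idx, String... terms) {
        Set<String> out = new TreeSet<>();
        for (String term : terms) {
            out.addAll(idx.getOrDefault(term, Collections.<String>emptySet()));
        }
        return out;
    }

    // AND: doc ids present in both inputs.
    static Set<String> and(Set<String> a, Set<String> b) {
        Set<String> out = new TreeSet<>(a);
        out.retainAll(b);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> idx = index();
        Set<String> names = or(idx, "name=Paul", "name=Gary", "name=Lee");
        Set<String> hits = and(names, or(idx, "hairColor=Brown"));
        System.out.println(hits); // prints [doc1, doc3]
    }
}
```

Server-side iterators do the same thing over sorted streams of keys rather than materialized sets, so treat this only as a picture of the boolean structure.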

- Josh

On 8/29/12 10:51 PM, Corey Nolet wrote:
I've been using the intersecting iterator to give me server-side AND intersections with Accumulo 1.4.0, and I'm currently in the process of upgrading to Accumulo 1.4.1. I see the IntersectingIterator has been deprecated and the IndexedDocIterator has taken its place. If I'm reading through the examples correctly, the IndexedDocIterator forces a schema that assumes your doc contents can all be mashed together into one data structure in the value of the index row (in my case, I've got a bunch of key/value pairs as the contents). What if I need these contents to be separated so I can apply cell-level visibility to the query? Does it make sense to put a UUID pointing to another index as the "contents" and then perform another lookup after retrieving the intersection result? I've been looking all over the place for good examples of this schema, so I admit that I could be missing some key things in my understanding while reading through the source code.

Also, AND queries are nice to do on the server side, but I need the ability to perform AND and OR queries in concert with one another. For example, let's say I want to find everyone whose name is Paul, Gary, or Lee and who has brown hair. That would mean I need to look up everything where (name=Paul | name=Gary | name=Lee) & hairColor=Brown. Do I need to extend the IntersectingIterator or the IndexedDocIterator and make my own that will allow full query criteria as input?



--
Corey Nolet
Senior Software Engineer
TexelTek, inc.
[Office] 301.880.7123
[Cell] 410-903-2110

