Corey,
Sure, your proposed solution should work very well. After finding a
document, if you can construct a Range that encompasses many documents,
it would be trivial to create some code to aggregate many documents
instead of just one.
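As a rough illustration of that aggregation step, here is a minimal sketch in plain Java. It does not use the Accumulo API: a TreeMap stands in for the sorted table, and the key layout ("docId", a NUL byte, then the field name), the class name, and the method names are all assumptions made up for this example. Against a real table the same grouping would run over the entries a Scanner or BatchScanner returns for the Range.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for a sorted table; not Accumulo code.
class DocAggregationSketch {
    // Assumed key layout: "<docId>\u0000<field>", so a document's fields sort together.
    static String key(String docId, String field) {
        return docId + '\u0000' + field;
    }

    // Group every entry whose docId lies in [startDocId, endDocId] into one map per document.
    static Map<String, Map<String, String>> aggregate(
            SortedMap<String, String> table, String startDocId, String endDocId) {
        // '\u0001' sorts just after the NUL separator, so the slice covers all of endDocId's fields.
        SortedMap<String, String> slice = table.subMap(startDocId, endDocId + '\u0001');
        Map<String, Map<String, String>> docs = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : slice.entrySet()) {
            int sep = e.getKey().indexOf('\u0000');
            String docId = e.getKey().substring(0, sep);
            String field = e.getKey().substring(sep + 1);
            docs.computeIfAbsent(docId, k -> new LinkedHashMap<>()).put(field, e.getValue());
        }
        return docs;
    }

    // Small made-up data set: three documents with separate key/value cells.
    static SortedMap<String, String> sampleTable() {
        SortedMap<String, String> table = new TreeMap<>();
        table.put(key("doc1", "name"), "Paul");
        table.put(key("doc1", "hairColor"), "Brown");
        table.put(key("doc2", "name"), "Gary");
        table.put(key("doc2", "hairColor"), "Black");
        table.put(key("doc3", "name"), "Lee");
        return table;
    }

    public static void main(String[] args) {
        // One range, two documents aggregated in a single pass.
        System.out.println(aggregate(sampleTable(), "doc1", "doc2"));
    }
}
```

Because each field stays in its own entry, per-cell visibility could still be applied before the grouping step.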
Have you taken a look at the wikisearch example? It has the ability to
specify arbitrary boolean expressions, wrapping multiple Intersecting
and Or iterators. The wikisearch code is now stored in contrib, a
directory above trunk in subversion. Eric Newton composed a write-up:
http://accumulo.apache.org/example/wikisearch.html
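To make the nesting idea concrete, here is a minimal sketch in plain Java of a boolean expression tree evaluated over a term-to-document-ID index. This is not the wikisearch code and not the Accumulo iterator API: the Node interface, the term/or/and helpers, and the sample index are all hypothetical. It only shows how OR nodes (union) can be wrapped inside an AND node (intersection), using the query from your message.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of nested AND/OR evaluation; not Accumulo code.
class BooleanQuerySketch {
    // A query node evaluates to the sorted set of matching document IDs.
    interface Node {
        SortedSet<String> eval(Map<String, SortedSet<String>> index);
    }

    // Leaf: look up the document IDs indexed under one term.
    static Node term(String t) {
        return index -> new TreeSet<>(index.getOrDefault(t, new TreeSet<>()));
    }

    // OR: union of the children's results.
    static Node or(Node... children) {
        return index -> {
            SortedSet<String> out = new TreeSet<>();
            for (Node c : children) out.addAll(c.eval(index));
            return out;
        };
    }

    // AND: intersection of the children's results.
    static Node and(Node... children) {
        return index -> {
            SortedSet<String> out = null;
            for (Node c : children) {
                SortedSet<String> ids = c.eval(index);
                if (out == null) out = new TreeSet<>(ids);
                else out.retainAll(ids);
            }
            return out == null ? new TreeSet<>() : out;
        };
    }

    // (name=Paul | name=Gary | name=Lee) & hairColor=Brown over made-up data.
    static SortedSet<String> demo() {
        Map<String, SortedSet<String>> index = new TreeMap<>();
        index.put("name=Paul", new TreeSet<>(Arrays.asList("doc1")));
        index.put("name=Gary", new TreeSet<>(Arrays.asList("doc2")));
        index.put("name=Lee", new TreeSet<>(Arrays.asList("doc3")));
        index.put("hairColor=Brown", new TreeSet<>(Arrays.asList("doc1", "doc3", "doc4")));
        Node query = and(
            or(term("name=Paul"), term("name=Gary"), term("name=Lee")),
            term("hairColor=Brown"));
        return query.eval(index);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [doc1, doc3]
    }
}
```

A server-side version would merge sorted streams rather than materialize whole sets, but the tree shape is the same.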
- Josh
On 8/29/12 10:51 PM, Corey Nolet wrote:
I've been using the intersecting iterator to give me server side AND
intersections with Accumulo 1.4.0 and I'm currently in the process of
upgrading to Accumulo 1.4.1. I see the IntersectingIterator has been
deprecated and the IndexedDocIterator has taken its place. If I'm
reading through the examples correctly, I see that the
IndexedDocIterator is forcing a schema that assumes your doc contents
can all be mashed together into one data structure in the value of the
index row (in my case, I've got a bunch of key/value pairs as the
contents). What if I need these contents to be separated so I can apply
cell-level visibility to the query? Does it make sense to store a UUID
pointing to another index as the "contents" and then perform another
lookup after retrieving the intersection result? I've been looking all
over the place for good examples of this schema so I admit that I
could be missing some key things in my understanding while reading
through the source code.
Also, AND queries are nice to do on the server side, but I need the
ability to perform AND and OR queries in concert with one another. For
example, let's say I want to find everyone whose name is Paul or whose
name is Gary or whose name is Lee and who has Brown hair. That would mean
I need to look up everything where (name=Paul | name=Gary | name=Lee)
& hairColor=Brown. Do I need to extend the IntersectingIterator or the
IndexedDocIterator and make my own that accepts full query
criteria as input?
--
Corey Nolet
Senior Software Engineer
TexelTek, inc.
[Office] 301.880.7123
[Cell] 410-903-2110