Responses inlined for clarity.
On 4/26/14, 11:05 PM, BlackJack76 wrote:
> Thanks again Josh.
> The way I have been approaching it is to create/use/close the BatchWriter
> inside of the seek method when I need it. Do you see any issues with this
> approach?
It's not terrible, but you will incur some extra overhead with this
approach. The BatchWriter is most efficient when you keep a single
instance open and send many mutations through it. Just make sure to
close the BatchWriter in a finally block, and you shouldn't have any
problems.
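
As a rough sketch of the finally-block pattern: the code below uses a
hypothetical stand-in class (SketchWriter) instead of the real Accumulo
BatchWriter so that it compiles on its own. The class names, the String
"mutations", and the failMidway flag are assumptions made up for
illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a writer; this is NOT the Accumulo BatchWriter.
class SketchWriter {
    final List<String> buffered = new ArrayList<>();
    boolean closed = false;

    void addMutation(String mutation) {
        if (closed) {
            throw new IllegalStateException("writer is closed");
        }
        buffered.add(mutation);
    }

    void close() {
        closed = true;
    }
}

class SeekWriteSketch {
    // Mimics "create/use/close inside seek": the writer is closed in the
    // finally block whether or not writing the mutations fails.
    static void writeAll(SketchWriter writer, List<String> mutations, boolean failMidway) {
        try {
            for (String m : mutations) {
                // Assumption: simulate a failure after the first mutation.
                if (failMidway && writer.buffered.size() == 1) {
                    throw new RuntimeException("simulated failure");
                }
                writer.addMutation(m);
            }
        } finally {
            writer.close(); // runs on both the success and the failure path
        }
    }
}
```

With the real BatchWriter the shape would be the same: create it, add
your mutations inside the try, and close it in the finally.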
> Call me naive but why don't you know when Accumulo is going to tear down
> your iterator and stop using it? When I attach an iterator to a scanner,
> isn't it only destroyed after I complete my scan?
You don't know because the SKVI API currently doesn't have any means to
tell you. Yes, the tabletserver knows when it's about to tear down your
iterator, but you have no way of being told. This gets trickier with
some of the work that Accumulo is doing under the hood that I hinted at
previously.
Accumulo maintains a buffer between your (Batch)Scanner and the
tserver(s) it communicates with. For a number of reasons, when that
buffer fills up, Accumulo notes the last Key that the scan returned,
tears down your session, and (assuming the client is still there
requesting more data) re-queues your scan to fetch more data, starting
back where you left off.
For example, if you have a table where each row is a letter in the
alphabet, and you want to scan over all rows, you would just pass some
range like (-inf, +inf). Suppose that after you return the letter 'f',
that buffer fills up, and your scan gets torn down.
Accumulo will restart your scan with a different range than the one you
originally passed in: (f, +inf), with 'f' excluded. This matters if you
start doing "advanced topics" inside iterators that manipulate the Keys
being returned, but it is relatively easy to work with.
> What I have observed is something similar to the following....
> init is called on creation
> seek is called where you need to have the first K,V pair at the end of seek
> hasTop, getTopKey, and getTopValue are called
> next is called as long as hasTop is true
> Once hasTop is false, the scan concludes
--
View this message in context:
http://apache-accumulo.1065345.n5.nabble.com/Write-to-table-from-Accumulo-iterator-tp9412p9433.html