I'm a bit confused as to what you mean by "if an iterator goes down mid-processing." If it goes down at all, then whatever scope it's running in (minor compaction, major compaction, or scan) will most likely go down as well, unless your iterator eats an exception and ignores errors. A WALog shouldn't be deleted if whatever you were trying to do failed.
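To illustrate the distinction, here is a toy Python sketch. It is not Accumulo code and does not use Accumulo's iterator API; every name in it (`IteratorFailure`, `failing_iterator`, `swallowing_iterator`, `run_scope`) is hypothetical. It only models the idea that a failure inside an iterator should fail the enclosing operation, while an iterator that swallows the error lets a truncated result be committed as if it were complete.

```python
class IteratorFailure(Exception):
    """Hypothetical error raised by an iterator partway through its input."""


def failing_iterator(entries, fail_at):
    # Yield entries until the simulated failure point, then raise.
    for i, entry in enumerate(entries):
        if i == fail_at:
            raise IteratorFailure(f"failed at entry {i}")
        yield entry


def swallowing_iterator(entries, fail_at):
    # Anti-pattern: eat the exception and silently end the stream early.
    try:
        yield from failing_iterator(entries, fail_at)
    except IteratorFailure:
        return


def run_scope(iterator):
    """Stand-in for a scan or compaction: commit only if the iterator finishes."""
    output = []
    try:
        for entry in iterator:
            output.append(entry)
    except IteratorFailure:
        # The whole scope fails; nothing is committed, so the work can be retried.
        return {"committed": False, "output": []}
    return {"committed": True, "output": output}
```

In this model, `run_scope(failing_iterator(...))` reports an uncommitted result that a caller could retry, whereas `run_scope(swallowing_iterator(...))` commits partial output with no sign that anything went wrong.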
On Sat, Sep 15, 2012 at 1:44 AM, Sukant Hajra <[email protected]> wrote:

> Hi guys,
>
> We've been slowly inching towards using iterators more effectively. The typical use case of indexed docs fits one of our needs and we wrote a prototype for it.
>
> We've recently realized that iterators are not just read-only, and that we can get more data-local functionality by taking advantage of their ability to mutate data as well. We've only begun to think more of how this may assist us. A /lot/ of our critical data-accesses are slightly complex, but local to one row. We have billions of entities in our system, so a simple bijection of entities to rows works out really well for us with respect to iterators.
>
> Up to this point, we've had a planned architecture that uses Kestrel for WALog and a messaging system like Akka pipelining work. Akka would help us manage flowing work from the user to the log and from the log to orchestrations of Accumulo intra-row reads and writes. The log just helps us get some faster response time without sacrificing too much reliability.
>
> Recently someone asked why use our own WALog when Accumulo has one natively in HDFS. My response has been that Accumulo's WALog is at a lower level of granularity of mutations. We want reliable orchestrations of mutations. Our orchestrations are idempotent, but we want something along the lines of at-least-once delivery for the entire orchestration. If an iterator goes down mid-processing, I fear Accumulo's native WALog is insufficient to claim we have a reliable enough system.
>
> I could definitely go through source code to validate this opinion, but I thought I'd bounce this reasoning off the list first.
>
> Also, I'm sure we're not the only people using Accumulo in this way. Please feel free to advise us if anyone's got other ideas for an architecture or feels we're thinking about the problem backwards.
>
> Thanks for your input,
> Sukant
