That is my issue: the updates are essentially fire-and-forget. Even flushing the writes will not help, as far as I can see. If a server fails while persisting its MemStore data, the error is not sent back to the caller. Only a deep analysis of the log files may reveal the problem, and even then it will be hard to tell what is missing, since all you see is an IOException, right?
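
To make the Future idea from the original question concrete, here is a minimal sketch of what a put() that reports the outcome of its flush could look like. It does not use the HBase client API at all: FlushTarget, FutureWriteBuffer and countFailedPuts are hypothetical names invented for illustration, the "server" is a lambda that fails on a chosen batch, and the flush runs synchronously to keep the example short. It assumes a Java version with CompletableFuture (Java 8 or later).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class FuturePutSketch {

    /** Hypothetical stand-in for the server-side persist step; not an HBase API. */
    interface FlushTarget {
        void persist(List<String> rows) throws IOException;
    }

    /** Hypothetical client-side write buffer whose put() returns a Future. */
    static class FutureWriteBuffer {
        private final FlushTarget target;
        private final int bufferSize;
        private List<String> rows = new ArrayList<>();
        private CompletableFuture<Void> batchResult = new CompletableFuture<>();

        FutureWriteBuffer(FlushTarget target, int bufferSize) {
            this.target = target;
            this.bufferSize = bufferSize;
        }

        /** Buffers the row and returns the Future of the batch it will be flushed with. */
        synchronized CompletableFuture<Void> put(String row) {
            rows.add(row);
            CompletableFuture<Void> result = batchResult;
            if (rows.size() >= bufferSize) {
                flush();
            }
            return result;
        }

        /** Sends the buffered rows and completes (or fails) the Future shared by that batch. */
        synchronized void flush() {
            List<String> batch = rows;
            CompletableFuture<Void> result = batchResult;
            rows = new ArrayList<>();
            batchResult = new CompletableFuture<>();
            if (batch.isEmpty()) {
                result.complete(null);
                return;
            }
            try {
                target.persist(batch);
                result.complete(null);
            } catch (IOException e) {
                // The failure travels back to every caller holding this batch's Future.
                result.completeExceptionally(e);
            }
        }
    }

    /** Writes totalRows rows and counts how many puts ended in a failed flush. */
    static int countFailedPuts(int totalRows, int bufferSize, int failingBatch) {
        int[] batchNo = {0};
        FlushTarget flaky = rows -> {
            if (++batchNo[0] == failingBatch) {
                throw new IOException("simulated flush failure");
            }
        };
        FutureWriteBuffer writer = new FutureWriteBuffer(flaky, bufferSize);
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            pending.add(writer.put("row-" + i));
        }
        writer.flush(); // push out the last, partially filled batch
        int failed = 0;
        for (CompletableFuture<Void> f : pending) {
            try {
                f.get();
            } catch (ExecutionException | InterruptedException e) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // 10 rows, buffer of 4, second batch fails: rows 4 to 7 are reported as lost.
        System.out.println(countFailedPuts(10, 4, 2));
    }
}
```

With something along these lines the job could check its Futures at the end and fail (or retry) instead of completing successfully with rows missing. Whether the real client and region server could report a MemStore flush failure back this way is exactly the open question.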

On Tue, Apr 6, 2010 at 6:36 PM, Todd Lipcon <t...@cloudera.com> wrote:
> On Tue, Apr 6, 2010 at 9:31 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>
>> The issue isn't with the write buffer here, it's the WAL. Your edits
>> are in the MemStore so as far as your clients can tell, the data is
>> all persisted. In this case you would need to know when all the
>> memstores that contain your data are flushed... Best practice when
>> turning off WAL is force flushing the tables after the job is done,
>> else you can't guarantee durability for the last edits.
>>
>
> You still can't guarantee durability for any of the edits, since a failure
> in the middle of your job is undetectable :)
>
> -Todd
>
>> J-D
>>
>> On Tue, Apr 6, 2010 at 4:02 AM, Lars George <lars.geo...@gmail.com> wrote:
>> > Hi,
>> >
>> > I have an issue where I do bulk import and since WAL is off and a
>> > default write buffer used (TableOutputFormat) I am running into
>> > situations where the MR job completes successfully but not all data is
>> > actually restored. The issue seems to be a failure on the RS side as
>> > it cannot flush the write buffers because the MR overloads the cluster
>> > (usually the .META: hosting RS is the breaking point) or causes the
>> > underlying DFS to go slow and that repercussions all the way up to the
>> > RS's.
>> >
>> > My question is, would it make sense as with any other asynchronous IO
>> > to return a Future from the put() that will help checking the status
>> > of the actual server side async flush operation? Or am I misguided
>> > here? Please advise.
>> >
>> > Lars
>> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera