Hi J-D,

Thanks for your comments and clarification. I guess GC does blow (especially when writing things like databases and filesystems).

Right now I will dive into GC tuning once more and probably lower the number of reducers on the insert jobs.

Thanks,
Friso

On 16 dec 2010, at 19:00, Jean-Daniel Cryans wrote:

>> I am not surprised by the fact that there is a performance hit, I just
>> expected it to be less. I figured it to be somewhere between 2 and 3 times
>> slower, not 5 times. So with my question I was basically looking for some
>> measure of what to expect based on someone else's experience. Apart from
>> that, I hoped it would just take longer and not die.
>
> GC does blow :)
>
>> I will re-check. I only grepped for long pauses. I guess a series of short
>> collections could also get in the way of application code. Perhaps I need to
>> tweak GC params some more. Is highly increased GC activity a logical
>> consequence of using WAL? Does it create a lot of short-lived objects while
>> pushing things to WAL?
>
> It's not something I wanted to explore in my first email, but the
> answer is: yes, it probably helps generating GC, but it also generates
> more IO traffic.
>
> - Writing to the network requires serializing the objects, so that's
> an extra copy of the data
> - Each request takes a bit longer, so its payload will stick longer in memory
> - Logs have an upper bound, in order to keep log replay time under
> control, but it has the adverse effect of forcing flushes. HBASE-3242
> is about helping that specific case.
>
>> Nope. This happens when all the RS stay up and running. It looks like a
>> hang. It does not happen very often. After the reducers are killed the
>> subsequent attempt always succeeds, so it just increases the running time of
>> the job by ten minutes, which is OK for me for now.
>
> I think it's worth looking into it. Start by jstacking those processes
> to see where it hangs and be sure to enable DEBUG for HBase.
>
>> Do I need to consider this massive? We do this import every 8 hours and have
>> been doing so for months without trouble (without WAL), while servicing
>> reads. By nature of the stuff we store, we get it in batches. The reading
>> side of things is low volume (small number of users).
>
> I don't know anything about your write workload, so it's hard to tell
> if it's appropriate, but it's generally a better solution to prolonged
> imports.
>
>> One other option would be to detect RS failures and just re-submit the job
>> when that happens during the insert job. But this wouldn't scale (with the 8
>> RS we have, I guess we might get away with it).
>
> Or lower the number of clients.
>
>>
>> Are you referring to this:
>> http://hbase.apache.org/docs/r0.89.20100621/bulk-loads.html ? I need to do
>> read-modify-write, so I am not sure if this would work for me.
>
> Yes that page, I don't know either if it'll work for you :)
>
> J-D
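
On the point about only grepping for long pauses: a series of short collections can add up, so it may help to total the pause time per time window instead of looking at single pauses. Below is a minimal sketch of that idea, not something from this thread. It assumes GC log lines in the classic `-XX:+PrintGCTimeStamps` shape (`<uptime>: [GC ..., <pause> secs]`) and takes the last `<n> secs]` on each line as the pause; the function name `pause_per_window` and the 10-second window are made up for illustration, and the regex would need adjusting for other log formats.

```python
import re
from collections import defaultdict

# Assumed line shape (hypothetical sample, classic JVM GC log style):
#   12.345: [GC 17024K->5000K(83008K), 0.0124000 secs]
# Group 1 is the JVM uptime stamp, group 2 the last "<n> secs]" on the line.
LINE_RE = re.compile(r"^(\d+\.\d+): \[.*[^\d.](\d+\.\d+) secs\]")


def pause_per_window(lines, window_secs=10.0):
    """Sum GC pause seconds per window of JVM uptime.

    Returns a dict mapping window start (seconds) to total pause (seconds).
    Lines that do not match the assumed format are skipped.
    """
    totals = defaultdict(float)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        timestamp = float(match.group(1))
        pause = float(match.group(2))
        window_start = int(timestamp // window_secs) * window_secs
        totals[window_start] += pause
    return dict(totals)


if __name__ == "__main__":
    sample = [
        "1.000: [GC 1000K->500K(2000K), 0.0100000 secs]",
        "2.500: [GC 1000K->500K(2000K), 0.0200000 secs]",
        "12.000: [GC 1000K->500K(2000K), 0.0300000 secs]",
    ]
    for start, total in sorted(pause_per_window(sample).items()):
        print(f"window starting at {start:.0f}s: {total:.4f}s paused")
```

Windows where the total is a large fraction of the window length would be the ones to look at, even if no single pause in them is long.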
