hi, sorry, I don't. I think the person currently maintaining the transactional/indexed contrib is working on bringing it up to 0.89; perhaps they would enjoy your help in testing or porting the code?

I'll poke a few people into replying.

-ryan
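PS: for the indexing question further down - one stock fallback for the indexed contrib, if you can live without its atomicity, is to maintain a secondary index table by hand: write each index row with a key of indexed-value + data-row-key, then paginate with a prefix scan. A minimal sketch against the plain 0.20-era client API; the table, family, and separator names here are made up for illustration:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SecondaryIndexWrite {
        public static void main(String[] args) throws Exception {
            HBaseConfiguration conf = new HBaseConfiguration();
            HTable data  = new HTable(conf, "items");         // hypothetical data table
            HTable index = new HTable(conf, "items_by_user"); // hypothetical index table

            byte[] rowKey = Bytes.toBytes("item-42");
            byte[] userId = Bytes.toBytes("user-7");

            // 1. Write the data row.
            Put p = new Put(rowKey);
            p.add(Bytes.toBytes("d"), Bytes.toBytes("owner"), userId);
            data.put(p);

            // 2. Write the index row. Key = indexed value + separator + data
            //    row key, so a prefix Scan on "user-7|" walks that user's
            //    items in key order, regardless of which region servers
            //    hold them.
            byte[] indexKey = Bytes.add(Bytes.add(userId, Bytes.toBytes("|")), rowKey);
            Put idx = new Put(indexKey);
            idx.add(Bytes.toBytes("d"), Bytes.toBytes("ref"), rowKey);
            index.put(idx);

            // Caveat: unlike the transactional contrib, these two puts are
            // NOT atomic - a crash between them leaves the index stale.
        }
    }

For the next page, scan the index table with startRow set to the last index key seen plus a trailing zero byte.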
On Mon, Sep 20, 2010 at 5:19 PM, George P. Stathis <[email protected]> wrote:

> On Mon, Sep 20, 2010 at 4:55 PM, Ryan Rawson <[email protected]> wrote:
>
>> When you say replication, what exactly do you mean? In normal HDFS, as
>> you write, the data is sent to 3 nodes, yes, but with the flaw I
>> outlined it doesn't matter, because the datanodes and namenode will
>> pretend a data block just didn't exist if it wasn't closed properly.
>
> That's the part I was not understanding. I do now. Thanks.
>
>> So even with the most careful white-glove handling of HBase, you will
>> eventually have a crash and you will lose data without 0.89/CDH3 et al.
>> You can circumvent this by storing the data elsewhere and spooling it
>> into HBase, or perhaps by just not minding if you lose data (yes, those
>> applications exist).
>>
>> Looking at the JIRAs in question, the first is already on trunk, which
>> is 0.89. The second isn't, alas. At this point the transactional HBase
>> just isn't being actively maintained by any committer and we are
>> reliant on kind people's contributions. So I can't promise when it
>> will hit 0.89/0.90.
>
> Are you aware of any indexing alternatives in 0.89?
>
>> -ryan
>>
>> On Mon, Sep 20, 2010 at 1:21 PM, George P. Stathis <[email protected]> wrote:
>> > Thanks for the response, Ryan. I have no doubt that 0.89 can be used
>> > in production and that it has strong support. I just wanted to avoid
>> > moving to it now because we have limited resources, and it would put
>> > a dent in our roadmap if we were to fast-track the migration now.
>> > Specifically, we are using HBASE-2438 and HBASE-2426 to support
>> > pagination across indexes. So we either have to migrate those to
>> > 0.89 or somehow go stock and be able to support pagination across
>> > region servers.
>> >
>> > Of course, if the choice is between migrating or losing more data,
>> > data safety comes first. But if we can buy two or three more months
>> > of time and avoid region server crashes (like you did for a year),
>> > maybe we can go that route for now. What do we need to do to achieve
>> > that?
>> >
>> > -GS
>> >
>> > PS: Out of curiosity, I understand the WAL append issue for a single
>> > regionserver when it comes to losing the data on a single node. But
>> > if that data is also being replicated on another region server, why
>> > wouldn't it be available there? Or is the WAL shared across multiple
>> > region servers (maybe that's what I'm missing)?
>> >
>> > On Mon, Sep 20, 2010 at 3:52 PM, Ryan Rawson <[email protected]> wrote:
>> >
>> >> Hey,
>> >>
>> >> The problem is that stock 0.20 Hadoop won't let you read from a
>> >> non-closed file. It will report the length as 0. So if a
>> >> regionserver crashes, the last WAL that is still open becomes 0
>> >> length and the data within it is unreadable. That, specifically, is
>> >> the data loss problem. You could always make it so your
>> >> regionservers rarely crash - this is possible, btw, and I did it
>> >> for over a year.
>> >>
>> >> But you will want to run CDH3 or the append-branch releases to get
>> >> the series of patches that fix this hole. It also happens that only
>> >> 0.89 runs on it. I would like to avoid the Hadoop "everyone uses
>> >> 0.20 forever" problem and talk about what we could do to help you
>> >> get on 0.89. Over here at SU we've made a commitment to the future
>> >> of 0.89 and are running it in production. Let us know what else
>> >> you'd need.
>> >>
>> >> -ryan
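To make Ryan's point above concrete: on a pre-append 0.20 HDFS, the namenode never learns the length of a block that was not finalized, so a file that is still open when its writer dies reports length 0 to every reader - including the master trying to split the crashed regionserver's WAL. A minimal sketch, assuming a stock 0.20.x client on the classpath; the path is made up:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class UnclosedWalDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path wal = new Path("/demo/fake-wal"); // hypothetical path

            // Write some "edits" but never call close() - the same state a
            // regionserver's last WAL is in when the process crashes.
            FSDataOutputStream out = fs.create(wal);
            out.write("edit-1\nedit-2\n".getBytes("UTF-8"));
            out.sync(); // 0.20-era flush: bytes go out to the datanodes, but
                        // the block is never finalized at the namenode

            // The namenode's metadata still says 0 bytes, so any reader
            // (e.g., log splitting after the crash) sees an empty file.
            long len = fs.getFileStatus(wal).getLen();
            System.out.println("namenode-reported length: " + len); // 0 on stock 0.20.x
        }
    }

The CDH3 / append-branch patches Ryan mentions are what make those unfinalized bytes recoverable after a crash.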
>> >> On Mon, Sep 20, 2010 at 12:39 PM, George P. Stathis <[email protected]> wrote:
>> >> > Thanks Todd. We are not quite ready to move to 0.89 yet. We have
>> >> > made custom modifications to the transactional contrib sources,
>> >> > which are now taken out of 0.89. We are planning on moving to 0.90
>> >> > when it comes out and, at that point, either migrating our
>> >> > customizations or moving back to the out-of-the-box features
>> >> > (which will require a rewrite of our code).
>> >> >
>> >> > We are well aware of the CDH distros, but at the time we started
>> >> > with HBase, there was none that included HBase. I think CDH3 is
>> >> > the first one to include HBase, correct? And is 0.89 the only
>> >> > version supported?
>> >> >
>> >> > Moreover, are we saying that there is no way to prevent stock
>> >> > HBase 0.20.6 and Hadoop 0.20.2 from losing data when a single
>> >> > node goes down? It does not matter if the data is replicated; it
>> >> > will still get lost?
>> >> >
>> >> > -GS
>> >> >
>> >> > On Sun, Sep 19, 2010 at 5:58 PM, Todd Lipcon <[email protected]> wrote:
>> >> >
>> >> >> Hi George,
>> >> >>
>> >> >> The data loss problems you mentioned below are known issues when
>> >> >> running on stock Apache 0.20.x Hadoop.
>> >> >>
>> >> >> You should consider upgrading to CDH3b2, which includes a number
>> >> >> of HDFS patches that allow HBase to durably store data. You'll
>> >> >> also have to upgrade to HBase 0.89 - we ship a version as part
>> >> >> of CDH that will work well.
>> >> >>
>> >> >> Thanks
>> >> >> -Todd
>> >> >>
>> >> >> On Sun, Sep 19, 2010 at 6:57 AM, George P. Stathis <[email protected]> wrote:
>> >> >>
>> >> >> > Hi folks. I'd like to run the following data loss scenario by
>> >> >> > you to see if we are doing something obviously wrong with our
>> >> >> > setup here.
>> >> >> >
>> >> >> > Setup:
>> >> >> >
>> >> >> > - Hadoop 0.20.1
>> >> >> > - HBase 0.20.3
>> >> >> > - 1 master node running the NameNode, SecondaryNameNode,
>> >> >> >   JobTracker, HMaster, and 1 ZooKeeper (no ZooKeeper quorum
>> >> >> >   right now)
>> >> >> > - 4 child nodes, each running a DataNode, TaskTracker, and
>> >> >> >   RegionServer
>> >> >> > - dfs.replication is set to 2
>> >> >> > - Host: Amazon EC2
>> >> >> >
>> >> >> > Up until yesterday, we were frequently experiencing HBASE-2077
>> >> >> > <https://issues.apache.org/jira/browse/HBASE-2077>, which kept
>> >> >> > bringing our RegionServers down. What we realized, though, is
>> >> >> > that we were losing data (a few hours' worth) with just one
>> >> >> > out of four regionservers going down. This is problematic
>> >> >> > since we are supposed to replicate at x2 out of 4 nodes, so at
>> >> >> > least one other node should theoretically be able to serve the
>> >> >> > data that the downed regionserver can't.
>> >> >> >
>> >> >> > Questions:
>> >> >> >
>> >> >> > - When a regionserver goes down unexpectedly, the only data
>> >> >> >   that theoretically gets lost is whatever didn't make it to
>> >> >> >   the WAL, right? Or wrong? E.g.
>> >> >> >   http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
>> >> >> > - We ran a hadoop fsck on our cluster and verified the
>> >> >> >   replication factor, as well as that there were no
>> >> >> >   under-replicated blocks. So why was our data not available
>> >> >> >   from another node?
>> >> >> > - If the log gets rolled every 60 minutes by default (we
>> >> >> >   haven't touched the defaults), how can we lose data from up
>> >> >> >   to 24 hours ago?
>> >> >> > - When the downed regionserver comes back up, shouldn't that
>> >> >> >   data be available again? Ours wasn't.
>> >> >> > - In such scenarios, is there a recommended approach for
>> >> >> >   restoring the regionserver that goes down? We just brought
>> >> >> >   them back up by logging on to the node itself and manually
>> >> >> >   restarting them first. Now we have automated crons that
>> >> >> >   listen for their ports and restart them within two minutes
>> >> >> >   if they go down (see the watchdog sketch at the end of this
>> >> >> >   thread).
>> >> >> > - Is there a way to recover such lost data?
>> >> >> > - Are versions 0.89 / 0.90 addressing any of these issues?
>> >> >> > - Curiosity question: when a regionserver goes down, does the
>> >> >> >   master try to replicate that node's data on another node to
>> >> >> >   satisfy the dfs.replication ratio?
>> >> >> >
>> >> >> > For now, we have upgraded our HBase to 0.20.6, which is
>> >> >> > supposed to contain the HBASE-2077
>> >> >> > <https://issues.apache.org/jira/browse/HBASE-2077> fix (but no
>> >> >> > one has verified that yet). Lars' blog also suggests that
>> >> >> > Hadoop 0.21.0 is the way to go to avoid the file append
>> >> >> > issues, but it's not production-ready yet. Should we stick to
>> >> >> > 0.20.1? Upgrade to 0.20.2?
>> >> >> >
>> >> >> > Any tips here are definitely appreciated. I'll be happy to
>> >> >> > provide more information as well.
>> >> >> >
>> >> >> > -GS
>> >> >>
>> >> >> --
>> >> >> Todd Lipcon
>> >> >> Software Engineer, Cloudera
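PS: re the restart crons mentioned above - a minimal sketch of such a port watchdog, for the curious. 60020 is the default regionserver port, but the hbase-daemon.sh path below is an assumption; point it at your own install:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class RegionServerWatchdog {
        public static void main(String[] args) throws Exception {
            final String host = "localhost";
            final int port = 60020; // default regionserver port
            while (true) {
                if (!isUp(host, port)) {
                    System.err.println("regionserver port closed; restarting...");
                    // Script location is an assumption - adjust for your install.
                    new ProcessBuilder("/usr/lib/hbase/bin/hbase-daemon.sh",
                            "start", "regionserver").start().waitFor();
                }
                Thread.sleep(120 * 1000L); // probe every two minutes, like the cron
            }
        }

        static boolean isUp(String host, int port) {
            Socket s = new Socket();
            try {
                s.connect(new InetSocketAddress(host, port), 5000);
                return true;
            } catch (IOException e) {
                return false;
            } finally {
                try { s.close(); } catch (IOException ignored) {}
            }
        }
    }

Note that the caveat from earlier in the thread still applies: restarting quickly limits the damage window, but on stock 0.20 the edits in the crashed server's open WAL are already unreadable.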
