Krishmin,

Thank you for the response. It's always great to hear from someone who has tried out the steps (even if you had a different issue). Like you said, I am not really sure what caused the crash in our env in the first place, but having a plan is always good...
Thanks again all,
Aji

On Wed, Mar 27, 2013 at 5:00 PM, Krishmin Rai <[email protected]> wrote:

> Hi Aji,
> I wrote the original question linked below (about re-initing Accumulo over
> an existing installation). For what it's worth, I believe that my
> ZooKeeper data loss was related to the linux+java leap second
> bug <https://access.redhat.com/knowledge/articles/15145> -- not
> likely to be affecting you now (I did not go back and attempt to re-create
> the issue, so it's also possible there were other compounding issues). We
> have not encountered any ZK data-loss problems since.
>
> At the time, I did some basic experiments to understand the process
> better, and successfully followed (essentially) the steps Eric has
> described. The only real difficulty I had was identifying which directories
> corresponded to which tables; I ended up iterating over individual RFiles
> and manually identifying tables based on expected data. This was a somewhat
> painful process, but at least made me confident that it would be possible
> in production.
>
> It's also important to note that, at least according to my understanding,
> this procedure still potentially loses data: mutations written after the
> last minor compaction will only have reached the write-ahead-logs and will
> not be available in the raw RFiles you're importing from.
>
> -Krishmin
>
> On Mar 27, 2013, at 4:45 PM, Aji Janis wrote:
>
> Eric, Really appreciate you jotting this down. Too late to try it out this
> time but will give this a try (if, hopefully not) there is a next time to
> be had.
>
> Thanks again.
>
> On Wed, Mar 27, 2013 at 4:19 PM, Eric Newton <[email protected]> wrote:
>
>> I should write this up in the user manual. It's not that hard, but it's
>> really not the first thing you want to tackle while learning how to use
>> accumulo. I just opened
>> ACCUMULO-1217 <https://issues.apache.org/jira/browse/ACCUMULO-1217> to
>> do that.
>>
>> I wrote this from memory: expect errors.
>> Needless to say, you would only
>> want to do this when you are more comfortable with hadoop, zookeeper and
>> accumulo.
>>
>> First, get zookeeper up and running, even if you have to delete all its
>> data.
>>
>> Next, attempt to determine the mapping of table names to tableIds. You
>> can do this in the shell when your accumulo instance is healthy. If it
>> isn't healthy, you will have to guess based on the data in the files in
>> HDFS.
>>
>> So, for example, the table "trace" is probably table id "1". You can
>> find the files for trace in /accumulo/tables/1.
>>
>> Don't worry if you get the names wrong. You can always rename the tables
>> later.
>>
>> Move the old files for accumulo out of the way and re-initialize:
>>
>> $ hadoop fs -mv /accumulo /accumulo-old
>> $ ./bin/accumulo init
>> $ ./bin/start-all.sh
>>
>> Recreate your tables:
>>
>> $ ./bin/accumulo shell -u root -p mysecret
>> shell > createtable table1
>>
>> Learn the new table id mapping:
>>
>> shell > tables -l
>> !METADATA => !0
>> trace => 1
>> table1 => 2
>> ...
>>
>> Bulk import all your data back into the new table ids:
>> Assuming you have determined that "table1" used to be table id "a" and is
>> now "2", you do something like this:
>>
>> $ hadoop fs -mkdir /tmp/failed
>> $ ./bin/accumulo shell -u root -p mysecret
>> shell > table table1
>> shell table1 > importdirectory /accumulo-old/tables/a/default_tablet /tmp/failed true
>>
>> There are lots of directories under every table id directory. You will
>> need to import each of them. I suggest creating a script and passing it to
>> the shell on the command line.
>>
>> I know of instances in which trillions of entries were recovered and
>> available in a matter of hours.
>>
>> -Eric
>>
>> On Wed, Mar 27, 2013 at 3:39 PM, Aji Janis <[email protected]> wrote:
>>
>>> When you say "you can move the files aside in HDFS" .. which files are
>>> you referring to?
>>> I have never set up zookeeper myself so I am not aware of
>>> all the changes needed.
>>>
>>> On Wed, Mar 27, 2013 at 3:33 PM, Eric Newton <[email protected]> wrote:
>>>
>>>> If you lose zookeeper, you can move the files aside in HDFS, recreate
>>>> your instance in zookeeper and bulk import all of the old files. It's not
>>>> perfect: you lose table configurations, split points and user permissions,
>>>> but you do preserve most of the data.
>>>>
>>>> You can back up each of these bits of information periodically if you
>>>> like. Outside of the files in HDFS, the configuration information is
>>>> pretty small.
>>>>
>>>> -Eric
>>>>
>>>> On Wed, Mar 27, 2013 at 3:18 PM, Aji Janis <[email protected]> wrote:
>>>>
>>>>> Eric and Josh, thanks for all your feedback. We ended up *losing all
>>>>> our accumulo data* because I had to reformat hadoop. Here is, in a
>>>>> nutshell, what I did:
>>>>>
>>>>> 1. Stop accumulo
>>>>> 2. Stop hadoop
>>>>> 3. On hadoop master and all datanodes, from dfs.data.dir
>>>>>    (hdfs-site.xml) remove everything under the data folder
>>>>> 4. On hadoop master, from dfs.name.dir (hdfs-site.xml) remove
>>>>>    everything under the name folder
>>>>> 5. As hadoop user, execute .../hadoop/bin/hadoop namenode -format
>>>>> 6. As hadoop user, execute .../hadoop/bin/start-all.sh ==> should
>>>>>    populate the data/ and name/ dirs that were erased in steps 3, 4.
>>>>> 7. Initialize Accumulo - as accumulo user,
>>>>>    ../accumulo/bin/accumulo init (I created a new instance)
>>>>> 8. Start accumulo
>>>>>
>>>>> I was wondering if anyone had suggestions or thoughts on how I could
>>>>> have solved the original issue of accumulo waiting for initialization
>>>>> without losing my accumulo data? Is it possible to do so?
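
For anyone finding this thread later: Eric's suggestion of creating a script of importdirectory commands and passing it to the shell could be sketched roughly as below. This is only a minimal sketch under stated assumptions, not a tested recovery tool. The function name build_import_script and the tablet directory name "t-0001" are hypothetical, the list of directories is assumed to have been gathered separately (for example from a listing of /accumulo-old/tables/a), and reusing one failure directory for every import is a simplifying assumption.

```python
def build_import_script(table_name, tablet_dirs, failure_dir):
    """Build Accumulo shell commands that bulk import each old directory.

    table_name  -- the re-created table to import into (e.g. "table1")
    tablet_dirs -- HDFS directories under the old table id directory
    failure_dir -- HDFS directory for files that fail to import
                   (assumption: one shared directory for all imports)
    """
    # Switch to the target table first, as in the interactive example.
    lines = ["table " + table_name]
    for directory in tablet_dirs:
        # Same form as the command in the thread:
        #   importdirectory <directory> <failure directory> true
        lines.append("importdirectory {} {} true".format(directory, failure_dir))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical directory names, for illustration only.
    old_dirs = [
        "/accumulo-old/tables/a/default_tablet",
        "/accumulo-old/tables/a/t-0001",
    ]
    print(build_import_script("table1", old_dirs, "/tmp/failed"), end="")
```

The printed commands can be saved to a file and handed to the Accumulo shell. I believe the shell accepts a command file through an execute-file option (-f), but check the shell's help output on your release before relying on that.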
