AND... the slave was promoted to master, but the database did not come up properly and I cannot connect to it.
As I would rather not lose all the data again, I will leave things as they are for now (the new slave is starting and might not be able to complete initialization in this state) and wait a while to see if someone can respond. I don't really know where to take it from here: kill mysql on the new master, restart it normally (might fail), reboot, or shut down?

On Jan 26, 9:48 am, afishler <[email protected]> wrote:
> After losing the database completely on Saturday I started anew.
>
> Currently I am facing almost EXACTLY the same issue as here:
> http://groups.google.com/group/scalr-discuss/browse_thread/thread/b1c...
> only that my db hangs totally.
>
> To me it seems almost clear that there is an issue with the data bundle
> combined with database activity.
>
> After restarting I left the db idle or mostly doing read actions. When
> I resumed populating the db the problem surfaced again.
>
> I say again that this is a real issue. As far as I am concerned I
> cannot rely on this solution.
>
> When is the new release coming out?
>
> Arie.
>
> On Jan 24, 6:48 pm, Arie Fishler <[email protected]> wrote:
> > also... I just noticed, and this is the worst case now... the new master
> > extracted a snapshot successfully. All db files seem to be in place... the db
> > even shows me the list of tables, and still any select on the tables
> > returns an error that THE TABLE DOES NOT EXIST.
> >
> > I don't really know what this state of the database is and whether it is
> > totally lost at this point. The slave behaves exactly the same.
> >
> > On Sat, Jan 24, 2009 at 6:29 PM, Arie Fishler <[email protected]> wrote:
> > > Not exactly. The slave was 100% stuck in "not being the master". The second
> > > slave that started did not succeed in starting (several started and failed).
> > > I tried to shut down mysql on the slave before executing the script -> failed
> > > tried executing the script -> failed (it also tried to stop mysql)
> > > only then rebooted.
> > > mysql did not start, saying it executed slave2master on a non-slave.
> > > Probably the reboot did initiate the process to become a master, but the fact
> > > that I ran it manually confused it.
> > > I was left with no master up... then terminated all instances and Scalr
> > > automatically started the right instances.
> > > Naturally there was no db during this period.
> > >
> > > I will keep following.
> > >
> > > On Sat, Jan 24, 2009 at 6:14 PM, Nickolas Toursky <[email protected]> wrote:
> > >> The slave should promote itself to master automatically. If that has
> > >> failed, the only way to do it is to execute the
> > >> /usr/local/aws/bin/mysql-slave2master.sh script in the shell.
> > >> As far as I can see, you rebooted the master instance before it was
> > >> initialized. This broke the instance and did not allow the slave to
> > >> initialize properly.
> > >>
> > >> On Jan 24, 10:19 am, afishler <[email protected]> wrote:
> > >> > Hi,
> > >> >
> > >> > Currently, after losing the master again, the slave is not promoted to
> > >> > be a master. How do I do it manually?
> > >> >
> > >> > The new instance that is starting does not seem to survive and keeps
> > >> > starting over.
> > >> >
> > >> > This is really frustrating. I really don't think this is an issue of
> > >> > occasional disk issues on AWS. The master dying happens around twice a
> > >> > day... too frequent to be considered a "normal" problem.
> > >> >
> > >> > Here is the log of the new slave failing to start:
> > >> > 24-01-2009 01:46:56 ERROR i-f9800290/instance-up.sh /usr/local/aws/bin/mysql-init.sh failed. Exiting.
> > >> > 24-01-2009 01:46:55 INFO i-f9800290/mysql-init.sh Traceback (most recent call last):
> > >> >   File "/usr/bin/s3cmd", line 415, in
> > >> >     error("S3 error: " + str(e))
> > >> >   File "/usr/lib/python2.5/site-packages/S3/S3.py", line 41, in __str__
> > >> >     retval += (": %s" % self.info["Code"])
> > >> > KeyError: 'Code'. Retrying.
> > >> > 24-01-2009 01:46:55 ERROR i-f9800290/mysql-init.sh Could not fetch MySQL data snapshot using index s3://farm-1173-918348349691/farm-mysql/mysql-snapshot.tar.
> > >> > 24-01-2009 01:46:55 ERROR i-f9800290/mysql-init.sh Failed to fetch 's3://farm-1173-918348349691/farm-mysql/mysql-snapshot.tar' to '/mnt/mysql-misc/tmp.TySmbs2490/mysql-snapshot.tar' for 4 tries.
> > >> > 24-01-2009 01:46:24 INFO i-f9800290/mysql-init.sh Traceback (most recent call last):
> > >> >   File "/usr/bin/s3cmd", line 415, in
> > >> >     error("S3 error: " + str(e))
> > >> >   File "/usr/lib/python2.5/site-packages/S3/S3.py", line 41, in __str__
> > >> >     retval += (": %s" % self.info["Code"])
> > >> > KeyError: 'Code'. Retrying.
> > >> > 24-01-2009 01:46:04 INFO i-f9800290/mysql-init.sh Traceback (most recent call last):
> > >> >   File "/usr/bin/s3cmd", line 415, in
> > >> >     error("S3 error: " + str(e))
> > >> >   File "/usr/lib/python2.5/site-packages/S3/S3.py", line 41, in __str__
> > >> >     retval += (": %s" % self.info["Code"])
> > >> > KeyError: 'Code'. Retrying.
> > >> > 24-01-2009 01:45:53 INFO i-f9800290/mysql-init.sh Traceback (most recent call last):
> > >> >   File "/usr/bin/s3cmd", line 415, in
> > >> >     error("S3 error: " + str(e))
> > >> >   File "/usr/lib/python2.5/site-packages/S3/S3.py", line 41, in __str__
> > >> >     retval += (": %s" % self.info["Code"])
> > >> > KeyError: 'Code'. Retrying.
> > >> > 24-01-2009 01:45:53 INFO i-f9800290/mysql-init.sh Trying to fetch previous MySQL snapshot from s3://farm-1173-918348349691/farm-mysql/mysql-snapshot.tar (10527569920 bytes).
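A side note on the traceback in the quoted log: the KeyError is raised while s3cmd is formatting its own error message (S3.py line 41 indexes self.info["Code"]), so whatever S3 actually returned never gets printed, and only "KeyError: 'Code'" shows up in the log. The following is only a minimal Python sketch of how an error class could tolerate a missing "Code" field. The names S3ErrorSketch and describe(), and the constructor arguments, are hypothetical and made up for illustration; this is not the real s3cmd code.

```python
class S3ErrorSketch(Exception):
    """Hypothetical stand-in for an S3 error object (not the s3cmd class)."""

    def __init__(self, status, reason, info=None):
        super().__init__(status, reason)
        self.status = status
        self.reason = reason
        # Fields parsed from the error response body. Assumption: this can
        # be empty when the body is missing or not in the expected format.
        self.info = dict(info or {})

    def __str__(self):
        text = "%d (%s)" % (self.status, self.reason)
        # .get() returns None instead of raising KeyError when "Code" is absent.
        code = self.info.get("Code")
        if code is not None:
            text += ": %s" % code
        return text


def describe(error):
    """Build the log line the same way the traceback does, via str(error)."""
    return "S3 error: " + str(error)
```

With something along these lines, a response without a "Code" field would still produce a readable status line in the log, which would make it easier to tell why the snapshot fetch keeps failing.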
