I am still stuck with this cluster not starting again. I know it is all local and therefore not really representative, but this ought to work, no? Here is the log I get at startup:

2011-05-16 11:00:36,834 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker 10.0.0.64,60020,1305536432387 starting
2011-05-16 11:00:36,838 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Allocating LruBlockCache with maximum size 197.5m
2011-05-16 11:00:36,850 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: successfully transitioned task /hbase/splitlog/RESCAN0000234067 to final state done
2011-05-16 11:00:36,852 DEBUG org.apache.hadoop.hbase.regionserver.SplitLogWorker: tasks arrived or departed
2011-05-16 11:00:36,854 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker 10.0.0.64,60020,1305536432387 acquired task /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:00:36,871 DEBUG org.apache.hadoop.hbase.monitoring.MonitoredTask: setDescritption: Splitting log file hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389into a temporary staging area.
2011-05-16 11:00:36,874 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog: hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389, length=16173236224
2011-05-16 11:00:36,874 DEBUG org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: Opening log file
2011-05-16 11:00:36,875 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering file hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
2011-05-16 11:00:37,415 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 3 replicas. Requesting close of hlog.
2011-05-16 11:00:37,876 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
2011-05-16 11:00:38,073 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: This region's directory doesn't exist: hdfs://localhost:8020/hbase/usertable/30c4d0a47703214845d0676d0c7b36f0. It is very likely that it was already split so it's safe to discard those edits.
2011-05-16 11:00:38,074 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: processed 0 edits across 0 regions threw away edits for 1 regions log file = hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389 is corrupted = false
2011-05-16 11:00:38,074 DEBUG org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: processed 0 edits across 0 regions threw away edits for 1 regions log file = hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389 is corrupted = false
2011-05-16 11:00:38,074 DEBUG org.apache.hadoop.hbase.monitoring.MonitoredTask: markComplete: processed 0 edits across 0 regions threw away edits for 1 regions log file = hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389 is corrupted = false
2011-05-16 11:00:38,074 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker 10.0.0.64,60020,1305536432387 done with task /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 in 1217ms
2011-05-16 11:00:38,825 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Moving 10.0.0.64,60020,1305535848569's hlogs to my queue

==> /var/lib/hbase/logs/hbase-larsgeorge-5-master-de1-app-mbp-2.log <==
2011-05-16 11:00:41,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:42,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:43,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:44,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:45,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:46,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:47,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:48,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:49,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:50,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:51,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:52,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:53,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:54,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:55,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:56,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:57,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:58,691 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:00:59,692 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:01:00,692 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:01:01,692 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:01:02,692 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:01:03,692 INFO org.apache.hadoop.hbase.master.SplitLogManager: resubmitting task /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:03,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2011-05-16 11:01:03,693 INFO org.apache.hadoop.hbase.master.SplitLogManager: resubmitted 1 out of 1 tasks
2011-05-16 11:01:03,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 ver = 28
2011-05-16 11:01:03,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/RESCAN0000234069 ver = 0
2011-05-16 11:01:04,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:04,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:05,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:05,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:06,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:06,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:07,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:07,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:08,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:08,693 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:09,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:09,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:10,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:10,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:11,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:11,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:12,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:12,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:13,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:13,694 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:14,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:14,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:15,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:15,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:16,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:16,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:17,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:17,695 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:18,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:18,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:19,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:19,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:20,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:20,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:21,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:21,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:22,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:22,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:23,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:23,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:24,697 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:24,697 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:25,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:25,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:26,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:26,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:27,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:27,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:28,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:28,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:29,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:29,697 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
2011-05-16 11:01:30,696 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389

I hacked the code to have the SplitLogManager delete all orphaned RESCAN znodes, as I ended up having hundreds of them, and there seems to be no way to "delete *" them, right? Is there a trick to delete a non-empty node in zkCli? Anyhow, the split is supposedly done, or at least the task reports as complete. Then the ReplicationSourceManager kicks in, and after that the task gets relisted over and over again.
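
On the question of deleting a non-empty znode: ZooKeeper refuses to delete a node that still has children, so any "delete everything under here" has to remove children first and the parent last. Below is a minimal Python sketch of that idea, run against an in-memory stand-in rather than a real ensemble. `FakeZk`, `delete_recursive`, and `delete_rescan_children` are hypothetical names made up for illustration; they are not HBase or ZooKeeper APIs, and this is not the actual change made to SplitLogManager.

```python
class FakeZk:
    """In-memory stand-in for a ZooKeeper client (hypothetical, illustration only)."""

    def __init__(self, paths):
        self.nodes = set(paths)

    def get_children(self, path):
        # Return the names of direct children of `path`, sorted for determinism.
        prefix = path.rstrip("/") + "/"
        names = []
        for node in self.nodes:
            if node.startswith(prefix):
                rest = node[len(prefix):]
                if rest and "/" not in rest:
                    names.append(rest)
        return sorted(names)

    def delete(self, path):
        # Mimic ZooKeeper's rule: a node with children cannot be deleted.
        if self.get_children(path):
            raise RuntimeError("node not empty: " + path)
        self.nodes.remove(path)


def delete_recursive(zk, path):
    """Delete `path` and everything below it, children first (post-order)."""
    for child in zk.get_children(path):
        delete_recursive(zk, path.rstrip("/") + "/" + child)
    zk.delete(path)


def delete_rescan_children(zk, parent):
    """Delete only the children of `parent` whose name starts with RESCAN."""
    removed = [c for c in zk.get_children(parent) if c.startswith("RESCAN")]
    for child in removed:
        delete_recursive(zk, parent.rstrip("/") + "/" + child)
    return removed


if __name__ == "__main__":
    zk = FakeZk([
        "/hbase",
        "/hbase/splitlog",
        "/hbase/splitlog/RESCAN0000234069",
        "/hbase/splitlog/RESCAN0000234070",
        "/hbase/splitlog/some-real-task",
    ])
    print(delete_rescan_children(zk, "/hbase/splitlog"))
    print(zk.get_children("/hbase/splitlog"))
```

Against a live cluster the same two operations (list children, delete a leaf) would come from a real client. Some client libraries offer the recursive form directly, for example kazoo's `delete(path, recursive=True)`. Whether removing these znodes by hand is safe while the master is running is a separate question that this sketch does not answer.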
After just a few minutes you see this in ZK's /hbase/splitlog:

[RESCAN0000234200, RESCAN0000234209, RESCAN0000234207, RESCAN0000234208, RESCAN0000234205, RESCAN0000234206, RESCAN0000234203, RESCAN0000234204, RESCAN0000234201, RESCAN0000234202, RESCAN0000234237, RESCAN0000234236, RESCAN0000234235, RESCAN0000234234, RESCAN0000234239, RESCAN0000234238, RESCAN0000234232, RESCAN0000234233, RESCAN0000234230, RESCAN0000234231, RESCAN0000234219, RESCAN0000234218, RESCAN0000234217, RESCAN0000234216, RESCAN0000234215, RESCAN0000234214, RESCAN0000234213, RESCAN0000234212, RESCAN0000234210, RESCAN0000234211, RESCAN0000234228, RESCAN0000234227, RESCAN0000234229, RESCAN0000234224, RESCAN0000234223, RESCAN0000234226, RESCAN0000234225, RESCAN0000234220, RESCAN0000234221, RESCAN0000234222, RESCAN0000234100, RESCAN0000234101, RESCAN0000234107, RESCAN0000234106, RESCAN0000234109, RESCAN0000234108, RESCAN0000234103, RESCAN0000234102, RESCAN0000234105, RESCAN0000234104, RESCAN0000234111, RESCAN0000234112, RESCAN0000234110, RESCAN0000234116, RESCAN0000234115, RESCAN0000234114, RESCAN0000234113, RESCAN0000234119, RESCAN0000234118, RESCAN0000234117, RESCAN0000234120, RESCAN0000234121, RESCAN0000234122, RESCAN0000234123, RESCAN0000234125, RESCAN0000234124, RESCAN0000234127, RESCAN0000234126, RESCAN0000234129, RESCAN0000234128, RESCAN0000234134, RESCAN0000234133, RESCAN0000234132, RESCAN0000234131, RESCAN0000234130, RESCAN0000234139, RESCAN0000234137, RESCAN0000234138, RESCAN0000234135, RESCAN0000234136, RESCAN0000234143, RESCAN0000234142, RESCAN0000234145, RESCAN0000234144, RESCAN0000234141, RESCAN0000234140, RESCAN0000234146, RESCAN0000234147, RESCAN0000234148, RESCAN0000234149, RESCAN0000234152, RESCAN0000234151, RESCAN0000234150, RESCAN0000234156, RESCAN0000234155, RESCAN0000234154, RESCAN0000234153, RESCAN0000234159, RESCAN0000234157, RESCAN0000234158, RESCAN0000234161, RESCAN0000234160, RESCAN0000234163, RESCAN0000234162, RESCAN0000234165, RESCAN0000234164, RESCAN0000234167, RESCAN0000234166, RESCAN0000234168, RESCAN0000234169, RESCAN0000234179, RESCAN0000234175, RESCAN0000234176, RESCAN0000234177, RESCAN0000234178, RESCAN0000234171, RESCAN0000234172, RESCAN0000234173, RESCAN0000234174, RESCAN0000234170, RESCAN0000234188, RESCAN0000234189, RESCAN0000234186, RESCAN0000234187, RESCAN0000234184, RESCAN0000234185, RESCAN0000234182, RESCAN0000234183, RESCAN0000234180, RESCAN0000234181, RESCAN0000234193, RESCAN0000234194, RESCAN0000234195, RESCAN0000234196, RESCAN0000234197, RESCAN0000234198, RESCAN0000234199, RESCAN0000234190, RESCAN0000234191, RESCAN0000234192, RESCAN0000234070, RESCAN0000234071, RESCAN0000234072, RESCAN0000234073, RESCAN0000234074, RESCAN0000234075, RESCAN0000234076, RESCAN0000234077, RESCAN0000234078, RESCAN0000234079, RESCAN0000234081, RESCAN0000234082, RESCAN0000234080, RESCAN0000234085, RESCAN0000234086, RESCAN0000234083, RESCAN0000234084, RESCAN0000234089, RESCAN0000234087, RESCAN0000234088, RESCAN0000234069, RESCAN0000234099, RESCAN0000234098, RESCAN0000234095, RESCAN0000234094, RESCAN0000234097, RESCAN0000234096, RESCAN0000234091, RESCAN0000234090, RESCAN0000234093, RESCAN0000234092, hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389]

After that, everything is stuck. Ideas?

On Mon, May 16, 2011 at 7:03 AM, Lars George <[email protected]> wrote:
> Hi,
>
> I am on trunk and testing in pseudo distributed setup. I loaded the
> machine with YCSB and got it to break at a few million inserts during
> the load phase with the GC taking too long and the compaction queue
> going through the roof subsequently. Since then I cannot recover the
> local "cluster". It is stuck printing this:
>
> ...
> 2011-05-16 06:59:05,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148501 ver = 0
> 2011-05-16 06:59:06,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:06,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:06,390 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148502 ver = 0
> 2011-05-16 06:59:07,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:07,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:07,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148503 ver = 0
> 2011-05-16 06:59:08,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:08,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:08,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148504 ver = 0
> 2011-05-16 06:59:09,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:09,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:09,390 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148505 ver = 0
> ...
>
> This keeps on going up and up. What is the right way to recover from
> this? Delete something from ZK? Delete something from HDFS? What shell
> commands would help?
>
> Thanks,
> Lars
>
