On Mon, May 16, 2011 at 2:07 AM, Lars George <[email protected]> wrote:
> I am still stuck with this cluster not starting again, I know it is
> all local and such, therefore not really representative, but this
> ought to work, no? See this log I get at startup:
>
Do you have replication on? Is this TRUNK or the 0.90 branch? If TRUNK then we are doing distributed splitting? Sounds like a bug in here, Lars, especially if it makes for this much confusion.
St.Ack
> 2011-05-16 11:00:36,834 INFO
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker
> 10.0.0.64,60020,1305536432387 starting
> 2011-05-16 11:00:36,838 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile: Allocating
> LruBlockCache with maximum size 197.5m
> 2011-05-16 11:00:36,850 INFO
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: successfully
> transitioned task /hbase/splitlog/RESCAN0000234067 to final state done
> 2011-05-16 11:00:36,852 DEBUG
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: tasks arrived or
> departed
> 2011-05-16 11:00:36,854 INFO
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker
> 10.0.0.64,60020,1305536432387 acquired task
> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
> 2011-05-16 11:00:36,871 DEBUG
> org.apache.hadoop.hbase.monitoring.MonitoredTask: setDescritption:
> Splitting log file
> hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389into
> a temporary staging area.
> 2011-05-16 11:00:36,874 INFO
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog:
> hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389,
> length=16173236224
> 2011-05-16 11:00:36,874 DEBUG
> org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: Opening
> log file
> 2011-05-16 11:00:36,875 INFO org.apache.hadoop.hbase.util.FSUtils:
> Recovering file
> hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
> 2011-05-16 11:00:37,415 WARN
> org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error
> detected. Found 1 replicas but expecting 3 replicas. Requesting close
> of hlog.
> 2011-05-16 11:00:37,876 INFO org.apache.hadoop.hbase.util.FSUtils:
> Finished lease recover attempt for
> hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
> 2011-05-16 11:00:38,073 INFO
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: This region's
> directory doesn't exist:
> hdfs://localhost:8020/hbase/usertable/30c4d0a47703214845d0676d0c7b36f0.
> It is very likely that it was already split so it's safe to discard
> those edits.
> 2011-05-16 11:00:38,074 INFO
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: processed 0
> edits across 0 regions threw away edits for 1 regions log file =
> hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
> is corrupted = false
> 2011-05-16 11:00:38,074 DEBUG
> org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: processed
> 0 edits across 0 regions threw away edits for 1 regions log file =
> hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
> is corrupted = false
> 2011-05-16 11:00:38,074 DEBUG
> org.apache.hadoop.hbase.monitoring.MonitoredTask: markComplete:
> processed 0 edits across 0 regions threw away edits for 1 regions log
> file =
> hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
> is corrupted = false
> 2011-05-16 11:00:38,074 INFO
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker
> 10.0.0.64,60020,1305536432387 done with task
> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
> in 1217ms
> 2011-05-16 11:00:38,825 INFO
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager:
> Moving 10.0.0.64,60020,1305535848569's hlogs to my queue
>
> ==> 
/var/lib/hbase/logs/hbase-larsgeorge-5-master-de1-app-mbp-2.log <== > 2011-05-16 11:00:41,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:42,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:43,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:44,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:45,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:46,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:47,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:48,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:49,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:50,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:51,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:52,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:53,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:54,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:55,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:56,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:57,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total 
tasks = 1 > unassigned = 0 > 2011-05-16 11:00:58,691 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:00:59,692 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:01:00,692 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:01:01,692 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:01:02,692 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:01:03,692 INFO > org.apache.hadoop.hbase.master.SplitLogManager: resubmitting task > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:03,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 0 > 2011-05-16 11:01:03,693 INFO > org.apache.hadoop.hbase.master.SplitLogManager: resubmitted 1 out of 1 > tasks > 2011-05-16 11:01:03,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > ver = 28 > 2011-05-16 11:01:03,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired > /hbase/splitlog/RESCAN0000234069 ver = 0 > 2011-05-16 11:01:04,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:04,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:05,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > 
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:05,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:06,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:06,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:07,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:07,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:08,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:08,693 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:09,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:09,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:10,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > 
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:10,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:11,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:11,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:12,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:12,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:13,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:13,694 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:14,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:14,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:15,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > 
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:15,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:16,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:16,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:17,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:17,695 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:18,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:18,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:19,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:19,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:20,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > 
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:20,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:21,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:21,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:22,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:22,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:23,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:23,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:24,697 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:24,697 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:25,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > 
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:25,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:26,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:26,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:27,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:27,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:28,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:28,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:29,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389 > 2011-05-16 11:01:29,697 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 > unassigned = 1 > 2011-05-16 11:01:30,696 DEBUG > org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task > path -> > 
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
>
> I hacked the code to have the SplitLogManager delete all orphaned
> RESCAN znodes, as I ended up having hundreds of them, and there seems
> to be no way to "delete *" them, right? Is there a trick to be able to
> delete a non-empty node in zkCli?
>
> Anyhow, the split is supposedly done, or the task at least reports as
> complete, then the replication ReplicationSourceManager kicks in, and
> then the task gets relisted over and over again. After just a few
> minutes you see this in ZK's /hbase/splitlog:
>
> [RESCAN0000234200, RESCAN0000234209, RESCAN0000234207,
> RESCAN0000234208, RESCAN0000234205, RESCAN0000234206,
> RESCAN0000234203, RESCAN0000234204, RESCAN0000234201,
> RESCAN0000234202, RESCAN0000234237, RESCAN0000234236,
> RESCAN0000234235, RESCAN0000234234, RESCAN0000234239,
> RESCAN0000234238, RESCAN0000234232, RESCAN0000234233,
> RESCAN0000234230, RESCAN0000234231, RESCAN0000234219,
> RESCAN0000234218, RESCAN0000234217, RESCAN0000234216,
> RESCAN0000234215, RESCAN0000234214, RESCAN0000234213,
> RESCAN0000234212, RESCAN0000234210, RESCAN0000234211,
> RESCAN0000234228, RESCAN0000234227, RESCAN0000234229,
> RESCAN0000234224, RESCAN0000234223, RESCAN0000234226,
> RESCAN0000234225, RESCAN0000234220, RESCAN0000234221,
> RESCAN0000234222, RESCAN0000234100, RESCAN0000234101,
> RESCAN0000234107, RESCAN0000234106, RESCAN0000234109,
> RESCAN0000234108, RESCAN0000234103, RESCAN0000234102,
> RESCAN0000234105, RESCAN0000234104, RESCAN0000234111,
> RESCAN0000234112, RESCAN0000234110, RESCAN0000234116,
> RESCAN0000234115, RESCAN0000234114, RESCAN0000234113,
> RESCAN0000234119, RESCAN0000234118, RESCAN0000234117,
> RESCAN0000234120, RESCAN0000234121, RESCAN0000234122,
> RESCAN0000234123, RESCAN0000234125, RESCAN0000234124,
> RESCAN0000234127, RESCAN0000234126, RESCAN0000234129,
> RESCAN0000234128, RESCAN0000234134, RESCAN0000234133,
> RESCAN0000234132, RESCAN0000234131, RESCAN0000234130,
> RESCAN0000234139, RESCAN0000234137, RESCAN0000234138,
> RESCAN0000234135, RESCAN0000234136, RESCAN0000234143,
> RESCAN0000234142, RESCAN0000234145, RESCAN0000234144,
> RESCAN0000234141, RESCAN0000234140, RESCAN0000234146,
> RESCAN0000234147, RESCAN0000234148, RESCAN0000234149,
> RESCAN0000234152, RESCAN0000234151, RESCAN0000234150,
> RESCAN0000234156, RESCAN0000234155, RESCAN0000234154,
> RESCAN0000234153, RESCAN0000234159, RESCAN0000234157,
> RESCAN0000234158, RESCAN0000234161, RESCAN0000234160,
> RESCAN0000234163, RESCAN0000234162, RESCAN0000234165,
> RESCAN0000234164, RESCAN0000234167, RESCAN0000234166,
> RESCAN0000234168, RESCAN0000234169, RESCAN0000234179,
> RESCAN0000234175, RESCAN0000234176, RESCAN0000234177,
> RESCAN0000234178, RESCAN0000234171, RESCAN0000234172,
> RESCAN0000234173, RESCAN0000234174, RESCAN0000234170,
> RESCAN0000234188, RESCAN0000234189, RESCAN0000234186,
> RESCAN0000234187, RESCAN0000234184, RESCAN0000234185,
> RESCAN0000234182, RESCAN0000234183, RESCAN0000234180,
> RESCAN0000234181, RESCAN0000234193, RESCAN0000234194,
> RESCAN0000234195, RESCAN0000234196, RESCAN0000234197,
> RESCAN0000234198, RESCAN0000234199, RESCAN0000234190,
> RESCAN0000234191, RESCAN0000234192, RESCAN0000234070,
> RESCAN0000234071, RESCAN0000234072, RESCAN0000234073,
> RESCAN0000234074, RESCAN0000234075, RESCAN0000234076,
> RESCAN0000234077, RESCAN0000234078, RESCAN0000234079,
> RESCAN0000234081, RESCAN0000234082, RESCAN0000234080,
> RESCAN0000234085, RESCAN0000234086, RESCAN0000234083,
> RESCAN0000234084, RESCAN0000234089, RESCAN0000234087,
> RESCAN0000234088, RESCAN0000234069, RESCAN0000234099,
> RESCAN0000234098, RESCAN0000234095, RESCAN0000234094,
> RESCAN0000234097, RESCAN0000234096, RESCAN0000234091,
> RESCAN0000234090, RESCAN0000234093, RESCAN0000234092,
> hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389]
>
> After that all is stuck. Ideas?
>
> On Mon, May 16, 2011 at 7:03 AM, Lars George <[email protected]> wrote:
>> Hi,
>>
>> I am on trunk and testing in a pseudo-distributed setup. I loaded the
>> machine with YCSB and got it to break at a few million inserts during
>> the load phase with the GC taking too long and the compaction queue
>> going through the roof subsequently. Since then I cannot recover the
>> local "cluster". It is stuck printing this:
>>
>> ...
>> 2011-05-16 06:59:05,389 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
>> /hbase/splitlog/RESCAN0000148501 ver = 0
>> 2011-05-16 06:59:06,388 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
>> unassigned = 1
>> 2011-05-16 06:59:06,389 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
>> unassigned task(s) after timeout
>> 2011-05-16 06:59:06,390 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
>> /hbase/splitlog/RESCAN0000148502 ver = 0
>> 2011-05-16 06:59:07,388 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
>> unassigned = 1
>> 2011-05-16 06:59:07,388 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
>> unassigned task(s) after timeout
>> 2011-05-16 06:59:07,389 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
>> /hbase/splitlog/RESCAN0000148503 ver = 0
>> 2011-05-16 06:59:08,388 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
>> unassigned = 1
>> 2011-05-16 06:59:08,388 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
>> unassigned task(s) after timeout
>> 2011-05-16 06:59:08,389 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
>> /hbase/splitlog/RESCAN0000148504 ver = 0
>> 2011-05-16 06:59:09,388 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
>> unassigned = 1
>> 2011-05-16 06:59:09,389 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
>> unassigned task(s) after timeout
>> 2011-05-16 06:59:09,390 DEBUG
>> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
>> /hbase/splitlog/RESCAN0000148505 ver = 0
>> ...
>>
>> This keeps on going up and up. What is the right way to recover from
>> this? Delete something from ZK? Delete something from HDFS? What shell
>> commands would help?
>>
>> Thanks,
>> Lars
>>
>
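On the "delete a non-empty node" question quoted above: ZooKeeper's plain delete only removes a znode that has no children, so a client has to walk the subtree and delete children first (newer zkCli releases ship a recursive delete command, `rmr` and later `deleteall`, which may not be available on the version in use here). Below is a minimal Python sketch of that idea, not the fix Lars applied in SplitLogManager. The helper names `orphaned_rescan_nodes` and `delete_recursive`, the `InMemoryZk` stand-in, and the `example-log-task` znode are all hypothetical; the sketch only assumes a client object with `get_children(path)` and `delete(path)` methods, which is the shape kazoo's `KazooClient` offers.

```python
import re

# RESCAN znodes in the listing above look like "RESCAN" + a zero-padded sequence number.
RESCAN_PATTERN = re.compile(r"^RESCAN(\d+)$")


def orphaned_rescan_nodes(children):
    """Return the RESCAN* names from a child listing, ordered by sequence number."""
    numbered = []
    for name in children:
        match = RESCAN_PATTERN.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def delete_recursive(client, path):
    """Delete a znode and its whole subtree, children before parents."""
    for child in client.get_children(path):
        delete_recursive(client, path.rstrip("/") + "/" + child)
    client.delete(path)


class InMemoryZk:
    """Hypothetical stand-in for a ZooKeeper client, used only to run the sketch."""

    def __init__(self, paths):
        self.paths = set(paths)

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        tails = (p[len(prefix):] for p in self.paths if p.startswith(prefix))
        return sorted(t for t in tails if t and "/" not in t)

    def delete(self, path):
        # Mirror ZooKeeper's rule: a node with children cannot be deleted.
        if self.get_children(path):
            raise RuntimeError("node not empty: " + path)
        self.paths.remove(path)


zk = InMemoryZk([
    "/hbase",
    "/hbase/splitlog",
    "/hbase/splitlog/RESCAN0000234069",
    "/hbase/splitlog/RESCAN0000234070",
    "/hbase/splitlog/example-log-task",  # hypothetical non-RESCAN task znode
])

# Remove only the RESCAN markers and leave real split tasks alone.
for name in orphaned_rescan_nodes(zk.get_children("/hbase/splitlog")):
    delete_recursive(zk, "/hbase/splitlog/" + name)

remaining = zk.get_children("/hbase/splitlog")
print(remaining)
```

Filtering on the RESCAN prefix rather than wiping all of /hbase/splitlog keeps the pending log-split task in place, so the master can still see it.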
