Well, I hit the same error again after terminating my machine with a hard stop. That isn't a normal way to shut things down, but I had fat-fingered saving an AMI image of it, thinking I could boot back up just fine afterward.

The only workaround I could find was to blow away the HDFS /accumulo directory and re-init my Accumulo instance, which is fine for playing around, but I'm wondering what exactly is going on. I don't want this to happen if I go to production with real data. Thoughts on how to debug?
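For the record, the reset amounted to something like this (a sketch, assuming Hadoop 2.x shell syntax and that nothing in the instance needs to survive; init prompts for a new instance name and root password):

hduser@accumulo:~$ $ACCUMULO_HOME/bin/stop-all.sh
hduser@accumulo:~$ hadoop fs -rm -r /accumulo        # 'hadoop fs -rmr /accumulo' on Hadoop 1.x
hduser@accumulo:~$ $ACCUMULO_HOME/bin/accumulo init
hduser@accumulo:~$ $ACCUMULO_HOME/bin/start-all.sh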
On Tue, Jan 6, 2015 at 10:40 AM, Keith Turner <[email protected]> wrote:
>
> On Mon, Jan 5, 2015 at 6:50 PM, Mike Atlas <[email protected]> wrote:
>
>> Hello,
>>
>> I'm running Accumulo 1.5.2, trying to test out the GeoMesa
>> <http://www.geomesa.org/2014/05/28/geomesa-quickstart/> family of
>> spatio-temporal iterators using their quickstart demonstration tool. I
>> think I'm not making progress due to my Accumulo setup, though, so can
>> someone validate that it all looks good from here?
>>
>> start-all.sh output:
>>
>> hduser@accumulo:~$ $ACCUMULO_HOME/bin/start-all.sh
>> Starting monitor on localhost
>> Starting tablet servers .... done
>> Starting tablet server on localhost
>> 2015-01-05 21:37:18,523 [server.Accumulo] INFO : Attempting to talk to zookeeper
>> 2015-01-05 21:37:18,772 [server.Accumulo] INFO : Zookeeper connected and initialized, attempting to talk to HDFS
>> 2015-01-05 21:37:19,028 [server.Accumulo] INFO : Connected to HDFS
>> Starting master on localhost
>> Starting garbage collector on localhost
>> Starting tracer on localhost
>>
>> I do believe my HDFS is set up correctly:
>>
>> hduser@accumulo:/home/ubuntu/geomesa-quickstart$ hadoop fs -ls /accumulo
>> Found 5 items
>> drwxrwxrwx   - hduser supergroup          0 2014-12-10 01:04 /accumulo/instance_id
>> drwxrwxrwx   - hduser supergroup          0 2015-01-05 21:22 /accumulo/recovery
>> drwxrwxrwx   - hduser supergroup          0 2015-01-05 20:14 /accumulo/tables
>> drwxrwxrwx   - hduser supergroup          0 2014-12-10 01:04 /accumulo/version
>> drwxrwxrwx   - hduser supergroup          0 2014-12-10 01:05 /accumulo/wal
>>
>> However, when I check the Accumulo monitor logs, I see these errors
>> post-startup:
>>
>> java.io.IOException: Mkdirs failed to create directory /accumulo/recovery/15664488-bd10-4d8d-9584-f88d8595a07c/part-r-00000
>> java.io.IOException: Mkdirs failed to create directory /accumulo/recovery/15664488-bd10-4d8d-9584-f88d8595a07c/part-r-00000
>>     at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:264)
>>     at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:103)
>>     at org.apache.accumulo.server.tabletserver.log.LogSorter$LogProcessor.writeBuffer(LogSorter.java:196)
>>     at org.apache.accumulo.server.tabletserver.log.LogSorter$LogProcessor.sort(LogSorter.java:166)
>>     at org.apache.accumulo.server.tabletserver.log.LogSorter$LogProcessor.process(LogSorter.java:89)
>>     at org.apache.accumulo.server.zookeeper.DistributedWorkQueue$1.run(DistributedWorkQueue.java:101)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>>     at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> I don't really understand: I started Accumulo as hduser, which is the
>> same user that has access to the HDFS directory /accumulo/recovery, and
>> it looks like the directory actually was created, except for the last
>> component (part-r-00000):
>>
>> hduser@accumulo:~$ hadoop fs -ls /accumulo/recovery/
>> Found 1 items
>> drwxr-xr-x   - hduser supergroup          0 2015-01-05 22:11 /accumulo/recovery/87fb7aac-0274-4aea-8014-9d53dbbdfbbc
>>
>> I'm not out of physical disk space:
>>
>> hduser@accumulo:~$ df -h
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/xvda1     1008G  8.5G  959G   1% /
>>
>> What could be going on here? Any ideas on something simple I could have
>> missed?
>>
>
> One possibility is that the tserver where the exception occurred had bad
> or missing config for HDFS. In that case the Hadoop code may try to create
> /accumulo/recovery/.../part-r-00000 in the local fs, which would fail.
>
>> Thanks,
>> Mike
>
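Following up on Keith's suggestion, my next step is to verify that the tserver is actually picking up my HDFS config rather than defaulting to the local filesystem. Something like this (a sketch, assuming Hadoop 2.x's 'hdfs getconf' and Accumulo 1.5's 'accumulo classpath' command):

hduser@accumulo:~$ hdfs getconf -confKey fs.defaultFS    # should print hdfs://..., not file:///
hduser@accumulo:~$ $ACCUMULO_HOME/bin/accumulo classpath | grep -i conf    # my Hadoop conf dir should show up here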

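That theory would also explain why Mkdirs fails even though the HDFS permissions look wide open: creating /accumulo/recovery/... against the local filesystem means writing under the local root, which hduser can't do. A quick way to tell the two apart (sketch; the UUID path differs per recovery attempt):

hduser@accumulo:~$ ls -ld /accumulo 2>/dev/null || echo "no /accumulo on the local fs"
hduser@accumulo:~$ hadoop fs -ls /accumulo/recovery/    # the HDFS side, for comparison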