It depends on how much data you're writing. I can't answer that for ya.
Generally for Hadoop, you want to stay below 80-90% utilization (HDFS
will limit you to 90 or 95% capacity usage by default, IIRC).
If you're running things like MapReduce, you'll need more headroom to
account for temporary output, jars being copied, etc. Accumulo has some
lag in freeing disk space (e.g. during a compaction, you'll have double
the space usage for the files being re-written), as does HDFS in
actually deleting the blocks for files that were deleted.
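If you want to enforce some headroom at the HDFS level, one knob worth
a look is dfs.datanode.du.reserved in hdfs-site.xml. A sketch (the
value here is only an example; size it for your disks):

  <!-- Reserve space per volume that the DataNode will not use for blocks.
       Value is in bytes; 2GB shown only as an illustration. -->
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>2147483648</value>
  </property>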
Jayesh Patel wrote:
So what would you consider a safe minimum amount of disk space in this case?
Thank you,
Jayesh
-----Original Message-----
From: Josh Elser [mailto:[email protected]]
Sent: Thursday, June 02, 2016 1:08 AM
To: [email protected]
Subject: Re: walog consumes all the disk space on power failure
Oh. Why do you only have 16GB of space...
You might be able to tweak some of the configuration properties so that
Accumulo is more aggressive in removing files, but I think you'd just
kick the can down the road for another ~30 minutes.
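For reference, the kind of knobs I mean (in accumulo-site.xml) are the
garbage collector cycle timing and the WAL size cap. The values below
are only a sketch, not a recommendation:

  <!-- Run the Accumulo GC more frequently than the default 5m cycle delay
       and cap individual walogs below the default 1G. Example values only. -->
  <property>
    <name>gc.cycle.delay</name>
    <value>1m</value>
  </property>
  <property>
    <name>tserver.walog.max.size</name>
    <value>512M</value>
  </property>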
Jayesh Patel wrote:
All 3 nodes have 16GB of disk space, which was 98% consumed when we
looked at them a few hours after the power failed and was restored.
Normally it's only 33%, or about 5GB.
Once it got into this state, ZooKeeper couldn't even start because it
couldn't create some logfiles that it needs. So the disk space usage
was real, not sure if you meant that or not. We ended up wiping the
HDFS data folder and reformatting it to reclaim the space.
We definitely didn't see complaints about writing to WALs. The only
exception is the following, which showed up because the NameNode wasn't
in the right state due to constrained resources:
2016-05-23 07:06:17,599 [recovery.HadoopLogCloser] WARN : Error recovering lease on hdfs://instance-accumulo:8020/accumulo/wal/instance-accumulo-3+9997/530f663b-2d6b-42a5-92d6-e8fbb9b55c2e
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot recover the lease of /accumulo/wal/instance-accumulo-3+9997/530f663b-2d6b-42a5-92d6-e8fbb9b55c2e. Name node is in safe mode.
Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1327)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2828)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:667)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:663)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
        at org.apache.hadoop.ipc.Client.call(Client.java:1476)
        at org.apache.hadoop.ipc.Client.call(Client.java:1407)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy15.recoverLease(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.recoverLease(ClientNamenode
-----Original Message-----
From: Josh Elser [mailto:[email protected]]
Sent: Tuesday, May 31, 2016 6:54 PM
To: [email protected]
Subject: Re: walog consumes all the disk space on power failure
Hi Jayesh,
Can you quantify some rough size numbers for us? Are you seeing
exceptions in the Accumulo tserver/master logs?
One thought is that when Accumulo creates new WAL files, it sets the
blocksize to 1G (as a trick to force HDFS into making some
"non-standard" guarantees for us). As a result, it will appear that
there are a number of very large WAL files (but they're essentially
empty).
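For what it's worth, the block size requested for WALs is governed by
tserver.wal.blocksize; my understanding is that when it's left at the
default of 0, Accumulo derives it from tserver.walog.max.size. A sketch
of overriding it in accumulo-site.xml, value shown only as an
illustration:

  <!-- Explicitly set the HDFS block size Accumulo requests for WAL files.
       0 (the default) derives the value from tserver.walog.max.size. -->
  <property>
    <name>tserver.wal.blocksize</name>
    <value>1G</value>
  </property>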
If your instance is in some situation where Accumulo is repeatedly
failing to write to a WAL, it might think the WAL is bad, abandon it,
and try to create a new one. If that is happening over and over, I
could see it explaining the situation you described. However, you
should see the TabletServers complaining loudly that they cannot write
to the WALs.
Jayesh Patel wrote:
We have a 3 node Accumulo 1.7 cluster running as VMware VMs with a
minute amount of data compared to Accumulo standards.
We have run into a situation multiple times now where all the nodes
lose power, and when they are trying to recover from it simultaneously,
the walog grows rapidly and fills up all the available disk space. We
have confirmed that the walog folder under /accumulo in HDFS is
consuming 99% of the disk space.
We have tried freeing enough space to be able to run the Accumulo
processes in the hope that they would burn through the walog, but
without success: the walog just grew to take up the freed space.
Given that we need to better manage the power situation, we're trying
to understand what could be causing this and if there's anything we
can do to avoid this situation.
In case you're wondering, we have some heartbeat data being written to
a table at a very small, constant rate, which is not sufficient to
cause such a large write-ahead log even if HDFS was pulled out from
under Accumulo's feet, so to speak, during the power failure.
Thank you,
Jayesh