ctubbsii commented on code in PR #411:
URL: https://github.com/apache/accumulo-website/pull/411#discussion_r1433295123
##########
_docs-2/troubleshooting/advanced.md:
##########
@@ -249,21 +249,31 @@ metadata table!), the following process can be followed to create a valid, empty
WAL file. Run the following commands as the Accumulo unix user (to ensure that
the proper file permissions in HDFS)
- $ echo -n -e '--- Log File Header (v2) ---\x00\x00\x00\x00' > empty.wal
-
-The above creates a file with the text "--- Log File Header (v2) ---" and then
-four bytes. You should verify the contents of the file with a hexdump tool.
-
-Then, place this empty WAL in HDFS and then replace the corrupt WAL file in HDFS
-with the empty WAL.
-
- $ hdfs dfs -moveFromLocal empty.wal /user/accumulo/empty.wal
- $ hdfs dfs -mv /user/accumulo/empty.wal /accumulo/wal/tserver-4.example.com+10011/26abec5b-63e7-40dd-9fa1-b8ad2436606e
-
-After the corrupt WAL file has been replaced, the system should automatically recover.
-It may be necessary to restart the Accumulo Manager process as an exponential
-backup policy is used which could lead to a long wait before Accumulo will
-try to re-load the WAL file.
+```sh
+UUID=$(uuidgen); echo -n -e '--- Log File Header (v4) ---U+1F47B$'"$UUID"'\x00\x00\x00\x00' >"$UUID".wal
Review Comment:
So, I was tracking down where the `U+1F47B` comes from. I think the original
intent was to store the actual ghost emoji character there, but what is stored
instead is a human-readable representation of the Unicode code point for the
ghost emoji. It doesn't really matter what its value is, though. Its purpose is
to be a fixed value that represents the decryption parameters for a
"not encrypted" file.
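To make the distinction concrete, here is a small sketch (not taken from the Accumulo code) that compares the byte length of the literal text `U+1F47B` with the UTF-8 encoding of the ghost emoji itself. The variable names are made up for illustration, and the emoji bytes `F0 9F 91 BB` are written as octal escapes so the snippet runs under plain `sh`:

```sh
# Length in bytes of the human-readable code point text.
literal_len=$(printf '%s' 'U+1F47B' | wc -c | tr -d ' ')

# Length in bytes of the emoji itself: U+1F47B encodes in UTF-8 as F0 9F 91 BB.
emoji_len=$(printf '\360\237\221\273' | wc -c | tr -d ' ')

echo "literal text: $literal_len bytes, emoji: $emoji_len bytes"
```

The literal text is 7 bytes and the emoji is 4 bytes, so the two are not interchangeable wherever the token's size matters.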
Specifically, this value indicates that the file is "not encrypted" using the
built-in non-encrypting CryptoService that we use in our built-in reference
implementations of a CryptoServiceFactory. A user could provide their own
factory that supports a "not encrypted" option using a different token to
identify the files it supports, but this string denotes ours.
I checked the serialization of our WALs in the code, and it looks like the
command in this PR is wrong as written. An integer should be written before
the "U+1F47B" part. After that, I'm not sure what the WAL is supposed to
contain, because I only got so far tracing the code to see how it writes the
WAL header.
In Java, this is what I saw as a header when tracing the steps that write out
the size of the crypto params and then the string for the "no crypto" params:
```java
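import static java.nio.charset.StandardCharsets.UTF_8;

// Header text, then a 4-byte size (7), then the 7-byte "no crypto" params string
byte[] header = "--- Log File Header (v4) ---\000\000\000\007U+1F47B".getBytes(UTF_8);
// { 45, 45, 45, 32, 76, 111, 103, 32, 70, 105, 108, 101, 32, 72, 101, 97,
//   100, 101, 114, 32, 40, 118, 52, 41, 32, 45, 45, 45, 0, 0, 0, 7, 85, 43, 49, 70,
//   52, 55, 66 }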
```
That's equivalent in bash to:
```sh
echo -n -e '--- Log File Header (v4) ---\x00\x00\x00\x07U+1F47B' >empty.wal
```
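One way to double-check those bytes with a hexdump tool, as a sketch only: it uses `printf` with octal escapes rather than `echo -e`, computes the size prefix from the token instead of hardcoding it, and dumps the result with `od`. The variable names (`magic`, `params`, `prefix_fmt`) are made up for illustration, it only handles tokens shorter than 256 bytes, and it reproduces just the header bytes traced above, not necessarily a complete, valid WAL:

```sh
magic='--- Log File Header (v4) ---'
params='U+1F47B'   # token for the built-in "not encrypted" params
len=$(printf '%s' "$params" | wc -c | tr -d ' ')

# Assumption: len < 256, so the 4-byte size is three zero bytes plus one
# length byte. This builds the printf format string \000\000\000\007.
prefix_fmt="\\000\\000\\000\\$(printf '%03o' "$len")"

out=$(mktemp)
{
  printf '%s' "$magic"     # '%s' avoids printf parsing the leading dashes as options
  printf "$prefix_fmt"     # format string on purpose: it expands the octal escapes
  printf '%s' "$params"
} >"$out"

# Dump as hex so the NUL bytes and the length byte are visible.
header_hex=$(od -An -v -tx1 "$out" | tr -d ' \n')
echo "$header_hex"
rm -f "$out"
```

The dump should be 39 bytes long and contain `00 00 00 07` immediately before `55 2b 31 46 34 37 42` (the ASCII bytes of `U+1F47B`), matching the Java byte array.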
I don't know where the UUID comes in, or the dollar sign in the PR's command,
or the zeroes carried over from the v2 format. More may be needed after this
header to produce a valid WAL file. I'm not sure, but I do think what's
in this PR now is wrong.