ctubbsii commented on code in PR #411:
URL: https://github.com/apache/accumulo-website/pull/411#discussion_r1433295123


##########
_docs-2/troubleshooting/advanced.md:
##########
@@ -249,21 +249,31 @@ metadata table!), the following process can be followed to create a valid, empty
 WAL file. Run the following commands as the Accumulo unix user (to ensure that
 the proper file permissions in HDFS)
 
-    $ echo -n -e '--- Log File Header (v2) ---\x00\x00\x00\x00' > empty.wal
-
-The above creates a file with the text "--- Log File Header (v2) ---" and then
-four bytes. You should verify the contents of the file with a hexdump tool.
-
-Then, place this empty WAL in HDFS and then replace the corrupt WAL file in HDFS
-with the empty WAL.
-
-    $ hdfs dfs -moveFromLocal empty.wal /user/accumulo/empty.wal
-    $ hdfs dfs -mv /user/accumulo/empty.wal /accumulo/wal/tserver-4.example.com+10011/26abec5b-63e7-40dd-9fa1-b8ad2436606e
-
-After the corrupt WAL file has been replaced, the system should automatically recover.
-It may be necessary to restart the Accumulo Manager process as an exponential
-backup policy is used which could lead to a long wait before Accumulo will
-try to re-load the WAL file.
+```sh
+UUID=$(uuidgen); echo -n -e '--- Log File Header (v4) ---U+1F47B$'"$UUID"'\x00\x00\x00\x00' >"$UUID".wal

Review Comment:
   So, I was tracking down where the `U+1F47B` comes from. I think the original
intent was to store the actual ghost emoji character there, but what is stored
instead is a human-readable representation of the Unicode code point for the
ghost emoji. It doesn't really matter what its value is, though. Its purpose is
to be a fixed value that represents the decryption parameters for a "not
encrypted" file.
   
   Specifically, this value is an indicator that it's "not encrypted" using our 
built-in non-encrypting CryptoService that we use in our built-in reference 
implementations for a CryptoServiceFactory. A user could provide their own 
factory that supports a "not encrypted" option using a different token to 
identify files it supports. But, this string is to denote ours.
   
   I checked the serialization of our WALs in the code, and it looks like this 
is wrong as written. There should be an integer that is written before the 
"U+1F47B" part. After that, I'm not sure what the WAL is supposed to contain, 
because I only got so far tracing the code to see how it writes the WAL header.
   
   In Java, this is what I saw as a header, when tracing the steps to write out 
the size of the crypto params, then writing out the string for the "no crypto" 
params:
   
   ```java
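   // assumes: import static java.nio.charset.StandardCharsets.UTF_8;
   // magic string, then the 4-byte size of the crypto params (7), then the params
   byte[] header = "--- Log File Header (v4) ---\000\000\000\007U+1F47B".getBytes(UTF_8);
   // { 45, 45, 45, 32, 76, 111, 103, 32, 70, 105, 108, 101, 32, 72, 101, 97,
   //   100, 101, 114, 32, 40, 118, 52, 41, 32, 45, 45, 45, 0, 0, 0, 7, 85, 43, 49, 70,
   //   52, 55, 66 }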
   ```
   
   That's equivalent in bash to:
   
   ```sh
   echo -n -e '--- Log File Header (v4) ---\x00\x00\x00\x07U+1F47B' >empty.wal
   ```
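   
   As a rough way to double-check those bytes, here is a minimal sketch that
writes the same header with `printf` and dumps it. The scratch file from
`mktemp` is only a placeholder, not a real WAL path, and this covers only the
header described above, not whatever else a valid WAL may need:
   
   ```sh
   # Sketch only: write the header to a scratch file and inspect the bytes.
   # The text is passed via %s so printf does not parse the leading "---" as options.
   wal=$(mktemp)   # hypothetical scratch file, not a real WAL location
   printf '%s\000\000\000\007%s' '--- Log File Header (v4) ---' 'U+1F47B' >"$wal"
   wc -c <"$wal"      # total size: 28 (magic) + 4 (length prefix) + 7 (params)
   od -An -c "$wal"   # dump every byte, including the four-byte length prefix
   ```
   
   The four bytes after the magic string should read as the big-endian integer
7, which is the length of `U+1F47B`.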
   
   I don't know where the UUID comes in, or that dollar sign in the above 
example, or the zeroes from the v2 format. There may be more needed after this 
header to result in a valid WAL file... I'm not sure, but I do think what's 
here now in this PR is wrong.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
