cdbartholomew commented on issue #5392: Error open RocksDB database when 'Set 
up a standalone Pulsar in Docker'
URL: https://github.com/apache/pulsar/issues/5392#issuecomment-543444568
 
 
   This is a Docker configuration issue. The command in the documentation:
   
   ```
   $ docker run -it \
     -p 6650:6650 \
     -p 8080:8080 \
     -v "$PWD/data:/pulsar/data".ToLower() \
     apachepulsar/pulsar:2.4.1 \
     bin/pulsar standalone
   ```
   
   is faulty. Here's why.
   
   When the pulsar Docker image is built, it defines two volumes:
   
   ```
   VOLUME  ["/pulsar/conf", "/pulsar/data"]
   ```
   This means that at run time the image expects an externally mounted volume 
for each of those two paths. When you do "docker run" without specifying any 
storage, Docker automatically creates anonymous volumes for them. The result 
looks something like this:
   
   ```
           "Mounts": [
               {
                   "Type": "volume",
                   "Name": "c59b3a5ad98eafc8efcbe6de1f63b1a31693564996a8dcc6de60f02867015a51",
                   "Source": "/var/lib/docker/volumes/c59b3a5ad98eafc8efcbe6de1f63b1a31693564996a8dcc6de60f02867015a51/_data",
                   "Destination": "/pulsar/conf",
                   "Driver": "local",
                   "Mode": "",
                   "RW": true,
                   "Propagation": ""
               },
               {
                   "Type": "volume",
                   "Name": "b7600b9314de9bf8ed573ef697fa012b9d2574074fbea45c2831b3e0a854da30",
                   "Source": "/var/lib/docker/volumes/b7600b9314de9bf8ed573ef697fa012b9d2574074fbea45c2831b3e0a854da30/_data",
                   "Destination": "/pulsar/data",
                   "Driver": "local",
                   "Mode": "",
                   "RW": true,
                   "Propagation": ""
               }
           ],
   ```
   
   The problem is the -v option in the command. This tries to create a bind 
mount at the same path in the container as one of the pre-specified volume 
mount points (/pulsar/data). This creates two mounts on the same path. Docker 
doesn't barf on this (for some reason), but it obviously makes the file system 
behave strangely, causing RocksDB to barf.
   
   PR #3918 mentions adding the ToLower() function to the -v option to "fix" 
the issue. This doesn't fix the issue at all. It avoids the error because it 
ends up mangling the -v option so that it mounts a volume at 
/pulsar/data.ToLower(). That path doesn't collide with the pre-defined one, so 
RocksDB works, but the mount doesn't do what was intended: Pulsar is configured 
to keep its data in the /pulsar/data directory, so /pulsar/data.ToLower() is 
never used.
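
To make the mangling concrete, here is a minimal Python sketch that splits a -v value into its host and container parts. It is a simplified illustration, not Docker's actual parser: it assumes there is no trailing mode such as ":ro", and the host path is a hypothetical example.

```python
def split_volume_spec(spec):
    """Split a `-v` value into (source, target) at the last colon.

    Simplified for illustration: assumes no trailing mode such as `:ro`,
    and is not Docker's actual parser.
    """
    source, target = spec.rsplit(":", 1)
    return source, target


# Hypothetical host path; the ".ToLower()" suffix is passed along as plain text.
mangled = "C:\\Users\\me/data:/pulsar/data.ToLower()"
source, target = split_volume_spec(mangled)

print(target)                    # the container path Docker is asked to mount
print(target == "/pulsar/data")  # False: not the directory Pulsar writes to
```

The target no longer equals /pulsar/data, which is why the collision disappears and also why nothing Pulsar writes ends up on the host.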
   
   The fix for this is simple: instead of messing around with bind mounts, give 
the image a volume mount like it expects. And since we presumably want the data 
to persist between "docker run" commands, we just have to give the volume a 
name.
   
   ```
   docker run -it \
   -p 6650:6650 -p 8080:8080  \
   --mount source=pulsardata,target=/pulsar/data \
   apachepulsar/pulsar:2.4.1 \
   bin/pulsar standalone
   ```
   
   Here is what docker inspect gives: 
   
   ```
           "Mounts": [
               {
                   "Type": "volume",
                   "Name": "6a41856b98b36f81d1bf1d1d196a59a026c82dd12aa50cee5bdf82295e7f670c",
                   "Source": "/var/lib/docker/volumes/6a41856b98b36f81d1bf1d1d196a59a026c82dd12aa50cee5bdf82295e7f670c/_data",
                   "Destination": "/pulsar/conf",
                   "Driver": "local",
                   "Mode": "",
                   "RW": true,
                   "Propagation": ""
               },
               {
                   "Type": "volume",
                   "Name": "pulsardata",
                   "Source": "/var/lib/docker/volumes/pulsardata/_data",
                   "Destination": "/pulsar/data",
                   "Driver": "local",
                   "Mode": "z",
                   "RW": true,
                   "Propagation": ""
               }
           ],
   ```
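
If you want to check this on your own container without reading the JSON by eye, a small Python sketch like the following can label each mount. It assumes that anonymous volumes are the ones whose name is a 64-character hex ID, as in the output above; that is a heuristic of mine, and `classify_mounts` is a hypothetical helper, not part of Docker.

```python
import json
import re

# Assumption: anonymous volumes are named with a 64-character hex ID,
# as in the inspect output above; any other name is treated as user-chosen.
ANONYMOUS_NAME = re.compile(r"[0-9a-f]{64}")


def classify_mounts(mounts):
    """Map each mount destination to a short description of its kind."""
    kinds = {}
    for mount in mounts:
        if mount.get("Type") != "volume":
            # Bind mounts and other types are reported by their type name.
            kind = mount.get("Type", "unknown")
        elif ANONYMOUS_NAME.fullmatch(mount.get("Name", "")):
            kind = "anonymous volume"
        else:
            kind = "named volume"
        kinds[mount["Destination"]] = kind
    return kinds


# Trimmed sample in the shape of the "Mounts" section shown above.
sample = json.loads("""
[
  {"Type": "volume",
   "Name": "6a41856b98b36f81d1bf1d1d196a59a026c82dd12aa50cee5bdf82295e7f670c",
   "Destination": "/pulsar/conf"},
  {"Type": "volume", "Name": "pulsardata", "Destination": "/pulsar/data"}
]
""")

for destination, kind in classify_mounts(sample).items():
    print(f"{destination}: {kind}")
```

For the sample, /pulsar/conf comes out as an anonymous volume and /pulsar/data as a named volume, matching the inspect output.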
   
   
   The other advantage of using a named Docker volume is that we don't have to 
mess around with path specifications, so there is no need for the $PWD 
variable, which is defined in PowerShell but not in Command Prompt (CMD). I am 
able to run the above command using either PowerShell or CMD, and it works 
reliably for me on Docker Desktop for Windows.
   
   Since the image expects the config data to be persisted, we should probably 
specify a name for that volume too, like this:
   
   ```
   docker run -it \
   -p 6650:6650 -p 8080:8080  \
   --mount source=pulsardata,target=/pulsar/data \
   --mount source=pulsarconf,target=/pulsar/conf \
   apachepulsar/pulsar:2.4.1 \
   bin/pulsar standalone
   ```
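
For anyone scripting this rather than typing it, here is a rough Python sketch that assembles the same command as an argument list, which sidesteps shell differences in quoting and line continuation. `pulsar_run_command` is a hypothetical helper; the flags and volume names are only the ones used in the command above.

```python
def pulsar_run_command(volumes, image="apachepulsar/pulsar:2.4.1"):
    """Build the `docker run` argument list for a standalone Pulsar.

    `volumes` maps a named Docker volume to its target path in the container.
    """
    command = ["docker", "run", "-it", "-p", "6650:6650", "-p", "8080:8080"]
    for name, target in volumes.items():
        command += ["--mount", f"source={name},target={target}"]
    command += [image, "bin/pulsar", "standalone"]
    return command


command = pulsar_run_command(
    {"pulsardata": "/pulsar/data", "pulsarconf": "/pulsar/conf"}
)
print(" ".join(command))

# To actually start the container (requires Docker to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

The sketch only prints the command; the call that would start the container is left commented out.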
   
   I think just a documentation change is needed to resolve this issue. @junlia 
can you confirm whether my revised command above works for you?
   
   I am happy to put in a PR with the documentation change once this is 
confirmed to work outside my environment.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
