RaulGracia opened a new issue #3150:
URL: https://github.com/apache/bookkeeper/issues/3150


   **QUESTION**
   We recently ran tests to observe how Bookies behave when they exhaust their storage space. In one test, all 3 Bookies (10GB journal, 10GB ledger) switched to read-only mode, as expected. But after some time (perhaps after a GC cycle, which may require extra storage), we observed that the 3 Bookies were in `CrashLoopBackOff` (Kubernetes deployment):
   ```
   # k get po
   NAME                  READY   STATUS             RESTARTS   AGE
   bookkeeper-bookie-0   0/1     Running            247        20h
   bookkeeper-bookie-1   0/1     CrashLoopBackOff   253        20h
   bookkeeper-bookie-2   0/1     CrashLoopBackOff   239        20h
   ```
   
   The Bookies cannot start because there is no space left on the device for some of the internal Bookie processes to start (here, the Journal thread fails to create a new journal file):
   ```
   2022-03-21 17:51:11,309 - ERROR - [BookieJournal-3181:Journal@1146] - I/O exception in Journal thread!
   java.io.IOException: No space left on device
       at java.base/java.io.UnixFileSystem.createFileExclusively(Native Method)
       at java.base/java.io.File.createNewFile(File.java:1035)
       at org.apache.bookkeeper.bookie.JournalChannel.<init>(JournalChannel.java:159)
       at org.apache.bookkeeper.bookie.JournalChannel.<init>(JournalChannel.java:117)
       at org.apache.bookkeeper.bookie.Journal.run(Journal.java:963)
   2022-03-21 17:51:11,310 - INFO  - [BookieJournal-3181:Journal@1158] - Journal exited loop!
   ```
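
To confirm which directory is actually full before attempting any recovery, one option is to check disk usage on each Bookie data path. This is only a minimal sketch: the mount paths `/bk/journal` and `/bk/ledgers` are hypothetical, and the 0.95 threshold is an assumed value that should be compared with the `diskUsageThreshold` configured for the Bookies.

```python
import os
import shutil

# Assumed threshold; compare with the diskUsageThreshold set in your Bookie config.
DISK_USAGE_THRESHOLD = 0.95


def usage_report(paths):
    """Return {path: (used_fraction, free_bytes)} for every path that is a directory."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            continue  # skip paths that are not mounted in this container
        total, used, free = shutil.disk_usage(path)
        if total == 0:
            continue  # pseudo-filesystems may report a zero size
        report[path] = (used / total, free)
    return report


def full_dirs(report, threshold=DISK_USAGE_THRESHOLD):
    """Return the sorted paths whose used fraction is at or above the threshold."""
    return sorted(p for p, (fraction, _free) in report.items() if fraction >= threshold)


if __name__ == "__main__":
    # Hypothetical mount points for the journal and ledger volumes.
    report = usage_report(["/bk/journal", "/bk/ledgers"])
    for path, (fraction, free) in report.items():
        print(f"{path}: {fraction:.1%} used, {free} bytes free")
    print("At or above threshold:", full_dirs(report))
```

Running this inside a Bookie pod (or against the mounted volumes from a debug pod) would show whether the journal volume, the ledger volume, or both need more space.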
   
   _The question is: is there a suggested recovery procedure if we find Bookies in this situation?_
   One constraint on any potential solution: the data on the impacted Bookies must remain available, as the system that uses BookKeeper needs to access it at least once.
   
   We have considered resizing the Bookie volumes. My understanding is that, if we manage to do this, it would solve the problem and the Bookies would be able to boot. In case a volume is not resizable, we have also considered adding new journal/ledger directories, backed by new volumes, to the Bookies, but I don't know if this would work (we may need to adjust the Cookie and metadata of the Bookies).
   
   Having a procedure for dealing with this problem in the documentation (if one does not already exist) would be great as well.


