wohali commented on issue #1097: CouchDB 2.1.1: Compaction daemon - unable to 
calculate free space for `/run/user/0`: `enoent`
URL: https://github.com/apache/couchdb/issues/1097#issuecomment-371250770
 
 
   It'd be much better if we could simply check the free space for the volume 
on which the shard resides, rather than scanning all volumes for free space 
first. I did a bunch of research when it became clear to me this isn't an easy 
problem to solve.
   
   Reminder: [we've been here 
before](https://github.com/apache/couchdb/issues/732), which is when we 
[introduced this warning](https://github.com/apache/couchdb/pull/803).
   
   First stop: Python, since I feel it has excellent cross-platform support 
for this kind of thing. A simplified [code excerpt of finding the mount point 
for a given file's 
path](https://stackoverflow.com/questions/4453602/how-to-find-the-mountpoint-a-file-resides-on):
   
   ```
   import os

   def find_mount_point(path):
       path = os.path.abspath(path) # or os.path.realpath()
       # Walk up the tree until we hit a directory that is a mount point.
       while not os.path.ismount(path):
           path = os.path.dirname(path)
       return path
   ```
   
   Erlang has `filename:absname/1` and `filename:dirname/1`, but it doesn't 
have an `os.path.ismount` equivalent, so I dug into how Python implements this. 
Of course, there are different implementations for 
[POSIX](https://github.com/python/cpython/blob/3460198f6ba40a839f105c381f07179aba1e8c61/Lib/posixpath.py#L187-L220), 
[Mac](https://github.com/python/cpython/blob/3460198f6ba40a839f105c381f07179aba1e8c61/Lib/macpath.py#L118-L122) 
and 
[Windows](https://github.com/python/cpython/blob/3460198f6ba40a839f105c381f07179aba1e8c61/Lib/ntpath.py#L246-L275), 
with the Windows implementation requiring [a call out to a Win32 
function](https://github.com/python/cpython/blob/6921e73e33edc3c61bc2d78ed558eaa22a89a564/Modules/posixmodule.c#L3830-L3877) 
because Windows treats local drives, UNC paths, network-mounted drives and 
junction points (symlinks) all differently.
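
For reference, the POSIX check appears to boil down to comparing device and inode numbers between a directory and its parent. A minimal sketch of that idea follows; `is_mount_point` is a hypothetical name, this is not the CPython source, and it will miss bind mounts that live on the same device:

```python
import os
import stat


def is_mount_point(path):
    """Rough POSIX-only check: is `path` the root of a mounted filesystem?"""
    try:
        st = os.lstat(path)
    except OSError:
        return False  # nonexistent or unreadable paths are not mount points
    if stat.S_ISLNK(st.st_mode):
        return False  # a symlink is never itself a mount point
    try:
        parent = os.lstat(os.path.join(os.path.realpath(path), ".."))
    except OSError:
        return False
    if st.st_dev != parent.st_dev:
        return True  # stepping up to the parent changes device
    # Same device: only the filesystem root is its own parent.
    return st.st_ino == parent.st_ino
```

Everything used here is `lstat` plus two integer comparisons, so the POSIX half would not obviously need anything exotic on the Erlang side; the Windows half is where the Win32 call comes in.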
   
   We also need to be mindful of the difference between `abspath` and 
`realpath`, e.g. when a parent directory is a relative-path symlink: ascending 
the tree can give misleading results unless a true absolute path is 
determined first. `filename:absname/1` doesn't attempt to deal with 
this problem. Amusingly, [mochiweb has solved this particular 
problem](https://github.com/mochi/mochiweb/blob/b7f3693a9008de6d31a67174f7184fe24093a1b4/src/mochiweb_util.erl#L72).
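
To make the `abspath`/`realpath` difference concrete, here is a small sketch that builds a throwaway directory tree with a relative symlink and resolves the symlink's parent both ways. The directory names and the `compare_parent_resolution` helper are made up for illustration:

```python
import os
import tempfile


def compare_parent_resolution():
    """Return (base, lexical_parent, resolved_parent) for a symlinked dir."""
    base = os.path.realpath(tempfile.mkdtemp())
    os.makedirs(os.path.join(base, "data", "shards"))
    link = os.path.join(base, "link")
    # Relative symlink: base/link -> data/shards
    os.symlink(os.path.join("data", "shards"), link)
    via_link = os.path.join(link, "..")
    lexical = os.path.abspath(via_link)    # textually cancels "link/.."
    resolved = os.path.realpath(via_link)  # follows the link, then ascends
    return base, lexical, resolved


base, lexical, resolved = compare_parent_resolution()
print(lexical == base)                         # purely textual answer
print(resolved == os.path.join(base, "data"))  # where the parent really is
```

The two answers name different directories, so a mount-point walk that starts from the textual one could end up inspecting the wrong volume.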
   
   For comparison, I checked JS/node/npm, and found 
[diskusage](https://www.npmjs.com/package/diskusage) which appears to be the 
most popular solution to this problem. It, too, calls out to a Win32 API 
endpoint on Windows and relies on `statvfs` for POSIX.  (It's not clear that 
this module's Win32 implementation actually handles all the different 
possibilities that the Python implementation does, incidentally.)
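
For the POSIX side, a rough sketch of what a `statvfs`-based check looks like, in Python for consistency with the excerpt above. `free_bytes` is a hypothetical helper name, and `os.statvfs` is not available on Windows:

```python
import os


def free_bytes(path):
    """Bytes available to unprivileged users on the filesystem holding path."""
    st = os.statvfs(path)  # POSIX only
    # f_bavail counts free blocks usable without privileges;
    # f_frsize is the fragment size those counts are expressed in.
    return st.f_bavail * st.f_frsize


print(free_bytes("/"))
```

Note that `statvfs` takes any path on the volume, so on POSIX this answers the question for the shard's own directory without finding the mount point first. Python's standard library also offers `shutil.disk_usage` as a cross-platform wrapper.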
   
   **Option 1**: implement all of the above. The Windows NIF is ugly, but there 
are other reasons we might want to go down the path of a Windows-only util NIF 
anyway. I can help here if this is desired.
   
   Next stop, rabbitmq, as a fairly mature Erlang server-thing that definitely 
does disk space checking. They use a much lower tech alternative, namely, 
[shelling out and running cli commands to check free space for a given 
path](https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_disk_monitor.erl#L211-L237).
 Unfortunately, this is problematic, too: there have been 
[a](https://stackoverflow.com/questions/17148510/rabbitmq-strange-disk-free#17944684) 
[few](https://groups.google.com/forum/#!topic/rabbitmq-users/LLnyb_MXa6Q) bugs 
and 
[multiple](https://github.com/rabbitmq/rabbitmq-server/commit/43fe62ef15465ad01c73dd7f591c7659b5d2d307) 
[fixes](https://github.com/rabbitmq/rabbitmq-server/commit/4eaa46ef0ea1e1777c86a5ddc93e0cf447a3f448) 
[required](https://github.com/rabbitmq/rabbitmq-server/commit/186c32700b52de68cb3ba71e445844d236519603). 
A comment on that StackOverflow link notes that this parsing is fragile 
and could easily be broken by locale-specific changes, too; `df -kP` guarantees 
POSIX-compatible output, but there's no guarantee we'll get any sort of 
consistent report out of Windows without switching to [a PowerShell 
implementation](https://stackoverflow.com/questions/12159341/how-to-get-disk-capacity-and-free-space-of-remote-computer#12159479) 
or the aforementioned Win32 API call.
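
To show where the fragility comes from, here is a minimal sketch of the shell-out approach for POSIX systems. It is not rabbitmq's code; `parse_df_kp` and `df_free_kib` are hypothetical names, and the regex is only one way to cope with spaces in filesystem or mount names:

```python
import re
import subprocess

# Tail of a `df -kP` data line:
# 1024-blocks, Used, Available, Capacity, Mounted on.
_DF_TAIL = re.compile(r"\s(\d+)\s+(\d+)\s+(\d+)\s+(?:\d+%|-)\s+(.+)$")


def parse_df_kp(output):
    """Parse `df -kP <path>` output into (available_kib, mount_point)."""
    lines = output.strip().splitlines()
    if len(lines) < 2:
        return None  # header only, or nothing at all
    match = _DF_TAIL.search(lines[-1])
    if match is None:
        return None  # unexpected layout: report "unknown", don't guess
    return int(match.group(3)), match.group(4)


def df_free_kib(path):
    """Shell out to df with a fixed locale; None if anything goes wrong."""
    env = {"LC_ALL": "C", "PATH": "/usr/bin:/bin"}
    try:
        result = subprocess.run(
            ["df", "-kP", path],
            capture_output=True, text=True, env=env, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_df_kp(result.stdout)
```

Pinning `LC_ALL=C` and anchoring the match on the numeric columns at the end of the line are the two defensive choices here; anything the regex doesn't recognise comes back as `None` instead of a wrong number.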
   
   **Option 2**: Use the rabbitmq approach and shell out to the system rather 
than parsing paths and calling `disksup:get_disk_data/0` once we find a mount 
point we can use to match against Erlang's gathered stats. This feels easier 
(no NIF!), but the devil's in the details, and technically speaking, rabbitmq's 
implementation is 
[MPL](https://github.com/rabbitmq/rabbitmq-server/blob/master/LICENSE), which 
is a ["Category B" weak-copyleft 
license](https://www.apache.org/legal/resolved.html#category-b). We could 
probably write to the rabbitmq people and get them to waive the MPL on this 
particular part of code, or we could white-box our own implementation.
   
   **Option 3**: Blacklist a bunch of paths that we know are bad: `/proc`, 
`/run`, `/dev`. I hesitate to say all `tmpfs` volume types on Linux, because 
running Couch out of a `tmpfs` volume as an all-in-memory CouchDB (prior to PSE 
landing) isn't unreasonable. That said, if the `*stat*` calls fail on a `tmpfs` 
volume, we still need to recover gracefully. @janl and I don't like this 
because it's yet another weird thing to maintain with magic values.
   
   **Option 4**: Do the least possible. Stay with the current approach and just 
don't crash out when we can't stat a particular volume. If we can't actually 
get the free space for the guessed mount point for the current volume, punt per 
#803 and try our best anyway, free space or no. We'll probably end up with more 
bugs like this in the future if we do this, but it'll let us close out this bug 
quickly ;)
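
The shape of Option 4, sketched in Python rather than Erlang for consistency with the excerpt above. Both function names are hypothetical and this is not CouchDB's compaction daemon code; the point is only that "can't stat" becomes "unknown" instead of a crash:

```python
import os


def free_bytes_or_none(path):
    """Best-effort free space; None means 'unknown', not 'zero'."""
    try:
        st = os.statvfs(path)  # POSIX only
    except OSError:  # e.g. ENOENT for a volume that has gone away
        return None
    return st.f_bavail * st.f_frsize


def should_compact(db_path, needed_bytes):
    """Proceed if there is enough space, or if space can't be determined."""
    free = free_bytes_or_none(os.path.dirname(db_path) or ".")
    if free is None:
        return True  # punt: log a warning and try anyway
    return free >= needed_bytes
```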

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
