One of the more subtle additions in popt-1.16 is POPT_ARG_BITSET This a Bloom filter data type that parses option arguments as a CSV and adds to the "bit set".
What is difficult to describe is why/how Bloom filters "work" (wikipedia does an excellent job) and particularly how to describe Why would I _EVER_ need Bloom filters doing CLI option parsing? So here's a perfectly sane usage case that I'll be implementing in RPM shortly. One of the mysteries in RPM is %config handling. In short, a file path marked with %config is handled with a state machine that tries to avoid accidentally discarding local customizations during "upgrade". However, what is often needed is a means (through POPT options) to disable some functionality, or also to negate the behavior (like "grep -v") Since there are like 300K possible file paths on existing systems, this is a rather complex task if arrays/lists are used to attach a per-file disabler using POPT. One can certainly fiddle up the application code to process POPT options in any number of ways. Or (as I'm gonna do) one can let POPT_ARG_BITSET do most of the heavy lifting for you handling a large estimated universe (in this case potentially 300K file paths) with almost no additional application code needed. This POPT helper int poptBitsChk(poptBits bits, const char * s); will return TRUE/FALSE depending on whether a string was mentioned (or not). There's already a a test case (popt/tdict.c) based on /usr/share/dict/words in popt code. Fiddle around with tdict.c if you wish to attempt using POPT_ARG_BITSET in some application. Wayne: RSYNC+POPT likely has zillions of usage cases for POPT_ARG_BITSET with large unknown(a priori) populations of file paths (and enablers/disablers attached to a specific path). Note that the "default" expected population size in POPT is way too small for 300K+ file paths (from popt/popt.h): #define _POPT_BITS_N 1024U /* estimated population */ #define _POPT_BITS_M ((3U * _POPT_BITS_N) / 2U) #define _POPT_BITS_K 16U /* no. of linear hash combinations */ but can be overridden by assigning values to /*...@unchecked@*/ extern unsigned int _poptBitsN; /*...@unchecked@*/ extern unsigned int _poptBitsM; /*...@unchecked@*/ extern unsigned int _poptBitsK; There's also the attached bit of deep dark chocolate from RPM code (where the POPT_ARG_BITSET Bloom filter implementation in POPT was swiped from) if interested in an "optimal" set of parameters with a "false positive" failure rate less than specified for a given estimated population size. See the appended routine. (aside) I left rpmbfParams() (the mnemonic is RPMBF == RPM Bloom Filter, duh) routine out of POPT largely because of the math complexity and no clearly established usage case for POPT_ARG_BITSET so far. I also hadn't implemented rpmbfParams() yet at the time I did the port to POPT ;-). I almost certainly will include some variant of the attached routine to estimate "optimal" bitset parameters in POPT 2.0 before I'm done. hth 73 de Jeff =============================================================== void rpmbfParams(size_t n, double e, size_t * mp, size_t * kp) { static size_t _nmin = 10; static size_t _n = 10000; static size_t _nmax = 1 * 1000 * 1000 * 1000; static double _emin = 1.0e-10; static double _e = 1.0e-4; static double _emax = 1.0e-2; size_t m; size_t k; /* XXX sanity checks on (n,e) args. */ if (!(n >= _nmin && _n <= _nmax)) n = _n; if (!(e >= _emin && _e <= _emax)) e = _e; m = (size_t)((n * log(e)) / (log(1.0 / pow(2.0, log(2.0)))) + 0.5); k = (size_t) ((m * log(2.0)) / n); if (mp) *mp = m; if (kp) *kp = k; } ______________________________________________________________________ POPT Library http://rpm5.org Developer Communication List popt-devel@rpm5.org