[Citadel Development] DKIM-signed mail is coming to Citadel
So this is happening ... [ https://code.citadel.org/?p=ig-dkim.git ] I found a nice tidy little DKIM-signing library, just a few hundred lines of code, one source module, abandoned for about twelve years after the project that built it ceased to exist. Big thanks to them for leaving this nice bit of code behind. I'm in the process of updating it to C99 syntax. After that I have to change out the deprecated OpenSSL calls. But even in its current state it emits nice clean DKIM signatures and is easy to call. And a big huge EAT SHIT AND DIE to the people at Google who are rejecting our mail again, making this necessary.
[Citadel Development] Re: Do not convert compared times to UTC.
I'm really pleased with how libical does the heavy lifting for me here. It has improved greatly since the old days when we were its caretakers. Comparing two timestamps that are in different time zones actually works. I don't know how it works, but it works. To make sure it wasn't just falling back to "the local time zone" when not told to use UTC, I created an event that included a completely fictional time zone, then set the host's local time zone to a different one, and the compare function still worked. Then I found a function to tell me what the time zone ID of a timestamp is; lo and behold, it showed the name of my fictional time zone! Somehow it's getting back to that data, or it's making another copy of it and embedding it in the timestamp, or something. Either way it removes a LOT of work from what I have to do, and it saves me from having to go with the less-than-pretty method of converting everything to UTC at the last moment.
[Citadel Development] Re: Release version 997 generated by do-release.sh
Well that's a fine howdy-do, I tagged the release without also mentioning why I pushed it at this time. This version has the config option to advertise STARTTLS on the public SMTP server.
[Citadel Development] (null)
All righty then. Pulling in the RSS feed from cgit, since OinkLab is no longer in use. There will be some dupes. Here it comes.
[Citadel Development] (null)
For those of you who are pulling updates from the Citadel git repository, please take note: the address of the repository has changed (again). It is now: https://code.citadel.org/git/citadel.git There is also a GitWeb instance at that address, for the benefit of anyone who wants to do some casual browsing of the repository. Share and Enjoy. Commits will be handled by SSH, just like they were before our brief experiment with GitLab. To be fair, I still like GitLab, but it's much too heavy a toolset to be used by a site that just has one project on it and no CI/CD.
[Citadel Development] Re: START_CHAT_MODE is renamed to SEND_THEN_RECV
Me too, which is why it took me so long to get out of that headspace. The truth is, however, Citadel is extremely popular with the operators of small autonomous sites, not so much the "Exchange killer" I once thought it could become. That crap all moved to the cloud anyway. Even now as I am laying out the plans for the "home data center" I thought I would never need again ... I'm thinking multiple nodes and Kubernetes and PVE and/or Ceph and all sorts of other complex things when I could probably do better just running everything on one decent sized node.
[Citadel Development] Re: START_CHAT_MODE is renamed to SEND_THEN_RECV
> That is a hard one. But keeping them separate might help with scalability?

That was true for WebCit-classic, which contains a server-side user interface. Until a few years ago the two could run on separate machines, so you could have multiple WebCit instances running behind a load balancer. Today the hardware is so powerful that this need no longer exists. We added a requirement for Citadel Server and WebCit to run on the same host because it makes TLS easy to deploy: both programs can point to the same key and certificate files instead of having to get the components to trust each other over the network and then send everything back and forth. This is appreciated by the many site operators who "just want it to work" -- and those sites exist in large numbers.

Moving the web API into the server would increase performance and possibly decrease complexity. However, it would be a major refactoring of the server code, which would be another year-long side quest. That's a bad thing when WebCit-NG is already years behind schedule.

I've just begun work on the calendar. This is arguably the most logic-intensive part of the system and I want to keep it as DRY as possible. It's involving a lot of contemplation at a time when I really just want to get down to coding.
[Citadel Development] Re: START_CHAT_MODE is renamed to SEND_THEN_RECV
I've been thinking lately that it might begin to make sense to put the web server (really, the web API) directly into Citadel Server. Since it's now a web API and no longer a user interface, it could be an improvement. *shrug*
[Citadel Development] Re: view_mail.js: more improvements to mail forwarding
I'm starting to be quite happy with this. WebCit-NG is starting to take the shape of a very competent webmail client. Very clean, too, since it doesn't need to push a bunch of other garbage in your face the way commercial offerings do: "Here's a feature we really want you to use and we won't let you hide it!"

Select and multiselect are perfect. Actually they have been since spring, before I went on the "make the database not suck" side quest. Drag to folders is great. Reply and quote are working beautifully. Forwarding now looks good, but it still needs to load the attachments from the message being forwarded. And of course the forums section was finished many months ago.

I wish I had a way to get this work out to the user community, but it isn't viable until it's complete, and there are large pieces of functionality (calendar, contacts, blog, wiki, chat, settings) to finish. Perhaps after I get my new servers and migrate Uncensored to a container I can add one of those corny "preview our new user experience" buttons.

That will be easier when it's running on Kubernetes because I can define a multi-container pod, with one container being the standard Citadel image and the other being a WebCit-NG container, both sharing the same /usr/local/citadel volume so WebCit-NG can grab the SSL certificate and access the server socket. That's why it would have to be a multi-container pod and not two different pods: for the socket to work, it has to be on the same host. Hmmm ... there might also be a pod affinity feature in Kubernetes. That would work even better since I could upgrade and restart the WebCit-NG pod without restarting Citadel Server.

Ok that's all for tonight. Thanks for reading (or not reading) my train of thought post.
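Nothing ships like this today, but the multi-container pod idea above could look roughly like the following sketch. The image names and the PVC name are hypothetical; the point is just that both containers mount the same /usr/local/citadel volume so WebCit-NG can reach the server socket and the TLS certificate.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: citadel
spec:
  containers:
    - name: citserver
      image: citadel/server          # hypothetical image name
      volumeMounts:
        - name: citadel-data
          mountPath: /usr/local/citadel
    - name: webcit-ng
      image: citadel/webcit-ng       # hypothetical image name
      volumeMounts:
        - name: citadel-data         # same volume, so the unix socket
          mountPath: /usr/local/citadel   # and certs are visible here too
  volumes:
    - name: citadel-data
      persistentVolumeClaim:
        claimName: citadel-data      # hypothetical claim name
```

The catch noted above applies: because the containers share a pod, restarting WebCit-NG means restarting the whole pod, which is why separate pods pinned together with affinity might be the nicer arrangement if the socket problem can be solved.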
[Citadel Development] (null)
Whenever I have some time to kill (boring meeting, waiting for something to finish, etc) sometimes I will start up a test environment in the clown. I have a subscription to a training site that allows free time-limited sandboxes in the three big hyperscale clowns. And I'll run a Citadel Server and max out the load tester on it. Today's test ran for four hours. I had it "kill -11" the server process every few minutes on a randomly timed basis. The load tester ran 240 threads and was configured to automatically restart whenever it lost the server. By the end of the sandbox's lifetime, it had completed close to a million transactions, with a few hundred "induced" server crashes and NO unintentional server crashes ... and not one tiny bit of database corruption. None. Nada. Nunca. Jamais. I know I've said this a number of times before, but I am just so damn satisfied with how stable the database layer is in server >= 993 after last summer's effort to make it more robust. I'd be interested in hearing any feedback y'all might have on this. My test systems have been on AMD64 with both Linux and FreeBSD, and on ARM with Linux. Production is 32-bit x86 with Linux, and has also been completely stable. Does anyone have experience to report, either confirming or refuting that we made an improvement?
[Citadel Development] Re: serv_extensions.c: style cleanup from 30,000 feet
> git push from an airplane? showoff

Yup. :) Unfortunately I was en route to World Of Crap (AWS re:Invent) while on the plane. As a T-Mobile customer I get a lot of free wifi benefits. I think they provide the ground links or something.
[Citadel Development] (null)
Ok, so it has to stay, then. Does sendmail not support LMTP?
[Citadel Development] (null)
*sigh* Another day, another person who won't listen. Quick poll for fellow Citerati out there: does anyone think we still need the `citmail` utility? As a refresher, this program is an MDA, analogous to `procmail` or `maildrop`, that files incoming mail into Citadel mailboxes after it has been received by some third-party MTA such as Postfix or Sendmail. However, Citadel has for a long time offered an LMTP socket, and I believe all modern MTAs can use LMTP to drop local mail to local users. So what I'm looking for is someone to say "You're correct about that, IG; spawning a process to invoke the MDA is so 20 years ago and nobody does it that way anymore" so that I can discontinue `citmail` entirely. (The number of sites using Citadel Server with a third-party MTA is vanishingly small to begin with -- but not zero.)
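For context, the MTA side of the LMTP arrangement is tiny. In Postfix it can be as little as one setting in main.cf; the socket path below is an assumption, adjust it to wherever your Citadel installation puts its LMTP socket.

```
# main.cf -- hand local deliveries to Citadel over LMTP instead of
# spawning an MDA process.  Socket path is a guess for a typical
# /usr/local/citadel installation; adjust to your site.
mailbox_transport = lmtp:unix:/usr/local/citadel/lmtp.socket
```

Sendmail and Exim have equivalent knobs for routing local mail to an LMTP socket, which is the whole argument for retiring `citmail`.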
[Citadel Development] Re: Write server PID to citadel.lock in addition to locking it.
So here's what is on my mind regarding this. Suggestions are welcome.

`/var/run` is a good place to put the PID file, which as previously mentioned is also the lock file. Mounting the same database from two different instances of `citserver` would be A Bad Thing. (Even though I'm still freaking amazed at how stable the database has become once we switched to the mmap-backed environment ... I still can't make it go corrupt no matter how hard I try.)

On a regular machine, or in a jail, `/var/run` is in the right place, and does the right thing. In a container, `/var/run` will not be in the same persistent volume as the database. That means someone could accidentally start up two containers that have different `/var/run` but the same `/usr/local/citadel/data`, with catastrophic results.

I'm tossing around a couple of ideas:

* Writing the pid file to both $ctdldir/citadel.lock *and* /var/run/citserver.pid
* Making /var/run/citserver.pid a symbolic link to $ctdldir/citadel.lock

I really don't want to make the containerized version harder to use.
[Citadel Development] Re: Write server PID to citadel.lock in addition to locking it.
Whoa, what font is that ?!! I am aware that `/var/run` is the conventional location, and may yet move it there. Doing so would, however, prevent multiple instances of Citadel Server from running on the same host. There were definitely people doing that in the past, but I wonder if nowadays that is still a thing, with everyone now having virtual machines and jails and containers and all sorts of other ways to isolate multiple workloads on the same machine.
[Citadel Development] (null)
Quick update regarding GitLab and git itself: I've removed SSH access to the repository. All access should be through HTTPS. This is much more secure. And yes, you can store your creds. The recommended procedure if you plan to commit is to grab yourself a Personal Access Token using your profile, and then use that as your password. You can edit .git/config and set up your creds like this: https://username:to...@code.citadel.org/etc/etc/etc This gives you the ability to commit without typing in your password every time, but without having to store your cleartext password on your dev machine. And of course you can revoke the token at any time. Share And Enjoy
[Citadel Development] (null)
Uncensored has been updated from git master as of right now (Saturday morning). This includes a fix for a security vulnerability that some weirdo expected me to be able to read in Japanese. :(
[Citadel Development] (null)
Here's what worked for me -- going through my history, here's what I've got. (This is with `bash`; alter accordingly for your favorite shell.)

pkg install gmake gcc autoconf automake binutils gsed libtool intltool openssl curl expat libical libiconv icu bash git

curl https://easyinstall.citadel.org/db-18.1.40.tar.gz | tar xvzf -
cd ./db-18.1.40/build_unix
../dist/configure --prefix=/usr/local --with-cryptography=no --disable-hash --disable-heap --disable-queue --disable-replication --disable-statistics --with-uniquename=_ctdl --enable-static --disable-shared --disable-compat185 --disable-cxx --disable-debug --disable-dump185 --disable-java --disable-tcl --disable-test --without-rpm
make
make install_lib
make install_include
make install_utilities

(you've got Berkeley DB now)

git clone https://code.citadel.org/citadel/citadel.git
cd citadel/libcitadel
export CFLAGS="-g -I/usr/local/include"
export LDFLAGS="-L/usr/local/lib"
./bootstrap
./configure
gmake
gmake install

(you've got libcitadel now)

cd ../citadel
./configure --ctdldir=/usr/local/citadel    # ignore the error about crypt.h, that was a test
gmake
gmake install    # it seems to install even though it says "inappropriate file type or format"

(you've got Citadel Server now)

cd ../textclient
./configure
gmake
gmake install

(you've got the text mode client now)

cd ../webcit
./bootstrap
./configure --prefix=/usr/local/webcit
gmake
gmake install

(you've got webcit now)
[Citadel Development] Re: Removed background and restart from citserver.
Geez. I can't believe how much more stable Citadel Server is with this new improved backend. It was worth spending most of the summer working on this. Kitty -- I hope you're having good results on FreeBSD -- please let me know if it works for you.

I've got a test running right now with the load tester running in an auto-restarting loop in one window, while in the other window Citadel Server is running under timeout(1) so it automatically dies from signal 9 after 20-30 seconds. It's been running for close to an hour now, and no matter how catastrophically I make the server die, the database recovers cleanly on the next startup every time. My faith in BDB's recoverability has been restored. I'm just wondering whether we've been using it wrong for the last 20+ years (doubtful, considering LS's substantial talent at this sort of thing) or if some change came along later and silently made it wrong.

Anyway, the purpose of this exercise was originally to prepare for moving away from BDB, and we are in a position to add new backends now ... but then I unexpectedly made the BDB backend more or less bulletproof during a late-night coding and testing binge. And so, here is the NEW plan.

1. I am not going to begin working on the LMDB backend at this time. I want to spend my time on WebCit-NG.
2. If anyone wants to work on a backend of any type (LMDB, SQL, whatever) I'll be happy to provide whatever support you need, including modifications to the interface if needed.
3. Naturally, if any stability/durability issues are reported, fixing them will take first priority.
4. I'll do my best to keep everything buildable on FreeBSD. It'll still require gcc and gmake for the time being, but as of right now it does work.

I'm trying to line up new hardware to run my stuff on. Hopefully soon. That'll be when I finally move Uncensored to a 64-bit system. I'll probably run the container version.
[Citadel Development] Re: serv_expire: abort() if any malloc() calls fail.
The above commit has to do with the experience a few of us had with rooms suddenly containing all new messages. There was no database corruption. THE DREADED AUTO-PURGER did this, and it appears to have believed that it was supposed to do that. When we purge `visit` records, we first build lists of valid users and rooms, and we delete any `visit` record that doesn't point to a valid user *and* a valid room. I'm still not sure, but I think we built incomplete user and room lists in memory because a memory allocation failed and we carried on anyway. So now if we fail to allocate, we crash the server completely. It's bad, but losing data is worse. On another note, I think this happened on Uncensored because of the Brown Paper Bug that I fixed in the previous release.
[Citadel Development] Re: berkeley_db: cdb_next_item() use DB_REALLOC, not DB_MALLOC
Phooey. This was a bit of a brown-paper-bug. I have to do a full release of the software to correct it. Dumping and loading a test system was fine, but when I tried to dump a clone of Uncensored it showed the memory leak.
[Citadel Development] Re: IMAP Flags branch?
No pressure at all! Just checking in to see if you are ok and still with us. But after a break you'll notice that there is likely some tweaking to do. Specifically, if we are to have multiple versions of the visit record (and I do think that's a good way to go) we'll need to tweak `ctdldump` to detect the different record formats and set things up accordingly. Naturally, we can have it always dump and load in the new format to avoid having to put dependencies across the whole toolchain. I believe the backend interface is now in its final form. I realize I said that before, but now it's actually working exactly the way we need it to, so there won't be any more changes unless some new need comes up. I also realize that I said the purpose of the rework was to get us ready to move away from Berkeley DB as quickly as possible, but we ended up making the Berkeley DB implementation so indestructible that this is now a lower priority.
[Citadel Development] Re: Removed background and restart from citserver.
> The rc system fully detaches it from the entire call stack, in the end it just looks like init ran the daemon. The rc system is really just a series of shell scripts, nothing too fancy. That's why PID files are so important, the rc system uses those to figure out which PID to send signals to.

Cool. Give the new version a try and let me know how it works for you. We don't have the ability to auto-background anymore, but we can still write a PID file. SIGTERM and SIGINT will make the server shut itself down cleanly, and it does so much faster now. SIGKILL and SIGQUIT will make the server exit abruptly, and as previously noted, it doesn't seem to corrupt the db anymore when you do this.
[Citadel Development] Re: Removed background and restart from citserver.
All righty then. The current code in git master is running on FreeBSD. Here's the deal:

1. You need the LDAP client library (`pkg install openldap26-client`) even if you aren't using LDAP.
2. `gmake` and `gcc` are required. It won't build with whatever FreeBSD is using natively.
3. The configure script now detects whether it needs `-lresolv` and/or `-lintl` automatically.
4. We now omit the `chkpw` and `chkpwd` programs on systems where they won't work. I'd actually like to remove system auth completely but I don't know if anyone is using it.

On my FreeBSD system I was able to get the server running and I ran the new `loadtest` utility with a few dozen threads beating the crap out of it. I abruptly killed the server with SIGKILL and SIGQUIT numerous times, during periods of high write activity, and even under the worst conditions it did not corrupt the database. Looking really good here.
[Citadel Development] Re: Removed background and restart from citserver.
UUU--- Drive 0 on my desktop is fux0red. This was my main workstation and also where I had my FreeBSD vm installed. Time to rebuild.
[Citadel Development] Re: Removed background and restart from citserver.
And just like that, my main machine at home bit the dust. *grumble*
[Citadel Development] Re: Removed background and restart from citserver.
> So FreeBSD's rc system sends a SIGTERM when you do "doas service citserver stop". This should run the same code that cleanly shuts down citserver when you do a ".ATN y" from the text client.

Perfect. That's exactly the desired behavior, and it will be "even more that way" now. I have to go examine other BSD rc scripts, but I'm guessing they simply do "&" to put a process in the background, and then they keep track of the pid so they can SIGTERM it later.

SIGINT and SIGTERM are handled by running the very same shutdown code that the server's "DOWN" command runs (which is what the text client's "<.A>dmin <T>erminate-server <N>ow" command executes). Unmount the databases cleanly, then exit(0) to GTFO.

What's new, however, is that the database "environment" is now on mmap+disk instead of just in-memory. I don't know how we went for so many years without knowing that we had that wrong. This accomplishes two things:

1. The server no longer goes SIGSEGV when we try to close the environment
2. The database never, ever shows up as corrupted when we restart it

In other words, the damn journaling system works correctly now. The funny thing is, it used to work fine even in its old form, so I don't know what's changed in the last few versions of Berkeley DB. And here we are, with a BDB implementation that's now rock solid, as a result of a two month long development effort intended to get us *off* BDB. But with that done, I'm gonna focus on FreeBSD for a while.
[Citadel Development] Re: Removed background and restart from citserver.
> citserver should exit cleanly when it receives a SIGINT, SIGQUIT, or SIGKILL.
>
> This may be why using FreeBSD's rc system caused badness on my system.

And it so happens that FreeBSD is the reason I pulled that section of code out. This is an opportunity to remove all of the "clever" stuff and really focus on portable, standard code. I do feel that there's no longer any reason why a server program should have to do the work to put itself into the background. The operating system's supervisor program should be doing that.

I'm sure I'll be learning more about how FreeBSD's rc system works. Seems like it's held pretty closely to Unix System III. That's fine, because I've always hated the System V init scripts. I did release what we've got so far as Citadel 991, with some of the FreeBSD changes in place, but I'm going to keep poking at it until it builds cleanly without needing help.

Also I have an ACloudGuru account paid for by my employer and that lets me stand up cloud sandboxes for testing things. I'm going to start using that for an occasional build test, I think. The only problem is that the sandboxes automatically self-destruct after four hours, so it's not useful for much more than testing. Right now I've got the latest Easy Install build running on a Scamazon instance, with a 30-thread load test pounding away at it. I figure it's better to abuse their SSDs than mine.
[Citadel Development] Re: Master: citserver coredump on Shutdown
> When I build from master now, when I shutdown citserver, it core dumps right after closing the 0d database. Since I had an issue where webcit was core dumping and was not re-creatable by anyone else, I am wondering if this is just me.

Remember that discussion, from early July? That's part of what started this all ... and made me basically spend the whole summer overhauling the database layer instead of working on WebCit-NG. Well guess what ... it's fixed. I'm sure it had something to do with the "dbenv" and whether it was disk backed or memory backed or both, or something like that. Now that we're using what I believe is mmap'ed disk to back the environment (that's the new __db.00x files you'll see in the data directory) it no longer segfaults when closing the environment. And did I mention that since this change, it hasn't corrupted the test database even once, no matter how badly I abuse it?
[Citadel Development] Re: I'm in ur VM, running FreeBSD
Yet another status update! With some sleep under my belt and a thankfully slow day at work I've been able to poke around a little more. Uncensored crashed a couple of times last night but I think it was because of a null pointer bug I somehow introduced. Turning off the indexer module seems to have stopped the deadlock bug. I'm testing again on one of the development systems (franklin) and it seems that I don't even have to run TDAP in the background. If I run loadtest with about ten threads, it reliably deadlocks after a few thousand operations; with the indexer disabled it doesn't. I left it running last night and it got through millions of operations before I stopped it this morning. All righty then. I'll be looking into the indexer to see why it's misbehaving. By the way, all those server crashes on Uncensored last night (from the bug I created and then fixed) and the database started up cleanly every time. The good news is (knock on wood, or OSB, or whatever's on hand) I think that issue is fixed now.
[Citadel Development] Re: I'm in ur VM, running FreeBSD
Yeesh. It's nearly 2:00am and I still haven't figured out the deadlock problem. But I can make it appear pretty reliably by doing these things together:

1. loadtest, with 3 threads
2. while true ; do ./sendcommand tdap ; sleep 1 ; done

I also have a room on the test system that pulls in an auto-generated RSS feed. A few thousand transactions in, something hangs. It appears to be threads fighting for the S_ROOMS mutex, but upon further investigation it actually looks like one thread is hanging while it's holding the S_ROOMS mutex. gdb is suggesting that it's stuck somewhere inside of Berkeley DB while the library is trying to acquire a mutex of its own.

I have a watchdog script running on Uncensored that kills the server process whenever it hangs. I hate this. But at least it hasn't wrecked the database.

As a last ditch effort before I go and get some sleep, I have disabled the full text indexer. My test has, so far, gone on for far longer than it was previously running. I've installed this code on Uncensored and hopefully the server will make it through the night.

Please know that I am working as hard as I can to stabilize this thing, and it matters a LOT to me that we get it to where it needs to be. I want to get off Berkeley DB as much as y'all do, and ASAP, but we have to leave it in a working condition before we move on to something else. If anyone wants to HELP, I'd appreciate it; let me know and I'll provide some guidance on setting up your test environment.
[Citadel Development] Re: I'm in ur VM, running FreeBSD
Yecch. The new code has a deadlock problem. The DB is rock solid though. On it.
[Citadel Development] Re: I'm in ur VM, running FreeBSD
Oh, and in case I didn't make it abundantly clear: the remaining backend decoupling work is complete, and work on more modern backends can begin soon. LMDB? MariaDB? SeattleCloud Hosted DB? You name it, we'll be able to do it.
[Citadel Development] Re: I'm in ur VM, running FreeBSD
Ok, I've got my FreeBSD system installed and have begun making changes to fix any problems that came up. I am also now running Uncensored on 'git master' (commit 6960933711b8353ad5cddcc17994db670911a1dc) as of right now. When it comes to dealing with database corruption, in the words of famous serial rapist Bill Clinton: "I FEEL YOUR PAIN!" My database got scrambled today, and it isn't the first time. So I restored my most recent backup, ran database_cleanup.sh for hopefully the last time (to start with a fresh btree instead of one with potential hidden badness) and upgraded Uncensored to the latest unreleased code. I usually run Uncensored on the release version, but this is a serious jam and it needs serious testing. What better test than running the flagship site on it? If all goes well, we will do a new release in the next couple of days with much more robust database code and better FreeBSD compatibility.
[Citadel Development] Re: I'm in ur VM, running FreeBSD
Ok, I've got my FreeBSD VM set up and will give the Citadel build a shot today.
[Citadel Development] Re: PLEASE TEST THIS IF YOU CAN!
I'm having *really* good results with this latest commit. I haven't gotten it to corrupt the db no matter how badly I abuse it. This is the one. The only issue so far is that under multiple very heavy load tests, I have gotten it to deadlock. But I don't know whether that's a new problem or just something that's always been there and I didn't know about because I wasn't testing it this way before. LadySerenaKitty and others who are having db problems: once this version is installed, the remediation procedure will be to ctdldump your database to a flat file, ctdlload it back into a clean db, and then you should be good. And yes, once this is done, we will be 100% API-ready to begin alternate back ends.
[Citadel Development] PLEASE TEST THIS IF YOU CAN!
All right everyone, this is the one to test. I think I've fixed all of the conditions that would make it corrupt the database. PLEASE test this version if you have the ability to. (The version currently in git master, not the release version on the web site.) The funny thing is, in the process of reworking the code base so that we can move away from Berkeley DB, I ended up making Berkeley DB more stable. I think the way we were using it may have been "wrong" for a LONG time. We'll still develop a more modern backend, but hopefully it won't be quite as urgent anymore. I *hate* hearing from sites that had to restore backups or run the cleanup job. Again, if you're running the development tree, please give it a good workout. I also included a new utility called "loadtest" that does nothing except post and delete messages in five random rooms, so that will help you out. I haven't gotten it to corrupt my database at all under the new code, even if I hard kill the server while running multiple instances of the load test.
[Citadel Development] Re: master_cleanup() is now the global shutdown/exit function
Geez. The more I look into the Berkeley DB back end, the more fucked up I find it to be. Disregard the previous comment about the locking subsystem. I changed the dbenv->open() flags again, this time to:

DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER | DB_THREAD

The documentation says that DB_RECOVER means "run recovery before opening", but for some reason this combination of flags makes it create __db.00X files in the data directory, which I think are environment files. I'm not certain about that. But it seems to be more stable now. I've got a test system running (it's on 'franklin' -- thanks Nurb432) that I can reliably get to corrupt itself by interrupting it during a run of The Dreaded Auto Purger. With any luck we'll be able to make this backend simpler and way more reliable. Just in time to deprecate it. :)
[Citadel Development] Re: master_cleanup() is now the global shutdown/exit function
Some interesting developments here. I disabled Berkeley DB's locking subsystem, because Citadel handles locking and concurrency on its own. This seems to have fixed the segfault when it closes the database environment. I also have not managed to corrupt the database no matter how badly I abuse it, even by forkbombing a test client that writes to the db, and killing the server process in the middle of that. I'd appreciate it if some of y'all could help me test that. Hit it hard. Abuse it. Preferably with lots of concurrent sessions all performing write operations. Let me know how it goes.
[Citadel Development] Re: cdb_next_item() now returns both key and value
Clarification of the above commit -- because it's important.

When traversing an entire table (as opposed to just fetching one row for which you already know the key) the backend API now returns both a key and a value. This is done by returning a "struct cdbkeyval" which simply contains a pair of "struct cdbdata", one for the key and one for the value. Yes, my discovery that you've been able to return a struct by value from a function in C for the last couple of decades has already reflected itself in Citadel. :)

All of this is based on a *new* rule in the backend interface: when you fetch an item from the database, either with cdb_fetch() or cdb_next_item(), the caller does NOT become the owner of the memory it points to. This is how Berkeley DB operates by default (and I changed it back to this mode over the past week) and it's the *only* way LMDB can operate.

Why all the fuss? Well, let me tell you; thanks for asking! ctdldump and ctdlload currently call Berkeley DB directly. This is non-awesome. It means that when we add more backends we would have to write new versions of ctdldump and ctdlload for each one. Not gonna do it. ctdldump and ctdlload will be modified to use the same backend interface that citserver uses. Then we'll be able to dump and load in any format on any machine. But ctdldump needs to know the keys, not just the values. And so, cdb_next_item() now returns both.
[Citadel Development] IMAP Flags branch?
HarlowSolutions: I noticed that the IMAP Flags branch is no longer in the repo. Are you refactoring it? Do you need anything from me to move forward? I've been heads down on this stupid database thing lately.
[Citadel Development] (null)
Putting HTML directly into the editor does work if you push the "HTML" button :) (...and someday soon, when this whole war on berkeley db thing is finished, we'll get back to webcit-ng...)
[Citadel Development] (null)
UFarx: the formatter messed up that patch, I think. Even after removing the br tags I ended up having to apply your changes manually. Thank you for the patch, but in the future please either post it as an attachment (good) or upload it to our GitLab as a merge request (better). And thank you for the clarification about CPPFLAGS ... somehow after all these decades I never realized that CPP stood for "C PreProcessor" and not "C Plus Plus".
[Citadel Development] (null)
UFarx: the formatter messed up that patch, I think. Can you put it in as a merge request? I'll pm you a login.
[Citadel Development] (null)
>if that's welcome, i'm happy to beef things up a little without >sacrificing any compatibility. I'd be pleased to see that. Maybe just do one component first so we can see what it will look like. We try to make the build work cleanly on FreeBSD and Linux. There is no interest in testing on obsolete variants such as Solaris, AIX, HPUX, etc. In other news, y'all may have noticed that we're scraping the Citadel commit log into this room now.
[Citadel Development] (null)
All right, that worked much better. The database backend interface has been reverted to what it was before. The backend for Berkeley DB still needs thread-specific data, but now it maintains that data locally to the module instead of tainting the server with its own needs. Sensibly, when something in the backend needs to store or retrieve something specific to the thread (a cursor, a lock, or a transaction) it calls a function that says "please give me a pointer to the thread-specific data region for this thread; if none exists, create one". I'm guessing that when we built the previous arrangement 25 years ago we were expecting we'd need to store more than just database stuff in that region, but it turned out not to be the case. Also we were foolishly envisioning a Windows port at the time so who knows what was going through our heads.
[Citadel Development] (null)
> Therefore I am changing the API so that cdb_rewind() returns a cursor, >and cdb_next_item() must be passed that cursor with every call. Well, that was a disaster, and with more people involved now I ought to test this stuff before announcing an API change. I think I'm going to try simply moving the thread-specific data into berkeley_db.c and letting it handle that internally, instead of trying to do surgery on the module and change its inner workings.
[Citadel Development] Re: Build System
>as for creating a patch, i would but if you wanna switch to cmake >anyway it might not be necessary anymore. a word of caution though >regarding cmake: CMake is interesting and I'm happy to learn about it, but we've got too much going on right now to think about switching to a new build system.
[Citadel Development] (null)
>I already did the account creation thingy. Ok, I approved the account. I guess I need to set up more alerts on that sort of thing.
[Citadel Development] (null)
Absolutely! Just go to https://code.citadel.org and create an account. You will need to use your @uncensored.citadel.org address, at least initially. I'll approve it in short order.
[Citadel Development] (null)
>If it were up to me, the entire build system would be using CMake. That does look nice and clean. My objective wasn't to stop using Make, but to stop using the GNU Autotools stuff. Those tools are built to handle all sorts of obscure edge cases that don't exist anymore. There just isn't that much variation in our target platforms anymore, so a few simple tests are all we need. As long as it builds on BSD and Linux, I'm fine with it; we no longer have to check for oddball systems with oddball libraries because no one is running Citadel on those systems anymore (and we don't have enough developer time to support them anyway). I probably reinvented a wheel somewhere ... but it was a very basic, very simple wheel. All of the usual criticisms of Richard Marx Stallman are included herein by reference.
[Citadel Development] (null)
More notes on the database abstraction layer: I did point out that the API isn't totally frozen :) For database cursors, we were keeping cursor variables around per-thread-per-table, which requires putting Berkeley DB specific stuff in server/threads.h , and that's got to go. Therefore I am changing the API so that cdb_rewind() returns a cursor, and cdb_next_item() must be passed that cursor with every call. As always, the caller is expected to keep reading until the end, at which point the cursor is de-allocated by the backend. The data type of the cursor is `void *` and the backend is expected to cast it to its own native type.
[Citadel Development] Re: Build System
UFarx: your suggestions for the build system are most welcome. I don't suppose we could ask you to submit a patch? We have a GitLab instance at [ https://code.citadel.org ] which accepts merge requests to the repository. (You'll need to sign up for it using uf...@uncensored.citadel.org as your email address, since it only accepts citadel.org signups to prevent noise from the outside.) I believe we do use CFLAGS and LDFLAGS in at least some of the builds. We're in the process of moving away from the GNU Autotools, as you might have noticed. Citadel Server, the text client, and WebCit-NG are using our own build system, while WebCit Classic and libcitadel are still on the GNU tools. CPPFLAGS is not used anywhere because there is no C++ anywhere in the system.
[Citadel Development] (null)
Work on the abstraction layer for the storage backend continues. There is a new directory ./server/backends/ that is similar in design to ./server/modules/ for the rest of the server modules. I chose to open a different hierarchy because, as previously mentioned, the storage backend gets initialized long before the add-on modules get initialized. There is a file ./server/database.c which handles the selection and activation of one of the available backends. Naturally, it is simply hardcoded to activate the Berkeley DB backend right now, because that's the only one we have. If you want to write a backend, here's what you have to do: 1. Create a directory ./server/backends/foo 2. Populate it with ./server/backends/foo/foo.c and ./server/backends/foo/foo.h 3. Modify ./server/database.c to include your header file, and to call your initialization function instead of the BDB one. 4. Implement the interface described in ./server/database.h, using ./server/backends/berkeley_db/berkeley_db.c as a reference. Since this is a brand new effort, I cannot guarantee that the interface won't change a bit as we refine the abstractions. But it won't change much; the basic functions are pretty much in the form they'll continue to be. My next task is to update `ctdldump` and `ctdlload` to use this interface instead of calling BDB directly.
[Citadel Development] (null)
All righty, folks ... anyone looking to build a database backend of their own, please check out the latest batch of changes in the git repository. It basically works like this: * database.c has been renamed to database_bdb.c * database.h now basically declares an API for storage backends to use * All functions beginning with cdb_ were renamed to bdb_ * The cdb_ functions are now just pointers * During initialization, the cdb_ functions are assigned to their corresponding bdb_ functions I've tested it a bit and everything seems to be working just as it was before. I did the whole thing with function pointer assignment because I didn't want to write a giant batch of wrapper functions to marshal calls to the correct backend and then require one more step up the stack for every call. Since only one backend will ever be used at a time, we load the pointers at initialization time and then everything works exactly as it did before. Here's where I expect to go with this next: * A new server/database.c file will come into existence, and it will handle the binding of the selected backend * Backends will move to a separate part of the directory tree, probably server/modules. They will need to have a different initialization convention because the data store is initialized long before server extension modules are initialized. * Backends that are incompatible with a particular host (for example, LMDB on 32-bit) will simply be skipped in the build. * We'll have to add another function to scan the data directory and detect what backend was used to write any existing database that might already be there. If an existing database is present, we will automatically load the correct backend. * Where no existing database is present, we will choose one in order of preference. A command line switch will of course be available to override the server's decision. Does that sound acceptable to you all? 
This should make it possible to bring experimental backends into the tree without foisting them upon the user community.
[Citadel Development] Re: Database Recovery Tools
Ok, here's what I'm starting to do. You can totally follow the commit logs to stay up to date. database.c has been renamed to database_bdb.c for the obvious reason. (We've been here before, but it was about 25 years ago when we switched from GDBM to Berkeley DB.) The database interface was built to be swapped out. It's going to be even more modular soon. If you look in database.h you're going to see a set of functions that is basically the "driver API" for database backends. My work over the next couple of days will be to completely decouple Berkeley DB from the main server code and turn it into a server module that registers as a database backend. At that point we will be able to build in any database driver we want -- just like the OpenLDAP folks did. I'll probably build the LMDB driver, and configure it to only build on 64-bit systems. With the modular interface, you guys can build SQLite or MariaDB or whatever tickles your fancy.
[Citadel Development] Re: Database Recovery Tools
The only thing that's "all the rage with the kids" is *calling* it "NoSQL". Berkeley DB has been a simple key/value data store for 30 years. And until recently it's worked fine for us.
[Citadel Development] Re: Database Recovery Tools
To be honest I'm leaning towards LMDB, and for the people running on Raspberry Pi we will have to say "look, you're just going to have to install the 64-bit OS before you upgrade" Read through this presentation on LMDB. It's practically designed for our use case: [ http://schd.ws/hosted_files/buildstuff14/96/20141120-BuildStuff-Lightning.pdf ]
[Citadel Development] Re: Database Recovery Tools
The options currently under consideration are SQLite, MariaDB, and LMDB. SQLite: well known, well respected, embeddable. To do it right, however, would require adding a schema layer. The idea of just throwing away the SQL semantics and storing all records as blobs kind of bothers me from a design point of view. MariaDB: requires a separate database server process, plus the schema issue. To be honest I'm not a big fan of this idea. LMDB: also very well regarded, and considered by most in the open source community to be "the" replacement for Berkeley DB. It uses memory mapped file semantics as its storage layer, so it's fast and stable. The downside is that because of memory mapped files, the maximum database size is 4 GB on a 32-bit system. This is not enough for Citadel, so we would have to end support for 32-bit hardware. On x86 this is no problem, but on ARM it could be an issue. More on this soon. We have to do something.
[Citadel Development] Re: Database Recovery Tools
Ok, I think I found it. [ https://docs.oracle.com/cd/E17276_01/html/installation/build_unix_small.html ] Easy Install currently uses the "--enable-smallbuild" flag, which disables verify. I am going to change it to: --with-cryptography=no --disable-hash --disable-heap --disable-queue --disable-replication --disable-statistics That'll give us back the "verify" code, and more verbose error messages too.
[Citadel Development] Re: Database Recovery Tools
I changed your project role from Developer to Maintainer, so hopefully you can do merges now. I'm also going to try to get GitLab to send more "news" to this room.
[Citadel Development] Re: Database Recovery Tools
Ok, here is the DB installation:
$MAKE $MAKEOPTS || die
$MAKE install_lib || die
$MAKE install_include || die
$MAKE install_utilities || die
So they *should* be there. Are your tools not on the system, or did it throw an error message when you tried to use them?
[Citadel Development] Re: Database Recovery Tools
Did your db get corrupted because you were playing around with a development system, or did it happen on its own on a production system? I've just about had it with Berkeley DB, which is why I've spent a large number of hours on ctdldump/ctdlload. This is our escape hatch once we have an alternative. But I also had a corrupted db on a development system that I (perhaps deliberately) crashed out a few too many times. I was able to use ctdldump/ctdlload to recover it instead of using db_dump/db_load and it actually worked pretty well. I'm wondering whether database_cleanup.sh should switch over. Nevertheless, I will check to see if the utilities are installed. I really thought they were.
[Citadel Development] Re: Merge Request Approval: IMAP memory issues with use of ConstStr
Crap. Ok, I'll do the merge, but I really need to figure out what the heck is happening.
[Citadel Development] Re: Merge Request Approval: IMAP memory issues with use of ConstStr
Sorry about that, it just got away from me. I approved it. Looks good.
[Citadel Development] Re: Merge Request Approval: IMAP memory issues with use of ConstStr
I agree. The developer who introduced StrBuf and ConstStr was helpful in giving us an elastic string class, but I wish he hadn't gone back through old code and converted stuff without a really deep inspection of how it was used.
[Citadel Development] Re: Merge Request Approval: IMAP memory issues with use of ConstStr
>Not sure why the parameter was empty, but a lot of code accessed ConstStr >pointers without checking for zero length and pointer was null and could >coredump, so went through and added checks. *sigh* The IMAP server was written completely with straight C pointers and string buffers and it was extensively tested as it was built. Unfortunately, the developer who added StrBuf and ConstStr to libcitadel was a little too eager and converted a lot of code all throughout the system, and introduced subtle bugs everywhere.
[Citadel Development] Re: Approval Request: Memory_Leak_SmtpClient
>I am changing to updating the records on the fly rather than upgrading, so >the version number is really not that important. New records use a version >signature and don't compare to the citserver version, so no conflict. Sounds good. We are officially changed over from ctdlmigrate to ctdldump/ctdlload now, so if your tree has anything in the modules/migrate/ directory you can safely remove it. We'll have to figure out how to deal with the multiple record version types in ctdldump/ctdlload, of course, but that should be straightforward.
[Citadel Development] (null)
That is more than sufficient, thanks!
[Citadel Development] (null)
I just pushed an update out that's going to do something ... that will either win big or lose big. It's possible that the occasional database corruption has something to do with aborting so quickly when an occasional segfault happens (for whatever reason): it just stops without closing the databases. So for the time being I have it set to capture signal 11 and go straight to closing the databases, like we do when we capture some other signals. Let's see if it makes things better. For the long term, splitting the database accessor into a separate process might make sense.
[Citadel Development] (null)
I would certainly welcome having a chance to test the build on that target and see how it goes.
[Citadel Development] (null)
Do you have a live system you can use as a source of test data? If so, the most excellent test would be (after I get ctdldump and ctdlload into the build) to dump your system, load the dump on another system, and go through it carefully to see if the source system was reproduced with 100% fidelity.
[Citadel Development] Re: Approval Request: Memory_Leak_SmtpClient
Oh, and I pushed the release button a few times, so when the visit merge is done we'll have to adjust the version numbers being applied.
[Citadel Development] Re: Approval Request: Memory_Leak_SmtpClient
> If not then I'll merge and then troubleshoot. I think when we tell it >not to squash the commits, it brings in your entire commit history >followed by a merge commit. Ok, I went ahead and tried it. It did exactly that. The commits that make up the merge are written to the log and credited to the person who actually wrote and submitted them. Then a commit appears at the end showing who performed the merge. I'm ok with that. The trick is to make sure the "squash commits" option is NOT checked, because that hides the details.
[Citadel Development] (null)
F*g hell. I had to restore from a backup this afternoon because my database bit the dust. Looks like it's time to accelerate the replacement of Berkeley DB. ctdldump/ctdlload is functionally complete, and we can start testing it. I'm also thinking that maybe we want to split the server core and the database handler into two separate processes, so if something happens to citserver it doesn't whack the database along with it. This is something we'd get naturally with an external SQL server, I know. *sigh*
[Citadel Development] Re: Approval Request: Memory_Leak_SmtpClient
I had the same problem, and then I realized I was not logged in. Any chance that's happening to you? If not then I'll merge and then troubleshoot. I think when we tell it not to squash the commits, it brings in your entire commit history followed by a merge commit.
[Citadel Development] Re: Approval Request: Memory_Leak_SmtpClient
Geez. Nice find. Approved, of course.
[Citadel Development] Re: Master: citserver coredump on Shutdown
All good, take as much time as you need on the IMAP flags stuff. We'll make sure we do it right. As you've undoubtedly noticed if you follow the commit logs, over the last couple of weeks I've written "ctdldump" and "ctdlload" utilities that can be used to dump a Citadel database into a flat file and then load it back on another system. This can be the first step towards a transition period between databases, of course, since we can build compatible dump/load utilities for each database that share a common export format. To keep things simple: the export format is not intended to be backwards compatible. As with its predecessor, we expect the dumping and loading systems to be running the same version of Citadel Server. Uncensored is still running on 32-bit i386. One of the first things I want to do is use these utilities to finally get my own system onto 64-bit. I tried it a few months ago with ctdlmigrate and it did *not* go well. Aside from taking many hours to run, the resulting system was unstable and had data missing. I guess all that conversion back and forth to XML on running servers just wasn't a great idea, which is why ctdlmigrate is being discontinued. SQLite is pretty cool, and I'd be interested in hearing your schema idea. I am somewhat turned off by the idea of having a schema layer and not really using it. I wonder if it's possible to access SQLite's B-tree back end directly?
[Citadel Development] Re: Master: citserver coredump on Shutdown
To get things started, I have begun work on `ctdldump` and `ctdlload` frameworks. I've also improved the Makefile for Citadel Server so that it doesn't require compiling the entire server every time even one file is touched. I got lazy when I removed the GNU Autotools and just had it compile *.c in one shot, and I'm finally coming back to that now. It needs to be tested on FreeBSD to make sure I didn't do anything Linux-specific in there.
[Citadel Development] Re: Master: citserver coredump on Shutdown
Potentially, yes. I'm thinking more in terms of how to refactor the code. The database access stuff is all in `database.c` because there *was* a somewhat pluggable architecture in the past -- from flat files to GDBM and later to Berkeley DB. Since we've been using Berkeley DB exclusively for 20+ years, some tight couplings may have snuck in, so we'll have to identify them and properly abstract them. The pluggable database layer was, at the time, intended to be used to switch between blob stores, not SQL databases. I'm not against going SQL but it would greatly impact how we build it -- and that's going to depend on who is doing the work. The way I see it, a Citadel Server binary would be linked against one (and only one) database back end. We've played around with linkable modules in the past, and it ended up being a support nightmare. Maybe that's different now because we only really have to support Linux and FreeBSD, and both of those are more stable targets. So if you have support for multiple database types, each one would have a `citserver` binary, a `dump` binary, and a `load` binary. Some of these would be unnecessary if some of the back ends are SQL based and the dump format is a standard SQL dump. I see two different development paths. One is a march towards LMDB and one is a march towards SQL:
(A) Blob stores only:
1. Make sure the `database.c` interface is 100% decoupled from the rest of the system.
2. Refactor `ctdl3264` into a dump-and-load program.
3. Develop drivers for LMDB, Cassandra, MongoDB, or whatever.
(B) Blob *and* SQL stores:
1. The new abstraction becomes the interfaces such as Ctdl[Get|Put]User, Ctdl[Get|Put]Room, ForEachUser, etc. Refactor the code so these interfaces find their way into the pluggable side of things.
2. Add new pluggable stores for SQLite, MariaDB, etc.
3. Refactor `ctdl3264` into a program that can dump the blob stores into a standard SQL dump. (Dump and load on the SQL based back ends would use their native utilities.)
I'm interested in hearing any and all good ideas! But no one gets a vote unless they commit to helping :)
[Citadel Development] Re: Master: citserver coredump on Shutdown
Grrr. Unrecoverable error on my database this morning. I think maybe it is time to put Berkeley DB out to pasture after all. Let's think about what we want the next one to look like.
[Citadel Development] Re: Master: citserver coredump on Shutdown
>Pretty consistent for me. I put a check in the Berkeley code to work around it, >but not sure of the root cause either. Something must be getting double-freed, but I can't figure out what.
[Citadel Development] Re: Merge Request: Support_IMAP_Flagged_Deleted_Draft_Recent_flags
>I just realized that a Seen message set of *:2099348686 means all messages >have been seen. Were all your messages unseen before the upgrade?
A seen message set of *:2099348686 means all messages have been seen up to and including 2099348686. We are using what RFC 3501 (IMAP) refers to as a sequence set. If you dig through the RFCs you can read all of the BNF notation, after which you will give up and throw things at your screen. But it's pretty simple to read (maybe not to manipulate, but definitely to read) once you understand it. A sequence set is basically zero or more atoms separated by commas. Each atom can be one of five things:
m (a message number) - describes exactly one message. For example: 1,2,3,10,11,12 means exactly those six messages, and no others, existing or future.
m:n (a range) - describes all messages from m to n inclusive, including any new messages that may appear inside that range in the future. For example: 1,2,3,10:12 means messages 1, 2, 3, and everything from 10 to 12 inclusive. If message 11 exists it is included. If message 11 does not exist but appears later on (for example, someone moved it into the mailbox/room from elsewhere) it is included.
*:n - a range from zero to n, inclusive. You'll see this a lot in Citadel; it means "all messages up to and including n".
m:* - a range from m to infinity. You'll almost never see this because it includes any new messages that are added in the future, which isn't exactly desirable.
* - all messages, any number, current or future. Again, you'll almost never see this because it includes any new messages that are added in the future.
So if someone picks through a mailbox selecting and deselecting messages, you might see something like: *:124,160,199,200:207,315 Clear as mud, right? ;) To answer your question: when I tested it I had unread messages in quite a lot of rooms. After the conversion they were all marked as read even though I had not read them.
[Citadel Development] Re: Merge Request: Support_IMAP_Flagged_Deleted_Draft_Recent_flags
>I was trying to figure out how to reject the request myself. Does not look Do you have the button "Close merge request" on the bottom of the screen?
[Citadel Development] Re: Master: citserver coredump on Shutdown
>It looks like the Berkeley code is passing a NULL to a memory free routine >that does not check for NULL. Anyone else having a problem? If not, I >will look into how I build the server. It happens to me too, some of the time, and I have been busting my butt trying to figure it out.
[Citadel Development] Re: Merge Request: Support_IMAP_Flagged_Deleted_Draft_Recent_flags
Ok then, I think the proper workflow is to cancel the merge request and start a new one? I could reject it but that seems rude :) It isn't just "a seen message set" -- it was ALL of them, across all rooms, at least for me. But if you want an example, my current seen-set for this room (Citadel Development) is: *:2099348686 Are we processing the asterisk? We're using IMAP-compatible syntax, so that one means "every message up to and including 2099348686"
[Citadel Development] Re: Long term plans for phase-out of Berkeley DB?
You mean like this? ;) [ https://code.citadel.org/citadel/citadel-docker ] I'd like to get things trending towards more deployment of the Docker image (or technically an OCI compatible container, since you don't have to run it under Docker, blah blah blah) The current distribution is built as what I've recently learned is called an "omnibus" package. That is to say, it holds all components together in a single container, and is packaged that way for ease of deployment rather than for optimal componentization. You're also correct in pointing out that I would need `supervisord`, but the idea of bundling an entire Python runtime just to run supervisord rubbed me the wrong way so I wrote `ctdlvisor`, a 285-line C program that does roughly the same thing. So yes, it does make sense to try to move everyone towards the Docker version. I don't know what percentage of sites are running it right now. Based on the number of downloads reported by DockerHub it's pretty significant but I think Easy Install is still more popular. That may simply be due to the fact that Easy Install is the first option listed on the download page, and people are attracted to the word "Easy" in the title. They may be afraid of Docker, even though the description of that download says "It's our easiest install ever!"
[Citadel Development] Re: Merge Request: Support_IMAP_Flagged_Deleted_Draft_Recent_flags
Ok, I tried out the proposed merge. Here are the results.
1. I made a copy of the Uncensored database and ran it on the main line code to verify that the copy functions identically to the production database.
2. I switched over to the merged branch, built everything from scratch, and ran it on the same database.
3. The table conversion job took approximately ten minutes to complete. This might cause some server operators to believe the server has stalled, and they might take action to interrupt the process. We can talk about options for that later -- perhaps we can wait until an individual record is accessed before converting it, or at the very least we can put progress messages into the log.
4. I logged in to the converted system using my normal account. There were zero (0) new messages in all rooms. Prior to the upgrade, I had about 20 rooms with various numbers of new messages in them.
5. I went to a room that is gated by access level and performed a "Who knows room" operation. The results were correct.
6. I verified the following hidden rooms: room "the test room" with password "thepasswd"; room "Grrr" with no password; room "Citadel Development" with no password. In all cases, I performed a "Who knows room" operation, and the results appear correct.
So it looks like the visit record mapping is correct, the access control bits are perfect, the table conversion completes (albeit slowly), and there were no server crashes or loss of records. The biggest problem is that the conversion of "messages seen" data appears to zero out the field and marks all messages in all rooms as "seen". I did try creating a new message in a room, and then skipped out of the room instead of Goto-ing, leaving that message unread. When I returned to the room, it correctly showed as "1 new message". So the ongoing operation appears correct; it's just the conversion that has an issue. I love how you cleaned up the API. Bravo on that.
[Citadel Development] Re: Long term plans for phase-out of Berkeley DB?
>If you want to stay with index/data for the database, I think I might go back >and see if I can finish up a database.c replacement with MySql. What I Hey, I'm totally cool with moving to a SQL database, if someone else is going to be doing the bulk of the work. Seriously. I am just one person and I need to focus on getting our web interface looking like it wasn't written 15 years ago. If you're serious about doing it, we can talk about the parameters and requirements. Obviously the primary requirement would be for it to be ultra-stable, preferably able to take abuse better than Berkeley DB can. But another requirement would be for it to remain ridiculously easy to install. That seems to imply that SQLite might be a better option than MySQL/MariaDB, because we do *not* want to require people to install a working database server before Citadel can be installed. If we're going to go SQL rather than NoSQL, it would make sense to take advantage of the schema layer. For example, functions like CtdlGetUser() and CtdlPutUser() would end up doing the work of mapping table columns to/from elements in the server's internal data structures. There's a project called DBmail [ https://dbmail.org/ ] that is basically an IMAP server that uses a SQL database as its store (but has no other features: no user interface, no groupware/collaboration stuff, it requires an external MTA, etc). It's been maintained for some time, so it's definitely possible to use a SQL database for this purpose. So if you're serious about it and want to commit the time to it, and we can keep it *easy* for users to install (and to migrate to it) -- we can definitely explore that.
[Citadel Development] Re: Long term plans for phase-out of Berkeley DB?
>Just to chime in, the database.c code is very clean. The only problem I >have had with Berkeley is that if the code or system crashes, most of the >time, the database is corrupted and recover does not help sometimes and I >have to revert to a backup. The code has become a lot more stable so not as >big an issue anymore for me. That's why some time ago we made a decision to get rid of all cleanup code except that which has to do with closing the database. Nothing else matters. There was one developer who was with us for a few years who loved to use Valgrind for leak detection, and he did a great job at that, but the cost of it was a very complex server shutdown. In practice, sometimes it never reached the point where the databases got closed, either because the server itself was crashing, or something else went wrong. Switching to something else will be a big job, which is why we haven't done it yet. It isn't so much the writing of the new back end that is the problem, but the conversion of all the existing installations out there. There are quite a lot of them. We've done it before. We went from flat files to GDBM and then eventually to Berkeley DB. At the time we built external tools to convert the database offline. We might consider doing that again. My current thinking is that maybe we abandon serv_migrate.c and then change ctdl3264.c into an offline export/import utility with an architecture-independent and database-independent export format. But for now, I plan to refocus on WebCit-NG, which is now several years behind schedule because of all the side quests.
[Citadel Development] Re: Merge Request: Support_IMAP_Flagged_Deleted_Draft_Recent_flags
I'm away on vacation right now and casually browsing the changes. I'll give it a good thorough testing when I'm back at my desk after the weekend, but so far I'm quite impressed. It looks like the changes really follow not just the coding style, but the *design* style of the existing Citadel system. Some things -- like CtdlUserGoto() -- look even cleaner.

Unless something comes up that needs an emergency fix, we will stay at version 980 until this merge request has been accepted and tested. When I am back at my desk after the weekend I will try it out with a copy of the production database from Uncensored and see how it goes. It's basically the acid test, because after 35 years of nonstop operation there's a little of everything in there. No one else is actively working on the server core right now, so there shouldn't be a lot of merge conflicts, if any.

As I sit here on the balcony overlooking the ocean and sipping a tasty beverage, I am poking around on WebCit-NG, so I have a lot of commits this week but nothing on the server core. But for now I am signing off because it's time to go out and do something fun. Thanks for your help!
[Citadel Development] Re: Merge Request: Webcit_Coredump_StrBufQuotedPrintableEncode
>Not sure if regression, but if you try to send a message with a blank body,
>webcit coredumps. Just checked for empty message text and skipped trying to
>output.

I wasn't able to reproduce the problem, but the patch looks like it will add some safety, so I pulled it in anyway. In the future I think I'm going to just approve merge requests and then you can do the actual merge, so that the commits appear in your name instead of mine.
[Citadel Development] Re: Long term plans for phase-out of Berkeley DB?
Right, and we do table-level locking in the application before we even attempt to touch the database, so that has worked pretty well.

I did end up using an alternate --prefix just to keep the build a little more solid, but that seems to just change the symbol names internally without actually building a library that has a different name. It's a little useful, but not super-useful. I eventually decided to just make it build a static library, since the only binary that uses it is citserver (plus the utilities, but disk is cheap now and the Gotard community seems to have decided that static binaries are all the rage this year).

Debian really screwed us by deciding not to update BDB but also not to remove it, as it's integrated into some basic system stuff that can't easily be removed. Both the container distribution and the Easy Install distribution of Citadel have been updated to use BDB 18.1.40, which has been the latest version for some time now.
[Citadel Development] Re: Long term plans for phase-out of Berkeley DB?
Heh. Most of the database layer is still pretty much the same code you wrote 22 years ago. You did a great job on that.

I've thought about switching to a SQL model. It would confer a lot of benefits. It would also open up a huge effort in getting all of the code converted over to actually make use of that kind of flexibility.

Among other projects that use a "nosql" store, LMDB seems to be the favorite. But the 4 GB limit on 32-bit systems is a show stopper for now. On the x86 side of things, everyone (except me) has long since moved to AMD64, but Citadel is really popular on Raspberry Pi, which is still 32-bit for most users even though the hardware can do 64. If I want to procrastinate (which might be a good move for now), it might make sense to simply hope that we can keep Berkeley DB viable until most of the ARM world has migrated to 64-bit, and then switch to LMDB and make 64-bit a requirement for Citadel installations.

Funny you mention using an alternate `--prefix` ... I just started playing around with that last night and will probably move forward with that as part of the mainline installations.
[Citadel Development] Re: Merge Request: Webcit_Coredump_StrBufQuotedPrintableEncode
I see the merge request and will check it out. Thanks!
[Citadel Development] Long term plans for phase-out of Berkeley DB?
A bit of thinking out loud with regard to long term plans for Citadel Server.

Berkeley DB seems to have gained a bad reputation in the open source community. Much of it began in 2013 when Oracle switched it from the Sleepycat license to the AGPL. Debian discontinued updating it completely, which of course makes Citadel's life difficult because we have to bring in our own version of it, and that can create conflicts with the ancient version that's already on the system, which can't be removed because there are still things dependent on it, etc. etc.

Many open source projects have switched to LMDB. LMDB has received much praise, but I don't know how it would handle the write-intensive load of a Citadel Server. Another potential show stopper is that the maximum size of an LMDB database is 4 GB when running on a 32-bit system, and that's way too low. We do have a lot of people running Citadel on 32-bit ARM systems.

Wikipedia refers to Berkeley DB as an "unmaintained" piece of software, but then again, Wikipedia is edited by evil propagandists, so I don't know if that's actually true. There hasn't been a new release in three years, but that may simply be due to the fact that there really isn't anything left to add to it. Oracle doesn't list it as discontinued, and they still sell new commercial licenses.

I don't have time for another side quest right now. After I migrate my own system to 64-bit I have to get back to working on WebCit-NG. But it's something that we have to put on the long-term roadmap somewhere.
[Citadel Development] Re: New Developer Questions
Oh! Ok, that's why I didn't see a merge request. Cool, that's exactly how it should be. :)

In other news, I finally figured out why the server kept crashing (on this site). Someone put a match string in their inbox rules that doesn't compile to a regular expression. regcomp() fails, and then when we do the match, the library throws a segfault. I modified the code to only do a match if the regex compiles successfully, and we've been running clean ever since.

Unfortunately the server crashed in such a way that my user record got corrupted! That's why that "undocumented recovery mode" got added to the tree. :)
[Citadel Development] Re: New Developer Questions
I also see that the `Support_IMAP_Flagged_Deleted_Draft_Recent_flags` branch is still there. I thought it was going to get deleted in the merge. Mind if I delete it? I also have to make sure I'm not "squashing" multiple commits so they all get their own commit log messages. Thanks for bearing with me on this. The code is what's important, but if we make it easier to work on it, more people will want to.
[Citadel Development] Re: New Developer Questions
I'm assuming you're familiar with the industry-wide standard for how to write a commit message (summary line of 50 characters or less, then if you need to type more, skip a line and then make the longer description lines of 72 characters or less). If you're doing that in the commit message, then when you create a merge request you can just copy the same data over (or it might actually do that automatically, I don't remember). That way you won't have to needlessly duplicate work.

The whole GitLab thing is new for us, so as we get comfortable in it I'll probably have certain information get posted to this room automatically. I saw your last merge request because you told it to specifically go to me, but maybe we can have all such things get posted here automatically. Like I said, we'll figure out what's comfortable.

As for issues and enhancements, we have a "Citadel Issue Tracker" wiki room that I totally forgot to add you to until now. From a quick scan of it, I think some of the open issues have been fixed and I need to close them. I'm not sure whether using the GitLab issue tracker would be better or not. In the distant past we had a Bugzilla and it got to be a huge mess with people logging poorly researched bugs and feature enhancements.

Now shut up and code ;)
[Citadel Development] Re: Daily commit digest for Citadel (CDB_VISIT)
From my experience using other gitlab installations, yes, you have to create a branch and then commit to that branch, and then you can create a merge request. When the merge is approved (and I haven't set any approval rules, so we might have to experiment there) there is a "delete source branch" option. So that leaves you with two options:

1. You can create a feature branch every time you submit a merge request, and when the merge is performed you select the "delete source branch" option and then pull a new copy of `master` when you're ready to go work on the next thing. (This is what I do when I work on our devops team's gitlab server.)

2. You can create a personal branch that you work in long term. You'll need to rebase every time `master` is updated, otherwise you'll get error messages about it being ahead of your branch by x number of commits. When you want to do a merge request, you rebase your branch so that everything *else* is up to date, and then your merge request will only contain your actual patch.

At some point I will make `master` a "protected branch" so that a casual participant can't accidentally commit to `master` when they meant to commit to another branch. And then there will be a `release` branch that is rebased from `master` when we are ready to do a release, and it will automatically kick off a CI pipeline that updates Easy Install and builds the Docker images.

All other things being equal ... I'd rather be working on Citadel than on a build system. But this should save time in the long run.