Just had a quick thought that might make the conversion to a library much easier.
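Here is a minimal sketch of the setjmp/longjmp idea spelled out in the next paragraph. Every name in it (`lib_do_work`, `internal_work`, `fatal_error`, `api_active`, `api_jmp`) is made up for illustration; none of them are fossil's real functions, and the sketch assumes that all fatal errors funnel through a single routine.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; not fossil's actual code. */

static jmp_buf api_jmp;     /* where a fatal error returns to in library mode */
static int api_active = 0;  /* set only while a library API call is running */

/* Stand-in for the existing fatal-error routine. */
static void fatal_error(const char *msg) {
    if (api_active) {
        longjmp(api_jmp, 1);  /* library mode: unwind to the API function */
    }
    fprintf(stderr, "fatal: %s\n", msg);
    exit(1);                  /* program mode: unchanged fail-hard behaviour */
}

/* Stand-in for existing internal code, possibly several calls deep. */
static void internal_work(int fail) {
    if (fail) {
        fatal_error("something went wrong");
    }
}

/* Library API function: returns 0 on success, 1 if a fatal error occurred. */
int lib_do_work(int fail) {
    int rc = 0;
    assert(!api_active);  /* one jmp_buf only: this sketch is not re-entrant */
    api_active = 1;
    if (setjmp(api_jmp) == 0) {
        internal_work(fail);  /* normal path */
    } else {
        rc = 1;               /* reached via longjmp from fatal_error() */
    }
    api_active = 0;
    return rc;
}
```

With this in place, `lib_do_work(1)` returns 1 to the caller instead of exiting the process. The catch is that anything allocated or opened between the setjmp and the longjmp is not cleaned up automatically, so the API functions would still need some way to release it.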
If the API surface is relatively small, each API function could call setjmp (https://en.wikipedia.org/wiki/Setjmp.h) on entry, and the fatal error routines could longjmp back to it instead of exiting. That would give you API safety with very little code intervention. If the API functions also set a flag, the fatal routines could check it and longjmp only when it is set, so fossil the program would need no changes: only the API functions would change the default fail-hard behaviour. But perhaps the API would be too big to make this a win. Just a thought.

../Dave

On 17 June 2018 at 07:50, Stephan Beal <sgb...@googlemail.com> wrote:

> On Sat, Jun 16, 2018 at 11:44 PM Sam Putman <atmanis...@gmail.com> wrote:
>
>> I'll be reading through the codebase and documentation, some initial
>> thoughts:
>>
>
> No pressure, but: i would _love_ to see someone pick up the torch and run
> with it.
>
> A bit of background: in Sept. 2011 i had the great pleasure of meeting
> Richard in Munich (at which point i'd been active on the mailing list since
> early 2008). He asked me what Fossil needed, to which i immediately
> responded "a library". We quickly came to the conclusion that the effort
> would be "herculean" (i believe was his (apt) description of it (or maybe
> that adjective got applied on the mailing list later on)), so i responded
> with my second choice: a JSON interface. (HTTP/JSON interfaces are, in
> essence, shared libraries with call-time linking. Many of Fossil's features
> simply aren't realistic for a JSON interface, but most are.) Richard
> promptly agreed, and i spent the next few months building the JSON API
> (using a then-recent JSON wiki project of mine as the "structural basis").
>
> Anyway...
>
>>> Several aspects of fossil make it very tedious (but not difficult, per
>>> se) to port to a library:
>>>
>>> 1) it uses a great deal of global state. That's simple enough to factor
>>> into a Context object, but...
>>>
>> An incremental refactoring of this into something more modular would
>> be a boon to maintenance and testing. Seems like a sleeping dog we can
>> let lie for now.
>>
>
> That's actually the easy part. The real effort comes in with error
> checking and handling, especially in cases where an error may (in a
> library) be propagated from 3+ levels deep. The app will just exit() at that
> point, so no thought has gone (nor needed to go) into error handling or
> propagation (because there is no propagation - all errors are "immediate").
> Not only does the propagation at the error-triggering point need to be
> decided upon, but how it will propagate arbitrarily far up the call stack.
> In the end, libfossil went with a hybrid approach of returning non-0 (from
> a well-defined enum of result codes) and the context object holds an Error
> object which may (or may not, depending on context) hold more details about
> the error.
>
>>> 2) it relies on a fail-fast-and-fail-loud allocator. Any allocation error
>>> will immediately (intentionally) crash the app. While that saves literally
>>> half (sometimes more) of code/error checking any place where memory is
>>> allocated (that's a lot of places), that pattern is unusable for libraries.
>>> Granted, allocation errors are rare, but every single C call which
>>> allocates has to check for failure or risk Undefined Behaviour. To simplify
>>> the vast majority of the implementation, Fossil does this checking in a
>>> single place and abort()s the app if an allocation fails.
>>>
>> Ok, this doesn't sound /ideal/ granted, but maybe not so bad either.
>>
>
> Because allocations fail so rarely (at least ostensibly), it's "not that
> big of a deal", but the library-level implementation code "needs" (in my
> somewhat-purist point of view) to check for allocation errors nonetheless.
> App-level code is free to use a fail-fast allocator, and libfossil's
> app-level code did, in fact, use one because it speeds up writing the
> app-level code so much. Fossil does _lots_ of allocation, and does, in
> fact, sometimes run out of memory. i've never seen it happen on my
> machines, but i've seen several reports from users who try to store
> multi-GB files in fossil and then wonder why it fails on their Raspberry
> Pi. Fossil needs scads of memory. Certain parts of that "could"
> hypothetically be optimized to only alloc what they need (e.g. the diff
> generator could arguably stream its output), but (1) that would greatly
> complicate those parts and (2) very possibly wouldn't result in a leaner
> app. e.g. constructing version X of a file from its parent version and the
> diff of the versions requires allocating memory for X, X's parent, and the
> diff (it knows all of those sizes in advance). In the average case that's
> just a bit over 2X memory for each such operation, and fossil regularly has
> to perform such an operation during many different types of activities.
>
>> I would likely prefer as much allocation as possible during load. An
>> allocation error during this stage
>> is a show-stopper.
>>
>
> Because fossil can be used in several discrete ways (e.g. within a
> checkout, with (only) a repo, and with neither checkout nor repo (for a
> limited subset of operations)), it's impossible to supply a single init
> operation. An app needs to tell fossil to init into some specific mode of
> operation, and the API "should" allow the user to toss that away and
> re-init with a different mode (but that's kind of a free feature when you
> create the API as library-centric).
>
>>> 3) Fossil effectively uses exit() to handle just about any type of
>>> non-allocation error. i.e. there's little library-friendly error handling
>>> in fossil.
>>>
>> I guess this bullet depends on how much error handling is possible at
>> those points, and how badly
>> failures would bork the global state.
>>
>
> In fossil proper there are very few (if any - i can't recall any at the
> moment) places where it's considered feasible to even attempt recovery. It
> either succeeds completely or fails right in the middle of what it was
> doing (which might leave stale files lying around, but it won't break the
> DBs, thanks to transactions (fossil adds pseudo-recursive transactions to
> sqlite, btw, which greatly simplifies certain types of db operations)).
>
>> If the answer is "none" and "not a bit" then turning some of these
>> exit()s into a library error would be plenty.
>>
>
> That requires, though (as touched on above), rewriting all of the
> interfaces to allow such a propagation. That's a significant part of a
> library port.
>
>>> 4) Last but not least: Fossil implements a great many intricate algorithms
>>> which, if not ported 100% perfectly, could lead to all sorts of Grief, some
>>> of it difficult to track down. Such ports typically require 2x as much
>>> code, sometimes more, because of the addition of error checking and
>>> handling (as opposed to using abort() and exit()).
>>>
>> The networking-related functionality is the part I personally don't need;
>> we're using the luv bindings
>> to libuv and I'm quite happy with that.
>>
>
> Networking was slated for last - the underlying streaming interfaces were
> in place (off of which networking resp. remote communication could be
> added), but there were, at the time, no concrete plans to implement those
> particular features.
>
>> The way I explained my desire in that initial email is "everything you
>> can't remove without breaking
>> fossil". From what I gather there are some tasks which rely on the admin
>> interface, and those
>> SQL queries might need to end up in some kind of controller module to
>> make a durable API.
>>
>> This also means you might be closer to done than you think!
>>
>
> i got it pretty far along, but the SHA-related changes in 2017(?) made
> libfossil immediately incompatible with newer repos, which means that
> getting it back up and running would take some effort (for which i have no
> estimate, and can't get one without spending more time in the code than is
> remotely good for my hands).
>
>> I concur with Warren that the effort of a libfossil is best justified if
>> it becomes the core of fossil proper.
>>
>
> Absolutely 100%, but it's essentially impossible to back-port it into
> fossil proper without some massive upheaval. Since fossil lies at the heart
> of the sqlite project, there's not (in my somewhat conservatively cautious
> view) much room for such severe upheaval. A 3rd-party implementation is
> interesting in and of itself, but it would also potentially be a point of
> contention, as you say...
>
>> Keeping a libfossil in sync with an upstream fossil poses risks in both
>> directions. There are merges from
>> fossil core, which is an arbitrary amount of ongoing work. There's also
>> the real possibility that libfossil would
>> start innovating in ways that would cause compatibility drift.
>>
>
> There were _never_ any plans to innovate libfossil in terms of the SCM
> features. The only "incompatible" thing libfossil ever did was to allow a
> repo to be completely empty (no initial checkin). Fossil "seeds" new repos
> with an empty checkin because that means it never has to deal with a
> non-positive artifact ID, but that's not strictly a requirement of the
> model, just an implementation convenience. (IIRC, Jan found and fixed all
> such assertions in fossil at the time, but more may have snuck back in
> since then.)
>
> "Feature drift" was a genuine concern which (at the time) i hand-waved
> away with 2 justifications: 1) i am/was active in Fossil, so my visibility
> into fossil was high, i.e.
> it was unlikely that i'd forget to port some
> important fix/feature. 2) it's actually extremely rare that the core
> algorithms get (or need to be) touched - they've been in place for many
> years and are low-maintenance.
>
> Though point (2) still stands, obviously point (1) was overly-optimistic
> - i would have bet more on being hit by a bus than having both of my elbow
> nerves go on strike for so long.
>
>> Tasks like isolating those core intricate algorithms into well-documented
>> modules, where
>> errors and edge cases are handled where they occur, this can really pay
>> off.
>>
>
> That's all in libfossil.
>
>> Merging and patch theory are
>> areas where real conceptual leaps are still happening.
>>
>
> libfossil has all of fossil's diff algorithms but i don't think i ever
> ported the full merge support (it can apply deltas but i don't recall
> porting the type of merging decisions which are made during, e.g., a
> checkout). Speaking of merging: that's often an interactive process, and
> interactivity is difficult to define in a UI-ignorant library.
>
>> The one area of fossil I've done enough reading into to feel comfortable
>> in my understanding is the
>> file format itself. There's an edge to the documentation and I'm kinda
>> peering over that edge slightly.
>>
>
> The "artifact format" documentation is really Fossil's heart. All of the
> other parts are implementation details for supporting that. Nonetheless,
> any port will certainly want to take advantage of as many of those details
> as possible (much of fossil's "heavy lifting" is done with sqlite, and
> reimplementing many of those pieces without sqlite would be a massive
> undertaking).
>
>>> i would be thrilled to see someone implement a library for fossil, but
>>> anyone doing so needs to understand, in advance, that it's a large
>>> undertaking.
>>>
>> I'm happy to sign contribution agreements and otherwise smooth the way to
>> collaborating on this.
>>
>
> None are needed if you just want access to libfossil (initially they
> were, but that requirement was later dropped). If you'll send me your
> preferred user name off-list i'll get it set up.
>
>> Thanks again, Stephan. I'll be looking into those links, please don't
>> feel as though a back-and-forth on each
>> email is necessary, whatever is comfortable for you.
>>
>
> My hands have "good days" and "bad days", but today's relatively good. In
> any case, every now and then i have to sit down and type for a while just
> to see if my hands can take it.
>
> --
> ----- stephan beal
> http://wanderinghorse.net/home/stephan/
> "Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
> those who insist on a perfect world, freedom will have to do." -- Bigby Wolf
>
> _______________________________________________
> fossil-users mailing list
> fossil-users@lists.fossil-scm.org
> http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
>