On 25 June 2010 19:36, Michal Suchanek <hramr...@centrum.cz> wrote:
> On 25 June 2010 20:18, Owen Shepherd <owen.sheph...@e43.eu> wrote:
>> One of the reasons that I'm a fan of SCSU is that, with even a
>> relatively simple encoder, it produces output which is comparable in
>> efficiency to that of most legacy encodings.
>
> SCSU is a horrendous encoding because it uses shifts. When the shift
> is lost the text has completely different meaning. In UTF-8 if you
> remove part of the text only that part is affected (if you cut
> mid-character you create a bad character at worst but it can be
> clearly detected).

And how often do you lose a couple of bytes in the middle of a file?
More precisely, how often do you lose them without a checksum failure
(or some other error) notifying you? It's a particularly egregious
complaint in the context of Fossil, where all records are hashed
anyway! Additionally, if the same kind of error were to hit the SQLite
file that contains the repository, it would probably be trashed
irretrievably.

Years of experience with binary and other modal file formats (XML and
HTML, to name two very common ones) show that this is a complete
non-issue.

SCSU is of course a poor choice for an in-memory format (use UTF-16)
or for interacting with the console (for backwards compatibility
you're probably going to have to use UTF-8). But for a storage format,
particularly one embedded within a database? It's pretty much perfect.
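
To make the checksum point concrete, here is a rough sketch in Python.
It is not Fossil's actual code: the helper names (digest, drop_bytes)
and the sample text are made up for illustration, and SHA-1 is simply
assumed as the record hash. The idea is only that a hash stored next
to a record exposes lost bytes before any decoder, stateful or not,
gets to misread them.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex digest of a stored record (SHA-1 is an assumption for this sketch)."""
    return hashlib.sha1(data).hexdigest()


def drop_bytes(data: bytes, start: int, count: int) -> bytes:
    """Simulate corruption by removing `count` bytes beginning at `start`."""
    return data[:start] + data[start + count:]


# Hypothetical record content; any encoded text would do here.
original = "Привет, мир".encode("utf-8")
stored_hash = digest(original)

# Lose two bytes from the middle of the record.
damaged = drop_bytes(original, 4, 2)

# The untouched record still verifies; the damaged one does not.
intact = digest(original) == stored_hash
detected = digest(damaged) != stored_hash
```

The same check works whatever encoding the record body uses, which is
why a lost shift state is caught before the text is ever decoded.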