Package: ckermit
Version: 416~beta12-1
Severity: grave
Tags: patch upstream
Justification: causes non-serious data loss

ckermit has a system that is designed to automatically convert files when transferring between different platforms. Unfortunately, this system is enabled by default and it most often does the wrong thing in a modern context.

There are two settings that impact this:

  SET FILE TYPE BINARY
  SET TRANSFER MODE MANUAL

Fortunately, the file type is already binary. However, according to the help text:

  When TRANSFER MODE is AUTOMATIC (as it is by default), various automatic methods (depending on the platform) are used to determine whether a file is transferred in text or binary mode; these methods (which might include content scan (see SET FILE SCAN below), filename pattern matching (SET FILE PATTERNS), client/server "kindred-spirit" recognition, or source file record format) supersede the FILE TYPE setting but can, themselves, be superseded by including a /BINARY or /TEXT switch in the SEND, GET, or RECEIVE command.

  When TRANSFER MODE is MANUAL, the automatic methods are skipped for sending files; the FILE TYPE setting is used instead, which can be superseded on a per-command basis with a /TEXT or /BINARY switch.

Unfortunately, transfer mode is indeed automatic by default. When sending between Unix platforms, the Kermits recognize that they are similar and always use binary mode (this uses a "sysid" system). However, when sending between Unix and other platforms (Windows, etc.), the Kermits recognize they are different and the automatic methods are used. The default file type is binary, but the automatic methods can override it.

The enclosed patch will set the transfer mode to manual by default. There is a small chance this may change behavior for certain legacy scripts; they can always add "set transfer mode automatic" to the start. This automatic behavior has not been static throughout the life of ckermit, however, and this would not be the first time changes have been made.

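The precedence rules quoted from the help text can be summarized as a small decision function. This is only a sketch of the documented behavior as described in this report, not C-Kermit's actual code: the names FileType and effective_type are hypothetical, the automatic methods (content scan, filename patterns, record format) are collapsed into a single "heuristic" argument, and falling back to the FILE TYPE setting when the heuristics give no answer is an assumption.

```python
from enum import Enum
from typing import Optional


class FileType(Enum):
    TEXT = "text"
    BINARY = "binary"


def effective_type(
    transfer_mode: str,
    file_type: FileType,
    switch: Optional[FileType] = None,
    kindred: bool = False,
    heuristic: Optional[FileType] = None,
) -> FileType:
    """Model which mode a file is sent in (hypothetical helper, not ckermit code)."""
    if transfer_mode not in ("automatic", "manual"):
        raise ValueError("transfer_mode must be 'automatic' or 'manual'")
    # A per-command /TEXT or /BINARY switch supersedes everything else.
    if switch is not None:
        return switch
    # MANUAL: the automatic methods are skipped and FILE TYPE is used.
    if transfer_mode == "manual":
        return file_type
    # AUTOMATIC: similar ("kindred") platforms always use binary mode.
    if kindred:
        return FileType.BINARY
    # AUTOMATIC, different platforms: the heuristics override FILE TYPE.
    if heuristic is not None:
        return heuristic
    # Assumption: with no heuristic result, fall back to FILE TYPE.
    return file_type


# Old default: FILE TYPE BINARY is overridden when a file looks like text.
print(effective_type("automatic", FileType.BINARY, heuristic=FileType.TEXT).value)  # text
# Patched default: FILE TYPE BINARY is honored.
print(effective_type("manual", FileType.BINARY, heuristic=FileType.TEXT).value)  # binary
```

The two calls at the end correspond to the Unix-to-Windows case before and after the patch: the file type setting is binary in both, but only manual mode guarantees it is honored.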
Prior to this change, "show file" shows, among other things:

  Transfer mode: automatic
  File patterns: automatic (SHOW PATTERNS for list)
  Default file type: binary

With this change:

  Transfer mode: manual
  File patterns: automatic (but disabled by TRANSFER-MODE MANUAL)
  File type: binary

The modern assumption is a byte-accurate transfer of files. We have had a proliferation of file types, extensions, and complicating circumstances since the earlier days of Kermit.

By changing this default, we disable the heuristic for attempting to guess the type of files, and convert the existing binary default into a binary setting.

This can always be changed by the user, but the idea is to not violate the principle of least surprise. If the user asks to transfer a file, we assume the user wants an exact transfer of the file unless stated otherwise.

Additionally, some platforms (eg, HP48 calculators) have wildly different behavior depending on whether a text or binary transfer is requested. By defaulting to manual mode, the user is in charge and further surprises that may be caused by "set file type" being ignored can be avoided.

It should be noted that the retiring head of the ckermit project, Frank da Cruz, was reported to have said:

"The default transfer mode is "auto", where it uses an time-tested algorithm to look at the contents of each file to see if it contains only text characters, in which case the xfer mode is text, otherwise it's binary. Text is sent as ascii lines terminated by CRLF, and all other Kermits understand this and convert to their own text format, e.g. Unix (CR instead of CRLF), IBM OS/MVT (which also converts from ASCII to EBCDIC (did I forget how to spell that?)); this scheme has worked smoothly for decades and changing it would be nasty surprise for most users, would break countless scripts and init files, etc. It's especially important to keep this behavior when sending a group of files where some are binary and others are text.
Kermit automatically switches mode for each file. (I'm sure you know all this, so I'm just being careful.)"

via https://github.com/KermitProject/ckermit/pull/15#issuecomment-3560129029

While I appreciate Frank's perspective, I believe we are in a different era now than when ckermit was initially written. Performing those changes, silently, by default, is interpreted as corruption by modern systems even though, in a different era, it was helpful. Cryptographic signatures will fail to verify, hashes will change, etc. Yes, even if it's a .txt or .c file or whatever, changing the file size and the bytes that make it up will be interpreted as corruption by lots of systems. A property where files that are transferred by sftp, zip, tar, 7z, etc. have one content, and those sent by kermit have another, is also strongly undesirable for systems that do deduplication (backup systems and filesystems like btrfs and zfs) and those that do content-based hardlinking.

My point here is that even if it contains only text characters, changing the content of the file is a big deal on modern systems and with modern security systems.

I would absolutely think this change is something to be highlighted in the release notes. Fortunately it is very easy to work around with a single line consisting of SET TRANSFER MODE AUTO in .kermrc or other scripts.

The fact that even I did not realize that SET FILE TYPE BINARY was insufficient to prevent corruption even after years of using kermit is telling.

Additionally, it looks like behavior around this has changed in the past. https://www.kermitproject.org/faq-c-bix.html says "Both Kermit and ftp transfer files in text mode by default." That's not the case in the current Kermit 10 betas; I don't know when it changed. That page also says "You can tell Kermit to skip all conversions and transfer the file literally, as-is, with the command SET FILE TYPE BINARY", which is not correct because the transfer mode can override that by default.
This has clearly been a source of confusion for many people for many years.

I confirmed just now that, using the latest Kermit 95 to the latest ckermit on Linux, sending a text file with a .txt extension -- even with the file type set to binary -- produces output that has fewer bytes on Linux than it did on Windows.

I myself have appreciated the text transfer mode and of course this doesn't take it away. My HP-48GX calculator did something wildly different when in text vs. binary mode (basically, sending the source code of a program vs. bytecode of it) and of course sometimes the automatic CRLF conversion is helpful. It shouldn't go away! Just that the defaults should be suitable for a modern environment, recognizing that the passage of time has changed the default-conversion approach from helpful to corruption.

I also observed that with the default settings, a file saved in UTF-16 on Windows had its BOM stripped and was converted to an 8-bit encoding (unclear if it was UTF-8 or latin-1 or what, since it didn't use any non-ASCII characters) on Linux. When transferred back to Windows, it was not converted back to UTF-16. The round-trip, therefore, was also lossy. SET TRANSFER MODE MANUAL prevents this issue as well.

Let me back up a moment to the big picture. To assume auto transfer mode enabled by default is a good idea, we have to believe all of these propositions:

1. The primary purpose of sending files to a different platform is to interact with them on that platform

2. The identity of the sending and receiving platforms is sufficient to determine the proper line-ending encodings

3. Interaction with text files from a different platform requires assistance in transforming them

4. The file transfer tool is the appropriate place for this, and default-on does not violate the principle of least surprise

5. The benefit of these outweighs the harm

None of these are clearly true, and most of them are clearly false.

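The byte-level effects reported above (a smaller file after transfer, changed hashes, and a lossy UTF-16 round trip) can be reproduced in miniature without Kermit at all. The following sketch only imitates the conversions described in this report; the sample contents are made up, and UTF-8 is assumed for the 8-bit encoding, which is indistinguishable from latin-1 for ASCII-only text.

```python
import hashlib

# A two-line text file as stored on Windows (CRLF line endings).
windows_text = b"line one\r\nline two\r\n"
# What a text-mode transfer would store on a Unix receiver (LF only).
unix_text = windows_text.replace(b"\r\n", b"\n")

size_changed = len(windows_text) != len(unix_text)
hash_changed = (
    hashlib.sha256(windows_text).digest() != hashlib.sha256(unix_text).digest()
)
print("size changed:", size_changed)
print("hash changed:", hash_changed)

# A UTF-16 file with a BOM, as a Windows editor might save it.
utf16_original = "\ufeffhello\r\n".encode("utf-16-le")
# Imitate the observed conversion: BOM stripped, 8-bit encoding, LF endings.
on_linux = utf16_original.decode("utf-16").replace("\r\n", "\n").encode("utf-8")
# Sending it back restores CRLF but not the UTF-16 encoding.
back_on_windows = on_linux.replace(b"\n", b"\r\n")
print("round trip lossless:", back_on_windows == utf16_original)
```

Any tool that compares sizes, hashes, or signatures sees the converted copy as a different file, which is exactly how such tools are meant to behave.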
THE PRIMARY PURPOSE OF SENDING FILES

In many cases, the primary purpose of sending text files between platforms is not to interact with them on the target. For instance, if you are running Windows and have mounted a network share, that network share might be hosted on Windows or Linux (samba), and the files may never ever be accessed except on Windows clients.

Likewise if ckermit is used to send text files to a file server, or a backup server, or whatever, it is quite possible -- and, these days, probably even likely -- that the intent is not to interact with them on that system, but simply to store them on that system. For instance, I transfer files to hosting directories for web, Gopher, and Gemini on a server. I never look at them on that server itself; they are only viewed by clients (which may be of many platforms).

Most transfer and synchronization software tries to be platform-independent and preserve your content regardless of the client and server platforms.

DETERMINING LINE-ENDING AND CHARACTER SET ENCODINGS

ckermit currently appears to assess what line endings to use based on what platforms are at the sending and receiving end. This is incorrect, because any given system may contain files that originated on any other given system, and moreover may contain files intended for any other given system, in any combination.

My file server contains text files originating on DOS, modern Windows, MacOS, and *nix systems. It would be an impossibility for ckermit to correctly determine what encoding to use for a file placed in a given directory. It may be able to guess the encoding of a file by examining it, but even that gets dicey with character sets.

INTERACTION WITH TEXT FILES REQUIRES ASSISTANCE

Every modern editor for text files, from vim on up, supports autodetection and preservation of line endings. Granted, cat on Unix may do the wrong thing with a file from Windows or Mac, but these issues are both routine and easily solved these days.

THE APPROPRIATE PLACE FOR THIS, AND THE PRINCIPLE OF LEAST SURPRISE

I submit that the transfer tool is the wrong place for this (by default). Nobody expects that these days. Here is a partial list of software commonly used to exchange files that does not perform this translation:

  rsync
  x/y/zmodem
  UUCP
  NNCP
  scp
  sftp
  SMB/samba/CIFS
  NFS
  Dropbox
  Syncthing
  Nextcloud
  Google Drive
  Amazon S3
  tar
  dar
  xz
  rdiff
  unzip (has optional text file conversion, off by default)
  curl, wget, other http clients (curl supports ASCII mode for FTP but it defaults off)
  gzip, bzip2, zstd, etc.

In fact, the only other one I can think of that does this by default in some circumstances is some very old FTP clients. Modern ones like lftp and ncftp use binary mode by default.

So this /definitely/ violates the principle of least surprise.

Tools like dos2unix are widely understood to perform the conversion for those cases where it's needed, but it rarely is anymore.

THE BENEFIT VS. THE HARM

The harm is significant. It will cause severe breakage to:

  content-addressable (hashing) systems, such as git-annex
  Cryptographic signature verification tools
  Tools that create or apply binary deltas (rdiff, xdelta, dar)
  Some version control systems
  rsync (will cause it to perceive a file to be different and generally require a full retransmit since the typical line size is less than the typical rsync block size)
  syncthing
  many other syncing tools may perceive a conflict

There is utility to this conversion, especially for very old systems or ones that don't have what amounts to the typical notion of a file or one of the big three line-ending approaches (Microsoft, Mac, and Unix). But I don't see much utility, and a significant amount of harm, to it being enabled by default. As someone who's had to get data off an AS/400, which neither uses ASCII nor has a standard notion of a file, I get it.
But I'd rather request that mode explicitly than have an algorithm decide it for me -- especially when that algorithm may change between releases.

As for an extension map, I think that is a losing battle. If that is truly the desire, libmagic is probably a better bet (it's been around for a long time but may not be on all platforms).

This is not binding on Debian, but as a side note, the long-time upstream Kermit maintainer is retiring from the project and there may be a need for me to step up and maintain kermit in a more direct way. One former Kermit programmer, from some decades back, has registered a ckermit project on GitHub but has been hostile to most suggestions, including this one. His repos were dormant for years before I started asking about the future of Kermit after Frank's retirement, and have gone dormant again after I decided his bikeshedding was not worth the time. However, for completeness, I note that I initially reported this issue at https://github.com/KermitProject/ckermit/pull/15 and you can find Jeffrey's comments there as well.

As ckermit maintainer in Debian, my conscience is troubled with the idea of allowing software to continue to exist with known data corruption issues in the default configuration. Therefore I intend to patch this in Debian, especially since any compatibility issues are likely minimal and easily addressed.

-- System Information:
Debian Release: 13.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.12.57+deb13-amd64 (SMP w/16 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages ckermit depends on:
ii  debconf [debconf-2.0]  1.5.91
ii  libc6                  2.41-12
ii  libncurses6            6.5+20250216-2
ii  libpam0g               1.7.0-5
ii  libssl3t64             3.5.4-1~deb13u1
ii  libtinfo6              6.5+20250216-2

Versions of packages ckermit recommends:
ii  openssh-client [ssh-client]  1:10.0p1-7

Versions of packages ckermit suggests:
pn  openbsd-inetd | inet-superserver  <none>

-- debconf information excluded

commit 3b4af6b057ca1f1f17e5b262cf76b7f9b1394779
Author: John Goerzen <[email protected]>
Date:   Tue Nov 18 01:36:49 2025 +0000

    Disable auto transfer mode

    Prior to this change, "show file" shows, among other things:

      Transfer mode: automatic
      File patterns: automatic (SHOW PATTERNS for list)
      Default file type: binary

    With this change:

      Transfer mode: manual
      File patterns: automatic (but disabled by TRANSFER-MODE MANUAL)
      File type: binary

    The modern assumption is a byte-accurate transfer of files. We have had a proliferation of file types, extensions, and complicating circumstances since the earlier days of Kermit.

    By changing this default, we disable the heuristic for attempting to guess the type of files, and convert the existing binary default into a binary setting.

    This can always be changed by the user, but the idea is to not violate the principle of least surprise. If the user asks to transfer a file, we assume the user wants an exact transfer of the file unless stated otherwise.

    Additionally, some platforms (eg, HP48 calculators) have wildly different behavior depending on whether a text or binary transfer is requested. By defaulting to manual mode, the user is in charge and further surprises that may be caused by "set file type" being ignored can be avoided.

diff --git a/ckcftp.c b/ckcftp.c
index 7780cdd..f4f813f 100644
--- a/ckcftp.c
+++ b/ckcftp.c
@@ -951,7 +951,7 @@ int ftp_log = 1; /* FTP Auto-login */
 int sav_log = -1;
 int ftp_action = 0; /* FTP action from command line */
 int ftp_dates = 1; /* Set file dates from server */
-int ftp_xfermode = XMODE_A; /* FTP-specific transfer mode */
+int ftp_xfermode = XMODE_M; /* FTP-specific transfer mode */
 
 char ftp_reply_str[FTP_BUFSIZ] = ""; /* Last line of previous reply */
 char ftp_srvtyp[SRVNAMLEN] = { NUL, NUL }; /* Server's system type */
diff --git a/ckcmai.c b/ckcmai.c
index a84b9cb..8c08087 100644
--- a/ckcmai.c
+++ b/ckcmai.c
@@ -1408,7 +1408,7 @@ int deblog = 0, /* Debug log is open */
     cursor_save = -1, /* Cursor state */
 #endif /* OS2 */
 
-    xfermode = XMODE_A, /* Transfer mode, manual or auto */
+    xfermode = XMODE_M, /* Transfer mode, manual or auto */
     xfiletype = -1, /* Transfer only text (or binary) */
     recursive = 0, /* Recursive directory traversal */
     nolinks = 2, /* Don't follow symbolic links */

