--- Begin Message ---
Package: ckermit
Version: 416~beta12-1
Severity: grave
Tags: patch upstream
Justification: causes non-serious data loss
ckermit has a system that is designed to automatically convert files when
transferring between different platforms.
Unfortunately, this system is enabled by default and it most often does the
wrong thing in a modern context.
There are two settings that impact this:
SET FILE TYPE BINARY
SET TRANSFER MODE MANUAL
Fortunately, the file type already defaults to binary. However, according to the
help text:
When TRANSFER MODE is AUTOMATIC (as it is by default), various automatic
methods (depending on the platform) are used to determine whether a file
is transferred in text or binary mode; these methods (which might include
content scan (see SET FILE SCAN below), filename pattern matching (SET FILE
PATTERNS), client/server "kindred-spirit" recognition, or source file
record format) supersede the FILE TYPE setting but can, themselves, be
superseded by including a /BINARY or /TEXT switch in the SEND, GET, or
RECEIVE command.
When TRANSFER MODE is MANUAL, the automatic methods are skipped for sending
files; the FILE TYPE setting is used instead, which can be superseded on
a per-command basis with a /TEXT or /BINARY switch.
Unfortunately, transfer mode is indeed automatic by default.
When sending between Unix platforms, the Kermits recognize that they are similar
and always use binary mode (this uses a "sysid" system).
However, when sending between Unix and other platforms (Windows, etc.), the
Kermits recognize they are different and the automatic methods are used. The
default file type is binary, but the automatic methods can override it.
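To make the precedence concrete, here is a minimal C sketch of the rules as the help text describes them. It is not ckermit's code: the enum names and effective_type() are hypothetical, and the kindred and scan_says_text inputs merely stand in for whatever the automatic methods actually compute.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not ckermit identifiers. */
enum xfer_mode { MODE_AUTOMATIC, MODE_MANUAL };
enum file_type { TYPE_TEXT, TYPE_BINARY };
enum cmd_switch { SWITCH_NONE, SWITCH_TEXT, SWITCH_BINARY };

/*
 * Decide how one file is sent, following the precedence in the help text:
 * a /TEXT or /BINARY switch wins; otherwise AUTOMATIC mode lets the
 * automatic methods override FILE TYPE; MANUAL mode uses FILE TYPE as set.
 */
enum file_type effective_type(enum xfer_mode mode, enum file_type file_type,
                              enum cmd_switch sw, bool kindred,
                              bool scan_says_text)
{
    if (sw == SWITCH_TEXT)
        return TYPE_TEXT;
    if (sw == SWITCH_BINARY)
        return TYPE_BINARY;
    if (mode == MODE_MANUAL)
        return file_type;
    /* AUTOMATIC: similar platforms (e.g. Unix to Unix) always use binary. */
    if (kindred)
        return TYPE_BINARY;
    /* Otherwise the heuristics decide, regardless of FILE TYPE. */
    return scan_says_text ? TYPE_TEXT : TYPE_BINARY;
}
```

In this model, effective_type(MODE_AUTOMATIC, TYPE_BINARY, SWITCH_NONE, false, true) yields TYPE_TEXT: the file type is binary, yet the file goes out as text. With MODE_MANUAL the same call yields TYPE_BINARY.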
The enclosed patch will set the transfer mode to manual by default. There is a
small chance this may change behavior for certain legacy scripts; they can
always add "set transfer mode automatic" to the start. This automatic behavior
has not been static throughout the life of ckermit, however, and this would not
be the first time changes have been made.
Prior to this change, "show file" shows, among other things:
Transfer mode: automatic
File patterns: automatic (SHOW PATTERNS for list)
Default file type: binary
With this change:
Transfer mode: manual
File patterns: automatic (but disabled by TRANSFER-MODE MANUAL)
File type: binary
The modern assumption is a byte-accurate transfer of files. We have had a
proliferation of file types, extensions, and complicating circumstances since
the earlier days of Kermit.
By changing this default, we disable the heuristic for attempting to guess the
type of files, and make the existing binary default a setting that actually
takes effect.
This can always be changed by the user, but the idea is to not violate the
principle of least surprise. If the user asks to transfer a file, we assume the
user wants an exact transfer of the file unless stated otherwise.
Additionally, some platforms (e.g., HP48 calculators) have wildly different
behavior depending on whether a text or binary transfer is requested. By
defaulting to manual mode, the user is in charge and further surprises that may
be caused by "set file type" being ignored can be avoided.
It should be noted that the retiring head of the ckermit project, Frank da Cruz,
was reported to have said: "The default transfer mode is "auto", where it uses
an time-tested algorithm to look at the contents of each file to see if it
contains only text characters, in which case the xfer mode is text, otherwise
it's binary. Text is sent as ascii lines terminated by CRLF, and all other
Kermits understand this and convert to their own text format, e.g. Unix (CR
instead of CRLF), IBM OS/MVT (which also converts from ASCII to EBCDIC (did I
forget how to spell that?)); this scheme has worked smoothly for decades and
changing it would be nasty surprise for most users, would break countless
scripts and init files, etc.
It's especially important to keep this behavior when sending a group
of files where some are binary and others are text. Kermit automatically
switches mode for each file. (I'm sure you know all this, so I'm just
being careful.)" via
https://github.com/KermitProject/ckermit/pull/15#issuecomment-3560129029
While I appreciate Frank's perspective, I believe we are in a different era now
than when ckermit was initially written. Performing those changes, silently, by
default, is interpreted as corruption by modern systems even though, in a
different era, it was helpful. Cryptographic signatures will fail to verify,
hashes will change, etc.
Yes, even if it's a .txt or .c file or whatever, changing the file size and
bytes that make it up will be interpreted as corruption by lots of systems. A
property where files that are transferred by sftp, zip, tar, 7z, etc. have one
content, and those sent by kermit have another, is also strongly undesirable for
systems that do deduplication (backup systems and filesystems like btrfs and
zfs) and those that do content-based hardlinking.
My point here is that even if it contains only text characters, changing the
content of the file is a big deal on modern systems and with modern security
systems.
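As a rough illustration of why this matters to hashing and deduplicating systems, here is a small C sketch. It uses 64-bit FNV-1a purely as a stand-in for a real content hash (real systems use cryptographic hashes such as SHA-256), and same_content() is a hypothetical helper, not part of any tool named above.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a, used here only as a stand-in for a real content hash. */
uint64_t fnv1a64(const unsigned char *data, size_t len)
{
    uint64_t hash = 0xcbf29ce484222325ULL;    /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 0x100000001b3ULL;             /* FNV prime */
    }
    return hash;
}

/* Returns 1 if the two buffers would be treated as the same content. */
int same_content(const unsigned char *a, size_t alen,
                 const unsigned char *b, size_t blen)
{
    return alen == blen && fnv1a64(a, alen) == fnv1a64(b, blen);
}
```

Given the 14 bytes "hello\r\nworld\r\n" and the 12 bytes "hello\nworld\n", same_content() returns 0. To a person they are the same text; to anything that compares sizes or hashes they are two different files.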
I would absolutely think this change is something to be highlighted in the
release notes. Fortunately, it is very easy to work around with a single line
consisting of SET TRANSFER MODE AUTO in .kermrc or other scripts.
The fact that even I did not realize that SET FILE TYPE BINARY was insufficient
to prevent corruption even after years of using kermit is telling.
Additionally, it looks like behavior around this has changed in the past.
https://www.kermitproject.org/faq-c-bix.html says "Both Kermit and ftp transfer
files in text mode by default." That's not the case in the current Kermit 10
betas; I don't know when it changed.
That page also says "You can tell Kermit to skip all conversions and transfer
the file literally, as-is, with the command SET FILE TYPE BINARY", which is not
correct because the transfer mode can override that by default.
This has clearly been a source of confusion for many people for many years.
I confirmed just now that sending a text file with a .txt extension from the
latest Kermit 95 to the latest ckermit on Linux -- even with the file type set
to binary -- produces output that has fewer bytes on Linux than it did on
Windows.
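The size difference is what one would expect from line-ending conversion. The following C sketch models that effect only; crlf_to_lf() is a hypothetical name and this is not ckermit's conversion code.

```c
#include <stddef.h>

/*
 * Copy src to dst, turning each CRLF pair into a single LF.
 * dst must have room for at least len bytes. Returns the bytes written.
 * This models the kind of line-ending change a text-mode transfer makes;
 * it is not ckermit's conversion code.
 */
size_t crlf_to_lf(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
            continue;           /* drop the CR; the LF is copied next */
        dst[out++] = src[i];
    }
    return out;
}
```

A 10-byte input of "one\r\ntwo\r\n" comes out as 8 bytes: one byte is lost per line, so the received file is shorter than the file that was sent.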
I myself have appreciated the text transfer mode and of course this doesn't take
it away. My HP-48GX calculator did something wildly different when in text vs.
binary mode (basically, sending the source code of a program vs. bytecode of it)
and of course sometimes the automatic CRLF conversion is helpful. It shouldn't
go away! Just that the defaults should be suitable for a modern environment,
recognizing that the passage of time has changed the default-conversion approach
from helpful to corruption.
I also observed that with the default settings, a file saved in UTF-16 on
Windows had its BOM stripped and was converted to an 8-bit encoding (unclear if
it was UTF-8 or latin-1 or what, since it didn't use any non-ASCII characters)
on Linux.
When transferred back to Windows, it was not converted back to UTF-16. The
round-trip, therefore, was also lossy.
SET TRANSFER MODE MANUAL prevents this issue as well.
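To show why the round trip cannot be lossless, here is a minimal C sketch of the kind of narrowing I observed, assuming UTF-16LE input that contains only ASCII characters. The function name is hypothetical and this is not how ckermit implements character-set translation.

```c
#include <stddef.h>

/*
 * Narrow UTF-16LE text to one byte per character, dropping a leading BOM.
 * Handles only code units below 0x80 (plain ASCII); returns 0 on anything
 * else. dst needs room for len / 2 bytes. Returns the bytes written.
 */
size_t utf16le_ascii_to_bytes(const unsigned char *src, size_t len,
                              unsigned char *dst)
{
    size_t i = 0, out = 0;
    if (len % 2 != 0)
        return 0;
    if (len >= 2 && src[0] == 0xFF && src[1] == 0xFE)
        i = 2;                          /* skip the byte-order mark */
    for (; i < len; i += 2) {
        if (src[i + 1] != 0 || src[i] >= 0x80)
            return 0;                   /* non-ASCII: out of scope here */
        dst[out++] = src[i];
    }
    return out;
}
```

The six bytes FF FE 68 00 69 00 become the two bytes "hi". Nothing in those two bytes records that the original was UTF-16 with a BOM, so a receiver converting in the other direction has no way to know it should restore them.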
Let me back up a moment to the big picture.
To assume that enabling auto transfer mode by default is a good idea, we have to
believe all of these propositions:
1. The primary purpose of sending files to a different platform is to interact
with them on that platform
2. The identity of the sending and receiving platforms is sufficient to
determine the proper line-ending encodings
3. Interaction with text files from a different platform requires assistance in
transforming them
4. The file transfer tool is the appropriate place for this, and default-on
does not violate the principle of least surprise
5. The benefit of these outweighs the harm
None of these are clearly true, and most of them are clearly false.
THE PRIMARY PURPOSE OF SENDING FILES
In many cases, the primary purpose of sending text files between platforms is
not to interact with them on the target. For instance, if you are running
Windows and have mounted a network share, that network share might be hosted on
Windows or Linux (samba), and the files may never ever be accessed except on
Windows clients.
Likewise if ckermit is used to send text files to a file server, or a backup
server, or whatever, it is quite possible -- and, these days, probably even
likely -- that the intent is not to interact with them on that system, but
simply to store them on that system. For instance, I transfer files to a hosting
directories for web, Gopher, and Gemini on a server. I never look at that on
that server itself; they are only viewed by clients (which may be of many
platforms)
Most transfer and synchronization software tries to be platform-independent and
preserve your content regardless of the client and server platforms.
DETERMINING LINE-ENDING AND CHARACTER SET ENCODINGS
ckermit currently appears to assess what line endings to use based on what
platforms are at the sending and receiving end. This is incorrect, because any
given system may contain files that originated on any other given system, and
moreover may contain files intended for any other given system, in any
combination.
My file server contains text files originating on DOS, modern Windows, MacOS,
and *nix systems. It would be an impossibility for ckermit to correctly
determine what encoding to use for a file placed in a given directory. It may be
able to guess the encoding of a file by examining it, but even that gets dicey
with character sets.
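Guessing by examination amounts to something like the following C sketch, which simply counts each style of line ending in a buffer. The struct and function names are hypothetical, and this is not ckermit's file scan.

```c
#include <stddef.h>

struct eol_counts {
    size_t crlf;   /* CR LF pairs (DOS, Windows) */
    size_t lf;     /* lone LF (Unix, modern macOS) */
    size_t cr;     /* lone CR (classic Mac OS) */
};

/* Count each line-ending style found in buf. */
struct eol_counts count_eols(const unsigned char *buf, size_t len)
{
    struct eol_counts c = {0, 0, 0};
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                c.crlf++;
                i++;            /* consume the LF of the pair */
            } else {
                c.cr++;
            }
        } else if (buf[i] == '\n') {
            c.lf++;
        }
    }
    return c;
}
```

Even this only describes what the file contains, not what its eventual reader needs. A file with mixed counts, or one stored on a Unix server for Windows clients, has no single right answer.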
INTERACTION WITH TEXT FILES REQUIRES ASSISTANCE
Every modern editor for text files, from vim on up, supports autodetection and
preservation of line endings. Granted, cat on Unix may do the wrong thing with a
file from Windows or Mac, but these issues are both routine and easily solved
these days.
THE APPROPRIATE PLACE FOR THIS, AND THE PRINCIPLE OF LEAST SURPRISE
I submit that the transfer tool is the wrong place for this (by default). Nobody
expects that these days. Here is a partial list of software commonly used to
exchange files that does not perform this translation:
rsync
x/y/zmodem
UUCP
NNCP
scp
sftp
SMB/samba/CIFS
NFS
Dropbox
Syncthing
Nextcloud
Google Drive
Amazon S3
tar
dar
xz
rdiff
unzip (has optional text file conversion, off by default)
curl, wget, other http clients (curl supports ASCII mode for FTP but it
defaults off)
gzip, bzip2, zstd, etc.
In fact, the only others I can think of that do this by default in some
circumstances are some very old FTP clients. Modern ones like lftp and ncftp use
binary mode by default.
So this /definitely/ violates the principle of least surprise. Tools like
dos2unix are widely understood to perform the conversion for those cases where
it's needed, but it rarely is anymore.
THE BENEFIT VS. THE HARM
The harm is significant.
It will cause severe breakage to:
content-addressable (hashing) systems, such as git-annex
Cryptographic signature verification tools
Tools that create or apply binary deltas (rdiff, xdelta, dar)
Some version control systems
rsync (will cause it to perceive a file to be different and generally
require a full retransmit since the typical line size is less than the
typical rsync block size)
syncthing
many other syncing tools may perceive a conflict
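The rsync point can be sketched with a simplified model in C. Assume the old copy uses CRLF, the new copy uses LF only, and the file contains no lone CRs. Then a fixed-size block of the old copy can be reused only if it contains no CR byte at all. The function below is hypothetical and models only that condition, not rsync's actual rolling-checksum algorithm; it ignores a trailing partial block.

```c
#include <stddef.h>

/*
 * Count the whole blocks of buf that contain no CR byte. Under the
 * assumptions stated above, only those blocks could still be matched
 * byte-for-byte against an LF-only copy of the same text.
 */
size_t blocks_without_cr(const unsigned char *buf, size_t len, size_t block)
{
    size_t clean = 0;
    if (block == 0)
        return 0;
    for (size_t start = 0; start + block <= len; start += block) {
        size_t i;
        for (i = 0; i < block; i++) {
            if (buf[start + i] == '\r')
                break;
        }
        if (i == block)
            clean++;
    }
    return clean;
}
```

For the 16 bytes "aaaa\r\nbbbbbbbb\r\n", a block size of 4 leaves 2 reusable blocks, while a block size of 8 leaves none. Once the block size exceeds the line length, every block contains a line ending and nothing can be reused.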
There is utility to this conversion, especially for very old systems or ones
that don't have what amounts to the typical notion of a file or one of the big
three line-ending approaches (Microsoft, Mac, and Unix). But I don't see much
utility, and a significant amount of harm, to it being enabled by default. As
someone that's had to get data off an AS/400, which neither uses ASCII nor has a
standard notion of a file, I get it. But I'd rather request that mode explicitly
than have an algorithm decide it for me -- especially when that algorithm may
change between releases.
As for an extension map, I think that is a losing battle. If that is truly the
desire, libmagic is probably a better bet (it's been around for a long time but
may not be on all platforms).
This is not binding on Debian, but as a side note, the long-time upstream Kermit
maintainer is retiring from the project and there may be a need for me to step
up and maintain kermit in a more direct way. One former Kermit programmer, from
some decades back, has registered a ckermit project on Github but has been
hostile to most suggestions, including this one. His repos were dormant for
years before I started asking about the future of Kermit after Frank's
retirement, and have gone dormant again after I decided his bikeshedding was not
worth the time. However, for completeness, I note that I initially reported
this issue at https://github.com/KermitProject/ckermit/pull/15 and you can find
Jeffrey's comments there as well.
As ckermit maintainer in Debian, my conscience is troubled with the idea of
allowing software to continue to exist with known data corruption issues in the
default configuration. Therefore I intend to patch this in Debian, especially
since any compatibility issues are likely minimal and easily addressed.
-- System Information:
Debian Release: 13.2
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 6.12.57+deb13-amd64 (SMP w/16 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE,
TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages ckermit depends on:
ii debconf [debconf-2.0] 1.5.91
ii libc6 2.41-12
ii libncurses6 6.5+20250216-2
ii libpam0g 1.7.0-5
ii libssl3t64 3.5.4-1~deb13u1
ii libtinfo6 6.5+20250216-2
Versions of packages ckermit recommends:
ii openssh-client [ssh-client] 1:10.0p1-7
Versions of packages ckermit suggests:
pn openbsd-inetd | inet-superserver <none>
-- debconf information excluded
commit 3b4af6b057ca1f1f17e5b262cf76b7f9b1394779
Author: John Goerzen <[email protected]>
Date: Tue Nov 18 01:36:49 2025 +0000
Disable auto transfer mode
Prior to this change, "show file" shows, among other things:
Transfer mode: automatic
File patterns: automatic (SHOW PATTERNS for list)
Default file type: binary
With this change:
Transfer mode: manual
File patterns: automatic (but disabled by TRANSFER-MODE MANUAL)
File type: binary
The modern assumption is a byte-accurate transfer of files. We have had
a proliferation of file types, extensions, and complicating
circumstances since the earlier days of Kermit.
By changing this default, we disable the heuristic for attempting to
guess the type of files, and convert the existing binary default into a
binary setting.
This can always be changed by the user, but the idea is to not violate
the principle of least surprise. If the user asks to transfer a file,
we assume the user wants an exact transfer of the file unless stated
otherwise.
Additionally, some platforms (eg, HP48 calculators) have wildly
different behavior depending on whether a text or binary transfer is
requested. By defaulting to manual mode, the user is in charge and
further surprises that may be caused by "set file type" being ignored
can be avoided.
diff --git a/ckcftp.c b/ckcftp.c
index 7780cdd..f4f813f 100644
--- a/ckcftp.c
+++ b/ckcftp.c
@@ -951,7 +951,7 @@ int ftp_log = 1; /* FTP Auto-login */
int sav_log = -1;
int ftp_action = 0; /* FTP action from command line */
int ftp_dates = 1; /* Set file dates from server */
-int ftp_xfermode = XMODE_A; /* FTP-specific transfer mode */
+int ftp_xfermode = XMODE_M; /* FTP-specific transfer mode */
char ftp_reply_str[FTP_BUFSIZ] = ""; /* Last line of previous reply */
char ftp_srvtyp[SRVNAMLEN] = { NUL, NUL }; /* Server's system type */
diff --git a/ckcmai.c b/ckcmai.c
index a84b9cb..8c08087 100644
--- a/ckcmai.c
+++ b/ckcmai.c
@@ -1408,7 +1408,7 @@ int deblog = 0, /* Debug log is open */
cursor_save = -1, /* Cursor state */
#endif /* OS2 */
- xfermode = XMODE_A, /* Transfer mode, manual or auto */
+ xfermode = XMODE_M, /* Transfer mode, manual or auto */
xfiletype = -1, /* Transfer only text (or binary) */
recursive = 0, /* Recursive directory traversal */
nolinks = 2, /* Don't follow symbolic links */
--- End Message ---