Sorry it's been so long since I've written; the Air Force moved me in the past month.
I'm still interested in adding support for numerical suffixes to split.
Several people have commented that the change from alphabetic suffixes to numerical ones can be accomplished via a shell script. That's true, but my question is: why? Why should the user have to remember a long line of arcane shell commands instead of adding a single command-line flag that already exists in a sister program (csplit)?
The numerical suffix is *the* standard for forensic images (think dd files, not graphics). Originally developed for a forensic imaging program called Safeback in the late '80s, numerical suffixes are now used by every forensic examination program to identify the parts of a single image. Such programs include EnCase (http://encase.com/), iLook (http://www.ilook-forensics.org/), and Autopsy (http://www.sleuthkit.org/).
Your concern for preserving the split standard is admirable, but I don't think that this patch will break that standard. That is, files created with numerical suffixes can be read in the same manner as files created with alphabetic suffixes. Here's an example with a small text file called "bar". I split the 'bar' file using regular split and then using my modified version. Both sets of pieces can then be 'cat'ed back together to produce the same file.
$ ls -l bar
-rw-r--r-- 1 jessek users 5120 Aug 7 15:34 bar
$ split -b 500 bar normal
$ /home/jessek/coreutils-5.0/src/split -n -b 500 bar digits
$ ls normal* digits*
digits01 digits04 digits07 digits10 normalab normalae normalah normalak
digits02 digits05 digits08 digits11 normalac normalaf normalai
digits03 digits06 digits09 normalaa normalad normalag normalaj
$ cat normal* > all-normal
$ cat digits* > all-digits
$ diff all-normal all-digits
$
Thus, any set of files created with numerical suffixes is still compatible with any set of files created with alphabetic suffixes. I'm open to argument on this point, of course. :)
Somebody asked if csplit would work for our purposes. Unfortunately, we need a program that splits files based on their size, and csplit looks only at a file's content.
Below, as requested, is the formatted patch for split to allow numerical suffixes. (I had some CVS issues, but think I got it right.)
2003-08-08 Jesse Kornblum <[EMAIL PROTECTED]>
Add support for numerical suffixes in split
* src/split.c - Add support for -n
Index: patch.c
===================================================================
RCS file: /cvsroot/coreutils/coreutils/patch,v
retrieving revision x.x
diff -p -u -rx.x patch.c
--- split-orig.c Thu Aug 7 15:00:25 2003
+++ split.c Thu Aug 7 15:11:11 2003
@@ -63,6 +63,9 @@ static size_t suffix_length = DEFAULT_SU
/* Name of input file. May be "-". */
static char *infile;

+/* If non-zero, use numeric suffixes instead of characters */
+static int suffix_type;
+
/* Descriptor on which input file is open. */
static int input_desc;
@@ -78,6 +81,7 @@ static struct option const longopts[] =
{"bytes", required_argument, NULL, 'b'},
{"lines", required_argument, NULL, 'l'},
{"line-bytes", required_argument, NULL, 'C'},
+ {"numbers", no_argument, &suffix_type, 'n'},
{"suffix-length", required_argument, NULL, 'a'},
{"verbose", no_argument, &verbose, 0},
{GETOPT_HELP_OPTION_DECL},
@@ -109,6 +113,7 @@ Mandatory arguments to long options are
-a, --suffix-length=N use suffixes of length N (default %d)\n\
-b, --bytes=SIZE put SIZE bytes per output file\n\
-C, --line-bytes=SIZE put at most SIZE bytes of lines per output file\n\
+ -n, --numbers use digits for suffixes instead of letters\n\
-l, --lines=NUMBER put NUMBER lines per output file\n\
"), DEFAULT_SUFFIX_LENGTH);
fputs (_("\
@@ -143,7 +148,12 @@ next_file_name (void)
outfile = xmalloc (outfile_length + 1);
outfile_mid = outfile + outbase_length;
memcpy (outfile, outbase, outbase_length);
- memset (outfile_mid, 'a', suffix_length);
+ if (!suffix_type)
+ memset (outfile_mid, 'a', suffix_length);
+ else {
+ memset (outfile_mid, '0', suffix_length);
+ outfile_mid[suffix_length - 1] = '1';
+ }
outfile[outfile_length] = 0;
#if ! _POSIX_NO_TRUNC && HAVE_PATHCONF && defined _PC_NAME_MAX
@@ -165,10 +175,17 @@ next_file_name (void)
/* Increment the suffix in place, if possible. */
char *p;
- for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = 'a')
- if (p[-1]++ != 'z')
- return;
- error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+ if (!suffix_type) {
+ for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = 'a')
+ if (p[-1]++ != 'z')
+ return;
+ error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+ } else {
+ for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = '0')
+ if (p[-1]++ != '9')
+ return;
+ error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+ }
}
}
@@ -376,7 +393,7 @@ main (int argc, char **argv)
int this_optind = optind ? optind : 1;
long int tmp_long;

- c = getopt_long (argc, argv, "0123456789C:a:b:l:", longopts, NULL);
+ c = getopt_long (argc, argv, "0123456789nC:a:b:l:", longopts, NULL);
if (c == -1)
break;
@@ -385,6 +402,10 @@ main (int argc, char **argv)
case 0:
break;

+ case 'n':
+ suffix_type = 1;
+ break;
+
case 'a':
{
unsigned long tmp;

--
Jesse Kornblum, Capt, USAF
United States Naval Academy
Chauvenet Room 329
572 Holloway Rd. Stop 9F
Annapolis, MD 21402-5002
Comm 410-293-6821 DSN 281-6821
Fax 410-293-2686 Fax DSN 281-2686
e-mail: [EMAIL PROTECTED]
http://www.cs.usna.edu/~kornblum/
_______________________________________________
Bug-coreutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-coreutils
