Hi all,

In order to make our FilePermission checks work properly it is
necessary to canonicalize filenames. Classpath's canonicalizer
does not handle symbolic links at all; GCJ does, but in an
all-or-nothing way and using a function that returns different
results on different systems.
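
To illustrate why ignoring symbolic links matters, here is a minimal sketch (not part of the canonicalizer itself). It assumes a writable /tmp, and the helper names same_file and dotdot_follows_link are made up for the illustration. It builds a small directory tree with one link in it and asks the kernel where "a/link/.." really leads:

```c
#define _XOPEN_SOURCE 700   /* for mkdtemp and symlink */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if both paths name the same file (same device and inode),
   0 if they name different files, -1 if either cannot be stat'ed.  */
static int
same_file(const char *a, const char *b)
{
  struct stat sa, sb;

  if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
    return -1;
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Build BASE/a, BASE/b/c and the link BASE/a/link -> BASE/b/c in a
   fresh temporary directory, then check where BASE/a/link/.. leads.
   Returns 1 if it is BASE/b (the link was followed first), 0 if it
   is BASE/a (what purely lexical collapsing assumes), -1 on error.  */
static int
dotdot_follows_link(void)
{
  char base[] = "/tmp/canon-demo-XXXXXX";
  char a[64], b[64], c[64], lnk[64], probe[64];
  int result;

  if (mkdtemp(base) == NULL)
    return -1;
  snprintf(a, sizeof a, "%s/a", base);
  snprintf(b, sizeof b, "%s/b", base);
  snprintf(c, sizeof c, "%s/b/c", base);
  snprintf(lnk, sizeof lnk, "%s/a/link", base);
  snprintf(probe, sizeof probe, "%s/a/link/..", base);

  if (mkdir(a, 0700) != 0 || mkdir(b, 0700) != 0 || mkdir(c, 0700) != 0
      || symlink(c, lnk) != 0)
    result = -1;
  else if (same_file(probe, b) == 1)
    result = 1;
  else if (same_file(probe, a) == 1)
    result = 0;
  else
    result = -1;

  /* Clean up; failures here are ignored in this sketch.  */
  unlink(lnk);
  rmdir(c);
  rmdir(b);
  rmdir(a);
  rmdir(base);
  return result;
}
```

On a POSIX system I would expect dotdot_follows_link() to return 1: the kernel resolves the link before applying "..", so a canonicalizer that collapses "link/.." textually ends up naming a different directory from the one that actually gets accessed.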

I've written a POSIX path canonicalizer in C that mimics (and in
at least one case improves upon) the behaviour of a proprietary
JVM, but I don't know enough about how Classpath builds C stuff
to be able to see how to build it. It needs to be invoked from (or
just replace) gnu.java.io.PlatformHelper.toCanonicalForm().

GCJ has separate implementations for POSIX and Windows, and I'd
like to do the same for Classpath. This code is about as critical
as it gets from a security standpoint, so it's vital it's easy to
understand. It's complicated enough already without cluttering it
with stuff to deal with different separators and drive letters.

For non-POSIX systems we could either fall back to the current
implementation or take GCJ's.

All that is a long-winded way of saying "how do I build this, what
file should I put it in, and how do I make it so that stuff is built
differently on POSIX and non-POSIX?"
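
For the last part, the split could look something like the following at the source level. This is only a rough sketch under assumptions: it tests the compiler-predefined _WIN32 macro and treats everything else as POSIX, and every name in it (USE_POSIX_CANONICALIZER, copy_path, canonicalize_posix, canonicalize_fallback, canonicalizer_name, to_canonical_form) is hypothetical. It says nothing about how Classpath's build actually selects files, and a real build may well make this decision at configure time instead.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption: _WIN32 is predefined by the compiler on Windows, and
   every other target is treated as POSIX.  This is a simplification.  */
#if defined(_WIN32)
# define USE_POSIX_CANONICALIZER 0
#else
# define USE_POSIX_CANONICALIZER 1
#endif

/* Copy PATH into freshly allocated memory; NULL on failure.  */
static char *
copy_path(const char *path)
{
  size_t n = strlen(path) + 1;
  char *copy = malloc(n);

  if (copy != NULL)
    memcpy(copy, path, n);
  return copy;
}

#if USE_POSIX_CANONICALIZER
/* Placeholder: the symlink-aware POSIX code would go here.  */
static char *
canonicalize_posix(const char *path)
{
  return copy_path(path);
}
#else
/* Placeholder: the current implementation, or GCJ's, would go here.  */
static char *
canonicalize_fallback(const char *path)
{
  return copy_path(path);
}
#endif

/* Name of the implementation that was compiled in.  */
static const char *
canonicalizer_name(void)
{
  return USE_POSIX_CANONICALIZER ? "posix" : "fallback";
}

/* Single entry point, so callers never see the platform split.  */
static char *
to_canonical_form(const char *path)
{
#if USE_POSIX_CANONICALIZER
  return canonicalize_posix(path);
#else
  return canonicalize_fallback(path);
#endif
}
```

The point of the single entry point is that the POSIX code stays free of separator and drive-letter handling, while the caller does not need to know which variant it got.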

Cheers,
Gary

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#define MAXPATHLEN 256 /* XXX Get this from somewhere */
char *
getCanonicalPath(const char *path)
{
  char src[MAXPATHLEN], dst[MAXPATHLEN];
  int srci, dsti, dsti_save;
  int len, tmpi;
  int fschecks = 1;
  struct stat sb;

  /* XXX Presumably the argument to this function will be a Java
     String, so this bit will be replaced by some call to extract
     its UTF-8 representation. */
  len = strlen(path);
  if (len >= MAXPATHLEN)
    return NULL; /* XXX throw IOException */
  strcpy(src, path);

  /* It is the caller's responsibility to ensure the path is absolute. */
  if (len == 0 || path[0] != '/')
    return NULL; /* XXX throw RuntimeException */

  dst[0] = '/';
  dst[1] = '\0';
  dsti = 1;
  srci = 1;
  while (src[srci] != '\0')
    {
      /* Skip slashes. */
      while (src[srci] == '/')
        srci++;
      tmpi = srci;
      /* Find next slash. */
      while (src[srci] != '/' && src[srci] != '\0')
        srci++;
      if (srci == tmpi)
        /* We hit the end. */
        break;
      len = srci - tmpi;

      /* Handle "." and "..". */
      if (len == 1 && src[tmpi] == '.')
        continue;
      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
        {
          if (dsti == 1)
            /* Unlike other JVMs we do not rewind past the root
               directory. I can't see any legitimate reason why you
               would want this, yet chopping off pieces of path seems
               like a sure-fire way to introduce vulnerabilities. */
            return NULL; /* XXX throw IOException */
          while (dsti > 1 && dst[dsti - 1] != '/')
            dsti--;
          if (dsti != 1)
            dsti--;
          /* Reenable filesystem checking if disabled: we might have
             reversed over whatever caused the problem before. At
             least one proprietary JVM has inconsistencies because it
             does not do this. */
          fschecks = 1;
          continue;
        }

      /* Handle real path components. */
      if (dsti + len + 1 >= MAXPATHLEN)
        return NULL; /* XXX throw IOException */
      dsti_save = dsti;
      if (dsti > 1)
        dst[dsti++] = '/';
      strncpy(&dst[dsti], &src[tmpi], len);
      dsti += len;
      if (fschecks == 0)
        continue;
      dst[dsti] = '\0';
      if (lstat(dst, &sb) == 0)
        {
          if (S_ISLNK(sb.st_mode))
            {
              char tmp[MAXPATHLEN];

              /* XXX Nothing bounds the number of links followed, so
                 a link that points to itself loops forever here. */
              tmpi = readlink(dst, tmp, MAXPATHLEN);
              if (tmpi < 1 || tmpi == MAXPATHLEN)
                return NULL; /* XXX throw IOException */
              /* Prepend the link's path to src. */
              if (tmpi + strlen(&src[srci]) >= MAXPATHLEN)
                return NULL; /* XXX throw IOException */
              while (src[srci] != '\0')
                tmp[tmpi++] = src[srci++];
              tmp[tmpi] = '\0';
              strcpy(src, tmp);
              srci = 0;
              /* Either replace or append to dst depending on whether
                 the link is absolute or relative. */
              dsti = tmp[0] == '/' ? 1 : dsti_save;
            }
        }
      else
        {
          /* Something doesn't exist, or we don't have permission to
             read it, or a previous path component is not a directory,
             or a symlink is looped. Whatever, we can't check the
             filesystem any more. */
          fschecks = 0;
        }
    }
  dst[dsti] = '\0';

  /* XXX Presumably this bit will be replaced by some call to
     convert the array of UTF-8 bytes into a Java String. */
  return strdup(dst);
}
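
int
main(int argc, char *argv[])
{
  int i;

  for (i = 1; i < argc; i++)
    {
      /* getCanonicalPath returns NULL on failure, and passing NULL
         to %s is undefined behaviour. */
      char *canon = getCanonicalPath(argv[i]);

      printf("%s -> %s\n", argv[i], canon != NULL ? canon : "(error)");
      free(canon);
    }
  return 0;
}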
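
One thing the function above does not do is limit how many links it follows. A possible guard is sketched below as a separate helper rather than a patch. It is not a canonicalizer: it only chases the final component of a path, and the names follow_links, LINK_BUF and MAX_LINKS, as well as the limit of 32, are assumptions made for the illustration.

```c
#define _XOPEN_SOURCE 700   /* for lstat and readlink */
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define LINK_BUF 256    /* hypothetical, mirrors MAXPATHLEN above */
#define MAX_LINKS 32    /* hypothetical limit on links followed */

/* Follow PATH while it names a symbolic link and store the first
   non-link name reached in OUT.  Returns 0 on success, -1 if a name
   does not fit, a link cannot be read, or more than MAX_LINKS links
   are chained (most likely a loop).  */
static int
follow_links(const char *path, char *out, size_t outlen)
{
  char cur[LINK_BUF], target[LINK_BUF];
  struct stat sb;
  int links = 0;

  if (strlen(path) >= sizeof cur)
    return -1;
  strcpy(cur, path);

  while (lstat(cur, &sb) == 0 && S_ISLNK(sb.st_mode))
    {
      char *slash;
      size_t dirlen;
      ssize_t n;

      if (++links > MAX_LINKS)
        return -1;

      /* readlink does not terminate the buffer, and a completely
         full buffer may mean the target was truncated.  */
      n = readlink(cur, target, sizeof target);
      if (n < 1 || (size_t) n >= sizeof target)
        return -1;
      target[n] = '\0';

      /* An absolute target replaces CUR entirely; a relative one
         replaces only its last component.  */
      slash = strrchr(cur, '/');
      dirlen = (target[0] == '/' || slash == NULL)
               ? 0 : (size_t) (slash - cur) + 1;
      if (dirlen + (size_t) n >= sizeof cur)
        return -1;
      strcpy(cur + dirlen, target);
    }

  if (strlen(cur) >= outlen)
    return -1;
  strcpy(out, cur);
  return 0;
}
```

With a link that points to itself, follow_links gives up after MAX_LINKS rounds and returns -1 instead of spinning. The same counter idea could be applied inside getCanonicalPath by counting each successful readlink.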