Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-08-04 Thread David Roundy
On Sat, Jul 29, 2006 at 05:35:30PM -0700, Andrew Pimlott wrote:
 On Sat, Jul 29, 2006 at 02:59:06AM +0200, Udo Stenzel wrote:
  Andrew Pimlott wrote:
   Second, foo is just as good a directory
   as foo/ to the system
  
  ...unless you have both (think Reiser4) or you want to create the file
  (I think, but I'm not sure).  However, what's the point in being
  ambiguous when we can be explicit?  Sometimes there is a difference,
  libraries and tools shouldn't gloss over that without consideration.
 
 As I said, it's one of those line-drawing exercises.  But your points
 are well taken, and maybe the trailing delimiter should be part of the
 model.  (My criterion has been whether any filesystem operations require
 the trailing delimiter.  It sounds like with reiser4fs they might.)

Actually, I just read in LWN that that part of reiser4 has been dropped.
On the other hand, it was only dropped after considerable debate, and
people using an older version of reiser4 still have the strange
file-as-directory semantics.
-- 
David Roundy
http://www.darcs.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-29 Thread Udo Stenzel
Andrew Pimlott wrote:
 On Thu, Jul 27, 2006 at 09:59:37PM +0200, Udo Stenzel wrote:
  In fact, that's consistent with the current documentation, because
  
  * getFileName "foo" == "foo"
  * getFileName "foo/" == ""
 
 I have to disagree with that.

No, you don't.  That's the current behaviour of Neil Mitchell's
System.FilePath 0.9 according to the haddockumentation.  There isn't
much point in disagreeing about observable facts, is there?

 First of all, "" is not a filename;

Most certainly it isn't.  Which is all the more reason not to like the
current design.  An empty filename just isn't the same as no filename.

 if you mean that foo/ has no filename, it makes much more sense to use
 something like a Maybe type.

It does very much.  In fact, I don't deem getFileName to be an essential
function when a simple pattern match would do the same thing.  foo/
really doesn't have a file name, as it very explicitly names a
directory.

 Second, foo is just as good a directory
 as foo/ to the system

...unless you have both (think Reiser4) or you want to create the file
(I think, but I'm not sure).  However, what's the point in being
ambiguous when we can be explicit?  Sometimes there is a difference,
libraries and tools shouldn't gloss over that without consideration.


 But if you wish to make the distinction,
 at least provide an operation that lets me force a path to be treated
 file-wise or directory-wise.

WTF?!  A path names either a directory or a file.  We might have some
operations that accept file names instead of path names.  What's there
to be treated?  Being explicit about the distinction makes any ambiguity
go away.

 Filesystems are ugly. :-)

So are microprocessors.  We can still have a nice programming language,
and we can also have a nice filesystem language.

 And it is about the slash: foo can be a directory.

No, it still isn't.  We can distinguish between Directory (but not
file, fifo, character or block special) and anything (if in doubt, not
directory), which is an essential semantic distinction and not just the
accidental presence of a slash (or backslash or colon or whatever
$EXOTIC_OS uses).

Also, parsing paths _once_ and printing them _once_ but doing everything
else by operating on their logical structure makes specifying any
intermediate operation a lot easier, if nothing else.  If this thread
shows anything, then it is that specifying path operations is harder
than expected.


Udo.
-- 
Structure is _nothing_ if it is all you got. Skeletons _spook_ people if
they try to walk around on their own. I really wonder why XML does not.
-- Erik Naggum




Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-28 Thread Udo Stenzel
Andrew Pimlott wrote:
 On Wed, Jul 26, 2006 at 05:06:41PM -0400, David Roundy wrote:
  This doesn't apply uniformly to all programs--except that we can say
  that any path with a trailing '/' is intended to be a directory, and
  if it's not, then that's an error.
 
 I thought some more about this, and I think the right way to handle this
 is on parsing and printing.

Amen.

 After all, the trailing slash has no real
 meaning for any intermediate processing you might do.

Here I beg to differ.  I'd expect:

* setFileName "foo" "bar" == "bar"
* setFileName "foo/" "bar" == "foo/bar"

In fact, that's consistent with the current documentation, because

* getFileName "foo" == "foo"
* getFileName "foo/" == ""

No matter whether I'm correct, whether my expectation is natural or
practical and whether others agree, the behaviour has to be clearly
specified and the final slash certainly isn't unimportant.

 readPath :: String -> (Path, Bool {- trailing delimiter -})
 showPath :: Path -> String
 showPathTrailingSlash :: Path -> String
 
 This is far simpler than trying to figure out what the slash means for
 every path operation.

It's also far uglier...  besides, it isn't about the slash, it is about
the difference between file and directory.


Udo.
-- 
If you cannot in the long run tell everyone what you have been doing,
your doing was worthless.
-- Erwin Schrödinger




Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-28 Thread Duncan Coutts
On Thu, 2006-07-27 at 11:07 -0700, Andrew Pimlott wrote:
 On Wed, Jul 26, 2006 at 04:02:31PM -0700, Andrew Pimlott wrote:
  I admit I don't know enough to say how the lpt1 issue should be
  handled.  Is there any Win32 call I can make that will help me avoid
  accidentally opening these magic files?  Say, if I call open with
  O_CREAT | O_EXCL?  Unfortunately, I can find very little information on
  how one should handle this issue.
 
 Thanks to a suggestion from Bulat to use c_open, I was able to test
 O_WRONLY | O_CREAT | O_EXCL on Windows.  In fact, Windows does allow
 files like nul to be opened (as many times as you like) with these
 flags, which I find dismaying.  So I still don't know the proper way to
 handle them.

You can open the file and test the file type with GetFileType. If it's
type FILE_TYPE_CHAR then it's probably not what you wanted.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/getfiletype.asp

Duncan



Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-28 Thread Andrew Pimlott
On Thu, Jul 27, 2006 at 09:59:37PM +0200, Udo Stenzel wrote:
 Andrew Pimlott wrote:
  After all, the trailing slash has no real
  meaning for any intermediate processing you might do.
 
 Here I beg to differ.  I'd expect:
 
 * setFileName "foo" "bar" == "bar"
 * setFileName "foo/" "bar" == "foo/bar"
 
 In fact, that's consistent with the current documentation, because
 
 * getFileName "foo" == "foo"
 * getFileName "foo/" == ""

I have to disagree with that.  First of all, "" is not a filename; if
you mean that foo/ has no filename, it makes much more sense to use
something like a Maybe type.  Second, foo is just as good a directory
as foo/ to the system, and they both denote the same filesystem object
(the object with the name foo in the current directory), so it doesn't
make sense to me for path operations to distinguish them.  Maybe the
second point is philosophical.  But if you wish to make the distinction,
at least provide an operation that lets me force a path to be treated
file-wise or directory-wise.

  readPath :: String -> (Path, Bool {- trailing delimiter -})
  showPath :: Path -> String
  showPathTrailingSlash :: Path -> String
  
  This is far simpler than trying to figure out what the slash means for
  every path operation.
 
 It's also far uglier...  besides, it isn't about the slash, it is about
 the difference between file and directory.

Filesystems are ugly. :-)  And it is about the slash: foo can be a
directory.

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-27 Thread Andrew Pimlott
On Wed, Jul 26, 2006 at 04:02:31PM -0700, Andrew Pimlott wrote:
 I admit I don't know enough to say how the lpt1 issue should be
 handled.  Is there any Win32 call I can make that will help me avoid
 accidentally opening these magic files?  Say, if I call open with
 O_CREAT | O_EXCL?  Unfortunately, I can find very little information on
 how one should handle this issue.

Thanks to a suggestion from Bulat to use c_open, I was able to test
O_WRONLY | O_CREAT | O_EXCL on Windows.  In fact, Windows does allow
files like nul to be opened (as many times as you like) with these
flags, which I find dismaying.  So I still don't know the proper way to
handle them.

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-27 Thread Andrew Pimlott
On Wed, Jul 26, 2006 at 05:06:41PM -0400, David Roundy wrote:
 cp(1), for example, treats paths with trailing separators differently
 from paths without.
 
 This doesn't apply uniformly to all programs--except that we can say
 that any path with a trailing '/' is intended to be a directory, and
 if it's not, then that's an error.  But the trouble is that if you
 silently drop the '/', then the only way for me to implement a correct
 cp(1) in Haskell is to not use your proposed interface for pathname
 handling, which drops this information.

I thought some more about this, and I think the right way to handle this
is on parsing and printing.  After all, the trailing slash has no real
meaning for any intermediate processing you might do.  So if the type
used by my path operations is Path, I might have something like

readPath :: String -> (Path, Bool {- trailing delimiter -})
showPath :: Path -> String
showPathTrailingSlash :: Path -> String

This is far simpler than trying to figure out what the slash means for
every path operation.
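
A minimal sketch of the parse/print idea above, for '/'-separated paths only. The Path representation (a root flag plus a list of components) and the helper splitOnSlash are assumptions made for illustration; the thread leaves Path abstract.

```haskell
import Data.List (intercalate)

-- | Hypothetical Path type: the thread does not define one.
data Path = Path { isAbsolute :: Bool, components :: [String] }
  deriving (Eq, Show)

-- | Split on '/', dropping empty components (so "a//b" parses like "a/b").
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (c, [])       -> [c | not (null c)]
  (c, _ : rest) -> [c | not (null c)] ++ splitOnSlash rest

-- | Parse once, remembering whether a trailing delimiter was present.
readPath :: String -> (Path, Bool)
readPath s = (Path abs' comps, trailing)
  where
    abs'     = take 1 s == "/"
    comps    = splitOnSlash s
    -- The bare root "/" is not counted as having a trailing delimiter.
    trailing = not (null comps) && last s == '/'

-- | Print without a trailing delimiter.
showPath :: Path -> String
showPath (Path a cs) = (if a then "/" else "") ++ intercalate "/" cs

-- | Print with a trailing delimiter (a path with no components is unchanged).
showPathTrailingSlash :: Path -> String
showPathTrailingSlash p@(Path _ cs)
  | null cs   = showPath p
  | otherwise = showPath p ++ "/"
```

With this shape a cp(1)-like tool can keep the Bool from readPath and decide at the edges what the slash means, while intermediate operations only ever see Path.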

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-27 Thread Neil Mitchell

Hi


hahaha!  I admit I don't know enough to say how the lpt1 issue should be
handled.  Is there any Win32 call I can make that will help me avoid
accidentally opening these magic files?

No, because it's entirely possible to open these magic files; you'll
just find that your output has accidentally appeared at your
printer, rather than on disk.


BTW, it appears that wget itself does
not handle it. :-)

I know, but my hope is that HsWget will :)


BTW, I guess wget should truncate the path at some number of
characters

Fortunately if we have FilePath == String, take n can be used, or more
likely joinDirectories . take n . splitDirectories
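
A small sketch of the component-wise truncation mentioned above. The names splitDirectories and joinDirectories come from the message, but the definitions here are stand-ins written for this example and handle '/' only; truncateDirectories is a hypothetical name.

```haskell
import Data.List (intercalate)

-- | Split a path on '/', keeping empty components so that joining
-- the result gives back the original string.
splitDirectories :: String -> [String]
splitDirectories s = case break (== '/') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitDirectories rest

-- | Inverse of splitDirectories.
joinDirectories :: [String] -> String
joinDirectories = intercalate "/"

-- | Keep at most n components of a path.
truncateDirectories :: Int -> String -> String
truncateDirectories n = joinDirectories . take n . splitDirectories
```

Note that for an absolute path the leading empty component counts towards n in this sketch.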


 Windows doesn't use UTF-16, NTFS does.

I was under the impression that NT's Unicode support was conceived when
it meant UCS-2.  So it uses UCS-2 and not UTF-16, which would mean that
you could in principle encounter lone surrogate characters or something
equally nonsensical.

Yep, true, it uses UCS-2.


Windows has two sets of file system related functions, one for legacy
8-bit character sets, one for Unicode.  What happens if I call the
Unicode API on a FAT system that doesn't support it?  Does it do a
half-assed version of the locale specific encoding that we deem
impossible and wrong here?


Of course :) And if you use the ANSI APIs on an NTFS system you'll
also get some dodgy encoding.


Ah, never mind, I get the strong feeling I really don't want to know all
this.  When even Windows 98 has been end-of-lifed we should rely on the
Unicode API, if anything.

Windows ME has not been end-of-lifed, and still has native 8-bit.

Thanks

Neil


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-27 Thread Duncan Coutts
On Thu, 2006-07-27 at 11:07 -0700, Andrew Pimlott wrote:
 On Wed, Jul 26, 2006 at 04:02:31PM -0700, Andrew Pimlott wrote:
  I admit I don't know enough to say how the lpt1 issue should be
  handled.  Is there any Win32 call I can make that will help me avoid
  accidentally opening these magic files?  Say, if I call open with
  O_CREAT | O_EXCL?  Unfortunately, I can find very little information on
  how one should handle this issue.
 
 Thanks to a suggestion from Bulat to use c_open, I was able to test
 O_WRONLY | O_CREAT | O_EXCL on Windows.  In fact, Windows does allow
 files like nul to be opened (as many times as you like) with these
 flags, which I find dismaying.  So I still don't know the proper way to
 handle them.

Interestingly even Windows explorer doesn't handle these odd files
consistently.

Renaming a file to com1 is ignored with no error, though renaming to
com1.txt gives an error about such a file already existing.

Also, it seems that com1.txt.txt is not allowed either. I thought that
the extension of com1.txt.txt was txt but it seems that it is
txt.txt and so the base name is com1 and thus not allowed.
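
The observation above (the part before the first dot decides whether a name is magic) can be sketched as a pure check. This is an illustration only: the list of reserved names below is the commonly documented set of DOS device names, the function names are hypothetical, and the check looks at a single file name, not a whole path.

```haskell
import Data.Char (toUpper)

-- | Device names treated as reserved in this sketch. The thread itself
-- only mentions lpt1, nul and com1; the rest of the list is an assumption.
reservedNames :: [String]
reservedNames =
  ["CON", "PRN", "AUX", "NUL"]
    ++ ["COM" ++ show n | n <- [1 .. 9 :: Int]]
    ++ ["LPT" ++ show n | n <- [1 .. 9 :: Int]]

-- | Everything before the first dot, upper-cased for comparison.
deviceStem :: String -> String
deviceStem = map toUpper . takeWhile (/= '.')

-- | True if the file name would be taken for a device, whatever
-- extensions follow it.
isReservedName :: String -> Bool
isReservedName name = deviceStem name `elem` reservedNames
```

A check like this is advisory at best; testing the opened handle with GetFileType, as suggested elsewhere in the thread, asks the system itself.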

Duncan



Re: [Haskell-cafe] Re: ANN: System.FilePath 0.9

2006-07-26 Thread Krasimir Angelov

On 7/26/06, Neil Mitchell wrote:

The main purpose of canoncialPath is to fix the case on Windows, so
c:\my documents\file.doc becomes C:\My Documents\file.doc if that
is the case correct version of the file. I think this function will
not actually change the file with relation to the underlying file
system, so should be race free. (I will document more to make the
operation clearer)


Hi Neil,

   It seems like your canoncialPath function is already in the base
package. Look at System.Directory.canonicalizePath. I have added it
when I was working on the FilePath module for Cabal.
  The FilePath abstraction was discussed a number of times and it
seems that people prefer an ADT representation instead of plain
String. I tend to agree. Maybe such ADT based library can be
integrated with some new IO library like the Streams library.

Cheers,
 Krasimir


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread David Roundy
On Wed, Jul 26, 2006 at 03:36:13AM +0100, Neil Mitchell wrote:

  pathSeparator :: Char
  The character that separates directories.
 So what do I do with this?  If I need it, it seems like the module has
 failed.
 Hopefully no one will ever use it. Its part of the low level functions
 that the FilePath module builds on. However, pragmatically, someone
 somewhere will have a use for it, and the second they do they'll just
 write '/', and at that point we've lost.

I'd just point out that I'm not aware of an operating system that GHC
runs on that doesn't accept '/' as a path separator.  It may be that
you could find an OS where you could compile with jhc or run with hugs
that doesn't use '/' (e.g. MacOS 9), but support for MacOS 9 at this
stage I wouldn't consider a high priority.  Since no one ought to need
the path separator, and since they can currently assume '/' without
loss of portability, it seems like adding in an extra function to
protect us from the introduction of an operating system some time in
the future that doesn't allow '/' as a path separator is a bit
much.

Of course, I may be wrong.  Does windows disallow mixing of '/' and
'\\' as path separators? In darcs we always just use '/' as all the
path separators, and it works fine...
-- 
David Roundy


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Neil Mitchell

I'd just point out that I'm not aware of an operating system that GHC
runs on that doesn't accept '/' as a path separator.  It may be that
you could find an OS where you could compile with jhc or run with hugs
that doesn't use '/' (e.g. MacOS 9), but support for MacOS 9 at this
stage I wouldn't consider a high priority.  Since no one ought to need
the path separator, and since they can currently assume '/' without
loss of portability, it seems like adding in an extra function to
protect us from the introduction of an operating system some time in
the future that doesn't allow '/' as a path separator is a bit
much.

Of course, I may be wrong.  Does windows disallow mixing of '/' and
'\\' as path separators?


That's fine, at the file system level. However some programs that are
layered on top of that, for example the copy command in the shell,
will bork on /. Also on Windows fundamentally \ is the separator, and
/ is a second class separator. When showing paths to the user, it
should always be \ because that's the one that's right (TM) for the
platform.

Thanks

Neil


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Udo Stenzel
Andrew Pimlott wrote:
  The drive functions stand on their own as a chunk, and are possibly
  not well suited to a Posix system, but are critical for a Windows
  system.
 
 Why are they critical for portable code?  I am fine with
 Windows-specific functions, but I think it's a mistake to bundle them
 [with] portable functions.

I couldn't agree more.  In fact, why can't we pretend the world is sane
at least within Haskell and just put away those drive letters?


 My criticism is that your properties are all specified in terms of
 string manipulation.

Exactly.  I believe, a FilePath should be an algebraic datatype.
Most operations on that don't have to be specified, because they are
simple and have an obvious effect.  Add a system specific parser and a
system specific renderer, maybe also define a canonical format, and the
headaches stop.  What's wrong with this?

data FilePath = Absolute RelFilePath | Relative RelFilePath
data RelFilePath = ThisDirectory 
 | File String
 | ParentOf RelFilePath 
 | String :|: RelFilePath

parseSystemPath :: String -> Maybe FilePath
renderSystemPath :: FilePath -> String

We can even clearly distinguish between the name of a directory in its
parent and the directory itself.  On Windows, the root directory just
contains the drive letters and is read-only,
drive-absolute-but-directory-relative paths are simply ignored (they are
a dumb idea anyway).  Separator characters are never exposed, all we
need now is a mapping from Unicode to whatever the system wants.  
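
A runnable sketch of the datatype above with a POSIX-style parser and renderer. Several things here are assumptions, not part of the proposal: reading ParentOf r as ".." followed by r, collapsing "." and empty components while parsing, the fixity of :|:, and the helper fileName. Drive letters and Windows syntax are left out.

```haskell
import Prelude hiding (FilePath)

infixr 5 :|:

data FilePath = Absolute RelFilePath | Relative RelFilePath
  deriving (Eq, Show)

data RelFilePath
  = ThisDirectory
  | File String
  | ParentOf RelFilePath
  | String :|: RelFilePath
  deriving (Eq, Show)

-- | Parse a '/'-separated path; the empty string names nothing.
parseSystemPath :: String -> Maybe FilePath
parseSystemPath ""        = Nothing
parseSystemPath ('/' : s) = Just (Absolute (parseRel s))
parseSystemPath s         = Just (Relative (parseRel s))

parseRel :: String -> RelFilePath
parseRel s = case break (== '/') s of
  ("", [])         -> ThisDirectory            -- trailing slash or nothing left
  (name, [])       -> leaf name
  ("", _ : rest)   -> parseRel rest            -- collapse "//"
  (".", _ : rest)  -> parseRel rest            -- "./x" is just "x"
  ("..", _ : rest) -> ParentOf (parseRel rest)
  (dir, _ : rest)  -> dir :|: parseRel rest
  where
    leaf "."  = ThisDirectory
    leaf ".." = ParentOf ThisDirectory
    leaf name = File name

renderSystemPath :: FilePath -> String
renderSystemPath (Absolute r)             = "/" ++ renderRel r
renderSystemPath (Relative ThisDirectory) = "."
renderSystemPath (Relative r)             = renderRel r

renderRel :: RelFilePath -> String
renderRel ThisDirectory            = ""
renderRel (File name)              = name
renderRel (ParentOf ThisDirectory) = ".."
renderRel (ParentOf r)             = "../" ++ renderRel r
renderRel (dir :|: r)              = dir ++ "/" ++ renderRel r

-- | The final file name, if the path names a file rather than a directory.
fileName :: RelFilePath -> Maybe String
fileName ThisDirectory = Nothing
fileName (File name)   = Just name
fileName (ParentOf r)  = fileName r
fileName (_ :|: r)     = fileName r
```

In this sketch "foo" parses to File "foo" and "foo/" to "foo" :|: ThisDirectory, so the file/directory distinction is carried by the constructors rather than by a trailing character.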



  pathSeparator :: Char
  The character that separates directories.
 
 So what do I do with this?  If I need it, it seems like the module has
 failed.

Indeed.

 
  splitFileName "bob" == ("", "bob")
 
 "" is not a directory.

Some problems just vanish:

parseSystemPath "bob" == Just (Relative (File "bob"))
splitFileName (Relative (File "bob")) == (Relative ThisDirectory, File "bob")

 
  Windows: splitFileName "c:" == ("c:", "")
 
 "c:" is arguably not a directory.

parseSystemPath "c:" == Nothing
parseSystemPath "c:\\" == Just (Absolute ("C:" :|: ThisDirectory))


 (Consider that dir c: lists the current directory on c:, not c:\)

I'd rather ignore that altogether.  Multiple roots with associated
current directories are just a needless headache.  Even a current
directory is somewhat ill-fitted for a functional language like
Haskell.


  getFileName "test/" == ""
 
 "" is not a filename.

getFileName (Relative ("test" :|: ThisDirectory))
== error "pattern match failure"

 
 Also, it looks from this that you treat paths differently depending on
 whether they end in a separator.  Yet this makes no difference to the
 system.  That seems wrong to me.

Not to the system, but some programs like to make a difference.  If you
give rsync a path that doesn't end in a slash, it will take that to mean
the directory.  With a slash, it means the contents of the directory.
The difference is an additional path component that ends up on the
target file system or doesn't.

 
  getDirectory :: FilePath -> FilePath
  Get the directory name, move up one level. 
 
 What does this mean, in the presence of dots and symlinks?

You're right, this has to be ill-defined.  Instead it should be

moveUp :: FilePath -> IO FilePath

which would end up in the parent of the linked-to directory after
following a symlink.  Cutting off a component is done by simple pattern
matching, no special functions needed.


Sorry for the rant, but this is Haskell, not Perl.  We have true data
types, not just strings...


Udo.
-- 
A politician is someone who calls a spade a portable, hand-operated
digging implement.
-- author unknown




Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Duncan Coutts
On Wed, 2006-07-26 at 15:29 +0200, Udo Stenzel wrote:

  My criticism is that your properties are all specified in terms of
  string manipulation.
 
 Exactly.  I believe, a FilePath should be an algebraic datatype.
 Most operations on that don't have to be specified, because they are
 simple and have an obvious effect.  Add a system specific parser and a
 system specific renderer, maybe also define a canonical format, and the
 headaches stop.  What's wrong with this?

We've had this discussion before. The main problem is that all the
current IO functions (readFile, etc) use the FilePath type, which is
just a String. So a new path ADT is fine if at the same time we provide
a new IO library. That of course is an ongoing discussion in itself.

So until we have the opportunity to change the FilePath type there does
seem to be value in providing a library that takes some of the
complexity and portability nightmares out of using the existing FilePath
type.

Currently, real programs are doing even less principled hacking with
strings. So an easy to use library that we can use now will be a great
improvement even if it's not perfect.

 data FilePath = Absolute RelFilePath | Relative RelFilePath
 data RelFilePath = ThisDirectory 
  | File String
  | ParentOf RelFilePath 
  | String :|: RelFilePath
 
 parseSystemPath :: String -> Maybe FilePath
 renderSystemPath :: FilePath -> String
 
 We can even clearly distinguish between the name of a directory in its
 parent and the directory itself.  On Windows, the root directory just
 contains the drive letters and is read-only,
 drive-absolute-but-directory-relative paths are simply ignored (they are
 a dumb idea anyway).  Separator characters are never exposed, all we
 need now is a mapping from Unicode to whatever the system wants.  

That's another portability headache - file name string encodings.
Windows and OSX use encodings of Unicode. Unix uses strings of bytes.
They are not fully inter-convertible. On Unix the traditional technique
is to keep a system file name in the original encoding and convert to
Unicode to display to the user, but the Unicode version is never
converted back to a system file name because it doesn't necessarily
convert back to the same sequence of bytes.

My point is it's not quite as simple as just making an ADT.

  (Consider that dir c: lists the current directory on c:, not c:\)
 
 I'd rather ignore that altogether.  Multiple roots with associated
 current directories are just a needless headache.  Even a current
 directory is somewhat ill-fitted for a functional language like
 Haskell.

Much of the time it can be ignored. Sometimes programs have to deal with
silly issues like this just because that is what the OS does and so you
might get such a corner case as input and be expected to deal with it.
(Though I admit this is a particularly obscure case.)

So in my humble opinion the current discussion on the issues of
semantics, names, IO or pure etc is worthwhile.

Duncan



Re: [Haskell-cafe] Re: ANN: System.FilePath 0.9

2006-07-26 Thread Robert Dockins


On Jul 26, 2006, at 1:47 PM, Neil Mitchell wrote:


Hi,


Perhaps instead:

   directoryOf :: FilePath -> String
   filenameOf  :: FilePath -> String
   extensionOf :: FilePath -> String
   basenameOf  :: FilePath -> String

   replaceFilename  = joinFilePath . directoryOf
   replaceDirectory = flip joinFilePath . filenameOf


Trying to design a consistent naming system, it helps if we all agree
on what the various parts of a filepath are called, this is my draft
of that:

http://www-users.cs.york.ac.uk/~ndm/temp/filepath.png

With a better name for basename, if anyone can think of one.


stem, perhaps?  You could also, maybe, distinguish the short
stem (everything before the extensions) from the long stem
(everything before the extension).




Once we have that, how about

takeElement :: FilePath -> String
dropElement :: FilePath -> String
replaceElement :: FilePath -> String -> FilePath
addElement :: FilePath -> String -> FilePath
splitElement :: FilePath -> (String, String)
joinElement :: String -> String -> FilePath

With the restriction that not all of these are provided. Some don't
make sense (splitBaseName, dropBaseName), some are implemented via
combine (addFileName, joinFileName), some are redundant (addExtensions
== addExtension)

I'm also debating whether split/join should be exported, since they
are less likely to be used and can easily be written as a take/drop
pair. And of course, a bigger interface is harder to understand.
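
To illustrate the remark that split can be written as a take/drop pair, here is a sketch with Element instantiated to Extension. The definitions are written for this example, work on a bare file name with no directory part, and take the extension to start at the last dot; none of that is fixed by the proposal above.

```haskell
-- | The extension including its dot, or "" if the name has no dot.
takeExtension :: String -> String
takeExtension name = case break (== '.') (reverse name) of
  (_, [])         -> ""
  (revExt, _ : _) -> '.' : reverse revExt

-- | The name with its last extension removed.
dropExtension :: String -> String
dropExtension name = take (length name - length (takeExtension name)) name

-- | split is just the drop/take pair.
splitExtension :: String -> (String, String)
splitExtension name = (dropExtension name, takeExtension name)
```

By construction the two halves concatenate back to the input, which is the kind of property a specification could state. Names such as ".bashrc" come out with an empty stem here, which a real library might want to treat differently.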

Opinions on this? It's easier to tweak a specification than the  
actual code :)


Thanks

Neil



Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
  -- TMBG





Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Udo Stenzel
Duncan Coutts wrote:
 On Wed, 2006-07-26 at 15:29 +0200, Udo Stenzel wrote:
 
  Exactly.  I believe, a FilePath should be an algebraic datatype.
 
 We've had this discussion before. The main problem is that all the
 current IO functions (readFile, etc) use the FilePath type, which is
 just a String.

So what's better?

- use an ADT (correct and portable by construction), convert to String
  when calling the IO library

- fumble with Strings, use an unholy mix of specialized and general
  functions, trip over a corner case


 So a new path ADT is fine if at the same time we provide
 a new IO library.

We should just wrap the old API, filePathToString any parameters and
liftIO the function while we're at it.


 That's another portability headache - file name string encodings.
 Windows and OSX use encodings of Unicode. Unix uses strings of bytes.

Indeed.  There are two ways out:

- declare that Unix uses Unicode too, take the appropriate conversion
  from the locale

- parameterize the FilePath ADT on the character type, you get (FilePath
  Word16) on Windows (which uses UCS-2, not UCS-4 and not UTF-16) and
  (FilePath Word8) on Unix; provide conversions from/to (FilePath
  String).

I tend towards the second option.  It at least doesn't make anything
worse than it already is.  It's also irrelevant, since pretending the
issue doesn't exist works equally well with an ADT.

 My point is it's not quite as simple as just making an ADT.

Mine is that it is :)  Moreover, a path already has internal structure.
Those string manipulating functions either reconstruct the structure,
then operate on that, then encode it back into a string or implement an
approximation to that.  The latter leads to surprises and making the
former explicit can never hurt.  Heck, NO library fumbles with strings,
neither parsers nor pretty printers nor Network... why should a FilePath
be different?


Udo.




Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Andrew Pimlott
On Wed, Jul 26, 2006 at 03:36:13AM +0100, Neil Mitchell wrote:
  Its a rats nest to do it properly, but some very basic idea of does
  this path have things which there is no way could possibly be in a
  file - for example c:\|file is a useful thing to have.
 
 This seems to encourage the classic mistake of checking not known bad
 rather than known good.  known bad is rarely useful in my
 experience.  What use case do you have in mind?
 
 wget on windows saves web pages such as
 "http://www.google.com/index.html?q=haskell" to the file
 "index.html?q=haskell". This just doesn't work, and is the main reason
 I added this in. I don't think it will be a commonly used operation.

Ok, this is a good use case.  What should wget do if isValid fails?
Certainly not abort the download.  So isValid alone is no help.  Well,
you have makeValid as well, but this is even more of a rats' nest.
There are a zillion different ways you might want this function to work,
depending on your purposes.  Should makeValid be system-dependent?
Should it be reversible?  How hard should it try to preserve the name
verbatim?  Should it prettify legal but unprintable characters?  The
answers are application-dependent.  Also, wget has to worry about not
just whether the filename is valid, but whether it currently exists, and
in that case modify it.  So attempting to provide a generic makeValid is
quixotic and will only lead to misuse.

 My criticism is that your properties are all specified in terms of
 string manipulation.  The whole point of paths is that they are
 interpreted by the system, so if you neglect to say what your operations
 mean to the system, what have you specified?
 True, but at the same time specifying what something means with
 respect to a filesystem is very hard :) If you had any insight how
 this could be done I'd be interested.

The first step is to think carefully about what operations to provide,
and be conservative.  I think the operations I included in my library
all have pretty clear meanings, though I don't claim to have nailed them
down all the way.  Criticism welcome.

http://haskell.org/pipermail/libraries/2006-February/004890.html

 Hopefully no one will ever use it. Its part of the low level functions
 that the FilePath module builds on. However, pragmatically, someone
 somewhere will have a use for it, and the second they do they'll just
 write '/', and at that point we've lost.

Yes, on one hand you want to be pragmatic.  But IMO this way of
thinking--expose the guts just in case--is the path to madness.  Not to
mention, it clutters the API and makes it less clear how the module is
supposed to be used.  Maybe the guts could go into a separate module?

  splitFileName :: FilePath -> (String, String)
  Split a filename into directory and file.
 Which directory and which file?
 Ok, thats probably the wrong description. Splits off the last filename
 would be a better description, leaving the rest.

Ok, but now what is the rest good for?  And what is the last
filename in cases like "/" or "..".  The conclusion I come to is that
this operation is unsound to begin with, and should not be part of the
API in any form.

 Also, it looks from this that you treat paths differently depending on
 whether they end in a separator.  Yet this makes no difference to the
 system.  That seems wrong to me.
 That was something I thought over quite a while. If the user enters
 "directory/" then they do not mean the file called "directory", they
 mean the directory called "directory". And in Windows certainly you
 can't open a file called "file/"

Ok, fair, but "dir" and "dir/" are treated identically if dir is a
directory, so it is still confusing for your library to distinguish
them.  Maybe the user needs to indicate whether a path represents a file
or directory?  These matters confuse your specification.  I made the
simplifying approximation that "foo" and "foo/" should be considered
equivalent.  This may not turn out to be the right decision, but at
least it helped me keep the semantics clear.

  getDirectory :: FilePath -> FilePath
  Get the directory name, move up one level.
 What does this mean, in the presence of dots and symlinks?
 It gets a parent directory, there may be one, but the one returned
 will be a parent.

Is /a a parent of /a/..?  That seems dubious.

  equalFilePath :: FilePath -> FilePath -> Bool
  Equality of two FilePaths. If you call fullPath first this has a much
  better chance of working. Note that this doesn't follow symlinks or
  DOSNAM~1s.
 As you acknowledge, it's a crap-shoot.  So what's the point?
 It's a case of reality, at the moment people use == to test if two file
 paths are equal, at least this is a better test.

Why is it better?

 I think of that as a separate module, because extensions have no meaning
 to the system and can be done with portable, functional code, as far as
 I understand.
 Not really, what about getExtension "file.ext\lump" - the answer is ""
 on Windows and ".ext\lump" on Posix.

You would only call the 

Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Andrew Pimlott
On Wed, Jul 26, 2006 at 03:29:01PM +0200, Udo Stenzel wrote:
 Andrew Pimlott wrote:
  Also, it looks from this that you treat paths differently depending on
  whether they end in a separator.  Yet this makes no difference to the
  system.  That seems wrong to me.
 
 Not to the system, but some programs like to make a difference.

How does it make a difference?  Do you have an answer that applies
uniformly to all programs?  If not, aren't we just walking down a blind
alley?  I've heard that Emacs treats double-separators specially.  Do we
account for that too?

Maybe the trailing slash is important enough to take into account.  But
it complicates things, and the problem is hard enough without it.  So I
say, leave it out.

(In my design, with a different type for different systems, it would be
possible to create types for rsync paths, or Emacs paths, etc.  That
might be a better approach to the problem.)

Andrew
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Andrew Pimlott
On Wed, Jul 26, 2006 at 04:34:50PM +0100, Duncan Coutts wrote:
 On Wed, 2006-07-26 at 15:29 +0200, Udo Stenzel wrote:
  Exactly.  I believe, a FilePath should be an algebraic datatype.
  Most operations on that don't have to be specified, because they are
  simple and have an obvious effect.  Add a system specific parser and a
  system specific renderer, maybe also define a canonical format, and the
  headaches stop.  What's wrong with this?
 
 We've had this discussion before. The main problem is that all the
 current IO functions (readFile, etc) use the FilePath type, which is
 just a String.

Geesh, just provide a few wrappers.  If we make this a show-stopper,
we'll never get there.
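
The ADT proposal quoted above (an algebraic type plus a system-specific
parser and renderer) might look roughly like the following. This is only a
sketch: the names Path, parsePosix and render are made up for illustration
and are not part of System.FilePath or of any library discussed here.

```haskell
import Data.List (intercalate)

-- Hypothetical structured path: an absolute flag plus non-empty segments.
data Path = Path { isAbsolute :: Bool, segments :: [String] }
  deriving (Eq, Show)

-- Split a string on every occurrence of a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- System-specific parser (Posix flavour).  Empty segments are dropped, so
-- "a//b" and "a/b/" both lose information; whether that is acceptable is
-- one of the points under debate in this thread.
parsePosix :: String -> Path
parsePosix s = Path (take 1 s == "/") (filter (not . null) (splitOn '/' s))

-- System-specific renderer, parameterised on the separator character.
render :: Char -> Path -> String
render sep (Path ab segs) = (if ab then [sep] else "") ++ intercalate [sep] segs
```

With these definitions, render '/' (parsePosix "/usr//lib/") gives
"/usr/lib". The trailing separator is gone, so a design that wants to keep
it would need an extra field in the type.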

 That's another portability headache - file name string encodings.
 Windows and OSX use encodings of Unicode. Unix uses strings of bytes.
 They are not fully inter-convertible. On Unix the traditional technique
 is to keep a system file name in the original encoding and convert to
 Unicode to display to the user, but the Unicode version is never
 converted back to a system file name because it doesn't necessarily
 convert back to the same sequence of bytes.

The only solution for this, IMO, is to provide different types for
different systems.  Hence my typeclass approach.  (Which I'm not saying
is good enough to cover for all these differences yet.)

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Neil Mitchell

Hi


So what's better?

- use an ADT (correct and portable by construction), convert to String
  when calling the IO library

- fumble with Strings, use an unholy mix of specialized and general
  functions, trip over a corner case


Or provide an ADT, demand people marshal to and from this ADT and not
just cheat and use the string directly? Unfortunately people are lazy,
I am one of them...



We should just wrap the old API, filePathToString any parameters and
liftIO the function while we're at it.


How about

class FilePathLike a where
   getRealFilePath :: a -> String

Then convert readFile etc. to take a FilePathLike, rather than a filepath?

I'd be happy with that, and then you can write an ADT and pin down all
the exact details, and the end user can then pick whatever they want
to use.
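
A minimal sketch of how the FilePathLike idea could fit together, assuming
the class exactly as written above. RawPath, AbsPosixPath and readFileL are
hypothetical names invented for this example; only readFile is the real
Prelude function.

```haskell
-- The class proposed above.
class FilePathLike a where
  getRealFilePath :: a -> String

-- Plain strings, wrapped in a newtype so no language extension is needed.
newtype RawPath = RawPath String

instance FilePathLike RawPath where
  getRealFilePath (RawPath s) = s

-- Hypothetical ADT: the segments of an absolute Posix path.
newtype AbsPosixPath = AbsPosixPath [String]

instance FilePathLike AbsPosixPath where
  getRealFilePath (AbsPosixPath [])   = "/"
  getRealFilePath (AbsPosixPath segs) = concatMap ('/' :) segs

-- Hypothetical wrapper around the existing Prelude readFile.
readFileL :: FilePathLike p => p -> IO String
readFileL = readFile . getRealFilePath
```

A caller who is happy with strings would pass a RawPath, while one who
wants the structured form would pass an AbsPosixPath; both end up at the
same readFile.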


- declare that Unix uses Unicode too, take the appropriate conversion
  from the locale

Unfortunately this is wrong, and will give the wrong answers.


- parameterize the FilePath ADT on the character type, you get (FilePath
  Word16) on Windows (which uses UCS-2, not UCS-4 and not UTF-16) and
  (FilePath Word8) on Unix; provide conversions from/to (FilePath
  String).

Windows doesn't use UTF-16, NTFS does. FAT doesn't. And what about the
Samba drive I have mounted under Windows?


Thanks

Neil


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Neil Mitchell

Hi


Ok, this is a good use case.  What should wget do if isValid fails?

isValid (makeValid x) == True

makeValid is system dependent, and unspecified in its behaviour,
although obviously some kind of closeness to the original would be
ideal. So what if isValid fails and we don't have this?


  As you acknowledge, it's a crap-shoot.  So what's the point?
 It's a case of reality, at the moment people use == to test if two file
 paths are equal, at least this is a better test.
Why is it better?


Because it's right more often, so pragmatically it is better


 I think of that as a separate module, because extensions have no meaning
 to the system and can be done with portable, functional code, as far as
  I understand.
 Not really, what about getExtension "file.ext\lump" - the answer is ""
 on Windows and ".ext\lump" on Posix.
You would only call the extension functions on a segment name.


So this system-independent extension module is dependent on a
platform-specific FilePath module? Or do you demand people make two function
calls to get the extension? I think having extensions in this module
is the pragmatic and useful thing to do.
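
To make the "two function calls" alternative concrete, here is a sketch
under the assumption that the platform-specific step only finds the last
segment and the extension step is pure string code. Both function names are
hypothetical.

```haskell
-- Extension of a single path segment: everything from the last '.' on,
-- or "" when the segment contains no '.'.
segmentExtension :: String -> String
segmentExtension seg = case break (== '.') (reverse seg) of
  (_, [])     -> ""
  (revExt, _) -> '.' : reverse revExt

-- Hypothetical platform-specific step: the last segment of a path, given
-- the separator characters of the platform.
lastSegment :: [Char] -> String -> String
lastSegment seps = reverse . takeWhile (`notElem` seps) . reverse
```

With separators "\\/" (Windows) the path file.ext\lump has last segment
lump and an empty extension; with separator "/" (Posix) the whole string is
one segment and the extension is .ext\lump, matching the answers quoted
above. Note that a dot-file such as .bashrc is treated as all extension by
this sketch.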


 Not to the system, but some programs like to make a difference.

How does it make a difference?  Do you have an answer that applies
uniformly to all programs?  If not, aren't we just walking down a blind
alley?  I've heard that Emacs treats double-separators specially.  Do we
account for that too?


Haskell makes the difference, runInteractiveCommand vs runInteractiveProcess


Maybe the trailing slash is important enough to take into account.  But
it complicates things, and the problem is hard enough without it.  So I
say, leave it out.


Originally I left it out; writing QuickCheck properties persuaded me
to put it back in, because it seems to make things more regular. But
I'm not massively tied to this, and I'm slowly thinking I might be
wrong, although not convinced either way yet.

Thanks

Neil


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Duncan Coutts
On Wed, 2006-07-26 at 19:41 +0200, Udo Stenzel wrote:
 Duncan Coutts wrote:
  On Wed, 2006-07-26 at 15:29 +0200, Udo Stenzel wrote:
  
   Exactly.  I believe, a FilePath should be an algebraic datatype.
  
  We've had this discussion before. The main problem is that all the
  current IO functions (readFile, etc) use the FilePath type, which is
  just a String.
 
 So what's better?
 
 - use an ADT (correct and portable by construction), convert to String
   when calling the IO library
 
 - fumble with Strings, use an unholy mix of specialized and general
   functions, trip over a corner case

In practice, in the short term, the choice is between each application
fumbling with strings in different incorrect ways or a library that
fumbles with strings in a rather more considered and portable way.

  So a new path ADT is fine if at the same time we provide
  a new IO library.
 
 We should just wrap the old API, filePathToString any parameters and
 liftIO the function while we're at it.

Try proposing something concrete and see if you can get it generally
accepted. Perhaps you can get it accepted for the next major release of
various Haskell implementations or for Haskell-prime.

  That's another portability headache - file name string encodings.
  Windows and OSX use encodings of Unicode. Unix uses strings of bytes.
 
 Indeed.  There are two ways out:
 
 - declare that Unix uses Unicode too, take the appropriate conversion
   from the locale

Sadly this does not work. For one thing you don't know that the locale
you're using now was the locale of the program that wrote the file. This
happens on multi-user systems where different users use different
languages.

Then there is the fact that converting from Unicode back to the file
name is not guaranteed to give the same sequence of bytes.

For example, see the section File Name Encodings in the glib api:
http://developer.gnome.org/doc/API/2.0/glib/glib-Character-Set-Conversion.html

 - parameterize the FilePath ADT on the character type, you get (FilePath
   Word16) on Windows (which uses UCS-2, not UCS-4 and not UTF-16) and
   (FilePath Word8) on Unix; provide conversions from/to (FilePath
   String).

 I tend towards the second option.  It at least doesn't make anything
 worse than it already is.  It's also irrelevant, since pretending the
 issue doesn't exist works equally well with an ADT.

Yeah, keeping it in the native format and doing no change of encoding is
almost certainly the way to go. It doesn't address the issue of
converting file names to/from displayable strings, but perhaps that's
reasonable.

  My point is it's not quite as simple as just making an ADT.
 
 Mine is that it is :)  Moreover, a path already has internal structure.
 Those string manipulating functions either reconstruct the structure,
 then operate on that, then encode it back into a string or implement an
 approximation to that.  The latter leads to surprises and making the
 former explicit can never hurt.  Heck, NO library fumbles with strings,
 neither parsers nor pretty printers nor Network... why should a FilePath
 be different?

For compatibility with the Haskell98 IO library. There's also the issue
here that adding in lots of conversions ADT <-> String means that people
will not bother to use it and will continue to do things like:
readFile (path ++ "/" ++ file)

If anyone can actually design and implement an ADT that addresses most
of these problems and can get it to work nicely with whatever is the
popular IO system of the time then that'd be great. I think you'll find
that it's not quite as simple as it looks. There was a discussion on a
path ADT on the libraries list a while ago that's probably worth
reading. I don't think it reached consensus.

Duncan



Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Duncan Coutts
On Wed, 2006-07-26 at 11:32 -0700, Andrew Pimlott wrote:
 On Wed, Jul 26, 2006 at 03:36:13AM +0100, Neil Mitchell wrote:
   It's a rat's nest to do it properly, but some very basic idea of "does
   this path have things which there is no way could possibly be in a
   file" - for example "c:\|file" is a useful thing to have.
  
  This seems to encourage the classic mistake of checking "not known bad"
  rather than "known good".  "known bad" is rarely useful in my
  experience.  What use case do you have in mind?
  
  wget on Windows saves web pages such as
  "http://www.google.com/index.html?q=haskell" to the file
  "index.html?q=haskell". This just doesn't work, and is the main reason
  I added this in. I don't think it will be a commonly used operation.
 
 Ok, this is a good use case.  What should wget do if isValid fails?
 Certainly not abort the download.  So isValid alone is no help.  Well,
 you have makeValid as well, but this is even more of a rats' nest.
 There are a zillion different ways you might want this function to work,
 depending on your purposes.  Should makeValid be system-dependent?
 Should it be reversible?  How hard should it try to preserve the name
 verbatim?  Should it prettify legal but unprintable characters?  The
 answers are application-dependent.  Also, wget has to worry about not
 just whether the filename is valid, but whether it currently exists, and
 in that case modify it.  So attempting to provide a generic makeValid is
 quixotic and will only lead to misuse.

Perhaps we should be more specific and make it talk about illegal file
name characters if that is indeed the use case. Perhaps we should
provide a system-dependent list of characters that are not allowed in
file names. For example, on Windows that would include '?'.

Then an application can decide for itself what to do about that
depending on the context. It might be able to tell the user to pick a
different name, or in the wget case replace it with a different
character or remove it or something.

So maybe we should keep isValid but specify exactly what it checks. Then
if it fails it's up to the application to decide how to fix it, possibly
making use of the list of illegal characters.
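
A sketch of what such a system-dependent list could look like. The Platform
type and all function names are hypothetical; the Windows character set is
the commonly documented one for single file names (control characters plus
< > : " / \ | ? *), and the check deliberately ignores everything else that
can make a name unusable.

```haskell
import Data.Char (ord)

data Platform = Posix | Windows deriving (Eq, Show)

-- Characters that may not appear in a single file name (not a whole path).
isIllegalChar :: Platform -> Char -> Bool
isIllegalChar Posix   c = c == '/' || c == '\0'
isIllegalChar Windows c = ord c < 32 || c `elem` "<>:\"/\\|?*"

-- The illegal characters actually present in a name, so that the caller
-- can decide what to do about each of them.
illegalCharsIn :: Platform -> String -> String
illegalCharsIn p = filter (isIllegalChar p)

-- A name is accepted when it is non-empty and has no illegal characters.
isValidName :: Platform -> String -> Bool
isValidName p name = not (null name) && null (illegalCharsIn p name)
```

For the wget case, illegalCharsIn Windows "index.html?q=haskell" reports
the single offending '?', and the application is free to replace, escape or
reject it.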

Duncan



Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Andrew Pimlott
On Wed, Jul 26, 2006 at 08:19:47PM +0100, Neil Mitchell wrote:
 Ok, this is a good use case.  What should wget do if isValid fails?
 
 makeValid is system dependent, and unspecified in its behaviour,
 although obviously some kind of closeness to the original would be
 ideal. So what if isValid fails and we don't have this?

Sorry, I meant to say what I think wget should do.  IMO, it should have
a conservative set of allowed characters, encode the filename into that
set using an escaping mechanism it specifies, attempt to open the file
O_EXCL, modify the name if it fails.  The allowed characters set could
perhaps come from the filepath module, though I suspect this is
overkill.  Simpler just to hard-code the set so that the name mangling
is platform-independent and can be fully documented.
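
The escaping step described above could be sketched as follows. The allowed
set and the %XX scheme are assumptions made for this example (wget does not
necessarily work this way), and the sketch only covers code points below
256; wider characters would first have to be encoded to bytes for the
mapping to stay reversible.

```haskell
import Data.Char (isAlphaNum, isAscii, ord)
import Numeric (showHex)

-- Hard-coded, platform-independent set of characters kept verbatim.
isSafe :: Char -> Bool
isSafe c = (isAscii c && isAlphaNum c) || c `elem` "-_."

-- Escape one character as %XX (lower-case hex, zero-padded to two digits).
-- '%' itself is not in the safe set, so it is escaped as well.
escapeChar :: Char -> String
escapeChar c
  | isSafe c  = [c]
  | otherwise = '%' : pad (showHex (ord c) "")
  where
    pad h = replicate (2 - length h) '0' ++ h

escapeName :: String -> String
escapeName = concatMap escapeChar
```

This leaves the exclusive-open step and the special names "." and ".."
(which consist only of safe characters) for the application to deal with.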

   As you acknowledge, it's a crap-shoot.  So what's the point?
  It's a case of reality, at the moment people use == to test if two file
  paths are equal, at least this is a better test.
 Why is it better?
 
 Because it's right more often, so pragmatically it is better

To me, that answer is unsatisfactory.

  I think of that as a separate module, because extensions have no meaning
  to the system and can be done with portable, functional code, as far as
   I understand.
  Not really, what about getExtension "file.ext\lump" - the answer is ""
  on Windows and ".ext\lump" on Posix.
 You would only call the extension functions on a segment name.
 
 So this system-independent extension module is dependent on a
 platform-specific FilePath module? Or do you demand people make two function
 calls to get the extension?

One or the other, it seems a minor detail to me.  There's nothing wrong
with having the extension module use the filepath module.

  Not to the system, but some programs like to make a difference.
 
 How does it make a difference?  Do you have an answer that applies
 uniformly to all programs?  If not, aren't we just walking down a blind
 alley?  I've heard that Emacs treats double-separators specially.  Do we
 account for that too?
 
 Haskell makes the difference, runInteractiveCommand vs runInteractiveProcess

I'm not following.

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread David Roundy
On Wed, Jul 26, 2006 at 11:39:40AM -0700, Andrew Pimlott wrote:
 On Wed, Jul 26, 2006 at 03:29:01PM +0200, Udo Stenzel wrote:
  Andrew Pimlott wrote:
   Also, it looks from this that you treat paths differently depending on
   whether they end in a separator.  Yet this makes no difference to the
   system.  That seems wrong to me.
  
  Not to the system, but some programs like to make a difference.

 How does it make a difference?  Do you have an answer that applies
 uniformly to all programs?  If not, aren't we just walking down a blind
 alley?  I've heard that Emacs treats double-separators specially.  Do we
 account for that too?

cp(1), for example, treats paths with trailing separators differently
from paths without.

rm -rf foo bar
echo test > foo
echo othertest > bar
cp foo bar/
cp foo bar

It's part of the user interface, that allows the user to specify that
he or she intends to use a path to describe a directory.

This doesn't apply uniformly to all programs--except that we can say
that any path with a trailing '/' is intended to be a directory, and
if it's not, then that's an error.  But the trouble is that if you
silently drop the '/', then the only way for me to implement a correct
cp(1) in Haskell is to not use your proposed interface for pathname
handling, which drops this information.
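
One way a library could keep this information without making every caller
inspect strings is to return the intent alongside the path. This is a
sketch only; Intent and classify are hypothetical names, and the rule "a
trailing '/' means the user expects a directory" is the interpretation
given above, not something a filesystem enforces everywhere.

```haskell
-- What the user asked for, judged purely from the spelling of the path.
data Intent = MustBeDirectory | FileOrDirectory deriving (Eq, Show)

-- Split a user-supplied Posix path into the path proper and the intent
-- carried by a trailing '/'.  A path made only of separators (the root)
-- is returned unchanged.
classify :: String -> (String, Intent)
classify path
  | null path                     = (path, FileOrDirectory)
  | null stripped                 = (path, MustBeDirectory)
  | length stripped < length path = (stripped, MustBeDirectory)
  | otherwise                     = (path, FileOrDirectory)
  where
    stripped = reverse (dropWhile (== '/') (reverse path))
```

A cp-like program would call classify on its target argument and report an
error when the intent is MustBeDirectory but the path names a regular file,
which is the distinction between "cp foo bar/" and "cp foo bar" above.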

I'd also point out that reiser4, for instance, treats paths with a
trailing slash differently even for files.  True, it's probably not a
good idea, but if we're talking about a portable library we might want
it to work even on systems running an interesting filesystem like
reiser4.
-- 
David Roundy


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Neil Mitchell

Hi


re: isValid

Perhaps we should be more specific and make it talk about illegal file
name characters if that is indeed the use case. Perhaps we should
provide a system-dependent list of characters that are not allowed in
file names. For example, on Windows that would include '?'.

Then an application can decide for itself what to do about that
depending on the context. It might be able to tell the user to pick a
different name, or in the wget case replace it with a different
character or remove it or something.


Unfortunately that's too much work for the user of the API. Since it
tends to work on Posix, people probably won't go to the hassle of
fixing it up. If they have a simple fix then there is at least a
chance that they'll accept a patch that fixes the behaviour on
Windows. The only options that I can see are either you want it fixed,
or you want to get the user to fix it manually - both are catered for.

And on Windows it's more complex: LPT1.txt is also an invalid file, but
LPT1.txt.txt isn't. Trying to express the weirdness of Windows is
probably beyond the chances of an API :)
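
For what it is worth, a conservative check for the device-name problem can
be sketched like this. The list of names (CON, PRN, AUX, NUL, COM1-COM9,
LPT1-LPT9) is the commonly documented one; isReservedName is a hypothetical
helper, and it deliberately rejects more than the message above says is
necessary.

```haskell
import Data.Char (toUpper)

-- Device names reserved by Windows, compared case-insensitively.
reservedDevices :: [String]
reservedDevices =
  ["CON", "PRN", "AUX", "NUL"]
    ++ [dev ++ [n] | dev <- ["COM", "LPT"], n <- ['1' .. '9']]

-- Conservative check: the part of the name before the first '.' is a
-- device name.  This also flags LPT1.txt.txt, which the message above
-- says is fine, so it errs on the side of rejecting too much.
isReservedName :: String -> Bool
isReservedName name = map toUpper base `elem` reservedDevices
  where
    base = takeWhile (/= '.') name
```

An application that generates names itself (as wget does) could simply
prefix or rename anything this check flags, rather than trying to model the
exact Windows rules.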


So maybe we should keep isValid but specify exactly what it checks.

I'm happy to specify things in more detail; at the moment it's pretty
much a no-op on Posix, but if any Posix user suggests that's wrong I'll
happily fix it up.


Sorry, I meant to say what I think wget should do.  IMO, it should have
a conservative set of allowed characters, encode the filename into that


Not enough, because of the LPT1 issue - unless you add L as a
disallowed letter :)


 Haskell makes the difference, runInteractiveCommand vs runInteractiveProcess
I'm not following.


Having some considerations towards a real path, one that can be used
on the command line, is reasonable, I think, because Haskell has functions
within it that distinguish between firing something at the underlying
filepath vs at the console. However, I don't think it's worth having a
special type for working with Emacs, unless you have
System.FilePath.Emacs, given that Emacs is almost an operating system
:)


But the trouble is that if you
silently drop the '/', then the only way for me to implement a correct
cp(1) in Haskell is to not use your proposed interface for pathname
handling, which drops this information.

Ok, now I remember the reasons I kept the trailing slash, I'll leave
it in. Esp. the reiser4 issue.

Thanks

Neil


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Udo Stenzel
Andrew Pimlott wrote:
 Maybe the trailing slash is important enough to take into account.

No, not the trailing slash.  The difference between a directory and its
contents is important enough.  This is usually encoded using a trailing
slash, but I'd rather not worry about that detail in a program.

What does Emacs do with double separators?  I'm at a loss thinking of
anything they could denote, but it could be useful.


Udo.
-- 
Guy Steele leads a small team of researchers in Burlington,
Massachusetts, who are taking on an _enormous_challenge_ -- create a
programming language better than Java.
-- Sun.Com




Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Andrew Pimlott
On Wed, Jul 26, 2006 at 10:32:02PM +0100, Neil Mitchell wrote:
 Sorry, I meant to say what I think wget should do.  IMO, it should have
 a conservative set of allowed characters, encode the filename into that
 
 Not enough, because of the LPT1 issue - unless you add L as a
 disallowed letter :)

hahaha!  I admit I don't know enough to say how the lpt1 issue should be
handled.  Is there any Win32 call I can make that will help me avoid
accidentally opening these magic files?  Say, if I call open with
O_CREAT | O_EXCL?  Unfortunately, I can find very little information on
how one should handle this issue.  BTW, it appears that wget itself does
not handle it. :-)

Incidentally, there seems to be another problem:  The System.IO API
provides no way to create a file, failing if it already exists (ie,
O_CREAT | O_EXCL).  This is exactly what wget needs.

BTW, I guess wget should truncate the path at some number of
characters.

 Having some considerations towards a real path, one that can be used
 on the command line is reasonable

That's a great goal, it's just that we have to draw the boundary
somewhere.  At some point, you have to be explicit that this is a path for
rsync or Emacs or whatever.

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Doug Quale
Udo Stenzel [EMAIL PROTECTED] writes:

 No, not the trailing slash.  The difference between a directory and its
 contents is important enough.  This is usually encoded using a trailing
 slash, but I'd rather not worry about that detail in a program.
 
 What does Emacs do with double separators?  I'm at a loss thinking of
 anything they could denote, but it could be useful.

Double separators invoke ange-ftp mode from find-file.


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Jeremy Shaw
At Wed, 26 Jul 2006 23:07:36 +0200,
Udo Stenzel wrote:

 What does Emacs do with double separators?  I'm at a loss thinking of
 anything they could denote, but it could be useful.

You mean like,

/path/to/somewhere//with/double/seperator

If so, it treats it as if you had typed in:

/with/double/seperator

That can be useful if you do C-x C-f and you wish to ignore the
default path it brings up. (Of course, it is only one character more
to hit M-DEL).
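
As a rough model of the behaviour described here (not of Emacs itself,
whose rules are more involved), the "restart at the last double separator"
reading can be written as a small pure function. The name
afterDoubleSeparator is made up for this sketch.

```haskell
import Data.List (tails)

-- Keep only what follows the last "//", including the second '/'.
-- Paths without a double separator are returned unchanged.
afterDoubleSeparator :: String -> String
afterDoubleSeparator s =
  case [t | ('/' : t@('/' : _)) <- tails s] of
    [] -> s
    ts -> last ts
```

Applied to the example above, it returns /with/double/seperator. This is
precisely the sort of program-specific interpretation that a general path
type would have to either ignore or model in a separate type.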

j.



Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-26 Thread Chris Kuklewicz
First, http://java.sun.com/j2se/1.5.0/docs/api/java/io/File.html may be useful
to compare to.  I learned that C:\ is an absolute path and C: is a relative one.


What are the use cases? I can see four different types of file paths that one 
will want to manipulate.  The following are all pure functions:


(Main) Local system compatibility: I must understand all the local files
  1 create roots of the local filesystem(s) (e.g. C:\)
  2 remove the last file or path element (easy except for symlinks).
  3 create a new file or directory name at the end of a path and know that
    this is valid for the local filesystem (i.e. has no invalid or
    unencodable characters).
  4 Make a way to get the local list of invalid characters
  5 Provide an isValid test (and maybe a rootValid test)
  6 Provide an invalidToSuggestedValid function that applies some policy for
    ensuring new paths can be coerced into a valid form. (Aside from LPT1
    insanity?)  The URL solution of %FF from RFC 1630 could be used.
  7 parsing the names of paths and files with respect to the local system,
    so you can handle a directory listing. (Unicode problems go here)


(Secondary) Maximal cross platform compatibility: I want mine to work everywhere
  Provide 2-6 but for a conservative union of all bad characters.  Handling the 
roots would be trickier.


(Tertiary) Specific platform compatibility: I want to work with platform Foo
  Provide 2-6 but for the platform the user specifies.  This may or may not be 
the current platform.


(Special) Handle conversion to and from file:// URI's

The (Secondary) could be accomplished as a special case of (Tertiary) by
specifying the platform "Most", for instance.


The (Main) could be accomplished as a special case of (Tertiary) by specifying
the platform "Local".


None of the above depend on IO.  None of the above really care about String vs 
ADT.  The only one that truly and deeply cares about character set encoding is 
#7 on the local system.  Mainly, the above just provides sets of invalid characters.


A makeCanonical pure function could remove "." and ".." in the syntactic way.
But I can't see what else it could do without IO.
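
A sketch of that purely syntactic step, working on a list of segments
rather than on a string. canonSegments is a hypothetical name. Note that
the result can differ from what the system would resolve when a removed
segment is a symlink, which is the objection raised elsewhere in this
thread.

```haskell
-- Remove "." segments and cancel each ".." against the segment before it.
-- Leading ".." segments of a relative path cannot be cancelled and are
-- kept.  No filesystem access is involved.
canonSegments :: [String] -> [String]
canonSegments = reverse . foldl step []
  where
    step acc "." = acc
    step (prev : acc) ".." | prev /= ".." = acc
    step acc seg = seg : acc
```

For example, canonSegments ["a", ".", "b", "..", "c"] gives ["a", "c"],
while canonSegments ["..", "a"] is left as it is.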


Any IO based function can only be part of the (Main) Local system compatibility 
domain of operations.  And the guarantees are weak due to race conditions.


E.g. makeCanonical_IO is a fancier operation that removes "." and ".."
based on symlinks, and matches upper/lower case, based on what is in the filesystem.



Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-25 Thread Andrew Pimlott
[Sorry for the late reply.]

On Wed, Jul 19, 2006 at 03:16:48AM +0100, Neil Mitchell wrote:
  I want to make sure a filename is valid. For example, "prn" and "con"
 This is another rat's nest, so I suggest that it be dealt with
 separately from the basic filepath module.  The notion of valid is
 squishy:  It depends entirely on what you intend to do with the path.
 
 It's a rat's nest to do it properly, but some very basic idea of "does
 this path have things which there is no way could possibly be in a
 file" - for example "c:\|file" is a useful thing to have.

This seems to encourage the classic mistake of checking "not known bad"
rather than "known good".  "known bad" is rarely useful in my
experience.  What use case do you have in mind?

  In this library proposal, there are a bunch of xxxDrive functions
  that many Unix-oriented programmers are sure to ignore because they are
  no-ops on Unixy systems. Even on Windows, they are not very useful:
 
 I strongly agree about this.  The temptation in path modules seems to be
 to throw in everything you can think of (without specifying any of it
 precisely), just in case someone finds it useful.
 
 The drive functions stand on their own as a chunk, and are possibly
 not well suited to a Posix system, but are critical for a Windows
 system.

Why are they critical for portable code?  I am fine with
Windows-specific functions, but I think it's a mistake to bundle them
with portable functions.  (In my design, I have separate types for Windows
and Unix paths, and imagine full support for Windows-specific
operations, but only on the Windows type.)

 I have tried to specify the functions precisely, and I use this
 specification as a test suite. Currently there are 114 properties in
 this test suite, all can be seen on the haddock documentation. If you
 consider any function to be ambiguously specified, please say which
 one and I'll add extra tests until it gives you no surprises at all.

My criticism is that your properties are all specified in terms of
string manipulation.  The whole point of paths is that they are
interpreted by the system, so if you neglect to say what your operations
mean to the system, what have you specified?

Here are some specific cases I take issue with.  (Quotes are from your
generated docs.)  Sorry if I seem to be piling it on, but I think these
matters are important for a good path library.

 pathSeparator :: Char
 The character that separates directories.

So what do I do with this?  If I need it, it seems like the module has
failed.

 splitFileName :: FilePath -> (String, String)
 Split a filename into directory and file.

Which directory and which file?

 splitFileName "bob" == ("", "bob")

"" is not a directory.

 Windows: splitFileName "c:" == ("c:","")

"c:" is arguably not a directory.  (Consider that "dir c:" lists the
current directory on c:, not c:\)

 getFileName "test/" == ""

"" is not a filename.

Also, it looks from this that you treat paths differently depending on
whether they end in a separator.  Yet this makes no difference to the
system.  That seems wrong to me.

 setFileName :: FilePath -> String -> FilePath
 Set the filename.

This is vague to me.  Eg, what does it do with "/", which has no
filename?

 getDirectory :: FilePath -> FilePath
 Get the directory name, move up one level.

What does this mean, in the presence of dots and symlinks?

 normalise :: FilePath -> FilePath
 Normalise a file

As Simon asked, when is this safe to use?

 equalFilePath :: FilePath -> FilePath -> Bool
 Equality of two FilePaths. If you call fullPath first this has a much
 better chance of working. Note that this doesn't follow symlinks or
 DOSNAM~1s.

As you acknowledge, it's a crap-shoot.  So what's the point?

 isValid :: FilePath -> Bool
 Is a FilePath valid, i.e. could you create a file like it?

There are a whole host of reasons you might not be able to create a
file.  Which ones does this address?

 I tried to export a minimal set of operations that seem to me sufficient
 for everything not very platform-specific (though I am interested in
 counterexamples):
 
 Anything to do with file extensions? Its also important (I feel) for
 people to have easy access to common operations, but I guess that is a
 design decision.

I think of that as a separate module, because extensions have no meaning
to the system and can be done with portable, functional code, as far as
I understand.

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-25 Thread Neil Mitchell

Hi


 It's a rat's nest to do it properly, but some very basic idea of "does
 this path have things which there is no way could possibly be in a
 file" - for example "c:\|file" is a useful thing to have.

This seems to encourage the classic mistake of checking not known bad
rather than known good.  known bad is rarely useful in my
experience.  What use case do you have in mind?


wget on windows saves web pages such as
http://www.google.com/index.html?q=haskell; to the file
index.html?q=haskell. This just doesn't work, and is the main reason
I added this in. I don't think it will be a commonly used operation.
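
A minimal sketch of the kind of pure "known bad" check described here. This
is not the library's isValid; the names windowsBadChars, hasNoKnownBadChar
and scrub are hypothetical, and the character list is the set Windows
reserves inside a single file name.

```haskell
-- Characters Windows does not allow inside a single file name component.
-- (Assumption for this sketch: we check one component, not a whole path.)
windowsBadChars :: String
windowsBadChars = "<>:\"/\\|?*"

-- False means "definitely not usable as a Windows file name".
-- True only means that no known-bad character was found.
hasNoKnownBadChar :: String -> Bool
hasNoKnownBadChar name =
  not (null name) && all (`notElem` windowsBadChars) name

-- Replace each known-bad character with '_' (an arbitrary choice).
scrub :: String -> String
scrub = map (\c -> if c `elem` windowsBadChars then '_' else c)
```

With these definitions the wget file name above is flagged by
hasNoKnownBadChar because of the '?', and scrub turns it into a name
without reserved characters.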



> > The drive functions stand on their own as a chunk, and are possibly
> Why are they critical for portable code?  I am fine with
> Windows-specific functions, but I think it's a mistake to bundle them
> with portable functions.

I agree, and have now removed them.

> My criticism is that your properties are all specified in terms of
> string manipulation.  The whole point of paths is that they are
> interpreted by the system, so if you neglect to say what your operations
> mean to the system, what have you specified?

True, but at the same time specifying what something means with
respect to a filesystem is very hard :) If you had any insight how
this could be done I'd be interested.


> > pathSeparator :: Char
> > The character that separates directories.
> So what do I do with this?  If I need it, it seems like the module has
> failed.

Hopefully no one will ever use it. It's part of the low-level functions
that the FilePath module builds on. However, pragmatically, someone
somewhere will have a use for it, and the second they do they'll just
write '/', and at that point we've lost.



> > splitFileName :: FilePath -> (String, String)
> > Split a filename into directory and file.
> Which directory and which file?

OK, that's probably the wrong description. "Splits off the last filename"
would be a better description, leaving the rest.


> > splitFileName "bob" == ("", "bob")
> "" is not a directory

No, it's "the rest" in this context.


> > Windows: splitFileName "c:" == ("c:","")
> "c:" is arguably not a directory.  (Consider that "dir c:" lists the
> current directory on c:, not c:\)

It's a bit weird on Windows, but certainly "c:" isn't a FileName, so
that's the reason for this decision.


> > getFileName "test/" == ""
> "" is not a filename.

But "test/" is certainly not a file.


> Also, it looks from this that you treat paths differently depending on
> whether they end in a separator.  Yet this makes no difference to the
> system.  That seems wrong to me.

That was something I thought over quite a while. If the user enters
"directory/" then they do not mean the file called "directory", they
mean the directory called "directory". And in Windows certainly you
can't open a file called "file/".


> > setFileName :: FilePath -> String -> FilePath
> > Set the filename.
> This is vague to me.  Eg, what does it do with "/", which has no
> filename?

"/" as the second element? I guess it's calling it out of spec if you
use anything but a valid filename as the second argument, and the
behaviour is undefined. If you do need to do something like that, then
combine is the function.


> > getDirectory :: FilePath -> FilePath
> > Get the directory name, move up one level.
> What does this mean, in the presence of dots and symlinks?

It gets a parent directory, there may be one, but the one returned
will be a parent.


> > normalise :: FilePath -> FilePath
> > Normalise a file
> As Simon asked, when is this safe to use?

Let me think, and then work on it so the answer is "always".


> > equalFilePath :: FilePath -> FilePath -> Bool
> > Equality of two FilePaths. If you call fullPath first this has a much
> > better chance of working. Note that this doesn't follow symlinks or
> > DOSNAM~1s.
> As you acknowledge, it's a crap-shoot.  So what's the point?

It's a case of reality: at the moment people use == to test if two file
paths are equal; at least this is a better test.



> > isValid :: FilePath -> Bool
> > Is a FilePath valid, i.e. could you create a file like it?
> There are a whole host of reasons you might not be able to create a
> file.  Which ones does this address?

I have added documentation which hopefully shows exactly what it tries
to address.



> I think of that as a separate module, because extensions have no meaning
> to the system and can be done with portable, functional code, as far as
> I understand.

Not really, what about getExtension "file.ext\lump" - the answer is ""
on Windows and ".ext\lump" on Posix. This library isn't just a
portability layer (although it does encompass that), it's mainly meant
to make the things people do with filepaths easier, and by seducing
them with ease of use, subtly tack in cross-platform portability.
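
To make the platform dependence concrete, here is a rough sketch of an
extension function parameterised over the separator characters. It is not
the library's getExtension; extensionWith, posixSeps and windowsSeps are
hypothetical names introduced only for this illustration.

```haskell
-- Separator characters per platform (assumed for this sketch).
posixSeps, windowsSeps :: [Char]
posixSeps   = "/"
windowsSeps = "\\/"

-- Extension of the last path component, where "last component" depends on
-- which characters count as separators. Returns "" if there is no dot.
extensionWith :: [Char] -> FilePath -> String
extensionWith seps path =
  case break (== '.') (reverse file) of
    (revExt, '.' : _) -> '.' : reverse revExt
    _                 -> ""
  where
    -- Everything after the final separator.
    file = reverse (takeWhile (`notElem` seps) (reverse path))
```

On "file.ext\lump" the Posix separator set treats the whole string as one
component, so the extension is ".ext\lump"; with the Windows set the last
component is "lump", which has no extension.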

Thanks

Neil


Re: [Haskell-cafe] Re: ANN: System.FilePath 0.9

2006-07-22 Thread Brian Hulley

Neil Mitchell wrote:

And if someone wants to define a new and better FilePath type, I
would prefer something more abstract, such as a list of Path
components, with functions to serialize it as a String and to parse
it from a String.


A list of path components is just not enough, I'm afraid. What about
extensions? What about drives? If you want an abstract type it will
probably need to be entirely abstract, rather than with some exposed
structure.


Why not just delete Unix and Windows from the equation altogether, and 
define a simple Haskell file system with something like:


newtype Path a = Path [a]
newtype Filename a = Filename a
data Origin a -- some abstract type
  deriving Eq -- this would be nice if it is possible to implement

data IString a => FileSpecifier a =
  FileSpecifier !(Origin a) !(Path a) !(Filename a)


instance IString ByteString.Char8 ...
instance IString String ...

Origins could be created by a factory appropriate to the underlying 
operating system (they would represent drives or volumes or mount points) - 
in any case a drive can't be mentioned in a program or the program wouldn't 
be portable!
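
One way the outline above might be made to compile, as a sketch only. The
IString class body, the concrete representation of Origin, and the render
function are assumptions that do not appear in the message; in particular
Origin is shown as a plain label here so that Eq can be derived, whereas
the proposal keeps it abstract.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Data.List (intercalate)

-- Hypothetical stand-in for the IString constraint.
class IString a where
  toStr :: a -> String

instance IString String where
  toStr = id

newtype Path a     = Path [a]   deriving (Eq, Show)
newtype Filename a = Filename a deriving (Eq, Show)

-- Abstract in the proposal; a simple wrapper in this sketch.
newtype Origin a   = Origin a   deriving (Eq, Show)

data FileSpecifier a =
  FileSpecifier !(Origin a) !(Path a) !(Filename a)
  deriving (Eq, Show)

-- Serialise for display, using '/' purely as an illustrative separator.
render :: IString a => FileSpecifier a -> String
render (FileSpecifier (Origin o) (Path ds) (Filename f)) =
  intercalate "/" (toStr o : map toStr ds ++ [toStr f])
```

The class constraint sits on render rather than on the data declaration,
since datatype contexts are deprecated in current GHC.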


Although even with a nice rational reconstruction the monstrously unfortunate 
fact remains that Windows is case insensitive (how impossibly moronic!!!) 
and Unix isn't so it is not possible to write code that will work the same 
for both OS's if one is required to use filenames that will look the same in 
other OS apps (ie the trick of encoding the complete Unicode char set in 
terms of legal filename chars is probably not acceptable).


Anyway this is probably straying too far from what you are trying to do at 
the moment.


Regards, Brian.

--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.

http://www.metamilk.com 




Re: [Haskell-cafe] Re: ANN: System.FilePath 0.9

2006-07-21 Thread David Menendez
Neil Mitchell writes:

   We should avoid referring to $PATH as the path, since we
  already have FilePath.
 Agreed, but I couldn't come up with a better name, if anyone has any
 suggestions.

searchPath?
-- 
David Menendez [EMAIL PROTECTED] http://www.eyrie.org/~zednenem/


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-20 Thread Neil Mitchell

Hi


> > I want to make sure a filename is valid. For example, "prn" and "con"
> This is another rat's nest, so I suggest that it be dealt with
> separately from the basic filepath module.  The notion of "valid" is
> squishy:  It depends entirely on what you intend to do with the path.


It's a rat's nest to do it properly, but some very basic idea of "does
this path have things which there is no way could possibly be in a
file" - for example "c:\|file" is a useful thing to have. By making it
pure, there is no risk of the result being different. I see the
isValid guarantee more as "False means it definitely isn't valid",
rather than the other way round.



> > In this library proposal, there are a bunch of "xxxDrive" functions
> > that many Unix-oriented programmers are sure to ignore because they are
> > no-ops on Unixy systems. Even on Windows, they are not very useful:
>
> I strongly agree about this.  The temptation in path modules seems to be
> to throw in everything you can think of (without specifying any of it
> precisely), just in case someone finds it useful.


The drive functions stand on their own as a chunk, and are possibly
not well suited to a Posix system, but are critical for a Windows
system. Ignoring these, which would you consider worthy of removal?
Some are strictly redundant, but quite useful - for example
isAbsolute/isRelative which are the negation of each other.

I have tried to specify the functions precisely, and I use this
specification as a test suite. Currently there are 114 properties in
this test suite, all can be seen on the haddock documentation. If you
consider any function to be ambiguously specified, please say which
one and I'll add extra tests until it gives you no surprises at all.
QuickCheck rules :)


> I tried to export a minimal set of operations that seem to me sufficient
> for everything not very platform-specific (though I am interested in
> counterexamples):


Anything to do with file extensions? It's also important (I feel) for
people to have easy access to common operations, but I guess that is a
design decision.

Thanks

Neil


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-20 Thread Neil Mitchell

Hi,


> > In this library proposal, there are a bunch of "xxxDrive" functions ..
> > [remove them]
> I strongly agree about this.


I have decided you are right, on Windows getDrive x can be written simply as:

getDrive x | isRelative x = ""
           | otherwise = head (getDirectories x)

And given that people probably shouldn't be playing with drives
anyway, if they do want to, they can do a bit more work. All the
drive-related functions are therefore removed from the interface.

I have also added a canonicalPath function, support for spotting
"file\con" as invalid and fixing it, support for \\?\ paths (if you
don't know what they are, don't look it up, they are quite painful!)
and a few very obscure corner cases which broke some of the
properties.

Anyone have any other thoughts or comments?

Thanks

Neil


Re: [Haskell-cafe] Re: ANN: System.FilePath 0.9

2006-07-18 Thread Piotr Kalinowski

On 18/07/06, Stephane Bortzmeyer [EMAIL PROTECTED] wrote:

For instance, many lazy (not in the Haskell meaning) programmers
believe that the path is safe if it does not include .. but it is
false (hint: ../foo/bar is a legal path on Unix).


I believe this does not cause trouble. If it is a shell expression, it
will go one level up. However, when treated as a filesystem path alone
it will stay beneath. After all, the filesystem does not interpret
quotation marks.

Regards,
Piotr Kalinowski
--
Intelligence is like a river: the deeper it is, the less noise it makes


Re: [Haskell-cafe] Re: ANN: System.FilePath 0.9

2006-07-18 Thread Chris Kuklewicz

Stephane Bortzmeyer wrote:

On Mon, Jul 17, 2006 at 03:07:51AM +0100,
 Neil Mitchell [EMAIL PROTECTED] wrote 
 a message of 64 lines which said:



How about adding something like restrictFilePaths :: FilePath -> IO
() which will restrict the area that can be played with to that
beneath the given FilePath?


If someone does so, be aware that it is *not* trivial to write it
securely.

For instance, many lazy (not in the Haskell meaning) programmers
believe that the path is safe if it does not include .. but it is
false (hint: ../foo/bar is a legal path on Unix).


That is a legal path if your Haskell program invokes (perhaps indirectly) a Unix 
shell.  But if you can inject strings into a shell invocation then it is 
obviously impossible to do anything about limiting it to be weaker than the IO 
monad.


--
Chris


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-18 Thread Andrew Pimlott
On Sun, Jul 16, 2006 at 08:43:31PM -0500, Brian Smith wrote:
 I kind of expect that a Haskell library for file paths will use the
 type system to ensure some useful properties about the paths.

It's a nice idea, but I claim that it's a rat's nest.  Path semantics,
when you look hard at them, are too vague, confusing, and subtle to
encode usefully in types.  And I think there's a better way to do what
you're asking for.

 For example, when writing security-conscious software I often want to be
 able to distinguish between absolute, ascending (paths with leading
 ../ and similar items), and descending paths (paths that contain no
 ../).

My suggestion is to specify your own syntax and semantics for the input
to your software, which I assume is coming in over the network or some
other trust boundary.  By resisting the temptation to piggy-back on
native paths, you control what your paths mean, instead of leaving
it to the system.  Further, for many applications, users don't really
care that their paths map to filesystem paths.  If you keep them
separate, you can even change your storage from the filesystem to
something else.

In your case, either define your path syntax not to allow "..", or
define your own simple normalization rules, and apply them before you
try to combine a user-supplied path with a native system path.  Eg, the
user gives "../a/../b/c/d/..", and you either reject it or turn it into
"/b/c", and then append it to your root, eg "/root/a/b".  Of course, you
might make further restrictions in your paths, like only allowing
letters, etc.
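
The two options in this paragraph can be sketched as follows. This is only
an illustration under the assumption that the application's own path syntax
uses '/' as its separator; the names splitOnSlash, rejectDots, normaliseUser
and underRoot are hypothetical and not part of any library discussed here.

```haskell
import Data.List (intercalate)

-- Split on '/', the application's own separator, whatever the host uses.
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : splitOnSlash rest

-- Option 1: reject any input that mentions "..".
rejectDots :: String -> Maybe [String]
rejectDots s
  | ".." `elem` comps = Nothing
  | otherwise         = Just (filter (`notElem` ["", "."]) comps)
  where comps = splitOnSlash s

-- Option 2: resolve "." and ".." purely syntactically. A ".." at the top
-- is dropped, so the result can never climb above the application root.
normaliseUser :: String -> [String]
normaliseUser = reverse . foldl step [] . splitOnSlash
  where
    step acc ""   = acc
    step acc "."  = acc
    step acc ".." = drop 1 acc
    step acc c    = c : acc

-- Only after normalising is the result joined onto the native root.
underRoot :: FilePath -> [String] -> FilePath
underRoot root comps = intercalate "/" (root : comps)
```

For the input "../a/../b/c/d/.." normaliseUser yields the components b and
c, and underRoot "/root/a/b" then gives "/root/a/b/b/c".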

 I want to make sure a filename is valid. For example, prn and con
 are not valid path elements for normal files on Windows, certain
 characters are not allowed in filenames (depending on platform), some
 platforms may require paths to be escaped in different ways. I see
 there is a isValid function and even a (magical) makeValid
 function, but they do not report what was wrong with the filename in
 the first place. Furthermore, it isn't clear from the documentation how
 these functions determine whether a filename is valid.

This is another rat's nest, so I suggest that it be dealt with
separately from the basic filepath module.  The notion of valid is
squishy:  It depends entirely on what you intend to do with the path.
There are many cases to consider: on Linux, which characters are allowed
depends on the filesystem type, and special files may appear anywhere
and have any name--the only way to test for them is by doing IO.  Oh,
and who knows if the situation might change between when you call
isValid and when you actually perform the operation?

 IMO, safety is the most important issue regarding file paths and it is
 not addressed in this library as far as I can see. Writing code to
 handle these issues is tedious, error-prone, and boring to write
 despite being critical. It isn't the kind of code that you want to just
 download off of some guy's webpage. Basically, it is exactly the type
 of thing that belongs in a standard library.

My approach is not to take a filepath and say, is it safe? (which
can't be meaningfully answered in general anyway), but to construct
paths in a careful manner that is safe for your application.

 In this library proposal, there are a bunch of xxxDrive functions
 that many Unix-oriented programmers are sure to ignore because they are
 no-ops on Unixy systems. Even on Windows, they are not very useful:

I strongly agree about this.  The temptation in path modules seems to be
to throw in everything you can think of (without specifying any of it
precisely), just in case someone finds it useful.  I posted a more
minimalist module a while back:

http://haskell.org/pipermail/libraries/2006-February/004890.html

I tried to export a minimal set of operations that seem to me sufficient
for everything not very platform-specific (though I am interested in
counterexamples):

currentPath :: p
prefixes :: p -> [(p, ChildName)]
addChild :: Monad m => p -> ChildName -> m p
append :: Monad m => p -> p -> m p
getChildren :: p -> IO [p]
canonicalize :: p -> IO p

See the referenced message for explanation.

Andrew


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-17 Thread David Roundy
On Mon, Jul 17, 2006 at 03:07:51AM +0100, Neil Mitchell wrote:
 Hi Brian,
 
 I kind of expect that a Haskell library for file paths will use the
  type system to ensure some useful properties about the paths.

 I am specifically concentrating on type FilePath = String, because
 that is how it is defined by Haskell 98. And consequently that's how
 it works with readFile/writeFile/appendFile etc.
 
 Perhaps a far better solution to this would not be to hack these kind
 of guarantees in at the filepath level, but have a restricted IO monad
 that only lets you perform operations inside certain directories, or
 only lets you read/write files. I know that both House and Halfs use
 these techniques. Without too much effort Yhc (for example) could be
 modified to perform restricted IO operations (only on certain
 directories etc).

 You seem to want to distinguish between relative, relative down only
 and absolute paths. By putting this in the filepath, and having
 different types for each, you pretty much guarantee that all standard
 functions will operate on all 3 types of path, so you don't gain any
 security that way, since mistakes will still slip through. How about
 adding something like restrictFilePaths :: FilePath -> IO () which
 will restrict the area that can be played with to that beneath the
 given FilePath?

Darcs also does something similar (typeclasses for control of IO
actions), and this is certainly the way to go.  However, I also agree
that type distinctions between paths would be nice.  My preference has
long been that the FilePath should be a class rather than a type.
Then one could have single IO functions that accept restricted and
unrestricted file paths, and other ones that accept only restricted
file paths, so you could get compile-time checking that your safe
chroot monad won't die at runtime.
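
A rough sketch of what "FilePath as a class" could look like. Everything
here (PathLike, AnyPath, Restricted, mkRestricted, readAny, writeSandboxed)
is a hypothetical name made up for illustration; it is not how darcs or any
library mentioned in this thread implements the idea.

```haskell
-- Any path-like type can be turned into the String the system expects.
class PathLike p where
  toFilePath :: p -> FilePath

-- An ordinary, unchecked path.
newtype AnyPath = AnyPath FilePath

-- A path known to be relative and free of "..". In a real module the
-- constructor would stay unexported so mkRestricted is the only way in.
newtype Restricted = Restricted FilePath

instance PathLike AnyPath    where toFilePath (AnyPath p)    = p
instance PathLike Restricted where toFilePath (Restricted p) = p

-- Split on '/' (assumed separator for this sketch).
components :: FilePath -> [String]
components s = case break (== '/') s of
  (c, [])       -> [c]
  (c, _ : rest) -> c : components rest

mkRestricted :: FilePath -> Maybe Restricted
mkRestricted p
  | null p                   = Nothing
  | take 1 p == "/"          = Nothing
  | ".." `elem` components p = Nothing
  | otherwise                = Just (Restricted p)

-- Accepts both kinds of path.
readAny :: PathLike p => p -> IO String
readAny = readFile . toFilePath

-- Accepts only restricted paths; passing an AnyPath is a compile-time error.
writeSandboxed :: FilePath -> Restricted -> String -> IO ()
writeSandboxed root p = writeFile (root ++ "/" ++ toFilePath p)
```

The point of the split is that the check happens once, in mkRestricted, and
the type of writeSandboxed then rules out unchecked paths statically.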
-- 
David Roundy


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-17 Thread Brian Smith
Hi Neil,

On 7/17/06, Neil Mitchell [EMAIL PROTECTED] wrote:
> Hi Brian,
>
> You sent this email just to me, and not to the list. If you intended
> to send to the list then feel free to forward my bits on to the list.
>
> > I know that FilePath is defined by Haskell '98 as a String and so it
> > cannot be changed. So, perhaps a new type or class should be created
> > for this library (hereafter GoodPath, although I am not suggesting
> > that is the best name).
>
> The problem is people will have to marshal their data into this
> GoodPath, and marshal it out again. When people can shortcut that
> marshalling, as the current readFile/writeFile definitions ensure they
> can, they will. At that point you lose all safety because people will
> abuse it.

I disagree. It would be trivial to create a new module that exported new
definitions of file IO actions that operated on GoodPath instead of
FilePath, transparently delegating to the original readFile/writeFile/etc.
until they could be removed in the future. This would also support the
SuperFilePath idea you mentioned.

Another thing I thought of would be a canonicalPath IO action
(canonicalPath :: FilePath -> IO FilePath) that returns a FilePath that
implements case-preserving-case-insensitive matching. For example, if
there is a file named "Hello There.txt" in C:\, then
(canonicalPath "c:\hello there.txt") would give "C:\Hello There.txt".

I think that the xxxDrive functions should only be exported from
System.FilePath.Windows and not System.FilePath since it is unclear as
to how they should be used effectively by cross-platform software.

- Brian


Re: [Haskell-cafe] RE: ANN: System.FilePath 0.9

2006-07-17 Thread Neil Mitchell

Hi


> I disagree. It would be trivial to create a new module that exported new
> definitions of file IO actions that operated on GoodPath instead of
> FilePath, transparently delegating to the original readFile/writeFile/etc.
> until they could be removed in the future. This would also support the
> SuperFilePath idea you mentioned.

Yes it would, but because readFile etc. are in the Prelude it's not
easy to not have them included. If someone was to write a
System.SuperFilePath module and an IO.SuperFilePath module that would
be great! I have considered it myself, but unfortunately don't have
enough time at the moment.

The advantage of moving to FilePath now is that it's entirely
non-breaking for anything, and once we have SuperFilePath, it makes it
easier to migrate because (hopefully!) there will be fewer functions
prodding directly at FilePaths as strings.


> Another thing I thought of would be a canonicalPath IO action
> (canonicalPath :: FilePath -> IO FilePath) that returns a FilePath that
> implements case-preserving-case-insensitive matching. For
> example, if there is a file named "Hello There.txt" in C:\, then
> (canonicalPath "c:\hello there.txt") would give "C:\Hello There.txt".

Yes, that's a really good idea - and in fact when I wrote a FilePath
module for Visual Basic (a long long time ago), I had such a function
in it. I will make sure I add that tomorrow.


> I think that the xxxDrive functions should only be exported from
> System.FilePath.Windows and not System.FilePath since it is unclear as to how
> they should be used effectively by cross-platform software.

I would say they shouldn't be used at all, but it is true that
Posix.setDrive "c:" is a bit poorly defined. I will think this idea
over, maybe the drive functions shouldn't be exported under either the
general one or under the Posix, but it breaks a nice symmetry that the
library has...

I have added a wiki page discussing System.FilePath,
http://haskell.org/haskellwiki/FilePath, which is more a personal todo
list, but if people want to summarise/propose things then feel free :)

Thanks

Neil