Re: Filename literals

2009-08-19 Thread David Green

On 2009-Aug-18, at 7:20 am, Timothy S. Nelson wrote:

On Tue, 18 Aug 2009, David Green wrote:
Some ways in which different paths can be considered equivalent:   
Spelling: ... Simplification: ... Resolution: ... Content-wise: ...
	Ok, my next commit will have canonpath (stolen directly from p5's  
File::Spec documentation), which will do "No physical check on the
filesystem, but a logical cleanup of a path", and realpath (idea
taken from p5's Cwd documentation), which will resolve symlinks,  
etc, and provide an absolute path.  Oh, and resolvepath, which  
does both.  I'm not quite sure I followed all your discussion above  
-- have I left something out?
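
To make the canonpath/realpath distinction concrete, here is a minimal sketch in Python rather than Perl 6 (none of the proposed Perl 6 names exist yet, so this is only an analogy). The function names canonpath and realpath mirror the ones mentioned above; the demo() helper and its directory names are made up for illustration, and it assumes a filesystem that supports symlinks.

```python
import os
import posixpath
import tempfile


def canonpath(path: str) -> str:
    """Logical cleanup only: the filesystem is never consulted."""
    return posixpath.normpath(path)


def realpath(path: str) -> str:
    """Physical resolution: follows symlinks and returns an absolute path."""
    return os.path.realpath(path)


def demo() -> tuple[bool, bool]:
    """Build link -> target in a scratch directory and compare both cleanups."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)  # the temp dir itself may sit behind a symlink
        target = os.path.join(tmp, "target")
        os.makedirs(os.path.join(target, "sub"))
        link = os.path.join(tmp, "link")
        os.symlink(target, link)

        messy = os.path.join(link, "sub", "..")
        # Logical cleanup keeps the symlink name; physical resolution replaces it.
        return canonpath(messy) == link, realpath(messy) == target
```

The logical version is cheap and works on paths that do not exist; the physical version needs the filesystem and can give a different answer whenever a symlink is involved.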


I think there's a difference between canonical as in a webpage with  
link rel=canonical, and cleanup as in Windows turning PROGRA~1  
into Program Files.  There could also be other types of  
normalisation depending on the FS, but we probably shouldn't concern  
ourselves with them, other than having some way to get to such native  
calls.


	Anyway, my assumption is that there should be a number of  
comparison options.  Since we do Str, we should get string  
comparison for free.  But I'm expecting other options at other  
levels, but have no idea how or what at this point.


As Leon Timmermans keeps reminding us, that really should be delegated  
to the OS/FS.  I think $file1 =:= $file2 should ask the OS whether it  
thinks those are the same item or not (it can check paths, it can  
check inodes, whatever is its official way to compare file-thingies).   
Similarly, $file1.name === $file2.name should ask the OS whether it  
thinks those names mean the same thing.  And if you want to compare  
the canonical paths or anything else, just say $file1.name.canonical  
=== $file2.name.canonical, or use 'eq', or whatever you want to do,  
just do it explicitly.
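
As a rough illustration of "ask the OS" versus "compare the names", here is a small Python sketch (an analogy only, not the proposed Perl 6 API). os.path.samefile is a real standard-library call; same_item and demo are hypothetical helper names, and the demo assumes the temporary directory supports hard links.

```python
import os
import tempfile


def same_item(path_a: str, path_b: str) -> bool:
    """Delegate the identity question to the OS instead of comparing strings."""
    return os.path.samefile(path_a, path_b)


def demo() -> tuple[bool, bool]:
    """Return (names are equal, OS says same file) for a file and its hard link."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        alias = os.path.join(tmp, "alias.txt")
        with open(original, "w") as fh:
            fh.write("hello\n")
        os.link(original, alias)  # second name for the same file
        return original == alias, same_item(original, alias)
```

The two questions give different answers here: the names differ as strings, yet the OS reports a single underlying file.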


	According to my last commit, p{} will return a Path object that  
just stores the path, but has methods attached for accessing all the  
metadata.  But it doesn't do file opening or things like that  
(unless you use the :T and :B thingies, which read the first block  
and try to guess whether it's text or binary -- these are in Perl 5  
too).
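
For readers who have not met the :T/:B style of test, the idea can be sketched as follows in Python. This is a simplified heuristic written for illustration, not Perl's exact algorithm: the 512-byte block size, the 30% threshold, and the function names are all assumptions.

```python
def looks_like_text(block: bytes) -> bool:
    """Guess text vs. binary from the first block of a file."""
    if b"\x00" in block:
        return False  # a NUL byte is a strong hint of binary data
    try:
        block.decode("utf-8")
        return True  # valid UTF-8 (including plain ASCII and empty input)
    except UnicodeDecodeError:
        pass
    # Fall back to counting "odd" bytes: control characters and high bytes.
    odd = sum(
        1 for byte in block
        if (byte < 32 and byte not in (9, 10, 13)) or byte > 126
    )
    return odd / len(block) <= 0.30


def guess_is_text(path: str, block_size: int = 512) -> bool:
    """Open the file, read one block, and apply the heuristic."""
    with open(path, "rb") as fh:
        return looks_like_text(fh.read(block_size))
```

The point for this thread is that the test has to open and read the file, which is what sets it apart from pure metadata lookups.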


There are two things going on here: the user-friendly syntax for  
casual use, which we basically agree should be something short and  
pithy, although we have but begun to shed this bike, I'm sure.


$file = io /foo/bar;
$file = p{/foo/bar};
$file = Q:p/foo/bar/;
$file = File(/foo/bar);

However we end up spelling it, we want that to give us unified access  
to the separate inside parts:


IO::Data      # contents of file
IO::Handle    # filehandle for using manually
IO::Metadata
IO::Path

I'm not sure why Path isn't actually just part of IO::Metadata...  
maybe it's just handy to have it out on its own because pathnames are  
so prominent.  In any case, $file.size would just be shorthand for  
something like $file.io.metadata{size}.  The :T and :B tests probably  
ought to be part of IO::Data, since they require opening the file to  
look at it; I'd rather put them there (vs. ::Metadata, which is all  
outside info) since plain ol' $file abstracts over that detail  
anyway.  You can say $file.r, $file.x, $file.T, $file.B, and not care  
where those tests live under the hood.


We might actually want to distinguish IO::Metadata::Stat from  
IO::Metadata::Xattr or something... but that's probably too FS- 
specific.  I don't think I mind much whether it's IO::Path or  
IO::Metadata::Path, or whether they both exist as synonyms.


	I think we want many of the same things, I'm just expressing them  
slightly differently.  Let's keep working on this, and hopefully we  
end up with something great.


Yes.  A great mess!  Er, wait, no

And there's no perfect solution, but it would be useful for Perl to
stick as closely to the FS/OS's idea of types as it can.  Sometimes
that would mean looking up an extension; it might mean using (or  
emulating) file magic; it might mean querying the FS for a MIME- 
type or a UTI.  After all, the filename extension may not actually  
match the correct type of the file.


	My suggestion would be that it's an interesting idea, but should  
maybe be left to a module, since it's not a small problem.  Of  
course, I'm happy to be overruled by a higher power :).  I'd like  
the feature, I'm just unsure it deserves core status.


Well, it's all modules anyway... certainly we'll have to rely on  
IO::Filesystem::XXX, but I do think this is another area to defer to  
the OS's own type-determining functions rather than try to do it all  
internally.  What we should have, though, is a standard way to  
represent the types in Perl so that users know how to deal with them.   
I think roles are the obvious choice: if the OS tells you that a file  
is HTML, then $file would do IO::Datatype::HTML, which means in turn  
it would also do IO::Datatype::Plaintext, and so on.
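
One way to picture the roles idea, sketched in Python with class inheritance standing in for role composition. Only mimetypes.guess_type is a real library call; the class names, the ROLES table, and the tiny content sniffer are hypothetical stand-ins for whatever the OS or a module would really provide.

```python
import mimetypes
from typing import Optional


class Plaintext:
    """Stand-in for IO::Datatype::Plaintext."""


class HTML(Plaintext):
    """Stand-in for IO::Datatype::HTML; anything HTML is also Plaintext."""


ROLES = {"text/plain": Plaintext, "text/html": HTML}


def type_from_extension(name: str) -> Optional[str]:
    """Look the type up from the filename extension only."""
    return mimetypes.guess_type(name)[0]


def type_from_content(head: bytes) -> Optional[str]:
    """Very small 'file magic' emulation: recognise an HTML prologue."""
    start = head.lstrip().lower()
    if start.startswith(b"<!doctype html") or start.startswith(b"<html"):
        return "text/html"
    return None


def role_for(name: str, head: bytes) -> Optional[type]:
    """Prefer what the content says; fall back to the extension."""
    mime = type_from_content(head) or type_from_extension(name)
    return ROLES.get(mime)
```

With this shape, a file named notes.txt that actually contains HTML ends up with the HTML role, which covers the case where the extension does not match the real type.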


Of 

Re: Filename literals

2009-08-19 Thread Mark J. Reed
I don't think $file1.name == $file2.name should talk to the FS,
because I think File#name (or whatever)  should return a plain Str.
Having magical FilePathName objects is handy, but sometimes you want
to get the filename as a dumb string to do stringish things without
having to worry about the fact that the string started life as the
name of a file somewhere.   I could convert it explicitly, but it's
not obvious that I need to;  'name' sounds like something that should
return Str.


Re: Filename literals

2009-08-19 Thread Timothy S. Nelson

On Wed, 19 Aug 2009, Mark J. Reed wrote:


I don't think $file1.name == $file2.name should talk to the FS,
because I think File#name (or whatever)  should return a plain Str.
Having magical FilePathName objects is handy, but sometimes you want
to get the filename as a dumb string to do stringish things without
having to worry about the fact that the string started life as the
name of a file somewhere.   I could convert it explicitly, but it's
not obvious that I need to;  'name' sounds like something that should
return Str.


	$file1.name == $file2.name is kinda strange because it does a numeric 
comparison between the filenames (see S03).  Methinks you want
$file1 eq $file2 (both of which are assumed to be of type Path) which does a 
string comparison between them without consulting the filesystem.


	Having said that, you've made me realise that $file1 == $file2 might 
be the perfect operator for comparing inodes, since inodes are numbers.
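
A caveat worth keeping in mind for the inode idea: an inode number is only unique within one device, so a robust identity check compares the (device, inode) pair rather than a single number. A minimal Python sketch, with file_identity and demo as hypothetical helper names and a scratch directory that is assumed to support hard links:

```python
import os
import tempfile


def file_identity(path: str) -> tuple[int, int]:
    """Return (device, inode); the inode alone can collide across devices."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def demo() -> tuple[bool, bool]:
    """Return (hard link matches original, unrelated file matches original)."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a")
        b = os.path.join(tmp, "b")
        c = os.path.join(tmp, "c")
        for path in (a, c):
            with open(path, "w") as fh:
                fh.write("data\n")
        os.link(a, b)  # b is a second name for a
        return (
            file_identity(a) == file_identity(b),
            file_identity(a) == file_identity(c),
        )
```

So a purely numeric comparison would need to fold two numbers into one, or the comparison has to be on something richer than a single integer.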


:)

-
| Name: Tim Nelson | Because the Creator is,|
| E-mail: wayl...@wayland.id.au| I am   |
-

BEGIN GEEK CODE BLOCK
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- 
PE(+) Y+++ PGP-+++ R(+) !tv b++ DI D G+ e++ h! y-

-END GEEK CODE BLOCK-



Re: Filename literals

2009-08-19 Thread Timothy S. Nelson
	I should've mentioned, though, we're currently using the smartmatch 
operator for this, so I'm thinking maybe I'll just stick with that.


:)





Re: Filename literals

2009-08-18 Thread David Green

On 2009-Aug-17, at 8:36 am, Jon Lang wrote:

Timothy S. Nelson wrote:
   Well, my main thought in this context is that the stuff that can be
done to the inside of a file can also be done to other streams -- TCP
sockets for example (I know, there are differences, but the two are a lot
the same), whereas metadata makes less sense in the context of TCP sockets;


But any IO object might have metadata; some different from the  
metadata you traditionally get with files, and some the same, e.g.  
$io.size, $io.times{modified}, $io.charset, $io.type.



if (path{/path/to/file}.e) {
   @lines = slurp(path{/path/to/file});
}
   (I'm using one of David's suggested syntaxes above, but I'm not
closely attached to it).


I suggested variations along the line of: io /path/to/file.  It  
amounts to much the same thing, but it's important conceptually to  
distinguish a pathname from the thing it names.  (A path doesn't have  
a modification date, a file does.)  Also, special quoting/escaping  
could apply to other things, not limited to filenames.  That said, I  
don't think it's unreasonable to want to combine both operations for  
brevity, but the io-constructor should have built-in path parsing, not  
the other way around.


   I guess what I'm saying here is that I think we can do the things
without people having to worry about the objects being separate unless they
care.  So, separate objects, but hide it as much as possible.  Is that
something you're fine with?


Yes -- to me that means some class/role that wraps up all the pieces  
together, but all the separate components are still there underneath.   
But I'm not too bothered about how it's implemented as long as it's  
transparent for casual use.


my $file = io p[/some/file];
my $contents = $file.data;
my $mod-date = $file.times{modified};
my $size = $file.size;


Pathnames still are strings, so that's fine.  In fact, there are different

   As for pathnames being strings, you may be right FSVO string.  But
I'd say that, while they may be strings, they're not Str, but they do Str


Agreed, pathnames are almost strings, but worth distinguishing  
conceptually.  There should be a URL type that does Str.


Actually, there are other differences, like case-insensitivity and  
illegal chars.  Unfortunately, those depend on the given filesystem.   
As long as you're dealing with one FS at a time, that's OK; it  
probably means we have IO::Name::ext3, IO::Name::NTFS, IO::Name::HFS,  
etc.  But what happens when you cross FS-barriers?  Does a case- 
sensitive name match a case-insensitive one?  Is filename-equality not  
commutative or not transitive?  If you're looking for a filename foo  
on Mac/Win, then a file actually called FOO matches; but on Unix it  
wouldn't.


(Actually, Macs can do both IO::Name::HFS::case-insensitive and  
IO::Name::HFS::case-sensitive.  Eek.)
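
Python's pathlib happens to model exactly this problem with per-flavour path classes, which makes a handy illustration (an analogy, not a proposal for the Perl 6 names). PurePosixPath and PureWindowsPath are real classes; names_match is a hypothetical helper.

```python
from pathlib import PurePosixPath, PureWindowsPath


def names_match(a: str, b: str, flavour) -> bool:
    """Compare two names under the rules of one path flavour."""
    return flavour(a) == flavour(b)


# Same spelling question, different answer depending on the flavour.
windows_says = names_match("foo", "FOO", PureWindowsPath)
posix_says = names_match("foo", "FOO", PurePosixPath)

# Crossing flavours: pathlib simply refuses to call them equal.
cross_flavour = PurePosixPath("foo") == PureWindowsPath("foo")
```

Note the design choice pathlib made for the cross-filesystem case: paths of different flavours never compare equal, which keeps equality symmetric and transitive at the cost of never matching across the barrier.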


I'd like Perl 6's treatment of filenames to be smart enough that  
smart-matching any of these pairs of alternative spellings would  
result in a successful match.  So while I'll agree that filenames  
are string-like, I really don't want them to _be_ strings.


Well, the *files* are the same, but the pathnames are different.  I'm  
not sure whether some differences in spelling should be ignored by  
default or not.  There are actually several different kinds; S32 has a  
method realpath, but I think canonical is a better name, because  
aliases can be just as real as the canonical path, e.g. a web page  
with multiple addresses.  Or hard links rather than soft links --  
though in that case, there is no one canonical path.  It may not  
even be possible to easily tell if there is one or not.


Some ways in which different paths can be considered equivalent:
Spelling: C:\PROGRA~1, case-insensitivity
Simplification: foo/../bar/ to bar/
Resolution: of symlinks/shortcuts
Content-wise: hard links/multiple addresses

Depending on the circumstances, you might want any of those to count  
as the same file; or none of them.  We'll need methods for each sort  
of transformation, $path.canonical, $path.normalize, $path.simplify,  
etc.  Two high-level IO objects are the same, regardless of path, if  
$file1 =:= $file2 (which might compare inodes, etc.).  There should be
a way to set what level of sameness applies in a given lexical scope;  
perhaps the first two listed above are a reasonable default to start  
with.


There's something that slightly jars me here... I don't like the  
quotation returning an IO object.
But doesn't normal quoting return a Str object?  And regex quoting  
return an object (Regex?  Match?  Something, anyway).


Certainly, but a regex doesn't produce a Signature object, say.  I  
don't object to objects, just to creating objects, then doing  
something with them, then returning another kind of object, and  
calling that parsing.  If we're parsing the characters, we should  
end up with an IO::Name.  If 

Re: Filename literals

2009-08-18 Thread Leon Timmermans
Reading this discussion, I'm getting the feeling that filename
literals are increasingly getting magical, something that I don't
think is a good development. The only sane way to deal with filenames
is treating them as opaque binary strings, making any more assumptions
is bound to get you into trouble. I don't want to deal with Windows'
strange restrictions on characters when I'm working on Linux. I don't
want to deal with any other platform's particularities either.
Portability should be positive, not negative IMNSHO.

As for comparing paths: reimplementing logic that belongs to the
filesystem sounds like a really Bad Idea™ to me. Two paths can't be
reliably compared without choosing to make some explicit assumptions,
and I don't think Perl should make such choices for the programmer.

Leon Timmermans


Re: Filename literals

2009-08-18 Thread Timothy S. Nelson

On Tue, 18 Aug 2009, David Green wrote:


On 2009-Aug-17, at 8:36 am, Jon Lang wrote:

Timothy S. Nelson wrote:

  Well, my main thought in this context is that the stuff that can be
done to the inside of a file can also be done to other streams -- TCP
sockets for example (I know, there are differences, but the two are a lot
the same), whereas metadata makes less sense in the context of TCP sockets;


But any IO object might have metadata; some different from the metadata you 
traditionally get with files, and some the same, e.g. $io.size, 
$io.times{modified}, $io.charset, $io.type.


Ok, now you're giving me ideas :).

[snipped a bit and moved it further down the e-mail]

  I guess what I'm saying here is that I think we can do the things
without people having to worry about the objects being separate unless they
care.  So, separate objects, but hide it as much as possible.  Is that
something you're fine with?


Yes -- to me that means some class/role that wraps up all the pieces 
together, but all the separate components are still there underneath.  But 
I'm not too bothered about how it's implemented as long as it's transparent 
for casual use.


  my $file = io p[/some/file];
  my $contents = $file.data;
  my $mod-date = $file.times{modified};
  my $size = $file.size;


That sounds like the kind of thing I'm heading for.

Pathnames still are strings, so that's fine.  In fact, there are different

  As for pathnames being strings, you may be right FSVO string.  But
I'd say that, while they may be strings, they're not Str, but they do Str


Agreed, pathnames are almost strings, but worth distinguishing 
conceptually.  There should be a URL type that does Str.


Actually, there are other differences, like case-insensitivity and illegal 
chars.  Unfortunately, those depend on the given filesystem.  As long as 
you're dealing with one FS at a time, that's OK; it probably means we have 
IO::Name::ext3, IO::Name::NTFS, IO::Name::HFS, etc.  But what happens when 
you cross FS-barriers?  Does a case-sensitive name match a case-insensitive 
one?  Is filename-equality not commutative or not transitive?  If you're 
looking for a filename foo on Mac/Win, then a file actually called FOO 
matches; but on Unix it wouldn't.


(Actually, Macs can do both IO::Name::HFS::case-insensitive and 
IO::Name::HFS::case-sensitive.  Eek.)


I think it should depend on the set of constraints involved.

I'd like Perl 6's treatment of filenames to be smart enough that 
smart-matching any of these pairs of alternative spellings would result 
in a successful match.  So while I'll agree that filenames are string-like, 
I really don't want them to _be_ strings.


Well, the *files* are the same, but the pathnames are different.  I'm not 
sure whether some differences in spelling should be ignored by default or 
not.  There are actually several different kinds; S32 has a method 
realpath, but I think canonical is a better name, because aliases can be 
just as real as the canonical path, e.g. a web page with multiple 
addresses.  Or hard links rather than soft links -- though in that case, 
there is no one canonical path.  It may not even be possible to easily tell 
if there is one or not.


Some ways in which different paths can be considered equivalent:
  Spelling: C:\PROGRA~1, case-insensitivity
  Simplification: foo/../bar/ to bar/
  Resolution: of symlinks/shortcuts
  Content-wise: hard links/multiple addresses

Depending on the circumstances, you might want any of those to count as the 
same file; or none of them.  We'll need methods for each sort of 
transformation, $path.canonical, $path.normalize, $path.simplify, etc.  Two 
high-level IO objects are the same, regardless of path, if $file1 =:= 
$file2 (which might compare inodes, etc.).  There should be a way to set what 
level of sameness applies in a given lexical scope; perhaps the first two 
listed above are a reasonable default to start with.


	Ok, my next commit will have canonpath (stolen directly from p5's 
File::Spec documentation), which will do "No physical check on the filesystem, 
but a logical cleanup of a path", and realpath (idea taken from p5's Cwd 
documentation), which will resolve symlinks, etc, and provide an absolute 
path.  Oh, and resolvepath, which does both.  I'm not quite sure I followed 
all your discussion above -- have I left something out?


	Anyway, my assumption is that there should be a number of comparison 
options.  Since we do Str, we should get string comparison for free.  But I'm 
expecting other options at other levels, but have no idea how or what at this 
point.


There's something that slightly jars me here... I don't like the 
quotation returning an IO object.
But doesn't normal quoting return a Str object?  And regex quoting return 
an object (Regex?  Match?  Something, anyway).


Certainly, but a regex doesn't produce a Signature object, say.  I don't 
object to objects, just to creating objects, 

Re: Filename literals

2009-08-18 Thread Carl Mäsak
Leon ():
 Reading this discussion, I'm getting the feeling that filename
 literals are increasingly getting magical, something that I don't
 think is a good development. The only sane way to deal with filenames
 is treating them as opaque binary strings, making any more assumptions
 is bound to get you into trouble. I don't want to deal with Windows'
 strange restrictions on characters when I'm working on Linux. I don't
 want to deal with any other platform's particularities either.
 Portability should be positive, not negative IMNSHO.

 As for comparing paths: reimplementing logic that belongs to the
 filesystem sounds like a really Bad Idea™ to me. Two paths can't be
 reliably compared without choosing to make some explicit assumptions,
 and I don't think Perl should make such choices for the programmer.

Very nicely put. We can't predict the future, but in creating
something that'll at least persist through the next decade, let's not
do elaborate things with lots of moving parts.

Let's make a solid ground to stand on; something so stable that it
works uphill and underwater. People with expertise and tuits will
write the facilitating modules.

<PerlJam> To quote Kernighan and Pike:  Simplicity. Clarity. Generality.
<moritz_> I agree.
<Matt-W> magic can always be added with module goodness

// Carl


Re: Filename literals

2009-08-18 Thread Jan Ingvoldstad
On Tue, Aug 18, 2009 at 3:20 PM, Carl Mäsak cma...@gmail.com wrote:


  Let's make a solid ground to stand on; something so stable that it
 works uphill and underwater. People with expertise and tuits will
 write the facilitating modules.

 <PerlJam> To quote Kernighan and Pike:  Simplicity. Clarity. Generality.
 <moritz_> I agree.
 <Matt-W> magic can always be added with module goodness


I agree with this principle.

The discussion has been (and probably still will be) fruitful anyway, if
only in illuminating the challenges with multi-platform and multi-filesystem
support, some of the things we need to consider for that and how.
-- 
Jan


Re: Filename literals

2009-08-18 Thread Daniel Carrera

+1

Carl Mäsak wrote:

Very nicely put. We can't predict the future, but in creating
something that'll at least persist through the next decade, let's not
do elaborate things with lots of moving parts.

Let's make a solid ground to stand on; something so stable that it
works uphill and underwater. People with expertise and tuits will
write the facilitating modules.

<PerlJam> To quote Kernighan and Pike:  Simplicity. Clarity. Generality.
<moritz_> I agree.
<Matt-W> magic can always be added with module goodness





Re: Filename literals

2009-08-18 Thread Timothy S. Nelson

On Tue, 18 Aug 2009, Leon Timmermans wrote:


Reading this discussion, I'm getting the feeling that filename
literals are increasingly getting magical, something that I don't
think is a good development. The only sane way to deal with filenames
is treating them as opaque binary strings, making any more assumptions
is bound to get you into trouble. I don't want to deal with Windows'
strange restrictions on characters when I'm working on Linux. I don't
want to deal with any other platform's particularities either.
Portability should be positive, not negative IMNSHO.


	Sounds to me like you need p:bin{/path/to/file} -- that does what you 
want it to.  I'll make it more obvious in the S16 documentation.



As for comparing paths: reimplementing logic that belongs to the
filesystem sounds like a really Bad Idea™ to me. Two paths can't be
reliably compared without choosing to make some explicit assumptions,
and I don't think Perl should make such choices for the programmer.


	That's why I want multiple comparison options, so that people have to 
explicitly choose what they want.  How to do this, though, I'm unsure.


:)





Re: Filename literals

2009-08-18 Thread Troels Liebe Bentsen
On Tue, Aug 18, 2009 at 15:20, Carl Mäsak cma...@gmail.com wrote:
 Leon ():
 Reading this discussion, I'm getting the feeling that filename
 literals are increasingly getting magical, something that I don't
 think is a good development. The only sane way to deal with filenames
 is treating them as opaque binary strings, making any more assumptions
 is bound to get you into trouble. I don't want to deal with Windows'
 strange restrictions on characters when I'm working on Linux. I don't
 want to deal with any other platform's particularities either.
 Portability should be positive, not negative IMNSHO.

The whole reason filenames/paths are a mess to code is that they are treated
as binary strings in most cases. This is also why we have modules like
File::Spec and a bunch more on CPAN, all trying to do the same thing. And today,
if I want to code something that works on all platforms, I have to use those
instead. How can this be positive?

For me a Path literal is a way to get rid of all this bandaging, so we don't
have to bother with the strange restrictions later when we get a bug report
from a CPAN user. And there is nothing magical about it, no more so than when I
ask for the length of a UTF-8 string and expect to get back the number of
characters, not the number of bytes.
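
The characters-versus-bytes point is easy to show in a couple of lines. This is a Python sketch rather than Perl, and char_and_byte_length is a hypothetical helper name:

```python
def char_and_byte_length(text: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) for a string."""
    return len(text), len(text.encode("utf-8"))


# "naïve" written with an escape so the i-diaeresis is a single code point.
lengths = char_and_byte_length("na\u00efve")
```

Here lengths is (5, 6): five characters, six bytes, because the one non-ASCII character takes two bytes in UTF-8.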

A path is a well-defined entity on all platforms and should be treated as such.
The main problem is that POSIX never really covered this part too well. But
today we have Unicode and UTF-8, and this is the de facto default on most
modern Unixes, as most libraries and tools will write filenames in this format
if so defined in the locale.

Just writing binary data to a filename is bound to get you into trouble, and you
will quickly find that many of the common C libraries will fail if locale and
filename do not match.

So even on Linux/Unix a path is really not just any number of bytes with / as
delimiter. It depends on the locale and the encoding set for the file system,
and not caring about that will get you into trouble.

But then again, you always have the option of using p:unix{}; it's also a clear
way to signal that you really don't care about portability and that this will
only work on Unix. Or you could even use Q{}, as this will pretty much allow you
to do anything.


 As for comparing paths: reimplementing logic that belongs to the
 filesystem sounds like a really Bad Idea™ to me. Two paths can't be
 reliably compared without choosing to make some explicit assumptions,
 and I don't think Perl should make such choices for the programmer.

Getting any kind of path from user input will require you to reimplement that
logic if you care about validating data before throwing it at the file system.

If you buy that paths are well-defined types, then comparing paths should not
require making any assumptions. We can compare Unicode strings without making
assumptions.


 Very nicely put. We can't predict the future, but in creating
 something that'll at least persist through the next decade, let's not
 do elaborate things with lots of moving parts.

 Let's make a solid ground to stand on; something so stable that it
 works uphill and underwater. People with expertise and tuits will
 write the facilitating modules.

 <PerlJam> To quote Kernighan and Pike:  Simplicity. Clarity. Generality.
 <moritz_> I agree.
 <Matt-W> magic can always be added with module goodness


I completely agree we can't predict the future, but we do have to make some sane
choices about how the default should work. Who knows if UTF-8 will still be the
hot new thing in 10 years, but that's still the default assumption for much of
Perl 6 if nothing else is known about the input we get.

And I totally agree path literals should not be magical; they should be well
defined, and you should not suffer when using them because platform X or Y has
strange restrictions. But when finding the sane default we have to make
restrictions, and POSIX's "a path is binary data" is simply too lax.

My idea about using the lowest common denominator for modern Unix and windows
was that we could get as much of Unicode in path names as possible without
breaking on modern platforms and as a way to get Simplicity, Clarity and
Generality into paths.

Because this will never be simple, clear or general:

  File::Spec->catfile(qw(.. ext Sys Syslog macros.all));

or any of the other examples that we can find:

http://www.google.com/codesearch?hl=en&start=10&sa=N&q=FIle::Spec-%3Ecatfile
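
For comparison, here is roughly what a typed path gives you in a language that already has one. This is a Python sketch; PurePosixPath and PureWindowsPath are real pathlib classes, while catfile and PARTS are hypothetical names chosen to echo the File::Spec call above.

```python
from pathlib import PurePosixPath, PureWindowsPath

PARTS = ("..", "ext", "Sys", "Syslog", "macros.all")


def catfile(flavour, *parts: str) -> str:
    """Join path components using the separator rules of the given flavour."""
    return str(flavour(*parts))


unix_style = catfile(PurePosixPath, *PARTS)       # ../ext/Sys/Syslog/macros.all
windows_style = catfile(PureWindowsPath, *PARTS)  # ..\ext\Sys\Syslog\macros.all
```

The components are written once and the separator is decided by the path type, which is the kind of thing a path literal could take care of.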

Regards Troels


Re: Filename literals

2009-08-18 Thread Dave Whipp

Leon Timmermans wrote:

Reading this discussion, I'm getting the feeling that filename
literals are increasingly getting magical, something that I don't
think is a good development. [...]. I don't want to deal with Windows'
strange restrictions on characters when I'm working on Linux. I don't
want to deal with any other platform's particularities either.


I'd like to agree, and also suggest that the use-case for filename 
literals probably favors the native approach.


Most applications should not hard-code filename constants: they should 
use config files, ask users, or programmatically construct them from 
other information. OTOH, one-liners and throw-away scripts will 
frequently hard code these things. So the filename literal syntax should 
optimize for this usage: a 90% solution that makes easy things trivial 
(but requires hard things to use an IO CTOR) seems to me to be what I'd 
want for a one-liner.


Re: Filename literals

2009-08-17 Thread Jon Lang
Timothy S. Nelson wrote:
 David Green wrote:
 Jon Lang wrote:
 If so, could you give some examples of how such a distinction could be
 beneficial, or of how the lack of such a distinction is problematic?

        Well, my main thought in this context is that the stuff that can be
 done to the inside of a file can also be done to other streams -- TCP
 sockets for example (I know, there are differences, but the two are a lot
 the same), whereas metadata makes less sense in the context of TCP sockets;
 I guess this was one of the thoughts that led me to want separate things
 here.

Ah.  I can see that.

 Well, I definitely think there needs to be a class that combines the
 inside and the outside, or the data and the metadata.  Certainly the
 separate parts will exist separately for purposes of implementation, but
 there needs to be a user-friendlier view wrapped around that.  Or maybe
 there are (sort of) three levels, low, medium, and high; that is, the basic
 implementation level (=P6 direct access to OS- and FS- system calls); the
 combined level, where an IO or File object encompasses IO::FSnode and
 IO::FSdata, etc.; and a gloss-over-the-details level with lots of sugar on
 top (at the expense of losing control over some details).

        Hmm.  With the quoting idea, I don't see the need for a both type
 of object.  I mean, I'd see the code happening something like this:

 if (path{/path/to/file}.e) {
        @lines = slurp(path{/path/to/file});
 }

        Or...

 if (path{/path/to/file}.e) {
        $handle = open(path{/path/to/file});
 }



        (I'm using one of David's suggested syntaxes above, but I'm not
 closely attached to it).

For the record, the above syntax was my suggestion.

        I guess what I'm saying here is that I think we can do the things
 without people having to worry about the objects being separate unless they
 care.  So, separate objects, but hide it as much as possible.  Is that
 something you're fine with?

It looks good to me.

 In fact, having q, Q, or qq involved at all strikes me as wrong,
 since those three are specifically for generating strings.

 Pathnames still are strings, so that's fine.  In fact, there are different

        Hmm.  I'm not so sure; maybe I'm just being picky, but I want to
 clarify things in case it's important (in other words, I'm thinking out loud
 here to see if it helps).

        First, Q and friends don't generate strings, they generate
 string-like objects, which could be Str, or Match, or whatever.  Think of
 quoting constructs as a way of temporarily switching to a different
 sublanguage (cf. regex), and you'll have the idea that I have in mind.

        As for pathnames being strings, you may be right FSVO string.  But
 I'd say that, while they may be strings, they're not Str, but they do Str,
 as in

 role    IO::FSNode does Str {...}

        (FSNode may not be the right name here, but is used for illustrative
 purposes).

I'd go one step further.  Consider the Windows path 'C:\Program
Files\'.  Is the string what's really important, or is it the
directory to which the string refers?  I ask because, for legacy
reasons, the following points to the same directory: 'C:\PROGRA~1\'.
Then there's the matter of absolute and relative paths: if the current
working directory is 'C:\Program Files\', then the path 'thisfile'
actually refers to 'C:\Program Files\thisfile'.  And because of parent
directory and self-reference links, things like '/bin/../etc/.' is
just an overcomplicated way of pointing to '/etc'.  I'd like Perl 6's
treatment of filenames to be smart enough that smart-matching any of
these pairs of alternative spellings would result in a successful
match.  So while I'll agree that filenames are string-like, I really
don't want them to _be_ strings.
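As a rough illustration of the purely lexical part of this (no filesystem access), here is a sketch in Python, used only as a stand-in because the Perl 6 path API is still under discussion. It relies on the standard posixpath and ntpath modules; the helper name same_spelling and its parameters are hypothetical. Short names such as PROGRA~1 and symlinks cannot be handled this way, since they need a call into the OS, and collapsing '..' lexically can give the wrong answer when a symlink is involved.

```python
import ntpath
import posixpath


def same_spelling(a, b, cwd="/", flavour=posixpath):
    """Compare two paths lexically (hypothetical helper).

    Each path is made absolute against cwd, then '.', '..' and repeated
    separators are collapsed, and case is folded where the flavour says so.
    """
    def norm(p):
        return flavour.normcase(flavour.normpath(flavour.join(cwd, p)))

    return norm(a) == norm(b)


# Parent-directory and self-reference links collapse to the plain spelling.
print(same_spelling("/bin/../etc/.", "/etc"))

# A relative path is compared against the current working directory.
print(same_spelling("thisfile", r"C:\Program Files\thisfile",
                    cwd="C:\\Program Files\\", flavour=ntpath))
```

Both calls report a match. Passing the flavour explicitly mirrors the idea that the comparison rules belong to the path type rather than to the string.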

 things going on here; one is to have a way of conveniently quoting strings
 that contain a lot of backslashes.  Just as Perl lets you pick different
 quotation marks, to make it easier to quote strings that have a lot of " or
 ' characters, so it should have a way to make it easy to quote strings with
 a lot of backslashes.  (The most obvious example being Windows paths; but
 there are other possibilities, such as needing to eval some code that
 already has a lot of backslashes in it.)

 Now, you can already turn backwhacking on or off via Q's :backslash
 adverb; Q:qq includes :b (and Q:q recognises a few limited escape sequences
 like \\). So you could say Q[C:\some\path], and you could add scalar
 interpolation to say Q:s[C:\some\path\$filename].  But there's no way to
 have all of: literal backslashes + interpolation + escaped sigils.

 Perhaps instead of a simple :b toggle, we could have an :escape<Str>
 adverb that defaults to :escape<\>? Then you could have
 Q:scalar:escape(^)[C:\path\with\literal^$\$filename].

        Maybe a global variable?  It's an interesting idea, and I'll see how
 others feel :).

I'm leery of global variables, per se; but I _do_ like the idea of

Re: Filename literals

2009-08-17 Thread Troels Liebe Bentsen
Hey,

Just joined the list, and I too have been thinking about a good path literal
for Perl 6. Nice to see so many other people are thinking the same :).

Not knowing where to start in this long thread, I will instead try to show how
I would like a path literal to work. For me a path literal is a way to make the
code pretty and clean, and multi-platform coding is where this gets hardest.
So I think a path literal should make it possible to use both
a native style and a more modern portable one, without having to give up using
spaces like in File::Spec from Perl 5 or having to do verbose object creation.

First, I think extending Q with a Q:path{} and making the aliases Q:p{} and p{}
would be the most consistent with the current string literal API. Also, it
should be possible to subtype the literals to further limit format and
content. This should be done so we can get a compile-time error when paths are
known to be incorrect, or so that we throw an exception or return an undef with an
error type (or whatever Larry called it) when we interpolate and get
something that is known to be incorrect.

The default p{} should only allow / as separator and should not allow
characters that won't work on modern Windows and Unix, like \ / ? % * : | " < >,
etc. The reason for this is that portable Paths should be the default, and if
you really need platform-specific behavior it should be shown in the code.

my Path $path = p{../ext/dictonary.txt};

or

my Path $path = p{c:/ext/dictonary.txt};
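To make the portability rule concrete, here is a minimal sketch of such a check in Python (a stand-in, since p{} does not exist yet). The function name is_portable and the exact forbidden set are assumptions taken from the list above; '/' is left out of the set because it is the separator, and a leading drive letter such as "c:" is allowed so that the second example still passes.

```python
import re

# Characters assumed to be rejected by the portable default (hypothetical set).
FORBIDDEN = set('\\?%*:|"<>')


def is_portable(path: str) -> bool:
    """Return True if path uses only '/' separators and no forbidden characters."""
    drive = re.match(r"[A-Za-z]:", path)      # optional volume prefix, e.g. "c:"
    rest = path[drive.end():] if drive else path
    return not any(ch in FORBIDDEN for ch in rest)


print(is_portable("../ext/dictonary.txt"))
print(is_portable("c:/ext/dictonary.txt"))
print(is_portable("/usr/src/bla/myfile?:%.file"))
```

A real implementation would run this at compile time for constant literals and at run time after interpolation, as described above.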

We should allow windows style paths so converting and maintaining code on this
platform is not a pain.

my Path $path = p:win{C:\Program Files\MS Access\file.file};

For Unix-specific behavior we should have a p:unix{} literal; here the only
limits are those defined by the locale. So we won't be able to write full Unicode
if the locale is set to Latin1. Writing filenames to the filesystem that other
programs won't be able to read should be hard.

my Path $path = p:unix{/usr/src/bla/myfile?:%.file};

And for people where this is a problem p:bin{} can be used as no checking is
done here.

my $path = p:bin{/usr/src/bla/??/adasd/myfile};

Old style Mac paths could also be supported where the : is used as separator.

my Path $path = p:mac{usr:src:bla};

Or old dos paths where 8 char limits and all the old dos stuff apply.

my Path $path = p:dos{c:\windows\test.fil};

URLs could also be supported with:

my Path $path = p:url{file:///home/test.file};

** Path Object like File::Spec, etc. just nicer **

All the different variants for p{} return a Path object that offers much of
what is found in File::Spec, Cwd and Path::Class in Perl 5 today in a more
Perl 6 way.

my Path $real_path = $path.realpath; # Like Cwd's realpath

my Path $volume = $path.volume; # Returns the volume part if relevant
my Path $dir = $path.dir; # Returns the directory part
my Path $file = $path.file; # Returns the file part

$path.shift(); # Get rid of first part of path
$path.pop(); # Get rid of last part of path

my @paths = $path.dirs; # Returns the directory parts of the path

etc.
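For comparison, Python's pathlib already offers accessors of roughly this shape, which may help when picking names. The sketch below uses PureWindowsPath, which only manipulates the string and never touches the filesystem; the mapping onto .volume, .dir, .file and .dirs is my own reading of the proposal, not an established equivalence.

```python
from pathlib import PureWindowsPath

path = PureWindowsPath(r"C:\Program Files\MS Access\file.file")

volume = path.drive             # roughly $path.volume
directory = path.parent         # roughly $path.dir
file_name = path.name           # roughly $path.file
dirs = path.parent.parts[1:]    # roughly $path.dirs, without the root

print(volume)
print(directory)
print(file_name)
print(dirs)
```

pathlib objects are immutable, so there is no shift or pop; taking .parent returns a new path with the last component dropped.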

** Comparing Paths should do the right thing **

As we have the option of specifying what type a Path object is, this should
also count when comparing them. So, e.g., p:win{} paths are case-insensitive.

my $file = p:win{c:\My File.txt};

my $path = p:win{C:\Program Files\..};

if($file.is_in($path)) { # Check if the file is contained in another path
  say "$file is in $path"; # C:\My File.txt is in C:
}

if(p{../test} ~~ p{../dir/../test}) {
  say "Comparing two Paths works as it should";
}

Also Path handles Unicode normalization so this won't be a problem:

http://lists.zerezo.com/git/msg643117.html

Meaning that both "Märchen" spelled with a precomposed ä (U+00E4) and
"Märchen" spelled as a + combining diaeresis (U+0308) are
the same path, but without normalizing the path behind the user's back.
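A small sketch of what "compare normalized, store untouched" could look like, written in Python as a stand-in. It uses the standard unicodedata module; the function name same_name is hypothetical, and choosing NFC as the comparison form is an assumption rather than something the proposal specifies.

```python
import unicodedata


def same_name(a: str, b: str) -> bool:
    """Compare two names by their NFC form without modifying either string."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "M\u00e4rchen"    # a-umlaut as one code point
decomposed = "Ma\u0308rchen"    # 'a' followed by a combining diaeresis

print(precomposed == decomposed)         # plain string comparison
print(same_name(precomposed, decomposed))  # normalization-aware comparison
```

The two strings stay exactly as the user wrote them, which matters when the name is later handed back to a filesystem that distinguishes the two spellings.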

** Utility functions **

Path in itself knows nothing about the filesystem and files but might have a
peek in $*CWD to do some path logic. Except for that a number of File related
functions might be available to make it easy to open and slurp a file a Path
points to.

my File $file = p{/etc/passwd}.open;
if($file.type ~~ 'text/plain') {
  say "looks like a password file";
}

my @passwd = p{/etc/passwd}.lines;


if(p{/etc/passwd}.exists) {
  say "passwd file exists";
}

This is my thought so far, hope it helps the discussion.

Regards Troels


Re: Filename literals

2009-08-17 Thread Jon Lang
Troels Liebe Bentsen wrote:
 Hey,

 Just joined the list, and I too have been thinking about a good path literal
 for Perl 6. Nice to see so many other people are thinking the same :).

Welcome to the list!

 Not knowing where to start in this long thread, I will instead try to show how
 I would like a path literal to work.

A well-considered proposal, and one with which I mostly agree.  Some thoughts:

 The default p{} should only allow / as separator and should not allow
 characters that won't work on modern Windows and Unix like \ / ? % * : | " < >,
 etc. The reason for this is that portable Path's should be the default and if
 you really need platform specific behavior it should be shown in the code.

I note that you explicitly included * and ? in the list of forbidden
characters; I take it, then, that you're not in favor of Path as a
glob-based pattern-matching utility?  E.g.:

my Path $path;
...
unless $path ~~ p<astro*> { say "the file doesn't begin with 'astro'." }

Admittedly, this particular example _could_ be accomplished through
the use of a regex; but there _are_ cases where the use of wildcard
characters would be easier than the series of equivalent tests that
Perl would otherwise have to perform in order to achieve the same
result.  Hmm... maybe we need something analogous to q vs. qq; that
is:

p<astro*> #`{ syntax error: '*' is not a valid filename character. }
pp<astro*> #`{ returns an object that is used for Path
pattern-matching; perhaps Pathglob or somesuch? }

 We should allow windows style paths so converting and maintaining code on this
 platform is not a pain.
:
 For Unix specific behavior we should have a p:unix{} literal, here the only
 limit are what is defined by locale.
:
 And for people where this is a problem p:bin{} can be used as no checking is
 done here.
:
 Old style Mac paths could also be supported where the : is used as separator.
:
 Or old dos paths where 8 char limits and all the old dos stuff apply.

Hear, hear.  Note that these are all mutually exclusive, which
suggests that the proper format ought to be something like:

 my Path $path = p:format<win>{C:\Program Files}

However, I have no problem with the idea that :win is short for
:format<win>; the feature here is brevity.

 Urls could also be support with:

 my Path $path = p:url{file:///home/test.file}

I would be very careful here, in that I wouldn't want to open the can
of worms inherent in non-file protocols (e.g., ftp, http, gopher,
mail), or even in file protocols with hosts other than localhost.

 ** Path Object like File::Spec, etc. just nicer **
:
 ** Comparing Paths should do the right thing **

Agreed on all counts.

 ** Utility functions **

 Path in itself knows nothing about the filesystem and files but might have a
 peek in $*CWD to do some path logic. Except for that a number of File related
 functions might be available to make it easy to open and slurp a file a Path
 points to.

 my File $file = p{/etc/passwd}.open;
 if($file.type ~~ 'text/plain') {
  say "looks like a password file";
 }

 my @passwd = p{/etc/passwd}.lines;


 if(p{/etc/passwd}.exists) {
  say "passwd file exists";
 }

As soon as you allow methods such as .exists, it undermines your claim
that Path knows nothing about the filesystem or files.  IMHO, you
should still include such methods.

-- 
Jonathan Dataweaver Lang


Re: Filename literals

2009-08-17 Thread Timothy S. Nelson

On Mon, 17 Aug 2009, Jon Lang wrote:


Well, I definitely think there needs to be a class that combines the
inside and the outside, or the data and the metadata.  Certainly the
separate parts will exist separately for purposes of implementation, but
there needs to be a user-friendlier view wrapped around that.  Or maybe
there are (sort of) three levels, low, medium, and high; that is, the basic
implementation level (=P6 direct access to OS- and FS- system calls); the
combined level, where an IO or File object encompasses IO::FSnode and
IO::FSdata, etc.; and a gloss-over-the-details level with lots of sugar on
top (at the expense of losing control over some details).


       Hmm.  With the quoting idea, I don't see the need for a both type
of object.  I mean, I'd see the code happening something like this:

if (path{/path/to/file}.e) {
       @lines = slurp(path{/path/to/file});
}

       Or...

if (path{/path/to/file}.e) {
       $handle = open(path{/path/to/file});
}



       (I'm using one of David's suggested syntaxes above, but I'm not
closely attached to it).


For the record, the above syntax was my suggestion.


Ok, as long as I don't have to take the blame :).

	Seriously, I was confused by trying to reply to two e-mails at once. 
Sorry.



In fact, having q, Q, or qq involved at all strikes me as wrong,
since those three are specifically for generating strings.


Pathnames still are strings, so that's fine.  In fact, there are different


       Hmm.  I'm not so sure; maybe I'm just being picky, but I want to
clarify things in case it's important (in other words, I'm thinking out loud
here to see if it helps).

       First, Q and friends don't generate strings, they generate
string-like objects, which could be Str, or Match, or whatever.  Think of
quoting constructs as a way of temporarily switching to a different
sublanguage (cf. regex), and you'll have the idea that I have in mind.

       As for pathnames being strings, you may be right FSVO string.  But
I'd say that, while they may be strings, they're not Str, but they do Str,
as in

role    IO::FSNode does Str {...}

       (FSNode may not be the right name here, but is used for illustrative
purposes).


I'd go one step further.  Consider the Windows path 'C:\Program
Files\'.  Is the string what's really important, or is it the
directory to which the string refers?  I ask because, for legacy
reasons, the following points to the same directory: 'C:\PROGRA~1\'.
Then there's the matter of absolute and relative paths: if the current
working directory is 'C:\Program Files\', then the path 'thisfile'
actually refers to 'C:\Program Files\thisfile'.  And because of parent
directory and self-reference links, things like '/bin/../etc/.' is
just an overcomplicated way of pointing to '/etc'.  I'd like Perl 6's
treatment of filenames to be smart enough that smart-matching any of
these pairs of alternative spellings would result in a successful
match.  So while I'll agree that filenames are string-like, I really
don't want them to _be_ strings.


	Good ideas.  But I still want it to have the same interface, so I can 
concatenate them easily in error messages :).



things going on here; one is to have a way of conveniently quoting strings
that contain a lot of backslashes.  Just as Perl lets you pick different
quotation marks, to make it easier to quote strings that have a lot of " or
' characters, so it should have a way to make it easy to quote strings with
a lot of backslashes.  (The most obvious example being Windows paths; but
there are other possibilities, such as needing to eval some code that
already has a lot of backslashes in it.)

Now, you can already turn backwhacking on or off via Q's :backslash
adverb; Q:qq includes :b (and Q:q recognises a few limited escape sequences
like \\). So you could say Q[C:\some\path], and you could add scalar
interpolation to say Q:s[C:\some\path\$filename].  But there's no way to
have all of: literal backslashes + interpolation + escaped sigils.

Perhaps instead of a simple :b toggle, we could have an :escape<Str>
adverb that defaults to :escape<\>? Then you could have
Q:scalar:escape(^)[C:\path\with\literal^$\$filename].


       Maybe a global variable?  It's an interesting idea, and I'll see how
others feel :).


I'm leery of global variables, per se; but I _do_ like the idea of
lexically-scoped options that let you customize the filename syntax.
Changing the default delimiter would be the most common example of
this.


	Yeah, global variable is probably a bad idea.  But it *feels* like it 
should be some kind of global or semi-global setting :).  By semi-global, I 
mean something that you can override in your local scope, and have it revert, 
much as with the $*IN, etc, filehandles.



Now, isn't Q:path[/some/file] just creating an IO object?  Unlike /foo/,
where foo just IS the pattern, /some/file is *not* an IO object, it's
just a filename.  So if the special path-quoting returned an 

Re: Filename literals

2009-08-16 Thread Timothy S. Nelson

On Sun, 16 Aug 2009, David Green wrote:


On 2009-Aug-15, at 9:22 am, Jon Lang wrote:

IOW, your outside the file stuff is whatever can be done without
having to open the file, and your inside the file is whatever only
makes sense once the file has been opened.  Correct?


Pretty much, yes.

If so, could you give some examples of how such a distinction could be 
beneficial, or of how the lack of such a distinction is problematic?


	Well, my main thought in this context is that the stuff that can be 
done to the inside of a file can also be done to other streams -- TCP sockets 
for example (I know, there are differences, but the two are a lot the same), 
whereas metadata makes less sense in the context of TCP sockets; I guess this 
was one of the thoughts that led me to want separate things here.
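To illustrate the split, here is a small Python sketch (a stand-in for the IO::File versus IO::FSNode distinction, not a rendering of S32/IO). The function names count_lines and size_on_disk are hypothetical. The first works on anything that can be read line by line, whether it came from a file, a socket wrapper or an in-memory buffer; the second needs a path and never opens the file.

```python
import io
import os
import tempfile


def count_lines(stream):
    """'Inside' operation: works on any line-iterable stream."""
    return sum(1 for _ in stream)


def size_on_disk(path):
    """'Outside' operation: metadata lookup by path, no open() involved."""
    return os.stat(path).st_size


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w", newline="\n") as fh:
        fh.write("one\ntwo\n")

    with open(path) as fh:
        file_lines = count_lines(fh)                        # stream from a file
    buffer_lines = count_lines(io.StringIO("one\ntwo\n"))   # stream with no path at all
    demo_size = size_on_disk(path)                          # only meaningful for a path

print(file_lines, buffer_lines, demo_size)
```

The in-memory buffer has lines but no size on disk, which is the same asymmetry as with a TCP socket.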


Well, I definitely think there needs to be a class that combines the inside 
and the outside, or the data and the metadata.  Certainly the separate parts 
will exist separately for purposes of implementation, but there needs to be a 
user-friendlier view wrapped around that.  Or maybe there are (sort of) three 
levels, low, medium, and high; that is, the basic implementation level (=P6 
direct access to OS- and FS- system calls); the combined level, where an IO 
or File object encompasses IO::FSnode and IO::FSdata, etc.; and a 
gloss-over-the-details level with lots of sugar on top (at the expense of 
losing control over some details).


	Hmm.  With the quoting idea, I don't see the need for a both type of 
object.  I mean, I'd see the code happening something like this:


if (path{/path/to/file}.e) {
@lines = slurp(path{/path/to/file});
}

Or...

if (path{/path/to/file}.e) {
$handle = open(path{/path/to/file});
}



	(I'm using one of David's suggested syntaxes above, but I'm not 
closely attached to it).


	I guess what I'm saying here is that I think we can do the things 
without people having to worry about the objects being separate unless they 
care.  So, separate objects, but hide it as much as possible.  Is that 
something you're fine with?



In fact, having q, Q, or qq involved at all strikes me as wrong,
since those three are specifically for generating strings.


Pathnames still are strings, so that's fine.  In fact, there are different


	Hmm.  I'm not so sure; maybe I'm just being picky, but I want to 
clarify things in case it's important (in other words, I'm thinking out loud 
here to see if it helps).


	First, Q and friends don't generate strings, they generate string-like 
objects, which could be Str, or Match, or whatever.  Think of quoting 
constructs as a way of temporarily switching to a different sublanguage (cf. 
regex), and you'll have the idea that I have in mind.


	As for pathnames being strings, you may be right FSVO string.  But I'd 
say that, while they may be strings, they're not Str, but they do Str, as in


role    IO::FSNode does Str {...}

	(FSNode may not be the right name here, but is used for illustrative 
purposes).


things going on here; one is to have a way of conveniently quoting strings 
that contain a lot of backslashes.  Just as Perl lets you pick different 
quotation marks, to make it easier to quote strings that have a lot of " or ' 
characters, so it should have a way to make it easy to quote strings with a 
lot of backslashes.  (The most obvious example being Windows paths; but there 
are other possibilities, such as needing to eval some code that already has a 
lot of backslashes in it.)


Now, you can already turn backwhacking on or off via Q's :backslash adverb; 
Q:qq includes :b (and Q:q recognises a few limited escape sequences like \\). 
So you could say Q[C:\some\path], and you could add scalar interpolation to 
say Q:s[C:\some\path\$filename].  But there's no way to have all of: literal 
backslashes + interpolation + escaped sigils.


Perhaps instead of a simple :b toggle, we could have an :escape<Str> adverb 
that defaults to :escape<\>? Then you could have 
Q:scalar:escape(^)[C:\path\with\literal^$\$filename].


	Maybe a global variable?  It's an interesting idea, and I'll see how 
others feel :).



The ultimate in path literals would be to establish a similar
default delimiter.  [...]
 `path`.size # how big is the file?  Returns number.


There's something that slightly jars me here... I don't like the quotation 
returning an IO object.  (I like the conciseness, but there's something a bit 
off conceptually.)


	Hmm.  But doesn't normal quoting return a Str object?  And regex 
quoting return an object (Regex?  Match?  Something, anyway).


Now, isn't Q:path[/some/file] just creating an IO object?  Unlike /foo/, 
where foo just IS the pattern, /some/file is *not* an IO object, it's 
just a filename.  So if the special path-quoting returned an IO::File::Name 
object, I would be perfectly happy.  But you can't have $filename.size -- a 
fileNAME doesn't have a size, the file itself does.  To get from the filename 
to the 

Re: Filename literals

2009-08-15 Thread Timothy S. Nelson

On Fri, 14 Aug 2009, Darren Duncan wrote:


Richard Hainsworth wrote:
Would it be possible to remove the special purpose of \ from strings within 
IO constructs?


This would mean '\' could be used in naming paths as an alternative to '/', 
thus allowing windows and unix strings to be equivalent, eg.
IO(:path{$root-path}/data/new) would be equivalent to 
IO(:path{$root-path}\data\new)


The usefulness would be most evident for sub-directories as windows and 
unix have different ways of describing root, viz. 'C:\' versus '/'


I see problems with this considering that \ is quite universally recognized 
in Perl (and many other languages) as meaning an escape character, and that 
moreover you generally need to be able to escape characters in any context 
building a string.


	Considering, though, that we're talking about a magic perl quoting 
syntax, we could offer people the option of the following two:


q:io{C:\Windows} # Does what you want
q:io:qq:{C:\\Windows} # Does the same thing

Wouldn't that cover the bases pretty well?

:)


-
| Name: Tim Nelson | Because the Creator is,|
| E-mail: wayl...@wayland.id.au| I am   |
-

BEGIN GEEK CODE BLOCK
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- 
PE(+) Y+++ PGP-+++ R(+) !tv b++ DI D G+ e++ h! y-

-END GEEK CODE BLOCK-



Re: Filename literals

2009-08-15 Thread Timothy S. Nelson

On Sat, 15 Aug 2009, Timothy S. Nelson wrote:

	Considering, though, that we're talking about a magic perl quoting 
syntax, we could offer people the option of the following two:


q:io{C:\Windows} # Does what you want
q:io:qq:{C:\\Windows} # Does the same thing

Wouldn't that cover the bases pretty well?


My bad -- try these:

$file = "foo";

Q:io{C:\Windows\$file} # Results in C:\Windows\$file
q:io{C:\\Windows\\$file} # Results in the same thing
qq:io{C:\\Windows\\$file} # Results in C:\Windows\foo
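The same three levels of quoting exist in other languages, which may make the distinction easier to see. The sketch below is Python, offered only as an analogy: a raw string plays the role of Q (nothing is special), an ordinary string plays the role of q (backslash escapes only), and an f-string plays the role of qq (escapes plus interpolation). The variable names are mine.

```python
file = "foo"

raw = r"C:\Windows\$file"               # like Q: backslashes and $ are literal
escaped = "C:\\Windows\\$file"          # like q: \\ is an escaped backslash
interpolated = f"C:\\Windows\\{file}"   # like qq: escapes plus interpolation

print(raw)
print(escaped)
print(interpolated)
```

The first two lines print the same text, and the third prints the path with foo substituted, matching the three results listed above.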

HTH,





Re: Filename literals

2009-08-15 Thread Austin Hastings

This whole thread seems oriented around two points:

1. Strings should not carry the burden of umpty-ump filesystem checking 
methods.


2. It should be possible to specify a filesystem entity using something 
nearly indistinguishable from standard string syntax.


I agree with the first, but the relentless pursuit of the second seems 
to have gone beyond the point of useful speculation.


What's wrong with

   File('C:\Windows')

or Path() or Dir() or SpecialDevice()?

Not to get all Cozens-y or anything, but chasing after ways to jam some 
cute string-like overloading into the syntax so that we can pull out the 
other overloading (which at least had the virtue of simplicity) seems 
pointless.


The File::* functionality is probably going to be one of the very early 
p6 modules, and it is probably going to be in core. If that's true, why 
not allocate some really short names, ideally with 0 colons in them, and 
use them to spell out what's being done?


Neither q:io:qq:{.} nor qq:io{.} really stand out as excellent ways to 
say "this is a path, or directory, or file, or whatever". If it's 
plug-in-able, I'd take qq:file{.} or qq:dir{.} or qq:path{.}, but I'd 
rather see C< File q{.} >.



=Austin



Timothy S. Nelson wrote:

On Sat, 15 Aug 2009, Timothy S. Nelson wrote:

Considering, though, that we're talking about a magic perl 
quoting syntax, we could offer people the option of the following two:


q:io{C:\Windows} # Does what you want
q:io:qq:{C:\\Windows} # Does the same thing

Wouldn't that cover the bases pretty well?


My bad -- try these:

$file = "foo";

Q:io{C:\Windows\$file} # Results in C:\Windows\$file
q:io{C:\\Windows\\$file} # Results in the same thing
qq:io{C:\\Windows\\$file} # Results in C:\Windows\foo

HTH,








Re: Filename literals

2009-08-15 Thread Timothy S. Nelson

On Sat, 15 Aug 2009, Austin Hastings wrote:


This whole thread seems oriented around two points:

1. Strings should not carry the burden of umpty-ump filesystem checking 
methods.


2. It should be possible to specify a filesystem entity using something 
nearly indistinguishable from standard string syntax.


I agree with the first, but the relentless pursuit of the second seems to 
have gone beyond the point of useful speculation.


What's wrong with

  File('C:\Windows')

or Path() or Dir() or SpecialDevice()?

Not to get all Cozens-y or anything, but chasing after ways to jam some cute 
string-like overloading into the syntax so that we can pull out the other 
overloading (which at least had the virtue of simplicity) seems pointless.


The File::* functionality is probably going to be one of the very early p6 
modules, and it is probably going to be in core. If that's true, why not 
allocate some really short names, ideally with 0 colons in them, and use them 
to spell out what's being done?


	S32/IO already specifies all these things as living in the IO 
namespace.  That could be changed, of course.



Neither q:io:qq:{.} nor qq:io{.} really stand out as excellent ways to say


	q:io{.} would be the normal case, unless you want variable 
interpolation or the like.  And it would be possible to come up with shorter 
versions (someone suggested qf).


this is a path, or directory, or file, or whatever. If it's plug-in-able, 
I'd take qq:file{.} or qq:dir{.} or qq:path{.}, but I'd rather see C< File 
q{.} >.


	I'm not particularly attached to :io if we can think of something 
better.  These things often have a short name and a long name.  I'm against 
file because the IO::File object models what is inside the file (ie. 
open/read/write/close/etc), whereas the 
IO::FSNode/IO::FileNode/IO::DirectoryNode/IO::LinkNode objects model stuff on 
the outside of the file.  It's things of this second type that I'm 
recommending that we return here.  We could change the names of the objects of 
course, but I'm keen on keeping the class that does stuff to the inside of 
the file separate from the class that does stuff to the outside of the file. 
Path might be a good alternative in my mind.


	Anyway, back to the :io name.  An alternative might be to have the 
short name be :p and the long name be :path.  That would mean that we could 
do:


q:p{.}

	That's a fair bit shorter than Path(q{.}).  Hmm.  Let's compare some 
code samples:


if (q:p'/path/to/file' ~~ :r) {
say "Readable\n";
}
if (Path('/path/to/file') ~~ :r) {
say "Readable\n";
}


$fobj = new IO::File(FSNode => q:p'/path/to/file');
$fobj = new IO::File(FSNode => Path('/path/to/file'));

	I used single quotes for the Path() things because I think that's what 
people would probably do.


Now, say we want to use backslashes.

if (Q :p {C:\Windows\file} ~~ :r) {
say "Readable\n";
}
if (Path(Q {C:\Windows\file}) ~~ :r) {
say "Readable\n";
}

	Ok, so they're comparable.  I've used curlies here just because I 
thought it was a good idea :).


Anyway, we have possibilities.  Further thoughts anyone?

:)





Re: Filename literals

2009-08-15 Thread Jon Lang
On Sat, Aug 15, 2009 at 7:17 AM, Timothy S. Nelson <wayl...@wayland.id.au> wrote:
 On Sat, 15 Aug 2009, Austin Hastings wrote:

 This whole thread seems oriented around two points:

 1. Strings should not carry the burden of umpty-ump filesystem checking
 methods.

 2. It should be possible to specify a filesystem entity using something
 nearly indistinguishable from standard string syntax.

 I agree with the first, but the relentless pursuit of the second seems to
 have gone beyond the point of useful speculation.

 What's wrong with

  File('C:\Windows')

 or Path() or Dir() or SpecialDevice()?

 Not to get all Cozens-y or anything, but chasing after ways to jam some
 cute string-like overloading into the syntax so that we can pull out the
 other overloading (which at least had the virtue of simplicity) seems
 pointless.

 The File::* functionality is probably going to be one of the very early p6
 modules, and it is probably going to be in core. If that's true, why not
 allocate some really short names, ideally with 0 colons in them, and use
 them to spell out what's being done?

        S32/IO already specifies all these things as living in the IO
 namespace.  That could be changed, of course.

 Neither q:io:qq:{.} nor qq:io{.} really stand out as excellent ways to say

        q:io{.} would be the normal case, unless you want variable
 interpolation or the like.  And it would be possible to come up with shorter
 versions (someone suggested qf).

 this is a path, or directory, or file, or whatever. If it's
 plug-in-able, I'd take qq:file{.} or qq:dir{.} or qq:path{.}, but I'd rather
 see C< File q{.} >.

        I'm not particularly attached to :io if we can think of something
 better.  These things often have a short name and a long name.  I'm against
 file because the IO::File object models what is inside the file (ie.
 open/read/write/close/etc), whereas the
 IO::FSNode/IO::FileNode/IO::DirectoryNode/IO::LinkNode objects model stuff
 on the outside of the file.  It's things of this second type that I'm
 recommending that we return here.  We could change the names of the objects
 of course, but I'm keen on keeping the class that does stuff to the inside
 of the file separate from the class that does stuff to the outside of the
 file. Path might be a good alternative in my mind.

IOW, your outside the file stuff is whatever can be done without
having to open the file, and your inside the file is whatever only
makes sense once the file has been opened.  Correct?  If so, could you
give some examples of how such a distinction could be beneficial, or
of how the lack of such a distinction is problematic?

        Anyway, back to the :io name.  An alternative might be to have the
 short name be :p and the long name be :path.  That would mean that we could
 do:

 q:p{.}

Isn't there something in the spec that indicates that qq is merely
shorthand for q:qq?  That is, it's possible to bundle a bunch of quote
adverbs together under a special quote name.  If so, you might say
that q:path and q:p are longhand for path:

path{.} # same as q:p{.}

And yes, 'path' is longer than 'q:p' - but only by one character; and
it's considerably more legible.  As well, this is more in keeping with
what's really going on here: path{.} would be no more a string than
m{.} or rx{.} are.  In fact, having q, Q, or qq involved at all
strikes me as wrong, since those three are specifically for generating
strings.

Also note the following:

   "string" # same as qq[string]
   'string' # same as q[string]
   /pattern/ # same as m[pattern]?

The ultimate in path literals would be to establish a similar
default delimiter.  For example, what if the backtick were pressed
into service for this purpose?  (No, I'm not actually suggesting this;
at the very least, there would be p5 false-compatibility issues
involved.  This is strictly illustrative.)

   `path` # same as path[path]
   `path`.e # does that filename exist?  Returns boolean.
   `path`.size # how big is the file?  Returns number.
   `path`.open # Returns new file handle.

        That's a fair bit shorter than Path(q{.}).  Hmm.  Let's compare some
 code samples:

 if (q:p'/path/to/file' ~~ :r) {
        say "Readable\n";
 }
 if (Path('/path/to/file') ~~ :r) {
        say "Readable\n";
 }

if (path'/path/to/file'.r) {
   say "Readable";
}
if (`/path/to/file`.r) {
   say "Readable";
}

 $fobj = new IO::File(FSNode => q:p'/path/to/file');
 $fobj = new IO::File(FSNode => Path('/path/to/file'));

$fobj = path[/path/to/file].open;
$fobj = `/path/to/file`.open;

        I used single quotes for the Path() things because I think that's
 what people would probably do.

        Now, say we want to use backslashes.

 if (Q :p {C:\Windows\file} ~~ :r) {
        say "Readable\n";
 }
 if (Path(Q {C:\Windows\file}) ~~ :r) {
        say "Readable\n";
 }

if (path:win[C:\Windows\file].r) {
   say "Readable";
}

        Anyway, we have possibilities.  Further thoughts anyone?

As illustrated above, I think 

Re: Filename literals

2009-08-14 Thread Timothy S. Nelson

More ideas:

On Thu, 13 Aug 2009, Hinrik Örn Sigurðsson wrote:


   # bin/perl on Unix
   my $rel = qf/usr bin perl/;

   # /usr/bin/perl
   my $abs = qf[/usr bin perl];


...and on Windows, would the above result in C:\/usr\bin\perl ? :)

# The following both result in the same object (kinda):
# /usr/bin/perl on Unix, C:\usr\bin\perl on Windows
my $abs = qf:unix[/usr/bin/perl];
my $abs = qf:win[C:\usr\bin\perl];

Just thinking out loud.

:)




Re: Filename literals

2009-08-14 Thread Richard Hainsworth

I like this way.

Would it be possible to remove the special purpose of \ from strings 
within IO constructs?


This would mean '\' could be used in naming paths as an alternative to 
'/', thus allowing windows and unix strings to be equivalent, eg.
IO(:path{$root-path}/data/new) would be equivalent to 
IO(:path{$root-path}\data\new)


The usefulness would be most evident for sub-directories as windows and 
unix have different ways of describing root, viz. 'C:\' versus '/'



David Green wrote:
We should start thinking about the fundamental objects for doing IO as 
IO-objects.  They *have* names, but they aren't names, or strings, or 
even filehandles (although they might *have* filehandles encapsulated 
inside to do the actual work).  A filename is merely a way to get at 
the actual object, just as the string 2009/1/1 can be used to get a 
Date object.  A string, or a handle, or an inode, or some unique 
filesystem spec number, or anything else you can get your hands on 
should be fed to a constructor:
Of course, this being P6, we can have some kind of io macro that 
parses the single item after it:


my $file1 = io file://some/dir/some%20file;                    # the quick way
my $file2 = IO.new(:protocol<file> :uri<foo/bar/a file.html>); # the verbose way





Re: Filename literals

2009-08-14 Thread Timothy S. Nelson

On Thu, 13 Aug 2009, Hinrik Örn Sigurðsson wrote:


Imagine two roles, Filename and Dirname (or Path::File / Path::Dir). I


...or imagine just one, called IO::FSNode.

http://perlcabal.org/syn/S32/IO.html#IO::FSNode

Btw, kudos for the special quoting idea -- I love it :).

	And in response to David Green and his comment about working with file 
data vs. metadata, as a systems  programmer, I've written a fair number of 
programs that have worried a fair bit about the metadata in the filesystem; 
sometimes you want to read data, and sometimes metadata.  That's why the Draft 
IO spec specifies two separate objects.
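
The data/metadata split can be illustrated outside Perl 6. The following is only a rough Python sketch, not the Draft IO spec's API: a path-like object answers metadata questions without opening anything, while reading data needs an open handle. The file name example.txt is made up for the illustration.

```python
import os
import tempfile
from pathlib import Path

# Work in a throwaway directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    node = Path(tmp) / "example.txt"   # hypothetical file name
    node.write_bytes(b"hello\n")

    # Metadata: answered by the filesystem, the file is never opened here.
    info = node.stat()
    size = info.st_size
    readable = os.access(node, os.R_OK)

    # Data: needs a separate object, the open handle.
    with node.open() as handle:
        first_line = handle.readline()
```

Here `node` plays the part of the "outside" object and `handle` the part of the "inside" one.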


HTH,




Re: Filename literals

2009-08-14 Thread Timothy S. Nelson

On Fri, 14 Aug 2009, Timothy S. Nelson wrote:


On Thu, 13 Aug 2009, Hinrik Örn Sigurðsson wrote:


Imagine two roles, Filename and Dirname (or Path::File / Path::Dir). I


...or imagine just one, called IO::FSNode.


Sorry, I was stupiding again.  I'll ask you to imagine 4:

IO::FSNode
|
+-IO::FileNode
|
+-IO::DirectoryNode
|
+-IO::LinkNode

Role composition tree depicted above.
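
For readers who think better in code, here is a loose Python sketch of the same tree. Python has no roles, so plain subclassing stands in for role composition; the `node_for` helper and its dispatch on `lstat` are assumptions made for the illustration and are not part of S32/IO.

```python
import os
import stat
import tempfile


class FSNode:
    """Anything that has a name in the filesystem."""

    def __init__(self, path):
        self.path = path


class FileNode(FSNode):
    pass


class DirectoryNode(FSNode):
    pass


class LinkNode(FSNode):
    pass


def node_for(path):
    """Hypothetical factory: pick the node type from the entry's own mode."""
    mode = os.lstat(path).st_mode      # lstat does not follow symlinks
    if stat.S_ISLNK(mode):
        return LinkNode(path)
    if stat.S_ISDIR(mode):
        return DirectoryNode(path)
    return FileNode(path)


with tempfile.TemporaryDirectory() as tmp:
    kind_of_dir = type(node_for(tmp)).__name__
```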




Re: Filename literals

2009-08-14 Thread David Green

On 2009-Aug-14, at 5:36 am, Richard Hainsworth wrote:
Would it be possible to remove the special purpose of \ from strings  
within IO constructs?


It's P6, anything's possible!  I probably wouldn't change [what look  
like] ordinary quoted strings, but maybe something with a qf//-type  
construct (or would that be qf\\ ?!).  My idea of using a macro was to  
grab almost anything that wasn't whitespace, and it doesn't have to be  
parsed like a normal string, so \ could be interpreted as a dir  
separator.


Of course, / has become so standard, that it even works on Windows  
(kind of); on the other hand, being able to use either (or both) would  
be convenient for a lot of people.



On 2009-Aug-14, at 7:18 am, Leon Timmermans wrote:

I don't think that's a good idea. In general, parsing a URI isn't
that easy, in particular determining the end is undefined AFAIK. In
your example the semicolon should probably be considered part of the
URI, even though that's obviously not what you intended.


Well, we can encode a URI any way we like -- I was thinking of  
anything up to the next whitespace or semicolon, maybe allowing for  
balanced brackets; and internal semicolons, etc. being %-encoded.  I  
guess the argument would be that using an encoding that looks almost- 
but-not-quite like other popular ways of representing URIs could be  
confusing, and people would be tempted to paste in addresses from  
their browser without re-encoding them the P6 way.


Maybe it's more practical to permit only URIs with little to no  
punctuation to be unquoted, and quote anything else?  Not that quoting  
is such a great hardship anyway



On 2009-Aug-14, at 7:41 am, Timothy S. Nelson wrote:
	And in response to David Green and his comment about working with  
file data vs. metadata, as a systems  programmer, I've written a  
fair number of programs that have worried a fair bit about the  
metadata in the filesystem; sometimes you want to read data, and  
sometimes metadata.


Of course; and when I referred to low-level and high-level, there  
isn't really a distinct dividing line between the two.  Is getting (or  
setting) the modification date on a file low-level because it's  
metadata, or high-level because it's a simple, ordinary task?  I  
don't particularly care about the classification; I just wanted to  
make the point that P6 should make it possible to gloss over anything  
that's over-glossable.



-David



Re: Filename literals

2009-08-14 Thread Darren Duncan

Richard Hainsworth wrote:
Would it be possible to remove the special purpose of \ from strings 
within IO constructs?


This would mean '\' could be used in naming paths as an alternative to 
'/', thus allowing windows and unix strings to be equivalent, eg.
IO(:path{$root-path}/data/new) would be equivalent to 
IO(:path{$root-path}\data\new)


The usefulness would be most evident for sub-directories as windows and 
unix have different ways of describing root, viz. 'C:\' versus '/'


I see problems with this considering that \ is quite universally recognized in 
Perl (and many other languages) as meaning an escape character, and that 
moreover you generally need to be able to escape characters in any context 
building a string.


Considering that, AFAIK, practically any modern file system, including those 
used by Windows like NTFS, are Unicode savvy and can have any character in a 
file name, if \ is used literally to denote itself, then what is a simple clean 
way to denote other characters that would otherwise be denoted with an escape 
sequence?


I think it would be best, as well as preserving the principle of least surprise, 
if all of the same escaping syntaxes work universally across 
character-string-like contexts, which means that a literal \ means escaping.


The best compromise that I see is that Windows filenames can be spelled out as 
Windows people are used to, except that / is used instead of \, so that a 
Windows path begins with 'C:/', for example.


Or even if the '/' paradigm for root is used in Windows, which may actually be 
best, the drive letter or drive name still needs to be in the path somewhere so 
that multiple drives can be distinguished, for example, 'C:\' becomes '/C/'.
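
As a rough illustration of that last mapping, here is a Python sketch. The function name `to_rooted` is made up, and the handling of anything other than an absolute drive-letter path is an assumption, not something proposed in this thread.

```python
from pathlib import PureWindowsPath


def to_rooted(win_path: str) -> str:
    """Rewrite 'C:\\dir\\file' as '/C/dir/file' (hypothetical convention)."""
    p = PureWindowsPath(win_path)
    # Only absolute drive-letter paths are handled in this sketch.
    if not p.drive.endswith(":") or not p.root:
        raise ValueError("expected an absolute drive-letter path")
    letter = p.drive[:-1]              # 'C:' -> 'C'
    rest = p.parts[1:]                 # drop the 'C:\\' anchor
    return "/" + letter + "/" + "/".join(rest)


root = to_rooted("C:\\")
nested = to_rooted(r"C:\Windows\file")
```

With this convention `root` is '/C/' and `nested` is '/C/Windows/file'.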


Under Mac OS X, all drives, root or otherwise, are accessible under 
'/Volumes/drive-name/...', and Unix in general lets you mount drives anywhere. 
 I imagine Windows supports more ways of denoting drives than the drive letter, 
but either way I don't see a problem here.


-- Darren Duncan



Re: Filename literals

2009-08-14 Thread Mark J. Reed
On Fri, Aug 14, 2009 at 3:35 PM, Darren Duncan <dar...@darrenduncan.net> wrote:

 Under Mac OS X, all drives, root or otherwise, are accessible under
 '/Volumes/drive-name/...', and Unix in general lets you mount drives
 anywhere.  I imagine Windows supports more ways of denoting drives than the
 drive letter.

Nope.  Have to use the drive letter.  But / is understood as a synonym
for \ by the Windows API.


-- 
Mark J. Reed <markjr...@gmail.com>


Re: Filename literals

2009-08-14 Thread Brandon S. Allbery KF8NH

On Aug 14, 2009, at 16:17 , Mark J. Reed wrote:
On Fri, Aug 14, 2009 at 3:35 PM, Darren  
Duncan <dar...@darrenduncan.net> wrote:

Under Mac OS X, all drives, root or otherwise, are accessible under
'/Volumes/drive-name/...', and Unix in general lets you mount drives
anywhere.  I imagine Windows supports more ways of denoting drives than the
drive letter.


Nope.  Have to use the drive letter.  But / is understood as a synonym
for \ by the Windows API.



UNC drive specs should work as well, i.e. \\MYHOST\C\... (or swap / for
\).
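
A quick way to see both points (UNC prefixes, and / standing in for \) without a Windows box is Python's pathlib, which models Windows path syntax on any platform. This is only a sketch of the parsing rules; the trailing dir\file components are invented for the example.

```python
from pathlib import PureWindowsPath

# Drive-letter path, spelled with either separator.
back = PureWindowsPath(r"C:\Windows\file")
fwd = PureWindowsPath("C:/Windows/file")
same_drive_path = back == fwd

# UNC path: the \\host\share prefix plays the part of the drive.
unc = PureWindowsPath(r"\\MYHOST\C\dir\file")    # dir\file is made up
unc_fwd = PureWindowsPath("//MYHOST/C/dir/file")
unc_drive = unc.drive
same_unc_path = unc == unc_fwd
```

Both comparisons come out true, and `unc_drive` is the whole `\\MYHOST\C` prefix rather than a single letter.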

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH






Re: Filename literals

2009-08-14 Thread Leon Timmermans
On Fri, Aug 14, 2009 at 7:41 PM, David Green <david.gr...@telus.net> wrote:
 Well, we can encode a URI any way we like -- I was thinking of anything up
 to the next whitespace or semicolon, and internal semicolons, etc. being 
 %-encoded.

Semicolons are reserved characters in URIs: inappropriately
percent-encoding semicolons would be in direct violation of RFC 3986.
Using them as a delimiter would break many perfectly valid URIs. That's
absolutely a no-go IMNSHO.

Breaking up at whitespace should work, see appendix C of RFC 3986 for
recommendations on that.
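
To make the difference concrete, here is a small Python sketch. The URI and the surrounding text are invented for the example; the point is only that a semicolon can legitimately sit inside a path or query, so cutting at the first one truncates the URI, while cutting at whitespace does not.

```python
from urllib.parse import urlsplit

# Hypothetical source text containing an unquoted URI.
text = "fetch http://example.com/path;v=1/file?x=1;y=2 then stop"

# Delimiting at whitespace recovers the whole URI.
uri = next(tok for tok in text.split() if "://" in tok)

# Delimiting at the first semicolon cuts it short.
truncated = uri.split(";", 1)[0]

# The semicolons are ordinary parts of the path and the query.
parts = urlsplit(uri)
path, query = parts.path, parts.query
```

Here `truncated` ends at "http://example.com/path", while `path` and `query` both still contain their semicolons.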

 Maybe it's more practical to permit only URIs with little to no punctuation
 to be unquoted, and quote anything else?  Not that quoting is such a great
 hardship anyway

Maybe, but if I can't use it half of the time it may as well be
omitted. Quoting should be relatively easy, because URIs have a wide
range of characters that can't be in them anyway.

Leon


Re: Filename literals

2009-08-14 Thread Jan Ingvoldstad
I'll just butt in here and say that while the URI format is nice for
alternate schemes, it is not nice for accessing files.

The general case in most programming languages is to assume that a
non-URI file name is local, specifying
file://wherever/whatever/filename is unnecessary additional syntax.

Also, perhaps only URLs should be permitted; they do after all specify
a location.

I'm unsure whether this should be part of a central specification to
Perl 6 or part of a module.


I think I like Hinrik's original proposal.


Oh, and regarding file names in Windows, this document should be a
pretty definitive guide:

http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx
-- 
Jan


Re: Filename literals

2009-08-13 Thread Darren Duncan

Hinrik Örn Sigurðsson wrote:

I was wondering if there had been any discussion about how to type
file and directory names in Perl 6. I've read a couple of posts about
file test operators, where some have suggested making filenames
special, either as a subtype of Str or something else entirely. That
way Str wouldn't have all these file test methods, which is good
because not all strings are valid filenames.

<snip>

Considering that in the general case a file name can be any string at all, if it 
is going to have its own type at all, it should be disjoint from Str in the same 
manner that, say, Instant and Duration are disjoint from Num/Rat.  When I say 
disjoint, I mean conceptually that FileName, say, has an attribute of type Str 
rather than being defined as a subtype of Str. -- Darren Duncan
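
A minimal Python sketch of the "has a Str" arrangement, for illustration only: the class name FileName comes from the paragraph above, but the attribute name `text` and the `exists` method are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FileName:
    text: str                          # the name is held, not inherited

    def exists(self) -> bool:
        # File tests live on FileName, so plain strings never gain them.
        return os.path.exists(self.text)


name = FileName("some file.txt")       # hypothetical file name
is_a_string = isinstance(name, str)
strings_have_file_tests = hasattr("some file.txt", "exists")
```

Both `is_a_string` and `strings_have_file_tests` are false, which is the disjointness being asked for.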