Hi Brad, Tim --

You note that the proposal is argument-rich, but actually if you look at 
the options available for the totality of the software you're proposing 
to replace (basically find(1) or the -R option to ls(1), plus the file 
globbing and variable expansion of the shell) it's a little surprising 
how few options you've got here!

First, instead of yielding just filenames, I would have this yield 
(path, kind) tuples, where 'kind' can be any of the various kinds of 
entries that can occur in a POSIX filesystem: regular files, symbolic 
links, directories, sockets, named pipes, block special files, 
character special files, etc.
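
To make the (path, kind) idea concrete, here is a minimal sketch in 
Python rather than Chapel. The function names and the kind labels are 
made up for illustration; they are not part of the proposal.

```python
import os
import stat

def entry_kind(path):
    """Classify a path without following a final symbolic link."""
    mode = os.lstat(path).st_mode
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISDIR(mode):
        return "dir"
    if stat.S_ISREG(mode):
        return "file"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISCHR(mode):
        return "char"
    return "unknown"

def list_entries(directory):
    """Yield (path, kind) tuples for the entries of one directory."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # The path never carries a trailing slash; 'kind' says what it is.
        yield path, entry_kind(path)
```

A caller that only wants directories filters on kind == "dir" instead 
of inspecting the path string.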

You ask: Is there a better name for this iterator than glob?
There really ought to be, because 'glob' has a specific meaning: to 
match wildcarded filenames the way the shell does. What you're proposing 
here is a lot more like a modern find(1) than the old glob(1) command. 
It's a file hierarchy traverser that can do globbing as it goes. How 
about 'fileTraverse'?

You ask: Should we add a depth argument to limit the depth of the recursion?
Replace the boolean 'recursive' with an integer 'depth', and make its 
default value max(int). It's about as easy to use, but more general.
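
Roughly what I have in mind, sketched in Python rather than Chapel. 
The name 'traverse' is hypothetical, and sys.maxsize stands in for 
max(int).

```python
import os
import sys

def traverse(start, depth=sys.maxsize):
    """Yield paths under start, descending at most depth levels below it."""
    # depth=0 lists only start's own entries, i.e. the non-recursive case.
    for name in sorted(os.listdir(start)):
        path = os.path.join(start, name)
        yield path
        # Symbolic links are not followed here; that is a separate option.
        if depth > 0 and os.path.isdir(path) and not os.path.islink(path):
            yield from traverse(path, depth - 1)
```

With this, depth=0 behaves like recursive=false and the default behaves 
like recursive=true, so the boolean is no longer needed.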

You ask: Should we support an argument to control dropping the final 
slash when yielding directory names?
Don't yield a string with a trailing slash in that case. The 'kind' 
component of the yielded tuple can tell users whether this is a dir or 
something else.

You ask: Should we add a symlinks argument to say whether or not we 
should follow symbolic links?
Could be dangerous, as you note right after that. You probably should be 
able to follow symbolic links, though, so indeed you do need this.  Note 
that if this followLinks argument is true, the yielded 'kind' for a 
symbolic link should refer to the pointee of the link.
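
A sketch of that last rule, in Python rather than Chapel. The function 
name and kind labels are invented for illustration, and falling back to 
"symlink" for a dangling link is my assumption, not something the 
proposal specifies.

```python
import os
import stat

def kind_of(path, follow_links=False):
    """Return the kind of path; with follow_links, describe the pointee."""
    st = os.lstat(path)  # describes the link itself, never the pointee
    if follow_links and stat.S_ISLNK(st.st_mode):
        try:
            st = os.stat(path)  # follows the link to its pointee
        except OSError:
            pass  # dangling or looping link: keep describing the link
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISDIR(mode):
        return "dir"
    if stat.S_ISREG(mode):
        return "file"
    return "other"
```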

Given the above, you probably also need a crossFilesystemBoundaries 
argument, to tell whether symbolic links that point outside the current 
filesystem should be followed. This corresponds to (but is the opposite 
of) the -xdev option to find(1). The default should be false.
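
One way such a check could work is to compare device numbers, which is 
how I would expect an -xdev style test to be done. This is a Python 
sketch with hypothetical names, not the proposed Chapel interface.

```python
import os

def same_filesystem(start, path):
    """True if path (after following links) is on the same device as start."""
    return os.stat(path).st_dev == os.stat(start).st_dev

def should_descend(start, path, cross_filesystem_boundaries=False):
    """Decide whether a traversal rooted at start may descend into path."""
    if not os.path.isdir(path):
        return False
    # With the default of False, directories on other devices are skipped.
    return cross_filesystem_boundaries or same_filesystem(start, path)
```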

You ask: For recursive mode, should we support an argument indicating a 
subdirectory name to avoid recursively descending into?
No, because that's not general enough to be useful.  Either we should 
not provide this capability at all, or we should provide it in a fully 
general form: a list (array, whatever) of pruning rules, each giving a 
glob pattern for the paths to prune and saying whether that pattern 
should be matched against the whole current path or just the trailing 
(possibly multi-component) part of it.
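
For illustration, the general form could look something like this 
Python sketch. The (pattern, whole_path) rule format and the name 
'is_pruned' are my own invention. Note that Python's fnmatch lets '*' 
match '/' as well, which a real design might not want.

```python
import fnmatch

def is_pruned(path, rules):
    """Return True if any (pattern, whole_path) rule says to prune path."""
    parts = path.strip("/").split("/")
    for pattern, whole_path in rules:
        if whole_path:
            # Match the pattern against the entire current path.
            if fnmatch.fnmatchcase(path, pattern):
                return True
        else:
            # Match against as many trailing components as the pattern has.
            n = pattern.strip("/").count("/") + 1
            tail = "/".join(parts[-n:])
            if fnmatch.fnmatchcase(tail, pattern):
                return True
    return False
```

A traversal would call this on each directory before descending, for 
example with rules = [("*/.git", True), ("build/tmp*", False)].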

You ask: Is it reasonable for the parallel version of the iterator to 
not support sorting?
Yes, I think so. That said, I'd like to point out that a parallel 
version of this is going to be a bit difficult to write, because the 
underlying syscall support isn't there.  The only thing UNIX/POSIX 
provides for traversing filesystems is opendir(3)/readdir(3), and 
readdir() is inherently serial. A parallel version would probably have 
to do something like: handle each directory serially, and go parallel on 
the recursions.
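
As a rough sketch of that structure, in Python with threads standing in 
for Chapel tasks (all names here are hypothetical): each directory is 
read serially by one worker, and every subdirectory found becomes a new 
unit of work.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def list_one_directory(directory):
    """Read one directory serially; return (entries, subdirectories)."""
    entries, subdirs = [], []
    with os.scandir(directory) as it:
        for entry in it:
            entries.append(entry.path)
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
    return entries, subdirs

def parallel_traverse(start, workers=4):
    """Return all paths under start; the order is unspecified."""
    found = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = [pool.submit(list_one_directory, start)]
        while pending:
            entries, subdirs = pending.pop().result()
            found.extend(entries)
            # Go parallel on the recursions: one task per subdirectory.
            pending.extend(pool.submit(list_one_directory, d) for d in subdirs)
    return found
```

The results come back in no particular order, which is why I think 
dropping sorting from the parallel version is reasonable.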

You ask: Should we support a multi-locale version of the parallel version?
Given the limitations of the underlying syscall support (see previous), 
I'm not sure this makes sense. It can't help the case of millions of 
files in a single directory, though it could help the case of lots and 
lots of subdirectories in a directory. (Though maybe this could be a 
place where available capabilities can influence how people choose to 
store data that occupies lots and lots of files? If so, then a 
multi-locale version might make more sense.)

You ask: What should the iterator do by default in the event of 
special files like block special files, character special files, 
named pipes, or sockets?
I covered this above, with the recommendation of a 'kind' component in 
the yielded tuple.

greg


On 8/4/2014 5:41 PM, Brad Chamberlain wrote:
>
> Hi Chapel Users (and Developers) --
>
> Over the past month or so, Tim Zakian and I have been working on
> designing and prototyping a Chapel iterator for the standard library
> that generates file and directory names from a given start directory,
> similar to ls [-R], glob, wordexp, listdir, find, and the like.
>
> For those who are interested in weighing in on such a capability, we
> invite you to review the attached proposal and give us your feedback on
> what's being proposed and/or any of the open issues listed in the
> proposal.  We're very interested in your feedback and input.
>
> Our hope is to get this into the upcoming 1.10 release which gives us
> just over a month to settle on a design and beef up the prototypes to
> match it.
>
> Thanks very much,
> -Brad
>
>
> _______________________________________________
> Chapel-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/chapel-users
>
