Re: [Toybox] Add remaining pwd options

2013-01-13 Thread Felix Janda
Sorry for this repeated hair-splitting.

On 01/12/13 at 11:33pm, Rob Landley wrote:
> On 01/10/2013 02:25:13 PM, Felix Janda wrote:
> > On 01/02/13 at 12:41am, Rob Landley wrote:
> > > What I did was disable #3 in the case where cwd doesn't exist. So the
> > > new rule #3 is:
> > >
> > > 3) If cwd exists and $PWD doesn't point to it, fall back to -P.
> >
> > Thanks for the clarification.
> >
> > Your version of 3) depends on whether pwd is builtin or not. Do you mean
> > something like "If getcwd() fails ..."?
>
> cwd is what getcwd() returns. $PWD is an environment variable.

I wanted to differentiate between the current working directory and its
name. Doesn't an unlinked current working directory still exist for the
processes it's the cwd of? (I was wrong that 3) depends on whether pwd is
builtin or not, since child processes inherit cwd.)

> > > > BTW, in the case that one has deleted and recreated one's current
> > > > working directory one could also use "cd ." to get to the new
> > > > directory.
> > >
> > > Good to know. (This means the shell is special casing "." as well as
> > > "..". I need to read the susv4 shell stuff thoroughly, it's been
> > > years...)
> >
> > The susv4 page special cases "." and ".." a bit, but it seems to me only
> > in the $CDPATH handling. Ah, I see that you don't care about $CDPATH from
> > the about page.
>
> $CDPATH and $PWD are separate.

I just read

http://landley.net/toybox/about.html:
 And some things (like $CDPATH support in cd) await a good explanation
 of why to bother with them.

and interpreted it as a reluctance to implement $CDPATH support.

 
> > Then I think one can leave out step 5 on susv4's page on
> > cd, and "cd ." is no more special than "cd dir"; it does a chdir to $PWD/.
> > or $PWD/dir respectively and then updates $PWD to its canonical form. (and
> > modifies $OLDPWD also if necessary)
>
> Um, steps 4 and 8 are the ones that say "cd ." and ".." are special?

Step 4 means that $CDPATH shouldn't be taken into account when you do
something like cd ./dir or cd ../dir.

In Step 8 the usual formal process of simplifying a path (by removing dot
components and so on) is described.

Of course here "." and ".." are treated specially, but this treatment
affects only $PWD, since chdir("/some/dir/.") should do the same as
chdir("/some/dir").
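
The kind of purely textual simplification Step 8 talks about can be sketched
roughly as below. This is only an illustration under assumptions: the name
simplify_path() is made up here, it is not toybox or shell source, and it
ignores the symlink checks the standard also asks for. It never touches the
filesystem, which is why it can only affect the $PWD string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from toybox or any shell): lexically simplify
 * an absolute path in place, dropping "." components and resolving ".."
 * against the preceding component without consulting the filesystem.
 * Returns path, or NULL if path is not absolute. */
char *simplify_path(char *path)
{
  char *in = path, *out = path;

  if (*path != '/') return NULL;

  while (*in) {
    size_t len;

    while (*in == '/') in++;           /* skip runs of slashes */
    len = strcspn(in, "/");            /* length of the next component */
    if (!len) break;

    if (len == 1 && in[0] == '.') {
      /* "." names the directory we are already in: drop it */
    } else if (len == 2 && in[0] == '.' && in[1] == '.') {
      /* "..": back up over the previous component, stopping at the root */
      while (out > path && *--out != '/') continue;
    } else {
      *out++ = '/';
      memmove(out, in, len);
      out += len;
    }
    in += len;
  }
  if (out == path) *out++ = '/';       /* everything cancelled out: root */
  *out = 0;

  return path;
}
```

With this sketch "/some/dir/." becomes "/some/dir" and "/a/b/../c" becomes
"/a/c". If b were a symlink, the textual answer and the directory chdir()
actually lands in could differ, which is the logical versus physical split
being discussed.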

Step 9 looks like fun...

> > Another interesting situation is if your current directory /dir has been
> > moved to /olddir and say /dir has been recreated. Then "cd ." will move
> > you to the new directory whereas "cd $(pwd -P)" will preserve your cwd and
> > fix up $PWD. (at least for a shell behaving posixly correct)
>
> Preserving the cwd is what I wanted to do, yes.

> > Imagine the same situation but with /dir not being recreated after being
> > moved. Then "cd ." should fail according to susv4 since $PWD/. = /dir/.,
> > which does not exist. Would you like to have "cd ." behave the same as
> > "cd $(pwd)" in this case? Bash does this if not in POSIX mode. Busybox ash
> > doesn't do this and for some reason even "cd $(pwd)" fails.
>
> I want the great mass of existing shell scripts to work, which means
> reproducing historical behavior. Posix is (mostly) a reasonable
> consensus documentation of historical behavior.

Ok

Felix
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Add remaining pwd options

2013-01-13 Thread Rob Landley

On 01/13/2013 04:34:56 AM, Felix Janda wrote:
> Sorry for this repeated hair-splitting.

Eh, it happens. :)

I'm constantly mucking about in areas I'm brand new to (or haven't got  
the background for, or last messed with so long ago I've forgotten  
important things, or while massively distracted and sleep deprived...),  
so I'm wrong a lot. I just try to fix it when I notice.



> On 01/12/13 at 11:33pm, Rob Landley wrote:
> > On 01/10/2013 02:25:13 PM, Felix Janda wrote:
> > > On 01/02/13 at 12:41am, Rob Landley wrote:
> > > > What I did was disable #3 in the case where cwd doesn't exist. So the
> > > > new rule #3 is:
> > > >
> > > > 3) If cwd exists and $PWD doesn't point to it, fall back to -P.
> > >
> > > Thanks for the clarification.
> > >
> > > Your version of 3) depends on whether pwd is builtin or not. Do you mean
> > > something like "If getcwd() fails ..."?
> >
> > cwd is what getcwd() returns. $PWD is an environment variable.
>
> I wanted to differentiate between the current working directory and its
> name.


The kernel has two magic symlinks as part of each process's state:

1) "." which is set by chdir() and returned by getcwd().

2) "/" which is set by chroot() and not really returned by anything
because it's what other paths are explained relative _to_.


Inside the kernel, each points to a dentry, which is pinned by the  
reference so you don't have to worry about it going away. (It can be  
invalidated by deleting the dentry's attached inode, but I believe it's  
still around as a zombie until the reference count drops to zero. And  
there's a horrible magic syscall called pivot_root that iterates
through every process in the system and redirects the "." and "/" links
of every process from one of these to another, but that does horrible
latency spike locking.)


Each dentry has a ".." link, which is not a process attribute, but a
dentry attribute. (Or is it inode? The fact dentries aren't really
independent of an underlying inode is half the reason you can't
hardlink directories.) Anyway, ".." is implemented by following the
dentry parent pointer, with the exception that "/" pointing to the
current dentry is treated the same as the dentry parent pointer being
NULL. Yes, this means that if you go:


  mkdir("sub", 0755);
  chroot("sub");
  chdir("../../../../../../../../..");
  chroot(".");

You can escape a chroot. Moving the "/" symlink _under_ "." means the
".." test won't hit it, you see. There's no <= test here, just ==.


Anyway, given a dentry the kernel can traverse up to the root (either
equal to "/" or where the dentry parent pointer is NULL) to work out
the absolute path to this dentry, and since each dentry only has one
parent pointer there's only _one_ absolute path to a given dentry.
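
As a rough picture of that walk, here is a toy model in userspace C. It is
not kernel code: struct toy_dentry and toy_path() are invented for
illustration, and the real dentry and getcwd() logic handle far more cases.
The sketch follows parent pointers from a node up to a chosen root and
assembles the path from the end of the buffer backwards; a node whose
parent pointer is NULL before the root is reached has no path.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a kernel dentry; the real structure is far richer. */
struct toy_dentry {
  const char *name;
  struct toy_dentry *parent;   /* NULL at the top or for an unlinked entry */
};

/* Build the absolute path of d as seen from root by following parent
 * pointers, filling buf from the end. Returns a pointer into buf, or NULL
 * if the path does not fit or d cannot be reached from root. */
char *toy_path(struct toy_dentry *d, struct toy_dentry *root,
               char *buf, size_t size)
{
  char *p;

  if (!size) return NULL;
  p = buf + size;
  *--p = 0;

  if (d == root) {                       /* the root itself is just "/" */
    if (p == buf) return NULL;
    *--p = '/';
    return p;
  }

  while (d != root) {
    size_t len = strlen(d->name);

    if (!d->parent) return NULL;         /* ran out of parents before root */
    if ((size_t)(p - buf) < len + 1) return NULL;
    p -= len;
    memcpy(p, d->name, len);
    *--p = '/';
    d = d->parent;
  }

  return p;
}
```

Passing a different root gives the path as a chrooted process would see it,
and an entry with a NULL parent that is not the root yields no path at all,
which is roughly the deleted-directory case discussed in this thread.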


Does that help?


> Doesn't an unlinked current working directory still exist for the
> processes it's the cwd of?


You have a pointer to a zombie dentry, the parent pointer of which is  
NULL. It's been unlinked from the tree but won't be garbage collected  
until the reference count falls to zero. I'd guess the corresponding  
inode has been freed and thus the inode pointer is also NULL (thus  
freeing up actual disk space, unlike a filehandle to an open _file_),  
but I'd have to go look at the kernel source to know for sure.



> (I was wrong that 3) depends on whether pwd is
> builtin or not, since child processes inherit cwd.)


Child processes inherit environment variables too, but a child process
can't change the parent's attributes. (Ok, it could ptrace it but
that's HORRIBLE and we're not doing that. Sorry, reflexive action
anytime anyone, including me, says "you can't do X". There's usually a
bad way to do it. I have a black belt in bad ways to do things, and a
lot of experience in cleaning them up to look presentable. I do the
"don't ask questions, post errors" thing to _myself_ all the time.)


> > > > > BTW, in the case that one has deleted and recreated one's current
> > > > > working directory one could also use "cd ." to get to the new
> > > > > directory.
> > > >
> > > > Good to know. (This means the shell is special casing "." as well as
> > > > "..". I need to read the susv4 shell stuff thoroughly, it's been
> > > > years...)
> > >
> > > The susv4 page special cases "." and ".." a bit, but it seems to me only
> > > in the $CDPATH handling. Ah, I see that you don't care about $CDPATH from
> > > the about page.
> >
> > $CDPATH and $PWD are separate.
>
> I just read
>
> http://landley.net/toybox/about.html:


I read http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cd.html

(About the above URL: don't ask me why www.opengroup.org redirects to  
pubs.opengroup.org but just opengroup.org says the service is  
discontinued. The safe thing to do is probably just  
http://pubs.opengroup.org/onlinepubs/9699919799/download/ and use a  
file:// url on the local disk. That's what I do most of the time, and  
then have to dig up a public URL when I want to point somebody else at  
a page...)


> > And some things (like $CDPATH support in cd) await a good explanation
> > of why to bother with

Re: [Toybox] Add remaining pwd options

2012-12-31 Thread Felix Janda
On 12/30/12 at 05:47pm, Rob Landley wrote:
> On 12/30/2012 05:16:41 AM, Felix Janda wrote:
> > On 12/30/12 at 04:43am, Rob Landley wrote:
> > POSIX contains many surprises. In the section on environment variables it
> > says that $PWD should be set if pwd -P was specified. What happens if an
> > error happens seems unspecified.

Sorry, this is wrong. It has been changed between SUSv3 and SUSv4. Now pwd
must not change $PWD. (It would be really nice to have SUSv4 man pages...)

> Translation: pwd must be a shell builtin running within the shell's
> process ID, and cannot sanely be implemented any other way.
>
> It would be nice if they would just _identify_ these. I did a pass to
> find them (in the roadmap), but missed this.

I agree that it's sensible to have it as a builtin. I'm still not sure
whether an external implementation can't be sane, though.

Let's go back to the situation of a directory /dir deleted in a subshell.
What is then the path name of the current working directory of the shell?
(I'd say it's undefined.) Both getcwd() and stat("/dir") fail in this
situation for both the shell and external commands. Does the builtin pwd
have any advantage over the external pwd in making sure that $PWD is sane?

> > > Sigh. And the whole "PWD defaults to -P unless POSIXLY_CORRECT" thing
> > > above: while I'm sure that code is in there, it's not actually what
> > > it's doing here. Because GNU code is INSANE, and someone somewhere
> > > thought this tangle of corner cases might help somehow.
> > >
> > > Right, in the case of a deleted directory $PWD is all we've got, so
> > > have -L (which is the default) print it but first validate it's an
> > > absolute path with no .. in it. Only validate that current directory
> > > and path directory point to the same place if there IS a current
> > > directory. If that's not what they want, -P exists.
> >
> > In the corner case shouldn't pwd (-L and -P) just give an error message?
> > ($PWD does not contain an absolute pathname of the current working
> > directory.)
>
> If something deletes the directory you're working in, "cd .." should
> work if the directory above you exists. That can't happen if $PWD isn't
> there.

What exactly is the relation of this to the pwd command? "cd .." should
call chdir() with $PWD/.. after canonicalization. In contrast to pwd, cd
_has_ to be builtin since a chdir() in a child process won't affect the
parent shell.
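
A small way to see why cd has to be a builtin, sketched under assumptions:
the helper name below is invented for this example and a fixed 4096-byte
buffer stands in for proper path-length handling. The child calls chdir()
and exits, and the parent then compares its own getcwd() before and after.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo helper: chdir() in a forked child and report whether
 * the parent's working directory changed. Returns 1 if the parent's cwd is
 * unaffected, 0 if it changed, -1 on error. */
int child_chdir_leaves_parent_alone(const char *dir)
{
  char before[4096], after[4096];
  int status;
  pid_t pid;

  if (!getcwd(before, sizeof(before))) return -1;

  pid = fork();
  if (pid < 0) return -1;
  if (!pid) _exit(chdir(dir) ? 1 : 0);   /* child: change directory, exit */

  if (waitpid(pid, &status, 0) < 0) return -1;
  if (!WIFEXITED(status) || WEXITSTATUS(status)) return -1;
  if (!getcwd(after, sizeof(after))) return -1;

  return !strcmp(before, after);
}
```

Calling child_chdir_leaves_parent_alone("/") is expected to return 1: the
cwd is per-process state, so the child's chdir() is gone once it exits.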

> Also, when a directory gets deleted and recreated I do "cd $(pwd)" all
> the time. It's useful to still have pwd if some other process takes out
> the directory you're in.

Ok, I see that this is handy. Alternatively one could use "cd $PWD".
I find that this application really contradicts POSIX since here "." and
$PWD are completely different directories.


Your fun corner case is still strange. From playing around a bit, bash seems
to keep the PWD somewhere internally in addition to the environment variable
(pwd still works after unsetting $PWD). On the other hand pwd -P seems to
reset this internal state for some reason. Maybe it's a bug. dash also seems
to keep some internal state, but pwd still works after pwd -P has failed.

Felix


Re: [Toybox] Add remaining pwd options

2012-12-30 Thread Rob Landley

On 12/29/2012 07:38:24 AM, Felix Janda wrote:
> POSIX says that pwd should behave the same as pwd -L. The current
> pwd -P should behave the same way as the previous version of pwd. It
> just returns the getcwd() output. pwd -L does just check whether the
> environment variable PWD is also a valid current working directory and
> uses that instead of the output of getcwd() if that's the case.


Here's a fun corner case:

  $ cd
  $ mkdir fruit
  $ cd fruit
  $ (cd .. && rmdir fruit)
  $ ls -l
  total 0
  $ pwd
  /home/landley/fruit
  $ pwd -L
  /home/landley/fruit
  $ pwd -P
  pwd: error retrieving current directory: getcwd: cannot access parent
  directories: No such file or directory
  $ pwd -L
  pwd: error retrieving current directory: getcwd: cannot access parent
  directories: No such file or directory
  $ pwd
  pwd: error retrieving current directory: getcwd: cannot access parent
  directories: No such file or directory

The amount of magic inherent in that behavior is kind of mind-boggling.  
If you can't getcwd() then it's happy printing $PWD, until you call pwd  
-P and that somehow invalidates $PWD? (Which means pwd is totally a  
shell builtin because a child process can't persistently set an  
environment variable in the parent process).


Sigh. And the whole "PWD defaults to -P unless POSIXLY_CORRECT" thing
above: while I'm sure that code is in there, it's not actually what
it's doing here. Because GNU code is INSANE, and someone somewhere
thought this tangle of corner cases might help somehow.


Right, in the case of a deleted directory $PWD is all we've got, so  
have -L (which is the default) print it but first validate it's an  
absolute path with no .. in it. Only validate that current directory  
and path directory point to the same place if there IS a current  
directory. If that's not what they want, -P exists.
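
One possible reading of that rule as code, offered only as a sketch: the
function names are invented here, getcwd() failing is taken as the sign
that there is no usable current directory, and comparing st_dev as well as
st_ino is an assumption of this example (the patch in this thread compares
inode numbers only).

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if path is absolute and has no "." or ".." components. */
int pwd_looks_sane(const char *path)
{
  if (!path || *path != '/') return 0;
  while (*path) {
    size_t len;

    while (*path == '/') path++;
    len = strcspn(path, "/");
    if (len == 1 && path[0] == '.') return 0;
    if (len == 2 && path[0] == '.' && path[1] == '.') return 0;
    path += len;
  }

  return 1;
}

/* Hypothetical sketch of the -L decision: return env_pwd if it should be
 * printed, or NULL if the caller should fall back to -P behavior. */
const char *choose_logical(const char *env_pwd)
{
  char cwd[4096];
  struct stat dot, env;

  if (!pwd_looks_sane(env_pwd)) return NULL;
  /* No retrievable current directory: $PWD is all we've got. */
  if (!getcwd(cwd, sizeof(cwd))) return env_pwd;
  /* There is one: only trust $PWD if it names the same directory. */
  if (stat(".", &dot) || stat(env_pwd, &env)) return NULL;
  if (dot.st_dev != env.st_dev || dot.st_ino != env.st_ino) return NULL;

  return env_pwd;
}
```

A caller would print the returned string when it is non-NULL and otherwise
print the getcwd() result, or report an error if that fails too.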


Rob


Re: [Toybox] Add remaining pwd options

2012-12-30 Thread Rob Landley

On 12/30/2012 04:47:13 AM, Felix Janda wrote:
> Thanks for the various clarifications and making pwd -L check for dot
> and dot-dot as described in the standard.
>
> Looking at the POSIX man page toysh should set $PWD at some point, too.
>
> Right now we have


toysh is hugely incomplete and I just got it to segfault by playing  
with 'cd'.


After I deal with mount/umount/losetup I'm going to try to do a cleanup  
pass on it and actually start on environment variable support.


Alas, toysh was never nearly as finished as people seem to think it  
is...


Rob


Re: [Toybox] Add remaining pwd options

2012-12-29 Thread Rob Landley

On 12/28/2012 03:24:17 PM, Felix Janda wrote:
> Hi,
>
> the first patch adds the -L and -P options to pwd as specified by POSIX.
> The test script again uses stat. This time in order to get inode numbers
> of directories.


For future reference adding the test in the same commit as the changes  
being tested is probably ok.


I've applied this patch, but am going to have to take a closer look at  
it in the morning. (You added a -L option which... is a NOP? Huh, what  
posix specifies here is kind of insane, there's no way to get the raw  
getcwd() output. The -L stuff is all about $PWD, and if that doesn't  
have a valid value it falls back to -P which does a realpath() on the  
data to strip symlinks...? I need to read this when I'm more awake,  
this standard is written for a system that stores state different than  
linux. The current working directory is a process attribute used  
directly by the vfs, it's not an environment variable...)


I think the fix is to have -L _not_ be the default, and to have pwd  
return the raw getcwd() output when neither -L nor -P is specified...  
but that's a technical violation of posix...


Rob


Re: [Toybox] Add remaining pwd options

2012-12-29 Thread Rob Landley

On 12/29/2012 07:38:24 AM, Felix Janda wrote:
> On 12/29/12 at 03:53am, Rob Landley wrote:
> > On 12/28/2012 03:24:17 PM, Felix Janda wrote:
> > > Hi,
> > >
> > > the first patch adds the -L and -P options to pwd as specified by
> > > POSIX.
> > > The test script again uses stat. This time in order to get inode
> > > numbers of directories.
> >
> > For future reference adding the test in the same commit as the changes
> > being tested is probably ok.
>
> Ok.
>
> > I've applied this patch, but am going to have to take a closer look at
> > it in the morning. (You added a -L option which... is a NOP? Huh, what
> > posix specifies here is kind of insane, there's no way to get the raw
> > getcwd() output. The -L stuff is all about $PWD, and if that doesn't
> > have a valid value it falls back to -P which does a realpath() on the
> > data to strip symlinks...? I need to read this when I'm more awake,
> > this standard is written for a system that stores state different than
> > linux. The current working directory is a process attribute used
> > directly by the vfs, it's not an environment variable...)
> >
> > I think the fix is to have -L _not_ be the default, and to have pwd
> > return the raw getcwd() output when neither -L nor -P is specified...
> > but that's a technical violation of posix...
>
> POSIX says that pwd should behave the same as pwd -L.


Posix seems to believe that the PWD environment variable is where the
current directory is stored, which is not how Linux works. On linux,
getcwd() returns one of two process-specific vfs attributes (chdir()
sets "." and chroot() sets "/", and neither of those is an environment
variable). If you export PWD=/blah that's not the same as calling
chdir.
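
That difference is easy to check from C. The helper below is a made-up
illustration (its name and the 4096-byte buffers are assumptions of this
sketch): it overwrites $PWD and then asks the kernel where the process
actually is.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo helper: overwrite $PWD and check whether the kernel's
 * idea of the current directory moved. Returns 1 if getcwd() is unchanged,
 * 0 if it changed, -1 on error. */
int setting_pwd_keeps_cwd(const char *fake)
{
  char before[4096], after[4096];

  if (!getcwd(before, sizeof(before))) return -1;
  if (setenv("PWD", fake, 1)) return -1;     /* only the variable changes */
  if (!getcwd(after, sizeof(after))) return -1;

  return !strcmp(before, after);
}
```

setting_pwd_keeps_cwd("/blah") is expected to return 1, since setenv() only
edits the environment and never calls chdir().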



> The current
> pwd -P should behave the same way as the previous version of pwd. It
> just returns the getcwd() output.


Which is always an abspath. (I checked.)

I think what happens when you cd through a symlink is that the shell  
saves the path you descended into in $PWD, and then if you cd .. it  
chops off the last path component instead of actually dereferencing ..  
(which would wind up somewhere other than the directory you came from).


So pwd -L is showing you the shell's view of things (using the $PWD  
environment variable), and pwd -P is showing you the realpath(). And  
what this basically means is pwd is more or less a shell builtin, the  
standard just isn't EXPLAINING it clearly.



> pwd -L does just check whether the
> environment variable PWD is also a valid current working directory and
> uses that instead of the output of getcwd() if that's the case.

Posix goes on at some length about no "." or ".." in it. I added logic
to do this, but haven't checked it in yet.



> So according to POSIX we have:
>
> $ cd /tmp
> $ ln -s . a
> $ cd a
> $ export PWD=/tmp/a
> $ pwd
> /tmp/a
> $ pwd -P
> /tmp
>
> Actually at least bash seems to update PWD automatically so that the export
> statement is unnecessary.


Indeed, bash updates PWD. (Unless you assign it to something else or  
unset it.)



> It's maybe interesting to see what coreutils is doing. A fragment:


I never look at gnu source if I can avoid it. I sometimes run that  
stuff under strace, but mostly I just read the docs and work out tests.



>   /* POSIX requires a default of -L, but most scripts expect -P.  */
>   bool logical = (getenv ("POSIXLY_CORRECT") != NULL);

The rule is "Anything gnu does is a bad idea", and there are about as
many exceptions to that as any other rule.


I think I understand _why_ -L is doing that, and the sanity checks are  
so if somebody tries to futz around with pwd to point somewhere else  
(or in a way the shell wouldn't have set it), we discard it and give  
the abspath instead for security-ish reasons. But the user friendly  
path may have $HOME be a symlink with the abspath on /mnt/vol2 or  
something, and we want to default to giving the PWD the user actually  
remembers.


> The implementation of pwd -L could also use realpath instead of stat.

Stat's easier.

> Taking a further look at POSIX I think that the option string should be
> ">0LP[-LP]" instead of ">0LP[!LP]".


I already made that change locally. :)


> Felix
>
> > Rob

Rob


[Toybox] Add remaining pwd options

2012-12-28 Thread Felix Janda
Hi,

the first patch adds the -L and -P options to pwd as specified by POSIX.
The test script again uses stat. This time in order to get inode numbers
of directories.

Felix
# HG changeset patch
# User Felix Janda felix.ja...@posteo.de
# Date 1356627399 -3600
# Node ID 592dab5e536c053ac8b8696f368045f76c8a30b9
# Parent  017b8fd3c9ac5a86dd849831622c4878fddebe5d
Add options -L and -P to pwd.

diff -r 017b8fd3c9ac -r 592dab5e536c toys/posix/pwd.c
--- a/toys/posix/pwd.c	Wed Dec 26 19:39:51 2012 -0600
+++ b/toys/posix/pwd.c	Thu Dec 27 17:56:39 2012 +0100
@@ -3,26 +3,34 @@
  * Copyright 2006 Rob Landley r...@landley.net
  *
  * See http://opengroup.org/onlinepubs/9699919799/utilities/echo.html
- *
- * TODO: add -L -P
 
-USE_PWD(NEWTOY(pwd, NULL, TOYFLAG_BIN))
+USE_PWD(NEWTOY(pwd, ">0LP[!LP]", TOYFLAG_BIN))
 
 config PWD
   bool pwd
   default y
   help
-usage: pwd
+usage: pwd [-L|-P]
 
 The print working directory command prints the current directory.
+
+-P  Avoid all symlinks
+-L  Use the value of the environment variable PWD if valid
+
+The option -L is implied by default.
 */
 
+#define FOR_pwd
 #include "toys.h"
 
 void pwd_main(void)
 {
-  char *pwd = xgetcwd();
+  char *pwd = xgetcwd(), *env_pwd;
+  struct stat st[2];
 
-  xprintf("%s\n", pwd);
+  if (!(toys.optflags & FLAG_P) && (env_pwd = getenv("PWD")) &&
+    !stat(pwd, &st[0]) && !stat(env_pwd, &st[1]) &&
+    (st[0].st_ino == st[1].st_ino)) xprintf("%s\n", env_pwd);
+  else xprintf("%s\n", pwd);
   if (CFG_TOYBOX_FREE) free(pwd);
 }
# HG changeset patch
# User Felix Janda felix.ja...@posteo.de
# Date 1356729021 -3600
# Node ID f5b0f21ef92f73e13c3415d8449be86d9c531186
# Parent  dbf0480c88f4895724d719738c7d75ffc9f6c957
Add some tests for pwd.

diff --git a/scripts/test/pwd.test b/scripts/test/pwd.test
new file mode 100755
--- /dev/null
+++ b/scripts/test/pwd.test
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+[ -f testing.sh ] && . testing.sh
+
+#testing "name" "command" "result" "infile" "stdin"
+
+#TODO: Find better tests
+
+testing "pwd" "[ $(stat -c %i $(pwd)) = $(stat -c %i .) ] && echo yes" \
+	"yes\n" "" ""
+testing "pwd -P" "[ $(stat -c %i $(pwd -P)) = $(stat -c %i .) ] && echo yes" \
+	"yes\n" "" ""
+
+
+ln -s . sym
+cd sym
+testing "pwd" "[ $(stat -c %i $(pwd)) = $(stat -c %i $PWD) ] && echo yes" \
+	"yes\n" "" ""
+testing "pwd -P" "[ $(stat -c %i $(pwd -P)) = $(stat -c %i $PWD) ] || echo yes" \
+	"yes\n" "" ""
+cd ..
+rm sym
+
+export PWD=walrus
+testing "pwd (bad PWD)" "[ $(pwd) = $(cd . ; pwd) ] && echo yes" \
+	"yes\n" "" ""