Re: [PATCH v3 00/26] inotify support
On Mon, Feb 10, 2014 at 3:19 AM, Torsten Bögershausen wrote: > > On 2014-02-08 09.53, Duy Nguyen wrote: file-watcher.c | 32 1 file changed, 32 insertions(+) >>> >>> I feel a little bit unsure about the 700. >>> Most often Git does not care about permissions, >>> and relies on umask being set appropriatly. >>> (Please correct me if I'm wrong) >> >>Git does care. See credential-cache--daemon.c. In fact this function >>is a copy of check_socket_directory() from that file. >> > I was probably a little bit unclear. > Of course credentials should be protected well and stored with 700. > The rest of the repo could be more loose by using adjust_shared_perm(). > Because the whole repo can be shared (or not) and data is visible > to the group or everyone. > (this is a minor issue) So how about a check whenever a worktree is connected to the daemon, if that worktree has stricter permission, e.g. 0700 vs 0770 of the daemon socket directory, then the daemon refuses the worktree (maybe with a warning)? -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/26] inotify support
On Mon, Feb 10, 2014 at 11:55 PM, Torsten Bögershausen wrote: > On 2014-02-10 11.37, Duy Nguyen wrote: >>> >>> Could we use relative path names internally, relative to $GIT_DIR ? >> >> No because this is when the client tell the server about $GIT_DIR. I >> guess we can use realpath(1) here. > Good. > > I realized that the watcher can watch several repos at the same time. > > However, we could allow relative path names, which will be relative to > $SOCKET_DIR, > and loosen the demand for an absolut path name a little bit. > And $SOCKET_DIR can be the same as $GIT_DIR, when we are watching only one > repo. It does not help much anyway because file-watcher-lib.c sends get_git_work_tree(), which is absolute/normalized path, to file-watcher. There's no sources of sending $GIT_DIR relative to $SOCKET_DIR (and I don't think we want to make get_git_work_tree() relative before sending, it's more work on both sides we no benefits, except for tracing). -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/26] inotify support
On 2014-02-10 11.37, Duy Nguyen wrote: >> >> Could we use relative path names internally, relative to $GIT_DIR ? > > No because this is when the client tell the server about $GIT_DIR. I > guess we can use realpath(1) here. Good. I realized that the watcher can watch several repos at the same time. However, we could allow relative path names, which will be relative to $SOCKET_DIR, and loosen the demand for an absolut path name a little bit. And $SOCKET_DIR can be the same as $GIT_DIR, when we are watching only one repo. > If you want I can update test-file-watcher to accept "send<" and > "recv>" instead of "<" and ">", respectively. The only command with > the same name for response and request is "hello". I can make it > "hello" and "helloack" (or "bonjour" as response?). helloack looks good. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/26] inotify support
On Mon, Feb 10, 2014 at 3:19 AM, Torsten Bögershausen wrote: > Please see filewatcher.c: > + if (daemon) { > + int err; > + strbuf_addf(&sb, "%s/log", socket_path); > + err = open(sb.buf, O_CREAT | O_TRUNC | O_WRONLY, 0600); > + adjust_shared_perm(sb.buf); > (And now we talk about the logfile: > "In daemon mode, stdout and stderr are saved in $WATCHER/log." > It could be nice to make this feature configrable, > either XXX/log or /dev/null. > On the long run we may eat to much disc space on a machine. > The other thing is that we may want to seperate stdout > from stderr, but even this is a low prio comment. I probably should follow git-daemon and put these to syslog. > > There is a small issue when I tested on a machine, > where the "data directory" called "daten" is softlinked to another disk: > daten -> /disk3/home2/tb/daten > > and the "projects" directory is softlinked to "daten/projects" > projects -> daten/projects/ > > t7514 fails like this: > --- expect 2014-02-08 14:37:07.0 + > +++ actual 2014-02-08 14:37:07.0 + > @@ -1,6 +1,6 @@ > packet: git> hello > packet: git< hello > -packet: git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 > /home/tb/projects/git/tb/t/trash directory.t7514-file-watcher-lib > +packet: git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 > /disk3/home2/tb/daten/projects/git/tb/t/trash directory.t7514-file-watcher-lib > > Could we use relative path names internally, relative to $GIT_DIR ? No because this is when the client tell the server about $GIT_DIR. I guess we can use realpath(1) here. > --- > Another thing: > Compiling under Mingw gives this: > LINK git-credential-store.exe > libgit.a(file-watcher-lib.o): In function `connect_watcher': > c:\Dokumente und Einstellungen\tb\projects\git\tb/file-watcher-lib.c:21: > undefined reference to `unix_stream_connect' > collect2: ld returned 1 exit status > make: *** [git-credential-store.exe] Error 1 > > We may need a compiler option like HAS_UNIX_SOCKETS or so. I'll make unix-socket.o build unconditionally and return error at runtime. > -- > +++ b/file-watcher.c > > +#define INOTIFY_MASKS (IN_DELETE_SELF | IN_MOVE_SELF | \ > + IN_CREATE | IN_ATTRIB | IN_DELETE | IN_MODIFY | \ > + IN_MOVED_FROM | IN_MOVED_TO) > This feels confusing: > a) we have inotify_masks with lower case below. > b) how about INOTIFY_NEEDED_BITS ? > --- OK > I'm OK with having the protocol having specified in the > test cases. > One thing that I have on the wish list is to make the > commands/responses more unique, to be able to run grep > on the code base. > > One idea could be to use a prefix > "fwr" for "file watcher request" or > "fwr" for "file watcher response". > This does not work, hihi, so > > "fwq" for "file watcher reQuest" and > "fwe" for "file watcher rEsponse". > Or > "ffw" as "from file watcher" and > "tfw" as "to file watcher" for the people who have problems > with left and right, < and > could work. If you want I can update test-file-watcher to accept "send<" and "recv>" instead of "<" and ">", respectively. The only command with the same name for response and request is "hello". I can make it "hello" and "helloack" (or "bonjour" as response?). -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/26] inotify support
On 2014-02-08 09.53, Duy Nguyen wrote: > Thanks for the comments. I can see I now have some work to do in the > coming weeks :) > >>> file-watcher.c | 32 >>> 1 file changed, 32 insertions(+) >> >> I feel a little bit unsure about the 700. >> Most often Git does not care about permissions, >> and relies on umask being set appropriatly. >> (Please correct me if I'm wrong) > >Git does care. See credential-cache--daemon.c. In fact this function >is a copy of check_socket_directory() from that file. > I was probably a little bit unclear. Of course credentials should be protected well and stored with 700. The rest of the repo could be more loose by using adjust_shared_perm(). Because the whole repo can be shared (or not) and data is visible to the group or everyone. (this is a minor issue) Please see filewatcher.c: + if (daemon) { + int err; + strbuf_addf(&sb, "%s/log", socket_path); + err = open(sb.buf, O_CREAT | O_TRUNC | O_WRONLY, 0600); + adjust_shared_perm(sb.buf); (And now we talk about the logfile: "In daemon mode, stdout and stderr are saved in $WATCHER/log." It could be nice to make this feature configrable, either XXX/log or /dev/null. On the long run we may eat to much disc space on a machine. The other thing is that we may want to seperate stdout from stderr, but even this is a low prio comment. There is a small issue when I tested on a machine, where the "data directory" called "daten" is softlinked to another disk: daten -> /disk3/home2/tb/daten and the "projects" directory is softlinked to "daten/projects" projects -> daten/projects/ t7514 fails like this: --- expect 2014-02-08 14:37:07.0 + +++ actual 2014-02-08 14:37:07.0 + @@ -1,6 +1,6 @@ packet: git> hello packet: git< hello -packet: git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 /home/tb/projects/git/tb/t/trash directory.t7514-file-watcher-lib +packet: git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 /disk3/home2/tb/daten/projects/git/tb/t/trash directory.t7514-file-watcher-lib Could we use relative path names internally, relative to $GIT_DIR ? --- Another thing: Compiling under Mingw gives this: LINK git-credential-store.exe libgit.a(file-watcher-lib.o): In function `connect_watcher': c:\Dokumente und Einstellungen\tb\projects\git\tb/file-watcher-lib.c:21: undefined reference to `unix_stream_connect' collect2: ld returned 1 exit status make: *** [git-credential-store.exe] Error 1 We may need a compiler option like HAS_UNIX_SOCKETS or so. -- +++ b/file-watcher.c +#define INOTIFY_MASKS (IN_DELETE_SELF | IN_MOVE_SELF | \ + IN_CREATE | IN_ATTRIB | IN_DELETE | IN_MODIFY | \ + IN_MOVED_FROM | IN_MOVED_TO) This feels confusing: a) we have inotify_masks with lower case below. b) how about INOTIFY_NEEDED_BITS ? --- I'm OK with having the protocol having specified in the test cases. One thing that I have on the wish list is to make the commands/responses more unique, to be able to run grep on the code base. One idea could be to use a prefix "fwr" for "file watcher request" or "fwr" for "file watcher response". This does not work, hihi, so "fwq" for "file watcher reQuest" and "fwe" for "file watcher rEsponse". Or "ffw" as "from file watcher" and "tfw" as "to file watcher" for the people who have problems with left and right, < and > could work. This is all for today. I will have a look at different error scenarios, what happens when the watcher crashes and needs to be restarted, or when Git itself dies with a segfault and doesn't tell the watcher. The easiest way to simulate this would be in terms of test cases. So I will try to write some /Torsten -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/26] inotify support
Thanks for the comments. I can see I now have some work to do in the coming weeks :) On Sat, Feb 8, 2014 at 3:04 PM, Torsten Bögershausen wrote: > I would appreciate if we could have an outline of the protocol > as a seperate "document" somewhere, to be able to have a look at the protocol > first, before looking into the code. My fear is document becomes outdated because people don't always remember to update doc when they change code. Which is why I embed the protocol as comments in handle_command() function. If the final version [1] is still not easy to read, I'll split the protocol out into a separate document. [1] https://github.com/pclouds/git/blob/file-watcher/file-watcher.c#L672 > (Am I using wireshark too much to dream about a dissector ?) Try GIT_TRACE_PACKET=2 > 1) > write_in_full_timeout() > packet_read_line_timeout() > At other places we handle EINTR after calling poll(). > Looking at the code, it could be easier to introduce > a new function xpoll() in wrapper.c, and use that instead > of poll(). Yeah there are already 4 poll call sites before file-watcher jumps in. Will do. > 2) > Similar for the usage of accept(). > I like the idea of xread() xwrite() and all the x functions > so it coud make sense to establish xpoll() and xaccept() > before inotify suppport. OK > 3) >> -int unix_stream_listen(const char *path) >> +int unix_stream_listen(const char *path, int replace) >> { >> int fd, saved_errno; >> struct sockaddr_un sa; >> @@ -103,7 +103,8 @@ int unix_stream_listen(const char *path) >> return -1; >> fd = unix_stream_socket(); >> >> - unlink(path); >> + if (replace) >> + unlink(path); > > Minor remark: > As we do not do the replace here: > s/replace/un_link/ may be ? Heh, I thought of using the name "unlink" but it's taken so I chose "replace" and did not think of underscore.Will do. > 4) >> +++ b/file-watcher.c > {} >> +static const char *const file_watcher_usage[] = { >> + N_("git file-watcher [options] "), >> + NULL >> +}; > Do we already have options here? > I can think about having > -d daemoniye > -u uses Unix doain socket > (And later -t to use a TCP/IP socket, when the repo > is on a mounted NFS (or SAMBA) drive, and the daemon is on a > different machine. > I don't say this patch should include this logic in first round, > But I can see a gain for this kind of setup) Later on we have two, --detach (i.e. daemonize, I reuse the same name from git-daemon) and --check-support. Transport settings (like unix vs tcp/ip...) should be config options though. You don't want to specify it here, then again when you run "git status". Actually I should add --default, that will retrieve from config key filewatcher.path, so the user does not have to specify it.. > 5) >> +++ b/file-watcher.c > [] >> +static int shutdown_connection(int id) >> +{ >> + struct connection *conn = conns[id]; >> + conns[id] = NULL; >> + pfd[id].fd = -1; >> + if (!conn) >> + return 0; >> + close(conn->sock); >> + free(conn); >> + return 0; >> +} > The function is called shutdown_connection(), but it does a close() > Could it be named close_connection() ? Yes, there was a close_connection() which did something similar, and then it was killed off. Will rename. > 6) >> +++ b/file-watcher.c > [] > Do we have a sequence chart about the command flow between the watcher > daemon and the client ? I suggest you have a look at the file-watcher test. For example, from [2] we have this sequence hello inconsistent # Do not call inotify_add_watch() test mode on # First batch should be all ok <" the opposite direction. But you can always obtain a real flow with GIT_TRACE_PACKET=2 (path lists not available though, so really just a flow). [2] https://github.com/pclouds/git/blob/file-watcher/t/t7513-file-watcher.sh#L273 > in 03/26: >>This version is also gentler than its friend packet_read_line() > gentler, what does this mean? No dying for whatever error. packet_read_line is designed for git protocol. It's the main job so if there's an error, dying is the right thing to do. file-watcher on the other hand is a side job and should not stop whatever command from doing. Will update commit message. >>because it's designed for side channel I/O that should not abort the >>program even if the channel is broken. > I'm not so familar with side-channel I/O. How does it fit in here? To sum up, whatever error in communication with file-watcher should not stop you from doing whatever you're doing. file-watcher is contacted whenever $GIT_DIR/index is read, so "whatever you're doing" is basically all git commands that involve worktree or index. > Does this make sense: > In opposite to packet_read_line() which can call die() > to abort the program, read_in_full_timeout() will keep the program running. > (or something like this) Exactly! >>diff --git a/cache.h b/cache.h >>index 718e32b..939db46 100644 >>--
Re: [PATCH v3 00/26] inotify support
On 03.02.14 05:28, Nguyễn Thái Ngọc Duy wrote: I managed to review the code 0..12/26, so some parts are missing. The list below became longer than what I intended, my comments may be hard to read, and there is a mixture of minor and major remarks. I would appreciate if we could have an outline of the protocol as a seperate "document" somewhere, to be able to have a look at the protocol first, before looking into the code. (Am I using wireshark too much to dream about a dissector ?) All in all I like the concept, thanks for the work. 1) write_in_full_timeout() packet_read_line_timeout() At other places we handle EINTR after calling poll(). Looking at the code, it could be easier to introduce a new function xpoll() in wrapper.c, and use that instead of poll(). 2) Similar for the usage of accept(). I like the idea of xread() xwrite() and all the x functions so it coud make sense to establish xpoll() and xaccept() before inotify suppport. 3) > -int unix_stream_listen(const char *path) > +int unix_stream_listen(const char *path, int replace) > { > int fd, saved_errno; > struct sockaddr_un sa; > @@ -103,7 +103,8 @@ int unix_stream_listen(const char *path) > return -1; > fd = unix_stream_socket(); > > - unlink(path); > + if (replace) > + unlink(path); Minor remark: As we do not do the replace here: s/replace/un_link/ may be ? 4) > +++ b/file-watcher.c {} > +static const char *const file_watcher_usage[] = { > + N_("git file-watcher [options] "), > + NULL > +}; Do we already have options here? I can think about having -d daemoniye -u uses Unix doain socket (And later -t to use a TCP/IP socket, when the repo is on a mounted NFS (or SAMBA) drive, and the daemon is on a different machine. I don't say this patch should include this logic in first round, But I can see a gain for this kind of setup) 5) > +++ b/file-watcher.c [] > +static int shutdown_connection(int id) > +{ > + struct connection *conn = conns[id]; > + conns[id] = NULL; > + pfd[id].fd = -1; > + if (!conn) > + return 0; > + close(conn->sock); > + free(conn); > + return 0; > +} The function is called shutdown_connection(), but it does a close() Could it be named close_connection() ? 6) > +++ b/file-watcher.c [] Do we have a sequence chart about the command flow between the watcher daemon and the client ? 7) in 03/26: >This version is also gentler than its friend packet_read_line() gentler, what does this mean? >because it's designed for side channel I/O that should not abort the >program even if the channel is broken. I'm not so familar with side-channel I/O. How does it fit in here? Does this make sense: In opposite to packet_read_line() which can call die() to abort the program, read_in_full_timeout() will keep the program running. (or something like this) > >Signed-off-by: Nguyễn Thái Ngọc Duy >--- > cache.h| 1 + > pkt-line.c | 35 +++ > pkt-line.h | 1 + > wrapper.c | 21 + > 4 files changed, 58 insertions(+) > >diff --git a/cache.h b/cache.h >index 718e32b..939db46 100644 >--- a/cache.h >+++ b/cache.h >@@ -1230,6 +1230,7 @@ extern int write_or_whine_pipe(int fd, const void *buf, >size_t count, const char > extern void fsync_or_die(int fd, const char *); > > extern ssize_t read_in_full(int fd, void *buf, size_t count); >+extern ssize_t read_in_full_timeout(int fd, void *buf, size_t count, int >timeout); > extern ssize_t write_in_full(int fd, const void *buf, size_t count); > extern ssize_t write_in_full_timeout(int fd, const void *buf, size_t count, > int timeout); > static inline ssize_t write_str_in_full(int fd, const char *str) >diff --git a/pkt-line.c b/pkt-line.c >index cf681e9..5a07e97 100644 >--- a/pkt-line.c >+++ b/pkt-line.c >@@ -229,3 +229,38 @@ char *packet_read_line_buf(char **src, size_t *src_len, >int *dst_len) > { > return packet_read_line_generic(-1, src, src_len, dst_len); > } >+ In what is the timeout measured ? seconds, milli years? As we use poll() I think it is milli seconds. (I like the idea of naming timeout timeout_ms) [] >+ len -= 4; >+ if (len >= buf_len) { >+ error("protocol error: bad line length %d", len); >+ return NULL; >+ } >+ if ((ret = read_in_full_timeout(fd, buf, len, timeout)) < 0) >+ return NULL; Do we want a packet_trace here? When a timeout occurs, do we want to close the connection, marking it as dead? Or need to look at errno? >+ buf[len] = '\0'; >+ if (len_p) >+ *len_p = len; >+ packet_trace(buf, len, 0); >+ return buf; >+} >diff --git a/pkt-line.h b/pkt-line.h >index 4b93a0c..d47dca5 100644 >--- a/pkt-line.h >+++ b/pkt-line.h >@@ -69,6 +69,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len, >char > * packet is written to it. > */ > char *packet_read_line(int fd, int *siz