Re: [PATCH v3 00/26] inotify support

2014-02-17 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 3:19 AM, Torsten Bögershausen  wrote:
>
> On 2014-02-08 09.53, Duy Nguyen wrote:
 file-watcher.c | 32 
 1 file changed, 32 insertions(+)
>>>
>>> I feel a little bit unsure about the 700.
>>> Most often Git does not care about permissions,
>>> and relies on umask being set appropriatly.
>>> (Please correct me if I'm wrong)
>>
>>Git does care. See credential-cache--daemon.c. In fact this function
>>is a copy of check_socket_directory() from that file.
>>
> I was probably a little bit unclear.
> Of course credentials should be protected well and stored with 700.
> The rest of the repo could be more loose by using adjust_shared_perm().
> Because the whole repo can be shared (or not) and data is visible
> to the group or everyone.
> (this is a minor issue)

So how about a check whenever a worktree is connected to the daemon,
if that worktree has stricter permission, e.g. 0700 vs 0770 of the
daemon socket directory, then the daemon refuses the worktree (maybe
with a warning)?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-10 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 11:55 PM, Torsten Bögershausen  wrote:
> On 2014-02-10 11.37, Duy Nguyen wrote:
>>>
>>> Could we use relative path names internally, relative to $GIT_DIR ?
>>
>> No because this is when the client tell the server about $GIT_DIR. I
>> guess we can use realpath(1) here.
> Good.
>
> I realized that the watcher can watch several repos at the same time.
>
> However, we could allow relative path names, which will be relative to 
> $SOCKET_DIR,
> and loosen the demand for an absolut path name a little bit.
> And $SOCKET_DIR can be the same as $GIT_DIR, when we are watching only one 
> repo.

It does not help much anyway because file-watcher-lib.c sends
get_git_work_tree(), which is absolute/normalized path, to
file-watcher. There's no sources of sending $GIT_DIR relative to
$SOCKET_DIR (and I don't think we want to make get_git_work_tree()
relative before sending, it's more work on both sides we no benefits,
except for tracing).
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-10 Thread Torsten Bögershausen
On 2014-02-10 11.37, Duy Nguyen wrote:
>>
>> Could we use relative path names internally, relative to $GIT_DIR ?
> 
> No because this is when the client tell the server about $GIT_DIR. I
> guess we can use realpath(1) here.
Good.

I realized that the watcher can watch several repos at the same time.

However, we could allow relative path names, which will be relative to 
$SOCKET_DIR,
and loosen the demand for an absolut path name a little bit.
And $SOCKET_DIR can be the same as $GIT_DIR, when we are watching only one repo.
> If you want I can update test-file-watcher to accept "send<" and
> "recv>" instead of "<" and ">", respectively. The only command with
> the same name for response and request is "hello". I can make it
> "hello" and "helloack" (or "bonjour" as response?).

helloack looks good. 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-10 Thread Duy Nguyen
On Mon, Feb 10, 2014 at 3:19 AM, Torsten Bögershausen  wrote:
> Please see filewatcher.c:
> +   if (daemon) {
> +   int err;
> +   strbuf_addf(&sb, "%s/log", socket_path);
> +   err = open(sb.buf, O_CREAT | O_TRUNC | O_WRONLY, 0600);
> +   adjust_shared_perm(sb.buf);
> (And now we talk about the logfile:
> "In daemon mode, stdout and stderr are saved in $WATCHER/log."
> It could be nice to make this feature configrable,
> either XXX/log or /dev/null.
> On the long run we may eat to much disc space on a machine.
> The other thing is that we may want to seperate stdout
> from stderr, but even this is a low prio comment.

I probably should follow git-daemon and put these to syslog.

> 
> There is a small issue when I tested on a machine,
> where the "data directory" called "daten" is softlinked to another disk:
> daten -> /disk3/home2/tb/daten
>
> and the "projects" directory is softlinked to "daten/projects"
> projects -> daten/projects/
>
> t7514 fails like this:
> --- expect  2014-02-08 14:37:07.0 +
> +++ actual  2014-02-08 14:37:07.0 +
> @@ -1,6 +1,6 @@
>  packet:  git> hello
>  packet:  git< hello
> -packet:  git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 
> /home/tb/projects/git/tb/t/trash directory.t7514-file-watcher-lib
> +packet:  git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 
> /disk3/home2/tb/daten/projects/git/tb/t/trash directory.t7514-file-watcher-lib
>
> Could we use relative path names internally, relative to $GIT_DIR ?

No because this is when the client tell the server about $GIT_DIR. I
guess we can use realpath(1) here.

> ---
> Another thing:
> Compiling under Mingw gives this:
> LINK git-credential-store.exe
> libgit.a(file-watcher-lib.o): In function `connect_watcher':
> c:\Dokumente und Einstellungen\tb\projects\git\tb/file-watcher-lib.c:21: 
> undefined reference to `unix_stream_connect'
> collect2: ld returned 1 exit status
> make: *** [git-credential-store.exe] Error 1
>
> We may need a compiler option like HAS_UNIX_SOCKETS or so.

I'll make unix-socket.o build unconditionally and return error at runtime.

> --
> +++ b/file-watcher.c
>
> +#define INOTIFY_MASKS (IN_DELETE_SELF | IN_MOVE_SELF | \
> +  IN_CREATE | IN_ATTRIB | IN_DELETE | IN_MODIFY |  \
> +  IN_MOVED_FROM | IN_MOVED_TO)
> This feels confusing:
> a) we have inotify_masks with lower case below.
> b) how about INOTIFY_NEEDED_BITS ?
> ---

OK

> I'm OK with having the protocol having specified in the
> test cases.
> One thing that I have on the wish list is to make the
> commands/responses more unique, to be able to run grep
> on the code base.
>
> One idea could be to use a prefix
> "fwr" for "file watcher request" or
> "fwr" for "file watcher response".
> This does not work, hihi, so
>
> "fwq" for "file watcher reQuest" and
> "fwe" for "file watcher rEsponse".
> Or
> "ffw" as "from file watcher" and
> "tfw" as "to file watcher" for the people who have problems
> with left and right, < and > could work.

If you want I can update test-file-watcher to accept "send<" and
"recv>" instead of "<" and ">", respectively. The only command with
the same name for response and request is "hello". I can make it
"hello" and "helloack" (or "bonjour" as response?).
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-09 Thread Torsten Bögershausen

On 2014-02-08 09.53, Duy Nguyen wrote:
> Thanks for the comments. I can see I now have some work to do in the
> coming weeks :)
> 



>>> file-watcher.c | 32 
>>> 1 file changed, 32 insertions(+)
>>
>> I feel a little bit unsure about the 700.
>> Most often Git does not care about permissions,
>> and relies on umask being set appropriatly.
>> (Please correct me if I'm wrong)
>
>Git does care. See credential-cache--daemon.c. In fact this function
>is a copy of check_socket_directory() from that file.
>
I was probably a little bit unclear.
Of course credentials should be protected well and stored with 700.
The rest of the repo could be more loose by using adjust_shared_perm().
Because the whole repo can be shared (or not) and data is visible
to the group or everyone.
(this is a minor issue)

Please see filewatcher.c:
+   if (daemon) {
+   int err;
+   strbuf_addf(&sb, "%s/log", socket_path);
+   err = open(sb.buf, O_CREAT | O_TRUNC | O_WRONLY, 0600);
+   adjust_shared_perm(sb.buf);
(And now we talk about the logfile:
"In daemon mode, stdout and stderr are saved in $WATCHER/log."
It could be nice to make this feature configrable,
either XXX/log or /dev/null.
On the long run we may eat to much disc space on a machine.
The other thing is that we may want to seperate stdout
from stderr, but even this is a low prio comment.



There is a small issue when I tested on a machine,
where the "data directory" called "daten" is softlinked to another disk:
daten -> /disk3/home2/tb/daten

and the "projects" directory is softlinked to "daten/projects"
projects -> daten/projects/

t7514 fails like this:
--- expect  2014-02-08 14:37:07.0 +
+++ actual  2014-02-08 14:37:07.0 +
@@ -1,6 +1,6 @@
 packet:  git> hello
 packet:  git< hello
-packet:  git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 
/home/tb/projects/git/tb/t/trash directory.t7514-file-watcher-lib
+packet:  git> index 6cb9741eee29ca02c5b79e9c0bc647bcf47ce948 
/disk3/home2/tb/daten/projects/git/tb/t/trash directory.t7514-file-watcher-lib

Could we use relative path names internally, relative to $GIT_DIR ?


---
Another thing:
Compiling under Mingw gives this:
LINK git-credential-store.exe
libgit.a(file-watcher-lib.o): In function `connect_watcher':
c:\Dokumente und Einstellungen\tb\projects\git\tb/file-watcher-lib.c:21: 
undefined reference to `unix_stream_connect'
collect2: ld returned 1 exit status
make: *** [git-credential-store.exe] Error 1

We may need a compiler option like HAS_UNIX_SOCKETS or so.

--
+++ b/file-watcher.c

+#define INOTIFY_MASKS (IN_DELETE_SELF | IN_MOVE_SELF | \
+  IN_CREATE | IN_ATTRIB | IN_DELETE | IN_MODIFY |  \
+  IN_MOVED_FROM | IN_MOVED_TO)
This feels confusing:
a) we have inotify_masks with lower case below.
b) how about INOTIFY_NEEDED_BITS ?
---




I'm OK with having the protocol having specified in the
test cases.
One thing that I have on the wish list is to make the
commands/responses more unique, to be able to run grep
on the code base.

One idea could be to use a prefix
"fwr" for "file watcher request" or
"fwr" for "file watcher response".
This does not work, hihi, so

"fwq" for "file watcher reQuest" and
"fwe" for "file watcher rEsponse".
Or 
"ffw" as "from file watcher" and
"tfw" as "to file watcher" for the people who have problems
with left and right, < and > could work.

This is all for today.
I will have a look at different error scenarios, what happens
when the watcher crashes and needs to be restarted,
or when Git itself dies with a segfault and doesn't tell the
watcher.

The easiest way to simulate this would be in terms of test cases.
So I will try to write some
/Torsten
 



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/26] inotify support

2014-02-08 Thread Duy Nguyen
Thanks for the comments. I can see I now have some work to do in the
coming weeks :)

On Sat, Feb 8, 2014 at 3:04 PM, Torsten Bögershausen  wrote:
> I would appreciate if we could have an outline of the protocol
> as a seperate "document" somewhere, to be able to have a look at the protocol
> first, before looking into the code.

My fear is document becomes outdated because people don't always
remember to update doc when they change code. Which is why I embed the
protocol as comments in handle_command() function. If the final
version [1] is still not easy to read, I'll split the protocol out
into a separate document.

[1] https://github.com/pclouds/git/blob/file-watcher/file-watcher.c#L672

> (Am I using wireshark too much to dream about a dissector ?)

Try GIT_TRACE_PACKET=2

> 1)
>   write_in_full_timeout()
>   packet_read_line_timeout()
>   At other places we handle EINTR after calling poll().
>   Looking at the code, it could be easier to introduce
>   a new function xpoll() in wrapper.c, and use that instead
>   of poll().

Yeah there are already 4 poll call sites before file-watcher jumps in. Will do.

> 2)
>   Similar for the usage of accept().
>   I like the idea of xread() xwrite() and all the x functions
>   so it coud make sense to establish xpoll() and xaccept()
>   before inotify suppport.

OK

> 3)
>> -int unix_stream_listen(const char *path)
>> +int unix_stream_listen(const char *path, int replace)
>>  {
>>   int fd, saved_errno;
>>   struct sockaddr_un sa;
>> @@ -103,7 +103,8 @@ int unix_stream_listen(const char *path)
>>   return -1;
>>   fd = unix_stream_socket();
>>
>> - unlink(path);
>> + if (replace)
>> + unlink(path);
>
> Minor remark:
> As we do not do the replace here:
> s/replace/un_link/ may be ?

Heh, I thought of using the name "unlink" but it's taken so I chose
"replace" and did not think of underscore.Will do.

> 4)
>> +++ b/file-watcher.c
> {}
>> +static const char *const file_watcher_usage[] = {
>> + N_("git file-watcher [options] "),
>> + NULL
>> +};
> Do we already have options here?
> I can think about having
> -d daemoniye
> -u uses Unix doain socket
> (And later -t to use a TCP/IP socket, when the repo
>  is on a mounted NFS (or SAMBA) drive, and  the daemon is on a
>  different machine.
>  I don't say this patch should include this logic in first round,
>  But I can see a gain for this kind of setup)

Later on we have two, --detach (i.e. daemonize, I reuse the same name
from git-daemon) and --check-support. Transport settings (like unix vs
tcp/ip...) should be config options though. You don't want to specify
it here, then again when you run "git status". Actually I should add
--default, that will retrieve  from config key
filewatcher.path, so the user does not have to specify it..

> 5)
>> +++ b/file-watcher.c
> []
>> +static int shutdown_connection(int id)
>> +{
>> + struct connection *conn = conns[id];
>> + conns[id] = NULL;
>> + pfd[id].fd = -1;
>> + if (!conn)
>> + return 0;
>> + close(conn->sock);
>> + free(conn);
>> + return 0;
>> +}
> The function is called shutdown_connection(), but it does a close()
> Could it be named close_connection() ?

Yes, there was a close_connection() which did something similar, and
then it was killed off. Will rename.

> 6)
>> +++ b/file-watcher.c
> []
> Do we have a sequence chart about the command flow between the watcher
> daemon and the client ?

I suggest you have a look at the file-watcher test. For example, from
[2] we have this sequence

hello
inconsistent
# Do not call inotify_add_watch()
test mode on
# First batch should be all ok
<" the opposite
direction. But you can always obtain a real flow with
GIT_TRACE_PACKET=2 (path lists not available though, so really just a
flow).

[2] 
https://github.com/pclouds/git/blob/file-watcher/t/t7513-file-watcher.sh#L273

> in 03/26:
>>This version is also gentler than its friend packet_read_line()
> gentler, what does this mean?

No dying for whatever error. packet_read_line is designed for git
protocol. It's the main job so if there's an error, dying is the right
thing to do. file-watcher on the other hand is a side job and should
not stop whatever command from doing. Will update commit message.

>>because it's designed for side channel I/O that should not abort the
>>program even if the channel is broken.
> I'm not so familar with side-channel I/O. How does it fit in here?

To sum up, whatever error in communication with file-watcher should
not stop you from doing whatever you're doing. file-watcher is
contacted whenever $GIT_DIR/index is read, so "whatever you're doing"
is basically all git commands that involve worktree or index.

> Does this make sense:
> In opposite to packet_read_line() which can call die()
> to abort the program, read_in_full_timeout() will keep the program running.
> (or something like this)

Exactly!

>>diff --git a/cache.h b/cache.h
>>index 718e32b..939db46 100644
>>--

Re: [PATCH v3 00/26] inotify support

2014-02-08 Thread Torsten Bögershausen
On 03.02.14 05:28, Nguyễn Thái Ngọc Duy wrote:

I managed to review the code 0..12/26, so some parts are missing.
The list below became longer than what I intended,
my comments may be hard to read,
and there is a mixture of minor and major remarks.

I would appreciate if we could have an outline of the protocol
as a seperate "document" somewhere, to be able to have a look at the protocol
first, before looking into the code.

(Am I using wireshark too much to dream about a dissector ?)

All in all I like the concept, thanks for the work.


1)
  write_in_full_timeout()
  packet_read_line_timeout()
  At other places we handle EINTR after calling poll().
  Looking at the code, it could be easier to introduce
  a new function xpoll() in wrapper.c, and use that instead
  of poll().

2)
  Similar for the usage of accept().
  I like the idea of xread() xwrite() and all the x functions
  so it coud make sense to establish xpoll() and xaccept()
  before inotify suppport.


3)
> -int unix_stream_listen(const char *path)
> +int unix_stream_listen(const char *path, int replace)
>  {
>   int fd, saved_errno;
>   struct sockaddr_un sa;
> @@ -103,7 +103,8 @@ int unix_stream_listen(const char *path)
>   return -1;
>   fd = unix_stream_socket();
>  
> - unlink(path);
> + if (replace)
> + unlink(path);

Minor remark:
As we do not do the replace here:
s/replace/un_link/ may be ?


4)
> +++ b/file-watcher.c
{}
> +static const char *const file_watcher_usage[] = {
> + N_("git file-watcher [options] "),
> + NULL
> +};
Do we already have options here?
I can think about having 
-d daemoniye
-u uses Unix doain socket
(And later -t to use a TCP/IP socket, when the repo
 is on a mounted NFS (or SAMBA) drive, and  the daemon is on a 
 different machine.
 I don't say this patch should include this logic in first round,
 But I can see a gain for this kind of setup)


5)
> +++ b/file-watcher.c
[]
> +static int shutdown_connection(int id)
> +{
> + struct connection *conn = conns[id];
> + conns[id] = NULL;
> + pfd[id].fd = -1;
> + if (!conn)
> + return 0;
> + close(conn->sock);
> + free(conn);
> + return 0;
> +}
The function is called shutdown_connection(), but it does a close()
Could it be named close_connection() ?

6) 
> +++ b/file-watcher.c
[]
Do we have a sequence chart about the command flow between the watcher
daemon and the client ?


7)

in 03/26:
>This version is also gentler than its friend packet_read_line()
gentler, what does this mean?

>because it's designed for side channel I/O that should not abort the
>program even if the channel is broken.
I'm not so familar with side-channel I/O. How does it fit in here?

Does this make sense:
In opposite to packet_read_line() which can call die()
to abort the program, read_in_full_timeout() will keep the program running.
(or something like this)

>
>Signed-off-by: Nguyễn Thái Ngọc Duy 
>---
> cache.h|  1 +
> pkt-line.c | 35 +++
> pkt-line.h |  1 +
> wrapper.c  | 21 +
> 4 files changed, 58 insertions(+)
>
>diff --git a/cache.h b/cache.h
>index 718e32b..939db46 100644
>--- a/cache.h
>+++ b/cache.h
>@@ -1230,6 +1230,7 @@ extern int write_or_whine_pipe(int fd, const void *buf, 
>size_t count, const char
> extern void fsync_or_die(int fd, const char *);
> 
> extern ssize_t read_in_full(int fd, void *buf, size_t count);
>+extern ssize_t read_in_full_timeout(int fd, void *buf, size_t count, int 
>timeout);
> extern ssize_t write_in_full(int fd, const void *buf, size_t count);
> extern ssize_t write_in_full_timeout(int fd, const void *buf, size_t count, 
> int timeout);
> static inline ssize_t write_str_in_full(int fd, const char *str)
>diff --git a/pkt-line.c b/pkt-line.c
>index cf681e9..5a07e97 100644
>--- a/pkt-line.c
>+++ b/pkt-line.c
>@@ -229,3 +229,38 @@ char *packet_read_line_buf(char **src, size_t *src_len, 
>int *dst_len)
> {
>   return packet_read_line_generic(-1, src, src_len, dst_len);
> }
>+

In what is the timeout measured ?
seconds, milli years?
As we use poll() I think it is milli seconds.
(I like the idea of naming timeout timeout_ms)

[]
>+  len -= 4;
>+  if (len >= buf_len) {
>+  error("protocol error: bad line length %d", len);
>+  return NULL;
>+  }
>+  if ((ret = read_in_full_timeout(fd, buf, len, timeout)) < 0)
>+  return NULL;
Do we want a packet_trace here?

When a timeout occurs, do we want to close the connection,
marking it as dead?
Or need to look at errno?


>+  buf[len] = '\0';
>+  if (len_p)
>+  *len_p = len;
>+  packet_trace(buf, len, 0);
>+  return buf;
>+}

>diff --git a/pkt-line.h b/pkt-line.h
>index 4b93a0c..d47dca5 100644
>--- a/pkt-line.h
>+++ b/pkt-line.h
>@@ -69,6 +69,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len, 
>char
>  * packet is written to it.
>  */
> char *packet_read_line(int fd, int *siz