Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Florian Achleitner
Hi,

back to the pipe-topic.

On Wednesday 01 August 2012 12:42:48 Jonathan Nieder wrote:
 Hi again,
 
 Florian Achleitner wrote:
  When the first line arrives at the remote-helper, it starts importing one
  line at a time, leaving the remaining lines in the pipe.
  For importing it requires the data from fast-import, which would be mixed
  with import lines or queued at the end of them.
 
 Oh, good catch.
 
 The way it's supposed to work is that in a bidi-import, the remote
 helper reads in the entire list of refs to be imported and only once
 the newline indicating that that list is over arrives starts writing
 its fast-import stream.  We could make this more obvious by not
 spawning fast-import until immediately before writing that newline.
 
 This needs to be clearly documented in the git-remote-helpers(1) page
 if the bidi-import command is introduced.
 
 If a remote helper writes commands for fast-import before that newline
 comes, that is a bug in the remote helper, plain and simple.  It might
 be fun to diagnose this problem:

This would require all existing remote helpers that use 'import' to be ported 
to the new concept, right? Probably there are no others..

 
   static void pipe_drained_or_die(int fd, const char *msg)
   {
   	char buf[1];
   	int flags = fcntl(fd, F_GETFL);
   	if (flags < 0)
   		die_errno("cannot get pipe flags");
   	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
   		die_errno("cannot set up non-blocking pipe read");
   	if (read(fd, buf, 1) > 0)
   		die("%s", msg);
   	if (fcntl(fd, F_SETFL, flags))
   		die_errno("cannot restore pipe flags");
   }
   ...
 
   for (i = 0; i < nr_heads; i++) {
   	write("import %s\n", to_fetch[i]->name);
   }
 
   if (getenv("GIT_REMOTE_HELPERS_SLOW_SANITY_CHECK"))
   	sleep(1);
 
   pipe_drained_or_die(data->helper->out,
   	"unexpected output from remote helper before fast-import launch");
 
   if (get_importer(transport, &fastimport))
   	die("couldn't run fast-import");
   write_constant(data->helper->in, "\n");

I still don't believe that sharing the input pipe of the remote helper is 
worth the hassle.
It still requires an additional pipe to be set up, the one from fast-import to 
the remote-helper, sharing one FD at the remote helper.
It still requires more than just stdin, stdout, and stderr.

I would suggest using a fifo. It can be opened independently after forking, 
and Windows has named pipes with similar semantics, so I think this could be 
ported easily.
I would suggest the following changes:
- add a capability to the remote helper 'bidi-import', or 'bidi-pipe'. This 
signals that the remote helper requires data from fast-import.

- add a command 'bidi-import', or 'bidi-pipe', that tells the remote helper 
which filename the fifo is at, so that it can open it and read it when it 
handles 'import' commands.

- transport-helper.c creates the fifo on demand, i.e. on seeing the capability, 
in the gitdir or in /tmp.

- fast-import gets the name of the fifo as a command-line argument. The 
alternative would be to add a command, but that's not allowed, because it 
changes the stream semantics.
Another alternative would be to use the existing --cat-blob-fd argument. But 
that requires opening the fifo before execing fast-import and makes us 
dependent on the POSIX model of forking and inheriting file descriptors, while 
opening the fifo in fast-import itself would not.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Jonathan Nieder
Hi again,

Florian Achleitner wrote:

 back to the pipe-topic.

Ok, thanks.

[...]
 The way it's supposed to work is that in a bidi-import, the remote
 helper reads in the entire list of refs to be imported and only once
 the newline indicating that that list is over arrives starts writing
 its fast-import stream.
[...]
 This would require all existing remote helpers that use 'import' to be ported
 to the new concept, right? Probably there are no others..

You mean all existing remote helpers that use 'bidi-import', right?
There are none.

[...]
 I still don't believe that sharing the input pipe of the remote helper is
 worth the hassle.
 It still requires an additional pipe to be set up, the one from fast-import to
 the remote-helper, sharing one FD at the remote helper.

If I understand correctly, you misunderstood how sharing the input
pipe works.  Have you tried it?

It does not involve setting up an additional pipe.  Standard input for
the remote helper is already a pipe.  That pipe is what allows
transport-helper.c to communicate with the remote helper.  Letting
fast-import share that pipe involves passing that file descriptor to
git fast-import.  No additional pipe() calls.

Do you mean that it would be too much work to implement?  This
explanation just doesn't make sense to me, given that the version
using pipe() *already* *exists* and is *tested*.

I get the feeling I am missing something very basic.  I would welcome
input from others that shows what I am missing.

[...]
 Another alternative would be to use the existing --cat-blob-fd argument. But
 that requires opening the fifo before execing fast-import and makes us
 dependent on the POSIX model of forking and inheriting file descriptors,
 while opening the fifo in fast-import itself would not.

I'm getting kind of frustrated with this conversation going nowhere.  Here
is a compromise to explain why.  Suppose:

- fast-import learns a --cat-blob-file parameter, which tells it to open a
  file to write responses to cat-blob and ls commands to instead of
  using an inherited file descriptor

- transport-helper.c sets up a named pipe and passes it using --cat-blob-file.

- transport-helper.c reads from that named pipe and copies everything it sees
  to the remote helper's standard input, until fast-import exits.

This would:

 - allow us to continue to use a very simple protocol for communicating
   with the remote helper, where commands and fast-import backflow both
   come on standard input

 - work fine on both Windows and Unix

Meanwhile it would:

 - be 100% functionally equivalent to the solution where fast-import
   writes directly to the remote helper's standard input.  Two programs
   can have the same pipe open for writing at the same time for a few
   seconds and that is *perfectly fine*.  On Unix and on Windows.

   On Windows the only complication with the pipe()-based approach is that
   we haven't wired up the low-level logic to pass file descriptors other
   than stdin, stdout, stderr to child processes; and if I have understood
   earlier messages correctly, the operating system *does* have a concept
   of that and this is just a todo item in the msys implementation.

 - be more complicated than the code that already exists for this
   stuff.

So while I presented this as a compromise, I don't see the point.

Is your goal portability, a dislike of the interface, some
implementation detail I have missed, or something else?  Could you
explain the problem as concisely but clearly as possible (perhaps
using an example) so that others like Sverre, Peff, or David can help
think through it and to explain it in a way that dim people like me
understand what's going on?

Puzzled,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Jonathan Nieder
Hi again,

Florian Achleitner wrote:

 Another alternative would be to use the existing --cat-blob-fd argument. But
 that requires opening the fifo before execing fast-import and makes us
 dependent on the POSIX model of forking and inheriting file descriptors

Let me elaborate on this, which I think is the real point (though I
could easily be wrong).

Probably instead of just sending feedback I should have been sending
suggested patches to compare.  You do not work for me and your time is
not my time, and I would not want to have the responsibility of
dictating how your code works, anyway.  Generally speaking, as long as
code is useful and not hurting maintainability, the best thing I can
do is to make it easy to incorporate.  That has side benefits, too,
like giving an example of what it is like to have your code
incorporated and creating momentum.  Small mistakes can be fixed
later.

Unfortunately here we are talking about two interfaces that need to
be stable.  Mistakes stay --- once there is a UI wart in interfaces
people are using, we are stuck continuing to support it.

To make life even more complicated, there are two interfaces involved
here:

 - the remote-helper protocol, which I think it is very important
   to keep sane and stable.  Despite the remote-helper protocol
   existing for a long time, it hasn't seen a lot of adoption
   outside git yet, and complicating the protocol will not help
   that.

   I am worried about what it would mean to add an additional
   stream on top of the standard input and output used in the current
   remote-helper protocol.  Receiving responses to fast-import
   commands on stdin seems very simple.  Maybe I am wrong?

 - the fast-import interface, which is also important but I'm not
   as worried about.  Adding a new command-line parameter means
   importers that call fast-import can unnecessarily depend on
   newer versions of git, so it's still something to be avoided.

Given a particular interface, there may be multiple choices of how to
implement it.  For example, on Unix the remote-helper protocol could
be fulfilled by letting fast-import inherit the file descriptor
helper-in, while on Windows it could for example be fulfilled by
using DuplicateHandle() to transmit the pipe handle before or after
fast-import has already started running.

The implementation strategy can easily be changed later.  What is
important is that the interface be pleasant and *possible* to
implement sanely.

Hoping that clarifies a little,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Florian Achleitner
On Sunday 12 August 2012 09:12:58 Jonathan Nieder wrote:
 Hi again,
 
 Florian Achleitner wrote:
  back to the pipe-topic.
 
 Ok, thanks.
 
 [...]
 
  The way it's supposed to work is that in a bidi-import, the remote
  helper reads in the entire list of refs to be imported and only once
  the newline indicating that that list is over arrives starts writing
  its fast-import stream.
 
 [...]
 
  This would require all existing remote helpers that use 'import' to be
  ported to the new concept, right? Probably there are no others..
 
 You mean all existing remote helpers that use 'bidi-import', right?
 There are none.

Ok, it would not affect the existing import command.
 
 [...]
 
  I still don't believe that sharing the input pipe of the remote helper is
  worth the hassle.
  It still requires an additional pipe to be set up, the one from fast-import
  to the remote-helper, sharing one FD at the remote helper.
 
 If I understand correctly, you misunderstood how sharing the input
 pipe works.  Have you tried it?

Yes, I wrote a test program; sharing works, that's not the problem.

 
 It does not involve setting up an additional pipe.  Standard input for
 the remote helper is already a pipe.  That pipe is what allows
 transport-helper.c to communicate with the remote helper.  Letting
 fast-import share that pipe involves passing that file descriptor to
 git fast-import.  No additional pipe() calls.
 
 Do you mean that it would be too much work to implement?  This
 explanation just doesn't make sense to me, given that the version
 using pipe() *already* *exists* and is *tested*.

Yes, that was the first version I wrote, and the one remote-svn-alpha uses.

 
 I get the feeling I am missing something very basic.  I would welcome
 input from others that shows what I am missing.
 

This is how I see it, probably it's all wrong:
I thought the main problem is that we don't want processes to have *more than 
three pipes attached*, i.e. stdout, stdin, stderr, because existing APIs don't 
allow it.
When we share stdin of the remote helper, we achieve this goal for this one 
process, but fast-import still has an additional pipe:
stdout  -- shell;
stderr -- shell; 
stdin -- remote-helper; 
additional_pipe -- remote-helper.

That's what I wanted to say: We still have more than three pipes on 
fast-import.
And we need to transfer that fourth file descriptor by inheritance and its 
number as a command line argument. 
So if we make the remote-helper have only three pipes by double-using stdin, 
but fast-import still has four pipes, what problem does it solve?

Using fifos would remove the requirement to inherit more than three pipes. 
That's my point.

[..]
 
 Meanwhile it would:
 
  - be 100% functionally equivalent to the solution where fast-import
writes directly to the remote helper's standard input.  Two programs
can have the same pipe open for writing at the same time for a few
seconds and that is *perfectly fine*.  On Unix and on Windows.
 
 On Windows the only complication with the pipe()-based approach is that we
 haven't wired up the low-level logic to pass file descriptors other than
 stdin, stdout, stderr to child processes; and if I have understood earlier
 messages correctly, the operating system *does* have a concept of that and
 this is just a todo item in the msys implementation.

I dug into MSDN and it seems it's not a problem at all on the Windows API 
layer. Pipe handles can be inherited. [1]
Once the low-level logic supports passing more than three fds, it will work 
for fast-import as well as the remote helper.

 
  - be more complicated than the code that already exists for this
stuff.
 
 So while I presented this as a compromise, I don't see the point.
 
 Is your goal portability, a dislike of the interface, some
 implementation detail I have missed, or something else?  Could you
 explain the problem as concisely but clearly as possible (perhaps
 using an example) so that others like Sverre, Peff, or David can help
 think through it and to explain it in a way that dim people like me
 understand what's going on?

It all started as a portability-only discussion. On Linux, my first version 
would have worked. It created an additional pipe before forking, using pipe(). 
It runs great; it did it like remote-svn-alpha.sh.

I wouldn't have produced something else or started a discussion on my own. But 
I was told it's not good because of portability. This is the root of this 
endless story. (You already know the thread, I think.) For weeks nobody has 
been interested in this except you and me.
 
So if we accept having more than three pipes on a process, we have no 
problem any more.
We can dig out that first version, as well as write the one you proposed.
Your version saves some trouble by not requiring an additional pipe() call 
or the preexec_cb, while adding a little complexity with the re-use of 
stdin.

Currently I have implemented the original pipe version, the original

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Jonathan Nieder
Florian Achleitner wrote:

 This is how I see it, probably it's all wrong:
 I thought the main problem is that we don't want processes to have *more than
 three pipes attached*, i.e. stdout, stdin, stderr, because existing APIs don't
 allow it.

Oh, that makes sense.  Thanks for explaining, and sorry to have been so
dense.

At the Windows API level, Set/GetStdHandle() is only advertised to
handle stdin, stdout, and stderr, so on Windows there would indeed
need to be some magic to communicate the inherited HANDLE value to
fast-import.

But I am confident in the Windows porters, and if a fast-import
interface change ends up being needed, I think we can deal with it
when the moment comes and it wouldn't necessitate changing the remote
helper interface.

You also mentioned before that passing fast-import responses to the
remote helper stdin means that the remote helper has to slurp up the
whole list of refs to import before starting to talk to fast-import.
That was good practice already anyway, but it is a real pitfall and a
real downside to the single-input approach.  Thanks for bringing it
up.

Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-01 Thread Florian Achleitner
On Tuesday 31 July 2012 15:43:57 Jonathan Nieder wrote:
 Florian Achleitner wrote:
  I haven't tried that yet, nor do I remember anything where I've already
  seen two processes writing to the same pipe.
 
 It's a perfectly normal and well supported thing to do.

I played around with a little test program. It generally works.
I'm still not convinced that this doesn't cause more problems than it 
solves.
The standard defines that write calls to pipe fds are atomic, i.e. data is not 
interleaved with data from other processes, if the data is less than PIPE_BUF 
bytes [1].
We would need some kind of locking/synchronization to make it work for sure, 
while I believe it will work most of the time.

Currently it runs like this:
transport-helper.c writes one or more 'import ref' lines; we don't know in 
advance how many and how long they are. Then it waits for fast-import to 
finish.

When the first line arrives at the remote-helper, it starts importing one line 
at a time, leaving the remaining lines in the pipe.
For importing it requires the data from fast-import, which would be mixed with 
import lines or queued at the end of them.

[1] 
http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-01 Thread Jonathan Nieder
Hi again,

Florian Achleitner wrote:

 When the first line arrives at the remote-helper, it starts importing one
 line at a time, leaving the remaining lines in the pipe.
 For importing it requires the data from fast-import, which would be mixed
 with import lines or queued at the end of them.

Oh, good catch.

The way it's supposed to work is that in a bidi-import, the remote
helper reads in the entire list of refs to be imported and only once
the newline indicating that that list is over arrives starts writing
its fast-import stream.  We could make this more obvious by not
spawning fast-import until immediately before writing that newline.

This needs to be clearly documented in the git-remote-helpers(1) page
if the bidi-import command is introduced.

If a remote helper writes commands for fast-import before that newline
comes, that is a bug in the remote helper, plain and simple.  It might
be fun to diagnose this problem:

static void pipe_drained_or_die(int fd, const char *msg)
{
	char buf[1];
	int flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		die_errno("cannot get pipe flags");
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
		die_errno("cannot set up non-blocking pipe read");
	if (read(fd, buf, 1) > 0)
		die("%s", msg);
	if (fcntl(fd, F_SETFL, flags))
		die_errno("cannot restore pipe flags");
}
...

for (i = 0; i < nr_heads; i++) {
	write("import %s\n", to_fetch[i]->name);
}

if (getenv("GIT_REMOTE_HELPERS_SLOW_SANITY_CHECK"))
	sleep(1);

pipe_drained_or_die(data->helper->out,
	"unexpected output from remote helper before fast-import launch");

if (get_importer(transport, &fastimport))
	die("couldn't run fast-import");
write_constant(data->helper->in, "\n");


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-31 Thread Florian Achleitner
On Monday 30 July 2012 11:55:02 Jonathan Nieder wrote:
 Florian Achleitner wrote:
  Hm... that would mean that both fast-import and git (transport-helper)
  would write to the remote-helper's stdin, right?
 
 Yes, first git writes the list of refs to import, and then fast-import
 writes feedback during the import.  Is that a problem?

I haven't tried that yet, nor do I remember anything where I've already seen 
two processes writing to the same pipe.
At least it sounds cumbersome to me. Processes' lifetimes overlap, so buffering 
and flushing could mix data.
We have to use it for both purposes interchangeably, because there can be more 
than one import command to the remote-helper, of course.

Will try that in a test program..


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-31 Thread Jonathan Nieder
Florian Achleitner wrote:

 I haven't tried that yet, nor do I remember anything where I've already seen
 two processes writing to the same pipe.

It's a perfectly normal and well supported thing to do.

[...]
 Will try that in test-program..

Thanks.

Good luck,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Saturday 28 July 2012 02:00:31 Jonathan Nieder wrote:
 Thanks for explaining.  Now we've discussed a few different approaches,
 none of which is perfect.
 
 a. use --cat-blob-fd, no FIFO
 
   Doing this unconditionally would break platforms that don't support
   --cat-blob-fd=(descriptor > 2), like Windows, so we'd have to:
 
   * Make it conditional --- only do it if (1) we are not on Windows and
     (2) the remote helper requests backflow by advertising the
     "import-bidi" capability.
 
   * Let the remote helper know what's going on by using
     "import-bidi" instead of "import" in the command stream to
     initiate the import.

Generally I like your preferred solution.
I think there's one problem:
The pipe needs to be created before the fork, so that the fd can be inherited. 
There is no way to create it in response to the remote helper advertising a 
capability, because the helper has already been forked by then. This would 
work with fifos, though.

We could:
- add a capability: bidi-import.
- make transport-helper create a fifo if the helper advertises it.
- add a command for remote-helpers, like 'bidi-import pipename', that makes 
the remote helper open the fifo at pipename and use it.
- fast-import is forked after the helper, so we already know if there will 
be a back-pipe. If yes, open it in transport-helper and pass the fd via the 
--cat-blob-fd command-line argument.

-- fast-import wouldn't need to be changed, but we'd use a fifo, and we get 
rid of the env-vars.
(I guess it could work on Windows too.)

What do you think?

 
 b. use envvars to pass around FIFO path
 
This complicates the fast-import interface and makes debugging hard.
It would be nice to avoid this if we can, but in case we can't, it's
nice to have the option available.
 
 c. transport-helper.c uses FIFO behind the scenes.
 
Like (a), except it would require a fast-import tweak (boo) and
would work on Windows (yea)
 
 d. use --cat-blob-fd with FIFO
 
Early scripted remote-svn prototypes did this to fulfill fetch
requests.
 
   It has no advantage over "use --cat-blob-fd, no FIFO" except being
   easier to implement as a shell script.  I'm listing this just for
   comparison; since (a) looks better in every way, I don't see any
   reason to pursue this one.
 
 Since avoiding deadlocks with bidirectional communication is always a
 little subtle, it would be nice for this to be implemented once in
 transport-helper.c rather than each remote helper author having to
 reimplement it again.  As a result, my knee-jerk ranking is a > c > b > d.
 
 Sane?
 Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Thursday 26 July 2012 10:29:51 Junio C Hamano wrote:
 Of course, if the dispatch loop has to be rewritten so that a
 central dispatcher decides what to call, individual input handlers
 do not need to say NOT_HANDLED nor TERMINATE, as the central
 dispatcher should keep track of the overall state of the system, and
 the usual "0 on success, negative on error" may be sufficient.
 
 One thing I wondered was how an input "capability" (or "list")
 should be handled after "import" was issued (hence batch_active
 becomes true).  The dispatcher loop in the patch based on
 NOT_HANDLED convention will happily call cmd_capabilities(), which
 does not have any notion of the batch_active state (because it is a
 function scope static inside cmd_import()), and will say "Ah, that
 is mine, and let me do my thing."  If we want to diagnose such an
 input stream as an error, the dispatch loop needs to become aware of
 the overall state of the system _anyway_, so that may be an argument
 against the NOT_HANDLED based dispatch system the patch series uses.

That's a good point. The current implementation allows other commands to 
appear during import batches. This shouldn't be possible according to the 
protocol, I think. But it doesn't do harm. Solving it will require a global 
state and go towards a global dispatcher.




Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Jonathan Nieder
Florian Achleitner wrote:
 On Saturday 28 July 2012 02:00:31 Jonathan Nieder wrote:

 a. use --cat-blob-fd, no FIFO
[...]
* Make it conditional --- only do it if (1) we are not on Windows and
  (2) the remote helper requests backflow by advertising the
  "import-bidi" capability.

* Let the remote helper know what's going on by using
  "import-bidi" instead of "import" in the command stream to
  initiate the import.

 Generally I like your preferred solution.
 I think there's one problem:
 The pipe needs to be created before the fork, so that the fd can be inherited.

The relevant pipe already exists at that point: the remote helper's
stdin.

In other words, it could work like this (just like the existing demo
code, except adding a conditional based on the capabilities
response):

0. transport-helper.c invokes the remote helper.  This requires
   a pipe used to send commands to the remote helper
   (helper-in) and a pipe used to receive responses from the
   remote helper (helper-out)

1. transport-helper.c sends the capabilities command to decide
   what to do.  The remote helper replies that it would like
   some feedback from fast-import.

2. transport-helper.c forks and execs git fast-import with input
   redirected from helper-out and the cat-blob fd redirected
   to helper-in

3. transport-helper.c tells the remote helper to start the
   import

4. wait for fast-import to exit


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Monday 30 July 2012 03:29:52 Jonathan Nieder wrote:
  Generally I like your preferred solution.
  I think there's one problem:
  The pipe needs to be created before the fork, so that the fd can be
  inherited.
 The relevant pipe already exists at that point: the remote helper's
 stdin.
 
 In other words, it could work like this (just like the existing demo
 code, except adding a conditional based on the capabilities
 response):
 
 0. transport-helper.c invokes the remote helper.  This requires
a pipe used to send commands to the remote helper
(helper-in) and a pipe used to receive responses from the
remote helper (helper-out)
 
 1. transport-helper.c sends the capabilities command to decide
what to do.  The remote helper replies that it would like
some feedback from fast-import.
 
 2. transport-helper.c forks and execs git fast-import with input
redirected from helper-out and the cat-blob fd redirected
to helper-in

fast-import writes to the helper's stdin..

 3. transport-helper.c tells the remote helper to start the
import

transport-helper writes commands to the helper's stdin.

 
 4. wait for fast-import to exit

Hm... that would mean that both fast-import and git (transport-helper) would 
write to the remote-helper's stdin, right?


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Jonathan Nieder
Florian Achleitner wrote:

 Hm... that would mean that both fast-import and git (transport-helper) would
 write to the remote-helper's stdin, right?

Yes, first git writes the list of refs to import, and then fast-import
writes feedback during the import.  Is that a problem?


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-28 Thread Jonathan Nieder
Hi,

Some quick details.

Florian Achleitner wrote:

Anyway, specifying cat-blob-fd is not
 allowed via the 'option' command (see Documentation and 85c62395).
 It wouldn't make too much sense, because the file descriptor must be set up
 by the parent.

 But for the fifo, it would, probably.

More precisely, requiring that the cat-blob fd go on the command line
instead of in the stream matches a model where fast-import's three
standard streams are abstract:

 - its input, using the INPUT FORMAT described in git-fast-import(1)
 - its standard output, which echoes progress commands
 - its backflow stream, where responses to cat-blob and ls commands go

The stream format is not necessarily pinned to a Unix model where
input and output are based on filesystems and file descriptors.  We
can imagine a fast-import backend that runs on a remote server and the
transport used for these streams is sockets; for fast-import frontends
to be usable in such a context, the streams they produce must not rely
on particular fd numbers, nor on pathnames (except for mark state
saved to relative paths with --relative-marks) representing anything
in particular.

This goes just as much for a fifo set up on the filesystem where the
fast-import backend runs as for an inherited file descriptor.  In the
current model, such backend-specific details of setup go on the
command line.

   The backward channel is only used by the commands 'cat-blob' and 'ls' of 
 fast-import. If a remote helper wants to use them, it could make fast-import 
 open the pipe by sending an 'option' command with the fifo filename; 
 otherwise it defaults to stdout (like now) and is rather useless.

I'm getting confused by terminology again.  Let's see if I have the cast
of characters straight:

 - the fast-import backend (e.g., git fast-import or hg-fastimport)
 - the fast-import frontend (e.g., git fast-export or svn-fe)
 - git's generic foreign VCS support plumbing, also known as
   transport-helper.c
 - the remote helper (e.g., git remote-svn or git remote-testgit)

Why would the fast-import backend ever need to open a pipe?  If I want
it to write backflow to a fifo, I can use

mkfifo my-favorite-fifo
git fast-import --cat-blob-fd=3 3>my-favorite-fifo

If I want it to write backflow to a pipe, I can use (using ksh syntax)

cat |&
git fast-import --cat-blob-fd=3 3>&p

 This would take the fifo setup out of transport-helper. The remote-helper 
 would 
 have to create it, if it needs it.

We can imagine transport-helper.c learning the name of a fifo set up by
the remote helper by sending it the capabilities command:

git> capabilities
helper> option
helper> import
helper> cat-blob-file my-favorite-fifo
helper> refspec refs/heads/*:refs/helper/remotename/*
helper>

transport-helper.c could then use that information to invoke
fast-import appropriately:

git fast-import --cat-blob-fd=3 3>my-favorite-fifo

But this seems like pushing complication onto the remote helper; since
there is expected to be one remote helper per foreign VCS,
implementing the appropriate logic correctly once and for all in
transport-helper.c for all interested remote helpers to take advantage
of seems to me like a better policy.

 Apropos stdout. That leads to another idea. You already suggested that it
 would be easiest to only use FDs 0..2. Currently stdout and stderr of
 fast-import go to the shell. We could connect stdout to the remote-helper
 and wouldn't need the additional channel at all.

The complication that makes this strategy not so easy is progress
commands in the fast-import input stream.  (Incidentally, it would be
nice to teach transport-helper.c to display specially formatted
progress commands using a progress bar some day.)

Hoping that clarifies,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-28 Thread Jonathan Nieder
Florian Achleitner wrote:

 So I should kick printd out?

I think so, yes.

git log -SGIT_TRANSPORT_HELPER_DEBUG transport-helper.c tells me
that that option was added to make the transport-helper machinery make
noise to make it obvious at what stage a remote helper has deadlocked.

GIT_TRANSPORT_HELPER_DEBUG already takes care of that, so there would
not be need for an imitation of that in remote-svn, unless I am
missing something (and even if I am missing something, it seems
complicated enough to be worth moving to another patch where it can be
explained more easily).

[...]
 +
 + printf("import\n");
 + printf("\n");
 + fflush(stdout);
 + return SUCCESS;
 +}
[...]
Maybe the purpose of the flush would
 be more obvious if it were moved to the caller.

 Acutally this goes to the git parent process (not fast-import), waiting for a 
 reply to the command. I think I have to call flush on this side of the pipe. 
 Can you flush it from the reader? This wouldn't have the desired effect, it 
 drops buffered data.

*slaps head*  This is the "capabilities" command, and it needs to
flush because the reader needs to know what commands it's allowed to
use next before it starts using them.  My brain turned off and I
thought you were emitting an "import" command rather than advertising
that you support it for some reason.

And 'printf("\n")' was a separate printf because that way, patches
like

	printf("import\n");
+	printf("bidi-import\n");
	printf("\n");
	fflush(stdout);

become simpler.

I'm tempted to suggest a structure like

const char * const capabilities[] = {"import"};
int i;

for (i = 0; i < ARRAY_SIZE(capabilities); i++)
	puts(capabilities[i]);
puts("");	/* blank line */

fflush(stdout);

but your original code was fine, too.

[...]
 + /* opening a fifo for usually reading blocks until a writer has opened
 +  * it too. Therefore, we open with RDWR.
 +  */
 + report_fd = open(back_pipe_env, O_RDWR);
 + if(report_fd < 0) {
 + 	die("Unable to open fast-import back-pipe! %s",
 + 	strerror(errno));
[...]
 I believe it can be solved using RDONLY and WRONLY too. Probably we solve it 
 by not using the fifo at all.
 Currently the blocking comes from the fact that fast-import doesn't parse
 its command line at startup. It rather reads an input line first and decides
 whether to parse the argv after reading the first input line or at the end of
 the input. (don't know why)
 remote-svn opens the pipe before sending the first command to fast-import and 
 blocks on the open, while fast-import waits for input -- deadlock.

Thanks for explaining.  Now we've discussed a few different approaches,
none of which is perfect.

a. use --cat-blob-fd, no FIFO

   Doing this unconditionally would break platforms that don't support
   --cat-blob-fd=(descriptor > 2), like Windows, so we'd have to:

   * Make it conditional --- only do it if (1) we are not on Windows and
 (2) the remote helper requests backflow by advertising the
 import-bidi capability.

   * Let the remote helper know what's going on by using
 import-bidi instead of import in the command stream to
 initiate the import.

b. use envvars to pass around FIFO path

   This complicates the fast-import interface and makes debugging hard.
   It would be nice to avoid this if we can, but in case we can't, it's
   nice to have the option available.

c. transport-helper.c uses FIFO behind the scenes.

   Like (a), except it would require a fast-import tweak (boo) and
   would work on Windows (yea)

d. use --cat-blob-fd with FIFO

   Early scripted remote-svn prototypes did this to fulfill fetch
   requests.

   It has no advantage over "use --cat-blob-fd, no FIFO" except being
   easier to implement as a shell script.  I'm listing this just for
   comparison; since (a) looks better in every way, I don't see any
   reason to pursue this one.

Since avoiding deadlocks with bidirectional communication is always a
little subtle, it would be nice for this to be implemented once in
transport-helper.c rather than each remote helper author having to
reimplement it again.  As a result, my knee-jerk ranking is a > c >
b > d.

Sane?
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-27 Thread Florian Achleitner
On Thursday 26 July 2012 09:54:26 Jonathan Nieder wrote:
 
 Since the svn remote helper relies on this, it seems worth working on,
 yeah.  As for how to spend your time (and whether to beg someone else
 to work on it instead :)): I'm not sure what's on your plate or where
 you are with respect to the original plan for the summer at the
 moment, so it would be hard for me to give useful advice about how to
 balance things.

Btw, the pipe version already existed before I started; it was added with
the cat-blob command and already used by Dmitry's remote-svn-alpha.
I didn't search for design discussions in the past ..

 
 What did you think of the suggestion of adding a new bidi-import
 capability and command to the remote helper protocol?  I think this
 would be clean and avoid causing a regression on Windows, but it's
 easily possible I am missing something fundamental.

I don't have much overview over this topic besides the part I'm working on, 
like other users of fast-import. 
The bidi-import capability/command would have the advantage that we don't
have to bother with the pipe/fifo at all if the remote-helper doesn't use it.

When I implemented the two variants I had the idea to pass it to the 'option' 
command, that fast-import already has. Anyways, specifying cat-blob-fd is not 
allowed via the 'option' command (see Documentation and 85c62395).
It wouldn't make too much sense, because the file descriptor must be set up by 
the parent.

But for the fifo, it would, probably. The backward channel is only used by the
commands 'cat-blob' and 'ls' of fast-import. If a remote helper wants to use
them, it could make fast-import open the pipe by sending an 'option'
command with the fifo filename, otherwise it defaults to stdout (like now)
and is rather useless.
This would take the fifo setup out of transport-helper. The remote-helper would 
have to create it, if it needs it.

Apropos stdout. That leads to another idea. You already suggested that it
would be easiest to only use FDs 0..2. Currently stdout and stderr of
fast-import go to the shell. We could connect stdout to the remote-helper
and wouldn't need the additional channel at all.
(Probably there's a good reason why they haven't done that ..)
Maybe this requires many changes to fast-import and breaks existing frontends.

--
Florian


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
Hi!

Most of this review went into the new version.. 
For the remaining points, some comments follow.

On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:
 Hi,
 
 Florian Achleitner wrote:

 
  --- /dev/null
  +++ b/contrib/svn-fe/remote-svn.c
  @@ -0,0 +1,207 @@
  +
  +#include <stdlib.h>
  +#include <string.h>
  +#include <stdio.h>
 
 git-compat-util.h (or some header that includes it) must be the first
 header included so the appropriate feature test macros can be defined.
 See Documentation/CodingGuidelines for more on that.

check.

 
  +#include "cache.h"
  +#include "remote.h"
  +#include "strbuf.h"
  +#include "url.h"
  +#include "exec_cmd.h"
  +#include "run-command.h"
  +#include "svndump.h"
  +
  +static int debug = 0;
 
 Small nit: please drop the redundant = 0 here.  Or:

check.

  +
  +static inline void printd(const char* fmt, ...)
  +{
  +   if(debug) {
  +   va_list vargs;
  +   va_start(vargs, fmt);
  +   fprintf(stderr, "rhsvn debug: ");
  +   vfprintf(stderr, fmt, vargs);
  +   fprintf(stderr, "\n");
  +   va_end(vargs);
  +   }
  +}
 
 Why not use trace_printf and avoid the complication?

Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf, it's 
activated together with all other traces. I can use trace_vprintf and specify 
a key, but I would always have to print the header rhsvn debug:  and the key 
by hand. So I could replace vfprintf in this function by trace_vprintf to do 
that. But then there's not much simplification. (?)


  +
  +enum cmd_result cmd_capabilities(struct strbuf* line);
  +enum cmd_result cmd_import(struct strbuf* line);
  +enum cmd_result cmd_list(struct strbuf* line);
 
 What's a cmd_result?  '*' sticks to variable name.
 
  +
  +enum cmd_result { SUCCESS, NOT_HANDLED, ERROR };
 
 Oh, that's what a cmd_result is. :)  Why not define the type before
 using it to avoid keeping the reader in suspense?
 
 What does each result represent?  If this is a convention like
 
  1: handled
  0: not handled
  -1: error, callee takes care of printing the error message
 
 then please document it in a comment near the caller so the reader can
 understand what is happening without too much confusion.  Given such a
 comment, does the enum add clarity?

Hm.. the enum now has SUCCESS, NOT_HANDLED, TERMINATE.
It gives the numbers a name, that's it.

 
  +typedef enum cmd_result (*command)(struct strbuf*);
 
 When I first read this, I wonder what is being commanded.  Are these
 commands passed on the remote helper's standard input, commands passed
 on its output, or commands run at some point in the process?  What is
 the effect and return value of associated function?  Does the function
 always return some success/failure value, or does it sometimes exit?
 
 Maybe a more specific type name would be clearer?

I renamed it to input_command_handler. Unfortunately the remote-helper spec 
calls what is sent to the helper a 'command'.

 
 [...]
 
  +
  +const command command_list[] = {
  +   cmd_capabilities, cmd_import, cmd_list, NULL
  +};
 
 First association is to functions like cmd_fetch() which implement git
 subcommands.  So I thought these were going to implement subcommands
 like git remote-svn capabilities, git remote-svn import and would
 use the same cmd_foo(argc, argv, prefix) calling convention that git
 subcommands do.  Maybe a different naming convention could avoid
 confusion.

Ok.. same as above, they are kind of commands. Of course I can change the 
names. For me it's not too confusing, because I don't know the git subcommands 
convention very well. You can choose a name.

 
 [...]
 
  +enum cmd_result cmd_capabilities(struct strbuf* line)
  +{
  +   if(strcmp(line->buf, "capabilities"))
  +   return NOT_HANDLED;
 
 Style: missing SP after keyword.
 
  +
  +   printf("import\n");
  +   printf("\n");
  +   fflush(stdout);
  +   return SUCCESS;
  +}
 
 Why the multiple printf?  Is the flush needed?

Excess printf gone.
Flush is needed. Otherwise it doesn't flush and the other end waits forever.
Don't know exactly why. Some pipe-buffer ..

  +
  + /* opening a fifo for usually reading blocks until a writer has opened
  +  * it too. Therefore, we open with RDWR.
  +  */
  + report_fd = open(back_pipe_env, O_RDWR);
  + if(report_fd < 0) {
  + 	die("Unable to open fast-import back-pipe! %s",
  + 	strerror(errno));
  +   }
 
 Is this necessary?  Why shouldn't we fork the writer first and wait
 for it here?

Yes, necessary. Blocking on this open call prevents fast-import as well as the 
remote helper from reading and writing on their normal command streams.
This leads to deadlocks.

E.g. If there's nothing to import, the helper sends only 'done' to
fast-import and quits. That might happen before fast-import opened this pipe.
Then it waits forever because the reader has already closed it.


  +
  +   code = start_command(svndump_proc);
  +   if(code)
 +   die("Unable to start %s, code %d", 

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Jonathan Nieder
Florian Achleitner wrote:

 Most of this review went into the new version.. 
 For the remaining points, some comments follow.

Thanks for this.

 On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:

[...]
 +
 +static inline void printd(const char* fmt, ...)
[...]
 Why not use trace_printf and avoid the complication?

 Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf, it's 
 activated together with all other traces. I can use trace_vprintf and specify 
 a key, but I would always have to print the header rhsvn debug:  and the 
 key 
 by hand. So I could replace vfprintf in this function by trace_vprintf to do 
 that. But then there's not much simplification. (?)

Hmm.  There's no trace_printf_with_key() but that's presumably because
no one has needed it.  If it existed, you could use

#define printd(msg) trace_printf_with_key("GIT_TRACE_REMOTE_SVN", "%s", msg)

But now that I check, I don't see how the current printd() calls would
be useful to other people.  Why announce these moments and not others?
They're just temporary debugging cruft, right?

For that, plain trace_printf() works great.

[...]
 +
 +enum cmd_result { SUCCESS, NOT_HANDLED, ERROR };
[...]
 Hm.. the enum now has SUCCESS, NOT_HANDLED, TERMINATE.

Much nicer.

I think this tristate return value could be avoided entirely because...
[continued at (*) below]

[...]
 +
 +   printf("import\n");
 +   printf("\n");
 +   fflush(stdout);
 +   return SUCCESS;
 +}

 Why the multiple printf?  Is the flush needed?

 Excess printf gone.
 Flush is needed. Otherwise it doesn't flush and the other end waits forever.

Ah, fast-import is ready, remote helper is ready, no one initiates
pumping of data between them.  Maybe the purpose of the flush would
be more obvious if it were moved to the caller.

[...]
 +   /* opening a fifo for usually reading blocks until a writer has opened
 +    * it too. Therefore, we open with RDWR.
 +    */
 +   report_fd = open(back_pipe_env, O_RDWR);
 +   if(report_fd < 0) {
 +   	die("Unable to open fast-import back-pipe! %s",
 +   	strerror(errno));
 +   }

 Is this necessary?  Why shouldn't we fork the writer first and wait
 for it here?

 Yes, necessary.

Oh, dear.  I hope not.  E.g., Cygwin doesn't support opening fifos
RDWR (out of scope for the gsoc project, but still).

[...]
 E.g. If there's nothing to import, the helper sends only 'done' to
 fast-import and quits.

Won't the writer open the pipe and wait for us to open our end before
doing that?

[...]
 +
 +   code = start_command(svndump_proc);
 +   if(code)
 +   die("Unable to start %s, code %d", svndump_proc.argv[0], code);

 start_command() is supposed to have printed a message already when it
 fails, unless errno == ENOENT and silent_exec_failure was set.

 Yes, but it doesn't die, right?

You can exit without writing a message with exit(), e.g. like so:

if (code)
exit(code);

or like so:

if (code)
exit(128);

[...]
 +
 +   close(svndump_proc.out);

 Important?  Wouldn't finish_command do this?

 As far as I understood it, it doesn't close extra created pipes. Probably I 
 just didn't find it in the code ..

So this is to work around a bug in the run-command interface?

[...]
 +   close(report_fd);

 What is the purpose of this step?

 Close the back-report pipe end of the remote-helper.

That's just repeating the question. :)  Perhaps it's supposed to
trigger some action on the other end of the pipe?  It would just be
useful to add a comment documenting why one shouldn't remove this
close() call, or else will probably end up removing it and needlessly
suffering.

[...]
 +
 +   code = finish_command(svndump_proc);
 +   if(code)
 +   warning("Something went wrong with termination of %s, code %d",
 svndump_proc.argv[0], code);
 finish_command() is supposed to print a message when it fails.

 I changed the message text. It should tell us if svnrdump exited with
 non-zero.

I'd suggest looking at other finish_command() callers for examples.

[...]
 +enum cmd_result do_command(struct strbuf* line)
 +{
 +   const command* p = command_list;
 +   enum cmd_result ret;
 +   printd("command line '%s'", line->buf);
 +   while(*p) {
 +   ret = (*p)(line);
 +   if(ret != NOT_HANDLED)
 +   return ret;
 +   p++;
 +   }

 If possible, matching commands by name (like git.c does) would make
 the behavior easier to predict.

 There is some usecase for this. The intention was, that command handlers 
 should be able to process more than one 'name'. E.g. an import batch is 
 terminated by a newline. This newline is handled by the import handler if it 
 is a batch. (This feature wasn't implemented in the version reviewed here.)

 So I decided to let the handler functions tell if they handle this line.

[continued from (*) above]
... it isn't needed at the moment.

See http://c2.com/xp/YouArentGonnaNeedIt.html

[...]
 +   free((void*)url);
 +   

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Steven Michalske

On Jul 2, 2012, at 4:07 AM, Jonathan Nieder jrnie...@gmail.com wrote:

 [...]
 diff: Use fifo instead of pipe: Retrieve the name of the pipe from env and 
 open it
 for svndump.
 
 I'd prefer to avoid this if possible, since it means having to decide
 where the pipe goes on the filesystem.  Can you summarize the
 discussion in the commit message so future readers understand why
 we're doing it?

Crazy thought here but would a socket not be a bad choice here?

Imagine being able to ssh tunnel into the SVN server and run the helper with 
filesystem access to the SVN repo.

Akin to the pushy project use case.
http://packages.python.org/pushy/

SSH into the machine, copy the required components to the machine, and use the 
RPC.
Nothing needed but SSH and python.  In this case SSH, SVN, and the helper would 
be needed.

This also would work just fine with the local host too.

Steve

Note: Resent. Sorry, it was signed and rejected before.


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Jonathan Nieder
(cc-ing Ram since he's also knowledgeable about remote-helper protocol)
Florian Achleitner wrote:
 On Thursday 26 July 2012 06:40:39 Jonathan Nieder wrote:

 Though I still
 think the way forward is to keep using plain pipes internally for now
 and to make the bidirectional communication optional, since it
 wouldn't close any doors to whatever is most convenient on each
 platform.  Hopefully I'll hear more from Florian about this in time.

 Would you like to see a new pipe patch?

Since the svn remote helper relies on this, it seems worth working on,
yeah.  As for how to spend your time (and whether to beg someone else
to work on it instead :)): I'm not sure what's on your plate or where
you are with respect to the original plan for the summer at the
moment, so it would be hard for me to give useful advice about how to
balance things.

What did you think of the suggestion of adding a new bidi-import
capability and command to the remote helper protocol?  I think this
would be clean and avoid causing a regression on Windows, but it's
easily possible I am missing something fundamental.

Thanks,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
On Thursday 26 July 2012 04:08:42 Jonathan Nieder wrote:
 Florian Achleitner wrote:

  On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:
 [...]
 
  +
  +static inline void printd(const char* fmt, ...)
 
 [...]
 
  Why not use trace_printf and avoid the complication?
  
  Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf,
  it's activated together with all other traces. I can use trace_vprintf
  and specify a key, but I would always have to print the header rhsvn
  debug:  and the key by hand. So I could replace vfprintf in this
  function by trace_vprintf to do that. But then there's not much
  simplification. (?)
 
 Hmm.  There's no trace_printf_with_key() but that's presumably because
 no one has needed it.  If it existed, you could use
 
   #define printd(msg) trace_printf_with_key("GIT_TRACE_REMOTE_SVN", "%s", msg)
 
 But now that I check, I don't see how the current printd() calls would
 be useful to other people.  Why announce these moments and not others?
 They're just temporary debugging cruft, right?
 
 For that, plain trace_printf() works great.

Yes, it's for debugging only, I could just delete it all. It's inspired by 
transport-helper.c. The env var GIT_TRANSPORT_HELPER_DEBUG enables it. While 
transport-helper has a lot of if (debug) fprintf(..), I encapsulated it in 
printd.
So I should kick printd out?

  +
  + printf("import\n");
  + printf("\n");
  + fflush(stdout);
  + return SUCCESS;
  +}
  
  Why the multiple printf?  Is the flush needed?
  
  Excess printf gone.
  Flush is needed. Otherwise it doesn't flush and the other end waits
  forever.
 Ah, fast-import is ready, remote helper is ready, no one initiates
 pumping of data between them.  Maybe the purpose of the flush would
 be more obvious if it were moved to the caller.

Actually this goes to the git parent process (not fast-import), waiting for a 
reply to the command. I think I have to call flush on this side of the pipe. 
Can you flush it from the reader? This wouldn't have the desired effect, it 
drops buffered data.

 [...]
 
  + /* opening a fifo for usually reading blocks until a writer has opened
  +  * it too. Therefore, we open with RDWR.
  +  */
  + report_fd = open(back_pipe_env, O_RDWR);
  + if(report_fd < 0) {
  + 	die("Unable to open fast-import back-pipe! %s",
  + 	strerror(errno));
  + }
  
  Is this necessary?  Why shouldn't we fork the writer first and wait
  for it here?
  
  Yes, necessary.
 
 Oh, dear.  I hope not.  E.g., Cygwin doesn't support opening fifos
 RDWR (out of scope for the gsoc project, but still).

I believe it can be solved using RDONLY and WRONLY too. Probably we solve it 
by not using the fifo at all.
Currently the blocking comes from the fact that fast-import doesn't parse 
its command line at startup. It rather reads an input line first and decides 
whether to parse the argv after reading the first input line or at the end of 
the input. (don't know why)
remote-svn opens the pipe before sending the first command to fast-import and 
blocks on the open, while fast-import waits for input -- deadlock.
with remote-svn: RDWR, fast-import: WRONLY, this works.

Other scenario: Nothing to import, remote-svn only sends 'done' and closes the 
pipe again. After fast-import reads the first line it parses its command line 
and tries to open the fifo which is already closed on the other side -- 
blocks.
This is solved by using RDWR on both sides.

If we change the points where the pipes are opened and closed, this could be 
circumvented.

 
 [...]
 
  E.g. If there's nothing to import, the helper sends only 'done' to
  fast-import and quits.
 
 Won't the writer open the pipe and wait for us to open our end before
 doing that?
 
 [...]
 
  +
  + code = start_command(svndump_proc);
  + if(code)
   + die("Unable to start %s, code %d", svndump_proc.argv[0], code);
  
  start_command() is supposed to have printed a message already when it
  fails, unless errno == ENOENT and silent_exec_failure was set.
  
  Yes, but it doesn't die, right?
 
 You can exit without writing a message with exit(), e.g. like so:
 
   if (code)
   exit(code);
 
 or like so:
 
   if (code)
   exit(128);

ok, why not..

 
 [...]
 
  +
  + close(svndump_proc.out);
  
  Important?  Wouldn't finish_command do this?
  
  As far as I understood it, it doesn't close extra created pipes. Probably
  I
  just didn't find it in the code ..
 
 So this is to work around a bug in the run-command interface?

Good question. 

 
 [...]
 
  + close(report_fd);
  
  What is the purpose of this step?
  
  Close the back-report pipe end of the remote-helper.
 
 That's just repeating the question. :)  Perhaps it's supposed to
 trigger some action on the other end of the pipe?  It would just be
 useful to add a comment documenting why one shouldn't remove this
 close() call, or else will probably end up removing it and needlessly
 suffering.

It's just closing files I've opened 

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 [...]
 +
 +enum cmd_result { SUCCESS, NOT_HANDLED, ERROR };
 [...]
 Hm.. the enum now has SUCCESS, NOT_HANDLED, TERMINATE.

 Much nicer.

 I think this tristate return value could be avoided entirely because...
 ... it isn't needed at the moment.

I am not sure what you mean by that.

The command dispatcher loop in [Patch v2 1/16] seems to call every
possible input handler with the first line of the input and expect
them to answer "This is not for me", so NOT_HANDLED is needed.

An alternative dispatcher could be written in such a way that the
dispatcher inspects the first line and decides what to call, and in
such a scheme, you do not need NOT_HANDLED. My intuition tells me
that such an arrangement is in general a better organization.

Looking at what cmd_import() does, however, I think the approach the
patch takes might make sense for this application.  Unlike other
handlers like "capabilities" that do not want to handle anything
other than "capabilities", it wants to handle two:

 - "import" that starts an import batch;
 - "" (an empty line), but only when an import batch is in effect.

A centralized dispatcher that does not use NOT_HANDLED could be
written for such an input stream, but then the state information
(i.e. are we in an import batch?) needs to be global, which may or
may not be desirable (I haven't thought things through on this).

In any case, if you are going to use dispatching based on
NOT_HANDLED, the result may have to be (at least) quadri-state.  In
addition to "I am done successfully, please go back and dispatch
another command" (SUCCESS), "This is not for me" (NOT_HANDLED), and
"I am done successfully, and there is no need to dispatch and
process another command further" (TERMINATE), you may want to be
able to say "This was for me, but I found an error" (ERROR).

Of course, if the dispatch loop has to be rewritten so that a
central dispatcher decides what to call, individual input handlers
do not need to say NOT_HANDLED nor TERMINATE, as the central
dispatcher should keep track of the overall state of the system, and
the usual 0 on success, negative on error may be sufficient.

One thing I wondered was how an input "capabilities" (or "list")
should be handled after "import" was issued (hence batch_active
becomes true).  The dispatcher loop in the patch based on
NOT_HANDLED convention will happily call cmd_capabilities(), which
does not have any notion of the batch_active state (because it is a
function scope static inside cmd_import()), and will say "Ah, that
is mine, and let me do my thing."  If we want to diagnose such an
input stream as an error, the dispatch loop needs to become aware of
the overall state of the system _anyway_, so that may be an argument
against the NOT_HANDLED based dispatch system the patch series uses.
