Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Jonathan Nieder
Florian Achleitner wrote:

> This is how I see it, probably it's all wrong:
> I thought the main problem is that we don't want processes to have *more than
> three pipes attached*, i.e. stdout, stdin, stderr, because existing APIs don't
> allow it.

Oh, that makes sense.  Thanks for explaining, and sorry to have been so
dense.

At the Windows API level, Set/GetStdHandle() is only advertised to
handle stdin, stdout, and stderr, so on Windows there would indeed
need to be some magic to communicate the inherited HANDLE value to
fast-import.
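
For what it's worth, a minimal sketch of what that magic could look like on the
child side, assuming the parent passes the inherited pipe HANDLE's numeric
value as a command-line argument (the function name and argument convention are
hypothetical, not anything in git or msysgit):

	#include <io.h>
	#include <stdint.h>
	#include <stdlib.h>

	/* 'arg' holds the decimal value of an inherited HANDLE (hypothetical convention). */
	static int fd_from_inherited_handle(const char *arg)
	{
		intptr_t handle_value = (intptr_t)strtoll(arg, NULL, 10);

		/* wrap the inherited OS handle in a CRT file descriptor */
		return _open_osfhandle(handle_value, 0);
	}

fast-import could then use the returned descriptor much like the one it gets
from --cat-blob-fd today.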

But I am confident in the Windows porters, and if a fast-import
interface change ends up being needed, I think we can deal with it
when the moment comes and it wouldn't necessitate changing the remote
helper interface.

You also mentioned before that passing fast-import responses to the
remote helper stdin means that the remote helper has to slurp up the
whole list of refs to import before starting to talk to fast-import.
That was good practice already anyway, but it is a real pitfall and a
real downside to the single-input approach.  Thanks for bringing it
up.

Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Florian Achleitner
On Sunday 12 August 2012 09:12:58 Jonathan Nieder wrote:
> Hi again,
> 
> Florian Achleitner wrote:
> > back to the pipe-topic.
> 
> Ok, thanks.
> 
> [...]
> 
> >> The way it's supposed to work is that in a bidi-import, the remote
> >> helper reads in the entire list of refs to be imported and only once
> >> the newline indicating that that list is over arrives starts writing
> >> its fast-import stream.
> 
> [...]
> 
> > This would require all existing remote helpers that use 'import' to be
> > ported to the new concept, right? Probably there is no other..
> 
> You mean all existing remote helpers that use 'bidi-import', right?
> There are none.

Ok, it would not affect the existing import command.
> 
> [...]
> 
> > I still don't believe that sharing the input pipe of the remote helper is
> > worth the hassle.
> > It still requires an additional pipe to be set up, the one from fast-import
> > to the remote-helper, sharing one FD at the remote helper.
> 
> If I understand correctly, you misunderstood how sharing the input
> pipe works.  Have you tried it?

Yes, I wrote a test program; sharing works, that's not the problem.

> 
> It does not involve setting up an additional pipe.  Standard input for
> the remote helper is already a pipe.  That pipe is what allows
> transport-helper.c to communicate with the remote helper.  Letting
> fast-import share that pipe involves passing that file descriptor to
> git fast-import.  No additional pipe() calls.
> 
> Do you mean that it would be too much work to implement?  This
> explanation just doesn't make sense to me, given that the version
> using pipe() *already* *exists* and is *tested*.

Yes, that was the first version I wrote, and the one remote-svn-alpha uses.

> 
> I get the feeling I am missing something very basic.  I would welcome
> input from others that shows what I am missing.
> 

This is how I see it; probably it's all wrong:
I thought the main problem is that we don't want processes to have *more than
three pipes attached*, i.e. stdout, stdin, stderr, because existing APIs don't
allow it.
When we share stdin of the remote helper, we achieve this goal for this one
process, but fast-import still has an additional pipe:
stdout --> shell;
stderr --> shell;
stdin <-- remote-helper;
additional_pipe --> remote-helper.

That's what I wanted to say: we still have more than three pipes on
fast-import.
And we need to transfer that fourth file descriptor by inheritance and its
number as a command line argument.
So if we make the remote-helper have only three pipes by double-using stdin,
but fast-import still has four pipes, what problem does it solve?

Using fifos would remove the requirement to inherit more than three pipes. 
That's my point.
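
For comparison, here is a minimal POSIX sketch (my illustration, not code from
any patch in this thread) of the fourth-descriptor setup being described: the
back-channel is created with pipe(), the write end is inherited across exec,
and its number is handed to fast-import via --cat-blob-fd:

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>

	static pid_t spawn_fast_import_with_backpipe(int *backflow_read)
	{
		int fds[2];
		char arg[32];
		pid_t pid;

		if (pipe(fds) < 0)
			return -1;
		pid = fork();
		if (pid < 0)
			return -1;
		if (pid == 0) {
			/* child: keep only the write end and tell fast-import its number */
			close(fds[0]);
			snprintf(arg, sizeof(arg), "--cat-blob-fd=%d", fds[1]);
			execlp("git", "git", "fast-import", arg, (char *)NULL);
			_exit(127);
		}
		close(fds[1]);
		*backflow_read = fds[0];	/* the remote helper reads responses here */
		return pid;
	}

This is the shape of the setup that needs four descriptors on fast-import; the
fifo variant would replace the pipe() and the inheritance with mkfifo() plus an
open() by path in each process.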

[..]
> 
> Meanwhile it would:
> 
>  - be 100% functionally equivalent to the solution where fast-import
>writes directly to the remote helper's standard input.  Two programs
>can have the same pipe open for writing at the same time for a few
>seconds and that is *perfectly fine*.  On Unix and on Windows.
> 
>    On Windows the only complication with the pipe()-based approach is that
> we haven't wired up the low-level logic to pass file descriptors other than
> stdin, stdout, stderr to child processes; and if I have understood earlier
> messages correctly, the operating system *does* have a
>    concept of that and this is just a todo item in the msys
>    implementation.

I dug into MSDN and it seems it's not a problem at all at the Windows API
layer. Pipe handles can be inherited. [1]
Once the low-level logic supports passing more than 3 fds, it will work for
fast-import as well as for the remote-helper.
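
A small illustration of the MSDN mechanics referenced here (my sketch, not
msysgit code): a pipe handle created with bInheritHandle set is inherited by a
child started with bInheritHandles=TRUE, and its numeric value can then be
passed to the child, e.g. on the command line:

	#include <windows.h>

	/* Create a pipe whose write end a child process may inherit. */
	static HANDLE create_inheritable_write_end(HANDLE *read_end)
	{
		SECURITY_ATTRIBUTES sa = { sizeof(sa), NULL, TRUE };	/* bInheritHandle = TRUE */
		HANDLE write_end;

		if (!CreatePipe(read_end, &write_end, &sa, 0))
			return INVALID_HANDLE_VALUE;
		/* a real implementation would clear inheritance on the end it keeps,
		   e.g. SetHandleInformation(*read_end, HANDLE_FLAG_INHERIT, 0) */
		return write_end;
	}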

> 
>  - be more complicated than the code that already exists for this
>stuff.
> 
> So while I presented this as a compromise, I don't see the point.
> 
> Is your goal portability, a dislike of the interface, some
> implementation detail I have missed, or something else?  Could you
> explain the problem as concisely but clearly as possible (perhaps
> using an example) so that others like Sverre, Peff, or David can help
> think through it and to explain it in a way that dim people like me
> understand what's going on?

It all started as a portability-only discussion. On Linux, my first version
would have worked: it created an additional pipe before forking, using pipe().
It ran great; it did it like remote-svn-alpha.sh.

I wouldn't have started to produce something else or to start a discussion on
my own, but I was told it's not good because of portability. That is the root
of this endless story (you already know the thread, I think). For weeks nobody
has been interested in it except you and me.
 
So if we accept having more than three pipes on a process, we have no problem
any more.
We can dig out that first version, as well as write the one proposed by you.
Your version saves some trouble by not requiring an additional pipe() call and
not requiring the preexec_cb, but adds a little complexity with the re-u

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Jonathan Nieder
Hi again,

Florian Achleitner wrote:

> Another alternative would be to use the existing --cat-blob-fd argument. But
> that requires opening the fifo before execing fast-import and makes us
> dependent on the POSIX model of forking and inheriting file descriptors

Let me elaborate on this, which I think is the real point (though I
could easily be wrong).

Probably instead of just sending feedback I should have been sending
suggested patches to compare.  You do not work for me and your time is
not my time, and I would not want to have the responsibility of
dictating how your code works, anyway.  Generally speaking, as long as
code is useful and not hurting maintainability, the best thing I can
do is to make it easy to incorporate.  That has side benefits, too,
like giving an example of what it is like to have your code
incorporated and creating momentum.  Small mistakes can be fixed
later.

Unfortunately here we are talking about two interfaces that need to
be stable.  Mistakes stay --- once there is a UI wart in interfaces
people are using, we are stuck continuing to support it.

To make life even more complicated, there are two interfaces involved
here:

 - the remote-helper protocol, which I think is very important
   to keep sane and stable.  Despite the remote-helper protocol
   existing for a long time, it hasn't seen a lot of adoption
   outside git yet, and complicating the protocol will not help
   that.

   I am worried about what it would mean to add an additional
   stream on top of the standard input and output used in the current
   remote-helper protocol.  Receiving responses to fast-import
   commands on stdin seems very simple.  Maybe I am wrong?

 - the fast-import interface, which is also important but I'm not
   as worried about.  Adding a new command-line parameter means
   importers that call fast-import can unnecessarily depend on
   newer versions of git, so it's still something to be avoided.

Given a particular interface, there may be multiple choices of how to
implement it.  For example, on Unix the remote-helper protocol could
be fulfilled by letting fast-import inherit the file descriptor
helper->in, while on Windows it could for example be fulfilled by
using DuplicateHandle() to transmit the pipe handle before or after
fast-import has already started running.

The implementation strategy can easily be changed later.  What is
important is that the interface be pleasant and *possible* to
implement sanely.

Hoping that clarifies a little,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Jonathan Nieder
Hi again,

Florian Achleitner wrote:

> back to the pipe-topic.

Ok, thanks.

[...]
>> The way it's supposed to work is that in a bidi-import, the remote
>> helper reads in the entire list of refs to be imported and only once
>> the newline indicating that that list is over arrives starts writing
>> its fast-import stream.
[...]
> This would require all existing remote helpers that use 'import' to be ported
> to the new concept, right? Probably there are no others.

You mean all existing remote helpers that use 'bidi-import', right?
There are none.

[...]
> I still don't believe that sharing the input pipe of the remote helper is
> worth the hassle.
> It still requires an additional pipe to be set up, the one from fast-import to
> the remote-helper, sharing one FD at the remote helper.

If I understand correctly, you misunderstood how sharing the input
pipe works.  Have you tried it?

It does not involve setting up an additional pipe.  Standard input for
the remote helper is already a pipe.  That pipe is what allows
transport-helper.c to communicate with the remote helper.  Letting
fast-import share that pipe involves passing that file descriptor to
git fast-import.  No additional pipe() calls.

Do you mean that it would be too much work to implement?  This
explanation just doesn't make sense to me, given that the version
using pipe() *already* *exists* and is *tested*.

I get the feeling I am missing something very basic.  I would welcome
input from others that shows what I am missing.

[...]
> Another alternative would be to use the existing --cat-blob-fd argument. But
> that requires opening the fifo before execing fast-import and makes us
> dependent on the POSIX model of forking and inheriting file descriptors, while
> opening a fifo in fast-import would not.

I'm getting kind of frustrated with this conversation going nowhere.  Here
is a compromise to explain why.  Suppose:

- fast-import learns a --cat-blob-file parameter, which tells it to open a
  file to write responses to "cat-blob" and "ls" commands to instead of
  using an inherited file descriptor

- transport-helper.c sets up a named pipe and passes it using --cat-blob-file.

- transport-helper.c reads from that named pipe and copies everything it sees
  to the remote helper's standard input, until fast-import exits.
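
A rough sketch of the copy step in that compromise (note that --cat-blob-file
is hypothetical, proposed just above and not an existing fast-import option;
the fifo path and helper descriptor are placeholders):

	#include <fcntl.h>
	#include <unistd.h>

	/* Copy fast-import's cat-blob/ls responses onto the remote helper's stdin. */
	static void pump_backflow(const char *fifo_path, int helper_in)
	{
		char buf[4096];
		ssize_t n;
		int fd = open(fifo_path, O_RDONLY);	/* returns once fast-import opens it for writing */

		if (fd < 0)
			return;
		while ((n = read(fd, buf, sizeof(buf))) > 0)	/* EOF when fast-import exits */
			if (write(helper_in, buf, n) != n)
				break;
		close(fd);
	}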

This would:

 - allow us to continue to use a very simple protocol for communicating
   with the remote helper, where commands and fast-import backflow both
   come on standard input

 - work fine on both Windows and Unix

Meanwhile it would:

 - be 100% functionally equivalent to the solution where fast-import
   writes directly to the remote helper's standard input.  Two programs
   can have the same pipe open for writing at the same time for a few
   seconds and that is *perfectly fine*.  On Unix and on Windows.

   On Windows the only complication with the pipe()-based approach is that we
   haven't wired up the low-level logic to pass file descriptors other than
   stdin, stdout, stderr to child processes; and if I have understood
   earlier messages correctly, the operating system *does* have a
   concept of that and this is just a todo item in the msys
   implementation.

 - be more complicated than the code that already exists for this
   stuff.

So while I presented this as a compromise, I don't see the point.

Is your goal portability, a dislike of the interface, some
implementation detail I have missed, or something else?  Could you
explain the problem as concisely but clearly as possible (perhaps
using an example) so that others like Sverre, Peff, or David can help
think through it and to explain it in a way that dim people like me
understand what's going on?

Puzzled,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Florian Achleitner
Hi,

back to the pipe-topic.

On Wednesday 01 August 2012 12:42:48 Jonathan Nieder wrote:
> Hi again,
> 
> Florian Achleitner wrote:
> > When the first line arrives at the remote-helper, it starts importing one
> > line at a time, leaving the remaining lines in the pipe.
> > For importing it requires the data from fast-import, which would be mixed
> > with import lines or queued at the end of them.
> 
> Oh, good catch.
> 
> The way it's supposed to work is that in a bidi-import, the remote
> helper reads in the entire list of refs to be imported and only once
> the newline indicating that that list is over arrives starts writing
> its fast-import stream.  We could make this more obvious by not
> spawning fast-import until immediately before writing that newline.
> 
> This needs to be clearly documented in the git-remote-helpers(1) page
> if the bidi-import command is introduced.
> 
> If a remote helper writes commands for fast-import before that newline
> comes, that is a bug in the remote helper, plain and simple.  It might
> be fun to diagnose this problem:

This would require all existing remote helpers that use 'import' to be ported
to the new concept, right? Probably there are no others.

> 
>   static void pipe_drained_or_die(int fd, const char *msg)
>   {
>           char buf[1];
>           int flags = fcntl(fd, F_GETFL);
>           if (flags < 0)
>                   die_errno("cannot get pipe flags");
>           if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
>                   die_errno("cannot set up non-blocking pipe read");
>           if (read(fd, buf, 1) > 0)
>                   die("%s", msg);
>           if (fcntl(fd, F_SETFL, flags))
>                   die_errno("cannot restore pipe flags");
>   }
>   ...
>
>   for (i = 0; i < nr_heads; i++) {
>           write "import %s\n", to_fetch[i]->name;
>   }
>
>   if (getenv("GIT_REMOTE_HELPERS_SLOW_SANITY_CHECK"))
>           sleep(1);
>
>   pipe_drained_or_die(data->helper->out,
>           "unexpected output from remote helper before fast-import launch");
>
>   if (get_importer(transport, &fastimport))
>           die("couldn't run fast-import");
>   write_constant(data->helper->in, "\n");

I still don't believe that sharing the input pipe of the remote helper is
worth the hassle.
It still requires an additional pipe to be set up, the one from fast-import to
the remote-helper, sharing one FD at the remote helper.
It still requires more than just stdin, stdout, stderr.

I would suggest using a fifo. It can be opened independently, after forking,
and on Windows there are named pipes with similar semantics, so I think this
could be ported easily.
I would suggest the following changes:
- add a capability to the remote helper, 'bidi-import' or 'bidi-pipe'. This
signals that the remote helper requires data from fast-import.

- add a command, 'bidi-import' or 'bidi-pipe', that tells the remote helper
which filename the fifo is at, so that it can open and read it when it
handles 'import' commands.

- transport-helper.c creates the fifo on demand, i.e. on seeing the capability,
in the gitdir or in /tmp.

- fast-import gets the name of the fifo as a command-line argument. One
alternative would be to add a command, but that's not allowed because it
changes the stream semantics.
Another alternative would be to use the existing --cat-blob-fd argument. But
that requires opening the fifo before execing fast-import and makes us
dependent on the POSIX model of forking and inheriting file descriptors, while
opening a fifo in fast-import would not.
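
On the remote-helper side, the proposed 'bidi-import' command would boil down
to something like this sketch (the names and error message are illustrative
only, not the actual remote-svn code):

	#include <fcntl.h>

	static int backflow_fd = -1;

	/* 'path' is the fifo filename announced by the 'bidi-import' command. */
	static void cmd_bidi_import(const char *path)
	{
		backflow_fd = open(path, O_RDONLY);
		if (backflow_fd < 0)
			die_errno("could not open fast-import back-pipe '%s'", path);
	}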


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-01 Thread Jonathan Nieder
Hi again,

Florian Achleitner wrote:

> When the first line arrives at the remote-helper, it starts importing one 
> line 
> at a time, leaving the remaining lines in the pipe.
> For importing it requires the data from fast-import, which would be mixed 
> with 
> import lines or queued at the end of them.

Oh, good catch.

The way it's supposed to work is that in a bidi-import, the remote
helper reads in the entire list of refs to be imported and only once
the newline indicating that that list is over arrives starts writing
its fast-import stream.  We could make this more obvious by not
spawning fast-import until immediately before writing that newline.
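
In code, the intended remote-helper side of a bidi-import batch would look
roughly like this sketch (using git's strbuf/string_list helpers;
handle_import_ref() is a hypothetical per-ref import routine):

	static void do_import_batch(void)
	{
		struct strbuf line = STRBUF_INIT;
		struct string_list refs = STRING_LIST_INIT_DUP;
		int i;

		/* slurp the whole batch of "import <ref>" lines first ... */
		while (strbuf_getline(&line, stdin, '\n') != EOF && line.len) {
			if (!prefixcmp(line.buf, "import "))
				string_list_append(&refs, line.buf + strlen("import "));
		}
		/* ... the terminating blank line means fast-import is now listening */
		for (i = 0; i < refs.nr; i++)
			handle_import_ref(refs.items[i].string);	/* hypothetical */
		printf("done\n");
		fflush(stdout);
	}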

This needs to be clearly documented in the git-remote-helpers(1) page
if the bidi-import command is introduced.

If a remote helper writes commands for fast-import before that newline
comes, that is a bug in the remote helper, plain and simple.  It might
be fun to diagnose this problem:

static void pipe_drained_or_die(int fd, const char *msg)
{
	char buf[1];
	int flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		die_errno("cannot get pipe flags");
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
		die_errno("cannot set up non-blocking pipe read");
	if (read(fd, buf, 1) > 0)
		die("%s", msg);
	if (fcntl(fd, F_SETFL, flags))
		die_errno("cannot restore pipe flags");
}
...

for (i = 0; i < nr_heads; i++) {
	write "import %s\n", to_fetch[i]->name;
}

if (getenv("GIT_REMOTE_HELPERS_SLOW_SANITY_CHECK"))
	sleep(1);

pipe_drained_or_die(data->helper->out,
	"unexpected output from remote helper before fast-import launch");

if (get_importer(transport, &fastimport))
	die("couldn't run fast-import");
write_constant(data->helper->in, "\n");


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-01 Thread Florian Achleitner
On Tuesday 31 July 2012 15:43:57 Jonathan Nieder wrote:
> Florian Achleitner wrote:
> > I haven't tried that yet, nor do I remember anything where I've already
> > seen two processes writing to the same pipe.
> 
> It's a perfectly normal and well supported thing to do.

I played around with a little test program. It generally works.
I'm still not convinced that this doesn't cause more problems than it solves.
The standard defines that write calls to pipe fds are atomic, i.e. data is not
interleaved with data from other processes, as long as the data is less than
PIPE_BUF [1].
We would need some kind of locking/synchronization to make it work for sure,
while I believe it will work most of the time.
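
For reference, a tiny standalone test along these lines (my reconstruction, not
Florian's actual test program): parent and child write to the same pipe, and
because each write is far below PIPE_BUF it comes out unmangled:

	#include <stdio.h>
	#include <sys/wait.h>
	#include <unistd.h>

	int main(void)
	{
		int fd[2];
		char buf[256];
		ssize_t n;

		if (pipe(fd) < 0)
			return 1;
		if (fork() == 0) {			/* first writer */
			close(fd[0]);
			write(fd[1], "from child\n", 11);
			return 0;
		}
		write(fd[1], "from parent\n", 12);	/* second writer on the same pipe */
		close(fd[1]);				/* reader sees EOF once the child exits too */
		while ((n = read(fd[0], buf, sizeof(buf))) > 0)
			fwrite(buf, 1, n, stdout);
		wait(NULL);
		return 0;
	}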

Currently it runs like this:
transport-helper.c writes one or more 'import' lines; we don't know in advance
how many there are or how long they are. Then it waits for fast-import to
finish.

When the first line arrives at the remote-helper, it starts importing one line 
at a time, leaving the remaining lines in the pipe.
For importing it requires the data from fast-import, which would be mixed with 
import lines or queued at the end of them.

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-31 Thread Jonathan Nieder
Florian Achleitner wrote:

> I haven't tried that yet, nor do I remember anything where I've already seen
> two processes writing to the same pipe.

It's a perfectly normal and well supported thing to do.

[...]
> Will try that in test-program..

Thanks.

Good luck,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-31 Thread Florian Achleitner
On Monday 30 July 2012 11:55:02 Jonathan Nieder wrote:
> Florian Achleitner wrote:
> > Hm .. that would mean, that both fast-import and git (transport-helper)
> > would write to the remote-helper's stdin, right?
> 
> Yes, first git writes the list of refs to import, and then fast-import
> writes feedback during the import.  Is that a problem?

I haven't tried that yet, nor do I remember anywhere I've already seen two
processes writing to the same pipe.
At least it sounds cumbersome to me. Processes' lifetimes overlap, so buffering
and flushing could mix data.
We have to use it for both purposes interchangeably, of course, because there
can be more than one import command to the remote-helper.

Will try that in test-program..


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Jonathan Nieder
Florian Achleitner wrote:

> Hm .. that would mean, that both fast-import and git (transport-helper) would 
> write to the remote-helper's stdin, right?

Yes, first git writes the list of refs to import, and then fast-import
writes feedback during the import.  Is that a problem?


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Monday 30 July 2012 03:29:52 Jonathan Nieder wrote:
> > Generally I like your prefered solution.
> > I think there's one problem:
> > The pipe needs to be created before the fork, so that the fd can be
> > inherited. 
> The relevant pipe already exists at that point: the remote helper's
> stdin.
> 
> In other words, it could work like this (just like the existing demo
> code, except adding a conditional based on the "capabilities"
> response):
> 
> 0. transport-helper.c invokes the remote helper.  This requires
>a pipe used to send commands to the remote helper
>(helper->in) and a pipe used to receive responses from the
>remote helper (helper->out)
> 
> 1. transport-helper.c sends the "capabilities" command to decide
>what to do.  The remote helper replies that it would like
>some feedback from fast-import.
> 
> 2. transport-helper.c forks and execs git fast-import with input
>redirected from helper->out and the cat-blob fd redirected
>to helper->in

fast-import writes to the helper's stdin.

> 3. transport-helper.c tells the remote helper to start the
>import

transport-helper writes commands to the helper's stdin.

> 
> 4. wait for fast-import to exit

Hm .. that would mean that both fast-import and git (transport-helper) would
write to the remote-helper's stdin, right?


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Jonathan Nieder
Florian Achleitner wrote:
> On Saturday 28 July 2012 02:00:31 Jonathan Nieder wrote:

>> a. use --cat-blob-fd, no FIFO
[...]
>>* Make it conditional --- only do it if (1) we are not on Windows and
>>  (2) the remote helper requests backflow by advertising the
>>  import-bidi capability.
>>
>>* Let the remote helper know what's going on by using
>>  "import-bidi" instead of "import" in the command stream to
>>  initiate the import.
>
> Generally I like your preferred solution.
> I think there's one problem:
> The pipe needs to be created before the fork, so that the fd can be 
> inherited. 

The relevant pipe already exists at that point: the remote helper's
stdin.

In other words, it could work like this (just like the existing demo
code, except adding a conditional based on the "capabilities"
response):

0. transport-helper.c invokes the remote helper.  This requires
   a pipe used to send commands to the remote helper
   (helper->in) and a pipe used to receive responses from the
   remote helper (helper->out)

1. transport-helper.c sends the "capabilities" command to decide
   what to do.  The remote helper replies that it would like
   some feedback from fast-import.

2. transport-helper.c forks and execs git fast-import with input
   redirected from helper->out and the cat-blob fd redirected
   to helper->in

3. transport-helper.c tells the remote helper to start the
   import

4. wait for fast-import to exit
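
Expressed as (hypothetical, simplified) code, step 2 amounts to nothing more
than wiring fast-import up to the two pipes that already exist; there is no
additional pipe() call (this sketch assumes the inherited descriptors are not
marked close-on-exec):

	#include <stdio.h>
	#include <unistd.h>

	/*
	 * helper_in:  write end of the pipe feeding the remote helper's stdin
	 * helper_out: read end of the pipe carrying the remote helper's stdout
	 */
	static pid_t spawn_fast_import(int helper_in, int helper_out)
	{
		char arg[32];
		pid_t pid = fork();

		if (pid)
			return pid;	/* parent (or error) */
		/* fast-import reads the stream the helper writes on its stdout ... */
		dup2(helper_out, 0);
		/* ... and sends cat-blob/ls responses into the helper's stdin pipe */
		snprintf(arg, sizeof(arg), "--cat-blob-fd=%d", helper_in);
		execlp("git", "git", "fast-import", arg, (char *)NULL);
		_exit(127);
	}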


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Thursday 26 July 2012 10:29:51 Junio C Hamano wrote:
> Of course, if the dispatch loop has to be rewritten so that a
> central dispatcher decides what to call, individual input handlers
> do not need to say NOT_HANDLED nor TERMINATE, as the central
> dispatcher should keep track of the overall state of the system, and
> the usual "0 on success, negative on error" may be sufficient.
> 
> One thing I wondered was how an input "capability" (or "list")
> should be handled after "import" was issued (hence batch_active
> becomes true).  The dispatcher loop in the patch based on
> NOT_HANDLED convention will happily call cmd_capabilities(), which
> does not have any notion of the batch_active state (because it is a
> function scope static inside cmd_import()), and will say "Ah, that
> is mine, and let me do my thing."  If we want to diagnose such an
> input stream as an error, the dispatch loop needs to become aware of
> the overall state of the system _anyway_, so that may be an argument
> against the NOT_HANDLED based dispatch system the patch series uses.

That's a good point. The current implementation allows other commands to
appear during import batches. This shouldn't be possible according to the
protocol, I think. But it doesn't do harm. Solving it will require global
state and a move towards a global dispatcher.




Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Saturday 28 July 2012 02:00:31 Jonathan Nieder wrote:
> Thanks for explaining.  Now we've discussed a few different approaches,
> none of which is perfect.
> 
> a. use --cat-blob-fd, no FIFO
> 
>Doing this unconditionally would break platforms that don't support
>--cat-blob-fd=(descriptor >2), like Windows, so we'd have to:
> 
>* Make it conditional --- only do it if (1) we are not on Windows and
>  (2) the remote helper requests backflow by advertising the
>  import-bidi capability.
> 
>* Let the remote helper know what's going on by using
>  "import-bidi" instead of "import" in the command stream to
>  initiate the import.

Generally I like your preferred solution.
I think there's one problem:
The pipe needs to be created before the fork, so that the fd can be inherited.
There is no way to create it when the remote-helper advertises a capability,
because the helper has already been forked by then. This would work with
fifos, though.

We could:
- add a capability: bidi-import.
- make transport-helper create a fifo if the helper advertises it.
- add a command for remote-helpers, like 'bidi-import <path>', that makes the
remote helper open the fifo at <path> and use it.
- fast-import is forked after the helper, so we already know whether there will
be a back-pipe. If yes, open it in transport-helper and pass the fd via the
command-line argument --cat-blob-fd.

--> fast-import wouldn't need to be changed, but we'd use a fifo, and we get
rid of the env vars.
(I guess it could work on Windows too.)

What do you think?

> 
> b. use envvars to pass around FIFO path
> 
>This complicates the fast-import interface and makes debugging hard.
>It would be nice to avoid this if we can, but in case we can't, it's
>nice to have the option available.
> 
> c. transport-helper.c uses FIFO behind the scenes.
> 
>Like (a), except it would require a fast-import tweak (boo) and
>would work on Windows (yea)
> 
> d. use --cat-blob-fd with FIFO
> 
>Early scripted remote-svn prototypes did this to fulfill "fetch"
>requests.
> 
>It has no advantage over "use --cat-blob-fd, no FIFO" except being
>easier to implement as a shell script.  I'm listing this just for
>comparison; since (a) looks better in every way, I don't see any
>reason to pursue this one.
> 
> Since avoiding deadlocks with bidirectional communication is always a
> little subtle, it would be nice for this to be implemented once in
> transport-helper.c rather than each remote helper author having to
> reimplement it again.  As a result, my knee-jerk ranking is a > c >
> b > d.
> 
> Sane?
> Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-28 Thread Jonathan Nieder
Florian Achleitner wrote:

> So I should kick printd out?

I think so, yes.

"git log -SGIT_TRANSPORT_HELPER_DEBUG transport-helper.c" tells me
that that option was added to make the transport-helper machinery make
noise to make it obvious at what stage a remote helper has deadlocked.

GIT_TRANSPORT_HELPER_DEBUG already takes care of that, so there would
not be need for an imitation of that in remote-svn, unless I am
missing something (and even if I am missing something, it seems
complicated enough to be worth moving to another patch where it can be
explained more easily).

[...]
> +
> + printf("import\n");
> + printf("\n");
> + fflush(stdout);
> + return SUCCESS;
> +}
[...]
>>Maybe the purpose of the flush would
>> be more obvious if it were moved to the caller.
>
> Actually this goes to the git parent process (not fast-import), waiting for a
> reply to the command. I think I have to call flush on this side of the pipe.
> Can you flush it from the reader? That wouldn't have the desired effect; it
> drops buffered data.

*slaps head*  This is the "capabilities" command, and it needs to
flush because the reader needs to know what commands it's allowed to
use next before it starts using them.  My brain turned off and I
thought you were emitting an "import" command rather than advertising
that you support it for some reason.

And 'printf("\n")' was a separate printf because that way, patches
like

printf("import\n");
+   printf("bidi-import\n");
printf("\n");
fflush(stdout);

become simpler.

I'm tempted to suggest a structure like

	const char * const capabilities[] = {"import"};
	int i;

	for (i = 0; i < ARRAY_SIZE(capabilities); i++)
		puts(capabilities[i]);
	puts("");	/* blank line */

	fflush(stdout);

but your original code was fine, too.

[...]
> + /* opening a fifo for usually reading blocks until a writer has opened
> it too. +  * Therefore, we open with RDWR.
> +  */
> + report_fd = open(back_pipe_env, O_RDWR);
> + if(report_fd < 0) {
> + die("Unable to open fast-import back-pipe! %s", 
> strerror(errno));
[...]
> I believe it can be solved using RDONLY and WRONLY too. Probably we solve it
> by not using the fifo at all.
> Currently the blocking comes from the fact that fast-import doesn't parse
> its command line at startup. It rather reads an input line first and decides
> whether to parse the argv after reading the first input line or at the end of
> the input. (don't know why)
> remote-svn opens the pipe before sending the first command to fast-import and
> blocks on the open, while fast-import waits for input --> deadlock.

Thanks for explaining.  Now we've discussed a few different approaches,
none of which is perfect.

a. use --cat-blob-fd, no FIFO

   Doing this unconditionally would break platforms that don't support
   --cat-blob-fd=(descriptor >2), like Windows, so we'd have to:

   * Make it conditional --- only do it if (1) we are not on Windows and
     (2) the remote helper requests backflow by advertising the
     import-bidi capability.

   * Let the remote helper know what's going on by using
 "import-bidi" instead of "import" in the command stream to
 initiate the import.

b. use envvars to pass around FIFO path

   This complicates the fast-import interface and makes debugging hard.
   It would be nice to avoid this if we can, but in case we can't, it's
   nice to have the option available.

c. transport-helper.c uses FIFO behind the scenes.

   Like (a), except it would require a fast-import tweak (boo) and
   would work on Windows (yea)

d. use --cat-blob-fd with FIFO

   Early scripted remote-svn prototypes did this to fulfill "fetch"
   requests.

   It has no advantage over "use --cat-blob-fd, no FIFO" except being
   easier to implement as a shell script.  I'm listing this just for
   comparison; since (a) looks better in every way, I don't see any
   reason to pursue this one.

Since avoiding deadlocks with bidirectional communication is always a
little subtle, it would be nice for this to be implemented once in
transport-helper.c rather than each remote helper author having to
reimplement it again.  As a result, my knee-jerk ranking is a > c >
b > d.

Sane?
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-27 Thread Jonathan Nieder
Hi,

Some quick details.

Florian Achleitner wrote:

>Anyways, specifying cat-blob-fd is not 
> allowed via the 'option' command (see Documentation and 85c62395).
> It wouldn't make too much sense, because the file descriptor must be set up 
> by 
> the parent.
>
> But for the fifo, it would, probably.

More precisely, requiring that the cat-blob fd go on the command line
instead of in the stream matches a model where fast-import's three
standard streams are abstract:

 - its input, using the INPUT FORMAT described in git-fast-import(1)
 - its standard output, which echoes "progress" commands
 - its backflow stream, where responses to "cat-blob" and "ls" commands go

The stream format is not necessarily pinned to a Unix model where
input and output are based on filesystems and file descriptors.  We
can imagine a fast-import backend that runs on a remote server and the
transport used for these streams is sockets; for fast-import frontends
to be usable in such a context, the streams they produce must not rely
on particular fd numbers, nor on pathnames (except for mark state
saved to relative paths with --relative-marks) representing anything
in particular.

This goes just as much for a fifo set up on the filesystem where the
fast-import backend runs as for an inherited file descriptor.  In the
current model, such backend-specific details of setup go on the
command line.

>   The backward channel is only used by the
> commands 'cat-blob' and 'ls' of fast-import. If a remote helper wants to use
> them, it could make fast-import open the pipe by sending an 'option'
> command with the fifo filename, otherwise it defaults to stdout (like now)
> and is rather useless.

I'm getting confused by terminology again.  Let's see if I have the cast
of characters straight:

 - the fast-import backend (e.g., "git fast-import" or "hg-fastimport")
 - the fast-import frontend (e.g., "git fast-export" or svn-fe)
 - git's generic foreign VCS support plumbing, also known as
   transport-helper.c
 - the remote helper (e.g., "git remote-svn" or "git remote-testgit")

Why would the fast-import backend ever need to open a pipe?  If I want
it to write backflow to a fifo, I can use

mkfifo my-favorite-fifo
git fast-import --cat-blob-fd=3 3>my-favorite-fifo

If I want it to write backflow to a pipe, I can use (using ksh syntax)

cat |&
git fast-import --cat-blob-fd=3 3>&p

> This would take the fifo setup out of transport-helper. The remote-helper 
> would 
> have to create it, if it needs it.

We can imagine transport-helper.c learning the name of a fifo set up by
the remote helper by sending it the "capabilities" command:

git> capabilities
helper> option
helper> import
helper> cat-blob-file my-favorite-fifo
helper> refspec refs/heads/*:refs/helper/remotename/*
helper>

transport-helper.c could then use that information to invoke
fast-import appropriately:

git fast-import --cat-blob-fd=3 3>my-favorite-fifo

But this seems like pushing complication onto the remote helper; since
there is expected to be one remote helper per foreign VCS,
implementing the appropriate logic correctly once and for all in
transport-helper.c for all interested remote helpers to take advantage
of seems to me like a better policy.

> Apropos stdout. That leads to another idea. You already suggested that it 
> would be easiest to only use FDs 0..2. Currently stdout and stderr of fast-
> import go to the shell. We could connect stdout to the remote-helper and 
> don't 
> need the additional channel at all.

The complication that makes this strategy not so easy is "progress"
commands in the fast-import input stream.  (Incidentally, it would be
nice to teach transport-helper.c to display specially formatted
"progress" commands using a progress bar some day.)

Hoping that clarifies,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-27 Thread Florian Achleitner
On Thursday 26 July 2012 09:54:26 Jonathan Nieder wrote:
> 
> Since the svn remote helper relies on this, it seems worth working on,
> yeah.  As for how to spend your time (and whether to beg someone else
> to work on it instead :)): I'm not sure what's on your plate or where
> you are with respect to the original plan for the summer at the
> moment, so it would be hard for me to give useful advice about how to
> balance things.

Btw, the pipe version did already exist before I started; it was added with
the cat-blob command and is already used by Dmitry's remote-svn-alpha.
I didn't search for design discussions in the past ..

> 
> What did you think of the suggestion of adding a new bidi-import
> capability and command to the remote helper protocol?  I think this
> would be clean and avoid causing a regression on Windows, but it's
> easily possible I am missing something fundamental.

I don't have much of an overview of this topic beyond the part I'm working on,
like other users of fast-import.
The bidi-import capability/command would have the advantage that we don't
have to bother with the pipe/fifo at all if the remote-helper doesn't use it.

When I implemented the two variants I had the idea to pass it via the 'option'
command that fast-import already has. Anyway, specifying cat-blob-fd is not
allowed via the 'option' command (see Documentation and 85c62395).
It wouldn't make too much sense anyway, because the file descriptor must be
set up by the parent.

But for the fifo, it probably would. The backward channel is only used by the
'cat-blob' and 'ls' commands of fast-import. If a remote helper wants to use
them, it could make fast-import open the pipe by sending an 'option'
command with the fifo filename; otherwise it defaults to stdout (like now) and
is rather useless.
This would take the fifo setup out of transport-helper. The remote-helper
would have to create it, if it needs it.

Apropos stdout: that leads to another idea. You already suggested that it
would be easiest to only use FDs 0..2. Currently stdout and stderr of
fast-import go to the shell. We could connect stdout to the remote-helper and
wouldn't need the additional channel at all.
(Probably there's a good reason why they haven't done that ..)
Maybe this requires many changes to fast-import and breaks existing frontends.

--
Florian


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Junio C Hamano
Jonathan Nieder  writes:

> [...]
 +
 +enum cmd_result { SUCCESS, NOT_HANDLED, ERROR };
> [...]
>> Hm.. the enum now has SUCCESS, NOT_HANDLED, TERMINATE.
>
> Much nicer.
>
> I think this tristate return value could be avoided entirely because...
> ... it isn't needed at the moment.

I am not sure what you mean by that.

The command dispatcher loop in [Patch v2 1/16] seems to call every
possible input handler with the first line of the input and expect
them to answer "This is not for me", so NOT_HANDLED is needed.

An alternative dispatcher could be written in such a way that the
dispatcher inspects the first line and decide what to call, and in
such a scheme, you do not need NOT_HANDLED. My intuition tells me
that such an arrangement is in general a better organization.

Looking at what cmd_import() does, however, I think the approach the
patch takes might make sense for this application.  Unlike other
handlers like "capabilities" that do not want to handle anything
other than "capabilities", it wants to handle two:

 - "import" that starts an import batch;
 - "" (an empty line), but only when an import batch is in effect.

A centralized dispatcher that does not use NOT_HANDLED could be
written for such an input stream, but then the state information
(i.e. "are we in an import batch?") needs to be global, which may or
may not be desirable (I haven't thought things through on this).

In any case, if you are going to use dispatching based on
NOT_HANDLED, the result may have to be (at least) quadri-state.  In
addition to "I am done successfully, please go back and dispatch
another command" (SUCCESS), "This is not for me" (NOT_HANDLED), and
"I am done successfully, and there is no need to dispatch and
process another command further" (TERMINATE), you may want to be
able to say "This was for me, but I found an error" (ERROR).

Of course, if the dispatch loop has to be rewritten so that a
central dispatcher decides what to call, individual input handlers
do not need to say NOT_HANDLED nor TERMINATE, as the central
dispatcher should keep track of the overall state of the system, and
the usual "0 on success, negative on error" may be sufficient.

One thing I wondered was how an input "capability" (or "list")
should be handled after "import" was issued (hence batch_active
becomes true).  The dispatcher loop in the patch based on
NOT_HANDLED convention will happily call cmd_capabilities(), which
does not have any notion of the batch_active state (because it is a
function scope static inside cmd_import()), and will say "Ah, that
is mine, and let me do my thing."  If we want to diagnose such an
input stream as an error, the dispatch loop needs to become aware of
the overall state of the system _anyway_, so that may be an argument
against the NOT_HANDLED based dispatch system the patch series uses.



Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
On Thursday 26 July 2012 04:08:42 Jonathan Nieder wrote:
> Florian Achleitner wrote:

> > On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:
> [...]
> 
> >>> +
> >>> +static inline void printd(const char* fmt, ...)
> 
> [...]
> 
> >> Why not use trace_printf and avoid the complication?
> > 
> > Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf,
> > it's activated together with all other traces. I can use trace_vprintf
> > and specify a key, but I would always have to print the header "rhsvn
> > debug: " and the key by hand. So I could replace vfprintf in this
> > function by trace_vprintf to do that. But then there's not much
> > simplification. (?)
> 
> Hmm.  There's no trace_printf_with_key() but that's presumably because
> no one has needed it.  If it existed, you could use
> 
>   #define printd(msg) trace_printf_with_key("GIT_TRACE_REMOTE_SVN", "%s",
> msg)
> 
> But now that I check, I don't see how the current printd() calls would
> be useful to other people.  Why announce these moments and not others?
> They're just temporary debugging cruft, right?
> 
> For that, plain trace_printf() works great.

Yes, it's for debugging only; I could just delete it all. It's inspired by
transport-helper.c. The env var GIT_TRANSPORT_HELPER_DEBUG enables it. While
transport-helper has a lot of if (debug) fprintf(..), I encapsulated it in
printd.
So should I kick printd out?

> >>> +
> >>> + printf("import\n");
> >>> + printf("\n");
> >>> + fflush(stdout);
> >>> + return SUCCESS;
> >>> +}
> >> 
> >> Why the multiple printf?  Is the flush needed?
> > 
> > Excess printf gone.
> > Flush is needed. Otherwise it doesn't flush and the other end waits
> > forever.
> Ah, fast-import is ready, remote helper is ready, no one initiates
> pumping of data between them.  Maybe the purpose of the flush would
> be more obvious if it were moved to the caller.

Actually this goes to the git parent process (not fast-import), which is
waiting for a reply to the command. I think I have to call flush on this side
of the pipe. Can you flush it from the reader? That wouldn't have the desired
effect; it drops buffered data.

> [...]
> 
> >>> + /* opening a fifo for usually reading blocks until a writer has opened
> >>> it too. +  * Therefore, we open with RDWR.
> >>> +  */
> >>> + report_fd = open(back_pipe_env, O_RDWR);
> >>> + if(report_fd < 0) {
> >>> + die("Unable to open fast-import back-pipe! %s", 
> >>> strerror(errno));
> >>> + }
> >> 
> >> Is this necessary?  Why shouldn't we fork the writer first and wait
> >> for it here?
> > 
> > Yes, necessary.
> 
> Oh, dear.  I hope not.  E.g., Cygwin doesn't support opening fifos
> RDWR (out of scope for the gsoc project, but still).

I believe it can be solved using RDONLY and WRONLY too. Probably we solve it
by not using the fifo at all.
Currently the blocking comes from the fact that fast-import doesn't parse
its command line at startup. It rather reads an input line first and decides
whether to parse the argv after reading the first input line or at the end of
the input. (I don't know why.)
remote-svn opens the pipe before sending the first command to fast-import and
blocks on the open, while fast-import waits for input --> deadlock.
With remote-svn: RDWR, fast-import: WRONLY, this works.

Other scenario: nothing to import; remote-svn only sends 'done' and closes the
pipe again. After fast-import reads the first line it parses its command line
and tries to open the fifo, which is already closed on the other side -->
blocks.
This is solved by using RDWR on both sides.

If we change the points where the pipes are opened and closed, this could be
circumvented.
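
A standalone illustration of the fifo-open behaviour behind this (not from the
patch; as noted elsewhere in the thread, O_RDWR on a fifo works on Linux but is
not portable everywhere, e.g. Cygwin):

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/stat.h>
	#include <unistd.h>

	int main(void)
	{
		const char *path = "/tmp/backpipe-demo";
		int fd;

		mkfifo(path, 0600);
		/* O_RDONLY would block here until some writer opens the fifo;
		   O_RDWR returns immediately because we count as our own writer. */
		fd = open(path, O_RDWR);
		if (fd < 0) {
			perror("open");
			return 1;
		}
		printf("fifo opened without blocking\n");
		close(fd);
		unlink(path);
		return 0;
	}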

> 
> [...]
> 
> > E.g. if there's nothing to import, the helper sends only 'done' to
> > fast-import and quits.
> 
> Won't the writer open the pipe and wait for us to open our end before
> doing that?
> 
> [...]
> 
> >>> +
> >>> + code = start_command(&svndump_proc);
> >>> + if(code)
> >>> + die("Unable to start %s, code %d", svndump_proc.argv[0], code);
> >> 
> >> start_command() is supposed to have printed a message already when it
> >> fails, unless errno == ENOENT and silent_exec_failure was set.
> > 
> > Yes, but it doesn't die, right?
> 
> You can exit without writing a message with exit(), e.g. like so:
> 
>   if (code)
>   exit(code);
> 
> or like so:
> 
>   if (code)
>   exit(128);

ok, why not..

> 
> [...]
> 
> >>> +
> >>> + close(svndump_proc.out);
> >> 
> >> Important?  Wouldn't finish_command do this?
> > 
> > As far as I understood it, it doesn't close extra created pipes. Probably
> > I
> > just didn't find it in the code ..
> 
> So this is to work around a bug in the run-command interface?

Good question. 

> 
> [...]
> 
> >>> + close(report_fd);
> >> 
> >> What is the purpose of this step?
> > 
> > Close the back-report pipe end of the remote-helper.
> 
> That's just repeating the question. :)  Perhaps it's supposed to
trigger some action on the other end of the pipe?

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Jonathan Nieder
(cc-ing Ram since he's also knowledgeable about remote-helper protocol)
Florian Achleitner wrote:
> On Thursday 26 July 2012 06:40:39 Jonathan Nieder wrote:

>> Though I still
>> think the way forward is to keep using plain pipes internally for now
>> and to make the bidirectional communication optional, since it
>> wouldn't close any doors to whatever is most convenient on each
>> platform.  Hopefully I'll hear more from Florian about this in time.
>
> Would you like to see a new pipe patch?

Since the svn remote helper relies on this, it seems worth working on,
yeah.  As for how to spend your time (and whether to beg someone else
to work on it instead :)): I'm not sure what's on your plate or where
you are with respect to the original plan for the summer at the
moment, so it would be hard for me to give useful advice about how to
balance things.

What did you think of the suggestion of adding a new bidi-import
capability and command to the remote helper protocol?  I think this
would be clean and avoid causing a regression on Windows, but it's
easily possible I am missing something fundamental.

Thanks,
Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
On Thursday 26 July 2012 06:40:39 Jonathan Nieder wrote:
> Steven Michalske wrote:
> > On Jul 2, 2012, at 4:07 AM, Jonathan Nieder  wrote:
> >> [...]
> >> 
> >>> diff: Use fifo instead of pipe: Retrieve the name of the pipe from env
> >>> and open it for svndump.
> >> 
> >> I'd prefer to avoid this if possible, since it means having to decide
> >> where the pipe goes on the filesystem.  Can you summarize the
> >> discussion in the commit message so future readers understand why
> >> we're doing it?
> > 
> > Crazy thought here, but wouldn't a socket be a good choice here?
> 
> Not crazy --- it was already mentioned.  It could probably allow using
> --cat-blob-fd even on the platforms that don't inherit file
> descriptors >2, though it would take some tweaking.  Though I still
> think the way forward is to keep using plain pipes internally for now
> and to make the bidirectional communication optional, since it
> wouldn't close any doors to whatever is most convenient on each
> platform.  Hopefully I'll hear more from Florian about this in time.

Would you like to see a new pipe patch?

> 
> > Imagine being able to ssh tunnel into the SVN server and run the helper
> > with filesystem access to the SVN repo.
> 
> We're talking about what communicates between the SVN dump parser the
> version control system-specific backend (git fast-import) that reads
> the converted result, so that particular socket wouldn't help much.
>

Yes .. the network part is already handled quite well by svnrdump.
 



Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Jonathan Nieder
Steven Michalske wrote:
> On Jul 2, 2012, at 4:07 AM, Jonathan Nieder  wrote:

>> [...]
>>> diff: Use fifo instead of pipe: Retrieve the name of the pipe from env and 
>>> open it
>>> for svndump.
>>
>> I'd prefer to avoid this if possible, since it means having to decide
>> where the pipe goes on the filesystem.  Can you summarize the
>> discussion in the commit message so future readers understand why
>> we're doing it?
>
> Crazy thought here, but wouldn't a socket be a good choice here?

Not crazy --- it was already mentioned.  It could probably allow using
--cat-blob-fd even on the platforms that don't inherit file
descriptors >2, though it would take some tweaking.  Though I still
think the way forward is to keep using plain pipes internally for now
and to make the bidirectional communication optional, since it
wouldn't close any doors to whatever is most convenient on each
platform.  Hopefully I'll hear more from Florian about this in time.

> Imagine being able to ssh tunnel into the SVN server and run the helper with
> filesystem access to the SVN repo.

We're talking about what communicates between the SVN dump parser and the
version control system-specific backend (git fast-import) that reads
the converted result, so that particular socket wouldn't help much.

Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Steven Michalske

On Jul 2, 2012, at 4:07 AM, Jonathan Nieder  wrote:

> [...]
>> diff: Use fifo instead of pipe: Retrieve the name of the pipe from env and 
>> open it
>> for svndump.
> 
> I'd prefer to avoid this if possible, since it means having to decide
> where the pipe goes on the filesystem.  Can you summarize the
> discussion in the commit message so future readers understand why
> we're doing it?

Crazy thought here, but wouldn't a socket be a good choice here?

Imagine being able to ssh tunnel into the SVN server and run the helper with 
filesystem access to the SVN repo.

Akin to the pushy project use case.
http://packages.python.org/pushy/

SSH into the machine, copy the required components to the machine, and use the 
RPC.
Nothing needed but SSH and python.  In this case SSH, SVN, and the helper would 
be needed.

This also would work just fine with the local host too.

Steve

Note: Resent. Sorry, it was signed and rejected before.


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Jonathan Nieder
Florian Achleitner wrote:

> Most of this review went into the new version.. 
> For the remaining points, some comments follow.

Thanks for this.

> On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:

[...]
>>> +
>>> +static inline void printd(const char* fmt, ...)
[...]
>> Why not use trace_printf and avoid the complication?
>
> Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf, it's 
> activated together with all other traces. I can use trace_vprintf and specify 
> a key, but I would always have to print the header "rhsvn debug: " and the 
> key 
> by hand. So I could replace vfprintf in this function by trace_vprintf to do 
> that. But then there's not much simplification. (?)

Hmm.  There's no trace_printf_with_key() but that's presumably because
no one has needed it.  If it existed, you could use

#define printd(msg) trace_printf_with_key("GIT_TRACE_REMOTE_SVN", "%s", 
msg)

But now that I check, I don't see how the current printd() calls would
be useful to other people.  Why announce these moments and not others?
They're just temporary debugging cruft, right?

For that, plain trace_printf() works great.

[...]
>>> +
>>> +enum cmd_result { SUCCESS, NOT_HANDLED, ERROR };
[...]
> Hm.. the enum now has SUCCESS, NOT_HANDLED, TERMINATE.

Much nicer.

I think this tristate return value could be avoided entirely because...
[continued at (*) below]

[...]
>>> +
>>> +   printf("import\n");
>>> +   printf("\n");
>>> +   fflush(stdout);
>>> +   return SUCCESS;
>>> +}
>>
>> Why the multiple printf?  Is the flush needed?
>
> Excess printf gone.
> Flush is needed. Otherwise it doesn't flush and the other end waits forever.

Ah, fast-import is ready, remote helper is ready, no one initiates
pumping of data between them.  Maybe the purpose of the flush would
be more obvious if it were moved to the caller.
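
For example, something along these lines in the main read loop (just a
sketch, reusing do_command() and the TERMINATE value from your patch):

	struct strbuf line = STRBUF_INIT;

	while (strbuf_getline(&line, stdin, '\n') != EOF) {
		if (do_command(&line) == TERMINATE)
			break;
		/* push whatever the handler printed out of the stdio buffer */
		fflush(stdout);
	}
	strbuf_release(&line);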

[...]
>>> +   /* opening a fifo for reading usually blocks until a writer has
>>> +    * opened it too. Therefore, we open with RDWR.
>>> +    */
>>> +   report_fd = open(back_pipe_env, O_RDWR);
>>> +   if(report_fd < 0) {
>>> +   die("Unable to open fast-import back-pipe! %s", 
>>> strerror(errno));
>>> +   }
>>
>> Is this necessary?  Why shouldn't we fork the writer first and wait
>> for it here?
>
> Yes, necessary.

Oh, dear.  I hope not.  E.g., Cygwin doesn't support opening fifos
RDWR (out of scope for the gsoc project, but still).

[...]
> E.g. if there's nothing to import, the helper sends only 'done' to fast-
> import and quits.

Won't the writer open the pipe and wait for us to open our end before
doing that?

[...]
>>> +
>>> +   code = start_command(&svndump_proc);
>>> +   if(code)
>>> +   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
>>
>> start_command() is supposed to have printed a message already when it
>> fails, unless errno == ENOENT and silent_exec_failure was set.
>
> Yes, but it doesn't die, right?

You can exit without writing a message with exit(), e.g. like so:

	if (code)
		exit(code);

or like so:

	if (code)
		exit(128);

[...]
>>> +
>>> +   close(svndump_proc.out);
>>
>> Important?  Wouldn't finish_command do this?
>
> As far as I understood it, it doesn't close extra created pipes. Probably I 
> just didn't find it in the code ..

So this is to work around a bug in the run-command interface?

[...]
>>> +   close(report_fd);
>>
>> What is the purpose of this step?
>
> Close the back-report pipe end of the remote-helper.

That's just repeating the question. :)  Perhaps it's supposed to
trigger some action on the other end of the pipe?  It would just be
useful to add a comment documenting why one shouldn't remove this
close() call, or else someone will probably end up removing it and needlessly
suffering.
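
For example (a sketch only; the placeholder text is where the real
reason should go):

	/*
	 * NOTE: keep this close(); <explain here what would hang or
	 * misbehave if the back-report fd stayed open>
	 */
	close(report_fd);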

[...]
>>> +
>>> +   code = finish_command(&svndump_proc);
>>> +   if(code)
>>> +   warning("Something went wrong with termination of %s, code %d",
>>> svndump_proc.argv[0], code);
>> finish_command() is supposed to print a message when it fails.
>
> I changed the message text. It should tell us if svnrdump exited with non-
> zero.

I'd suggest looking at other finish_command() callers for examples.
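
Typically they boil down to something like this (sketch; the message
wording is made up):

	code = finish_command(&svndump_proc);
	if (code)
		die("svnrdump failed, exit code %d", code);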

[...]
>>> +enum cmd_result do_command(struct strbuf* line)
>>> +{
>>> +   const command* p = command_list;
>>> +   enum cmd_result ret;
>>> +   printd("command line '%s'", line->buf);
>>> +   while(*p) {
>>> +   ret = (*p)(line);
>>> +   if(ret != NOT_HANDLED)
>>> +   return ret;
>>> +   p++;
>>> +   }
>>
>> If possible, matching commands by name (like git.c does) would make
>> the behavior easier to predict.
>
> There is a use case for this. The intention was that command handlers
> should be able to process more than one 'name'. E.g. an import batch is 
> terminated by a newline. This newline is handled by the import handler if it 
> is a batch. (This feature wasn't implemented in the version reviewed here.)
>
> So I decided to let the
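
To illustrate the by-name matching suggested above, roughly what git.c
does for its subcommands (a sketch only; the struct, its field names,
and the int return convention are invented here):

	struct input_command_entry {
		const char *name;
		int (*fn)(struct strbuf *line);
	};

	static const struct input_command_entry input_command_list[] = {
		{ "capabilities", cmd_capabilities },
		{ "import", cmd_import },
		{ "list", cmd_list },
		{ NULL, NULL }
	};

	static int do_command(struct strbuf *line)
	{
		const struct input_command_entry *p;

		for (p = input_command_list; p->name; p++)
			if (!prefixcmp(line->buf, p->name))
				return p->fn(line);
		die("unknown command '%s'", line->buf);
	}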

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
Hi!

Most of this review went into the new version.. 
For the remaining points, some comments follow.

On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:
> Hi,
> 
> Florian Achleitner wrote:

> 
> > --- /dev/null
> > +++ b/contrib/svn-fe/remote-svn.c
> > @@ -0,0 +1,207 @@
> > +
> > +#include 
> > +#include 
> > +#include 
> 
> git-compat-util.h (or some header that includes it) must be the first
> header included so the appropriate feature test macros can be defined.
> See Documentation/CodingGuidelines for more on that.

check.

> 
> > +#include "cache.h"
> > +#include "remote.h"
> > +#include "strbuf.h"
> > +#include "url.h"
> > +#include "exec_cmd.h"
> > +#include "run-command.h"
> > +#include "svndump.h"
> > +
> > +static int debug = 0;
> 
> Small nit: please drop the redundant "= 0" here.  Or:

check.

> > +
> > +static inline void printd(const char* fmt, ...)
> > +{
> > +   if(debug) {
> > +   va_list vargs;
> > +   va_start(vargs, fmt);
> > +   fprintf(stderr, "rhsvn debug: ");
> > +   vfprintf(stderr, fmt, vargs);
> > +   fprintf(stderr, "\n");
> > +   va_end(vargs);
> > +   }
> > +}
> 
> Why not use trace_printf and avoid the complication?

Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf, it's 
activated together with all other traces. I can use trace_vprintf and specify 
a key, but I would always have to print the header "rhsvn debug: " and the key 
by hand. So I could replace vfprintf in this function by trace_vprintf to do 
that. But then there's not much simplification. (?)


> > +
> > +enum cmd_result cmd_capabilities(struct strbuf* line);
> > +enum cmd_result cmd_import(struct strbuf* line);
> > +enum cmd_result cmd_list(struct strbuf* line);
> 
> What's a cmd_result?  '*' sticks to variable name.
> 
> > +
> > +enum cmd_result { SUCCESS, NOT_HANDLED, ERROR };
> 
> Oh, that's what a cmd_result is. :)  Why not define the type before
> using it to avoid keeping the reader in suspense?
> 
> What does each result represent?  If this is a convention like
> 
>  1: handled
>  0: not handled
>  -1: error, callee takes care of printing the error message
> 
> then please document it in a comment near the caller so the reader can
> understand what is happening without too much confusion.  Given such a
> comment, does the enum add clarity?

Hm.. the enum now has SUCCESS, NOT_HANDLED, TERMINATE.
It gives the numbers a name, that's it.

> 
> > +typedef enum cmd_result (*command)(struct strbuf*);
> 
> When I first read this, I wonder what is being commanded.  Are these
> commands passed on the remote helper's standard input, commands passed
> on its output, or commands run at some point in the process?  What is
> the effect and return value of associated function?  Does the function
> always return some success/failure value, or does it sometimes exit?
> 
> Maybe a more specific type name would be clearer?

I renamed it to input_command_handler. Unfortunately the remote-helper spec 
calls what is sent to the helper a 'command'.

> 
> [...]
> 
> > +
> > +const command command_list[] = {
> > +   cmd_capabilities, cmd_import, cmd_list, NULL
> > +};
> 
> First association is to functions like cmd_fetch() which implement git
> subcommands.  So I thought these were going to implement subcommands
> like "git remote-svn capabilities", "git remote-svn import" and would
> use the same cmd_foo(argc, argv, prefix) calling convention that git
> subcommands do.  Maybe a different naming convention could avoid
> confusion.

Ok.. same as above, they are kind of commands. Of course I can change the 
names. For me it's not too confusing, because I don't know the git subcommands 
convention very well. You can choose a name.

> 
> [...]
> 
> > +enum cmd_result cmd_capabilities(struct strbuf* line)
> > +{
> > +   if(strcmp(line->buf, "capabilities"))
> > +   return NOT_HANDLED;
> 
> Style: missing SP after keyword.
> 
> > +
> > +   printf("import\n");
> > +   printf("\n");
> > +   fflush(stdout);
> > +   return SUCCESS;
> > +}
> 
> Why the multiple printf?  Is the flush needed?

Excess printf gone.
Flush is needed. Otherwise the output doesn't reach the other end and the reader 
waits forever; presumably stdio buffering, since stdout is block-buffered when it 
goes to a pipe rather than a terminal.

> > +
> > +   /* opening a fifo for reading usually blocks until a writer has
> > +    * opened it too. Therefore, we open with RDWR.
> > +    */
> > +   report_fd = open(back_pipe_env, O_RDWR);
> > +   if(report_fd < 0) {
> > +   die("Unable to open fast-import back-pipe! %s", 
> > strerror(errno));
> > +   }
> 
> Is this necessary?  Why shouldn't we fork the writer first and wait
> for it here?

Yes, necessary. Blocking on this open call prevents fast-import as well as the 
remote helper from reading and writing on their normal command streams.
This leads to deadlocks.

E.g. if there's nothing to import, the helper sends only 'done' to fast-
import and quits. That might happen before fast-import opened this pip