[RFC] Proposal for a new config-based git signing interface

2019-10-23 Thread Ibrahim El
Hello,

This is a follow-up on my previous emails related to the proposal of a new 
signing interface:

https://public-inbox.org/git/CACi-FhDeAZecXSM36zroty6kpf2BCWLS=0r+duwub96lqfk...@mail.gmail.com/T/#r43cbf31b86642ab5118e6e7b3d4098bade5f5a0a
https://public-inbox.org/git/Z2XOTcGuVovMKhcdrrO08KWI2I7L9s0CyFITvvj3jkmGTQPB6FkCiyOtTm6GdYWbnf25dsPD8M08kDCuD37EE1B-sxHQ3se9Kn1zVBrCPZw=@pm.me/T/#u
https://public-inbox.org/git/N31G34oKnfr3MVifk42-Kt3YtM_3fHuCp3V1cpGOK5f1jn1vbg1TaSCy9ukI-YD8qRfu4xMcHcPc78xFE0MSwJQWNrSvuQuer9wSNugNRLg=@pm.me/T/#u
https://public-inbox.org/git/8AMhjK19PJ35u3LCR57IvtAzOBN5bKK2vUn0Ns-4mmZzK9U14W5CGW5R8aITNXBm78J4Z7nd09RTVKW2pGaB4PnF7p2PireF_vzRST8DngE=@pm.me/T/#u
https://public-inbox.org/git/0oTOrSdJdIaEfs3NVkfRmLxjYRvUPkucwwaXPuhCjS2QL3ztRJLfIlBkcpjSRiZQaY70SKSkg8_w20rxnuD4Vu3IbRcGOZM-fht8G7ySEHk=@pm.me/T/#u
https://public-inbox.org/git/T4zS1hogOjySpdv7lDjVaZV83KKSeK9fx8m33SIo-e_BH4RtKcm67btmGzTPeflbRnQr7mWjTpObB0hCkX8VkGZElkQbLEgbrETg6Aq4nUg=@pm.me/T/#u
https://public-inbox.org/git/74R10RrvOffzj20d_Owd_1WFMh1bWq8mIhEEBSzbhkHfbvW5BLHZj-L-AgHYnpqkxgZdCfW5b72GoIvKHucQz7tdiGZEzietp0IKpU1_wuI=@pm.me/T/#u

The main feedback we received on the previous RFCs was that the drivers for 
external signing tools were still written in C, and that we should instead 
move toward a configuration-based interface.

I've been thinking about how to go about this and would love your feedback 
on my proposed approach:

- Implement updated user configuration to define signing tools
- Implement a tool-agnostic signing interface in C code
- Add the possibility to use bash helper scripts to drive additional tools in 
case the default interface doesn't work as intended
- Accept the same configuration options as command line arguments

You can find below a detailed description of the proposed config and command 
line options:

https://hackmd.io/ZHsddYXkSmyb6rYajdyGLg
https://hackmd.io/yxS9nfiQSvmRZntcfnHOGQ

The configuration part would look like this:

```
[signing]
  format = openpgp

[signing "openpgp"]
  program = "/usr/bin/gpg"
  keyring = "--keyring pubring.kbx --no-default-keyring"
  identity = "--local-user \"Jane Committer \""
  sign = "--sign --status-fd=2 --detach-sign --ascii"
  verify = "--verify --status-fd=2"

[signing "openpgp.signature"]
  regex = "^-----BEGIN PGP SIGNATURE-----$[^-]*^-----END PGP SIGNATURE-----$"
  multiline = true
```
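
As a side note, `git config` accepts arbitrary sections and keys, so the
proposed layout can already be stored and read back today. To be clear, the
signing.* keys below are the ones this proposal introduces, not an interface
current git recognizes; this is only a sketch in a throwaway repository:

```shell
# Sketch only: store and read back the proposed (not yet recognized)
# signing.* keys; `git config` accepts arbitrary section/key names.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
# [signing "openpgp"] in the file maps to signing.openpgp.* on the CLI.
git config signing.format openpgp
git config signing.openpgp.program /usr/bin/gpg
git config signing.openpgp.sign "--sign --status-fd=2 --detach-sign --ascii"
fmt=$(git config --get signing.format)
prog=$(git config --get signing.openpgp.program)
echo "$fmt $prog"
```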

The equivalent command line for a digitally signed commit looks like:

```
git commit \
  --sign \
  --signing-format=openpgp \
  --signing-openpgp-program="/usr/bin/gpg" \
  --signing-openpgp-keyring="--keyring pubring.kbx --no-default-keyring" \
  --signing-openpgp-identity="--local-user \"Jane Committer \"" \
  --signing-openpgp-sign="--sign --status-fd=2 --detach-sign --ascii"
```

Cheers,


Ibrahim







Re: [Proposal] git am --check

2019-06-03 Thread Junio C Hamano
Duy Nguyen  writes:

> On Mon, Jun 3, 2019 at 4:29 PM Christian Couder
>  wrote:
>>
>> On Sun, Jun 2, 2019 at 7:38 PM Drew DeVault  wrote:
>> >
>> > This flag would behave similarly to git apply --check, or in other words
>> > would exit with a nonzero status if the patch is not applicable without
>> > actually applying the patch otherwise.
>>
>> `git am` uses the same code as `git apply` to apply patches, so there
>> should be no difference between `git am --check` and `git apply
>> --check`.
>
> One difference (that still annoys me) is "git apply" must be run at
> topdir. "git am" can be run anywhere and it will automatically find
> topdir.
>
> "git am" can also consume multiple patches, so it's some extra work if
> we just use "git apply" directly, although I don't think that's a very
> good argument for "am --check".

Another is that "am" has preprocessing phase performed by mailsplit
that deals with MIME garbage, which "apply" will totally choke on
without even attempting to cope with.

I haven't carefully read the "proposal" or any rfc patches yet, but
would/should the command make a commit if the patch cleanly applies?

I wonder if a "--dry-run" option is more useful (i.e. checks and
reports with the exit status *if* the command without "--dry-run"
would cleanly succeed, but never makes a commit or touches the index
or the working tree), given the motivating use case is a Git aware
MUA that helps the user by saying "if you are busy you could perhaps
skip this message as the patch would not apply to your tree anyway".


Re: [Proposal] git am --check

2019-06-03 Thread Duy Nguyen
On Mon, Jun 3, 2019 at 4:29 PM Christian Couder
 wrote:
>
> On Sun, Jun 2, 2019 at 7:38 PM Drew DeVault  wrote:
> >
> > This flag would behave similarly to git apply --check, or in other words
> > would exit with a nonzero status if the patch is not applicable without
> > actually applying the patch otherwise.
>
> `git am` uses the same code as `git apply` to apply patches, so there
> should be no difference between `git am --check` and `git apply
> --check`.

One difference (that still annoys me) is "git apply" must be run at
topdir. "git am" can be run anywhere and it will automatically find
topdir.

"git am" can also consume multiple patches, so it's some extra work if
we just use "git apply" directly, although I don't think that's a very
good argument for "am --check".
-- 
Duy


Re: [Proposal] git am --check

2019-06-03 Thread Christian Couder
On Sun, Jun 2, 2019 at 7:38 PM Drew DeVault  wrote:
>
> This flag would behave similarly to git apply --check, or in other words
> would exit with a nonzero status if the patch is not applicable without
> actually applying the patch otherwise.

`git am` uses the same code as `git apply` to apply patches, so there
should be no difference between `git am --check` and `git apply
--check`.

> Rationale: I'm working on an email client which has some git
> integration, and when you scroll over a patch I want to quickly test its
> applicability and show an indication of the result.
>
> Thoughts on the approach are welcome; my initial naive patch just tried
> to add --check to the apply flags but that didn't work as I had hoped.
> Will take another crack at a patch soon(ish).

Could you tell us about what didn't work as you hoped? And how `git am
--check` would be different from `git apply --check`?


[Proposal] git am --check

2019-06-02 Thread Drew DeVault
This flag would behave similarly to git apply --check, or in other words
would exit with a nonzero status if the patch is not applicable without
actually applying the patch otherwise.

Rationale: I'm working on an email client which has some git
integration, and when you scroll over a patch I want to quickly test its
applicability and show an indication of the result.

Thoughts on the approach are welcome; my initial naive patch just tried
to add --check to the apply flags but that didn't work as I had hoped.
Will take another crack at a patch soon(ish).
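
For comparison, this is what the existing `git apply --check` reports today
when run from the top-level directory; the proposed `git am --check` would
give the same applicable/not-applicable exit status but accept mbox input.
A sketch in a throwaway repository:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m base
echo hello > file.txt
git add file.txt
git -c user.name=t -c user.email=t@example.com commit -q -m add-file
git format-patch -1 -o patches >/dev/null
git reset -q --hard HEAD~1       # back to a state where the patch applies
ok=$(git apply --check patches/*.patch && echo applicable || echo not-applicable)
echo "$ok"
echo conflicting > file.txt      # now the path the patch creates already exists
bad=$(git apply --check patches/*.patch 2>/dev/null && echo applicable || echo not-applicable)
echo "$bad"
```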


Re: Proposal: object negotiation for partial clones

2019-05-13 Thread Jonathan Nieder
Hi,

Matthew DeVore wrote:
> On 2019/05/09, at 11:00, Jonathan Tan  wrote:

>> - Supporting any combination of filter means that we have more to
>>  implement and test, especially if we want to support more filters in
>>  the future. In particular, the different filters (e.g. blob, tree)
>>  have different code paths now in Git. One way to solve it would be to
>>  combine everything into one monolith, but I would like to avoid it if
>>  possible (after having to deal with revision walking a few times...)
>
> I don’t believe there is any need to introduce monolithic code. The
> bulk of the filter implementation is in list-objects-filter.c, and I
> don’t think the file will get much longer with an additional filter
> that “combines” the existing filter. The new filter is likely
> simpler than the sparse filter. Once I add the new filter and send
> out the initial patch set, we can discuss splitting up the file, if
> it appears to be necessary.
>
> My idea - if it is not clear already - is to add another OO-like
> interface to list-objects-filter.c which parallels the 5 that are
> already there.

Sounds good to me.

For what it's worth, my assumption has always been that we would
eventually want the filters to be stackable.  So I'm glad you're
looking into it.

Jonathan's reminder to clean up as you go is a welcome one.

Thanks,
Jonathan


Re: Proposal: object negotiation for partial clones

2019-05-13 Thread Matthew DeVore



> On 2019/05/09, at 11:00, Jonathan Tan  wrote:
> 
> Thanks for the numbers. Let me think about it some more, but I'm still
> reluctant to introduce multiple filter support in the protocol and the
> implementation for the following reasons:

Correction to the original command - I was tweaking it in the middle of running 
it, and introduced an error that I didn’t notice. Here is one that will work 
for an entire repo:

$ git rev-list --objects --filter=blob:none HEAD: | awk '{print $1}' | xargs -n 1 git cat-file -s | awk '{ total += $1; print total }'

When run to completion, Chromium totaled 17 301 144 bytes.

> 
> - For large projects like Linux and Chromium, it may be reasonable to
>  expect that an infrequent checkout would result in a few-megabyte
>  download.

Anyone developing on Chromium would definitely consider a 17 MB original clone 
to be an improvement over the status quo, but it is still not ideal.

And the 17MB initial download is only incurred once *assuming* the next idea is 
implemented:

> - (After some in-office discussion) It may be possible to mitigate much
>  of that by sending root trees that we have as "have" (e.g. by
>  consulting the reflog), and that wouldn't need any protocol change.

This would complicate the code - not in Git itself, but in my FUSE-related 
logic. We would have to explore the reflog and try to find the closest commits 
in history to the target commit being checked out. This is sounding a bit hacky 
and round-about, and it assumes that at the FUSE layer we can detect when a 
checkout is happening cleanly and sufficiently early (rather than when one of 
the sub-sub-trees is being accessed).

> - Supporting any combination of filter means that we have more to
>  implement and test, especially if we want to support more filters in
>  the future. In particular, the different filters (e.g. blob, tree)
>  have different code paths now in Git. One way to solve it would be to
>  combine everything into one monolith, but I would like to avoid it if
>  possible (after having to deal with revision walking a few times...)

I don’t believe there is any need to introduce monolithic code. The bulk of the 
filter implementation is in list-objects-filter.c, and I don’t think the file 
will get much longer with an additional filter that “combines” the existing 
filter. The new filter is likely simpler than the sparse filter. Once I add the 
new filter and send out the initial patch set, we can discuss splitting up the 
file, if it appears to be necessary.

My idea - if it is not clear already - is to add another OO-like interface to 
list-objects-filter.c which parallels the 5 that are already there.



Re: Proposal: Remembering message IDs sent with git send-email

2019-05-09 Thread Drew DeVault
On 2019-05-09 11:51 AM, Emily Shaffer wrote:
> I'm still not sure I see the value of the extra header proposed here.
> I'd appreciate an explanation of how you think it would be used, Drew.

I'm not just thinking about your run of the mill mail reader, but also
mail readers which are aware of git and could use it to provide
git-specific features for browsing patchsets. Distinguishing it from the
mechanism used for normal conversation allows us to have fewer
heuristics in such software.


Re: Proposal: Remembering message IDs sent with git send-email

2019-05-09 Thread Eric Wong
Drew DeVault  wrote:
> --in-reply-to=ask doesn't exist, that's what I'm looking to add. This
> convenient storage mechanism is exactly what I'm talking about. Sorry
> for the confusion.

Using Net::NNTP to query NNTP servers using ->xover([recent-ish
range]) to scan for Message-IDs and Subjects matching the
current ident could be an option, too.

It could cache the xover result for --dry-run and format-patch
cases; and Net::NNTP is a standard Perl module.  Going online
to do this query also benefits people who work across different
machines/environments, as it's one less thing to sync.

Fwiw, this list has:
nntp://news.gmane.org/gmane.comp.version-control.git
nntp://news.public-inbox.org/inbox.comp.version-control.git

And there's a bunch of kernel lists at nntp://nntp.lore.kernel.org/


Re: Proposal: Remembering message IDs sent with git send-email

2019-05-09 Thread Emily Shaffer
On Thu, May 09, 2019 at 12:50:25PM -0400, Drew DeVault wrote:
> On 2019-05-08  5:19 PM, Emily Shaffer wrote:
> > What I think might be useful (and what I was hoping you were going to
> > talk about when I saw the subject line) would be if the Message-Id is
> > conveniently stored during `git send-email` on v1 and somehow saved in a
> > useful place in order to apply to the In-Reply-To field on v2
> > automatically upon `git format-patch -v2`. I'll admit I didn't know
> > about --in-reply-to=ask and that helps with the pain point I've
> > experienced sending out v2 before.
> 
> --in-reply-to=ask doesn't exist, that's what I'm looking to add. This
> convenient storage mechanism is exactly what I'm talking about. Sorry
> for the confusion.

Looking at the documentation, I suppose I hadn't realized before that
--thread will generate a Message-Id for your cover letter. It does seem
like we could teach --thread to check for the previous patch's cover
letter in the directory provided by -o. Of course, this wouldn't work
if the author was generating v2 and didn't have the v1 files available
(i.e. different workstation or different author picking up the set).

I'm still not sure I see the value of the extra header proposed here.
I'd appreciate an explanation of how you think it would be used, Drew.

I don't know much about emailed workflows outside of Git; is this
something likely to be useful to other communities?

 - Emily


Re: Proposal: object negotiation for partial clones

2019-05-09 Thread Jonathan Tan
> > On 2019/05/07, at 11:34, Jonathan Tan  wrote:
> >
> > To get an enumeration of available objects, don't you need to use only
> > "blob:none"? Combining filters (once that's implemented) will get all
> > objects only up to a certain depth.
> >
> > Combining "tree:" and "blob:none" would allow us to reduce the number
> > of trees transmitted, but I would imagine that the savings would be
> > significant only for very large repositories. Do you have a specific use
> > case in mind that isn't solved by "blob:none"?
> 
> I am interested in supporting large repositories. The savings seem to be 
> larger than one may expect. I tried the following command on two huge repos 
> to find out how much it costs to fetch “blob:none” for a single commit:
> 
> $ git rev-list --objects --filter=blob:none HEAD: | xargs -n 2 bash -c 'git 
> cat-file -s $1' | awk '{ total += $1; print total }'
> 
> Note the “:” after HEAD - this limits it to the current commit.
> 
> And the results were:
>  - Linux: 2 684 054 bytes
>  - Chromium: > 16 139 570 bytes (then I got tired of waiting for it to finish)

Thanks for the numbers. Let me think about it some more, but I'm still
reluctant to introduce multiple filter support in the protocol and the
implementation for the following reasons:

- For large projects like Linux and Chromium, it may be reasonable to
  expect that an infrequent checkout would result in a few-megabyte
  download.
- (After some in-office discussion) It may be possible to mitigate much
  of that by sending root trees that we have as "have" (e.g. by
  consulting the reflog), and that wouldn't need any protocol change.
- Supporting any combination of filter means that we have more to
  implement and test, especially if we want to support more filters in
  the future. In particular, the different filters (e.g. blob, tree)
  have different code paths now in Git. One way to solve it would be to
  combine everything into one monolith, but I would like to avoid it if
  possible (after having to deal with revision walking a few times...)


Re: Proposal: Remembering message IDs sent with git send-email

2019-05-09 Thread Drew DeVault
On 2019-05-08  5:19 PM, Emily Shaffer wrote:
> What I think might be useful (and what I was hoping you were going to
> talk about when I saw the subject line) would be if the Message-Id is
> conveniently stored during `git send-email` on v1 and somehow saved in a
> useful place in order to apply to the In-Reply-To field on v2
> automatically upon `git format-patch -v2`. I'll admit I didn't know
> about --in-reply-to=ask and that helps with the pain point I've
> experienced sending out v2 before.

--in-reply-to=ask doesn't exist, that's what I'm looking to add. This
convenient storage mechanism is exactly what I'm talking about. Sorry
for the confusion.


Re: Proposal: Remembering message IDs sent with git send-email

2019-05-08 Thread Emily Shaffer
On Wed, May 08, 2019 at 07:10:13PM -0400, Drew DeVault wrote:
> I want to gather some thoughts about this. Say you've written a patch
> series and are getting ready to send a -v2. If you set
> --in-reply-to=ask, it'll show you a list of emails you've recently sent,
> and their subject lines, and ask you to pick one to use the message ID
> from. It'll set the In-Reply-To header to your selection.

It sounds to me like you mean to call this during `git format-patch` -
that is, `git format-patch -v2 --cover-letter --in-reply-to=ask master..branch
-o branch/`. That should set the In-Reply-To: header on your cover
letter.

There's also the possibility that you mean `git send-email
--in-reply-to=ask branch/v2*` - in which case I imagine the In-Reply-To:
is added as the message is sent, but not added to the cover letter text
file.

> 
> I'd also like to add a custom header, X-Patch-Supersedes: ,
> with a similar behavior & purpose.

Is the hope to store the message ID you choose from --in-reply-to=ask
into the X-Patch-Supersedes: header? I'm not sure I understand what
you're trying to solve; if you use `git format-patch --in-reply-to` it
sounds like the X-Patch-Supersedes: and In-Reply-To: would be redundant.

Is it possible you mean you want (sorry for pseudocode scribblings)
[PATCH v2 1/1]->X-Patch-Supersedes = [PATCH 1/1]->Message-Id ? I think that
wouldn't look good in a threaded mail client?

> 
> Thoughts?

Or maybe I totally misunderstood :)

What I think might be useful (and what I was hoping you were going to
talk about when I saw the subject line) would be if the Message-Id is
conveniently stored during `git send-email` on v1 and somehow saved in a
useful place in order to apply to the In-Reply-To field on v2
automatically upon `git format-patch -v2`. I'll admit I didn't know
about --in-reply-to=ask and that helps with the pain point I've
experienced sending out v2 before.

 - Emily


Proposal: Remembering message IDs sent with git send-email

2019-05-08 Thread Drew DeVault
I want to gather some thoughts about this. Say you've written a patch
series and are getting ready to send a -v2. If you set
--in-reply-to=ask, it'll show you a list of emails you've recently sent,
and their subject lines, and ask you to pick one to use the message ID
from. It'll set the In-Reply-To header to your selection.

I'd also like to add a custom header, X-Patch-Supersedes: ,
with a similar behavior & purpose.

Thoughts?
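
What exists today is supplying the Message-ID by hand, which is the step the
proposed =ask value would automate. A sketch with `git format-patch` in a
throwaway repository (the Message-ID below is invented):

```shell
# Sketch: manual --in-reply-to, the behavior =ask would automate.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m base
echo v2 > file.txt
git add file.txt
git -c user.name=t -c user.email=t@example.com commit -q -m change
git format-patch -1 -v2 --in-reply-to='<v1-cover@example.com>' -o out >/dev/null
line=$(grep '^In-Reply-To:' out/*.patch)
echo "$line"
```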


Re: Proposal: object negotiation for partial clones

2019-05-07 Thread Matthew DeVore



> On 2019/05/07, at 11:34, Jonathan Tan  wrote:
> 
> To get an enumeration of available objects, don't you need to use only
> "blob:none"? Combining filters (once that's implemented) will get all
> objects only up to a certain depth.
> 
> Combining "tree:" and "blob:none" would allow us to reduce the number
> of trees transmitted, but I would imagine that the savings would be
> significant only for very large repositories. Do you have a specific use
> case in mind that isn't solved by "blob:none"?

I am interested in supporting large repositories. The savings seem to be larger 
than one may expect. I tried the following command on two huge repos to find 
out how much it costs to fetch “blob:none” for a single commit:

$ git rev-list --objects --filter=blob:none HEAD: | xargs -n 2 bash -c 'git cat-file -s $1' | awk '{ total += $1; print total }'

Note the “:” after HEAD - this limits it to the current commit.

And the results were:
 - Linux: 2 684 054 bytes
 - Chromium: > 16 139 570 bytes (then I got tired of waiting for it to finish)



Re: Proposal: object negotiation for partial clones

2019-05-07 Thread Jonathan Tan
> > My main question is: we can get the same list of objects (in the form of
> > tree objects) if we fetch with "blob:none" filter. Admittedly, we will
> > get extra data (file names, etc.) - if the extra bandwidth saving is
> > necessary, this should be called out. (And some of the savings will be
> > offset by the fact that we will actually need some of those tree
> > objects.)
> That's a very good point. The data the first request gives us is
> basically the tree objects minus file names and modes. So I think a
> better feature to implement would be combining of multiple filters.
> That way, the client can combine "tree:" and
> "blob:none" and basically get an "enumeration" of available objects.

To get an enumeration of available objects, don't you need to use only
"blob:none"? Combining filters (once that's implemented) will get all
objects only up to a certain depth.

Combining "tree:" and "blob:none" would allow us to reduce the number
of trees transmitted, but I would imagine that the savings would be
significant only for very large repositories. Do you have a specific use
case in mind that isn't solved by "blob:none"?
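
The enumeration behavior of "blob:none" can be observed locally, since
`git rev-list` shares the same filter machinery as the protocol: with
blob:none the walk yields commits and trees but no blobs. A sketch in a
throwaway repository:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
echo a > a.txt
mkdir sub
echo b > sub/b.txt
git add .
git -c user.name=t -c user.email=t@example.com commit -q -m init
# blob:none: 1 commit + 2 trees (root and sub/); unfiltered adds the 2 blobs.
filtered=$(git rev-list --objects --filter=blob:none HEAD | wc -l)
full=$(git rev-list --objects HEAD | wc -l)
echo "$filtered $full"
```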


Re: Proposal: object negotiation for partial clones

2019-05-06 Thread Jonathan Nieder
Matthew DeVore wrote:
> On 2019/05/06, at 12:46, Jonathan Nieder  wrote:

>> Ah, interesting.  When this was discussed before, the proposal has been
>> that the client can say "have" anyway.  They don't have the commit and
>> all referenced objects, but they have the commit and a *promise* that
>> they can obtain all referenced objects, which is almost as good.
>> That's what "git fetch" currently implements.
>
> Doesn’t that mean the “have” may indicate that the client has the
> entire repository already, even though it’s only a partial clone? If
> so, then the client intends to ask for some tree plus trees and
> blobs 2-3 levels down deeper, how would the server distinguish
> between those objects the client *really* has and those that were
> just promised to them? Because the whole purpose of this
> hypothetical request is to get a bunch of promises fulfilled of
> which 0-99% are fulfilled already.

For blobs, the answer is simple: the server returns any object
explicitly named in a "want", even if the client already should have
it.

For trees, the current behavior is the same: if you declare that you
"have" everything, then if you "want" a tree with filter tree:2, you
only get that tree.  So here there's already room for improvement.

[...]
> Maybe something like this (conceptually based on original proposal) ?
>
> 1. Client sends request for an object or objects with an extra flag
> which means “I can’t really tell you what I already have since it’s
> a chaotic subset of the object database of the repo”
>
> 2. Server responds back with set of objects, represented by deltas
> if that is how the server has them on disk, along with a list of
> object-IDs needed in order to resolve the content of all the
> objects. These object-IDs can go several layers of deltas back, and
> they go back as far as it takes to get to an object stored in its
> entirety by the server.
>
> 3. Client responds back with another request (this time the extra
> flag sent from step 1 is not necessary) which has “want”s for every
> object the server named which the client already has.
>
> Very hand-wavey, but I think you see my idea.

The only downside I see is that the list of objects may itself be
large, and the server has to check reachability for each one.  But
maybe that's fine.

Perhaps after that initial response, instead of sending the list of
individual objects the client wants, it could send a list of relevant
objects it has (combined with the original set of "want"s).  That
could be a smaller request and it means less work for the server to
check each "want" for reachability.

What do you think?

[...]
> That's a very good point. The data the first request gives us is
> basically the tree objects minus file names and modes. So I think a
> better feature to implement would be combining of multiple filters.
> That way, the client can combine "tree:" and
> "blob:none" and basically get an "enumeration" of available objects.

This might be simpler.

Combining filters would be useful for other uses, too.

Thanks,
Jonathan


Re: Proposal: object negotiation for partial clones

2019-05-06 Thread Matthew DeVore



> On 2019/05/06, at 12:46, Jonathan Nieder  wrote:
> 
> Hi,
> 
> Jonathan Tan wrote:
>> Matthew DeVore wrote:
> 
>>> I'm considering implementing a feature in the Git protocol which would
>>> enable efficient and accurate object negotiation when the client is a
>>> partial clone. I'd like to refine and get some validation of my
>>> approach before I start to write any code, so I've written a proposal
>>> for anyone interested to review. Your comments would be appreciated.
>> 
>> Thanks. Let me try to summarize: The issue is that, during a fetch,
>> normally the client can say "have" to inform the server that it has a
>> commit and all its referenced objects (barring shallow lines), but we
>> can't do the same if the client is a partial clone (because having a
>> commit doesn't necessarily mean that we have all referenced objects).
> 
> Ah, interesting.  When this was discussed before, the proposal has been
> that the client can say "have" anyway.  They don't have the commit and
> all referenced objects, but they have the commit and a *promise* that
> they can obtain all referenced objects, which is almost as good.
> That's what "git fetch" currently implements.
Doesn’t that mean the “have” may indicate that the client has the entire 
repository already, even though it’s only a partial clone? If so, then the 
client intends to ask for some tree plus trees and blobs 2-3 levels down 
deeper, how would the server distinguish between those objects the client 
*really* has and those that were just promised to them? Because the whole 
purpose of this hypothetical request is to get a bunch of promises fulfilled of 
which 0-99% are fulfilled already.

> 
> For blob filters, if I ignore the capability advertisements (there's
> an optimization that hasn't yet been implemented to allow
> single-round-trip fetches), the current behavior takes the same number
> of round trips as this proposal.  Where the current approach has been
> lacking is in delta base selection during fetch-on-demand.  Ideas for
> improving that?

Maybe something like this (conceptually based on original proposal) ?

1. Client sends request for an object or objects with an extra flag which means 
“I can’t really tell you what I already have since it’s a chaotic subset of the 
object database of the repo”

2. Server responds back with set of objects, represented by deltas if that is 
how the server has them on disk, along with a list of object-IDs needed in 
order to resolve the content of all the objects. These object-IDs can go 
several layers of deltas back, and they go back as far as it takes to get to an 
object stored in its entirety by the server.

3. Client responds back with another request (this time the extra flag sent 
from step 1 is not necessary) which has “want”s for every object the server 
named which the client already has.

Very hand-wavey, but I think you see my idea.



Re: Proposal: object negotiation for partial clones

2019-05-06 Thread Matthew DeVore
On Mon, May 6, 2019 at 12:28 PM Jonathan Tan  wrote:
>
> > I'm considering implementing a feature in the Git protocol which would
> > enable efficient and accurate object negotiation when the client is a
> > partial clone. I'd like to refine and get some validation of my
> > approach before I start to write any code, so I've written a proposal
> > for anyone interested to review. Your comments would be appreciated.
>
> Thanks. Let me try to summarize: The issue is that, during a fetch,
> normally the client can say "have" to inform the server that it has a
> commit and all its referenced objects (barring shallow lines), but we
> can't do the same if the client is a partial clone (because having a
> commit doesn't necessarily mean that we have all referenced objects).
> And not doing this means that the server sends a lot of unnecessary
> objects in the sent packfile. The solution is to do the fetch in 2
> parts: one to get the list of objects that would be sent, and after the
> client filters that, one to get the objects themselves.
>
> It was unclear to me whether this is meant for (1) fetches directly
> initiated by the user that fetch commits (e.g. "git fetch origin",
> reusing the configured "core.partialclonefilter") and/or for (2) lazy
> fetching of missing objects. My assumption is that this is only for (2).
Yes, that was my intention. The client doesn't really know anything
about the hashes reported, so it can't really make an informed
selection from the candidate list given by the server after the first
request. I guess if we wanted to just reject *all* objects on the
initial clone, this feature would make that possible. But that can
also be achieved more comprehensively with a better filter system.

>
> My main question is: we can get the same list of objects (in the form of
> tree objects) if we fetch with "blob:none" filter. Admittedly, we will
> get extra data (file names, etc.) - if the extra bandwidth saving is
> necessary, this should be called out. (And some of the savings will be
> offset by the fact that we will actually need some of those tree
> objects.)
That's a very good point. The data the first request gives us is
basically the tree objects minus file names and modes. So I think a
better feature to implement would be combining multiple filters.
That way, the client can combine "tree:" and
"blob:none" and basically get an "enumeration" of available objects.

>
> Assuming that we do need that bandwidth saving, here's my review of that
> document.
>
> The document describes the 1st request exactly as I envision - a
> specific parameter sent by the client, and the server responds with a
> list of object names.
>
> For the 2nd request, the document describes it as repeating the original
> query of the 1st request while also giving the full list of objects
> wanted as "choose-refs". I'm still not convinced that repeating the
> original query is necessary - I would just give the list of objects as
> wants. The rationale given for repeating the original query is:
>
> > The original query is helpful because it means the server only needs
> > to do a single reachability check, rather than many separate ones.
>
> But this omits the fact that, if doing it the document's way, the server
> needs to perform an object walk in addition to the "single reachability
> check", and it is not true that if doing it my way, "many separate ones"
> need to be done because the server can check reachability of all objects
> at once.
After considering more carefully how reachability works (and getting
your explanation of it out-of-band), I would assume that my approach
is no better than marginally faster, and possibly worse, than just
doing a plain reachability check of multiple objects using the current
implementation. My current priorities preclude this kind of
benchmarking and micro-optimization, so what is more important to me
is simply enabling the combination of multiple filters.

>
> Also, my way means that supporting the 2nd request does not require any
> code or protocol change - it already works today. Assuming we follow my
> approach, the discussion thus lies in supporting the 1st request.
>
> Some more thoughts:
>
> - Changes in server and client scalability: Currently, the server checks
>   reachability of all wants, then enumerates, then sends all objects.
>   With this change, the server checks reachability of all wants, then
>   enumerates, then sends an object list, then checks reachability of all
>   objects in the filtered list, then sends some objects. There is
>   additional overhead in the extra reachability check and lists of
>   objects being sent twice (once by server and once by client), but
>   sending fewer objects means that I/O (server, network, client) and
>   disk space usage (client) is reduced.

Re: Proposal: object negotiation for partial clones

2019-05-06 Thread Jonathan Nieder
Hi,

Jonathan Tan wrote:
> Matthew DeVore wrote:

>> I'm considering implementing a feature in the Git protocol which would
>> enable efficient and accurate object negotiation when the client is a
>> partial clone. I'd like to refine and get some validation of my
>> approach before I start to write any code, so I've written a proposal
>> for anyone interested to review. Your comments would be appreciated.
>
> Thanks. Let me try to summarize: The issue is that, during a fetch,
> normally the client can say "have" to inform the server that it has a
> commit and all its referenced objects (barring shallow lines), but we
> can't do the same if the client is a partial clone (because having a
> commit doesn't necessarily mean that we have all referenced objects).

Ah, interesting.  When this was discussed before, the proposal was
that the client can say "have" anyway.  They don't have the commit and
all referenced objects, but they have the commit and a *promise* that
they can obtain all referenced objects, which is almost as good.
That's what "git fetch" currently implements.

But there's a hitch: when doing the fetch-on-demand for an object
access, the client currently does not say "have".  Sure, even there,
they have a *promise* that they can obtain all referenced objects, but
this could get out of hand: the first pack may contain a delta against
an object the client doesn't have, triggering another fetch which
contains a delta against another object they don't have, and so on.
Too many round trips.

> And not doing this means that the server sends a lot of unnecessary
> objects in the sent packfile. The solution is to do the fetch in 2
> parts: one to get the list of objects that would be sent, and after the
> client filters that, one to get the objects themselves.

This helps with object selection but not with delta base selection.

For object selection, I think the current approach already works okay,
at least where tree and blob filters are involved.  For commit
filters, in the current approach the fetch-on-demand sends way too
much because there's no "filter=commit:none" option to pass.  Is that
what this proposal aims to address?

For blob filters, if I ignore the capability advertisements (there's
an optimization that hasn't yet been implemented to allow
single-round-trip fetches), the current behavior takes the same number
of round trips as this proposal.  Where the current approach has been
lacking is in delta base selection during fetch-on-demand.  Ideas for
improving that?

Thanks,
Jonathan


Re: Proposal: object negotiation for partial clones

2019-05-06 Thread Jonathan Tan
> I'm considering implementing a feature in the Git protocol which would
> enable efficient and accurate object negotiation when the client is a
> partial clone. I'd like to refine and get some validation of my
> approach before I start to write any code, so I've written a proposal
> for anyone interested to review. Your comments would be appreciated.

Thanks. Let me try to summarize: The issue is that, during a fetch,
normally the client can say "have" to inform the server that it has a
commit and all its referenced objects (barring shallow lines), but we
can't do the same if the client is a partial clone (because having a
commit doesn't necessarily mean that we have all referenced objects).
And not doing this means that the server sends a lot of unnecessary
objects in the sent packfile. The solution is to do the fetch in 2
parts: one to get the list of objects that would be sent, and after the
client filters that, one to get the objects themselves.

It was unclear to me whether this is meant for (1) fetches directly
initiated by the user that fetch commits (e.g. "git fetch origin",
reusing the configured "core.partialclonefilter") and/or for (2) lazy
fetching of missing objects. My assumption is that this is only for (2).

My main question is: we can get the same list of objects (in the form of
tree objects) if we fetch with "blob:none" filter. Admittedly, we will
get extra data (file names, etc.) - if the extra bandwidth saving is
necessary, this should be called out. (And some of the savings will be
offset by the fact that we will actually need some of those tree
objects.)

Assuming that we do need that bandwidth saving, here's my review of that
document.

The document describes the 1st request exactly as I envision - a
specific parameter sent by the client, and the server responds with a
list of object names.

For the 2nd request, the document describes it as repeating the original
query of the 1st request while also giving the full list of objects
wanted as "choose-refs". I'm still not convinced that repeating the
original query is necessary - I would just give the list of objects as
wants. The rationale given for repeating the original query is:

> The original query is helpful because it means the server only needs
> to do a single reachability check, rather than many separate ones.

But this omits the fact that, if doing it the document's way, the server
needs to perform an object walk in addition to the "single reachability
check", and it is not true that if doing it my way, "many separate ones"
need to be done because the server can check reachability of all objects
at once.

Also, my way means that supporting the 2nd request does not require any
code or protocol change - it already works today. Assuming we follow my
approach, the discussion thus lies in supporting the 1st request.

Some more thoughts:

- Changes in server and client scalability: Currently, the server checks
  reachability of all wants, then enumerates, then sends all objects.
  With this change, the server checks reachability of all wants, then
  enumerates, then sends an object list, then checks reachability of all
  objects in the filtered list, then sends some objects. There is
  additional overhead in the extra reachability check and lists of
  objects being sent twice (once by server and once by client), but
  sending fewer objects means that I/O (server, network, client) and
  disk space usage (client) is reduced.

- Usefulness outside partial clone: If the user ever wants a list of
  objects referenced by an object but without their file names, the user
  could use this, but I can't think of such a scenario.


Re: Proposal: object negotiation for partial clones

2019-05-06 Thread Jonathan Nieder
Hi,

Matthew DeVore wrote:

> I'm considering implementing a feature in the Git protocol which would
> enable efficient and accurate object negotiation when the client is a
> partial clone. I'd like to refine and get some validation of my
> approach before I start to write any code, so I've written a proposal
> for anyone interested to review. Your comments would be appreciated.

Yay!  Thanks for looking into this, and sorry I didn't respond sooner.

I know the doc has a "use case" section, but I suppose I am not sure
that I understand the use case yet.  Is this about improving the
filter syntax to handle features like directory listing?  Or is this
about being able to make better use of deltas in a partial clone, to
decrease bandwidth consumption and overhead that is proportional to
size?

Thanks,
Jonathan


Proposal: object negotiation for partial clones

2019-04-28 Thread Matthew DeVore
Hello,

I'm considering implementing a feature in the Git protocol which would
enable efficient and accurate object negotiation when the client is a
partial clone. I'd like to refine and get some validation of my
approach before I start to write any code, so I've written a proposal
for anyone interested to review. Your comments would be appreciated.

Remember this is a publicly-accessible document so be sure to not
discuss any confidential topics in the comments!

Tiny URL: http://tinyurl.com/yxz747cy
Full URL: 
https://docs.google.com/document/d/1bcDKCgd2Dw5Cl6H9TrNi0ekqzaT8rbyK8EpPE3RcvPA/edit#

Thank you,
Matt


Re: [GSoC] [RFC] Proposal: Teach git stash to handle unmerged index entries.

2019-04-09 Thread Junio C Hamano
Junio C Hamano  writes:

> As to the design, it does not quite matter if you add four or more
> separate trees to represent stage #[0123] entries in the index to
> the already octopus merge commit that represents a stash entry ...

I forgot that I was planning to expand on this part while writing
the message I am following up.

There are a few things you must take into account while designing a
new format for a stash entry:

 - Your new feature will *NOT* be the last extension to the stash
   subsystem.  Always leave room for other developers to extend it
   further, without breaking backward compatibility when your new
   feature is not in use.

 - Even though you may never have encountered them in your projects,
   higher stage entries can have duplicates.  When merging two
   branches into your current branch, and there are three merge
   bases for such an octopus merge, the system (and the index
   format) is designed to allow a merge backend to store 3 stage #1
   entries (because there are that many common ancestor versions in
   the example), 1 stage #2 entry (because there is only one
   "current branch" a merge is made into) and 2 stage #3 entries
   (because there are that many other branches you are merging into
   the current branch), all for the same path.
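For readers unfamiliar with higher-stage entries, a small sketch of what they look like may help (throwaway repository and identities; a plain two-branch conflict, which yields one entry per stage rather than the duplicated stages Junio describes for octopus merges):

```shell
#!/bin/sh
# Create a two-branch content conflict and show the resulting
# unmerged (higher-stage) index entries.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
gitc() { git -c user.name=a -c user.email=a@b "$@"; }
echo base >f && git add f && gitc commit -qm base
git checkout -q -b side
echo side >f && gitc commit -qam side
git checkout -q -
echo ours >f && gitc commit -qam ours
gitc merge side >/dev/null 2>&1 || true  # conflict is expected
# One line per stage: <mode> <sha> <stage>	<path>
# stage 1 = merge base, stage 2 = ours, stage 3 = theirs
git ls-files -u
```

With three merge bases in an octopus merge, the index format would instead allow three stage #1 lines for the same path, which is exactly the case a new stash format must not rule out.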

So, a design that says:

   A stash entry in the current system is recorded as a merge
   commit, whose tree represents the state of the tracked working
   tree files, whose first parent records the HEAD commit the stash
   entry was created on, and whose second parent records the tree
   that would have been created if "git write-tree" were done on the
   index when the stash entry was created.  Optionally, it can have
   the third parent whose tree records the state of untracked files.

   Let's add three more parents.  IOW, the fourth parent's tree
   records the result of "git write-tree" of the index after
   removing all the entries other than those at stage #1 and moving
   the remainder from stage #1 down to stage #0, and similarly the
   fifth is for stage #2 and the sixth is for stage #3.

is bad at multiple counts.

 - It does not say what should happen to the third parent when this
   new "record unmerged state" feature is used without using the
   "record untracked paths" feature.

 - It does not allow multiple stage #1 and/or stage #3 entries.

For the first point, I think a trick to record the same commit as
the first parent may be a good hack to say "this is not used"; we
might need to allow commit-tree not to complain about duplicate
parents if we go that route.

For the second one, there may be multiple solutions.  A
quick-and-dirty and obvious way may be to add only one new parent to
the merge commit that represents a stash entry (i.e. the fourth
parent).  Make that new parent a merge of three commits, each of
which represents what was in stage #1, stage #2 and stage #3 (we can
reuse the second parent of the stash entry that usually records the
index state to store stage #0 entries).

As we allow multiple stage #1 or stage #3 entries in the index, and
there is no fundamental reason why we should not allow multiple
stage #2 entries, make each of these three commits able to represent
multiple entries at the same stage, perhaps by

 - iterate over the index and count the maximum occurrence of the
   same path at the same stage #$n;
 - make that stage #$n commit a merge of that many parent commits.
   The tree recorded in that stage #$n commit can be an empty tree.

I am not saying this is a good design.  I am merely showing the
expected level of detail when your design gets in a presentable
shape and shared with the list.

Have fun.




Re: [GSoC] [RFC] Proposal: Teach git stash to handle unmerged index entries.

2019-04-09 Thread Junio C Hamano
Kapil Jain  writes:

> Plan to implement the project.
>
> Objective:
>
> Description:
>
> Implementation Idea:
>
> Relevant Discussions:
>
> Idea Execution Plan: Divided into 2 parts.

Two things missing before implementation idea are design, and more
importantly, the success criteria.  What lets you and your mentor
declare victory?

As to the design, it does not quite matter if you add four or more
separate trees to represent stage #[0123] entries in the index to
the already octopus merge commit that represents a stash entry
(i.e. when keeping the untracked ones, I think the stash entry's
"result of the merge" tree records the state of the tracked files in
the working tree, and the "result of the merge" commit records the
the-current HEAD, a commit that records the state of the index and
anothre commit that records the state of the untracked files, as its
parents---that's already a 3-parent octopus).
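For reference, the existing 3-parent octopus that Junio describes can be inspected directly (a sketch with throwaway identities; the extra parents for unmerged state are the proposal under discussion, not anything current git records):

```shell
#!/bin/sh
# Build a stash entry with an index state and an untracked file,
# then inspect its 3-parent octopus structure.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
gitc() { git -c user.name=a -c user.email=a@b "$@"; }
gitc commit -q --allow-empty -m base
echo tracked >f && git add f      # staged change -> index state
echo untracked >u                 # untracked file
gitc stash push -q -u -m WIP
git cat-file -p refs/stash        # merge commit with three parents
git rev-parse refs/stash^1        # then-current HEAD
git rev-parse refs/stash^2        # the index recorded as a commit
git rev-parse refs/stash^3        # untracked files (only with -u)
```

Any backward-compatible extension has to slot additional parents after these, which is why the ordering and "unused parent" questions below matter.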

The fact that a stash entry is represented as a merge commit is a
mere implementation detail, and there is *NO* need to worry about
resolving merge conflicts while recording a stash.  If the result of
this GSoC task is to be any usable together with the current version
in a backward compatible way, you must record these extra states as
extra parents of the merge, so it is sort of given already that
you'd be using some form of an octopus merge.

The real challenge would be how the unstashing part of such a stash
entry that records unmerged state should work.  Personally I do not
think it will be very useful to allow unstashing such a stash entry
on top of any arbitrary commit---rather, I suspect that the user
would want to come back to the exact HEAD the user had trouble
resolving conflicts at, without having to first check it out.
IOW, a usual way to use "git stash" is

$ git checkout topic
$ edit edit edit
... I am happily hacking away ...
... the boss appears with an ultra-urgent task ...
$ git stash save -m WIP
$ git checkout master
$ edit-and-build-and-test
$ git commit
... now the emergency is over ...
$ git checkout topic
... sync with the work others may have done on topic
... while I was dealing with the boss
$ git pull --rebase origin topic
$ git stash pop

IOW, it is expected to be applied on top of an updated commit.

But I have a moderately strong suspicion that a stash that holds
unmerged state (i.e. a conflicted merge in progress) is created with
a use case, which is very different from the normal use case, in
mind.  When creating such a stash entry, the above sequence would go
more like this:

$ git checkout topic
$ git merge ...
... oops, conflicted, and it takes time to resolve ...
$ edit edit inspect edit
... the boss appears
$ git stash save -m "Merge in progress"
$ git checkout master
... deal with the emergency the same way ...
$ git checkout topic
... go back to the conflict resolution first without
... touching what may have happened on the branch in
... the meantime---a human brain cannot afford to deal
... with two or more parallel conflicts at the same
... time.
$ git stash pop
... now deal with the conflict we were looking at
... before the boss interrupted us.
$ edit inspect edit
... be satisfied with the result
$ git commit
... now let's see if others have something else that
... is interesting
$ git pull --rebase origin topic

And if we assume that the primary use of a stash for a conflicted
state is to bring us back to the exact state (rather than allowing
us to pretend as if we started from a different HEAD), it might even
make sense to teach "git stash pop" step to barf if HEAD does not
match the first parent of the merge commit that represents the stash
entry being applied (again, stash^{tree} is the working tree,
stash^1 is then-current HEAD).  That would make the application side
a lot simpler and manageable by developers who are not intimately
familiar with the code.
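That guard is small enough to sketch in script form (hypothetical: nothing like it exists in git-stash today, and the function name is made up):

```shell
#!/bin/sh
# Proposed guard: refuse to apply a conflicted-state stash unless
# HEAD is the exact commit the stash was created on, i.e. the stash
# entry's first parent.
stash_matches_head() {
    stash_base=$(git rev-parse --verify refs/stash^1) &&
    test "$(git rev-parse --verify HEAD)" = "$stash_base" || {
        echo >&2 "error: apply this stash on its original HEAD"
        return 1
    }
}
```

"git stash pop" for a stash recording unmerged entries would call such a check first and barf when it fails, keeping the application side trivial.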

Others may disagree with the above assumption (i.e. "a stash for a
conflicted state does not have to be applicable"), though, making
your task a lot harder ;-).

Quite honestly, I do not think you can design a system that attempts
to "stash apply/pop" a recorded unmerged state on top of any
arbitrary HEAD and leave a state useful for the end user to deal with
when the "stash apply/pop" step itself introduces _new_ conflicts
due to the differences between the then-current HEAD the stash entry
is based on and the HEAD the "stash apply" is attempted on top of.
Even the current "stash apply/pop with the change between the HEAD
and the index" does punt when it cannot make a clean application,
and that is without any unmerged entries in the recorded index
state.

The key point is "a state useful for the end user"---it is 

[GSoC] [RFC] Proposal: Teach git stash to handle unmerged index entries.

2019-04-09 Thread Kapil Jain
Plan to implement the project.

Objective:
Teach git stash to handle unmerged index entries.

Description:
When the index is unmerged, git stash refuses to do anything. That is
unnecessary, though, as it could easily craft e.g. an octopus merge of
the various stages. A subsequent git stash apply can detect that
octopus and re-generate the unmerged index.


Implementation Idea:
Performing an octopus merge of all `stage n` (n>0) unmerged index
entries could solve the problem, but

What if there are conflicts in merging?
In this case, we would store (commit) the conflicted state, so it can
be regenerated when git stash is applied.

How to store the conflicted files?
Create a tree from the merge using `git-write-tree`
and then commit that tree using `git-commit-tree`.


Relevant Discussions:
https://colabti.org/irclogger/irclogger_log/git-devel?date=2019-04-05#l92
https://colabti.org/irclogger/irclogger_log/git-devel?date=2019-04-09#l47


Idea Execution Plan: Divided into 2 parts.

Part 1: Store the unmerged index entries. This part will work with
`git stash push`.

stash.sh: file would be changed to accommodate the below implementation.

Step 1:
Extract all the unmerged entries from the index file and store them in
a temporary index file.

read-cache.c: this file is responsible for reading the index file;
this implementation will probably end up there.

Step 2:
cache-tree.c: study and implement a slightly modified version of the
function `write_index_as_tree()`

int write_index_as_tree(struct object_id *oid, struct index_state
*index_state, const char *index_path, int flags, const char *prefix);

this function is responsible for writing a tree from the index file.
Currently, this function requires the index to be in a fully merged
state, and we are dealing with its exact opposite. So a version that
writes a tree from unmerged index entries will be implemented.

Step 3:
write-tree.c: some possible changes will go here, so as to use the
modified version of the write_index_as_tree() function.

Step 4:
Use git-commit-tree to commit the written tree and store the hash in
some file, say `stash_conflicting_merge`.
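Steps 1-4 can be prototyped entirely with existing plumbing before touching any C. The sketch below creates a conflicted merge, copies its stage #2 ("ours") entries into a temporary index at stage 0, writes that index as a tree, and commits the tree; the stage-to-0 rewrite is this sketch's own idea of the transformation, not an existing git feature, and identities are throwaway values:

```shell
#!/bin/sh
# Prototype of Steps 1-4 using ls-files, update-index, write-tree
# and commit-tree. Assumes paths contain no whitespace.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
gitc() { git -c user.name=a -c user.email=a@b "$@"; }
echo base >f && git add f && gitc commit -qm base
git checkout -q -b side && echo side >f && gitc commit -qam side
git checkout -q - && echo ours >f && gitc commit -qam ours
gitc merge side >/dev/null 2>&1 || true   # leaves stages 1/2/3
# Step 1: re-register each stage #2 entry at stage 0 in a temp index
TMPIDX=$tmp/tmpidx
git ls-files -u |
awk '$3 == 2 { print $1" "$2" 0\t"$4 }' |
GIT_INDEX_FILE=$TMPIDX git update-index --index-info
# Steps 2-4: write that index as a tree and commit the tree
tree=$(GIT_INDEX_FILE=$TMPIDX git write-tree)
commit=$(gitc commit-tree -m "stage #2 snapshot" "$tree")
git cat-file -t "$commit"   # commit
```

Repeating the pipeline with `$3 == 1` and `$3 == 3` gives the stage #1 and stage #3 trees, which is the per-stage decomposition the C implementation would perform inside a modified write_index_as_tree().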

Step 5:
Write tests for all of the implementation up to this point.

Part 2: Retrieve the tree hash and regenerate the state of the
repository as it was earlier.

Step 6:
Modify the implementation of `git stash apply` to regenerate the committed tree.

Step 7:
Write tests.


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Torsten Bögershausen
On 2019-04-08 21:36, Matheus Tavares Bernardino wrote:
> On Mon, Apr 8, 2019 at 4:19 PM Philip Oakley  wrote:
>>
>> Hi Matheus
>>
>> On 08/04/2019 18:04, Matheus Tavares Bernardino wrote:
>>>> Another "32-bit problem" should also be expressly considered during the
>>>> GSoC work because of the MS Windows definition of uInt / long to be only
>>>> 32 bits, leading to much of the Git code failing on the Git for Windows
>>>> port and on the Git LFS (for Windows) for packs and files greater than
>>>> 4Gb. https://github.com/git-for-windows/git/issues/1063
>>
>>> Thanks for pointing it out. I didn't get, though, whether your
>>> suggestion was to also propose tackling this issue in this GSoC
>>> project. Was it that? I read the link but it seems to be a kind of
>>> unrelated problem from what I'm planning to do with the pack access
>>> code (which is thread-safety). I may have understood this wrongly,
>>> though. Please, let me know if that's the case :)
>>>
>> The main point was to avoid accidental regressions by re-introducing
>> simple 'longs' where memsized types were more appropriate.
>>
>> Torsten has already done a lot of work at
>> https://github.com/tboegi/git/tree/tb.190402_1552_convert_size_t_only_git_master_181124_mk_size_t
>
> Got it. Thanks, Philip!
>
>> HTH
>> Philip
>> (I'm off line for a few days)

Thanks for the reminder -
I will probably send something out the next days/weeks.


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Matheus Tavares Bernardino
On Mon, Apr 8, 2019 at 4:19 PM Philip Oakley  wrote:
>
> Hi Matheus
>
> On 08/04/2019 18:04, Matheus Tavares Bernardino wrote:
> >> Another "32-bit problem" should also be expressly considered during the
> >> GSoC work because of the MS Windows definition of uInt / long to be only
> >> 32 bits, leading to much of the Git code failing on the Git for Windows
> >> port and on the Git LFS (for Windows) for packs and files greater than
> >> 4Gb. https://github.com/git-for-windows/git/issues/1063
>
> > Thanks for pointing it out. I didn't get, though, whether your
> > suggestion was to also propose tackling this issue in this GSoC
> > project. Was it that? I read the link but it seems to be a kind of
> > unrelated problem from what I'm planning to do with the pack access
> > code (which is thread-safety). I may have understood this wrongly,
> > though. Please, let me know if that's the case :)
> >
> The main point was to avoid accidental regressions by re-introducing
> simple 'longs' where memsized types were more appropriate.
>
> Torsten has already done a lot of work at
> https://github.com/tboegi/git/tree/tb.190402_1552_convert_size_t_only_git_master_181124_mk_size_t

Got it. Thanks, Philip!

> HTH
> Philip
> (I'm off line for a few days)


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Philip Oakley

Hi Matheus

On 08/04/2019 18:04, Matheus Tavares Bernardino wrote:

>> Another "32-bit problem" should also be expressly considered during the
>> GSoC work because of the MS Windows definition of uInt / long to be only
>> 32 bits, leading to much of the Git code failing on the Git for Windows
>> port and on the Git LFS (for Windows) for packs and files greater than
>> 4Gb. https://github.com/git-for-windows/git/issues/1063



> Thanks for pointing it out. I didn't get, though, whether your
> suggestion was to also propose tackling this issue in this GSoC
> project. Was it that? I read the link but it seems to be a kind of
> unrelated problem from what I'm planning to do with the pack access
> code (which is thread-safety). I may have understood this wrongly,
> though. Please, let me know if that's the case :)

The main point was to avoid accidental regressions by re-introducing 
simple 'longs' where memsized types were more appropriate.


Torsten has already done a lot of work at 
https://github.com/tboegi/git/tree/tb.190402_1552_convert_size_t_only_git_master_181124_mk_size_t


HTH
Philip
(I'm off line for a few days)


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Matheus Tavares Bernardino
On Mon, Apr 8, 2019 at 6:26 AM Philip Oakley  wrote:
>
> On 08/04/2019 02:23, Duy Nguyen wrote:
> > On Mon, Apr 8, 2019 at 5:52 AM Christian Couder
> >  wrote:
> >>> Git has a very optimized mechanism to compactly store
> >>> objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> >>> created by[3]:
> >>>
> >>> 1. listing objects;
> >>> 2. sorting the list with some good heuristics;
> >>> 3. traversing the list with a sliding window to find similar objects in
> >>> the window, in order to do delta compression;
> >>> 4. compress the objects with zlib and write them to the packfile.
> >>>
> >>> What we are calling pack access code in this document, is the set of
> >>> functions responsible for retrieving the objects stored at the
> >>> packfiles. This process consists, roughly speaking, in three parts:
> >>>
> >>> 1. Locate and read the blob from packfile, using the index file;
> >>> 2. If the blob is a delta, locate and read the base object to apply the
> >>> delta on top of it;
> >>> 3. Once the full content is read, decompress it (using zlib inflate).
> >>>
> >>> Note: There is a delta cache for the second step so that if another
> >>> delta depends on the same base object, it is already in memory. This
> >>> cache is global; also, the sliding windows are global per packfile.
> >> Yeah, but the sliding windows are used only when creating pack files,
> >> not when reading them, right?
> > These windows are actually for reading. We used to just mmap the whole
> > pack file in the early days but that was impossible for 4+ GB packs on
> > 32-bit platforms, which was one of the reasons, I think, that sliding
> > windows were added, to map just the parts we want to read.
>
> Another "32-bit problem" should also be expressly considered during the
> GSoC work because of the MS Windows definition of uInt / long to be only
> 32 bits, leading to much of the Git code failing on the Git for Windows
> port and on the Git LFS (for Windows) for packs and files greater than
> 4Gb. https://github.com/git-for-windows/git/issues/1063

Thanks for pointing it out. I didn't get, though, whether your
suggestion was to also propose tackling this issue in this GSoC
project. Was it that? I read the link but it seems to be a kind of
unrelated problem from what I'm planning to do with the pack access
code (which is thread-safety). I may have understood this wrongly,
though. Please, let me know if that's the case :)

> Mainly it is just substitution of size_t for long, but there can be
> unexpected coercions when mixed data types get coerced down to a local
> 32-bit long. This is made worse by it being implementation defined, so
> one needs to be explicit about some casts up to pointer/memsized types.
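The hazard Philip describes is easy to demonstrate numerically: on LLP64 Windows, `long` is 32 bits, so a pack offset beyond 4GiB stored in a `long` silently wraps. A one-liner sketch of the wraparound (the mask plays the role of the 32-bit store):

```shell
#!/bin/sh
# A 5 GiB pack offset masked to 32 bits wraps around to 1 GiB,
# which is what assigning it to a 32-bit 'long' would produce.
off=5368709120                 # 5 GiB, needs more than 32 bits
echo $(( off & 0xFFFFFFFF ))   # prints 1073741824 (1 GiB)
```

This is why the fix is substituting size_t/off_t for long rather than sprinkling casts, and why such regressions are easy to reintroduce silently.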
> >>> # Points to work on
> >>>
> >>> * Investigate pack access call chains and look for non-thread-safe
> >>> operations on then.
> >>> * Protect packfile.c read-and-write global variables, such as
> >>> pack_open_windows, pack_open_fds and etc., using mutexes.
> >> Do you want to work on making both packfile reading and packfile
> >> writing thread safe? Or just packfile reading?
> > Packfile writing is probably already or pretty close to thread-safe
> > (at least the main writing code path in git-pack-objects; the
> > streaming blobs to a pack, i'm not so sure).
> --
> Philip


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Matheus Tavares Bernardino
On Sun, Apr 7, 2019 at 7:52 PM Christian Couder
 wrote:
>
> Hi Matheus
>
> On Sun, Apr 7, 2019 at 10:48 PM Matheus Tavares Bernardino
>  wrote:
> >
> > This is my proposal for GSoC with the subject "Make pack access code
> > thread-safe".
>
> Thanks!
>
> > I'm late in schedule but I would like to ask for your
> > comments on it. Any feedback will be highly appreciated.
> >
> > The "rendered" version can be seen here:
> > https://docs.google.com/document/d/1QXT3iiI5zjwusplcZNf6IbYc04-9diziVKdOGkTHeIU/edit?usp=sharing
>
> Thanks for the link!
>
> > Besides administrative questions and contributions to FLOSS projects, at
> > FLUSP, I’ve been mentoring people who want to start contributing to the
> > Linux Kernel and now, to Git, as well.
>
> Nice! Do you have links about that?

Unfortunately not :( Maybe just the mentoring slides (e.g.
https://flusp.ime.usp.br/materials/Kernel_Primeiros_Passos.pdf). But
they are all in Portuguese, so I don't know whether it would be
valuable to add them here...

> > # The Project
> >
> > As direct as possible, the goal with this project is to make more of
> > Git’s codebase thread-safe, so that we can improve parallelism in
> > various commands. The motivation behind this are the complaints from
> > developers experiencing slow Git commands when working with large
> > repositories[1], such as chromium and Android. And since nowadays, most
> > personal computers have multi-core CPUs, it is a natural step trying to
> > improve parallel support so that we can better use the available resources.
> >
> > With this in mind, pack access code is a good target for improvement,
> > since it’s used by many Git commands (e.g., checkout, grep, blame, diff,
> > log, etc.). This section of the codebase is still sequential and has
> > many global states, which should be protected before we can work to
> > improve parallelism.
>
> I think it's better if global state can be made local or perhaps
> removed, rather than protected (though of course that's not always
> possible).

Indeed! I just added this to the docs version. Thanks

> > ## The Pack Access Code
> >
> > To better describe what the pack access code is, we must talk about
> > Git’s object storing (in a simplified way):
>
> Maybe s/storing/storage/

Thanks. Already changed.

> > Besides what are called loose objects,
>
> s/loose object/loose object files/

Done, thanks!

> > Git has a very optimized mechanism to compactly store
> > objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> > created by[3]:
> >
> > 1. listing objects;
> > 2. sorting the list with some good heuristics;
> > 3. traversing the list with a sliding window to find similar objects in
> > the window, in order to do delta compression;
> > 4. compress the objects with zlib and write them to the packfile.
> >
> > What we are calling pack access code in this document, is the set of
> > functions responsible for retrieving the objects stored at the
> > packfiles. This process consists, roughly speaking, in three parts:
> >
> > 1. Locate and read the blob from packfile, using the index file;
> > 2. If the blob is a delta, locate and read the base object to apply the
> > delta on top of it;
> > 3. Once the full content is read, decompress it (using zlib inflate).
> >
> > Note: There is a delta cache for the second step so that if another
> > delta depends on the same base object, it is already in memory. This
> > cache is global; also, the sliding windows, are global per packfile.
>
> Yeah, but the sliding windows are used only when creating pack files,
> not when reading them, right?
>
> > If these steps were thread-safe, the ability to perform the delta
> > reconstruction (together with the delta cache lookup) and zlib inflation
> > in parallel could bring a good speedup. At git-blame, for example,
> > 24%[4] of the time is spent in the call stack originated at
> > read_object_file_extended. Not only this but once we have this big
> > section of the codebase thread-safe, we can work to parallelize even
> > more work at higher levels of the call stack. Therefore, with this
> > project, we aim to make room for many future optimizations in many Git
> > commands.
>
> Nice.
>
> > # Plan
> >
> > I will probably be working mainly with packfile.c, sha1-file.c,
> > object-store.h, object.c and pack.h, however, I may also need to tackle
> > other files. I will be focusing on the following three pack access call
> > chains, found in git-g

Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Matheus Tavares Bernardino
On Mon, Apr 8, 2019 at 3:58 AM Christian Couder
 wrote:
>
> On Mon, Apr 8, 2019 at 5:32 AM Duy Nguyen  wrote:
> >
> > On Mon, Apr 8, 2019 at 8:23 AM Duy Nguyen  wrote:
> > >
> > > On Mon, Apr 8, 2019 at 5:52 AM Christian Couder
> > >  wrote:
> > > > > Git has a very optimized mechanism to compactly store
> > > > > objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> > > > > created by[3]:
> > > > >
> > > > > 1. listing objects;
> > > > > 2. sorting the list with some good heuristics;
> > > > > 3. traversing the list with a sliding window to find similar objects 
> > > > > in
> > > > > the window, in order to do delta decomposing;
> > > > > 4. compress the objects with zlib and write them to the packfile.
> > > > >
> > > > > What we are calling pack access code in this document, is the set of
> > > > > functions responsible for retrieving the objects stored at the
> > > > > packfiles. This process consists, roughly speaking, in three parts:
> > > > >
> > > > > 1. Locate and read the blob from packfile, using the index file;
> > > > > 2. If the blob is a delta, locate and read the base object to apply 
> > > > > the
> > > > > delta on top of it;
> > > > > 3. Once the full content is read, decompress it (using zlib inflate).
> > > > >
> > > > > Note: There is a delta cache for the second step so that if another
> > > > > delta depends on the same base object, it is already in memory. This
> > > > > cache is global; also, the sliding windows, are global per packfile.
> > > >
> > > > Yeah, but the sliding windows are used only when creating pack files,
> > > > not when reading them, right?
> > >
> > > These windows are actually for reading. We used to just mmap the whole
> > > pack file in the early days but that was impossible for 4+ GB packs on
> > > 32-bit platforms, which was one of the reasons, I think, that sliding
> > > windows were added, to map just the parts we want to read.
> >
> > To clarify (I think I see why you mentioned pack creation now), there
> > are actually two window concepts. core.packedGitWindowSize is about
> > reading pack files. pack.window is for generating pack files. The
> > second window should already be thread-safe since we do all the
> > heuristics to find best base object candidates in threads.
>
> Yeah, it is not very clear in the proposal which windows it is talking
> about as I think a window is first mentioned when describing the steps
> to create a packfile in:
>
> "3. traversing the list with a sliding window to find similar objects
> in the window, in order to do delta decomposing;"
>
> Also the proposal plans to "Protect packfile.c read-and-write global
> variables ..." which made me wonder if it was also about improving
> thread safety when generating pack files.

Sorry, it is indeed unclear. The idea here was to say that variables
which are both read and updated in code that must be thread-safe
should be protected. I will refactor this, thanks.

Oh, also, I'm targeting just packfile reading. The explanation of how
packfiles are created was included only for context, but perhaps it led
to some confusion about the proposal's objective. Thanks for this
feedback too.

> Thanks for clarifying!


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Matheus Tavares Bernardino
On Mon, Apr 8, 2019 at 12:32 AM Duy Nguyen  wrote:
>
> On Mon, Apr 8, 2019 at 8:23 AM Duy Nguyen  wrote:
> >
> > On Mon, Apr 8, 2019 at 5:52 AM Christian Couder
> >  wrote:
> > > > Git has a very optimized mechanism to compactly store
> > > > objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> > > > created by[3]:
> > > >
> > > > 1. listing objects;
> > > > 2. sorting the list with some good heuristics;
> > > > 3. traversing the list with a sliding window to find similar objects in
> > > > the window, in order to do delta decomposing;
> > > > 4. compress the objects with zlib and write them to the packfile.
> > > >
> > > > What we are calling pack access code in this document, is the set of
> > > > functions responsible for retrieving the objects stored at the
> > > > packfiles. This process consists, roughly speaking, in three parts:
> > > >
> > > > 1. Locate and read the blob from packfile, using the index file;
> > > > 2. If the blob is a delta, locate and read the base object to apply the
> > > > delta on top of it;
> > > > 3. Once the full content is read, decompress it (using zlib inflate).
> > > >
> > > > Note: There is a delta cache for the second step so that if another
> > > > delta depends on the same base object, it is already in memory. This
> > > > cache is global; also, the sliding windows, are global per packfile.
> > >
> > > Yeah, but the sliding windows are used only when creating pack files,
> > > not when reading them, right?
> >
> > These windows are actually for reading. We used to just mmap the whole
> > pack file in the early days but that was impossible for 4+ GB packs on
> > 32-bit platforms, which was one of the reasons, I think, that sliding
> > windows were added, to map just the parts we want to read.
>
> To clarify (I think I see why you mentioned pack creation now), there
> are actually two window concepts. core.packedGitWindowSize is about
> reading pack files. pack.window is for generating pack files. The
> second window should already be thread-safe since we do all the
> heuristics to find best base object candidates in threads.

I was indeed confusing these two concepts, thanks for clarifying it! I
took a quick look at the usage of core.packedGitWindowSize around the
code (at packfile.c) and it seems to be already thread-safe (I may be
wrong though).

> --
> Duy


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-08 Thread Philip Oakley

On 08/04/2019 02:23, Duy Nguyen wrote:

On Mon, Apr 8, 2019 at 5:52 AM Christian Couder
 wrote:

Git has a very optimized mechanism to compactly store
objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
created by[3]:

1. listing objects;
2. sorting the list with some good heuristics;
3. traversing the list with a sliding window to find similar objects in
the window, in order to do delta decomposing;
4. compress the objects with zlib and write them to the packfile.

What we are calling pack access code in this document, is the set of
functions responsible for retrieving the objects stored at the
packfiles. This process consists, roughly speaking, in three parts:

1. Locate and read the blob from packfile, using the index file;
2. If the blob is a delta, locate and read the base object to apply the
delta on top of it;
3. Once the full content is read, decompress it (using zlib inflate).

Note: There is a delta cache for the second step so that if another
delta depends on the same base object, it is already in memory. This
cache is global; also, the sliding windows, are global per packfile.

Yeah, but the sliding windows are used only when creating pack files,
not when reading them, right?

These windows are actually for reading. We used to just mmap the whole
pack file in the early days but that was impossible for 4+ GB packs on
32-bit platforms, which was one of the reasons, I think, that sliding
windows were added, to map just the parts we want to read.


Another "32-bit problem" should also be expressly considered during the 
GSoC work: MS Windows defines uInt / long as only 32 bits, which leads 
to much of the Git code failing on the Git for Windows port and on Git 
LFS (for Windows) for packs and files greater than 
4 GB. https://github.com/git-for-windows/git/issues/1063


Mainly it is just a substitution of size_t for long, but there can be 
unexpected coercions when mixed data types get narrowed down to a local 
32-bit long. This is made worse by the narrowing being implementation 
defined, so one needs to be explicit about casts up to 
pointer-sized/memory-sized types.

# Points to work on

* Investigate pack access call chains and look for non-thread-safe
operations on them.
* Protect packfile.c read-and-write global variables, such as
pack_open_windows, pack_open_fds, etc., using mutexes.

Do you want to work on making both packfile reading and packfile
writing thread safe? Or just packfile reading?

Packfile writing is probably already thread-safe, or pretty close to it
(at least the main writing code path in git-pack-objects; the streaming
of blobs to a pack, I'm not so sure).

--
Philip


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-07 Thread Christian Couder
On Mon, Apr 8, 2019 at 5:32 AM Duy Nguyen  wrote:
>
> On Mon, Apr 8, 2019 at 8:23 AM Duy Nguyen  wrote:
> >
> > On Mon, Apr 8, 2019 at 5:52 AM Christian Couder
> >  wrote:
> > > > Git has a very optimized mechanism to compactly store
> > > > objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> > > > created by[3]:
> > > >
> > > > 1. listing objects;
> > > > 2. sorting the list with some good heuristics;
> > > > 3. traversing the list with a sliding window to find similar objects in
> > > > the window, in order to do delta decomposing;
> > > > 4. compress the objects with zlib and write them to the packfile.
> > > >
> > > > What we are calling pack access code in this document, is the set of
> > > > functions responsible for retrieving the objects stored at the
> > > > packfiles. This process consists, roughly speaking, in three parts:
> > > >
> > > > 1. Locate and read the blob from packfile, using the index file;
> > > > 2. If the blob is a delta, locate and read the base object to apply the
> > > > delta on top of it;
> > > > 3. Once the full content is read, decompress it (using zlib inflate).
> > > >
> > > > Note: There is a delta cache for the second step so that if another
> > > > delta depends on the same base object, it is already in memory. This
> > > > cache is global; also, the sliding windows, are global per packfile.
> > >
> > > Yeah, but the sliding windows are used only when creating pack files,
> > > not when reading them, right?
> >
> > These windows are actually for reading. We used to just mmap the whole
> > pack file in the early days but that was impossible for 4+ GB packs on
> > 32-bit platforms, which was one of the reasons, I think, that sliding
> > windows were added, to map just the parts we want to read.
>
> To clarify (I think I see why you mentioned pack creation now), there
> are actually two window concepts. core.packedGitWindowSize is about
> reading pack files. pack.window is for generating pack files. The
> second window should already be thread-safe since we do all the
> heuristics to find best base object candidates in threads.

Yeah, it is not very clear in the proposal which windows it is talking
about as I think a window is first mentioned when describing the steps
to create a packfile in:

"3. traversing the list with a sliding window to find similar objects
in the window, in order to do delta decomposing;"

Also the proposal plans to "Protect packfile.c read-and-write global
variables ..." which made me wonder if it was also about improving
thread safety when generating pack files.

Thanks for clarifying!


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-07 Thread Duy Nguyen
On Mon, Apr 8, 2019 at 8:23 AM Duy Nguyen  wrote:
>
> On Mon, Apr 8, 2019 at 5:52 AM Christian Couder
>  wrote:
> > > Git has a very optimized mechanism to compactly store
> > > objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> > > created by[3]:
> > >
> > > 1. listing objects;
> > > 2. sorting the list with some good heuristics;
> > > 3. traversing the list with a sliding window to find similar objects in
> > > the window, in order to do delta decomposing;
> > > 4. compress the objects with zlib and write them to the packfile.
> > >
> > > What we are calling pack access code in this document, is the set of
> > > functions responsible for retrieving the objects stored at the
> > > packfiles. This process consists, roughly speaking, in three parts:
> > >
> > > 1. Locate and read the blob from packfile, using the index file;
> > > 2. If the blob is a delta, locate and read the base object to apply the
> > > delta on top of it;
> > > 3. Once the full content is read, decompress it (using zlib inflate).
> > >
> > > Note: There is a delta cache for the second step so that if another
> > > delta depends on the same base object, it is already in memory. This
> > > cache is global; also, the sliding windows, are global per packfile.
> >
> > Yeah, but the sliding windows are used only when creating pack files,
> > not when reading them, right?
>
> These windows are actually for reading. We used to just mmap the whole
> pack file in the early days but that was impossible for 4+ GB packs on
> 32-bit platforms, which was one of the reasons, I think, that sliding
> windows were added, to map just the parts we want to read.

To clarify (I think I see why you mentioned pack creation now), there
are actually two window concepts. core.packedGitWindowSize is about
reading pack files. pack.window is for generating pack files. The
second window should already be thread-safe since we do all the
heuristics to find best base object candidates in threads.
-- 
Duy


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-07 Thread Duy Nguyen
On Mon, Apr 8, 2019 at 5:52 AM Christian Couder
 wrote:
> > Git has a very optimized mechanism to compactly store
> > objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> > created by[3]:
> >
> > 1. listing objects;
> > 2. sorting the list with some good heuristics;
> > 3. traversing the list with a sliding window to find similar objects in
> > the window, in order to do delta decomposing;
> > 4. compress the objects with zlib and write them to the packfile.
> >
> > What we are calling pack access code in this document, is the set of
> > functions responsible for retrieving the objects stored at the
> > packfiles. This process consists, roughly speaking, in three parts:
> >
> > 1. Locate and read the blob from packfile, using the index file;
> > 2. If the blob is a delta, locate and read the base object to apply the
> > delta on top of it;
> > 3. Once the full content is read, decompress it (using zlib inflate).
> >
> > Note: There is a delta cache for the second step so that if another
> > delta depends on the same base object, it is already in memory. This
> > cache is global; also, the sliding windows, are global per packfile.
>
> Yeah, but the sliding windows are used only when creating pack files,
> not when reading them, right?

These windows are actually for reading. We used to just mmap the whole
pack file in the early days but that was impossible for 4+ GB packs on
32-bit platforms, which was one of the reasons, I think, that sliding
windows were added, to map just the parts we want to read.

> > # Points to work on
> >
> > * Investigate pack access call chains and look for non-thread-safe
> > operations on them.
> > * Protect packfile.c read-and-write global variables, such as
> > pack_open_windows, pack_open_fds, etc., using mutexes.
>
> Do you want to work on making both packfile reading and packfile
> writing thread safe? Or just packfile reading?

Packfile writing is probably already thread-safe, or pretty close to it
(at least the main writing code path in git-pack-objects; the streaming
of blobs to a pack, I'm not so sure).
-- 
Duy


Re: [GSoC][RFC v3] Proposal: Improve consistency of sequencer commands

2019-04-07 Thread Christian Couder
Hi Rohit,

On Sun, Apr 7, 2019 at 2:17 PM Rohit Ashiwal  wrote:
>
> On Sun, 7 Apr 2019 09:15:30 +0200 Christian Couder 
>  wrote:
>
> > As we are close to the deadline (April 9th) for proposal submissions,
> > I think it's a good idea to already upload your draft proposal on the
> > GSoC site. I think you will be able to upload newer versions until the
> > deadline, but uploading soon avoids possible last-minute issues and
> > mistakes.
>
> Sure, I'll upload my proposal as soon as possible.

Great!

> > It looks like you copy pasted the Git Rev News article without
> > updating the content. The improvement has been released a long time
> > ago.
>
> The intention was to document how the project started and the *major*
> milestones or turning points of the project. Here they are.

Yeah, the intention is good, though it would be nice if the details
were a bit more polished.

> > Maybe s/rebases/rebase/
>
> Yes, :P
>
> > It seems to me that there has been more recent work than this and also
> > perhaps interesting suggestions and discussions about possible
> > sequencer related  improvements on the mailing list.
>
> Again, the idea was to document the earlier stages of the project; "recent"
> discussions have been about optimizations, which are not exactly relevant.

I think there were ideas (from Elijah) about using the sequencer in
the regular (non interactive) rebase too.

> Should I write more about recent developments?

I think Alban's GSoC project was relevant too.

So yeah, if you have time after uploading your proposal to the GSoC
web site, it would be nice if you can update it with a bit more
information about what happened recently.

Thanks,
Christian.


Re: [GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-07 Thread Christian Couder
Hi Matheus

On Sun, Apr 7, 2019 at 10:48 PM Matheus Tavares Bernardino
 wrote:
>
> This is my proposal for GSoC with the subject "Make pack access code
> thread-safe".

Thanks!

> I'm late in schedule but I would like to ask for your
> comments on it. Any feedback will be highly appreciated.
>
> The "rendered" version can be seen here:
> https://docs.google.com/document/d/1QXT3iiI5zjwusplcZNf6IbYc04-9diziVKdOGkTHeIU/edit?usp=sharing

Thanks for the link!

> Besides administrative questions and contributions to FLOSS projects, at
> FLUSP, I’ve been mentoring people who want to start contributing to the
> Linux Kernel and now, to Git, as well.

Nice! Do you have links about that?

> # The Project
>
> As direct as possible, the goal with this project is to make more of
> Git’s codebase thread-safe, so that we can improve parallelism in
> various commands. The motivation behind this are the complaints from
> developers experiencing slow Git commands when working with large
> repositories[1], such as chromium and Android. And since nowadays, most
> personal computers have multi-core CPUs, it is a natural step trying to
> improve parallel support so that we can better use the available resources.
>
> With this in mind, pack access code is a good target for improvement,
> since it’s used by many Git commands (e.g., checkout, grep, blame, diff,
> log, etc.). This section of the codebase is still sequential and has
> many global states, which should be protected before we can work to
> improve parallelism.

I think it's better if global state can be made local or perhaps
removed, rather than protected (though of course that's not always
possible).

> ## The Pack Access Code
>
> To better describe what the pack access code is, we must talk about
> Git’s object storing (in a simplified way):

Maybe s/storing/storage/

> Besides what are called loose objects,

s/loose object/loose object files/

> Git has a very optimized mechanism to compactly store
> objects (blobs, trees, commits, etc.) in packfiles[2]. These files are
> created by[3]:
>
> 1. listing objects;
> 2. sorting the list with some good heuristics;
> 3. traversing the list with a sliding window to find similar objects in
> the window, in order to do delta decomposing;
> 4. compress the objects with zlib and write them to the packfile.
>
> What we are calling pack access code in this document, is the set of
> functions responsible for retrieving the objects stored at the
> packfiles. This process consists, roughly speaking, in three parts:
>
> 1. Locate and read the blob from packfile, using the index file;
> 2. If the blob is a delta, locate and read the base object to apply the
> delta on top of it;
> 3. Once the full content is read, decompress it (using zlib inflate).
>
> Note: There is a delta cache for the second step so that if another
> delta depends on the same base object, it is already in memory. This
> cache is global; also, the sliding windows, are global per packfile.

Yeah, but the sliding windows are used only when creating pack files,
not when reading them, right?

> If these steps were thread-safe, the ability to perform the delta
> reconstruction (together with the delta cache lookup) and zlib inflation
> in parallel could bring a good speedup. At git-blame, for example,
> 24%[4] of the time is spent in the call stack originated at
> read_object_file_extended. Not only this but once we have this big
> section of the codebase thread-safe, we can work to parallelize even
> more work at higher levels of the call stack. Therefore, with this
> project, we aim to make room for many future optimizations in many Git
> commands.

Nice.

> # Plan
>
> I will probably be working mainly with packfile.c, sha1-file.c,
> object-store.h, object.c and pack.h, however, I may also need to tackle
> other files. I will be focusing on the following three pack access call
> chains, found in git-grep and/or git-blame:
>
> read_object_file → repo_read_object_file → read_object_file_extended →
> read_object → oid_object_info_extended → find_pack_entry →
> fill_pack_entry → find_pack_entry_one → bsearch_pack and
> nth_packed_object_offset
>
> oid_object_info → oid_object_info_extended → 
>
> read_object_with_reference → read_object_file → 
>
> Ideally, at the end of the project, it will be possible to call
> read_object_file, oid_object_info and read_object_with_reference in a
> thread-safe way, so that these operations can later be performed in
> parallel.
>
> Here are some threads on Git’s mailing list where I started discussing
> my project:
>
> * 
> https://public-inbox.org/git/CAHd-oW7onvn4ugEjXzAX_OSVEfCboH3-FnGR00dU8iaoc+b8=q...@mail.gmail.com/
> * 
>

[GSoC][RFC] Proposal: Make pack access code thread-safe

2019-04-07 Thread Matheus Tavares Bernardino
Hi, everyone

This is my proposal for GSoC with the subject "Make pack access code
thread-safe". I'm late in schedule but I would like to ask for your
comments on it. Any feedback will be highly appreciated.

The "rendered" version can be seen here:
https://docs.google.com/document/d/1QXT3iiI5zjwusplcZNf6IbYc04-9diziVKdOGkTHeIU/edit?usp=sharing

I kindly ask you to read the text at the Google Docs link, because I
noticed the conversion to plain text discards some information :(
But for those who prefer to comment by email, here it is:

Thanks,
Matheus Tavares

===

Making pack access code thread-safe
April, 2019

#Contact Info

Name Matheus Tavares Bernardino
Timezone GMT-3
Email matheus.bernard...@usp.br
IRC Nick matheustavares on #git-devel
Phone [...]
Postal address [...]
Github https://github.com/MatheusBernardino/
Gitlab https://gitlab.com/MatheusTavares

# About me

I’m a senior student at the University of São Paulo (USP), pursuing a
Bachelor’s degree in Computer Science. Currently, I’m at the end of a
one-year undergraduate research project in High-Performance Computing,
whose goal was to accelerate astrophysical software for black hole
studies using GPUs. I’m also working as a teaching assistant on
IME-USP’s Concurrent and Parallel Programming course, giving lectures
and developing/grading programming assignments. Besides parallel and
high-performance computing, I’m very passionate about software
development in general, especially low-level coding and FLOSS.

# About me and FLOSS

## Linux Kernel

Last year, I started contributing to the Linux Kernel in the IIO
subsystem, together with a group of colleagues. I worked with another
student to move the ad2s90 module out of the staging area into the
Kernel’s mainline, which we accomplished by the end of the year. In
total, I authored 11 patches and co-authored 3 (all of which are already
in Torvalds’ repo). If you want to know more about my contributions to
the Linux Kernel, take a look at the Appendix section.

## FLUSP: FLOSS at USP

After the amazing experience of contributing to the Linux Kernel, we
decided to found FLUSP (FLOSS at USP), a group open to undergraduate
and graduate students that aims to contribute to FLOSS projects. Since
then, the group has grown and evolved a lot: currently, we have members
contributing to the Kernel, GCC, IGT GPU Tools, Git and some projects of
our own, such as KernelWorkflow. And in recognition of our endeavor
with free software, we have received donations from Analog Devices and
DigitalOcean.

Besides administrative questions and contributions to FLOSS projects, at
FLUSP, I’ve been mentoring people who want to start contributing to the
Linux Kernel and now, to Git, as well.

# About me and Git

I joined the Git community in February and, so far, I have sent the
following patches:

clone: test for our behavior on odd objects/* content
clone: better handle symlinked files at .git/objects/
dir-iterator: add flags parameter to dir_iterator_begin
clone: copy hidden paths at local clone
clone: extract function from copy_or_link_directory
clone: use dir-iterator to avoid explicit dir traversal
clone: Replace strcmp by fspathcmp

And three more patches for git.github.io:

rn-50: Add git-send-email links to light readings
SoC-2019-Microprojects: Remove git-credential-cache
SoC-2019-Microprojects: Remove all trailing spaces

Participating at FLUSP, I’ve also been part of some Git related activities:

* I actively helped organize a Git workshop for newcomer students.
* I’ve written an article on our website to help people configure and
use git-send-email to send patches.
* I’ve been writing a ‘First steps in Git’ article (not finished yet),
in which I’m recording what I’ve learned in the Git community so far,
from downloading the source, subscribing to the mailing list and
joining the IRC channel, to using travis-ci and sending patches.

# The Project

Put as directly as possible, the goal of this project is to make more of
Git’s codebase thread-safe, so that we can improve parallelism in
various commands. The motivation behind this is the complaints from
developers experiencing slow Git commands when working with large
repositories[1], such as chromium and Android. And since most personal
computers nowadays have multi-core CPUs, it is a natural step to try to
improve parallel support so that we can make better use of the
available resources.

With this in mind, the pack access code is a good target for
improvement, since it’s used by many Git commands (e.g., checkout, grep,
blame, diff, log, etc.). This section of the codebase is still
sequential and has a lot of global state, which should be protected
before we can work to improve parallelism.

## The Pack Access Code

To better describe what the pack access code is, we must talk about
Git’s object storing (in a

Re: [GSoC][RFC v3] Proposal: Improve consistency of sequencer commands

2019-04-07 Thread Rohit Ashiwal
Hey Chris!

On Sun, 7 Apr 2019 09:15:30 +0200 Christian Couder  
wrote:

> As we are close to the deadline (April 9th) for proposal submissions,
> I think it's a good idea to already upload your draft proposal on the
> GSoC site. I think you will be able to upload newer versions until the
> deadline, but uploading soon avoids possible last-minute issues and
> mistakes.

Sure, I'll upload my proposal as soon as possible.

> It looks like you copy pasted the Git Rev News article without
> updating the content. The improvement has been released a long time
> ago.

The intention was to document how the project started and the *major*
milestones or turning points of the project. Here they are.

> Maybe s/rebases/rebase/

Yes, :P

> It seems to me that there has been more recent work than this and also
> perhaps interesting suggestions and discussions about possible
> sequencer related  improvements on the mailing list.

Again, the idea was to document the earlier stages of the project;
"recent" discussions have been about optimizations, which are not
exactly relevant.

Should I write more about recent developments?

Regards
Rohit



Re: [GSoC][RFC v3] Proposal: Improve consistency of sequencer commands

2019-04-07 Thread Christian Couder
Hi Rohit,

On Fri, Apr 5, 2019 at 11:32 PM Rohit Ashiwal
 wrote:
>
> Here is one more iteration of my draft proposal[1]. RFC.

Nice, thanks for iterating on this!

As we are close to the deadline (April 9th) for proposal submissions,
I think it's a good idea to already upload your draft proposal on the
GSoC site. I think you will be able to upload newer versions until the
deadline, but uploading soon avoids possible last-minute issues and
mistakes.

In the version you upload, please add one or more links to the
discussion of your proposal on the mailing list.

> ### List of Contributions at Git:
>
> Repo  |Status  |Title
> --||---
> [git/git][8]  | [Will merge in master][13] | 
> [Micro][3]**:** Use helper functions in test script
> [git-for-windows/git][9]  | Merged and released| 
> [#2077][4]**:** [FIX] git-archive error, gzip -cn : command not found.
> [git-for-windows/build-extra][10] | Merged and released| 
> [#235][5]**:** installer: Fix version of installer and installed file.

Nice!

>  Overview
>
> Since it was created in 2005, the `git rebase` command has been
> implemented using shell scripts that call other git commands: `git
> format-patch` to create a patch series for some commits and then `git
> am` to apply the series on top of a different commit in the case of a
> regular rebase, while the interactive rebase calls `git cherry-pick`
> repeatedly for the same purpose.
>
> Neither of these approaches has been very efficient, though, and the
> main reason is that repeatedly calling a git command has a significant
> overhead. Even the regular git rebase would do that, as `git am` had
> been implemented by launching `git apply` on each of the patches.
>
> The overhead is especially big on Windows where creating a new process is 
> quite
> slow, but even on other Operating Systems it requires setting up everything
> from scratch, then reading the index from disk, and then, after performing 
> some
> changes, writing the index back to the disk.
>
> Stephan Beyer tried to introduce git-sequencer as his GSoC
> 2008 [project][6], which executed a sequence of git instructions; the
> sequence was given by a file or through `stdin`. git-sequencer was intended
> to become the common backend for git-am, git-rebase and other git commands,
> improving performance by eliminating the need to spawn a new process for
> each step.
>
> Unfortunately, most of the code did not get merged during the SoC period,
> but he continued contributing to the project along with Christian Couder
> and then-mentor Daniel Barkalow.
>
> The project was continued by Ramkumar Ramachandra in
> [2011][7], extending its domain to git-cherry-pick. The sequencer code got
> merged and it was now possible to "continue" and "abort" when cherry-picking 
> or
> reverting many commits.
>
> A patch series by Christian Couder was merged in
> [2016][16] to the `master` branch that makes `git am` call `git apply`’s
> internal functions without spawning the latter as a separate process. So the
> regular rebase will be significantly faster especially on Windows and for big
> repositories in the next Git feature release.

It looks like you copy pasted the Git Rev News article without
updating the content. The improvement has been released a long time
ago.

> Despite the success (of GSoC '11), Dscho had to improve a lot of things to 
> make
> it possible to reuse the sequencer in the interactive rebases making it 
> faster.

Maybe s/rebases/rebase/

> His work can be found [here][15].

It seems to me that there has been more recent work than this and also
perhaps interesting suggestions and discussions about possible
sequencer-related improvements on the mailing list.

> The lessons learned from all this prior work will give me a huge head start
> this year.
>
> As of now, there are still some inconsistencies among these commands, e.g.,
> there is no `--skip` flag in `git-cherry-pick` while one exists for
> `git-rebase`. This project aims to remove inconsistencies in how the command
> line options are handled.


[GSoC][RFC v3] Proposal: Improve consistency of sequencer commands

2019-04-05 Thread Rohit Ashiwal
Hiya

Here is one more iteration of my draft proposal[1]. RFC.

Thanks
Rohit

[1]: https://gist.github.com/r1walz/5588d11065d5231ee451c0136400610e

-- >8 --


# Improve consistency of sequencer commands

## About Me

### Personal Information

Name   | Rohit Ashiwal
---|---
Major  | Computer Science and Engineering
E-mail | \
IRC| __rohit
Skype  | rashiwal
Ph no  | [ ph_no ]
Github | [r1walz](https://github.com/r1walz/)
Linkedin   | [rohit-ashiwal](https://linkedin.com/in/rohit-ashiwal/)
Address| [ Address ]
Postal Code| [ postal_code ]
Time Zone  | IST (UTC +0530)


### Background

I am a sophomore at the [Indian Institute of Technology Roorkee][1], pursuing my
bachelor's degree in Computer Science and Engineering. I was introduced to
programming at a very early stage of my life. Since then, I've been trying out
new technologies by taking up various projects and participating in contests.  I
am passionate about system software development and competitive programming, and
I also actively contribute to open-source projects. At college, I joined the
Mobile Development Group ([MDG][2]), IIT Roorkee - a student group that fosters
mobile development within the campus. I have been an active part of the Git
community since February of this year, contributing to git-for-windows.


### Dev-Env

I am fluent in C/C++, Java, and shell scripting; I can also program in Python
and JavaScript. I use both Ubuntu 18.04 and Windows 10 x64 on my laptop. I
prefer Linux for development unless the work is specific to Windows. \
VCS **:** git \
Editor **:** VS Code with gdb integrated


## Contributions to Open Source

Contributing to open source has taught me to understand the flow of an
unfamiliar codebase quickly and has enabled me to fix bugs and add new
features.

### List of Contributions at Git:

Repo | Status | Title
---|---|---
[git/git][8] | [Will merge in master][13] | [Micro][3]**:** Use helper functions in test script
[git-for-windows/git][9] | Merged and released | [#2077][4]**:** [FIX] git-archive error, gzip -cn : command not found.
[git-for-windows/build-extra][10] | Merged and released | [#235][5]**:** installer: Fix version of installer and installed file.


## The Project

### _Improve consistency of sequencer commands_

#### Overview

Since it was created in 2005, the `git rebase` command has been implemented
using shell scripts that call other git commands: for a regular rebase,
`git format-patch` creates a patch series for some commits and `git am`
applies that series on top of a different commit, while an interactive rebase
calls `git cherry-pick` repeatedly to the same effect.
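The am-based flow described above can be sketched very roughly with a few
commands (illustrative only: the repo and branch names `base`/`topic` are made
up, and the real git-rebase script additionally handles conflicts, flags, and
resumption):

```shell
# Throwaway repo to demonstrate on (illustrative names only).
cd "$(mktemp -d)" && git init -q .
git config user.email you@example.com && git config user.name You
echo one > a.txt && git add a.txt && git commit -qm root
git branch base                                      # "base" stays at root
git checkout -qb topic
echo two > b.txt && git add b.txt && git commit -qm topic-commit
git checkout -q base
echo three > c.txt && git add c.txt && git commit -qm base-commit

# The am-based rebase flow: export commits as patches, then re-apply them.
git format-patch --stdout base..topic > series.mbox  # patch series for "topic"
git checkout -q --detach base                        # start from the new base
git am -q series.mbox                                # re-apply each patch in turn
```

Afterwards `git log` shows `topic-commit` re-applied on top of `base-commit`,
which is what `git rebase base topic` achieves in the simple case.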

Neither of these approaches has been very efficient, though; the main reason
is that repeatedly calling a git command has a significant overhead. Even the
regular git rebase would do that, as `git am` had been implemented by
launching `git apply` on each of the patches.

The overhead is especially big on Windows, where creating a new process is
quite slow, but even on other operating systems it requires setting up
everything from scratch, reading the index from disk, and then, after
performing some changes, writing the index back to disk.

Stephan Beyer tried to introduce git-sequencer as his GSoC
2008 [project][6], which executed a sequence of git instructions; the
sequence was given by a file or through `stdin`. git-sequencer was intended
to become the common backend for git-am, git-rebase and other git commands,
improving performance by eliminating the need to spawn a new process for
each step.

Unfortunately, most of the code did not get merged during the SoC period,
but he continued contributing to the project along with Christian Couder
and then-mentor Daniel Barkalow.

The project was continued by Ramkumar Ramachandra in
[2011][7], extending its domain to git-cherry-pick. The sequencer code got
merged and it was now possible to "continue" and "abort" when cherry-picking or
reverting many commits.

A patch series by Christian Couder was merged to `master` in [2016][16] that
makes `git am` call `git apply`’s internal functions without spawning the
latter as a separate process, making the regular rebase significantly faster,
especially on Windows and for big repositories.

Despite the success (of GSoC '11), D

Re: [GSoC][RFC] proposal: convert git-submodule to builtin script

2019-04-04 Thread Christian Couder
Hi,

On Tue, Apr 2, 2019 at 10:34 PM Khalid Ali  wrote:
>
> My name is Khalid Ali and I am looking to convert the git-submodule to
> a builtin C script. The link below contains my first proposal draft
> [1] and my microproject is at [2]. My main concern is that my second
> task is not verbose enough. I am not sure if I should add a specific
> breakdown of large items within the submodule command.

There was a GSoC project about the same subject a few years ago:

https://public-inbox.org/git/CAME+mvXtA6iZNfErTX5tYB-o-5xa1yesAG5h=ip_z2_zl_k...@mail.gmail.com/

I think you should take a look at the work that was done (merged and
not merged) and report about it in your proposal.

Thanks,
Christian.


Re: [GSoC][RFC] proposal: convert git-submodule to builtin script

2019-04-03 Thread Khalid Ali
First of all, thank you so much for the detailed feedback. I wasn't sure
how much to include in the proposal, but I see it still needs a lot of work.

> When you talk about "Convert each main task in git-submodule into a C
> function." and "If certain functionality is missing, add it to the correct
> script.", it is a good idea to back that up by concrete examples.
>
> Like, study `git-submodule.sh` and extract the list of "main tasks", and
> then mention that in your proposal. I see that you listed 9 main tasks,
> but it is not immediately clear whether you extracted that list from the
> usage text, from the manual page, or from the script itself. If the latter
> (which I think would be the best, given the goal of converting the code in
> that script), it would make a ton of sense to mention the function names
> and maybe add a permalink to the corresponding code (you could use e.g.
> GitHub's permalinks).

Yes, I actually did extract the tasks straight from git-submodule.sh. I will
definitely add the appropriate function names and permalinks to the
proposal.

> And then look at one of those main tasks, think of something that you
> believe should be covered in the test suite, describe it, then figure out
> whether it is already covered. If it is, mention that, together with the
> location, otherwise state which script would be the best location, and
> why.

Ah, alright. I'll have a look at the test suite to see what is covered and
include a section in my proposal.

> Besides, if you care to have a bit of a deeper look into the
> `git-submodule.sh` script, you will see a peculiar pattern in some of the
> subcommands, e.g. in `cmd_foreach`:
> https://github.com/git/git/blob/v2.21.0/git-submodule.sh#L320-L349
>
> Essentially, it spends two handfuls of lines on option parsing, and then
> the real business logic is performed by the `submodule--helper`, which is
> *already* a built-in.
>
> Even better: most of that business logic is implemented in a file that has
> the very file name you proposed already: `submodule.c`.
>
> So if I were you, I would add a section to your proposal (which in the end
> would no doubt dwarf the existing sections) that has as subsections each
> of those commands in `git-submodule.sh` that do *not* yet follow this
> pattern "parse options then hand off to submodule--helper".
>
> I would then study the commit history of the ones that *do* use the
> `submodule--helper` to see how they were converted, what conventions were
> used, whether there were recurring patterns, etc.
>
> In each of those subsections, I would then discuss what the
> still-to-be-converted commands do, try to find the closest command that
> already uses the `submodule--helper`, and then assess what it would take
> to convert them, how much code it would probably need, whether it could
> reuse parts that are already in `submodule.c`, etc.

I definitely noticed the option parsing in multiple parts of the function, but
the pattern didn't click until you mentioned it. I'll do as you recommended
and take a look at submodule.c to see how the code and functionality in
git-submodule.sh can be merged.

> Judging from past projects to convert scripts to C, I would say that the
> most successful strategy was to chomp off manageable parts and move them
> from the script to C. I am sure that you will find tons of good examples
> for this strategy by looking at the commit history of `git-submodule.sh`
> and then searching for the corresponding patches in the Git mailing list
> archive (e.g. https://public-inbox.org/git/).
>
> Do not expect those "chomped off" parts to hit `master` very quickly,
> though. Most likely, you would work on one patch series (very closely with
> your mentor at first, to avoid unnecessary blocks and to get a better feel
> for the way the Git community works right from the start), then, when that
> patch series is robust and solid and ready to be contributed, you would
> send it to the Git mailing list and immediately start working on the next
> patch series, all the while the reviews will trickle in. Those reviews
> will help you to improve the patch series, and it is a good idea to
> incorporate the good suggestions, and to discuss the ones you think are
> not necessary, for a few days before sending the next patch series
> iteration.
>
> Essentially, you will work in parallel on a few patch series at all times.
> Those patch series stack on top of each other, and they should, one after
> the other, make it into `pu` first, then, when they are considered ready
> for testing into `next`, and eventually to `master`. Whenever you
> contribute a new patch series iteration, you then rebase the remaining
> patch series on top. 

Re: [GSoC][RFC] proposal: convert git-submodule to builtin script

2019-04-03 Thread Johannes Schindelin
Hi,

On Tue, 2 Apr 2019, Khalid Ali wrote:

> My name is Khalid Ali and I am looking to convert the git-submodule to
> a builtin C script. The link below contains my first proposal draft
> [1] and my microproject is at [2]. My main concern is that my second
> task is not verbose enough. I am not sure if I should add a specific
> breakdown of large items within the submodule command.

Nice!

Please note that while I used to be the mentor who basically helped all of
the GSoC/Outreachy students through their "convert to built-in" projects
in the recent years, I am not available to mentor this year.

Having said that, I think I can help you to improve your proposal.

When you talk about "Convert each main task in git-submodule into a C
function." and "If certain functionality is missing, add it to the correct
script.", it is a good idea to back that up by concrete examples.

Like, study `git-submodule.sh` and extract the list of "main tasks", and
then mention that in your proposal. I see that you listed 9 main tasks,
but it is not immediately clear whether you extracted that list from the
usage text, from the manual page, or from the script itself. If the latter
(which I think would be the best, given the goal of converting the code in
that script), it would make a ton of sense to mention the function names
and maybe add a permalink to the corresponding code (you could use e.g.
GitHub's permalinks).

And then look at one of those main tasks, think of something that you
believe should be covered in the test suite, describe it, then figure out
whether it is already covered. If it is, mention that, together with the
location, otherwise state which script would be the best location, and
why.

Further, I would like to caution you about "If there is still some
time"... The `git-submodule.sh` script weighs in with just over 1,000
lines. We had three GSoC projects to convert scripts last year; the
converted scripts weighed in (at the time) at 750 lines for
`git-stash.sh`, 674 lines for `git-rebase.sh` and 1,036 lines for
`git-rebase--interactive.sh`, respectively. That last number should be
taken with a big grain of salt, as it is not quite the number of lines that
were converted: as part of the GSoC project, the
`git-rebase--preserve-merges.sh` script was split out, never intended to
be converted, but to be deprecated instead (in favor of `git rebase -r`),
and there were "only" some 283 lines to be converted to C remaining after
that.

Out of those three, the project converting the smallest number of lines
clearly got integrated first (and there was actually time to do more stuff
in that project, and those things are partially still being cooked). The
converted `git stash` is still not in `master`...

So... converting 1,000 lines of code is quite a challenge for 3 months.

Having said that, I would not consider your project a failure if even
"only" as much as half of the lines of code were converted to C.

Besides, if you care to have a bit of a deeper look into the
`git-submodule.sh` script, you will see a peculiar pattern in some of the
subcommands, e.g. in `cmd_foreach`:
https://github.com/git/git/blob/v2.21.0/git-submodule.sh#L320-L349

Essentially, it spends two handfuls of lines on option parsing, and then
the real business logic is performed by the `submodule--helper`, which is
*already* a built-in.
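That pattern can be sketched as follows (a simplified, hypothetical
subcommand for illustration; the real `cmd_foreach` in `git-submodule.sh`
parses more options and does extra bookkeeping):

```shell
# Simplified sketch of a git-submodule.sh subcommand: a few lines of
# option parsing, then hand everything off to the submodule--helper built-in.
cmd_foreach()
{
	recursive=
	while test $# -ne 0
	do
		case "$1" in
		--recursive)
			recursive=--recursive
			;;
		--)
			shift
			break
			;;
		*)
			break
			;;
		esac
		shift
	done
	# All of the real work happens in C:
	git submodule--helper foreach $recursive -- "$@"
}
```

The conversion work then amounts to moving each remaining shell-only
subcommand over to this shape, one by one.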

Even better: most of that business logic is implemented in a file that has
the very file name you proposed already: `submodule.c`.

So if I were you, I would add a section to your proposal (which in the end
would no doubt dwarf the existing sections) that has as subsections each
of those commands in `git-submodule.sh` that do *not* yet follow this
pattern "parse options then hand off to submodule--helper".

I would then study the commit history of the ones that *do* use the
`submodule--helper` to see how they were converted, what conventions were
used, whether there were recurring patterns, etc.

In each of those subsections, I would then discuss what the
still-to-be-converted commands do, try to find the closest command that
already uses the `submodule--helper`, and then assess what it would take
to convert them, how much code it would probably need, whether it could
reuse parts that are already in `submodule.c`, etc.

> Outside of the draft, I was wondering whether this should be
> implemented through multiple patches to the master branch or through a
> separate, long-running feature branch that will be merged at the end
> of the GSoC timeline?

Judging from past projects to convert scripts to C, I would say that the
most successful strategy was to chomp off manageable parts and move them
from the script to C. I am sure that you will find tons of good examples
for this strategy by looking at the commit history of `git-submodule.sh`
and then searching for the corresponding patches in the Git mai

[GSoC][RFC] proposal: convert git-submodule to builtin script

2019-04-02 Thread Khalid Ali
Hi,

My name is Khalid Ali and I am looking to convert the git-submodule to
a builtin C script. The link below contains my first proposal draft
[1] and my microproject is at [2]. My main concern is that my second
task is not verbose enough. I am not sure if I should add a specific
breakdown of large items within the submodule command.

Outside of the draft, I was wondering whether this should be
implemented through multiple patches to the master branch or through a
separate, long-running feature branch that will be merged at the end
of the GSoC timeline?

Feedback is greatly appreciated!

[1] 
https://docs.google.com/document/d/1olGG8eJxFoMNyGt-4uMiTD3LjRYx15pttg67AJYliu8/edit?usp=sharing
[2] https://public-inbox.org/git/20190402014115.22478-1-khalludi...@gmail.com/



Re: [GSoC][RFC] Proposal: Improve consistency of sequencer commands

2019-03-23 Thread Rohit Ashiwal
> Deprecating am-based rebases only takes a little more
> work, but it might expand to use up a lot of time.
> 
> > Relevant Work
> > =
> > Dscho and I had a talk on how a non-am backend should implement `git rebase
> > --whitespace=fix`, which he warned may become a large project (as it turns
> > out it is a sub-task in one of the proposed ideas[0]), we were trying to
> > integrate this on git-for-windows first.
> > Keeping that warning in mind, I discussed this project with Rafael, and he
> > suggested (with a bit of uncertainty) that I should work on implementing a
> > git-diff flag that generates a patch which, when applied, will remove the
> > whitespace errors; I am currently working on this.
> 
> It's awesome that you're looking in to this, but it may make more
> sense to knock out the easy parts of this project first.  That way the
> project gets some value out of your work for sure, you gain confidence
> and familiarity with the codebase, and then you can tackle the more
> difficult items.  Of course, if you're just exploring to learn what's
> possible in order to write the proposal, that's fine, I just think
> once you start on this project, it'd make more sense to do the easier
> ones first.

Yes, I'm looking into the code to get some clear vision.

> Hope that helps,
Yes! The vision is now clearer. Thanks, Elijah. :)
> Elijah

Thanks for the review
Rohit


Re: [GSoC][RFC] Proposal: Improve consistency of sequencer commands

2019-03-23 Thread Rohit Ashiwal
Hi Christian

On 2019-03-23 22:17 UTC Christian Couder <> wrote:
> On Fri, Mar 22, 2019 at 4:12 PM Rohit Ashiwal
>  wrote:
> >
> > Hey People
> >
> > I am Rohit Ashiwal and here is my first draft of the proposal for the project
> > titled: `Improve consistency of sequencer commands' this summer. I need your
> > feedback and more than that I need help to improve the timeline of this
> > proposal since it looks very weak. Basically, it lacks the "how" component
> > as I don't know much about the codebase in detail.
> >
> > Thanks
> > Rohit
> >
> > PS: Point one is missing in the timeline from the ideas page[0], can someone
> > explain what exactly it wants?
> 
> You mean this point from the idea page:
> 
> "The suggestion to fix an interrupted rebase-i or cherry-pick due to a
> commit that became empty via git reset HEAD (in builtin/commit.c)
> instead of git rebase --skip or git cherry-pick --skip ranges from
> annoying to confusing. (Especially since other interrupted am’s and
> rebases both point to am/rebase –skip.). Note that git cherry-pick
> --skip is not yet implemented, so that would have to be added first."

Yes.

> or something else?
> 
> By the way it is not very clear if the proposal uses markdown or
> another related format and if it is also possible (and perhaps even
> better visually) to see it somewhere else (maybe on GitHub). If that's
> indeed possible, please provide a link. It is a good thing though to
> still also send it attached to an email, so that it can be easily
> reviewed and commented on by people who prefer email discussions.

This was intentional. Here is a link to the proposal hosted at
gist.github.com[1]; for those who prefer a text-only version, [2] is the
mailing list link.

> > List of Contributions at Git:
> > -
> > Status: Merge in next revision
> 
> Maybe "Merged into the 'next' branch"
> 
> > git/git:
> > [Micro](3): Use helper functions in test script.
> 
> Please give more information than that, for example you could point to
> the commit in the next branch on GitHub and perhaps to the what's
> cooking email from Junio where it can be seen that the patch has been
> merged into next and what's its current status.

The current proposal adds links to those commits.

> > Status: Merged
> > git-for-windows/git:
> > [#2077](4): [FIX] git-archive error, gzip -cn : command not found.

This was released in v2.21.0 [3]

> > Status: Merged
> > git-for-windows/build-extra:
> > [#235](5): installer: Fix version of installer and installed file.
> 
> For Git for Windows contributions I think a link to the pull request
> is enough. It could be nice to know though if the commits are part of
> a released version.

> > The Project: `Improve consistency of sequencer commands'
> > 
> >
> > Overview
> > 
> > git-sequencer was introduced by Stephan Beyer  as his
> > GSoC 2008 project[6]. It executed a sequence of git instructions to  
> > or  and the sequence was given by a  or through stdin. The
> > git-sequencer wants to become the common backend for git-am, git-rebase
> > and other git commands. The project was continued by Ramkumar 
> > 
> > in 2011[7], converting it to a builtin and extending its domain to 
> > git-cherry-pick.
> 
> Yeah, you can say that it was another GSoC project and maybe give his
> full name (Ramkumar Ramachandra).
> 
> There has been more related work extending usage of the sequencer
> after these GSoC projects, at least from Dscho and maybe from Alban
> Gruin and Elijah too. It would be nice if you could document that a
> bit.
> 
> > As of now, there are still some inconsistencies among these commands, e.g.,
> > there is no `--skip` flag in `git-cherry-pick` while one exists for 
> > `git-rebase`.
> > This project aims to remove inconsistencies in how the command line options 
> > are
> > handled.
> 
> 
> > Points to work on:
> > --
> > - Add `git cherry-pick --skip`
> > - Implement flags that am-based rebases support, but not interactive
> >   or merge based, in interactive/merge based rebases
> 
> Maybe the flags could be listed.
> 
> > - [Bonus] Deprecate am-based rebases
> > - [Bonus] Make a flag to allow rebase to rewrite commit messages that
> >   refer to older commits that were also rebased
> 
> This part of your proposal ("Points to work on") looks weak to me.
> 
> Please try to

Re: [GSoC][RFC] Proposal: Improve consistency of sequencer commands

2019-03-23 Thread Elijah Newren
Hi Rohit!

On Fri, Mar 22, 2019 at 8:12 AM Rohit Ashiwal
 wrote:
>
> Hey People
>
> I am Rohit Ashiwal and here is my first draft of the proposal for the project
> titled: `Improve consistency of sequencer commands' this summer. I need your
> feedback and more than that I need help to improve the timeline of this
> proposal since it looks very weak. Basically, it lacks the "how" component
> as I don't know much about the codebase in detail.
>
> Thanks
> Rohit
>
> PS: Point one is missing in the timeline from the ideas page[0], can someone
> explain what exactly it wants?

I don't understand the question; could you restate it?

> Points to work on:
> --
> - Add `git cherry-pick --skip`

I'd reword this section as 'Consistently suggest --skip for operations
that have such a concept'.[1]

Adding a --skip flag to cherry-pick is useful, but was only meant as a
step.  Let me explain in more detail and use another comparison point.
Each of the git commands cherry-pick, merge, rebase take the flags
"--continue" and "--abort"; but they didn't always do so and so
continuing or aborting an operation often used special case-specific
commands for each (e.g. git reset --hard (or later --merge) to abort a
merge, git commit to continue it, etc.)  Those commands don't
necessarily make sense to users, whereas ' --continue' and
' --abort' do make intuitive sense and are thus memorable.
We want the same for --skip.

Both am-based rebases and am itself will give advice to the user to
use 'git rebase --skip' or 'git am --skip' when a patch isn't needed.
That's good.  In contrast, interactive-based rebases and cherry-pick
will suggest that the user run 'git reset' (with no arguments). The
place that suggests that command should instead suggest either 'git
rebase --skip' or 'git cherry-pick --skip', depending on which
operation is in progress.  The first step for doing that, is making
sure that cherry-pick actually has a '--skip' option.
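Concretely, the consistent flow argued for here looks like this (a sketch
with made-up commit names; it assumes `git cherry-pick --skip` exists, which
was exactly the point of the proposal and eventually shipped in Git 2.23):

```shell
# Throwaway repo (illustrative names only).
cd "$(mktemp -d)" && git init -q .
git config user.email you@example.com && git config user.name You
echo a > f && git add f && git commit -qm base
git checkout -qb topic
echo b > f && git commit -qam conflicting         # rewrites f
echo x > g && git add g && git commit -qm wanted  # adds g
git checkout -q -                                 # back to the original branch
echo c > f && git commit -qam ours                # f diverges: the pick conflicts

git cherry-pick topic~2..topic || true  # stops on the conflicting commit
git cherry-pick --skip                  # drop it and continue with the rest
```

The sequencer drops the conflicting commit and goes on to apply `wanted`,
matching how `git rebase --skip` and `git am --skip` already behave.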

> - Implement flags that am-based rebases support, but not interactive
>   or merge based, in interactive/merge based rebases

The "merge-based" rebase backend was deleted in 2.21.0, with all its
special flags reimplemented on the top of the interactive backend.  So
we can omit the deleted backend from the descriptions (instead just
talk about the am-based and interactive backends).

> - [Bonus] Deprecate am-based rebases
> - [Bonus] Make a flag to allow rebase to rewrite commit messages that
>   refer to older commits that were also rebased

I'd reorder these two.  I suspect the second won't be too hard and
will provide a new user-visible feature, while the former will
hopefully not be visible to users; if the former has more than
cosmetic differences visible to users, it might transform the problem
into more of a social problem than a technical one, or just make it into
something we can't do.

> Proposed Timeline
> -
> + Community Bonding (May 6th - May 26th):
> - Introduction to community
> - Get familiar with the workflow
> - Study and understand the workflow and implementation of the project 
> in detail
>
> + Phase 1  (May 27th - June 23rd):
> - Start with implementing `git cherry-pick --skip`
> - Write new tests for the just introduced flag(s)
> - Analyse the requirements and differences of am-based and other 
> rebases flags

Writing or finding tests to trigger all the --skip codepaths might be
the biggest part of this phase.  Implementing `git cherry-pick --skip`
just involves making it run the code that `git reset` invokes.  The
you change the error message to reference ` --skip` instead
of `git reset`.  What you're calling phase 1 here isn't quite
microproject sized, but it should be relatively quick and easy; I'd
plan to spend much more of your time on phase 2.

> + Phase 2  (June 24th - July 21st):
> - Introduce flags of am-based rebases to other kinds.
> - Add tests for the same.

You should probably mention the individual cases from "INCOMPATIBLE
FLAGS" of the git rebase manpage.  Also, some advice for order of
tackling these: I think you should probably do --ignore-whitespace
first; my guess is that one is the easiest.  Close up would be
--committer-date-is-author-date and --ignore-date.  Re-reading, I'm
not sure -C even makes sense at all; it might be that the solution is
just accepting the flag and ignoring it, or perhaps it remains the one
flag the interactive backend won't support, or maybe there is
something that makes sense to be done.  There'd need to be a little
investigation for that one, but it might tur

Re: [GSoC][RFC] Proposal: Improve consistency of sequencer commands

2019-03-23 Thread Christian Couder
Hi Rohit,

On Fri, Mar 22, 2019 at 4:12 PM Rohit Ashiwal
 wrote:
>
> Hey People
>
> I am Rohit Ashiwal and here is my first draft of the proposal for the project
> titled: `Improve consistency of sequencer commands' this summer. I need your
> feedback and more than that I need help to improve the timeline of this
> proposal since it looks very weak. Basically, it lacks the "how" component
> as I don't know much about the codebase in detail.
>
> Thanks
> Rohit
>
> PS: Point one is missing in the timeline from the ideas page[0], can someone
> explain what exactly it wants?

You mean this point from the idea page:

"The suggestion to fix an interrupted rebase-i or cherry-pick due to a
commit that became empty via git reset HEAD (in builtin/commit.c)
instead of git rebase --skip or git cherry-pick --skip ranges from
annoying to confusing. (Especially since other interrupted am’s and
rebases both point to am/rebase –skip.). Note that git cherry-pick
--skip is not yet implemented, so that would have to be added first."

or something else?
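The awkwardness described in that point of the ideas page can be reproduced in a throwaway repository. This is a hypothetical sketch (repo layout and names invented); it requires a git with `git cherry-pick --skip`, i.e. >= 2.23, where the flag eventually landed:

```shell
#!/bin/sh
# Hypothetical demo: a cherry-pick stops because the picked commit
# became empty (its change is already present in HEAD), and --skip
# resolves it with the same verb as 'git rebase --skip'.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main                     # -b needs git >= 2.28
git config user.name demo
git config user.email demo@example.com

echo base > file && git add file && git commit -qm base
git checkout -qb side
echo change > file && git commit -qam change-on-side
git checkout -q main
echo change > file && git commit -qam same-change-on-main

# The picked commit's change is already in HEAD, so the cherry-pick
# stops; before --skip existed, the advice here ('git reset' or
# 'git commit --allow-empty') differed from the rebase advice.
git cherry-pick side || echo "cherry-pick stopped: commit is now empty"

git cherry-pick --skip        # consistent with 'git rebase --skip'
git status --short            # clean: nothing left in progress
```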

By the way, it is not very clear whether the proposal uses Markdown or
another related format, and whether it is also possible (and perhaps
even better visually) to see it somewhere else (maybe on GitHub). If
that's indeed possible, please provide a link. It is a good thing,
though, to also send it attached to an email, so that it can be easily
reviewed and commented on by people who prefer email discussions.

> List of Contributions at Git:
> -
> Status: Merge in next revision

Maybe "Merged into the 'next' branch"

> git/git:
> [Micro](3): Use helper functions in test script.

Please give more information than that; for example, you could point to
the commit in the next branch on GitHub, and perhaps to the "What's
cooking" email from Junio where it can be seen that the patch has been
merged into next and what its current status is.

> Status: Merged
> git-for-windows/git:
> [#2077](4): [FIX] git-archive error, gzip -cn : command not found.
>
> Status: Merged
> git-for-windows/build-extra:
> [#235](5): installer: Fix version of installer and installed file.

For Git for Windows contributions I think a link to the pull request
is enough. It could be nice to know though if the commits are part of
a released version.

> The Project: `Improve consistency of sequencer commands'
> 
>
> Overview
> 
> git-sequencer was introduced by Stephan Beyer  as his
> GSoC 2008 project[6]. It executed a sequence of git instructions to  
> or  and the sequence was given by a  or through stdin. The
> git-sequencer wants to become the common backend for git-am, git-rebase
> and other git commands. The project was continued by Ramkumar
> in 2011[7], converting it to a builtin and extending its domain to
> git-cherry-pick.

Yeah, you can say that it was another GSoC project and maybe give his
full name (Ramkumar Ramachandra).

There has been more related work to extend usage of the sequencer
after these GSoC projects, at least from Dscho and maybe from Alban
Gruin and Elijah too. It would be nice if you could document that a
bit.

> As of now, there are still some inconsistencies among these commands, e.g.,
> there is no `--skip` flag in `git-cherry-pick` while one exists for 
> `git-rebase`.
> This project aims to remove inconsistencies in how the command line options 
> are
> handled.


> Points to work on:
> --
> - Add `git cherry-pick --skip`
> - Implement flags that am-based rebases support, but not interactive
>   or merge based, in interactive/merge based rebases

Maybe the flags could be listed.

> - [Bonus] Deprecate am-based rebases
> - [Bonus] Make a flag to allow rebase to rewrite commit messages that
>   refer to older commits that were also rebased

This part of your proposal ("Points to work on") looks weak to me.

Please try to add more details about what you plan to do, how you
would describe the new flags in the documentation, which *.c *.h and
test files might be changed, etc.

> Proposed Timeline
> -
> + Community Bonding (May 6th - May 26th):
> - Introduction to community
> - Get familiar with the workflow
> - Study and understand the workflow and implementation of the project 
> in detail
>
> + Phase 1  (May 27th - June 23rd):
> - Start with implementing `git cherry-pick --skip`
> - Write new tests for the just introduced flag(s)
> - Analyse the requirements and differences of am-based and other 
> rebases flags
>
> + Phase 2  (June 24th - July 21st):
> - Introduce flags of am-based

[GSoC][RFC] Proposal: Improve consistency of sequencer commands

2019-03-22 Thread Rohit Ashiwal
Hey People

I am Rohit Ashiwal and here is my first draft of the proposal for the project
titled: `Improve consistency of sequencer commands' this summer. I need your
feedback and more than that I need help to improve the timeline of this
proposal since it looks very weak. Basically, it lacks the "how" component
as I don't know much about the codebase in detail.

Thanks
Rohit

PS: Point one is missing in the timeline from the ideas page[0], can someone
explain what exactly it wants?


##
  Improve consistency of sequencer commands
##


About Me


Personal Information
---+---
Name   | Rohit Ashiwal
Major  | Computer Science and Engineering
E-mail | rohit.ashiwal...@gmail.com
IRC| __rohit
Skype  | rashiwal
Ph no  | [ ph_no ]
Github | r1walz
Linkedin   | rohit-ashiwal
Address| [ Address ]
Postal Code| [ postal_code ]
Time Zone  | IST (UTC +0530)
---+---


Background
--
I am a sophomore at the Indian Institute of Technology Roorkee[1], pursuing
my bachelor's degree in Computer Science and Engineering. I was introduced
to programming at a very early stage of my life. Since then, I've been trying
out new technologies by taking up various projects and participating in 
contests.
I am passionate about system software development and competitive programming,
and I also actively contribute to open-source projects. At college, I joined
the Mobile Development Group [MDG](2), IIT Roorkee - a student group that 
fosters
mobile development within the campus. I have been an active part of the Git
community since February of this year, contributing to git-for-windows.


Dev-Env
---
I am fluent in C/C++, Java, and shell scripting; I can also program in
Python and JavaScript. I use both Ubuntu 18.04 and Windows 10 x64 on my laptop.
I prefer Linux for development unless the work is specific to Windows.
VCS: git
Editor: VS Code with gdb integrated


Contributions to Open Source

My contributions to open source have helped me learn to understand the
flow of existing code at a rapid pace and to edit it or add new
features.

List of Contributions at Git:
-
Status: Merge in next revision
git/git:
[Micro](3): Use helper functions in test script.

Status: Merged
git-for-windows/git:
[#2077](4): [FIX] git-archive error, gzip -cn : command not found.

Status: Merged
git-for-windows/build-extra:
[#235](5): installer: Fix version of installer and installed file.


The Project: `Improve consistency of sequencer commands'


Overview

git-sequencer was introduced by Stephan Beyer  as his
GSoC 2008 project[6]. It executed a sequence of git instructions to  
or  and the sequence was given by a  or through stdin. The
git-sequencer wants to become the common backend for git-am, git-rebase
and other git commands. The project was continued by Ramkumar
in 2011[7], converting it to a builtin and extending its domain to
git-cherry-pick.
As of now, there are still some inconsistencies among these commands, e.g.,
there is no `--skip` flag in `git-cherry-pick` while one exists for 
`git-rebase`.
This project aims to remove inconsistencies in how the command line options are
handled.


Points to work on:
--
- Add `git cherry-pick --skip` 
- Implement flags that am-based rebases support, but not interactive
  or merge based, in interactive/merge based rebases
- [Bonus] Deprecate am-based rebases
- [Bonus] Make a flag to allow rebase to rewrite commit messages that
  refer to older commits that were also rebased


Proposed Timeline
-
+ Community Bonding (May 6th - May 26th):
- Introduction to community
- Get familiar with the workflow
- Study and understand the workflow and implementation of the project 
in detail

+ Phase 1  (May 27th - June 23rd):
- Start with implementing `git cherry-pick --skip`
- Write new tests for the just introduced flag(s)
- Analyse the requirements and differences of am-based and other 
rebases flags

+ Phase 2  (June 24th - July 21st):
- Introduce flags of am-based rebases to other kinds.
- Add tests for the same.

+ Phase 3  (July 22th - August 19th):
- Act on [Bonus] features
- Documentation
- Clean up tasks


Relevant Work
=
D

Re: Proposal: Output should push to different servers in parallel

2019-02-07 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> This seems like a reasonable idea. Until such time as someone submits
> patches to implement this in git, you can do this with some invocation
> of GNU parallel -k, i.e. operate on N remotes in parallel, and use the
> -k option to buffer up all their output and present it in sequence.

Stopping the message there makes it sound like a polite way to say
"a generic tool to allow you doing it on anything, not limited to
Git, is already available, and a solution specific to Git is
unwanted."

I wanted to follow up with something that says "The 'parallel' tool
works in the meantime, but here are examples of very useful things
that we would not be able to live without that 'parallel' wouldn't
let us do, and we need a Git specific solution to obtain that", but
I am coming up empty, so perhaps indeed we do not want a Git
specific solution ;-)



Re: Proposal: Output should push to different servers in parallel

2019-02-07 Thread Ævar Arnfjörð Bjarmason


On Wed, Feb 06 2019, Victor Porton wrote:

> I experienced a slowdown in Git pushing when I push to more than one server.
>
> I propose:
>
> Run push to several servers in parallel.
>
> Do not mix the output; instead, serialize it: for example, cache the
> output of the second server's push and start to output it immediately
> after the first server's push is finished.
>
> This approach combines the advantages of the current way (I suppose it
> is so) of serializing pushes (first push to the first server, then to
> the second, etc.) with my idea of pushing in parallel.
>
> I think the best way would be to use multithreading, but multiprocessing
> would be a good quick solution.

This seems like a reasonable idea. Until such time as someone submits
patches to implement this in git, you can do this with some invocation
of GNU parallel -k, i.e. operate on N remotes in parallel, and use the
-k option to buffer up all their output and present it in sequence.
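The workaround can be sketched against local bare repositories standing in for remote servers. With GNU parallel installed, the idiom described above would be `git remote | parallel -k git push {}`; since parallel may not be available everywhere, this hypothetical demo (repo names invented) uses `xargs -P` as a portable stand-in, which parallelizes the pushes but does not keep the output ordered the way `-k` does:

```shell
#!/bin/sh
# Hypothetical demo: push one commit to two "servers" in parallel.
# a.git and b.git are local bare repositories standing in for real
# remotes, so no network is needed.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare a.git
git init -q --bare b.git

git init -q -b main src                 # -b needs git >= 2.28
cd src
git config user.name demo
git config user.email demo@example.com
echo hello > file && git add file && git commit -qm init
git remote add a ../a.git
git remote add b ../b.git

# Two pushes run concurrently; with GNU parallel the equivalent would
# be:  git remote | parallel -k git push {} main
git remote | xargs -P 2 -I{} git push -q {} main
```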


Proposal: Output should push to different servers in parallel

2019-02-06 Thread Victor Porton

I experienced a slowdown in Git pushing when I push to more than one server.

I propose:

Run push to several servers in parallel.

Do not mix the output; instead, serialize it: for example, cache the
output of the second server's push and start to output it immediately
after the first server's push is finished.


This approach combines the advantages of the current way (I suppose it
is so) of serializing pushes (first push to the first server, then to
the second, etc.) with my idea of pushing in parallel.


I think the best way would be to use multithreading, but multiprocessing
would be a good quick solution.




Proposal

2018-11-12 Thread SGM John Dailey
Hello ,


My name is Sgt Major John Dailey. I am here in Afghanistan , I came
upon a project I think we can work together on. I and my partner (1st
Lt. Daniel Farkas ) have the sum of $15 Million United State Dollars
which we got from a Crude Oil Deal in Iraq before he was killed by an
explosion while on a Vehicle Patrol. Due to this incident, I want you
to receive these funds on my behalf as far as I can be assured that my
share will be safe in your care until I complete my service here in
Afghanistan and come over to meet with you. Since we are working here
for an Official capacity, I cannot keep these funds hence by
contacting you. I Guarantee and Assure you that this is risk free.


I just need your acceptance to help me receive these funds and all is
done. Since the death of my partner, my life is not guaranteed here
anymore, so I have decided to share these funds with you. I am also
offering you 40% of this money for the assistance you will give to me.
One passionate appeal I will make to you, is for you not to discuss
this matter with anybody, should you have reasons to reject this
offer, please and please destroy this message as any leakage of this
information will be too bad for us as soldiers here in Afghanistan. I
do not know how long we will remain here, and I have been shot,
wounded and survived so many suicide bomb attacks, this and other
reasons have prompted me to reach out to you for help. I honestly want
this matter to be resolved immediately, please contact me as soon as
possible on my e-mail address which is my only way of communication.


Yours In Service,
SGM John Dailey


Proposal

2018-09-19 Thread Ms.Lev
I wish to discuss a proposal with you, please contact me via email for more 
details immediately.


Greetings in the name of God, Business proposal in God we trust

2018-07-26 Thread Mrs, Suran Yoda
Greetings in the name of God

Dear Friend


Greetings in the name of God,please let this not sound strange to you
for my only surviving lawyer who would have done this died early this
year.I prayed and got your email id from your country guestbook. I am
Mrs Suran Yoda from London,I am 72 years old,i am suffering from a
long time cancer of the lungs which also affected my brain,from all
indication my conditions is really deteriorating and it is quite
obvious that,according to my doctors they have advised me that i may
not live for the next two months,this is because the cancer stage has
gotten to a very bad stage.I am married to (Dr Andrews Yoda) who
worked with the Embassy of United Kingdom in South

Africa for nine years,Before he died in 2004.

I was bred up from a motherless babies home and was married to my late
husband for Thirty years without a child,my husband died in a fatal
motor accident Before his death we were true believers.Since his death
I decided not to re-marry,I sold all my inherited belongings and
deposited all the sum of $6.5 Million dollars with Bank in South
Africa.Though what disturbs me mostly is the cancer. Having known my
condition I decided to donate this fund to church,i want you as God
fearing person,to also use this money to fund church,orphanages and
widows,I took this decision,before i rest in peace because my time
will so on be up.

The Bible made us to understand that blessed are the hands that
giveth. I took this decision because I don`t have any child that will
inherit this money and my husband's relatives are not Christians and I
don`t want my husband hard earned money to be misused by unbelievers.
I don`t want a situation where these money will be used in an ungodly
manner,hence the reason for taking this bold decision.I am not afraid
of death hence i know where am going.Presently,I'm with my laptop in a
hospital here in London where I have been undergoing treatment for
cancer of the lungs.

As soon as I receive your reply I shall give you the contact of the
Bank.I will also issue you a letter of authority that will prove you
as the new beneficiary of my fund.Please assure me that you will act
accordingly as I stated.Hoping to hear from you soon.

Remain blessed in the name of the Lord.

Yours in Christ,
Mrs Suran Yoda


Proposal

2018-07-15 Thread Miss Victoria Mehmet
Hello

I have a business proposal of mutual benefits i would like to discuss with 
you,i asked before and i still await your positive response thanks.


Proposal

2018-07-12 Thread Miss Victoria Mehmet
Hello

I have a business proposal of mutual benefits i would like to discuss with
you.


Business Proposal

2018-07-05 Thread BRENDA WILSON



I am Sgt.Brenda Wilson, originally from Lake Jackson Texas USA.I personally 
made a special research and I came across your information. I am presently 
writing this mail to you from U.S Military base Kabul Afghanistan I have a 
secured business proposal for you. Reply for more details via my private E-mail 
( brendawilson...@hotmail.com )


Proposal

2018-06-07 Thread Mr. Fawaz KhE. Al Saleh




--
Good day,

i know you do not know me personally but i have checked your profile  
and i see generosity in you, There's an urgent offer attach

to your name here in the office of Mr. Fawaz KhE. Al Saleh Member of
the Board of Directors, Kuveyt Türk Participation Bank  (Turkey) and
head of private banking and wealth management
Regards,
Mr. Fawaz KhE. Al Saleh



BUSINESS INTEREST/ PROPOSAL

2018-06-04 Thread ZHAO DONG
Hello

RE:BUSINESS INQUIRY/ PROPOSAL

How are you doing today, i hope this mail finds you in a good and convenient 
position!

My name is ZHAO DONG. I am the senior manager for Procurement, Hong Kong 
Refining Company (Sinopec Group Inc) I have been mandated to source crude oil 
from Libya for
supply to our refineries. However, I have been able to establish a good 
relationship with the senior management of the Azzawya Oil Refining Company, 
Libya.

I am now looking for a competent middle man to stand in between my company, 
Hong Kong Refining Company and the Azzawya Oil Refining Company of Libya for 
the sale and
purchase of 2 Million Barrels Monthly for 36 Months. This is in order to take 
home a commission of USD5 to USD7 per barrel. This amount is payable to the 
middle man as commission.

On your response I will give you further details you may need and proof of my 
identity. Kindly reply directly to zhaodong...@gmail.com or 
zhaodon...@yandex.com for further vital details you may need.

Best Regards

ZHAO DONG


Proposal

2018-06-02 Thread Miss Victoria Mehmet




--
Hello

I have been trying to contact you. Did you get my business proposal?

Best Regards,
Miss.Victoria Mehmet


Lucrative Business Proposal

2018-06-02 Thread Adrien Saif




--
Dear Friend,

I would like to discuss a very important issue with you. I am writing 
to find out if this is your valid email. Please, let me know if this 
email is valid


Kind regards
Adrien Saif
Attorney to Quatif Group of Companies


Proposal

2018-05-28 Thread Miss Zeliha Omer Faruk




--
Hello

I have been trying to contact you. Did you get my business proposal?

Best Regards,
Miss.Zeliha ömer faruk
Esentepe Mahallesi Büyükdere
Caddesi Kristal Kule Binasi
No:215
Sisli - Istanbul, Turke


Proposal

2018-05-26 Thread Miss Zeliha Omer Faruk



Hello

Greetings to you please i have a business proposal for you contact me
for more detailes asap thanks.

Best Regards,
Miss.Zeliha ömer faruk
Esentepe Mahallesi Büyükdere
Caddesi Kristal Kule Binasi
No:215
Sisli - Istanbul, Turkey



Business Proposal

2018-05-21 Thread Alan austin
Hello,
I am Mr. Alan Austin, I am currently working with Credit suisse Bank London. I 
saw your contact during my private search and I have a deep believe that you 
will be very honest, committed and capable of assisting in this business 
venture.

I am an account officer to late Dr. Manzoor Hassan who died with his entire 
family in Syria, It is based on this that I am contacting you to stand as the 
beneficiary to my late client so that his funds in our custody will be released 
and paid to you as the beneficiary to the deceased.

It is important you respond back to me with your full name and address, 
including your direct phone number to enable me give you full details of this 
transaction and more information about my late client who left huge amount of 
money in our Bank. I will provide you with all the necessary information, 
documents and proof to legally back up the claim from the different offices 
concerned for the smooth transfer of the fund to any of your accounts as the 
true beneficiary.

Yours Sincerely,
Mr. Alan Austin



