Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-22 Thread Jean-Noël AVILA
On 22/07/2017 at 02:43, Jiang Xin wrote:
>
> Benefits of using the tweaked version of gettext:
>
> 1. `make pot` can be run in a directory extracted from a tarball (not under git control).

This issue is real for package maintainers, who may patch the original
source and run their own set of utilities outside of a git repo. It
can be handled with Junio's proposal by writing the files to a
temporary directory before running xgettext, then removing the
temporary directory.

Please note that with respect to this issue, the patched xgettext
approach is completely disruptive.

> 2. do not need to run `git reset --hard`.

Same as before.

> 3.  it's quick (nobody cares).
>

Requiring patched tools really breaks collaboration. Git has made a
point of relying on standard tools (not even GNU versions), so that
would really be a step backward.


Plus, I hope that some day developers will have some kind of sanity
check, instead of translators finding out afterwards that a change
broke i18n capabilities. Requiring special versions of i18n tooling
puts an end to this hope.


Re: [L10N] Kickoff of translation for Git 2.14.0 round 1

2017-07-22 Thread Jean-Noël Avila
On 22/07/2017 at 19:02, Kaartic Sivaraam wrote:
> On Sat, 2017-07-15 at 21:30 +0200, Jean-Noël Avila wrote:
>>  * commit 4ddb1354e8 ("status: contextually notify user about an initial
>> commit") plays sentence lego while introducing colorization which again
>> does not play well with i18n.
>>
> What, if anything, should be done about this?
>

I only spotted it because the string is new for translation. But the
previous version was already playing sentence lego. So this is not a
regression ;-)


If I understand correctly, getting an i18n-friendly string would require
being able to "color_sprintf" the branch name, and then "color_fprintf"
the output with a %s formatting string. None of this is available yet,
and it would introduce cumbersome logic in the code.
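
To make the "sentence lego" point concrete, here is a minimal sketch in C.
It is not the code from commit 4ddb1354e8; the `_()` macro is a stand-in for
the gettext lookup, and `lego()` and `whole()` are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define _(s) (s) /* stand-in for the gettext translation lookup */

/*
 * Sentence lego: only the fragment "On branch " is translatable, and the
 * branch name is glued on afterwards, so a translator cannot move it.
 */
static int lego(char *buf, size_t len, const char *branch)
{
	assert(buf && branch);
	return snprintf(buf, len, "%s%s", _("On branch "), branch);
}

/*
 * One translatable unit: the whole sentence, placeholder included, is
 * handed to the translator, who may place %s wherever the language needs.
 */
static int whole(char *buf, size_t len, const char *branch)
{
	assert(buf && branch);
	return snprintf(buf, len, _("On branch %s"), branch);
}
```

Both variants print the same English text; the difference only shows up
once the branch name has to be colored separately, which is what pushes
the code toward the first form.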


More generally, i18n puts some pressure on coding style for sure, and it
gets worse with multi-platform and coloring... how can we ease the
burden on developers on this front without resorting to ad hoc patches?



Re: reftable [v3]: new ref storage format

2017-07-22 Thread Shawn Pearce
3rd iteration of the reftable storage format.

You can read a rendered version of this here:
https://googlers.googlesource.com/sop/jgit/+/reftable/Documentation/technical/reftable.md

Significant changes from v2:
- efficient lookup by SHA-1 for allow-tip-sha1-in-want.
- type 0x4 for FETCH_HEAD, MERGE_HEAD.
- file size up (27.7 M in v1, 34.4 M in v3)

The file size increase is due to lookup by SHA-1 support. By using
unique abbreviations it is adding about 7 MiB to the file size for
865,258 objects behind 866,456 refs. The average entry for this direction
costs 8 bytes, using a 6 byte/12 hex unique abbreviation.


## Overview

### Problem statement

Some repositories contain a lot of references (e.g.  android at 866k,
rails at 31k).  The existing packed-refs format takes up a lot of
space (e.g.  62M), and does not scale with additional references.
Lookup of a single reference requires linearly scanning the file.

Atomic pushes modifying multiple references require copying the
entire packed-refs file, which can be a considerable amount of data
moved (e.g. 62M in, 62M out) for even small transactions (2 refs
modified).

Repositories with many loose references occupy a large number of disk
blocks from the local file system, as each reference is its own file
storing 41 bytes (and another file for the corresponding reflog).
This negatively affects the number of inodes available when a large
number of repositories are stored on the same filesystem.  Readers can
be penalized due to the larger number of syscalls required to traverse
and read the `$GIT_DIR/refs` directory.

### Objectives

- Near constant time lookup for any single reference, even when the
  repository is cold and not in process or kernel cache.
- Near constant time verification that a SHA-1 is referred to by at
  least one reference (for allow-tip-sha1-in-want).
- Efficient lookup of an entire namespace, such as `refs/tags/`.
- Support atomic push `O(size_of_update)` operations.
- Combine reflog storage with ref storage.

### Description

A reftable file is a portable binary file format customized for
reference storage. References are sorted, enabling linear scans,
binary search lookup, and range scans.

Storage in the file is organized into blocks.  Prefix compression
is used within a single block to reduce disk space.  Block size is
tunable by the writer.
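
As a rough illustration of prefix compression over sorted names, the
sketch below computes how much of a reference name is shared with the
previous one and how many bytes remain to be stored. The helper names
and the idea of storing only the suffix length are assumptions made for
this example, not the record layout defined by the format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the prefix shared by two ref names. */
static size_t common_prefix_len(const char *prev, const char *cur)
{
	size_t n = 0;

	assert(prev && cur);
	while (prev[n] && prev[n] == cur[n])
		n++;
	return n;
}

/*
 * Hypothetical helper: number of bytes of "cur" that still have to be
 * written once the prefix shared with "prev" is dropped.
 */
static size_t suffix_len(const char *prev, const char *cur)
{
	return strlen(cur) - common_prefix_len(prev, cur);
}
```

For `refs/heads/master` followed by `refs/heads/next`, the shared prefix
is `refs/heads/` and only `next` would need to be stored in full.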

### Performance

Space used, packed-refs vs. reftable:

repository  | packed-refs | reftable | % original | avg ref  | avg obj
------------|------------:|---------:|-----------:|---------:|--------:
android     |      62.2 M |   34.4 M |      55.2% | 33 bytes | 8 bytes
rails       |       1.8 M |    1.1 M |      57.7% | 29 bytes | 6 bytes
git         |      78.7 K |   44.0 K |      60.0% | 50 bytes | 6 bytes
git (heads) |       332 b |    239 b |      72.0% | 31 bytes | 0 bytes

Scan (read 866k refs), by reference name lookup (single ref from 866k
refs), and by SHA-1 lookup (refs with that SHA-1, from 866k refs):

format      | scan    | by name        | by SHA-1
------------|--------:|---------------:|---------------:
packed-refs |  402 ms | 409,660.1 usec | 412,535.8 usec
reftable    |  112 ms |      42.7 usec |     340.8 usec

Space used for 149,932 log entries for 43,061 refs,
reflog vs. reftable:

format        | size  | avg log
--------------|------:|-----------:
$GIT_DIR/logs | 173 M | 1209 bytes
reftable      |   4 M |   30 bytes

## Details

### Peeling

References in a reftable are always peeled.

### Reference name encoding

Reference names should be encoded with UTF-8.

### Network byte order

All multi-byte, fixed width fields are in network byte order.

### Ordering

Blocks are lexicographically ordered by their first reference.

### Directory/file conflicts

The reftable format accepts both `refs/heads/foo` and
`refs/heads/foo/bar` as distinct references.

This property is useful for retaining log records in reftable, but may
confuse versions of Git using `$GIT_DIR/refs` directory tree to
maintain references.  Users of reftable may choose to continue to
reject `foo` and `foo/bar` type conflicts to prevent problems for
peers.

## File format

### Structure

A reftable file has the following high-level structure:

first_block {
  header
  first_ref_block
}
ref_blocks*
ref_index?
obj_blocks*
obj_index?
log_blocks*
log_index?
footer

### Block size

The `block_size` is arbitrarily determined by the writer, and does not
have to be a power of 2.  The block size must be larger than the
longest reference name or deflated log entry used in the repository,
as references cannot span blocks.

Powers of two that are friendly to the virtual memory system or
filesystem (such as 4k or 8k) are recommended.  Larger sizes (64k) can
yield better compression, with a possible increased cost incurred by
readers during access.

The largest block size is `16777215` bytes (15.99 MiB).
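
The limit `16777215` is 2^24 - 1, the largest value that fits in three
bytes. The sketch below shows how such a value could be written and read
in network byte order. That the block size is stored as a 3-byte field
is an assumption here (the header layout is cut off in this copy), and
`put_be24()` and `get_be24()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCK_SIZE 16777215u /* 2^24 - 1, the limit stated above */

/* Hypothetical helper: store a value as 3 bytes, most significant first. */
static void put_be24(unsigned char *out, uint32_t value)
{
	assert(value <= MAX_BLOCK_SIZE);
	out[0] = (value >> 16) & 0xff;
	out[1] = (value >> 8) & 0xff;
	out[2] = value & 0xff;
}

/* Hypothetical helper: read back a 3-byte big-endian value. */
static uint32_t get_be24(const unsigned char *in)
{
	return ((uint32_t)in[0] << 16) |
	       ((uint32_t)in[1] << 8) |
	       (uint32_t)in[2];
}
```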

### Header

An 8-byte header appears at the beginning of the file:

'\1REF'
uint8( 

Re: [L10N] Kickoff of translation for Git 2.14.0 round 1

2017-07-22 Thread Kaartic Sivaraam
On Sat, 2017-07-15 at 21:30 +0200, Jean-Noël Avila wrote:
>  * commit 4ddb1354e8 ("status: contextually notify user about an initial
> commit") plays sentence lego while introducing colorization which again
> does not play well with i18n.
> 
What, if anything, should be done about this?

-- 
Kaartic


Re: [PATCH] sha1_file: use access(), not lstat(), if possible

2017-07-22 Thread Junio C Hamano
Johannes Schindelin  writes:

> But this whole thread taps into a gripe I have with parts of Git's code
> base: part of the code is not clear at all in its intent by virtue of
> calling whatever POSIX function may seem to give the answer for the
> intended question, instead of implementing a function whose name says
> precisely what question is asked.
>
> In this instance, we do not call a helper get_file_size(). Oh no. That
> would make it too obvious. We call lstat() instead.

I agree with you for this case and a case like this in general.  

In codepaths at a lot lower level (they tend to be the ancient and
quite fundamental ones) in our codebase, lstat() is often directly
used by the caller because they are interested not only in a single
aspect of a path but many fields in struct stat are of interest.

When the code is interested in existence or size or whatever single
aspect of a path and nothing else, however, the code would become
easier to read if a helper function with a more specific name is
used.  And it may even help individual platforms that do not want to
use the full lstat() emulation, by telling them that other fields in
struct stat are not needed.
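
A minimal sketch of what such single-purpose helpers could look like on
a POSIX system follows. The names `path_exists()` and `get_file_size()`
are illustrative (the latter is the name floated earlier in the thread);
neither is claimed to exist in the codebase in this form.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: answers only "is there something at this path?" */
static int path_exists(const char *path)
{
	assert(path);
	return access(path, F_OK) == 0;
}

/*
 * Hypothetical helper: answers only "how big is it?".
 * Returns the size in bytes, or -1 if the path cannot be examined.
 * A non-POSIX port could implement this without a full lstat() emulation.
 */
static off_t get_file_size(const char *path)
{
	struct stat st;

	assert(path);
	if (lstat(path, &st) < 0)
		return -1;
	return st.st_size;
}
```

The caller states the question it is asking, and each platform stays
free to answer it in the cheapest way it has.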

Of course, then the issue becomes what to do when we are interested
in not just one but a selected few attributes.  Perhaps we create a
helper "get_A_B_and_C_attributes_for_path()", which may use lstat()
on POSIX and the most efficient way to get only A, B and C attributes
on non-POSIX platforms.  The implementation would be OK, but the naming
becomes a bit hard; we need to give it a good name.

Things get even more interesting when the set of attributes we are
interested in grows by one and we need to rename the function to
"get_A_B_C_and_D_attributes_for_path()".  When it is a lot easier to
fall back to the full lstat() emulation on non-POSIX platforms, the
temptation to just use it even though it would grab attributes that
are not needed in that function grows, which needs to be resisted by
those who are doing the actual implementation for a particular platform.


Re: [PATCH] make get_be64() compile on pu with NO_UNALIGNED_LOADS

2017-07-22 Thread Junio C Hamano
Martin Ågren  writes:

> Applies to pu and passes the tests. I think this should be squashed in
> somewhere. Perhaps a mismerge in commit d553324d ("Merge branch
> 'bp/fsmonitor' into pu", 2017-07-21).

Yes, you spotted a mistaken evil-merge.  Thanks.

>
>  compat/bswap.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/compat/bswap.h b/compat/bswap.h
> index 133da1d2b..f86110a72 100644
> --- a/compat/bswap.h
> +++ b/compat/bswap.h
> @@ -188,11 +188,11 @@ static inline void put_be32(void *ptr, uint32_t value)
>   p[3] = value >>  0;
>  }
>  
> -static inline unit64_t get_be64(const void *ptr)
> +static inline uint64_t get_be64(const void *ptr)
>  {
> - unsigned char *p = ptr;
> + const unsigned char *p = ptr;
>   return  ((uint64_t)get_be32(p) << 32) |
> - ((uint64_t)get_be32(p + 4);
> + ((uint64_t)get_be32(p + 4));
>  }
>  
>  #endif


Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-22 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Fri, 21 Jul 2017, Junio C Hamano wrote:
>
>> Jean-Noël Avila  writes:
>> 
>> > On 20/07/2017 at 20:57, Junio C Hamano wrote:
>> >>
>> >> + git diff --quiet HEAD && git diff --quiet --cached
>> >> +
>> >> + @for s in $(LOCALIZED_C) $(LOCALIZED_SH) $(LOCALIZED_PERL); \
>> >
>> > Does PRIuMAX make sense for perl and sh files?
>> 
>> Not really; I did this primarily because I would prefer to keep
>> things consistent, anticipating there may be some other things we
>> need to replace before running gettext(1) for other reasons later.
>
> It would add unnecessary churn, too, to add those specific exclusions and
> make things inconsistent: the use of PRItime in Perl or shell scripts
> would already make those scripts barf. And if it is unnecessary churn...
> let's not do it?

Sorry, but I cannot quite tell if you are in favor of limiting the
set of source files that go through the sed substitution (because we
know PRIuMAX is just as nonsensical as PRItime in perl and shell
source), or if you are in favor of keeping the patch as-is (because
changing the set of source files is a churn and substitutions would
not hurt)?

I am actually OK to change the above loop to process only the C
sources; I am not OK to change it to process only date.c which
happens to be the only source that has PRItime that matters in this
context, of course.

Thanks.




Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-22 Thread Junio C Hamano
Johannes Schindelin  writes:

>> >> A very small hack on gettext.
>
> I am 100% opposed to this hack. It is already cumbersome enough to find
> out what is involved in i18n (it took *me* five minutes to find out that
> much of the information is in po/README, with a lot of information stored
> *on an external site*, and I still managed to miss the `make pot` target).
>
> If at all, we need to make things easier instead of harder.
>
> Requiring potential volunteers to waste their time to compile an
> unnecessary fork of gettext? Not so great an idea.
>
> Plus, each and every Git build would now have to compile their own
> gettext, too, as the vanilla one would not handle the .po files containing
> %<PRItime>!!!
>
> And that requirement would impact instantaneously people like me, and even
> worse: some other packagers might be unaware of the new requirement which
> would not be caught during the build, and neither by the test suite.
> Double bad idea.

If I understand correctly, the patch hacks the input processing of
xgettext (which reads our source code and generates po/git.pot) so
that when it sees PRItime, it pretends that it saw PRIuMAX, causing it
to emit %<PRIuMAX> in its output.

In our workflow, 

* The po/git.pot file is updated only by the l10n coordinator,
  and then the result is committed to our tree.

* Translators build on that commit by (1) running msgmerge which
  takes po/git.pot and wiggles its entries into their existing
  po/$lang.po file so that po/$lang.po file has new entries from
  po/git.pot and (2) editing po/$lang.po file.  The result is
  committed to our tree.

* The build procedure builders use runs the resulting
  po/$lang.po files through msgfmt to produce po/$lang.mo files,
  which will be installed.

As long as the first step results in %<PRIuMAX> (not %<PRItime> or
anything that plain vanilla msgmerge and msgfmt do not understand),
the second step and third step do not require any hacked version of
gettext tools.

Even though I tend to agree with your conclusion that pre-processing
our source before passing it to xgettext is probably a better
solution in the longer term, I think most of the objections in
your message come from your misunderstanding of what Jiang's patch
does and are not based on facts.  My understanding is that
translators do not need to compile a custom msgmerge and builders do
not need a custom msgfmt.
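
For readers who have not seen the macro in question, here is a small
sketch of how a project-specific format macro like PRItime is used. It
assumes PRItime is simply an alias for PRIuMAX and that the timestamp
type is uintmax_t, which is what the substitution discussed in this
thread implies; `format_timestamp()` is a hypothetical helper.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Assumed definitions, mirroring the PRItime -> PRIuMAX substitution. */
typedef uintmax_t timestamp_t;
#define PRItime PRIuMAX

/*
 * Hypothetical helper. The compiler expands PRItime before it ever sees
 * the format string, but xgettext scans the unpreprocessed source: it
 * knows the standard <inttypes.h> macros such as PRIuMAX, not PRItime.
 */
static int format_timestamp(char *buf, size_t len, timestamp_t t)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "%" PRItime, t);
}
```

This is why either the source has to be rewritten before xgettext runs,
or xgettext itself has to be taught about the extra macro.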



Re: [PATCH v6 00/10] The final building block for a faster rebase -i

2017-07-22 Thread Johannes Schindelin
Hi Junio,

On Thu, 20 Jul 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > Changes since v5:
> >
> > - replaced a get_sha1() call by a get_oid() call already.
> >
> > - adjusted to hashmap API changes
> 
> Applying this to the tip of 'master' yields exactly the same result
> as merging the previous round js/rebase-i-final to the tip of
> 'master' and then applying merge-fix/js/rebase-i-final to adjust to
> the codebase, so the net effect of this reroll is none.  Which is a
> good sign, as it means there wasn't any rebase mistake and the evil
> merge we've been carrying was a good one.

Good.

> But at the same time, I prefer to avoid rebasing to newer 'master'
> until the codebase starts drifting too far apart, or until a new
> feature release is made out of newer 'master'.  This is primarily
> because I want dates on commits to mean something---namely, "this
> change hasn't seen a need to be updated for 'oops, that was wrong'
> since this date".  This use of commit dates as 'priority date'
> matters much less for a topic not in 'next', but as a general
> principle, my workflow tries to preserve commit dates for all
> topics.

By that token, commit message updates would also be inappropriate, in
particular when they came from somebody else than the patch author ;-P

As to avoiding a rebase: we can add that to the growing list of things on
which we disagree.

If the author dates really meant anything, we would also have to avoid v2,
v3, v4, ... v226 of patch series. So that flies in the face of trying to
keep the meaning of author dates.

In addition, the development flow I prefer is one that is in harmony with
the modern Continuous Integration style, where topic branches are merged
into a single, always-ready-to-release integration branch.

That means that I always work off of `master`, unless there is a good
reason to base off of `next` or even `pu`. That's to avoid merge
conflicts, to see what really gets applied.

I am *especially* adamant about rebasing to a newer upstream commit when
there are merge conflicts.

Such as is the case here.

> For the above reason, I may hold onto this patch series in my inbox
> without actually updating js/rebase-i-final topic until the current
> cycle is over; please do not mistake it as this new reroll being
> ignored.

You do as you want, of course. But please note that I will not rebase my
topic branches to an ancient revision, especially if that would cause merge
conflicts with the current `master`.

And if there should be another iteration of this wallflower patch series,
I will rebase it to the then-current `master` again [*1*].

Ciao,
Dscho

Footnote *1*: in general, I try to abide by the wishes of maintainers when
contributing code, unless those wishes are contrary to what I consider
correct software development. Like, when in Rome, I will do as the Romans
do. Except when I see them looting a parking meter.


Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-22 Thread Johannes Schindelin
Hi,

On Sat, 22 Jul 2017, Jiang Xin wrote:

> 2017-07-22 7:34 GMT+08:00 Junio C Hamano :
> > Jiang Xin  writes:
> >
> >> A very small hack on gettext.

I am 100% opposed to this hack. It is already cumbersome enough to find
out what is involved in i18n (it took *me* five minutes to find out that
much of the information is in po/README, with a lot of information stored
*on an external site*, and I still managed to miss the `make pot` target).

If at all, we need to make things easier instead of harder.

Requiring potential volunteers to waste their time to compile an
unnecessary fork of gettext? Not so great an idea.

Plus, each and every Git build would now have to compile their own
gettext, too, as the vanilla one would not handle the .po files containing
%<PRItime>!!!

And that requirement would impact instantaneously people like me, and even
worse: some other packagers might be unaware of the new requirement which
would not be caught during the build, and neither by the test suite.
Double bad idea.

So let's go with Junio's patch.

Ciao,
Dscho


Re: [PATCH] PRItime: wrap PRItime for better l10n compatibility

2017-07-22 Thread Johannes Schindelin
Hi,

On Fri, 21 Jul 2017, Junio C Hamano wrote:

> Jean-Noël Avila  writes:
> 
> > On 20/07/2017 at 20:57, Junio C Hamano wrote:
> >>
> >> +  git diff --quiet HEAD && git diff --quiet --cached
> >> +
> >> +  @for s in $(LOCALIZED_C) $(LOCALIZED_SH) $(LOCALIZED_PERL); \
> >
> > Does PRIuMAX make sense for perl and sh files?
> 
> Not really; I did this primarily because I would prefer to keep
> things consistent, anticipating there may be some other things we
> need to replace before running gettext(1) for other reasons later.

It would add unnecessary churn, too, to add those specific exclusions and
make things inconsistent: the use of PRItime in Perl or shell scripts
would already make those scripts barf. And if it is unnecessary churn...
let's not do it?

Ciao,
Dscho

Re: [PATCH] sha1_file: use access(), not lstat(), if possible

2017-07-22 Thread Johannes Schindelin
Hi,

On Thu, 20 Jul 2017, Junio C Hamano wrote:

> Jonathan Tan  writes:
> 
> > In sha1_loose_object_info(), use access() (indirectly invoked through
> > has_loose_object()) instead of lstat() if we do not need the on-disk
> > size, as it should be faster on Windows [1].
> 
> That sounds as if Windows is the only thing that matters.  "It is
> faster in general, and is much faster on Windows" would have been
> more convincing, and "It isn't slower, and is much faster on
> Windows" would also have been OK.  Do we have any measurement, or
> does this patch not yield any measurable gain?
> 
> By the way, the special casing of disk_sizep (which is only used by
> the batch-check feature of cat-file) is somewhat annoying with or
> without this patch, but this change makes it even more so by adding
> an extra indentation level.  I cannot think of a way to make it less
> annoying offhand, and I do not think this change needs to address it
> in any way, but I am mentioning this as a hint to bystanders who may
> want to find something small that can be cleaned up ;-)

I actually found a separate piece of information in the meantime:

https://blogs.msdn.microsoft.com/oldnewthing/20071023-00/?p=24713#comment-562083

i.e. _waccess() is implemented in the same way our mingw_lstat()
implementation is: by calling the very same GetFileAttributes() code path.
So it has exactly the same performance characteristics, and I was wrong.

But this whole thread taps into a gripe I have with parts of Git's code
base: part of the code is not clear at all in its intent by virtue of
calling whatever POSIX function may seem to give the answer for the
intended question, instead of implementing a function whose name says
precisely what question is asked.

In this instance, we do not call a helper get_file_size(). Oh no. That
would make it too obvious. We call lstat() instead -- under the assumption
that the whole world runs on Linux, really, because let's be honest about
it: lstat() implementations all differ in subtle ways and we really only
test on Linux.

The obviousness of something like get_file_size() would be so refreshing
to these tired eyes.

Oh, and it would make it much easier to maintain ports to other Operating
Systems, most notably Windows.

Ciao,
Dscho


[PATCH] make get_be64() compile on pu with NO_UNALIGNED_LOADS

2017-07-22 Thread Martin Ågren
1. s/unit64_t/uint64_t/

2. add const-qualifier to *p

3. add missing closing ')'

Signed-off-by: Martin Ågren 
---
Applies to pu and passes the tests. I think this should be squashed in
somewhere. Perhaps a mismerge in commit d553324d ("Merge branch
'bp/fsmonitor' into pu", 2017-07-21).

 compat/bswap.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/compat/bswap.h b/compat/bswap.h
index 133da1d2b..f86110a72 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -188,11 +188,11 @@ static inline void put_be32(void *ptr, uint32_t value)
 	p[3] = value >>  0;
 }
 
-static inline unit64_t get_be64(const void *ptr)
+static inline uint64_t get_be64(const void *ptr)
 {
-	unsigned char *p = ptr;
+	const unsigned char *p = ptr;
 	return	((uint64_t)get_be32(p) << 32) |
-		((uint64_t)get_be32(p + 4);
+		((uint64_t)get_be32(p + 4));
 }
 
 #endif
-- 
2.14.0.rc0.14.g12cc05b53
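
For reference, a standalone sketch of the corrected function that can be
compiled outside the tree. The `get_be32()` shown here is a byte-wise
reimplementation written for this example and is only assumed to match
the behavior of the one in compat/bswap.h.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-wise big-endian load; a stand-in for the helper in bswap.h. */
static inline uint32_t get_be32(const void *ptr)
{
	const unsigned char *p = ptr;

	assert(p);
	return	(uint32_t)p[0] << 24 |
		(uint32_t)p[1] << 16 |
		(uint32_t)p[2] <<  8 |
		(uint32_t)p[3] <<  0;
}

/* The function as it reads with the three fixes from the patch applied. */
static inline uint64_t get_be64(const void *ptr)
{
	const unsigned char *p = ptr;

	return	((uint64_t)get_be32(p) << 32) |
		((uint64_t)get_be32(p + 4));
}
```

Because the bytes are combined one at a time, the load works regardless
of host endianness or pointer alignment, which is the point of the
NO_UNALIGNED_LOADS code path.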