Re: [PATCH v2 06/24] multi-pack-index: load into memory

2018-07-06 Thread Junio C Hamano
Eric Sunshine  writes:

> On Thu, Jul 5, 2018 at 10:20 AM Derrick Stolee  wrote:
>> On 6/25/2018 3:38 PM, Junio C Hamano wrote:
>> While I don't use substitutions in this patch, I do use them in later
>> patches. Here is the final version of this method:
>>
>> midx_read_expect () {
>>  NUM_PACKS=$1
>>  NUM_OBJECTS=$2
>>  NUM_CHUNKS=$3
>>  EXTRA_CHUNKS="$5"
>>  cat >expect <<-\EOF
>>  header: 4d494458 1 $NUM_CHUNKS $NUM_PACKS
>>  chunks: pack_names oid_fanout oid_lookup
>> object_offsets$EXTRA_CHUNKS
>>  num_objects: $NUM_OBJECTS
>>  packs:
>>  EOF
>>
>> Using <<-\EOF causes these substitutions to fail. Is there a different
>> way I should construct this method?
>
> When you need to interpolate variables into the here-doc, use <<-EOF;
> when you don't, use <<-\EOF.

I think what was said is "in an early step there is no need to
interpolate but the same here-doc will need substitution in a later
step, and that is why I started an early step with a form without
quoting", which is different from "when should I use the form with
and without quoting?"

I think a reasonable response would have been "then please do use
the quoted form in the early step to help reviewers to let them know
there is not yet any substitutions, and then switch to quote-less form
at the step that starts needing substitution.  That way, reviewers can
see the test started to become "interesting" by interpolating variable
bits in the test vector by seeing the line with "<

Re: [PATCH v2 06/24] multi-pack-index: load into memory

2018-07-05 Thread Eric Sunshine
On Thu, Jul 5, 2018 at 10:20 AM Derrick Stolee  wrote:
> On 6/25/2018 3:38 PM, Junio C Hamano wrote:
> While I don't use substitutions in this patch, I do use them in later
> patches. Here is the final version of this method:
>
> midx_read_expect () {
>  NUM_PACKS=$1
>  NUM_OBJECTS=$2
>  NUM_CHUNKS=$3
>  EXTRA_CHUNKS="$5"
>  cat >expect <<-\EOF
>  header: 4d494458 1 $NUM_CHUNKS $NUM_PACKS
>  chunks: pack_names oid_fanout oid_lookup
> object_offsets$EXTRA_CHUNKS
>  num_objects: $NUM_OBJECTS
>  packs:
>  EOF
>
> Using <<-\EOF causes these substitutions to fail. Is there a different
> way I should construct this method?

When you need to interpolate variables into the here-doc, use <<-EOF;
when you don't, use <<-\EOF.


Re: [PATCH v2 06/24] multi-pack-index: load into memory

2018-07-05 Thread Derrick Stolee

On 6/25/2018 3:38 PM, Junio C Hamano wrote:

Derrick Stolee  writes:

+   cat >expect <<- EOF

"<<-\EOF", i.e. make it easy for readers to spot that there is no
funny substitutions happening in the here-doc body.


While I don't use substitutions in this patch, I do use them in later 
patches. Here is the final version of this method:


midx_read_expect () {
    NUM_PACKS=$1
    NUM_OBJECTS=$2
    NUM_CHUNKS=$3
    OBJECT_DIR=$4
    EXTRA_CHUNKS="$5"
    cat >expect <<-\EOF
    header: 4d494458 1 $NUM_CHUNKS $NUM_PACKS
    chunks: pack_names oid_fanout oid_lookup 
object_offsets$EXTRA_CHUNKS

    num_objects: $NUM_OBJECTS
    packs:
    EOF
    if [ $NUM_PACKS -ge 1 ]
    then
    ls $OBJECT_DIR/pack/ | grep idx | sort >> expect
    fi
    printf "object_dir: $OBJECT_DIR\n" >>expect &&
    test-tool read-midx $OBJECT_DIR >actual &&
    test_cmp expect actual
}

Using <<-\EOF causes these substitutions to fail. Is there a different 
way I should construct this method?


Thanks,
-Stolee


Re: [PATCH v2 06/24] multi-pack-index: load into memory

2018-06-25 Thread Junio C Hamano
Derrick Stolee  writes:

> +#define MIDX_HASH_LEN 20
> +#define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + MIDX_HASH_LEN)
>  
>  static char *get_midx_filename(const char *object_dir)
>  {
>   return xstrfmt("%s/pack/multi-pack-index", object_dir);
>  }
>  
> +struct multi_pack_index *load_multi_pack_index(const char *object_dir)
> +{
> + struct multi_pack_index *m = NULL;
> + int fd;
> + struct stat st;
> + size_t midx_size;
> + void *midx_map = NULL;
> + uint32_t hash_version;
> + char *midx_name = get_midx_filename(object_dir);
> +
> + fd = git_open(midx_name);
> +
> + if (fd < 0) {
> + error_errno(_("failed to read %s"), midx_name);
> + FREE_AND_NULL(midx_name);
> + return NULL;
> + }
> + if (fstat(fd, )) {
> + error_errno(_("failed to read %s"), midx_name);
> + FREE_AND_NULL(midx_name);
> + close(fd);
> + return NULL;
> + }
> +
> + midx_size = xsize_t(st.st_size);
> +
> + if (midx_size < MIDX_MIN_SIZE) {
> + close(fd);
> + error(_("multi-pack-index file %s is too small"), midx_name);
> + goto cleanup_fail;
> + }
> +
> + FREE_AND_NULL(midx_name);

Error handling in the above part looks a bit inconsistent.  I first
thought that the earlier ones manually clean up and leave because
jumping to cleanup_fail would need a successfully opened fd and
successfully mmapped midx_map, but the above "goto" forces
cleanup_fail: to munmap NULL and close an already closed fd.

I wonder if it is simpler to do

cleanup_fail:
/* no need to check for NULL when freeing */
free(m);
free(midx_name);
if (midx_map)
munmap(midx_map, midx_size);
if (0 <= fd)
close(fd);
return NULL;

and have all of the above error codepath to jump there.

> + midx_map = xmmap(NULL, midx_size, PROT_READ, MAP_PRIVATE, fd, 0);
> +
> + m = xcalloc(1, sizeof(*m) + strlen(object_dir) + 1);
> + strcpy(m->object_dir, object_dir);
> + m->data = midx_map;
> +
> + m->signature = get_be32(m->data);
> + if (m->signature != MIDX_SIGNATURE) {
> + error(_("multi-pack-index signature 0x%08x does not match 
> signature 0x%08x"),
> +   m->signature, MIDX_SIGNATURE);
> + goto cleanup_fail;
> + }
> +
> + m->version = m->data[4];
> + if (m->version != MIDX_VERSION) {
> + error(_("multi-pack-index version %d not recognized"),
> +   m->version);
> + goto cleanup_fail;
> + }
> +
> + hash_version = m->data[5];

Is there a good existing example to show a better way to avoid these
hard-coded constants that describe/define the file format?

> + if (hash_version != MIDX_HASH_VERSION) {
> + error(_("hash version %u does not match"), hash_version);
> + goto cleanup_fail;
> + }
> + m->hash_len = MIDX_HASH_LEN;
> +
> + m->num_chunks = *(m->data + 6);

By the way, this mixture of m->data[4] and *(m->data + 6) is even
worse.  You could do get_be32(&8[m->data]) if you want to irritate
readers even more ;-)

> + m->num_packs = get_be32(m->data + 8);
> +
> + return m;
> +
> +cleanup_fail:
> + FREE_AND_NULL(m);
> + FREE_AND_NULL(midx_name);
> + munmap(midx_map, midx_size);
> + close(fd);
> + return NULL;
> +}
> +


> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index 8622a7cdce..0372704c96 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -3,9 +3,19 @@
>  test_description='multi-pack-indexes'
>  . ./test-lib.sh
>  
> +midx_read_expect() {

"midx_read_expect () {", i.e. SP on both sides of (), please.

> + cat >expect <<- EOF

"<<-\EOF", i.e. make it easy for readers to spot that there is no
funny substitutions happening in the here-doc body.


> + header: 4d494458 1 0 0
> + object_dir: .
> + EOF
> + test-tool read-midx . >actual &&
> + test_cmp expect actual
> +}
> +
>  test_expect_success 'write midx with no packs' '
>   git multi-pack-index --object-dir=. write &&
> - test_path_is_file pack/multi-pack-index
> + test_path_is_file pack/multi-pack-index &&
> + midx_read_expect
>  '
>  
>  test_done