Hi,
Jeff Hostetler wrote:
> Teach rev-list to use the filtering provided by the
> traverse_commit_list_filtered() interface to omit
> unwanted objects from the result.
>
> Object filtering is only allowed when one of the "--objects*"
> options are used.
micronit: the line widths seem to be uneven in these commit messages,
which is a bit distracting when reading.
> When the "--filter-print-omitted" option is used, the omitted
> objects are printed at the end. These are marked with a "~".
> This option can be combined with "--quiet" to get a list of
> just the omitted objects.
Neat. Can you give a quick example?
Using --quiet for this feels a bit odd, since it previously meant
to print nothing to stdout. I wonder if there's another way ---
e.g.
--print-omitted=(yes|no|only)
If I wanted to list all objects matching a filter, even objects
that are not reachable from any ref, is there a way to do that?
(Just curious, trying to think about this interface.)
> Add t6112 test.
This part doesn't need to be in the commit message. More generally,
anything I could more easily learn from the code or diffstat doesn't
need to be in the commit message: the commit message is about the
"why" more than the details of what in the code changed.
> In the future, we will introduce a "partial clone" mechanism
> wherein an object in a repo, obtained from a remote, may
> reference a missing object that can be dynamically fetched from
> that remote once needed. This "partial clone" mechanism will
> have a way, sometimes slow, of determining if a missing link
> is one of the links expected to be produced by this mechanism.
Does this mean the <filter-spec>s will be part of the wire protocol?
I'll look more carefully at them below with that in mind.
> This patch introduces handling of missing objects to help
> debugging and development of the "partial clone" mechanism,
> and once the mechanism is implemented, for a power user to
> perform operations that are missing-object aware without
> incurring the cost of checking if a missing link is expected.
I had trouble understanding what this paragraph is about. Can you
give an example?
> Signed-off-by: Jeff Hostetler <[email protected]>
> ---
> Documentation/git-rev-list.txt | 4 +-
> Documentation/rev-list-options.txt | 36 ++++++
> builtin/rev-list.c | 108 ++++++++++++++++-
> t/t6112-rev-list-filters-objects.sh | 225
> ++++++++++++++++++++++++++++++++++++
> 4 files changed, 370 insertions(+), 3 deletions(-)
> create mode 100755 t/t6112-rev-list-filters-objects.sh
Looks reasonably concise, good.
[...]
> --- a/Documentation/git-rev-list.txt
> +++ b/Documentation/git-rev-list.txt
> @@ -47,7 +47,9 @@ SYNOPSIS
> [ --fixed-strings | -F ]
> [ --date=<format>]
> [ [ --objects | --objects-edge | --objects-edge-aggressive ]
> - [ --unpacked ] ]
> + [ --unpacked ]
> + [ --filter=<filter-spec> [ --filter-print-omitted ] ] ]
Does this mean --filter is only useful with --objects? E.g. I can't
use it to filter commits?
> + [ --missing=<missing-action> ]
--missing=(error|allow-any|print) would be more informative and about
equally concise.
Since this is mainly for debugging, does it have a different
compatibility guarantee from other options? Could it be named
accordingly to set expectations?
[...]
> +The form '--filter=blob:none' omits all blobs.
Sounds sensible.
> ++
> +The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes
> +or units. The value may be zero.
On second thought, doesn't blob:limit=0 mean blob:none is not needed?
Is it for future consistency with tree:none?
What units do [kmg] use? Are they GB, GiB, or one of the variants in
between?
> ++
> +The form '--filter=sparse:oid=<oid-ish>' uses a sparse-checkout
> +specification contained in the object (or the object that the expression
> +evaluates to) to omit blobs that would not be not required for a
> +sparse checkout on the requested refs.
This one makes me a little nervous because it would mean we're
planning on adding sparse-checkout specifications to the wire
protocol. Maybe that's okay --- they're already part of the on-disk
format --- but it makes me nervous because the sparse-checkout format
is not so great, as I believe MS has already noticed.
What is an <oid-ish>? Can it just say <blob>? How would this one
work when passed over the wire?
> ++
> +The form '--filter=sparse:path=<path>' similarly uses a sparse-checkout
> +specification contained in <path>.
Is this <path> relative to the cwd of the caller, or is it within some
commit?
Sorry it took so long to send this feedback / these questions.
Hopefully it's useful nevertheless.
Thanks and hope that helps,
Jonathan