AntoinePrv opened a new issue, #49913:
URL: https://github.com/apache/arrow/issues/49913
### Describe the enhancement requested
While using `archery benchmark`, I found several improvements that would make
it easier to work with. With `archery benchmark [run|diff] --preserve`, the
build and source folders are preserved in a folder like the following:
```
<TMP>/arrow-archery-xlzqaz4l/<GIT STR>/
- arrow/
- build/
```
### A - Set the preserve directory
On macOS, the `<TMP>` directory ends up being something like this:
```
/var/folders/9c/thrhbgqx2xb2xqvfk_2m6pgh0000gn/T/
```
This is practically impossible to find again without parsing the `archery` log.
On top of this, I sometimes want finer control over where the cache is
stored, either to make it easier to inspect and use, or because the path
structure does not fit some use cases (e.g. benchmarking the same commit
but with xsimd 14.1 and 14.2, with different compiler options...).
I propose adding an optional CLI argument `--preserve-dir <PATH>` to
explicitly control where the preserve directories are stored (`<TMP>` in the
above).
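
To make the proposal concrete, here is a minimal sketch of how the option could behave. It is not Archery's actual code: it uses `argparse` from the standard library rather than Archery's real CLI layer, and `resolve_preserve_root` and `build_parser` are hypothetical names. Only the `arrow-archery-` prefix and the `--preserve` / `--preserve-dir` flag names come from this issue.

```python
import argparse
import tempfile
from pathlib import Path


def resolve_preserve_root(preserve_dir=None):
    """Return the directory under which builds are preserved (hypothetical helper)."""
    if preserve_dir is not None:
        # Explicit location requested by the user: create it if needed.
        root = Path(preserve_dir).expanduser().resolve()
        root.mkdir(parents=True, exist_ok=True)
        return root
    # Fallback mirroring the layout described above: a random
    # "arrow-archery-XXXXXXXX" folder under the system temp directory.
    return Path(tempfile.mkdtemp(prefix="arrow-archery-"))


def build_parser():
    """Stand-in parser; the real CLI is defined elsewhere in Archery."""
    parser = argparse.ArgumentParser(prog="benchmark-sketch")
    parser.add_argument("--preserve", action="store_true")
    parser.add_argument("--preserve-dir", metavar="PATH", default=None)
    return parser
```

With this shape, omitting `--preserve-dir` keeps today's behavior, so the option stays purely additive.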
### B - Always preserve benchmark output with preserve option
When `--preserve` is set, I recommend we always store the benchmark timings
JSON file in the preserve directory:
- It is relatively small compared to the size of the build directory, so we
should be eager in saving it just in case it might be needed.
- It helps keep track of the result files. Currently we have to think of an
explicit name for `--output`. With a copy stored automatically next to the
build, the results are associated with the commit name and the path of the
`preserve-dir`, and we can recover the compilation context from the `build`
directory (compiler flags used...).
This greatly reduces the cognitive load of having to choose a name and track
which file corresponds to which settings, and it shortens the
`archery benchmark` commands we type.
This would be independent from `--output`, which would still work as before.
There is also more information we should store, such as the invocation
command.
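
As an illustration of the idea, the sketch below writes the timings and the invocation command next to the preserved build. The function name `save_run_artifacts` and the file names `benchmark.json` and `invocation.txt` are assumptions made for this example; none of them exist in Archery today.

```python
import json
import shlex
import sys
from pathlib import Path


def save_run_artifacts(run_dir, results, argv=None):
    """Store benchmark results and the command line in run_dir (hypothetical helper)."""
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)

    # Timings go next to arrow/ and build/ so they stay tied to the commit.
    results_path = run_dir / "benchmark.json"  # assumed file name
    results_path.write_text(json.dumps(results, indent=2))

    # Record how the run was invoked, quoted so it can be pasted back in a shell.
    command = shlex.join(sys.argv if argv is None else argv)
    (run_dir / "invocation.txt").write_text(command + "\n")  # assumed file name
    return results_path
```

This would run in addition to whatever `--output` does, so existing workflows would be unaffected.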
### C - Resolve git string (breaking)
Right now, with `archery benchmark run main` the path created is:
```
<TMP>/arrow-archery-xlzqaz4l/main/
```
I suggest replacing it automatically with
```
<TMP>/arrow-archery-xlzqaz4l/<GIT SHA>/
```
At first glance, it will be slightly harder to tell from the folder name that
`main` was intended. In practice, though, I believe the branch name is more
misleading than helpful. `main` is a moving target: even within a day of work
its meaning can change, and we forget "which main". This is even more
error-prone with a feature branch, or with remote copies used for benchmarking
on different platforms.
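
One possible way to do the resolution is to ask git itself, as sketched below. `resolve_revision` is a hypothetical helper, not an existing Archery function, and the `run` parameter is there only so the git call can be substituted in tests. The `^{commit}` suffix makes git peel tags down to the commit they point at.

```python
import subprocess


def resolve_revision(rev, repo=".", run=subprocess.run):
    """Resolve a branch, tag, or abbreviated SHA to a full commit SHA."""
    proc = run(
        # --verify makes git fail unless rev names exactly one valid object.
        ["git", "-C", str(repo), "rev-parse", "--verify", f"{rev}^{{commit}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return proc.stdout.strip()
```

The folder name would then be derived from the returned SHA instead of the string the user typed.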
### Component(s)
Archery