On Fri, Jun 03, 2022 at 09:14:05AM +0200, Ulrich Mueller wrote:
> >>>>> On Tue, 31 May 2022, Ionen Wolkens wrote:
> 
> > +# @FUNCTION: esed
> > +# @USAGE: <sed-argument>...
> > +# @DESCRIPTION:
> > +# sed(1) wrapper that dies if the expression(s) did not modify any files.
> > +# sed's -i/--in-place is forced, and so stdin/out cannot be used.
> 
> This sounds like a simple enough task ...
> 
> > +esed() {
> > +   local -i i
> > +
> > +   if [[ ${esedexps@a} =~ a ]]; then
> > +           # expression must be before -- but after the rest for e.g. -E 
> > to work
> > +           local -i pos
> > +           for ((pos=1; pos<=${#}; pos++)); do
> > +                   [[ ${!pos} == -- ]] && break
> > +           done
> > +
> > +           for ((i=0; i<${#esedexps[@]}; i++)); do
> > +                   [[ ${esedexps[i]} ]] &&
> > +                           esedexps= esed "${@:1:pos-1}" -e 
> > "${esedexps[i]}" "${@:pos}"
> > +           done
> > +
> > +           unset esedexps
> > +           return 0
> > +   fi
> > +
> > +   # Roughly attempt to find files in arguments by checking if it's a
> > +   # readable file (aka s/// is not a file) and does not start with -
> > +   # (unless after --), then store contents for comparing after sed.
> > +   local contents=() endopts files=()
> > +   for ((i=1; i<=${#}; i++)); do
> > +           if [[ ${!i} == -- && ! -v endopts ]]; then
> > +                   endopts=1
> > +           elif [[ ${!i} =~ ^(-i|--in-place)$ && ! -v endopts ]]; then
> > +                   # detect rushed sed -i -> esed -i, -i also silently 
> > breaks enewsed
> > +                   die "passing ${!i} to ${FUNCNAME[0]} is invalid"
> > +           elif [[ ${!i} =~ ^(-f|--file)$ && ! -v endopts ]]; then
> > +                   i+=1 # ignore script files
> > +           elif [[ ( ${!i} != -* || -v endopts ) && -f ${!i} && -r ${!i} 
> > ]]; then
> > +                   files+=( "${!i}" )
> > +
> > +                   # eval 2>/dev/null to silence \0 warnings if sed binary 
> > files
> > +                   eval 'contents+=( "$(<"${!i}")" )' 2>/dev/null \
> > +                           || die "failed to read: ${!i}"
> > +           fi
> > +   done
> > +   (( ${#files[@]} )) || die "no readable files found from '${*}' 
> > arguments"
> > +
> > +   local verbose
> > +   [[ ${ESED_VERBOSE} ]] && type diff &>/dev/null && verbose=1
> > +
> > +   local changed newcontents
> > +   if [[ -v _esed_output ]]; then
> > +           [[ -v verbose ]] &&
> > +                   einfo "${FUNCNAME[0]}: sed ${*} > ${_esed_output} ..."
> > +
> > +           sed "${@}" > "${_esed_output}" \
> > +                   || die "failed to run: sed ${*} > ${_esed_output}"
> > +
> > +           eval 'newcontents=$(<${_esed_output})' 2>/dev/null \
> > +                   || die "failed to read: ${_esed_output}"
> > +
> > +           local IFS=$'\n' # sed concats with newline even if none at EOF
> > +           contents=${contents[*]}
> > +           unset IFS
> > +
> > +           [[ ${contents} != "${newcontents}" ]] && changed=1
> > +
> > +           [[ -v verbose ]] &&
> > +                   diff -u --color --label="${files[*]}" 
> > --label="${_esed_output}" \
> > +                           <(echo "${contents}") <(echo "${newcontents}")
> > +   else
> > +           [[ -v verbose ]] && einfo "${FUNCNAME[0]}: sed -i ${*} ..."
> > +
> > +           sed -i "${@}" || die "failed to run: sed -i ${*}"
> > +
> > +           for ((i=0; i<${#files[@]}; i++)); do
> > +                   eval 'newcontents=$(<"${files[i]}")' 2>/dev/null \
> > +                           || die "failed to read: ${files[i]}"
> > +
> > +                   if [[ ${contents[i]} != "${newcontents}" ]]; then
> > +                           changed=1
> > +                           [[ -v verbose ]] || break
> > +                   fi
> > +
> > +                   [[ -v verbose ]] &&
> > +                           diff -u --color --label="${files[i]}"{,} \
> > +                                   <(echo "${contents[i]}") <(echo 
> > "${newcontents}")
> > +           done
> > +   fi
> > +
> > +   [[ -v changed ]] \
> > +           || die "no-op: ${FUNCNAME[0]} ${*}${_esed_command:+ (from: 
> > ${_esed_command})}"
> > +}
> 
> ... but then it's almost 100 lines of shell code, including very
> convoluted parsing of parameters. The code for detection whether a
> parameter is actually a file is a heuristic at best and looks rather

Even if it does pickup sed -e "s_im-a_real-file_", it's a semi
non-issue given the file won't change before and after and it'll
evaluate the others for difference instead.

Where it gets a bit more complicated is enewsed given this is valid:
sed s/// file1 file2 > file3
Of course, could always ban multiple files on enewsed if felt really
needed, albeit fwiw it's a limitation for a very unlikely scenario.

"--" support is probably what adds the most complexity here, but
will have a problem if any ebuild ever needs to sed a -file
(-- is also useful in esedfind fwiw)

Also I did write tests so the minor complexity hopefully doesn't
lead to regressions.

iwdevtools' qa-sed uses getopt(long) instead, but even that is not
perfect given sed has special rules (plus wouldn't want to depend
on || ( getopt util-linux ) here).

> brittle. Also, don't use eval because it is evil.

This is static evals, they're just evaluating a flat string and it's
no different than.. well just running it as-is except it works around
a noise issue (the 2>/dev/null doesn't register without it).

eval is mostly evil when it's:
eval "${expanded_variable}=${what_is_this_even}"

Seeing eval "var=\${not_expanded}" as "evil" makes no sense to me,
it's a harmless warning silencer. I feel this is just overreaction
to the eval keyword (also in other dev ML post).

To reproduce the warning:
$ var=$(printf "\0") # var=$(<file-with-null-bytes)
bash: warning: command substitution: ignored null byte in input

doesn't work: var=$(printf "\0") 2>/dev/null
works: eval 'var=$(printf "\0")' 2>/dev/null

> 
> So IMHO this isn't the right approach to the problem. If anything, make
> it a simple function with well defined arguments, e.g. exactly one
> expression and exactly one file, and call "sed -i" on it.

I was hoping it wouldn't be a limitation to using sed, aka just add
checks and take very little away. Would've even liked to do 100% compat
like qa-sed (transparent stdin/out support), but that one is more
complicated and drew the line.

Lacking multiple files on esed (not enewsed) would be kinda annoying
I think. esedfind also makes use of the fact it can take multiple
files for when people need things like:

find . -name '*.c' -exec sed -i s/__AVX__/__AVX2__/ {} + || die
(esedfind would die if not one macro was replaced)

I tried to use esed in several ebuilds and see what felt needed or
is annoying without hoping people wouldn't mind using it over sed.

Goal here is not to replace sed nor encourage using it, but just add
a guard to ensure changes happened if it does get used. I've seen
plenty of ebuilds with a sed that's been broken for >5-10 years.

> 
> Then again, we had something similar in the past ("dosed" as a package
> manager command) but later banned it for good reason.

Well, dosed /used/ to be kinda useful to avoid needing:

mv file "${T}"/tmpfile || die
sed s/// "${T}"/tmpfile > file || die

It's only later that dosed started using -i itself and mostly didn't
have a purpose anymore given ebuilds could now use -i too, see:

https://bugs.gentoo.org/152017

> 
> Detection whether the file contents changed also seems complicated.
> Why not compute a hash of the file before and after sed operated on it?
> md5sum or even cksum should be good enough, since security isn't an
> issue here.

I don't think this would simplify beside now calling more external
commands twice then comparing the outputs (capturing command output vs
capturing file is not particularly different). May be faster on large
amount of files but probably not meaningful for single/few files.

This may make sed concatenation more complicated to handle too (cat
doesn't concatenate the same way sed does wrt newlines at EOF). Not
that couldn't ban that on enewsed, although it's more limitations
over not much.

>

All this aside, I'm not adamant about adding this. If not enough
interest then I'll just drop the idea. May try to implement multiple
expression support in iwdevtools' qa-sed instead, albeit being a
separate QA thing that doesn't know the maintainer's intend (aka
which seds are deterministic or not) and may also be missed, it
doesn't really ensure ebuilds' seds are kept up to date.

-- 
ionen

Attachment: signature.asc
Description: PGP signature

Reply via email to