Assuming the eight four-digit fields are always separated either by nothing or by one single character, which must be the same throughout, this horrible sed substitution does the job:
s/.*\([[:xdigit:]]\{4\}\)\([^[:xdigit:]]\{0,1\}\)\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\)\2\([[:xdigit:]]\{4\}\).*/\1:\3:\4:\5:\6:\7:\8:\9/
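
To make the substitution easier to try out, here is a minimal sketch that assembles the same expression from two shorthand variables and runs it on two made-up host names. The names X, S and extract_addr, as well as the sample inputs, are assumptions introduced for this illustration only.

```shell
# The substitution from above, assembled from two shorthand variables.
# X, S and extract_addr are names made up for this sketch.
X='\([[:xdigit:]]\{4\}\)'        # one captured four-digit field
S='\([^[:xdigit:]]\{0,1\}\)'     # captured separator: nothing or one non-hex character

extract_addr() {
  sed "s/.*$X$S$X\\2$X\\2$X\\2$X\\2$X\\2$X\\2$X.*/\\1:\\3:\\4:\\5:\\6:\\7:\\8:\\9/"
}

# Fields run together, no separator at all.
printf '%s\n' 'x-0123456789abcdef0123456789abcdef.example.org' | extract_addr
# prints 0123:4567:89ab:cdef:0123:4567:89ab:cdef

# Fields separated by one dot each.
printf '%s\n' 'x-0123.4567.89ab.cdef.0123.4567.89ab.cdef.example.org' | extract_addr
# prints 0123:4567:89ab:cdef:0123:4567:89ab:cdef
```

Group 2 is the separator, which is why the replacement skips from \1 straight to \3.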

If you still want the separator to be the same everywhere but allow it to be any string of at most n non-hexadecimal characters, replace \{0,1\} with \{0,n\}, writing the actual number in place of n. If you want the separator to be any such string of at most n characters, but possibly a different one between different fields, make that same change and also replace every \2 in the pattern with [^[:xdigit:]]\{0,n\}.
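
The two generalizations can be sketched as small shell functions that take n as their argument. The function names same_sep and any_sep, the helper variables and the sample inputs are hypothetical, chosen only for this illustration.

```shell
X='\([[:xdigit:]]\{4\}\)'        # one captured four-digit field
OUT='\1:\3:\4:\5:\6:\7:\8:\9'    # replacement: the eight fields joined by ":"

# Same separator everywhere, at most n characters ($1 is n).
same_sep() {
  S="\\([^[:xdigit:]]\\{0,$1\\}\\)"
  sed "s/.*$X$S$X\\2$X\\2$X\\2$X\\2$X\\2$X\\2$X.*/$OUT/"
}

# Possibly different separators, each at most n characters ($1 is n).
# The first separator stays captured so the field numbers do not shift.
any_sep() {
  A="[^[:xdigit:]]\\{0,$1\\}"
  sed "s/.*$X\\($A\\)$X$A$X$A$X$A$X$A$X$A$X$A$X.*/$OUT/"
}

printf '%s\n' '0123::4567::89ab::cdef::0123::4567::89ab::cdef' | same_sep 2
# prints 0123:4567:89ab:cdef:0123:4567:89ab:cdef

printf '%s\n' '0123-4567::89ab.cdef 0123/4567--89ab_cdef' | any_sep 2
# prints 0123:4567:89ab:cdef:0123:4567:89ab:cdef
```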

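The latter generalization makes it more likely that four consecutive hexadecimal digits next to the address will be wrongly interpreted as one of its fields. Imagine for instance a domain like academy.0123-4567-89ab-cdef-0123-4567-89ab-cdef.berkeley.edu. The word "academy" contains hexadecimal digits too. The leading "acad" cannot be a field, because it is followed by "e", yet another hexadecimal digit, but "cade" can: it is followed by the three non-hexadecimal characters "my.". So if any sequence of up to 3 (or more) non-hexadecimal characters is accepted as a separator, "cade" becomes a candidate for the first field of the address. Because the leading .* is greedy, sed should still prefer the match that starts furthest to the right, which here is the proper address; for the same reason, a four-digit group after the address (say ...89ab-cdef.face.example.org) would be taken as the last field and push the real first field out. If the threshold is below 3, or if the separator must always be the same, the "cade" candidate is ruled out altogether and sed extracts the proper address.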
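
As a quick check of the safe cases on that host name, the sketch below runs both the strict form and a loose form with n = 2. The variable names are made up for this illustration, and the loose form is only one possible way to spell the generalization.

```shell
X='\([[:xdigit:]]\{4\}\)'        # one captured four-digit field
OUT='\1:\3:\4:\5:\6:\7:\8:\9'    # replacement: the eight fields joined by ":"
host='academy.0123-4567-89ab-cdef-0123-4567-89ab-cdef.berkeley.edu'

# Strict form: one optional separator character, the same everywhere.
S='\([^[:xdigit:]]\{0,1\}\)'
printf '%s\n' "$host" | sed "s/.*$X$S$X\\2$X\\2$X\\2$X\\2$X\\2$X\\2$X.*/$OUT/"
# prints 0123:4567:89ab:cdef:0123:4567:89ab:cdef

# Loose form with n = 2: separators may differ, at most two characters each.
A='[^[:xdigit:]]\{0,2\}'
printf '%s\n' "$host" | sed "s/.*$X\\($A\\)$X$A$X$A$X$A$X$A$X$A$X$A$X.*/$OUT/"
# prints 0123:4567:89ab:cdef:0123:4567:89ab:cdef
```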
