I can't seem to get the hang of it for this particular job.
Well, even as a regexp wonk, it's a bit of a daunting task you
have before you. :)
Most of the problem is with dates, in that I have a mishmash of
formats.
Since you don't mention any other problematic sections, I guess
I'll focus on just mending dates. However, if there are other
sections of trouble, feel free to mention them too and perhaps
some handy solution can be found.
Most of them are in dashed format, but there's not even much
uniformity _there_: some are MM-DD-YYYY, some are M-D-YY, and so
on. What I'd like to do is reformat them en masse as MM/DD/YYYY;
preserving the original values, replacing dashes with slashes,
putting zeroes in front of existing single digits, and expanding
two-digit years into four digits by bolting on "20" at the front.
Well, my first pass at a regexp to *find* these buggers would be
something like
\<\(\d\{1,2}\)[-/]\(\d\{1,2}\)[-/]\(\d\d\|\d\d\d\d\)\>
If there are characters other than "-" and "/" used as
separators, you can add them to those two bracket sets. This
regexp should now have the three pieces isolated, and
referencable (is that a word?) via the usual method of "\1",
"\2", and "\3", or (since you need to massage them) via the
submatch() function.
That regexp is basically "one or two digits, followed by a
delimiter, followed by one or two digits, followed by a
delimiter, followed by 2 or 4 digits".
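If you'd like to sanity-check the pattern outside Vim before letting it loose on your file, here's a rough Python translation. It's only a sketch: Python's re syntax differs from Vim's (\b stands in for \< and \>, and the groups aren't backslash-escaped), and find_dates() is a name I made up for illustration. The month/day labels assume your dates really are month-first.

```python
import re

# Python rendering of the Vim pattern: one or two digits, a delimiter,
# one or two digits, a delimiter, then four or two digits.
# \b plays the role of Vim's \< and \> word boundaries.
DATE_RE = re.compile(r"\b(\d{1,2})[-/](\d{1,2})[-/](\d{4}|\d{2})\b")


def find_dates(text):
    """Return (month, day, year) string tuples for every date-like match.

    find_dates is a hypothetical helper, not anything Vim provides.
    """
    return [m.groups() for m in DATE_RE.finditer(text)]
```

The word boundaries are what keep something like "123-4-05" from being picked up as a date starting in the middle of "123".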
To do some magicomystico replacement on them, we then use the \=
replacement as described in
:help sub-replace-special
The replacement will be something like this expression:
substitute('0'.submatch(1), '.*\(..\)$', '\1', '').
'/'.
substitute('0'.submatch(2), '.*\(..\)$', '\1', '').
'/'.
(strlen(submatch(3)) == 4?
submatch(3):
(submatch(3)[0] == '0'?
'20'.submatch(3):
'19'.submatch(3)
)
)
I broke it out into multiple lines to hopefully make more sense
of it. The first two substitute() lines add a zero on the left
of whatever they found, and then take whatever the rightmost two
characters of the result are...effectively padding them with
zeros on the left if needed. Ideally, Vim would provide a
right() function where you could just do something like
right('0'.submatch(1), 2)
to zero-pad to 2 places. Alas, the substitute() trick is the
easiest way I've found to simulate this.
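For what it's worth, here's the same "prepend a zero, keep the last two characters" trick as a small Python sketch, just to show the idea in isolation. Both right() and pad2() are hypothetical names of my own, not Vim functions.

```python
def right(s, n):
    """Hypothetical right(): the rightmost n characters of s."""
    return s[-n:]


def pad2(digits):
    """Zero-pad a one- or two-digit string to exactly two characters.

    Mirrors substitute('0'.submatch(1), '.*\\(..\\)$', '\\1', ''):
    stick a zero on the front, then keep only the last two characters.
    """
    return right("0" + digits, 2)
```

A single digit picks up the zero; a two-digit value has the extra zero chopped right back off.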
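The third element monkeys with the year. If it's a 4-digit year
(strlen() == 4) then we just use that. If it's not a 4-digit
year, we check the first digit of what was there. If it's a
zero, we prepend '20' to it. If it's not a zero, we presume it
was sooo last century, and prepend '19' to it. (You asked for
"20" across the board; if every two-digit year in your file
really is from this century, swap that inner conditional for
plain '20'.submatch(3).)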
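The year rule on its own, again as a hedged Python sketch (expand_year() is a made-up name, and the century guess is the one described above, not any kind of standard):

```python
def expand_year(year):
    """Expand a two-digit year string to four digits.

    Four-digit years pass through untouched. For two-digit years the
    guess is: leading '0' means 20xx, anything else means 19xx.
    """
    if len(year) == 4:
        return year
    century = "20" if year[0] == "0" else "19"
    return century + year
```

Do note the edge of that guess: a year written as "12" comes out as 1912, not 2012. If that's wrong for your data, always prepending "20" is the simpler rule.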
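However, Vim wants it all crammed on one line, so if your mail
reader wraps the command below, join it back up before running
it. As written it changes only the first date on each line; tack
"!g" onto the end if a line can hold more than one. The final
product looks something like this one-liner (take a deep breath
and a running start now...)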
:%s!\<\(\d\{1,2}\)[-/]\(\d\{1,2}\)[-/]\(\d\d\d\d\|\d\d\)\>!\=substitute('0'.submatch(1), '.*\(..\)$', '\1', ''). '/'.substitute('0'.submatch(2), '.*\(..\)$', '\1', ''). '/'.(strlen(submatch(3)) == 4?submatch(3):(submatch(3)[0] == '0'?'20'.submatch(3):'19'.submatch(3)))
And you thought that would be hard. ;-)
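If you want to try the whole transformation on a few sample lines before running it on the real file, something like this Python sketch does the same find-and-rebuild in one pass. It's my own translation rather than anything Vim runs, normalize_dates() and _fix() are hypothetical names, and unlike the :s command without a "g" flag, re.sub() rewrites every date on a line.

```python
import re

# Same pattern as the Vim one-liner, in Python's re syntax.
DATE_RE = re.compile(r"\b(\d{1,2})[-/](\d{1,2})[-/](\d{4}|\d{2})\b")


def _fix(match):
    """Rebuild one matched date as MM/DD/YYYY."""
    month, day, year = match.groups()
    month = ("0" + month)[-2:]  # zero-pad to two characters
    day = ("0" + day)[-2:]
    if len(year) != 4:
        # Leading '0' is taken as 20xx, anything else as 19xx.
        year = ("20" if year[0] == "0" else "19") + year
    return month + "/" + day + "/" + year


def normalize_dates(text):
    """Rewrite every date-like match in text as MM/DD/YYYY."""
    return DATE_RE.sub(_fix, text)


if __name__ == "__main__":
    print(normalize_dates("3-7-05, 12-25-1999, 1/2/98"))
```

Feeding it a handful of lines copied from your file is a cheap way to spot formats the pattern misses.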
HTH,
-tim