On Thu, Mar 18, 2010 at 11:11 PM, Thomas Taylor wrote:
> I import a comma separated list from my bank and open it in OOcalc. I have
> trouble reading the date in their format of YYYYMMDDhhmmss, it's all glommed
> together. I want to run a script on it before importing that will put some
> character (prefer either "_" or " " as shown below.
>
> DEBIT,20100208120000[0:GMT], \
> DEBIT,20100208120000[0:GMT], \
> DEBIT,20100204120000[0:GMT], \
> DEBIT,20100125120000[0:GMT], \
> DEBIT,20100125120000[0:GMT], / rest of lines trimmed
> DEBIT,20100122120000[0:GMT], /
> CREDIT,20100120120000[0:GMT], /
> CHECK,20100119120000[0:GMT], /
> ^ ^
> date field (YYYYMMDDhhmmss)
>
> DEBIT,2010_02_08_120000[0:GMT], <<<<< end result I want
>
> pass 6 or 7 char then find ","
> then 4 digits for year
> insert "_"
> then 2 digits for month
> insert "_"
> then 2 digits for day
> insert "_"
> get rest of line
>
> Tried combinations of sed & awk but haven't been able to insert the "_"
> between the year, month, day, and time fields.
>
> Pointers and suggestion would be greatly appreciated. I'm in the process of
> learning bash scripting so please include how the script works (what it
> does). I'm an old hand at C and assembler but the bash/sed/awk syntax has
> me baffled.
> --
> Thanks, Tom (retired penguin)
> openSuSE 11.3-M3, kde 4.4.0
> FF 3.6.0
>
While this isn't how I would write this in awk, it probably will make
the most sense to a C programmer. Run it with:
    awk -f scriptname datafilename
Note that I assume 4 fields total. Adjust the printf statement as needed.
    BEGIN {FS = ","}
    {
        # year, month, day, then the rest of the field (time and zone)
        $2 = substr($2,1,4) "_" substr($2,5,2) "_" substr($2,7,2) \
             "_" substr($2,9)
        printf "%s,%s,%s,%s\n", $1, $2, $3, $4
    }
The BEGIN line is executed before any of the data is read. This just
sets the field separator to a comma. This could also be done on the
command line.
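A minimal sketch of the command-line form, using awk's -F option in place
of the BEGIN block. The input is a single sample line piped in rather than
a data file, and the variable name out is only there for illustration:

```shell
# -F, sets the field separator to a comma, so no BEGIN block is needed.
out=$(printf 'DEBIT,20100208120000[0:GMT],\n' | awk -F, '{ print $2 }')
echo "$out"
```

This should print just the second field, 20100208120000[0:GMT].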
A brace block with no pattern before it matches every line. In other
words, everything in the brace block is executed for each input line.
$2 is the second field (where $1 is the first and, if you care, $0 is
the whole line).
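To see the per-line block and the field variables in action, here is a
small sketch that feeds two of the sample lines through a pattern-less
block. The labels in the output are made up for the example:

```shell
out=$(printf 'DEBIT,20100208120000[0:GMT],\nCREDIT,20100120120000[0:GMT],\n' | awk '
BEGIN { FS = "," }
# No pattern before the brace, so this block runs once per input line.
{
    print "whole line: " $0
    print "  field 1: " $1
    print "  field 2: " $2
}')
echo "$out"
```

Each input line produces three output lines: the whole line, then the
transaction type, then the glommed-together date field.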
The arguments to substr are the string (a variable or field), the
starting position (counting from 1) and the number of characters. If no
character count is given, all remaining characters are used.
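A quick way to try substr on its own is a BEGIN-only script, which needs
no input file. This sketch uses a hard-coded date string (the variable s
is just for illustration):

```shell
# substr(string, start, length); start counts from 1.
# Leaving out length returns everything from start to the end.
out=$(awk 'BEGIN {
    s = "20100208120000"
    print substr(s, 1, 4)   # year
    print substr(s, 5, 2)   # month
    print substr(s, 7, 2)   # day
    print substr(s, 9)      # rest of the string
}')
echo "$out"
```

That prints 2010, 02, 08 and 120000 on separate lines.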
Strings written next to each other are just (magically) concatenated, so
the $2 = ... line joins the substrings together with the underscore
strings in between. The \ is just line continuation.
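Concatenation and line continuation can be tried in isolation. A minimal
sketch with literal strings (the variable joined is made up for the
example):

```shell
out=$(awk 'BEGIN {
    # Adjacent strings are joined; the trailing \ continues the statement.
    joined = "2010" "_" "02" \
             "_" "08"
    print joined
}')
echo "$out"
```

The result is the single string 2010_02_08.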
As the line was broken up into fields at the commas, they need to be
re-inserted on output as you see in the printf statement.
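If the real file has more (or fewer) than 4 fields, one alternative is to
set the output field separator OFS as well, so awk puts the commas back
itself when it rebuilds the line. This is only a sketch: the trailing
a,b,c fields are placeholders for the trimmed part of the line, and the
underscores follow the result requested in the question.

```shell
out=$(printf 'DEBIT,20100208120000[0:GMT],a,b,c\n' | awk '
BEGIN { FS = OFS = "," }   # split on commas and rejoin with commas
{
    $2 = substr($2,1,4) "_" substr($2,5,2) "_" substr($2,7,2) "_" substr($2,9)
    print                  # assigning to $2 rebuilt $0 using OFS
}')
echo "$out"
```

This gives DEBIT,2010_02_08_120000[0:GMT],a,b,c regardless of how many
fields follow the date.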
Probably not how I would actually do it but it is about as close to C
as you can get.
--
Phil Hughes