Ok, let me jump in here.
First: AFAICS your approach works.
tags6=: 'start{';'STATUS';'RESULT[0]';'CONFIDENCE[0]';'UTTERANCE_FILENAME'
ts 'it=. tags6 linend start taggedEvents fread <FILE'
1.26116 6.99762e7
# it
4917
Compared with my version, which also adds the DG_ fields as the last column:
ts 'it=.(fread <FILE) getFields3 tags6'
3.1375 1.56796e8
So you win :-)
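For anyone reading along: ts is not defined in this post. I am assuming the usual forum idiom that returns execution time in seconds followed by space in bytes; if your ts differs, adjust accordingly. Read that way, the first version is about 2.5 times faster and uses less than half the space. A minimal sketch of that assumed definition:

```j
NB. Assumed definition of ts (not taken from this thread):
NB. 6!:2 times a sentence in seconds, 7!:2 reports the space it needs in bytes.
ts=: 6!:2 , 7!:2@]

r=: ts '+/ i. 1000'   NB. two numbers: seconds, bytes
```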
My interpretation of the log file layout:
intro line CR,LF
...
start{ CR,LF
intro line CR,LF
...
TAB <tag><data> CR,LF
TAB <tag><data> CR,LF
...
TAB <tag><data> CR,LF
}end CR,LF
start{ CR,LF
...
}end CR,LF
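To make the layout concrete, here is a small sketch on a made-up sample (the sample text and the names sample, lines, tagged, tags and data are mine, purely for illustration). It assumes every tagged line has the form TAB, tag, '=', data, and it skips the tag clean-up that SepTagContents does:

```j
NB. Hypothetical sample following the layout above (not real log data)
EOL    =: CR,LF
sample =: 'intro line',EOL,'start{',EOL,TAB,'STATUS=ok',EOL,TAB,'RESULT[0]=yes',EOL,'}end',EOL

lines  =: <;._2 sample -. CR            NB. drop the CRs, cut at each LF
tagged =: }.&.> (#~ TAB = {.&>) lines   NB. keep TAB-led lines, drop the TAB
tags   =: ({.~ i.&'=')&.> tagged        NB. text before the first '='
data   =: (}.~ >:@i.&'=')&.> tagged     NB. text after the first '='
```

On this sample, tags should be the boxed list 'STATUS';'RESULT[0]' and data the boxed list 'ok';'yes'. Cutting into records at the start{ lines is left out here; getFields3 does that with ;._1.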
SepTagContents=:(({.~i.&1@:=~&' ')@{.;}.@}.)~ i.&1@:=~&'=' NB. separating and cleaning up
isMember=: +./@:E.
EOL=:CR,LF
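getFields3=: 4 :0
NB. x: log file contents, y: boxed tags; the first tag marks the start of a record
ft=. }. y
NB. a: cut into lines at EOL, cut into records at lines containing the start tag,
NB.    keep the TAB-led lines of each record with the TAB dropped
a=. ({. y) (]<@(}.&.>#~TAB={.&>);._1~ isMember&>) (<@}.;._2~EOL&E. ) x
NB. c: split each line into tag and contents, keep only the requested tags
c=.(#~(ft e.~ {.&>))L:2 SepTagContents L:0 a
NB. dg: contents of the DG_ lines, two per row
dg=. _2 ]\&.> ({:@SepTagContents)&> L:1 (#~('DG_'-:3&{.)&>)L:1 a
NB. one row per record, contents placed in the columns of their tags, dg as last column
dg ,.~((a:$~# ft) {:@>@[`((ft I.@:e.{.&>)@[)`]}~ ;)"0 c
)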
As you always say
FYI
--
Kind regards,
Arie Groeneveld