Raul

That works like a charm! It gets all the parameters, and puts them in the
right columns. Now I'll try it on a larger data file with real data in it:

  $ww
10         NB. ww has ten log files in it, one box per log file.
   $;ww
969059  NB. ww razed (unboxed and catenated) is one long text string of all
the log files. Each log file has lots of events in it, and each event has
lots of parameters.

   a. i. crlftb
13 10 9      NB. The noun crlftb has CR, LF, Tab in it.

NB. This is the terminator string for all the lines in the log file.
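
For anyone following along: the definition of crlftb is not shown in this
session. One way it could be built, purely as an assumption, is by indexing
into the alphabet a.:

```j
NB. Assumed definition (not taken from the original session):
NB. select CR, LF and Tab from the ASCII alphabet a.
   crlftb =: 13 10 9 { a.
   a. i. crlftb
13 10 9
   #crlftb
3
```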

NB. I want the parameters on every line starting with STATUS, RESULT[0],
or CONFIDENCE[0]

tags2 =: 'STATUS'; crlftb ; 'RESULT[0]'; crlftb ; 'CONFIDENCE[0]' ; crlftb
   tags2
┌──────┬───┬─────────┬───┬─────────────┬───┐
│STATUS│   │RESULT[0]│   │CONFIDENCE[0]│   │
└──────┴───┴─────────┴───┴─────────────┴───┘

NB. Now the acid test:

 txt9 =:  (; ww) getTagsContents tags2
   $txt9
120 3

So there were 120 events across all the log files that had at least one of
the three parameter values we wanted.

Let's take a look:

  cleanString1 10 {. 100 }. txt9
┌───────────┬───────────┬─────────────────┐
│           │           │[0][__MRCP_GID] 0│
├───────────┼───────────┼─────────────────┤
│           │           │[0][__MRCP_STR] 0│
├───────────┼───────────┼─────────────────┤
│RECOGNITION│main menu  │75               │
├───────────┼───────────┼─────────────────┤
│           │           │[0][__MRCP_GID] 0│
├───────────┼───────────┼─────────────────┤
│           │           │[0][__MRCP_STR] 0│
├───────────┼───────────┼─────────────────┤
│RECOGNITION│ninety five│64               │
├───────────┼───────────┼─────────────────┤
│           │           │[0][__MRCP_GID] 0│
├───────────┼───────────┼─────────────────┤
│           │           │[0][__MRCP_STR] 0│
├───────────┼───────────┼─────────────────┤
│RECOGNITION│yes        │86               │
├───────────┼───────────┼─────────────────┤
│           │           │[0][__MRCP_GID] 0│
└───────────┴───────────┴─────────────────┘

Yes! That's it!

Raul, Frasier, Björn, Linda, *thanks to all of you* for helping me on this
problem.

Now I have to do this same thing to a few thousand log files instead of
just 10. Then I need to do all kinds of analysis on the resulting data. I
think I know enough J to do the analysis part, but I still may have to ask
a question or two, if I get stuck.
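
A rough sketch of how the batch step might look, assuming the logs all sit in
one directory. The names logdir, names, wwAll and txtAll, and the '*.log'
pattern, are placeholders I made up for illustration, not real paths:

```j
NB. Sketch only. logdir and the '*.log' pattern are hypothetical.
logdir =: '/path/to/logs/'
NB. 1!:0 returns a directory table; column 0 holds the boxed file names.
names  =: {."1 (1!:0) < logdir,'*.log'
NB. Read each file with 1!:1, giving one box per log file (like ww above).
wwAll  =: (1!:1)@< each (logdir&,) each names
NB. Same call as before, on the razed text of all the files.
txtAll =: (; wwAll) getTagsContents tags2
```

With a few thousand files the razed text may get large, so running this in
batches of files and catenating the resulting tables is probably safer.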

I'll let you all know how it goes....

Skip

On Sat, Nov 19, 2011 at 11:24 AM, Raul Miller wrote:

> Note that this isn't really a new function -- it's the same one that you
> posted (or would have posted, I think, if you had posted the last line of
> it).  Except, mine was from a version that had =: instead of =. for its
> intermediate results.  That's bad, for production code, but it does let us
> see what the bug is:
>
>   _4 ]\ expand #inv (+/expand){.data
>
> ┌────────────────────┬───────────────────┬─────────────────────────┬────────────────────┐
> │param1              │param2             │param3                   │param5              │
> ├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param1    =  12345  │param2    =   NONE │param3   =   hello world │                    │
> ├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │                    │                   │param1  = 34567          │param3  = hello bob │
> ├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param5   - zero one │                   │                         │param5 = two three  │
> ├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param1 = 6789       │param2 = SOME      │                         │                    │
> ├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param1              │param2             │param3                   │param5              │
> └────────────────────┴───────────────────┴─────────────────────────┴────────────────────┘
>
> I am not defining "expand" properly.  Thus, parameters are being misplaced.
>
> If I use an alternate definition for expand, it seems to get the parameters
> into the right places:
>
>   expand=: ;0 1 2 3 e.L:0 (<;.1~ 1,2>:/\]) ,I. |:>(e.L:0~ /:~@;) {."1 locs
>   _4 ]\ expand #inv (+/expand){.data
>
> ┌───────────────────┬───────────────────┬─────────────────────────┬────────────────────┐
> │param1             │param2             │param3                   │param5              │
> ├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param1    =  12345 │param2    =   NONE │param3   =   hello world │                    │
> ├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param1  = 34567    │                   │param3  = hello bob      │param5   - zero one │
> ├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │                   │                   │                         │param5 = two three  │
> ├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param1 = 6789      │param2 = SOME      │                         │                    │
> ├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
> │param1             │param2             │param3                   │param5              │
> └───────────────────┴───────────────────┴─────────────────────────┴────────────────────┘
>
> ...and this also lets me clean up some unneeded stuff (I no longer need to
> add the blank tags to the text I am working with, and so I no longer need
> to drop those rows from the result), except that it blows up if no tags are
> present, so I can't get rid of that entirely...
>
> Anyways, here's how it looks with this definition for expand:
>
> getTagsContents=: 4 :0
>  'n m'=. $tags=. > _2 <\ y
>   locs=. (-@#@[ {. I. {./. ])&.>/\"1 tags [email protected]:0 }. txt=. ' ',x,;tags
>   assert. -:&/:&;/ |:locs  NB. tags must be balanced
>   data=. _2 {:\  ((/:~ ; locs) I. i.#txt) </.  txt
>  expand=. ;(i.n) e.L:0 (<;.1~ 1,2>:/\]) ,I. |:>(e.L:0~ /:~@;) {."1 locs
>  }: (#@>{."1 tags) }.&.>"1 (-n) ]\ expand #inv (+/expand){.data
> )
>
> --
> Raul
>
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm