Note that this isn't really a new function -- it's the same one that you
posted (or would have posted, I think, if you had included its last line).
Except that mine was from a version that used =: instead of =. for its
intermediate results. That's bad for production code, but it does let us
see what the bug is:
_4 ]\ expand #inv (+/expand){.data
┌────────────────────┬───────────────────┬─────────────────────────┬────────────────────┐
│param1              │param2             │param3                   │param5              │
├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param1 = 12345      │param2 = NONE      │param3 = hello world     │                    │
├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│                    │                   │param1 = 34567           │param3 = hello bob  │
├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param5 - zero one   │                   │                         │param5 = two three  │
├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param1 = 6789       │param2 = SOME      │                         │                    │
├────────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param1              │param2             │param3                   │param5              │
└────────────────────┴───────────────────┴─────────────────────────┴────────────────────┘
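As an aside on the =. versus =: point: here is a minimal sketch of why the global form leaves intermediate results around for inspection. The names demo, tmpLocal and tmpGlobal are made up for illustration and are not part of getTagsContents.

```j
demo=: 3 : 0
  tmpLocal=. y + 1     NB. =. is local: the name is gone once demo returns
  tmpGlobal=: y * 2    NB. =: is global: the name is still defined afterwards
  tmpLocal + tmpGlobal
)

demo 3                 NB. 4 + 6
tmpGlobal              NB. 6, so the intermediate result can be inspected
tmpLocal               NB. value error: the local name no longer exists
```

That is the only reason locs, data and expand were visible in the session after the function had run.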
I was not defining "expand" properly, so parameters were being misplaced.
If I use an alternate definition for expand, it seems to get the parameters
into the right places:
expand=: ;0 1 2 3 e.L:0 (<;.1~ 1,2>:/\]) ,I. |:>(e.L:0~ /:~@;) {."1 locs
_4 ]\ expand #inv (+/expand){.data
┌───────────────────┬───────────────────┬─────────────────────────┬────────────────────┐
│param1             │param2             │param3                   │param5              │
├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param1 = 12345     │param2 = NONE      │param3 = hello world     │                    │
├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param1 = 34567     │                   │param3 = hello bob       │param5 - zero one   │
├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│                   │                   │                         │param5 = two three  │
├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param1 = 6789      │param2 = SOME      │                         │                    │
├───────────────────┼───────────────────┼─────────────────────────┼────────────────────┤
│param1             │param2             │param3                   │param5              │
└───────────────────┴───────────────────┴─────────────────────────┴────────────────────┘
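To make the role of expand concrete, here is a small sketch of the same pattern on made-up data. The names items and mask are hypothetical stand-ins for data and expand. A boolean mask with one slot per (event, parameter) pair says which slots are filled, and #inv spreads the found items over those slots, leaving empty boxes elsewhere:

```j
NB. hypothetical data: 9 items found across 4 events, 4 columns per event
items=: ;: 'a1 a2 a3 b1 b3 b5 c5 d1 d2'
mask=: 1 1 1 0  1 0 1 1  0 0 0 1  1 1 0 0   NB. 1 = slot filled, 0 = slot empty

(+/mask) = #items        NB. the number of 1s must equal the number of items
_4 ]\ mask #inv items    NB. 4 rows of 4 boxes, with an empty box wherever mask is 0
```

So if the mask is wrong, every item still shows up in the result, just in the wrong column or row, which is the kind of misplacement visible in the first table.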
...and this also lets me clean up some unneeded stuff: I no longer need to
add the blank tags to the text I am working with, and so I no longer need
to drop those rows from the result. Except that it blows up if no tags are
present, so I can't get rid of that entirely.
Anyway, here's how it looks with this definition for expand:
getTagsContents=: 4 :0
'n m'=. $tags=. > _2 <\ y
locs=. (-@#@[ {. I. {./. ])&.>/\"1 tags I.@E.L:0 }. txt=. ' ',x,;tags
assert. -:&/:&;/ |:locs NB. tags must be balanced
data=. _2 {:\ ((/:~ ; locs) I. i.#txt) </. txt
expand=. ;(i.n) e.L:0 (<;.1~ 1,2>:/\]) ,I. |:>(e.L:0~ /:~@;) {."1 locs
}: (#@>{."1 tags) }.&.>"1 (-n) ]\ expand #inv (+/expand){.data
)
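For anyone reading the definition cold, here is a hedged sketch of two of its building blocks in isolation, on toy inputs. The shortened tag list and the names cuts and toyTags are made up for illustration. The first part shows how the flat tag list becomes one row per tag, and the second shows the I. plus </. idiom used to cut the text into pieces.

```j
NB. flat list: open tag, close tag, open tag, close tag
toyTags=: 'param1';'crlftb';'param2';'crlftb'
tags=: > _2 <\ toyTags   NB. box non-overlapping pairs, then open: one row per tag
$ tags                   NB. 2 2, which is what 'n m' receives in the definition
{."1 tags                NB. just the opening tags, one box per row

NB. cutting text at sorted positions: I. classifies each character index by
NB. the interval it falls in (relative to cuts), and </. boxes each group
cuts=: 2 5
(cuts I. i. 8) </. 'abcdefgh'
```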
--
Raul
On Sat, Nov 19, 2011 at 10:34 AM, Skip Cave <[email protected]> wrote:
> I tried Raul's latest function:
>
> txt5 =: textfile3 getTagsContents tags1
> txt5
> ┌──────────────┬─────────────┬───────────────────┬──────────────┐
> │ = 12345 │ = NONE │ = hello world │ │
> ├──────────────┼─────────────┼───────────────────┼──────────────┤
> │ │ │ = 34567 │ = hello bob │
> ├──────────────┼─────────────┼───────────────────┼──────────────┤
> │ - zero one │ │ │ = two three │
> ├──────────────┼─────────────┼───────────────────┼──────────────┤
> │ = 6789 │ = SOME │ │ │
> └──────────────┴─────────────┴───────────────────┴──────────────┘
>
> This is VERY close. The only problem left is that each column in the result
> should always contain the same parameter from each event. Each row should
> represent one event in the log. Raul's function comes very close, but it
> scrambled row 2 and ran the params from the second event into row 3.
>
> The parameters in each event will always be in sequence 1, 2, 3, 4, 5, 6,
> etc. However, some events will skip certain parameters, so those params
> will be missing in that event. In our example, we want to extract
> parameters 1, 2, 3, and 5 from all the events. If a parameter is skipped in
> an event that we want to capture, we need to show that with an empty box in
> the result.
>
> Column one should have all of the 'param1's in it, with empty boxes when
> there isn't a param1 in that event's parameter sequence. Column two should
> have all the 'param2's in it, with empty boxes when there isn't a param2 in
> the parameter sequence. Similarly for columns 3 and 4.
>
> Row one should have all the params from the first event in it. Row 2 should
> have all the params from the second event in it, etc.
>
> We want to capture params 1, 2, 3, and 5 in all events. The first event has
> params 1, 2, and 3, but is missing param5, so the first row of the output
> should have an empty box in the last column position, and Raul's output
> does.
>
> The second event has params 1, 3, 4, 5, 6, but is missing param2. So the
> second row of the result should have param1 in the first column position,
> an empty box in the second column position, param3 in the third column, and
> param5 in the fourth column. This is where Raul's function didn't parse the
> event correctly.
>
> The third event only has param5, so the third row should have three empty
> boxes in the first three columns, representing the missing param1, 2, and 3,
> and param5 should be in the last column. This row was also incorrect in
> Raul's output.
>
> The fourth event has params 1 and 2, so those should go in the first two
> columns of row four. Params 3 and 5 are missing, so the last two column
> positions of row four should be empty. Raul's function got this correct.
>
> So here's the correct result.
>
> param1 param2 param3 param5
> +---------------------------------------------------------------+
> ¦ = 12345 ¦ = NONE ¦ = hello world ¦ ¦
> +--------------+-------------+-------------------+--------------¦
> ¦ = 34567 ¦ ¦ = hello bob ¦ - zero one ¦
> +--------------+-------------+-------------------+--------------¦
> ¦ ¦ ¦ ¦ = two three ¦
> +--------------+-------------+-------------------+--------------¦
> ¦ = 6789 ¦ = SOME ¦ ¦ ¦
> +---------------------------------------------------------------+
>
> Skip
>
> On Sat, Nov 19, 2011 at 7:47 AM, Raul Miller <[email protected]>
> wrote:
>
> > getTagsContents=: 4 :0
> > 'n m'=. $tags=. > _2 <\ y
> > locs=: tags I.@E.L:0 }. txt=.(' ',;tags),x,;tags
> > locs=: (-@#@[ {. I. {./. ])&.>/\"1 locs
> > assert. -:&/:&;/ |:locs NB. tags must be balanced
> > data=: _2 {:\ ((/:~ ; locs) I. i.#txt) </. txt
> > expand=: ;(#~ 1&e.S:0) <@|./. |.> (e.L:0~ /:~@;) {."1 locs
> > }: }.(#@>{."1 tags) }.&.>"1 (-n) ]\ expand #inv (+/expand){.data
> > )
> >
> > tags1 =: ('param1'; 'crlftb' ; 'param2'; 'crlftb' ; 'param3' ; 'crlftb'
> > ;'param5' ; 'crlftb' )
> >
> > (and textfile3 from Skip's message, below):
> >
> > textfile3 getTagsContents tags1
> > ┌──────────────┬─────────────┬───────────────────┬──────────────┐
> > │ = 12345 │ = NONE │ = hello world │ │
> > ├──────────────┼─────────────┼───────────────────┼──────────────┤
> > │ │ │ = 34567 │ = hello bob │
> > ├──────────────┼─────────────┼───────────────────┼──────────────┤
> > │ - zero one │ │ │ = two three │
> > ├──────────────┼─────────────┼───────────────────┼──────────────┤
> > │ = 6789 │ = SOME │ │ │
> > └──────────────┴─────────────┴───────────────────┴──────────────┘
> >
> > Note that I wrote it so that the flat text is the left argument and the
> > tags are the right argument.
> >
> > FYI,
> >
> > --
> > Raul
> >
> >
> >
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>