Hey guys,
I needed to represent a "nested list" like this:
NEST =. ;: '(345 * _2 + +/@:(+/%#) * ]) - ''hello''"_ , <.'
as analogously-nested set of boxes, like this:
] BOX =. (< (;:'345*_2+;+/@:') , (<;:'+/%#') , ;:'*]') ,
;:'-''hello''"_,<.'
+-----------------------------------+-+-------+-+-+-+--+
|+---+-+--+-+-+-+-+--+---------+-+-+|-|'hello'|"|_|,|<.|
||345|*|_2|+|;|+|/|@:|+-+-+-+-+|*|]|| | | | | | |
|| | | | | | | | ||+|/|%|#|| | || | | | | | |
|| | | | | | | | |+-+-+-+-+| | || | | | | | |
|+---+-+--+-+-+-+-+--+---------+-+-+| | | | | | |
+-----------------------------------+-+-------+-+-+-+--+
This is something I've had to do a number of times, with minor variations.
The most recent case was in the ae2 script [1], which is as above,
except instead of the actual J words I needed to nest their indices in the
original sentence. But I've encountered it in other guises over the
years: parsing XML, data representations from other languages, etc.
So, what I would like is a general dyad to transform a "parenthesized" list
into nested boxes:
A. The right argument (y) is the "parenthesized list"
(in the example above, it's a list of J words,
some of which are boxed parens, but it could be
anything, like a string with {}s, boxes containing
XML elements, etc).
B. The left argument (x) indicates or implies the "level"
or "depth" of each of the items of y . For example,
if y were NEST then the level of each word is:
(345 * _2 + +/@:(+/%#) * ]) - 'hello'"_ , <.
1 1 1 1 1 11 1222222 1 11 0 000 0 0
How to represent this is up to you. You could require
parens to be included or excluded from both x and y;
if included you could require their level to be be N
or N+1 or different for open and close parens; you
could use an integer for the level of each element,
like above, or just _1 0 1 to indicate "level
change", etc.
The key thing is how to represent x is up to you, but
how to calculate x from y is up to your user [2].
C. The output is the nested-box analog of y.
Here's how I do it in [1]:
nest2Box =. cleanNest@:cutNest@:,&<~
cleanNest =. bubble@:rmScaff
bubble =. >@:{.^:(1=#)L:1^:_
rmScaff =. >@:{.L:1
cutNest =. cutBot L:1^:maxDepth
cutBot =. ('';1) <@:,;.1 ] noParen;.2&>~ minMask
noParen =. <@:(}.@:}:^:(1<#))
minMask =. (=<./)&.>@:{:
maxDepth =. 1 + >./@:>@:{:
n2b =: nest2Box f.
x =. (;:'()') (+/\@:(-/))@:(=/) NEST NB. Depth of each element
y =. i.#NEST NB. Elements
x n2b y
+----------------------------------+--+--+--+--+--+--+
|+-+-+-+-+-+-+-+------------+--+--+|17|18|19|20|21|22|
||1|2|3|4|5|6|7|+-+--+--+--+|14|15|| | | | | | |
|| | | | | | | ||9|10|11|12|| | || | | | | | |
|| | | | | | | |+-+--+--+--+| | || | | | | | |
|+-+-+-+-+-+-+-+------------+--+--+| | | | | | |
+----------------------------------+--+--+--+--+--+--+
But I'm not too proud of it [3]. For one thing, it's not general; y must
be open (not boxed), because of the way I use L: :
x n2b NEST
|domain error: n2b
| x n2b NEST
Which raises the question of why I use L: . Isn't nesting a textbook
example of a recursive problem? Shouldn't I be recursing from the top
down, rather than cutting from the bottom up?
The reason is that I, like Jose, prefer fully fixable tacit verbs. There
are essentially two ways to recurse in J; having a named verb call its own
name, and anonymous recursion using $: . Neither method lends itself to
fully fixable tacit form.
Named recursion doesn't permit fixing, and requires a verb be named when
ideally all verbs work the same when anonymous. And verbs that use $:
cause problems when fixed in the definition of other verbs. Essentially,
when a verb ("f") is fixed, if one of the verbs it calls ("g") uses $: ,
then that $: stops referrring to f and starts referring to g , or
introduces an explicit context where it wasn't previously required (and
stops the verb from being fully tacit). On that note, I should mention
that I can picture approaches that generate and evoke code, but I consider
them ambiguously tacit at best.
So, while I'm interested in all approaches (because I can do the work to
adapt them to the form I prefer), I'm particularly interested in verbs
which are tacit, fully fixable, and could be replaced everywhere by their
definition in parentheses. I think this implies cutting from the bottom
up, as I have. But I'm happy to be proven wrong.
-Dan
[1]
http://www.jsoftware.com/svn/DanBron/trunk/environment/anonymous_evoke2.ijs
[2] Allowing the user to calculate x, rather than calculating
it for him, makes future extensions easier. For
convenience, we could also provide cover verbs of
nest2Box to calculate x for common cases (like J words,
strings with {}s, etc).
If we did, I think it would be useful to let the cover
verb's left argument be (potential) items from y that
represent "parentheses", in several forms:
(A) A single item that counts as both open and
close parens,
(B) A list of 2 items which count as open and
close-parens respectively,
(C) A table of (B); a N by 2 table of items
that count as open-parens and close-parens.
[3] If nothing else, improvements to nest2Box are welcome. For example,
bubble and rmScaff are essentially identical. (I know I implemented a
similar function in a different context, and IIRC it was a bit more
elegant. But I can't find that now.)
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm