I wrote:
the sequence of the contents of that entity.  (Conceptually, you should
have a second box even when that sequence is empty, but you can
toss these empty boxes with no loss of information.)

That said, tossing those empty boxes makes parsing more complex
than it needs to be.

Here's an implementation of "python-like" parsing, which retains
the empty boxes.  Retaining the empty boxes allows me to use
more regular arrays, which eliminates the need for two levels of boxing
(one for sequencing and another for containment).

In other words, instead of two levels of boxing, I can use a two
dimensional array of boxes, where the first dimension indicates
sequence and the second dimension reflects containment.

(The second dimension is always 2, and the contained elements
are in the second half.)
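
A small hand-built illustration of that layout may help.  This is
only a sketch: the names leaf, kids and tr are made up for this
example and are not used by the implementation.

```j
NB. Illustration only; leaf, kids and tr are hypothetical names.
leaf=: ;&a:                          NB. name -> 2 boxes: name, empty contents
kids=: (leaf 'td 1') ,: leaf 'td 2'  NB. 2 by 2 table, one row per sibling
tr=: ,: 'tr' ; <kids                 NB. 1 by 2 table, kids boxed in column 1

$ kids             NB. 2 2
$ tr               NB. 1 2
> (<0 1) { tr      NB. the contents of tr: the kids table again
```

Each row is one entity, column 0 holds its name and column 1 holds
its contents, which is either an empty box or another table of the
same shape.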

Here's an example implementation:

words=: <@(((a: ,:@;~ }.~);])(0 i.~' '=]));._2
blank=: <,:0;~i.0 2
parse=: [: value [: crate^:(<:@#)@> [: shift&.>/ blank |.@, words
value=: 0 {:: ,
level=: >:@[EMAIL PROTECTED]&(1&{::)
shift=: [EMAIL PROTECTED](level {:)

NB. x - new word, y - parse stack
enter=: ,~
group=: }:@] , [ (,~&.>&{. , {:@[) {:@]
leave=: [ group (] bunch [: +/ <&(1&{::"1))

bunch=: [EMAIL PROTECTED]:[~ NB. x - levels to package
crate=: _2&}. (,<) _1 0&{:: stuff _2 0&{::
stuff=: <@[  (<_1 _1)} ]

Notice how my code which stuffs contained elements into
crates is much simpler than my previous example.

Also, the display of the result is compact enough that
I think it's ok to display in a message:

  test
html
head
body
 p stuff
 p div='adiv'
  more stuff
 table
  tr
   td 1
   td 2
 p last stuff

  parse test
+----+-----------------------------------+
|html|+----+----------------------------+|
|    ||head|                            ||
|    |+----+----------------------------+|
|    ||body|+------------+-------------+||
|    ||    ||p stuff     |             |||
|    ||    |+------------+-------------+||
|    ||    ||p div='adiv'|+----------++|||
|    ||    ||            ||more stuff|||||
|    ||    ||            |+----------++|||
|    ||    |+------------+-------------+||
|    ||    ||table       |+--+-------+ |||
|    ||    ||            ||tr|+----++| |||
|    ||    ||            ||  ||td 1||| |||
|    ||    ||            ||  |+----++| |||
|    ||    ||            ||  ||td 2||| |||
|    ||    ||            ||  |+----++| |||
|    ||    ||            |+--+-------+ |||
|    ||    |+------------+-------------+||
|    ||    ||p last stuff|             |||
|    ||    |+------------+-------------+||
|    |+----+----------------------------+|
+----+-----------------------------------+
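
The session above doesn't show how test was built.  Since words
cuts with ;._2, which treats the last character of its argument as
the delimiter, test has to end with the line separator.  One way to
get such a noun, offered as an assumption rather than as what was
actually typed, is a 0 : 0 definition, which keeps leading blanks
and ends every line with LF:

```j
NB. Assumed definition of the sample input (not shown in the message).
test=: 0 : 0
html
head
body
 p stuff
 p div='adiv'
  more stuff
 table
  tr
   td 1
   td 2
 p last stuff
)

{: test = LF     NB. 1: the text ends with LF
+/ test = LF     NB. 11: one LF per line
```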

--
Raul

P.S. I've made an experimental change to my mail settings.
Please let me know if this message comes through as garbage.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
