Thank you, thank you, thank you! I'll digest your mail, spend a few more
productive hours with the Dictionary, and send comments, but this looks like
the kind of explanation I've been wanting.
... peter
Raul Miller wrote:
On Mon, Jul 30, 2012 at 4:12 PM, Peter B. Kessler
<[email protected]> wrote:
In the "alternating binary array" case, it seems, to me, like what's called
for is a small finite state machine (or regular expression parser, if that's
how you think). Start in a neutral state, and move to a new state on
reading a column of the input. I think there are only 4 states:
TooSoonToTell, MostRecently01, MostRecently10, and Error. TooSoonToTell
just moves to one of the MostRecently states based on the input (or to Error
on 1 1). The MostRecently states transition to the other MostRecently on
the "other" input, or to Error on the same input (or on 1 1). Error just
stays in Error regardless of the input. The state machine seems almost
J-friendly, since it can be represented as a matrix with rows for states and
columns for inputs, yielding a new state. I see that J *has* a Sequential
Machine (dyadic :;) operator. Why isn't that the right solution to the
alternating 1's problem?
That could indeed be a solution, but using sequential machine this way
is clumsy, because the only result format that gives you access to
state numbers is trace (function code 5) which gives you a result with
one row for every element of your sequence of data and six columns.
So here's an implementation without using ;:
That said, I have trouble "seeing" the whole state machine when I work
with names. I'm not even sure how to express it properly. Instead,
here's the raw state machine:
transitions=: 0 1 2 3,1 3 1 3,2 2 3 3,:3 3 3 3
That transition matrix looks like this:
0 1 2 3
1 3 1 3
2 2 3 3
3 3 3 3
Or, if you want it compressed: 4#.inv 27 119 175 255
Here, each column is indexed by the current state, and the row is
indexed by values from #.@|: of the alternating binary pairs.
The initial state is zero, the next two columns mean that we validly
hit the 01 or 10 combination, and the final column means that we
encountered an invalid datum.
Cutting to the finish:
transitions=: 0 1 2 3,1 3 1 3,2 2 3 3,:3 3 3 3
validalt=: 3 -.@e.&> [: (>@[ <@, {&transitions@])/ ] ,&0&.>@{:@]`_1:`]
,~ 0:} <"0@#.@|:
Breaking this down,
Here's an example (bad) sequence:
0 1 0 0 1 0
1 0 1 1 0 0
Represented in J, this might be
0 1 0 0 1 0,:1 0 1 1 0 0
Expressing as index numbers, I get
#.|:0 1 0 0 1 0,:1 0 1 1 0 0
1 2 1 1 2 0
But to use our state machine we also need to represent "state". And,
the initial state is 0. And, I am planning on using J's insert (/) to
reduce this to the final state. So:
(] ,&0&.>@{:@]`_1:`]} <"0@#.@|: ;~ 0:)0 1 0 0 1 0,:1 0 1 1 0 0
┌─┬─┬─┬─┬─┬─┬───┐
│0│1│2│1│1│2│0 0│
└─┴─┴─┴─┴─┴─┴───┘
The } expression here gets a left argument (which it ignores) which
means that the 3 gerunds give: value to substitute, index to
substitute at, and array to merge the values into.
verb for value to substitute is ,&0&.>@{:@]
verb for index to substitute is _1
verb for array to merge value into is ]
In other words: all this does is stick an extra 0 into the last box...
Here, the last box has our initial state as its first value (remember
that state corresponds to column in the transition matrix) and the
value to be processed is second (so that will pick the column).
As an aside, note that you can inspect intermediate results for an
insert operation by performing the inserts manually. For example:
(<2) (>@[ <@, {&transitions@]) <0 0
┌───┐
│2 0│
└───┘
(<1) (>@[ <@, {&transitions@]) (<2) (>@[ <@, {&transitions@]) <0 0
┌───┐
│1 2│
└───┘
Or, once you think you have the basic logic sorted out, you can use \.
to show all intermediate results:
(>@[ <@, {&transitions@])/\. (] ,&0&.>@{:@]`_1:`]} <"0@#.@|: ;~
0:)0 1 0 0 1 0,:1 0 1 1 0 0
┌───┬───┬───┬───┬───┬───┬───┐
│0 3│1 3│2 3│1 1│1 2│2 0│0 0│
└───┴───┴───┴───┴───┴───┴───┘
(watch out for email induced line wrap -- all of the J expressions
should be single lines)
By the way, I added an extra boxed zero on the left end of the values
to be processed because (>@[ <@, {&transitions@]) takes the left
argument and prepares it for processing state in the next operation.
So by sticking an extra value on the left (which I can ignore) ensures
that the leftmost value in the original sequence of numbers is
included in the state.
Anyways, without \. this gives us a final result of <0 3:
(>@[ <@, {&transitions@])/ (] ,&0&.>@{:@]`_1:`]} <"0@#.@|: ;~ 0:)0
1 0 0 1 0,:1 0 1 1 0 0
┌───┐
│0 3│
└───┘
And the leftmost part of the verb interprets a box containing 3 as
invalid and any other box as valid.
I hope this makes sense -- state machines are so simple they are confusing.
FYI,
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm