Actually, I said something in my previous post which was wrong: Function code 3 for sequential machine gives state table index.
(This is not useful for a lot of the text processing I would like to do (where I want the state table index for each character being processed) but it's fine for this application.) If you use a transpose of that transition table, you can probably make a sequential machine work just fine here. If you run into any problems, let us know and we'll see if we can't find some solutions. Thanks, -- Raul On Mon, Jul 30, 2012 at 5:44 PM, Peter B. Kessler <[email protected]> wrote: > Thank you, thank you, thank you! I'll digest your mail, spend a few more > productive hours with the Dictionary, and send comments, but this looks like > the kind of explanation I've been wanting. > > ... peter > > > Raul Miller wrote: >> >> On Mon, Jul 30, 2012 at 4:12 PM, Peter B. Kessler >> <[email protected]> wrote: >>> >>> In the "alternating binary array" case, it seems, to me, like what's >>> called >>> for is a small finite state machine (or regular expression parser, if >>> that's >>> how you think). Start in a neutral state, and move to a new state on >>> reading a column of the input. I think there are only 4 states: >>> TooSoonToTell, MostRecently01, MostRecently10, and Error. TooSoonToTell >>> just moves to one of the MostRecently states based on the input (or to >>> Error >>> on 1 1). The MostRecently states transition to the other MostRecently on >>> the "other" input, or to Error on the same input (or on 1 1). Error just >>> stays in Error regardless of the input. The state machine seems almost >>> J-friendly, since it can be represented as a matrix with rows for states >>> and >>> columns for inputs, yielding a new state. I see that J *has* a >>> Sequential >>> Machine (dyadic :;) operator. Why isn't that the right solution to the >>> alternating 1's problem? >> >> >> That could indeed be a solution, but using sequential machine this way >> is clumsy, because the only result format that gives you access to >> state numbers is trace (function code 5) which gives you a result with >> one row for every element of your sequence of data and six columns. >> >> So here's an implementation without using ;: >> >> That said, I have trouble "seeing" the whole state machine when I work >> with names. I'm not even sure how to express it properly. Instead, >> here's the raw state machine: >> >> transitions=: 0 1 2 3,1 3 1 3,2 2 3 3,:3 3 3 3 >> >> That transition matrix looks like this: >> >> 0 1 2 3 >> 1 3 1 3 >> 2 2 3 3 >> 3 3 3 3 >> >> Or, if you want it compressed: 4#.inv 27 119 175 255 >> >> Here, each column is indexed by the current state, and the row is >> indexed by values from #.@|: of the alternating binary pairs. >> >> The initial state is zero, the next two columns mean that we validly >> hit the 01 or 10 combination, and the final column means that we >> encountered an invalid datum. >> >> Cutting to the finish: >> >> transitions=: 0 1 2 3,1 3 1 3,2 2 3 3,:3 3 3 3 >> >> validalt=: 3 -.@e.&> [: (>@[ <@, {&transitions@])/ ] ,&0&.>@{:@]`_1:`] >> ,~ 0:} <"0@#.@|: >> >> Breaking this down, >> >> Here's an example (bad) sequence: >> >> 0 1 0 0 1 0 >> 1 0 1 1 0 0 >> >> Represented in J, this might be >> >> 0 1 0 0 1 0,:1 0 1 1 0 0 >> >> Expressing as index numbers, I get >> >> #.|:0 1 0 0 1 0,:1 0 1 1 0 0 >> 1 2 1 1 2 0 >> >> But to use our state machine we also need to represent "state". And, >> the initial state is 0. And, I am planning on using J's insert (/) to >> reduce this to the final state. So: >> >> (] ,&0&.>@{:@]`_1:`]} <"0@#.@|: ;~ 0:)0 1 0 0 1 0,:1 0 1 1 0 0 >> ┌─┬─┬─┬─┬─┬─┬───┐ >> │0│1│2│1│1│2│0 0│ >> └─┴─┴─┴─┴─┴─┴───┘ >> >> The } expression here gets a left argument (which it ignores) which >> means that the 3 gerunds give: value to substitute, index to >> substitute at, and array to merge the values into. >> >> verb for value to substitute is ,&0&.>@{:@] >> verb for index to substitute is _1 >> verb for array to merge value into is ] >> >> In other words: all this does is stick an extra 0 into the last box... >> >> Here, the last box has our initial state as its first value (remember >> that state corresponds to column in the transition matrix) and the >> value to be processed is second (so that will pick the column). >> >> As an aside, note that you can inspect intermediate results for an >> insert operation by performing the inserts manually. For example: >> >> (<2) (>@[ <@, {&transitions@]) <0 0 >> ┌───┐ >> │2 0│ >> └───┘ >> (<1) (>@[ <@, {&transitions@]) (<2) (>@[ <@, {&transitions@]) <0 0 >> ┌───┐ >> │1 2│ >> └───┘ >> >> Or, once you think you have the basic logic sorted out, you can use \. >> to show all intermediate results: >> >> (>@[ <@, {&transitions@])/\. (] ,&0&.>@{:@]`_1:`]} <"0@#.@|: ;~ >> 0:)0 1 0 0 1 0,:1 0 1 1 0 0 >> ┌───┬───┬───┬───┬───┬───┬───┐ >> │0 3│1 3│2 3│1 1│1 2│2 0│0 0│ >> └───┴───┴───┴───┴───┴───┴───┘ >> >> (watch out for email induced line wrap -- all of the J expressions >> should be single lines) >> >> By the way, I added an extra boxed zero on the left end of the values >> to be processed because (>@[ <@, {&transitions@]) takes the left >> argument and prepares it for processing state in the next operation. >> So by sticking an extra value on the left (which I can ignore) ensures >> that the leftmost value in the original sequence of numbers is >> included in the state. >> >> Anyways, without \. this gives us a final result of <0 3: >> >> (>@[ <@, {&transitions@])/ (] ,&0&.>@{:@]`_1:`]} <"0@#.@|: ;~ 0:)0 >> 1 0 0 1 0,:1 0 1 1 0 0 >> ┌───┐ >> │0 3│ >> └───┘ >> >> And the leftmost part of the verb interprets a box containing 3 as >> invalid and any other box as valid. >> >> I hope this makes sense -- state machines are so simple they are >> confusing. >> >> FYI, >> > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
