I believe to have created a solution for this problem.

Extending the problem to one to which "cut" was a solution, I came to the
following specs.

- Let DATA be a string of characters, terminated by '+'. 
- Consider substrings T of DATA which do not contain '+' and which are
succeeded by '+' and, apart from the first substring, preceded by '+'.
Substrings may be empty. So DATA = T0'+'T1'+'....'+'Tn'+' for some Ti.
- A subsubstring S of T does not contain ':' and is eventually succeeded and
eventually preceded by ':'. Subsubstrings may be empty too. 
Notice that if a substring T does not contain ':', then S=T. And notice also
that the number of subsubstrings is one more than the number of ':'s in the
substring.

Question: 
        - determine for each substring T the first 4 subsubstrings
            example: 'a:bb:ccc:dddd:X:YY+' ====> 'a' 'bb' 'ccc' 'dddd'
        - if a subsubstring is empty, denote this by giving the succeeding
':'
            example: ':bb::dddd:X:YY+' ====> ':' 'bb' ':' 'dddd'
            example: 'a::::X:YY+' ====> 'a' ':' ':' ':'
        - if the 4th subsubstring is empty and there is no succeeding ':',
give the succeeding '+'
            example: 'a:bb:ccc:+' ====> a' 'bb' 'ccc' '+'
        - if there are less then 3 occurrences of ':', give (all) the
subsubstrings (or the succeeding ':' if one is empty) and the succeeding'+'
(to denote the degeneracy of the substring)
            example: 'a:bb:ccc+' ====> 'a' 'bb' 'ccc' '+'
            example: 'a::ccc+' ====> 'a' ':' 'ccc' '+'
        - if a substring T is empty, it is handled as the former case, so
only '+' is given
            example: '++' ====> '+' '+'

Solution: Sequential machine is used.

The list of states, together with the actions to be performed, if
independent from the input:
+--+----------------------------+-----------+-----+-+
|   state                       |    next action    |
+--+----------------------------+-----------+-----+-+
|0 |initial, regular '+'        |none       |j=:i |1|
+--+----------------------------+-----------+-----+-+
|1 |non empty T0                |           |     | |
+--+----------------------------+-----------+-----+-+
|2 |non empty T1                |           |     | |
+--+----------------------------+-----------+-----+-+
|3 |non empty T2                |           |     | |
+--+----------------------------+-----------+-----+-+
|4 |non empty T3                |           |     | |
+--+----------------------------+-----------+-----+-+
|5 |from 4th ':' untill next '+'|none       |none |0|
+--+----------------------------+-----------+-----+-+
|6 |regular 1ste ':'            |none       |j=:i |1|
+--+----------------------------+-----------+-----+-+
|7 |regular 2nd ':'             |none       |j=:i |1|
+--+----------------------------+-----------+-----+-+
|8 |regular 3rd ':'             |none       |j=:i |1|
+--+----------------------------+-----------+-----+-+
|9 |premature '+'               |add        |j=:i |2|
+--+----------------------------+-----------+-----+-+
|10|premature 1ste ':'          |add        |j=:i |2|
+--+----------------------------+-----------+-----+-+
|11|premature 2nd ':'           |add        |j=:i |2|
+--+----------------------------+-----------+-----+-+
|12|premature 3rd ':'           |add        |j=:i |2|
+--+----------------------------+-----------+-----+-+
|13|premature 4th ':'           |add        |j=:_1|3|
+--+----------------------------+-----------+-----+-+


The solution is given by the input mapping
        m=:'+:'i.a.
and the state transition table
        S=: _2]\"1 }.".;._2 (0 : 0) 
'+        :      Z ']0
9       1       10      1       1       1
9       2       6       3       1       0
9       2       7       3       2       0
9       2       8       3       3       0
0       3       5       3       4       0
0       0       5       0       5       0
9       1       11      1       2       1
9       1       12      1       3       1
9       1       13      1       4       1
9       2       10      2       1       2
9       2       11      2       2       2
9       2       12      2       3       2
9       2       13      2       4       2
0       3       5       3       5       3
)


If the examples above are concatenated then you  get
 
(0;S;m);:'a:bb:ccc:dddd:X:YY+:bb::dddd:X:YY+a::::X:YY+a:bb:ccc:+a:bb:ccc+a::
ccc+++'
+-+--+---+----+-+--+-+----+-+-+-+-+-+--+---+-+-+--+---+-+-+-+---+-+-+-+
|a|bb|ccc|dddd|:|bb|:|dddd|a|:|:|:|a|bb|ccc|+|a|bb|ccc|+|a|:|ccc|+|+|+|
+-+--+---+----+-+--+-+----+-+-+-+-+-+--+---+-+-+--+---+-+-+-+---+-+-+-+

which is the expected output, as is

        
,cut'a:bb:ccc:dddd:X:YY+:bb::dddd:X:YY+a::::X:YY+a:bb:ccc:+a:bb:ccc+a::ccc++
+'
+-+--+---+----++--++----+-++++-+--+---++-+--+---++-++---++++++++++++++
|a|bb|ccc|dddd||bb||dddd|a||||a|bb|ccc||a|bb|ccc||a||ccc||||||||||||||
+-+--+---+----++--++----+-++++-+--+---++-+--+---++-++---++++++++++++++


Performance figures are 
        et=:6!:2        NB. execution time

   5 et'cut"1 data'
4.0134945
   5 et'(0;S;m)&;:"1 data'
1.2608925
   5 et'(1;S;m)&;:"1 data'      NB. most efficient
0.12076429
   5 et'(2;S;m)&;:"1 data'
0.23790562
   5 et'(5;S;m)&;:"1 data'
0.23951592


   5 et'cut ,data'
4.4002232
   5 et'(0;S;m)&;: ,data'
0.83354572
   5 et'(1;S;m)&;: ,data'       NB. most efficient
0.097741519
   5 et'(2;S;m)&;: ,data'
0.11759514
   5 et'(5;S;m)&;: ,data'
0.20124527


R.E. Boss



-----Oorspronkelijk bericht-----
Van: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Namens david alis
Verzonden: dinsdag 23 januari 2007 8:08
Aan: Programming forum
Onderwerp: [Jprogramming] Re: Programming sequential machine/finite state
machine

The strings are delivered within messages.
In practice typical lengths of a string will be between 1k and 2kb.
The average number of such strings within a message may be around
20,000 - (0.1% cases > 1million)

The following demonstrates the advantage of ;: over ;.

data=: 5000 1000$'+4.31:A+-:H+-:H+4.30:A+4.32:A+4.25:A+4.25'

cut =: 4&{.@(<;._1)@(':'&,);._1@('+'&,)

sm =: (2;sc;pp)&;:  NB. 2 means return array of offets and indexes
(instead of boxes).

then
6!:2'cut"1 data'
5.75279

6!:2'sm"1 data'
0.243648

Such a pity that I cant use  ;: .

As before
pp=: '+:' i. a.
sc=: 8 3 2 $ ". (0 : 0) -. CRLF
 3 1 0 0 0 0
 1 2 5 1 3 0
 1 2 5 2 4 2
 2 1 5 1 4 0
 2 1 5 1 4 0
 1 2 7 2 6 1
 1 2 7 2 6 0
 1 1 7 0 7 0
)

Thanks again....
David


> Raul Miller wrote: <Mon Jan 22 21:32:16 HKT 2007>
> ....

> In "real life", are you processing much larger chunks
> of data?

> R.E. Boss wrote: <Mon Jan 22 19:44:32 HKT 2007>
> ....
>> Using ;:  is at least 10 times quicker than using cut (;.) - because
>On what evidence is this statement based?
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to