I believe to have created a solution for this problem.
Extending the problem to one to which "cut" was a solution, I came to the
following specs.
- Let DATA be a string of characters, terminated by '+'.
- Consider substrings T of DATA which do not contain '+' and which are
succeeded by '+' and, apart from the first substring, preceded by '+'.
Substrings may be empty. So DATA = T0'+'T1'+'....'+'Tn'+' for some Ti.
- A subsubstring S of T does not contain ':' and is eventually succeeded and
eventually preceded by ':'. Subsubstrings may be empty too.
Notice that if a substring T does not contain ':', then S=T. And notice also
that the number of subsubstrings is one more than the number of ':'s in the
substring.
Question:
- determine for each substring T the first 4 subsubstrings
example: 'a:bb:ccc:dddd:X:YY+' ====> 'a' 'bb' 'ccc' 'dddd'
- if a subsubstring is empty, denote this by giving the succeeding
':'
example: ':bb::dddd:X:YY+' ====> ':' 'bb' ':' 'dddd'
example: 'a::::X:YY+' ====> 'a' ':' ':' ':'
- if the 4th subsubstring is empty and there is no succeeding ':',
give the succeeding '+'
example: 'a:bb:ccc:+' ====> a' 'bb' 'ccc' '+'
- if there are less then 3 occurrences of ':', give (all) the
subsubstrings (or the succeeding ':' if one is empty) and the succeeding'+'
(to denote the degeneracy of the substring)
example: 'a:bb:ccc+' ====> 'a' 'bb' 'ccc' '+'
example: 'a::ccc+' ====> 'a' ':' 'ccc' '+'
- if a substring T is empty, it is handled as the former case, so
only '+' is given
example: '++' ====> '+' '+'
Solution: Sequential machine is used.
The list of states, together with the actions to be performed, if
independent from the input:
+--+----------------------------+-----------+-----+-+
| state | next action |
+--+----------------------------+-----------+-----+-+
|0 |initial, regular '+' |none |j=:i |1|
+--+----------------------------+-----------+-----+-+
|1 |non empty T0 | | | |
+--+----------------------------+-----------+-----+-+
|2 |non empty T1 | | | |
+--+----------------------------+-----------+-----+-+
|3 |non empty T2 | | | |
+--+----------------------------+-----------+-----+-+
|4 |non empty T3 | | | |
+--+----------------------------+-----------+-----+-+
|5 |from 4th ':' untill next '+'|none |none |0|
+--+----------------------------+-----------+-----+-+
|6 |regular 1ste ':' |none |j=:i |1|
+--+----------------------------+-----------+-----+-+
|7 |regular 2nd ':' |none |j=:i |1|
+--+----------------------------+-----------+-----+-+
|8 |regular 3rd ':' |none |j=:i |1|
+--+----------------------------+-----------+-----+-+
|9 |premature '+' |add |j=:i |2|
+--+----------------------------+-----------+-----+-+
|10|premature 1ste ':' |add |j=:i |2|
+--+----------------------------+-----------+-----+-+
|11|premature 2nd ':' |add |j=:i |2|
+--+----------------------------+-----------+-----+-+
|12|premature 3rd ':' |add |j=:i |2|
+--+----------------------------+-----------+-----+-+
|13|premature 4th ':' |add |j=:_1|3|
+--+----------------------------+-----------+-----+-+
The solution is given by the input mapping
m=:'+:'i.a.
and the state transition table
S=: _2]\"1 }.".;._2 (0 : 0)
'+ : Z ']0
9 1 10 1 1 1
9 2 6 3 1 0
9 2 7 3 2 0
9 2 8 3 3 0
0 3 5 3 4 0
0 0 5 0 5 0
9 1 11 1 2 1
9 1 12 1 3 1
9 1 13 1 4 1
9 2 10 2 1 2
9 2 11 2 2 2
9 2 12 2 3 2
9 2 13 2 4 2
0 3 5 3 5 3
)
If the examples above are concatenated then you get
(0;S;m);:'a:bb:ccc:dddd:X:YY+:bb::dddd:X:YY+a::::X:YY+a:bb:ccc:+a:bb:ccc+a::
ccc+++'
+-+--+---+----+-+--+-+----+-+-+-+-+-+--+---+-+-+--+---+-+-+-+---+-+-+-+
|a|bb|ccc|dddd|:|bb|:|dddd|a|:|:|:|a|bb|ccc|+|a|bb|ccc|+|a|:|ccc|+|+|+|
+-+--+---+----+-+--+-+----+-+-+-+-+-+--+---+-+-+--+---+-+-+-+---+-+-+-+
which is the expected output, as is
,cut'a:bb:ccc:dddd:X:YY+:bb::dddd:X:YY+a::::X:YY+a:bb:ccc:+a:bb:ccc+a::ccc++
+'
+-+--+---+----++--++----+-++++-+--+---++-+--+---++-++---++++++++++++++
|a|bb|ccc|dddd||bb||dddd|a||||a|bb|ccc||a|bb|ccc||a||ccc||||||||||||||
+-+--+---+----++--++----+-++++-+--+---++-+--+---++-++---++++++++++++++
Performance figures are
et=:6!:2 NB. execution time
5 et'cut"1 data'
4.0134945
5 et'(0;S;m)&;:"1 data'
1.2608925
5 et'(1;S;m)&;:"1 data' NB. most efficient
0.12076429
5 et'(2;S;m)&;:"1 data'
0.23790562
5 et'(5;S;m)&;:"1 data'
0.23951592
5 et'cut ,data'
4.4002232
5 et'(0;S;m)&;: ,data'
0.83354572
5 et'(1;S;m)&;: ,data' NB. most efficient
0.097741519
5 et'(2;S;m)&;: ,data'
0.11759514
5 et'(5;S;m)&;: ,data'
0.20124527
R.E. Boss
-----Oorspronkelijk bericht-----
Van: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Namens david alis
Verzonden: dinsdag 23 januari 2007 8:08
Aan: Programming forum
Onderwerp: [Jprogramming] Re: Programming sequential machine/finite state
machine
The strings are delivered within messages.
In practice typical lengths of a string will be between 1k and 2kb.
The average number of such strings within a message may be around
20,000 - (0.1% cases > 1million)
The following demonstrates the advantage of ;: over ;.
data=: 5000 1000$'+4.31:A+-:H+-:H+4.30:A+4.32:A+4.25:A+4.25'
cut =: 4&{.@(<;._1)@(':'&,);._1@('+'&,)
sm =: (2;sc;pp)&;: NB. 2 means return array of offets and indexes
(instead of boxes).
then
6!:2'cut"1 data'
5.75279
6!:2'sm"1 data'
0.243648
Such a pity that I cant use ;: .
As before
pp=: '+:' i. a.
sc=: 8 3 2 $ ". (0 : 0) -. CRLF
3 1 0 0 0 0
1 2 5 1 3 0
1 2 5 2 4 2
2 1 5 1 4 0
2 1 5 1 4 0
1 2 7 2 6 1
1 2 7 2 6 0
1 1 7 0 7 0
)
Thanks again....
David
> Raul Miller wrote: <Mon Jan 22 21:32:16 HKT 2007>
> ....
> In "real life", are you processing much larger chunks
> of data?
> R.E. Boss wrote: <Mon Jan 22 19:44:32 HKT 2007>
> ....
>> Using ;: is at least 10 times quicker than using cut (;.) - because
>On what evidence is this statement based?
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm