Hello Raul,
The following is my understanding of the state machine.
I used your states verb from [1]. With it, I get:
manc =: <"0'<>a '
getAnc=: 0;(states'');<manc
NB. < > a space other more # got
4.1 0 0 0 0 0 NB. initial
4.1 2 1 1 1 1 NB. will accept
4.2 0.3 0.3 0.3 0.3 0.3 NB. accept
4.1 3 3 3 3 3 NB. ignore
4.1 3 5 3 3 0
4.1 3 3 1 3 0
)
(st1 'a ') -: getAnc
1
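For orientation, here is a minimal sketch of how entries such as 4.1 or
0.3 can be read. It assumes the table is decoded the way the last line
of st1 (quoted below) does it, with 0 10 #: 10 * rows. Each entry then
splits into an integer part and a tenths digit; in the state table of
dyadic ;: these two numbers are the next state and the output code. The
sample values are taken from the table above.

```j
   NB. split each entry into (integer part, tenths digit)
   0 10 #: 10 * 4.1 0.3 3
4 1
0 3
3 0
```

Read this way, 4.1 means "go to state 4 with output code 1", and a bare
3 means "go to state 3 with output code 0".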
This led me to the following questions:
Q1: Why are there 6 columns when the input options are only 5?
Q2: How does the states verb work?
Q3: How does the st1 verb create the other rows of the state machine?
The last two lines of the st1 verb are quite involved.
Yes, I agree that specific state machines for specific phrase searches are
definitely better.
Thanks and Regards,
Yuva
[1] http://www.jsoftware.com/jwiki/JWebServer/HttpParser
On 6/16/07, Raul Miller <[EMAIL PROTECTED]> wrote:
Here's some code which extracts named element tags from an
XML string:
st1=:3 :0
cols=.<"0'<>',~.y
n=.1+#cols
rows=. ,:n {.4.1 NB. start
rows=.rows, 4.1 2,}.n#1 NB. will accept
rows=.rows, 4.2,n#0.3 NB. accept
rows=.rows, 4.1,n#3 NB. ignore
rows=.rows, 4.1,.3,.3,.~(3+(_2,~[:}:[EMAIL PROTECTED]) * (=/~.))y
0;(0 10#:10*rows);<cols
)
selel=: [EMAIL PROTECTED] }:@;: '<',~]
Example use:
'a ' selel i
where i is some XML or HTML text.
Note that this returns some tags which seltag did not
recognize. This seems to have to do with the way st was
designed, but I've not examined this issue very closely.
Note also that I recommend explicitly including a space after
the element name. Perhaps this should instead be incorporated
into the body of the definition of st1. Either way, the space is
important: it prevents, for example, <applet or <abbr from being
treated as <a.
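The prefix problem can be seen with plain string comparison, independent
of the state machine. This is only an illustrative sketch: the sample
tags are made up, and it uses nothing beyond {. (take) and -: (match).

```j
   '<a'  -: 2 {. '<applet code="x">'   NB. without the space, <applet looks like <a
1
   '<a ' -: 3 {. '<applet code="x">'   NB. with the space, it no longer matches
0
   '<a ' -: 3 {. '<a href="x">'        NB. a real <a tag still matches
1
```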
Note that this technique only works for element names, not for
attributes. However, unless you are looking for the same
attribute on different elements, you can still speed things up
by first restricting the data you are searching to the elements
of interest.
FYI,
--
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm