When I say "parsing", I usually mean breaking a string into its logical
tokens (which is technically lexing). Is that what you want? Or do you mean
parsing in the technical sense, e.g. building an abstract syntax tree?
If you just mean the former, then for simple rhematics you can use ;. (cut).
For example, take a look at the standard csv parser:
   load'csv'
   chopcsv
3 : 0
dat=. y,','
b=. dat e. ','
c=. ~:/\dat='"'
msk=. b>c
if. 0=+/msk do. msk=. (#msk){.1 end.
dat=. msk <;._2 dat
b=. '"'={.@(1&{.) &> dat
dat=. b }.each dat
b=. '"'={.@(_1&{.) &> dat
dat=. (-b) }.each dat
)
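To see the core idea of chopcsv in isolation, here is a minimal sketch of the masked cut it relies on. The sample string is made up for illustration; the names b and c simply mirror the ones in the definition above:

```j
dat=: 'a,"b,c",d' , ','     NB. made-up sample line, with a trailing comma appended as the final fret
b=: dat e. ','              NB. 1 at every comma
c=: ~:/\ dat = '"'          NB. running parity of quotes: 1 while inside a quoted field
fields=: (b > c) <;._2 dat  NB. cut only at commas that are outside quotes
fields                      NB. three boxes:  a   "b,c"   d
```

The comma inside the quoted field is not treated as a separator, because b>c is 0 wherever c is 1. The remaining lines of chopcsv then just strip the leading and trailing quote from each box.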
For more complex structures, you can use a finite state machine (FSM), via
dyadic ;: . You can find an interactive introduction to this in the lab
"Sequential Machines". A good starter example is on the vocabulary page
for ;: itself, which provides the specification for monad ;: in terms of
dyad ;: . That is, it shows you how J's rhematics are defined (how to lex
J sentences):
mj=: 256$0                     NB. X other
mj=: 1 (9,a.i.' ')}mj          NB. S space and tab
mj=: 2 ((a.i.'Aa')+/i.26)}mj   NB. A A-Z a-z excluding N B
mj=: 3 (a.i.'N')}mj            NB. N the letter N
mj=: 4 (a.i.'B')}mj            NB. B the letter B
mj=: 5 (a.i.'0123456789_')}mj  NB. 9 digits and _
mj=: 6 (a.i.'.')}mj            NB. D .
mj=: 7 (a.i.':')}mj            NB. C :
mj=: 8 (a.i.'''')}mj           NB. Q quote
sj=: _2]\"1 }.".;._2 (0 : 0)
' X    S    A    N    B    9    D    C    Q ']0
 1 1  0 0  2 1  3 1  2 1  6 1  1 1  1 1  7 1  NB. 0 space
 1 2  0 3  2 2  3 2  2 2  6 2  1 0  1 0  7 2  NB. 1 other
 1 2  0 3  2 0  2 0  2 0  2 0  1 0  1 0  7 2  NB. 2 alp/num
 1 2  0 3  2 0  2 0  4 0  2 0  1 0  1 0  7 2  NB. 3 N
 1 2  0 3  2 0  2 0  2 0  2 0  5 0  1 0  7 2  NB. 4 NB
 9 0  9 0  9 0  9 0  9 0  9 0  1 0  1 0  9 0  NB. 5 NB.
 1 4  0 5  6 0  6 0  6 0  6 0  6 0  1 0  7 4  NB. 6 num
 7 0  7 0  7 0  7 0  7 0  7 0  7 0  7 0  8 0  NB. 7 '
 1 2  0 3  2 2  3 2  2 2  6 2  1 2  1 2  7 0  NB. 8 ''
 9 0  9 0  9 0  9 0  9 0  9 0  9 0  9 0  9 0  NB. 9 comment
)
   x=: 0;sj;mj
   y=: 'sum=. (i.3 4)+/ .*0j4+pru 4'
   x ;: y
+---+--+-+--+---+-+-+-+-+-+---+-+---+-+
|sum|=.|(|i.|3 4|)|+|/|.|*|0j4|+|pru|4|
+---+--+-+--+---+-+-+-+-+-+---+-+---+-+
   (x ;: y) -: ;: y
1
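The J lexer is a lot to digest as a first machine, so here is a much smaller sketch in the same style: a two-state machine that splits a string into blank-delimited words. The names mw, sw and words are hypothetical, and the table is my own, not from the vocabulary page. As in the table above, each entry is a pair (next state, output code), where code 1 marks the start of a word and code 3 emits the word and resets the start.

```j
mw=: 1 (a.i.' ')} 256$0           NB. character classes: 0 other, 1 space
NB. state 0 = between words, state 1 = inside a word
NB.              other  space
sw=: 2 2 2 $     1 1    0 0   1 0    0 3
words=: (0;sw;mw)&;:              NB. function code 0: box each emitted word
words 'one  two three'
```

If I have the table right, this boxes the three words, the same result as monadic ;: gives for this particular string. The last word is emitted at end of input, just as the final 4 is in the example above.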
Devon posted another good example of this yesterday:
http://www.jsoftware.com/jwiki/Scripts/JavascriptCruncher
And I think Raul is responsible for the example in the HTTP parser:
http://www.jsoftware.com/jwiki/JWebServer/HttpParser
Unfortunately, FSM is almost a language unto itself, and hard to get right.
It's fast, but it's very low level. Oleg has enriched the community by
providing a useful frontend to help build and debug FSMs:
http://olegykj.sourceforge.net/scrshots/graphviz.html
And, if you're familiar with regexen (which have a well-designed interface),
he also has a lexer parameterized by them (which actually employs ;. not
;: ):
http://www.jsoftware.com/jwiki/Essays/Regex_Lexer
But if you meant parsing in the technical sense, and want to build a
grammar, you'll have a more difficult time finding examples and direction.
I once posted a toy "interpreter" for an ad-hoc language:
http://www.jsoftware.com/pipermail/programming/2007-January/004756.html
Beyond such trivial models, I'm not aware of any examples. Since a lot of
parsing is based on ASTs, though, maybe an introduction to efficient tree
handling in J would help. In general, J doesn't make it fast or easy to
process trees, but you might look at the lab "Huffman Coding" or Roger's
essay at:
http://www.jsoftware.com/jwiki/Essays/Huffman_Coding
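As a starting point for the tree side of things, here is a minimal sketch of an AST held as nested boxes, with a recursive evaluator. The node layout (op ; left ; right) and the names ast and eval are assumptions made up for illustration, not an established convention, and this is the simple approach rather than an efficient one:

```j
NB. hypothetical node layout: op ; left ; right, with plain numbers as leaves
ast=: '+' ; 1 ; < '*' ; 2 ; 3      NB. represents 1 + 2 * 3

eval=: 3 : 0
if. 0 = L. y do. y return. end.    NB. unboxed, so it is a leaf
'op l r'=. y                       NB. unpack the three parts of a node
select. op
case. '+' do. (eval l) + eval r
case. '*' do. (eval l) * eval r
end.
)

eval ast                           NB. 7
```

L. (level) reports the depth of boxing, which is what distinguishes a leaf from a node here.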
I hope this helps. If not, maybe you could post a simple example of the
problem you're trying to solve?
-Dan