[Pharo-users] HTML5 Parser

Mohammad Al Houssami (Alumni) Mon, 11 Mar 2013 14:16:31 -0700

Hello again,

I'm building an HTML5 parser in Smalltalk.
I'm building it according to the pseudocode provided by WHATWG, mainly these:
Tokenization: 
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html
Tree Construction: 
http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html


I'm still in the tokenization phase.
The spec defines a state machine. It describes what has to be done when we 
reach a certain state.
My approach to building the tokenizer is to represent each state as a method.
Each method performs some operations (calling other methods to represent a 
state change, returning tokens, etc.).
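
To make the idea concrete, here is a minimal sketch of the state-per-method 
design. It is written in Python rather than Smalltalk purely for illustration, 
and it is not the spec's full algorithm: it covers only four states (data, tag 
open, end tag open, tag name), and the class name Tokenizer, the method names, 
and the (kind, value) token tuples are my own assumptions. Each state method 
consumes one character and returns the method for the next state.

```python
# Hypothetical, heavily simplified sketch: four of the spec's tokenizer
# states only; no attributes, comments, or character references.
class Tokenizer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.tokens = []            # emitted tokens as (kind, value) tuples
        self.buffer = ""            # name of the tag currently being built
        self.is_end_tag = False
        self.state = self.data_state

    def next_char(self):
        """Consume and return the next input character, or None at EOF."""
        if self.pos >= len(self.text):
            return None
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    @staticmethod
    def is_ascii_letter(ch):
        return ch is not None and ch.isascii() and ch.isalpha()

    def run(self):
        # Each state method handles one character and returns the next
        # state method, or None once the EOF token has been emitted.
        while self.state is not None:
            self.state = self.state()
        return self.tokens

    def data_state(self):
        ch = self.next_char()
        if ch is None:
            self.tokens.append(("eof", None))
            return None
        if ch == "<":
            return self.tag_open_state
        self.tokens.append(("character", ch))
        return self.data_state

    def tag_open_state(self):
        ch = self.next_char()
        if ch == "/":
            return self.end_tag_open_state
        if self.is_ascii_letter(ch):
            self.buffer = ch.lower()
            self.is_end_tag = False
            return self.tag_name_state
        # Not a tag after all: emit "<" as text, reprocess ch in data state.
        self.tokens.append(("character", "<"))
        if ch is not None:
            self.pos -= 1
        return self.data_state

    def end_tag_open_state(self):
        ch = self.next_char()
        if self.is_ascii_letter(ch):
            self.buffer = ch.lower()
            self.is_end_tag = True
            return self.tag_name_state
        # Simplification: the spec distinguishes more cases here.
        self.tokens.append(("character", "<"))
        self.tokens.append(("character", "/"))
        if ch is not None:
            self.pos -= 1
        return self.data_state

    def tag_name_state(self):
        ch = self.next_char()
        if ch is None:
            # Simplification: an unfinished tag at EOF is dropped.
            return self.data_state
        if ch == ">":
            kind = "end_tag" if self.is_end_tag else "start_tag"
            self.tokens.append((kind, self.buffer))
            return self.data_state
        self.buffer += ch.lower()
        return self.tag_name_state


tokens = Tokenizer("<p>hi</p>").run()
```

In Smalltalk the same shape could probably be had by storing the next state as 
a selector and dispatching with perform:, though I have not tried that yet.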

I will be translating the provided pseudocode as is into Smalltalk.
I am not sure whether this is the best approach, especially since I am 
still new to Smalltalk.

I was told by Stephane Ducasse to use PetitParser.
After a quick read, I noticed that it is used when a grammar is available. In 
my case I don't have a grammar but the pseudocode of a parser. Does anyone 
have suggestions for such a project, or comments on the approach I am aiming 
to follow?

Thanks in advance,
Mohammad
