Hey all,
I've uploaded a working prototype of Geriaoueg.
In short, these are the main features implemented so far:
1. Thorough URL redirection using cURL (the PHP library): This includes 
forwarding POST and GET forms as well, so the tool won't break during searches 
etc.
2. HTML DOM based parsing: apertium-deshtml is not used at all. Instead, the 
text blocks are extracted using 'PHP simple HTML DOM' and then fed to lt-proc 
(after inserting the appropriate escape characters etc.).
3. CSS hover-boxes: This is ported over from the current Geriaoueg 
implementation, with some minor changes. It works fine and avoids the use of 
JavaScript.
4. Also, scripts for automatic generation and updating of analysers and 
wordlists were created, but they still need work.
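
To illustrate the escaping step in point 2, here is a minimal sketch in Python (the prototype itself is PHP, so this is not the actual code). It assumes that the characters to escape are the ones the Apertium stream format reserves; the RESERVED set and both function names are my own assumptions for illustration.

```python
import re

# Assumed set of characters that are special in the Apertium stream format.
RESERVED = set('^$/<>{}\\[]@')


def escape_for_lt_proc(text: str) -> str:
    """Backslash-escape reserved characters so they are read as literal text."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


def unescape_from_lt_proc(text: str) -> str:
    """Undo the escaping by dropping the backslash in front of any character."""
    return re.sub(r"\\(.)", r"\1", text, flags=re.DOTALL)


if __name__ == "__main__":
    block = "see [1] and/or <this>"
    escaped = escape_for_lt_proc(block)
    print(escaped)
    print(unescape_from_lt_proc(escaped) == block)
```

Each extracted text block would go through something like escape_for_lt_proc before being piped to lt-proc, so that brackets or slashes in the page text are not mistaken for stream markup.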

BUGS identified
1. I still have not been able to get the encoding handling right. Only UTF-8 
works properly.
2. The pipe symbol |, which is not lemmatised by lt-proc, gets lost after 
processing. There may be other such characters as well.
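
For bug 1, one possible approach (not something the prototype does yet) would be to convert every fetched page to UTF-8 before any further processing, since that is the encoding that works. A rough sketch in Python, where to_utf8 and the declared_charset parameter are hypothetical names and the charset is assumed to come from the HTTP header or a meta tag:

```python
from typing import Optional


def to_utf8(raw: bytes, declared_charset: Optional[str] = None) -> bytes:
    """Return the page bytes re-encoded as UTF-8.

    Tries the declared charset first, then UTF-8 itself.
    """
    for charset in (declared_charset, "utf-8"):
        if not charset:
            continue
        try:
            return raw.decode(charset).encode("utf-8")
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset name: try the next candidate.
            continue
    # Last resort: ISO-8859-1 maps every byte to a code point, so this
    # never fails, although the result may be wrong for other encodings.
    return raw.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    latin1_page = "café".encode("iso-8859-1")
    print(to_utf8(latin1_page, "iso-8859-1").decode("utf-8"))
```

The fallback to ISO-8859-1 is a guess rather than real detection, so pages in other encodings that do not declare a charset could still come out garbled.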

Any feedback would be appreciated.

Deepak Joy Cheenath

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
