Hi All,

I forgot to mention that the svn diff patch was created against the stanbol-nlp-processing branch.

Alessio

On 01/09/2013 10:22 AM, Alessio Bosca wrote:
Hi Rupert,

yesterday I updated the morphological analysis service: we added support for the sv language, fixed the umlauts issue for German and unified the POS tagsets across all supported languages. As a result, the POS tags currently returned by the web service are no longer consistent with the ones declared in the POS tag mappings in the enhancement engine. For this reason the tests on the morphological engine should be failing.

Concerning the umlauts (and the ß character): they are now recognized by the system, but the lemma produced by the morphological analyzer currently converts them to sequences of characters (e.g. ö -> oe, ß -> ss). In a future version of the system I would like to include both lemma writings.
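To make the umlaut handling above concrete, here is a minimal sketch of how a lemma could be returned in both writings. It is not the CELI analyzer's code: the class and method names (LemmaVariants, transliterate, bothWritings) are hypothetical, and only the ö -> oe and ß -> ss mappings come from the message; the remaining ones (ä, ü and the upper-case forms) are assumed from the usual German transliteration convention.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of Stanbol or the CELI service).
final class LemmaVariants {

    /** Replaces German umlauts and sharp s with their ASCII digraphs. */
    static String transliterate(String lemma) {
        StringBuilder sb = new StringBuilder(lemma.length() + 4);
        for (int i = 0; i < lemma.length(); i++) {
            char c = lemma.charAt(i);
            switch (c) {
                case '\u00e4': sb.append("ae"); break; // ä (assumed mapping)
                case '\u00f6': sb.append("oe"); break; // ö (from the message)
                case '\u00fc': sb.append("ue"); break; // ü (assumed mapping)
                case '\u00c4': sb.append("Ae"); break; // Ä (assumed mapping)
                case '\u00d6': sb.append("Oe"); break; // Ö (assumed mapping)
                case '\u00dc': sb.append("Ue"); break; // Ü (assumed mapping)
                case '\u00df': sb.append("ss"); break; // ß (from the message)
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns the original lemma first, then the transliterated form if it differs. */
    static Set<String> bothWritings(String lemma) {
        Set<String> writings = new LinkedHashSet<>();
        writings.add(lemma);
        writings.add(transliterate(lemma));
        return writings;
    }

    public static void main(String[] args) {
        System.out.println(bothWritings("sch\u00f6n"));   // schön
        System.out.println(bothWritings("Stra\u00dfe"));  // Straße
    }
}
```

With this sketch, a lemma without umlauts yields a single writing, while a lemma such as "schön" yields the original form followed by "schoen", so a consumer could match against either one.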
The language identifier has been updated as well (a few new languages were added; the list of supported languages now includes en,fr,de,hu,pl,it,es,pt,el,et,lv,tr,pt,ru,ar,ro,da), without changes for the engines.

The attached patch contains the fixes for the morphological service updates (the sv language addition and the new POS tag mapping, a single one for all languages). It also contains the client and the test classes for the sentiment analysis engine, which supports fr and it.

Let me know if you have any problems integrating the patch.

Alessio

On 01/07/2013 05:44 PM, Alessio Bosca wrote:
Hi Rupert,

sure, tomorrow I'll have a look into that and let you know.

bests
    Alessio

On 01/04/2013 01:07 PM, Rupert Westenthaler wrote:
Hi Alessio, all

Thanks for looking into that. However, with Jenkins build #1200
there is still one remaining issue:

tesetEngine(org.apache.stanbol.enhancer.engines.celi.langid.impl.CeliLanguageIdentifierEnhancementEngineTest)
  Time elapsed: 0.296 sec  <<< FAILURE!
junit.framework.ComparisonFailure: The detected language for text
'Brigitte Bardot, née  le 28 septembre 1934 à Paris, est une actrice
de cinéma et chanteuse française.' MUST BE 'fr' expected:<[f]r> but
was:<[a]r>
        at junit.framework.Assert.assertEquals(Assert.java:100)
        at org.apache.stanbol.enhancer.engines.celi.langid.impl.CeliLanguageIdentifierEnhancementEngineTest.tesetEngine(CeliLanguageIdentifierEnhancementEngineTest.java:101)

Looks like the Language Identification engine detects the language
Arabic for the French text example. I am also able to reproduce this
issue locally.

Can you have a look into that?
best
Rupert


On Fri, Jan 4, 2013 at 10:03 AM, Alessio Bosca <[email protected]> wrote:
Hi Rupert,

thanks for the feedback. There was a problem with the access control for
anonymous users in the services. It has now been fixed.

PS: Next week I'll send you a patch for CELI engines with the sentiment
analysis and bug fixes for the German umlauts.
Sorry for the delay in the release.

Bests,
     Alessio

--
*************************************
Alessio Bosca, Ph.D.
CELI s.r.l.
Via San Quintino 31
10121 Torino
Tel. +39 011.562.71.15
Fax +39 011.506.40.86
http://www.celi.it
*************************************