[ https://jira.nuxeo.com/browse/NXSEM-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=95577#comment-95577 ]

Olivier Grisel commented on NXSEM-12:
-------------------------------------

Work under way at: 
https://github.com/ogrisel/pignlproc/tree/master/examples/ner-corpus

Mostly working. Redirect handling is still needed so that important
entities such as China => People's Republic of China are not skipped.
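
The redirect issue above can be illustrated with a minimal Python sketch. It
is not the pignlproc code: the function names, the `redirects` and
`entity_types` dictionaries, and the toy data are all assumptions made up for
illustration. The idea is to resolve each link target through the redirect
table before looking up its entity type, so that a link to "China" is counted
as a link to "People's Republic of China" instead of being dropped.

```python
def resolve_redirect(title, redirects, max_hops=10):
    """Follow redirects until a non-redirect title is reached.

    `redirects` is a hypothetical dict mapping a redirect page title to
    its target title. Cycles and overly long chains are cut short.
    """
    seen = set()
    while title in redirects and title not in seen and len(seen) < max_hops:
        seen.add(title)
        title = redirects[title]
    return title


def typed_links(link_targets, redirects, entity_types):
    """Return (resolved title, entity type) pairs for typed link targets.

    `entity_types` is a hypothetical dict mapping a canonical page title
    to a type such as "Person", "Organization" or "Place". Links whose
    resolved target has no known type are skipped.
    """
    result = []
    for target in link_targets:
        resolved = resolve_redirect(target, redirects)
        if resolved in entity_types:
            result.append((resolved, entity_types[resolved]))
    return result


# Toy data, made up for illustration only.
redirects = {"China": "People's Republic of China"}
entity_types = {"People's Republic of China": "Place"}
links = ["China", "Tea"]

annotated = typed_links(links, redirects, entity_types)
```

Without the call to `resolve_redirect`, the lookup of "China" in
`entity_types` would fail and the sentence containing that link would be
skipped.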

> hadoop script to build NER training corpus from wikipedia sentences with 
> links to Person, Organization or Places
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: NXSEM-12
>                 URL: https://jira.nuxeo.com/browse/NXSEM-12
>             Project: Nuxeo Semantic R&D
>          Issue Type: Task
>            Reporter: Olivier Grisel
>            Assignee: Olivier Grisel
>             Fix For: 5.4.2
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
_______________________________________________
ECM-tickets mailing list
[email protected]
http://lists.nuxeo.com/mailman/listinfo/ecm-tickets

Reply via email to