Giang Nguyen created MADLIB-969:
-----------------------------------
Summary: Adding a script to generate test segment table for CRF
Key: MADLIB-969
URL: https://issues.apache.org/jira/browse/MADLIB-969
Project: Apache MADlib
Issue Type: New Feature
Reporter: Giang Nguyen
It would be very helpful to have a Python script in MADlib that tokenizes
words, assigns the corresponding doc_id and start_pos to each token, and stores
the result in the database. This would save users considerable time when using
CRF and let them conveniently run the CRF model on large test data sets.
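
A minimal sketch of the tokenizing step, to make the request concrete. It is
not an existing MADlib script: the function name build_segment_rows, the
regex-based tokenizer, the seg_text column name, and the choice of a 0-based
token index for start_pos are all assumptions made for illustration. Loading
the rows into the database is left out and would depend on the driver in use.

```python
import re

# Assumed tokenizer: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def build_segment_rows(documents):
    """Return (doc_id, start_pos, seg_text) tuples for a list of raw texts.

    doc_id is the position of the document in the input list and start_pos
    is the 0-based index of the token within its document (both assumptions).
    """
    rows = []
    for doc_id, text in enumerate(documents):
        for start_pos, token in enumerate(TOKEN_RE.findall(text)):
            rows.append((doc_id, start_pos, token))
    return rows


# Hypothetical input documents, used only to show the shape of the output.
rows = build_segment_rows(["Tim Tebow is a player.", "MADlib runs in-database."])
for row in rows:
    print(row)
```

Each tuple corresponds to one row of the test segment table, so the list could
be bulk-inserted in a single statement once a table with matching columns
exists.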