Yes, I will create a JIRA for this and start working on it as well.
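
As a rough sketch of the idea, not the final implementation: the function name
tokenize_documents, the 1-based doc_id, the 0-based start_pos, and the simple
"words and punctuation" tokenization rule are all assumptions for illustration
only. The actual script would need to match whatever the CRF module expects
and write the rows to a table rather than return a list.

```python
import re

# Hypothetical tokenization rule: runs of word characters, or a single
# punctuation character. The real rule would need to match the CRF module.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize_documents(docs):
    """Return (doc_id, start_pos, seg_text) rows for a list of documents.

    doc_id is assigned in input order starting at 1 (an assumption), and
    start_pos is the 0-based index of the token within its document.
    """
    rows = []
    for doc_id, text in enumerate(docs, start=1):
        tokens = TOKEN_PATTERN.findall(text)
        for start_pos, token in enumerate(tokens):
            rows.append((doc_id, start_pos, token))
    return rows


if __name__ == "__main__":
    for row in tokenize_documents(["Hello, world.", "CRF rocks"]):
        print(row)
```

Each row could then be inserted into a segment table with the columns doc_id,
start_pos and seg_text.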

-Giang

> On Feb 23, 2016, at 7:59 PM, Frank McQuillan <[email protected]> wrote:
> 
> Jim's approach seems like a reasonable way to go.
> 
> Giang, can you create a JIRA for this request?  You are welcome to start
> working on it if you would like to contribute this to improve CRF usability.
> 
> Frank
> 
>> On Tue, Feb 23, 2016 at 3:24 PM, Jim Nasby <[email protected]> wrote:
>> 
>>> On 2/23/16 11:07 AM, Nguyen,Giang H wrote:
>>> 
>>> I think it could be very helpful if we write a Python script in MADlib to
>>> tokenize words, assign the corresponding doc_id and start_pos, and store
>>> the result in the database. Users would then save a lot of time when
>>> using CRF and could conveniently run a CRF model on large test
>>> data.
>> 
>> Perhaps the Postgres text search stuff could be used for this (maybe
>> to_tsvector())?
>> --
>> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
>> Experts in Analytics, Data Architecture and PostgreSQL
>> Data in Trouble? Get it in Treble! http://BlueTreble.com
>> 
