On Tue, 31 May 2016 12:05:39 -0400
Bill Cole wrote:

> On 31 May 2016, at 2:21, Henrik K wrote:
> 
> > On Mon, May 30, 2016 at 06:25:08PM -0400, Dianne Skoll wrote:  
> >> On Mon, 30 May 2016 17:45:52 -0400
> >> "Bill Cole" <sausers-20150...@billmail.scconsult.com> wrote:
> >>  
> >>> So you could have 'sex' and 'meds' and 'watches' tallied up
> >>> into frequency counts that sum up natural (word) and synthetic
> >>> (concept) occurrences, not just as incompatible types of input
> >>> feature but as a conflation of incompatible features.  
> >>
> >> That is easy to patch by giving "concepts" a separate namespace.
> >> You could do that by picking a character that can't be in a normal
> >> token and using something like: concept*meds, concept*sex, etc. as
> >> tokens.
> >
> > This is how the put_metadata stuff already works in concepts and
> > other plugins. It sees a "Hx-sa-concepts:foobar" token.  
> 
> That's less bad than the description Paul Stead originally gave,
> which was to add headers with various simple word tags "which Bayes
> can use as tokens." If the actual implementation is doing something
> else in a separate Bayes DB, I don't see a problem with it (although
> I'd expect it to be less accurate than 1-word Bayes).

It's not in a separate database. Words in headers simply generate
tokens that are distinct from the tokens for the same words in the body.
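
To make the namespacing concrete, here is a rough Python sketch of the
idea. It is not SpamAssassin's actual tokenizer: the function name
tokenize, the word regex, and the sample message are all made up for
illustration. The only thing taken from this thread is the shape of the
header token, "Hx-sa-concepts:foobar", which I am assuming means "H" plus
the lowercased header name plus a colon.

```python
import re
from collections import Counter

# Hypothetical word pattern; a real tokenizer is more elaborate.
WORD = re.compile(r"[A-Za-z0-9']+")

def tokenize(headers, body):
    """Return one token list covering header words and body words.

    Header words get an "H<header-name>:" prefix (assumed format), so
    they can never collide with the same word seen in the body.
    """
    tokens = []
    for name, value in headers.items():
        prefix = "H" + name.lower() + ":"
        for word in WORD.findall(value):
            tokens.append(prefix + word.lower())
    for word in WORD.findall(body):
        tokens.append(word.lower())
    return tokens

# Both kinds of token are tallied in the same counter (one "database"),
# yet the header "meds" and the body "meds" stay separate entries.
counts = Counter(tokenize({"X-SA-Concepts": "meds"}, "Cheap meds, best meds"))
print(counts["Hx-sa-concepts:meds"], counts["meds"])
```

Because the prefix contains a colon, which the word pattern never
matches, a body word cannot produce a token that looks like a header
token, so the two counts are never summed together.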
