[EMAIL PROTECTED] (Tom Fawcett) wrote:
>Ken Williams <[EMAIL PROTECTED]> wrote:
>> [EMAIL PROTECTED] (Tom Fawcett) wrote:
>> >It would be nice to have a numeric discretization module as well so
>> >these would work with mixed numerics and text, but that's probably 
>> >asking for too much...
>> 
>> I'm not sure what the term 'discretization' means - is it a conversion
>> of numerics to some other form, or a lumping, or something like that?
>
>Yes, it's basically taking a continuous-valued attribute and creating
>appropriate bins for the value.  So instead of C in [1,100] you have
>C_prime in {low,medium,high}.  This is necessary for techniques like
>Naive Bayes which can't handle continuous attributes naturally. 
>Figuring out the number of bins and their ranges is the trick.  I guess
>there are some straightforward entropy-based methods that are pretty
>easy to write.  I'll implement one when I get some, um, spare time.

Cool.  Keep us apprised.
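
To make the binning idea concrete, here is a minimal sketch of two
approaches: fixed equal-width bins (the C in [1,100] to {low,medium,high}
example) and a single entropy-based cut point. It is in Python purely for
illustration and is not any existing Perl module's API; the function names
equal_width_bin, entropy, and best_entropy_cut are made up for this sketch,
and the entropy version finds only one split, not a full set of bins.

```python
import math
from collections import Counter

# Hypothetical helper names; an illustration only, not a real module's API.

def equal_width_bin(value, lo, hi, labels=("low", "medium", "high")):
    """Map a value in [lo, hi] to one of the labels using equal-width bins."""
    if not lo <= value <= hi:
        raise ValueError("value outside [lo, hi]")
    width = (hi - lo) / len(labels)
    # min() keeps value == hi in the last bin instead of overflowing.
    index = min(int((value - lo) / width), len(labels) - 1)
    return labels[index]

def entropy(classes):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(classes)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(classes).values())

def best_entropy_cut(values, classes):
    """Return the cut point that minimizes the weighted class entropy
    of the two resulting sides, or None if no cut is possible."""
    pairs = sorted(zip(values, classes))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        score = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            # Place the cut halfway between the two neighbouring values.
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut
```

Equal-width bins need no class labels but ignore where the classes
actually separate; the entropy cut uses the labels to pick the boundary.
Applying the cut recursively to each side, with some stopping rule, is
one way to get more than two bins.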

>> By the way, the best place to discuss this work is on the perl-AI list,
>> at [EMAIL PROTECTED] .  That's where I'm trying to coax discussions to
>> take place.
>
>OK, I've joined it.

I've cc'd this message there too.


  -------------------                            -------------------
  Ken Williams                             Last Bastion of Euclidity
  [EMAIL PROTECTED]                            The Math Forum
