On Sun, Jul 23, 2017 at 10:09 PM, Jon Zeppieri <zeppi...@gmail.com> wrote:
>
> Even after implementing my own suggestions, it's still much slower
> than the Python example it was based on. Maybe there's an algorithmic
> problem somewhere (aside from the vector iteration I mentioned
> before). At any rate, I'm intrigued now... -J

And... it turned out to be a very small thing indeed. When you iterate
over the class labels, you're supposed to iterate over the set of
*distinct* class labels. In the Python source, this is:

   class_values = list(set(row[-1] for row in dataset))

In your code, you have:

   (let* ([class-labels (data-get-col data label-column-index)] ...)

... where `data-get-col` returns the whole label column: a list the
same length as `data`, duplicates included. And that's where the huge
slowdown comes from, since `gini-index` then iterates once per row
instead of once per distinct label.
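
To make the difference concrete, here is a minimal Python sketch. The toy dataset, the `groups` split, and this simplified `gini_index` are all made up for illustration; they are not the tutorial's code or yours. It only counts how many (label, group) pairs the loop visits for each choice of label list:

```python
# Hypothetical toy data: each row is [feature, class_label].
dataset = [[2.7, 0], [1.3, 0], [3.6, 0], [7.4, 1], [9.0, 1], [6.6, 1]]

# The whole label column: one entry per row, duplicates included.
all_labels = [row[-1] for row in dataset]

# Only the distinct labels, as in the Python source quoted above.
distinct_labels = list(set(row[-1] for row in dataset))


def gini_index(groups, class_values):
    """Simplified Gini sketch (an assumption, not the original code).

    Returns the summed p * (1 - p) terms and the number of
    (class_value, group) pairs visited.
    """
    gini = 0.0
    visits = 0
    for class_value in class_values:
        for group in groups:
            if not group:
                continue
            visits += 1
            labels = [row[-1] for row in group]
            proportion = labels.count(class_value) / len(group)
            gini += proportion * (1.0 - proportion)
    return gini, visits


# An arbitrary split of the toy data into two groups.
groups = [dataset[:3], dataset[3:]]

_, visits_distinct = gini_index(groups, distinct_labels)
_, visits_all = gini_index(groups, all_labels)

print(len(distinct_labels), visits_distinct)
print(len(all_labels), visits_all)
```

The number of visits grows with the length of the label list, so passing the full column makes the work scale with the number of rows rather than the number of classes. On the Racket side, something like `remove-duplicates` applied to the result of `data-get-col` should give the distinct labels, assuming the labels compare with `equal?`.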

-Jon
