[ https://issues.apache.org/jira/browse/MAHOUT-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017093#comment-13017093 ]
Ted Dunning commented on MAHOUT-653:
------------------------------------
Are these really faster by enough to matter? Do we have any code that is CPU
bound on distance computations rather than I/O bound? I expect not, but I do
not know.

Also, the L_k measures with fractional k are pretty controversial. L_1 does a
fantastic job of sparsification without sacrificing convergence properties. I
know that both Google and Yahoo use L_1 regularization for on-line SGD-based
learners.

The IBM paper provided has some serious issues (at first glance). In
particular, they use a theoretical figure of merit that is based on contrast
for a particular distribution of points. Unfortunately, the distributions they
chose are almost always far from invariant under rotation. Since they are
proving bounds over all possible distributions, choosing an almost always
pathological case gives bounds that are dominated by that pathological case.
Most of the time, these distributions will have points that are concentrated
along the axes, which may or may not be realistic. Data with this property
trivially favors the L_k metrics with fractional values of k.
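
To make the rotation point concrete, here is a small sketch (my own
illustration, not code from the paper or from the patch; the class and method
names are hypothetical). It measures the same unit-length point before and
after a 45 degree rotation. L_2 is unchanged by the rotation, while the
fractional measure L_0.5 is smallest when the point lies on an axis.

```java
// Illustrative sketch only; FractionalNormDemo is a hypothetical name, not Mahout API.
final class FractionalNormDemo {

    // Minkowski-style measure (sum_i |x_i|^k)^(1/k). For k < 1 it is not a true norm.
    static double lk(double[] x, double k) {
        double sum = 0.0;
        for (double v : x) {
            sum += Math.pow(Math.abs(v), k);
        }
        return Math.pow(sum, 1.0 / k);
    }

    // Rotate a 2-D point counter-clockwise by the given angle in radians.
    static double[] rotate(double[] p, double radians) {
        double c = Math.cos(radians);
        double s = Math.sin(radians);
        return new double[] {c * p[0] - s * p[1], s * p[0] + c * p[1]};
    }

    public static void main(String[] args) {
        double[] onAxis = {1.0, 0.0};
        double[] rotated = rotate(onAxis, Math.PI / 4);
        for (double k : new double[] {2.0, 1.0, 0.5}) {
            System.out.println("k=" + k
                + " on-axis=" + lk(onAxis, k)
                + " rotated=" + lk(rotated, k));
        }
    }
}
```

For the on-axis point every k gives 1. After rotation, L_2 is still 1, L_1
grows to sqrt(2), and L_0.5 grows to 2*sqrt(2), so the fractional measure
depends on how the data happens to line up with the axes.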

Given this problem with the basic premises and the well-known convergence
issues, I would recommend that folks be pretty cautious with these results.
> Approximations to standard functions
> ------------------------------------
>
> Key: MAHOUT-653
> URL: https://issues.apache.org/jira/browse/MAHOUT-653
> Project: Mahout
> Issue Type: New Feature
> Reporter: Lance Norskog
> Attachments: MAHOUT-653.patch, MAHOUT-653.patch
>
>
> These give approximate versions of pow(value, exponent), exp(value), and
> natural log(value).
> log() and exp() stolen from:
> [http://martin.ankerl.com/2007/02/11/optimized-exponential-functions-for-java/]
> pow() stolen from:
> [http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/]
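
For readers who have not seen this kind of approximation, the following is a
minimal sketch of the usual bit trick, which writes a scaled value straight
into the exponent bits of an IEEE 754 double. It is an assumption about the
general technique, not a copy of what MAHOUT-653.patch contains, and the class
name is hypothetical. pow() is left out to keep the sketch short.

```java
// Sketch only: a bit-manipulation approximation in the style described above,
// not necessarily the code in MAHOUT-653.patch. FastMathSketch is a hypothetical name.
final class FastMathSketch {

    // 2^20 / ln(2): maps the argument onto the exponent field of the high 32 bits.
    private static final double SCALE = 1512775.0;

    // 1023 * 2^20 (the exponent bias shifted into place) minus a tuning constant.
    private static final long OFFSET = 1072693248L - 60801L;

    // Approximate e^val; only meaningful while the result is a normal double
    // (roughly |val| < 700).
    static double exp(double val) {
        long hi = (long) (SCALE * val + OFFSET);
        return Double.longBitsToDouble(hi << 32);
    }

    // Approximate natural log; only meaningful for positive, finite, normal val.
    static double ln(double val) {
        double hi = (double) (Double.doubleToLongBits(val) >> 32);
        return (hi - OFFSET) / SCALE;
    }

    public static void main(String[] args) {
        for (double x : new double[] {-2.0, 0.0, 1.0, 3.5}) {
            System.out.println("x=" + x
                + " fastExp=" + exp(x)
                + " Math.exp=" + Math.exp(x));
        }
        for (double v : new double[] {0.5, 1.0, 10.0}) {
            System.out.println("v=" + v
                + " fastLn=" + ln(v)
                + " Math.log=" + Math.log(v));
        }
    }
}
```

The exp() result is off by a few percent relative to Math.exp(), and ln() is
off by a few hundredths in absolute terms, which is the accuracy being traded
for speed.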
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira