[ 
https://issues.apache.org/jira/browse/MATH-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794279#action_12794279
 ] 

Phil Steitz commented on MATH-323:
----------------------------------

Thanks, Larry. I am happy that you are not finding it too hard to get started 
contributing.   We appreciate and welcome your contributions!

Now to the eggnog...er, I mean issue at hand....

I now (think I) understand what you are trying to compute and why you leave 
the top-coded entries in place.  What now looks odd to me is recoding and 
then just computing the ordinary variance.  That will not give you 
E[(X - MAR)^2], where MAR is the minimum acceptable return, but rather 
E[(X - E(recoded X))^2].  I think you may need to compute the squared 
deviations from the MAR (or from the mean, with the top-coded entries 
contributing 0) directly, instead of computing the variance of the recoded 
data.  That seems to be what your second reference above is describing.  
Consider the influence of the original values greater than or equal to the 
mean on the result computed below:

{code}
for (int loop = 0; loop < values.length; loop++) {
    if (values[loop] < mean)
        semivariancevalues[loop] = values[loop];
    else
        semivariancevalues[loop] = mean;
}
return VARIANCE.evaluate(semivariancevalues, mean);
{code}

The top-coded values will not contribute 0, but will instead contribute 
their (squared) deviation above the mean of the recoded dataset.  Is this 
what you really want?  It would seem to me that the more natural measure would 
be E[(X - original mean)^2].
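
To make the difference concrete, here is a minimal, self-contained sketch 
that does not use Commons Math.  The class and method names are hypothetical, 
the n - 1 divisor is an assumption (the comment above does not fix one), and 
varianceOfRecoded models the "recode, then take the ordinary variance" 
approach as described here rather than the exact library call in the patch.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the MATH-323 patch.
class SemivarianceComparison {

    /** Replace every value at or above the cutoff with the cutoff itself. */
    static double[] topCode(double[] values, double cutoff) {
        double[] recoded = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            recoded[i] = values[i] < cutoff ? values[i] : cutoff;
        }
        return recoded;
    }

    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    /** Ordinary sample variance about the data's own mean (divisor n - 1). */
    static double sampleVariance(double[] values) {
        double m = mean(values);
        double sum = 0.0;
        for (double v : values) {
            sum += (v - m) * (v - m);
        }
        return sum / (values.length - 1);
    }

    /** Recode at the original mean, then take the ordinary variance. */
    static double varianceOfRecoded(double[] values) {
        return sampleVariance(topCode(values, mean(values)));
    }

    /**
     * Squared deviations below the target, measured from the target itself;
     * values at or above the target contribute 0 (divisor n - 1, assumed).
     */
    static double downsideSemivariance(double[] values, double target) {
        double sum = 0.0;
        for (double v : values) {
            if (v < target) {
                double d = v - target;
                sum += d * d;
            }
        }
        return sum / (values.length - 1);
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 10};
        System.out.println("recode + variance: " + varianceOfRecoded(data));
        System.out.println("direct deviations: "
                + downsideSemivariance(data, mean(data)));
    }
}
```

For {1, 2, 3, 4, 10} the original mean is 4 and the recoded data is 
{1, 2, 3, 4, 4}, whose own mean is 2.8.  The recode-then-variance result is 
roughly 1.7, while the direct computation gives 3.5, because in the first 
case the two top-coded entries each sit 1.2 above the recoded mean instead of 
contributing 0.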

Sorry to ask so many questions.  It could well be that I am just 
misunderstanding what the statistic is trying to estimate.  I just want to 
make sure we are computing something that we can easily describe and, more 
importantly, that is really useful.

Regarding the UnivariateStatistic, I think we should go ahead and do that and 
include the target as an optional constructor argument.
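
One possible shape for the "optional constructor argument" idea is sketched 
below.  It is only an illustration: the class name is hypothetical, it does 
not implement the actual Commons Math UnivariateStatistic interface, and the 
fallback to the sample mean and the n - 1 divisor are assumptions.

```java
// Hypothetical sketch; not the class that was eventually committed.
class SemiVarianceSketch {

    private final boolean hasTarget;
    private final double target;

    /** No explicit target: the mean of the data passed to evaluate is used. */
    SemiVarianceSketch() {
        this.hasTarget = false;
        this.target = Double.NaN;
    }

    /** Explicit target, e.g. a minimum acceptable return. */
    SemiVarianceSketch(double target) {
        this.hasTarget = true;
        this.target = target;
    }

    /** Sum of squared deviations below the cutoff, divided by n - 1. */
    double evaluate(double[] values) {
        if (values.length < 2) {
            return Double.NaN; // not enough data for an n - 1 divisor
        }
        double cutoff = target;
        if (!hasTarget) {
            double sum = 0.0;
            for (double v : values) {
                sum += v;
            }
            cutoff = sum / values.length;
        }
        double accum = 0.0;
        for (double v : values) {
            if (v < cutoff) {
                accum += (v - cutoff) * (v - cutoff);
            }
        }
        return accum / (values.length - 1);
    }
}
```

With this shape, new SemiVarianceSketch().evaluate(data) measures downside 
deviations from the data's own mean, while new SemiVarianceSketch(mar) 
measures them from a caller-supplied target.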


> Add Semivariance calculation
> ----------------------------
>
>                 Key: MATH-323
>                 URL: https://issues.apache.org/jira/browse/MATH-323
>             Project: Commons Math
>          Issue Type: New Feature
>    Affects Versions: 2.1
>            Reporter: Larry Diamond
>            Assignee: Phil Steitz
>            Priority: Minor
>             Fix For: 2.1
>
>         Attachments: patch.txt, patch2.txt, StatUtils.java, StatUtils.java, 
> StatUtilsTest.java, StatUtilsTest.java
>
>
> I've added semivariance calculations to my local build of commons-math and I 
> would like to contribute them.
> Semivariance is described a little bit on 
> http://en.wikipedia.org/wiki/Semivariance , but a real reason you would use 
> it is in finance, in order to compute the Sortino ratio rather than the 
> Sharpe ratio.
> http://en.wikipedia.org/wiki/Sortino_ratio gives an explanation of the 
> Sortino ratio and why you would choose to use that rather than the Sharpe 
> ratio.  (There are other ways to measure the performance of your portfolio, 
> but I won't bore everybody with that stuff.)
> I've already got the coding completed along with the test cases and building 
> using mvn site.
> The only two files I've modified are 
> src/main/java/org/apache/commons/stat/StatUtils.java and 
> src/test/java/org/apache/commons/math/stat/StatUtilsTest.java.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
