[ 
https://issues.apache.org/jira/browse/DATAFU-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884693#comment-13884693
 ] 

Matthew Hayes commented on DATAFU-24:
-------------------------------------

So, I've thought about this some more.  Here are the differences between these 
two UDFs:

1) Entropy only calculates the empirical entropy.  StreamingEntropy can 
calculate empirical, chao-shen, and potentially others.
2) Entropy supports both the algebraic and accumulator interfaces.  
StreamingEntropy only supports accumulator.

I think what may make more sense is to rename Entropy to EmpiricalEntropy so 
there is no confusion about which method it uses.  I'll open a separate JIRA to 
cover this and other thoughts I have.

> Entropy constructor should be consistent with other UDFs
> --------------------------------------------------------
>
>                 Key: DATAFU-24
>                 URL: https://issues.apache.org/jira/browse/DATAFU-24
>             Project: DataFu
>          Issue Type: Bug
>            Reporter: Matthew Hayes
>
> Entropy currently has the following constructors:
> {noformat}
> Entropy()
> Entropy(String base)
> {noformat}
> This is inconsistent with StreamingEntropy and StreamingCondEntropy, which 
> both have constructors like the following:
> {noformat}
> StreamingEntropy()
> StreamingEntropy(String type)
> StreamingEntropy(String type, String base)
> {noformat}
> We should change Entropy to match the other UDFs.
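
One way the constructor change described above could look is sketched below. 
This is only an illustration: EntropySketch is a hypothetical stand-in that 
does not extend any Pig class, and the default values "empirical" and "log", 
as well as the choice to reject other types, are assumptions rather than 
something stated in this issue.

```java
// Hypothetical stand-in for the Entropy UDF; not the actual DataFu class.
final class EntropySketch {
    // Assumed defaults, chosen for illustration only.
    static final String DEFAULT_TYPE = "empirical";
    static final String DEFAULT_BASE = "log";

    private final String type;
    private final String base;

    // Mirrors StreamingEntropy()
    EntropySketch() {
        this(DEFAULT_TYPE);
    }

    // Mirrors StreamingEntropy(String type)
    EntropySketch(String type) {
        this(type, DEFAULT_BASE);
    }

    // Mirrors StreamingEntropy(String type, String base)
    EntropySketch(String type, String base) {
        // Assumption: since only the empirical estimator is computed,
        // any other type is rejected up front.
        if (!DEFAULT_TYPE.equals(type)) {
            throw new IllegalArgumentException("Unsupported type: " + type);
        }
        this.type = type;
        this.base = base;
    }

    String getType() {
        return type;
    }

    String getBase() {
        return base;
    }
}
```

Chaining each shorter constructor to the next longer one keeps the defaults 
in a single place, so the three signatures cannot drift apart.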



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
