Erik Shilts created HIVE-3711:
---------------------------------

             Summary: Create UDAF to calculate an array of Benford's Law
                 Key: HIVE-3711
                 URL: https://issues.apache.org/jira/browse/HIVE-3711
             Project: Hive
          Issue Type: New Feature
          Components: UDF
            Reporter: Erik Shilts
            Priority: Minor


Benford's Law is a useful analytical tool to determine whether a set of numbers 
was generated by a random process, by evaluating the relative proportions of 
their leading digits. It can be used to detect accounting, financial, and 
election fraud.

[Wikipedia's|http://en.wikipedia.org/wiki/Benford's_law] Benford's Law page has 
a good overview.

Hive is well suited to calculate Benford's Law. The result should be a named 
struct with names 1-9 and values being the corresponding proportions of each 
digit.
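
As a rough illustration of the aggregation described above, here is a minimal 
Python sketch (not Hive UDAF code). The function names are hypothetical, and it 
assumes zero and non-finite values are skipped and the sign is ignored; the 
issue does not specify either choice.

```python
import math
from collections import Counter

def leading_digit(x):
    """Return the first significant digit (1-9) of x, or None if there is none."""
    # Assumption: zero, NaN, and infinities contribute no leading digit.
    if x == 0 or not math.isfinite(x):
        return None
    # Scientific notation puts the first significant digit first,
    # e.g. 0.0042 -> '4.2000000000000002e-03'. The sign is ignored.
    return int(f"{abs(x):.16e}"[0])

def leading_digit_proportions(values):
    """Map each digit 1-9 to the share of values whose leading digit it is."""
    counts = Counter(d for d in map(leading_digit, values) if d is not None)
    total = sum(counts.values())
    # All nine digits are always present, mirroring a struct with names 1-9.
    return {d: (counts[d] / total if total else 0.0) for d in range(1, 10)}
```

Returning all nine keys even when a digit never occurs keeps the shape of the 
result fixed, which is what a named struct would require.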

An alternative is to calculate the deviations from Benford's Law for each 
digit. The structure of the result would be the same, but the values 
would be the differences between the actual proportions and the proportions 
given by the 
[formula|http://en.wikipedia.org/wiki/Benford's_law#Mathematical_statement] on 
Wikipedia.
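
For reference, the formula gives the expected proportion of leading digit d as 
log10(1 + 1/d). The following Python sketch (again not Hive code, with 
hypothetical function names) shows one way the deviations could be computed, 
assuming the observed proportions arrive as a mapping from digit to proportion 
and that a missing digit counts as 0.

```python
import math

def benford_expected():
    """Expected leading-digit proportions: P(d) = log10(1 + 1/d) for d in 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def benford_deviations(observed):
    """Return observed minus expected proportion for each digit 1-9.

    `observed` maps digits to proportions; absent digits are treated as 0.0
    (an assumption made for this sketch).
    """
    expected = benford_expected()
    return {d: observed.get(d, 0.0) - expected[d] for d in range(1, 10)}
```

A positive value means the digit appears more often than Benford's Law 
predicts, a negative value less often.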

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
