[jira] [Commented] (HIVEMALL-280) Support lift/confidence/support UDF

2019-12-29 Thread Makoto Yui (Jira)


[ 
https://issues.apache.org/jira/browse/HIVEMALL-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004768#comment-17004768
 ] 

Makoto Yui commented on HIVEMALL-280:
-

[~dharmikt] Have you made any progress? If there is anything I can help with, 
let me know.

> Support lift/confidence/support UDF
> ---
>
> Key: HIVEMALL-280
> URL: https://issues.apache.org/jira/browse/HIVEMALL-280
> Project: Hivemall
> Issue Type: New Feature
> Reporter: Makoto Yui
> Assignee: Dharmik Thakkar
> Priority: Minor
>
> Support lift/confidence/support UDAF
> [https://en.wikipedia.org/wiki/Lift_(data_mining)]
> [https://towardsdatascience.com/a-gentle-introduction-on-market-basket-analysis-association-rules-fa4b986a40ce]
> [https://medium.com/@samratjain/explained-market-basket-analysis-using-sql-a7434f30e649]
> {code:sql}
> select
>   item, other_item,
>   lift(...) as lift,
>   confidence(...) as confidence
> from
>   transaction
> group by
>   item, other_item
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVEMALL-280) Support lift/confidence/support UDF

2019-11-27 Thread Makoto Yui (Jira)


[ 
https://issues.apache.org/jira/browse/HIVEMALL-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983854#comment-16983854
 ] 

Makoto Yui commented on HIVEMALL-280:
-

This is only a pairwise (k = 2) computation, but the Apriori algorithm is known 
to be efficient for computing rules over k-item sets where k >= 2.

[https://medium.com/edureka/apriori-algorithm-d7cc648d4f1e]

[https://www.geeksforgeeks.org/apriori-algorithm/]

[https://codereview.stackexchange.com/questions/104637/apriori-algorithm-for-frequent-itemset-generation-in-java]

[https://github.com/seratch/apriori4j]
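
For illustration only, the level-wise idea behind Apriori can be sketched in a few lines of Python. This is a minimal sketch, not Hivemall code and not taken from any of the linked implementations; the function name `apriori`, the `min_support` parameter, and the sample baskets are all made up for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every itemset whose
    support (fraction of transactions containing it) is >= min_support.

    Hypothetical helper for illustration; not part of Hivemall.
    """
    tx_sets = [frozenset(t) for t in transactions]
    n = len(tx_sets)

    def support_of(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in tx_sets if itemset <= t) / n

    # Level 1: frequent single items.
    current = {}
    for item in {i for t in tx_sets for i in t}:
        s = support_of(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {}
        for c in candidates:
            s = support_of(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Made-up sample baskets.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]
result = apriori(transactions, min_support=0.5)
```

With these baskets, the pair {milk, butter} falls below the threshold, so the 3-item candidate is pruned without scanning the transactions again. That pruning is where the efficiency for larger k comes from.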

R's arules library supports the lift value as well.

[http://r-statistics.co/Association-Mining-With-R.html#Caveat%20with%20using%20Lift]

[https://www.kdnuggets.com/2016/04/association-rules-apriori-algorithm-tutorial.html/2]



[jira] [Commented] (HIVEMALL-280) Support lift/confidence/support UDF

2019-11-27 Thread Makoto Yui (Jira)


[ 
https://issues.apache.org/jira/browse/HIVEMALL-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983838#comment-16983838
 ] 

Makoto Yui commented on HIVEMALL-280:
-

[~dharmikt]

"A, B, A_count, B_count, AB_count, tx_total" are all required to compute 
support/confidence/lift. Thus, I think "product 1, product 2, number of tx" 
alone is not enough.
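
To make the dependency on each count concrete, here is a minimal Python sketch using the standard textbook definitions of the three measures for a rule A -> B. The class and function names are hypothetical (they are not a proposed Hivemall API), and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairCounts:
    """Counts for one (A, B) pair; field names mirror the comment above."""
    a_count: int   # transactions containing A
    b_count: int   # transactions containing B
    ab_count: int  # transactions containing both A and B
    tx_total: int  # total number of transactions


def support(c: PairCounts) -> float:
    # support(A -> B) = P(A and B)
    return c.ab_count / c.tx_total


def confidence(c: PairCounts) -> float:
    # confidence(A -> B) = P(B | A)
    return c.ab_count / c.a_count


def lift(c: PairCounts) -> float:
    # lift(A -> B) = P(B | A) / P(B)
    return confidence(c) / (c.b_count / c.tx_total)


# Made-up counts for illustration.
example = PairCounts(a_count=4, b_count=5, ab_count=2, tx_total=10)
```

Confidence needs A_count, and lift additionally needs B_count and tx_total, so the pair and its co-occurrence count alone cannot produce all three values.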



[jira] [Commented] (HIVEMALL-280) Support lift/confidence/support UDF

2019-11-27 Thread Dharmik Thakkar (Jira)


[ 
https://issues.apache.org/jira/browse/HIVEMALL-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983325#comment-16983325
 ] 

Dharmik Thakkar commented on HIVEMALL-280:
--

[~myui] Thanks!

I need one clarification on the input table 'transaction'. Does the 
'transaction' table look like the one below?

!https://miro.medium.com/max/678/1*K2f-Rja-1xrqe9IdUy5Dgw.png!

And in the query, would the arguments to lift/confidence/support be all 
three columns? Is my understanding correct? Please advise. Thanks!

 



[jira] [Commented] (HIVEMALL-280) Support lift/confidence/support UDF

2019-11-21 Thread Makoto Yui (Jira)


[ 
https://issues.apache.org/jira/browse/HIVEMALL-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979532#comment-16979532
 ] 

Makoto Yui commented on HIVEMALL-280:
-

[~dharmikt] 

Yes, it would be appreciated.



[jira] [Commented] (HIVEMALL-280) Support lift/confidence/support UDF

2019-11-21 Thread Dharmik Thakkar (Jira)


[ 
https://issues.apache.org/jira/browse/HIVEMALL-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979429#comment-16979429
 ] 

Dharmik Thakkar commented on HIVEMALL-280:
--

[~myui] Can I work on this?
