Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13656#discussion_r66928786
  
    --- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/fpm/AssociationRules.scala ---
    @@ -120,6 +120,13 @@ object AssociationRules {
         @Since("1.5.0")
         def confidence: Double = freqUnion.toDouble / freqAntecedent
     
    +    /**
    +     * Returns the support of the rule. Current implementation would 
return the number of
    +     * co-occurrence of antecedent and consequent.
    +     */
    +    @Since("2.1.0")
    +    def support: Double = freqUnion.toDouble
    +
    --- End diff --
    
    Dunno, that seems like a mistake to me. It should be a `Long` if it's a 
count, and should expose alternative factory methods to accept input of 
different types if needed. Overloading one argument seems like a hack and I'd 
prefer not to extend it (or fix it).
    
    See SPARK-15930 which concerns adding the input size just for this reason, 
I assume. We haven't released 2.0, and so could in theory still put in a change 
to the constructor. I agree, we might however have to deprecate the existing 
one, add a new one, and still deal with calls to the old constructor, which 
would mean it's not possible to compute values that are a fraction of the whole 
data set. This in turn may argue for clearly separating inputs/outputs that are 
counts vs percentages.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to