Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1032#issuecomment-162821875
  
    How about a base class that declares both kinds of statistics values, with
    child classes that implement only the values that apply to them?
    
    ```scala
    abstract class FieldStats {
      // statistics values for discrete fields
      def entropy: Double =
        throw new NotImplementedError("entropy cannot be accessed for continuous fields")
      def gini: Double =
        throw new NotImplementedError("gini cannot be accessed for continuous fields")
      def categoryCounts: Map[Double, Int] =
        throw new NotImplementedError("categoryCounts cannot be accessed for continuous fields")
    
      // statistics values for continuous fields
      def max: Double =
        throw new NotImplementedError("max cannot be accessed for discrete fields")
      def min: Double =
        throw new NotImplementedError("min cannot be accessed for discrete fields")
      def mean: Double =
        throw new NotImplementedError("mean cannot be accessed for discrete fields")
      def variance: Double =
        throw new NotImplementedError("variance cannot be accessed for discrete fields")
    }
    
    class DiscreteFieldStats(
      private val counts: Map[Double, Int]
    ) extends FieldStats {
      override lazy val entropy: Double = ??? // calculation of entropy
      override lazy val gini: Double = ??? // calculation of gini
      override lazy val categoryCounts: Map[Double, Int] = ??? // calculation of categoryCounts
      override def toString: String = ??? // implementation of toString
    }
    
    class ContinuousFieldStats(
      override val max: Double,
      override val min: Double,
      override val mean: Double,
      override val variance: Double
    ) extends FieldStats {
      override def toString: String = ??? // implementation of toString
    }
    ```
    
    If a user calls a method that a derived class does not implement, 
    `scala.NotImplementedError` is thrown. Additionally, with this approach each 
    value (including gini and entropy) is calculated only once per instance, 
    either in the constructor or on first access through a `lazy val`.
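
    To make the idea concrete, here is a minimal, self-contained sketch of the
    discrete side. It is only an illustration: the base class is trimmed to
    three members, and the entropy and gini formulas (Shannon entropy in bits,
    Gini impurity) as well as the helper names `total` and `probabilities` are
    my assumptions, not code from the pull request.

    ```scala
    abstract class FieldStats {
      def entropy: Double =
        throw new NotImplementedError("entropy cannot be accessed for continuous fields")
      def gini: Double =
        throw new NotImplementedError("gini cannot be accessed for continuous fields")
      def max: Double =
        throw new NotImplementedError("max cannot be accessed for discrete fields")
    }

    class DiscreteFieldStats(private val counts: Map[Double, Int]) extends FieldStats {
      // hypothetical helpers: total number of observations and per-category frequency
      private lazy val total: Double = counts.values.sum.toDouble
      private lazy val probabilities: Iterable[Double] =
        counts.values.filter(_ > 0).map(_ / total)

      // assumed definition: Shannon entropy in bits, evaluated once on first access
      override lazy val entropy: Double =
        -probabilities.map(p => p * math.log(p) / math.log(2)).sum

      // assumed definition: Gini impurity, evaluated once on first access
      override lazy val gini: Double =
        1.0 - probabilities.map(p => p * p).sum
    }
    ```

    With `new DiscreteFieldStats(Map(0.0 -> 2, 1.0 -> 2))`, `gini` evaluates to
    0.5, while calling `max` on the same instance throws
    `scala.NotImplementedError` from the base class.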

