purijatin commented on issue #23549: [SPARK-26616][MLlib] Expose document frequency in IDFModel
URL: https://github.com/apache/spark/pull/23549#issuecomment-455608054

Ok, so I need some guidance here. Below is the error:

> [error] * method this(org.apache.spark.mllib.linalg.Vector)Unit in class org.apache.spark.mllib.feature.IDFModel does not have a correspondent in current version
> [error] filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.feature.IDFModel.this")
> [error] * method idf()org.apache.spark.mllib.linalg.Vector in class org.apache.spark.mllib.feature.IDF#DocumentFrequencyAggregator has a different result type in current version, where it is scala.Tuple3 rather than org.apache.spark.mllib.linalg.Vector
> [error] filter with: ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.mllib.feature.IDF#DocumentFrequencyAggregator.idf")

There seem to be 2 problems:
1) The constructor of `IDFModel` has changed
2) `IDF.DocumentFrequencyAggregator#idf()` now returns a different result type

Does the fix need me to do the following?
1) To fix problem 1, add a constructor:
```scala
def this(idf: Vector) = this(idf, new Array[Long](0), 0L)
```
I don't like this fix though. It adds baggage that is misleading going forward.
2) To fix problem 2, revert to the old signature and introduce getters for `docFreq` and `numDocs`?

As for `This patch does not merge cleanly.`, I have synced my forked repo. That should fix the problem, right?
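
To make the two options above concrete, here is a minimal standalone sketch. It does not use Spark's actual classes: `IDFModelSketch`, `DocumentFrequencyAggregatorSketch`, and the `add` method are hypothetical stand-ins, and `Array[Double]` replaces `org.apache.spark.mllib.linalg.Vector`. The smoothed formula `log((m + 1) / (df + 1))` is the one in the MLlib documentation. Treat this as an illustration of the shape of each fix, not as the code in the PR.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
class IDFModelSketch(
    val idf: Array[Double],
    val docFreq: Array[Long],
    val numDocs: Long) {

  // Option 1: keep the old one-argument constructor so existing callers still link.
  // The placeholder values (empty docFreq, numDocs = 0) are the misleading part.
  def this(idf: Array[Double]) = this(idf, new Array[Long](0), 0L)
}

class DocumentFrequencyAggregatorSketch(size: Int) {
  private var m = 0L                      // number of documents seen
  private val df = new Array[Long](size)  // per-term document frequency

  // Count one document; termIndices holds the distinct term indices it contains.
  def add(termIndices: Seq[Int]): this.type = {
    termIndices.foreach(i => df(i) += 1L)
    m += 1L
    this
  }

  // Option 2: idf() keeps its original result type ...
  def idf(): Array[Double] = df.map(d => math.log((m + 1.0) / (d + 1.0)))

  // ... and separate getters expose the extra statistics.
  def docFreq: Array[Long] = df.clone()
  def numDocs: Long = m
}

object IDFSketchDemo {
  def main(args: Array[String]): Unit = {
    val agg = new DocumentFrequencyAggregatorSketch(3)
    agg.add(Seq(0, 1))
    agg.add(Seq(0))

    val model = new IDFModelSketch(agg.idf(), agg.docFreq, agg.numDocs)
    val legacy = new IDFModelSketch(agg.idf())

    println(model.docFreq.mkString(","))  // prints "2,1,0"
    println(legacy.numDocs)               // prints "0", although 2 documents were seen
  }
}
```

The `legacy` instance shows the concern with option 1: a model built through the old constructor reports zero documents even though its `idf` values came from real data.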
