[GitHub] purijatin commented on issue #23549: [SPARK-26616][MLlib] Expose document frequency in IDFModel

GitBox Fri, 18 Jan 2019 08:35:57 -0800

purijatin commented on issue #23549: [SPARK-26616][MLlib] Expose document 
frequency in IDFModel
URL: https://github.com/apache/spark/pull/23549#issuecomment-455608054
 
 
   OK, so I need some guidance here. Below are the errors:
   
   > [error]  * method this(org.apache.spark.mllib.linalg.Vector)Unit in class org.apache.spark.mllib.feature.IDFModel does not have a correspondent in current version
   > [error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.feature.IDFModel.this")
   > [error]  * method idf()org.apache.spark.mllib.linalg.Vector in class org.apache.spark.mllib.feature.IDF#DocumentFrequencyAggregator has a different result type in current version, where it is scala.Tuple3 rather than org.apache.spark.mllib.linalg.Vector
   > [error]    filter with: ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.mllib.feature.IDF#DocumentFrequencyAggregator.idf")
   
   There seem to be two problems:
   1) The constructor of `IDFModel` has changed
   2) `IDF.DocumentFrequencyAggregator#idf()` now returns a different result type
   
   Should the fix be:
   1) For problem 1, add a constructor such as:
   
   ```scala
     def this(idf: Vector) = this(idf, new Array[Long](0), 0L)
   ```
   I don't like this fix, though. It adds baggage that will be misleading going forward.
   
   2) For problem 2, revert to the old signature and introduce getters for `docFreq` and `numDocs`?
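
   To make the two options concrete, here is a minimal self-contained sketch. `SketchIDFModel` and `SketchDocFreqAggregator` are hypothetical stand-ins, not Spark's actual classes. Plain arrays replace `Vector`, the parameter order `(idf, docFreq, numDocs)` is assumed from the constructor call above, and the smoothed formula `log((m + 1) / (df + 1))` is assumed to match what the aggregator computes.

   ```scala
   // Hypothetical stand-in for IDFModel: the primary constructor carries the
   // new fields, the auxiliary one mirrors the old single-argument form.
   class SketchIDFModel(
       val idf: Array[Double],
       val docFreq: Array[Long],
       val numDocs: Long) {

     // Option 1: keeps old call sites compiling, but any model built this
     // way has an empty docFreq and numDocs == 0.
     def this(idf: Array[Double]) = this(idf, new Array[Long](0), 0L)
   }

   // Hypothetical stand-in for IDF.DocumentFrequencyAggregator.
   class SketchDocFreqAggregator(minDocFreq: Int) {
     private var m = 0L                  // number of documents seen so far
     private var df = Array.empty[Long]  // per-term document frequency

     // Count each term at most once per document.
     def add(termCounts: Array[Double]): this.type = {
       if (df.isEmpty) df = new Array[Long](termCounts.length)
       var i = 0
       while (i < termCounts.length) {
         if (termCounts(i) > 0) df(i) += 1L
         i += 1
       }
       m += 1L
       this
     }

     // Option 2: idf() keeps its old result type (weights only).
     // Terms seen in fewer than minDocFreq documents get weight 0.
     def idf(): Array[Double] = df.map { f =>
       if (f >= minDocFreq) math.log((m + 1.0) / (f + 1.0)) else 0.0
     }

     // The new information is exposed through separate getters instead.
     def docFreq: Array[Long] = df.clone()
     def numDocs: Long = m
   }
   ```

   With the getters in place, a caller could build the full model as `new SketchIDFModel(agg.idf(), agg.docFreq, agg.numDocs)`, leaving the result type of `idf()` untouched.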
   
   As for `This patch does not merge cleanly.`: I have synced my fork. That should fix the problem, right?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

