GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/22236#discussion_r213523521
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -217,7 +222,9 @@ object FPGrowth extends DefaultParamsReadable[FPGrowth] {
@Experimental
class FPGrowthModel private[ml] (
@Since("2.2.0") override val uid: String,
- @Since("2.2.0") @transient val freqItemsets: DataFrame)
+ @Since("2.2.0") @transient val freqItemsets: DataFrame,
+ private val itemSupport: scala.collection.Map[Any, Double],
--- End diff --
I suppose there's no way to add the item's generic type here; it really lives
in the schema of the DataFrame. Does `Map[_, Double]` also work? I don't think
this needs to change, it's just a side question.
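To make the side question concrete, here is a minimal sketch, not code from this PR. `ItemSupportTypes`, `totalSupport` and `supportOf` are hypothetical names used only for illustration. It relies on `scala.collection.Map` being invariant in its key type: a `Map[String, Double]` can be passed where `Map[_, Double]` is expected, but not where `Map[Any, Double]` is expected. On the other hand, as far as I can tell, a lookup with an untyped key only type-checks against `Map[Any, Double]`.

```scala
import scala.collection.Map

object ItemSupportTypes {
  // Hypothetical helper: the key type is unknown, which is fine as long as
  // we only read the values.
  def totalSupport(m: Map[_, Double]): Double = m.values.sum

  // Hypothetical helper: looking up an untyped key needs Map[Any, Double];
  // `m.get(item)` would not type-check against Map[_, Double].
  def supportOf(m: Map[Any, Double], item: Any): Option[Double] = m.get(item)

  def main(args: Array[String]): Unit = {
    val typed: Map[String, Double] = Map("a" -> 0.5, "b" -> 0.25)

    // Compiles: Map[String, Double] conforms to Map[_, Double].
    println(totalSupport(typed))

    // Map is invariant in its key type, so the keys must be widened
    // explicitly before `typed` can be used as a Map[Any, Double].
    val untyped: Map[Any, Double] = typed.map { case (k, v) => ((k: Any), v) }
    println(supportOf(untyped, "a"))
  }
}
```

So the wildcard form would probably only be enough if the map is never queried by key.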
If you have the support for every item, do you also need the overall count
here? The item counts have already been divided by numTrainingRecords at this
point, and below only itemSupport is actually passed on anywhere else.
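A rough sketch of why the count looks redundant to me, again not code from this PR. `SupportSketch`, `toSupport` and `lift` are hypothetical names, and I am assuming the supports are consumed by something like the textbook lift formula, confidence divided by the support of the consequent. Once each count has been divided by numTrainingRecords, that kind of calculation needs only the resulting supports.

```scala
object SupportSketch {
  // Hypothetical: turn raw item counts into relative supports by dividing
  // each count by the number of training records.
  def toSupport(counts: Map[String, Long], numTrainingRecords: Long): Map[String, Double] =
    counts.map { case (item, c) => (item, c.toDouble / numTrainingRecords) }

  // Textbook lift of a rule: confidence / support(consequent).
  // Only the support is needed; the raw record count does not appear.
  def lift(confidence: Double, consequentSupport: Double): Double =
    confidence / consequentSupport

  def main(args: Array[String]): Unit = {
    val itemSupport = toSupport(Map("a" -> 2L, "b" -> 1L), numTrainingRecords = 4L)
    println(lift(confidence = 0.5, consequentSupport = itemSupport("b")))
  }
}
```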