[
https://issues.apache.org/jira/browse/SPARK-20322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cyril de Vogelaere updated SPARK-20322:
---------------------------------------
Description:
The current implementation already provides a maxPatternLength parameter.
It might be nice to add a similar minPatternLength parameter.
This would give users better control over the solutions returned, and reduce
the amount of memory needed to store the solutions in the driver.
This can be implemented as a simple check before adding a solution to the
solution list; no observable difference in performance is expected.
I'm proposing this functionality here. If it looks interesting to other people,
I will implement it along with tests to verify the implementation.
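
The check described above could look roughly like the following. This is a
plain-Python sketch, not Spark's actual PrefixSpan code: the names
pattern_length and collect_patterns are hypothetical, the default bounds are
arbitrary, and pattern length is assumed to mean the total number of items
across all itemsets of the pattern.

```python
from typing import Iterable, List, Tuple

# A sequential pattern is a list of itemsets, e.g. [[1, 2], [3]].
Pattern = List[List[int]]


def pattern_length(pattern: Pattern) -> int:
    """Assumed definition: total number of items across all itemsets."""
    return sum(len(itemset) for itemset in pattern)


def collect_patterns(
    candidates: Iterable[Tuple[Pattern, int]],
    min_pattern_length: int = 1,
    max_pattern_length: int = 10,
) -> List[Tuple[Pattern, int]]:
    """Keep only (pattern, frequency) pairs whose length is within bounds."""
    if min_pattern_length < 1 or min_pattern_length > max_pattern_length:
        raise ValueError("expected 1 <= min_pattern_length <= max_pattern_length")

    solutions: List[Tuple[Pattern, int]] = []
    for pattern, freq in candidates:
        length = pattern_length(pattern)
        # The proposed check: skip patterns that are too short (or too long)
        # before they are added to the solution list.
        if min_pattern_length <= length <= max_pattern_length:
            solutions.append((pattern, freq))
    return solutions


# Hypothetical (pattern, frequency) pairs as a mining step might emit them.
candidates = [([[1]], 3), ([[1, 2]], 2), ([[1], [2], [3]], 2)]
print(collect_patterns(candidates, min_pattern_length=2))
# prints [([[1, 2]], 2), ([[1], [2], [3]], 2)]
```

Note that a lower bound can only filter what is stored: short prefixes still
have to be explored, because longer patterns are grown from them. An upper
bound, by contrast, can also stop the search from extending a prefix further.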
> MinPattern length in PrefixSpan
> -------------------------------
>
> Key: SPARK-20322
> URL: https://issues.apache.org/jira/browse/SPARK-20322
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Affects Versions: 2.1.0
> Reporter: Cyril de Vogelaere
> Priority: Trivial
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> The current implementation already provides a maxPatternLength parameter.
> It might be nice to add a similar minPatternLength parameter.
> This would give users better control over the solutions returned, and reduce
> the amount of memory needed to store the solutions in the driver.
> This can be implemented as a simple check before adding a solution to the
> solution list; no observable difference in performance is expected.
> I'm proposing this functionality here. If it looks interesting to other
> people, I will implement it along with tests to verify the implementation.