[ 
https://issues.apache.org/jira/browse/SPARK-20180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953160#comment-15953160
 ] 

Cyril de Vogelaere commented on SPARK-20180:
--------------------------------------------

Can you not just set a very large max, like Int.MaxValue or similar?
=> Yes, I said that in the fourth paragraph of my last comment. A careful user 
could always set it to Int.MaxValue and never run into problems in practice. 
Still, that doesn't change the fact that I advocate for the special value (0) 
as the default, since it would be nice if, on the first run and no matter the 
dataset, all solution patterns were found, even those longer than 10.
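
For illustration, here is a minimal sketch of that Int.MaxValue workaround, 
assuming the spark.mllib PrefixSpan API. The toy sequences, the application 
name, the local master and the 0.5 minimum support are made up for the 
example; they are not taken from the issue.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.fpm.PrefixSpan

object UnlimitedPrefixSpanSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder app name and local master, for the sketch only.
    val conf = new SparkConf().setAppName("prefixspan-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Toy dataset (an assumption): each sequence is an array of itemsets.
    val sequences = sc.parallelize(Seq(
      Array(Array(1, 2), Array(3)),
      Array(Array(1), Array(3, 2), Array(1, 2)),
      Array(Array(1, 2), Array(5)),
      Array(Array(6))
    ), 2).cache()

    // Workaround: lift the default cap of 10 by passing the largest Int,
    // so that pattern length is unbounded in practice.
    val model = new PrefixSpan()
      .setMinSupport(0.5)
      .setMaxPatternLength(Int.MaxValue)
      .run(sequences)

    // Print every frequent sequence together with its frequency.
    model.freqSequences.collect().foreach { fs =>
      val pattern = fs.sequence.map(_.mkString("[", ", ", "]")).mkString("[", ", ", "]")
      println(s"$pattern, ${fs.freq}")
    }

    sc.stop()
  }
}
```

With the proposed change, the same result would presumably be obtained by 
passing 0, or by not calling setMaxPatternLength() at all.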

It's not normal for tests to run more than a couple hours. You need to see why. 
Is your test of unlimited max pattern stuck?
=> It's not my test per se; it's the dev/run-tests suite, which we are asked to 
run before creating a pull request. I ran it with my few changes and it took a 
day and a half. I'm now re-running it on the current state of the lib, without 
my changes, and it doesn't seem any faster so far.
So I'm pretty sure I didn't break anything there. For now the errors seem the 
same too, but I haven't looked at them closely.

> Unlimited max pattern length in Prefix span
> -------------------------------------------
>
>                 Key: SPARK-20180
>                 URL: https://issues.apache.org/jira/browse/SPARK-20180
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.1.0
>            Reporter: Cyril de Vogelaere
>            Priority: Minor
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Right now, we need to use the .setMaxPatternLength() method to
> specify the maximum pattern length of a sequence. Any pattern longer than 
> that won't be output.
> The current default maxPatternLength value is 10.
> This should be changed so that with input 0, patterns of any length would 
> be output. Additionally, the default value should be changed to 0, so that 
> a new user could find all patterns in his dataset without looking at this 
> parameter.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
