[ 
https://issues.apache.org/jira/browse/SPARK-19899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905405#comment-15905405
 ] 

Maciej Szymkiewicz commented on SPARK-19899:
--------------------------------------------

In my opinion, a trait for each input category ({{Vector}}, {{array<\_>}}, 
{{array<array<\_>>}}) is the way to go. Development overhead is low (these 
things are small and easy to test), it is unlikely we'll need many more any 
time soon, and this gives us some way to communicate the expected input.
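
To illustrate how small such a trait would be, here is a rough sketch of the {{HasTransactionsCol}} mix-in suggested in the issue description, modelled on how shared params like {{HasFeaturesCol}} are typically written. The param name {{transactionsCol}}, its default value, the doc string, and the {{DemoParams}} holder class are assumptions made up for this example, not an agreed API.

```scala
import org.apache.spark.ml.param.{Param, ParamMap, Params}
import org.apache.spark.ml.util.Identifiable

// Hypothetical shared-param trait for estimators whose input is array<T>
// rather than a Vector. Name and default are illustrative assumptions.
trait HasTransactionsCol extends Params {

  final val transactionsCol: Param[String] = new Param[String](
    this, "transactionsCol", "name of the array<T> column holding transactions")

  setDefault(transactionsCol, "transactions")

  final def getTransactionsCol: String = $(transactionsCol)
}

// Hypothetical minimal holder, only here to show the trait being mixed in.
class DemoParams(override val uid: String) extends HasTransactionsCol {

  def this() = this(Identifiable.randomUID("demo"))

  def setTransactionsCol(value: String): this.type = set(transactionsCol, value)

  override def copy(extra: ParamMap): DemoParams = defaultCopy(extra)
}
```

An analogous trait for {{array<array<\_>>}} input would differ only in the param name and doc string, which is why I expect the overhead to stay low.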

I am strongly against using {{Vector}} - it is counterintuitive, requires a 
lot of additional effort, and, without any supported way of mapping from vector 
to features (I don't count {{Column}} metadata), it will significantly degrade 
the user experience. Moreover, it won't be useful for {{PrefixSpan}} at all. I 
believe that we should acknowledge that pattern mining techniques are 
significantly different from the common {{ml}} algorithms and not hesitate to 
reflect that in the API.

> FPGrowth input column naming
> ----------------------------
>
>                 Key: SPARK-19899
>                 URL: https://issues.apache.org/jira/browse/SPARK-19899
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Maciej Szymkiewicz
>
> The current implementation extends {{HasFeaturesCol}}. Personally, I find it 
> rather unfortunate. Up to this moment we have used consistent conventions - if we 
> mix in {{HasFeaturesCol}}, the {{featuresCol}} should be {{VectorUDT}}. 
> Using the same {{Param}} for an {{array<T>}} (and possibly for 
> {{array<array<T>>}} once {{PrefixSpan}} is ported to {{ml}}) will be 
> confusing for the users.
> I would like to suggest adding a new {{trait}} (let's say 
> {{HasTransactionsCol}}) to clearly indicate that the input type differs from 
> the other {{Estimators}}.




