Hi,
We are trying to port some code from Mahout Logistic Regression to
MLlib Logistic Regression, and our preliminary performance tests indicate a
performance bottleneck. It is not clear to me which of three factors is
responsible:
o Comparing apples to oranges
o Inadequate tuning
o
Hi,
I am using MLlib. I use the ml vectorization tools to create the vectorized
input DataFrame for the ml/MLlib machine-learning models, with schema:
root
|-- label: double (nullable = true)
|-- features: vector (nullable = true)
To avoid repeated vectorization, I am trying to save and load
Hi,
I have a DataFrame df with a column "feature" of type SparseVector that
results from the ml library's VectorAssembler class.
I'd like to get a Dataset of SparseVectors from this column, but when I call
df.as[SparseVector], Scala complains that it doesn't know of an encoder for