On Wed, May 16, 2012 at 5:31 AM, Andreas Mueller
<[email protected]> wrote:
>
> The SequentialDataset was made for vector x vector operations. Depending
> on whether we do mini-batch or online learning, the MLP needs
> vector x matrix or matrix x matrix operations.
> In particular matrix x matrix is probably not feasible with the
> SequentialDataset, though I think even vector x matrix might be ugly
> and possibly slow, though I'm not sure there.
>
> What do you think Mathieu (and the others)?
>
I think it is worth investigating how to separate the core algorithm logic
from the parts that depend on the data representation. SGD used to be
implemented separately for dense and sparse inputs, but the rewrite based on
SequentialDataset significantly simplified the source code (Peter is the
best person to comment on this). David could start by getting the
numpy-array-based implementation right. Then, before implementing the sparse
version, he could investigate how to abstract away the
data-representation-dependent parts, either by using/extending
SequentialDataset/WeightVector or by creating his own utility classes.
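
To make the idea concrete, here is a minimal sketch of what such utility
classes could look like. It is only an illustration under my own
assumptions: DenseBatch, CSRBatch and hidden_activations are hypothetical
names, not part of SequentialDataset/WeightVector or of any existing
scikit-learn API, and the CSR container is a toy written with plain numpy.
The point is that the core routine only calls batch.dot(W) and never looks
at how the samples are stored.

```python
import numpy as np


class DenseBatch:
    """Hypothetical wrapper around a 2-D numpy array of samples."""

    def __init__(self, X):
        self.X = np.asarray(X, dtype=float)

    def dot(self, W):
        # Plain matrix x matrix product for the whole mini-batch.
        return self.X @ W


class CSRBatch:
    """Hypothetical toy CSR container (data, indices, indptr)."""

    def __init__(self, data, indices, indptr):
        self.data = np.asarray(data, dtype=float)
        self.indices = np.asarray(indices, dtype=int)
        self.indptr = np.asarray(indptr, dtype=int)

    def dot(self, W):
        n_samples = len(self.indptr) - 1
        out = np.zeros((n_samples, W.shape[1]))
        for i in range(n_samples):
            start, end = self.indptr[i], self.indptr[i + 1]
            # Only the non-zero features of row i contribute.
            out[i] = self.data[start:end] @ W[self.indices[start:end]]
        return out


def hidden_activations(batch, W, b):
    """Representation-independent core step: tanh(X W + b)."""
    return np.tanh(batch.dot(W) + b)


# The same two samples, stored densely and in CSR form.
rng = np.random.RandomState(0)
W = rng.randn(3, 4)  # 3 input features, 4 hidden units
b = np.zeros(4)

dense = DenseBatch([[1.0, 0.0, 2.0],
                    [0.0, 0.0, 3.0]])
sparse = CSRBatch(data=[1.0, 2.0, 3.0],
                  indices=[0, 2, 2],
                  indptr=[0, 2, 3])

H_dense = hidden_activations(dense, W, b)
H_sparse = hidden_activations(sparse, W, b)
```

Both calls go through the same hidden_activations code, so only the two
small dot methods would need to be written per representation. Whether
this is fast enough for the sparse case is a separate question that would
need benchmarking.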
Mathieu
PS: When it makes sense, it would be nice if we could strive to add sparse
matrix support whenever we add a new estimator.
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general