Can you give an example?
I imagine that just supporting the data structure will not give you any
speed benefit unless the algorithms are reimplemented to take advantage
of the problem structure.
Even if the output of logistic regression would be a sparse binary
vector, you'd still need to compute every entry, which would be the slow
part.
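To illustrate the point, here is a minimal sketch, not scikit-learn's actual code path. The weights `W`, the intercept `b` and the toy data are made up for illustration. It shows that a linear model has to compute every score before it can know which outputs are zero, so wrapping the result in a sparse container saves memory afterwards but not compute.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# Toy sparse input: 200 samples, 100 features, roughly 5% non-zeros.
dense = rng.random((200, 100))
X = sparse.csr_matrix(np.where(dense < 0.05, dense, 0.0))

# Hypothetical coefficients of an already-fitted linear model with 30 outputs.
W = rng.normal(size=(100, 30))
b = -3.0  # negative intercept so that few outputs are predicted positive

# The decision function is dense: all 200 * 30 scores are computed here.
scores = X @ W + b

# Only after thresholding do we know which entries are non-zero.
Y = sparse.csr_matrix(scores > 0)

print(scores.shape, Y.nnz)
```

Getting an actual speed-up would need an algorithm that can skip outputs without scoring them, which is a change to the estimator rather than to the container.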
On 7/23/19 10:36 AM, Piotr Szymański wrote:
If I could pitch in, it would be lovely, very lovely indeed, if
scikit-learn models could:
- operate on sparse data, both input and output by default
- implement some kind of sparse vector representation (as in
https://github.com/scikit-learn/scikit-learn/issues/8908 )
- perhaps have a unifying numpy.array / scipy.sparse_matrix interface
to give people some slack on jumping between [] operator conventions
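As a small illustration of the [] convention mismatch mentioned in the last item: indexing a single row of a 2-D NumPy array gives a 1-D array, while the same expression on a `scipy.sparse.csr_matrix` gives a 1 x n sparse matrix. The helper `get_row` below is hypothetical, just a sketch of the kind of shim downstream code ends up writing.

```python
import numpy as np
from scipy import sparse

dense = np.array([[0, 1, 0],
                  [2, 0, 3]])
sp = sparse.csr_matrix(dense)

row_dense = dense[0]   # 1-D ndarray, shape (3,)
row_sparse = sp[0]     # still 2-D: a 1 x 3 sparse matrix


def get_row(X, i):
    """Hypothetical helper: return row i as a 1-D ndarray for either type."""
    if sparse.issparse(X):
        return X[i].toarray().ravel()
    return np.asarray(X[i])


print(row_dense.shape, row_sparse.shape)
print(get_row(dense, 1), get_row(sp, 1))
```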
We would benefit from that strongly in scikit-multilearn, as when a
multi-output problem is transformed to a single-output problem based
on unique combinations, this representation has to be dense for
scikit-learn at the moment. We end up losing some speed there. I'm
sure other libraries, e.g. imbalanced-learn or scikit-multiflow,
would also see these as a huge thing.
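For readers unfamiliar with the transformation, here is a rough sketch of mapping unique label combinations to single class ids. This is an illustration only, not scikit-multilearn's implementation. The function name `label_powerset` and the toy matrix are assumptions. It reads the combinations straight from the CSR structure, so the label matrix itself is never densified.

```python
import numpy as np
from scipy import sparse


def label_powerset(Y):
    """Map each row of a binary label matrix to an id of its label combination."""
    Y = sparse.csr_matrix(Y, copy=True)
    Y.eliminate_zeros()  # make sure stored entries are real labels
    combos = {}          # tuple of active label indices -> class id
    y = np.empty(Y.shape[0], dtype=int)
    for i in range(Y.shape[0]):
        active = Y.indices[Y.indptr[i]:Y.indptr[i + 1]]
        key = tuple(sorted(int(j) for j in active))
        y[i] = combos.setdefault(key, len(combos))
    return y, combos


Y = sparse.csr_matrix(np.array([[1, 0, 1],
                                [0, 1, 0],
                                [1, 0, 1],
                                [0, 0, 0]]))
y, combos = label_powerset(Y)
print(y)       # [0 1 0 2]
print(combos)  # {(0, 2): 0, (1,): 1, (): 2}
```

The resulting `y` is an ordinary dense 1-D target that a single-output classifier can be fit on.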
Best,
Piotr
On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3k...@gmail.com> wrote:
Hi all.
At SciPy, Brian Granger raised a good point about their planning for
the Jupyter Project, which is the importance of long-term goals.
I think it's great that we now have a detailed short-term roadmap
(https://scikit-learn.org/dev/roadmap.html).
Given that we now have about 6(!) full time people (Oliver, Jeremy,
Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
it's realistic to achieve most of these within a year or two. We have
actually made some significant progress already.
I think now would be a good time to start thinking about a longer-term
roadmap, say 3-5 years out.
What do we want to achieve? What are realistic goals, and what are
moonshot goals?
Having a common vision and shared goals might help us with funding, but
might also help us with prioritization and motivation.
What do you think? Do you think this is important and worthwhile?
And what should our goals be?
Best,
Andy
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
--
Piotr Szymański
nied...@gmail.com