Can you give an example?

I imagine that merely supporting the data structure will not give you any speed benefit unless the algorithms are reimplemented to take advantage of the problem structure. Even if the output of logistic regression were a sparse binary vector, you'd still need to compute every entry, which would be the slow part.
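To make the concern concrete, here is a minimal sketch (not scikit-learn code) of a linear multi-label predictor. The sizes, the random weights `W`, and the bias `b` are all made up for illustration. The point is only that the full dense score matrix is computed before anything can be stored sparsely:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_samples, n_features, n_labels = 1000, 50, 200

# Sparse input, dense (hypothetical) weights: one column of W per label.
X = sparse.random(n_samples, n_features, density=0.05,
                  format="csr", random_state=0)
W = rng.normal(size=(n_features, n_labels))
b = np.full(n_labels, -3.0)  # negative bias so that most labels are "off"

# Every one of the n_samples * n_labels scores is computed here,
# regardless of how few of them end up positive.
scores = X @ W + b          # dense ndarray of shape (n_samples, n_labels)
Y_dense = scores > 0

# Converting to a sparse matrix only changes the storage afterwards;
# it does not avoid any of the work above.
Y_sparse = sparse.csr_matrix(Y_dense)
```

Under these assumptions, the sparse output saves memory after the fact, but the cost of prediction is unchanged.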



On 7/23/19 10:36 AM, Piotr Szymański wrote:
If I could pitch in, it would be lovely, very lovely indeed, if scikit-learn models could:

- operate on sparse data, both input and output by default
- implement some kind of sparse vector representation (as in https://github.com/scikit-learn/scikit-learn/issues/8908 ) - perhaps have a unifying numpy.array / scipy.sparse_matrix interface to give people some slack when jumping between [] operator conventions

We would benefit strongly from that in scikit-multilearn: when a multi-output problem is transformed to a single-output problem based on unique label combinations, this representation has to be dense for scikit-learn at the moment. We end up losing some speed there. I'm sure other libraries, e.g. imbalanced-learn or scikit-multiflow, would also see this as a big improvement.
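For readers unfamiliar with the transformation mentioned above, here is a rough sketch of mapping each unique label combination to one class id while reading the label matrix in CSR form. The helper name `label_powerset_classes` is hypothetical, and this is not scikit-multilearn's actual implementation:

```python
import numpy as np
from scipy import sparse

def label_powerset_classes(Y):
    """Map each row of a binary label matrix to a class id, one id per
    unique combination of active labels (illustrative helper)."""
    Y = sparse.csr_matrix(Y, copy=True)  # copy so the caller's data is untouched
    Y.eliminate_zeros()                  # drop explicitly stored zeros
    Y.sort_indices()                     # canonical column order within each row
    combos = {}                          # tuple of active label indices -> class id
    classes = np.empty(Y.shape[0], dtype=np.int64)
    for i in range(Y.shape[0]):
        start, stop = Y.indptr[i], Y.indptr[i + 1]
        key = tuple(Y.indices[start:stop].tolist())
        classes[i] = combos.setdefault(key, len(combos))
    return classes, combos

Y = sparse.csr_matrix([[1, 0, 1],
                       [0, 1, 0],
                       [1, 0, 1],
                       [0, 0, 0]])
classes, combos = label_powerset_classes(Y)
```

The resulting `classes` vector is an ordinary dense 1-D array that can be passed as `y` to a single-output classifier.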

Best,
Piotr



On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3k...@gmail.com> wrote:

    Hi all.
    At SciPy, Brian Granger raised a good point about their planning for
    the Jupyter Project, which is the importance of long-term goals.

    I think it's great that we now have a detailed short-term roadmap
    (https://scikit-learn.org/dev/roadmap.html).
    Given that we now have about 6(!) full time people (Oliver, Jeremy,
    Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I
    think it's realistic to achieve most of these within a year or two.
    We have actually made some significant progress already.

    I think now would be a good time to start thinking about a
    longer-term roadmap, say 3-5 years out.
    What do we want to achieve? What are realistic goals, and what are
    moonshot goals?
    Having a common vision and shared goals might help us with funding,
    but might also help us with prioritization and motivation.

    What do you think? Do you think this is important and worthwhile?
    And what should our goals be?

    Best,
    Andy
    _______________________________________________
    scikit-learn mailing list
    scikit-learn@python.org <mailto:scikit-learn@python.org>
    https://mail.python.org/mailman/listinfo/scikit-learn



--
Piotr Szymański
nied...@gmail.com
