Sounds great; and sure thing, will do :-)

The basic idea I had was to implement a bidirectional index mapping names
<-> indices. This requires making sure you keep the index up to date any
time you change the data, but it seemed the easiest way forward.
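
To make the bidirectional index idea concrete, here is a minimal sketch. It
is written in Python only because it is compact to read; it is not the code
in the fork and not a core.matrix API. The class name `LabelIndex` and all of
its methods are hypothetical. What it illustrates is the bookkeeping cost
mentioned above: removing one label means updating both directions of the
mapping.

```python
class LabelIndex:
    """Bidirectional mapping between labels and integer positions.

    Hypothetical helper for illustration only.
    """

    def __init__(self, labels):
        # position -> label
        self._labels = list(labels)
        # label -> position
        self._positions = {}
        for pos, label in enumerate(self._labels):
            if label in self._positions:
                raise ValueError(f"duplicate label: {label!r}")
            self._positions[label] = pos

    def position(self, label):
        """Look up the integer position of a label."""
        return self._positions[label]

    def label(self, position):
        """Look up the label stored at an integer position."""
        return self._labels[position]

    def remove(self, label):
        """Drop a label and shift every later position down by one.

        Returns the position the label used to occupy, so the caller can
        delete the matching row or column from the underlying data.
        """
        pos = self._positions.pop(label)
        del self._labels[pos]
        for later in self._labels[pos:]:
            self._positions[later] -= 1
        return pos
```

Lookups are cheap in both directions, but `remove` has to touch every label
after the removed one. That is the kind of update that is easy to forget when
the data changes.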

My fork is here: https://github.com/metasoarous/core.matrix/commits/develop

Here are a couple of related issues:

https://github.com/mikera/core.matrix/issues/193
https://github.com/mikera/core.matrix/issues/220

Hope you can come up with something nice!

I would focus first on coming up with what seems like a nice set of
protocols, so that we can be flexible with implementations. Ideally, we'd
be able to just apply some wrapper to any core.matrix array, vector,
matrix, etc. that provides named/labeled access to the data and is
fairly seamless with the rest of the library. But you should also be able
to wrap something like Renjin's dataframes (as Daniel Slutsky mentioned;
just implement the protocols using their classes, I imagine). There might
have to be some iteration here. Like: initial protocol design -> initial
implementation -> redraft protocols -> try new implementation -> redraft
protocols, etc. I've noticed that it can be difficult to properly abstract
implementation details away from the protocol/API on the first go (though
you might have mastered this more than I :-)).
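
As a rough illustration of the "protocol first, wrapper second" approach,
the sketch below defines a small protocol and one implementation that wraps
plain nested lists. It is in Python rather than Clojure, and everything in it
is an assumption made for the example: the names `Labeled` and
`LabeledArray`, and the methods `labels` and `select`, do not come from
core.matrix or Renjin. A different backend would only need to supply the same
two methods.

```python
from typing import Hashable, Protocol, Sequence, runtime_checkable


@runtime_checkable
class Labeled(Protocol):
    """Hypothetical protocol: named access along a dimension."""

    def labels(self, dim: int) -> Sequence[Hashable]:
        ...

    def select(self, dim: int, label: Hashable):
        ...


class LabeledArray:
    """Wraps any row-indexable 2-D data without copying or subclassing it."""

    def __init__(self, data, row_labels, col_labels):
        self._data = data
        # One label list per dimension: 0 = rows, 1 = columns.
        self._labels = (list(row_labels), list(col_labels))
        # One label -> position lookup per dimension.
        self._lookup = tuple(
            {label: pos for pos, label in enumerate(labels)}
            for labels in self._labels
        )

    def labels(self, dim):
        return self._labels[dim]

    def select(self, dim, label):
        pos = self._lookup[dim][label]
        if dim == 0:
            # A whole row.
            return list(self._data[pos])
        # The same column from every row.
        return [row[pos] for row in self._data]


arr = LabeledArray([[1, 2], [3, 4]], ["r1", "r2"], ["x", "y"])
row = arr.select(0, "r2")   # the row labeled "r2"
col = arr.select(1, "y")    # the column labeled "y"
```

Code written against `Labeled` never sees how the labels are stored, which
leaves room to redraft the implementation without changing callers.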

My 2c

Good luck!

Chris



On Wed, Mar 9, 2016 at 4:29 PM, <arthur.maciejew...@gmail.com> wrote:

> Chris, thanks for the reply.
>
> It's good to know that I'm not the only one who misses this functionality!
> My goal is definitely to be compatible with Incanter and core.matrix, as
> they both seem mature, and I will never have the time to implement that
> functionality from scratch myself. I'll be studying the source of Pandas
> over the next few days, as I want to have a good idea of how they implement
> their dataframes before starting on the Clojure version. My long-term goal
> is for future authors to look to this set of core tools for data analysis
> as the basis for any packages they build.
>
> If you'd like to publish whatever you've written (hacked up code is ok),
> I'll take a look at that as a starting point, or at least as one possible
> design.
>
> - Arthur
>
>
>
> On Wednesday, March 9, 2016 at 6:47:44 PM UTC-5, Christopher Small wrote:
>>
>>
>> If you're going to do any work in this area, I would highly encourage you
>> to do it as part of the core.matrix library. That is what Incanter is or
>> will be using for its dataset implementation. But it's nice that those
>> abstractions and implementations be separate from Incanter itself, since
>> Incanter is a rather large dependency.
>>
>> Core.matrix is certainly (in my eyes) becoming the de facto matrix
>> computation library in the Clojure ecosystem, and I think in the level of
>> interop between different implementations there, and extent of utilization
>> by the Clojure community, we rival the Python offerings. However, while
>> core.matrix has some dataset protocols, api functions and basic
>> implementations, there's still some work to get the full expressiveness of
>> the data.frame pattern as seen in R and Pandas. Specifically, there is no
>> support for setting rownames (or arbitrary "name" assignments beyond that
>> of a single dimension (columns...)). This is something I started working on
>> a while back, but wasn't able to finish. I could potentially push what I
>> came up with to a fork, but unfortunately, I don't have any more time to
>> work on the problem at the moment.
>>
>> Mike Anderson is a great project maintainer, and will probably be happy
>> to help guide you in stitching together a solution.
>>
>> Best
>>
>> Chris
>>
>>
>>
>>
>>
>> On Wednesday, March 9, 2016 at 12:57:31 PM UTC-8, arthur.ma...@gmail.com
>> wrote:
>>>
>>> Is there any desire or need for a Clojure DataFrame?
>>>
>>>
>>> By DataFrame, I mean a structure similar to R's data.frame, and Python's
>>> pandas.DataFrame.
>>>
>>> Incanter's DataSet may already be fulfilling this purpose, and if so,
>>> I'd like to know if and how people are using it.
>>>
>>> From quickly researching, I see that some prior work has been done in
>>> this space, such as:
>>>
>>> * https://github.com/cardillo/joinery
>>> * https://github.com/mattrepl/data-frame
>>> * http://spark.apache.org/docs/latest/sql-programming-guide.html#dataframes
>>>
>>> Rather than going off and creating a competing implementation (
>>> https://xkcd.com/927/), I'd like to know if anyone here is actively
>>> working on, or would like to work on a DataFrame and related utilities for
>>> Clojure (and by extension Java)? Is it something that's sorely needed, or
>>> is everybody happy with using Incanter or some other library that I'm not
>>> aware of? If there's already a de facto standard out there, would anyone
>>> care to please point it out?
>>>
>>> As background information:
>>>
>>> My specific use-case is in NLP and ML, where I often explore and
>>> prototype in Python, but I'm then left to deal with a smattering of
>>> libraries on the JVM (Mallet, Weka, Mahout, ND4J, DeepLearning4j, CoreNLP,
>>> etc.), each with their own ad-hoc implementations of algorithms, matrices,
>>> and utilities for reading data. It would be great to have a unified way to
>>> explore my data in the Clojure REPL, and then serve the same code and
>>> models in production.
>>>
>>> I would love for Clojure to have a broadly compatible ecosystem similar
>>> to Python's Numpy/Pandas/Scikit-*/Scipy/matplotlib/GenSim, etc. Core.Matrix
>>> and Incanter appear to fulfill a large chunk of those roles, but I am not
>>> aware if they've yet become the de facto standards in the community.
>>>
>>> Any feedback is greatly appreciated.
>>>
>> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "Clojure" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/clojure/4a_f1-xboOY/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
