On Sunday, November 9, 2014 at 23:50 +0000, John Myles White wrote:
> FWIW, I think the best way to move forward with NamedArrays is to
> replace NamedArrays with a parametric type Named{T} that wraps around
> other AbstractArray types. That gives you both named Array and named
> DataArray objects for the same cost.
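
As a rough illustration of the wrapper idea quoted above, here is a minimal sketch in current Julia syntax (the thread predates it). The field name `elnames` and the lookup-by-name `getindex` method are assumptions made for this example only; this is not the NamedArrays.jl API.

```julia
# Hypothetical sketch: a parametric wrapper around any AbstractArray type.
# `Named` follows the name proposed above; everything else is illustrative.
struct Named{A<:AbstractArray}
    data::A                           # the wrapped array (Array, DataArray, ...)
    elnames::Vector{Vector{String}}   # one vector of element names per dimension
end

# Forward basic queries to the wrapped array.
Base.size(a::Named) = size(a.data)
Base.getindex(a::Named, i::Int...) = a.data[i...]

# Look up an element by giving one name per dimension.
function Base.getindex(a::Named, keys::String...)
    # Translate each name into its position along the matching dimension.
    idx = ntuple(d -> findfirst(isequal(keys[d]), a.elnames[d]), length(keys))
    return a.data[idx...]
end

m = Named([1 2 3; 4 5 6], [["a", "b"], ["x", "y", "z"]])
m["b", "y"]   # same element as m[2, 2]
```

Because the wrapped type is a parameter, the same code would cover any array type that supports integer indexing, which is the point of the suggestion.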
Yeah, looks like a good idea. Duplicating the code for each array type
would be a waste.

Regards

> On Nov 9, 2014, at 5:49 PM, Tim Holy wrote:
>
> > Indeed, better to use a Dict if you're naming each row/column. I'd
> > forgotten that was part of NamedArrays.
> >
> > --Tim
> >
> > On Sunday, November 09, 2014 06:11:44 PM Milan Bouchet-Valat wrote:
> >> On Sunday, November 9, 2014 at 10:54 -0600, Tim Holy wrote:
> >>> With regard to arrays with named dimensions, I suspect that with the
> >>> arrival of stagedfunctions, something like NamedAxesArrays
> >>> (https://github.com/timholy/NamedAxesArrays.jl) may be a good choice.
> >>> But stagedfunctions still have some show-stopper bugs, and we need to
> >>> fix those first.
> >>
> >> Interesting package!
> >>
> >> But when I said "named dimensions", I actually meant that dimensions had
> >> names, but that elements on each dimension (rows, columns...) had names
> >> too. I'm not sure it also makes sense to use staged functions to
> >> specialize code on element names, since they can vary much more than
> >> dimension names. This could generate quite a lot of methods which would
> >> use memory even if only used once.
> >>
> >> Regards
> >>
> >>> On Sunday, November 09, 2014 05:10:06 PM Milan Bouchet-Valat wrote:
> >>>> On Sunday, November 9, 2014 at 07:52 -0800, David van Leeuwen wrote:
> >>>>> I would vote for calling such a function `table()`, to get even
> >>>>> closer to R's table().
> >>>>
> >>>> Well, that's the debate at
> >>>> https://github.com/JuliaStats/StatsBase.jl/issues/32
> >>>>
> >>>> At first I was in favor of table() too, but now I prefer freqtable(),
> >>>> because "table" could mean any kind of cross-tabulation. I think
> >>>> NamedArray could even be called Table.
> >>>>
> >>>>> And I can't wait for such functionality to be included in METADATA...
> >>>>
> >>>> Actually I didn't do it because NamedArrays.jl didn't work well on 0.3
> >>>> when I first worked on the package. Now I see the tests are still
> >>>> failing. Do you know what is needed to make them work?
> >>>>
> >>>> Another point is that I think this deserves going into StatsBase, but
> >>>> before that we need everybody to agree on a design for NamedArrays.
> >>>>
> >>>> Regards
> >>>>
> >>>>> On Sunday, November 9, 2014 4:26:45 PM UTC+1, Milan Bouchet-Valat
> >>>>> wrote:
> >>>>>     On Thursday, November 6, 2014 at 11:17 -0800, Conrad Stack
> >>>>>     wrote:
> >>>>>> I was also looking for a function like this, but could not
> >>>>>> find one in docs.julialang.org. I was doing this
> >>>>>> (v0.4.0-dev), for anyone who is interested:
> >>>>>>
> >>>>>>     example = rand(1:10,100)
> >>>>>>     uexample = sort(unique(example))
> >>>>>>     counts = map(x->count(y->x==y,example),uexample)
> >>>>>>
> >>>>>> It's pretty ugly, so thanks, Johan, for pointing out the
> >>>>>> StatsBase->countmap
> >>>>>
> >>>>>     I've also put together a small package precisely aimed at
> >>>>>     offering an equivalent of R's table():
> >>>>>     https://github.com/nalimilan/Tables.jl
> >>>>>
> >>>>>     But there's a more general issue about how to handle arrays
> >>>>>     with dimension names in Julia. NamedArrays.jl (which is used
> >>>>>     in my package) attempts to tackle this issue, but I don't
> >>>>>     think we've reached a consensus yet about the best solution.
> >>>>>
> >>>>>     Regards
> >>>>>
> >>>>>> On Sunday, August 17, 2014 9:56:29 AM UTC-4, Johan Sigfrids
> >>>>>> wrote:
> >>>>>>     I think countmap comes closest to giving you what
> >>>>>>     you want:
> >>>>>>
> >>>>>>     using StatsBase
> >>>>>>     data = sample(["a", "b", "c"], 20)
> >>>>>>     countmap(data)
> >>>>>>
> >>>>>>     Dict{ASCIIString,Int64} with 3 entries:
> >>>>>>       "c" => 3
> >>>>>>       "b" => 10
> >>>>>>       "a" => 7
> >>>>>>
> >>>>>>     On Sunday, August 17, 2014 4:45:21 PM UTC+3, Florian
> >>>>>>     Oswald wrote:
> >>>>>>         Hi
> >>>>>>
> >>>>>>         I'm looking for the best way to count how
> >>>>>>         many times a certain value x_i appears in
> >>>>>>         vector x, where x could be integers, floats,
> >>>>>>         strings. In R I would do table(x). I found
> >>>>>>         StatsBase.counts(x,k) but I'm a bit confused
> >>>>>>         by k (where k goes into 1:k, i.e. the vector
> >>>>>>         is scanned to find how many elements are located
> >>>>>>         at each point of 1:k). Most of the time I
> >>>>>>         don't know k, and in fact I would do
> >>>>>>         table(x) just to find out what k was. Apart
> >>>>>>         from that, I don't think I could use this
> >>>>>>         with strings, as I can't construct a range
> >>>>>>         object from strings.
> >>>>>>
> >>>>>>         I'm wondering whether a method
> >>>>>>         StatsBase.counts(x::Vector) just returning
> >>>>>>         the frequency of each element appearing
> >>>>>>         would be useful.
> >>>>>>
> >>>>>>         The same applies to Base.hist if I
> >>>>>>         understand correctly. I just don't want to
> >>>>>>         have to specify the edges of bins.
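
For the original question at the bottom of this thread (counting how often each value occurs without knowing k in advance), the idea can be sketched in plain Julia with a Dict. The function name `freq` is hypothetical and chosen only for this example; `StatsBase.countmap`, mentioned above, is the existing function that serves this purpose.

```julia
# Hypothetical helper: count occurrences of each distinct value in a vector.
# Works for integers, floats and strings alike, with no range or bin edges.
function freq(x::AbstractVector)
    counts = Dict{eltype(x),Int}()
    for v in x
        # Start from 0 the first time a value is seen, then increment.
        counts[v] = get(counts, v, 0) + 1
    end
    return counts
end

c = freq(["a", "b", "a", "c", "a"])
c["a"]   # number of times "a" appears
```

A Dict is unordered, so sorting the keys (for example `sort(collect(keys(c)))`) would be needed for output that resembles R's table().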
