In case anyone is interested, I decided to just use a Matrix. When I need 
the data sorted, I call sortrows and then copy each column into a values 
vector and a weights vector, respectively, for further calculations. I 
thought this was easier, and I noticed that sortrows keeps the two columns 
"together" (in the sense that, upon sorting, the entries of columns A and B 
in each row stay paired).
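
A minimal sketch of that approach, using made-up numbers (the matrix and the 
names data, vals and wts are illustrative and not taken from the linked 
file). On current Julia versions sortrows has been replaced by 
sortslices(A, dims=1), so the sketch uses that call; on the Julia versions 
current at the time of this thread, sortrows(data) should behave the same way:

```julia
# Illustrative data: column 1 holds the values, column 2 holds the weights.
data = [3.0 0.2;
        1.0 0.5;
        2.0 0.3]

# Sort whole rows lexicographically (by value first, then by weight).
# Each row moves as a unit, so every weight stays with its value.
sorted = sortslices(data, dims=1)

# Copy the columns out into separate vectors for further calculations.
vals = sorted[:, 1]
wts  = sorted[:, 2]
```

Because the rows are compared as a whole, ties in the value column are broken 
by the weight column.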

You can see the result 
here: https://github.com/pazzo83/QuantLib.jl/blob/master/src/math/statistics.jl

If anyone has a possibly more efficient way of doing this, please share!

- Chris

On Monday, March 7, 2016 at 7:39:45 PM UTC-5, Christopher Alexander wrote:
>
> Many thanks, these are very helpful!  Yes, the two vectors are going to be 
> of the same type.  The StructsOfArrays package is interesting too, as let's 
> say you have a construction like this:
>
> arr = StructOfArrays(Pair{Float64, Float64}, 100)
>
> You can go ahead, and populate this as you need.  If you need all the 
> firsts or seconds of the pairs, you can access the object's array param 
> (arr.arrays), and all the firsts are in the first array, and all the 
> seconds are in the second array.  I noticed that at a certain size, the 
> sort algo must change, and I needed to override the resize! method for the 
> StructOfArrays type.  I will compare the speed vs some of these other 
> options.
>
> Thanks!!
>
> On Monday, March 7, 2016 at 6:13:22 PM UTC-5, tshort wrote:
>>
>> There are several options to "keep things together", particularly with 
>> vectors of the same type:
>>
>> - DataFrame columns -- watch how you use columns to keep type stability
>>
>> - Nx2 Array
>>
>> - Nx2 NamedArray:
>>     https://github.com/davidavdav/NamedArrays.jl
>>
>> - AxisArrays:
>>     https://github.com/mbauman/AxisArrays.jl
>>
>>
>> On Mon, Mar 7, 2016 at 5:11 PM, Christopher Alexander wrote:
>>
>>> Yeah, I was thinking about two different vectors, but then if I did any 
>>> sorting, the value vector and weight vector would be out of sync.  I'll 
>>> check out this StructsOfArrays package.
>>>
>>> Thanks!
>>>
>>> Chris
>>>
>>> On Monday, March 7, 2016 at 5:03:14 PM UTC-5, tshort wrote:
>>>>
>>>> It depends on what "various weighted statistical calculations" 
>>>> involves. I'd start with two vectors, `x` and `w`. If you really need them 
>>>> to be coupled tightly, you could define an immutable type to hold the 
>>>> value and the weight, but the two separate vectors can be faster for some 
>>>> operations. Also, see:
>>>>
>>>> https://github.com/simonster/StructsOfArrays.jl
>>>>
>>>> On Mon, Mar 7, 2016 at 4:50 PM, Christopher Alexander wrote:
>>>>
>>>>> Hello all, I need to create a structure where I keep track of pairs of 
>>>>> value => weight so that I can do various weighted statistical 
>>>>> calculations.
>>>>>
>>>>> I know that StatsBase has a weights vector, which I plan on using, but 
>>>>> as it is set up, the weights vector is disassociated from the values to 
>>>>> which the weights are to be applied.
>>>>>
>>>>> I need the mapping that "Pair" provides, but I've noticed that there 
>>>>> is no easy way, if I have an array of pairs, to grab all the first values 
>>>>> or all the second values (like you can do with a dict in grabbing keys or 
>>>>> values).
>>>>>
>>>>> I've tried to do something like map(first, my_array_of_pairs), but 
>>>>> this is about 10x slower than if you have a dictionary of value => weight 
>>>>> and just ask for the keys.  I actually tried to use a dict at first, but 
>>>>> ran into issues with duplicate values (they were overwriting each other 
>>>>> because the value was the key).
>>>>>
>>>>> Any suggestions, or any better way to manipulate an array of Pairs?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Chris
>>>>>
>>>>
>>>>
>>
