You can use dropna() to convert a DataArray to an Array. Note that this will
also drop any missing (NA) values.
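
As a rough, untested sketch of that suggestion applied to the code quoted below: it assumes the DataArrays/DataFrames and Distance versions used in this thread, and that neither column actually contains NAs.

```julia
# Package names as they appear in this thread (2014-era API).
using DataArrays, DataFrames, Distance

data = readtable("ct_coord_2.csv", header=false)

# dropna() returns a plain Array{Float64,1} with any NA entries removed.
xc = dropna(data[:,1])
yc = dropna(data[:,2])

# Assumption: no NAs in either column. If NAs were dropped from different
# rows, xc and yc would differ in length and the concatenation would fail.
temp = [xc'; yc']   # 2x8 Array{Float64,2}, one column per point

R = pairwise(Euclidean(), temp)
```

With plain Arrays as input, pairwise should behave the same as in the readcsv version quoted below.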
On Friday, July 4, 2014 2:08:55 AM UTC+3, Donald Lacombe wrote:
>
> Patrick (and others),
>
> Another issue that has reared its ugly head is that when I read the data
> using the DataFrames package, I get the following:
>
> data = readtable("ct_coord_2.csv",header=false)
>
> 8x2 DataFrame:
>
> x1 x2
>
> [1,] -73.3712 41.225
>
> [2,] -72.1065 41.4667
>
> [3,] -73.2453 41.7925
>
> [4,] -71.9876 41.83
>
> [5,] -72.3365 41.855
>
> [6,] -72.7328 41.8064
>
> [7,] -72.5231 41.4354
>
> [8,] -72.8999 41.3488
>
>
> julia> xc = data[:,1]
>
> 8-element DataArray{Float64,1}:
>
> -73.3712
>
> -72.1065
>
> -73.2453
>
> -71.9876
>
> -72.3365
>
> -72.7328
>
> -72.5231
>
> -72.8999
>
>
> julia> yc = data[:,2]
>
> 8-element DataArray{Float64,1}:
>
> 41.225
>
> 41.4667
>
> 41.7925
>
> 41.83
>
> 41.855
>
> 41.8064
>
> 41.4354
>
> 41.3488
>
>
> julia> xc=xc'
>
> 1x8 DataArray{Float64,2}:
>
> -73.3712 -72.1065 -73.2453 -71.9876 … -72.7328 -72.5231 -72.8999
>
>
> julia> yc=yc'
>
> 1x8 DataArray{Float64,2}:
>
> 41.225 41.4667 41.7925 41.83 41.855 41.8064 41.4354 41.3488
>
>
> julia> temp = [xc;yc]
>
> 2x8 DataArray{Float64,2}:
>
> -73.3712 -72.1065 -73.2453 -71.9876 … -72.7328 -72.5231 -72.8999
>
> 41.225 41.4667 41.7925 41.83 41.8064 41.4354 41.3488
>
>
> julia> R = pairwise(Euclidean(),temp)
>
> MethodError(At_mul_B!,(
>
> 8x8 Array{Float64,2}:
>
> 2.7273e-316 2.7273e-316 2.67478e-315 … 2.7273e-316 2.7273e-316
>
> 2.67736e-315 2.67736e-315 2.67736e-315 2.72726e-316 2.72726e-316
>
> 2.67727e-315 2.67727e-315 2.67727e-315 2.67727e-315 2.67727e-315
>
> 2.67727e-315 2.67727e-315 2.67727e-315 2.67727e-315 2.67727e-315
>
> 4.94066e-324 4.94066e-324 4.94066e-324 9.88131e-324 4.94066e-324
>
> 2.76235e-318 2.76235e-318 2.76235e-318 … 2.76235e-318 2.76235e-318
>
> 4.94066e-324 4.94066e-324 4.94066e-324 9.88131e-324 4.94066e-324
>
> 4.94066e-324 4.94066e-324 4.94066e-324 9.88131e-324 4.94066e-324,
>
>
> 2x8 DataArray{Float64,2}:
>
> -73.3712 -72.1065 -73.2453 -71.9876 … -72.7328 -72.5231 -72.8999
>
> 41.225 41.4667 41.7925 41.83 41.8064 41.4354 41.3488,
>
>
> 2x8 DataArray{Float64,2}:
>
> -73.3712 -72.1065 -73.2453 -71.9876 … -72.7328 -72.5231 -72.8999
>
> 41.225 41.4667 41.7925 41.83 41.8064 41.4354 41.3488))
>
>
> I do not think that the Distance package likes the types that are passed to
> the function, i.e. the vectors are DataArrays instead of Arrays. It works
> just fine when I use Tony's idea:
>
>
> julia> data = readcsv("ct_coord_2.csv",Float64)
>
> 8x2 Array{Float64,2}:
>
> -73.3712 41.225
>
> -72.1065 41.4667
>
> -73.2453 41.7925
>
> -71.9876 41.83
>
> -72.3365 41.855
>
> -72.7328 41.8064
>
> -72.5231 41.4354
>
> -72.8999 41.3488
>
>
> julia> xc = data[:,1]
>
> 8-element Array{Float64,1}:
>
> -73.3712
>
> -72.1065
>
> -73.2453
>
> -71.9876
>
> -72.3365
>
> -72.7328
>
> -72.5231
>
> -72.8999
>
>
> julia> yc = data[:,2]
>
> 8-element Array{Float64,1}:
>
> 41.225
>
> 41.4667
>
> 41.7925
>
> 41.83
>
> 41.855
>
> 41.8064
>
> 41.4354
>
> 41.3488
>
>
> julia> xc=xc'
>
> 1x8 Array{Float64,2}:
>
> -73.3712 -72.1065 -73.2453 -71.9876 … -72.7328 -72.5231 -72.8999
>
>
> julia> yc=yc'
>
> 1x8 Array{Float64,2}:
>
> 41.225 41.4667 41.7925 41.83 41.855 41.8064 41.4354 41.3488
>
>
> julia> temp = [xc;yc]
>
> 2x8 Array{Float64,2}:
>
> -73.3712 -72.1065 -73.2453 -71.9876 … -72.7328 -72.5231 -72.8999
>
> 41.225 41.4667 41.7925 41.83 41.8064 41.4354 41.3488
>
>
> julia> R = pairwise(Euclidean(),temp)
>
> 8x8 Array{Float64,2}:
>
> 0.0 1.28762 0.581327 1.51014 … 0.863479 0.873799 0.487347
>
> 1.28762 0.0 1.18451 0.382214 0.712542 0.417808 0.802085
>
> 0.581327 1.18451 0.0 1.25833 0.512668 0.805673 0.562309
>
> 1.51014 0.382214 1.25833 0.0 0.745667 0.665227 1.03141
>
> 1.21144 0.451294 0.910982 0.349837 0.399323 0.459258 0.757372
>
> 0.863479 0.712542 0.512668 0.745667 … 0.0 0.426208 0.487124
>
> 0.873799 0.417808 0.805673 0.665227 0.426208 0.0 0.386557
>
> 0.487347 0.802085 0.562309 1.03141 0.487124 0.386557 0.0
>
>
> There seems to be some issue with the Distance package not accepting data
> read with DataFrames. Of course, readcsv works fine, but this might be an
> issue for others as well.
>
>
> Thanks,
>
> Don
>
>
>
> On Thursday, July 3, 2014 6:49:18 PM UTC-4, Patrick O'Leary wrote:
>>
>> On Thursday, July 3, 2014 5:36:23 PM UTC-5, Donald Lacombe wrote:
>>>
>>> I'm no GIS expert (I'm an applied econometrician) and the code I've
>>> written seems to work. The Distance package also works with my "real" data
>>> which are the centroids of the counties in Connecticut and I tested it with
>>> Euclidean, Cityblock, and SqEuclidean.
>>>
>>
>> Glad you got something working. Whether those distances are accurate
>> enough depends on how the points are arranged and what you plan to do with
>> them. I can see where it wouldn't make much difference in this case. I can't
>> let the statisticians and image processing folks have all the technical
>> conversation fun in this mailing list, though!
>>
>