Dear Patrick, A bit late, but I took your advice and produced two files that implements the k nearest neighbor weight matrix algorithm using the Haversine formula.
I'm attaching these for your perusal in case you find them interesting or more importantly a bug... Thank you again for helping in this matter. Regards, Don On Thursday, July 3, 2014 12:34:28 AM UTC-4, Patrick O'Leary wrote: > > (/me removes Julia hat, replaces with navigation hat) > > That distance measurement probably isn't going to do quite what you want > it to do, particularly over such a large area. It's an okay approximation > in CONUS, but biased--1 degree latitude isn't the same distance as 1 degree > longitude. For points further north this gets worse. You have a few options > for more accuracy: > > You can put all of your coordinates in a local level frame. All the > necessary formulas and constants are conveniently provided in MATLAB's > Aerospace Blockset documentation: > http://www.mathworks.com/help/aeroblks/llatoflatearth.html. Choose > something like the centroid lat/lon as the "initial" values, which will act > as the origin of the reference frame. Note that lat/lon will need to be > converted to radians. > > For longer distances, you will get better results with something like the > haversine formula, which accounts for the curvature of the earth ( > https://en.wikipedia.org/wiki/Haversine_formula). That would probably > make for a decent contribution to Distance.jl, and is pretty easy to > implement. > > For (substantial!) extra credit, you could implement a version that > accounts for the flattening of the ellipsoid ( > https://en.wikipedia.org/wiki/Geodesics_on_an_ellipsoid, > https://en.wikipedia.org/wiki/Vincenty%27s_formulae). That would probably > not be accepted by Distance.jl, and is unlikely to be worthwhile for your > application. > > On Wednesday, July 2, 2014 7:05:21 PM UTC-5, Donald Lacombe wrote: >> >> Actually, another issue with the type information. >> >> Instead of using random coordinates, I used actual latitude and longitude >> coordinates from a CSV file like so: >> >> data = readcsv("ct_coord.csv"); >> >> temp = [xc';yc'] >> >> 2x8 Array{Any,2}: >> >> -73.3712 -72.1065 -73.2453 -71.9876 … -72.7328 -72.5231 -72.8999 >> >> 41.225 41.4667 41.7925 41.83 41.8064 41.4354 41.3488 >> >> >> Now, the type here is {Any,2} and this seems to be giving the Distance >> package problems: >> >> >> >> R = pairwise(Euclidean(),temp) >> >> MethodError(fptype,(Any,)) >> >> >> Julia seems to be unhappy with this type but I'm not sure how to correct >> this which would seem to require that the data being read in are of Float >> type. >> >> >> Again, thank you for all of the help. >> >> >> Don >> >> >> >> On Wednesday, July 2, 2014 7:10:09 PM UTC-4, Donald Lacombe wrote: >>> >>> Patrick, >>> >>> Thank you for the reply! I will try your suggestion and hopefully get a >>> better grasp of these issues. >>> >>> Thank you all gain for all of the help. It cannot be stressed enough >>> that the friendly and truly helpful comments will help the Julia community >>> grow. >>> >>> Don >>> >>> On Wednesday, July 2, 2014 5:31:55 PM UTC-4, Patrick O'Leary wrote: >>>> >>>> The second parameter to the Array type is the number of dimensions of >>>> the array. "Row vectors" are implemented with 2-dimensional arrays >>>> (a.k.a., >>>> "matrices"), since Vectors (a.k.a. Array{T, 1} for all types T) are >>>> interpreted as columns. >>>> >>>> Since pairwise() generates all of the distances simultaneously, you'll >>>> need to slice its results to use it in sortperm(), following your original >>>> loop implementation. Each slice will need to be a Vector. If you slice >>>> columns, you get Vector slices for free. If you slice rows, the shape will >>>> be maintained, so you get a row back. You can flatten this to a Vector >>>> with >>>> the vec() function. >>>> >>>> Hope this helps! >>>> >>>> On Wednesday, July 2, 2014 4:23:52 PM UTC-5, Donald Lacombe wrote: >>>>> >>>>> After thinking about this issue, another couple of question came to >>>>> mind. >>>>> >>>>> When I create a row vector, say xc = randn(1,5), the results indicate >>>>> that the vector is e.g. >>>>> >>>>> xc = rand(1,5) >>>>> >>>>> 1x5 Array{Float64,2}: >>>>> >>>>> 0.989421 0.718148 0.499914 0.13024 0.939856 >>>>> >>>>> >>>>> The type is {Float64,2}. >>>>> >>>>> >>>>> Questions: >>>>> >>>>> >>>>> 1) What does the "2" mean in {Float64,2}? If I create xc = rand(5) the >>>>> type is {Float64,1}. >>>>> >>>>> >>>>> 2) The sortperm function only seems to accept {Float64,1} types. The >>>>> Euclidean distance code create types of {Float64,2} but I need >>>>> {Float64,1} for the sortperm function. Do I need to convert these before >>>>> using this command? >>>>> >>>>> >>>>> I'm probably missing something really basic so apologies for the query. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Don >>>>> >>>>> >>>>> On Wednesday, July 2, 2014 4:20:55 PM UTC-4, Donald Lacombe wrote: >>>>>> >>>>>> Greetings, part 2... >>>>>> >>>>>> After doing some tests, I found that this code works just fine: >>>>>> >>>>>> julia> using Distance >>>>>> >>>>>> Warning: could not import Base.foldl into NumericExtensions >>>>>> >>>>>> Warning: could not import Base.foldr into NumericExtensions >>>>>> >>>>>> Warning: could not import Base.sum! into NumericExtensions >>>>>> >>>>>> Warning: could not import Base.maximum! into NumericExtensions >>>>>> >>>>>> Warning: could not import Base.minimum! into NumericExtensions >>>>>> >>>>>> >>>>>> >>>>>> julia> xc = rand(8)' >>>>>> >>>>>> 1x8 Array{Float64,2}: >>>>>> >>>>>> 0.30107 0.479169 0.230607 0.65126 0.386921 0.455316 0.921496 >>>>>> 0.244873 >>>>>> >>>>>> >>>>>> julia> yc = rand(8)' >>>>>> >>>>>> 1x8 Array{Float64,2}: >>>>>> >>>>>> 0.866199 0.767794 0.76163 0.262925 … 0.742765 0.980952 0.424966 >>>>>> >>>>>> >>>>>> julia> temp = [xc ; yc] >>>>>> >>>>>> 2x8 Array{Float64,2}: >>>>>> >>>>>> 0.30107 0.479169 0.230607 0.65126 … 0.455316 0.921496 0.244873 >>>>>> >>>>>> 0.866199 0.767794 0.76163 0.262925 0.742765 0.980952 0.424966 >>>>>> >>>>>> >>>>>> julia> R = pairwise(Euclidean(),temp) >>>>>> >>>>>> 8x8 Array{Float64,2}: >>>>>> >>>>>> 0.0 0.203477 0.126094 0.697548 … 0.197554 0.630949 >>>>>> 0.444797 >>>>>> >>>>>> 0.203477 0.0 0.248638 0.533393 0.0345751 0.491009 >>>>>> 0.415241 >>>>>> >>>>>> 0.126094 0.248638 0.0 0.652423 0.225499 0.724865 >>>>>> 0.336966 >>>>>> >>>>>> 0.697548 0.533393 0.652423 0.0 0.518306 0.767196 >>>>>> 0.437502 >>>>>> >>>>>> 0.376201 0.283308 0.304834 0.355027 0.252287 0.719136 >>>>>> 0.160613 >>>>>> >>>>>> 0.197554 0.0345751 0.225499 0.518306 … 0.0 0.523505 >>>>>> 0.381159 >>>>>> >>>>>> 0.630949 0.491009 0.724865 0.767196 0.523505 0.0 >>>>>> 0.87575 >>>>>> >>>>>> 0.444797 0.415241 0.336966 0.437502 0.381159 0.87575 0.0 >>>>>> >>>>>> >>>>>> >>>>>> As expected, the diagonal elements are 0 and everything seems to work >>>>>> fine although I did not test other distance formulas. >>>>>> >>>>>> I'm posing this so others can see the results and possibly make use >>>>>> of this themselves. >>>>>> >>>>>> Is there a central repository for user submitted solutions or >>>>>> examples? If so I'd like to post this so others can see or alternatively >>>>>> someone else is free to use this (and the original code I posted) for >>>>>> future reference. >>>>>> >>>>>> Thank you all again for your help! I'm going to be coding some models >>>>>> this summer and will probably be back to bug everyone again. >>>>>> >>>>>> Thanks, >>>>>> Don >>>>>> >>>>>> On Wednesday, July 2, 2014 10:22:58 AM UTC-4, Donald Lacombe wrote: >>>>>>> >>>>>>> Greetings! >>>>>>> >>>>>>> I am a new Julia user and have the following issue. I am writing >>>>>>> code to calculate a knn spatial weight matrix to estimate a spatial >>>>>>> econometric model. Using the MATLAB code from Jim LeSage's Econometrics >>>>>>> Toolbox, I converted that code into Julia as follows: >>>>>>> >>>>>>> xc = rand(8); >>>>>>> >>>>>>> yc = rand(8); >>>>>>> >>>>>>> n = length(xc); >>>>>>> >>>>>>> k = 2; >>>>>>> >>>>>>> distance = zeros(n); >>>>>>> >>>>>>> nnlist = zeros(n,k); >>>>>>> >>>>>>> tempw = zeros(n,n); >>>>>>> >>>>>>> >>>>>>> for i=1:n; >>>>>>> >>>>>>> xi = xc[i,1]; >>>>>>> >>>>>>> yi = yc[i,1]; >>>>>>> >>>>>>> distance = (xc - xi*ones(n)).^2 + (yc - yi*ones(n)).^2 >>>>>>> >>>>>>> temp = sortperm(distance) >>>>>>> >>>>>>> nnlist[i,1:k] = temp[2:k+1,1]'; >>>>>>> >>>>>>> end >>>>>>> >>>>>>> >>>>>>> for i=1:n >>>>>>> >>>>>>> tempw[i,nnlist[i,:]] = 1; >>>>>>> >>>>>>> end >>>>>>> >>>>>>> >>>>>>> W = tempw/k; >>>>>>> >>>>>>> >>>>>>> This is a "toy" example and I was wondering if I can use the Distance >>>>>>> package to simplify the distance = (xc - xi*ones(n)).^2 + (yc - >>>>>>> yi*ones(n)).^2 formula. I tried using the pairwise option like so: >>>>>>> >>>>>>> >>>>>>> R = pairwise(Euclidean(),xc,yc) but received the following message: >>>>>>> >>>>>>> >>>>>>> R = pairwise(Euclidean(),xc,yc) >>>>>>> >>>>>>> MethodError(pairwise,(Euclidean(),[0.05961066617957589,0.018538084399339905,0.39282193332224646,0.7006919213133509,0.5099836895629475,0.8448415935222402,0.2985674570217043,0.8022287058003177],[0.5808687231553928,0.9655167324458858,0.026306556019434435,0.6565373244339141,0.11927452074471412,0.11873635450496622,0.6271632933770979,0.7081439899673692])) >>>>>>> >>>>>>> I'd like to be able to utilize the Distance package but am a bit >>>>>>> stumped. The code as written works but it's bugging me that I cannot >>>>>>> seem to get the above command to work. I also get the following error >>>>>>> when loading Distance: >>>>>>> >>>>>>> >>>>>>> using Distance >>>>>>> >>>>>>> Warning: could not import Base.foldl into NumericExtensions >>>>>>> >>>>>>> Warning: could not import Base.foldr into NumericExtensions >>>>>>> >>>>>>> Warning: could not import Base.sum! into NumericExtensions >>>>>>> >>>>>>> Warning: could not import Base.maximum! into NumericExtensions >>>>>>> >>>>>>> Warning: could not import Base.minimum! into NumericExtensions >>>>>>> >>>>>>> >>>>>>> If there is anyone who can address this I'd greatly appreciate it. >>>>>>> Incidentally, the help on this group is one reason I am making the >>>>>>> change. >>>>>>> >>>>>>> >>>>>>> Don >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>
knn_weight_matrix.jl
Description: Binary data
haversine.jl
Description: Binary data
