Re: [PERFORM] index structure for 114-dimension vector

2007-05-01 Thread Andrew Lazarus
Let me just thank the list, especially for the references. (I found similar papers myself with Google: and to think I have a university library alumni card and barely need it any more!) I'll write again on the sorts of results I get.

Re: [PERFORM] index structure for 114-dimension vector

2007-05-01 Thread Alexander Staubo
On 5/1/07, Andrew Lazarus [EMAIL PROTECTED] wrote: Let me just thank the list, especially for the references. (I found similar papers myself with Google: and to think I have a university library alumni card and barely need it any more!) I'll write again on the sorts of results I get. Looking

Re: [PERFORM] index structure for 114-dimension vector

2007-04-27 Thread Arjen van der Meijden
On 21-4-2007 1:42 Mark Kirkwood wrote: I don't think that will work for the vector norm, i.e.: |x - y| = sqrt(sum over j ((x[j] - y[j])^2)) I don't know if this is useful here, but I was able to rewrite that algorithm for a set of very sparse vectors (i.e. they had very little overlapping
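
For readers following the formula quoted above, here is a minimal Python sketch of the norm |x - y|, plus one possible way to evaluate it for sparse vectors stored as index-to-value mappings. The function names and the dict representation are assumptions made for illustration; the message is truncated, so this is not necessarily the rewrite the poster had in mind.

```python
import math


def euclidean_distance(x, y):
    """Return |x - y| = sqrt(sum over j of (x[j] - y[j])^2) for dense vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def sparse_euclidean_distance(x, y):
    """Same norm for sparse vectors given as {index: value} dicts.

    Hypothetical representation: any index missing from a dict is treated
    as 0.0, so only indices present in at least one vector are visited.
    """
    keys = x.keys() | y.keys()
    return math.sqrt(sum((x.get(k, 0.0) - y.get(k, 0.0)) ** 2 for k in keys))
```

With this representation the cost of one comparison depends on the number of non-zero entries rather than on the full dimension, which is the usual motivation for treating sparse vectors separately.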

Re: [PERFORM] index structure for 114-dimension vector

2007-04-26 Thread C Storm
On Apr 20, 12:07 pm, [EMAIL PROTECTED] (Andrew Lazarus) wrote: I have a table with 2.5 million real[] arrays. (They are points in a time series.) Given a new array X, I'd like to find, say, the 25 closest to X in some sense--for simplification, let's just say in the usual vector norm. Speed is

Re: [PERFORM] index structure for 114-dimension vector

2007-04-26 Thread Alexander Staubo
On 4/20/07, Andrew Lazarus [EMAIL PROTECTED] wrote: I have a table with 2.5 million real[] arrays. (They are points in a time series.) Given a new array X, I'd like to find, say, the 25 closest to X in some sense--for simplification, let's just say in the usual vector norm. Speed is critical

Re: [PERFORM] index structure for 114-dimension vector

2007-04-26 Thread Oleg Bartunov
On Fri, 27 Apr 2007, Alexander Staubo wrote: On 4/20/07, Andrew Lazarus [EMAIL PROTECTED] wrote: I have a table with 2.5 million real[] arrays. (They are points in a time series.) Given a new array X, I'd like to find, say, the 25 closest to X in some sense--for simplification, let's just say

[PERFORM] index structure for 114-dimension vector

2007-04-20 Thread Andrew Lazarus
I have a table with 2.5 million real[] arrays. (They are points in a time series.) Given a new array X, I'd like to find, say, the 25 closest to X in some sense--for simplification, let's just say in the usual vector norm. Speed is critical here, and everything I have tried has been too slow. I
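
As a point of reference for the problem stated above, the following Python sketch shows the brute-force baseline: scan every array, compute its distance to X, and keep the 25 smallest. It is only an illustration of what any index would have to beat, not something proposed in the thread; the function name `k_nearest` and the in-memory list of points are assumptions.

```python
import heapq
import math


def k_nearest(points, query, k=25):
    """Return the k (distance, index) pairs closest to `query`, nearest first.

    `points` is any iterable of equal-length numeric sequences. Every point
    is visited once, so the cost grows linearly with the number of points.
    """
    def dist(p):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, query)))

    # nsmallest keeps only k candidates in memory while scanning.
    return heapq.nsmallest(k, ((dist(p), i) for i, p in enumerate(points)))
```

A call such as `k_nearest(rows, x)` returns at most 25 pairs; with 2.5 million rows of 114 values each, the full scan is the cost the original question is trying to avoid.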

Re: [PERFORM] index structure for 114-dimension vector

2007-04-20 Thread Jeff Davis
On Fri, 2007-04-20 at 12:07 -0700, Andrew Lazarus wrote: I have a table with 2.5 million real[] arrays. (They are points in a time series.) Given a new array X, I'd like to find, say, the 25 closest to X in some sense--for simplification, let's just say in the usual vector norm. Speed is

Re: [PERFORM] index structure for 114-dimension vector

2007-04-20 Thread Mark Kirkwood
Jeff Davis wrote: On Fri, 2007-04-20 at 12:07 -0700, Andrew Lazarus wrote: I have a table with 2.5 million real[] arrays. (They are points in a time series.) Given a new array X, I'd like to find, say, the 25 closest to X in some sense--for simplification, let's just say in the usual vector

Re: [PERFORM] index structure for 114-dimension vector

2007-04-20 Thread Andrew Lazarus
Because I know the 25 closest are going to be fairly close in each coordinate, I did try a multicolumn index on the last 6 columns and used a +/- 0.1 or 0.2 tolerance on each. (The 25 best are very probably inside that hypercube on the distribution of data in question.) This hypercube tended to
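
The hypercube idea described above can be sketched in Python as a two-step search: first keep only the points whose last 6 coordinates lie within +/- a tolerance of the query, then rank the survivors by full distance. This is a rough in-memory illustration under assumed names (`hypercube_candidates`, `nearest_in_hypercube`); in the poster's setup the first step would be served by the multicolumn index rather than a Python loop. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math


def hypercube_candidates(points, query, tol=0.1, last_n=6):
    """Indices of points whose last `last_n` coordinates are within tol of query."""
    q_tail = query[-last_n:]
    return [
        i
        for i, p in enumerate(points)
        if all(abs(a - b) <= tol for a, b in zip(p[-last_n:], q_tail))
    ]


def nearest_in_hypercube(points, query, k=25, tol=0.1, last_n=6):
    """Rank only the hypercube survivors by full Euclidean distance."""
    candidates = hypercube_candidates(points, query, tol, last_n)
    ranked = sorted(candidates, key=lambda i: math.dist(points[i], query))
    return ranked[:k]
```

The filter is a heuristic: a true nearest neighbour that falls outside the hypercube is missed, and if fewer than `k` points survive, fewer than `k` are returned. Widening `tol` trades speed for recall.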

Re: [PERFORM] index structure for 114-dimension vector

2007-04-20 Thread Mark Kirkwood
Andrew Lazarus wrote: Because I know the 25 closest are going to be fairly close in each coordinate, I did try a multicolumn index on the last 6 columns and used a +/- 0.1 or 0.2 tolerance on each. (The 25 best are very probably inside that hypercube on the distribution of data in question.)

Re: [PERFORM] index structure for 114-dimension vector

2007-04-20 Thread Tom Lane
Andrew Lazarus [EMAIL PROTECTED] writes: Because I know the 25 closest are going to be fairly close in each coordinate, I did try a multicolumn index on the last 6 columns and used a +/- 0.1 or 0.2 tolerance on each. (The 25 best are very probably inside that hypercube on the distribution of