Re: [HACKERS] Real-life range datasets

2012-01-10 Thread Stefan Keller
Hi, I'm proposing OpenStreetMap, which is of variable size, up to 250 GB of XML data for the whole world. It's downloadable from CloudMade.com or Geofabrik.de and can be imported into PostgreSQL using osm2pgsql. It's literally a key/value schema of the real world. I'm using the hstore option of osm2pgsql and

Re: [HACKERS] Real-life range datasets

2011-12-23 Thread Alexander Korotkov
Hello, On Thu, Dec 22, 2011 at 12:51 PM, Benedikt Grundmann bgrundm...@janestreet.com wrote: I should be able to give you a table with the same characteristics as the instruments table but bogus data by replacing all entries in the table with random strings of the same length or something

Re: [HACKERS] Real-life range datasets

2011-12-22 Thread Benedikt Grundmann
Hello, We have a table in a postgres 8.4 database that would make use of date ranges and exclusion constraints if they were available. Sadly I cannot give you the data, as it is based on data we are paying for, and as part of the relevant licenses we are obliged to not give the data to third

Re: [HACKERS] Real-life range datasets

2011-12-22 Thread Oleg Bartunov
Bene, we have a pgfoundry project, http://pgfoundry.org/projects/dbsamples/. Since your sample database is very important (for me also), I suggest using this site. Oleg On Thu, 22 Dec 2011, Benedikt Grundmann wrote: Hello, We have a table in a postgres 8.4 database that would make use of date

Re: [HACKERS] Real-life range datasets

2011-12-22 Thread David E. Wheeler
On Dec 22, 2011, at 7:48 AM, Oleg Bartunov wrote: we have a pgfoundry project, http://pgfoundry.org/projects/dbsamples/. Since your sample database is very important (for me also), I suggest using this site. Or PGXN. http://pgxn.org/ You can register an account to upload extensions like you

[HACKERS] Real-life range datasets

2011-12-20 Thread Alexander Korotkov
Hackers, For better GiST indexing of range types, it's important to have real-life datasets to test on. Real-life range datasets would help to prove (or reject) some concepts and get more realistic benchmarks. Also, it would be nice to know what queries you expect to run fast on those datasets.