RE: [HACKERS] RFC: CRC datatype

2000-12-08 Thread Christopher Kings-Lynne
Therefore, I propose defining new data types like "CRC32", "CRC64", "RIPEMD", whatever (rather than pluggable arbitrary CRCs). I suspect that you are really looking at the problem from the wrong end. CRC checking should not need to be done by the database user, with a fancy type. The postgres

Re: [HACKERS] RFC: CRC datatype

2000-12-08 Thread Horst Herb
I suspect that you are really looking at the problem from the wrong end. CRC checking should not need to be done by the database user, with a fancy type. The postgres server itself should guarantee data integrity - you shouldn't have to worry about it in userland. I agree in principle.

Re: [HACKERS] pre-beta is slow

2000-12-08 Thread Hannu Krosing
"Mikheev, Vadim" wrote: recently I have downloaded a pre-beta postgresql, I found insert and update speed is slower then 7.0.3, even I turn of sync flag, it is still slow than 7.0, why? How much slower do you see it to be ? how can I make it faster? Try to compare 7.0.3 7.1beta

Re: [HACKERS] Indexing for geographic objects?

2000-12-08 Thread Oleg Bartunov
Hi, We've done some work with GiST indices and found a little problem with optimizer. The problem could be reproduced with Gene's code (link is in original message below). test data and sql I could send - it's just 52Kb gzipped file. What is a reason for optimizer to decide that sequential scan

Re: [HACKERS] Indexing for geographic objects?

2000-12-08 Thread Tom Lane
Oleg Bartunov [EMAIL PROTECTED] writes: We've done some work with GiST indices and found a little problem with optimizer. test=# set enable_seqscan = off; SET VARIABLE test=# explain select * from test where s @ '1.05 .. 3.95'; NOTICE: QUERY PLAN: Index Scan using test_seg_ix on test

[HACKERS] OK, does anyone have any better ideas?

2000-12-08 Thread mlw
I have a working version of a text search engine. I want to make it work for Postgres (I will be releasing it GPL). It can literally find the occurrence of a string of words within 5 million records in a few milliseconds. It is very fast, it works similarly to many web search engines. I have

Re: [HACKERS] RFC: CRC datatype

2000-12-08 Thread Tom Lane
"Horst Herb" [EMAIL PROTECTED] writes: AFAIK the thread for "built in" crcs referred only to CRCs in the transaction log. This here is a different thing. CRCs in the transaction log are crucial to proof integrity of the log, CRCs as datatype are neccessary to proof integrity of database

[GENERAL] Trip to Japan

2000-12-08 Thread Bruce Momjian
I have just returned from a seven-day trip to Japan. I spoke for seven hours to three separate groups, totalling 200 people. I spoke to a Linux Conference, a PostgreSQL user's group, and to SRA, a PostgreSQL support company. You can get more information on my home page under "Writings". Here

Re: [HACKERS] Indexing for geographic objects?

2000-12-08 Thread Oleg Bartunov
On Fri, 8 Dec 2000, Tom Lane wrote: Date: Fri, 08 Dec 2000 10:47:37 -0500 From: Tom Lane [EMAIL PROTECTED] To: Oleg Bartunov [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], 'pgsql-hackers ' [EMAIL PROTECTED] Subject: Re: [HACKERS] Indexing for geographic objects? Oleg Bartunov [EMAIL

[HACKERS] Re: Threads

2000-12-08 Thread Fabrizio Manfredi
Hi all Small intrusion in the Threads discussion 1) don' t forget other OS .. the linux is not the only one (for now :-) ) For example check the performance under Solaris http://www2.linuxjournal.com/lj-issues/issue70/3184.html 2) unfortunatly some platforms that had no threads

Re: [HACKERS] RFC: CRC datatype

2000-12-08 Thread Tom Lane
"Horst Herb" [EMAIL PROTECTED] writes: Surely you don't trust your TCP connection to the server, either? TCP _IS_ heavily checksummed. Yes, and so are the disk drives that you are asserting you don't trust. My point is that in both cases, there are lots and lots of failure mechanisms that

Re: [HACKERS] Indexing for geographic objects?

2000-12-08 Thread Tom Lane
Oleg Bartunov [EMAIL PROTECTED] writes: 5000 looks like a suspiciously round number ... how many rows are in the table? Have you done a vacuum analyze on it? about 10,000 rows, So the thing is estimating 0.5 selectivity, which is a fallback for operators it knows nothing whatever about. [

Re: [HACKERS] Indexing for geographic objects?

2000-12-08 Thread The Hermit Hacker
On Fri, 8 Dec 2000, Oleg Bartunov wrote: On Fri, 8 Dec 2000, The Hermit Hacker wrote: Date: Fri, 8 Dec 2000 12:19:56 -0400 (AST) From: The Hermit Hacker [EMAIL PROTECTED] To: Tom Lane [EMAIL PROTECTED] Cc: Oleg Bartunov [EMAIL PROTECTED], [EMAIL PROTECTED], 'pgsql-hackers '

RE: [HACKERS] pre-beta is slow

2000-12-08 Thread Mikheev, Vadim
Try to compare 7.0.3 7.1beta in multi-user environment. As I understand it you claim it to be faster in multi-user environment ? Could you give some brief technical background why is it so and why must it make single-user slower ? Because of commit in 7.1 does fsync, with ot without

Re: [HACKERS] pre-beta is slow

2000-12-08 Thread Tom Lane
"Mikheev, Vadim" [EMAIL PROTECTED] writes: Because of commit in 7.1 does fsync, with ot without -F (we can discuss and change this), but in multi-user env a number of commits can be made with single fsync. I was planning to ask why you disabled the -F switch. Seems to me that people who

Re: CRC was: Re: [HACKERS] beta testing version

2000-12-08 Thread Bruce Guenter
On Fri, Dec 08, 2000 at 01:58:12PM -0500, Tom Lane wrote: Bruce Guenter [EMAIL PROTECTED] writes: ... Taking an arbitrary 32 bits of a MD5 would likely be less collision prone than using a 32-bit CRC, and it appears faster as well. ... but that would be an algorithm that you know

Re: [HACKERS] Indexing for geographic objects?

2000-12-08 Thread Oleg Bartunov
On Fri, 8 Dec 2000, Tom Lane wrote: Date: Fri, 08 Dec 2000 12:59:27 -0500 From: Tom Lane [EMAIL PROTECTED] To: Oleg Bartunov [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], 'pgsql-hackers ' [EMAIL PROTECTED] Subject: Re: [HACKERS] Indexing for geographic objects? Oleg Bartunov [EMAIL

RE: [HACKERS] pre-beta is slow

2000-12-08 Thread Mikheev, Vadim
Because of commit in 7.1 does fsync, with ot without -F (we can discuss and change this), but in multi-user env a number of commits can be made with single fsync. I was planning to ask why you disabled the -F switch. Seems to me that people who trusted their OS+hardware before would

Re: CRC was: Re: [HACKERS] beta testing version

2000-12-08 Thread Tom Lane
Bruce Guenter [EMAIL PROTECTED] writes: MD5 is a cryptographic hash, which means (AFAIK) that ideally it is impossible to produce a collision using any other method than brute force attempts. True but irrelevant. What we need to worry about is the probability that a random error will be

RE: [HACKERS] pre-beta is slow

2000-12-08 Thread Mikheev, Vadim
I just didn't care about -F functionality, sorry. I agreed that we should resurrect it. OK. Do you want to work on that, or shall I? In near future I'll be busy doing CRC + "physical log" things... Vadim

Re: [HACKERS] Re: CRC

2000-12-08 Thread Bruce Guenter
On Fri, Dec 08, 2000 at 11:10:19AM -0800, Nathan Myers wrote: This is very interesting. MD4 is faster than MD5. (MD5, described as "MD4 with suspenders on", does some extra stuff to protect against more- obscure attacks, of no interest to us.) Which 64-bit CRC code did you use, Mark

Re: [HACKERS] Re: CRC

2000-12-08 Thread Tom Lane
Bruce Guenter [EMAIL PROTECTED] writes: Are you really saying MD5 was faster than CRC-32? Yes. I expect it's because the operations used in MD5 are easily parallelized, and operate on blocks of 64-bytes at a time, while the CRC is mostly non-parallelizable, uses a table lookup, and operates

[HACKERS] Hash index on macaddr - crash

2000-12-08 Thread Tom Lane
It was just pointed out on pggeneral that hash indexes on macaddr columns don't work. Looking into it, I find that someone (me :-() made a booboo: pg_amproc claims that hashvarlena is the appropriate hash function for macaddr --- but macaddr isn't a varlena type, it's a fixed-length

RE: [HACKERS] Hash index on macaddr - crash

2000-12-08 Thread Mikheev, Vadim
We could fix this either by adding a new hash function to support macaddr, or by removing the pg_amXXX entries that claim macaddr is hashable. Either change will not take effect without an initdb, however, and I'm loath to force one now that we've started beta. If we're going to add CRC to

RE: [HACKERS] Hash index on macaddr - crash

2000-12-08 Thread Mikheev, Vadim
We could fix this either by adding a new hash function to support macaddr, or by removing the pg_amXXX entries that claim macaddr is hashable. Either change will not take effect without an initdb, however, and I'm loath to force one now that we've started beta. If we're going to add

RE: [HACKERS] Hash index on macaddr - crash

2000-12-08 Thread Darren King
We could fix this either by adding a new hash function to support macaddr, or by removing the pg_amXXX entries that claim macaddr is hashable. Either change will not take effect without an initdb, however, and I'm loath to force one now that we've started beta. How about creating an SQL

Re: [HACKERS] Re: CRC

2000-12-08 Thread Bruce Guenter
On Fri, Dec 08, 2000 at 04:30:58PM -0500, Tom Lane wrote: Bruce Guenter [EMAIL PROTECTED] writes: Are you really saying MD5 was faster than CRC-32? Yes. I expect it's because the operations used in MD5 are easily parallelized, and operate on blocks of 64-bytes at a time, while the CRC

[HACKERS] European Datestyle

2000-12-08 Thread Daniele Orlandi
Hello, Why is this happening ? ctonet=# show datestyle; NOTICE: DateStyle is ISO with European conventions SHOW VARIABLE ctonet=# select creation_date from users limit 1; creation_date 2000-12-07 04:40:23+01 ^^ Datestyle has been set either with

Re: [HACKERS] European Datestyle

2000-12-08 Thread Trond Eivind GlomsrØd
Daniele Orlandi [EMAIL PROTECTED] writes: Hello, Why is this happening ? ctonet=# show datestyle; NOTICE: DateStyle is ISO with European conventions SHOW VARIABLE ctonet=# select creation_date from users limit 1; creation_date 2000-12-07

Re: [HACKERS] Re: CRC

2000-12-08 Thread Tom Lane
Bruce Guenter [EMAIL PROTECTED] writes: Would you like to see the simple benchmarking setup I used? The amount of code involved (once all the hashes are factored in) is fairly large, so I'm somewhat hesitant to just send it to the mailing list. I agree, don't send it to the whole list. But

[HACKERS] 7.0.3(nofsync) vs 7.1

2000-12-08 Thread Mikheev, Vadim
I've run tests (with 50 .. 250 simult users) for some PG project of my company. 7.1 was 3 times faster than 7.0.3 (fsync) but near 3 times slower than 7.0.3 (nofsync). It was not the best day in my life - WAL looked like big bottleneck -:( But finally I've realized that this test makes ~3 FK

Re: [HACKERS] 7.0.3(nofsync) vs 7.1

2000-12-08 Thread Tom Lane
"Mikheev, Vadim" [EMAIL PROTECTED] writes: So, I've run simple test (below) to check this. Seems that 7.1 is faster than 7.0.3 (nofsync), and that SELECT FOR UPDATE in RI triggers is quite bad for performance. Also, we should add new TODO item: implement dirty reads and use them in RI

Re: [HACKERS] OK, does anyone have any better ideas?

2000-12-08 Thread mlw
Tom Lane wrote: mlw [EMAIL PROTECTED] writes: I have a working version of a text search engine. I want to make it work for Postgres (I will be releasing it GPL). It can literally find the occurrence of a string of words within 5 million records in a few milliseconds. Where are the

RE: [HACKERS] 7.0.3(nofsync) vs 7.1

2000-12-08 Thread Mikheev, Vadim
So, I've run simple test (below) to check this. Seems that 7.1 is faster than 7.0.3 (nofsync), and that SELECT FOR UPDATE in RI triggers is quite bad for performance. Also, we should add new TODO item: implement dirty reads and use them in RI triggers. That would fix RI triggers, I

[GENERAL] Re: [HACKERS] Bug in index scans with Locale support enabled

2000-12-08 Thread Tom Lane
Barry Lind [EMAIL PROTECTED] writes: Now the real problem comes in when either the like or regex operators are used in a sql statement. Right. As of 7.1beta1 we are dealing with this by suppressing LIKE/regex index optimization in all locales other than "C". That's a pretty crude answer but

[HACKERS] Bug in index scans with Locale support enabled

2000-12-08 Thread Barry Lind
In researching a problem I have uncovered the following bug in index scans when Locale support is enabled. Given a 7.0.3 postgres installation built with Locale support enabled and a default US RedHat 7.0 Linux installation (meaning that the LANG environment variable is set to en_US) to enable

[HACKERS] OSDN database summit

2000-12-08 Thread Bruce Momjian
Here is a late report on the OSDN database summit. It was great to meet so many PostgreSQL users, and to meet the major developers of MySQL, Interbase, and Sleepycat. We clearly have many of the same hopes and concerns for open-source databases. PostgreSQL had half of all attendees. Seems we

Re: [HACKERS] Trip to Japan

2000-12-08 Thread Tatsuo Ishii
Bruce, what was the camera ? No idea. It was not mine. I brought a video camera, and made 30 minutes of video for my family and company. I don't know how to make an MP3 of that. My wife wants a digital camera now, so it looks like I will have one soon. :-) Mine is a Nikon

RE: [HACKERS] 7.0.3(nofsync) vs 7.1

2000-12-08 Thread Mikheev, Vadim
I have only one explanation: it reduces number of transactions ready to commit (because of the same FK writers will wait till first one committed - ie log fsynced) and WAL commit performance greatly depends on how many commits were done by single log fsync. 7.0.3+nofsync commit

Re: [HACKERS] OK, does anyone have any better ideas?

2000-12-08 Thread Andrew Snow
Could you perhaps post the code you have for splitting a text field up into keys, then I could work on turning into a new type of index with a new operator, as Tom suggested? (Or is this already what the text search code in contrib already does??) - Andrew

[GENERAL] Japan pictures

2000-12-08 Thread Bruce Momjian
I would like to mention that I met Tatsuo Ishii and Hiroshi Inoue while in Japan. This was the first time I met them, though I have worked with them on PostgreSQL for many years. Tatsuo is really the voice of PostgreSQL in Japan. It was a real thrill. There is a picture somewhere of the

Re: [HACKERS] 7.0.3(nofsync) vs 7.1

2000-12-08 Thread Tom Lane
"Mikheev, Vadim" [EMAIL PROTECTED] writes: And we always will have to enable fsync when comparing our performance with other DBes. Of course, but when people say "it's slower than 7.0.3+nofsync" I think that turning off fsync makes a fairer comparison there. also reduce the WAL commit delay

Re: [HACKERS] Re: CRC

2000-12-08 Thread Tom Lane
Bruce Guenter [EMAIL PROTECTED] writes: I agree, don't send it to the whole list. But I'd like a copy. Here you go. As near as I could tell, the test as you have it (one CRC computation per fread) is purely I/O bound. I changed the main loop to this: int main() { static char buf[8192];

Re: [HACKERS] OK, does anyone have any better ideas?

2000-12-08 Thread mlw
Andrew Snow wrote: Could you perhaps post the code you have for splitting a text field up into keys, then I could work on turning into a new type of index with a new operator, as Tom suggested? (Or is this already what the text search code in contrib already does??) Go to a search engine

[HACKERS] CRC, hash Co.

2000-12-08 Thread Horst Herb
There have been some misconceptions in previous mails. 1.) A CRC is _not_ stronger than a hash. CRC is a subset of the hash domain, defined as "a fast error-check hash based on mod 2 polynomial operations" which has typically no crypto strength (and does not need it either for most purposes).

Re: [HACKERS] OK, does anyone have any better ideas?

2000-12-08 Thread mlw
Andrew Snow wrote: Could you perhaps post the code you have for splitting a text field up into keys, then I could work on turning into a new type of index with a new operator, as Tom suggested? (Or is this already what the text search code in contrib already does??) - Andrew OK, I

Re: [HACKERS] European Datestyle

2000-12-08 Thread Daniele Orlandi
Trond Eivind GlomsrØd wrote: 2000-12-07 04:40:23+01 ^^ That is the ISO-style, isn't it? Yes, it is; but according to the documentation (and how it used to be on other machines running PG 6.x) it should be ordered in european format, I don't know if I'm missing something

Re: [HACKERS] Re: A mb problem in PostgreSQL

2000-12-08 Thread Tatsuo Ishii
Please apply following one-line-patch and test it again. If it's ok, I will commit it to both current and stable trees. ! return (b2c3[i][1] | 0x8080U); Yes, it's OK. Thank you! But I wonder why we need to "| 0x8080U"? b2c3[][] and BIG5toCNS()'s return

Re: [HACKERS] Indexing for geographic objects?

2000-12-08 Thread selkovjr
Tom Lane wrote: Oleg Bartunov [EMAIL PROTECTED] writes: We've done some work with GiST indices and found a little problem with optimizer. test=# set enable_seqscan = off; SET VARIABLE test=# explain select * from test where s @ '1.05 .. 3.95'; NOTICE: QUERY PLAN: Index Scan

Re: [GENERAL] Japan pictures

2000-12-08 Thread Tatsuo Ishii
I would like to mention that I met Tatsuo Ishii and Hiroshi Inoue while in Japan. This was the first time I met them, though I have worked with them on PostgreSQL for many years. Tatsuo is really the voice of PostgreSQL in Japan. It was a real thrill. I was thrilled to meet with you

Re: [HACKERS] Re: A mb problem in PostgreSQL

2000-12-08 Thread Tatsuo Ishii
Please apply following one-line-patch and test it again. If it's ok, I will commit it to both current and stable trees. ! return (b2c3[i][1] | 0x8080U); Yes, it's OK. Thank you! Thanks for the testings. I will commit soon. But I wonder why we need to "|

Re: [HACKERS] Using Threads?

2000-12-08 Thread Bruce Momjian
Adam Haberlach writes: Typically (on a well-written OS, at least), the spawning of a thread is much cheaper then the creation of a new process (via fork()). This would be well worth testing on some representative sample systems. Within the past year and a half at one of my gigs some

Re: [HACKERS] Using Threads?

2000-12-08 Thread Bruce Momjian
Bruce Guenter [EMAIL PROTECTED] writes: [ some very interesting datapoints ] So, forking a process with lots of data is expensive. However, most of the PostgreSQL data is in a SysV IPC shared memory segment, which shouldn't affect the fork numbers. I believe (but don't have numbers

Re: [HACKERS] Re: CRC

2000-12-08 Thread Bruce Guenter
On Fri, Dec 08, 2000 at 09:28:38PM -0500, Tom Lane wrote: Bruce Guenter [EMAIL PROTECTED] writes: I agree, don't send it to the whole list. But I'd like a copy. Here you go. As near as I could tell, the test as you have it (one CRC computation per fread) is purely I/O bound. Nope. They

Re: [HACKERS] RFC: CRC datatype

2000-12-08 Thread Horst Herb
I think a row-level CRC is rather pointless. Perhaps it'd be a good idea to have a disk-page-level CRC, though. That would check the rows on the page *and* allow catching errors in the page and tuple overhead structures, which row-level CRCs would not cover. row level is neccessary to be

Re: CRC was: Re: [HACKERS] beta testing version

2000-12-08 Thread Tom Lane
Bruce Guenter [EMAIL PROTECTED] writes: ... Taking an arbitrary 32 bits of a MD5 would likely be less collision prone than using a 32-bit CRC, and it appears faster as well. ... but that would be an algorithm that you know NOTHING about the properties of. What is your basis for asserting

Re: AW: [HACKERS] beta testing version

2000-12-08 Thread Daniele Orlandi
Bruce Guenter wrote: CRCs are designed to catch N-bit errors (ie N bits in a row with their values flipped). N is (IIRC) the number of bits in the CRC minus one. So, a 32-bit CRC can catch all 31-bit errors. That's the only guarantee a CRC gives. Everything else has a 1 in 2^32-1 chance