Re: [HACKERS] Is postgres.gif missing in cvs?
Hi,

On Mon, 2007-12-03 at 20:07 -0500, Tom Lane wrote:
> You need bigger TeX settings. Please read "Building the Documentation".

Thanks for the info. Apparently those values are not enough; I'm still trying to build the PDF on my box.

BTW, I'll be happier if someone else can build the PDF from CVS HEAD... I'm planning to add that PDF to the RPMs...

Regards,
--
Devrim GÜNDÜZ, RHCE
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

signature.asc
Description: This is a digitally signed message part
Re: [HACKERS] buildenv.pl/buildenv.bat
Magnus Hagander wrote:
> On Mon, 2007-12-03 at 12:15 -0500, Andrew Dunstan wrote:
>> Magnus Hagander wrote:
>>> Hi! When you redid the msvc build stuff you seem to have missed the
>>> documentation. Specifically, I notice that buildenv.pl isn't documented --
>>> the docs still say buildenv.bat is the way to go. Also, do we now have
>>> both buildenv.bat and buildenv.pl? (I have them both in my environment,
>>> but that could be a result of being lazy)
>>
>> My recollection is that I changed the minimum amount necessary, because I
>> was expecting us to go into beta at any moment (silly me). That might be
>> why we still have both. There was an expectation that some cleanup might
>> be required during 8.4 development. I know I left other .bat files as
>> wrappers for the perl scripts, but that's obviously not appropriate here.
>> I'll see if I can adjust builddoc.bat so we can get rid of buildenv.bat.
>> As for documentation, you're probably right, I could easily have missed
>> it. I'll look into it.
>
> Great. Let me know if you need me to poke at anything.

OK, I'm thinking that the best way might be to do away with buildenv.bat altogether and replace the remaining references to it in .bat files with something like this fragment:

if not exist src\tools\msvc\buildenv.pl goto nobuildenv
perl -e "require 'src/tools/msvc/buildenv.pl'; while(($k,$v) = each %ENV) { print [EMAIL PROTECTED] $k=$v\n]; }" > tmp.bat
call tmp.bat
del tmp.bat
:nobuildenv

and adjust the docs accordingly. Thoughts?

cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] psql's describe command (for sequences) output improvement
Hi all,

I have a patch which tries to improve the '\d some_sequence_name' command output in the psql utility. Before sending the patch to pgsql-patches I just want to know your opinion on whether the new output of the command is OK or whether it needs to be modified first. For comparison, here is the output of the '\d some_sequence_name' command:

-- Say we have already created a sequence named 'test_seq' as: CREATE SEQUENCE test_seq;

-- Output: before applying the patch.

testdb=# \d test_seq
   Sequence "public.test_seq"
    Column     |  Type
---------------+---------
 sequence_name | name
 last_value    | bigint
 increment_by  | bigint
 max_value     | bigint
 min_value     | bigint
 cache_value   | bigint
 log_cnt       | bigint
 is_cycled     | boolean
 is_called     | boolean

-- Output: after applying the patch.

testdb=# \d test_seq
                                Sequence "public.test_seq"
 last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
          1 |            1 | 9223372036854775807 |         1 |           1 |       1 | f         | f

Thanks in advance for your feedback.

--
Aftab.
Re: [HACKERS] psql's describe command (for sequences) output improvement
On 04/12/2007, Aftab Hussain [EMAIL PROTECTED] wrote:
> I have a patch which tries to improve the '\d some_sequence_name' command
> output in the psql utility. Before sending the patch to pgsql-patches I
> just want to know your opinion on whether the new output of the command is
> OK or whether it needs to be modified first. [...]

Why? Isn't that what "select * from test_seq;" already does? \d commands return metadata, not data.

Peter Childs
[HACKERS] TOASTed size
I'm thinking that there isn't any way currently of working out how big a compressed toast object is? All existing functions decompress the object before we do anything to it, AFAICS. Am I missing something?

So there's no way currently of working out how good your compression is for individual values, or when you have multiple toasted columns, other than by writing a new function?

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
[HACKERS] weird - invalid string enlargement request size
A simple query:

select prec.bl_considerada_prioritaria AS "A proposta é considerada prioritária por Conselho Municipal ou Estadual"
from consultaprevia2008.tab_urb_asse_prec prec

The result:

ERROR: invalid string enlargement request size 1073741823

But if I replace the accented chars, leaving the query like this, it works:

select prec.bl_considerada_prioritaria AS "A proposta e considerada prioritaria por Conselho Municipal ou Estadual"
from consultaprevia2008.tab_urb_asse_prec prec

Weirder yet: this is part of a large query, and I use accented chars after the AS keyword in some other places, where it works OK.

My pg version: PostgreSQL 8.2.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

This is a restored backup from 8.1. In 8.1 it works OK.

[]'s
- Walter

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] weird - invalid string enlargement request size
Walter Cruz escribió:
> A simple query:
> select prec.bl_considerada_prioritaria AS "A proposta é considerada prioritária por Conselho Municipal ou Estadual"
> from consultaprevia2008.tab_urb_asse_prec prec
> The result:
> ERROR: invalid string enlargement request size 1073741823

Weird. Probably an encoding mismatch. What are your locale and encoding settings, and what encoding is the actual client using? And what client is it?

The easiest way to show the server settings is:

select name, setting from pg_settings
where name ~~ any (array['lc_%', '%_encoding']);

--
Alvaro Herrera  http://www.advogato.org/person/alvherre
"La Primavera ha venido. Nadie sabe como ha sido" (A. Machado)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] There's random access and then there's random access
Gregory Stark [EMAIL PROTECTED] writes:
> The two interfaces I'm aware of for this are posix_fadvise() and libaio.
> I've run tests with a synthetic benchmark which generates a large file then
> reads a random selection of blocks from within it, using either synchronous
> reads like we do now or either of those interfaces. I saw impressive speed
> gains on a machine with only three drives in a raid array. I did this a
> while ago so I don't have the results handy. I'll rerun the tests again and
> post them.

Here are the results of running the synthetic test program on a 3-drive raid array. Note that the results *exceeded* the 3x speedup I expected, even for ordered blocks. Either the drive (or the OS) is capable of reordering the block requests better than the offset into the file would suggest, or some other effect is kicking in.

The test is with an 8GB file, picking 8,192 random 8k blocks from within it. The pink diamonds represent the bandwidth obtained if the random blocks are sorted before fetching (like a bitmap indexscan), and the blue if they're unsorted.
inline: test-pfa-results.png

for i in 1 2 3 4 5 6 7 8 16 24 32 64 96 128 192 256 384 512 768 1024 2048 4096 8192 ; do
    ./a.out pfa2 /mnt/data/test.data 8388608 8192 $i 8192 false
done >> test-pfa-results

for i in 1 2 3 4 5 6 7 8 16 24 32 64 96 128 192 256 384 512 768 1024 2048 4096 8192 ; do
    ./a.out pfa2 /mnt/data/test.data 8388608 8192 $i 8192 true
done >> test-pfa-results

test-pfa-results
Description: Binary data

#define _XOPEN_SOURCE 600
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#define __EXTENSIONS__

#include <sys/types.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/time.h>
#include <time.h>
#include <sys/fcntl.h>
#include <errno.h>
#include <aio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#if LINUX
#define HAVE_POSIX_MEMALIGN
#else
#include <malloc.h>
#endif

#if defined(POSIX_FADV_DONTNEED) && defined(POSIX_FADV_WILLNEED)
#define HAVE_PFA
#endif

#if defined(DIRECTIO_ON) && defined(DIRECTIO_OFF)
#define HAVE_DIRECTIO
#define WITH_DIO "w/directio"
#elif defined(O_DIRECT)
#define WITH_DIO "w/O_DIRECT"
#define PLUS_DIO "+O_DIRECT"
#else
#define WITH_DIO "with buffered i/o"
#define PLUS_DIO ""
#endif

enum method { seek, pfa, pfa2, aio } method;

static unsigned work_set_size, block_size;

static void seek_scan(int fd, off_t *offset_list, unsigned noffsets);
#ifdef HAVE_PFA
static void pfa_scan(int fd, off_t *offset_list, unsigned noffsets);
static void pfa_scan2(int fd, off_t *offset_list, unsigned noffsets);
#endif
static void aio_scan(int fd, off_t *offset_list, unsigned noffsets);
static void gen_buf(off_t offset, char *buf);
static void check_buf(off_t offset, const char *read_buf);

/* qsort helper */
static int
cmp(const void *arg1, const void *arg2)
{
    off_t a = *(off_t *) arg1;
    off_t b = *(off_t *) arg2;

    if (a < b)
        return -1;
    else if (a > b)
        return 1;
    else
        return 0;
}

int
main(int argc, char *argv[])
{
    off_t file_size, sample_size, *offset_list, existing_size;
    unsigned noffsets, sorted_offsets = 0;
    const char *file_name;
    int fd;
    struct timeval before, after;
    unsigned i;
    double elapsed;

    if (argc <= 1)
        method = seek;
#ifdef HAVE_PFA
    else if (!strcmp(argv[1], "pfa"))
        method = pfa;
    else if (!strcmp(argv[1], "pfa2"))
        method = pfa2;
#endif
    else if (!strcmp(argv[1], "aio"))
        method = aio;
    else if (!strcmp(argv[1], "seek"))
        method = seek;
    else
    {
        fprintf(stderr, "usage: ./a.out [seek|pfa|pfa2|aio] [filename] [file kB] [sample blocks] [concurrent blocks] [block bytes]\n");
        exit(1);
    }

    if (argc <= 2)
        file_name = "test.data";
    else
        file_name = argv[2];

    if (argc <= 3)
        file_size = 1024 * 1024;
    else
        file_size = (off_t) 1024 * atoi(argv[3]);

    if (argc <= 4)
        sample_size = 1;
    else
        sample_size = atoi(argv[4]);

    if (argc <= 5)
        work_set_size = 128;
    else
        work_set_size = atoi(argv[5]);

    if (argc <= 6)
        block_size = 8192;
    else
        block_size = atoi(argv[6]);

    if (argc > 7)
        if (*argv[7] == 't' || atoi(argv[7]))
            sorted_offsets = 1;

    if (block_size <= 0)
    {
        fprintf(stderr, "bad block size %u\n", block_size);
        exit(1);
    }
    file_size = file_size / block_size * block_size;
    if (file_size <= 0 || sample_size <= 0)
    {
        fprintf(stderr, "bad file/sample size %llu/%llu\n",
                (long long unsigned) file_size,
                (long long unsigned) sample_size);
        exit(1);
    }

    fprintf(stderr, "reading random %lu %s %uk blocks out of %luM using %s (working set %u)\n",
            (unsigned long) sample_size,
            sorted_offsets ? "sorted" : "unordered",
            block_size,
            (unsigned long) (file_size / 1024 / 1024),
            (method == seek ? "lseek only" :
             method == pfa ? "posix_fadvise" :
             method == pfa2 ? "posix_fadvise v2" :
             method == aio ? "aio_read " WITH_DIO : "???"),
            work_set_size);

    fd = open(file_name, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
    {
        perror("open");
        exit(1);
    }
    existing_size = lseek(fd, 0, SEEK_END);
    if (existing_size ==
Re: [HACKERS] weird - invalid string enlargement request size
Yes, encoding mismatch :)

client_encoding   UNICODE
lc_collate        pt_BR.UTF-8
lc_ctype          pt_BR.UTF-8
lc_messages       pt_BR.UTF_8
lc_monetary       pt_BR.UTF_8
lc_numeric        pt_BR.UTF_8
lc_time           pt_BR.UTF_8
server_encoding   LATIN1

I'm having some trouble setting the encoding in my PHP program, but that's not a -hackers issue :) I posted on -hackers because I thought it was a bug or something.

[]'s
- Walter

On Dec 4, 2007 3:46 PM, Alvaro Herrera [EMAIL PROTECTED] wrote:
> Weird. Probably an encoding mismatch. What are your locale and encoding
> settings, and what encoding is the actual client using? And what client is
> it? [...]
Re: [HACKERS] There's random access and then there's random access
Gregory Stark wrote:
> Here's the results of running the synthetic test program on a 3-drive raid
> array. Note that the results *exceeded* the 3x speedup I expected, even for
> ordered blocks. Either the drive (or the OS) is capable of reordering the
> block requests better than the offset into the file would appear or some
> other effect is kicking in.
>
> The test is with an 8GB file, picking 8,192 random 8k blocks from within
> it. The pink diamonds represent the bandwidth obtained if the random blocks
> are sorted before fetching (like a bitmap indexscan) and the blue if
> they're unsorted.

I didn't see "exceeded 3X" in the graph. But I certainly see 2X+ for most of the graphic, and ~3X for very small reads. Possibly it is avoiding unnecessary read-ahead at the drive or OS levels?

I think we expected to see raw reads significantly faster for the single-process case. I thought your simulation was going to involve a tweak to PostgreSQL on a real query, to see what overall effect it would have on typical queries and on special queries like Matthew's. Are you able to tweak the index scan and bitmap scan methods to do posix_fadvise() before running, even if it doesn't do anything more intelligent such as you described in another post?

Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]
Re: [HACKERS] There's random access and then there's random access
Mark Mielke [EMAIL PROTECTED] writes:
> I didn't see "exceeded 3X" in the graph. But I certainly see 2X+ for most
> of the graphic, and ~3X for very small reads. Possibly it is avoiding
> unnecessary read-ahead at the drive or OS levels?

Then you're misreading the graph -- which would be my fault; my picture was only worth 500 words then. Ordered scans (simulating a bitmap index scan) get 3.8 MB/s at a prefetch of 1 (effectively no prefetch) and 14.1 MB/s with a prefetch of 64. That's a factor of 3.7! Unordered scans show an even larger effect (unsurprisingly), going from 1.6 MB/s to 8.9 MB/s, or a factor of 5.6.

Another surprising bit is that prefetching 8192 blocks, i.e., the whole set, doesn't erase the advantage of the presorting. I would have expected that when prefetching all the blocks it would make little difference what order we feed them to posix_fadvise. I guess since all the blocks which have had i/o initiated on them haven't been read in yet when we reach the first real read(), some blocks are forced to be read out of order. I'm surprised it makes nearly a 2x speed difference though.

> I think we expected to see raw reads significantly faster for the
> single-process case. I thought your simulation was going to involve a tweak
> to PostgreSQL on a real query... Are you able to tweak the index scan and
> bitmap scan methods to do posix_fadvise() before running?

That's the next step. I'm debating between two ways to structure the code right now: do I put the logic to peek ahead in nodeBitmapHeapScan, reading ahead and remembering the info seen, or in tidbitmap with a new api function which is only really useful for this one use case?

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] There's random access and then there's random access
Mark Mielke [EMAIL PROTECTED] writes:
> I didn't see "exceeded 3X" in the graph. But I certainly see 2X+ for most
> of the graphic, and ~3X for very small reads. Possibly it is avoiding
> unnecessary read-ahead at the drive or OS levels?

Ahh! I think I see how you're misreading it now. You're comparing the pink with the blue. That's not what's going on.

The X axis (which is logarithmic) is the degree of prefetch. So 1 means it's prefetching one block then immediately reading it -- effectively not prefetching at all -- and the last data point, 8192, is completely prefetching the whole data set.

The two data sets are the same tests run with ordered (i.e., like a bitmap scan) or unordered (i.e., like a regular index scan) blocks. Unsurprisingly, ordered sets read faster at low levels of prefetch, and both get faster the more blocks you prefetch. What's surprising to me is that the advantage of the ordered blocks doesn't diminish with prefetching.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [HACKERS] weird - invalid string enlargement request size
Walter Cruz [EMAIL PROTECTED] writes:
> I posted on -hackers because I thought it was a bug or something.

Yeah, I think so too. Can you extract a reproducible test case?

regards, tom lane
Re: [HACKERS] Simplifying Text Search
On Nov 15, 2007, at 4:41 AM, Gregory Stark wrote:
> Ask me about EnterpriseDB's Slony Replication support!

Hah, wtf is that all about? :)

BTW, looked at londiste?

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

smime.p7s
Description: S/MIME cryptographic signature
Re: [HACKERS] Simplifying Text Search
On Tue, 4 Dec 2007 16:41:52 -0600, Decibel! [EMAIL PROTECTED] wrote:
> On Nov 15, 2007, at 4:41 AM, Gregory Stark wrote:
>> Ask me about EnterpriseDB's Slony Replication support!
>
> Hah, wtf is that all about? :)

Stealth marketing :)...

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564  24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Simplifying Text Search
On Dec 4, 2007, at 4:48 PM, Joshua D. Drake wrote:
> On Tue, 4 Dec 2007 16:41:52 -0600, Decibel! [EMAIL PROTECTED] wrote:
>> On Nov 15, 2007, at 4:41 AM, Gregory Stark wrote:
>>> Ask me about EnterpriseDB's Slony Replication support!
>>
>> Hah, wtf is that all about? :)
>
> Stealth marketing :)...

Dammit, I need to de-program reply-all from my fingers... But yeah, I thought it funny. :)

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] There's random access and then there's random access
On Dec 4, 2007, at 1:42 PM, Gregory Stark wrote:
> I'm debating between two ways to structure the code right now: do I put the
> logic to peek ahead in nodeBitmapHeapScan, reading ahead and remembering
> the info seen, or in tidbitmap with a new api function which is only really
> useful for this one use case?

There has been discussion of having a bg_reader, similar to the bg_writer. Perhaps that would be better than creating something that's specific to bitmap scans?

Also, I would expect to see a speed improvement even on single drives if the OS is actually issuing multiple requests to the drive. Doing so allows the drive to optimally order all of the reads.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [HACKERS] There's random access and then there's random access
Decibel! [EMAIL PROTECTED] writes:
> There has been discussion of having a bg_reader, similar to the bg_writer.
> Perhaps that would be better than creating something that's specific to
> bitmap scans?

Has there? AFAICT a bg_reader only makes sense if we move to a direct-i/o situation where we're responsible for read-ahead and have to read into shared buffers any blocks we decide are interesting to read ahead.

Regardless of what mechanism is used and who is responsible for doing it, someone is going to have to figure out which blocks specifically are interesting to prefetch. Bitmap index scans happen to be the easiest, since we've already built up a list of blocks we plan to read. Somehow that information has to be pushed to the storage manager to be acted upon.

Normal index scans are an even more interesting case, but I'm not sure how hard it would be to get that information. It may only be convenient to get the blocks from the last leaf page we looked at, for example.

> Also, I would expect to see a speed improvement even on single drives if
> the OS is actually issuing multiple requests to the drive. Doing so allows
> the drive to optimally order all of the reads.

Sure, but a 2x speed improvement? That's way more than I was expecting.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Re: [HACKERS] TOASTed size
Simon Riggs [EMAIL PROTECTED] writes:
> I'm thinking that there isn't any way currently of working out how big a
> compressed toast object is?

pg_column_size()?

regards, tom lane