Re: Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-08 Thread Hans-Jürgen Schönig
Jim Buttafuoco wrote: Is this NOT what I have been after for many months now? I dropped the tablespace/location idea before 7.2 because there didn't seem to be any interest. Please see my past emails for the SQL commands and on-disk directory layout I have proposed. I have a working 7.2

Re: [HACKERS] Threaded Sorting

2002-10-07 Thread Hans-Jürgen Schönig
Bingo = great :). The I/O problem seems to be solved :). A table space concept would be top of the hit list :). The symlink version is not very comfortable and I think it would be a real hack. Also: If we had a clean table space concept it would be a real advantage. In the first place it would

Re: [HACKERS] Threaded Sorting

2002-10-07 Thread Hans-Jürgen Schönig
Greg Copeland wrote: I wouldn't hold your breath for any form of threading. Since PostgreSQL is process based, you might consider having a pool of sort processes which address this but I doubt you'll get anywhere talking about threads here. Greg I came across the problem yesterday. We

Re: [HACKERS] Threaded Sorting

2002-10-07 Thread Hans-Jürgen Schönig
Threads are not the best solution when it comes to portability. I prefer a process model as well. My concern was that a process model might be a bit too slow for that but if we had processes in memory this would be a wonderful thing. Using it for small amounts of data is pretty useless - I

Re: [HACKERS] Threaded Sorting

2002-10-07 Thread Hans-Jürgen Schönig
Threads are bad - I know ... I like the idea of a pool of processes instead of threads - from my point of view this would be useful. I am planning to run some tests (GEQO, AIX, sorts) as soon as I have time to do so (still too much work ahead before :( ...). If I had time I'd love to do

Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-07 Thread Shridhar Daithankar
On 4 Oct 2002 at 21:13, Hans-Jürgen Schönig wrote: Bingo = great :). The I/O problem seems to be solved :). A table space concept would be top of the hit list :). The symlink version is not very comfortable and I think it would be a real hack. Also: If we had a clean table space

Re: Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-07 Thread Hans-Jürgen Schönig
Can anybody please tell me in detail (not just by pointing to TODO items): 1) What is a table space supposed to offer? They allow you to define a maximum amount of storage for a certain set of data. They help you to define the location of data. They help you to define how much data can be

Re: Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-07 Thread Hans-Jürgen Schönig
2) What does a directory structure not offer that a table space does? You need the command line in order to manage quotas - you might not want that. Mount a directory on a partition. If the data exceeds the space on that partition, there would be a disk error. Like a tablespace getting

Re: Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-07 Thread Shridhar Daithankar
On 7 Oct 2002 at 16:49, Hans-Jürgen Schönig wrote: Mount a directory on a partition. If the data exceeds the space on that partition, there would be a disk error. Like a tablespace overflowing. I have seen both scenarios in action. Of course it can be done somehow. However, with tablespaces

Re: Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-07 Thread Tom Lane
Hans-Jürgen Schönig [EMAIL PROTECTED] writes: how would you handle table spaces? The plan that's been discussed simply defines a tablespace as being a directory somewhere; physical storage of individual tables would remain basically the same, one or more files under the

Re: Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-07 Thread Hans-Jürgen Schönig
Quotas are handled differently on every platform (if available). Yeah. But that's the sysadmin's responsibility, not the DBA's. Maybe many people ARE the sysadmins of their PostgreSQL box ... When developing a database with an open mind people should try to see a problem from more than

Re: Table spaces again [was Re: [HACKERS] Threaded Sorting]

2002-10-07 Thread Jim Buttafuoco
Is this NOT what I have been after for many months now? I dropped the tablespace/location idea before 7.2 because there didn't seem to be any interest. Please see my past emails for the SQL commands and on-disk directory layout I have proposed. I have a working 7.2 system with

Re: Parallel Executors [was RE: [HACKERS] Threaded Sorting]

2002-10-07 Thread Jan Wieck
Curtis Faith wrote: The current transaction/user state seems to be stored in process global space. This could be changed to be a pointer to a struct stored in a back-end specific shared memory area which would be accessed by the executor process at execution start. The backend would destroy

Re: Parallel Executors [was RE: [HACKERS] Threaded Sorting]

2002-10-07 Thread Curtis Faith
Curtis Faith wrote: The current transaction/user state seems to be stored in process global space. This could be changed to be a pointer to a struct stored in a back-end specific shared memory area which would be accessed by the executor process at execution start. The backend would

Parallel Executors [was RE: [HACKERS] Threaded Sorting]

2002-10-06 Thread Curtis Faith
Tom Lane wrote: Curtis Faith [EMAIL PROTECTED] writes: What about splitting out parsing, optimization and plan generation from execution and having a separate pool of executor processes. As an optimizer finished with a query plan it would initiate execution by grabbing an executor

Re: [HACKERS] Threaded Sorting

2002-10-05 Thread Curtis Faith
Tom Lane writes: The notion of a sort process pool seems possibly attractive. I'm unconvinced that it's going to be a win though because of the cost of shoving data across address-space boundaries. What about splitting out parsing, optimization and plan generation from execution and having a

Re: [HACKERS] Threaded Sorting

2002-10-05 Thread Tom Lane
Curtis Faith [EMAIL PROTECTED] writes: What about splitting out parsing, optimization and plan generation from execution and having a separate pool of executor processes. As an optimizer finished with a query plan it would initiate execution by grabbing an executor from a pool and passing

[HACKERS] Threaded Sorting

2002-10-04 Thread Hans-Jürgen Schönig
Did anybody think about threaded sorting so far? Assume an SMP machine. In the case of building an index or in the case of sorting a lot of data there is just one backend working. Therefore just one CPU is used. What about starting a thread for every temporary file being created? This way

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Shridhar Daithankar
On 4 Oct 2002 at 9:46, Hans-Jürgen Schönig wrote: Did anybody think about threaded sorting so far? Assume an SMP machine. In the case of building an index or in the case of sorting a lot of data there is just one backend working. Therefore just one CPU is used. What about starting a

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Tom Lane
Hans-Jürgen Schönig [EMAIL PROTECTED] writes: Did anybody think about threaded sorting so far? Assume an SMP machine. In the case of building an index or in the case of sorting a lot of data there is just one backend working. Therefore just one CPU is used. What about

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
I wouldn't hold your breath for any form of threading. Since PostgreSQL is process based, you might consider having a pool of sort processes which address this but I doubt you'll get anywhere talking about threads here. Greg On Fri, 2002-10-04 at 02:46, Hans-Jürgen Schönig wrote: Did anybody

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
On Fri, 2002-10-04 at 09:40, Hans-Jürgen Schönig wrote: I had a brief look at the code used for sorting. It is very well documented so maybe it is worth thinking about a parallel algorithm. When talking about threads: A pool of processes for sorting? Maybe this could be useful but I

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
On Fri, 2002-10-04 at 10:37, Hans-Jürgen Schönig wrote: My concern was that a process model might be a bit too slow for that but if we had processes in memory this would be a wonderful thing. Yes, that's the point of having a pool. The idea is not only do you avoid process creation and

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
On Fri, 2002-10-04 at 12:26, Bruce Momjian wrote: Added to TODO: * Allow sorting to use multiple work directories Why wouldn't that fall under the table space effort??? Greg

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Bingo! Want to increase sorting performance, give it more I/O bandwidth, and it will take 1/100th of the time to do threading. Added to TODO: * Allow sorting to use multiple work directories Yeah, I like that. Actually it
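
The TODO item quoted above ("Allow sorting to use multiple work directories") can be illustrated with a minimal sketch. This is not PostgreSQL code: the function name make_temp_file_allocator and the round-robin policy are assumptions chosen for illustration, and the thread itself does not say how files would be distributed. The idea shown is only that successive sort temp files land in different directories, which an administrator could place on different disks to gain I/O bandwidth.

```python
import itertools
import os
import tempfile


def make_temp_file_allocator(work_dirs):
    """Return a function that creates temp files, cycling through work_dirs.

    Hypothetical helper: round-robin placement is an assumed policy,
    not something specified in the thread.
    """
    dirs = itertools.cycle(work_dirs)

    def allocate(prefix="sort_"):
        # Pick the next directory in rotation and create an empty file there.
        target = next(dirs)
        fd, path = tempfile.mkstemp(prefix=prefix, dir=target)
        os.close(fd)
        return path

    return allocate


if __name__ == "__main__":
    # Two scratch directories stand in for directories on separate disks.
    d1, d2 = tempfile.mkdtemp(), tempfile.mkdtemp()
    allocate = make_temp_file_allocator([d1, d2])
    for _ in range(4):
        print(allocate())
```

With two directories, consecutive files alternate between them, so a merge that reads several temp files at once spreads its reads over both locations.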

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Greg Copeland wrote: On Fri, 2002-10-04 at 12:26, Bruce Momjian wrote: Added to TODO: * Allow sorting to use multiple work directories Why wouldn't that fall under the table space effort??? Yes, but we make it a separate item so we are sure that is

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
I see. I just always assumed that it would be done as part of the table space effort as it's such a de facto feature. I am curious as to why no one has commented on the other rather obvious performance enhancement which was brought up in this thread. Allowing for parallel sorting seems rather

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Greg Copeland wrote: I see. I just always assumed that it would be done as part of the table space effort as it's such a de facto feature. I am curious as to why no one has commented on the other rather obvious performance enhancement which was brought up in this

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
Well, that's why I was soliciting developer input as to exactly what goes on with sorts. From what I seem to be hearing, all sorts result in temp files being created and/or used. If that's the case then yes, I can understand the fixation. Of course that opens the door for it being a horrible

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: Tom, what temp files do we use that aren't for sorting; I forgot. MATERIALIZE plan nodes are the only thing I can think of offhand that uses a straight temp file. But ISTM that if this makes sense for our internal temp files, it makes sense for

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Greg Copeland wrote: Well, that's why I was soliciting developer input as to exactly what goes on with sorts. From what I seem to be hearing, all sorts result in temp files being created and/or used. If that's the case then yes, I can understand the fixation.

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom, what temp files do we use that aren't for sorting; I forgot. MATERIALIZE plan nodes are the only thing I can think of offhand that uses a straight temp file. But ISTM that if this makes sense for our internal temp files, it

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
On Fri, 2002-10-04 at 14:31, Bruce Momjian wrote: We use tape sorts, à la Knuth, meaning we sort in memory as much as possible, but when there is more data than fits in memory, rather than swapping, we write to temp files then merge the temp files (aka tapes). Right, which is what I originally
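
The tape-sort approach described in the excerpt above (sort in memory while the data fits, otherwise write sorted runs to temp files and merge them) can be sketched as follows. This is a simplified illustration, not PostgreSQL's implementation: the function name external_sort, the max_in_memory parameter, and the one-integer-per-line file format are all assumptions, and a plain k-way merge stands in for the merge strategy the real code uses.

```python
import heapq
import os
import tempfile


def external_sort(values, max_in_memory, work_dir=None):
    """Sort an iterable of ints while buffering at most max_in_memory items.

    Returns a list for easy inspection; a real implementation would
    stream the merged output instead of materializing it.
    """
    run_paths = []
    buffer = []

    def flush_run():
        # Sort the in-memory buffer and write it out as one sorted run ("tape").
        buffer.sort()
        fd, path = tempfile.mkstemp(suffix=".run", dir=work_dir)
        with os.fdopen(fd, "w") as f:
            for v in buffer:
                f.write(f"{v}\n")
        run_paths.append(path)
        buffer.clear()

    for v in values:
        buffer.append(v)
        if len(buffer) >= max_in_memory:
            flush_run()

    if not run_paths:
        # Everything fit in memory, so no temp files are needed.
        return sorted(buffer)
    if buffer:
        flush_run()

    def read_run(path):
        # Lazily yield the values of one sorted run.
        with open(path) as f:
            for line in f:
                yield int(line)

    try:
        # k-way merge of the sorted runs.
        return list(heapq.merge(*(read_run(p) for p in run_paths)))
    finally:
        for p in run_paths:
            os.remove(p)


if __name__ == "__main__":
    print(external_sort([5, 3, 9, 1, 7, 2, 8], max_in_memory=3))
```

The run-writing phase is CPU-bound sorting followed by sequential writes, and the merge phase is dominated by reading the runs back, which is why the thread's discussion centres on I/O bandwidth for the temp files.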

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: ... But ISTM that if this makes sense for our internal temp files, it makes sense for user-created temp tables as well. Yes, I was thinking that, but of course, those are real tables, rather than just files. Not sure how clean it

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Hans-Jürgen Schönig wrote: Did anybody think about threaded sorting so far? Assume an SMP machine. In the case of building an index or in the case of sorting a lot of data there is just one backend working. Therefore just one CPU is used. What about starting a thread for every temporary

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread scott.marlowe
On Fri, 4 Oct 2002, Bruce Momjian wrote: Hans-Jürgen Schönig wrote: Did anybody think about threaded sorting so far? Assume an SMP machine. In the case of building an index or in the case of sorting a lot of data there is just one backend working. Therefore just one CPU is used.

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Hans-Jürgen Schönig wrote: Threads are bad - I know ... I like the idea of a pool of processes instead of threads - from my point of view this would be useful. I am planning to run some tests (GEQO, AIX, sorts) as soon as I have time to do so (still too much work ahead before :( ...).

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
scott.marlowe wrote: We haven't thought about it yet because there are too many buggy thread implementations. We are probably just now getting to a point where we can consider it. However, lots of databases have moved to threads for all sorts of things and ended up with a royal mess of

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Justin Clift
Bruce Momjian wrote: scott.marlowe wrote: It seems like sometimes we consider these issues more from the one or two SCSI drives perspective instead of the big box o drives perspective. Yes, it is mostly for non-RAID drives, but also, sometimes single drives can be faster. When you

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: Bingo! Want to increase sorting performance, give it more I/O bandwidth, and it will take 1/100th of the time to do threading. Added to TODO: * Allow sorting to use multiple work directories Yeah, I like that. Actually it should apply to all

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Tom Lane
Greg Copeland [EMAIL PROTECTED] writes: ... I can understand why addressing the seemingly more common I/O bound case would receive priority, however, I'm at a loss as to why the other would be completely ignored. Bruce already explained that we avoid threads because of portability and

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Greg Copeland
On Fri, 2002-10-04 at 15:07, Tom Lane wrote: the sort comparison function can be anything, including user-defined code that does database accesses or other interesting stuff. This This is something that I'd not considered. would mean that the sort auxiliary process would have to adopt the

Re: [HACKERS] Threaded Sorting

2002-10-04 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: ... But ISTM that if this makes sense for our internal temp files, it makes sense for user-created temp tables as well. Yes, I was thinking that, but of course, those are real tables, rather than just files.