[PERFORM] \d output to a file
Hi All,

I would like to write the output of the \d command on all tables in a database to an output file. There are more than 200 tables in the database. I am aware of the \o command to write output to a file, but it will be tough to do the \d for each table manually and write the output to a file. Is there a command/way in which I can achieve this without having to do it for each table? Any help in this regard would be really appreciated.

Thanks,
Saranya
Re: [PERFORM] \d output to a file
...and on Wed, Dec 15, 2004 at 06:38:22AM -0800, sarlav kumar used the keyboard:

> Hi All, I would like to write the output of the \d command on all
> tables in a database to an output file. There are more than 200 tables
> in the database. I am aware of the \o command to write output to a
> file. But it will be tough to do the \d for each table manually and
> write the output to a file. Is there a command/way in which I can
> achieve this without having to do it for each table? Any help in this
> regard would be really appreciated.

Hello Sarlav.

You don't say which platform you're doing this on. If it's Windows, someone else will have to advise you; if it's a UNIX-like platform, though, the following simple shell script should be helpful in achieving what you want:

---CUT-HERE---
#!/bin/bash
# Describe every table in the given database.
if [ -z "$1" ]; then
    echo "Please specify a database to query."
    exit 1
fi
DATABASE=$1

# List the table names: \t suppresses headers/footers, \a unaligns the
# output, \dt lists tables; cut extracts the name column.
MYTABLES=`echo '\t\a\dt' | psql -q ${DATABASE} | cut -f 2 -d '|'`

# Feed a \d command for each table back into psql.
for table in ${MYTABLES}; do
    echo '\d '${table}
done | psql ${DATABASE}
---CUT-HERE---

You can store this script in a file called, for example, describe.sh and invoke it like so:

$ ./describe.sh mydatabase > description.txt

It should then do what you want. Should you have additional arguments to specify to psql, such as a host, a username, a password and so on, it is easy to modify the script to do that: just supply those arguments in the places where the psql command is used.

Hope this helped,
--
Grega Bremec
gregab at p0f dot net
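If you do need to pass connection options, one way (a sketch only: the host and user values below are hypothetical placeholders, not from the original post) is to collect them in a variable used wherever psql is invoked:

---CUT-HERE---
#!/bin/bash
# Hypothetical connection parameters -- substitute your own.
HOST=dbhost.example.com
USER=postgres
DATABASE=$1
PSQL="psql -q -h ${HOST} -U ${USER} ${DATABASE}"

MYTABLES=`echo '\t\a\dt' | ${PSQL} | cut -f 2 -d '|'`

for table in ${MYTABLES}; do
    echo '\d '${table}
done | ${PSQL}
---CUT-HERE---

(A password, if required, can be supplied through the PGPASSWORD environment variable rather than on the command line.)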
Re: [PERFORM] [NOVICE] \d output to a file
Geoffrey [EMAIL PROTECTED] writes:
> sarlav kumar wrote:
>> I would like to write the output of the \d command on all tables in a
>> database to an output file.
>
> What is the OS? On any UNIX variant you can do:
> echo '\d' | psql > outputfile

Or use \o:

regression=# \o zzz1
regression=# \d
regression=# \o
regression=# \d
        List of relations
 Schema |     Name      | Type  |  Owner
--------+---------------+-------+----------
 public | pg_ts_cfg     | table | postgres
 public | pg_ts_cfgmap  | table | postgres
 public | pg_ts_dict    | table | postgres
 public | pg_ts_parser  | table | postgres
 public | t_test        | table | postgres
 public | test_tsvector | table | postgres
(6 rows)

regression=# \q
$ cat zzz1
        List of relations
 Schema |     Name      | Type  |  Owner
--------+---------------+-------+----------
 public | pg_ts_cfg     | table | postgres
 public | pg_ts_cfgmap  | table | postgres
 public | pg_ts_dict    | table | postgres
 public | pg_ts_parser  | table | postgres
 public | t_test        | table | postgres
 public | test_tsvector | table | postgres
(6 rows)

$

			regards, tom lane
Re: [PERFORM] Query Optimization
sarlav kumar wrote:
> Hi all,
>
> Can someone please help me optimize this query? Is there a better way to
> write this query? I am generating a report of transactions ordered by
> time and with details of the sender and receiver etc.
>
> SELECT DISTINCT
>        a.time::date || '<br>' || substring(a.time::time::text,1,8) AS Time,
>        CASE WHEN a.what = 0  THEN 'Money Transfer'
>             WHEN a.what = 15 THEN 'Purchase'
>             WHEN a.what = 26 THEN 'Merchant Streamline'
>             WHEN a.what = 13 THEN 'Reversal'
>        END AS Transaction_Type,
>        c1.account_no AS SenderAccount,
>        c2.account_no AS RecieverAccount,
>        b.country AS SenderCountry,
>        d.country AS RecieverCountry,
>        b.firstname AS SenderFirstName,
>        b.lastname AS SenderLastName,
>        d.firstname AS ReceiverFirstName,
>        d.lastname AS ReceiverLastName,
>        a.status AS status,
>        (SELECT sum(td.amount * 0.01) FROM transaction_data td
>          WHERE td.data_id = a2.id AND td.dir = 1
>            AND td.uid = a.target_uid) AS ReversedAmount,
>        (SELECT sum(td.amount * 0.01) FROM transaction_data td
>          WHERE td.data_id = a2.id AND td.dir = 0
>            AND td.uid = a.uid) AS DepositedAmount,
>        a.flags,
>        (a.amount * 0.01) AS Amount,
>        (a.fee * 0.01) AS Fee
>   FROM data a, customerdata b, customerdata d, customer c1, customer c2,
>        participant p, data a2
>  WHERE p.id = a.partner_id
>    AND a.uid = c1.id
>    AND a.target_uid = c2.id
>    AND c1.id = b.uid
>    AND c2.id = d.uid
>    AND a.confirmation IS NOT NULL
>    AND a2.ref_id = a.id
>    AND (a2.what = 13 OR a2.what = 17)
>  ORDER BY time DESC;
>
> (query plan followed)

The expensive operation is the UNIQUE. Are you sure, in terms of business logic, that this is necessary? Is it actually possible to have duplicate transactions at the exact same time, and if so, would you really want to eliminate them?

As an aside, I prefer to have numeric constants like the 'what' field in a small lookup table of two columns (what_code, what_description); it's easier to extend and to document.

--
Andrew Lazarus, Director
Pillette Investment Management
http://www.pillette.com
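To illustrate the lookup-table suggestion, a minimal sketch (the table and column names here are hypothetical, patterned on the query above):

    CREATE TABLE transaction_type (
        what_code        integer PRIMARY KEY,
        what_description text NOT NULL
    );

    INSERT INTO transaction_type VALUES (0,  'Money Transfer');
    INSERT INTO transaction_type VALUES (15, 'Purchase');
    INSERT INTO transaction_type VALUES (26, 'Merchant Streamline');
    INSERT INTO transaction_type VALUES (13, 'Reversal');

    -- The CASE expression can then become a join, and a new code needs
    -- only an INSERT rather than a query change:
    SELECT a.id, tt.what_description AS transaction_type
      FROM data a
      JOIN transaction_type tt ON tt.what_code = a.what;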
Re: [PERFORM] [NOVICE] \d output to a file
On Wed, 2004-12-15 at 11:50 -0500, Tom Lane wrote:
> Geoffrey [EMAIL PROTECTED] writes:
>> sarlav kumar wrote:
>>> I would like to write the output of the \d command on all tables in
>>> a database to an output file.
>>
>> What is the OS? On any UNIX variant you can do:
>> echo '\d' | psql > outputfile
>
> Or use \o:
>
> regression=# \o zzz1
> regression=# \d

or:

=# \d *

to get all tables, as the OP wanted.

> regression=# \o

gnari
Re: [PERFORM] Partitioned table performance
Stacy,

> Thanks again for the reply. So it sounds like the answer to my original
> question is that it's expected that the pseudo-partitioning would
> introduce a fairly significant amount of overhead. Correct?

Correct. For that matter, Oracle table partitioning introduces significant overhead, from what I've seen. I don't think there's a way not to.

Generally, I counsel people that they only want to consider pseudo-partitioning if they have one axis on the table which is used in 90% or more of the queries against that table.

What would improve the situation significantly, and the utility of pseudo-partitioning, is the ability to have a single index span multiple partitions. This would allow you to have a segmented index for the partitioned axis, yet still use an unsegmented index for the other columns. However, there's a *lot* of work to do to make that happen.

--
Josh Berkus
Aglio Database Solutions
San Francisco
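For readers following along: "pseudo-partitioning" in the Postgres of this era means table inheritance plus CHECK constraints. A minimal sketch (all names are hypothetical):

    CREATE TABLE measurements (
        logdate date NOT NULL,
        value   integer
    );

    -- Each child table holds one slice of the partitioning axis.
    CREATE TABLE measurements_2004q3 (
        CHECK (logdate >= '2004-07-01' AND logdate < '2004-10-01')
    ) INHERITS (measurements);

    CREATE TABLE measurements_2004q4 (
        CHECK (logdate >= '2004-10-01' AND logdate < '2005-01-01')
    ) INHERITS (measurements);

    -- Indexes are per-child; there is no single index spanning all
    -- partitions, which is the limitation discussed above.
    CREATE INDEX measurements_2004q4_logdate
        ON measurements_2004q4 (logdate);

    -- A query against the parent reads every child through an Append
    -- node; automatic pruning by the planner is the missing piece
    -- under discussion in this thread.
    SELECT count(*) FROM measurements WHERE logdate >= '2004-11-01';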
Re: [PERFORM] Partitioned table performance
Greg,

> Well Oracle has lots of partitioning intelligence pushed up to the
> planner to avoid overhead. If you have a query with something like
> "WHERE date = '2004-01-01'" and date is your partition key (even if it's
> a range) then Oracle will figure out which partition it will need at
> planning time.

Hmmm ... well, we're looking at making a spec for Postgres Table Partitioning. Maybe you could help?

> But I'm a bit puzzled. Why would Append have any significant cost? It's
> just taking the tuples from one plan node and returning them until they
> run out, then taking the tuples from another plan node. It should have
> no i/o cost and hardly any cpu cost. Where is the time going?

Beats me. Tom?

> In my experience global indexes defeat the whole purpose of having the
> partitions. They make dropping and adding partitions expensive which was
> always the reason we wanted to partition something anyways.

Hmmm. Possibly. I was just thinking about the cost to partitioned tables when you do a selection *not* on the partitioned axis. Also that currently we can't enforce UNIQUE constraints across partitions. But maybe reducing the cost of Append is the answer to this.

> It is handy having a higher level interface to deal with partitioned
> tables. You can create a single local or segmented index and not have to
> manually deal with all the partitions as separate tables. But that's
> just syntactic sugar.

Right, and the easy part.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] Partitioned table performance
Josh Berkus [EMAIL PROTECTED] writes:

>> Stacy, Thanks again for the reply. So it sounds like the answer to my
>> original question is that it's expected that the pseudo-partitioning
>> would introduce a fairly significant amount of overhead. Correct?
>
> Correct. For that matter, Oracle table partitioning introduces
> significant overhead, from what I've seen. I don't think there's a way
> not to.

Well, Oracle has lots of partitioning intelligence pushed up to the planner to avoid overhead. If you have a query with something like "WHERE date = '2004-01-01'" and date is your partition key (even if it's a range) then Oracle will figure out which partition it will need at planning time.

Even if your query is something like "WHERE date = ?" then Oracle will still recognize that it will only need a single partition at planning time, though it has to decide which partition at execution time.

We didn't notice any run-time performance degradation when we went to partitioned tables. Maybe we were so blinded by the joy they brought us on the maintenance side, though; I don't think we specifically checked for run-time consequences.

But I'm a bit puzzled. Why would Append have any significant cost? It's just taking the tuples from one plan node and returning them until they run out, then taking the tuples from another plan node. It should have no i/o cost and hardly any cpu cost. Where is the time going?

> What would improve the situation significantly, and the utility of
> pseudo-partitioning, is the ability to have a single index span multiple
> partitions. This would allow you to have a segmented index for the
> partitioned axis, yet still use an unsegmented index for the other
> columns. However, there's a *lot* of work to do to make that happen.

In my experience global indexes defeat the whole purpose of having the partitions. They make dropping and adding partitions expensive, which was always the reason we wanted to partition something anyways.

It is handy having a higher-level interface to deal with partitioned tables. You can create a single local or segmented index and not have to manually deal with all the partitions as separate tables. But that's just syntactic sugar.

--
greg
Re: [PERFORM] Partitioned table performance
Josh Berkus [EMAIL PROTECTED] writes:

>> But I'm a bit puzzled. Why would Append have any significant cost? It's
>> just taking the tuples from one plan node and returning them until they
>> run out, then taking the tuples from another plan node. It should have
>> no i/o cost and hardly any cpu cost. Where is the time going?
>
> Beats me. Tom?
>
>> In my experience global indexes defeat the whole purpose of having the
>> partitions. They make dropping and adding partitions expensive which
>> was always the reason we wanted to partition something anyways.
>
> Hmmm. Possibly. I was just thinking about the cost to partitioned tables
> when you do a selection *not* on the partitioned axis. Also that
> currently we can't enforce UNIQUE constraints across partitions.

Like I said, though, we found global indexes defeated the whole purpose. That meant no global UNIQUE constraints for us when we went to partitioned tables. It gave the DBAs the willies, but it really wasn't a big deal.

You can still do unique local indexes on a specific partition. So as long as your partition key is in the primary key, you can have a trustworthy primary key. And even if not, you usually find you're only loading data into one partition anyway. In most applications it's pretty hard to get a record from two different partitions with conflicting IDs, and not hard to check for. You could easily put a constraint saying that all PO numbers in the new fiscal year have to be greater than the last PO number from last year, for example.

> But maybe reducing the cost of Append is the answer to this.

The problem with global indexes is that adding or removing an entire partition becomes a large job. [Actually, with Postgres MVCC I suppose removing might not. But cleaning up would eventually be a large job, and the point remains for adding a partition.] Ideally, adding and removing a partition should be an O(1) operation: no data modification at all, purely catalog changes.

>> It is handy having a higher level interface to deal with partitioned
>> tables. You can create a single local or segmented index and not have
>> to manually deal with all the partitions as separate tables. But that's
>> just syntactic sugar.
>
> Right, and the easy part.

I think the hard part lies in the optimizer, actually. The semantics of the operations to manipulate partitions might be tricky to get right, but the coding should be straightforward. Having the optimizer be able to recognize when it can prune partitions will be a lot of work.

--
greg
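A sketch of the fiscal-year PO constraint Greg mentions (the table name and boundary value are hypothetical):

    -- orders_fy2005 holds the new fiscal year's purchase orders;
    -- 100000 stands in for the last PO number issued the year before.
    CREATE TABLE orders_fy2005 (
        po_number integer NOT NULL CHECK (po_number > 100000),
        ordered   date    NOT NULL
    );

    -- A unique *local* index still guards against duplicates within
    -- the partition, even without a global UNIQUE constraint.
    CREATE UNIQUE INDEX orders_fy2005_po ON orders_fy2005 (po_number);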
Re: [PERFORM] identifying the database in a Postgres log file.
Theo Galanakis wrote:
> I have written a program that parses a syslog file, reading all the
> postgres transactions. I would like to know if there is a way for
> postgres to also log the specific database the SQL statement originated
> from. The only options available in the postgresql.conf are:
>
> #log_connections = false
> #log_duration = false
> #log_pid = false
> #log_statement = false
> #log_timestamp = false
> #log_hostname = false
> #log_source_port = false
>
> Is this possible? Or is there a smart workaround?

In pre-8.0 the only way to do it is to log connections, grab the database name from there, and add the pid to join all log rows back to the connection row. In 8.0 we have log_line_prefix, which can display all of this information.

--
Bruce Momjian
http://candle.pha.pa.us
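For example, a minimal 8.0 postgresql.conf sketch (the exact prefix string is a matter of taste):

    # %t = timestamp, %p = process ID, %d = database name, %u = user name
    log_line_prefix = '%t [%p] db=%d user=%u '

With that in place, each log line carries the database name directly, so the parser no longer has to join back through the connection log.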
Re: [PERFORM] Partitioned table performance
Greg Stark [EMAIL PROTECTED] writes:
> But I'm a bit puzzled. Why would Append have any significant cost? It's
> just taking the tuples from one plan node and returning them until they
> run out, then taking the tuples from another plan node. It should have
> no i/o cost and hardly any cpu cost. Where is the time going?

As best I can tell by profiling, the cost of the Append node per se is indeed negligible --- no more than a couple percent of the runtime in CVS tip for a test case similar to Stacy White's example.

It looks bad in EXPLAIN ANALYZE, but you have to realize that passing the tuples up through the Append node doubles the instrumentation overhead of EXPLAIN ANALYZE, which is pretty sizable already. (If you turn on \timing in psql and try the query itself vs. EXPLAIN ANALYZE, the actual elapsed time is about double, at least for me.)

The other effect, which I hadn't expected, is that the seqscans themselves actually slow down. I get

regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM super_foo;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=16414.32..16414.32 rows=1 width=4) (actual time=32313.980..32313.988 rows=1 loops=1)
   ->  Append  (cost=0.00..13631.54 rows=556555 width=4) (actual time=0.232..21848.401 rows=524289 loops=1)
         ->  Seq Scan on super_foo  (cost=0.00..0.00 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=1)
         ->  Seq Scan on sub_foo1 super_foo  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.187..6926.395 rows=262144 loops=1)
         ->  Seq Scan on sub_foo2 super_foo  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.168..7026.953 rows=262145 loops=1)
 Total runtime: 32314.993 ms
(6 rows)

regression=# explain analyze SELECT COUNT(*), MAX(bar1) FROM sub_foo1;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=8207.16..8207.16 rows=1 width=4) (actual time=9850.420..9850.428 rows=1 loops=1)
   ->  Seq Scan on sub_foo1  (cost=0.00..6815.77 rows=278277 width=4) (actual time=0.202..4642.401 rows=262144 loops=1)
 Total runtime: 9851.423 ms
(3 rows)

Notice the actual times for the sub_foo1 seqscans. That increase (when counted for both input tables) almost exactly accounts for the difference in non-EXPLAIN ANALYZE runtime.

After digging around, I find that the reason for the difference is that the optimization to avoid a projection step (ExecProject) isn't applied for scans of inheritance unions:

    /*
     * Can't do it with inheritance cases either (mainly because Append
     * doesn't project).
     */
    if (rel->reloptkind != RELOPT_BASEREL)
        return false;

So if you were to try the example in a pre-7.4 PG, which didn't have that optimization, you'd probably find that the speeds were just about the same. (I'm too lazy to verify this though.)

I looked briefly at what it would take to cover this case, and decided that it's a nontrivial change, so it's too late to do something about it for 8.0. I think it's probably possible to fix it though, at least for cases where the child tables have rowtypes identical to the parent.

			regards, tom lane
Re: [PERFORM] [NOVICE] \d output to a file
sarlav kumar wrote:
> Hi All, I would like to write the output of the \d command on all tables
> in a database to an output file. There are more than 200 tables in the
> database. I am aware of the \o command to write the output to a file.
> But it will be tough to do the \d for each table manually and write the
> output to a file. Is there a command/way in which I can achieve this
> without having to do it for each table? Any help in this regard would be
> really appreciated.

What is the OS? On any UNIX variant you can do:

echo '\d' | psql > outputfile

But this will get you the system tables as well, I think. Alternately you could do something like:

for table in $(listoftables); do
    echo "\d $table" | psql
done > outputfile

--
Until later, Geoffrey
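One way to fill in the hypothetical $(listoftables) step is to pull the names from the pg_tables system view (the database name mydb below is a placeholder):

    # -A = unaligned output, -t = tuples only, -c = run one command
    listoftables() {
        psql -A -t -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public'" mydb
    }

    for table in $(listoftables); do
        echo "\d $table" | psql mydb
    done > outputfile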
[PERFORM] identifying the database in a Postgres log file.
I have written a program that parses a syslog file, reading all the postgres transactions. I would like to know if there is a way for postgres to also log the specific database the SQL statement originated from. The only options available in the postgresql.conf are:

#log_connections = false
#log_duration = false
#log_pid = false
#log_statement = false
#log_timestamp = false
#log_hostname = false
#log_source_port = false

Is this possible? Or is there a smart workaround?

Regards,
Theo
Re: [PERFORM] Trying to create multi db query in one large queries
The world rejoiced as [EMAIL PROTECTED] (Josh Berkus) wrote:
> Hasnul,
>
>> My question is if there is a query design that would query multiple
>> servers simultaneously.. would that improve the performance?
>
> Not without vast amounts of infrastructure coding. You're basically
> talking about what Oracle has spent the last 3 years and $100 million
> working on.

I recall a presentation from folks from Empress Software http://www.empress.com/ back in about '94 or '95 offering this very feature as part of the base functionality of their product.

I'm not sure it's quite fair to assess things as more or less preposterous simply because they prove ludicrously expensive to develop on a particular platform that happens to be targeted by even more ludicrous quantities of development dollars...

On the other hand, it seems unlikely that improved performance would be one of the merits of this approach...

--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://www3.sympatico.ca/cbbrowne/