Re: [PATCHES] [HACKERS] Automatic free space map filling

2006-03-08 Thread ITAGAKI Takahiro
"Zeugswetter Andreas DCP SD" <[EMAIL PROTECTED]> wrote:

> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> tuple by reducing the tuple to it's header info.

Attached patch realizes the concept of his idea. The dead tuples will be
reduced to their headers are done by bgwriter.

This patch is incomplete, so please discuss in the thread on HACKERS.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories



bgvacuum-0309.patch.txt
Description: Binary data

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] pg_freespacemap question

2006-03-08 Thread Mark Kirkwood

Tom Lane wrote:

Mark Kirkwood <[EMAIL PROTECTED]> writes:

Good points! I had not noticed this test case. Probably NULL is better 



Would setting it to 'BLCKSZ - (fixed index header stuff)' be better, 



No, I don't think so, because that will just make it harder to recognize
what's what (remember that BLCKSZ isn't really a constant, and the index
overhead is not the same for all AMs either).  The point here is that
for indexes the FSM tracks whole-page availability, not the amount of
free space within pages.  So I think NULL is a reasonable representation
of that.  Using NULL will make it easy to filter the results if you want
to see only heap-page data or only index-page data, whereas it will be
very hard to do that if the view adopts an ultimately-artificial
convention about the amount of available space on an index page.



Right - after suggesting it I realized that coding the different index
overhead for each possible AM would have been ... difficult :-). A patch
is attached to implement the NULL free bytes and other recommendations:

1/ Index free bytes set to NULL
2/ Comment added to the README briefly mentioning the index business
3/ Columns reordered more logically
4/ 'Blockid' column removed
5/ Free bytes column renamed to just 'bytes' instead of 'blockfreebytes'

Now 5/ was only hinted at, but seemed worth doing while I was there
(hopefully I haven't made it too terse now).

cheers

Mark


Index: pg_freespacemap.c
===
RCS file: /projects/cvsroot/pgsql/contrib/pg_freespacemap/pg_freespacemap.c,v
retrieving revision 1.2
diff -c -r1.2 pg_freespacemap.c
*** pg_freespacemap.c   14 Feb 2006 15:03:59 -  1.2
--- pg_freespacemap.c   9 Mar 2006 03:38:10 -
***
*** 12,18 
  #include "storage/freespace.h"
  #include "utils/relcache.h"
  
! #define   NUM_FREESPACE_PAGES_ELEM6
  
  #if defined(WIN32) || defined(__CYGWIN__)
  /* Need DLLIMPORT for some things that are not so marked in main headers */
--- 12,18 
  #include "storage/freespace.h"
  #include "utils/relcache.h"
  
! #define   NUM_FREESPACE_PAGES_ELEM5
  
  #if defined(WIN32) || defined(__CYGWIN__)
  /* Need DLLIMPORT for some things that are not so marked in main headers */
***
*** 29,40 
  typedef struct
  {
  
-   uint32  blockid;
-   uint32  relfilenode;
uint32  reltablespace;
uint32  reldatabase;
uint32  relblocknumber;
!   uint32  blockfreebytes;
  
  } FreeSpacePagesRec;
  
--- 29,40 
  typedef struct
  {
  
uint32  reltablespace;
uint32  reldatabase;
+   uint32  relfilenode;
uint32  relblocknumber;
!   uint32  bytes;
!   boolisindex;
  
  } FreeSpacePagesRec;
  
***
*** 91,107 
  
/* Construct a tuple to return. */
tupledesc = CreateTemplateTupleDesc(NUM_FREESPACE_PAGES_ELEM, 
false);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 1, "blockid",
!  INT4OID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
   OIDOID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
   OIDOID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
   OIDOID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 5, "relblocknumber",
   INT8OID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 6, "blockfreebytes",
   INT4OID, -1, 0);
  
/* Generate attribute metadata needed later to produce tuples */
--- 91,105 
  
/* Construct a tuple to return. */
tupledesc = CreateTemplateTupleDesc(NUM_FREESPACE_PAGES_ELEM, 
false);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 1, "reltablespace",
   OIDOID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 2, "reldatabase",
   OIDOID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 3, "relfilenode",
   OIDOID, -1, 0);
!   TupleDescInitEntry(tupledesc, (AttrNumber) 4, "relblocknumber",
   INT8OID, -1, 0);
!   TupleD

Re: [HACKERS] [PATCHES] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread Jim C. Nasby
On Wed, Mar 08, 2006 at 04:57:52PM -0500, Neil Conway wrote:
> On Wed, 2006-03-08 at 08:20 -0800, David Fetter wrote:
> > The previous discussion showed that there is a wide diversity of
> > opinions on what The Right Delimiter and The Right NULL String(TM)
> > are.
> 
> Barring a more convincing justification for why we need this feature,
> I'm inclined to side with Tom: pg_dump has enough obscure options as it
> is, and I can't imagine very many people needing this functionality.

Given all the different requests that come in for pg_dump and copy,
maybe it makes sense for Someone Who Cares to start a pgFoundry project
(or maybe extend the import/export project that's already there).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Stephan Szabo

On Wed, 8 Mar 2006, Jonah H. Harris wrote:

> On 3/8/06, Stephan Szabo <[EMAIL PROTECTED]> wrote:
> >
> > What feature though?  Part of the definition of a feature like synonym has
> > to nail down things like how it interacts with search path
>
>
> I've said how it interacts with the search path (with Oracle) several times
> and illustrated it in the last one, let's not go through this again.

Argh.

You responded with a statement about synonyms having cost in response to a
message of mine in response to a message from [EMAIL PROTECTED] which
said "the costs will only be added if the "real table" is not found.
therefore there is no impact on "normal" users." Your idea and that of the
person who submitted this patch thus seem to be slightly different. Maybe
I'm confused, but to me this seems to show incomplete acceptance of this
point when the patch submitter and one of the vocal proponents have
different models in mind. I'd rather the feature follows the model you
described (although I may have greater concerns of the cost) because it
seems consistent with other lookups.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Jonah H. Harris
On 3/8/06, Stephan Szabo <[EMAIL PROTECTED]> wrote:
What feature though?  Part of the definition of a feature like synonym hasto nail down things like how it interacts with search pathI've said how it interacts with the search path (with Oracle) several times and illustrated it in the last one, let's not go through this again.
The message I was responding to was talking about the patch and seeming to say that
there wasn't a cost for non-users because the search was done iff acandidate object wasn't found. IMHO, this is a different feature than asynonym feature for which each search path entry is checked so that
synonyms in earlier path entries shadow later concrete objects. Weprobably don't want both features even if we want either, but they'rereally different features.Like I said in the email before this, there is a way to limit the cost of synonyms for ONLY if a real object does not exist in the search path.  However, this would be odd behavior regarding namespace searching IMHO.  I think the only *good* implementation is to follow the search path as it is now and include synonyms in it... this would mean a cost for any query whether or not a synonym were used or not.  The real question is how to lessen the cost if we decide to implement the functionality.
I would say that that's a really bad choice, and Joe should have hissynonyms somewhere other than public so as not to pollute other people's
default search path with his particular needs that may not be the same assomeone else's.  What does Jane do now when she needs the opposite set andwhy is Joe's choice more relevant than Jane's?
Joe and Jane could create synonyms locally in their own schemas, so this isn't an issue at all.  The demonstration example was representative of many ERP systems where a synonym is publicly shared by all modules and you wouldn't have the "Jane" issue.
I'm nearly done fighting this... synonyms are useful functionality that many people in this discussion have not used.  I've explained how it works in Oracle and the reasoning behind it.  If we want to limit users to search_path for the sake of not being Oracle, fine.
I have patches to work on and this seems to be going nowhere.  I'm open to helping anyone implement similar functionality and/or discussion, but this thread has too many sub-discussions to be useful.  For functionality descriptions, see the Oracle docs.  I'm not averse to straying from Oracle's way of doing it if it makes sense.
-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Stephan Szabo

On Wed, 8 Mar 2006, Jonah H. Harris wrote:

> On 3/8/06, Stephan Szabo <[EMAIL PROTECTED]> wrote:
> >
> > Yes, however there are two slightly separate discussions going on and I
> > think you're taking them as a single discussion.
>
>
> I agree that there are two discussions happening in this thread, but I don't
> think anyone has agreed at all that this patch as it is would be acceptable
> for various reasons.  There are a couple things that Hans and I will discuss
> about the patch assuming we decide this is a feature that would be nice for
> PostgreSQL.

What feature though?  Part of the definition of a feature like synonym has
to nail down things like how it interacts with search path. The message I
was responding to was talking about the patch and seeming to say that
there wasn't a cost for non-users because the search was done iff a
candidate object wasn't found. IMHO, this is a different feature than a
synonym feature for which each search path entry is checked so that
synonyms in earlier path entries shadow later concrete objects. We
probably don't want both features even if we want either, but they're
really different features.

> With synonyms, the search path for Joe would remain $user, public and one
> could easily do
> CREATE SYNONYM public.employee FOR hr.employee;
> CREATE SYNONYM public.commissions FOR crm.commissions;

I would say that that's a really bad choice, and Joe should have his
synonyms somewhere other than public so as not to pollute other people's
default search path with his particular needs that may not be the same as
someone else's.  What does Jane do now when she needs the opposite set and
why is Joe's choice more relevant than Jane's?

This is not a negative effect of synonyms, merely this use.

> I guess synonym searching could be done iff no object were found in the
> current search.  I don't know why I thought it would be just as costly
> (perhaps too much Sam Adams).  The worst-case scenario would be an
> additional search only if an object weren't found in a catalog search,
> basically this would be the cost of using synonyms and wouldn't affect
> performance for everyone else.  Oracle does have a small cost only when
> using synonyms as well.

Yeah, that just seems less consistent with the rest of the way schema
searches work right now for other objects.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread Neil Conway
On Wed, 2006-03-08 at 08:20 -0800, David Fetter wrote:
> The previous discussion showed that there is a wide diversity of
> opinions on what The Right Delimiter and The Right NULL String(TM)
> are.

Barring a more convincing justification for why we need this feature,
I'm inclined to side with Tom: pg_dump has enough obscure options as it
is, and I can't imagine very many people needing this functionality.

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Jonah H. Harris
On 3/8/06, Stephan Szabo <[EMAIL PROTECTED]> wrote:
Yes, however there are two slightly separate discussions going on and Ithink you're taking them as a single discussion.
I agree that there are two discussions happening in this thread, but I
don't think anyone has agreed at all that this patch as it is would be
acceptable for various reasons.  There are a couple things that
Hans and I will discuss about the patch assuming we decide this is a
feature that would be nice for PostgreSQL.  
If your search path is A,B and there is a B.EMPLOYEE table and an
A.EMPLOYEE synonym to HR.EMPLOYEE, which table does select * from EMPLOYEEread?   The
one first in your search path.  You could not, for example, create
a SYNONYM called EMPLOYEE in the HR schema as it would conflict with
the EMPLOYEE table.  Synonyms act like the objects they represent
in term of namespace searches.

ASSUME:
CREATE USER joe;
CREATE SCHEMA AUTHORIZATION joe;

Joe's search_path is $user,public

CREATE SCHEMA hr;
CREATE TABLE hr.employee (emp_id, ...)
CREATE TABLE hr.payroll (emp_id, ...)
CREATE TABLE hr.commissions;

For Joe to see this, they either have to add HR to their search_path or
fully qualify it.  Let's assume they use current PostgreSQL
behavior:

SET search_path TO ..., HR

Now they can SELECT * FROM EMPLOYEE where EMPLOYEE is HR.EMPLOYEE

Now assume:

CREATE SCHEMA crm;
CREATE TABLE crm.employee;
CREATE TABLE crm.customer;
CREATE TABLE crm.commissions;

Now, joe needs to query customer and employee without qualification...
HR.EMPLOYEE is the common table that, with the exception of the CRM
module, the application refers to simply as EMPLOYEE.  Now what
does Joe do:

SET search_path TO ..., HR, CRM;

OK, they still have the tables named correctly but they have to
manually make sure they order search_path.  Now, you tell me
(without qualification) how Joe can access the CRM commissions
table?  They can't.

With synonyms, the search path for Joe would remain $user, public and one could easily do
CREATE SYNONYM public.employee FOR hr.employee;
CREATE SYNONYM public.commissions FOR crm.commissions;

As Joe: SELECT * FROM EMPLOYEE becomes SELECT * FROM HR.EMPLOYEE
As Joe: SELECT * FROM COMMISSIONS becomes SELECT * FROM CRM.COMMISSIONS

I guess synonym searching could be done iff no object were found in the
current search.  I don't know why I thought it would be just as
costly (perhaps too much Sam Adams).  The worst-case scenario
would be an additional search only if an object weren't found in a
catalog search, basically this would be the cost of using synonyms and
wouldn't affect performance for everyone else.  Oracle does have a
small cost only when using synonyms as well.

-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Andreas Pflug

Neil Conway wrote:




I'm still unconvinced that this feature would be sufficiently useful to
justify the maintenance burden, in addition to the added complexity:
even if it is implemented in a way that imposes minimal *runtime*
overhead, new features add complexity: introducing a bunch of new DDL
commands and a new concept ("synonyms") makes the system more difficult
for users to understand.


Synonyms appear to me a little like domains. I like them to abstract 
from proprietary data types. Similar, leightweight synonyms (pg_class 
entries) allow some abstraction if needed, without using rules. I don't 
think that synonyms are more difficult to understand than domains.


Regards,
Andreas

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread Greg Stark

David Fetter <[EMAIL PROTECTED]> writes:

> Not everybody's editor/mailer/whatever does this right, and it makes
> things fragile.  Another way to do this is to change the delimter to a
> printable character like '|', but that raises hackles, too.

Frankly if you're passing you data through an editor/mailer/whatever you don't
trust then your setup is already fragile. At least if you're using tabs then
you find out about these problems. Tiptoeing around the untrustworthy process
just means that it'll fail randomly (and unpredictably) when other characters
appear in the data that the software doesn't handle.

There are certainly cases where you'll need to do this to interface with other
(amateurish) software. But pg_dump isn't for that at all. Even COPY isn't a
general purpose data formatter. To interface with other software not using a
standard format you're going to have to pass the data through Perl or
something like that anyways.

-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread David Fetter
On Wed, Mar 08, 2006 at 11:26:00AM -0500, Tom Lane wrote:
> David Fetter <[EMAIL PROTECTED]> writes:
> > On Wed, Mar 08, 2006 at 11:03:00AM -0500, Tom Lane wrote:
> >> Did anyone provide a convincing use case for this?
> 
> > I've had one so far, and it was enough to cause me to make a
> > special patched version of pg_dump.  To get some idea of how
> > drastic that was, consider that I think it's generally bad
> > practice to compile from source because it can take you too far
> > off the "generally supported software" map.  The case I had was
> > making a database with a schema and initial data whose dump output
> > gets checked into a source code management system.
> 
> So?  Don't tell me your SCMS can't handle tabs.

Not everybody's editor/mailer/whatever does this right, and it makes
things fragile.  Another way to do this is to change the delimter to a
printable character like '|', but that raises hackles, too.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> On Wed, Mar 08, 2006 at 11:03:00AM -0500, Tom Lane wrote:
>> Did anyone provide a convincing use case for this?

> I've had one so far, and it was enough to cause me to make a special
> patched version of pg_dump.  To get some idea of how drastic that was,
> consider that I think it's generally bad practice to compile from
> source because it can take you too far off the "generally supported
> software" map.  The case I had was making a database with a schema and
> initial data whose dump output gets checked into a source code
> management system.

So?  Don't tell me your SCMS can't handle tabs.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread David Fetter
On Wed, Mar 08, 2006 at 11:10:04AM -0500, Neil Conway wrote:
> On Wed, 2006-03-08 at 07:47 -0800, David Fetter wrote:
> > From the earlier discussion, it appears that there is a variety of
> > opinions on what the COPY delimiter should be in pg_dump.  This patch
> > allows people to set it and the NULL string.
> 
> I'm still not convinced there is a reasonable use-case for this feature.
> I can't recall: did the previous discussion conclude that we actually
> want this functionality?

The previous discussion showed that there is a wide diversity of
opinions on what The Right Delimiter and The Right NULL String(TM)
are.

Thanks for the tips. :)  I'll make a revised patch this evening, time
permitting.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread David Fetter
On Wed, Mar 08, 2006 at 11:03:00AM -0500, Tom Lane wrote:
> David Fetter <[EMAIL PROTECTED]> writes:
> > From the earlier discussion, it appears that there is a variety of
> > opinions on what the COPY delimiter should be in pg_dump.  This
> > patch allows people to set it and the NULL string.
> 
> Did anyone provide a convincing use case for this?

I've had one so far, and it was enough to cause me to make a special
patched version of pg_dump.  To get some idea of how drastic that was,
consider that I think it's generally bad practice to compile from
source because it can take you too far off the "generally supported
software" map.  The case I had was making a database with a schema and
initial data whose dump output gets checked into a source code
management system.  Those initial data sets, which can change--for
example when the corresponding ISO codes do--may be in many different
tables, so the easiest way to do this is to make the dump file as easy
as possible to edit.

> It's of zero value from the perspective of pg_dump itself; the only
> possible argument is that it makes it easier for program-foo to
> parse the output of pg_dump.  But I don't see any programs around to
> parse arbitrary SQL scripts, especially not the pretty-PG-specific
> scripts that pg_dump emits.

It's less about program-foo parsing than about multi-table data
management, as above.  However, I'm sure that there are people who
will find other uses for it.

> I think it much more likely that people needing this sort of thing would
> be using something like "psql -c 'copy foo to stdout'", so as to get the
> data without any added overhead.

The one-table-at-a-time approach is quite error-prone for large
numbers of tables and/or large data sets.

> So this seems like mere creeping featurism to me.  pg_dump has too
> many switches already.

I've been careful to see to it that only people who use the switches
are affected by it.  I am also volunteering to do ongoing maintenance
of this feature. :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread Neil Conway
On Wed, 2006-03-08 at 07:47 -0800, David Fetter wrote:
> From the earlier discussion, it appears that there is a variety of
> opinions on what the COPY delimiter should be in pg_dump.  This patch
> allows people to set it and the NULL string.

I'm still not convinced there is a reasonable use-case for this feature.
I can't recall: did the previous discussion conclude that we actually
want this functionality?

> *** src/bin/pg_dump/pg_dump.c   5 Mar 2006 15:58:50 -   1.433
> --- src/bin/pg_dump/pg_dump.c   6 Mar 2006 07:32:12 -
> ***
> *** 114,119 
> --- 114,125 
>   /* flag to turn on/off dollar quoting */
>   static intdisable_dollar_quoting = 0;
>   
> + /* Things used when caller invokes COPY options. */
> + #define ARG_COPY_DELIMITER 2
> + #define ARG_COPY_NULL 3
> + char *copy_delimiter = "\t";
> + char *copy_null;
> + 

The variables should be declared static.
  
>   static void help(const char *progname);
>   static NamespaceInfo *findNamespace(Oid nsoid, Oid objoid);
> ***
> *** 181,186 
> --- 187,193 
>  ExecStatusType expected);
>   
>   
> + 
>   int
>   main(int argc, char **argv)
>   {
> ***
> *** 211,217 
> char   *outputSuperuser = NULL;
>   
> RestoreOptions *ropt;
> ! 
> static struct option long_options[] = {
> {"data-only", no_argument, NULL, 'a'},
> {"blobs", no_argument, NULL, 'b'},
> --- 218,224 
> char   *outputSuperuser = NULL;
>   
> RestoreOptions *ropt;
> !   
> static struct option long_options[] = {
> {"data-only", no_argument, NULL, 'a'},
> {"blobs", no_argument, NULL, 'b'},

Please review patches and eliminate content-free hunks like these before
submitting.

> ***
> *** 427,432 
> --- 464,479 
> }
> }
>   
> +   if (copy_null == NULL)
> +   copy_null = malloc(3);
> +   strcpy(copy_null, "\\N");

You're missing some braces.

> +   if (strstr(copy_null, copy_delimiter))
> +   {
> +   fprintf(stderr, _("In %s, the NULL AS string cannot
> contain the COPY delimiter.\n"), progname);
> +   exit(1);
> +   }

I'm not sure as to whether you should be using write_msg() or fprintf()
here, but we should probably pick one and be consistent. Also ISTM we
should be to refactor the code to use exit_nicely() anyway, provided
that g_conn is initialized to NULL before we have connected to the DB.

> ***
> *** 702,707 
> --- 749,756 
>  "   use SESSION
> AUTHORIZATION commands instead of\n"
>  "   OWNER TO commands
> \n"));
>   
> +   printf(_("  --copy-delimiter string to use as column
> DELIMITER in COPY statements\n"));

Capitalizing "DELIMITER" here is not good style, IMHO: it is just a
normal word.

> *** 844,849 
> --- 893,904 
> int ret;
> char   *copybuf;
> const char *column_list;
> +   char *local_copy_delimiter;
> +   char *local_copy_null;
> +   local_copy_delimiter = malloc(2*strlen(copy_delimiter)+1);
> +   PQescapeString (local_copy_delimiter, copy_delimiter,
> 2*strlen(copy_delimiter)+1);
> +   local_copy_null = malloc(2*strlen(copy_null)+1);
> +   PQescapeString (local_copy_null, copy_null,
> 2*strlen(copy_null)+1);

Spacing: spaces around operands to mathematical operators, no spaces
before the parameter list to a function call.

You should also fix this compiler warning:

[...]/pg_dump.c:440: warning: format '%d' expects type 'int', but
argument 4 has type 'size_t'

-Neil



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] [HACKERS] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> From the earlier discussion, it appears that there is a variety of
> opinions on what the COPY delimiter should be in pg_dump.  This patch
> allows people to set it and the NULL string.

Did anyone provide a convincing use case for this?  It's of zero value
from the perspective of pg_dump itself; the only possible argument is
that it makes it easier for program-foo to parse the output of pg_dump.
But I don't see any programs around to parse arbitrary SQL scripts,
especially not the pretty-PG-specific scripts that pg_dump emits.

I think it much more likely that people needing this sort of thing would
be using something like "psql -c 'copy foo to stdout'", so as to get the
data without any added overhead.

So this seems like mere creeping featurism to me.  pg_dump has too many
switches already.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Neil Conway
On Wed, 2006-03-08 at 08:16 -0500, Stephen Frost wrote:
> If all you were doing with the view was reading from it then sure...
> Creating the rules for all the views you want to make them updatable
> wouldn't be fun.

Well, updateable views are on the TODO list: expending our finite
development resources implementing those would add plainly useful
functionality and improve our conformance with the SQL standard, neither
of which can be said about synonyms.

> It'd be nice to be able to support them in Postgres without
> having to go through alot of work.

I'm still unconvinced that this feature would be sufficiently useful to
justify the maintenance burden, in addition to the added complexity:
even if it is implemented in a way that imposes minimal *runtime*
overhead, new features add complexity: introducing a bunch of new DDL
commands and a new concept ("synonyms") makes the system more difficult
for users to understand.

-Neil



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PATCHES] Add switches for DELIMITER and NULL in pg_dump COPY

2006-03-08 Thread David Fetter
Folks,

>From the earlier discussion, it appears that there is a variety of
opinions on what the COPY delimiter should be in pg_dump.  This patch
allows people to set it and the NULL string.  Thanks to Gavin Sherry
for help with the pointers :)

I didn't patch pg_dumpall, but it would be trivial if there's a use
case.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!
Index: doc/src/sgml/ref/pg_dump.sgml
===
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
retrieving revision 1.81
diff -c -r1.81 pg_dump.sgml
*** doc/src/sgml/ref/pg_dump.sgml   1 Nov 2005 21:09:50 -   1.81
--- doc/src/sgml/ref/pg_dump.sgml   6 Mar 2006 07:32:05 -
***
*** 163,168 
--- 163,208 
   
  
   
+   --copy-delimiter=delimiter
+   
+
+ Use delimiter
+ instead of the default tab character in COPY 
statements. 
+
+   
+  
+ 
+  
+   --copy-null=string_for_nulls
+   
+
+ Use  string_for_nulls 
instead of the
+ default \N in COPY statements.
+
+   
+  
+ 
+  
+   --copy-delimiter=delimiter
+   
+
+ Use delimiter
+ instead of the default tab character in COPY 
statements. 
+
+   
+  
+ 
+  
+   --copy-null=string_for_nulls
+   
+
+ Use  string_for_nulls 
instead of the
+ default \N in COPY statements.
+
+   
+  
+ 
+  
-d
--inserts

Index: src/bin/pg_dump/pg_dump.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.433
diff -c -r1.433 pg_dump.c
*** src/bin/pg_dump/pg_dump.c   5 Mar 2006 15:58:50 -   1.433
--- src/bin/pg_dump/pg_dump.c   6 Mar 2006 07:32:12 -
***
*** 114,119 
--- 114,125 
  /* flag to turn on/off dollar quoting */
  static intdisable_dollar_quoting = 0;
  
+ /* Things used when caller invokes COPY options. */
+ #define ARG_COPY_DELIMITER 2
+ #define ARG_COPY_NULL 3
+ char *copy_delimiter = "\t";
+ char *copy_null;
+ 
  
  static void help(const char *progname);
  static NamespaceInfo *findNamespace(Oid nsoid, Oid objoid);
***
*** 181,186 
--- 187,193 
 ExecStatusType expected);
  
  
+ 
  int
  main(int argc, char **argv)
  {
***
*** 211,217 
char   *outputSuperuser = NULL;
  
RestoreOptions *ropt;
! 
static struct option long_options[] = {
{"data-only", no_argument, NULL, 'a'},
{"blobs", no_argument, NULL, 'b'},
--- 218,224 
char   *outputSuperuser = NULL;
  
RestoreOptions *ropt;
!   
static struct option long_options[] = {
{"data-only", no_argument, NULL, 'a'},
{"blobs", no_argument, NULL, 'b'},
***
*** 249,254 
--- 256,269 
{"disable-dollar-quoting", no_argument, 
&disable_dollar_quoting, 1},
{"disable-triggers", no_argument, &disable_triggers, 1},
{"use-set-session-authorization", no_argument, 
&use_setsessauth, 1},
+   
+   /*
+* The following options don't have an equivalent short option
+* letter, and are not available as -X long-name.  Just use
+* the long form.
+*/
+   {"copy-delimiter", required_argument, NULL, ARG_COPY_DELIMITER},
+   {"copy-null", required_argument, NULL, ARG_COPY_NULL},
  
{NULL, 0, NULL, 0}
};
***
*** 418,423 
--- 433,460 
break;
/* This covers the long options equivalent to 
-X xxx. */
  
+   case ARG_COPY_DELIMITER:
+   if ( strlen(optarg) != 1)
+   {
+   fprintf(stderr, _("In %s, 
copy-delimiter must be exactly one byte long, not %d.\n"),
+   progname, 
strlen(optarg));
+   exit(1);
+   }
+   if (*optarg == '\r' || *optarg == '\n' ||
+   *optarg == '\\')
+   {
+   fprintf(stderr, _("In %s, 
copy-delimiter may not be any of \\r, \\n or \\.\n"),
+   progname);
+   exit(1);
+   }
+   copy_delimiter = optarg;
+   break;
+ 
+   case ARG_C

Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Stephan Szabo
On Wed, 8 Mar 2006, Jonah H. Harris wrote:

> On 3/8/06, Stephan Szabo <[EMAIL PROTECTED]> wrote:
> >
> > Doesn't that pretty much go against the (I thought) outstanding behavioral
> > question of whether the synonyms are scoped and obey search path?  If they
> > do, I don't see how the above rule can hold since finding the "real table"
> > is insufficient to know if there's an earlier synonym.
> >
>
> There is a cost for synonyms no matter how you look at it.

Yes, however there are two slightly separate discussions going on and I
think you're taking them as a single discussion. One is about the feature
in general. One is about this patch in particular and the approach taken
in it. I'm mostly talking about the latter and specifically, what are the
costs of this particular way of implementing synonyms for people not using
the feature.

Even if we want a feature, there's a particular patch (or particular
patches) that implement the feature that come for review. If one thinks
the approach taken by the patch gives costs that are greater than our
desire for the feature, then one argues against it.  That's irrespective
of whether that person believes in the feature as a whole.  If you notice,
AFAICS I haven't said, "we shouldn't implement synonyms" or "synonyms are
unnecessary", but instead effectively "what are the costs of implementing
synonyms" and "that analysis of the cost for this approach seems wrong".

> Assume your user has it's own schema, that there is a synonym in public for
> EMPLOYEE which pointed to HR.EMPLOYEE, and your search path is
> $user,public.  If you do a SELECT * FROM EMPLOYEE, the search order is still
> the same as it is in PostgreSQL now, there's no EMPLOYEE table in the $user
> schema, so when it gets to searching public, it finds the synonym.  The only
> alternative in this scenario is to create the EMPLOYEE table in public
> (which is pretty stupid in most cases), or to set the search path to
> $user,public,hr.  Again, this doesn't cover the "same-named tables in
> multiple schemas" argument, but it does illustrate that PostgreSQL's
> namespace scoping remains the same.

IMHO, that's insufficient analysis, precisely for the reason it doesn't
cover multiple schemas with the same objct.

If your search path is A,B and there is a B.EMPLOYEE table and an
A.EMPLOYEE synonym to HR.EMPLOYEE, which table does select * from EMPLOYEE
read?  If the behavior is find the table through the whole path, then find
the synonym, it's B.EMPLOYEE and there's only cost to people who aren't
using the feature for error cases. If the behavior is find either table or
synonym in each search path entry in order (thus HR.EMPLOYEE) and finding
synonyms requires a separate search of the catalogs, then it seems like
everyone pays whether or not they are using the feature.

So far, there have been statements made that the cost to people not using
the feature is minimal in this approach because the extra search only
happens if the table isn't found. However, I still am not seeing how that
approach gives the second behavior (assuming that's what we want). To
discuss how to implement a feature we need at least an understanding of
what the behavior an approach implements and the costs that approach
incurs.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Jonah H. Harris
GRR, not enough coffee yet today :(

Jonah now enjoys some SumatraOn 3/8/06, Jonah H. Harris <[EMAIL PROTECTED]
> wrote:I agree wholeheartedly.

regarding "we should take a fresh look at the problem rather than just blindly copying a construct from Oracle".
SELECT dblink('host=somehost dbname=remotedb ...', 'SELECT
emp_id,first_name,last_name,middle_name,birth_dt,ssn,... FROM EMPLOYEE
WHERE last_name = ''Blow''') AS t1(emp_id BIGINT, first_name VARCHAR,
last_name VARCHAR, middle_name VARCHAR, birth_dt DATE, ssn NUMERIC, ...)
should be SELECT * FROM ... 
-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Jonah H. Harris
On 3/8/06, Neil Conway <[EMAIL PROTECTED]> wrote:
I'm inclined to agree. The points raised about the difficulties ofmanaging large numbers of schemas are legitimate, but I don't see thatsynonyms are a very effective solution. If we're going to make it lesspainful to work on applications with many tens of schemas, that's a
worthwhile project, but I think we should take a fresh look at theproblem rather than just blindly copying a construct from Oracle.
I agree wholeheartedly.

As for the idea that because we're not supporting database links (which
synonyms are used for a lot) we shouldn't support synonyms, now that we
have 2PC, it's now possible to reimplement the dblink contrib module
into something which supported nicer database linking.  Don't get
me wrong, it certainly would be a task, but it's doable and I know a
number of people who use the dblink contrib module on a daily basis for
data copying and remote querying.

I don't know anyone that really likes typing:

SELECT dblink('host=somehost dbname=remotedb ...', 'SELECT
emp_id,first_name,last_name,middle_name,birth_dt,ssn,... FROM EMPLOYEE
WHERE last_name = ''Blow''') AS t1(emp_id BIGINT, first_name VARCHAR,
last_name VARCHAR, middle_name VARCHAR, birth_dt DATE, ssn NUMERIC, ...)

instead of:

SELECT * FROM [EMAIL PROTECTED] WHERE last_name = 'Blow';

If that's not bad enough, just try to do a lot of dynamic work using
database links using the contrib module... it's not easy or efficient
having to create types, functions, views, and rules to do dynamic work.

This is another discussion in and of itself, but I don't think
supporting nicer database links is a discussion that's too far off
either.  I'm glad we have the contrib module, but there's a lot of
nicer things we could do there as well.  Not because it's an
Oracle thing, but because it's great functionality to have.

-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Jonah H. Harris
On 3/8/06, Stephan Szabo <[EMAIL PROTECTED]> wrote:
Doesn't that pretty much go against the (I thought) outstanding behavioralquestion of whether the synonyms are scoped and obey search path?  If theydo, I don't see how the above rule can hold since finding the "real table"
is insufficient to know if there's an earlier synonym.
There is a cost for synonyms no matter how you look at it.

Assume your user has it's own schema, that there is a synonym in public
for EMPLOYEE which pointed to HR.EMPLOYEE, and your search path is
$user,public.  If you do a SELECT * FROM EMPLOYEE, the search
order is still the same as it is in PostgreSQL now, there's no EMPLOYEE
table in the $user schema, so when it gets to searching public, it
finds the synonym.  The only alternative in this scenario is to
create the EMPLOYEE table in public (which is pretty stupid in most
cases), or to set the search path to $user,public,hr.  Again, this
doesn't cover the "same-named tables in multiple schemas" argument, but
it does illustrate that PostgreSQL's namespace scoping remains the same.

The question is whether we want to offer the functionality and what the least intrusive way to handle it is.
-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Jonah H. Harris
On 3/7/06, Tom Lane <[EMAIL PROTECTED]> wrote:
To be blunt, those people aren't going to be moving to Postgres anyhow.If the notion of fixing this issue daunts them, they are not going to bewilling to deal with the other incompatibilities between Oracle and PG.

IMHO, we're not discussing incompatibilities.  We're discussing
functionality which PostgreSQL does not offer.  The real question
is should PostgreSQL offer similar functionality?
And we are *not* buying into the notion of becoming a bug-compatibleOracle clone.

I don't think I said anything about that.  Synonyms are not a bug,
they are a feature which offers functionality that PostgreSQL (in some
cases) cannot.  I'm not saying we should clone synonyms just
because Oracle has them; instead, I think we could find a way to offer
similar functionality regardless of who/where the idea came from.

Like it or not, people use the proprietary functionality bigger vendors
offer; that's partly why the big vendors exist.  When people move
towards an open source database they generally look at PostgreSQL first
as we are known for being the, "most advanced open source database" but
in the same breath we way things like "we're not going to copy
Oracle/SQL Server/Sybase/DB2 features."  You're right, we all
agree that we shouldn't just add something because [insert database
vendor name here] has it, but we should at least have an open mind and
look at it from a functionality/migration perspective before dismissing
it.

-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Stephen Frost
* Neil Conway ([EMAIL PROTECTED]) wrote:
> BTW, AFAICS synonyms for tables and views can be approximated by views:
> you pay a small runtime hit to expand the view definition, but that's
> fairly cheap for a simple view. Synonyms for functions can be
> approximated by shell functions defined in SQL -- those can even be
> inlined, reducing the amount of overhead. It's not as easy to define
> synonyms for sequences, but that is hardly justification for the
> feature.

If all you were doing with the view was reading from it then sure...
Creating the rules for all the views you want to make them updatable
wouldn't be fun.  That's actually the one thing I like about synonyms
over create view a as select * from b.a;

We've got a number of synonyms in our Oracle databases (no
cross-database ones or anything like that...) and I'd like to get rid of
them but the folks using them aren't too keen on that for various
reasons...  It'd be nice to be able to support them in Postgres without
having to go through alot of work.  Tom's suggestion for just an
additional catalog entry for them would be exactly what I'd need and
seems pretty straight-forward and simple to me.

Just my 2c.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [PATCHES] CREATE SYNONYM ...

2006-03-08 Thread Neil Conway
On Tue, 2006-03-07 at 17:14 -0500, Tom Lane wrote:
> (Actually, I don't think the case for table synonyms has been made
> adequately either; "Oracle has it" is *not* enough reason to take on
> another feature that we'll have to maintain forever, especially given
> that we're being told that one of the major use-cases for synonyms
> isn't going to be supported.

I'm inclined to agree. The points raised about the difficulties of
managing large numbers of schemas are legitimate, but I don't see that
synonyms are a very effective solution. If we're going to make it less
painful to work on applications with many tens of schemas, that's a
worthwhile project, but I think we should take a fresh look at the
problem rather than just blindly copying a construct from Oracle.

BTW, AFAICS synonyms for tables and views can be approximated by views:
you pay a small runtime hit to expand the view definition, but that's
fairly cheap for a simple view. Synonyms for functions can be
approximated by shell functions defined in SQL -- those can even be
inlined, reducing the amount of overhead. It's not as easy to define
synonyms for sequences, but that is hardly justification for the
feature.

-Neil



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings