Re: [HACKERS] C function proposal redux

2000-12-12 Thread Karel Zak


On Wed, 13 Dec 2000, mlw wrote:

> I propose we modify C functions for 7.2. 

 Too simple imagine anything, but often difficult do it :-)

> (1) C functions should be able to return multiple values.

 for 7.2 / 7.3 are planned functions return tuples, but do it 
is really diffucult. See the current source...

> (2) A setup and breakdown function should be able to be called
> surrounding the query set in which a function is called. This allows
> constructors and destructors.

 Why? Can you show any example where is it needful? If you really
need an init/destroy, you can use:

SELECT my_init();
..query...
SELECT my_destroy();

> (3) A function should be able to tell Postgres how to use it. For
> instance:
> 
> select * from table where column = function();
> 
> Should be able to instruct Postgres to either take the value returned
> and search that one value (allowing index match against the value), or
> perform a table scan against the function each time. Both behaviors are
> important. Currently a function seems to force a table scan.

Here I not undestand. We have 'iscacheable' - or what you mean? 


Karel




[HACKERS] Bug in ILIKE function?

2000-12-12 Thread Christopher Kings-Lynne

Hi,

I have just tried using the ILIKE function in 7.0.3.  I assume that it is
just a case-insensitive version of LIKE.  (Please correct me if I am wrong
on this assumption.)

This is my example test case:

usa=# select 'test' LIKE '%es%';
 ?column?
--
 t
(1 row)

usa=# select 'test' ILIKE '%es%';
ERROR:  parser: parse error at or near "ilike"
usa=#

HEre is a dump (\do) of the some of the tilde operators in 7.0.3:

 ~*  | bpchar  | text| bool  | matches regex.,
case-insensitive
 ~*  | name| text| bool  | matches regex.,
case-insensitive
 ~*  | text| text| bool  | matches regex.,
case-insensitive
 ~*  | varchar | text| bool  | matches regex.,
case-insensitive
 ~=  | box | box | bool  | same as
 ~=  | circle  | circle  | bool  | same as
 ~=  | point   | point   | bool  | same as
 ~=  | polygon | polygon | bool  | same as
 ~=  | tinterval   | tinterval   | bool  | same as
 ~~  | bpchar  | text| bool  | matches LIKE expression
 ~~  | name| text| bool  | matches LIKE expression
 ~~  | text| text| bool  | matches LIKE expression
 ~~  | varchar | text| bool  | matches LIKE expression

Notice that there's no ILIKE operators, (~~*), at all!

Is this documented, but not implemented or what

Chris




[HACKERS] C function proposal redux

2000-12-12 Thread mlw

I propose we modify C functions for 7.2. 

( I'll volunteer to do as much as I can figure out ;-)

(1) C functions should be able to return multiple values.

(2) A setup and breakdown function should be able to be called
surrounding the query set in which a function is called. This allows
constructors and destructors.

(3) A function should be able to tell Postgres how to use it. For
instance:

select * from table where column = function();

Should be able to instruct Postgres to either take the value returned
and search that one value (allowing index match against the value), or
perform a table scan against the function each time. Both behaviors are
important. Currently a function seems to force a table scan.

Estimates:
1 may be difficult. 2 should be easy enough. 3, depending on the code
dependencies, could either be very hard or easy. (my guess is that it
would be hard)

-- 
http://www.mohawksoft.com



Re: [HACKERS] Re: COPY BINARY file format proposal

2000-12-12 Thread Tom Lane

[EMAIL PROTECTED] (Nathan Myers) writes:
> I don't know if you get the point of the fixed-size comment field.  
> The idea was that a comment could be poked into an existing COPY 
> image, after it was written.

Yes, I did get the point ...

> A variable-size comment field in an
> already-written image might leave no space to poke in anything.  A 
> variable-size comment field with a required minimum size would 
> satisfy both needs, at some cost in complexity.  

This strikes me as a perfect argument for a variable-size field.
If you want to leave N bytes for a future poked-in comment, you do that.
If you don't, then not.  Leaving 128 bytes (or any other frozen-by-the-
file-format number) is guaranteed to satisfy nobody.

regards, tom lane



Re: [HACKERS] external function proposal for 7.2

2000-12-12 Thread Tom Lane

mlw <[EMAIL PROTECTED]> writes:
> I just have to find where I call the exit function.

That will be the hard part.

FmgrInfo is not currently considered a durable data structure, and I
think you will be in for grief if you try to make any guarantees about
what will happen when one disappears.  If you need a cleanup proc to
be called, I'd suggest looking into registering it to be called at
query completion and/or transaction cleanup/abort, as needed.

Most of the sorts of resources you might need to clean up already have
cleanup mechanisms, so it's not entirely clear that you even *need*
a cleanup proc.  Maybe a different way to say that is that Postgres
already has a pretty well-defined cleanup philosophy, and it's geared
to particular resources (memory, open files, etc) not to individual
called functions.  You should consider swimming with that tide rather
than against it.

I have no objection to adding another field to FmgrInfo for the callee's
use, if you can show an example or two where it'd be useful.  I'm only
concerned about the callback-on-delete part.  That sounds like a recipe
for fragility...

regards, tom lane



Re: [HACKERS] RFC: CRC datatype

2000-12-12 Thread Horst Herb

O
> > Always remember that a psotgres data base on the harddisk can be
> > manipulated accidentally / maliciously without postgres even running.
> > These are the cases where you need row level CRCs.
>
> "There is no security without physical security."  If somebody can
> change the row contents, they can also change the row and/or block
> checksum to match.

They may, but in a proper setup they won't be able to access the CRC log 
files. That way, you can still detect alterations. I presume anyway that most 
alterations would be rather accidental than malicious, and in that case the 
CRC is extremely helpful

Horst



[HACKERS] Creating a 'SET' type

2000-12-12 Thread Christopher Kings-Lynne

Hi,

I am trying to emulate MySQL's SET type, by creating a new postgresql type.

However, is it possible to create a type that has different parameters
wherever it is used.

For instance - the varchar type takes as a parameter the max characters in
the field.  Although there is only one varchar type, it has different
properties depending on whether or not it is varchar(5) or varchar(20).

I wish to be able to declare:

bitset('LOW','MEDIUM','HIGH')  // Not sure of exact syntax

Internally stored as an int4.

The trouble is in writing the in and out functions.  They need to be able to
store a list of token names in order to recreate the comma delimited list of
tokens from the internal bitset, and vice versa...

Any help?

Thanks,

Chris

--
Christopher Kings-Lynne
Family Health Network (ACN 089 639 243)




Re: [HACKERS] external function proposal for 7.2

2000-12-12 Thread Vincent AE Scott


As a lurker on the list this post caught my eye somewhat.  I think this
would be excellent functionality to have in postgres, i was considering
doing something like this in a non intruse manner, by manipulating
_init() and _fini functions of shared libraries.  But what you have
described below is a much better interface.  In particular i was looking
at a way of getting async notifications when a row had been inserted, and
pasing out to my other applications enough data, to be able to query back
in for the complete row.

The ability to have an init/exit for an external function would be a big
win, you could even have the init() create a thread for passing results
to, and performing what ever voodoo magic you wanted.

i'll go back to lurking and listening now.



On Tue, 12 Dec 2000, mlw wrote:

> I think the newC function idea is pretty good, however, what would be
> great is just one more step of protocol, perhaps an API verson 2 or 3:
> 
> One thing than makes writing a non-trivial function a bit problematic,
> and perhaps even less efficient, is that the function does not know when
> it is first run and when it is finished, and there is no facility to
> manage contextual information. This limits external functons having to
> be  fairly simple, or overly complex.
> 
> I propose that when the newC structure is allocated that a function
> specific "Init" function be called, and when the structure is being
> freed, calling a "Exit" function. The new C structure should also have a
> void pointer that allows persistent information to be passed around.
> 
> typedef struct
> {
> FmgrInfo   *flinfo; /* ptr to lookup info used for this call
> */
> Node   *context;/* pass info about context of call */
> Node   *resultinfo; /* pass or return extra info about
> result */
> boolisnull; /* function must set true if result is
> NULL */
> short   nargs;  /* # arguments actually passed */
> Datum   arg[FUNC_MAX_ARGS];  /* Arguments passed to function */
> boolargnull[FUNC_MAX_ARGS];  /* T if arg[i] is actually NULL
> */
> 
> void *userparam;/* to be used by he function */
> 
> } FunctionCallInfoData;
> typedef FunctionCallInfoData* FunctionCallInfo;
> 
> The userparam can be used to store data, or a count, or whatever.
> 
> Datum function(PG_FUNCTION_ARGS) ;
> bool function_Init(PG_FUNCTION_ARGS);
> void function_Exit(PG_FUNCTION_ARGS);
> 
> This protocol would make writing some really cool features much easier.
> As a C++ guy, I could execute "new" at Init and "delete" at Exit. ;-)
> 
> 
> Mark.
> 



PGP key:  http://codex.net/pgp/pgp.asc




Re: [HACKERS] external function proposal for 7.2

2000-12-12 Thread mlw

Vincent AE Scott wrote:
> 
> As a lurker on the list this post caught my eye somewhat.  I think this
> would be excellent functionality to have in postgres, i was considering
> doing something like this in a non intruse manner, by manipulating
> _init() and _fini functions of shared libraries.  But what you have
> described below is a much better interface.  In particular i was looking
> at a way of getting async notifications when a row had been inserted, and
> pasing out to my other applications enough data, to be able to query back
> in for the complete row.
> 
> The ability to have an init/exit for an external function would be a big
> win, you could even have the init() create a thread for passing results
> to, and performing what ever voodoo magic you wanted.
> 
> i'll go back to lurking and listening now.

I did some code spelunking today. It will not be easy, but I think it is
quite doable. Currently, in the code, a function pointer is passed
around. If I resurrect some of the "old" C code a bit, and do some
merging with the new code we could do it. I just have to find where I
call the exit function.

As far as I can see, the code passes around a function pointer, but
seems to mostly call a small number of localized functions to dispatch
the call. So, I was thinking, rather than pass the function, why not
pass the structure? The old C code stuff does this, why not keep it
around, and pass around the finfo struct instead? and call
(*finfo->funct)(args)?

> 
> On Tue, 12 Dec 2000, mlw wrote:
> 
> > I think the newC function idea is pretty good, however, what would be
> > great is just one more step of protocol, perhaps an API verson 2 or 3:
> >
> > One thing than makes writing a non-trivial function a bit problematic,
> > and perhaps even less efficient, is that the function does not know when
> > it is first run and when it is finished, and there is no facility to
> > manage contextual information. This limits external functons having to
> > be  fairly simple, or overly complex.
> >
> > I propose that when the newC structure is allocated that a function
> > specific "Init" function be called, and when the structure is being
> > freed, calling a "Exit" function. The new C structure should also have a
> > void pointer that allows persistent information to be passed around.
> >
> > typedef struct
> > {
> > FmgrInfo   *flinfo; /* ptr to lookup info used for this call
> > */
> > Node   *context;/* pass info about context of call */
> > Node   *resultinfo; /* pass or return extra info about
> > result */
> > boolisnull; /* function must set true if result is
> > NULL */
> > short   nargs;  /* # arguments actually passed */
> > Datum   arg[FUNC_MAX_ARGS];  /* Arguments passed to function */
> > boolargnull[FUNC_MAX_ARGS];  /* T if arg[i] is actually NULL
> > */
> >
> > void *userparam;/* to be used by he function */
> >
> > } FunctionCallInfoData;
> > typedef FunctionCallInfoData* FunctionCallInfo;
> >
> > The userparam can be used to store data, or a count, or whatever.
> >
> > Datum function(PG_FUNCTION_ARGS) ;
> > bool function_Init(PG_FUNCTION_ARGS);
> > void function_Exit(PG_FUNCTION_ARGS);
> >
> > This protocol would make writing some really cool features much easier.
> > As a C++ guy, I could execute "new" at Init and "delete" at Exit. ;-)
> >
> >
> > Mark.
> >
> 
> PGP key:  http://codex.net/pgp/pgp.asc

-- 
http://www.mohawksoft.com



Re: [HACKERS] RFC C++ Interface

2000-12-12 Thread Nathan Myers

On Tue, Dec 12, 2000 at 05:28:46PM -0500, Bruce Momjian wrote:
> > On Sun, Dec 10, 2000 at 06:53:11PM -0500, Bruce Momjian wrote:
> > > > I appreciate your comments and would like to respond to your
> > > > concerns. The API I sketched in my earlier e-mail is borrowed
> > > > heavily from Rogue Wave's dbtools.h++ library. I think it can be
> > > > a very clean and elegant way of accessing a database.
> > >
> > > Rogue Wave's API is quite interesting. It would be a challenge to
> > > implement. If you think you can do it, I think it would be a real
> > > win, and a real object-oriented API to PostgreSQL.
> >
> > I was co-architect of the Rogue Wave Dbtools.h++ interface design
> > ... The design is really showing its age. SQL92 and SQL3 didn't
> > exist then, and neither did the STL or the ISO 14882 C++ Language
> > standard.
>
> Can you suggest areas that should be changed?

As I recall, we were much more fond of operator overloading then than is
considered tasteful or wise today. Also, there was no standard for how
iterators ought to work, then, whereas today one needs unusually good
reasons to depart from the STL style.

Nathan Myers 
[EMAIL PROTECTED]



Re: [HACKERS] SourceForge & Postgres

2000-12-12 Thread mlw

Tom Lane wrote:
> 
> mlw <[EMAIL PROTECTED]> writes:
> > btw anyone trying this query should use: "attdispersion"
> 
> Sorry about that --- I just copied-and-pasted the query from some notes
> that are obsolete as of 7.1...
> 
> > cdinfo=# explain select * from ztitles where artistid = 10220 ;
> > NOTICE:  QUERY PLAN:
> 
> > Seq Scan on ztitles  (cost=0.00..4740.75 rows=3163 width=296)
> 
> > And this is with "-o -fs"
> 
> > Index Scan using ztitles_artistid_ndx on ztitles  (cost=0.00..5915.01
> > rows=3163 width=296)
> 
> >   attname   | attdispersion | starelid | staattnum | staop | stanullfrac | 
>stacommonfrac |  stacommonval  | staloval | stahival
> >  artistid   | 0.0477198 |19274 | 2 |97 |   0 |  
>0.149362 | 100050450  | 1| 100055325
> 
> The reason why the thing is going for a sequential scan is that
> astonishingly high stacommonfrac statistic.  Does artistid 100050450
> really account for 14.9% of all the rows in your table?  (Who is that
> anyway? ;-))  If so, a search for artistid 100050450 definitely *should*
> use a sequential scan.

I tested this statement against the database and you are right, about 14
seconds with the index, 4 without.

BTW ID # 100050450 is "Various Artists"

This is sort of a point I was trying to make in previous emails. I think
this situation, and this sort of ratio is far more likely than the
attention it has been given.

In about every project I have used postgres I have run into this. It is
only recently that I have understood what the problem was and how to get
around it (sort of).

This one entry is destroying any intelligent performance we could hope
to attain. As I said, I always see this sort of behavior in some
implementation.


>  The problem at hand is estimating the frequency
> of entries for some other artistid, given that we only have this much
> statistical info available.  Obviously the stats are insufficient, and
> I hope to do something about that in a release or two, but it ain't
> gonna happen for 7.1.  In the meantime, if you've got huge outliers
> like that, you could try reducing the value of NOT_MOST_COMMON_RATIO
> in src/backend/utils/adt/selfuncs.c.

I did some playing with this value, and I can seem to have it
differentiate between 100050450 and anything else.


-- 
http://www.mohawksoft.com



Re: [HACKERS] (one more time) Patches with vacuum fixes available .

2000-12-12 Thread Nathan Myers

On Mon, Dec 11, 2000 at 11:32:17PM -0500, Tom Lane wrote:
> The Hermit Hacker <[EMAIL PROTECTED]> writes:
> > worst case, we pull it out afterwards ...
> 
> No, worst case is that we release a seriously broken 7.1, and don't
> find out till afterwards.
> 
> There are plenty of new features on my to-do list, if beta no longer
> means anything...

Does this mean the code gets put in contrib/, with a prominent
pointer in the release notes?

Nathan Myers
[EMAIL PROTECTED]



Re: [HACKERS] Re: COPY BINARY file format proposal

2000-12-12 Thread Nathan Myers

On Sun, Dec 10, 2000 at 08:51:52PM -0500, Tom Lane wrote:
> [EMAIL PROTECTED] (Nathan Myers) writes:
> > I'd like to see a timestamp for when the image was created, and a 
> > 128-byte comment field to allow annotations, even after the fact.
> 
> Both seem like reasonable options.  If you don't mind, however,
> I'd suggest that they be left for inclusion as chunks in the header
> extension area, rather than nailing them down in the fixed header.
> 
> The advantage of handling a comment that way is obvious: it needn't
> be fixed-length.  As for the timestamp, handling it as an optional
> chunk would allow graceful substitution of a different timestamp
> format, which we'll need when 2038 begins to loom.

I don't know if you get the point of the fixed-size comment field.  
The idea was that a comment could be poked into an existing COPY 
image, after it was written.  A variable-size comment field in an
already-written image might leave no space to poke in anything.  A 
variable-size comment field with a required minimum size would 
satisfy both needs, at some cost in complexity.  
 
> Basically what I want to do at the moment is get a minimal format
> spec nailed down for 7.1.  There'll be time for neat extras later
> as long as we get it right now --- but there's not a lot of time
> for extras before 7.1.

I understand.

Nathan Myers
[EMAIL PROTECTED]



[HACKERS] Need help with redefining locales

2000-12-12 Thread Hannu Krosing

I'm having trouble with like et.al. as there is no single character 
in et_EE locale (on linux at least) that is bigger than all the others.

I would like to modify my locale definition files so that char(255) 
would always sort after all others but I can't find docs on modifying 
the locales

So i have a couple of questions:

1) what file must i change and how to get char(255) out of the Y-group 
  and into a separate group that sorts after all others 
 (I guess the file is /usr/share/i18n/locales/et_EE, but I'm not so 
  sure about the how part)

2) how can I then turn this file into the various LC_* files

--
Hannu



Re: [HACKERS] RFC C++ Interface

2000-12-12 Thread Bruce Momjian

> On Sun, Dec 10, 2000 at 06:53:11PM -0500, Bruce Momjian wrote:
> > > I appreciate your comments and would like to respond to your concerns.
> > > The API I sketched in my earlier e-mail is borrowed heavily from
> > > Rogue Wave's dbtools.h++ library.  I think it can be a very clean and
> > > elegant way of accessing a database.
> > 
> > Rogue Wave's API is quite interesting.  It would be a challenge to
> > implement.  If you think you can do it, I think it would be a real win,
> > and a real object-oriented API to PostgreSQL.
> 
> I was co-architect of the Rogue Wave Dbtools.h++ interface design (along 
> with somebody who actually knew something about databases, Stan Sulsky) in 
> the early 90's.  We really tried to make the "Datum" type unnecessary in 
> normal programs.  To my disgrace, I didn't participate in implementation;
> it was implemented mainly by Lars Lohn, who went on to a stellar career 
> as a consultant to users of the library.
> 
> At the time, ODBC was just beginning to be used.  Oracle was already
> a bully, actually moreso than today; we had to buy a full production 
> license just to develop on it.  Ingres was much better; they sent two
> engineers to do the port themselves.  
> 
> The design is really showing its age.  SQL92 and SQL3 didn't exist then,
> and neither did the STL or the ISO 14882 C++ Language standard.

Can you suggest areas that should be changed?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026



Re: [HACKERS] RFC C++ Interface

2000-12-12 Thread Nathan Myers

On Sun, Dec 10, 2000 at 06:53:11PM -0500, Bruce Momjian wrote:
> > I appreciate your comments and would like to respond to your concerns.
> > The API I sketched in my earlier e-mail is borrowed heavily from
> > Rogue Wave's dbtools.h++ library.  I think it can be a very clean and
> > elegant way of accessing a database.
> 
> Rogue Wave's API is quite interesting.  It would be a challenge to
> implement.  If you think you can do it, I think it would be a real win,
> and a real object-oriented API to PostgreSQL.

I was co-architect of the Rogue Wave Dbtools.h++ interface design (along 
with somebody who actually knew something about databases, Stan Sulsky) in 
the early 90's.  We really tried to make the "Datum" type unnecessary in 
normal programs.  To my disgrace, I didn't participate in implementation;
it was implemented mainly by Lars Lohn, who went on to a stellar career 
as a consultant to users of the library.

At the time, ODBC was just beginning to be used.  Oracle was already
a bully, actually moreso than today; we had to buy a full production 
license just to develop on it.  Ingres was much better; they sent two
engineers to do the port themselves.  

The design is really showing its age.  SQL92 and SQL3 didn't exist then,
and neither did the STL or the ISO 14882 C++ Language standard.

Nathan Myers
[EMAIL PROTECTED]



Re: [HACKERS] SourceForge & Postgres

2000-12-12 Thread Tom Lane

mlw <[EMAIL PROTECTED]> writes:
> btw anyone trying this query should use: "attdispersion"

Sorry about that --- I just copied-and-pasted the query from some notes
that are obsolete as of 7.1...

> cdinfo=# explain select * from ztitles where artistid = 10220 ;
> NOTICE:  QUERY PLAN:
 
> Seq Scan on ztitles  (cost=0.00..4740.75 rows=3163 width=296)
 
> And this is with "-o -fs"

> Index Scan using ztitles_artistid_ndx on ztitles  (cost=0.00..5915.01
> rows=3163 width=296)
 
>   attname   | attdispersion | starelid | staattnum | staop | stanullfrac | 
>stacommonfrac |  stacommonval  | staloval | stahival  
>  artistid   | 0.0477198 |19274 | 2 |97 |   0 |  
>0.149362 | 100050450  | 1| 100055325

The reason why the thing is going for a sequential scan is that
astonishingly high stacommonfrac statistic.  Does artistid 100050450
really account for 14.9% of all the rows in your table?  (Who is that
anyway? ;-))  If so, a search for artistid 100050450 definitely *should*
use a sequential scan.  The problem at hand is estimating the frequency
of entries for some other artistid, given that we only have this much
statistical info available.  Obviously the stats are insufficient, and
I hope to do something about that in a release or two, but it ain't
gonna happen for 7.1.  In the meantime, if you've got huge outliers
like that, you could try reducing the value of NOT_MOST_COMMON_RATIO
in src/backend/utils/adt/selfuncs.c.

regards, tom lane



Re: [HACKERS] RFC: CRC datatype

2000-12-12 Thread Nathan Myers

On Sat, Dec 09, 2000 at 12:03:52AM +1100, Horst Herb wrote:
> AFAIK the thread for "built in" crcs referred only to CRCs in
> the transaction log. 

We have been discussing checksums for both the table blocks and for
the transaction log.

> Always remember that a psotgres data base on the harddisk can be
> manipulated accidentally / maliciously without postgres even running.
> These are the cases where you need row level CRCs.

"There is no security without physical security."  If somebody can
change the row contents, they can also change the row and/or block 
checksum to match.

Nathan Myers
[EMAIL PROTECTED]



Re: CRC was: Re: [HACKERS] beta testing version

2000-12-12 Thread Nathan Myers

On Thu, Dec 07, 2000 at 07:36:33PM -0500, Tom Lane wrote:
> [EMAIL PROTECTED] (Nathan Myers) writes:
> > 2. I disagree with way the above statistics were computed.  That eleven 
> >million-year figure gets whittled down pretty quickly when you 
> >factor in all the sources of corruption, even without crashes.  
> >(Power failures are only one of many sources of corruption.)  They 
> >grow with the size and activity of the database.  Databases are 
> >getting very large and busy indeed.
> 
> Sure, but the argument still holds.  If the net MTBF of your underlying
> system is less than a day, it's too unreliable to run a database that
> you want to trust.  Doesn't matter what the contributing failure
> mechanisms are.  In practice, I'd demand an MTBF of a lot more than a
> day before I'd accept a hardware system as satisfactory...

In many intended uses (such as Landmark's original plan?) it is not just 
one box, but hundreds or thousands.  With thousands of databases deployed, 
the MTBF (including power outages) for commodity hardware is well under a 
day, and there's not much you can do about that.

In a large database (e.g. 64GB) you have 8M blocks.  Each hash covers
one block.  With a 32-bit checksum, when you check one block, you have 
a 2^(-32) likelihood of missing an error, assuming there is one.  With 
8M blocks, you can only claim a 2^(-9) chance.

This is what I meant by "whittling".  A factor of ten or a thousand
here, another there, and pretty soon the possibility of undetected
corruption is something that can't reasonably be ruled out.


> > 3. Many users clearly hope to be able to pull the plug on their hardware 
> >and get back up confidently.  While we can't promise they won't have 
> >to go to their backups, we should at least be equipped to promise,
> >with confidence, that they will know whether they need to.
> 
> And the difference in odds between 2^32 and 2^64 matters here?  I made
> a numerical case that it doesn't, and you haven't refuted it.  By your
> logic, we might as well say that we should be using a 128-bit CRC, or
> 256-bit, or heck, a few kilobytes.  It only takes a little longer to go
> up each step, right, so where should you stop?  I say MTBF measured in
> megayears ought to be plenty.  Show me the numerical argument that 64
> bits is the right place on the curve.

I agree that this is a reasonable question.  However, the magic of 
exponential growth makes any dissatisfaction with a 64-bit checksum
far less likely than with a 32-bit checksum.

It would forestall any such problems to arrange a configure-time
flag such as "--with-checksum crc-32" or "--with-checksum md4",
and make it clear where to plug in the checksum of one's choice.
Then, ship 7.2 with just crc-32 and let somebody else produce 
patches for alternatives if they want them.

BTW, I have been looking for Free 64-bit CRC codes/polynomials and 
the closest thing I have found so far was Mark Mitchell's hash, 
translated from the Modula-3 system.  All the tape drive makers
advertise (but don't publish (AFAIK)) a 64-bit CRC.

A reasonable approach would be to deliver CRC-32 in 7.2, and then
reconsider the default later if anybody contributes good alternatives.

Nathan Myers
[EMAIL PROTECTED]



Re: [HACKERS] RFC C++ Interface

2000-12-12 Thread Randy Jonasz


Thanks for the vote of confidence.  I'm looking forward to doing it.  I've
got most of the classes needed laid out.  Once I'm done this step, I'll
post what I have for more comments, crticism, suggestions.

Cheers,

Randy

On Sun, 10 Dec 2000, Bruce Momjian wrote:

> > I appreciate your comments and would like to respond to your concerns.
> > The API I sketched in my earlier e-mail is borrowed heavily from
> > Rogue Wave's dbtools.h++ library.  I think it can be a very clean and
> > elegant way of accessing a database.
>
> Rogue Wave's API is quite interesting.  It would be a challenge to
> implement.  If you think you can do it, I think it would be a real win,
> and a real object-oriented API to PostgreSQL.
>
> --
>   Bruce Momjian|  http://candle.pha.pa.us
>   [EMAIL PROTECTED]   |  (610) 853-3000
>   +  If your life is a hard drive, |  830 Blythe Avenue
>   +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026
>
>

Randy Jonasz
Software Engineer
Click2net Inc.
Web:  http://www.click2net.com
Phone: (905) 271-3550

"You cannot possibly pay a philosopher what he's worth,
but try your best" -- Aristotle




[HACKERS] Re: [GENERAL] No postgres on Solaris

2000-12-12 Thread Chris Jones

"Martin A. Marques" <[EMAIL PROTECTED]> writes:

> It occurs either because system limit for the maximum number of
> semaphore sets (SEMMNI), or the system wide maximum number of
> semaphores (SEMMNS), would be exceeded.  You need to raise the
> respective kernel parameter.  Look into the PostgreSQL documentation
> for details.
[...]
> I looked at the FAQ_Solaris, but found nothing on this case. I remember 
> making changes to the kernel parameters when I fist installed postgres, but 
> can't remember where I found that info.

I don't remember where on the PG site I found this, but this is what
I'm using currently:

set shmsys:shminfo_shmmax=0x200
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=256
set shmsys:shminfo_shmseg=256
set semsys:seminfo_semmap=256
set semsys:seminfo_semmni=512
set semsys:seminfo_semmns=512
set semsys:seminfo_semmsl=32

These lines are all at the bottom of /etc/system.

Chris

-- 
- [EMAIL PROTECTED]
Chris JonesSRI International, Inc.



RE: [HACKERS] 7.0.3(nofsync) vs 7.1

2000-12-12 Thread Mikheev, Vadim

> >> What is the default commit delay now?
> 
> > As before 5 * 10^(-6) sec - pretty the same as sleep(0) -:)
> > Seems CommitDelay is not very useful parameter now - XLogFlush
> > logic and fsync time add some delay.
> 
> There was a thread recently about smarter ways to handle shared fsync
> of the log --- IIRC, we talked about self-tuning commit delay,
> releasing waiting processes as soon as someone else had fsync'd, etc.
> Looks like none of those ideas are in the code now.  Did you not like 
> any of those ideas, or just no time to work on it yet?

We're in beta - it's better to test WAL to find/fix bugs than make
further improvements.

Also, I've run test with 100 clients inserting records into 100 tables
(to minimize contentions) - 915 tps with fsync and 1190 tps without fsync.
So, we do ~ 18 commits per fsync now and probably we'll be able to
increase commit performance by ~ 30%, no more.

Vadim



Re: Fwd: Re: [HACKERS] HELP! foreign eys & inheritance

2000-12-12 Thread Stephan Szabo


You'll need to make a unique index/unique constraint on the fields
of child you wish to constrain.  The unique constraint check wasn't
checked in 7.0, and also unique constraints are not inherited so
it has to be on the actual table you want to reference.

Stephan Szabo
[EMAIL PROTECTED]

On Wed, 13 Dec 2000, Horst Herb wrote:

> Ooops, sorry, error in this example:
> > The following example worked in previous versions (7.0.2 was the last I
> > tested), but not in 7.1 any more:
> >
> > create table parent (
> > global_id serial
> > );
> >
> > create table child (
> > anything text
> > ) inherits (parent);
> >
> > create table foreign (
> > fk_id int4 references child(global_id) on update cascade on delete no
> 
> action
>   ^ child, of course, not parent!
> 
> > ) inherits (parent);
> >
> > test.sql:83: ERROR:  UNIQUE constraint matching given keys for referenced
> > table "child" not found
> > WHY ???
> >
> > I would appreciate any help. Our database depends heavily on this.
> >
> > Horst
> 
> ---
> 




Re: [HACKERS] Re: [GENERAL] Oracle-compatible lpad/rpad behavior

2000-12-12 Thread Hannu Krosing

Jonathan Ellis wrote:
> 
> > I went to fix this and then realized I still don't have an adequate spec
> > of how Oracle defines these functions.  It would seem logical, for
> > example, that lpad might truncate on the left instead of the right,
> > ie lpad('abcd', 3, 'whatever') might yield 'bcd' not 'abc'.  Would
> > someone check?
> 
> SQL> select lpad('abcd', 3, 'foobar') from dual;
> 
> LPA
> ---
> abc
> 
> > Also, what happens if the specified length is less than zero?  Error,
> > or is it treated as zero?
> 
> SQL> select ':' || lpad('abcd', -1, 'foobar') || ':' from dual;
> 
> ':
> --
> ::
> 
> (colons added so it's obvious that it's a zero-length string)

AFAIK Oracle is unable to distinguish NULL and zero-length string ;(

--
Hannu



Re: [HACKERS] shared libs on sco how?

2000-12-12 Thread Peter Eisentraut

Arno A. Karner writes:

> when i compile on linux i get shared libs, on sco with udk, or sdk just
> get static libs
> can some on point me to the config files to hack to get both static, and
> shared libs

Try the patch below.  I don't actually have SCO, but it's what I
constructed from the documentation.

diff -cr postgresql-7.0.3.orig/src/Makefile.shlib postgresql-7.0.3/src/Makefile.shlib
*** postgresql-7.0.3.orig/src/Makefile.shlibTue May 16 22:48:48 2000
--- postgresql-7.0.3/src/Makefile.shlib Tue Dec 12 17:51:16 2000
***
*** 207,212 
--- 207,220 
shlib   := $(NAME)$(DLSUFFIX)
  endif

+ ifeq ($(PORTNAME), sco)
+   install-shlib-dep   := install-shlib
+   shlib   := 
+lib$(NAME)$(DLSUFFIX).$(SO_MAJOR_VERSION).$(SO_MINOR_VERSION)
+   LDFLAGS_SL  := -G -z text
+   CFLAGS  += $(CFLAGS_SL)
+ endif
+
+
  # Default target definition.  Note shlib is empty if not building a shlib.

  all: lib$(NAME).a $(shlib)


-- 
Peter Eisentraut  [EMAIL PROTECTED]   http://yi.org/peter-e/




AW: [HACKERS] Re: COPY BINARY file format proposal

2000-12-12 Thread Zeugswetter Andreas SB


> I take it from the smiley that you're not serious, but actually it seems
> like it might not be a bad idea.  I could see appending a CRC to each
> tuple record.  Comments anyone?

Let's not get paranoid. If you compress the output the file will get checksummed
anyway. I am against a CRC in binary copy output :-)

Andreas



[HACKERS] Bug in FOREIGN KEY

2000-12-12 Thread Bruce Momjian

This problem with foreign keys has been reported to me, and I have confirmed
the bug exists in current sources.  The DELETE should succeed:

---

CREATE TABLE primarytest2 (
   col1 INTEGER, 
   col2 INTEGER, 
   PRIMARY KEY(col1, col2)
  );

CREATE TABLE foreigntest2 (col3 INTEGER, 
   col4 INTEGER,
   FOREIGN KEY (col3, col4) REFERENCES primarytest2
 );
test=> BEGIN;
BEGIN
test=> INSERT INTO primarytest2 VALUES (5,5);
INSERT 27618 1
test=> DELETE FROM primarytest2 WHERE col1 = 5 AND col2 = 5;
ERROR:  triggered data change violation on relation "primarytest2"

 
-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026



[HACKERS] No postgres on Solaris

2000-12-12 Thread Martin A. Marques

Hi,
I have been using Postgres-7.0.2 on Solaris 8 for the past few months, and 
was about to upgrade to 7.1-test, and after following carefully the docs, I 
get this:

postgres@ultra31:~ > /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data
IpcSemaphoreCreate: semget(key=5432004, num=17, 03600) failed: No space left 
on
device
 
This error does *not* mean that you have run out of disk space.
 
It occurs either because system limit for the maximum number of
semaphore sets (SEMMNI), or the system wide maximum number of
semaphores (SEMMNS), would be exceeded.  You need to raise the
respective kernel parameter.  Look into the PostgreSQL documentation
for details.
 
postgres@ultra31:~ >  

I looked at the FAQ_Solaris, but found nothing on this case. I remember 
making changes to the kernel parameters when I fist installed postgres, but 
can't remember where I found that info.

Any clues?

-- 
System Administration: It's a dirty job, 
but someone told I had to do it.
-
Martín Marqués  email:  [EMAIL PROTECTED]
Santa Fe - Argentinahttp://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-



[HACKERS] external function proposal for 7.2

2000-12-12 Thread mlw

I think the newC function idea is pretty good, however, what would be
great is just one more step of protocol, perhaps an API verson 2 or 3:

One thing than makes writing a non-trivial function a bit problematic,
and perhaps even less efficient, is that the function does not know when
it is first run and when it is finished, and there is no facility to
manage contextual information. This limits external functons having to
be  fairly simple, or overly complex.

I propose that when the newC structure is allocated that a function
specific "Init" function be called, and when the structure is being
freed, calling a "Exit" function. The new C structure should also have a
void pointer that allows persistent information to be passed around.

typedef struct
{
FmgrInfo   *flinfo; /* ptr to lookup info used for this call
*/
Node   *context;/* pass info about context of call */
Node   *resultinfo; /* pass or return extra info about
result */
boolisnull; /* function must set true if result is
NULL */
short   nargs;  /* # arguments actually passed */
Datum   arg[FUNC_MAX_ARGS];  /* Arguments passed to function */
boolargnull[FUNC_MAX_ARGS];  /* T if arg[i] is actually NULL
*/

void *userparam;/* to be used by he function */

} FunctionCallInfoData;
typedef FunctionCallInfoData* FunctionCallInfo;

The userparam can be used to store data, or a count, or whatever.

Datum function(PG_FUNCTION_ARGS) ;
bool function_Init(PG_FUNCTION_ARGS);
void function_Exit(PG_FUNCTION_ARGS);

This protocol would make writing some really cool features much easier.
As a C++ guy, I could execute "new" at Init and "delete" at Exit. ;-)


Mark.




Re: [HACKERS] CRC, hash & Co.

2000-12-12 Thread Horst Herb

On Sunday 10 December 2000 17:35, you wrote:

> > 1.) A CRC is _not_ stronger than a hash. CRC is a subset of the hash
> > domain, defined as "a fast error-check hash based on mod 2 polynomial
> > operations" which has typically no crypto strength (and does not need it
> > either for most purposes).
>
> Not true, unless your definition of strength is different than mine.

It is not my definition, but the definition found in any technical / IT 
dictionary I could grab. Examples:


> > 3.) There are many domains where you need to protect yout database not
> > only against random accidental glitches, but also against malicious
> > attacks. In these cases, CRC-32 (and other CRCs without any cryptographic
> > strength) are no help.
>
> If you have malicious attackers who can deliberately modify live data in
> a database, you have problems beyond what any kind of hash can protect
> against.

In the medical domain, the "malicious attacker" is often the user himself. 
For medico-legal reasons, we need a complete audit trail proofing that no 
alterations have been made to medical records. For practical reasons, the 
quickest means (AFAIK) to achieve this is digestig the digests of all entries 
(or at least those of the change log) and store these externally on a trusted 
authentication server. No matter how unlikely such a manipulation is; for a 
court case you always need the state-of-the-art precautions.

> > 5.) As opposed to a previous posting (Bruce ?), MD5 has been shown to be
> > "crackable" (deliberate collison feasible withavailable technology)
>
> No, it hasn't, unless you can provide us a reference to a paper showing
> that.  I've seen references that there are internal collisions in the
> MD5 reduction function, but still no way to produce collisions on the
> final digest.

You are partially right. It was only the compression function of MD5. But 
that's enough. 
"An iterated hash function is thus in this regard at most as strong as its 
compression function" ( A.J.Menezes, P.C.van Oorschot, S.A.Vanstone "Handbook 
of Applied Cryptography", CRC Press, 1999, link to online version: 
http://www.cacr.math.uwaterloo.ca/hac/ ).
Read Cryptobytes Vol.2 No.2 Summer 1996; Hans Dobbertin: The status of MD5 
after a recent attack 
(ftp://ftp.rsasecurity.com/pub/cryptobytes/crypto2n2.pdf). 
and
RSA Data security recommended already 1996 that MD4 should no longer be used 
and that MD5 "should not be used for future applications that require the 
hash function to be collision resistant" 
(ftp://ftp.rsasecurity.com/pub/pdfs/bulletn4.pdf)
Even in S/MIME MD5 "is only provided for backwards compatibility" for that 
very reason 
(http://web.elastic.org/~fche/mirrors/www.jya.com/pgpfaq-ss.htm#SubMD5Broke)
and Bruce Schneier stated that he is "wary of MD5" ( B.Schneier, "Applied 
Cryptography, Second Edition", Wiley, 1996 (cited at 
http://web.elastic.org/~fche/mirrors/www.jya.com/pgpfaq-ss.htm, I am still 
trying to find the original quote in the book))

For further reference I recommend the "Handbook of applied cryptography" 
which surprisingly is available online (full text) at 
http://www.cacr.math.uwaterloo.ca/hac/

Please remember that the whole reason for my reasoning is that we need a 
run-time definable choice of CRCs/digests as no one single hash will suit all 
needs.

Horst



[HACKERS] Re:Postgresql on dynix/ptx system

2000-12-12 Thread Barry Jenner


I am also trying to port PostgreSQL to Dynix/ptx
4.4.5. I have GNU gcc 2.95.2. 

I also have errors when trying to configure and make
the application and I am not particularily well
qualified to sort them out. I would however be pleased
to help where I can. 

Barry




__
Do You Yahoo!?
Yahoo! Shopping - Thousands of Stores. Millions of Products.
http://shopping.yahoo.com/



Fwd: Re: [HACKERS] HELP! foreign eys & inheritance

2000-12-12 Thread Horst Herb

Ooops, sorry, error in this example:
> The following example worked in previous versions (7.0.2 was the last I
> tested), but not in 7.1 any more:
>
> create table parent (
> global_id serial
> );
>
> create table child (
> anything text
> ) inherits (parent);
>
> create table foreign (
> fk_id int4 references child(global_id) on update cascade on delete no

action
  ^ child, of course, not parent!

> ) inherits (parent);
>
> test.sql:83: ERROR:  UNIQUE constraint matching given keys for referenced
> table "child" not found
> WHY ???
>
> I would appreciate any help. Our database depends heavily on this.
>
> Horst

---



Re: [HACKERS] Re: COPY BINARY file format proposal

2000-12-12 Thread Tom Lane

Philip Warner <[EMAIL PROTECTED]> writes:
> How about a CRC? ;-P

I take it from the smiley that you're not serious, but actually it seems
like it might not be a bad idea.  I could see appending a CRC to each
tuple record.  Comments anyone?

You seemed to like the PNG philosophy of using feature flags rather than
a version number.  Accordingly, I propose dropping the version number
field in favor of a flags word.  (Which was needed anyway, because I had
*again* forgotten about COPY WITH OIDS :-(.)

Attached is the current state of the proposal.  I haven't added a CRC
field but am willing to do so if that's the consensus.

regards, tom lane


COPY BINARY file format proposal

The objectives of this change are:

1. Get rid of the tuple count at the front of the file.  This requires
an extra pass over the relation, which is a lot more trouble than the
count is worth.  Use an explicit EOF marker instead.
2. Send fields of a tuple individually, instead of dumping out raw tuples
(complete with alignment padding and so forth) as is currently done.
This is mainly to simplify TOAST-related processing.
3. Make the format somewhat self-identifying, so that the reader has at
least some chance of detecting it when the data doesn't match the table
it's supposed to be loaded into.

The proposed format consists of a file header, zero or more tuples, and a
file trailer.


File Header
---

The proposed file header consists of 24 bytes of fixed fields, followed
by a variable-length header extension area.

Signature: 12-byte sequence "PGBCOPY\n\377\r\n\0" --- note that the null
is a required part of the signature.  (The signature is designed to allow
easy identification of files that have been munged by a non-8-bit-clean
transfer.  The proposed signature will be changed by newline-translation
filters, dropped nulls, dropped high bits, or parity changes.)

Integer layout field: int32 constant 0x01020304 in source's byte order.
Potentially, a reader could engage in byte-flipping of subsequent fields
if the wrong byte order is detected here.

Flags field: a 4-byte bit mask to denote important aspects of the file
format.  Bits are numbered from 0 (LSB) to 31 (MSB) --- note that this
field is stored with source's endianness, as are all subsequent integer
fields.  Bits 16-31 are reserved to denote critical file format issues;
a reader should abort if it finds an unexpected bit set in this range.
Bits 0-15 are reserved to signal backwards-compatible format issues;
a reader should simply ignore any unexpected bits set in this range.
Currently only one flag bit is defined, and the rest must be zero:
Bit 16: if 1, OIDs are included in the dump; if 0, not

Next 4 bytes: length of remainder of header, not including self.  In
the initial version this will be zero, and the first tuple follows
immediately.  Future changes to the format might allow additional data
to be present in the header.  A reader should silently ignore any header
extension data it does not know what to do with.

Note that I envision the content of the header extension area as being a
sequence of self-identifying chunks (but the specific design of same is
postponed until we need 'em).  The flags field is not intended to tell
readers what is in the extension area.

This design allows for both backwards-compatible header additions (add
header extension chunks, or set low-order flag bits) and non-backwards-
compatible changes (set high-order flag bits to signal such changes,
and add supporting data to the extension area if needed).


Tuples
--

Each tuple begins with an int16 count of the number of fields in the
tuple.  (Presently, all tuples in a table will have the same count, but
that might not always be true.)  Then, repeated for each field in the
tuple, there is an int16 typlen word possibly followed by field data.
The typlen field is interpreted thus:

ZeroField is NULL.  No data follows.

> 0 Field is a fixed-length datatype.  Exactly N
bytes of data follow the typlen word.

-1  Field is a varlena datatype.  The next four
bytes are the varlena header, which contains
the total value length including itself.

< -1Reserved for future use.

For non-NULL fields, the reader can check that the typlen matches the
expected typlen for the destination column.  This provides a simple
but very useful check that the data is as expected.

There is no alignment padding or any other extra data between fields.
Note also that the format does not distinguish whether a datatype is
pass-by-reference or pass-by-value.  Both of these provisions are
deliberate: they might help improve portability of the files (although
of course endianness and floating-point-format issues can still keep
you from moving a binary file across machines).

If OIDs are included in the dump, the OID field immediately fol

Re: [HACKERS] Re: COPY BINARY file format proposal

2000-12-12 Thread Tom Lane

Philip Warner <[EMAIL PROTECTED]> writes:
> More a matter of not thinking it was important enough to worry about, and
> not really wanting to drag the MD5/MD4/CRC64/etc debate into this one.

I'd just as soon not drag that debate in here either ;-) ... but once we
settle on an appropriate CRC method for WAL it's easy enough to call the
same routine for this code.

> Sounds good to me. I'm not sure you need it on a per-tuple basis - but it
> can't hurt, assuming it's cheap to generate. Does the backend send tuples
> or blocks of tuples? If the latter, and if CRC is expensive, then maybe 1
> CRC for each group of tuples.

Extending the CRC over multiple tuples would just complicate life,
I think.  The per-byte cost is the biggest factor, so you don't really
save all that much.

>> Next 4 bytes: length of remainder of header, not including self.  In
>> the initial version this will be zero, and the first tuple follows
>> immediately.  Future changes to the format might allow additional data
>> to be present in the header.  A reader should silently ignore any header
>> extension data it does not know what to do with.

> Don't you need to at least define how to specify non-essential chunks,
> since the flags are not to be used to describe the header extensions. Or
> are we going to make the initial version barf when it encounters any header
> extension?

No, the initial version will just silently skip the whole header
extension; it's defined so that that's a legal behavior (everything
in the header extension is inessential).  We can come back and define
a format for the entries in the header extension area when we need some.

> Another option would be to:
> - dump the field sizes in the header somewhere (they will all be the same), 
> - for each row output a bitmap of non-null fields, followed by the data.
> - varlena would have a -1 length in the header, an an int32 length in the row.

That would work if you are willing to assume that all the tuples indeed
always have the same set of fields --- you're not, for example, doing an
inheritance-tree-walk "COPY FROM foo*".  But Chris Bitmead still has a
gleam in his eye about that sort of thing, so we might want it someday.
I think it's worth a small amount of extra space to avoid that
assumption, especially since it simplifies the code too.

regards, tom lane



Re: [HACKERS] Re: COPY BINARY file format proposal

2000-12-12 Thread Philip Warner

At 19:55 8/12/00 -0500, Tom Lane wrote:
>Philip Warner <[EMAIL PROTECTED]> writes:
>> How about a CRC? ;-P
>
>I take it from the smiley that you're not serious, but actually it seems
>like it might not be a bad idea.  I could see appending a CRC to each
>tuple record.  Comments anyone?

More a matter of not thinking it was important enough to worry about, and
not really wanting to drag the MD5/MD4/CRC64/etc debate into this one.
Having said that, I think it would be a nice-to-have, like CRCs on db pages
- in the latter case I'd really like VACCUM (or another utility) to be able
to report 'invalid pages' on a nightly basis (or, better still, not report
them). 


>Attached is the current state of the proposal.  I haven't added a CRC
>field but am willing to do so if that's the consensus.

Sounds good to me. I'm not sure you need it on a per-tuple basis - but it
can't hurt, assuming it's cheap to generate. Does the backend send tuples
or blocks of tuples? If the latter, and if CRC is expensive, then maybe 1
CRC for each group of tuples.

Also having a CRC on a per-tupple basis will prevent getting out of sync
with the data, and make partial data recovery 


>Next 4 bytes: length of remainder of header, not including self.  In
>the initial version this will be zero, and the first tuple follows
>immediately.  Future changes to the format might allow additional data
>to be present in the header.  A reader should silently ignore any header
>extension data it does not know what to do with.

Don't you need to at least define how to specify non-essential chunks,
since the flags are not to be used to describe the header extensions. Or
are we going to make the initial version barf when it encounters any header
extension?


>Tuples
>--
>
>Each tuple begins with an int16 count of the number of fields in the
>tuple.  (Presently, all tuples in a table will have the same count, but
>that might not always be true.)

Another option would be to:

- dump the field sizes in the header somewhere (they will all be the same), 
- for each row output a bitmap of non-null fields, followed by the data.
- varlena would have a -1 length in the header, an an int32 length in the row.

This is harder to read and to write, but saves space, if that is desirable.

>
>For non-NULL fields, the reader can check that the typlen matches the
>expected typlen for the destination column.  This provides a simple
>but very useful check that the data is as expected.

CRC seems like the go here...





Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/



[HACKERS] Oracle-compatible lpad/rpad behavior

2000-12-12 Thread Paul

Tom,

Hope this helps

>From the Oracle manual:
Purpose
Returns char1, left-padded to length n with the sequence of characters
in char2; char2 defaults to a single blank. If char1 is longer than n,
this function returns the portion of char1 that fits in n.

The argument n is the total length of the return value as it is
displayed on your terminal screen. In most character sets, this is also
the number of characters in the return value. However, in some multibyte
character sets, the display length of a character string can differ from
the number of characters in the string.


and some examples (8.1.5 on NT):

SQL> select lpad('abcdef',3,'x') from dual;

LPA
---
abc

SQL> select lpad ('abcdef',8,'x') from dual;

LPAD('AB

xxabcdef

SQL>
SQL> select lpad('abcdef',0,'x') from dual;

L
-


Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
>> lpad and rpad never truncate, they only pad.
>>
>> Perhaps they *should* truncate if the specified length is less than
>> the original string length.  Does Oracle do that?

> Yes, it truncates, same as Informix.

I went to fix this and then realized I still don't have an adequate spec

of how Oracle defines these functions.  It would seem logical, for
example, that lpad might truncate on the left instead of the right,
ie lpad('abcd', 3, 'whatever') might yield 'bcd' not 'abc'.  Would
someone check?

Also, what happens if the specified length is less than zero?  Error,
or is it treated as zero?

regards, tom lane




[HACKERS] shared libs on sco how?

2000-12-12 Thread Arno A. Karner

well im trying to get apache + php4 + pgsql 7.0.3 running on sco
givin up on the udk on sco openserver 5.0.5 now using sdk on sco open
server 5.0.4
i can compile all the stuff static, but php4 wants to used shared
libpq.so i get undefined symbold on php4 module load unresolved symbol
PQfinish

when i compile on linux i get shared libs, on sco with udk, or sdk just
get static libs
can some on point me to the config files to hack to get both static, and
shared libs

thanks in advance
ps the mail search seems to be working, and plenty of info on sco but
nothing on shared libs 
-- 
My opinions are my own and not that of my employer even if I am self
employed



Re: [HACKERS] Re: COPY BINARY file format proposal

2000-12-12 Thread Philip Warner

At 01:27 8/12/00 -0500, Tom Lane wrote:
>Recovering the data on a machine
>of different endianness is a project for future data archeologists.

It's frightening to think that in 1000 years time people will be deducing
things about our society from the way we stored data.


>
>Tell you the truth, I don't believe in file-format version numbers at
>all...
>(RFC 2083, esp section 12.13) --- the versioning philosophy described
>there is largely yours truly's.

Seems to be a much better approach; (non)critical chunks & chunk types are
much more portable.



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 0500 83 82 82 | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/



Re: [HACKERS] RPM changes for 7.1.

2000-12-12 Thread Oliver Elphick

"Ross J. Reedstrom" wrote:
  >Here's the current set of Debian  packages derived from the main postgresql 
  >source, with their descriptions:
...
  >
  >the -pl package contains both plpgsql and pltcl. The descripton there
  >needs updating. 

This is a mistake, and the dependencies aren't right.  pltcl ought
to be in with libpgtcl, to share the dependency on tcl.  I suppose
I originally classed the PLs together, but there's plperl as well now. 

I'll be fixing this soon.

-- 
Oliver Elphick[EMAIL PROTECTED]
Isle of Wight  http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47  6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
 
 "Be of good courage, and he shall strengthen your 
  heart, all ye that hope in the LORD." 
  Psalms 31:24 





Re: AW: [HACKERS] SourceForge & Postgres (attdispursion)

2000-12-12 Thread Bruce Momjian

> I see it, yes. Was this an intended change ? I am quite sure, that it was 
> attdisbursion in 7.0 ?

Yes, I couldn't spell dispersion in the past.  :-)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026



[HACKERS] HELP! foreign eys & inheritance

2000-12-12 Thread Horst Herb

The following example worked in previous versions (7.0.2 was the last I 
tested), but not in 7.1 any more:

create table parent (
global_id serial
);

create table child (
anything text
) inherits (parent);

create table foreign (
fk_id int4 references parent(global_id) on update cascade on delete no action
) inherits (parent);

test.sql:83: ERROR:  UNIQUE constraint matching given keys for referenced 
table "child" not found
WHY ???

I would appreciate any help. Our database depends heavily on this.

Horst



AW: [HACKERS] SourceForge & Postgres (attdispursion)

2000-12-12 Thread Zeugswetter Andreas SB


> btw anyone trying this query should use: "attdispersion"
> 

I see it, yes. Was this an intended change ? I am quite sure, that it was 
attdisbursion in 7.0 ?

Andreas



Re: [HACKERS] SourceForge & Postgres

2000-12-12 Thread mlw

Tom Lane wrote:
> 
> mlw <[EMAIL PROTECTED]> writes:
> > cdinfo=# explain select * from ztitles where artistid = 0 ;
> > NOTICE:  QUERY PLAN:
> 
> > Index Scan using ztitles_artistid_ndx on ztitles  (cost=0.00..5915.01
> > rows=3163 width=296)
> 
> > When postmaster is started without "-o -fs" I get this:
> 
> > cdinfo=# explain select * from ztitles where artistid = 0 ;
> > NOTICE:  QUERY PLAN:
> 
> > Seq Scan on ztitles  (cost=0.00..4740.75 rows=3163 width=296)
> 
> How many tuples are in the table?  How many are actually returned
> by this query?  Also, what do you get from
> 
> select attname,attdisbursion,s.*
> from pg_statistic s, pg_attribute a, pg_class c
> where starelid = c.oid and attrelid = c.oid and staattnum = attnum
> and relname = 'ztitles';

I have attached the output. 

btw anyone trying this query should use: "attdispersion"

The explain I gave, there are no records that actually have an artistid
of 0. However, I will show the explain with a valid artistid number.

This is without "-o -fs"
cdinfo=# explain select * from ztitles where artistid = 10220 ;
NOTICE:  QUERY PLAN:
 
Seq Scan on ztitles  (cost=0.00..4740.75 rows=3163 width=296)
 
EXPLAIN 

And this is with "-o -fs"

cdinfo=# explain select * from ztitles where artistid = 10220 ;
NOTICE:  QUERY PLAN:
 
Index Scan using ztitles_artistid_ndx on ztitles  (cost=0.00..5915.01
rows=3163 width=296)
 
EXPLAIN


select count(*) from ztitles where artistid = 10220 ;
 count
---
16
(1 row) 

-- 
http://www.mohawksoft.com

  attname   | attdispersion | starelid | staattnum | staop | stanullfrac | 
stacommonfrac |  stacommonval  | staloval | stahival  
+---+--+---+---+-+---++--+---
 muzenbr|-1 |19274 | 1 |97 |   0 |   
4.72277e-06 | 397705 | 58608| 399022
 artistid   | 0.0477198 |19274 | 2 |97 |   0 |  
0.149362 | 100050450  | 1| 100055325
 cat2   |  0.618418 |19274 | 3 |  1066 |   0 |  
0.763058 | Performer  | Boxed Set| Single
 cat3   | 0.0459786 |19274 | 4 |  1066 | 4.72277e-06 |  
0.145367 | International  | Blues| Sound Effects
 cat4   |  0.308324 |19274 | 5 |  1066 | 4.72277e-06 |   
0.50827 ||  | Zydeco
 performer  | 0.0477281 |19274 | 6 |  1066 | 4.72277e-06 |  
0.149381 | Various Artists| "Blue" Gene Tyranny  | underGRIND
 performer2 | 0.0477198 |19274 | 7 |  1066 | 4.72277e-06 |  
0.149362 | Various Artists| "Chuscales", Jose Valle  | underGRIND
 title  |   1.88982e-05 |19274 | 8 |  1066 | 4.72277e-06 |   
9.44555e-05 | Good Old Country   | !| Zz... [EP] *
 artist1|  0.770286 |19274 | 9 |  1066 | 4.72277e-06 |  
0.864182 ||  | w
 engineer   |  0.719466 |19274 |10 |  1066 | 4.72277e-06 |  
0.831534 ||  | ob Bullock
 producer   |  0.586756 |19274 |11 |  1066 | 4.72277e-06 |  
0.740488 ||  | Zvika Nadaf
 labelname  |   0.000490215 |19274 |12 |  1066 | 4.72277e-06 |
0.00242751 | Rykodisc USA   |  | `A`A`Li`I Records
 catalog|0.00114041 |19274 |13 |  1066 | 4.72277e-06 |
0.00557759 | 1  |  | sftri325
 distribut  | 0.0342314 |19274 |14 |  1066 | 4.72277e-06 |   
0.11669 | Universal Distribution | (Independently by Label) | n/a
 released   | 0.0331312 |19274 |15 |  1066 | 4.72277e-06 |  
0.113828 | n/a| 01/01/1986   | n/a
 origrel| 0.0266312 |19274 |16 |  1066 | 4.72277e-06 | 
0.0961651 | 2000   | 1911 | n/a
 nbrdiscs   |  0.931311 |19274 |17 |  1066 | 4.72277e-06 |  
0.961169 | 1  | 01   | 9
 spar   |   0.84807 |19274 |18 |  1066 | 4.72277e-06 |  
0.912166 | n/a| *N*  | n/a
 minutes|  0.778454 |19274 |19 |  1066 | 4.72277e-06 |   
0.86933 ||  | 99
 seconds|  0.778454 |19274 |20 |  1066 | 4.72277e-06 |   
0.86933 ||  | 98
 monostereo |  0.854336 |19274 |21 |  1066 | 4.72277e-06 |  
0.915935 | Stereo

AW: [HACKERS] SourceForge & Postgres

2000-12-12 Thread Zeugswetter Andreas SB

> I have an index on group_id, one on
> (group_id,status_id) and one on (group_id,status_id,assigned_to) 

As an aside notice: you should definitely only need the last of the
three indices, since it can perfectly work on group_id
or group_id + status_id only restrictions.

Andreas