Re: [HACKERS] PostgreSQL 8.2beta1 w/ VALUES

2006-09-23 Thread Stephen Frost
Luke, et al,

* Luke Lonergan ([EMAIL PROTECTED]) wrote:
> > Except that one warning would not be accurate, because the 
> > warning is per tuple. How is postgresql going to know that 
> > the warning applies to the same set of data but just a 
> > different tuple?

I didn't say it'd be easy. :)

> If it's going to roll back the entire load after that one warning, it
> should terminate there.

It didn't terminate it, though I agree that it would have been nice if I
could control if it would terminate on first warning or not.

> This is a common problem with OLAP and based on the observation here,
> this needs to be fixed.  Not being able to cancel out at this point is
> even worse, can you imagine the frustration of trying to load 10GB of
> data and having to wait until the end after seeing these warnings, while
> knowing that you're just going to have to try again anyway?

Yes, rather frustrating even with only 20k rows.

> Eventually we'll implement single row error handling, but even then
> there should be a selectable behavior to terminate the load on the first
> warning/error.

It'd be nice to be able to do what (I believe..) Oracle and Access can
do- dump the warnings/error messages/rows into a separate table and go
over them afterwards..  Probably wouldn't have helped me in this case
but I've been in other situations where it would have been nice. :)

Thanks,

Stephen




Re: [HACKERS] ReadBuffer(P_NEW) versus valid buffers

2006-09-23 Thread Alvaro Herrera
Joshua D. Drake wrote:
> Tom Lane wrote:

> >I asked around inside Red Hat but haven't gotten any responses yet ...
> >seeing that it's a rather old Suse kernel, I can understand that RH's
> >kernel hackers might not be too excited about investigating.  (Alan Cox,
> >for one, has got other things to worry about this weekend:
> >http://zeniv.linux.org.uk/%7etelsa/boom/
> 
> Uhmm... doh?

Telsa got "fired" for buying IBM?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] PostgreSQL 8.2beta1 w/ VALUES

2006-09-23 Thread Luke Lonergan
Josh,

> >   Anyhow, don't know if there's really a good solution but it'd be nice
> >   to only get one warning, or one of a given type, or something, and to
> 
> Except that one warning would not be accurate, because the 
> warning is per tuple. How is postgresql going to know that 
> the warning applies to the same set of data but just a 
> different tuple?

If it's going to roll back the entire load after that one warning, it
should terminate there.

This is a common problem with OLAP and based on the observation here,
this needs to be fixed.  Not being able to cancel out at this point is
even worse, can you imagine the frustration of trying to load 10GB of
data and having to wait until the end after seeing these warnings, while
knowing that you're just going to have to try again anyway?

Eventually we'll implement single row error handling, but even then
there should be a selectable behavior to terminate the load on the first
warning/error.

- Luke




Re: [HACKERS] PostgreSQL 8.2beta1 Now Available

2006-09-23 Thread Joshua D. Drake

Marc G. Fournier wrote:
> On Sat, 23 Sep 2006, Walter Cruz wrote:
>
>> There's a date to postgreSQL 8.2 final?
>
> Figure 45-60 days, but not a firm date ...

See Tom Lane's post.

Joshua D. Drake





Re: [HACKERS] PostgreSQL 8.2beta1 w/ VALUES

2006-09-23 Thread Joshua D. Drake



>   Anyhow, don't know if there's really a good solution but it'd be nice
>   to only get one warning, or one of a given type, or something, and to

Except that one warning would not be accurate, because the warning is
per tuple. How is postgresql going to know that the warning applies to
the same set of data but just a different tuple?

>   respond to cancel requests (if there was an issue there).  Sorry this
>   is more from a user's perspective, I haven't got time atm to go digging
>   through the code.  I'd be curious about implementing a possible
>   error-aggregation system for reporting on large sets like this but
>   that might be overkill anyway.

You could dial down client_min_messages, set it to ERROR, then you won't
see warnings ;)

Sincerely,

Joshua D. Drake

> Thanks,
>
> Stephen





Re: [HACKERS] Bitmap index status

2006-09-23 Thread Jie Zhang
Gavin & Heikki,

>> 
>> The handling of stream and hash bitmaps looks pretty complicated to me.
>> All the bitmap-related nodes have logic to handle both types slightly
>> differently. It all seems to come down to that if a subnode (or
>> amgetbitmap in a bitmap index scan node) returns a StreamBitmap, the
>> caller needs to call the subnode many times, until it returns a NULL.
>> With a HashBitmap, the caller only calls the subnode once.
>> 
>> I think amgetbitmap should be called just once per index scan, and it
>> should return either a hash bitmap or a stream bitmap. The same applies
>> to all the executor nodes that return bitmaps, they would only return a
>> single HashBitmap or StreamBitmap, and the upper node would call
>> tbm_iterate repeatedly on that.
>> 
>> StreamBitmap would contain a callback (filled by the indexam) that
>> tbm_iterate would call to fill the next TBMIterateResult.
> 
> Right, this was the approach taken by an earlier version of the patch I
> had worked on. It was significantly uglified by the need to keep the index
> state around and to be careful about what amrescan might do behind our
> back. I will, however, introduce the idea again because it makes the code
> much cleaner and more logical, as you seem to suggest.
> 

I have been thinking about this approach some more. I do agree that this is
more attractive now. The following includes some more detailed design.
Please let me know if you have any comments. (My apologies to Gavin. You
talked to me about this approach before. But you introduced some on-disk
bitmap specific code into the tidbitmap.c, which prevented me from looking
more in this direction.)

Essentially, we want to have a stream bitmap object that has an iterator,
which will be able to iterate through the bitmaps. The BitmapIndexscan,
BitmapAnd, BitmapOr will be executed once and return a stream bitmap or a
hash bitmap. The BitmapHeapscan then calls tbm_iterate() to iterate through
the bitmaps.

The StreamBitmap structure will look like below.

struct StreamBitmap {
    NodeTag         type;   /* to make it a valid Node */
    PagetableEntry  entry;  /* a page of tids in this stream bitmap */

    /* the iterator function */
    void          (*next)(StreamBitmap *);
    Node           *state;  /* stores how this stream bitmap was generated,
                               and all necessary information to
                               obtain the next stream bitmap. */
};

Two new state objects will look like below. At the same time, we introduce
three new node types, T_StreamBitmapAND, T_StreamBitmapOR, and
T_StreamBitmapIndex, to define different states.

/*
 * Stores the necessary information for iterating through the stream bitmaps
 * generated by nodeBitmapAnd or nodeBitmapOr.
 */
struct StreamBitmapOp {
    NodeTag  type;     /* handles T_StreamBitmapAND and T_StreamBitmapOR */
    List    *bitmaps;
};

/*
 * Stores some necessary information for iterating through the stream
 * bitmaps generated by nodeBitmapIndexscan.
 */
struct StreamBitmapIndex {
    NodeTag        type;         /* handles T_StreamBitmapIndex */
    IndexScanDesc  scan;
    BlockNumber    nextBlockNo;  /* next block no to be read */
};

Then we will have the iterator functions like the following:

void StreamBitmapAndNext(StreamBitmap* node) {
  tbm_intersect_stream(((StreamBitmapOp*) node->state)->bitmaps, node);
}

void StreamBitmapOrNext(StreamBitmap* node) {
  tbm_union_stream(((StreamBitmapOp*) node->state)->bitmaps, node);
}

void StreamBitmapIndexNext(StreamBitmap* node) {
  StreamBitmapIndex* sbi = (StreamBitmapIndex*) node->state;
  amgetbitmap(sbi->scan, NULL, sbi->nextBlockNo);
}
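
To make the callback idea concrete, here is a minimal standalone sketch of
a stream object that a consumer drains through its next() pointer. It is
not PostgreSQL code: every Demo* name is hypothetical, the page entry is
reduced to two integers, and next() returns a bool to signal end of stream,
which is a simplification of the void callback proposed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one page worth of TIDs in the stream. */
typedef struct DemoPageEntry {
    unsigned blockno;   /* heap block this entry describes */
    unsigned nbits;     /* number of matching tuples on that block */
} DemoPageEntry;

typedef struct DemoStreamBitmap DemoStreamBitmap;

struct DemoStreamBitmap {
    DemoPageEntry entry;                    /* current page of the stream */
    bool (*next)(DemoStreamBitmap *self);   /* fills entry; false at end */
    void *state;                            /* producer-private state */
};

/* Hypothetical producer state: pretends an index yields blocks in order. */
typedef struct DemoIndexState {
    unsigned next_block;
    unsigned end_block;
} DemoIndexState;

/* Producer callback: advance to the next block, or report exhaustion. */
static bool demo_index_next(DemoStreamBitmap *self)
{
    DemoIndexState *st = self->state;

    assert(st != NULL);
    if (st->next_block >= st->end_block)
        return false;                 /* stream exhausted */
    self->entry.blockno = st->next_block++;
    self->entry.nbits = 1;
    return true;
}

/* Consumer side: drain the stream through the callback, counting pages. */
static unsigned demo_count_pages(DemoStreamBitmap *bm)
{
    unsigned pages = 0;

    while (bm->next(bm))
        pages++;
    return pages;
}
```

The consumer never needs to know which producer filled in next(), which is
the property that would let the heap scan treat AND, OR and index streams
alike.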

What do you think?

Thanks,
Jie





Re: [HACKERS] ReadBuffer(P_NEW) versus valid buffers

2006-09-23 Thread Joshua D. Drake

Tom Lane wrote:
> Mark Kirkwood <[EMAIL PROTECTED]> writes:
>> The check looks good - are we chasing up the Linux kernel (or Suse) guys
>> to get the bug investigated?
>
> I asked around inside Red Hat but haven't gotten any responses yet ...
> seeing that it's a rather old Suse kernel, I can understand that RH's
> kernel hackers might not be too excited about investigating.  (Alan Cox,
> for one, has got other things to worry about this weekend:
> http://zeniv.linux.org.uk/%7etelsa/boom/

Uhmm... doh?

Joshua D. Drake

> I believe Dan's busy updating his kernel --- if a current Suse kernel
> still shows the problem then he should definitely file a bug with them.
>
> regards, tom lane






Re: [HACKERS] Buildfarm alarms

2006-09-23 Thread Tom Lane
"Andrew Dunstan" <[EMAIL PROTECTED]> writes:
> It could certainly be done. In general, I have taken the view
> that owners have the responsibility for monitoring their own machines.

Sure, but providing them tools to do that seems within buildfarm's
purview.

For some types of failure, the buildfarm script could make a local
notification without bothering the server --- but a timeout on the
server side would cover a wider variety of failures, including "this
machine is dead and ought to be removed from the farm".

regards, tom lane



Re: [HACKERS] PostgreSQL 8.2beta1 Now Available

2006-09-23 Thread Marc G. Fournier

On Sat, 23 Sep 2006, Walter Cruz wrote:

> There's a date to postgreSQL 8.2 final?

Figure 45-60 days, but not a firm date ...

> []'s
> - Walter
>
> On 9/23/06, Marc G. Fournier <[EMAIL PROTECTED]> wrote:
>>
>> Just a short note that the first Beta is now available on
>> ftp.postgresql.org, and, shortly, on the mirrors ...
>>
>> This isn't a full announce, which will be on Monday ... but please run a
>> few tests, make sure everything looks okay ...
>>
>> Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
>> Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
>> Yahoo . yscrappy   Skype: hub.org   ICQ . 7615664


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.org   ICQ . 7615664



Re: [HACKERS] Buildfarm alarms

2006-09-23 Thread Andrew Dunstan
Dave Page wrote:
>
> I'm just investigating a problem with beta 1 running on Windows 2K and
> XP, and noticed that neither Snake nor Bandicoot has built -HEAD for
> nearly 3 weeks. I'm investigating why and will fix the problem, but it
> strikes me that an alarm email from the server, noting that a run hasn't
> been reported for a while, would have helped spot this earlier. This
> could be configured with an admin-specified maximum number of days
> between reports to allow for those machines that connect far less
> frequently.
>
> Does that sound feasible to you?
>
>


It could certainly be done. In general, I have taken the view
that owners have the responsibility for monitoring their own machines.
I'll think about it some more.

cheers

andrew




Re: [HACKERS] PostgreSQL 8.2beta1 Now Available

2006-09-23 Thread Tom Lane
"Walter Cruz" <[EMAIL PROTECTED]> writes:
> There's a date to postgreSQL 8.2 final?

[ ... all together now ... ]  When it's ready.

regards, tom lane



Re: [HACKERS] ReadBuffer(P_NEW) versus valid buffers

2006-09-23 Thread Tom Lane
Mark Kirkwood <[EMAIL PROTECTED]> writes:
> The check looks good - are we chasing up the Linux kernel (or Suse) guys 
> to get the bug investigated?

I asked around inside Red Hat but haven't gotten any responses yet ...
seeing that it's a rather old Suse kernel, I can understand that RH's
kernel hackers might not be too excited about investigating.  (Alan Cox,
for one, has got other things to worry about this weekend:
http://zeniv.linux.org.uk/%7etelsa/boom/

I believe Dan's busy updating his kernel --- if a current Suse kernel
still shows the problem then he should definitely file a bug with them.

regards, tom lane



[HACKERS] PostgreSQL 8.2beta1 w/ VALUES

2006-09-23 Thread Stephen Frost
Greetings,

  Was just playing with 8.2beta1 and importing some data from MySQL and
  found something rather annoying.  Not *100%* sure the best way to deal
  with this, if there even is a way, but...

  When loading a rather large data set I started getting errors along
  these lines:

psql:/home/sfrost/school/cs750/reality/dump-anonymized.postgres.sql:262:
WARNING:  nonstandard use of escape in a string literal
LINE 1: ...XX ,9:9:999'),(9,'',0,'X XXX...
 ^
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
psql:/home/sfrost/school/cs750/reality/dump-anonymized.postgres.sql:262:
WARNING:  nonstandard use of escape in a string literal
LINE 1: ...99',0,',9:9:999'),(9,'',0,' ...
 ^
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
INSERT 0 20795
cs750=#

  Which, by themselves, aren't really an issue *except* for the fact
  that I got an *insane* number of them.  I don't think it was quite one
  for every row (of which there were 20,795, you'll note) but it was
  more than enough to drive me insane.  Additionally, cancel requests
  were ignored.  It's possible this was because of network lag and the
  server had already processed the request but I'm not sure that was the
  only reason.  I know I held down ctrl-c for quite a while during the
  spew of messages...

  Anyhow, don't know if there's really a good solution but it'd be nice
  to only get one warning, or one of a given type, or something, and to
  respond to cancel requests (if there was an issue there).  Sorry this
  is more from a user's perspective, I haven't got time atm to go digging
  through the code.  I'd be curious about implementing a possible
  error-aggregation system for reporting on large sets like this but
  that might be overkill anyway.
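
As a rough illustration of the error-aggregation idea, the sketch below
counts identical warning texts and prints one summary line per distinct
message instead of one line per row. It is a standalone toy, not anything
in the server or psql; the WarningLog type, the function names and the
fixed table size are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_KINDS 16   /* arbitrary cap on distinct warning texts */

/* One bucket per distinct warning text (hypothetical aggregation record). */
typedef struct WarningBucket {
    const char *message;   /* not copied; caller keeps the string alive */
    unsigned long count;
} WarningBucket;

typedef struct WarningLog {
    WarningBucket buckets[MAX_KINDS];
    int nkinds;
} WarningLog;

/* Record one warning; returns 1 if this text is new, 0 if seen before,
 * and -1 if the table of distinct kinds is full. */
static int warning_log_add(WarningLog *log, const char *message)
{
    assert(log != NULL && message != NULL);
    for (int i = 0; i < log->nkinds; i++) {
        if (strcmp(log->buckets[i].message, message) == 0) {
            log->buckets[i].count++;
            return 0;
        }
    }
    if (log->nkinds == MAX_KINDS)
        return -1;
    log->buckets[log->nkinds].message = message;
    log->buckets[log->nkinds].count = 1;
    log->nkinds++;
    return 1;
}

/* Print one summary line per distinct warning. */
static void warning_log_report(const WarningLog *log, FILE *out)
{
    for (int i = 0; i < log->nkinds; i++)
        fprintf(out, "WARNING (x%lu): %s\n",
                log->buckets[i].count, log->buckets[i].message);
}
```

A caller could also stop the load as soon as warning_log_add() returns 1
for the first time, which would give a "terminate on first warning"
behaviour with the same bookkeeping.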

Thanks,

Stephen




Re: [HACKERS] PostgreSQL 8.2beta1 Now Available

2006-09-23 Thread Bruce Momjian
Walter Cruz wrote:
> There's a date to postgreSQL 8.2 final?
> 
> []'s

No.

---


> - Walter
> 
> On 9/23/06, Marc G. Fournier <[EMAIL PROTECTED]> wrote:
> >
> >
> > Just a short note that the first Beta is now available on
> > ftp.postgresql.org, and, shortly, on the mirrors ...
> >
> > This isn't a full announce, which will be on Monday ... but please run a
> > few tests, make sure everything looks okay ...
> >
> > 
> > Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
> > Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
> > Yahoo . yscrappy   Skype: hub.org   ICQ . 7615664

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] PostgreSQL 8.2beta1 Now Available

2006-09-23 Thread Walter Cruz
There's a date to postgreSQL 8.2 final?

[]'s
- Walter

On 9/23/06, Marc G. Fournier <[EMAIL PROTECTED]> wrote:
> Just a short note that the first Beta is now available on
> ftp.postgresql.org, and, shortly, on the mirrors ...
>
> This isn't a full announce, which will be on Monday ... but please run a
> few tests, make sure everything looks okay ...
>
> Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
> Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
> Yahoo . yscrappy   Skype: hub.org   ICQ . 7615664


Re: [HACKERS] Increase default effective_cache_size?

2006-09-23 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
> Russ Brown <[EMAIL PROTECTED]> writes on pgsql-general:
> > Thank you: the problem was the effective_cache_size (which I hadn't
> > changed from the default of 1000). This machine doesn't have loads of
> > RAM, but I knocked it up to 65536 and now the query uses the index,
> > without having to change the statistics.
> 
> Considering recent discussion about how 8.2 is probably noticeably more
> sensitive to effective_cache_size than prior releases, I wonder whether
> it's not time to adopt a larger default value for that setting.  The
> current default of 1000 pages (8Mb) seems really pretty silly for modern
> machines; we could certainly set it to 10 times that without problems,
> and maybe much more.  Thoughts?

I'd have to agree 100% with this.  Though don't we now have something
automated for shared_buffers?  I'd think effective_cache_size would
definitely be a candidate for automation (say, half or 1/4th the ram in
the box...).

Barring the ability to do something along those lines- yes, I'd
recommend up'ing it to at least 128M or 256M.

Thanks,

Stephen




Re: [HACKERS] Increase default effective_cache_size?

2006-09-23 Thread Gevik Babakhani
On Sat, 2006-09-23 at 17:14 -0700, Joshua D. Drake wrote:
> >> Thank you: the problem was the effective_cache_size (which I hadn't
> >> changed from the default of 1000). This machine doesn't have loads of
> >> RAM, but I knocked it up to 65536 and now the query uses the index,
> >> without having to change the statistics.
> > 
> > Considering recent discussion about how 8.2 is probably noticeably more
> > sensitive to effective_cache_size than prior releases, I wonder whether
> > it's not time to adopt a larger default value for that setting.  The
> > current default of 1000 pages (8Mb) seems really pretty silly for modern
> > machines; we could certainly set it to 10 times that without problems,
> > and maybe much more.  Thoughts?
> 
> I think that 128 megs is probably a reasonable starting point. I know 
> plenty of people that run postgresql on 512 megs of ram. If you take 
> into account shared buffers and work mem, that seems like a reasonable 
> starting point.
> 

I agree. Adopting a higher effective_cache_size seems to be a good thing
to do.


(Hmmm, I must be dreaming again, but I cannot stop wondering how
it would be to have a smart "agent" that configures these values by
analyzing the machine power and statistical values gathered from
database usage...)




Re: [HACKERS] ReadBuffer(P_NEW) versus valid buffers

2006-09-23 Thread Mark Kirkwood

Tom Lane wrote:
> So ReadBuffer without hesitation zeroes out the page of data we just
> filled, and returns it for re-filling.  There went some tuples :-(
>
> Although this is clearly Not Our Bug, it's annoying that ReadBuffer
> falls into the trap so easily instead of complaining.  I'm still
> disinclined to try to change the behavior of mdread(), but what I am
> considering doing is adding a check here to error out if not PageIsNew.
> AFAICS, if we do find a buffer for a page supposedly past EOF, it should
> be zero-filled because that's what mdread returns in this case.  So this
> change would prevent Dan's silent-overwrite scenario without changing the
> behavior for any legitimate case.
>
> Thoughts, problems, better ideas?

The check looks good - are we chasing up the Linux kernel (or Suse) guys
to get the bug investigated?
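
For readers following along, the shape of such a guard can be sketched
outside the server as below. This is only an illustration of "refuse to
reuse a buffer that already has contents": the demo_* names are
hypothetical, and scanning the whole block for zeroes is an assumption
chosen for simplicity, not the actual PageIsNew test in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BLCKSZ 8192   /* matches the 8K blocks discussed in the thread */

/* Hypothetical helper: true if the whole block is zero-filled. */
static bool demo_page_is_zeroed(const unsigned char *page)
{
    for (size_t i = 0; i < DEMO_BLCKSZ; i++)
        if (page[i] != 0)
            return false;
    return true;
}

/* Hypothetical guard for the "extend the relation" path: only hand the
 * buffer out for re-initialization if it holds no data; otherwise refuse
 * instead of silently zeroing it. Returns 0 on success, -1 on refusal. */
static int demo_claim_new_page(const unsigned char *page)
{
    assert(page != NULL);
    if (!demo_page_is_zeroed(page))
        return -1;      /* block past EOF already has contents: error out */
    return 0;           /* safe to treat as a fresh page */
}
```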


Cheers

Mark



Re: [HACKERS] Increase default effective_cache_size?

2006-09-23 Thread Joshua D. Drake



>> Thank you: the problem was the effective_cache_size (which I hadn't
>> changed from the default of 1000). This machine doesn't have loads of
>> RAM, but I knocked it up to 65536 and now the query uses the index,
>> without having to change the statistics.
>
> Considering recent discussion about how 8.2 is probably noticeably more
> sensitive to effective_cache_size than prior releases, I wonder whether
> it's not time to adopt a larger default value for that setting.  The
> current default of 1000 pages (8Mb) seems really pretty silly for modern
> machines; we could certainly set it to 10 times that without problems,
> and maybe much more.  Thoughts?

I think that 128 megs is probably a reasonable starting point. I know
plenty of people that run postgresql on 512 megs of ram. If you take
into account shared buffers and work mem, that seems like a reasonable
starting point.


Joshua D. Drake

> regards, tom lane




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





[HACKERS] Increase default effective_cache_size?

2006-09-23 Thread Tom Lane
Russ Brown <[EMAIL PROTECTED]> writes on pgsql-general:
> On Thu, 2006-09-21 at 23:39 -0400, Jim Nasby wrote:
>> Also make sure that you've set effective_cache_size  
>> correctly (I generally set it to total memory - 1G, assuming the  
>> server has at least 4G in it).

> Thank you: the problem was the effective_cache_size (which I hadn't
> changed from the default of 1000). This machine doesn't have loads of
> RAM, but I knocked it up to 65536 and now the query uses the index,
> without having to change the statistics.

Considering recent discussion about how 8.2 is probably noticeably more
sensitive to effective_cache_size than prior releases, I wonder whether
it's not time to adopt a larger default value for that setting.  The
current default of 1000 pages (8Mb) seems really pretty silly for modern
machines; we could certainly set it to 10 times that without problems,
and maybe much more.  Thoughts?
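
For anyone converting between the page counts and the megabyte figures
used in this thread, the arithmetic can be sketched as follows. It assumes
the usual 8 kB block size; the function names are made up for the example.
With these numbers, 1000 pages comes to a little under 8 MB, 65536 pages
is 512 MB, and 128 MB corresponds to 16384 pages.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_PAGE_BYTES 8192UL   /* assumed block size (8 kB) */

/* Convert a setting expressed in pages to whole mebibytes (rounded down). */
static unsigned long demo_pages_to_mb(unsigned long pages)
{
    assert(pages <= ULONG_MAX / DEMO_PAGE_BYTES);   /* avoid overflow */
    return pages * DEMO_PAGE_BYTES / (1024UL * 1024UL);
}

/* Convert a size in mebibytes to the equivalent number of pages. */
static unsigned long demo_mb_to_pages(unsigned long mb)
{
    assert(mb <= ULONG_MAX / (1024UL * 1024UL));    /* avoid overflow */
    return mb * 1024UL * 1024UL / DEMO_PAGE_BYTES;
}
```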

regards, tom lane



Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> So maybe your Openjade is not exactly the same
> Martijn was using, because what I understood was that Openjade replaced
> the ı with ı, which should work.

I think it's more likely that he was running with a non-DocBook
stylesheet (his openjade command did not explicitly select a catalog and
stylesheet the way that our Makefiles do).  Or just a different version
of the stylesheet.  I'm testing with whatever ships in Fedora Core 5.
I see definitions of ı in some of the files under
/usr/share/sgml, but evidently none of them are included by docbook...

> Does your browser display it correctly if you replace manually with ı?

Doesn't really matter whether it does or not, since my gripe about that
is that DocBook rejects it.

> On the other hand, I don't understand why DocBook would be Latin-1 only.

I'm surprised too that it couldn't be easily overridden.  Peter, any
idea why not?

regards, tom lane



Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Alvaro Herrera
Tom Lane wrote:
> Martijn van Oosterhout  writes:
> > I created a simple docbook document on my computer with ı and
> > ran openjade over and in the output file it is converted to ı.
> 
> I experimented with that, and openjade didn't complain about it, but
> it renders in my browser (Safari) as
> 
> Have the COPY command return a command tag that includes the number of rows 
> copied (Volkan Yazıcı)

Well, if I put a ı into an HTML document and open it on my
browser (Epiphany, which is Mozilla-based), it surely looks like
verbatim ı.  However, if I replace it with ı then it looks
like a dotless i.  So maybe your Openjade is not exactly the same
Martijn was using, because what I understood was that Openjade replaced
the ı with ı, which should work.

Does your browser display it correctly if you replace manually with ı?

On the other hand, I don't understand why DocBook would be Latin-1 only.
What would be the point of that limitation?  Some googling seems to
reveal that people indeed use other charsets, UTF-8 in particular (but
also Big5, Latin-2, etc), so apparently this isn't set in stone.  (I
admit that they mainly talk about XML Docbook though).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Tom Lane
Martijn van Oosterhout  writes:
> I created a simple docbook document on my computer with ı and
> ran openjade over and in the output file it is converted to ı.

I experimented with that, and openjade didn't complain about it, but
it renders in my browser (Safari) as

Have the COPY command return a command tag that includes the number of rows 
copied (Volkan Yazıcı)

So that hardly looks like a portable solution either.

regards, tom lane



[HACKERS] Buildfarm alarms

2006-09-23 Thread Dave Page
Hi Andrew,

I'm just investigating a problem with beta 1 running on Windows 2K and
XP, and noticed that neither Snake nor Bandicoot has built -HEAD for
nearly 3 weeks. I'm investigating why and will fix the problem, but it
strikes me that an alarm email from the server, noting that a run hasn't
been reported for a while, would have helped spot this earlier. This
could be configured with an admin-specified maximum number of days
between reports to allow for those machines that connect far less
frequently.
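
The server-side test being asked for is essentially a per-member staleness
check. A minimal sketch, assuming a hypothetical DemoMember record that
holds the last report time and the owner's chosen limit in days (none of
these names come from the buildfarm code):

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define SECONDS_PER_DAY 86400.0

/* Hypothetical per-machine record kept by the server. */
typedef struct DemoMember {
    const char *name;
    time_t last_report;              /* time of the most recent run report */
    int max_days_between_reports;    /* admin-specified alarm threshold */
} DemoMember;

/* True if the member has been silent for longer than its own threshold. */
static bool demo_member_is_overdue(const DemoMember *m, time_t now)
{
    assert(m != NULL && m->max_days_between_reports > 0);
    double idle_days = difftime(now, m->last_report) / SECONDS_PER_DAY;
    return idle_days > (double) m->max_days_between_reports;
}
```

Keeping the threshold per member is what lets a machine that only reports
monthly stay quiet while a daily builder raises an alarm after a week.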

Does that sound feasible to you?

Regards, Dave.



Re: [HACKERS] Fwd: Is the fsync() fake on FreeBSD6.1?

2006-09-23 Thread Andrew - Supernews
On 2006-09-23, Tom Lane <[EMAIL PROTECTED]> wrote:
> Andrew - Supernews <[EMAIL PROTECTED]> writes:
>> Whether the underlying device lies about the write completion is another
>> matter. All current SCSI disks have WCE enabled by default, which means
>> that they will lie about write completion if FUA was not set in the
>> request, which FreeBSD never sets.
>
> Huh?  The entire point of the SCSI command set is that it's not
> necessary to lie about write completion for performance reasons, because
> the architecture has always supported the concept of multiple requests
> in-flight concurrently.

I seem to recall we've had this conversation previously.

> Has the disk drive industry gotten a whole lot
> stupider in the fifteen years since I last wrote a SCSI driver?

Quite possibly, yes.

I certainly would never claim that WCE is a good idea, or that having it
enabled by default is a good idea, I merely report the _fact_ that it is
indeed enabled by default on every SCSI drive that I have recently
encountered (over several different vendors).

On my database machines I am careful to disable it (and check that this
does indeed take effect). I would recommend that others do likewise. The
performance impact of disabling WCE is not serious (other than removing
the unsafe speed gains of course).

Since posting the previous response I've been directed to a document that
seems to imply that Linux drivers now attempt to handle write-order
guarantees by introducing the concept of a "write barrier", i.e. a write
request which must complete after all previous writes and before all
subsequent ones.  Achieving this requires different strategies depending
on whether the underlying device allows command-queueing and/or exposes a
useful cache flush command; the implication of this is that for SCSI disks
with WCE, the linux driver will actually send SYNCHRONIZE CACHE when doing
a write barrier (which could be expensive of course). If (and I have no
idea if this is true) fsync() is implemented by means of such a barrier,
then this implies that an fsync()-heavy workload will perform much worse
on Linux when WCE is enabled than when it is disabled, since in the latter
case the driver will not issue SYNCHRONIZE CACHE and will simply ensure
that the relevant writes are all completed.

It would be interesting to see benchmarks of this.
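
A very small timing harness for such a comparison might look like the
sketch below: it appends 8 kB blocks to a scratch file and calls fsync()
after each one, returning the elapsed time. It uses only standard POSIX
calls; the function name and the scratch-file handling are assumptions for
illustration. Note that it can only measure what the OS reports; whether
the drive's write cache actually reached the platter is exactly what it
cannot tell you.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Returns elapsed wall-clock seconds for `count` write+fsync cycles of one
 * 8 kB block each, or -1.0 on any error. `path` is created and removed. */
static double demo_time_fsync_writes(const char *path, int count)
{
    char block[8192];
    struct timespec start, end;
    int fd;

    assert(path != NULL && count > 0);
    memset(block, 0, sizeof block);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < count; i++) {
        if (write(fd, block, sizeof block) != (ssize_t) sizeof block ||
            fsync(fd) != 0) {
            close(fd);
            unlink(path);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fd);
    unlink(path);
    return (double) (end.tv_sec - start.tv_sec) +
           (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Running it with WCE enabled and then disabled on the same disk, with a
count large enough to smooth out noise, would give the two figures to
compare.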

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services



Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Martijn van Oosterhout
On Sat, Sep 23, 2006 at 12:27:51PM -0400, Tom Lane wrote:
> To my mind the real problem is that one of the principal output formats
> we are interested in is HTML, and there is no dotless-i entity in any
> version of the HTML standard.  I trust I need not point out again the
> difference between "my browser recognizes this construct" and "it's in
> the standard".

Sure there is, HTML4 includes all of Unicode, thus also the dotless-i.
They gave up assigning names to them after latin1, but numerical
references are in the standard also (decimal and hex).
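
To illustrate the two numeric forms: the dotless i is code point U+0131,
so its decimal reference is &#305; and its hex reference is &#x131;. The
helper names below are made up for the example; the sketch just formats a
code point both ways.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a code point as an HTML decimal character reference.
 * Returns the length written (excluding the NUL), or -1 if buf is too
 * small. */
static int demo_decimal_ref(unsigned long codepoint, char *buf, size_t buflen)
{
    assert(buf != NULL);
    int n = snprintf(buf, buflen, "&#%lu;", codepoint);
    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}

/* Same, using the hexadecimal form of the reference. */
static int demo_hex_ref(unsigned long codepoint, char *buf, size_t buflen)
{
    assert(buf != NULL);
    int n = snprintf(buf, buflen, "&#x%lX;", codepoint);
    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}
```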

I created a simple docbook document on my computer with ı and
ran openjade over and in the output file it is converted to ı.
Openjade knows how to generate valid character references. The input
file is attached, I compiled it with the command:

openjade -V draft-mode -wall -wno-unused-param -wno-empty -i output-html -t sgml /tmp/a.sgml

For dsl file just copy the stylesheet.dsl file in the postgresql source
tree.

Why it doesn't work in the current docs I don't know, but I think we can
rule out limitations of HTML or Docbook.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




  

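[attached input file: DocBook test document; the markup was lost in archiving.
 Surviving text: the heading "Introduction" and the dotless i characters "ı ı".]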



signature.asc
Description: Digital signature


Re: [HACKERS] Fwd: Is the fsync() fake on FreeBSD6.1?

2006-09-23 Thread Tom Lane
Andrew - Supernews <[EMAIL PROTECTED]> writes:
> Whether the underlying device lies about the write completion is another
> matter. All current SCSI disks have WCE enabled by default, which means
> that they will lie about write completion if FUA was not set in the
> request, which FreeBSD never sets.

Huh?  The entire point of the SCSI command set is that it's not
necessary to lie about write completion for performance reasons, because
the architecture has always supported the concept of multiple requests
in-flight concurrently.  Has the disk drive industry gotten a whole lot
stupider in the fifteen years since I last wrote a SCSI driver?

regards, tom lane



[HACKERS] ReadBuffer(P_NEW) versus valid buffers

2006-09-23 Thread Tom Lane
Some off-list investigation of Dan Kavan's data loss problem,
http://archives.postgresql.org/pgsql-admin/2006-09/msg00092.php
has led to the conclusion that it seems to be a kernel bug.
The smoking gun is this strace excerpt:

> lseek(10, 0, SEEK_END)  = 913072128
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) 
> = 8192
> lseek(10, 0, SEEK_END)  = 913080320
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) 
> = 8192
> lseek(10, 0, SEEK_END)  = 913088512
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) 
> = 8192
> lseek(10, 0, SEEK_END)  = 913088512
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) 
> = 8192
> lseek(10, 0, SEEK_END)  = 913096704
> write(10, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) 
> = 8192

Note the lseek results --- surely each successive result ought to be 8K
more than the one before, but the fourth in this extract seems to have
forgotten about the immediately preceding write().
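
The arithmetic can be checked mechanically. The sketch below takes the five lseek results from the excerpt and reports every one that did not advance by exactly one 8192-byte page over its predecessor. The helper name find_stale_eofs is made up for illustration.

```python
PAGE_SIZE = 8192

# lseek(10, 0, SEEK_END) results copied from the strace excerpt above.
offsets = [913072128, 913080320, 913088512, 913088512, 913096704]


def find_stale_eofs(offsets, page_size=PAGE_SIZE):
    """Return (index, delta) for every result that did not advance by one page."""
    return [
        (i, cur - prev)
        for i, (prev, cur) in enumerate(zip(offsets, offsets[1:]), start=1)
        if cur - prev != page_size
    ]


print(find_stale_eofs(offsets))  # [(3, 0)]
```

Index 3 (zero-based) is the fourth lseek: it reports the same end-of-file as the third, a delta of 0 instead of 8192.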

These calls are coming from successive ReadBuffer(rel, P_NEW)
operations, which should just extend the file each time.  But
the incorrect lseek result is causing ReadBuffer to re-find
the buffer we had just finished filling with a page of data,
and that leads it to this conclusion:

/*
 * We get here only in the corner case where we are trying to extend
 * the relation but we found a pre-existing buffer marked BM_VALID.
 * (This can happen because mdread doesn't complain about reads
 * beyond EOF --- which is arguably bogus, but changing it seems
 * tricky.)  We *must* do smgrextend before succeeding, else the
 * page will not be reserved by the kernel, and the next P_NEW call
 * will decide to return the same page.  Clear the BM_VALID bit,
 * do the StartBufferIO call that BufferAlloc didn't, and proceed.
 */

So ReadBuffer without hesitation zeroes out the page of data we just
filled, and returns it for re-filling.  There went some tuples :-(

Although this is clearly Not Our Bug, it's annoying that ReadBuffer
falls into the trap so easily instead of complaining.  I'm still
disinclined to try to change the behavior of mdread(), but what I am
considering doing is adding a check here to error out if not PageIsNew.
AFAICS, if we do find a buffer for a page supposedly past EOF, it should
be zero-filled because that's what mdread returns in this case.  So this
change would prevent Dan's silent-overwrite scenario without changing the
behavior for any legitimate case.
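
To illustrate the idea, here is a toy model of the proposed check. It is not PostgreSQL code: ToyRelation, its extend method, and the error text are invented for this sketch, and "new" is approximated as "all zero bytes", which is an assumption about what PageIsNew would accept.

```python
PAGE_SIZE = 8192
ZERO_PAGE = bytes(PAGE_SIZE)


class ToyRelation:
    """Toy stand-in for a relation's cached buffers, keyed by block number."""

    def __init__(self):
        self.buffers = {}  # block number -> bytearray holding the page

    def extend(self, reported_nblocks):
        """Mimic ReadBuffer(rel, P_NEW) given the block count the kernel reports."""
        blkno = reported_nblocks
        existing = self.buffers.get(blkno)
        if existing is not None and bytes(existing) != ZERO_PAGE:
            # Proposed check: a valid buffer found past EOF must still be zeros.
            raise RuntimeError(f"unexpected data beyond EOF in block {blkno}")
        page = bytearray(PAGE_SIZE)
        self.buffers[blkno] = page
        return blkno, page


rel = ToyRelation()
blkno, page = rel.extend(reported_nblocks=0)
page[:5] = b"tuple"  # fill the new page with data

try:
    rel.extend(reported_nblocks=0)  # stale EOF: the kernel "forgot" the write
except RuntimeError as exc:
    print(exc)  # unexpected data beyond EOF in block 0
```

In the toy model a zero-filled buffer past the reported EOF is still reused silently, which corresponds to the legitimate case, while a buffer that already holds data causes an error instead of being wiped.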

Thoughts, problems, better ideas?

regards, tom lane



Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Peter Eisentraut
Martijn van Oosterhout wrote:
> Oh sorry, it wasn't clear from the commit entry. It's not that
> DocBook doesn't support the character or that it can't be
> represented. It's just not supported in the document encoding we're
> using.

No, no, and no.

The reason that it doesn't work is that the document character set for
DocBook is Latin 1, so any attempt to refer to a character not in this 
set is going to fail.
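
Taking that statement at face value, the constraint can be sketched as a simple range check: Latin 1 (ISO 8859-1) covers code points 0 through 255, and the dotless i is U+0131, which is 305. This models the "document character set" as nothing more than a code point range, which is a simplification of the SGML notion; the helper in_latin1 is made up for illustration.

```python
DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I, code point 305


def in_latin1(ch):
    """True if the character's code point fits in ISO 8859-1 (0..255)."""
    return ord(ch) <= 0xFF


print(in_latin1("i"))        # True
print(in_latin1(DOTLESS_I))  # False

try:
    DOTLESS_I.encode("latin-1")
except UnicodeEncodeError as exc:
    print("not encodable in Latin 1:", exc.reason)
```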

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Tom Lane
Martijn van Oosterhout  writes:
> So to me (more of a docbook novice) it seems like it's the stylesheet
> that's limiting you to latin1, not the docbook parser.

But the "stylesheet" in question is part of the basic docbook
infrastructure, so the above distinction is academic.  (Or at least
that's what Peter stated upthread.)

To my mind the real problem is that one of the principal output formats
we are interested in is HTML, and there is no dotless-i entity in any
version of the HTML standard.  I trust I need not point out again the
difference between "my browser recognizes this construct" and "it's in
the standard".

regards, tom lane



Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Martijn van Oosterhout
On Sat, Sep 23, 2006 at 08:49:02AM -0400, Bruce Momjian wrote:
> That's not how I understand it.  The document encoding is only related
> to how high-bit characters are interpreted, I am told by Peter, but for
> some reason the toolchain just doesn't support UTF8, even though if you
> use &#305; in SGML it does come out right in HTML, but new toolchains
> throw an error for it.

Dunno about UTF-8, but openjade only supports one character repertoire,
and that's Unicode (under character handling in the man page).

According to the docbook reference, a way to specify the dotless i
is &inodot;

http://www.oasis-open.org/docbook/documentation/reference/html/iso-lat2.html

But it's part of Latin-2, and if your stylesheet declares latin1 as
the only valid characters, then that character is invalid, no matter
how you represent it. I was just surprised, because &inodot; has been
part of docbook since version 3, which is quite some time ago now.

So to me (more of a docbook novice) it seems like it's the stylesheet
that's limiting you to latin1, not the docbook parser.

Anyway, the problem has been solved, so we can all get back to testing
the beta now.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [HACKERS] pgsql: We're going to have to spell dotless i

2006-09-23 Thread Bruce Momjian
Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> On Sat, Sep 23, 2006 at 11:54:47AM +0200, Peter Eisentraut wrote:
> > Martijn van Oosterhout wrote:
> > > Well you could always use the HTML4 &#305; which most tools should
> > > understand. At least browsers have good support for this kind of
> > > entity.
> > 
> > Please review the recent thread on pgsql-docs before reiterating all the 
> > suggestions.
> 
> Oh sorry, it wasn't clear from the commit entry. It's not that DocBook
> doesn't support the character or that it can't be represented. It's
> just not supported in the document encoding we're using.

That's not how I understand it.  The document encoding is only related
to how high-bit characters are interpreted, I am told by Peter, but for
some reason the toolchain just doesn't support UTF8, even though if you
use &#305; in SGML it does come out right in HTML, but new toolchains
throw an error for it.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Martijn van Oosterhout
On Sat, Sep 23, 2006 at 11:54:47AM +0200, Peter Eisentraut wrote:
> Martijn van Oosterhout wrote:
> > Well you could always use the HTML4 &#305; which most tools should
> > understand. At least browsers have good support for this kind of
> > entity.
> 
> Please review the recent thread on pgsql-docs before reiterating all the 
> suggestions.

Oh sorry, it wasn't clear from the commit entry. It's not that DocBook
doesn't support the character or that it can't be represented. It's
just not supported in the document encoding we're using.

Sorry for the noise.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Peter Eisentraut
Martijn van Oosterhout wrote:
> Well you could always use the HTML4 &#305; which most tools should
> understand. At least browsers have good support for this kind of
> entity.

Please review the recent thread on pgsql-docs before reiterating all the 
suggestions.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] pgsql: We're going to have to spell dotless i as plain i, because

2006-09-23 Thread Martijn van Oosterhout
On Fri, Sep 22, 2006 at 12:29:05PM -0300, Tom Lane wrote:
> Log Message:
> ---
> We're going to have to spell dotless i as plain i, because dotless i is
> not in the character set supported by DocBook nor standard HTML.  (Sorry
> Volkan.)  Also replace random character-set references by a pointer to
> the actual standard.

Well you could always use the HTML4 &#305; which most tools should
understand. At least browsers have good support for this kind of
entity.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.

