Re: [HACKERS] Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

2007-05-20 Thread Shachar Shemesh
Greg Smith wrote:
> On Sun, 20 May 2007, Shachar Shemesh wrote:
>
>> This is not data given to store. It's data being exported.
>
> Data being exported has a funny way of turning around and being stored
> in the database again.  It's kind of nice to know the damage done
> during that round trip is minimized.
I agree. All I'm asking, and have not received an answer yet, is whether
assuring that we don't have any SEMANTIC damage is enough.

In other words, if I can assure that data exported and then imported
will always, under all circumstances, compare the same to the original,
would that be enough of a requirement? Put differently, if I offer a
format that is assured of preserving both mantissa and exponent
precision and range, as well as all extra attributes (+/-Infinity and
NaN), but does not guarantee that the semantically identical constructs
are told apart (+0 vs. -0, and the different NaNs), would that format be
acceptable?
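
The difference between "compares the same" and "is bit-identical" can be illustrated with a small sketch. It is written in Python rather than the server's C purely for brevity. The helper names `bits` and `semantically_equal` are made up for this illustration, and treating every NaN as equal to every other NaN is an assumption modelled on PostgreSQL's float comparison rules, not on IEEE comparison.

```python
import math
import struct

def bits(x):
    """Return the 8-byte big-endian IEEE 754 image of a Python float."""
    return struct.pack(">d", x)

def semantically_equal(a, b):
    """Compare as values: +0 equals -0, and any NaN equals any other NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b

# +0 and -0 compare equal as values but differ in their sign bit,
# so a format may merge them and still pass a "compares the same" test.
assert semantically_equal(0.0, -0.0)
assert bits(0.0) != bits(-0.0)
```

A format that guarantees only `semantically_equal` on round trip is free to collapse the two zeros and the various NaN payloads, which is exactly the relaxation being asked about.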
>
>> Tom seems to think this is not a goal (though, aside from his disbelief
>> that such a goal is attainable, I have heard no arguments against it).
>
> If Tom thinks it's not attainable, the best way to convince him
> otherwise would be to demonstrate that it is.
Granted. That's why I've been quiet. I'm pulling my sources for the ARM
FP format details, to make sure what I have in mind would work.
> One reason people use text formats for cross-platform exchanges is
> that getting portable binary compatibility for things like floating
> point numbers is much harder than you seem to think it is.
I'll just point out that none of the things that Tom seems to be
concerned about are preserved over text format.
>
> Stepping back for a second, your fundamental argument seems to be based
> on the idea that doing conversions to text is such a performance issue
> in a driver that it's worth going through these considerable
> contortions to avoid it.
Converting to text adds a CPU overhead in both client and server, as
well as a network transmission overhead. Even if it's not detrimental to
performance, I'm wondering why we should insist on paying it.

You are right that I offered no concrete implementation. I'll do it now,
but it depends on an important question: what is the range of the
ARM floating point format? Having neither an ARM to test on nor the
floating point specs, I cannot rule out that a simpler implementation is
possible. I offer this implementation because I see people think I'm
talking out of my ass.

A 64 bit IEEE float can distinguish between almost all 2^64 distinct
floats. It loses two combinations for the + and - infinity, one
combination for the dual zero notation, and we also lose all of the
NaNs, which means (2^mantissa)-2 combinations. Overall, an n bit IEEE
float with m bits of mantissa will be able to represent 2^n - 2^m - 1
actual floating point numbers.
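
The count can be checked by brute force on a format small enough to enumerate. The sketch below uses IEEE 754 binary16 (n = 16, 10 stored fraction bits) through Python's `struct` format code `e`. The function name `count_half_precision` is made up for this illustration. For binary16 the enumeration gives 2^16 - 2^11 - 1 distinct finite values, which agrees with the formula above when m is read as the stored fraction width plus one, since the sign bit doubles the number of NaN patterns.

```python
import math
import struct

def count_half_precision():
    """Classify all 2**16 bit patterns of IEEE 754 binary16."""
    nans = infs = 0
    finite_values = set()
    for pattern in range(2 ** 16):
        (value,) = struct.unpack(">e", struct.pack(">H", pattern))
        if math.isnan(value):
            nans += 1
        elif math.isinf(value):
            infs += 1
        else:
            # +0.0 and -0.0 compare equal, so they collapse to one entry.
            finite_values.add(value)
    return nans, infs, len(finite_values)

nans, infs, finite = count_half_precision()
print(nans, infs, finite)
```

The same arithmetic applied to a 64 bit double (52 stored fraction bits) would give 2^64 - 2^53 - 1 distinct finite values, though that case is far too large to enumerate this way.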

That means that if we take a general signed floating point number, about
whose representation we know nothing except that it is n bits wide, and
that it has a mantissa and an exponent, and we want to encode it as an
IEEE number of the same width with mantissa size m and exponent of size
e=n-m-1, we will have at most 2^m+1 unrepresentable numbers.

In a nutshell, what I suggest is that we export floating points in
binary form in IEEE format, and add a status word to it. The status word
will dictate how many bits of mantissa there are in the IEEE format,
what the exponent bias is, as well as add between one and two bits to
the actual number, in case the number of floats the exported platform
has is larger than the number of floats that can be represented in IEEE
with the same word length.

The nice thing about this format is that exporting from an IEEE platform
is as easy as exporting the binary image of the float, plus a status
word that is a constant. Virtually no overhead. Importing from an IEEE
platform to an IEEE platform is, likewise, as easy as comparing the
status word to your own constant, and if they match, just copy the
binary. This maintains all of Tom's strict round trip requirements. In
fact, for export/import on the same IEEE platform no data conversion of
any kind takes place at all.
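
A minimal sketch of the fast path described above, in Python for brevity. The status word layout used here (mantissa width in the high bits, exponent bias below it, two low bits for the "extra" bits) is purely an assumption made for illustration, as the message does not fix a layout. The names `LOCAL_STATUS`, `export_double` and `import_double` are likewise hypothetical.

```python
import math
import struct

# Hypothetical status word layout, chosen only for this sketch:
#   bits 16 and up: number of mantissa bits
#   bits 2..15:     exponent bias
#   bits 0..1:      extra bits for non-IEEE values (unused on IEEE hosts)
MANTISSA_BITS = 52
EXPONENT_BIAS = 1023
EXTRA_BITS = 0
LOCAL_STATUS = (MANTISSA_BITS << 16) | (EXPONENT_BIAS << 2) | EXTRA_BITS

def export_double(x):
    """Emit a constant status word followed by the raw IEEE binary64 image."""
    return struct.pack(">Id", LOCAL_STATUS, x)

def import_double(buf):
    """Copy the binary image when the sender's status word matches ours."""
    (status,) = struct.unpack_from(">I", buf, 0)
    if status != LOCAL_STATUS:
        # A real implementation would convert here; the sketch stops short.
        raise NotImplementedError("foreign float layout: conversion needed")
    (value,) = struct.unpack_from(">d", buf, 4)
    return value

# Same-format round trip keeps even the sign of zero.
assert math.copysign(1.0, import_double(export_double(-0.0))) == -1.0
```

The conversion branch for a mismatched status word is where all the real work would be, and it is deliberately left unimplemented here.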

There are questions that need to be answered. For example, what happens
if you try to import a NaN into a platform that has no such concept?
You'd have to put in a NULL or something similar. Similarly, how do you
import Infinity? These, however, are questions that should be answered
the same way for text imports, so there is nothing binary specific here.

I hope that, at least, presents a workable plan. As I said before, I'm
waiting for the specs for ARM's floating point before I can move
forward. If, as I suspect, ARM's range is even more limited, then I may
try to suggest a more compact export representation, pending the question
of whether we have any other non-IEEE platform, and what the
situation is there.

Shachar

---(end of broadcast)---
TIP 7: You can help support the Post

Re: [HACKERS] Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

2007-05-20 Thread Greg Smith

On Sun, 20 May 2007, Shachar Shemesh wrote:


> This is not data given to store. It's data being exported.


Data being exported has a funny way of turning around and being stored in 
the database again.  It's kind of nice to know the damage done during that 
round trip is minimized.



> Tom seems to think this is not a goal (though, aside from his disbelief
> that such a goal is attainable, I have heard no arguments against it).


If Tom thinks it's not attainable, the best way to convince him otherwise 
would be to demonstrate that it is.  From here, it looks like your 
response to his concerns for the pitfalls he pointed out has been waving 
your hands and saying "no, that can't really be a problem" while making it 
clear you haven't dug into the details.  One reason people use text 
formats for cross-platform exchanges is that getting portable binary 
compatibility for things like floating point numbers is much harder than 
you seem to think it is.


Stepping back for a second, your fundamental argument seems to be based on 
the idea that doing conversions to text is such a performance issue in a 
driver that it's worth going through these considerable contortions to 
avoid it.  Given how many other places performance can be throttled along 
that path, that itself is a position that requires defending nowadays. 
In the typical driver-bound setups I work with, there's plenty of CPU time 
to burn for simple data conversion work because either the network wire 
speed or the speed of the underlying database I/O are the real 
bottlenecks.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)

2007-05-20 Thread Karl O. Pinc


On 05/19/2007 12:48:22 PM, Tom Lane wrote:

> Well, but if you ask at an early stage it's perfectly fair to ask for
> comments on how much work an implementation of idea X might be.  Plus
> people could save you from wasting time going down dead-end paths.


True.  But then I wouldn't get extra points for being both clueless
_and_ stubborn.  ;-)


Karl <[EMAIL PROTECTED]>
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein




Re: [HACKERS] [PATCHES] build/install xml2 when configured with libxml

2007-05-20 Thread Nikolay Samokhvalov

On 5/20/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote:


> contrib is a misnomer at best. When 8.3 branches I intend to propose
> that we abandon it altogether, in line with some previous discussions.
>
> We can change the configure help text if people think it matters that
> much - which seems to me much more potentially useful than changing
> comments.


Actually, I meant configure help text, not any comment in the code :-)

--
Best regards,
Nikolay



Re: [HACKERS] Idea that might inspire more patch reviewing.

2007-05-20 Thread Zdenek Kotala

Ron Mayer wrote:

> Bruce Momjian wrote:
>> In talking to people who are assigned to review patches or could review
>> patches, I often get the reply, "Oh, yea, I need to do that".
>
> Would it inspire more people to learn enough to become patch
> reviewers if patch authors scheduled walkthroughs of their
> patches with question and answer sessions over IRC or maybe
> even some voice conferencing system like skype?


It is common within a single company, but I'm not sure whether it is
possible in an open source community.


I think the following tool looks like a good solution for patch review:

http://www.chipx86.com/blog/?p=222
http://code.google.com/p/reviewboard/

Zdenek



Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-05-20 Thread Tom Lane
Koichi Suzuki <[EMAIL PROTECTED]> writes:
> As replied to "Patch queue triage" by Tom, here's simplified patch to
> mark WAL record as "compressable", with no increase in WAL itself.
> Compression/decompression commands will be posted separately to PG
> Foundry for further review.

Applied with some minor modifications.  I didn't like the idea of
suppressing the sanity-check on WAL record length; I think that's
fairly important.  Instead, I added a provision for an XLOG_NOOP
WAL record type that can be used to fill in the extra space.
The way I envision that working is that the compressor removes
backup blocks and converts each compressible WAL record to have the
same contents and length it would've had if written without backup
blocks.  Then, it inserts an XLOG_NOOP record with length set to
indicate the amount of extra space that needs to be chewed up --
but in the compressed version of the WAL file, XLOG_NOOP's "data
area" is not actually stored.  The decompressor need only scan
the file looking for XLOG_NOOP and insert the requisite number of
zero bytes (and maybe recompute the XLOG_NOOP's CRC, depending on
whether you want it to be valid for the short-format record in the
compressed file).  There will also be some games to be played for
WAL page boundaries, but you had to do that anyway.
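
The decompression step can be sketched on a toy record stream. Everything about the layout below is an assumption made for illustration: the 5-byte header (one type byte plus a 4-byte payload length), the type code chosen for `XLOG_NOOP`, and the helper names. It is not the real WAL record format, and it ignores CRCs and page boundaries entirely.

```python
import struct

HEADER = struct.Struct(">BI")  # toy header: record type, payload length
XLOG_NOOP = 0xFF               # hypothetical type code for this sketch

def make_record(rtype, payload):
    """Build a full record: header followed by its payload."""
    return HEADER.pack(rtype, len(payload)) + payload

def make_compressed_noop(pad_len):
    """In the compressed stream a NOOP keeps its length but stores no data."""
    return HEADER.pack(XLOG_NOOP, pad_len)

def decompress(stream):
    """Scan for XLOG_NOOP headers and re-insert their zero-filled data areas."""
    out = bytearray()
    pos = 0
    while pos < len(stream):
        rtype, length = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        out += HEADER.pack(rtype, length)
        if rtype == XLOG_NOOP:
            out += bytes(length)              # data area was never stored
        else:
            out += stream[pos:pos + length]   # ordinary record: copy payload
            pos += length
    return bytes(out)

compressed = make_record(1, b"abc") + make_compressed_noop(8) + make_record(2, b"xy")
restored = decompress(compressed)
```

In this toy stream the restored output is 8 bytes longer than the compressed input, which is exactly the padding the NOOP record stands in for.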

regards, tom lane



Re: [HACKERS] [PATCHES] build/install xml2 when configured with libxml

2007-05-20 Thread Andrew Dunstan



Nikolay Samokhvalov wrote:


> The current CVS configure is really confusing: it has a "--with-xslt"
> option, while there is no XSLT support in the core. At least let's
> change the option's comment to something like "build with XSLT support (now
> it is used for contrib/xml2 only)"...



contrib is a misnomer at best. When 8.3 branches I intend to propose 
that we abandon it altogether, in line with some previous discussions.


We can change the configure help text if people think it matters that 
much - which seems to me much more potentially useful than changing 
comments.


cheers

andrew



Re: [HACKERS] [PATCHES] build/install xml2 when configured with libxml

2007-05-20 Thread Nikolay Samokhvalov

On 4/15/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:

Well, if we're going to make libxslt an explicit thing, then it'd be
trivial to add an xslt transformation function into the core, and then
I think we can claim equivalent support.  But we'll have to check the
details, of course.

I have been thinking, however, that I don't want to add more and more
library dependencies into the server.  libxml2 was necessary to some
extent.  But xslt functionality could easily be provided as a module.
This would be easy to do and might be useful even for 8.3.  But I don't
really know how to label that.  Having a contrib/xslt alongside
contrib/xml2 would probably be confusing.  Ideas?


The current CVS configure is really confusing: it has a "--with-xslt"
option, while there is no XSLT support in the core. At least let's
change the option's comment to something like "build with XSLT support (now
it is used for contrib/xml2 only)"...

--
Best regards,
Nikolay



Re: [HACKERS] Passing more context info to selectivity-estimation code

2007-05-20 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Comments, better ideas?

> How about building a separate Var-node for the variable when it's above 
> an outer join?

[ itch... ]  That would be a whole lot *more* invasive than what I'm
proposing now.  It might be an interesting avenue to pursue,
particularly in conjunction with the planner's inability to deal with
non-nullable expressions in subselect outputs below outer joins (which
seems to require some special kind of Var to fix).  But I can't see
back-patching it.

regards, tom lane



Re: [HACKERS] Passing more context info to selectivity-estimation code

2007-05-20 Thread Heikki Linnakangas

Tom Lane wrote:

> Comments, better ideas?


How about building a separate Var-node for the variable when it's above 
an outer join? That node would point to the original Var, and have an 
additional modifier which indicates the percentage of injected nulls.
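
To make the idea concrete, here is a rough sketch of such a wrapper node, in Python rather than the planner's C. The class and field names (`Var`, `OuterJoinVar`, `injected_null_frac`) are invented for this illustration and do not correspond to actual planner structures. The way the two null fractions are combined is an assumption: it treats the null-extended rows as a fraction of the join output and assumes the remaining rows keep the base column's own null fraction.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str
    null_frac: float  # fraction of NULLs in the underlying column

@dataclass(frozen=True)
class OuterJoinVar:
    base: Var                  # points back to the original Var
    injected_null_frac: float  # fraction of join output rows that are null-extended

    def effective_null_frac(self):
        """NULL fraction seen above the join, under the stated assumption."""
        f = self.injected_null_frac
        return f + (1.0 - f) * self.base.null_frac

wrapped = OuterJoinVar(Var("b.x", 0.5), 0.5)
print(wrapped.effective_null_frac())
```

A selectivity estimator looking at the wrapped variable would then see a higher null fraction than the column statistics alone report.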


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
