Re: [HACKERS] Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server
Greg Smith wrote:
> On Sun, 20 May 2007, Shachar Shemesh wrote:
>> This is not data given to store. It's data being exported.
>
> Data being exported has a funny way of turning around and being stored
> in the database again. It's kind of nice to know the damage done
> during that round trip is minimized.

I agree. All I'm asking, and have not received an answer to yet, is whether assuring that we don't have any SEMANTIC damage is enough. In other words, if I can assure that data exported and then imported will always, under all circumstances, compare the same as the original, would that be enough of a requirement?

Put another way: if I offer a format that is assured of preserving both mantissa and exponent precision and range, as well as all extra attributes (+/-Infinity and NaN), but does not guarantee that semantically identical constructs are told apart (+0 vs. -0, and the different NaNs), would that format be acceptable?

>> Tom seems to think this is not a goal (though, aside from his disbelief
>> that such a goal is attainable, I have heard no arguments against it).
>
> If Tom thinks it's not attainable, the best way to convince him
> otherwise would be to demonstrate that it is.

Granted. That's why I've been quiet. I'm pulling out my sources for the ARM FP format details, to make sure what I have in mind would work.

> One reason people use text formats for cross-platform exchanges is
> that getting portable binary compatibility for things like floating
> point numbers is much harder than you seem to think it is.

I'll just point out that none of the things Tom seems to be concerned about are preserved over the text format.

> Stepping back for a second, your fundamental argument seems to be based
> on the idea that doing conversions to text is such a performance issue
> in a driver that it's worth going through these considerable
> contortions to avoid it.
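The distinction drawn above — values that are semantically identical but bitwise different — is easy to see concretely. A minimal Python sketch, using only the standard library:

```python
import struct

def bits(x: float) -> int:
    """Raw 64-bit IEEE 754 pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

# +0.0 and -0.0 compare equal (no semantic damage if they're conflated)...
assert 0.0 == -0.0
# ...yet their bit patterns differ in exactly the sign bit.
assert bits(0.0) == 0
assert bits(-0.0) == 1 << 63

# NaN never compares equal, even to itself, so distinct NaN payloads
# are indistinguishable by any value comparison anyway.
nan = float("nan")
assert nan != nan
```

So a format that conflates +0 with -0, or one NaN payload with another, loses nothing that a value comparison could ever detect.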
Converting to text adds CPU overhead on both client and server, as well as network transmission overhead. Even if it's not detrimental to performance, I'm wondering why we should insist on paying it.

You are right that I offered no concrete implementation. I'll do that now, but it depends on an important question - what is the range of the ARM floating point format? Not having either an ARM machine to test on, nor the floating point specs, it may be that a simpler implementation is possible. I offer this implementation up because I see people think I'm talking out of my ass.

A 64-bit IEEE float can distinguish almost 2^64 distinct values. It loses one bit pattern to the dual zero notation (+0 and -0 encode the same value), and it loses all of the NaNs, which occupy 2^(m+1)-2 bit patterns (exponent all ones, non-zero mantissa, both signs). Overall, an n-bit IEEE float with m bits of mantissa can represent 2^n - 2^(m+1) + 1 actual floating point values, counting the two infinities. That means that if we take a general signed floating point number, of whose representation we know nothing except that it is n bits wide and has a mantissa and an exponent, and we want to encode it as an IEEE number of the same width with mantissa size m and exponent size e = n-m-1, we will have at most 2^(m+1) - 1 unrepresentable numbers.

In a nutshell, what I suggest is that we export floating points in binary IEEE form, and add a status word. The status word will dictate how many bits of mantissa there are in the IEEE format and what the exponent bias is, and can also carry one or two extra bits of the actual number, in case the exporting platform has more distinct floats than can be represented in an IEEE float of the same word length.

The nice thing about this format is that exporting from an IEEE platform is as easy as exporting the binary image of the float, plus a status word that is a constant. Virtually no overhead.
Importing from an IEEE platform to an IEEE platform is, likewise, as easy as comparing the status word to your own constant, and if they match, just copying the binary. This maintains all of Tom's strict round trip requirements. In fact, for export/import on the same IEEE platform no data conversion of any kind takes place at all.

There are questions that need to be answered. For example, what happens if you try to import a NaN into a platform that has no such concept? You'd have to put in a NULL or something similar. Similarly, how do you import Infinity? These, however, are questions that should be answered the same way for text imports, so there is nothing binary-specific here.

I hope that, at least, presents a workable plan. As I said before, I'm waiting for the specs for ARM's floating point before I can move forward. If, as I suspect, ARM's range is even more limited, then I may try to suggest a more compact export representation, pending the question of whether we have any other non-IEEE platform, and what the situation is there.

Shachar

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
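On an IEEE platform, the proposed round trip degenerates to a constant compare plus a binary copy. A toy Python sketch — the status-word layout here (mantissa-bit count in the high half, exponent bias in the low half) is invented purely for illustration, not a proposed wire format:

```python
import struct

# Invented status-word layout: mantissa bits << 16 | exponent bias.
IEEE_BINARY64_STATUS = (52 << 16) | 1023

def export_float(x: float) -> bytes:
    # IEEE platform: the raw binary image plus a constant status word.
    return struct.pack("<I", IEEE_BINARY64_STATUS) + struct.pack("<d", x)

def import_float(buf: bytes) -> float:
    status, = struct.unpack_from("<I", buf, 0)
    if status == IEEE_BINARY64_STATUS:
        # Formats match: a straight binary copy, no conversion at all.
        return struct.unpack_from("<d", buf, 4)[0]
    # A non-IEEE source would need per-format decoding from mantissa/bias.
    raise NotImplementedError("foreign float format")

# Bit-exact round trip, including -0.0 and Infinity:
for x in (1.5, -0.0, float("inf")):
    assert struct.pack("<d", import_float(export_float(x))) == struct.pack("<d", x)
```

The fast path never inspects the float's bits; only a mismatched status word would trigger real conversion work.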
Re: [HACKERS] Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server
On Sun, 20 May 2007, Shachar Shemesh wrote:

> This is not data given to store. It's data being exported.

Data being exported has a funny way of turning around and being stored in the database again. It's kind of nice to know the damage done during that round trip is minimized.

> Tom seems to think this is not a goal (though, aside from his disbelief
> that such a goal is attainable, I have heard no arguments against it).

If Tom thinks it's not attainable, the best way to convince him otherwise would be to demonstrate that it is. From here, it looks like your response to the pitfalls he pointed out has been waving your hands and saying "no, that can't really be a problem" while making it clear you haven't dug into the details. One reason people use text formats for cross-platform exchanges is that getting portable binary compatibility for things like floating point numbers is much harder than you seem to think it is.

Stepping back for a second, your fundamental argument seems to be based on the idea that doing conversions to text is such a performance issue in a driver that it's worth going through these considerable contortions to avoid it. Given how many other places performance can be throttled along that path, that is itself a position that requires defending nowadays. In the typical driver-bound setups I work with, there's plenty of CPU time to burn for simple data conversion work, because either the network wire speed or the speed of the underlying database I/O is the real bottleneck.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)
On 05/19/2007 12:48:22 PM, Tom Lane wrote:
> Well, but if you ask at an early stage it's perfectly fair to ask for
> comments on how much work an implementation of idea X might be. Plus,
> people could save you from wasting time going down dead-end paths.

True. But then I wouldn't get extra points for being both clueless _and_ stubborn. ;-)

Karl <[EMAIL PROTECTED]>
Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein
Re: [HACKERS] [PATCHES] build/install xml2 when configured with libxml
On 5/20/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> contrib is a misnomer at best. When 8.3 branches I intend to propose
> that we abandon it altogether, in line with some previous discussions.
> We can change the configure help text if people think it matters that
> much - which seems to me much more potentially useful than changing
> comments.

Actually, I meant the configure help text, not any comment in the code :-)

--
Best regards,
Nikolay
Re: [HACKERS] Idea that might inspire more patch reviewing.
Ron Mayer wrote:
> Bruce Momjian wrote:
>> In talking to people who are assigned to review patches or could review
>> patches, I often get the reply, "Oh, yeah, I need to do that".
>
> Would it inspire more people to learn enough to become patch reviewers
> if patch authors scheduled walkthroughs of their patches, with question
> and answer sessions, over IRC or maybe even some voice conferencing
> system like Skype? It is common inside a single company, but I'm not
> sure it is possible to do in an open source community.

I think the following tool looks like a good solution for patch review:

http://www.chipx86.com/blog/?p=222
http://code.google.com/p/reviewboard/

Zdenek
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Koichi Suzuki <[EMAIL PROTECTED]> writes:
> As replied to "Patch queue triage" by Tom, here's simplified patch to
> mark WAL record as "compressable", with no increase in WAL itself.
> Compression/decompression commands will be posted separately to PG
> Foundary for further review.

Applied with some minor modifications. I didn't like the idea of suppressing the sanity check on WAL record length; I think that's fairly important. Instead, I added a provision for an XLOG_NOOP WAL record type that can be used to fill in the extra space.

The way I envision it working is that the compressor removes backup blocks and converts each compressible WAL record to have the same contents and length it would've had if written without backup blocks. Then it inserts an XLOG_NOOP record with its length set to indicate the amount of extra space that needs to be chewed up -- but in the compressed version of the WAL file, XLOG_NOOP's "data area" is not actually stored. The decompressor need only scan the file looking for XLOG_NOOP and insert the requisite number of zero bytes (and maybe recompute the XLOG_NOOP's CRC, depending on whether you want it to be valid for the short-format record in the compressed file).

There will also be some games to be played at WAL page boundaries, but you had to do that anyway.

regards, tom lane
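The decompressor's job, as described, reduces to scanning for XLOG_NOOP records and synthesizing the missing zero bytes. A toy Python model of that idea — the record layout here (one type byte plus a little-endian u32 length, with 0xFF as the no-op type code) is invented for illustration and bears no relation to the real XLOG record header:

```python
import struct

XLOG_NOOP = 0xFF  # invented type code for this sketch

# Toy record: type (u8) + data length (u32 LE) + data bytes.
# In the compressed file an XLOG_NOOP stores only its 5-byte header;
# the decompressor re-materializes data_len zero bytes behind it.

def decompress(stream: bytes) -> bytes:
    out = bytearray()
    pos = 0
    while pos < len(stream):
        rec_type, data_len = struct.unpack_from("<BI", stream, pos)
        pos += 5
        out += struct.pack("<BI", rec_type, data_len)
        if rec_type == XLOG_NOOP:
            out += bytes(data_len)           # data area synthesized as zeros
        else:
            out += stream[pos:pos + data_len]
            pos += data_len
    return bytes(out)

# One ordinary record followed by a no-op claiming 4 bytes of space:
compressed = struct.pack("<BI", 1, 3) + b"abc" + struct.pack("<BI", XLOG_NOOP, 4)
expanded = decompress(compressed)
assert expanded == compressed + b"\x00" * 4
```

The key property mirrors Tom's description: ordinary records pass through untouched, and only the no-op's data area is reconstructed, so decompression needs no knowledge of the original backup blocks.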
Re: [HACKERS] [PATCHES] build/install xml2 when configured with libxml
Nikolay Samokhvalov wrote:
> The current CVS configure is really confusing: it has a "--with-xslt"
> option, while there is no XSLT support in the core. At least let's
> change the option's comment to something like "build with XSLT support
> (currently used for contrib/xml2 only)"...

contrib is a misnomer at best. When 8.3 branches I intend to propose that we abandon it altogether, in line with some previous discussions. We can change the configure help text if people think it matters that much - which seems to me much more potentially useful than changing comments.

cheers

andrew
Re: [HACKERS] [PATCHES] build/install xml2 when configured with libxml
On 4/15/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:
> Well, if we're going to make libxslt an explicit thing, then it'd be
> trivial to add an XSLT transformation function into the core, and then
> I think we can claim equivalent support. But we'll have to check the
> details, of course.
>
> I have been thinking, however, that I don't want to add more and more
> library dependencies to the server. libxml2 was necessary to some
> extent. But XSLT functionality could easily be provided as a module.
> This would be easy to do and might be useful even for 8.3. But I don't
> really know how to label that. Having a contrib/xslt alongside
> contrib/xml2 would probably be confusing. Ideas?

The current CVS configure is really confusing: it has a "--with-xslt" option, while there is no XSLT support in the core. At least let's change the option's comment to something like "build with XSLT support (currently used for contrib/xml2 only)"...

--
Best regards,
Nikolay
Re: [HACKERS] Passing more context info to selectivity-estimation code
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Comments, better ideas?
>
> How about building a separate Var node for the variable when it's above
> an outer join?

[ itch... ] That would be a whole lot *more* invasive than what I'm proposing now. It might be an interesting avenue to pursue, particularly in conjunction with the planner's inability to deal with non-nullable expressions in subselect outputs below outer joins (which seems to require some special kind of Var to fix). But I can't see back-patching it.

regards, tom lane
Re: [HACKERS] Passing more context info to selectivity-estimation code
Tom Lane wrote:
> Comments, better ideas?

How about building a separate Var node for the variable when it's above an outer join? That node would point to the original Var, and have an additional modifier indicating the percentage of injected nulls.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com