Re: [HACKERS] Including Snapshot Info with Indexes
Ühel kenal päeval, L, 2007-10-13 kell 17:44, kirjutas Gokulakannan Somasundaram: Hi, I went through this article and it was good. Please have a look at it. http://www.databasecolumn.com/2007/09/one-size-fits-all.html This article was written by Michael Stonebraker, considered to be the founder of our database. He has mentioned that the DBMS designed in 1970s haven't changed according to the change that has happened in Hardware landscape. What has happened in reality, is that the speed difference between CPU, RAM and disk speeds has _increased_ tremendously, which makes it even more important to _decrease_ the size of stored data if you want good performance The Vertica database(Monet is a open source version with the same principle) makes use of the very same principle. Use more disk space, since they are less costly and optimize the data warehousing. MonetDB is not about using more disk to get better performance, but about reducing the need to read unused data and increasing the speed by that. There is also a MonetDB/X100 project, which tries to make MonetOD order(s) of magnitude faster by doing in-page compression in order to get even more performance, see: http://homepages.cwi.nl/~boncz/x100.html http://www.cwi.nl/themes/ins1/publications/docs/ZuBoNeHe:DEBULL:05.pdf Even otherwise we are recommending Indexes with snapshot as an option. We are not replacing the current index scheme. So if someone feels that his database should run on lesser disk space, let them create the normal index. If he feels he can afford to have more redundant disk space, then he can create indexes with snapshots. We are reducing random I/Os at the cost of extra disk space. So definitely that's a good. But tech folks like us can better decide on something based on experiments, as Tom has pointed out. So let's see whether Indexes with snapshot is worth the trade-off in space. Agreed. Hannu ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Including Snapshot Info with Indexes
A Dissabte 13 Octubre 2007, Gokulakannan Somasundaram va escriure: Even otherwise we are recommending Indexes with snapshot as an option. We are not replacing the current index scheme. So if someone feels that his database should run on lesser disk space, let them create the normal index. If he feels he can afford to have more redundant disk space, then he can create indexes with snapshots. We are reducing random I/Os at the cost of extra disk space. So definitely that's a good. But tech folks like us can better decide on something based on experiments, as Tom has pointed out. So let's see whether Indexes with snapshot is worth the trade-off in space. There's also LucidDB [1], another open souce column based data base. But if you look at the features section in their web page, you'll see they use page-level multi-versioning. So they are avoiding the need for storing snapshot information for each tuple, I think that has to be kept in mind. I'd really like that PostgreSQL could gain some features ala Column Based databases, so the administrator could choose how he wants to use the database, but I don't think we'll be able to compete with them if they store snapshot informatin per page, and we're storing it per tuple, for example. So any step in this directoy will probably mean understanding the decisions they've made in their architectures. [1] http://www.luciddb.org/ -- Albert Cervera i Areny http://www.NaN-tic.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] ABIs are hard
* Gregory Stark ([EMAIL PROTECTED]) wrote: Hm, I suppose this is expected. I always hated the idea that libraries could introduce new symbols without an soname bump but obviously nobody's going to be interested in an soname bump today... ! /home/stark/src/local-HEAD/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/psql: symbol lookup error: /home/stark/src/local-HEAD/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/psql: undefined symbol: pg_valid_server_encoding_id Just to clarify... You compiled a new program which uses the new symbols against new headers (otherwise they would have complained during compilation) and then tried to use that new program with an old library. If so, then yes, that's expected to break. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Including Snapshot Info with Indexes
On 10/14/07, Hannu Krosing [EMAIL PROTECTED] wrote: Ühel kenal päeval, L, 2007-10-13 kell 17:44, kirjutas Gokulakannan Somasundaram: Hi, I went through this article and it was good. Please have a look at it. http://www.databasecolumn.com/2007/09/one-size-fits-all.html This article was written by Michael Stonebraker, considered to be the founder of our database. He has mentioned that the DBMS designed in 1970s haven't changed according to the change that has happened in Hardware landscape. What has happened in reality, is that the speed difference between CPU, RAM and disk speeds has _increased_ tremendously, which makes it even more important to _decrease_ the size of stored data if you want good performance The Vertica database(Monet is a open source version with the same principle) makes use of the very same principle. Use more disk space, since they are less costly and optimize the data warehousing. MonetDB is not about using more disk to get better performance, but about reducing the need to read unused data and increasing the speed by that. There is also a MonetDB/X100 project, which tries to make MonetOD order(s) of magnitude faster by doing in-page compression in order to get even more performance, see: http://homepages.cwi.nl/~boncz/x100.html http://www.cwi.nl/themes/ins1/publications/docs/ZuBoNeHe:DEBULL:05.pdf What i meant there was, it has duplicated storage of certain columns of the table. A table with more than one projection always needs more space, than a table with just one projection. By doing this they are reducing the number of disk operations. If they are duplicating columns of data to avoid reading un-necessary information, we are duplicating the snapshot information to avoid going to the table. Even otherwise we are recommending Indexes with snapshot as an option. We are not replacing the current index scheme. So if someone feels that his database should run on lesser disk space, let them create the normal index. If he feels he can afford to have more redundant disk space, then he can create indexes with snapshots. We are reducing random I/Os at the cost of extra disk space. So definitely that's a good. But tech folks like us can better decide on something based on experiments, as Tom has pointed out. So let's see whether Indexes with snapshot is worth the trade-off in space. Agreed. And more one more good news for people, who are following this thread. It seems like we won't be having a hit on update performance, if the indexes are not updated. BTStack remains the same for the old and new tuples, if the index tuple is not updated. But i don't know whether i would be able to put that tuning(re-using BTSTack) in the first patch So Indexes with snapshots will be degrading the performance only for deletes and only those updates, which are updating the index tuple. I think delete overhead can be ruled out for those who will be working with partitions, since they usually drop the partitions. Thanks, Gokul. Hannu
Re: [HACKERS] Including Snapshot Info with Indexes
Gokulakannan Somasundaram [EMAIL PROTECTED] writes: So Indexes with snapshots will be degrading the performance only for deletes and only those updates, which are updating the index tuple. Deletes never update indexes in Postgres. Increasing the size of the index would affect vacuum, inserts, and index accesses. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Including Snapshot Info with Indexes
On 10/14/07, Gokulakannan Somasundaram [EMAIL PROTECTED] wrote: http://www.databasecolumn.com/2007/09/one-size-fits-all.html The Vertica database(Monet is a open source version with the same principle) makes use of the very same principle. Use more disk space, since they are less costly and optimize the data warehousing. What i meant there was, it has duplicated storage of certain columns of the table. A table with more than one projection always needs more space, than a table with just one projection. By doing this they are reducing the number of disk operations. If they are duplicating columns of data to avoid reading un-necessary information, we are duplicating the snapshot information to avoid going to the table. Was this about Vertica or MonetDB? I saw that article a while ago, and I didn't see anything that suggested Vertica duplicated data, just that it organized it differently on disk. What are you seeing as being duplicated? (This is orthogonal to the current thread; I'm just curious.) ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Difference between materialized view and table
Hello all, As of now, PGSQL does not support materialized views. Just wanted to confirm the difference between the materialized view and table. Is there some operation which can be performed only on a table and not on a materialized view or vice-versa? Please let me know. Thanks in advance, Best regards, Sayali == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == Happiness is nothing more than a good health and a bad memory == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == ~~~ == - For ideas on reducing your carbon footprint visit Yahoo! For Good this month.
[HACKERS] Back-patch support for python 2.5?
I noticed that two of the buildfarm members are failing the 8.1 branch because they're trying to build plpython with python 2.5. To get that to work, I think we'd need to back-patch these two fixes: http://archives.postgresql.org/pgsql-committers/2006-11/msg00165.php http://archives.postgresql.org/pgsql-committers/2007-04/msg00043.php Is that worth doing? Should we go further back than 8.1? regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Including Snapshot Info with Indexes
On 10/14/07, Gregory Stark [EMAIL PROTECTED] wrote: Gokulakannan Somasundaram [EMAIL PROTECTED] writes: So Indexes with snapshots will be degrading the performance only for deletes and only those updates, which are updating the index tuple. Deletes never update indexes in Postgres. Increasing the size of the index would affect vacuum, inserts, and index accesses. In the new proposal, deletes are going to update indexes. So its a trade-off between selects and deletes, since selects may not need to goto the table for checking visibility. You may go through this thread, to get more details. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Including Snapshot Info with Indexes
On 10/14/07, Trevor Talbot [EMAIL PROTECTED] wrote: On 10/14/07, Gokulakannan Somasundaram [EMAIL PROTECTED] wrote: http://www.databasecolumn.com/2007/09/one-size-fits-all.html The Vertica database(Monet is a open source version with the same principle) makes use of the very same principle. Use more disk space, since they are less costly and optimize the data warehousing. What i meant there was, it has duplicated storage of certain columns of the table. A table with more than one projection always needs more space, than a table with just one projection. By doing this they are reducing the number of disk operations. If they are duplicating columns of data to avoid reading un-necessary information, we are duplicating the snapshot information to avoid going to the table. Was this about Vertica or MonetDB? I saw that article a while ago, and I didn't see anything that suggested Vertica duplicated data, just that it organized it differently on disk. What are you seeing as being duplicated? Hi Trevor, This is a good paper to read about the basics of Column-oriented databases. http://db.lcs.mit.edu/projects/cstore/vldb.pdf If you goto the Section 2 - Data Model. He has shown the data model, with a sample EMP table. The example shows that EMP table contains four columns - Name, Age, Dept, Salary From this table, projections are being formed - (In the paper, they have shown the creation of four projections for Example 1) EMP1 (name, age) EMP2 (dept, age, DEPT.floor) EMP3 (name, salary) DEPT1(dname, floor) As you can see, the same column information gets duplicated in different projections. The advantage is that if a query is around name and age, it need not skim around other details. But the storage requirements go high, since there is redundancy. As you may know, if you increase data redundancy, it will help selects at the cost of inserts, updates and deletes. This is what i was trying to say. Thanks, Gokul.
Re: [HACKERS] Back-patch support for python 2.5?
Tom Lane wrote: I noticed that two of the buildfarm members are failing the 8.1 branch because they're trying to build plpython with python 2.5. To get that to work, I think we'd need to back-patch these two fixes: http://archives.postgresql.org/pgsql-committers/2006-11/msg00165.php http://archives.postgresql.org/pgsql-committers/2007-04/msg00043.php Is that worth doing? Should we go further back than 8.1? I find these patches a bit excessive for backporting, especially 10 minor releases into the branch. Those who need Python 2.5 can use PostgreSQL 8.2. I consider that to be a new feature. The fact that the buildfarm exercises other combinations isn't evidence of a practical need. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Back-patch support for python 2.5?
Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: I noticed that two of the buildfarm members are failing the 8.1 branch because they're trying to build plpython with python 2.5. To get that to work, I think we'd need to back-patch these two fixes: http://archives.postgresql.org/pgsql-committers/2006-11/msg00165.php http://archives.postgresql.org/pgsql-committers/2007-04/msg00043.php Is that worth doing? Should we go further back than 8.1? Those who need Python 2.5 can use PostgreSQL 8.2. Not being a Python user, I don't really care. I notice though that the second of the above-mentioned patches is only in HEAD, which is why those same machines are giving scary-looking warnings in the 8.2 branch, eg wasp on 8.2: ccache gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -g -fpic -I. -I/usr/include/python2.5 -I../../../src/include -D_GNU_SOURCE -I/usr/include/et -c -o plpython.o plpython.c plpython.c:1996: warning: initialization from incompatible pointer type plpython.c:1998: warning: 'intargfunc' is deprecated plpython.c:1998: warning: initialization from incompatible pointer type plpython.c:1999: warning: 'intargfunc' is deprecated plpython.c:1999: warning: initialization from incompatible pointer type plpython.c:2000: warning: 'intintargfunc' is deprecated plpython.c:2000: warning: initialization from incompatible pointer type plpython.c:2001: warning: initialization from incompatible pointer type plpython.c:2002: warning: initialization from incompatible pointer type Seems we should at least back-patch Marko's fixes into 8.2. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] rolcanlogin vs. the flat password file
There's a gripe over here http://archives.postgresql.org/pgsql-general/2007-10/msg00640.php to the effect that PG should not give a message like password authentication failure when the user is attempting to log in as a NOLOGIN role. This surprised me because there is a specific message for that, and it worked when I tried it: regression=# create user foo nologin; CREATE ROLE regression=# \c - foo FATAL: role foo is not permitted to log in Previous connection kept regression=# On investigation though, it turns out that it depends on which auth mode you're using: some of the auth modes look up the user in the flat password file, and some don't. Now flatfiles.c makes a point of not entering roles into the flat password file if they are not rolcanlogin, which means that for password auth you are guaranteed to fail long before you can get to the explicit check in InitializeSessionUserId. We could certainly change flatfiles.c to disregard rolcanlogin, which'd actually make the code simpler. However, that in itself wouldn't change the behavior, unless you were to assign a password to the NOLOGIN role which seems a fairly strange thing to do. I think what the OP wishes is that not permitted to log in would be checked before checking password validity, and to do that we'd have to add rolcanlogin to the flat password file and put the check somewhere upstream of the authentication process. I am not entirely convinced whether we should do anything about this: the general theory on authentication failures is that you don't say much about exactly why it failed, so as to not give a brute-force attacker any info about whether he gave a valid userid or not. So there's an argument to be made that the current behavior is what we want. But I'm pretty sure that it wasn't intentionally designed to act this way. Comments? regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] rolcanlogin vs. the flat password file
On Oct 14, 2007, at 14:34 , Tom Lane wrote: I am not entirely convinced whether we should do anything about this: the general theory on authentication failures is that you don't say much about exactly why it failed, so as to not give a brute-force attacker any info about whether he gave a valid userid or not. So there's an argument to be made that the current behavior is what we want. But I'm pretty sure that it wasn't intentionally designed to act this way. Would there be a difference in how this is logged and how it's reported to the user? I can see where an admin (having access to logs) would want to have additional information such as whether a role login has failed due to not having login privileges or whether the failure was due to an incorrect role/password pair. I lean towards less information back to the user as to the nature of the failure. If the general consensus is to leave the current behavior, a comment should probably be included to note that the behavior is intentional. Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] rolcanlogin vs. the flat password file
* Tom Lane ([EMAIL PROTECTED]) wrote: We could certainly change flatfiles.c to disregard rolcanlogin, which'd actually make the code simpler. However, that in itself wouldn't change the behavior, unless you were to assign a password to the NOLOGIN role which seems a fairly strange thing to do. I think what the OP wishes is that not permitted to log in would be checked before checking password validity, and to do that we'd have to add rolcanlogin to the flat password file and put the check somewhere upstream of the authentication process. I wonder if the OP was unhappy because he created a role w/ a pw and then couldn't figure out why the user couldn't log in? I've run into that in the past and it takes some leg-work to figure out what's going on. A warning on a 'create role' or 'alter role' command which sets a password when 'rolcanlogin' is false might be an alternative way to 'fix' this. In general, I would say that it's correct to say 'invalid authentication'/'bad pw' until the user is authenticated and then say 'not permitted to log in' if they're not authorized (don't have rolcanlogin), which is I think what we do. That combined with the warning above would, I think, cover most of problem cases. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] rolcanlogin vs. the flat password file
Michael Glaesemann [EMAIL PROTECTED] writes: Would there be a difference in how this is logged and how it's reported to the user? Not without making all the same infrastructure changes that would be needed to tell the user something different than now. As things stand, the password auth code can't tell the difference between a nonexistent role and a nologin role; neither one has an entry in the flat file. If we dropped the filtering in flatfiles.c, then a nologin role would have an entry, but most likely without a password, so you'd still just see password auth failed. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] rolcanlogin vs. the flat password file
Stephen Frost [EMAIL PROTECTED] writes: * Tom Lane ([EMAIL PROTECTED]) wrote: ... I think what the OP wishes is that not permitted to log in would be checked before checking password validity, and to do that we'd have to add rolcanlogin to the flat password file and put the check somewhere upstream of the authentication process. I wonder if the OP was unhappy because he created a role w/ a pw and then couldn't figure out why the user couldn't log in? Hm, maybe. In that case just not filtering the entry out of the flat file would be good enough. In hindsight I'm not sure why we indulged in that bit of complication anyway --- it seems unlikely that an installation would have so many nologin roles, compared to regular ones, that the increase in size of the flat file would be objectionable. In general, I would say that it's correct to say 'invalid authentication'/'bad pw' until the user is authenticated and then say 'not permitted to log in' if they're not authorized (don't have rolcanlogin), which is I think what we do. That *would* be the behavior if we removed the filtering. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] rolcanlogin vs. the flat password file
I wrote: Stephen Frost [EMAIL PROTECTED] writes: I wonder if the OP was unhappy because he created a role w/ a pw and then couldn't figure out why the user couldn't log in? Hm, maybe. In that case just not filtering the entry out of the flat file would be good enough. I've confirmed the confusing behavior in CVS HEAD. With password auth selected in pg_hba.conf: postgres=# create user foo nologin; CREATE ROLE postgres=# \c - foo Password for user foo: FATAL: password authentication failed for user foo Previous connection kept postgres=# alter user foo password 'foo'; ALTER ROLE postgres=# \c - foo Password for user foo: correct password entered here FATAL: password authentication failed for user foo Previous connection kept With the attached patch to not drop nologin roles from the flat password file, it acts more sanely: postgres=# create user foo nologin; CREATE ROLE postgres=# \c - foo Password for user foo: FATAL: password authentication failed for user foo Previous connection kept postgres=# alter user foo password 'foo'; ALTER ROLE postgres=# \c - foo Password for user foo: correct password entered here FATAL: role foo is not permitted to log in Previous connection kept Should we just do this, or is it worth working harder? regards, tom lane *** src/backend/utils/init/flatfiles.c.orig Wed Aug 1 18:45:08 2007 --- src/backend/utils/init/flatfiles.c Sun Oct 14 17:14:27 2007 *** *** 298,304 * * The format for the flat auth file is *rolename password validuntil memberof memberof ... - * Only roles that are marked rolcanlogin are entered into the auth file. * Each role's line lists all the roles (groups) of which it is directly * or indirectly a member, except for itself. * --- 298,303 *** *** 312,318 typedef struct { Oid roleid; - boolrolcanlogin; char *rolname; char *rolpassword; char *rolvaliduntil; --- 311,316 *** *** 407,414 tempname))); /* !* Read pg_authid and fill temporary data structures. Note we must read !* all roles, even those without rolcanlogin. */ totalblocks = RelationGetNumberOfBlocks(rel_authid); totalblocks = totalblocks ? totalblocks : 1; --- 405,411 tempname))); /* !* Read pg_authid and fill temporary data structures. */ totalblocks = RelationGetNumberOfBlocks(rel_authid); totalblocks = totalblocks ? totalblocks : 1; *** *** 433,439 } auth_info[curr_role].roleid = HeapTupleGetOid(tuple); - auth_info[curr_role].rolcanlogin = aform-rolcanlogin; auth_info[curr_role].rolname = pstrdup(NameStr(aform-rolname)); auth_info[curr_role].member_of = NIL; --- 430,435 *** *** 565,574 List *roles_names_list = NIL; ListCell *mem; - /* We can skip this for non-login roles */ - if (!auth_info[curr_role].rolcanlogin) - continue; - /* * This search algorithm is the same as in is_member_of_role; we * are just working with a different input data structure. --- 561,566 *** *** 642,650 for (curr_role = 0; curr_role total_roles; curr_role++) { auth_entry *arole = auth_info[curr_role]; - - if (arole-rolcanlogin) - { ListCell *mem; fputs_quote(arole-rolname, fp); --- 634,639 *** *** 660,666 } fputs(\n, fp); - } } if (FreeFile(fp)) --- 649,654 ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] rolcanlogin vs. the flat password file
Tom Lane wrote: Should we just do this, or is it worth working harder? Not worth more, IMNSHO. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Back-patch support for python 2.5?
Tom Lane wrote: I notice though that the second of the above-mentioned patches is only in HEAD, which is why those same machines are giving scary-looking warnings in the 8.2 branch, eg wasp on 8.2: The CVS log message for that says Allow PL/PythonU to compile on Python 2.5., but that is evidently not true. Is this just a warning cleanup, or is there perhaps a 32 vs 64 bit issue that is being fixed there? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] rolcanlogin vs. the flat password file
* Tom Lane ([EMAIL PROTECTED]) wrote: Stephen Frost [EMAIL PROTECTED] writes: I wonder if the OP was unhappy because he created a role w/ a pw and then couldn't figure out why the user couldn't log in? Hm, maybe. In that case just not filtering the entry out of the flat file would be good enough. I've confirmed the confusing behavior in CVS HEAD. With password auth selected in pg_hba.conf: [...] Should we just do this, or is it worth working harder? I certainly like this. Honestly, I'd also like the warning when doing a 'create role'/'alter role' that sets/changes the pw on an account that doesn't have 'rolcanlogin'. Much better to have me notice that I goof'd the command and fix it before telling the user 'go ahead and log in' than to have the user complain that it's not working. :) Just my 2c. Thanks, Stephen signature.asc Description: Digital signature