Re: [habari-dev] Re: - SCHEMA.SQL - CREATE SEQUENCE Improvements

cga Wed, 22 Jun 2011 15:17:55 -0700

Hi Rick,

> How would this change affect people upgrading their databases? Would
> the tables need to be totally rebuilt?


ANSWER: 

Right now I already have a compromise that will keep existing tables intact. 
Alas, using old style by adding the missing part to SQL w/ upgrade/2265.sql:

> ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

to all sequences in the upgrade. This won't affect the data, only add the 
missing part to the original schema.

IDEAS AND TESTS:

>From here on are things I want to try:

While for new db it's not to going be a problem and would make a good 
schema.sql , I really want to understand how to migrate to the proposed schema 
also for those who have an existing db. I already have the raw idea and I'm 
already planning some steps and tests to "gradually" move to "new style" and 
keep the old db/tables intact.

Raw Idea: (implies that we used the above mentioned compromise already).

When we issue ALTER SEQUENCE, it won't affect the sequence and will allow us to 
use "pg_get_serial_sequence()" to find the name of the sequence associated with 
a given column.

Then with the use of currval() function, which returns the most recent value 
generated by a sequence for the current session, combined with 
"pg_get_serial_sequence()" that we are now able to use:

> SELECT currval(pg_get_serial_sequence('tablename', 'colname'));

we retrieve the most recent value generated by that specific sequence.

This value (for each sequence) could be stored in a variable_seq# or tmp table 
(either for each sequence) and set as "START WITH start" for each table that 
is to be recreated at a certain given point in the SEQUENCE.

My educated guess it's that this (and perhaps some more code in upgrade and 
perhaps php?) will allow us to extract the exact point of a given sequence at 
the moment of upgrade and migrate the existing to new schema too.

Basically the step could be:

0 - backup. backup. backup. and do not put anything more into db.

1 - ALTER SEQUENCE .. OWNED BY ... ;  

2 - SELECT currval(pg_get_serial_sequence('tablename', 'colname')); and store 
that in a variable or tmp_table (either one of those, but one for each 
sequence); 

3 - dump the db, drop the db.

4 - apply new schema w/ use of SERIAL ; 

5  - import the previusly dump for old db w/ use of tmp_table/var and the 
"START WITH var#" parameter.

6 - Check the resutls, prolly Frankenstein..... by Mel Brooks.

But this will require a test env, which I'm going to install later or at max 
tomorrow. 


On Wednesday 22 June 2011 22:05:30 rick c wrote:
> cga,
> 
> From what I see reading the docs you referenced, this it seems to a
> reasonable change. It decreases the verbosity of the sql, while at the
> same time ensuring the sequence generated is private to the table,
> which, if I understand correctly, isn't the case with Habari's current
> sql.
> 
> How would this change affect people upgrading their databases? Would
> the tables need to be totally rebuilt?
> 
> Rick
> 
> On Jun 22, 3:13 pm, cga <[email protected]> wrote:
> > Hi all,
> > 
> > <PREMISE>
> > you might forgive me if I don't use Trac, but I'd like to explain and
> > discuss what I'm doing with PGSQL schema.sql for Habari before
> > submitting (if my contrib will be accepted code wise) anything.
> > 
> > Not to mention that in these very days things are getting moved on
> > Github. If so, I would clone and request for pull once there, since I
> > have an active account on Github and it seems pointless to get a Trac
> > account now.
> > 
> > BUT: since this very patch isn't going (at least theoretically) to have
> > any bad effect on Habari , I'm attaching it to this very email for now.
> > For you to test. Forgive me on this last thing but I haven't set up a
> > test env just yet. I'll do it after sending this patch.
> > 
> > <little INTRO and WHY>
> > I'm cga. I'm a Linux SysAdmin and I'm all for "best practices" as much as
> > I can. Not to mention I'm really paranoid and analytical when it
> > concerns coding/installing/configuring/deploying. Especially for things
> > I care/use/ administer.
> > 
> > I basically started this little contribution because I really love my
> > stack: debian + postgresql + nginx and I'd like to use it for all my
> > needs. Habari included.
> > Ref:http://drunkenmonkey.org/irc/habari/2011-06-19#T17-04-24 I think
> > it'sfair enough for a small intro.
> > 
> > <CONTRIB AND REASONING>
> > As you can read in the chat log I linked above , I'm not expert in
> > neither PostgreSQL nor SQL but I like to contribute for as much as I
> > can. Especially if this can be the only way to keep Habari's support for
> > PostgreSQL alive.
> > 
> > This first little but important contribution it is all about the "CREATE
> > SEQUENCE" used in PGSQL actual schema.sql.
> > 
> > All the reasoning I do it's to be documented w/ links to official docs
> > and (attacched) chat logs from freenode.#postgresql , where (beside docs
> > and wiki) I learn and ask my stuff. And to my great pleasure I always
> > find very competent and very helpful people.
> > 
> > Please forgive my English errors, I'm Italian. Please *really* forgive
> > my verbosity but I'm sure you'd understand that people who didn't follow
> > in freenode.#habari don't know all the reasoning from the room. If I'm
> > to be accepted I promise I'll be as concise as possible in next
> > communications.
> > 
> > Here we go:
> > 
> > I noticed that you use a lot of serialization in schema.sql , w/o
> > questioning why (which I'm not in a position to, anyway) I noticed that
> > you (being whoever did that schema.sql) do it wrong. Why?
> > 
> > Because: CREATE SEQUENCE done in the present form from schema.sql:
> > >CREATE SEQUENCE {$prefix}posts_pkey_seq;
> > >CREATE TABLE {$prefix}posts (
> > >
> > >  id BIGINT NOT NULL DEFAULT nextval('{$prefix}posts_pkey_seq'),
> > >
> > >.....
> > >.....
> > >
> > >  PRIMARY KEY (id),
> > >
> > >.....
> > >);
> > 
> > 1 - It's *almost* correct, but no cigar. It basically miss the most
> > crucial part of the query to make the SEQUENCE itself have any sense
> > *and* work
> > 
> > properly:
> > > ALTER SEQUENCE  {$prefix}posts_pkey_seq OWNED BY{$prefix}posts.id;
> > 
> > w/o this ALTER TABLE the SEQUENCE is to behave weird more sooner than
> > later, resulting in duplicate numbers (see chat log; using BIGINT won't
> > save you from this, more on this later) and errors that are more than
> > diffcult to find when they happen.
> > 
> > 2 - it determines in a recursive way the name of the SEUQNECE itself ,
> > which is wrong because you cannot be 100% sure that will actually work
> > as intended. (see chat log).
> > 
> > 3 - it uses BIGINT. My reasoning is that whoever created this schema.sql
> > didn't wan't to finish the SEQUENCE too soon. Good idea , bad
> > implementation. I strongly believe that BIGINT is not needed indeed, and
> > that it's just a waste of cylces and space. Not to mention that BIGINT
> > behaves differently on 32bit and 64bit, causing slowness (and perhaps
> > some more side effects) on 32bit. (this is chat logs from the office, if
> > you need them I'll attach them tomorrow). And as explained just above,
> > it is not going to be used as a BIGINT anyway because of the wrong query
> > CREATE SEQUENCE. But the main point why it is not needed it's because
> > who's to write a post every second for # quintillions years? I think
> > that a post every second for 68yrs is more than enough (INTEGER/SERIAL).
> > (see chat log too).
> > 
> > 4 - when all the above are to be combined w/ the use of NOT NULL DEFAULT
> > nextval()" (and I mean all of them are compulsory clauses, see chat log),
> > you can save a lot of hassles, code and cycles with just the improvement
> > I'm submitting for every CREATE SEQUENCE query, leaving intact the final
> > result for what it concerns Habari's functioning *and* improves things a
> > whole lotta
> > 
> > love:
> > >CREATE TABLE {$prefix}posts (
> > >
> > >  id SERIAL,
> > >
> > >.....
> > >.....
> > >
> > >  PRIMARY KEY (id),
> > >
> > >.....
> > >);
> > 
> > <IMPROVEMENTS ACHIEVED>
> > Improvements and optimizations achieved with this little patch: (see Ref.
> > [2])
> > 
> > 0 - it will leave intact the need of Habari to use the numerical field
> > for internal functioning.
> > 1 - (id) will be unique and auto incremental by properties inherited from
> > query.
> > 2 - You won't skip a number in a SEQUENCE.
> > 3 - It won't fail because of wrong creation query.
> > 4 - It won't determine an unexpected SEQUENCE name/relation to the column
> > you want it to.
> > 5 - OWNED BY is to be used implicitly , this will drop anything related
> > to the elected talbe and/or column. This sounds harsh, but what will you
> > do w/ data that isn't associated w/ anything anymore?
> > 
> > <FINAL THOUGHTS>
> > 1 - For the impact on Habari's code (which I didn't read because I don't
> > do php and couldn't afford to get it wrong) on insert the next value of
> > the sequence into the serial column please read Ref. [2] just below the
> > PostgreSQL 7.3 Note.
> > 
> > 2 - This new schema.sql would work (in theory) for new installations but
> > I don't know how to implement and test a
> > system/schema/pgsql/upgrades/post/*.sql yet. Help and advice from you is
> > very much apprecciated.
> > 
> > 3 - Let me know if anyone is going to test this before I do. If you want.
> > Anyway I'm going to test it when my test env will be up and running.
> > 
> > 4 - I'm already planning the next batch of fixes but that can wait for
> > now. I anticipate that it will be about optimizing queries and about
> > keyvil issues as much as possible and w/o breaking Habari Ref. [3].
> > 
> > My pleasure to meet you.
> > 
> > <REFERENCES>
> > Ref. chat log attacched.
> > 
> > Ref [1].http://www.postgresql.org/docs/8.4/interactive/sql-
> > createsequence.html
> > 
> > Ref [2].http://www.postgresql.org/docs/8.4/interactive/datatype-
> > numeric.html#DATATYPE-SERIAL
> > 
> > Ref.
> > [3]http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327h
> > ttp://it.toolbox.com/blogs/database-soup/primary-keyvil-part-ii-7345http:
> > //it.toolbox.com/blogs/database-soup/primary-keyvil-part-iii-7365
> > 
> > --
> > knowledge has no owner, only means to reveal itself
> > 
> >  CREATE_SEQUENCE_#postgresql.log
> > 34KViewDownload
> > 
> >  postgresql.1stbatch.optimize.sequences.sql
> > 6KViewDownload
> > 
> >  pgsql.1stbatch.sql_schema.sql.diff
> > 3KViewDownload
> > 
> >  signature.asc
> > < 1KViewDownload

-- 
knowledge has no owner, only means to reveal itself

-- 
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at http://groups.google.com/group/habari-dev

Re: [habari-dev] Re: - SCHEMA.SQL - CREATE SEQUENCE Improvements

Reply via email to