Greetings hackers!

I've renamed the project fsnapshot to pov, PostgreSQL Object Version
control system. :)

I've also created a quite nifty SQL command line based utility to
perform graph/tree sorting algorithms on data in the database, without
the need to export the data to some external application.

The utility is named tsort and behaves exactly like the GNU coreutils
tool tsort (or the Perl Power Tools tool tcsort).

Since tree sorting is a quite common task in computer science, perhaps
some of you will find it useful.

Source code: 
https://github.com/gluefinance/pov/blob/master/sql/schema/pov/functions/tsort.pl

Also, if I have understood everything correctly, the toposort
algorithm in pg_dump_sort.c does not does its best ensuring the
objects are created in the same order, if the oids would change or if
objects would be dropped/added.

The algorithm is probably a simple DFS (depth-first sorting) or BFS
(breadth-first sorting) without any effort made to do sorting when
selecting the successor objects.

It's probably a quite challenging task to implement the extra
necessary sorting in C, on top of the standard DFS or BFS algorithm.

I put together a small quite nifty plperl utility to do the task, as a
simple proof-of-concept and also because I thought it makes sense to
provide such a method accessible from the SQL prompt,
since tree/graph sorting is probably one of the most commonly
performed tasks in many applications dealing.

It is quite powerful with all the basic tree sorting features I could think of.

I hope you find it useful and perhaps even a bit fun :)

Here is a small example on how to use it to analyze pg_depend:

create table a ( id int not null, primary key (id));
create table aa ( id int not null, primary key (id), foreign key (id)
references a(id));
create table ab ( id int not null, primary key (id), foreign key (id)
references a(id));
create table aaa ( id int not null, primary key (id), foreign key (id)
references aa(id));
create table aab ( id int not null, primary key (id), foreign key (id)
references aa(id));
create table aba ( id int not null, primary key (id), foreign key (id)
references ab(id));
create table abb ( id int not null, primary key (id), foreign key (id)
references ab(id));

Instead of using oid, one should use unique names for each object
(composed differently based on the object type).

If doing so, the example below would render exactly the same result
even in two databases with completely different oids for the same
schema.

glue=# select oid,relname from pg_class where relname ~ '^a.?.?$';
  oid  | relname
-------+---------
 52354 | a
 52359 | aa
 52399 | aba
 52369 | ab
 52389 | aab
 52379 | aaa
 52409 | abb
(7 rows)

Find root objects (source objects, no predecessors), sort numerically:
glue=# SELECT tsort(array_to_string(array_agg(distinct objid || ',' ||
refobjid), ','), ',', 0, 'sub {$a <=> $b}', 'SOURCE') FROM pg_depend;



tsort



--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------
 
{0,10939,10940,10942,10943,10945,10946,10948,10949,10951,10952,10955,10956,10958,10959,10962,10963,10966,10967,10970,10971,10973,10974,10976,10977,10980,10981,10983,10984,10985,10986,10988,10989,10991,10992,10994,10995,10998,10999,11001,11002,11004,11005,11008,11009,11011,11012,11014,11015,11018,11019,11021,11022,11024,11025,11028,11029,11031,11032,11034,1103
5,11037,11038,11040,11041,11043,11044,11046,11047,11049,11050,11053,11054,11056,11057,11074,11076,11078,11080,11082,11084,11086,11088,11090,11092,11094,11096,11098,11100,11102,11104,11106,11108,11110,11112,11114,11116,11118,11120,11122,11124,11126,11128,11130,11132,11134,11136,11138,11140,11142,11144,11146,11148,11150,11152,11154,11156,11158,11160,11162,11164,
11166,11168,11170,11172,11174,11176,11178,11180,11182,11184,11186,11188,11190,11192,11193,11194,11195,11196,11197,11198,11199,11200,11201,11202,11203,11204,11205,11206,11207,11208,11209,11210,11211,11212,11214,11216,11218,11220,11222,11224,11226,11228,11230,11232,11234,11236,11238,11240,11241,11242,11243,11244,11245,11246,11247,11248,11249,11250,11251,11252,11
253,11254,11255,11256,11257,11258,11259,11260,11261,11262,11263,11264,11266,11268,11270,11272,11274,11276,11278,11280,11282,11284,11286,11288,11290,11292,11297,11299,11301,11303,11305,11307,11309,11311,11313,11315,11317,11319,11321,11323,11325,11340,11344,11345,11348,11349,11352,11353,11355,11356,11359,11360,11363,11364,11367,11368,11371,11372,11375,11376,1137
9,11380,11383,11384,11387,11388,11391,11392,11395,11396,11398,11399,11402,11403,11405,11406,11409,11410,11413,11414,11417,11418,11421,11422,11425,11426,11429,11430,11433,11434,11437,11438,11441,11442,11444,11445,11448,11450,11451,11453,11455,11456,11458,11460,11461,11463,11465,11466,11468,11470,11471,11473,11475,11476,11478,11480,11481,11483,11484,11487,11488,
11491,11492,11495,11496,11498,11499,11502,11503,11506,11507,11510,11511,11514,11515,11518,11519,11522,11523,11526,11527,11530,11531,11533,11534,11536,11537,11539,11540,11542,11543,11545,11546,11548,11549,11551,11552,11555,11556,52267,52342,52344,52345,52346,52347,52348,52349,52350,52351,52355,52360,52365,52366,52367,52368,52370,52375,52376,52377,52378,52380,52
382,52385,52386,52387,52388,52390,52392,52395,52396,52397,52398,52400,52402,52405,52406,52407,52408,52410,52412,52415,52416,52417,52418}
(1 row)


Find connected objects to the tables we created above and sort like
pg_dump_sort.c (DFS/BFS):
glue=# SELECT tsort(array_to_string(array_agg(distinct objid || ',' ||
refobjid), 
','),',',0,'DFS','CONN_INCL','INCLUDE','52354,52359,52399,52369,52389,52379,52409','BOTH')
FROM pg_depend;


                                                      tsort

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------
 
{52398,52405,52386,52390,52391,52400,52401,52387,52380,52381,52406,52375,52396,52418,52407,52417,52397,52410,52411,52408,52404,52385,52395,52394,52365,52415,52378,52355,52356,52376,52402,52403,52399,52416,52414,52372,52373,52377,52374,52366,52370,52371,52369,52412,52413,52409,52392,52393,52389,52360,52361,52388,52384,52362,52363,52367,52382,52383,52379,52368,
52364,52359,52357,52358,52354,2200}
(1 row)


Find connected objects to the tables we created above and sort each
lexically ascending, preserving the order:
glue=# SELECT tsort(array_to_string(array_agg(distinct objid || ',' ||
refobjid), ','),',',0,'sub {$a cmp
$b}','CONN_INCL','INCLUDE','52354,52359,52399,52369,52389,52379,52409','BOTH')
FROM pg_depend;


                                                      tsort

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------
 
{52355,52356,52360,52361,52365,52366,52367,52368,52364,52370,52371,52375,52376,52377,52378,52374,52357,52358,52354,52380,52381,52382,52383,52385,52386,52387,52388,52384,52379,52390,52391,52392,52393,52395,52396,52397,52398,52394,52362,52363,52359,52389,52400,52401,52402,52403,52405,52406,52407,52408,52404,52399,52410,52411,52412,52413,52415,52416,52417,52418,
52414,52372,52373,52369,52409,2200}
(1 row)


Calling the utility with no arguments shows the help:

glue=# SELECT tsort();
                                      tsort
----------------------------------------------------------------------------------

 DESCRIPTION
 tsort - return a tree's nodes in topological order


 SYNOPSIS
 tsort(edges text, delimiter text, debug integer, algorithm text,
 selection text, operator text, nodes text, direction text);

 OUTPUT
     nodes       text[]  Array of nodes in topologically sorted order

 INPUT PARAMETERS
     Parameter   Type    Regex     Description
     
============================================================================
     edges       text    ^.+$        Node pairs, separated by [delimiter].

     delimiter   text    ^.*$        Node separator in [edges],
                                     default is ' ', i.e. single blank space.

     debug       integer             Print debug information using RAISE DEBUG:
                         0           no debug (default)
                         1           some debug
                         2           verbose debug

     algorithm   text                Sorting algorithm:
                         DFS         depth-first (default)
                                     explores as far as possible along each
                                     branch before backtracking.
                         BFS         breadth-first
                                     explores all the neighboring nodes,
                                     then for each of those nearest nodes,
                                     it explores their unexplored neighbor
                                     nodes, and so on.
                         ^sub        sort using perl subroutine.
                                     examples:
                                     # sort numerically ascending
                                     sub {$a <=> $b}

                                     # sort numerically descending
                                     sub {$b <=> $a}

                                     #sort lexically ascending
                                     sub {$a cmp $b}

                                     # sort lexically descending
                                     sub {$b cmp $a}

                                     # sort case-insensitively
                                     sub {uc($a) cmp uc($b)}

                                     For more examples, please goto:
                                     http://perldoc.perl.org/functions/sort.html

     The following options will not affect the order of the nodes in the result,
     they only control which nodes are included in the result:

     Parameter   Type    Regex     Description
     
============================================================================
     selection   text                Selection of nodes, used by [operator]:
                         ALL         select all nodes (default)
                         ISOLATED    select nodes without any
                                     successors nor predecessors
                         SOURCE      select nodes with successors
                                     but no predecessors
                         SINK        select nodes with predecessors
                                     but no successors
                         CONN_INCL   select nodes connected to [nodes],
                                     including [nodes]
                         CONN_EXCL   select nodes connected to [nodes],
                                     excluding [nodes]
                                     separated by [delimiter]

     operator    text                Include or exclude nodes in [selection]:
                         INCLUDE     include nodes (default)
                         EXCLUDE     exclude nodes

     The following options are only applicable if,
     [selection] is CONN_INCL or CONN_EXCL

     Parameter   Type    Regex     Description
     
============================================================================

     nodes       text                select nodes connected to [nodes]
                         NULL        not applicable (default)
                         [nodes]     the start nodes, separated by [delimiter]


     direction   text                direction to look for connected nodes
                         BOTH        traverse both successors and
                                     predecessors (default)
                         UP          only traverse predecessors
                         DOWN        only traverse successors








-- 
Best regards,

Joel Jacobson
Glue Finance

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to