ddlutils troubles and workarounds with svn, mysql, postgresql

Eric Auer Thu, 24 Jan 2008 05:16:19 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi everybody,

I have been doing some experiments with ddl-utils
recently, and found several tricky things... :-).

For most of them, I found workarounds, but of course
it would be even better to have other workarounds
embedded in ddl-utils themselves. Thanks in advance
and sorry for the long mail :-).

First, I tried to compile ddlutils from SVN. The
problem here is that I failed to compile only one
part of the toolkit. So after installing some of
the usual packages, the build process told me that
I would need forrest.apache.org - apparently to
create the documentation in HTML and PDF. Forrest
is a really big package... Next it told me that
it would depend on two DTD thingies, too, luckily
always mentioning where I can download them. Then
it complained that I did not have the DTD or similar
for some part of the data, which was when I gave up.

Next attempt was to download the 1.0 binary. It
turned out that the binary already contains three
things: The docs, the main jar, and most of the
supplemental jars. Very nice. Of course I only
found out after having downloaded the separate
doc download and jar download... ;-).

The binary zip seemed to be very easy to install,
but I had troubles with the classpath. In the end,
I just used the following:

ant -d -v -lib ddlutils/lib/ -f my-ant-tasks.xml ...

(I also copied the main ddlutils jar into .../lib/ )

Now finally I was able to dump the mysql database
that I had received as proprietary mysql dump file
and then loaded into a mysql that I have installed
especially for that purpose :-)

Next, I tried to import that file into my normal
postgresql database, again using ddlutils. This
failed, and I had to change 2 things to work
around the problem:

Somehow mysql uses 0000-00-00 00:00:00 and null as
synonyms for null for timestamps. However, you
cannot insert such invalid (month/day are 0) date
timestamps into postgresql databases. As workaround,
either of the following did work: Search and replace
all 0000-... timestamps by 1980-01-01 timestamps in
the XML file, issue SQL commands for each column of
type timestamp to update values which are null or
earlier than 1970-01-01 into 1980-... timestamps,
or search and replace all 0000-... timestamps into
null timestamps... I did not test the latter as it
would have been hard to explain to my text editor.

Second problem is that bugzilla uses indexes on
some varchar columns. Postgresql uses btree indexes
in a way where each value must be at most 8/3 kB.
This caused errors for rows with long values. The
solution was to comment out (in the XML file for
the database schema) the part which tells that an
index should be made for the longdescs.thetext
column. I can live without that index ;-).

As connectors, I used:
mysql-connector-java-5.0.8-bin.jar
postgresql-8.2-507.jdbc4.jar
... although I do not know if JDBC4 is of any
added value for DDL-UTILS :-)

Next, I tried to export some data back from the
postgresql database to XML files. It turned out
that ddl-utils failed to read the database schema
from the live database because of a missing schema
function. I was unable to get ddl-utils to proceed
even if I told it only to write the contents...

Luckily, others have has similar problems before:
ERROR: function information_schema._pg_keypositions() does not exist
is mentioned on
http://forum.hibernate.org/viewtopic.php?t=957909&view=next
as is some workaround: It might make a difference which
JDBC connector you use, but it always helps to:

CREATE FUNCTION information_schema._pg_keypositions() RETURNS SETOF int4 AS
$BODY$select g.s
from generate_series(1,current_setting('max_index_keys')::int,1)
as g(s)$BODY$
LANGUAGE 'sql' IMMUTABLE;
ALTER FUNCTION information_schema._pg_keypositions() OWNER TO postgres;

Note that you have to do this for the database in
question, not for the global information schema.
I think the latter only affects databases created
in the future. The problem only seems to occur in
postgresql 8.x ... An older postgres 7.4 of mine
already does have a function of this name, but in
a "flat" and less portable implementation:

myolddb=> \df+ information_schema._pg_keypositions

List of functions
Result data type | Schema | Name | Argument data
types| Owner | Language | Source code | Description
-
------------------+--------------------+------------------+---------------------+----------+----------+----------------------------------------------------------+-------------
setof integer | information_schema | _pg_keypositions || postgres | sql
| select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9 union all
select 10 union all select 11 union all select 12 union all
select 13 union all select 14 union all select 15 union all
select 16 union all select 17 union all select 18 union all
select 19 union all select 20 union all select 21 union all
select 22 union all select 23 union all select 24 union all
select 25 union all select 26 union all select 27 union all
select 28 union all select 29 union all select 30 union all
select 31 union all select 32 |
(1 row)

Now things worked quite okay, but I still could not
find any workaround for my last problem: The files
always contain ALL data. By using schemapatterns of
"public" or "p%" I can select between files with
and without the indexes as part of the schema (no
idea why it is "p%", just trial and error here),
but I cannot use any TABLE NAME PATTERN. Would be
great if ddl-utils could implement that filter!

I guess a sort of workaround would be to copy the
whole database to another inside postgresql and
then drop all tables that I do not want in the XML
files and then use ddl-utils to export the new DB
into XML files. Or use some good XML file editor
to remove the unneeded tables, but that will be a
problem because the data XML files can be 100s of
megabytes for some of the DB which I want to use
ddl-utils on... For the latter, it can be useful
to say ANT_OPTS=-Xmx1536M in the shell first ;-).

Last but not least, I was unable to dump some
databases as "mortal" user because for unknown
reason, pga_diagrams is only readable for the
superuser on one postgresql installation while
it is readily accessible even for low priv users
on other installations...? It is interesting how
much metadata is floating around in modern 8.x
versions of postgresql :-).

It would be cool if future ddl-utils could have
built-in workarounds for the 0000-... timestamp
issue and some of the other metadata / filtering
magic described above. Thanks :-). And of course
it would be good if ddl-utils were easier to
compile from SVN...

Eric

PS: troubles in the precompiled docs I downloaded
later: The PDF uses very dark blue compared to the
HTML and the UML diagrams are not shown in the PDF.
And there is a content error where the docs say
that some DB cannot do tinyint in -255..255 range
but only in (signed byte) range. Some other place
in the docs says that tinyint can be signed or
unsigned byte - it does not mention anything like
signed 9 bit value ;-). By the way, VARCHAR can be
up to 10*1024*1024 Unicode chars, not 254 ASCII,
at least in 8.x versions of PostgreSQL :-).

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFHmI+A99dkROyhRRsRAhUUAJ9vEjkjnYMR62vqBro517sksZH76ACeNQSV
fEcFf2xpWmmbODMzI2zsLq4=
=o8C+
-----END PGP SIGNATURE-----

ddlutils troubles and workarounds with svn, mysql, postgresql

Reply via email to