Hi, I'm trying to import some relatively huge (~300GiB) set of data from Microsoft SQL Server database to PostgreSQL. For this purpose, I use bcp.exe (bulk copy utility) comes with MSSQL. Despite there are command line arguments which you can specify batch size, escape characters, null fields, etc.; %90 of these arguments are unsuprisingly ignored by bcp.exe. The problem is, bcp.exe
- Uses nothing to specify null fields, which is equivalent to COPY ... WITH NULL AS '' in PostgreSQL. - Uses \x00 character to specify empty strings. (Sorry not any single part of this mess is configurable.) To successfully accept such an input during "COPY ... FROM ..." in PostgreSQL, I decided to write middleware triggers to tables including columns of type character varying. Below is the procedure I come up with CREATE OR REPLACE FUNCTION create_bcp_fix_triggers (_table_schema text) RETURNS void AS $$ DECLARE _table record; _column record; _create_stmt text; BEGIN SET standard_conforming_strings TO off; SET escape_string_warning TO off; -- Find every table having a column of type 'character varying'. FOR _table IN SELECT table_name FROM information_schema.columns WHERE table_schema = _table_schema AND data_type = 'character varying' GROUP BY table_name ORDER BY table_name LOOP _create_stmt = 'BEGIN;\n' || 'CREATE OR REPLACE\n' || 'FUNCTION ' || _table.table_name || '_bcp_fix ()\n' || 'RETURNS "trigger" AS $bcp-fix$\n' || 'BEGIN\n'; -- Create appropriate bcp.exe fix clauses for every found column. FOR _column IN SELECT column_name FROM information_schema.columns WHERE table_schema = _table_schema AND table_name = _table.table_name LOOP _create_stmt = _create_stmt || ' IF NEW.' || _column.column_name || ' = E''\0'' THEN\n' || ' NEW.' || _column.column_name || ' = ''''\n' || ' END IF;\n'; END LOOP; _create_stmt = _create_stmt || ' RETURN NEW;\n' || 'END;' || '$bcp-fix$ LANGUAGE plpgsql;\n' || 'CREATE TRIGGER ' || _table.table_name || '_bcp_fix\n' || ' BEFORE INSERT ON ' || _table.table_name || '\n' || ' FOR EACH ROW EXECUTE ' || ' PROCEDURE ' || _table.table_name || '_bcp_fix();\n' || 'COMMIT;'; EXECUTE _create_stmt; END LOOP; END; $$ LANGUAGE plpgsql; But executing this procedure fails for some reason I couldn't understand. > SELECT public.create_bcp_fix_triggers('commsrv'); ERROR: syntax error at or near "AS $" LINE 4: RETURNS "trigger" AS $bcp-fix$ ^ QUERY: BEGIN; CREATE OR REPLACE FUNCTION xyz_bcp_fix () RETURNS "trigger" AS $bcp-fix$ BEGIN IF NEW.foo = E' NEW.foo = '' END IF; IF NEW.bar = E' NEW.bar = '' END IF; RETURN NEW; END;$bcp-fix$ LANGUAGE plpgsql; CREATE TRIGGER xyz_bcp_fix BEFORE INSERT ON xyz FOR EACH ROW EXECUTE PROCEDURE xyz_bcp_fix(); COMMIT; CONTEXT: PL/pgSQL function "create_bcp_fix_triggers" line 41 at EXECUTE statement Can anybody help me to spot the problem? Regards. P.S. For data transfer during migratition, I first considered using "EMS Data Export for SQL Server" tool, but it lacks of performance. (~1000rows/sec) Namely, I'm open to any bcp.exe alternatives.