A tested example in Python:
It counts the tab characters and accumulates input lines until a record has enough delimiters, then prints the whole record on a single line.
Note: as an exercise, you could add a check that each record has exactly (columns-1) delimiters rather than >= (columns-1).
def grouplines(in_stream, columns, delimiter):
    # A complete record contains (columns - 1) delimiters.
    num_delimiters = columns - 1
    accum = ''
    for line in in_stream:
        accum += line
        if accum.count(delimiter) >= num_delimiters:
            # Escape the newlines so the record stays on one output line.
            print(accum.replace("\n", "\\n"))
            accum = ''
    if accum:
        print("Last line unterminated.")

grouplines(open('data.in'), 3, "\t")
Input data (I added a column to your example):
1	What a day!	A
2	What a week it has
been!	B
3	What the!	C
Output:
1	What a day!	A\n
2	What a week it has\nbeen!	B\n
3	What the!	C
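The exercise mentioned in the note above could look roughly like the sketch below. It is written for Python 3, and the name group_records and the choice to raise ValueError are my own assumptions, not part of the tested script. Like the original, it cannot detect a line break embedded in the last column, because the delimiter count is already complete at that point.

```python
import io


def group_records(in_stream, columns, delimiter="\t"):
    """Yield one escaped record per row of exactly `columns` fields."""
    expected = columns - 1
    accum = ""
    for line in in_stream:
        accum += line
        count = accum.count(delimiter)
        if count > expected:
            # Strict check: more delimiters than one record can hold.
            raise ValueError("too many delimiters in record: %r" % accum)
        if count == expected:
            # Same escaping as the original: every newline becomes \n.
            yield accum.replace("\n", "\\n")
            accum = ""
    if accum:
        raise ValueError("last record is incomplete: %r" % accum)


# Hypothetical sample data mirroring the three-column example above.
sample = "1\tWhat a day!\tA\n2\tWhat a week it has\nbeen!\tB\n3\tWhat the!\tC"
for record in group_records(io.StringIO(sample), 3):
    print(record)
```

Yielding records instead of printing them makes it easy to write the result to a file for COPY or to check it in a test.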
Have fun with your copy !
On Tue, 12 Oct 2004 15:33:46 +1000, Theo Galanakis <[EMAIL PROTECTED]> wrote:
Thanks for all your comments,
I have been trying the inserts within a transaction block; however,
it does not seem to reduce the time it takes to process each record. Mind
you, there are 80 columns and the insert statement explicitly names the
columns to insert into.
I need any tip I can get to help me transform the text file into a format that postgres copy will successfully read.
Here is a sample of the current format of a mysql tab-delimited dump:
columnA	columnB
1	What a day!
2	What a week it has
been!
3	What the!
As you can see, row 2 has a value that holds a CR, which ends up wrapping around onto the third line. The postgres copy command does not like this, and mysql is unable to replace the value with another type of delimiter, like a \r.
So I gather I have to somehow manually replace the carriage return with something postgres understands, such as \r...
columnA	columnB
1	What a day!
2	What a week it has \r been!
3	What the!
How do I do this without getting a text file that looks like this:
1 What a day! \r\n2 What a week it has \r been!\r\n3 What the!\r\n
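For reference, the text format read by COPY represents a line break inside a value as a backslash escape (\n or \r), while the real newline is reserved for the end of each row. A minimal Python sketch of that escaping follows. The helper names copy_escape and copy_line are hypothetical, and the sketch assumes the fields of each row have already been separated, which is the hard part with the dump shown above.

```python
def copy_escape(value):
    """Escape one field for the COPY text format; None becomes \\N (NULL)."""
    if value is None:
        return "\\N"
    # Backslash must be escaped first so later escapes are not doubled.
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def copy_line(fields):
    """Join escaped fields with tabs and end the row with a real newline."""
    return "\t".join(copy_escape(field) for field in fields) + "\n"


# Hypothetical row whose second column holds an embedded CR/LF pair.
print(copy_line(["2", "What a week it has\r\nbeen!"]), end="")
```

Only the characters inside a field are escaped, so the row terminators stay real line breaks and the file does not collapse into one long line.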
Any help would be appreciated.
Theo
-----Original Message-----
From: Christopher Browne [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 12 October 2004 10:46 AM
To: [EMAIL PROTECTED]
Subject: Re: [SQL] Export tab delimited from mysql to postgres.
Quoth [EMAIL PROTECTED] (Theo Galanakis):
Could you provide an example of how to do this?
I actually ended up exporting the data as INSERT statements, which strips out CR/LF within varchars. However, it takes an eternity
to import 200,000 records... 24 hours, in fact! Is this normal?
I expect that this results from each INSERT being a separate transaction.
If you put a BEGIN at the start and a COMMIT at the end, you'd doubtless see
an ENORMOUS improvement.
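As a rough illustration of the BEGIN/COMMIT suggestion, the Python sketch below brackets a series of INSERT statements so that they run as a single transaction when the resulting script is fed to psql. The function name wrap_in_transaction and the table and column names are assumptions made up for the example.

```python
def wrap_in_transaction(statements):
    """Yield the given SQL statements bracketed by BEGIN and COMMIT."""
    yield "BEGIN;"
    for statement in statements:
        yield statement
    yield "COMMIT;"


# Hypothetical INSERT statements; the real dump has 80 columns per row.
inserts = [
    "INSERT INTO mytable (columna, columnb) VALUES (1, 'What a day!');",
    "INSERT INTO mytable (columna, columnb) VALUES (3, 'What the!');",
]
print("\n".join(wrap_in_transaction(inserts)))
```

How much this helps depends on the setup; it removes the per-statement commit but not the cost of parsing each INSERT.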
That's not even the _big_ improvement, either. The _big_ improvement would
involve reformatting the data so that you could use the COPY statement,
which is _way_ faster than a bunch of INSERTs. Take a look at the
documentation to see the formatting that is needed:
http://techdocs.postgresql.org/techdocs/usingcopy.php
http://www.faqs.org/docs/ppbook/x5504.htm
http://www.postgresql.org/docs/7.4/static/sql-copy.html
--
output = ("cbbrowne" "@" "ntlug.org")
http://www3.sympatico.ca/cbbrowne/lsf.html
Question: How many surrealists does it take to change a light bulb?
Answer: Two, one to hold the giraffe, and the other to fill the bathtub with brightly colored machine tools.
---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])