Hello Friends, 

We're trying something for the first time: a COPY into a database from a text 
(or CSV) file containing one really, really big field of HTML. 


The field holds the content of complete web pages, which we need to analyze, 
slice, dice, etc. later on, so it's verbatim HTML, with all the carriage 
returns, spaces, linefeeds(?) and double quotes included! 


Problem is: on the very first record, the COPY command hiccups with a 
'missing data for column' error. 
In CSV mode, it's 'extra data after last expected column' (yes, using different 
input files for the test). 


Both errors above make sense to me; COPY is running into either a CR or a tab 
character in each case, and treating it as the end of a row or of a column. 


Q: Is there a way to handle this directly, as a PG import? 


Meanwhile, we're off to use grep/gawk to remove all the carriage returns in the 
field. 
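
For what it's worth, here is a minimal sketch of an alternative to stripping 
those characters: escaping them the way COPY's default text format expects 
(backslash, newline, carriage return and tab written as \\, \n, \r and \t), so 
the HTML would come back out verbatim. The function names and the sample page 
are made up for illustration, and we have not run this against our real data. 

```python
def escape_copy_text(value: str) -> str:
    """Escape one field for PostgreSQL's COPY text format (hypothetical helper)."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
             .replace("\n", "\\n")   # a raw newline would end the row
             .replace("\r", "\\r")   # a raw carriage return is also read as a line ending
             .replace("\t", "\\t")   # a raw tab would start a new column
    )


def to_copy_line(fields) -> str:
    """Join already-stringified fields into one tab-delimited COPY input line."""
    return "\t".join(escape_copy_text(f) for f in fields) + "\n"


# Made-up sample: an id column plus one HTML field with CR/LF, a tab, quotes and a backslash.
page = '<html>\r\n<body class="x">\tHello\\World</body>\r\n</html>'
line = to_copy_line(["1", page])
```

Each record then sits on exactly one physical line with exactly one tab between 
the two columns, which is what the text-format COPY wants to see. 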


TIA for any help, inspiration, recipes (or time in the stocks). Lou 
