On October 25, 2007 09:35:23 am Chuck D. wrote: > On October 24, 2007 01:10:59 am Paul Lambert wrote: > > I get around this problem with my data loads by specifying some other > > arbitrary character that I know won't appear in the data as the quote > > character. > > > > Eg QUOTE E'\f' will specify form feed as the quote character, ergo any > > data with double or single quotes will be loaded with those quote > > characters in the string. > > > > Something similar may help with your case. > > This was the solution. I specified a quote character that was not in the > data and the data imported perfectly. Without specifying any delimiter > postgres defaults to one of the quotes (I forget which). > > Unfortunately, the data I imported wasn't good. MaxMind, like the > Geonames.org derivatives, uses FIPS code for a state identifier in the > cities table for all countries EXCEPT USA in which case they use the iso > code. Both these data sets mix types within one column and I find that > absolutely unacceptable. > > Back to my original problem, which was trying to COPY in some of the > earth-info.nga.mil world city data. This data is tab delimited, no quotes > around fields, newline line terminated and UTF-8 encoded. > > Using a similar COPY statement with the defaults, it fails with this: > > COPY geo.orig_city FROM > '/home/www/geo/DATA/nga.mil/geonames_no_header.txt'; > > ERROR: literal carriage return found in data > HINT: Use "\r" to represent carriage return. > CONTEXT: COPY orig_city, line 1071850 > > And of course, at that line we find a field that has several lines which > appear (using cat -A) to be terminated with a new line ($). I originally > deleted this line but there are others like it. And the file is 2 Gigs in > size so it isn't acceptable to comb through it. > > I believe this is a new problem because I have a vintage file dated early > 2007 that didn't have this problem. Does anyone know how to solve this > COPY issue? >
Pardon me on this, the cat -A report for the failed line (and subsequent lines) shows ^M$ within the field, not just $. I assume that is probably a \r\n and postgres wants \r for field data and \n to end a line. I've tried working this over with sed but can't get the syntax right. I also have iconv installed if that would help any. Are there any good tools that will tell me what this really is instead of just ^M$ ? ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings