Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Craig Ringer
On 03/25/2014 07:05 AM, Tom Lane wrote:
> Jim Nasby writes:
>> Wait... I thought that was one of the objections... that we wanted to
>> leave a BOM in something like a COPY untouched?
>
> I think most of us are okay with stripping a BOM that appears at the
> *beginning* of a text file (assuming t

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Craig Ringer
On 03/25/2014 02:50 AM, Jim Nasby wrote:
> So instead of trying to handle this on the psql side[1], I think we need
> to handle it in the backend; specifically in the parser. Is there an
> easy way to get the parser to ignore the BOM character in the context of
> commands (but not in strings)? I d

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Andrew Dunstan
On 03/24/2014 08:28 PM, Tatsuo Ishii wrote:
The code would probably be pretty trivial, *if* we had consensus on what the behavior ought to be. I'm not sure if we do. People who only use Unicode would probably like it if BOMs were unconditionally swallowed, whether or not psql thinks the client

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Tatsuo Ishii
> The code would probably be pretty trivial, *if* we had consensus on
> what the behavior ought to be. I'm not sure if we do. People who
> only use Unicode would probably like it if BOMs were unconditionally
> swallowed, whether or not psql thinks the client_encoding is UTF8.
> (And I seem to rec

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Tatsuo Ishii
>> Just a quick comment on this. Yes, pgAdmin always added a BOM in every
>> SQL file it wrote.
>
> From
> http://stackoverflow.com/questions/2223882/whats-different-between-utf-8-and-utf-8-without-bom:
>
> According to the Unicode standard, the BOM for UTF-8 files is not recommended:
>
> 2.6

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Tom Lane
Jim Nasby writes:
> Wait... I thought that was one of the objections... that we wanted to
> leave a BOM in something like a COPY untouched?

I think most of us are okay with stripping a BOM that appears at the *beginning* of a text file (assuming there's reason to believe the file is in UTF8 encod

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Jim Nasby
On 3/24/14, 1:59 PM, Andrew Dunstan wrote: It occurs to me that we're going about this the wrong way... The error here isn't being generated by psql; it's generated by the backend. In the context of a statement (and not, say, a COPY command). So instead of trying to handle this on the psql sid

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Merlin Moncure
On Mon, Mar 24, 2014 at 2:37 PM, Merlin Moncure wrote:
> psql -1 already requires '-f' to work

Actually, it doesn't. This was fixed recently.

merlin

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Merlin Moncure
On Mon, Mar 24, 2014 at 2:16 PM, Tom Lane wrote:
> Andrew Dunstan writes:
>> I suspect trying to do this in the parser will be quite messy.
>> This needs to happen before the input is converted to the server
>> encoding, I think.
>
> Indeed --- what if the server isn't using utf8 internal

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Tom Lane
Andrew Dunstan writes:
> I suspect trying to do this in the parser will be quite messy.
> This needs to happen before the input is converted to the server
> encoding, I think.

Indeed --- what if the server isn't using utf8 internally? And a larger point is that the server has no idea w
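
To illustrate the encoding point above: U+FEFF has no representation in many non-Unicode encodings, so a BOM that survives until conversion to such a server encoding cannot be converted at all. The following is only a minimal sketch. Python's codecs stand in for PostgreSQL's own conversion routines, LATIN1 is an arbitrarily chosen example of a non-UTF8 server encoding, and `can_encode` is a hypothetical helper, not anything from psql or the backend.

```python
BOM = "\ufeff"  # U+FEFF, the character a UTF-8 BOM decodes to


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# A script as a BOM-writing editor might save it (assumed example content).
script = BOM + "SELECT 1;"

# Stripping the leading BOM before conversion, as suggested in the thread.
stripped = script[len(BOM):] if script.startswith(BOM) else script

utf8_ok = can_encode(script, "utf-8")        # BOM is representable in UTF-8
latin1_ok = can_encode(script, "latin-1")    # BOM is not representable in LATIN1
latin1_stripped_ok = can_encode(stripped, "latin-1")
```

In this sketch only the stripped script converts cleanly to the non-UTF8 encoding, which is why the stripping has to happen before the conversion step.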

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Andrew Dunstan
On 03/24/2014 02:50 PM, Jim Nasby wrote:
On 3/22/14, 11:26 AM, Jim Nasby wrote:
On 3/21/14, 4:54 PM, Tom Lane wrote:
Merlin Moncure writes:
There is no way for psql to handle that case though unless you'd strip *all* BOMs encountered. Compounding this problem is that there's no practical wa

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-24 Thread Jim Nasby
On 3/22/14, 11:26 AM, Jim Nasby wrote:
On 3/21/14, 4:54 PM, Tom Lane wrote:
Merlin Moncure writes:
There is no way for psql to handle that case though unless you'd strip *all* BOMs encountered. Compounding this problem is that there's no practical way AFAIK to send multiple files to psql via s

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-23 Thread David E. Wheeler
On Mar 23, 2014, at 8:03, Guillaume Lelarge wrote:
>
> Just a quick comment on this. Yes, pgAdmin always added a BOM in every
> SQL file it wrote.

From http://stackoverflow.com/questions/2223882/whats-different-between-utf-8-and-utf-8-without-bom:

According to the Unicode standard, the BOM fo

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-23 Thread Guillaume Lelarge
On Sat, 2014-03-22 at 11:23 -0500, Jim Nasby wrote:
> On 3/21/14, 8:13 PM, David E. Wheeler wrote:
> > On Mar 21, 2014, at 2:16 PM, Andrew Dunstan wrote:
> >
> >> Surely if it were really a major annoyance, someone would have sent code
> >> to fix it during the last 4 years and more since the abo

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-22 Thread Jim Nasby
On 3/21/14, 4:54 PM, Tom Lane wrote:
Merlin Moncure writes:
There is no way for psql to handle that case though unless you'd strip *all* BOMs encountered. Compounding this problem is that there's no practical way AFAIK to send multiple files to psql via a single command line invocation. If you p

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-22 Thread Jim Nasby
On 3/21/14, 8:13 PM, David E. Wheeler wrote:
On Mar 21, 2014, at 2:16 PM, Andrew Dunstan wrote:
Surely if it were really a major annoyance, someone would have sent code to fix it during the last 4 years and more since the above. I suspect it's a minor annoyance :-) But by all means add it t

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-21 Thread David E. Wheeler
On Mar 21, 2014, at 2:16 PM, Andrew Dunstan wrote:
> Surely if it were really a major annoyance, someone would have sent code to
> fix it during the last 4 years and more since the above.
>
> I suspect it's a minor annoyance :-)
>
> But by all means add it to the TODO list if it's not there al

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-21 Thread Tom Lane
Merlin Moncure writes:
> There is no way for psql to handle that case though unless you'd strip
> *all* BOMs encountered. Compounding this problem is that there's no
> practical way AFAIK to send multiple files to psql via a single command
> line invocation. If you pass multiple -f arguments all bu

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-21 Thread Merlin Moncure
On Fri, Mar 21, 2014 at 4:28 PM, Tom Lane wrote:
> I'd be okay with swallowing a leading BOM if and only if client encoding
> is UTF8. This should apply to any file psql reads, whether script or
> data.

Yeah. The one case that doesn't solve is:

cat f1.sql f2.sql | psql ...

Which is common usa
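
The rule discussed above (swallow a leading BOM if and only if the client encoding is UTF8) and the concatenation case it misses can be sketched as follows. This is an illustration only, not psql's code: `strip_leading_bom` is a hypothetical helper, the encoding names it accepts are an assumption, and the byte strings stand in for the contents of f1.sql and f2.sql.

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def strip_leading_bom(data: bytes, client_encoding: str) -> bytes:
    """Drop one leading UTF-8 BOM, but only when the client encoding is UTF8."""
    # Assumption: the encoding is spelled "UTF8" or "UTF-8", in any case.
    if client_encoding.upper() in ("UTF8", "UTF-8") and data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


# Stand-ins for two files saved by a BOM-writing editor.
f1 = UTF8_BOM + b"SELECT 1;\n"
f2 = UTF8_BOM + b"SELECT 2;\n"

# Reading one file: the leading BOM is removed.
single = strip_leading_bom(f1, "UTF8")

# Equivalent of `cat f1.sql f2.sql | psql`: only the first BOM is at the
# start of the stream, so the second one is left embedded in the input.
piped = strip_leading_bom(f1 + f2, "UTF8")

# With a non-UTF8 client encoding the data is passed through untouched.
untouched = strip_leading_bom(f1, "LATIN1")
```

The `piped` value still contains the BOM from the second file, which is the case a leading-only rule does not cover.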

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-21 Thread Tom Lane
Andrew Dunstan writes:
> Surely if it were really a major annoyance, someone would have sent code
> to fix it during the last 4 years and more since the above.

The code would probably be pretty trivial, *if* we had consensus on what the behavior ought to be. I'm not sure if we do. People who o

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-21 Thread Andrew Dunstan
On 03/21/2014 05:06 PM, Merlin Moncure wrote:
On Fri, Mar 21, 2014 at 4:02 PM, Jim Nasby wrote:
See http://www.postgresql.org/message-id/4afeab39.3000...@dunslane.net This is still broken as of fairly recent HEAD; any objections to adding it to TODO?
Agreed: this is a major annoyance.

Re: [HACKERS] psql blows up on BOM character sequence

2014-03-21 Thread Merlin Moncure
On Fri, Mar 21, 2014 at 4:02 PM, Jim Nasby wrote:
> See http://www.postgresql.org/message-id/4afeab39.3000...@dunslane.net
>
> This is still broken as of fairly recent HEAD; any objections to adding it to
> TODO?

Agreed: this is a major annoyance.

merlin

[HACKERS] psql blows up on BOM character sequence

2014-03-21 Thread Jim Nasby
See http://www.postgresql.org/message-id/4afeab39.3000...@dunslane.net

This is still broken as of fairly recent HEAD; any objections to adding it to TODO?

-- 
Jim C. Nasby, Data Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net