On Fri, Nov 1, 2024, at 22:28, Masahiko Sawada wrote: > As I mentioned in a separate email, if we use the OS default EOL as > the default EOL in raw format, it would not be necessary to allow it > to be multi characters. I think it's worth considering it.
I like the idea, but not sure I understand how it would work. What if a user's OS default is \n (LF) and this user wants to import a Windows text file \r\n (CR LR), which is a multi characters EOL delimiter. Was your idea to make an exception for that particular EOL, or to simply not support that edge case? > --- > * copyfromparse.c > * Parse CSV/text/binary format for COPY FROM. > > We need to update here as well. > > -- > - * 4. CopyReadAttributesText/CSV() function takes the input line from > + * 4. CopyReadAttributesText/CSV/Raw() function takes the input line from > > I think we don't have CopyReadAttributesRaw() function now. > > --- > I think it would be better to explain how to parse data in raw mode, > especially which steps in the pipeline we skip, in the comment at the > top of copyfromparse.c. > > --- > - if (cstate->opts.format == COPY_FORMAT_CSV) > + if (cstate->opts.format == COPY_FORMAT_RAW) > { > - /* > - * If character is '\r', we may need to look ahead below. Force > - * fetch of the next character if we don't already have it. We > - * need to do this before changing CSV state, in case '\r' is also > - * the quote or escape character. > - */ > - if (c == '\r') > + if (delim_len == 0) > { > - IF_NEED_REFILL_AND_NOT_EOF_CONTINUE(0); > + /* When reading entire file, consume all remaining > bytes at once */ > + input_buf_ptr = copy_buf_len; > + continue; > } > + else > > I think that the code become much simpler if we do 'continue' at the > end of 'if (cstate->opts.format == COPY_FORMAT_RAW)' branch, instead > of doing 'if (cstate->opts.format == COPY_FORMAT_RAW) {} else ...'. > > --- > @@ -1461,6 +1536,7 @@ CopyReadLineText(CopyFromState cstate) > return result; > } > > + > /* > * Return decimal value for a hexadecimal digit > */ > @@ -1937,7 +2013,6 @@ endfield: > return fieldno; > } > > - > /* > * Read a binary attribute > */ > > Unnecessary line addition and removal. Thanks for review, I agree on all comments, awaiting your comment on EOL before implementing changes. /Joel