On Thu, Feb 20, 2020 at 02:36:02PM +0100, Tomas Vondra wrote:
> On Thu, Feb 20, 2020 at 04:11:39PM +0530, Amit Kapila wrote:
> > On Thu, Feb 20, 2020 at 5:12 AM David Fetter <da...@fetter.org> wrote:
> > >
> > > On Fri, Feb 14, 2020 at 01:41:54PM +0530, Amit Kapila wrote:
> > > > This work is to parallelize the copy command and in particular "Copy
> > > > <table_name> from 'filename' Where <condition>;" command.
> > >
> > > Apropos of the initial parsing issue generally, there's an interesting
> > > approach taken here: https://github.com/robertdavidgraham/wc2
> > >
> >
> > Thanks for sharing.  I might be missing something, but I can't figure
> > out how this can help here.  Does this in some way help to allow
> > multiple workers to read and tokenize the chunks?
>
> I think the wc2 is showing that maybe instead of parallelizing the
> parsing, we might instead try using a different tokenizer/parser and
> make the implementation more efficient instead of just throwing more
> CPUs on it.

That was what I had in mind.

> I don't know if our code is similar to what wc does, maybe parsing
> csv is more complicated than what wc does.

CSV parsing differs from wc in that there are more states in the state
machine, but I don't see anything fundamentally different.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
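
To make the "more states in the state machine" point concrete, here is a
minimal sketch of a character-at-a-time CSV tokenizer. It is only an
illustration: the names (State, parse_csv) are made up for this example,
it is not how COPY or wc2 is implemented, and it assumes a simplified CSV
dialect (comma delimiter, double-quote quoting, a doubled quote as the
escape, "\n" as the row terminator). Plain word counting gets by with
roughly two states (in a word, not in a word); this sketch needs four.

```python
from enum import Enum, auto


class State(Enum):
    FIELD_START = auto()      # at the beginning of a field
    UNQUOTED = auto()         # inside an unquoted field
    QUOTED = auto()           # inside a quoted field
    QUOTE_IN_QUOTED = auto()  # just saw a quote while inside a quoted field


def parse_csv(text, delim=",", quote='"'):
    """Split text into rows of fields with an explicit state machine."""
    rows, row, field = [], [], []
    state = State.FIELD_START
    for ch in text:
        if state is State.QUOTED:
            if ch == quote:
                state = State.QUOTE_IN_QUOTED
            else:
                # Delimiters and newlines are ordinary data inside quotes.
                field.append(ch)
            continue
        if state is State.QUOTE_IN_QUOTED and ch == quote:
            # A doubled quote is an escaped literal quote.
            field.append(quote)
            state = State.QUOTED
            continue
        # Here we are outside quotes, or a quoted section has just closed.
        if ch == delim:
            row.append("".join(field))
            field = []
            state = State.FIELD_START
        elif ch == "\n":
            row.append("".join(field))
            field = []
            rows.append(row)
            row = []
            state = State.FIELD_START
        elif ch == quote and state is State.FIELD_START:
            state = State.QUOTED
        else:
            # Lenient choice: stray characters are kept as field data.
            field.append(ch)
            state = State.UNQUOTED
    # Flush a final row that has no trailing newline.
    if state is not State.FIELD_START or field or row:
        row.append("".join(field))
        rows.append(row)
    return rows


rows = parse_csv('a,"b,c"\n"say ""hi""",d\n')
multiline = parse_csv('"x\ny",z')
```

Note that in the second call the newline sits inside a quoted field, so
it does not end the row. Whether a given newline is a row boundary
depends on the state the machine is in when it gets there.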