On Tue, Sep 22, 2015 at 3:14 AM, Haribabu Kommi <kommi.harib...@gmail.com> wrote:

> The copy_user_generic_string kernel calls are due to file read operations.
> In my test, I set shared_buffers to 12GB with a table size of 18GB.
OK, cool.  So that's actually good: all of that work would have to be done either way, and parallelism lets several CPUs work on it at once.

> The _spin_lock calls come from the signals that are generated by the workers.
> Increasing the tuple queue size changes the kernel system call usage.

And this part is not so good: that's additional work created by parallelism that wouldn't have to be done if we weren't in parallel mode.  Of course, it's impossible to eliminate that entirely, but we should try to reduce it.

> From the above performance readings, increasing the tuple queue size
> benefits configurations with fewer workers more than those with a higher
> number of workers.

That makes sense to me, because there's a separate queue for each worker.  If we have more workers, then the total amount of queue space available rises in proportion to the number of workers available.

> Workers are started irrespective of the system load.  Suppose the user
> configures 16 workers, but because of a sudden increase in system load
> only 2 or 3 CPUs are idle.  In that case, if a query eligible for
> parallel seq scan is executed, the backend may still start 16 workers,
> which can increase overall system usage and degrade the performance of
> other backend sessions.

Yep, that could happen.  It's something we should work on, but the first version isn't going to try to be that smart.  It's similar to the problem we already have with work_mem, and while I want to work on it, we need to get this working first.

> If a query has two parallel seq scan plan nodes, how will the workers
> be distributed across the two nodes?  Currently parallel_seqscan_degree
> is applied per plan node; even if we change that to per query, I think
> we need worker-distribution logic rather than letting a single plan
> node use all the workers.

Yes, we need that, too.  Again, at some point.
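To illustrate the point about per-worker queues, here is a back-of-the-envelope sketch in Python.  The per-queue size constant is purely hypothetical, not PostgreSQL's actual value; the arithmetic just shows why aggregate queue space scales with the worker count, so enlarging each queue matters relatively more when there are few workers.

```python
# Illustrative sketch (not PostgreSQL code): each parallel worker owns a
# separate tuple queue, so aggregate queue space scales linearly with the
# number of workers.

PER_QUEUE_BYTES = 64 * 1024  # hypothetical per-worker tuple queue size


def total_queue_space(queue_size_bytes, num_workers):
    """Aggregate tuple-queue space across all parallel workers."""
    return queue_size_bytes * num_workers


# With few workers the aggregate is small, so enlarging each queue has a
# large relative effect; with many workers the aggregate is already large.
few = total_queue_space(PER_QUEUE_BYTES, 2)    # 128 KiB total
many = total_queue_space(PER_QUEUE_BYTES, 16)  # 1 MiB total
```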
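For the load question, one naive approach (which, to be clear, PostgreSQL does not implement; the function and its heuristic are entirely illustrative) would be to cap the requested worker count by a rough estimate of idle CPUs derived from the load average:

```python
import os


def capped_worker_count(requested_workers):
    """Cap requested parallel workers by an estimate of idle CPUs.

    Purely illustrative, not PostgreSQL behavior.  Uses the 1-minute
    load average as a crude proxy for busy CPUs; always allows at
    least one worker.  os.getloadavg() is Unix-only.
    """
    ncpu = os.cpu_count() or 1
    load1, _, _ = os.getloadavg()     # 1-minute load average
    idle = max(1, int(ncpu - load1))  # rough count of idle CPUs
    return min(requested_workers, idle)
```

A real solution would need to account for load changing after the workers have launched, which is part of why this is hard and deferred.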
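The worker-distribution logic mentioned above could, under one simple policy, split a per-query worker budget evenly across the parallel plan nodes.  This sketch is a hypothetical illustration of that idea, not anything in PostgreSQL:

```python
def distribute_workers(total_workers, num_parallel_nodes):
    """Split a per-query worker budget across parallel plan nodes.

    Illustrative only: divides evenly, giving earlier nodes any
    remainder, so no single node consumes the whole budget.
    """
    base, rem = divmod(total_workers, num_parallel_nodes)
    return [base + (1 if i < rem else 0) for i in range(num_parallel_nodes)]


# distribute_workers(16, 2) -> [8, 8]
# distribute_workers(7, 2)  -> [4, 3]
```

A real distribution policy would likely also weight nodes by estimated cost rather than splitting evenly.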
> Select with a limit clause has a performance drawback with parallel
> seq scan in some scenarios: because of the very low selectivity
> compared to a plain seq scan, it would be better if we documented it.
> Users can then take appropriate action for queries with a limit clause.

This is something I want to think about further in the near future.  We don't have a great plan for shutting down workers when no further tuples are needed because, for example, an upper node has filled a limit.  That makes using parallel query in contexts like Limit and InitPlan significantly more costly than you might expect.  Perhaps we should avoid parallel plans altogether in those contexts, or maybe there is some other approach that can work.  I haven't figured it out yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers