Hi Gopal,

Aha! Thank you for the background behind this. That makes things much more understandable.
and ~3000 queries across 10 HS2 servers. Sweet. Now that's what I call
pushing the edge. I like it!

Thanks again,
Stephen.

On Tue, Aug 9, 2016 at 10:29 PM, Gopal Vijayaraghavan <gop...@apache.org>
wrote:

> > not get the progress messages back until the query finishes which
> > somewhat defeats the purpose of interactive usage.
>
> That happens entirely on the client side btw.
>
> So to avoid a hard sleep() + check loop causing pointless HTTP traffic,
> HiveServer2 now does a long poll on the server side.
>
> "hive.server2.long.polling.timeout", "5000ms"
>
> This means that it is edge-triggered to return whenever the query finishes
> instead of adding extra time when the results are ready but beeline
> doesn't know about it.
>
> However, the get_logs() synchronizes on the same HiveStatement and is
> mutexed out by the long poll for getting results.
>
> You can escape this on a low-concurrency cluster by changing the
> long.polling.timeout to 0.5s instead of 5s & restarting HS2.
>
> However as the total # of concurrent queries goes up, the current setting
> does very well due to the reduction in total # of "Nope, come back" http
> noise (largest parallel workload I've seen is about ~3000 queries on 10
> HS2 nodes load-balanced).
>
> Cheers,
> Gopal
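
For anyone reading this thread later, here is a rough sketch of the trade-off Gopal describes: a client-side sleep() + check loop versus a server-side long poll that returns as soon as the query finishes. This is a toy simulation in Python, not HiveServer2 or beeline code. FakeQuery, sleep_and_check and long_poll_loop are made-up names, and each loop iteration stands in for one HTTP round trip.

```python
import threading
import time


class FakeQuery:
    """Hypothetical stand-in for a running query (not HiveServer2 code)."""

    def __init__(self, run_seconds):
        self._done = threading.Event()
        # Mark the query as finished after run_seconds.
        threading.Timer(run_seconds, self._done.set).start()

    def is_done(self):
        # Immediate status check: answers right away, finished or not.
        return self._done.is_set()

    def long_poll(self, timeout):
        # Blocks until the query finishes or the timeout expires,
        # whichever comes first. Returns True if the query finished.
        return self._done.wait(timeout)


def sleep_and_check(query, interval):
    """Client-side sleep() + check loop; counts simulated round trips."""
    round_trips = 0
    while True:
        round_trips += 1
        if query.is_done():
            return round_trips
        time.sleep(interval)


def long_poll_loop(query, timeout):
    """Server-side long poll; counts simulated round trips."""
    round_trips = 0
    while True:
        round_trips += 1
        if query.long_poll(timeout):
            return round_trips


if __name__ == "__main__":
    # Assumed numbers for illustration only: a 0.3 s query,
    # a 50 ms check interval, and a 5 s long-poll timeout.
    print("sleep+check round trips:", sleep_and_check(FakeQuery(0.3), 0.05))
    print("long-poll round trips:  ", long_poll_loop(FakeQuery(0.3), 5.0))
```

With a long timeout, the long-poll loop needs a single round trip for a short query, while the sleep + check loop makes one per interval. Shrinking the timeout (as in the 5s to 0.5s suggestion above) gives other calls more chances to get in between polls, at the cost of more "Nope, come back" responses.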