I just became aware that AUTOFLUSH_BACKGROUND has been available for longer on the Java client, so it is likely not the cause after all. Never mind my suggestion.
-david

On Mon, Apr 24, 2017 at 10:56 AM, Todd Lipcon <[email protected]> wrote:

> I vaguely recall some bug in earlier versions of the Java client where
> 'shutdown' wouldn't properly block on the data being flushed. So it's
> possible in 1.0.x and below, you're not actually measuring the full amount
> of time to write all the data, whereas when the bug is fixed, you are.
>
> I'll see if I can repro this locally as well using your code.
>
> -Todd
>
> On Mon, Apr 24, 2017 at 10:49 AM, David Alves <[email protected]>
> wrote:
>
>> Hi Pavel
>>
>> Interesting, Thanks for sharing those numbers.
>> I assume you weren't using AUTOFLUSH_BACKGROUND for the first versions
>> you tested (don't think it was available then iirc).
>> Could you try without in the last version and see how the numbers
>> compare?
>> We'd be happy to help track down the reason for this perf regression.
>>
>> Best
>> David
>>
>> On Mon, Apr 24, 2017 at 4:58 AM, Pavel Martynov <[email protected]>
>> wrote:
>>
>>> Hi, I ran into the fact that I can not achieve high insertion speed and
>>> I start to experiment with
>>> https://github.com/cloudera/kudu-examples/tree/master/java/insert-loadgen.
>>> My slightly modified code (recreation of table on startup + duration
>>> measuring):
>>> https://gist.github.com/xkrt/9405a2eeb98a56288b7c5a7d817097b4.
>>> On every run I change kudu-client version, results:
>>>
>>> kudu-client-ver  perf
>>> 0.10             Duration: 626 ms, 79872/sec
>>> 1.0.0            Duration: 622 ms, 80385 inserts/sec
>>> 1.0.1            Duration: 630 ms, 79365 inserts/sec
>>> 1.1.0            Duration: 11703 ms, 4272 inserts/sec
>>> 1.3.1            Duration: 12317 ms, 4059 inserts/sec
>>>
>>> As can you see there was a great degradation between 1.0.1 and 1.1.0
>>> (about a ~20 times!).
>>> What could be a problem, how can I fix it? (actually I interested in
>>> kudu-spark, so probably using of kudu-client 1.0.1 is not right solution?).
>>>
>>> My test cluster: 3 hosts with master and tserver on each (3 masters and
>>> 3 tservers overall).
>>> No extra settings, flags used:
>>> fs_wal_dir
>>> fs_data_dirs
>>> master_addresses
>>> tserver_master_addrs
>>>
>>>
>>> --
>>> with best regards, Pavel Martynov
>>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
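
For anyone who wants to sanity-check the figures quoted above, here is a minimal Java sketch of the arithmetic. It does not touch Kudu at all. The row count of 50,000 per run is an assumption: the thread never states it, but it is the value that reproduces every reported rate under integer division. The class and method names are made up for illustration.

```java
public class InsertRate {
    // Rows per second, truncated by integer division (this matches the
    // rounding of the rates reported in the thread).
    static long insertsPerSec(long rows, long durationMs) {
        return rows * 1000 / durationMs;
    }

    // How many times longer the slow run took compared with the fast run.
    static double slowdown(long fastMs, long slowMs) {
        return (double) slowMs / fastMs;
    }

    public static void main(String[] args) {
        // Assumption: 50,000 rows per run, inferred from the reported
        // durations and rates, not stated anywhere in the thread.
        long rows = 50_000;
        String[] versions = {"0.10", "1.0.0", "1.0.1", "1.1.0", "1.3.1"};
        long[] durationsMs = {626, 622, 630, 11703, 12317};

        for (int i = 0; i < versions.length; i++) {
            System.out.printf("%-6s %6d ms %6d inserts/sec%n",
                    versions[i], durationsMs[i],
                    insertsPerSec(rows, durationsMs[i]));
        }
        System.out.printf("1.0.1 -> 1.1.0 slowdown: %.1fx%n",
                slowdown(630, 11703));
    }
}
```

Under that assumption the computed rates line up with the table, and the gap between 1.0.1 and 1.1.0 comes out a little under the "~20 times" mentioned in the original report. Note that this only checks the reported numbers against each other; it says nothing about whether the older clients were timing the full flush.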
