Yes, I submitted a patch: https://gerrit.cloudera.org/#/c/6735/. I worked on it all day; you know, making it compile on Windows is not so easy... :)

2017-04-25 22:57 GMT+03:00 Todd Lipcon <[email protected]>:

> Hi Pavel,
>
> That's a good find. It certainly does look like we could do caching of
> this data. We use the local network interface address list to determine
> whether a remote server is local or not.
>
> In fact, in many cases where we call this, we don't even care about the
> result - it's just computed as a side effect of creating the 'ServerInfo'
> object.
>
> I filed KUDU-1982 to track this issue.
>
> Any interest in working on a fix?
>
> -Todd
>
>
> On Tue, Apr 25, 2017 at 5:10 AM, Pavel Martynov <[email protected]>
> wrote:
>
>> I reproduced this problem with java.net.NetworkInterface.getByInetAddress
>> on Windows on a few other machines. I also found this 'not an issue' report:
>> http://bugs.java.com/view_bug.do?bug_id=7039343.
>> Maybe kudu-client could use some memoization for this function?
>>
>> 2017-04-25 13:09 GMT+03:00 Pavel Martynov <[email protected]>:
>>
>>> I figured out that the problem was that I ran this program on my development
>>> Windows machine. It seems that there is some performance issue with
>>> java.net.NetworkInterface.getByInetAddress on Windows (the only
>>> confirmation I have found so far is
>>> http://stackoverflow.com/questions/35541870/java-networkinterface-getbyinetaddress-takes-way-too-long).
>>> See the profiler screenshot http://pasteboard.co/8uHil3I5H.png (kudu-client
>>> v1.3.1): every call takes 53 ms (!) on average.
>>> Also, could you recheck the logic: why is this function called 88 times in 12
>>> seconds for that small program?
>>>
>>> 2017-04-24 22:29 GMT+03:00 Todd Lipcon <[email protected]>:
>>>
>>>> I tried to reproduce this locally using your code and couldn't. I get
>>>> around 100K inserts/second for 1.0, 1.1, 1.2, and 1.3 clients (against a
>>>> 1.4-SNAPSHOT cluster).
>>>>
>>>> Is it always reproducible for you? E.g. if you switch back to the earlier
>>>> client and try another set of runs, do you get the same results?
>>>>
>>>> -Todd
>>>>
>>>> On Mon, Apr 24, 2017 at 10:56 AM, Todd Lipcon <[email protected]>
>>>> wrote:
>>>>
>>>>> I vaguely recall some bug in earlier versions of the Java client where
>>>>> 'shutdown' wouldn't properly block on the data being flushed. So it's
>>>>> possible that in 1.0.x and below, you're not actually measuring the full
>>>>> amount of time to write all the data, whereas when the bug is fixed, you are.
>>>>>
>>>>> I'll see if I can repro this locally as well using your code.
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Mon, Apr 24, 2017 at 10:49 AM, David Alves <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Pavel
>>>>>>
>>>>>> Interesting, thanks for sharing those numbers.
>>>>>> I assume you weren't using AUTOFLUSH_BACKGROUND for the first
>>>>>> versions you tested (don't think it was available then iirc).
>>>>>> Could you try without it in the last version and see how the numbers
>>>>>> compare?
>>>>>> We'd be happy to help track down the reason for this perf
>>>>>> regression.
>>>>>>
>>>>>> Best
>>>>>> David
>>>>>>
>>>>>> On Mon, Apr 24, 2017 at 4:58 AM, Pavel Martynov <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi, I found that I cannot achieve a high insertion speed,
>>>>>>> so I started to experiment with
>>>>>>> https://github.com/cloudera/kudu-examples/tree/master/java/insert-loadgen.
>>>>>>> My slightly modified code (recreation of the table on startup + duration
>>>>>>> measuring): https://gist.github.com/xkrt/9405a2eeb98a56288b7c5a7d817097b4.
>>>>>>> On every run I changed the kudu-client version; results:
>>>>>>>
>>>>>>> kudu-client-ver  perf
>>>>>>> 0.10             Duration: 626 ms, 79872/sec
>>>>>>> 1.0.0            Duration: 622 ms, 80385 inserts/sec
>>>>>>> 1.0.1            Duration: 630 ms, 79365 inserts/sec
>>>>>>> 1.1.0            Duration: 11703 ms, 4272 inserts/sec
>>>>>>> 1.3.1            Duration: 12317 ms, 4059 inserts/sec
>>>>>>>
>>>>>>> As you can see, there was a great degradation between 1.0.1 and 1.1.0
>>>>>>> (about 20 times!).
>>>>>>> What could be the problem, and how can I fix it? (Actually I am
>>>>>>> interested in kudu-spark, so using kudu-client 1.0.1 is probably not
>>>>>>> the right solution?)
>>>>>>>
>>>>>>> My test cluster: 3 hosts with master and tserver on each (3 masters
>>>>>>> and 3 tservers overall).
>>>>>>> No extra settings, flags used:
>>>>>>> fs_wal_dir
>>>>>>> fs_data_dirs
>>>>>>> master_addresses
>>>>>>> tserver_master_addrs
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> with best regards, Pavel Martynov
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>>
>>>
>>>
>>>
>>> --
>>> with best regards, Pavel Martynov
>>>
>>
>>
>>
>> --
>> with best regards, Pavel Martynov
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

--
with best regards, Pavel Martynov
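
For readers of this thread, here is a minimal sketch of the memoization idea discussed above: cache the result of the "is this address local?" check per InetAddress so that the slow java.net.NetworkInterface.getByInetAddress call runs at most once per address. This is not the code from the Gerrit patch or from kudu-client; the class name LocalAddressCache and the methods isLocal and slowLookup are hypothetical, and the injectable lookup exists only so the caching can be exercised without touching real interfaces.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;

/** Hypothetical helper (not kudu-client code): memoizes local-address checks. */
final class LocalAddressCache {
  // One cached answer per address; ConcurrentHashMap makes lookups thread-safe.
  private final ConcurrentMap<InetAddress, Boolean> cache = new ConcurrentHashMap<>();
  private final Predicate<InetAddress> lookup;

  /** Uses the real (potentially slow) network-interface lookup. */
  LocalAddressCache() {
    this(LocalAddressCache::slowLookup);
  }

  /** Allows a custom lookup to be injected, e.g. for testing. */
  LocalAddressCache(Predicate<InetAddress> lookup) {
    this.lookup = lookup;
  }

  /** Returns the cached answer, running the lookup only on the first request. */
  boolean isLocal(InetAddress address) {
    return cache.computeIfAbsent(address, lookup::test);
  }

  /** Uncached check; the getByInetAddress call is the part reported as slow on Windows. */
  static boolean slowLookup(InetAddress address) {
    if (address.isAnyLocalAddress() || address.isLoopbackAddress()) {
      return true;
    }
    try {
      return NetworkInterface.getByInetAddress(address) != null;
    } catch (SocketException e) {
      // Assumption for this sketch: treat a failed lookup as "not local".
      return false;
    }
  }
}
```

One trade-off of this approach: a cached answer can go stale if the machine's network interfaces change while the process is running, so a real fix may want an expiry or a bounded cache.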
