Dmitry,

Thanks for the review.

I agree with all the suggestions.

I made and ran new benchmarks without any redundant operations (prints,
time calculations), only with "put" in a "while true" loop (almost).
The case description, environment, results, and code are in the ticket:
https://issues.apache.org/jira/browse/IGNITE-9824 (last comment)

Please review the new metrics.
The picture has not changed much: Python still has performance issues.
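
For the record, the loop shape I ended up with follows Dmitry's pattern: time() is called only twice, the payload is precomputed, and nothing else runs inside the loop. This is just an illustrative sketch, not the actual bench code — `FakeCache` is a plain dict wrapper standing in for a live pyignite cache so the snippet runs without a server, and `NUM_PUTS` is an arbitrary count:

```python
from time import time

# A stand-in for a real Ignite cache so this sketch runs without a
# server: FakeCache is just a dict wrapper, NOT the pyignite API.
class FakeCache:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

NUM_PUTS = 100_000
cache = FakeCache()
value = 42  # precomputed primitive payload; nothing is computed in the loop

# time() is called exactly twice, outside the loop, as suggested.
# (time.perf_counter() would be even better for interval measurement.)
start = time()
for key in range(NUM_PUTS):
    cache.put(key, value)
elapsed = time() - start

print('{} puts, {:.0f} ops/sec'.format(NUM_PUTS, NUM_PUTS / elapsed))
```

With the real client, only the `FakeCache` setup would change; the measured region stays a bare sequence of put calls.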

On Wed, Oct 10, 2018 at 12:30 PM Dmitry Melnichuk <
dmitry.melnic...@nobitlost.com>:

> Stepan!
>
> Can you please update the benchmarks based on my suggestions earlier in
> this thread: use cleaner time profiling for the loop, remove
> unnecessary operations (string formatting, rounding), and stick
> to a primitive int value for the put operations (I agree with Vladimir,
> let's keep it simple). And let me know the results.
>
> On 10/10/18 5:46 PM, Vladimir Ozerov wrote:
> > Hi Dmitry,
> >
> > I agree with your comments on the benchmarking code. As a fairer
> > alternative, we could measure the time to load N elements into the
> > cache, so that time() needs to be called only twice. However, given
> > that we have a real network here, and that according to the numbers one
> > PUT in the Python client takes about 0.3 ms, I can hardly imagine that
> > calls to time() or round() dominate that response time. But as I said,
> > we can easily rule that out with a slight benchmark rewrite.
> >
> > As for serialization, I would prefer to keep primitive objects for the
> > moment, because AFAIK the purpose of the benchmark was to assess the
> > underlying infrastructure, looking for obvious performance issues.
> > Every platform has sockets that use more or less the same underlying
> > OS API. By contrast, serialization performance may differ widely
> > between platforms depending on the implementation. E.g., in Java we
> > spent a lot of time on fine-grained tuning and applied a lot of
> > speculative optimizations. Add to this the JIT nature of Java, and you
> > will hardly ever beat the Java serialization engine from an
> > interpreted language. So primitive data types make good sense to me.
> >
> > At this point our goal is not to make Python as fast as the other
> > platforms, but rather to understand why it is slower than the others.
> > Ignite is a product that brings speed to end users. If they do not
> > have speed, they will not use Ignite. So performance questions are
> > always of great importance to us.
> >
> > Vladimir.
> >
> > On Wed, Oct 10, 2018 at 9:57 AM Dmitry Melnichuk
> > <dmitry.melnic...@nobitlost.com <mailto:dmitry.melnic...@nobitlost.com>>
>
> > wrote:
> >
> >     Hi, Stepan!
> >
> >     I looked at the benchmark code and the overall methodology,
> >     discussed it with fellow programmers, and came up with basically
> >     two remarks.
> >
> >     First of all, I think the key to a good (or at least
> >     unobjectionable) measurement is to isolate the object being
> >     measured from the influence of the measurement tool. The usual
> >     pattern we use in Python looks like:
> >
> >     ```
> >     from time import time
> >
> >
> >     bench_start = time()
> >
> >     for _ in range(number_of_tests):
> >           do_test()
> >
> >     bench_end = time()
> >
> >     print('Performance is {} tests per second'.format(
> >           number_of_tests / (bench_end - bench_start)
> >     ))
> >     ```
> >
> >     I think you get the idea: the measurement consists almost solely
> >     of the time taken by our subject function `do_test`. As little
> >     other code as possible influences the result.
> >
> >     Now, let's take a look at your code:
> >
> >     https://gist.github.com/pilshchikov/8aff4e30d83f8bac20c5a4a9c3917abb
> >
> >     Ideally, the `while` loop should include only `cache.put(last_key,
> >     some_precomputed_value)`. But instead it also includes:
> >
> >     - a series of `time()` calls, which could be mostly excluded from
> >     the measured time if the measurement were done right; each call
> >     probably addresses the HPET device, or network time, or both,
> >
> >     - some floating-point calculations, including `round()`, which are
> >     hardly necessary,
> >
> >     - formatting and output of the intermediate result.
> >
> >     I suppose the measurement influence can be quite significant here,
> >     but it is at least more or less constant for each test.
> >
> >     But if we look at the other benchmarks:
> >
> >     https://gist.github.com/pilshchikov/8a4bdb03a8304136c22c9bf7217ee447
> >     [Node.js]
> >     https://gist.github.com/pilshchikov/b4351d78ad59e9cd923689c2e387bc80
> >     [PHP]
> >     https://gist.github.com/pilshchikov/08096c78b425e00166a2ffa2aa5f49ce
> >     [Java]
> >
> >     The extra code that influences the measurement is not equivalent
> >     across all platforms. For example, PHP's `time()` is most probably
> >     lighter than Python's `time()`, since it does not give out
> >     milliseconds and may address the RTC, not the HPET. So the
> >     platform-to-platform comparison in your benchmark does not look
> >     completely fair to me.
> >
> >     The second remark concerns not the measurement procedure, but the
> >     object being measured.
> >
> >     The only client operation being used is OP_CACHE_PUT with a
> >     payload of a primitive type. (BTW the type is `Long` in the case
> >     of the Python client; what about the other clients? Is it `Int`?)
> >     I am afraid that such an object, even properly isolated from the
> >     measurement tool, would mostly show the throughput of the
> >     underlying platform's socket implementation, not the performance
> >     of the client's code. To show the potential of the thin client
> >     itself, more varied and demanding tasks are needed, e.g.
> >     serialization/deserialization of Complex objects.
> >
> >     But it depends on the goal of the benchmarking. If showing off the
> >     raw platform agility was intended, then this objection is
> >     withdrawn.
> >
> >     Dmitry
> >
> >     On 10/9/18 10:50 PM, Stepan Pilshchikov wrote:
> >      > Hi, all
> >      >
> >      > I tried to compare the performance of the new thin clients and
> >      > made similar benchmarks for each one.
> >      > The results are not so good for Python.
> >      >
> >      > Ticket with results and bench code:
> >      > https://issues.apache.org/jira/browse/IGNITE-9824
> >      >
> >      > Py code src:
> >      >
> https://gist.github.com/pilshchikov/8aff4e30d83f8bac20c5a4a9c3917abb
> >      >
> >      > Dmitry, please review the results and the bench code. Maybe
> >      > something is wrong, or are these the expected numbers?
> >      >
> >      >
> >      >
> >      > --
> >      > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> >      >
> >
>
>
