On 11/29/13, 8:33 PM, Cedric Greevey wrote:
Have you checked for other sources of performance hits? Boxing, var lookups, and especially reflection.
As I said, I haven't done any optimization yet. :) I did check for reflection though and didn't see any.

I'd expect a reasonably optimized Clojure version to outperform a Python version by a very large factor -- 10x just for being JITted JVM bytecode instead of interpreted Python, times another however-many-cores-you-have for core.async keeping all your processors warm vs. Python and its GIL limiting the Python version to single-threaded performance.
This task does not benefit from the multiplexing that core.async provides, at least not in the case of a single simulation which has no clear logical partition that can be run in parallel. The primary benefit that core.async is providing in this case is to escape from call-back hell.
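For context on why the Python side is so cheap here: a simpy-style process simulation is just a serial loop that resumes generators off a priority queue, with no thread handoffs at all. A minimal sketch in Python (a toy scheduler, not simpy's actual API; the names `Env` and `car` are illustrative only):

```python
import heapq

def car(env):
    # Alternate driving and parking forever; each yield hands a delay
    # back to the scheduler, like a timeout event in simpy/core.async.
    while True:
        yield 2  # drive for 2 time units
        yield 5  # park for 5 time units

class Env:
    """Toy serial scheduler: a heap of (time, seq, generator) entries."""
    def __init__(self):
        self.now = 0
        self._seq = 0
        self._queue = []

    def process(self, gen):
        heapq.heappush(self._queue, (self.now, self._seq, gen))
        self._seq += 1

    def run(self, until):
        while self._queue and self._queue[0][0] < until:
            self.now, _, gen = heapq.heappop(self._queue)
            try:
                delay = next(gen)  # resume the process; it yields its next delay
            except StopIteration:
                continue  # process finished
            self._seq += 1
            heapq.heappush(self._queue, (self.now + delay, self._seq, gen))

env = Env()
env.process(car(env))
env.run(until=100)
print(env.now)  # → 98 (last event strictly before t=100)
```

Every step is one `next()` call plus a heap push/pop, all on one thread, which is why the generator approach has so little per-event overhead to beat.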

If your Clojure version is 2.5x *slower*, then it's probably capable of a *hundredfold* speedup somewhere, which suggests reflection (typically a 10x penalty if it happens heavily in inner loops) *and* another sizable performance degrader* are combining here. Unless, again, you're measuring mostly overhead rather than real workload on the Clojure side, but not on the Python side. Put a significant load into each goroutine in both versions and compare them again; see whether that helps the Clojure version much more than the Python one for some reason.

Yeah, I think a real-life simulation may have different results than this micro-benchmark.


* The other degrader would need to multiply with, not just add to, the reflection. That suggests either blocking (made worse by reflection in one thread/go holding up progress system-wide for 10x as long as it would without reflection) or else excess/discarded work (a 10x penalty for reflection, times 10x as many calls as needed to get the job done -- due to transaction retries, a poor algorithm, or something -- would get you a 100-fold slowdown; but retries of swap! or dosync shouldn't be a factor if you're eschewing those in favor of go blocks for coordination...)



On Fri, Nov 29, 2013 at 10:13 PM, Ben Mabey <b...@benmabey.com <mailto:b...@benmabey.com>> wrote:

    On Fri Nov 29 17:04:59 2013, kandre wrote:

        Here is the gist: https://gist.github.com/anonymous/7713596
        Please note that there's no ordering of time for this simple
        example and there's only one event (timeout). This is not what
        I intend to use, but it shows the problem.
        Simulating 10^5 steps this way takes ~1.5s

        Cheers
        Andreas

        On Saturday, 30 November 2013 09:31:08 UTC+10:30, kandre wrote:

            I think I can provide you with a little code snippet.
            I am talking about the very basic car example
            (driving->parking->driving). Running the sim using core.async
            takes about 1s for 10^5 steps, whereas the simpy version
            takes less than 1s for 10^6 iterations on my vm.
            Cheers
            Andreas

            On Saturday, 30 November 2013 09:22:22 UTC+10:30, Ben Mabey wrote:

                On Fri Nov 29 14:13:16 2013, kandre wrote:
                > Thanks for all the replies. I accidentally left out the
                close! when I contrived the example. I am using core.async
                for a discrete event simulation system. There are hundreds
                of go blocks, each doing little but putting a sequence of
                events onto a channel, and one go block taking these
                events and advancing the time, similar to
                simpy.readthedocs.org/ <http://simpy.readthedocs.org/>
                >
                > The basic one-car example under the previous link
                executes about 10 times faster than the same example
                using core.async.
                >

                Hi Andreas,
                I've been using core.async for DES as well since I think
                the process-based approach is useful.  I could try doing
                the same simulation you're attempting to see how my
                approach compares speed-wise.  Are you talking about the
                car wash or the gas station simulation?  Posting a gist
                of what you have will be helpful so I can use the same
                parameters.

                -Ben




        --
        --
        You received this message because you are subscribed to the Google
        Groups "Clojure" group.
        To post to this group, send email to clojure@googlegroups.com
        <mailto:clojure@googlegroups.com>
        Note that posts from new members are moderated - please be patient
        with your first post.
        To unsubscribe from this group, send email to
        clojure+unsubscr...@googlegroups.com
        <mailto:clojure%2bunsubscr...@googlegroups.com>
        For more options, visit this group at
        http://groups.google.com/group/clojure?hl=en
        ---
        You received this message because you are subscribed to the Google
        Groups "Clojure" group.
        To unsubscribe from this group and stop receiving emails from
        it, send
        an email to clojure+unsubscr...@googlegroups.com
        <mailto:clojure%2bunsubscr...@googlegroups.com>.
        For more options, visit https://groups.google.com/groups/opt_out.


    I've verified your results and compared them with an implementation
    using my library.  My version runs 1.25x faster than yours, and
    that is with an actual priority queue behind the scheduling for
    correct simulation/time semantics.  However, mine is still 2x
    slower than the simpy version.  Gist with benchmarks:

    https://gist.github.com/bmabey/7714431

    simpy is a mature library with lots of performance tweaking, and I
    have done no optimizations so far.  My library is a thin wrapper
    around core.async with a few hooks into the internals, so I would
    expect that most of the time is being spent in core.async (again, I
    have done zero profiling to actually verify this).  So it may be
    that core.async is slower than Python generators for this
    particular use case.  I should say that this use case is odd in
    that our task is a serial one, so we don't get any benefit from
    having a threadpool to multiplex across (in fact the context
    switching may be harmful).
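    The serial-handoff point is easy to see in miniature: resuming a
    generator is a plain function call, while a channel-style handoff
    between threads goes through locks and the scheduler. A rough,
    hedged sketch (stdlib `queue.Queue` standing in for a channel; a
    toy comparison, not a rigorous benchmark):

```python
import queue
import threading
import time

N = 100_000  # matches the 10^5 steps discussed in this thread

def gen_counter():
    i = 0
    while True:
        yield i
        i += 1

# Serial: resume a generator N times on one thread.
g = gen_counter()
t0 = time.perf_counter()
for _ in range(N):
    last_gen = next(g)
gen_time = time.perf_counter() - t0

# Threaded: push N values through a bounded queue (channel-like handoff).
q = queue.Queue(maxsize=1)

def producer():
    for i in range(N):
        q.put(i)

t0 = time.perf_counter()
threading.Thread(target=producer).start()
for _ in range(N):
    last_q = q.get()
queue_time = time.perf_counter() - t0

print(f"generator resume: {gen_time:.3f}s, queue handoff: {queue_time:.3f}s")
```

    On a typical CPython build the queue version is far slower per
    step, which is the same shape of overhead a serial core.async
    pipeline pays relative to plain generator resumption.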

    In my case the current slower speeds are vastly outweighed by the
    benefits:
    * can run multiple simulations in parallel for sensitivity analysis
    * I plan on eventually targeting ClojureScript for visualization
      (right now an event stream from the JVM is used)
    * ability to leverage CEP libraries for advanced stats
    * being integrated into my production systems via channels, which
      do all the real decision making in the sims.  This means I can
      do sensitivity analysis on different policies using actual
      production code.  A nice side benefit is that I get a free
      integration test. :)

    Having said all that, I am still exploring the use of core.async
    for DES and have not yet replaced my event-based simulator.  I
    most likely will replace at least the parts of my simulations that
    have a lot of nested call-backs that make things hard to reason
    about.


    -Ben




