Apologies for the incoming wall of text, as well as for co-opting the -RC4
thread

TLDR:
- RC4 LGTM, I enjoy the startup-speed boost
- we need a benchmark before further evaluating the work on tuples

2015-12-18 13:02 GMT+01:00 Mikera <mike.r.anderson...@gmail.com>:

>
> I don't actually recall seeing any benchmarks showing slow-downs in
> real-world programs. Rich made an apparently unsubstantiated assertion that
> these exist but didn't provide his analysis (see CLJ-1517).
>

I don't remember any benchmarks showing the slowdown either, but I'm taking
Rich's word for it.

On the other hand Zach ran some benchmarks on JSON decoding and found a
> roughly 2x speedup. That's a pretty big deal for code implementing JSON
> APIs (which is probably a reasonable example of real world,
> nested-data-structure heavy code).
>

Well, I'm also taking your word for that, and for the speedups that you saw
in other benchmarks. Whether it represents real-world usage depends on the
shape of your test data: if the test data is just (repeat [:test :vector]),
then no, it doesn't represent real-world usage, because it exercises
just one arity.
Is the benchmark you're speaking of posted somewhere the wider community
can review it? I'd like to play with it, see the speedup, and try to break
it by adding polymorphism. I'd also be happy to help develop the test
cases.
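To make "adding polymorphism" concrete, here is a rough sketch of what I have in mind: a data generator and a consumer whose call sites see several vector lengths, so any arity-specialized tuple classes would all hit the same inline caches. The names (mixed-arity-data, sum-counts) are made up for illustration, and criterium is assumed for the actual timing:

```clojure
;; Sketch only; mixed-arity-data and sum-counts are invented names.
(defn mixed-arity-data
  "Vectors of lengths 0..4, so that any arity-specialized tuple
  classes all show up at the same call sites."
  [n]
  (mapv #(vec (range (mod % 5))) (range n)))

(defn sum-counts
  "A consumer whose `count` call site sees several concrete classes."
  [vs]
  (reduce (fn [acc v] (+ acc (count v))) 0 vs))

(comment
  ;; Monomorphic baseline vs. polymorphic case, e.g. with criterium:
  ;; (quick-bench (sum-counts (vec (repeat 100000 [:test :vector]))))
  ;; (quick-bench (sum-counts (mixed-arity-data 100000)))
  )
```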

I think it's good to have your tuple proposal around, so that we have
something to benchmark stock Clojure against. But before making a serious
push into core, we should have a test suite that can run many different
permutations of enabled test cases (exercising various arities), with
Clojure plus various proposal patches, ideally on various JVMs. Only then
can we get serious about discussing performance trade-offs.

Does anyone have any actual evidence of this supposed slowdown? i.e. is
> there a standard benchmark that is considered acceptable for general
> purpose / real world performance in Clojure applications?
>

If there were, somebody would probably have pointed it out. Right now, I
feel that any work is best spent on developing such a benchmark, to help
the community evaluate the situation.


> If so I'm happy to run it and figure out why any slowdown with Tuples is
> happening. My strong suspicion is that the following is true:
> 1) The Tuples generally provide a noticeable speedup (as demonstrated by
> the various micro-benchmarks)
>

(IMHO) Clojure has always been a big-picture language, and reliable
end-to-end performance in a multi-tenant setup is more important than
looking good on the Alioth benchmarks.


> 2) There are a few hotspots where Tuples *don't* make sense because of PIC
> pressure / megamorphic call sites (repeated conj on vectors might be an
> example....). These cases can be revealed by more macro-level benchmarking.
>

There are many possible caveats:
- is the morphism degree of a protocol call local to the call site, or
global to the protocol's dispatch fn?
- does the GC take advantage of objects being uniformly sized, and how much
of that would we lose?
- how much do the hotspots shift between different programs?

3) We should be able to identify these cases of 2) and revert to generating
> regular PersistentVectors (or switching to Transients....). In that case
> the Tuple patches may develop from being a debatable patch with some
> problematic trade-offs to a pretty clear all-round improvement (in both
> micro and macro benchmarks).
>

Well, before we have a comprehensive set of benchmarks, all we can really
do is throw code at the wall and see if it sticks.

The key point regarding 3): code that is performance sensitive (certainly
> in core, maybe in some libs) should consider whether a Tuple is a good idea
> or not (for any given call-site). These may need addressing individually,
> but this is incremental to the inclusion of Tuples themselves. The
> performance comparison isn't as simple as "current vs. tuples patch", it
> should be "current vs. tuples patch + related downstream optimisation"
> because that is what you are going to see in the released version.
>

To be really honest, this sounds a bit like: if only Cognitect shoved
tuples down the community's throat, people would start optimizing for them.
Which is true. It's also probable that, after the dust settles, we'd end
up with somewhat better performance than we have now. We still shouldn't do
it that way.

Why not start with a tuple library that we can use when we want increased
tuple performance? That certainly worked for cljx, even if people used to
complain about it.
(ns my.lib.awesome-ns
  (:refer-clojure :exclude [into conj vector vec])
  (:require [mikera.awesome.vectors :refer [into conj vector vec]]))

You could even include a flag in your tuple library to revert to core
functions, in order to benchmark against core without rewriting anything.
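A sketch of what that flag could look like; the namespace, the system property name, and the tuple-vector stand-in are all hypothetical:

```clojure
;; Hypothetical: the library's public vars delegate either to its own
;; tuple implementations or to clojure.core, based on a system
;; property read at load time.
(ns mikera.awesome.vectors
  (:refer-clojure :exclude [vector]))

(def ^:private use-tuples?
  (not= "false" (System/getProperty "awesome.vectors.tuples" "true")))

(defn- tuple-vector
  "Stand-in for the library's real tuple constructor."
  [& args]
  (clojure.core/vec args))

(def vector
  ;; Benchmarks can flip between the two implementations by setting
  ;; -Dawesome.vectors.tuples=false, without rewriting any code.
  (if use-tuples? tuple-vector clojure.core/vector))
```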

Also it should be remembered that JVMs are getting smarter (escape analysis
> allowing allocation of small objects on the stack etc.) and the Clojure
> compiler is also getting smarter (direct linking etc.). Tuples could
> potentially give further upside in these cases, so there is a broader
> context to be considered. My view is that the balance will shift more in
> favour of Tuples over time as the respective runtime components get smarter
> at taking advantage of type specialisation (happy to hear other views, of
> course).
>

Look at it this way: you're telling me that the JVM's JIT will get smarter
in the future. OK. Right there are two things that I'd have to take
somebody's word for. And that criticism doesn't even touch the hell of a
lot of maybes that you ask us to base decisions on.
It comes down to the simple fact that the only way to falsify your
hypothesis is to wait it out.

I agree checking in generated files is a bad idea, that was why I actually
> created hand-coded variants of Zach's original Tuple code as part of
> CLJ-1517. My reasoning for this was as follows:
> 1) You do in fact want some hand-coded differences (e.g. making the
> 2-Tuple work as a MapEntry, having a single immutable instance of Tuple0,
> etc.). It is annoying to handle these special cases in a code generator
>

OTOH, it's worth it, because the generator would make it easy to benchmark
many different permutations, maybe even to generate per-application
variants based on profiling.


> 2) Class generation at compile time is fiddly and would complicate the
> build / development process (definitely not a good thing!)
>

Committing a generated blob, with the generator being lost to tribal
knowledge, is also not a good trade-off, complexity-wise.


> 3) It is simpler to maintain a small, fixed number of concrete Java source
> files than it is to maintain a code-generator for the same (which may be
> less lines of code, but has much higher conceptual overhead)
>

That would be true if performance were a fixed point to optimize against.
Alas, as you argued yourself, performance is a moving target, with the
constraints changing even from machine to machine.
That means that any set of classes, however small, dedicated to optimizing
performance will also be a moving target. It certainly will be until we
agree on a sweet spot in our test suite.

> So, while the second point certainly would make a proposal more appealing,
>> the first one is mandatory due diligence. I'm really glad, that cognitect
>> acted as a gate-keeper there and saved us from microbenchmark-hell.
>>
> Really?
>

Yes


> I think this CLJ-1517 issue is an example of how *not* to do OSS
> development.
>

Let me tell you: I'm also active in the NixOS community, and while I love
the community as well as the system (almost as much as Clojure's :-), the
thing that annoys me the most is collaborators just hitting that "Merge"
button without proper evaluation, blindly relying on the CI server. It
works out, because it's still a functional system with deep immutability,
and I don't think there is much of an alternative with thousands of
upstream packages, but it still made me appreciate the zen-like pace of
Clojure, especially since the language is so extensible. Stuart Halloway's
remarks on that topic in the recent Cognicast episode struck a chord with
me there.

a) Substantial potential improvements (demonstrated with numerous
> benchmarks) sitting unresolved for well over a year with limited / very
> slow feedback
>

Well, the same is also true for some bugs, and I agree that Cognitect still
has bottleneck problems, even though they have gotten a lot better since
Alex Miller became the community spokesperson.
But Cognitect is only one half of the problem. The other is a community
where fantastic large-scale efforts to advance the language, like dunaj, go
largely undiscussed, while every second thread about a new URI-parsing
library runs to a double-digit post count.


> b) Motivated, skilled contributors initially being encouraged to work on
> this but find themselves getting ignored / annoyed with the process /
> confused by lack of communication (certainly myself and I suspect I also
> speak for Zach here)
>

I agree and I hope that Cognitect will find more ways to transfer Rich's
sense of direction into the community.


> c) Rich commits his own patch, to the surprise of contributors. I provided
> some (admittedly imperfect, but hopefully directionally correct) evidence
> that Zach's approach is better. Rich's patch subsequently gets reverted,
> but we are just back to square one.
>

Full ack. While Rich certainly has the privilege to commit however he
likes, it would be good to see him use the bug tracker for his own patches,
if only to appreciate how crappy the workflow for creating a ticket and
attaching a patch really is in JIRA ;-)


> d) Lack of clarity on process / requirements for ultimately getting a
> patch accepted. What benchmark of "real world usage" is actually wanted?
> I've seen little / no communication on this despite multiple requests.
>

I suspect that my requirements for a benchmark (mainly extensibility and
ease of testing many permutations) would happily coincide with advancing
the state of data-structure-based optimizations, and thus with said
requirements. (Yes, Rich, I'm putting words in your mouth; if you don't
like it, come over and discuss it ;-)

This is all meant as honest constructive criticism, I hope Cognitect can
> learn from it. If anyone from Cognitect wants more detailed feedback on how
> I think the process could be improved, happy to provide. To be clear I'm
> not angry about this, nor am I the kind of person to demand that my patches
> get accepted, I am just a little sad that my favourite language appears to
> be held back by the lack of a fully collaborative, open development process.
>

To be fair, such a process need not be provided by Rich, or even Cognitect.
Why not create a community-supported upstream fork of Clojure, where
promising patches from JIRA (maybe even PRs) are collected and distributed
as clojure.next, without any promise of eventual inclusion into Clojure
proper? A place where speculative work can prove itself and mature a bit
before needing to deal with Cognitect's processes. A bit like wine-staging
does.

I also have a related philosophical point about the "burden of proof" for
> accepting patches that may cause regressions. For functional / API changes
> the right standard is "beyond reasonable doubt" because any regression is a
> breaking change to user code and therefore usually unacceptable. For
> performance-related patches the standard should be "on the balance of
> probabilities" because regressions in less common cases are acceptable
> providing the overall performance impact (for the average real world user)
> is expected to be positive.
>

I hear you; yet I'd also weigh in implementation complexity, and hacks that
tend to stick once they're in, especially in the face of those "downstream
optimizations" that you mentioned.
Let's not forget that it's such a joy to work with the code base exactly
because Rich tends to hold back when faced with the option of committing
non-essential stuff.

Interested to hear your views Herwig  - it's always worth discussing ideas
> and alternatives, this can help inform the ultimate solution.  FWIW I think
> most of the wins for Tuples are for the very small arities (0-4), larger
> sizes than that are probably much more marginal in value.
>

Well, if you must know, among the permutations I'd try are:
- generating tuple sizes of powers of two: 1, 2, 4, 8, 16
  - with a separate length field
  - or with the unused slots holding a sentinel object
- replacing array-map with a tuple-map
- specializing IFn for being applied to tuples
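For instance, the sentinel variant could look roughly like this. Tuple4, unset, and tuple are illustrative only, not the CLJ-1517 classes, and only Counted is implemented to keep the sketch short:

```clojure
;; Hypothetical sketch: one uniformly-sized 4-slot class covers
;; arities 0-4; unused trailing slots hold a private sentinel.
(def ^:private unset (Object.))

(deftype Tuple4 [a b c d]
  clojure.lang.Counted
  (count [_]
    ;; First sentinel slot marks the logical end of the tuple.
    (int (cond (identical? a unset) 0
               (identical? b unset) 1
               (identical? c unset) 2
               (identical? d unset) 3
               :else 4))))

(defn tuple
  "Constructs a Tuple4, padding unused slots with the sentinel."
  ([]        (Tuple4. unset unset unset unset))
  ([a]       (Tuple4. a unset unset unset))
  ([a b]     (Tuple4. a b unset unset))
  ([a b c]   (Tuple4. a b c unset))
  ([a b c d] (Tuple4. a b c d)))
```

The trade-off against the separate-length-field variant is one fewer slot per instance versus an extra branch in count; exactly the kind of question the benchmark permutations should settle.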

Anyway, before we have some kind of extensible, permutable benchmark, I'd
rather not sink time into the implementation.

I agree macro-level benchmarks would be great to inform the debate, but
> just to repeat my point d) above - different contributors asked multiple
> times what sort of real world benchmark would be considered informative but
> these requests seem to have been ignored so far. Would be great if the core
> team could provide some guidance here (Alex? Rich?)
>

I'm not core, but if you accept (or criticize) my guidance for a good
real-world benchmark, we could already be two vocal contributors agreeing
on a benchmark; they can't ignore us forever ;-) It should:

- be reviewable
- be extensible
- solicit representative cases from all corners of the community
- support basic combinatorics on
  - the tested Clojure versions (i.e. various patchsets)
  - the set of routines within a single run (to monitor the effect of
patches with various PIC loadouts)
  - the JVMs used
- generate machine-readable reports
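As a strawman for the machine-readable part, each run could emit one EDN map per (patchset, JVM, case-set) combination; every key and value below is invented for illustration:

```clojure
;; Hypothetical report shape; all keys and figures are made up.
{:clojure {:version "1.8.0-RC4"
           :patches ["clj-1517-tuples.patch"]}
 :jvm     {:vendor "OpenJDK" :version "1.8.0_66" :opts ["-server"]}
 :cases   #{:json-decode :vector-conj :destructuring}
 :results {:json-decode {:mean-ns 183000.0 :std-dev-ns 4200.0}}}
```

EDN keeps the reports trivially diffable and consumable from Clojure itself.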

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en