@Andrey Will you open a PR to add this to the code style?

On Mon, Aug 19, 2019 at 11:51 AM Andrey Zagrebin <and...@ververica.com>
wrote:

> Hi All,
>
> It looks like this proposal has an approval and we can conclude this
> discussion.
> Additionally, I agree with Piotr we should really force the proven good
> reasoning for setting the capacity to avoid confusion, redundancy and other
> already mentioned things while reading and maintaining the code.
> Ideally the need of setting the capacity should be either immediately clear
> (e.g. perf etc) or explained in comments if it is non-trivial.
> Although, it can easily enter a grey zone, so I would not demand strictly
> performance measurement proof e.g. if the size is known and it is "per
> record" code.
> At the end of the day it is a decision of the code developer and reviewer.
>
> The conclusion is then:
> Set the initial capacity only if there is a good proven reason to do it.
> Otherwise do not clutter the code with it.
>
> Best,
> Andrey
>
> On Thu, Aug 1, 2019 at 5:10 PM Piotr Nowojski <pi...@ververica.com> wrote:
>
> > Hi,
> >
> > > - a bit more code, increases maintenance burden.
> >
> > I think there is even more to that. It’s almost like a code duplication,
> > albeit expressed in very different way, with all of the drawbacks of
> > duplicated code: initial capacity can drift out of sync, causing
> confusion.
> > Also it’s not “a bit more code”, it might be non trivial
> > reasoning/calculation how to set the initial value. Whenever we change
> > something/refactor the code, "maintenance burden” will mostly come from
> > that.
> >
> > Also I think this just usually falls under a premature optimisation rule.
> >
> > Besides:
> >
> > > The conclusion is the following at the moment:
> > > Only set the initial capacity if you have a good idea about the
> expected
> > size.
> >
> > I would add a clause to set the initial capacity “only for good proven
> > reasons”. It’s not about whether we can set it, but whether it makes
> sense
> > to do so (to avoid the before mentioned "maintenance burden”).
> >
> > Piotrek
> >
> > > On 1 Aug 2019, at 14:41, Xintong Song <tonysong...@gmail.com> wrote:
> > >
> > > +1 on setting initial capacity only when have good expectation on the
> > > collection size.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Thu, Aug 1, 2019 at 2:32 PM Andrey Zagrebin <and...@ververica.com>
> > wrote:
> > >
> > >> Hi all,
> > >>
> > >> As you probably already noticed, Stephan has triggered a discussion
> > thread
> > >> about code style guide for Flink [1]. Recently we were discussing
> > >> internally some smaller concerns and I would like start separate
> threads
> > >> for them.
> > >>
> > >> This thread is about creating collections always with initial
> capacity.
> > As
> > >> you might have seen, some parts of our code base always initialise
> > >> collections with some non-default capacity. You can even activate a
> > check
> > >> in IntelliJ Idea that can monitor and highlight creation of collection
> > >> without initial capacity.
> > >>
> > >> Pros:
> > >> - performance gain if there is a good reasoning about initial capacity
> > >> - the capacity is always deterministic and does not depend on any
> > changes
> > >> of its default value in Java
> > >> - easy to follow: always initialise, has IDE support for detection
> > >>
> > >> Cons (for initialising w/o good reasoning):
> > >> - We are trying to outsmart JVM. When there is no good reasoning about
> > >> initial capacity, we can rely on JVM default value.
> > >> - It is even confusing e.g. for hash maps as the real size depends on
> > the
> > >> load factor.
> > >> - It would only add minor performance gain.
> > >> - a bit more code, increases maintenance burden.
> > >>
> > >> The conclusion is the following at the moment:
> > >> Only set the initial capacity if you have a good idea about the
> expected
> > >> size.
> > >>
> > >> Please, feel free to share you thoughts.
> > >>
> > >> Best,
> > >> Andrey
> > >>
> > >> [1]
> > >>
> > >>
> >
> http://mail-archives.apache.org/mod_mbox/flink-dev/201906.mbox/%3ced91df4b-7cab-4547-a430-85bc710fd...@apache.org%3E
> > >>
> >
> >
>

Reply via email to