Re: [Wikitech-l] Declaring methods final in classes

2019-09-01 Thread Krinkle
On Sun, Sep 1, 2019 at 12:40 PM Aryeh Gregor  wrote:

> On Fri, Aug 30, 2019 at 10:09 PM Krinkle  wrote:
> > For anything else, it doesn't really work in my experience because PHPUnit
> > won't actually provide a valid implementation of the interface. It returns
> > null for anything, which is usually not a valid implementation of the
> > contract the class under test depends on. [..]
>
> In my experience, classes that use a particular interface usually call
> only one or two methods from it, which can usually be replaced by
> returning fixed values or at most two or three lines of code. [..]

> It could sometimes not be worth it to mock for the reason you give,
> but in my experience that's very much the exception rather than the rule.
> [..]


> Consider a class like ReadOnlyMode. It has an ILoadBalancer injected.
> The only thing it uses it for is getReadOnlyReason(). When testing
> ReadOnlyMode, we want to test "What happens if the load balancer
> returns true? What about false?" A mock allows us to inject the
> required info with a few lines of code.


While I do prefer explicit dependencies, an extreme of anything is rarely
good. For this case, I'd agree and follow the same approach.
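
Spelled out, the whole arrangement can be just a few lines. A rough sketch
(the ReadOnlyMode constructor wiring here is an assumption, not the exact
signature):

    // Stub only the one method the class under test actually calls.
    $lb = $this->createMock( ILoadBalancer::class );
    $lb->method( 'getReadOnlyReason' )
        ->willReturn( 'Maintenance in progress' );

    // Hypothetical wiring; the real constructor may differ.
    $readOnlyMode = new ReadOnlyMode( $configuredReadOnlyMode, $lb );
    $this->assertTrue( $readOnlyMode->isReadOnly() );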

And I think there are plenty of other examples where I'd probably go for a
partial mock. Doing so feels like a compromise to me, as I have often seen
it associated with working around technical debt (e.g. the relationship
between those classes could perhaps have been designed better). But I think
it's fair to say there will be cases where the relationship is fine and it
still makes sense to write a partial mock to keep the test simple.

In addition to that, I do think it is important to also have at least one
test case at the integration level, verifying that the higher-level purpose
and use case of the feature work as intended. There you'd construct the full
dependency tree and mock or stub only the signal that is meant to travel
through the layers, thus ruling out any mistake in the (separate) unit tests
for ReadOnlyMode and LoadBalancer. You wouldn't need full coverage through
that layer; just test the communication between the two. The unit tests
would then aim for coverage of all the edge cases and variations.
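
Something like this, as a rough sketch (the MediaWikiTestCase base class and
the global service wiring are assumed here; the exact globals and getters
may differ):

    class ReadOnlyModeIntegrationTest extends MediaWikiTestCase {
        public function testReasonTravelsThroughLayers() {
            // Stub only the signal at the bottom of the stack...
            $this->setMwGlobals( 'wgReadOnly', 'DB maintenance' );
            // ...and observe it through the fully constructed service.
            $rom = MediaWikiServices::getInstance()->getReadOnlyMode();
            $this->assertSame( 'DB maintenance', $rom->getReason() );
        }
    }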

Thanks for writing this up.


> 5) In some cases you want to know that a certain method is being
> called with certain parameters, [..] Maybe the bug changed the
> WHERE clause to one that happens to select your row of test data
> despite being wrong.
>

Yep, this is important. I tend to go for lower-level observation instead of
injected assertions: for example, mocking IDatabase::select/insert etc. to
stash each query in an array, and then toward the end asserting that the
desired queries have occurred exactly as intended and nothing more. This is
similar to how one can memory-buffer a Logger instance.
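
Roughly like this, as a sketch (the IDatabase::select() signature is
abbreviated and the expected query is made up):

    $queries = [];
    $db = $this->createMock( IDatabase::class );
    $db->method( 'select' )->willReturnCallback(
        function ( $table, $vars, $conds ) use ( &$queries ) {
            // Stash each query instead of asserting on the spot.
            $queries[] = [ 'select', $table, $conds ];
            return new FakeResultWrapper( [] );
        }
    );

    // ... exercise the class under test, then:
    $this->assertSame(
        [ [ 'select', 'page', [ 'page_id' => 1 ] ] ],
        $queries
    );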

I like the higher confidence this gives, and I find it easier to write than
a PHPUnit mock where each function call is precisely counted and its
parameters validated. It is also more tolerant of internal details changing
over time.

Having said that, they both achieve the same goal. I suppose it's a matter
of taste. Certainly nothing wrong with the PHPUnit approach, I'd just not
do it myself as much. Thanks for reminding me of this.


> 6) [..]
> One way to help catch inadvertent performance regressions is to test
> that the means of ensuring performance are being used properly. For
> instance, if a method is supposed to first check a cache and only fall
> back to the database for a cache miss, we want to test that the
> database query is actually only issued in the event of a cache miss.


Yep, that works. I tend to do this differently, but it works, and it's
important that we make these assertions somewhere. I don't disagree :)

-- Timo
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Phabricator monthly statistics - 2019-08

2019-09-01 Thread יגאל חיטרון
On Sun, Sep 1, 2019 at 15:22 Andre Klapper <aklap...@wikimedia.org> wrote:

> Hi,
>
> On Sun, 2019-09-01 at 14:09 +0300, יגאל חיטרון wrote:
> > About "Unbreak now: 0": Can we get a better approximation, such as 0.1
> > or 0.45?
> > Igal (User: IKhitron)
>
> Which specific underlying problem would get solved for you by having
> even more precision? Also note the corresponding warning:
>
I'm just curious; it's the first time I've seen this number, and it surely
isn't an average of 3 days and minus 3 days.

> > > (How long tasks have been open, not how long they have had that
> > > priority)
>
Sure, I am aware.
Igal


> Cheers,
> andre
> --
> Andre Klapper (he/him) | Bugwrangler / Developer Advocate
> https://blogs.gnome.org/aklapper/

Re: [Wikitech-l] Phabricator monthly statistics - 2019-08

2019-09-01 Thread Andre Klapper
Hi,

On Sun, 2019-09-01 at 14:09 +0300, יגאל חיטרון wrote:
> About "Unbreak now: 0": Can we get a better approximation, such as 0.1
> or 0.45?
> Igal (User: IKhitron)

Which specific underlying problem would get solved for you by having
even more precision? Also note the corresponding warning:

> > (How long tasks have been open, not how long they have had that
> > priority)

Cheers,
andre
-- 
Andre Klapper (he/him) | Bugwrangler / Developer Advocate
https://blogs.gnome.org/aklapper/



Re: [Wikitech-l] Phabricator monthly statistics - 2019-08

2019-09-01 Thread יגאל חיטרון
Hi. Thank you for the statistics.
About "Unbreak now: 0": Can we get a better approximation, such as 0.1 or 0.45?
Igal (User: IKhitron)


On Sun, Sep 1, 2019 at 3:00 <communitymetr...@wikimedia.org> wrote:

>
> Hi Community Metrics team,
>
> This is your automatic monthly Phabricator statistics mail.
>
> Accounts created in (2019-08): 325
> Active Maniphest users (any activity) in (2019-08): 960
> Task authors in (2019-08): 514
> Users who have closed tasks in (2019-08): 285
>
> Projects which had at least one task moved from one column to another on
> their workboard in (2019-08): 285
>
> Tasks created in (2019-08): 2215
> Tasks closed in (2019-08): 1866
> Open and stalled tasks in total: 42538
>
> Median age in days of open tasks by priority:
>
> Unbreak now: 0
> Needs Triage: 520
> High: 936
> Normal: 1142
> Low: 1556
> Lowest: 1536
>
> (How long tasks have been open, not how long they have had that priority)
>
> Active Differential users (any activity) in (2019-08): 8
>
> To see the names of the most active task authors:
> * Go to https://wikimedia.biterg.io/
> * Choose "Phabricator > Overview" from the top bar
> * Adjust the time frame in the upper right corner to your needs
> * See the author names in the "Submitters" panel
>
> TODO: Numbers which refer to closed tasks might not be correct, as
> described in https://phabricator.wikimedia.org/T1003 .
>
> Yours sincerely,
> Fab Rick Aytor
>
> (via community_metrics.sh on phab1003 at Sun Sep  1 00:00:22 UTC 2019)

Re: [Wikitech-l] Declaring methods final in classes

2019-09-01 Thread Aryeh Gregor
On Fri, Aug 30, 2019 at 10:09 PM Krinkle  wrote:
> For anything else, it doesn't really work in my experience because PHPUnit
> won't actually provide a valid implementation of the interface. It returns
> null for anything, which is usually not a valid implementation of the
> contract the class under test depends on. As a result, you end up with
> hundreds of lines of phpunit-mockery to basically reimplement the
> dependency's general functionality from scratch, every time.

In my experience, classes that use a particular interface usually call
only one or two methods from it, which can usually be replaced by
returning fixed values or at most two or three lines of code. The
implementations of the stub methods can usually be very simple,
because you know exactly the context in which they'll be called and you
can hard-code the results. It could sometimes not be worth it to mock
for the reason you give, but in my experience that's very much the
exception rather than the rule.
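
For example, a stub that stands in for an entire permission interface can
be a single hard-coded line (PermissionManager here is just an
illustration):

    $pm = $this->createMock( PermissionManager::class );
    // The class under test asks only one question; hard-code the answer.
    $pm->method( 'userCan' )->willReturn( true );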

> I have seen
> this too many times and fail to see the added value for all that complexity
> and maintenance, compared to simply letting the actual class do its work.
> Practically speaking, what kind of problem would that test methodology
> catch? I believe others may have a valid answer to this, but I haven't yet
> seen it. Perhaps it's a source of problems I tend to prevent by other
> means. There's more than one way to catch most problems :)

Consider a class like ReadOnlyMode. It has an ILoadBalancer injected.
The only thing it uses it for is getReadOnlyReason(). When testing
ReadOnlyMode, we want to test "What happens if the load balancer
returns true? What about false?" A mock allows us to inject the
required info with a few lines of code. If you use a real load
balancer, you're creating a number of problems:

1) Where do you get the load balancer from? You probably don't want to
manually supply all the parameters. If you use MediaWikiServices,
you're dependent on global state, so other tests can interfere with
yours. (Unless you tear down the services for every test; but see the
next point.)

2) It's a waste of resources to run so much code just to generate
"true" or "false". One reason unit tests are so much faster than
integration tests is because they don't have to do all this setup of
the services.

3) I want an ILoadBalancer whose getReadOnlyReason() will return true
or false, so I can test how ReadOnlyMode will behave. How do I get a
real one to do that? There's no setReadOnlyReason() in ILoadBalancer.
Now I have to explore a concrete implementation and its dependencies
to figure out how to get it to return "true" or "false". Then anyone
reviewing my code has to do the same to verify that it's correct. This
is instead of a few lines of code that are contained within my test
file itself and are checkable for correctness via cursory inspection.

4) If the contract or implementation details of this totally separate
interface/class change, it will break my test. Now anyone who changes
LoadBalancer's implementation details risks breaking tests of classes
that depend on it. Worse, they might make the tests incorrect but
still pass. Perhaps I was setting readOnlyReason to true somehow, and
now that way no longer works, so it's really returning false -- but
maybe for some other reason my test now passes (perhaps other changes
made at the same time). Probably nobody will ever notice until an
actual bug arises.

5) In some cases you want to know that a certain method is being
called with certain parameters, but it's nontrivial to check
indirectly. For instance, suppose a certain method call is supposed to
update a row in the database. Even if you didn't mind giving it a real
database despite all the above reasons, how are you going to test the
correct row is being updated? You'd have to populate an initial row
(make sure the database gets reset afterwards!) and check that it was
updated properly. But what if a refactor introduced a bug in the WHERE
clause that happens not to stop the query from working in your case
but might cause problems on production data? Maybe the bug changed the
WHERE clause to one that happens to select your row of test data
despite being wrong. Trying to test every possible set of preexisting
data is not feasible. What's extremely simple is just testing that the
expected query is issued to a mock.
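
For instance, a sketch (the IDatabase::update() signature is abbreviated,
and the table and values are made up):

    $db = $this->createMock( IDatabase::class );
    $db->expects( $this->once() )
        ->method( 'update' )
        ->with(
            'user_properties',                          // table
            [ 'up_value' => 'new' ],                    // SET
            [ 'up_user' => 7, 'up_property' => 'skin' ] // exact WHERE clause
        );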

6) Performance is important to us. The direct way to test performance
is by measuring test run time, but this is unreliable because it's
heavily affected by load on the test servers. Also, lots of
performance issues only come up on large data sets, and we're probably
not going to run tests routinely on databases the size of Wikipedia's.
One way to help catch inadvertent performance regressions is to test
that the means of ensuring performance are being used properly. For
instance, if a method is supposed to first check a cache and only fall
back to the database for a cache miss, we want to test that the
database query is actually only issued in the event of a cache miss.
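
For example, a sketch (CachedThing is a hypothetical class under test, and
HashBagOStuff stands in as an in-memory cache):

    $db = $this->createMock( IDatabase::class );
    // The database may be hit exactly once: on the initial cache miss.
    $db->expects( $this->once() )
        ->method( 'selectRow' )
        ->willReturn( (object)[ 'value' => 42 ] );

    $service = new CachedThing( new HashBagOStuff(), $db );
    $service->get( 'key' ); // miss: falls back to the database
    $service->get( 'key' ); // hit: must not query the database again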