Process: if we all agree on some text, should we then email it to semantic-web@w3? I'd prefer that it not be me sending it, to show it comes from the PMC.

Comments inline.

On 23/10/17 13:13, Osma Suominen wrote:
+1 for the edited text, but "A fix has been available for some time" looks a bit too vague for my taste. It could be understood as "we fixed this last week so you shouldn't complain about it". So maybe add a version number and date. Also mentioning the JIRA ticket numbers would make this statement more transparent.

I'm neutral on that - I didn't want the message to be too much about the specific paper. It's an MSc piece of work, by a researcher in training.

I am annoyed by the revision of the dates to July 2017 (the work was done in 2015, but that's not the first thing you come across). That in itself is poor.

As a former PhD student who has done some "benchmarking" style papers (albeit about SKOS dataset quality, not software) I can somewhat sympathize with the researchers' point of view here. If you've come up with a new benchmark and spent a non-trivial amount of time testing various software packages, you're likely to find a number of problems in them. Reporting back all of them through various channels can seem a bit of an extra burden - and maybe you don't want to explain your benchmark in a public forum such as an issue tracker or mailing list before you've published a paper about it, in case you're afraid of someone stealing your ideas.

We don't say "report in public". That's for discussion.

This work isn't new - it's 2015 work, reworked. So even by the secrecy argument, they could have done something.

I don't see why it can't be public - that's the nature of open source (AKA free software). Researchers keeping secrets is like the ambush games at the end of SPARQL 1.1: exploitation for personal publicity.

As I said above, the revision of the dates to July 2017 is poor in itself - it's marketing FUD and deserves a public correction from them.

I think it would make sense for workshops like this to require that the tested tools are recent enough - say, no more than three or six months behind the latest official release. This wouldn't enforce contacting the authors (which can be problematic, e.g. for the above reasons) but would at least make the results more relevant for comparisons and prevent the situation we had here, where the problem reported in the paper was apparently fixed some time ago, independently of the research.

-Osma


Andy Seaborne wrote on 21.10.2017 at 18:36:

I did some editing on it to emphasize the code of practice, and move away from the incident:

---------------------------


On the Responsible Disclosure of Benchmarking Results


The Apache Jena PMC would like to suggest to the benchmarking community
that they adopt a code of practice that will improve benchmarking
semantic web systems by focusing on the contribution to the literature
and away from transient details.

The PMC was recently made aware of a paper scheduled to be presented at
the Workshop on Benchmarking Linked Data (BLINK) at ISWC 2017. The paper
in question provides a new benchmark for property paths.

We are disappointed that the authors identified a deficiency in our
project's implementation but made no attempt to contact us about it.
Indeed, our public JIRA already contains independently reported tickets
that are relevant, and a fix has been available for some time.

We are by no means the only project affected; the paper identified other
correctness and performance issues across several projects.

We wish to raise a general issue we and others in our community perceive
across this field of research.

Investigation and analysis of algorithms and designs should focus on
fundamentals, not on transient engineering details.

As an open source project maintained by volunteers, we rely upon the
wider community, both in industry and academia, to bring issues to our
attention in a timely fashion.

If this were a security flaw, the expected standard practice would be to
responsibly disclose the issue to the affected projects and work with
those projects to address it.

Many of us have a background in scientific research and appreciate that
research often happens on tight timelines, but simply sending a short
note to a project about an identified issue is not unreasonable.

We would like to suggest to the benchmarking community that they adopt a
code of good practice that encourages feedback leading to high-quality
results for the long term benefit of other researchers.

If you could raise the topic of responsible disclosure of issues
identified during the course of your workshop that would be much
appreciated.

Regards,

The Apache Jena PMC

