Might I suggest the following two additions:

For item (1), I suggest adding something like this to the end of it: "Consider attaching this output/data as a txt file if it is too large, or consider using one of the built-in data sets (as listed e.g. by data()) if they suffice to illustrate the problem." I find it rather distracting to have to wade through pages and pages of dput output before I can read the questions to be answered. Often those questions can be answered without that output at all, in which case having it pasted straight into the text only gets in the way. Failing that, perhaps we can at least convince posters to append the output to the end of the message instead of putting it in the middle.
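To make the suggestion concrete, here is a rough sketch of both options. It assumes the built-in mtcars data set is enough to show the problem; the object names and the temporary file are only placeholders for illustration.

```r
# Option 1: use a built-in data set (data() lists what is available).
# Keeping a few rows and columns keeps the post short.
small <- head(mtcars[, c("mpg", "cyl", "wt")], 5)

# Option 2: write the dput() output to a text file and attach that file
# instead of pasting it into the message body.
# tempfile() is used here only so the sketch does not clutter the
# working directory; a poster would pick a real file name.
attachment <- tempfile(fileext = ".txt")
dput(small, file = attachment)

# A reader recreates the object from the attachment with dget().
restored <- dget(attachment)
print(all.equal(small, restored))
```

Either way the question itself stays at the top of the message, and the data is still one command away for anyone who wants to run the code.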

With regard to sessionInfo(), I would often consider it equally important to have the output of ls(), to make sure that functions etc. are not masked by user-defined global variables. But perhaps I'm alone in that? At least mention clearly that the code provided should be reproducible in a clean R workspace, or something like that?
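As an illustration of the kind of check I have in mind, here is a small sketch. The helper masked_names() is hypothetical (it is not part of R), and the scratch environment stands in for a poster's cluttered workspace.

```r
# Hypothetical helper: which objects in `env` share a name with an
# object in one of the attached packages (and so may mask it)?
masked_names <- function(env = globalenv()) {
  pkgs <- grep("^package:", search(), value = TRUE)
  nms <- ls(env)
  is_masking <- vapply(nms, function(nm) {
    any(vapply(pkgs, function(p) {
      exists(nm, envir = as.environment(p), inherits = FALSE)
    }, logical(1)))
  }, logical(1))
  nms[is_masking]
}

# Scratch environment standing in for a user's workspace.
demo_env <- new.env()
assign("mean", function(x) "not the real mean", envir = demo_env)
assign("my_data", 1:3, envir = demo_env)

print(ls(demo_env))            # everything in the workspace
print(masked_names(demo_env))  # only the names that clash with a package
```

Calling masked_names() with no argument checks the global workspace. Starting R with `R --vanilla` and pasting the code there is another way to confirm that it runs in a clean session.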

I think creating this summary section for the posting guide is a great idea. The posting guide, though chock full of useful information on how to write a proper post, ends up having just way too much information, with the result, as we have seen, that people do not follow it.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

On Jun 7, 2008, at 10:48 AM, hadley wickham wrote:

Here's my attempt at making it a little more friendly:

Removed self-contained - implied by reproducible
Used slightly less formal language (and you instead of the questioner)
Fixed a couple of spelling mistakes
Removed references to testing framework - I don't think that that term
needs to be introduced

-------

For most questions, the main problem isn't answering the question, but
understanding exactly what the question is, reproducing the problem and
checking the answer. To make it easy for others to help you, you should
provide:

(1) reproducible, minimal code, and the data needed to run it. That means
    others can copy and paste from your email and see the same output
    that you did. An easy way to include data in an email is to include
    the output of dput(mydata).

(2) comments/explanations of what the code is supposed to do, and

(3) the version of R and the packages that you used, easily produced by
    sessionInfo().
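
As a rough illustration of items (1) and (3), a post might be built along these lines. The data frame mydata and its contents are made up for this sketch; the capture.output()/parse() step only simulates a reader pasting the dput() output into their own session.

```r
# Made-up data that would accompany the question.
mydata <- data.frame(id = 1:3,
                     group = c("a", "b", "a"),
                     value = c(2.5, NA, 4))

# dput() prints R code that rebuilds the object; that printed text is
# what gets pasted into the email.
dput(mydata)

# What a reader effectively does: run the pasted text to rebuild the data.
pasted <- paste(capture.output(dput(mydata)), collapse = "\n")
rebuilt <- eval(parse(text = pasted))
print(all.equal(mydata, rebuilt))

# Item (3): R version, platform and loaded packages.
print(sessionInfo())
```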

Without reproducible code, others have to spend a lot of time
recreating the problem so that they can provide an answer that works.
Do NOT assume the problem is so simple that it is not necessary.

This can seem like a lot of work, but it often pays off by revealing the solution without having to ask anyone else. Even if it doesn't, your effort
shows the list that you have tried to solve it yourself.

It's also worthwhile spending some time writing a good subject line that succinctly summarises your problem. This also helps others trying to solve the same problem in the future as they can more easily locate relevant messages.

Hadley

On Sat, Jun 7, 2008 at 8:38 AM, Gabor Grothendieck
<[EMAIL PROTECTED]> wrote:
Here is a second version of the summary.  It's been rearranged to
place the most important info at the top.  Also shortened it a bit.

It still needs links to example posts, as suggested.  Anyone?

Summary

Surprisingly, the main problem for responders is not to answer the
posted questions but to quickly figure out what the question is, reproduce
it in their own R session and test their answer.

Test Framework.  To facilitate that, provide a test framework of:

 (1) reproducible self-contained minimal code and data.  That means
     responders can copy it from the questioner's post and paste it
     into their session to see the same output without having to
     enter even one R command.
     NB. dput(mydata) produces mydata in reproducible form.
 (2) comments/explanations of what the code is intended to produce and
 (3) versions of all software used, e.g. sessionInfo().

Without self-contained reproducible code the responder must not only
understand the question but must also create a test framework and that
typically takes more time than answering the question!  It's not fair
to ask the responder to provide all that on top of answering the
question.  Do NOT assume the problem is so simple that it is not
necessary.

Effort. The effort taken to reduce the problem to its essentials and
produce a test framework often solves the problem, avoiding the need
for a post in the first place.  At the least, it shows that the
questioner tried to solve it themselves.

Subscribers. The questioner should ensure that the thread is complete
and that it has an appropriate Subject.  The purpose of the post is
not only to help the questioner but also the other list subscribers
and those later searching the archives.



On Fri, Jun 6, 2008 at 1:30 PM, Gabor Grothendieck
<[EMAIL PROTECTED]> wrote:
People read the posting guide yet they are still unable to create an acceptable
post. e.g.
https://stat.ethz.ch/pipermail/r-help/2008-June/164092.html

I think the problem is that the guide is not clear or concise enough.
I suggest we add a summary at the beginning which gets to the heart
of what a poster is expected to provide:

Summary

To maximize your chance of getting a response when posting, provide (1)
commented, (2) minimal, (3) self-contained and (4) reproducible code.
(This one-line summary also appears at the end of each message to
r-help.)

"Self-contained" and "reproducible" mean that a responder can copy the
questioner's code to the clipboard, paste it into their R session and
see the same problem that you as the questioner see. Note that
dput(mydata) will display mydata in a reproducible way.
Self-contained and reproducible are needed because:
(1) Self-Effort. It shows that the questioner tried to solve the
problem by themselves first.
(2) Test framework. Often the responder needs to play with the code a
bit in order to respond or at least to give the best answer.  They can't
do that without a test framework that includes the data and the code to
run it, and it's not fair to ask them to not only answer the question
but also to come up with test data and to complete incomplete code.
(3) Archives. Questions and answers go into the archives, so they are
not only for the benefit of the questioner but also for the benefit of
all future searchers of the archive.  That means that it's not finished
when you have solved the problem for yourself.  You still need to ensure
that the thread has a complete solution. (For that reason it's also
important to give a meaningful subject to each post.)

"Commented" and "minimal" also reduce the time it takes to understand
the problem. Don't just dump your code as is into the message, since you
are just wasting your own time. It's not likely anyone will answer a
message if the questioner has not taken the time to reduce it to its
essential elements.  Surprisingly, quite often it is understanding what
the problem is that takes the responder most of the time, not solving
the problem. Once the question is actually understood, it's often quite
fast to answer. Thus, in addition to posting it in a minimal form,
comment on it sufficiently so that the responder knows what the code
does and is intended to produce. It may be obvious to the questioner
who is embroiled in the problem, but that does not mean it's obvious to
others.

Introduction

.... rest of posting guide ...


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
