Re: [Wikimedia-l] Defining impact for Wikimedia programs, grants and evaluation
Nathan, 20/05/2014 02:59:
> Judging by meta I think Edward and the PE team have made a great start.
> But it's 2014 and the WMF is still at a starting point. Proposing that
> funding requests include SMART goals is not good enough, and I'd love to
> see Lila and the board empower Edward to do a lot more, and to insist on
> deep cooperation from entities receiving funds.

Cf. https://meta.wikimedia.org/wiki/Grants_talk:APG/Proposals/2013-2014_round2/Wikimedia_Foundation/Proposal_form#Metrics_for_the_infrastructure_programs.3F

Nemo
Re: [Wikimedia-l] Defining impact for Wikimedia programs, grants and evaluation
Thank you, Nathan, for your comments and suggestions. Each of the points you have raised is very much on our radar, but we are still in a place in which we must make strategic choices in moving toward the end goal of knowing which programs and activities have high potential for impact, which have low or high costs for impact, and how to value achieved impact in order to more clearly identify both successes and failures. You are right that this will take deep cooperation from those who design, implement, and actually evaluate program work. We have a sense of growing cooperation and collaboration on that front and are hopeful that our team's integration into Grantmaking will only work to strengthen those connections and supports. Responding to some of your discussion points below:

> The logic models are useful tools for thinking through and explaining to
> an audience the structure and goals of a program, but they are vulnerable
> to the same fuzziness that exists without the tools. They are also not
> well oriented to measuring performance, which is really the crux of the
> problem and of Pine's question. Let's look at the logic model you've used
> as an example from the WikiWomen's edit-a-thon[1]. Their logic model is
> great at explaining the goals of the program. This is a major improvement,
> particularly if it is standardized across all WMF-funded projects. But
> does it help us answer the question about impact? Using the Boulmetis /
> Dutwin model of analysis, we can get clear information about program
> efficiency and program effectiveness. But we don't get anywhere on impact,
> despite the use of the logic model.

=The place for logic models=

To be clear, we also began with community-derived logic models for each of the programs we initiated reporting on this past six months; however, they are also in need of some attention and better integration in our portal resources: https://meta.wikimedia.org/wiki/Programs:Share_Space/Overview_Logic_Model

Unfortunately, input from many program leaders, each with slight variations in theory of change, makes for a crowded format compared to the basic community model of a WikiWomen's edit-a-thon that has been shared under community contributions on the resource page.[1] (We are working to clean these up and include them on the resource page also.) However, we did use these initial logic models as our starting point in determining which basic metrics to pilot, which areas have measurement gaps, and what guidance to give on evaluation measures for the programs we mapped. Now, after piloting those measures in the beta reports[2], we are asking for community input at: https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Parlor/Dialogue

> Judging by meta I think Edward and the PE team have made a great start.
> But it's 2014 and the WMF is still at a starting point. Proposing that
> funding requests include SMART goals is not good enough, and I'd love to
> see Lila and the board empower Edward to do a lot more, and to insist on
> deep cooperation from entities receiving funds.

At some point in the future we can move this discussion from "does anything anyone does have any impact?" to, knowing that we *can* have an impact, "how much impact is enough to justify funding?"

=SMART Targets and Collaboration across Grantmaking=

Our team is working in collaboration with grantmaking programs to better guide the expectations and resources for evaluation, and this community dialogue will also help to guide that.
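To make this concrete, a minimal sketch of what a SMART target with its associated metric and timeline might look like; the program, numbers, and dates below are invented for illustration and are not drawn from any actual grant application or portal guidance:

# A hypothetical SMART target for an edit-a-thon grant, written so the
# metric, baseline, and timeline are explicit. All names, numbers, and
# dates here are invented for illustration, not from any real application.
from dataclasses import dataclass
from datetime import date

@dataclass
class SmartTarget:
    specific: str     # S: what exactly will change
    metric: str       # M: how it will be measured
    target: float     # A: an achievable, agreed level...
    baseline: float   # ...relative to a measured starting point
    relevant_to: str  # R: the impact goal this target feeds
    deadline: date    # T: time-bound

target = SmartTarget(
    specific="Grow surviving new editors recruited at our edit-a-thons",
    metric="recruits still active (5+ edits/month) 3 months after the event",
    target=10,
    baseline=4,
    relevant_to="editor retention",
    deadline=date(2014, 12, 31),
)

The point of writing a target this way is that a vague goal ("attract new editors") cannot fail or succeed, while a target with a metric, baseline, and deadline can.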
Still, this is not a top-down approach, and we must be reasonable in allowing the time to explore programs and target measures that are sound and valid. We are still very much in the process of drilling down while at the same time moving forward with the clearest metrics we have identified.

I appreciate also that SMART goals themselves are not enough; still, they are one of many first steps in advancing systematic program evaluation and design across Wikimedia programs and activities, and there is much collaborative planning going on within Grantmaking to empower the initiative further. SMART targets must be aligned with relevant impact targets and must actually be SMART, with associated metrics and timelines. We have also added guidance on writing SMART targets to our portal resources; however, inclusion of SMART targets is still highly variable across grant applications. As this guidance was only added this last round, that is not too surprising, and we expect it to improve, as will all of the evaluation activities and strategies that are still relatively new to the process.

I would like to encourage you (and anyone else interested in this discussion) to view our question prompts on our dialogue page [3] as well. If you do not mind, I would also like to migrate your comments to the appropriate discussion spaces there so that your feedback is also captured in our process. Thank you for this feedback; please let us know if we can answer anything further, and feel welcome to contribute to the dialogue.
Re: [Wikimedia-l] Defining impact for Wikimedia programs, grants and evaluation
Hi Pine,

Thank you for bringing this page to our attention and for raising these interesting questions. I would have to agree that the “Program evaluation basics” page is not well designed and should be revisited. We are actually going to be redesigning the entire evaluation portal soon, and this page will likely be revised and included in the new design in some way. We are also continuing to build tools and learning resources (like the learning modules [1]) on evaluation to help explain some of these concepts.

I also agree that we need to think more about how we can define “impact” within the context of Wikimedia. Before we can reach a final “impact”, there are different layers of success in terms of outputs and short-, intermediate-, and long-term outcomes that help to measure success along the way. We have been working on this approach to evaluation: we have developed resources for mapping a program’s theory of change in order to identify measurable outcomes, both near and far. Specifically, logic models are a useful tool for drawing out the steps needed to reach long-term impact and for identifying more immediate indicators for evaluation; there is a resource page within the Evaluation portal on logic models [2], and I am working on a learning module that will guide anyone through what a logic model is and how to create one.

As for the term “impact”, it is very jargonistic and can be used in many ways, which can be confusing. Since we began last year, we have been working to build a growing glossary of shared language around evaluation [3]. That glossary page is more current and inclusive than the original “Program evaluation basics” page you linked to. Please feel free to discuss this and any other of those terms and definitions there on the portal.

Coincidentally, we are asking the community to provide feedback on some of the initial evaluation capacity-building efforts our team has engaged in thus far. We’d like to hear feedback on the metrics and methods used so we can continue towards a shared understanding of Wikimedia programs and their impacts. We invite you (or anyone!) to read about the Community Dialogue [4] and join in the discussion on the Evaluation portal Parlor [5].

As always, I’m available for any questions!

Best,
Edward

[1] https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Learning_modules
[2] https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Library/Logic_models
[3] https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Library/Glossary
[4] https://meta.wikimedia.org/wiki/Programs:Evaluation_portal/Parlor/Dialogue
[5] https://meta.wikimedia.org/wiki/Programs_talk:Evaluation_portal/Parlor/Dialogue
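As a minimal sketch of the logic-model idea described above, assuming the conventional inputs -> activities -> outputs -> outcomes -> impact chain: a logic model for a hypothetical edit-a-thon might be laid out as follows. The entries are invented for illustration and are not taken from the resource page.

# A minimal logic model for a hypothetical edit-a-thon, using the
# conventional inputs -> activities -> outputs -> outcomes -> impact chain.
# The entries are invented for illustration, not taken from the portal.
logic_model = {
    "inputs": ["volunteer organizers", "venue", "small grant"],
    "activities": ["recruit participants", "run a 4-hour editing event"],
    "outputs": ["25 attendees", "40 articles created or improved"],
    "outcomes": [
        "short-term: attendees can edit on their own",
        "intermediate: some attendees still editing after 3 months",
        "long-term: more active editors on underrepresented topics",
    ],
    "impact": "lasting improvement in content coverage and community diversity",
}

# Outputs and short-term outcomes are countable almost immediately;
# the far end of the chain is what is hard to measure and attribute.
for stage, value in logic_model.items():
    print(stage, "->", value)

The design point is that each column of the model suggests something measurable well before long-term impact can be observed, which is exactly the role of the "more immediate indicators" mentioned above.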
On Sun, May 18, 2014 at 6:23 PM, ENWP Pine deyntest...@hotmail.com wrote:
> Hi,
>
> I spent a few minutes searching on Meta for how impact is defined. What
> is the WMF definition? Some examples of places where impact is used:
>
> * https://meta.wikimedia.org/wiki/Grants:APG/FDC_portal/Impact_report_form
> * https://meta.wikimedia.org/wiki/Grants:APG/Impact_report_form_Q%26A
> * https://meta.wikimedia.org/wiki/Grants:IEG/Learning/Round_1_2013/Impact
> * https://blog.wikimedia.org/2014/05/02/beginning-understand-what-works-measuring-impact-programs/
> * https://meta.wikimedia.org/wiki/Program_evaluation_basics:_efficiency,_effectiveness_and_impact
>
> I am not fond of the Boulmetis / Dutwin definition used in that last
> reference because short-term effects can be important and much easier to
> measure than long-term effects. For example, an administrator protecting
> a page can have the short-term effect of preventing editing and
> preventing an edit war, and the long-term effects of that can be
> impossible to know, such as whether preventing an edit war prevented the
> situation from escalating to an Arbcom case with the imposition of
> long-term blocks, and also whether preventing editing prevented important
> information from being added to the page by an occasional IP editor.
>
> I might suggest a rewrite of that entire page on program evaluation
> basics to make it simple. Right now it's a wall of text that's difficult
> to follow and, I feel, at least partly wrong. I think that Edward Galvez
> is working on some of these issues and I would be happy to have him or
> someone else in Evaluation thoughtfully redesign and rewrite that page to
> make it easy to follow for everyone, including non-native English
> speakers. If I have a hard time with that page, you can imagine how
> difficult it is for someone who only understands English at an
> intermediate level. I would like to start with having a clear and simple
> definition of impact that makes sense in Wikimedia contexts, and some
> examples that are easy to follow.
>
> Thanks,
> Pine
Re: [Wikimedia-l] Defining impact for Wikimedia programs, grants and evaluation
On Mon, May 19, 2014 at 7:06 PM, Edward Galvez egal...@wikimedia.org wrote:
> [...]

Interesting exchange, thanks guys. This particular topic needs a great deal of attention - not just because of how crucial it is to measuring success, but also because it has traditionally been both difficult and sensitive. Sue and others have raised questions over the years about how we determine if the various programs run by the WMF and chapters are useful or not, and if so to what degree. The WMF and the Program Evaluation team are just beginning to take steps to answer these questions, and in my opinion much more needs to be invested in this effort. I would like to see compliance with program evaluation standards integrated into every grant of funding drawing on donor funds.
To smooth the way for this increased level of scrutiny, each grant of any type should include an earmark for just this purpose. Why? Because ultimately we are where we've always been -- with clear knowledge of what impacts matter but difficulty in working out whether anything any movement partner does or has done helps the bottom line. Tens of millions of dollars a year get spent, but most non-core spending would be hard to justify using strict measures of impact. That doesn't mean it doesn't *have* impact, just that because we don't forcefully ask the questions, we don't and can't get the answers.

Every project, chapter, grant, initiative and expenditure should be scrutinized with basically the same few questions:

1) Does it add to the quantity and/or quality of content?
2) Does it add readers, either by increasing interest or improving accessibility?
3) Does it add editors?

Any major expense, grant request or new initiative should be measured by the answers to these questions, and every answer should be quantifiable to some degree (a rough sketch of what that could look like follows at the end of this message). I would suggest that if the answer to all three is no for any non-core expense, heavy scrutiny should be applied to ensure funds aren't being wasted. The FDC does this to some extent now, although it asks the same questions much more vaguely and in terms of strategic alignment.

The logic models are useful tools for thinking through and explaining to an audience the structure and goals of a program, but they are vulnerable to the same fuzziness that exists without the tools. They are also not well oriented to measuring performance, which is really the crux of the problem and of Pine's question. Let's look at the logic model you've used as an example from the WikiWomen's edit-a-thon[1]. Their logic model is great at explaining the goals of the program. This is a major improvement, particularly if it is standardized across all WMF-funded projects. But does it help us answer the question about impact? Using the Boulmetis / Dutwin model of analysis, we can get clear information about program efficiency and program effectiveness. But we don't get anywhere on impact, despite the use of the logic model.

Judging by meta I think Edward and the PE team have made a great start. But it's 2014 and the WMF is still at a starting point. Proposing that funding requests include SMART goals is not good enough, and I'd love to see Lila and the board empower Edward to do a lot more, and to insist on deep cooperation from entities receiving funds.
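The sketch promised above: one way the three questions could be made "quantifiable to some degree". The field names, thresholds, and example numbers are invented; they are not WMF metrics, only an illustration of the scrutiny rule under those assumptions.

# Illustrative scoring of a program against the three questions above.
# Field names, thresholds, and example numbers are invented, not WMF
# metrics; this only sketches "quantifiable to some degree".
from dataclasses import dataclass

@dataclass
class ProgramResult:
    cost_usd: float
    content_added_bytes: int   # Q1: content quantity (quality needs review)
    new_readers: int           # Q2: readership gained
    new_active_editors: int    # Q3: editors gained

def needs_heavy_scrutiny(r: ProgramResult) -> bool:
    # "No" on all three questions -> flag the expense for heavy scrutiny.
    return (r.content_added_bytes == 0
            and r.new_readers == 0
            and r.new_active_editors == 0)

example = ProgramResult(cost_usd=5000, content_added_bytes=120_000,
                        new_readers=0, new_active_editors=3)
print(needs_heavy_scrutiny(example))                  # False
print(example.cost_usd / example.new_active_editors)  # cost per new editor

Even a crude per-program record like this would let the FDC compare, say, cost per new active editor across grants, instead of asking the same questions vaguely in terms of strategic alignment.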