Re: [Server-devel] [Sugar-devel] The quest for data
On 12.1.2014 10:12, Sameer Verma wrote:
> Has anyone created the wiki page as yet?

Just created the wiki page: http://wiki.sugarlabs.org/go/Education_Team/Quest_for_Data

Please help me expand it as you gather feedback from other deployments.

Cheers,
Martin
Re: [Server-devel] [Sugar-devel] The quest for data
Just to add my $.02: I agree with Walter and Claudia's approach in this paper. Making the specifics of learning visible to teachers and students, and doing the development from this perspective, is, I think, the best way to go.

Thanks,
Gerald

On Sun, Jan 12, 2014 at 9:33 AM, Walter Bender walter.ben...@gmail.com wrote:
> On Fri, Jan 10, 2014 at 3:37 PM, Sameer Verma sve...@sfsu.edu wrote:
>> [...]
>>
>> In fact, my guess would be that what the teachers and principal want to see at the school will be different from what OLE Nepal and the government would want to see, with interesting overlaps.
>
> You left out one important constituent: the learner. Ultimately we are responsible for making learning visible to the learner. Claudia and I touched on this topic in the attached paper.
>
> Just to place all my cards on the table, as much as I hate to suggest we head down this route, I think we really need to instrument activities themselves (and build analyses of activity output) if we want to provide meaningful statistics about learning. We've done some of this with Turtle Blocks, even capturing the mistakes the learner makes along the way. We are lacking in decent visualizations of these data, however. Meanwhile, I remain convinced that the portfolio is our best tool.
>
> regards.
> -walter
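To make the instrumentation idea concrete, here is a minimal sketch of a Sugar activity that logs learner events and saves them with its Journal entry. It assumes the sugar3 activity API; the event names, the 'session_events' metadata key, and the JSON layout are illustrative inventions, not an existing convention.

import json
import time

from sugar3.activity import activity


class InstrumentedActivity(activity.Activity):
    """Toy activity skeleton that logs learner events for later analysis."""

    def __init__(self, handle):
        activity.Activity.__init__(self, handle)
        self._events = []  # in-memory event log for this session

    def log_event(self, kind, **detail):
        # Call this from the activity's own logic, e.g.
        # self.log_event('mistake', expected=7, given=5)
        self._events.append({'t': time.time(), 'kind': kind,
                             'detail': detail})

    def write_file(self, file_path):
        # Invoked when the instance is saved to the Journal. A real
        # activity would also write its document to file_path; here we
        # only stash the event log in the Journal entry's metadata,
        # where later analysis (or the learner) can find it.
        self.metadata['session_events'] = json.dumps(self._events)
        with open(file_path, 'w') as f:
            f.write('')  # placeholder document body

One attraction of this approach is that the log travels with the Journal entry through the existing backup path to the schoolserver, rather than requiring a new collection channel.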
Re: [Server-devel] [Sugar-devel] The quest for data
On Sun, Jan 12, 2014 at 3:32 PM, Sameer Verma sve...@sfsu.edu wrote:
> On Sun, Jan 12, 2014 at 6:33 AM, Walter Bender walter.ben...@gmail.com wrote:
>> [...]
>>
>> You left out one important constituent: the learner. Ultimately we are responsible for making learning visible to the learner. Claudia and I touched on this topic in the attached paper.
>
> Thanks for the paper. While we did point to the Portfolio and Analyze Journal activities in our session at the OLPC SF Summit in 2013, I didn't include them in the scope of the blog post. I'll go back and update it when I get a chance.
>
>> Just to place all my cards on the table, as much as I hate to suggest we head down this route, I think we really need to instrument activities themselves (and build analyses of activity output) if we want to provide meaningful statistics about learning. We've done some of this with Turtle Blocks, even capturing the mistakes the learner makes along the way. We are lacking in decent visualizations of these data, however.
>
> I haven't had a chance to read the paper in depth (which I intend to do this afternoon), but how much of this approach would be shareable across activities? Or would the depth of analysis be on a per-activity basis? If the latter, then I'd imagine it would be simpler for something like the Moon activity than for the TurtleBlocks activity.
>
>> Meanwhile, I remain convinced that the portfolio is our best tool.
>
> I think the approaches differ in scope and purpose. In the RFPs I've been involved in, the funding agencies and/or the decision makers either request or outright require dashboard-style features to report frequency of use, time of day, and in some cases even GPS-based location, in addition to theft deterrence, remote provisioning, etc. The same goes for going back to an agency to get renewed funding, or to raise funds for a new site expansion. In a way, the scope of the learner-teacher bubble is significantly different from that of the principal-minister of education: one is driven by learning and pedagogy, while the other is driven by administration. Accordingly, the reports they want to see are also different. While the measurements from the Activity may be distilled into coarser indicators for the MoE, I think it is important to keep the entire scope in mind.

Don't get me wrong: satisfying the needs of funders, administrators, etc. is important too. They have metrics that they value, and we should gather those data too. My earlier post was just to suggest that ultimately we need to consider the learner and how making learning visible can be of use. That theme seemed to be missing from the earlier discussion.

> I am mindful of the garbage in, garbage out problem. In building this pipeline (which is where my skills are), I hope that the data that goes into it is representative of what is measured at the child's end. I am glad that you and Claudia are the experts on that end :-)
>
> cheers,
> Sameer

regards.
-walter

--
Walter Bender
Sugar Labs
http://www.sugarlabs.org
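As a concrete example of the dashboard-style reporting Sameer mentions, here is a minimal sketch that buckets activity launches by hour of day. The input format (one UNIX timestamp per line) and the file path are hypothetical stand-ins for whatever a deployment's collector actually records.

#!/usr/bin/env python
# Minimal sketch: frequency of use by hour of day, from a plain-text
# log of launch timestamps. Input format and path are hypothetical.
import time
from collections import Counter

def hourly_histogram(path):
    hours = Counter()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                hours[time.localtime(float(line)).tm_hour] += 1
    return hours

hours = hourly_histogram('/var/lib/quest-for-data/launch_times.log')
for h in range(24):
    # crude ASCII bar chart, one '#' per launch in that hour
    print('%02d:00  %s' % (h, '#' * hours.get(h, 0)))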
Re: [Server-devel] [Sugar-devel] The quest for data
Agreed.

On Sun, Jan 12, 2014 at 6:02 PM, Walter Bender walter.ben...@gmail.com wrote:
> [...]
Re: [Server-devel] [Sugar-devel] The quest for data
On 7.1.2014 01:49, Sameer Verma wrote:
> On Mon, Jan 6, 2014 at 12:28 AM, Martin Dluhos mar...@gnu.org wrote:
>> For visualization, I have explored using LibreOffice and SOFA, but neither of those was flexible enough to allow for customization of the output beyond a few rudimentary options, so I started looking at various JavaScript libraries, which are much more powerful. Currently, I am experimenting with Google Charts, which I found the easiest to get started with. If I run into limitations with Google Charts in the future, others on my list are the InfoVis Toolkit (http://philogb.github.io/jit) and Highcharts (http://highcharts.com). Then there is also D3.js, but that's a bigger animal.
>
> Keep in mind that if you want to visualize at the school's local XS[CE], you may have to rely on a local JS method instead of an online library.

Yes, that's a very good point. Originally, I was only thinking about collecting and visualizing the information centrally, but there is no reason why it couldn't be viewed by teachers and school administrators on the schoolserver itself. Thanks for the warning.
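For the schoolserver case Sameer raises, one way to stay independent of online chart loaders is to publish the aggregated data as JSON and render it with a JS library bundled locally on the XS[CE]. A minimal sketch of the data-publishing half, with hypothetical paths and input format:

#!/usr/bin/env python
# Minimal sketch: export activity-usage counts as JSON so a page served
# by the schoolserver can chart them with a locally bundled JS library,
# with no internet access needed. Input format and paths are hypothetical.
import json
from collections import Counter

def load_launch_records(path):
    # Assumes a plain-text log with one activity bundle id per recorded
    # launch, e.g. 'org.laptop.TuxMath'. Swap this for whatever the
    # deployment actually collects (Journal backups, sugar-stats, etc.).
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                yield line

counts = Counter(load_launch_records('/var/lib/quest-for-data/launches.log'))

# A chart page served from the XS[CE] can fetch this file and feed it
# to whichever charting library is bundled with the page.
with open('/var/www/html/stats/activity_counts.json', 'w') as out:
    json.dump([{'activity': a, 'launches': n}
               for a, n in counts.most_common()], out, indent=2)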
Re: [Server-devel] [Sugar-devel] The quest for data
On 10.1.2014 11:55, Anish Mangal wrote:
> One of the already mentioned solutions is the sugar-stats package, originally developed by Aleksey, which has now been part of dextrose-sugar builds for over a year, along with the server side (XSCE). http://wiki.sugarlabs.org/go/Platform_Team/Usage_Statistics
>
> [...]
>
> In my humble opinion, the next steps could be:
> 1. Get better on the visualization front.
> 2. Search for more context. Maybe arm the sugar-datastore to collect higher-resolution data.

I think that you are absolutely right, Anish. In my project, I am currently focused on the former point, but I am running into limitations regarding the data stored in the datastore. As Sameer suggested, let's create a wiki page with a list of the data that the community finds important, and then compare that list with what's currently collected in the datastore.
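For anyone wanting to inspect what sugar-stats already records, here is a minimal sketch that dumps recent samples from one of its RRD files. It assumes the python-rrdtool bindings; the RRD path is hypothetical and will vary by build and user.

#!/usr/bin/env python
# Minimal sketch: dump the last 24 hours of samples from a sugar-stats
# RRD file. Requires the python-rrdtool bindings; the path below is
# hypothetical and depends on where a given build keeps its RRDs.
import time

import rrdtool

RRD_PATH = '/home/olpc/.sugar/default/stats/user.rrd'  # hypothetical

# fetch() returns ((start, end, step), datasource names, rows).
(start, end, step), names, rows = rrdtool.fetch(
    RRD_PATH, 'AVERAGE', '--start', str(int(time.time()) - 24 * 3600))

ts = start
for row in rows:
    if any(v is not None for v in row):  # skip empty slots
        sample = dict(zip(names, row))
        print('%s %s' % (time.strftime('%d %b %H:%M',
                                       time.localtime(ts)), sample))
    ts += step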
Re: [Server-devel] [Sugar-devel] The quest for data
Sorry for being late to the party. Clearly the quest for data is a commonly shared one, with many different approaches, questions, and reporting/results.

One of the already mentioned solutions is the sugar-stats package, originally developed by Aleksey, which has now been part of dextrose-sugar builds for over a year, along with the server side (XSCE). http://wiki.sugarlabs.org/go/Platform_Team/Usage_Statistics

The approach we followed was to collect as much data as possible without interfering with Sugar APIs or code. The project has made slow progress on the visualization front, but the data collection side has already been field tested.

I for one think there are a few technical trade-offs, which lead to larger strategy decisions:

* Context vs. universality: Ideally we'd like to collect (activity) context-specific data, but that requires tinkering with the Sugar API itself and with each activity. The flip side is that we might be ignoring the other types of data a server might be collecting: internet usage and the various other logfiles in /var/log.

* Static vs. dynamic: Analyzing Journal backups is great, but they are ultimately limited in time resolution by the datastore's design itself.

So the key question is: what's valuable? a) Frequency counts of activities? b) Up-to-the-minute data on what activities are running, which activity is active (visible) when, collaborators over time, etc.?

In my humble opinion, the next steps could be:
1. Get better on the visualization front.
2. Search for more context. Maybe arm the sugar-datastore to collect higher-resolution data.

On Tue, Jan 7, 2014 at 12:24 PM, Christophe Guéret christophe.gue...@dans.knaw.nl wrote:
> Dear Sameer, all,
>
> That's a very interesting blog post and discussion. I agree that collecting data is important, but knowing what questions that data is meant to answer is even more so. If you need help with that last bit, I could propose using the Journal data as a use-case for the KnowEscape project (http://knowescape.org/). That project is about getting insights out of large knowledge spaces via visualisation. There is a wide (European) community of experts behind it, coming from different research fields (humanities, physics, computer science, ...). Something useful could maybe come out of it...
>
> I would also like to refer you to the ERS project, which we have now almost finished. This project is an extension of the ideas behind SemanticXO, which some of you may remember. We developed a decentralised entity registry system with the XO as a primary platform for coding and testing. There is a description of the implementation and links to code on http://ers-devs.github.io/ers/ . We also had a poster at OLPC SF (thanks for that!).
>
> In a nutshell, ERS creates global and shared knowledge spaces through series of statements. For instance, "Amsterdam is in the Netherlands" is a statement made about the entity Amsterdam, relating it to the entity the Netherlands. Every user of ERS may want to either de-reference an entity (e.g., asking for all pieces of information about Amsterdam) or contribute to the content of the shared space by adding new statements. This is made possible via Contributor nodes, one of the three types of node defined in our system. Contributors can interact freely with the knowledge base. They themselves take care of publishing their own statements, but cannot edit third-party statements. Every set of statements about a given entity contributed by a single author is wrapped into a document in CouchDB to avoid conflicts and enable provenance tracking. Every single XO is a Contributor. Two Contributors in a closed P2P network can freely create and share Linked Open Data.
>
> In order for them to share data with another closed group of Contributors, we have Bridges. A Bridge is a relay between two closed networks, using the internet or any other form of direct connection to share data. Two closed communities, for example two schools, willing to share data can each set up a Bridge and connect these two nodes to each other. The Bridges will then collect and exchange data coming from the Contributors. These Bridges are not Contributors themselves; they are just used to ship data (named graphs) around, and can be shut down or replaced without any data loss.
>
> Lastly, the third component we define in our architecture is the Aggregator. This is a special node every Bridge may push content to and get updated content from. As its name suggests, an Aggregator is used to aggregate entity descriptions that are otherwise scattered among all the Contributors. When deployed, an aggregator can be used to access and expose the global content of the knowledge space or a subset thereof. One could use ERS to store (part of) the content of the Journal on an XO (Contributor), cluster information at the school level (a Bridge on the XS), and provide higher-level analysis
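Here is a minimal sketch of the document-per-(entity, author) idea Christophe describes, using the couchdb-python bindings against a local CouchDB. The database name, document layout, and helper function are illustrative guesses, not the actual ERS schema.

#!/usr/bin/env python
# Minimal sketch: all statements about one entity from one author live
# in a single CouchDB document, which avoids edit conflicts and keeps
# provenance. Database name and layout are hypothetical, not ERS's own.
import couchdb

server = couchdb.Server('http://localhost:5984/')  # local Contributor node
db = server['ers'] if 'ers' in server else server.create('ers')

def add_statement(db, author, entity, prop, value):
    # One document per (entity, author): only this author ever edits
    # it, so third-party statements can never be overwritten.
    doc_id = '%s|%s' % (entity, author)
    doc = db.get(doc_id) or {'_id': doc_id, 'entity': entity,
                             'author': author, 'statements': []}
    doc['statements'].append({'property': prop, 'value': value})
    db.save(doc)

# "Amsterdam is in the Netherlands", as contributed by one XO
add_statement(db, 'xo-1234', 'Amsterdam', 'isIn', 'the Netherlands')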
Re: [Server-devel] [Sugar-devel] The quest for data
On Mon, Jan 6, 2014 at 3:00 PM, Sameer Verma sve...@sfsu.edu wrote:
> On Mon, Jan 6, 2014 at 4:50 AM, Walter Bender walter.ben...@gmail.com wrote:
>> On Mon, Jan 6, 2014 at 3:48 AM, Martin Dluhos mar...@gnu.org wrote:
>>> On 4.1.2014 10:44, Sameer Verma wrote:
>>>> True. Activities do not report end times, or whether the frequency count is for the number of times a new activity was started, or if it was simply a resumption of the previous instance. Walter had indicated that there is some movement in this direction to gather end times.
>>>
>>> This would indeed be very useful. Is anyone working on implementing these features?
>>
>> The frequency count is a count of the number of times an instance of an activity has been opened. The number of new instances can be determined by the number of instance entries in the Journal.
>
> Walter,
>
> From a conversation we had some time ago, you had pointed out that TuxMath does not necessarily stick to this regimen: every time one resumes an instance, it gets counted as a new instance. I haven't gone back to verify this, but how consistent is this behavior across activities? Can this behavior be standardized?

I am not sure about TuxMath (or Tuxpaint, Scratch, or Etoys), none of which are native Sugar activities. But the behavior I described is standard across native Sugar activities.

-walter

> Yes, the methods that use the datastore as a source rely on the Journal, but the sugar-stats system does not. I believe it collects in GNOME as well. Have you done any processing, analysis, or visualization of the sugar-stats data? Is that something that you are planning to integrate into OLPC Dashboard?

There is an app for letting the user visualize their own stats (Journal Stats). It could use some love and attention.

> This is an excellent example of providing meaningful feedback with respect to the scope. To borrow the zoom metaphor, I see the Journal stats to be at the level where the scope is local to the child. The same scope zooms out at the level of the teacher, principal, district education officer, MoE, etc.
>
> cheers,
> Sameer

>>>> 4) The reporting can be done either via visualization, and/or by generating periodic reports. The reporting should be specific to the person(s) looking at it. No magic there.
>>>
>>> I think that many questions (some of which we already mentioned above) can be answered with reports and visualizations which are not deployment specific. For example, those you are targeting with the OLPC dashboard.
>>>
>>>> How the data will be used remains to be seen. [...] We can support strong(er) correlation, but cannot really establish causality. The three data points put together make for a compelling case.
>>>
>>> I think this is a really important point to emphasize: none of these approaches to evaluation provides the complete picture, but all of them used in aggregate can provide useful insights.
>>>
>>> Here at OLE Nepal, we already use standardized testing to compare students' performance before and after the program launch. We also follow up with teachers through conversations and surveys on regular support visits. I agree with Sameer that supplementing those with statistical data can make for a much stronger case.
>>>
>>> Martin

--
Walter Bender
Sugar Labs
http://www.sugarlabs.org
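Following Walter's point that new instances can be counted from instance entries in the Journal, here is a minimal sketch that tallies instances per activity from a metadata backup. It assumes each entry's metadata has been exported as one JSON file with an 'activity' bundle id; the real datastore layout varies by Sugar version, so treat the input format and path as hypothetical.

#!/usr/bin/env python
# Minimal sketch: count Journal instance entries per activity from a
# backup of Journal metadata on the schoolserver. Input format and
# directory are hypothetical.
import json
import os
from collections import Counter

BACKUP_DIR = '/library/users/backup/journal-metadata'  # hypothetical

counts = Counter()
for name in os.listdir(BACKUP_DIR):
    if not name.endswith('.json'):
        continue
    with open(os.path.join(BACKUP_DIR, name)) as f:
        meta = json.load(f)
    bundle_id = meta.get('activity')
    if bundle_id:  # each Journal object entry = one activity instance
        counts[bundle_id] += 1

for bundle_id, n in counts.most_common():
    print('%6d  %s' % (n, bundle_id))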
Re: [Server-devel] [Sugar-devel] The quest for data
On Fri, Jan 3, 2014 at 2:23 PM, James Cameron qu...@laptop.org wrote:
> Metrics can direct action. Unfortunately, in the absence of meaningful metrics, the meaningless metrics will also direct action.

True. In fact, the reliability of the whole thing is dependent on the reliability of the generated data. For instance, if the timestamp is corrupt, then so is the analysis, unless the data are treated for that bias.

> One of the assertions inherent in OLPC is that merely using a device can have an effect on a brain, regardless of what activities are used.

Brain, perhaps. I'm leaning more on the learning side ;-)

> In the data listed, I haven't seen any use of more fundamental measurements like how long a device is used for. OLPC's builds have a power log. This captures time spent using a device.

True. Activities do not report end times, or whether the frequency count is for the number of times a new activity was started, or if it was simply a resumption of the previous instance. Walter had indicated that there is some movement in this direction to gather end times. The sugar-stats system does record end times. We still have an assumption (to be addressed by the researcher) that x seconds of use actually lead to a delta of y in learning. Usually we establish correlation, and support a case for causality with proxy observations.

> It is especially relevant for a device that might also be used in GNOME rather than Sugar. Harvest seems to have arisen out of the availability of the Journal.

Yes, the methods that use the datastore as a source rely on the Journal, but the sugar-stats system does not. I believe it collects in GNOME as well.

The way I see it, there are four parts to this supply chain: measurement, collection, analysis, and reporting (see http://www.educause.edu/ero/article/penetrating-fog-analytics-learning-and-education).

1) The data has to be generated at the source (Sugar activity or D-Bus), and must be generated with the required granularity and reliability. So, for instance, TurtleArt can record the type of blocks, or Maze can record the number of turns. This will vary by activity. We also have to be mindful of reliability, for instance, of internal clock variation for timestamps.

2) We need a way to collect data on an ongoing basis on the laptop. This may be in the Journal datastore, or in the RRD file, as in the case of sugar-stats. We then continue the collection by aggregating the data at the XS/XSCE and/or a central location (as with the Harvest system) so that the data can be analyzed.

3) The analysis stage can be done with the raw data (basic statistics, correlation, qualitative), or the data can be aggregated (as with the Jamaica CouchDB system doing basic stats) and made ready for reporting. Some of this may be automated, but to go beyond PowerPoint pie charts, it's really on a case-by-case basis.

4) The reporting can be done either via visualization, and/or by generating periodic reports. The reporting should be specific to the person(s) looking at it. No magic there.

Now, of course, if the data at the source is corrupt, then that may reflect in the report. There are ways to address missing data and biases, but it would be better to have a reliable way to generate data at the source.

> On the other hand, use of metrics tends towards standardised testing, with the ultimate implementation being an examination that must be completed each time before using a device for learning. Imagine having to delay learning!

How the data will be used remains to be seen. I have not seen it being used in any of the projects that I know of. If others have seen/done so, it would help to hear from them. I know that in conversations and presentations to decision makers, the usual sore point is "can you show us what you have so far?"

For Jamaica, we have used a basic exploratory approach on the Journal data, corroborated with structured interviews with parents, teachers, etc. So, for instance, the data we have shows a relatively large frequency of use of TuxMath (even with different biases). However, we have qualitative evidence that supports both usage of TuxMath and improvement in numeracy (standardized test). We can support strong(er) correlation, but cannot really establish causality. The three data points put together make for a compelling case. As an aside, I did encounter a clever question in one of the presentations: "What's constructivist about TuxMath?" That's a discussion for another thread :-)

> I don't like the idea of standardised testing. I've seen the damage that it does. Sir Ken Robinson had a few things to say about that in his talk Changing Education Paradigms.

It plays a role in the education-industrial complex, and it is difficult to entirely walk away from it, but yes, YMMV.

cheers,
Sameer
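To make the correlation point concrete, here is a minimal sketch that computes Pearson's r between per-student TuxMath launch counts and numeracy score gains. The numbers are fabricated for illustration; a high r here still says nothing about causality, which is exactly why the qualitative evidence matters.

#!/usr/bin/env python
# Minimal sketch of the correlation step: relate per-student TuxMath
# launch counts (e.g. from Journal data) to test-score gains.
# The data below is made up for illustration.
import math

# (tuxmath_launches, score_gain) per student -- hypothetical data
pairs = [(5, 2.0), (12, 4.5), (30, 6.0), (8, 3.0), (22, 5.5), (2, 1.0)]

def pearson_r(pairs):
    n = float(len(pairs))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print('Pearson r = %.2f' % pearson_r(pairs))  # strongly positive here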