Re: [Sugar-devel] The quest for data

2014-01-13 Thread Martin Dluhos
On 12.1.2014 10:12, Sameer Verma wrote:

 Has anyone created the wiki page as yet?

Just created the wiki page:

http://wiki.sugarlabs.org/go/Education_Team/Quest_for_Data

Please help me expand it as you gather feedback from other deployments.

Cheers,
Martin



Re: [Sugar-devel] The quest for data

2014-01-12 Thread Sameer Verma
On Sun, Jan 12, 2014 at 6:33 AM, Walter Bender walter.ben...@gmail.com wrote:
 On Fri, Jan 10, 2014 at 3:37 PM, Sameer Verma sve...@sfsu.edu wrote:
 On Fri, Jan 10, 2014 at 3:26 AM, Martin Dluhos mar...@gnu.org wrote:
 On 7.1.2014 01:49, Sameer Verma wrote:
 On Mon, Jan 6, 2014 at 12:28 AM, Martin Dluhos mar...@gnu.org wrote:
 For visualization, I have explored using LibreOffice and SOFA, but
 neither of those was flexible enough to allow customization of the
 output beyond a few rudimentary options, so I started looking at
 various JavaScript libraries, which are much more powerful. Currently,
 I am experimenting with Google Charts, which I found the easiest to
 get started with. If I run into limitations with Google Charts in the
 future, others on my list are the InfoVis Toolkit
 (http://philogb.github.io/jit) and Highcharts (http://highcharts.com).
 Then, there is also D3.js, but that's a bigger animal.

 Keep in mind that if you want to visualize at the school's local
 XS[CE], you may have to rely on a locally hosted JS library instead
 of an online one.

 Yes, that's a very good point.  Originally, I was only thinking about 
 collecting
 and visualizing the information centrally, but there is no reason why it
 couldn't be viewed by teachers and school administrators on the schoolserver
 itself. Thanks for the warning.



 In fact, my guess would be that what the teachers and principal want
 to see at the school will be different from what OLE Nepal and the
 government would want to see, with interesting overlaps.

 You left out one important constituent: the learner. Ultimately we are
 responsible for making learning visible to the learner. Claudia and I
 touched on this topic in the attached paper.


Thanks for the paper. While we did point to the Portfolio and Analyze
Journal activities in our session at the OLPC SF Summit in 2013, I didn't
include them in the scope of the blog post. I'll go back and update it
when I get a chance.

 Just to place all my cards on the table, as much as I hate to suggest
 we head down this route, I think we really need to instrument
 activities themselves (and build analyses of activity output) if we
 want to provide meaningful statistics about learning. We've done some
 of this with Turtle Blocks, even capturing the mistakes the learner
 makes along the way. We are lacking in decent visualizations of these
 data, however.


I haven't had a chance to read the paper in depth (which I intend to
do this afternoon), but how much of this approach would be shareable
across activities? Or would the depth of analysis be on a per activity
basis? If the latter, then I'd imagine it would be simpler for
something like the Moon activity than the TurtleBlocks activity.

 Meanwhile, I remain convinced that the portfolio is our best tool.


I think the approaches differ in scope and purpose. In the RFPs I've
been involved in, the funding agencies and/or the decision makers
either request or outright require dashboard-style features to
report frequency of use, time of day, and in some cases even GPS-based
location, in addition to theft deterrence, remote provisioning, etc.
The same goes for going back to an agency for renewed funding or to
raise funds for a new site expansion. In a way, the scope of the
learner-teacher bubble is significantly different from that of the
principal/minister of education. One is driven by learning and pedagogy,
while the other is driven by administration. Accordingly, the reports
they want to see are also different. While the measurements from the
Activity may be distilled into coarser indicators for the MoE, I think
it is important to keep the entire scope in mind.

I am mindful of the "garbage in, garbage out" problem. In building
this pipeline (which is where my skills are) I hope that the data that
goes into this pipeline is representative of what is measured at the
child's end. I am glad that you and Claudia are the experts on that
end :-)

cheers,
Sameer

 regards.

 -walter



 cheers,
 Sameer



 --
 Walter Bender
 Sugar Labs
 http://www.sugarlabs.org


Re: [Sugar-devel] The quest for data

2014-01-12 Thread Walter Bender
On Sun, Jan 12, 2014 at 3:32 PM, Sameer Verma sve...@sfsu.edu wrote:
 On Sun, Jan 12, 2014 at 6:33 AM, Walter Bender walter.ben...@gmail.com 
 wrote:
 On Fri, Jan 10, 2014 at 3:37 PM, Sameer Verma sve...@sfsu.edu wrote:
 On Fri, Jan 10, 2014 at 3:26 AM, Martin Dluhos mar...@gnu.org wrote:
 On 7.1.2014 01:49, Sameer Verma wrote:
 On Mon, Jan 6, 2014 at 12:28 AM, Martin Dluhos mar...@gnu.org wrote:
 For visualization, I have explored using LibreOffice and SOFA, but
 neither of those was flexible enough to allow customization of the
 output beyond a few rudimentary options, so I started looking at
 various JavaScript libraries, which are much more powerful. Currently,
 I am experimenting with Google Charts, which I found the easiest to
 get started with. If I run into limitations with Google Charts in the
 future, others on my list are the InfoVis Toolkit
 (http://philogb.github.io/jit) and Highcharts (http://highcharts.com).
 Then, there is also D3.js, but that's a bigger animal.

 Keep in mind that if you want to visualize at the school's local
 XS[CE], you may have to rely on a locally hosted JS library instead
 of an online one.

 Yes, that's a very good point.  Originally, I was only thinking about 
 collecting
 and visualizing the information centrally, but there is no reason why it
 couldn't be viewed by teachers and school administrators on the 
 schoolserver
 itself. Thanks for the warning.



 In fact, my guess would be that what the teachers and principal want
 to see at the school will be different from what OLE Nepal and the
 government would want to see, with interesting overlaps.

 You left out one important constituent: the learner. Ultimately we are
 responsible for making learning visible to the learner. Claudia and I
 touched on this topic in the attached paper.


 Thanks for the paper. While we did point to the Portfolio and Analyze
 Journal activities in our session at the OLPC SF Summit in 2013, I
 didn't include them in the scope of the blog post. I'll go back and
 update it when I get a chance.

 Just to place all my cards on the table, as much as I hate to suggest
 we head down this route, I think we really need to instrument
 activities themselves (and build analyses of activity output) if we
 want to provide meaningful statistics about learning. We've done some
 of this with Turtle Blocks, even capturing the mistakes the learner
 makes along the way. We are lacking in decent visualizations of these
 data, however.


 I haven't had a chance to read the paper in depth (which I intend to
 do this afternoon), but how much of this approach would be shareable
 across activities? Or would the depth of analysis be on a per activity
 basis? If the latter, then I'd imagine it would be simpler for
 something like the Moon activity than the TurtleBlocks activity.

 Meanwhile, I remain convinced that the portfolio is our best tool.


 I think the approaches differ in scope and purpose. In the RFPs I've
 been involved in, the funding agencies and/or the decision makers
 either request or outright require dashboard-style features to
 report frequency of use, time of day, and in some cases even GPS-based
 location, in addition to theft deterrence, remote provisioning, etc.
 The same goes for going back to an agency for renewed funding or to
 raise funds for a new site expansion. In a way, the scope of the
 learner-teacher bubble is significantly different from that of the
 principal/minister of education. One is driven by learning and
 pedagogy, while the other is driven by administration. Accordingly,
 the reports they want to see are also different. While the
 measurements from the Activity may be distilled into coarser
 indicators for the MoE, I think it is important to keep the entire
 scope in mind.

Don't get me wrong: satisfying the needs of funders, administrators,
etc. is important too. They have metrics that they value, and we should
gather those data too. My earlier post was just to suggest that
ultimately we need to consider the learner and how making learning
visible can be of use. That theme seemed to be missing from the
earlier discussion.


 I am mindful of the "garbage in, garbage out" problem. In building
 this pipeline (which is where my skills are) I hope that the data that
 goes into this pipeline is representative of what is measured at the
 child's end. I am glad that you and Claudia are the experts on that
 end :-)

 cheers,
 Sameer

 regards.

 -walter



 cheers,
 Sameer



 --
 Walter Bender
 Sugar Labs
 http://www.sugarlabs.org



-- 
Walter Bender
Sugar Labs
http://www.sugarlabs.org


Re: [Sugar-devel] The quest for data

2014-01-12 Thread Dr. Gerald Ardito
Agreed.


On Sun, Jan 12, 2014 at 6:02 PM, Walter Bender walter.ben...@gmail.com wrote:

 Don't get me wrong: satisfying the needs of funders, administrators,
 etc. is important too. They have metrics that they value, and we should
 gather those data too. My earlier post was just to suggest that
 ultimately we need to consider the learner and how making learning
 visible can be of use. That theme seemed to be missing from the
 earlier discussion.

 

Re: [Sugar-devel] The quest for data

2014-01-12 Thread Sameer Verma
On Sun, Jan 12, 2014 at 3:02 PM, Walter Bender walter.ben...@gmail.com wrote:

 Don't get me wrong: satisfying the needs of funders, administrators,
 etc. is important too. They have metrics that they value, and we should
 gather those data too. My earlier post was just to suggest that
 ultimately we need to consider the learner and how making learning
 visible can be of use. That theme seemed to be missing from the
 earlier discussion.


Agreed. In fact, down the road, if the data gathering can come from a
single, focused source, that would help us in the long run in
supporting the goals of both the learner space and the administrator
space. As an interesting aside, I see similar challenges on my campus
with systems for the learner space and the admin space. The sad part
is that the decision makers usually begin with the vendors, not the
users.

cheers,
Sameer



Re: [Sugar-devel] The quest for data

2014-01-11 Thread Sameer Verma
We had our January meeting at OLPC SF (and our 6th birthday). We talked
about contributions to this project. Introducing Nina Stawski to the
thread. She works with HTML and JavaScript and is familiar with
visualization. She suggested d3js.org as one of the options.

Has anyone created the wiki page as yet?

cheers,
Sameer



Re: [Sugar-devel] The quest for data

2014-01-10 Thread Martin Dluhos
On 7.1.2014 01:49, Sameer Verma wrote:
 On Mon, Jan 6, 2014 at 12:28 AM, Martin Dluhos mar...@gnu.org wrote:
 For visualization, I have explored using LibreOffice and SOFA, but
 neither of those was flexible enough to allow customization of the
 output beyond a few rudimentary options, so I started looking at
 various JavaScript libraries, which are much more powerful. Currently,
 I am experimenting with Google Charts, which I found the easiest to
 get started with. If I run into limitations with Google Charts in the
 future, others on my list are the InfoVis Toolkit
 (http://philogb.github.io/jit) and Highcharts (http://highcharts.com).
 Then, there is also D3.js, but that's a bigger animal.
 
 Keep in mind that if you want to visualize at the school's local
 XS[CE], you may have to rely on a locally hosted JS library instead
 of an online one.

Yes, that's a very good point.  Originally, I was only thinking about collecting
and visualizing the information centrally, but there is no reason why it
couldn't be viewed by teachers and school administrators on the schoolserver
itself. Thanks for the warning.


Re: [Sugar-devel] The quest for data

2014-01-10 Thread Martin Dluhos
On 10.1.2014 11:55, Anish Mangal wrote:
 Sorry for being late to the party. Clearly the quest for data is a commonly
 shared one, with many different approaches, questions, and reporting/results.
 
 One of the already mentioned solutions is the sugar-stats package,
 originally developed by Aleksey, which has now been part of Dextrose
 Sugar builds for over a year, along with its server side (XSCE).
 
 http://wiki.sugarlabs.org/go/Platform_Team/Usage_Statistics
 
 The approach we followed was to collect as much data as possible without
 interfering with sugar-apis or code. The project has made slow progress on the
 visualization front, but the data collection front has already been field 
 tested.
 
 
 I for one think there are a few technical trade-offs, which lead to larger
 strategy decisions:
 * Context vs. universality ... Ideally we'd like to collect (activity)
 context-specific data, but that requires tinkering with the Sugar API
 itself and each activity. The flip side is that we might be ignoring
 the other types of data a server might be collecting ... internet
 usage and the various other logfiles in /var/log
 
 * Static vs. dynamic ... Analyzing Journal backups is great, but they
 are ultimately limited in time resolution by the datastore's design
 itself. So the key question is: what's valuable? ... a) Frequency
 counts of activities? b) Data at up-to-the-minute resolution: what
 activities are running, which activity is active (visible & when),
 collaborators over time ... etc ...
 
 In my humble opinion, the next steps could be:
 1. Get better on the visualization front.
 2. Search for more context. Maybe arm the sugar-datastore to collect
 higher-resolution data.

I think that you are absolutely right, Anish. In my project, I am currently
focused on the former point, but I am running into limitations regarding the
data stored in the datastore. As Sameer suggested, let's create a wiki page
with a list of the data that the community finds important and then compare
that list with what's currently collected in the datastore.



Re: [Sugar-devel] The quest for data

2014-01-10 Thread Martin Abente
+1


On Fri, Jan 10, 2014 at 8:37 AM, Martin Dluhos mar...@gnu.org wrote:

 I think that you are absolutely right, Anish. In my project, I am
 currently focused on the former point, but I am running into
 limitations regarding the data stored in the datastore. As Sameer
 suggested, let's create a wiki page with a list of the data that the
 community finds important and then compare that list with what's
 currently collected in the datastore.



Re: [Sugar-devel] The quest for data

2014-01-10 Thread Sameer Verma
On Thu, Jan 9, 2014 at 10:10 PM, Anish Mangal an...@activitycentral.com wrote:

 In my humble opinion, the next steps could be:
 1. Get better on the visualization front.
 2. Search for more context. Maybe arm the sugar-datastore to collect
 higher-resolution data.



1 and 2 can be done in parallel. As long as the architecture is
independent, the data sources can be sugar-datastore or sugar-stats.

BTW, Leotis has pushed his OLPC Dashboard code to GitHub:
https://github.com/Leotis/olpc-datavisualization-

cheers,
Sameer




Re: [Sugar-devel] The quest for data

2014-01-10 Thread Sameer Verma
On Fri, Jan 10, 2014 at 3:26 AM, Martin Dluhos mar...@gnu.org wrote:
 On 7.1.2014 01:49, Sameer Verma wrote:
 On Mon, Jan 6, 2014 at 12:28 AM, Martin Dluhos mar...@gnu.org wrote:
 For visualization, I have explored using LibreOffice and SOFA, but
 neither of those was flexible enough to allow customization of the
 output beyond a few rudimentary options, so I started looking at
 various JavaScript libraries, which are much more powerful. Currently,
 I am experimenting with Google Charts, which I found the easiest to
 get started with. If I run into limitations with Google Charts in the
 future, others on my list are the InfoVis Toolkit
 (http://philogb.github.io/jit) and Highcharts (http://highcharts.com).
 Then, there is also D3.js, but that's a bigger animal.

 Keep in mind that if you want to visualize at the school's local
 XS[CE], you may have to rely on a locally hosted JS library instead
 of an online one.

 Yes, that's a very good point.  Originally, I was only thinking about 
 collecting
 and visualizing the information centrally, but there is no reason why it
 couldn't be viewed by teachers and school administrators on the schoolserver
 itself. Thanks for the warning.



In fact, my guess would be that what the teachers and principal want
to see at the school will be different from what OLE Nepal and the
government would want to see, with interesting overlaps.

cheers,
Sameer


Re: [Sugar-devel] The quest for data

2014-01-09 Thread Anish Mangal
Sorry for being late to the party. Clearly the quest for data is a
commonly shared one, with many different approaches, questions, and
reporting/results.

One of the already mentioned solutions is the sugar-stats package,
originally developed by Aleksey, which has now been part of Dextrose
Sugar builds for over a year, along with its server side (XSCE).

http://wiki.sugarlabs.org/go/Platform_Team/Usage_Statistics

The approach we followed was to collect as much data as possible without
interfering with sugar-apis or code. The project has made slow progress on
the visualization front, but the data collection front has already been
field tested.


I for one think there are a few technical trade-offs, which lead to larger
strategy decisions:
* Context vs. universality ... Ideally we'd like to collect (activity)
context-specific data, but that requires tinkering with the Sugar API
itself and each activity. The flip side is that we might be ignoring the
other types of data a server might be collecting ... internet usage and
the various other logfiles in /var/log

* Static vs. dynamic ... Analyzing Journal backups is great, but they are
ultimately limited in time resolution by the datastore's design itself.
So the key question is: what's valuable? ... a) Frequency counts of
activities? b) Data at up-to-the-minute resolution: what activities are
running, which activity is active (visible & when), collaborators over
time ... etc ...

In my humble opinion, the next steps could be:
1. Get better on the visualization front.
2. Search for more context. Maybe arm the sugar-datastore to collect
higher-resolution data.
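
To make the static-vs.-dynamic trade-off concrete, the dynamic side
could be as simple as the sketch below: a sampler that polls once a
minute for running activity processes and appends timestamped rows to
a CSV. This is a toy illustration, not how sugar-stats actually
collects data, and the 'sugar-activity' process-name match is an
assumption that would need adjusting per build.

#!/usr/bin/env python
# Toy minute-resolution sampler illustrating the "dynamic" approach.
# Not the sugar-stats implementation; the process-name match below
# is an assumption.
import csv
import time

import psutil  # third-party: pip install psutil

def running_activities():
    """Return command lines that look like Sugar activity processes."""
    found = []
    for proc in psutil.process_iter(['cmdline']):
        cmdline = ' '.join(proc.info['cmdline'] or [])
        if 'sugar-activity' in cmdline:
            found.append(cmdline)
    return found

def main():
    with open('activity_samples.csv', 'a') as out:
        writer = csv.writer(out)
        while True:
            stamp = time.strftime('%Y-%m-%d %H:%M')
            for cmdline in running_activities():
                writer.writerow([stamp, cmdline])
            out.flush()
            time.sleep(60)  # one-minute resolution

if __name__ == '__main__':
    main()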



On Tue, Jan 7, 2014 at 12:24 PM, Christophe Guéret 
christophe.gue...@dans.knaw.nl wrote:

 Dear Sameer, all,

 That's a very interesting blog post and discussion. I agree that
 collecting data is important, but knowing what questions that data is
 meant to answer is even more so. If you need help with that last bit,
 I could propose using the Journal data as a use case for the
 KnowEscape project (http://knowescape.org/). This project is about
 getting insights out of large knowledge spaces via visualisation.
 There is a wide (European) community of experts behind it coming from
 different research fields (humanities, physics, computer science,
 ...). Something useful could maybe come out...

 I would also like to refer you to the ERS project, which we have now
 almost finished. This project is an extension of the ideas behind
 SemanticXO, which some of you may remember. We developed a
 decentralised entity registry system with the XO as a primary platform
 for coding and testing. There is a description of the implementation
 and links to code on http://ers-devs.github.io/ers/ . We also had a
 poster at OLPC SF (thanks for that!).

 In a nutshell, ERS creates global and shared knowledge spaces through
 series of statements. For instance, "Amsterdam is in the Netherlands"
 is a statement made about the entity "Amsterdam" relating it to the
 entity "the Netherlands". Every user of ERS may want to either
 de-reference an entity (*e.g.*, asking for all pieces of information
 about Amsterdam) or contribute to the content of the shared space by
 adding new statements. This is made possible via "Contributor" nodes,
 one of the three types of node defined in our system. Contributors can
 interact freely with the knowledge base. They themselves take care of
 publishing their own statements but cannot edit third-party
 statements. Every set of statements about a given entity contributed
 by one single author is wrapped into a document in CouchDB to avoid
 conflicts and enable provenance tracking. Every single XO is a
 Contributor. Two Contributors in a closed P2P network can freely
 create and share Linked Open Data. In order for them to share data
 with another closed group of Contributors, we have "Bridges". A
 Bridge is a relay between two closed networks using the internet or
 any other form of direct connection to share data. Two closed
 communities, for example two schools, willing to share data can each
 set up one Bridge and connect these two nodes to each other. The
 Bridges will then collect and exchange data coming from the
 Contributors. These bridges are not Contributors themselves; they are
 just used to ship data (named graphs) around and can be shut down or
 replaced without any data loss. Lastly, the third component we define
 in our architecture is the "Aggregator". This is a special node every
 Bridge may push content to and get updated content from. As its name
 suggests, an Aggregator is used to aggregate entity descriptions that
 are otherwise scattered among all the Contributors. When deployed, an
 aggregator can be used to access and expose the global content of the
 knowledge space or a subset thereof.

 One could use ERS to store (part of) the content of the Journal on an
 XO (Contributor), cluster information at the school level (Bridge put
 on the XS) and provide higher level analysis 
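
(As an illustration of the document model Christophe describes, one
author's statements about one entity might be wrapped into a single
CouchDB document along these lines; the field names here are
hypothetical, not the actual ERS schema.)

# Hypothetical sketch of the "one document per (entity, author)" idea
# described above; field names are illustrative, not the ERS schema.
statement_doc = {
    "_id": "entity:Amsterdam/author:xo-1234",  # entity + author pair
    "entity": "Amsterdam",
    "author": "xo-1234",  # the Contributor that owns this document
    "statements": [
        # (predicate, object) pairs about the entity
        ("is-in", "the Netherlands"),
    ],
}
# Because each author only ever writes to their own documents,
# Contributors cannot edit third-party statements, and provenance is
# preserved per author.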

Re: [Sugar-devel] The quest for data

2014-01-06 Thread Martin Dluhos
On 3.1.2014 04:09, Sameer Verma wrote:
 Happy new year! May 2014 bring good deeds and cheer :-)
 
 Here's a blog post on the different approaches (that I know of) to data
 gathering across different projects. Do let me know if I missed anything.
 
 cheers,
 Sameer
 
 http://www.olpcsf.org/node/204

Thanks for putting together the summary, Sameer. Here is more information about
my xo-stats project:

The project's objective is to determine how XOs are used in Nepalese
classrooms, but I intend for the implementation to be general enough that
it can be reused by other deployments as well. Similar to the other projects
you've mentioned, I separated the project into four stages:

1) collecting data from the XO Journal backups on the schoolserver
2) extracting the data from the backups and storing it in an appropriate format
for analysis and visualization
3) statistically analyzing and visualizing the captured data
4) formulating recommendations for improving the program based on the analysis.

Stage 1 is already implemented on both the server side and the client
side, so I first focused on the next step of extracting the data.
Initially, I wanted to reuse an existing script, but I eventually found
that none of them were general enough to meet my criteria. One of my
goals is to make the script work on any version of Sugar.

Thus, I have been working on process_journal_stats.py, which takes a '/users'
directory with XO Journal backups as input, pulls out the Journal metadata,
and writes it out as a CSV or JSON file.

Journal backups can be in a variety of formats depending on the version
of Sugar. The script currently supports the backup format present in Sugar
versions 0.82 - 0.88, since the laptops distributed in Nepal are XO-1s
running Sugar 0.82. I am planning to add support for later versions of
Sugar in the next version of the script.
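
Conceptually, the extraction itself is simple. A stripped-down sketch
of the idea follows; the on-disk layout assumed below (one JSON
'.metadata' file per entry) is illustrative only, since real backup
layouts differ across Sugar versions, which is exactly what
process_journal_stats.py has to handle:

#!/usr/bin/env python
# Conceptual sketch of pulling Journal metadata out of backups.
# The '.metadata'-file layout is an assumption for illustration;
# real backup layouts vary across Sugar versions.
import csv
import json
import os
import sys

FIELDS = ['uid', 'activity', 'title', 'mtime']

def collect_metadata(users_dir):
    """Yield one metadata dict per Journal entry found under users_dir."""
    for root, _dirs, files in os.walk(users_dir):
        for name in files:
            if name.endswith('.metadata'):
                with open(os.path.join(root, name)) as f:
                    yield json.load(f)

def main(users_dir, out_path):
    with open(out_path, 'w') as out:
        writer = csv.DictWriter(out, FIELDS, extrasaction='ignore')
        writer.writeheader()
        for entry in collect_metadata(users_dir):
            writer.writerow({k: entry.get(k, '') for k in FIELDS})

if __name__ == '__main__':
    main(sys.argv[1], sys.argv[2])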

The script currently supports two ways to output statistical data. To produce
all statistical data from the Journal, one row per Journal record:

process_journal_stats.py all

To extract statistical data about the use of activities on the system, use:

process_journal_stats.py activity

The full documentation with all the options is in the README at:

https://github.com/martasd/xo-stats

One challenge of the project has been determining how much data processing to
do in the Python script and what to leave for the data analysis and
visualization tools later in the workflow. For now, I have stopped adding
features to the script and am evaluating the most appropriate tools for
visualizing the data.

Here are some of the questions I am intending to answer with the visualizations
and analysis:

* How many times do installed activities get used? How does activity use
differ over time?
* Which activities are children using to create files? What kinds of files
are being created?
* Which activities are being launched in share mode, and how often?
* At which times of day do children play with the activities?
* How does the set of activities used evolve as children age?

I am also going to be looking at how answers to these questions vary from
class to class, school to school, and region to region.
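
Many of these questions reduce to simple group-bys over the extracted
CSV. For example, launches per activity per hour of day could be
tallied as below; the 'activity' and 'mtime' column names and the
timestamp format are assumptions, to be adjusted to the actual headers
the script emits:

# Tally activity use by hour of day from the extracted CSV.
# Column names and timestamp format are assumptions; adjust to the
# actual headers that process_journal_stats.py produces.
import csv
from collections import Counter
from datetime import datetime

counts = Counter()
with open('journal_stats.csv') as f:
    for row in csv.DictReader(f):
        hour = datetime.strptime(row['mtime'], '%Y-%m-%dT%H:%M:%S').hour
        counts[(row['activity'], hour)] += 1

for (activity, hour), n in sorted(counts.items()):
    print('%s %02d:00 %d' % (activity, hour, n))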

As Martin Abente and Sameer mentioned above, our work needs to be informed by
discussions with the stakeholders: children, educators, parents, school
administrators, etc. We do have educational experts on the staff at OLE, who
have worked with more than 50 schools altogether, and I will be talking to
them as I look beyond answering the obvious questions.

For visualization, I have explored using LibreOffice and SOFA, but neither of
those was flexible enough to allow customization of the output beyond a few
rudimentary options, so I started looking at various JavaScript libraries,
which are much more powerful. Currently, I am experimenting with Google
Charts, which I found the easiest to get started with. If I run into
limitations with Google Charts in the future, others on my list are the
InfoVis Toolkit (http://philogb.github.io/jit) and Highcharts
(http://highcharts.com). Then, there is also D3.js, but that's a bigger
animal.
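
To give a flavor of the Google Charts route, a small script can turn
per-activity counts into a self-contained HTML page. The counts below
are made-up sample data, and note that the chart loader comes from
Google's CDN, which ties back to Sameer's point that an offline
schoolserver would need a locally hosted library instead:

# Emit a minimal Google Charts bar chart of activity usage counts.
# Sample data only; the chart loader is fetched from Google's CDN,
# so an offline schoolserver would need a locally hosted library.
import json

counts = {'TuxMath': 120, 'Write': 85, 'Paint': 60}  # made-up numbers
rows = json.dumps([['Activity', 'Launches']] +
                  [[k, v] for k, v in sorted(counts.items())])

page = """<html><head>
<script src="https://www.gstatic.com/charts/loader.js"></script>
<script>
google.charts.load('current', {packages: ['corechart']});
google.charts.setOnLoadCallback(function () {
  var data = google.visualization.arrayToDataTable(%s);
  var chart = new google.visualization.BarChart(
      document.getElementById('chart'));
  chart.draw(data, {title: 'Activity launches'});
});
</script></head>
<body><div id="chart" style="width: 600px; height: 400px"></div></body>
</html>""" % rows

with open('activity_usage.html', 'w') as f:
    f.write(page)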

Alternatively or perhaps in parallel, I am also willing to join efforts to
improve the OLPC Dashboard, which is trying to answer very similar questions to
mine.

I am looking forward to collaborating with everyone who is interested in
exploring ways to analyze and visualize OLPC/Sugar data in an interesting
and meaningful way.

Cheers,
Martin


Re: [Sugar-devel] The quest for data

2014-01-06 Thread Martin Dluhos
On 4.1.2014 10:44, Sameer Verma wrote:

 True. Activities do not report end times, or whether the frequency
 count is for the number of times a new activity was started, or if
 it was simply a resumption of the previous instance. Walter had
 indicated that there is some movement in this direction to gather end
 times.

This would indeed be very useful. Is anyone working on implementing these
features?

 Yes, the methods that use the datastore as a source rely on the
 Journal, but the sugar-stats system does not. I believe it collects in
 GNOME as well.

Have you done any processing, analysis, or visualization of the sugar-stats
data? Is that something that you are planning to integrate into OLPC Dashboard?

 4) The reporting can be done either via visualization, and/or by
 generating periodic reports. The reporting should be specific to the
 person(s) looking at it. No magic there.

I think that many questions (some of which we already mentioned above) can be
answered with reports and visualizations that are not deployment-specific,
for example those you are targeting with the OLPC Dashboard.

 
 How the data will be used remains to be seen. I have not seen it being
 used in any of the projects that I know of. If others have seen/done
 so, it would help to hear from them. I know that in conversations and
 presentations to decision makers, the usual sore point is "can you
 show us what you have so far?" For Jamaica, we have used a basic
 exploratory approach on the Journal data, corroborated with structured
 interviews with parents, teachers, etc. So, for instance, the data we
 have shows a relatively large frequency of use of TuxMath (even with
 different biases). However, we have qualitative evidence that supports
 both usage of TuxMath and improvement in numeracy (standardized test).
 We can support strong(er) correlation, but cannot really establish
 causality. The three data points put together make for a compelling
 case.

I think this is a really important point to emphasize: none of these
approaches to evaluation provides the complete picture, but all of them used
in aggregate can provide useful insights. Here at OLE Nepal, we already use
standardized testing to compare students' performance before and after the
program launch. We also follow up with teachers through conversations and
surveys on regular support visits. I agree with Sameer that supplementing
those with statistical data can make for a much stronger case.

Martin



Re: [Sugar-devel] The quest for data

2014-01-06 Thread Walter Bender
On Mon, Jan 6, 2014 at 3:48 AM, Martin Dluhos mar...@gnu.org wrote:
 On 4.1.2014 10:44, Sameer Verma wrote:

 True. Activities do not report end times, or whether the frequency
 count is for the number of times a new activity was started, or if
 it was simply a resumption of the previous instance. Walter had
 indicated that there is some movement in this direction to gather end
 times.

 This would be indeed very useful. Is anyone working on implementing these 
 features?

The frequency count is a count of the number of times an instance of
an activity has been opened. The number of new instances can be
determined by the number of instance entries in the Journal.
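
In other words, given one row per open event tagged with the
instance's Journal uid, the two numbers come apart as in this toy
sketch (the 'activity' and 'uid' field names are assumptions):

# Toy illustration: total "opens" versus distinct "instances".
# Field names ('activity', 'uid') are assumptions for the sketch.
from collections import Counter

opens = [
    {'activity': 'TuxMath', 'uid': 'a1'},  # instance a1 opened...
    {'activity': 'TuxMath', 'uid': 'a1'},  # ...and resumed once
    {'activity': 'TuxMath', 'uid': 'a2'},  # a second instance
]

open_counts = Counter(row['activity'] for row in opens)
instances = {}
for row in opens:
    instances.setdefault(row['activity'], set()).add(row['uid'])

print(open_counts['TuxMath'])    # 3 opens in total
print(len(instances['TuxMath'])) # 2 distinct Journal instances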


 Yes, the methods that use the datastore as a source rely on the
 Journal, but the sugar-stats system does not. I believe it collects in
 GNOME as well.

 Have you done any processing, analysis, or visualization of the sugar-stats
 data? Is that something that you are planning to integrate into OLPC 
 Dashboard?

There is an app for letting the user visualize their own stats
(Journal Stats). It could use some love and attention.


 4) The reporting can be done either via visualization, and/or by
 generating periodic reports. The reporting should be specific to the
 person(s) looking at it. No magic there.

 I think that many questions (some of which we already mentioned above) can be
 answered with reports and visualizations, which are not deployment specific. 
 For
 example, those you are targeting with OLPC dashboard.


 How the data will be used remains to be seen. I have not seen it being
 used in any of the projects that I know of. If others have seen/done
 so, it would help to hear from them. I know that in conversations and
 presentations to decision makers, the usual sore point is can you
 show us what you have so far? For Jamaica, we have used a basic
 exploratory approach on the Journal data, corroborated with structured
  interviews with parents, teachers, etc. So, for instance, the data we
 have shows a relatively large frequency of use of TuxMath (even with
 different biases). However, we have qualitative evidence that supports
 both usage of TuxMath and improvement in numeracy (standardized test).
 We can support strong(er) correlation, but cannot really establish
 causality. The three data points put together make for a compelling
 case.

 I think this is a really important point to emphasize: None of these 
 approaches
 to evaluation provides the complete picture, but all of these used in 
 aggregate
 can provide useful insights. Here at OLE Nepal, we already use standardized
 testing to compare students performance before and after the program launch. 
 We
 also follow up with teachers through conversations using surveys on regular
 support visit. I agree with Sameer that supplementing those with statistical
 data can make for a much stronger case.

 Martin




-- 
Walter Bender
Sugar Labs
http://www.sugarlabs.org


Re: [Sugar-devel] The quest for data

2014-01-06 Thread Sameer Verma
On Sun, Jan 5, 2014 at 5:03 PM, Andreas Gros andigro...@gmail.com wrote:
 Great utilization of CouchDB and its views feature! That's definitely
 something we can build on. But more importantly, to make this meaningful, we
 need more data.

I like this approach as well, because the aggregation is offloaded to
CouchDB through views and reduce/rereduce, so we can have a fairly
independent choice of JavaScript-based visualization frontend, be it
Google Charts (https://developers.google.com/chart/) or D3.js
(http://d3js.org/).
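
For reference, the view this relies on can be tiny. A sketch that
installs a design document counting Journal records per activity over
CouchDB's HTTP API; the database name, document fields, and URL are
assumptions, and the map body is a JavaScript string, as CouchDB
expects:

# Install a CouchDB view that counts records per activity.
# Database name, field names, and URL are assumptions for the sketch.
import json

import requests  # third-party: pip install requests

design = {
    "views": {
        "by_activity": {
            # Map/reduce bodies are JavaScript strings, per CouchDB.
            "map": "function (doc) { emit(doc.activity, 1); }",
            "reduce": "_count",  # built-in reducer does the counting
        }
    }
}

resp = requests.put('http://localhost:5984/journal/_design/stats',
                    data=json.dumps(design),
                    headers={'Content-Type': 'application/json'})
resp.raise_for_status()
# Query grouped counts with:
#   GET /journal/_design/stats/_view/by_activity?group=true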

 It's good to know what the activities are that are used most, so one can
 come up with a priority list for improvements, and/or focus developer
 attention.
 CouchDB allows pulling data together from different instances, which should
 make aggregation and comparisons between projects possible. And for projects
 that are not online, the data could be transferred to a USB stick quite
 easily and then uploaded to any other DB instance.


True. CouchDB will allow for aggregation across classes, schools,
districts, etc. Depending on different projects' willingness to
participate, we can certainly go cross-project. Even if these views are
not made public, they will be useful. For instance, I would love to
compare my Jamaica projects with my India projects and my Madagascar
projects.
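
Concretely, that cross-site aggregation can lean on CouchDB's built-in
replication. A one-shot push from a school database to a central one
looks roughly like this; the hostnames and database names are made up:

# One-shot replication from a local CouchDB to a central instance.
# Hostnames and database names are made up for the sketch.
import json

import requests  # third-party: pip install requests

resp = requests.post(
    'http://localhost:5984/_replicate',
    data=json.dumps({
        'source': 'journal_stats',
        'target': 'http://central.example.org:5984/journal_stats',
    }),
    headers={'Content-Type': 'application/json'})
resp.raise_for_status()
# Intermittent links are fine: rerunning resumes where it left off,
# and the same JSON with "continuous": true sets up ongoing sync.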

 Is there a task/todo list somewhere?


Not that I know of, but we can always start one on the Sugar Labs wiki.
Anybody have suggestions?

Sameer

 Andi








 On Fri, Jan 3, 2014 at 11:16 AM, Sameer Verma sve...@sfsu.edu wrote:

 On Fri, Jan 3, 2014 at 4:15 AM, Martin Abente
 martin.abente.lah...@gmail.com wrote:
  Hello Sameer,
 
  I totally agree we should join efforts for a visualization solution,
  but,
  personally, my main concern is still a  basic one: what are the
  important
  questions we should be asking? And how can we answer these questions
  reliably? Even though most of us have experience in deployments and
  their
  needs, we are engineers, not educators, nor decision makers.
 

 Agreed. It would be helpful to have a conversation on what the various
 constituencies need (different from want) to see at their level. The
 child, the parents/guardians, the teacher, the
 principal/administrator, and educational bureaucracy. We should also
 consider the needs of those of us who have to fundraise by showing
 progress of ongoing effort.

  I am sure that most of our collection approaches cover pretty much the
  trivial stuff like: what are they using, when are they using it, how
  often
  they use it, and all kind of things that derive directly from journal
  metadata. Plus the extra insight that comes when considering different
  demographics

 True. Basic frequency counts, such as frequency of use of activities,
 usage by time of day, day of week, and scope of collaboration, are a
 few simple ones. Comparison of one metric vs. another will need more
 thinking. That's where we should talk to the constituents.

 
  But if we could also work together on that (including the trivial
  questions), it would be a good step forward. Once we identify these
  questions and figure out how to answer them, it would be a lot easier
  to think about visualization techniques, etc.

 If the visualization subsystem (the underlying tech pieces) is common
 and flexible, then we can start with a few basic templates and make it
 extensible, so we can all aggregate, collate, and correlate as needed.
 I'll use an example that I'm familiar with. We looked at CouchDB for
 two reasons: 1) it allows for sync over intermittent/on-off
 connections to the Internet, and 2) CouchDB has a views feature, which
 provides selective subsets of the data, and the reduce feature does
 aggregates. The actual visual is done in JavaScript. Here's the
 example Leotis had at the OLPC SF summit
 (http://108.171.173.65:8000/).
 
  What do you guys think?
 

 A great start for a great year ahead!

  Saludos,
  tch.

 cheers,
 Sameer


Re: [Sugar-devel] The quest for data

2014-01-06 Thread Sameer Verma
On Mon, Jan 6, 2014 at 4:50 AM, Walter Bender walter.ben...@gmail.com wrote:
 On Mon, Jan 6, 2014 at 3:48 AM, Martin Dluhos mar...@gnu.org wrote:
 On 4.1.2014 10:44, Sameer Verma wrote:

 True. Activities do not report end times, or whether the frequency
 count is for the number of times a new activity was started, or if
 it was simply a resumption of the previous instance. Walter had
 indicated that there is some movement in this direction to gather end
 times.

 This would be indeed very useful. Is anyone working on implementing these 
 features?

 The frequency count is a count of the number of times an instance of
 an activity has been opened. The number of new instances can be
 determined by the number of instance entries in the Journal.


Walter,
From a conversation we had some time ago, you had pointed out that
TuxMath does not necessarily stick to this regimen. Every time one
resumes an instance, it gets counted as a new instance. I haven't gone
back to verify this, but how consistent is this behavior across
activities? Can this behavior be standardized?


 Yes, the methods that use the datastore as a source rely on the
 Journal, but the sugar-stats system does not. I believe it collects in
 GNOME as well.

 Have you done any processing, analysis, or visualization of the sugar-stats
 data? Is that something that you are planning to integrate into OLPC 
 Dashboard?

 There is an app for letting the user visualize their own stats.
 (Journal Stats). Could use some love and attention.


This is an excellent example of providing meaningful feedback with
respect to the scope. To borrow the zoom metaphor, I see the Journal
stats as being at the level where the scope is local to the child. The
same scope zooms out at the level of the teacher, principal, district
education officer, MoE, etc.

cheers,
Sameer


 4) The reporting can be done either via visualization, and/or by
 generating periodic reports. The reporting should be specific to the
 person(s) looking at it. No magic there.

 I think that many questions (some of which we already mentioned above) can be
 answered with reports and visualizations, which are not deployment specific. 
 For
 example, those you are targeting with OLPC dashboard.


 How the data will be used remains to be seen. I have not seen it being
 used in any of the projects that I know of. If others have seen/done
 so, it would help to hear from them. I know that in conversations and
 presentations to decision makers, the usual sore point is "can you
 show us what you have so far?" For Jamaica, we have used a basic
 exploratory approach on the Journal data, corroborated with structured
 interviews with parents, teachers, etc. So, for instance, the data we
 have shows a relatively large frequency of use of TuxMath (even with
 different biases). However, we have qualitative evidence that supports
 both usage of TuxMath and improvement in numeracy (standardized test).
 We can support strong(er) correlation, but cannot really establish
 causality. The three data points put together make for a compelling
 case.

 I think this is a really important point to emphasize: None of these
 approaches to evaluation provides the complete picture, but all of them
 used in aggregate can provide useful insights. Here at OLE Nepal, we
 already use standardized testing to compare students' performance before
 and after the program launch. We also follow up with teachers through
 conversations and surveys on regular support visits. I agree with Sameer
 that supplementing those with statistical data can make for a much
 stronger case.

 Martin

 ___
 Devel mailing list
 de...@lists.laptop.org
 http://lists.laptop.org/listinfo/devel



 --
 Walter Bender
 Sugar Labs
 http://www.sugarlabs.org


___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] The quest for data

2014-01-06 Thread Sameer Verma
On Mon, Jan 6, 2014 at 12:28 AM, Martin Dluhos mar...@gnu.org wrote:
 On 3.1.2014 04:09, Sameer Verma wrote:
 Happy new year! May 2014 bring good deeds and cheer :-)

 Here's a blog post on the different approaches (that I know of) to data
 gathering across different projects. Do let me know if I missed anything.

 cheers,
 Sameer

 http://www.olpcsf.org/node/204

 Thanks for putting together the summary, Sameer. Here is more information 
 about
 my xo-stats project:

 The project's objective is to determine how XOs are used in Nepalese
 classrooms, but I intend for the implementation to be general enough
 that it can be reused by other deployments as well. Similar to other
 projects you've mentioned, I separated the project into four stages:

 1) collecting data from the XO Journal backups on the schoolserver
 2) extracting the data from the backups and storing it in an appropriate 
 format
 for analysis and visualization
 3) statistically analyzing and visualizing the captured data
 4) formulating recommendations for improving the program based on the 
 analysis.

 Stage 1 is already implemented on both the server side as well as the client
 side, so I first focused on the next step of extracting the data. Initially, I
 wanted to reuse an existing script, but I eventually found that none of them
 were general enough to meet my criteria. One of my goals is to make the script
 work on any version of Sugar.

 Thus, I have been working on process_journal_stats.py, which takes a '/users'
 directory of XO Journal backups as input, pulls out the Journal metadata,
 and writes it out as CSV or JSON.

 Journal backups can be in a variety of formats depending on the version
 of Sugar. The script currently supports the backup format present in Sugar
 versions 0.82 - 0.88, since the laptops distributed in Nepal are XO-1s
 running Sugar 0.82. I am planning to add support for later versions of
 Sugar in the next version of the script.

 The script currently supports two ways to output statistical data. To produce
 all statistical data from the Journal, one row per Journal record:

 process_journal_stats.py all

 To extract statistical data about the use of activities on the system, use:

 process_journal_stats.py activity

 The full documentation with all the options is described in the README at:

 https://github.com/martasd/xo-stats
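 To give a feel for the extraction step, here is a minimal sketch of the
 same idea (not the actual xo-stats code). It assumes a hypothetical backup
 layout in which each Journal entry keeps its metadata as one small file
 per key inside a metadata/ directory; the real script has to cope with
 several such formats:

   import csv
   import os
   import sys

   # Metadata keys to export; real Journal entries carry more.
   KEYS = ("activity", "timestamp", "title")

   def entries(users_dir):
       # Yield one {key: value} dict per Journal entry under users_dir.
       for root, dirs, files in os.walk(users_dir):
           if os.path.basename(root) == "metadata" and files:
               meta = {}
               for name in files:
                   with open(os.path.join(root, name)) as f:
                       meta[name] = f.read().strip()
               yield meta

   def to_csv(users_dir, out_path):
       with open(out_path, "w", newline="") as out:
           writer = csv.DictWriter(out, fieldnames=KEYS)
           writer.writeheader()
           for meta in entries(users_dir):
               writer.writerow({k: meta.get(k, "") for k in KEYS})

   if __name__ == "__main__":
       to_csv(sys.argv[1], "journal_stats.csv")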

 One challenge of the project has been determining how much data processing
 to do in the Python script and what to leave for the data analysis and
 visualization tools later in the workflow. For now, I have stopped adding
 features to the script and am evaluating the most appropriate tools for
 visualizing the data.

 Here are some of the questions I am intending to answer with the 
 visualizations
 and analysis:

 * How many times do installed activities get used? How does activity use
 differ over time?
 * Which activities are children using to create files? What kind of files are
 being created?
 * Which activities are being launched in share-mode and how often?
 * Which part of the day do children play with the activities?
 * How does the set of activities used evolve as children age?
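 As an illustration, the time-of-day question above can be answered from a
 CSV like the one sketched earlier with nothing more than the standard
 library; the "timestamp" column (Unix epoch seconds) is an assumption
 about the exported metadata:

   import csv
   from collections import Counter
   from datetime import datetime

   def launches_by_hour(csv_path):
       # Count Journal records per hour of day from their timestamps.
       hours = Counter()
       with open(csv_path) as f:
           for row in csv.DictReader(f):
               ts = row.get("timestamp")
               if ts:
                   hours[datetime.fromtimestamp(float(ts)).hour] += 1
       return hours

   for hour, count in sorted(launches_by_hour("journal_stats.csv").items()):
       print("%02d:00  %d" % (hour, count))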

 I am also going to look at how answers to these questions vary from class
 to class, school to school, and region to region.

 As Martin Abente and Sameer mentioned above, our work needs to be informed
 by discussions with the stakeholders: children, educators, parents, school
 administrators, etc. We do have educational experts among the staff at OLE
 who have worked with more than 50 schools altogether, and I will be talking
 to them as I look beyond answering the obvious questions.


We should start a list on the wiki to collate this information. I'll
get someone from Jamaica to provide some feedback as well.

 For visualization, I have explored using LibreOffice and SOFA, but neither
 of those was flexible enough to allow customization of the output beyond a
 few rudimentary options, so I started looking at various Javascript
 libraries, which are much more powerful. Currently, I am experimenting with
 Google Charts, which I found the easiest to get started with. If I run into
 limitations with Google Charts in the future, others on my list are the
 InfoVis Toolkit (http://philogb.github.io/jit) and HighCharts
 (http://highcharts.com). Then there is also D3.js, but that's a bigger
 animal.

Keep in mind that if you want to visualize at the school's local
XS[CE], you may have to rely on a locally hosted JS library instead of
an online one.


 Alternatively or perhaps in parallel, I am also willing to join efforts to
 improve the OLPC Dashboard, which is trying to answer very similar questions 
 to
 mine.

I'll ping Leotis (cc'd) to push his dashboard code to github, so we
don't reinvent.

cheers,
Sameer


 I am looking forward to collaborating with everyone who is interested in
 exploring ways to analyze and visualize OLPC/Sugar data in a 

Re: [Sugar-devel] The quest for data

2014-01-06 Thread Sameer Verma
On Mon, Jan 6, 2014 at 12:04 PM, Sameer Verma sve...@sfsu.edu wrote:
 On Mon, Jan 6, 2014 at 12:28 AM, Martin Dluhos mar...@gnu.org wrote:
 On 3.1.2014 04:09, Sameer Verma wrote:
 Happy new year! May 2014 bring good deeds and cheer :-)

 Here's a blog post on the different approaches (that I know of) to data
 gathering across different projects. Do let me know if I missed anything.

 cheers,
 Sameer

 http://www.olpcsf.org/node/204

 Thanks for putting together the summary, Sameer. Here is more information 
 about
 my xo-stats project:

 The project's objective is to determine how XOs are used in Nepalese
 classrooms, but I intend for the implementation to be general enough
 that it can be reused by other deployments as well. Similar to other
 projects you've mentioned, I separated the project into four stages:

 1) collecting data from the XO Journal backups on the schoolserver
 2) extracting the data from the backups and storing it in an appropriate 
 format
 for analysis and visualization
 3) statistically analyzing and visualizing the captured data
 4) formulating recommendations for improving the program based on the 
 analysis.

 Stage 1 is already implemented on both the server side as well as the client
 side, so I first focused on the next step of extracting the data. Initially, 
 I
 wanted to reuse an existing script, but I eventually found that none of them
 were general enough to meet my criteria. One of my goals is to make the 
 script
 work on any version of Sugar.

 Thus, I have been working on process_journal_stats.py, which takes a '/users'
 directory of XO Journal backups as input, pulls out the Journal metadata,
 and writes it out as CSV or JSON.

 Journal backups can be in a variety of formats depending on the version
 of Sugar. The script currently supports the backup format present in Sugar
 versions 0.82 - 0.88, since the laptops distributed in Nepal are XO-1s
 running Sugar 0.82. I am planning to add support for later versions of
 Sugar in the next version of the script.

 The script currently supports two ways to output statistical data. To produce
 all statistical data from the Journal, one row per Journal record:

 process_journal_stats.py all

 To extract statistical data about the use of activities on the system, use:

 process_journal_stats.py activity

 The full documentation with all the options is described in the README at:

 https://github.com/martasd/xo-stats

 One challenge of the project has been determining how much data processing
 to do in the Python script and what to leave for the data analysis and
 visualization tools later in the workflow. For now, I have stopped adding
 features to the script and am evaluating the most appropriate tools for
 visualizing the data.

 Here are some of the questions I am intending to answer with the 
 visualizations
 and analysis:

 * How many times do installed activities get used? How does activity use
 differ over time?
 * Which activities are children using to create files? What kind of files are
 being created?
 * Which activities are being launched in share-mode and how often?
 * Which part of the day do children play with the activities?
 * How does the set of activities used evolve as children age?

 I am also going to look at how answers to these questions vary from class
 to class, school to school, and region to region.

 As Martin Abente and Sameer mentioned above, our work needs to be informed
 by discussions with the stakeholders: children, educators, parents, school
 administrators, etc. We do have educational experts among the staff at OLE
 who have worked with more than 50 schools altogether, and I will be talking
 to them as I look beyond answering the obvious questions.


 We should start a list on the wiki to collate this information. I'll
 get someone from Jamaica to provide some feedback as well.

 For visualization, I have explored using LibreOffice and SOFA, but neither
 of those was flexible enough to allow customization of the output beyond a
 few rudimentary options, so I started looking at various Javascript
 libraries, which are much more powerful. Currently, I am experimenting with
 Google Charts, which I found the easiest to get started with. If I run into
 limitations with Google Charts in the future, others on my list are the
 InfoVis Toolkit (http://philogb.github.io/jit) and HighCharts
 (http://highcharts.com). Then there is also D3.js, but that's a bigger
 animal.

 Keep in mind that if you want to visualize at the school's local
 XS[CE], you may have to rely on a locally hosted JS library instead of
 an online one.


 Alternatively or perhaps in parallel, I am also willing to join efforts to
 improve the OLPC Dashboard, which is trying to answer very similar questions 
 to
 mine.

 I'll ping Leotis (cc'd) to push his dashboard code to github, so we
 don't reinvent.


For those who haven't seen the prototype that Leotis has (demo'd at

Re: [Sugar-devel] The quest for data

2014-01-06 Thread Walter Bender
On Mon, Jan 6, 2014 at 3:00 PM, Sameer Verma sve...@sfsu.edu wrote:
 On Mon, Jan 6, 2014 at 4:50 AM, Walter Bender walter.ben...@gmail.com wrote:
 On Mon, Jan 6, 2014 at 3:48 AM, Martin Dluhos mar...@gnu.org wrote:
 On 4.1.2014 10:44, Sameer Verma wrote:

 True. Activities do not report end times, nor whether the frequency
 count reflects the number of times a new activity was started or
 simply a resumption of a previous instance. Walter had indicated that
 there is some movement in this direction to gather end times.

 This would indeed be very useful. Is anyone working on implementing these
 features?

 The frequency count is a count of the number of times an instance of
 an activity has been opened. The number of new instances can be
 determined by the number of instance entries in the Journal.


 Walter,
 From a conversation we had some time ago, you had pointed out that
 TuxMath does not necessarily stick to this regimen. Every time one
 resumes an instance, it gets counted as a new instance. I haven't gone
 back to verify this, but how consistent is this behavior across
 activities? Can this behavior be standardized?

I am not sure about TuxMath (or Tuxpaint, Scratch, or Etoys), none of
which are native Sugar activities. But the behavior I described is
standard across native Sugar activities.

-walter

 Yes, the methods that use the datastore as a source rely on the
 Journal, but the sugar-stats system does not. I believe it collects in
 GNOME as well.

 Have you done any processing, analysis, or visualization of the sugar-stats
 data? Is that something that you are planning to integrate into OLPC 
 Dashboard?

 There is an app for letting the user visualize their own stats
 (Journal Stats). It could use some love and attention.


 This is an excellent example of providing meaningful feedback with
 respect to scope. To borrow the zoom metaphor, I see the Journal
 stats as sitting at the level where the scope is local to the child.
 The same scope zooms out at the level of the teacher, principal,
 district education officer, MoE, etc.

 cheers,
 Sameer


 4) The reporting can be done either via visualization, and/or by
 generating periodic reports. The reporting should be specific to the
 person(s) looking at it. No magic there.

 I think that many questions (some of which we already mentioned above) can
 be answered with reports and visualizations that are not deployment
 specific. For example, those you are targeting with OLPC dashboard.


 How the data will be used remains to be seen. I have not seen it being
 used in any of the projects that I know of. If others have seen/done
 so, it would help to hear from them. I know that in conversations and
 presentations to decision makers, the usual sore point is "can you
 show us what you have so far?" For Jamaica, we have used a basic
 exploratory approach on the Journal data, corroborated with structured
 interviews with parents, teachers, etc. So, for instance, the data we
 have shows a relatively large frequency of use of TuxMath (even with
 different biases). However, we have qualitative evidence that supports
 both usage of TuxMath and improvement in numeracy (standardized test).
 We can support strong(er) correlation, but cannot really establish
 causality. The three data points put together make for a compelling
 case.

 I think this is a really important point to emphasize: None of these
 approaches to evaluation provides the complete picture, but all of them
 used in aggregate can provide useful insights. Here at OLE Nepal, we
 already use standardized testing to compare students' performance before
 and after the program launch. We also follow up with teachers through
 conversations and surveys on regular support visits. I agree with Sameer
 that supplementing those with statistical data can make for a much
 stronger case.

 Martin

 ___
 Devel mailing list
 de...@lists.laptop.org
 http://lists.laptop.org/listinfo/devel



 --
 Walter Bender
 Sugar Labs
 http://www.sugarlabs.org





-- 
Walter Bender
Sugar Labs
http://www.sugarlabs.org
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] The quest for data

2014-01-06 Thread Christophe Guéret
Dear Sameer, all,

That's a very interesting blog post and discussion. I agree that collecting
data is important, but knowing what questions that data is meant to answer
is even more so. If you need help with that last bit, I could propose using
the journal data as a use-case for the KnowEscape project
(http://knowescape.org/). This project is about getting insights out of
large knowledge spaces via visualisation. There is a wide (European)
community of experts behind it, coming from different research fields
(humanities, physics, computer science, ...). Something useful could maybe
come out...

I would also like to refer you to the ERS project, which we have now almost
finished. This project is an extension of the ideas behind SemanticXO, which
some of you may remember. We developed a decentralised entity registry
system with the XO as a primary platform for coding and testing. There is a
description of the implementation and links to code on
http://ers-devs.github.io/ers/ . We also had a poster at OLPC SF (thanks for
that!).

In a nutshell, ERS creates global and shared knowledge spaces through a
series of statements. For instance, "Amsterdam is in the Netherlands" is a
statement made about the entity Amsterdam, relating it to the entity the
Netherlands. Every user of ERS may want to either de-reference an entity
(e.g., asking for all pieces of information about Amsterdam) or contribute
to the content of the shared space by adding new statements. This is made
possible via "Contributor" nodes, one of the three types of node defined in
our system. Contributors can interact freely with the knowledge base. They
themselves take care of publishing their own statements but cannot edit
third-party statements. Every set of statements about a given entity
contributed by a single author is wrapped into a document in CouchDB to
avoid conflicts and enable provenance tracking. Every single XO is a
Contributor. Two Contributors in a closed P2P network can freely create and
share Linked Open Data. In order for them to share data with another closed
group of Contributors, we have "Bridges". A Bridge is a relay between two
closed networks using the internet or any other form of direct connection
to share data. Two closed communities, for example two schools, willing to
share data can each set up one Bridge and connect these two nodes to each
other. The Bridges will then collect and exchange data coming from the
Contributors. These Bridges are not Contributors themselves; they are just
used to ship data (named graphs) around and can be shut down or replaced
without any data loss. Lastly, the third component we define in our
architecture is the "Aggregator". This is a special node every Bridge may
push content to and get updated content from. As its name suggests, an
Aggregator is used to aggregate entity descriptions that are otherwise
scattered among all the Contributors. When deployed, an Aggregator can be
used to access and expose the global content of the knowledge space or a
subset thereof.
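To make the wrapping concrete, a single such document might look roughly
like this; the field names are illustrative guesses, not the actual ERS
schema:

  # All statements about one entity from one contributor, as one CouchDB doc.
  statement_doc = {
      "_id": "urn:ers:amsterdam#xo-42",      # entity id plus contributor id
      "entity": "urn:ers:amsterdam",
      "author": "xo-42",                     # the contributing XO
      "statements": {
          "is_in": ["urn:ers:netherlands"],  # "Amsterdam is in the Netherlands"
          "label": ["Amsterdam"],
      },
  }

Because each author only ever writes their own documents, third-party
statements cannot collide and provenance comes for free.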

One could use ERS to store (part of) the content of the Journal on an XO
(Contributor), cluster information at the school level (a Bridge put on the
XS), and provide higher-level analysis (Aggregator). The best things about
ERS, I think, are that:
* It can store and share any data that consists of property/value pairs
about a given thing identified by a unique identifier
* It is offline by default; all the upper-level components, and the
connectivity to them, are optional
* It is conservative in terms of bandwidth used

Graphs could be created at every level to get some statistics on the XO, on
the XS, and at a more global level, all potentially using the same code,
since the data is always stored using the same model (a variant of
JSON-LD).

We are now finalising a small social-networking activity to demo and test
ERS. You can easily play with it using the virtual images we put on the
site. Here is a video showing it running: https://vimeo.com/81796228

Please have a look and let us know what you think of it :-) The project is
still funded for a bit less than three months, and we would really like it
to be useful for the OLPC community (that's why we targeted the XO), so
don't hesitate to ask for missing features!

Cheers,
Christophe

On 6 January 2014 02:03, Andreas Gros andigro...@gmail.com wrote:

 Great utilization of CouchDB and its "views" feature! That's definitely
 something we can build on. But more importantly, to make this meaningful,
 we need more data.
 It's good to know which activities are used most, so one can come up with
 a priority list for improvements and/or focus developer attention.
 CouchDB makes it possible to pull data together from different instances,
 which should make aggregation and comparisons between projects possible.
 And for projects that are not online, the data could be transferred to a
 USB stick quite easily and then uploaded to any other DB instance.

 Is there a task/todo list somewhere?

Re: [Sugar-devel] The quest for data

2014-01-05 Thread Andreas Gros
Great utilization of CouchDB and its "views" feature! That's definitely
something we can build on. But more importantly, to make this meaningful,
we need more data.
It's good to know which activities are used most, so one can come up with
a priority list for improvements and/or focus developer attention.
CouchDB makes it possible to pull data together from different instances,
which should make aggregation and comparisons between projects possible.
And for projects that are not online, the data could be transferred to a
USB stick quite easily and then uploaded to any other DB instance.
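For instance, pulling a remote (or USB-copied) database into a central one
is a single call against CouchDB's standard /_replicate endpoint; the
server and database names below are hypothetical:

  import json
  import requests

  def pull(source, target, server="http://localhost:5984"):
      # Ask the local CouchDB server to replicate source into target,
      # creating the target database if it does not exist yet.
      body = {"source": source, "target": target, "create_target": True}
      r = requests.post(server + "/_replicate", data=json.dumps(body),
                        headers={"Content-Type": "application/json"})
      r.raise_for_status()
      return r.json()

  # e.g. pull("http://school-xs.local:5984/journal_stats", "journal_stats_all")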

Is there a task/todo list somewhere?

Andi

 On Fri, Jan 3, 2014 at 11:16 AM, Sameer Verma sve...@sfsu.edu wrote:

 On Fri, Jan 3, 2014 at 4:15 AM, Martin Abente
 martin.abente.lah...@gmail.com wrote:
  Hello Sameer,
 
  I totally agree we should join efforts for a visualization solution, but,
  personally, my main concern is still a basic one: what are the important
  questions we should be asking? And how can we answer these questions
  reliably? Even though most of us have experience in deployments and their
  needs, we are engineers, not educators nor decision makers.
 

 Agreed. It would be helpful to have a conversation on what the various
 constituencies need (different from want) to see at their level. The
 child, the parents/guardians, the teacher, the
 principal/administrator, and educational bureaucracy. We should also
 consider the needs of those of us who have to fundraise by showing
 progress of ongoing effort.

  I am sure that most of our collection approaches cover pretty much the
  trivial stuff like: what are they using, when are they using it, how
  often they use it, and all kinds of things that derive directly from
  journal metadata. Plus the extra insight that comes when considering
  different demographics.

 True. Basic frequency counts such as frequency of use of activities,
 usage by time of day, day of week, and scope of collaboration are a few
 simple ones. Comparing one metric against another will need more
 thinking. That's where we should talk to the constituents.

 
  But if we could also work together on that (including the trivial
  questions), it would be a good step forward. Once we identify these
  questions and figure out how to answer them, it would be a lot easier
  to think about visualization techniques, etc.

 If the visualization subsystem (underlying tech pieces) is common and
 flexible, then we can start with a few basic templates and make it
 extensible, so we can all aggregate, collate, and correlate as needed.
 I'll use an example that I'm familiar with. We looked at CouchDB for
 two reasons: 1) it allows for sync over intermittent/on-off connections
 to the Internet, and 2) CouchDB has a "views" feature which provides
 selective subsets of the data, and the "reduce" feature does aggregates.
 The actual visual is done in Javascript. Here's the example Leotis had
 at the OLPC SF summit (http://108.171.173.65:8000/).
 
  What do you guys think?
 

 A great start for a great year ahead!

  Saludos,

 cheers,
  tch.
 Sameer
 ___
 Sugar-devel mailing list
 Sugar-devel@lists.sugarlabs.org
 http://lists.sugarlabs.org/listinfo/sugar-devel

___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] The quest for data

2014-01-03 Thread Martin Abente
Hello Sameer,

I totally agree we should join efforts for a visualization solution, but,
personally, my main concern is still a basic one: what are the important
questions we should be asking? And how can we answer these questions
reliably? Even though most of us have experience in deployments and their
needs, we are engineers, not educators nor decision makers.

I am sure that most of our collection approaches cover pretty much the
trivial stuff like: what are they using, when are they using it, how often
they use it, and all kinds of things that derive directly from journal
metadata. Plus the extra insight that comes when considering different
demographics.

But if we could also work together on that (including the trivial
questions), it would be a good step forward. Once we identify these
questions and figure out how to answer them, it would be a lot easier to
think about visualization techniques, etc.

What do you guys think?

Saludos,
tch.
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] The quest for data

2014-01-03 Thread Sameer Verma
On Fri, Jan 3, 2014 at 4:15 AM, Martin Abente
martin.abente.lah...@gmail.com wrote:
 Hello Sameer,

 I totally agree we should join efforts for a visualization solution, but,
 personally, my main concern is still a basic one: what are the important
 questions we should be asking? And how can we answer these questions
 reliably? Even though most of us have experience in deployments and their
 needs, we are engineers, not educators nor decision makers.


Agreed. It would be helpful to have a conversation on what the various
constituencies need (different from want) to see at their level. The
child, the parents/guardians, the teacher, the
principal/administrator, and educational bureaucracy. We should also
consider the needs of those of us who have to fundraise by showing
progress of ongoing effort.

 I am sure that most of our collection approaches cover pretty much the
 trivial stuff like: what are they using, when are they using it, how often
 they use it, and all kinds of things that derive directly from journal
 metadata. Plus the extra insight that comes when considering different
 demographics.

True. Basic frequency counts such as frequency of use of activities, usage
by time of day, day of week, and scope of collaboration are a few simple
ones. Comparing one metric against another will need more thinking. That's
where we should talk to the constituents.


 But if we could also work together on that (including the trivial
 questions), it would be a good step forward. Once we identify these
 questions and figure out how to answer them, it would be a lot easier to
 think about visualization techniques, etc.

If the visualization subsystem (underlying tech pieces) is common and
flexible, then we can start with a few basic templates and make it
extensible, so we can all aggregate, collate, and correlate as needed.
I'll use an example that I'm familiar with. We looked at CouchDB for
two reasons: 1) it allows for sync over intermittent/on-off connections
to the Internet, and 2) CouchDB has a "views" feature which provides
selective subsets of the data, and the "reduce" feature does aggregates.
The actual visual is done in Javascript. Here's the example Leotis had
at the OLPC SF summit (http://108.171.173.65:8000/).

 What do you guys think?


A great start for a great year ahead!

 Saludos,

cheers,
 tch.
Sameer
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] The quest for data

2014-01-03 Thread James Cameron
Metrics can direct action.

Unfortunately, in the absence of meaningful metrics, the meaningless
metrics will also direct action.

One of the assertions inherent in OLPC is that merely using a device
can have an effect on a brain, regardless of what activities are used.

In the data listed, I haven't seen any use of more fundamental
measurements like how long a device is used for.  OLPC's builds
have a power log.  This captures time spent using a device.
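As a sketch of the kind of reduction a power log enables, suppose each line
carried an epoch timestamp and an on/off event; this two-column format is
purely hypothetical, not OLPC's actual power log layout:

  def seconds_in_use(log_lines):
      # Sum the time between each "on" event and the following "off".
      total, started = 0.0, None
      for line in log_lines:
          stamp, event = line.split()
          if event == "on":
              started = float(stamp)
          elif event == "off" and started is not None:
              total += float(stamp) - started
              started = None
      return total

  with open("power.log") as f:  # hypothetical file name
      print(seconds_in_use(f))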

It is especially relevant for a device that might also be used in
Gnome rather than Sugar.  Harvest seems to have arisen out of the
availability of the Journal.

On the other hand, use of metrics tends towards standardised testing,
with the ultimate implementation being an examination that must be
completed each time before using a device for learning.  Imagine
having to delay learning!

I don't like the idea of standardised testing.  I've seen the damage
that it does.  Sir Ken Robinson had a few things to say about that, in
his talk "Changing Education Paradigms".

-- 
James Cameron
http://quozl.linux.org.au/
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] The quest for data

2014-01-03 Thread Sameer Verma
On Fri, Jan 3, 2014 at 2:23 PM, James Cameron qu...@laptop.org wrote:
 Metrics can direct action.

 Unfortunately, in the absence of meaningful metrics, the meaningless
 metrics will also direct action.


True. In fact, the reliability of the whole thing depends on the
reliability of the generated data. For instance, if the time stamp is
corrupt, then so is the analysis, unless the data are treated for that
bias.

 One of the assertions inherent in OLPC is that merely using a device
 can have an effect on a brain, regardless of what activities are used.

Brain, perhaps. I'm leaning more on the learning side ;-)


 In the data listed, I haven't seen any use of more fundamental
 measurements like how long a device is used for.  OLPC's builds
 have a power log.  This captures time spent using a device.

True. Activities do not report end times, nor whether the frequency
count reflects the number of times a new activity was started or simply
a resumption of the previous instance. Walter had indicated that there
is some movement in this direction to gather end times. The sugar-stats
system does record end times. We still have an assumption (to be
addressed by the researcher) that x number of seconds actually leads to
a delta of y in learning. Usually we establish correlation, and support
a case for causality with proxy observations.


 It is especially relevant for a device that might also be used in
 Gnome rather than Sugar.  Harvest seems to have arisen out of the
 availability of the Journal.


Yes, the methods that use the datastore as a source rely on the
Journal, but the sugar-stats system does not. I believe it collects in
GNOME as well.

The way I see it, there are four parts to this supply chain:
measurement, collection, analysis, and reporting (see
http://www.educause.edu/ero/article/penetrating-fog-analytics-learning-and-education).

1) The data has to be generated at the source (Sugar activity or dbus)
with the required granularity and reliability. So, for instance,
TurtleArt can record the types of blocks used, or Maze can record the
number of turns (a minimal sketch follows after point 4). This will
vary by activity. We also have to be mindful of reliability issues, for
instance internal clock variation affecting timestamps.

2) We need a way to collect data on an ongoing basis on the laptop.
This may be in the Journal datastore, or in the RRD file, as in the
case of sugar-stats. We then continue the collection either by
aggregating the data at the XS/XSCE and/or a central location (as with
the Harvest system) so that the data can be analyzed.

3) The analysis stage can be done with the raw data (basic statistics,
correlation, qualitative), or it can be aggregated (as with the
Jamaica CouchDB system doing basic stats) and made ready for
reporting. Some of this may be automated, but to go beyond PowerPoint
pie charts, it's really on a case-by-case basis.

4) The reporting can be done either via visualization, and/or by
generating periodic reports. The reporting should be specific to the
person(s) looking at it. No magic there.

Now, of course, if the data at the source is corrupt, it may be
reflected in the report. There are ways to address missing data and
biases, but it would be better to have a reliable way to generate data
at the source.
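To make point 1 concrete, here is a minimal sketch of an activity recording
one extra measurement in its own Journal entry. The activity name and the
'turns' key are invented for illustration, against the Sugar 0.82-era
Python API:

  from sugar.activity import activity

  class MazeActivity(activity.Activity):

      def __init__(self, handle):
          activity.Activity.__init__(self, handle)
          self.turns = 0  # incremented by the game logic elsewhere

      def write_file(self, file_path):
          # Journal metadata values are strings; a custom key like this
          # rides along with the normal backups and can be swept up by the
          # extraction scripts discussed elsewhere in this thread.
          self.metadata['turns'] = str(self.turns)
          # A real activity would also serialize its state to file_path.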

 On the other hand, use of metrics tends towards standardised testing,
 with the ultimate implementation being an examination that must be
 completed each time before using a device for learning.  Imagine
 having to delay learning!

How the data will be used remains to be seen. I have not seen it being
used in any of the projects that I know of. If others have seen/done
so, it would help to hear from them. I know that in conversations and
presentations to decision makers, the usual sore point is "can you
show us what you have so far?" For Jamaica, we have used a basic
exploratory approach on the Journal data, corroborated with structured
interviews with parents, teachers, etc. So, for instance, the data we
have shows a relatively large frequency of use of TuxMath (even with
different biases). However, we have qualitative evidence that supports
both usage of TuxMath and improvement in numeracy (standardized test).
We can support strong(er) correlation, but cannot really establish
causality. The three data points put together make for a compelling
case. As an aside, I did encounter a clever question in one of the
presentations: "What's constructivist about TuxMath?" That's a
discussion for another thread :-)


 I don't like the idea of standardised testing.  I've seen the damage
 that it does.  Sir Ken Robinson had a few things to say about that, in
 his talk Changing Education Paradigms.


It plays a role in the education-industrial complex, and it is
difficult to entirely walk away from it, but yes, YMMV.

cheers,
Sameer

 --
 James Cameron
 http://quozl.linux.org.au/
 ___
 Sugar-devel mailing list
 Sugar-devel@lists.sugarlabs.org
 

[Sugar-devel] The quest for data

2014-01-02 Thread Sameer Verma
Happy new year! May 2014 bring good deeds and cheer :-)

Here's a blog post on the different approaches (that I know of) to data
gathering across different projects. Do let me know if I missed anything.

cheers,
Sameer

http://www.olpcsf.org/node/204
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel