Re: [Dspace-tech] Report of items including usage statistics
We don’t have thumbnails, so my life might be simpler than yours. We also decided that we are not particularly interested in item views but rather in bitstream downloads. So in our case we can simply go for bitstream events, facet by bitstream ids, and list according to counts to get the most popular bitstreams. Of course, going from bitstream id to item and metadata is the trick.

I have JRuby code that interacts with the DSpace core objects. I use it for various purposes, e.g. to create collections according to some template, create users and add them to groups, print reports on the collection/community hierarchy including authorization settings, or use JRuby's interactive console to poke around ...

One of the bigger scripts I wrote is a statistics script that reports on collection views, item views, and bitstream downloads in one or more communities; usually I do the top 20 or so, but the script can also dump all downloaded bitstreams. It produces a tab-separated list which includes bitstream id, item name, enclosing collection and community handles, … It can be parameterized with time slots of interest.

Ruby might not be your thing, but if you want to have a look, see https://github.com/akinom/dscriptor/tree/master/statistics

Monika

--
Monika Mevenkamp
Digital Repository Infrastructure Developer
Phone: 609-258-4161
333C 701 Carnegie, Princeton University, Princeton, NJ 08544
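The bitstream-download ranking Monika describes above (facet bitstream events by bitstream id within a time slot, list by count, emit tab-separated lines) might look roughly like the following. This is a Python sketch, not her JRuby script. The Solr field names (`type:0` for bitstream events, `time`, `id`), the sample counts, the handles, and both helper names are assumptions made for illustration; check them against your own statistics core. Resolving a bitstream id to its item and handles, which she calls the trick, is represented here by a pre-built lookup table.

```python
from urllib.parse import urlencode


def download_facet_query(start, end, top_n=20):
    """Build Solr parameters that count bitstream events per bitstream id
    within a time slot. Field names are assumptions, not a confirmed schema."""
    params = [
        ("q", "type:0"),                      # assumed: type 0 marks bitstream events
        ("fq", f"time:[{start} TO {end}]"),   # assumed: events carry a 'time' field
        ("rows", "0"),                        # only facet counts are needed
        ("facet", "true"),
        ("facet.field", "id"),
        ("facet.limit", str(top_n)),
        ("facet.sort", "count"),
        ("wt", "json"),
    ]
    return urlencode(params)


def top_bitstreams_tsv(counts, lookup, top_n=20):
    """Rank bitstreams by download count and emit tab-separated lines.

    `lookup` maps bitstream id -> (item name, collection handle, community handle);
    building it from DSpace is the hard part and is not shown here."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    lines = []
    for bitstream_id, count in ranked:
        item, coll, comm = lookup.get(bitstream_id, ("?", "?", "?"))
        lines.append("\t".join([bitstream_id, str(count), item, coll, comm]))
    return lines


# Hypothetical sample data standing in for a parsed Solr facet result.
query = download_facet_query("2015-01-01T00:00:00Z", "2015-08-01T00:00:00Z")
counts = {"501": 12, "502": 30, "503": 12}
lookup = {
    "501": ("Thesis A", "123456789/10", "123456789/1"),
    "502": ("Report B", "123456789/11", "123456789/1"),
}
rows = top_bitstreams_tsv(counts, lookup, top_n=2)
print(query)
print("\n".join(rows))
```

Ties are broken by bitstream id so that repeated runs produce the same order.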
Re: [Dspace-tech] Report of items including usage statistics
Do you have a preferred technology stack for the solution? I have some PHP code that may be useful.

The total views, total downloads, and owning collection id can be pulled from the Solr statistics repository.

- Query Solr for item views, facet by item id
- Query Solr for bitstream downloads, facet by item id (do you want to include thumbnail views?)

The title, author, abstract, and date created are probably easiest to pull from the database.

Here are 2 approaches that would work.

1. Query the database for all items. As you iterate over the SQL results, query Solr for the view/download counts.
2. Run the faceted Solr queries by item number. As you iterate over the XML/JSON results, query the database for supplemental metadata.

Terry

On Tue, Aug 4, 2015 at 8:29 AM, Anthony Petryk anthony.pet...@uottawa.ca wrote:

Hello,

We’re interested in creating a report (spreadsheet) of items, which includes basic metadata AND associated usage statistics. For instance:

- Title
- Author
- Abstract
- Date Created
- Owning Collection
- Total Views (since accessioned)
- Total Downloads (all bitstreams)

What’s the best way to do this?

Best,
Anthony

--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://www.library.georgetown.edu/lit/code
425-298-5498 (Seattle, WA)
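Approach 1 from Terry's message above could be sketched like this. It is a minimal Python illustration, not Terry's PHP code. It assumes Solr's default JSON facet layout (a flat list alternating value and count) and assumes that item views are faceted on an `id` field and bitstream downloads on an `owningItem` field; those field names, the sample rows, and the helper names are hypothetical. One deliberate variation: rather than issuing a Solr query per database row, the sketch parses each faceted result once and looks the counts up while iterating over the rows.

```python
import csv
import io


def facet_counts(solr_response, field):
    """Turn Solr's flat facet list [value, count, value, count, ...] into a dict."""
    flat = solr_response["facet_counts"]["facet_fields"][field]
    return {str(value): count for value, count in zip(flat[0::2], flat[1::2])}


def build_report(items, views, downloads):
    """Iterate over database rows and attach usage counts to each one."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Item ID", "Title", "Owning Collection",
                     "Total Views", "Total Downloads"])
    for item in items:
        key = str(item["item_id"])
        writer.writerow([
            key,
            item["title"],
            item["collection"],
            views.get(key, 0),      # items never viewed have no facet entry
            downloads.get(key, 0),  # same for items never downloaded
        ])
    return out.getvalue()


# Hypothetical sample data standing in for real Solr responses and SQL rows.
view_response = {"facet_counts": {"facet_fields": {"id": ["101", 42, "102", 7]}}}
download_response = {"facet_counts": {"facet_fields": {"owningItem": ["101", 13]}}}
items = [
    {"item_id": 101, "title": "Thesis A", "collection": "Theses"},
    {"item_id": 102, "title": "Report B", "collection": "Reports"},
]

views = facet_counts(view_response, "id")
downloads = facet_counts(download_response, "owningItem")
report = build_report(items, views, downloads)
print(report)
```

Defaulting missing counts to 0 matters because a facet only lists ids that have at least one recorded event, while the database query returns every item.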
Re: [Dspace-tech] Report of items including usage statistics
Anthony,

Check out the following code. There might be some useful code to clone.

- Statistics report tool (this code reports collection by collection): https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/web/stats/qcHierarchyStats.php
- Populate a PHP array with the collection list using SQL: https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/web/community.php
- Package a Solr query with an ajax call: https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/web/stats/qcHierarchyStats.php#L41-L50

Here is a wiki page describing our statistics reports.

- https://github.com/Georgetown-University-Libraries/batch-tools/wiki/Statistics-reporting

Good luck,
Terry
Re: [Dspace-tech] Report of items including usage statistics
Hi Terry,

PHP works for me. :) Thanks for your overview of the 2 approaches. The first one looks easier so I’ll try that one.

Best,
Anthony
Re: [Dspace-tech] Report of items including usage statistics
Thanks very much, Terry.

Anthony

--
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette