Re: [Dspace-tech] Report of items including usage statistics

2015-08-04 Thread Monika C. Mevenkamp
We don’t have thumbnails, so my life might be simpler than yours.

We also decided that we are not particularly interested in item views but 
rather in bitstream downloads. So in our case we can simply query for bitstream 
events, facet by bitstream id, and sort by count to get the most 
popular bitstreams. Of course, going from bitstream id to item and metadata is 
the trick.
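
The facet step described above could be sketched roughly as follows in Python. The field names (`type`, `id`, `isBot`) and the use of `type:0` for bitstream events are assumptions about the DSpace Solr statistics schema, so check them against your own statistics core. The helper names are made up for illustration. The sketch builds the query string and converts Solr's flat JSON facet list into (bitstream id, count) pairs.

```python
from urllib.parse import urlencode


def bitstream_facet_query(limit=20):
    """Build the query string for a 'top downloaded bitstreams' facet request."""
    params = {
        "q": "type:0",         # assumption: 0 marks bitstream events
        "fq": "-isBot:true",   # assumption: bot hits are flagged in isBot
        "rows": 0,             # only the facet counts are needed
        "facet": "true",
        "facet.field": "id",   # assumption: id holds the bitstream id
        "facet.limit": limit,
        "facet.mincount": 1,
        "wt": "json",
    }
    return urlencode(params)


def facet_pairs(response, field):
    """Convert Solr's flat [value, count, value, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    pairs = list(zip(flat[0::2], flat[1::2]))
    # Sort explicitly so the result does not depend on the server-side order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hand-written sample response, not real data.
sample = {"facet_counts": {"facet_fields": {"id": ["12", 3, "40", 9]}}}
print(bitstream_facet_query())
print(facet_pairs(sample, "id"))
```

Mapping each bitstream id back to its item and metadata still needs a separate lookup, which this sketch leaves out.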

I have JRuby code that interacts with the DSpace core objects. I use it for 
various purposes, e.g. to create collections from a template, create 
users and add them to groups, print reports on the collection/community hierarchy 
including authorization settings, and use JRuby's interactive console to poke 
around ...

One of the bigger scripts I wrote is a statistics script that reports on 
collection views, item views, and bitstream downloads in one or more 
communities. Usually I report the top 20 or so, but the script can also dump all 
downloaded bitstreams. It produces a tab-separated list that includes 
bitstream id, item name, enclosing collection and community handles, … It can 
be restricted to time slots of interest.
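
To illustrate the time-slot restriction and the tab-separated output, here is a small Python sketch. It is not the dscriptor script. The `time` field name is an assumption about the Solr statistics schema, the column names are invented, and the sample row is a placeholder.

```python
import csv
import io
from datetime import datetime


def time_filter(start, end):
    """Return a Solr range filter on the (assumed) `time` field, both ends inclusive."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # Solr date syntax; datetimes are assumed to be UTC
    return "time:[{} TO {}]".format(start.strftime(fmt), end.strftime(fmt))


def to_tsv(header, rows):
    """Serialize report rows as tab-separated text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


july_2015 = time_filter(datetime(2015, 7, 1), datetime(2015, 7, 31, 23, 59, 59))

# Placeholder row, not real data.
report = to_tsv(
    ["bitstream_id", "item_name", "collection_handle", "downloads"],
    [[12, "Example item", "00000/1", 9]],
)
print(july_2015)
print(report, end="")
```

The filter string would be passed as an `fq` parameter alongside the facet query.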

Ruby might not be your thing - but if you want to have a look - see 
https://github.com/akinom/dscriptor/tree/master/statistics


Monika

--
Monika Mevenkamp
Digital Repository Infrastructure Developer
Phone: 609-258-4161
333C 701 Carnegie, Princeton University, Princeton, NJ 08544


Re: [Dspace-tech] Report of items including usage statistics

2015-08-04 Thread Terry Brady
Do you have a preferred technology stack for the solution?  I have some PHP
code that may be useful.

The total views, total downloads, and owning collection id can be pulled
from the Solr statistics repository.

   - Query Solr for item views, facet by item id
   - Query Solr for bitstream downloads, facet by item id (do you want to
   include thumbnail views?)
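
As a rough Python sketch, the two facet queries listed above might be built like this. The base URL, the field names (`type`, `id`, `owningItem`, `bundleName`), and the type codes (2 for items, 0 for bitstreams) are assumptions about a typical DSpace statistics core; older versions may not have `bundleName` at all, so verify against your own schema.

```python
from urllib.parse import urlencode

# Assumed location of the statistics core; adjust for your installation.
SOLR_SELECT = "http://localhost:8080/solr/statistics/select"


def facet_url(query, facet_field, exclude_thumbnails=False):
    """Build a facet-only request URL that counts events per facet value."""
    params = [
        ("q", query),
        ("rows", 0),              # no documents, only facet counts
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.limit", -1),      # return every facet value
        ("facet.mincount", 1),
        ("wt", "json"),
    ]
    if exclude_thumbnails:
        # Assumption: thumbnail bitstreams are recorded with this bundle name.
        params.append(("fq", "-bundleName:THUMBNAIL"))
    return SOLR_SELECT + "?" + urlencode(params)


# Item views, counted per item id.
item_views_url = facet_url("type:2", "id")
# Bitstream downloads, counted per owning item, without thumbnail views.
downloads_url = facet_url("type:0", "owningItem", exclude_thumbnails=True)

print(item_views_url)
print(downloads_url)
```

Whether to pass `exclude_thumbnails=True` depends on the answer to the thumbnail question above.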

The title, author, abstract, and date created are probably easiest to pull
from the database.

Here are two approaches that would work.

1. Query the database for all items.  As you iterate over the SQL results,
query Solr for the view/download counts.
2. Run the faceted Solr queries by item id.  As you iterate over the
XML/JSON results, query the database for supplemental metadata.
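
The second approach could look something like the Python sketch below. It assumes the facet results have already been parsed into dictionaries keyed by item id. `lookup_metadata` is a hypothetical callable standing in for the database query, and the column names and sample values are made up.

```python
import csv
import io


def build_report(views, downloads, lookup_metadata):
    """Walk the Solr counts and fetch supplemental metadata for each item."""
    rows = []
    for item_id in sorted(set(views) | set(downloads)):
        meta = lookup_metadata(item_id)  # stand-in for a database query
        rows.append([
            meta.get("title", ""),
            meta.get("author", ""),
            meta.get("collection", ""),
            views.get(item_id, 0),       # items with no views get 0
            downloads.get(item_id, 0),   # items with no downloads get 0
        ])
    return rows


# Hand-written sample data, not real statistics.
views = {"7": 10, "9": 2}
downloads = {"7": 4}
metadata = {
    "7": {"title": "A", "author": "X", "collection": "c1"},
    "9": {"title": "B"},
}

rows = build_report(views, downloads, lambda item_id: metadata[item_id])

# Write the spreadsheet as CSV text.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["Title", "Author", "Owning Collection", "Total Views", "Total Downloads"])
writer.writerows(rows)
print(out.getvalue(), end="")
```

The first approach is the same join driven from the other side: iterate over the database rows and look up each item's counts. Note that approach 1 covers items with no recorded usage, while this loop only sees items that appear in the Solr results.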

Terry

On Tue, Aug 4, 2015 at 8:29 AM, Anthony Petryk anthony.pet...@uottawa.ca
wrote:

 Hello,

 We’re interested in creating a report (spreadsheet) of items, which
 includes basic metadata AND associated usage statistics.  For instance:

 -  Title
 -  Author
 -  Abstract
 -  Date Created
 -  Owning Collection
 -  Total Views (since accessioned)
 -  Total Downloads (all bitstreams)

 What’s the best way to do this?

 Best,

 Anthony

 --
 ___
 DSpace-tech mailing list
 DSpace-tech@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/dspace-tech
 List Etiquette:
 https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette




-- 
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://www.library.georgetown.edu/lit/code
425-298-5498 (Seattle, WA)

Re: [Dspace-tech] Report of items including usage statistics

2015-08-04 Thread Terry Brady
Anthony,

Check out the following code.  Some of it might be useful to clone.

   - Statistics report tool (this code reports collection by collection):
     https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/web/stats/qcHierarchyStats.php
   - Populate a PHP array with the collection list using SQL:
     https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/web/community.php
   - Package a SOLR query with an ajax call:
     https://github.com/Georgetown-University-Libraries/batch-tools/blob/master/web/stats/qcHierarchyStats.php#L41-L50

Here is a wiki page describing our statistics reports.

   - https://github.com/Georgetown-University-Libraries/batch-tools/wiki/Statistics-reporting

Good luck,

Terry


Re: [Dspace-tech] Report of items including usage statistics

2015-08-04 Thread Anthony Petryk
Hi Terry,

PHP works for me. :)

Thanks for your overview of the 2 approaches.  The first one looks easier so 
I’ll try that one.

Best,

Anthony


Re: [Dspace-tech] Report of items including usage statistics

2015-08-04 Thread Anthony Petryk
Thanks very much, Terry.

Anthony
