My feeling is that with the development of DSpace 7, we need to refactor
and improve the way DSpace logs and processes stats/events:

   - Add more event types like OAI requests and user logins. But we could
   even take this further and provide a *complete audit trail* (log edits,
   deletes, updates, moves... of all DSpace objects). This would allow an
   admin to see everything that happened to an item.
   - When we have all that information, we can remove the legacy stats from
   the code base and build similar screens that use this new information.
   - I also think that this event information should be logged in a table
   in the database. Events should then be processed asynchronously (send data
   to Google Analytics, index a statistics view record in SOLR with extra item
   metadata, notify any other interested third party such as IRUS, ...).
   This would improve the user experience (page load times) and also
   solve problems like https://jira.duraspace.org/browse/DS-2904
   - This would also allow you to "reindex" stats and make taking a backup
   of your statistics a lot easier, since they would be included in the regular
   database backups. SOLR was never built to be a "persistent data store" as
   mentioned here:
   https://groups.google.com/forum/#!msg/dspace-tech/tMxMSif5U-Q/mC7SuBBDFwAJ.
   SOLR cores can easily become corrupt by unexpected server shutdowns.
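
To make the "event table plus asynchronous processing" idea a bit more
concrete, here is a rough sketch in Python (this is not DSpace code).
Everything in it is an assumption for illustration only: the usage_event
table, its columns, and the record_event / process_pending /
mark_all_pending helpers are made-up names, and SQLite stands in for the
real DSpace database. The handlers are where the pushes to Google
Analytics, SOLR or IRUS would go.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per event plus a flag for async processing.
SCHEMA = """
CREATE TABLE IF NOT EXISTS usage_event (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    event_type TEXT NOT NULL,
    object_id  TEXT,
    payload    TEXT NOT NULL,
    created_at TEXT NOT NULL,
    processed  INTEGER NOT NULL DEFAULT 0
)
"""


def open_store(path=":memory:"):
    """Open the (stand-in) database and make sure the event table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_event(conn, event_type, object_id=None, **details):
    """Synchronous part of a request: a single cheap INSERT."""
    cur = conn.execute(
        "INSERT INTO usage_event (event_type, object_id, payload, created_at) "
        "VALUES (?, ?, ?, ?)",
        (event_type, object_id, json.dumps(details),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.lastrowid


def process_pending(conn, handlers, batch_size=100):
    """Asynchronous part: meant to run from a background job, not a request.

    Each pending event is passed to every handler; it is marked as
    processed only after all handlers returned without raising.
    """
    rows = conn.execute(
        "SELECT id, event_type, object_id, payload FROM usage_event "
        "WHERE processed = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for event_id, event_type, object_id, payload in rows:
        event = {"id": event_id, "type": event_type,
                 "object_id": object_id, "details": json.loads(payload)}
        for handler in handlers:
            handler(event)
        conn.execute("UPDATE usage_event SET processed = 1 WHERE id = ?",
                     (event_id,))
        conn.commit()
    return len(rows)


def mark_all_pending(conn):
    """'Reindex': reset the flag so every stored event is replayed."""
    conn.execute("UPDATE usage_event SET processed = 0")
    conn.commit()
```

With something like this, a page view only costs one insert, and a
corrupt SOLR core could be rebuilt by calling mark_all_pending() and
letting the background job replay the table.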


What do you guys think? Should we create a Jira ticket for this and discuss
this in a developer meeting?



Tom Desair
250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Esperantolaan 4, Heverlee 3001, Belgium
www.atmire.com
<http://atmire.com/website/?q=services&utm_source=emailfooter&utm_medium=email&utm_campaign=tomdesair>

2017-01-12 18:25 GMT+01:00 Terry Brady <terry.br...@georgetown.edu>:

> Bram,
>
> Thanks for the feedback on this.  If the data in these reports should not
> be used anymore, I wonder if we should suppress the inclusion of these
> reports by default and require an explicit action to continue to display
> them.
>
> Terry
>
> On Thu, Jan 12, 2017 at 4:25 AM, Bram Luyten <b...@atmire.com> wrote:
>
>> The code for these reports can be found here if I'm not mistaken:
>> https://github.com/DSpace/DSpace/tree/master/dspace-api/src/main/java/org/dspace/app/statistics
>>
>> I was looking for a trace of robot detection/filtering but couldn't find
>> any.
>>
>> Our (Atmire) point of view on these legacy stats is that they haven't
>> been touched/developed for a long while and shouldn't be used anymore.
>>
>> Even if there is some bot filtering in there, it is definitely not on par
>> with the bot filtering we currently have in SOLR, which includes the
>> possibility to retroactively mark usage as bot traffic when new IPs or
>> agents have been detected.
>>
>> However, this is still an interesting discussion. I would definitely be in
>> favor of adding OAI requests and user logins as usage events that we start
>> tracking in the SOLR logs. I will create JIRA issues for those.
>>
>> Bram
>>
>>
>> Bram Luyten
>> 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
>> Esperantolaan 4, Heverlee 3001, Belgium
>> atmire.com
>> <http://atmire.com/website/?q=services&utm_source=emailfooter&utm_medium=email&utm_campaign=braml>
>>
>> On 11 January 2017 at 23:54, Terry Brady <terry.br...@georgetown.edu>
>> wrote:
>>
>>> I am re-sending this question hoping to get some additional feedback.
>>> Alan, thank you for your earlier response.
>>>
>>> Is there a current recommendation on the use of the "legacy statistics"
>>> reports?  I see that these reports continue to be produced on
>>> demo.dspace.org.
>>>
>>> How trustworthy is the data generated from these reports?  Does the
>>> community recommend that these reports continue to be run?
>>>
>>> When I attempt to reconcile the data in this report with my solr
>>> statistics, I see significant differences.
>>>
>>> There are a couple of fields such as OAI requests and User logins that
>>> are not captured in solr statistics.
>>>
>>> Terry
>>>
>>> On Tue, Dec 20, 2016 at 1:36 AM, Alan Orth <alan.o...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We still use these legacy stats as well in DSpace 5.5, which is
>>>> annoying because we need to keep all dspace.log.* files around for the
>>>> entire month. Anyways, this is the cron job I run every night:
>>>>
>>>> /dspace/bin/dspace stat-general && \
>>>> /dspace/bin/dspace stat-monthly && \
>>>> /dspace/bin/dspace stat-report-general && \
>>>> /dspace/bin/dspace stat-report-monthly
>>>>
>>>> Hope that helps.
>>>>
>>>> On Tue, Dec 20, 2016 at 12:10 AM Terry Brady <terry.br...@georgetown.edu> wrote:
>>>>
>>>>> The DSpace Wiki indicates that the "stat-report" commands are
>>>>> deprecated.
>>>>>
>>>>> https://wiki.duraspace.org/display/DSDOC6x/Command+Line+Operations#CommandLineOperations-Legacystatistics
>>>>>
>>>>> Looking at demo.dspace.org, I see the following pages are available
>>>>>
>>>>>    - http://demo.dspace.org/xmlui/statistics
>>>>>    - http://demo.dspace.org/xmlui/statistics?date=2016-11
>>>>>
>>>>> What process is used to create these pages?
>>>>>
>>>>> --
>>>>> Terry Brady
>>>>> Applications Programmer Analyst
>>>>> Georgetown University Library Information Technology
>>>>> http://georgetown-university-libraries.github.io/
>>>>> <https://www.library.georgetown.edu/lit/code>
>>>>> 425-298-5498 (Seattle, WA)
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "DSpace Technical Support" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to dspace-tech+unsubscr...@googlegroups.com.
>>>>> To post to this group, send email to dspace-tech@googlegroups.com.
>>>>> Visit this group at https://groups.google.com/group/dspace-tech.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>> --
>>>>
>>>> Alan Orth
>>>> alan.o...@gmail.com
>>>> https://englishbulgaria.net
>>>> https://alaninkenya.org
>>>> https://mjanja.ch
>>>>
>>>
>>>
>>
>>
>
>

