[ 
https://jira.duraspace.org/browse/DS-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=25861#comment-25861
 ] 

Mark Diggory commented on DS-1241:
----------------------------------


Hi Peter,

Kevin and I have reviewed your codebase (predominantly Kevin's work at this 
point) for your contribution and identified the following crique for 
improvements and concerns with the @mire statistics contributions and the 
community solr statistics features.

Given this we highly recommend that the @mire stats pull request be prioritized 
for contribution before the elastic search (but still will support its 
inclusion)

**Potential issues with @mire's statistics pull request**

* The only issue that will arise is that in our latest pull request 
(https://github.com/DSpace/DSpace/pull/69) we also fire "usage" events for 
searches, items moving through workflow, .... Since these usage events contain 
different data compared to the view events I would recommend filtering on these 
inside your *ElasticSearchLoggerEventListener* class. 
* In our contribution the StatisticsAuthorizedMatcher has received a type 
parameter (there will also be statistics pages for search statistics and 
workflow stats), this will need to be added & set to usage for this 
contribution (a minor issue)

**Feedback on the current state of pull request**

* The configuration for elastic search is placed inside the dspace.cfg file, we 
would recommend moving this to 
*[dspace.dir]/config/modules/elastic-search-statistics.cfg*
* Since the elastic search uses configuration from the *solr-statistics.cfg* 
file I would recommend moving the shared configuration (stuff like use proxies, 
GeoLite database location, ...) into a separate *usage-statistics* file. The 
solr specific configuration would still reside in the *solr-statistics.cfg* 
file and make it easier for users to understand the difference between these 2 
implementations.
* Some suggestions for improving the mapping (which fields are analyzed, the 
field type, the available field names, ...)
** The entire json string that defines the mapping is hardcoded in the 
*ElasticSearchLogger.java* class
** Therefore we would recommend moving this configuration into objects & making 
this configurable with spring (create a new 
*[dspace.dir]/config/spring/statistics directory*). We then recommend using the 
*GSON* library to translate these objects into JSON. (The controlled vocabulary 
contribution already uses the *GSON* library. This *GSON* library can translate 
any list of objects into JSON by 2 lines of code. Please see example here: 
(https://github.com/KevinVdV/DSpace/commit/61c597ae14c2c7f7b219891e121ea31b4bf31173#diff-4)).
* We noticed a lot of code has been added to the solr *StatisticsTransformer* 
that isn't used (perhaps some left overs ? We recommend shifting changes out of 
Statistics Transformer and possibly creating a separate aspect for the time 
being).
** We saw a call to the *DatabaseManager* in the xmlui code, this violates 
having a separation of layers, the UI should not be calling the Database 
directly, we recommend encapsulating this into a service class.
** There are a lot of labels that haven't found their way into the messages
* The *ElasticSearchStatsViewer* user interface is still a bit messy (tables 
are too wide for the theme, ...)
* The *ElasticSearchStatsViewer* trail doesn't create a link for the last 
DSpace object, this was only fixed recently 
(https://github.com/DSpace/DSpace/pull/31) all that needs to change is to 
change the line 
```java
HandleUtil.buildHandleTrail(dso,pageMeta,contextPath);
``` with ```java
HandleUtil.buildHandleTrail(dso,pageMeta,contextPath, true);
```
* The *ReportGenerator* also doesn't use messages
* The *visualizeData.js* file uses a lot of string messages, it might be hard 
to internationalize this page. A way to fix this might be to retrieve 
internationalized messages using json ? Also we wouldn't recommend including 
messages in a js in the first place.
* The print this page button appears to print the entire page, not just the 
reports

* About the *dashboard theme* (theme used on the elastic search view stats 
pages)
 * The Dashboard theme sitemap doesn't have a theme name (just named "The theme 
template")
 * The Dashboard theme uses hardcoded css from the reference theme, this way 
the users theme is never applied to these statistic pages we understand the 
need for some custom javascript files but these should be added by the sitemap 
(like discovery does this)
 * There is an old jquery-1.2 file present but this one isn't used
 * The Dashboard theme fails to load the following files:
  * {context.path}/themes/dashboard/themes/Reference/lib/style.css
  * {context.path}/themes/dashboard/images/manakin_logo.jpg
  * {context.path}/themes/dashboard/images/trail.bmp
  * {context.path}/themes/dashboard/images/di_logo.jpg
* Our recommendation for the dashboard theme would be to drop the theme & load 
the required javascript files in using the *IncludePageMeta* transformer. Since 
the embargo contribution every theme has jqueryUI loaded by default so there is 
no need to load jqueryUI. We would furthermore add all the css either in all 
the themes or create a separate elastic search css and also load this from the 
*IncludePageMeta* transformer (for example in discovery)

**General feedback**
* W also noticed that the elastic search and solr loggers are both enabled at 
all times. This will result in duplicate logging. We will need a solution that 
will allow us to configure the backend used to log the usage information. 
(Using both should also be possible).
* Although we agree with you that it might be useful to render the user 
interface by just using json. We do have concerns that this kind of change will 
not be feasible for DSpace 3.0.
                
> Statistics implementation in Elastic Search
> -------------------------------------------
>
>                 Key: DS-1241
>                 URL: https://jira.duraspace.org/browse/DS-1241
>             Project: DSpace
>          Issue Type: New Feature
>            Reporter: Peter Dietz
>            Assignee: Peter Dietz
>             Fix For: 3.0
>
>         Attachments: Screen Shot 2012-08-10 at 5.10.52 PM.png
>
>
> Work at Ohio State University Libraries for the Knowledge Bank, has had us 
> demanding more out of statistics. We have worked building a statistics 
> dashboard in SOLR, but ran into performance issues. We have since started 
> from the ground up (to a degree), and built an implementation of storing 
> UsageEvent statistics into Elastic Search. I can't really speak about 
> comparing SOLR to Elastic Search, but that we (OSU) prefers Elastic Search, 
> and that this is our implementation that we'd like to contribute.
> This contribution will include the ElasticSearch UsageEvent listener, a 
> standalone ElasticSearch client, and some modifications to XMLUI to view a 
> statistics portal that has statistics from this solution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel

Reply via email to