Dear Mark,

I think flame graphs might be what we need here. Brendan Gregg more or
less pioneered this way of visualizing profiler output while
troubleshooting a MySQL performance problem in 2011, and the tooling
ecosystem has matured a lot since then. I've only ever done amateur
performance analysis with strace, iotop, VisualVM, etc., but I will try
to figure out how to generate a flame graph of DSpace while it is
indexing.
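
For what it's worth, here is a rough sketch of the "folding" step, which I
have not tried against DSpace. It assumes we collect periodic jstack-style
thread dumps of the indexing JVM and collapse them into the
"root;...;leaf count" lines that flamegraph.pl reads. The function name and
the org.example frames in the sample are made up for illustration.

```python
import re
from collections import Counter

# Matches a stack frame line such as "\tat org.example.Db.query(Db.java:10)".
FRAME = re.compile(r"^\s+at\s+([\w.$<>]+)\(")


def fold_thread_dumps(text):
    """Collapse jstack-style dumps into 'root;...;leaf count' lines."""
    counts = Counter()
    frames = []

    def flush():
        if frames:
            # Dumps list the innermost frame first; folded stacks start at the root.
            counts[";".join(reversed(frames))] += 1
            frames.clear()

    for line in text.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append(match.group(1))
        elif line.startswith('"'):
            # A quoted thread name starts a new stack, so close the previous one.
            flush()
    flush()
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical sample: three dumps of one thread, not real DSpace frames.
SAMPLE = '''\
"main" #1 prio=5 os_prio=0 tid=0x1 nid=0x2 runnable
   java.lang.Thread.State: RUNNABLE
\tat org.example.Db.query(Db.java:10)
\tat org.example.Indexer.indexItem(Indexer.java:20)
\tat org.example.Main.main(Main.java:5)

"main" #1 prio=5 os_prio=0 tid=0x1 nid=0x2 runnable
   java.lang.Thread.State: RUNNABLE
\tat org.example.Solr.add(Solr.java:7)
\tat org.example.Indexer.indexItem(Indexer.java:20)
\tat org.example.Main.main(Main.java:5)

"main" #1 prio=5 os_prio=0 tid=0x1 nid=0x2 runnable
   java.lang.Thread.State: RUNNABLE
\tat org.example.Db.query(Db.java:10)
\tat org.example.Indexer.indexItem(Indexer.java:20)
\tat org.example.Main.main(Main.java:5)
'''

if __name__ == "__main__":
    for folded in fold_thread_dumps(SAMPLE):
        print(folded)
```

The printed lines could be saved to a file and handed to flamegraph.pl to
render the SVG. A sampling profiler would give much better data than
hand-collected thread dumps, so this only illustrates the format.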

Regarding the patches scheduled for 6.4: I've been testing 6.4-SNAPSHOT
and indexing is just as slow. :\

Cheers,

http://www.brendangregg.com/flamegraphs.html

On Thu, Feb 6, 2020 at 4:15 PM Mark H. Wood <[email protected]> wrote:

> On Thu, Feb 06, 2020 at 09:46:49AM +0200, Alan Orth wrote:
> > Dear list,
> >
> > I'm testing an upgrade of a DSpace 5.8 instance to DSpace 6.3 and one of
> > the first things I notice is that Discovery indexing is about three or
> > four times slower than it was before. On the same hardware, my
> > repository with ~85,000 items takes 30 minutes to index with DSpace 5
> > and three hours with DSpace 6.3 and DSpace 6.4-SNAPSHOT. My development
> > environment is on Linux with a fast SSD and lots of RAM, so I fear it
> > will be even worse on our production server.
> >
> > I have read that the new Hibernate database layer in DSpace 6 involves
> > much more complicated or time-consuming database queries. How are other
> > people handling this? We're using PostgreSQL 9.6. Could it be time to
> > move to something higher to hopefully gain something from PostgreSQL's
> > own advances?
>
> I don't know that upgrading PostgreSQL will help your indexing
> performance all that much, but it shouldn't hurt.  We run production
> against Pg 10.9 and I develop DSpace 5, 6, and 7 against 12.1.
>
> Hibernate does tend to fetch more stuff, but it also caches very
> aggressively and rather well, so it's hard to say whether it is
> contributing to any particular slow-down.  There have been specific
> DSpace operations in which Hibernate was found to be a source of
> excess activity, but I think that most of them have been addressed in
> patches scheduled for 6.4.  I have no doubt that there are others.
>
> Probably the most methodical approach would be to run indexing with a
> profiler and find out where the time is being spent.  Since
> command-line indexing involves three processes (bin/dspace, Pg, and
> Tomcat (running Solr)) it would be good to pay particular attention to
> time spent waiting on another process.
>
> Short of profiling, tools like 'top' and 'iotop' will give a rough
> idea of whether the system is generally busier and suggest which parts
> are responsible.  You might be able to set up 'strace' or the like to
> log mainly I/O calls and grind some statistics out of the log.
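
[Inline note on the strace idea:] A minimal sketch of what "grinding
statistics out of the log" could look like. It assumes the log was written
with something like "strace -f -T -o indexing.log ...", so each line carries
a PID prefix and the call duration in angle brackets at the end. The file
name, function name, and sample lines are hypothetical.

```python
import re
from collections import defaultdict

# e.g.  101 read(3, "abc", 4096) = 3 <0.000200>
CALL = re.compile(r"^(?:\d+\s+)?(\w+)\(.*<(\d+\.\d+)>\s*$")
# e.g.  102 <... read resumed> "q", 1) = 1 <0.000300>
RESUMED = re.compile(r"^(?:\d+\s+)?<\.\.\. (\w+) resumed>.*<(\d+\.\d+)>\s*$")


def summarize(lines):
    """Return (syscall, call count, total seconds) rows, slowest first."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = CALL.match(line) or RESUMED.match(line)
        if not match:
            # Skips "<unfinished ...>" halves, signals, and exit notices.
            continue
        entry = totals[match.group(1)]
        entry[0] += 1
        entry[1] += float(match.group(2))
    rows = [(name, count, secs) for name, (count, secs) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical log lines in the shape described above.
SAMPLE = [
    '101 read(3, "abc", 4096) = 3 <0.000200>',
    '101 write(5, "xyz", 3) = 3 <0.001000>',
    '102 read(7,  <unfinished ...>',
    '102 <... read resumed> "q", 1) = 1 <0.000300>',
    '101 +++ exited with 0 +++',
]

if __name__ == "__main__":
    for name, count, secs in summarize(SAMPLE):
        print(f"{name:10s} {count:6d} calls {secs:10.6f} s")
```

Comparing the per-syscall totals for bin/dspace, PostgreSQL, and Tomcat
between a DSpace 5 run and a DSpace 6 run might show which process is doing
the extra waiting.
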
>
> (I really should try some of these myself....)
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>
> --
> All messages to this mailing list should adhere to the DuraSpace Code of
> Conduct: https://duraspace.org/about/policies/code-of-conduct/
> ---
> You received this message because you are subscribed to the Google Groups
> "DSpace Technical Support" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/dspace-tech/20200206141511.GE11530%40IUPUI.Edu
> .
>


-- 
Alan Orth
[email protected]
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch

