Wow, that's great, Andrea. I'm very curious to try your patches. I
will play with them and see if I can get them to apply to our slightly
modified DSpace 5.1 code base.

Cheers,

On Fri, Aug 12, 2016 at 8:53 PM, Andrea Bollini
<andrea.boll...@4science.it> wrote:
> Dear Alan,
>
> on DSpace-CRIS we have made the indexing process multi-threaded, and in
> our experience this improves performance a lot: 10x or more, depending
> on the number of threads used and the server configuration.
>
> See
>
> https://github.com/4Science/DSpace/commit/6206ca6f7980cdae31d5bd69c450d706f8518dfb
>
> https://github.com/4Science/DSpace/blob/dspace-5_x_x-cris/dspace-api/src/main/java/org/dspace/discovery/SolrServiceImpl.java#L476
>
> https://github.com/4Science/DSpace/blob/dspace-5_x_x-cris/dspace-api/src/main/java/org/dspace/discovery/SolrServiceImpl.java#L2585
>
>
> Running Solr on a separate Tomcat, or better, on a dedicated server, also
> brings big improvements. When you use multiple threads, you need to be sure
> to have enough database connections to serve all the threads and the
> running webapps.
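
The point about database connections can be made concrete with a small budgeting sketch. This is only an illustration, not part of the DSpace-CRIS patch: the function name, the pool size of 30, and the 10 connections kept for the webapps are all hypothetical values chosen for the example.

```python
def max_index_threads(pool_size, webapp_connections, per_thread=1):
    """Largest number of indexing threads that still leaves
    `webapp_connections` connections free for the running webapps."""
    if per_thread < 1:
        raise ValueError("per_thread must be at least 1")
    spare = pool_size - webapp_connections
    # Never report a negative thread count if the pool is already too small.
    return max(spare // per_thread, 0)

# Hypothetical numbers: a pool of 30 connections, 10 kept for the webapps.
print(max_index_threads(pool_size=30, webapp_connections=10))
```

If the result is smaller than the number of threads you want to run, the connection pool would need to grow before the indexer is given more threads.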
>
> If you are available to test this improvement on a plain DSpace 5.x, we
> can prepare a pull request (against DSpace 5.x), and if the feedback is
> good we can also port it to DSpace 6.x.
>
> BTW, the DSpace-CRIS enhancement also introduces the ability to (re)index
> a single object, so it can also replace the pull request
> https://github.com/DSpace/DSpace/pull/1469
>
> Best,
> Andrea
>
> On 12/08/2016 15:33, Alan Orth wrote:
>> Evelthon,
>>
>> Interesting observation about the indexing speed. Just yesterday I
>> posted a message about Java JVM settings for Solr/Lucene to this
>> mailing list. I'm sure there is room for improvements in Solr
>> performance if you're willing to monitor, tweak, monitor, tweak, etc.
>> I've stayed away from JVM tuning for the most part. Here's the link I
>> posted yesterday, from a Solr developer, where he recommends some JVM
>> settings as well as Java 8 (which wasn't the case when I checked this
>> wiki last year):
>>
>> https://wiki.apache.org/solr/ShawnHeisey
>>
>> For what it's worth, our indexing takes ~60 minutes for 55,000 items,
>> and we're on a Linode VPS where we have an SSD and plenty of CPU cores
>> and memory — I hate to think how long it takes on less performant
>> hardware.
>>
>> Regards,
>>
>>
>> On Fri, Aug 12, 2016 at 10:56 AM, Evelthon Prodromou
>> <prodromou.evelt...@ucy.ac.cy> wrote:
>>> Hello Alan,
>>>
>>> Basically, to finish the initial indexing and media-filter faster. Probably
>>> overkill.
>>>
>>>
>>> Thanks.
>>>
>>> On Friday, August 12, 2016 at 10:20:49 AM UTC+3, Alan Orth wrote:
>>>> I'm glad you solved it, Evelthon.
>>>>
>>>> I guess it depends on your OS and how you have Tomcat running. In
>>>> Ubuntu we set JAVA_OPTS in /etc/default/tomcat7, but CentOS's Tomcat
>>>> is surely different. By the way, there's more discussion about tuning
>>>> DSpace (including JAVA_OPTS and CATALINA_OPTS) on the wiki:
>>>>
>>>> https://wiki.duraspace.org/display/DSDOC5x/Performance+Tuning+DSpace
>>>>
>>>> I still wonder why your JVM settings are so highly tweaked. Most
>>>> people don't need to adjust those, and unless you really know you need
>>>> them, I'd say to leave them off. Remember, "premature optimization is
>>>> the root of all evil" ;)
>>>>
>>>> Cheers,
>>>>
>>>> On Fri, Aug 12, 2016 at 9:59 AM, Evelthon Prodromou
>>>> <prodromou...@ucy.ac.cy> wrote:
>>>>> Hello Luigi,
>>>>>
>>>>> CATALINA_OPTS did it. All works well now. It seems Solr was choking and
>>>>> caused slow loading in the UI.
>>>>>
>>>>> Curious though, shouldn't it fall back to JAVA_OPTS?
>>>>>
>>>>>
>>>>> In any case, thank you.
>>>>>
>>>>> Evelthon
>>>>>
>>>>>
>>>>> On Thursday, August 11, 2016 at 1:01:34 PM UTC+3, Luigi Andrea Pascarelli wrote:
>>>>>> Hi,
>>>>>>
>>>>>> as far as I know, DSpace doesn't need a large amount of memory. As a
>>>>>> first step, you can try setting CATALINA_OPTS to
>>>>>>
>>>>>> CATALINA_OPTS: -Xms1024m -Xmx2048m -XX:MaxPermSize=256m
>>>>>> -Dfile.encoding=UTF-8
>>>>>>
>>>>>> And as Alan highlighted, you could use JAVA_OPTS with less memory.
>>>>>>
>>>>>> As a second step, you could check whether I/O is the real issue. Maybe
>>>>>> PostgreSQL, Solr, and Tomcat read and write on the same node, in which
>>>>>> case you could benefit from separating them (or one of them) onto
>>>>>> different volumes.
>>>>>>
>>>>>> Let me know.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Luigi Andrea
>>>>>>
>>>>>>
>>>>>> On 11/08/2016 11:46, Evelthon Prodromou wrote:
>>>>>>
>>>>>> The server has 32GB, and PostgreSQL is on a different box. I don't think
>>>>>> it's a RAM issue.
>>>>>>
>>>>>> I am wondering if it is a Solr issue. I don't have Tomcat running as user
>>>>>> dspace. Instead, I changed ownership of [dspace]/solr to dspace:tomcat
>>>>>> and gave rw rights to both user and group.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thursday, August 11, 2016 at 12:33:05 PM UTC+3, Alan Orth wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> From your JAVA_OPTS I see you are allocating 4096 + 2048 megabytes of
>>>>>>> RAM to Tomcat right from the start. How much memory does your server
>>>>>>> have? This means your host must have AT LEAST 6GB of RAM just for
>>>>>>> Tomcat, let alone PostgreSQL, Solr, and the rest of the operating
>>>>>>> system. I wouldn't be surprised if you are encountering poor
>>>>>>> performance due to swapping.
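
The 4096 + 2048 arithmetic above can be reproduced with a short sketch that reads the sizes out of a JAVA_OPTS string. This is a rough illustration only: the helper names are made up for the example, it handles only sizes written with a k/m/g suffix, and the sum is a lower bound that ignores thread stacks and other native memory.

```python
import re

# Multipliers to convert JVM size suffixes to megabytes.
_UNIT_TO_MB = {"k": 1 / 1024, "m": 1, "g": 1024}

def jvm_size_mb(opts, flag):
    """Return the size in MB given to `flag` (e.g. "-Xmx") in `opts`,
    or None if the flag is absent or has no k/m/g suffix."""
    match = re.search(re.escape(flag) + r"(\d+)([kKmMgG])", opts)
    if match is None:
        return None
    return int(match.group(1)) * _UNIT_TO_MB[match.group(2).lower()]

def reserved_mb(opts):
    """Rough lower bound on JVM memory: max heap plus max PermGen."""
    heap = jvm_size_mb(opts, "-Xmx") or 0
    perm = jvm_size_mb(opts, "-XX:MaxPermSize=") or 0
    return heap + perm

# The heap and PermGen flags from the original post in this thread.
opts = "-Xmx4096m -Xms4096m -XX:MaxPermSize=2048m -Dfile.encoding=UTF-8"
print(reserved_mb(opts))  # 6144 MB, i.e. the "at least 6GB" mentioned above
```

Running the same function on the `-Xmx3072m ... -XX:MaxPermSize=256m` settings quoted in this message gives 3328 MB.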
>>>>>>>
>>>>>>> For reference, we run a fairly large DSpace instance with ~55,000
>>>>>>> items and a decent amount of traffic and these are our JAVA_OPTS:
>>>>>>>
>>>>>>> -Djava.awt.headless=true -Xms3072m -Xmx3072m -XX:MaxPermSize=256m
>>>>>>> -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8
>>>>>>>
>>>>>>> Our server has 8GB of physical memory. Unless you know you need all
>>>>>>> those JVM tweaks, I'd start by simplifying your JAVA_OPTS (for testing
>>>>>>> at least).
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> On Wed, Aug 10, 2016 at 7:03 PM, Evelthon Prodromou
>>>>>>> <prodromou...@ucy.ac.cy> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>>
>>>>>>>> I seem to be having an issue with Tomcat. It takes ~38+ seconds to load
>>>>>>>> pages. I believe it's Tomcat since I notice shell scripts
>>>>>>>> ([dspace]/bin/dspace) executing slowly when Tomcat is started, and very
>>>>>>>> fast (normal, I presume) when Tomcat is stopped.
>>>>>>>>
>>>>>>>>
>>>>>>>> The system is a new installation of DSpace 5.5 on CentOS 7. Data and SQL
>>>>>>>> were migrated from a 1.7.0 installation.
>>>>>>>>
>>>>>>>> tomcat.conf includes the following JAVA_OPTS
>>>>>>>>
>>>>>>>> JAVA_OPTS="-Xmx4096m -Xms4096m -XX:MaxPermSize=2048m
>>>>>>>> -Dfile.encoding=UTF-8
>>>>>>>> -XX:MaxHeapFreeRatio=70 -XX:+UseConcMarkSweepGC
>>>>>>>> -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled"
>>>>>>>>
>>>>>>>> index-discovery was executed and discovery facets show up.
>>>>>>>>
>>>>>>>>
>>>>>>>> Could someone please point me in the right direction to investigate?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>>
>>>>>>>>
>>>>>>>> Evelthon
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups
>>>>>>>> "DSpace Technical Support" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send
>>>>>>>> an
>>>>>>>> email to dspace-tech...@googlegroups.com.
>>>>>>>> To post to this group, send email to dspac...@googlegroups.com.
>>>>>>>> Visit this group at https://groups.google.com/group/dspace-tech.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Alan Orth
>>>>>>> alan...@gmail.com
>>>>>>> https://englishbulgaria.net
>>>>>>> https://alaninkenya.org
>>>>>>> https://mjanja.ch
>>>>>>> "In heaven all the interesting people are missing." ―Friedrich
>>>>>>> Nietzsche
>>>>>>> GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Luigi Andrea Pascarelli
>>>>>>
>>>>>> DSpace Committer and DSpace-CRIS Lead Developer
>>>>>>
>>>>>> 4Science,  www.4science.it (an Itway Group Company)
>>>>>>
>>>>>> office: Via Edoardo D'Onofrio 304, 00155 Roma, Italy
>>>>>> tel: +39 333 934 1782
>>>>>> skype: l_a_p82
>>>>>> linkedin: luigiandreapascarelli
>>>>>>
>>>>
>>>>
>>
>>
>
> --
> Andrea Bollini
> Chief Technology Innovation Officer
>
> 4Science,  www.4science.it
> office: Via Edoardo D'Onofrio 304, 00155 Roma, Italy
> mobile: +39 333 934 1808
> skype: a.bollini
> linkedin: andreabollini
> orcid: 0000-0002-9029-1854
>
> an Itway Group Company
> Italy, France, Spain, Portugal, Greece, Turkey, Lebanon, Qatar, U.A.Emirates



-- 
Alan Orth
alan.o...@gmail.com
https://englishbulgaria.net
https://alaninkenya.org
https://mjanja.ch
"In heaven all the interesting people are missing." ―Friedrich Nietzsche
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0

