Alan,

The issue that you are reporting does not sound like the same issue that
was addressed in DSpace 5.7 and 6.1.

In DSpace 6.0, running the sharding process prevented Tomcat from starting
up properly.  While addressing that serious bug, we also fixed a number of
smaller issues in the statistics import and export tools, since it was very
difficult to re-test the sharding process without those tools working
correctly.  I ported the fixes to those tools into DSpace 5.7.

It is interesting that you see a new shard being created.  Does that new
shard contain any records?  Are you able to query it in the Solr admin
console?
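
If you want to check quickly, a query like the following should report how
many documents the new shard holds.  This is just a sketch; the host, port,
and shard name are assumptions based on a default DSpace Solr setup, so
adjust them for your installation:

curl 'http://localhost:8080/solr/statistics-2016/select?q=*:*&rows=0&wt=json'

The numFound value in the response will tell you whether any statistics
records actually made it into the new core.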

The following fix, https://jira.duraspace.org/browse/DS-3458, might make the
sharding process more tolerant if you attempt to re-run it after a new
shard is already in place.
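
For background, an HTTP 409 from Solr is an optimistic concurrency
conflict: an update that includes a _version_ field with a positive value
is rejected when that value does not match the version of the document
already in the index.  If the CSV exported during sharding still carries a
_version_ column, re-adding those documents can trigger exactly this
conflict.  As a rough illustration (the document id here is made up, and I
am assuming a local Solr that accepts JSON on the /update handler):

curl -i 'http://localhost:8080/solr/statistics/update?commit=true' -H 'Content-Type: application/json' -d '[{"id":"test-409","_version_":999999}]'

Because no document with that id and version exists, Solr should answer
with a 409 Conflict, which matches the status on the POST at the end of
your access log.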

Terry


On Wed, Jan 10, 2018 at 11:50 PM, Alan Orth <alan.o...@gmail.com> wrote:

> Hi, Christian. I just tried again with a 4096m heap size and the error is
> the same. I think the problem is indeed related to the Solr optimistic
> concurrency version conflicts that are addressed in DSpace 5.7 and DSpace
> 6.1.
>
> - 5.7: https://wiki.duraspace.org/display/DSPACE/DSpace+Release+5.7+Status
> - 6.1: https://wiki.duraspace.org/display/DSPACE/DSpace+Release+6.1+Status
>
> Thanks,
>
> On Thu, Jan 11, 2018 at 9:13 AM Christian Scheible <christian.schei...@uni-konstanz.de> wrote:
>
>> Hi Alan,
>>
>> have you tried increasing the Java heap space?
>> On my local development machine (DSpace 6.2) the command did not run when
>> the heap space was only 1 GB, but it did work with 4 GB.
>>
>> Like this:
>> sudo -u tomcat7 JAVA_OPTS="-Xmx4096M -Xms1024M -Dfile.encoding=UTF-8" /opt/dspace-kops/bin/dspace stats-util -s
>>
>> Regards
>>
>> Christian
>>
>>
>> Am 11.01.2018 um 08:04 schrieb Alan Orth:
>>
>> @Mark, I looked in DSpace's solr.log and see that the new Solr core is
>> created, then it does some stuff and eventually closes it without an error.
>> Neither Tomcat's catalina.out nor localhost.log has any errors around the
>> time I attempted to shard on my local development machine. There might be a
>> hint here in Tomcat's localhost_access_log, though:
>>
>> 127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/select?q=type%3A2+AND+id%3A1&wt=javabin&version=2 HTTP/1.1" 200 107
>> 127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/select?q=*%3A*&rows=0&facet=true&facet.range=time&facet.range.start=NOW%2FYEAR-18YEARS&facet.range.end=NOW%2FYEAR%2B0YEARS&facet.range.gap=%2B1YEAR&facet.mincount=1&wt=javabin&version=2 HTTP/1.1" 200 447
>> 127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/admin/cores?action=STATUS&core=statistics-2016&indexInfo=true&wt=javabin&version=2 HTTP/1.1" 200 76
>> 127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/admin/cores?action=CREATE&name=statistics-2016&instanceDir=statistics&dataDir=%2FUsers%2Faorth%2Fdspace%2Fsolr%2Fstatistics-2016%2Fdata&wt=javabin&version=2 HTTP/1.1" 200 63
>> 127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/select?csv.mv.separator=%7C&q=*%3A*&fq=time%3A%28%5B2016%5C-01%5C-01T00%5C%3A00%5C%3A00Z+TO+2017%5C-01%5C-01T00%5C%3A00%5C%3A00Z%5D+NOT+2017%5C-01%5C-01T00%5C%3A00%5C%3A00Z%29&rows=10000&wt=csv HTTP/1.1" 200 2137630
>> 127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "GET /solr/statistics/admin/luke?show=schema&wt=javabin&version=2 HTTP/1.1" 200 16253
>> 127.0.0.1 - - [10/Jan/2018:10:51:19 +0200] "POST /solr//statistics-2016/update/csv?commit=true&softCommit=false&waitSearcher=true&f.previousWorkflowStep.split=true&f.previousWorkflowStep.separator=%7C&f.previousWorkflowStep.encapsulator=%22&[...the same .split/.separator/.encapsulator parameters repeated for each multi-valued field...]&wt=javabin&version=2 HTTP/1.1" 409 156
>>
>> A new core is created, then DSpace GETs a CSV from Solr, tries to POST it
>> to the new core and is greeted with an HTTP 409 error. I just Googled for
>> "HTTP 409 solr" and found some mentions of optimistic concurrency and
>> version conflicts. Interesting! This indeed sounds a lot like what I've
>> read in some Jira issues. Could this be the problem fixed in DSpace 5.7,
>> Terry?
>>
>> Our Solr statistics core has something like 80 million documents, so I'm
>> really hoping to be able to shard it eventually!
>>
>> Thank you,
>>
>> On Wed, Jan 10, 2018 at 7:04 PM Terry Brady <terry.br...@georgetown.edu>
>> wrote:
>>
>>> Alan,
>>>
>>> There were some bug fixes to the Solr Sharding process in DSpace 5.7.
>>> See https://wiki.duraspace.org/display/~terrywbrady/Statistics+Import+Export+Issues
>>> for details.
>>>
>>> I am running DSpace 5.8 and I was able to shard successfully.
>>> https://wiki.duraspace.org/display/DSDOC5x/SOLR+Statistics+Maintenance#SOLRStatisticsMaintenance-SolrShardingByYear
>>>
>>> Terry
>>>
>>> On Wed, Jan 10, 2018 at 6:07 AM, Mark H. Wood <mwoodiu...@gmail.com>
>>> wrote:
>>>
>>>> Does the server log anything interesting?  It seems to be dropping the
>>>> connection.  I suspect a timeout of some sort, on the server side.
>>>
>>>
>>>
>>> --
>>> Terry Brady
>>> Applications Programmer Analyst
>>> Georgetown University Library Information Technology
>>> https://github.com/terrywbrady/info
>>> 425-298-5498 (Seattle, WA)
>>
>>
>> --
>>
>> Alan Orth
>> alan.o...@gmail.com
>> https://picturingjordan.com
>> https://englishbulgaria.net
>> https://mjanja.ch
>
>
> --
>
> Alan Orth
> alan.o...@gmail.com
> https://picturingjordan.com
> https://englishbulgaria.net
> https://mjanja.ch
>



-- 
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://github.com/terrywbrady/info
425-298-5498 (Seattle, WA)

-- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To post to this group, send email to dspace-tech@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.
