I have encountered the same issue and filed a bug for it:
https://jira.duraspace.org/browse/DS-2212

Terry

On Wed, Sep 17, 2014 at 9:18 AM, Peter Dietz <[email protected]> wrote:

> Hi Patrick,
>
> Sorry that nobody got back to you back then. But thank you for your post,
> I've just discovered this too, when trying to do some work on a site that
> had previously started on an older version of DSpace, and had seen some
> upgrades. So, today, I was attempting to shard a solr statistics index.
>
> peterdietz:peterDSpace peterdietz$ /dspace/bin/dspace stats-util -s
> Moving: 275 into core statistics-2010
> Exception: Document is missing mandatory uniqueKey field: uid
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Document is missing mandatory uniqueKey field: uid
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:424)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>     at org.dspace.statistics.SolrLogger.shardSolrIndex(SolrLogger.java:1345)
>     at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:106)
>
> Document is missing mandatory uniqueKey field: uid
>
> So, I looked at my solr statistics instance:
> http://127.0.0.1:8080/solr/statistics/select?q=*:*+AND+-uid:[*%20TO%20*]&wt=json&indent=true
> and looked for entries that are missing the uid field.
> Ouch: 1.7M out of 9M entries are missing the UID.
>
> Here's an example entry that is missing the UID (I've just changed the IP/DNS):
>
> {
>   "ip":"8.8.8.8",
>   "id":1,
>   "type":4,
>   "time":"2011-06-03T18:56:05.174Z",
>   "epersonid":2,
>   "dns":"hidden.example.com.",
>   "continent":"NA",
>   "countryCode":"US",
>   "city":"New York",
>   "latitude":40.7619,
>   "longitude":-73.9763,
>   "isBot":false,
>   "userAgent":"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"},
>
> And here's a recent hit, which has a UID (changed IP/DNS).
>
> {
>   "ip":"8.8.8.8",
>   "referrer":"https://trydspace.longsight.com/",
>   "dns":"hidden.example.com.",
>   "continent":"NA",
>   "countryCode":"US",
>   "city":"New York",
>   "latitude":40.7619,
>   "longitude":-73.9763,
>   "isBot":false,
>   "userAgent":"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36",
>   "id":13,
>   "type":4,
>   "time":"2013-09-17T15:50:58.067Z",
>   "epersonid":29,
>   "statistics_type":"view",
>   "uid":"6879167c-b90b-41a3-8249-f98d4e66fb86"}
>
> I don't see anything in the SOLR documentation for how to just add a UID
> to any entry. I suppose you would have to search for all records that are
> missing a UID, then store their information into CSV, then delete by query
> that matches all of that information, and then add a document that had all
> of that old information, and perhaps this adds a UID?
>
> I'm half tempted to say we'll have to delete these 1.7M records that are
> missing the UID. They're causing issues in the SOLR index. But this will
> surely cause too large a gap in the data.
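The export-and-re-add idea described above can be sketched roughly as follows. This is only an illustration, not a tested fix and not how DSpace itself handles it: the helper names (`missing_uid_url`, `with_uid`, `backfill`) are made up for this example, it assumes that any random UUID string is an acceptable value for the uid field (the sample uid above looks like one), and it assumes every field you care about is stored and therefore comes back in the export. It only builds the query URL and patches documents in memory; nothing is sent to Solr.

```python
import uuid
from urllib.parse import urlencode

# Same filter as in the message above: every document with no uid value.
MISSING_UID_QUERY = "*:* AND -uid:[* TO *]"


def missing_uid_url(core_url, rows=1000):
    """Build a /select URL listing documents that lack a uid (hypothetical helper)."""
    params = {"q": MISSING_UID_QUERY, "wt": "json", "rows": rows}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def with_uid(doc):
    """Return a copy of doc, adding a random UUID as uid only if it has none."""
    patched = dict(doc)
    if not patched.get("uid"):
        # Assumption: any random UUID string is an acceptable uid value.
        patched["uid"] = str(uuid.uuid4())
    return patched


def backfill(docs):
    """Patch a batch of exported documents; the input list is left untouched."""
    return [with_uid(doc) for doc in docs]


if __name__ == "__main__":
    exported = [
        {"id": 1, "type": 4, "time": "2011-06-03T18:56:05.174Z"},
        {"id": 13, "type": 4, "uid": "6879167c-b90b-41a3-8249-f98d4e66fb86"},
    ]
    print(missing_uid_url("http://127.0.0.1:8080/solr/statistics"))
    for doc in backfill(exported):
        print(doc)
```

If the patched copies were then added back to the core, the originals would still be the only documents without a uid, so a delete-by-query using the same filter should remove just those. That last step is an assumption worth trying on a copy of the statistics core before touching production data.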
>
> ________________
> Peter Dietz
> Longsight
> www.longsight.com
> [email protected]
> p: 740-599-5005 x809
>
> On Fri, Jun 20, 2014 at 1:17 AM, Patrick Rynhart <[email protected]>
> wrote:
>
>> Update: I have found this post by helix84 which looks similar (the
>> context is for OAI but the traceback looks about the same):
>>
>> http://dspace.2283337.n4.nabble.com/problems-with-OAI-td4671531.html
>>
>> When I try running the following SQL on our DSpace 1.8 server (cut and
>> paste from the above message), I get:
>>
>> # SELECT item.item_id FROM item WHERE NOT EXISTS (SELECT resource_id
>> FROM handle WHERE handle.resource_id = item.item_id AND
>> handle.resource_type_id = 2);
>>  item_id
>> ---------
>>      435
>>      503
>>      499
>>      432
>>      646
>>      461
>>      559
>>     2627
>>     2628
>>     3844
>>     5443
>>     5899
>> (12 rows)
>>
>> We've been through by hand and bitstreams 435, 432 and 5899 result in
>> "Not found on this server", while the remainder appear okay (at least
>> via the webapp).
>>
>> Looks like our 1.8 server has some problem that is only showing up as an
>> issue following the migration to 4.1?
>>
>> What do we need to do to fix up the existing 1.8 install?
>>
>> Thanks,
>>
>> Patrick
>>
>> > Hi all,
>> >
>> > I’m attempting an upgrade from DSpace 1.8 directly to 4.1 on a new
>> > server but am running into a problem. Along with the asset store and
>> > DB, we are wanting to preserve our viewing stats. After migration, if I
>> > run:
>> >
>> > /usr/local/dspace/bin/dspace stats-util -b
>> >
>> > then I am getting the following Java traceback.
>> >
>> > Exception: Document is missing mandatory uniqueKey field: uid
>> > org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Document is missing mandatory uniqueKey field: uid
>> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:424)
>> >     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>> >     at org.dspace.statistics.SolrLogger.reindexBitstreamHits(SolrLogger.java:1482)
>> >     at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:97)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >     at java.lang.reflect.Method.invoke(Method.java:606)
>> >     at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:225)
>> >     at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:77)
>> >
>> > This error message does not occur if I run “dspace stats-util -b” on our
>> > DSpace 1.8 server.
>> >
>> > ---
>> >
>> > A summary of the steps that I’m taking:
>> >
>> > 1. Copying across /usr/local/dspace/solr/statistics/data to the new server.
>> > 2. Doing a rsync of the asset store:
>> >
>> >    rsync -av --exclude=.snapshot --exclude=log
>> >    /usr/local/dspace/assetstore/ newserver:/usr/local/dspace/assetstore/
>> >
>> > 3. Exporting / importing the PSQL database from the old to new server.
>> > 4. Applying DB schema database_schema_18-3.sql followed by
>> >    database_schema_3-4.sql
>> >
>> > 5. Running /usr/local/dspace/bin/dspace checker -l -p
>> > 6. Starting tomcat on the new server and running:
>> >
>> >    /usr/local/dspace/bin/dspace index-discovery
>> >
>> > Then I'm attempting to run "/usr/local/dspace/bin/dspace stats-util -b"
>> > which is when the error message occurs.
>> >
>> > After Step 6, I have also tried:
>> >
>> > /usr/local/dspace/bin/dspace stats-util -o
>> > /usr/local/dspace/bin/dspace stat-general
>> > /usr/local/dspace/bin/dspace stat-initial
>> >
>> > but am still running into the above error message upon running
>> > “stats-util -b”.
>> >
>> > If anyone could assist me with this it would be appreciated.
>> >
>> > With Thanks,
>> >
>> > Patrick Rynhart
>>
>> _______________________________________________
>> DSpace-tech mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>> List Etiquette:
>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://www.library.georgetown.edu/lit/code
425-298-5498

