[dspace-tech] Re: DOI import error when DOI contains parentheses. Publisher, the American Society of Civil Engineer

2019-06-04 Thread Chris Gray
Recently I was doing work with APIs and I ran across the documentation for the 
Crossref REST API at https://github.com/CrossRef/rest-api-doc.

This includes the following warning:

"You should always url-encode DOIs and parameter values when using the API. 
DOIs are notorious for including characters that break URLs (e.g. 
semicolons, hashes, slashes, ampersands, question marks, etc.)."

Parentheses can be encoded as %28 for "(" and %29 for ")".
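
As a rough illustration of that advice, here is a minimal Python sketch using 
the standard library's urllib.parse.quote.  The helper name encode_doi is made 
up for this example, and the https://api.crossref.org/works/ endpoint is an 
assumption based on the Crossref documentation, not something DSpace itself 
is known to call this way.

```python
from urllib.parse import quote


def encode_doi(doi):
    """Percent-encode a DOI so it can be used as a single URL path segment."""
    # safe="" makes quote() encode "/" as well; by default it leaves "/" alone.
    return quote(doi, safe="")


# Example DOI taken from the original question.
doi = "10.1061/(ASCE)AE.1943-5568.342"
encoded = encode_doi(doi)

# Assumed endpoint pattern (see the Crossref REST API documentation).
url = "https://api.crossref.org/works/" + encoded
print(url)
```

The parentheses come out as %28 and %29 and the slash as %2F, so the DOI can 
no longer be misread as URL syntax.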

On Tuesday, June 4, 2019 at 2:07:20 PM UTC-4, Mr. B wrote:
>
> Hi there,
> We've been experiencing some issues with the "import publication details 
> using DOI" feature when the DOI contains parentheses. See for example
> 10.1016/S2214-109X(16)30242-X, where multiple publications are found.
> Parentheses are pretty rare in DOIs, but there is one publisher, the 
> American Society of Civil Engineers, that includes them in all of its 
> publications; see for example
> https://doi.org/10.1061/(ASCE)AE.1943-5568.342. 
> Any help would be appreciated!!
>
>

-- 
All messages to this mailing list should adhere to the DuraSpace Code of 
Conduct: https://duraspace.org/about/policies/code-of-conduct/
--- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/dspace-tech/de40284c-65d5-4011-9d52-8b3f94b3a1e3%40googlegroups.com.


[dspace-tech] Re: Filter media error

2019-05-24 Thread Chris Gray
Essentially this means the PDF in question is defective and was not created 
properly by whatever tool was used to create it.

A problem like this occurs while creating the plain text bitstream and is hard 
to fix.  You would need to convert the pages of the PDF into high-quality 
images and apply OCR, which might be hard with "symbolic fonts".  Getting the 
new TEXT bitstream into DSpace is also hard: you have to get it right so that 
Solr will read it properly for indexing.

Chris

On Friday, May 24, 2019 at 8:22:57 AM UTC-4, Massimiliano CILURZO wrote:
>
> Dear All,
> We have installed a new DSpace server (6.3), and then we launched the 
> command [dspace]/bin/dspace filter-media.
> But during the execution we get this error for some items:
> java.lang.IllegalArgumentException: Symbolic fonts must have a built-in 
> encoding
> Can anyone help?
> Thanks
> Best regards 
> Massimiliano
>

To view this discussion on the web visit 
https://groups.google.com/d/msgid/dspace-tech/01699159-6cd7-4bac-85e5-78fb1f65b2d8%40googlegroups.com.


Re: [dspace-tech] How to debug the DSpace 5 REST API?

2018-05-14 Thread Chris Gray
Thanks for your reply, Bram,

As it turned out we were confused by a slight difference between our 
staging server (where we are testing PURE integration) and our production 
server.  We had uri.identifier.doi on production and not on staging.  Once 
we added that to the metadata registry on staging, everything started working.

On Monday, April 30, 2018 at 2:18:02 AM UTC-4, Bram Luyten wrote:
>
> Hi Chris,
>
> we are also actively working with clients on PURE integrations, exploring 
> the REST-API, whereas PURE used to rely on the LNI connector in the past.
> Normally, the dspace logs should provide some context about what's going 
> on where. If you're not seeing any entries in your dspace logs for the REST 
> calls, there might be something wrong regarding log config.
>
> Very recently, we saw that PURE is making ?expand=bitstreams calls on 
> items that are nonexistent/deleted, in which case DSpace returns 500 
> errors, but we still need to learn exactly why PURE is making these calls.
>
> with kindest regards,
>
> Bram
>
> Bram Luyten
> 250-B Lucius Gordon Drive, Suite 3A, West Henrietta, NY 14586
> Gaston Geenslaan 14, 3001 Leuven, Belgium (new address, Apr 2017)
> atmire.com
>
> On 26 April 2018 at 20:49, Chris Gray <cpgr...@gmail.com > 
> wrote:
>
>> We are trying to integrate Elsevier's PURE with the DSpace 5 REST API and 
>> having no luck so far.
>>
>> One impediment is that the error codes for the REST API are nearly 
>> useless.  HTTP status codes are used, but in inappropriate ways.  
>> For instance, all unsuccessful login attempts return a 403 Forbidden, which 
>> is not what 403 means.  The correct code is 401 Unauthorized.
>>
>> We are currently getting a 500 Internal Server Error, and I see from the 
>> Java code that there are a number of different conditions that can throw 
>> this error.
>>
>> Is there some way I can see exactly what's going wrong so I don't have to 
>> keep guessing what to try next?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "DSpace Technical Support" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to dspace-tech...@googlegroups.com .
>> To post to this group, send email to dspac...@googlegroups.com 
>> .
>> Visit this group at https://groups.google.com/group/dspace-tech.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To post to this group, send email to dspace-tech@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.


[dspace-tech] How to debug the DSpace 5 REST API?

2018-04-26 Thread Chris Gray
We are trying to integrate Elsevier's PURE with the DSpace 5 REST API and 
having no luck so far.

One impediment is that the error codes for the REST API are nearly 
useless.  HTTP status codes are used, but in inappropriate ways.  
For instance, all unsuccessful login attempts return a 403 Forbidden, which 
is not what 403 means.  The correct code is 401 Unauthorized.

We are currently getting a 500 Internal Server Error, and I see from the 
Java code that there are a number of different conditions that can throw 
this error.

Is there some way I can see exactly what's going wrong so I don't have to 
keep guessing what to try next?



Re: [dspace-tech] Trouble removing OAI metadata format

2018-02-06 Thread Chris Gray
Thanks, Tim,

That is what I was missing. I ran the command and everything is now showing 
as I expected.

On Tuesday, February 6, 2018 at 5:06:33 PM UTC-5, Tim Donohue wrote:
>
> Hi Chris,
>
> Have you cleared the OAI server cache?  The OAI server has its own 
> separate cache (of responses) that can be cleaned by running:
>
> [dspace]/bin/dspace oai clean-cache
>
> See also this section of the documentation: 
> https://wiki.duraspace.org/display/DSDOC6x/OAI+2.0+Server#OAI2.0Server-IndexingOAIcontent
>
> - Tim
>
> On Tue, Feb 6, 2018 at 3:11 PM Chris Gray <cpgr...@gmail.com > 
> wrote:
>
>> I've just removed a metadata format (one we made locally) from our 
>> staging server.  I removed all references to it in 
>> [dspace]/config/crosswalks/oai/xoai.xml. While the OAI interface no longer 
>> works for this format, the format is still listed when the request 
>> verb=ListMetadataFormats is made.
>>
>> I have tried clearing caches, restarting Tomcat, restarting the oai 
>> application, reloading the oai application, and rebooting the entire 
>> server.  I've checked and double-checked the xoai.xml file and the relevant 
>> xsl files both in source and as deployed.  The defunct format is still 
>> listed among the available metadata formats.
>>
>> Is there some setting I'm missing?
>>
> -- 
> Tim Donohue
> Technical Lead for DSpace & DSpaceDirect
> DuraSpace.org | DSpace.org | DSpaceDirect.org
>



[dspace-tech] Trouble removing OAI metadata format

2018-02-06 Thread Chris Gray
I've just removed a metadata format (one we made locally) from our staging 
server.  I removed all references to it in 
[dspace]/config/crosswalks/oai/xoai.xml. While the OAI interface no longer 
works for this format, the format is still listed when the request 
verb=ListMetadataFormats is made.
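
For anyone wanting to check this without reading raw XML, here is a minimal 
Python sketch that pulls the metadataPrefix values out of a 
ListMetadataFormats response.  The sample response is hand-written for 
illustration, "local_fmt" is a hypothetical stand-in for our local format, 
and in practice the XML would be fetched from the OAI endpoint of your own 
server.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_metadata_prefixes(xml_text):
    """Return the metadataPrefix values found in a ListMetadataFormats response."""
    root = ET.fromstring(xml_text)
    path = ".//oai:metadataFormat/oai:metadataPrefix"
    return [el.text for el in root.findall(path, OAI_NS)]


# Hand-written sample response; "local_fmt" is a hypothetical local format.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>local_fmt</metadataPrefix>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""

print(list_metadata_prefixes(SAMPLE))
```

If the removed prefix still shows up in the list after the configuration 
change, the response is coming from somewhere other than the current 
xoai.xml.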

I have tried clearing caches, restarting Tomcat, restarting the oai 
application, reloading the oai application, and rebooting the entire 
server.  I've checked and double-checked the xoai.xml file and the relevant 
xsl files both in source and as deployed.  The defunct format is still 
listed among the available metadata formats.

Is there some setting I'm missing?



[dspace-tech] possible bug in DSpace 5.8 in author browse with authority index

2017-10-30 Thread Chris Gray
We were at DSpace 5.5 (with XMLUI and Mirage2 and with ORCID enabled) and 
responded to the bug fixes in 5.6 and 5.7 by incorporating the security 
patches rather than doing a full upgrade to 5.6 and 5.7.

When 5.8 was released we decided to upgrade to 5.8, and everything is 
working fine for us except for the author browse with metadataAuthority.  
The authors of new items were not appearing in our author browse index 
although they are in the Solr discovery and authority cores.  When we ran 
index-discovery -b and -o and index-authority, every author disappeared 
from the author browse and they have not been seen since.  All our items 
appear in the other browses.

We have a staging server, where we made multiple attempts to reconfigure 
the browse index and adjust confidence values, all to no avail.  I've 
carefully reviewed the documentation on ORCID and authority configuration, 
also without success.  Not even going back to the default author browse, 
which isn't based on the authority index, works.

Finally, taking the staging server back to our 5.5 code with 5.6 and 5.7 
patches, we were able to bring back the author browse.  Checking carefully 
I found that there were no significant differences in dspace.cfg between 
our staging server with the patched 5.5 code and our 5.8 production 
server.  I've also been carefully through our log files for cocoon, dspace, 
and solr and can find nothing that would explain the difference between the 
author browse on our two servers.

We are holding back from upgrading to DSpace 6 for two reasons.  Our 
institution has adopted PURE and so far our information is that PURE works 
with DSpace 5 (REST API) but it is not known if it works with DSpace 6.  In 
any case, our PURE project is not at the point of trying DSpace 
integration. The other reason for hesitating is seeing that people are 
still having various problems with adopting DSpace 6.

At this point, I'm fairly certain there is a bug, but I can't be sure where 
it was introduced between 5.5 and 5.8, and I'm at a loss as to how to 
investigate further.  Given our present knowledge, it would seem best to go 
back to our patched 5.5 rather than staying with 5.8.

Does anyone have any suggestions about where we might look next?  Is anyone 
else at 5.6, 5.7, or 5.8 using Mirage 2 and ORCID successfully?



[dspace-tech] Added items not accessible via browsing by author

2017-10-11 Thread Chris Gray
After upgrading to DSpace 5.8, added items no longer appear in the browse 
by author.  Existing items were there but not new items.  If an author with 
a specific item was in DSpace before the upgrade, after the upgrade the 
browse by author only showed the older item not the new item.  The 
underlying unique id for that author is present for both items.  Both items 
are in the 'search' Solr core and the author is in the 'authority' core.  
Everything is findable by a search and by all the other browse functions, 
just not author browse.

We have implemented ORCID functionality and our author browse was working 
before the upgrade to 5.8.  We set the author browse to use 
metadataAuthority instead of metadata, and it was working before the 
upgrade.  We've reverted our staging server to before the 5.8 upgrade and 
the author browse is working fine.

We tried to fix the production server by having the discovery index rebuilt 
from scratch (-b option to index-discovery).  This removed everything from 
the author browse.  This suggests that items are added to the discovery 
index in such a way that they are not accessed by the author browse anymore.

It seems that there is some problem with the way items are added to the 
discovery index and how the XMLUI browse function accesses the Solr core.  
We are using XMLUI with Mirage2.

Does anyone have any idea where we should look for the problem?  How to 
diagnose it?



Re: [dspace-tech] bitstreams and character encodings

2017-08-11 Thread Chris Gray
Thanks for the reminder, Terry.

I forgot that the text extraction doesn't use ImageMagick.  (I got confused 
because I am using ImageMagick and Tesseract to repair some bad text 
extractions in our DSpace.  We've found that the default mechanism has 
trouble in certain cases, including PDFs made from TeX and DVI.)

It also occurred to me, after posting, that Tomcat 7 is probably defaulting 
to ISO-8859-1 on our server.  We're using Ubuntu and the standard Ubuntu 
package for Tomcat and it probably needs to be set to serve files with 
UTF-8 since everything else in DSpace is UTF-8.

On Friday, August 11, 2017 at 1:46:00 PM UTC-4, Terry Brady wrote:
>
> Chris,
>
> The ImageMagick filter uses ghostscript to generate an image of the first 
> page of a document in order to create a thumbnail.  The full text 
> extraction is handled by a different filter.
>
>
> https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace/config/dspace.cfg#L346
>
> I have encountered a similar issue to the one that you have described, but I 
> have not found a comprehensive solution.
>
> I suspect that some of our issues are related to the source PDFs rather 
> than the DSpace code base.
>
> On a related note, we used to host HTML finding aids in our repository, 
> and we encountered a number of character set issues when displaying those 
> files.  I made the following modification to this file
>
>
> https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace-xmlui/src/main/java/org/dspace/app/xmlui/cocoon/BitstreamReader.java#L403
>
> if (bitstreamMimeType.equals("text/html")) {
>     bitstreamMimeType = "text/html; charset=UTF-8";
> }
>
>
> On Fri, Aug 11, 2017 at 6:45 AM, Chris Gray <cpgr...@gmail.com 
> > wrote:
>
>> You can fetch bitstreams from DSpace with URL paths like this (our xmlui 
>> context is implicit):
>>
>> /bitstream/id/{bitstream_id}/{bitstream_filename}
>>
>> I've been noticing that in our case txt bitstreams and pdf bitstreams are 
>> always delivered by the server with the character set in the response 
>> header set to ISO-8859-1 and not UTF-8.
>>
>> Is this a setting somewhere?  Is it possible to make it more flexible and 
>> adapt to actual content?
>>
>> In particular, I'm looking at the .pdf.txt files extracted by ImageMagick 
>> for full text indexing purposes.
>>
>> Is it possible to set a character encoding for individual pdfs and have 
>> ImageMagick take that into consideration in extracting full text?
>>
>> We are using DSpace 5.5 with security patches for 5.6 and 5.7 and XMLUI 
>> with Mirage2.
>>
>
>
>
> -- 
> Terry Brady
> Applications Programmer Analyst
> Georgetown University Library Information Technology
> http://georgetown-university-libraries.github.io/
> 425-298-5498 (Seattle, WA)
>



[dspace-tech] bitstreams and character encodings

2017-08-11 Thread Chris Gray
You can fetch bitstreams from DSpace with URL paths like this (our xmlui 
context is implicit):

/bitstream/id/{bitstream_id}/{bitstream_filename}

I've been noticing that in our case txt bitstreams and pdf bitstreams are 
always delivered by the server with the character set in the response 
header set to ISO-8859-1 and not UTF-8.
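
To make the observation concrete, here is a minimal Python sketch of how the 
character set can be read out of a Content-Type header value.  It only 
parses a header string; the example values are typed in by hand rather than 
captured from our server, and charset_of is a made-up helper name.

```python
from email.message import Message


def charset_of(content_type):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Hand-typed header values of the kind described above.
print(charset_of("text/plain;charset=ISO-8859-1"))
print(charset_of("text/plain; charset=UTF-8"))
print(charset_of("application/pdf"))
```

Running a check like this against the response headers for a .pdf.txt 
bitstream shows which charset the server is declaring, whatever the bytes 
inside the file actually are.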

Is this a setting somewhere?  Is it possible to make it more flexible and 
adapt to actual content?

In particular, I'm looking at the .pdf.txt files extracted by ImageMagick 
for full text indexing purposes.

Is it possible to set a character encoding for individual pdfs and have 
ImageMagick take that into consideration in extracting full text?

We are using DSpace 5.5 with security patches for 5.6 and 5.7 and XMLUI 
with Mirage2.



[dspace-tech] Re: index-authority -o

2017-05-04 Thread Chris Gray
We ran into this problem.

The account that runs dspace may have its login profile configured to set 
the Java variables (JAVA_OPTS) needed to give the account more memory than 
the default, so the [dspace]/bin/dspace scripts run fine from the command 
line, but the login profile is not used by cron jobs.

For us the problem was solved by adding a line setting the JAVA_OPTS 
environment variable near the beginning of our crontab. Just repeat the 
JAVA_OPTS line that is in the account's .profile file. An "export" command 
is not needed, since a variable set in the crontab is passed to the jobs 
listed after it.
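
The crontab line is what we actually used.  As an alternative for sites that 
prefer a wrapper script, here is a minimal Python sketch of the same idea: 
set JAVA_OPTS explicitly instead of relying on the login profile.  The 
function names, the /dspace/bin/dspace path, and the -Xmx1024m value are all 
hypothetical; use whatever your account's .profile sets.

```python
import os
import subprocess


def env_with_java_opts(java_opts, base=None):
    """Return a copy of the environment with JAVA_OPTS set explicitly."""
    env = dict(os.environ if base is None else base)
    env["JAVA_OPTS"] = java_opts
    return env


def run_dspace(dspace_bin, args, java_opts):
    """Run a [dspace]/bin/dspace command with JAVA_OPTS in its environment."""
    command = [dspace_bin] + list(args)
    return subprocess.run(command, env=env_with_java_opts(java_opts), check=False)


# Hypothetical usage from a script that cron calls:
#   run_dspace("/dspace/bin/dspace", ["index-authority"], "-Xmx1024m")
```

Either way the point is the same: the job must not depend on anything that 
only exists in an interactive login.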

Chris

On Wednesday, May 3, 2017 at 10:07:48 AM UTC-4, molly mcmanus wrote:
>
> We are using dspace 5.3, linux, oracle and are getting an error each night 
> when we run the index-authority -o
>
>
> The error we are getting is below. Is this due to memory or something we 
> have set up incorrectly? 
>
> Thanks, Molly
>
> Exception: null
> java.lang.StackOverflowError
>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>     at oracle.net.ns.Packet.receive(Packet.java:308)
>     at oracle.net.ns.DataPacket.receive(DataPacket.java:106)
>     at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:324)
>     at oracle.net.ns.NetInputStream.read(NetInputStream.java:268)
>     at oracle.net.ns.NetInputStream.read(NetInputStream.java:190)
>     at oracle.net.ns.NetInputStream.read(NetInputStream.java:107)
>     at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:124)
>     at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:80)
>     at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1137)
>     at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:350)
>     at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
>     at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
>     at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:208)
>     at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:1046)
>     at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1207)
>     at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
>     at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3613)
>     at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3657)
>     at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1495)
>     at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
>     at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
>     at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
>     at org.dspace.storage.rdbms.DatabaseManager.queryTable(DatabaseManager.java:234)
>     at org.dspace.content.DSpaceObject$MetadataCache.retrieveMetadata(DSpaceObject.java:1330)
>     at org.dspace.content.DSpaceObject$MetadataCache.get(DSpaceObject.java:1265)
>     at org.dspace.content.DSpaceObject.getMetadata(DSpaceObject.java:676)
>     at org.dspace.content.DSpaceObject.getMetadata(DSpaceObject.java:585)
>     at org.dspace.content.DSpaceObject.getMetadataByMetadataString(DSpaceObject.java:643)
>     at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:130)
>     at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:159)
>     at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:159)
>     at org.dspace.authority.indexer.DSpaceAuthorityIndexer.hasMore(DSpaceAuthorityIndexer.java:159)
>
>
>


Re: [dspace-tech] Where in DSpace are ORCIDs stored?

2017-04-07 Thread Chris Gray
Thanks, Helix,

That clears some of this up for me.  I'm still a little mystified by what 
you say about backing up the Solr authority index.  I know that Solr 
indexes are just files.  I'll have to learn how to back up and restore 
Solr indexes.

When you say you can deduplicate with CSV export, do you mean a CSV export 
from Solr?  How would you get the CSV back into Solr?  Again my ignorance 
of Solr may be showing.

As I understand the DSpace metadata export and import via CSV, there is no 
provision for specifying authority values.  That would be an "unattended" 
submission, which then has to be corrected via metadata editing in the UI.  
Metadata editing in the UI is the only method I've been able to find for 
deduplicating authors.

On Friday, April 7, 2017 at 11:08:09 AM UTC-4, helix84 wrote:
>
> Hi Chris,
>
>
> The bad news is that Solr authority is treated as a persistent store, so 
> you have to back it up, too. If you lost it, you'd have to de-duplicate 
> your authors again. Also, there is currently no UI for deduplication (you 
> can use CSV export and deduplicate manually).
>
>
> There are currently some improvements to ORCID lookup in the pipeline (but 
> not to the aforementioned design) in case you want to review them:
> https://github.com/DSpace/DSpace/pull/1698
>
>



[dspace-tech] Where in DSpace are ORCIDs stored?

2017-04-07 Thread Chris Gray
I've been poking around in DSpace 5 and ORCIDs don't seem to be stored 
anywhere in the database.  They only seem to turn up in the Solr authority 
index.  They don't appear in the Solr discovery (search) index.

So my guess at this point is that the ORCID and some information from the 
ORCID database at orcid.org are saved in the authority index as a result of 
the choice made when submitting an item or editing its metadata.  
Furthermore, the choice management method is the only way to get ORCIDs 
into DSpace.  So submission forms, workflows, and metadata editing in the 
UI are the only ways to enter this data.

I suppose this also means that if you ever delete the authority index and 
rebuild it from scratch, much of the data associated with an authority 
record will be lost.  This is why there is no -b option for the bin/dspace 
index-authority command.

Am I correct or am I missing something?



[dspace-tech] Re: How to get bitstream thumbnail location from the database?

2017-03-30 Thread Chris Gray
Pick an easy problem, OK?  :)

I can get you started and get you to the file name and sequence number, but 
the policy (isAllowed=y) is trickier because policies are hierarchical and 
inheritable.  I haven't worked much with policies in the database.

This will get you started, to get the other data given a handle:

select item2bundle.item_id, bundle_id, bitstream_id,
       text_value as filename, sequence_id
from bitstream -- here we go up the hierarchy
join bundle2bitstream
using (bitstream_id)
join item2bundle
using (bundle_id)
join handle
on (handle.resource_id = item2bundle.item_id)
join metadatavalue
on (bitstream.bitstream_id = metadatavalue.resource_id) -- file name is dc.title for the bitstream
where handle = '10568/70234' -- your handle example
and metadatavalue.resource_type_id = 0 -- 0 = bitstream, so item titles with the same id don't match
and metadata_field_id = 64 -- dc.title
and bitstream_format_id = 11;

The text_value is the file name.  This will require some refinement, since 
there may be other jpeg bitstreams (bitstream_format_id = 11).  You could 
check that the file name ends in .pdf.jpg instead of using the 
bitstream_format_id.
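
Once the query has given you a file name and sequence number, the URL can be 
put together from them.  Here is a minimal Python sketch, assuming the URL 
pattern seen in the cgspace example from the question; thumbnail_url is a 
made-up helper name, and the isAllowed parameter is deliberately left out 
because it depends on the policies.

```python
from urllib.parse import quote, urlencode


def thumbnail_url(base_url, handle, filename, sequence_id):
    """Build a bitstream URL from the columns returned by the query."""
    # The handle keeps its "/"; the file name is percent-encoded in case it
    # contains spaces or other characters that are not URL-safe.
    path = "/bitstream/handle/{}/{}".format(handle, quote(filename))
    query = urlencode({"sequence": sequence_id})
    return base_url.rstrip("/") + path + "?" + query


print(thumbnail_url(
    "https://cgspace.cgiar.org",
    "10568/70234",
    "beca_africanRice_poster_feb2016.pdf.jpg",
    4,
))
```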

You can look up the policies by using the resource_id in the resourcepolicy 
table, but deciphering that isn't straightforward.  It depends if you ever 
make thumbnails private.  We don't in our instance.

Cheers,
Chris

On Thursday, March 30, 2017 at 10:35:57 AM UTC-4, Tsegaselassie Tadesse 
wrote:
>
> I am trying to build a very light-weight API endpoint that lists items 
> from the database. I want to provide a url link to the item's thumbnail but 
> it seems DSpace works very differently from what I expected.
>
> I have read the Storage Layer documentation, the Bitstream Store section 
> to be specific, on how to use the internal id to get to the directory of 
> the bitstream; however, what I am looking for is just to get the exact 
> thumbnail for the item. For example, on the detail page of an item, 
> https://cgspace.cgiar.org/handle/10568/70234, you will see the thumbnail 
> has a normal image source, 
> https://cgspace.cgiar.org/bitstream/handle/10568/70234/beca_africanRice_poster_feb2016.pdf.jpg?sequence=4&isAllowed=y.
>
> How do I generate that?
>



[dspace-tech] Re: Cannot access SOLR remotely

2016-11-08 Thread Chris Gray
Solr should never be exposed to the Web.  What you would be exposing is the 
full administrative interface for Solr, which has no access control.  Making 
it available on the Web would be a huge vulnerability.

Solr is only accessible from localhost.  You can reach it by setting up an 
SSH tunnel from your machine to the server and then accessing the Solr 
interface via the tunnel.
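
For a reporting script, the tunnel can be opened from code as well.  Here is 
a minimal Python sketch that builds the ssh command; the user name is a 
placeholder, the host is the one from the question, and port 8080 is assumed 
from the localhost URL quoted there.

```python
def ssh_tunnel_command(user, host, local_port=8080, remote_port=8080):
    """Build an ssh command that forwards a local port to Solr on the server."""
    # -N: run no remote command; -L: forward local_port to localhost:remote_port
    # as seen from the server, which is where Solr accepts connections.
    forward = "{}:localhost:{}".format(local_port, remote_port)
    return ["ssh", "-N", "-L", forward, "{}@{}".format(user, host)]


cmd = ssh_tunnel_command("someuser", "myserver.thing.org")
print(" ".join(cmd))
# The list can be handed to subprocess.Popen(cmd); while the tunnel is up,
# queries go to http://localhost:8080/solr/... on the client machine.
```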

On Monday, November 7, 2016 at 1:33:15 PM UTC-5, Donald Bynum wrote:
>
> I have DSpace 5.5 on Tomcat with Oracle as the DB.  I want to run some 
> SOLR queries from a remote client, i.e. NOT running on the Tomcat server as 
> localhost.  I need to do this in order to create some remote reporting 
> functions.  Accessing SOLR on the Tomcat server as localhost is just fine: 
> http://localhost:8080/solr/...
>
> When I try the same from a remote client:  
> http://myserver.thing.org:8080/solr... I get a 403 error - "*Access to 
> the specified resource has been forbidden."*
>
> Any guidance here would be much appreciated.
>
> Regards,
>
> Don.
>
>
>



Re: [dspace-tech] Author name alternate spellings

2016-08-04 Thread Chris Gray
This could be a problem with the underlying id stored for each name.  By 
default, the authority index doesn't assume two names are for the same 
person unless they are explicitly associated with the same underlying id.

An authority record looks like this:

  {
    "id": "badd7401-12aa-4c5c-8d62-515e5746b9d3",
    "field": "dc_contributor_author",
    "value": "Author, Random",
    "deleted": false,
    "creation_date": "2015-11-27T05:46:05.066Z",
    "last_modified_date": "2015-11-27T05:46:05.066Z",
    "authority_type": "person",
    "first_name": "Random",
    "last_name": "Author"
  },

That first "id" element determines what can or can't be considered the same 
person for indexing.

We discovered this problem with ORCIDs.  Even when two entries were given 
the same ORCID, they were assigned different ids and showed up in browse 
lists and facets as multiple authors.  We have to force a certain id when 
adding a new dc.contributor field.
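
To find names affected by this, one option is to export the authority 
records and group them by value.  The following is only a sketch: it assumes 
the records have already been loaded as dictionaries in the shape shown 
above, and the ids in the sample data are made up.

```python
from collections import defaultdict


def split_identities(records):
    """Return {(field, value): [ids]} for every name stored under more than one id."""
    ids = defaultdict(set)
    for record in records:
        if record.get("deleted"):
            continue  # ignore records flagged as deleted
        ids[(record["field"], record["value"])].add(record["id"])
    return {key: sorted(found) for key, found in ids.items() if len(found) > 1}


# Hypothetical sample records; only the keys needed for grouping are included.
records = [
    {"id": "id-1", "field": "dc_contributor_author", "value": "Author, Random", "deleted": False},
    {"id": "id-2", "field": "dc_contributor_author", "value": "Author, Random", "deleted": False},
    {"id": "id-3", "field": "dc_contributor_author", "value": "Writer, Other", "deleted": False},
]
print(split_identities(records))
```

Each name reported this way is a candidate for showing up more than once in 
browse lists and facets.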

On Thursday, August 4, 2016 at 8:08:05 AM UTC-4, Alan Orth wrote:
>
> Hi, 
>
> We have hundreds or thousands of duplicate authors that have the same 
> exact text_value, but show as separate authors in Discovery's author 
> sidebar facet. I have tried a handful of these configuration keys 
> (with a full discovery reindex after) on DSpace 5.1 but I never see 
> any change. 
>
> First I tried: 
>
> index.authority.ignore-prefered.dc.contributor.author=true 
> index.authority.ignore-variants.dc.contributor.author=false 
>
> Then: 
>
> index.authority.ignore=true 
> index.authority.ignore-prefered=true 
> index.authority.ignore-variants=true 
>
> Then: 
>
> discovery.index.authority.ignore-prefered.dc.contributor.author=true 
> discovery.index.authority.ignore-variants=true 
>
> What is the trick to getting Discovery to use author text values for 
> its indexes? Is this a bug that upgrading to 5.{2,3,4,5} will fix? I'm 
> going slightly crazy. :) 
>
> Thanks, 
>
> On Mon, Jan 4, 2016 at 11:19 PM, Andrea Bollini wrote: 
> > Hi all, 
> > the relevant parameters are 
> > discovery.browse.authority.ignore-prefered 
> > discovery.index.authority.ignore-prefered 
> > 
> > I was probably partially wrong about the browse behaviour. Looking at 
> > the code (sorry, I have had no chance to run an actual test), the 
> > metadata value is never recorded in the browse index when it is 
> > authority controlled; see 
> > 
> > https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/browse/SolrBrowseCreateDAO.java#L186
> > 
> > The search (facets), by contrast, only includes the preferred form if 
> > you use the default configuration; see 
> > 
> > https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/discovery/SolrServiceImpl.java#L1091
> > 
> > It also looks as if the browse system is buggy when ignore-prefered is 
> > set to true, as probably nothing is indexed in the browse in this case. 
> > We have probably fixed this bug in our dspace-cris fork and forgotten 
> > to backport it to basic DSpace; see 
> > 
> > https://github.com/Cineca/DSpace/blob/dspace-5_x_x-cris/dspace-api/src/main/java/org/dspace/browse/SolrBrowseCreateDAO.java#L224
> > 
> > About your second question, which is the authority for authors: in a 
> > dspace-cris instance it is the internal researcher pages database and 
> > the ORCID registry. In a standard DSpace instance you can configure the 
> > ORCID registry as the first lookup for author names; when the item is 
> > archived, the ORCID record is stored in a local Solr-based cache, which 
> > is what is actually used as the authority: 
> > 
> > https://github.com/DSpace/DSpace/blob/master/dspace/config/dspace.cfg#L1563 
> > 
> > This means that, at least if you do not edit the metadata value 
> > directly in the database or through the admin edit, you cannot have an 
> > authority with a corresponding value different from the "preferred 
> > one". With DSpace-CRIS, where variants are also managed out of the box, 
> > this can happen more easily. 
> > 
> > Best, 
> > Andrea 
> > 
> > - Original message - 
> > From: "Hilton Gibson"  
> > To: "Peter Dietz"  
> > Cc: "DSpace Technical Support"  
>
> > Sent: Monday, 4 January 2016 16:40:31 
> > Subject: Re: [dspace-tech] Author name alternate spellings 
> > 
> > 
> > 
> > Hi All, 
> > 
> > 
> > " You can also configure the system (see the discovery.cfg options) to 
> ignore the metadatavalue and put in the index only the prefered form of a 
> name as provided by the authority." 
> > 
> > 
> > 1. Where in "discovery.cfg" is this configured? A github link would 
> help. 
> > 2. Who is the "authority" for authors? Excuse the pun! 
> > 
> > 
> > Regards 
> > 
> > 
> > hg 
> > 
> > Hilton Gibson 
> > Stellenbosch University Library 
> > 
> > http://orcid.org/-0002-2992-208X 
> > 
> > 
> 

Re: [dspace-tech] possible (very minor) bug in DSpace 5.5

2016-07-26 Thread Chris Gray
In each of the pages I pointed out there are two links back to the home 
page.  One above the "page cannot be found" message and one below it.  The 
one below works.  The one above comes back to the error page.

On Tuesday, July 26, 2016 at 2:19:37 PM UTC-4, Luiz dos Santos wrote:
>
> Hi Chris,
>
> Weird. Besides the exception highlighted by Tom, I could go back to the 
> home page by clicking on all the links that you posted. I would like to see 
> the error; could you give another example? Maybe it is related to something 
> else.
>
> Best regards
> Luiz
>
> On Tue, Jul 26, 2016 at 10:59 AM, Chris Gray <cpgr...@gmail.com 
> > wrote:
>
>> I just discovered something that doesn't work in our instance of DSpace 
>> 5.5.  We are using xmlui with Mirage2, so I don't know if it is only a bug 
>> in that particular interface.
>>
>> If you go to an item that doesn't exist, for example, 
>> https://uwspace.uwaterloo.ca/handle/10012/105932, the breadcrumb link to 
>> "UWSpace Home" (with the house icon beside it) doesn't go back to the home 
>> page; it points back to the URL that causes the error.
>>
>> I can confirm that other sites are affected by this.
>>
>> http://www.oceandocs.org/handle/1834/917388
>> http://cardinalscholar.bsu.edu/handle/123456789/2002778
>>
>> Some aren't.
>>
>> https://ubir.buffalo.edu/xmlui/handle/10477/6074999
>>
>
>



[dspace-tech] possible (very minor) bug in DSpace 5.5

2016-07-26 Thread Chris Gray
I just discovered something that doesn't work in our instance of DSpace 
5.5.  We are using xmlui with Mirage2, so I don't know if it is only a bug 
in that particular interface.

If you go to an item that doesn't exist, for example, 
https://uwspace.uwaterloo.ca/handle/10012/105932, the breadcrumb link to 
"UWSpace Home" (with the house icon beside it) doesn't go back to the home 
page; it points back to the URL that causes the error.

I can confirm that other sites are affected by this.

http://www.oceandocs.org/handle/1834/917388
http://cardinalscholar.bsu.edu/handle/123456789/2002778

Some aren't.

https://ubir.buffalo.edu/xmlui/handle/10477/6074999



[dspace-tech] Trigger an event on publishing an item

2016-07-12 Thread Chris Gray
Is it possible to trigger an event automatically when an item is published?

Specifically, I want to automatically reindex OAI (run [dspace]/bin/dspace 
oai import -o) when a new item is submitted.



[dspace-tech] Re: Solr url access returning 403 error

2016-05-24 Thread Chris Gray
You can solve this problem by ssh tunneling to your server without
any change to the server's configuration.  If you have ssh access
to the server you can set up a tunnel.

Coming from a Linux box I use:
   ssh -L localhost:9180:127.0.0.1:8080 myusername@my.dspace.server -N
If you don't have ssh keys set up, you will have to log in after giving 
this command.  When you are done accessing Solr and have closed your 
browser, use ^C to exit the command.

This tunnels from my Linux workstation to the DSpace server.  The Solr 
dashboard is then accessible from my browser at:
  http://localhost:9180/solr/#/

We also use the tunneling capability of PuTTY to give access to Windows 
workstations.

Chris
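
Once a tunnel like this is up, scripts can use the forwarded port as well as 
the browser.  Here is a minimal sketch that only builds the query URL; the 
core name "search" and the match-all query are assumptions to adjust for 
your own installation, and 9180 is the local port from the command above.

```python
from urllib.parse import urlencode


def solr_select_url(core, query, local_port=9180, rows=0):
    """Build a Solr select URL that goes through the local end of the tunnel."""
    # urlencode percent-encodes characters such as '*' and ':' in the query.
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"http://localhost:{local_port}/solr/{core}/select?{params}"


# "search" is assumed to be the name of the Discovery core.
print(solr_select_url("search", "*:*"))
```

The resulting URL can be fetched with any HTTP client while the tunnel is 
open.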

On Tuesday, May 24, 2016 at 11:01:29 AM UTC-4, Pantelis Karamolegkos wrote:
>
> I just installed dspace from the master branch source.
> I am however getting an "Internal System Error" on the landing page (as 
> you can see in the attached image)
>
> dspace log indicates a solr issue: http://pastebin.com/CP93Mwj0
>
> tl;dr: Solr is not accessible via HTTP (returning 403). Here is my 
> server.xml extract regarding the DSpace configuration:
>
>   docBase="/home/pkaramol/Workspace/dspace/dspace-installation/webapps/jspui" 
> reloadable="true" />
>  docBase="/home/pkaramol/Workspace/dspace/dspace-installation/webapps/solr" 
> reloadable="true">
>  className="org.apache.catalina.valves.RemoteAddrValve" 
> allow="127\.0\.0\.1|123\.123\.123\.123|111\.222\.233\.d+"/>
>  value="false" override="false" />
> 
>
>
> Any suggestions would be appreciated, thx.
>
> Pantelis
>
>
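
One detail in the quoted allow pattern may be worth checking: the last 
alternative ends in "d+" rather than "\d+".  The sketch below tests both 
forms, assuming the valve requires the whole client address to match the 
pattern.  The corrected pattern and the sample addresses are illustrative 
only, and this is not a confirmed diagnosis of the 403.

```python
import re

# The allow attribute exactly as it appears in the quoted server.xml extract.
ALLOW_AS_WRITTEN = r"127\.0\.0\.1|123\.123\.123\.123|111\.222\.233\.d+"
# Hypothetical correction: \d+ matches one or more digits.
ALLOW_WITH_DIGITS = r"127\.0\.0\.1|123\.123\.123\.123|111\.222\.233\.\d+"


def allowed(pattern, address):
    """True if the whole address matches the pattern."""
    return re.fullmatch(pattern, address) is not None


# "::1" is the IPv6 loopback address; neither pattern covers it.
for address in ("127.0.0.1", "111.222.233.45", "::1"):
    print(address, allowed(ALLOW_AS_WRITTEN, address), allowed(ALLOW_WITH_DIGITS, address))
```

If requests from the web application reach Solr from an address that the 
pattern does not cover, a 403 would be the expected response.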



Re: [dspace-tech] metadata export doubles columns

2016-04-20 Thread Chris Gray


On Wednesday, April 20, 2016 at 11:28:31 AM UTC-4, Claudia Jürgen wrote:
>
>
>
> On 20.04.2016 at 16:57, Tim Donohue wrote:
>
>
>
> On 4/20/2016 9:44 AM, Mark H. Wood wrote: 
>
> Is there any use case for a text_language value of an empty or blank 
> string? Should we not convert such values to null values in the business 
> logic layer on ingestion, and remove this puzzle? 
>
>
> No, there is not. It's just a hard puzzle to solve. I'm not exactly sure 
> *when* these empty / blank language values get into the system. 
>
>
> guess they come mostly from import and templates. (gonna check tomorrow 
> just leaving).
>
> And they are difficult to get rid of via the UI. One can only delete the 
> metadatum and create it again.
>
> Apart from making sure they are not created, a curation task to clean up 
> would be good, not only for empty values but also for invalid ones that do 
> not refer to a locale or that have trailing spaces, like "en ". As 
> text_lang influences other things, like sorting in browse, it would be 
> useful. The task could also check for null values where there most likely 
> should be a language, and for fields which should not have a language but 
> got one. In older versions the default text_lang was applied to all 
> metadata fields, even dates.
>
> Claudia
>
>
>
What prompted my original question was this duplication in the 
dc.contributor.author field where we are struggling to clean up authority 
issues.  I exported records that were created by the submission forms and 
then some of them were edited in the UI (Mirage2) metadata editing screen.  
Some had empty strings and some had nulls.

I get the following results in the database:

dspace=> select count(*)
from metadatavalue
where text_lang is null;
 count  
--------
 241888
(1 row)

dspace=> select count(*)
from metadatavalue
where text_lang = '';
 count 
-------
  8497
(1 row)
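
The two counts above suggest a cleanup that folds empty language values into 
nulls.  The sketch below demonstrates the idea on a toy table in an 
in-memory SQLite database.  It is not the real DSpace schema or a tested 
migration; the table has only two columns, the sample rows are invented, and 
any change to a production PostgreSQL database should be made only after a 
backup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in for the real table: only the two columns needed here.
conn.execute("CREATE TABLE metadatavalue (text_value TEXT, text_lang TEXT)")
conn.executemany(
    "INSERT INTO metadatavalue VALUES (?, ?)",
    [("Author, Random", None), ("Author, Random", ""), ("A title", "en "), ("Another", "en")],
)


def lang_counts(connection):
    """Return (rows with NULL language, rows with empty or blank language)."""
    nulls = connection.execute(
        "SELECT count(*) FROM metadatavalue WHERE text_lang IS NULL"
    ).fetchone()[0]
    blanks = connection.execute(
        "SELECT count(*) FROM metadatavalue WHERE trim(text_lang) = ''"
    ).fetchone()[0]
    return nulls, blanks


before = lang_counts(conn)
# Empty or whitespace-only languages become NULL; the rest lose stray spaces.
conn.execute("UPDATE metadatavalue SET text_lang = NULL WHERE trim(text_lang) = ''")
conn.execute("UPDATE metadatavalue SET text_lang = trim(text_lang) WHERE text_lang IS NOT NULL")
after = lang_counts(conn)
print(before, after)
```

After the update, values that differ only in having an empty or a missing 
language carry the same null language.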



[dspace-tech] metadata export doubles columns

2016-04-15 Thread Chris Gray
When exporting metadata for items I get duplication of columns.  For 
instance, the CSV will have a column for dc.identifier.uri and a column for 
dc.identifier.uri[].  This happens with other columns too.

I know you get this when you have different languages associated with 
values in the same field, but in this case there is no language associated 
with the fields.  They look identical in the UI, but get separated into two 
columns in the metadata output.

Does anyone know what would do this?  I assume we've misconfigured 
something.  Is it bad not to associate any language with some fields?
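
One possible explanation, offered as an assumption rather than a confirmed 
diagnosis, is that some values are stored with an empty-string language 
instead of no language at all.  The sketch below shows how that would yield 
two columns if the exporter adds a bracketed suffix whenever the language is 
not null.  The function is a simplified stand-in, not the actual DSpace 
export code.

```python
def csv_column(field, language):
    """Column header for a value, assuming only a missing language leaves the name bare."""
    if language is None:
        return field
    return f"{field}[{language}]"


# Hypothetical values: the same field once with no language and once with ''.
values = [("dc.identifier.uri", None), ("dc.identifier.uri", ""), ("dc.title", "en")]
columns = sorted({csv_column(field, language) for field, language in values})
print(columns)
```

Under that assumption, values that look identical in the UI still land in 
separate columns because their stored language differs.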



[dspace-tech] discovery and authority indexes retain typos after correction

2016-04-04 Thread Chris Gray
While submitting an item to our instance of DSpace, a typo was made in the 
author's name.  We have since corrected the metadata 
(dc.contributor.author) for this item, but the name with the typo remains 
in both the search and authority Solr indexes despite all attempts to clean 
and rebuild them.

We know of this particular problem, but if other typos have been corrected 
after submission and acceptance then they probably still linger.

Is there a more radical method to wipe out the Solr indexes and rebuild 
them?  Where is the faulty value of dc.contributor.author being retained so 
that it reappears upon reindexing?

If I delete the indexes from the Tomcat directories, can they be rebuilt?



[dspace-tech] how to refresh solr authority index

2016-03-14 Thread Chris Gray
How can the Solr authority index be updated?

Running [dspace]/bin/dspace index-discovery -b or -bfs leaves the authority 
core untouched.



[dspace-tech] build fails because ruby gem is unavailable

2016-02-09 Thread Chris Gray
We are getting this error when trying to rebuild DSpace:

[INFO] 
[ERROR] Failed to execute goal on project xmlui-mirage2: Could not resolve 
dependencies for project org.dspace.modules:xmlui-mirage2:war:5.1: Failure 
to find rubygems:rb-inotify:gem:0.9.6 in 
http://rubygems-proxy.torquebox.org/releases was cached in the local 
repository, resolution will not be reattempted until the update interval of 
rubygems-release has elapsed or updates are forced -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, 
please read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the 
command
[ERROR]   mvn <goals> -rf :xmlui-mirage2

 
A workaround is suggested at 
http://stackoverflow.com/questions/35270637/rb-inotify-0-9-6-gem-missing-from-rubygems-org 
but the Mirage 2 and Maven configuration is a black box that I hesitate to 
touch.

As of today, the same message refers to rb-inotify:0.9.7 instead.

Can someone provide DSpace Mirage 2 specific instructions for fixing this 
problem?



Re: [dspace-tech] Re: Full text content not being found in index for a particular item(s)

2015-12-11 Thread Chris Gray
Following this thread helped solve a similar issue.

We had one item that wasn't searchable that should have been.  I could find 
it by directly doing a solr search, but it wouldn't turn up in the web 
pages.

The mention of things being set private made me try setting the item as 
private and then setting it back to public.  This worked.

Is a private item recorded in the database in the item table under the 
"discoverable" column?  The item in question was the only one in our 
collection that was set as in_archive but not discoverable.  I set 
"discoverable" to true, but still it was not searchable.  Only resetting 
the public status through the web UI worked.  So I suspect private/public 
is marked elsewhere.



[dspace-tech] after migration bitstreams can't be found in assetstore

2015-11-09 Thread Chris Gray
We recently migrated servers from 3.1 to 5.3.

At first we migrated content via AIP export and import but found that our 
database was askew, disallowing creation of new communities.

So we moved the database directly via pg_dump and pg_restore, and 
everything seemed fine.

Now we're finding that many of the PDF bitstreams give a "file not found" 
error saying the file is not in the assetstore.

My suspicion is that the AIP import didn't duplicate the old assetstore 
but assigned new directories and filenames, while the database's references 
still point to the old structure.

Can we fix this by a straight copy of the assetstore from the old server to 
the new server?
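
Before copying anything, it may help to check which bitstreams the database 
expects to find on disk.  The sketch below derives a relative assetstore 
path from a bitstream's internal id, assuming the usual layout in which the 
first three pairs of characters of the id name three nested directories. 
That layout and the sample id are assumptions to verify against your own 
assetstore.

```python
from pathlib import PurePosixPath


def assetstore_path(internal_id, levels=3, digits=2):
    """Relative path of a bitstream, assuming <aa>/<bb>/<cc>/<internal_id>."""
    # Take the leading characters of the id, two at a time, as directory names.
    parts = [internal_id[i * digits:(i + 1) * digits] for i in range(levels)]
    return PurePosixPath(*parts, internal_id)


# Hypothetical internal id, for illustration only.
print(assetstore_path("12345678901234567890123456789012345678"))
```

Joining this relative path to the assetstore directory on each server and 
testing whether the file exists would show whether the database's references 
match the old store, the new one, or neither.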



[dspace-tech] license.txt truncated; best way to correct

2015-10-29 Thread Chris Gray
I've come across two items in our 3.1 repository that are incorrectly 
exported by the packager command.  This is because the size of the 
license.txt in the item is not reported correctly, and this in turn seems to 
be because the license.txt file has been truncated.

In one case, only the final period on the text file is missing.  In the 
other case one and a half words have gone missing at the end.  In both 
cases, the missing characters are part of the standard wording of the file.

I'm not seeing a way to edit the file in the 3.1 xmlui interface.  How can 
this problem be corrected and the integrity of the file restored?



[dspace-tech] can't ingest in 5.3 - bitstream size does not match size in PREMIS

2015-10-26 Thread Chris Gray
An import in 5.3 is choking on an export done in 3.1.

Using the command:

[dspace]/bin/dspace packager -r -a -f -t AIP -e [email] -i [site prefix]/0 -o skipIfParentMissing=true /path/aip.zip

I get the following error and stack trace:

org.dspace.content.crosswalk.MetadataValidationException: Bitstream size (247) does not match size in PREMIS (246), rejecting it.
    at org.dspace.content.crosswalk.PREMISCrosswalk.ingest(PREMISCrosswalk.java:115)
    at org.dspace.content.crosswalk.PREMISCrosswalk.ingest(PREMISCrosswalk.java:88)
    at org.dspace.content.packager.METSManifest.crosswalkXmd(METSManifest.java:1193)
    at org.dspace.content.packager.METSManifest.crosswalkBitstream(METSManifest.java:1310)
    at org.dspace.content.packager.AbstractMETSIngester.addBitstreams(AbstractMETSIngester.java:814)
    at org.dspace.content.packager.AbstractMETSIngester.ingestObject(AbstractMETSIngester.java:494)
    at org.dspace.content.packager.AbstractMETSIngester.replace(AbstractMETSIngester.java:1180)
    at org.dspace.content.packager.AbstractPackageIngester.replaceAll(AbstractPackageIngester.java:286)
    at org.dspace.content.packager.AbstractPackageIngester.replaceAll(AbstractPackageIngester.java:319)
    at org.dspace.content.packager.AbstractPackageIngester.replaceAll(AbstractPackageIngester.java:319)
    at org.dspace.content.packager.AbstractPackageIngester.replaceAll(AbstractPackageIngester.java:319)
    at org.dspace.app.packager.Packager.replace(Packager.java:732)
    at org.dspace.app.packager.Packager.main(Packager.java:373)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
org.dspace.content.crosswalk.MetadataValidationException: Bitstream size (247) does not match size in PREMIS (246), rejecting it.

The error doesn't tell me which of the more than 8000 AIP zip files has the 
problem.
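
One way to narrow this down outside DSpace is to scan the packages directly. 
The sketch below assumes each AIP zip holds a mets.xml whose file elements 
carry a SIZE attribute and an FLocat whose xlink:href names a member of the 
zip.  It compares that declared size with the member's real size, as a 
stand-in for the PREMIS size named in the error.  The manifest in the demo 
is a made-up minimal example, not a real AIP.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK = "{http://www.w3.org/1999/xlink}"


def size_mismatches(package):
    """Return (member, declared, actual) for files whose declared size is wrong."""
    problems = []
    with zipfile.ZipFile(package) as zf:
        actual = {info.filename: info.file_size for info in zf.infolist()}
        root = ET.fromstring(zf.read("mets.xml"))
        for file_el in root.iter(METS + "file"):
            declared = file_el.get("SIZE")
            locator = file_el.find(METS + "FLocat")
            if declared is None or locator is None:
                continue  # nothing to compare for this entry
            name = locator.get(XLINK + "href")
            if name in actual and actual[name] != int(declared):
                problems.append((name, int(declared), actual[name]))
    return problems


# Demo: an in-memory package whose manifest is one byte short.
manifest = (
    '<mets xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<fileSec><fileGrp><file SIZE="246"><FLocat xlink:href="license.txt"/></file>'
    "</fileGrp></fileSec></mets>"
)
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("mets.xml", manifest)
    zf.writestr("license.txt", "x" * 247)
print(size_mismatches(demo))
```

Running size_mismatches over every zip in the export directory and printing 
the path of any package with a non-empty result would point to the 
offending files.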



Re: [dspace-tech] Re: [Dspace-tech] problems with crosswalks, export site: user passwords

2015-10-26 Thread Chris Gray
Yes we have a source directory with pom.xml at /home/dspace/src/pom.xml.

On Monday, October 26, 2015 at 8:08:33 AM UTC-4, Chris Gray wrote:
>
> On Monday, October 26, 2015 at 4:47:08 AM UTC-4, helix84 wrote:
>>
>> Please, post the full stack trace, like Luis did.
>>
>> On Sun, Oct 25, 2015 at 11:43 PM, Chris Gray <cpgr...@gmail.com> wrote:
>>
>>> I'm also having trouble locating RoleDisseminator.java on our system.  
>>> As root from the root directory, "find | grep RoleDisseminator\.java" 
>>> draws a blank.
>>>
>>
>> It's possible you either don't have the source on your system or your 
>> installation was built from pre-built modules (what we call a binary DSpace 
>> download). Check whether you have the DSpace source directory at all:
>> find / -name pom.xml
>>
>>
>> Regards,
>> ~~helix84
>>
>> Compulsory reading: DSpace Mailing List Etiquette
>> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>>
>>



Re: [dspace-tech] Re: [Dspace-tech] problems with crosswalks, export site: user passwords

2015-10-26 Thread Chris Gray
Exception: Failed to export Roles via packager (see wrapped error message for more details)
org.dspace.content.crosswalk.CrosswalkInternalException: Failed to export Roles via packager (see wrapped error message for more details)
    at org.dspace.content.crosswalk.RoleCrosswalk.disseminateElement(RoleCrosswalk.java:234)
    at org.dspace.content.packager.AbstractMETSDisseminator.crosswalkToMetsElement(AbstractMETSDisseminator.java:1359)
    at org.dspace.content.packager.AbstractMETSDisseminator.makeMdSec(AbstractMETSDisseminator.java:614)
    at org.dspace.content.packager.AbstractMETSDisseminator.addToAmdSec(AbstractMETSDisseminator.java:727)
    at org.dspace.content.packager.AbstractMETSDisseminator.addAmdSec(AbstractMETSDisseminator.java:753)
    at org.dspace.content.packager.AbstractMETSDisseminator.makeManifest(AbstractMETSDisseminator.java:839)
    at org.dspace.content.packager.AbstractMETSDisseminator.writeZipPackage(AbstractMETSDisseminator.java:311)
    at org.dspace.content.packager.AbstractMETSDisseminator.disseminate(AbstractMETSDisseminator.java:258)
    at org.dspace.content.packager.DSpaceAIPDisseminator.disseminate(DSpaceAIPDisseminator.java:160)
    at org.dspace.content.packager.AbstractPackageDisseminator.disseminateAll(AbstractPackageDisseminator.java:86)
    at org.dspace.app.packager.Packager.disseminate(Packager.java:636)
    at org.dspace.app.packager.Packager.main(Packager.java:460)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
Caused by: org.dspace.content.packager.PackageException: java.lang.NullPointerException, Reason: java.lang.NullPointerException
    at org.dspace.content.packager.RoleDisseminator.writeToStream(RoleDisseminator.java:257)
    at org.dspace.content.packager.RoleDisseminator.disseminate(RoleDisseminator.java:108)
    at org.dspace.content.crosswalk.RoleCrosswalk.disseminateElement(RoleCrosswalk.java:202)
    ... 16 more
Caused by: java.lang.NullPointerException
    at com.ctc.wstx.sw.BaseStreamWriter.writeCharacters(BaseStreamWriter.java:498)
    at org.dspace.content.packager.RoleDisseminator.writeEPerson(RoleDisseminator.java:448)
    at org.dspace.content.packager.RoleDisseminator.writeToStream(RoleDisseminator.java:244)
    ... 18 more


On Monday, October 26, 2015 at 4:47:08 AM UTC-4, helix84 wrote:
>
> Please, post the full stack trace, like Luis did.
>
> On Sun, Oct 25, 2015 at 11:43 PM, Chris Gray <cpgr...@gmail.com 
> > wrote:
>
>> I'm also having trouble locating RoleDisseminator.java on our system.  As 
>> root from the root directory, "find | grep RoleDisseminator\.java" draws a 
>> blank.
>>
>
> It's possible you either don't have the source on your system or your 
> installation was built from pre-built modules (what we call a binary DSpace 
> download). Check whether you have the DSpace source directory at all:
> find / -name pom.xml
>
>
> Regards,
> ~~helix84
>
> Compulsory reading: DSpace Mailing List Etiquette
> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>
>



[dspace-tech] Re: [Dspace-tech] problems with crosswalks, export site: user passwords

2015-10-25 Thread Chris Gray
I'm currently having a similar problem with 3.1.  The command:

./dspace packager -d -a -t AIP -e [email] -i [siteprefix]/0 /[path]/[aip].zip

fails.  Note that I am NOT using the -o passwords=true option.

What may be the complicating factor is that we use a CAS-based 
authentication method written for us by @mire.  They are no longer under 
contract to us.

We get the same error message and stack trace that Luis reported.

On Wednesday, August 26, 2015 at 10:34:54 AM UTC-4, Luis Alberto Maguiña 
Silva wrote:
>
> Dear Friends:
> We use DSpace 3.1 and have a problem.
>
> We are exporting the entire site:
> /dspace/bin/dspace packager -d -a -t AIP -e [EMAIL] -o passwords=true -i 123456789/0 /path/aip.zip
>
> But the system reports this error:
>
> Disseminating DSpace SITE [ hdl=123456789/0 ] to /data/aip/aip.zip
>
> Also disseminating all child objects (recursive mode)..
> This may take a while, please check your logs for ongoing status while we process each package.
> Exception: Failed to export Roles via packager (see wrapped error message for more details)
> org.dspace.content.crosswalk.CrosswalkInternalException: Failed to export Roles via packager (see wrapped error message for more details)
> at org.dspace.content.crosswalk.RoleCrosswalk.disseminateElement(RoleCrosswalk.java:234)
> at org.dspace.content.packager.AbstractMETSDisseminator.crosswalkToMetsElement(AbstractMETSDisseminator.java:1359)
> at org.dspace.content.packager.AbstractMETSDisseminator.makeMdSec(AbstractMETSDisseminator.java:614)
> at org.dspace.content.packager.AbstractMETSDisseminator.addToAmdSec(AbstractMETSDisseminator.java:727)
> at org.dspace.content.packager.AbstractMETSDisseminator.addAmdSec(AbstractMETSDisseminator.java:753)
> at org.dspace.content.packager.AbstractMETSDisseminator.makeManifest(AbstractMETSDisseminator.java:839)
> at org.dspace.content.packager.AbstractMETSDisseminator.writeZipPackage(AbstractMETSDisseminator.java:311)
> at org.dspace.content.packager.AbstractMETSDisseminator.disseminate(AbstractMETSDisseminator.java:258)
> at org.dspace.content.packager.DSpaceAIPDisseminator.disseminate(DSpaceAIPDisseminator.java:160)
> at org.dspace.content.packager.AbstractPackageDisseminator.disseminateAll(AbstractPackageDisseminator.java:86)
> at org.dspace.app.packager.Packager.disseminate(Packager.java:635)
> at org.dspace.app.packager.Packager.main(Packager.java:460)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
> Caused by: org.dspace.content.packager.PackageException: java.lang.NullPointerException, Reason: java.lang.NullPointerException
> at org.dspace.content.packager.RoleDisseminator.writeToStream(RoleDisseminator.java:254)
> at org.dspace.content.packager.RoleDisseminator.disseminate(RoleDisseminator.java:105)
> at org.dspace.content.crosswalk.RoleCrosswalk.disseminateElement(RoleCrosswalk.java:202)
> ... 16 more
> Caused by: java.lang.NullPointerException
> at com.ctc.wstx.sw.BaseStreamWriter.writeCharacters(BaseStreamWriter.java:498)
> at org.dspace.content.packager.RoleDisseminator.writeEPerson(RoleDisseminator.java:479)
> at org.dspace.content.packager.RoleDisseminator.writeToStream(RoleDisseminator.java:241)
> ... 18 more
>
> But when I run:
> /dspace/bin/dspace packager -d -a -t AIP -e [EMAIL] -i 123456789/0 /path/aip.zip
>
> the system creates the zip package, but the users don't have their passwords.
>
> In dspace.cfg, the DSPACE-ROLES crosswalk is enabled:
> org.dspace.content.crosswalk.RoleCrosswalk = DSPACE-ROLES
> org.dspace.content.packager.RoleDisseminator = DSPACE-ROLES
> org.dspace.content.packager.RoleIngester = DSPACE-ROLES
> aip.disseminate.techMD = PREMIS, DSPACE-ROLES
>
> Thanks for your assistance.
>
> Luis Maguina
> Lima - Perú
>
> -- 
> Luis Maguiña
> Linux user number 386737
> You are in Linux country. On quiet nights you can hear, far in the
> distance, the sound of Windows machines rebooting over and over
>



[dspace-tech] Apache and Tomcat questions

2015-10-19 Thread Chris Gray
We've reconfigured Tomcat and Apache to put Apache in front, configured 
Apache to force everything to HTTPS, and registered xmlui as the ROOT 
webapp for Tomcat.

This is working but there are problems.

The site is still accessible to the world via port 8080.  We added AJP on 
port 8009 for Apache to talk to Tomcat.  I hesitated to turn off port 8080 
because of Solr.  Is there a safe way to close it?
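
Whatever the answer for Solr turns out to be, it helps to know which ports actually answer from outside the server. What follows is only a rough sketch, not anything shipped with DSpace or Tomcat: is_port_open is a made-up helper name, and the idea is to call it from a machine other than the server, with example.com standing in for the real host.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. A False result means the
    connection was refused, filtered, timed out, or the host did not resolve.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, is_port_open("example.com", 8080) run from an outside machine shows whether Tomcat's HTTP connector is still exposed, and the same call with 8009 checks the AJP port. If Solr is only ever reached from the server itself (an assumption to verify against your own configuration), an open result for 8080 from outside is the thing to fix.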

I noticed that the robots.txt still says the sitemap is at 
http://example.com:8080/xmlui/sitemap when it is now at 
https://example.com/sitemap:  new protocol, new port, new path.  Are there 
changes that can be made in the [dspace-source] to reflect the Tomcat and 
Apache changes?  Is anything other than the robots.txt affected?
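
For the robots.txt itself, the correction amounts to a plain text substitution on the Sitemap lines. A minimal sketch, assuming the file uses ordinary "Sitemap:" lines and that the old and new base URLs are the ones given above; rewrite_sitemap_urls is a hypothetical helper, not part of DSpace, and it says nothing about where else the old URL may be configured.

```python
def rewrite_sitemap_urls(robots_txt: str, old_base: str, new_base: str) -> str:
    """Replace old_base with new_base in every 'Sitemap:' line of robots_txt.

    Hypothetical helper for illustration. Lines that are not Sitemap
    directives, or whose URL does not start with old_base, are left as-is.
    """
    out = []
    for line in robots_txt.splitlines():
        # Split only at the first colon: 'Sitemap' | ':' | ' http://...'
        key, sep, value = line.partition(":")
        url = value.strip()
        if sep and key.strip().lower() == "sitemap" and url.startswith(old_base):
            line = f"Sitemap: {new_base}{url[len(old_base):]}"
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if robots_txt.endswith("\n") else "")
```

Called with old_base "http://example.com:8080/xmlui" and new_base "https://example.com", this turns the sitemap URL quoted above into https://example.com/sitemap and leaves every other directive untouched.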
