Hi All,
I noticed that the SWORD v2 server on DSpace v4.2 doesn't accept deposits for 
files whose names contain a space.

To reproduce the issue, run the following curl command. It results in the 
exception shown below on the server side:
curl -X POST \
-d 'test string' \
--header "packaging: http://purl.org/net/sword/package/Binary" \
--header "user-agent: SWORD Client 2.0" \
--header "content-disposition: attachment; filename=\"Filename with spaces.txt\"" \
--header "metadata-relevant: false" \
--user myusername:secret \
-v \
"http://localhost:8080/swordv2/edit-media/12887"
***************
org.apache.abdera.i18n.iri.IRISyntaxException: org.apache.abdera.i18n.text.InvalidCharacterException: Invalid Character 0x20( )
        org.apache.abdera.i18n.iri.IRI.parse(IRI.java:572)
        org.apache.abdera.i18n.iri.IRI.<init>(IRI.java:64)
        org.dspace.sword2.ReceiptGenerator.createFileReceipt(ReceiptGenerator.java:47)
        org.dspace.sword2.MediaResourceManagerDSpace.addResource(MediaResourceManagerDSpace.java:742)
        org.swordapp.server.MediaResourceAPI.post(MediaResourceAPI.java:272)
        org.swordapp.server.servlets.MediaResourceServletDefault.doPost(MediaResourceServletDefault.java:49)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)

root cause:
org.apache.abdera.i18n.text.InvalidCharacterException: Invalid Character 0x20( )
        org.apache.abdera.i18n.text.CodepointIterator$RestrictedCodepointIterator.next(CodepointIterator.java:476)
        org.apache.abdera.i18n.text.CharUtils.verify(CharUtils.java:820)
        org.apache.abdera.i18n.text.CharUtils.verify(CharUtils.java:838)
        org.apache.abdera.i18n.iri.IRI.parse(IRI.java:568)
        org.apache.abdera.i18n.iri.IRI.<init>(IRI.java:64)
        org.dspace.sword2.ReceiptGenerator.createFileReceipt(ReceiptGenerator.java:47)
        org.dspace.sword2.MediaResourceManagerDSpace.addResource(MediaResourceManagerDSpace.java:742)
        org.swordapp.server.MediaResourceAPI.post(MediaResourceAPI.java:272)
        org.swordapp.server.servlets.MediaResourceServletDefault.doPost(MediaResourceServletDefault.java:49)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
*******************
If I remove all the spaces from the filename in the curl command by changing 
the header to the following, the server performs the action as expected:
--header "content-disposition: attachment; filename=Filenamewithspaces.txt"


As a workaround, I have changed the following code in the file 
SwordUrlManager.java in our implementation of DSpace:

return this.getSwordBaseUrl() + "/edit-media/bitstream/" + bitstream.getID()
        + "/" + bitstream.getName();

to:

// Percent-encode the bitstream name,
// e.g. "File name with spaces" becomes "File%20name%20with%20spaces"
String perEncBitstreamName = bitstream.getName().replace(" ", "%20");
return this.getSwordBaseUrl() + "/edit-media/bitstream/" + bitstream.getID()
        + "/" + perEncBitstreamName;

Running the first curl command that previously threw an error now works as 
expected, storing the bitstream with a name containing spaces.
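
Note that replacing only spaces leaves other characters that are not allowed 
unescaped in a URL path segment (for example "%", "#" or "?") untouched, so 
names containing them may still cause problems. A minimal sketch of a more 
general approach is below. It is not the change described above and has not 
been tested against DSpace; the class name BitstreamNameEncoder and the method 
encodePathSegment are hypothetical helpers introduced only for illustration. 
It relies on the standard java.net.URLEncoder, which produces form encoding 
("+" for a space), so the "+" is rewritten to "%20" afterwards:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of DSpace: percent-encodes a bitstream name
// so that it can be appended to a URL as a single path segment.
public class BitstreamNameEncoder {

    public static String encodePathSegment(String name) {
        try {
            // URLEncoder emits application/x-www-form-urlencoded output, where a
            // space becomes "+". A literal "+" in the input is already encoded
            // as "%2B", so every remaining "+" stands for a space and can be
            // safely rewritten to "%20".
            return URLEncoder.encode(name, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodePathSegment("Filename with spaces.txt"));
    }
}
```

Under that assumption, the workaround line would become 
String perEncBitstreamName = BitstreamNameEncoder.encodePathSegment(bitstream.getName()); 
and any client that resolves the resulting URL would need to percent-decode 
the last path segment to recover the original name.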

Would the DSpace developers be interested in accepting a pull request for this 
change? Please let me know if anyone has any questions regarding this issue.


Rahul Khanna
Systems Developer
IT Services |Menzies Building, Level 4 |The Australian National University | 
ACTON ACT 2601
T: +61 2 6125 9010 | E: rahul.kha...@anu.edu.au

