Hi Petya & James,

A quick word of thanks for the documentation on this solution ;)

We hit this issue again and have opened JIRA issues and pull requests so
that it can go away permanently:
https://jira.duraspace.org/browse/DS-3549

rgds,

Bram

Bram Luyten
250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Esperantolaan 4, Heverlee 3001, Belgium
atmire.com
<http://atmire.com/website/?q=services&utm_source=emailfooter&utm_medium=email&utm_campaign=braml>

On 13 January 2014 at 21:53, Petya Kohts <[email protected]> wrote:

> Hello,
>
> following up on the discussion
>
> https://sourceforge.net/mailarchive/message.php?msg_id=31215737
> http://web.archiveorange.com/archive/v/hxciqsTWLSVu2rG3JE47
>
> started by James Leonard Halliday with the subject
> "text encoding problem with bitstreams in DSpace 3.1 - resolved":
>
> > Hi everyone,
> >
> > I posted about this a while back, and finally found a workaround
> > so I wanted to share. My problem was regarding HTML bitstreams
> > in DSpace 3.1 (XMLUI).
> >
> > In previous versions of DSpace, the encoding for my UTF-8 bitstreams
> > worked just fine, but in DSpace 3.1, the encoding for ONLY the bitstreams
> > was coming out as ISO-8859 instead. After much searching, I finally found
> > a workaround.
>
> First of all, thanks for sharing, Leonard!
>
>
> Now DSpace 4.0 rc3 has the same problem, and the same fix helps:
>
> In /dspace/webapps/xmlui/WEB-INF/web.xml, replace:
>
>   <filter>
>     <filter-name>SetCharacterEncoding</filter-name>
>     <filter-class>org.dspace.app.xmlui.cocoon.SetCharacterEncodingFilter</filter-class>
>     <init-param>
>       <param-name>encoding</param-name>
>       <param-value>UTF-8</param-value>
>     </init-param>
>   </filter>
>
> with
>
>   <filter>
>     <filter-name>SetCharacterEncoding</filter-name>
>     <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
>     <init-param>
>       <param-name>encoding</param-name>
>       <param-value>UTF-8</param-value>
>     </init-param>
>     <init-param>
>       <param-name>forceEncoding</param-name>
>       <param-value>true</param-value>
>     </init-param>
>   </filter>
>
>
> Answering Mark's questions:
>
> > I'm thinking that our filter as written could never have done what you
> > expect, and the effect was produced elsewhere.  Our filter only sets
> > the request's encoding.  Spring's filter is documented to also set the
> > response's encoding when forceEncoding=true.  Perhaps BitstreamReader
> > should just set the encoding on the response?
>
> It seems that Spring's filter not only forces the encoding for text/html,
> but also converts the file. Please compare the results obtained with the
> default DSpace web.xml (web.xml.old.data and web.xml.old.head) and with the
> web.xml modified as described above (web.xml.new.data and web.xml.new.head):
>
> root@dspace4-test:~# ls -l
> total 52
> -rw-r--r-- 1 root root 15140 Jan 13 15:06 web.xml.new.data
> -rw-r--r-- 1 root root   384 Jan 13 15:06 web.xml.new.head
> -rw-r--r-- 1 root root 24656 Jan 13 15:05 web.xml.old.data
> -rw-r--r-- 1 root root   389 Jan 13 15:06 web.xml.old.head
>
> The web.xml.*.data files were obtained by running "lynx --dump"
> and the web.xml.*.head files by running "lynx --head --dump".
>
>
> As you can see, the headers differ as one would expect:
>
> root@dspace4-test:~# diff -u web.xml.old.head web.xml.new.head
> --- web.xml.old.head    2014-01-13 15:06:00.036506000 -0500
> +++ web.xml.new.head    2014-01-13 15:06:52.800506000 -0500
> @@ -1,14 +1,14 @@
>  HTTP/1.1 200 OK
>  Server: Apache-Coyote/1.1
> -Set-Cookie: JSESSIONID=818F618169946A0770D8DE6A572348E5; Path=/xmlui/; HttpOnly
> +Set-Cookie: JSESSIONID=9A3ADC942740B4A31CF0AC971CD4BCBB; Path=/xmlui/; HttpOnly
>  X-Cocoon-Version: 2.2.0
>  Vary: User-Agent
>  Last-Modified: Mon, 13 Jan 2014 19:50:43 GMT
> -Expires: Mon, 13 Jan 2014 21:06:00 GMT
> -Content-Type: text/html;charset=ISO-8859-1
> +Expires: Mon, 13 Jan 2014 21:06:52 GMT
> +Content-Type: text/html;charset=UTF-8
>  Content-Language: en
>  Content-Length: 18139
> -Date: Mon, 13 Jan 2014 20:06:00 GMT
> +Date: Mon, 13 Jan 2014 20:06:52 GMT
>  Connection: close
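
To repeat this header check in a script rather than by eye, the charset can be
pulled out of the Content-Type value. Below is a minimal Java sketch, assuming
the header value has already been fetched (for example with
java.net.URLConnection.getContentType()). The names ContentTypeCharset and
charsetOf are hypothetical, and the parsing is deliberately simplistic.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of DSpace or Spring.
class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type header value, if present.
    static Optional<String> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // parts[0] is the media type; the remaining parts are parameters.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Header values taken from the diff above.
        System.out.println(charsetOf("text/html;charset=ISO-8859-1").orElse("(none)"));
        System.out.println(charsetOf("text/html;charset=UTF-8").orElse("(none)"));
    }
}
```

With the two header values from the diff, this prints ISO-8859-1 for the
default web.xml and UTF-8 for the modified one.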
>
>
> But the data files also differ (compare the sizes above and the hex dumps below):
>
> root@dspace4-test:~# cat web.xml.new.data | head -n 3 | hd
> 00000000  d0 92 d0 b2 d0 b5 d0 b4  d0 b5 d0 bd d0 b8 d0 b5  |................|
> 00000010  0a 0a d0 9f d1 80 d0 be  d0 b2 d0 b5 d1 80 d0 b5  |................|
> 00000020  d0 bd d0 be 3a 20 32 30  20 d0 bc d0 b0 d1 80 d1  |....: 20 .......|
> 00000030  82 d0 b0 20 31 39 34 38  20 d0 b3 d0 be d0 b4 d0  |... 1948 .......|
> 00000040  b0 0a                                             |..|
> 00000042
>
> root@dspace4-test:~# cat web.xml.old.data | head -n 3 | hd
> 00000000  c3 90 c3 90 c2 b2 c3 90  c2 b5 c3 90 c2 b4 c3 90  |................|
> 00000010  c2 b5 c3 90 c2 bd c3 90  c2 b8 c3 90 c2 b5 0a 0a  |................|
> 00000020  c3 90 c3 91 c3 90 c2 be  c3 90 c2 b2 c3 90 c2 b5  |................|
> 00000030  c3 91 c3 90 c2 b5 c3 90  c2 bd c3 90 c2 be 3a 20  |..............: |
> 00000040  32 30 20 c3 90 c2 bc c3  90 c2 b0 c3 91 c3 91 c3  |20 .............|
> 00000050  90 c2 b0 20 31 39 34 38  20 c3 90 c2 b3 c3 90 c2  |... 1948 .......|
> 00000060  be c3 90 c2 b4 c3 90 c2  b0 0a                    |..........|
> 0000006a
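
A note on reading the two dumps: the old bytes look like what you get when
UTF-8 text is decoded as ISO-8859-1 and then written out again as UTF-8. The
Content-Length of 18139 is identical in both header captures, so it is possible
that the server sends the same bytes either way and that the conversion happens
on the client side, according to the declared charset. The following is only a
minimal Java sketch of that misreading, not DSpace or Spring code; the class
and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: reproduces the double-encoding pattern seen in the dumps.
class DoubleEncodingDemo {

    // Formats bytes as space-separated lowercase hex, similar to `hd` output.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decodes UTF-8 bytes with the wrong charset (ISO-8859-1), then
    // re-encodes the resulting characters as UTF-8.
    static byte[] misreadAsLatin1(byte[] utf8Bytes) {
        String wrong = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        return wrong.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Cyrillic letters as they appear in web.xml.new.data: d0 b2 d0 b5
        byte[] original = {(byte) 0xd0, (byte) 0xb2, (byte) 0xd0, (byte) 0xb5};
        System.out.println(hex(original));
        System.out.println(hex(misreadAsLatin1(original)));
    }
}
```

Running it prints d0 b2 d0 b5 and then c3 90 c2 b2 c3 90 c2 b5, the same
sequence that appears in the first line of the web.xml.old.data dump. Each
two-byte character becomes four bytes, which would also fit the larger size of
the old data file.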
>
>
> Petya.
>

