Hi Petya & James, quick word of thanks for the documentation on this solution ;)
We hit this issue again and created JIRA issues and pull requests so it can go away permanently:
https://jira.duraspace.org/browse/DS-3549

rgds,
Bram

Bram Luyten
250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Esperantolaan 4, Heverlee 3001, Belgium
atmire.com

On 13 January 2014 at 21:53, Petya Kohts <[email protected]> wrote:

> Hello,
>
> following up on the discussion
>
> https://sourceforge.net/mailarchive/message.php?msg_id=31215737
> http://web.archiveorange.com/archive/v/hxciqsTWLSVu2rG3JE47
>
> started by James Leonard Halliday with the subject
> "text encoding problem with bitstreams in DSpace 3.1 - resolved":
>
> > Hi everyone,
> >
> > I posted about this a while back, and finally found a workaround
> > so I wanted to share. My problem was regarding HTML bitstreams
> > in DSpace 3.1 (XMLUI).
> >
> > In previous versions of DSpace, the encoding for my UTF-8 bitstreams
> > worked just fine, but in DSpace 3.1, the encoding for ONLY the bitstreams
> > was coming out as ISO-8859 instead. After much searching, I finally found
> > a workaround.
>
> First of all thanks for sharing, Leonard!
>
> Now DSpace 4.0 rc3 has the same problem and the same fix helps:
>
> in /dspace/webapps/xmlui/WEB-INF/web.xml replace:
>
> <filter>
>   <filter-name>SetCharacterEncoding</filter-name>
>   <filter-class>org.dspace.app.xmlui.cocoon.SetCharacterEncodingFilter</filter-class>
>   <init-param>
>     <param-name>encoding</param-name>
>     <param-value>UTF-8</param-value>
>   </init-param>
> </filter>
>
> with
>
> <filter>
>   <filter-name>SetCharacterEncoding</filter-name>
>   <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
>   <init-param>
>     <param-name>encoding</param-name>
>     <param-value>UTF-8</param-value>
>   </init-param>
>   <init-param>
>     <param-name>forceEncoding</param-name>
>     <param-value>true</param-value>
>   </init-param>
> </filter>
>
> Answering Mark's questions:
>
> > I'm thinking that our filter as written could never have done what you
> > expect, and the effect was produced elsewhere. Our filter only sets
> > the request's encoding. Spring's filter is documented to also set the
> > response's encoding when forceEncoding=true. Perhaps BitstreamReader
> > should just set the encoding on the response?
>
> It seems that Spring's filter not only forces encoding for text/html,
> but also converts the file.
> Please check out the results with the default DSpace web.xml
> (web.xml.old.data and web.xml.old.head) and with the web.xml
> modified as described above (web.xml.new.data and web.xml.new.head):
>
> root@dspace4-test:~# ls -l
> total 52
> -rw-r--r-- 1 root root 15140 Jan 13 15:06 web.xml.new.data
> -rw-r--r-- 1 root root   384 Jan 13 15:06 web.xml.new.head
> -rw-r--r-- 1 root root 24656 Jan 13 15:05 web.xml.old.data
> -rw-r--r-- 1 root root   389 Jan 13 15:06 web.xml.old.head
>
> The web.xml.*.data files were obtained by running "lynx --dump"
> and the web.xml.*.head files by running "lynx --head --dump".
>
> As you can see, the headers differ as one would expect:
>
> root@dspace4-test:~# diff -u web.xml.old.head web.xml.new.head
> --- web.xml.old.head 2014-01-13 15:06:00.036506000 -0500
> +++ web.xml.new.head 2014-01-13 15:06:52.800506000 -0500
> @@ -1,14 +1,14 @@
>  HTTP/1.1 200 OK
>  Server: Apache-Coyote/1.1
> -Set-Cookie: JSESSIONID=818F618169946A0770D8DE6A572348E5; Path=/xmlui/; HttpOnly
> +Set-Cookie: JSESSIONID=9A3ADC942740B4A31CF0AC971CD4BCBB; Path=/xmlui/; HttpOnly
>  X-Cocoon-Version: 2.2.0
>  Vary: User-Agent
>  Last-Modified: Mon, 13 Jan 2014 19:50:43 GMT
> -Expires: Mon, 13 Jan 2014 21:06:00 GMT
> -Content-Type: text/html;charset=ISO-8859-1
> +Expires: Mon, 13 Jan 2014 21:06:52 GMT
> +Content-Type: text/html;charset=UTF-8
>  Content-Language: en
>  Content-Length: 18139
> -Date: Mon, 13 Jan 2014 20:06:00 GMT
> +Date: Mon, 13 Jan 2014 20:06:52 GMT
>  Connection: close
>
> But the data files also differ (check out the sizes), as these dumps show:
>
> root@dspace4-test:~# cat web.xml.new.data | head -n 3 | hd
> 00000000  d0 92 d0 b2 d0 b5 d0 b4  d0 b5 d0 bd d0 b8 d0 b5  |................|
> 00000010  0a 0a d0 9f d1 80 d0 be  d0 b2 d0 b5 d1 80 d0 b5  |................|
> 00000020  d0 bd d0 be 3a 20 32 30  20 d0 bc d0 b0 d1 80 d1  |....: 20 .......|
> 00000030  82 d0 b0 20 31 39 34 38  20 d0 b3 d0 be d0 b4 d0  |... 1948 .......|
> 00000040  b0 0a                                             |..|
> 00000042
>
> root@dspace4-test:~# cat web.xml.old.data | head -n 3 | hd
> 00000000  c3 90 c3 90 c2 b2 c3 90  c2 b5 c3 90 c2 b4 c3 90  |................|
> 00000010  c2 b5 c3 90 c2 bd c3 90  c2 b8 c3 90 c2 b5 0a 0a  |................|
> 00000020  c3 90 c3 91 c3 90 c2 be  c3 90 c2 b2 c3 90 c2 b5  |................|
> 00000030  c3 91 c3 90 c2 b5 c3 90  c2 bd c3 90 c2 be 3a 20  |..............: |
> 00000040  32 30 20 c3 90 c2 bc c3  90 c2 b0 c3 91 c3 91 c3  |20 .............|
> 00000050  90 c2 b0 20 31 39 34 38  20 c3 90 c2 b3 c3 90 c2  |... 1948 .......|
> 00000060  be c3 90 c2 b4 c3 90 c2  b0 0a                    |..........|
> 0000006a
>
> Petya.
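
For anyone comparing the two hex dumps: the old bytes look like what you get when UTF-8 text is misread as ISO-8859-1 and then encoded as UTF-8 a second time. Below is a minimal Python sketch of that round trip, not anything taken from DSpace or Spring. The helper names to_hex and double_encode are made up for illustration, and the word "Введение" is simply what the first 16 bytes of web.xml.new.data decode to. The drop_c1 option is a guess: the old dump lacks the bytes that the C1 control characters (U+0080 to U+009F) would produce, which would fit them being dropped somewhere along the way, possibly by lynx.

```python
def to_hex(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex pairs, like the hd output."""
    return " ".join(f"{b:02x}" for b in data)


def double_encode(text: str, drop_c1: bool = False) -> bytes:
    """Encode text as UTF-8, misread the bytes as ISO-8859-1, encode as UTF-8 again.

    drop_c1 is an assumption made for this sketch: it removes the C1 control
    characters (U+0080..U+009F) from the misread text before re-encoding.
    """
    misread = text.encode("utf-8").decode("iso-8859-1")
    if drop_c1:
        misread = "".join(ch for ch in misread if not 0x80 <= ord(ch) <= 0x9F)
    return misread.encode("utf-8")


word = "Введение"

# Correct UTF-8 bytes, comparable to the first row of web.xml.new.data.
print(to_hex(word.encode("utf-8")))

# Double-encoded bytes: every original byte becomes a two-byte sequence.
print(to_hex(double_encode(word)))

# Double-encoded bytes with the C1 control characters dropped,
# comparable to the start of web.xml.old.data.
print(to_hex(double_encode(word, drop_c1=True)))
```

The first printed line matches the first row of the new dump, and the third matches the bytes of the old dump up to the first 0a 0a. That is only consistent with double encoding; it does not show where in the stack it happens.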
