Re: [Dspace-tech] Odd Characters in Search Results...

2015-05-29 Thread bkelm
Andrea, thank you so much. We added this to the top of our cron job: LANG=en_US.UTF-8. We then removed the corrupt text bundle and re-ran the media filter: /dspace/bin/dspace filter-media -i 10177/4732, and the files look perfect. Bill K.
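
For readers wondering why the locale matters here: the thread does not spell out the mechanism, but a common cause of odd characters is text written in one encoding and read back in another. The sketch below illustrates that effect in Python. It is not DSpace code, the function names are hypothetical, and Latin-1 is only an assumed stand-in for whatever non-UTF-8 default encoding a job might pick up when LANG is not set.

```python
# Illustration only, not DSpace code. Latin-1 is an assumed stand-in for
# "some non-UTF-8 default encoding"; all function names are hypothetical.


def utf8_read_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the same bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def latin1_read_as_utf8(text: str) -> str:
    """Encode text as Latin-1, then decode the same bytes as UTF-8.

    Bytes that are not valid UTF-8 become U+FFFD (the replacement character).
    """
    return text.encode("latin-1").decode("utf-8", errors="replace")


def utf8_round_trip(text: str) -> str:
    """Encode and decode with the same encoding; the text survives intact."""
    return text.encode("utf-8").decode("utf-8")


if __name__ == "__main__":
    sample = "café"
    print(repr(utf8_read_as_latin1(sample)))  # 'cafÃ©'
    print(repr(latin1_read_as_utf8(sample)))  # 'caf\ufffd'
    print(repr(utf8_round_trip(sample)))      # 'café'
```

When the writer and the reader agree on UTF-8, the text comes back unchanged, which is consistent with the clean files reported after setting LANG=en_US.UTF-8 and regenerating the text bundle.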

Re: [Dspace-tech] Odd Characters in Search Results...

2015-05-28 Thread euler
Hi Andrea, I think I figured out how to apply this in a Windows environment. I just added the line LANG=en_US.UTF-8 at the end of the command dspace filter-media. I first searched our repository for items that returned odd characters in their search results. Then I force dspace

Re: [Dspace-tech] Odd Characters in Search Results...

2015-05-28 Thread Andrea Schweer
Hi, On 28/05/15 17:21, euler wrote: Thanks for the link. I forgot to mention that I am using Windows 2003 as my OS, so I'm not using crontab, instead I have a batch file that is executed by Scheduled Tasks. Apologies for my ignorance, but I don't know how to apply this to a Windows

Re: [Dspace-tech] Odd Characters in Search Results...

2015-05-27 Thread Andrea Schweer
Hi, On 28/05/15 15:24, euler wrote: Thanks for this. I just assumed that the original characters in my pdfs were defective somehow (some are defective actually, not OCRed but digital born documents). I would be glad to know how to make sure that the dspace media-filter will use the correct

Re: [Dspace-tech] Odd Characters in Search Results...

2015-05-27 Thread euler
Hi Andrea, Thanks for this. I just assumed that the original characters in my pdfs were defective somehow (some are defective actually, not OCRed but digital born documents). I would be glad to know how to make sure that the dspace media-filter will use the correct locale and UTF-8 encoding. I

Re: [Dspace-tech] Odd Characters in Search Results...

2015-05-27 Thread euler
Hi Bill, I'm having this issue also. I resolved it by adding TEXT in dspace.cfg, i.e. xmlui.bundle.upload = ORIGINAL, TEXT, METADATA, THUMBNAIL, LICENSE, CC-LICENSE, so that I can upload a TEXT bundle alongside the ORIGINAL, which is a pdf. I just made sure that the text file was saved in UTF-8

Re: [Dspace-tech] Odd Characters in Search Results...

2015-05-27 Thread Andrea Schweer
Hi, On 28/05/15 14:44, euler wrote: I'm having this issue also. I resolved it by adding TEXT in dspace.cfg, i.e. xmlui.bundle.upload = ORIGINAL, TEXT, METADATA, THUMBNAIL, LICENSE, CC-LICENSE, so that I can upload a TEXT bundle alongside the ORIGINAL, which is a pdf. I just made sure that the