[ https://issues.apache.org/jira/browse/TIKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536396#comment-15536396 ]
Tim Allison commented on TIKA-2105:
-----------------------------------
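I got this to work from a .bat script. The trick was that I had to save the
.bat script itself in code page 1252 (Windows-1252) encoding, and switch the
console to the same code page with chcp before calling java. I used Notepad++
to save the file in that encoding.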
chcp 1252>nul
java -jar tika-app-1.10.jar "c:\\data\\français.doc"
In general, though, this is a Windows/.bat issue, not a Java or Tika issue.
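
To illustrate why the encoding of the .bat file and the console code page have to agree, here is a minimal Python sketch. It rests on an assumption: that cmd.exe interprets the bytes of the script using the active console code page, and that the default on the affected machine is an OEM code page such as 850 (the actual default may differ). The variable names are made up for the illustration and are not part of Tika.

```python
# The file name as it would be written in a .bat file saved as Windows-1252.
name = "français.doc"
saved_bytes = name.encode("cp1252")

# Assumption: without `chcp 1252`, the console decodes the script bytes with
# an OEM code page (850 is used here purely as an example).
as_cp850 = saved_bytes.decode("cp850")

# With `chcp 1252`, the same bytes are decoded with the code page they were
# saved in, so the original name is recovered.
as_cp1252 = saved_bytes.decode("cp1252")

print("decoded as cp850: ", repr(as_cp850))
print("decoded as cp1252:", repr(as_cp1252))
print("round-trips under cp1252:", as_cp1252 == name)
print("round-trips under cp850: ", as_cp850 == name)
```

Under that assumption, only the matching code page hands java the file name that actually exists on disk; with a mismatch, the argument names a file that is not there.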
> Unable to process documents with french accents in filenames
> ------------------------------------------------------------
>
> Key: TIKA-2105
> URL: https://issues.apache.org/jira/browse/TIKA-2105
> Project: Tika
> Issue Type: Bug
> Components: batch
> Affects Versions: 1.13
> Environment: Windows 7, Java version 1.7.0.111
> Reporter: susserj
>
> When I execute the following batch test1.bat script from my command prompt,
> I get this error message:
> test1.bat
> @echo off
> "C:\Program Files (x86)\Java\jre7\bin\java" -jar c:\temp\tika-app-1.13.jar -m
> "S:\2008-09\2009-10\IC IT Environment 2009\français.docx"
> Error:
> Exception in thread "main" java.net.MalformedURLException: unknown protocol: s
> at java.net.URL.<init>(Unknown Source)
> at java.net.URL.<init>(Unknown Source)
> at java.net.URL.<init>(Unknown Source)
> at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:472)
> at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:145)
> When the filenames don't have special French characters, it works fine. (I
> cannot change the names of all the files that need to be processed).
> I apologise, my experience with java and TIKA is very limited.
> Thanks