[ 
https://issues.apache.org/jira/browse/TIKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536346#comment-15536346
 ] 

susserj edited comment on TIKA-2105 at 9/30/16 4:10 PM:
--------------------------------------------------------

Hi Tim

When I added -I <input_dir> -o <output_dir> to my command line, I got a 
bunch of zero-byte files in my output directory, but the specific file in the 
input directory called "français.docx" was missing.

It doesn't like the French characters in the filenames.




> Unable to process documents with French accents in filenames
> ------------------------------------------------------------
>
>                 Key: TIKA-2105
>                 URL: https://issues.apache.org/jira/browse/TIKA-2105
>             Project: Tika
>          Issue Type: Bug
>          Components: batch
>    Affects Versions: 1.13
>         Environment: Windows 7, Java version 1.7.0.111
>            Reporter: susserj
>
> When I execute the following batch test1.bat script from my command prompt,  
> I get this error message:
> test1.bat
> @echo off
> "C:\Program Files (x86)\Java\jre7\bin\java" -jar c:\temp\tika-app-1.13.jar -m 
> "S:\2008-09\2009-10\IC IT Environment 2009\français.docx"
> Error:
> Exception in thread "main" java.net.MalformedURLException: unknown protocol: s
>         at java.net.URL.<init>(Unknown Source)
>         at java.net.URL.<init>(Unknown Source)
>         at java.net.URL.<init>(Unknown Source)
>         at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:472)
>         at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:145)
> When the filenames don't have special French characters, it works fine. (I 
> cannot change the names of all the files that need to be processed.)
> I apologise, my experience with Java and Tika is very limited.
> Thanks
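
A possible reading of the "unknown protocol: s" message in the trace above, offered as a sketch and not as a confirmed diagnosis: the trace shows TikaCLI.process calling the java.net.URL constructor, which suggests the argument was not recognised as an existing file and was then parsed as a URL, so the drive letter "S:" was taken for a URL scheme. The class name, the resolve helper and the path below are hypothetical and are not taken from the Tika source; only java.io.File and java.net.URL are real APIs.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class PathOrUrlSketch {

    // Hypothetical helper (not Tika code): use the argument as a file if it
    // exists on disk, otherwise try to parse it as a URL.
    static String resolve(String arg) {
        File file = new File(arg);
        if (file.isFile()) {
            return "file: " + file.toURI();
        }
        try {
            // For "S:\...", the text before the first ':' is read as the scheme.
            return "url: " + new URL(arg);
        } catch (MalformedURLException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumed, non-existent path standing in for a name that Java cannot find.
        System.out.println(resolve("S:\\no-such-dir\\missing.docx"));
        // prints "error: unknown protocol: s"
    }
}
```

If this reading is right, the accented name is not reaching Java intact (for example because of the code page the batch file is saved or run in), so the file lookup fails before any parsing starts. That part is an assumption and has not been verified here.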



