[ 
https://issues.apache.org/jira/browse/CONNECTORS-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358161#comment-16358161
 ] 

Vinay commented on CONNECTORS-1494:
-----------------------------------

Thanks Karl. I finally figured out the solution: I had to change the default 
locale configuration on Linux. I edited /etc/sysconfig/i18n and set 
LANG="en_US.UTF-8". ManifoldCF now picks up those files.
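
For anyone hitting the same symptom, here is a minimal sketch for checking 
which encoding the JVM derived from the environment. It is not ManifoldCF code. 
The class name LocaleCheck and the sample file name are hypothetical, and 
sun.jnu.encoding is an implementation-specific property that may be absent on 
some JVMs. My understanding, stated as an assumption, is that a Java 8 JVM on 
Linux takes its file name encoding from the locale, so a non-UTF-8 locale can 
leave it unable to represent non-ASCII file names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LocaleCheck {
    // Returns true if every character of the name can be represented in the charset.
    static boolean canEncode(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // Charset the JVM picked as its default (on Linux this follows LANG/LC_ALL).
        Charset defaultCharset = Charset.defaultCharset();
        System.out.println("default charset:  " + defaultCharset);

        // Implementation-specific property; may print "null" on some JVMs.
        System.out.println("sun.jnu.encoding: " + System.getProperty("sun.jnu.encoding"));

        // Hypothetical sample name containing non-ASCII characters.
        String sample = "r\u00e9sum\u00e9.pdf";
        System.out.println("encodable in default charset: " + canEncode(sample, defaultCharset));
        System.out.println("encodable in US-ASCII:        " + canEncode(sample, StandardCharsets.US_ASCII));
        System.out.println("encodable in UTF-8:           " + canEncode(sample, StandardCharsets.UTF_8));
    }
}
```

Running it as the same user and with the same environment as the ManifoldCF 
process, before and after changing LANG, should show whether the default 
charset actually changed.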

> Error crawling file system with file names having special characters.
> ---------------------------------------------------------------------
>
>                 Key: CONNECTORS-1494
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1494
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: File system connector
>    Affects Versions: ManifoldCF 2.9.1
>            Reporter: Vinay
>            Assignee: Karl Wright
>            Priority: Critical
>             Fix For: ManifoldCF 2.10
>
>
> I am crawling a file system mounted on a Linux machine, so the Repository 
> Connection is of type "File System". ManifoldCF does not pick up files whose 
> names contain certain special characters.
> File ex: a_XY-SMnA_ABC_Uuޓࠚϯmӣܼ˵Ҫȳ_֚3ҿؖúشԃԫхրҠë.pdf
> exception: java.lang.NumberFormatException: For input string: ""
>      at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_151]
>      at java.lang.Long.parseLong(Long.java:601) ~[?:1.8.0_151]
>      at java.lang.Long.<init>(Long.java:965) ~[?:1.8.0_151]
>      at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter$SpecPacker.<init>(DocumentFilter.java:513) ~[?:?]
>      at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.getPipelineDescription(DocumentFilter.java:76) ~[?:?]
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getTransformationDescription(IncrementalIngester.java:503) ~[mcf-agents.jar:?]
>      at org.apache.manifoldcf.crawler.system.PipelineSpecification.<init>(PipelineSpecification.java:47) ~[mcf-pull-agent.jar:?]
>      at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:308) [mcf-pull-agent.jar:?]
>  FATAL 2018-02-07T23:47:15,927 (Worker thread '2') - Error tossed: For input string: ""



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
