[ https://issues.apache.org/jira/browse/CONNECTORS-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358161#comment-16358161 ]
Vinay edited comment on CONNECTORS-1494 at 2/13/18 12:16 PM:
-------------------------------------------------------------
Thanks, Karl. I finally figured out the solution: I had to change the default
locale configuration on Linux. I edited /etc/sysconfig/i18n and set
LANG="en_US.ISO8859". ManifoldCF is now picking up those files.
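For anyone hitting the same problem, here is a minimal sketch (not part of the
original report) for checking which encodings the JVM picked up from the
locale, run as the same user and environment as ManifoldCF. The class name
LocaleCheck and the helper encodable are hypothetical. sun.jnu.encoding is an
internal HotSpot/OpenJDK property rather than a documented API, so treat its
value as a hint only.

```java
import java.io.File;
import java.nio.charset.Charset;

class LocaleCheck {
    // True if every character of the name can be encoded in the given charset.
    // Assumption: a name the JVM's file-name charset cannot encode is unlikely
    // to be passed back to the operating system faithfully.
    static boolean encodable(String name, Charset cs) {
        return cs.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // What the JVM inherited from the environment.
        System.out.println("LANG             = " + System.getenv("LANG"));
        System.out.println("default charset  = " + Charset.defaultCharset());
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));

        // List a directory (first argument, or the current one) and report
        // whether each entry can be resolved again through java.io.File.
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] entries = dir.listFiles();
        if (entries == null) {
            System.out.println("cannot list " + dir);
            return;
        }
        for (File f : entries) {
            System.out.println(f.getName()
                + "  exists=" + f.exists()
                + "  encodable=" + encodable(f.getName(), Charset.defaultCharset()));
        }
    }
}
```

Compile it with javac LocaleCheck.java and run java LocaleCheck /path/to/crawled/dir
before and after changing LANG. If entries show exists=false under the old
locale, the locale is the likely reason the files were being skipped.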
was (Author: [email protected]):
Thanks Karl. Finally figured out the solution. I had to change the default
locale configuration for linux. I edited /etc/sysconfig/i18n and changed
LANG="en_US.UTF-8". Now it is picking those files.
> Error crawling file system with file names having special characters.
> ---------------------------------------------------------------------
>
> Key: CONNECTORS-1494
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1494
> Project: ManifoldCF
> Issue Type: Bug
> Components: File system connector
> Affects Versions: ManifoldCF 2.9.1
> Reporter: Vinay
> Assignee: Karl Wright
> Priority: Critical
> Fix For: ManifoldCF 2.10
>
>
> I am crawling a file system mounted on a Linux machine, so the Repository
> Connection is of type "File System". ManifoldCF is not picking up files
> whose names contain certain special characters.
> File ex: a_XY-SMnA_ABC_Uuޓࠚϯmӣܼ˵Ҫȳ_֚3ҿؖúشԃԫхրҠë.pdf
> exception: java.lang.NumberFormatException: For input string: ""
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_151]
> at java.lang.Long.parseLong(Long.java:601) ~[?:1.8.0_151]
> at java.lang.Long.<init>(Long.java:965) ~[?:1.8.0_151]
> at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter$SpecPacker.<init>(DocumentFilter.java:513) ~[?:?]
> at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.getPipelineDescription(DocumentFilter.java:76) ~[?:?]
> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getTransformationDescription(IncrementalIngester.java:503) ~[mcf-agents.jar:?]
> at org.apache.manifoldcf.crawler.system.PipelineSpecification.<init>(PipelineSpecification.java:47) ~[mcf-pull-agent.jar:?]
> at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:308) [mcf-pull-agent.jar:?]
> FATAL 2018-02-07T23:47:15,927 (Worker thread '2') - Error tossed: For input string: ""
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)