[
https://issues.apache.org/jira/browse/SOLR-7670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shawn Heisey resolved SOLR-7670.
--------------------------------
Resolution: Invalid
Issues like this should be brought up on the mailing list, to figure out
whether there is a bug or just a misconfiguration.
I'm going to guess that this is a misconfiguration, and I might know what it
is: You have some nested entities, and each inner entity uses
${files.fileAbsolutePath}, but there is no entity named "files". The outer
entity is files1 in the first nested case and files2 in the second, so the
variable most likely does not resolve to a real file path, which would explain
the FileNotFoundException.
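As a rough sketch of that guess (untested, and assuming nothing else in the
config is wrong), the variable prefix in each inner entity would need to match
the name of the entity that encloses it. The fileName pattern and field lists
are shortened here for illustration only; the other attributes are copied
from the reported config:

```xml
<!-- Sketch only: fileName pattern and field lists are shortened for illustration. -->
<dataConfig>
  <dataSource type="BinFileDataSource" />
  <document>
    <entity name="files1"
            dataSource="null"
            rootEntity="false"
            processor="FileListEntityProcessor"
            baseDir="/w/PDF/"
            fileName=".*\.(pdf|docx)"
            onError="skip"
            recursive="true">
      <field column="fileAbsolutePath" name="id" />
      <!-- Prefix matches the enclosing entity: files1, not files -->
      <entity name="documentImport1"
              processor="TikaEntityProcessor"
              url="${files1.fileAbsolutePath}"
              format="text">
        <field column="text" name="text"/>
      </entity>
    </entity>
    <entity name="files2"
            dataSource="null"
            rootEntity="false"
            processor="FileListEntityProcessor"
            baseDir="/w/KNOW-HOW/"
            fileName=".*\.(pdf|docx)"
            onError="skip"
            recursive="true">
      <field column="fileAbsolutePath" name="id" />
      <!-- Prefix matches the enclosing entity: files2, not files -->
      <entity name="documentImport2"
              processor="TikaEntityProcessor"
              url="${files2.fileAbsolutePath}"
              format="text">
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```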
If that is not the problem, please bring this issue up on the user mailing
list. Use a paste website (perhaps http://apaste.info would work) to include
the full stacktrace from the exception and any configs.
https://lucene.apache.org/solr/resources.html#mailing-lists
I will mark this issue resolved. If it turns out that there actually is a bug,
we can re-open it.
> solr import files from multiple dataSource entity
> -------------------------------------------------
>
> Key: SOLR-7670
> URL: https://issues.apache.org/jira/browse/SOLR-7670
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.1
> Reporter: István Bakró Nagy
> Priority: Minor
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> I am trying to import files from multiple folders.
> My solrconfig.xml points org.apache.solr.handler.dataimport.DataImportHandler
> at the following config file:
> <dataConfig>
>   <dataSource type="BinFileDataSource" />
>   <document>
>     <entity name="files1"
>             dataSource="null"
>             rootEntity="false"
>             processor="FileListEntityProcessor"
>             baseDir="/w/PDF/"
>             fileName=".*\.(pdf)|(doc)|(docx)|(ppt)|(pptx)|(xls)|(xlsx)|(odf)|(txt)|(rtf)|(html)|(htm)|(jpg)"
>             onError="skip"
>             recursive="true">
>       <field column="fileAbsolutePath" name="id" />
>       <field column="fileSize" name="size" />
>       <field column="fileLastModified" name="lastModified" />
>       <field column="file" name="fileName"/>
>       <entity
>           name="documentImport1"
>           processor="TikaEntityProcessor"
>           url="${files.fileAbsolutePath}"
>           format="text">
>         <field column="file" name="fileName"/>
>         <field column="Author" name="author" meta="true"/>
>         <field column="title" name="title" meta="true"/>
>         <field column="text" name="text"/>
>         <copyField source="content" dest="text"/>
>       </entity>
>     </entity>
>     <entity name="files2"
>             dataSource="null"
>             rootEntity="false"
>             processor="FileListEntityProcessor"
>             baseDir="/w/KNOW-HOW/"
>             fileName=".*\.(pdf)|(doc)|(docx)|(ppt)|(pptx)|(xls)|(xlsx)|(odf)|(txt)|(rtf)|(html)|(htm)|(jpg)"
>             onError="skip"
>             recursive="true">
>       <field column="fileAbsolutePath" name="id" />
>       <field column="fileSize" name="size" />
>       <field column="fileLastModified" name="lastModified" />
>       <field column="file" name="fileName"/>
>       <entity
>           name="documentImport2"
>           processor="TikaEntityProcessor"
>           url="${files.fileAbsolutePath}"
>           format="text">
>         <field column="file" name="fileName"/>
>         <field column="Author" name="author" meta="true"/>
>         <field column="title" name="title" meta="true"/>
>         <field column="text" name="text"/>
>         <copyField source="content" dest="text"/>
>       </entity>
>     </entity>
>   </document>
> </dataConfig>
> During import I get a FileNotFoundException.
> What am I missing?