Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The following page has been changed by NoblePaul:
http://wiki.apache.org/solr/DataImportHandler

------------------------------------------------------------------------------
  == EntityProcessor ==
  Each entity is handled by a default entity processor called !SqlEntityProcessor. This works well for systems that use an RDBMS as a datasource. For other kinds of datasources, such as REST or non-SQL datasources, you can choose to implement the interface `org.apache.solr.handler.dataimport.EntityProcessor`. It is designed to stream rows one by one from an entity. The simplest way to implement your own !EntityProcessor is to extend !EntityProcessorBase and override the `public Map<String,Object> nextRow()` method. An !EntityProcessor relies on the !DataSource for fetching data. The return type of the !DataSource is important for an !EntityProcessor. The built-in ones are:

+ === SqlEntityProcessor ===
-  * '''!SqlEntityProcessor''' : This is the default. The !DataSource must be of type `DataSource<Iterator<Map<String, Object>>>`. !JdbcDataSource can be used with this.
+ This is the default. The !DataSource must be of type `DataSource<Iterator<Map<String, Object>>>`. !JdbcDataSource can be used with this.

+ === XPathEntityProcessor ===
-  * '''X!PathEntityProcessor''' : Used for XML type datasources. The !DataSource must be of type `DataSource<Reader>`. !HttpDataSource or !FileDataSource can be used with this.
+ Used for XML type datasources. The !DataSource must be of type `DataSource<Reader>`. !HttpDataSource or !FileDataSource can be used with this.

+ === FileListEntityProcessor ===
-  * '''!FileListEntityProcessor''' : A simple processor that enumerates the list of files from a file system based on some criteria. It does not use a !DataSource. The entity attributes are:
+ A simple processor that enumerates the list of files from a file system based on some criteria. It does not use a !DataSource. The entity attributes are:
-    * '''`fileName`''' : (required) A regex pattern to identify files
+  * '''`fileName`''' : (required) A regex pattern to identify files
-    * '''`baseDir`''' : (required) The base directory (absolute path)
+  * '''`baseDir`''' : (required) The base directory (absolute path)
-    * '''`recursive`''' : Whether to list files recursively. Default is 'false'
+  * '''`recursive`''' : Whether to list files recursively. Default is 'false'
-    * '''`excludes`''' : A regex pattern of excluded file names
+  * '''`excludes`''' : A regex pattern of excluded file names
-    * '''`newerThan`''' : A date param. Use the format (`yyyy-MM-dd HH:mm:ss`). It can also be a datemath string, e.g. ('NOW-3DAYS'); the single quotes are necessary. Or it can be a valid variable resolver format like (${var.name})
+  * '''`newerThan`''' : A date param. Use the format (`yyyy-MM-dd HH:mm:ss`). It can also be a datemath string, e.g. ('NOW-3DAYS'); the single quotes are necessary. Or it can be a valid variable resolver format like (${var.name})
-    * '''`olderThan`''' : A date param. Same rules as above
+  * '''`olderThan`''' : A date param. Same rules as above
  example:
  {{{
  <entity name="f" processor="FileListEntityProcessor" fileName=".*xml" newerThan="'NOW-3DAYS'" recursive="true" rootEntity="false">
@@ -520, +523 @@

  </entity>
  </entity>
  }}}
-
- Do not miss the `rootEntity` attribute. The implicit fields generated by the processor are `fileAbsolutePath,fileSize,fileLastModified,fileName`
+ Do not miss the `rootEntity` attribute. The implicit fields generated by the processor are `fileAbsolutePath,fileSize,fileLastModified,fileName`.
+
+ === CachedSqlEntityProcessor ===
+ [[Anchor(cached)]]
+ /!\ TODO
+
+
+ == DataSource ==
  [[Anchor(datasource)]]
- == DataSource ==
  A class can implement `org.apache.solr.handler.dataimport.DataSource`
  {{{
  public interface DataSource <T> {
