[ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-469:
---------------------------------------

    Attachment: SOLR-469.patch

This patch contains the following changes:

 * DataSource definitions can now be added inside data-config.xml so there is 
no need to maintain configuration in two files. It also comes in handy with the 
interactive development mode.
 * XPathEntityProcessor now supports XSLT: it can apply a given XSL stylesheet 
to the XML document before processing it. For example: <entity name="e" 
processor="XPathEntityProcessor" xsl="/home/user/my.xsl">
 * XPathEntityProcessor now knows how to process Solr Add XMLs. This is handy 
when using XSLT to change fetched XML directly into Solr Add XML format. Add an 
extra attribute useSolrAddSchema="true" to enable this. If 
useSolrAddSchema="true" is specified, then there is no need to put fields in 
the entity.
 * A new EntityProcessor called FileListEntityProcessor has been added which 
operates over a filesystem directory. It can select files by name (using a 
regex) and by size (in bytes), and can exclude files matching a regex. 
Recursive traversal of a directory is also supported.
 * A TemplateTransformer has been added which lets you combine multiple fields 
into one field according to the given template. For example: <field 
column="name" template="${e.lastName}, ${e.firstName} ${e.middleName}" />
 * The built-in transformers can now also operate on multi-valued fields.
 * A test harness has been created to make it easier to test DataImportHandler 
features. It is called AbstractDataImportHandlerTest and extends 
AbstractSolrTestCase. See TestDataConfig and TestDocBuilder2 for examples.
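
As a rough illustration of how several of the changes above might be combined 
in one data-config.xml, here is a minimal sketch. Only the xsl, 
useSolrAddSchema and template attributes are taken from this message. The 
remaining attribute names (name, type, driver, url, dataSource, baseDir, 
fileName, excludes, recursive, biggerThan, rootEntity) and the 
fileAbsolutePath variable follow the DataImportHandler as later documented on 
the wiki, and are assumptions as far as this patch is concerned. The JDBC 
driver, paths, table and column names are placeholders.

```xml
<dataConfig>
  <!-- DataSources defined inline, so no second configuration file is needed.
       Driver, URL and credentials are placeholders (assumption). -->
  <dataSource name="db" type="JdbcDataSource"
              driver="org.hsqldb.jdbcDriver"
              url="jdbc:hsqldb:/tmp/example/ex"
              user="sa" password=""/>
  <!-- Assumed data source type for reading files from disk. -->
  <dataSource name="files" type="FileDataSource"/>

  <document>
    <!-- TemplateTransformer: build one "name" field from three columns. -->
    <entity name="e" dataSource="db"
            transformer="TemplateTransformer"
            query="select id, firstName, middleName, lastName from person">
      <field column="id" name="id"/>
      <field column="name"
             template="${e.lastName}, ${e.firstName} ${e.middleName}"/>
    </entity>

    <!-- FileListEntityProcessor: walk a directory recursively, keep *.xml
         files larger than 1024 bytes, and skip anything matching *.bak. -->
    <entity name="f" processor="FileListEntityProcessor"
            dataSource="null" rootEntity="false"
            baseDir="/home/user/xml" recursive="true"
            fileName=".*\.xml" excludes=".*\.bak"
            biggerThan="1024">
      <!-- XPathEntityProcessor: run each file through the stylesheet, then
           read the result as Solr add XML, so no <field> entries are needed. -->
      <entity name="x" processor="XPathEntityProcessor"
              dataSource="files"
              url="${f.fileAbsolutePath}"
              xsl="/home/user/my.xsl"
              useSolrAddSchema="true"/>
    </entity>
  </document>
</dataConfig>
```

The two top-level entities are independent: the first exercises the inline 
DataSource and TemplateTransformer, the second chains the file listing into 
XSLT processing. Check the attribute names against the patch before relying 
on them.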

We will add documentation and examples for these changes to the wiki at 
http://wiki.apache.org/solr/DataImportHandler

> Data Import RequestHandler
> --------------------------
>
>                 Key: SOLR-469
>                 URL: https://issues.apache.org/jira/browse/SOLR-469
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>
>         Attachments: SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other dataSources 
> into the Solr index. Think of it as an advanced form of the SqlUpload plugin 
> (SOLR-103).
> The way it works is as follows.
>     * Provide a configuration file (XML) to the handler which takes in the 
> necessary SQL queries and mappings to a Solr schema
>           - It also takes in a properties file for the data source 
> configuration
>     * Given the configuration it can also generate the Solr schema.xml
>     * It is registered as a RequestHandler which can take two commands: 
> do-full-import and do-delta-import
>           - do-full-import - dumps all the data from the database into the 
> index (based on the SQL query in the configuration)
>           - do-delta-import - dumps all the data that has changed since the 
> last import. (We assume a modified-timestamp column in tables)
>     * It provides an admin page
>           - where we can schedule it to be run automatically at regular 
> intervals
>           - It shows the status of the Handler (idle, full-import, 
> delta-import)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
