Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The following page has been changed by NoblePaul:
http://wiki.apache.org/solr/DataImportHandler

------------------------------------------------------------------------------
   * How to fetch data (queries,url etc)
   * What to read ( resultset columns, xml fields etc)
   * How to process (modify/add/remove fields)

- = Usage with databases =
+ = Usage with RDBMS =
  In order to use this handler, the following steps are required.
   * Define a data-config.xml and specify the location of this file in solrconfig.xml under DataImportHandler section
+  * Give connection information (if you choose to put the datasource information in solrconfig)
-  * Give connection information
-    * '''`driver`''' (required): The jdbc driver classname
-    * '''`url`''' (required) : The jdbc connection url
-    * '''`user`''' : User name
-    * '''`password`''' : The password
-    * '''`batchSize`''' : The batchsize used in jdbc connection
   * Open the DataImportHandler page to verify if everything is in order [http://localhost:8983/solr/dataimport]
   * Use full-import command to do a full import from the database and add to SOLR index
   * Use delta-import command to do a delta import (get new inserts/updates) and add to SOLR index

+ == Configuring DataSources ==
+ Add the tag 'dataSource' directly under the 'dataConfig' tag.
- == Configuration in solrconfig.xml ==
- In the example given below the datasource is configured in the solrconfig.xml. Whatever datasource configuration is done here can also be done in data-config xml also.
- {{{
+ <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/dbname" user="db_username" password="db_password"/>
- <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
-   <lst name="defaults">
-     <str name="config">/home/username/data-config.xml</str>
-     <lst name="datasource">
-       <str name="driver">com.mysql.jdbc.Driver</str>
-       <str name="url">jdbc:mysql://localhost/dbname</str>
-       <str name="user">db_username</str>
-       <str name="password">db_password</str>
-     </lst>
-   </lst>
- </requestHandler>
  }}}
+
+ The datasource configuration can also be done in data-config xml [#solrconfigdatasource also] . The attributes other than 'type' and 'name' are not decided by the datasource implementation. Each one can decide what it needs. We will discuss them as we see them
  === Multiple DataSources ===
- It is possible to have more than one datasources for a configuration. To configure an extra datasource , just keep an another `<lst name="datasource">` entry . There is an implicit attribute "name" for a datasource. If there are more than one, each extra datasource must be identified by a unique name `'<str name="name">datasource-2/str>'`
+ It is possible to have more than one datasources for a configuration. To configure an extra datasource , just keep an another 'dataSource' tag . There is an implicit attribute "name" for a datasource. If there are more than one, each extra datasource must be identified by a unique name `'name="datasource-2"'`

  eg:
  {{{
+ <dataSource type="JdbcDataSource" name="datasource-1" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://db1-host/dbname" user="db_username" password="db_password"/>
+ <dataSource type="JdbcDataSource" name="datasource-2" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://db2-host/dbname" user="db_username" password="db_password"/>
- <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
-   <lst name="defaults">
-     ...
-     <lst name="datasource">
-       <str name="name">datasource-1/str>
-       <str name="driver">com.mysql.jdbc.Driver</str>
-       .....
-     </lst>
-     <lst name="datasource">
-       <str name="name">datasource-2/str>
-       <str name="driver">com.mysql.jdbc.Driver</str>
-       .....
-     </lst>
-   </lst>
- </requestHandler>
  }}}
  in your entities:
  {{{
@@ -100, +73 @@

  </entity>
  ..
  }}}
+ == Configuring JdbcDataSource ==
+ The attributes accepted by !JdbcDataSource are ,
+  * '''`driver`''' (required): The jdbc driver classname
+  * '''`url`''' (required) : The jdbc connection url
+  * '''`user`''' : User name
+  * '''`password`''' : The password
+  * '''`batchSize`''' : The batchsize used in jdbc connection

  == Configuration in data-config.xml ==
  A SOLR document can be considered as a de-normalized schema having fields whose values come from multiple tables.
@@ -118, +98 @@

   * '''`pk`''' : The primary key for the entity. Only needed for the root entity. This will be the id for the document
   * '''`rootEntity`''' : By default the entities falling under the document are root entities. If it is set to false , the entity directly falling under that entity will be treated as the root entity (so on and so forth). For every row returned by the root entity a document is created in Solr

- For !JdbcdataSource the entity attributes are :
+ For !SqlEntityProcessor the entity attributes are :

   * '''`query`''' (required) : The sql string using which to query the db
   * '''`deltaQuery`''' : Only used in delta-import
@@ -591, +571 @@

  </lst>
  }}}

+ == Adding datasource in solrconfig.xml ==
+ [[Anchor(solrconfigdatasource)]]
+
+ It is possible to configure datasource in solrconfig.xml also. The attributes are same ,just in a different way.
+ {{{
+ <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
+   <lst name="defaults">
+     <str name="config">/home/username/data-config.xml</str>
+     <lst name="datasource">
+       <str name="driver">com.mysql.jdbc.Driver</str>
+       <str name="url">jdbc:mysql://localhost/dbname</str>
+       <str name="user">db_username</str>
+       <str name="password">db_password</str>
+     </lst>
+   </lst>
+ </requestHandler>
+ }}}
+
  = Architecture =
  The following diagram describes the logical flow for a sample configuration.
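
For illustration only: the 'dataSource' tags introduced in this change could be combined with entities in a data-config.xml roughly as sketched below. This is a minimal sketch, not text from the wiki page. The table names (item, feature), the column names and the entity names are hypothetical, and the 'document' wrapper and the 'dataSource' attribute on each entity are assumed from typical DataImportHandler configurations rather than shown in this diff.
{{{
<dataConfig>
  <!-- Two named JDBC datasources, declared directly under dataConfig -->
  <dataSource type="JdbcDataSource" name="datasource-1" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://db1-host/dbname" user="db_username" password="db_password"/>
  <dataSource type="JdbcDataSource" name="datasource-2" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://db2-host/dbname" user="db_username" password="db_password"/>
  <document>
    <!-- Root entity: one Solr document per row; reads from datasource-1 (hypothetical table) -->
    <entity name="item" dataSource="datasource-1" pk="id" query="select id, name from item">
      <!-- Child entity: looked up per parent row from datasource-2 (hypothetical table) -->
      <entity name="feature" dataSource="datasource-2" query="select description from feature where item_id='${item.id}'"/>
    </entity>
  </document>
</dataConfig>
}}}
Each entity names the datasource it reads from, so rows from two databases can end up in the same Solr document.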
