Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The following page has been changed by ShalinMangar: http://wiki.apache.org/solr/DataImportHandler The comment on the change is: Added hsqldb example ------------------------------------------------------------------------------ In order to get data from the database, our design philosophy revolves around 'templatized sql' entered by the user for each entity. This gives the user the entire power of SQL if he needs it. The root entity is the central table whose primary key can be used to join this table with other child entities. + = Example = + Let us consider an example. Suppose we have the following schema in our database inline:example-schema.png - This is a relational model of the same schema that SOLR currently ships with. We will use this as an example to build a data-config.xml for DataImportHandler. + This is a relational model of the same schema that SOLR currently ships with. We will use this as an example to build a data-config.xml for DataImportHandler. We've created a sample database with this schema in HSQLDB. To run it, do the following steps: + + * Download this attachment:hsqldb-database.zip to execute this example. Extract the downloaded zip file into c:\temp. Also save the following xml as c:\temp\example-data-config.xml {{{ <dataConfig> <document name="products"> - <entity name="item" pk="id" query="select * from item"> + <entity name="item" pk="ID" query="select * from item"> - <field column="id" /> - <field column="name" /> + <field column="ID" name="id" /> + <field column="NAME" name="name" /> - <field column="name" name="nameSort" /> + <field column="NAME" name="nameSort" /> - <field column="name" name="alphaNameSort" /> + <field column="NAME" name="alphaNameSort" /> - <field column="manu" /> + <field column="MANU" name="manu" /> - <field column="weight" /> + <field column="WEIGHT" name="weight" /> - <field column="price" /> + <field column="PRICE" name="price" /> - <field column="popularity" /> + <field column="POPULARITY" name="popularity" /> - <field column="inStock" /> + <field column="INSTOCK" name="inStock" /> - <field column="includes" /> + <field column="INCLUDES" name="includes" /> - <entity name="feature" + <entity name="feature" pk="ITEM_ID" query="select description from feature where item_id='${item.id}'"> <field name="feature" column="description" /> </entity> - <entity name="item_category" + <entity name="item_category" pk="ITEM_ID, CATEGORY_ID" query="select category_id from item_category where item_id='${item.id}'"> - <entity name="category" + <entity name="category" pk="ID" query="select description from category where id = '${item_category.category_id}'"> <field column="description" name="cat" /> </entity> @@ -90, +94 @@ </entity> </document> </dataConfig> + }}} + * Now modify the dataimport section in solrconfig.xml to the following: + {{{ + <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> + <lst name="defaults"> + <str name="config">c:/temp/example-data-config.xml</str> + <lst name="datasource"> + <str name="driver">org.hsqldb.jdbcDriver</str> + <str name="url">jdbc:hsqldb:/temp/example/ex</str> + <str name="user">sa</str> + <str name="password"></str> + </lst> + </lst> + </requestHandler> }}} + + * Download hsqldb driver jar and add it to c:\apache-solr-nightly\lib + * Use "ant dist" to rebuild the apache-solr-1.3.war + * Use the Solr jetty start.jar to startup solr + * Follow full-import section to do a full import Here, the root entity is a table called "item" whose primary key is a column "id". Data can be read from this table with the query "select * from item". Each item can have multiple "features" which are in the table ''feature'' inside the column ''description''. Note the query in ''feature'' entity: {{{ - <entity name="feature" + <entity name="feature" pk="ITEM_ID" query="select description from feature where item_id='${item.id}'"> <field name="feature" column="description" /> </entity> }}} The ''item_id'' foreign key in feature table is joined together with ''id'' primary key in ''item'' to retrieve rows for each row in ''item''. In a similar fashion, we join ''item'' and 'category' (which is a many-to-many relationship). Notice how we join these two tables using the intermediate table ''item_category'' again using templated SQL. {{{ - <entity name="item_category" + <entity name="item_category" pk="ITEM_ID, CATEGORY_ID" query="select category_id from item_category where item_id='${item.id}'"> - <entity name="category" + <entity name="category" pk="ID" query="select description from category where id = '${item_category.category_id}'"> <field column="description" name="cat" /> </entity>
