[Solr Wiki] Update of "DataImportHandler" by NoblePaul

Apache Wiki Wed, 02 Apr 2008 21:43:33 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change 
notification.


The following page has been changed by NoblePaul:
http://wiki.apache.org/solr/DataImportHandler

------------------------------------------------------------------------------
    </requestHandler>
  }}}
  Note: A configuration can have more than one datasource. To configure 
another datasource, add another `<lst name="datasource">` entry. Each 
datasource has an implicit attribute "name". If there is more than one, each 
extra datasource must be identified by a unique name, e.g. 
`<str name="name">datasource-2</str>`
+ eg:
+ {{{
+   <requestHandler name="/dataimport" 
class="org.apache.solr.handler.dataimport.DataImportHandler">
+     <lst name="defaults">
+       ...
+       <lst name="datasource">
+          <str name="name">datasource-1</str>
+          <str name="driver">com.mysql.jdbc.Driver</str>
+          .....
+       </lst>
+       <lst name="datasource">
+          <str name="name">datasource-2</str>
+          <str name="driver">com.mysql.jdbc.Driver</str>
+          .....
+       </lst>
+     </lst>
+   </requestHandler>
+ }}}
+ in your entities:
+ {{{
+ ..
+ <entity name="one" dataSource="datasource-1" ...>
+    ..
+ </entity>
+ <entity name="two" dataSource="datasource-2" ...>
+    ..
+ </entity>
+ ..
+ }}}
  
  == Configuration in data-config.xml ==
  A SOLR document can be considered as a de-normalized schema having fields 
whose values come from multiple tables.
@@ -379, +408 @@

  
  It moves ahead, encounters `/RDF/item`, and processes the rows one by one. 
It gets the values for all the fields except for the 3 fields in the header. 
But as they were marked as common fields, the processor puts those fields into 
the record just before creating the document.
  
- What about this ''transformer=!DateFormatTransformer'' attribute in the 
entity? . See !DateFormat Section for details
+ What about this ''transformer=!DateFormatTransformer'' attribute in the 
entity? See the !DateFormatTransformer section for details.
  
  You can use this feature for indexing from REST APIs such as RSS/Atom feeds, 
XML data feeds, other SOLR servers, or even well-formed XHTML documents. Our 
XPath support has its limitations, but we have tried to make sure that common 
use-cases are covered. Since it is based on a streaming parser, it is 
extremely fast and consumes a constant amount of memory even for large XML 
files. It does not support namespaces, but it can handle XML with namespaces: 
when you provide the xpath, just drop the namespace prefix and give the rest 
(e.g. if the tag is `'<dc:subject>'`, the mapping should just contain 
`'subject'`). Easy, isn't it? And you didn't need to write one line of code! 
Enjoy :)
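  
  A minimal sketch of the namespace rule, assuming the entity is read with 
!XPathEntityProcessor and that the attribute names used here (`processor`, 
`forEach`, `column`, `xpath`) apply. The entity name `feed` is hypothetical 
and the feed URL is left elided:
  {{{
  <entity name="feed" processor="XPathEntityProcessor"
          url="..." forEach="/RDF/item">
    <!-- The source tag is <dc:subject>; the xpath drops the "dc:" prefix -->
    <field column="subject" xpath="/RDF/item/subject" />
  </entity>
  }}}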
  = Extending the tool with APIs =
@@ -463, +492 @@

  </dataConfig>
  }}}
  
-  * You can put script tags inside the ''dataConfig'' node. By default, the 
language is assumed to be Javascript. In case you're using another language, 
specify on the script tag with attribute ''language="MyLanguage"''
+  * You can put script tags inside the ''dataConfig'' node. By default, the 
language is assumed to be Javascript. In case you're using another language, 
specify on the script tag with attribute `'language="MyLanguage"'`
   * Write as many transformer functions as you want to use. Each such function 
must accept a ''row'' variable corresponding to ''Map<String, Object>'' and 
return a row (after applying transformations)
   * Make an entity use a function by specifying 
''transformer="script:<function-name>"'' in the ''entity'' node.
   * In the above data-config, the javascript function ''f1'' will be executed 
once for each row returned by entity e.
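  
  The steps above can be sketched as follows. This is only an illustration: 
the function name `addSource`, the entity name, the query and the `source` 
column are hypothetical, and the `row.put` call assumes the row is exposed to 
the script as a Java Map:
  {{{
  <dataConfig>
    <script><![CDATA[
      // Hypothetical transformer: adds a constant 'source' column to each row
      function addSource(row) {
        row.put('source', 'db');
        return row;  // the transformed row must be returned
      }
    ]]></script>
    <document>
      <entity name="item" transformer="script:addSource"
              query="select * from item">
        ...
      </entity>
    </document>
  </dataConfig>
  }}}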
