Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The following page has been changed by NoblePaul:
http://wiki.apache.org/solr/DataImportHandler

The comment on the change is:
NumberFormatTransformer added

------------------------------------------------------------------------------
  === RegexTransformer ===

- There is a built-in transformer called '!RegexTransformer' provided with the tool itself. It helps in extracting values from fields (from the db) using regular expressions. The actual class name is `org.apache.solr.handler.dataimport.RegexTransformer`, but as it belongs to the default package, the package name can be omitted.
+ There is a built-in transformer called '!RegexTransformer' provided with the tool itself. It helps in extracting values from fields (from the source) using regular expressions. The actual class name is `org.apache.solr.handler.dataimport.RegexTransformer`, but as it belongs to the default package, the package name can be omitted.
+ example:
  {{{

@@ -477, +478 @@

  </entity>
  }}}
- ''''Attributes required by `RegexTransformer`''''
+ ==== Attributes required by RegexTransformer ====
+ !RegexTransformer applies only to fields with a 'regex' or 'splitBy' attribute. All other fields are left as they are.
   * '''`regex`''' : The regular expression that is used to match. This or `splitBy` must be present for each field. If not, that field is not touched by the transformer. If `replaceWith` is absent, each ''group'' is taken as a value and a list of values is returned
   * '''`sourceColName`''' : The column on which the regex is to be applied. If there is only one column this can be omitted
   * '''`splitBy`''' : If the `regex` is used to split a String to obtain multiple values, use this

@@ -505, +507 @@

  </dataConfig>
  }}}

-  * You can put script tags inside the ''dataConfig'' node. By default, the language is assumed to be JavaScript. In case you're using another language, specify it on the script tag with the attribute `'language="MyLanguage"'`
+  * You can put a script tag inside the ''dataConfig'' node. By default, the language is assumed to be JavaScript. In case you're using another language, specify it on the script tag with the attribute `'language="MyLanguage"'` (must be supported by java 6)
   * Write as many transformer functions as you want to use. Each such function must accept a ''row'' variable corresponding to ''Map<String, Object>'' and return a row (after applying transformations)
   * Make an entity use a function by specifying ''transformer="script:<function-name>"'' in the ''entity'' node.
   * In the above data-config, the JavaScript function ''f1'' will be executed once for each row returned by entity e.
-
+  * The semantics of execution are the same as those of a java transformer. The method can take two arguments, as in 'transformRow(Map<String,Object> row, Context context)' in the interface 'Transformer'. As it is JavaScript, the second argument may be omitted and it still works

  === DateFormatTransformer ===
  There is a built-in transformer called the !DateFormatTransformer which is useful for parsing date/time strings into java.util.Date instances.
+ !DateFormatTransformer applies only to fields with a 'dateTimeFormat' attribute. All other fields are left as they are.
  {{{
  <field column="date" xpath="/RDF/item/date" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss" />
  }}}
+ The above field definition is used in the RSS example to parse the publish date of the RSS feed item. The transformer only applies to a field which has the attribute 'dateTimeFormat' and it uses the syntax of java's [http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html SimpleDateFormat].

+ === NumberFormatTransformer ===
+ Can be used to parse a number from a String. Uses the !NumberFormat class in java
+ e.g.:
+ {{{
+ <field column="price" formatStyle="number" />
+ }}}
+ !NumberFormatTransformer applies only to fields with a 'formatStyle' attribute. All other fields are left as they are. The value of the attribute must be one of (number|percent|integer|currency).
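As a rough illustration of what parsing with the four 'formatStyle' values could look like, here is a minimal sketch built directly on java's `java.text.NumberFormat`. It is not the DataImportHandler source: the class `NumberFormatSketch` and its methods are hypothetical names, the mapping from each 'formatStyle' value to a `NumberFormat` factory method is an assumption, and `Locale.US` is fixed only to keep the results predictable.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical illustration class; not part of DataImportHandler.
class NumberFormatSketch {

    // Assumed mapping from a formatStyle value to a NumberFormat factory method.
    // Locale.US is an assumption made for this sketch only.
    static NumberFormat formatFor(String formatStyle) {
        Locale locale = Locale.US;
        switch (formatStyle) {
            case "number":   return NumberFormat.getNumberInstance(locale);
            case "percent":  return NumberFormat.getPercentInstance(locale);
            case "integer":  return NumberFormat.getIntegerInstance(locale);
            case "currency": return NumberFormat.getCurrencyInstance(locale);
            default:
                throw new IllegalArgumentException("Unsupported formatStyle: " + formatStyle);
        }
    }

    // Parses a String into a Number using the given style.
    // A failed parse is rethrown as an unchecked exception to keep callers simple.
    static Number parse(String value, String formatStyle) {
        try {
            return formatFor(formatStyle).parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                "Cannot parse '" + value + "' as " + formatStyle, e);
        }
    }

    public static void main(String[] args) {
        // One sample input per supported style.
        System.out.println(parse("1,234.5", "number"));
        System.out.println(parse("25%", "percent"));
        System.out.println(parse("1,234", "integer"));
        System.out.println(parse("$12.50", "currency"));
    }
}
```

The sketch rejects any style outside (number|percent|integer|currency), which mirrors the restriction on the attribute value stated above.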

  == EntityProcessor ==
  By default, each entity is handled by an entity processor called !SqlEntityProcessor. This works well for systems which use an RDBMS as a datasource. For other kinds of datasources, such as REST or non-SQL datasources, you can choose to implement the interface `org.apache.solr.handler.dataimport.EntityProcessor`
  {{{
