Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "DataImportHandler" page has been changed by JamesDyer:
http://wiki.apache.org/solr/DataImportHandler?action=diff&rev1=326&rev2=327

Comment:
SOLR-4068: Refactor DIH - VariableResolver & Evaluator

  == What is a row? ==
  A row in !DataImportHandler is a Map (Map<String, Object>). In the map, the key is the name of the field and the value can be anything that is a valid Solr type. The value can also be a Collection of valid Solr types (this may get mapped to a multi-valued field). If the !DataSource is an RDBMS, a query cannot emit a multivalued field. However, it is possible to create a multivalued field by joining an entity with another, i.e. if the sub-entity returns multiple rows for one row from the parent entity, they can go into a multivalued field. If the datasource is XML, it is possible to return a multivalued field.
- == A VariableResolver ==
+ == VariableResolver ==
- A !VariableResolver is the component which replaces all those placeholders such as `${<name>}`. It is a multilevel Map. Each namespace is a Map and namespaces are separated by periods (.). E.g. if there is a placeholder ${item.ID}, 'item' is a namespace (which is a map) and 'ID' is a value in that namespace. It is possible to nest namespaces like ${item.x.ID} where x could be another Map. A reference to the current !VariableResolver can be obtained from the Context. Or the object can be directly consumed by using ${<name>} in 'query' for RDBMS queries or 'url' in Http.
+ The !VariableResolver is the component which replaces all those placeholders such as `${<name>}`. It is a multilevel Map. Each namespace is a Map and namespaces are separated by periods (.). E.g. if there is a placeholder ${item.ID}, 'item' is a namespace (which is a map) and 'ID' is a value in that namespace. It is possible to nest namespaces like ${item.x.ID} where x could be another Map. A reference to the current !VariableResolver can be obtained from the Context.
Or the object can be directly consumed by using ${<name>} in 'query' for RDBMS queries or 'url' in Http.
- === Custom formatting in query and url using Functions ===
+ == Evaluators - Custom formatting in queries and urls ==
- While the namespace concept is useful, the user may want to put some computed value into the query or url, for example when there is a Date object and your datasource accepts Date in some custom format. There are a few functions provided by the !DataImportHandler which can do some of these.
+ While the namespace concept is useful, the user may want to put some computed value into the query or url, for example when there is a Date object and your datasource accepts Date in some custom format.
-  * ''formatDate'' : It is used like this `'${dataimporter.functions.formatDate(item.ID, 'yyyy-MM-dd HH:mm')}'`. The first argument can be a valid value from the !VariableResolver and the second value can be a format string (use !SimpleDateFormat). The first argument can be a computed value, e.g. `'${dataimporter.functions.formatDate('NOW-3DAYS', 'yyyy-MM-dd HH:mm')}'`, and it uses the syntax of the datemath parser in Solr (note that it must be enclosed in single quotes). <!> Note: this syntax was changed in 1.4. The second parameter was not enclosed in single quotes earlier, but it will continue to work without single quotes also.
+ === formatDate ===
+ Use this to format dates as strings. It takes three parameters (prior to Solr 4.1, it takes two):
+  1. A variable that refers to a date, or a datemath expression.
+  2. A date format string. See the java.text.SimpleDateFormat javadoc for valid date formats. (In Solr 4.1 and later, this must be enclosed in single quotes. In Solr 1.4 - 4.0, quotes are optional. Prior to Solr 1.4, this must not be enclosed in single quotes.)
+  3. <!> [[Solr4.1]] (optional) The locale code to use when formatting dates, enclosed in single quotes. See the java.util.Locale javadoc for details. If omitted, this defaults to the ROOT Locale.
(Note: prior to Solr 4.1, formatDate would always use the current machine's default locale.)
+
+
+  * example using a variable: `'${dataimporter.functions.formatDate(item.ID, 'yyyy-MM-dd HH:mm')}'`
+  * example using a datemath expression: `'${dataimporter.functions.formatDate('NOW-3DAYS', 'yyyy-MM-dd HH:mm')}'`
+  * example specifying a Locale: <!> [[Solr4.1]] `'${dataimporter.functions.formatDate(item.ID, 'yyyy-MM-dd HH:mm', 'th_TH')}'`
+
+ === escapeSql ===
-  * ''escapeSql'' : Use this to escape special SQL characters, e.g. `'${dataimporter.functions.escapeSql(item.ID)}'`. It takes only one argument, which must be a valid value in the !VariableResolver.
+ Use this to escape special SQL characters, e.g. `'${dataimporter.functions.escapeSql(item.ID)}'`. It takes only one argument, which must be a valid value in the !VariableResolver.
+
+ === encodeUrl ===
-  * ''encodeUrl'' : Use this to encode urls, e.g. `'${dataimporter.functions.encodeUrl(item.ID)}'`. It takes only one argument, which must be a valid value in the !VariableResolver.
+ Use this to encode urls, e.g. `'${dataimporter.functions.encodeUrl(item.ID)}'`. It takes only one argument, which must be a valid value in the !VariableResolver.
- ==== Custom Functions ====
+ == Custom Evaluators ==
  It is possible to plug custom functions into DIH. Implement an [[http://lucene.apache.org/solr/api/org/apache/solr/handler/dataimport/Evaluator.html|Evaluator]] and specify it in the data-config.xml. Following is an example of an evaluator which does a 'toLowerCase' on a String.
  {{{
@@ -1099, +1112 @@
  </document>
  </dataConfig>
  }}}
- The implementation of !LowerCaseFunctionEvaluator
+ The implementation of !LowerCaseFunctionEvaluator
+ <!> [[Solr4.1]] this example depends on API modifications made in Solr 4.1
  {{{
  public class LowerCaseFunctionEvaluator extends Evaluator{
    public String evaluate(String expression, Context context) {
-     List l = EvaluatorBag.parseParams(expression, context.getVariableResolver());
+     List<Object> l = parseParams(expression, context.getVariableResolver());
-
      if (l.size() != 1) {
        throw new RuntimeException("'toLowerCase' must have only one parameter ");
      }
-     return l.get(0).toString().toLowerCase();
+     return l.get(0).toString().toLowerCase(Locale.ROOT);
-
    }
-
  }
  }}}
  === Accessing request parameters ===