[
https://issues.apache.org/jira/browse/SOLR-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Peng updated SOLR-7061:
----------------------------
Attachment: SOLR-7061.patch
> Cross-Entity Variable Resolving and Arguments for ScriptTransformer Functions
> -----------------------------------------------------------------------------
>
> Key: SOLR-7061
> URL: https://issues.apache.org/jira/browse/SOLR-7061
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 4.10.3
> Reporter: Mark Peng
> Priority: Minor
> Labels: dataimport, transformers
> Attachments: SOLR-7061.patch
>
>
> Script Transformer has been widely used to modify the value of columns of
> selected rows from targeting data source (such as SQL Database) based on
> specific logics, before writing to Solr as documents. However, current
> implementation has the following limitations:
> *1. It is not possible to pass constant values or resolved variables (e.g.,
> $\{TABLE.COLUMN\} ) as arguments to a script function.*
> *2. Cross-entity row data exchange is not possible as well.*
> In our use case, we have complex nested entities and rely heavily on the
> script functions to transform table rows while doing data import. Sometimes
> for each single document, we need to get the selected column values from a
> parent entity into current entity for doing value transformation and applying
> if-else logics. To achieve this, we need to join with others tables in the
> SQL of current entity, which is quite resource-consuming, especially for
> large tables.
> Therefore, we have done some improvements to allow us to pass selected column
> values from entity A to another entity B as its function arguments by
> utilizing variable resolver.
> Here is an example about how it works. Support we have the following
> configuration:
> {code}
> <dataConfig>
> <dataSource name="ProductDB"
> driver="oracle.jdbc.driver.OracleDriver"
>
> url="jdbc:oracle:thin:@${dataimporter.request.host}:${dataimporter.request.port}/${dataimporter.request.name}"
>
> user="${dataimporter.request.user}"
> password="${dataimporter.request.password}"
> autoCommit="true"/>
> <!-- ScriptTransformer functions -->
> <script><![CDATA[
> function processItemRow(row, resolvedVars) {
> var isOnSale = resolvedVars.get("${PRODUCT.IS_ONSALE}");
> var discount = row.get("DISCOUNT_RATE");
> var price = row.get("PRICE");
>
> if(isOnSale) {
> row.put("PRICE", price * discount);
> }
> else
> row.put("PRICE", price);
>
> return row;
> }
> ]]>
> </script>
> <document name="EC_SHOP">
> <entity dataSource="ProductDB" name="PRODUCT"
> query="SELECT PRODUCT_ID, TITLE, IS_ONSALE FROM PRODUCT">
> <field column="PRODUCT_ID" name="PRODUCT_ID"/>
> <field column="TITLE" name="TITLE"/>
> <field column="IS_ONSALE" name="IS_ONSALE"/>
>
> <entity dataSource="ProductDB" name="ITEM"
> transformer="script:processItemRow(${PRODUCT.IS_ONSALE})"
> query="SELECT PRICE FROM ITEM WHERE PRODUCT_ID =
> '${PRODUCT.PRODUCT_ID}'">
> <field column="PRICE" name="PRICE"/>
> </entity>
> </entity>
> </dataConfig>
> {code}
> As demonstrated above, now we can get access to the value of column
> *IS_ONSALE* of table *PRODUCT* from the entity of table *ITEM* by passing
> *$\{PRODUCT.IS_ONSALE\}* as an argument of the function *processItemRow* to
> determine if we should give some discounts for the production price. The
> signature of function has a secondary argument (named *resolvedVars* here)
> for passing the map of column values resolved from other previous entities.
> This improvement gives more flexibility for script functions to exchange row
> data cross entities and do more complex processing for entity rows.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]