[ 
https://issues.apache.org/jira/browse/SOLR-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504762#comment-13504762
 ] 

James Dyer commented on SOLR-4047:
----------------------------------

Igor, I just committed a fix for SOLR-2141 & SOLR-3842 that includes a test 
demonstrating this scenario.  However, the test passes, and I'm not sure 
anything is actually broken, at least not on the latest revision of Trunk or 
Branch_4x.  Note that the test does not use Tika.  However, the code for 
resolving the Tika URL is similar to the code for the other entity processors, 
so it should work the same way.

See TestVariableResolverEndToEnd, which generates a data-config.xml like this:

{code}
<dataConfig> 
<dataSource name="hsqldb" driver="org.hsqldb.jdbcDriver" 
url="jdbc:hsqldb:mem:." /> 
<document name="TestEvaluators"> 
<entity name="FIRST" processor="SqlEntityProcessor" dataSource="hsqldb"  
query="select  1 as id,  'SELECT' as SELECT_KEYWORD,  CURRENT_TIMESTAMP as 
FIRST_TS from DUAL " >
  <field column="SELECT_KEYWORD" name="select_keyword_s" /> 
  <entity name="SECOND" processor="SqlEntityProcessor" dataSource="hsqldb" 
transformer="TemplateTransformer"    
query="${dataimporter.functions.encodeUrl(FIRST.SELECT_KEYWORD)}  1 as SORT,  
CURRENT_TIMESTAMP as SECOND_TS,  
'${dataimporter.functions.formatDate(FIRST.FIRST_TS, 'yyyy', 'ms_MY')}' as 
SECOND1_S,   'PORK' AS MEAT,  'GRILL' AS METHOD,  'ROUND' AS CUTS,  'BEEF_CUTS' 
AS WHATKIND from DUAL WHERE 1=${FIRST.ID} UNION 
${dataimporter.functions.encodeUrl(FIRST.SELECT_KEYWORD)}  2 as SORT,  
CURRENT_TIMESTAMP as SECOND_TS,  
'${dataimporter.functions.formatDate(FIRST.FIRST_TS, 'yyyy', 'ms_MY')}' as 
SECOND1_S,   'FISH' AS MEAT,  'FRY' AS METHOD,  'SIRLOIN' AS CUTS,  'BEEF_CUTS' 
AS WHATKIND from DUAL WHERE 1=${FIRST.ID} ORDER BY SORT ">
   <field column="SECOND_S" name="second_s" /> 
   <field column="SECOND1_S" name="second1_s" /> 
   <field column="second2_s" 
template="${dataimporter.functions.formatDate(SECOND.SECOND_TS, 'yyyy', 
'ms_MY')}" /> 
   <field column="second3_s" 
template="${dih.functions.formatDate(SECOND.SECOND_TS, 'yyyy', 'ms_MY')}" /> 
   <field column="METHOD" name="${SECOND.MEAT}_s"/>
   <field column="CUTS" name="${SECOND.WHATKIND}_mult_s"/>
  </entity>
</entity>
</document> 
</dataConfig> 
{code}

As you can see, the SQL query on the child entity does not contain a literal 
"select".  Instead it uses 
${dataimporter.functions.encodeUrl(FIRST.SELECT_KEYWORD)}, which takes the 
word "SELECT" from the parent entity's data.
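
To illustrate why the substitution is harmless here, below is a minimal sketch. 
It assumes encodeUrl behaves like java.net.URLEncoder with UTF-8; that is my 
assumption for illustration, not something this test verifies. The class and 
method names are hypothetical and are not part of DIH. A plain keyword such as 
"SELECT" has no characters that need escaping, so it comes back unchanged, 
while a file name with special characters gets escaped.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical stand-in for the encodeUrl evaluator; not the DIH implementation.
class EncodeUrlSketch {

    // Assumption: the value is form-encoded as UTF-8, as URLEncoder does.
    public static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Letters only: nothing to escape, so the SQL keyword survives intact.
        System.out.println(encode("SELECT"));
        // Space and '#' are escaped; '.' and alphanumerics are left alone.
        System.out.println(encode("my file#1.pdf"));
    }
}
```

Under that assumption, the child query still begins with the word SELECT after 
the function is resolved, which is why the inner entity runs normally.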

The response shows it is correctly executing the inner entity:
{code}
  "response":{"numFound":1,"start":0,"docs":[
      {
        "select_keyword_s":"SELECT",
        "id":"1",
        "second3_s":"2012",
        "second2_s":"2012",
        "PORK_s":"GRILL",
        "BEEF_CUTS_mult_s":["ROUND",
          "SIRLOIN"],
        "second1_s":"2012",
        "FISH_s":"FRY",
        "timestamp":"2012-11-27T16:55:39.409Z"}]
  }
{code}

Unless someone can demonstrate this is an actual problem (once again, a good 
failing unit test would help a lot), I will close this as "not a problem" in 
the next week or so.
                
> dataimporter.functions.encodeUrl throughs Unable to encode expression: 
> field.name with value: null
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4047
>                 URL: https://issues.apache.org/jira/browse/SOLR-4047
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0
>         Environment: Windows 7
>            Reporter: Igor Dobritskiy
>            Priority: Critical
>         Attachments: db-data-config.xml, db.sql, schema.xml, solrconfig.xml
>
>
> For some reason dataimporter.functions.encodeUrl stopped working after 
> updating to Solr 4.0 from 3.5.
> Here is the error
> {code}
> Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> encode expression: attach.name with value: null Processing Document # 1
> {code}
> Here is the data import config snippet:
> {code}
> ...
>             <entity name="account"
>                     query="select name from accounts where account_id = 
> '${attach.account_id}'">
>                     <entity name="img_index" processor="TikaEntityProcessor" 
>                             dataSource="bin"
>                             format="text" 
>                             
> url="http://example.com/data/${account.name}/attaches/${attach.item_id}/${dataimporter.functions.encodeUrl(attach.name)}">
>                             <field column="text" name="body" />
>                     </entity> 
>             </entity>
> ...
> {code}
> When I change it to *not* use dataimporter.functions.encodeUrl it works, 
> but I need to URL-encode the file names because they have special chars in 
> their names.
