[ https://issues.apache.org/jira/browse/SOLR-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949422#comment-16949422 ]
ASF subversion and git services commented on SOLR-13829:
--------------------------------------------------------

Commit bed9e7c47432777ff09fa8d03d435ad0e59b518a in lucene-solr's branch refs/heads/master from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bed9e7c ]

SOLR-13829: Update CHANGES.txt

> RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-13829
>                 URL: https://issues.apache.org/jira/browse/SOLR-13829
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Trey Grainger
>            Priority: Major
>         Attachments: SOLR-13829.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In trying to use the "sort" streaming evaluator on a float field (pfloat), I
> am getting casting errors. Whether they occur depends on the values that are
> calculated from the underlying values in the field.
>
> Example:
>
> *Docs:* (paste each into "Documents" pane in Solr Admin UI as type:"json")
>
> {code:java}
> {"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}
> {"id": "2", "name":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}
> {code}
>
> *Streaming Expression:*
>
> {code:java}
> sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id asc"),
>             cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as sim,
>             id),
>      by="sim desc")
> {code}
>
> *Response:*
>
> {code:java}
> {
>   "result-set": {
>     "docs": [
>       {
>         "EXCEPTION": "class java.lang.Double cannot be cast to class java.lang.Long (java.lang.Double and java.lang.Long are in module java.base of loader 'bootstrap')",
>         "EOF": true,
>         "RESPONSE_TIME": 13
>       }
>     ]
>   }
> }
> {code}
>
> This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator,
> there is a line which examines a numeric (BigDecimal) value and, regardless
> of the type of the field the value originated from, converts it to a Long if
> it looks like a whole number. This is the code in question from that class:
>
> {code:java}
> protected Object normalizeOutputType(Object value) {
>   if(null == value){
>     return null;
>   } else if (value instanceof VectorFunction) {
>     return value;
>   } else if(value instanceof BigDecimal){
>     BigDecimal bd = (BigDecimal)value;
>     if(bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0){
>       try{
>         return bd.longValueExact();
>       }
>       catch(ArithmeticException e){
>         // value was too big for a long, so use a double which can handle scientific notation
>       }
>     }
>
>     return bd.doubleValue();
>   }
>   ... [other type conversions]
> {code}
>
> Because of the *return bd.longValueExact();* line, the calculated value for
> "sim" in doc 1 is "Long(1)", whereas the calculated value for "sim" for doc 2
> is "Double(0.88938313)". These are coming back as incompatible data types,
> even though the source data is all of the same type and should be comparable.
>
> Thus when the *sort* evaluator streaming expression (and probably others)
> runs on these calculated values and the list should contain ["0.88938313",
> "1.0"], an exception is thrown because it's trying to compare incompatible
> data types [Double(0.88938313), Long(1)].
>
> This bug is occurring on master currently, but has probably existed in the
> codebase since at least August 2017.
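
For anyone who wants to see the type mismatch outside of Solr, here is a minimal standalone sketch. It assumes nothing beyond the JDK: normalizeAsQuoted copies only the BigDecimal branch of the method quoted in the issue, while the class name, normalizeAsDouble and sortsCleanly are hypothetical names made up for this illustration. In particular, normalizeAsDouble is just one possible type-stable alternative and is not taken from SOLR-13829.patch.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Solr.
final class NormalizeOutputTypeSketch {

    // Copy of the BigDecimal branch quoted in the issue:
    // whole-number values come back as Long, everything else as Double.
    static Object normalizeAsQuoted(BigDecimal bd) {
        if (bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0) {
            try {
                return bd.longValueExact();
            } catch (ArithmeticException e) {
                // value was too big for a long, fall through to double
            }
        }
        return bd.doubleValue();
    }

    // Assumption for illustration only: always return Double so that
    // every normalized value has the same runtime type.
    static Object normalizeAsDouble(BigDecimal bd) {
        return bd.doubleValue();
    }

    // Sorts a copy of the values through a raw Comparable cast and
    // reports whether that worked or failed with a ClassCastException.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean sortsCleanly(List<Object> values) {
        List<Object> copy = new ArrayList<>(values);
        try {
            copy.sort((a, b) -> ((Comparable) a).compareTo(b));
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        BigDecimal doc1 = new BigDecimal("1.0");        // "sim" for doc 1
        BigDecimal doc2 = new BigDecimal("0.88938313"); // "sim" for doc 2

        List<Object> quoted = new ArrayList<>();
        quoted.add(normalizeAsQuoted(doc1));
        quoted.add(normalizeAsQuoted(doc2));

        List<Object> stable = new ArrayList<>();
        stable.add(normalizeAsDouble(doc1));
        stable.add(normalizeAsDouble(doc2));

        System.out.println("quoted types sort cleanly: " + sortsCleanly(quoted));
        System.out.println("double-only types sort cleanly: " + sortsCleanly(stable));
    }
}
```

The comparator here only stands in for whatever comparison the sort evaluator performs internally. The point is that a Long and a Double cannot be compared through their own compareTo methods, so mixing the two in one sorted column fails as soon as one value happens to be a whole number.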