Also, other client applications may rely on the current behavior, so adding a switch on the /export handler for numeric nulls is possibly the safest approach.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, May 27, 2016 at 8:03 AM, Joel Bernstein <[email protected]> wrote:

> Yes, the /export handler returns zero for numeric fields that aren't
> present. String fields should be empty though if not present.
>
> We'll want to keep the zero while sorting in the /export handler. But
> removing the zero when outputting the field should be OK. We'll just need
> to add test cases that cover the numeric nulls. The RollupStream would be
> one place that might have a problem with this.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, May 27, 2016 at 4:52 AM, Dennis Gove <[email protected]> wrote:
>
>> Is this true for non-numeric fields as well? I agree that this seems like
>> a very bad thing.
>>
>> I can't imagine that a fix would cause a problem with Streaming
>> Expressions, ParallelSQL, or others, given that the /select handler is not
>> returning 0 for these missing fields (the /select handler is the default
>> handler for the Streaming API, so if nulls were a problem I imagine we'd
>> have already seen it).
>>
>> That said, within Streaming Expressions there is a select(...) function
>> which supports a replace(...) operation which allows you to replace one
>> value (or null) with some other value. If a 0 were necessary one could use
>> a select(...) to replace null with 0 using an expression like this:
>> select(<stream>, replace(fieldA, null, withValue=0)).
>> The end result of that would be that the field fieldA would never have a
>> null value, and for all tuples where a null value existed it would be
>> replaced with 0.
>>
>> Details on the select function can be found at
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61330338#StreamingExpressions-select
>> .
>>
>> - Dennis
>>
>> On Thu, May 26, 2016 at 11:35 PM, Erick Erickson <[email protected]> wrote:
>>
>>> This seems to me to be A Bad Thing. Zero is different from not
>>> existing. And let's claim that I want to process a stream and, say,
>>> facet on an integer field over the result set. There's no way on the
>>> client side to distinguish between a document that has a zero in the
>>> field and one that didn't have the field in the first place, so I'll
>>> over-count the zero bucket.
>>>
>>> So before I raise a JIRA, my question is whether this is expected
>>> behavior or not? I've found a mechanism that _shouldn't_ be very
>>> expensive to omit the field if it doesn't exist in the returned
>>> tuples.
>>>
>>> Now, how badly this would break Streaming Expressions, ParallelSQL and
>>> the like I haven't looked into yet.
>>>
>>> So before I work up a trial patch, am I going off in the weeds?
>>>
>>> Best,
>>> Erick
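
For anyone following the thread, the over-counting problem and the proposed switch can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Solr code: export_tuple, facet_counts, replace_null and the zero_for_missing flag are hypothetical names invented for illustration, and the real /export handler and select(...)/replace(...) implementations may differ in every detail.

```python
from collections import Counter


def export_tuple(doc, numeric_fields, zero_for_missing=True):
    """Hypothetical model of an export step (not Solr's implementation).

    With zero_for_missing=True, absent numeric fields are emitted as 0,
    which is the behavior described in the thread. With False, absent
    fields are simply omitted from the tuple.
    """
    out = dict(doc)
    for field in numeric_fields:
        if field not in out and zero_for_missing:
            out[field] = 0
    return out


def facet_counts(tuples, field):
    """Count tuples per value of `field`, skipping tuples without it."""
    return Counter(t[field] for t in tuples if field in t)


def replace_null(tuples, field, with_value):
    """Client-side analogue of replace(fieldA, null, withValue=0)."""
    return [{**t, field: t.get(field, with_value)} for t in tuples]


docs = [
    {"id": "1", "fieldA": 0},  # field really is zero
    {"id": "2", "fieldA": 5},
    {"id": "3"},               # field is absent
]

with_zero = [export_tuple(d, ["fieldA"], zero_for_missing=True) for d in docs]
without_zero = [export_tuple(d, ["fieldA"], zero_for_missing=False) for d in docs]

# Zero bucket holds 2 tuples: doc 1 plus the doc that never had the field.
print(facet_counts(with_zero, "fieldA")[0])

# Zero bucket holds 1 tuple: only the doc whose value is really zero.
print(facet_counts(without_zero, "fieldA")[0])

# A client that does want zeros can still opt in after the fact.
print(replace_null(without_zero, "fieldA", 0)[2])
```

The point of the flag is that omitting the field keeps the information a client needs to tell "zero" from "missing", while a client that depends on zeros can put them back itself. The reverse is not possible once the zero has already been substituted.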
