RE: DocTransformer not always working
Yeah, that makes sense indeed. Thanks!

Markus
RE: DocTransformer not always working
: Well, i can work with this really fine knowing this, but does it make
: sense? I did assume (perhaps wrongly) that fl=minhash:[binstr]
: should mean get that field and pass it through the transformer. At least
: i just now fell for it, maybe others shouldn't :)

That's what it *can* mean, but it's not -- fundamentally -- what it means.

foo:[bar x=y ...] means run the "bar" transformer and request that it
use the name "foo" as an output key in the resulting documents.

When "bar" is executing it knows what name it was asked to use, so it can
use that information for other purposes (like in your case: you can use
that as a stored field name to do some processing on), but there's no
reason "foo" has to be a real field name.

Many transformers don't treat the "name" specially in any way, and in general a
transformer should behave sanely if there is no name specified (ie:
"fl=[bar]" should be totally valid).

The key reason why it's not really a good idea to *force* the "name" used
in the response to match a "real" stored field is that it prevents you
from using multiple transformers on the same field, or from returning the
same field unmodified.

Another/better way for you to have designed your transformer would have
been to specify the field that the binstr logic applies to as a
local param, ie...

   fl=minhash,b2_minhash:[binstr f=minhash base=2],b8_minhash:[binstr f=minhash base=16]

...see what i mean?

-Hoss
http://www.lucidworks.com/
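To make the local-param design from the reply above concrete, here is a minimal standalone Java sketch. It does not use Solr's classes: the BinStr class, its constructor arguments, and the Map standing in for a document are all hypothetical, and the f and base names simply follow the fl example in the reply. The point it tries to illustrate is that the output key and the source field are independent, so one stored field can feed several transformed values while still being returned unmodified.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BinStrSketch {

    /** Hypothetical stand-in for a transformer configured from local params. */
    static final class BinStr {
        final String outputKey;   // the name before the colon in fl, e.g. "b2_minhash"
        final String sourceField; // assumed "f" local param: the field to read
        final int base;           // assumed "base" local param: radix of the output

        BinStr(String outputKey, String sourceField, int base) {
            this.outputKey = outputKey;
            this.sourceField = sourceField;
            this.base = base;
        }

        /** Reads sourceField and writes the rendered value under outputKey. */
        void transform(Map<String, Object> doc) {
            Object raw = doc.get(sourceField);
            if (raw instanceof Number) {
                long value = ((Number) raw).longValue();
                // Unsigned rendering so a negative hash shows its raw bit pattern.
                doc.put(outputKey, Long.toUnsignedString(value, base));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("minhash", 45L);

        // Two transformers on the same field; the original value stays untouched.
        new BinStr("b2_minhash", "minhash", 2).transform(doc);
        new BinStr("b16_minhash", "minhash", 16).transform(doc);

        System.out.println(doc);
    }
}
```

Running main prints {minhash=45, b2_minhash=101101, b16_minhash=2d}. Because the source field is named explicitly, nothing here depends on the output key matching a stored field.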
RE: DocTransformer not always working
Hello - i just looked up the DocTransformer Javadoc and spotted the getExtraRequestFields method. What you mention makes sense, so i immediately tried:

solr/search/select?omitHeader=true=json=true=1=id asc=*:*=minhash,minhash:[binstr]
{
  "response":{"numFound":97895,"start":0,"docs":[
      {
        "minhash":"11101101001010001101001010111101100100110010"}]
  }}

So as i understand it, instead of overriding getExtraRequestFields, i have for now simply requested that field explicitly. Don't mind the changed numFound, it's a live index.

Well, i can work with this really fine knowing this, but does it make sense? I did assume (perhaps wrongly) that fl=minhash:[binstr] should mean get that field and pass it through the transformer. At least i just now fell for it, maybe others shouldn't :)

Anyway, thanks again today,
Markus
Re: DocTransformer not always working
Fairly certain you aren't overriding getExtraRequestFields, so when your
DocTransformer is evaluated it can't find the field you want it to
transform.

By default, the ResponseWriters don't provide any fields that aren't
explicitly requested by the user, or specified as "extra" by the
DocTransformer.

IIUC you want the stored value of the "minhash" field to be available to
you, but the response writer code doesn't know that -- it just knows you
want "minhash" to be the output response key for the "[binstr]"
transformer.

Take a look at RawValueTransformerFactory as an example to borrow from.

: Date: Wed, 14 Dec 2016 21:55:26 +
: From: Markus Jelsma
: Reply-To: solr-user@lucene.apache.org
: To: solr-user
: Subject: DocTransformer not always working
:
: Hello - I just spotted an oddity with both custom DocTransformers we
: sometimes use on Solr 6.3.0. This particular transformer in the example just
: transforms a long (or int) into a sequence of bits. I just use it as a
: convenience to compare minhashes with my eyeballs. The first example is very
: straightforward: fl=minhash:[binstr], show only the minhash field, but as a
: bit sequence.
:
: solr/search/select?omitHeader=true=json=true=1=id%20asc=*:*=minhash:[binstr]
: {
:   "response":{"numFound":96933,"start":0,"docs":[
:       {}]
:   }}
:
: The document is empty! This also happens with another transformer. In the next
: example i also request the lang field:
:
: solr/search/select?omitHeader=true=json=true=1=id asc=*:*=lang,minhash:[binstr]
: {
:   "response":{"numFound":96933,"start":0,"docs":[
:       {
:         "lang":"nl"}]
:   }}
:
: Ok, at least i now get the lang field, but the transformed minhash is
: nowhere to be seen. In the next example i request all fields and the
: transformed minhash:
:
: /solr/search/select?omitHeader=true=json=true=1=id%20asc=*:*=*,minhash:[binstr]
: {
:   "response":{"numFound":96933,"start":0,"docs":[
:       {
:         "minhash":"11101101001010001101001010111101100100110010",
:         ...other fields here
:         "_version_":1553728923368423424}]
:   }}
:
: So it seems that right now, i can only use a transformer properly if i
: request all fields. I believe it used to work with all three examples just as
: you would expect. But since i haven't used transformers for a while, i don't
: know at which version it stopped working like that (if it ever did of course :)
:
: Did i mess something up or did a bug creep up on me?
:
: Thanks,
: Markus

-Hoss
http://www.lucidworks.com/
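The behaviour described in this reply can be modelled in a few lines of plain Java. This is a simplified sketch, not Solr's response-writer code: the class, the fieldsToLoad and render helpers, and the Map-based documents are assumptions made for illustration; only the idea of "extra" request fields comes from the reply. It tries to show why fl=minhash:[binstr] yields an empty document unless the transformer declares the field it reads.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ExtraFieldsSketch {

    // Hypothetical model: the fields that get loaded are the ones named
    // in fl plus whatever the transformer declares as "extra".
    static Set<String> fieldsToLoad(Set<String> requested, String[] extra) {
        Set<String> load = new LinkedHashSet<>(requested);
        if (extra != null) {
            load.addAll(Arrays.asList(extra));
        }
        return load;
    }

    // Hypothetical model of rendering one document for "key:[binstr]".
    static Map<String, Object> render(Map<String, Object> stored, Set<String> requested,
                                      String key, String[] extra) {
        Map<String, Object> loaded = new LinkedHashMap<>();
        for (String name : fieldsToLoad(requested, extra)) {
            if (stored.containsKey(name)) {
                loaded.put(name, stored.get(name));
            }
        }
        // Only explicitly requested fields are copied to the output as-is.
        Map<String, Object> out = new LinkedHashMap<>();
        for (String name : requested) {
            if (loaded.containsKey(name)) {
                out.put(name, loaded.get(name));
            }
        }
        // The transformer only sees loaded fields; if nothing was loaded
        // under "key" it has nothing to transform and adds nothing.
        Object raw = loaded.get(key);
        if (raw instanceof Number) {
            out.put(key, Long.toBinaryString(((Number) raw).longValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("lang", "nl");
        stored.put("minhash", 5L);

        Set<String> none = new LinkedHashSet<>();
        Set<String> lang = new LinkedHashSet<>(Arrays.asList("lang"));

        // fl=minhash:[binstr], no extra fields declared
        System.out.println(render(stored, none, "minhash", null));
        // fl=lang,minhash:[binstr], no extra fields declared
        System.out.println(render(stored, lang, "minhash", null));
        // fl=minhash:[binstr], transformer declares "minhash" as extra
        System.out.println(render(stored, none, "minhash", new String[] {"minhash"}));
    }
}
```

Under these assumptions main prints {} and {lang=nl} for the first two calls, mirroring the first two responses quoted above, and {minhash=101} once the field is declared as extra.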