Re: QParsePlugin not working on sharded collection
So my implementation with a DocTransformer is causing an exception (with a sharded collection):

ERROR - 2016-08-04 09:41:44.247; [ShardTest1 shard1_0 core_node3 ShardTest1_shard1_0_replica1] org.apache.solr.common.SolrException; null:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/ShardTest1_shard1_0_replica1: parsing error
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:538)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:235)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:227)
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:218)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:183)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:148)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.solr.common.SolrException: parsing error
    at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:52)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:536)
    ... 12 more
Caused by: java.io.EOFException
    at org.apache.solr.common.util.FastInputStream.readByte(FastInputStream.java:208)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:188)
    at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:508)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:202)
    at org.apache.solr.common.util.JavaBinCodec.readSolrDocumentList(JavaBinCodec.java:390)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:237)
    at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:135)
    at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:204)
    at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:126)
    at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:50)
    ... 13 more

Here are the changes to TedQuery (I reduced the amount of data being returned and map the docId to the document - like the [docid] transformer, and put the map in the request context in the finish() method):

    public void collect(int doc) throws IOException {
        count++;
        if (doc % 1 == 0) {
            mydata.put(Integer.valueOf(doc + super.docBase), String.valueOf(doc + super.docBase));
            super.collect(doc);
        }
    }

    public void finish() throws IOException {
        ...
        rb.req.getContext().put("mystats", mydata);
        ...
    }

Here's the transformer:

    public class TedTransform extends TransformerFactory {
        @Override
        public DocTransformer create(String arg0, SolrParams arg1, SolrQueryRequest arg2) {
            return new TedTransformer(arg0, arg2);
        }

        private class TedTransformer extends TransformerWithContext {
            private final String f;
            private HashMap<Integer, String> data;

            public TedTransformer(String f, SolrQueryRequest r) {
                this.f = f;
            }

            @Override
            public String getName() {
                return null;
            }

            @Override
            public void transform(SolrDocument arg0, int arg1) throws IOException {
                if (context.req != null) {
                    if (data == null) {
                        data = (HashMap<Integer, String>) context.req.getContext().get("mystats");
                    }
                    arg0.setField(f, data.get(Integer.valueOf(arg1)));
                }
            }
        }
    }

And I added the transformer to the solrconfig.xml:

    {!TedFilter myvar=hello} [TedT]

Why does this barf on multi-sharded collections?

-- View this message in context: http://lucene.472066.n3.nabble.com/QParsePlugin-not-working-on-sharded-collection-tp4290249p4290390.html
Sent from the Solr - User mailing list archive at Nabble.com.
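An editorial sketch, not a diagnosis confirmed anywhere in this thread: the EOFException on the aggregating node is consistent with a shard ending its response partway through the document list. One thing worth ruling out is that transform() dereferences `data` without a null check, so if the "mystats" entry is ever missing from the request context on a shard request, the lookup would throw while documents are being written. Whether that actually happens here is an assumption. The plain-Java stand-in below uses an ordinary Map in place of the Solr request context; `StatsLookup` and `lookup` are hypothetical names, not Solr API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a plain Map plays the role of the Solr request context.
class StatsLookup {

    /** Returns the stat stored for docId, or null when the map or the entry is absent. */
    @SuppressWarnings("unchecked")
    static String lookup(Map<Object, Object> context, int docId) {
        Object raw = context.get("mystats");
        if (!(raw instanceof Map)) {
            // Assumed failure mode: nothing populated the context for this request.
            return null;
        }
        return ((Map<Integer, String>) raw).get(docId);
    }

    public static void main(String[] args) {
        Map<Object, Object> context = new HashMap<>();
        // No "mystats" entry yet: the lookup degrades to null instead of throwing.
        System.out.println(lookup(context, 7));

        Map<Integer, String> stats = new HashMap<>();
        stats.put(7, "7");
        context.put("mystats", stats);
        System.out.println(lookup(context, 7));
    }
}
```

In the transformer itself the same guard would go around the `data.get(...)` call, so a request without the map yields documents without the extra field rather than a failed response.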
Re: QParsePlugin not working on sharded collection
Thanks Erick, you answered my question by pointing out the aggregator. I didn't realize a merge strategy was _required_ to return stats info when there are multiple shards. I'm having trouble with my actual plugin so I've scaled back to the simplest possible example. I'm adding to it little by little to see what the last straw is.
Re: QParsePlugin not working on sharded collection
OK, I'm going to assume that somewhere you're keeping more complicated structures around to track all the docs coming through the collector so you can know whether they're duplicates or not.

I think there are really two ways (at least) to go about it:

1> use a SearchComponent to add a separate section to the response similar to highlighting or faceting.
2> go ahead and use a DocTransformer to add the data to each individual doc. But the example you're using adds the data to the meta-data, not an individual doc.

Best,
Erick

On Wed, Aug 3, 2016 at 2:03 PM, tedsolr <tsm...@sciquest.com> wrote:
> So I notice if I create the simplest MergeStrategy I can get my test values
> from the shard responses and then if I add info to the SolrQueryResponse it
> gets back to the caller. I still must be missing something. I wouldn't
> expect to have different code paths - one for single shard one for multi
> shard. So if the PostFilter is restricting the documents returned, what's
> the correct way to return my analytics info? Should I not be adding data to
> the SolrQueryResponse from within the delegating collector's finish()
> method? Here's what I'm trying to do (still works fine with a single shard
> collection :)
>
> - Use the DelegatingCollector to restrict docs returned (dropping docs that
>   are "duplicates" based on my criteria)
> - Calculate 2 stats for each collected doc: a count of "duplicate" docs & a
>   sum on a number field from these "duplicate" docs. I am doing the math in
>   the collect() method.
> - Return the stats in the response stream. I'm using a TransformerFactory
>   now to inject a new field into the results for each doc. Should I be using a
>   SearchComponent instead?
>
> Erick Erickson wrote
>> Right, I don't have the code in front of me right now, but I think
>> your issue is at the "aggregation" point. You also have to put
>> some code in the aggregation bits that pulls your custom parts
>> from the sub-request packets and puts them in the final packet,
>> "doing the right thing" in terms of assembling them into
>> something meaningful along the way (e.g. averaging "myvar"
>> or putting it in a list identified by shard or..).
>>
>> I think if you fire the query at one of your shards with =false
>> you'll see your additions, which would demonstrate that your
>> filter is being found. I assume your custom jar is on the shards
>> or you'd get an exception (assuming you've pushed your
>> solrconfig to ZK).
>>
>> Best,
>> Erick
Re: QParsePlugin not working on sharded collection
So I notice if I create the simplest MergeStrategy I can get my test values from the shard responses and then if I add info to the SolrQueryResponse it gets back to the caller. I still must be missing something. I wouldn't expect to have different code paths - one for single shard one for multi shard. So if the PostFilter is restricting the documents returned, what's the correct way to return my analytics info? Should I not be adding data to the SolrQueryResponse from within the delegating collector's finish() method?

Here's what I'm trying to do (still works fine with a single shard collection :)

- Use the DelegatingCollector to restrict docs returned (dropping docs that are "duplicates" based on my criteria)
- Calculate 2 stats for each collected doc: a count of "duplicate" docs & a sum on a number field from these "duplicate" docs. I am doing the math in the collect() method.
- Return the stats in the response stream. I'm using a TransformerFactory now to inject a new field into the results for each doc. Should I be using a SearchComponent instead?

Erick Erickson wrote
> Right, I don't have the code in front of me right now, but I think
> your issue is at the "aggregation" point. You also have to put
> some code in the aggregation bits that pulls your custom parts
> from the sub-request packets and puts them in the final packet,
> "doing the right thing" in terms of assembling them into
> something meaningful along the way (e.g. averaging "myvar"
> or putting it in a list identified by shard or..).
>
> I think if you fire the query at one of your shards with =false
> you'll see your additions, which would demonstrate that your
> filter is being found. I assume your custom jar is on the shards
> or you'd get an exception (assuming you've pushed your
> solrconfig to ZK).
>
> Best,
> Erick
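The bookkeeping in the list above (keep the first document of each "duplicate" group, then track a count and a sum for the group) might look roughly like the following in plain Java, independent of Solr. This is only a sketch under stated assumptions: the class `DuplicateStats`, the string grouping key, and the choice to include the kept document in the count are all hypothetical, since the thread does not show the real criteria.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: per-group duplicate stats, as a collect() method might maintain them.
class DuplicateStats {

    /** Running totals for one group of "duplicate" documents. */
    static final class Stats {
        final int firstDocId; // the document that is kept for this group
        int count;            // documents seen in the group, including the kept one (assumption)
        double sum;           // sum of the numeric field over the group

        Stats(int firstDocId) {
            this.firstDocId = firstDocId;
        }
    }

    private final Map<String, Stats> byKey = new LinkedHashMap<>();

    /**
     * Records one document. Returns true if it is the first document with this
     * key (keep it) and false if it is a duplicate (drop it).
     */
    boolean collect(int docId, String key, double amount) {
        Stats s = byKey.get(key);
        boolean first = (s == null);
        if (first) {
            s = new Stats(docId);
            byKey.put(key, s);
        }
        s.count++;
        s.sum += amount;
        return first;
    }

    /** Stats per grouping key, in first-seen order. */
    Map<String, Stats> stats() {
        return byKey;
    }
}
```

Each shard runs its own collector over its own documents, so totals kept this way are per shard; duplicates that live on different shards would only be reconciled at the aggregation step.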
Re: QParsePlugin not working on sharded collection
Right, I don't have the code in front of me right now, but I think your issue is at the "aggregation" point. You also have to put some code in the aggregation bits that pulls your custom parts from the sub-request packets and puts them in the final packet, "doing the right thing" in terms of assembling them into something meaningful along the way (e.g. averaging "myvar" or putting it in a list identified by shard or..).

I think if you fire the query at one of your shards with =false you'll see your additions, which would demonstrate that your filter is being found. I assume your custom jar is on the shards or you'd get an exception (assuming you've pushed your solrconfig to ZK).

Best,
Erick

On Wed, Aug 3, 2016 at 9:42 AM, tedsolr <tsm...@sciquest.com> wrote:
> I'm trying to verify that a very simple custom post filter will work on a
> sharded collection. So far it doesn't. Here are the search results on my
> single shard test collection:
>
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 17
>   },
>   "thecountis": "946028",
>   "myvar": "hello",
>   "response": {
>     "numFound": 946028,
>     "start": 0,
>     "docs": [
>       ...]
>   }
> }
>
> When I run against a two shard collection (same data set) it's as though the
> post filter doesn't exist. The results don't include my additions to the
> response:
>
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 17
>   },
>   "response": {
>     "numFound": 946028,
>     "start": 0,
>     "docs": [
>       ...]
>   }
> }
>
> Here's the solrconfig.xml:
>
> ...
> {!TedFilter myvar=hello}
> ...
>
> And here's the simplest plugin I could write:
>
> public class TedPlugin extends QParserPlugin {
>     @Override
>     public void init(NamedList arg0) {
>     }
>
>     @Override
>     public QParser createParser(String arg0, final SolrParams arg1,
>             final SolrParams arg2, final SolrQueryRequest arg3) {
>         return new QParser(arg0, arg1, arg2, arg3) {
>
>             @Override
>             public Query parse() throws SyntaxError {
>                 return new TedQuery(arg1, arg2, arg3);
>             }
>         };
>     }
> }
>
> public class TedQuery extends AnalyticsQuery {
>     private final String myvar;
>
>     TedQuery(SolrParams localParams, SolrParams params, SolrQueryRequest req) {
>         myvar = localParams.get("myvar");
>     }
>
>     @Override
>     public DelegatingCollector getAnalyticsCollector(ResponseBuilder rb,
>             IndexSearcher searcher) {
>         return new TedCollector(myvar, rb);
>     }
>
>     @Override
>     public boolean equals(Object o) {
>         if (o instanceof TedQuery) {
>             TedQuery tq = (TedQuery) o;
>             return Objects.equals(this.myvar, tq.myvar);
>         }
>         return false;
>     }
>
>     @Override
>     public int hashCode() {
>         return myvar == null ? 1 : myvar.hashCode();
>     }
>
>     class TedCollector extends DelegatingCollector {
>         ResponseBuilder rb;
>         int count;
>         String myvar;
>
>         public TedCollector(String myvar, ResponseBuilder rb) {
>             this.rb = rb;
>             this.myvar = myvar;
>         }
>
>         @Override
>         public void collect(int doc) throws IOException {
>             count++;
>             super.collect(doc);
>         }
>
>         @Override
>         public void finish() throws IOException {
>             rb.rsp.add("thecountis", String.valueOf(count));
>             rb.rsp.add("myvar", myvar);
>
>             if (super.delegate instanceof DelegatingCollector) {
>                 ((DelegatingCollector) super.delegate).finish();
>             }
>         }
>     }
> }
>
> What am I doing wrong? Thanks!
> Ted
> v5.2.1 SolrCloud mode
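To make the aggregation step described above concrete, here is a rough plain-Java sketch of what "assembling them into something meaningful" could mean for the two custom values in this thread: summing "thecountis" across shards and collecting "myvar" into a list. It deliberately avoids Solr classes; `ShardStatsMerger` and `merge` are hypothetical names, each shard response is modelled as a simple Map, and the real wiring (for example through a MergeStrategy, as mentioned elsewhere in the thread) is not shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: combine the custom top-level values returned by each shard.
class ShardStatsMerger {

    /**
     * Sums the per-shard "thecountis" values and gathers the per-shard "myvar"
     * values into a list, in shard order. Shards missing a value are skipped.
     */
    static Map<String, Object> merge(List<Map<String, String>> shardResponses) {
        long total = 0;
        List<String> myvars = new ArrayList<>();
        for (Map<String, String> shard : shardResponses) {
            String count = shard.get("thecountis");
            if (count != null) {
                total += Long.parseLong(count);
            }
            String myvar = shard.get("myvar");
            if (myvar != null) {
                myvars.add(myvar);
            }
        }
        Map<String, Object> merged = new LinkedHashMap<>();
        // Kept as a string to match the shape of the single-shard response.
        merged.put("thecountis", String.valueOf(total));
        merged.put("myvar", myvars);
        return merged;
    }
}
```

Summing is only one possible policy; averaging or keeping a per-shard breakdown would follow the same pattern, and which is "right" depends on what the value means.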
QParsePlugin not working on sharded collection
I'm trying to verify that a very simple custom post filter will work on a sharded collection. So far it doesn't. Here are the search results on my single shard test collection:

    {
      "responseHeader": {
        "status": 0,
        "QTime": 17
      },
      "thecountis": "946028",
      "myvar": "hello",
      "response": {
        "numFound": 946028,
        "start": 0,
        "docs": [
          ...]
      }
    }

When I run against a two shard collection (same data set) it's as though the post filter doesn't exist. The results don't include my additions to the response:

    {
      "responseHeader": {
        "status": 0,
        "QTime": 17
      },
      "response": {
        "numFound": 946028,
        "start": 0,
        "docs": [
          ...]
      }
    }

Here's the solrconfig.xml:

    ...
    {!TedFilter myvar=hello}
    ...

And here's the simplest plugin I could write:

    public class TedPlugin extends QParserPlugin {
        @Override
        public void init(NamedList arg0) {
        }

        @Override
        public QParser createParser(String arg0, final SolrParams arg1,
                final SolrParams arg2, final SolrQueryRequest arg3) {
            return new QParser(arg0, arg1, arg2, arg3) {

                @Override
                public Query parse() throws SyntaxError {
                    return new TedQuery(arg1, arg2, arg3);
                }
            };
        }
    }

    public class TedQuery extends AnalyticsQuery {
        private final String myvar;

        TedQuery(SolrParams localParams, SolrParams params, SolrQueryRequest req) {
            myvar = localParams.get("myvar");
        }

        @Override
        public DelegatingCollector getAnalyticsCollector(ResponseBuilder rb,
                IndexSearcher searcher) {
            return new TedCollector(myvar, rb);
        }

        @Override
        public boolean equals(Object o) {
            if (o instanceof TedQuery) {
                TedQuery tq = (TedQuery) o;
                return Objects.equals(this.myvar, tq.myvar);
            }
            return false;
        }

        @Override
        public int hashCode() {
            return myvar == null ? 1 : myvar.hashCode();
        }

        class TedCollector extends DelegatingCollector {
            ResponseBuilder rb;
            int count;
            String myvar;

            public TedCollector(String myvar, ResponseBuilder rb) {
                this.rb = rb;
                this.myvar = myvar;
            }

            @Override
            public void collect(int doc) throws IOException {
                count++;
                super.collect(doc);
            }

            @Override
            public void finish() throws IOException {
                rb.rsp.add("thecountis", String.valueOf(count));
                rb.rsp.add("myvar", myvar);

                if (super.delegate instanceof DelegatingCollector) {
                    ((DelegatingCollector) super.delegate).finish();
                }
            }
        }
    }

What am I doing wrong? Thanks!
Ted
v5.2.1 SolrCloud mode