[
https://issues.apache.org/jira/browse/SOLR-11598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548627#comment-16548627
]
Varun Thacker commented on SOLR-11598:
--------------------------------------
To recap the changes in the final patch and make it easier to review:
* The inner classes have been extracted into their own files. They are all in
the export package
* There is no longer a limit on the number of sort fields. Previously the limit
was 4
* One optimization around setting the queue size: if total hits is less than
30k, the queue is created with a size of totalHits
{code:java}
int queueSize = 30000;
if (totalHits < 30000) {
  queueSize = totalHits;
}
{code}
* For docs that have the same sort value, we respect the index order. This
required setting docBase in SortDoc
* In DoubleValue#setCurrentValue, the following if statement now throws an
assertion error. LongValue, IntValue and FloatValue were already doing the same
thing
{code:java}
if (docId < lastDocID) {
  throw new AssertionError("docs were sent out-of-order: lastDocID=" + lastDocID
      + " vs doc=" + docId);
}
{code}
* Some superficial changes to StringValue which borrows a technique from
FacetFieldProcessorByHashDV
* More tests
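For reviewers, here is a minimal sketch of the tie-break on equal sort values. It is not the code from the patch: SketchSortDoc, its fields and the comparator name are hypothetical, and it assumes a single long sort value and that docBase is the segment's offset into the global doc ID space.

```java
import java.util.Comparator;

// Hypothetical stand-in for SortDoc; names and fields are assumptions for illustration.
final class SketchSortDoc {
    final int docBase;    // assumed: first global doc ID of the segment this doc lives in
    final int docId;      // segment-local doc ID
    final long sortValue; // a single sort value, to keep the sketch small

    SketchSortDoc(int docBase, int docId, long sortValue) {
        this.docBase = docBase;
        this.docId = docId;
        this.sortValue = sortValue;
    }

    // Position of the doc in index order across all segments.
    int globalDocId() {
        return docBase + docId;
    }

    // Ascending by sort value; docs with equal values fall back to index order.
    static final Comparator<SketchSortDoc> BY_VALUE_THEN_INDEX_ORDER =
        Comparator.comparingLong((SketchSortDoc d) -> d.sortValue)
                  .thenComparingInt(SketchSortDoc::globalDocId);
}
```

Without docBase, two docs from different segments could carry the same segment-local docId, so the local ID alone would not give a stable index order.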
{quote}
Today when creating the priority queue, we do doc-value seeks for all the sort
fields. When we stream out docs, we again make doc-value seeks against the fl
fields. In most common use-cases I'd imagine fl = sort fields, so if we can
pre-collect the values while sorting, we can reduce the doc-value seeks,
potentially bringing speed improvements. I believe Amrit is already working on
this and it can be tackled in another Jira.
{quote}
We'll create a separate follow-up Jira for this.
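To make the quoted idea concrete, here is a rough sketch of pre-collecting values during sorting so the output phase can skip a second seek. Nothing here is from the patch or from Solr: SketchValueSource and SketchExportDoc are hypothetical names, and values are simplified to longs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over a per-field doc-value lookup; not a Solr or Lucene API.
interface SketchValueSource {
    long seek(String field, int docId);
}

// Hypothetical per-document holder that remembers values read while sorting.
final class SketchExportDoc {
    final int docId;
    private final Map<String, Long> collected = new HashMap<>();

    SketchExportDoc(int docId) {
        this.docId = docId;
    }

    // Sorting phase: read each sort field once and remember the value.
    void collectSortValues(List<String> sortFields, SketchValueSource source) {
        for (String field : sortFields) {
            collected.put(field, source.seek(field, docId));
        }
    }

    // Output phase: reuse a collected value when present, seek only for other fl fields.
    long valueFor(String field, SketchValueSource source) {
        Long cached = collected.get(field);
        return cached != null ? cached : source.seek(field, docId);
    }
}
```

When fl equals the sort fields, valueFor never touches the source again, which is the saving described in the quote. The cost is the extra memory held per queued document.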
I've run 100 iterations of TestExportWriter and StreamingTest.
TODO:
* fix precommit. It's currently failing while building the ref guide.
Feedback welcome!
> Export Writer needs to support more than 4 Sort fields - Say 10, ideally it
> should not be bound at all, but 4 seems to really short sell the StreamRollup
> capabilities.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-11598
> URL: https://issues.apache.org/jira/browse/SOLR-11598
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: streaming expressions
> Affects Versions: 6.6.1, 7.0
> Reporter: Aroop
> Assignee: Varun Thacker
> Priority: Major
> Labels: patch
> Attachments: SOLR-11598-6_6-streamtests, SOLR-11598-6_6.patch,
> SOLR-11598-master.patch, SOLR-11598.patch, SOLR-11598.patch,
> SOLR-11598.patch, SOLR-11598.patch, SOLR-11598.patch, SOLR-11598.patch,
> SOLR-11598.patch, SOLR-11598.patch, SOLR-11598.patch, streaming-export
> reports.xlsx
>
>
> I am a user of Streaming and I am currently trying to use rollups on a
> 10-dimensional document.
> I am unable to get correct results on this query as I am bound by the
> limitation of the export handler, which supports only 4 sort fields.
> I do not see why this needs to be the case, as it could very well be 10 or 20.
> My current needs would be satisfied with 10, but one would want to ask why
> can't it be any decent integer n, beyond which we know performance degrades,
> but even then it should be caveat emptor.
> [~varunthacker]
> Code Link:
> https://github.com/apache/lucene-solr/blob/19db1df81a18e6eb2cce5be973bf2305d606a9f8/solr/core/src/java/org/apache/solr/handler/ExportWriter.java#L455
> Error
> null:java.io.IOException: A max of 4 sorts can be specified
> at org.apache.solr.handler.ExportWriter.getSortDoc(ExportWriter.java:452)
> at org.apache.solr.handler.ExportWriter.writeDocs(ExportWriter.java:228)
> at org.apache.solr.handler.ExportWriter.lambda$null$1(ExportWriter.java:219)
> at org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:664)
> at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:333)
> at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
> at org.apache.solr.common.util.JavaBinCodec$1.put(JavaBinCodec.java:394)
> at org.apache.solr.handler.ExportWriter.lambda$null$2(ExportWriter.java:219)
> at org.apache.solr.common.util.JavaBinCodec.writeMap(JavaBinCodec.java:437)
> at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:354)
> at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
> at org.apache.solr.common.util.JavaBinCodec$1.put(JavaBinCodec.java:394)
> at org.apache.solr.handler.ExportWriter.lambda$write$3(ExportWriter.java:217)
> at org.apache.solr.common.util.JavaBinCodec.writeMap(JavaBinCodec.java:437)
> at org.apache.solr.handler.ExportWriter.write(ExportWriter.java:215)
> at org.apache.solr.core.SolrCore$3.write(SolrCore.java:2601)
> at org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:49)
> at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:809)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:538)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> at java.lang.Thread.run(Thread.java:745)