[ 
https://issues.apache.org/jira/browse/SOLR-11598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548627#comment-16548627
 ] 

Varun Thacker commented on SOLR-11598:
--------------------------------------

To recap the changes in the final patch to make it easier for reviewing 
 * The inner class have been extracted into their own file. It's all in the 
export package
 * There is no longer a limit on the number of sort fields. Earlier it was 4
 * One optimization around setting queue size i.e if total hits is less than 
30k create the queueSize with totalHits 
{code:java}
int queueSize = 30000;
if (totalHits < 30000) {
queueSize = totalHits;
}{code}

 * For docs that have the same sort value, we respect the index order. This 
required setting docBase in SortDoc
 * DoubleValue#setCurrentValue , the following if statement now throws an 
assertion error. LongValue, IntValue and FloatValue were doing the same thing

{code:java}
if (docId < lastDocID) {
throw new AssertionError("docs were sent out-of-order: lastDocID=" + lastDocID 
+ " vs doc=" + docId);
}{code}

 * Some superficial changes to StringValue which borrows a technique from 
FacetFieldProcessorByHashDV
 * More tests

{quote}
Today when creating the priority queue , we do doc-value seeks for all the sort 
fields. When we stream out docs we again make doc-value seeks against the fl 
fields . In most common use-cases I'd imagine fl = sort fields , so if we can 
pre-collect the values while sorting it , we can reduce the doc-value seeks 
potentially bringing speed improvements. I believe Amrit is already working on 
this and can be tackled in another Jira
{quote}
We'll create a separate followup Jira for this

I've run 100 iterations of TestExportWriter and StreamingTest.

TODO:
 * fix precommit. It's currently failing while building the ref guide.

 

Feedback welcome! 

 

> Export Writer needs to support more than 4 Sort fields - Say 10, ideally it 
> should not be bound at all, but 4 seems to really short sell the StreamRollup 
> capabilities.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11598
>                 URL: https://issues.apache.org/jira/browse/SOLR-11598
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: streaming expressions
>    Affects Versions: 6.6.1, 7.0
>            Reporter: Aroop
>            Assignee: Varun Thacker
>            Priority: Major
>              Labels: patch
>         Attachments: SOLR-11598-6_6-streamtests, SOLR-11598-6_6.patch, 
> SOLR-11598-master.patch, SOLR-11598.patch, SOLR-11598.patch, 
> SOLR-11598.patch, SOLR-11598.patch, SOLR-11598.patch, SOLR-11598.patch, 
> SOLR-11598.patch, SOLR-11598.patch, SOLR-11598.patch, streaming-export 
> reports.xlsx
>
>
> I am a user of Streaming and I am currently trying to use rollups on an 10 
> dimensional document.
> I am unable to get correct results on this query as I am bounded by the 
> limitation of the export handler which supports only 4 sort fields.
> I do not see why this needs to be the case, as it could very well be 10 or 20.
> My current needs would be satisfied with 10, but one would want to ask why 
> can't it be any decent integer n, beyond which we know performance degrades, 
> but even then it should be caveat emptor.
> [~varunthacker] 
> Code Link:
> https://github.com/apache/lucene-solr/blob/19db1df81a18e6eb2cce5be973bf2305d606a9f8/solr/core/src/java/org/apache/solr/handler/ExportWriter.java#L455
> Error
> null:java.io.IOException: A max of 4 sorts can be specified
>       at 
> org.apache.solr.handler.ExportWriter.getSortDoc(ExportWriter.java:452)
>       at org.apache.solr.handler.ExportWriter.writeDocs(ExportWriter.java:228)
>       at 
> org.apache.solr.handler.ExportWriter.lambda$null$1(ExportWriter.java:219)
>       at 
> org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:664)
>       at 
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:333)
>       at 
> org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
>       at org.apache.solr.common.util.JavaBinCodec$1.put(JavaBinCodec.java:394)
>       at 
> org.apache.solr.handler.ExportWriter.lambda$null$2(ExportWriter.java:219)
>       at 
> org.apache.solr.common.util.JavaBinCodec.writeMap(JavaBinCodec.java:437)
>       at 
> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:354)
>       at 
> org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
>       at org.apache.solr.common.util.JavaBinCodec$1.put(JavaBinCodec.java:394)
>       at 
> org.apache.solr.handler.ExportWriter.lambda$write$3(ExportWriter.java:217)
>       at 
> org.apache.solr.common.util.JavaBinCodec.writeMap(JavaBinCodec.java:437)
>       at org.apache.solr.handler.ExportWriter.write(ExportWriter.java:215)
>       at org.apache.solr.core.SolrCore$3.write(SolrCore.java:2601)
>       at 
> org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:49)
>       at 
> org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:809)
>       at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:538)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>       at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>       at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>       at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>       at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>       at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>       at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>       at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>       at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>       at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>       at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>       at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>       at org.eclipse.jetty.server.Server.handle(Server.java:534)
>       at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>       at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>       at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>       at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>       at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>       at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>       at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>       at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>       at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>       at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>       at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to