[
https://issues.apache.org/jira/browse/SOLR-11598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16543791#comment-16543791
]
Varun Thacker commented on SOLR-11598:
--------------------------------------
Patch which builds on top of Amrit's patch
Changes:
* I've split the several sub-classes within ExportWriter out into their own
individual classes, all contained within one package. Personally I found the
code easier to read after splitting out the classes. It does make the patch
a lot bigger though.
* Adds some extra tests. We weren't testing boolean fields, or the scenario
where two docs have the same values. This caught a bug in the previous patch:
when two docs are the same, the higher docId would get selected rather than
the lower docId (index order).
* One optimization around setting the queue size, i.e. if total hits is less
than 30k, create the queue with queueSize set to totalHits:
{code:java}
int queueSize = 30000;
if (totalHits < 30000) {
  queueSize = totalHits;
}
{code}
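To illustrate the tie-breaking fix from the second bullet, here is a minimal sketch, not the code from the patch. SortDocSketch, its fields, and ORDER are hypothetical names; the sketch assumes every sort field is a long sorted ascending. The point is only that when all sort values are equal, the comparison falls back to docId so the lower docId (index order) comes first. The queueSize helper is equivalent to the snippet above.

```java
import java.util.Comparator;

/** Hypothetical stand-in for a sortable export document; not the ExportWriter class. */
final class SortDocSketch {
    final int docId;
    final long[] sortValues; // one value per sort field, all ascending for simplicity

    SortDocSketch(int docId, long... sortValues) {
        this.docId = docId;
        this.sortValues = sortValues;
    }

    /** Orders by each sort value in turn, then by docId so ties resolve to index order. */
    static final Comparator<SortDocSketch> ORDER = (a, b) -> {
        for (int i = 0; i < a.sortValues.length; i++) {
            int c = Long.compare(a.sortValues[i], b.sortValues[i]);
            if (c != 0) {
                return c;
            }
        }
        // All sort values are equal: prefer the lower docId.
        return Integer.compare(a.docId, b.docId);
    };

    /** Same effect as the snippet above: never size the queue beyond the hit count. */
    static int queueSize(int totalHits) {
        return Math.min(30000, totalHits);
    }
}
```

With two docs carrying identical sort values, ORDER ranks docId 3 ahead of docId 7, which is the behaviour the new test is meant to pin down.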
Today, when creating the priority queue, we do doc-value seeks for all the sort
fields. When we stream out docs, we make doc-value seeks again for the fl
fields. In the most common use cases I'd imagine fl = sort fields, so if we
pre-collect the values while sorting, we can halve the doc-value seeks,
potentially bringing us speed improvements. I believe Amrit is already working
on this, and it can be tackled in another Jira.
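A rough sketch of why pre-collecting could halve the seeks, assuming fl equals the sort fields. CountingColumn is a hypothetical stand-in that only counts lookups; it is not Lucene's doc-values API, and a real implementation would also have to weigh the extra memory held per queued doc.

```java
final class SeekCountingSketch {
    /** Hypothetical column that counts lookups; stands in for doc-value access. */
    static final class CountingColumn {
        private final long[] values;
        int seeks = 0;

        CountingColumn(long... values) {
            this.values = values;
        }

        long get(int docId) {
            seeks++;
            return values[docId];
        }
    }

    /** Reads each value once for sorting and once for output: two seeks per doc. */
    static int seeksWithoutReuse(CountingColumn column, int numDocs) {
        for (int doc = 0; doc < numDocs; doc++) {
            column.get(doc); // sort phase
        }
        for (int doc = 0; doc < numDocs; doc++) {
            column.get(doc); // output phase
        }
        return column.seeks;
    }

    /** Keeps the value read while sorting and reuses it for output: one seek per doc. */
    static int seeksWithReuse(CountingColumn column, int numDocs) {
        long[] collected = new long[numDocs];
        for (int doc = 0; doc < numDocs; doc++) {
            collected[doc] = column.get(doc); // sort phase, value retained
        }
        long written = 0;
        for (int doc = 0; doc < numDocs; doc++) {
            written += collected[doc]; // output phase served from memory
        }
        return column.seeks;
    }
}
```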
Based on Mark's and David's comments, should we still limit the sort fields to
10, or raise the limit to, say, 50? I've already added this line to the export
writer ref guide page in the patch:
{code:java}
The export performance will get slower as you add more sort fields. If there is
enough physical memory available outside of the JVM to load up the sort fields
then the performance will be linearly slower with each additional sort field. It
can get worse otherwise.{code}
I'll begin some manual testing on this patch
> Export Writer needs to support more than 4 Sort fields - Say 10, ideally it
> should not be bound at all, but 4 seems to really short sell the StreamRollup
> capabilities.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-11598
> URL: https://issues.apache.org/jira/browse/SOLR-11598
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: streaming expressions
> Affects Versions: 6.6.1, 7.0
> Reporter: Aroop
> Assignee: Varun Thacker
> Priority: Major
> Labels: patch
> Attachments: SOLR-11598-6_6-streamtests, SOLR-11598-6_6.patch,
> SOLR-11598-master.patch, SOLR-11598.patch, SOLR-11598.patch,
> SOLR-11598.patch, SOLR-11598.patch, SOLR-11598.patch, streaming-export
> reports.xlsx
>
>
> I am a user of Streaming and I am currently trying to use rollups on a
> 10-dimensional document.
> I am unable to get correct results on this query as I am bound by the
> limitation of the export handler, which supports only 4 sort fields.
> I do not see why this needs to be the case, as it could very well be 10 or 20.
> My current needs would be satisfied with 10, but one would want to ask why
> can't it be any decent integer n, beyond which we know performance degrades,
> but even then it should be caveat emptor.
> [~varunthacker]
> Code Link:
> https://github.com/apache/lucene-solr/blob/19db1df81a18e6eb2cce5be973bf2305d606a9f8/solr/core/src/java/org/apache/solr/handler/ExportWriter.java#L455
> Error
> null:java.io.IOException: A max of 4 sorts can be specified
> at org.apache.solr.handler.ExportWriter.getSortDoc(ExportWriter.java:452)
> at org.apache.solr.handler.ExportWriter.writeDocs(ExportWriter.java:228)
> at org.apache.solr.handler.ExportWriter.lambda$null$1(ExportWriter.java:219)
> at org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:664)
> at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:333)
> at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
> at org.apache.solr.common.util.JavaBinCodec$1.put(JavaBinCodec.java:394)
> at org.apache.solr.handler.ExportWriter.lambda$null$2(ExportWriter.java:219)
> at org.apache.solr.common.util.JavaBinCodec.writeMap(JavaBinCodec.java:437)
> at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:354)
> at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
> at org.apache.solr.common.util.JavaBinCodec$1.put(JavaBinCodec.java:394)
> at org.apache.solr.handler.ExportWriter.lambda$write$3(ExportWriter.java:217)
> at org.apache.solr.common.util.JavaBinCodec.writeMap(JavaBinCodec.java:437)
> at org.apache.solr.handler.ExportWriter.write(ExportWriter.java:215)
> at org.apache.solr.core.SolrCore$3.write(SolrCore.java:2601)
> at org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:49)
> at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:809)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:538)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> at java.lang.Thread.run(Thread.java:745)