[ 
https://issues.apache.org/jira/browse/SOLR-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081382#comment-15081382
 ] 

Jason Gerlowski commented on SOLR-8480:
---------------------------------------

As a disclaimer, I'm new to the Streaming Expression API and TupleStreams in 
general.  So take what I say with a grain of salt.

The {{getConsumed()}} method might be doable and useful, but I'm less sure 
about a {{getSize()}} method.
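
To illustrate why {{getConsumed()}} looks feasible, here is a minimal sketch of a counting wrapper. It uses a plain java.util.Iterator as a stand-in for a stream of tuples; the class name {{CountingIterator}} is hypothetical and this is not Solr's TupleStream API, just the general idea of counting records as they are read.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for a tuple source; NOT Solr's TupleStream API.
// Wraps any iterator and counts how many elements have been handed out.
class CountingIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private long consumed = 0;

    CountingIterator(Iterator<T> inner) {
        this.inner = inner;
    }

    @Override
    public boolean hasNext() {
        return inner.hasNext();
    }

    @Override
    public T next() {
        // If the source is exhausted, inner.next() throws before we count.
        T value = inner.next();
        consumed++;
        return value;
    }

    // Number of elements read so far. Needs no knowledge of the total size.
    long getConsumed() {
        return consumed;
    }
}
```

The counter only depends on what has already been read, which is why it fits a streaming model where the total is unknown.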

As I understand the idea of streaming, nothing really knows how many records 
are in the stream.  That's one of the main points/advantages.  The whole result 
set is never fetched all at once, even in the leaves of the TupleStream 
hierarchy.  All things are possible I suppose, but right now there's nothing 
that knows the size of the result-set.

Even assuming that we do know the result-size of underlying searches, 
{{getSize}} would be pretty tricky to figure out for some decorator 
TupleStreams. For example, consider: {{unique(search(...))}}.  How would a 
UniqueStream define its size?  Even if the underlying search knows how many 
results there are total, that doesn't necessarily give UniqueStream any hint at 
how many tuples it will output.  That depends on the actual result values 
returned by the search().  It can't really be known until all search-result 
values have been read/processed by UniqueStream.
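
A rough sketch of the problem, again using plain Java iterators rather than Solr classes. {{AdjacentUniqueIterator}} and {{UniqueDemo}} are hypothetical names, and the sketch assumes the input is sorted so that duplicates are adjacent. Two sources of the same size produce different output sizes:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Objects;

// Hypothetical stand-in for unique(search(...)); NOT Solr's UniqueStream.
// Assumes sorted input, so duplicates are always adjacent.
class AdjacentUniqueIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private T next;          // next value to emit, valid only when 'ready'
    private boolean ready;   // true when 'next' holds an un-emitted value
    private T last;          // last value emitted
    private boolean started; // false until the first value has been emitted

    AdjacentUniqueIterator(Iterator<T> inner) {
        this.inner = inner;
    }

    @Override
    public boolean hasNext() {
        // Skip over values equal to the one we emitted most recently.
        while (!ready && inner.hasNext()) {
            T candidate = inner.next();
            if (!started || !Objects.equals(candidate, last)) {
                next = candidate;
                ready = true;
            }
        }
        return ready;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        started = true;
        last = next;
        return next;
    }
}

class UniqueDemo {
    // Drains the unique view of 'source' and returns how many values it emitted.
    static <T> long countOutput(Iterator<T> source) {
        Iterator<T> unique = new AdjacentUniqueIterator<>(source);
        long n = 0;
        while (unique.hasNext()) {
            unique.next();
            n++;
        }
        return n;
    }
}
```

Calling {{UniqueDemo.countOutput}} on the four values a, a, a, a returns 1, while a, b, c, d returns 4. The upstream size is 4 in both cases, so it says nothing about the output size until everything has been read.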

It would be nice to have these methods, but it doesn't seem possible in the 
current streaming API.  Unless I'm missing something, that is.  Did you have a 
particular method in mind for reporting these sorts of stats?

> Progress info for TupleStream
> -----------------------------
>
>                 Key: SOLR-8480
>                 URL: https://issues.apache.org/jira/browse/SOLR-8480
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Cao Manh Dat
>
> I suggest adding progress info for TupleStream. It can be very helpful for 
> tracking the consuming process.
> {code}
> public abstract class TupleStream {
>    public abstract long getSize();
>    public abstract long getConsumed();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)