+1 on having a framework. OTOH, as with the warnings implementation, we might want to go ahead with a simpler implementation while we get a more generic framework design in place.

Jacques, do you have any preliminary thoughts on the framework?

On Tue, Dec 1, 2015 at 2:08 PM, Julian Hyde <[email protected]> wrote:

> +1 for a sideband mechanism.
>
> Sideband can also allow correlated restart of sub-queries.
>
> In sideband use cases you described, the messages ran in the opposite
> direction to the data. Would the sideband also run in the same direction as
> the data? If so it could carry warnings, rejected rows, progress
> indications, and (for online aggregation[1]) notifications that a better
> approximate query result is available.
>
> Julian
>
> [1] https://en.wikipedia.org/wiki/Online_aggregation
>
> > On Dec 1, 2015, at 1:51 PM, Jacques Nadeau <[email protected]> wrote:
> >
> > This seems like a form of sideband communication. I think we should have a
> > framework for this type of thing in general rather than a one-off for this
> > particular need. Other forms of sideband might be small table bloomfilter
> > generation and pushdown into hbase, separate file assignment/partitioning
> > providers balancing/generating scanner workloads, statistics generation for
> > adaptive execution, etc.
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Tue, Dec 1, 2015 at 11:35 AM, Hsuan Yi Chu <[email protected]> wrote:
> >
> >> I am trying to deal with the following scenario:
> >>
> >> A bunch of minor fragments are doing things in parallel. Each of them could
> >> skip some records. Since the downstream minor fragment needs to know the
> >> sum of skipped-record-counts (in order to just display or see if the number
> >> exceeds the threshold) in the upstreams, each upstream minor fragment needs
> >> to pass this scalar with RecordBatch.
> >>
> >> Since this seems impacting the protocol of RecordBatch, I am looking for
> >> some advice here.
> >>
> >> Thanks.
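
To make the scenario in the original message concrete, here is a rough sketch of what the downstream side of the simpler approach might look like: each upstream minor fragment attaches a skipped-record count to what it sends, and the downstream fragment keeps a running sum and compares it against a threshold. Everything here is hypothetical (the class name `SkippedRecordTally`, its methods, and the idea of a single numeric threshold are assumptions for illustration); it does not use or reflect any actual Drill or RecordBatch API.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical downstream-side accumulator for skipped-record counts.
 * Not a Drill class; names and shape are assumptions for illustration only.
 */
public class SkippedRecordTally {
    private final long threshold;
    // AtomicLong so that several receiving threads could report safely.
    private final AtomicLong total = new AtomicLong();

    public SkippedRecordTally(long threshold) {
        this.threshold = threshold;
    }

    /**
     * Record the count that one upstream fragment attached to a batch.
     * Returns the running total after adding this count.
     */
    public long report(long skippedInBatch) {
        if (skippedInBatch < 0) {
            throw new IllegalArgumentException("skipped count must be non-negative");
        }
        return total.addAndGet(skippedInBatch);
    }

    /** Sum of all counts reported so far. */
    public long total() {
        return total.get();
    }

    /** True once the summed count is strictly greater than the threshold. */
    public boolean exceedsThreshold() {
        return total.get() > threshold;
    }
}
```

How the count actually travels from the upstream fragments (inside the batch, or over a separate sideband channel as discussed above) is exactly the open design question; the sketch only covers the summing and threshold check on the receiving end.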
