On 2019/02/26 23:11:22, Leo <liliule...@gmail.com> wrote: 
> Hi, sorry I didn't make it clear: I'm trying to do lazy loading for the 
> segment data, not the metadata. I'd like some columns of a segment not to 
> be loaded on historical nodes until they are actually accessed. But 
> SegmentMetadata needs to be produced as soon as a new segment arrives on 
> the historical node, and that process reads the segments in and analyzes 
> them. So I'd like to get the metadata without reading any segments in.
> 
> On 2019/02/26 22:09:24, Gian Merlino <g...@apache.org> wrote: 
> > Hmm. I think you're talking about the SegmentMetadata queries that
> > DruidSchema runs. The intent is that they include an empty analysisTypes
> > list, so they only use cached metadata and don't actually read segments,
> > and are pretty resource-light on historicals. But if you implemented some
> > sort of lazy loading for that metadata, those wouldn't play well together.
> > I'm not sure what the best approach is here. What's the purpose of the lazy
> > loading? If we need to make them play better together, one way to do that
> > could be to add the information that the broker needs to the segment-level
> > "Metadata" object, which I think is probably going to be faster to load,
> > and then keep loading that eagerly.
> > 
> > On Tue, Feb 26, 2019 at 11:11 AM liliule...@gmail.com <liliule...@gmail.com>
> > wrote:
> > 
> > > Hi, I noticed that the query node sends a metadata query to the
> > > historical node once new segments are published, and this triggers an
> > > analysis method that reads all segments into memory and does the
> > > analysis. I tried to do lazy caching for columns, which reads segments
> > > from disk once they are accessed; however, the analysis method reads all
> > > the data into memory to process the metadata query. Would it be a good
> > > idea to migrate this analysis process to the MiddleManager and persist
> > > the result in deep storage, so that the historical node can just read
> > > that file to answer metadata queries?
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > > For additional commands, e-mail: dev-h...@druid.apache.org
> > >
> > >
> > 
> 
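The segmentMetadata query Gian refers to, with an empty analysisTypes list, can be sketched as a Druid native query body like the following. The dataSource name and interval here are illustrative, not taken from the thread:

```json
{
  "queryType": "segmentMetadata",
  "dataSource": "sample_datasource",
  "intervals": ["2019-01-01/2019-03-01"],
  "analysisTypes": []
}
```

With analysisTypes empty, historicals can answer from the per-segment metadata they already hold rather than scanning column data, which is why lazy-loading that metadata would undercut the intended cheapness of these queries.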
