dclim opened a new pull request #7531: Data loader (sampler component)
URL: https://github.com/apache/incubator-druid/pull/7531
 
 
   Implementation of the sampler component of #7502. 
   
   Runs on the overlord and exposes a `POST /druid/indexer/v1/sampler` endpoint 
that returns sampled data for use by the data loader GUI. The endpoint is 
currently intended for internal use only and is intentionally not documented.
   
   Changes are as minimally invasive as possible, and most code is confined to 
the `org.apache.druid.indexing.overlord.sampler` package. To improve the 
sampling experience, additional methods were added to `Firehose` (returning the 
raw rows) and to `FirehoseFactory` (a `connectForSampler` method that signals 
to the implementation that we only care about a few rows, so it can skip things 
like prefetching and caching). The default implementations do 'the right thing' 
if these methods are not overridden.
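
   As a rough illustration of the default-implementation approach described 
above, here is a minimal sketch. The types `SketchFirehose`, 
`SketchFirehoseFactory`, `ListFirehose`, `PrefetchingFactory`, `LegacyFactory` 
and `SamplerSketch` are hypothetical, simplified stand-ins and do not reflect 
the actual signatures in this PR; only the name `connectForSampler` comes from 
the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; NOT the real Druid interfaces.
interface SketchFirehose {
    boolean hasMore();
    String nextRaw(); // returns the raw, unparsed row
}

interface SketchFirehoseFactory {
    SketchFirehose connect();

    // Default: fall back to the regular connection, so factories that
    // do not override this method keep working unchanged.
    default SketchFirehose connectForSampler() {
        return connect();
    }
}

final class ListFirehose implements SketchFirehose {
    private final List<String> rows;
    private int pos = 0;

    ListFirehose(List<String> rows) { this.rows = rows; }

    public boolean hasMore() { return pos < rows.size(); }
    public String nextRaw() { return rows.get(pos++); }
}

// A factory that overrides connectForSampler to skip expensive setup.
final class PrefetchingFactory implements SketchFirehoseFactory {
    boolean prefetched = false;
    private final List<String> rows;

    PrefetchingFactory(List<String> rows) { this.rows = rows; }

    public SketchFirehose connect() {
        prefetched = true; // stands in for prefetching and caching
        return new ListFirehose(rows);
    }

    @Override
    public SketchFirehose connectForSampler() {
        return new ListFirehose(rows); // only a few rows needed: no prefetch
    }
}

// A factory that predates the new method and relies on the default.
final class LegacyFactory implements SketchFirehoseFactory {
    int connectCalls = 0;
    private final List<String> rows;

    LegacyFactory(List<String> rows) { this.rows = rows; }

    public SketchFirehose connect() {
        connectCalls++;
        return new ListFirehose(rows);
    }
}

final class SamplerSketch {
    // Reads at most maxRows raw rows through the sampler-specific connection.
    static List<String> sample(SketchFirehoseFactory factory, int maxRows) {
        SketchFirehose firehose = factory.connectForSampler();
        List<String> out = new ArrayList<>();
        while (out.size() < maxRows && firehose.hasMore()) {
            out.add(firehose.nextRaw());
        }
        return out;
    }
}
```

   In this sketch, `SamplerSketch.sample(new PrefetchingFactory(rows), 2)` 
never triggers the prefetch path, while a `LegacyFactory` is still sampled 
correctly through its ordinary `connect()`.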
   
   There are a few 'hacks' added to make the API a bit nicer, e.g. allowing 
the sampler to work if no `dataSchema` is provided, in which case it just 
returns the raw rows if possible and marks everything as unparseable (since no 
parser was provided).
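
   The no-`dataSchema` behavior described above could look roughly like the 
following sketch. `SampledRow` and `NoSchemaSketch` are hypothetical names used 
only for illustration, and modelling the parser as a nullable `Function` is an 
assumption; the PR's actual response format and parsing path may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical result type: one entry per sampled row.
final class SampledRow {
    final String raw;                 // raw row as read from the source
    final Map<String, Object> parsed; // null when the row was not parsed
    final boolean unparseable;

    SampledRow(String raw, Map<String, Object> parsed, boolean unparseable) {
        this.raw = raw;
        this.parsed = parsed;
        this.unparseable = unparseable;
    }
}

final class NoSchemaSketch {
    // parser == null models a request without a dataSchema (assumption).
    static List<SampledRow> sample(
            List<String> rawRows,
            Function<String, Map<String, Object>> parser) {
        List<SampledRow> out = new ArrayList<>();
        for (String raw : rawRows) {
            if (parser == null) {
                // No parser available: return the raw row, flagged unparseable.
                out.add(new SampledRow(raw, null, true));
                continue;
            }
            try {
                out.add(new SampledRow(raw, parser.apply(raw), false));
            } catch (RuntimeException e) {
                // A parser was given but failed on this row.
                out.add(new SampledRow(raw, null, true));
            }
        }
        return out;
    }
}
```

   Calling `NoSchemaSketch.sample(rows, null)` in this sketch yields every raw 
row with `unparseable` set to `true`, mirroring the fallback described above.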
