npawar commented on issue #5238: Evaluate schema transform expressions during 
ingestion
URL: https://github.com/apache/incubator-pinot/pull/5238#issuecomment-613199104
 
 
   > > > It would be nice if we could completely decouple schema from RecordReader 
and RecordExtractor, which would make future development much easier
   > > 
   > > 
   > > It's already decoupled from RecordExtractor. Do you mean pull up the 
schema even more, such that RecordReader also doesn't need Schema? What would 
we achieve by doing that?
   > 
   > @npawar RecordExtractor is for stream ingestion, and RecordReader is for 
batch ingestion. Think of users trying to add a new record reader: they 
don't need to understand what a schema is, they only need to know which 
fields should be read.
   > This might be a bigger change, so we can add a TODO and address it 
separately.
   
   As per our offline sync-up, StreamMessageDecoder is the entry point for 
realtime, and RecordReader is the entry point for batch. The RecordExtractor is 
expected to be common to both of them; see the picture in the design doc linked 
in the description.
   I've added a TODO in the RecordReader class to pull Schema out further. For 
consistency, we should then do the same in StreamMessageDecoder as well. Since 
this is a bigger change, I will leave it out of the scope of this PR.
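
   To make the layering concrete, here is a rough sketch of the shape described 
above: one shared extractor, used by both a batch entry point and a realtime 
entry point, that is handed only the set of field names to read and never a 
Schema. Every name below (SketchRecordExtractor, SketchMapExtractor, 
SketchBatchReader, SketchStreamDecoder) and the "k=v;k=v" payload format are 
hypothetical and chosen for illustration; this is not the actual Pinot 
interface or the code in this PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical shared extractor: it is given field names only, never a Schema. */
interface SketchRecordExtractor {
    Map<String, Object> extract(Map<String, Object> raw, Set<String> fieldsToRead);
}

/** Copies only the requested fields; fields absent from the input map to null. */
final class SketchMapExtractor implements SketchRecordExtractor {
    @Override
    public Map<String, Object> extract(Map<String, Object> raw, Set<String> fieldsToRead) {
        Map<String, Object> row = new HashMap<>();
        for (String field : fieldsToRead) {
            row.put(field, raw.get(field));
        }
        return row;
    }
}

/** Batch entry point (stands in for a RecordReader): walks in-memory records. */
final class SketchBatchReader {
    private final SketchRecordExtractor extractor = new SketchMapExtractor();

    List<Map<String, Object>> readAll(List<Map<String, Object>> records, Set<String> fieldsToRead) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map<String, Object> record : records) {
            rows.add(extractor.extract(record, fieldsToRead));
        }
        return rows;
    }
}

/** Realtime entry point (stands in for a StreamMessageDecoder): decodes "k=v;k=v". */
final class SketchStreamDecoder {
    private final SketchRecordExtractor extractor = new SketchMapExtractor();

    Map<String, Object> decode(byte[] payload, Set<String> fieldsToRead) {
        Map<String, Object> raw = new HashMap<>();
        for (String pair : new String(payload, StandardCharsets.UTF_8).split(";")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                raw.put(kv[0], kv[1]);
            }
        }
        // Same extractor as the batch path; only the entry point differs.
        return extractor.extract(raw, fieldsToRead);
    }
}
```

   In this sketch, someone adding a new reader or decoder only needs to know 
which fields to read; whatever derives that field set from the schema would 
live above both entry points.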

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

