npawar edited a comment on issue #5238: Evaluate schema transform expressions during ingestion
URL: https://github.com/apache/incubator-pinot/pull/5238#issuecomment-614230483

> Yes, splitting it into a couple of PRs may help -- since you offered. Perhaps starting with the record extractor, but you decide.
> Regarding your open questions:
> 1. We face this problem because JSON and CSV don't have any schema. Should we introduce the concept of an input schema? Should we pick the schema from the first record? (What if it has null for some field, and we find the field later?) Should we dictate that the first record MUST have all the fields they ever expect to see in the input, and take that to be the schema? Just adding some ideas for consideration. Some related discussion in PR #4968

InputSchema seems like the cleanest solution, but it means more configuration and more scope for error for the user. Giving it more thought, I think the current implementation will work fine, because most often the same input format will be used for all pushes to the same table. The problem arises only if someone keeps switching between input formats of the data.

> 2. Can we assume Pinot defaults if input to transform function is null? This will work until we support more input types than Pinot itself supports -- far-fetched I think.

Right now, I think the best way forward is to let the function handle it. The function definition will change only if someone uses data with different input formats to push to the same schema (for example, they use Avro once and it succeeds, but then they use CSV and the function fails), which is not a common practical case.
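
To make the two options in point 2 concrete, here is a rough, language-neutral sketch in Python. It is not Pinot's Java API: `to_epoch_hours`, `apply_transform`, `apply_with_default`, and the `DEFAULTS` table are all hypothetical names, and the default values are placeholders rather than Pinot's actual defaults.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical per-type defaults (placeholders, not Pinot's real default values).
DEFAULTS: Dict[str, Any] = {"LONG": 0, "STRING": ""}


def to_epoch_hours(millis: Optional[int]) -> Optional[int]:
    """Hypothetical transform function that handles a missing input itself."""
    if millis is None:
        return None  # the function decides what a null input means
    return millis // 3_600_000


def apply_transform(record: Dict[str, Any], source: str,
                    fn: Callable[[Any], Any]) -> Any:
    """Option A: pass the raw value (possibly None) and let the function cope."""
    return fn(record.get(source))


def apply_with_default(record: Dict[str, Any], source: str, source_type: str,
                       fn: Callable[[Any], Any]) -> Any:
    """Option B: replace a null input with a type default before calling."""
    value = record.get(source)
    if value is None:
        # Only works for source types that have an entry in DEFAULTS.
        value = DEFAULTS[source_type]
    return fn(value)
```

Under option A the function sees that the field was absent and can react to it. Under option B the function cannot tell a missing field from a real default value, and the lookup fails for any input type that has no entry in the defaults table, which is the limitation raised in the question.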
