gianm opened a new pull request, #15276:
URL: https://github.com/apache/druid/pull/15276

   Main changes:
   
   1) The `SystemField` enum defines system fields `__file_uri`, `__file_path`,
      and `__file_bucket`. They are associated with each input entity.
   
   2) The `SystemFieldInputSource` interface can be added to any `InputSource`
      to make it system-field-capable. It sets up serialization of a list
      of configured `systemFields` in the JSON form of the input source, and
      provides a method, `getSystemFieldValue`, for computing the value of each
      system field. The cloud object, HDFS, HTTP, and local input sources now
      implement it.
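
To make the shape of the two changes above concrete, here is a minimal, self-contained sketch. It is not the code from this PR: the enum constant names, the method `getConfiguredSystemFields`, the class `ObjectStoreSource`, and the helper `systemFieldsFor` are all hypothetical, and the real `getSystemFieldValue` works on an input entity rather than a bare `URI`. Deriving the bucket from the URI authority and stripping the leading slash from the path are likewise assumptions made for illustration.

```java
import java.net.URI;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class SystemFieldSketch {
    // The three system field names come from the PR description;
    // the constant names are hypothetical.
    enum SystemField {
        FILE_URI("__file_uri"),
        FILE_PATH("__file_path"),
        FILE_BUCKET("__file_bucket");

        private final String fieldName;

        SystemField(String fieldName) {
            this.fieldName = fieldName;
        }

        String getFieldName() {
            return fieldName;
        }
    }

    // Simplified stand-in for the interface: a configured set of fields plus
    // a per-entity value lookup. A plain URI stands in for the input entity.
    interface SystemFieldInputSource {
        Set<SystemField> getConfiguredSystemFields();

        Object getSystemFieldValue(URI entityUri, SystemField field);
    }

    // Hypothetical object-store-like source.
    static class ObjectStoreSource implements SystemFieldInputSource {
        private final Set<SystemField> systemFields;

        ObjectStoreSource(Set<SystemField> systemFields) {
            this.systemFields = systemFields;
        }

        @Override
        public Set<SystemField> getConfiguredSystemFields() {
            return systemFields;
        }

        @Override
        public Object getSystemFieldValue(URI entityUri, SystemField field) {
            switch (field) {
                case FILE_URI:
                    return entityUri.toString();
                case FILE_BUCKET:
                    // Assumption: the bucket is the URI authority.
                    return entityUri.getAuthority();
                case FILE_PATH: {
                    // Assumption: the path is reported without a leading slash.
                    String path = entityUri.getPath();
                    return path.startsWith("/") ? path.substring(1) : path;
                }
                default:
                    return null;
            }
        }

        // Collects only the configured system fields for one entity.
        Map<String, Object> systemFieldsFor(URI entityUri) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (SystemField field : systemFields) {
                row.put(field.getFieldName(), getSystemFieldValue(entityUri, field));
            }
            return row;
        }
    }

    public static void main(String[] args) {
        ObjectStoreSource source = new ObjectStoreSource(
            EnumSet.of(SystemField.FILE_URI, SystemField.FILE_BUCKET));
        System.out.println(source.systemFieldsFor(URI.create("s3://my-bucket/data/file.json")));
    }
}
```

Only the fields listed in the configured set are computed, which mirrors the idea that `systemFields` is opt-in per input source.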
   
   The `SystemFieldInputSource` interface isn't strictly necessary, since each
   input source could have implemented system fields internally in its own way.
   However, I think the interface is valuable because it helps ensure system
   fields are dealt with consistently, and because it provides a path to
   exposing system fields in SQL in a nice way. Ideally, they would be
   referenceable by name but would not participate in star expansion. AFAICT
   this would require a new Calcite feature. Relevant Calcite mailing list
   thread: https://lists.apache.org/thread/pnf3bx3jlrmv7q1q7jhwhsylrw4q5t20
   
   Until then, system fields can be used in SQL without the planner's
   awareness: with `EXTERN`, add `systemFields` to the `inputSource` section,
   and add the system field names to the `signature` section.
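
As a rough illustration of the `EXTERN` workaround, the sketch below assembles such a query as a string. The class and method names, the `payload` column, and the example URL are hypothetical. Declaring each system field as a `string` column in the signature is an assumption, not something stated in this description.

```java
import java.util.List;
import java.util.stream.Collectors;

class ExternSystemFieldsSketch {
    // Renders a list of strings as a JSON array of string literals.
    // No escaping is done, so this is only suitable for simple values.
    static String jsonList(List<String> values) {
        return values.stream()
            .map(v -> "\"" + v + "\"")
            .collect(Collectors.joining(",", "[", "]"));
    }

    // Builds an EXTERN query that lists the system fields in both the
    // inputSource JSON and the signature, as the description suggests.
    static String buildQuery(List<String> uris, List<String> systemFields) {
        String inputSource = "{\"type\":\"http\",\"uris\":" + jsonList(uris)
            + ",\"systemFields\":" + jsonList(systemFields) + "}";
        String inputFormat = "{\"type\":\"json\"}";

        // "payload" is a hypothetical regular column in the input data.
        String systemColumns = systemFields.stream()
            .map(f -> ",{\"name\":\"" + f + "\",\"type\":\"string\"}")
            .collect(Collectors.joining());
        String signature = "[{\"name\":\"payload\",\"type\":\"string\"}" + systemColumns + "]";

        String selected = systemFields.stream()
            .map(f -> "\"" + f + "\"")
            .collect(Collectors.joining(", "));

        return "SELECT \"payload\", " + selected + "\n"
            + "FROM TABLE(\n"
            + "  EXTERN(\n"
            + "    '" + inputSource + "',\n"
            + "    '" + inputFormat + "',\n"
            + "    '" + signature + "'\n"
            + "  )\n"
            + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(
            List.of("https://example.com/data/file.json"),
            List.of("__file_uri", "__file_path")));
    }
}
```

Because the planner is unaware of system fields, they behave like ordinary columns here: they appear only because the signature declares them.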


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
