dstandish commented on issue #25186:
URL: https://github.com/apache/airflow/issues/25186#issuecomment-1190527854

   yeah you are probably right.  i'm not saying that we necessarily create a 
new standard.  what's at stake for me is whether airflow takes on the 
responsibility of normalizing e.g. when storing in the database.  in other 
words, given that the URI field is the unique identifier for an airflow 
dataset, does airflow really need to take on the responsibility of normalizing, 
such that when a user's code handles it in two different ways, we treat it as 
the same dataset?  i'm reluctant to take on that responsibility, and i am more 
inclined to force the user to just make their code consistent.  in other words, 
just store in a fully case-sensitive collation, and not trouble ourselves 
with, e.g. decomposing, lowering hostname, recomposing and storing -- or, 
alternatively, storing hostname separately in a case-insensitive field and 
merge back in on read etc.  i would rather not.
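
   for illustration only, here is a rough sketch of the "decompose, lower hostname, recompose" step mentioned above, i.e. the work airflow would be signing up for if it did normalize.  `normalize_uri` is a hypothetical helper name, not an existing airflow function, and the example URIs are made up.  it uses only the standard library `urllib.parse`, and it assumes that only the scheme and host are case-insensitive while userinfo, path, query and fragment are left untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_uri(uri: str) -> str:
    """Hypothetical helper: lowercase the host part of a URI, keep the rest as-is."""
    parts = urlsplit(uri)  # urlsplit already lowercases the scheme
    # Split off any "user:password@" prefix, which is case-sensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    # The port is numeric, so lowercasing "host:port" only affects the host.
    netloc = f"{userinfo}{at}{hostport.lower()}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Two spellings a user's code might produce for the same location.
a = "s3://My-Bucket/data/Key.csv"
b = "s3://my-bucket/data/Key.csv"

# Stored as-is in a case-sensitive column, these are two different datasets.
raw_equal = a == b
# With normalization on write, they collapse into one.
normalized_equal = normalize_uri(a) == normalize_uri(b)
```

   without this step, the two strings are simply different keys and it is on the user to pick one spelling and stick with it, which is the option argued for above.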
   
   but i think we probably need to decide before 2.4 because a change to this 
behavior would be breaking.
   
   unless we want to mark datasets as experimental... which i doubt...  i 
digress


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.