mihailom-db commented on PR #45383: URL: https://github.com/apache/spark/pull/45383#issuecomment-1999140950
> > Some high-level questions:
> >
> > 1. If a function requires certain collations but the input uses a different collation, shall we implicitly cast or fail?
> > 2. If a function's inputs do not use the same collation, shall we implicitly cast or fail?
> > 3. If we cast a string with collation to integer or datetime, do we need to consider the collation?
>
> @cloud-fan - these are great questions and I think that they should be part of the spec. Rough answers from my side are:
>
> 1. I think that we should fail, but there are some subtle caveats here that should be covered in the design spec.
> 2. Depends on the function, e.g. for contains we should fail. For concat we should succeed.
> 3. No. Decimal/datetime formatting should be part of "language settings", which are not part of the collation track.
>
> @mihailom-db will extend the casting section of the doc.

@cloud-fan We will have a meeting with Serge today to discuss some of these questions to make sure we got everything right, but for now, my view is this:

1. We should fail; I will extend this part after the meeting with Serge. That said, apart from the lockdown that @uros-db is working on, no function currently supports a specific collation.
2. @dbatomic I partially agree. According to the design doc, we should always try to cast to the same type. This is more a question of what happens when there is a conflict between implicit types or between multiple explicit types. If such a conflict happens, then contains fails and concat doesn't, but we still try to cast.
3. Agreed on this point.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
