[GitHub] [iceberg] HeartSaVioR commented on pull request #1523: ISSUE-1520 Document writing against partitioned table in Spark

GitBox Mon, 28 Sep 2020 18:53:41 -0700


HeartSaVioR commented on pull request #1523:
URL: https://github.com/apache/iceberg/pull/1523#issuecomment-700377063



   > I don't think so. Most transformations work with equivalents from Spark: 
ORDER BY CAST(ts AS DATE), category will work for the other two columns. Only 
bucketing is difficult.
   
   Ah, you're right. There's no need to calculate the actual partition value 
when sorting on a date/time partition column. Good point.
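   
   To illustrate the point, here is a minimal sketch in plain Python (not 
Spark). It assumes a hypothetical table partitioned by day of `ts` and by 
`category`; the sample rows and helper names are made up for illustration. 
Sorting on the plain calendar date clusters rows exactly as sorting on the 
day-transform value (days since 1970-01-01) would:

```python
from datetime import date, datetime
from itertools import groupby

# Hypothetical sample rows: (ts, category, payload). Not taken from the PR.
rows = [
    (datetime(2020, 9, 28, 18, 53), "b", 1),
    (datetime(2020, 9, 27, 9, 0), "a", 2),
    (datetime(2020, 9, 28, 7, 30), "a", 3),
    (datetime(2020, 9, 27, 23, 59), "b", 4),
    (datetime(2020, 9, 28, 12, 0), "b", 5),
]


def partition_key(row):
    """Stand-in for the actual partition value: (days since epoch, category)."""
    ts, category, _ = row
    return ((ts.date() - date(1970, 1, 1)).days, category)


def is_clustered(rows):
    """True if every partition key forms exactly one contiguous run of rows."""
    runs = [key for key, _ in groupby(rows, key=partition_key)]
    return len(runs) == len(set(runs))


# Equivalent of: ORDER BY CAST(ts AS DATE), category
# Note that the sort key never computes the partition value itself.
sorted_rows = sorted(rows, key=lambda r: (r[0].date(), r[1]))

print(is_clustered(rows))         # False: partitions are interleaved
print(is_clustered(sorted_rows))  # True: each partition is one contiguous run
```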
   
   > I think it would be helpful to have a paragraph or a callout that explains 
why this is required: we can't request a sort until sorting is supported and we 
can't inject the functions until Spark supports a FunctionCatalog.
   
   Sounds great. A note would probably work as the callout. I'll add it. 
Thanks!
   
   > Should we create a utility method to register Iceberg transforms as Spark 
UDFs?
   
   I think that would be helpful. By the way, can we handle the Java-Scala 
interop trick on the Iceberg side, or should we simply support only the Java 
side? I guess we probably don't want to deal with Scala directly, since that is 
tied to the Scala version. (The trick doesn't appear to be tied to a specific 
version of Scala, though.)
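
   For context on why bucketing has no plain Spark equivalent: as I read the 
Iceberg spec, the bucket transform is a 32-bit Murmur3 hash of the value, 
masked to be non-negative, modulo the bucket count. The sketch below covers 
only int/long values (hashed as 8-byte little-endian) and is plain Python, not 
an Iceberg or Spark API; `murmur3_x86_32` and `iceberg_bucket_long` are 
hypothetical names for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_x86_32(data, seed=0):
    """Murmur3 x86 32-bit hash; this sketch only handles 4-byte-aligned input."""
    if len(data) % 4 != 0:
        raise ValueError("sketch only supports input lengths divisible by 4")
    h = seed & MASK32
    for i in range(0, len(data), 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * 0xCC9E2D51) & MASK32
        k = _rotl32(k, 15)
        k = (k * 0x1B873593) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    # Finalization: mix in the length, then avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def iceberg_bucket_long(value, num_buckets):
    """Assumed bucket rule: (hash & Integer.MAX_VALUE) % N for a long value."""
    hashed = murmur3_x86_32(struct.pack("<q", value))
    return (hashed & 0x7FFFFFFF) % num_buckets


print(iceberg_bucket_long(34, 16))
```

   A Spark UDF wrapping a function like this is what would let a query sort 
by bucket before writing.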


