We have implemented this natively in Spark and explicit sorts are no longer required. Iceberg takes into account both the partition and sort key in the table to request a distribution and ordering from Spark. Should be supported both for batch and micro-batch writes.
- Anton > On Apr 25, 2023, at 11:05 AM, Pucheng Yang <[email protected]> > wrote: > > Hi to confirm, > > In the doc, > https://iceberg.apache.org/docs/1.0.0/spark-writes/#writing-to-partitioned-tables > > <https://iceberg.apache.org/docs/1.0.0/spark-writes/#writing-to-partitioned-tables>, > it says "Explicit sort is necessary because Spark doesn’t allow Iceberg to > request a sort before writing as of Spark 3.0. SPARK-23889 > <https://issues.apache.org/jira/browse/SPARK-23889> is filed to enable > Iceberg to require specific distribution & sort order to Spark." > > I found that all relevant JIRAs in SPARK-23889 > <https://issues.apache.org/jira/browse/SPARK-23889> are resolved in > spark-3.2.0. Does that mean we don't need explicit sort anymore from > spark-3.2.0 and after? > > Thanks > > On Tue, Mar 7, 2023 at 8:10 PM Russell Spitzer <[email protected] > <mailto:[email protected]>> wrote: > This is no longer accurate, since now we do have a "fan-out" writer for > spark. But originally the idea here is that it is way more efficient to open > a single file handle at a time and write to it, than to open a new file > handle for every file as we find a new partition to write to in the same > spark task. Fanout performs the write as just opening each handle as the > writer sees a new partition. > > Now that said, this is a local required sort for the default writer. For best > performance though in making as few files as possible using write > distribution mode "Hash" will force a real shuffle but eliminate this issue > by making sure each spark task is writing to a single or single set of > Partitions in order. We need to update this document to talk about > distribution modes, especially since hash will be the new default soon and > this information is basically for manual tuning only. > > If your data is already organized the way you want, setting distribution mode > to none will avoid this shuffle. If you don't care about multiple file > handles being open at the same time, you can set the fanout writer option. > With "none" and "fan-out" writers you will basically write in the fastest way > possible at the expense of memory at write time and possibly generating many > files if your data isn't organized. > > On Tue, Mar 7, 2023 at 9:46 PM Manu Zhang <[email protected] > <mailto:[email protected]>> wrote: > Hi all, > > As per > https://iceberg.apache.org/docs/latest/spark-writes/#writing-to-partitioned-tables > > <https://iceberg.apache.org/docs/latest/spark-writes/#writing-to-partitioned-tables>, > sort is required for Spark writing to a partitioned table. Does anyone know > the reason behind it? If this is to avoid creating too many small files, > isn't shuffle/repartition sufficient? > > Thanks, > Manu >
