This is truly wonderful.

1. I have an internal patch related to committer stuff that I could submit now.
2. If someone wants to look at it: wherever FileSystem.open() is used *and you have the file length, file path, or simply know whether you plan to do random or sequential IO*, switch to openFile(). On s3a and abfs it will skip a HEAD request if it is happy with what it has been given, and passing down the seek policy makes a big difference in read performance.
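For anyone picking up item 2, here is a minimal sketch of what the switch might look like, assuming Hadoop 3.3.0+ on the classpath. The class name `OpenFileSketch` and the helper `readRange` are hypothetical, made up for illustration. The option key `fs.option.openfile.read.policy` is the standard one as of Hadoop 3.3.5; because it is passed with `opt()` rather than `must()`, a filesystem that does not recognise it should simply ignore it.

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OpenFileSketch {

  /**
   * Hypothetical helper: read len bytes starting at offset, using
   * openFile() instead of open(). The FileStatus we already hold is
   * handed to the builder so that a store such as s3a can avoid
   * probing for the file again, and the read policy is declared up front.
   */
  static byte[] readRange(FileSystem fs, FileStatus status, long offset, int len)
      throws Exception {
    byte[] buf = new byte[len];
    try (FSDataInputStream in = fs.openFile(status.getPath())
        .withFileStatus(status)                           // what we already know
        .opt("fs.option.openfile.read.policy", "random")  // a hint, not a requirement
        .build()                                          // CompletableFuture<FSDataInputStream>
        .get()) {
      in.readFully(offset, buf);                          // positioned read
    }
    return buf;
  }

  public static void main(String[] args) throws Exception {
    // The local filesystem stands in for s3a/abfs here, purely for illustration.
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path path = new Path(
        java.nio.file.Files.createTempFile("openfile-sketch", ".bin").toUri());
    try (FSDataOutputStream out = fs.create(path, true)) {
      out.write("0123456789".getBytes(StandardCharsets.UTF_8));
    }
    byte[] got = readRange(fs, fs.getFileStatus(path), 3, 4);
    System.out.println(new String(got, StandardCharsets.UTF_8));
  }
}
```

If only the length is known rather than a full FileStatus, the same builder can instead be given `.opt("fs.option.openfile.length", Long.toString(length))`; for a sequential scan the policy value would be "sequential".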
Now we just have to get Hive, Parquet, ORC, Avro and Iceberg all lined up. Once Parquet makes the move to Hadoop 3.3.5+, the vector IO API can be used there, which Mukund Thakur will be talking about at Berlin Buzzwords this summer, if anyone is thinking of attending.

Oh, and I still need to get Hadoop to being Java 11+ only.

Thanks for this.

steve

On Sun, 16 Apr 2023 at 05:25, yangjie01 <yangji...@baidu.com> wrote:

> Thanks Chao ~
>
> Yang Jie
>
> *From:* Dongjoon Hyun <dongjoon.h...@gmail.com>
> *Date:* Sunday, 16 April 2023, 00:08
> *To:* Chao Sun <sunc...@apache.org>
> *Cc:* dev <dev@spark.apache.org>
> *Subject:* Re: hadoop-2 profile to be removed in 3.5.0
>
> Thank you so much for the heads-up, Chao!
>
> Dongjoon.
>
> On Fri, Apr 14, 2023 at 6:33 PM Chao Sun <sunc...@apache.org> wrote:
>
> Hi all,
>
> Just a heads up that the `hadoop-2` profile is going to be removed in
> Apache Spark 3.5.0. This has been discussed previously through this
> email thread:
> https://lists.apache.org/thread/z4jdy9959b6zj9t726zl0zcrk4hzs0xs
> and is now realized via
> https://issues.apache.org/jira/browse/SPARK-42452
>
> Feel free to comment if you still have any concerns.
>
> Thanks.
> Chao