This is truly wonderful.

1. I have an internal patch related to committer stuff that I could submit now.
2. If someone wants to look at it: wherever FileSystem.open() is used *and you
have the file length, file path, or simply know whether you plan to do
random or sequential IO*, switch to openFile(). On s3a and abfs it will skip a
HEAD request if it is happy with what it was given, and passing down the seek
policy makes a big difference in read performance.
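
For anyone picking up point 2, here is a minimal sketch of what the switch could look like, assuming Hadoop 3.3.5+ on the classpath. The class and method names (OpenFileSketch, readHead) are hypothetical. The option keys are written as string literals on the assumption that they match the standard fs.option.openfile.* names, so check them against the Options.OpenFileOptions constants in your Hadoop version. It runs against the local filesystem purely so it can be tried without an object store.

```java
import java.io.IOException;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.functional.FutureIO;

public class OpenFileSketch {

  /**
   * Read the first n bytes of a file through openFile() instead of open().
   * knownLength is the file length the caller already has (for example
   * from a directory listing or a table manifest).
   */
  static byte[] readHead(Path path, long knownLength, int n) throws IOException {
    FileSystem fs = path.getFileSystem(new Configuration());
    try (FSDataInputStream in = FutureIO.awaitFuture(
        fs.openFile(path)
            // Declare the intended access pattern. opt() is advisory:
            // filesystems that do not understand a key ignore it.
            .opt("fs.option.openfile.read.policy", "random")
            // Pass down the known length so the store need not probe for it.
            .opt("fs.option.openfile.length", Long.toString(knownLength))
            .build())) {
      byte[] buf = new byte[n];
      // Positioned read: fills buf from offset 0 of the file.
      in.readFully(0, buf);
      return buf;
    }
  }

  public static void main(String[] args) throws IOException {
    // Demo on a local temp file; an s3a:// or abfs:// path would be opened the same way.
    java.nio.file.Path tmp = Files.createTempFile("openfile-sketch", ".bin");
    Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
    byte[] head = readHead(new Path(tmp.toUri()), 5L, 3);
    System.out.println("bytes read: " + head.length);
  }
}
```

If you already hold a FileStatus for the file, the builder's withFileStatus() call is an alternative to passing the length as an option. Use "sequential" as the read policy for whole-file scans.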

Now, we just have to get Hive, Parquet, ORC, Avro and Iceberg all lined
up... Once Parquet makes the move to 3.3.5+, the vector IO API can be used
there, which Mukund Thakur will be talking about at Berlin Buzzwords this
summer, if anyone is thinking of attending.

Oh, and I still need to get Hadoop to be Java 11+ only.

Thanks for this.

steve

On Sun, 16 Apr 2023 at 05:25, yangjie01 <yangji...@baidu.com> wrote:

> Thanks Chao ~
>
>
>
> Yang Jie
>
>
>
> *From:* Dongjoon Hyun <dongjoon.h...@gmail.com>
> *Date:* Sunday, 16 April 2023, 00:08
> *To:* Chao Sun <sunc...@apache.org>
> *Cc:* dev <dev@spark.apache.org>
> *Subject:* Re: hadoop-2 profile to be removed in 3.5.0
>
>
>
> Thank you so much for the heads-up, Chao!
>
>
>
> Dongjoon.
>
>
>
>
>
> On Fri, Apr 14, 2023 at 6:33 PM Chao Sun <sunc...@apache.org> wrote:
>
> Hi all,
>
> Just a heads-up that the `hadoop-2` profile is going to be removed in
> Apache Spark 3.5.0. This has been discussed previously through this
> email thread:
> https://lists.apache.org/thread/z4jdy9959b6zj9t726zl0zcrk4hzs0xs
> and is now realized via
> https://issues.apache.org/jira/browse/SPARK-42452
>
> Feel free to comment if you still have any concerns.
>
> Thanks.
> Chao
>
