Re: hadoop-2 profile to be removed in 3.5.0

2023-04-18 Thread Steve Loughran
This is truly wonderful.

1. I have an internal patch related to committer stuff I could submit now.
2. If someone wants to look at it: wherever FileSystem.open() is used *and
you have the file length, file path, or simply know whether you plan to do
random or sequential IO*, switch to openFile(). On s3a and abfs it will skip
a HEAD request when it has enough information, and passing down the seek
policy makes a big difference in read performance.
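
For anyone picking up item 2, here is a rough sketch of what the switch could look like. It assumes Hadoop 3.3.5+ on the classpath, where the standard option keys "fs.option.openfile.read.policy" and "fs.option.openfile.length" are defined. The class name OpenFileSketch and the helper readRange are made up for illustration, and the local filesystem stands in for s3a/abfs so the snippet runs without cloud credentials.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.functional.FutureIO;

// Hypothetical example class; not part of Hadoop or Spark.
public class OpenFileSketch {

  /**
   * Read len bytes at offset, opening the file with openFile() instead of open().
   * fileLength is the length the caller already knows (e.g. from a listing).
   */
  static byte[] readRange(FileSystem fs, Path path, long fileLength, long offset, int len)
      throws IOException {
    // Build the open request: declare the read policy and the known length.
    // These are hints passed with opt(), so stores that do not understand
    // them simply ignore them.
    try (FSDataInputStream in = FutureIO.awaitFuture(
        fs.openFile(path)
            .opt("fs.option.openfile.read.policy", "random")
            .opt("fs.option.openfile.length", Long.toString(fileLength))
            .build())) {
      byte[] buf = new byte[len];
      // Positioned read: fills buf from the given offset.
      in.readFully(offset, buf);
      return buf;
    }
  }

  public static void main(String[] args) throws IOException {
    // Write a small local file to read back.
    java.nio.file.Path tmp = Files.createTempFile("openfile-sketch", ".txt");
    byte[] data = "hello world".getBytes(StandardCharsets.UTF_8);
    Files.write(tmp, data);

    FileSystem fs = FileSystem.getLocal(new Configuration());
    byte[] got = readRange(fs, new Path(tmp.toUri()), data.length, 6, 5);
    System.out.println(new String(got, StandardCharsets.UTF_8));
  }
}
```

For scan-style readers the policy value would be "sequential" rather than "random". If a FileStatus is already in hand, the builder's withFileStatus() can be used in place of the length option.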

Now, we just have to get Hive, Parquet, ORC, Avro and Iceberg all lined
up. Once Parquet makes the move to 3.3.5+, the vector IO API can be used
there; Mukund Thakur will be talking about that at Berlin Buzzwords this
summer, if anyone is thinking of attending.

Oh, and I still need to get Hadoop to be Java 11+ only.

Thanks for this.

steve

On Sun, 16 Apr 2023 at 05:25, yangjie01 wrote:

> Thanks Chao ~
>
>
>
> Yang Jie
>
>
>
> From: Dongjoon Hyun
> Date: Sunday, April 16, 2023 00:08
> To: Chao Sun
> Cc: dev
> Subject: Re: hadoop-2 profile to be removed in 3.5.0
>
>
>
> Thank you so much for the heads-up, Chao!
>
>
>
> Dongjoon.
>
>
>
>
>
> On Fri, Apr 14, 2023 at 6:33 PM Chao Sun wrote:
>
> Hi all,
>
> Just a heads up that the `hadoop-2` profile is going to be removed in
> Apache Spark 3.5.0. This has been discussed previously through this
> email thread:
> https://lists.apache.org/thread/z4jdy9959b6zj9t726zl0zcrk4hzs0xs
> and is now realized via
> https://issues.apache.org/jira/browse/SPARK-42452
>
> Feel free to comment if you still have any concerns.
>
> Thanks.
> Chao
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: hadoop-2 profile to be removed in 3.5.0

2023-04-15 Thread yangjie01
Thanks Chao ~

Yang Jie

From: Dongjoon Hyun
Date: Sunday, April 16, 2023 00:08
To: Chao Sun
Cc: dev
Subject: Re: hadoop-2 profile to be removed in 3.5.0

Thank you so much for the heads-up, Chao!

Dongjoon.


On Fri, Apr 14, 2023 at 6:33 PM Chao Sun <sunc...@apache.org> wrote:
Hi all,

Just a heads up that the `hadoop-2` profile is going to be removed in
Apache Spark 3.5.0. This has been discussed previously through this
email thread:
https://lists.apache.org/thread/z4jdy9959b6zj9t726zl0zcrk4hzs0xs
and is now realized via
https://issues.apache.org/jira/browse/SPARK-42452

Feel free to comment if you still have any concerns.

Thanks.
Chao

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


Re: hadoop-2 profile to be removed in 3.5.0

2023-04-15 Thread Dongjoon Hyun
Thank you so much for the heads-up, Chao!

Dongjoon.


On Fri, Apr 14, 2023 at 6:33 PM Chao Sun wrote:

> Hi all,
>
> Just a heads up that the `hadoop-2` profile is going to be removed in
> Apache Spark 3.5.0. This has been discussed previously through this
> email thread:
> https://lists.apache.org/thread/z4jdy9959b6zj9t726zl0zcrk4hzs0xs
> and is now realized via
> https://issues.apache.org/jira/browse/SPARK-42452
>
> Feel free to comment if you still have any concerns.
>
> Thanks.
> Chao
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


hadoop-2 profile to be removed in 3.5.0

2023-04-14 Thread Chao Sun
Hi all,

Just a heads up that the `hadoop-2` profile is going to be removed in
Apache Spark 3.5.0. This has been discussed previously through this
email thread: https://lists.apache.org/thread/z4jdy9959b6zj9t726zl0zcrk4hzs0xs
and is now realized via
https://issues.apache.org/jira/browse/SPARK-42452

Feel free to comment if you still have any concerns.

Thanks.
Chao

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org