regarding parquet, the issue there is:
https://github.com/apache/parquet-java/issues/3237

the change in that lib is "release the buffers", but it hits a problem
with coalesced reads: there the release doesn't work.

my fix will be: disable coalescing, keep the code (for now), and add a
path/stream capability to indicate merging.

with a max size for range merging of 2MB, it will only surface in
production if ORC/parquet files have rowgroups smaller than 2MB, which is
not realistic. But the problem does exist.
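
to illustrate why only small reads are affected, here is a minimal sketch
of range coalescing. it is not the hadoop implementation: the merge rule,
the Range and merge_ranges names, and the 128 KiB min_seek value are all
assumptions made for illustration; only the 2MB cap comes from the numbers
above.

```python
from dataclasses import dataclass

KB = 1024
MB = 1024 * KB


@dataclass(frozen=True)
class Range:
    """A hypothetical read request: byte offset plus length."""
    offset: int
    length: int

    @property
    def end(self) -> int:
        return self.offset + self.length


def merge_ranges(ranges, min_seek, max_size):
    """Coalesce ranges that are close together.

    Assumed rule: two neighbouring ranges are merged when the gap between
    them is at most min_seek and the combined span is at most max_size.
    """
    merged = []
    for r in sorted(ranges, key=lambda x: x.offset):
        if merged:
            last = merged[-1]
            span = max(last.end, r.end) - last.offset
            if r.offset - last.end <= min_seek and span <= max_size:
                # Replace the previous range with the combined one.
                merged[-1] = Range(last.offset, span)
                continue
        merged.append(r)
    return merged


# Two adjacent 512 KiB reads: combined span is 1 MiB, under the 2MB cap.
small = merge_ranges([Range(0, 512 * KB), Range(512 * KB, 512 * KB)],
                     min_seek=128 * KB, max_size=2 * MB)

# Two adjacent 8 MiB reads: combined span is 16 MiB, over the cap.
large = merge_ranges([Range(0, 8 * MB), Range(8 * MB, 8 * MB)],
                     min_seek=128 * KB, max_size=2 * MB)
```

under these assumptions the two small reads collapse into a single merged
range, while the two large reads stay separate, so the merged-buffer code
path is never reached for them.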

On Fri, 6 Jun 2025 at 09:54, Suhail, Ahmar <ahma...@amazon.co.uk.invalid>
wrote:

> Thanks everyone for testing out this RC.
>
>
> For this RC, the current status is:
>
>
>   *   Buffers allocated by ParquetFileReader.readVectored() are not being
> released. While this is not new, we should find the root cause and fix it
> in case it's a hadoop issue for 3.4.2.
>   *   Need to figure out what to do with AWS SDK bundle, the transitive
> dependency should ideally be restored. I will discuss with Steve.
>   *   Documentation updates - I will fix.
>   *   PRs are up for the YARN issues which were failing the ARM64 builds,
> thank you Masatake, so the next RC will include ARM64 binaries.
>
>
>
> Once the parquet buffer issue is resolved, I will begin work on RC-2. If
> there are any other issues in this RC that I've missed in the above list
> please let me know so I can be sure to include the fix in the next RC.
>
>
> Thank you,
> Ahmar
>
> ________________________________
> From: Masatake Iwasaki <iwasak...@oss.nttdata.com>
> Sent: Thursday, June 5, 2025 4:46:06 PM
> To: Wei-Chiu Chuang; Ahmar Suhail
> Cc: common-...@hadoop.apache.org; hdfs-dev@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: RE: [EXTERNAL] [VOTE] Release Apache Hadoop 3.4.2
>
>
>
> Hi Wei-Chiu Chuang,
>
> > I really wish we can make ARM64 binaries. Would like to find time to work
> > on the two jiras mentioned.
>
> I submitted PRs for YARN-11712 and YARN-11713.
> Could you check the patches?
>
> Thanks,
> Masatake Iwasaki
>
> On 2025/06/05 1:14, Wei-Chiu Chuang wrote:
> > I really wish we can make ARM64 binaries. Would like to find time to work
> > on the two jiras mentioned.
> >
> > On Wed, May 28, 2025 at 5:25 AM Ahmar Suhail <ah...@apache.org> wrote:
> >
> >> Hey all,
> >>
> >> The first release candidate for Hadoop 3.4.2 is now available for voting.
> >>
> >> There are a couple of things to note:
> >>
> >> 1/ No Arm64 artifacts. This is due to previously reported issues:
> >> https://issues.apache.org/jira/projects/YARN/issues/YARN-11712 and
> >> https://issues.apache.org/jira/projects/YARN/issues/YARN-11713
> >> <https://issues.apache.org/jira/projects/YARN/issues/YARN-11713>, which
> >> mean that the build fails on arm64.
> >>
> >> 2/ Relevant for anyone testing S3A: We've removed the AWS SDK bundle
> >> from hadoop-3.4.2.tar.gz. This is because the SDK bundle is now ~600MB,
> >> which makes the size of tar > 1GB, and it can no longer be uploaded to
> >> SVN.
> >> For S3A, download SDK bundle v2.29.52 from:
> >>
> >> https://mvnrepository.com/artifact/software.amazon.awssdk/bundle/2.29.52,
> >> and drop it into /share/hadoop/common/lib. Release notes will be updated
> >> with these instructions.
> >>
> >>
> >> The RC is available at:
> >>
> >> https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.4.2-RC1/
> >>
> >> The git tag is release-3.4.2-RC1, commit
> >> 09870840ec35b48cd107972eb24d25e8aece04c9
> >>
> >> The maven artifacts are staged at:
> >>
> >> https://repository.apache.org/content/repositories/orgapachehadoop-1437
> >>
> >>
> >> You can find my public key (02085AFB652F796A3B01D11FD737A6F52281FA98) at:
> >>
> >> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >>
> >>
> >> This release has been created off of branch-3.4. Key changes include:
> >>
> >> * S3A: Integration with S3 Analytics Accelerator input stream
> >> * S3A: Support for S3 conditional writes
> >> * ABFS: Deprecation of WASB driver
> >> * ABFS: Support for Non-Hierarchical Namespace Accounts on ABFS Driver
> >>
> >>
> >> This is my first attempt at managing a release, please do test the
> >> release and let me know in case of any issues.
> >>
> >> Thanks,
> >> Ahmar
> >>
> >
>
>
