Thanks Gabor, a cool solution to this problem. I think it should be fine for the encryption feature. The writing path I mentioned below is rather atypical; we don't need this function in most cases. A framework that does use this path would override the default implementation.
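To make the default-implementation idea concrete, here is a minimal sketch. It assumes a cut-down, hypothetical OutputFile interface with only defaultBlockSize() and getPath() (the two method names that appear in the diff quoted below). The null default and both implementing classes are illustrative assumptions, not the actual Parquet code.

```java
// Hypothetical stand-in for a library interface; not the real Parquet API.
interface OutputFile {
    long defaultBlockSize();

    // Assumed to be added in a later minor release. Because it has a default
    // body, implementers written against the old interface still compile.
    default String getPath() {
        return null; // assumption: "no path known" is signalled with null
    }
}

// Written against the old interface; it never mentions getPath().
class LegacyOutputFile implements OutputFile {
    @Override
    public long defaultBlockSize() {
        return 0;
    }
}

// A framework that uses the atypical writing path overrides the default.
class PathAwareOutputFile implements OutputFile {
    private final String location;

    PathAwareOutputFile(String location) {
        this.location = location;
    }

    @Override
    public long defaultBlockSize() {
        return 0;
    }

    @Override
    public String getPath() {
        return location;
    }
}

public class DefaultMethodSketch {
    public static void main(String[] args) {
        OutputFile legacy = new LegacyOutputFile();
        OutputFile aware = new PathAwareOutputFile("/tmp/example.parquet");
        System.out.println(legacy.getPath()); // prints "null"
        System.out.println(aware.getPath());  // prints "/tmp/example.parquet"
    }
}
```

With this shape, third-party implementers only touch their code if they actually need the new method.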
Cheers, Gidon

On Fri, Jan 29, 2021, 15:52 Gabor Szadovszky <[email protected]> wrote:

> Thanks a lot, Yuming, Aaron. This is clearly a show stopper. Let me give a
> -1 (binding) and therefore FAIL this vote. Thanks a lot to everyone who took
> the time to validate.
>
> About the compatibility issue.
> I agree with Gidon that this change is necessary for the feature. The
> question is whether we need to do it in a way that is source compatible
> with the previous minor release 1.11.0 or not.
>
> The fix for the actual issue would be quite easy. We only need to add a
> default implementation to the interface so the 3rd party implementers do
> not need to update their code. The main question is whether we want to make
> this change. If yes, then we should update the japicmp plugin to fail in
> case of source incompatible changes for minor releases as well. (It would
> fail in these cases for a patch version upgrade, for example between 1.11.0
> and 1.11.1.)
>
> I've quickly enabled this and there are a couple of other failures that are
> not very clear to me (METHOD_ABSTRACT_ADDED_IN_IMPLEMENTED_INTERFACE). So
> before investigating further and trying to fix these issues I would like
> to hear your opinions on whether it is correct to release a minor version
> with source incompatible changes (while binary compatibility is guaranteed).
>
> Thanks a lot,
> Gabor
>
> On Fri, Jan 29, 2021 at 2:41 PM Aaron Niskode-Dossett
> <[email protected]> wrote:
>
> > I haven't seen this java.lang.NoSuchMethodError:
> > java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer; problem before, but
> > some research suggests that this can happen when JDK > 8 is used to compile
> > code that gets run with JDK 8. Setting --source and --target alone does not
> > address the problem, but adding "--release 8" will address this.
> >
> > I don't yet know enough about the Parquet build process to know if this is
> > systemic or an individual user issue.
> > It seems like we specify Java 8 through the pom file?
> >
> > Another common suggestion to address this specific issue is to cast
> > ByteBuffer to Buffer before calling methods like position, but that seems
> > like an incredibly fragile fix.
> >
> > References:
> > http://openjdk.java.net/jeps/247
> > https://stackoverflow.com/questions/61267495/exception-in-thread-main-java-lang-nosuchmethoderror-java-nio-bytebuffer-flip
> > https://github.com/eclipse/jetty.project/issues/3244
> >
> > On Fri, Jan 29, 2021 at 6:56 AM Wang, Yuming <[email protected]>
> > wrote:
> >
> > > It seems there is something wrong with JDK 8:
> > >
> > > java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
> > >     at org.apache.parquet.bytes.CapacityByteArrayOutputStream.write(CapacityByteArrayOutputStream.java:197)
> > >     at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridEncoder.writeOrAppendBitPackedRun(RunLengthBitPackingHybridEncoder.java:193)
> > >     at org.apache.parquet.column.values.rle.RunLengthBitPackingHybridEncoder.writeInt(RunLengthBitPackingHybridEncoder.java:179)
> > >     at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter.getBytes(DictionaryValuesWriter.java:167)
> > >     at org.apache.parquet.column.values.fallback.FallbackValuesWriter.getBytes(FallbackValuesWriter.java:74)
> > >     at org.apache.parquet.column.impl.ColumnWriterV1.writePage(ColumnWriterV1.java:60)
> > >     at org.apache.parquet.column.impl.ColumnWriterBase.writePage(ColumnWriterBase.java:387)
> > >     at org.apache.parquet.column.impl.ColumnWriteStoreBase.sizeCheck(ColumnWriteStoreBase.java:235)
> > >     at org.apache.parquet.column.impl.ColumnWriteStoreBase.endRecord(ColumnWriteStoreBase.java:222)
> > >     at org.apache.parquet.column.impl.ColumnWriteStoreV1.endRecord(ColumnWriteStoreV1.java:29)
> > >     at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.endMessage(MessageColumnIO.java:307)
> > >     at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport.consumeMessage(ParquetWriteSupport.scala:465)
> > >     at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport.write(ParquetWriteSupport.scala:148)
> > >     at org.apache.spark.sql.execution.datasources.parquet.ParquetWriteSupport.write(ParquetWriteSupport.scala:54)
> > >     at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:138)
> > >
> > > On 2021/1/29, 18:01, "Gidon Gershinsky" <[email protected]> wrote:
> > >
> > > Regarding the technical reason behind this addition - we needed it to
> > > enable encryption in one of the writing paths.
> > >
> > > Cheers, Gidon
> > >
> > > On Thu, Jan 28, 2021 at 7:09 PM Aaron Niskode-Dossett
> > > <[email protected]> wrote:
> > >
> > > > My (non-binding) opinion is that this is ok. In a different Apache
> > > > project, we didn't allow a change like that in minor versions and it
> > > > delayed some key work by several months.
> > > >
> > > > On Thu, Jan 28, 2021 at 3:00 AM Gabor Szadovszky <[email protected]>
> > > > wrote:
> > > >
> > > > > Thanks a lot, Fokko.
> > > > >
> > > > > Regarding the breaking change. We have the maven plugin japicmp
> > > > > executed in the verify phase so I was curious why it did not catch
> > > > > this issue. It seems the plugin allows source incompatible changes
> > > > > for minor version upgrades by default. It sounds reasonable to me but
> > > > > I am curious about the opinion of the community.
> > > > > See details about the plugin at
> > > > > https://siom79.github.io/japicmp/MavenPlugin.html .
> > > > > Search for METHOD_ADDED_TO_INTERFACE to find info about the current
> > > > > issue.
> > > > >
> > > > > Cheers,
> > > > > Gabor
> > > > >
> > > > > On Wed, Jan 27, 2021 at 11:08 PM Driesprong, Fokko
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > Thanks for running the release Gabor!
> > > > > >
> > > > > > The signature checks out:
> > > > > > MacBook-Pro-van-Fokko:Downloads fokkodriesprong$ curl https://downloads.apache.org/parquet/KEYS > KEYS
> > > > > >   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> > > > > >                                  Dload  Upload   Total   Spent    Left  Speed
> > > > > > 100 33082  100 33082    0     0   212k      0 --:--:-- --:--:-- --:--:--  212k
> > > > > > MacBook-Pro-van-Fokko:Downloads fokkodriesprong$ gpg --import KEYS
> > > > > > gpg: key 97D7E8647AE7E47B: 2 signatures not checked due to missing keys
> > > > > > gpg: key 97D7E8647AE7E47B: public key "Julien Le Dem <[email protected]>" imported
> > > > > > gpg: key 7CD8278971F0F13B: 1 signature not checked due to a missing key
> > > > > > gpg: key 7CD8278971F0F13B: public key "Tianshuo Deng <[email protected]>" imported
> > > > > > gpg: key 4FB955854318F669: 8 signatures not checked due to missing keys
> > > > > > gpg: key 4FB955854318F669: public key "Tom White (CODE SIGNING KEY) <[email protected]>" imported
> > > > > > gpg: key FCB3CBD9D3924CCD: public key "Ryan Blue (CODE SIGNING KEY) <[email protected]>" imported
> > > > > > gpg: key A9358ED82F7D7992: public key "Alex Levenson <[email protected]>" imported
> > > > > > gpg: key 442C7FAC7C58F0FE: public key "Alex Levenson <[email protected]>" imported
> > > > > > gpg: key 29D94E228CAAD602: 2 signatures not checked due to missing keys
> > > > > > gpg: key 29D94E228CAAD602: public key "Uwe L. Korn <[email protected]>" imported
> > > > > > gpg: key 021057DBF048F543: public key "Gabor Szadovszky <[email protected]>" imported
> > > > > > gpg: Total number processed: 8
> > > > > > gpg:               imported: 8
> > > > > > gpg: marginals needed: 3  completes needed: 1  trust model: pgp
> > > > > > gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
> > > > > > gpg: next trustdb check due at 2021-04-08
> > > > > > MacBook-Pro-van-Fokko:Downloads fokkodriesprong$ gpg --verify apache-parquet-1.12.0.tar.gz.asc apache-parquet-1.12.0.tar.gz
> > > > > > gpg: Signature made wo 27 jan 16:36:01 2021 CET
> > > > > > gpg:                using RSA key 6FB82970311551C7CEF131F5021057DBF048F543
> > > > > > gpg: Good signature from "Gabor Szadovszky <[email protected]>" [unknown]
> > > > > > gpg: WARNING: This key is not certified with a trusted signature!
> > > > > > gpg:          There is no indication that the signature belongs to the owner.
> > > > > > Primary key fingerprint: 6FB8 2970 3115 51C7 CEF1  31F5 0210 57DB F048 F543
> > > > > >
> > > > > > Also, the hash is looking good:
> > > > > > MacBook-Pro-van-Fokko:Downloads fokkodriesprong$ shasum -a 512 apache-parquet-1.12.0.tar.gz
> > > > > > f19e13e3997027f66f82a8cbacff8e832545250a1d84292234d3da45effdcf41a42288dad37ee8f4d04e29415bc58e2f170abdacd9c3ffe6bff4f68d117c32ec  apache-parquet-1.12.0.tar.gz
> > > > > > MacBook-Pro-van-Fokko:Downloads fokkodriesprong$ cat apache-parquet-1.12.0.tar.gz.sha512
> > > > > > f19e13e3997027f66f82a8cbacff8e832545250a1d84292234d3da45effdcf41a42288dad37ee8f4d04e29415bc58e2f170abdacd9c3ffe6bff4f68d117c32ec  apache-parquet-1.12.0.tar.gz
> > > > > >
> > > > > > I've checked against Iceberg, and the API is broken for the OutputFile:
> > > > > >
> > > > > > MacBook-Pro-van-Fokko:incubator-iceberg fokkodriesprong$ git diff
> > > > > > diff --git a/parquet/src/main/java/org/apache/iceberg/parquet/ParquetIO.java b/parquet/src/main/java/org/apache/iceberg/parquet/ParquetIO.java
> > > > > > index d65b8d63..8ce32a7b 100644
> > > > > > --- a/parquet/src/main/java/org/apache/iceberg/parquet/ParquetIO.java
> > > > > > +++ b/parquet/src/main/java/org/apache/iceberg/parquet/ParquetIO.java
> > > > > > @@ -162,6 +162,11 @@ class ParquetIO {
> > > > > >      public long defaultBlockSize() {
> > > > > >        return 0;
> > > > > >      }
> > > > > > +
> > > > > > +    @Override
> > > > > > +    public String getPath() {
> > > > > > +      return this.file.location();
> > > > > > +    }
> > > > > >    }
> > > > > >
> > > > > > This change is introduced here:
> > > > > > https://github.com/apache/parquet-mr/commit/5c6916c23cb2b9c225ea80328550ee0e11aee225
> > > > > >
> > > > > > It is breaking, but not sure if it is blocking.
> > > > > >
> > > > > > A +1 (non-binding) from my side!
> > > > > >
> > > > > > Cheers, Fokko
> > > > > >
> > > > > > On Wed, Jan 27, 2021 at 16:46, Gabor Szadovszky <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I propose the following RC to be released as the official Apache
> > > > > > > Parquet 1.12.0 release.
> > > > > > >
> > > > > > > The commit id is ad59c33e53276572c105b4ccac71293e988adc30
> > > > > > > * This corresponds to the tag: apache-parquet-1.12.0-rc1
> > > > > > > * https://github.com/apache/parquet-mr/tree/ad59c33e53276572c105b4ccac71293e988adc30
> > > > > > >
> > > > > > > The release tarball, signature, and checksums are here:
> > > > > > > * https://dist.apache.org/repos/dist/dev/parquet/apache-parquet-1.12.0-rc1
> > > > > > >
> > > > > > > You can find the KEYS file here:
> > > > > > > * https://downloads.apache.org/parquet/KEYS
> > > > > > >
> > > > > > > Binary artifacts are staged in Nexus here:
> > > > > > > * https://repository.apache.org/content/groups/staging/org/apache/parquet/
> > > > > > >
> > > > > > > This release includes the features Parquet Modular Encryption and
> > > > > > > Parquet Bloom Filter. See details at:
> > > > > > > * https://github.com/apache/parquet-mr/blob/apache-parquet-1.12.0-rc1/CHANGES.md
> > > > > > >
> > > > > > > Please download, verify, and test.
> > > > > > >
> > > > > > > Please vote in the next 72 hours.
> > > > > > >
> > > > > > > [ ] +1 Release this as Apache Parquet 1.12.0
> > > > > > > [ ] +0
> > > > > > > [ ] -1 Do not release this because...
> > > > > > >
> > > > > > > PS.: Starting with RC1 instead of RC0 because I missed updating
> > > > > > > CHANGES.md the first time.
> > > >
> > > > --
> > > > Aaron Niskode-Dossett, Data Engineering -- Etsy
> >
> > --
> > Aaron Niskode-Dossett, Data Engineering -- Etsy
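For reference, the cast workaround mentioned in the NoSuchMethodError discussion above can be sketched as follows. Since Java 9, ByteBuffer overrides position(int) with a ByteBuffer return type, so code compiled against a newer class library links to a method that does not exist on Java 8. Calling through the Buffer supertype avoids that. The class and method names here (BufferCastSketch, skipTo) are hypothetical, and this is only an illustration of the cast, not the fix applied in Parquet.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class BufferCastSketch {
    // Moves the position through the Buffer supertype. The call site then
    // links against Buffer.position(int), which exists on Java 8 as well as
    // on later releases, instead of the ByteBuffer override added in Java 9.
    static ByteBuffer skipTo(ByteBuffer buf, int newPosition) {
        ((Buffer) buf).position(newPosition);
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        skipTo(buf, 4);
        System.out.println(buf.position()); // prints 4
    }
}
```

Compiling with the release option instead (javac --release 8) has the compiler check against the Java 8 API, so no per-call cast is needed; the cast has to be repeated at every affected call site, which is why it was called fragile above.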
