Any chance we could get some movement on this for 2.4.1?

https://issues.apache.org/jira/browse/SPARK-25588
https://github.com/apache/parquet-mr/pull/560

It would require a new Parquet release, which Spark would then need to pick
up. We're dead in the water on 2.4.0 without a large refactoring (removing all
the RDD code paths for reading Avro stored in Parquet).

michael

> On Mar 8, 2019, at 6:22 PM, Sean Owen <sro...@gmail.com> wrote:
>
> FWIW RC6 looked fine to me. Passed all tests, etc.
>
> On Fri, Mar 8, 2019 at 6:09 PM DB Tsai <dbt...@dbtsai.com> wrote:
> Sounds fair to me. I'll cut another rc7 when the PR is merged. Hopefully,
> this is the final rc. Thanks.
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Web: https://www.dbtsai.com
> PGP Key ID: 42E5B25A8F7A82C1
>
>
> On Fri, Mar 8, 2019 at 3:23 PM Xiao Li <lix...@databricks.com> wrote:
> It is common to hit this issue when the driver and executors have different
> object layouts, but Spark might not return a wrong answer. It is very hard to
> find the root cause. Thus, I suggest including it in Spark 2.4.1.
>
> On Fri, Mar 8, 2019 at 3:13 PM DB Tsai <dbt...@dbtsai.com> wrote:
> BTW, in practice, is it common for users to run into this bug when the
> driver and executors have different object layouts?
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Web: https://www.dbtsai.com
> PGP Key ID: 42E5B25A8F7A82C1
>
>
> On Fri, Mar 8, 2019 at 3:00 PM DB Tsai <dbt...@dbtsai.com> wrote:
> Hi Xiao,
>
> I already cut rc7 and started the build process. If we definitely need this
> fix, I can cut rc8. Let me know what you think.
>
> Thanks,
>
> On Fri, Mar 8, 2019 at 1:46 PM Xiao Li <lix...@databricks.com> wrote:
> Hi, DB,
>
> Since this RC will fail, could you hold it until we fix
> https://issues.apache.org/jira/browse/SPARK-27097? Either Kris or I will
> submit a PR today. The PR is small and the risk is low. This is a correctness
> bug. It would be good to have it.
>
> Thanks,
>
> Xiao
>
>
>
>
> On Fri, Mar 8, 2019 at 12:17 PM DB Tsai <d_t...@apple.com.invalid> wrote:
> Since I cannot find the `Preparing development version 2.4.2-SNAPSHOT`
> commit after the rc6 cut, it's very risky to fix the branch and do a
> force-push. I'll follow Marcelo's suggestion and cut another rc7. Thus,
> this vote fails.
>
> DB Tsai | Siri Open Source Technologies [not a contribution] | Apple,
> Inc
>
> > On Mar 8, 2019, at 11:45 AM, DB Tsai <d_t...@apple.com.INVALID> wrote:
> >
> > Okay, I see the problem. The rc6 tag is not in the 2.4 branch. It's very
> > weird. It must have been overwritten by a force push.
> >
> > DB Tsai | Siri Open Source Technologies [not a contribution] | Apple,
> > Inc
> >
> >> On Mar 8, 2019, at 11:39 AM, DB Tsai <d_t...@apple.com.INVALID> wrote:
> >>
> >> I was using `./do-release-docker.sh` to create a release. But since the
> >> gpg validation failed a couple of times when the script tried to publish
> >> the jars to Nexus, I re-ran the script multiple times without creating a
> >> new rc. I wonder whether the script overwrote the v2.4.1-rc6 tag instead
> >> of reusing the same commit, causing this issue.
> >>
> >> Should we create a new rc7?
> >>
> >> DB Tsai | Siri Open Source Technologies [not a contribution] |
> >> Apple, Inc
> >>
> >>> On Mar 8, 2019, at 10:54 AM, Marcelo Vanzin <van...@cloudera.com.INVALID>
> >>> wrote:
> >>>
> >>> I personally find it a little weird to not have the commit in branch-2.4.
> >>>
> >>> Not that this would happen, but if the v2.4.1-rc6 tag is overwritten
> >>> (e.g. accidentally) then you lose the reference to that commit, and
> >>> then the exact commit from which the rc was generated is lost.
> >>>
> >>> On Fri, Mar 8, 2019 at 7:49 AM Sean Owen <sro...@gmail.com> wrote:
> >>>>
> >>>> That's weird. I see the commit but can't find it in the branch. Was it
> >>>> pushed, or lost in a force push of 2.4 along the way? The change is
> >>>> there, just under a different commit in the 2.4 branch.
> >>>>
> >>>> It doesn't necessarily invalidate the RC as it is a valid public tagged
> >>>> commit and all that. I just want to be sure we do have the code from
> >>>> that commit in these tarballs. It looks like it.
> >>>>
> >>>> On Fri, Mar 8, 2019, 4:14 AM Mihály Tóth <misut...@gmail.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> I am not sure how problematic it is, but v2.4.1-rc6 is not on
> >>>>> branch-2.4. Release-related commits I have seen so far were also part
> >>>>> of the branch.
> >>>>>
> >>>>> I guess the "Preparing Spark release v2.4.1-rc6" and "Preparing
> >>>>> development version 2.4.2-SNAPSHOT" commits were simply not pushed to
> >>>>> spark-2.4; just the tag itself was pushed. I don't know what the
> >>>>> practice is in such cases, but one solution is to rebase branch-2.4
> >>>>> changes after 3336a21 onto these commits and do a (sorry) force push.
> >>>>> In this case there is no impact on this RC.
> >>>>>
> >>>>> Best Regards,
> >>>>>
> >>>>> Misi
> >>>>>
> >>>>> DB Tsai <d_t...@apple.com.invalid> wrote (on Fri, Mar 8, 2019, 1:15):
> >>>>>>
> >>>>>> Please vote on releasing the following candidate as Apache Spark
> >>>>>> version 2.4.1.
> >>>>>>
> >>>>>> The vote is open until March 11 PST and passes if a majority +1 PMC
> >>>>>> votes are cast, with
> >>>>>> a minimum of 3 +1 votes.
> >>>>>>
> >>>>>> [ ] +1 Release this package as Apache Spark 2.4.1
> >>>>>> [ ] -1 Do not release this package because ...
> >>>>>>
> >>>>>> To learn more about Apache Spark, please see http://spark.apache.org/
> >>>>>>
> >>>>>> The tag to be voted on is v2.4.1-rc6 (commit
> >>>>>> 201ec8c9b46f9d037cc2e3a5d9c896b9840ca1bc):
> >>>>>> https://github.com/apache/spark/tree/v2.4.1-rc6
> >>>>>>
> >>>>>> The release files, including signatures, digests, etc. can be found at:
> >>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.1-rc6-bin/
> >>>>>>
> >>>>>> Signatures used for Spark RCs can be found in this file:
> >>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>>>>>
> >>>>>> The staging repository for this release can be found at:
> >>>>>> https://repository.apache.org/content/repositories/orgapachespark-1308/
> >>>>>>
> >>>>>> The documentation corresponding to this release can be found at:
> >>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.1-rc6-docs/
> >>>>>>
> >>>>>> The list of bug fixes going into 2.4.1 can be found at the following
> >>>>>> URL:
> >>>>>> https://issues.apache.org/jira/projects/SPARK/versions/2.4.1
> >>>>>>
> >>>>>> FAQ
> >>>>>>
> >>>>>> =========================
> >>>>>> How can I help test this release?
> >>>>>> =========================
> >>>>>>
> >>>>>> If you are a Spark user, you can help us test this release by taking
> >>>>>> an existing Spark workload and running on this release candidate, then
> >>>>>> reporting any regressions.
> >>>>>>
> >>>>>> If you're working in PySpark you can set up a virtual env and install
> >>>>>> the current RC and see if anything important breaks. In Java/Scala
> >>>>>> you can add the staging repository to your project's resolvers and test
> >>>>>> with the RC (make sure to clean up the artifact cache before/after so
> >>>>>> you don't end up building with an out-of-date RC going forward).
> >>>>>>
> >>>>>> ===========================================
> >>>>>> What should happen to JIRA tickets still targeting 2.4.1?
> >>>>>> ===========================================
> >>>>>>
> >>>>>> The current list of open tickets targeted at 2.4.1 can be found at:
> >>>>>> https://issues.apache.org/jira/projects/SPARK and search for "Target
> >>>>>> Version/s" = 2.4.1
> >>>>>>
> >>>>>> Committers should look at those and triage. Extremely important bug
> >>>>>> fixes, documentation, and API tweaks that impact compatibility should
> >>>>>> be worked on immediately. Everything else please retarget to an
> >>>>>> appropriate release.
> >>>>>>
> >>>>>> ==================
> >>>>>> But my bug isn't fixed?
> >>>>>> ==================
> >>>>>>
> >>>>>> In order to make timely releases, we will typically not hold the
> >>>>>> release unless the bug in question is a regression from the previous
> >>>>>> release. That being said, if there is something which is a regression
> >>>>>> that has not been correctly targeted please ping me or a committer to
> >>>>>> help target the issue.
> >>>>>>
> >>>>>> DB Tsai | Siri Open Source Technologies [not a contribution] |
> >>>>>> Apple, Inc
> >>>>>>
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >>>>>>
> >>>
> >>>
> >>> --
> >>> Marcelo