Any chance we could get some movement on this for 2.4.1?

https://issues.apache.org/jira/browse/SPARK-25588
https://github.com/apache/parquet-mr/pull/560

It would require a new Parquet release, which would then need to be picked up 
by Spark. We're dead in the water on 2.4.0 without a large refactoring (removing 
all the RDD code paths for reading Avro stored in Parquet).

   michael


> On Mar 8, 2019, at 6:22 PM, Sean Owen <sro...@gmail.com> wrote:
> 
> FWIW RC6 looked fine to me. Passed all tests, etc.
> 
> On Fri, Mar 8, 2019 at 6:09 PM DB Tsai <dbt...@dbtsai.com> wrote:
> Sounds fair to me. I'll cut another rc7 when the PR is merged. Hopefully, 
> this is the final rc. Thanks.
> 
> Sincerely,
> 
> DB Tsai
> ----------------------------------------------------------
> Web: https://www.dbtsai.com
> PGP Key ID: 42E5B25A8F7A82C1
> 
> 
> On Fri, Mar 8, 2019 at 3:23 PM Xiao Li <lix...@databricks.com> wrote:
> It is common to hit this issue when the driver and executors have different 
> object layouts, but Spark might not return a wrong answer. It is very hard to 
> find out the root cause. Thus, I would suggest including it in Spark 2.4.1. 
> 
> On Fri, Mar 8, 2019 at 3:13 PM DB Tsai <dbt...@dbtsai.com> wrote:
> BTW, practically, is it common for users to run into this bug when the 
> driver and executors have different object layouts?
> 
> Sincerely,
> 
> DB Tsai
> ----------------------------------------------------------
> Web: https://www.dbtsai.com
> PGP Key ID: 42E5B25A8F7A82C1
> 
> 
> On Fri, Mar 8, 2019 at 3:00 PM DB Tsai <dbt...@dbtsai.com> wrote:
> Hi Xiao,
> 
> I already cut rc7 and started the build process. If we definitely need this 
> fix, I can cut rc8. Let me know what you think.
> 
> Thanks,
> 
> On Fri, Mar 8, 2019 at 1:46 PM Xiao Li <lix...@databricks.com> wrote:
> Hi, DB, 
> 
> Since this RC will fail, could you hold it until we fix 
> https://issues.apache.org/jira/browse/SPARK-27097? Either Kris or I will 
> submit a PR today. The PR is small and the risk is low. This is a correctness 
> bug. It would be good to have it. 
> 
> Thanks,
> 
> Xiao
> 
> 
>  
> 
> On Fri, Mar 8, 2019 at 12:17 PM DB Tsai <d_t...@apple.com.invalid> wrote:
> Since I cannot find the commit of `Preparing development version 
> 2.4.2-SNAPSHOT` after the rc6 cut, it's very risky to fix the branch and do a 
> force-push. I'll follow Marcelo's suggestion to have another rc7 cut. Thus, 
> this vote fails.
> 
> DB Tsai  |  Siri Open Source Technologies [not a contribution]  |   Apple, 
> Inc
> 
> > On Mar 8, 2019, at 11:45 AM, DB Tsai <d_t...@apple.com.INVALID> wrote:
> > 
> > Okay, I see the problem. The rc6 tag is not in the 2.4 branch. It's very 
> > weird. It must have been overwritten by a force push.
> > 
> > DB Tsai  |  Siri Open Source Technologies [not a contribution]  |   Apple, 
> > Inc
> > 
> >> On Mar 8, 2019, at 11:39 AM, DB Tsai <d_t...@apple.com.INVALID> wrote:
> >> 
> >> I was using `./do-release-docker.sh` to create a release. But since the 
> >> gpg validation failed a couple of times when the script tried to publish 
> >> the jars into Nexus, I re-ran the script multiple times without creating a 
> >> new rc. I was wondering if the script overwrote the v2.4.1-rc6 tag 
> >> instead of using the same commit, causing this issue.
> >> 
> >> Should we create a new rc7?
> >> 
> >> DB Tsai  |  Siri Open Source Technologies [not a contribution]  |   
> >> Apple, Inc
> >> 
> >>> On Mar 8, 2019, at 10:54 AM, Marcelo Vanzin <van...@cloudera.com.INVALID> 
> >>> wrote:
> >>> 
> >>> I personally find it a little weird to not have the commit in branch-2.4.
> >>> 
> >>> Not that this would happen, but if the v2.4.1-rc6 tag is overwritten
> >>> (e.g. accidentally) then you lose the reference to that commit, and
> >>> then the exact commit from which the rc was generated is lost.
> >>> 
> >>> On Fri, Mar 8, 2019 at 7:49 AM Sean Owen <sro...@gmail.com> wrote:
> >>>> 
> >>>> That's weird. I see the commit but can't find it in the branch. Was it 
> >>>> pushed, or lost in a force push of 2.4 along the way? The change is 
> >>>> there, just under a different commit in the 2.4 branch.
> >>>> 
> >>>> It doesn't necessarily invalidate the RC as it is a valid public tagged 
> >>>> commit and all that. I just want to be sure we do have the code from 
> >>>> that commit in these tarballs. It looks like it.
> >>>> 
> >>>> On Fri, Mar 8, 2019, 4:14 AM Mihály Tóth <misut...@gmail.com> wrote:
> >>>>> 
> >>>>> Hi,
> >>>>> 
> >>>>> I am not sure how problematic it is but v2.4.1-rc6 is not on 
> >>>>> branch-2.4. Release related commits I have seen so far were also part 
> >>>>> of the branch.
> >>>>> 
> >>>>> I guess the "Preparing Spark release v2.4.1-rc6" and "Preparing 
> >>>>> development version 2.4.2-SNAPSHOT" commits were simply not pushed to 
> >>>>> spark-2.4; only the tag itself was pushed. I don't know what the 
> >>>>> practice is in such cases, but one solution is to rebase the branch-2.4 
> >>>>> changes after 3336a21 onto these commits and do a (sorry) force push. In 
> >>>>> this case there is no impact on this RC.
> >>>>> 
> >>>>> Best Regards,
> >>>>> 
> >>>>> Misi
> >>>>> 
> >>>>> DB Tsai <d_t...@apple.com.invalid> wrote (on Fri, Mar 8, 2019, 1:15):
> >>>>>> 
> >>>>>> Please vote on releasing the following candidate as Apache Spark 
> >>>>>> version 2.4.1.
> >>>>>> 
> >>>>>> The vote is open until March 11 PST and passes if a majority +1 PMC 
> >>>>>> votes are cast, with
> >>>>>> a minimum of 3 +1 votes.
> >>>>>> 
> >>>>>> [ ] +1 Release this package as Apache Spark 2.4.1
> >>>>>> [ ] -1 Do not release this package because ...
> >>>>>> 
> >>>>>> To learn more about Apache Spark, please see http://spark.apache.org/
> >>>>>> 
> >>>>>> The tag to be voted on is v2.4.1-rc6 (commit 
> >>>>>> 201ec8c9b46f9d037cc2e3a5d9c896b9840ca1bc):
> >>>>>> https://github.com/apache/spark/tree/v2.4.1-rc6
> >>>>>> 
> >>>>>> The release files, including signatures, digests, etc. can be found at:
> >>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.1-rc6-bin/
> >>>>>> 
> >>>>>> Signatures used for Spark RCs can be found in this file:
> >>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>>>>> 
> >>>>>> The staging repository for this release can be found at:
> >>>>>> https://repository.apache.org/content/repositories/orgapachespark-1308/
> >>>>>> 
> >>>>>> The documentation corresponding to this release can be found at:
> >>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.1-rc6-docs/
> >>>>>> 
> >>>>>> The list of bug fixes going into 2.4.1 can be found at the following 
> >>>>>> URL:
> >>>>>> https://issues.apache.org/jira/projects/SPARK/versions/2.4.1
> >>>>>> 
> >>>>>> FAQ
> >>>>>> 
> >>>>>> =========================
> >>>>>> How can I help test this release?
> >>>>>> =========================
> >>>>>> 
> >>>>>> If you are a Spark user, you can help us test this release by taking
> >>>>>> an existing Spark workload and running on this release candidate, then
> >>>>>> reporting any regressions.
> >>>>>> 
> >>>>>> If you're working in PySpark, you can set up a virtual env, install
> >>>>>> the current RC, and see if anything important breaks. In Java/Scala,
> >>>>>> you can add the staging repository to your project's resolvers and test
> >>>>>> with the RC (make sure to clean up the artifact cache before/after so
> >>>>>> you don't end up building with an out-of-date RC going forward).
> >>>>>> 
> >>>>>> ===========================================
> >>>>>> What should happen to JIRA tickets still targeting 2.4.1?
> >>>>>> ===========================================
> >>>>>> 
> >>>>>> The current list of open tickets targeted at 2.4.1 can be found at:
> >>>>>> https://issues.apache.org/jira/projects/SPARK and search for "Target 
> >>>>>> Version/s" = 2.4.1
> >>>>>> 
> >>>>>> Committers should look at those and triage. Extremely important bug
> >>>>>> fixes, documentation, and API tweaks that impact compatibility should
> >>>>>> be worked on immediately. Everything else please retarget to an
> >>>>>> appropriate release.
> >>>>>> 
> >>>>>> ==================
> >>>>>> But my bug isn't fixed?
> >>>>>> ==================
> >>>>>> 
> >>>>>> In order to make timely releases, we will typically not hold the
> >>>>>> release unless the bug in question is a regression from the previous
> >>>>>> release. That being said, if there is something which is a regression
> >>>>>> that has not been correctly targeted please ping me or a committer to
> >>>>>> help target the issue.
> >>>>>> 
> >>>>>> DB Tsai  |  Siri Open Source Technologies [not a contribution]  |   
> >>>>>> Apple, Inc
> >>>>>> 
> >>>>>> 
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >>>>>> 
> >>> 
> >>> 
> >>> -- 
> >>> Marcelo
> >>> 
> >>> 
> >> 
> >> 
> >> 
> > 
> > 
> > 
> 
> 
> 
> 
> 
> -- 
> - DB Sent from my iPhone
> 
> 
