Re: Outstanding Spark 2.1.1 issues

Daniel Siegmann Mon, 20 Mar 2017 17:24:58 -0700

Any chance of back-porting

SPARK-14536 <https://issues.apache.org/jira/browse/SPARK-14536> - NPE in
JDBCRDD when array column contains nulls (postgresql)


It just adds a null check - just a simple bug fix - so it really belongs in
Spark 2.1.x.

On Mon, Mar 20, 2017 at 6:12 PM, Holden Karau <[email protected]> wrote:

> Hi Spark Developers!
>
> As we start working on the Spark 2.1.1 release I've been looking at our
> outstanding issues still targeted for it. I've tried to break it down by
> component so that people in charge of each component can take a quick look
> and see if any of these things can/should be re-targeted to 2.2 or 2.1.2 &
> the overall list is pretty short (only 9 items - 5 if we only look at
> explicitly tagged) :)
>
> If your working on something for Spark 2.1.1 and it doesn't show up in
> this list please speak up now :) We have a lot of issues (including "in
> progress") that are listed as impacting 2.1.0, but they aren't targeted for
> 2.1.1 - if there is something you are working in their which should be
> targeted for 2.1.1 please let us know so it doesn't slip through the cracks.
>
> The query string I used for looking at the 2.1.1 open issues is:
>
> ((affectedVersion = 2.1.1 AND cf[12310320] is Empty) OR fixVersion = 2.1.1
> OR cf[12310320] = "2.1.1") AND project = spark AND resolution = Unresolved
> ORDER BY priority DESC
>
> None of the open issues appear to be a regression from 2.1.0, but those
> seem more likely to show up during the RC process (thanks in advance to
> everyone testing their workloads :)) & generally none of them seem to be
>
> (Note: the cfs are for Target Version/s field)
>
> Critical Issues:
>  SQL:
>   SPARK-19690 <https://issues.apache.org/jira/browse/SPARK-19690> - Join
> a streaming DataFrame with a batch DataFrame may not work - PR
> https://github.com/apache/spark/pull/17052 (review in progress by
> zsxwing, currently failing Jenkins)*
>
> Major Issues:
>  SQL:
>   SPARK-19035 <https://issues.apache.org/jira/browse/SPARK-19035> - rand()
> function in case when cause failed - no outstanding PR (consensus on JIRA
> seems to be leaning towards it being a real issue but not necessarily
> everyone agrees just yet - maybe we should slip this?)*
>  Deploy:
>   SPARK-19522 <https://issues.apache.org/jira/browse/SPARK-19522>
>  - --executor-memory flag doesn't work in local-cluster mode -
> https://github.com/apache/spark/pull/16975 (review in progress by vanzin,
> but PR currently stalled waiting on response) *
>  Core:
>   SPARK-20025 <https://issues.apache.org/jira/browse/SPARK-20025> - Driver
> fail over will not work, if SPARK_LOCAL* env is set. -
> https://github.com/apache/spark/pull/17357 (waiting on review) *
>  PySpark:
>  SPARK-19955 <https://issues.apache.org/jira/browse/SPARK-19955> - Update
> run-tests to support conda [ Part of Dropping 2.6 support -- which we
> shouldn't do in a minor release -- but also fixes pip installability tests
> to run in Jenkins ]-  PR failing Jenkins (I need to poke this some more,
> but seems like 2.7 support works but some other issues. Maybe slip to 2.2?)
>
> Minor issues:
>  Tests:
>   SPARK-19612 <https://issues.apache.org/jira/browse/SPARK-19612> - Tests
> failing with timeout - No PR per-se but it seems unrelated to the 2.1.1
> release. It's not targetted for 2.1.1 but listed as affecting 2.1.1 - I'd
> consider explicitly targeting this for 2.2?
>  PySpark:
>   SPARK-19570 <https://issues.apache.org/jira/browse/SPARK-19570> - Allow
> to disable hive in pyspark shell - https://github.com/apache/sp
> ark/pull/16906 PR exists but its difficult to add automated tests for
> this (although if SPARK-19955
> <https://issues.apache.org/jira/browse/SPARK-19955> gets in would make
> testing this easier) - no reviewers yet. Possible re-target?*
>  Structured Streaming:
>   SPARK-19613 <https://issues.apache.org/jira/browse/SPARK-19613> - Flaky
> test: StateStoreRDDSuite.versioning and immutability - It's not targetted
> for 2.1.1 but listed as affecting 2.1.1 - I'd consider explicitly targeting
> this for 2.2?
>  ML:
>   SPARK-19759 <https://issues.apache.org/jira/browse/SPARK-19759>
>  - ALSModel.predict on Dataframes : potential optimization by not using
> blas - No PR consider re-targeting unless someone has a PR waiting in the
> wings?
>
> Explicitly targeted issues are marked with a *, the remaining issues are
> listed as impacting 2.1.1 and don't have a specific target version set.
>
> Since 2.1.1 continues the 2.1.0 branch, looking at 2.1.0 shows 1 open
> blocker in SQL( SPARK-19983
> <https://issues.apache.org/jira/browse/SPARK-19983> ),
>
> Query string is:
>
> affectedVersion = 2.1.0 AND cf[12310320] is EMPTY AND project = spark AND
> resolution = Unresolved AND priority = targetPriority
>
> Continuing on for unresolved 2.1.0 issues in Major there are 163 (76 of
> them in progress), 65 Minor (26 in progress), and 9 trivial (6 in progress).
>
> I'll be going through the 2.1.0 major issues with open PRs that impact the
> PySpark component and seeing if any of them should be targeted for 2.1.1,
> if anyone from the other components wants to take a look through we might
> find some easy wins to be merged.
>
> Cheers,
>
> Holden :)
>
> --
> Cell : 425-233-8271 <(425)%20233-8271>
> Twitter: https://twitter.com/holdenkarau
>

Re: Outstanding Spark 2.1.1 issues

Reply via email to