Re: [VOTE] Apache Spark 2.1.1 (RC3)

2017-04-20 Thread Michael Allman
We've identified the cause of the change in behavior. It is related to the SQL
conf key "spark.sql.hive.caseSensitiveInferenceMode". This key and its related
functionality were absent from our previous build. The default setting in the
current build was causing Spark to attempt to scan all table files during query
analysis. Changing this setting to NEVER_INFER disabled that scan and resolved
the issue for us.
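
For anyone hitting the same slowdown, here is a minimal sketch of how the
setting can be applied when creating the session (a hypothetical Hive-enabled
PySpark session; the conf key and the NEVER_INFER value are the ones named
above, everything else is illustrative):

# Sketch only: NEVER_INFER skips the case-sensitive schema inference pass,
# so Spark no longer scans the table's data files during query analysis.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.sql.hive.caseSensitiveInferenceMode", "NEVER_INFER")
         .enableHiveSupport()
         .getOrCreate())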

Michael


> On Apr 20, 2017, at 3:42 PM, Michael Allman  wrote:
> 
> I want to caution that in testing a build from this morning's branch-2.1 we 
> found that Hive partition pruning was not working. We found that Spark SQL 
> was fetching all Hive table partitions for a very simple query whereas in a 
> build from several weeks ago it was fetching only the required partitions. I 
> cannot currently think of a reason for the regression outside of some 
> difference between branch-2.1 from our previous build and branch-2.1 from 
> this morning.
> 
> That's all I know right now. We are actively investigating to find the root 
> cause of this problem, and specifically whether this is a problem in the 
> Spark codebase or not. I will report back when I have an answer to that 
> question.
> 
> Michael
> 
> 
>> On Apr 18, 2017, at 11:59 AM, Michael Armbrust wrote:
>> 
>> Please vote on releasing the following candidate as Apache Spark version 
>> 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and passes 
>> if a majority of at least 3 +1 PMC votes are cast.
>> 
>> [ ] +1 Release this package as Apache Spark 2.1.1
>> [ ] -1 Do not release this package because ...
>> 
>> 
>> To learn more about Apache Spark, please see http://spark.apache.org/ 
>> 
>> 
>> The tag to be voted on is v2.1.1-rc3 (2ed19cff2f6ab79a718526e5d16633412d8c4dd4)
>> 
>> List of JIRA tickets resolved can be found with this filter.
>> 
>> The release files, including signatures, digests, etc. can be found at:
>> http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/ 
>> 
>> 
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc 
>> 
>> 
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1230/ 
>> 
>> 
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/ 
>> 
>> 
>> 
>> FAQ
>> 
>> How can I help test this release?
>> 
>> If you are a Spark user, you can help us test this release by taking an 
>> existing Spark workload and running on this release candidate, then 
>> reporting any regressions.
>> 
>> What should happen to JIRA tickets still targeting 2.1.1?
>> 
>> Committers should look at those and triage. Extremely important bug fixes, 
>> documentation, and API tweaks that impact compatibility should be worked on 
>> immediately. Everything else please retarget to 2.1.2 or 2.2.0.
>> 
>> But my bug isn't fixed!??!
>> 
>> In order to make timely releases, we will typically not hold the release 
>> unless the bug in question is a regression from 2.1.0.
>> 
>> What happened to RC1?
>> 
>> There were issues with the release packaging and as a result it was skipped.
> 



Re: [VOTE] Apache Spark 2.1.1 (RC3)

2017-04-20 Thread Michael Allman
I want to caution that in testing a build from this morning's branch-2.1 we 
found that Hive partition pruning was not working. We found that Spark SQL was 
fetching all Hive table partitions for a very simple query whereas in a build 
from several weeks ago it was fetching only the required partitions. I cannot 
currently think of a reason for the regression outside of some difference 
between branch-2.1 from our previous build and branch-2.1 from this morning.

That's all I know right now. We are actively investigating to find the root 
cause of this problem, and specifically whether this is a problem in the Spark 
codebase or not. I will report back when I have an answer to that question.

Michael


> On Apr 18, 2017, at 11:59 AM, Michael Armbrust  wrote:
> 
> Please vote on releasing the following candidate as Apache Spark version 
> 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and passes 
> if a majority of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Spark 2.1.1
> [ ] -1 Do not release this package because ...
> 
> 
> To learn more about Apache Spark, please see http://spark.apache.org/ 
> 
> 
> The tag to be voted on is v2.1.1-rc3 (2ed19cff2f6ab79a718526e5d16633412d8c4dd4)
> 
> List of JIRA tickets resolved can be found with this filter.
> 
> The release files, including signatures, digests, etc. can be found at:
> http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/ 
> 
> 
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc 
> 
> 
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1230/ 
> 
> 
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/ 
> 
> 
> 
> FAQ
> 
> How can I help test this release?
> 
> If you are a Spark user, you can help us test this release by taking an 
> existing Spark workload and running on this release candidate, then reporting 
> any regressions.
> 
> What should happen to JIRA tickets still targeting 2.1.1?
> 
> Committers should look at those and triage. Extremely important bug fixes, 
> documentation, and API tweaks that impact compatibility should be worked on 
> immediately. Everything else please retarget to 2.1.2 or 2.2.0.
> 
> But my bug isn't fixed!??!
> 
> In order to make timely releases, we will typically not hold the release 
> unless the bug in question is a regression from 2.1.0.
> 
> What happened to RC1?
> 
> There were issues with the release packaging and as a result it was skipped.



Re: New Optimizer Hint

2017-04-20 Thread Reynold Xin
Doesn't common subexpression elimination address this issue as well?

On Thu, Apr 20, 2017 at 6:40 AM Herman van Hövell tot Westerflier <
hvanhov...@databricks.com> wrote:

> Hi Michael,
>
> This sounds like a good idea. Can you open a JIRA to track this?
>
> My initial feedback on your proposal would be that you might want to
> express the no_collapse at the expression level and not at the plan level.
>
> HTH
>
> On Thu, Apr 20, 2017 at 3:31 PM, Michael Styles <
> michael.sty...@shopify.com> wrote:
>
>> Hello,
>>
>> I am in the process of putting together a PR that introduces a new hint
>> called NO_COLLAPSE. This hint is essentially identical to Oracle's NO_MERGE
>> hint.
>>
>> Let me first give an example of why I am proposing this.
>>
>> df1 = sc.sql.createDataFrame([(1, "abc")], ["id", "user_agent"])
>> df2 = df1.withColumn("ua", user_agent_details(df1["user_agent"]))
>> df3 = df2.select(df2["ua"].device_form_factor.alias("c1"),
>> df2["ua"].browser_version.alias("c2"))
>> df3.explain(True)
>>
>> == Parsed Logical Plan ==
>> 'Project [ua#85[device_form_factor] AS c1#90, ua#85[browser_version] AS
>> c2#91]
>> +- Project [id#80L, user_agent#81, UDF(user_agent#81) AS ua#85]
>>+- LogicalRDD [id#80L, user_agent#81]
>>
>> == Analyzed Logical Plan ==
>> c1: string, c2: string
>> Project [ua#85.device_form_factor AS c1#90, ua#85.browser_version AS
>> c2#91]
>> +- Project [id#80L, user_agent#81, UDF(user_agent#81) AS ua#85]
>>+- LogicalRDD [id#80L, user_agent#81]
>>
>> == Optimized Logical Plan ==
>> Project [UDF(user_agent#81).device_form_factor AS c1#90,
>> UDF(user_agent#81).browser_version AS c2#91]
>> +- LogicalRDD [id#80L, user_agent#81]
>>
>> == Physical Plan ==
>> *Project [UDF(user_agent#81).device_form_factor AS c1#90,
>> UDF(user_agent#81).browser_version AS c2#91]
>> +- Scan ExistingRDD[id#80L,user_agent#81]
>>
>> user_agent_details is a user-defined function that returns a struct. As
>> can be seen from the generated query plan, the function is being executed
>> multiple times which could lead to performance issues. This is due to the
>> CollapseProject optimizer rule that collapses adjacent projections.
>>
>> I'm proposing a hint that prevents the optimizer from collapsing adjacent
>> projections. A new function called 'no_collapse' would be introduced for
>> this purpose. Consider the following example and generated query plan.
>>
>> df1 = sc.sql.createDataFrame([(1, "abc")], ["id", "user_agent"])
>> df2 = F.no_collapse(df1.withColumn("ua",
>> user_agent_details(df1["user_agent"])))
>> df3 = df2.select(df2["ua"].device_form_factor.alias("c1"),
>> df2["ua"].browser_version.alias("c2"))
>> df3.explain(True)
>>
>> == Parsed Logical Plan ==
>> 'Project [ua#69[device_form_factor] AS c1#75, ua#69[browser_version] AS
>> c2#76]
>> +- NoCollapseHint
>>+- Project [id#64L, user_agent#65, UDF(user_agent#65) AS ua#69]
>>   +- LogicalRDD [id#64L, user_agent#65]
>>
>> == Analyzed Logical Plan ==
>> c1: string, c2: string
>> Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS
>> c2#76]
>> +- NoCollapseHint
>>+- Project [id#64L, user_agent#65, UDF(user_agent#65) AS ua#69]
>>   +- LogicalRDD [id#64L, user_agent#65]
>>
>> == Optimized Logical Plan ==
>> Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS
>> c2#76]
>> +- NoCollapseHint
>>+- Project [UDF(user_agent#65) AS ua#69]
>>   +- LogicalRDD [id#64L, user_agent#65]
>>
>> == Physical Plan ==
>> *Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS
>> c2#76]
>> +- *Project [UDF(user_agent#65) AS ua#69]
>>+- Scan ExistingRDD[id#64L,user_agent#65]
>>
>> As can be seen from the query plan, the user-defined function is now
>> evaluated once per row.
>>
>> I would like to get some feedback on this proposal.
>>
>> Thanks.
>>
>>
>
>
> --
>
> Herman van Hövell
>
> Software Engineer
>
> Databricks Inc.
>
> hvanhov...@databricks.com
>
> +31 6 420 590 27
>
> databricks.com
>


Re: [VOTE] Apache Spark 2.1.1 (RC3)

2017-04-20 Thread Nicholas Chammas
Steve,

I think you're a good person to ask about this. Is the below any cause for
concern? Or did I perhaps test this incorrectly?
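
For reference, since the exact launch command isn't shown below, here is an
illustrative sketch of how the package gets pulled in — equivalent to passing
--packages org.apache.hadoop:hadoop-aws:2.7.3 to the shell. The jersey,
jettison, and jaxb jars in the failure output are transitive dependencies
resolved during this step.

# Illustrative sketch: request hadoop-aws 2.7.3 at session startup; Ivy then
# resolves it and its transitive dependencies from the configured repositories.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:2.7.3")
         .getOrCreate())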

Nick


On Tue, Apr 18, 2017 at 11:50 PM Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:

> I had trouble starting up a shell with the AWS package loaded
> (specifically, org.apache.hadoop:hadoop-aws:2.7.3):
>
>
> [NOT FOUND  ] 
> com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
>
>  local-m2-cache: tried
>
>   
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
>
> [NOT FOUND  ] 
> org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
>
>  local-m2-cache: tried
>
>   
> file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
>
> [NOT FOUND  ] 
> com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
>
>  local-m2-cache: tried
>
>   
> file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
>
> ::
>
> ::  FAILED DOWNLOADS::
>
> :: ^ see resolution messages for details  ^ ::
>
> ::
>
> :: com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle)
>
> :: org.codehaus.jettison#jettison;1.1!jettison.jar(bundle)
>
> :: com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar
>
> :: com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle)
>
> ::
>
> Anyone know anything about this? I made sure to build Spark against the
> appropriate version of Hadoop.
>
> Nick
>
> On Tue, Apr 18, 2017 at 2:59 PM Michael Armbrust 
> wrote:
>
> Please vote on releasing the following candidate as Apache Spark version
>> 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and
>> passes if a majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 2.1.1
>> [ ] -1 Do not release this package because ...
>>
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v2.1.1-rc3 (2ed19cff2f6ab79a718526e5d16633412d8c4dd4)
>>
>> List of JIRA tickets resolved can be found with this filter.
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1230/
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/
>>
>>
>> *FAQ*
>>
>> *How can I help test this release?*
>>
>> If you are a Spark user, you can help us test this release by taking an
>> existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> *What should happen to JIRA tickets still targeting 2.1.1?*
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should be
>> worked on immediately. Everything else please retarget to 2.1.2 or 2.2.0.
>>
>> *But my bug isn't fixed!??!*
>>
>> In order to make timely releases, we will typically not hold the release
>> unless the bug in question is a regression from 2.1.0.
>>
>> *What happened to RC1?*
>>
>> There were issues with the release packaging and as a result it was skipped.
>>
> ​
>


Re: New Optimizer Hint

2017-04-20 Thread Herman van Hövell tot Westerflier
Hi Michael,

This sounds like a good idea. Can you open a JIRA to track this?

My initial feedback on your proposal would be that you might want to
express the no_collapse at the expression level and not at the plan level.
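
Something along these lines, purely as an illustration (no such function
exists today, and the exact API is yours to pick; df1 and user_agent_details
are from your example below):

# Hypothetical expression-level usage: mark just the expression instead of
# wrapping the whole DataFrame/plan in a hint node.
df2 = df1.withColumn("ua", F.no_collapse(user_agent_details(df1["user_agent"])))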

HTH

On Thu, Apr 20, 2017 at 3:31 PM, Michael Styles 
wrote:

> Hello,
>
> I am in the process of putting together a PR that introduces a new hint
> called NO_COLLAPSE. This hint is essentially identical to Oracle's NO_MERGE
> hint.
>
> Let me first give an example of why I am proposing this.
>
> df1 = sc.sql.createDataFrame([(1, "abc")], ["id", "user_agent"])
> df2 = df1.withColumn("ua", user_agent_details(df1["user_agent"]))
> df3 = df2.select(df2["ua"].device_form_factor.alias("c1"),
> df2["ua"].browser_version.alias("c2"))
> df3.explain(True)
>
> == Parsed Logical Plan ==
> 'Project [ua#85[device_form_factor] AS c1#90, ua#85[browser_version] AS
> c2#91]
> +- Project [id#80L, user_agent#81, UDF(user_agent#81) AS ua#85]
>+- LogicalRDD [id#80L, user_agent#81]
>
> == Analyzed Logical Plan ==
> c1: string, c2: string
> Project [ua#85.device_form_factor AS c1#90, ua#85.browser_version AS
> c2#91]
> +- Project [id#80L, user_agent#81, UDF(user_agent#81) AS ua#85]
>+- LogicalRDD [id#80L, user_agent#81]
>
> == Optimized Logical Plan ==
> Project [UDF(user_agent#81).device_form_factor AS c1#90,
> UDF(user_agent#81).browser_version AS c2#91]
> +- LogicalRDD [id#80L, user_agent#81]
>
> == Physical Plan ==
> *Project [UDF(user_agent#81).device_form_factor AS c1#90,
> UDF(user_agent#81).browser_version AS c2#91]
> +- Scan ExistingRDD[id#80L,user_agent#81]
>
> user_agent_details is a user-defined function that returns a struct. As
> can be seen from the generated query plan, the function is being executed
> multiple times which could lead to performance issues. This is due to the
> CollapseProject optimizer rule that collapses adjacent projections.
>
> I'm proposing a hint that prevents the optimizer from collapsing adjacent
> projections. A new function called 'no_collapse' would be introduced for
> this purpose. Consider the following example and generated query plan.
>
> df1 = sc.sql.createDataFrame([(1, "abc")], ["id", "user_agent"])
> df2 = F.no_collapse(df1.withColumn("ua",
> user_agent_details(df1["user_agent"])))
> df3 = df2.select(df2["ua"].device_form_factor.alias("c1"),
> df2["ua"].browser_version.alias("c2"))
> df3.explain(True)
>
> == Parsed Logical Plan ==
> 'Project [ua#69[device_form_factor] AS c1#75, ua#69[browser_version] AS
> c2#76]
> +- NoCollapseHint
>+- Project [id#64L, user_agent#65, UDF(user_agent#65) AS ua#69]
>   +- LogicalRDD [id#64L, user_agent#65]
>
> == Analyzed Logical Plan ==
> c1: string, c2: string
> Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS
> c2#76]
> +- NoCollapseHint
>+- Project [id#64L, user_agent#65, UDF(user_agent#65) AS ua#69]
>   +- LogicalRDD [id#64L, user_agent#65]
>
> == Optimized Logical Plan ==
> Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS
> c2#76]
> +- NoCollapseHint
>+- Project [UDF(user_agent#65) AS ua#69]
>   +- LogicalRDD [id#64L, user_agent#65]
>
> == Physical Plan ==
> *Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS
> c2#76]
> +- *Project [UDF(user_agent#65) AS ua#69]
>+- Scan ExistingRDD[id#64L,user_agent#65]
>
> As can be seen from the query plan, the user-defined function is now
> evaluated once per row.
>
> I would like to get some feedback on this proposal.
>
> Thanks.
>
>


-- 

Herman van Hövell

Software Engineer

Databricks Inc.

hvanhov...@databricks.com

+31 6 420 590 27

databricks.com



New Optimizer Hint

2017-04-20 Thread Michael Styles
Hello,

I am in the process of putting together a PR that introduces a new hint
called NO_COLLAPSE. This hint is essentially identical to Oracle's NO_MERGE
hint.

Let me first give an example of why I am proposing this.

df1 = sc.sql.createDataFrame([(1, "abc")], ["id", "user_agent"])
df2 = df1.withColumn("ua", user_agent_details(df1["user_agent"]))
df3 = df2.select(df2["ua"].device_form_factor.alias("c1"),
df2["ua"].browser_version.alias("c2"))
df3.explain(True)

== Parsed Logical Plan ==
'Project [ua#85[device_form_factor] AS c1#90, ua#85[browser_version] AS
c2#91]
+- Project [id#80L, user_agent#81, UDF(user_agent#81) AS ua#85]
   +- LogicalRDD [id#80L, user_agent#81]

== Analyzed Logical Plan ==
c1: string, c2: string
Project [ua#85.device_form_factor AS c1#90, ua#85.browser_version AS c2#91]
+- Project [id#80L, user_agent#81, UDF(user_agent#81) AS ua#85]
   +- LogicalRDD [id#80L, user_agent#81]

== Optimized Logical Plan ==
Project [UDF(user_agent#81).device_form_factor AS c1#90,
UDF(user_agent#81).browser_version AS c2#91]
+- LogicalRDD [id#80L, user_agent#81]

== Physical Plan ==
*Project [UDF(user_agent#81).device_form_factor AS c1#90,
UDF(user_agent#81).browser_version AS c2#91]
+- Scan ExistingRDD[id#80L,user_agent#81]

user_agent_details is a user-defined function that returns a struct. As can
be seen from the generated query plan, the function is being executed
multiple times which could lead to performance issues. This is due to the
CollapseProject optimizer rule that collapses adjacent projections.

I'm proposing a hint that prevents the optimizer from collapsing adjacent
projections. A new function called 'no_collapse' would be introduced for
this purpose. Consider the following example and generated query plan.

df1 = sc.sql.createDataFrame([(1, "abc")], ["id", "user_agent"])
df2 = F.no_collapse(df1.withColumn("ua",
user_agent_details(df1["user_agent"])))
df3 = df2.select(df2["ua"].device_form_factor.alias("c1"),
df2["ua"].browser_version.alias("c2"))
df3.explain(True)

== Parsed Logical Plan ==
'Project [ua#69[device_form_factor] AS c1#75, ua#69[browser_version] AS
c2#76]
+- NoCollapseHint
   +- Project [id#64L, user_agent#65, UDF(user_agent#65) AS ua#69]
  +- LogicalRDD [id#64L, user_agent#65]

== Analyzed Logical Plan ==
c1: string, c2: string
Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS c2#76]
+- NoCollapseHint
   +- Project [id#64L, user_agent#65, UDF(user_agent#65) AS ua#69]
  +- LogicalRDD [id#64L, user_agent#65]

== Optimized Logical Plan ==
Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS c2#76]
+- NoCollapseHint
   +- Project [UDF(user_agent#65) AS ua#69]
  +- LogicalRDD [id#64L, user_agent#65]

== Physical Plan ==
*Project [ua#69.device_form_factor AS c1#75, ua#69.browser_version AS
c2#76]
+- *Project [UDF(user_agent#65) AS ua#69]
   +- Scan ExistingRDD[id#64L,user_agent#65]

As can be seen from the query plan, the user-defined function is now
evaluated once per row.
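
As an aside — and as I understand the current behavior, so treat this as a
sketch rather than a guarantee — the workaround available today is to cache
the intermediate DataFrame, which materializes the UDF output once and stops
the two projections from being collapsed across the cache boundary:

# Sketch of a cache-based workaround (not part of the proposal): persisting
# df2 puts an InMemoryRelation between the two projections, so the field
# accesses read the already-computed struct instead of re-running the UDF.
df2 = df1.withColumn("ua", user_agent_details(df1["user_agent"])).cache()
df3 = df2.select(df2["ua"].device_form_factor.alias("c1"),
                 df2["ua"].browser_version.alias("c2"))

The hint would give the same single evaluation without paying the memory cost
of caching.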

I would like to get some feedback on this proposal.

Thanks.


Re: [VOTE] Apache Spark 2.1.1 (RC3)

2017-04-20 Thread Denny Lee
+1 (non-binding)


On Wed, Apr 19, 2017 at 9:23 PM Dong Joon Hyun 
wrote:

> +1
>
> I tested RC3 on CentOS 7.3.1611/OpenJDK 1.8.0_121/R 3.3.3
> with `-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver
> -Psparkr`
>
> At the end of R test, I saw `Had CRAN check errors; see logs.`,
> but tests passed and log file looks good.
>
> Bests,
> Dongjoon.
>
> From: Reynold Xin 
> Date: Wednesday, April 19, 2017 at 3:41 PM
> To: Marcelo Vanzin 
> Cc: Michael Armbrust , "dev@spark.apache.org" <
> dev@spark.apache.org>
> Subject: Re: [VOTE] Apache Spark 2.1.1 (RC3)
>
> +1
>
> On Wed, Apr 19, 2017 at 3:31 PM, Marcelo Vanzin 
> wrote:
>
>> +1 (non-binding).
>>
>> Ran the hadoop-2.6 binary against our internal tests and things look good.
>>
>> On Tue, Apr 18, 2017 at 11:59 AM, Michael Armbrust
>>  wrote:
>> > Please vote on releasing the following candidate as Apache Spark version
>> > 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and
>> passes
>> > if a majority of at least 3 +1 PMC votes are cast.
>> >
>> > [ ] +1 Release this package as Apache Spark 2.1.1
>> > [ ] -1 Do not release this package because ...
>> >
>> >
>> > To learn more about Apache Spark, please see http://spark.apache.org/
>> >
>> > The tag to be voted on is v2.1.1-rc3
>> > (2ed19cff2f6ab79a718526e5d16633412d8c4dd4)
>> >
>> > List of JIRA tickets resolved can be found with this filter.
>> >
>> > The release files, including signatures, digests, etc. can be found at:
>> > http://home.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-bin/
>> >
>> > Release artifacts are signed with the following key:
>> > https://people.apache.org/keys/committer/pwendell.asc
>> >
>> > The staging repository for this release can be found at:
>> > https://repository.apache.org/content/repositories/orgapachespark-1230/
>> >
>> > The documentation corresponding to this release can be found at:
>> > http://people.apache.org/~pwendell/spark-releases/spark-2.1.1-rc3-docs/
>> >
>> >
>> > FAQ
>> >
>> > How can I help test this release?
>> >
>> > If you are a Spark user, you can help us test this release by taking an
>> > existing Spark workload and running on this release candidate, then
>> > reporting any regressions.
>> >
>> > What should happen to JIRA tickets still targeting 2.1.1?
>> >
>> > Committers should look at those and triage. Extremely important bug
>> fixes,
>> > documentation, and API tweaks that impact compatibility should be
>> worked on
>> > immediately. Everything else please retarget to 2.1.2 or 2.2.0.
>> >
>> > But my bug isn't fixed!??!
>> >
>> > In order to make timely releases, we will typically not hold the release
>> > unless the bug in question is a regression from 2.1.0.
>> >
>> > What happened to RC1?
>> >
>> > There were issues with the release packaging and as a result it was
>> > skipped.
>>
>>
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>