Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-12 Thread Marcelo Vanzin
Another failing test is "ReplSuite:should clone and clean line object
in ClosureCleaner". It never passes for me, just keeps spinning until
the JVM eventually starts throwing OOM errors. Anyone seeing that?

On Thu, Dec 8, 2016 at 12:39 AM, Reynold Xin  wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.1.0. The vote is open until Sun, December 11, 2016 at 1:00 PT and passes
> if a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.1.0
> [ ] -1 Do not release this package because ...
>
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.1.0-rc2
> (080717497365b83bc202ab16812ced93eb1ea7bd)
>
> List of JIRA tickets resolved are:
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.0
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1217
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-docs/
>
>
> (Note that the docs and staging repo are still being uploaded and will be
> available soon)
>
>
> ===
> How can I help test this release?
> ===
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> ===
> What should happen to JIRA tickets still targeting 2.1.0?
> ===
> Committers should look at those and triage. Extremely important bug fixes,
> documentation, and API tweaks that impact compatibility should be worked on
> immediately. Everything else please retarget to 2.1.1 or 2.2.0.



-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-12 Thread Yin Huai
-1

I hit https://issues.apache.org/jira/browse/SPARK-18816, which prevents the
executor page from showing the log links if an application does not have
executors initially.

On Mon, Dec 12, 2016 at 3:02 PM, Marcelo Vanzin  wrote:

> Actually this is not a simple pom change. The code in
> UDFRegistration.scala calls this method:
>
>   if (returnType == null) {
>     returnType = JavaTypeInference.inferDataType(TypeToken.of(udfReturnType))._1
>   }
>
> Because we shade guava, it's generally not very safe to call methods
> in different modules that expose shaded APIs. Can this code be
> modified to call the variant that just takes a java.lang.Class instead
> of Guava's TypeToken? It seems like that would work, since that method
> basically just wraps the argument with "TypeToken.of".
>
>
>
> On Mon, Dec 12, 2016 at 2:03 PM, Marcelo Vanzin  wrote:
> > I'm running into this when building / testing on 1.7 (haven't tried 1.8):
> >
> > udf3Test(test.org.apache.spark.sql.JavaUDFSuite)  Time elapsed: 0.079
> > sec  <<< ERROR!
> > java.lang.NoSuchMethodError:
> > org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(Lcom/google/common/reflect/TypeToken;)Lscala/Tuple2;
> >     at test.org.apache.spark.sql.JavaUDFSuite.udf3Test(JavaUDFSuite.java:107)
> >
> >
> > Results :
> >
> > Tests in error:
> >  JavaUDFSuite.udf3Test:107 » NoSuchMethod
> > org.apache.spark.sql.catalyst.JavaTyp...
> >
> >
> > Given the error I'm mostly sure it's something easily fixable by
> > adding Guava explicitly in the pom, so probably shouldn't block
> > anything.
> >
> >
> > On Thu, Dec 8, 2016 at 12:39 AM, Reynold Xin  wrote:
> >> Please vote on releasing the following candidate as Apache Spark version
> >> 2.1.0. The vote is open until Sun, December 11, 2016 at 1:00 PT and passes
> >> if a majority of at least 3 +1 PMC votes are cast.
> >>
> >> [ ] +1 Release this package as Apache Spark 2.1.0
> >> [ ] -1 Do not release this package because ...
> >>
> >>
> >> To learn more about Apache Spark, please see http://spark.apache.org/
> >>
> >> The tag to be voted on is v2.1.0-rc2
> >> (080717497365b83bc202ab16812ced93eb1ea7bd)
> >>
> >> List of JIRA tickets resolved are:
> >> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.0
> >>
> >> The release files, including signatures, digests, etc. can be found at:
> >> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-bin/
> >>
> >> Release artifacts are signed with the following key:
> >> https://people.apache.org/keys/committer/pwendell.asc
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapachespark-1217
> >>
> >> The documentation corresponding to this release can be found at:
> >> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-docs/
> >>
> >>
> >> (Note that the docs and staging repo are still being uploaded and will be
> >> available soon)
> >>
> >>
> >> ===
> >> How can I help test this release?
> >> ===
> >> If you are a Spark user, you can help us test this release by taking an
> >> existing Spark workload and running on this release candidate, then
> >> reporting any regressions.
> >>
> >> ===
> >> What should happen to JIRA tickets still targeting 2.1.0?
> >> ===
> >> Committers should look at those and triage. Extremely important bug fixes,
> >> documentation, and API tweaks that impact compatibility should be worked on
> >> immediately. Everything else please retarget to 2.1.1 or 2.2.0.
> >
> >
> >
> > --
> > Marcelo
>
>
>
> --
> Marcelo
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-12 Thread Marcelo Vanzin
Actually this is not a simple pom change. The code in
UDFRegistration.scala calls this method:

  if (returnType == null) {
    returnType = JavaTypeInference.inferDataType(TypeToken.of(udfReturnType))._1
  }

Because we shade guava, it's generally not very safe to call methods
in different modules that expose shaded APIs. Can this code be
modified to call the variant that just takes a java.lang.Class instead
of Guava's TypeToken? It seems like that would work, since that method
basically just wraps the argument with "TypeToken.of".
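For illustration, the change suggested above might look like this at the call site in UDFRegistration.scala. This is only a sketch: it assumes an inferDataType overload taking a plain java.lang.Class exists (or is added), per the suggestion.

```scala
// Sketch of the suggested fix: pass the plain java.lang.Class across the
// module boundary instead of Guava's TypeToken, so no shaded type appears
// in the cross-module method signature. Assumes an overload
// JavaTypeInference.inferDataType(cls: Class[_]) is available.
if (returnType == null) {
  // Before: JavaTypeInference.inferDataType(TypeToken.of(udfReturnType))._1
  returnType = JavaTypeInference.inferDataType(udfReturnType)._1
}
```

Since the TypeToken variant reportedly just wraps its argument with TypeToken.of, the two calls should be behaviorally equivalent while avoiding the shaded signature.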



On Mon, Dec 12, 2016 at 2:03 PM, Marcelo Vanzin  wrote:
> I'm running into this when building / testing on 1.7 (haven't tried 1.8):
>
> udf3Test(test.org.apache.spark.sql.JavaUDFSuite)  Time elapsed: 0.079
> sec  <<< ERROR!
> java.lang.NoSuchMethodError:
> org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(Lcom/google/common/reflect/TypeToken;)Lscala/Tuple2;
>     at test.org.apache.spark.sql.JavaUDFSuite.udf3Test(JavaUDFSuite.java:107)
>
>
> Results :
>
> Tests in error:
>  JavaUDFSuite.udf3Test:107 » NoSuchMethod
> org.apache.spark.sql.catalyst.JavaTyp...
>
>
> Given the error I'm mostly sure it's something easily fixable by
> adding Guava explicitly in the pom, so probably shouldn't block
> anything.
>
>
> On Thu, Dec 8, 2016 at 12:39 AM, Reynold Xin  wrote:
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.1.0. The vote is open until Sun, December 11, 2016 at 1:00 PT and passes
>> if a majority of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Spark 2.1.0
>> [ ] -1 Do not release this package because ...
>>
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v2.1.0-rc2
>> (080717497365b83bc202ab16812ced93eb1ea7bd)
>>
>> List of JIRA tickets resolved are:
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.0
>>
>> The release files, including signatures, digests, etc. can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-bin/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/pwendell.asc
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1217
>>
>> The documentation corresponding to this release can be found at:
>> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-docs/
>>
>>
>> (Note that the docs and staging repo are still being uploaded and will be
>> available soon)
>>
>>
>> ===
>> How can I help test this release?
>> ===
>> If you are a Spark user, you can help us test this release by taking an
>> existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> ===
>> What should happen to JIRA tickets still targeting 2.1.0?
>> ===
>> Committers should look at those and triage. Extremely important bug fixes,
>> documentation, and API tweaks that impact compatibility should be worked on
>> immediately. Everything else please retarget to 2.1.1 or 2.2.0.
>
>
>
> --
> Marcelo



-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] Apache Spark 2.1.0 (RC2)

2016-12-12 Thread Marcelo Vanzin
I'm running into this when building / testing on 1.7 (haven't tried 1.8):

udf3Test(test.org.apache.spark.sql.JavaUDFSuite)  Time elapsed: 0.079
sec  <<< ERROR!
java.lang.NoSuchMethodError:
org.apache.spark.sql.catalyst.JavaTypeInference$.inferDataType(Lcom/google/common/reflect/TypeToken;)Lscala/Tuple2;
    at test.org.apache.spark.sql.JavaUDFSuite.udf3Test(JavaUDFSuite.java:107)


Results :

Tests in error:
 JavaUDFSuite.udf3Test:107 » NoSuchMethod
org.apache.spark.sql.catalyst.JavaTyp...


Given the error I'm mostly sure it's something easily fixable by
adding Guava explicitly in the pom, so probably shouldn't block
anything.
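If the pom route were taken, it would mean declaring Guava explicitly in the affected module, along these lines. The scope and version here are illustrative assumptions only; the real values should come from Spark's parent pom.

```xml
<!-- Illustrative only: make the unshaded Guava classes visible on the
     compile/test classpath. Actual version and scope should follow the
     parent pom's dependencyManagement. -->
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>14.0.1</version>
  <scope>provided</scope>
</dependency>
```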


On Thu, Dec 8, 2016 at 12:39 AM, Reynold Xin  wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.1.0. The vote is open until Sun, December 11, 2016 at 1:00 PT and passes
> if a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.1.0
> [ ] -1 Do not release this package because ...
>
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.1.0-rc2
> (080717497365b83bc202ab16812ced93eb1ea7bd)
>
> List of JIRA tickets resolved are:
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.0
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1217
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-releases/spark-2.1.0-rc2-docs/
>
>
> (Note that the docs and staging repo are still being uploaded and will be
> available soon)
>
>
> ===
> How can I help test this release?
> ===
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> ===
> What should happen to JIRA tickets still targeting 2.1.0?
> ===
> Committers should look at those and triage. Extremely important bug fixes,
> documentation, and API tweaks that impact compatibility should be worked on
> immediately. Everything else please retarget to 2.1.1 or 2.2.0.



-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Aggregating over sorted data

2016-12-12 Thread nsyca
Hi,

SPARK-18591 might be a solution to your problem, but making assumptions in
your UDAF logic about how Spark will process the aggregation is risky. Is
there a way to do it using a window function with an ORDER BY clause to
enforce the processing order in this case?
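For example, a window with an explicit ORDER BY might look like this. This is a sketch only: it assumes an active SparkSession and a DataFrame df with hypothetical columns key, ts, and value.

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.collect_list

// Make the ordering explicit instead of assuming the UDAF sees rows in
// sorted order: per key, order rows by ts and gather values over the
// whole partition.
val w = Window
  .partitionBy("key")
  .orderBy("ts")
  .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing)

val withOrdered = df.withColumn("values_in_order", collect_list("value").over(w))
```

Each row then carries the per-key values in timestamp order, so any downstream aggregation no longer depends on Spark's internal processing order.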




-
Nattavut Sutyanyong | @nsyca
Spark Technology Center
http://www.spark.tc/
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org