Actually, I opened https://issues.apache.org/jira/browse/SPARK-21093.

2017-06-14 17:08 GMT+09:00 Hyukjin Kwon <gurwls...@gmail.com>:

> For a shorter reproducer ...
>
>
> df <- createDataFrame(list(list(1L, 1, "1", 0.1)), c("a", "b", "c", "d"))
> collect(gapply(df, "a", function(key, x) { x }, schema(df)))
>
> And running the below multiple times (5~7):
>
> collect(gapply(df, "a", function(key, x) { x }, schema(df)))
>
> looks like it occasionally throws an error.
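>
> (A minimal sketch, for convenience only and not part of the reproducer
> itself: since the failure is intermittent, wrapping the call in a loop makes
> it easier to hit. The iteration count of 10 is arbitrary.)
>
> for (i in 1:10) {
>   # Repeat the collect(); print any error instead of stopping the loop.
>   res <- tryCatch(collect(gapply(df, "a", function(key, x) { x }, schema(df))),
>                   error = function(e) e)
>   print(res)
> }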
>
>
> I will leave it here and will probably provide more information once a JIRA is opened.
> This does not look like a regression anyway.
>
>
>
> 2017-06-14 16:22 GMT+09:00 Hyukjin Kwon <gurwls...@gmail.com>:
>
>>
>> Per https://github.com/apache/spark/tree/v2.1.1,
>>
>> 1. CentOS 7.2.1511 / R 3.3.3 - this test hangs.
>>
>> I messed it up a bit while downgrading R to 3.3.3 (it was an actual
>> machine, not a VM), so it took me a while to re-try this.
>> I re-built this again and at least checked that the R version is 3.3.3. I hope
>> this one can be double-checked.
>>
>> Here is the self-reproducer:
>>
>> irisDF <- suppressWarnings(createDataFrame(iris))
>> schema <- structType(structField("Sepal_Length", "double"),
>>                      structField("Avg", "double"))
>> df4 <- gapply(
>>   cols = "Sepal_Length",
>>   irisDF,
>>   function(key, x) {
>>     y <- data.frame(key, mean(x$Sepal_Width), stringsAsFactors = FALSE)
>>   },
>>   schema)
>> collect(df4)
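>>
>> (For comparison only - a minimal sketch using base R, not part of the
>> reproducer: the same per-group mean can be computed locally with
>> aggregate(), which is handy for checking the gapply() output when it
>> does complete.)
>>
>> expected <- aggregate(Sepal.Width ~ Sepal.Length, data = iris, FUN = mean)
>> head(expected)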
>>
>>
>>
>> 2017-06-14 16:07 GMT+09:00 Felix Cheung <felixcheun...@hotmail.com>:
>>
>>> Thanks! Will try to set up RHEL/CentOS to test it out.
>>>
>>> _____________________________
>>> From: Nick Pentreath <nick.pentre...@gmail.com>
>>> Sent: Tuesday, June 13, 2017 11:38 PM
>>> Subject: Re: [VOTE] Apache Spark 2.2.0 (RC4)
>>> To: Felix Cheung <felixcheun...@hotmail.com>, Hyukjin Kwon <
>>> gurwls...@gmail.com>, dev <dev@spark.apache.org>
>>>
>>> Cc: Sean Owen <so...@cloudera.com>
>>>
>>>
>>> Hi, yeah, sorry for the slow response - I was on RHEL and OpenJDK but will
>>> have to report back later with the versions, as I am AFK.
>>>
>>> The R version I'm not totally sure about, but again I will revert ASAP.
>>> On Wed, 14 Jun 2017 at 05:09, Felix Cheung <felixcheun...@hotmail.com>
>>> wrote:
>>>
>>>> Thanks.
>>>> This was with an external package and is unrelated:
>>>>
>>>>   >> macOS Sierra 10.12.3 / R 3.2.3 - passed with a warning (
>>>> https://gist.github.com/HyukjinKwon/85cbcfb245825852df20ed6a9ecfd845)
>>>>
>>>> As for CentOS - would it be possible to test against R older than
>>>> 3.4.0? This is the same error reported by Nick below.
>>>>
>>>> _____________________________
>>>> From: Hyukjin Kwon <gurwls...@gmail.com>
>>>> Sent: Tuesday, June 13, 2017 8:02 PM
>>>>
>>>> Subject: Re: [VOTE] Apache Spark 2.2.0 (RC4)
>>>> To: dev <dev@spark.apache.org>
>>>> Cc: Sean Owen <so...@cloudera.com>, Nick Pentreath <
>>>> nick.pentre...@gmail.com>, Felix Cheung <felixcheun...@hotmail.com>
>>>>
>>>>
>>>>
>>>> For the test failure on R, I checked:
>>>>
>>>>
>>>> Per https://github.com/apache/spark/tree/v2.2.0-rc4,
>>>>
>>>> 1. Windows Server 2012 R2 / R 3.3.1 - passed (
>>>> https://ci.appveyor.com/project/spark-test/spark/build/755-r-test-v2.2.0-rc4)
>>>> 2. macOS Sierra 10.12.3 / R 3.4.0 - passed
>>>> 3. macOS Sierra 10.12.3 / R 3.2.3 - passed with a warning (
>>>> https://gist.github.com/HyukjinKwon/85cbcfb245825852df20ed6a9ecfd845)
>>>> 4. CentOS 7.2.1511 / R 3.4.0 - reproduced (
>>>> https://gist.github.com/HyukjinKwon/2a736b9f80318618cc147ac2bb1a987d)
>>>>
>>>>
>>>> Per https://github.com/apache/spark/tree/v2.1.1,
>>>>
>>>> 1. CentOS 7.2.1511 / R 3.4.0 - reproduced (
>>>> https://gist.github.com/HyukjinKwon/6064b0d10bab8fc1dc6212452d83b301)
>>>>
>>>>
>>>> This looks like it fails only on CentOS 7.2.1511 / R 3.4.0, given my
>>>> tests and observations.
>>>>
>>>> It also fails in Spark 2.1.1, so it does not sound like a regression, although
>>>> it is a bug that should be fixed (whether in Spark or R).
>>>>
>>>>
>>>> 2017-06-14 8:28 GMT+09:00 Xiao Li <gatorsm...@gmail.com>:
>>>>
>>>>> -1
>>>>>
>>>>> Spark 2.2 is unable to read partitioned tables created by Spark 2.1
>>>>> or earlier.
>>>>>
>>>>> Opened a JIRA https://issues.apache.org/jira/browse/SPARK-21085
>>>>>
>>>>> Will fix it soon.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Xiao Li
>>>>>
>>>>>
>>>>>
>>>>> 2017-06-13 9:39 GMT-07:00 Joseph Bradley <jos...@databricks.com>:
>>>>>
>>>>>> Re: the QA JIRAs:
>>>>>> Thanks for discussing them. I still feel they are very helpful; I
>>>>>> particularly notice not having to spend a solid 2-3 weeks of time QAing
>>>>>> (unlike in earlier Spark releases). One other point not mentioned above:
>>>>>> I think they serve as a very helpful reminder/training for the community
>>>>>> for rigor in development. Since we instituted QA JIRAs, contributors have
>>>>>> been a lot better about adding in docs early, rather than waiting until
>>>>>> the end of the cycle (though I know this is drawing conclusions from
>>>>>> correlations).
>>>>>>
>>>>>> I would vote in favor of the RC...but I'll wait to see about the
>>>>>> reported failures.
>>>>>>
>>>>>> On Fri, Jun 9, 2017 at 3:30 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>>>
>>>>>>> Different errors as in https://issues.apache.org/jira/browse/SPARK-20520,
>>>>>>> but that's also reporting R test failures.
>>>>>>>
>>>>>>> I went back and tried to run the R tests and they passed, at least
>>>>>>> on Ubuntu 17 / R 3.3.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 9, 2017 at 9:12 AM Nick Pentreath <
>>>>>>> nick.pentre...@gmail.com> wrote:
>>>>>>>
>>>>>>>> All Scala, Python tests pass. ML QA and doc issues are resolved (as
>>>>>>>> well as R it seems).
>>>>>>>>
>>>>>>>> However, I'm seeing the following test failure on R consistently:
>>>>>>>> https://gist.github.com/MLnick/5f26152f97ae8473f807c6895817cf72
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, 8 Jun 2017 at 08:48 Denny Lee <denny.g....@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> +1 non-binding
>>>>>>>>>
>>>>>>>>> Tested on macOS Sierra and Ubuntu 16.04.
>>>>>>>>> The test suite includes various test cases covering Spark SQL, ML,
>>>>>>>>> GraphFrames, and Structured Streaming.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 7, 2017 at 9:40 PM vaquar khan <vaquar.k...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> +1 non-binding
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> vaquar khan
>>>>>>>>>>
>>>>>>>>>> On Jun 7, 2017 4:32 PM, "Ricardo Almeida" <
>>>>>>>>>> ricardo.alme...@actnowib.com> wrote:
>>>>>>>>>>
>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>
>>>>>>>>>> Built and tested with -Phadoop-2.7 -Dhadoop.version=2.7.3 -Pyarn
>>>>>>>>>> -Phive -Phive-thriftserver -Pscala-2.11 on
>>>>>>>>>>
>>>>>>>>>>    - Ubuntu 17.04, Java 8 (OpenJDK 1.8.0_111)
>>>>>>>>>>    - macOS 10.12.5 Java 8 (build 1.8.0_131)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5 June 2017 at 21:14, Michael Armbrust <mich...@databricks.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Please vote on releasing the following candidate as Apache
>>>>>>>>>>> Spark version 2.2.0. The vote is open until Thurs, June 8th,
>>>>>>>>>>> 2017 at 12:00 PST and passes if a majority of at least 3 +1 PMC
>>>>>>>>>>> votes are cast.
>>>>>>>>>>>
>>>>>>>>>>> [ ] +1 Release this package as Apache Spark 2.2.0
>>>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> To learn more about Apache Spark, please see
>>>>>>>>>>> http://spark.apache.org/
>>>>>>>>>>>
>>>>>>>>>>> The tag to be voted on is v2.2.0-rc4
>>>>>>>>>>> <https://github.com/apache/spark/tree/v2.2.0-rc4> (
>>>>>>>>>>> 377cfa8ac7ff7a8a6a6d273182e18ea7dc25ce7e)
>>>>>>>>>>>
>>>>>>>>>>> List of JIRA tickets resolved can be found with this filter
>>>>>>>>>>> <https://issues.apache.org/jira/browse/SPARK-20134?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.2.0>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>> The release files, including signatures, digests, etc. can be
>>>>>>>>>>> found at:
>>>>>>>>>>> http://home.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-bin/
>>>>>>>>>>>
>>>>>>>>>>> Release artifacts are signed with the following key:
>>>>>>>>>>> https://people.apache.org/keys/committer/pwendell.asc
>>>>>>>>>>>
>>>>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1241/
>>>>>>>>>>>
>>>>>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>>>>>> http://people.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-docs/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *FAQ*
>>>>>>>>>>>
>>>>>>>>>>> *How can I help test this release?*
>>>>>>>>>>>
>>>>>>>>>>> If you are a Spark user, you can help us test this release by
>>>>>>>>>>> taking an existing Spark workload and running it on this release
>>>>>>>>>>> candidate, then reporting any regressions.
>>>>>>>>>>>
>>>>>>>>>>> *What should happen to JIRA tickets still targeting 2.2.0?*
>>>>>>>>>>>
>>>>>>>>>>> Committers should look at those and triage. Extremely important
>>>>>>>>>>> bug fixes, documentation, and API tweaks that impact compatibility
>>>>>>>>>>> should be worked on immediately. Everything else please retarget
>>>>>>>>>>> to 2.3.0 or 2.2.1.
>>>>>>>>>>>
>>>>>>>>>>> *But my bug isn't fixed!??!*
>>>>>>>>>>>
>>>>>>>>>>> In order to make timely releases, we will typically not hold the
>>>>>>>>>>> release unless the bug in question is a regression from 2.1.1.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Joseph Bradley
>>>>>>
>>>>>> Software Engineer - Machine Learning
>>>>>>
>>>>>> Databricks, Inc.
>>>>>>
>>>>>> http://databricks.com/
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
