Re: [vote] Apache Spark 2.0.0-preview release (rc1)

Xiao Li Thu, 19 May 2016 21:58:27 -0700

Changed my vote to +1. Thanks!

2016-05-19 13:28 GMT-07:00 Xiao Li <[email protected]>:


> Will do. Thanks!
>
> 2016-05-19 13:26 GMT-07:00 Reynold Xin <[email protected]>:
>
>> Xiao, thanks for posting. Please file a bug in JIRA. Again, as I said in
>> the email, this is not meant to be a functional release and will contain
>> bugs.
>>
>> On Thu, May 19, 2016 at 1:20 PM, Xiao Li <[email protected]> wrote:
>>
>>> -1
>>>
>>> Unable to use the Hive metastore in the pyspark shell. I tried both
>>> HiveContext and SparkSession, and both failed: it always uses the
>>> in-memory catalog. Has anybody else hit the same issue?
>>>
>>>
>>> Method 1: SparkSession
>>>
>>> >>> from pyspark.sql import SparkSession
>>>
>>> >>> spark = SparkSession.builder.enableHiveSupport().getOrCreate()
>>>
>>> >>>
>>>
>>> >>> spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>>
>>> DataFrame[]
>>>
>>> >>> spark.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
>>>
>>> Traceback (most recent call last):
>>>
>>>   File "<stdin>", line 1, in <module>
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/session.py",
>>> line 494, in sql
>>>
>>>     return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py",
>>> line 933, in __call__
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/utils.py",
>>> line 57, in deco
>>>
>>>     return f(*a, **kw)
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py",
>>> line 312, in get_return_value
>>>
>>> py4j.protocol.Py4JJavaError: An error occurred while calling o21.sql.
>>>
>>> : java.lang.UnsupportedOperationException: loadTable is not implemented
>>>
>>> at
>>> org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.loadTable(InMemoryCatalog.scala:297)
>>>
>>> at
>>> org.apache.spark.sql.catalyst.catalog.SessionCatalog.loadTable(SessionCatalog.scala:280)
>>>
>>> at org.apache.spark.sql.execution.command.LoadData.run(tables.scala:263)
>>>
>>> at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
>>>
>>> at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
>>>
>>> at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>>>
>>> at
>>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>>>
>>> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>>>
>>> at
>>> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
>>>
>>> at
>>> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
>>>
>>> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:187)
>>>
>>> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:168)
>>>
>>> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
>>>
>>> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:541)
>>>
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>
>>> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
>>>
>>> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>>>
>>> at py4j.Gateway.invoke(Gateway.java:280)
>>>
>>> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
>>>
>>> at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>>
>>> at py4j.GatewayConnection.run(GatewayConnection.java:211)
>>>
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>>
>>> Method 2: HiveContext
>>>
>>> >>> from pyspark.sql import HiveContext
>>>
>>> >>> sqlContext = HiveContext(sc)
>>>
>>> >>> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>>
>>> DataFrame[]
>>>
>>> >>> sqlContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
>>>
>>> Traceback (most recent call last):
>>>
>>>   File "<stdin>", line 1, in <module>
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/context.py",
>>> line 346, in sql
>>>
>>>     return self.sparkSession.sql(sqlQuery)
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/session.py",
>>> line 494, in sql
>>>
>>>     return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py",
>>> line 933, in __call__
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/utils.py",
>>> line 57, in deco
>>>
>>>     return f(*a, **kw)
>>>
>>>   File
>>> "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py",
>>> line 312, in get_return_value
>>>
>>> py4j.protocol.Py4JJavaError: An error occurred while calling o21.sql.
>>>
>>> : java.lang.UnsupportedOperationException: loadTable is not implemented
>>>
>>> at
>>> org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.loadTable(InMemoryCatalog.scala:297)
>>>
>>> at
>>> org.apache.spark.sql.catalyst.catalog.SessionCatalog.loadTable(SessionCatalog.scala:280)
>>>
>>> at org.apache.spark.sql.execution.command.LoadData.run(tables.scala:263)
>>>
>>> at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
>>>
>>> at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
>>>
>>> at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>>>
>>> at
>>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>
>>> at
>>> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>>>
>>> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>>>
>>> at
>>> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
>>>
>>> at
>>> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
>>>
>>> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:187)
>>>
>>> at org.apache.spark.sql.Dataset.<init>(Dataset.scala:168)
>>>
>>> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
>>>
>>> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:541)
>>>
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>
>>> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
>>>
>>> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>>>
>>> at py4j.Gateway.invoke(Gateway.java:280)
>>>
>>> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
>>>
>>> at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>>
>>> at py4j.GatewayConnection.run(GatewayConnection.java:211)
>>>
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> 2016-05-19 12:49 GMT-07:00 Herman van Hövell tot Westerflier <
>>> [email protected]>:
>>>
>>>> +1
>>>>
>>>>
>>>> 2016-05-19 18:20 GMT+02:00 Xiangrui Meng <[email protected]>:
>>>>
>>>>> +1
>>>>>
>>>>> On Thu, May 19, 2016 at 9:18 AM Joseph Bradley <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Wed, May 18, 2016 at 10:49 AM, Reynold Xin <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Ovidiu-Cristian ,
>>>>>>>
>>>>>>> The best source of truth is to change the filter to target version =
>>>>>>> 2.1.0. Not a lot of tickets have been targeted yet, but I'd imagine as
>>>>>>> we get closer to the 2.0 release, more will be retargeted at 2.1.0.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, May 18, 2016 at 10:43 AM, Ovidiu-Cristian MARCU <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Yes, I can filter. I did that; for example:
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-15370?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20affectedVersion%20%3D%202.0.0
>>>>>>>>
>>>>>>>> To rephrase: for 2.0, do you have specific issues that are not a
>>>>>>>> priority and will be released later, maybe with 2.1?
>>>>>>>>
>>>>>>>> Keep up the good work!
>>>>>>>>
>>>>>>>> On 18 May 2016, at 18:19, Reynold Xin <[email protected]> wrote:
>>>>>>>>
>>>>>>>> You can find that by changing the filter to target version = 2.0.0.
>>>>>>>> Cheers.
>>>>>>>>
>>>>>>>> On Wed, May 18, 2016 at 9:00 AM, Ovidiu-Cristian MARCU <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> +1. Great, I see the list of resolved issues. Do you have a list of
>>>>>>>>> known issues that you plan to leave in this release?
>>>>>>>>>
>>>>>>>>> Built with:
>>>>>>>>> build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive
>>>>>>>>> -Phive-thriftserver -DskipTests clean package
>>>>>>>>>
>>>>>>>>> mvn -version
>>>>>>>>> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5;
>>>>>>>>> 2015-11-10T17:41:47+01:00)
>>>>>>>>> Maven home: /Users/omarcu/tools/apache-maven-3.3.9
>>>>>>>>> Java version: 1.7.0_80, vendor: Oracle Corporation
>>>>>>>>> Java home:
>>>>>>>>> /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre
>>>>>>>>> Default locale: en_US, platform encoding: UTF-8
>>>>>>>>> OS name: "mac os x", version: "10.11.5", arch: "x86_64", family: "mac"
>>>>>>>>>
>>>>>>>>> [INFO] Reactor Summary:
>>>>>>>>> [INFO]
>>>>>>>>> [INFO] Spark Project Parent POM ........................... SUCCESS [  2.635 s]
>>>>>>>>> [INFO] Spark Project Tags ................................. SUCCESS [  1.896 s]
>>>>>>>>> [INFO] Spark Project Sketch ............................... SUCCESS [  2.560 s]
>>>>>>>>> [INFO] Spark Project Networking ........................... SUCCESS [  6.533 s]
>>>>>>>>> [INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  4.176 s]
>>>>>>>>> [INFO] Spark Project Unsafe ............................... SUCCESS [  4.809 s]
>>>>>>>>> [INFO] Spark Project Launcher ............................. SUCCESS [  6.242 s]
>>>>>>>>> [INFO] Spark Project Core ................................. SUCCESS [01:20 min]
>>>>>>>>> [INFO] Spark Project GraphX ............................... SUCCESS [  9.148 s]
>>>>>>>>> [INFO] Spark Project Streaming ............................ SUCCESS [ 22.760 s]
>>>>>>>>> [INFO] Spark Project Catalyst ............................. SUCCESS [ 50.783 s]
>>>>>>>>> [INFO] Spark Project SQL .................................. SUCCESS [01:05 min]
>>>>>>>>> [INFO] Spark Project ML Local Library ..................... SUCCESS [  4.281 s]
>>>>>>>>> [INFO] Spark Project ML Library ........................... SUCCESS [ 54.537 s]
>>>>>>>>> [INFO] Spark Project Tools ................................ SUCCESS [  0.747 s]
>>>>>>>>> [INFO] Spark Project Hive ................................. SUCCESS [ 33.032 s]
>>>>>>>>> [INFO] Spark Project HiveContext Compatibility ............ SUCCESS [  3.198 s]
>>>>>>>>> [INFO] Spark Project REPL ................................. SUCCESS [  3.573 s]
>>>>>>>>> [INFO] Spark Project YARN Shuffle Service ................. SUCCESS [  4.617 s]
>>>>>>>>> [INFO] Spark Project YARN ................................. SUCCESS [  7.321 s]
>>>>>>>>> [INFO] Spark Project Hive Thrift Server ................... SUCCESS [ 16.496 s]
>>>>>>>>> [INFO] Spark Project Assembly ............................. SUCCESS [  2.300 s]
>>>>>>>>> [INFO] Spark Project External Flume Sink .................. SUCCESS [  4.219 s]
>>>>>>>>> [INFO] Spark Project External Flume ....................... SUCCESS [  6.987 s]
>>>>>>>>> [INFO] Spark Project External Flume Assembly .............. SUCCESS [  1.465 s]
>>>>>>>>> [INFO] Spark Integration for Kafka 0.8 .................... SUCCESS [  6.891 s]
>>>>>>>>> [INFO] Spark Project Examples ............................. SUCCESS [ 13.465 s]
>>>>>>>>> [INFO] Spark Project External Kafka Assembly .............. SUCCESS [  2.815 s]
>>>>>>>>> [INFO] ------------------------------------------------------------------------
>>>>>>>>> [INFO] BUILD SUCCESS
>>>>>>>>> [INFO] ------------------------------------------------------------------------
>>>>>>>>> [INFO] Total time: 07:04 min
>>>>>>>>> [INFO] Finished at: 2016-05-18T17:55:33+02:00
>>>>>>>>> [INFO] Final Memory: 90M/824M
>>>>>>>>> [INFO] ------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> On 18 May 2016, at 16:28, Sean Owen <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> I think it's a good idea. Although releases have previously been
>>>>>>>>> preceded by release candidates for developers, it would be good to
>>>>>>>>> get a formal preview/beta release ratified for public consumption
>>>>>>>>> ahead of a new major release. Better to have a little more testing
>>>>>>>>> in the wild to identify problems before 2.0.0 is finalized.
>>>>>>>>>
>>>>>>>>> +1 to the release. License, sigs, etc. check out. On Ubuntu 16 +
>>>>>>>>> Java 8, compilation and tests succeed for "-Pyarn -Phive
>>>>>>>>> -Phive-thriftserver -Phadoop-2.6".
>>>>>>>>>
>>>>>>>>> On Wed, May 18, 2016 at 6:40 AM, Reynold Xin <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> In the past the Apache Spark community has created preview packages
>>>>>>>>> (not official releases) and used those as opportunities to ask
>>>>>>>>> community members to test the upcoming versions of Apache Spark.
>>>>>>>>> Several people in the Apache community have suggested we conduct
>>>>>>>>> votes for these preview packages and turn them into formal releases
>>>>>>>>> by the Apache foundation's standard. Preview releases are not meant
>>>>>>>>> to be functional, i.e. they can and very likely will contain
>>>>>>>>> critical bugs or documentation errors, but we will be able to post
>>>>>>>>> them to the project's website to get wider feedback. They should
>>>>>>>>> satisfy the legal requirements of Apache's release policy
>>>>>>>>> (http://www.apache.org/dev/release.html) such as having proper
>>>>>>>>> licenses.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>>>> version 2.0.0-preview. The vote is open until Friday, May 20, 2016
>>>>>>>>> at 11:00 PM PDT and passes if a majority of at least 3 +1 PMC votes
>>>>>>>>> are cast.
>>>>>>>>>
>>>>>>>>> [ ] +1 Release this package as Apache Spark 2.0.0-preview
>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>
>>>>>>>>> To learn more about Apache Spark, please see
>>>>>>>>> http://spark.apache.org/
>>>>>>>>>
>>>>>>>>> The tag to be voted on is 2.0.0-preview
>>>>>>>>> (8f5a04b6299e3a47aca13cbb40e72344c0114860)
>>>>>>>>>
>>>>>>>>> The release files, including signatures, digests, etc. can be found at:
>>>>>>>>>
>>>>>>>>> http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-bin/
>>>>>>>>>
>>>>>>>>> Release artifacts are signed with the following key:
>>>>>>>>> https://people.apache.org/keys/committer/pwendell.asc
>>>>>>>>>
>>>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>>>>
>>>>>>>>> http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/
>>>>>>>>>
>>>>>>>>> The list of resolved issues is:
>>>>>>>>>
>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-15351?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.0.0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you are a Spark user, you can help us test this release by
>>>>>>>>> taking an existing Apache Spark workload and running it on this
>>>>>>>>> candidate, then reporting any regressions.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>>
>>
>
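
Note on the -1 report quoted above: both tracebacks pass through InMemoryCatalog.loadTable, which is what suggests the session was not backed by the Hive catalog. When a trace is long, that check can be done mechanically. The following is a minimal sketch in plain Python, not part of Spark or PySpark; the helper name catalog_classes_in_trace is hypothetical, and the sample frames are copied from the report.

```python
import re

# Matches frames such as
#   org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.loadTable(InMemoryCatalog.scala:297)
# and captures the class name that sits directly under the catalog package.
_CATALOG_FRAME = re.compile(
    r"org\.apache\.spark\.sql\.catalyst\.catalog\.(\w+)\.\w+\("
)


def catalog_classes_in_trace(trace_text):
    """Return catalog class names found in a stack trace, in order of first appearance."""
    seen = []
    for match in _CATALOG_FRAME.finditer(trace_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Frames copied from the report (hypothetical usage of the helper).
sample = """\
: java.lang.UnsupportedOperationException: loadTable is not implemented
at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.loadTable(InMemoryCatalog.scala:297)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.loadTable(SessionCatalog.scala:280)
at org.apache.spark.sql.execution.command.LoadData.run(tables.scala:263)
"""

classes = catalog_classes_in_trace(sample)
uses_in_memory = "InMemoryCatalog" in classes
print(classes, uses_in_memory)
```

The pattern does not require the leading "at", so it should also work on traces that a mail client has wrapped or quote-prefixed, as in this thread.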

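Note on the JIRA filters discussed above: the links in the thread are ordinary JQL queries, percent-encoded into a jql= parameter. Here is a minimal sketch of building such links with the Python standard library. Two things are assumptions rather than facts from the thread: the /jira/issues/ search path (the thread's links attach jql= to a /browse/ URL instead) and the "Target Version/s" field name used for the retargeting query.

```python
from urllib.parse import quote

# Assumption: the standard JIRA issue-search path. The links in the thread
# attach the same jql= parameter to a /browse/SPARK-NNNNN URL instead.
JIRA_SEARCH = "https://issues.apache.org/jira/issues/?jql="


def encode_jql(jql):
    """Percent-encode a JQL string (space -> %20, '=' -> %3D), as in the thread's links."""
    return quote(jql, safe="")


def search_url(jql):
    """Build a search link for the given JQL query."""
    return JIRA_SEARCH + encode_jql(jql)


# The two queries that appear in the thread.
unresolved = "project = SPARK AND resolution = Unresolved AND affectedVersion = 2.0.0"
resolved = "project = SPARK AND fixVersion = 2.0.0"
# Assumption: the target-version custom field is named "Target Version/s".
retargeted = 'project = SPARK AND "Target Version/s" = 2.1.0'

for query in (unresolved, resolved, retargeted):
    print(search_url(query))
```

Encoding with safe="" keeps the output consistent with the links posted in the thread, where both spaces and "=" are escaped.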