+1

Having an rc1 would help me get stable feedback on using my library with Spark, compared to relying on 2.0.0-SNAPSHOT.
On Fri, 20 May 2016 at 05:57 Xiao Li <gatorsm...@gmail.com> wrote:

> Changed my vote to +1. Thanks!
>
> 2016-05-19 13:28 GMT-07:00 Xiao Li <gatorsm...@gmail.com>:
>
>> Will do. Thanks!
>>
>> 2016-05-19 13:26 GMT-07:00 Reynold Xin <r...@databricks.com>:
>>
>>> Xiao, thanks for posting. Please file a bug in JIRA. Again, as I said in
>>> the email, this is not meant to be a functional release and will contain
>>> bugs.
>>>
>>> On Thu, May 19, 2016 at 1:20 PM, Xiao Li <gatorsm...@gmail.com> wrote:
>>>
>>>> -1
>>>>
>>>> Unable to use the Hive metastore in the pyspark shell. Tried both
>>>> HiveContext and SparkSession; both failed. It always uses the in-memory
>>>> catalog. Has anybody else hit the same issue?
>>>>
>>>> Method 1: SparkSession
>>>>
>>>> >>> from pyspark.sql import SparkSession
>>>> >>> spark = SparkSession.builder.enableHiveSupport().getOrCreate()
>>>> >>> spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>>> DataFrame[]
>>>> >>> spark.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
>>>> Traceback (most recent call last):
>>>>   File "<stdin>", line 1, in <module>
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/session.py", line 494, in sql
>>>>     return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/utils.py", line 57, in deco
>>>>     return f(*a, **kw)
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 312, in get_return_value
>>>> py4j.protocol.Py4JJavaError: An error occurred while calling o21.sql.
>>>> : java.lang.UnsupportedOperationException: loadTable is not implemented
>>>>   at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.loadTable(InMemoryCatalog.scala:297)
>>>>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.loadTable(SessionCatalog.scala:280)
>>>>   at org.apache.spark.sql.execution.command.LoadData.run(tables.scala:263)
>>>>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
>>>>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
>>>>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
>>>>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>>>>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>>   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>>>>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>>>>   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
>>>>   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
>>>>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:187)
>>>>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:168)
>>>>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
>>>>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:541)
>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>   at java.lang.reflect.Method.invoke(Method.java:606)
>>>>   at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
>>>>   at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>>>>   at py4j.Gateway.invoke(Gateway.java:280)
>>>>   at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
>>>>   at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>>>   at py4j.GatewayConnection.run(GatewayConnection.java:211)
>>>>   at java.lang.Thread.run(Thread.java:745)
>>>>
>>>> Method 2: Using HiveContext
>>>>
>>>> >>> from pyspark.sql import HiveContext
>>>> >>> sqlContext = HiveContext(sc)
>>>> >>> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>>> DataFrame[]
>>>> >>> sqlContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
>>>> Traceback (most recent call last):
>>>>   File "<stdin>", line 1, in <module>
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/context.py", line 346, in sql
>>>>     return self.sparkSession.sql(sqlQuery)
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/session.py", line 494, in sql
>>>>     return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/pyspark/sql/utils.py", line 57, in deco
>>>>     return f(*a, **kw)
>>>>   File "/Users/xiaoli/IdeaProjects/sparkDelivery/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 312, in get_return_value
>>>> py4j.protocol.Py4JJavaError: An error occurred while calling o21.sql.
>>>> : java.lang.UnsupportedOperationException: loadTable is not implemented
>>>>   (remainder of the stack trace is identical to Method 1 above)
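A quick way to tell which catalog a PySpark session actually picked up, before running any DDL, is to read the catalog setting back. This is a minimal diagnostic sketch, assuming Spark 2.0's spark.sql.catalogImplementation setting is visible through the runtime conf; "in-memory" here would confirm that enableHiveSupport() did not take effect for the session.

    from pyspark.sql import SparkSession

    # Request Hive support, then check which catalog implementation the
    # JVM side actually selected for this session.
    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
    print(spark.conf.get("spark.sql.catalogImplementation", "in-memory"))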
>>>> 2016-05-19 12:49 GMT-07:00 Herman van Hövell tot Westerflier <hvanhov...@questtec.nl>:
>>>>
>>>>> +1
>>>>>
>>>>> 2016-05-19 18:20 GMT+02:00 Xiangrui Meng <m...@databricks.com>:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Thu, May 19, 2016 at 9:18 AM Joseph Bradley <jos...@databricks.com> wrote:
>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> On Wed, May 18, 2016 at 10:49 AM, Reynold Xin <r...@databricks.com> wrote:
>>>>>>>
>>>>>>>> Hi Ovidiu-Cristian,
>>>>>>>>
>>>>>>>> The best source of truth is changing the filter to target version =
>>>>>>>> 2.1.0. Not a lot of tickets have been targeted yet, but I'd imagine that
>>>>>>>> as we get closer to the 2.0 release, more will be retargeted at 2.1.0.
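For anyone scripting this kind of triage rather than clicking through JIRA filters, the same JQL can be run against JIRA's REST search endpoint. A small sketch, with the caveat that "Target Version/s" as the field name is an assumption about the ASF JIRA's Spark project configuration:

    import json
    from urllib.parse import quote
    from urllib.request import urlopen

    # List unresolved SPARK issues targeted at a given version via JQL.
    jql = 'project = SPARK AND resolution = Unresolved AND "Target Version/s" = 2.1.0'
    url = "https://issues.apache.org/jira/rest/api/2/search?jql=" + quote(jql)
    result = json.loads(urlopen(url).read().decode("utf-8"))
    for issue in result["issues"]:
        print(issue["key"], "-", issue["fields"]["summary"])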
>>>>>>>>
>>>>>>>> On Wed, May 18, 2016 at 10:43 AM, Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr> wrote:
>>>>>>>>
>>>>>>>>> Yes, I can filter. Did that, and for example:
>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-15370?jql=project%20%3D%20SPARK%20AND%20resolution%20%3D%20Unresolved%20AND%20affectedVersion%20%3D%202.0.0
>>>>>>>>>
>>>>>>>>> To rephrase: for 2.0, do you have specific issues that are not a
>>>>>>>>> priority and will perhaps be released with 2.1, for example?
>>>>>>>>>
>>>>>>>>> Keep up the good work!
>>>>>>>>>
>>>>>>>>> On 18 May 2016, at 18:19, Reynold Xin <r...@databricks.com> wrote:
>>>>>>>>>
>>>>>>>>> You can find that by changing the filter to target version = 2.0.0. Cheers.
>>>>>>>>>
>>>>>>>>> On Wed, May 18, 2016 at 9:00 AM, Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr> wrote:
>>>>>>>>>
>>>>>>>>>> +1 Great. I see the list of resolved issues; do you have a list of
>>>>>>>>>> known issues that will remain in this release?
>>>>>>>>>>
>>>>>>>>>> Built with:
>>>>>>>>>> build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive -Phive-thriftserver -DskipTests clean package
>>>>>>>>>>
>>>>>>>>>> mvn -version
>>>>>>>>>> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T17:41:47+01:00)
>>>>>>>>>> Maven home: /Users/omarcu/tools/apache-maven-3.3.9
>>>>>>>>>> Java version: 1.7.0_80, vendor: Oracle Corporation
>>>>>>>>>> Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre
>>>>>>>>>> Default locale: en_US, platform encoding: UTF-8
>>>>>>>>>> OS name: "mac os x", version: "10.11.5", arch: "x86_64", family: "mac"
>>>>>>>>>>
>>>>>>>>>> [INFO] Reactor Summary:
>>>>>>>>>> [INFO]
>>>>>>>>>> [INFO] Spark Project Parent POM ........................... SUCCESS [  2.635 s]
>>>>>>>>>> [INFO] Spark Project Tags ................................. SUCCESS [  1.896 s]
>>>>>>>>>> [INFO] Spark Project Sketch ............................... SUCCESS [  2.560 s]
>>>>>>>>>> [INFO] Spark Project Networking ........................... SUCCESS [  6.533 s]
>>>>>>>>>> [INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  4.176 s]
>>>>>>>>>> [INFO] Spark Project Unsafe ............................... SUCCESS [  4.809 s]
>>>>>>>>>> [INFO] Spark Project Launcher ............................. SUCCESS [  6.242 s]
>>>>>>>>>> [INFO] Spark Project Core ................................. SUCCESS [01:20 min]
>>>>>>>>>> [INFO] Spark Project GraphX ............................... SUCCESS [  9.148 s]
>>>>>>>>>> [INFO] Spark Project Streaming ............................ SUCCESS [ 22.760 s]
>>>>>>>>>> [INFO] Spark Project Catalyst ............................. SUCCESS [ 50.783 s]
>>>>>>>>>> [INFO] Spark Project SQL .................................. SUCCESS [01:05 min]
>>>>>>>>>> [INFO] Spark Project ML Local Library ..................... SUCCESS [  4.281 s]
>>>>>>>>>> [INFO] Spark Project ML Library ........................... SUCCESS [ 54.537 s]
>>>>>>>>>> [INFO] Spark Project Tools ................................ SUCCESS [  0.747 s]
>>>>>>>>>> [INFO] Spark Project Hive ................................. SUCCESS [ 33.032 s]
>>>>>>>>>> [INFO] Spark Project HiveContext Compatibility ............ SUCCESS [  3.198 s]
>>>>>>>>>> [INFO] Spark Project REPL ................................. SUCCESS [  3.573 s]
>>>>>>>>>> [INFO] Spark Project YARN Shuffle Service ................. SUCCESS [  4.617 s]
>>>>>>>>>> [INFO] Spark Project YARN ................................. SUCCESS [  7.321 s]
>>>>>>>>>> [INFO] Spark Project Hive Thrift Server ................... SUCCESS [ 16.496 s]
>>>>>>>>>> [INFO] Spark Project Assembly ............................. SUCCESS [  2.300 s]
>>>>>>>>>> [INFO] Spark Project External Flume Sink .................. SUCCESS [  4.219 s]
>>>>>>>>>> [INFO] Spark Project External Flume ....................... SUCCESS [  6.987 s]
>>>>>>>>>> [INFO] Spark Project External Flume Assembly .............. SUCCESS [  1.465 s]
>>>>>>>>>> [INFO] Spark Integration for Kafka 0.8 .................... SUCCESS [  6.891 s]
>>>>>>>>>> [INFO] Spark Project Examples ............................. SUCCESS [ 13.465 s]
>>>>>>>>>> [INFO] Spark Project External Kafka Assembly .............. SUCCESS [  2.815 s]
>>>>>>>>>> [INFO] ------------------------------------------------------------------------
>>>>>>>>>> [INFO] BUILD SUCCESS
>>>>>>>>>> [INFO] ------------------------------------------------------------------------
>>>>>>>>>> [INFO] Total time: 07:04 min
>>>>>>>>>> [INFO] Finished at: 2016-05-18T17:55:33+02:00
>>>>>>>>>> [INFO] Final Memory: 90M/824M
>>>>>>>>>> [INFO] ------------------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> On 18 May 2016, at 16:28, Sean Owen <so...@cloudera.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I think it's a good idea. Although releases have been preceded before
>>>>>>>>>> by release candidates for developers, it would be good to get a formal
>>>>>>>>>> preview/beta release ratified for public consumption ahead of a new
>>>>>>>>>> major release. Better to have a little more testing in the wild to
>>>>>>>>>> identify problems before 2.0.0 is finalized.
>>>>>>>>>>
>>>>>>>>>> +1 to the release. License, sigs, etc. check out. On Ubuntu 16 + Java
>>>>>>>>>> 8, compilation and tests succeed for "-Pyarn -Phive
>>>>>>>>>> -Phive-thriftserver -Phadoop-2.6".
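Checks like Sean's can be scripted as part of RC testing. A minimal sketch of the signature step, assuming gpg is installed, the signing key from the vote email below has been imported, and the artifact plus its .asc file sit in the current directory; the file name is illustrative, not a confirmed artifact name:

    import subprocess

    # Verify the detached GPG signature of one downloaded release file.
    artifact = "spark-2.0.0-preview-bin-hadoop2.7.tgz"  # example name
    subprocess.check_call(["gpg", "--verify", artifact + ".asc", artifact])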
>>>>>>>>>>
>>>>>>>>>> On Wed, May 18, 2016 at 6:40 AM, Reynold Xin <r...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> In the past, the Apache Spark community has created preview packages
>>>>>>>>>> (not official releases) and used those as opportunities to ask
>>>>>>>>>> community members to test the upcoming versions of Apache Spark.
>>>>>>>>>> Several people in the Apache community have suggested we conduct
>>>>>>>>>> votes for these preview packages and turn them into formal releases
>>>>>>>>>> by the Apache foundation's standard. Preview releases are not meant
>>>>>>>>>> to be functional, i.e. they can and highly likely will contain
>>>>>>>>>> critical bugs or documentation errors, but we will be able to post
>>>>>>>>>> them to the project's website to get wider feedback. They should
>>>>>>>>>> satisfy the legal requirements of Apache's release policy
>>>>>>>>>> (http://www.apache.org/dev/release.html), such as having proper licenses.
>>>>>>>>>>
>>>>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>>>>> version 2.0.0-preview. The vote is open until Friday, May 20, 2016 at
>>>>>>>>>> 11:00 PM PDT and passes if a majority of at least 3 +1 PMC votes are cast.
>>>>>>>>>>
>>>>>>>>>> [ ] +1 Release this package as Apache Spark 2.0.0-preview
>>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>>
>>>>>>>>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>>>>>>>>
>>>>>>>>>> The tag to be voted on is 2.0.0-preview (8f5a04b6299e3a47aca13cbb40e72344c0114860).
>>>>>>>>>>
>>>>>>>>>> The release files, including signatures, digests, etc. can be found at:
>>>>>>>>>> http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-bin/
>>>>>>>>>>
>>>>>>>>>> Release artifacts are signed with the following key:
>>>>>>>>>> https://people.apache.org/keys/committer/pwendell.asc
>>>>>>>>>>
>>>>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>>>>> http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/
>>>>>>>>>>
>>>>>>>>>> The list of resolved issues is:
>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-15351?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.0.0
>>>>>>>>>>
>>>>>>>>>> If you are a Spark user, you can help us test this release by taking
>>>>>>>>>> an existing Apache Spark workload, running it on this candidate, and
>>>>>>>>>> reporting any regressions.
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>>>>>>>>>> For additional commands, e-mail: dev-h...@spark.apache.org