[ANNOUNCE] CFP open for ApacheCon North America 2016

2015-11-25 Thread Rich Bowen
Community growth starts by talking with those interested in your project. ApacheCon North America is coming, are you? We are delighted to announce that the Call For Presentations (CFP) is now open for ApacheCon North America. You can submit your proposed sessions at

Re: VerifyError running Spark SQL code?

2015-11-25 Thread Josh Rosen
I think I've seen this issue as well, but in a different suite. I wasn't able to easily get to the bottom of it, though. What JDK / JRE are you using? I'm on Java(TM) SE Runtime Environment (build 1.7.0_65-b17) Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode) on OSX. On

VerifyError running Spark SQL code?

2015-11-25 Thread Marcelo Vanzin
I've been running into this error when running Spark SQL recently; nothing I try (a completely clean build or anything else) seems to fix it. Does anyone have an idea of what's wrong? [info] Exception encountered when attempting to run a suite with class name:

Spark checkpoint problem

2015-11-25 Thread wyphao.2007
I am testing checkpointing to understand how it works. My code is as follows: scala> val data = sc.parallelize(List("a", "b", "c")) data: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at <console>:15 scala> sc.setCheckpointDir("/tmp/checkpoint") 15/11/25 18:09:07 WARN

Re: VerifyError running Spark SQL code?

2015-11-25 Thread Marcelo Vanzin
$ java -version java version "1.7.0_67" Java(TM) SE Runtime Environment (build 1.7.0_67-b01) (On Linux.) It's not that particular suite, though, it's anything I do that touches Spark SQL... On Wed, Nov 25, 2015 at 4:54 PM, Josh Rosen wrote: > I think I've also seen

Re: VerifyError running Spark SQL code?

2015-11-25 Thread Marcelo Vanzin
Seems to be some new thing with recent JDK updates according to the intertubes. This patch seems to work around it: --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/HyperLogLogPlusPlus.scala +++

How to add 1.5.2 support to ec2/spark_ec2.py ?

2015-11-25 Thread Alexander Pivovarov
Hi everyone, I noticed that the Spark EC2 script is outdated. How can 1.5.2 support be added to ec2/spark_ec2.py? What else (besides updating the Spark version in the script) should be done to add 1.5.2 support? We also need to update Scala to 2.10.4 (currently it's 2.10.3). Alex

RE: Spark checkpoint problem

2015-11-25 Thread 张志强(旺轩)
What’s your Spark version? From: wyphao.2007 [mailto:wyphao.2...@163.com] Sent: November 26, 2015 10:04 To: user Cc: dev@spark.apache.org Subject: Spark checkpoint problem I am testing checkpointing to understand how it works. My code is as follows: scala> val data = sc.parallelize(List("a", "b",

Incremental Analysis with Spark

2015-11-25 Thread Sachith Withana
Hi folks! I'm wondering whether Spark supports, or plans to support, incremental data analysis. A few use cases prompted the question. For example, if we need to summarize the last 30 days' worth of data every day: 1. Does Spark support time-range-based query execution? select * from foo where

Re: Using spark MLlib without installing Spark

2015-11-25 Thread Stavros Kontopoulos
You can also use it without Spark (besides local). For example, I have used the following algorithm in a web app: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala Essentially, some algorithms (I haven't checked them all)

Re: A proposal for Spark 2.0

2015-11-25 Thread Reynold Xin
I don't think we should drop support for Scala 2.10, or make it harder in terms of operations for people to upgrade. If there are no further objections, I'm going to remove the 1.7 version and retarget things to 2.0 on JIRA. On Wed, Nov 25, 2015 at 12:54 AM, Sandy Ryza

Re: A proposal for Spark 2.0

2015-11-25 Thread Sandy Ryza
I see. My concern is / was that cluster operators will be reluctant to upgrade to 2.0, meaning that developers using those clusters need to stay on 1.x, and, if they want to move to DataFrames, essentially need to port their app twice. I misunderstood and thought part of the proposal was to drop

Spark checkpoint problem

2015-11-25 Thread wyphao.2007
Hi, I am testing checkpointing to understand how it works. My code is as follows: scala> val data = sc.parallelize(List("a", "b", "c")) data: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[0] at parallelize at <console>:15 scala> sc.setCheckpointDir("/tmp/checkpoint") 15/11/25 18:09:07 WARN

Re:RE: Spark checkpoint problem

2015-11-25 Thread wyphao.2007
Spark 1.5.2. On 2015-11-26 13:19:39, "张志强(旺轩)" wrote: What’s your Spark version? From: wyphao.2007 [mailto:wyphao.2...@163.com] Sent: November 26, 2015 10:04 To: user Cc: dev@spark.apache.org Subject: Spark checkpoint problem I am testing checkpointing to understand how it works. My code

Re: Incremental Analysis with Spark

2015-11-25 Thread chester
For the second use case, can you save the result for the first 29 days, then compute just the last day's result and add it yourself? This can be done outside of Spark. Does that work for you? Sent from my iPad > On Nov 25, 2015, at 9:46 PM, Sachith Withana wrote: > > Hi folks! > >
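
The suggestion in the reply above (keep the earlier days' results and add only the newest day) might look roughly like the following plain-Python sketch, run outside of Spark. The class name `RollingSum`, the `window` parameter, and the sample numbers are hypothetical and chosen only for illustration; the sketch also assumes the daily summary is a simple sum or count, since aggregates such as distinct counts or medians cannot be combined this way.

```python
from collections import deque


class RollingSum:
    """Keeps a running total over the most recent `window` daily results.

    Hypothetical helper for illustration; not part of Spark.
    """

    def __init__(self, window=30):
        self.window = window
        self.daily = deque()  # per-day partial results, oldest first
        self.total = 0

    def add_day(self, day_result):
        """Add the newest day's result and drop the day that left the window."""
        self.daily.append(day_result)
        self.total += day_result
        if len(self.daily) > self.window:
            self.total -= self.daily.popleft()
        return self.total


if __name__ == "__main__":
    # A 3-day window keeps the example short; the thread's case would use 30.
    rolling = RollingSum(window=3)
    for count in [10, 20, 30, 40]:
        print(rolling.add_day(count))
```

Each day only the newest partial result has to be computed (for instance by a Spark job over that day's data); the saved per-day values take care of the rest of the window.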