Bind exception while running FlumeEventCount

2014-11-10 Thread Jeniba Johnson
Hi, I have installed spark-1.1.0 and Apache Flume 1.4 for running the streaming example FlumeEventCount. Previously the code was working fine. Now I am facing the issues mentioned below. My Flume agent is running properly and is able to write the file. The command I use is bin/run-example

Re: Replacing Spark's native scheduler with Sparrow

2014-11-10 Thread Nicholas Chammas
On Sun, Nov 9, 2014 at 1:51 AM, Tathagata Das tathagata.das1...@gmail.com wrote: This causes a scalability vs. latency tradeoff - if your limit is 1000 tasks per second (simplifying from 1500), you could either configure it to use 100 receivers at 100 ms batches (10 blocks/sec), or 1000
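The tradeoff quoted above can be sketched as back-of-the-envelope arithmetic (the ~1000 tasks/sec limit is the simplified figure from the thread, not a measured number):

```scala
// Scheduler-throughput tradeoff discussed above: each receiver emits one
// block (hence one task) per block interval, so the task-per-second limit
// caps receivers * blocks-per-receiver-per-second.
val taskLimitPerSec = 1000                              // simplified from ~1500
val blockIntervalMs = 100                               // 100 ms batches
val blocksPerReceiverPerSec = 1000 / blockIntervalMs    // = 10
val maxReceivers = taskLimitPerSec / blocksPerReceiverPerSec  // = 100
println(s"$maxReceivers receivers at $blockIntervalMs ms blocks => " +
  s"${maxReceivers * blocksPerReceiverPerSec} tasks/sec")
```

With the same 1000-task budget, 1000 receivers would force 1-second blocks (1 block/sec each), which is the latency end of the tradeoff.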

Re: Replacing Spark's native scheduler with Sparrow

2014-11-10 Thread Tathagata Das
Too bad Nick, I don't have anything immediately ready that tests Spark Streaming with those extreme settings. :) On Mon, Nov 10, 2014 at 9:56 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: On Sun, Nov 9, 2014 at 1:51 AM, Tathagata Das tathagata.das1...@gmail.com wrote: This causes a

getting exception when trying to build spark from master

2014-11-10 Thread Sadhan Sood
Getting an exception while trying to build spark in spark-core: [ERROR] while compiling: /Users/dev/tellapart_spark/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala during phase: typer library version: version 2.10.4 compiler version: version 2.10.4

Re: getting exception when trying to build spark from master

2014-11-10 Thread Josh Rosen
It looks like the Jenkins maven builds are broken, too. Based on the Jenkins logs, I think that this pull request may have broken things (although I'm not sure why): https://github.com/apache/spark/pull/3030#issuecomment-62436181 On Mon, Nov 10, 2014 at 1:42 PM, Sadhan Sood

Http client dependency conflict when using AWS

2014-11-10 Thread Cody Koeninger
I'm wondering why https://issues.apache.org/jira/browse/SPARK-3638 only updated the version of http client for the kinesis-asl profile and left the base dependencies unchanged. Spark built without that profile still has the same java.lang.NoSuchMethodError:

Spark 1.1.1 release

2014-11-10 Thread Andrew Or
Hi everyone, I am the release manager for 1.1.1, and I am preparing to cut a release tonight at midnight. 1.1.1 is a maintenance release which will ship several important bug fixes to users of Spark 1.1. Many users are waiting for these fixes so I would like to release it as soon as possible. At

Re: Bind exception while running FlumeEventCount

2014-11-10 Thread Hari Shreedharan
Looks like that port is not available because another app is using that port. Can you take a look at netstat -a and use a port that is free? Thanks, Hari On Fri, Nov 7, 2014 at 2:05 PM, Jeniba Johnson jeniba.john...@lntinfotech.com wrote: Hi, I have installed spark-1.1.0 and apache flume
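Besides `netstat -a`, the same check can be done from the JVM; this is a hypothetical helper, not code from the thread:

```scala
import java.net.ServerSocket

// Hypothetical helper: returns true if the port can be bound right now.
// An IOException (typically BindException) means another process is holding
// the port, or a recently closed socket is still in TIME_WAIT.
def portIsFree(port: Int): Boolean =
  try {
    val s = new ServerSocket(port)
    s.close()
    true
  } catch {
    case _: java.io.IOException => false
  }
```

A free-port check like this before starting the Flume sink makes the bind error easier to diagnose than the raw stack trace.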

Re: getting exception when trying to build spark from master

2014-11-10 Thread Sadhan Sood
I reverted the patch locally, seems to be working for me. On Mon, Nov 10, 2014 at 6:00 PM, Patrick Wendell pwend...@gmail.com wrote: I reverted that patch to see if it fixes it. On Mon, Nov 10, 2014 at 1:45 PM, Josh Rosen rosenvi...@gmail.com wrote: It looks like the Jenkins maven builds

Re: MatrixFactorizationModel predict(Int, Int) API

2014-11-10 Thread Debasish Das
I tested 2 different implementations to generate the predicted ranked list...The first version uses a cartesian of user and product features and then generates a predicted value for each (user,product) key... The second version does a collect on the skinny matrix (most likely products) and then
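The first approach (cartesian of user and product features, score each pair, take the top-k per user) can be sketched with plain collections; the latent-factor vectors below are made-up toy data, not MLlib API calls:

```scala
// Toy sketch of "version 1": score every (user, product) pair by the dot
// product of their latent factors, then keep the top-k products per user.
val userFeatures = Map(1 -> Array(1.0, 0.0), 2 -> Array(0.0, 1.0))
val productFeatures = Map(10 -> Array(0.9, 0.1), 20 -> Array(0.2, 0.8))

def dot(a: Array[Double], b: Array[Double]): Double =
  a.zip(b).map { case (x, y) => x * y }.sum

val k = 1
val topK: Map[Int, Seq[Int]] =
  userFeatures.map { case (u, uf) =>
    val ranked = productFeatures.toSeq
      .map { case (p, pf) => (p, dot(uf, pf)) }
      .sortBy(-_._2)          // highest predicted rating first
      .take(k)
      .map(_._1)
    u -> ranked
  }
```

In the distributed version this inner loop is exactly what `userFeatures.cartesian(productFeatures)` plus a per-user top-k aggregation computes, which is why it is expensive for large catalogs.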

thrift jdbc server probably running queries as hive query

2014-11-10 Thread Sadhan Sood
I was testing out the Spark thrift jdbc server by running a simple query in the beeline client. Spark itself is running on a YARN cluster. However, when I run a query in beeline, I see no running jobs in the Spark UI (completely empty) and the YARN UI seems to indicate that the submitted query

Re: thrift jdbc server probably running queries as hive query

2014-11-10 Thread scwf
Did the SQL run successfully? And what SQL are you running? -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/thrift-jdbc-server-probably-running-queries-as-hive-query-tp9267p9268.html Sent from the Apache Spark Developers List mailing list archive at

Checkpoint bugs in GraphX

2014-11-10 Thread Xu Lijie
Hi, all. I'm not sure whether someone has reported this bug: there should be a checkpoint() method in EdgeRDD and VertexRDD as follows: override def checkpoint(): Unit = { partitionsRDD.checkpoint() } Current EdgeRDD and VertexRDD use *RDD.checkpoint()*, which only checkpoints the
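The delegation pattern described above can be illustrated without Spark (the names `RDDLike`, `PartitionsRDD`, etc. are hypothetical stand-ins; in the real fix the wrapper is `EdgeRDD`/`VertexRDD` and the inner RDD is `partitionsRDD`):

```scala
// Minimal illustration of the bug and fix: a wrapper whose data lives in an
// inner RDD must forward checkpoint() to that inner RDD, otherwise only the
// thin wrapper is checkpointed and the long lineage is never truncated.
trait RDDLike {
  var checkpointed = false
  def checkpoint(): Unit = { checkpointed = true }
}

class PartitionsRDD extends RDDLike

// Buggy version: inherits checkpoint(), which marks only the wrapper itself.
class VertexRDDBuggy(val partitionsRDD: PartitionsRDD) extends RDDLike

// Fixed version: overrides checkpoint() to delegate, as the email suggests.
class VertexRDDFixed(val partitionsRDD: PartitionsRDD) extends RDDLike {
  override def checkpoint(): Unit = partitionsRDD.checkpoint()
}
```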

Re: thrift jdbc server probably running queries as hive query

2014-11-10 Thread Cheng Lian
Hey Sadhan, I really don't think this is Spark log... Unlike Shark, Spark SQL doesn't even provide a Hive mode to let you execute queries against Hive. Would you please check whether there is an existing HiveServer2 running there? Spark SQL HiveThriftServer2 is just a Spark port of

Re: Checkpoint bugs in GraphX

2014-11-10 Thread GuoQiang Li
I have been trying to fix this bug. The related PR: https://github.com/apache/spark/pull/2631 -- Original -- From: Xu Lijie; lijie@gmail.com; Date: Tue, Nov 11, 2014 10:19 AM To: useru...@spark.apache.org; devdev@spark.apache.org; Subject: Checkpoint

Discuss how to do checkpoint more efficiently

2014-11-10 Thread Xu Lijie
Hi, all. I want to seek suggestions on how to do checkpointing more efficiently, especially for iterative applications written with GraphX. For iterative applications, the lineage of a job can be very long, which can easily cause a stack overflow error. A solution is to checkpoint. However,
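The usual workaround for unbounded lineage in iterative jobs is to checkpoint every N iterations. A Spark-free sketch of the idea, modeling lineage length as a counter (the interval of 10 is an arbitrary assumption, tuned per workload in practice):

```scala
// Each iteration adds one step to the dependency chain; a checkpoint
// truncates the chain back to zero. Without periodic checkpoints the
// lineage grows linearly and eventually overflows the stack.
val iterations = 100
val checkpointInterval = 10   // assumed value; tune per workload

var lineageLength = 0
var maxLineage = 0
for (i <- 1 to iterations) {
  lineageLength += 1          // one more map/join in the lineage
  if (i % checkpointInterval == 0) lineageLength = 0  // rdd.checkpoint()
  maxLineage = math.max(maxLineage, lineageLength)
}
```

The cost is that each checkpoint materializes the RDD to stable storage, which is exactly the efficiency concern this thread raises.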

Re: Checkpoint bugs in GraphX

2014-11-10 Thread GuoQiang Li
Many methods do not require serializing EdgeRDD or VertexRDD (e.g. graph.edges.count); moreover, partitionsRDD (or targetStorageLevel) is needed only in the driver. Leaving partitionsRDD (or targetStorageLevel) unserialized has no effect. -- Original -- From:

RE: Bind exception while running FlumeEventCount

2014-11-10 Thread Jeniba Johnson
Hi Hari, Just to give you some background, I had installed spark-1.1.0 and Apache Flume 1.4 with the basic configurations needed. I just wanted to know whether this is the correct way of running Spark Streaming examples with Flume. As for the TIME_WAIT parameter you had mentioned, I did not get

RE: Bind exception while running FlumeEventCount

2014-11-10 Thread Hari Shreedharan
First, can you try a different port? TIME_WAIT is basically a timeout before a socket is completely decommissioned and the port becomes available for binding again. If you wait a few minutes and still see a startup issue, can you also send the error logs? From what I can see, the port

Re: Bind exception while running FlumeEventCount

2014-11-10 Thread Hari Shreedharan
Did you start a Flume agent to push data to the relevant port? Thanks, Hari On Fri, Nov 7, 2014 at 2:05 PM, Jeniba Johnson jeniba.john...@lntinfotech.com wrote: Hi, I have installed spark-1.1.0 and apache flume 1.4 for running streaming example FlumeEventCount. Previously the code was

RE: Bind exception while running FlumeEventCount

2014-11-10 Thread Jeniba Johnson
Hi Hari, Meanwhile I am trying out a different port. I need to confirm with you about the installation of Spark and Flume. For installation, I have just unzipped spark-1.1.0-bin-hadoop1.tar.gz and apache-flume-1.4.0-bin.tar.gz for running the Spark Streaming examples. Is this the correct way