Any way to control memory usage when the streaming input arrives faster than Spark Streaming can process it?

2014-05-20 Thread Francis . Hu
sparkers,

 

Is there a better way to control memory usage when the streaming input arrives
faster than Spark Streaming can process it?

 

Thanks,

Francis.Hu
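
A note on one approach: Spark 0.9.x does not, as far as I know, offer built-in receiver rate limiting or back pressure, so one workaround is to keep received blocks serialized and let them spill to disk instead of holding everything in memory. The sketch below illustrates that with a plain TCP text stream standing in for the ZeroMQ receiver; the master URL, host, port, and batch interval are placeholders taken from the commands quoted later in this digest.

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

object SpillToDiskIngest {
  def main(args: Array[String]) {
    // Placeholder master URL and batch interval; adjust to your cluster.
    val ssc = new StreamingContext("spark://192.168.219.129:7077", "SpillToDiskIngest", Seconds(5))

    // Store received blocks serialized, replicated, and allowed to spill to disk,
    // instead of keeping deserialized objects pinned in a small heap.
    val lines = ssc.socketTextStream("192.168.20.118", 5556, StorageLevel.MEMORY_AND_DISK_SER_2)

    lines.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _).print()

    ssc.start()
    ssc.awaitTermination()
  }
}

The same StorageLevel argument can usually be passed to other receivers, including the ZeroMQ one. Spilling slows the job down, but it avoids the OutOfMemoryError described in the next thread.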



Help me: out of memory in Spark Streaming

2014-05-16 Thread Francis . Hu
Hi, all

 

I encountered OOM when streaming.

I send data to Spark Streaming through ZeroMQ at 600 records per second, but
the streaming program only handles 10 records per 5-second batch (as
configured in the streaming program).

My two workers each have a 4-core CPU and 1 GB of RAM.

The workers always run out of memory after a short while.

I tried adjusting the JVM GC arguments to speed up garbage collection. That
changed performance a little, but the workers eventually hit OOM anyway.

 

Is there any way to resolve it?

 

It would be appreciated if anyone could help me get it fixed!

 

 

Thanks,

Francis.Hu
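
With only 1 GB of RAM per worker, the received blocks plus the executor heap can exceed physical memory quickly at 600 records/s. Below is a hedged configuration sketch: the property names are from the Spark 0.9.x documentation, the values are placeholders for a 1 GB worker, and spark.cleaner.ttl was the commonly recommended setting for long-running streaming jobs of that era.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: settings aimed at reducing memory pressure on small (1 GB) workers.
val conf = new SparkConf()
  .setAppName("ZeroMQIngest")
  .set("spark.executor.memory", "512m")   // leave headroom for the OS and the worker daemon
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")   // smaller serialized blocks
  .set("spark.cleaner.ttl", "3600")   // periodically drop old RDDs and metadata (value in seconds)

val ssc = new StreamingContext(conf, Seconds(5))

Persisting the input DStream with a serialized, disk-spillable storage level (see the sketch in the first thread of this digest) also helps, since blocks spill to disk rather than fail with OOM.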



No configuration setting found for key 'akka.zeromq'

2014-05-14 Thread Francis . Hu
Hi, all

 

When I run the ZeroMQWordCount example on the cluster, the worker log says:
Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting
found for key 'akka.zeromq'

 

Actually, I can see that the reference.conf in
spark-examples-assembly-0.9.1.jar contains the configuration below.

Does anyone know what is happening?

 

#####################################
# Akka ZeroMQ Reference Config File #
#####################################

# This is the reference config file that contains all the default settings.
# Make your edits/overrides in your application.conf.

akka {

  zeromq {

    # The default timeout for a poll on the actual zeromq socket.
    poll-timeout = 100ms

    # Timeout for creating a new socket
    new-socket-timeout = 5s

    socket-dispatcher {
      # A zeromq socket needs to be pinned to the thread that created it.
      # Changing this value results in weird errors and race conditions within
      # zeromq
      executor = thread-pool-executor
      type = PinnedDispatcher
      thread-pool-executor.allow-core-timeout = off
    }
  }
}

 

Exception in worker

 

akka.actor.ActorInitializationException: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:218)
Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.zeromq'
14/05/06 21:26:19 ERROR actor.ActorCell: changing Recreate into Create after akka.actor.ActorInitializationException: exception during creation

 

 

Thanks,

Francis.Hu
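
One common cause of this exception is that only one reference.conf survives when several jars are merged into an assembly, so the akka.zeromq defaults shipped with akka-zeromq are lost at runtime. If you build your own application assembly with sbt-assembly, a hedged build.sbt sketch (using the sbt-assembly plugin syntax of that era; adapt to your plugin version) is to concatenate the config files instead of picking one:

import sbtassembly.Plugin._
import AssemblyKeys._

assemblySettings

mergeStrategy in assembly <<= (mergeStrategy in assembly) { old =>
  {
    // Concatenate all reference.conf files so sections such as akka.zeromq are not dropped.
    case "reference.conf" => MergeStrategy.concat
    case x                => old(x)
  }
}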



Re: Re: java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-0000/2/stdout (No such file or directory)

2014-05-11 Thread Francis . Hu
I have just resolved the problem by running the master and worker daemons
individually on the hosts where they live.

If I start them with the sbin/start-all.sh script, the problem always exists.

 

 

From: Francis.Hu [mailto:francis...@reachjunction.com]
Sent: Tuesday, May 06, 2014 10:31
To: user@spark.apache.org
Subject: Re: Re: java.io.FileNotFoundException:
/test/spark-0.9.1/work/app-20140505053550-/2/stdout (No such file or
directory)

 

I looked into the log again; all the exceptions are FileNotFoundExceptions.
In the web UI there is no further information I can check beyond the basic
description of the job.

I attached the log file. Could you help take a look? Thanks.

 

Francis.Hu

 

From: Tathagata Das [mailto:tathagata.das1...@gmail.com]
Sent: Tuesday, May 06, 2014 10:16
To: user@spark.apache.org
Subject: Re: Re: java.io.FileNotFoundException:
/test/spark-0.9.1/work/app-20140505053550-/2/stdout (No such file or
directory)

 

Can you check the Spark worker logs on that machine, either from the web UI or
directly? They should be under /test/spark-XXX/logs/. See if they show any error.

If there is no permission issue, I am not sure why stdout and stderr are not
being generated.

 

TD

 

On Mon, May 5, 2014 at 7:13 PM, Francis.Hu francis...@reachjunction.com wrote:

The files do not in fact exist, and there is no permission issue.

 

francis@ubuntu-4:/test/spark-0.9.1$ ll work/app-20140505053550-/
total 24
drwxrwxr-x  6 francis francis 4096 May  5 05:35 ./
drwxrwxr-x 11 francis francis 4096 May  5 06:18 ../
drwxrwxr-x  2 francis francis 4096 May  5 05:35 2/
drwxrwxr-x  2 francis francis 4096 May  5 05:35 4/
drwxrwxr-x  2 francis francis 4096 May  5 05:35 7/
drwxrwxr-x  2 francis francis 4096 May  5 05:35 9/

 

Francis

 

From: Tathagata Das [mailto:tathagata.das1...@gmail.com]
Sent: Tuesday, May 06, 2014 3:45
To: user@spark.apache.org
Subject: Re: java.io.FileNotFoundException:
/test/spark-0.9.1/work/app-20140505053550-/2/stdout (No such file or
directory)

 

Do those files actually exist? The stdout/stderr files should contain the output
of the Spark executors running on the workers, and it's weird that they don't
exist. Could it be a permission issue - maybe the directories/files are not being
generated because they cannot be?

 

TD

 

On Mon, May 5, 2014 at 3:06 AM, Francis.Hu francis...@reachjunction.com wrote:

Hi, all

 

 

We run a Spark cluster with three workers.
We created a Spark Streaming application,
and then ran the project using the command below:

sbt run spark://192.168.219.129:7077 tcp://192.168.20.118:5556 foo

We looked at the workers' web UI: the jobs failed without any error or other
info, but a FileNotFoundException showed up in the workers' log file as below.

Is this a known issue in Spark?

 

 

---- in workers' logs/spark-francis-org.apache.spark.deploy.worker.Worker-1-ubuntu-4.out ----

14/05/05 02:39:39 WARN AbstractHttpConnection: /logPage/?appId=app-20140505053550-&executorId=2&logType=stdout
java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-/2/stdout (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.spark.util.Utils$.offsetBytes(Utils.scala:687)
at org.apache.spark.deploy.worker.ui.WorkerWebUI.logPage(WorkerWebUI.scala:119)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.ui.JettyUtils$$anon$1.handle(JettyUtils.scala:61)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1040)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:976)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:363)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:483)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:920)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:982)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
at

java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-0000/2/stdout (No such file or directory)

2014-05-05 Thread Francis . Hu
Hi, all

 

 

We run a Spark cluster with three workers.
We created a Spark Streaming application,
and then ran the project using the command below:

sbt run spark://192.168.219.129:7077 tcp://192.168.20.118:5556 foo

We looked at the workers' web UI: the jobs failed without any error or other
info, but a FileNotFoundException showed up in the workers' log file as below.

Is this a known issue in Spark?

 

 

---- in workers' logs/spark-francis-org.apache.spark.deploy.worker.Worker-1-ubuntu-4.out ----

14/05/05 02:39:39 WARN AbstractHttpConnection: /logPage/?appId=app-20140505053550-&executorId=2&logType=stdout
java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-/2/stdout (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.spark.util.Utils$.offsetBytes(Utils.scala:687)
at org.apache.spark.deploy.worker.ui.WorkerWebUI.logPage(WorkerWebUI.scala:119)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.ui.JettyUtils$$anon$1.handle(JettyUtils.scala:61)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1040)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:976)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:363)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:483)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:920)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:982)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)

14/05/05 02:39:41 WARN AbstractHttpConnection: /logPage/?appId=app-20140505053550-&executorId=9&logType=stderr
java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-/9/stderr (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.spark.util.Utils$.offsetBytes(Utils.scala:687)
at org.apache.spark.deploy.worker.ui.WorkerWebUI.logPage(WorkerWebUI.scala:119)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.ui.JettyUtils$$anon$1.handle(JettyUtils.scala:61)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1040)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:976)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:363)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:483)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:920)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:982)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.j

Re: java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-0000/2/stdout (No such file or directory)

2014-05-05 Thread Francis . Hu
The files do not in fact exist, and there is no permission issue.

 

francis@ubuntu-4:/test/spark-0.9.1$ ll work/app-20140505053550-/
total 24
drwxrwxr-x  6 francis francis 4096 May  5 05:35 ./
drwxrwxr-x 11 francis francis 4096 May  5 06:18 ../
drwxrwxr-x  2 francis francis 4096 May  5 05:35 2/
drwxrwxr-x  2 francis francis 4096 May  5 05:35 4/
drwxrwxr-x  2 francis francis 4096 May  5 05:35 7/
drwxrwxr-x  2 francis francis 4096 May  5 05:35 9/

 

Francis

 

From: Tathagata Das [mailto:tathagata.das1...@gmail.com]
Sent: Tuesday, May 06, 2014 3:45
To: user@spark.apache.org
Subject: Re: java.io.FileNotFoundException:
/test/spark-0.9.1/work/app-20140505053550-/2/stdout (No such file or
directory)

 

Do those files actually exist? The stdout/stderr files should contain the output
of the Spark executors running on the workers, and it's weird that they don't
exist. Could it be a permission issue - maybe the directories/files are not being
generated because they cannot be?

 

TD

 

On Mon, May 5, 2014 at 3:06 AM, Francis.Hu francis...@reachjunction.com wrote:

Hi, all

 

 

We run a Spark cluster with three workers.
We created a Spark Streaming application,
and then ran the project using the command below:

sbt run spark://192.168.219.129:7077 tcp://192.168.20.118:5556 foo

We looked at the workers' web UI: the jobs failed without any error or other
info, but a FileNotFoundException showed up in the workers' log file as below.

Is this a known issue in Spark?

 

 

---- in workers' logs/spark-francis-org.apache.spark.deploy.worker.Worker-1-ubuntu-4.out ----

14/05/05 02:39:39 WARN AbstractHttpConnection: /logPage/?appId=app-20140505053550-&executorId=2&logType=stdout
java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-/2/stdout (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.spark.util.Utils$.offsetBytes(Utils.scala:687)
at org.apache.spark.deploy.worker.ui.WorkerWebUI.logPage(WorkerWebUI.scala:119)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.ui.JettyUtils$$anon$1.handle(JettyUtils.scala:61)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1040)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:976)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:363)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:483)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:920)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:982)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)

14/05/05 02:39:41 WARN AbstractHttpConnection: /logPage/?appId=app-20140505053550-&executorId=9&logType=stderr
java.io.FileNotFoundException: /test/spark-0.9.1/work/app-20140505053550-/9/stderr (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.spark.util.Utils$.offsetBytes(Utils.scala:687)
at org.apache.spark.deploy.worker.ui.WorkerWebUI.logPage(WorkerWebUI.scala:119)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.deploy.worker.ui.WorkerWebUI$$anonfun$6.apply(WorkerWebUI.scala:52)
at org.apache.spark.ui.JettyUtils$$anon$1.handle(JettyUtils.scala:61)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1040)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:976)
at

Issue during Spark streaming with ZeroMQ source

2014-04-29 Thread Francis . Hu
Hi, all

 

I installed spark-0.9.1 and ZeroMQ 4.0.1, and then ran the examples below:

 

./bin/run-example org.apache.spark.streaming.examples.SimpleZeroMQPublisher tcp://127.0.1.1:1234 foo.bar

./bin/run-example org.apache.spark.streaming.examples.ZeroMQWordCount local[2] tcp://127.0.1.1:1234 foo

 

No messages were received on the ZeroMQWordCount side.

Does anyone know what the issue is?

 

 

Thanks,

Francis
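
For reference, the receiving side of that example boils down to roughly the following (a sketch assuming the 0.9.x org.apache.spark.streaming.zeromq API; exact signatures may differ in other releases). Two things worth double-checking are that the publisher and subscriber use exactly the same tcp:// address, and that the bytes-to-objects converter matches how the publisher frames its messages.

import akka.util.ByteString
import akka.zeromq.Subscribe
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.zeromq.ZeroMQUtils

object MinimalZeroMQSubscriber {
  def main(args: Array[String]) {
    val ssc = new StreamingContext("local[2]", "MinimalZeroMQSubscriber", Seconds(2))

    // Decode every frame of each incoming ZeroMQ message as a UTF-8 string.
    def bytesToStringIterator(frames: Seq[ByteString]): Iterator[String] =
      frames.map(_.utf8String).iterator

    // Subscribe("foo") is a prefix subscription, so it also receives the "foo.bar" topic
    // that SimpleZeroMQPublisher publishes on.
    val lines = ZeroMQUtils.createStream(ssc, "tcp://127.0.1.1:1234", Subscribe("foo"), bytesToStringIterator _)

    lines.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _).print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Also note that the akka-zeromq module used by Spark 0.9.x was, as far as I know, built against the ZeroMQ 2.x line, so it may simply not interoperate with a ZeroMQ 4.0.1 installation; that is worth verifying.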

 



Re: java.lang.NoClassDefFoundError: scala/tools/nsc/transform/UnCurry$UnCurryTransformer...

2014-04-07 Thread Francis . Hu
Great!!!

When i built it on another disk whose format is ext4, it works right now.

hadoop@ubuntu-1:~$ df -Th
FilesystemType  Size  Used Avail Use% Mounted on
/dev/sdb6 ext4  135G  8.6G  119G   7% /
udev  devtmpfs  7.7G  4.0K  7.7G   1% /dev
tmpfs tmpfs 3.1G  316K  3.1G   1% /run
none  tmpfs 5.0M 0  5.0M   0% /run/lock
none  tmpfs 7.8G  4.0K  7.8G   1% /run/shm
/dev/sda1 ext4  112G  3.7G  103G   4% /faststore
/home/hadoop/.Private ecryptfs  135G  8.6G  119G   7% /home/hadoop

Thanks again, Marcelo Vanzin.


Francis.Hu

-----Original Message-----
From: Marcelo Vanzin [mailto:van...@cloudera.com]
Sent: Saturday, April 05, 2014 1:13
To: user@spark.apache.org
Subject: Re: java.lang.NoClassDefFoundError:
scala/tools/nsc/transform/UnCurry$UnCurryTransformer...

Hi Francis,

This might be a long shot, but do you happen to have built spark on an
encrypted home dir?

(I was running into the same error when I was doing that. Rebuilding
on an unencrypted disk fixed the issue. This is a known issue /
limitation with ecryptfs. It's weird that the build doesn't fail, but
you do get warnings about the long file names.)


On Wed, Apr 2, 2014 at 3:26 AM, Francis.Hu francis...@reachjunction.com wrote:
 I am stuck on a NoClassDefFoundError. Any help would be appreciated.

 I downloaded the Spark 0.9.0 source and then ran this command to build it:
 SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly


 java.lang.NoClassDefFoundError:
 scala/tools/nsc/transform/UnCurry$UnCurryTransformer$$anonfun$14$$anonfun$apply$5$$anonfun$scala$tools$nsc$transform$UnCurry$UnCurryTransformer$$anonfun$$anonfun$$transformInConstructor$1$1

-- 
Marcelo