Log4j files per spark job

2015-03-17 Thread Dan H.
Hey guys, looking for a bit of help on logging. I'm trying to get Spark to write log4j logs per job within a Spark cluster. So for example, I'd like: $SPARK_HOME/logs/job1.log.x $SPARK_HOME/logs/job2.log.x And I want this on the driver and on the executor. I'm trying to accomplish this by using

Spark Port Configuration

2014-12-23 Thread Dan H.
Hi all, I'm trying to lock down ALL Spark ports and have tried doing so both in spark-defaults.conf and via the sparkContext. (The example below was run in local[*] mode, but every attempt, whether run locally or through spark-submit.sh on a cluster via jar, produces the same results). My goal is to define all

Spark Streaming Workflow Validation

2014-08-07 Thread Dan H.
I wanted to post for validation to understand if there is a more efficient way to achieve my goal. I'm currently performing this flow for two distinct calculations executing in parallel: 1) Sum key/value pair, by using a simple witnessed count(apply 1 to a mapToPair() and then groupByKey() 2)

Re: Spark Streaming Workflow Validation

2014-08-07 Thread Dan H.
Yes, thanks, I did in fact mean reduceByKey(), thus allowing the convenience method to process the summation by key. Thanks for your feedback! DH -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Workflow-Validation-tp11677p11706.html Sent from
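
As a rough illustration of the summation by key mentioned in this reply, the sketch below pairs each item with a 1 and then combines the values per key. It is plain Python, not Spark's API: `reduce_by_key`, `events`, and `pairs` are hypothetical names chosen for the example, and the helper only mimics the idea of a reduce-by-key step on an in-memory list.

```python
def reduce_by_key(pairs, func):
    """Combine all values sharing a key using func (hypothetical helper, not Spark)."""
    combined = {}
    for key, value in pairs:
        # First value for a key is stored as-is; later values are folded in with func.
        combined[key] = func(combined[key], value) if key in combined else value
    return combined


# Assumed sample input: each event is mapped to a (key, 1) pair, then summed per key.
events = ["a", "b", "a", "c", "a"]
pairs = [(event, 1) for event in events]
counts = reduce_by_key(pairs, lambda x, y: x + y)
```

Unlike grouping first and summing afterwards, this folds each value into a running total per key, so the full list of values for a key is never held at once.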

Re: Spark Streaming and Storm

2014-07-09 Thread Dan H.
Xichen_tju, I recently evaluated Storm for a period of months (using 2Us, 2.4GHz CPU, 24 GB RAM with 3 servers) and was not able to achieve a realistic scale for my business domain needs. Storm is really only a framework, which allows you to put in code to do whatever it is you need for a

Re: reduceByKey Not Being Called by Spark Streaming

2014-07-03 Thread Dan H.
Hi All, I was able to resolve this matter with a simple fix. It seems that in order to process a reduceByKey and the flat map operations at the same time, the only way to resolve it was to increase the number of threads to more than 1. Since I'm developing on my personal machine for speed, I simply updated

reduceByKey Not Being Called by Spark Streaming

2014-07-02 Thread Dan H.
Hi all, I recently picked up Spark and am trying to work through a coding issue that involves the reduceByKey method. After various debugging efforts, it seems that the reduceByKey method never gets called. Here's my workflow, which is followed by my code and results: My parsed data