RE: question on SPARK_WORKER_CORES

2017-02-17 Thread Satish Lalam
Have you tried passing --executor-cores or --total-executor-cores as arguments, depending on the Spark version? From: kant kodali [mailto:kanth...@gmail.com] Sent: Friday, February 17, 2017 5:03 PM To: Alex Kozlov Cc: user @spark Subject: Re: question
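To make the two flags above concrete, here is a minimal sketch that assembles a spark-submit command line. The helper name build_submit_args, the master URL, and the application file app.py are hypothetical and only for illustration. As far as the spark-submit help text goes, --executor-cores sets the cores per executor and --total-executor-cores caps the cores across all executors on standalone and Mesos masters, so check both against the Spark version in use.

```python
def build_submit_args(master, app, executor_cores=None, total_executor_cores=None):
    """Assemble a spark-submit argument list (hypothetical helper, not part of Spark)."""
    args = ["spark-submit", "--master", master]
    if executor_cores is not None:
        # Cores used by each individual executor.
        args += ["--executor-cores", str(executor_cores)]
    if total_executor_cores is not None:
        # Upper bound on cores summed over all executors (standalone/Mesos masters).
        args += ["--total-executor-cores", str(total_executor_cores)]
    args.append(app)
    return args


if __name__ == "__main__":
    # Hypothetical master URL and application file.
    print(" ".join(build_submit_args("spark://host:7077", "app.py",
                                     executor_cores=2, total_executor_cores=8)))
```

Keeping the arguments in a list rather than one string makes it easy to hand them to subprocess.run without shell quoting issues.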

Re: Question about Parallel Stages in Spark

2017-06-27 Thread satish lalam
r driver options and code you've set will impact the application, but the YARN scheduler will not impact (beyond allocating cores, memory, etc. between applications.) On Tue, Jun 27, 2017 at 2:33

Re: Broadcasts & Storage Memory

2017-06-21 Thread satish lalam
My understanding is that it comes from storageFraction. Cached blocks in this region are immune to eviction, so both persisted RDDs and broadcast variables sit here. Ref
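As a rough worked example of where that storage region comes from, the sketch below does the unified-memory arithmetic in plain Python. It assumes the Spark 2.x defaults (spark.memory.fraction = 0.6, spark.memory.storageFraction = 0.5) and the 300 MB of reserved memory. The function name and the 4096 MB heap are illustrative only, and the real figures depend on the Spark version and configuration.

```python
RESERVED_MB = 300  # memory set aside before spark.memory.fraction is applied (assumed Spark 2.x value)


def unified_memory_mb(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Return (unified_mb, storage_mb) for a given executor heap size in MB.

    unified_mb: region shared by execution and storage.
    storage_mb: share of that region where cached blocks are immune to eviction.
    """
    unified = (heap_mb - RESERVED_MB) * memory_fraction
    storage = unified * storage_fraction
    return unified, storage


if __name__ == "__main__":
    unified, storage = unified_memory_mb(4096)  # hypothetical 4 GB executor heap
    print(f"unified: {unified:.1f} MB, eviction-immune storage: {storage:.1f} MB")
```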

Re: Why my project has this kind of error ?

2017-06-22 Thread satish lalam
Minglei - you could check your JDK path and Scala library setting in Project Structure, i.e., open the project view (Alt+1), then press F4 to open Project Structure, and look under SDKs and Libraries. On Mon, Jun 19, 2017 at 10:54 PM, 张明磊 wrote: > Hello to all, > > Below

Re: Question about Parallel Stages in Spark

2017-06-27 Thread satish lalam
Thanks all. To reiterate: stages inside a job can run in parallel as long as (a) there is no sequential dependency and (b) the job has sufficient resources. However, my code was launching 2 jobs, and they are sequential, as you rightly pointed out. The issue which I was trying to highlight with that
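Condition (a) above can be illustrated with a small plain-Python sketch that groups stages into "waves", where every stage in a wave has all of its parents finished and so could run alongside the others. This is not Spark's DAGScheduler, the stage names are hypothetical, and condition (b), available resources, is ignored here.

```python
def parallel_waves(deps):
    """Group stages into waves of mutually independent stages.

    deps maps each stage name to the list of stages it depends on.
    Every parent must itself appear as a key in deps.
    """
    remaining = {stage: set(parents) for stage, parents in deps.items()}
    done = set()
    waves = []
    while remaining:
        # A stage is ready once all of its parents have completed.
        ready = {s for s, parents in remaining.items() if parents <= done}
        if not ready:
            raise ValueError("cyclic or unknown dependency between stages")
        waves.append(sorted(ready))
        done |= ready
        for s in ready:
            del remaining[s]
    return waves


if __name__ == "__main__":
    # Hypothetical job: two independent map stages feeding one join stage.
    print(parallel_waves({"stage0": [], "stage1": [], "stage2": ["stage0", "stage1"]}))
```

In this example stage0 and stage1 land in the same wave, while stage2 has to wait for both.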

Re: Spark Streaming Design Suggestion

2017-06-14 Thread satish lalam
Agree with Jörn. Dynamically creating/deleting topics is nontrivial to manage. With the limited knowledge about your scenario, it appears that you are using topics as some kind of message type enum. If that is the case, you might be better off with one (or just a few) topics and have a

Re: Read Local File

2017-06-14 Thread satish lalam
I guess you have already made sure that the paths for your file are exactly the same on each of your nodes. I'd also check the permissions on your path. I believe the sample code you pasted is only for testing, and you are already aware that a distributed count on a local file has no benefits. Once I ran