Hi all,
I upgraded my Hadoop cluster, which includes Spark 1.6.0, and I noticed that
sometimes the job runs with Scala version 2.10.5 and sometimes with
2.10.4. Any idea why this is happening?
I had a similar situation and landed on this question.
I was finally able to make it do what I needed by cheating the Spark driver
:)
i.e. by setting a very high value with "--conf spark.task.maxFailures=800".
I made it 800 deliberately, whereas the default is typically 4. So by the
time 800 attempts for failed tasks
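For reference, a minimal sketch of the same workaround applied
programmatically instead of on the command line (the app name here is made
up, and this assumes Spark 2.x with SparkSession):

    import org.apache.spark.SparkConf;
    import org.apache.spark.sql.SparkSession;

    // Same trick as --conf spark.task.maxFailures=800: raise the per-task
    // retry limit far above the default of 4 so the job keeps retrying.
    SparkConf conf = new SparkConf()
            .setAppName("retry-heavy-job")          // hypothetical app name
            .set("spark.task.maxFailures", "800");

    SparkSession spark = SparkSession.builder()
            .config(conf)
            .getOrCreate();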
Hi All,
I just wanted to check if there are any best practices around using CI/CD
for Spark / Scala projects running on AWS Hadoop clusters.
If there are any specific tools, please do let me know.
--
Thanks
Deepak
Have you tried passing in a Map that happens to have
strings for all the values? I haven't tested this, but the underlying
Kafka consumer constructor is documented to take either strings or
objects as values, despite the static type.
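Something along these lines (untested sketch; the broker and group id are
made up):

    import java.util.HashMap;
    import java.util.Map;

    // Untested: every value is a plain String, including the deserializer
    // class names, so no switch / Class.forName mapping is needed. The
    // underlying KafkaConsumer is documented to resolve class-name
    // strings on its own.
    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put("bootstrap.servers", "localhost:9092");  // assumed broker
    kafkaParams.put("group.id", "example-group");            // assumed group id
    kafkaParams.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
    kafkaParams.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");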
On Wed, Jan 24, 2018 at 2:48 PM, Tecno Brain
wrote:
> Basically
Hello All,
Cordial Greetings,
I am trying to familiarize myself with Apache Hadoop and its different
software components and how they can be deployed on physical or virtual
infrastructure.
I have a few questions:
Q1) Can we use MapReduce and Apache Spark in the same cluster?
Q2) Is it mandatory
Basically, I am trying to avoid writing code like:
switch (key) {
    case "key.deserializer":   result.put(key, Class.forName(value)); break;
    case "key.serializer":     result.put(key, Class.forName(value)); break;
    case "value.deserializer":
On page
https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html
there is this Java example:
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "localhost:9092,anotherhost:9092");
kafkaParams.put("key.deserializer", StringDeserializer.class);
kafkaParams.put("value.deserializer", StringDeserializer.class);
Hi,
Which versions of Spark does the tool support?
Does the tool have any reference to the number of executor cores?
From your blog post it seems that this is a new feature of your service.
Are you offering the tool for download?
Shmuel
On Wed, Jan 24, 2018 at 7:02 PM, Timothy Chen wrote:
>
When you say the patch is not suitable, can you clarify why?
Probably best to get the various findings centralized on
https://issues.apache.org/jira/browse/SPARK-17147
Happy to help with getting the patch up to date and working.
On Wed, Jan 24, 2018 at 1:19 AM, namesuperwood wrote:
> It seems t
Interested to try as well.
Tim
On Tue, Jan 23, 2018 at 5:54 PM, Raj Adyanthaya wrote:
> It's very interesting and I do agree that it will get a lot of traction once
> made open source.
>
> On Mon, Jan 22, 2018 at 9:01 PM, Rohit Karlupia wrote:
>>
>> Hi,
>>
>> I have been working on making the pe
Hi Michael.
I haven't had this particular issue previously, but I have had other
performance issues.
Some questions which may help:
1. Have you checked the Spark Console?
2. Have you isolated the query in question? Are you sure it's actually
where the slowdown occurs?
3. How much data are you ta
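On point 2, one rough way to isolate it (the query below is just a
stand-in):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    // Sketch: print every plan stage for the suspect query, then force a
    // minimal execution so planning cost and run cost can be told apart.
    Dataset<Row> result = spark.sql("SELECT 1 AS x");  // stand-in for the suspect query
    result.explain(true);    // parsed, analyzed, optimized, and physical plans
    result.limit(1).show();  // small action that actually triggers execution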
Hi all,
I have a problem with the performance of the sparkSession.sql call. It
takes up to a couple of seconds for me right now. I have a lot of
generated temporary tables, which are registered within the session,
and also a lot of temporary data frames. Is it possible that the
analysis/resolve/an
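A minimal sketch of how one might confirm the time is going into analysis
rather than execution (the view name is made up):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    // sparkSession.sql() parses and analyzes the query eagerly, so timing
    // the call alone isolates the analysis/resolve cost; execution starts
    // only on an action such as collect().
    long start = System.nanoTime();
    Dataset<Row> df = sparkSession.sql("SELECT * FROM some_generated_view");  // hypothetical view
    long analysisMs = (System.nanoTime() - start) / 1_000_000L;
    System.out.println("sql() analysis took " + analysisMs + " ms");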