Hi all,
I upgraded my Hadoop cluster, which includes Spark 1.6.0, and I noticed that
sometimes the job runs with Scala version 2.10.5 and sometimes with
2.10.4. Any idea why this is happening?
I had a similar situation and landed on this question.
I was finally able to make it do what I needed by cheating the Spark driver
:)
i.e. by setting a very high value with "--conf spark.task.maxFailures=800".
I made it 800 deliberately, whereas the default is typically 4. So by the
time 800 attempts for failed tasks
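For reference, a minimal sketch of the same workaround applied
programmatically instead of on the command line (the app name here is made
up, and this assumes Spark 2.x with SparkSession):

    import org.apache.spark.SparkConf;
    import org.apache.spark.sql.SparkSession;

    // Same trick as --conf spark.task.maxFailures=800: raise the per-task
    // retry limit far above the default of 4 so the job keeps retrying.
    SparkConf conf = new SparkConf()
            .setAppName("retry-heavy-job")          // hypothetical app name
            .set("spark.task.maxFailures", "800");

    SparkSession spark = SparkSession.builder()
            .config(conf)
            .getOrCreate();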
Hi All,
I just wanted to check if there are any best practices around using CI/CD
for Spark / Scala projects running on AWS Hadoop clusters.
If there are any specific tools, please do let me know.
--
Thanks
Deepak
Have you tried passing in a Map that happens to have
strings for all the values? I haven't tested this, but the underlying
Kafka consumer constructor is documented to take either strings or
objects as values, despite the static type.
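Something along these lines (untested sketch; the broker and group id are
made up):

    import java.util.HashMap;
    import java.util.Map;

    // Untested: every value is a plain String, including the deserializer
    // class names, so no switch / Class.forName mapping is needed. The
    // underlying KafkaConsumer is documented to resolve class-name
    // strings on its own.
    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put("bootstrap.servers", "localhost:9092");  // assumed broker
    kafkaParams.put("group.id", "example-group");            // assumed group id
    kafkaParams.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
    kafkaParams.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");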
On Wed, Jan 24, 2018 at 2:48 PM, Tecno Brain
wrote:
> Basically
Hello All,
Cordial Greetings,
I am trying to familiarize myself with Apache Hadoop and its different
software components and how they can be deployed on physical or virtual
infrastructure.
I have a few questions:
Q1) Can we use MapReduce and Apache Spark in the same cluster?
Q2) Is it mandatory
Basically, I am trying to avoid writing code like:
switch (key) {
    case "key.deserializer":   result.put(key, Class.forName(value)); break;
    case "key.serializer":     result.put(key, Class.forName(value)); break;
    case "value.deserializer":
On page
https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html
there is this Java example:
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "localhost:9092,anotherhost:9092");
kafkaParams.put("key.deserializer", StringDeserializer.class);
kafkaParams.put("value.deserializer", StringDeserializer.class);
Hi,
Which versions of Spark does the tool support?
Does the tool have any reference to the number of executor cores?
From your blog post it seems that this is a new feature of your service.
Are you offering the tool for download?
Shmuel
On Wed, Jan 24, 2018 at 7:02 PM, Timothy Chen wrote:
>
When you say the patch is not suitable, can you clarify why?
Probably best to get the various findings centralized on
https://issues.apache.org/jira/browse/SPARK-17147
Happy to help with getting the patch up to date and working.
On Wed, Jan 24, 2018 at 1:19 AM, namesuperwood wrote:
> It seems t
Interested to try as well.
Tim
On Tue, Jan 23, 2018 at 5:54 PM, Raj Adyanthaya wrote:
> It's very interesting and I do agree that it will get a lot of traction once
> made open source.
>
> On Mon, Jan 22, 2018 at 9:01 PM, Rohit Karlupia wrote:
>>
>> Hi,
>>
>> I have been working on making the pe
Hi Michael.
I haven't had this particular issue previously, but I have had other
performance issues.
Some questions which may help:
1. Have you checked the Spark Console?
2. Have you isolated the query in question? Are you sure it's actually
where the slowdown occurs?
3. How much data are you ta
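On point 2, one rough way to isolate it (the query below is just a
stand-in):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    // Sketch: print every plan stage for the suspect query, then force a
    // minimal execution so planning cost and run cost can be told apart.
    Dataset<Row> result = spark.sql("SELECT 1 AS x");  // stand-in for the suspect query
    result.explain(true);    // parsed, analyzed, optimized, and physical plans
    result.limit(1).show();  // small action that actually triggers execution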
Hi all,
I have a problem with the performance of the sparkSession.sql call. It
takes up to a couple of seconds for me right now. I have a lot of
generated temporary tables, which are registered within the session,
and also a lot of temporary data frames. Is it possible that the
analysis/resolve/an
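A minimal sketch of how one might confirm the time is going into analysis
rather than execution (the view name is made up):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    // sparkSession.sql() parses and analyzes the query eagerly, so timing
    // the call alone isolates the analysis/resolve cost; execution starts
    // only on an action such as collect().
    long start = System.nanoTime();
    Dataset<Row> df = sparkSession.sql("SELECT * FROM some_generated_view");  // hypothetical view
    long analysisMs = (System.nanoTime() - start) / 1_000_000L;
    System.out.println("sql() analysis took " + analysisMs + " ms");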