Hi everyone,
I'm experiencing a problem with my Spark Streaming job when running it on
YARN.
The problem appears only when running this application in a YARN queue
along with other Tez/MR applications.
The problem is in processing time, which exceeds 1 minute for batches of 1
second.
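To make the setup concrete, here is a minimal Scala sketch (not the poster's
actual code) of how such a job might be configured: a 1-second batch interval
and an explicit YARN queue, as described above. The queue name, app name, and
the backpressure setting are illustrative assumptions.

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val conf = new SparkConf()
    .setAppName("streaming-on-yarn")                      // assumed app name
    .set("spark.yarn.queue", "streaming")                 // assumed dedicated queue, instead of sharing with Tez/MR jobs
    .set("spark.streaming.backpressure.enabled", "true")  // available from Spark 1.5; throttles ingestion when batches fall behind
  val sc = new SparkContext(conf)
  val ssc = new StreamingContext(sc, Seconds(1))          // 1-second batches, as in the report above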
>>> called 10 times, although it gets consistently called 50
>>> times, but the resulting DF is correct and when executing a count() it
>>> properly returns 10, as expected.
>>>
>>> I have changed my code to work directly with RDDs using mapPartitions
>>
>> As additional information, I have set spark.speculation to false and no
>> tasks failed.
>>
>> I am working on a smaller example that would isolate this potential issue,
>> but in the meantime I would like to know if somebody encountered this
>> issue.
>>
>> Thank you.
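For anyone trying to reproduce this, here is a minimal spark-shell sketch (not
the poster's code) that counts how many times a UDF body actually runs versus
how many rows the DataFrame has, plus the mapPartitions workaround mentioned
above. The column name, the 10-row range, and the accumulator are illustrative
assumptions.

  import org.apache.spark.sql.functions.udf

  val calls = sc.accumulator(0L, "udf calls")                   // counts UDF invocations on the executors
  val tag = udf { (s: String) => calls += 1L; s.toUpperCase }   // side effect only for diagnosis

  val df = sqlContext.range(0, 10).selectExpr("cast(id as string) as value")
  val out = df.withColumn("tagged", tag(df("value")))
  println(out.count())      // returns 10 rows, as expected
  out.collect()             // materializes the UDF column
  println(calls.value)      // per the report above, this can come back much larger than the row count

  // Workaround from the thread: drop to the RDD API and do the work in mapPartitions.
  val viaRdd = df.rdd.mapPartitions(_.map(r => r.getString(0).toUpperCase))
  println(viaRdd.count())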
Turns out the data is in Python format. The ETL pipeline was overwriting the
original data.
Andy
From: Andrew Davidson
Date: Thursday, November 19, 2015 at 6:58 PM
To: "user @spark"
Subject: spark streaming problem saveAsTextFiles() does not write valid
JSON to HDFS
I am working on a simple POC. I am running into a really strange problem. I
wrote a Java streaming app that collects tweets using the spark twitter
package and stores them to disk in JSON format. I noticed that when I run the
code on my Mac, the files are written to the local file system as I expect.
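For context, here is a minimal Scala sketch of the pattern being described
(the original app was written in Java): collect tweets with the spark twitter
package and write one JSON document per line with saveAsTextFiles(). The batch
interval, output path, and the use of json4s for serialization are
illustrative assumptions, not the poster's code; the point is that each record
has to be serialized to real JSON before it is written, and, per the follow-up
above, a downstream ETL step that rewrites the files can still leave
Python-style data in their place.

  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.twitter.TwitterUtils
  import org.json4s.JsonDSL._
  import org.json4s.jackson.JsonMethods.{compact, render}

  val ssc = new StreamingContext(sc, Seconds(30))       // assumed batch interval
  val tweets = TwitterUtils.createStream(ssc, None)     // twitter4j credentials come from system properties

  // Serialize each status to a proper JSON string; writing a raw toString here
  // is what tends to produce output that JSON parsers later reject.
  val asJson = tweets.map { s =>
    compact(render(("id" -> s.getId) ~ ("user" -> s.getUser.getScreenName) ~ ("text" -> s.getText)))
  }
  asJson.saveAsTextFiles("hdfs:///tmp/tweets/batch")    // assumed path; one directory per batch
  ssc.start()
  ssc.awaitTermination()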
I also started the streaming context by running ssc.start() but still, apart
from logs, nothing gets printed.
-- Forwarded message --
From: Animesh Baranawal
Date: Thu, May 28, 2015 at 6:57 PM
Subject: SPARK STREAMING PROBLEM
To: user@spark.apache.org
Hi,
I am trying to extract the filenames from which a DStream is generated by
parsing the toDebugString method on RDD.
You must start the StreamingContext by calling ssc.start()
On Thu, May 28, 2015 at 6:57 PM, Animesh Baranawal <animeshbarana...@gmail.com> wrote:
> Hi,
>
> I am trying to extract the filenames from which a DStream is generated by
> parsing the toDebugString method on RDD
> I am implementing the
Hi,
I am trying to extract the filenames from which a DStream is generated by
parsing the toDebugString method on RDD.
I am implementing the following code in spark-shell:
import org.apache.spark.streaming.{StreamingContext, Seconds}
val ssc = new StreamingContext(sc,Seconds(10))
val lines = ssc.t
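The snippet above is cut off; here is a minimal sketch of the full approach,
assuming the truncated line was creating a text file stream. The monitored
directory and the hdfs:// filter are illustrative assumptions.

  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val ssc = new StreamingContext(sc, Seconds(10))
  val lines = ssc.textFileStream("hdfs:///tmp/streaming-input")   // assumed input directory

  lines.foreachRDD { rdd =>
    // toDebugString prints the RDD lineage; the per-file Hadoop RDDs in it carry
    // the file paths, which is what the parsing described above relies on.
    val files = rdd.toDebugString.split("\n").filter(_.contains("hdfs://"))
    files.foreach(println)
  }

  ssc.start()              // required, as pointed out in the reply above
  ssc.awaitTermination()   // keep the driver alive so batches keep getting processed

If only logs appear, a common cause is that no output operation was registered
before start(), or that the driver exits before any batch completes.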