Hello, Spark community!
I've been struggling with a job that constantly fails because it cannot
decompress some previously compressed blocks while shuffling data.
I use Spark 2.2.0 with all configuration settings left at their defaults (no
specific compression codec is specified). I've
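For anyone debugging the same thing: the shuffle codec can be changed from the default (lz4 in Spark 2.2.0) as a diagnostic step. A sketch of the relevant spark-defaults.conf entries (property names are from the Spark configuration docs; the values here are examples only, not a recommendation):

```properties
# Default codec in Spark 2.2.0 is lz4; valid values include lz4, lzf, snappy.
# Switching codecs can help isolate whether the failure is codec-specific.
spark.io.compression.codec       snappy

# Shuffle compression can also be disabled entirely to rule the codec out
# (at the cost of more shuffle I/O):
spark.shuffle.compress           false
spark.shuffle.spill.compress     false
```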
Hi All,
I have to call an Oracle sequence from Spark. Can you please tell me the
way to do that?
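Spark has no sequence-specific API, so one common approach is to push the `NEXTVAL` call down to Oracle as a JDBC subquery. A sketch only, not verified against your setup: the sequence name `my_seq`, the connection URL, and the credentials below are all hypothetical placeholders.

```python
# Hypothetical sequence name; Oracle resolves NEXTVAL server-side.
seq_name = "my_seq"

# Wrap the sequence call in a subquery so it is valid as the JDBC "dbtable"
# option (Spark wraps dbtable in a SELECT, so a bare sequence call won't work).
query = f"(SELECT {seq_name}.NEXTVAL AS seq_val FROM dual) t"

# With a live SparkSession and the Oracle JDBC driver on the classpath, this
# would fetch the next value (URL and credentials are placeholders):
# df = (spark.read.format("jdbc")
#       .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")
#       .option("dbtable", query)
#       .option("user", "scott")
#       .option("password", "tiger")
#       .load())
print(query)
```

Note that each executor task opening its own connection would draw values from the sequence independently, so this pattern is usually driven from a single JDBC read rather than inside a distributed map.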
Thanks
Rajat
Hi community,
I am using Spark on YARN. When submitting a job, after a long time I get an
error message and a retry.
It happens when I want to store the DataFrame to a table.
spark_df.write.option("path",
"/nlb_datalake/golden_zone/webhose/sentiment").saveAsTable("news_summary_test",
Using a Kafka consumer with a 2-minute batch, tasks generally take 2 to 5
seconds, but some tasks take more than 40 seconds. I suspect
*CachedKafkaConsumer#poll* could be the problem.
private def poll(timeout: Long): Unit = {
  val p = consumer.poll(timeout)
  val r = p.records(topicPartition)
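If the slow tasks really are stuck in that poll, the timeout it receives is configurable in the 0.10 Kafka integration. A sketch of the relevant setting (the value is an example only):

```properties
# Timeout (ms) passed down to the Kafka consumer poll; when unset it
# falls back to spark.network.timeout. Lowering it surfaces poll stalls
# as failures instead of long task times.
spark.streaming.kafka.consumer.poll.ms    10000
```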
Hi
Could increasing the Kafka topic's partition count help?
On 2019/8/13 10:53 PM, Amit Sharma wrote:
I am using Kafka with Spark Streaming. My UI application sends requests to
the streaming job through Kafka. The problem is that the streaming job
handles one request at a time, so if multiple users send requests at the
same time they
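One lever sometimes used for this is letting the streaming scheduler run more than one batch job at a time. Treat this as an assumption to test, not a guarantee: the property is undocumented and can break ordering guarantees between batches.

```properties
# Number of streaming jobs the scheduler may run concurrently (default 1).
# Undocumented; batches may complete out of order when this is raised.
spark.streaming.concurrentJobs    2
```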
Hi,
Maybe you can look at the Spark UI; the physical plan carries no timing
information.
On 2019/8/13 10:45 PM, Marcelo Valle wrote:
Hi,
I have a job running on AWS EMR. It's basically a join between 2
tables (Parquet files on S3), one somewhat large (around 50 GB) and the
other small (less
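For a join between one ~50 GB table and one small table, the usual first step is making sure the small side is broadcast rather than shuffled. A sketch, assuming the small table fits comfortably in executor memory (the threshold value below is an example):

```properties
# Tables below this size (bytes) are broadcast to every executor for joins,
# avoiding a shuffle of the 50 GB side. The default is 10 MB; raise it if
# the small table is larger than that (example: 100 MB).
spark.sql.autoBroadcastJoinThreshold    104857600
```

Equivalently, a `broadcast()` hint on the small DataFrame forces the same plan regardless of the threshold.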