Hi Team, we are facing an issue in production where we frequently get "Still have 1 request outstanding when connection with the hostname was closed: connection reset by peer" errors, as well as warnings such as "failed to remove cache rdd" or "failed to remove broadcast variable". Please help us how to
Hi,
Are there any best practices for managing HBase connections with
Kerberos authentication in a Spark Streaming (YARN) environment?
I want to know how executors manage HBase connections: how to create
them, close them, and refresh Kerberos credentials before they expire.
Thanks.
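A common pattern (a sketch, not the only way; the principal and keytab names below are assumptions) is to hold one HBase connection per executor JVM in a lazy singleton, log in from a keytab shipped to executors with --files, and re-login before the ticket expires:

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory}
import org.apache.hadoop.security.UserGroupInformation

// One connection per executor JVM, created lazily the first time a task needs it.
object HBaseConnectionHolder {
  // Hypothetical principal/keytab -- ship the keytab to executors via --files.
  private val principal = "appuser@EXAMPLE.COM"
  private val keytab    = "appuser.keytab"

  private lazy val ugi: UserGroupInformation =
    UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)

  lazy val connection: Connection = ugi.doAs(
    new PrivilegedExceptionAction[Connection] {
      override def run(): Connection =
        ConnectionFactory.createConnection(HBaseConfiguration.create())
    })

  // Call at the start of each partition: re-logs in from the keytab only
  // when the TGT is close to expiring, so it is cheap to call often.
  def relogin(): Unit = ugi.checkTGTAndReloginFromKeytab()
}
```

Inside foreachPartition you would call HBaseConnectionHolder.relogin() before using the connection, and close the connection in a JVM shutdown hook rather than per task.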
Hi,
We want to submit a spark streaming job to YARN that consumes a Kafka topic.
YARN and Kafka are in two different clusters, and they have
different Kerberos authentication.
We have two keytab files, one for YARN and one for Kafka.
And my question is how to add parameters to the spark-submit command for
th
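One approach (a hedged sketch; the broker address, principals, and keytab names are assumptions) is to let spark-submit's --principal/--keytab handle the YARN side, and authenticate to the Kafka cluster separately through the consumer's SASL settings, pointing a JAAS entry at the second keytab shipped with --files (this form needs the 0.10.2+ Kafka client):

```scala
// YARN side: spark-submit --principal yarn-user@YARN.REALM --keytab yarn.keytab \
//            --files kafka.keytab ...
// Kafka side: the consumer logs in with the second keytab via an inline JAAS entry.
val kafkaParams = Map[String, Object](
  "bootstrap.servers"          -> "kafka-broker:9092",
  "group.id"                   -> "my-streaming-group",
  "security.protocol"          -> "SASL_PLAINTEXT",
  "sasl.kerberos.service.name" -> "kafka",
  "sasl.jaas.config" ->
    ("com.sun.security.auth.module.Krb5LoginModule required " +
     "useKeyTab=true storeKey=true " +
     "keyTab=\"./kafka.keytab\" principal=\"kafka-user@KAFKA.REALM\";")
)
```

The keytab path is relative ("./kafka.keytab") because --files places shipped files in each executor's working directory.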
Dear,
I use Spark to deserialize some files to restore my own class objects.
The Spark code and the class deserialization code (using Apache Commons Lang) look like this:
val fis = spark.sparkContext.binaryFiles("/folder/abc*.file")
val RDD = fis.map(x => {
val content = x._2.toArray()
val b = Block.de
We can run it successfully in Spark local mode, but when we run it in
YARN cluster mode, the error happens.
On 2019/6/26 5:52 PM, big data wrote:
I use Apache Commons Lang3's SerializationUtils in the code.
SerializationUtils.serialize() to store a customized class as files into disk
and SerializationUtils.deserialize(byte[]) to restore them again.
In Spark local mode, all serialized files can be deserialized normally and
no error ha
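A frequent cause of this local-vs-cluster difference is class resolution: in yarn-cluster mode the application jar is loaded by Spark's executor classloader, which a plain ObjectInputStream does not consult, so readObject can throw ClassNotFoundException for your own class. A hedged workaround is to deserialize through the thread's context classloader yourself instead of relying on SerializationUtils.deserialize:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream,
                ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

// Plain Java serialization, equivalent to what SerializationUtils.serialize does.
def serialize(obj: java.io.Serializable): Array[Byte] = {
  val bos = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bos)
  try out.writeObject(obj) finally out.close()
  bos.toByteArray
}

// Deserialize, resolving classes against the current context classloader --
// in yarn-cluster mode application classes live in Spark's executor
// classloader, which a default ObjectInputStream does not search.
def deserialize[T](bytes: Array[Byte]): T = {
  val in = new ObjectInputStream(new ByteArrayInputStream(bytes)) {
    override def resolveClass(desc: ObjectStreamClass): Class[_] =
      Class.forName(desc.getName, false,
        Thread.currentThread().getContextClassLoader)
  }
  try in.readObject().asInstanceOf[T] finally in.close()
}
```

Using this deserialize in the binaryFiles map function keeps the rest of the code unchanged.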
In my opinion, Bitmap is the best solution for active-user calculation. Other
solutions are mostly based on a count(distinct) calculation process, which is much
slower.
If you've implemented a Bitmap solution, including how to build the Bitmap and how to
load it, then Bitmap is the best choice.
On 2019/6
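The idea can be sketched with the standard library's bitset (real deployments typically use RoaringBitmap, which handles sparse ids better -- an assumption about the setup): each day's active users become set bits keyed by user id, daily actives are a cardinality, and multi-day actives are a cheap union rather than a re-count over raw rows:

```scala
import scala.collection.mutable

// One bitset per day; bit i is set when user id i was active that day.
def dailyBitmap(userIds: Seq[Int]): mutable.BitSet = mutable.BitSet(userIds: _*)

val monday  = dailyBitmap(Seq(1, 2, 5, 9))
val tuesday = dailyBitmap(Seq(2, 5, 7))

val dauTuesday    = tuesday.size            // distinct actives on Tuesday: 3
val twoDayActives = (monday | tuesday).size // union of the two days: 5
```

This is why it beats count(distinct): merging days is a bitwise OR over precomputed bitmaps instead of a shuffle-and-dedup over all events.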
Hi all,
I have many binary files stored in HDFS, and I use SparkContext.binaryFiles
to load them into an RDD, then transfer them to be calculated.
The bottleneck is loading the files; are there any solutions to improve
binary file loading performance?
Thanks.
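binaryFiles tends to pack many small files into few partitions, so the read can end up on only a handful of tasks. A hedged sketch (the path and partition counts are assumptions, to be tuned against your cluster) is to ask for more partitions up front and rebalance before the heavy computation:

```scala
// Request more input partitions so file reads spread across executors.
val files = spark.sparkContext.binaryFiles("/data/blobs/*", minPartitions = 64)

// Rebalance before the expensive stage if a few partitions got most of the bytes.
val balanced = files.repartition(200)
```

On Spark 2.1+, spark.files.maxPartitionBytes also influences how many files are grouped into one binaryFiles partition, so lowering it is another knob worth trying.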
Hi,
our project includes this dependency:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka_2.11</artifactId>
    <version>1.6.3</version>
</dependency>
From the dependency tree, we can see it depends on kafka_2.11 version 0.8.2.1.
But when we move this dependency to the parent pom file, the dependency
Hi all,
we have two environments for a spark streaming job, which consumes a Kafka
topic to do calculation.
Now in one environment, the spark streaming job consumed non-standard data
from Kafka and threw an exception (not caught in code), then the
streaming job went down.
But in another environment,
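One way to keep the job up regardless of which environment feeds it bad records is to parse defensively inside the stream, dropping (and logging) anything malformed instead of letting the exception escape and kill the StreamingContext. A minimal sketch with a stand-in parser (parseRecord here is hypothetical, standing in for the real decoding):

```scala
import scala.util.{Failure, Success, Try}

// Stand-in for the real record parsing that currently throws on bad input.
def parseRecord(raw: String): Int = raw.trim.toInt

// Wrap the parse in Try so one non-standard message cannot bring the job down.
def safeParse(raw: String): Option[Int] = Try(parseRecord(raw)) match {
  case Success(v) => Some(v)
  case Failure(_) => None // in a real job: log the bad record, bump an error metric
}

// Inside the DStream transformation you would flatMap with safeParse;
// malformed records are simply skipped.
val parsed = Seq("1", "oops", "3").flatMap(safeParse) // keeps 1 and 3
```

The same wrapping belongs around any downstream step that can throw on data-dependent input, not just the initial decode.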
Hi
I am executing the following recommendation engine using Spark ML:
https://aws.amazon.com/blogs/big-data/building-a-recommendation-engine-with-spark-ml-on-amazon-emr-using-zeppelin/
When I try to save the model, the application hangs and doesn't
respond.
Any pointers to find wher
Convert the sex, country, attr1, and attr2 columns' values to
double type directly in the Spark job.
thanks.
On 16/12/20 9:37 PM, theodondre wrote:
Give a snippet of the data.
From: big data <mailto:bigdatab...@outlook.com
our source data are string-based, like this:
col1 col2 col3 ...
aaa  bbb  ccc
aa2  bb2  cc2
aa3  bb3  cc3
...  ...  ...
How can we convert all of these data to double to apply MLlib's algorithms?
thanks.
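For categorical strings like sex and country, a direct cast to double will not work; Spark ML's StringIndexer assigns each distinct value a double index instead. A sketch, assuming a DataFrame df holding the four columns (the column names come from the thread; everything else is assumption):

```scala
import org.apache.spark.ml.feature.StringIndexer

// Index each categorical string column into a numeric companion column.
val indexers = Seq("sex", "country", "attr1", "attr2").map { col =>
  new StringIndexer().setInputCol(col).setOutputCol(s"${col}_idx")
}
val indexed = indexers.foldLeft(df)((d, ix) => ix.fit(d).transform(d))

// Columns that hold numbers as strings can simply be cast instead:
// df.withColumn("col1", df("col1").cast("double"))
```

Whether to index or cast depends on the column: indexing for categories, casting for numeric strings; one-hot encoding the indexed columns is often the next step for linear models.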