I suggest reading «Hadoop Application Architectures» (O'Reilly) by Mark Grover,
Ted Malaska and others. You can find answers to some of your questions there.
> On 10 Oct 2017, at 9:00, Mahender Sarangam wrote:
>
> Hi,
>
> I'm new to Spark and big data, we
park.logLineage=true \
--conf spark.yarn.historyServer.address=address1 \
--conf spark.eventLog.dir=address2 \
--conf spark.reducer.maxReqsInFlight=10 \
--conf spark.shuffle.io.maxRetries=5 \
--conf spark.network.timeout=240
Input shuffle size: 2.6 TB
Partitions in stage: 20480 and 12768 were completed suc
...@gmail.com wrote:
Hi,
If your join logic is correct, this looks similar to an issue I faced
recently.
Can you try setting this on your SparkConf before creating the context:
new SparkConf().set("spark.driver.allowMultipleContexts", "true")
Regards,
Satish Chandra
On Mon, Aug 24, 2015 at 2:51 PM, Ilya Karpov i.kar...@cleverdata.ru wrote:
Hi, guys
I'm confused about joining columns in SparkSQL and need your advice.
I want to join 2 datasets of profiles. Each profile has a name and an array of
attributes (age, gender, email, etc.).
There can be multiple instances of an attribute with the same name, e.g. a profile
has 2 emails - so 2 attributes