Re: Need help

2017-10-10 Thread Ilya Karpov
I suggest reading «Hadoop Application Architectures» (O'Reilly) by Mark Grover, Ted Malaska and others. There you can find some answers to your questions. > On 10 Oct 2017, at 9:00, Mahender Sarangam > wrote: > > Hi, > > I'm new to Spark and big data, we

Massive fetch fails, io errors in TransportRequestHandler

2017-09-28 Thread Ilya Karpov
park.logLineage=true \ --conf spark.yarn.historyServer.address=address1 --conf spark.eventLog.dir=address2 \ --conf spark.reducer.maxReqsInFlight=10 --conf spark.shuffle.io.maxRetries=5 --conf spark.network.timeout=240 Input shuffle size: 2.6 TB Partitions in stage: 20480 and 12768 were completed suc

Re: Joining using mulitimap or array

2015-08-24 Thread Ilya Karpov
...@gmail.com wrote: Hi, If your join logic is correct, it seems to be a similar issue to one I faced recently. Can you try SparkContext(conf).set(spark.driver.allowMultipleContexts,true) Regards, Satish Chandra On Mon, Aug 24, 2015 at 2:51 PM, Ilya Karpov i.kar...@cleverdata.ru

Joining using mulitimap or array

2015-08-24 Thread Ilya Karpov
Hi guys, I'm confused about joining columns in SparkSQL and need your advice. I want to join 2 datasets of profiles. Each profile has a name and an array of attributes (age, gender, email etc). There can be multiple instances of an attribute with the same name, e.g. a profile has 2 emails - so 2 attributes