DataFrameWriter in pyspark ignoring hdfs attributes (using spark-2.2.1-bin-hadoop2.7)?

2018-03-10 Thread Chuan-Heng Hsiao
Hi all, I am using spark-2.2.1-bin-hadoop2.7 in standalone mode (Python 3.5.2 on Ubuntu 16.04). I intended to have a DataFrame write to HDFS with a customized block size but failed. However, the corresponding RDD can successfully write with the customized block size. Could you help me
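Not from the thread, but a commonly suggested workaround for this kind of mismatch: rather than relying on a `DataFrameWriter.option(...)` reaching the underlying Hadoop `OutputFormat` (which varies by file format and Spark version), set the HDFS block size through Spark's `spark.hadoop.*` prefix, which Spark copies into the Hadoop Configuration used by both RDD and DataFrame writes. A sketch, assuming a 64 MB target block size:

```
# In spark-defaults.conf, or as --conf flags to spark-submit:
# any property prefixed with "spark.hadoop." is forwarded into
# the Hadoop Configuration; dfs.blocksize is the HDFS client-side
# block-size property (64 MB = 67108864 bytes here).
spark.hadoop.dfs.blocksize    67108864
```

Whether this resolves the reported DataFrameWriter behavior in spark-2.2.1 specifically is not confirmed by the thread; it is only the usual route for passing HDFS client settings through Spark.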

Re: is there a way to catch exceptions on executor level

2018-03-10 Thread naresh Goud
How about accumulators? Thanks, Naresh www.linkedin.com/in/naresh-dulam http://hadoopandspark.blogspot.com/ On Thu, Mar 8, 2018 at 12:07 AM Chethan Bhawarlal < cbhawar...@collectivei.com> wrote: > Hi Dev, > > I am doing spark operations on RDD level for each row like this, > > private def
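The accumulator suggestion usually takes the form of wrapping the per-row function in try/except and counting failures on the driver instead of letting an exception kill the task. A minimal sketch of that pattern: a plain dict counter stands in for a Spark accumulator here so the example runs without a cluster (with pyspark you would create `sc.accumulator(0)` and call `.add(1)` inside the wrapped function); `make_safe_mapper` is a hypothetical helper name, not part of any Spark API.

```python
# Per-row try/except pattern for surviving executor-level exceptions.
# `failures` imitates a Spark accumulator: executors would add to it,
# and the driver would read its value after the action completes.

def make_safe_mapper(fn, failures):
    """Wrap a row-level function so exceptions are counted
    instead of failing the whole task."""
    def safe(row):
        try:
            return fn(row)
        except Exception:
            failures["count"] += 1  # with Spark: failure_acc.add(1)
            return None             # sentinel, filtered out below
    return safe

failures = {"count": 0}
parse = make_safe_mapper(int, failures)
rows = ["1", "2", "oops", "4"]
results = [r for r in map(parse, rows) if r is not None]
print(results)            # [1, 2, 4]
print(failures["count"])  # 1
```

One caveat worth knowing: Spark only guarantees accumulator exactness inside actions; updates from retried or speculative tasks in transformations can be double-counted.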

Re: what is the right syntax for self joins in Spark 2.3.0 ?

2018-03-10 Thread kant kodali
I will attempt to answer this. Since rightValue1 and rightValue2 have the same key "K" (two matches), why would it ever be the case that *rightValue2* replaces *rightValue1*, which replaces *null*? Moreover, why does the user need to care? The result in this case (after getting two matches) should be
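The semantics being argued for are ordinary left-outer-join semantics: a left row whose key finds two right matches produces one joined row per match, and the null-padded row appears only when there is no match at all. A plain-Python sketch of that reasoning (not Spark code, and not a claim about how Spark 2.3's streaming joins were ultimately implemented):

```python
# Left outer join over (key, value) pairs: one output row per match,
# and a None-padded row only for left keys with no match.

def left_outer_join(left_rows, right_rows):
    out = []
    for lk, lv in left_rows:
        matches = [rv for rk, rv in right_rows if rk == lk]
        if matches:
            out.extend((lk, lv, rv) for rv in matches)
        else:
            out.append((lk, lv, None))
    return out

left = [("K", "leftValue")]
right = [("K", "rightValue1"), ("K", "rightValue2")]
print(left_outer_join(left, right))
# [('K', 'leftValue', 'rightValue1'), ('K', 'leftValue', 'rightValue2')]
```

With two matches for "K", neither output row is null-padded; nothing "replaces" anything.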