Please show the write() call, and the results in HDFS. What are all the files you see?
On Fri, Aug 11, 2017 at 1:10 PM, KhajaAsmath Mohammed <mdkhajaasm...@gmail.com> wrote:

> tempTable = union_df.registerTempTable("tempRaw")
>
> create = hc.sql('CREATE TABLE IF NOT EXISTS blab.pyspark_dpprq (vin string, utctime timestamp, description string, descriptionuom string, providerdesc string, dt_map string, islocation string, latitude double, longitude double, speed double, value string)')
>
> insert = hc.sql('INSERT OVERWRITE TABLE blab.pyspark_dpprq SELECT * FROM tempRaw')
>
> On Fri, Aug 11, 2017 at 11:00 AM, Daniel van der Ende <daniel.vandere...@gmail.com> wrote:
>
>> Hi Asmath,
>>
>> Could you share the code you're running?
>>
>> Daniel
>>
>> On Fri, 11 Aug 2017, 17:53 KhajaAsmath Mohammed, <mdkhajaasm...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am using Spark SQL to write data back to HDFS, and it is resulting in multiple output files.
>>>
>>> I tried setting spark.sql.shuffle.partitions=1, but that resulted in very slow performance.
>>>
>>> I also tried coalesce and repartition, but I still see the same issue. Any suggestions?
>>>
>>> Thanks,
>>>
>>> Asmath
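For what it's worth, a common cause here is that coalesce/repartition gets applied somewhere it has no effect on the final write. A minimal sketch of what might help, assuming `hc` is the HiveContext and `union_df` the DataFrame from the snippet above (the helper name `write_one_file` is hypothetical, just for illustration): coalesce the DataFrame to one partition *before* registering the temp table, so the INSERT sees a single partition and Hive writes a single file.

```python
# Hedged sketch, not a tested fix: assumes `hc` is a HiveContext and
# `union_df` is the DataFrame built earlier in this thread.
def write_one_file(hc, union_df, table="blab.pyspark_dpprq"):
    # coalesce(1) must happen on the DataFrame itself, before registration;
    # one partition at write time means one output file in HDFS.
    single = union_df.coalesce(1)
    single.registerTempTable("tempRaw")
    hc.sql("INSERT OVERWRITE TABLE %s SELECT * FROM tempRaw" % table)
```

The trade-off: coalesce(1) funnels the whole write through a single task, which is fine for small outputs but will be slow for large ones, much like the spark.sql.shuffle.partitions=1 attempt. If the data is large, a small fixed repartition count (say, coalesce(8)) is usually a better compromise than a single file.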