tempTable = union_df.registerTempTable("tempRaw")
create = hc.sql('CREATE TABLE IF NOT EXISTS blab.pyspark_dpprq (vin string, utctime timestamp, description string, descriptionuom string, providerdesc string, dt_map string, islocation string, latitude double, longitude double, speed double, value string)')
insert = hc.sql('INSERT OVERWRITE TABLE blab.pyspark_dpprq SELECT * FROM tempRaw')

On Fri, Aug 11, 2017 at 11:00 AM, Daniel van der Ende <daniel.vandere...@gmail.com> wrote:

> Hi Asmath,
>
> Could you share the code you're running?
>
> Daniel
>
> On Fri, 11 Aug 2017, 17:53 KhajaAsmath Mohammed, <mdkhajaasm...@gmail.com> wrote:
>
>> Hi,
>>
>> I am using Spark SQL to write data back to HDFS and it is resulting in
>> multiple output files.
>>
>> I tried setting spark.sql.shuffle.partitions=1 but it resulted in very
>> slow performance.
>>
>> I also tried coalesce and repartition, but still have the same issue.
>> Any suggestions?
>>
>> Thanks,
>> Asmath
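
On the multiple-output-files question in the quoted thread, here is a minimal sketch of one thing that may be worth checking. coalesce returns a new DataFrame rather than changing the existing one, so the coalesced result has to be the DataFrame that gets registered as the temp table. The helper name insert_as_few_files and its parameters are hypothetical, not part of the code above. It assumes df is a Spark DataFrame such as union_df and hc is the HiveContext. Whether this gives a single file also depends on the query having no shuffle after the coalesce, which should hold for a plain SELECT *.

```python
def insert_as_few_files(df, hc, table, temp_name="tempRaw", num_files=1):
    """Coalesce df, register the coalesced result, then insert it into table.

    Hypothetical helper: df is assumed to be a Spark DataFrame and hc a
    HiveContext, as in the snippet above.
    """
    # coalesce does not modify df; it returns a new DataFrame, so the
    # returned object is the one that must be registered.
    compact_df = df.coalesce(num_files)
    compact_df.registerTempTable(temp_name)
    # A plain SELECT * adds no shuffle, so the number of written files is
    # expected to follow the number of partitions of compact_df.
    query = "INSERT OVERWRITE TABLE {0} SELECT * FROM {1}".format(table, temp_name)
    return hc.sql(query)
```

With the names from the code above, the call would be insert_as_few_files(union_df, hc, "blab.pyspark_dpprq"). Coalescing to one partition funnels the final write through a single task, so it can be slow for large data; a small num_files greater than 1 is a possible compromise.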