Re: physical memory usage keep increasing for spark app on Yarn

2017-02-15 Thread Yang Cao
").drop("mm").drop("mode").drop("y").drop("m").drop("d") > dropDF.cache() > or > dropDF.write.mode(SaveMode.ErrorIfExists).parquet(temppath) > val dropDF = spark.read.parquet(temppath) > and then > dropDF.repartition(1).write.mode
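The reply above suggests truncating the query lineage by materializing the intermediate DataFrame to Parquet and reading it back, instead of (or in addition to) `cache()`. A minimal sketch of that idea, with hypothetical paths and a Spark 2.x session (the thread mentions both 1.6.3 and 2.1.0):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().appName("lineage-truncation").getOrCreate()

// Hypothetical input path and column names, following the snippet above.
val dropDF = spark.read.parquet("/input/path")
  .drop("mm").drop("mode").drop("y").drop("m").drop("d")

// Assumption: a scratch directory on HDFS for the intermediate result.
val temppath = "/tmp/lineage-checkpoint"
dropDF.write.mode(SaveMode.ErrorIfExists).parquet(temppath)

// Re-reading starts a fresh lineage at the Parquet scan, so the long
// upstream plan (and its cached state) can be released.
val checkpointed = spark.read.parquet(temppath)
checkpointed.repartition(1).write.mode(SaveMode.Overwrite).parquet("/output/path")
```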

Re: physical memory usage keep increasing for spark app on Yarn

2017-01-22 Thread Yang Cao
rtitions to 1, try to set bigger value, may be it solve this > problem. > > Cheers, > Pavel > > On Fri, Jan 20, 2017 at 12:35 PM Yang Cao wrote: > Hi all, > > I am running a spark application on YARN-client mode with 6 executors (e

Re: physical memory usage keep increasing for spark app on Yarn

2017-01-22 Thread Yang Cao
> On Fri, Jan 20, 2017 at 12:35 PM Yang Cao wrote: > Hi all, > > I am running a spark application on YARN-client mode with 6 executors (each 4 > cores and executor memory = 6G and Overhead = 4G, spark version: 1.6.3 / > 2.1.0). I find that

physical memory usage keep increasing for spark app on Yarn

2017-01-20 Thread Yang Cao
Hi all, I am running a Spark application in YARN-client mode with 6 executors (each with 4 cores, executor memory = 6G, and overhead = 4G; Spark version: 1.6.3 / 2.1.0). I find that my executor memory keeps increasing until the container gets killed by the node manager, which gives out info telling me to boost
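On YARN, the node manager kills a container once heap plus off-heap usage exceeds executor memory plus the memory overhead, so raising the overhead is the usual first step. A sketch matching the reported setup, using the Spark 1.6/2.1 configuration names (in Spark 2.3+ the overhead key was renamed to `spark.executor.memoryOverhead`):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("yarn-memory-example")
  .set("spark.executor.instances", "6")
  .set("spark.executor.cores", "4")
  .set("spark.executor.memory", "6g")
  // Value is in MiB; the default is max(384, 0.10 * executorMemory).
  .set("spark.yarn.executor.memoryOverhead", "4096")

val sc = new SparkContext(conf)
```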

filter push down on har file

2017-01-16 Thread Yang Cao
Hi, my team just did an archive of last year’s Parquet files. I wonder whether the filter push-down optimization still works when I read data through “har:///path/to/data/“? THX. Best,
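One way to answer this empirically is to read through the `har://` scheme and inspect the physical plan. A sketch with hypothetical paths and a hypothetical `year` column:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("har-pushdown-check").getOrCreate()

// Read Parquet through the Hadoop Archive filesystem scheme.
val df = spark.read.parquet("har:///path/to/archive.har/data")
val filtered = df.filter(df("year") === 2016)

// If push-down survives the har layer, the Parquet scan node in the
// physical plan lists the predicate under PushedFilters: [...]
filtered.explain(true)
```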

Re: Kryo On Spark 1.6.0

2017-01-10 Thread Yang Cao
If you don’t mind, could you please share the Scala solution with me? I tried to use Kryo but it seemed not to work at all. I hope to get a practical example. THX > On January 10, 2017, at 19:10, Enrico DUrso wrote: > > Hi, > > I am trying to use Kryo on Spark 1.6.0. > I am able to register my own classes a
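A minimal Kryo setup for Spark 1.6, with a hypothetical `Point` class standing in for the user's own classes:

```scala
import org.apache.spark.{SparkConf, SparkContext}

case class Point(x: Double, y: Double)

val conf = new SparkConf()
  .setAppName("kryo-example")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Fail fast on unregistered classes, so it is obvious whether Kryo
  // is actually being used rather than silently falling back.
  .set("spark.kryo.registrationRequired", "true")
conf.registerKryoClasses(Array(classOf[Point]))

val sc = new SparkContext(conf)
val rdd = sc.parallelize(Seq(Point(1, 2), Point(3, 4)))
println(rdd.map(p => p.x + p.y).sum())
```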

Do you use spark 2.0 in work?

2016-10-31 Thread Yang Cao
Hi guys, just out of personal interest: is Spark 2.0 a production-ready version? Does any company use this version as its main version in daily work? THX

Re: java.net.UnknownHostException

2016-08-02 Thread Yang Cao
Actually, I just ran into the same problem. If you can share some code around the error, I can figure out whether I can help you. And is "s001.bigdata” the name of your name node? > On August 2, 2016, at 17:22, pseudo oduesp wrote: > > someone can help me please > > 2016-08-01 11:51

create external table from partitioned avro file

2016-07-28 Thread Yang Cao
Hi, I am using Spark 1.6 and I hope to create a Hive external table based on one partitioned Avro file. Currently, I can’t find any built-in API to do this. I tried write.format().saveAsTable with format com.databricks.spark.avro; it returned an error that it can’t find a Hive SerDe for this. Als
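Since `saveAsTable` with `com.databricks.spark.avro` fails to resolve a Hive SerDe on Spark 1.6, one workaround is to issue the Hive DDL directly and point an external table at the existing Avro directory. Table name, columns, partition key, and path below are all hypothetical:

```scala
import org.apache.spark.sql.hive.HiveContext

val hc = new HiveContext(sc)  // assumes an existing SparkContext `sc`

hc.sql("""
  CREATE EXTERNAL TABLE IF NOT EXISTS events (id BIGINT, payload STRING)
  PARTITIONED BY (dt STRING)
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
  STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
  LOCATION '/path/to/avro/data'
""")

// Register the existing partition directories with the metastore.
hc.sql("MSCK REPAIR TABLE events")
```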

get hdfs file path in spark

2016-07-25 Thread Yang Cao
Hi, as a newcomer here, I hope to get assistance from you guys. I wonder whether there is an elegant way to list directories under a given path. For example, I have a path on HDFS like /a/b/c/d/e/f, and I am given /a/b/c; is there any straightforward way to get the path /a/b/c/d/e? I think I can do it
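A sketch using the Hadoop FileSystem API: given /a/b/c, list its child directories (e.g. /a/b/c/d), and apply the helper again at each level to walk down to /a/b/c/d/e. Assumes a Hadoop configuration is available, for example from an existing SparkContext `sc`:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(sc.hadoopConfiguration)

// Return only the directory children of a path.
def childDirs(p: String): Seq[Path] =
  fs.listStatus(new Path(p)).filter(_.isDirectory).map(_.getPath).toSeq

childDirs("/a/b/c").foreach(println)
```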