Spark dataset to byte array over grpc

2018-04-23 Thread Ashwin Sai Shankar
Hi! I'm building a Spark app that runs a spark-sql query and sends the results to a client over gRPC (my proto file is configured to send the SQL output as "bytes"). The client then displays the output rows. When I run spark.sql, I get a Dataset. How do I convert this to a byte array? Also is there a
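A minimal Scala sketch of one approach, assuming the result set is small enough to collect to the driver and that newline-delimited JSON is an acceptable wire format (the table name and the proto field are placeholders):

    import java.nio.charset.StandardCharsets
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("grpc-bytes-sketch").getOrCreate()

    // Hypothetical query; substitute the real spark-sql statement.
    val df = spark.sql("SELECT * FROM some_table")

    // toJSON yields a Dataset[String] with one JSON document per row;
    // collect() pulls everything to the driver, so this only suits
    // small result sets.
    val payload: Array[Byte] = df.toJSON
      .collect()
      .mkString("\n")
      .getBytes(StandardCharsets.UTF_8)

    // `payload` can now back the proto's "bytes" field, e.g. with
    // protobuf-java: ByteString.copyFrom(payload)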

Re: Why python cluster mode is not supported in standalone cluster?

2018-02-14 Thread Ashwin Sai Shankar
+dev mailing list (since I didn't get a response from the user DL) On Tue, Feb 13, 2018 at 12:20 PM, Ashwin Sai Shankar <ashan...@netflix.com> wrote: > Hi Spark users! > I noticed that Spark doesn't allow Python apps to run in cluster mode on a > standalone cluster. Does anyone kno

Why python cluster mode is not supported in standalone cluster?

2018-02-13 Thread Ashwin Sai Shankar
Hi Spark users! I noticed that Spark doesn't allow Python apps to run in cluster mode on a standalone cluster. Does anyone know the reason? I checked JIRA but couldn't find anything relevant. Thanks, Ashwin
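For context, this is the shape of submission that gets rejected (host, port, and script name are placeholders); SparkSubmit's validation fails up front with an error along the lines of "Cluster deploy mode is currently not supported for python applications on standalone clusters":

    spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode cluster \
      my_app.py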

Re: Spark shuffle files

2017-03-27 Thread Ashwin Sai Shankar
ter/core/src/ > main/scala/org/apache/spark/ContextCleaner.scala > > On Mon, Mar 27, 2017 at 12:38 PM, Ashwin Sai Shankar < > ashan...@netflix.com.invalid> wrote: > >> Hi! >> >> In spark on yarn, when are shuffle files on local disk removed? (Is it >>

Spark shuffle files

2017-03-27 Thread Ashwin Sai Shankar
Hi! In Spark on YARN, when are shuffle files on local disk removed? (Is it when the app completes, once all the shuffle files are fetched, or at the end of the stage?) Thanks, Ashwin
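The short answer from the reply above: cleanup is driven by the driver-side ContextCleaner, which removes a shuffle's files once the shuffle's dependency object is garbage-collected on the driver; whatever remains is deleted when the executors (or, with the external shuffle service, the application) exit. A minimal Scala sketch that exercises this, with the forced System.gc() purely for illustration:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("shuffle-cleanup-sketch").getOrCreate()
    val sc = spark.sparkContext

    // groupByKey forces a shuffle; its map outputs are written to the
    // local disks YARN assigns to each container.
    var grouped = sc.parallelize(1 to 1000000).map(i => (i % 100, i)).groupByKey()
    grouped.count()

    // Once the last reference to the shuffled RDD is dropped and the
    // driver GCs, the ContextCleaner asynchronously deletes the
    // shuffle files.
    grouped = null
    System.gc()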

Re: Hive error after update from 1.4.1 to 1.5.2

2015-12-16 Thread Ashwin Sai Shankar
Hi Bryan, I see the same issue with 1.5.2; can you please let me know what the resolution was? Thanks, Ashwin On Fri, Nov 20, 2015 at 12:07 PM, Bryan Jeffrey wrote: > Nevermind. I had a library dependency that still had the old Spark version. > > On Fri, Nov 20, 2015 at
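For anyone hitting this later: the root cause in the quoted thread was a build dependency still pinned to the old Spark release. A build.sbt sketch of keeping every Spark artifact on one version (the artifact list is illustrative):

    // Pin all Spark modules to one version so a transitive dependency
    // can't drag in classes from an older release.
    val sparkVersion = "1.5.2"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
      "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
      "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
    )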

Re: Spark on YARN multitenancy

2015-12-15 Thread Ashwin Sai Shankar
We run large multi-tenant clusters with Spark/Hadoop workloads, and we use YARN preemption together with Spark dynamic allocation to achieve multitenancy. See the following link for how to enable and configure preemption with the fair scheduler:
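A minimal sketch of the two halves of that setup (queue name and thresholds are made up; preemption itself is switched on with yarn.scheduler.fair.preemption=true in yarn-site.xml):

    <!-- fair-scheduler.xml: let YARN reclaim containers from queues that
         have been over their fair share for 60 seconds. -->
    <allocations>
      <queue name="adhoc">
        <weight>1.0</weight>
        <fairSharePreemptionTimeout>60</fairSharePreemptionTimeout>
      </queue>
    </allocations>

    # spark-defaults.conf: dynamic allocation needs the external shuffle
    # service so executors can be released without losing shuffle data.
    spark.dynamicAllocation.enabled      true
    spark.shuffle.service.enabled        true
    spark.dynamicAllocation.minExecutors 1
    spark.dynamicAllocation.maxExecutors 200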

Re: How to display column names in spark-sql output

2015-12-11 Thread Ashwin Sai Shankar
Never mind, it's *set hive.cli.print.header=true* Thanks! On Fri, Dec 11, 2015 at 5:16 PM, Ashwin Shankar wrote: > Hi, > When we run spark-sql, is there a way to get column names/headers with the > result? > > -- > Thanks, > Ashwin
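For reference, a sample spark-sql session (table and rows are made up) showing the header appearing once the flag is set:

    spark-sql> set hive.cli.print.header=true;
    spark-sql> SELECT id, name FROM users LIMIT 2;
    id      name
    1       alice
    2       bob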