Bucketing

2018-11-12 Thread Sai
Hi all, I am trying to use bucketing and realized it is not allowed on DataFrame write. Is there any workaround for this, or any update on when this functionality will be made available in Spark? Thanks

Re: writing to local files on a worker

2018-11-12 Thread Steve Lewis
I have been looking at Spark-Blast, which calls Blast (a well-known C++ program) in parallel. In my case I have tried to translate the C++ code to Java but am not getting the same results; it is convoluted. I have code that will call the program and read its results; the only real issue is the

question about barrier execution mode in Spark 2.4.0

2018-11-12 Thread Joe
Hello, I was reading the Spark 2.4.0 release docs and I'd like to find out more about barrier execution mode. In particular, I'd like to know what happens when the number of partitions exceeds the number of nodes (which I think is allowed; the Spark tuning doc mentions that). Does Spark guarantee that all

Re: Questions on Python support with Spark

2018-11-12 Thread Patrick McCarthy
I've never tried to run a standalone cluster alongside Hadoop, but why not run Spark as a YARN application? That way it can absolutely (in fact preferably) use the distributed file system.

On Fri, Nov 9, 2018 at 5:04 PM, Arijit Tarafdar wrote:
> Hello All,
>
> We have a requirement to run

Re: FW: Spark2 and Hive metastore

2018-11-12 Thread Sergey B.
For Spark to see the Hive metastore, you need to build the SparkSession accordingly:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("myApp")
      .config("hive.metastore.uris", "thrift://localhost:9083")
      .enableHiveSupport()
      .getOrCreate()

On Mon, Nov 12, 2018 at 11:49

Re: [Spark-Core] Long scheduling delays (1+ hour)

2018-11-12 Thread bsikander
Forgot to add the link https://jira.apache.org/jira/browse/KAFKA-5649