Re: Launching a Spark application in a subset of machines

2017-02-07 Thread Muhammad Asif Abbasi
YARN provides the concept of node labels. You should explore the "spark.yarn.executor.nodeLabelExpression" property.
Cheers,
Asif Abbasi
On Tue, 7 Feb 2017 at 10:21, Alvaro Brandon wrote:
> Hello all:
>
> I have the following scenario.
> - I have a cluster of 50 machines with Hadoop and Spa
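As a rough illustration of the node-label suggestion in this reply, the sketch below builds the `--conf` arguments one might pass to `spark-submit`. The two property names are the standard Spark-on-YARN ones; the helper names and the label value `spark_nodes` are hypothetical, and the sketch assumes the label has already been defined in YARN and assigned to the target machines.

```python
# Spark-on-YARN properties that restrict where containers are scheduled.
EXECUTOR_LABEL_KEY = "spark.yarn.executor.nodeLabelExpression"
AM_LABEL_KEY = "spark.yarn.am.nodeLabelExpression"


def node_label_conf(executor_label, am_label=None):
    """Return Spark properties pinning executors (and optionally the
    application master) to YARN nodes carrying the given label."""
    conf = {EXECUTOR_LABEL_KEY: executor_label}
    if am_label is not None:
        conf[AM_LABEL_KEY] = am_label
    return conf


def to_submit_args(conf):
    """Turn a property dict into spark-submit style '--conf key=value' arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", "{}={}".format(key, value)]
    return args


# 'spark_nodes' is a hypothetical label name used only for illustration.
submit_args = to_submit_args(node_label_conf("spark_nodes"))
```

The resulting list can be appended to a `spark-submit --master yarn` command line; the same keys could instead be set on a `SparkConf` before the context is created.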

Re: Reading a TSV file

2016-09-10 Thread Muhammad Asif Abbasi
o when the term spark-csv is used is it a reference to databricks stuff?
>>>> >
>>>> > val df = spark.read.format("com.databricks.spark.csv").option("inferSchema", "true").option(

Re: Reading a TSV file

2016-09-10 Thread Muhammad Asif Abbasi
2:37, Jacek Laskowski wrote:
> >>
> >> Hi,
> >>
> >> If Spark 2.0 supports a format, use it. For CSV it's csv() or
> >> format("csv"). It should be supported by Scala and Java. If the API's
> >> broken for Java (but works for Scala),

Reading a TSV file

2016-09-10 Thread Muhammad Asif Abbasi
Hi,
I would like to know the most efficient way of reading TSV files in Scala, Python and Java with Spark 2.0. I believe that in Spark 2.0 CSV is a native source based on the spark-csv module, and we can potentially read a "tsv" file by specifying:
1. option("delimiter", "\t") in Scala
2. sep declarat
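The delimiter-based approach raised in this question can be sketched in PySpark as follows. This is a minimal sketch, assuming Spark 2.0's built-in CSV source and an existing `SparkSession` passed in as `spark`; the helper name `read_tsv` and its parameters are hypothetical, not part of Spark.

```python
def read_tsv(spark, path, header=True, infer_schema=False):
    """Read a tab-separated file through Spark's built-in CSV source.

    `spark` is expected to be a SparkSession; only its DataFrameReader
    (`spark.read`) is used here.
    """
    return (
        spark.read
        .option("sep", "\t")  # "delimiter" is accepted as an alias for "sep"
        .option("header", str(header).lower())
        .option("inferSchema", str(infer_schema).lower())
        .csv(path)
    )
```

With an active session this would be called as `read_tsv(spark, "data.tsv")`, where the path is a placeholder. Schema inference is left off by default in this sketch because it requires an extra pass over the data; supplying an explicit schema is usually cheaper for large files.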