What will the scenario be in the case of S3 and the local file system?
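To make the question concrete, here is a minimal sketch of what I am comparing (the session setup and paths are placeholders; it assumes the Spark 2.0 file-based sources, where the split size is governed by spark.sql.files.maxPartitionBytes). A streaming Dataset cannot be inspected with .rdd before the query starts, so the sketch uses the batch reader, on the assumption that each micro-batch of a file stream goes through the same file-splitting logic:

import org.apache.spark.sql.SparkSession

// Hypothetical local session, for illustration only.
val spark = SparkSession.builder()
  .appName("partition-check")
  .master("local[*]")
  .getOrCreate()

// Local file system: there are no HDFS blocks, so partitions are carved
// out of the files using spark.sql.files.maxPartitionBytes (128 MB default).
val localDf = spark.read.json("/Users/sachin/testSpark/inputJson")
println(localDf.rdd.getNumPartitions)

// S3 through the s3a connector: objects have no real blocks either; the
// connector merely advertises fs.s3a.block.size as the block size, and the
// same split logic applies.
// val s3Df = spark.read.json("s3a://my-bucket/inputJson") // hypothetical bucket
// println(s3Df.rdd.getNumPartitions)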
On Tue, Jun 21, 2016 at 4:36 PM, Jörn Franke <jornfra...@gmail.com> wrote:

> Based on the underlying Hadoop FileFormat. This one does it mostly based
> on block size. You can change this, though.
>
> On 21 Jun 2016, at 12:19, Sachin Aggarwal <different.sac...@gmail.com> wrote:
>
> When we use readStream to read data as a stream, how does Spark decide
> the number of RDDs, and the partitions within each RDD, with respect to
> storage and file format?
>
> val dsJson = sqlContext.readStream.json("/Users/sachin/testSpark/inputJson")
>
> val dsCsv = sqlContext.readStream.option("header", "true")
>   .csv("/Users/sachin/testSpark/inputCsv")
>
> val ds = sqlContext.readStream.text("/Users/sachin/testSpark/inputText")
> val dsText = ds.as[String]
>   .map(x => (x.split(" ")(0), x.split(" ")(1)))
>   .toDF("name", "age")
>
> val dsParquet = sqlContext.readStream.format("parquet")
>   .parquet("/Users/sachin/testSpark/inputParquet")
>
> --
> Thanks & Regards
> Sachin Aggarwal
> 7760502772

--
Thanks & Regards
Sachin Aggarwal
7760502772
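PS: Jörn, by "you can change this" I assume you mean the split-size settings. A minimal sketch of what I plan to try, continuing the snippet above (assumption: the Spark 2.0 file sources honour spark.sql.files.maxPartitionBytes; for the older HadoopRDD path the analogous Hadoop property would be mapreduce.input.fileinputformat.split.maxsize):

// Target ~16 MB splits instead of the 128 MB default.
spark.conf.set("spark.sql.files.maxPartitionBytes", (16 * 1024 * 1024).toString)

// A CSV file larger than 16 MB should now be read as several partitions.
val csvDf = spark.read.option("header", "true")
  .csv("/Users/sachin/testSpark/inputCsv")
println(csvDf.rdd.getNumPartitions)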