You can force the data to be loaded as a sparse map, assuming the key/value types are consistent. Here is an example: <https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1023043053387187/1863598192220754/2840265927289860/latest.html>.
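A minimal sketch of the idea in Scala (the field names "id" and "features" and the path are hypothetical; adjust them to your JSON layout): pass an explicit schema whose sparse part is a MapType, so Spark skips the full-pass schema inference that would otherwise turn each of your ~3100 keys into its own column.

    import org.apache.spark.sql.types._

    // Hypothetical layout: each record carries an "id" plus a "features"
    // object holding the sparse key/value pairs.
    val schema = StructType(Seq(
      StructField("id", StringType),
      StructField("features", MapType(StringType, DoubleType))
    ))

    // Supplying the schema up front avoids the inference pass over all
    // rows and keeps the sparse keys inside a single map column.
    val df = sqlContext.read.schema(schema).json("/path/to/data.json")
    df.printSchema()

Whether this is fast enough will depend on how your records are shaped, but collapsing the sparse keys into one map column avoids building a 3100-column schema entirely.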
On Wed, Mar 30, 2016 at 8:17 AM, Yavuz Nuzumlalı <manuya...@gmail.com> wrote:
> Hi all,
>
> I'm trying to read data inside a JSON file using the
> `SQLContext.read.json()` method.
>
> However, the read operation does not finish. My data has dimensions
> 290000x3100, but it's actually really sparse, so if there is a way to
> read the JSON directly into a sparse DataFrame, that would work
> perfectly for me.
>
> What are the alternatives for reading such data into Spark?
>
> P.S.: When I try to load only the first 50000 rows, the read operation
> completes in ~2 minutes.