The schema has a StructType.

Justin

On Tue, Apr 7, 2015 at 6:58 PM, Yin Huai <yh...@databricks.com> wrote:

> Hi Justin,
>
> Does the schema of your data have any decimal, array, map, or struct type?
>
> Thanks,
>
> Yin
>
> On Tue, Apr 7, 2015 at 6:31 PM, Justin Yip <yipjus...@prediction.io>
> wrote:
>
>> Hello,
>>
>> I have a parquet file of around 55M rows (~1 GB on disk). Performing a
>> simple grouping operation is pretty efficient (I get results within 10
>> seconds). However, after calling DataFrame.cache, I observe a significant
>> performance degradation: the same operation now takes 3+ minutes.
>>
>> My hunch is that DataFrame cannot leverage its columnar format after
>> persisting in memory, but I cannot find anything in the docs that mentions this.
>>
>> Did I miss anything?
>>
>> Thanks!
>>
>> Justin
>>
>
>
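
For anyone who wants to reproduce the comparison described in the original message, here is a minimal sketch, assuming the Spark 1.3 Scala API and a job launched through spark-submit. The file path, the grouping column name "key", and the object and helper names are hypothetical, not taken from this thread. Note that cache() is lazy, so the first action after it also pays for building the in-memory cache; the sketch therefore times a second run as well.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical driver program; names below are illustrative only.
object CacheTiming {

  // Small helper: runs `body`, prints the elapsed wall-clock time, returns the result.
  def time[A](label: String)(body: => A): A = {
    val start = System.nanoTime()
    val result = body
    println(f"$label: ${(System.nanoTime() - start) / 1e9}%.1f s")
    result
  }

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("cache-timing"))
    val sqlContext = new SQLContext(sc)

    // Assumption: placeholder path; replace with the real parquet location.
    val df = sqlContext.parquetFile("/path/to/data.parquet")

    // Shows whether the schema contains decimal, array, map, or struct columns.
    df.printSchema()

    // Assumption: "key" stands in for whatever column is being grouped on.
    time("groupBy without cache") { df.groupBy("key").count().collect() }

    df.cache() // lazy: nothing is materialized yet

    // This run reads the parquet file AND builds the in-memory cache.
    time("first groupBy after cache()") { df.groupBy("key").count().collect() }

    // This run should be served from the cached data.
    time("second groupBy after cache()") { df.groupBy("key").count().collect() }

    sc.stop()
  }
}
```

Comparing the three timings separates the one-off cost of populating the cache from the cost of querying data that is already cached.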
