To: user@spark.apache.org
Subject: Re: AVRO File size when caching in-memory

It's something like the schema shown below (with several additional
levels/sublevels):

root
 |-- sentAt: long (nullable = true)
 |-- sharing: string (nullable = true)

…and how well it was compressible.
>>> The purpose of these formats is to store data to persistent storage in a
>>> way that's faster to read from, not to reduce cache-memory usage.

>> Maybe others here have more info to share.
>>
>> Regards,
>> Shreya
>>
>> Sent from my Windows 10 phone
Regards,
Shreya

Sent from my Windows 10 phone
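The point quoted above — that formats like Avro optimize on-disk size and scan speed, not cache footprint — can be seen outside Spark: a compressed on-disk format trades CPU for size, so the same records occupy far less space in the file than as live, decompressed data. A minimal sketch in plain Python, with gzip standing in for Avro's deflate codec (record shape and numbers are illustrative, not from the thread):

```python
import gzip
import json

# ~1,000 records shaped like the schema in this thread
# (sentAt: long, sharing: string). Repetitive data compresses well.
records = [{"sentAt": 1479110700 + i, "sharing": "public"} for i in range(1000)]

raw = json.dumps(records).encode("utf-8")  # serialized, uncompressed bytes
packed = gzip.compress(raw)                # roughly what sits on disk

print(f"uncompressed: {len(raw)} B, compressed: {len(packed)} B")
```

On data this repetitive the compressed form is a small fraction of the raw bytes, and the deserialized objects a cache holds are larger still.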

From: Prithish <prith...@gmail.com>
Sent: Tuesday, November 15, 2016 11:04 PM
To: Shreya Agarwal <shrey...@microsoft.com>
Subject: Re: AVRO File size when caching in-memory

I did another test and am noting my observations here. The
Anyone?
On Tue, Nov 15, 2016 at 10:45 AM, Prithish wrote:
> I am using 2.0.1 and databricks avro library 3.0.1. I am running this on
> the latest AWS EMR release.
>
> On Mon, Nov 14, 2016 at 3:06 PM, Jörn Franke wrote:
>
>> spark version? Are you using tungsten?
>>
>>> On 14 Nov 2016, at 10:05, Prithish wrote:
>>>
>>> Can someone please explain why this happens?
>>>
>>> When I read a 600kb AVRO file and cache this in memory (using cacheTable), it
>>> shows up as 11mb (storage tab in Spark UI). I have tried
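A rough intuition for the 600kb-to-11mb jump described in the original question: the Avro file is compressed and schema-encoded, while a cache of deserialized rows pays per-object overhead (headers, references, boxed fields). The same effect is visible in plain Python; the sizes below are illustrative and not Spark's actual accounting:

```python
import pickle
import sys

# 1,000 records shaped like the schema earlier in the thread.
records = [{"sentAt": 1479110700 + i, "sharing": "public"} for i in range(1000)]

# Compact serialized form, loosely analogous to the on-disk encoding.
serialized = pickle.dumps(records)

# A lower bound on the live, deserialized footprint: the list itself plus
# each dict and its boxed values (shared key strings are not even counted).
live = sys.getsizeof(records) + sum(
    sys.getsizeof(r) + sys.getsizeof(r["sentAt"]) + sys.getsizeof(r["sharing"])
    for r in records
)

print(f"serialized: {len(serialized)} B, live objects: at least {live} B")
```

In Spark itself the usual mitigations are a serialized storage level (e.g. `StorageLevel.MEMORY_ONLY_SER` for RDDs) and, for SQL caching, leaving `spark.sql.inMemoryColumnarStorage.compressed` at its default of `true`; how much either helps depends on the schema, as discussed above.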