Why do you think the Parquet format will be a better in-memory representation? My guess is that it won't be -- it wasn't really designed for that, and I think a specialized format designed for in-memory use would probably work a lot better.
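For reference, here is a minimal sketch (assuming Spark 1.x in local mode; the table name "events", the sample data, and the output path are hypothetical) contrasting the built-in columnar cache path described in the quoted question below with writing the same data out as Parquet files:

    // Minimal sketch, assuming Spark 1.x running in local mode; the table
    // name "events" and the output path are hypothetical.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object CacheVsParquetSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("cache-vs-parquet-sketch").setMaster("local[*]"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._

        // Hypothetical sample data standing in for the table to be cached.
        val df = sc.parallelize(Seq((1, "a"), (2, "b"), (3, "c"))).toDF("id", "value")
        df.registerTempTable("events")

        // Built-in path: caching the table materializes it in Spark SQL's own
        // in-memory columnar storage, built from InternalRow.
        sqlContext.cacheTable("events")
        sqlContext.sql("SELECT COUNT(*) FROM events").show()

        // Parquet, by contrast, is an on-disk columnar file format; the usual
        // way to use it from Spark SQL is to write and read files.
        df.write.mode("overwrite").parquet("/tmp/events_parquet")
        sqlContext.read.parquet("/tmp/events_parquet").show()

        sc.stop()
      }
    }

The cacheTable call is what produces the in-memory columnar representation the question asks about replacing; the write/read calls show the file-oriented API that Parquet was designed around.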
On Mon, Aug 31, 2015 at 6:05 PM, Wangchangchun (A) <[email protected]> wrote:

> Can anyone help to answer this question for me?
> Thanks!
>
> From: Wangchangchun (A)
> Sent: August 29, 2015, 11:03
> To: '[email protected]'
> Subject: [implement a memory parquet]
>
> Hi, everyone,
> Can somebody help me answer a question?
>
> In Spark SQL, in-memory data is stored in an object model named InternalRow.
> If you cache a table, Spark converts the InternalRow data into its in-memory
> columnar storage. We think that this in-memory columnar storage is not an
> efficient storage format, so we want to try storing the data in memory using
> the Parquet format. That is, we want to implement an in-memory Parquet format
> and store Spark SQL cached tables in it.
>
> Is this feasible? If it is feasible, can someone give me some advice on how
> to implement it? Which Parquet APIs should I use?

--
Alex Levenson
@THISWILLWORK
