[ 
https://issues.apache.org/jira/browse/PARQUET-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li updated PARQUET-792:
-----------------------
    Comment: was deleted

(was: I see, but it is not possible to make the null values occur at the same 
level for my data. I will try to modify the writer to skip the data and metadata 
for such fields, as if they do not exist. The reader should stay compatible, 
treating these fields as if they were newly added.)

> Skip the storage of repetition level and definition level for all-null column
> -----------------------------------------------------------------------------
>
>                 Key: PARQUET-792
>                 URL: https://issues.apache.org/jira/browse/PARQUET-792
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Li
>            Priority: Minor
>
> I have a very sparse protobuf message in my project, with thousands of fields.
> In practice, most of the fields contain only null values within a given page.
> But the repetition levels and definition levels still take a lot of storage space.
> Can Parquet skip storing the repetition and definition levels for such all-null 
> columns to save storage space?
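
To illustrate why the level at which the nulls occur matters: Parquet encodes repetition and definition levels with a hybrid of run-length encoding and bit packing, so a page whose levels are all identical collapses into a single run, while an all-null page whose nulls occur at different nesting depths does not. The following is a minimal sketch using plain run-length encoding only; it is not parquet-mr's actual encoder, and the function name and the sample level lists are hypothetical.

```python
from itertools import groupby


def run_length_encode(levels):
    """Collapse consecutive equal levels into (level, run_length) pairs."""
    return [(level, sum(1 for _ in run)) for level, run in groupby(levels)]


# Hypothetical page of 8 null values, all null at the same nesting depth,
# so every definition level is identical.
same_level = [0] * 8

# Hypothetical page of 8 null values where a different ancestor is missing
# each time, so the definition levels vary even though every value is null.
mixed_level = [0, 1, 0, 2, 1, 0, 2, 1]

same_runs = run_length_encode(same_level)
mixed_runs = run_length_encode(mixed_level)

print(len(same_runs), "run(s) for same-level nulls")
print(len(mixed_runs), "run(s) for mixed-level nulls")
```

In the first case the levels cost almost nothing to store; in the second, every level has to be recorded, which is the situation described in the deleted comment.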



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
