asfimport opened a new issue, #404:
URL: https://github.com/apache/parquet-format/issues/404

   The spec for DICTIONARY_ENCODING states that:
   
   > If the dictionary grows too big, whether in size or number of distinct values, the encoding will fall back to the plain encoding.
   
   
   
https://github.com/apache/parquet-format/blob/master/Encodings.md#dictionary-encoding-plain_dictionary--2-and-rle_dictionary--8
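
To make the quoted sentence concrete, here is a minimal sketch of the fallback as the spec text describes it. It is not taken from parquet-mr or any other implementation. The function name `encode_pages`, the `page_size` and `max_dict_entries` parameters, and the choice to stay on plain encoding for all later pages once the limit is hit are assumptions made for illustration. Only the entry-count limit is modelled, not the byte-size limit, and the index pages are left as plain lists rather than RLE/bit-packed data.

```python
from typing import Hashable, List, Sequence, Tuple

Page = Tuple[str, list]


def encode_pages(
    values: Sequence[Hashable],
    page_size: int,
    max_dict_entries: int,
) -> Tuple[List[Hashable], List[Page]]:
    """Hypothetical helper: dictionary-encode pages until the dictionary
    would exceed max_dict_entries, then emit plain pages instead."""
    dictionary = {}          # value -> dictionary index
    pages: List[Page] = []
    dict_enabled = True      # assumption: once disabled, it stays disabled

    for start in range(0, len(values), page_size):
        page = values[start:start + page_size]

        if dict_enabled:
            # Distinct values in this page that are not in the dictionary yet.
            new_values = [v for v in dict.fromkeys(page) if v not in dictionary]
            if len(dictionary) + len(new_values) > max_dict_entries:
                # The dictionary would grow too big: fall back to plain.
                dict_enabled = False

        if dict_enabled:
            for v in new_values:
                dictionary[v] = len(dictionary)
            pages.append(("RLE_DICTIONARY", [dictionary[v] for v in page]))
        else:
            pages.append(("PLAIN", list(page)))

    return list(dictionary), pages
```

In this sketch the pages written before the limit is reached keep their dictionary indices, and only later pages carry the raw values.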
   
   However, the parquet-mr implementation was deliberately changed to use a different fallback mechanism in https://issues.apache.org/jira/browse/PARQUET-52.
   
   I'm assuming the parquet-mr implementation is authoritative here, in which case the spec is incorrect and should be fixed to reflect the expected behavior.
   
   
   **Reporter**: [Antoine Pitrou](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=apitrou) / @pitrou
   
   <sub>**Note**: *This issue was originally created as [PARQUET-2221](https://issues.apache.org/jira/browse/PARQUET-2221). Please see the [migration documentation](https://issues.apache.org/jira/browse/PARQUET-2502) for further details.*</sub>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

