[
https://issues.apache.org/jira/browse/PARQUET-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904517#comment-16904517
]
ASF GitHub Bot commented on PARQUET-1630:
-----------------------------------------
jbapple commented on pull request #150: PARQUET-1630: Loosen size restrictions
on Bloom filters
URL: https://github.com/apache/parquet-format/pull/150
This patch uses a range reduction trick to map a pseudorandom hash
value to an index within a given range without using the modulo
operator '%', which is often much slower than a multiplication and a
shift.
The oldest reference I know of for this trick is Kenneth A. Ross's IBM
research report from 2006, "Efficient Hash Probes on Modern
Processors", available at
https://domino.research.ibm.com/library/cyberdig.nsf/papers/DF54E3545C82E8A585257222006FD9A2/$File/rc24100.pdf
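For illustration, here is a minimal sketch of the multiply-and-shift form of this range reduction, assuming a uniformly distributed 32-bit hash value. The function name `reduce_to_range` and its parameters are hypothetical and are not taken from the patch; the exact expression used in the spec may differ.

```python
def reduce_to_range(hash32: int, num_buckets: int) -> int:
    """Map a 32-bit value to an index in [0, num_buckets) without '%'.

    Hypothetical helper, not the patch's code. It treats hash32 / 2**32
    as a fraction in [0, 1) and scales that fraction by num_buckets.
    """
    if not 0 <= hash32 < 2**32:
        raise ValueError("hash32 must fit in 32 unsigned bits")
    if not 0 < num_buckets <= 2**32:
        raise ValueError("num_buckets must be in (0, 2**32]")
    # One multiply and one shift replace the division behind '%'.
    # The product fits in 64 bits, so a fixed-width language can do
    # the same thing with a single 64-bit multiplication.
    return (hash32 * num_buckets) >> 32


if __name__ == "__main__":
    for h in (0, 2**31, 2**32 - 1):
        print(h, reduce_to_range(h, 10))
```

Unlike '%', this keeps the high bits of the product rather than the low bits of the input, so nearby hash values tend to land in the same bucket. The result is still always in range, and num_buckets does not need to be a power of two.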
> Resolve Bloom filter spec concerns
> ----------------------------------
>
> Key: PARQUET-1630
> URL: https://issues.apache.org/jira/browse/PARQUET-1630
> Project: Parquet
> Issue Type: Sub-task
> Components: parquet-format
> Reporter: Junjie Chen
> Priority: Trivial
> Labels: pull-request-available
>