[
https://issues.apache.org/jira/browse/PARQUET-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabor Szadovszky resolved PARQUET-2072.
---------------------------------------
Resolution: Fixed
> Do Not Determine Both Min/Max for Binary Stats
> ----------------------------------------------
>
> Key: PARQUET-2072
> URL: https://issues.apache.org/jira/browse/PARQUET-2072
> Project: Parquet
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
>
> I'm looking at some benchmarking code comparing Apache ORC vs. Apache
> Parquet and see that Parquet is quite a bit slower for writes (reads TBD).
> Based on my investigation, a significant amount of time is spent
> determining min/max for binary types.
> One quick improvement is to skip the "max" comparison when a value has
> already been determined to be the new "min": since min <= max, such a value
> cannot also be a new "max".
> While I'm at it, remove calls to deprecated functions.
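
The short-circuit described above can be sketched roughly as follows. This is a minimal illustration, not the code from the actual patch: the class name BinaryMinMaxSketch, its methods, and the comparison counter are hypothetical, and unsigned lexicographic byte ordering via Arrays.compareUnsigned (Java 9+) is an assumption standing in for whatever comparator the column type really uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: tracks min/max of binary values, skipping the
// max comparison whenever a value has already become the new min.
class BinaryMinMaxSketch {
    private byte[] min;
    private byte[] max;
    private long comparisons; // bookkeeping only, to show the saved work

    void update(byte[] value) {
        if (min == null) {
            // First value seen is both min and max.
            min = value;
            max = value;
            return;
        }
        comparisons++;
        // Assumed ordering: unsigned lexicographic byte comparison.
        if (Arrays.compareUnsigned(value, min) < 0) {
            min = value;
            return; // value < min <= max, so it cannot be a new max
        }
        comparisons++;
        if (Arrays.compareUnsigned(value, max) > 0) {
            max = value;
        }
    }

    byte[] getMin() { return min; }
    byte[] getMax() { return max; }
    long getComparisons() { return comparisons; }

    public static void main(String[] args) {
        BinaryMinMaxSketch stats = new BinaryMinMaxSketch();
        for (String s : new String[] {"pear", "apple", "zebra", "mango"}) {
            stats.update(s.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("min=" + new String(stats.getMin(), StandardCharsets.UTF_8));
        System.out.println("max=" + new String(stats.getMax(), StandardCharsets.UTF_8));
        System.out.println("comparisons=" + stats.getComparisons());
    }
}
```

The saving is one comparison for every value that lowers the minimum; values that do not lower it still need both checks, so the benefit depends on the data.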