Prasanth Jayachandran created ORC-350:
-----------------------------------------
Summary: Optionally disable/specify indexes for columns
Key: ORC-350
URL: https://issues.apache.org/jira/browse/ORC-350
Project: ORC
Issue Type: Sub-task
Reporter: Prasanth Jayachandran
There are many cases where an entire XML document or a large JSON blob is
stored in a string column. If we automatically generate indexes on those
columns, we often run into issues with protobuf stream explosion. The only
workaround for now is to change the column type from string to binary. It
would be good to have an option to disable indexes on specific columns.
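
To illustrate the idea, here is a rough sketch of a per-column opt-out, written
in Python for brevity rather than ORC's Java. The names build_row_group_index
and skip_index_columns are hypothetical and are not existing ORC APIs; the
sketch only tracks min/max per column and ignores everything else a real row
index would hold.

```python
def build_row_group_index(rows, skip_index_columns=frozenset()):
    """Collect (min, max) per column, skipping opted-out columns.

    `rows` is an iterable of dicts mapping column name -> value.
    `skip_index_columns` is a hypothetical option listing columns
    for which no index entry should be generated.
    """
    index = {}
    for row in rows:
        for name, value in row.items():
            # Opted-out columns and nulls contribute nothing to the index.
            if name in skip_index_columns or value is None:
                continue
            if name not in index:
                index[name] = (value, value)
            else:
                lo, hi = index[name]
                index[name] = (min(lo, value), max(hi, value))
    return index


# Example rows: a small key column and a large XML payload column.
rows = [
    {"key": "b", "payload": "<doc>" + "x" * 100_000 + "</doc>"},
    {"key": "a", "payload": "<doc>" + "y" * 100_000 + "</doc>"},
]
index = build_row_group_index(rows, skip_index_columns={"payload"})
```

With "payload" opted out, the index holds only the small "key" entry, so the
large documents never end up in the index at all.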
Regardless, I think we should have a maximum size limit on string column
statistics. If that limit is exceeded, predicate pushdown (PPD) should handle
it accordingly (by returning YES_NO_NULL).
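
A minimal sketch of that behavior, again in Python rather than ORC's Java. The
StringStats class, the MAX_STAT_LENGTH value, and evaluate_equals are all
assumptions made for illustration, not existing ORC code. The limit is counted
in characters here for simplicity, and the normal path ignores nulls.

```python
MAX_STAT_LENGTH = 1024  # hypothetical limit, in characters


class StringStats:
    """Min/max string statistics that are dropped once a value is too long."""

    def __init__(self, max_len=MAX_STAT_LENGTH):
        self.max_len = max_len
        self.minimum = None
        self.maximum = None
        self.overflowed = False

    def update(self, value):
        if self.overflowed:
            return  # statistics were already discarded
        if len(value) > self.max_len:
            # Limit exceeded: discard min/max instead of storing huge strings.
            self.minimum = None
            self.maximum = None
            self.overflowed = True
            return
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if self.maximum is None or value > self.maximum:
            self.maximum = value


def evaluate_equals(stats, literal):
    """Evaluate `column = literal` against the statistics of one row group."""
    if stats.overflowed or stats.minimum is None:
        # No usable statistics: the row group cannot be ruled in or out.
        return "YES_NO_NULL"
    if literal < stats.minimum or literal > stats.maximum:
        return "NO"  # literal is outside [min, max], row group can be skipped
    return "YES_NO"  # literal is inside the range, row group must be read
```

Returning YES_NO_NULL in the overflow case is the conservative answer: the
reader scans the row group, which costs time but never drops matching rows.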
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)