[jira] [Commented] (ORC-373) Option to disable dictionary encoding

ASF GitHub Bot (JIRA) Mon, 04 Jun 2018 15:18:10 -0700


    [ 
https://issues.apache.org/jira/browse/ORC-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500944#comment-16500944
 ]


ASF GitHub Bot commented on ORC-373:
------------------------------------

Github user omalley commented on the issue:

    https://github.com/apache/orc/pull/279
  
    Prasanth, rather than adding the flush count, which is really test code in 
the main source, how about creating a StringTreeWriter and capturing the stream 
directly? I'd propose this: 
https://github.com/omalley/orc/blob/8a782d204f071e2bd16bcfd591a54319fadff132/java/core/src/test/org/apache/orc/TestStringDictionary.java#L251


> Option to disable dictionary encoding 
> --------------------------------------
>
>                 Key: ORC-373
>                 URL: https://issues.apache.org/jira/browse/ORC-373
>             Project: ORC
>          Issue Type: Bug
>    Affects Versions: 1.5.2
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>
> Currently the dictionary check happens after the first row group entry is 
> created. Even when row indexes are disabled, rows go into the red-black tree 
> first and are only flushed (into the direct stream) when the stripe is written.
> If the dictionary threshold is set to <= 0.0 to disable dictionary encoding, 
> we should write directly to the stream instead of the RBTree. This is useful 
> for Hive streaming ingest, where delta files explicitly disable dictionaries. 
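
To make the proposed behavior concrete, here is a minimal, self-contained sketch of the decision described above. It is not ORC's code: the class StringColumnSketch and all of its methods are hypothetical, java.util.TreeMap (a red-black tree) stands in for ORC's dictionary tree, and the "distinct keys / rows <= threshold" check at stripe flush is an assumed model of the dictionary check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical model of a string column writer; NOT ORC's StringTreeWriter. */
class StringColumnSketch {
    private final double dictionaryThreshold;
    // Proposed shortcut: a threshold <= 0.0 disables the dictionary up front.
    private final boolean dictionaryEnabled;
    // TreeMap is a red-black tree; it stands in for the writer's dictionary.
    private final TreeMap<String, Integer> dictionary = new TreeMap<>();
    private final List<String> bufferedRows = new ArrayList<>();
    private final List<String> directStream = new ArrayList<>();

    StringColumnSketch(double dictionaryThreshold) {
        this.dictionaryThreshold = dictionaryThreshold;
        this.dictionaryEnabled = dictionaryThreshold > 0.0;
    }

    void write(String value) {
        if (!dictionaryEnabled) {
            // Bypass the tree entirely and write straight to the direct stream.
            directStream.add(value);
            return;
        }
        dictionary.merge(value, 1, Integer::sum);
        bufferedRows.add(value);
    }

    /** Flushes buffered rows; returns true if the stripe stayed dictionary-encoded. */
    boolean flushStripe() {
        if (bufferedRows.isEmpty()) {
            return false;
        }
        // Assumed check: keep the dictionary only if keys are repetitive enough.
        double ratio = (double) dictionary.size() / bufferedRows.size();
        boolean keepDictionary = ratio <= dictionaryThreshold;
        if (!keepDictionary) {
            directStream.addAll(bufferedRows);
        }
        dictionary.clear();
        bufferedRows.clear();
        return keepDictionary;
    }

    int dictionarySize() { return dictionary.size(); }

    int directCount() { return directStream.size(); }

    public static void main(String[] args) {
        StringColumnSketch writer = new StringColumnSketch(0.0);
        writer.write("a");
        writer.write("b");
        System.out.println("tree entries: " + writer.dictionarySize()
            + ", direct rows: " + writer.directCount());
    }
}
```

With a threshold of 0.0 the sketch never touches the tree, which is the saving the issue asks for; with a positive threshold, rows are buffered and the encoding is decided only at stripe flush.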



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
