[ https://issues.apache.org/jira/browse/CARBONDATA-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360993#comment-15360993 ]

ASF GitHub Bot commented on CARBONDATA-35:
------------------------------------------

Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/16#discussion_r69421314
  
    --- Diff: integration/spark/src/main/scala/org/carbondata/spark/util/GlobalDictionaryUtil.scala ---
    @@ -106,27 +107,36 @@ object GlobalDictionaryUtil extends Logging {
         }
       }
     
    -  def gatherDimensionByEncoding(dimension: CarbonDimension,
    +  def gatherDimensionByEncoding(carbonLoadModel: CarbonLoadModel,
    +      dimension: CarbonDimension,
           encoding: Encoding,
           excludeEncoding: Encoding,
    -      dimensionsWithEncoding: ArrayBuffer[CarbonDimension]) {
    +      dimensionsWithEncoding: ArrayBuffer[CarbonDimension],
    +      forPreDefDict: Boolean) {
         if (dimension.isComplex()) {
    --- End diff --
    
    Please remove the "()" after isComplex.
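
The request above follows the usual Scala convention of dropping the empty parentheses when calling an accessor that has no side effects. A minimal sketch of the call-site style, using java.util.ArrayList#isEmpty() as a stand-in for CarbonDimension#isComplex(); the ParenStyle object and describe method are hypothetical names for illustration only and are not part of CarbonData:

```scala
import java.util.{ArrayList => JArrayList}

object ParenStyle {
  // isEmpty() is a Java accessor with no side effects, so the call site
  // omits the parentheses: values.isEmpty rather than values.isEmpty().
  // In the reviewed diff the analogous call would read dimension.isComplex.
  def describe(values: JArrayList[String]): String =
    if (values.isEmpty) "empty" else "non-empty"
}
```

Both spellings compile for a Java-defined method; the change is purely stylistic and does not alter behavior.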


> generate global dict using pre-defined dict from external column file
> ---------------------------------------------------------------------
>
>                 Key: CARBONDATA-35
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-35
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Jay
>            Priority: Minor
>
> A user can set colName:columnfilePath in the load DML to point to a column 
> file that provides a small number of distinct values. Carbon can then use 
> these distinct values to generate the dictionary and avoid reading the large 
> raw CSV file. This is a new feature and can improve performance.
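
The idea in the description can be sketched as follows. This is only an illustration under stated assumptions: the option is taken to be a comma-separated list of colName:columnfilePath pairs, each column file is taken to hold one value per line, and surrogate keys are assigned from 1 in first-seen order. The PreDefDictSketch object and its method names are hypothetical and do not reflect the actual CarbonData implementation or its DML syntax.

```scala
import scala.io.Source

object PreDefDictSketch {
  // Parse "col1:/path/a,col2:/path/b" into a map of column name -> file path.
  // The comma separator is an assumption made for this sketch.
  def parseColumnFiles(option: String): Map[String, String] =
    option.split(",").map(_.trim).filter(_.nonEmpty).map { pair =>
      val idx = pair.indexOf(':')
      require(idx > 0 && idx < pair.length - 1, s"expected colName:path, got '$pair'")
      pair.substring(0, idx).trim -> pair.substring(idx + 1).trim
    }.toMap

  // Assign surrogate keys (starting at 1) to distinct, non-empty values
  // in the order they are first seen.
  def buildDictionary(values: Iterator[String]): Map[String, Int] =
    values.map(_.trim).filter(_.nonEmpty).toList.distinct.zipWithIndex
      .map { case (value, index) => value -> (index + 1) }.toMap

  // Read a column file (assumed: one value per line) and build its dictionary.
  // Only this small file is read, not the raw CSV being loaded.
  def dictionaryFromFile(path: String): Map[String, Int] = {
    val source = Source.fromFile(path, "UTF-8")
    try buildDictionary(source.getLines()) finally source.close()
  }
}
```

For example, PreDefDictSketch.parseColumnFiles("country:/tmp/country.csv") yields a single entry mapping country to /tmp/country.csv, and dictionaryFromFile can then be called on that path.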



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
