[ https://issues.apache.org/jira/browse/CARBONDATA-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360993#comment-15360993 ]
ASF GitHub Bot commented on CARBONDATA-35:
------------------------------------------
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/16#discussion_r69421314
--- Diff: integration/spark/src/main/scala/org/carbondata/spark/util/GlobalDictionaryUtil.scala ---
@@ -106,27 +107,36 @@ object GlobalDictionaryUtil extends Logging {
     }
   }
-  def gatherDimensionByEncoding(dimension: CarbonDimension,
+  def gatherDimensionByEncoding(carbonLoadModel: CarbonLoadModel,
+      dimension: CarbonDimension,
       encoding: Encoding,
       excludeEncoding: Encoding,
-      dimensionsWithEncoding: ArrayBuffer[CarbonDimension]) {
+      dimensionsWithEncoding: ArrayBuffer[CarbonDimension],
+      forPreDefDict: Boolean) {
     if (dimension.isComplex()) {
--- End diff --
Please remove the "()" after isComplex.
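The request above presumably follows the common Scala convention that side-effect-free accessors are called without parentheses. A minimal sketch of that style, where Dimension and ParenStyle are hypothetical stand-ins and not classes from the pull request:

```scala
// Hypothetical stand-in for CarbonDimension; only the accessor matters here.
final class Dimension(children: Int) {
  // Declared without parentheses because it has no side effects.
  def isComplex: Boolean = children > 0
}

object ParenStyle {
  // Keep only complex dimensions, calling the accessor without "()".
  def complexOnly(dims: Seq[Dimension]): Seq[Dimension] =
    dims.filter(d => d.isComplex)

  def main(args: Array[String]): Unit = {
    val dims = Seq(new Dimension(0), new Dimension(2))
    println(complexOnly(dims).size)

    // Java-defined no-argument accessors can be called the same way.
    val list = new java.util.ArrayList[String]()
    println(list.isEmpty)
  }
}
```

Applied to the diff, the condition would read "if (dimension.isComplex) {", assuming isComplex is a plain accessor on a Java-defined class.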
> generate global dict using pre-defined dict from external column file
> ---------------------------------------------------------------------
>
> Key: CARBONDATA-35
> URL: https://issues.apache.org/jira/browse/CARBONDATA-35
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Jay
> Priority: Minor
>
> A user can set colName:columnfilePath in the load DML to supply a column
> file containing a small number of distinct values. Carbon can then use these
> distinct values to generate the dictionary and avoid reading the large raw
> CSV file. This is a new feature and can improve load performance.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)