[GitHub] carbondata pull request #1831: [CARBONDATA-1993] Carbon properties default v...
Github user asfgit closed the pull request at:

    https://github.com/apache/carbondata/pull/1831

---
Github user mohammadshahidkhan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165810447

    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -1264,18 +1231,7 @@
       /**
        * default property of unsafe processing
        */
    -  public static final String ENABLE_UNSAFE_IN_QUERY_EXECUTION_DEFAULTVALUE = "false";
    -
    -  /**
    -   * property for offheap based processing
    -   */
    -  @CarbonProperty
    -  public static final String USE_OFFHEAP_IN_QUERY_PROCSSING = "use.offheap.in.query.processing";
    -
    -  /**
    -   * default value of offheap based processing
    -   */
    -  public static final String USE_OFFHEAP_IN_QUERY_PROCSSING_DEFAULT = "true";
    +  public static final String ENABLE_UNSAFE_IN_QUERY_EXECUTION_DEFAULTVALUE = "true";
    --- End diff --

    Change default value for "enable.unsafe.columnpage" to true

---
Github user KanakaKumar commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165809183

    --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -1264,18 +1231,7 @@
       /**
        * default property of unsafe processing
        */
    -  public static final String ENABLE_UNSAFE_IN_QUERY_EXECUTION_DEFAULTVALUE = "false";
    -
    -  /**
    -   * property for offheap based processing
    -   */
    -  @CarbonProperty
    -  public static final String USE_OFFHEAP_IN_QUERY_PROCSSING = "use.offheap.in.query.processing";
    -
    -  /**
    -   * default value of offheap based processing
    -   */
    -  public static final String USE_OFFHEAP_IN_QUERY_PROCSSING_DEFAULT = "true";
    +  public static final String ENABLE_UNSAFE_IN_QUERY_EXECUTION_DEFAULTVALUE = "true";
    --- End diff --

    Please make ENABLE_UNSAFE_COLUMN_PAGE_LOADING = "enable.unsafe.columnpage" also "true" by default, as it's the common configuration for query as well.

---
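The effect of flipping such a DEFAULTVALUE constant can be sketched with a minimal property lookup. This is not the real CarbonProperties class; the class, helper, and property key below are illustrative, showing only how the default constant takes over when the user has not set the property:

```java
import java.util.Properties;

// Minimal sketch (not the actual CarbonProperties implementation) of how a
// default-value constant drives behaviour for an unset property.
public class UnsafeDefaultSketch {
    // Mirrors the change under review: the default flips from "false" to "true".
    static final String ENABLE_UNSAFE_IN_QUERY_EXECUTION_DEFAULTVALUE = "true";

    // Hypothetical key name for illustration only.
    static final String KEY = "enable.unsafe.in.query.processing";

    static boolean unsafeEnabled(Properties userProps) {
        // Fall back to the default constant when the property is unset.
        return Boolean.parseBoolean(
            userProps.getProperty(KEY, ENABLE_UNSAFE_IN_QUERY_EXECUTION_DEFAULTVALUE));
    }

    public static void main(String[] args) {
        // Unset: the new default applies, so unsafe processing is on.
        System.out.println(unsafeEnabled(new Properties()));

        // An explicit user setting still overrides the default.
        Properties overridden = new Properties();
        overridden.setProperty(KEY, "false");
        System.out.println(unsafeEnabled(overridden));
    }
}
```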
Github user mohammadshahidkhan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165743860

    --- Diff: conf/carbon.properties.template ---
    @@ -17,29 +17,25 @@
     # System Configuration
     ##
    -#Mandatory. Carbon Store path
    -carbon.storelocation=hdfs://hacluster/Opt/CarbonStore
    +#Optional. Carbon Store path
    --- End diff --

    Added.

---
Github user mohammadshahidkhan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165743651

    --- Diff: conf/carbon.properties.template ---
    @@ -76,22 +72,16 @@ carbon.enable.quick.filter=false
     #carbon.block.meta.size.reserved.percentage=10
     ##csv reading buffer size.
     #carbon.csv.read.buffersize.byte=1048576
    -##To identify and apply compression for non-high cardinality columns
    -#high.cardinality.value=10
     ##maximum no of threads used for reading intermediate files for final merging.
     #carbon.merge.sort.reader.thread=3
     ##Carbon blocklet size. Note: this configuration cannot be change once store is generated
     #carbon.blocklet.size=12
    -##number of retries to get the metadata lock for loading data to table
    -#carbon.load.metadata.lock.retries=3
     ##Minimum blocklets needed for distribution.
     #carbon.blockletdistribution.min.blocklet.size=10
     ##Interval between the retries to get the lock
     #carbon.load.metadata.lock.retry.timeout.sec=5
     ##Temporary store location, By default it will take System.getProperty("java.io.tmpdir")
    -#carbon.tempstore.location=/opt/Carbon/TempStoreLoc
    -##data loading records count logger
    -#carbon.load.log.counter=50
    +#carbon.tempstore.location
    --- End diff --

    This was used in CarbonAlterTableCompactionCommand, but I think the java tmp dir can be used there as well, so the property and its usage have been removed.

---
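The fallback described in the comment above can be sketched as follows. The class and helper names are hypothetical; the sketch only illustrates resolving a configured location against `System.getProperty("java.io.tmpdir")`:

```java
import java.util.Properties;

// Illustrative sketch: resolve the temp store location, falling back to the
// JVM temp directory when carbon.tempstore.location is not configured.
public class TempStoreFallback {
    static String tempStoreLocation(Properties props) {
        String configured = props.getProperty("carbon.tempstore.location");
        return configured != null ? configured : System.getProperty("java.io.tmpdir");
    }

    public static void main(String[] args) {
        // With nothing configured, the JVM temp dir is used.
        System.out.println(tempStoreLocation(new Properties()));
    }
}
```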
Github user mohammadshahidkhan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165743745

    --- Diff: docs/configuration-parameters.md ---
    @@ -32,10 +32,10 @@ This section provides the details of all the configurations required for the Car
     | Property | Default Value | Description |
     |----------|---------------|-------------|
    -| carbon.storelocation | /user/hive/warehouse/carbon.store | Location where CarbonData will create the store, and write the data in its own format. NOTE: Store location should be in HDFS. |
    -| carbon.ddl.base.hdfs.url | hdfs://hacluster/opt/data | This property is used to configure the HDFS relative path, the path configured in carbon.ddl.base.hdfs.url will be appended to the HDFS path configured in fs.defaultFS. If this path is configured, then user need not pass the complete path while dataload. For example: If absolute path of the csv file is hdfs://10.18.101.155:54310/data/cnbc/2016/xyz.csv, the path "hdfs://10.18.101.155:54310" will come from property fs.defaultFS and user can configure the /data/cnbc/ as carbon.ddl.base.hdfs.url. Now while dataload user can specify the csv path as /2016/xyz.csv. |
    -| carbon.badRecords.location | /opt/Carbon/Spark/badrecords | Path where the bad records are stored. |
    -| carbon.data.file.version | 3 | If this parameter value is set to 1, then CarbonData will support the data load which is in old format(0.x version). If the value is set to 2(1.x onwards version), then CarbonData will support the data load of new format only. The default value for this parameter is 3(latest version is set as default version). It improves the query performance by ~20% to 50%. For configuring V3 format explicitly, add carbon.data.file.version = V3 in carbon.properties file. |
    +| carbon.storelocation | | Location where CarbonData will create the store, and write the data in its own format. NOTE: Store location should be in HDFS. |
    --- End diff --

    Added.

---
Github user mohammadshahidkhan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165742856

    --- Diff: conf/carbon.properties.template ---
    @@ -110,7 +100,7 @@ carbon.enable.quick.filter=false
     ##Percentage to identify whether column cardinality is more than configured percent of total row count
     #high.cardinality.row.count.percentage=80
    --- End diff --

    Not used; removed.

---
Github user mohammadshahidkhan commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165741342

    --- Diff: conf/carbon.properties.template ---
    @@ -76,22 +72,16 @@ carbon.enable.quick.filter=false
     #carbon.block.meta.size.reserved.percentage=10
     ##csv reading buffer size.
     #carbon.csv.read.buffersize.byte=1048576
    -##To identify and apply compression for non-high cardinality columns
    -#high.cardinality.value=10
     ##maximum no of threads used for reading intermediate files for final merging.
     #carbon.merge.sort.reader.thread=3
     ##Carbon blocklet size. Note: this configuration cannot be change once store is generated
     #carbon.blocklet.size=12
    -##number of retries to get the metadata lock for loading data to table
    -#carbon.load.metadata.lock.retries=3
     ##Minimum blocklets needed for distribution.
     #carbon.blockletdistribution.min.blocklet.size=10
     ##Interval between the retries to get the lock
     #carbon.load.metadata.lock.retry.timeout.sec=5
     ##Temporary store location, By default it will take System.getProperty("java.io.tmpdir")
    -#carbon.tempstore.location=/opt/Carbon/TempStoreLoc
    -##data loading records count logger
    -#carbon.load.log.counter=50
    +#carbon.tempstore.location
     ##To dissable/enable carbon block distribution
     #carbon.custom.block.distribution=false
    --- End diff --

    The property is still in use:

        val useCustomDistribution = CarbonProperties.getInstance().getProperty(
            CarbonCommonConstants.CARBON_CUSTOM_BLOCK_DISTRIBUTION, "false").toBoolean ||
          carbonDistribution.equalsIgnoreCase(CarbonCommonConstants.CARBON_TASK_DISTRIBUTION_CUSTOM)
        if (useCustomDistribution)

---
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165545325

    --- Diff: docs/configuration-parameters.md ---
    @@ -32,10 +32,10 @@ This section provides the details of all the configurations required for the Car
     | Property | Default Value | Description |
     |----------|---------------|-------------|
    -| carbon.storelocation | /user/hive/warehouse/carbon.store | Location where CarbonData will create the store, and write the data in its own format. NOTE: Store location should be in HDFS. |
    -| carbon.ddl.base.hdfs.url | hdfs://hacluster/opt/data | This property is used to configure the HDFS relative path, the path configured in carbon.ddl.base.hdfs.url will be appended to the HDFS path configured in fs.defaultFS. If this path is configured, then user need not pass the complete path while dataload. For example: If absolute path of the csv file is hdfs://10.18.101.155:54310/data/cnbc/2016/xyz.csv, the path "hdfs://10.18.101.155:54310" will come from property fs.defaultFS and user can configure the /data/cnbc/ as carbon.ddl.base.hdfs.url. Now while dataload user can specify the csv path as /2016/xyz.csv. |
    -| carbon.badRecords.location | /opt/Carbon/Spark/badrecords | Path where the bad records are stored. |
    -| carbon.data.file.version | 3 | If this parameter value is set to 1, then CarbonData will support the data load which is in old format(0.x version). If the value is set to 2(1.x onwards version), then CarbonData will support the data load of new format only. The default value for this parameter is 3(latest version is set as default version). It improves the query performance by ~20% to 50%. For configuring V3 format explicitly, add carbon.data.file.version = V3 in carbon.properties file. |
    +| carbon.storelocation | | Location where CarbonData will create the store, and write the data in its own format. NOTE: Store location should be in HDFS. |
    --- End diff --

    Here also mention that if it is not specified it takes the spark warehouse path.

---
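The fallback the reviewer asks to document can be sketched as a simple resolution step. This is illustrative only, under the assumption stated in the review that an unset carbon.storelocation resolves to the Spark warehouse path; the class and method names are hypothetical:

```java
import java.util.Properties;

// Illustrative sketch of the documented fallback: when carbon.storelocation
// is not specified, the Spark warehouse path is used instead.
public class StoreLocationSketch {
    static String storeLocation(Properties props, String sparkWarehousePath) {
        String configured = props.getProperty("carbon.storelocation");
        return configured != null ? configured : sparkWarehousePath;
    }

    public static void main(String[] args) {
        // Unset -> warehouse path; explicitly set -> the configured value wins.
        System.out.println(storeLocation(new Properties(), "/user/hive/warehouse"));
    }
}
```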
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165544641

    --- Diff: conf/carbon.properties.template ---
    @@ -76,22 +72,16 @@ carbon.enable.quick.filter=false
     #carbon.block.meta.size.reserved.percentage=10
     ##csv reading buffer size.
     #carbon.csv.read.buffersize.byte=1048576
    -##To identify and apply compression for non-high cardinality columns
    -#high.cardinality.value=10
     ##maximum no of threads used for reading intermediate files for final merging.
     #carbon.merge.sort.reader.thread=3
     ##Carbon blocklet size. Note: this configuration cannot be change once store is generated
     #carbon.blocklet.size=12
    -##number of retries to get the metadata lock for loading data to table
    -#carbon.load.metadata.lock.retries=3
     ##Minimum blocklets needed for distribution.
     #carbon.blockletdistribution.min.blocklet.size=10
     ##Interval between the retries to get the lock
     #carbon.load.metadata.lock.retry.timeout.sec=5
     ##Temporary store location, By default it will take System.getProperty("java.io.tmpdir")
    -#carbon.tempstore.location=/opt/Carbon/TempStoreLoc
    -##data loading records count logger
    -#carbon.load.log.counter=50
    +#carbon.tempstore.location
     ##To dissable/enable carbon block distribution
     #carbon.custom.block.distribution=false
    --- End diff --

    This property is now changed to `carbon.task.distribution` and its default value is `block`.

---
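The behaviour discussed across these two comments can be sketched like the quoted Scala check: custom distribution applies only when the legacy boolean flag is set, or when carbon.task.distribution is explicitly "custom" (default "block"). This Java sketch is illustrative, not the project code:

```java
import java.util.Properties;

// Illustrative sketch of the distribution check quoted in the review:
// the legacy flag OR carbon.task.distribution=custom enables custom distribution.
public class TaskDistributionSketch {
    static boolean useCustomDistribution(Properties props) {
        // Legacy flag, default false.
        boolean legacyFlag = Boolean.parseBoolean(
            props.getProperty("carbon.custom.block.distribution", "false"));
        // Newer property, default "block"; "custom" opts in explicitly.
        String distribution = props.getProperty("carbon.task.distribution", "block");
        return legacyFlag || distribution.equalsIgnoreCase("custom");
    }

    public static void main(String[] args) {
        // Defaults: flag false, distribution "block" -> custom disabled.
        System.out.println(useCustomDistribution(new Properties()));
    }
}
```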
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165544424

    --- Diff: conf/carbon.properties.template ---
    @@ -110,7 +100,7 @@ carbon.enable.quick.filter=false
     ##Percentage to identify whether column cardinality is more than configured percent of total row count
     #high.cardinality.row.count.percentage=80
    --- End diff --

    This is also not used, I guess; please check and remove.

---
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165544265

    --- Diff: conf/carbon.properties.template ---
    @@ -76,22 +72,16 @@ carbon.enable.quick.filter=false
     #carbon.block.meta.size.reserved.percentage=10
     ##csv reading buffer size.
     #carbon.csv.read.buffersize.byte=1048576
    -##To identify and apply compression for non-high cardinality columns
    -#high.cardinality.value=10
     ##maximum no of threads used for reading intermediate files for final merging.
     #carbon.merge.sort.reader.thread=3
     ##Carbon blocklet size. Note: this configuration cannot be change once store is generated
     #carbon.blocklet.size=12
    -##number of retries to get the metadata lock for loading data to table
    -#carbon.load.metadata.lock.retries=3
     ##Minimum blocklets needed for distribution.
     #carbon.blockletdistribution.min.blocklet.size=10
     ##Interval between the retries to get the lock
     #carbon.load.metadata.lock.retry.timeout.sec=5
     ##Temporary store location, By default it will take System.getProperty("java.io.tmpdir")
    -#carbon.tempstore.location=/opt/Carbon/TempStoreLoc
    -##data loading records count logger
    -#carbon.load.log.counter=50
    +#carbon.tempstore.location
    --- End diff --

    Are we really using this? I think we always depend on either the java tmp dir or get tmp directories from spark/yarn. Please reverify and remove if not used.

---
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1831#discussion_r165543880

    --- Diff: conf/carbon.properties.template ---
    @@ -17,29 +17,25 @@
     # System Configuration
     ##
    -#Mandatory. Carbon Store path
    -carbon.storelocation=hdfs://hacluster/Opt/CarbonStore
    +#Optional. Carbon Store path
    --- End diff --

    Mention that if it is not specified it takes the spark warehouse path.

---