[GitHub] carbondata issue #1220: [CARBONDATA-1350]Fix the bug of the verification of ...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/1220

Discussed with @jackylk offline: now when 'SORT_SCOPE'='GLOBAL_SORT', 'single_pass' can be 'true', so closing this PR and raising another PR (#1224) to remove the useless restriction.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] carbondata issue #1224: [CARBONDATA-1354]Remove the useless restriction of '...
Github user asfgit commented on the issue: https://github.com/apache/carbondata/pull/1224 Can one of the admins verify this patch?
[GitHub] carbondata pull request #1224: [CARBONDATA-1354]Remove the useless restricti...
GitHub user zzcclp opened a pull request: https://github.com/apache/carbondata/pull/1224

[CARBONDATA-1354] Remove the useless restriction that 'single_pass' cannot be 'true' when 'SORT_SCOPE'='GLOBAL_SORT'

Now when 'SORT_SCOPE'='GLOBAL_SORT', 'single_pass' can be 'true', so remove the useless restriction.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata CARBONDATA-1354

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1224.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1224

commit b6236f583019f3e707be8e4f3b9bb32a20550f86
Author: Zhang Zhichao <441586...@qq.com>
Date: 2017-08-02T05:54:49Z

    [CARBONDATA-1354] Now when 'SORT_SCOPE'='GLOBAL_SORT', 'single_pass' can be 'true', so remove the useless restriction
[jira] [Created] (CARBONDATA-1354) When 'SORT_SCOPE'='GLOBAL_SORT', 'single_pass' can be 'true'
Zhichao Zhang created CARBONDATA-1354:

Summary: When 'SORT_SCOPE'='GLOBAL_SORT', 'single_pass' can be 'true'
Key: CARBONDATA-1354
URL: https://issues.apache.org/jira/browse/CARBONDATA-1354
Project: CarbonData
Issue Type: Bug
Components: data-load, spark-integration
Affects Versions: 1.2.0
Reporter: Zhichao Zhang
Assignee: Zhichao Zhang
Priority: Minor
Fix For: 1.2.0

Now when 'SORT_SCOPE'='GLOBAL_SORT', 'single_pass' can be 'true'.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1223: [WIP] Support cleaning garbage segment in all tables
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1223 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/722/
[GitHub] carbondata issue #1223: [WIP] Support cleaning garbage segment in all tables
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1223 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3318/
[GitHub] carbondata pull request #1217: [CARBONDATA-1345] Update tablemeta cache afte...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1217
[GitHub] carbondata issue #1217: [CARBONDATA-1345] Update tablemeta cache after table...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1217 LGTM
[GitHub] carbondata issue #1217: [CARBONDATA-1345] Update tablemeta cache after table...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1217 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3317/
[GitHub] carbondata issue #1217: [CARBONDATA-1345] Update tablemeta cache after table...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1217 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/721/
[GitHub] carbondata issue #1217: [CARBONDATA-1345] Update tablemeta cache after table...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1217 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/720/
[GitHub] carbondata issue #1217: [CARBONDATA-1345] Update tablemeta cache after table...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1217 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3316/
[jira] [Assigned] (CARBONDATA-1352) Test case Execute while creating Carbondata jar.
[ https://issues.apache.org/jira/browse/CARBONDATA-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Srigopal Mohanty reassigned CARBONDATA-1352:
Assignee: Srigopal Mohanty

> Test case Execute while creating Carbondata jar.
> Key: CARBONDATA-1352
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1352
> Project: CarbonData
> Issue Type: Bug
> Components: other
> Environment: Spark 2.1
> Reporter: Vinod Rohilla
> Assignee: Srigopal Mohanty
> Priority: Minor
> Attachments: TestcaseExecution.png, TestCaseExecution.png
>
> Steps to Reproduce:
> 1: Run the command: mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package
> 2: Check the attached screenshots.
> Expected Result:
> 1: All the test cases should be skipped while creating a jar.
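As background for the report above (a hedged aside, not part of the original issue): Maven has two standard flags for skipping tests, and they behave differently, which can explain tests appearing to run during a `-DskipTests` build.

```shell
# Skips test *execution* but still compiles test sources
# (Scala/Java test compilation still happens during the build):
mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package

# Skips both compilation and execution of tests entirely:
mvn -Dmaven.test.skip=true -Pspark-2.1 -Dspark.version=2.1.0 clean package
```

Both flags are standard Maven Surefire behavior; whether the second one resolves this particular report depends on what the attached screenshots actually show.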
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/719/
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3315/
[GitHub] carbondata pull request #1214: [CARBONDATA-1008] use MetastoreListener to sy...
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1214#discussion_r130773318

--- Diff: integration/hive/pom.xml ---
@@ -136,6 +142,12 @@
 org.apache.carbondata
 carbondata-hadoop
 ${project.version}
+
+
+org.apache.spark
+spark-sql_2.10
--- End diff --

@anubhav100 It is not required, so I exclude it.
[jira] [Resolved] (CARBONDATA-1353) SDV cluster tests are failing for measure filter feature
[ https://issues.apache.org/jira/browse/CARBONDATA-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang Chen resolved CARBONDATA-1353.
Resolution: Fixed
Assignee: Ravindra Pesala
Fix Version/s: 1.2.0

> SDV cluster tests are failing for measure filter feature
> Key: CARBONDATA-1353
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1353
> Project: CarbonData
> Issue Type: Bug
> Reporter: Ravindra Pesala
> Assignee: Ravindra Pesala
> Fix For: 1.2.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> SDV cluster tests are failing for measure filter feature.
> http://144.76.159.231:8080/job/ApacheSDVTests/32/
[GitHub] carbondata pull request #1222: [CARBONDATA-1353] Fixed measure filter tests ...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1222 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/718/
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3314/
[GitHub] carbondata pull request #1216: [CARBONDATA-1344] Remove useless variables
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1216
[GitHub] carbondata issue #1216: [CARBONDATA-1344] Remove useless variables
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1216 LGTM
[jira] [Resolved] (CARBONDATA-1351) When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', 'ThreadLocalTaskInfo.getCarbonTaskInfo()' return null
[ https://issues.apache.org/jira/browse/CARBONDATA-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1351.
Resolution: Fixed

> When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null
> Key: CARBONDATA-1351
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1351
> Project: CarbonData
> Issue Type: Bug
> Components: data-load, spark-integration
> Affects Versions: 1.2.0
> Reporter: Zhichao Zhang
> Assignee: Zhichao Zhang
> Priority: Minor
> Fix For: 1.2.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', CarbonData uses a native Spark RDD to load data, so the method 'ThreadLocalTaskInfo.setCarbonTaskInfo(carbonTaskInfo)' in 'CarbonRDD.compute' is never called. As a result, 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null in some unsafe-related classes, such as UnsafeFixLengthColumnPage, UnsafeVarLengthColumnPage, UnsafeMemoryDMStore and so on.
> Solution: Set the CarbonTaskInfo in the method 'ThreadLocalTaskInfo.getCarbonTaskInfo()' when 'threadLocal.get()' is null.
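The solution described in the issue above can be sketched as a lazily-initializing thread-local getter. This is an illustrative standalone sketch, not the actual CarbonData source; the class and field names merely mirror those mentioned in the issue.

```java
import java.util.UUID;

// Illustrative sketch of the fix described in CARBONDATA-1351: make the
// thread-local getter fall back to creating a CarbonTaskInfo when none was
// set, so callers never observe null. Names are modeled on the issue text,
// not copied from the CarbonData code base.
public class ThreadLocalTaskInfoSketch {

    static class CarbonTaskInfo {
        long taskId;
    }

    private static final ThreadLocal<CarbonTaskInfo> threadLocal = new ThreadLocal<>();

    public static void setCarbonTaskInfo(CarbonTaskInfo info) {
        threadLocal.set(info);
    }

    public static CarbonTaskInfo getCarbonTaskInfo() {
        // When data is loaded through a native Spark RDD, CarbonRDD.compute
        // never calls setCarbonTaskInfo, so initialize lazily here instead
        // of returning null to the unsafe column-page classes.
        if (threadLocal.get() == null) {
            CarbonTaskInfo info = new CarbonTaskInfo();
            info.taskId = UUID.randomUUID().getLeastSignificantBits();
            setCarbonTaskInfo(info);
        }
        return threadLocal.get();
    }

    public static void main(String[] args) {
        // Without the null check, this call would yield null for any thread
        // that skipped setCarbonTaskInfo.
        System.out.println(getCarbonTaskInfo() != null);
    }
}
```

Each thread that never called the setter now gets its own lazily-created task info on first read, which is exactly the failure mode the issue describes for the unsafe column-page classes.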
[GitHub] carbondata pull request #1221: [CARBONDATA-1351]Fix NPE of 'ThreadLocalTaskI...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1221
[jira] [Created] (CARBONDATA-1353) SDV cluster tests are failing for measure filter feature
Ravindra Pesala created CARBONDATA-1353:

Summary: SDV cluster tests are failing for measure filter feature
Key: CARBONDATA-1353
URL: https://issues.apache.org/jira/browse/CARBONDATA-1353
Project: CarbonData
Issue Type: Bug
Reporter: Ravindra Pesala

SDV cluster tests are failing for measure filter feature.
http://144.76.159.231:8080/job/ApacheSDVTests/32/
[GitHub] carbondata issue #1222: [WIP]Fixed measure filter tests with null
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1222 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3313/
[GitHub] carbondata pull request #1214: [CARBONDATA-1008] use MetastoreListener to sy...
Github user anubhav100 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1214#discussion_r130649865

--- Diff: integration/hive/pom.xml ---
@@ -136,6 +142,12 @@
 org.apache.carbondata
 carbondata-hadoop
 ${project.version}
+
+
+org.apache.spark
+spark-sql_2.10
--- End diff --

Why is spark-sql_2.10 required?
[GitHub] carbondata issue #1222: [WIP]Fixed measure filter tests with null
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1222 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/716/
[GitHub] carbondata issue #1222: [WIP]Fixed measure filter tests with null
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1222 retest this please
[GitHub] carbondata issue #1222: [WIP]Fixed measure filter tests with null
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1222 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/715/
[GitHub] carbondata issue #1222: [WIP]Fixed measure filter tests with null
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1222 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3311/
[GitHub] carbondata issue #1222: [WIP]Fixed measure filter tests with null
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1222 Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/714/
[GitHub] carbondata issue #1222: [WIP]Fixed measure filter tests with null
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1222 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3310/
[GitHub] carbondata pull request #1222: [WIP]Fixed measure filter tests with null
GitHub user ravipesala opened a pull request: https://github.com/apache/carbondata/pull/1222

[WIP] Fixed measure filter tests with null

SDV tests are failing on measure filter: http://144.76.159.231:8080/job/ApacheSDVTests/32/ Now those are fixed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata sdv-test_tune

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1222.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1222

commit 4c9949beb5cbd855efd69a1c2d82796912896c18
Author: Ravindra Pesala
Date: 2017-08-01T10:05:13Z

    Fixed measure filter tests with null
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/713/
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3309/
[GitHub] carbondata pull request #1218: [CARBONDATA-1347] Implemented Columnar Readin...
Github user anubhav100 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1218#discussion_r130604641

--- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataFilterUtil.java ---
@@ -0,0 +1,223 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ */
+
+package org.apache.carbondata.presto;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.GreaterThanEqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.GreaterThanExpression;
+import org.apache.carbondata.core.scan.expression.conditional.InExpression;
+import org.apache.carbondata.core.scan.expression.conditional.LessThanEqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.LessThanExpression;
+import org.apache.carbondata.core.scan.expression.conditional.ListExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+
+import com.facebook.presto.spi.ColumnHandle;
+import com.facebook.presto.spi.predicate.Domain;
+import com.facebook.presto.spi.predicate.Range;
+import com.facebook.presto.spi.predicate.TupleDomain;
+import com.facebook.presto.spi.type.BigintType;
+import com.facebook.presto.spi.type.BooleanType;
+import com.facebook.presto.spi.type.DateType;
+import com.facebook.presto.spi.type.DecimalType;
+import com.facebook.presto.spi.type.DoubleType;
+import com.facebook.presto.spi.type.IntegerType;
+import com.facebook.presto.spi.type.SmallintType;
+import com.facebook.presto.spi.type.TimestampType;
+import com.facebook.presto.spi.type.Type;
+import com.facebook.presto.spi.type.VarcharType;
+import com.google.common.collect.ImmutableList;
+import io.airlift.slice.Slice;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+public class CarbondataFilterUtil {
+
+  private static Map filterMap = new HashMap<>();
+
+  private static DataType Spi2CarbondataTypeMapper(CarbondataColumnHandle carbondataColumnHandle) {
+    Type colType = carbondataColumnHandle.getColumnType();
+    if (colType == BooleanType.BOOLEAN) return DataType.BOOLEAN;
+    else if (colType == SmallintType.SMALLINT) return DataType.SHORT;
+    else if (colType == IntegerType.INTEGER) return DataType.INT;
+    else if (colType == BigintType.BIGINT) return DataType.LONG;
+    else if (colType == DoubleType.DOUBLE) return DataType.DOUBLE;
+    else if (colType == VarcharType.VARCHAR) return DataType.STRING;
+    else if (colType == DateType.DATE) return DataType.DATE;
+    else if (colType == TimestampType.TIMESTAMP) return DataType.TIMESTAMP;
+    else if (colType == DecimalType.createDecimalType(carbondataColumnHandle.getPrecision(),
+        carbondataColumnHandle.getScale())) return DataType.DECIMAL;
+    else return DataType.STRING;
+  }
+
+  /**
+   * Convert presto-TupleDomain predication into Carbon scan express condition
+   *
+   * @param originalConstraint presto-TupleDomain
+   * @param carbonTable
+   * @return
+   */
+  public static Expression parseFilterExpression(TupleDomain originalConstraint,
+      CarbonTable carbonTable) {
+    ImmutableList.Builder filters = ImmutableList.builder();
+
+    Domain domain = null;
+
+    for (ColumnHandle c :
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3308/
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/712/
[GitHub] carbondata pull request #1218: [CARBONDATA-1347] Implemented Columnar Readin...
Github user anubhav100 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1218#discussion_r130602532

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -510,9 +511,16 @@
 * LOAD_STATUS SUCCESS
 */
 public static final String STORE_LOADSTATUS_SUCCESS = "Success";
+
+ /**
+ * Default batch for data read in Columnar format
--- End diff --

We set the batch size to 4096 because VectorizedCarbonRecordReader in CarbonData uses a ColumnarBatch with a default size of 4096 to read column batches, so we take that as the standard.
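The reasoning in the comment above can be sketched in a few lines: with a 4096-row default batch, a reader splits n rows into ceil(n / 4096) batches. This is an illustrative sketch; the class and method names are hypothetical, not the actual constants in CarbonCommonConstants.

```java
// Hypothetical sketch of the batch-size reasoning above: a 4096-row default
// matches the default ColumnarBatch size used by VectorizedCarbonRecordReader.
// Names here are illustrative, not CarbonData source.
public class BatchSizeSketch {

    /** Default number of rows read per columnar batch. */
    public static final int DEFAULT_BATCH_SIZE = 4096;

    /** Number of batches needed to read totalRows rows (ceiling division). */
    public static int numBatches(int totalRows) {
        return (totalRows + DEFAULT_BATCH_SIZE - 1) / DEFAULT_BATCH_SIZE;
    }

    public static void main(String[] args) {
        // 10000 rows fit in three 4096-row batches (two full, one partial).
        System.out.println(numBatches(10000));
    }
}
```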
[GitHub] carbondata issue #1221: [CARBONDATA-1351]Fix NPE of 'ThreadLocalTaskInfo.get...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1221 LGTM, thanks for working on this
[GitHub] carbondata pull request #1218: [CARBONDATA-1347] Implemented Columnar Readin...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1218#discussion_r130582527

--- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataFilterUtil.java ---
@@ -0,0 +1,223 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ */
+
+package org.apache.carbondata.presto;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.GreaterThanEqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.GreaterThanExpression;
+import org.apache.carbondata.core.scan.expression.conditional.InExpression;
+import org.apache.carbondata.core.scan.expression.conditional.LessThanEqualToExpression;
+import org.apache.carbondata.core.scan.expression.conditional.LessThanExpression;
+import org.apache.carbondata.core.scan.expression.conditional.ListExpression;
+import org.apache.carbondata.core.scan.expression.logical.AndExpression;
+import org.apache.carbondata.core.scan.expression.logical.OrExpression;
+
+import com.facebook.presto.spi.ColumnHandle;
+import com.facebook.presto.spi.predicate.Domain;
+import com.facebook.presto.spi.predicate.Range;
+import com.facebook.presto.spi.predicate.TupleDomain;
+import com.facebook.presto.spi.type.BigintType;
+import com.facebook.presto.spi.type.BooleanType;
+import com.facebook.presto.spi.type.DateType;
+import com.facebook.presto.spi.type.DecimalType;
+import com.facebook.presto.spi.type.DoubleType;
+import com.facebook.presto.spi.type.IntegerType;
+import com.facebook.presto.spi.type.SmallintType;
+import com.facebook.presto.spi.type.TimestampType;
+import com.facebook.presto.spi.type.Type;
+import com.facebook.presto.spi.type.VarcharType;
+import com.google.common.collect.ImmutableList;
+import io.airlift.slice.Slice;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+public class CarbondataFilterUtil {
+
+  private static Map filterMap = new HashMap<>();
+
+  private static DataType Spi2CarbondataTypeMapper(CarbondataColumnHandle carbondataColumnHandle) {
+    Type colType = carbondataColumnHandle.getColumnType();
+    if (colType == BooleanType.BOOLEAN) return DataType.BOOLEAN;
+    else if (colType == SmallintType.SMALLINT) return DataType.SHORT;
+    else if (colType == IntegerType.INTEGER) return DataType.INT;
+    else if (colType == BigintType.BIGINT) return DataType.LONG;
+    else if (colType == DoubleType.DOUBLE) return DataType.DOUBLE;
+    else if (colType == VarcharType.VARCHAR) return DataType.STRING;
+    else if (colType == DateType.DATE) return DataType.DATE;
+    else if (colType == TimestampType.TIMESTAMP) return DataType.TIMESTAMP;
+    else if (colType == DecimalType.createDecimalType(carbondataColumnHandle.getPrecision(),
+        carbondataColumnHandle.getScale())) return DataType.DECIMAL;
+    else return DataType.STRING;
+  }
+
+  /**
+   * Convert presto-TupleDomain predication into Carbon scan express condition
+   *
+   * @param originalConstraint presto-TupleDomain
+   * @param carbonTable
+   * @return
+   */
+  public static Expression parseFilterExpression(TupleDomain originalConstraint,
+      CarbonTable carbonTable) {
+    ImmutableList.Builder filters = ImmutableList.builder();
+
+    Domain domain = null;
+
+    for (ColumnHandle c :
[GitHub] carbondata pull request #1205: [CARBONDATA-1086] updated configuration-param...
Github user zzcclp commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1205#discussion_r130581766 --- Diff: docs/dml-operation-on-carbondata.md --- @@ -149,6 +149,50 @@ You can use the following options to load data: * If this option is set to TRUE, then high.cardinality.identify.enable property will be disabled during data load. +- **SORT_SCOPE:** This property can have four possible values : + +* BATCH_SORT : The sorting scope is smaller and more index tree will be created,thus loading is faster but query maybe slower. + +* LOCAL_SORT : The sorting scope is bigger and one index tree per data node will be created, thus loading is slower but query is faster. + +* GLOBAL_SORT : The sorting scope is bigger and one index tree per task will be created, thus loading is slower but query is faster. + +* NO_SORT : Feasible if we want to load our data in unsorted manner. + +For BATCH_SORT: + +``` +OPTIONS ('SORT_SCOPE'='BATCH_SORT') +``` + +You can also specify the sort size option for sort scope. + +``` +OPTIONS('SORT_SCOPE'='BATCH_SORT', 'batch_sort_size_inmb'='7') +``` + +Note : + +* batch_sort_size_inmb : Size of data in MB to be processed in batch. By default it is the 45 percent size of sort.inmemory.size.inmb(Memory size in MB available for in-memory sort). + +For GLOBAL_SORT : --- End diff -- I mean that if SORT_SCOPE=GLOBAL_SORT,single_pass must be false --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] carbondata pull request #1205: [CARBONDATA-1086] updated configuration-param...
Github user vandana7 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1205#discussion_r130578370 --- Diff: docs/dml-operation-on-carbondata.md --- @@ -149,6 +149,50 @@ You can use the following options to load data: * If this option is set to TRUE, then high.cardinality.identify.enable property will be disabled during data load. +- **SORT_SCOPE:** This property can have four possible values : + +* BATCH_SORT : The sorting scope is smaller and more index tree will be created,thus loading is faster but query maybe slower. + +* LOCAL_SORT : The sorting scope is bigger and one index tree per data node will be created, thus loading is slower but query is faster. + +* GLOBAL_SORT : The sorting scope is bigger and one index tree per task will be created, thus loading is slower but query is faster. + +* NO_SORT : Feasible if we want to load our data in unsorted manner. + +For BATCH_SORT: + +``` +OPTIONS ('SORT_SCOPE'='BATCH_SORT') +``` + +You can also specify the sort size option for sort scope. + +``` +OPTIONS('SORT_SCOPE'='BATCH_SORT', 'batch_sort_size_inmb'='7') +``` + +Note : + +* batch_sort_size_inmb : Size of data in MB to be processed in batch. By default it is the 45 percent size of sort.inmemory.size.inmb(Memory size in MB available for in-memory sort). 
+ +For GLOBAL_SORT : --- End diff -- Hi, I tried to execute the LOAD query with single_pass='true' and sort_scope='BATCH_SORT'; it executed successfully and I was able to fetch the records in sorted order. The syntax I used to execute the load query: LOAD DATA INPATH 'hdfs://localhost:54310/uniqdata/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='TRUE','SORT_SCOPE'='BATCH_SORT','batch_sort_size_inmb'='7'); Please let me know if I am doing anything wrong.
[jira] [Created] (CARBONDATA-1352) Test cases execute while creating CarbonData jar.
Vinod Rohilla created CARBONDATA-1352: - Summary: Test cases execute while creating CarbonData jar. Key: CARBONDATA-1352 URL: https://issues.apache.org/jira/browse/CARBONDATA-1352 Project: CarbonData Issue Type: Bug Components: other Environment: Spark 2.1 Reporter: Vinod Rohilla Priority: Minor Attachments: TestcaseExecution.png, TestCaseExecution.png Steps to Reproduce: 1: Run the command: mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package 2: Check the attached screenshots. Expected Result: 1: All the test cases should be skipped while creating a jar. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1205: [CARBONDATA-1086] updated configuration-parameters.m...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1205 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/711/
[GitHub] carbondata issue #1205: [CARBONDATA-1086] updated configuration-parameters.m...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1205 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3307/
[GitHub] carbondata pull request #1218: [CARBONDATA-1347] Implemented Columnar Readin...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1218#discussion_r130558828 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java --- @@ -510,9 +511,16 @@ * LOAD_STATUS SUCCESS */ public static final String STORE_LOADSTATUS_SUCCESS = "Success"; + + /** + * Default batch for data read in Columnar format --- End diff -- 1. Can you add the unit info for "4096"? 2. Can you explain why "4096" was chosen?
[GitHub] carbondata issue #1221: [CARBONDATA-1351]Fix NPE of 'ThreadLocalTaskInfo.get...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/1221 @xuchuanyin Done, please review, thanks.
[GitHub] carbondata issue #1221: [CARBONDATA-1351]When 'SORT_SCOPE'='GLOBAL_SORT' and...
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/1221 @zzcclp The title of a PR or commit usually describes an action, like `fixed`, `updated`, `added`, `refactored`, etc.
[GitHub] carbondata issue #1221: [CARBONDATA-1351]When 'SORT_SCOPE'='GLOBAL_SORT' and...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1221 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3306/
[GitHub] carbondata issue #1221: [CARBONDATA-1351]When 'SORT_SCOPE'='GLOBAL_SORT' and...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1221 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/710/
[GitHub] carbondata pull request #1221: [CARBONDATA-1351]When 'SORT_SCOPE'='GLOBAL_SO...
GitHub user zzcclp opened a pull request: https://github.com/apache/carbondata/pull/1221 [CARBONDATA-1351]When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', data is loaded via Spark's native RDD, so 'ThreadLocalTaskInfo.setCarbonTaskInfo(carbonTaskInfo)' in 'CarbonRDD.compute' is not called, and 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null in some unsafe-related classes, such as UnsafeFixLengthColumnPage, UnsafeVarLengthColumnPage, UnsafeMemoryDMStore and so on. Solution: set the CarbonTaskInfo in 'ThreadLocalTaskInfo.getCarbonTaskInfo()' when 'threadLocal.get()' is null. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zzcclp/carbondata CARBONDATA-1351 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1221.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1221 commit 211d392759d5dcfcff4046a111e15cb500c0acf9 Author: Zhang Zhichao <441586...@qq.com> Date: 2017-08-01T08:42:16Z [CARBONDATA-1351] When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null
Solution: set the CarbonTaskInfo in 'ThreadLocalTaskInfo.getCarbonTaskInfo()' when 'threadLocal.get()' is null.
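The fix described above amounts to lazy initialization inside the getter. Below is only an illustrative sketch (the class shape and the placeholder id generation are assumptions, not the actual CarbonData implementation): when Spark's native RDD path skips the `setCarbonTaskInfo` call made in `CarbonRDD.compute`, the getter creates a task info itself instead of returning null.

```java
// Hedged sketch of the described fix; names and id generation are
// illustrative only, not the real CarbonData code.
class ThreadLocalTaskInfoSketch {
    static class CarbonTaskInfo {
        long taskId;
    }

    private static final ThreadLocal<CarbonTaskInfo> threadLocal = new ThreadLocal<>();

    static void setCarbonTaskInfo(CarbonTaskInfo info) {
        threadLocal.set(info);
    }

    static CarbonTaskInfo getCarbonTaskInfo() {
        // Fix: if the RDD path never populated the thread-local,
        // initialize a fresh task info here instead of returning null.
        if (threadLocal.get() == null) {
            CarbonTaskInfo info = new CarbonTaskInfo();
            info.taskId = System.nanoTime(); // placeholder id generation
            setCarbonTaskInfo(info);
        }
        return threadLocal.get();
    }
}
```

With this shape, callers such as the unsafe column pages always receive a non-null task info, regardless of which code path scheduled the work.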
[GitHub] carbondata issue #1221: [CARBONDATA-1351]When 'SORT_SCOPE'='GLOBAL_SORT' and...
Github user asfgit commented on the issue: https://github.com/apache/carbondata/pull/1221 Can one of the admins verify this patch?
[jira] [Created] (CARBONDATA-1351) When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null
Zhichao Zhang created CARBONDATA-1351: -- Summary: When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null Key: CARBONDATA-1351 URL: https://issues.apache.org/jira/browse/CARBONDATA-1351 Project: CarbonData Issue Type: Bug Components: data-load, spark-integration Affects Versions: 1.2.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Priority: Minor Fix For: 1.2.0 When 'SORT_SCOPE'='GLOBAL_SORT' and 'enable.unsafe.columnpage'='true', data is loaded via Spark's native RDD, so 'ThreadLocalTaskInfo.setCarbonTaskInfo(carbonTaskInfo)' in 'CarbonRDD.compute' is not called, and 'ThreadLocalTaskInfo.getCarbonTaskInfo()' returns null in some unsafe-related classes, such as UnsafeFixLengthColumnPage, UnsafeVarLengthColumnPage, UnsafeMemoryDMStore and so on. Solution: set the CarbonTaskInfo in 'ThreadLocalTaskInfo.getCarbonTaskInfo()' when 'threadLocal.get()' is null.
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/709/
[GitHub] carbondata issue #1218: [CARBONDATA-1347] Implemented Columnar Reading Of Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1218 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3304/
[GitHub] carbondata issue #1220: [CARBONDATA-1350]When 'SORT_SCOPE'='GLOBAL_SORT', th...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1220 Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/708/
[GitHub] carbondata issue #1220: [CARBONDATA-1350]When 'SORT_SCOPE'='GLOBAL_SORT', th...
Github user asfgit commented on the issue: https://github.com/apache/carbondata/pull/1220 Can one of the admins verify this patch?
[GitHub] carbondata pull request #1220: [CARBONDATA-1350]When 'SORT_SCOPE'='GLOBAL_SO...
GitHub user zzcclp opened a pull request: https://github.com/apache/carbondata/pull/1220 [CARBONDATA-1350]When 'SORT_SCOPE'='GLOBAL_SORT', the verification that 'single_pass' must be false is invalid The value of the option 'single_pass' is converted to lower case, but the check uses 'single_pass.equals("TRUE")' to validate it, so the check never matches and data loading fails. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zzcclp/carbondata CARBONDATA-1350 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1220.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1220 commit e68e692ec75aa8423c97f42e7465c311f52893a6 Author: Zhang Zhichao <441586...@qq.com> Date: 2017-08-01T07:44:53Z [CARBONDATA-1350] When 'SORT_SCOPE'='GLOBAL_SORT', the verification that 'single_pass' must be false is invalid. The value of the option 'single_pass' is converted to lower case, but the check uses 'single_pass.equals("TRUE")' to validate it, so the check never matches and data loading fails.
[GitHub] carbondata pull request #:
Github user xuchuanyin commented on the pull request: https://github.com/apache/carbondata/commit/79feac96ae789851c5ad7306a7acaaba25d8e6c9#commitcomment-23408394 In integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala on line 481: @ravipesala
[jira] [Created] (CARBONDATA-1350) When 'SORT_SCOPE'='GLOBAL_SORT', the verification that 'single_pass' must be false is invalid.
Zhichao Zhang created CARBONDATA-1350: -- Summary: When 'SORT_SCOPE'='GLOBAL_SORT', the verification that 'single_pass' must be false is invalid. Key: CARBONDATA-1350 URL: https://issues.apache.org/jira/browse/CARBONDATA-1350 Project: CarbonData Issue Type: Bug Components: data-load, spark-integration Affects Versions: 1.2.0 Environment: On branch master. Reporter: Zhichao Zhang Assignee: Zhichao Zhang Priority: Minor Fix For: 1.2.0 The value of the option 'single_pass' is converted to lower case, but the check uses 'single_pass.equals("TRUE")' to validate it, so the check never matches and data loading fails.
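The case-sensitivity bug described in this issue can be shown with a minimal sketch (the class and method names are hypothetical, not the actual CarbonData code): once the option value has been lowercased, comparing it against the upper-case literal "TRUE" can never succeed, so the check silently never fires; a case-insensitive comparison behaves as intended.

```java
// Illustrative sketch of the reported bug; names are hypothetical.
class SinglePassCheck {
    // Buggy check: the value was already lowercased at parse time,
    // so equals("TRUE") always returns false.
    static boolean isSinglePassBuggy(String optionValue) {
        String normalized = optionValue.toLowerCase();
        return normalized.equals("TRUE");
    }

    // Fixed check: compare case-insensitively (or against "true").
    static boolean isSinglePassFixed(String optionValue) {
        return optionValue.equalsIgnoreCase("true");
    }
}
```

With the buggy form, `'SINGLE_PASS'='TRUE'` is never detected, which is how the validation against GLOBAL_SORT was bypassed.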
[jira] [Commented] (CARBONDATA-1349) Error message displays while executing Select Query on an existing table.
[ https://issues.apache.org/jira/browse/CARBONDATA-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108516#comment-16108516 ] xuchuanyin commented on CARBONDATA-1349: [~vin7149] could you please add more detailed error logs? > Error message displays while executing Select Query on an existing table. > --- > > Key: CARBONDATA-1349 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1349 > Project: CarbonData > Issue Type: Bug > Components: data-query > Environment: Spark 2.1 >Reporter: Vinod Rohilla >Priority: Minor > > *Steps to reproduce:* > 1: Table must have been created at least 1 week before. > 2: Data must be loaded. > 3: Perform a select query: "Select * from uniqdata" > *Actual result:* > Error: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 0 in stage 87.0 failed 1 times, most recent failure: Lost task 0.0 in > stage 87.0 (TID 111, localhost, executor driver): > org.apache.spark.util.TaskCompletionListenerException: > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > java.nio.BufferUnderflowException > at > org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105) > at org.apache.spark.scheduler.Task.run(Task.scala:112) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > *Expected result:* The select query should display the correct results. > Note: If the user creates a new table, loads the data and performs the select > query, it shows the result; but on the existing table, the select query > returns an error.
[jira] [Created] (CARBONDATA-1349) Error message displays while executing Select Query on an existing table.
Vinod Rohilla created CARBONDATA-1349: - Summary: Error message displays while executing Select Query on an existing table. Key: CARBONDATA-1349 URL: https://issues.apache.org/jira/browse/CARBONDATA-1349 Project: CarbonData Issue Type: Bug Components: data-query Environment: Spark 2.1 Reporter: Vinod Rohilla Priority: Minor *Steps to reproduce:* 1: Table must have been created at least 1 week before. 2: Data must be loaded. 3: Perform a select query: "Select * from uniqdata" *Actual result:* Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 87.0 failed 1 times, most recent failure: Lost task 0.0 in stage 87.0 (TID 111, localhost, executor driver): org.apache.spark.util.TaskCompletionListenerException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.nio.BufferUnderflowException at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105) at org.apache.spark.scheduler.Task.run(Task.scala:112) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) *Expected result:* The select query should display the correct results. Note: If the user creates a new table, loads the data and performs the select query, it shows the result; but on the existing table, the select query returns an error.
[GitHub] carbondata pull request #:
Github user xuchuanyin commented on the pull request: https://github.com/apache/carbondata/commit/79feac96ae789851c5ad7306a7acaaba25d8e6c9#commitcomment-23407541 In integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala on line 481: @jackylk @sraghunandan Hi, I'm studying the query-related code and found that this line may cause a problem. The member variable `metadata.tablesMeta` in `CarbonFileMetaStore` is list-based, not Guava-Cache-like. `refreshCache` empties the list, but there is no corresponding action to refill it. For example, `CarbonSessionState.refreshRelationFromCache(...)` first calls `checkSchemasModifiedTimeAndReloadTables`, which calls `refreshCache` internally, and then calls `getTableFromMetadataCache`, which just searches `metadata.tablesMeta`. But `metadata.tablesMeta` is empty at that point, so `CarbonSessionState.refreshRelationFromCache` actually does nothing. I haven't tested it due to lack of environment. Could you review this and correct me if I am wrong? Ps: Keeping the `loadMetaData(...)` here is OK in my environment.
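The concern raised above can be illustrated with a minimal, hypothetical sketch of a list-backed cache (all names are illustrative, not the actual CarbonFileMetaStore code): a refresh that only clears the backing list, with no reload step, leaves every subsequent lookup empty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the reported behaviour: a list-backed
// metadata "cache" whose refresh clears entries but never reloads them.
class ListCacheSketch {
    private final List<String> tablesMeta = new ArrayList<>();

    // Simulates loading table metadata into the list.
    void load(String table) {
        tablesMeta.add(table);
    }

    // Mirrors the concern: clear without a corresponding reload,
    // unlike a Guava-style cache that repopulates lazily on lookup.
    void refreshCache() {
        tablesMeta.clear();
    }

    // A plain list search: finds nothing once refreshCache has run.
    boolean getTableFromMetadataCache(String table) {
        return tablesMeta.contains(table);
    }
}
```

A lazily-loading cache would instead repopulate the entry on the next lookup, which is what the comment suggests the list-based design is missing.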
[GitHub] carbondata pull request #1205: [CARBONDATA-1086] updated configuration-param...
Github user zzcclp commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1205#discussion_r130534145 --- Diff: docs/dml-operation-on-carbondata.md --- @@ -149,6 +149,50 @@ You can use the following options to load data: * If this option is set to TRUE, then high.cardinality.identify.enable property will be disabled during data load. +- **SORT_SCOPE:** This property can have four possible values : + +* BATCH_SORT : The sorting scope is smaller and more index tree will be created,thus loading is faster but query maybe slower. + +* LOCAL_SORT : The sorting scope is bigger and one index tree per data node will be created, thus loading is slower but query is faster. + +* GLOBAL_SORT : The sorting scope is bigger and one index tree per task will be created, thus loading is slower but query is faster. + +* NO_SORT : Feasible if we want to load our data in unsorted manner. + +For BATCH_SORT: + +``` +OPTIONS ('SORT_SCOPE'='BATCH_SORT') +``` + +You can also specify the sort size option for sort scope. + +``` +OPTIONS('SORT_SCOPE'='BATCH_SORT', 'batch_sort_size_inmb'='7') +``` + +Note : + +* batch_sort_size_inmb : Size of data in MB to be processed in batch. By default it is the 45 percent size of sort.inmemory.size.inmb(Memory size in MB available for in-memory sort). + +For GLOBAL_SORT : --- End diff -- Suggestion: add below note: `'SINGLE_PASS' must be false.`
[GitHub] carbondata issue #1205: [CARBONDATA-1086] updated configuration-parameters.m...
Github user sgururajshetty commented on the issue: https://github.com/apache/carbondata/pull/1205 LGTM @chenliang613 kindly review and merge.
[jira] [Resolved] (CARBONDATA-1346) Develop framework for SDV tests to run in cluster. And add all existing SDV tests to it
[ https://issues.apache.org/jira/browse/CARBONDATA-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Chen resolved CARBONDATA-1346. Resolution: Fixed Assignee: Ravindra Pesala Fix Version/s: 1.2.0 > Develop framework for SDV tests to run in cluster. And add all existing SDV > tests to it > --- > > Key: CARBONDATA-1346 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1346 > Project: CarbonData > Issue Type: Improvement >Reporter: Ravindra Pesala >Assignee: Ravindra Pesala > Fix For: 1.2.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Develop framework for SDV tests to run in cluster. And add all existing SDV > tests to it
[GitHub] carbondata pull request #1169: [CARBONDATA-1346] SDV cluster tests
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1169