[GitHub] carbondata issue #1816: [CARBONDATA-2038][Tests] use junit assertion in java...

2018-01-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1816
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1649/



---


[jira] [Resolved] (CARBONDATA-2009) REFRESH TABLE Limitation When HiveMetaStore is used

2018-01-17 Thread Liang Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-2009.

Resolution: Fixed

> REFRESH TABLE Limitation When HiveMetaStore is used
> ---
>
> Key: CARBONDATA-2009
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2009
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Arshad
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Refresh table command will not register the carbon table if the old table is 
> stored in the CarbonHiveMetastore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #1790: [CARBONDATA-2009][Documentation] Document Ref...

2018-01-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1790


---


[GitHub] carbondata pull request #1793: [CARBONDATA-2021]fix clean up issue when upda...

2018-01-17 Thread akashrn5
Github user akashrn5 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1793#discussion_r161996456
  
--- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ---
@@ -956,9 +964,16 @@ public UpdateVO getInvalidTimestampRange(String segmentId) {
         long timestamp = CarbonUpdateUtil.getTimeStampAsLong(
             CarbonTablePath.DataFileUtil.getTimeStampFromDeleteDeltaFile(fileName));

-        if (block.getBlockName().equalsIgnoreCase(blkName) && (
-            Long.compare(timestamp, deltaStartTimestamp) < 0)) {
-          files.add(eachFile);
+        if (block.getBlockName().equalsIgnoreCase(blkName)) {
+
+          if (getOnlyAbortedFiles) {
+            if (Long.compare(timestamp, deltaEndTimestamp) > 0) {
--- End diff --

Yes, but all the timestamp comparisons in the code use Long.compare, so I 
followed the same approach, just to keep it consistent everywhere.
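
For primitive long values the two forms are interchangeable; a minimal
sketch (illustration only, not part of the patch):

    long timestamp = 1516179896000L;
    long deltaEndTimestamp = 1516179800000L;
    // Long.compare returns <0, 0 or >0, so comparing its result to 0
    // is equivalent to the relational operator for primitives
    boolean viaCompare = Long.compare(timestamp, deltaEndTimestamp) > 0;
    boolean viaOperator = timestamp > deltaEndTimestamp;
    assert viaCompare == viaOperator;  // always holds for primitive longs

Long.compare mainly earns its keep when a comparator is needed or when
boxed Longs are involved; for plain primitives it is purely a style choice.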


---


[jira] [Resolved] (CARBONDATA-2013) executing alter query on non-carbon table gives error, "table can not found in database"

2018-01-17 Thread Venkata Ramana G (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Ramana G resolved CARBONDATA-2013.
--
   Resolution: Fixed
 Assignee: Kushal
Fix Version/s: 1.3.0

> executing alter query on non-carbon table gives error, "table can not found 
> in database"
> 
>
> Key: CARBONDATA-2013
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2013
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kushal Sah
>Assignee: Kushal
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> create table test(id int,time string) ROW FORMAT DELIMITED FIELDS TERMINATED 
> BY ',' STORED AS TEXTFILE;
> alter table test rename to new;  =>  gives error  test not found in database 
> default



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2013) executing alter query on non-carbon table gives error, "table can not found in database"

2018-01-17 Thread Venkata Ramana G (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Ramana G updated CARBONDATA-2013:
-
Summary: executing alter query on non-carbon table gives error, "table can 
not found in database"  (was: executing alter query results that table can not 
found in database)

> executing alter query on non-carbon table gives error, "table can not found 
> in database"
> 
>
> Key: CARBONDATA-2013
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2013
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kushal Sah
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> create table test(id int,time string) ROW FORMAT DELIMITED FIELDS TERMINATED 
> BY ',' STORED AS TEXTFILE;
> alter table test rename to new;  =>  gives error  test not found in database 
> default



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #1799: [CARBONDATA-2026] Fix all issues and testcase...

2018-01-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1799


---


[GitHub] carbondata issue #1799: [CARBONDATA-2026] Fix all issues and testcases when ...

2018-01-17 Thread gvramana
Github user gvramana commented on the issue:

https://github.com/apache/carbondata/pull/1799
  
LGTM


---


[jira] [Created] (CARBONDATA-2043) Configurable wait time for requesting executors and minimum registered executors ratio to continue the block distribution

2018-01-17 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-2043:


 Summary: Configurable wait time for requesting executors and 
minimum registered executors ratio to continue the block distribution
 Key: CARBONDATA-2043
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2043
 Project: CarbonData
  Issue Type: Improvement
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan


* Configurable wait time for requesting executors and minimum registered 
executors ratio to continue the block distribution
- carbon.dynamicallocation.schedulertimeout: configures the wait time; 
default 5 sec, minimum 5 sec, maximum 15 sec.
- carbon.scheduler.minregisteredresourcesratio: configures the minimum 
registered executors ratio; minimum 0.1, maximum 1.0, default 0.8.
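
A minimal sketch of setting these properties programmatically, assuming the
standard CarbonProperties API (the keys are the ones proposed above; the
values are examples only):

    import org.apache.carbondata.core.util.CarbonProperties;

    public class SchedulerConfigExample {
      public static void main(String[] args) {
        CarbonProperties props = CarbonProperties.getInstance();
        // wait time for requesting executors, in seconds (min 5, max 15, default 5)
        props.addProperty("carbon.dynamicallocation.schedulertimeout", "10");
        // minimum registered executors ratio (min 0.1, max 1.0, default 0.8)
        props.addProperty("carbon.scheduler.minregisteredresourcesratio", "0.9");
      }
    }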



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #1820: [CARBONDATA-2042]Fixed data mismatch issue in...

2018-01-17 Thread kumarvishal09
GitHub user kumarvishal09 opened a pull request:

https://github.com/apache/carbondata/pull/1820

[CARBONDATA-2042]Fixed data mismatch issue in case timeseries

**Problem:** Year-, month-, and day-level timeseries tables give wrong results.
**Solution:** Fixed the timeseries UDF, which was not able to convert data 
when the hour is in 24-hour format.
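
The JDK pitfall that typically causes this kind of bug (an illustration of
the failure mode, not the actual patch): in SimpleDateFormat, hh is the
1-12 clock hour while HH is the 0-23 hour of day, so 24-hour input cannot
be parsed with an hh pattern:

    import java.text.ParseException;
    import java.text.SimpleDateFormat;

    public class HourPatternExample {
      public static void main(String[] args) throws ParseException {
        SimpleDateFormat h24 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        SimpleDateFormat h12 = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss");
        h12.setLenient(false);
        // parses fine: HH accepts hours 0-23
        System.out.println(h24.parse("2018-01-17 23:04:56"));
        // throws ParseException: hh only accepts clock hours 1-12
        System.out.println(h12.parse("2018-01-17 23:04:56"));
      }
    }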

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
   Added UT
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kumarvishal09/incubator-carbondata 
master_Timeseriesdatamismatch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1820.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1820


commit 42eb2878067db4c0df49680fca970bf011bb81f5
Author: kumarvishal 
Date:   2018-01-17T09:04:56Z

Fixed data mismatch issue in case timeseries




---


[jira] [Created] (CARBONDATA-2042) Data Mismatch issue in case of Timeseries Year, Month and Day level table

2018-01-17 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2042:


 Summary: Data Mismatch issue in case of Timeseries Year, Month and 
Day level table
 Key: CARBONDATA-2042
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2042
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal
 Attachments: data_sort.csv

sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into table 
mainTable")
sql("CREATE TABLE table_03 (imei string,age int,mac string,productdate 
timestamp,updatedate timestamp,gamePointId double,contractid double ) STORED BY 
'org.apache.carbondata.format'")
sql(s"LOAD DATA inpath '$resourcesPath/data_sort.csv' INTO table table_03 
options ('DELIMITER'=',', 
'QUOTECHAR'='','FILEHEADER'='imei,age,mac,productdate,updatedate,gamePointId,contractid')")
sql("create datamap ag1 on table table_03 using 'preaggregate' DMPROPERTIES ( 
'timeseries.eventtime'='productdate','timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1')as
 select productdate,mac,sum(age) from table_03 group by productdate,mac")



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #1750: [CARBONDATA-1969] Support Java API to create ...

2018-01-17 Thread mohammadshahidkhan
Github user mohammadshahidkhan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1750#discussion_r161989164
  
--- Diff: store/sdk/src/test/scala/org/apache/carbondata/store/TestCarbonFileWriter.scala ---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.store
+
+import java.io.File
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.metadata.datatype.{DataTypes, StructField}
+import org.apache.carbondata.store.api.{CarbonStore, SchemaBuilder}
+
+class TestCarbonFileWriter extends QueryTest with BeforeAndAfterAll {
+
+  test("test write carbon table and read as external table") {
+    sql("DROP TABLE IF EXISTS source")
+
+    val tablePath = "./db1/tc1"
+    cleanTestTable(tablePath)
+    createTestTable(tablePath)
+
+    sql(s"CREATE EXTERNAL TABLE source STORED BY 'carbondata' LOCATION '$tablePath'")
+    checkAnswer(sql("SELECT count(*) from source"), Row(1000))
+
+    sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test write carbon table and read by refresh table") {
+    sql("DROP DATABASE IF EXISTS db1 CASCADE")
+
+    val tablePath = "./db1/tc1"
+    cleanTestTable(tablePath)
+    createTestTable(tablePath)
+
+    sql("CREATE DATABASE db1 LOCATION './db1'")
+    sql("REFRESH TABLE db1.tc1")
+    checkAnswer(sql("SELECT count(*) from db1.tc1"), Row(1000))
+
+    sql("DROP DATABASE IF EXISTS db1 CASCADE")
+  }
+
+  private def cleanTestTable(tablePath: String) = {
+    if (new File(tablePath).exists()) {
+      new File(tablePath).delete()
+    }
+  }
+
+  private def createTestTable(tablePath: String): Unit = {
+    val carbon = CarbonStore.build()
+
+    val schema = SchemaBuilder.newInstance
+      .addColumn(new StructField("name", DataTypes.STRING), true)
+      .addColumn(new StructField("age", DataTypes.INT), false)
+      .addColumn(new StructField("height", DataTypes.DOUBLE), false)
+      .create
+
+    val table = carbon.createTable("t1", schema, tablePath)
+    val segment = table.newBatchSegment()
+
+    segment.open()
+    val writer = segment.newWriter()
+    (1 to 1000).foreach { _ => writer.writeRow(Array[String]("amy", "1", "2.3")) }
+    writer.close()
--- End diff --

Stream close cannot be ensured here without a finally block; if writeRow 
throws, the writer is never closed.
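
The requested pattern, as a minimal Java-style sketch (the writer type name
is hypothetical; only the control flow matters):

    SegmentWriter writer = segment.newWriter();  // hypothetical writer type
    try {
      for (int i = 1; i <= 1000; i++) {
        writer.writeRow(new String[]{"amy", "1", "2.3"});
      }
    } finally {
      // runs even when writeRow throws, so the stream is always closed
      writer.close();
    }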


---


[GitHub] carbondata pull request #1750: [CARBONDATA-1969] Support Java API to create ...

2018-01-17 Thread mohammadshahidkhan
Github user mohammadshahidkhan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1750#discussion_r161988359
  
--- Diff: store/sdk/src/main/java/org/apache/carbondata/store/TableBuilder.java ---
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.store;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.CarbonMetadata;
+import org.apache.carbondata.core.metadata.converter.SchemaConverter;
+import org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+import org.apache.carbondata.core.metadata.schema.table.TableInfo;
+import org.apache.carbondata.core.metadata.schema.table.TableSchema;
+import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.util.path.CarbonStorePath;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+import org.apache.carbondata.core.writer.ThriftWriter;
+import org.apache.carbondata.format.SchemaEvolutionEntry;
+import org.apache.carbondata.store.api.Table;
+
+public class TableBuilder {
+
+  private String databaseName;
+  private String tableName;
+  private String tablePath;
+  private TableSchema tableSchema;
+
+  private TableBuilder() { }
+
+  public static TableBuilder newInstance() {
+return new TableBuilder();
+  }
+
+  public Table create() throws IOException {
+    if (tableName == null || tablePath == null || tableSchema == null) {
+      throw new IllegalArgumentException("must provide table name and table path");
+    }
+
+    if (databaseName == null) {
+      databaseName = "default";
+    }
+
+    TableInfo tableInfo = new TableInfo();
+    tableInfo.setDatabaseName(databaseName);
+    tableInfo.setTableUniqueName(databaseName + "_" + tableName);
+    tableInfo.setFactTable(tableSchema);
+    tableInfo.setTablePath(tablePath);
+    tableInfo.setLastUpdatedTime(System.currentTimeMillis());
+    tableInfo.setDataMapSchemaList(new ArrayList(0));
+    AbsoluteTableIdentifier identifier = tableInfo.getOrCreateAbsoluteTableIdentifier();
+
+    CarbonTablePath carbonTablePath = CarbonStorePath.getCarbonTablePath(
+        identifier.getTablePath(),
+        identifier.getCarbonTableIdentifier());
+    String schemaFilePath = carbonTablePath.getSchemaFilePath();
+    String schemaMetadataPath = CarbonTablePath.getFolderContainingFile(schemaFilePath);
+    CarbonMetadata.getInstance().loadTableMetadata(tableInfo);
+    SchemaConverter schemaConverter = new ThriftWrapperSchemaConverterImpl();
+    org.apache.carbondata.format.TableInfo thriftTableInfo =
+        schemaConverter.fromWrapperToExternalTableInfo(
+            tableInfo,
+            tableInfo.getDatabaseName(),
+            tableInfo.getFactTable().getTableName());
+    org.apache.carbondata.format.SchemaEvolutionEntry schemaEvolutionEntry =
+        new SchemaEvolutionEntry(tableInfo.getLastUpdatedTime());
+    thriftTableInfo.getFact_table().getSchema_evolution().getSchema_evolution_history()
+        .add(schemaEvolutionEntry);
+    FileFactory.FileType fileType = FileFactory.getFileType(schemaMetadataPath);
+    if (!FileFactory.isFileExist(schemaMetadataPath, fileType)) {
+      FileFactory.mkdirs(schemaMetadataPath, fileType);
+    }
+    ThriftWriter thriftWriter = new ThriftWriter(schemaFilePath, false);
+    thriftWriter.open();
+    thriftWriter.write(thriftTableInfo);

[GitHub] carbondata pull request #1783: [CARBONDATA-2013] executing Alter rename to e...

2018-01-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1783


---


[GitHub] carbondata issue #1783: [CARBONDATA-2013] executing Alter rename to executio...

2018-01-17 Thread gvramana
Github user gvramana commented on the issue:

https://github.com/apache/carbondata/pull/1783
  
LGTM


---


[GitHub] carbondata pull request #1793: [CARBONDATA-2021]fix clean up issue when upda...

2018-01-17 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1793#discussion_r161987134
  
--- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ---
@@ -956,9 +964,16 @@ public UpdateVO getInvalidTimestampRange(String segmentId) {
         long timestamp = CarbonUpdateUtil.getTimeStampAsLong(
             CarbonTablePath.DataFileUtil.getTimeStampFromDeleteDeltaFile(fileName));

-        if (block.getBlockName().equalsIgnoreCase(blkName) && (
-            Long.compare(timestamp, deltaStartTimestamp) < 0)) {
-          files.add(eachFile);
+        if (block.getBlockName().equalsIgnoreCase(blkName)) {
+
+          if (getOnlyAbortedFiles) {
+            if (Long.compare(timestamp, deltaEndTimestamp) > 0) {
--- End diff --

Can't you use timestamp > deltaEndTimestamp?


---


[GitHub] carbondata issue #1819: [CARBONDATA-1964] Fixed bug to set bad.records.actio...

2018-01-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1819
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2880/



---


[GitHub] carbondata issue #1819: [CARBONDATA-1964] Fixed bug to set bad.records.actio...

2018-01-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1819
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1648/



---


[GitHub] carbondata issue #1800: [HOTFIX] Fix concurrent testcase random failure

2018-01-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1800
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2878/



---


[GitHub] carbondata issue #1792: [CARBONDATA-2018][DataLoad] Optimization in reading/...

2018-01-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1792
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1646/



---


[GitHub] carbondata issue #1751: [CARBONDATA-1971][Blocklet Prunning] Measure Null va...

2018-01-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1751
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2879/



---


[GitHub] carbondata issue #1800: [HOTFIX] Fix concurrent testcase random failure

2018-01-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1800
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1645/



---


[GitHub] carbondata pull request #1793: [CARBONDATA-2021]fix clean up issue when upda...

2018-01-17 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1793#discussion_r161985749
  
--- Diff: core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java ---
@@ -427,6 +427,10 @@ public static void cleanUpDeltaFiles(CarbonTable table, boolean forceDelete) {

     String validUpdateStatusFile = "";

+    boolean getOnlyAbortedFiles = true;
--- End diff --

It is hard to understand a boolean variable whose name starts with `get`; 
can you rename it to `isXXX` and add a comment for it? The same applies to 
the variable below.
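
For example, a sketch of the requested rename (names illustrative only):

    // before: reads like a getter rather than a condition
    boolean getOnlyAbortedFiles = true;

    // after: reads as a predicate, with an explanatory comment
    // true when only delta files left behind by aborted update
    // operations should be collected for cleanup
    boolean isOnlyAbortedFilesNeeded = true;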


---


[GitHub] carbondata pull request #1793: [CARBONDATA-2021]fix clean up issue when upda...

2018-01-17 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1793#discussion_r161985530
  
--- Diff: core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java ---
@@ -558,6 +576,36 @@ public static void cleanUpDeltaFiles(CarbonTable table, boolean forceDelete) {
     }
   }

+  /**
+   * @param segment
--- End diff --

please complete the comment to describe this function


---


[GitHub] carbondata issue #1793: [CARBONDATA-2021]fix clean up issue when update oper...

2018-01-17 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1793
  
LGTM


---


[jira] [Resolved] (CARBONDATA-1982) Loading data into partition table with invalid partition column should throw proper exception

2018-01-17 Thread Geetika Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geetika Gupta resolved CARBONDATA-1982.
---
Resolution: Fixed

> Loading data into partition table with invalid partition column should throw 
> proper exception
> -
>
> Key: CARBONDATA-1982
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1982
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
>Assignee: anubhav tarar
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created a partitioned table using:
>  CREATE TABLE uniqdata_int_dec(CUST_NAME String,ACTIVE_EMUI_VERSION string, 
> DOB timestamp,
>  DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,
>  DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,
>  INTEGER_COLUMN1 int) Partitioned by (cust_id int, decimal_column1 
> decimal(30,10)) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB")
> Load data command:
>  LOAD DATA INPATH 'hdfs://localhost:54311/2000_UniqData.csv' into table 
> uniqdata_int_dec partition(cust_id123='1', abc='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
> OUTPUT:
>  0: jdbc:hive2://localhost:1> LOAD DATA INPATH 
> 'hdfs://localhost:54311/2000_UniqData.csv' into table uniqdata_int_dec 
> partition(cust_id123='1', decimal_column1='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
>  Error: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 (state=,code=0)
> The above command throws java.lang.IndexOutOfBoundsException: Index: 1, Size: 
> 1 whereas it should throw a proper exception like invalid column expression 
> for partition load command.
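
The kind of up-front validation the issue asks for, as a rough sketch
(class and method names are hypothetical, not CarbonData APIs):

    import java.util.List;

    public class PartitionSpecValidator {
      /**
       * Fails fast with a descriptive message instead of letting an
       * IndexOutOfBoundsException escape from deeper in the load path.
       */
      static void validate(List<String> tablePartitionColumns, List<String> specColumns) {
        for (String col : specColumns) {
          if (!tablePartitionColumns.contains(col)) {
            throw new IllegalArgumentException("Invalid partition column '" + col
                + "' in partition spec; expected one of " + tablePartitionColumns);
          }
        }
      }
    }

With this check, a spec naming cust_id123 or abc would be rejected with a
clear message before any data movement starts.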



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-1982) Loading data into partition table with invalid partition column should throw proper exception

2018-01-17 Thread Geetika Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328462#comment-16328462
 ] 

Geetika Gupta commented on CARBONDATA-1982:
---

This scenario works fine on the current master branch.

> Loading data into partition table with invalid partition column should throw 
> proper exception
> -
>
> Key: CARBONDATA-1982
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1982
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
>Assignee: anubhav tarar
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created a partitioned table using:
>  CREATE TABLE uniqdata_int_dec(CUST_NAME String,ACTIVE_EMUI_VERSION string, 
> DOB timestamp,
>  DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,
>  DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,
>  INTEGER_COLUMN1 int) Partitioned by (cust_id int, decimal_column1 
> decimal(30,10)) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB")
> Load data command:
>  LOAD DATA INPATH 'hdfs://localhost:54311/2000_UniqData.csv' into table 
> uniqdata_int_dec partition(cust_id123='1', abc='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
> OUTPUT:
>  0: jdbc:hive2://localhost:1> LOAD DATA INPATH 
> 'hdfs://localhost:54311/2000_UniqData.csv' into table uniqdata_int_dec 
> partition(cust_id123='1', decimal_column1='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
>  Error: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 (state=,code=0)
> The above command throws java.lang.IndexOutOfBoundsException: Index: 1, Size: 
> 1 whereas it should throw a proper exception like invalid column expression 
> for partition load command.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-1982) Loading data into partition table with invalid partition column should throw proper exception

2018-01-17 Thread Geetika Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geetika Gupta closed CARBONDATA-1982.
-

> Loading data into partition table with invalid partition column should throw 
> proper exception
> -
>
> Key: CARBONDATA-1982
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1982
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
>Assignee: anubhav tarar
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created a partitioned table using:
>  CREATE TABLE uniqdata_int_dec(CUST_NAME String,ACTIVE_EMUI_VERSION string, 
> DOB timestamp,
>  DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,
>  DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,
>  INTEGER_COLUMN1 int) Partitioned by (cust_id int, decimal_column1 
> decimal(30,10)) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB")
> Load data command:
>  LOAD DATA INPATH 'hdfs://localhost:54311/2000_UniqData.csv' into table 
> uniqdata_int_dec partition(cust_id123='1', abc='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
> OUTPUT:
>  0: jdbc:hive2://localhost:1> LOAD DATA INPATH 
> 'hdfs://localhost:54311/2000_UniqData.csv' into table uniqdata_int_dec 
> partition(cust_id123='1', decimal_column1='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
>  Error: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 (state=,code=0)
> The above command throws java.lang.IndexOutOfBoundsException: Index: 1, Size: 
> 1 whereas it should throw a proper exception like invalid column expression 
> for partition load command.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-1982) Loading data into partition table with invalid partition column should throw proper exception

2018-01-17 Thread Geetika Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geetika Gupta updated CARBONDATA-1982:
--
Description: 
I created a partitioned table using:
 CREATE TABLE uniqdata_int_dec(CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB 
timestamp,
 DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,
 DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,
 INTEGER_COLUMN1 int) Partitioned by (cust_id int, decimal_column1 
decimal(30,10)) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
("TABLE_BLOCKSIZE"= "256 MB")

Load data command:
 LOAD DATA INPATH 'hdfs://localhost:54311/2000_UniqData.csv' into table 
uniqdata_int_dec partition(cust_id123='1', abc='12345678901.1234') OPTIONS 
('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');

OUTPUT:
 0: jdbc:hive2://localhost:1> LOAD DATA INPATH 
'hdfs://localhost:54311/2000_UniqData.csv' into table uniqdata_int_dec 
partition(cust_id123='1', decimal_column1='12345678901.1234') OPTIONS 
('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
 Error: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 (state=,code=0)

The above command throws java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 
whereas it should throw a proper exception like invalid column expression for 
partition load command.

  was:
I created a partitioned table using:
CREATE TABLE uniqdata_int_dec(CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB 
timestamp,
DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,
DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,
INTEGER_COLUMN1 int) Partitioned by (cust_id int, decimal_column1 
decimal(30,10)) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
("TABLE_BLOCKSIZE"= "256 MB")

Load data command:
LOAD DATA INPATH 'hdfs://localhost:54311/2000_UniqData.csv' into table 
uniqdata_int_dec partition(cust_id123='1', abc='12345678901.1234') OPTIONS 
('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');

OUTPUT:
0: jdbc:hive2://localhost:1> LOAD DATA INPATH 
'hdfs://localhost:54311/Files/2000_UniqData.csv' into table uniqdata_int_dec 
partition(cust_id123='1', decimal_column1='12345678901.1234') OPTIONS 
('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
Error: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 (state=,code=0)

The above command throws java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 
whereas it should throw a proper exception like invalid column expression for 
partition load command.



> Loading data into partition table with invalid partition column should throw 
> proper exception
> -
>
> Key: CARBONDATA-1982
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1982
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
>Assignee: anubhav tarar
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created a partitioned table using:
>  CREATE TABLE uniqdata_int_dec(CUST_NAME String,ACTIVE_EMUI_VERSION string, 
> DOB timestamp,
>  DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,
>  DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,
>  INTEGER_COLUMN1 int) Partitioned by (cust_id int, decimal_column1 
> decimal(30,10)) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB")
> Load data command:
>  LOAD DATA INPATH 'hdfs://localhost:54311/2000_UniqData.csv' into table 
> uniqdata_int_dec partition(cust_id123='1', abc='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
> OUTPUT:
>  0: jdbc:hive2://localhost:1> LOAD DATA INPATH 
> 'hdfs://localhost:54311/2000_UniqData.csv' into table uniqdata_int_dec 
> partition(cust_id123='1', decimal_column1='12345678901.1234') OPTIONS 
> ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> 

[GitHub] carbondata issue #1816: [CARBONDATA-2038][Tests] use junit assertion in java...

2018-01-17 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1816
  
retest this please


---


[jira] [Updated] (CARBONDATA-1964) SET command does not set the parameters correctly

2018-01-17 Thread Geetika Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geetika Gupta updated CARBONDATA-1964:
--
Description: 
I created the following table:
 CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

then I set the parameter carbon.options.bad.records.action using:
 *set carbon.options.bad.records.action=fail;
 *
 Load command:
 LOAD DATA INPATH 'hdfs://localhost:54311/Files/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','timestampformat'='dd/mm/');
+---------+
| Result  |
+---------+
+---------+
No rows selected (1.43 seconds)

The load executed successfully. However, no data is loaded into the table due 
to mismatch of timestamp format.

Then I again set the parameter carbon.options.bad.records.action using:
 *set carbon.options.bad.records.action=FAIL;
 *
 This time the same load command gave me the following exception:

Error: java.lang.Exception: Data load failed due to bad record: The value with 
column name dob and column data type TIMESTAMP is not a valid TIMESTAMP 
type.Please enable bad record logger to know the detail reason. (state=,code=0)

The first case should behave in the same manner as the second case. So the 
SET command does not set the parameter values correctly, and it does not even 
throw an exception when the value is not set correctly.

  was:
I created the following table:
CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

then I set the parameter carbon.options.bad.records.action using:
*set carbon.options.bad.records.action=fail;
*
Load command:
LOAD DATA INPATH 'hdfs://localhost:54311/Files/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','timestampformat'='dd/mm/');
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (1.43 seconds)

The load executed successfully. However, no data is loaded into the table due 
to mismatch of timestamp format.

Then I again set the parameter carbon.options.bad.records.action using:
*set carbon.options.bad.records.action=FAIL;
*
This time the same load command gave me the following exception:

Error: java.lang.Exception: Data load failed due to bad record: The value with 
column name dob and column data type TIMESTAMP is not a valid TIMESTAMP 
type.Please enable bad record logger to know the detail reason. (state=,code=0)

The first case should behave in the same as manner as the second case. So the 
SET command does not set the parameter values correctly and it does not even 
throw an exception when the value is not set correctly.
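
A plausible mechanism for the observed difference between fail and FAIL
(hypothetical, not taken from the eventual fix): if the action value is
resolved with a case-sensitive lookup such as Enum.valueOf, the lowercase
value silently falls back to a default, while the uppercase value takes
effect; normalizing the input removes the discrepancy:

    public class BadRecordsActionExample {
      enum Action { FAIL, FORCE, IGNORE, REDIRECT }

      static Action resolve(String value) {
        try {
          return Action.valueOf(value);  // case-sensitive: "fail" throws
        } catch (IllegalArgumentException e) {
          return Action.FORCE;           // silent fallback, as observed
        }
      }

      static Action resolveNormalized(String value) {
        // "fail" and "FAIL" now behave identically
        return Action.valueOf(value.trim().toUpperCase());
      }

      public static void main(String[] args) {
        System.out.println(resolve("fail"));            // FORCE
        System.out.println(resolveNormalized("fail"));  // FAIL
      }
    }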




> SET command does not set the parameters correctly
> -
>
> Key: CARBONDATA-1964
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1964
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
>Priority: Major
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created the following table:
>  CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> 

[jira] [Commented] (CARBONDATA-1964) SET command does not set the parameters correctly

2018-01-17 Thread Geetika Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328433#comment-16328433
 ] 

Geetika Gupta commented on CARBONDATA-1964:
---

This issue will be fixed by PR: https://github.com/apache/carbondata/pull/1819

> SET command does not set the parameters correctly
> -
>
> Key: CARBONDATA-1964
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1964
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
>Priority: Major
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created the following table:
> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> then I set the parameter carbon.options.bad.records.action using:
> *set carbon.options.bad.records.action=fail;
> *
> Load command:
> LOAD DATA INPATH 'hdfs://localhost:54311/Files/2000_UniqData.csv' into table 
> uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','timestampformat'='dd/mm/');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (1.43 seconds)
> The load executed successfully. However, no data is loaded into the table due 
> to mismatch of timestamp format.
> Then I again set the parameter carbon.options.bad.records.action using:
> *set carbon.options.bad.records.action=FAIL;
> *
> This time the same load command gave me the following exception:
> Error: java.lang.Exception: Data load failed due to bad record: The value 
> with column name dob and column data type TIMESTAMP is not a valid TIMESTAMP 
> type.Please enable bad record logger to know the detail reason. 
> (state=,code=0)
> The first case should behave in the same as manner as the second case. So the 
> SET command does not set the parameter values correctly and it does not even 
> throw an exception when the value is not set correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #1819: Refactored code to set bad.records.action par...

2018-01-17 Thread geetikagupta16
GitHub user geetikagupta16 opened a pull request:

https://github.com/apache/carbondata/pull/1819

Refactored code to set bad.records.action parameter using SET command

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/geetikagupta16/incubator-carbondata 
CARBONDATA-1964

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1819.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1819


commit 3d8901402a164e8d812cb20a93ce2eae637601fb
Author: Geetika Gupta 
Date:   2018-01-17T08:01:56Z

Refactored code to set bad.records.action parameter using SET command




---


[jira] [Created] (CARBONDATA-2041) Not able to load data into a partitioned table using insert overwrite

2018-01-17 Thread Vandana Yadav (JIRA)
Vandana Yadav created CARBONDATA-2041:
-

 Summary: Not able to load data into a partitioned table using 
insert overwrite 
 Key: CARBONDATA-2041
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2041
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.3.0
 Environment: spark 2.1
Reporter: Vandana Yadav
 Attachments: 2000_UniqData.csv

Not able to load data into a partitioned table using insert overwrite

Steps to reproduce:

1) Create Hive table and load data in it:

a) CREATE TABLE uniqdata_hive (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
int)ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

b) LOAD DATA LOCAL INPATH 
'/home/knoldus/Desktop/csv/TestData/Data/uniqdata/2000_UniqData.csv' into table 
UNIQDATA_HIVE;

2) Create a partitioned table(hive and carbon both):

a) CREATE TABLE uniqdata_string(CUST_ID int,CUST_NAME String,DOB timestamp,DOJ 
timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
decimal(30,10),DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, 
Double_COLUMN2 double,INTEGER_COLUMN1 int) PARTITIONED BY(ACTIVE_EMUI_VERSION 
string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
('TABLE_BLOCKSIZE'= '256 MB');

b) CREATE TABLE uniqdata_hive_partition (CUST_ID int,CUST_NAME String, DOB 
timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
int) partitioned by (active_emui_version string)ROW FORMAT DELIMITED FIELDS 
TERMINATED BY ',';

 

3) Load data into the partitioned table using insert overwrite:

a) load into the hive_partitioned table:

insert overwrite table uniqdata_hive_partition partition (active_emui_version) 
select CUST_ID,CUST_NAME, DOB, DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1, DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2, INTEGER_COLUMN1,active_emui_version from uniqdata_hive limit 10;

 

output: 

Data successfully loaded into the table:

validation Query:

select count(*) from uniqdata_hive_partition

output:

+-----------+
| count(1)  |
+-----------+
| 10        |
+-----------+

 

b) load into carbon partitioned table:

insert overwrite table uniqdata_string partition(active_emui_version) select 
CUST_ID, CUST_NAME,DOB,doj, bigint_column1, bigint_column2, decimal_column1, 
decimal_column2,double_column1, 
double_column2,integer_column1,active_emui_version from uniqdata_hive limit 10;

 

Expected Result: Data should be loaded successfully

Actual Result:

Error: org.apache.spark.SparkException: Job aborted. (state=,code=0)

 

logs:

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: partition spec is invalid; field active_emui_version does not exist or is empty;
	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:98)
	at org.apache.spark.sql.hive.HiveExternalCatalog.createPartitions(HiveExternalCatalog.scala:842)
	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createPartitions(SessionCatalog.scala:679)
	at org.apache.spark.sql.hive.CarbonSessionCatalog.createPartitions(CarbonSessionState.scala:155)
	at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.run(ddl.scala:361)
	at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1.org$apache$spark$sql$execution$datasources$DataSourceAnalysis$$anonfun$$refreshPartitionsCallback$1(DataSourceStrategy.scala:221)
	at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1$$anonfun$8.apply(DataSourceStrategy.scala:243)
	at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1$$anonfun$8.apply(DataSourceStrategy.scala:243)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:143)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:121)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:121)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:121)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:101)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
	at
