[GitHub] carbondata issue #1788: [CARBONDATA-1592] Added analysis exception to handle...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1788
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2826/



---


[GitHub] carbondata pull request #1787: [CARBONDATA-2017] Fix input path checking whe...

2018-01-10 Thread SangeetaGulia
Github user SangeetaGulia commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1787#discussion_r160882178
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala 
---
@@ -73,7 +73,8 @@ object FileUtils {
   val stringBuild = new StringBuilder()
   val filePaths = inputPath.split(",")
   for (i <- 0 until filePaths.size) {
-val fileType = FileFactory.getFileType(filePaths(i))
+val filePath = CarbonUtil.checkAndAppendHDFSUrl(filePaths(i))
--- End diff --

@jackylk I have verified this. It works fine with S3 as well. We will now be able 
to use the carbon property **carbon.ddl.base.hdfs.url** to provide the base URL 
for S3 too.
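
For reference, a minimal sketch of how that property can be used; it assumes an 
existing SparkSession named `spark` with CarbonData configured, and the bucket and 
file names are hypothetical:

    import org.apache.carbondata.core.util.CarbonProperties

    // Set the base URL once; relative paths in DDL/DML are then resolved against it.
    CarbonProperties.getInstance()
      .addProperty("carbon.ddl.base.hdfs.url", "s3a://my-bucket/warehouse") // hypothetical bucket

    // The relative path below would be expanded to s3a://my-bucket/warehouse/data/source.csv
    spark.sql("LOAD DATA INPATH 'data/source.csv' INTO TABLE test_table")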


---


[GitHub] carbondata issue #71: [CARBONDATA-155] Code refactor to avoid the Type Casti...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/71
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1461/



---


[GitHub] carbondata pull request #1792: [CARBONDATA-2018][DataLoad] Optimization in r...

2018-01-10 Thread xuchuanyin
GitHub user xuchuanyin opened a pull request:

https://github.com/apache/carbondata/pull/1792

[CARBONDATA-2018][DataLoad] Optimization in reading/writing for sort temp 
row

Pack the non-sort fields of each row as a byte array during merge sort to reduce 
CPU consumption.

I've tested it in my cluster and observed about an 8% performance gain 
(74 MB/s/Node -> 81 MB/s/Node) in data loading. Please note that global_sort will 
not benefit from this feature since there are no sort temp files in that 
procedure.
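
For illustration only, a minimal sketch of the packing idea (the helper name and 
the split between sort and non-sort columns are assumptions, not the actual 
CarbonData code):

    import java.io.{ByteArrayOutputStream, DataOutputStream}

    // Pack the non-sort fields of a row into one opaque byte array so that the
    // merge-sort step only has to parse the sort columns.
    def packNonSortFields(row: Array[AnyRef], sortColumnCount: Int): (Array[AnyRef], Array[Byte]) = {
      val sortFields = row.take(sortColumnCount)
      val bos = new ByteArrayOutputStream()
      val out = new DataOutputStream(bos)
      row.drop(sortColumnCount).foreach {
        case i: java.lang.Integer => out.writeByte(0); out.writeInt(i)
        case l: java.lang.Long    => out.writeByte(1); out.writeLong(l)
        case other                =>
          val b = String.valueOf(other).getBytes("UTF-8")
          out.writeByte(2); out.writeInt(b.length); out.write(b)
      }
      out.flush()
      (sortFields, bos.toByteArray) // sort fields stay parsed, the rest is a single blob
    }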

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [x] Any interfaces changed?
 `Some internally used interfaces have been changed`
 - [x] Any backward compatibility impacted?
 `No`
 - [x] Document update required?
`No`
 - [x] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
`No`
- How it is tested? Please attach test report.
`Tested in 3-node cluster with real business data`
- Is it a performance related change? Please attach the performance 
test report.
`Yes, I've tested it in my cluster and seen about 8% performance gained 
(74MB/s/Node -> 81MB/s/Node) in data loading.`
- Any additional information to help reviewers in testing this 
change.
`No`
 - [x] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
`Unrelated`


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuchuanyin/carbondata 
opt_sort_temp_serializeation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1792.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1792


commit 1cf4efbd5f3065cb996fa4d6a133df68f2cca585
Author: xuchuanyin 
Date:   2018-01-10T12:39:02Z

pack no sort fields

pack the no-sort fields in the row as a byte array during merge sort
to save CPU consumption




---


[GitHub] carbondata issue #71: [CARBONDATA-155] Code refactor to avoid the Type Casti...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/71
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2694/



---


[GitHub] carbondata pull request #1791: [CARBONDATA-2010] Block streaming on main tab...

2018-01-10 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/carbondata/pull/1791

[CARBONDATA-2010] Block streaming on main table of preaggregate datamap

If the table has a 'preaggregate' DataMap, it does not support streaming for now.
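
A minimal sketch of the kind of validation being added (the case classes and the 
message are illustrative placeholders, not the exact code):

    case class DataMapSchema(providerName: String)
    case class CarbonTable(name: String, dataMaps: Seq[DataMapSchema])

    // Reject streaming if the main table has a 'preaggregate' datamap.
    def validateStreaming(table: CarbonTable): Unit = {
      if (table.dataMaps.exists(_.providerName.equalsIgnoreCase("preaggregate"))) {
        throw new UnsupportedOperationException(
          s"Streaming is not allowed on table ${table.name} because it has a preaggregate datamap")
      }
    }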

 - [x] Any interfaces changed?
   no
 - [x] Any backward compatibility impacted?
   no
 - [x] Document update required?
   yes, I will add this limitation to the documentation
 - [x] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
  added new ut
- How it is tested? Please attach test report.
   CI run ut
- Is it a performance related change? Please attach the performance 
test report.
   no
- Any additional information to help reviewers in testing this 
change.
   added code comment
   
 - [x] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
  small changes


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/carbondata agg_block_streaming

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1791.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1791


commit 6ccd1a060351ed7cbe1bc653623210c7e7e234f5
Author: QiangCai 
Date:   2018-01-11T07:04:22Z

block streaming on main table of preaggregate datamap




---


[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1104
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1460/



---


[jira] [Created] (CARBONDATA-2018) Optimization in reading/writing for sort temp row during data loading

2018-01-10 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2018:
--

 Summary: Optimization in reading/writing for sort temp row during 
data loading
 Key: CARBONDATA-2018
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2018
 Project: CarbonData
  Issue Type: Improvement
  Components: data-load
Affects Versions: 1.3.0
Reporter: xuchuanyin
Assignee: xuchuanyin
 Fix For: 1.3.0


# SCENARIO

Currently in CarbonData data loading, during the sort step, records are partially 
sorted and spilled to disk. CarbonData then reads these records back and performs 
a merge sort.

Since the sort step is CPU-intensive, we can optimize the serialization/deserialization 
of these rows during writing/reading and reduce the CPU consumption spent on 
parsing them.

This should improve data loading performance.

# RESOLVE
We can pick up the non-sort fields in the row, pack them as a byte array, and 
skip parsing them.
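
A rough sketch of the merge-sort side of this idea: only the sort fields are 
compared, while the packed bytes are carried through untouched (the row layout 
below is a simplified assumption, not the actual sort temp row format):

    // A sort temp row: parsed sort fields plus the non-sort fields as an opaque blob.
    case class SortTempRow(sortFields: Array[String], packedNonSortFields: Array[Byte])

    // During merge sort only the sort fields are compared; the blob is never deserialized here.
    val rowOrdering: Ordering[SortTempRow] =
      Ordering.by((r: SortTempRow) => r.sortFields.mkString("\u0001"))

    def mergeSort(rows: Seq[SortTempRow]): Seq[SortTempRow] = rows.sorted(rowOrdering)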

# RESULT

I've tested it in my cluster and observed about an 8% performance gain 
(74 MB/s/Node -> 81 MB/s/Node).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1788: [CARBONDATA-1592] Added analysis exception to handle...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1788
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1457/



---


[GitHub] carbondata issue #1788: [CARBONDATA-1592] Added analysis exception to handle...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1788
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2691/



---


[GitHub] carbondata issue #1766: [WIP]enable hive metastore and test

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1766
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1459/



---


[GitHub] carbondata issue #1766: [WIP]enable hive metastore and test

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1766
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2692/



---


[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1724
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2825/



---


[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1781
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1456/



---


[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1770
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1455/



---


[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1770
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2690/



---


[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1781
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2689/



---


[jira] [Closed] (CARBONDATA-1758) Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

2018-01-10 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-1758.
---
Resolution: Fixed

Defect is closed as fixed. It is working fine in the latest Carbon 1.3.0 build.

> Carbon1.3.0- No Inverted Index : Select column with is null for 
> no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
> 
>
> Key: CARBONDATA-1758
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 node cluster
>Reporter: Chetan Bhat
>  Labels: Functional
>
> Steps :
> In Beeline user executes the queries in sequence.
> CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table 
> uniqdata_DI_int OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Select count(CUST_ID) from uniqdata_DI_int;
> Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
> Select avg(CUST_ID) as average from uniqdata_DI_int;
> Select floor(CUST_ID) as average from uniqdata_DI_int;
> Select ceil(CUST_ID) as average from uniqdata_DI_int;
> Select ceiling(CUST_ID) as average from uniqdata_DI_int;
> Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> *Issue : Select column with is null for no_inverted_index column throws 
> java.lang.ArrayIndexOutOfBoundsException*
> 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where 
> CUST_ID is null;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 79.0 (TID 123, BLR114278, executor 18): 
> org.apache.spark.util.TaskCompletionListenerException: 
> java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
> at org.apache.spark.scheduler.Task.run(Task.scala:112)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
> Expected : Select column with is null for no_inverted_index column should be 
> successful displaying the correct result set.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1774
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2824/



---


[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1774
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1454/



---


[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1724
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2688/



---


[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1724
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1453/



---


[GitHub] carbondata issue #1774: [CARBONDATA-2001] Unable to Save DataFrame As Carbon...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1774
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2687/



---


[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...

2018-01-10 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/1781
  
retest this please


---


[jira] [Commented] (CARBONDATA-1758) Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

2018-01-10 Thread Akash R Nilugal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321705#comment-16321705
 ] 

Akash R Nilugal commented on CARBONDATA-1758:
-

I have also executed the queries; they are working fine.

> Carbon1.3.0- No Inverted Index : Select column with is null for 
> no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
> 
>
> Key: CARBONDATA-1758
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 node cluster
>Reporter: Chetan Bhat
>  Labels: Functional
>
> Steps :
> In Beeline user executes the queries in sequence.
> CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table 
> uniqdata_DI_int OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Select count(CUST_ID) from uniqdata_DI_int;
> Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
> Select avg(CUST_ID) as average from uniqdata_DI_int;
> Select floor(CUST_ID) as average from uniqdata_DI_int;
> Select ceil(CUST_ID) as average from uniqdata_DI_int;
> Select ceiling(CUST_ID) as average from uniqdata_DI_int;
> Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> *Issue : Select column with is null for no_inverted_index column throws 
> java.lang.ArrayIndexOutOfBoundsException*
> 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where 
> CUST_ID is null;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 79.0 (TID 123, BLR114278, executor 18): 
> org.apache.spark.util.TaskCompletionListenerException: 
> java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
> at org.apache.spark.scheduler.Task.run(Task.scala:112)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
> Expected : Select column with is null for no_inverted_index column should be 
> successful displaying the correct result set.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat

2018-01-10 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1770
  
retest this please


---


[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

2018-01-10 Thread anubhav100
Github user anubhav100 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r160864606
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala
 ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, 
val dataFrame: DataFrame) {
 val carbonSchema = schema.map { field =>
   s"${ field.name } ${ convertToCarbonType(field.dataType) }"
 }
+  val isStreaming = if (options.isStreaming) Some("true") else None
+
 val property = Map(
   "SORT_COLUMNS" -> options.sortColumns,
   "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
   "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-  "TABLE_BLOCKSIZE" -> options.tableBlockSize
-).filter(_._2.isDefined).map(p => s"'${p._1}' = 
'${p._2.get}'").mkString(",")
+  "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+  "STREAMING" -> isStreaming
+).filter(_._2.isDefined).
--- End diff --

done


---


[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

2018-01-10 Thread anubhav100
Github user anubhav100 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r160864615
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala
 ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, 
val dataFrame: DataFrame) {
 val carbonSchema = schema.map { field =>
   s"${ field.name } ${ convertToCarbonType(field.dataType) }"
 }
+  val isStreaming = if (options.isStreaming) Some("true") else None
+
 val property = Map(
   "SORT_COLUMNS" -> options.sortColumns,
   "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
   "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-  "TABLE_BLOCKSIZE" -> options.tableBlockSize
-).filter(_._2.isDefined).map(p => s"'${p._1}' = 
'${p._2.get}'").mkString(",")
+  "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+  "STREAMING" -> isStreaming
--- End diff --

@jackylk The reason I did not use options.isStreaming directly is that it returns 
a Boolean, so I converted the Boolean to an Option.


---


[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...

2018-01-10 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/1724
  
@geetikagupta16 can you squash the commits?


---


[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1770
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1452/



---


[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1770
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2686/



---


[GitHub] carbondata issue #1770: [CARBONDATA-1994] Remove CarbonInputFormat

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1770
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2823/



---


[GitHub] carbondata issue #1790: [CARBONDATA-2009][Documentation] Document Refresh co...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1790
  
Can one of the admins verify this patch?


---


[GitHub] carbondata issue #1790: [CARBONDATA-2009][Documentation] Document Refresh co...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1790
  
Can one of the admins verify this patch?


---


[GitHub] carbondata issue #1790: [CARBONDATA-2009][Documentation] Document Refresh co...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1790
  
Can one of the admins verify this patch?


---


[GitHub] carbondata pull request #1790: [CARBONDATA-2009][Documentation] Document Ref...

2018-01-10 Thread arshadmohammad
GitHub user arshadmohammad opened a pull request:

https://github.com/apache/carbondata/pull/1790

[CARBONDATA-2009][Documentation] Document Refresh command constraint

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed? NO
 
 - [ ] Any backward compatibility impacted? NO
 
 - [ ] Document update required? YES

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   NA( This PR has only document change )
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.  NA



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arshadmohammad/carbondata CARBONDATA-2009

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1790.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1790


commit d53fee295ef8860f8c12a7d38c653015f7809f05
Author: Mohammad Arshad 
Date:   2018-01-11T01:40:12Z

[CARBONDATA-2009][Documentation] Document Refresh command constraint




---


[GitHub] carbondata pull request #1775: [CARBONDATA-1993] Removed unused carbon prope...

2018-01-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1775


---


[GitHub] carbondata issue #1775: [CARBONDATA-1993] Removed unused carbon properties h...

2018-01-10 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1775
  
LGTM


---


[GitHub] carbondata pull request #1787: [CARBONDATA-2017] Fix input path checking whe...

2018-01-10 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1787#discussion_r160845775
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala 
---
@@ -73,7 +73,8 @@ object FileUtils {
   val stringBuild = new StringBuilder()
   val filePaths = inputPath.split(",")
   for (i <- 0 until filePaths.size) {
-val fileType = FileFactory.getFileType(filePaths(i))
+val filePath = CarbonUtil.checkAndAppendHDFSUrl(filePaths(i))
--- End diff --

This is only for HDFS, right? How about support for other storage systems like 
S3?
@SangeetaGulia Can you have a look at this? I think this may impact #1584, 
which you are working on.


---


[GitHub] carbondata issue #1781: [CARBONDATA-2012] Add support to load pre-aggregate ...

2018-01-10 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1781
  
Can you add more description of why the current loading flow is not 
transactional?


---


[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

2018-01-10 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r160845350
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala
 ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, 
val dataFrame: DataFrame) {
 val carbonSchema = schema.map { field =>
   s"${ field.name } ${ convertToCarbonType(field.dataType) }"
 }
+  val isStreaming = if (options.isStreaming) Some("true") else None
+
 val property = Map(
   "SORT_COLUMNS" -> options.sortColumns,
   "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
   "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-  "TABLE_BLOCKSIZE" -> options.tableBlockSize
-).filter(_._2.isDefined).map(p => s"'${p._1}' = 
'${p._2.get}'").mkString(",")
+  "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+  "STREAMING" -> isStreaming
--- End diff --

why not use `options.isStreaming` directly?


---


[GitHub] carbondata pull request #1774: [CARBONDATA-2001] Unable to Save DataFrame As...

2018-01-10 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1774#discussion_r160845207
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala
 ---
@@ -167,13 +167,19 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, 
val dataFrame: DataFrame) {
 val carbonSchema = schema.map { field =>
   s"${ field.name } ${ convertToCarbonType(field.dataType) }"
 }
+  val isStreaming = if (options.isStreaming) Some("true") else None
+
 val property = Map(
   "SORT_COLUMNS" -> options.sortColumns,
   "DICTIONARY_INCLUDE" -> options.dictionaryInclude,
   "DICTIONARY_EXCLUDE" -> options.dictionaryExclude,
-  "TABLE_BLOCKSIZE" -> options.tableBlockSize
-).filter(_._2.isDefined).map(p => s"'${p._1}' = 
'${p._2.get}'").mkString(",")
+  "TABLE_BLOCKSIZE" -> options.tableBlockSize,
+  "STREAMING" -> isStreaming
+).filter(_._2.isDefined).
--- End diff --

move `.` to next line


---


[jira] [Resolved] (CARBONDATA-2011) CarbonStreamingQueryListener throwing ClassCastException

2018-01-10 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-2011.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> CarbonStreamingQueryListener throwing ClassCastException
> 
>
> Key: CARBONDATA-2011
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2011
> Project: CarbonData
>  Issue Type: Bug
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 1.3.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Java.lang.ClassCastException: 
> org.apache.spark.sql.execution.streaming.StreamingQueryWrapper cannot be cast 
> to org.apache.spark.sql.execution.streaming.StreamExecution



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1779: [CARBONDATA-2011] Fix ClassCastException in C...

2018-01-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1779


---


[GitHub] carbondata issue #1779: [CARBONDATA-2011] Fix ClassCastException in CarbonSt...

2018-01-10 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1779
  
LGTM


---


[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1104
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2822/



---


[GitHub] carbondata issue #1789: [WIP] Fix avoid reading of all block information in ...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1789
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2821/



---


[GitHub] carbondata issue #1789: [WIP] Fix avoid reading of all block information in ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1789
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2684/



---


[GitHub] carbondata issue #1789: [WIP] Fix avoid reading of all block information in ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1789
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1451/



---


[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1104
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2685/



---


[GitHub] carbondata issue #1104: [CARBONDATA-1239] Add validation for set command par...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1104
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1450/



---


[GitHub] carbondata issue #1782: [WIP] Changes for creating carbon index merge file f...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1782
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2820/



---


[GitHub] carbondata issue #1782: [WIP] Changes for creating carbon index merge file f...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1782
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2683/



---


[GitHub] carbondata issue #1782: [WIP] Changes for creating carbon index merge file f...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1782
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1449/



---


[GitHub] carbondata issue #1788: [WIP][CARBONDATA-1592] Added analysis exception to h...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1788
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2819/



---


[GitHub] carbondata pull request #1789: [WIP] Fix avoid reading of all block informat...

2018-01-10 Thread ravipesala
GitHub user ravipesala opened a pull request:

https://github.com/apache/carbondata/pull/1789

[WIP] Fix avoid reading of all block information in driver for old stores.

Problem:
For old stores created prior to version 1.2, there is no blocklet information 
stored in the carbonindex file, so the new code needs to read all carbondata file 
footers inside the driver to get the blocklet information. That makes first-time 
queries slower. As observed, a count(*) query was taking 2 seconds on the old 
version, and after the upgrade it takes a very long time.

Solution:
If there is no blocklet information available in the carbonindex file, then 
don't read the carbondata file footers on the driver side. Instead, read the 
carbondata files in the executors to get the blocklet information.
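
A simplified sketch of that decision (the types and method names are illustrative 
placeholders, not the actual CarbonData classes):

    // Placeholder model: an index entry may or may not already carry blocklet detail.
    case class IndexEntry(filePath: String, blockletInfo: Option[Array[Byte]])
    case class InputSplit(filePath: String, readFooterInExecutor: Boolean)

    // In the driver, do not read footers for legacy entries; mark the split so the
    // executor reads the footer and derives the blocklet information instead.
    def buildSplits(entries: Seq[IndexEntry]): Seq[InputSplit] =
      entries.map { e =>
        InputSplit(e.filePath, readFooterInExecutor = e.blockletInfo.isEmpty)
      }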


Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [X] Any interfaces changed?
 
 - [X] Any backward compatibility impacted?
 
 - [X] Document update required?

 - [X] Testing done
  
 - [X] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravipesala/incubator-carbondata 
datamap-pld-store

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1789.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1789


commit f0aaf4d8d0e4761227bc1adff29d33798c88bd12
Author: ravipesala 
Date:   2018-01-10T15:35:48Z

Fix avoid reading of all block information in driver for old stores.




---


[GitHub] carbondata issue #1788: [WIP][CARBONDATA-1592] Added analysis exception to h...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1788
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1448/



---


[GitHub] carbondata issue #1788: [WIP][CARBONDATA-1592] Added analysis exception to h...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1788
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2682/



---


[GitHub] carbondata issue #1787: [CARBONDATA-2017] Fix input path checking when loadi...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1787
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2818/



---


[GitHub] carbondata issue #1787: [CARBONDATA-2017] Fix input path checking when loadi...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1787
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1447/



---


[GitHub] carbondata issue #1787: [CARBONDATA-2017] Fix input path checking when loadi...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1787
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2681/



---


[GitHub] carbondata pull request #1788: [WIP][CARBONDATA-1592] Added analysis excepti...

2018-01-10 Thread ManoharVanam
GitHub user ManoharVanam opened a pull request:

https://github.com/apache/carbondata/pull/1788

[WIP][CARBONDATA-1592] Added analysis exception to handle event exceptions

Description: Added an analysis exception case to handle event listener 
exceptions

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ManoharVanam/incubator-carbondata defect6

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1788.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1788


commit 4c05b66100784bb5aab9d5209305e37f0753316b
Author: Manohar 
Date:   2018-01-10T14:37:33Z

Added analysis exception to handle event exceptions




---


[GitHub] carbondata issue #1783: [CARBONDATA-2013] executing alter query results that...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1783
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2817/



---


[GitHub] carbondata pull request #1787: [CARBONDATA-2017] Fix input path checking whe...

2018-01-10 Thread kevinjmh
GitHub user kevinjmh opened a pull request:

https://github.com/apache/carbondata/pull/1787

[CARBONDATA-2017] Fix input path checking when loading data from multiple 
paths

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kevinjmh/carbondata load_multi_path

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1787.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1787






---


[jira] [Created] (CARBONDATA-2017) Error occurs when loading multiple files

2018-01-10 Thread jiangmanhua (JIRA)
jiangmanhua created CARBONDATA-2017:
---

 Summary: Error occurs when loading multiple files
 Key: CARBONDATA-2017
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2017
 Project: CarbonData
  Issue Type: Bug
Reporter: jiangmanhua
Priority: Minor


Problem:
Carbon supports loading from multiple file paths at once, but we find that 
Carbon will throw an exception like "The input file does not exist" when 
loading multiple files on HDFS.

For example:
ex1: LOAD DATA INPATH '/data/source.csv,/data/source2.csv' INTO TABLE test_table
ex2: LOAD DATA INPATH 'hdfs://ha/data/source.csv,hdfs://ha/data/source2.csv' 
INTO TABLE test_table

ex1 will throw an exception saying that source2.csv does not exist.
ex2 will execute normally.


Solution:
We found that Carbon takes the PATH as a whole and checks its prefix before 
splitting it into multiple paths. The problem is solved by doing the prefix check 
for each path after splitting PATH into multiple paths.
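
A small sketch of the fix idea: split the comma-separated input first, then apply 
the prefix check to each path (the scheme check below is a simplified stand-in 
for CarbonUtil.checkAndAppendHDFSUrl):

    // Prepend the base file-system URL to any path that does not already carry a scheme.
    def checkAndAppendBaseUrl(path: String, baseUrl: String): String =
      if (path.matches("^[a-zA-Z][a-zA-Z0-9+.-]*://.*")) path else baseUrl + path

    def resolveInputPaths(inputPath: String, baseUrl: String): Seq[String] =
      inputPath.split(",").map(p => checkAndAppendBaseUrl(p.trim, baseUrl)).toSeq

    // resolveInputPaths("/data/source.csv,/data/source2.csv", "hdfs://ha")
    //   -> Seq("hdfs://ha/data/source.csv", "hdfs://ha/data/source2.csv")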



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1783: [CARBONDATA-2013] executing alter query results that...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1783
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2680/



---


[GitHub] carbondata issue #1783: [CARBONDATA-2013] executing alter query results that...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1783
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1446/



---


[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1786
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2816/



---


[jira] [Updated] (CARBONDATA-2009) REFRESH TABLE Limitation When HiveMetaStore is used

2018-01-10 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-2009:
-
Description: Refresh table command will not register the carbon table if 
the old table is stored in the CarbonHiveMetastore  (was: Refresh table when 
spark.carbon.hive.schema.store is set to true ie when hive meta store is used.)

> REFRESH TABLE Limitation When HiveMetaStore is used
> ---
>
> Key: CARBONDATA-2009
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2009
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Arshad
>Priority: Minor
> Fix For: 1.3.0
>
>
> Refresh table command will not register the carbon table if the old table is 
> stored in the CarbonHiveMetastore



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1786
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2815/



---


[jira] [Assigned] (CARBONDATA-2016) Exception displays while implementing compaction with alter query

2018-01-10 Thread anubhav tarar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anubhav tarar reassigned CARBONDATA-2016:
-

Assignee: anubhav tarar

> Exception displays while implementing compaction with alter query
> -
>
> Key: CARBONDATA-2016
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2016
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: spark 2.1
>Reporter: Vandana Yadav
>Assignee: anubhav tarar
>Priority: Minor
>
> Exception displays while implementing compaction with alter query.
> Steps to reproduce:
> 1) Create a table :
> CREATE TABLE CUSTOMER1 ( C_CUSTKEY INT , C_NAME STRING , C_ADDRESS STRING , 
> C_NATIONKEY INT , C_PHONE STRING , C_ACCTBAL DECIMAL(15,2) , C_MKTSEGMENT 
> STRING , C_COMMENT STRING) stored by 'carbondata';
> 2) Insert data into the table:
> a) insert into customer1 
> values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
> b) insert into customer1 
> values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
> c) insert into customer1 
> values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
> d) insert into customer1 
> values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')
> 3) Perform alter table query:
>  alter table customer1 add columns (intfield int) TBLPROPERTIES 
> ('DEFAULT.VALUE.intfield'='10');
> 4) show segments for displaying segments before compaction
> show segments for table customer1;
> output:
> +--------------------+----------+---------------------------+---------------------------+------------+--------------+
> | SegmentSequenceId  | Status   | Load Start Time           | Load End Time             | Merged To  | File Format  |
> +--------------------+----------+---------------------------+---------------------------+------------+--------------+
> | 3                  | Success  | 2018-01-10 16:16:53.611   | 2018-01-10 16:16:54.99    | NA         | COLUMNAR_V3  |
> | 2                  | Success  | 2018-01-10 16:16:46.878   | 2018-01-10 16:16:47.75    | NA         | COLUMNAR_V3  |
> | 1                  | Success  | 2018-01-10 16:16:38.096   | 2018-01-10 16:16:38.972   | NA         | COLUMNAR_V3  |
> | 0                  | Success  | 2018-01-10 16:16:31.979   | 2018-01-10 16:16:33.293   | NA         | COLUMNAR_V3  |
> +--------------------+----------+---------------------------+---------------------------+------------+--------------+
> 4 rows selected (0.029 seconds)
> 5) alter table query for compaction:
> alter table customer1 compact 'minor';
> Expected Result: Table should be compacted successfully.
> Actual Result: 
> Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please 
> check logs for more info. Exception in compaction Compaction Failure in 
> Merger Rdd.; (state=,code=0)
> thriftserver logs:
> 18/01/10 16:17:12 ERROR CompactionResultSortProcessor: [Executor task launch 
> worker-36][partitionID:customer1;queryID:15798380253871] Compaction failed: 
> java.lang.Long cannot be cast to java.lang.Integer
> java.lang.ClassCastException: java.lang.Long cannot be cast to 
> java.lang.Integer
>   at 
> org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataToFile(SortDataRows.java:273)
>   at 
> org.apache.carbondata.processing.sort.sortdata.SortDataRows.startSorting(SortDataRows.java:214)
>   at 
> org.apache.carbondata.processing.merger.CompactionResultSortProcessor.processResult(CompactionResultSortProcessor.java:226)
>   at 
> org.apache.carbondata.processing.merger.CompactionResultSortProcessor.execute(CompactionResultSortProcessor.java:159)
>   at 
> org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:234)
>   at 
> org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:81)
>   at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:99)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch 
> worker-36][partitionID:customer1;queryID:15798380253871] Total memory used 
> after task 15798371335347 is 5313 Current tasks running now are : 

[GitHub] carbondata issue #1785: [CARBONDATA-2015] Restricted maximum length of bytes...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1785
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2814/



---


[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1786
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2679/



---


[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1786
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1445/



---


[jira] [Resolved] (CARBONDATA-1957) create datamap query fails on table having dictionary_include

2018-01-10 Thread Geetika Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geetika Gupta resolved CARBONDATA-1957.
---
Resolution: Fixed

> create datamap query fails on table having dictionary_include
> -
>
> Key: CARBONDATA-1957
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1957
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created a datamap using the following command:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select 
> cust_id, cust_name,avg(decimal_column1) from uniqdata group by 
> cust_id,cust_name;
> It throws the following error:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Steps to reproduce:
> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Load command:
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Create datamap commad:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select 
> cust_id, cust_name,avg(decimal_column1) from uniqdata group by 
> cust_id,cust_name;
> The above command throws the following exception:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Here are the logs:
> 18/01/02 11:46:58 ERROR ParallelReadMergeSorterImpl: 
> SafeParallelSorterPool:uniqdata_uniqdata_agg 
> java.lang.IllegalArgumentException: requirement failed: Decimal precision 
> 2922 exceeds max precision 38
>   at scala.Predef$.require(Predef.scala:224)
>   at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
>   at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
>   at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
>   at 
> org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:409)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown
>  Source)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at 
> org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:514)
>   at 
> org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:477)
>   at 
> org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.getBatch(InputProcessorStepImpl.java:239)
>   at 
> org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:200)
>   at 
> org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:129)
>   at 
> org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:97)
>   at 
> org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:83)
>   at 
> org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:218)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: 
> SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: 
> null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: 
> SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: 
> null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: 
> SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: 
> null
> 18/01/02 11:46:58 ERROR 

[jira] [Commented] (CARBONDATA-1957) create datamap query fails on table having dictionary_include

2018-01-10 Thread Geetika Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320135#comment-16320135
 ] 

Geetika Gupta commented on CARBONDATA-1957:
---

This bug has been resolved by this PR: 
https://github.com/apache/carbondata/pull/1742

> create datamap query fails on table having dictionary_include
> -
>
> Key: CARBONDATA-1957
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1957
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: spark2.1
>Reporter: Geetika Gupta
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>
> I created a datamap using the following command:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select 
> cust_id, cust_name,avg(decimal_column1) from uniqdata group by 
> cust_id,cust_name;
> It throws the following error:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Steps to reproduce:
> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Load command:
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Create datamap commad:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select 
> cust_id, cust_name,avg(decimal_column1) from uniqdata group by 
> cust_id,cust_name;
> The above command throws the following exception:
> Error: java.lang.Exception: DataLoad failure: (state=,code=0)
> Here are the logs:
> 18/01/02 11:46:58 ERROR ParallelReadMergeSorterImpl: 
> SafeParallelSorterPool:uniqdata_uniqdata_agg 
> java.lang.IllegalArgumentException: requirement failed: Decimal precision 
> 2922 exceeds max precision 38
>   at scala.Predef$.require(Predef.scala:224)
>   at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
>   at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
>   at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
>   at 
> org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:409)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown
>  Source)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at 
> org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:514)
>   at 
> org.apache.carbondata.spark.rdd.LazyRddIterator.next(NewCarbonDataLoadRDD.scala:477)
>   at 
> org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.getBatch(InputProcessorStepImpl.java:239)
>   at 
> org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:200)
>   at 
> org.apache.carbondata.processing.loading.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:129)
>   at 
> org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:97)
>   at 
> org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:83)
>   at 
> org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:218)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: 
> SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: 
> null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: 
> SafeParallelSorterPool:uniqdata_uniqdata_agg Error loading the dictionary: 
> null
> 18/01/02 11:46:58 ERROR ForwardDictionaryCache: 
> 

[GitHub] carbondata issue #1785: [CARBONDATA-2015] Restricted maximum length of bytes...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1785
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1443/



---


[GitHub] carbondata issue #1785: [CARBONDATA-2015] Restricted maximum length of bytes...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1785
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2677/



---


[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1584
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2813/



---


[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1786
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1444/



---


[GitHub] carbondata issue #1786: [CARBONDATA-1988] Fixed bug to remove empty partitio...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1786
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2678/



---


[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1724
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1442/



---


[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1724
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2676/



---


[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation

2018-01-10 Thread SangeetaGulia
Github user SangeetaGulia commented on the issue:

https://github.com/apache/carbondata/pull/1584
  
@jackylk we have made the changes as per your review comments. Can you 
please check.


---


[jira] [Updated] (CARBONDATA-2016) Exception displays while implementing compaction with alter query

2018-01-10 Thread Vandana Yadav (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vandana Yadav updated CARBONDATA-2016:
--
Description: 
An exception is thrown while performing compaction after an alter table (add column) query.

Steps to reproduce:

1) Create a table :
CREATE TABLE CUSTOMER1 ( C_CUSTKEY INT , C_NAME STRING , C_ADDRESS STRING , 
C_NATIONKEY INT , C_PHONE STRING , C_ACCTBAL DECIMAL(15,2) , C_MKTSEGMENT 
STRING , C_COMMENT STRING) stored by 'carbondata';

2) Insert data into the table:
a) insert into customer1 
values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
b) insert into customer1 
values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
c) insert into customer1 
values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
d) insert into customer1 
values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')

3) Perform alter table query:
 alter table customer1 add columns (intfield int) TBLPROPERTIES 
('DEFAULT.VALUE.intfield'='10');

4) show segments for displaying segments before compaction
show segments for table customer1;

output:
+--------------------+----------+--------------------------+--------------------------+------------+--------------+
| SegmentSequenceId  | Status   | Load Start Time          | Load End Time            | Merged To  | File Format  |
+--------------------+----------+--------------------------+--------------------------+------------+--------------+
| 3                  | Success  | 2018-01-10 16:16:53.611  | 2018-01-10 16:16:54.99   | NA         | COLUMNAR_V3  |
| 2                  | Success  | 2018-01-10 16:16:46.878  | 2018-01-10 16:16:47.75   | NA         | COLUMNAR_V3  |
| 1                  | Success  | 2018-01-10 16:16:38.096  | 2018-01-10 16:16:38.972  | NA         | COLUMNAR_V3  |
| 0                  | Success  | 2018-01-10 16:16:31.979  | 2018-01-10 16:16:33.293  | NA         | COLUMNAR_V3  |
+--------------------+----------+--------------------------+--------------------------+------------+--------------+
4 rows selected (0.029 seconds)

5) alter table query for compaction:
alter table customer1 compact 'minor';

Expected Result: Table should be compacted successfully.

Actual Result: 
Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check 
logs for more info. Exception in compaction Compaction Failure in Merger Rdd.; 
(state=,code=0)

thriftserver logs:
18/01/10 16:17:12 ERROR CompactionResultSortProcessor: [Executor task launch 
worker-36][partitionID:customer1;queryID:15798380253871] Compaction failed: 
java.lang.Long cannot be cast to java.lang.Integer
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
at 
org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataToFile(SortDataRows.java:273)
at 
org.apache.carbondata.processing.sort.sortdata.SortDataRows.startSorting(SortDataRows.java:214)
at 
org.apache.carbondata.processing.merger.CompactionResultSortProcessor.processResult(CompactionResultSortProcessor.java:226)
at 
org.apache.carbondata.processing.merger.CompactionResultSortProcessor.execute(CompactionResultSortProcessor.java:159)
at 
org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:234)
at 
org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:81)
at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch 
worker-36][partitionID:customer1;queryID:15798380253871] Total memory used 
after task 15798371335347 is 5313 Current tasks running now are : 
[6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch 
worker-36][partitionID:customer1;queryID:15798380253871] Total memory used 
after task 15798371335347 is 5313 Current tasks running now are : 
[6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch 
worker-36][partitionID:customer1;queryID:15798380253871] Total memory used 
after task 15798371335347 is 5313 Current tasks running now are : 
[6856382704941, 14621295743743, 14461639534151, 4378916027096, 
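
For illustration only (a minimal sketch of the language behaviour, not the CarbonData fix): the ClassCastException reported in the compaction log above is what plain Java produces when a boxed Long, for example a numeric default value such as 10 parsed for the newly added int column, is cast directly to Integer. Converting through Number avoids it:

    public class LongToIntCast {
      public static void main(String[] args) {
        // Assume the default value 10 for the added int column arrives as a boxed Long,
        // as the compaction log above suggests.
        Object defaultValue = Long.valueOf(10);

        try {
          // A direct cast fails at runtime, mirroring the exception in the log.
          Integer broken = (Integer) defaultValue;
          System.out.println(broken);
        } catch (ClassCastException e) {
          System.out.println("ClassCastException: " + e.getMessage());
        }

        // Converting through Number chooses the target width explicitly.
        int safe = ((Number) defaultValue).intValue();
        System.out.println("converted: " + safe);
      }
    }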

[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1584
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1441/



---


[GitHub] carbondata pull request #1786: [CARBONDATA-1988] Fixed bug to remove empty p...

2018-01-10 Thread geetikagupta16
GitHub user geetikagupta16 opened a pull request:

https://github.com/apache/carbondata/pull/1786

[CARBONDATA-1988] Fixed bug to remove empty partition directory for drop 
partition command

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/geetikagupta16/incubator-carbondata 
CARBONDATA-1988

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1786.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1786


commit 41263d54d69a492e77275d4c375d330430cbebc3
Author: Geetika Gupta 
Date:   2018-01-10T10:53:55Z

Refactored code to remove partition directory for drop partition command




---


[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1584
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2675/



---


[jira] [Created] (CARBONDATA-2016) Exception displays while implementing compaction with alter query

2018-01-10 Thread Vandana Yadav (JIRA)
Vandana Yadav created CARBONDATA-2016:
-

 Summary: Exception displays while implementing compaction with 
alter query
 Key: CARBONDATA-2016
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2016
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.3.0
 Environment: spark 2.1
Reporter: Vandana Yadav
Priority: Minor


An exception is thrown while performing compaction after an alter table (add column) query.

Steps to reproduce:

1) Create a table :
CREATE TABLE CUSTOMER1 ( C_CUSTKEY INT , C_NAME STRING , C_ADDRESS STRING , 
C_NATIONKEY INT , C_PHONE STRING , C_ACCTBAL DECIMAL(15,2) , C_MKTSEGMENT 
STRING , C_COMMENT STRING) stored by 'carbondata';

2) Insert data into the table:
a) insert into customer1 
values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
b) insert into customer1 
values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
c) insert into customer1 
values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
d) insert into customer1 
values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')

3) Perform alter table query:
 alter table customer1 add columns (intfield int) TBLPROPERTIES 
('DEFAULT.VALUE.intfield'='10');

4) show segments for displaying segments before compaction
show segments for table customer1;

output:
+--------------------+----------+--------------------------+--------------------------+------------+--------------+
| SegmentSequenceId  | Status   | Load Start Time          | Load End Time            | Merged To  | File Format  |
+--------------------+----------+--------------------------+--------------------------+------------+--------------+
| 3                  | Success  | 2018-01-10 16:16:53.611  | 2018-01-10 16:16:54.99   | NA         | COLUMNAR_V3  |
| 2                  | Success  | 2018-01-10 16:16:46.878  | 2018-01-10 16:16:47.75   | NA         | COLUMNAR_V3  |
| 1                  | Success  | 2018-01-10 16:16:38.096  | 2018-01-10 16:16:38.972  | NA         | COLUMNAR_V3  |
| 0                  | Success  | 2018-01-10 16:16:31.979  | 2018-01-10 16:16:33.293  | NA         | COLUMNAR_V3  |
+--------------------+----------+--------------------------+--------------------------+------------+--------------+
4 rows selected (0.029 seconds)

5) alter table query for compaction:
alter table customer1 compact 'minor';

Expected Result: Table should be compacted successfully.

Actual Result: 
Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check 
logs for more info. Exception in compaction Compaction Failure in Merger Rdd.; 
(state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1784: [CARBONDATA-1965]removed sort_scope from dynamic con...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1784
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2812/



---


[jira] [Updated] (CARBONDATA-2015) Restricted maximum length of bytes per column

2018-01-10 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2015:

Description: 
Validation for the number of bytes per column is added.

The number of characters per column is limited to 32000, but a single unicode 
character can take 3 bytes in UTF-8. So if a column value has 30,000 such 
characters, its byte length (30,000 * 3 = 90,000) exceeds the signed short 
range (32,767) and the load will fail.

> Restricted maximum length of bytes per column
> -
>
> Key: CARBONDATA-2015
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2015
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Validation for the number of bytes per column is added.
> The number of characters per column is limited to 32000, but a single unicode 
> character can take 3 bytes in UTF-8. So if a column value has 30,000 such 
> characters, its byte length (30,000 * 3 = 90,000) exceeds the signed short 
> range (32,767) and the load will fail.
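
As a rough illustration of such a validation (a sketch only; the class, method name, and exact limit below are assumptions, not the CarbonData implementation), the UTF-8 byte length of a column value can be checked against the signed short range:

    import java.nio.charset.StandardCharsets;

    public final class ColumnByteLengthCheck {
      // Assumed limit: the encoded byte length must fit in a signed short (32,767).
      private static final int MAX_BYTES_PER_COLUMN = Short.MAX_VALUE;

      // Returns true if the UTF-8 encoded value fits within the limit.
      static boolean fitsColumnLimit(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length <= MAX_BYTES_PER_COLUMN;
      }

      public static void main(String[] args) {
        // A 3-byte UTF-8 character (a CJK ideograph) repeated 30,000 times: 90,000 bytes.
        String wide = repeat("\u4e2d", 30000);
        // 30,000 ASCII characters: 30,000 bytes.
        String ascii = repeat("a", 30000);

        System.out.println("wide fits:  " + fitsColumnLimit(wide));   // false
        System.out.println("ascii fits: " + fitsColumnLimit(ascii));  // true
      }

      private static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder(s.length() * n);
        for (int i = 0; i < n; i++) {
          sb.append(s);
        }
        return sb.toString();
      }
    }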



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1785: [CARBONDATA-2015] Restricted maximum length o...

2018-01-10 Thread dhatchayani
GitHub user dhatchayani opened a pull request:

https://github.com/apache/carbondata/pull/1785

[CARBONDATA-2015] Restricted maximum length of bytes per column

Validation for the number of bytes per column is added.

The number of characters per column is limited to 32000, but a single unicode 
character can take 3 bytes in UTF-8. So if a column value has 30,000 such 
characters, its byte length (30,000 * 3 = 90,000) exceeds the signed short 
range (32,767) and the load will fail.

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [x] Testing done
UT Added
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhatchayani/incubator-carbondata 32000_bytes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1785.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1785


commit e380a1d6b2ffae8611f6045e9f63d2ca6e710652
Author: dhatchayani 
Date:   2018-01-10T10:59:14Z

[CARBONDATA-2015] Restricted maximum length of bytes per column




---


[GitHub] carbondata issue #1784: [CARBONDATA-1965]removed sort_scope from dynamic con...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1784
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/2674/



---


[GitHub] carbondata issue #1784: [CARBONDATA-1965]removed sort_scope from dynamic con...

2018-01-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1784
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/1440/



---


[GitHub] carbondata issue #1724: [CARBONDATA-1940][PreAgg] Fixed bug for creation of ...

2018-01-10 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/1724
  
retest this please


---


[jira] [Created] (CARBONDATA-2015) Restricted maximum length of bytes per column

2018-01-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2015:
---

 Summary: Restricted maximum length of bytes per column
 Key: CARBONDATA-2015
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2015
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-2014) update table status for load failure only after first entry

2018-01-10 Thread Akash R Nilugal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal reassigned CARBONDATA-2014:
---

Assignee: Akash R Nilugal

> update table status for load failure only after first entry
> ---
>
> Key: CARBONDATA-2014
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2014
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Update the table status for load failure only after the first entry has been 
> made. Before calling the table status update for failure, check whether the 
> table is a Hive partition table, in the same way as it is checked while 
> updating the in-progress status in the table status file.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1784: [CARBONDATA-1965]removed sort_scope from dyna...

2018-01-10 Thread vandana7
GitHub user vandana7 opened a pull request:

https://github.com/apache/carbondata/pull/1784

[CARBONDATA-1965] Removed sort_scope from dynamic configuration in 
carbondata using set-reset, as it is not configured by set




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vandana7/incubator-carbondata remove_scope_set

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1784.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1784


commit 05175b284ff7c58dec7ed11c2a3c1f914bc22697
Author: vandana 
Date:   2018-01-10T10:08:40Z

removed sort_scope from dynamic configuration in carbondata using set-reset 
as it is not configured by set




---


[GitHub] carbondata issue #1751: [CARBONDATA-1971][Blocklet Prunning] Measure Null va...

2018-01-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1751
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2811/



---


[jira] [Closed] (CARBONDATA-1735) Carbon1.3.0 Load: Segment created during load is not marked for delete if beeline session is closed while load is still in progress

2018-01-10 Thread Ajeet Rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajeet Rai closed CARBONDATA-1735.
-
Resolution: Fixed

This issue has been verified in the latest Carbon 1.3 version and it is working 
fine. Hence, closing the defect.

> Carbon1.3.0 Load: Segment created during load is not marked for delete if 
> beeline session is closed  while load is still in progress
> 
>
> Key: CARBONDATA-1735
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1735
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: 3 Node ant cluster 
>Reporter: Ajeet Rai
>Priority: Minor
>  Labels: DFX
>
> Load: Segment created during load is not marked for delete if the beeline 
> session is closed while the load is still in progress.
> Steps: 
> 1: Create a table with dictionary include.
> 2: Start a load job.
> 3: Close the beeline session while the global dictionary generation job is 
> still in progress.
> 4: Observe that the global dictionary generation job completes but the next 
> job is not triggered.
> 5: Also observe that the table status file is not updated and the status of 
> the job is still In Progress.
> 6: show segments will show this segment with status In Progress.
> Expected behaviour: Either the job should complete, or the load should fail 
> and the segment should be marked for delete.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

