[GitHub] carbondata issue #2612: [CARBONDATA-2834] Remove unnecessary nested looping ...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2612
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6528/



---


[GitHub] carbondata issue #2612: [CARBONDATA-2834] Remove unnecessary nested looping ...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2612
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7804/



---


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread brijoobopanna
Github user brijoobopanna commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
retest this please



---


[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...

2018-08-06 Thread manishgupta88
Github user manishgupta88 commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
LGTM


---


[GitHub] carbondata issue #2605: [CARBONDATA-2585] Fix local dictionary for both tabl...

2018-08-06 Thread akashrn5
Github user akashrn5 commented on the issue:

https://github.com/apache/carbondata/pull/2605
  
retest this please


---


[GitHub] carbondata pull request #2586: [wip]Ui kill

2018-08-06 Thread akashrn5
Github user akashrn5 closed the pull request at:

https://github.com/apache/carbondata/pull/2586


---


[GitHub] carbondata issue #2612: [CARBONDATA-2834] Remove unnecessary nested looping ...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2612
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6183/



---


[GitHub] carbondata pull request #2612: [CARBONDATA-2834] Remove unnecessary nested l...

2018-08-06 Thread kunal642
GitHub user kunal642 opened a pull request:

https://github.com/apache/carbondata/pull/2612

[CARBONDATA-2834] Remove unnecessary nested looping over 
loadMetadataDetails.

removed nested for loop which causes query performance degradation if…

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kunal642/carbondata nestedloop_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2612.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2612


commit ebe22d331dc4ea4ef6904e779702801c3eb5d859
Author: kunal642 
Date:   2018-08-06T12:47:28Z

removed nested for loop which causes query performance degradation if the 
number of segments is too large




---


[jira] [Created] (CARBONDATA-2834) Refactor code to remove nested for loop to extract invalidTimestampRange.

2018-08-06 Thread Kunal Kapoor (JIRA)
Kunal Kapoor created CARBONDATA-2834:


 Summary: Refactor code to remove nested for loop to extract 
invalidTimestampRange.
 Key: CARBONDATA-2834
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2834
 Project: CarbonData
  Issue Type: Bug
Reporter: Kunal Kapoor
Assignee: Kunal Kapoor


Refactor the getInvalidTimestampRange method in SegmentUpdateStatusManager 
because it has an unnecessary nested loop to get the timestamps of invalid 
segments.

This causes query performance degradation when there are many segments.
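
A minimal sketch of the refactor idea, using hypothetical stand-in types rather 
than CarbonData's actual LoadMetadataDetails API: index the load details by 
segment name once, then resolve each invalid segment with an O(1) lookup 
instead of re-scanning the whole details list per segment.

```
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for LoadMetadataDetails; illustrative only.
class LoadDetail {
  final String segmentName;
  final long loadStartTime;
  LoadDetail(String segmentName, long loadStartTime) {
    this.segmentName = segmentName;
    this.loadStartTime = loadStartTime;
  }
}

public class InvalidRangeLookup {
  // O(n + m) instead of O(n * m): build the index once, then look up
  // each invalid segment directly.
  static List<Long> invalidTimestamps(List<LoadDetail> details,
      List<String> invalidSegments) {
    Map<String, LoadDetail> bySegment = new HashMap<>();
    for (LoadDetail d : details) {
      bySegment.put(d.segmentName, d);
    }
    List<Long> result = new ArrayList<>();
    for (String segment : invalidSegments) {
      LoadDetail d = bySegment.get(segment); // replaces the inner loop scan
      if (d != null) {
        result.add(d.loadStartTime);
      }
    }
    return result;
  }
}
```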



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6527/



---


[jira] [Closed] (CARBONDATA-2809) Manually rebuilding non-lazy datamap cause error

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2809.
--
Resolution: Duplicate

Duplicate of CARBONDATA-2821.

> Manually rebuilding non-lazy datamap cause error
> 
>
> Key: CARBONDATA-2809
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2809
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> 1. create base table
> 2. load data to base table
> 3. create index datamap (such as bloomfilter datamap) on base table
> 4. rebuild datamap; this will give an error
> In step 3, the datamap data has already been generated; if we trigger a 
> rebuild, the procedure does not clean the files properly, thus causing the 
> error.
> Actually, the rebuild is not required. We can fix this issue by skipping the 
> rebuild procedure.
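
A minimal sketch of the skip idea under the stated behaviour (hypothetical 
names, not the actual rebuild-command code):

```
// For a non-lazy (non-deferred-rebuild) datamap the data is generated at
// load/creation time, so an explicit REBUILD can be skipped entirely
// rather than regenerating and half-cleaning existing files.
public class RebuildCommandSketch {
  static boolean shouldRebuild(boolean deferredRebuild) {
    // only lazy datamaps (created with deferred rebuild) need a manual rebuild
    return deferredRebuild;
  }

  static void run(boolean deferredRebuild) {
    if (!shouldRebuild(deferredRebuild)) {
      return; // skip: nothing to do for non-lazy datamaps
    }
    // ... perform the actual rebuild ...
  }
}
```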



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reopened CARBONDATA-2820:


> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2820.
--
Resolution: Duplicate

> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571049#comment-16571049
 ] 

xuchuanyin edited comment on CARBONDATA-2820 at 8/7/18 2:40 AM:


Duplicate of CARBONDATA-2821.


was (Author: xuchuanyin):
Duplicate of CARBONDATA-2823.

> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2820.
--
Resolution: Duplicate

Duplicate of CARBONDATA-2823.

> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7803/



---


[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue

2018-08-06 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/2611
  
@kumarvishal09 can you explain this modification?
In the previous implementation, we split a record into 'dict-sort', 
'nodict-sort' and 'noSortDims & measures'. The 'noSortDims & measures' part is 
packed into bytes to avoid serialization/deserialization while reading/writing 
records to the sort temp files. With the previous implementation we saw about 
an 8% improvement in data loading.
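
An illustrative sketch of that 'pack to bytes' idea (not CarbonData's actual 
row format): sort-relevant fields stay typed, while the no-sort dimensions and 
measures are encoded once into an opaque byte[] that sort temp readers/writers 
can copy without per-field serialization.

```
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class PackedRow {
  final Object[] sortFields; // 'dict-sort' + 'nodict-sort' columns
  final byte[] noSortBlob;   // 'noSortDims & measures', already encoded

  PackedRow(Object[] sortFields, byte[] noSortBlob) {
    this.sortFields = sortFields;
    this.noSortBlob = noSortBlob;
  }

  // Encode the no-sort part once; afterwards it is moved around as raw bytes.
  static byte[] pack(int[] noSortDims, double[] measures) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    for (int dim : noSortDims) {
      out.writeInt(dim);
    }
    for (double measure : measures) {
      out.writeDouble(measure);
    }
    out.flush();
    return bytes.toByteArray();
  }
}
```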


---


[jira] [Commented] (CARBONDATA-2833) NPE when we do a insert over a insert failure operation

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571028#comment-16571028
 ] 

xuchuanyin commented on CARBONDATA-2833:


The steps in the issue description cannot reproduce the problem. I've tried 
with different steps, but still cannot reproduce it:

```

test("test") {
 CarbonProperties.getInstance().addProperty("bad_records_logger_enable", "true")
 
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION,
 "FAIL")
 sql("CREATE DATABASE test1")
 sql("use test1")
 sql("DROP TABLE IF EXISTS ab")
 sql("CREATE TABLE ab (a integer, b string) stored by 'carbondata'")
 sql("CREATE DATAMAP dm ON TABLE ab using 'bloomfilter' 
DMPROPERTIES('index_columns'='a,b')")
 try {
 sql("insert into ab select 'berb', 'abc', 'ggg', '1'")
 } catch {
 case e : Exception => LOGGER.error(e)
 }
 LOGGER.error("XU second run")
 try {
 sql("insert into ab select 'berb', 'abc', 'ggg', '1'")
 } catch {
 case e : Exception => LOGGER.error(e)
 }
 sql("select * from ab").show(false)
 sql("DROP TABLE IF EXISTS ab")
 sql("DROP DATABASE IF EXISTS test1")
 sql("use default")
 CarbonProperties.getInstance().addProperty("bad_records_logger_enable",
 CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_LOGGER_ENABLE_DEFAULT)
 
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION,
 "FAIL")
}

```

The load statement complains about the bad-record error; no NPE is reported.

> NPE when we do a insert over a insert failure operation
> ---
>
> Key: CARBONDATA-2833
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2833
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Brijoo Bopanna
>Priority: Major
>
> jdbc:hive2://10.18.5.188:23040/default> CREATE TABLE
> 0: jdbc:hive2://10.18.5.188:23040/default> IF NOT EXISTS test_table(
> 0: jdbc:hive2://10.18.5.188:23040/default> id string,
> 0: jdbc:hive2://10.18.5.188:23040/default> name string,
> 0: jdbc:hive2://10.18.5.188:23040/default> city string,
> 0: jdbc:hive2://10.18.5.188:23040/default> age Int)
> 0: jdbc:hive2://10.18.5.188:23040/default> STORED BY 'carbondata';
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (0.191 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default> desc test_table
> 0: jdbc:hive2://10.18.5.188:23040/default> ;
> +-----------+------------+----------+--+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+--+
> | id        | string     | NULL     |
> | name      | string     | NULL     |
> | city      | string     | NULL     |
> | age       | int        | NULL     |
> +-----------+------------+----------+--+
> 4 rows selected (0.081 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 
> 'berb','abc','ggg','1';
> Error: java.lang.Exception: Data load failed due to bad record: The value 
> with column name a and column data type INT is not a valid INT type.Please 
> enable bad record logger to know the detail reason. (state=,code=0)
> 0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 
> 'berb','abc','ggg','1';
> *Error: java.lang.NullPointerException (state=,code=0)*
> 0: jdbc:hive2://10.18.5.188:23040/default> insert into test_table select 
> 'berb','abc','ggg',1;
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (1.127 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default> show tables
> 0: jdbc:hive2://10.18.5.188:23040/default> ;
> +-----------+-------------+--------------+--+
> | database  |  tableName  | isTemporary  |
> +-----------+-------------+--------------+--+
> | praveen   | a           | false        |
> | praveen   | ab          | false        |
> | praveen   | bbc         | false        |
> | praveen   | test_table  | false        |
> +-----------+-------------+--------------+--+
> 4 rows selected (0.041 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default> desc ab
> 0: jdbc:hive2://10.18.5.188:23040/default> ;
> +-----------+------------+----------+--+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+--+
> | a         | int        | NULL     |
> | b         | string     | NULL     |
> +-----------+------------+----------+--+
> 2 rows selected (0.074 seconds)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6182/



---


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
@ravipesala 
Fixed.
The root cause is that MV is actually 'deferred rebuild', but we didn't 
specify it while creating the datamap.
For compliance, we will enable 'deferred rebuild' for the MV datamap 
regardless of whether the user enables the flag.
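
A minimal sketch of that idea (hypothetical names, not the actual 
datamap-creation code):

```
// MV only supports deferred rebuild, so force the flag on at creation
// time instead of trusting the user-supplied value.
class DataMapSchemaSketch {
  String providerName;
  boolean deferredRebuild;

  void normalize() {
    if ("mv".equalsIgnoreCase(providerName)) {
      deferredRebuild = true; // override whatever the user passed
    }
  }
}
```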


---


[GitHub] carbondata issue #2606: [CARBONDATA-2817]Thread Leak in Update and in No sor...

2018-08-06 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2606
  
@BJangir Please handle the thread-leak scenario for BatchSortWriter in case of 
any exception (DataWriterBatchProcessorStepImpl.java).


---


[GitHub] carbondata pull request #2606: [CARBONDATA-2817]Thread Leak in Update and in...

2018-08-06 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2606#discussion_r207961430
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/loading/steps/CarbonRowDataWriterProcessorStepImpl.java
 ---
@@ -169,24 +171,36 @@ private void doExecute(Iterator 
iterator, int iteratorIndex) thr
   if (rowsNotExist) {
 rowsNotExist = false;
 dataHandler = 
CarbonFactHandlerFactory.createCarbonFactHandler(model);
+this.carbonFactHandlers.add(dataHandler);
 dataHandler.initialise();
   }
   processBatch(iterator.next(), dataHandler, iteratorIndex);
 }
-if (!rowsNotExist) {
-  finish(dataHandler, iteratorIndex);
+try {
+  if (!rowsNotExist) {
+finish(dataHandler, iteratorIndex);
+  }
+} finally {
+  carbonFactHandlers.remove(dataHandler);
 }
+
+
   }
 
   @Override protected String getStepName() {
 return "Data Writer";
   }
 
   private void finish(CarbonFactHandler dataHandler, int iteratorIndex) {
+CarbonDataWriterException exception = null;
--- End diff --

Please also handle the closeHandler method, as it can throw an exception too.
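
A minimal sketch of the pattern being asked for (simplified, not the actual 
CarbonData code): capture a failure from the finishing work, still attempt 
the close, and lose neither exception.

```
public class SafeFinish {
  // Run the finishing work, then always close the handler; if both steps
  // fail, the close failure is attached as a suppressed exception.
  static void finishAndClose(Runnable finish, AutoCloseable handler)
      throws Exception {
    Exception failure = null;
    try {
      finish.run(); // may throw, e.g. while flushing remaining data
    } catch (Exception e) {
      failure = e;
    } finally {
      try {
        handler.close(); // closeHandler can also throw
      } catch (Exception e) {
        if (failure == null) {
          failure = e;
        } else {
          failure.addSuppressed(e);
        }
      }
    }
    if (failure != null) {
      throw failure;
    }
  }
}
```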


---


[jira] [Updated] (CARBONDATA-2827) Refactor Segment Status Manager Interface

2018-08-06 Thread Venkata Ramana G (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Ramana G updated CARBONDATA-2827:
-
Attachment: Segment Status Management interface 
design_V1_Ramana_reviewed.docx

> Refactor Segment Status Manager Interface
> -
>
> Key: CARBONDATA-2827
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2827
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Segment Status Management interface design_V1.docx, 
> Segment Status Management interface design_V1_Ramana_reviewed.docx
>
>
> Carbon uses the tablestatus file to record the status and details of each 
> segment during each load. This tablestatus enables carbon to support 
> concurrent loads and reads without data inconsistency or corruption.
> So it is a very important feature of carbondata and we should have clean 
> interfaces to maintain it. The current tablestatus updating is scattered 
> across multiple places and there is no clean interface, so I am proposing to 
> refactor the current SegmentStatusManager interface and bring all tablestatus 
> operations into a single interface.
> This new interface allows storing the table status in any other storage, such 
> as a DB. This is needed for S3-type object stores, as these are eventually 
> consistent.
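
A hypothetical sketch of the kind of single interface the proposal describes 
(illustrative names only): every tablestatus read/write goes through one 
abstraction so the backing store (a file on HDFS/S3, or a DB) can be swapped.

```
import java.io.IOException;
import java.util.List;

public interface SegmentStatusStore {
  // read the currently visible segments of a table
  List<String> readVisibleSegments(String tableId) throws IOException;

  // record or update the status of one segment atomically
  void writeSegmentStatus(String tableId, String segmentId, String status)
      throws IOException;

  // move invisible segment entries into the history store
  void moveToHistory(String tableId, List<String> invisibleSegments)
      throws IOException;
}
```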



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2609
  
LGTM


---


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
@xuchuanyin Please check MVTests, it is failing


---


[jira] [Created] (CARBONDATA-2833) NPE when we do a insert over a insert failure operation

2018-08-06 Thread Brijoo Bopanna (JIRA)
Brijoo Bopanna created CARBONDATA-2833:
--

 Summary: NPE when we do a insert over a insert failure operation
 Key: CARBONDATA-2833
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2833
 Project: CarbonData
  Issue Type: Bug
Reporter: Brijoo Bopanna


jdbc:hive2://10.18.5.188:23040/default> CREATE TABLE

0: jdbc:hive2://10.18.5.188:23040/default> IF NOT EXISTS test_table(

0: jdbc:hive2://10.18.5.188:23040/default> id string,

0: jdbc:hive2://10.18.5.188:23040/default> name string,

0: jdbc:hive2://10.18.5.188:23040/default> city string,

0: jdbc:hive2://10.18.5.188:23040/default> age Int)

0: jdbc:hive2://10.18.5.188:23040/default> STORED BY 'carbondata';

+---------+--+
| Result  |
+---------+--+
+---------+--+

No rows selected (0.191 seconds)

0: jdbc:hive2://10.18.5.188:23040/default>

0: jdbc:hive2://10.18.5.188:23040/default>

0: jdbc:hive2://10.18.5.188:23040/default>

0: jdbc:hive2://10.18.5.188:23040/default> desc test_table

0: jdbc:hive2://10.18.5.188:23040/default> ;

+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | string     | NULL     |
| name      | string     | NULL     |
| city      | string     | NULL     |
| age       | int        | NULL     |
+-----------+------------+----------+--+

4 rows selected (0.081 seconds)

0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 
'berb','abc','ggg','1';

Error: java.lang.Exception: Data load failed due to bad record: The value with 
column name a and column data type INT is not a valid INT type.Please enable 
bad record logger to know the detail reason. (state=,code=0)

0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 
'berb','abc','ggg','1';

*Error: java.lang.NullPointerException (state=,code=0)*

0: jdbc:hive2://10.18.5.188:23040/default> insert into test_table select 
'berb','abc','ggg',1;

+---------+--+
| Result  |
+---------+--+
+---------+--+

No rows selected (1.127 seconds)

0: jdbc:hive2://10.18.5.188:23040/default> show tables

0: jdbc:hive2://10.18.5.188:23040/default> ;

+-----------+-------------+--------------+--+
| database  |  tableName  | isTemporary  |
+-----------+-------------+--------------+--+
| praveen   | a           | false        |
| praveen   | ab          | false        |
| praveen   | bbc         | false        |
| praveen   | test_table  | false        |
+-----------+-------------+--------------+--+

4 rows selected (0.041 seconds)

0: jdbc:hive2://10.18.5.188:23040/default>

0: jdbc:hive2://10.18.5.188:23040/default> desc ab

0: jdbc:hive2://10.18.5.188:23040/default> ;

+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| a         | int        | NULL     |
| b         | string     | NULL     |
+-----------+------------+----------+--+

2 rows selected (0.074 seconds)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2611
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6526/



---


[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2611
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7802/



---


[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2610
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6524/



---


[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2610
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7800/



---


[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2611
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6181/



---


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6523/



---


[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2611
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6180/



---


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7799/



---


[GitHub] carbondata issue #2611: [WIP]Fixed data loading performance issue

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2611
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7801/



---


[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6522/



---


[GitHub] carbondata pull request #2611: [WIP]Fixed data loading performance issue

2018-08-06 Thread kumarvishal09
GitHub user kumarvishal09 opened a pull request:

https://github.com/apache/carbondata/pull/2611

[WIP]Fixed data loading performance issue

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kumarvishal09/incubator-carbondata 
dataloadPerFix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2611


commit 5a2ebf3d056794387f2622818c9cf7be7ec4ec61
Author: kumarvishal09 
Date:   2018-08-06T13:30:27Z

Fixed data loading performance issue




---


[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7798/



---


[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2610
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6179/



---


[jira] [Commented] (CARBONDATA-2822) Carbon Configuration - "carbon.invisible.segments.preserve.count" configuration property is not working as expected.

2018-08-06 Thread Indhumathi Muthumurugesh (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570184#comment-16570184
 ] 

Indhumathi Muthumurugesh commented on CARBONDATA-2822:
--

Hi Prasanna,

The carbon configuration *"carbon.invisible.segments.preserve.count"* actually 
applies to the table status file. When this property is set, if the number of 
invisible segment entries exceeds the given value, those entries are removed 
and written to the tablestatus.history file.
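
A minimal sketch of that trimming behaviour (hypothetical names, not the 
actual status-manager code):

```
import java.util.ArrayList;
import java.util.List;

public class InvisibleSegmentTrimmer {
  // Keep at most preserveCount invisible entries in the current status;
  // the oldest overflow entries are appended to the history list.
  static List<String> trim(List<String> invisibleEntries, int preserveCount,
      List<String> history) {
    if (invisibleEntries.size() <= preserveCount) {
      return invisibleEntries; // nothing exceeds the limit
    }
    int overflow = invisibleEntries.size() - preserveCount;
    history.addAll(invisibleEntries.subList(0, overflow));
    return new ArrayList<>(
        invisibleEntries.subList(overflow, invisibleEntries.size()));
  }
}
```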

Thanks & Regards,

Indhumathi M

> Carbon Configuration - "carbon.invisible.segments.preserve.count"  
> configuration property is not working as expected.
> -
>
> Key: CARBONDATA-2822
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2822
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, file-format
> Environment: 3 Node ANT cluster.
>Reporter: Prasanna Ravichandran
>Priority: Minor
>
> The *carbon.invisible.segments.preserve.count* configuration is not working 
> as expected.
> +*Steps to reproduce:*+
> 1) Setting up "*carbon.invisible.segments.preserve.count=20"* in 
> carbon.properties and restarting the thrift server.
>  
> 2) After performing Loading 40 times and Compaction 4 times.
> 3) Perform clean files, so that the tablestatus.history file would be 
> generated with invisible segments details.
>  So in total 44 segments would be created, including visible and invisible 
> segments (40 load segments with IDs 0,1,2...39, plus 4 new compacted 
> segments such as 0.1, 20.1, 22.1, 0.2).
> Of these, *41 segments' information is present in the "tablestatus.history" 
> file* (which holds the invisible, i.e. marked-for-delete and compacted, 
> segment details) and 3 segments' information is present in the "tablestatus" 
> file (which holds the visible segments: 0.2, the final compacted segment, 
> along with the 1st segment (segment 0) and the last segment (segment 39)). 
> *But the invisible segment preserve count is configured to 20, which is not 
> followed for the tablestatus.history file.*
> +*Expected result:*+
> tablestatus.history file should preserve only the latest 20 segments, as per 
> the configuration.
> +*Actual result:*+
> The tablestatus.history file has 41 invisible segments' details (above the 
> configured value of 20).
>  
> This is tested with ANT cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6178/



---


[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...

2018-08-06 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
retest this please


---


[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7796/



---


[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6520/



---


[GitHub] carbondata issue #2594: [CARBONDATA-2809][DataMap] Block rebuilding for bloo...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2594
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6177/



---


[GitHub] carbondata issue #2608: [CARBONDATA-2829][CARBONDATA-2832] Fix creating merg...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6176/



---


[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2568
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6519/



---


[jira] [Assigned] (CARBONDATA-2832) Block loading error for select query executed after merge index command executed on V1/V2 store table

2018-08-06 Thread Chetan Bhat (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat reassigned CARBONDATA-2832:
---

Assignee: dhatchayani

> Block loading error for select query executed after merge index command 
> executed on V1/V2 store table
> -
>
> Key: CARBONDATA-2832
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2832
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: dhatchayani
>Priority: Minor
>
> Steps :
> *Create and load data in V1/V2 carbon store:*
> create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('table_blocksize'='1');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE 
> brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
> *In 1.4.1*
> refresh table brinjal;
> alter table brinjal compact 'segment_index';
> select * from brinjal where AMSize='8RAM size';
>  
> *Issue : Block loading error for select query executed after merge index 
> command executed on V1/V2 store table.*
> 0: jdbc:hive2://10.18.98.101:22550/default> select * from brinjal where 
> AMSize='8RAM size';
> *Error: java.io.IOException: Problem in loading segment blocks. 
> (state=,code=0)*
> *Expected:* a select query executed after the merge index command on a 
> V1/V2 store table should return the correct result set without error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2568
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7795/



---


[jira] [Created] (CARBONDATA-2832) Block loading error for select query executed after merge index command executed on V1/V2 store table

2018-08-06 Thread Chetan Bhat (JIRA)
Chetan Bhat created CARBONDATA-2832:
---

 Summary: Block loading error for select query executed after merge 
index command executed on V1/V2 store table
 Key: CARBONDATA-2832
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2832
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 1.4.1
 Environment: Spark 2.1
Reporter: Chetan Bhat


Steps :

*Create and load data in V1/V2 carbon store:*

create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry 
string, Activecity string,gamePointId double,deviceInformationId 
double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) 
STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1');

LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE 
brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');

*In 1.4.1*

refresh table brinjal;

alter table brinjal compact 'segment_index';

select * from brinjal where AMSize='8RAM size';

 

*Issue : Block loading error for select query executed after merge index 
command executed on V1/V2 store table.*

0: jdbc:hive2://10.18.98.101:22550/default> select * from brinjal where 
AMSize='8RAM size';

*Error: java.io.IOException: Problem in loading segment blocks. (state=,code=0)*

*Expected:* a select query executed after the merge index command on a V1/V2 
store table should return the correct result set without error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2568
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6175/



---


[GitHub] carbondata issue #2568: [Presto-integration-Technical-note] created document...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2568
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6174/



---


[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2610
  
Build Failed with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6518/



---


[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2610
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7793/



---


[GitHub] carbondata pull request #2568: [Presto-integration-Technical-note] created d...

2018-08-06 Thread vandana7
Github user vandana7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2568#discussion_r207833427
  
--- Diff: integration/presto/presto-integration-technical-note.md ---
@@ -0,0 +1,253 @@
+
+
+# Presto Integration Technical Note
+Presto integration with CarbonData includes the steps below:
+
+* Setting up Presto Cluster
+
+* Setting up cluster to use carbondata as a catalog along with other 
catalogs provided by presto.
+
+In this technical note we will first cover the above two points, and then 
see how to tune performance with Presto.
+
+## **Let us begin with the first step of Presto Cluster Setup:**
+
+
+* ### Installing Presto
+
+ 1. Download the 0.187 version of Presto using:
+  `wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.187/presto-server-0.187.tar.gz`
+
+ 2. Extract Presto tar file: `tar zxvf presto-server-0.187.tar.gz`.
+
+ 3. Download the Presto CLI for the coordinator and name it presto.
+
+  ```
+wget 
https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.187/presto-cli-0.187-executable.jar
+
+mv presto-cli-0.187-executable.jar presto
+
+chmod +x presto
+  ```
+
+### Create Configuration Files
+
+  1. Create `etc` folder in presto-server-0.187 directory.
+  2. Create `config.properties`, `jvm.config`, `log.properties`, and 
`node.properties` files.
+  3. Install uuid to generate a node.id.
+
+  ```
+  sudo apt-get install uuid
+
+  uuid
+  ```
+
+
+# Contents of your node.properties file
+
+  ```
+  node.environment=production
+  node.id=
+  node.data-dir=/home/ubuntu/data
+  ```
+
+# Contents of your jvm.config file
+
+  ```
+  -server
+  -Xmx16G
+  -XX:+UseG1GC
+  -XX:G1HeapRegionSize=32M
+  -XX:+UseGCOverheadLimit
+  -XX:+ExplicitGCInvokesConcurrent
+  -XX:+HeapDumpOnOutOfMemoryError
+  -XX:OnOutOfMemoryError=kill -9 %p
+  ```
+
+# Contents of your log.properties file
+  ```
+  com.facebook.presto=INFO
+  ```
+
+ The default minimum level is `INFO`. There are four levels: `DEBUG`, 
`INFO`, `WARN` and `ERROR`.
+
+### Coordinator Configurations
+
+# Contents of your config.properties
+  ```
+  coordinator=true
+  node-scheduler.include-coordinator=false
+  http-server.http.port=8086
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery-server.enabled=true
+  discovery.uri=:8086
+  ```
+The options `coordinator=true` and `node-scheduler.include-coordinator=false` 
indicate that the node is the coordinator and tell the coordinator not to do 
any of the computation work itself but to use the workers.
+
+**Note**: We recommend setting `query.max-memory-per-node` to half of the 
JVM config max memory, though if your workload is highly concurrent, you may 
want to use a lower value for `query.max-memory-per-node`.
+
+Also, the relation between the two configuration properties below should be:
+If `query.max-memory-per-node=30GB`,
+then `query.max-memory=<30GB * number of nodes>`.
+
+### Worker Configurations
+
+# Contents of your config.properties
+
+  ```
+  coordinator=false
+  http-server.http.port=8086
+  query.max-memory=50GB
+  query.max-memory-per-node=2GB
+  discovery.uri=:8086
+  ```
+
+**Note**: the `jvm.config` and `node.properties` files are the same for all 
the nodes (worker + coordinator), except that every node should have a 
different `node.id` (generated by the uuid command).
+
+### **With this the Presto cluster setup is ready; the following further 
steps are required to integrate it with carbondata:**
+
+### Catalog Configurations
+
+1. Create a folder named `catalog` in etc directory of presto on all the 
nodes of the cluster including the coordinator.
+
+# Configuring Carbondata in Presto
+1. Create a file named `carbondata.properties` in the `catalog` folder and 
set the required properties on all the nodes.
+
+### Add Plugins
+
+1. Create a directory named `carbondata` in plugin directory of presto.
+2. Copy `carbondata` jars to `plugin/carbondata` directory on all nodes.
+
+### Start Presto Server on all nodes
+
+To run it as a background process:
+
+```
+./presto-server-0.187/bin/launcher start
+```
+
+To run it in the foreground:
+
+```
+./presto-server-0.187/bin/launcher run
+```
+
+### Start Presto CLI
+```
+./presto
+```
+To connect to carbondata catalog use the following command:
+
+```
+./presto --server :8086 --catalog carbondata --schema 

+```
  

[GitHub] carbondata pull request #2568: [Presto-integration-Technical-note] created d...

2018-08-06 Thread vandana7
Github user vandana7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2568#discussion_r207833200
  
--- Diff: integration/presto/performance-report-of-presto-with-carbon.md ---
@@ -0,0 +1,27 @@

[GitHub] carbondata pull request #2568: [Presto-integration-Technical-note] created d...

2018-08-06 Thread vandana7
Github user vandana7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2568#discussion_r207832179
  
--- Diff: integration/presto/presto-integration-in-carbondata.md ---
@@ -0,0 +1,134 @@
+
+
+# PRESTO INTEGRATION IN CARBONDATA
+
+1. [Document Purpose](#document-purpose)
+1. [Purpose](#purpose)
+1. [Scope](#scope)
+1. [Definitions and Acronyms](#definitions-and-acronyms)
+1. [Requirements addressed](#requirements-addressed)
+1. [Design Considerations](#design-considerations)
+1. [Row Iterator Implementation](#row-iterator-implementation)
+1. [ColumnarReaders or StreamReaders 
approach](#columnarreaders-or-streamreaders-approach)
+1. [Module Structure](#module-structure)
+1. [Detailed design](#detailed-design)
+1. [Modules](#modules)
+1. [Functions Developed](#functions-developed)
+1. [Integration Tests](#integration-tests)
+1. [Tools and languages used](#tools-and-languages-used)
+1. [References](#references)
+
+## Document Purpose
+
+ *  _Purpose_
+ The purpose of this document is to outline the technical design of the 
Presto Integration in CarbonData.
+
+ Its main purpose is to -
+   *  Provide the link between the Functional Requirement and the detailed 
Technical Design documents.
+   *  Detail the functionality which will be provided by each component or 
group of components and show how the various components interact in the design.
+
+ This document is not intended to address installation and configuration 
details of the actual implementation. Installation and configuration details 
are provided in the technology guides on the CarbonData wiki page. As is true 
with any high-level design, this document will be updated and refined based on 
changing requirements.
+ *  _Scope_
+ Presto Integration with CarbonData will allow execution of CarbonData 
queries on the Presto CLI.  CarbonData can be added easily as a Data Source 
among the multiple heterogeneous data sources for Presto.
+ *  _Definitions and Acronyms_
+  **CarbonData :** CarbonData is a fully indexed columnar and Hadoop 
native data-store for processing heavy analytical workloads and detailed 
queries on big data. In customer benchmarks, CarbonData has proven to manage 
petabytes of data running on extraordinarily low-cost hardware and answers 
queries around 10 times faster than the current open source solutions 
(column-oriented SQL on Hadoop data-stores).
+
+ **Presto :** Presto is a distributed SQL query engine designed to query 
large data sets distributed over one or more heterogeneous data sources.
+
+## Requirements addressed
+This integration of Presto mainly serves two purposes:
+ * Support for Apache CarbonData as a Data Source in Presto.
+ * Execution of Apache CarbonData queries on Presto.
+
+## Design Considerations
--- End diff --

Done


---


[GitHub] carbondata issue #2610: [CARBONDATA-2831] Added Support Merge index files re...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2610
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6173/



---


[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2609
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6517/



---


[GitHub] carbondata pull request #2590: [CARBONDATA-2750] Updated documentation on Lo...

2018-08-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2590


---


[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2609
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7792/



---


[GitHub] carbondata pull request #2610: [CARBONDATA-2831] Added Support Merge index f...

2018-08-06 Thread ajantha-bhat
GitHub user ajantha-bhat opened a pull request:

https://github.com/apache/carbondata/pull/2610

[CARBONDATA-2831] Added support for reading merge index files from a non 
transactional table

problem : Currently an SDK read or a non-transactional table read from an 
external table gives null output when a carbonMergeindex file is present 
instead of carbonindex files.

cause : In LatestFileReadCommitted, merge index files were not considered 
while taking the snapshot.

solution : consider the merge index files while taking the snapshot (a sketch 
of the idea follows)
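
A minimal sketch of the fix under the stated cause (hypothetical helper, not 
the actual LatestFileReadCommitted code), assuming the usual `.carbonindex` / 
`.carbonindexmerge` file extensions:

```
import java.util.HashMap;
import java.util.Map;

public class SnapshotFilter {
  static final String INDEX_SUFFIX = ".carbonindex";
  static final String MERGE_INDEX_SUFFIX = ".carbonindexmerge";

  // Build the read snapshot from file names and timestamps; previously only
  // plain index files were accepted, so merge-index-only segments produced
  // an empty snapshot (and hence null query output).
  static Map<String, Long> takeSnapshot(Map<String, Long> filesWithTimestamps) {
    Map<String, Long> snapshot = new HashMap<>();
    for (Map.Entry<String, Long> entry : filesWithTimestamps.entrySet()) {
      String name = entry.getKey();
      if (name.endsWith(INDEX_SUFFIX) || name.endsWith(MERGE_INDEX_SUFFIX)) {
        snapshot.put(name, entry.getValue());
      }
    }
    return snapshot;
  }
}
```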

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed? NA
 
 - [ ] Any backward compatibility impacted? NA
 
 - [ ] Document update required? NA

 - [ ] Testing done. Added UT   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. NA



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ajantha-bhat/carbondata issue_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2610.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2610


commit 4a2ca45bd80e542db4a0e461ffdbfb6e55f29d27
Author: ajantha-bhat 
Date:   2018-08-06T08:45:41Z

[CARBONDATA-2831] Added support for reading merge index files from a non 
transactional table.

problem : Currently an SDK read or a non-transactional table read from an 
external table gives null output when a carbonMergeindex file is present 
instead of carbonindex files.

cause : In LatestFileReadCommitted, while taking the snapshot, merge index 
files were not considered.

solution : consider the merge index files while taking the snapshot




---


[jira] [Created] (CARBONDATA-2830) Support Merge index files read from non transactional table.

2018-08-06 Thread Ajantha Bhat (JIRA)
Ajantha Bhat created CARBONDATA-2830:


 Summary: Support Merge index files read from non transactional 
table.
 Key: CARBONDATA-2830
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2830
 Project: CarbonData
  Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat


problem : Currently an SDK read or a non-transactional table read from an 
external table gives null output when a carbonMergeindex file is present 
instead of carbonindex files.

cause : In LatestFileReadCommitted, while taking the snapshot, merge index 
files were not considered.

solution : consider the merge index files while taking the snapshot

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2831) Support Merge index files read from non transactional table.

2018-08-06 Thread Ajantha Bhat (JIRA)
Ajantha Bhat created CARBONDATA-2831:


 Summary: Support Merge index files read from non transactional 
table.
 Key: CARBONDATA-2831
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2831
 Project: CarbonData
  Issue Type: Bug
Reporter: Ajantha Bhat
Assignee: Ajantha Bhat


problem : Currently an SDK read or a non-transactional table read from an 
external table gives null output when a carbonMergeindex file is present 
instead of carbonindex files.

cause : In LatestFileReadCommitted, while taking the snapshot, merge index 
files were not considered.

solution : consider the merge index files while taking the snapshot

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

2018-08-06 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2607#discussion_r207812426
  
--- Diff: 
integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala
 ---
@@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends 
CarbonReadSupport[T] {
* @param dictionaryData
* @return
*/
-  private def createSliceArrayBlock(dictionaryData: Dictionary): 
SliceArrayBlock = {
+  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
 val chunks: DictionaryChunksWrapper = 
dictionaryData.getDictionaryChunks
-val sliceArray = new Array[Slice](chunks.getSize + 1)
-// Initialize Slice Array with Empty Slice as per Presto's code
-sliceArray(0) = Slices.EMPTY_SLICE
-var count = 1
+val positionCount = chunks.getSize;
+val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
+val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 
1)
+isNullVector(0) = true
+isNullVector(1) = true
--- End diff --

ok.


---


[GitHub] carbondata issue #2590: [CARBONDATA-2750] Updated documentation on Local Dic...

2018-08-06 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/2590
  
LGTM


---


[jira] [Resolved] (CARBONDATA-2763) Create table with partition and no_inverted_index on long_string column is not blocked

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2763.

   Resolution: Fixed
Fix Version/s: 1.4.1

> Create table with partition and no_inverted_index on long_string column is 
> not blocked
> --
>
> Key: CARBONDATA-2763
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2763
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1, 2.2
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 1.4.1
>
>
> Steps :
>  # Create table with no_inverted_index on the long_string column 
>  CREATE TABLE local_no_inverted_index(id int, name string, description 
> string,address string, note string) STORED BY 'org.apache.carbondata.format' 
> tblproperties('no_inverted_index'='note','long_string_columns'='note');
>  2. Create table with partition on the long_string column 
>   CREATE TABLE local1_partition(id int,name string, description 
> string,address string)  partitioned by (note string) STORED BY 
> 'org.apache.carbondata.format' tblproperties('long_string_columns'='note');
>  
> Actual Output : The Create table with partition and no_inverted_index on 
> long_string column is successful.
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE 
> local_no_inverted_index(id int, name string, description string,address 
> string, note string) STORED BY 'org.apache.carbondata.format' 
> tblproperties('no_inverted_index'='note','long_string_columns'='note');
> +--------+--+
> | Result |
> +--------+--+
> +--------+--+
> No rows selected (2.604 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE local1_partition(id 
> int,name string, description string,address string) partitioned by (note 
> string) STORED BY 'org.apache.carbondata.format' 
> tblproperties('long_string_columns'='note');
> +--------+--+
> | Result |
> +--------+--+
> +--------+--+
> No rows selected (1.989 seconds)
> Expected Output - The Create table with partition and no_inverted_index on 
> long_string column should be blocked.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2609: [CARBONDATA-2823] Support streaming property with da...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2609
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6172/



---


[GitHub] carbondata pull request #2607: [CARBONDATA-2818] Presto Upgrade to 0.206

2018-08-06 Thread bhavya411
Github user bhavya411 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2607#discussion_r207804289
  
--- Diff: 
integration/presto/src/main/scala/org/apache/carbondata/presto/CarbonDictionaryDecodeReadSupport.scala
 ---
@@ -84,25 +85,31 @@ class CarbonDictionaryDecodeReadSupport[T] extends 
CarbonReadSupport[T] {
* @param dictionaryData
* @return
*/
-  private def createSliceArrayBlock(dictionaryData: Dictionary): 
SliceArrayBlock = {
+  private def createSliceArrayBlock(dictionaryData: Dictionary): Block = {
 val chunks: DictionaryChunksWrapper = 
dictionaryData.getDictionaryChunks
-val sliceArray = new Array[Slice](chunks.getSize + 1)
-// Initialize Slice Array with Empty Slice as per Presto's code
-sliceArray(0) = Slices.EMPTY_SLICE
-var count = 1
+val positionCount = chunks.getSize;
+val offsetVector : Array[Int] = new Array[Int](positionCount + 2 )
+val isNullVector: Array[Boolean] = new Array[Boolean](positionCount + 
1)
+isNullVector(0) = true
+isNullVector(1) = true
--- End diff --

We are talking about the dictionary here. In a dictionary there will be only 
one null, and its key value will be 1 by default in CarbonData; hence the 
isNullVector is populated only once with a null value, which has no bearing on 
the actual data. CarbonData keys start from 1, so we need a filler at the 0th 
position, and index 1 is the actual null, mapping to carbondata null values. 
The offset index will look like:
0th position -> 0 (as it is a filler)
1st position -> 0 (for the actual null)
2nd position -> 0, as the byte[] is still null so the starting point will be 0
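
An illustrative sketch of that layout (simplified Java rather than the Scala 
diff above; names are hypothetical): the offsets are cumulative byte positions 
into a concatenated values buffer, with positions 0 and 1 reserved.

```
import java.util.Arrays;

public class DictionaryBlockLayout {
  public static void main(String[] args) {
    byte[][] dictValues = { "ab".getBytes(), "cde".getBytes() };
    int positionCount = dictValues.length + 2; // filler + null + values

    boolean[] isNull = new boolean[positionCount];
    isNull[0] = true; // filler for key 0
    isNull[1] = true; // CarbonData's null key (key 1)

    int[] offsets = new int[positionCount + 1];
    // positions 0 and 1 carry no bytes, so the first three offsets stay 0
    for (int i = 0; i < dictValues.length; i++) {
      offsets[i + 3] = offsets[i + 2] + dictValues[i].length;
    }
    System.out.println(Arrays.toString(offsets)); // [0, 0, 0, 2, 5]
  }
}
```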


---


[jira] [Resolved] (CARBONDATA-2762) Long string column displayed as string in describe formatted

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2762.

   Resolution: Fixed
Fix Version/s: 1.4.1

> Long string column displayed as string in describe formatted
> 
>
> Key: CARBONDATA-2762
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2762
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 1.4.1
>
>
> Steps :
> User creates a table with long string column and executes the describe 
> formatted table command.
> 0: jdbc:hive2://10.18.98.101:22550/default> create table t2(c1 string, c2 
> string) stored by 'carbondata' tblproperties('long_string_columns' = 'c2');
> +--------+--+
> | Result |
> +--------+--+
> +--------+--+
> No rows selected (3.034 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> desc formatted t2;
> Actual Output : The describe formatted displays the c2 column as string 
> instead of long string.
> 0: jdbc:hive2://10.18.98.101:22550/default> desc formatted t2;
> +------------------------------+---------------------------------------------------------------+------------------+--+
> | col_name                     | data_type                                                     | comment          |
> +------------------------------+---------------------------------------------------------------+------------------+--+
> | c1                           | string                                                        | KEY COLUMN,null  |
> *| c2                           | string                                                        | KEY COLUMN,null  |*
> |                              |                                                               |                  |
> | ##Detailed Table Information |                                                               |                  |
> | Database Name                | default                                                       |                  |
> | Table Name                   | t2                                                            |                  |
> | CARBON Store Path            | hdfs://hacluster/user/hive/warehouse/carbon.store/default/t2  |                  |
> | Comment                      |                                                               |                  |
> | Table Block Size             | 1024 MB                                                       |                  |
> | Table Data Size              | 0                                                             |                  |
> | Table Index Size             | 0                                                             |                  |
> | Last Update Time             | 0                                                             |                  |
> | SORT_SCOPE                   | LOCAL_SORT                                                    | LOCAL_SORT       |
> | CACHE_LEVEL                  | BLOCK                                                         |                  |
> | Streaming                    | false                                                         |                  |
> | Local Dictionary Enabled     | true                                                          |                  |
> | Local Dictionary Threshold   | 1                                                             |                  |
> | Local Dictionary Include     | c1,c2                                                         |                  |
> |                              |                                                               |                  |
> | ##Detailed Column property   |                                                               |                  |
> | ADAPTIVE                     |                                                               |                  |
> | SORT_COLUMNS                 | c1                                                            |                  |
> +------------------------------+---------------------------------------------------------------+------------------+--+
> 22 rows selected (2.847 seconds)
>  
> Expected Output: The describe formatted output should display the c2 column as
> long string.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2608: [CARBONDATA-2829] Fix creating merge index on older ...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
Build Success with Spark 2.2.1, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6516/



---


[jira] [Resolved] (CARBONDATA-2796) Fix data loading problem when table has complex column and long string column

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2796.

   Resolution: Fixed
Fix Version/s: 1.4.1

> Fix data loading problem when table has complex column and long string column
> --
>
> Key: CARBONDATA-2796
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2796
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Currently, both the varchar column and the complex column believe they are the
> last member of the noDictionary group when converting a carbon row from raw
> format to the 3-parted format. Since they need to be processed in different
> ways, an exception will occur if we handle the column the wrong way.
> To fix this, we mark the info of complex columns explicitly, like varchar
> columns, and keep the order of the noDictionary group as: normal Dim & varchar &
> complex
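
For illustration, a rough sketch of the ordering fix described above (the type
and names here are made up for the example, not CarbonData's internal API):

object NoDictionaryOrderSketch {
  case class NoDictField(name: String, isVarchar: Boolean = false, isComplex: Boolean = false)

  // Keep the no-dictionary group ordered as: normal dimensions, then varchar
  // (long string) columns, then complex columns, so neither the varchar nor
  // the complex converter has to assume it handles the last member.
  def orderNoDictionaryGroup(fields: Seq[NoDictField]): Seq[NoDictField] = {
    val (complex, rest) = fields.partition(_.isComplex)
    val (varchar, normal) = rest.partition(_.isVarchar)
    normal ++ varchar ++ complex
  }
}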



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2608: [CARBONDATA-2829] Fix creating merge index on older ...

2018-08-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7791/



---


[GitHub] carbondata pull request #2609: [CARBONDATA-2823] Support streaming property ...

2018-08-06 Thread xuchuanyin
GitHub user xuchuanyin opened a pull request:

https://github.com/apache/carbondata/pull/2609

[CARBONDATA-2823] Support streaming property with datamap

Since, during query, CarbonData gets splits from streaming segments and
columnar segments respectively, we can support streaming with index
datamaps.

For the preaggregate datamap, streaming tables are already supported, so here
we will remove the outdated comments.
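
To illustrate the idea (every name below is hypothetical, not CarbonData's
actual API): splits for the two kinds of segments are planned independently,
so index pruning touches only the columnar side:

object StreamingSplitPlanSketch {
  case class Segment(id: String, isStreaming: Boolean)
  case class Split(segmentId: String, format: String)

  // Placeholder for index-datamap pruning: a real datamap would drop columnar
  // segments whose index proves they cannot match the filter.
  def pruneWithIndexDataMap(columnar: Seq[Segment]): Seq[Segment] = columnar

  def planSplits(segments: Seq[Segment]): Seq[Split] = {
    val (streaming, columnar) = segments.partition(_.isStreaming)
    // Streaming segments are always scanned; pruning applies only to columnar ones.
    streaming.map(s => Split(s.id, "streaming")) ++
      pruneWithIndexDataMap(columnar).map(s => Split(s.id, "columnar"))
  }
}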

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [x] Any interfaces changed?
 `NO`
 - [x] Any backward compatibility impacted?
 `NO`
 - [x] Document update required?
`NO`
 - [x] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
`NO`
- How it is tested? Please attach test report.
`Tested in local`
- Is it a performance related change? Please attach the performance 
test report.
`NO`
- Any additional information to help reviewers in testing this 
change.
`NA`
   
 - [x] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuchuanyin/carbondata issue2823_streaming_support_preagg_index_dm

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2609.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2609


commit a7772d8fd2ece3c362f16925299c63b848657c9a
Author: xuchuanyin 
Date:   2018-08-06T07:34:51Z

Support streaming property with datamap

Since, during query, CarbonData gets splits from streaming segments and
columnar segments respectively, we can support streaming with index
datamaps.

For the preaggregate datamap, streaming tables are already supported, so here
we will remove the outdated comments.




---


[jira] [Updated] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread Chetan Bhat (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-2823:

Description: 
Steps :
 # create table
 # create bloom/lucene datamap
 # load data
 # alter table set tblProperties

0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE uniqdata_load (CUST_ID 
int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 
double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (1.43 seconds)
0: jdbc:hive2://10.18.98.101:22550/default> CREATE DATAMAP dm_uniqdata1_tmstmp6 
ON TABLE uniqdata_load USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = 
'DOJ', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.828 seconds)
0: jdbc:hive2://10.18.98.101:22550/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_load 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (4.903 seconds)
0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
tblproperties('local_dictionary_include'='CUST_NAME');
Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
streaming is not supported for index datamap (state=,code=0)

 

Issue: Alter table set local dictionary include fails with an incorrect error.

0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
tblproperties('local_dictionary_include'='CUST_NAME');

*Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
streaming is not supported for index datamap (state=,code=0)*

 

Expected: The operation should succeed. If the operation is unsupported, it
should throw the correct error message.

 

  was:
Steps :

In old version V3 store create table and load data.

CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED BY 'org.apache.carbondata.format';
LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_load OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

In 1.4.1 version refresh the table of old V3 store.

refresh table uniqdata_load;

Create bloom filter and merge index.

CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
'BLOOM_FPP'='0.1');

Alter table set local dictionary include.

 alter table uniqdata_load set 
tblproperties('local_dictionary_include'='CUST_NAME');

 

Issue: Alter table set local dictionary include fails with an incorrect error.

0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
tblproperties('local_dictionary_include'='CUST_NAME');

*Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
streaming is not supported for index datamap (state=,code=0)*

 

Expected: The operation should succeed. If the operation is unsupported, it
should throw the correct error message.

 


> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: xuchuanyin
>Priority: Minor
>
> Steps :
>  # create table
>  # create bloom/lucene datamap
>  # load data
>  # alter table set tblProperties
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE uniqdata_load 
> (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
> timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
> decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, 
> 

[jira] [Updated] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation fails throwing incorrect error

2018-08-06 Thread Chetan Bhat (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-2823:

Summary: Alter table set local dictionary include after bloom creation 
fails throwing incorrect error  (was: Alter table set local dictionary include 
after bloom creation and merge index on old V3 store fails throwing incorrect 
error)

> Alter table set local dictionary include after bloom creation fails throwing 
> incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: xuchuanyin
>Priority: Minor
>
> Steps :
>  # create table
>  # create bloom/lucene datamap
>  # load data
>  # alter table set tblProperties
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE uniqdata_load 
> (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
> timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
> decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, 
> Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 
> 'org.apache.carbondata.format';
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (1.43 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE DATAMAP 
> dm_uniqdata1_tmstmp6 ON TABLE uniqdata_load USING 'bloomfilter' DMPROPERTIES 
> ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (0.828 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> LOAD DATA INPATH 
> 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_load 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (4.903 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)
>  
> Issue: Alter table set local dictionary include fails with an incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected: The operation should succeed. If the operation is unsupported, it
> should throw the correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569825#comment-16569825
 ] 

xuchuanyin commented on CARBONDATA-2823:


Since we get the splits from streaming segments and columnar segments
respectively, we can support streaming with index datamaps.

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: xuchuanyin
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue: Alter table set local dictionary include fails with an incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected: The operation should succeed. If the operation is unsupported, it
> should throw the correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569823#comment-16569823
 ] 

xuchuanyin edited comment on CARBONDATA-2823 at 8/6/18 7:22 AM:


As for CARBONDATA-2823, it can simply be reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties


was (Author: xuchuanyin):
As for CARBONDATA-2823, it can simply reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue: Alter table set local dictionary include fails with an incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected: The operation should succeed. If the operation is unsupported, it
> should throw the correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reassigned CARBONDATA-2823:
--

Assignee: xuchuanyin

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: xuchuanyin
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue: Alter table set local dictionary include fails with an incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected: The operation should succeed. If the operation is unsupported, it
> should throw the correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569823#comment-16569823
 ] 

xuchuanyin commented on CARBONDATA-2823:


As for CARBONDATA-2823, it can simply reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue: Alter table set local dictionary include fails with an incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected: The operation should succeed. If the operation is unsupported, it
> should throw the correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2608: [CARBONDATA-2829] Fix creating merge index on older ...

2018-08-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2608
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/6171/



---


[GitHub] carbondata pull request #2608: [CARBONDATA-2829] Fix creating merge index on...

2018-08-06 Thread dhatchayani
GitHub user dhatchayani opened a pull request:

https://github.com/apache/carbondata/pull/2608

[CARBONDATA-2829] Fix creating merge index on older V1 V2 store

Block merge index creation for the old V1 and V2 store versions

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [x] Testing done
Manual Testing
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhatchayani/carbondata CARBONDATA-2829

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2608.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2608


commit 45130534fed38bc4b4ac684c41d1afb1a33770be
Author: dhatchayani 
Date:   2018-08-06T06:45:26Z

[CARBONDATA-2829] Fix creating merge index on older V1 V2 store




---


[jira] [Created] (CARBONDATA-2829) Fix creating merge index on older V1 V2 store

2018-08-06 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2829:
---

 Summary: Fix creating merge index on older V1 V2 store
 Key: CARBONDATA-2829
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2829
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani


Block creating merge index on older V1 and V2 store versions
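
A minimal sketch of the kind of guard this implies (all names hypothetical;
it only assumes the store version is known for the segment being merged):

object MergeIndexVersionGuardSketch {
  sealed trait StoreVersion
  case object V1 extends StoreVersion
  case object V2 extends StoreVersion
  case object V3 extends StoreVersion

  // Block merge-index creation for segments written by older V1/V2 stores,
  // throwing a clear error instead of producing a broken merge index.
  def checkMergeIndexSupported(version: StoreVersion): Unit = version match {
    case V1 | V2 =>
      throw new UnsupportedOperationException(
        "Creating merge index is not supported on older V1/V2 store segments")
    case V3 => ()
  }
}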



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)