date:20170727

[jira] [Created] (CARBONDATA-1336) Add issue mailing list

2017-07-27 Thread xuchuanyin (JIRA)

xuchuanyin created CARBONDATA-1336:
--

 Summary: Add issue mailing list
 Key: CARBONDATA-1336
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1336
 Project: CarbonData
  Issue Type: Improvement
  Components: docs
Reporter: xuchuanyin
Assignee: xuchuanyin
Priority: Trivial


CarbonData's issue-related mails are now sent to a new mailing list, separate from DEV.
We need to add guidance on this list to the docs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[GitHub] carbondata issue #1203: Rebase encoding_override branch onto master

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1203
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3229/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1203: Rebase encoding_override branch onto master

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1203
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/634/




[GitHub] carbondata issue #1203: Rebase encoding_override branch onto master

2017-07-27 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1203
  
Can one of the admins verify this patch?



[GitHub] carbondata pull request #1203: Rebase encoding_override branch onto master

2017-07-27 Thread sraghunandan

GitHub user sraghunandan opened a pull request:

https://github.com/apache/carbondata/pull/1203

Rebase encoding_override branch onto master

Rebase encoding_override branch onto master

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sraghunandan/carbondata-1 rebase_encoding-override_onto_master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1203.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1203


commit bc3e6843ee83370b6b20e5c9eef92f10667edbae
Author: jackylk 
Date:   2017-07-04T00:12:13Z

[CARBONDATA-1098] Change page statistics use exact type and use column page in writer

This PR changes the writer in data load:

1. make statistics collection use the exact data type in the schema instead of a generic type
2. change consumer and writer to use EncodedTablePage instead of NodeHolder. EncodedTablePage is the output of TablePage.encode

This closes #1102

commit a5af0ff238230bf64c8ac987bec9977d3f081ff2
Author: jackylk 
Date:   2017-07-13T01:21:30Z

[CARBONDATA-1268] Support encoding strategy for dimension columns

In this PR, dimension encoding is changed to use EncodingStrategy instead of hard-coding.
In the future, dimension encoding can be adjusted by extending EncodingStrategy.

This closes #1136

commit 74226907990cdee41a6ccbd69e2a813077792f89
Author: Raghunandan S 
Date:   2017-07-26T13:59:05Z

Resolve rebase conflicts when rebasing branch encoding_override onto master





[GitHub] carbondata issue #1134: [CARBONDATA-1262] Remove unnecessary LoadConfigurati...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1134
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/633/




[GitHub] carbondata issue #1134: [CARBONDATA-1262] Remove unnecessary LoadConfigurati...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1134
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3228/




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3227/




[GitHub] carbondata issue #1134: [CARBONDATA-1262] Remove unnecessary LoadConfigurati...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1134
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/632/




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/631/




[GitHub] carbondata issue #1202: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1202
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/630/




[GitHub] carbondata issue #1202: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1202
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3226/




[GitHub] carbondata issue #1202: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1202
  
Can one of the admins verify this patch?



[GitHub] carbondata issue #1202: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1202
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/629/




[GitHub] carbondata issue #1202: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1202
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3225/




[GitHub] carbondata pull request #1202: [CARBONDATA-1326] Fixed normal/low priority f...

2017-07-27 Thread mohammadshahidkhan

GitHub user mohammadshahidkhan opened a pull request:

https://github.com/apache/carbondata/pull/1202

[CARBONDATA-1326] Fixed normal/low priority findbug issues



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mohammadshahidkhan/incubator-carbondata findbugfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1202.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1202


commit 5e78124091341daf02f5a04d472d7f0e5590d40c
Author: mohammadshahidkhan 
Date:   2017-07-27T15:37:54Z

[CARBONDATA-1326] Fixed normal/low priority findbug issues





[GitHub] carbondata pull request #1134: [CARBONDATA-1262] Remove unnecessary LoadConf...

2017-07-27 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1134#discussion_r129874324
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java ---
@@ -465,6 +463,29 @@ public static String checkAndCreateCarbonStoreLocation(String factStoreLocation,
   }
 
   /**
+   * Return the sort scope enum.
+   */
+  public static SortScopeOptions.SortScope getSortScope(String sortScopeString) {
+    SortScopeOptions.SortScope sortScope;
+    try {
+      // first check whether user input it from ddl, otherwise get from carbon properties
--- End diff --

suggest changing "it" to "sort scope"
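
For readers following the diff, the lookup order named in that code comment (DDL value first, then properties) could be sketched roughly as below. This is only an illustration under assumptions: the enum values, the `sort.scope` key, the two-argument signature and the fallback default are hypothetical and are not taken from the PR, whose method body is not shown in this message.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SortScopeSketch {
  // Hypothetical stand-in for SortScopeOptions.SortScope; the values are assumptions.
  enum SortScope { NO_SORT, LOCAL_SORT }

  /**
   * Resolve the sort scope: prefer the value given in DDL, otherwise read it
   * from a properties map, and use a default when it is missing or unrecognized.
   */
  public static SortScope getSortScope(String ddlValue, Map<String, String> properties) {
    // "sort.scope" is a made-up key used only for this sketch.
    String raw = ddlValue != null ? ddlValue : properties.get("sort.scope");
    if (raw == null) {
      return SortScope.LOCAL_SORT;
    }
    try {
      return SortScope.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      // Unrecognized value: fall back to the default instead of failing.
      return SortScope.LOCAL_SORT;
    }
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put("sort.scope", "no_sort");
    System.out.println(getSortScope(null, props));
    System.out.println(getSortScope("local_sort", props));
  }
}
```

Naming "sort scope" explicitly in the comment, as suggested, makes this precedence easier to read.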



[jira] [Resolved] (CARBONDATA-1287) Remove unnecessary MDK generation in loading

2017-07-27 Thread Liang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-1287.

Resolution: Fixed
  Assignee: Jacky Li

> Remove unnecessary MDK generation in loading
> 
>
> Key: CARBONDATA-1287
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1287
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.2.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When updating the MDK key in the data load write step, there is unnecessary MDK 
> generation. It can be removed to improve loading performance.




[GitHub] carbondata pull request #1145: [CARBONDATA-1287] remove unnecessary MDK gene...

2017-07-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1145



[GitHub] carbondata issue #1145: [CARBONDATA-1287] remove unnecessary MDK generation

2017-07-27 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1145
  
LGTM



[GitHub] carbondata pull request #1200: [Documentation] Fixed the syntax issue in Del...

2017-07-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1200



[GitHub] carbondata issue #1200: [Documentation] Fixed the syntax issue in Delete by ...

2017-07-27 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1200
  
LGTM



[jira] [Resolved] (CARBONDATA-1313) Remove unnecessary statistics

2017-07-27 Thread Liang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-1313.

Resolution: Fixed
  Assignee: Jacky Li

> Remove unnecessary statistics 
> --
>
> Key: CARBONDATA-1313
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1313
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.2.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Unique Value and Decimal Point are not used; remove them from measure statistics.




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3224/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/628/




[GitHub] carbondata pull request #1181: [CARBONDATA-1313] Remove unnecessary measure ...

2017-07-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1181



[GitHub] carbondata issue #1168: [CARBONDATA-1229] restrict drop when loading is in p...

2017-07-27 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1168
  
@kunal642 can you please rebase



[jira] [Resolved] (CARBONDATA-1335) Duplicated & time-consuming method call found in query

2017-07-27 Thread Jacky Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-1335.
--
   Resolution: Fixed
Fix Version/s: 1.2.0

> Duplicated & time-consuming method call found in query
> --
>
> Key: CARBONDATA-1335
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1335
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-query
>Affects Versions: 1.1.1
>Reporter: xuchuanyin
>Priority: Minor
>  Labels: performance
> Fix For: 1.2.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> # Scenario
> We ran 14 concurrent queries on CarbonData. The queries are the 
> same, but run on different tables. We observed the following:
> + A single query took about 5s;
> + In the concurrent scenario, each query took about 15s;
> By adding checkpoints to the log, we found that there was a large latency in 
> starting query jobs in Spark.
> # Analysis
> When we fire a query, CarbonData first does some work on the client side, 
> including parsing/analyzing plans and preparing filtered blocks and inputSplits. 
> Then CarbonData submits the query job to Spark. 
> We found that the first step took about 7s in the concurrent scenario, but 
> less than 1s in the single-query scenario.
> By studying the related code, we found the most time-consuming method call 
> was `CarbonSessionCatalog.lookupRelation`. Inside this method, it called 
> `super.lookupRelation` twice, which consumed about 3s each time.
> # Solution
> CarbonData needs to call `super.lookupRelation` only once; we need 
> to remove the useless duplicated method call.
> I've tested in my environment and it works well. In the concurrent scenario, each 
> query takes about 12s (3s saved by the improvement).
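
The pattern described in the issue (call the expensive parent lookup once and reuse the result) can be illustrated with the sketch below. It is not the CarbonData code: the real body of `CarbonSessionCatalog.lookupRelation` is not shown in this thread, so `superLookup`, the string result and the call counter are hypothetical stand-ins.

```java
final class LookupOnceSketch {
  /** Counts how often the expensive parent lookup runs. */
  public static int superCalls = 0;

  // Hypothetical stand-in for the expensive super.lookupRelation call.
  static String superLookup(String table) {
    superCalls++;
    return "relation:" + table;
  }

  // Before: two parent lookups for a single request.
  public static String lookupTwice(String table) {
    if (superLookup(table).startsWith("relation:")) { // first call, used only for a check
      return superLookup(table);                       // second, redundant call
    }
    return null;
  }

  // After: one parent lookup, with the result reused for both the check and the return.
  public static String lookupOnce(String table) {
    String relation = superLookup(table);
    return relation.startsWith("relation:") ? relation : null;
  }

  public static void main(String[] args) {
    lookupTwice("t1");
    System.out.println("calls before fix: " + superCalls);
    superCalls = 0;
    lookupOnce("t1");
    System.out.println("calls after fix: " + superCalls);
  }
}
```

With each parent call costing about 3s in the reported environment, dropping one of the two calls is consistent with the reported change from about 15s to about 12s per query.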




[GitHub] carbondata issue #1181: [CARBONDATA-1313] Remove unnecessary measure statist...

2017-07-27 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1181
  
LGTM



[jira] [Resolved] (CARBONDATA-1281) Disk hotspot found during data loading

2017-07-27 Thread Liang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-1281.

   Resolution: Fixed
Fix Version/s: 1.2.0

> Disk hotspot found during data loading
> --
>
> Key: CARBONDATA-1281
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1281
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, data-load
>Affects Versions: 1.1.0
>Reporter: xuchuanyin
>Assignee: xuchuanyin
> Fix For: 1.2.0
>
>  Time Spent: 17.5h
>  Remaining Estimate: 0h
>
> # Scenario
> We recently ran a massive data load. The input data is about 71GB 
> in CSV format, and has about 88 million records. When using CarbonData, we do 
> not use any dictionary encoding. Our testing environment has three nodes and 
> each of them has 11 disks as YARN executor directories. We submit the loading 
> command through JDBCServer. The JDBCServer instance has three executors in 
> total, one on each node. The loading takes about 10 minutes 
> (varying by about 3 minutes between runs).
> We observed the nmon information during the loading and found:
> 1. lots of CPU waits in the first half of loading;
> 2. only one single disk has many writes and almost reaches its bottleneck 
> (Avg. 80M/s, Max. 150M/s on SAS Disk)
> 3. the other disks are quite idle
> # Analysis
> When loading data, CarbonData reads and sorts data locally (default scope) 
> and writes the temp files to local disk. In my case, there is only one 
> executor on each node, so CarbonData writes all the temp files to one 
> disk (container directory or YARN local directory), thus resulting in a single 
> disk hotspot.
> # Modification
> We should support multiple directories for writing temp files to avoid the disk 
> hotspot.
> PS: I have implemented this in my environment and the result is 
> promising: the loading takes about 6 minutes (10 minutes before the improvement).
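
One way to spread temp files over several disks is to cycle through a list of directories, as in the sketch below. This is an assumption-laden illustration rather than the change merged for this issue: the class name, the round-robin policy and the example paths are hypothetical, and the actual PR may pick directories differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class TempDirPicker {
  private final String[] dirs;
  private final AtomicInteger next = new AtomicInteger();

  public TempDirPicker(String[] dirs) {
    if (dirs == null || dirs.length == 0) {
      throw new IllegalArgumentException("at least one temp directory is required");
    }
    this.dirs = dirs.clone();
  }

  /** Returns the directory for the next temp file, cycling through all configured directories. */
  public String nextDir() {
    // floorMod keeps the index non-negative even after the counter overflows.
    int i = Math.floorMod(next.getAndIncrement(), dirs.length);
    return dirs[i];
  }

  public static void main(String[] args) {
    // Example paths are placeholders for per-disk local directories.
    TempDirPicker picker = new TempDirPicker(new String[] {"/data1/tmp", "/data2/tmp", "/data3/tmp"});
    for (int n = 0; n < 4; n++) {
      System.out.println(picker.nextDir());
    }
  }
}
```

Round-robin keeps the write load roughly even when temp files are of similar size; a random or free-space-aware choice would be a reasonable alternative.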




[jira] [Assigned] (CARBONDATA-1281) Disk hotspot found during data loading

2017-07-27 Thread Liang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen reassigned CARBONDATA-1281:
--

Assignee: xuchuanyin  (was: Liang Chen)

> Disk hotspot found during data loading
> --
>
> Key: CARBONDATA-1281
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1281
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, data-load
>Affects Versions: 1.1.0
>Reporter: xuchuanyin
>Assignee: xuchuanyin
> Fix For: 1.2.0
>
>  Time Spent: 17.5h
>  Remaining Estimate: 0h




[jira] [Assigned] (CARBONDATA-1281) Disk hotspot found during data loading

2017-07-27 Thread Liang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen reassigned CARBONDATA-1281:
--

Assignee: Liang Chen

> Disk hotspot found during data loading
> --
>
> Key: CARBONDATA-1281
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1281
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, data-load
>Affects Versions: 1.1.0
>Reporter: xuchuanyin
>Assignee: Liang Chen
>  Time Spent: 17.5h
>  Remaining Estimate: 0h




[GitHub] carbondata issue #1134: [CARBONDATA-1262] Remove unnecessary LoadConfigurati...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1134
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3223/




[GitHub] carbondata issue #1134: [CARBONDATA-1262] Remove unnecessary LoadConfigurati...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1134
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/627/




[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1198



[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-27 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
LGTM, very good PR! Thanks for your good contribution.



[GitHub] carbondata issue #1067: [CARBONDATA-1199] support dynamically enabling unsaf...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1067
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/626/




[GitHub] carbondata issue #1067: [CARBONDATA-1199] support dynamically enabling unsaf...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1067
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3222/




[GitHub] carbondata issue #1181: [CARBONDATA-1313] Remove unnecessary measure statist...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1181
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/625/




[GitHub] carbondata issue #1181: [CARBONDATA-1313] Remove unnecessary measure statist...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1181
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3221/




[jira] [Resolved] (CARBONDATA-1268) Add encoding selection strategy for columns

2017-07-27 Thread Jacky Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-1268.
--
Resolution: Fixed

> Add encoding selection strategy for columns
> ---
>
> Key: CARBONDATA-1268
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1268
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Jacky Li
> Fix For: 1.2.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> For each column, CarbonData should support an encoding strategy that chooses 
> a suitable encoding method. 
> This strategy should be extensible, so that developers can change its 
> behavior easily.




[GitHub] carbondata pull request #1136: [CARBONDATA-1268] Support encoding strategy f...

2017-07-27 Thread jackylk

Github user jackylk closed the pull request at:

https://github.com/apache/carbondata/pull/1136



[GitHub] carbondata issue #1194: Rebase metadata onto master

2017-07-27 Thread sraghunandan

Github user sraghunandan commented on the issue:

https://github.com/apache/carbondata/pull/1194
  
Merged onto master



[GitHub] carbondata pull request #1102: [CARBONDATA-1098] Change page statistics use ...

2017-07-27 Thread jackylk

Github user jackylk closed the pull request at:

https://github.com/apache/carbondata/pull/1102



[GitHub] carbondata pull request #1194: Rebase metadata onto master

2017-07-27 Thread sraghunandan

Github user sraghunandan closed the pull request at:

https://github.com/apache/carbondata/pull/1194



[GitHub] carbondata pull request #1099: [CARBONDATA-1232] Datamap implementation for ...

2017-07-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1099



[GitHub] carbondata pull request #1196: Rebase datamap branch onto master

2017-07-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1196



[GitHub] carbondata issue #1196: Rebase datamap branch onto master

2017-07-27 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1196
  
LGTM



[GitHub] carbondata issue #1196: Rebase datamap branch onto master

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1196
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/624/




[GitHub] carbondata issue #1196: Rebase datamap branch onto master

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1196
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3220/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-27 Thread xuchuanyin

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
@chenliang613 Can this PR be merged, or does it need more review?



[GitHub] carbondata issue #1099: [CARBONDATA-1232] Datamap implementation for Blockle...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1099
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/623/




[GitHub] carbondata issue #1099: [CARBONDATA-1232] Datamap implementation for Blockle...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1099
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3219/




[GitHub] carbondata issue #1179: [WIP] Added the blocklet info to index file and make...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1179
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/622/




[GitHub] carbondata issue #1179: [WIP] Added the blocklet info to index file and make...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1179
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3218/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3217/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/621/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3216/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/620/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3215/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Can one of the admins verify this patch?



[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/619/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3213/




[GitHub] carbondata pull request #1201: [CARBONDATA-1326] Fixed normal/low priority f...

2017-07-27 Thread kunal642

GitHub user kunal642 opened a pull request:

https://github.com/apache/carbondata/pull/1201

[CARBONDATA-1326] Fixed normal/low priority findbug issues

Fixed normal/low priority findbug issues in the code

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kunal642/carbondata findbugs_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1201.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1201


commit 3d17859d030772c6492f79ce025577fa98b60ac0
Author: kunal642 
Date:   2017-07-27T10:15:24Z

fixed findbugs issues





[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3214/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/618/




[GitHub] carbondata issue #1201: [CARBONDATA-1326] Fixed normal/low priority findbug ...

2017-07-27 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1201
  
Can one of the admins verify this patch?



[GitHub] carbondata issue #1200: [Documentation] Fixed the syntax issue in Delete by ...

2017-07-27 Thread sgururajshetty

Github user sgururajshetty commented on the issue:

https://github.com/apache/carbondata/pull/1200
  
LGTM
@chenliang613 kindly review



[GitHub] carbondata issue #1200: [Documentation] Fixed the syntax issue in Delete by ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1200
  
Can one of the admins verify this patch?



[GitHub] carbondata issue #1200: [Documentation] Fixed the syntax issue in Delete by ...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1200
  
Can one of the admins verify this patch?



[GitHub] carbondata issue #1200: [Documentation] Fixed the syntax issue in Delete by ...

2017-07-27 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1200
  
Can one of the admins verify this patch?



[GitHub] carbondata pull request #1200: [Documentation] Fixed the syntax issue in Del...

2017-07-27 Thread siddhardhk

GitHub user siddhardhk opened a pull request:

https://github.com/apache/carbondata/pull/1200

[Documentation] Fixed the syntax issue in Delete by Segment ID

In the Delete by Segment ID command the WHERE was misspelled as WERE


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/siddhardhk/carbondata master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1200.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1200


commit 7aba7caf042bec08282941db7f9470407781a07a
Author: siddhardhk 
Date:   2017-07-27T09:58:43Z

[Documentation] Fixed the syntax issue in Delete by Segment ID

In the Delete by Segment ID command the WHERE was misspelled as WERE





[GitHub] carbondata issue #1200: [Documentation] Fixed the syntax issue in Delete by ...

2017-07-27 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1200
  
Can one of the admins verify this patch?



[GitHub] carbondata issue #1199: [CARBONDATA-1335] Remove duplicated time-consuming m...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1199
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3212/




[GitHub] carbondata issue #1199: [CARBONDATA-1335] Remove duplicated time-consuming m...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1199
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/617/




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3211/




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/616/




[jira] [Updated] (CARBONDATA-1335) Duplicated & time-consuming method call found in query

2017-07-27 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-1335:
---
Description: 
# Scenario

We ran 14 concurrent queries on CarbonData. The queries were identical but ran 
against different tables. We observed the following:

+ A single query took about 5s;
+ In the concurrent scenario, each query took about 15s.

By adding checkpoints to the log, we found a large delay before the query 
jobs started in Spark.

# Analyze

When a query is fired, CarbonData first does some work on the client side, 
including parsing/analyzing plans and preparing the filtered blocks and 
inputSplits. CarbonData then submits the query job to Spark. 

We found that this first step took about 7s in the concurrent scenario, but 
less than 1s in the single-query scenario.
By studying the related code, we found that the most time-consuming method 
call was `CarbonSessionCatalog.lookupRelation`. Inside this method, 
`super.lookupRelation` was called twice, and each call took about 3s.

# Solution

CarbonData needs to call `super.lookupRelation` only once, so we should 
remove the redundant duplicate call.

I've tested this in my environment and it works well. In the concurrent 
scenario, each query now takes about 12s (3s saved by the improvement).

  was:
# Scenario

We ran 14 concurrent queries on CarbonData. The queries were identical but ran 
against different tables. We observed the following:

+ A single query took about 5s;
+ In the concurrent scenario, each query took about 15s.

By adding checkpoints to the log, we found a large delay before the query 
jobs started in Spark.

# Analysts

When a query is fired, CarbonData first does some work on the client side, 
including parsing/analyzing plans and preparing the filtered blocks and 
inputSplits. CarbonData then submits the query job to Spark. 

We found that this first step took about 7s in the concurrent scenario, but 
less than 1s in the single-query scenario.
By studying the related code, we found that the most time-consuming method 
call was `CarbonSessionCatalog.lookupRelation`. Inside this method, 
`super.lookupRelation` was called twice, and each call took about 3s.

# Solution

CarbonData needs to call `super.lookupRelation` only once, so we should 
remove the redundant duplicate call.

I've tested this in my environment and it works well. In the concurrent 
scenario, each query now takes about 12s (3s saved by the improvement).


> Duplicated & time-consuming method call found in query
> --
>
> Key: CARBONDATA-1335
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1335
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-query
>Affects Versions: 1.1.1
>Reporter: xuchuanyin
>Priority: Minor
>  Labels: performance
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> # Scenario
> We ran 14 concurrent queries on CarbonData. The queries were identical but 
> ran against different tables. We observed the following:
> + A single query took about 5s;
> + In the concurrent scenario, each query took about 15s.
> By adding checkpoints to the log, we found a large delay before the query 
> jobs started in Spark.
> # Analyze
> When a query is fired, CarbonData first does some work on the client side, 
> including parsing/analyzing plans and preparing the filtered blocks and 
> inputSplits. CarbonData then submits the query job to Spark. 
> We found that this first step took about 7s in the concurrent scenario, but 
> less than 1s in the single-query scenario.
> By studying the related code, we found that the most time-consuming method 
> call was `CarbonSessionCatalog.lookupRelation`. Inside this method, 
> `super.lookupRelation` was called twice, and each call took about 3s.
> # Solution
> CarbonData needs to call `super.lookupRelation` only once, so we should 
> remove the redundant duplicate call.
> I've tested this in my environment and it works well. In the concurrent 
> scenario, each query now takes about 12s (3s saved by the improvement).




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3210/




[GitHub] carbondata pull request #1199: [CARBONDATA-1335] Remove duplicated time-cons...

2017-07-27 Thread xuchuanyin

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1199#discussion_r129767934
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonSessionState.scala
 ---
@@ -106,7 +109,6 @@ class CarbonSessionCatalog(
 }
   case _ =>
 }
-super.lookupRelation(name, alias)
--- End diff --

This PR mainly focuses on removing this redundant method call.
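
For illustration only, here is a minimal sketch of the pattern the diff is aiming for: call the expensive parent lookup once, keep the result, and return it rather than calling the parent a second time. `BaseCatalog`, `SingleLookupCatalog`, the `lookups` counter and the string return type are hypothetical stand-ins invented for this sketch; they are not the real Spark `SessionCatalog` or `CarbonSessionCatalog` APIs, and the actual change in the PR may be structured differently.

```scala
// Hypothetical stand-in for the parent catalog; not a Spark class.
class BaseCatalog {
  // Counts how many times the (assumed expensive) lookup has run.
  var lookups: Int = 0

  def lookupRelation(name: String): String = {
    lookups += 1
    s"relation($name)"
  }
}

// Hypothetical stand-in for the overriding catalog; not CarbonData code.
class SingleLookupCatalog extends BaseCatalog {
  override def lookupRelation(name: String): String = {
    // Run the parent lookup once and keep the result.
    val relation = super.lookupRelation(name)
    // Any inspection or refresh logic works on the stored value ...
    relation match {
      case r if r.isEmpty => () // placeholder branch, nothing to do
      case _ => ()
    }
    // ... and the stored value is returned, with no second super call.
    relation
  }
}
```

With these stand-ins, `new SingleLookupCatalog().lookupRelation("t1")` increments `lookups` exactly once per call, which mirrors the one-lookup-per-query behavior described in CARBONDATA-1335.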



[GitHub] carbondata issue #1199: [CARBONDATA-1335] Remove duplicated time-consuming m...

2017-07-27 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1199
  
Can one of the admins verify this patch?



[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/615/




[GitHub] carbondata pull request #1199: [CARBONDATA-1335] Remove duplicated time-cons...

2017-07-27 Thread xuchuanyin

GitHub user xuchuanyin opened a pull request:

https://github.com/apache/carbondata/pull/1199

[CARBONDATA-1335] Remove duplicated time-consuming method call

# Scenario

We ran 14 concurrent queries on CarbonData. The queries were the same, 
but ran on different tables. We observed the following:

+ A single query took about 5s;
+ In the concurrent scenario, each query took about 15s.

By adding checkpoints to the log, we found a large latency in starting 
query jobs in Spark.

# Analysis

When we fire a query, CarbonData first does some work on the client side, 
including parsing/analyzing plans and preparing filtered blocks and inputSplits. 
Then CarbonData submits the query job to Spark.

We found that the first step took about 7s in the concurrent scenario, 
but less than 1s in the single-query scenario.
By studying the related code, we found that the most time-consuming method 
call was `CarbonSessionCatalog.lookupRelation`. Inside this method, 
`super.lookupRelation` was called twice, and each call took about 3s.

# Solution

CarbonData needs to call `super.lookupRelation` only once, so we should 
remove the redundant duplicate call.

I've tested this in my environment and it works well. In the concurrent 
scenario, each query now takes about 12s (the improvement saves 3s).
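
The change can be sketched as follows. This is only an illustration in Java (the real code is Scala in `CarbonSessionState.scala`); `BaseCatalog`, `DoubleLookupCatalog`, `SingleLookupCatalog`, and `LookupDemo` are hypothetical stand-ins, and the assumed shape (inspect the first result, then return it) is inferred from the description above, not copied from the patch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the parent catalog; not a Spark or CarbonData class.
class BaseCatalog {
  final AtomicInteger lookups = new AtomicInteger();

  // Stands in for the expensive lookup; here it only counts invocations.
  String lookupRelation(String name) {
    lookups.incrementAndGet();
    return "relation:" + name;
  }
}

// Before: the first result is inspected, then the parent is called a second time.
final class DoubleLookupCatalog extends BaseCatalog {
  @Override
  String lookupRelation(String name) {
    String relation = super.lookupRelation(name);
    inspect(relation);
    return super.lookupRelation(name); // redundant second call
  }

  private void inspect(String relation) {
    // placeholder for whatever check is done on the looked-up relation
  }
}

// After: the first result is reused as the return value.
final class SingleLookupCatalog extends BaseCatalog {
  @Override
  String lookupRelation(String name) {
    String relation = super.lookupRelation(name);
    inspect(relation);
    return relation;
  }

  private void inspect(String relation) {
    // placeholder for whatever check is done on the looked-up relation
  }
}

final class LookupDemo {
  public static void main(String[] args) {
    DoubleLookupCatalog before = new DoubleLookupCatalog();
    SingleLookupCatalog after = new SingleLookupCatalog();
    before.lookupRelation("t1");
    after.lookupRelation("t1");
    System.out.println("parent calls before: " + before.lookups.get());
    System.out.println("parent calls after:  " + after.lookups.get());
  }
}
```

Reusing the first result is only safe if nothing between the two calls changes what the parent would return; whether that holds in CarbonData is for the reviewers to confirm.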

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuchuanyin/carbondata remove_duplicated_lookup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1199.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1199


commit d05d1fadbe0f773a00ff1d0e96ff9fe90b7b7f06
Author: xuchuanyin 
Date:   2017-07-27T07:07:25Z

Remove duplicated time-consuming method call





[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-27 Thread xuchuanyin

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
All review comments have been resolved.



[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-27 Thread xuchuanyin

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129765977
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -145,21 +146,31 @@ public static void renameBadRecordsFromInProgressToNormal(
   /**
* This method will be used to delete sort temp location is it is exites
*/
-  public static void deleteSortLocationIfExists(String tempFileLocation) {
-// create new temp file location where this class
-//will write all the temp files
-File file = new File(tempFileLocation);
-
-if (file.exists()) {
-  try {
-CarbonUtil.deleteFoldersAndFiles(file);
-  } catch (IOException | InterruptedException e) {
-LOGGER.error(e);
+  public static void deleteSortLocationIfExists(String[] locations) {
+for (String loc : locations) {
+  File file = new File(loc);
+  if (file.exists()) {
+try {
+  CarbonUtil.deleteFoldersAndFiles(file);
+} catch (IOException | InterruptedException e) {
+  LOGGER.error(e, "Failed to delete " + loc);
+}
   }
 }
   }
 
   /**
+   * This method will be used to create dirs
+   * @param locations locations to create
+   */
+  public static void createLocations(String[] locations) {
+for (String loc : locations) {
+  if (new File(loc).mkdirs()) {
--- End diff --

:+1: nice



[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-27 Thread xuchuanyin

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129765796
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1296,6 +1296,18 @@
   public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
   "carbon.lease.recovery.retry.interval";
 
+  /**
+   * whether to use multi directories when loading data,
+   * the main purpose is to avoid single-disk-hot-spot
+   */
+  @CarbonProperty
+  public static final String CARBON_USE_MULTI_TEMP_DIR = "carbon.use.multiple.temp.dir";
+
+  /**
+   * default value for multi temp dir
+   */
+  public static final String CARBON_USING_MULTI_TEMP_DIR_DEFAULT = "false";
--- End diff --

:+1: fixed



[jira] [Created] (CARBONDATA-1335) Duplicated & time-consuming method call found in query

2017-07-27 Thread xuchuanyin (JIRA)

xuchuanyin created CARBONDATA-1335:
--

 Summary: Duplicated & time-consuming method call found in query
 Key: CARBONDATA-1335
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1335
 Project: CarbonData
  Issue Type: Improvement
  Components: data-query
Affects Versions: 1.1.1
Reporter: xuchuanyin
Priority: Minor


# Scenario

We ran 14 concurrent queries on CarbonData. The queries were the same, but 
ran on different tables. We observed the following:

+ A single query took about 5s;
+ In the concurrent scenario, each query took about 15s.

By adding checkpoints to the log, we found a large latency in starting 
query jobs in Spark.

# Analysis

When we fire a query, CarbonData first does some work on the client side, 
including parsing/analyzing plans and preparing filtered blocks and inputSplits. 
Then CarbonData submits the query job to Spark.

We found that the first step took about 7s in the concurrent scenario, but 
less than 1s in the single-query scenario.
By studying the related code, we found that the most time-consuming method call 
was `CarbonSessionCatalog.lookupRelation`. Inside this method, 
`super.lookupRelation` was called twice, and each call took about 3s.

# Solution

CarbonData needs to call `super.lookupRelation` only once, so we should 
remove the redundant duplicate call.

I've tested this in my environment and it works well. In the concurrent 
scenario, each query now takes about 12s (the improvement saves 3s).




[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3209/




[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

2017-07-27 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/614/




[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-27 Thread sraghunandan

Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129753971
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -145,21 +146,31 @@ public static void renameBadRecordsFromInProgressToNormal(
   /**
* This method will be used to delete sort temp location is it is exites
*/
-  public static void deleteSortLocationIfExists(String tempFileLocation) {
-// create new temp file location where this class
-//will write all the temp files
-File file = new File(tempFileLocation);
-
-if (file.exists()) {
-  try {
-CarbonUtil.deleteFoldersAndFiles(file);
-  } catch (IOException | InterruptedException e) {
-LOGGER.error(e);
+  public static void deleteSortLocationIfExists(String[] locations) {
+for (String loc : locations) {
+  File file = new File(loc);
+  if (file.exists()) {
+try {
+  CarbonUtil.deleteFoldersAndFiles(file);
+} catch (IOException | InterruptedException e) {
+  LOGGER.error(e, "Failed to delete " + loc);
+}
   }
 }
   }
 
   /**
+   * This method will be used to create dirs
+   * @param locations locations to create
+   */
+  public static void createLocations(String[] locations) {
+for (String loc : locations) {
+  if (new File(loc).mkdirs()) {
--- End diff --

Should this not be `!new File(loc).mkdirs()`?
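
For context, `File.mkdirs()` returns `true` only when the directory was actually created, so a failure branch needs the negated result. A minimal sketch of that check, assuming the branch is meant to handle failures (the diff is cut off before the body of the `if`); `TempDirs` and the returned list are hypothetical and not part of the PR:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the CarbonDataProcessorUtil method from the PR.
final class TempDirs {
  /** Tries to create every location and returns the ones that could not be created. */
  static List<String> createLocations(String[] locations) {
    List<String> failed = new ArrayList<>();
    for (String loc : locations) {
      File dir = new File(loc);
      // mkdirs() also returns false when the directory already exists,
      // so an existing directory is not counted as a failure here.
      if (!dir.mkdirs() && !dir.isDirectory()) {
        failed.add(loc);
      }
    }
    return failed;
  }

  public static void main(String[] args) {
    String base = System.getProperty("java.io.tmpdir") + File.separator
        + "tempdirs-demo-" + System.nanoTime();
    String[] locations = { base + File.separator + "a", base + File.separator + "b" };
    System.out.println("failed: " + createLocations(locations));
  }
}
```

The extra `isDirectory()` check is a design choice of this sketch: it makes the call idempotent when the temp directories survive from an earlier load.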



[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-27 Thread sraghunandan

Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129753676
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1296,6 +1296,18 @@
   public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
   "carbon.lease.recovery.retry.interval";
 
+  /**
+   * whether to use multi directories when loading data,
+   * the main purpose is to avoid single-disk-hot-spot
+   */
+  @CarbonProperty
+  public static final String CARBON_USE_MULTI_TEMP_DIR = 
"carbon.use.multiple.temp.dir";
+
+  /**
+   * default value for multi temp dir
+   */
+  public static final String CARBON_USING_MULTI_TEMP_DIR_DEFAULT = "false";
--- End diff --

Please change this to match the configuration above.
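
One way to read this comment is that the default constant should share its prefix with the key constant. The sketch below assumes that reading; `CARBON_USE_MULTI_TEMP_DIR_DEFAULT`, `MultiTempDirConfig`, and the use of `java.util.Properties` are illustrative assumptions, not the names or the property mechanism used in CarbonData.

```java
import java.util.Properties;

// Hypothetical holder class; only the key string and the "false" default come from the diff.
final class MultiTempDirConfig {
  static final String CARBON_USE_MULTI_TEMP_DIR = "carbon.use.multiple.temp.dir";
  // Assumed rename so that the default shares the key constant's prefix.
  static final String CARBON_USE_MULTI_TEMP_DIR_DEFAULT = "false";

  /** Reads the flag, falling back to the default when the key is absent. */
  static boolean isEnabled(Properties props) {
    return Boolean.parseBoolean(
        props.getProperty(CARBON_USE_MULTI_TEMP_DIR, CARBON_USE_MULTI_TEMP_DIR_DEFAULT));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println("default: " + isEnabled(props));
    props.setProperty(CARBON_USE_MULTI_TEMP_DIR, "true");
    System.out.println("enabled: " + isEnabled(props));
  }
}
```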



[jira] [Created] (CARBONDATA-1334) Delete Operation Hung in large dataset

2017-07-27 Thread sounak chakraborty (JIRA)

sounak chakraborty created CARBONDATA-1334:
--

 Summary: Delete Operation Hung in large dataset
 Key: CARBONDATA-1334
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1334
 Project: CarbonData
  Issue Type: Bug
Reporter: sounak chakraborty


The delete operation hangs on large datasets. Because of a wrong equals check in 
DeleteDeltaBlockletDetails.java, multiple DeleteDeltaBlockDetails objects are 
created (almost one object per delete offset). With this many objects, the 
search cost became very high, which caused the hang.
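
The effect of a correct equality check can be sketched as below. This is not the CarbonData code: `BlockletDeletes`, `DeleteDeltaIndex`, and the `pageId`/`blockletId` fields are hypothetical, and the map-based lookup is a technique swapped in for illustration. The point is only that when `equals` and `hashCode` identify records for the same blocklet, every delete offset is appended to one object rather than creating a new one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical per-blocklet record; identity is (pageId, blockletId) only.
final class BlockletDeletes {
  final int pageId;
  final int blockletId;
  final List<Integer> deletedRows = new ArrayList<>();

  BlockletDeletes(int pageId, int blockletId) {
    this.pageId = pageId;
    this.blockletId = blockletId;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof BlockletDeletes)) {
      return false;
    }
    BlockletDeletes other = (BlockletDeletes) o;
    return pageId == other.pageId && blockletId == other.blockletId;
  }

  @Override
  public int hashCode() {
    return Objects.hash(pageId, blockletId);
  }
}

// Hypothetical index that keeps one record per blocklet.
final class DeleteDeltaIndex {
  private final Map<BlockletDeletes, BlockletDeletes> byKey = new HashMap<>();

  void addDeletedRow(int pageId, int blockletId, int row) {
    BlockletDeletes key = new BlockletDeletes(pageId, blockletId);
    // Reuse the existing record for this blocklet if there is one.
    byKey.computeIfAbsent(key, k -> k).deletedRows.add(row);
  }

  List<Integer> rowsFor(int pageId, int blockletId) {
    BlockletDeletes found = byKey.get(new BlockletDeletes(pageId, blockletId));
    return found == null ? Collections.<Integer>emptyList() : found.deletedRows;
  }

  int size() {
    return byKey.size();
  }
}
```

If `equals` compared the wrong fields (or fell back to reference equality), each call to `addDeletedRow` would register a new record, which matches the "one object per delete offset" symptom described above.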



