[GitHub] carbondata issue #2974: [CARBONDATA-2563][CATALYST] Explain query with Order...

2018-12-10 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2974
  
LGTM


---


[GitHub] carbondata issue #2982: [CARBONDATA-3158] support presto-carbon to read sdk ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2982
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9957/



---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1699/



---


[GitHub] carbondata issue #2982: [CARBONDATA-3158] support presto-carbon to read sdk ...

2018-12-10 Thread ajantha-bhat
Github user ajantha-bhat commented on the issue:

https://github.com/apache/carbondata/pull/2982
  
@ravipesala, @jackylk: the PR is ready, please review.


---


[GitHub] carbondata issue #2971: [TEST] Test loading performance with range_column

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2971
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1698/



---


[GitHub] carbondata issue #2621: [CARBONDATA-2840] Added SDV testcases for Complex Da...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2621
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9955/



---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
retest this please


---


[GitHub] carbondata issue #2982: [CARBONDATA-3158] support presto-carbon to read sdk ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2982
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1907/



---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9956/



---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1906/



---


[GitHub] carbondata issue #2621: [CARBONDATA-2840] Added SDV testcases for Complex Da...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2621
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1905/



---


[GitHub] carbondata pull request #2621: [CARBONDATA-2840] Added SDV testcases for Com...

2018-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2621


---


[jira] [Resolved] (CARBONDATA-3141) Remove Carbon Table Detail Test Case

2018-12-10 Thread Kunal Kapoor (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3141.
--
   Resolution: Fixed
Fix Version/s: 1.5.2

> Remove Carbon Table Detail Test Case
> 
>
> Key: CARBONDATA-3141
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3141
> Project: CarbonData
>  Issue Type: Test
>Reporter: Praveen M P
>Assignee: Praveen M P
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2838) Add SDV test cases for Local Dictionary Support

2018-12-10 Thread Kunal Kapoor (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-2838.
--
   Resolution: Fixed
Fix Version/s: 1.5.2

> Add SDV test cases for Local Dictionary Support
> ---
>
> Key: CARBONDATA-2838
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2838
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Praveen M P
>Assignee: Praveen M P
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>






[GitHub] carbondata pull request #2617: [CARBONDATA-2838] Added SDV test cases for Lo...

2018-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2617


---


[GitHub] carbondata issue #2617: [CARBONDATA-2838] Added SDV test cases for Local Dic...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2617
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9954/



---


[GitHub] carbondata issue #2982: [CARBONDATA-3158] support presto-carbon to read sdk ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2982
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1697/



---


[GitHub] carbondata pull request #2968: [CARBONDATA-3141] Removed Carbon Table Detail...

2018-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2968


---


[GitHub] carbondata issue #2969: [CARBONDATA-3127]Fix the TestCarbonSerde exception

2018-12-10 Thread SteNicholas
Github user SteNicholas commented on the issue:

https://github.com/apache/carbondata/pull/2969
  
@xubo245 Please review this request again, including the improvements to the 
comments referred to by @xuchuanyin.


---


[GitHub] carbondata issue #2968: [CARBONDATA-3141] Removed Carbon Table Detail Comman...

2018-12-10 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/2968
  
LGTM


---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1696/



---


[GitHub] carbondata issue #2617: [CARBONDATA-2838] Added SDV test cases for Local Dic...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2617
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1903/



---


[GitHub] carbondata issue #2621: [CARBONDATA-2840] Added SDV testcases for Complex Da...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2621
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1695/



---


[GitHub] carbondata pull request #2982: [CARBONDATA-3158] support presto-carbon to re...

2018-12-10 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2982#discussion_r240477368
  
--- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java ---
@@ -364,23 +355,38 @@ private CarbonTable parseCarbonMetadata(SchemaTableName table) {
       String tablePath = storePath + "/" + carbonTableIdentifier.getDatabaseName() + "/"
           + carbonTableIdentifier.getTableName();
 
-      //Step 2: read the metadata (tableInfo) of the table.
-      ThriftReader.TBaseCreator createTBase = new ThriftReader.TBaseCreator() {
-        // TBase is used to read and write thrift objects.
-        // TableInfo is a kind of TBase used to read and write table information.
-        // TableInfo is generated by thrift,
-        // see schema.thrift under format/src/main/thrift for details.
-        public TBase create() {
-          return new org.apache.carbondata.format.TableInfo();
+      String metadataPath = CarbonTablePath.getSchemaFilePath(tablePath);
+      boolean isTransactionalTable = false;
+      try {
+        if (FileFactory.getCarbonFile(metadataPath)
--- End diff --

hmm. ok.


---


[GitHub] carbondata issue #2621: [CARBONDATA-2840] Added SDV testcases for Complex Da...

2018-12-10 Thread brijoobopanna
Github user brijoobopanna commented on the issue:

https://github.com/apache/carbondata/pull/2621
  
retest this please



---


[GitHub] carbondata pull request #2982: [CARBONDATA-3158] support presto-carbon to re...

2018-12-10 Thread ajantha-bhat
Github user ajantha-bhat commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2982#discussion_r240475287
  
--- Diff: integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java ---
@@ -364,23 +355,38 @@ private CarbonTable parseCarbonMetadata(SchemaTableName table) {
       String tablePath = storePath + "/" + carbonTableIdentifier.getDatabaseName() + "/"
           + carbonTableIdentifier.getTableName();
 
-      //Step 2: read the metadata (tableInfo) of the table.
-      ThriftReader.TBaseCreator createTBase = new ThriftReader.TBaseCreator() {
-        // TBase is used to read and write thrift objects.
-        // TableInfo is a kind of TBase used to read and write table information.
-        // TableInfo is generated by thrift,
-        // see schema.thrift under format/src/main/thrift for details.
-        public TBase create() {
-          return new org.apache.carbondata.format.TableInfo();
+      String metadataPath = CarbonTablePath.getSchemaFilePath(tablePath);
+      boolean isTransactionalTable = false;
+      try {
+        if (FileFactory.getCarbonFile(metadataPath)
+            .isFileExist(metadataPath, FileFactory.getFileType(metadataPath))) {
+          // If metadata folder exists, it is a transactional table
+          isTransactionalTable = true;
         }
-      };
-      ThriftReader thriftReader =
-          new ThriftReader(CarbonTablePath.getSchemaFilePath(tablePath), createTBase);
-      thriftReader.open();
-      org.apache.carbondata.format.TableInfo tableInfo =
-          (org.apache.carbondata.format.TableInfo) thriftReader.read();
-      thriftReader.close();
-
+      } catch (IOException e) {
+        throw new RuntimeException(e);
+      }
+      org.apache.carbondata.format.TableInfo tableInfo;
+      if (isTransactionalTable) {
+        //Step 2: read the metadata (tableInfo) of the table.
+        ThriftReader.TBaseCreator createTBase = new ThriftReader.TBaseCreator() {
+          // TBase is used to read and write thrift objects.
+          // TableInfo is a kind of TBase used to read and write table information.
+          // TableInfo is generated by thrift,
+          // see schema.thrift under format/src/main/thrift for details.
+          public TBase create() {
+            return new org.apache.carbondata.format.TableInfo();
+          }
+        };
+        ThriftReader thriftReader =
+            new ThriftReader(CarbonTablePath.getSchemaFilePath(tablePath), createTBase);
+        thriftReader.open();
+        tableInfo = (org.apache.carbondata.format.TableInfo) thriftReader.read();
+        thriftReader.close();
+      } else {
+        tableInfo =
+            CarbonUtil.inferSchema(tablePath, table.getTableName(), false, new Configuration());
--- End diff --

I have tested it and it works, but it is better to use 
FileFactory.getConfiguration(). I will change to it.
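The fallback under discussion can be sketched in isolation: check whether the schema file exists under the table path; if it does, treat the table as transactional and read the serialized schema, otherwise infer the schema from the data files. This is a standalone illustration only — `FileFactory`, `ThriftReader`, and `CarbonUtil.inferSchema` are CarbonData classes, so the sketch below substitutes plain `java.nio.file` for the existence check and stubs for the two read paths, and the `Metadata/schema` layout is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Standalone sketch of the branch discussed above (not CarbonData code):
// schema file present -> transactional table, read the stored schema;
// schema file absent  -> infer the schema from the data files.
public class SchemaLoadSketch {

  static String loadSchema(Path tablePath) {
    // Assumed layout: <tablePath>/Metadata/schema
    Path schemaFile = tablePath.resolve("Metadata").resolve("schema");
    if (Files.exists(schemaFile)) {
      return "read-thrift:" + schemaFile;  // stands in for the ThriftReader path
    }
    return "infer:" + tablePath;           // stands in for CarbonUtil.inferSchema
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempDirectory("tbl");
    // No Metadata/schema yet -> inference branch.
    assert loadSchema(tmp).startsWith("infer:");
    Files.createDirectories(tmp.resolve("Metadata"));
    Files.createFile(tmp.resolve("Metadata").resolve("schema"));
    // Schema file present -> transactional branch.
    assert loadSchema(tmp).startsWith("read-thrift:");
  }
}
```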


---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1693/



---


[GitHub] carbondata issue #2617: [CARBONDATA-2838] Added SDV test cases for Local Dic...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2617
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1694/



---


[GitHub] carbondata pull request #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread kunal642
Github user kunal642 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2977#discussion_r240473257
  
--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateListeners.scala ---
@@ -111,6 +113,29 @@ trait CommitHelper {
 }
   }
 
+  def mergeTableStatusContents(uuidTableStatusPath: String,
+  tableStatusPath: String): Boolean = {
+try {
--- End diff --

moved lock acquiring inside this method


---


[GitHub] carbondata pull request #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread kunal642
Github user kunal642 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2977#discussion_r240473230
  
--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateListeners.scala ---
@@ -111,6 +113,29 @@ trait CommitHelper {
 }
   }
 
+  protected def mergeTableStatusContents(uuidTableStatusPath: String,
--- End diff --

done


---


[GitHub] carbondata issue #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1902/



---


[GitHub] carbondata issue #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9952/



---


[GitHub] carbondata issue #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1692/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1901/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9951/



---


[GitHub] carbondata pull request #2970: [CARBONDATA-3142]Add timestamp with thread na...

2018-12-10 Thread qiuchenjian
Github user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2970#discussion_r240447425
  
--- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonThreadFactory.java ---
@@ -34,14 +34,26 @@
*/
   private String name;
 
+  private boolean withTime = false;
+
   public CarbonThreadFactory(String name) {
 this.defaultFactory = Executors.defaultThreadFactory();
 this.name = name;
   }
 
+  public CarbonThreadFactory(String name, boolean withTime) {
+this(name);
+this.withTime = withTime;
+  }
+
   @Override public Thread newThread(Runnable r) {
 final Thread thread = defaultFactory.newThread(r);
-thread.setName(name);
+if (withTime) {
+  thread.setName(name + "_" + System.currentTimeMillis());
--- End diff --

@xubo245  
The timestamp is different for different threads, because milliseconds are 
fine-grained and a thread pool normally creates threads one by one.
A timestamp is more useful than a UUID: it indicates the creation time.
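The trade-off debated above can be illustrated without taking either side: a millisecond timestamp records creation time but can collide for threads created back-to-back, while a per-factory counter is always unique and still cheap. The sketch below is not CarbonData's `CarbonThreadFactory` — it is a hypothetical factory combining both suffixes.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of the naming discussion above: keep the
// timestamp for "when was this thread created", and add a monotonic
// counter so two threads created in the same millisecond never collide.
public class NamedThreadFactoryDemo implements ThreadFactory {
  private final String name;
  private final AtomicLong seq = new AtomicLong();

  public NamedThreadFactoryDemo(String name) {
    this.name = name;
  }

  @Override
  public Thread newThread(Runnable r) {
    Thread t = new Thread(r);
    // e.g. "ProducerPool_1544400000000_1"
    t.setName(name + "_" + System.currentTimeMillis() + "_" + seq.incrementAndGet());
    return t;
  }

  public static void main(String[] args) {
    ThreadFactory f = new NamedThreadFactoryDemo("ProducerPool");
    Thread a = f.newThread(() -> { });
    Thread b = f.newThread(() -> { });
    // Names differ even when both threads are created in the same millisecond.
    assert !a.getName().equals(b.getName());
  }
}
```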


---


[GitHub] carbondata issue #2970: [CARBONDATA-3142]Add timestamp with thread name whic...

2018-12-10 Thread qiuchenjian
Github user qiuchenjian commented on the issue:

https://github.com/apache/carbondata/pull/2970
  
> @qiuchenjian the checklist should be selected correctly; you can refer 
to #2981 or another PR

done


---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1691/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
retest this please


---


[GitHub] carbondata issue #2970: [CARBONDATA-3142]Add timestamp with thread name whic...

2018-12-10 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2970
  
 @qiuchenjian the checklist should be selected correctly; you can refer to 
https://github.com/apache/carbondata/pull/2981


---


[GitHub] carbondata pull request #2970: [CARBONDATA-3142]Add timestamp with thread na...

2018-12-10 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2970#discussion_r240444131
  
--- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonThreadFactory.java ---
@@ -34,14 +34,26 @@
*/
   private String name;
 
+  private boolean withTime = false;
+
   public CarbonThreadFactory(String name) {
 this.defaultFactory = Executors.defaultThreadFactory();
 this.name = name;
   }
 
+  public CarbonThreadFactory(String name, boolean withTime) {
+this(name);
+this.withTime = withTime;
+  }
+
   @Override public Thread newThread(Runnable r) {
 final Thread thread = defaultFactory.newThread(r);
-thread.setName(name);
+if (withTime) {
+  thread.setName(name + "_" + System.currentTimeMillis());
--- End diff --

The timestamp may be the same for different newThread calls; a UUID is better.


---


[GitHub] carbondata issue #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9950/



---


[GitHub] carbondata issue #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1900/



---


[GitHub] carbondata issue #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1690/



---


[GitHub] carbondata issue #2847: [CARBONDATA-3005]Support Gzip as column compressor

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2847
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1898/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1899/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9948/



---


[GitHub] carbondata issue #2847: [CARBONDATA-3005]Support Gzip as column compressor

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2847
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9947/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9944/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1896/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1688/



---


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread shardul-cr7
Github user shardul-cr7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240247558
  
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataWithCompression.scala ---
@@ -252,50 +253,94 @@ class TestLoadDataWithCompression extends QueryTest with BeforeAndAfterEach with
         """.stripMargin)
   }
 
-  test("test data loading with snappy compressor and offheap") {
+  test("test data loading with different compressors and offheap") {
+    for(comp <- compressors){
+      CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_OFFHEAP_SORT, "true")
--- End diff --

By default for gzip/zstd, it's false. So UT for this scenario is not 
required.


---


[GitHub] carbondata issue #2847: [CARBONDATA-3005]Support Gzip as column compressor

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2847
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1687/



---


[GitHub] carbondata issue #2975: [CARBONDATA-3145] Avoid duplicate decoding for compl...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2975
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1895/



---


[GitHub] carbondata issue #2975: [CARBONDATA-3145] Avoid duplicate decoding for compl...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2975
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9943/



---


[GitHub] carbondata issue #2847: [CARBONDATA-3005]Support Gzip as column compressor

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2847
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1686/



---


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread shardul-cr7
Github user shardul-cr7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240236819
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/compression/GzipCompressor.java ---
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.DoubleBuffer;
+import java.nio.FloatBuffer;
+import java.nio.IntBuffer;
+import java.nio.LongBuffer;
+import java.nio.ShortBuffer;
+
+import org.apache.carbondata.core.util.ByteUtil;
+
+import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
+import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
+
+public class GzipCompressor implements Compressor {
+
+  public GzipCompressor() {
+  }
+
+  @Override public String getName() {
+return "gzip";
+  }
+
+  /*
+   * Method called for compressing the data and
+   * return a byte array
+   */
+  private byte[] compressData(byte[] data) {
+
+ByteArrayOutputStream bt = new ByteArrayOutputStream();
+try {
+  GzipCompressorOutputStream gzos = new GzipCompressorOutputStream(bt);
+  try {
+gzos.write(data);
+  } catch (IOException e) {
+e.printStackTrace();
+  } finally {
+gzos.close();
+  }
+} catch (IOException e) {
+  e.printStackTrace();
+}
+
+return bt.toByteArray();
+  }
+
+  /*
+   * Method called for decompressing the data and
+   * return a byte array
+   */
+  private byte[] decompressData(byte[] data) {
+
+ByteArrayInputStream bt = new ByteArrayInputStream(data);
+ByteArrayOutputStream bot = new ByteArrayOutputStream();
+
+try {
+  GzipCompressorInputStream gzis = new GzipCompressorInputStream(bt);
+  byte[] buffer = new byte[1024];
+  int len;
+
+  while ((len = gzis.read(buffer)) != -1) {
+bot.write(buffer, 0, len);
+  }
+
+} catch (IOException e) {
+  e.printStackTrace();
+}
+
+return bot.toByteArray();
--- End diff --

Similar to ByteArrayOutputStream.close() reason mentioned above.
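The compress/decompress round-trip under review can be sketched independently of the PR. This is a hedged illustration, not the PR's code: it uses the JDK's `java.util.zip` streams instead of commons-compress, try-with-resources instead of manual `close()`, and propagates `IOException` (wrapped as `UncheckedIOException`) rather than swallowing it with `printStackTrace()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Standalone sketch of a gzip byte[] round-trip (JDK streams, not
// commons-compress). Errors are propagated, not printed and ignored.
public class GzipRoundTrip {

  static byte[] compress(byte[] data) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // try-with-resources closes (and thus finishes) the gzip stream
    // before toByteArray() is called, so the trailer is always written.
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  static byte[] decompress(byte[] data) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
      byte[] buf = new byte[1024];
      int len;
      while ((len = gz.read(buf)) != -1) {
        bos.write(buf, 0, len);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  public static void main(String[] args) {
    byte[] original = "carbondata column page".getBytes(StandardCharsets.UTF_8);
    assert Arrays.equals(original, decompress(compress(original)));
  }
}
```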


---


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread shardul-cr7
Github user shardul-cr7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240236269
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/compression/GzipCompressor.java ---
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.DoubleBuffer;
+import java.nio.FloatBuffer;
+import java.nio.IntBuffer;
+import java.nio.LongBuffer;
+import java.nio.ShortBuffer;
+
+import org.apache.carbondata.core.util.ByteUtil;
+
+import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
+import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
+
+public class GzipCompressor implements Compressor {
+
+  public GzipCompressor() {
+  }
+
+  @Override public String getName() {
+return "gzip";
+  }
+
+  /*
+   * Method called for compressing the data and
+   * return a byte array
+   */
+  private byte[] compressData(byte[] data) {
+
+ByteArrayOutputStream bt = new ByteArrayOutputStream();
+try {
+  GzipCompressorOutputStream gzos = new GzipCompressorOutputStream(bt);
+  try {
+gzos.write(data);
+  } catch (IOException e) {
+e.printStackTrace();
+  } finally {
+gzos.close();
+  }
+} catch (IOException e) {
+  e.printStackTrace();
+}
+
+return bt.toByteArray();
+  }
+
+  /*
+   * Method called for decompressing the data and
+   * return a byte array
+   */
+  private byte[] decompressData(byte[] data) {
+
+ByteArrayInputStream bt = new ByteArrayInputStream(data);
+ByteArrayOutputStream bot = new ByteArrayOutputStream();
+
+try {
+  GzipCompressorInputStream gzis = new GzipCompressorInputStream(bt);
+  byte[] buffer = new byte[1024];
+  int len;
+
+  while ((len = gzis.read(buffer)) != -1) {
+bot.write(buffer, 0, len);
+  }
+
+} catch (IOException e) {
+  e.printStackTrace();
+}
+
+return bot.toByteArray();
+  }
+
+  @Override public byte[] compressByte(byte[] unCompInput) {
+return compressData(unCompInput);
+  }
+
+  @Override public byte[] compressByte(byte[] unCompInput, int byteSize) {
+return compressData(unCompInput);
+  }
+
+  @Override public byte[] unCompressByte(byte[] compInput) {
+return decompressData(compInput);
+  }
+
+  @Override public byte[] unCompressByte(byte[] compInput, int offset, int length) {
+    byte[] data = new byte[length];
+    System.arraycopy(compInput, offset, data, 0, length);
+    return decompressData(data);
+  }
+
+  @Override public byte[] compressShort(short[] unCompInput) {
+    ByteBuffer unCompBuffer = ByteBuffer.allocate(unCompInput.length * ByteUtil.SIZEOF_SHORT);
+    unCompBuffer.asShortBuffer().put(unCompInput);
+    return compressData(unCompBuffer.array());
+  }
+
+  @Override public short[] unCompressShort(byte[] compInput, int offset, int length) {
+    byte[] unCompArray = unCompressByte(compInput, offset, length);
+    ShortBuffer unCompBuffer = ByteBuffer.wrap(unCompArray).asShortBuffer();
+    short[] shorts = new short[unCompArray.length / ByteUtil.SIZEOF_SHORT];
+    unCompBuffer.get(shorts);
+    return shorts;
+  }
+
+  @Override public byte[] compressInt(int[] unCompInput) {
+    ByteBuffer unCompBuffer = ByteBuffer.allocate(unCompInput.length * ByteUtil.SIZEOF_INT);
+    unCompBuffer.asIntBuffer().put(unCompInput);
+    return compressData(unCompBuffer.array());
+  }
+
+  @Override public int[] unCompressInt(byte[] 

[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread shardul-cr7
Github user shardul-cr7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240236381
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/compression/GzipCompressor.java
 ---
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.DoubleBuffer;
+import java.nio.FloatBuffer;
+import java.nio.IntBuffer;
+import java.nio.LongBuffer;
+import java.nio.ShortBuffer;
+
+import org.apache.carbondata.core.util.ByteUtil;
+
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
+
+public class GzipCompressor implements Compressor {
+
+  public GzipCompressor() {
+  }
+
+  @Override public String getName() {
+return "gzip";
+  }
+
+  /*
+   * Method called for compressing the data and
+   * return a byte array
+   */
+  private byte[] compressData(byte[] data) {
+
+ByteArrayOutputStream bt = new ByteArrayOutputStream();
+try {
+  GzipCompressorOutputStream gzos = new GzipCompressorOutputStream(bt);
+  try {
+gzos.write(data);
+  } catch (IOException e) {
+e.printStackTrace();
+  } finally {
+gzos.close();
+  }
+} catch (IOException e) {
+  e.printStackTrace();
--- End diff --

Done.


---
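The compressData/decompressData pair quoted in the diff above can be sketched with the JDK's built-in java.util.zip streams. This is a stand-in sketch, not the PR's code: the PR uses commons-compress GzipCompressorInputStream/GzipCompressorOutputStream, and here IOExceptions are propagated instead of being swallowed by printStackTrace, which was one of the review points.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {

  // Compress a byte array with gzip; the try-with-resources close()
  // finishes the gzip trailer before toByteArray() is read.
  static byte[] compress(byte[] data) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gzos = new GZIPOutputStream(bos)) {
      gzos.write(data);
    } catch (IOException e) {
      // propagate instead of swallowing with printStackTrace
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  // Decompress gzip-compressed bytes using a fixed-size read buffer,
  // mirroring the 1024-byte buffer loop in the quoted diff.
  static byte[] decompress(byte[] data) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPInputStream gzis = new GZIPInputStream(new ByteArrayInputStream(data))) {
      byte[] buffer = new byte[1024];
      int len;
      while ((len = gzis.read(buffer)) != -1) {
        bos.write(buffer, 0, len);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  public static void main(String[] args) {
    byte[] input = "carbondata gzip round trip".getBytes(java.nio.charset.StandardCharsets.UTF_8);
    if (!java.util.Arrays.equals(input, decompress(compress(input)))) {
      throw new AssertionError("round trip failed");
    }
  }
}
```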


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread shardul-cr7
Github user shardul-cr7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240236462
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/compression/GzipCompressor.java
 ---
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.DoubleBuffer;
+import java.nio.FloatBuffer;
+import java.nio.IntBuffer;
+import java.nio.LongBuffer;
+import java.nio.ShortBuffer;
+
+import org.apache.carbondata.core.util.ByteUtil;
+
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
+
+public class GzipCompressor implements Compressor {
+
+  public GzipCompressor() {
+  }
+
+  @Override public String getName() {
+return "gzip";
+  }
+
+  /*
+   * Method called for compressing the data and
+   * return a byte array
+   */
+  private byte[] compressData(byte[] data) {
+
+ByteArrayOutputStream bt = new ByteArrayOutputStream();
+try {
+  GzipCompressorOutputStream gzos = new GzipCompressorOutputStream(bt);
+  try {
+gzos.write(data);
+  } catch (IOException e) {
+e.printStackTrace();
+  } finally {
+gzos.close();
+  }
+} catch (IOException e) {
+  e.printStackTrace();
+}
+
+return bt.toByteArray();
+  }
+
+  /*
+   * Method called for decompressing the data and
+   * return a byte array
+   */
+  private byte[] decompressData(byte[] data) {
+
+ByteArrayInputStream bt = new ByteArrayInputStream(data);
+ByteArrayOutputStream bot = new ByteArrayOutputStream();
+
+try {
+  GzipCompressorInputStream gzis = new GzipCompressorInputStream(bt);
+  byte[] buffer = new byte[1024];
+  int len;
+
+  while ((len = gzis.read(buffer)) != -1) {
+bot.write(buffer, 0, len);
+  }
+
+} catch (IOException e) {
+  e.printStackTrace();
--- End diff --

Done.


---


[GitHub] carbondata pull request #2982: [CARBONDATA-3158] support presto-carbon to re...

2018-12-10 Thread qiuchenjian
Github user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2982#discussion_r240229604
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
 ---
@@ -364,23 +355,38 @@ private CarbonTable 
parseCarbonMetadata(SchemaTableName table) {
   String tablePath = storePath + "/" + 
carbonTableIdentifier.getDatabaseName() + "/"
   + carbonTableIdentifier.getTableName();
 
-  //Step 2: read the metadata (tableInfo) of the table.
-  ThriftReader.TBaseCreator createTBase = new 
ThriftReader.TBaseCreator() {
-// TBase is used to read and write thrift objects.
-// TableInfo is a kind of TBase used to read and write table 
information.
-// TableInfo is generated by thrift,
-// see schema.thrift under format/src/main/thrift for details.
-public TBase create() {
-  return new org.apache.carbondata.format.TableInfo();
+  String metadataPath = CarbonTablePath.getSchemaFilePath(tablePath);
+  boolean isTransactionalTable = false;
+  try {
+if (FileFactory.getCarbonFile(metadataPath)
--- End diff --

The getFileType operation is invoked twice: once inside getCarbonFile and once in the if(...) condition. It is better to fetch it once up front:
FileType fileType = FileFactory.getFileType(metadataPath);
if (FileFactory.getCarbonFile(metadataPath, fileType).isFileExist(metadataPath, fileType)) {...}


---


[GitHub] carbondata pull request #2982: [CARBONDATA-3158] support presto-carbon to re...

2018-12-10 Thread qiuchenjian
Github user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2982#discussion_r240233846
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
 ---
@@ -364,23 +355,38 @@ private CarbonTable 
parseCarbonMetadata(SchemaTableName table) {
   String tablePath = storePath + "/" + 
carbonTableIdentifier.getDatabaseName() + "/"
   + carbonTableIdentifier.getTableName();
 
-  //Step 2: read the metadata (tableInfo) of the table.
-  ThriftReader.TBaseCreator createTBase = new 
ThriftReader.TBaseCreator() {
-// TBase is used to read and write thrift objects.
-// TableInfo is a kind of TBase used to read and write table 
information.
-// TableInfo is generated by thrift,
-// see schema.thrift under format/src/main/thrift for details.
-public TBase create() {
-  return new org.apache.carbondata.format.TableInfo();
+  String metadataPath = CarbonTablePath.getSchemaFilePath(tablePath);
+  boolean isTransactionalTable = false;
+  try {
+if (FileFactory.getCarbonFile(metadataPath)
+.isFileExist(metadataPath, 
FileFactory.getFileType(metadataPath))) {
+  // If metadata folder exists, it is a transactional table
+  isTransactionalTable = true;
 }
-  };
-  ThriftReader thriftReader =
-  new ThriftReader(CarbonTablePath.getSchemaFilePath(tablePath), 
createTBase);
-  thriftReader.open();
-  org.apache.carbondata.format.TableInfo tableInfo =
-  (org.apache.carbondata.format.TableInfo) thriftReader.read();
-  thriftReader.close();
-
+  } catch (IOException e) {
+throw new RuntimeException(e);
+  }
+  org.apache.carbondata.format.TableInfo tableInfo;
+  if (isTransactionalTable) {
+//Step 2: read the metadata (tableInfo) of the table.
+ThriftReader.TBaseCreator createTBase = new 
ThriftReader.TBaseCreator() {
+  // TBase is used to read and write thrift objects.
+  // TableInfo is a kind of TBase used to read and write table 
information.
+  // TableInfo is generated by thrift,
+  // see schema.thrift under format/src/main/thrift for details.
+  public TBase create() {
+return new org.apache.carbondata.format.TableInfo();
+  }
+};
+ThriftReader thriftReader =
+new ThriftReader(CarbonTablePath.getSchemaFilePath(tablePath), 
createTBase);
+thriftReader.open();
+tableInfo = (org.apache.carbondata.format.TableInfo) 
thriftReader.read();
+thriftReader.close();
+  } else {
+tableInfo =
+CarbonUtil.inferSchema(tablePath, table.getTableName(), false, 
new Configuration());
--- End diff --

Has this code (tableInfo = CarbonUtil.inferSchema(tablePath, 
table.getTableName(), false, new Configuration());) been tested on HDFS? 
The FileSystem may not be created correctly from a bare "new Configuration()".


---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1897/



---


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread shardul-cr7
Github user shardul-cr7 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240227006
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/compression/GzipCompressor.java
 ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
+
+/**
+ * Codec Class for performing Gzip Compression
+ */
+public class GzipCompressor extends AbstractCompressor {
+
+  @Override public String getName() {
+return "gzip";
+  }
+
+  /**
+   * This method takes the Byte Array data and Compresses in gzip format
+   *
+   * @param data Data Byte Array passed for compression
+   * @return Compressed Byte Array
+   */
+  private byte[] compressData(byte[] data) {
+ByteArrayOutputStream byteArrayOutputStream = new 
ByteArrayOutputStream();
--- End diff --

Based on the observations, I have initialized the byteArrayOutputStream with 
a size of half the input byte buffer, which reduces the number of times the stream is resized.


---
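The sizing suggestion above can be illustrated with a minimal sketch. The half-of-input heuristic is the one described in the comment; the class and method names here are illustrative, not CarbonData's actual code.

```java
import java.io.ByteArrayOutputStream;

public class PresizedBuffer {

  // Heuristic from the review comment: start the output buffer at half the
  // input size, so gzip-compressed output usually fits without the backing
  // array having to be regrown. The exact ratio is workload-dependent.
  static ByteArrayOutputStream outputFor(byte[] input) {
    int initialCapacity = Math.max(16, input.length / 2);
    return new ByteArrayOutputStream(initialCapacity);
  }

  public static void main(String[] args) {
    byte[] input = new byte[4096];
    ByteArrayOutputStream bos = outputFor(input);
    bos.write(input, 0, input.length);
    System.out.println(bos.size()); // 4096
  }
}
```

The default ByteArrayOutputStream constructor starts at 32 bytes and doubles on overflow, so writing a large compressed payload through it triggers several array copies that the pre-sized constructor avoids.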


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9945/



---


[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1684/



---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1685/



---


[jira] [Updated] (CARBONDATA-2755) Compaction of Complex DataType (STRUCT AND ARRAY)

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2755:

Summary: Compaction of Complex DataType (STRUCT AND ARRAY)  (was: 
Compaction of Complex DataType)

> Compaction of Complex DataType (STRUCT AND ARRAY)
> -
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2975: [CARBONDATA-3145] Avoid duplicate decoding for compl...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2975
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1683/



---


[jira] [Resolved] (CARBONDATA-3145) Avoid duplicate decoding for complex column pages while querying

2018-12-10 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3145.
--
Resolution: Fixed

> Avoid duplicate decoding for complex column pages while querying
> 
>
> Key: CARBONDATA-3145
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3145
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[GitHub] carbondata pull request #2975: [CARBONDATA-3145] Avoid duplicate decoding fo...

2018-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2975


---


[jira] [Commented] (CARBONDATA-2755) Compaction of Complex DataType

2018-12-10 Thread dhatchayani (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714748#comment-16714748
 ] 

dhatchayani commented on CARBONDATA-2755:
-

https://issues.apache.org/jira/browse/CARBONDATA-3160  Jira to extend 
compaction support with MAP type

> Compaction of Complex DataType
> --
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType





[GitHub] carbondata issue #2983: [CARBONDATA-3119] Fixed SDK Write for Complex Array ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2983
  
Can one of the admins verify this patch?


---


[jira] [Updated] (CARBONDATA-3160) Compaction support with MAP data type

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3160:

Description: Support compaction with MAP type

> Compaction support with MAP data type
> -
>
> Key: CARBONDATA-3160
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3160
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>
> Support compaction with MAP type





[GitHub] carbondata pull request #2983: [CARBONDATA-3119] Fixed SDK Write for Complex...

2018-12-10 Thread shivamasn
GitHub user shivamasn opened a pull request:

https://github.com/apache/carbondata/pull/2983

[CARBONDATA-3119] Fixed SDK Write for Complex Array Type when Array is Empty

### What was the issue?
For SDK write, passing an empty array for a complex type returned null, so 
the row was treated as a bad record.
### What has been changed?
Added a check for an empty array; an empty array is now returned instead of null.

 - [ ] Any interfaces changed?
 NA
 - [ ] Any backward compatibility impacted?
 NA
 - [ ] Document update required?
NA
 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
Test  case added
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
NA


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shivamasn/carbondata complex_issue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2983


commit a81911b32aa7b586a5ae0c8d206d16fb8921c6fa
Author: namanrastogi 
Date:   2018-12-10T13:34:43Z

Complex Type Empty String Issue Fixed




---
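The fix described in the PR message above can be illustrated with a minimal sketch. The names are hypothetical, not CarbonData's actual parser code: an empty input array yields an empty array rather than null, so the row is no longer flagged as a bad record.

```java
import java.util.Arrays;

public class ArrayFieldParser {

  // Hypothetical stand-in for a complex-type field parser: returning null
  // marks the row as a bad record, so an empty input must map to an empty
  // array instead of null.
  static Object[] parseArray(String[] elements) {
    if (elements == null) {
      return null;            // genuinely missing value -> bad record
    }
    if (elements.length == 0) {
      return new Object[0];   // empty array is valid, not a bad record
    }
    return Arrays.copyOf(elements, elements.length, Object[].class);
  }

  public static void main(String[] args) {
    if (parseArray(new String[0]) == null) {
      throw new AssertionError("empty array must not be treated as bad record");
    }
  }
}
```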


[jira] [Created] (CARBONDATA-3160) Compaction support with MAP data type

2018-12-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3160:
---

 Summary: Compaction support with MAP data type
 Key: CARBONDATA-3160
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3160
 Project: CarbonData
  Issue Type: Sub-task
Reporter: dhatchayani
Assignee: dhatchayani








[jira] [Assigned] (CARBONDATA-2605) Complex DataType Enhancements

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani reassigned CARBONDATA-2605:
---

Assignee: dhatchayani

> Complex DataType Enhancements
> -
>
> Key: CARBONDATA-2605
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2605
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
> Attachments: Complex Data Type Enhancements.pdf
>
>
> Umbrella Jira to implement enhancements in Complex Data Type for Carbon.
>  * Projection push down for struct data type.
>  * Provide adaptive encoding and decoding for all data type.
>  * Support JSON data loading directly into Carbon table.
>  
> Please access the Design Document through this link. 
>  
> [https://docs.google.com/document/d/12EZwUlLs53Vro7pMeLnFd0lCjeKOakKY-60e3cryJb4/edit#|https://docs.google.com/document/d/12EZwUlLs53Vro7pMeLnFd0lCjeKOakKY-60e3cryJb4/edit]





[jira] [Commented] (CARBONDATA-2755) Compaction of Complex DataType

2018-12-10 Thread dhatchayani (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714745#comment-16714745
 ] 

dhatchayani commented on CARBONDATA-2755:
-

This Jira is to support compaction with STRUCT and ARRAY type. For MAP type, a 
new Jira will be raised.

> Compaction of Complex DataType
> --
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType





[jira] [Assigned] (CARBONDATA-2755) Compaction of Complex DataType

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani reassigned CARBONDATA-2755:
---

Assignee: dhatchayani  (was: sounak chakraborty)

> Compaction of Complex DataType
> --
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType





[GitHub] carbondata pull request #2977: [CARBONDATA-3147] Fixed concurrent load issue

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2977#discussion_r240215937
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateListeners.scala
 ---
@@ -111,6 +113,29 @@ trait CommitHelper {
 }
   }
 
+  def mergeTableStatusContents(uuidTableStatusPath: String,
+  tableStatusPath: String): Boolean = {
+try {
--- End diff --

Please try to move the lock acquisition inside this method


---


[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...

2018-12-10 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2976
  
This PR supports compaction only for STRUCT and ARRAY. Please raise another 
jira and PR to support MAP type as well.


---


[jira] [Resolved] (CARBONDATA-3143) Fix local dictionary issue for presto

2018-12-10 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3143.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Fix local dictionary issue for presto
> -
>
> Key: CARBONDATA-3143
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3143
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Fix local dictionary issue for presto





[GitHub] carbondata pull request #2972: [CARBONDATA-3143] Fixed local dictionary in p...

2018-12-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/2972


---


[GitHub] carbondata pull request #2976: [CARBONDATA-2755][Complex DataType Enhancemen...

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2976#discussion_r240215007
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
 ---
@@ -371,9 +374,25 @@ public static CarbonFactDataHandlerModel 
getCarbonFactDataHandlerModel(CarbonLoa
 
.getFormattedCardinality(segmentProperties.getDimColumnsCardinality(), 
wrapperColumnSchema);
 carbonFactDataHandlerModel.setColCardinality(formattedCardinality);
 //TO-DO Need to handle complex types here .
+Map<Integer, GenericDataType> complexIndexMap =
+new HashMap<Integer, GenericDataType>(segmentProperties.getComplexDimensions().size());
+carbonFactDataHandlerModel.setComplexIndexMap(complexIndexMap);
+
+int simpleDimensionCount = -1;
+if (segmentProperties.getDimensions().size() == 0) {
+  simpleDimensionCount = 0;
+} else {
+  simpleDimensionCount = segmentProperties.getDimensions().size() - 
segmentProperties
+  .getNumberOfNoDictionaryDimension() - 
segmentProperties.getComplexDimensions().size();
+}
--- End diff --

Please move down this code to `convertComplexDimensionToGenericDataType`


---


[jira] [Created] (CARBONDATA-3159) Issue with SDK Write when empty array is given

2018-12-10 Thread Shivam Goyal (JIRA)
Shivam Goyal created CARBONDATA-3159:


 Summary: Issue with SDK Write when empty array is given
 Key: CARBONDATA-3159
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3159
 Project: CarbonData
  Issue Type: Bug
Reporter: Shivam Goyal








[GitHub] carbondata issue #2972: [CARBONDATA-3143] Fixed local dictionary in presto

2018-12-10 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/2972
  
LGTM


---


[GitHub] carbondata issue #2972: [CARBONDATA-3143] Fixed local dictionary in presto

2018-12-10 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2972
  
LGTM


---


[GitHub] carbondata pull request #2976: [CARBONDATA-2755][Complex DataType Enhancemen...

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2976#discussion_r240214729
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
 ---
@@ -407,6 +426,81 @@ public static CarbonFactDataHandlerModel 
getCarbonFactDataHandlerModel(CarbonLoa
 return carbonFactDataHandlerModel;
   }
 
+  /**
+   * This routine takes the Complex Dimension and convert into generic 
DataType.
+   * @param complexDimensions
+   * @param dimensionCount
+   * @param isNullFormat
+   *@param isEmptyBadRecords @return
+   */
+  private static Map<Integer, GenericDataType> convertComplexDimensionToGenericDataType(
+  List<CarbonDimension> complexDimensions, int dimensionCount, String isNullFormat,
+  boolean isEmptyBadRecords) {
+Map<Integer, GenericDataType> complexIndexMap =
+new HashMap<Integer, GenericDataType>(complexDimensions.size());
+
+for (CarbonDimension carbonDimension : complexDimensions) {
+
+  if (carbonDimension.isComplex()) {
+GenericDataType g;
+if 
(carbonDimension.getColumnSchema().getDataType().getName().equalsIgnoreCase("ARRAY"))
 {
+  g = new ArrayDataType(carbonDimension.getColName(), "", 
carbonDimension.getColumnId());
+} else if 
(carbonDimension.getColumnSchema().getDataType().getName()
+.equalsIgnoreCase("STRUCT")) {
+  g = new StructDataType(carbonDimension.getColName(), "", 
carbonDimension.getColumnId());
+} else {
+  // Add Primitive type.
+  throw new RuntimeException("Primitive Type should not be coming 
in first loop");
+}
+if (carbonDimension.getNumberOfChild() > 0) {
+  
addChildrenForComplex(carbonDimension.getListOfChildDimensions(), g, 
isNullFormat,
+  isEmptyBadRecords);
+}
+g.setOutputArrayIndex(0);
+complexIndexMap.put(dimensionCount++, g);
+  }
+
+}
+return complexIndexMap;
+  }
+
+  private static void addChildrenForComplex(List<CarbonDimension> listOfChildDimensions,
+  GenericDataType genericDataType, String isNullFormat, boolean 
isEmptyBadRecord) {
+for (CarbonDimension carbonDimension : listOfChildDimensions) {
+  if 
(carbonDimension.getColumnSchema().getDataType().getName().equalsIgnoreCase("ARRAY"))
 {
+GenericDataType arrayGeneric = new 
ArrayDataType(carbonDimension.getColName(),
+carbonDimension.getColName()
+.substring(0, 
carbonDimension.getColName().lastIndexOf(".")),
+carbonDimension.getColumnId());
+if (carbonDimension.getNumberOfChild() > 0) {
+  
addChildrenForComplex(carbonDimension.getListOfChildDimensions(), arrayGeneric,
+  isNullFormat, isEmptyBadRecord);
+}
+genericDataType.addChildren(arrayGeneric);
+  } else if (carbonDimension.getColumnSchema().getDataType().getName()
+  .equalsIgnoreCase("STRUCT")) {
+GenericDataType structGeneric = new 
StructDataType(carbonDimension.getColName(),
+carbonDimension.getColName()
+.substring(0, 
carbonDimension.getColName().lastIndexOf(".")),
+carbonDimension.getColumnId());
+if (carbonDimension.getNumberOfChild() > 0) {
+  
addChildrenForComplex(carbonDimension.getListOfChildDimensions(), structGeneric,
+  isNullFormat, isEmptyBadRecord);
+}
+genericDataType.addChildren(structGeneric);
+  } else {
+// Primitive Data Type
+genericDataType.addChildren(
+new 
PrimitiveDataType(carbonDimension.getColumnSchema().getColumnName(),
+carbonDimension.getDataType(), carbonDimension.getColName()
+.substring(0, 
carbonDimension.getColName().lastIndexOf(".")),
+carbonDimension.getColumnId(),
+
carbonDimension.getColumnSchema().hasEncoding(Encoding.DICTIONARY), 
isNullFormat,
+isEmptyBadRecord));
--- End diff --

Please check and remove `isEmptyBadRecord` from it if not used


---


[GitHub] carbondata pull request #2976: [CARBONDATA-2755][Complex DataType Enhancemen...

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2976#discussion_r240213617
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
 ---
@@ -407,6 +426,81 @@ public static CarbonFactDataHandlerModel 
getCarbonFactDataHandlerModel(CarbonLoa
 return carbonFactDataHandlerModel;
   }
 
+  /**
+   * This routine takes the Complex Dimension and convert into generic 
DataType.
+   * @param complexDimensions
+   * @param dimensionCount
+   * @param isNullFormat
+   *@param isEmptyBadRecords @return
+   */
+  private static Map<Integer, GenericDataType> convertComplexDimensionToGenericDataType(
+  List<CarbonDimension> complexDimensions, int dimensionCount, String isNullFormat,
+  boolean isEmptyBadRecords) {
+Map<Integer, GenericDataType> complexIndexMap =
+new HashMap<Integer, GenericDataType>(complexDimensions.size());
+
+for (CarbonDimension carbonDimension : complexDimensions) {
+
+  if (carbonDimension.isComplex()) {
+GenericDataType g;
+if 
(carbonDimension.getColumnSchema().getDataType().getName().equalsIgnoreCase("ARRAY"))
 {
+  g = new ArrayDataType(carbonDimension.getColName(), "", 
carbonDimension.getColumnId());
+} else if 
(carbonDimension.getColumnSchema().getDataType().getName()
+.equalsIgnoreCase("STRUCT")) {
+  g = new StructDataType(carbonDimension.getColName(), "", 
carbonDimension.getColumnId());
+} else {
+  // Add Primitive type.
+  throw new RuntimeException("Primitive Type should not be coming 
in first loop");
+}
+if (carbonDimension.getNumberOfChild() > 0) {
+  
addChildrenForComplex(carbonDimension.getListOfChildDimensions(), g, 
isNullFormat,
+  isEmptyBadRecords);
+}
+g.setOutputArrayIndex(0);
+complexIndexMap.put(dimensionCount++, g);
+  }
+
+}
+return complexIndexMap;
+  }
+
+  private static void addChildrenForComplex(List<CarbonDimension> listOfChildDimensions,
+  GenericDataType genericDataType, String isNullFormat, boolean 
isEmptyBadRecord) {
+for (CarbonDimension carbonDimension : listOfChildDimensions) {
+  if 
(carbonDimension.getColumnSchema().getDataType().getName().equalsIgnoreCase("ARRAY"))
 {
+GenericDataType arrayGeneric = new 
ArrayDataType(carbonDimension.getColName(),
+carbonDimension.getColName()
+.substring(0, 
carbonDimension.getColName().lastIndexOf(".")),
+carbonDimension.getColumnId());
+if (carbonDimension.getNumberOfChild() > 0) {
+  
addChildrenForComplex(carbonDimension.getListOfChildDimensions(), arrayGeneric,
+  isNullFormat, isEmptyBadRecord);
+}
+genericDataType.addChildren(arrayGeneric);
+  } else if (carbonDimension.getColumnSchema().getDataType().getName()
+  .equalsIgnoreCase("STRUCT")) {
+GenericDataType structGeneric = new 
StructDataType(carbonDimension.getColName(),
+carbonDimension.getColName()
+.substring(0, 
carbonDimension.getColName().lastIndexOf(".")),
--- End diff --

Pls extract to top and reuse it


---
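The duplicated parent-name computation the reviewer flags (`colName.substring(0, colName.lastIndexOf("."))` appears for both the array and struct children) could be extracted along these lines — a sketch only; `getParentName` is a hypothetical name, not code from the PR:

```java
// Hypothetical helper sketch: extracts the repeated parent-name computation
// (everything before the last '.') used when constructing ArrayDataType /
// StructDataType children. Not the PR's actual code.
public class ComplexColumnNames {

  /** Returns the parent portion of a dotted column name, e.g. "a.b.c" -> "a.b". */
  static String getParentName(String colName) {
    int lastDot = colName.lastIndexOf('.');
    // Top-level columns have no '.'; return an empty parent, matching the
    // "" parent passed for top-level dimensions in the quoted diff.
    return lastDot < 0 ? "" : colName.substring(0, lastDot);
  }

  public static void main(String[] args) {
    System.out.println(getParentName("arr.val.item")); // arr.val
    System.out.println(getParentName("toplevel"));     // (empty string)
  }
}
```

Both `new ArrayDataType(...)` and `new StructDataType(...)` call sites could then pass `getParentName(carbonDimension.getColName())`.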


[GitHub] carbondata pull request #2976: [CARBONDATA-2755][Complex DataType Enhancemen...

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2976#discussion_r240212600
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
 ---
@@ -407,6 +426,81 @@ public static CarbonFactDataHandlerModel 
getCarbonFactDataHandlerModel(CarbonLoa
 return carbonFactDataHandlerModel;
   }
 
+  /**
+   * This routine takes the Complex Dimension and convert into generic DataType.
+   * @param complexDimensions
+   * @param dimensionCount
+   * @param isNullFormat
+   * @param isEmptyBadRecords
+   * @return
+   */
+  private static Map<Integer, GenericDataType> convertComplexDimensionToGenericDataType(
+      List<CarbonDimension> complexDimensions, int dimensionCount, String isNullFormat,
+      boolean isEmptyBadRecords) {
+    Map<Integer, GenericDataType> complexIndexMap =
+        new HashMap<>(complexDimensions.size());
+
+    for (CarbonDimension carbonDimension : complexDimensions) {
+
+      if (carbonDimension.isComplex()) {
+        GenericDataType g;
+        if (carbonDimension.getColumnSchema().getDataType().getName().equalsIgnoreCase("ARRAY")) {
--- End diff --

Please check the utility to get the complex type


---
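The utility the reviewer has in mind is presumably a typed predicate (CarbonData exposes checks like `DataTypes.isArrayType` / `DataTypes.isStructType`, though that is an assumption here). A minimal stand-in sketch of the pattern — `DataType` and the predicates below are simplified stand-ins, not CarbonData's actual classes:

```java
// Stand-in sketch: replace getName().equalsIgnoreCase("ARRAY") string
// comparison with a typed predicate. The enum and predicates are
// illustrative only, NOT CarbonData's real API.
public class TypeCheckSketch {

  enum DataType { ARRAY, STRUCT, INT, STRING }

  // Predicate-style checks instead of comparing type names as strings
  static boolean isArrayType(DataType type) { return type == DataType.ARRAY; }
  static boolean isStructType(DataType type) { return type == DataType.STRUCT; }

  static String classify(DataType type) {
    if (isArrayType(type)) {
      return "array";
    } else if (isStructType(type)) {
      return "struct";
    }
    return "primitive";
  }

  public static void main(String[] args) {
    System.out.println(classify(DataType.ARRAY));  // array
    System.out.println(classify(DataType.STRING)); // primitive
  }
}
```

The predicate form avoids typos in the type-name literals and keeps the check in one place.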


[GitHub] carbondata pull request #2976: [CARBONDATA-2755][Complex DataType Enhancemen...

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2976#discussion_r240212283
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
 ---
@@ -407,6 +426,81 @@ public static CarbonFactDataHandlerModel 
getCarbonFactDataHandlerModel(CarbonLoa
 return carbonFactDataHandlerModel;
   }
 
+  /**
+   * This routine takes the Complex Dimension and convert into generic DataType.
+   * @param complexDimensions
+   * @param dimensionCount
+   * @param isNullFormat
+   * @param isEmptyBadRecords
+   * @return
+   */
+  private static Map<Integer, GenericDataType> convertComplexDimensionToGenericDataType(
+      List<CarbonDimension> complexDimensions, int dimensionCount, String isNullFormat,
+      boolean isEmptyBadRecords) {
+    Map<Integer, GenericDataType> complexIndexMap =
+        new HashMap<>(complexDimensions.size());
+
+    for (CarbonDimension carbonDimension : complexDimensions) {
+
+      if (carbonDimension.isComplex()) {
+        GenericDataType g;
--- End diff --

Pls give it a proper name


---


[jira] [Updated] (CARBONDATA-3145) Avoid duplicate decoding for complex column pages while querying

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3145:

Summary: Avoid duplicate decoding for complex column pages while querying  
(was: Read improvement for complex column pages while querying)

> Avoid duplicate decoding for complex column pages while querying
> 
>
> Key: CARBONDATA-3145
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3145
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2982: [CARBONDATA-3158] support presto-carbon to read sdk ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2982
  
Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9941/



---


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240208519
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/compression/GzipCompressor.java
 ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
+
+/**
+ * Codec Class for performing Gzip Compression
+ */
+public class GzipCompressor extends AbstractCompressor {
+
+  @Override public String getName() {
+    return "gzip";
+  }
+
+  /**
+   * This method takes the Byte Array data and Compresses in gzip format
+   *
+   * @param data Data Byte Array passed for compression
+   * @return Compressed Byte Array
+   */
+  private byte[] compressData(byte[] data) {
+    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
--- End diff --

ByteArrayOutputStream initializes with 32 bytes and copies the data to a new byte[] on expansion. Can you use a better initial size to limit the number of copies during expansion? Snappy has a utility (maxCompressedLength) to calculate this; please check whether any gzip library has a similar method. If not, we can use a value based on a test with the maximum possible compression ratio.


---
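A sketch of the presizing idea: deflate's worst case for incompressible input is commonly bounded as roughly 5 bytes per 16 KB stored block plus ~18 bytes of gzip header/trailer — that bound, and the use of `java.util.zip` in place of commons-compress, are assumptions here, not the PR's code:

```java
// Sketch: presize ByteArrayOutputStream with a conservative gzip worst-case
// bound so the buffer is not repeatedly regrown from its 32-byte default.
// The bound (5 bytes per 16383-byte stored block + 18 bytes gzip wrapper)
// is an assumed estimate; java.util.zip stands in for commons-compress.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

public class PresizedGzip {

  /** Conservative upper bound on gzip output size for the given input size. */
  static int maxCompressedLength(int inputLength) {
    // Deflate can expand incompressible input slightly; reserve room for it.
    return inputLength + 5 * (inputLength / 16383 + 1) + 18;
  }

  static byte[] compress(byte[] data) {
    ByteArrayOutputStream out =
        new ByteArrayOutputStream(maxCompressedLength(data.length));
    try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
      gzip.write(data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] compressed = compress("hello hello hello".getBytes());
    System.out.println(compressed.length <= maxCompressedLength(17));
  }
}
```

zlib itself documents a similar `compressBound`/`deflateBound` estimate, which is where a production value could be taken from.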


[GitHub] carbondata pull request #2976: [CARBONDATA-2755][Complex DataType Enhancemen...

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2976#discussion_r240209202
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
 ---
@@ -337,6 +342,25 @@ public static void addColumnCardinalityToMap(Map columnCardinal
         .toPrimitive(updatedCardinalityList.toArray(new Integer[updatedCardinalityList.size()]));
   }
 
+  private static void fillColumnSchemaListForComplexDims(List<CarbonDimension> carbonDimensionsList,
--- End diff --

Can you add a comment explaining what is happening in this method?


---


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240206699
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/compression/GzipCompressor.java
 ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.compression;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
+import 
org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
+
+/**
+ * Codec Class for performing Gzip Compression
+ */
+public class GzipCompressor extends AbstractCompressor {
+
+  @Override public String getName() {
+    return "gzip";
+  }
+
+  /**
+   * This method takes the Byte Array data and Compresses in gzip format
+   *
+   * @param data Data Byte Array passed for compression
+   * @return Compressed Byte Array
+   */
+  private byte[] compressData(byte[] data) {
+    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
+    try {
+      GzipCompressorOutputStream gzipCompressorOutputStream =
+          new GzipCompressorOutputStream(byteArrayOutputStream);
+      try {
+        /**
+         * Below api will write bytes from specified byte array to the gzipCompressorOutputStream
+         * The output stream will compress the given byte array.
+         */
+        gzipCompressorOutputStream.write(data);
+      } catch (IOException e) {
+        throw new RuntimeException("Error during Compression writing step ", e);
+      } finally {
+        gzipCompressorOutputStream.close();
+      }
+    } catch (IOException e) {
+      throw new RuntimeException("Error during Compression step ", e);
+    }
+    return byteArrayOutputStream.toByteArray();
+  }
+
+  /**
+   * This method takes the Byte Array data and Decompresses in gzip format
+   *
+   * @param data   Data Byte Array for Compression
+   * @param offset Start value of Data Byte Array
+   * @param length Size of Byte Array
+   * @return
+   */
+  private byte[] decompressData(byte[] data, int offset, int length) {
+    ByteArrayInputStream byteArrayOutputStream = new ByteArrayInputStream(data, offset, length);
+    ByteArrayOutputStream byteOutputStream = new ByteArrayOutputStream();
+    try {
+      GzipCompressorInputStream gzipCompressorInputStream =
+          new GzipCompressorInputStream(byteArrayOutputStream);
+      byte[] buffer = new byte[1024];
--- End diff --

Instead of a fixed 1024, can you check what block size (in bytes) gzip operates on and use that value?


---
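On the buffer-size question: deflate's history window is 32 KB, so a 32 KB copy buffer is a reasonable candidate — though the best value is something to benchmark, not a gzip mandate. A sketch, with `java.util.zip` standing in for commons-compress and the constant name being hypothetical:

```java
// Sketch: decompress with a 32 KB copy buffer (matching deflate's history
// window) instead of 1024 bytes. Constant name and the 32 KB choice are
// assumptions to be validated by benchmarking, not the PR's code.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipDecompressSketch {

  static final int COPY_BUFFER_SIZE = 32 * 1024; // matches deflate's 32 KB window

  static byte[] decompress(byte[] data, int offset, int length) {
    ByteArrayInputStream in = new ByteArrayInputStream(data, offset, length);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gzip = new GZIPInputStream(in)) {
      byte[] buffer = new byte[COPY_BUFFER_SIZE];
      int read;
      while ((read = gzip.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  /** Compress-then-decompress helper used to exercise the buffer logic. */
  static byte[] roundTrip(byte[] data) {
    ByteArrayOutputStream raw = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(raw)) {
      gzip.write(data);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    byte[] compressed = raw.toByteArray();
    return decompress(compressed, 0, compressed.length);
  }

  public static void main(String[] args) {
    System.out.println(new String(roundTrip("round trip".getBytes())));
  }
}
```

A larger buffer mainly reduces the number of `read`/`write` calls per page; past the window size the returns diminish quickly.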


[GitHub] carbondata pull request #2847: [CARBONDATA-3005]Support Gzip as column compr...

2018-12-10 Thread KanakaKumar
Github user KanakaKumar commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2847#discussion_r240211334
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataWithCompression.scala
 ---
@@ -252,50 +253,94 @@ class TestLoadDataWithCompression extends QueryTest with BeforeAndAfterEach with
       """.stripMargin)
   }
 
-  test("test data loading with snappy compressor and offheap") {
+  test("test data loading with different compressors and offheap") {
+    for (comp <- compressors) {
+      CarbonProperties.getInstance()
+        .addProperty(CarbonCommonConstants.ENABLE_OFFHEAP_SORT, "true")
--- End diff --

Should we have a UT for enable.unsafe.in.query.processing true and false?


---


[GitHub] carbondata pull request #2976: [CARBONDATA-2755][Complex DataType Enhancemen...

2018-12-10 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2976#discussion_r240210882
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
 ---
@@ -371,9 +374,25 @@ public static CarbonFactDataHandlerModel getCarbonFactDataHandlerModel(CarbonLoa
         .getFormattedCardinality(segmentProperties.getDimColumnsCardinality(), wrapperColumnSchema);
     carbonFactDataHandlerModel.setColCardinality(formattedCardinality);
     //TO-DO Need to handle complex types here .
--- End diff --

Remove it


---


[GitHub] carbondata issue #2975: [CARBONDATA-3145] Avoid duplicate decoding for compl...

2018-12-10 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/2975
  
LGTM.. @ravipesala yeah you are right... Based on the PR description I asked for a performance report, now it's okay :)


---


[GitHub] carbondata issue #2982: [CARBONDATA-3158] support presto-carbon to read sdk ...

2018-12-10 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2982
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1893/



---

