[GitHub] [carbondata] ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r276089107
 
 

 ##
 File path: docs/supported-data-types-in-carbondata.md
 ##
 @@ -51,4 +51,5 @@
 
   * Other Types
 * BOOLEAN
+* [BINARY](./supported-binary-in-carbondata.md)
 
 Review comment:
   Just remove the `supported-binary-in-carbondata.md`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [carbondata] ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r276088896
 
 

 ##
 File path: docs/supported-binary-in-carbondata.md
 ##
 @@ -0,0 +1,162 @@
+
+
+# Support binary data type in CarbonData
 
 Review comment:
   Binary is a standard data type that we now support, so there is no need to 
add a separate document for it. Just list it alongside the other data types in 
the existing document.
   As for the example, you can add one more column of binary type to the 
existing example instead of adding a new example to the document.




[GitHub] [carbondata] ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r276086564
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/datatype/DataType.java
 ##
 @@ -99,6 +99,8 @@ public static char convertType(DataType dataType) {
   return DATE_CHAR;
 } else if (dataType == DataTypes.BYTE_ARRAY) {
   return BYTE_ARRAY_CHAR;
+} else if (dataType == DataTypes.BINARY) {
 
 Review comment:
   You can add this check with `||` to the `if` condition above; there is no 
need for a new `else if` branch.
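A minimal sketch of what this suggestion might look like, assuming BINARY is meant to map to the same char as BYTE_ARRAY (the diff above does not show what the new branch returns). The `Kind` enum and the char values are hypothetical stand-ins, not CarbonData's actual `DataTypes` instances or constants.

```java
public class ConvertTypeSketch {
  // Hypothetical stand-ins for CarbonData's DataTypes instances.
  enum Kind { DATE, BYTE_ARRAY, BINARY, OTHER }

  // Hypothetical values; the real constants are defined in DataType.java.
  static final char DATE_CHAR = 'x';
  static final char BYTE_ARRAY_CHAR = 'y';
  static final char OTHER_CHAR = 'o';

  static char convertType(Kind kind) {
    if (kind == Kind.DATE) {
      return DATE_CHAR;
    } else if (kind == Kind.BYTE_ARRAY || kind == Kind.BINARY) {
      // One branch covers both types, so no extra `else if` is needed.
      return BYTE_ARRAY_CHAR;
    }
    return OTHER_CHAR;
  }

  public static void main(String[] args) {
    // prints true
    System.out.println(convertType(Kind.BINARY) == convertType(Kind.BYTE_ARRAY));
  }
}
```

If the two types ever need different chars, the branches would have to be split again, so this only applies while the mapping is identical.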




[GitHub] [carbondata] ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r276084416
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/AbstractNonDictionaryVectorFiller.java
 ##
 @@ -168,6 +170,31 @@ public LongStringVectorFiller(int numberOfRows, int actualDataLength) {
   }
 }
 
+class BinaryVectorFiller extends AbstractNonDictionaryVectorFiller {
+
+  public BinaryVectorFiller(int numberOfRows) {
+super(numberOfRows);
+  }
+
+  @Override public void fillVector(byte[] data, CarbonColumnVector vector) {
 
 Review comment:
   This code is the same as `LongStringVectorFiller`. Please reuse that class; 
there is no need to add a new class that duplicates it.
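As a rough illustration of the reuse being asked for, the sketch below routes two column kinds to one shared filler instead of defining a second class with the same body. Everything here is hypothetical: `LengthPrefixedFiller`, `Kind`, `fillerFor`, and the `[4-byte length][bytes]` page layout are assumptions for the example, not CarbonData's actual classes or storage format.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class SharedFillerSketch {
  // Hypothetical column kinds; not CarbonData's DataTypes.
  enum Kind { LONG_STRING, BINARY }

  // Hypothetical filler: splits a page laid out as [4-byte length][bytes]... into values.
  static final class LengthPrefixedFiller {
    List<byte[]> fill(byte[] page) {
      List<byte[]> out = new ArrayList<>();
      ByteBuffer buf = ByteBuffer.wrap(page);
      while (buf.remaining() >= Integer.BYTES) {
        int len = buf.getInt();
        byte[] value = new byte[len];
        buf.get(value);
        out.add(value);
      }
      return out;
    }
  }

  // Both kinds share one implementation, so no duplicate class is needed.
  static LengthPrefixedFiller fillerFor(Kind kind) {
    switch (kind) {
      case LONG_STRING:
      case BINARY:
        return new LengthPrefixedFiller();
      default:
        throw new IllegalArgumentException("unsupported kind: " + kind);
    }
  }

  public static void main(String[] args) {
    // Two values: {1, 2, 3} and {9, 8}, each prefixed with its length.
    byte[] page = ByteBuffer.allocate(4 + 3 + 4 + 2)
        .putInt(3).put(new byte[] {1, 2, 3})
        .putInt(2).put(new byte[] {9, 8})
        .array();
    List<byte[]> values = fillerFor(Kind.BINARY).fill(page);
    System.out.println(values.size());
  }
}
```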




[GitHub] [carbondata] ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ravipesala commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r276084188
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/AbstractNonDictionaryVectorFiller.java
 ##
 @@ -168,6 +170,31 @@ public LongStringVectorFiller(int numberOfRows, int actualDataLength) {
   }
 }
 
+class BinaryVectorFiller extends AbstractNonDictionaryVectorFiller {
 
 Review comment:
   This code 




[GitHub] [carbondata] qiuchenjian commented on issue #3122: [CARBONDATA-3289] MV datamap doesn't take effect when having clause use alias

2019-04-16 Thread GitBox
qiuchenjian commented on issue #3122: [CARBONDATA-3289] MV datamap doesn't take 
effect when having clause use alias
URL: https://github.com/apache/carbondata/pull/3122#issuecomment-483929627
 
 
   @akashrn5 
   Because these are different issues and different PRs, putting them in one 
class would cause a conflict.
   Can we handle your suggestion in a new PR once these PRs are merged?




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483928087
 
 
   Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3127/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483924105
 
 
   Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11157/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483913258
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2896/
   




[jira] [Updated] (CARBONDATA-3336) Support Binary Data Type

2019-04-16 Thread xubo245 (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-3336:

Description: 
CarbonData supports binary data type



Version  Changes                                   Owner  Date
0.1      Init doc for supporting binary data type  Xubo   2019-4-10

Background:
Binary is a basic data type that is widely used in various scenarios, so 
CarbonData should support it. Downloading data from S3 is slow when a dataset 
contains many small binary files. Most application scenarios involve storing 
small binary data in CarbonData, which avoids the small-binary-file problem, 
speeds up S3 access, and lowers the cost of accessing OBS by reducing the 
number of S3 API calls. Storing structured data and unstructured (binary) data 
together in CarbonData also makes both easier to manage.

Goals:
1. Support writing the binary data type with the Carbon Java SDK.
2. Support reading the binary data type with the Spark Carbon file format 
(carbon datasource) and CarbonSession.
3. Support reading the binary data type with the Carbon SDK.
4. Support writing binary data with Spark.


Approach and Detail:
1. Support writing the binary data type with the Carbon Java SDK [Formal]:
1.1 The Java SDK needs to support writing data with specific data types, such 
as int, double, and byte[], so that not every value has to be converted to a 
string array. The user reads a binary file as byte[], and the SDK writes the 
byte[] into the binary column.
1.2 CarbonData compresses the binary column, because the compressor is 
currently table level.
=> TODO: support a configuration for compression. The default should be no 
compression, because binary data (for example, a JPG image) is usually already 
compressed, so the binary column would not need to be decompressed either. 
1.5.4 will support column-level compression; after that, we can implement no 
compression for binary. We can discuss this with the community.
1.3 CarbonData stores binary as a dimension.
1.4 Support a configurable page size for the binary data type, because binary 
data is usually large, such as 200k. Otherwise one blocklet (32000 rows) would 
be very large.
TODO: 1.5 Avro and JSON conversion need to be considered.
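To put the concern in item 1.4 into numbers, here is a small arithmetic sketch. It assumes that "200k" means 200 KB (200 * 1024 bytes) per binary value; the class and method names are hypothetical and exist only for this example.

```java
public class BlockletSizeSketch {
  // Raw size of `rows` values of `bytesPerValue` bytes each, before any compression.
  static long blockletBytes(long rows, long bytesPerValue) {
    return rows * bytesPerValue;
  }

  // A Java array is indexed by int, so its length cannot exceed Integer.MAX_VALUE.
  static boolean fitsInOneArray(long bytes) {
    return bytes <= Integer.MAX_VALUE;
  }

  public static void main(String[] args) {
    long bytes = blockletBytes(32_000L, 200L * 1024L);
    System.out.println(bytes);                 // 6553600000
    System.out.println(fitsInOneArray(bytes)); // false
  }
}
```

Under that assumption, 32000 rows of 200 KB each come to about 6.5 GB of raw data, which is why a smaller, configurable row count per page is attractive for binary columns.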


2. Support reading and managing the binary data type with the Spark Carbon file 
format (carbon DataSource) and CarbonSession. [Formal]
2.1 Support reading the binary data type from a non-transactional table: read 
the binary column and return it as byte[].
2.2 Support creating a table with a binary column. The table properties 
sort_columns, dictionary, COLUMN_META_CACHE, and RANGE_COLUMN are not supported 
for a binary column.
=> Evaluate COLUMN_META_CACHE for binary
=> The Carbon DataSource doesn't support dictionary include columns
=> carbon.column.compressor applies to all columns
2.3 Support CTAS for binary => transactional/non-transactional
2.4 Support external tables for binary
2.5 Support projection for a binary column
2.6 Support desc formatted
=> The Carbon DataSource doesn't support ALTER TABLE add column by SQL
=> TODO: ALTER TABLE for the binary data type in CarbonSession
2.7 Don’t support PARTITION, filter, or BUCKETCOLUMNS for binary
2.8 Support compaction for binary (TODO)
2.9 Datamap? Don’t support the bloomfilter, lucene, or timeseries datamaps; no 
min/max datamap is needed for binary; support MV and pre-aggregate in the 
future
2.10 The CSDK / Python SDK will support binary in the future. (TODO)
2.11 Support S3
TODO:
2.12 Support UDFs, hex, base64, and cast:
   select hex(bin) from carbon_table.
   select CAST(s AS BINARY) from carbon_table.
CarbonSession: impact analysis


3. Support reading the binary data type with the Carbon SDK
3.1 Support reading the binary data type from a non-transactional table: read 
the binary column and return it as byte[].
3.2 Support projection for a binary column
3.3 Support S3
3.4 No need to support filter.

4. Support writing binary data with Spark (carbon file format / 
CarbonSession, POC??)
4.1 Convert binary to String and store it in CSV
4.2 Spark loads the CSV, converts the string to byte[], and stores it in 
CarbonData. Reading the binary column returns byte[].
4.3 Support insert into (string => binary). TODO: update and delete for binary
4.4 Don’t support stream tables.
=> Refer to Hive and the Spark 2.4 image DataSource
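Items 4.1 and 4.2 describe a round trip from bytes to a CSV string and back. The sketch below illustrates that round trip, assuming hex as the string encoding; the description does not say which encoding is used, and Base64 would be another option. The helper names and the sample CSV row (laid out as id, label, name, image, autoLabel) are hypothetical.

```java
import java.util.Arrays;

public class HexCsvSketch {
  private static final char[] DIGITS = "0123456789abcdef".toCharArray();

  // Encode each byte as two lowercase hex digits.
  static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(DIGITS[(b >> 4) & 0xF]).append(DIGITS[b & 0xF]);
    }
    return sb.toString();
  }

  // Decode a hex string back into bytes; rejects odd lengths and non-hex characters.
  static byte[] fromHex(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("hex string has odd length");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("not a hex string");
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] image = {(byte) 0x89, 0x50, 0x4E, 0x47};
    // Step 4.1: binary is stored as a string field in the CSV line.
    String csvLine = "1,true,a.png," + toHex(image) + ",false";
    // Step 4.2: the loader converts the string field back to byte[].
    String[] fields = csvLine.split(",");
    byte[] restored = fromHex(fields[3]);
    System.out.println(fields[3]);                      // 89504e47
    System.out.println(Arrays.equals(image, restored)); // true
  }
}
```

Hex doubles the size of the data in the CSV file, which is one cost of going through a text format for binary values.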


 
mail list: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discuss-CarbonData-supports-binary-data-type-td76828.html


  was:
CarbonData supports binary data type



Version  Changes                                   Owner  Date
0.1      Init doc for supporting binary data type  Xubo   2019-4-10

Background :
Binary is basic data type and widely used in various 

[jira] [Updated] (CARBONDATA-3351) Support Binary Data Type

2019-04-16 Thread xubo245 (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-3351:

Description: 
1. Support writing the binary data type with the Carbon Java SDK:
1.1 The Java SDK needs to support writing data with specific data types, such 
as int, double, and byte[], so that not every value has to be converted to a 
string array. The user reads a binary file as byte[], and the SDK writes the 
byte[] into the binary column.
1.2 CarbonData compresses the binary column, because the compressor is 
currently table level.
=> TODO: support a configuration for compression. The default should be no 
compression, because binary data (for example, a JPG image) is usually already 
compressed, so the binary column would not need to be decompressed either. 
1.5.4 will support column-level compression; after that, we can implement no 
compression for binary. We can discuss this with the community.
1.3 CarbonData stores binary as a dimension.
1.4 Support a configurable page size for the binary data type, because binary 
data is usually large, such as 200k. Otherwise one blocklet (32000 rows) would 
be very large. => PR2814

2. Support reading and managing the binary data type with the Spark Carbon file 
format (carbon DataSource) and CarbonSession.
2.1 Support reading the binary data type from a non-transactional table: read 
the binary column and return it as byte[].
2.2 Support creating a table with a binary column. The table properties 
sort_columns, dictionary, and RANGE_COLUMN are not supported for a binary 
column.
=> Evaluate COLUMN_META_CACHE for binary
=> The Carbon DataSource doesn't support dictionary include columns
=> carbon.column.compressor applies to all columns
2.3 Support CTAS for binary => transactional/non-transactional
2.4 Support external tables for binary
2.5 Support projection for a binary column
2.6 Support desc
=> The Carbon DataSource doesn't support ALTER TABLE add column by SQL
2.7 Don’t support PARTITION, filter, or BUCKETCOLUMNS for binary
2.8 Support S3

3. Support reading the binary data type with the Carbon SDK
3.1 Support reading the binary data type from a non-transactional table: read 
the binary column and return it as byte[].
3.2 Support projection for a binary column
3.3 Support S3
3.4 No need to support filter.

4. Support writing binary data with Spark (carbon file format / 
CarbonSession, POC??)
4.1 Convert binary to String and store it in CSV
4.2 Spark loads the CSV, converts the string to byte[], and stores it in 
CarbonData. Reading the binary column returns byte[].
4.3 Support insert into (string => binary). TODO: update and delete for binary
4.4 Don’t support stream tables.
=> Refer to Hive and the Spark 2.4 image DataSource


  was:
1. Support writing the binary data type with the Carbon Java SDK:
1.1 The Java SDK needs to support writing data with specific data types, such 
as int, double, and byte[], so that not every value has to be converted to a 
string array. The user reads a binary file as byte[], and the SDK writes the 
byte[] into the binary column.
1.2 CarbonData compresses the binary column, because the compressor is 
currently table level.
=> TODO: support a configuration for compression. The default should be no 
compression, because binary data (for example, a JPG image) is usually already 
compressed, so the binary column would not need to be decompressed either. 
1.5.4 will support column-level compression; after that, we can implement no 
compression for binary. We can discuss this with the community.
1.3 CarbonData stores binary as a dimension.
1.4 Support a configurable page size for the binary data type, because binary 
data is usually large, such as 200k. Otherwise one blocklet (32000 rows) would 
be very large. => PR2814

2. Support reading and managing the binary data type with the Spark Carbon file 
format (carbon DataSource) and CarbonSession.
2.1 Support reading the binary data type from a non-transactional table: read 
the binary column and return it as byte[].
2.2 Support creating a table with a binary column. The table properties 
sort_columns, dictionary, and RANGE_COLUMN are not supported for a binary 
column.
=> Evaluate COLUMN_META_CACHE for binary
=> The Carbon DataSource doesn't support dictionary include columns
=> carbon.column.compressor applies to all columns
2.3 Support CTAS for binary => transactional/non-transactional
2.4 Support external tables for binary
2.5 Support projection for a binary column
2.6 Support desc
=> The Carbon DataSource doesn't support ALTER TABLE add column by SQL
2.7 Don’t support PARTITION, filter, or BUCKETCOLUMNS for binary
2.8 Support S3


> Support Binary Data Type
> 
>
> Key: CARBONDATA-3351
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3351
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>  Time Spent: 16h 50m
>  Remaining Estimate: 0h
>
> 1.Supporting write binary data type by Carbon Java SDK:
> 1.1 Java SDK needs support write data with specific data types, like int, 
> double, byte[ ] data type, no need to convert all data type to string 

[GitHub] [carbondata] xubo245 commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483899601
 
 
   @ajantha-bhat @ravipesala @KanakaKumar @jackylk @kunal642 @QiangCai  Please 
review it.




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483780324
 
 
   Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3126/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483780339
 
 
   Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11156/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483731850
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2895/
   




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275877140
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || (each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || (each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+   

[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483723129
 
 
   Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11155/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483695297
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2894/
   




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275834619
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275828580
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
 ##
 @@ -65,7 +65,7 @@
   @Override
   public void write(Object object) throws IOException {
 try {
-  writable.set((String[]) object);
+  writable.set((Object[]) object);
 
 Review comment:
   There is no need to convert byte[] to a string before writing, and no need 
to convert the string back to byte[] inside Carbon. No hex encoding is applied 
to byte[].
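A small sketch of why the cast to `Object[]` allows this: Java arrays are covariant, so a `String[]` row still passes the cast, while an `Object[]` row can also carry a raw `byte[]` field. The `write` and `isRawBinary` methods below are hypothetical stand-ins and do not reproduce the real `CSVCarbonWriter`.

```java
public class ObjectRowSketch {
  // Hypothetical stand-in for a writer's write(Object): returns the number of fields.
  static int write(Object object) {
    // Succeeds for String[] as well, because String[] is a subtype of Object[].
    Object[] row = (Object[]) object;
    return row.length;
  }

  // True when the field is carried as raw bytes rather than as an encoded string.
  static boolean isRawBinary(Object field) {
    return field instanceof byte[];
  }

  public static void main(String[] args) {
    byte[] image = {(byte) 0x89, 0x50, 0x4E, 0x47};
    Object[] mixedRow = {"1", "true", "a.png", image};
    String[] stringRow = {"1", "true", "a.png"};
    System.out.println(write(mixedRow));           // 4
    System.out.println(write(stringRow));          // 3
    System.out.println(isRawBinary(mixedRow[3]));  // true
  }
}
```

Existing callers that pass `String[]` keep working under this kind of cast, and callers with binary columns can skip the string encoding step.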




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483673739
 
 
   Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11153/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483670335
 
 
   Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3123/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483654987
 
 
   Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3121/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3177: [WIP] Distributed index server

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3177: [WIP] Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#issuecomment-483652180
 
 
   Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11154/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3177: [WIP] Distributed index server

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3177: [WIP] Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#issuecomment-483651806
 
 
   Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3124/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3177: [WIP] Distributed index server

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3177: [WIP] Distributed index server
URL: https://github.com/apache/carbondata/pull/3177#issuecomment-483651360
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2893/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483646256
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2892/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483646185
 
 
   Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11151/
   




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275773021
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/test/util/BinaryUtil.java
 ##
 @@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.test.util;
+
+import java.io.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.sdk.file.CarbonReader;
+import org.apache.carbondata.sdk.file.CarbonWriter;
+import org.apache.carbondata.sdk.file.Field;
+import org.apache.carbondata.sdk.file.Schema;
+
+import static org.apache.carbondata.sdk.file.utils.SDKUtil.listFiles;
+
+public class BinaryUtil {
+  public static void binaryToCarbon(String sourceImageFolder, String 
outputPath,
+String sufAnnotation, final String 
sufImage) throws Exception {
+Field[] fields = new Field[5];
+fields[0] = new Field("binaryId", DataTypes.INT);
+fields[1] = new Field("binaryName", DataTypes.STRING);
+fields[2] = new Field("binary", DataTypes.BINARY);
+fields[3] = new Field("labelName", DataTypes.STRING);
+fields[4] = new Field("labelContent", DataTypes.STRING);
+CarbonWriter writer = CarbonWriter
+.builder()
+.outputPath(outputPath)
+.withCsvInput(new Schema(fields))
+.withBlockSize(256)
+.writtenBy("SDKS3Example").withPageSizeInMb(1)
+.build();
+binaryToCarbon(sourceImageFolder, writer, sufAnnotation, sufImage);
+  }
+
+  public static boolean binaryToCarbon(String sourceImageFolder, CarbonWriter 
writer,
+  String sufAnnotation, final String sufImage) throws Exception {
+int num = 1;
+
+byte[] originBinary = null;
+
+// read and write image data
+for (int j = 0; j < num; j++) {
+
+  Object[] files = listFiles(sourceImageFolder, sufImage).toArray();
+
+  if (null != files) {
+for (int i = 0; i < files.length; i++) {
+  // read image and encode to Hex
 
 Review comment:
   For byte[], there is no need to encode. I replied in the previous comments.
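
   A minimal sketch of reading the image as raw bytes with no hex step, 
assuming the file fits in memory. `ImageBytesSketch` and its methods are 
hypothetical names and only JDK calls are used; the row layout simply mirrors 
the five fields declared in the diff above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ImageBytesSketch {
    // Read the whole file as bytes; no InputStream.available() sizing, no hex encoding.
    static byte[] readRaw(Path file) {
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assemble a row in the order binaryId, binaryName, binary, labelName, labelContent.
    static Object[] toRow(int id, Path file, String labelName, String labelContent) {
        return new Object[] {
            id, file.getFileName().toString(), readRaw(file), labelName, labelContent
        };
    }

    // Write data to a temp file, build a row from it, and check the bytes are unchanged.
    static boolean roundTrip(byte[] data) {
        try {
            Path tmp = Files.createTempFile("sketch", ".bin");
            try {
                Files.write(tmp, data);
                Object[] row = toRow(1, tmp, "label", "content");
                return Arrays.equals(data, (byte[]) row[2]);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0x89, 0x50, 0x4E, 0x47};
        System.out.println("bytes preserved: " + roundTrip(sample));
    }
}
```

   Reading through `Files.readAllBytes` also avoids sizing the buffer with 
`available()`, which is only an estimate of the bytes readable without blocking.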




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275772720
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || 
(each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || 
(each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+  

[GitHub] [carbondata] CarbonDataQA commented on issue #3178: [WIP] Support alter SORT_COLUMNS property

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3178: [WIP] Support alter SORT_COLUMNS property
URL: https://github.com/apache/carbondata/pull/3178#issuecomment-483640057
 
 
   Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3122/
   




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275772575
 
 

 ##
 File path: 
store/sdk/src/test/java/org/apache/carbondata/sdk/file/ImageTest.java
 ##
 @@ -0,0 +1,699 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import junit.framework.TestCase;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import 
org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+import org.apache.carbondata.test.util.BinaryUtil;
+import org.apache.commons.codec.DecoderException;
+import org.apache.commons.codec.binary.Hex;
+import org.apache.commons.io.FileUtils;
+import org.junit.Assert;
+import org.junit.Test;
+
+import javax.imageio.ImageIO;
+import javax.imageio.ImageReadParam;
+import javax.imageio.ImageReader;
+import javax.imageio.ImageTypeSpecifier;
+import javax.imageio.stream.FileImageInputStream;
+import javax.imageio.stream.ImageInputStream;
+import java.awt.color.ColorSpace;
+import java.awt.image.BufferedImage;
+import java.io.*;
+import java.util.Iterator;
+import java.util.List;
+
+import static org.apache.carbondata.sdk.file.utils.SDKUtil.listFiles;
+
+public class ImageTest extends TestCase {
+
+  @Test
+  public void testBinaryWithFilter() throws IOException, 
InvalidLoadOptionException, InterruptedException, DecoderException {
+String imagePath = "./src/test/resources/image/carbondatalogo.jpg";
+int num = 1;
+int rows = 1;
+String path = "./target/binary";
+Field[] fields = new Field[3];
+fields[0] = new Field("name", DataTypes.STRING);
+fields[1] = new Field("age", DataTypes.INT);
+fields[2] = new Field("image", DataTypes.BINARY);
+
+byte[] originBinary = null;
+
+// read and write image data
+for (int j = 0; j < num; j++) {
+  CarbonWriter writer = CarbonWriter
+  .builder()
+  .outputPath(path)
+  .withCsvInput(new Schema(fields))
+  .writtenBy("SDKS3Example").withPageSizeInMb(1)
+  .build();
+
+  for (int i = 0; i < rows; i++) {
+// read image and encode to Hex
+BufferedInputStream bis = new BufferedInputStream(new 
FileInputStream(imagePath));
+char[] hexValue = null;
+originBinary = new byte[bis.available()];
+while ((bis.read(originBinary)) != -1) {
+  hexValue = Hex.encodeHex(originBinary);
+}
+// write data
+writer.write(new String[]{"robot" + (i % 10), String.valueOf(i), 
String.valueOf(hexValue)});
+bis.close();
+  }
+  writer.close();
+}
+
+// Read data with filter
+EqualToExpression equalToExpression = new EqualToExpression(
+new ColumnExpression("name", DataTypes.STRING),
+new LiteralExpression("robot0", DataTypes.STRING));
+
+CarbonReader reader = CarbonReader
+.builder(path, "_temp")
+.filter(equalToExpression)
+.build();
+
+System.out.println("\nData:");
+int i = 0;
+while (i < 20 && reader.hasNext()) {
+  Object[] row = (Object[]) reader.readNextRow();
+
+  byte[] outputBinary = Hex.decodeHex(new String((byte[]) 
row[1]).toCharArray());
+  System.out.println(row[0] + " " + row[2] + " image size:" + 
outputBinary.length);
+
+  // validate output binary data and origin binary data
+  assert (originBinary.length == outputBinary.length);
+  for (int j = 0; j < originBinary.length; j++) {
+assert (originBinary[j] == outputBinary[j]);
+  }
+  String value = new String(outputBinary);
+  Assert.assertTrue(value.startsWith("�PNG"));
+  // save image, user can compare the save image and original image
+  String destString = "./target/binary/image" + i + ".jpg";
+  BufferedOutputStream bos = 

[GitHub] [carbondata] CarbonDataQA commented on issue #3178: [WIP] Support alter SORT_COLUMNS property

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3178: [WIP] Support alter SORT_COLUMNS property
URL: https://github.com/apache/carbondata/pull/3178#issuecomment-483639839
 
 
   Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11152/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3178: [WIP] Support alter SORT_COLUMNS property

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3178: [WIP] Support alter SORT_COLUMNS property
URL: https://github.com/apache/carbondata/pull/3178#issuecomment-483639476
 
 
   Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2891/
   




[GitHub] [carbondata] akashrn5 commented on issue #3122: [CARBONDATA-3289] MV datamap doesn't take effect when having clause use alias

2019-04-16 Thread GitBox
akashrn5 commented on issue #3122: [CARBONDATA-3289] MV datamap doesn't take 
effect when having clause use alias
URL: https://github.com/apache/carbondata/pull/3122#issuecomment-483639006
 
 
   @qiuchenjian I think you are fixing these issues by adding new rules for MV 
plan optimization, the same as in #3101. So instead of creating a new class 
for every rule, I suggest placing all the rules inside one class, as 
different objects.
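
   The suggested layout could look roughly like the sketch below. It is written 
in Java rather than Scala and does not use the MV module's real types: the 
`Rule` interface, the String stand-in for a plan, and both rule names are 
assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class MvRulesSketch {
    // Hypothetical rule contract: take a plan (here just a String) and return the rewritten plan.
    interface Rule {
        String apply(String plan);
    }

    // First rule, nested in the shared holder class instead of living in its own file.
    static final class CollapseSpaces implements Rule {
        public String apply(String plan) {
            return plan.trim().replaceAll("\\s+", " ");
        }
    }

    // Second rule, kept next to the first one.
    static final class LowerCase implements Rule {
        public String apply(String plan) {
            return plan.toLowerCase(Locale.ROOT);
        }
    }

    // All rules are registered in one place, in the order they should run.
    static final List<Rule> ALL = Arrays.asList(new CollapseSpaces(), new LowerCase());

    // Apply every registered rule in sequence.
    static String applyAll(String plan) {
        String current = plan;
        for (Rule rule : ALL) {
            current = rule.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(applyAll("  SELECT   a  FROM t "));
    }
}
```

   Keeping the rules together means a new rule is one more nested type plus one 
entry in the list, rather than a new top-level class.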




[GitHub] [carbondata] QiangCai opened a new pull request #3178: [WIP] Support alter SORT_COLUMNS property

2019-04-16 Thread GitBox
QiangCai opened a new pull request #3178: [WIP] Support alter SORT_COLUMNS 
property
URL: https://github.com/apache/carbondata/pull/3178
 
 
   Be sure to do all of the following checklist to help us incorporate 
   your contribution quickly and easily:
   
- [ ] Any interfaces changed?

- [ ] Any backward compatibility impacted?

- [ ] Document update required?
   
- [ ] Testing done
   Please provide details on 
   - Whether new unit test cases have been added or why no new tests 
are required?
   - How it is tested? Please attach test report.
   - Is it a performance related change? Please attach the performance 
test report.
   - Any additional information to help reviewers in testing this 
change.
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
   
   




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275771398
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || 
(each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || 
(each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+  

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275769962
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || 
(each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || 
(each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+  

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275769962
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || (each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || (each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+  

[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483627489
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2890/
   




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275753544
 
 

 ##
 File path: store/sdk/src/main/java/org/apache/carbondata/test/util/BinaryUtil.java
 ##
 @@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.test.util;
+
+import java.io.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.sdk.file.CarbonReader;
+import org.apache.carbondata.sdk.file.CarbonWriter;
+import org.apache.carbondata.sdk.file.Field;
+import org.apache.carbondata.sdk.file.Schema;
+
+import static org.apache.carbondata.sdk.file.utils.SDKUtil.listFiles;
+
+public class BinaryUtil {
+  public static void binaryToCarbon(String sourceImageFolder, String outputPath,
+String sufAnnotation, final String sufImage) throws Exception {
+Field[] fields = new Field[5];
+fields[0] = new Field("binaryId", DataTypes.INT);
+fields[1] = new Field("binaryName", DataTypes.STRING);
+fields[2] = new Field("binary", DataTypes.BINARY);
+fields[3] = new Field("labelName", DataTypes.STRING);
+fields[4] = new Field("labelContent", DataTypes.STRING);
+CarbonWriter writer = CarbonWriter
+.builder()
+.outputPath(outputPath)
+.withCsvInput(new Schema(fields))
+.withBlockSize(256)
+.writtenBy("SDKS3Example").withPageSizeInMb(1)
+.build();
+binaryToCarbon(sourceImageFolder, writer, sufAnnotation, sufImage);
+  }
+
+  public static boolean binaryToCarbon(String sourceImageFolder, CarbonWriter writer,
+  String sufAnnotation, final String sufImage) throws Exception {
+int num = 1;
+
+byte[] originBinary = null;
+
+// read and write image data
+for (int j = 0; j < num; j++) {
+
+  Object[] files = listFiles(sourceImageFolder, sufImage).toArray();
+
+  if (null != files) {
+for (int i = 0; i < files.length; i++) {
+  // read image and encode to Hex
 
 Review comment:
   I mean "where"? This code uses a plain byte[]; I don't see any encoding logic here.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275753185
 
 

 ##
 File path: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
 ##
 @@ -65,7 +65,7 @@
   @Override
   public void write(Object object) throws IOException {
 try {
-  writable.set((String[]) object);
+  writable.set((Object[]) object);
 
 Review comment:
   But CSVWriter always goes through the convert step. So if you pass a byte[] instead of a String, won't the hex decode give a wrong result?




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275752182
 
 

 ##
 File path: store/sdk/src/main/java/org/apache/carbondata/test/util/BinaryUtil.java
 ##
 @@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.test.util;
+
+import java.io.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.sdk.file.CarbonReader;
+import org.apache.carbondata.sdk.file.CarbonWriter;
+import org.apache.carbondata.sdk.file.Field;
+import org.apache.carbondata.sdk.file.Schema;
+
+import static org.apache.carbondata.sdk.file.utils.SDKUtil.listFiles;
+
+public class BinaryUtil {
+  public static void binaryToCarbon(String sourceImageFolder, String outputPath,
+String sufAnnotation, final String sufImage) throws Exception {
+Field[] fields = new Field[5];
+fields[0] = new Field("binaryId", DataTypes.INT);
+fields[1] = new Field("binaryName", DataTypes.STRING);
+fields[2] = new Field("binary", DataTypes.BINARY);
+fields[3] = new Field("labelName", DataTypes.STRING);
+fields[4] = new Field("labelContent", DataTypes.STRING);
+CarbonWriter writer = CarbonWriter
+.builder()
+.outputPath(outputPath)
+.withCsvInput(new Schema(fields))
+.withBlockSize(256)
+.writtenBy("SDKS3Example").withPageSizeInMb(1)
+.build();
+binaryToCarbon(sourceImageFolder, writer, sufAnnotation, sufImage);
+  }
+
+  public static boolean binaryToCarbon(String sourceImageFolder, CarbonWriter writer,
+  String sufAnnotation, final String sufImage) throws Exception {
+int num = 1;
+
+byte[] originBinary = null;
+
+// read and write image data
+for (int j = 0; j < num; j++) {
+
+  Object[] files = listFiles(sourceImageFolder, sufImage).toArray();
+
+  if (null != files) {
+for (int i = 0; i < files.length; i++) {
+  // read image and encode to Hex
 
 Review comment:
   Many places, for example org.apache.carbondata.sdk.file.ImageTest#testBinaryWithFilter
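
For reference, the hex path that ImageTest#testBinaryWithFilter exercises (encode the image bytes to hex before writing, decode them after reading) can be sketched without any CarbonData dependency. This is only a minimal sketch under stated assumptions, not the project's code: the class name HexRoundTripSketch and both helpers are hypothetical, hand-written stand-ins for commons-codec's Hex.encodeHex and Hex.decodeHex.

```java
import java.util.Arrays;

public class HexRoundTripSketch {
  private static final char[] DIGITS = "0123456789abcdef".toCharArray();

  // Hypothetical stand-in for Hex.encodeHex: two lowercase hex characters per byte.
  public static String encodeHex(byte[] data) {
    char[] out = new char[data.length * 2];
    for (int i = 0; i < data.length; i++) {
      int v = data[i] & 0xFF;          // treat the byte as unsigned
      out[2 * i] = DIGITS[v >>> 4];    // high nibble
      out[2 * i + 1] = DIGITS[v & 0x0F]; // low nibble
    }
    return new String(out);
  }

  // Hypothetical stand-in for Hex.decodeHex: rejects odd length and non-hex characters.
  public static byte[] decodeHex(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("odd number of hex characters");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("non-hex character at pair " + i);
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  public static void main(String[] args) {
    // Made-up sample bytes; a real test would read them from an image file.
    byte[] original = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A};
    String hex = encodeHex(original);
    byte[] decoded = decodeHex(hex);
    System.out.println(hex + " round-trips: " + Arrays.equals(original, decoded));
  }
}
```

The point of the sketch is that the String route doubles the payload size and needs a decode on the way back, which is the "encoding logic" the question above is looking for.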




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275751739
 
 

 ##
 File path: store/sdk/src/main/java/org/apache/carbondata/sdk/file/utils/SDKUtil.java
 ##
 @@ -0,0 +1,79 @@
+/*
 
 Review comment:
   No, writing binary data uses this class to list binary files, such as JPG images.




[GitHub] [carbondata] CarbonDataQA commented on issue #3164: [CARBONDATA-3331] Fix for external table in Show Metacache

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3164: [CARBONDATA-3331] Fix for external table 
in Show Metacache
URL: https://github.com/apache/carbondata/pull/3164#issuecomment-483621995
 
 
   Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3119/
   




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275751479
 
 

 ##
 File path: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReaderBuilder.java
 ##
 @@ -70,6 +77,53 @@
 ThreadLocalSessionInfo.setCarbonSessionInfo(new CarbonSessionInfo());
   }
 
+  /**
+   * Construct a CarbonReaderBuilder with table name
+   *
+   * @param tableName table name
+   */
+  CarbonReaderBuilder(String tableName) {
 
 Review comment:
   removed




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275751374
 
 

 ##
 File path: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java
 ##
 @@ -155,6 +154,17 @@ public static CarbonReaderBuilder builder(String tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Return a new {@link CarbonReaderBuilder} instance
+   *
+   * @return CarbonReaderBuilder object
+   */
+  public static CarbonReaderBuilder builder() {
 
 Review comment:
   removed




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275750897
 
 

 ##
 File path: store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
 ##
 @@ -65,7 +65,7 @@
   @Override
   public void write(Object object) throws IOException {
 try {
-  writable.set((String[]) object);
+  writable.set((Object[]) object);
 
 Review comment:
   This lets the writer accept values in their real data types.
   It is better to support it this way, since it avoids the convert operation.
   See org.apache.carbondata.test.util.BinaryUtil#binaryToCarbon(java.lang.String, org.apache.carbondata.sdk.file.CarbonWriter, java.lang.String, java.lang.String):
   ```
 String labelFileName = ((String) files[i]).split(sufImage)[0] + sufAnnotation;
 BufferedInputStream txtBis = new BufferedInputStream(new FileInputStream(labelFileName));
 String labelValue = null;
 byte[] labelBinary = null;
 labelBinary = new byte[txtBis.available()];
 while ((txtBis.read(labelBinary)) != -1) {
   labelValue = new String(labelBinary, "UTF-8");
 }
 // write data
 writer.write(new Object[]{i, (String) files[i], originBinary,
 labelFileName, labelValue});
   ```
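
A note on the snippet above: it sizes the buffer with available() and expects one read call to fill it. The InputStream Javadoc describes available() as an estimate and advises against using it to size a buffer for the whole stream. In addition, read with a zero-length buffer returns 0, so an empty label file would never end the loop. A hedged alternative, assuming the files fit in memory, is to read them with java.nio.file.Files.readAllBytes. The class LabelRowSketch and its method are hypothetical; only the column order (id, image name, image bytes, label file name, label text) is taken from the snippet.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LabelRowSketch {
  // Build one row with real data types: Integer, String, byte[], String, String.
  // Hypothetical helper; the column order mirrors the writer.write call quoted above.
  public static Object[] buildRow(int id, Path imageFile, Path labelFile) throws IOException {
    byte[] imageBytes = Files.readAllBytes(imageFile); // whole file, no available() guess
    String labelText = new String(Files.readAllBytes(labelFile), StandardCharsets.UTF_8);
    return new Object[] {id, imageFile.toString(), imageBytes, labelFile.toString(), labelText};
  }

  public static void main(String[] args) throws IOException {
    // Temporary files with made-up content stand in for an image and its annotation.
    Path image = Files.createTempFile("sketch", ".jpg");
    Path label = Files.createTempFile("sketch", ".txt");
    try {
      Files.write(image, new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF});
      Files.write(label, "daisy".getBytes(StandardCharsets.UTF_8));
      Object[] row = buildRow(0, image, label);
      System.out.println("image bytes: " + ((byte[]) row[2]).length + ", label: " + row[4]);
    } finally {
      Files.deleteIfExists(image);
      Files.deleteIfExists(label);
    }
  }
}
```

The resulting Object[] has the same shape as the array the snippet hands to writer.write, so the binary column stays a byte[] and no hex conversion is involved.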




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275747927
 
 

 ##
 File path: store/sdk/src/test/java/org/apache/carbondata/sdk/file/ImageTest.java
 ##
 @@ -0,0 +1,699 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import junit.framework.TestCase;
+
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
+import org.apache.carbondata.core.scan.expression.LiteralExpression;
+import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
+import org.apache.carbondata.core.util.CarbonProperties;
+
+import org.apache.carbondata.test.util.BinaryUtil;
+import org.apache.commons.codec.DecoderException;
+import org.apache.commons.codec.binary.Hex;
+import org.apache.commons.io.FileUtils;
+import org.junit.Assert;
+import org.junit.Test;
+
+import javax.imageio.ImageIO;
+import javax.imageio.ImageReadParam;
+import javax.imageio.ImageReader;
+import javax.imageio.ImageTypeSpecifier;
+import javax.imageio.stream.FileImageInputStream;
+import javax.imageio.stream.ImageInputStream;
+import java.awt.color.ColorSpace;
+import java.awt.image.BufferedImage;
+import java.io.*;
+import java.util.Iterator;
+import java.util.List;
+
+import static org.apache.carbondata.sdk.file.utils.SDKUtil.listFiles;
+
+public class ImageTest extends TestCase {
+
+  @Test
+  public void testBinaryWithFilter() throws IOException, InvalidLoadOptionException, InterruptedException, DecoderException {
+String imagePath = "./src/test/resources/image/carbondatalogo.jpg";
+int num = 1;
+int rows = 1;
+String path = "./target/binary";
+Field[] fields = new Field[3];
+fields[0] = new Field("name", DataTypes.STRING);
+fields[1] = new Field("age", DataTypes.INT);
+fields[2] = new Field("image", DataTypes.BINARY);
+
+byte[] originBinary = null;
+
+// read and write image data
+for (int j = 0; j < num; j++) {
+  CarbonWriter writer = CarbonWriter
+  .builder()
+  .outputPath(path)
+  .withCsvInput(new Schema(fields))
+  .writtenBy("SDKS3Example").withPageSizeInMb(1)
+  .build();
+
+  for (int i = 0; i < rows; i++) {
+// read image and encode to Hex
+BufferedInputStream bis = new BufferedInputStream(new FileInputStream(imagePath));
+char[] hexValue = null;
+originBinary = new byte[bis.available()];
+while ((bis.read(originBinary)) != -1) {
+  hexValue = Hex.encodeHex(originBinary);
+}
+// write data
+writer.write(new String[]{"robot" + (i % 10), String.valueOf(i), String.valueOf(hexValue)});
+bis.close();
+  }
+  writer.close();
+}
+
+// Read data with filter
+EqualToExpression equalToExpression = new EqualToExpression(
+new ColumnExpression("name", DataTypes.STRING),
+new LiteralExpression("robot0", DataTypes.STRING));
+
+CarbonReader reader = CarbonReader
+.builder(path, "_temp")
+.filter(equalToExpression)
+.build();
+
+System.out.println("\nData:");
+int i = 0;
+while (i < 20 && reader.hasNext()) {
+  Object[] row = (Object[]) reader.readNextRow();
+
+  byte[] outputBinary = Hex.decodeHex(new String((byte[]) row[1]).toCharArray());
+  System.out.println(row[0] + " " + row[2] + " image size:" + outputBinary.length);
+
+  // validate output binary data and origin binary data
+  assert (originBinary.length == outputBinary.length);
+  for (int j = 0; j < originBinary.length; j++) {
+assert (originBinary[j] == outputBinary[j]);
+  }
+  String value = new String(outputBinary);
+  Assert.assertTrue(value.startsWith("�PNG"));
+  // save image, user can compare the save image and original image
+  String destString = "./target/binary/image" + i + ".jpg";
+  BufferedOutputStream 

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275750565
 
 

 ##
 File path: processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/BinaryFieldConverterImpl.java
 ##
 @@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.processing.loading.converter.impl;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.datastore.row.CarbonRow;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
+import org.apache.carbondata.processing.loading.DataField;
+import org.apache.carbondata.processing.loading.converter.BadRecordLogHolder;
+import org.apache.carbondata.processing.loading.converter.FieldConverter;
+import org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException;
+import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
+
+import org.apache.commons.codec.DecoderException;
+import org.apache.commons.codec.binary.Hex;
+import org.apache.log4j.Logger;
+
+/**
+ * Converter for binary
+ */
+public class BinaryFieldConverterImpl implements FieldConverter {
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(BinaryFieldConverterImpl.class.getName());
+
+  private int index;
+  private DataType dataType;
+  private CarbonDimension dimension;
+  private String nullformat;
+  private boolean isEmptyBadRecord;
+  private DataField dataField;
+  public BinaryFieldConverterImpl(DataField dataField, String nullformat, int index,
+  boolean isEmptyBadRecord) {
+this.dataType = dataField.getColumn().getDataType();
+this.dimension = (CarbonDimension) dataField.getColumn();
+this.nullformat = nullformat;
+this.index = index;
+this.isEmptyBadRecord = isEmptyBadRecord;
+this.dataField = dataField;
+  }
+
+  @Override
+  public void convert(CarbonRow row, BadRecordLogHolder logHolder)
+  throws CarbonDataLoadingException {
+if (row.getObject(index) instanceof String) {
+  row.update(convert(row.getString(index), logHolder), index);
+} else if (row.getObject(index) instanceof byte[]) {
+  row.update(row.getObject(index), index);
+} else {
+  throw new CarbonDataLoadingException("Binary only support String and byte[] data type");
+}
+  }
+
+  @Override
+  public Object convert(Object value, BadRecordLogHolder logHolder)
+  throws RuntimeException {
+String literalValue = (String) (value);
+if (literalValue != null) {
+  try {
+return Hex.decodeHex(literalValue.toCharArray());
 
 Review comment:
   SQL works like that, but it applies only to SDK writes with String (the old way). No decoding is needed if the user writes with the real data type, such as byte[].
   I have just changed this and removed Hex.decodeHex.
   So if the user encodes binary to hex and writes it to a CSV file, the read will return the hex code.
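
The type dispatch in the hunk above (as quoted, before Hex.decodeHex was removed) can be sketched standalone: a String is treated as hex text and decoded, a byte[] is passed through untouched, and anything else is rejected. This is a minimal sketch under stated assumptions, not the converter itself: BinaryValueDispatchSketch, toBinary and decodeHex are hypothetical names, and the hand-written decoder stands in for commons-codec's Hex.decodeHex.

```java
import java.util.Arrays;

public class BinaryValueDispatchSketch {
  // Hypothetical mirror of the instanceof checks in the quoted convert method.
  // A null value matches neither branch and is rejected like any other type.
  public static byte[] toBinary(Object value) {
    if (value instanceof String) {
      return decodeHex((String) value); // String input is assumed to be hex text
    } else if (value instanceof byte[]) {
      return (byte[]) value;            // real binary input needs no conversion
    }
    throw new IllegalArgumentException("Binary only supports String and byte[] values");
  }

  // Hypothetical stand-in for Hex.decodeHex.
  public static byte[] decodeHex(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("odd number of hex characters");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("non-hex character at pair " + i);
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] raw = {0x01, 0x02, (byte) 0xAB};
    System.out.println("byte[] passed through: " + (toBinary(raw) == raw));
    System.out.println("hex decoded equally: " + Arrays.equals(raw, toBinary("0102ab")));
  }
}
```

With the decode removed, as the comment above describes, the String branch would keep the text as written, so a caller who hex-encodes before writing gets hex back on read.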




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275749305
 
 

 ##
 File path: integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceBinaryTest.scala
 ##
 @@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.carbondata.datasource
+
+import java.io.File
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.test.util.BinaryUtil
+import org.apache.commons.io.FileUtils
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.carbondata.datasource.TestUtil._
+import org.apache.spark.util.SparkUtil
+
+import org.scalatest.{BeforeAndAfterAll, FunSuite}
+
+class SparkCarbonDataSourceBinaryTest extends FunSuite with BeforeAndAfterAll {
+
+var writerPath = new File(this.getClass.getResource("/").getPath
++ "../../target/SparkCarbonFileFormat/WriterOutput/")
+.getCanonicalPath
+var outputPath = writerPath + 2
+//getCanonicalPath gives path with \, but the code expects /.
+writerPath = writerPath.replace("\\", "/")
+
+var sdkPath = new File(this.getClass.getResource("/").getPath + 
"../../../../store/sdk/")
+.getCanonicalPath
+
+def buildTestBinaryData(): Any = {
+FileUtils.deleteDirectory(new File(writerPath))
+
+val sourceImageFolder = sdkPath + "/src/test/resources/image/flowers"
+val sufAnnotation = ".txt"
+BinaryUtil.binaryToCarbon(sourceImageFolder, writerPath, 
sufAnnotation, ".jpg")
+}
+
+def cleanTestData() = {
+FileUtils.deleteDirectory(new File(writerPath))
+FileUtils.deleteDirectory(new File(outputPath))
+}
+
+import spark._
+
+override def beforeAll(): Unit = {
+CarbonProperties.getInstance()
+.addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
+buildTestBinaryData()
+
+FileUtils.deleteDirectory(new File(outputPath))
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+override def afterAll(): Unit = {
+cleanTestData()
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+test("Test direct sql read carbon") {
+assert(new File(writerPath).exists())
+checkAnswer(
+sql(s"SELECT COUNT(*) FROM carbon.`$writerPath`"),
+Seq(Row(3)))
+}
+
+test("Test read image carbon with spark carbon file format, generate by 
sdk, CTAS") {
+sql("DROP TABLE IF EXISTS binaryCarbon")
+sql("DROP TABLE IF EXISTS binaryCarbon3")
+if (SparkUtil.isSparkVersionEqualTo("2.1")) {
+sql(s"CREATE TABLE binaryCarbon USING CARBON OPTIONS(PATH 
'$writerPath')")
+sql(s"CREATE TABLE binaryCarbon3 USING CARBON OPTIONS(PATH 
'$outputPath')" + " AS SELECT * FROM binaryCarbon")
+} else {
+sql(s"CREATE TABLE binaryCarbon USING CARBON LOCATION 
'$writerPath'")
+sql(s"CREATE TABLE binaryCarbon3 USING CARBON LOCATION 
'$outputPath'" + " AS SELECT * FROM binaryCarbon")
+}
+checkAnswer(sql("SELECT COUNT(*) FROM binaryCarbon"),
+Seq(Row(3)))
+checkAnswer(sql("SELECT COUNT(*) FROM binaryCarbon3"),
+Seq(Row(3)))
+sql("DROP TABLE IF EXISTS binaryCarbon")
+sql("DROP TABLE IF EXISTS binaryCarbon3")
+}
+
+test("Don't support sort_columns") {
+import spark._
+sql("DROP TABLE IF EXISTS binaryTable")
+val exception = intercept[Exception] {
+sql(
+s"""
+   | CREATE TABLE binaryTable (
+   |id DOUBLE,
+   |label BOOLEAN,
+   |name STRING,
+   |image BINARY,
+   |autoLabel BOOLEAN)
+   | using carbon
+   | options('SORT_COLUMNS'='image')
+   """.stripMargin)
+sql("SELECT COUNT(*) FROM 

[GitHub] [carbondata] CarbonDataQA commented on issue #3164: [CARBONDATA-3331] Fix for external table in Show Metacache

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3164: [CARBONDATA-3331] Fix for external table 
in Show Metacache
URL: https://github.com/apache/carbondata/pull/3164#issuecomment-483616809
 
 
   Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11149/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3173: [CARBONDATA-3351] Support Binary Data 
Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483615190
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2889/
   




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275729737
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/test/util/BinaryUtil.java
 ##
 @@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.test.util;
+
+import java.io.*;
+
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.sdk.file.CarbonReader;
+import org.apache.carbondata.sdk.file.CarbonWriter;
+import org.apache.carbondata.sdk.file.Field;
+import org.apache.carbondata.sdk.file.Schema;
+
+import static org.apache.carbondata.sdk.file.utils.SDKUtil.listFiles;
+
+public class BinaryUtil {
+  public static void binaryToCarbon(String sourceImageFolder, String 
outputPath,
+String sufAnnotation, final String 
sufImage) throws Exception {
+Field[] fields = new Field[5];
+fields[0] = new Field("binaryId", DataTypes.INT);
+fields[1] = new Field("binaryName", DataTypes.STRING);
+fields[2] = new Field("binary", DataTypes.BINARY);
+fields[3] = new Field("labelName", DataTypes.STRING);
+fields[4] = new Field("labelContent", DataTypes.STRING);
+CarbonWriter writer = CarbonWriter
+.builder()
+.outputPath(outputPath)
+.withCsvInput(new Schema(fields))
+.withBlockSize(256)
+.writtenBy("SDKS3Example").withPageSizeInMb(1)
+.build();
+binaryToCarbon(sourceImageFolder, writer, sufAnnotation, sufImage);
+  }
+
+  public static boolean binaryToCarbon(String sourceImageFolder, CarbonWriter 
writer,
+  String sufAnnotation, final String sufImage) throws Exception {
+int num = 1;
+
+byte[] originBinary = null;
+
+// read and write image data
+for (int j = 0; j < num; j++) {
+
+  Object[] files = listFiles(sourceImageFolder, sufImage).toArray();
+
+  if (null != files) {
+for (int i = 0; i < files.length; i++) {
+  // read image and encode to Hex
 
 Review comment:
   Where are we encoding to hex here? I see that we are passing the original 
byte array from the binary.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275729122
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/utils/SDKUtil.java
 ##
 @@ -0,0 +1,79 @@
+/*
 
 Review comment:
   Please revert the changes in this file. They are not related to Binary, 
   so please keep them in a separate PR.




[GitHub] [carbondata] CarbonDataQA commented on issue #3164: [CARBONDATA-3331] Fix for external table in Show Metacache

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3164: [CARBONDATA-3331] Fix for external table 
in Show Metacache
URL: https://github.com/apache/carbondata/pull/3164#issuecomment-483600034
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2888/
   




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275726945
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReaderBuilder.java
 ##
 @@ -70,6 +77,53 @@
 ThreadLocalSessionInfo.setCarbonSessionInfo(new CarbonSessionInfo());
   }
 
+  /**
+   * Construct a CarbonReaderBuilder with table name
+   *
+   * @param tableName table name
+   */
+  CarbonReaderBuilder(String tableName) {
 
 Review comment:
   Please revert the changes in this file. They are not related to Binary, 
   so please keep them in a separate PR.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275726945
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReaderBuilder.java
 ##
 @@ -70,6 +77,53 @@
 ThreadLocalSessionInfo.setCarbonSessionInfo(new CarbonSessionInfo());
   }
 
+  /**
+   * Construct a CarbonReaderBuilder with table name
+   *
+   * @param tableName table name
+   */
+  CarbonReaderBuilder(String tableName) {
 
 Review comment:
   These changes are not related to Binary; please keep them in a separate PR.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275726664
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReader.java
 ##
 @@ -155,6 +154,17 @@ public static CarbonReaderBuilder builder(String 
tablePath) {
 return builder(tablePath, tableName);
   }
 
+  /**
+   * Return a new {@link CarbonReaderBuilder} instance
+   *
+   * @return CarbonReaderBuilder object
+   */
+  public static CarbonReaderBuilder builder() {
 
 Review comment:
   This change is not related to Binary; please keep it in a separate PR.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275726498
 
 

 ##
 File path: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CSVCarbonWriter.java
 ##
 @@ -65,7 +65,7 @@
   @Override
   public void write(Object object) throws IOException {
 try {
-  writable.set((String[]) object);
+  writable.set((Object[]) object);
 
 Review comment:
   Wouldn't it be better to keep this as String[], since the converter step expects binary as a string?
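
A small sketch of what the cast change affects, assuming a row that carries a raw byte[] next to string fields. Such a row is not a String[], so a `(String[])` cast would throw a ClassCastException, while `(Object[])` accepts both shapes. `RowCastSketch` and `describe` are hypothetical names for illustration only, not part of the SDK.

```java
public class RowCastSketch {

  // Accepts any row shape, as the (Object[]) cast in the diff does.
  static String describe(Object row) {
    Object[] fields = (Object[]) row; // a String[] is also an Object[] in Java
    StringBuilder sb = new StringBuilder();
    for (Object field : fields) {
      if (sb.length() > 0) {
        sb.append(',');
      }
      sb.append(field instanceof byte[] ? "bytes" : "string");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Object allStrings = new String[] {"1", "rose.jpg"};
    Object mixed = new Object[] {"1", "rose.jpg", new byte[] {1, 2, 3}};

    System.out.println(describe(allStrings));      // prints string,string
    System.out.println(describe(mixed));           // prints string,string,bytes
    System.out.println(mixed instanceof String[]); // prints false
  }
}
```

Keeping the `String[]` cast would therefore require every binary value to arrive already encoded as text, which is the trade-off the question raises.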




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275718435
 
 

 ##
 File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/converter/impl/BinaryFieldConverterImpl.java
 ##
 @@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.processing.loading.converter.impl;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.datastore.row.CarbonRow;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
+import org.apache.carbondata.processing.loading.DataField;
+import org.apache.carbondata.processing.loading.converter.BadRecordLogHolder;
+import org.apache.carbondata.processing.loading.converter.FieldConverter;
+import 
org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException;
+import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
+
+import org.apache.commons.codec.DecoderException;
+import org.apache.commons.codec.binary.Hex;
+import org.apache.log4j.Logger;
+
+/**
+ * Converter for binary
+ */
+public class BinaryFieldConverterImpl implements FieldConverter {
+  private static final Logger LOGGER =
+  
LogServiceFactory.getLogService(BinaryFieldConverterImpl.class.getName());
+
+  private int index;
+  private DataType dataType;
+  private CarbonDimension dimension;
+  private String nullformat;
+  private boolean isEmptyBadRecord;
+  private DataField dataField;
+  public BinaryFieldConverterImpl(DataField dataField, String nullformat, int 
index,
+  boolean isEmptyBadRecord) {
+this.dataType = dataField.getColumn().getDataType();
+this.dimension = (CarbonDimension) dataField.getColumn();
+this.nullformat = nullformat;
+this.index = index;
+this.isEmptyBadRecord = isEmptyBadRecord;
+this.dataField = dataField;
+  }
+
+  @Override
+  public void convert(CarbonRow row, BadRecordLogHolder logHolder)
+  throws CarbonDataLoadingException {
+if (row.getObject(index) instanceof String) {
+  row.update(convert(row.getString(index), logHolder), index);
+} else if (row.getObject(index) instanceof byte[]) {
+  row.update(row.getObject(index), index);
+} else {
+  throw new CarbonDataLoadingException("Binary only support String and 
byte[] data type");
+}
+  }
+
+  @Override
+  public Object convert(Object value, BadRecordLogHolder logHolder)
+  throws RuntimeException {
+String literalValue = (String) (value);
+if (literalValue != null) {
+  try {
+return Hex.decodeHex(literalValue.toCharArray());
 
 Review comment:
   In both the SDK flow and the SQL flow, does the user need to convert the 
binary to a hex string and pass it in?
   Is that why we need to decode it here?
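
To make the question concrete, here is a minimal sketch of the two input shapes the converter in this diff accepts: a hex string that is decoded, and a byte[] that passes through unchanged. It uses a hand-written decoder rather than commons-codec `Hex.decodeHex` so that it is self-contained. `BinaryInputSketch`, `toBytes`, and `decodeHex` are hypothetical names, not CarbonData API.

```java
import java.util.Arrays;

public class BinaryInputSketch {

  // Hypothetical helper: decode a hex string such as "cafe" into bytes.
  static byte[] decodeHex(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("Odd number of hex characters");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("Invalid hex pair at index " + (2 * i));
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  // Mirrors the branch structure of convert(CarbonRow, ...) in the diff above.
  static byte[] toBytes(Object value) {
    if (value instanceof String) {
      return decodeHex((String) value);
    } else if (value instanceof byte[]) {
      return (byte[]) value;
    }
    throw new IllegalArgumentException("Binary only supports String and byte[] input");
  }

  public static void main(String[] args) {
    byte[] fromHex = toBytes("cafe");
    byte[] raw = toBytes(new byte[] {(byte) 0xCA, (byte) 0xFE});
    System.out.println(Arrays.equals(fromHex, raw)); // prints true
  }
}
```

Under this reading, only the String branch depends on the caller having hex-encoded the data; a caller that already holds a byte[] never touches the decoder.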




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275711453
 
 

 ##
 File path: 
integration/spark-datasource/src/test/scala/org/apache/spark/sql/carbondata/datasource/SparkCarbonDataSourceBinaryTest.scala
 ##
 @@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.carbondata.datasource
+
+import java.io.File
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.test.util.BinaryUtil
+import org.apache.commons.io.FileUtils
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.carbondata.datasource.TestUtil._
+import org.apache.spark.util.SparkUtil
+
+import org.scalatest.{BeforeAndAfterAll, FunSuite}
+
+class SparkCarbonDataSourceBinaryTest extends FunSuite with BeforeAndAfterAll {
+
+var writerPath = new File(this.getClass.getResource("/").getPath
++ "../../target/SparkCarbonFileFormat/WriterOutput/")
+.getCanonicalPath
+var outputPath = writerPath + 2
+//getCanonicalPath gives path with \, but the code expects /.
+writerPath = writerPath.replace("\\", "/")
+
+var sdkPath = new File(this.getClass.getResource("/").getPath + 
"../../../../store/sdk/")
+.getCanonicalPath
+
+def buildTestBinaryData(): Any = {
+FileUtils.deleteDirectory(new File(writerPath))
+
+val sourceImageFolder = sdkPath + "/src/test/resources/image/flowers"
+val sufAnnotation = ".txt"
+BinaryUtil.binaryToCarbon(sourceImageFolder, writerPath, 
sufAnnotation, ".jpg")
+}
+
+def cleanTestData() = {
+FileUtils.deleteDirectory(new File(writerPath))
+FileUtils.deleteDirectory(new File(outputPath))
+}
+
+import spark._
+
+override def beforeAll(): Unit = {
+CarbonProperties.getInstance()
+.addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
+buildTestBinaryData()
+
+FileUtils.deleteDirectory(new File(outputPath))
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+override def afterAll(): Unit = {
+cleanTestData()
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+test("Test direct sql read carbon") {
+assert(new File(writerPath).exists())
+checkAnswer(
+sql(s"SELECT COUNT(*) FROM carbon.`$writerPath`"),
+Seq(Row(3)))
+}
+
+test("Test read image carbon with spark carbon file format, generate by 
sdk, CTAS") {
+sql("DROP TABLE IF EXISTS binaryCarbon")
+sql("DROP TABLE IF EXISTS binaryCarbon3")
+if (SparkUtil.isSparkVersionEqualTo("2.1")) {
+sql(s"CREATE TABLE binaryCarbon USING CARBON OPTIONS(PATH 
'$writerPath')")
+sql(s"CREATE TABLE binaryCarbon3 USING CARBON OPTIONS(PATH 
'$outputPath')" + " AS SELECT * FROM binaryCarbon")
+} else {
+sql(s"CREATE TABLE binaryCarbon USING CARBON LOCATION 
'$writerPath'")
+sql(s"CREATE TABLE binaryCarbon3 USING CARBON LOCATION 
'$outputPath'" + " AS SELECT * FROM binaryCarbon")
+}
+checkAnswer(sql("SELECT COUNT(*) FROM binaryCarbon"),
+Seq(Row(3)))
+checkAnswer(sql("SELECT COUNT(*) FROM binaryCarbon3"),
+Seq(Row(3)))
+sql("DROP TABLE IF EXISTS binaryCarbon")
+sql("DROP TABLE IF EXISTS binaryCarbon3")
+}
+
+test("Don't support sort_columns") {
+import spark._
+sql("DROP TABLE IF EXISTS binaryTable")
+val exception = intercept[Exception] {
+sql(
+s"""
+   | CREATE TABLE binaryTable (
+   |id DOUBLE,
+   |label BOOLEAN,
+   |name STRING,
+   |image BINARY,
+   |autoLabel BOOLEAN)
+   | using carbon
+   | options('SORT_COLUMNS'='image')
+   """.stripMargin)
+sql("SELECT COUNT(*) FROM 

[GitHub] [carbondata] NamanRastogi commented on a change in pull request #3164: [WIP] [CARBONDATA-3331] Fix for external table in Show Metacache

2019-04-16 Thread GitBox
NamanRastogi commented on a change in pull request #3164: [WIP] 
[CARBONDATA-3331] Fix for external table in Show Metacache
URL: https://github.com/apache/carbondata/pull/3164#discussion_r275691272
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
 ##
 @@ -71,52 +65,64 @@ case class CarbonShowCacheCommand(tableIdentifier: 
Option[TableIdentifier],
 Row("ALL", "ALL", 0L, 0L, 0L),
 Row(currentDatabase, "ALL", 0L, 0L, 0L))
 } else {
-  val carbonTables = CarbonEnv.getInstance(sparkSession).carbonMetaStore
-.listAllTables(sparkSession).filter {
-carbonTable =>
-  carbonTable.getDatabaseName.equalsIgnoreCase(currentDatabase) &&
-  isValidTable(carbonTable, sparkSession) &&
-  !carbonTable.isChildDataMap
+  val carbonTables = 
sparkSession.sessionState.catalog.listTables(currentDatabase).collect {
+case tableIdent if CarbonEnv.getInstance(sparkSession).carbonMetaStore
+  .tableExists(tableIdent)(sparkSession) =>
+  CarbonEnv.getCarbonTable(tableIdent)(sparkSession)
 
 Review comment:
   Done.




[GitHub] [carbondata] CarbonDataQA commented on issue #3164: [WIP] [CARBONDATA-3331] Fix for external table in Show Metacache

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3164: [WIP] [CARBONDATA-3331] Fix for external 
table in Show Metacache
URL: https://github.com/apache/carbondata/pull/3164#issuecomment-483566145
 
 
   Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/3118/
   




[GitHub] [carbondata] CarbonDataQA commented on issue #3164: [WIP] [CARBONDATA-3331] Fix for external table in Show Metacache

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3164: [WIP] [CARBONDATA-3331] Fix for external 
table in Show Metacache
URL: https://github.com/apache/carbondata/pull/3164#issuecomment-483557691
 
 
   Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/11148/
   




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275677395
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTableForBinary.scala
 ##
 @@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io._
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.test.util.BinaryUtil
+import org.apache.commons.io.FileUtils
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+
+class TestNonTransactionalCarbonTableForBinary extends QueryTest with 
BeforeAndAfterAll {
+
+var writerPath = new File(this.getClass.getResource("/").getPath
++ "../../target/SparkCarbonFileFormat/WriterOutput/")
+.getCanonicalPath
+var outputPath = writerPath + 2
+//getCanonicalPath gives path with \, but the code expects /.
+writerPath = writerPath.replace("\\", "/")
+
+var sdkPath = new File(this.getClass.getResource("/").getPath + 
"../../../../store/sdk/")
+.getCanonicalPath
+
+def buildTestBinaryData(): Any = {
+FileUtils.deleteDirectory(new File(writerPath))
+
+val sourceImageFolder = sdkPath + "/src/test/resources/image/flowers"
+val sufAnnotation = ".txt"
+BinaryUtil.binaryToCarbon(sourceImageFolder, writerPath, 
sufAnnotation, ".jpg")
+}
+
+def cleanTestData() = {
+FileUtils.deleteDirectory(new File(writerPath))
+FileUtils.deleteDirectory(new File(outputPath))
+}
+
+override def beforeAll(): Unit = {
+CarbonProperties.getInstance()
+.addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
+buildTestBinaryData()
+
+FileUtils.deleteDirectory(new File(outputPath))
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+override def afterAll(): Unit = {
+cleanTestData()
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+test("test read image carbon with external table, generate by sdk, CTAS") {
+sql("DROP TABLE IF EXISTS binaryCarbon")
+sql("DROP TABLE IF EXISTS binaryCarbon3")
+if (SparkUtil.isSparkVersionEqualTo("2.1")) {
+sql(s"CREATE EXTERNAL TABLE binaryCarbon STORED BY 'carbondata' 
OPTIONS(PATH '$writerPath')")
+val exception = intercept[Exception] {
+sql(s"CREATE TABLE binaryCarbon3 STORED BY 'carbondata' 
OPTIONS(PATH '$outputPath')" + " AS SELECT * FROM binaryCarbon")
+}
+assert(exception.getMessage.contains("DataLoad failure: Error 
while initializing data handler"))
+} else {
+sql(s"CREATE EXTERNAL TABLE binaryCarbon STORED BY 'carbondata' 
LOCATION '$writerPath'")
+val exception = intercept[Exception] {
+sql(s"CREATE TABLE binaryCarbon3 STORED BY 'carbondata' 
LOCATION '$outputPath'" + " AS SELECT * FROM binaryCarbon")
+}
+assert(exception.getMessage.contains("DataLoad failure: Error 
while initializing data handler : Failed for table"))
+}
+checkAnswer(sql("SELECT COUNT(*) FROM binaryCarbon"),
 
 Review comment:
   Validated




[GitHub] [carbondata] kunal642 commented on issue #3167: [CARBONDATA-3334] fixed multiple segment file issue for partition

2019-04-16 Thread GitBox
kunal642 commented on issue #3167: [CARBONDATA-3334] fixed multiple segment 
file issue for partition
URL: https://github.com/apache/carbondata/pull/3167#issuecomment-483552320
 
 
   @ravipesala The build passed. Please merge.




[GitHub] [carbondata] xubo245 commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on issue #3173: [CARBONDATA-3351] Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#issuecomment-483551819
 
 
   @ajantha-bhat Can you help check the Carbon DataSource part? It's more important.




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275669941
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableWithColumnMetCacheAndCacheLevelProperty.scala
 ##
 @@ -127,6 +127,12 @@ class 
TestCreateTableWithColumnMetCacheAndCacheLevelProperty extends QueryTest w
 assert(isExpectedValueValid("default", "column_meta_cache", 
"column_meta_cache", "c1,c3"))
   }
 
+  test("validate for binary - COLUMN_META_CACHE_13") {
 
 Review comment:
   removed




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275669685
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala
 ##
 @@ -367,5 +367,31 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty extends QueryTest with Be
 sql("DROP table IF EXISTS carbonCahe")
   }
 
+  // TODO: support insert and query with filter
+  ignore("Test For Cache set but Min/Max exceeds for Binary") {
 
 Review comment:
   removed.




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275668971
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || (each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || (each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+   

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275668793
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275668438
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
 
 Review comment:
   It will not be an image.
   
   I plan to support keeping the input and output data the same through the carbon session. We can support small data in that scenario.




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275667095
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275666201
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275665160
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275661055
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275658433
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestNonTransactionalCarbonTableForBinary.scala
 ##
 @@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io._
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.test.util.BinaryUtil
+import org.apache.commons.io.FileUtils
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+
+class TestNonTransactionalCarbonTableForBinary extends QueryTest with BeforeAndAfterAll {
+
+var writerPath = new File(this.getClass.getResource("/").getPath
++ "../../target/SparkCarbonFileFormat/WriterOutput/")
+.getCanonicalPath
+var outputPath = writerPath + 2
+//getCanonicalPath gives path with \, but the code expects /.
+writerPath = writerPath.replace("\\", "/")
+
+var sdkPath = new File(this.getClass.getResource("/").getPath + "../../../../store/sdk/")
+.getCanonicalPath
+
+def buildTestBinaryData(): Any = {
+FileUtils.deleteDirectory(new File(writerPath))
+
+val sourceImageFolder = sdkPath + "/src/test/resources/image/flowers"
+val sufAnnotation = ".txt"
+BinaryUtil.binaryToCarbon(sourceImageFolder, writerPath, sufAnnotation, ".jpg")
+}
+
+def cleanTestData() = {
+FileUtils.deleteDirectory(new File(writerPath))
+FileUtils.deleteDirectory(new File(outputPath))
+}
+
+override def beforeAll(): Unit = {
+CarbonProperties.getInstance()
+.addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
+buildTestBinaryData()
+
+FileUtils.deleteDirectory(new File(outputPath))
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+override def afterAll(): Unit = {
+cleanTestData()
+sql("DROP TABLE IF EXISTS sdkOutputTable")
+}
+
+test("test read image carbon with external table, generate by sdk, CTAS") {
+sql("DROP TABLE IF EXISTS binaryCarbon")
+sql("DROP TABLE IF EXISTS binaryCarbon3")
+if (SparkUtil.isSparkVersionEqualTo("2.1")) {
+sql(s"CREATE EXTERNAL TABLE binaryCarbon STORED BY 'carbondata' 
OPTIONS(PATH '$writerPath')")
+val exception = intercept[Exception] {
+sql(s"CREATE TABLE binaryCarbon3 STORED BY 'carbondata' 
OPTIONS(PATH '$outputPath')" + " AS SELECT * FROM binaryCarbon")
+}
+assert(exception.getMessage.contains("DataLoad failure: Error 
while initializing data handler"))
+} else {
+sql(s"CREATE EXTERNAL TABLE binaryCarbon STORED BY 'carbondata' 
LOCATION '$writerPath'")
+val exception = intercept[Exception] {
+sql(s"CREATE TABLE binaryCarbon3 STORED BY 'carbondata' 
LOCATION '$outputPath'" + " AS SELECT * FROM binaryCarbon")
+}
+assert(exception.getMessage.contains("DataLoad failure: Error 
while initializing data handler : Failed for table"))
+}
+checkAnswer(sql("SELECT COUNT(*) FROM binaryCarbon"),
 
 Review comment:
   Should this also validate the result content, at least for one row?
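
   A content check could follow the pattern the binary tests in this PR
   already use: compare the leading bytes of the stored value with a known
   prefix. The following is only a minimal sketch in plain Java. The class
   and method names (BinaryPrefixCheck, prefix, startsWith) are hypothetical
   and are not part of CarbonData or of this PR.

```java
import java.util.Arrays;

// Hypothetical helper, not part of CarbonData: checks the leading bytes of a binary value.
final class BinaryPrefixCheck {

    // Returns a copy of the first n bytes, or of the whole array if it is shorter than n.
    static byte[] prefix(byte[] data, int n) {
        return Arrays.copyOf(data, Math.min(n, data.length));
    }

    // True when the stored value begins with the expected leading bytes.
    static boolean startsWith(byte[] stored, byte[] expectedPrefix) {
        if (stored.length < expectedPrefix.length) {
            return false;
        }
        return Arrays.equals(prefix(stored, expectedPrefix.length), expectedPrefix);
    }
}
```

   In a test, the first row's binary column would be passed as `stored` and
   the known leading bytes of the source image as `expectedPrefix`.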




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275656175
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || 
(each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || 
(each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+   

[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275655661
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || 
(each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || 
(each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
 
 Review comment:
   ok, merged




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275655661
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || 
(each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || 
(each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
 
 Review comment:
   ok, removed




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275655039
 
 

 ##
 File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/util/CarbonVectorizedRecordReader.java
 ##
 @@ -128,6 +128,9 @@ private boolean nextBatch() {
 if (iterator.hasNext()) {
   iterator.processNextBatch(carbonColumnarBatch);
   numBatched = carbonColumnarBatch.getActualSize();
+  if (numBatched == 0) {
 
 Review comment:
   ok, removed




[GitHub] [carbondata] CarbonDataQA commented on issue #3164: [WIP] [CARBONDATA-3331] Fix for external table in Show Metacache

2019-04-16 Thread GitBox
CarbonDataQA commented on issue #3164: [WIP] [CARBONDATA-3331] Fix for external 
table in Show Metacache
URL: https://github.com/apache/carbondata/pull/3164#issuecomment-483537900
 
 
   Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2887/
   




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275654792
 
 

 ##
 File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
 ##
 @@ -150,6 +151,10 @@ public int getNumBlocks() {
 return numBlocks;
   }
 
+  public void setFileLists(List fileLists) {
 
 Review comment:
   ok, removed




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275653962
 
 

 ##
 File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
 ##
 @@ -162,7 +162,13 @@ public CarbonTable getOrCreateCarbonTable(Configuration 
configuration) throws IO
 // do block filtering and get split
 splits = getSplits(job, filter, externalTableSegments, null, 
partitionInfo, null);
   } else {
-for (CarbonFile carbonFile : 
getAllCarbonDataFiles(carbonTable.getTablePath())) {
+List carbonFiles = null;
 
 Review comment:
   ok, done




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275654044
 
 

 ##
 File path: 
hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
 ##
 @@ -208,6 +214,18 @@ public CarbonTable getOrCreateCarbonTable(Configuration 
configuration) throws IO
 return carbonFiles;
   }
 
+  private List getAllCarbonDataFiles(List fileLists) {
+List carbonFiles = new LinkedList<>();
+try {
 
 Review comment:
   ok, done




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275652969
 
 

 ##
 File path: format/src/main/thrift/schema.thrift
 ##
 @@ -33,6 +33,7 @@ enum DataType {
TIMESTAMP = 6,
DATE = 7,
BOOLEAN = 8,
+   BINARY = 19,
 
 Review comment:
   done




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275652936
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
 ##
 @@ -2384,7 +2388,7 @@ public static void dropDatabaseDirectory(String 
databasePath)
   return b.array();
 } else if (DataTypes.isDecimal(dataType)) {
   return DataTypeUtil.bigDecimalToByte((BigDecimal) value);
-} else if (dataType == DataTypes.BYTE_ARRAY) {
+} else if (dataType == DataTypes.BYTE_ARRAY || dataType == 
DataTypes.BINARY) {
   return (byte[]) value;
 } else if (dataType == DataTypes.STRING
 
 Review comment:
   done
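
   For readers following the diff above: the change lets BINARY values take
   the same pass-through branch as BYTE_ARRAY, since both are already byte
   arrays. The following is a simplified, self-contained sketch of that kind
   of dispatch. The enum Kind and the class ValueToBytesSketch are
   hypothetical stand-ins for illustration and do not reproduce the real
   CarbonUtil code or the DataTypes class.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the CarbonData implementation.
final class ValueToBytesSketch {

    // Stand-in for a data type tag (assumption, not CarbonData's DataTypes).
    enum Kind { LONG, STRING, BYTE_ARRAY, BINARY }

    // Converts a value to bytes according to its kind.
    static byte[] toBytes(Kind kind, Object value) {
        switch (kind) {
            case LONG:
                // Fixed-width 8-byte big-endian encoding.
                return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
            case STRING:
                return ((String) value).getBytes(StandardCharsets.UTF_8);
            case BYTE_ARRAY:
            case BINARY:
                // Already raw bytes: returned as-is, without copying.
                return (byte[]) value;
            default:
                throw new IllegalArgumentException("Unsupported kind: " + kind);
        }
    }
}
```

   Sharing one branch keeps the two raw-byte types from drifting apart if
   the conversion logic changes later.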




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275652599
 
 

 ##
 File path: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
 ##
 @@ -2179,7 +2179,11 @@ static DataType thriftDataTypeToWrapperDataType(
 return DataTypes.FLOAT;
   case BYTE:
 return DataTypes.BYTE;
+  case BINARY:
+return DataTypes.BINARY;
   default:
+LOGGER.warn(String.format("It can't match the data type, so use 
default data type: %s",
 
 Review comment:
   done




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275650406
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/datatype/BinaryType.java
 ##
 @@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.metadata.datatype;
+
+public class BinaryType extends DataType {
+  static final DataType BINARY =
+  new BinaryType(DataTypes.BINARY_TYPE_ID, 13, "BINARY", -1);
 
 Review comment:
   change to 26




[jira] [Created] (CARBONDATA-3355) Add new module integrate with prestosql(presto of version 308+)

2019-04-16 Thread hjw (JIRA)
hjw created CARBONDATA-3355:
---

 Summary: Add new module integrate with prestosql(presto of version 
308+)
 Key: CARBONDATA-3355
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3355
 Project: CarbonData
  Issue Type: New Feature
  Components: presto-integration
Affects Versions: 1.5.3
Reporter: hjw


The Presto project has split off a new development team, and the code 
structure and content of the new branch, prestosql (Presto version 300+), 
have changed significantly. CarbonData's existing Presto integration 
component does not support prestosql. I have developed a new module that 
supports it. Can it be adopted?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275649865
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
 ##
 @@ -244,14 +243,15 @@ public static void updateTableInfo(TableInfo tableInfo) {
   public static CarbonTable buildTable(
   String tablePath,
   String tableName,
-  Configuration configuration) throws IOException {
-TableInfo tableInfoInfer = CarbonUtil.buildDummyTableInfo(tablePath, 
"null", "null");
-CarbonFile carbonFile = 
getLatestIndexFile(FileFactory.getCarbonFile(tablePath, configuration));
-if (carbonFile == null) {
-  throw new RuntimeException("Carbon index file not exists.");
+  Configuration configuration,
+  boolean isFile) throws IOException {
+if (isFile) {
+  int index = tablePath.lastIndexOf("/");
 
 Review comment:
   ok




[GitHub] [carbondata] xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
xubo245 commented on a change in pull request #3173: [CARBONDATA-3351] Support 
Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275648998
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/util/CarbonMetadataUtil.java
 ##
 @@ -521,6 +521,8 @@ private static int compareMeasureData(byte[] first, byte[] 
second, DataType data
 compare = -1;
   }
   return (int) compare;
+} else if (dataType == DataTypes.BINARY) {
+  return 0;
 
 Review comment:
   ok, done




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275639781
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableWithColumnMetCacheAndCacheLevelProperty.scala
 ##
 @@ -127,6 +127,12 @@ class 
TestCreateTableWithColumnMetCacheAndCacheLevelProperty extends QueryTest w
 assert(isExpectedValueValid("default", "column_meta_cache", 
"column_meta_cache", "c1,c3"))
   }
 
+  test("validate for binary - COLUMN_META_CACHE_13") {
 
 Review comment:
   As discussed above for COLUMN_META_CACHE, there is no need to support 
binary columns.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275639581
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/TestQueryWithColumnMetCacheAndCacheLevelProperty.scala
 ##
 @@ -367,5 +367,31 @@ class TestQueryWithColumnMetCacheAndCacheLevelProperty 
extends QueryTest with Be
 sql("DROP table IF EXISTS carbonCahe")
   }
 
+  // TODO: support insert and query with filter
+  ignore("Test For Cache set but Min/Max exceeds for Binary") {
 
 Review comment:
   As discussed above for COLUMN_META_CACHE, there is no need to support 
binary columns.




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] Support Binary Data Type

2019-04-16 Thread GitBox
ajantha-bhat commented on a change in pull request #3173: [CARBONDATA-3351] 
Support Binary Data Type
URL: https://github.com/apache/carbondata/pull/3173#discussion_r275639076
 
 

 ##
 File path: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/binary/TestBinaryDataType.scala
 ##
 @@ -0,0 +1,355 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.integration.spark.testsuite.binary
+
+import java.util.Arrays
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.commons.codec.binary.Hex
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+/**
+  * Test cases for testing binary
+  */
+class TestBinaryDataType extends QueryTest with BeforeAndAfterAll {
+override def beforeAll {
+}
+
+test("Create table and load data with binary column") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT * FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(5 == each.length)
+
+assert(Integer.valueOf(each(0).toString) > 0)
+assert(each(1).toString.equalsIgnoreCase("false") || 
(each(1).toString.equalsIgnoreCase("true")))
+assert(each(2).toString.contains(".png"))
+
+val bytes20 = each.getAs[Array[Byte]](3).slice(0, 20)
+val binaryName = each(2).toString
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+
+assert(each(4).toString.equalsIgnoreCase("false") || 
(each(4).toString.equalsIgnoreCase("true")))
+}
+} catch {
+case e: Exception =>
+e.printStackTrace()
+assert(false)
+}
+}
+
+test("Support projection for binary") {
+sql("DROP TABLE IF EXISTS binaryTable")
+sql(
+s"""
+   | CREATE TABLE IF NOT EXISTS binaryTable (
+   |id int,
+   |label boolean,
+   |name string,
+   |image binary,
+   |autoLabel boolean)
+   | STORED BY 'carbondata'
+ """.stripMargin)
+sql(
+s"""
+   | LOAD DATA LOCAL INPATH '$resourcesPath/binarydata.csv'
+   | INTO TABLE binaryTable
+   | OPTIONS('header'='false')
+ """.stripMargin)
+checkAnswer(sql("SELECT COUNT(*) FROM binaryTable"), Seq(Row(3)))
+try {
+val df = sql("SELECT name,image FROM binaryTable").collect()
+assert(3 == df.length)
+df.foreach { each =>
+assert(2 == each.length)
+val binaryName = each(0).toString
+val bytes20 = each.getAs[Array[Byte]](1).slice(0, 20)
+val expectedBytes = firstBytes20.get(binaryName).get
+assert(Arrays.equals(expectedBytes, bytes20), "incorrect 
numeric value for flattened image")
+}
+} catch {
+case e: Exception =>
+