[jira] [Resolved] (CARBONDATA-965) dataload fail message is not correct when there is no good data to load

2017-04-21 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-965.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> dataload fail message is not correct when there is no good data to load
> ---
>
> Key: CARBONDATA-965
> URL: https://issues.apache.org/jira/browse/CARBONDATA-965
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-967) select * with order by and limit for join not working

2017-04-21 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-967:
---
Fix Version/s: 1.1.0

> select * with order by and limit for join not working
> -
>
> Key: CARBONDATA-967
> URL: https://issues.apache.org/jira/browse/CARBONDATA-967
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Reporter: joobisb
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: carbon1.csv, carbon2.csv
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> CREATE TABLE carbon1 (imei string,age int,task bigint,sale 
> decimal(10,3),productdate timestamp,score double)STORED BY 
> 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/data/carbon1.csv'  INTO TABLE carbon1 
> options ('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='false','BAD_RECORDS_ACTION'='FORCE',
>   'FILEHEADER'= ''); 
> CREATE TABLE carbon2 (imei string,age int,task bigint,sale 
> decimal(10,3),productdate timestamp,score double)STORED BY 
> 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/data/carbon2.csv'  INTO TABLE carbon2 
> options ('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='false','BAD_RECORDS_ACTION'='FORCE',
>   'FILEHEADER'= ''); 
> CREATE TABLE carbon3 (imei string,age int,task bigint,sale 
> decimal(10,3),productdate timestamp,score double)STORED BY 
> 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/data/carbon1.csv'  INTO TABLE carbon3 
> options ('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='false','BAD_RECORDS_ACTION'='FORCE',
>   'FILEHEADER'= '');
> select * from carbon1 a full outer join carbon2 b on 
> substr(a.productdate,1,10)=substr(b.productdate,1,10) order by a.imei limit 
> 100;
> It throws the exception below:
> ERROR TaskSetManager: Task 0 in stage 12.0 failed 1 times; aborting job
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 0 in stage 12.0 failed 1 times, most recent failure: 
> Lost task 0.0 in stage 12.0 (TID 211, localhost, executor driver): 
> java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot 
> be cast to java.lang.Integer
>   at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
>   at 
> org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$3$$anon$1$$anonfun$next$1.apply$mcVI$sp(CarbonDictionaryDecoder.scala:112)
>   at 
> org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$3$$anon$1$$anonfun$next$1.apply(CarbonDictionaryDecoder.scala:109)
>   at 
> org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$3$$anon$1$$anonfun$next$1.apply(CarbonDictionaryDecoder.scala:109)
>   at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at 
> org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$3$$anon$1.next(CarbonDictionaryDecoder.scala:109)
>   at 
> org.apache.spark.sql.CarbonDictionaryDecoder$$anonfun$doExecute$1$$anonfun$3$$anon$1.next(CarbonDictionaryDecoder.scala:99)
>   at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:232)
>   at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
>   at
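
The failure mode can be reproduced in isolation (a hedged, generic illustration; the actual decode path is in CarbonDictionaryDecoder):

{code}
public class UnboxExample {
  public static void main(String[] args) {
    // A value that is actually a string gets unboxed as if it were an int;
    // this is the same ClassCastException pattern as in the trace above.
    Object v = "8999";
    int i = (Integer) v; // throws java.lang.ClassCastException at runtime
    System.out.println(i);
  }
}
{code}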



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-960) Unsafe merge sort is not working properly

2017-04-20 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-960:
--

 Summary: Unsafe merge sort is not working properly
 Key: CARBONDATA-960
 URL: https://issues.apache.org/jira/browse/CARBONDATA-960
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Unsafe merge sort is not working properly: the sorted data is wrong. When we 
tried to load 3.5 billion rows, we found that the data was not sorted properly 
by the unsafe sort.
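
A hedged sketch of the kind of order check that exposes such a bug (illustrative only, not CarbonData code):

{code}
import java.util.Comparator;
import java.util.Iterator;

class SortCheck {
  // Returns true only if the rows arrive in non-decreasing order.
  static <T> boolean isSorted(Iterator<T> rows, Comparator<T> cmp) {
    T prev = null;
    while (rows.hasNext()) {
      T cur = rows.next();
      if (prev != null && cmp.compare(prev, cur) > 0) {
        return false; // out-of-order row: the sort produced wrong output
      }
      prev = cur;
    }
    return true;
  }
}
{code}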



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1257) Measure Filter Block Pruning and Filter Evaluation Support

2017-07-31 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1257.
-
   Resolution: Fixed
 Assignee: sounak chakraborty
Fix Version/s: 1.1.1
   1.2.0

> Measure Filter Block Pruning and Filter Evaluation Support
> ---
>
> Key: CARBONDATA-1257
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1257
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: sounak chakraborty
>Assignee: sounak chakraborty
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Measure columns currently store Min and Max values at the blocklet level. This 
> helps with Block Pruning in case of a Measure Filter Query. The current 
> requirement is to: 
> a) Enable Block Pruning in case of Measure Filters. 
> b) Handle Measure Data Evaluation in the major Filter Evaluators in Carbon, such 
> as Include, Exclude, Greater Than, Less Than, etc. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1205) Use Spark 2.1 as default from 1.2.0 onwards

2017-08-03 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1205.
-
   Resolution: Fixed
 Assignee: Liang Chen
Fix Version/s: 1.2.0

> Use Spark 2.1 as default from 1.2.0 onwards
> ---
>
> Key: CARBONDATA-1205
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1205
> Project: CarbonData
>  Issue Type: Task
>  Components: build
>Reporter: Liang Chen
>Assignee: Liang Chen
> Fix For: 1.2.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> From 1.2.0 onwards, many features are being developed based on Spark 2.1; this 
> task is to use Spark 2.1 as the default compilation target in the parent pom.
> Discussion thread: 
> https://lists.apache.org/thread.html/5b186b5868de16280ced1b623fae8b5c54933def44398f6a4310ffb3@%3Cdev.carbondata.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1357) byte[] convert to UTF8String bug

2017-08-03 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1357.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

> byte[] convert to UTF8String bug
> 
>
> Key: CARBONDATA-1357
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1357
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Reporter: Cao, Lionel
>Assignee: Cao, Lionel
> Fix For: 1.2.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
>   public Object convertFromByteToUTF8String(Object data) {
> return data.toString();
>   }
> toString() will give an incorrect result like "[B..."; new String should be used instead.
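
A minimal sketch of the intended fix (assuming data holds a byte[]; the cast and charset below are illustrative, not necessarily the exact committed change):

{code}
public Object convertFromByteToUTF8String(Object data) {
  // toString() on a byte[] prints the object reference ("[B@..."), not the
  // content, so decode the bytes explicitly instead.
  return new String((byte[]) data, java.nio.charset.StandardCharsets.UTF_8);
}
{code}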



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1356) Insert overwrite should delete files immediately

2017-08-04 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1356.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

> Insert overwrite should delete files immediately
> 
>
> Key: CARBONDATA-1356
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1356
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
> Fix For: 1.2.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> When the user issues an insert overwrite command, it should delete the files 
> immediately to avoid stale folders.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1379) Date range filter with cast not working

2017-08-14 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1379:
---

 Summary: Date range filter with cast not working
 Key: CARBONDATA-1379
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1379
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


A date range filter with a cast is not working. A query like the one below fails:

{code}
select doj from directDictionaryTable where doj > cast('2016-03-14' as date)
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1380) Tablestatus file is not updated in case of load failure. Insert Overwrite does not work properly

2017-08-14 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1380:
---

 Summary: Tablestatus file is not updated in case of load failure. 
Insert Overwrite does not work properly
 Key: CARBONDATA-1380
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1380
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


The tablestatus file is not updated in case of load failure, 
so Insert Overwrite does not work properly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1284) Use hive metastore to store carbon schema

2017-07-10 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1284:
---

 Summary: Use hive metastore to store carbon schema
 Key: CARBONDATA-1284
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1284
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


We should provide an option for the user to store the carbon schema inside the 
hive metastore, so that carbon does not need to synchronize the schema file 
every time.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1306) Carbondata select query crashes when using big data with more than million rows

2017-07-13 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1306:
---

 Summary: Carbondata select query crashes when using big data with 
more than million rows
 Key: CARBONDATA-1306
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1306
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Carbondata crashes when executing the following CompareTest queries 
sequentially.

{code}
select sum(m1)  from t1
select sum(m1), sum(m2) from t1
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1311) Add carbon storelocation to spark warehouse. And extract storelocation out of metastore

2017-07-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1311:
---

 Summary: Add carbon storelocation to spark warehouse. And extract 
storelocation out of metastore
 Key: CARBONDATA-1311
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1311
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Change the default store location to the spark warehouse, so if the user does not 
provide a store location then the spark warehouse location is chosen as the store 
location.
Change the file metastore to avoid reading all schema files at once and keeping 
them in memory; instead, implement cache-based storage that reads a schema when a 
request comes (see the sketch below).
Extract the store location out of the metastore and refactor the carbon metastore.
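
A hedged sketch of the cache-based schema lookup described above (all names here are illustrative, not the real CarbonData API):

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SchemaCache {
  // Placeholder for the real schema representation.
  static class TableSchema { }

  private final Map<String, TableSchema> cache = new ConcurrentHashMap<>();

  // Read a schema only when it is first requested, instead of loading every
  // schema file into memory up front.
  TableSchema lookUp(String tableName) {
    return cache.computeIfAbsent(tableName, t -> readSchemaFile(t));
  }

  private TableSchema readSchemaFile(String tableName) {
    // The real implementation would parse the schema file from the store
    // location; stubbed here.
    return new TableSchema();
  }
}
{code}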



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1321) Add Insert overwrite support and force clean up files in carbon

2017-07-20 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1321:
---

 Summary: Add Insert overwrite support and force clean up files in 
carbon
 Key: CARBONDATA-1321
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1321
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


Support for insert overwrite, force clean up of files, and clean up of 
in-progress files is added.
Carbon should support syntax and features like:

{code}
LOAD DATA INPATH 'data.csv' overwrite INTO table carbontable
insert overwrite table carbontable select * from othertable
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1322) Add Insert overwrite support and force clean up files in carbon

2017-07-20 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1322:
---

 Summary: Add Insert overwrite support and force clean up files in 
carbon
 Key: CARBONDATA-1322
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1322
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


Support for insert overwrite, force clean up of files, and clean up of 
in-progress files is added.
Carbon should support syntax and features like:

{code}
LOAD DATA INPATH 'data.csv' overwrite INTO table carbontable
insert overwrite table carbontable select * from othertable
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-981) Configuration can't find HIVE_CONNECTION_URL in yarn-client mode

2017-04-25 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-981.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Configuration can't find HIVE_CONNECTION_URL in yarn-client mode
> 
>
> Key: CARBONDATA-981
> URL: https://issues.apache.org/jira/browse/CARBONDATA-981
> Project: CarbonData
>  Issue Type: Bug
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 1.1.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-1346) Develop framework for SDV tests to run in cluster. And add all existing SDV tests to it

2017-07-30 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1346:
---

 Summary: Develop framework for SDV tests to run in cluster. And 
add all existing SDV tests to it
 Key: CARBONDATA-1346
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1346
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


Develop a framework to run SDV tests in a cluster, and add all existing SDV 
tests to it.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1353) SDV cluster tests are failing for measure filter feature

2017-08-01 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1353:
---

 Summary: SDV cluster tests are failing for measure filter feature
 Key: CARBONDATA-1353
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1353
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


SDV cluster tests are failing for measure filter feature.

http://144.76.159.231:8080/job/ApacheSDVTests/32/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1361) Reduce the sdv cluster test time

2017-08-04 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1361:
---

 Summary: Reduce the sdv cluster test time
 Key: CARBONDATA-1361
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1361
 Project: CarbonData
  Issue Type: Test
Reporter: Ravindra Pesala
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-844) Avoid to get useless splits

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-844:
---
Fix Version/s: (was: 1.1.0)
   1.1.1

> Avoid to get useless splits
> ---
>
> Key: CARBONDATA-844
> URL: https://issues.apache.org/jira/browse/CARBONDATA-844
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yadong Qi
>Assignee: Yadong Qi
> Fix For: 1.1.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In the current implementation of CarbonInputFormat.getDataBlocksOfSegment, 
> 1. Get all of the carbondata splits in the segments directory.
> 2. Read the carbonindex and construct the B-tree.
> 3. Apply the filter and get the matching splits.
> I think we get some useless splits and the getSplits operation is 
> expensive, so we had better do getSplits after the filter:
> 1. List the segment directory, and filter for the carbonindex paths.
> 2. Read the carbonindex and construct the B-tree.
> 3. Apply the filter and get the matching blocks.
> 4. Get the carbondata splits from the filtered blocks.
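
A hedged sketch of the proposed order, with the CarbonData-specific types abstracted away (the real CarbonInputFormat API differs):

{code}
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class SplitPruningSketch<IndexFile, Block, Split> {
  // 1. start from the carbonindex files, 2. read blocks from each index,
  // 3. prune blocks with the query filter, 4. create splits only for matches.
  List<Split> getSplitsAfterFilter(List<IndexFile> indexFiles,
                                   Function<IndexFile, List<Block>> readIndex,
                                   Predicate<Block> filter,
                                   Function<Block, Split> toSplit) {
    return indexFiles.stream()
        .flatMap(f -> readIndex.apply(f).stream())
        .filter(filter)
        .map(toSplit)
        .collect(Collectors.toList());
  }
}
{code}

This way no split is materialized for a block the filter would discard, which is the point of the proposal.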



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1045) Mismatch in message display with insert and load operation on failure due to bad records in update operation

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1045.
-
Resolution: Fixed

> Mismatch in message display with insert and load operation on failure due to 
> bad records in update operation
> 
>
> Key: CARBONDATA-1045
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1045
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Minor
> Fix For: 1.1.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When the bad records action is set to FAIL and an IUD operation is executed and 
> fails due to bad records, the error message is not displayed correctly, because 
> of which the user is not clear about the cause of the update operation failure. 
> Whereas in the same case, in other operations like data load and insert into, 
> if there is any failure due to a bad record, a proper error message is displayed 
> to the user.
> Steps to reproduce
> ---
> create table update_with_bad_record(item int, name String) stored by 
> 'carbondata'
> LOAD DATA LOCAL INPATH '' into table update_with_bad_record
> update update_with_bad_record set (item)=(3.45)
> dummy data
> ---
> item,name
> 2,Apple



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-905) Unable to execute method public: org.apache.hadoop.hive.ql.metadata.HiveException

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-905:
---
Fix Version/s: (was: 1.1.0)
   1.1.1

> Unable to execute method public: 
> org.apache.hadoop.hive.ql.metadata.HiveException
> -
>
> Key: CARBONDATA-905
> URL: https://issues.apache.org/jira/browse/CARBONDATA-905
> Project: CarbonData
>  Issue Type: Bug
> Environment: Spark1.6
>Reporter: SWATI RAO
> Fix For: 1.1.1
>
> Attachments: Test_Data1.csv, Test_Data1_h1.csv
>
>
> When we execute the same query in hive it works fine, but when we execute it in 
> carbondata an "org.apache.hadoop.hive.ql.metadata.HiveException" occurs. 
> HIVE:
> 0: jdbc:hive2://hadoop-master:1> create table Test_Boundary_h1 (c1_int 
> int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',' ;
> +---------+--+
> | result  |
> +---------+--+
> +---------+--+
> No rows selected (1.177 seconds)
> 0: jdbc:hive2://hadoop-master:1> load data local inpath 
> '/opt/Carbon/CarbonData/TestData/Data/Test_Data1_h1.csv' OVERWRITE INTO TABLE 
> Test_Boundary_h1 ;
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (0.437 seconds)
> 0: jdbc:hive2://hadoop-master:1> select 
> min(c1_int),max(c1_int),sum(c1_int),avg(c1_int) , count(c1_int), 
> variance(c1_int) from Test_Boundary_h1 where rand(c1_int)=0.6201007799387834 
> or rand(c1_int)=0.45540022789662593 ;
> +-------+-------+-------+-------+------+-------+--+
> |  _c0  |  _c1  |  _c2  |  _c3  | _c4  |  _c5  |
> +-------+-------+-------+-------+------+-------+--+
> | NULL  | NULL  | NULL  | NULL  | 0    | NULL  |
> +-------+-------+-------+-------+------+-------+--+
> 1 row selected (0.996 seconds)
> CARBONDATA:
> 0: jdbc:hive2://hadoop-master:1> create table Test_Boundary (c1_int 
> int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 
> 'org.apache.carbondata.format' ;
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (4.48 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
> 'hdfs://192.168.2.145:54310/BabuStore/Data/Test_Data1.csv' INTO table 
> Test_Boundary 
> OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')
>  ;
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (4.445 seconds)
> 0: jdbc:hive2://hadoop-master:1> select 
> min(c1_int),max(c1_int),sum(c1_int),avg(c1_int) , count(c1_int), 
> variance(c1_int) from Test_Boundary where rand(c1_int)=0.6201007799387834 or 
> rand(c1_int)=0.45540022789662593 ;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 19.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 19.0 (TID 826, hadoop-master): 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method 
> public org.apache.hadoop.hive.serde2.io.DoubleWritable 
> org.apache.hadoop.hive.ql.udf.UDFRand.evaluate(org.apache.hadoop.io.LongWritable)
>   on object org.apache.hadoop.hive.ql.udf.UDFRand@3152da1e of class 
> org.apache.hadoop.hive.ql.udf.UDFRand with arguments {null} of size 1
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:981)
>   at org.apache.spark.sql.hive.HiveSimpleUDF.eval(hiveUDFs.scala:185)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:68)
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:68)
>   at 
> org.apache.spark.sql.execution.Filter$$anonfun$2$$anonfun$apply$2.apply(basicOperators.scala:74)
>   at 
> org.apache.spark.sql.execution.Filter$$anonfun$2$$anonfun$apply$2.apply(basicOperators.scala:72)
>   at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:504)
>   at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686)
>   at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
>   at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
>   

[jira] [Reopened] (CARBONDATA-780) Alter table support for compaction through sort step

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reopened CARBONDATA-780:


> Alter table support for compaction through sort step
> 
>
> Key: CARBONDATA-780
> URL: https://issues.apache.org/jira/browse/CARBONDATA-780
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Assignee: Manish Gupta
> Fix For: 1.1.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Alter table needs to support a compaction process where the complete data is 
> sorted again and then written to file.
> Currently, in the compaction process, data is given directly to the writer step, 
> where it is split into columns and written. But as columns are sorted from left 
> to right, on dropping a column the data will again become unorganized, since the 
> dropped column's data will not be considered during compaction. In these 
> scenarios the complete data needs to be sorted again and then submitted to the 
> writer step.
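
A hedged sketch of the two flows (purely illustrative; not the actual compaction code):

{code}
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

class CompactionSketch {
  // Current flow: already-sorted merged rows go straight to the writer step.
  static void compactDirect(List<Object[]> mergedRows,
                            Consumer<Object[]> writerStep) {
    mergedRows.forEach(writerStep);
  }

  // Proposed flow after a column drop: re-sort the complete data first,
  // because removing a sort column breaks the left-to-right sort order.
  static void compactThroughSort(List<Object[]> rows,
                                 Comparator<Object[]> sortOrder,
                                 Consumer<Object[]> writerStep) {
    rows.sort(sortOrder);
    rows.forEach(writerStep);
  }
}
{code}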



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-780) Alter table support for compaction through sort step

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-780.

Resolution: Fixed

> Alter table support for compaction through sort step
> 
>
> Key: CARBONDATA-780
> URL: https://issues.apache.org/jira/browse/CARBONDATA-780
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Assignee: Manish Gupta
> Fix For: 1.1.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Alter table needs to support a compaction process where the complete data is 
> sorted again and then written to file.
> Currently, in the compaction process, data is given directly to the writer step, 
> where it is split into columns and written. But as columns are sorted from left 
> to right, on dropping a column the data will again become unorganized, since the 
> dropped column's data will not be considered during compaction. In these 
> scenarios the complete data needs to be sorted again and then submitted to the 
> writer step.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-904) ArrayIndexOutOfBoundsException

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-904:
---
Fix Version/s: (was: 1.1.0)
   1.1.1

> ArrayIndexOutOfBoundsException 
> ---
>
> Key: CARBONDATA-904
> URL: https://issues.apache.org/jira/browse/CARBONDATA-904
> Project: CarbonData
>  Issue Type: Bug
> Environment: Spark1.6
>Reporter: SWATI RAO
>Assignee: Rahul Kumar
> Fix For: 1.1.1
>
> Attachments: Test_Data1_h1.csv, Test_Data1_h1.csv
>
>
> The OR operator is not working properly.
> When we execute this query in hive it works fine, but when we execute 
> the same query in carbondata it throws an exception:
> java.lang.ArrayIndexOutOfBoundsException
> HIVE:
> 0: jdbc:hive2://hadoop-master:1> create table Test_Boundary_h1 (c1_int 
> int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',' ;
> +---------+--+
> | result  |
> +---------+--+
> +---------+--+
> No rows selected (1.177 seconds)
> 0: jdbc:hive2://hadoop-master:1> load data local inpath 
> '/opt/Carbon/CarbonData/TestData/Data/Test_Data1_h1.csv' OVERWRITE INTO TABLE 
> Test_Boundary_h1 ;
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (0.437 seconds)
> 0: jdbc:hive2://hadoop-master:1> select c6_Timestamp,max(c6_Timestamp) 
> from Test_Boundary_h1 where c6_Timestamp ='2017-07-01 12:07:28' or 
> c6_Timestamp ='2019-07-05 13:07:30' or c6_Timestamp = '1999-01-06 10:05:29' 
> group by c6_Timestamp ;
> +------------------------+------------------------+--+
> |      c6_Timestamp      |          _c1           |
> +------------------------+------------------------+--+
> | 2017-07-01 12:07:28.0  | 2017-07-01 12:07:28.0  |
> +------------------------+------------------------+--+
> 1 row selected (1.637 seconds)
> CARBONDATA:
> 0: jdbc:hive2://hadoop-master:1> create table Test_Boundary (c1_int 
> int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 
> 'org.apache.carbondata.format' ;
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (4.48 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
> 'hdfs://192.168.2.145:54310/BabuStore/Data/Test_Data1.csv' INTO table 
> Test_Boundary 
> OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='')
>  ;
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (4.445 seconds)
> 0: jdbc:hive2://hadoop-master:1> select c6_Timestamp,max(c6_Timestamp) 
> from Test_Boundary where c6_Timestamp ='2017-07-01 12:07:28' or c6_Timestamp 
> =' 2019-07-05 13:07:30' or c6_Timestamp = '1999-01-06 10:05:29' group by 
> c6_Timestamp ;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 5.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 5.0 (TID 8, hadoop-master): java.lang.RuntimeException: 
> java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 0
>   at 
> org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.updateScanner(AbstractDataBlockIterator.java:136)
>   at 
> org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:50)
>   at 
> org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
>   at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:50)
>   at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
>   at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
>   at 
> org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.<init>(ChunkRowIterator.java:41)
>   at 
> org.apache.carbondata.hadoop.CarbonRecordReader.initialize(CarbonRecordReader.java:79)
>   at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:204)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at 

[jira] [Closed] (CARBONDATA-780) Alter table support for compaction through sort step

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala closed CARBONDATA-780.
--

> Alter table support for compaction through sort step
> 
>
> Key: CARBONDATA-780
> URL: https://issues.apache.org/jira/browse/CARBONDATA-780
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Assignee: Manish Gupta
> Fix For: 1.1.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Alter table needs to support a compaction process where the complete data is 
> sorted again and then written to file.
> Currently, in the compaction process, data is given directly to the writer step, 
> where it is split into columns and written. But as columns are sorted from left 
> to right, on dropping a column the data will again become unorganized, since the 
> dropped column's data will not be considered during compaction. In these 
> scenarios the complete data needs to be sorted again and then submitted to the 
> writer step.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-1047) Add load options to perform batch sort and add more testcases

2017-05-11 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1047:
---

 Summary: Add load options to perform batch sort and add more 
testcases
 Key: CARBONDATA-1047
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1047
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


Add load options to perform batch sort and add more testcases.

Add options like below to the load command for batch sort.

{code}
LOAD DATA LOCAL INPATH '$filePath' into table carbon_load1 
OPTIONS('batch_sort'='true', 'batch_sort_size_inmb'='1')
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1033) using column with array type bucket table is created but exception thrown when select performed

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1033.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

> using column with array type bucket table is created but exception 
> thrown when select performed
> ---
>
> Key: CARBONDATA-1033
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1033
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0
> Environment: 3 node cluster 
>Reporter: Chetan Bhat
>Assignee: Kunal Kapoor
> Fix For: 1.1.0
>
>   Original Estimate: 504h
>  Time Spent: 1h 10m
>  Remaining Estimate: 502h 50m
>
> User tries to create a bucket table with an array column.
> Table creation is successful, as shown below.
> 0: jdbc:hive2://172.168.100.199:23040> CREATE TABLE uniqData_t4(ID Int, date 
> Timestamp, country String,name String, phonetype String, serialname String, 
> salary Int,mobile array)USING org.apache.spark.sql.CarbonSource 
> OPTIONS("bucketnumber"="1", "bucketcolumns"="name","tableName"="uniqData_t4");
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.061 seconds)
> User executes a select query on the bucket table with a column of array 
> type.
> Actual Issue:
> When the user performs a select query on a bucket table with a column of array 
> type, an UncheckedExecutionException is thrown.
> 0: jdbc:hive2://172.168.100.199:23040> select count(*) from uniqData_t4;
> Error: org.spark_project.guava.util.concurrent.UncheckedExecutionException: 
> java.lang.Exception: Do not have default and uniqdata_t4 (state=,code=0)
> 0: jdbc:hive2://172.168.100.199:23040> select * from uniqData_t4;
> Error: org.spark_project.guava.util.concurrent.UncheckedExecutionException: 
> java.lang.Exception: Do not have default and uniqdata_t4 (state=,code=0)
> Expected: The bucket table with an array type column should not be 
> created. If it is created, then the select query should return the correct 
> result set without throwing an exception.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-704) data mismatch between hive and carbondata after loading for bigint values

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-704.

   Resolution: Won't Fix
 Assignee: (was: anubhav tarar)
Fix Version/s: 1.1.0

> data mismatch between hive and carbondata after loading for bigint values
> -
>
> Key: CARBONDATA-704
> URL: https://issues.apache.org/jira/browse/CARBONDATA-704
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: SWATI RAO
> Fix For: 1.1.0
>
> Attachments: Test_Data1 (4).csv
>
>
> carbondata
> 0: jdbc:hive2://localhost:1> create table Test_Boundary (c1_int 
> int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 
> 'org.apache.carbondata.format' ;
> 0: jdbc:hive2://localhost:1>  LOAD DATA INPATH 
> 'hdfs://localhost:54310/Test_Data1.csv' INTO table Test_Boundary OPTIONS  
>   
> ('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='');
> 0: jdbc:hive2://localhost:1> select c2_Bigint from Test_Boundary;
> +--+--+
> |  c2_Bigint   |
> +--+--+
> | NULL |
> | NULL |
> | NULL |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> +--+--+
> but in hive
> create table Test_Boundary_hive (c1_int int,c2_Bigint Bigint,c3_Decimal 
> Decimal(38,30),c4_double double,c5_string string,c6_Timestamp 
> Timestamp,c7_Datatype_Desc string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY 
> ",";
> LOAD DATA LOCAL INPATH 'Test_Data1.csv' into table Test_Boundary_hive;
> select c2_Bigint from Test_Boundary_hive;
> +---+--+
> |   c2_Bigint   |
> +---+--+
> | 1234  |
> | 2345  |
> | 3456  |
> | 4567  |
> | 9223372036854775807   |
> | -9223372036854775808  |
> | -9223372036854775807  |
> | -9223372036854775806  |
> | -9223372036854775805  |
> | 0 |
> | 9223372036854775807   |
> | 9223372036854775807   |
> | 9223372036854775807   |
> | NULL  |
> | NULL  |
> | NULL  |
> +---+--+



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-774) Not like operator does not work properly in carbondata

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-774.

   Resolution: Fixed
 Assignee: (was: Rahul Kumar)
Fix Version/s: 1.1.0

> Not like operator does not work properly in carbondata
> --
>
> Key: CARBONDATA-774
> URL: https://issues.apache.org/jira/browse/CARBONDATA-774
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.0.0-incubating
> Environment: Spark 2.1
>Reporter: Vinod Rohilla
>Priority: Trivial
> Fix For: 1.1.0
>
> Attachments: CSV.tar.gz
>
>
> Not Like operator result does not display same as hive.
> Steps to reproduces:
> A): Create table in Hive
> CREATE TABLE uniqdata_h (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> 2:Load Data in hive
> a)load data local inpath '/opt/TestData/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata_h
> b)load data local inpath '/opt/TestData/Data/uniqdata/4000_UniqData.csv' into 
> table uniqdata_h
> c)load data local inpath '/opt/TestData/Data/uniqdata/6000_UniqData.csv' into 
> table uniqdata_h
> d)load data local inpath '/opt/TestData/Data/uniqdata/7000_UniqData.csv' into 
> table uniqdata_h
> e)load data local inpath '/opt/TestData/Data/uniqdata/3000_1_UniqData.csv' 
> into table uniqdata_h
> 3: Run the Query:
> select CUST_ID from uniqdata_h where CUST_ID NOT LIKE 100079
> 4:Result in Hive
> +--+--+
> | CUST_ID  |
> +--+--+
> | 8999 |
> | 9000 |
> | 9001 |
> | 9002 |
> | 9003 |
> | 9004 |
> | 9005 |
> | 9006 |
> | 9007 |
> | 9008 |
> | 9009 |
> | 9010 |
> | 9011 |
> | 9012 |
> | 9013 |
> | 9014 |
> | 9015 |
> | 9016 |
> | 9017 |
> | 9018 |
> | 9019 |
> | 9020 |
> | 9021 |
> | 9022 |
> | 9023 |
> | 9024 |
> | 9025 |
> | 9026 |
> | 9027 |
> | 9028 |
> | 9029 |
> | 9030 |
> | 9031 |
> | 9032 |
> | 9033 |
> | 9034 |
> | 9035 |
> | 9036 |
> | 9037 |
> | 9038 |
> | 9039 |
> | 9040 |
> | 9041 |
> | 9042 |
> | 9043 |
> | 9044 |
> | 9045 |
> | 9046 |
> | 9047 |
> | 9048 |
> | 9049 |
> | 9050 |
> | 9051 |
> | 9052 |
> | 9053 |
> | 9054 |
> | 9055 |
> | 9056 |
> | 9057 |
> | 9058 |
> | 9059 |
> | 9060 |
> | 9061 |
> | 9062 |
> | 9063 |
> | 9064 |
> | 9065 |
> | 9066 |
> | 9067 |
> | 9068 |
> | 9069 |
> | 9070 |
> | 9071 |
> | 9072 |
> | 9073 |
> | 9074 |
> | 9075 |
> | 9076 |
> | 9077 |
> | 9078 |
> | 9079 |
> | 9080 |
> | 9081 |
> | 9082 |
> | 9083 |
> | 9084 |
> | 9085 |
> | 9086 |
> | 9087 |
> | 9088 |
> | 9089 |
> | 9090 |
> | 9091 |
> | 9092 |
> | 9093 |
> | 9094 |
> | 9095 |
> | 9096 |
> | 9097 |
> | 9098 |
> +--+--+
> | CUST_ID  |
> +--+--+
> | 9099 |
> | 9100 |
> | 9101 |
> | 9102 |
> | 9103 |
> | 9104 |
> | 9105 |
> | 9106 |
> | 9107 |
> | 9108 |
> | 9109 |
> | 9110 |
> | 9111 |
> | 9112 |
> | 9113 |
> | 9114 |
> | 9115 |
> | 9116 |
> | 9117 |
> | 9118 |
> | 9119 |
> | 9120 |
> | 9121 |
> | 9122 |
> | 9123 |
> | 9124 |
> | 9125 |
> | 9126 |
> | 9127 |
> | 9128 |
> | 9129 |
> | 9130 |
> | 9131 |
> | 9132 |
> | 9133 |
> | 9134 |
> | 9135 |
> | 9136 |
> | 9137 |
> | 9138 |
> | 9139 |
> | 9140 |
> | 9141 |
> | 9142 |
> | 9143 |
> | 9144 |
> | 9145 |
> | 9146 |
> | 9147 |
> | 9148 |
> | 9149 |
> | 9150 |
> | 9151 |
> | 9152 |
> | 9153 |
> | 9154 |
> | 9155 |
> | 9156 |
> | 9157 |
> | 9158 |
> | 9159 |
> | 9160 |
> | 9161 |
> | 9162 |
> | 9163 |
> | 9164 |
> | 9165 |
> | 9166 |
> | 9167 |
> | 9168 |
> | 9169 |
> | 9170 |
> | 9171 |
> | 9172 |
> | 9173 |
> | 9174 |
> | 9175 |
> | 9176 |
> | 9177 |
> | 9178 |
> | 9179 |
> | 9180 |
> | 9181 |
> | 9182 |
> | 9183 |
> | 9184 |
> | 9185 |
> | 9186 |
> | 9187 |
> | 9188 |
> | 9189 |
> | 9190 |
> | 9191 |
> | 9192 |
> | 9193 |
> | 9194 |
> | 9195 |
> | 9196 |
> | 9197 

[jira] [Resolved] (CARBONDATA-711) Inconsistent data load when single_pass='true'

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-711.

   Resolution: Fixed
Fix Version/s: 1.1.0

It works in the current master; it is fixed.

> Inconsistent data load when single_pass='true'
> --
>
> Key: CARBONDATA-711
> URL: https://issues.apache.org/jira/browse/CARBONDATA-711
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.1.0
> Environment: Spark 1.6
>Reporter: SWATI RAO
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: 2000_UniqData.csv
>
>
> When we perform a data load with Single_pass='true', it repeats some of the 
> values in the table whereas the csv contains an empty value for that column. PFA 
> the csv which is used for data loading. Below are the create, load, and 
> select queries.
> CREATE TABLE uniq_shared_dictionary (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,Double_COLUMN2,DECIMAL_COLUMN2')
>  LOAD DATA INPATH 
> 'hdfs://192.168.2.145:54310/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniq_shared_dictionary OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_Pass'='true')
>  ;
> Output: 
> 0: jdbc:hive2://hadoop-master:1> select CUST_ID from 
> uniq_shared_dictionary ;
> +--+--+
> | Cust_Id  |
> +--+--+
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 8999 |
> | 9000 |
> | 9001 |
> | 9002 |
> | 9003 |
> | 9004 |
> | 9005 |
> | 9006 |
> | 9007 |
> | 9008 |
> | 9009 |
> | 9010 |
> | 9011 |
> | 9012 |
> | 9013 |
> | 9014 |
> | 9015 |
> | 9016 |
> | 9017 |
> | 9018 |
> | 9019 |
> | 9020 |
> | 9021 |
> | 9022 |
> | 9023 |
> | 9024 |
> | 9025 |
> | 9026 |
> | 9027 |
> | 9028 |
> | 9029 |
> | 9030 |
> | 9031 |
> | 9032 |
> | 9033 |
> | 9034 |
> | 9035 |
> | 9036 |
> | 9037 |
> | 9038 |
> | 9039 |
> | 9040 |
> | 9041 |
> | 9042 |
> | 9043 |
> | 9044 |
> | 9045 |
> | 9046 |
> | 9047 |
> | 9048 |
> | 9049 |
> | 9050 |
> | 9051 |
> | 9052 |
> | 9053 |
> | 9054 |
> | 9055 |
> | 9056 |
> | 9057 |
> | 9058 |
> | 9059 |
> | 9060 |
> | 9061 |
> | 9062 |
> | 9063 |
> | 9064 |
> | 9065 |
> | 9066 |
> | 9067 |
> | 9068 |
> | 9069 |
> | 9070 |
> | 9071 |
> | 9072 |
> | 9073 |
> | 9074 |
> | 9075 |
> | 9076 |
> | 9077 |
> | 9078 |
> | 9079 |
> | 9080 |
> | 9081 |
> | 9082 |
> | 9083 |
> | 9084 |
> | 9085 |
> | 9086 |
> | 9087 |
> +--+--+
> | Cust_Id  |
> +--+--+
> | 9088 |
> | 9089 |
> | 9090 |
> | 9091 |
> | 9092 |
> | 9093 |
> | 9094 |
> | 9095 |
> | 9096 |
> | 9097 |
> | 9098 |
> | 9099 |
> | 9100 |
> | 9101 |
> | 9102 |
> | 9103 |
> | 9104 |
> | 9105 |
> | 9106 |
> | 9107 |
> | 9108 |
> | 9109 |
> | 9110 |
> | 9111 |
> | 9112 |
> | 9113 |
> | 9114 |
> | 9115 |
> | 9116 |
> | 9117 |
> | 9118 |
> | 9119 |
> | 9120 |
> | 9121 |
> | 9122 |
> | 9123 |
> | 9124 |
> | 9125 |
> | 9126 |
> | 9127 |
> | 9128 |
> | 9129 |
> | 9130 |
> | 9131 |
> | 9132 |
> | 9133 |
> | 9134 |
> | 9135 |
> | 9136 |
> | 9137 |
> | 9138 |
> | 9139 |
> | 9140 |
> | 9141 |
> | 9142 |
> | 9143 |
> | 9144 |
> | 9145 |
> | 9146 |
> | 9147 |
> | 9148 |
> | 9149 |
> | 9150 |
> | 9151 |
> | 9152 |
> | 9153 |
> | 9154 |
> | 9155 |
> | 9156 |
> | 9157 |
> | 9158 |
> | 9159 |
> | 9160 |
> | 9161 |
> | 9162 |
> | 9163 |
> | 9164 |
> | 9165 |
> | 9166 |
> | 9167 |
> | 9168 |
> | 9169 |
> | 9170 |
> | 9171 |
> | 9172 |
> | 9173 |
> | 9174 |
> | 9175 |
> | 9176 |
> | 9177 |
> | 9178 |
> | 9179 |
> | 9180  

[jira] [Commented] (CARBONDATA-1043) Data Load fail in Automation

2017-05-10 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004612#comment-16004612
 ] 

Ravindra Pesala commented on CARBONDATA-1043:
-

I cannot reproduce this issue.
Please provide more logs.

> Data Load fail in Automation 
> -
>
> Key: CARBONDATA-1043
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1043
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.0.0-incubating
> Environment: Spark1.6
>Reporter: SWATI RAO
>Priority: Trivial
> Attachments: 2000_UniqData.csv
>
>
> Steps to reproduce :
> 1.CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='ACTIVE_EMUI_VERSION')
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> DataLoad failure: There is an unexpected error: 3
> -
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='ACTIVE_EMUI_VERSION')
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> DataLoad failure: There is an unexpected error: 3
> -
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='ACTIVE_EMUI_VERSION')
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> DataLoad failure: There is an unexpected error: 3
>  DRIVER STACK TRACE :
> INFO  10-05 15:43:54,750 - DataLoad failure
> org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
>  There is an unexpected error: 3
>   at 
> org.apache.carbondata.processing.newflow.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:137)
>   at 
> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:48)
>   at 
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD$$anon$1.<init>(NewCarbonDataLoadRDD.scala:166)
>   at 
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD.compute(NewCarbonDataLoadRDD.scala:142)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CARBONDATA-837) Unable to delete records from carbondata table

2017-05-10 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004627#comment-16004627
 ] 

Ravindra Pesala commented on CARBONDATA-837:


[~sanoj_mg] Since this issue was opened a long time ago, it was fixed by another 
contributor. Please feel free to take other jira issues. 

> Unable to delete records from carbondata table
> --
>
> Key: CARBONDATA-837
> URL: https://issues.apache.org/jira/browse/CARBONDATA-837
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.1.0
> Environment: HDP 2.5, Spark 1.6.2
>Reporter: Sanoj MG
>Assignee: Sanoj MG
>Priority: Minor
>
> As per below document I am trying to delete entries from the table :
> https://github.com/apache/incubator-carbondata/blob/master/docs/dml-operation-on-carbondata.md
> scala> cc.sql("select * from accountentity").count
> res10: Long = 391351
> scala> cc.sql("delete from accountentity")
> INFO  30-03 09:03:03,099 - main Query [DELETE FROM ACCOUNTENTITY]
> INFO  30-03 09:03:03,104 - Parsing command: select tupleId from accountentity
> INFO  30-03 09:03:03,104 - Parse Completed
> INFO  30-03 09:03:03,105 - Parsing command: select tupleId from accountentity
> INFO  30-03 09:03:03,105 - Parse Completed
> res11: org.apache.spark.sql.DataFrame = []
> scala> cc.sql("select * from accountentity").count
> res12: Long = 391351
> The records get deleted only when an action such as show() is applied. 
> scala> cc.sql("delete from accountentity").show



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-788) Like operator is not working properly

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-788.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Like operator is not working properly
> -
>
> Key: CARBONDATA-788
> URL: https://issues.apache.org/jira/browse/CARBONDATA-788
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0
> Environment: spark 2.1
>Reporter: Geetika Gupta
>Assignee: Rahul Kumar
>Priority: Trivial
> Fix For: 1.1.0
>
> Attachments: 2000_UniqData.csv
>
>
> I tried to create a table using the following command:
> CREATE TABLE uniqdata_INC(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> Load command for the table :
> LOAD DATA INPATH 
> 'hdfs://localhost:54311/BabuStore/DATA/uniqdata/2000_UniqData.csv' into table 
> uniqdata_INC OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> When I performed the below query on the table using the 'like' operator, it 
> displayed no results.
> select cust_id from uniqdata_INC where cust_id like 8999;
> Result:
> +--+--+
> | cust_id  |
> +--+--+
> +--+--+
> No rows selected (0.515 seconds)
> PFA the csv used for input data.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CARBONDATA-704) data mismatch between hive and carbondata after loading for bigint values

2017-05-10 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004563#comment-16004563
 ] 

Ravindra Pesala commented on CARBONDATA-704:


I have verified this in the current master, and it is fixed.

> data mismatch between hive and carbondata after loading for bigint values
> -
>
> Key: CARBONDATA-704
> URL: https://issues.apache.org/jira/browse/CARBONDATA-704
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: SWATI RAO
>Assignee: anubhav tarar
> Attachments: Test_Data1 (4).csv
>
>
> carbondata
> 0: jdbc:hive2://localhost:1> create table Test_Boundary (c1_int 
> int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 
> 'org.apache.carbondata.format' ;
> 0: jdbc:hive2://localhost:1>  LOAD DATA INPATH 
> 'hdfs://localhost:54310/Test_Data1.csv' INTO table Test_Boundary OPTIONS  
>   
> ('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='');
> 0: jdbc:hive2://localhost:1> select c2_Bigint from Test_Boundary;
> +--+--+
> |  c2_Bigint   |
> +--+--+
> | NULL |
> | NULL |
> | NULL |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> +--+--+
> but in hive
> create table Test_Boundary_hive (c1_int int,c2_Bigint Bigint,c3_Decimal 
> Decimal(38,30),c4_double double,c5_string string,c6_Timestamp 
> Timestamp,c7_Datatype_Desc string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY 
> ",";
> LOAD DATA LOCAL INPATH 'Test_Data1.csv' into table Test_Boundary_hive;
> select c2_Bigint from Test_Boundary_hive;
> +---+--+
> |   c2_Bigint   |
> +---+--+
> | 1234  |
> | 2345  |
> | 3456  |
> | 4567  |
> | 9223372036854775807   |
> | -9223372036854775808  |
> | -9223372036854775807  |
> | -9223372036854775806  |
> | -9223372036854775805  |
> | 0 |
> | 9223372036854775807   |
> | 9223372036854775807   |
> | 9223372036854775807   |
> | NULL  |
> | NULL  |
> | NULL  |
> +---+--+



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-789) Join operation does not work properly in Carbon data.

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-789.

Resolution: Won't Fix

> Join operation does not work properly  in Carbon data. 
> ---
>
> Key: CARBONDATA-789
> URL: https://issues.apache.org/jira/browse/CARBONDATA-789
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0
> Environment: Spark 2.1
>Reporter: Vinod Rohilla
>Priority: Trivial
> Attachments: 2000_UniqData.csv
>
>
> Join operation does not work properly  in Carbon data for the int data type.
> Steps to Reproduces:
> A) Create Table in Hive:
> First table:
> CREATE TABLE uniqdata_nobucket11_Hive (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
> First table Load:
> LOAD DATA LOCAL INPATH 
> '/home/vinod/Desktop/AllCSV/2000_UniqData.csv'OVERWRITE INTO TABLE 
> uniqdata_nobucket11_Hive;
> Second  table :
> CREATE TABLE uniqdata_nobucket22_Hive (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
> Second  table Load:
> LOAD DATA LOCAL INPATH 
> '/home/vinod/Desktop/AllCSV/2000_UniqData.csv'OVERWRITE INTO TABLE 
> uniqdata_nobucket22_Hive;
> Results in Hive:
> | CUST_ID  |CUST_NAME |ACTIVE_EMUI_VERSION |  DOB 
>   |  DOJ   | BIGINT_COLUMN1  | BIGINT_COLUMN2  | 
> DECIMAL_COLUMN1 | DECIMAL_COLUMN2 |Double_COLUMN1|
> Double_COLUMN2 | INTEGER_COLUMN1  | CUST_ID  |CUST_NAME |
> ACTIVE_EMUI_VERSION |  DOB   |  DOJ   | 
> BIGINT_COLUMN1  | BIGINT_COLUMN2  | DECIMAL_COLUMN1 | 
> DECIMAL_COLUMN2 |Double_COLUMN1|Double_COLUMN2 | 
> INTEGER_COLUMN1  |
> +--+--++++-+-+-+-+--+---+--+--+--++++-+-+-+-+--+---+--+--+
> | 10999| CUST_NAME_01999  | ACTIVE_EMUI_VERSION_01999  | 1975-06-23 
> 01:00:03.0  | 1975-06-23 02:00:03.0  | 123372038853| -223372034855   | 
> 12345680900.123400  | 22345680900.123400  | 1.12345674897976E10  | 
> -1.12345674897976E10  | 2000 | 10999| CUST_NAME_01999  | 
> ACTIVE_EMUI_VERSION_01999  | 1975-06-23 01:00:03.0  | 1975-06-23 02:00:03.0  
> | 123372038853| -223372034855   | 12345680900.123400  | 
> 22345680900.123400  | 1.12345674897976E10  | -1.12345674897976E10  | 2000 
> |
> +--+--++++-+-+-+-+--+---+--+--+--++++-+-+-+-+--+---+--+--+
> 2,001 rows selected (3.369 seconds)
> B) Create table in Carbon data 
> First Table:
> CREATE TABLE uniqdata_nobucket11 (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' ;
> Load Data in table:
> LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table 
> uniqdata_nobucket11 OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Create Second table:
> CREATE TABLE uniqdata_nobucket22 (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB 

[jira] [Resolved] (CARBONDATA-530) Query with order by and limit is not optimized properly

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-530.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Query with order by and limit is not optimized properly
> 
>
> Key: CARBONDATA-530
> URL: https://issues.apache.org/jira/browse/CARBONDATA-530
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ashok Kumar
>Assignee: Ashok Kumar
>Priority: Minor
> Fix For: 1.1.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> For an order by query with a limit, Spark optimizes the plan. But since we put
> the Decoder between the Limit and TungstenSort operators, Spark is not able to
> optimize the plan; see the plan below:
> |== Physical Plan ==
> |Limit 2
> | ConvertToSafe
> |  CarbonDictionaryDecoder [CarbonDecoderRelation(Map(name#3 -> name#3),CarbonDatasourceRelation(`default`.`dict`,None))], ExcludeProfile(ArrayBuffer(name#3)), CarbonAliasDecoderRelation()
> |   TungstenSort [name#3 ASC], true, 0
> |    ConvertToUnsafe
> |     Exchange rangepartitioning(name#3 ASC)
> |      ConvertToSafe
> |       CarbonDictionaryDecoder [CarbonDecoderRelation(Map(name#3 -> name#3),CarbonDatasourceRelation(`default`.`dict`,None))], IncludeProfile(ArrayBuffer(name#3)), CarbonAliasDecoderRelation()
> |        CarbonScan [name#3], (CarbonRelation default, dict, CarbonMetaData(ArrayBuffer(name),ArrayBuffer(default_dummy_measure),org.apache.carbondata.core.carbon.metadata.schema.table.CarbonTable@6021d179,DictionaryMap(Map(name -> true))), org.apache.carbondata.spark.merger.TableMeta@4c3f903d, None), [(name#3 = hello)], false
> |Code Generation: true
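
For reference, a query of roughly this shape would produce the plan above (the table, column, filter, and limit are inferred from the CarbonScan and Limit operators in the plan; the JDBC URL and driver setup are assumptions, not taken from the report):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class OrderByLimitRepro {
      public static void main(String[] args) throws Exception {
        // Assumed Thrift server endpoint; needs the Hive JDBC driver on the classpath.
        try (Connection conn =
                 DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement();
             // Filter + order by + limit on the dictionary column `name`,
             // matching the CarbonScan filter [(name#3 = hello)] and Limit 2 above.
             ResultSet rs = stmt.executeQuery(
                 "select name from dict where name = 'hello' order by name limit 2")) {
          while (rs.next()) {
            System.out.println(rs.getString(1));
          }
        }
      }
    }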

[jira] [Resolved] (CARBONDATA-646) Bad record handling is not correct for Int data type

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-646.

Resolution: Duplicate

Duplicate of CARBONDATA-542

> Bad record handling is not correct for Int data type
> 
>
> Key: CARBONDATA-646
> URL: https://issues.apache.org/jira/browse/CARBONDATA-646
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.0.0-incubating, 0.1.1-incubating
> Environment: Spark 1.6
>Reporter: Ramakrishna
>Assignee: Manish Gupta
>Priority: Minor
> Attachments: 646_1.PNG, 646_2.PNG
>
>
> With bad record handling as default:
> If a char value is given for an int data type, it is handled properly (moved
> to NULL).
> If a decimal value is given for an int data type, the decimal part is
> stripped, whereas it should be considered a bad record and moved to NULL.
> Bad record csv:
> TRUE,2.7,423.0,A,2003454300, 
> 121.5,4.99,2.44,SE3423ee,asfdsffdfg,EtryTRWT,2012-01-12 
> 03:14:05.123456729,2012-01-20
> 0: jdbc:hive2://172.168.100.212:23040> select * from t_carbn01 where 
> qty_total is NULL;
> ++---+--++-+--+-+-++-+--++--+--+
> | active_status  | item_type_cd  | qty_day_avg  | qty_total  | sell_price 
>  | sell_pricep  | discount_price  | profit  | item_code  |  item_name  | 
> outlet_name  |  update_time   | create_date  |
> ++---+--++-+--+-+-++-+--++--+--+
> | TRUE   | 2 | 423  | NULL   | 
> 2003454304  | 121.5| 4.99| 2.44| SE3423ee   | 
> asfdsffdfg  | EtryTRWT | 2012-01-12 03:14:05.0  | 2012-01-20   |
> ++---+--++-+--+-+-++-+--++--
> 0: jdbc:hive2://172.168.100.212:23040> desc t_carbn01;
> +-+---+--+--+
> |col_name |   data_type   | comment  |
> +-+---+--+--+
> | active_status   | string|  |
> | item_type_cd| bigint|  |
> | qty_day_avg | bigint|  |
> | qty_total   | bigint|  |
> | sell_price  | bigint|  |
> | sell_pricep | double|  |
> | discount_price  | double|  |
> | profit  | decimal(3,2)  |  |
> | item_code   | string|  |
> | item_name   | string|  |
> | outlet_name | string|  |
> | update_time | timestamp |  |
> | create_date | string|  |
> +-+---+--+--+
>  
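
The two behaviours described above can be illustrated with plain Java parsing (an analogy for the expected handling, not CarbonData's actual parsing code):

    public class BadRecordIllustration {
      public static void main(String[] args) {
        // A decimal value is not a valid int literal, so strict parsing rejects it
        // (the same reasoning that makes a char value a bad record):
        try {
          Integer.parseInt("2.7");                        // throws NumberFormatException
        } catch (NumberFormatException e) {
          System.out.println("should be a bad record -> NULL");
        }
        // Parsing it as a double and truncating silently yields 2, which is the
        // behaviour being reported as incorrect:
        System.out.println((int) Double.parseDouble("2.7"));  // prints 2
      }
    }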



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-664) Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-664.

Resolution: Duplicate

Duplicate of CARBONDATA-726

> Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.
> 
>
> Key: CARBONDATA-664
> URL: https://issues.apache.org/jira/browse/CARBONDATA-664
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
>Reporter: Harsh Sharma
>  Labels: bug
> Attachments: 100_olap_C20.csv, Driver Logs, Executor Logs
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Below scenario is working on Spark 2.1, but not on Spark 1.6
> create table VMALL_DICTIONARY_INCLUDE (imei string,deviceInformationId 
> int,MAC string,deviceColor string,device_backColor string,modelId 
> string,marketName string,AMSize string,ROMSize string,CUPAudit 
> string,CPIClocked string,series string,productionDate timestamp,bomCode 
> string,internalModels string, deliveryTime string, channelsId string, 
> channelsName string , deliveryAreaId string, deliveryCountry string, 
> deliveryProvince string, deliveryCity string,deliveryDistrict string, 
> deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, 
> ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity 
> string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, 
> Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion 
> string, Active_BacVerNumber string, Active_BacFlashVer string, 
> Active_webUIVersion string, Active_webUITypeCarrVer 
> string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, 
> Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, 
> Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, 
> Latest_country string, Latest_province string, Latest_city string, 
> Latest_district string, Latest_street string, Latest_releaseId string, 
> Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber 
> string, Latest_BacFlashVer string, Latest_webUIVersion string, 
> Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, 
> Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, 
> Latest_operatorId string, gamePointDescription string,gamePointId 
> double,contractNumber BigInt) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='imei,deviceInformationId,productionDate,gamePointId,Latest_DAY,contractNumber');
> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/100_olap_C20.csv' INTO 
> table VMALL_DICTIONARY_INCLUDE 
> options('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription');
> select sum(deviceinformationId) from VMALL_DICTIONARY_INCLUDE where 
> deviceColor ='5Device Color' and modelId != '109' or Latest_DAY > 
> '1234567890123540.00' and contractNumber == '92233720368547800' or 
> Active_operaSysVersion like 'Operating System Version' and gamePointId <=> 
> '8.1366141918611E39' and deviceInformationId < '100' and productionDate 
> not like '2016-07-01' and imei is null and Latest_HOUR is not null and 
> channelsId <= '7' and Latest_releaseId >= '1' and Latest_MONTH between 6 and 
> 8 and Latest_YEAR not between 2016 and 2017 and Latest_HOUR RLIKE '12' and 
> gamePointDescription REGEXP 'Site' and imei in 
> ('1AA1','1AA100','1AA10','1AA1000','1AA1','1AA10','1AA100','1AA11','1AA12','1AA14','','NULL')
>  and Active_BacVerNumber not in ('Background version number1','','null');
> This scenario results in the following 

[jira] [Resolved] (CARBONDATA-615) Update query stores wrong value for Date data type

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-615.

Resolution: Duplicate

Duplicate of CARBONDATA-603

> Update query stores wrong value for Date data type
> -
>
> Key: CARBONDATA-615
> URL: https://issues.apache.org/jira/browse/CARBONDATA-615
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6 Spark 2.1
>Reporter: Anurag Srivastava
>Assignee: ravikiran
>Priority: Minor
> Attachments: 2000_UniqData.csv, update_dob.png
>
>
> I am trying to update the DOB column, which has the Date data type. It is
> storing the day before the date I specified when updating the DOB column.
> *Create Table :* CREATE TABLE uniqdata (CUST_ID int,CUST_NAME 
> char(30),ACTIVE_EMUI_VERSION string, DOB Date, DOJ Date, BIGINT_COLUMN1 
> bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> *Load Data :* LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' 
> into table uniqdata OPTIONS ('DELIMITER'=',' 
> ,'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true');
> *Update Query :*  update uniqdata set (dob)=(to_date('2016-12-01')) where 
> cust_name = 'CUST_NAME_01999';
> *Expected Result :* It should update DOB column with *2016-12-01*.
> *Actual Result :* It is updating DOB column with *2016-11-30*.
> !https://issues.apache.org/jira/secure/attachment/12846515/update_dob.png!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-626) [Dataload] Dataloading is not working with delimiter set as "|"

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-626.

Resolution: Duplicate

Duplicate of CARBONDATA-622

> [Dataload] Dataloading is not working with delimiter set as "|"
> ---
>
> Key: CARBONDATA-626
> URL: https://issues.apache.org/jira/browse/CARBONDATA-626
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
> Environment: 3 node cluster
> Spark Version:-1.6
>Reporter: SOURYAKANTA DWIVEDY
>Assignee: QiangCai
>
> Description: Data loading fails with delimiter "|".
> Steps:
> > 1. Create table
> > 2. Load data into table
> Log :-
> -
> - create table DIM_TERMINAL 
> (
> ID int,
> TAC String,
> TER_BRAND_NAME String,
> TER_MODEL_NAME String,
> TER_MODENAME String,
> TER_TYPE_ID String,
> TER_TYPE_NAME_EN String,
> TER_TYPE_NAME_CHN String,
> TER_OSTYPE String,
> TER_OS_TYPE_NAME String,
> HSPASPEED String,
> LTESPEED String,
> VOLTE_FLAG String,
> flag String
> ) stored by 'org.apache.carbondata.format' TBLPROPERTIES 
> ('DICTIONARY_INCLUDE'='TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> - jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 
> 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL.csv' INTO table DIM_TERMINAL1 
> OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 
> 'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> Error: java.lang.RuntimeException: Data loading failed. table not found: 
> default.dim_terminal1 (state=,code=0)
> 0: jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 
> 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL1.csv' INTO table DIM_TERMINAL 
> OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 
> 'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> Error: org.apache.spark.sql.AnalysisException: Reference 'D' is ambiguous, 
> could be: D#4893, D#4907, D#4920, D#4935, D#4952, D#5025, D#5034.; 
> (state=,code=0)
> - csv raw details :  
> 103880|99000537|MI|2S H1SC 3C|2G/3G|0|SmartPhone|SmartPhone|4|Android|||1| 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-399) [Bad Records] Data Load is not FAILED even bad_records_action="FAIL" .

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-399.

Resolution: Duplicate

Duplicate of CARBONDATA-424

> [Bad Records] Data Load is not FAILED even  bad_records_action="FAIL" .
> ---
>
> Key: CARBONDATA-399
> URL: https://issues.apache.org/jira/browse/CARBONDATA-399
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 0.1.0-incubating
> Environment: SUSE 11 SP4
> YARN HA 
> 3 Nodes
>Reporter: Babulal
>Assignee: Akash R Nilugal
>Priority: Minor
>
> Data load is not FAILED when string data is loaded into an int column.
> 1. Create table  defect_5 (imei string ,deviceInformationId int,mac 
> string,productdate timestamp,updatetime timestamp,gamePointId 
> double,contractNumber double) stored by 'carbondata' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='deviceInformationId') ;
> deviceInformationId is int (it will be handled as a dimension). Now load the
> data:
> 2.  0: jdbc:hive2://ha-cluster/default> LOAD DATA  inpath 
> 'hdfs://hacluster/tmp/100_default_date_11_header_2.csv' into table defect_5 
> options('DELIMITER'=',', 'bad_records_action'='FAIL',  
> 'QUOTECHAR'='"','FILEHEADER'='imei,deviceinformationid,mac,productdate,updatetime,gamepointid,contractnumber');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.969 seconds)
> 3. Data 
> imei,deviceinformationid,mac,productdate,updatetime,gamepointid,contractnumber
> 1AA1,babu,Mikaa1,2015-01-01 11:00:00,2015-01-01 13:00:00,10,260
> 1AA2,3,Mikaa2,2015-01-02 12:00:00,2015-01-01 14:00:00,278,230
> 1AA3,1,Mikaa1,2015-01-03 13:00:00,2015-01-01 15:00:00,2556,1
> 1AA4,10,Mikaa2,2015-01-04 14:00:00,2015-01-01 16:00:00,640,254
> 1AA5,10,Mikaa,2015-01-05 15:00:00,2015-01-01 17:00:00,980,256
> 1AA6,10,Mikaa,2015-01-06 16:00:00,2015-01-01 18:00:00,1,2378
> 1AA7,10,Mikaa,2015-01-07 17:00:00,2015-01-01 19:00:00,96,234
> 1AA8,9,max,2015-01-08 18:00:00,2015-01-01 20:00:00,89,236
> 1AA9,10,max,2015-01-09 19:00:00,2015-01-01 21:00:00,198.36,239.2
> Expected Output: Data load should FAIL
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1038) DICTIONARY_EXCLUDE is not working for string datatype when used with Spark Datasource DDL

2017-05-08 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1038.
-
Resolution: Duplicate

Duplicate of https://issues.apache.org/jira/browse/CARBONDATA-829

> DICTIONARY_EXCLUDE is not working for string datatype when used with Spark 
> Datasource DDL
> -
>
> Key: CARBONDATA-1038
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1038
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
> Environment: spark 2.1
>Reporter: Neha Bhardwaj
>Assignee: Kunal Kapoor
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> DICTIONARY_EXCLUDE is throwing an exception for the string data type when used
> with Spark Datasource DDL.
> Steps to reproduce:
> CREATE TABLE IF NOT EXISTS uniq_product( productNumber Int, productName 
> String, storeCity String, storeProvince String, 
> productCategory String, productBatch String, saleQuantity Int, revenue Int) 
> USING 
> org.apache.spark.sql.CarbonSource OPTIONS('COLUMN_GROUPS'='(productCategory)',
>  'DICTIONARY_EXCLUDE'='productName', 'DICTIONARY_INCLUDE'='productNumber',
>  'NO_INVERTED_INDEX'='productBatch','bucketnumber'='1', 
> 'bucketcolumns'='productNumber','tableName'='uniq_product');
> Expected Output :
> Table should get created.
> Actual Output :
> Error: org.apache.carbondata.spark.exception.MalformedCarbonCommandException: 
> DICTIONARY_EXCLUDE is unsupported for stringtype data type column: 
> productname (state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-155) Code refactor to avoid the Type Casting in compareTo method

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-155:
---
Priority: Trivial  (was: Major)

> Code refactor to avoid the Type Casting in compareTo method
> ---
>
> Key: CARBONDATA-155
> URL: https://issues.apache.org/jira/browse/CARBONDATA-155
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Trivial
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> PraveenAdlakha added a note 23 hours ago
> Hi,
> I would like to suggest a couple of things here: remove Comparable from the
> class definition, as Distributable already implements it.
> Let us use generics so that we do not have to type-cast everywhere in the
> compareTo method; for that we need to do two things:
> 1) Change the class definition of Distributable to:
> public abstract class Distributable<T extends Distributable<T>> implements Comparable<T>
> 2) Change the compareTo method definition to:
> public int compareTo(TableBlockInfo other)
> Let me know in case you are facing any issue in doing this; I will provide the
> patch if needed.
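
A compilable sketch of the suggested refactor (the generic parameters in the note above appear to have been stripped by the renderer and are reconstructed here; the blockLength field is hypothetical, added only to make the example self-contained):

    // A self-recursive bound lets subclasses compare against their own type without casts.
    abstract class Distributable<T extends Distributable<T>> implements Comparable<T> {
    }

    class TableBlockInfo extends Distributable<TableBlockInfo> {
      private final long blockLength;  // hypothetical field, for illustration only

      TableBlockInfo(long blockLength) {
        this.blockLength = blockLength;
      }

      @Override
      public int compareTo(TableBlockInfo other) {
        // No type cast needed: `other` is already a TableBlockInfo.
        return Long.compare(this.blockLength, other.blockLength);
      }
    }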



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-155) Code refactor to avoid the Type Casting in compareTo method

2017-05-09 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-155:
---
Issue Type: Improvement  (was: Bug)

> Code refactor to avoid the Type Casting in compareTo method
> ---
>
> Key: CARBONDATA-155
> URL: https://issues.apache.org/jira/browse/CARBONDATA-155
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> PraveenAdlakha added a note 23 hours ago
> Hi,
> I would like to suggest a couple of things here: remove Comparable from the
> class definition, as Distributable already implements it.
> Let us use generics so that we do not have to type-cast everywhere in the
> compareTo method; for that we need to do two things:
> 1) Change the class definition of Distributable to:
> public abstract class Distributable<T extends Distributable<T>> implements Comparable<T>
> 2) Change the compareTo method definition to:
> public int compareTo(TableBlockInfo other)
> Let me know in case you are facing any issue in doing this; I will provide the
> patch if needed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CARBONDATA-1032) NumberFormatException and NegativeArraySizeException for select with in clause filter limit for unsafe true configuration

2017-05-09 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002280#comment-16002280
 ] 

Ravindra Pesala commented on CARBONDATA-1032:
-

Please provide reproducible steps and data.

> NumberFormatException and NegativeArraySizeException for select with in 
> clause filter limit for unsafe true configuration
> -
>
> Key: CARBONDATA-1032
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1032
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0
> Environment: 3 node cluster SUSE 11 SP4
>Reporter: Chetan Bhat
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> carbon.properties is configured as below:
> carbon.allowed.compaction.days = 2
> carbon.enable.auto.load.merge = false
> carbon.compaction.level.threshold = 3,2
> carbon.timestamp.format = -MM-dd
> carbon.badRecords.location = /tmp/carbon
> carbon.numberof.preserve.segments = 2
> carbon.sort.file.buffer.size = 20
> max.query.execution.time = 60
> carbon.number.of.cores.while.loading = 8
> carbon.storelocation =hdfs://hacluster/opt/CarbonStore
> enable.data.loading.statistics = true
> enable.unsafe.sort = true
> offheap.sort.chunk.size.inmb = 128
> sort.inmemory.size.inmb = 30720
> carbon.enable.vector.reader=true
> enable.unsafe.in.query.processing=true
> enable.query.statistics=true
> carbon.blockletgroup.size.in.mb=128
> high.cardinality.identify.enable=TRUE
> high.cardinality.threshold=1
> high.cardinality.value=1000
> high.cardinality.row.count.percentage=40
> carbon.data.file.version=2
> carbon.major.compaction.size=2
> carbon.enable.auto.load.merge=FALSE
> carbon.numberof.preserve.segments=1
> carbon.allowed.compaction.days=1
> User creates a table, loads 1535088 records, and executes a select with an
> in-clause filter and limit.
> Actual Result :
> NumberFormatException and NegativeArraySizeException for select with in 
> clause filter limit for unsafe true configuration.
> 0: jdbc:hive2://172.168.100.199:23040> select * from flow_carbon_test4 where 
> opp_bk in ('149199158','149199116','149199022','149199031')  
> and dt>='20140101' and dt <= '20160101' order by bal asc limit 1000;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 1 in stage 2109.0 failed 4 times, most recent failure: Lost task 1.3 in 
> stage 2109.0 (TID 75120, linux-49, executor 2): 
> java.lang.NegativeArraySizeException
> at 
> org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeBigDecimalMeasureChunkStore.getBigDecimal(UnsafeBigDecimalMeasureChunkStore.java:132)
> at 
> org.apache.carbondata.core.datastore.compression.decimal.CompressByteArray.getBigDecimalValue(CompressByteArray.java:94)
> at 
> org.apache.carbondata.core.datastore.dataholder.CarbonReadDataHolder.getReadableBigDecimalValueByIndex(CarbonReadDataHolder.java:38)
> at 
> org.apache.carbondata.core.scan.result.vector.MeasureDataVectorProcessor$DecimalMeasureVectorFiller.fillMeasureVectorForFilter(MeasureDataVectorProcessor.java:253)
> at 
> org.apache.carbondata.core.scan.result.impl.FilterQueryScannedResult.fillColumnarMeasureBatch(FilterQueryScannedResult.java:119)
> at 
> org.apache.carbondata.core.scan.collector.impl.DictionaryBasedVectorResultCollector.scanAndFillResult(DictionaryBasedVectorResultCollector.java:145)
> at 
> org.apache.carbondata.core.scan.collector.impl.DictionaryBasedVectorResultCollector.collectVectorBatch(DictionaryBasedVectorResultCollector.java:137)
> at 
> org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.processNextBatch(DataBlockIteratorImpl.java:65)
> at 
> org.apache.carbondata.core.scan.result.iterator.VectorDetailQueryResultIterator.processNextBatch(VectorDetailQueryResultIterator.java:46)
> at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextBatch(VectorizedCarbonRecordReader.java:251)
> at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextKeyValue(VectorizedCarbonRecordReader.java:141)
> at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:221)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown
>  Source)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
> at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at 

[jira] [Resolved] (CARBONDATA-1050) int and short measures should not be considered as long.

2017-05-17 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1050.
-
   Resolution: Fixed
 Assignee: ravikiran
Fix Version/s: 1.1.1
   1.2.0

> int and short measures should not be considered as long.
> 
>
> Key: CARBONDATA-1050
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1050
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load, data-query
>Reporter: ravikiran
>Assignee: ravikiran
>Priority: Minor
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1061) If AL_DICTIONARY_PATH is used in the load option then SINGLE_PASS must be used.

2017-05-17 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1061.
-
   Resolution: Fixed
Fix Version/s: 1.1.1
   1.2.0

> If AL_DICTIONARY_PATH is used in the load option then SINGLE_PASS must be used.
> --
>
> Key: CARBONDATA-1061
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1061
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.1.0
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1056) Data_load failure using single_pass true with spark 2.1

2017-05-17 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1056.
-
   Resolution: Fixed
Fix Version/s: 1.1.1
   1.2.0

> Data_load failure using single_pass true with spark 2.1
> ---
>
> Key: CARBONDATA-1056
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1056
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.1.0
> Environment: spark 2.1
>Reporter: Vandana Yadav
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.2.0, 1.1.1
>
> Attachments: 2000_UniqData.csv
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Data_load failure using single_pass true with spark 2.1
> Steps to reproduce:
> 1)Create Table:
> CREATE TABLE uniq_exclude_sp1 (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='CUST_NAME,ACTIVE_EMUI_VERSION');
> 2) Load Data:
> LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table 
> uniq_exclude_sp1 OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_Pass'='true');
> 3)Result:
> Actual result on beeline:
> Error: java.lang.Exception: Dataload failed due to error while writing 
> dictionary file! (state=,code=0)
> Expected Result: data should be loaded successfully.
> 4)Thriftserver logs:
> 17/05/16 16:07:20 INFO SparkExecuteStatementOperation: Running query 'LOAD 
> DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table 
> uniq_exclude_sp1 OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_Pass'='true')'
>  with 34eb7e9e-bd49-495c-af68-8f0b5e36b786
> 17/05/16 16:07:20 INFO CarbonSparkSqlParser: Parsing command: LOAD DATA 
> INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table uniq_exclude_sp1 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_Pass'='true')
> 17/05/16 16:07:20 INFO CarbonLateDecodeRule: pool-23-thread-4 Skip 
> CarbonOptimizer
> 17/05/16 16:07:20 INFO HdfsFileLock: pool-23-thread-4 HDFS lock 
> path:hdfs://localhost:54310/opt/prestocarbonStore/default/uniq_exclude_sp1/meta.lock
> 17/05/16 16:07:20 INFO LoadTable: pool-23-thread-4 Successfully able to get 
> the table metadata file lock
> 17/05/16 16:07:20 INFO LoadTable: pool-23-thread-4 Initiating Direct Load for 
> the Table : (default.uniq_exclude_sp1)
> 17/05/16 16:07:20 AUDIT CarbonDataRDDFactory$: 
> [knoldus][hduser][Thread-137]Data load request has been received for table 
> default.uniq_exclude_sp1
> 17/05/16 16:07:20 INFO CommonUtil$: pool-23-thread-4 [Block Distribution]
> 17/05/16 16:07:20 INFO CommonUtil$: pool-23-thread-4 totalInputSpaceConsumed: 
> 376223 , defaultParallelism: 4
> 17/05/16 16:07:20 INFO CommonUtil$: pool-23-thread-4 
> mapreduce.input.fileinputformat.split.maxsize: 16777216
> 17/05/16 16:07:20 INFO FileInputFormat: Total input paths to process : 1
> 17/05/16 16:07:20 INFO DistributionUtil$: pool-23-thread-4 Executors 
> configured : 1
> 17/05/16 16:07:20 INFO DistributionUtil$: pool-23-thread-4 Total Time taken 
> to ensure the required executors : 0
> 17/05/16 16:07:20 INFO DistributionUtil$: pool-23-thread-4 Time elapsed to 
> allocate the required executors: 0
> 17/05/16 16:07:20 INFO CarbonDataRDDFactory$: pool-23-thread-4 Total Time 
> taken in block allocation: 1
> 17/05/16 16:07:20 INFO CarbonDataRDDFactory$: pool-23-thread-4 Total no of 
> blocks: 1, No.of Nodes: 1
> 17/05/16 16:07:20 INFO CarbonDataRDDFactory$: pool-23-thread-4 #Node: knoldus 
> no.of.blocks: 1
> 17/05/16 16:07:20 INFO MemoryStore: Block broadcast_2 stored as values in 
> memory (estimated size 53.7 MB, free 291.4 MB)
> 17/05/16 16:07:20 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes 
> in memory (estimated size 23.2 KB, free 291.4 MB)
> 17/05/16 16:07:20 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory 
> on 192.168.1.10:42046 (size: 23.2 KB, free: 366.2 MB)
> 17/05/16 16:07:20 INFO 

[jira] [Resolved] (CARBONDATA-1013) Unexpected characters display in results while using join query.

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1013.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

> Unexpected characters display in results while using join query.
> -
>
> Key: CARBONDATA-1013
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1013
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0
> Environment: spark 2.1
>Reporter: Vandana Yadav
>Assignee: Srigopal Mohanty
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: customer_C1.csv, payment_C1.csv, unwanted characters.png
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Unexpected characters display in the result set while using a join query.
> Steps to reproduce:
> 1) Create tables:
> a) In carbondata:
> table 1 : create table Comp_TABLE_ONE_JOIN (customer_uid String,customer_id 
> int, gender String, first_name String, middle_name String, last_name 
> String,customer_address String, country String) STORED BY 
> 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='customer_uid')
> table 2: create table Comp_TABLE_TWO_JOIN (customer_payment_id 
> String,customer_id int,payment_amount Decimal(15,5), payment_mode 
> String,payment_details String) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='customer_payment_id')
> b) In hive:
> table 1: create table Comp_TABLE_ONE_JOIN_h (customer_uid String,customer_id 
> int, gender String, first_name String, middle_name String, last_name 
> String,customer_address String, country String)ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ",";
> table 2:create table Comp_TABLE_TWO_JOIN_h (customer_payment_id 
> String,customer_id int,payment_amount Decimal(15,5), payment_mode 
> String,payment_details String) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
> 2) Load Data :
> a) In Carbondata:
> table 1 : LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/customer_C1.csv' INTO 
> table Comp_TABLE_ONE_JOIN options 
> ('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='customer_uid,customer_id,gender,first_name,middle_name,last_name,customer_address,country')
> table 2:LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/payment_C1.csv' INTO table 
> Comp_TABLE_TWO_JOIN options 
> ('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='customer_payment_id
>  ,customer_id,payment_amount ,payment_mode, payment_details')
> b) In hive:
> table 1: LOAD DATA LOCAL INPATH 
> '/home/knoldus/Desktop/csv/TestData2/Data/customer_C1.csv' INTO table 
> Comp_TABLE_ONE_JOIN_h;
> table 2: LOAD DATA LOCAL INPATH 
> '/home/knoldus/Desktop/csv/TestData2/Data/payment_C1.csv' INTO table 
> Comp_TABLE_TWO_JOIN_h;
> 3)Execute query:
> select * from Comp_TABLE_ONE_JOIN join Comp_TABLE_TWO_JOIN on 
> Comp_TABLE_ONE_JOIN.customer_id=Comp_TABLE_TWO_JOIN.customer_id limit  2;
> Actual Result:
> a) In Carbondata:
> +---+--+-+-+--++---++---+--+-+---+-+--+
> | customer_uid  | customer_id  | gender  | first_name  | middle_name  | 
> last_name  | customer_address  |  country   |  customer_payment_id  | 
> customer_id  | payment_amount  | payment_mode  |   payment_details   |
> +---+--+-+-+--++---++---+--+-+---+-+--+
> | UID31a31  | 31   | female  | fname31 | mname31  | 
> lname31| address 31| country31  | Cust_payment_ID31a31  | 31  
>  | 193288.72000|   |p  |
> | UID31a31  | 31   | female  | fname31 | mname31  | 
> lname31| address 31| country31  | Cust_payment_ID31a31  | 31  
>  | 193288.72000|   |p  |
> +---+--+-+-+--++---++---+--+-+---+-+--+
> 2 rows selected (0.499 seconds)
> b) In Hive
> ---+--+-+-+--++---+---+--+--+-+---+-+--+
> | customer_uid  | customer_id  | gender  | first_name  | middle_name  | 
> last_name  | customer_address  |  country  | customer_payment_id  | 
> customer_id  | payment_amount  | payment_mode  | 

[jira] [Resolved] (CARBONDATA-1019) Like Filter Pushdown

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1019.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

> Like Filter Pushdown
> 
>
> Key: CARBONDATA-1019
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1019
> Project: CarbonData
>  Issue Type: Bug
>Reporter: sounak chakraborty
>Assignee: sounak chakraborty
> Fix For: 1.1.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Like Filter Pushdown in carbon layer to convert into Range Filter.
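
A minimal sketch of the idea, assuming the pushdown targets prefix patterns such as col LIKE 'abc%' (the helper name and exact rewrite are illustrative, not CarbonData's actual code):

    public class LikeToRange {
      // Rewrites col LIKE 'prefix%' into prefix <= col < upperBound, where
      // upperBound is the prefix with its last character incremented by one.
      static String likeToRange(String column, String prefix) {
        int last = prefix.length() - 1;
        String upper = prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
        return column + " >= '" + prefix + "' AND " + column + " < '" + upper + "'";
      }

      public static void main(String[] args) {
        System.out.println(likeToRange("name", "abc"));
        // prints: name >= 'abc' AND name < 'abd'
      }
    }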



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1027) insert into/data load failing for numeric dictionary included column having null value

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1027.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

> insert into/data load failing for numeric dictionary included column having 
> null value
> --
>
> Key: CARBONDATA-1027
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1027
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 1.1.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> insert into/data load failing for numeric dictionary included column having 
> null value



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-999) use carbondata bucket feature, but it doesn't seem to work?

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-999.

Resolution: Won't Fix

Bucketing is not supported in Spark 1.6

> use carbondata bucket feature, but it doesn't seem to work?
> --
>
> Key: CARBONDATA-999
> URL: https://issues.apache.org/jira/browse/CARBONDATA-999
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.1.0
> Environment: spark 1.6.2,carbondata 1.1.0 rc1
>Reporter: xuzhiliang
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> 1. CREATE TABLE shop_test(platFormId int,sellerNick string,companyGuid 
> STRING,companyName STRING) STORED BY 'carbondata' TBLPROPERTIES 
> ('BUCKETNUMBER'='2','BUCKETCOLUMNS'='sellerNick')
> 2. When loading data,
> the sorter is of type ParallelReadMergeSorterImpl, not
> ParallelReadMergeSorterWithBucketingImpl. Why is configuration.getBucketingInfo
> null? What is wrong with that? Can you fix it?
> 3. hadoop dfs -lsr /Opt/CarbonStore/default/shop_test
> drwxr-xr-x   - root supergroup  0 2017-04-27 15:37 
> /Opt/CarbonStore/default/shop_test/Fact
> drwxr-xr-x   - root supergroup  0 2017-04-27 15:37 
> /Opt/CarbonStore/default/shop_test/Fact/Part0
> drwxr-xr-x   - root supergroup  0 2017-04-27 15:37 
> /Opt/CarbonStore/default/shop_test/Fact/Part0/Segment_0
> -rw-r--r--   3 root supergroup566 2017-04-27 15:37 
> /Opt/CarbonStore/default/shop_test/Fact/Part0/Segment_0/0_batchno0-0-1493278648826.carbonindex
> -rw-r--r--   3 root supergroup891 2017-04-27 15:37 
> /Opt/CarbonStore/default/shop_test/Fact/Part0/Segment_0/part-0-0_batchno0-0-1493278648826.carbondata



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-976) Wrong entry getting deleted from schemaEvolution during alter revert

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-976.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Wrong entry getting deleted from schemaEvolution during alter revert
> 
>
> Key: CARBONDATA-976
> URL: https://issues.apache.org/jira/browse/CARBONDATA-976
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
> Fix For: 1.1.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-921) selecting columns out of order in hive doesn't work

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-921:
---
Component/s: hive-integration

> selecting columns out of order in hive doesn't work
> ---
>
> Key: CARBONDATA-921
> URL: https://issues.apache.org/jira/browse/CARBONDATA-921
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
> Environment: spark 2.1, hive 1.2.1
>Reporter: Neha Bhardwaj
>Assignee: anubhav tarar
>Priority: Minor
> Attachments: abc.csv
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Selecting columns non-sequentially (out of order) fails to render output.
> Steps to reproduce:
> 1) In Spark Shell :
> a) Create Table -
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val carbon = 
> SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:54310/opt/data")
> scala> carbon.sql(" create table abc(id int, name string) stored by 
> 'carbondata' ").show
> b) Load Data -
> scala> carbon.sql(""" load data inpath 'hdfs://localhost:54310/Files/abc.csv' 
> into table abc """ ).show
> 2) In Hive :
> a) Add Jars -
> add jar 
> /home/neha/incubator-carbondata/assembly/target/scala-2.11/carbondata_2.11-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar;
> add jar /opt/spark-2.1.0-bin-hadoop2.7/jars/spark-catalyst_2.11-2.1.0.jar;
> add jar 
> /home/neha/incubator-carbondata/integration/hive/carbondata-hive-1.1.0-incubating-SNAPSHOT.jar;
> b) Create Table -
> create table abc(id int,name string);
> c) Alter location -
> hive> alter table abc set LOCATION 
> 'hdfs://localhost:54310/opt/data/default/abc' ;
> d) Set Properties -
> set hive.mapred.supports.subdirectories=true;
> set mapreduce.input.fileinputformat.input.dir.recursive=true;
> d) Alter FileFormat -
> alter table abc set FILEFORMAT
> INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
> OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
> SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";
> e) Queries -
> hive> select id from abc; //Works Fine(Column in order)
> hive> select name from abc;   //Doesn't Work(Column out of order)
> hive> select id,name from abc;//Works Fine(Columns in order)
> hive> select name,id from abc;//Doesn't Work(Columns out of order)
> Expected output : Query - hive> select name,id from abc;
> displays the data of the specified columns.
> Actual output : Query - hive> select name,id from abc;
> OK
> Failed with exception java.io.IOException:java.lang.ClassCastException: 
> java.lang.String cannot be cast to java.lang.Long
> Time taken: 0.079 seconds



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-902) NoClassDefFoundError for Decimal datatype during select queries

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-902:
---
Component/s: hive-integration

> NoClassDefFoundError for Decimal datatype during select queries
> ---
>
> Key: CARBONDATA-902
> URL: https://issues.apache.org/jira/browse/CARBONDATA-902
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
> Environment: Spark 2.1, Hive 1.2.1
>Reporter: Neha Bhardwaj
>Assignee: anubhav tarar
>Priority: Minor
> Attachments: testHive1.csv
>
>
> The decimal data type raises an exception while selecting data from the table
> in Hive.
> Steps to reproduce:
> 1) In Spark Shell :
>  a) Create Table -
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val carbon = 
> SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:54310/opt/data")
>  
>  scala> carbon.sql(""" create table testHive1(id int,name string,dob 
> timestamp,experience decimal,salary double,incentive bigint) stored 
> by'carbondata' """).show 
>  b) Load Data - 
> scala> carbon.sql(""" load data inpath 
> 'hdfs://localhost:54310/Files/testHive1.csv' into table testHive1 """ ).show
> 2) In Hive : 
>  a) Add Jars - 
> add jar 
> /home/neha/incubator-carbondata/assembly/target/scala-2.11/carbondata_2.11-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar;
> add jar /opt/spark-2.1.0-bin-hadoop2.7/jars/spark-catalyst_2.11-2.1.0.jar;
> add jar 
> /home/neha/incubator-carbondata/integration/hive/carbondata-hive-1.1.0-incubating-SNAPSHOT.jar;
>
>  
>  b) Create Table -
> create table testHive1(id int,name string,dob timestamp,experience 
> decimal,salary double,incentive bigint);
> c) Alter location - 
> hive> alter table testHive1 set LOCATION 
> 'hdfs://localhost:54310/opt/data/default/testhive1' ;
>  d) Set Properties - 
> set hive.mapred.supports.subdirectories=true;
> set mapreduce.input.fileinputformat.input.dir.recursive=true;
> d) Alter FileFormat -
> alter table testHive1 set FILEFORMAT
> INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
> OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
> SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";
>  f) Execute Queries - 
> select * from testHive1;
> 3) Query :
> hive> select * from testHive1;
> Expected Output : 
> ResultSet should display all the data present in the table.
> Result:
> Exception in thread "[main][partitionID:testhive1;queryID:8945394553892]" 
> java.lang.NoClassDefFoundError: org/apache/spark/sql/types/Decimal
>   at 
> org.apache.carbondata.core.scan.collector.impl.AbstractScannedResultCollector.getMeasureData(AbstractScannedResultCollector.java:109)
>   at 
> org.apache.carbondata.core.scan.collector.impl.AbstractScannedResultCollector.fillMeasureData(AbstractScannedResultCollector.java:78)
>   at 
> org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillMeasureData(DictionaryBasedResultCollector.java:158)
>   at 
> org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectData(DictionaryBasedResultCollector.java:115)
>   at 
> org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:51)
>   at 
> org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
>   at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:50)
>   at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
>   at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
>   at 
> org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.(ChunkRowIterator.java:41)
>   at 
> org.apache.carbondata.hive.CarbonHiveRecordReader.initialize(CarbonHiveRecordReader.java:84)
>   at 
> org.apache.carbondata.hive.CarbonHiveRecordReader.(CarbonHiveRecordReader.java:66)
>   at 
> org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:673)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>   at 

[jira] [Updated] (CARBONDATA-837) Unable to delete records from carbondata table

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-837:
---
Fix Version/s: (was: NONE)
   1.1.1

> Unable to delete records from carbondata table
> --
>
> Key: CARBONDATA-837
> URL: https://issues.apache.org/jira/browse/CARBONDATA-837
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.1.0
> Environment: HDP 2.5, Spark 1.6.2
>Reporter: Sanoj MG
>Assignee: Sanoj MG
>Priority: Minor
> Fix For: 1.1.1
>
>
> As per the document below, I am trying to delete entries from the table:
> https://github.com/apache/incubator-carbondata/blob/master/docs/dml-operation-on-carbondata.md
> scala> cc.sql("select * from accountentity").count
> res10: Long = 391351
> scala> cc.sql("delete from accountentity")
> INFO  30-03 09:03:03,099 - main Query [DELETE FROM ACCOUNTENTITY]
> INFO  30-03 09:03:03,104 - Parsing command: select tupleId from accountentity
> INFO  30-03 09:03:03,104 - Parse Completed
> INFO  30-03 09:03:03,105 - Parsing command: select tupleId from accountentity
> INFO  30-03 09:03:03,105 - Parse Completed
> res11: org.apache.spark.sql.DataFrame = []
> scala> cc.sql("select * from accountentity").count
> res12: Long = 391351
> The records get deleted only when an action such as show() is applied:
> scala> cc.sql("delete from accountentity").show



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-802) Select query is throwing exception if new dictionary column is added without any default value

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-802.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Select query is throwing exception if new dictionary column is added without 
> any default value
> --
>
> Key: CARBONDATA-802
> URL: https://issues.apache.org/jira/browse/CARBONDATA-802
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 1.1.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Select query is throwing an exception if a new dictionary column is added
> without any default value.
> e.g., create table test(id int, name string) stored by 'carbondata'
> alter table test add columns(country string)
> tblproperties('default.value.country'='india') --> select query is passing
> alter table test add columns(state string) --> select query is failing



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-713) Use store location in properties when user didn't pass the location as the parameter

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-713:
---
Issue Type: Improvement  (was: Bug)

> Use store location in properties when user didn't pass the location as the 
> parameter
> 
>
> Key: CARBONDATA-713
> URL: https://issues.apache.org/jira/browse/CARBONDATA-713
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Affects Versions: 1.1.0
>Reporter: Yadong Qi
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The store location of carbon comes from three places:
> 1. the default location path in code (../carbon.store)
> 2. the "carbon.storelocation" setting in carbon.properties
> 3. the location passed as a parameter
> The priority is from low to high.
> But when I create a CarbonContext or CarbonSession without any parameters and
> configure "carbon.storelocation" in carbon.properties, the final value of the
> location is the default (../carbon.store).
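
A sketch of the intended precedence (the key name and default path are taken from the description above; the helper itself is hypothetical):

    import java.util.Properties;

    public class StoreLocationResolver {
      // Parameter > carbon.properties > default in code, from high to low priority.
      static String resolveStoreLocation(String parameter, Properties carbonProperties) {
        if (parameter != null) {
          return parameter;                                  // passed as the parameter
        }
        String configured = carbonProperties.getProperty("carbon.storelocation");
        if (configured != null) {
          return configured;                                 // from carbon.properties
        }
        return "../carbon.store";                            // default location in code
      }
    }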



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-698) no_inverted_index is not working

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-698.

Resolution: Duplicate

> no_inverted_index is not working
> 
>
> Key: CARBONDATA-698
> URL: https://issues.apache.org/jira/browse/CARBONDATA-698
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating
> Environment: spark1.6,2.1
>Reporter: anubhav tarar
>Assignee: anubhav tarar
>Priority: Trivial
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I am creating NO_INVERTED_INDEX with invalid values using both Spark 1.6 and
> 2.1, and the create succeeds in both cases.
> spark 2.1 logs
> 0: jdbc:hive2://localhost:1> DROP TABLE IF EXISTS  productSalesTable;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (3.621 seconds)
> 0: jdbc:hive2://localhost:1>   CREATE TABLE productSalesTable( 
> productNumber Int, productName String, storeCity String, storeProvince 
> String, productCategory String, productBatch String, saleQuantity Int, 
> revenue Int) STORED BY 'carbondata' TBLPROPERTIES ('COLUMN_GROUPS'='( 
> productName)','DICTIONARY_INCLUDE'='productName', 'NO_INVERTED_INDEX'='1');
> spark 1.6 logs
> cc.sql("DROP TABLE IF EXISTS  productSalesTable").show();
> cc.sql("""CREATE TABLE productSalesTable( productNumber Int, productName 
> String, storeCity String, storeProvince String, productCategory String, 
> productBatch String, saleQuantity Int, revenue Int) STORED BY 'carbondata' 
> TBLPROPERTIES ('COLUMN_GROUPS'='( 
> productName)','DICTIONARY_INCLUDE'='productName', 
> 'NO_INVERTED_INDEX'='1')""").show()
> AUDIT 07-02 15:48:14,485 - [knoldus][knoldus][Thread-1]Creating Table with 
> Database name [default] and Table name [productsalestable]
> AUDIT 07-02 15:48:14,868 - [knoldus][knoldus][Thread-1]Table created with 
> Database name [default] and Table name [productsalestable]
> While debugging the code I found out that in the carbon DDL SQL parser the
> tableProperties map contains the properties in lower case,
> and we are checking tableProperties.get("NO_INVERTED_INDEX"),
> so the lookup is failing due to the string case mismatch.
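
The mismatch can be reproduced with a plain map lookup (the lower-casing of keys is as described in the analysis above):

    import java.util.HashMap;
    import java.util.Map;

    public class PropertyLookup {
      public static void main(String[] args) {
        // The DDL parser stores table property keys lower-cased:
        Map<String, String> tableProperties = new HashMap<>();
        tableProperties.put("no_inverted_index", "1");

        // The upper-case lookup misses, so the invalid value is never validated:
        System.out.println(tableProperties.get("NO_INVERTED_INDEX"));               // null
        // Lower-casing the key first finds it:
        System.out.println(tableProperties.get("NO_INVERTED_INDEX".toLowerCase())); // 1
      }
    }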



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-688) Abnormal behaviour of double datatype when used in DICTIONARY_INCLUDE and filtering null values

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-688.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Abnormal behaviour of double datatype when used in DICTIONARY_INCLUDE and 
> filtering null values
> ---
>
> Key: CARBONDATA-688
> URL: https://issues.apache.org/jira/browse/CARBONDATA-688
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0
> Environment: Spark 2.1
>Reporter: Geetika Gupta
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: 100_olap_C20.csv
>
>
> I tried to create a table having a double column and load null values into 
> that table. When I perform a select query on the table, it displays 
> wrong data.
> Below are the commands used:
> Create table :
> create table  Comp_VMALL_DICTIONARY_INCLUDE (imei string,deviceInformationId 
> int,MAC string,deviceColor string,device_backColor string,modelId 
> string,marketName string,AMSize string,ROMSize string,CUPAudit 
> string,CPIClocked string,series string,productionDate timestamp,bomCode 
> string,internalModels string, deliveryTime string, channelsId string, 
> channelsName string , deliveryAreaId string, deliveryCountry string, 
> deliveryProvince string, deliveryCity string,deliveryDistrict string, 
> deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, 
> ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity 
> string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, 
> Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion 
> string, Active_BacVerNumber string, Active_BacFlashVer string, 
> Active_webUIVersion string, Active_webUITypeCarrVer 
> string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, 
> Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, 
> Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, 
> Latest_country string, Latest_province string, Latest_city string, 
> Latest_district string, Latest_street string, Latest_releaseId string, 
> Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber 
> string, Latest_BacFlashVer string, Latest_webUIVersion string, 
> Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, 
> Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, 
> Latest_operatorId string, gamePointDescription string,gamePointId 
> double,contractNumber BigInt)  STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='imei,deviceInformationId,productionDate,gamePointId,Latest_DAY,contractNumber');
> Load command:
> LOAD DATA INPATH  'hdfs://localhost:54311/BabuStore/DATA/100_olap_C20.csv' 
> INTO table Comp_VMALL_DICTIONARY_INCLUDE options ('DELIMITER'=',', 
> 'QUOTECHAR'='"', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription');
> Select query:
> select gamePointId  from Comp_VMALL_DICTIONARY_INCLUDE where gamePointId IS 
> NOT NULL order by gamePointId;
> select gamePointId from Comp_VMALL_DICTIONARY_INCLUDE where gamePointId is 
> NULL;
> The first select command displays null values as well and the second command 
> displays no values.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CARBONDATA-631) Select,Delete and Insert Query Failing for table created in 0.2 with data loaded in 1.0

2017-05-10 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004237#comment-16004237
 ] 

Ravindra Pesala commented on CARBONDATA-631:


[~pallavisingh_09] It seems this issue is already fixed. Please verify it on 
the current master and close it if it is fixed.

> Select,Delete and Insert Query Failing for table created in 0.2 with data 
> loaded in 1.0
> ---
>
> Key: CARBONDATA-631
> URL: https://issues.apache.org/jira/browse/CARBONDATA-631
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6 
>Reporter: Pallavi Singh
>Assignee: kumar vishal
> Fix For: NONE
>
>
> Created table  with the 0.2 jar:
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB");
> then 
> LOAD DATA INPATH 'hdfs://localhost:54310/csv/2000_UniqData.csv' into table 
> uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Switched to 1.0 jar
> LOAD DATA INPATH 'hdfs://localhost:54310/csv/2000_UniqData.csv' into table 
> uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> After successful load :
> select count(*) from uniqdata;
> I get following error : 
> INFO  12-01 18:31:04,057 - Running query 'select count(*) from uniqdata' with 
> 81129cf3-fcd4-429d-9adf-d37d35cdf051
> INFO  12-01 18:31:04,058 - pool-27-thread-46 Query [SELECT COUNT(*) FROM 
> UNIQDATA]
> INFO  12-01 18:31:04,060 - Parsing command: select count(*) from uniqdata
> INFO  12-01 18:31:04,060 - Parse Completed
> INFO  12-01 18:31:04,061 - Parsing command: select count(*) from uniqdata
> INFO  12-01 18:31:04,061 - Parse Completed
> INFO  12-01 18:31:04,061 - 27: get_table : db=12jan17 tbl=uniqdata
> INFO  12-01 18:31:04,061 - ugi=pallaviip=unknown-ip-addr  
> cmd=get_table : db=12jan17 tbl=uniqdata 
> INFO  12-01 18:31:04,061 - 27: Opening raw store with implemenation 
> class:org.apache.hadoop.hive.metastore.ObjectStore
> INFO  12-01 18:31:04,063 - ObjectStore, initialize called
> INFO  12-01 18:31:04,068 - Reading in results for query 
> "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is 
> closing
> INFO  12-01 18:31:04,069 - Using direct SQL, underlying DB is DERBY
> INFO  12-01 18:31:04,069 - Initialized ObjectStore
> INFO  12-01 18:31:04,101 - pool-27-thread-46 Starting to optimize plan
> ERROR 12-01 18:31:04,168 - pool-27-thread-46 Cannot convert12-01-2017 
> 16:02:28 to Time/Long type valueUnparseable date: "12-01-2017 16:02:28"
> ERROR 12-01 18:31:04,185 - pool-27-thread-46 Cannot convert12-01-2017 
> 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
> ERROR 12-01 18:31:04,185 - pool-27-thread-46 Cannot convert12-01-2017 
> 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
> ERROR 12-01 18:31:04,204 - pool-27-thread-46 Cannot convert12-01-2017 
> 16:02:08 to Time/Long type valueUnparseable date: "12-01-2017 16:02:08"
> ERROR 12-01 18:31:04,210 - Error executing query, currentState RUNNING, 
> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
> CarbonDictionaryDecoder [CarbonDecoderRelation(Map(dob#280 -> dob#280, 
> double_column1#287 -> double_column1#287, decimal_column1#285 -> 
> decimal_column1#285, cust_id#282L -> cust_id#282L, integer_column1#289L -> 
> integer_column1#289L, decimal_column2#286 -> decimal_column2#286, 
> cust_name#278 -> cust_name#278, double_column2#288 -> double_column2#288, 
> active_emui_version#279 -> active_emui_version#279, bigint_column1#283L -> 
> bigint_column1#283L, bigint_column2#284L -> bigint_column2#284L, doj#281 -> 
> doj#281),CarbonDatasourceRelation(`12jan17`.`uniqdata`,None))], 
> ExcludeProfile(ArrayBuffer()), CarbonAliasDecoderRelation()
> +- TungstenAggregate(key=[], 
> functions=[(count(1),mode=Final,isDistinct=false)], output=[_c0#750L])
>+- TungstenExchange SinglePartition, None
>   +- TungstenAggregate(key=[], 
> functions=[(count(1),mode=Partial,isDistinct=false)], output=[count#754L])
>  +- CarbonScan CarbonRelation 12jan17, 

[jira] [Resolved] (CARBONDATA-1023) Able to do load from dataframe with byte data type in carbon table

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1023.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

> Able to do load from dataframe with byte data type in carbon table
> --
>
> Key: CARBONDATA-1023
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1023
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.0.0-incubating
> Environment: spark common
>Reporter: anubhav tarar
>Assignee: anubhav tarar
>Priority: Trivial
> Fix For: 1.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I am able to load data with ByteType from a DataFrame into a carbon table,
> even though the documentation says byte type is not supported.
> import org.apache.spark.sql.{Row, SaveMode}
> import org.apache.spark.sql.types.{ByteType, StringType, StructField, StructType}
> val rdd1 = sqlContext.sparkContext.parallelize(
>   Row("byte", 1234.toByte) :: Nil)
> val schema1 = StructType(
>   StructField("string", StringType, nullable = false) ::
>   StructField("byte", ByteType, nullable = false) :: Nil)
> val dataFrame = sqlContext.createDataFrame(rdd1, schema1)
> dataFrame.write
>   .format("carbondata")
>   .option("tableName", "carbon1")
>   .option("compress", "true")
>   .mode(SaveMode.Overwrite)
>   .save()
> sql("select * from carbon1").show()
> +--++
> |string|byte|
> +--++
> |  byte| -46|
> +--++
> If we try to create a table with byte type directly, it shows an error:
>   sql(
>   "CREATE TABLE restructure (empno byte, empname String, designation 
> String, doj Timestamp, " +
>   "workgroupcategory int, workgroupcategoryname String, deptno int, 
> deptname String, " +
>   "projectcode int, projectjoindate Timestamp, projectenddate 
> Timestamp,attendance int," +
>   "utilization int,salary int) STORED BY 'org.apache.carbondata.format'")
> Result: 
> org.apache.carbondata.spark.exception.MalformedCarbonCommandException: 
> Unsupported data type: StructField(empno,ByteType,true).getType



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1001) Data type change should support int to long conversion

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1001.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

> Data type change should support int to long conversion
> --
>
> Key: CARBONDATA-1001
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1001
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.1.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
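> A hedged illustration of the conversion this asks for (table and column
> names are hypothetical; the column name is repeated so only the type changes):
> sql("CREATE TABLE t (a INT, b STRING) STORED BY 'carbondata'")
> sql("ALTER TABLE t CHANGE a a BIGINT")   // int -> long (bigint) should be allowed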




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-973) Improve Index File loading in performance

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-973:
---
Issue Type: Improvement  (was: Bug)

> Improve Index File loading in performance
> -
>
> Key: CARBONDATA-973
> URL: https://issues.apache.org/jira/browse/CARBONDATA-973
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Assignee: kumar vishal
>
> Problem: Index file loading is taking more time in a large cluster.
> Solution: To get the index files, a list-files call is currently made on the 
> file system; instead of calling list files, construct the path directly and 
> read the file.
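> An illustrative sketch of the proposed read path (names are hypothetical):
> build the index file path from known segment metadata instead of listing the
> directory:
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.Path
> def readIndexFile(segmentDir: String, indexFileName: String): Array[Byte] = {
>   val path = new Path(segmentDir, indexFileName)   // construct the path, no list call
>   val fs   = path.getFileSystem(new Configuration())
>   val in   = fs.open(path)
>   try {
>     val buf = new Array[Byte](fs.getFileStatus(path).getLen.toInt)
>     in.readFully(buf)
>     buf
>   } finally in.close()
> }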



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-917) count(*) doesn't work

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-917:
---
Component/s: hive-integration

> count(*) doesn't work
> -
>
> Key: CARBONDATA-917
> URL: https://issues.apache.org/jira/browse/CARBONDATA-917
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
> Environment: scala 2.1, Hive 1.2.1
>Reporter: Neha Bhardwaj
>Assignee: anubhav tarar
>Priority: Minor
> Attachments: abc.csv
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Select query with count(*) fails to render output
> Steps to reproduce:
> 1) In Spark Shell :
> a) Create Table -
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val carbon = 
> SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:54310/opt/data")
> scala> carbon.sql(" create table abc(id int, name string) stored by 
> 'carbondata' ").show
> b) Load Data - 
> scala> carbon.sql(""" load data inpath 'hdfs://localhost:54310/Files/abc.csv' 
> into table abc """ ).show
> 2) In Hive :
> a) Add Jars - 
> add jar 
> /home/neha/incubator-carbondata/assembly/target/scala-2.11/carbondata_2.11-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar;
> add jar /opt/spark-2.1.0-bin-hadoop2.7/jars/spark-catalyst_2.11-2.1.0.jar;
> add jar 
> /home/neha/incubator-carbondata/integration/hive/carbondata-hive-1.1.0-incubating-SNAPSHOT.jar;
> b) Create Table -
> create table abc(id int,name string);
> c) Alter location - 
> hive> alter table abc set LOCATION 
> 'hdfs://localhost:54310/opt/data/default/abc' ;
> d) Set Properties - 
> set hive.mapred.supports.subdirectories=true;
> set mapreduce.input.fileinputformat.input.dir.recursive=true;
> e) Alter FileFormat -
> alter table abc set FILEFORMAT
> INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
> OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
> SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";
> f) Query -
> hive> select count(*) from abc;
> Expected Output : 
> ResultSet should display the count of the number of rows in the table.
> Result:
> Query ID = hduser_20170412181449_85a7db42-42a1-450c-9931-dc7b3b00b412
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Job running in-process (local Hadoop)
> 2017-04-12 18:14:53,949 Stage-1 map = 0%,  reduce = 0%
> Ended Job = job_local220086106_0001 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: http://localhost:8080/
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-868) Select query on decimal datatype is not working fine after adding decimal column using alter

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-868.

   Resolution: Fixed
Fix Version/s: 1.1.0

> Select query on decimal datatype is not working fine after adding decimal 
> column using alter
> 
>
> Key: CARBONDATA-868
> URL: https://issues.apache.org/jira/browse/CARBONDATA-868
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Spark2.1
>Reporter: SWATI RAO
>Assignee: Srigopal Mohanty
> Fix For: 1.1.0
>
> Attachments: 2000_UniqData.csv
>
>
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> ALTER TABLE uniqdata RENAME TO uniqdata1;
> alter table uniqdata1 add columns(msrField 
> decimal(5,2))TBLPROPERTIES('DEFAULT.VALUE.msrfield'= '123.45');
> 0: jdbc:hive2://192.168.2.126:1> select msrField from uniqdata1;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 48.0 failed 1 times, most recent failure: Lost task 0.0 in 
> stage 48.0 (TID 1041, localhost, executor driver): 
> java.lang.ArrayIndexOutOfBoundsException: 4186
>   at 
> org.apache.spark.sql.execution.vectorized.OnHeapColumnVector.putInt(OnHeapColumnVector.java:202)
>   at 
> org.apache.spark.sql.execution.vectorized.ColumnVector.putDecimal(ColumnVector.java:608)
>   at 
> org.apache.carbondata.spark.vectorreader.ColumnarVectorWrapper.putDecimal(ColumnarVectorWrapper.java:58)
>   at 
> org.apache.carbondata.spark.vectorreader.ColumnarVectorWrapper.putDecimals(ColumnarVectorWrapper.java:64)
>   at 
> org.apache.carbondata.core.scan.collector.impl.RestructureBasedVectorResultCollector.fillDataForNonExistingMeasures(RestructureBasedVectorResultCollector.java:202)
>   at 
> org.apache.carbondata.core.scan.collector.impl.RestructureBasedVectorResultCollector.collectVectorBatch(RestructureBasedVectorResultCollector.java:98)
>   at 
> org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.processNextBatch(DataBlockIteratorImpl.java:65)
>   at 
> org.apache.carbondata.core.scan.result.iterator.VectorDetailQueryResultIterator.processNextBatch(VectorDetailQueryResultIterator.java:46)
>   at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextBatch(VectorizedCarbonRecordReader.java:246)
>   at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextKeyValue(VectorizedCarbonRecordReader.java:140)
>   at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:222)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
>   at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
>   at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:99)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> 

[jira] [Updated] (CARBONDATA-699) using column group column name in dictionary_exclude do not give any exception

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-699:
---
Fix Version/s: 1.1.1

> using column group column name in dictionary_exclude do not give any exception
> --
>
> Key: CARBONDATA-699
> URL: https://issues.apache.org/jira/browse/CARBONDATA-699
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating
> Environment: spark 2.1
>Reporter: anubhav tarar
>Assignee: anubhav tarar
>Priority: Minor
> Fix For: 1.1.1
>
>
> Using a column group column name in DICTIONARY_EXCLUDE does not give any 
> exception in Spark 2.1; it gives an exception in Spark 1.6, which is correct.
> Here are the logs in Spark 2.1.0:
>  
> 0: jdbc:hive2://localhost:1> CREATE TABLE IF NOT EXISTS 
> productSalesTable1 ( productNumber Int, productName String, storeCity String, 
> storeProvince String, productCategory String, productBatch String, 
> saleQuantity Int, revenue Int) STORED BY 'carbondata' TBLPROPERTIES 
> ('COLUMN_GROUPS'='(productName,productNumber)', 
> 'DICTIONARY_EXCLUDE'='productName', 'DICTIONARY_INCLUDE'='productNumber', 
> 'NO_INVERTED_INDEX'='productBatch');
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (3.844 seconds)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-677) Fixed GC issue and Re-factor Dictionary based result collector

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-677.

Resolution: Duplicate

> Fixed GC issue and Re-factor Dictionary based result collector
> --
>
> Key: CARBONDATA-677
> URL: https://issues.apache.org/jira/browse/CARBONDATA-677
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Problem: When a timestamp or date type column is selected in the query 
> projection, we create a direct dictionary result collector object for every 
> access; this is causing lots of GC.
> Solution: Create only one object for timestamp and date types and reuse it.
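> A rough sketch of the intended shape of the fix (class names hypothetical):
> class DirectDictionaryDecoder {
>   def decode(surrogate: Int): Long = surrogate.toLong   // placeholder decode
> }
> class ResultCollector {
>   // created once when the collector is constructed ...
>   private val decoder = new DirectDictionaryDecoder
>   // ... and reused for every row, instead of `new DirectDictionaryDecoder` per value
>   def collect(surrogates: Array[Int]): Array[Long] = surrogates.map(decoder.decode)
> }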



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-649) Rand() function is not working while updating data

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-649:
---
Fix Version/s: 1.1.1

> Rand() function is not working while updating data
> --
>
> Key: CARBONDATA-649
> URL: https://issues.apache.org/jira/browse/CARBONDATA-649
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
>Reporter: Anurag Srivastava
>Assignee: sounak chakraborty
>Priority: Minor
> Fix For: 1.1.1
>
> Attachments: 2000_UniqData.csv, error_random_function.png, 
> executor_log
>
>
> I am using the update functionality with *rand(1)* and *rand()*, which return 
> a deterministic value and a random value respectively.
> But when I run the query it gives an error.
> *Create Table :* CREATE TABLE uniqdata (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> *Load Data :* LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' 
> into table uniqdata OPTIONS ('DELIMITER'=',' 
> ,'QUOTECHAR'='""','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','MAXCOLUMNS'='12');
> *Query-1 :* Update uniqdata  set (decimal_column1) = (rand());
> *Query-2 :* Update uniqdata  set (decimal_column1) = (rand(1));
> Expected Result: Update the column with a random value.
> Actual Result: Error: java.lang.RuntimeException: Update operation failed. 
> Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most 
> recent failure: Lost task 0.3 in stage 3.0 (TID 205, 192.168.2.140): 
> java.lang.ArrayIndexOutOfBoundsException: 1
> !https://issues.apache.org/jira/secure/attachment/12847788/error_random_function.png!
> I have attached screen shot of log, executor log and CSV with this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-652) Cannot update a table with 1000 columns

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-652:
---
Fix Version/s: 1.1.1

> Cannot update a table with 1000 columns
> ---
>
> Key: CARBONDATA-652
> URL: https://issues.apache.org/jira/browse/CARBONDATA-652
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
>Reporter: Deepti Bhardwaj
>Assignee: sounak chakraborty
>Priority: Minor
> Fix For: 1.1.1
>
> Attachments: create-table-1000-columns, 
> create-table-1000-columns-hive, data.csv, error-while-update.png, thrift-log
>
>
> I created a hive table and loaded it with data(data.csv).
> The commands for hive table are in attached 
> file(create-table-1000-columns-hive)
> Then I created a carbon table and inserted data in it from the above hive 
> table(see create-table-1000-columns)
> after which I fired the below query:
> update tablewith1000columns set (a1)=('testing!~~~') where a1='A1';
> and it gave java.lang.ArrayIndexOutOfBoundsException
> !https://issues.apache.org/jira/secure/attachment/12847806/error-while-update.png!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-602) When we are loading data 3 or 4 time using 'USE_KETTLE' ='false' with 'SINGLE_PASS'='true', It is throwing an error

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-602.

   Resolution: Fixed
Fix Version/s: NONE

> When we are  loading data 3 or 4 time using 'USE_KETTLE' ='false' with 
> 'SINGLE_PASS'='true', It is throwing an error
> 
>
> Key: CARBONDATA-602
> URL: https://issues.apache.org/jira/browse/CARBONDATA-602
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
> Environment: spark-1.6
>Reporter: Payal
>Assignee: QiangCai
> Fix For: NONE
>
> Attachments: 7000_UniqData.csv
>
>
> When we are loading data using 'USE_KETTLE'='false' with 'SINGLE_PASS'='true', 
> it throws an error -- Error: java.lang.Exception: Data load failed due to 
> error while write dictionary file! (state=,code=0) -- and without 
> 'USE_KETTLE'='false' the data load is successful.
> For Example:
> CREATE TABLE uniqdata_INCLUDEDICTIONARY (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> 0: jdbc:hive2://192.168.2.126:1> LOAD DATA INPATH 
> 'hdfs://localhost:54311/payal/7000_UniqData.csv' into table 
> uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE'
>  ='fail');
> Error: java.lang.IllegalArgumentException: For input string: "fail" 
> (state=,code=0)
> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' 
> into table uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true');
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> LOGS.
> INFO  06-01 13:31:54,820 - Running query 'LOAD DATA INPATH 
> 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
> uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE'
>  ='false')' with 2e6007f7-946d-4071-a73f-30d90538ebd6
> INFO  06-01 13:31:54,820 - pool-26-thread-58 Query [LOAD DATA INPATH 
> 'HDFS://HADOOP-MASTER:54311/DATA/UNIQDATA/7000_UNIQDATA.CSV' INTO TABLE 
> UNIQDATA_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,DOUBLE_COLUMN1,DOUBLE_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='TRUE','USE_KETTLE'
>  ='FALSE')]
> INFO  06-01 13:31:54,831 - Successfully able to get the table metadata file 
> lock
> INFO  06-01 13:31:54,834 - pool-26-thread-58 Initiating Direct Load for the 
> Table : (meradb.uniqdata_includedictionary)
> AUDIT 06-01 13:31:54,838 - [deepak-Vostro-3546][hduser][Thread-494]Data load 
> request has been received for table meradb.uniqdata_includedictionary
> AUDIT 06-01 13:31:54,838 - [deepak-Vostro-3546][hduser][Thread-494]Data is 
> loading with New Data Flow for table meradb.uniqdata_includedictionary
> INFO  06-01 13:31:54,891 - pool-26-thread-58 [Block Distribution]
> INFO  06-01 13:31:54,891 - pool-26-thread-58 totalInputSpaceConsumed: 1505367 
> , defaultParallelism: 8
> INFO  06-01 13:31:54,891 - pool-26-thread-58 
> mapreduce.input.fileinputformat.split.maxsize: 16777216
> INFO  06-01 13:31:54,891 - Total input paths to process : 1
> INFO  06-01 13:31:54,892 - pool-26-thread-58 Executors configured : 1
> INFO  06-01 13:31:54,893 - pool-26-thread-58 Requesting total executors: 1
> INFO  06-01 13:31:54,897 - pool-26-thread-58 Total Time taken to ensure the 
> required executors : 3
> 

[jira] [Updated] (CARBONDATA-586) Create table with 'Char' data type but it workes as 'String' data type

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-586:
---
Issue Type: Improvement  (was: Bug)

> Create table with 'Char' data type but it workes as 'String' data type
> --
>
> Key: CARBONDATA-586
> URL: https://issues.apache.org/jira/browse/CARBONDATA-586
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load
>Affects Versions: 1.0.0-incubating
> Environment: spark 1.6.2
> spark 2.1
>Reporter: Anurag Srivastava
>Assignee: QiangCai
>Priority: Minor
>
> I am trying to use the Char data type with the latest CarbonData version, and 
> the table is created successfully. When I started loading data into it, I 
> found that it accepts data longer than its declared size. 
> I have checked it with Hive, and there it works fine.
> EX :- 
> 1. *Carbon Data :* 
> 1.1 create table test_carbon (name char(10)) stored by 
> 'org.apache.carbondata.format';
> 1.2 desc test_carbon;
> *Output :* 
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | name      | string     |          |
> +-----------+------------+----------+
> 1.3 LOAD DATA INPATH 'hdfs://localhost:54310/test.csv' into table test_carbon 
> OPTIONS ('FILEHEADER'='name');
> 1.4 select * from test_carbon;
> *Output :* 
> +--------------------+
> | name               |
> +--------------------+
> | Anurag Srivasrata  |
> | Robert             |
> | james james        |
> +--------------------+
> 2. *Hive :* 
> 2.1 create table test_hive (name char(10));
> 2.2 desc test_hive;
> *Output :* 
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | name      | char(10)   | NULL     |
> +-----------+------------+----------+
> 2.3 LOAD DATA INPATH 'hdfs://localhost:54310/test.csv' into table test_hive;
> 2.4 select * from test_hive;
> *Output :* 
> +--------------+
> | name         |
> +--------------+
> | james jame   |
> | Anurag Sri   |
> | Robert       |
> +--------------+
> So, as Hive truncates the string beyond the Char length, CarbonData 
> should work like Hive.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-586) Create table with 'Char' data type but it workes as 'String' data type

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-586.

Resolution: Won't Fix

> Create table with 'Char' data type but it workes as 'String' data type
> --
>
> Key: CARBONDATA-586
> URL: https://issues.apache.org/jira/browse/CARBONDATA-586
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load
>Affects Versions: 1.0.0-incubating
> Environment: spark 1.6.2
> spark 2.1
>Reporter: Anurag Srivastava
>Assignee: QiangCai
>Priority: Minor
>
> I am trying to use the Char data type with the latest CarbonData version, and 
> the table is created successfully. When I started loading data into it, I 
> found that it accepts data longer than its declared size. 
> I have checked it with Hive, and there it works fine.
> EX :- 
> 1. *Carbon Data :* 
> 1.1 create table test_carbon (name char(10)) stored by 
> 'org.apache.carbondata.format';
> 1.2 desc test_carbon;
> *Output :* 
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | name      | string     |          |
> +-----------+------------+----------+
> 1.3 LOAD DATA INPATH 'hdfs://localhost:54310/test.csv' into table test_carbon 
> OPTIONS ('FILEHEADER'='name');
> 1.4 select * from test_carbon;
> *Output :* 
> +--------------------+
> | name               |
> +--------------------+
> | Anurag Srivasrata  |
> | Robert             |
> | james james        |
> +--------------------+
> 2. *Hive :* 
> 2.1 create table test_hive (name char(10));
> 2.2 desc test_hive;
> *Output :* 
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | name      | char(10)   | NULL     |
> +-----------+------------+----------+
> 2.3 LOAD DATA INPATH 'hdfs://localhost:54310/test.csv' into table test_hive;
> 2.4 select * from test_hive;
> *Output :* 
> +--------------+
> | name         |
> +--------------+
> | james jame   |
> | Anurag Sri   |
> | Robert       |
> +--------------+
> So, as Hive truncates the string beyond the Char length, CarbonData 
> should work like Hive.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-422) [Bad Records]Select query failed with "NullPointerException" after data-load with options as MAXCOLUMN and BAD_RECORDS_ACTION

2017-05-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-422.

Resolution: Won't Fix

> [Bad Records]Select query failed with "NullPointerException" after data-load 
> with options as MAXCOLUMN and BAD_RECORDS_ACTION
> -
>
> Key: CARBONDATA-422
> URL: https://issues.apache.org/jira/browse/CARBONDATA-422
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 0.1.1-incubating
> Environment: 3 node Cluster
>Reporter: SOURYAKANTA DWIVEDY
>Priority: Minor
>
> Description: Select query failed with "NullPointerException" after a data 
> load with the MAXCOLUMNS and BAD_RECORDS_ACTION options.
> Steps:
> 1. Create a table.
> 2. Load data into the table with the BAD_RECORDS_ACTION option [table 
> columns - 9, CSV columns - 10, header - 9].
> 3. Do a select * query; it will pass.
> 4. Then load data into the table with the BAD_RECORDS_ACTION and MAXCOLUMNS 
> options [table columns - 9, CSV columns - 10, header - 9, MAXCOLUMNS - 9].
> 5. Do a select * query; it will fail with "NullPointerException".
> Log :- 
> ---
> 0: jdbc:hive2://ha-cluster/default> create table emp3(ID int,Name string,DOJ 
> timestamp,Designation string,Salary double,Dept string,DOB timestamp,Addr 
> string,Gender string) STORED BY 'org.apache.carbondata.format';
> +---------+--+
> | result  |
> +---------+--+
> +---------+--+
> No rows selected (0.589 seconds)
> 0: jdbc:hive2://ha-cluster/default> LOAD DATA inpath 
> 'hdfs://hacluster/chetan/emp11.csv' into table emp3 options('DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='ID,Name,DOJ,Designation,Salary,Dept,DOB,Addr,Gender',
>  'BAD_RECORDS_ACTION'='FORCE');
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (2.415 seconds)
> 0: jdbc:hive2://ha-cluster/default> select * from emp3;
> +-------+-------+-------+--------------+---------+-------+-------+------------+---------+
> | id    | name  | doj   | designation  | salary  | dept  | dob   | addr       | gender  |
> +-------+-------+-------+--------------+---------+-------+-------+------------+---------+
> | NULL  | NULL  | NULL  | NULL         | NULL    | NULL  | NULL  | NULL       | NULL    |
> | 1     | AAA   | NULL  | Trainee      | 1.0     | IT    | NULL  | Pune       | Male    |
> | 2     | BBB   | NULL  | SE           | 3.0     | NW    | NULL  | Bangalore  | Female  |
> | 3     | CCC   | NULL  | SSE          | 4.0     | DATA  | NULL  | Mumbai     | Female  |
> | 4     | DDD   | NULL  | TL           | 6.0     | OPER  | NULL  | Delhi      | Male    |
> | 5     | EEE   | NULL  | STL          | 8.0     | MAIN  | NULL  | Chennai    | Female  |
> | 6     | FFF   | NULL  | Trainee      | 1.0     | IT    | NULL  | Pune       | Male    |
> | 7     | GGG   | NULL  | SE           | 3.0     | NW    | NULL  | Bangalore  | Female  |
> | 8     | HHH   | NULL  | SSE          | 4.0     | DATA  | NULL  | Mumbai     | Female  |
> | 9     | III   | NULL  | TL           | 6.0     | OPER  | NULL  | Delhi      | Male    |
> | 10    | JJJ   | NULL  | STL          | 8.0     | MAIN  | NULL  | Chennai    | Female  |
> | NULL  | Name  | NULL  | Designation  | NULL    | Dept  | NULL  | Addr       | Gender  |
> +-------+-------+-------+--------------+---------+-------+-------+------------+---------+
> 12 rows selected (0.418 seconds)
> 0: jdbc:hive2://ha-cluster/default> LOAD DATA inpath 
> 'hdfs://hacluster/chetan/emp11.csv' into table emp3 options('DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='ID,Name,DOJ,Designation,Salary,Dept,DOB,Addr,Gender','MAXCOLUMNS'='9',
>  'BAD_RECORDS_ACTION'='FORCE');
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows selected (1.424 seconds)
> 0: jdbc:hive2://ha-cluster/default> select * from emp3;
> Error: java.io.IOException: java.lang.NullPointerException (state=,code=0)
> 0: jdbc:hive2://ha-cluster/default>



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1092) alter table add column query should support no_inverted_index

2017-06-12 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1092.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

> alter table add column query should support no_inverted_index
> -
>
> Key: CARBONDATA-1092
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1092
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Minor
> Fix For: 1.2.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1159) Batch sort loading is not proper without synchronization

2017-06-12 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1159.
-
   Resolution: Fixed
Fix Version/s: 1.1.1
   1.2.0

> Batch sort loading is not proper without synchronization
> 
>
> Key: CARBONDATA-1159
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1159
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.2.0, 1.1.1
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1158) Hive integration code optimization

2017-06-21 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1158.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

> Hive integration code optimization
> --
>
> Key: CARBONDATA-1158
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1158
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: hive-integration
>Reporter: Liang Chen
>Assignee: Liang Chen
> Fix For: 1.2.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Hive integration code optimization:
> 1. Remove redundant and unused code.
> 2. Optimize some code
> a) Convert some internal functions from public to private.
> b) Fix some code which may generate errors.
> c) Change code as per java code style.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1211) Implicit Column Projection

2017-06-21 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1211.
-
   Resolution: Fixed
Fix Version/s: 1.1.1
   1.2.0

> Implicit Column Projection
> --
>
> Key: CARBONDATA-1211
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1211
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: sounak chakraborty
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Garbage values come when projection is done on the implicit column, i.e. 
> tupleId. Occurs only when the vector reader is enabled. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1223) Fixing empty file creation in batch sort loading

2017-06-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1223.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

> Fixing empty file creation in batch sort loading
> 
>
> Key: CARBONDATA-1223
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1223
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
> Fix For: 1.2.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1213) Removed rowCountPercentage check and fixed IUD data load issue

2017-06-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1213.
-
Resolution: Fixed

> Removed rowCountPercentage check and fixed IUD data load issue
> --
>
> Key: CARBONDATA-1213
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1213
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Manish Gupta
>Assignee: Manish Gupta
> Fix For: 1.2.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Problems:
> 1. Row count percentage not required with high cardinality threshold check
> 2. IUD returning incorrect results in case of update on high cardinality 
> column
> Analysis:
> 1. In case a column is identified as a high cardinality column, it still does 
> not get converted to a no-dictionary column because of another parameter 
> check called rowCountPercentage, whose default value is 80%. Due to this, 
> even though a high cardinality column is identified, if its cardinality is 
> less than 80% of the total number of rows it will be treated as a dictionary 
> column. This can still lead to executor lost failures due to memory constraints.
> 2. The RLE flag on a column is not being set correctly: due to incorrect code 
> design, whether RLE is applicable to a column is decided by a different part 
> of the code from the one that actually applies RLE to the column. Because of 
> this the footer is filled with incorrect RLE information and the query fails.
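> A sketch of the simplified decision after the fix (names are hypothetical): 
> the no-dictionary choice depends only on the cardinality threshold, with the
> 80% row-count gate removed:
> def isNoDictionaryColumn(cardinality: Int, highCardThreshold: Int): Boolean =
>   cardinality > highCardThreshold   // rowCountPercentage check removed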



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1231) Add datamap interfaces for pruning and indexing

2017-06-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1231:
---

 Summary: Add datamap interfaces for pruning and indexing
 Key: CARBONDATA-1231
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1231
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add datamap interfaces for pruning and indexing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-1230) Datamap framework for Carbondata to leverage indexing

2017-06-27 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-1230:
---

Assignee: Ravindra Pesala

> Datamap framework for Carbondata to leverage indexing
> -
>
> Key: CARBONDATA-1230
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1230
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>
> Datamap should be the single-point interface for indexing and pruning. 
> It could be of two types:
> 1. Coarse grained datamap
> 2. Fine grained datamap
> h3. Coarse grained datamap
> These datamaps contain the information of blocklets, so they can prune down 
> to the blocklet level. They could be loaded on the driver side or the 
> executor side depending on the size of the datamap.
> The default implementation for this type is BlockletDataMap. It contains all 
> necessary information of a blocklet with stats like start key, end key, and 
> max and min values. Using this information, all filter queries would be 
> pruned by the datamap.
> h3. Fine grained datamap
> These datamaps contain information down to the page and row level. They are 
> stored on the executor side and used as part of filtering to speed up queries.
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1235) Add lucene coarse grained datamap

2017-06-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1235:
---

 Summary: Add lucene coarse grained datamap
 Key: CARBONDATA-1235
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1235
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1230) Datamap framework for Carbondata to leverage indexing

2017-06-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1230:
---

 Summary: Datamap framework for Carbondata to leverage indexing
 Key: CARBONDATA-1230
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1230
 Project: CarbonData
  Issue Type: New Feature
Reporter: Ravindra Pesala


Datamap should be the single-point interface for indexing and pruning. 
It could be of two types:
1. Coarse grained datamap
2. Fine grained datamap

h3. Coarse grained datamap
These datamaps contain the information of blocklets, so they can prune down to 
the blocklet level. They could be loaded on the driver side or the executor 
side depending on the size of the datamap.
The default implementation for this type is BlockletDataMap. It contains all 
necessary information of a blocklet with stats like start key, end key, and 
max and min values. Using this information, all filter queries would be pruned 
by the datamap.

h3. Fine grained datamap
These datamaps contain information down to the page and row level. They are 
stored on the executor side and used as part of filtering to speed up queries.
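A rough sketch of what the single-point pruning interface could look like
(trait and type names are illustrative, not a committed API):

trait DataMap {
  def init(indexPath: String): Unit                // load index contents
  def prune(filter: FilterExpr): Seq[Blocklet]     // return only surviving blocklets
}
case class FilterExpr(column: String, value: Any)
case class Blocklet(blockPath: String, blockletId: Int)

A coarse grained implementation would answer prune() from block/blocklet stats
on the driver; a fine grained one would refine down to page/row level on the
executors.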
 




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1233) Add datamap writer for Blocklet datamap.

2017-06-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1233:
---

 Summary: Add datamap writer for Blocklet datamap.
 Key: CARBONDATA-1233
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1233
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add a blocklet datamap writer and plug it into the data writer step.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1232) Implement Blocklet datamap only driver side and prune blocklets using it.

2017-06-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1232:
---

 Summary: Implement Blocklet datamap only driver side and prune 
blocklets using it. 
 Key: CARBONDATA-1232
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1232
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Load all blocklet details to BlockletDataMap and prune the blocklets using it. 
Load these details on the driver side.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1231) Add datamap interfaces for pruning and indexing

2017-06-27 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1231.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.2.0

> Add datamap interfaces for pruning and indexing
> ---
>
> Key: CARBONDATA-1231
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1231
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.2.0
>
>
> Add datamap interfaces for pruning and indexing



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1234) Make the datamap distributable.

2017-06-27 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1234:
---

 Summary: Make the datamap distributable. 
 Key: CARBONDATA-1234
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1234
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Make the datamap distributable so that multiple tasks can be launched to prune 
the datamap on the executor side.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1088) Minimize the driver side block index.

2017-05-24 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1088:
---

 Summary: Minimize the driver side block index.
 Key: CARBONDATA-1088
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1088
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Minimize the driver side block index.
 -> Remove Btree and use array and binary search.
 -> Use unsafe to store data either on offheap/onheap.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-1087) Optimize block and blocklet index.

2017-05-24 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1087:
---

 Summary: Optimize block and blocklet index.
 Key: CARBONDATA-1087
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1087
 Project: CarbonData
  Issue Type: New Feature
Reporter: Ravindra Pesala


1. Minimize the storage of the index and move possible data to unsafe.
2. Remove the btree data structure and use a simple array and binary search.
3. Unify block and blocklet indexes into a single index and load it on the driver side.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-1089) Unify block(driver) and blocklet(executor) side indexes.

2017-05-24 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1089:
---

 Summary: Unify block(driver) and blocklet(executor) side indexes. 
 Key: CARBONDATA-1089
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1089
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Unify block(driver) and blocklet(executor) side indexes. 
 -> Remove the btree and use an array data structure (see the sketch below).
 -> Use unsafe to store data.
 -> Unify both block and blocklet indexes and load them on the driver side.
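A minimal sketch of the array-plus-binary-search replacement (the key layout
is illustrative; the real keys are byte-comparable start keys):

import java.util.Arrays

val startKeys: Array[Long] = Array(0L, 1000L, 5000L, 9000L)  // sorted block start keys

def findBlock(key: Long): Int = {
  val pos = Arrays.binarySearch(startKeys, key)
  // exact hit -> that block; miss -> the block whose start key precedes the key
  if (pos >= 0) pos else math.max(0, -(pos + 1) - 1)
}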



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1091) Implicit column tupleId is not returning results if VectorReader is enabled.

2017-05-25 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1091.
-
   Resolution: Fixed
Fix Version/s: 1.1.1
   1.2.0

> Implicit column tupleId is not returning results if VectorReader is enabled.
> 
>
> Key: CARBONDATA-1091
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1091
> Project: CarbonData
>  Issue Type: Bug
> Environment: Spark-2.1
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If the user enables the vector reader while querying, the implicit column 
> tupleId is not getting filled and an exception is thrown.
> eg., carbon.enable.vector.reader = true in Carbon.Properties
> select getTupleId() as tupleId from carbonTable
> This needs to be corrected.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange

2017-05-22 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019436#comment-16019436
 ] 

Ravindra Pesala commented on CARBONDATA-1076:
-

Can you add the CSV file to reproduce this issue?

> Join Issue caused by dictionary and shuffle exchange
> 
>
> Key: CARBONDATA-1076
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1076
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.1.1-incubating, 1.1.0
> Environment: Carbon + spark 2.1
>Reporter: chenerlu
>Assignee: Ravindra Pesala
>
> We can reproduce this issue with the following steps:
> Step1: create a carbon table
>  
> carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 
> int) STORED by 'carbondata' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')")
>  
> Step2: load data
> carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE 
> carbon_table")
>  
> You can get the carbon_table file in the attachment.
>  
> Step3: do the query
>  
> [expected] Hive table and parquet table get same result as below and it 
> should be correct.
> |col1|col1|col3|
> |   1|null|null|
> |null|   4|   1|
> |   4|null|null|
> |null|   7|   1|
> |   7|null|null|
> |null|   1|   1|
>  
> [actually] carbon will get null because of a wrong match.
> |col1|col1|col3|
> |   1|   1|   1|
> |   4|   4|   1|
> |   7|   7|   1|
> Root cause analysis:
>  
> It is because this query has two subqueries: one subquery does the decode 
> after the exchange and the other does the decode before the exchange, and 
> this may lead to a wrong match when executing the full join.
>  
> My idea: can we move the decode before the exchange? I am not very familiar 
> with Carbon query execution, so any ideas about this?
> Plan as follows:
>  
> == Physical Plan ==
> SortMergeJoin [col1#3445], [col1#3460], FullOuter
> :- Sort [col1#3445 ASC NULLS FIRST], false, 0
> :  +- Exchange hashpartitioning(col1#3445, 200)
> : +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> 
> col1#3445, col2#3446 -> col2#3446, col3#3447 -> 
> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table 
> name :carbon_table, Schema 
> :Some(StructType(StructField(col1,IntegerType,true), 
> StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), 
> CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, 
> col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name 
> :tempdev, Table name :carbon_table, Schema 
> :Some(StructType(StructField(col1,IntegerType,true), 
> StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], 
> IncludeProfile(ArrayBuffer(col1#3445)), CarbonAliasDecoderRelation(), 
> org.apache.spark.sql.CarbonSession@69e87cbe
> :+- HashAggregate(keys=[col1#3445, col2#3446], functions=[], 
> output=[col1#3445])
> :   +- Exchange hashpartitioning(col1#3445, col2#3446, 200)
> :  +- HashAggregate(keys=[col1#3445, col2#3446], functions=[], 
> output=[col1#3445, col2#3446])
> : +- Scan CarbonDatasourceHadoopRelation [ Database name 
> :tempdev, Table name :carbon_table, Schema 
> :Some(StructType(StructField(col1,IntegerType,true), 
> StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ] 
> tempdev.carbon_table[col1#3445,col2#3446] 
> +- Sort [col1#3460 ASC NULLS FIRST], false, 0
>+- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> 
> col1#3445, col2#3446 -> col2#3446, col3#3447 -> 
> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table 
> name :carbon_table, Schema 
> :Some(StructType(StructField(col1,IntegerType,true), 
> StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), 
> CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, 
> col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name 
> :tempdev, Table name :carbon_table, Schema 
> :Some(StructType(StructField(col1,IntegerType,true), 
> StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], 
> IncludeProfile(ArrayBuffer(col1#3460)), CarbonAliasDecoderRelation(), 
> org.apache.spark.sql.CarbonSession@69e87cbe
>   +- HashAggregate(keys=[col1#3460], functions=[count(col2#3461)], 
> output=[col1#3460, col3#3436L])
>  +- Exchange hashpartitioning(col1#3460, 200)
> +- HashAggregate(keys=[col1#3460], 
> functions=[partial_count(col2#3461)], output=[col1#3460, count#3472L])
>+- CarbonDictionaryDecoder 
> [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, 
> col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name 
> :tempdev, Table name :carbon_table, Schema 
> 
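The reporter's Step 3 query is not quoted in the report; the following is only 
an inferred reconstruction, read off the physical plan above (a full outer join 
of a group-by on (col1, col2) with a count aggregate on col1), not the exact 
statement:

carbon.sql("""
  SELECT a.col1, b.col1, b.col3
  FROM (SELECT col1 FROM carbon_table GROUP BY col1, col2) a
  FULL OUTER JOIN
       (SELECT col1, count(col2) AS col3 FROM carbon_table GROUP BY col1) b
  ON a.col1 = b.col1
""").show()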

[jira] [Resolved] (CARBONDATA-1074) Add TablePage for data load process

2017-05-27 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1074.
-
Resolution: Fixed
  Assignee: Jacky Li

> Add TablePage for data load process
> ---
>
> Key: CARBONDATA-1074
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1074
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.2.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Add TablePage in preparation for the data load refactoring.
> Unify the different steps to use ConvertedRow instead of Object[]; the steps include:
> 1. normal sort table
> 2. no sort table
> 3. compaction



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CARBONDATA-946) TUPLEID implicit column support in spark 2.1

2017-05-19 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-946:
--

Assignee: Naresh P R

> TUPLEID implicit column support in spark 2.1
> 
>
> Key: CARBONDATA-946
> URL: https://issues.apache.org/jira/browse/CARBONDATA-946
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Venkata Ramana G
>Assignee: Naresh P R
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The implicit column tupleid will uniquely identify a tuple (row) in the data, 
> so that update and delete can be implemented using tupleid.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-946) TUPLEID implicit column support in spark 2.1

2017-05-19 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-946.

   Resolution: Fixed
Fix Version/s: 1.1.1
   1.2.0

> TUPLEID implicit column support in spark 2.1
> 
>
> Key: CARBONDATA-946
> URL: https://issues.apache.org/jira/browse/CARBONDATA-946
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Venkata Ramana G
>Assignee: Naresh P R
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The implicit column tupleid will uniquely identify a tuple (row) in the data, 
> so that update and delete can be implemented using tupleid.
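With the fix in place, the tupleid-backed IUD statements can be exercised from 
a CarbonSession; a short sketch, assuming a session `carbon` and a hypothetical 
table `carbonTable` with columns id and name:

// CarbonData UPDATE syntax takes a parenthesized column list and value list.
carbon.sql("UPDATE carbonTable SET (name) = ('newName') WHERE id = '1'")

// DELETE locates the affected rows internally via their tupleid.
carbon.sql("DELETE FROM carbonTable WHERE id = '1'")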



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-989) decompressing error while loading 'gz' and 'bz2' data into table

2017-05-19 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-989.

   Resolution: Fixed
 Assignee: Ran Mingxuan
Fix Version/s: 1.1.1
   1.2.0

> decompressing error while loading 'gz' and 'bz2' data into table
> -
>
> Key: CARBONDATA-989
> URL: https://issues.apache.org/jira/browse/CARBONDATA-989
> Project: CarbonData
>  Issue Type: Bug
> Environment: spark 2.1.0
> hadoop 2.6.0 - CDH 5.5.2
>Reporter: Ran Mingxuan
>Assignee: Ran Mingxuan
> Fix For: 1.2.0, 1.1.1
>
>   Original Estimate: 24h
>  Time Spent: 4h 40m
>  Remaining Estimate: 19h 20m
>
> Run the following commands in the spark shell:
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val carbon = 
> SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://nsha/user/ranmx/test/carbon")
> carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, 
> city string, age Int) STORED BY 'carbondata'")
> carbon.sql("LOAD DATA inpath '/ranmx/test/sh.csv.bz2' INTO TABLE test_table")
> The load fails with:
> 17/04/26 11:11:26 ERROR LoadTable: main
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.io.compress.bzip2.Bzip2Factory.isNativeBzip2Loaded(Bzip2Factory.java:54)
>   at 
> org.apache.hadoop.io.compress.bzip2.Bzip2Factory.getBzip2DecompressorType(Bzip2Factory.java:120)
>   at 
> org.apache.hadoop.io.compress.BZip2Codec.getDecompressorType(BZip2Codec.java:242)
>   at 
> org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
>   at 
> org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157)
>   at 
> org.apache.hadoop.io.compress.BZip2Codec.createInputStream(BZip2Codec.java:157)
>   at 
> org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:139)
>   at 
> org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:104)
>   at 
> org.apache.carbondata.core.util.CarbonUtil.readHeader(CarbonUtil.java:1273)
>   at 
> org.apache.carbondata.spark.util.CommonUtil$.getCsvHeaderColumns(CommonUtil.scala:319)
>   at 
> org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:474)
>   ...
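The NPE is raised inside Bzip2Factory.isNativeBzip2Loaded, which reads from a 
Hadoop Configuration; the trace suggests the codec was used without one being 
set. A guarded sketch of that pattern (an assumption about the cause, not the 
actual patch):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.io.compress.BZip2Codec

val conf = new Configuration()
val codec = new BZip2Codec()
// BZip2Codec is Configurable; with a null Configuration,
// Bzip2Factory.isNativeBzip2Loaded throws the NPE seen above.
codec.setConf(conf)
val fs = FileSystem.get(conf)
val in = codec.createInputStream(fs.open(new Path("/ranmx/test/sh.csv.bz2")))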



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-1108) Support delete operation in vector reader of Spark 2.1

2017-05-30 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1108:
---

 Summary: Support delete operation in vector reader of Spark 2.1
 Key: CARBONDATA-1108
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1108
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Support delete operation in vector reader of Spark 2.1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-1109) Page lost in load process when the last page is not consumed at the end

2017-06-02 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1109.
-
   Resolution: Fixed
 Assignee: Yadong Qi
Fix Version/s: 1.1.1
   1.2.0

> Page lost in load process when the last page is not consumed at the end
> --
>
> Key: CARBONDATA-1109
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1109
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.2.0
>Reporter: Yadong Qi
>Assignee: Yadong Qi
> Fix For: 1.2.0, 1.1.1
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> First, we use a producer-consumer model in the write step: there are n 
> producers (default 2, configurable) and one consumer. The task that generates 
> the last page (fewer than 32000 rows) is submitted to the thread pool at the 
> end, but because n tasks run concurrently it is not guaranteed to finish and 
> be added to BlockletDataHolder last.
> Second, `writeDataToFile` is invoked in 2 ways: one, when the size of 
> `DataWriterHolder` reaches the blocklet size; and two, when the page is the 
> last page.
> So if the last page is not consumed at the end, we lose the pages that are 
> consumed after it. A small sketch of the ordering hazard follows.
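A self-contained sketch of the ordering hazard described above (generic Scala, 
not the actual CarbonData code): tasks submitted to a pool of n threads need 
not complete in submission order, so the last-page task can finish before 
earlier pages are consumed.

import java.util.concurrent.Executors

val producers = Executors.newFixedThreadPool(2) // n producers, as in the write step
(1 to 5).foreach { pageId =>
  producers.submit(new Runnable {
    override def run(): Unit = {
      // Simulated encoding work: completion order is nondeterministic,
      // so page 5 (the "last page") may be handed over before page 4.
      Thread.sleep(scala.util.Random.nextInt(50).toLong)
      println(s"page $pageId handed to consumer")
    }
  })
}
producers.shutdown()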



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-1119) Database drop cascade is not working in Spark 2.1 and alter table not working in vector reader

2017-06-03 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-1119:
---

 Summary: Database drop cascade is not working in Spark 2.1 and 
alter table not working in vector reader
 Key: CARBONDATA-1119
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1119
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Database drop cascade is not working in Spark 2.1, and alter table is not 
working when the vector reader is enabled.
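The failing statements, as a hedged spark-shell sketch (the database and table 
names here are placeholders):

carbon.sql("DROP DATABASE IF EXISTS testdb CASCADE")  // cascade drop fails on Spark 2.1
carbon.sql("ALTER TABLE testdb.t1 RENAME TO t2")      // fails when carbon.enable.vector.reader = true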



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

