[jira] [Closed] (CARBONDATA-913) dead lock problem in unsafe batch parallel read merge sort

2017-04-13 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan closed CARBONDATA-913.
---

> dead lock problem in unsafe batch parallel read merge sort
> --
>
> Key: CARBONDATA-913
> URL: https://issues.apache.org/jira/browse/CARBONDATA-913
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Critical
> Attachments: unsafeBatchMergesort_threading issue.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-913) dead lock problem in unsafe batch parallel read merge sort

2017-04-13 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan resolved CARBONDATA-913.
-
Resolution: Fixed

Fixed with https://github.com/apache/incubator-carbondata/pull/783

> dead lock problem in unsafe batch parallel read merge sort
> --
>
> Key: CARBONDATA-913
> URL: https://issues.apache.org/jira/browse/CARBONDATA-913
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Critical
> Attachments: unsafeBatchMergesort_threading issue.txt
>
>






[jira] [Updated] (CARBONDATA-913) dead lock problem in unsafe batch parallel read merge sort

2017-04-12 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-913:

Attachment: unsafeBatchMergesort_threading issue.txt

> dead lock problem in unsafe batch parallel read merge sort
> --
>
> Key: CARBONDATA-913
> URL: https://issues.apache.org/jira/browse/CARBONDATA-913
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Critical
> Attachments: unsafeBatchMergesort_threading issue.txt
>
>






[jira] [Created] (CARBONDATA-913) dead lock problem in unsafe batch parallel read merge sort

2017-04-12 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-913:
---

 Summary: dead lock problem in unsafe batch parallel read merge sort
 Key: CARBONDATA-913
 URL: https://issues.apache.org/jira/browse/CARBONDATA-913
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan
Priority: Critical








[jira] [Updated] (CARBONDATA-903) data load is not failing even though bad records exists in the data in case of unsafe sort or batch sort

2017-04-11 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-903:

Summary: data load is not failing even though bad records exists in the 
data in case of unsafe sort or batch sort  (was: data load is not failing even 
though bad records exists in the data)

> data load is not failing even though bad records exists in the data in case 
> of unsafe sort or batch sort
> 
>
> Key: CARBONDATA-903
> URL: https://issues.apache.org/jira/browse/CARBONDATA-903
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Critical
>






[jira] [Commented] (CARBONDATA-903) data load is not failing even though bad records exists in the data

2017-04-11 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964327#comment-15964327
 ] 

Mohammad Shahid Khan commented on CARBONDATA-903:
-

When carbon.load.use.batch.sort or enable.unsafe.sort is enabled, or a table 
bucket is configured, the data load does not fail even though bad records 
exist in the data being loaded.
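The behavior described above can be sketched language-agnostically. The following is an illustrative Python sketch only (hypothetical names, not CarbonData's actual API): the point is that every sort flow (unsafe, batch, bucketed) must consult a shared bad-record handler, so that a FAIL action actually aborts the load instead of being silently ignored.

```python
# Illustrative sketch (hypothetical names, not CarbonData's actual classes):
# a bad-record handler that every sort path must consult, so that
# BAD_RECORDS_ACTION=FAIL aborts the load rather than being bypassed.

class BadRecordHandler:
    def __init__(self, action="FAIL"):
        self.action = action
        self.count = 0

    def report(self, row, reason):
        self.count += 1
        if self.action == "FAIL":
            # Propagate instead of swallowing, so the load job fails fast.
            raise ValueError(f"bad record {row!r}: {reason}")

def load_rows(rows, handler):
    loaded = []
    for row in rows:
        if row.get("id") is None:  # hypothetical validity check
            handler.report(row, "null key column")
            continue  # reached only when the action is not FAIL
        loaded.append(row)
    return loaded
```

With a non-FAIL action the handler only counts the bad record; with FAIL, the raised error propagates up and the whole load fails, which is the behavior the issue says is missing in the unsafe/batch-sort paths.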

> data load is not failing even though bad records exists in the data
> ---
>
> Key: CARBONDATA-903
> URL: https://issues.apache.org/jira/browse/CARBONDATA-903
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Critical
>






[jira] [Assigned] (CARBONDATA-903) data load is not failing even though bad records exists in the data

2017-04-11 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan reassigned CARBONDATA-903:
---

Assignee: Mohammad Shahid Khan

> data load is not failing even though bad records exists in the data
> ---
>
> Key: CARBONDATA-903
> URL: https://issues.apache.org/jira/browse/CARBONDATA-903
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Critical
>






[jira] [Created] (CARBONDATA-903) data load is not failing even though bad records exists in the data

2017-04-11 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-903:
---

 Summary: data load is not failing even though bad records exists 
in the data
 Key: CARBONDATA-903
 URL: https://issues.apache.org/jira/browse/CARBONDATA-903
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Priority: Critical








[jira] [Assigned] (CARBONDATA-422) [Bad Records]Select query failed with "NullPointerException" after data-load with options as MAXCOLUMN and BAD_RECORDS_ACTION

2017-04-10 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan reassigned CARBONDATA-422:
---

Assignee: (was: Mohammad Shahid Khan)

> [Bad Records]Select query failed with "NullPointerException" after data-load 
> with options as MAXCOLUMN and BAD_RECORDS_ACTION
> -
>
> Key: CARBONDATA-422
> URL: https://issues.apache.org/jira/browse/CARBONDATA-422
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 0.1.1-incubating
> Environment: 3 node Cluster
>Reporter: SOURYAKANTA DWIVEDY
>Priority: Minor
>
> Description: Select query failed with "NullPointerException" after data load 
> with the MAXCOLUMN and BAD_RECORDS_ACTION options
> Steps:
> 1. Create a table
> 2. Load data into the table with the BAD_RECORDS_ACTION option [Create Table 
> -- columns: 9, CSV columns: 10, Header: 9]
> 3. Run a select * query; it passes
> 4. Load data into the table with the BAD_RECORDS_ACTION and MAXCOLUMN options 
> [Create Table -- columns: 9, CSV columns: 10, Header: 9, MAXCOLUMNS: 9]
> 5. Run a select * query; it fails with "NullPointerException"
> Log :- 
> ---
> 0: jdbc:hive2://ha-cluster/default> create table emp3(ID int,Name string,DOJ 
> timestamp,Designation string,Salary double,Dept string,DOB timestamp,Addr 
> string,Gender string) STORED BY 'org.apache.carbondata.format';
> +-+--+
> | result |
> +-+--+
> +-+--+
> No rows selected (0.589 seconds)
> 0: jdbc:hive2://ha-cluster/default> LOAD DATA inpath 
> 'hdfs://hacluster/chetan/emp11.csv' into table emp3 options('DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='ID,Name,DOJ,Designation,Salary,Dept,DOB,Addr,Gender',
>  'BAD_RECORDS_ACTION'='FORCE');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (2.415 seconds)
> 0: jdbc:hive2://ha-cluster/default> select * from emp3;
> +---+---+---+--+--+---+---++-+--+
> | id | name | doj | designation | salary | dept | dob | addr | gender |
> +---+---+---+--+--+---+---++-+--+
> | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
> | 1 | AAA | NULL | Trainee | 1.0 | IT | NULL | Pune | Male |
> | 2 | BBB | NULL | SE | 3.0 | NW | NULL | Bangalore | Female |
> | 3 | CCC | NULL | SSE | 4.0 | DATA | NULL | Mumbai | Female |
> | 4 | DDD | NULL | TL | 6.0 | OPER | NULL | Delhi | Male |
> | 5 | EEE | NULL | STL | 8.0 | MAIN | NULL | Chennai | Female |
> | 6 | FFF | NULL | Trainee | 1.0 | IT | NULL | Pune | Male |
> | 7 | GGG | NULL | SE | 3.0 | NW | NULL | Bangalore | Female |
> | 8 | HHH | NULL | SSE | 4.0 | DATA | NULL | Mumbai | Female |
> | 9 | III | NULL | TL | 6.0 | OPER | NULL | Delhi | Male |
> | 10 | JJJ | NULL | STL | 8.0 | MAIN | NULL | Chennai | Female |
> | NULL | Name | NULL | Designation | NULL | Dept | NULL | Addr | Gender |
> +---+---+---+--+--+---+---++-+--+
> 12 rows selected (0.418 seconds)
> 0: jdbc:hive2://ha-cluster/default> LOAD DATA inpath 
> 'hdfs://hacluster/chetan/emp11.csv' into table emp3 options('DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='ID,Name,DOJ,Designation,Salary,Dept,DOB,Addr,Gender','MAXCOLUMNS'='9',
>  'BAD_RECORDS_ACTION'='FORCE');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.424 seconds)
> 0: jdbc:hive2://ha-cluster/default> select * from emp3;
> Error: java.io.IOException: java.lang.NullPointerException (state=,code=0)
> 0: jdbc:hive2://ha-cluster/default>





[jira] [Assigned] (CARBONDATA-664) Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.

2017-04-10 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan reassigned CARBONDATA-664:
---

Assignee: (was: Mohammad Shahid Khan)

> Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.
> 
>
> Key: CARBONDATA-664
> URL: https://issues.apache.org/jira/browse/CARBONDATA-664
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
>Reporter: Harsh Sharma
>  Labels: bug
> Attachments: 100_olap_C20.csv, Driver Logs, Executor Logs
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The scenario below works on Spark 2.1, but not on Spark 1.6
> create table VMALL_DICTIONARY_INCLUDE (imei string,deviceInformationId 
> int,MAC string,deviceColor string,device_backColor string,modelId 
> string,marketName string,AMSize string,ROMSize string,CUPAudit 
> string,CPIClocked string,series string,productionDate timestamp,bomCode 
> string,internalModels string, deliveryTime string, channelsId string, 
> channelsName string , deliveryAreaId string, deliveryCountry string, 
> deliveryProvince string, deliveryCity string,deliveryDistrict string, 
> deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, 
> ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity 
> string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, 
> Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion 
> string, Active_BacVerNumber string, Active_BacFlashVer string, 
> Active_webUIVersion string, Active_webUITypeCarrVer 
> string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, 
> Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, 
> Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, 
> Latest_country string, Latest_province string, Latest_city string, 
> Latest_district string, Latest_street string, Latest_releaseId string, 
> Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber 
> string, Latest_BacFlashVer string, Latest_webUIVersion string, 
> Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, 
> Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, 
> Latest_operatorId string, gamePointDescription string,gamePointId 
> double,contractNumber BigInt) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='imei,deviceInformationId,productionDate,gamePointId,Latest_DAY,contractNumber');
> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/100_olap_C20.csv' INTO 
> table VMALL_DICTIONARY_INCLUDE 
> options('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription');
> select sum(deviceinformationId) from VMALL_DICTIONARY_INCLUDE where 
> deviceColor ='5Device Color' and modelId != '109' or Latest_DAY > 
> '1234567890123540.00' and contractNumber == '92233720368547800' or 
> Active_operaSysVersion like 'Operating System Version' and gamePointId <=> 
> '8.1366141918611E39' and deviceInformationId < '100' and productionDate 
> not like '2016-07-01' and imei is null and Latest_HOUR is not null and 
> channelsId <= '7' and Latest_releaseId >= '1' and Latest_MONTH between 6 and 
> 8 and Latest_YEAR not between 2016 and 2017 and Latest_HOUR RLIKE '12' and 
> gamePointDescription REGEXP 'Site' and imei in 
> ('1AA1','1AA100','1AA10','1AA1000','1AA1','1AA10','1AA100','1AA11','1AA12','1AA14','','NULL')
>  and Active_BacVerNumber not in ('Background version number1','','null');
> This scenario results in the 

[jira] [Assigned] (CARBONDATA-890) For Spark 2.1 LRU cache size at driver is getting configured with the executor lru cache size.

2017-04-09 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan reassigned CARBONDATA-890:
---

Assignee: Mohammad Shahid Khan

> For Spark 2.1 LRU cache size at driver is getting configured with the 
> executor lru cache size.
> --
>
> Key: CARBONDATA-890
> URL: https://issues.apache.org/jira/browse/CARBONDATA-890
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>






[jira] [Created] (CARBONDATA-890) For Spark 2.1 LRU cache size at driver is getting configured with the executor lru cache size.

2017-04-09 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-890:
---

 Summary: For Spark 2.1 LRU cache size at driver is getting 
configured with the executor lru cache size.
 Key: CARBONDATA-890
 URL: https://issues.apache.org/jira/browse/CARBONDATA-890
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan








[jira] [Updated] (CARBONDATA-881) Load status is successful even though system is fail to write status into tablestatus file

2017-04-06 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-881:

Summary: Load status is successful even though system is fail to write 
status into tablestatus file  (was: Load status is successful even though 
system is fail to writte status into tablestatus file)

> Load status is successful even though system is fail to write status into 
> tablestatus file
> --
>
> Key: CARBONDATA-881
> URL: https://issues.apache.org/jira/browse/CARBONDATA-881
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>
> org.apache.carbondata.core.statusmanager.SegmentStatusManager.SegmentStatusManager
> is swallowing the IOException; as a result, even when carbon fails to write 
> the load status into the tablestatus file, the final load is reported as 
> successful.
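The swallowed-exception pattern described above can be sketched generically. This is a hedged Python illustration with hypothetical function names, not the actual SegmentStatusManager code: when the status-file write catches the I/O error and does nothing, the caller reports success even though tablestatus was never updated; surfacing the failure fixes that.

```python
# Sketch of the failure mode and the fix (hypothetical names, not the actual
# SegmentStatusManager code): a catch block that eats the I/O error makes the
# load look successful even when the tablestatus write failed.

def write_status_swallowing(write_fn):
    try:
        write_fn()
    except IOError:
        pass  # bug: error eaten, caller never learns the write failed
    return "SUCCESS"

def write_status_propagating(write_fn):
    try:
        write_fn()
    except IOError:
        return "FAILURE"  # fix: surface the failed status write
    return "SUCCESS"

def failing_write():
    raise IOError("cannot write tablestatus")
```

The first variant mirrors the reported bug (load "succeeds" despite the failed write); the second mirrors the intended behavior.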





[jira] [Created] (CARBONDATA-881) Load status is successful even though system is fail to writte status into tablestatus file

2017-04-06 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-881:
---

 Summary: Load status is successful even though system is fail to 
writte status into tablestatus file
 Key: CARBONDATA-881
 URL: https://issues.apache.org/jira/browse/CARBONDATA-881
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan


org.apache.carbondata.core.statusmanager.SegmentStatusManager.SegmentStatusManager
is swallowing the IOException; as a result, even when carbon fails to write 
the load status into the tablestatus file, the final load is reported as 
successful.





[jira] [Created] (CARBONDATA-814) bad record log file writing is not correct

2017-03-23 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-814:
---

 Summary: bad record log file writing is not correct
 Key: CARBONDATA-814
 URL: https://issues.apache.org/jira/browse/CARBONDATA-814
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan








[jira] [Created] (CARBONDATA-794) Numeric dimension column value should be validated for the bad record

2017-03-19 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-794:
---

 Summary: Numeric dimension column value should be validated for 
the bad record
 Key: CARBONDATA-794
 URL: https://issues.apache.org/jira/browse/CARBONDATA-794
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan









[jira] [Created] (CARBONDATA-784) Make configurable empty data to be treated as bad record or not And Expose BAD_RECORDS_ACTION default value to be configurable from out side.

2017-03-16 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-784:
---

 Summary: Make configurable empty data to be treated as bad record 
or not And Expose BAD_RECORDS_ACTION default value to be configurable from out 
side.
 Key: CARBONDATA-784
 URL: https://issues.apache.org/jira/browse/CARBONDATA-784
 Project: CarbonData
  Issue Type: Improvement
  Components: data-load
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan


*Make it configurable whether empty data is treated as a bad record, and expose 
the BAD_RECORDS_ACTION default value as configurable from outside.*

1. Currently carbon treats empty data as a bad record.
A property should be provided in the load options so that
the user can say whether or not empty data should be considered a bad record:
*IS_EMPTY_DATA_BAD_RECORD = false/true*
*defaults to false*
For example, if IS_EMPTY_DATA_BAD_RECORD is false, the data below will not be 
treated as bad records, and vice versa:
"","",""
,,

*2. Expose the BAD_RECORDS_ACTION default value as configurable from outside.*
A property carbon.bad.records.action will be added to the carbon properties 
so that the user can configure the default value of BAD_RECORDS_ACTION.
*defaults to FAIL*
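The two proposals can be sketched as a small lookup. This is a hedged Python illustration; the property names come from the issue text, but the exact precedence order is an assumption, not CarbonData's actual implementation:

```python
# Hedged sketch of the proposed configuration (names from the issue text; the
# lookup precedence is an assumption, not CarbonData's actual implementation).

HARD_DEFAULT_ACTION = "FAIL"

def resolve_bad_records_action(load_options, carbon_properties):
    # A load-level option wins over the system-wide carbon property,
    # which in turn wins over the hard-coded default of FAIL.
    return (load_options.get("BAD_RECORDS_ACTION")
            or carbon_properties.get("carbon.bad.records.action")
            or HARD_DEFAULT_ACTION)

def is_empty_value_bad(value, load_options):
    # Empty data counts as a bad record only when the user opts in.
    empty_is_bad = load_options.get("IS_EMPTY_DATA_BAD_RECORD", "false") == "true"
    return value == "" and empty_is_bad
```

With no configuration at all, the action resolves to FAIL and empty values pass through, matching the defaults stated in the issue.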






[jira] [Updated] (CARBONDATA-744) The property "spark.carbon.custom.distribution" should be changed to carbon.custom.block.distribution and should be part of CarbonProperties

2017-03-12 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-744:

Summary: The property "spark.carbon.custom.distribution" should be changed to 
carbon.custom.block.distribution and should be part of CarbonProperties  (was: 
The property "spark.carbon.custom.distribution" should be part of 
CarbonProperties)

> The property "spark.carbon.custom.distribution" should be changed to 
> carbon.custom.block.distribution and should be part of CarbonProperties
> --
>
> Key: CARBONDATA-744
> URL: https://issues.apache.org/jira/browse/CARBONDATA-744
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Trivial
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The property "spark.carbon.custom.distribution" should be part of 
> CarbonProperties.
> Per the naming style adopted in carbon, the key should be named 
> carbon.custom.distribution.
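A rename like this typically ships with a fallback so old deployments keep working. The following is an illustrative migration shim only (a hypothetical helper, not CarbonData's actual code; the new key name is taken from the updated issue summary):

```python
# Illustrative migration shim (hypothetical helper, not CarbonData's actual
# code): prefer the carbon-style key and fall back to the legacy Spark conf
# key so existing deployments keep working during the rename.

LEGACY_KEY = "spark.carbon.custom.distribution"
CARBON_KEY = "carbon.custom.block.distribution"

def custom_block_distribution(carbon_props, spark_conf, default="false"):
    value = carbon_props.get(CARBON_KEY)
    if value is None:
        value = spark_conf.get(LEGACY_KEY)  # legacy fallback
    return default if value is None else value
```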





[jira] [Commented] (CARBONDATA-422) [Bad Records]Select query failed with "NullPointerException" after data-load with options as MAXCOLUMN and BAD_RECORDS_ACTION

2017-03-03 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894049#comment-15894049
 ] 

Mohammad Shahid Khan commented on CARBONDATA-422:
-

This issue is not reproducible.

> [Bad Records]Select query failed with "NullPointerException" after data-load 
> with options as MAXCOLUMN and BAD_RECORDS_ACTION
> -
>
> Key: CARBONDATA-422
> URL: https://issues.apache.org/jira/browse/CARBONDATA-422
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 0.1.1-incubating
> Environment: 3 node Cluster
>Reporter: SOURYAKANTA DWIVEDY
>Assignee: Mohammad Shahid Khan
>Priority: Minor
>
> Description: Select query failed with "NullPointerException" after data load 
> with the MAXCOLUMN and BAD_RECORDS_ACTION options
> Steps:
> 1. Create a table
> 2. Load data into the table with the BAD_RECORDS_ACTION option [Create Table 
> -- columns: 9, CSV columns: 10, Header: 9]
> 3. Run a select * query; it passes
> 4. Load data into the table with the BAD_RECORDS_ACTION and MAXCOLUMN options 
> [Create Table -- columns: 9, CSV columns: 10, Header: 9, MAXCOLUMNS: 9]
> 5. Run a select * query; it fails with "NullPointerException"
> Log :- 
> ---
> 0: jdbc:hive2://ha-cluster/default> create table emp3(ID int,Name string,DOJ 
> timestamp,Designation string,Salary double,Dept string,DOB timestamp,Addr 
> string,Gender string) STORED BY 'org.apache.carbondata.format';
> +-+--+
> | result |
> +-+--+
> +-+--+
> No rows selected (0.589 seconds)
> 0: jdbc:hive2://ha-cluster/default> LOAD DATA inpath 
> 'hdfs://hacluster/chetan/emp11.csv' into table emp3 options('DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='ID,Name,DOJ,Designation,Salary,Dept,DOB,Addr,Gender',
>  'BAD_RECORDS_ACTION'='FORCE');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (2.415 seconds)
> 0: jdbc:hive2://ha-cluster/default> select * from emp3;
> +---+---+---+--+--+---+---++-+--+
> | id | name | doj | designation | salary | dept | dob | addr | gender |
> +---+---+---+--+--+---+---++-+--+
> | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
> | 1 | AAA | NULL | Trainee | 1.0 | IT | NULL | Pune | Male |
> | 2 | BBB | NULL | SE | 3.0 | NW | NULL | Bangalore | Female |
> | 3 | CCC | NULL | SSE | 4.0 | DATA | NULL | Mumbai | Female |
> | 4 | DDD | NULL | TL | 6.0 | OPER | NULL | Delhi | Male |
> | 5 | EEE | NULL | STL | 8.0 | MAIN | NULL | Chennai | Female |
> | 6 | FFF | NULL | Trainee | 1.0 | IT | NULL | Pune | Male |
> | 7 | GGG | NULL | SE | 3.0 | NW | NULL | Bangalore | Female |
> | 8 | HHH | NULL | SSE | 4.0 | DATA | NULL | Mumbai | Female |
> | 9 | III | NULL | TL | 6.0 | OPER | NULL | Delhi | Male |
> | 10 | JJJ | NULL | STL | 8.0 | MAIN | NULL | Chennai | Female |
> | NULL | Name | NULL | Designation | NULL | Dept | NULL | Addr | Gender |
> +---+---+---+--+--+---+---++-+--+
> 12 rows selected (0.418 seconds)
> 0: jdbc:hive2://ha-cluster/default> LOAD DATA inpath 
> 'hdfs://hacluster/chetan/emp11.csv' into table emp3 options('DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='ID,Name,DOJ,Designation,Salary,Dept,DOB,Addr,Gender','MAXCOLUMNS'='9',
>  'BAD_RECORDS_ACTION'='FORCE');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.424 seconds)
> 0: jdbc:hive2://ha-cluster/default> select * from emp3;
> Error: java.io.IOException: java.lang.NullPointerException (state=,code=0)
> 0: jdbc:hive2://ha-cluster/default>





[jira] [Comment Edited] (CARBONDATA-664) Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.

2017-03-03 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894018#comment-15894018
 ] 

Mohammad Shahid Khan edited comment on CARBONDATA-664 at 3/3/17 9:49 AM:
-

Fixed with PR https://github.com/apache/incubator-carbondata/pull/584


was (Author: mohdshahidkhan):
closed with PR https://github.com/apache/incubator-carbondata/pull/584

> Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.
> 
>
> Key: CARBONDATA-664
> URL: https://issues.apache.org/jira/browse/CARBONDATA-664
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
>Reporter: Harsh Sharma
>Assignee: Mohammad Shahid Khan
>  Labels: bug
> Attachments: 100_olap_C20.csv, Driver Logs, Executor Logs
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The scenario below works on Spark 2.1, but not on Spark 1.6
> create table VMALL_DICTIONARY_INCLUDE (imei string,deviceInformationId 
> int,MAC string,deviceColor string,device_backColor string,modelId 
> string,marketName string,AMSize string,ROMSize string,CUPAudit 
> string,CPIClocked string,series string,productionDate timestamp,bomCode 
> string,internalModels string, deliveryTime string, channelsId string, 
> channelsName string , deliveryAreaId string, deliveryCountry string, 
> deliveryProvince string, deliveryCity string,deliveryDistrict string, 
> deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, 
> ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity 
> string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, 
> Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion 
> string, Active_BacVerNumber string, Active_BacFlashVer string, 
> Active_webUIVersion string, Active_webUITypeCarrVer 
> string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, 
> Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, 
> Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, 
> Latest_country string, Latest_province string, Latest_city string, 
> Latest_district string, Latest_street string, Latest_releaseId string, 
> Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber 
> string, Latest_BacFlashVer string, Latest_webUIVersion string, 
> Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, 
> Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, 
> Latest_operatorId string, gamePointDescription string,gamePointId 
> double,contractNumber BigInt) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='imei,deviceInformationId,productionDate,gamePointId,Latest_DAY,contractNumber');
> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/100_olap_C20.csv' INTO 
> table VMALL_DICTIONARY_INCLUDE 
> options('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription');
> select sum(deviceinformationId) from VMALL_DICTIONARY_INCLUDE where 
> deviceColor ='5Device Color' and modelId != '109' or Latest_DAY > 
> '1234567890123540.00' and contractNumber == '92233720368547800' or 
> Active_operaSysVersion like 'Operating System Version' and gamePointId <=> 
> '8.1366141918611E39' and deviceInformationId < '100' and productionDate 
> not like '2016-07-01' and imei is null and Latest_HOUR is not null and 
> channelsId <= '7' and Latest_releaseId >= '1' and Latest_MONTH between 6 and 
> 8 and Latest_YEAR not between 2016 and 2017 and Latest_HOUR RLIKE '12' and 
> gamePointDescription REGEXP 

[jira] [Assigned] (CARBONDATA-744) The property "spark.carbon.custom.distribution" should be part of CarbonProperties

2017-03-03 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan reassigned CARBONDATA-744:
---

Assignee: Mohammad Shahid Khan

> The property "spark.carbon.custom.distribution" should  be part of 
> CarbonProperties
> ---
>
> Key: CARBONDATA-744
> URL: https://issues.apache.org/jira/browse/CARBONDATA-744
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Trivial
>
> The property "spark.carbon.custom.distribution" should be part of 
> CarbonProperties.
> Per the naming style adopted in carbon, the key should be named 
> carbon.custom.distribution.





[jira] [Created] (CARBONDATA-744) The property "spark.carbon.custom.distribution" should be part of CarbonProperties

2017-03-03 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-744:
---

 Summary: The property "spark.carbon.custom.distribution" should  
be part of CarbonProperties
 Key: CARBONDATA-744
 URL: https://issues.apache.org/jira/browse/CARBONDATA-744
 Project: CarbonData
  Issue Type: Improvement
Reporter: Mohammad Shahid Khan
Priority: Trivial


The property "spark.carbon.custom.distribution" should be part of 
CarbonProperties.
Per the naming style adopted in carbon, the key should be named 
carbon.custom.distribution.





[jira] [Comment Edited] (CARBONDATA-664) Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.

2017-01-25 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837824#comment-15837824
 ] 

Mohammad Shahid Khan edited comment on CARBONDATA-664 at 1/25/17 2:38 PM:
--

There is no problem in force load; the issue is happening because, in case of an 
OR filter, some of the projected dimensions and measures are not loaded in 
dictionaryDataChunk and measureDataChunck


was (Author: mohdshahidkhan):
There is no problem in force load; the issue is happening because, in case of an 
OR filter, some of the projected dimensions and measures are not loaded in 
dictionaryDataChunk and measureDataChunck; the same needs to be taken care of.
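The comment above can be illustrated with a toy filter tree: for an OR (as for an AND), the columns referenced on both sides must be loaded, and missing one side leaves its data chunk unset at evaluation time. A sketch under that assumption (these are not CarbonData's real filter classes):

```java
import java.util.HashSet;
import java.util.Set;

// Toy filter-expression tree, NOT CarbonData's real classes: it only
// illustrates why an OR filter needs the union of the columns referenced
// on both sides to be loaded before evaluation.
public class OrFilterColumns {
    interface Expr {}
    static class ColumnRef implements Expr {
        final String name;
        ColumnRef(String name) { this.name = name; }
    }
    static class Or implements Expr {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }
    static class And implements Expr {
        final Expr left, right;
        And(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Walk the whole tree: both children of OR (and AND) contribute columns.
    private static void collect(Expr e, Set<String> out) {
        if (e instanceof ColumnRef) {
            out.add(((ColumnRef) e).name);
        } else if (e instanceof Or) {
            collect(((Or) e).left, out);
            collect(((Or) e).right, out);
        } else if (e instanceof And) {
            collect(((And) e).left, out);
            collect(((And) e).right, out);
        }
    }

    public static Set<String> requiredColumns(Expr filter) {
        Set<String> cols = new HashSet<>();
        collect(filter, cols);
        return cols;
    }

    public static void main(String[] args) {
        // deviceColor = '...' OR Latest_DAY > '...': both columns are needed
        Expr filter = new Or(new ColumnRef("deviceColor"), new ColumnRef("Latest_DAY"));
        System.out.println(requiredColumns(filter).size()); // prints 2
    }
}
```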

> Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.
> 
>
> Key: CARBONDATA-664
> URL: https://issues.apache.org/jira/browse/CARBONDATA-664
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
>Reporter: Harsh Sharma
>Assignee: Mohammad Shahid Khan
>  Labels: bug
> Attachments: 100_olap_C20.csv, Driver Logs, Executor Logs
>
>
> The below scenario works on Spark 2.1, but not on Spark 1.6:
> create table VMALL_DICTIONARY_INCLUDE (imei string,deviceInformationId 
> int,MAC string,deviceColor string,device_backColor string,modelId 
> string,marketName string,AMSize string,ROMSize string,CUPAudit 
> string,CPIClocked string,series string,productionDate timestamp,bomCode 
> string,internalModels string, deliveryTime string, channelsId string, 
> channelsName string , deliveryAreaId string, deliveryCountry string, 
> deliveryProvince string, deliveryCity string,deliveryDistrict string, 
> deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, 
> ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity 
> string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, 
> Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion 
> string, Active_BacVerNumber string, Active_BacFlashVer string, 
> Active_webUIVersion string, Active_webUITypeCarrVer 
> string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, 
> Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, 
> Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, 
> Latest_country string, Latest_province string, Latest_city string, 
> Latest_district string, Latest_street string, Latest_releaseId string, 
> Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber 
> string, Latest_BacFlashVer string, Latest_webUIVersion string, 
> Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, 
> Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, 
> Latest_operatorId string, gamePointDescription string,gamePointId 
> double,contractNumber BigInt) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='imei,deviceInformationId,productionDate,gamePointId,Latest_DAY,contractNumber');
> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/100_olap_C20.csv' INTO 
> table VMALL_DICTIONARY_INCLUDE 
> options('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription');
> select sum(deviceinformationId) from VMALL_DICTIONARY_INCLUDE where 
> deviceColor ='5Device Color' and modelId != '109' or Latest_DAY > 
> '1234567890123540.00' and contractNumber == '92233720368547800' or 
> Active_operaSysVersion like 'Operating System Version' and gamePointId <=> 
> '8.1366141918611E39' and deviceInformationId < '100' and productionDate 
> not like '2016-07-01' and imei is 

[jira] [Commented] (CARBONDATA-664) Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.

2017-01-25 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837824#comment-15837824
 ] 

Mohammad Shahid Khan commented on CARBONDATA-664:
-

There is no problem in force load; the issue is happening because, in case of an 
OR filter, some of the projected dimensions and measures are not loaded in 
dictionaryDataChunk and measureDataChunck; the same needs to be taken care of.

> Select queries fail when BAD_RECORDS_ACTION as FORCED is used in load query.
> 
>
> Key: CARBONDATA-664
> URL: https://issues.apache.org/jira/browse/CARBONDATA-664
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
>Reporter: Harsh Sharma
>Assignee: Mohammad Shahid Khan
>  Labels: bug
> Attachments: 100_olap_C20.csv, Driver Logs, Executor Logs
>
>
> The below scenario works on Spark 2.1, but not on Spark 1.6:
> create table VMALL_DICTIONARY_INCLUDE (imei string,deviceInformationId 
> int,MAC string,deviceColor string,device_backColor string,modelId 
> string,marketName string,AMSize string,ROMSize string,CUPAudit 
> string,CPIClocked string,series string,productionDate timestamp,bomCode 
> string,internalModels string, deliveryTime string, channelsId string, 
> channelsName string , deliveryAreaId string, deliveryCountry string, 
> deliveryProvince string, deliveryCity string,deliveryDistrict string, 
> deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, 
> ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity 
> string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, 
> Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion 
> string, Active_BacVerNumber string, Active_BacFlashVer string, 
> Active_webUIVersion string, Active_webUITypeCarrVer 
> string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, 
> Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, 
> Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, 
> Latest_country string, Latest_province string, Latest_city string, 
> Latest_district string, Latest_street string, Latest_releaseId string, 
> Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber 
> string, Latest_BacFlashVer string, Latest_webUIVersion string, 
> Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, 
> Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, 
> Latest_operatorId string, gamePointDescription string,gamePointId 
> double,contractNumber BigInt) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='imei,deviceInformationId,productionDate,gamePointId,Latest_DAY,contractNumber');
> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/100_olap_C20.csv' INTO 
> table VMALL_DICTIONARY_INCLUDE 
> options('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription');
> select sum(deviceinformationId) from VMALL_DICTIONARY_INCLUDE where 
> deviceColor ='5Device Color' and modelId != '109' or Latest_DAY > 
> '1234567890123540.00' and contractNumber == '92233720368547800' or 
> Active_operaSysVersion like 'Operating System Version' and gamePointId <=> 
> '8.1366141918611E39' and deviceInformationId < '100' and productionDate 
> not like '2016-07-01' and imei is null and Latest_HOUR is not null and 
> channelsId <= '7' and Latest_releaseId >= '1' and Latest_MONTH between 6 and 
> 8 and Latest_YEAR not between 2016 and 2017 and Latest_HOUR RLIKE '12' and 
> gamePointDescription REGEXP 'Site' and imei in 
> 

[jira] [Commented] (CARBONDATA-675) DataLoad failure

2017-01-23 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834433#comment-15834433
 ] 

Mohammad Shahid Khan commented on CARBONDATA-675:
-

[~abhishekg] please make sure that carbonplugin is present in the 
/datadisk1/OSCON/BigData/HACluster/install/spark/sparkJdbc/ directory

> DataLoad failure
> 
>
> Key: CARBONDATA-675
> URL: https://issues.apache.org/jira/browse/CARBONDATA-675
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load, data-query
> Environment: SPARK=1.6.2
>Reporter: abhishek giri
>Priority: Critical
> Attachments: dataload_default_failure.png, dataload_failure.png, 
> thriftlogs.txt
>
>
> This issue comes up in the Create and Load performance test, with a 300-column 
> performance table.
> System configuration :- RAM: 30 G, Cores: 16
> Command to start thrift :- ./bin/spark-submit --master yarn-client 
> --executor-memory 16g --executor-cores 8 --driver-memory 4g --num-executors 3 
> --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer <carbon 
> jar path> "hdfs://hacluster/Opt/CarbonStore"
> Size of data :- 34.59034071 GB
> Create query :- create table oscon_new_1 (ACTIVE_AREA_ID String, 
> ACTIVE_CHECK_DY String, ACTIVE_CHECK_HOUR String, ACTIVE_CHECK_MM String, 
> ACTIVE_CHECK_TIME String, ACTIVE_CHECK_YR String, ACTIVE_CITY String, 
> ACTIVE_COUNTRY String, ACTIVE_DISTRICT String, ACTIVE_EMUI_VERSION String, 
> ACTIVE_FIRMWARE_VER String, ACTIVE_NETWORK String, ACTIVE_OS_VERSION String, 
> ACTIVE_PROVINCE String, BOM String, CHECK_DATE String, CHECK_DY String, 
> CHECK_HOUR String, CHECK_MM String, CHECK_YR String, CUST_ADDRESS_ID String, 
> CUST_AGE String, CUST_BIRTH_COUNTRY String, CUST_BIRTH_DY String, 
> CUST_BIRTH_MM String, CUST_BIRTH_YR String, CUST_BUY_POTENTIAL String, 
> CUST_CITY String, CUST_STATE String, CUST_COUNTRY String, CUST_COUNTY String, 
> CUST_EMAIL_ADDR String, CUST_LAST_RVW_DATE timestamp, CUST_FIRST_NAME String, 
> CUST_ID String, CUST_JOB_TITLE String, CUST_LAST_NAME String, CUST_LOGIN 
> String, CUST_NICK_NAME String, CUST_PRFRD_FLG String, CUST_SEX String, 
> CUST_STREET_NAME String, CUST_STREET_NO String, CUST_SUITE_NO String, 
> CUST_ZIP String, DELIVERY_CITY String, DELIVERY_STATE String, 
> DELIVERY_COUNTRY String, DELIVERY_DISTRICT String, DELIVERY_PROVINCE String, 
> DEVICE_NAME String, INSIDE_NAME String, ITM_BRAND String, ITM_BRAND_ID 
> String, ITM_CATEGORY String, ITM_CATEGORY_ID String, ITM_CLASS String, 
> ITM_CLASS_ID String, ITM_COLOR String, ITM_CONTAINER String, ITM_FORMULATION 
> String, ITM_MANAGER_ID String, ITM_MANUFACT String, ITM_MANUFACT_ID String, 
> ITM_ID String, ITM_NAME String, ITM_REC_END_DATE String, ITM_REC_START_DATE 
> String, LATEST_AREAID String, LATEST_CHECK_DY String, LATEST_CHECK_HOUR 
> String, LATEST_CHECK_MM String, LATEST_CHECK_TIME String, LATEST_CHECK_YR 
> String, LATEST_CITY String, LATEST_COUNTRY String, LATEST_DISTRICT String, 
> LATEST_EMUI_VERSION String, LATEST_FIRMWARE_VER String, LATEST_NETWORK 
> String, LATEST_OS_VERSION String, LATEST_PROVINCE String, OL_ORDER_DATE 
> String, OL_ORDER_NO int, OL_RET_ORDER_NO String, OL_RET_DATE String, OL_SITE 
> String, OL_SITE_DESC String, PACKING_DATE String, PACKING_DY String, 
> PACKING_HOUR String, PACKING_LIST_NO String, PACKING_MM String, PACKING_YR 
> String, PRMTION_ID String, PRMTION_NAME String, PRM_CHANNEL_CAT String, 
> PRM_CHANNEL_DEMO String, PRM_CHANNEL_DETAILS String, PRM_CHANNEL_DMAIL 
> String, PRM_CHANNEL_EMAIL String, PRM_CHANNEL_EVENT String, PRM_CHANNEL_PRESS 
> String, PRM_CHANNEL_RADIO String, PRM_CHANNEL_TV String, PRM_DSCNT_ACTIVE 
> String, PRM_END_DATE String, PRM_PURPOSE String, PRM_START_DATE String, 
> PRODUCT_ID String, PROD_BAR_CODE String, PROD_BRAND_NAME String, PRODUCT_NAME 
> String, PRODUCT_MODEL String, PROD_MODEL_ID String, PROD_COLOR String, 
> PROD_SHELL_COLOR String, PROD_CPU_CLOCK String, PROD_IMAGE String, PROD_LIVE 
> String, PROD_LOC String, PROD_LONG_DESC String, PROD_RAM String, PROD_ROM 
> String, PROD_SERIES String, PROD_SHORT_DESC String, PROD_THUMB String, 
> PROD_UNQ_DEVICE_ADDR String, PROD_UNQ_MDL_ID String, PROD_UPDATE_DATE String, 
> PROD_UQ_UUID String, SHP_CARRIER String, SHP_CODE String, SHP_CONTRACT 
> String, SHP_MODE_ID String, SHP_MODE String, STR_ORDER_DATE String, 
> STR_ORDER_NO String, TRACKING_NO String, WH_CITY String, WH_COUNTRY String, 
> WH_COUNTY String, WH_ID String, WH_NAME String, WH_STATE String, 
> WH_STREET_NAME String, WH_STREET_NO String, WH_STREET_TYPE String, 
> WH_SUITE_NO String, WH_ZIP String, CUST_DEP_COUNT double, CUST_VEHICLE_COUNT 
> double, CUST_ADDRESS_CNT double, CUST_CRNT_CDEMO_CNT double, 
> CUST_CRNT_HDEMO_CNT double, CUST_CRNT_ADDR_DM double, CUST_FIRST_SHIPTO_CNT 
> double, CUST_FIRST_SALES_CNT double, CUST_GMT_OFFSET 

[jira] [Updated] (CARBONDATA-669) InsertIntoCarbonTableTestCase.insert into carbon table from carbon table union query random test failure

2017-01-20 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-669:

Summary: InsertIntoCarbonTableTestCase.insert into carbon table from carbon 
table union query  random test failure  (was: 
org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert
 into carbon table from carbon table union query  random test failure)

> InsertIntoCarbonTableTestCase.insert into carbon table from carbon table 
> union query  random test failure
> -
>
> Key: CARBONDATA-669
> URL: https://issues.apache.org/jira/browse/CARBONDATA-669
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 1.0.0-incubating
>
>
> org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert
>  into carbon table from carbon table union query  random test failure
> ERROR 19-01 19:49:53,151 - Exception in task 1.0 in stage 1634.0 (TID 7523)
> java.lang.NullPointerException
>   at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.initQuery(AbstractQueryExecutor.java:136)
>   at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getBlockExecutionInfos(AbstractQueryExecutor.java:219)
>   at 
> org.apache.carbondata.core.scan.executor.impl.DetailQueryExecutor.execute(DetailQueryExecutor.java:39)
>   at 
> org.apache.carbondata.hadoop.CarbonRecordReader.initialize(CarbonRecordReader.java:79)
>   at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:192)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> ERROR 19-01 19:49:53,152 - Task 1 in stage 1634.0 failed 1 times; aborting job



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CARBONDATA-669) org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert into carbon table from carbon table union query random test failure

2017-01-20 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-669:
---

 Summary: 
org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert
 into carbon table from carbon table union query  random test failure
 Key: CARBONDATA-669
 URL: https://issues.apache.org/jira/browse/CARBONDATA-669
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 1.0.0-incubating
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan
 Fix For: 1.0.0-incubating


org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert
 into carbon table from carbon table union query  random test failure

ERROR 19-01 19:49:53,151 - Exception in task 1.0 in stage 1634.0 (TID 7523)
java.lang.NullPointerException
at 
org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.initQuery(AbstractQueryExecutor.java:136)
at 
org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getBlockExecutionInfos(AbstractQueryExecutor.java:219)
at 
org.apache.carbondata.core.scan.executor.impl.DetailQueryExecutor.execute(DetailQueryExecutor.java:39)
at 
org.apache.carbondata.hadoop.CarbonRecordReader.initialize(CarbonRecordReader.java:79)
at 
org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:192)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
ERROR 19-01 19:49:53,152 - Task 1 in stage 1634.0 failed 1 times; aborting job





[jira] [Assigned] (CARBONDATA-660) Bad Records Logs and Raw CSVs should get display under segment id instead of Tasks id

2017-01-19 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan reassigned CARBONDATA-660:
---

Assignee: Mohammad Shahid Khan

> Bad Records Logs and Raw CSVs should get display under segment id instead of 
> Tasks id
> -
>
> Key: CARBONDATA-660
> URL: https://issues.apache.org/jira/browse/CARBONDATA-660
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load
>Reporter: Priyal Sachdeva
>Assignee: Mohammad Shahid Khan
>Priority: Minor
>
> create table if not exists Badrecords_test (imei string,AMSize int) STORED BY 
> 'org.apache.carbondata.format';
>  LOAD DATA INPATH 'hdfs://hacluster/CSVs/bad_records.csv' into table 
> Badrecords_test OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='REDIRECT','FILEHEADER'='imei,AMSize');
> Bad records logs and raw CSVs are getting displayed under the Task ID
> linux-61:/srv/OSCON/BigData/HACluster/install/hadoop/datanode # 
> bin/hadoop fs -ls /tmp/carbon/default/badrecords_test
> drwxr-xr-x   - root users  0 2017-01-18 21:08 
> /tmp/carbon/default/badrecords_test/0--->Task ID
> 0: jdbc:hive2://172.168.100.205:23040> show segments for table 
> Badrecords_test;
> +---------------------+------------------+--------------------------+--------------------------+
> | SegmentSequenceId   | Status           | Load Start Time          | Load End Time            |
> +---------------------+------------------+--------------------------+--------------------------+
> | 8                   | Partial Success  | 2017-01-18 21:12:58.018  | 2017-01-18 21:12:59.652  |
> | 7                   | Partial Success  | 2017-01-18 21:08:07.426  | 2017-01-18 21:08:11.791  |
> | 6                   | Partial Success  | 2017-01-18 21:07:07.645  | 2017-01-18 21:07:08.747  |
> | 5                   | Partial Success  | 2017-01-18 19:34:16.163  | 2017-01-18 19:34:18.163  |
> | 4                   | Partial Success  | 2017-01-18 19:34:13.669  | 2017-01-18 19:34:15.811  |
> | 3                   | Partial Success  | 2017-01-18 19:30:18.753  | 2017-01-18 19:30:19.644  |
> | 2                   | Partial Success  | 2017-01-18 19:30:13.508  | 2017-01-18 19:30:15.578  |
> | 1                   | Partial Success  | 2017-01-18 19:18:54.787  | 2017-01-18 19:18:54.94   |
> | 0                   | Partial Success  | 2017-01-18 19:18:53.741  | 2017-01-18 19:18:54.614  |
> +---------------------+------------------+--------------------------+--------------------------+
> Bad records logs and raw CSVs are getting displayed under the Task ID. It 
> would be good to have the bad-record information organized per load, i.e. 
> under the segment id.
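The proposed layout, grouping bad-record logs and raw CSVs by the load's segment id rather than by task id, reduces to building the output path from the segment id. A minimal sketch (the helper name and root location are assumptions, not CarbonData's actual code):

```java
// Illustrative only: a path helper for the proposed per-segment layout,
// e.g. /tmp/carbon/default/badrecords_test/8 for segment 8 of this table.
public class BadRecordPath {
    public static String forSegment(String badRecordsRoot, String database,
                                    String table, String segmentId) {
        return badRecordsRoot + "/" + database + "/" + table + "/" + segmentId;
    }

    public static void main(String[] args) {
        System.out.println(forSegment("/tmp/carbon", "default", "badrecords_test", "8"));
        // prints /tmp/carbon/default/badrecords_test/8
    }
}
```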





[jira] [Resolved] (CARBONDATA-233) bad record logger support for non parseable numeric and timestamp data

2017-01-17 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan resolved CARBONDATA-233.
-
Resolution: Resolved

Resolved with PR
https://github.com/apache/incubator-carbondata/pull/148

> bad record logger support for non parseable numeric and timestamp data
> --
>
> Key: CARBONDATA-233
> URL: https://issues.apache.org/jira/browse/CARBONDATA-233
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>
> bad record logger support for non parseable numeric and timestamp data





[jira] [Commented] (CARBONDATA-634) Load Query options invalid values are not consistent behaviour.

2017-01-17 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827466#comment-15827466
 ] 

Mohammad Shahid Khan commented on CARBONDATA-634:
-

As of now, if the user sets a wrong value for BAD_RECORDS_ACTION, the system 
takes the default value FORCE. That is not correct behavior, as a force load 
can change the actual data.
If no value is set, BAD_RECORDS_ACTION will by default be initialized to FORCE. 
But if a value for BAD_RECORDS_ACTION is explicitly set, then that value should 
be validated.
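The validation asked for here might look like the sketch below: default to FORCE only when the option is absent, and reject an explicit invalid value instead of silently forcing. The class and method names are illustrative, and the set of valid actions shown (FORCE, REDIRECT, IGNORE) is an assumption based on this issue's context, not CarbonData's actual list.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the requested validation; names and the valid
// action list are assumptions, not CarbonData's actual API.
public class BadRecordsActionValidator {
    private static final List<String> VALID_ACTIONS =
        Arrays.asList("FORCE", "REDIRECT", "IGNORE");

    // Absent option -> default FORCE; explicit invalid option -> error.
    public static String resolve(String userValue) {
        if (userValue == null) {
            return "FORCE"; // default applies only when nothing was set
        }
        String v = userValue.trim().toUpperCase();
        if (VALID_ACTIONS.contains(v)) {
            return v;
        }
        // Do NOT silently fall back to FORCE: a force load can change data.
        throw new IllegalArgumentException(
            "Invalid value for BAD_RECORDS_ACTION: " + userValue);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));       // prints FORCE
        System.out.println(resolve("redirect")); // prints REDIRECT
    }
}
```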


> Load Query options invalid values are not consistent behaviour. 
> 
>
> Key: CARBONDATA-634
> URL: https://issues.apache.org/jira/browse/CARBONDATA-634
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
> Environment: spark-1.6
>Reporter: Payal
>Assignee: Mohammad Shahid Khan
>Priority: Minor
> Attachments: 2000_UniqData.csv
>
>
> If we pass an invalid keyword (e.g. 'BAD_RECORDS_ACTION'='FAIL'), it behaves 
> like the default ('BAD_RECORDS_ACTION'='FORCE'); an error message is needed 
> here instead of the default behavior.
> Query
> CREATE TABLE Dataload_uniqdata1 (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
>  LOAD DATA INPATH 
> 'hdfs://hadoop-master:54311/data/uniqdata/2000_UniqData.csv' into table 
> Dataload_uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FAIL','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
>  select *   from Dataload_uniqdata limit 10 ;





[jira] [Commented] (CARBONDATA-641) DICTIONARY_EXCLUDE is not working with 'DATE' column

2017-01-16 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823733#comment-15823733
 ] 

Mohammad Shahid Khan commented on CARBONDATA-641:
-

The DATE and TIMESTAMP types are direct dictionary columns, so no dictionary 
file is generated for them automatically.
Therefore neither DATE nor TIMESTAMP should be supported in the 
DICTIONARY_EXCLUDE option.

Input validation for the TIMESTAMP type is missing, i.e. it does not throw an 
error for it.
Solution:
Validation should be added for the TIMESTAMP type as well.
After the fix, adding either a DATE or a TIMESTAMP column to DICTIONARY_EXCLUDE 
will throw an error.
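After such a fix, a DICTIONARY_EXCLUDE validation covering both direct-dictionary types might look like this sketch (the class name and method signature are assumptions; the error text is patterned on the issue report):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Sketch of the post-fix validation: DATE and TIMESTAMP are both
// direct-dictionary types, so both are rejected in DICTIONARY_EXCLUDE.
// Names are illustrative, not CarbonData's actual code.
public class DictionaryExcludeValidator {
    private static final List<String> DIRECT_DICTIONARY_TYPES =
        Arrays.asList("date", "timestamp");

    public static void validate(Map<String, String> columnTypes,
                                List<String> dictionaryExclude) {
        for (String col : dictionaryExclude) {
            String type = columnTypes.get(col.toLowerCase());
            if (type != null && DIRECT_DICTIONARY_TYPES.contains(type.toLowerCase())) {
                throw new IllegalArgumentException(
                    "DICTIONARY_EXCLUDE is unsupported for " + type
                    + " data type column: " + col.toLowerCase());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> types = new java.util.HashMap<>();
        types.put("dob", "date");
        types.put("doj", "timestamp");
        try {
            validate(types, Arrays.asList("DOJ"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // now thrown for timestamp too
        }
    }
}
```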

> DICTIONARY_EXCLUDE is not working with 'DATE' column
> 
>
> Key: CARBONDATA-641
> URL: https://issues.apache.org/jira/browse/CARBONDATA-641
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.0.0-incubating
> Environment: Spark - 1.6 and Spark - 2.1
>Reporter: Anurag Srivastava
>
> I am trying to create a table with *"DICTIONARY_EXCLUDE"* and this property 
> is not working for *"DATE"* Data Type.
> *Query :*  CREATE TABLE uniqdata_date_dictionary (CUST_ID int,CUST_NAME 
> string,ACTIVE_EMUI_VERSION string, DOB date, DOJ date, BIGINT_COLUMN1 
> bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB","DICTIONARY_EXCLUDE"="DOB,DOJ");
> *Expected Result :* Table created.
> *Actual Result :* Error: 
> org.apache.carbondata.spark.exception.MalformedCarbonCommandException: 
> DICTIONARY_EXCLUDE is unsupported for date data type column: dob 
> (state=,code=0)
> But it is working fine if I use 'TIMESTAMP' in place of 'DATE'.





[jira] [Assigned] (CARBONDATA-641) DICTIONARY_EXCLUDE is not working with 'DATE' column

2017-01-16 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan reassigned CARBONDATA-641:
---

Assignee: Mohammad Shahid Khan

> DICTIONARY_EXCLUDE is not working with 'DATE' column
> 
>
> Key: CARBONDATA-641
> URL: https://issues.apache.org/jira/browse/CARBONDATA-641
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.0.0-incubating
> Environment: Spark - 1.6 and Spark - 2.1
>Reporter: Anurag Srivastava
>Assignee: Mohammad Shahid Khan
>
> I am trying to create a table with *"DICTIONARY_EXCLUDE"* and this property 
> is not working for *"DATE"* Data Type.
> *Query :*  CREATE TABLE uniqdata_date_dictionary (CUST_ID int,CUST_NAME 
> string,ACTIVE_EMUI_VERSION string, DOB date, DOJ date, BIGINT_COLUMN1 
> bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB","DICTIONARY_EXCLUDE"="DOB,DOJ");
> *Expected Result :* Table created.
> *Actual Result :* Error: 
> org.apache.carbondata.spark.exception.MalformedCarbonCommandException: 
> DICTIONARY_EXCLUDE is unsupported for date data type column: dob 
> (state=,code=0)
> But it is working fine if I use 'TIMESTAMP' in place of 'DATE'.





[jira] [Updated] (CARBONDATA-484) Implement LRU cache for B-Tree to ensure to avoid out memory, when too many number of tables exits and all are not frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-484:

Summary: Implement LRU cache for B-Tree to ensure to avoid out memory, when 
too many number of tables exits and all are not frequently used.  (was: LRU 
cache for B-Tree to ensure to avoid out memory, when too many number of tables 
exits and all are not frequently used.)

> Implement LRU cache for B-Tree to ensure to avoid out memory, when too many 
> number of tables exits and all are not frequently used.
> ---
>
> Key: CARBONDATA-484
> URL: https://issues.apache.org/jira/browse/CARBONDATA-484
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Attachments: B-Tree LRU Cache.pdf
>
>
> *LRU Cache for B-Tree*
> Problem:
> CarbonData maintains two levels of B-Tree cache, one at the driver level and 
> another at the executor level. Currently CarbonData has a mechanism to 
> invalidate the segment and block caches for invalid table segments, but there 
> is no eviction policy for unused cached objects. So the moment complete 
> memory is utilized, the system will not be able to process any new requests.
> Solution:
> In the caches maintained at the driver and at the executors there will be 
> objects currently not in use. Therefore the system should have the following 
> mechanism:
> 1. Set the max memory limit up to which objects can be held in memory.
> 2. When the configured memory limit is reached, identify the cached objects 
> currently not in use so that the required memory can be freed without 
> impacting the existing processes.
> 3. Eviction should be done only until the required memory has been freed.
> For details please refer to attachments.
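The eviction steps described in the issue above can be sketched as a size-bounded LRU cache. The following is a minimal illustration only, not CarbonData's actual cache implementation: the class and method names are hypothetical, and the "currently not in use" check (e.g. reference counting of in-flight queries) is omitted for brevity. It uses `LinkedHashMap` in access order so that iteration starts at the least recently used entry.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-bounded LRU cache illustrating the three steps above.
class LruByteCache {
  private final long maxSizeBytes;   // step 1: configured memory limit
  private long currentSizeBytes = 0;
  // accessOrder = true: iteration order starts at the least recently used entry
  private final LinkedHashMap<String, byte[]> cache =
      new LinkedHashMap<>(16, 0.75f, true);

  LruByteCache(long maxSizeBytes) {
    this.maxSizeBytes = maxSizeBytes;
  }

  synchronized void put(String key, byte[] value) {
    byte[] old = cache.put(key, value);
    if (old != null) {
      currentSizeBytes -= old.length;
    }
    currentSizeBytes += value.length;
    evictUntilWithinLimit();           // step 2: triggered when limit exceeded
  }

  synchronized byte[] get(String key) {
    return cache.get(key);             // marks the entry most recently used
  }

  // Step 3: evict least recently used entries only until back under the limit.
  private void evictUntilWithinLimit() {
    Iterator<Map.Entry<String, byte[]>> it = cache.entrySet().iterator();
    while (currentSizeBytes > maxSizeBytes && it.hasNext()) {
      Map.Entry<String, byte[]> eldest = it.next();
      currentSizeBytes -= eldest.getValue().length;
      it.remove();
    }
  }

  synchronized long sizeBytes() {
    return currentSizeBytes;
  }
}
```

A recently touched entry survives eviction while a colder one is dropped, which matches the "free only the memory that is required, without impacting existing processing" requirement.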





[jira] [Updated] (CARBONDATA-484) LRU cache for B-Tree to avoid running out of memory when too many tables exist and not all are frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-484:

Attachment: B-Tree LRU Cache.pdf

LRU cache for B-Tree Design Doc 

> LRU cache for B-Tree to avoid running out of memory when too many tables 
> exist and not all are frequently used.
> -
>
> Key: CARBONDATA-484
> URL: https://issues.apache.org/jira/browse/CARBONDATA-484
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Attachments: B-Tree LRU Cache.pdf
>
>
> *LRU Cache for B-Tree*
> Problem:
> CarbonData maintains two levels of B-Tree cache, one at the driver level and 
> another at the executor level. CarbonData currently has a mechanism to 
> invalidate the cached segments and blocks of invalid table segments, but 
> there is no eviction policy for unused cached objects. So once the available 
> memory is fully utilized, the system cannot process any new requests.
> Solution:
> The caches maintained at the driver and executor levels are likely to hold 
> objects that are not currently in use. The system should therefore provide 
> the following mechanism:
> 1. Set a maximum memory limit up to which objects can be held in memory.
> 2. When the configured memory limit is reached, identify cached objects 
> currently not in use so that the required memory can be freed without 
> impacting running processes.
> 3. Evict only until the required amount of memory has been freed.
> For details please refer to the attachments.





[jira] [Updated] (CARBONDATA-484) LRU cache for B-Tree to avoid running out of memory when too many tables exist and not all are frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-484:

Issue Type: New Feature  (was: Bug)

> LRU cache for B-Tree to avoid running out of memory when too many tables 
> exist and not all are frequently used.
> -
>
> Key: CARBONDATA-484
> URL: https://issues.apache.org/jira/browse/CARBONDATA-484
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>
> *LRU Cache for B-Tree*
> Problem:
> CarbonData maintains two levels of B-Tree cache, one at the driver level and 
> another at the executor level. CarbonData currently has a mechanism to 
> invalidate the cached segments and blocks of invalid table segments, but 
> there is no eviction policy for unused cached objects. So once the available 
> memory is fully utilized, the system cannot process any new requests.
> Solution:
> The caches maintained at the driver and executor levels are likely to hold 
> objects that are not currently in use. The system should therefore provide 
> the following mechanism:
> 1. Set a maximum memory limit up to which objects can be held in memory.
> 2. When the configured memory limit is reached, identify cached objects 
> currently not in use so that the required memory can be freed without 
> impacting running processes.
> 3. Evict only until the required amount of memory has been freed.
> For details please refer to the attachments.





[jira] [Created] (CARBONDATA-484) LRU cache for B-Tree to avoid running out of memory when too many tables exist and not all are frequently used.

2016-12-02 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-484:
---

 Summary: LRU cache for B-Tree to avoid running out of memory when too 
many tables exist and not all are frequently used.
 Key: CARBONDATA-484
 URL: https://issues.apache.org/jira/browse/CARBONDATA-484
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan


*LRU Cache for B-Tree*
Problem:

CarbonData maintains two levels of B-Tree cache, one at the driver level and 
another at the executor level. CarbonData currently has a mechanism to 
invalidate the cached segments and blocks of invalid table segments, but 
there is no eviction policy for unused cached objects. So once the available 
memory is fully utilized, the system cannot process any new requests.

Solution:

The caches maintained at the driver and executor levels are likely to hold 
objects that are not currently in use. The system should therefore provide 
the following mechanism:

1. Set a maximum memory limit up to which objects can be held in memory.

2. When the configured memory limit is reached, identify cached objects 
currently not in use so that the required memory can be freed without 
impacting running processes.

3. Evict only until the required amount of memory has been freed.

For details please refer to the attachments.





[jira] [Created] (CARBONDATA-317) CSV having only space char is throwing NullPointerException

2016-10-13 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-317:
---

 Summary: CSV having only space char is throwing 
NullPointerException
 Key: CARBONDATA-317
 URL: https://issues.apache.org/jira/browse/CARBONDATA-317
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Priority: Minor








[jira] [Updated] (CARBONDATA-288) In HDFS bad record logger is failing in writing the bad records

2016-10-08 Thread Mohammad Shahid Khan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Shahid Khan updated CARBONDATA-288:

Fix Version/s: 0.2.0-incubating

> In HDFS bad record logger is failing in writing the bad records
> 
>
> Key: CARBONDATA-288
> URL: https://issues.apache.org/jira/browse/CARBONDATA-288
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 0.2.0-incubating
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 0.2.0-incubating
>
>
> For the HDFS file system:
> CarbonFile logFile = FileFactory.getCarbonFile(filePath, FileType.HDFS);
> If filePath does not exist, then calling CarbonFile.getPath() throws a 
> NullPointerException.
> Solution:
> If the file does not exist, it must be created first.
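The fix described above (create the log file before writing when it does not yet exist) can be sketched against the local filesystem API. This is an illustrative analogue, not CarbonData's actual FileFactory/HDFS code; the class and method names here are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Local-filesystem sketch of the fix: create the log file (and its parent
// directories) before the first write instead of assuming it already exists.
class BadRecordLogger {
  static void logBadRecord(Path logFile, String record) throws IOException {
    if (!Files.exists(logFile)) {
      if (logFile.getParent() != null) {
        Files.createDirectories(logFile.getParent()); // parents first
      }
      Files.createFile(logFile);                      // then the file itself
    }
    // Appending to a file that is guaranteed to exist cannot NPE on a
    // missing path the way the original code did.
    Files.writeString(logFile, record + System.lineSeparator(),
        StandardOpenOption.APPEND);
  }
}
```

On HDFS the equivalent guard would check existence and create the file (e.g. via the Hadoop `FileSystem.exists`/`create` calls) before obtaining an output stream to append bad records.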





[jira] [Created] (CARBONDATA-288) In HDFS bad record logger is failing in writing the bad records

2016-10-08 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-288:
---

 Summary: In HDFS bad record logger is failing in writing the bad 
records
 Key: CARBONDATA-288
 URL: https://issues.apache.org/jira/browse/CARBONDATA-288
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan


For the HDFS file system:
CarbonFile logFile = FileFactory.getCarbonFile(filePath, FileType.HDFS);
If filePath does not exist, then calling CarbonFile.getPath() throws a 
NullPointerException.
Solution:
If the file does not exist, it must be created first.








[jira] [Created] (CARBONDATA-220) TimeStampDirectDictionaryGenerator_UT.java is not running in the build

2016-09-07 Thread Mohammad Shahid Khan (JIRA)
Mohammad Shahid Khan created CARBONDATA-220:
---

 Summary: TimeStampDirectDictionaryGenerator_UT.java is not running 
in the build
 Key: CARBONDATA-220
 URL: https://issues.apache.org/jira/browse/CARBONDATA-220
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan


TimeStampDirectDictionaryGenerator_UT.java is not running in the build, and 
the generateDirectSurrogateKey test is failing.


