Vandana Yadav created CARBONDATA-2680:
-----------------------------------------
Summary: Incorrect result displays after applying the
histogram_numeric function on carbon and hive table
Key: CARBONDATA-2680
URL: https://issues.apache.org/jira/browse/CARBONDATA-2680
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 1.5.0
Environment: spark 2.2
Reporter: Vandana Yadav
Attachments: 100_hive_test.csv
The histogram_numeric function returns different results on a Carbon table and
a Hive table that are loaded with the same data.
Steps to reproduce:
1) Create a Carbon table and load data into it:
a) create table Carbon_automation (imei string,deviceInformationId int,MAC
string,deviceColor string,device_backColor string,modelId string,marketName
string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series
string,productionDate timestamp,bomCode string,internalModels string,
deliveryTime string, channelsId string, channelsName string , deliveryAreaId
string, deliveryCountry string, deliveryProvince string, deliveryCity
string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string,
ActiveCheckTime string, ActiveAreaId string, ActiveCountry string,
ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet
string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion
string, Active_operaSysVersion string, Active_BacVerNumber string,
Active_BacFlashVer string, Active_webUIVersion string, Active_webUITypeCarrVer
string,Active_webTypeDataVerNumber string, Active_operatorsVersion string,
Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int,
Latest_DAY int, Latest_HOUR string, Latest_areaId string, Latest_country
string, Latest_province string, Latest_city string, Latest_district string,
Latest_street string, Latest_releaseId string, Latest_EMUIVersion string,
Latest_operaSysVersion string, Latest_BacVerNumber string, Latest_BacFlashVer
string, Latest_webUIVersion string, Latest_webUITypeCarrVer string,
Latest_webTypeDataVerNumber string, Latest_operatorsVersion string,
Latest_phonePADPartitionedVersions string, Latest_operatorId string,
gamePointDescription string,gamePointId double,contractNumber double,imei_count
int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES
('DICTIONARY_INCLUDE'='deviceInformationId,Latest_YEAR,Latest_MONTH,Latest_DAY');
b) LOAD DATA INPATH
'hdfs://hadoop-master:54311/BabuStore/Data/HiveData/100_hive_test.csv' INTO
TABLE Carbon_automation
OPTIONS('DELIMITER'=',','QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,contractNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaSysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointId,gamePointDescription,imei_count');
2) Create a Hive table and load data into it:
a) create table Carbon_automation_h (imei string,deviceInformationId int,MAC
string,deviceColor string,device_backColor string,modelId string,marketName
string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series
string,productionDate timestamp,bomCode string,internalModels string,
deliveryTime string, channelsId string, channelsName string , deliveryAreaId
string, deliveryCountry string, deliveryProvince string, deliveryCity
string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string,
ActiveCheckTime string, ActiveAreaId string, ActiveCountry string,
ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet
string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion
string, Active_operaSysVersion string, Active_BacVerNumber string,
Active_BacFlashVer string, Active_webUIVersion string, Active_webUITypeCarrVer
string,Active_webTypeDataVerNumber string, Active_operatorsVersion string,
Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int,
Latest_DAY int, Latest_HOUR string, Latest_areaId string, Latest_country
string, Latest_province string, Latest_city string, Latest_district string,
Latest_street string, Latest_releaseId string, Latest_EMUIVersion string,
Latest_operaSysVersion string, Latest_BacVerNumber string, Latest_BacFlashVer
string, Latest_webUIVersion string, Latest_webUITypeCarrVer string,
Latest_webTypeDataVerNumber string, Latest_operatorsVersion string,
Latest_phonePADPartitionedVersions string, Latest_operatorId string,
gamePointDescription string,gamePointId double,contractNumber double,imei_count
int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
b) load data local inpath
'/opt/Carbon/CarbonData/TestData/Data/HiveData/100_hive_test.csv' OVERWRITE
INTO TABLE Carbon_automation_h;
3) Execute the query:
On the Carbon table:
select histogram_numeric(1, 5000) from Carbon_automation;
On the Hive table:
select histogram_numeric(1, 5000) from Carbon_automation_h;
4) Expected Result:
Because both tables are loaded from the same CSV data, the query should return
the same output on both.
5) Actual Result:
Carbon Table:
Output:
+------------------------------+--+
| histogram_numeric( 1, 5000)  |
+------------------------------+--+
| [{"x":1.0,"y":99.0}]         |
+------------------------------+--+
1 row selected (0.204 seconds)
Hive Table:
Output:
+------------------------------------------+--+
| histogram_numeric( 1, 5000)              |
+------------------------------------------+--+
| [{"x":1.0,"y":50.0},{"x":1.0,"y":49.0}]  |
+------------------------------------------+--+
1 row selected (2.171 seconds)
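Note for triage: the two outputs cover the same 99 rows. The Hive result splits them across two bins that share the center x = 1.0 (50 + 49), whereas the Carbon result reports a single bin. The Python sketch below is not part of the original report, and merge_bins is a hypothetical helper introduced only for illustration. It sums the heights of bins that share a center, so that the two outputs can be compared directly:

```python
import json
from collections import defaultdict


def merge_bins(histogram_json):
    """Sum the heights (y) of bins that share the same center (x).

    Hypothetical helper for comparing two histogram_numeric outputs;
    it is not part of CarbonData, Hive, or Spark.
    """
    totals = defaultdict(float)
    for bin_ in json.loads(histogram_json):
        totals[bin_["x"]] += bin_["y"]
    # Sort by bin center so that the comparison is order-independent.
    return sorted(totals.items())


# Result strings copied from the outputs reported above.
carbon = '[{"x":1.0,"y":99.0}]'
hive = '[{"x":1.0,"y":50.0},{"x":1.0,"y":49.0}]'

print(merge_bins(carbon))                      # [(1.0, 99.0)]
print(merge_bins(hive))                        # [(1.0, 99.0)]
print(merge_bins(carbon) == merge_bins(hive))  # True
```

After merging, the outputs agree, so the visible difference is in how the bins are reported rather than in the row counts. This does not establish which engine's output is the intended one.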