Payal created CARBONDATA-602:
--------------------------------

             Summary: Loading data 3 or 4 times throws an error
                 Key: CARBONDATA-602
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-602
             Project: CarbonData
          Issue Type: Bug
          Components: data-load
            Reporter: Payal


When data is loaded with 'USE_KETTLE'='false' together with 
'SINGLE_PASS'='true', the load fails with the following error:
Error: java.lang.Exception: Dataload failed due to error while write dictionary 
file! (state=,code=0)
Without 'USE_KETTLE'='false', the data load succeeds.


For Example:
CREATE TABLE uniqdata_INCLUDEDICTIONARY (CUST_ID int, CUST_NAME String,
ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp,
BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint,
DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),
Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int)
STORED BY 'org.apache.carbondata.format'
TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');


LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv'
into table uniqdata_INCLUDEDICTIONARY
OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'BAD_RECORDS_LOGGER_ENABLE'='TRUE',
'BAD_RECORDS_ACTION'='FORCE',
'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1',
'SINGLE_PASS'='true', 'USE_KETTLE'='false');
Error: java.lang.Exception: Dataload failed due to error while write dictionary file! (state=,code=0)

LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv'
into table uniqdata_INCLUDEDICTIONARY
OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'BAD_RECORDS_LOGGER_ENABLE'='TRUE',
'BAD_RECORDS_ACTION'='FORCE',
'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1',
'SINGLE_PASS'='true');
+---------+--+
| Result  |
+---------+--+
+---------+--+
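
The two LOAD statements above differ only in the 'USE_KETTLE' option. As a rough 
aid for reproducing the issue, the sketch below generates both variants from one 
option set and prints the failing one several times, since the summary mentions 
the error appearing after 3 or 4 loads. The helper build_load_statement and the 
constant names are hypothetical and are not part of CarbonData; the script only 
prints SQL text and does not connect to any server.

```python
# Hypothetical helper (not a CarbonData API): builds the LOAD DATA statements
# from this report so the failing and passing option sets can be compared.
COLUMNS = ("CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,"
           "BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,"
           "Double_COLUMN2,INTEGER_COLUMN1")

# Options shared by both statements in the report.
BASE_OPTIONS = {
    "DELIMITER": ",",
    "QUOTECHAR": '"',
    "BAD_RECORDS_LOGGER_ENABLE": "TRUE",
    "BAD_RECORDS_ACTION": "FORCE",
    "FILEHEADER": COLUMNS,
    "SINGLE_PASS": "true",
}

PATH = "hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv"
TABLE = "uniqdata_INCLUDEDICTIONARY"


def build_load_statement(path, table, options):
    """Render a LOAD DATA statement with the given OPTIONS key/value pairs."""
    opts = ", ".join("'{}'='{}'".format(k, v) for k, v in options.items())
    return "LOAD DATA INPATH '{}' INTO TABLE {} OPTIONS({})".format(
        path, table, opts)


# Variant reported as failing: SINGLE_PASS plus USE_KETTLE=false.
failing = build_load_statement(PATH, TABLE,
                               dict(BASE_OPTIONS, USE_KETTLE="false"))
# Variant reported as succeeding: same options without USE_KETTLE.
passing = build_load_statement(PATH, TABLE, BASE_OPTIONS)

# Print the failing statement four times for pasting into a SQL client.
for attempt in range(1, 5):
    print("-- attempt {}".format(attempt))
    print(failing + ";")
```

The printed statements can be pasted into the same client session used above; 
whether the failure shows up on the first or a later attempt is left for the 
person reproducing it to observe.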



INFO  06-01 13:31:54,820 - Running query 'LOAD DATA INPATH 
'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE'
 ='false')' with 2e6007f7-946d-4071-a73f-30d90538ebd6
INFO  06-01 13:31:54,820 - pool-26-thread-58 Query [LOAD DATA INPATH 
'HDFS://HADOOP-MASTER:54311/DATA/UNIQDATA/7000_UNIQDATA.CSV' INTO TABLE 
UNIQDATA_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,DOUBLE_COLUMN1,DOUBLE_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='TRUE','USE_KETTLE'
 ='FALSE')]
INFO  06-01 13:31:54,831 - Successfully able to get the table metadata file lock
INFO  06-01 13:31:54,834 - pool-26-thread-58 Initiating Direct Load for the 
Table : (meradb.uniqdata_includedictionary)
AUDIT 06-01 13:31:54,838 - [deepak-Vostro-3546][hduser][Thread-494]Data load 
request has been received for table meradb.uniqdata_includedictionary
AUDIT 06-01 13:31:54,838 - [deepak-Vostro-3546][hduser][Thread-494]Data is 
loading with New Data Flow for table meradb.uniqdata_includedictionary
INFO  06-01 13:31:54,891 - pool-26-thread-58 [Block Distribution]
INFO  06-01 13:31:54,891 - pool-26-thread-58 totalInputSpaceConsumed: 1505367 , 
defaultParallelism: 8
INFO  06-01 13:31:54,891 - pool-26-thread-58 
mapreduce.input.fileinputformat.split.maxsize: 16777216
INFO  06-01 13:31:54,891 - Total input paths to process : 1
INFO  06-01 13:31:54,892 - pool-26-thread-58 Executors configured : 1
INFO  06-01 13:31:54,893 - pool-26-thread-58 Requesting total executors: 1
INFO  06-01 13:31:54,897 - pool-26-thread-58 Total Time taken to ensure the 
required executors : 3
INFO  06-01 13:31:54,897 - pool-26-thread-58 Time elapsed to allocate the 
required executors: 0
INFO  06-01 13:31:54,898 - pool-26-thread-58 Total Time taken in block 
allocation: 6
INFO  06-01 13:31:54,898 - pool-26-thread-58 Total no of blocks: 1, No.of 
Nodes: 1
INFO  06-01 13:31:54,898 - pool-26-thread-58 #Node: hadoop-slave-1 
no.of.blocks: 1 , mismatch locations: ,knoldus

INFO  06-01 13:31:55,057 - Block broadcast_62 stored as values in memory 
(estimated size 150.4 MB, free 300.0 MB)
INFO  06-01 13:31:55,064 - Block broadcast_62_piece0 stored as bytes in memory 
(estimated size 19.7 KB, free 300.0 MB)
INFO  06-01 13:31:55,064 - Added broadcast_62_piece0 in memory on 
192.168.2.174:32778 (size: 19.7 KB, free: 511.0 MB)
INFO  06-01 13:31:55,064 - Created broadcast 62 from broadcast at 
NewCarbonDataLoadRDD.scala:109
INFO  06-01 13:31:55,067 - Starting job: collect at 
CarbonDataRDDFactory.scala:632
INFO  06-01 13:31:55,067 - Got job 31 (collect at 
CarbonDataRDDFactory.scala:632) with 1 output partitions
INFO  06-01 13:31:55,067 - Final stage: ResultStage 38 (collect at 
CarbonDataRDDFactory.scala:632)
INFO  06-01 13:31:55,067 - Parents of final stage: List()
INFO  06-01 13:31:55,067 - Missing parents: List()
INFO  06-01 13:31:55,068 - Submitting ResultStage 38 (NewCarbonDataLoadRDD[150] 
at RDD at NewCarbonDataLoadRDD.scala:91), which has no missing parents
INFO  06-01 13:31:55,068 - Preferred Location for split : hadoop-slave-1
INFO  06-01 13:31:55,069 - Block broadcast_63 stored as values in memory 
(estimated size 12.0 KB, free 300.0 MB)
INFO  06-01 13:31:55,070 - Block broadcast_63_piece0 stored as bytes in memory 
(estimated size 5.8 KB, free 300.0 MB)
INFO  06-01 13:31:55,070 - Added broadcast_63_piece0 in memory on 
192.168.2.174:32778 (size: 5.8 KB, free: 511.0 MB)
INFO  06-01 13:31:55,071 - Created broadcast 63 from broadcast at 
DAGScheduler.scala:1006
INFO  06-01 13:31:55,071 - Submitting 1 missing tasks from ResultStage 38 
(NewCarbonDataLoadRDD[150] at RDD at NewCarbonDataLoadRDD.scala:91)
INFO  06-01 13:31:55,071 - Adding task set 38.0 with 1 tasks
INFO  06-01 13:31:55,072 - Starting task 0.0 in stage 38.0 (TID 92, 
hadoop-slave-1, partition 0,NODE_LOCAL, 2498 bytes)
INFO  06-01 13:31:55,083 - Added broadcast_63_piece0 in memory on 
hadoop-slave-1:34995 (size: 5.8 KB, free: 511.0 MB)
INFO  06-01 13:31:55,096 - Added broadcast_62_piece0 in memory on 
hadoop-slave-1:34995 (size: 19.7 KB, free: 511.0 MB)
AUDIT 06-01 13:31:55,120 - [deepak-Vostro-3546][hduser][Thread-428]Connected 
org.apache.carbondata.core.dictionary.server.DictionaryServerHandler@7c9223ef
INFO  06-01 13:31:56,510 - Finished task 0.0 in stage 38.0 (TID 92) in 1439 ms 
on hadoop-slave-1 (1/1)
INFO  06-01 13:31:56,510 - Removed TaskSet 38.0, whose tasks have all 
completed, from pool
INFO  06-01 13:31:56,510 - ResultStage 38 (collect at 
CarbonDataRDDFactory.scala:632) finished in 1.439 s
INFO  06-01 13:31:56,510 - Job 31 finished: collect at 
CarbonDataRDDFactory.scala:632, took 1.443490 s
INFO  06-01 13:31:56,511 - pool-26-thread-58 Acquired lock for 
tablemeradb.uniqdata_includedictionary for table status updation
INFO  06-01 13:31:56,595 - pool-26-thread-58 Successfully deleted the lock file 
/tmp/meradb/uniqdata_includedictionary/tablestatus.lock
INFO  06-01 13:31:56,595 - pool-26-thread-58 Table unlocked successfully after 
table status updationmeradb.uniqdata_includedictionary
ERROR 06-01 13:31:56,595 - pool-26-thread-58 Error while close dictionary 
server and write dictionary file for meradb.uniqdata_includedictionary
ERROR 06-01 13:31:56,595 - pool-26-thread-58
java.lang.Exception: Dataload failed due to error while write dictionary file!
    at 
org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:773)
    at 
org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:470)
    at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
    at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
    at 
org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
    at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
    at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:137)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:211)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:154)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:151)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:164)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
AUDIT 06-01 13:31:56,596 - [deepak-Vostro-3546][hduser][Thread-494]Dataload 
failure for meradb.uniqdata_includedictionary. Please check the logs
INFO  06-01 13:31:56,596 - pool-26-thread-58 Successfully deleted the lock file 
/tmp/meradb/uniqdata_includedictionary/meta.lock
INFO  06-01 13:31:56,596 - Table MetaData Unlocked Successfully after data load
ERROR 06-01 13:31:56,597 - Error executing query, currentState RUNNING,
java.lang.Exception: Dataload failed due to error while write dictionary file!
    at 
org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:773)
    at 
org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:470)
    at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
    at 
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
    at 
org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
    at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
    at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:137)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:211)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:154)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:151)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:164)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
ERROR 06-01 13:31:56,597 - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: java.lang.Exception: Dataload 
failed due to error while write dictionary file!
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:246)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:154)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:151)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:164)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)



