[ https://issues.apache.org/jira/browse/CARBONDATA-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor reassigned CARBONDATA-1056:
----------------------------------------

    Assignee: Kunal Kapoor

> Data load failure using SINGLE_PASS='true' with Spark 2.1
> ----------------------------------------------------------
>
>                 Key: CARBONDATA-1056
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1056
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.1.0
>         Environment: spark 2.1
>            Reporter: Vandana Yadav
>            Assignee: Kunal Kapoor
>            Priority: Minor
>         Attachments: 2000_UniqData.csv
>
>
> Data load fails when SINGLE_PASS='true' is used with Spark 2.1.
> Steps to reproduce:
> 1) Create the table:
> CREATE TABLE uniq_exclude_sp1 (
>   CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string,
>   DOB timestamp, DOJ timestamp,
>   BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint,
>   DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),
>   Double_COLUMN1 double, Double_COLUMN2 double,
>   INTEGER_COLUMN1 int)
> STORED BY 'org.apache.carbondata.format'
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='CUST_NAME,ACTIVE_EMUI_VERSION');
> 2) Load data:
> LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv'
> INTO TABLE uniq_exclude_sp1
> OPTIONS('DELIMITER'=',',
>         'QUOTECHAR'='"',
>         'BAD_RECORDS_ACTION'='FORCE',
>         'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1',
>         'SINGLE_Pass'='true');
> 3) Result:
> Actual result on beeline:
> Error: java.lang.Exception: Dataload failed due to error while writing
> dictionary file! (state=,code=0)
> Expected result: the data should be loaded successfully (a comparison load with single pass disabled is sketched after the logs below).
> 4) Thrift server logs:
> 17/05/16 16:07:20 INFO SparkExecuteStatementOperation: Running query 'LOAD 
> DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table 
> uniq_exclude_sp1 OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_Pass'='true')'
>  with 34eb7e9e-bd49-495c-af68-8f0b5e36b786
> 17/05/16 16:07:20 INFO CarbonSparkSqlParser: Parsing command: LOAD DATA 
> INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table uniq_exclude_sp1 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_Pass'='true')
> 17/05/16 16:07:20 INFO CarbonLateDecodeRule: pool-23-thread-4 Skip 
> CarbonOptimizer
> 17/05/16 16:07:20 INFO HdfsFileLock: pool-23-thread-4 HDFS lock 
> path:hdfs://localhost:54310/opt/prestocarbonStore/default/uniq_exclude_sp1/meta.lock
> 17/05/16 16:07:20 INFO LoadTable: pool-23-thread-4 Successfully able to get 
> the table metadata file lock
> 17/05/16 16:07:20 INFO LoadTable: pool-23-thread-4 Initiating Direct Load for 
> the Table : (default.uniq_exclude_sp1)
> 17/05/16 16:07:20 AUDIT CarbonDataRDDFactory$: 
> [knoldus][hduser][Thread-137]Data load request has been received for table 
> default.uniq_exclude_sp1
> 17/05/16 16:07:20 INFO CommonUtil$: pool-23-thread-4 [Block Distribution]
> 17/05/16 16:07:20 INFO CommonUtil$: pool-23-thread-4 totalInputSpaceConsumed: 
> 376223 , defaultParallelism: 4
> 17/05/16 16:07:20 INFO CommonUtil$: pool-23-thread-4 
> mapreduce.input.fileinputformat.split.maxsize: 16777216
> 17/05/16 16:07:20 INFO FileInputFormat: Total input paths to process : 1
> 17/05/16 16:07:20 INFO DistributionUtil$: pool-23-thread-4 Executors 
> configured : 1
> 17/05/16 16:07:20 INFO DistributionUtil$: pool-23-thread-4 Total Time taken 
> to ensure the required executors : 0
> 17/05/16 16:07:20 INFO DistributionUtil$: pool-23-thread-4 Time elapsed to 
> allocate the required executors: 0
> 17/05/16 16:07:20 INFO CarbonDataRDDFactory$: pool-23-thread-4 Total Time 
> taken in block allocation: 1
> 17/05/16 16:07:20 INFO CarbonDataRDDFactory$: pool-23-thread-4 Total no of 
> blocks: 1, No.of Nodes: 1
> 17/05/16 16:07:20 INFO CarbonDataRDDFactory$: pool-23-thread-4 #Node: knoldus 
> no.of.blocks: 1
> 17/05/16 16:07:20 INFO MemoryStore: Block broadcast_2 stored as values in 
> memory (estimated size 53.7 MB, free 291.4 MB)
> 17/05/16 16:07:20 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes 
> in memory (estimated size 23.2 KB, free 291.4 MB)
> 17/05/16 16:07:20 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory 
> on 192.168.1.10:42046 (size: 23.2 KB, free: 366.2 MB)
> 17/05/16 16:07:20 INFO SparkContext: Created broadcast 2 from broadcast at 
> NewCarbonDataLoadRDD.scala:185
> 17/05/16 16:07:20 INFO SparkContext: Starting job: collect at 
> CarbonDataRDDFactory.scala:630
> 17/05/16 16:07:20 INFO DAGScheduler: Got job 1 (collect at 
> CarbonDataRDDFactory.scala:630) with 1 output partitions
> 17/05/16 16:07:20 INFO DAGScheduler: Final stage: ResultStage 1 (collect at 
> CarbonDataRDDFactory.scala:630)
> 17/05/16 16:07:20 INFO DAGScheduler: Parents of final stage: List()
> 17/05/16 16:07:20 INFO DAGScheduler: Missing parents: List()
> 17/05/16 16:07:20 INFO DAGScheduler: Submitting ResultStage 1 
> (NewCarbonDataLoadRDD[4] at RDD at NewCarbonDataLoadRDD.scala:174), which has 
> no missing parents
> 17/05/16 16:07:20 INFO NewCarbonDataLoadRDD: Preferred Location for split : 
> knoldus
> 17/05/16 16:07:20 INFO MemoryStore: Block broadcast_3 stored as values in 
> memory (estimated size 11.8 KB, free 291.4 MB)
> 17/05/16 16:07:20 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes 
> in memory (estimated size 6.0 KB, free 291.4 MB)
> 17/05/16 16:07:20 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory 
> on 192.168.1.10:42046 (size: 6.0 KB, free: 366.2 MB)
> 17/05/16 16:07:20 INFO SparkContext: Created broadcast 3 from broadcast at 
> DAGScheduler.scala:996
> 17/05/16 16:07:20 INFO DAGScheduler: Submitting 1 missing tasks from 
> ResultStage 1 (NewCarbonDataLoadRDD[4] at RDD at 
> NewCarbonDataLoadRDD.scala:174)
> 17/05/16 16:07:20 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
> 17/05/16 16:07:20 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, 
> localhost, executor driver, partition 0, ANY, 6850 bytes)
> 17/05/16 16:07:20 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
> 17/05/16 16:07:20 INFO NewCarbonDataLoadRDD: Input split: knoldus
> 17/05/16 16:07:20 INFO NewCarbonDataLoadRDD: The Block Count in this node :1
> 17/05/16 16:07:20 INFO AbstractDataLoadProcessorStep: Thread-61 Rows 
> processed in step Input Processor : 0
> 17/05/16 16:07:20 INFO AbstractDataLoadProcessorStep: Thread-62 Rows 
> processed in step Data Converter : 0
> 17/05/16 16:07:20 INFO AbstractDataLoadProcessorStep: Thread-63 Rows 
> processed in step Sort Processor : 0
> 17/05/16 16:07:20 INFO AbstractDataLoadProcessorStep: Thread-64 Rows 
> processed in step Data Writer : 0
> 17/05/16 16:07:20 AUDIT DictionaryClient: 
> [knoldus][hduser][Thread-149]Starting client on 192.168.1.10 2030
> 17/05/16 16:07:20 INFO DictionaryClient: Dictionary client Dictionary client 
> Started, Total time spent : 1
> 17/05/16 16:07:20 AUDIT DictionaryClientHandler: 
> [knoldus][hduser][Thread-150]Connected client 
> io.netty.channel.DefaultChannelHandlerContext@1d9d500d
> 17/05/16 16:07:20 AUDIT DictionaryServerHandler: 
> [knoldus][hduser][Thread-105]Connected 
> io.netty.channel.DefaultChannelHandlerContext@1b9e1073
> 17/05/16 16:07:21 INFO SortParameters: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Sort size for table: 500000
> 17/05/16 16:07:21 INFO SortParameters: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Number of intermediate file to be merged: 20
> 17/05/16 16:07:21 INFO SortParameters: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  File Buffer Size: 1048576
> 17/05/16 16:07:21 INFO SortParameters: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  temp file 
> location/tmp/42046473395440/0/default/uniq_exclude_sp1/Fact/Part0/Segment_0/0/sortrowtmp
> 17/05/16 16:07:21 INFO DataLoadExecutor: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Data Loading is started for table uniq_exclude_sp1
> 17/05/16 16:07:21 AUDIT DictionaryClient: 
> [knoldus][hduser][Thread-149]Starting client on 192.168.1.10 2030
> 17/05/16 16:07:21 INFO DictionaryClient: Dictionary client Dictionary client 
> Started, Total time spent : 0
> 17/05/16 16:07:21 AUDIT DictionaryClientHandler: 
> [knoldus][hduser][Thread-152]Connected client 
> io.netty.channel.DefaultChannelHandlerContext@4b31e773
> 17/05/16 16:07:21 AUDIT DictionaryServerHandler: 
> [knoldus][hduser][Thread-105]Connected 
> io.netty.channel.DefaultChannelHandlerContext@7b13e6e9
> 17/05/16 16:07:22 INFO SortDataRows: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  File based sorting will be used
> 17/05/16 16:07:22 INFO ParallelReadMergeSorterImpl: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Record Processed For table: uniq_exclude_sp1
> 17/05/16 16:07:22 INFO SingleThreadFinalSortFilesMerger: [Executor task 
> launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Number of temp file: 1
> 17/05/16 16:07:22 INFO SingleThreadFinalSortFilesMerger: [Executor task 
> launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  File Buffer Size: 20971520
> 17/05/16 16:07:22 INFO SingleThreadFinalSortFilesMerger: [Executor task 
> launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Started adding first record from each file
> 17/05/16 16:07:22 INFO SingleThreadFinalSortFilesMerger: [Executor task 
> launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Heap Size1
> 17/05/16 16:07:22 INFO CarbonFactDataHandlerColumnar: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Initializing writer executors
> 17/05/16 16:07:22 INFO CarbonFactDataHandlerColumnar: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Number of rows per column blocklet 32000
> 17/05/16 16:07:22 INFO AbstractFactDataWriter: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Total file size: 1073741824 and dataBlock Size: 966367642
> 17/05/16 16:07:22 INFO CarbonFactDataHandlerColumnar: pool-43-thread-1 Number 
> Of records processed: 2013
> 17/05/16 16:07:22 INFO CarbonFactDataWriterImplV3: pool-44-thread-1 Number of 
> Pages for blocklet is: 1 :Rows Added: 2013
> 17/05/16 16:07:22 INFO DataWriterProcessorStepImpl: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Record Processed For table: uniq_exclude_sp1
> 17/05/16 16:07:22 INFO DataWriterProcessorStepImpl: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Finished Carbon DataWriterProcessorStepImpl: Read: 2013: Write: 2013
> 17/05/16 16:07:22 INFO CarbonFactDataHandlerColumnar: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  All blocklets have been finished writing
> 17/05/16 16:07:22 INFO AbstractFactDataWriter: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Copying 
> /tmp/42046473395440/0/default/uniq_exclude_sp1/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1494931040374.carbondata
>  --> 
> hdfs://localhost:54310/opt/prestocarbonStore/default/uniq_exclude_sp1/Fact/Part0/Segment_0
> 17/05/16 16:07:22 INFO AbstractFactDataWriter: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  The configured block size is 1024 MB, the actual carbon file size is 89 KB, 
> choose the max value 1024 MB as the block size on HDFS
> 17/05/16 16:07:23 INFO AbstractFactDataWriter: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Total copy time (ms) to copy file 
> /tmp/42046473395440/0/default/uniq_exclude_sp1/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1494931040374.carbondata
>  is 458
> 17/05/16 16:07:23 INFO AbstractFactDataWriter: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Copying 
> /tmp/42046473395440/0/default/uniq_exclude_sp1/Fact/Part0/Segment_0/0/0_batchno0-0-1494931040374.carbonindex
>  --> 
> hdfs://localhost:54310/opt/prestocarbonStore/default/uniq_exclude_sp1/Fact/Part0/Segment_0
> 17/05/16 16:07:23 INFO AbstractFactDataWriter: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  The configured block size is 1024 MB, the actual carbon file size is 2 KB, 
> choose the max value 1024 MB as the block size on HDFS
> 17/05/16 16:07:23 INFO AbstractFactDataWriter: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Total copy time (ms) to copy file 
> /tmp/42046473395440/0/default/uniq_exclude_sp1/Fact/Part0/Segment_0/0/0_batchno0-0-1494931040374.carbonindex
>  is 32
> 17/05/16 16:07:23 INFO AbstractDataLoadProcessorStep: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Total rows processed in step Data Writer: 2013
> 17/05/16 16:07:23 INFO AbstractDataLoadProcessorStep: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Total rows processed in step Sort Processor: 2013
> 17/05/16 16:07:23 INFO AbstractDataLoadProcessorStep: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Total rows processed in step Data Converter: 2013
> 17/05/16 16:07:23 INFO AbstractDataLoadProcessorStep: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Total rows processed in step Input Processor: 2013
> 17/05/16 16:07:27 INFO DataLoadExecutor: [Executor task launch 
> worker-1][partitionID:default_uniq_exclude_sp1_b1d312f0-b60b-4332-9272-7f05d5758c69]
>  Data loading is successful for table uniq_exclude_sp1
> 17/05/16 16:07:27 INFO CarbonLoaderUtil: pool-47-thread-1 Deleted the local 
> store location/tmp/42046473395440/0 : TIme taken: 1
> 17/05/16 16:07:27 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1543 
> bytes result sent to driver
> 17/05/16 16:07:27 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) 
> in 6921 ms on localhost (executor driver) (1/1)
> 17/05/16 16:07:27 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks 
> have all completed, from pool 
> 17/05/16 16:07:27 INFO DAGScheduler: ResultStage 1 (collect at 
> CarbonDataRDDFactory.scala:630) finished in 6.920 s
> 17/05/16 16:07:27 INFO DAGScheduler: Job 1 finished: collect at 
> CarbonDataRDDFactory.scala:630, took 6.932695 s
> 17/05/16 16:07:27 INFO CarbonDataRDDFactory$: pool-23-thread-4 
> ********starting clean up**********
> 17/05/16 16:07:27 ERROR CarbonDataRDDFactory$: pool-23-thread-4 Error while 
> writing dictionary file for default_uniq_exclude_sp1
> 17/05/16 16:07:27 ERROR LoadTable: pool-23-thread-4 
> java.lang.Exception: Dataload failed due to error while writing dictionary 
> file!
>       at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.writeDictionary(CarbonDataRDDFactory.scala:981)
>       at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:848)
>       at 
> org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:511)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>       at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>       at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>       at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>       at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
>       at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
>       at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
>       at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
>       at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
>       at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
>       at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
>       at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:220)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> 17/05/16 16:07:27 AUDIT LoadTable: [knoldus][hduser][Thread-137]Dataload 
> failure for default.uniq_exclude_sp1. Please check the logs
> 17/05/16 16:07:27 INFO HdfsFileLock: pool-23-thread-4 Deleted the lock file 
> hdfs://localhost:54310/opt/prestocarbonStore/default/uniq_exclude_sp1/meta.lock
> 17/05/16 16:07:27 INFO LoadTable: pool-23-thread-4 Table MetaData Unlocked 
> Successfully after data load
> 17/05/16 16:07:27 ERROR SparkExecuteStatementOperation: Error executing 
> query, currentState RUNNING, 
> java.lang.Exception: Dataload failed due to error while writing dictionary 
> file!
>       at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.writeDictionary(CarbonDataRDDFactory.scala:981)
>       at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:848)
>       at 
> org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:511)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>       at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>       at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>       at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>       at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
>       at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
>       at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
>       at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
>       at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
>       at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
>       at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
>       at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:220)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> 17/05/16 16:07:27 ERROR SparkExecuteStatementOperation: Error running hive 
> query: 
> org.apache.hive.service.cli.HiveSQLException: java.lang.Exception: Dataload 
> failed due to error while writing dictionary file!
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:258)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>       at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
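>
> For comparison, and to help isolate the failure, the same load can be re-run with the single-pass option disabled. This is a minimal sketch, not part of the original report, assuming the table and CSV from steps 1 and 2 are already in place:
>
> -- Hypothetical comparison load: identical options except single pass is disabled.
> -- If this succeeds while step 2 fails, the problem is confined to the
> -- single-pass dictionary-generation path (DictionaryClient/DictionaryServer).
> LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv'
> INTO TABLE uniq_exclude_sp1
> OPTIONS('DELIMITER'=',',
>         'QUOTECHAR'='"',
>         'BAD_RECORDS_ACTION'='FORCE',
>         'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1',
>         'SINGLE_PASS'='false');
>
> -- Sanity check on the row count (the writer log above reports 2013 rows written):
> SELECT COUNT(*) FROM uniq_exclude_sp1;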



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
