Re: Re: Getting [Problem in loading segment blocks] error after doing multi update operations

yixu2001 Fri, 23 Mar 2018 01:02:36 -0700

dev 
 This issue has caused serious trouble in our production environment. I would
appreciate it if you could let me know whether there is a plan to fix it.



yixu2001
 
From: BabuLal
Date: 2018-03-23 00:20
To: dev
Subject: Re: Getting [Problem in loading segment blocks] error after doing 
multi update operations
Hi all,
I am able to reproduce the same exception in my cluster (the trace is listed
below).
------ 
scala> carbon.sql("select count(*) from public.c_compact4").show
2018-03-22 20:40:33,105 | WARN  | main | main spark.sql.sources.options.keys
expected, but read nothing |
org.apache.carbondata.common.logging.impl.StandardLogService.logWarnMessage(StandardLogService.java:168)
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute,
tree:
Exchange SinglePartition
+- *HashAggregate(keys=[], functions=[partial_count(1)],
output=[count#1443L])
   +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :public,
Table name :c_compact4, Schema
:Some(StructType(StructField(id,StringType,true),
StructField(qqnum,StringType,true), StructField(nick,StringType,true),
StructField(age,StringType,true), StructField(gender,StringType,true),
StructField(auth,StringType,true), StructField(qunnum,StringType,true),
StructField(mvcc,StringType,true))) ] public.c_compact4[]
  at
org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
  at
org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:112)
  at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
  at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at
org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:235)
  at
org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
  at
org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:372)
  at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
  at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at
org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225)
  at
org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308)
  at
org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:113)
  at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2386)
  at
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
  at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
  at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2385)
  at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2392)
  at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2128)
  at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2818)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2342)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:248)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:638)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:597)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:606)
  ... 48 elided
Caused by: java.io.IOException: Problem in loading segment blocks.
  at
org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:153)
  at
org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:76)
  at
org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:72)
  at
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getDataBlocksOfSegment(CarbonTableInputFormat.java:739)
  at
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:666)
  at
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:426)
  at
org.apache.carbondata.spark.rdd.CarbonScanRDD.getPartitions(CarbonScanRDD.scala:107)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
  at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
  at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
  at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:91)
  at
org.apache.spark.sql.execution.exchange.ShuffleExchange$.prepareShuffleDependency(ShuffleExchange.scala:273)
  at
org.apache.spark.sql.execution.exchange.ShuffleExchange.prepareShuffleDependency(ShuffleExchange.scala:84)
  at
org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:121)
  at
org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:112)
  at
org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
  ... 81 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  at
org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.getLocations(AbstractDFSCarbonFile.java:509)
  at
org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:142)
 
---------------- Store location ----------------
linux-49:/opt/babu # hadoop fs -ls
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/*.deletedelta
-rw-rw-r--+  3 hdfs hive     177216 2018-03-22 18:20
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-0_batchno0-0-1521723019528.deletedelta
-rw-r--r--   3 hdfs hive          0 2018-03-22 19:35
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-0_batchno0-0-1521723886214.deletedelta
-rw-rw-r--+  3 hdfs hive      87989 2018-03-22 18:20
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-1_batchno0-0-1521723019528.deletedelta
-rw-r--r--   3 hdfs hive          0 2018-03-22 19:35
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-1_batchno0-0-1521723886214.deletedelta
-rw-rw-r--+  3 hdfs hive      87989 2018-03-22 18:20
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-2_batchno0-0-1521723019528.deletedelta
-rw-r--r--   3 hdfs hive          0 2018-03-22 19:35
/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-2_batchno0-0-1521723886214.deletedelta
 
 
-----------------------------------------------------------
 
How the issue was reproduced:
The delete delta file was created successfully, but writing its content
failed. The failure happened during Horizontal Compaction (I set a space quota
in HDFS with setSpaceQuota so that the file could be created but the write to
it would fail).
*The points below need to be handled to fix this issue.*
 
1. When Horizontal Compaction fails, the 0-byte delete delta file should be
deleted. Currently it is not deleted. This is the cleanup part of a
Horizontal Compaction failure.
2. A 0-byte delete delta file should not be considered while reading (we can
discuss this solution further). Currently the tablestatus file has an
entry for the delete delta timestamp.
3. If a delete is in progress, the file has been created (the NameNode has an
entry for the file) but the data is still being written (not yet flushed),
and a select query is triggered at the same time, then the query will fail.
This scenario also needs to be handled.
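
To make points 1 and 2 concrete, here is a minimal sketch of the idea. It is
not CarbonData code: the class and method names are hypothetical, and it uses
java.nio.file on a local directory as a stand-in for the HDFS segment folder.
It only shows the two checks: skip 0-byte .deletedelta files when collecting
files to read, and remove them as cleanup after a failed compaction.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of CarbonData).
final class DeleteDeltaFiles {

    private static final String DELETE_DELTA_GLOB = "*.deletedelta";

    // Point 2: return only the delete delta files that actually contain data,
    // so a 0-byte file is never handed to the reader.
    static List<Path> readableDeleteDeltas(Path segmentDir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(segmentDir, DELETE_DELTA_GLOB)) {
            for (Path file : files) {
                if (Files.size(file) > 0) {
                    result.add(file);
                }
            }
        }
        return result;
    }

    // Point 1: cleanup after a failed Horizontal Compaction. Collects the
    // 0-byte delete delta files first, then deletes them, and returns how
    // many were removed.
    static int removeEmptyDeleteDeltas(Path segmentDir) throws IOException {
        List<Path> empty = new ArrayList<>();
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(segmentDir, DELETE_DELTA_GLOB)) {
            for (Path file : files) {
                if (Files.size(file) == 0) {
                    empty.add(file);
                }
            }
        }
        for (Path file : empty) {
            Files.delete(file);
        }
        return empty.size();
    }
}
```

Note that a size check alone does not cover point 3: a file that is still
being written can also show as 0 bytes, so the cleanup should only run when
no delete/update is in progress on the table.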
 
@dev: Please let me know if any other details are needed.
 
Thanks
Babu
 
 
 
--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
