[
https://issues.apache.org/jira/browse/CARBONDATA-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vandana Yadav updated CARBONDATA-1540:
--------------------------------------
Description:
Memory issue while executing complex data type queries on the cluster:
Steps to reproduce:
1) Create a complex data type table:
create table Array_com (CUST_ID string, YEAR int, MONTH int, AGE int, GENDER
string, EDUCATED string, IS_MARRIED string, ARRAY_INT array<int>,ARRAY_STRING
array<string>,ARRAY_DATE array<timestamp>,CARD_COUNT int,DEBIT_COUNT int,
CREDIT_COUNT int, DEPOSIT double, HQ_DEPOSIT double) STORED BY
'org.apache.carbondata.format';
2) Load Data into the table:
LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/complex/Array.csv' INTO table
Array_com options ('DELIMITER'=',', 'QUOTECHAR'='"',
'FILEHEADER'='CUST_ID,YEAR,MONTH,AGE,GENDER,EDUCATED,IS_MARRIED,ARRAY_INT,ARRAY_STRING,ARRAY_DATE,CARD_COUNT,DEBIT_COUNT,CREDIT_COUNT,DEPOSIT,HQ_DEPOSIT','COMPLEX_DELIMITER_LEVEL_1'='$');
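For context, 'COMPLEX_DELIMITER_LEVEL_1'='$' means each array column in the CSV is a single field whose elements are separated by '$'. A row of Array.csv might look like the following (illustrative only; the actual data is in the attached Array.csv):

```
1,2015,1,35,M,yes,no,1$2$3,a$b$c,2015-01-01 00:00:00$2015-01-02 00:00:00,2,1,1,1000.5,2000.5
```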
3) Execute the Select Query:
select array_int[0], array_int[0]+ 10 as a from array_com
Expected Result: the select query should display the correct result.
Actual Result: Error: org.apache.spark.SparkException: Job aborted due to stage
failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3
in stage 0.0 (TID 3, slave1, executor 4): ExecutorLostFailure (executor 4
exited caused by one of the running tasks) Reason: Slave lost
Driver stacktrace: (state=,code=0)
Thrift server Log:
*** Error in `/usr/lib/jvm/java-8-oracle/bin/java': malloc(): memory
corruption: 0x00007f8b90292760 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f8bae53c7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8213e)[0x7f8bae54713e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f8bae549184]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x91df65)[0x7f8baddeff65]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x2e23a0)[0x7f8bad7b43a0]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x2e2444)[0x7f8bad7b4444]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x9740ba)[0x7f8bade460ba]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0xa3c710)[0x7f8badf0e710]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(JVM_FindClassFromBootLoader+0x22b)[0x7f8badbe953b]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/libjava.so(Java_java_lang_ClassLoader_findBootstrapClass+0x9b)[0x7f8bacb9589b]
[0x7f8b991fd5bc]
======= Memory map: ========
was:
Memory issue while executing complex data type queries on the cluster:
Steps to reproduce:
1) Create a complex data type table:
create table Array_com (CUST_ID string, YEAR int, MONTH int, AGE int, GENDER
string, EDUCATED string, IS_MARRIED string, ARRAY_INT array<int>,ARRAY_STRING
array<string>,ARRAY_DATE array<timestamp>,CARD_COUNT int,DEBIT_COUNT int,
CREDIT_COUNT int, DEPOSIT double, HQ_DEPOSIT double) STORED BY
'org.apache.carbondata.format';
2) Load Data into the table:
LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/complex/Array.csv' INTO table
Array_com options ('DELIMITER'=',', 'QUOTECHAR'='"',
'FILEHEADER'='CUST_ID,YEAR,MONTH,AGE,GENDER,EDUCATED,IS_MARRIED,ARRAY_INT,ARRAY_STRING,ARRAY_DATE,CARD_COUNT,DEBIT_COUNT,CREDIT_COUNT,DEPOSIT,HQ_DEPOSIT','COMPLEX_DELIMITER_LEVEL_1'='$');
3) Execute the Select Query:
select array_int[0], array_int[0]+ 10 as a from array_com
Expected Result: select query should display the correct result.
Actual Result: Error: org.apache.spark.SparkException: Job aborted due to stage
failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3
in stage 3.0 (TID 7, 148.251.7.173, executor 3): ExecutorLostFailure (executor
3 exited caused by one of the running tasks) Reason: Remote RPC client
disassociated. Likely due to containers exceeding thresholds, or network
issues. Check driver logs for WARN messages.
Driver stacktrace: (state=,code=0)
Thrift server Log:
17/10/09 11:03:33 INFO SparkExecuteStatementOperation: Running query 'select
array_int[0], array_int[0]+ 10 as a from array_com' with
5e2f2e3e-e737-496e-bb6f-31269aaed2be
17/10/09 11:03:33 INFO CarbonSparkSqlParser: Parsing command: select
array_int[0], array_int[0]+ 10 as a from array_com
17/10/09 11:03:33 INFO HiveMetaStore: 7: get_table : db=default tbl=array_com
17/10/09 11:03:33 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_table :
db=default tbl=array_com
17/10/09 11:03:33 INFO HiveMetaStore: 7: Opening raw store with implemenation
class:org.apache.hadoop.hive.metastore.ObjectStore
17/10/09 11:03:33 INFO ObjectStore: ObjectStore, initialize called
17/10/09 11:03:33 INFO Query: Reading in results for query
"org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is
closing
17/10/09 11:03:33 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is
DERBY
17/10/09 11:03:33 INFO ObjectStore: Initialized ObjectStore
17/10/09 11:03:33 INFO CatalystSqlParser: Parsing command: array<string>
17/10/09 11:03:33 INFO CarbonLateDecodeRule: pool-24-thread-6 Starting to
optimize plan
17/10/09 11:03:33 STATISTIC QueryStatisticsRecorderImpl: Time taken for Carbon
Optimizer to optimize: 15
17/10/09 11:03:33 INFO CarbonLateDecodeRule: pool-24-thread-6 Skip
CarbonOptimizer
17/10/09 11:03:33 INFO CodeGenerator: Code generated in 15.017295 ms
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_0_piece0 on
46.4.88.233:44387 in memory (size: 23.5 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_0_piece0 on
176.9.29.112:42871 in memory (size: 23.5 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO ContextCleaner: Cleaned accumulator 0
17/10/09 11:03:33 INFO ContextCleaner: Cleaned accumulator 1
17/10/09 11:03:33 INFO ContextCleaner: Cleaned shuffle 0
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_1_piece0 on
46.4.88.233:44387 in memory (size: 10.2 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_1_piece0 on
176.9.29.112:42871 in memory (size: 10.2 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO TableInfo: pool-24-thread-6 Table block size not
specified for default_array_com. Therefore considering the default value 1024 MB
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
46.4.88.233:44387 in memory (size: 3.8 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
176.9.29.112:42871 in memory (size: 3.8 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_3_piece0 on
46.4.88.233:44387 in memory (size: 23.2 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_3_piece0 on
46.4.88.233:36341 in memory (size: 23.2 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_4_piece0 on
46.4.88.233:44387 in memory (size: 7.3 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_4_piece0 on
46.4.88.233:36341 in memory (size: 7.3 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Working Memory
manager is created with size 536870912 with
org.apache.carbondata.core.memory.UnsafeMemoryAllocator@750f4066
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Memory block
(org.apache.carbondata.core.memory.MemoryBlock@2c5c1fce) is created with size
8388608. Total memory used 8388608Bytes, left 528482304Bytes
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Memory block
(org.apache.carbondata.core.memory.MemoryBlock@55d7a9fc) is created with size
511. Total memory used 8389119Bytes, left 528481793Bytes
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Freeing memory of
size: 8388608available memory: 536870401
17/10/09 11:03:33 STATISTIC DriverQueryStatisticsRecorderImpl: Print query
statistic for query id: 431867348397223
+--------+--------------------+---------------------+------------------------+
| Module| Operation Step| Total Query Cost| Query Cost|
+--------+--------------------+---------------------+------------------------+
| Driver| Load blocks driver| | 87 |
| +--------------------+ +------------------------+
| Part| Block allocation| 88 | 0 |
| +--------------------+ +------------------------+
| |Block identification| | 1 |
+--------+--------------------+---------------------+------------------------+
17/10/09 11:03:33 INFO CarbonScanRDD:
Identified no.of.blocks: 1,
no.of.tasks: 1,
no.of.nodes: 0,
parallelism: 24
17/10/09 11:03:33 INFO SparkContext: Starting job: run at
AccessController.java:0
17/10/09 11:03:33 INFO DAGScheduler: Got job 2 (run at AccessController.java:0)
with 1 output partitions
17/10/09 11:03:33 INFO DAGScheduler: Final stage: ResultStage 3 (run at
AccessController.java:0)
17/10/09 11:03:33 INFO DAGScheduler: Parents of final stage: List()
17/10/09 11:03:33 INFO DAGScheduler: Missing parents: List()
17/10/09 11:03:33 INFO DAGScheduler: Submitting ResultStage 3
(MapPartitionsRDD[19] at run at AccessController.java:0), which has no missing
parents
17/10/09 11:03:33 INFO MemoryStore: Block broadcast_5 stored as values in
memory (estimated size 10.3 KB, free 2004.6 MB)
17/10/09 11:03:33 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in
memory (estimated size 5.3 KB, free 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
46.4.88.233:44387 (size: 5.3 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO SparkContext: Created broadcast 5 from broadcast at
DAGScheduler.scala:996
17/10/09 11:03:33 INFO DAGScheduler: Submitting 1 missing tasks from
ResultStage 3 (MapPartitionsRDD[19] at run at AccessController.java:0)
17/10/09 11:03:33 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks
17/10/09 11:03:33 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 4,
46.4.88.233, executor 1, partition 0, ANY, 6807 bytes)
17/10/09 11:03:33 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
46.4.88.233:36341 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:05:34 ERROR TaskSchedulerImpl: Lost executor 1 on 46.4.88.233:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:05:34 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/1 is now EXITED (Command exited with code 134)
17/10/09 11:05:34 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/1 removed: Command exited with code 134
17/10/09 11:05:34 WARN TaskSetManager: Lost task 0.0 in stage 3.0 (TID 4,
46.4.88.233, executor 1): ExecutorLostFailure (executor 1 exited caused by one
of the running tasks) Reason: Remote RPC client disassociated. Likely due to
containers exceeding thresholds, or network issues. Check driver logs for WARN
messages.
17/10/09 11:05:34 INFO DAGScheduler: Executor lost: 1 (epoch 1)
17/10/09 11:05:34 INFO BlockManagerMaster: Removal of executor 1 requested
17/10/09 11:05:34 INFO BlockManagerMasterEndpoint: Trying to remove executor 1
from BlockManagerMaster.
17/10/09 11:05:34 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 1
17/10/09 11:05:34 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(1, 46.4.88.233, 36341, None)
17/10/09 11:05:34 INFO BlockManagerMasterEndpoint: Trying to remove executor 1
from BlockManagerMaster.
17/10/09 11:05:34 INFO TaskSetManager: Starting task 0.1 in stage 3.0 (TID 5,
148.251.7.173, executor 0, partition 0, ANY, 6807 bytes)
17/10/09 11:05:34 INFO BlockManagerMaster: Removed 1 successfully in
removeExecutor
17/10/09 11:05:34 INFO DAGScheduler: Shuffle files lost for executor: 1 (epoch
1)
17/10/09 11:05:34 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
148.251.7.173:37951 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:07:35 ERROR TaskSchedulerImpl: Lost executor 0 on 148.251.7.173:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:07:35 WARN TaskSetManager: Lost task 0.1 in stage 3.0 (TID 5,
148.251.7.173, executor 0): ExecutorLostFailure (executor 0 exited caused by
one of the running tasks) Reason: Remote RPC client disassociated. Likely due
to containers exceeding thresholds, or network issues. Check driver logs for
WARN messages.
17/10/09 11:07:35 INFO DAGScheduler: Executor lost: 0 (epoch 2)
17/10/09 11:07:35 INFO BlockManagerMasterEndpoint: Trying to remove executor 0
from BlockManagerMaster.
17/10/09 11:07:35 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(0, 148.251.7.173, 37951, None)
17/10/09 11:07:35 INFO BlockManagerMaster: Removed 0 successfully in
removeExecutor
17/10/09 11:07:35 INFO DAGScheduler: Shuffle files lost for executor: 0 (epoch
2)
17/10/09 11:07:35 INFO TaskSetManager: Starting task 0.2 in stage 3.0 (TID 6,
176.9.29.112, executor 2, partition 0, ANY, 6807 bytes)
17/10/09 11:07:35 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/0 is now EXITED (Command exited with code 134)
17/10/09 11:07:35 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/0 removed: Command exited with code 134
17/10/09 11:07:35 INFO BlockManagerMasterEndpoint: Trying to remove executor 0
from BlockManagerMaster.
17/10/09 11:07:35 INFO BlockManagerMaster: Removal of executor 0 requested
17/10/09 11:07:35 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 0
17/10/09 11:07:35 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
176.9.29.112:42871 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:09:36 ERROR TaskSchedulerImpl: Lost executor 2 on 176.9.29.112:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:09:36 WARN TaskSetManager: Lost task 0.2 in stage 3.0 (TID 6,
176.9.29.112, executor 2): ExecutorLostFailure (executor 2 exited caused by one
of the running tasks) Reason: Remote RPC client disassociated. Likely due to
containers exceeding thresholds, or network issues. Check driver logs for WARN
messages.
17/10/09 11:09:36 INFO DAGScheduler: Executor lost: 2 (epoch 3)
17/10/09 11:09:36 INFO BlockManagerMasterEndpoint: Trying to remove executor 2
from BlockManagerMaster.
17/10/09 11:09:36 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(2, 176.9.29.112, 42871, None)
17/10/09 11:09:36 INFO BlockManagerMaster: Removed 2 successfully in
removeExecutor
17/10/09 11:09:36 INFO DAGScheduler: Shuffle files lost for executor: 2 (epoch
3)
17/10/09 11:09:36 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/2 is now EXITED (Command exited with code 134)
17/10/09 11:09:36 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/2 removed: Command exited with code 134
17/10/09 11:09:36 INFO BlockManagerMaster: Removal of executor 2 requested
17/10/09 11:09:36 INFO BlockManagerMasterEndpoint: Trying to remove executor 2
from BlockManagerMaster.
17/10/09 11:09:36 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 2
17/10/09 11:09:36 INFO StandaloneAppClient$ClientEndpoint: Executor added:
app-20171009104414-0003/3 on worker-20171009095843-148.251.7.173-33995
(148.251.7.173:33995) with 8 cores
17/10/09 11:09:36 INFO StandaloneSchedulerBackend: Granted executor ID
app-20171009104414-0003/3 on hostPort 148.251.7.173:33995 with 8 cores, 7.0 GB
RAM
17/10/09 11:09:36 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/3 is now RUNNING
17/10/09 11:09:37 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered
executor NettyRpcEndpointRef(null) (148.251.7.173:56362) with ID 3
17/10/09 11:09:37 INFO TaskSetManager: Starting task 0.3 in stage 3.0 (TID 7,
148.251.7.173, executor 3, partition 0, ANY, 6807 bytes)
17/10/09 11:09:37 INFO BlockManagerMasterEndpoint: Registering block manager
148.251.7.173:40003 with 3.6 GB RAM, BlockManagerId(3, 148.251.7.173, 40003,
None)
17/10/09 11:09:38 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
148.251.7.173:40003 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:11:39 ERROR TaskSchedulerImpl: Lost executor 3 on 148.251.7.173:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:11:39 WARN TaskSetManager: Lost task 0.3 in stage 3.0 (TID 7,
148.251.7.173, executor 3): ExecutorLostFailure (executor 3 exited caused by
one of the running tasks) Reason: Remote RPC client disassociated. Likely due
to containers exceeding thresholds, or network issues. Check driver logs for
WARN messages.
17/10/09 11:11:39 ERROR TaskSetManager: Task 0 in stage 3.0 failed 4 times;
aborting job
17/10/09 11:11:39 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have
all completed, from pool
17/10/09 11:11:39 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/3 is now EXITED (Command exited with code 134)
17/10/09 11:11:39 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/3 removed: Command exited with code 134
17/10/09 11:11:39 INFO BlockManagerMaster: Removal of executor 3 requested
17/10/09 11:11:39 INFO BlockManagerMasterEndpoint: Trying to remove executor 3
from BlockManagerMaster.
17/10/09 11:11:39 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 3
17/10/09 11:11:39 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(3, 148.251.7.173, 40003, None)
17/10/09 11:11:39 INFO StandaloneAppClient$ClientEndpoint: Executor added:
app-20171009104414-0003/4 on worker-20171009095843-148.251.7.173-33995
(148.251.7.173:33995) with 8 cores
17/10/09 11:11:39 INFO StandaloneSchedulerBackend: Granted executor ID
app-20171009104414-0003/4 on hostPort 148.251.7.173:33995 with 8 cores, 7.0 GB
RAM
17/10/09 11:11:39 INFO TaskSchedulerImpl: Cancelling stage 3
17/10/09 11:11:39 INFO DAGScheduler: ResultStage 3 (run at
AccessController.java:0) failed in 485.691 s due to Job aborted due to stage
failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3
in stage 3.0 (TID 7, 148.251.7.173, executor 3): ExecutorLostFailure (executor
3 exited caused by one of the running tasks) Reason: Remote RPC client
disassociated. Likely due to containers exceeding thresholds, or network
issues. Check driver logs for WARN messages.
Driver stacktrace:
17/10/09 11:11:39 INFO DAGScheduler: Job 2 failed: run at
AccessController.java:0, took 485.699162 s
17/10/09 11:11:39 INFO DAGScheduler: Executor lost: 3 (epoch 4)
17/10/09 11:11:39 INFO BlockManagerMasterEndpoint: Trying to remove executor 3
from BlockManagerMaster.
17/10/09 11:11:39 INFO BlockManagerMaster: Removed 3 successfully in
removeExecutor
17/10/09 11:11:39 INFO DAGScheduler: Shuffle files lost for executor: 3 (epoch
4)
17/10/09 11:11:39 ERROR SparkExecuteStatementOperation: Error executing query,
currentState RUNNING,
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID
7, 148.251.7.173, executor 3): ExecutorLostFailure (executor 3 exited caused by
one of the running tasks) Reason: Remote RPC client disassociated. Likely due
to containers exceeding thresholds, or network issues. Check driver logs for
WARN messages.
Driver stacktrace:
at
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
at scala.Option.foreach(Option.scala:257)
at
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1918)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1931)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1944)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:935)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.collect(RDD.scala:934)
at
org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:275)
at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2371)
at
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2765)
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2370)
at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2778)
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2375)
at org.apache.spark.sql.Dataset.collect(Dataset.scala:2351)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:235)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
17/10/09 11:11:39 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/4 is now RUNNING
17/10/09 11:11:39 ERROR SparkExecuteStatementOperation: Error running hive
query:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.SparkException:
Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most
recent failure: Lost task 0.3 in stage 3.0 (TID 7, 148.251.7.173, executor 3):
ExecutorLostFailure (executor 3 exited caused by one of the running tasks)
Reason: Remote RPC client disassociated. Likely due to containers exceeding
thresholds, or network issues. Check driver logs for WARN messages.
Driver stacktrace:
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:258)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
17/10/09 11:11:41 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered
executor NettyRpcEndpointRef(null) (148.251.7.173:56378) with ID 4
17/10/09 11:11:41 INFO BlockManagerMasterEndpoint: Registering block manager
148.251.7.173:40125 with 3.6 GB RAM, BlockManagerId(4, 148.251.7.173, 40125,
None)
17/10/09 11:14:14 INFO BlockManagerInfo: Removed broadcast_5_piece0 on
46.4.88.233:44387 in memory (size: 5.3 KB, free: 2004.6 MB)
17/10/09 11:14:14 INFO ContextCleaner: Cleaned accumulator 171
17/10/09 11:14:14 INFO ContextCleaner: Cleaned accumulator 170
> Memory issue while executing complex data type queries on cluster
> -----------------------------------------------------------------
>
> Key: CARBONDATA-1540
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1540
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.2.0
> Environment: spark 2.1
> Reporter: Vandana Yadav
> Attachments: Array.csv
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)