[
https://issues.apache.org/jira/browse/CARBONDATA-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vandana Yadav updated CARBONDATA-1540:
--------------------------------------
Description:
Memory issue while executing complex data type queries on the cluster:
Steps to reproduce:
1) Create a complex data type table:
create table Array_com (CUST_ID string, YEAR int, MONTH int, AGE int, GENDER
string, EDUCATED string, IS_MARRIED string, ARRAY_INT array<int>,ARRAY_STRING
array<string>,ARRAY_DATE array<timestamp>,CARD_COUNT int,DEBIT_COUNT int,
CREDIT_COUNT int, DEPOSIT double, HQ_DEPOSIT double) STORED BY
'org.apache.carbondata.format';
2) Load Data into the table:
LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/complex/Array.csv' INTO table
Array_com options ('DELIMITER'=',', 'QUOTECHAR'='"',
'FILEHEADER'='CUST_ID,YEAR,MONTH,AGE,GENDER,EDUCATED,IS_MARRIED,ARRAY_INT,ARRAY_STRING,ARRAY_DATE,CARD_COUNT,DEBIT_COUNT,CREDIT_COUNT,DEPOSIT,HQ_DEPOSIT','COMPLEX_DELIMITER_LEVEL_1'='$');
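For context, 'COMPLEX_DELIMITER_LEVEL_1'='$' means each array column in the CSV is a single field whose elements are separated by '$'. A row of Array.csv might look like the following (illustrative only; the actual data is in the attached Array.csv):

```
1,2015,1,35,M,yes,no,1$2$3,a$b$c,2015-01-01 00:00:00$2015-01-02 00:00:00,2,1,1,1000.5,2000.5
```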
3) Execute the Select Query:
select array_int[0], array_int[0]+ 10 as a from array_com
Expected Result: the select query should display the correct result.
Actual Result: Error: org.apache.spark.SparkException: Job aborted due to stage
failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3
in stage 0.0 (TID 3, slave1, executor 4): ExecutorLostFailure (executor 4
exited caused by one of the running tasks) Reason: Slave lost
Driver stacktrace: (state=,code=0)
Thrift server Log:
*** Error in `/usr/lib/jvm/java-8-oracle/bin/java': malloc(): memory
corruption: 0x00007f8b90292760 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f8bae53c7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8213e)[0x7f8bae54713e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f8bae549184]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x91df65)[0x7f8baddeff65]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x2e23a0)[0x7f8bad7b43a0]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x2e2444)[0x7f8bad7b4444]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0x9740ba)[0x7f8bade460ba]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(+0xa3c710)[0x7f8badf0e710]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/server/libjvm.so(JVM_FindClassFromBootLoader+0x22b)[0x7f8badbe953b]
/usr/lib/jvm/java-8-oracle/jre/lib/amd64/libjava.so(Java_java_lang_ClassLoader_findBootstrapClass+0x9b)[0x7f8bacb9589b]
[0x7f8b991fd5bc]
======= Memory map: ========
was:
Memory issue while executing complex data type queries on the cluster:
Steps to reproduce:
1) Create a complex data type table:
create table Array_com (CUST_ID string, YEAR int, MONTH int, AGE int, GENDER
string, EDUCATED string, IS_MARRIED string, ARRAY_INT array<int>,ARRAY_STRING
array<string>,ARRAY_DATE array<timestamp>,CARD_COUNT int,DEBIT_COUNT int,
CREDIT_COUNT int, DEPOSIT double, HQ_DEPOSIT double) STORED BY
'org.apache.carbondata.format';
2) Load Data into the table:
LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/complex/Array.csv' INTO table
Array_com options ('DELIMITER'=',', 'QUOTECHAR'='"',
'FILEHEADER'='CUST_ID,YEAR,MONTH,AGE,GENDER,EDUCATED,IS_MARRIED,ARRAY_INT,ARRAY_STRING,ARRAY_DATE,CARD_COUNT,DEBIT_COUNT,CREDIT_COUNT,DEPOSIT,HQ_DEPOSIT','COMPLEX_DELIMITER_LEVEL_1'='$');
3) Execute the Select Query:
select array_int[0], array_int[0]+ 10 as a from array_com
Expected Result: select query should display the correct result.
Actual Result: Error: org.apache.spark.SparkException: Job aborted due to stage
failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3
in stage 3.0 (TID 7, 148.251.7.173, executor 3): ExecutorLostFailure (executor
3 exited caused by one of the running tasks) Reason: Remote RPC client
disassociated. Likely due to containers exceeding thresholds, or network
issues. Check driver logs for WARN messages.
Driver stacktrace: (state=,code=0)
Thrift server Log:
17/10/09 11:03:33 INFO SparkExecuteStatementOperation: Running query 'select
array_int[0], array_int[0]+ 10 as a from array_com' with
5e2f2e3e-e737-496e-bb6f-31269aaed2be
17/10/09 11:03:33 INFO CarbonSparkSqlParser: Parsing command: select
array_int[0], array_int[0]+ 10 as a from array_com
17/10/09 11:03:33 INFO HiveMetaStore: 7: get_table : db=default tbl=array_com
17/10/09 11:03:33 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_table :
db=default tbl=array_com
17/10/09 11:03:33 INFO HiveMetaStore: 7: Opening raw store with implemenation
class:org.apache.hadoop.hive.metastore.ObjectStore
17/10/09 11:03:33 INFO ObjectStore: ObjectStore, initialize called
17/10/09 11:03:33 INFO Query: Reading in results for query
"org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is
closing
17/10/09 11:03:33 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is
DERBY
17/10/09 11:03:33 INFO ObjectStore: Initialized ObjectStore
17/10/09 11:03:33 INFO CatalystSqlParser: Parsing command: array<string>
17/10/09 11:03:33 INFO CarbonLateDecodeRule: pool-24-thread-6 Starting to
optimize plan
17/10/09 11:03:33 STATISTIC QueryStatisticsRecorderImpl: Time taken for Carbon
Optimizer to optimize: 15
17/10/09 11:03:33 INFO CarbonLateDecodeRule: pool-24-thread-6 Skip
CarbonOptimizer
17/10/09 11:03:33 INFO CodeGenerator: Code generated in 15.017295 ms
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_0_piece0 on
46.4.88.233:44387 in memory (size: 23.5 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_0_piece0 on
176.9.29.112:42871 in memory (size: 23.5 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO ContextCleaner: Cleaned accumulator 0
17/10/09 11:03:33 INFO ContextCleaner: Cleaned accumulator 1
17/10/09 11:03:33 INFO ContextCleaner: Cleaned shuffle 0
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_1_piece0 on
46.4.88.233:44387 in memory (size: 10.2 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_1_piece0 on
176.9.29.112:42871 in memory (size: 10.2 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO TableInfo: pool-24-thread-6 Table block size not
specified for default_array_com. Therefore considering the default value 1024 MB
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
46.4.88.233:44387 in memory (size: 3.8 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_2_piece0 on
176.9.29.112:42871 in memory (size: 3.8 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_3_piece0 on
46.4.88.233:44387 in memory (size: 23.2 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_3_piece0 on
46.4.88.233:36341 in memory (size: 23.2 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_4_piece0 on
46.4.88.233:44387 in memory (size: 7.3 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Removed broadcast_4_piece0 on
46.4.88.233:36341 in memory (size: 7.3 KB, free: 3.6 GB)
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Working Memory
manager is created with size 536870912 with
org.apache.carbondata.core.memory.UnsafeMemoryAllocator@750f4066
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Memory block
(org.apache.carbondata.core.memory.MemoryBlock@2c5c1fce) is created with size
8388608. Total memory used 8388608Bytes, left 528482304Bytes
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Memory block
(org.apache.carbondata.core.memory.MemoryBlock@55d7a9fc) is created with size
511. Total memory used 8389119Bytes, left 528481793Bytes
17/10/09 11:03:33 INFO UnsafeMemoryManager: pool-24-thread-6 Freeing memory of
size: 8388608available memory: 536870401
17/10/09 11:03:33 STATISTIC DriverQueryStatisticsRecorderImpl: Print query
statistic for query id: 431867348397223
+--------+--------------------+---------------------+------------------------+
| Module| Operation Step| Total Query Cost| Query Cost|
+--------+--------------------+---------------------+------------------------+
| Driver| Load blocks driver| | 87 |
| +--------------------+ +------------------------+
| Part| Block allocation| 88 | 0 |
| +--------------------+ +------------------------+
| |Block identification| | 1 |
+--------+--------------------+---------------------+------------------------+
17/10/09 11:03:33 INFO CarbonScanRDD:
Identified no.of.blocks: 1,
no.of.tasks: 1,
no.of.nodes: 0,
parallelism: 24
17/10/09 11:03:33 INFO SparkContext: Starting job: run at
AccessController.java:0
17/10/09 11:03:33 INFO DAGScheduler: Got job 2 (run at AccessController.java:0)
with 1 output partitions
17/10/09 11:03:33 INFO DAGScheduler: Final stage: ResultStage 3 (run at
AccessController.java:0)
17/10/09 11:03:33 INFO DAGScheduler: Parents of final stage: List()
17/10/09 11:03:33 INFO DAGScheduler: Missing parents: List()
17/10/09 11:03:33 INFO DAGScheduler: Submitting ResultStage 3
(MapPartitionsRDD[19] at run at AccessController.java:0), which has no missing
parents
17/10/09 11:03:33 INFO MemoryStore: Block broadcast_5 stored as values in
memory (estimated size 10.3 KB, free 2004.6 MB)
17/10/09 11:03:33 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in
memory (estimated size 5.3 KB, free 2004.6 MB)
17/10/09 11:03:33 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
46.4.88.233:44387 (size: 5.3 KB, free: 2004.6 MB)
17/10/09 11:03:33 INFO SparkContext: Created broadcast 5 from broadcast at
DAGScheduler.scala:996
17/10/09 11:03:33 INFO DAGScheduler: Submitting 1 missing tasks from
ResultStage 3 (MapPartitionsRDD[19] at run at AccessController.java:0)
17/10/09 11:03:33 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks
17/10/09 11:03:33 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 4,
46.4.88.233, executor 1, partition 0, ANY, 6807 bytes)
17/10/09 11:03:33 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
46.4.88.233:36341 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:05:34 ERROR TaskSchedulerImpl: Lost executor 1 on 46.4.88.233:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:05:34 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/1 is now EXITED (Command exited with code 134)
17/10/09 11:05:34 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/1 removed: Command exited with code 134
17/10/09 11:05:34 WARN TaskSetManager: Lost task 0.0 in stage 3.0 (TID 4,
46.4.88.233, executor 1): ExecutorLostFailure (executor 1 exited caused by one
of the running tasks) Reason: Remote RPC client disassociated. Likely due to
containers exceeding thresholds, or network issues. Check driver logs for WARN
messages.
17/10/09 11:05:34 INFO DAGScheduler: Executor lost: 1 (epoch 1)
17/10/09 11:05:34 INFO BlockManagerMaster: Removal of executor 1 requested
17/10/09 11:05:34 INFO BlockManagerMasterEndpoint: Trying to remove executor 1
from BlockManagerMaster.
17/10/09 11:05:34 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 1
17/10/09 11:05:34 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(1, 46.4.88.233, 36341, None)
17/10/09 11:05:34 INFO BlockManagerMasterEndpoint: Trying to remove executor 1
from BlockManagerMaster.
17/10/09 11:05:34 INFO TaskSetManager: Starting task 0.1 in stage 3.0 (TID 5,
148.251.7.173, executor 0, partition 0, ANY, 6807 bytes)
17/10/09 11:05:34 INFO BlockManagerMaster: Removed 1 successfully in
removeExecutor
17/10/09 11:05:34 INFO DAGScheduler: Shuffle files lost for executor: 1 (epoch
1)
17/10/09 11:05:34 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
148.251.7.173:37951 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:07:35 ERROR TaskSchedulerImpl: Lost executor 0 on 148.251.7.173:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:07:35 WARN TaskSetManager: Lost task 0.1 in stage 3.0 (TID 5,
148.251.7.173, executor 0): ExecutorLostFailure (executor 0 exited caused by
one of the running tasks) Reason: Remote RPC client disassociated. Likely due
to containers exceeding thresholds, or network issues. Check driver logs for
WARN messages.
17/10/09 11:07:35 INFO DAGScheduler: Executor lost: 0 (epoch 2)
17/10/09 11:07:35 INFO BlockManagerMasterEndpoint: Trying to remove executor 0
from BlockManagerMaster.
17/10/09 11:07:35 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(0, 148.251.7.173, 37951, None)
17/10/09 11:07:35 INFO BlockManagerMaster: Removed 0 successfully in
removeExecutor
17/10/09 11:07:35 INFO DAGScheduler: Shuffle files lost for executor: 0 (epoch
2)
17/10/09 11:07:35 INFO TaskSetManager: Starting task 0.2 in stage 3.0 (TID 6,
176.9.29.112, executor 2, partition 0, ANY, 6807 bytes)
17/10/09 11:07:35 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/0 is now EXITED (Command exited with code 134)
17/10/09 11:07:35 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/0 removed: Command exited with code 134
17/10/09 11:07:35 INFO BlockManagerMasterEndpoint: Trying to remove executor 0
from BlockManagerMaster.
17/10/09 11:07:35 INFO BlockManagerMaster: Removal of executor 0 requested
17/10/09 11:07:35 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 0
17/10/09 11:07:35 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
176.9.29.112:42871 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:09:36 ERROR TaskSchedulerImpl: Lost executor 2 on 176.9.29.112:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:09:36 WARN TaskSetManager: Lost task 0.2 in stage 3.0 (TID 6,
176.9.29.112, executor 2): ExecutorLostFailure (executor 2 exited caused by one
of the running tasks) Reason: Remote RPC client disassociated. Likely due to
containers exceeding thresholds, or network issues. Check driver logs for WARN
messages.
17/10/09 11:09:36 INFO DAGScheduler: Executor lost: 2 (epoch 3)
17/10/09 11:09:36 INFO BlockManagerMasterEndpoint: Trying to remove executor 2
from BlockManagerMaster.
17/10/09 11:09:36 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(2, 176.9.29.112, 42871, None)
17/10/09 11:09:36 INFO BlockManagerMaster: Removed 2 successfully in
removeExecutor
17/10/09 11:09:36 INFO DAGScheduler: Shuffle files lost for executor: 2 (epoch
3)
17/10/09 11:09:36 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/2 is now EXITED (Command exited with code 134)
17/10/09 11:09:36 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/2 removed: Command exited with code 134
17/10/09 11:09:36 INFO BlockManagerMaster: Removal of executor 2 requested
17/10/09 11:09:36 INFO BlockManagerMasterEndpoint: Trying to remove executor 2
from BlockManagerMaster.
17/10/09 11:09:36 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 2
17/10/09 11:09:36 INFO StandaloneAppClient$ClientEndpoint: Executor added:
app-20171009104414-0003/3 on worker-20171009095843-148.251.7.173-33995
(148.251.7.173:33995) with 8 cores
17/10/09 11:09:36 INFO StandaloneSchedulerBackend: Granted executor ID
app-20171009104414-0003/3 on hostPort 148.251.7.173:33995 with 8 cores, 7.0 GB
RAM
17/10/09 11:09:36 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/3 is now RUNNING
17/10/09 11:09:37 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered
executor NettyRpcEndpointRef(null) (148.251.7.173:56362) with ID 3
17/10/09 11:09:37 INFO TaskSetManager: Starting task 0.3 in stage 3.0 (TID 7,
148.251.7.173, executor 3, partition 0, ANY, 6807 bytes)
17/10/09 11:09:37 INFO BlockManagerMasterEndpoint: Registering block manager
148.251.7.173:40003 with 3.6 GB RAM, BlockManagerId(3, 148.251.7.173, 40003,
None)
17/10/09 11:09:38 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on
148.251.7.173:40003 (size: 5.3 KB, free: 3.6 GB)
17/10/09 11:11:39 ERROR TaskSchedulerImpl: Lost executor 3 on 148.251.7.173:
Remote RPC client disassociated. Likely due to containers exceeding thresholds,
or network issues. Check driver logs for WARN messages.
17/10/09 11:11:39 WARN TaskSetManager: Lost task 0.3 in stage 3.0 (TID 7,
148.251.7.173, executor 3): ExecutorLostFailure (executor 3 exited caused by
one of the running tasks) Reason: Remote RPC client disassociated. Likely due
to containers exceeding thresholds, or network issues. Check driver logs for
WARN messages.
17/10/09 11:11:39 ERROR TaskSetManager: Task 0 in stage 3.0 failed 4 times;
aborting job
17/10/09 11:11:39 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have
all completed, from pool
17/10/09 11:11:39 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/3 is now EXITED (Command exited with code 134)
17/10/09 11:11:39 INFO StandaloneSchedulerBackend: Executor
app-20171009104414-0003/3 removed: Command exited with code 134
17/10/09 11:11:39 INFO BlockManagerMaster: Removal of executor 3 requested
17/10/09 11:11:39 INFO BlockManagerMasterEndpoint: Trying to remove executor 3
from BlockManagerMaster.
17/10/09 11:11:39 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to
remove non-existent executor 3
17/10/09 11:11:39 INFO BlockManagerMasterEndpoint: Removing block manager
BlockManagerId(3, 148.251.7.173, 40003, None)
17/10/09 11:11:39 INFO StandaloneAppClient$ClientEndpoint: Executor added:
app-20171009104414-0003/4 on worker-20171009095843-148.251.7.173-33995
(148.251.7.173:33995) with 8 cores
17/10/09 11:11:39 INFO StandaloneSchedulerBackend: Granted executor ID
app-20171009104414-0003/4 on hostPort 148.251.7.173:33995 with 8 cores, 7.0 GB
RAM
17/10/09 11:11:39 INFO TaskSchedulerImpl: Cancelling stage 3
17/10/09 11:11:39 INFO DAGScheduler: ResultStage 3 (run at
AccessController.java:0) failed in 485.691 s due to Job aborted due to stage
failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3
in stage 3.0 (TID 7, 148.251.7.173, executor 3): ExecutorLostFailure (executor
3 exited caused by one of the running tasks) Reason: Remote RPC client
disassociated. Likely due to containers exceeding thresholds, or network
issues. Check driver logs for WARN messages.
Driver stacktrace:
17/10/09 11:11:39 INFO DAGScheduler: Job 2 failed: run at
AccessController.java:0, took 485.699162 s
17/10/09 11:11:39 INFO DAGScheduler: Executor lost: 3 (epoch 4)
17/10/09 11:11:39 INFO BlockManagerMasterEndpoint: Trying to remove executor 3
from BlockManagerMaster.
17/10/09 11:11:39 INFO BlockManagerMaster: Removed 3 successfully in
removeExecutor
17/10/09 11:11:39 INFO DAGScheduler: Shuffle files lost for executor: 3 (epoch
4)
17/10/09 11:11:39 ERROR SparkExecuteStatementOperation: Error executing query,
currentState RUNNING,
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID
7, 148.251.7.173, executor 3): ExecutorLostFailure (executor 3 exited caused by
one of the running tasks) Reason: Remote RPC client disassociated. Likely due
to containers exceeding thresholds, or network issues. Check driver logs for
WARN messages.
Driver stacktrace:
at
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
at scala.Option.foreach(Option.scala:257)
at
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1918)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1931)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1944)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:935)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.collect(RDD.scala:934)
at
org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:275)
at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2371)
at
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2765)
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2370)
at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
at
org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$collect$1.apply(Dataset.scala:2375)
at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2778)
at
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2375)
at org.apache.spark.sql.Dataset.collect(Dataset.scala:2351)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:235)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
17/10/09 11:11:39 INFO StandaloneAppClient$ClientEndpoint: Executor updated:
app-20171009104414-0003/4 is now RUNNING
17/10/09 11:11:39 ERROR SparkExecuteStatementOperation: Error running hive
query:
org.apache.hive.service.cli.HiveSQLException: org.apache.spark.SparkException:
Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most
recent failure: Lost task 0.3 in stage 3.0 (TID 7, 148.251.7.173, executor 3):
ExecutorLostFailure (executor 3 exited caused by one of the running tasks)
Reason: Remote RPC client disassociated. Likely due to containers exceeding
thresholds, or network issues. Check driver logs for WARN messages.
Driver stacktrace:
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:258)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
17/10/09 11:11:41 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered
executor NettyRpcEndpointRef(null) (148.251.7.173:56378) with ID 4
17/10/09 11:11:41 INFO BlockManagerMasterEndpoint: Registering block manager
148.251.7.173:40125 with 3.6 GB RAM, BlockManagerId(4, 148.251.7.173, 40125,
None)
17/10/09 11:14:14 INFO BlockManagerInfo: Removed broadcast_5_piece0 on
46.4.88.233:44387 in memory (size: 5.3 KB, free: 2004.6 MB)
17/10/09 11:14:14 INFO ContextCleaner: Cleaned accumulator 171
17/10/09 11:14:14 INFO ContextCleaner: Cleaned accumulator 170
> Memory issue while executing complex data type queries on cluster
> -----------------------------------------------------------------
>
> Key: CARBONDATA-1540
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1540
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.2.0
> Environment: spark 2.1
> Reporter: Vandana Yadav
> Attachments: Array.csv
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)