[ https://issues.apache.org/jira/browse/CARBONDATA-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422205#comment-17422205 ]
Bigicecream commented on CARBONDATA-4279:
-----------------------------------------

[~Indhumathi27] Sorry for the slow response.

1. No, these are the logs:
{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt/yarn/usercache/livy/filecache/48/__spark_libs__3665716770347383703.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/09/29 15:44:36 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 18902@ip-10-4-181-156
21/09/29 15:44:37 INFO SignalUtils: Registered signal handler for TERM
21/09/29 15:44:37 INFO SignalUtils: Registered signal handler for HUP
21/09/29 15:44:37 INFO SignalUtils: Registered signal handler for INT
21/09/29 15:44:37 INFO SecurityManager: Changing view acls to: yarn,livy
21/09/29 15:44:37 INFO SecurityManager: Changing modify acls to: yarn,livy
21/09/29 15:44:37 INFO SecurityManager: Changing view acls groups to:
21/09/29 15:44:37 INFO SecurityManager: Changing modify acls groups to:
21/09/29 15:44:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, livy); groups with view permissions: Set(); users with modify permissions: Set(yarn, livy); groups with modify permissions: Set()
21/09/29 15:44:38 INFO TransportClientFactory: Successfully created connection to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:34545 after 78 ms (0 ms spent in bootstraps)
21/09/29 15:44:38 INFO SecurityManager: Changing view acls to: yarn,livy
21/09/29 15:44:38 INFO SecurityManager: Changing modify acls to: yarn,livy
21/09/29 15:44:38 INFO SecurityManager: Changing view acls groups to:
21/09/29 15:44:38 INFO SecurityManager: Changing modify acls groups to:
21/09/29 15:44:38 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, livy); groups with view permissions: Set(); users with modify permissions: Set(yarn, livy); groups with modify permissions: Set()
21/09/29 15:44:38 INFO TransportClientFactory: Successfully created connection to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:34545 after 1 ms (0 ms spent in bootstraps)
21/09/29 15:44:38 INFO DiskBlockManager: Created local directory at /mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/blockmgr-5aa03748-2d6d-4c78-9da5-1ef0e23cc506
21/09/29 15:44:38 INFO DiskBlockManager: Created local directory at /mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/blockmgr-2dba9cef-1782-4baa-a13f-fe379e090118
21/09/29 15:44:38 INFO DiskBlockManager: Created local directory at /mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/blockmgr-d279178b-8dc9-4319-a64c-1e5bad11fe29
21/09/29 15:44:38 INFO MemoryStore: MemoryStore started with capacity 4.0 GB
21/09/29 15:44:38 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://coarsegrainedschedu...@ip-10-4-137-125.eu-west-1.compute.internal:34545
21/09/29 15:44:38 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
21/09/29 15:44:38 INFO Executor: Starting executor ID 4 on host ip-10-4-181-156.eu-west-1.compute.internal
21/09/29 15:44:38 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38947.
21/09/29 15:44:38 INFO NettyBlockTransferService: Server created on ip-10-4-181-156.eu-west-1.compute.internal:38947
21/09/29 15:44:38 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/09/29 15:44:38 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(4, ip-10-4-181-156.eu-west-1.compute.internal, 38947, None)
21/09/29 15:44:38 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(4, ip-10-4-181-156.eu-west-1.compute.internal, 38947, None)
21/09/29 15:44:38 INFO BlockManager: external shuffle service port = 7337
21/09/29 15:44:38 INFO BlockManager: Registering executor with local external shuffle service.
21/09/29 15:44:38 INFO TransportClientFactory: Successfully created connection to ip-10-4-181-156.eu-west-1.compute.internal/10.4.181.156:7337 after 2 ms (0 ms spent in bootstraps)
21/09/29 15:44:38 INFO BlockManager: Initialized BlockManager: BlockManagerId(4, ip-10-4-181-156.eu-west-1.compute.internal, 38947, None)
21/09/29 15:44:38 INFO Executor: Using REPL class URI: spark://ip-10-4-137-125.eu-west-1.compute.internal:34545/classes
21/09/29 15:44:38 INFO CoarseGrainedExecutorBackend: Got assigned task 1
21/09/29 15:44:38 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
21/09/29 15:44:39 INFO TorrentBroadcast: Started reading broadcast variable 4
21/09/29 15:44:39 INFO TransportClientFactory: Successfully created connection to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:38079 after 5 ms (0 ms spent in bootstraps)
21/09/29 15:44:39 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 74.5 KB, free 4.0 GB)
21/09/29 15:44:39 INFO TorrentBroadcast: Reading broadcast variable 4 took 147 ms
21/09/29 15:44:39 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 187.1 KB, free 4.0 GB)
21/09/29 15:44:40 INFO TransportClientFactory: Successfully created connection to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:34545 after 2 ms (0 ms spent in bootstraps)
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 471.203622 ms
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 33.323995 ms
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 22.954171 ms
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 30.408357 ms
21/09/29 15:44:41 INFO CodeGenerator: Code generated in 81.831165 ms
21/09/29 15:44:41 INFO CoarseGrainedExecutorBackend: eagerFSInit: Eagerly initialized FileSystem at s3://does/not/exist in 2268 ms
21/09/29 15:44:41 INFO SQLConfCommitterProvider: Getting user defined output committer class org.apache.carbondata.hadoop.api.CarbonOutputCommitter
21/09/29 15:44:41 INFO FileOutputCommitter: File Output Committer Algorithm version is 2
21/09/29 15:44:41 INFO FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
21/09/29 15:44:41 INFO SQLConfCommitterProvider: Using output committer class org.apache.carbondata.hadoop.api.CarbonOutputCommitter
21/09/29 15:44:41 INFO CodeGenerator: Code generated in 12.314282 ms
21/09/29 15:44:41 INFO CodeGenerator: Code generated in 12.462397 ms
21/09/29 15:44:42 INFO CodeGenerator: Code generated in 42.357031 ms
21/09/29 15:44:42 INFO CarbonProperties: Property file path: /mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/container_1632902169938_0005_01_000008/../../../conf/carbon.properties
21/09/29 15:44:42 INFO CarbonProperties: ------Using Carbon.properties --------
21/09/29 15:44:42 INFO CarbonProperties: {}
21/09/29 15:44:42 INFO CarbonProperties: Considered file format is: V3
21/09/29 15:44:42 INFO CarbonProperties: Blocklet Size Configured value is "64"
21/09/29 15:44:42 WARN CarbonProperties: The enable mv value "null" is invalid. Using the default value "true"
21/09/29 15:44:42 WARN CarbonProperties: The value "LOCALLOCK" configured for key carbon.lock.type is invalid for current file system. Use the default value HDFSLOCK instead.
21/09/29 15:44:42 INFO CarbonProperties: Considered value for min max byte limit for string is: 200
21/09/29 15:44:42 INFO CarbonProperties: Using default value for carbon.detail.batch.size 100
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: /mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: /mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: /mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001
21/09/29 15:44:42 INFO DataLoadExecutor: Data Loading is started for table mark_for_del_bug4
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: /mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: /mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: /mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:42 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/09/29 15:44:42 INFO AbstractFactDataWriter: Total file size: 1073741824 and dataBlock Size: 966367642
21/09/29 15:44:42 INFO AbstractFactDataWriter: Carbondata will write temporary fact data to local disk.
21/09/29 15:44:42 INFO CarbonFactDataWriterImplV3: Sort Scope : NO_SORT
21/09/29 15:44:43 INFO AbstractFactDataWriter: Randomly choose factdata temp location: /mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:43 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/09/29 15:44:43 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
21/09/29 15:44:43 WARN UnsafeMemoryManager: It is not recommended to set off-heap working memory size less than 512MB, so setting default value to 512
21/09/29 15:44:43 INFO UnsafeMemoryManager: Off-heap Working Memory manager is created with size 536870912 with OFFHEAP
21/09/29 15:44:43 INFO CarbonFactDataWriterImplV3: Number of Pages for blocklet is: 1 :Rows Added: 1
21/09/29 15:44:43 INFO CarbonUtil: Copying /mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001/part-0-100100000100001_batchno0-0-0-1632930271956.snappy.carbondata to s3a://coralogix-bigicecream/CarbonDataTests/bla2.db/mark_for_del_bug4/dt=2021-07-07/hr=13, operation id 1632930283187
21/09/29 15:44:43 INFO CarbonUtil: Total copy time is 235 ms, operation id 1632930283187
21/09/29 15:44:43 INFO AbstractFactDataWriter: Randomly choose index file location: /mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:43 INFO CarbonUtil: Copying /mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001/100100000100001_batchno0-0-0-1632930271956.carbonindex to s3a://coralogix-bigicecream/CarbonDataTests/bla2.db/mark_for_del_bug4/dt=2021-07-07/hr=13/0_1632930271956.tmp, operation id 1632930283434
21/09/29 15:44:43 INFO CarbonUtil: Total copy time is 244 ms, operation id 1632930283434
21/09/29 15:44:43 INFO AbstractDataLoadProcessorStep: Total rows processed in step Data Writer: 1
21/09/29 15:44:43 INFO AbstractDataLoadProcessorStep: Total rows processed in step Input Processor: 1
21/09/29 15:44:43 INFO CarbonTableOutputFormat: Closed writer task attempt_20210929154434_0001_m_000000_1
21/09/29 15:44:43 INFO CarbonLoaderUtil: Deleted the local store location: /mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001:/mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001:/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001 : Time taken: 2
21/09/29 15:44:43 INFO SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_20210929154434_0001_m_000000_1
21/09/29 15:44:43 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 2799 bytes result sent to driver
{noformat}

2.
Yes, when the table doesn't have partitions it works fine.

> Insert data to table with a partitions resulting in 'Marked for Delete' segment in Spark in EMR
> -----------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-4279
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4279
>             Project: CarbonData
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>         Environment: Release label: emr-5.24.1
> Hadoop distribution: Amazon 2.8.5
> Applications:
> Hue 4.4.0, Spark 2.4.5, JupyterHub 0.9.6
> Jar compiled with:
> apache-carbondata:2.3.0-SNAPSHOT
> spark:2.4.5
> hadoop:2.8.3
>            Reporter: Bigicecream
>            Priority: Blocker
>
> As described [here|https://github.com/apache/carbondata/issues/4212]:
> After the commit
> [https://github.com/apache/carbondata/commit/42f69827e0a577b6128417104c0a49cd5bf21ad7]
> I have successfully created a table with partitions, but when I try to insert
> data, the job ends successfully
> but the segment is marked as "Marked for Delete".
> I am running:
> {code:sql}
> CREATE TABLE lior_carbon_tests.mark_for_del_bug(
> timestamp string,
> name string
> )
> STORED AS carbondata
> PARTITIONED BY (dt string, hr string)
> {code}
> {code:sql}
> INSERT INTO lior_carbon_tests.mark_for_del_bug select
> '2021-07-07T13:23:56.012+00:00','spark','2021-07-07','13'
> {code}
> {code:sql}
> select * from lior_carbon_tests.mark_for_del_bug
> {code}
> gives:
> {code:java}
> +---------+----+---+---+
> |timestamp|name| dt| hr|
> +---------+----+---+---+
> +---------+----+---+---+
> {code}
> And
> {code:java}
> show segments for TABLE lior_carbon_tests.mark_for_del_bug
> {code}
> gives
>
> {code:java}
> +---+-----------------+-----------------------+---------------+---------+---------+----------+-----------+
> |ID |Status           |Load Start Time        |Load Time Taken|Partition|Data Size|Index Size|File Format|
> +---+-----------------+-----------------------+---------------+---------+---------+----------+-----------+
> |0  |Marked for Delete|2021-09-02 15:24:21.022|11.798S        |NA       |NA       |NA        |columnar_v3|
> +---+-----------------+-----------------------+---------------+---------+---------+----------+-----------+
> {code}
>
> I took a look at the folder structure in S3 and it seems fine.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)