[ 
https://issues.apache.org/jira/browse/CARBONDATA-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422205#comment-17422205
 ] 

Bigicecream commented on CARBONDATA-4279:
-----------------------------------------

[~Indhumathi27] 
 Sorry for the slow response.

 

1. No.
 These are the logs:
{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/mnt/yarn/usercache/livy/filecache/48/__spark_libs__3665716770347383703.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
21/09/29 15:44:36 INFO CoarseGrainedExecutorBackend: Started daemon with 
process name: 18902@ip-10-4-181-156
21/09/29 15:44:37 INFO SignalUtils: Registered signal handler for TERM
21/09/29 15:44:37 INFO SignalUtils: Registered signal handler for HUP
21/09/29 15:44:37 INFO SignalUtils: Registered signal handler for INT
21/09/29 15:44:37 INFO SecurityManager: Changing view acls to: yarn,livy
21/09/29 15:44:37 INFO SecurityManager: Changing modify acls to: yarn,livy
21/09/29 15:44:37 INFO SecurityManager: Changing view acls groups to: 
21/09/29 15:44:37 INFO SecurityManager: Changing modify acls groups to: 
21/09/29 15:44:37 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users  with view permissions: Set(yarn, livy); 
groups with view permissions: Set(); users  with modify permissions: Set(yarn, 
livy); groups with modify permissions: Set()
21/09/29 15:44:38 INFO TransportClientFactory: Successfully created connection 
to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:34545 after 78 ms (0 
ms spent in bootstraps)
21/09/29 15:44:38 INFO SecurityManager: Changing view acls to: yarn,livy
21/09/29 15:44:38 INFO SecurityManager: Changing modify acls to: yarn,livy
21/09/29 15:44:38 INFO SecurityManager: Changing view acls groups to: 
21/09/29 15:44:38 INFO SecurityManager: Changing modify acls groups to: 
21/09/29 15:44:38 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users  with view permissions: Set(yarn, livy); 
groups with view permissions: Set(); users  with modify permissions: Set(yarn, 
livy); groups with modify permissions: Set()
21/09/29 15:44:38 INFO TransportClientFactory: Successfully created connection 
to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:34545 after 1 ms (0 
ms spent in bootstraps)
21/09/29 15:44:38 INFO DiskBlockManager: Created local directory at 
/mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/blockmgr-5aa03748-2d6d-4c78-9da5-1ef0e23cc506
21/09/29 15:44:38 INFO DiskBlockManager: Created local directory at 
/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/blockmgr-2dba9cef-1782-4baa-a13f-fe379e090118
21/09/29 15:44:38 INFO DiskBlockManager: Created local directory at 
/mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/blockmgr-d279178b-8dc9-4319-a64c-1e5bad11fe29
21/09/29 15:44:38 INFO MemoryStore: MemoryStore started with capacity 4.0 GB
21/09/29 15:44:38 INFO CoarseGrainedExecutorBackend: Connecting to driver: 
spark://coarsegrainedschedu...@ip-10-4-137-125.eu-west-1.compute.internal:34545
21/09/29 15:44:38 INFO CoarseGrainedExecutorBackend: Successfully registered 
with driver
21/09/29 15:44:38 INFO Executor: Starting executor ID 4 on host 
ip-10-4-181-156.eu-west-1.compute.internal
21/09/29 15:44:38 INFO Utils: Successfully started service 
'org.apache.spark.network.netty.NettyBlockTransferService' on port 38947.
21/09/29 15:44:38 INFO NettyBlockTransferService: Server created on 
ip-10-4-181-156.eu-west-1.compute.internal:38947
21/09/29 15:44:38 INFO BlockManager: Using 
org.apache.spark.storage.RandomBlockReplicationPolicy for block replication 
policy
21/09/29 15:44:38 INFO BlockManagerMaster: Registering BlockManager 
BlockManagerId(4, ip-10-4-181-156.eu-west-1.compute.internal, 38947, None)
21/09/29 15:44:38 INFO BlockManagerMaster: Registered BlockManager 
BlockManagerId(4, ip-10-4-181-156.eu-west-1.compute.internal, 38947, None)
21/09/29 15:44:38 INFO BlockManager: external shuffle service port = 7337
21/09/29 15:44:38 INFO BlockManager: Registering executor with local external 
shuffle service.
21/09/29 15:44:38 INFO TransportClientFactory: Successfully created connection 
to ip-10-4-181-156.eu-west-1.compute.internal/10.4.181.156:7337 after 2 ms (0 
ms spent in bootstraps)
21/09/29 15:44:38 INFO BlockManager: Initialized BlockManager: 
BlockManagerId(4, ip-10-4-181-156.eu-west-1.compute.internal, 38947, None)
21/09/29 15:44:38 INFO Executor: Using REPL class URI: 
spark://ip-10-4-137-125.eu-west-1.compute.internal:34545/classes
21/09/29 15:44:38 INFO CoarseGrainedExecutorBackend: Got assigned task 1
21/09/29 15:44:38 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
21/09/29 15:44:39 INFO TorrentBroadcast: Started reading broadcast variable 4
21/09/29 15:44:39 INFO TransportClientFactory: Successfully created connection 
to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:38079 after 5 ms (0 
ms spent in bootstraps)
21/09/29 15:44:39 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in 
memory (estimated size 74.5 KB, free 4.0 GB)
21/09/29 15:44:39 INFO TorrentBroadcast: Reading broadcast variable 4 took 147 
ms
21/09/29 15:44:39 INFO MemoryStore: Block broadcast_4 stored as values in 
memory (estimated size 187.1 KB, free 4.0 GB)
21/09/29 15:44:40 INFO TransportClientFactory: Successfully created connection 
to ip-10-4-137-125.eu-west-1.compute.internal/10.4.137.125:34545 after 2 ms (0 
ms spent in bootstraps)
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 471.203622 ms
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 33.323995 ms
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 22.954171 ms
21/09/29 15:44:40 INFO CodeGenerator: Code generated in 30.408357 ms
21/09/29 15:44:41 INFO CodeGenerator: Code generated in 81.831165 ms
21/09/29 15:44:41 INFO CoarseGrainedExecutorBackend: eagerFSInit: Eagerly 
initialized FileSystem at s3://does/not/exist in 2268 ms
21/09/29 15:44:41 INFO SQLConfCommitterProvider: Getting user defined output 
committer class org.apache.carbondata.hadoop.api.CarbonOutputCommitter
21/09/29 15:44:41 INFO FileOutputCommitter: File Output Committer Algorithm 
version is 2
21/09/29 15:44:41 INFO FileOutputCommitter: FileOutputCommitter skip cleanup 
_temporary folders under output directory:false, ignore cleanup failures: false
21/09/29 15:44:41 INFO SQLConfCommitterProvider: Using output committer class 
org.apache.carbondata.hadoop.api.CarbonOutputCommitter
21/09/29 15:44:41 INFO CodeGenerator: Code generated in 12.314282 ms
21/09/29 15:44:41 INFO CodeGenerator: Code generated in 12.462397 ms
21/09/29 15:44:42 INFO CodeGenerator: Code generated in 42.357031 ms
21/09/29 15:44:42 INFO CarbonProperties: Property file path: 
/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/container_1632902169938_0005_01_000008/../../../conf/carbon.properties
21/09/29 15:44:42 INFO CarbonProperties: ------Using Carbon.properties --------
21/09/29 15:44:42 INFO CarbonProperties: {}
21/09/29 15:44:42 INFO CarbonProperties: Considered file format is: V3
21/09/29 15:44:42 INFO CarbonProperties: Blocklet Size Configured value is "64"
21/09/29 15:44:42 WARN CarbonProperties: The enable mv value "null" is invalid. 
Using the default value "true"
21/09/29 15:44:42 WARN CarbonProperties: The value "LOCALLOCK" configured for 
key carbon.lock.type is invalid for current file system. Use the default value 
HDFSLOCK instead.
21/09/29 15:44:42 INFO CarbonProperties: Considered value for min max byte 
limit for string is: 200
21/09/29 15:44:42 INFO CarbonProperties: Using default value for 
carbon.detail.batch.size 100
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: 
/mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: 
/mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: 
/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001
21/09/29 15:44:42 INFO DataLoadExecutor: Data Loading is started for table 
mark_for_del_bug4
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: 
/mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: 
/mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:42 INFO CarbonDataProcessorUtil: Successfully created dir: 
/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:42 WARN CarbonOutputIteratorWrapper: try to poll a row batch one 
more time.
21/09/29 15:44:42 INFO AbstractFactDataWriter: Total file size: 1073741824 and 
dataBlock Size: 966367642
21/09/29 15:44:42 INFO AbstractFactDataWriter: Carbondata will write temporary 
fact data to local disk.
21/09/29 15:44:42 INFO CarbonFactDataWriterImplV3: Sort Scope : NO_SORT
21/09/29 15:44:43 INFO AbstractFactDataWriter: Randomly choose factdata temp 
location: 
/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:43 WARN CarbonOutputIteratorWrapper: try to poll a row batch one 
more time.
21/09/29 15:44:43 WARN CarbonOutputIteratorWrapper: try to poll a row batch one 
more time.
21/09/29 15:44:43 WARN UnsafeMemoryManager: It is not recommended to set 
off-heap working memory size less than 512MB, so setting default value to 512
21/09/29 15:44:43 INFO UnsafeMemoryManager: Off-heap Working Memory manager is 
created with size 536870912 with OFFHEAP
21/09/29 15:44:43 INFO CarbonFactDataWriterImplV3: Number of Pages for blocklet 
is: 1 :Rows Added: 1
21/09/29 15:44:43 INFO CarbonUtil: Copying 
/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001/part-0-100100000100001_batchno0-0-0-1632930271956.snappy.carbondata
 to 
s3a://coralogix-bigicecream/CarbonDataTests/bla2.db/mark_for_del_bug4/dt=2021-07-07/hr=13,
 operation id 1632930283187
21/09/29 15:44:43 INFO CarbonUtil: Total copy time is 235 ms, operation id 
1632930283187
21/09/29 15:44:43 INFO AbstractFactDataWriter: Randomly choose index file 
location: 
/mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001
21/09/29 15:44:43 INFO CarbonUtil: Copying 
/mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001/Fact/Part0/Segment_0/100100000100001/100100000100001_batchno0-0-0-1632930271956.carbonindex
 to 
s3a://coralogix-bigicecream/CarbonDataTests/bla2.db/mark_for_del_bug4/dt=2021-07-07/hr=13/0_1632930271956.tmp,
 operation id 1632930283434
21/09/29 15:44:43 INFO CarbonUtil: Total copy time is 244 ms, operation id 
1632930283434
21/09/29 15:44:43 INFO AbstractDataLoadProcessorStep: Total rows processed in 
step Data Writer: 1
21/09/29 15:44:43 INFO AbstractDataLoadProcessorStep: Total rows processed in 
step Input Processor: 1
21/09/29 15:44:43 INFO CarbonTableOutputFormat: Closed writer task 
attempt_20210929154434_0001_m_000000_1
21/09/29 15:44:43 INFO CarbonLoaderUtil: Deleted the local store location: 
/mnt2/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001:/mnt/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001:/mnt1/yarn/usercache/livy/appcache/application_1632902169938_0005/carbon15ae7277d0784335b1396e8f8687ff55_100100000100001
 : Time taken: 2
21/09/29 15:44:43 INFO SparkHadoopMapRedUtil: No need to commit output of task 
because needsTaskCommit=false: attempt_20210929154434_0001_m_000000_1
21/09/29 15:44:43 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 2799 
bytes result sent to driver
{noformat}
2. Yes, when the table doesn't have partitions, it works fine.

> Insert data to table with a partitions resulting in 'Marked for Delete' 
> segment in Spark in EMR
> -----------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-4279
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4279
>             Project: CarbonData
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>         Environment: Release label:emr-5.24.1
> Hadoop distribution:Amazon 2.8.5
> Applications:
> Hue 4.4.0, Spark 2.4.5,JupyterHub 0.9.6
> Jar compiled with:
> apache-carbondata:2.3.0-SNAPSHOT
> spark:2.4.5
> hadoop:2.8.3
>            Reporter: Bigicecream
>            Priority: Blocker
>
> as described [here|https://github.com/apache/carbondata/issues/4212]
> After the commit 
> [https://github.com/apache/carbondata/commit/42f69827e0a577b6128417104c0a49cd5bf21ad7]
> I have successfully created a table with partitions, but when I try to insert 
> data, the job ends with success
>  but the segment is marked as "Marked for Delete".
> I am running:
> {code:sql}
> CREATE TABLE lior_carbon_tests.mark_for_del_bug(
> timestamp string,
> name string
> )
> STORED AS carbondata
> PARTITIONED BY (dt string, hr string)
> {code}
> {code:sql}
> INSERT INTO lior_carbon_tests.mark_for_del_bug select 
> '2021-07-07T13:23:56.012+00:00','spark','2021-07-07','13'
> {code}
> {code:sql}
> select * from lior_carbon_tests.mark_for_del_bug
> {code}
> gives:
> {code:java}
> +---------+----+---+---+
> |timestamp|name| dt| hr|
> +---------+----+---+---+
> +---------+----+---+---+
> {code}
> And
> {code:sql}
> show segments for TABLE lior_carbon_tests.mark_for_del_bug
> {code}
> gives
>  
> {code:java}
> +---+-----------------+-----------------------+---------------+---------+---------+----------+-----------+
> |ID |Status           |Load Start Time        |Load Time Taken|Partition|Data Size|Index Size|File Format|
> +---+-----------------+-----------------------+---------------+---------+---------+----------+-----------+
> |0  |Marked for Delete|2021-09-02 15:24:21.022|11.798S        |NA       |NA       |NA        |columnar_v3|
> +---+-----------------+-----------------------+---------------+---------+---------+----------+-----------+
> {code}
>  
> I looked at the folder structure in S3 and it seems fine.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
