[jira] [Closed] (CARBONDATA-4106) Compaction is not working properly

Ajantha Bhat (Jira) Mon, 18 Jan 2021 02:36:04 -0800


     [ 
https://issues.apache.org/jira/browse/CARBONDATA-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ajantha Bhat closed CARBONDATA-4106.
------------------------------------
    Fix Version/s:     (was: 2.0.1)
       Resolution: Not A Bug

> Compaction is not working properly
> ----------------------------------
>
>                 Key: CARBONDATA-4106
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4106
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.0.1
>         Environment: Apache Spark 2.4.5, CarbonData 2.0.1
>            Reporter: suyash yadav
>            Priority: Major
>         Attachments: describe_fact_probe_1
>
>
> Hi Team,
> We are using Apache CarbonData 2.0.1 for one of our POCs, and we observed that
> we are not getting the expected benefit from compaction (both major and
> minor).
> Please find below the details of the issue we are facing:
> *Name of the table used*:  fact_365_1_probe_1
> *Number of rows:*
> select count(*) from fact_365_1_probe_1
> +--------+
> |count(1)|
> +--------+
> |76963753|
> +--------+
> *Sample data from the table:*
> ======================
> +-------------------+--------------------------+------------------------------------+------------------+-------------+-------------------+
> |                 ts|                    metric|                             tags_id|             value|        epoch|                ts2|
> +-------------------+--------------------------+------------------------------------+------------------+-------------+-------------------+
> |2021-01-07 21:05:00|Probe.Duplicate.Poll.Count|c8dead9b-87ae-46ae-8703-bc2b7bfba5d4|39.611356797970274|1610033757768|2021-01-07 00:00:00|
> |2021-01-07 23:50:00|Probe.Duplicate.Poll.Count|62351ef2-f2ce-49d1-a2fd-a0d1e5f6a1b9| 72.70658115131307|1610043742516|2021-01-07 00:00:00|
> +-------------------+--------------------------+------------------------------------+------------------+-------------+-------------------+
>  
> [^describe_fact_probe_1]
>  
> I have attached the describe output, which shows the other details of
> the table.
> The size of the table is 3.24 GB, and even after running minor or major
> compaction the size remains almost the same.
> So we are not getting any benefit from running compaction. Could you please
> review the shared details and help us identify whether we are missing
> something here, or whether there is a bug?
> We also need answers to the following questions about CarbonData storage:
> 1. For decimal values, how does storage behave? If one row has 20 digits
> after the decimal point and a second row has only 5, what is the
> difference in the storage taken?
> 2. If I have two tables, one with the same value in all 100 rows and the
> other with a different value in each of 100 rows, how will CarbonData
> behave as far as storage is concerned? Which table will take less
> storage, or will both take the same?
> 3. For the string datatype, could you please describe how the storage is
> defined?
>  
> ================
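
On question 2 in the report above, a rough, generic illustration may help:
repeated values usually compress far better than distinct ones. The sketch
below uses zlib from the Python standard library as a stand-in. It is not
CarbonData's encoder, and the sample values and the helper name
compressed_size are made up for illustration, so the numbers say nothing
about actual CarbonData file sizes.

```python
import random
import zlib


def compressed_size(values):
    """Return the zlib-compressed size in bytes of the values joined as text.

    Hypothetical helper for this illustration only; zlib is a stand-in and
    is not the encoding or compression that CarbonData applies.
    """
    raw = "\n".join(values).encode("utf-8")
    return len(zlib.compress(raw))


random.seed(0)  # fixed seed so repeated runs use the same sample values

# Hypothetical column A: the same value repeated in all 100 rows.
same_values = ["39.611356797970274"] * 100

# Hypothetical column B: 100 different values of similar length.
distinct_values = [repr(random.random() * 100) for _ in range(100)]

same_size = compressed_size(same_values)
distinct_size = compressed_size(distinct_values)

print("repeated values:", same_size, "bytes")
print("distinct values:", distinct_size, "bytes")
```

Under these assumptions the repeated column comes out much smaller than the
distinct one, although both hold 100 rows of similar raw length.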



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
