[jira] [Created] (CARBONDATA-4286) Select query with and filter is giving empty result
Nihal kumar ojha created CARBONDATA-4286: Summary: Select query with and filter is giving empty result Key: CARBONDATA-4286 URL: https://issues.apache.org/jira/browse/CARBONDATA-4286 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha A select query with an AND filter condition on a table returns an empty result even though valid data is present in the table. Root cause: when building the min-max index at block level, we currently use the unsafe byte comparator for both dimension and measure columns, which returns incorrect results for measure columns. We should use different comparators for dimension and measure columns, as we already do when writing the min-max index at blocklet level. -- This message was sent by Atlassian Jira (v8.3.4#803005)
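The comparator mismatch described above can be shown in isolation: a lexicographic comparison over raw bytes (which is effectively what an unsafe byte comparator does) misorders signed values, so measure columns need a typed comparator. A minimal sketch in Python, not CarbonData's actual code:

```python
import struct

def byte_compare(a: bytes, b: bytes) -> int:
    """Lexicographic unsigned byte comparison, like an unsafe byte comparator."""
    return (a > b) - (a < b)

# Big-endian two's-complement encodings of two ints.
neg_one = struct.pack(">i", -1)   # b'\xff\xff\xff\xff'
one = struct.pack(">i", 1)        # b'\x00\x00\x00\x01'

# Byte-wise, -1 sorts AFTER 1, because the sign bit makes its first byte 0xff.
assert byte_compare(neg_one, one) > 0   # wrong order for a measure column
assert -1 < 1                            # the typed comparison is correct
```

A min-max index built with the byte comparator can therefore record min/max reversed for a measure, and a filter pruned against that index skips blocks that actually contain matching rows.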
[jira] [Comment Edited] (CARBONDATA-4195) Materialized view loading time increased due to full refresh
[ https://issues.apache.org/jira/browse/CARBONDATA-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403524#comment-17403524 ] Nihal kumar ojha edited comment on CARBONDATA-4195 at 8/24/21, 5:14 AM: Hi, can you please provide the CREATE MATERIALIZED VIEW command? Only based on that is the MV created with incremental or full refresh. If your query contains the avg() aggregate function or an expression like sum(col1) + sum(col2), then the MV is created with full refresh. Once we have that command, we can conclude. Or, if this is a duplicate of [CARBONDATA-4239|https://issues.apache.org/jira/browse/CARBONDATA-4239], then please close this issue, as we are already tracking that one. was (Author: nihal): Hi, can you please provide the CREATE MATERIALIZED VIEW command? Only based on that is the MV created with incremental or full refresh. If your query contains the avg() aggregate function or an expression like sum(col1) + sum(col2), then the MV is created with full refresh. Once we have that command, we can conclude. > Materialized view loading time increased due to full refresh > > > Key: CARBONDATA-4195 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4195 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 2.1.0 >Reporter: Mayuri Patole >Priority: Major > Fix For: 2.1.0 > > > Hi Team, > We are using Carbon 2.1.0 in our project, where parallel data loading is happening. > We are working on getting optimal performance for aggregated queries using materialized views. > We observed that continuous data loading and full refresh of the MV cause increased load time and high memory usage, which shouldn't have to be the case. > Can you suggest a way to perform an incremental refresh, since we do not need to recalculate old data while loading?
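For reference, the distinction the comment draws can be sketched as follows (table and column names are made up for illustration): an MV over mergeable aggregates such as sum() can be refreshed incrementally, while one containing avg() or an expression over multiple aggregates forces a full refresh, because those cannot be recomputed from per-load partial results alone.

```sql
-- Incremental refresh possible: sum() results can be merged across loads.
CREATE MATERIALIZED VIEW mv_sales_sum AS
  SELECT region, sum(amount) FROM sales GROUP BY region;

-- Full refresh: avg() (and expressions like sum(c1) + sum(c2)) cannot be
-- merged incrementally, so each load recomputes the whole view.
CREATE MATERIALIZED VIEW mv_sales_avg AS
  SELECT region, avg(amount) FROM sales GROUP BY region;
```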
[jira] [Commented] (CARBONDATA-4227) SDK CarbonWriterBuilder cannot execute `build()` several times with different output path
[ https://issues.apache.org/jira/browse/CARBONDATA-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392772#comment-17392772 ] Nihal kumar ojha commented on CARBONDATA-4227: -- Hi, in the current Carbon SDK implementation, executing `CarbonWriter.builder()` creates a CarbonWriterBuilder instance, and the other exposed APIs keep changing properties on that same instance. When build() is triggered, all the properties of the instance are treated as final and used to create the CarbonWriter. If, after build(), we allowed changing other properties of the same instance and calling build() again, the result would overwrite the previous build() configuration, which again leads to confusion. So if you want to call build() again, please create a new builder() instance first; otherwise it won't behave as expected. Please reply if anything else about this is unclear. > SDK CarbonWriterBuilder cannot execute `build()` several times with different > output path > - > > Key: CARBONDATA-4227 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4227 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 2.1.1 >Reporter: ChenKai >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Sometimes we want to reuse a CarbonWriterBuilder object to build CarbonWriters > with different output paths, but it does not work. > For example: > {code:scala} > val builder = CarbonWriter.builder().withCsvInput(...).writtenBy(...) > // 1. first writing with path1 > val writer1 = builder.outputPath(path1).build() > // write data, it works > // 2. second writing with path2 > val writer2 = builder.outputPath(path2).build() > // write data, it does not work. It still writes data to path1 > {code}
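The recommended pattern from the comment, rewritten against the reporter's own snippet (a sketch only; the elided arguments stay elided):

```scala
// One builder per writer: builder properties are frozen at build() time,
// so each output path gets its own CarbonWriter.builder() chain.
val writer1 = CarbonWriter.builder().withCsvInput(...).writtenBy(...)
  .outputPath(path1).build()
// ... write data to path1 ...

val writer2 = CarbonWriter.builder().withCsvInput(...).writtenBy(...)
  .outputPath(path2).build()
// ... write data to path2 ...
```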
[jira] [Commented] (CARBONDATA-4177) performance issue with query
[ https://issues.apache.org/jira/browse/CARBONDATA-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392172#comment-17392172 ] Nihal kumar ojha commented on CARBONDATA-4177: -- Hi, currently CarbonData doesn't support limit pushdown for either the main table or the MV table; it is only supported in the case of a secondary index (SI). Because of this, even when we select rows with limit 10, Carbon first fills a vector of 4096 rows and then sends it to Spark; Spark then applies the limit and returns the result. So even for 10 rows we fetch 4096 rows, and that takes time. In the future we may support limit pushdown for the main table and MVs and fetch only 10 rows instead of 4096, which would benefit your query. > performance issue with query > > > Key: CARBONDATA-4177 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4177 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 2.0.1 >Reporter: suyash yadav >Priority: Major > Fix For: 2.0.1 > > > Hi Team, > We are working on a POC using CarbonData 2.0.1 and have come across a performance issue. Below are the details: > 1. Table creation query: > == > spark.sql("create table Flow_TS_1day_stats_16042021(start_time > timestamp,end_time timestamp,source_ip_address string,destintion_ip_address > string,appname string,protocol_name string,source_tos smallint,in_interface > smallint,out_interface smallint,src_as bigint,dst_as bigint,source_mask > smallint,destination_mask smallint, dst_tos smallint,input_pkt > bigint,input_byt bigint,output_pkt bigint,output_byt bigint,source_port > int,destination_port int) stored as carbondata TBLPROPERTIES > ('local_dictionary_enable'='false')").show() > Two MVs exist on this table; below are the queries for those MVs: > === > 1. 
Network MV > > spark.sql("create materialized view > Network_Level_Agg_10min_MV_with_ip_15042021_again as select > timeseries(end_time,'ten_minute') as end_time,source_ip_address, > destintion_ip_address,appname,protocol_name,source_port,destination_port,source_tos,src_as,dst_as,sum(input_pkt) > as input_pkt,sum(input_byt) as input_byt,sum(output_pkt) as > output_pkt,sum(output_byt) as output_byt from > Flow_TS_1day_stats_15042021_again group by > timeseries(end_time,'ten_minute'),source_ip_address,destintion_ip_address, > appname,protocol_name,source_port,destination_port,source_tos,src_as,dst_as > order by input_pkt,input_byt,output_pkt,output_byt desc").show(false) > 2. Interface MV: > ==Interface :== > spark.sql("create materialized view Interface_Level_Agg_10min_MV_16042021 as > select timeseries(end_time,'ten_minute') as end_time, > source_ip_address,destintion_ip_address,appname,protocol_name,source_port,destination_port,source_tos,src_as,dst_as,in_interface,out_interface,sum(input_pkt) > as input_pkt,sum(input_byt) as input_byt,sum(output_pkt) as > output_pkt,sum(output_byt) as output_byt from Flow_TS_1day_stats_16042021 > group by timeseries(end_time,'ten_minute'), > source_ip_address,destintion_ip_address,appname,protocol_name,source_port,destination_port,source_tos,src_as,dst_as,in_interface,out_interface > order by input_pkt,input_byt,output_pkt,output_byt desc").show(false) > +*We are firing the below query for fetching data, which is taking almost 10 > seconds:*+ > *Select appname,input_byt from Flow_TS_1day_stats_16042021 where end_time >= > '2021-03-02 00:00:00' and end_time < '2021-03-03 00:00:00' group by > appname,input_byt order by input_byt desc LIMIT 10* > > The above query fetches only 10 records, but it takes almost 10 seconds to > complete. > Could you please review the above schemas and help us understand how we can > improve the query execution time. We expect the response to be in subseconds. 
> Table Name: RAW Table (1 Day - 300K/Sec) > #Records: 2592000 > Regards, Suyash Yadav
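The batch-then-limit behavior the comment describes (fill a 4096-row vector, then let the engine apply the limit) can be sketched abstractly; the batch size is from the comment, but the function and names are illustrative, not CarbonData's code:

```python
BATCH_SIZE = 4096  # rows the reader fills into a vector before handing off

def scan(source, limit):
    """Without limit pushdown: the reader always fills a full batch,
    and the limit is applied afterwards by the engine."""
    batch = [next(source) for _ in range(BATCH_SIZE)]  # work done regardless of limit
    return batch[:limit]

rows = iter(range(1_000_000))
result = scan(rows, limit=10)
assert len(result) == 10          # only 10 rows are returned...
assert next(rows) == BATCH_SIZE   # ...but 4096 were consumed from the source
```

With pushdown, `scan` would stop after `limit` rows; without it, every LIMIT 10 query still pays for a full batch.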
[jira] [Created] (CARBONDATA-4256) SI creation on a complex column that includes child column with a dot(.) fails with parse exception.
Nihal kumar ojha created CARBONDATA-4256: Summary: SI creation on a complex column that includes child column with a dot(.) fails with parse exception. Key: CARBONDATA-4256 URL: https://issues.apache.org/jira/browse/CARBONDATA-4256 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha sql("create table complextable (country struct, name string, id Map, arr1 array, arr2 array) stored as carbondata"); sql("create index index_1 on table complextable(country.b) as 'carbondata'"); The above query fails with a parsing exception.
[jira] [Created] (CARBONDATA-4248) Explain query with upper case column is throwing key not found exception.
Nihal kumar ojha created CARBONDATA-4248: Summary: Explain query with upper case column is throwing key not found exception. Key: CARBONDATA-4248 URL: https://issues.apache.org/jira/browse/CARBONDATA-4248 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Steps to reproduce: sql("drop table if exists carbon_table") sql("drop table if exists parquet_table") sql("create table IF NOT EXISTS carbon_table(`BEGIN_TIME` BIGINT," + " `SAI_CGI_ECGI` STRING) stored as carbondata") sql("create table IF NOT EXISTS parquet_table(CELL_NAME string, CGISAI string)" + " stored as parquet") sql("explain extended with grpMainDatathroughput as (select" + " from_unixtime(begin_time, 'MMdd') as data_time, SAI_CGI_ECGI from carbon_table)," + " grpMainData as (select * from grpMainDatathroughput a JOIN(select CELL_NAME, CGISAI from" + " parquet_table) b ON b.CGISAI=a.SAI_CGI_ECGI) " + "select * from grpMainData a left join grpMainData b on a.cell_name=b.cell_name")
[jira] [Created] (CARBONDATA-4232) Add missing doc change for secondary index.
Nihal kumar ojha created CARBONDATA-4232: Summary: Add missing doc change for secondary index. Key: CARBONDATA-4232 URL: https://issues.apache.org/jira/browse/CARBONDATA-4232 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Doc changes were not handled for PR-4116 (leveraging the secondary index till segment level).
[jira] [Closed] (CARBONDATA-4114) Select query is returning empty result when carbon.read.partition.hive.direct = false
[ https://issues.apache.org/jira/browse/CARBONDATA-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha closed CARBONDATA-4114. Resolution: Duplicate > Select query is returning empty result when carbon.read.partition.hive.direct > = false > - > > Key: CARBONDATA-4114 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4114 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Currently, when {{carbon.read.partition.hive.direct = false}}, a select > query after a load command for a CSV file that contains multiple rows > returns an empty result. > > set carbon.read.partition.hive.direct=false; > drop table if exists sourceTable; > CREATE TABLE sourceTable (empno int, empname String, designation String, doj > Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, > deptname String, projectcode int, projectjoindate Timestamp, projectenddate > Timestamp) partitioned by(attendance int, utilization int, salary int) STORED > AS carbondata; > LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE sourceTable > OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"'); > select * from sourceTable;
[jira] [Commented] (CARBONDATA-4114) Select query is returning empty result when carbon.read.partition.hive.direct = false
[ https://issues.apache.org/jira/browse/CARBONDATA-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360633#comment-17360633 ] Nihal kumar ojha commented on CARBONDATA-4114: -- Duplicate of CARBONDATA-4113 > Select query is returning empty result when carbon.read.partition.hive.direct > = false > - > > Key: CARBONDATA-4114 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4114 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Currently, when {{carbon.read.partition.hive.direct = false}}, a select > query after a load command for a CSV file that contains multiple rows > returns an empty result. > > set carbon.read.partition.hive.direct=false; > drop table if exists sourceTable; > CREATE TABLE sourceTable (empno int, empname String, designation String, doj > Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, > deptname String, projectcode int, projectjoindate Timestamp, projectenddate > Timestamp) partitioned by(attendance int, utilization int, salary int) STORED > AS carbondata; > LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE sourceTable > OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"'); > select * from sourceTable;
[jira] [Created] (CARBONDATA-4196) Allow zero or more white spaces in geo UDFs
Nihal kumar ojha created CARBONDATA-4196: Summary: Allow zero or more white spaces in geo UDFs Key: CARBONDATA-4196 URL: https://issues.apache.org/jira/browse/CARBONDATA-4196 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Currently, the regex for geo UDFs does not allow zero spaces between the UDF name and the parenthesis; it always expects exactly one space in between, for example: {{linestring (120.184179 30.327465)}}. Because of this, calling a UDF without a space does not give the expected result.
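The fix amounts to matching zero-or-more whitespace between the UDF name and the opening parenthesis. A minimal sketch (the patterns here are illustrative, not CarbonData's actual regex):

```python
import re

# Requiring exactly one space (old behavior) misses the no-space call.
strict = re.compile(r"linestring \(")
# Allowing zero or more whitespace characters accepts both forms.
relaxed = re.compile(r"linestring\s*\(")

assert strict.search("linestring (120.184179 30.327465)")
assert not strict.search("linestring(120.184179 30.327465)")
assert relaxed.search("linestring (120.184179 30.327465)")
assert relaxed.search("linestring(120.184179 30.327465)")
```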
[jira] [Created] (CARBONDATA-4188) Select query fails for longstring data with small table page size after alter add columns
Nihal kumar ojha created CARBONDATA-4188: Summary: Select query fails for longstring data with small table page size after alter add columns Key: CARBONDATA-4188 URL: https://issues.apache.org/jira/browse/CARBONDATA-4188 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Steps to reproduce: # Create a table with a small page size and a longstring data type. # Load a large amount of data (more than one page should be created). # Alter the table to add an int column. # A select query with a filter on the newly added column fails with ArrayIndexOutOfBoundsException.
[jira] [Created] (CARBONDATA-4186) Insert query is failing when partition column is part of local sort scope.
Nihal kumar ojha created CARBONDATA-4186: Summary: Insert query is failing when partition column is part of local sort scope. Key: CARBONDATA-4186 URL: https://issues.apache.org/jira/browse/CARBONDATA-4186 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Currently, when we create a table with a partition column and put the same column in the local sort scope, the insert query fails with an ArrayIndexOutOfBoundsException.
[jira] [Updated] (CARBONDATA-4162) Leverage Secondary Index till segment level with SI as datamap and SI with plan rewrite
[ https://issues.apache.org/jira/browse/CARBONDATA-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha updated CARBONDATA-4162: - Summary: Leverage Secondary Index till segment level with SI as datamap and SI with plan rewrite (was: Leverage Secondary Index till segment level with Spark plan rewrite) > Leverage Secondary Index till segment level with SI as datamap and SI with > plan rewrite > --- > > Key: CARBONDATA-4162 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4162 > Project: CarbonData > Issue Type: New Feature >Reporter: Nihal kumar ojha >Priority: Major > Attachments: Support SI at segment level.pdf > > Time Spent: 1.5h > Remaining Estimate: 0h > > *Background:* > Secondary index tables are created as indexes and managed as child tables > internally by CarbonData. In the existing architecture, if the parent (main) > table and SI table don't have the same valid segments, we disable the SI > table. Then, from the next query onwards, we scan and prune only the parent > table until the next load or REINDEX command is triggered (as these commands > bring the parent and SI table segments back in sync). Because of this, > queries take more time to give the result while SI is disabled. > *Proposed Solution:* > We plan to leverage SI till the segment level. That means, instead of > disabling the SI table (when parent and child table segments are not in > sync), we will prune on SI tables for all the valid segments (segments with > status success, marked for update, and load partial success), and the rest > of the segments will be pruned by the parent table.
[jira] [Created] (CARBONDATA-4162) Leverage Secondary Index till segment level with Spark plan rewrite
Nihal kumar ojha created CARBONDATA-4162: Summary: Leverage Secondary Index till segment level with Spark plan rewrite Key: CARBONDATA-4162 URL: https://issues.apache.org/jira/browse/CARBONDATA-4162 Project: CarbonData Issue Type: New Feature Reporter: Nihal kumar ojha Attachments: Support SI at segment level.pdf *Background:* Secondary index tables are created as indexes and managed as child tables internally by CarbonData. In the existing architecture, if the parent (main) table and SI table don't have the same valid segments, we disable the SI table. Then, from the next query onwards, we scan and prune only the parent table until the next load or REINDEX command is triggered (as these commands bring the parent and SI table segments back in sync). Because of this, queries take more time to give the result while SI is disabled. *Proposed Solution:* We plan to leverage SI till the segment level. That means, instead of disabling the SI table (when parent and child table segments are not in sync), we will prune on SI tables for all the valid segments (segments with status success, marked for update, and load partial success), and the rest of the segments will be pruned by the parent table.
[jira] [Commented] (CARBONDATA-4144) After the alter table xxx compact command is executed, the index size of the segment is 0, and an error is reported while querying
[ https://issues.apache.org/jira/browse/CARBONDATA-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297969#comment-17297969 ] Nihal kumar ojha commented on CARBONDATA-4144: -- Hi, can you please give clear steps to reproduce this issue? The images you have uploaded are not rendering. > After the alter table xxx compact command is executed, the index size of the > segment is 0, and an error is reported while querying > - > > Key: CARBONDATA-4144 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4144 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > > 1. In the tablestatus of the secondary index table, the value of indexSize is 0 > and segmentFile is xxx_null.segment. > (screenshot attachment, not rendering) > 2. Query failed. > (screenshot attachment, not rendering)
[jira] [Resolved] (CARBONDATA-4131) Concurrent load on table with flat folder structure fails with FileNotFound
[ https://issues.apache.org/jira/browse/CARBONDATA-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha resolved CARBONDATA-4131. -- Resolution: Duplicate > Concurrent load on table with flat folder structure fails with FileNotFound > --- > > Key: CARBONDATA-4131 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4131 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h >
[jira] [Commented] (CARBONDATA-4131) Concurrent load on table with flat folder structure fails with FileNotFound
[ https://issues.apache.org/jira/browse/CARBONDATA-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285791#comment-17285791 ] Nihal kumar ojha commented on CARBONDATA-4131: -- Duplicate of CARBONDATA-3962 > Concurrent load on table with flat folder structure fails with FileNotFound > --- > > Key: CARBONDATA-4131 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4131 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h >
[jira] [Created] (CARBONDATA-4131) Concurrent load on table with flat folder structure fails with FileNotFound
Nihal kumar ojha created CARBONDATA-4131: Summary: Concurrent load on table with flat folder structure fails with FileNotFound Key: CARBONDATA-4131 URL: https://issues.apache.org/jira/browse/CARBONDATA-4131 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha
[jira] [Created] (CARBONDATA-4114) Select query is returning empty result when carbon.read.partition.hive.direct = false
Nihal kumar ojha created CARBONDATA-4114: Summary: Select query is returning empty result when carbon.read.partition.hive.direct = false Key: CARBONDATA-4114 URL: https://issues.apache.org/jira/browse/CARBONDATA-4114 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Currently, when {{carbon.read.partition.hive.direct = false}}, a select query after a load command for a CSV file that contains multiple rows returns an empty result. set carbon.read.partition.hive.direct=false; drop table if exists sourceTable; CREATE TABLE sourceTable (empno int, empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp) partitioned by(attendance int, utilization int, salary int) STORED AS carbondata; LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE sourceTable OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"'); select * from sourceTable;
[jira] [Commented] (CARBONDATA-4101) Carbondata Connectivity via JDBC driver
[ https://issues.apache.org/jira/browse/CARBONDATA-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267164#comment-17267164 ] Nihal kumar ojha commented on CARBONDATA-4101: -- Hi Rohit, we can connect to CarbonData using the JDBC connector. Please follow [https://carbondata.apache.org/quick-start-guide.html] to understand the integration of CarbonData with different engines like Spark, Presto, Hive, and Flink, and let us know if you have any doubt. Regards, Nihal kumar ojha > Carbondata Connectivity via JDBC driver > --- > > Key: CARBONDATA-4101 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4101 > Project: CarbonData > Issue Type: Task > Components: other >Reporter: Rohit Paranjape >Priority: Blocker > > Hello Team, > > We are working on a POC in which we want to connect to CarbonData from our > third-party application using a JDBC connector. > > Can we connect to CarbonData using JDBC? If yes, what would be the procedure, > and if not, what would be the possible options to connect to CarbonData from a > third-party application? > > Please share your inputs. > > Thanks & Regards, > Rohit Paranjape
[jira] [Created] (CARBONDATA-4102) Add UT and FT to improve coverage of SI module.
Nihal kumar ojha created CARBONDATA-4102: Summary: Add UT and FT to improve coverage of SI module. Key: CARBONDATA-4102 URL: https://issues.apache.org/jira/browse/CARBONDATA-4102 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Add UTs and FTs to improve coverage of the SI module, and also remove dead or unused code if it exists.
[jira] [Created] (CARBONDATA-4070) Handle the scenario mentioned in description for SI.
Nihal kumar ojha created CARBONDATA-4070: Summary: Handle the scenario mentioned in description for SI. Key: CARBONDATA-4070 URL: https://issues.apache.org/jira/browse/CARBONDATA-4070 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha # SI creation should not be allowed on an SI table. # An SI table should not be scanned with a LIKE filter on the main table. # Drop column should not be allowed on an SI table. Add FTs for all the above scenarios and sort-column-related scenarios.
[jira] [Created] (CARBONDATA-4069) Alter table set streaming=true should not be allowed on SI table or table having SI.
Nihal kumar ojha created CARBONDATA-4069: Summary: Alter table set streaming=true should not be allowed on SI table or table having SI. Key: CARBONDATA-4069 URL: https://issues.apache.org/jira/browse/CARBONDATA-4069 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha # Create a carbon table and an SI. # Now set streaming = true on either the SI table or the main table. Neither operation should be allowed, because SI is not supported on streaming tables. create table maintable2 (a string,b string,c int) STORED AS carbondata; insert into maintable2 values('k','x',2); create index m_indextable on table maintable2(b) AS 'carbondata'; ALTER TABLE maintable2 SET TBLPROPERTIES('streaming'='true'); => operation should not be allowed. ALTER TABLE m_indextable SET TBLPROPERTIES('streaming'='true') => operation should not be allowed.
[jira] [Created] (CARBONDATA-4068) Alter table set long string should not be allowed on SI column.
Nihal kumar ojha created CARBONDATA-4068: Summary: Alter table set long string should not be allowed on SI column. Key: CARBONDATA-4068 URL: https://issues.apache.org/jira/browse/CARBONDATA-4068 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha # Create a table and create an SI. # Now try to set the long string data type on the column on which the SI is created. The operation should not be allowed, because we don't support SI on long string columns. create table maintable (a string,b string,c int) STORED AS carbondata; create index indextable on table maintable(b) AS 'carbondata'; insert into maintable values('k','x',2); ALTER TABLE maintable SET TBLPROPERTIES('long_String_columns'='b');
[jira] [Created] (CARBONDATA-4059) Block compaction on SI table.
Nihal kumar ojha created CARBONDATA-4059: Summary: Block compaction on SI table. Key: CARBONDATA-4059 URL: https://issues.apache.org/jira/browse/CARBONDATA-4059 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Currently, compaction is allowed on an SI table. Because of this, if only the SI table is compacted, a filter query on the main table scans more data from the SI table, which causes performance degradation.
[jira] [Created] (CARBONDATA-4052) Select query on SI table after insert overwrite is giving wrong result.
Nihal kumar ojha created CARBONDATA-4052: Summary: Select query on SI table after insert overwrite is giving wrong result. Key: CARBONDATA-4052 URL: https://issues.apache.org/jira/browse/CARBONDATA-4052 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha # Create a carbon table. # Create an SI table on the same carbon table. # Do a load or insert operation. # Run an insert overwrite query on the main table. # Now a select query on the SI table shows old as well as new data; it should show only the new data.
[jira] [Updated] (CARBONDATA-4046) Select count(*) fails on partition table.
[ https://issues.apache.org/jira/browse/CARBONDATA-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha updated CARBONDATA-4046: - Description: Steps to reproduce: 1. Set the property `carbon.read.partition.hive.direct=false`. 2. Create a table that contains more than one partition column. 3. Run the query select count(*). It fails with a `Key not found` exception. create table partition_cache(a string) partitioned by(b int, c String) stored as carbondata; insert into partition_cache select 'k',1,'nihal'; select count(*) from partition_cache where b = 1; was: Steps to reproduce: 1. Set the property `carbon.read.partition.hive.direct=false`. 2. Create a table that contains more than one partition column. 3. Run the query select count(*). It fails with a `Key not found` exception. > Select count(*) fails on partition table. > - > > Key: CARBONDATA-4046 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4046 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Major > > Steps to reproduce: > 1. Set the property `carbon.read.partition.hive.direct=false`. > 2. Create a table that contains more than one partition column. > 3. Run the query select count(*). > > It fails with a `Key not found` exception. > > create table partition_cache(a string) partitioned by(b int, c String) stored > as carbondata; > insert into partition_cache select 'k',1,'nihal'; > select count(*) from partition_cache where b = 1;
[jira] [Created] (CARBONDATA-4046) Select count(*) fails on partition table.
Nihal kumar ojha created CARBONDATA-4046: Summary: Select count(*) fails on partition table. Key: CARBONDATA-4046 URL: https://issues.apache.org/jira/browse/CARBONDATA-4046 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Steps to reproduce: 1. Set the property `carbon.read.partition.hive.direct=false`. 2. Create a table which contains more than one partition column. 3. Run the query `select count(*)`. It fails with a `Key not found` exception. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-3880) How to start JDBC service in distributed index
[ https://issues.apache.org/jira/browse/CARBONDATA-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213825#comment-17213825 ] Nihal kumar ojha commented on CARBONDATA-3880: -- Hi, please follow the steps below to configure the distributed index server with JDBC. 1. Add these properties in spark-defaults.conf: spark.yarn.keytab= spark.carbon.indexserver.keytab= spark.carbon.indexserver.principal=spark2x/hadoop.hadoop@hadoop.com spark.yarn.principal=spark2x/hadoop.hadoop@hadoop.com 2. Add the following configuration in carbon.properties (ensure that carbon.properties is configured in spark-defaults.conf in the driver extra Java options): carbon.enable.index.server=true carbon.indexserver.enable.prepriming=true carbon.indexserver.HA.enabled=true carbon.max.executor.lru.cache.size=-1 carbon.disable.index.server.fallback=false carbon.indexserver.zookeeper.dir=/indexserver2x carbon.index.server.port= 3. Run the spark-submit command below from the $SPARK_HOME location: bin/spark-submit --num-executors 2 --master yarn --class org.apache.carbondata.indexserver.IndexServer 4. Start the Spark JDBC server as usual. Queries should then be reflected in the YARN UI, in both the index server application and the Spark JDBC application. > How to start JDBC service in distributed index > --- > > Key: CARBONDATA-3880 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3880 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 2.0.0 >Reporter: li >Priority: Major > Fix For: 2.1.0 > > > How to start JDBC service in distributed index -- This message was sent by Atlassian Jira (v8.3.4#803005)
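Laid out as file fragments, the configuration from the comment above looks as follows. The keytab and port values were left blank in the original comment and remain placeholders to fill in for your cluster:

```
# spark-defaults.conf (keytab paths elided in the original comment)
spark.yarn.keytab=
spark.carbon.indexserver.keytab=
spark.carbon.indexserver.principal=spark2x/hadoop.hadoop@hadoop.com
spark.yarn.principal=spark2x/hadoop.hadoop@hadoop.com

# carbon.properties (must be reachable via the driver extra Java options)
carbon.enable.index.server=true
carbon.indexserver.enable.prepriming=true
carbon.indexserver.HA.enabled=true
carbon.max.executor.lru.cache.size=-1
carbon.disable.index.server.fallback=false
carbon.indexserver.zookeeper.dir=/indexserver2x
carbon.index.server.port=
```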
[jira] [Commented] (CARBONDATA-3892) An exception occurred when modifying the table name using SparkSession
[ https://issues.apache.org/jira/browse/CARBONDATA-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213785#comment-17213785 ] Nihal kumar ojha commented on CARBONDATA-3892: -- Hi, I tried to reproduce this issue but could not. I am using the query "ALTER TABLE oldTable RENAME TO newTable". Please correct me if I am wrong, or if some other configuration is involved, please add it here. > An exception occurred when modifying the table name using SparkSession > -- > > Key: CARBONDATA-3892 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3892 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 2.0.0 >Reporter: li >Priority: Blocker > > Exception in thread "main" java.lang.LinkageError: ClassCastException: > attempting to cast > jar:file:/usr/hdp/2.6.5.0-292/spark2/carbonlib/apache-carbondata-1.6.1-bin-spark2.2.1-hadoop2.7.2.jar!/javax/ws/rs/ext/RuntimeDelegate.class > to > jar:file:/usr/hdp/2.6.5.0-292/spark2/carbonlib/apache-carbondata-1.6.1-bin-spark2.2.1-hadoop2.7.2.jar!/javax/ws/rs/ext/RuntimeDelegate.class -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-3806) Create bloom datamap fails with null pointer exception
[ https://issues.apache.org/jira/browse/CARBONDATA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208575#comment-17208575 ] Nihal kumar ojha commented on CARBONDATA-3806: -- This issue was handled in PR: https://github.com/apache/carbondata/pull/3775 > Create bloom datamap fails with null pointer exception > -- > > Key: CARBONDATA-3806 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3806 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1 > Environment: Spark 2.3.2 >Reporter: Chetan Bhat >Priority: Major > > Create bloom datamap fails with null pointer exception > create table brinjal_bloom (imei string,AMSize string,channelsId > string,ActiveCountry string, Activecity string,gamePointId > double,deviceInformationId double,productionDate Timestamp,deliveryDate > timestamp,deliverycharge double) STORED BY 'carbondata' > TBLPROPERTIES('table_blocksize'='1'); > LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE > brinjal_bloom OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); > 0: jdbc:hive2://10.20.255.171:23040/default> CREATE DATAMAP dm_brinjal4 ON > TABLE brinjal_bloom USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = > 'AMSize', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1'); > Error: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 0 in stage 210.0 failed 4 times, most recent failure: Lost task 0.3 in > stage 210.0 (TID 1477, vm2, executor 2): java.lang.NullPointerException > at > org.apache.carbondata.core.datamap.Segment.getCommittedIndexFile(Segment.java:150) > at > org.apache.carbondata.core.util.BlockletDataMapUtil.getTableBlockUniqueIdentifiers(BlockletDataMapUtil.java:198) > at > 
org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getTableBlockIndexUniqueIdentifiers(BlockletDataMapFactory.java:176) > at > org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:154) > at > org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getSegmentProperties(BlockletDataMapFactory.java:425) > at > org.apache.carbondata.datamap.IndexDataMapRebuildRDD.internalCompute(IndexDataMapRebuildRDD.scala:359) > at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:84) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:109) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Driver stacktrace: (state=,code=0) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3912) Clean file requests are failing in case of multiple load due to concurrent locking.
[ https://issues.apache.org/jira/browse/CARBONDATA-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha resolved CARBONDATA-3912. -- Fix Version/s: 2.1.0 Resolution: Fixed This issue was handled in PR: https://github.com/apache/carbondata/pull/3871 > Clean file requests are failing in case of multiple load due to concurrent > locking. > --- > > Key: CARBONDATA-3912 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3912 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > If multiple loads are fired at the same time, clean file requests fail > because they cannot acquire the lock. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3992) Drop Index is throwing null pointer exception.
[ https://issues.apache.org/jira/browse/CARBONDATA-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha resolved CARBONDATA-3992. -- Resolution: Fixed Fixed in PR: https://github.com/apache/carbondata/pull/3928 > Drop Index is throwing null pointer exception. > -- > > Key: CARBONDATA-3992 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3992 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > The index server is enabled but the index server is not running. > Create an index as 'carbondata' and try to drop it: a null pointer > exception is thrown. > IndexStoreManager.java, line 98 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3992) Drop Index is throwing null pointer exception.
Nihal kumar ojha created CARBONDATA-3992: Summary: Drop Index is throwing null pointer exception. Key: CARBONDATA-3992 URL: https://issues.apache.org/jira/browse/CARBONDATA-3992 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha The index server is enabled but the index server is not running. Create an index as 'carbondata' and try to drop it: a null pointer exception is thrown. IndexStoreManager.java, line 98 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3964) Select * from table or select count(*) without filter is throwing null pointer exception.
[ https://issues.apache.org/jira/browse/CARBONDATA-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha updated CARBONDATA-3964: - Priority: Minor (was: Major) > Select * from table or select count(*) without filter is throwing null > pointer exception. > - > > Key: CARBONDATA-3964 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3964 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Steps to reproduce: > 1. Create a table. > 2. Load around 500 segments and more than 1 million records. > 3. Running `select *` or `select count(*)` without a filter throws a null > pointer exception. > File: TableIndex.java > Method: pruneWithMultiThread > Line: 447 > Reason: filter.getResolver() is null. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3964) Select * from table or select count(*) without filter is throwing null pointer exception.
Nihal kumar ojha created CARBONDATA-3964: Summary: Select * from table or select count(*) without filter is throwing null pointer exception. Key: CARBONDATA-3964 URL: https://issues.apache.org/jira/browse/CARBONDATA-3964 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha Steps to reproduce: 1. Create a table. 2. Load around 500 segments and more than 1 million records. 3. Running `select *` or `select count(*)` without a filter throws a null pointer exception. File: TableIndex.java Method: pruneWithMultiThread Line: 447 Reason: filter.getResolver() is null. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3947) Insert Into Select Operation is throwing exception for hive read/write operation in carbon.
Nihal kumar ojha created CARBONDATA-3947: Summary: Insert Into Select Operation is throwing exception for hive read/write operation in carbon. Key: CARBONDATA-3947 URL: https://issues.apache.org/jira/browse/CARBONDATA-3947 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha CREATE TABLE hive_carbon_table1(id INT, name STRING, scale DECIMAL, country STRING, salary DOUBLE) stored by 'org.apache.carbondata.hive.CarbonStorageHandler'; INSERT into hive_carbon_table1 SELECT 1, 'RAM', '2.3', 'INDIA', 3500; CREATE TABLE hive_carbon_table2(id INT, name STRING, scale DECIMAL, country STRING, salary DOUBLE) stored by 'org.apache.carbondata.hive.CarbonStorageHandler'; INSERT into hive_carbon_table2 SELECT * FROM hive_carbon_table1; The last statement throws the exception "CarbonData file is not present in the table location". -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3928) Handle the Strings which length is greater than 32000 as a bad record.
Nihal kumar ojha created CARBONDATA-3928: Summary: Handle the Strings which length is greater than 32000 as a bad record. Key: CARBONDATA-3928 URL: https://issues.apache.org/jira/browse/CARBONDATA-3928 Project: CarbonData Issue Type: Task Reporter: Nihal kumar ojha Currently, when a string's length exceeds 32000, the load fails. Suggestions: 1. Strings longer than 32000 characters should be handled as bad records instead of failing the whole load, since usually only a few records exceed the limit. 2. Include more information in the log message, such as which record and column have the problem. -- This message was sent by Atlassian Jira (v8.3.4#803005)
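The proposal would make over-length strings follow CarbonData's existing bad-records options, along these lines; the table and paths are illustrative, and whether over-length strings honor these options is exactly what this task asks for:

```sql
CREATE TABLE long_strings (id INT, payload STRING) STORED AS carbondata;
-- With the proposed change, rows whose payload exceeds 32000 characters
-- would follow BAD_RECORDS_ACTION instead of failing the entire load:
LOAD DATA INPATH 'hdfs://hacluster/data/long_strings.csv' INTO TABLE long_strings
OPTIONS('DELIMITER'=',', 'BAD_RECORDS_ACTION'='REDIRECT',
        'BAD_RECORD_PATH'='hdfs://hacluster/badrecords');
```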
[jira] [Created] (CARBONDATA-3912) Clean file requests are failing in case of multiple load due to concurrent locking.
Nihal kumar ojha created CARBONDATA-3912: Summary: Clean file requests are failing in case of multiple load due to concurrent locking. Key: CARBONDATA-3912 URL: https://issues.apache.org/jira/browse/CARBONDATA-3912 Project: CarbonData Issue Type: Bug Reporter: Nihal kumar ojha If multiple loads are fired at the same time, clean file requests fail because they cannot acquire the lock. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3855) Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON.
[ https://issues.apache.org/jira/browse/CARBONDATA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha updated CARBONDATA-3855: - Attachment: CarbonData SDK support load from file.pdf > Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON. > -- > > Key: CARBONDATA-3855 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3855 > Project: CarbonData > Issue Type: New Feature >Reporter: Nihal kumar ojha >Priority: Major > Attachments: CarbonData SDK support load from file.pdf > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Please find the solution document attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3855) Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON.
[ https://issues.apache.org/jira/browse/CARBONDATA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha updated CARBONDATA-3855: - Attachment: (was: CarbonData SDK support load from file.pdf) > Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON. > -- > > Key: CARBONDATA-3855 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3855 > Project: CarbonData > Issue Type: New Feature >Reporter: Nihal kumar ojha >Priority: Major > Attachments: CarbonData SDK support load from file.pdf > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Please find the solution document attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3855) Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON.
[ https://issues.apache.org/jira/browse/CARBONDATA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha updated CARBONDATA-3855: - Attachment: CarbonData SDK support load from file.pdf > Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON. > -- > > Key: CARBONDATA-3855 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3855 > Project: CarbonData > Issue Type: New Feature >Reporter: Nihal kumar ojha >Priority: Major > Attachments: CarbonData SDK support load from file.pdf > > > Please find the solution document attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3855) Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON.
[ https://issues.apache.org/jira/browse/CARBONDATA-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nihal kumar ojha updated CARBONDATA-3855: - Attachment: (was: CarbonData SDK support load from file .pdf) > Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON. > -- > > Key: CARBONDATA-3855 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3855 > Project: CarbonData > Issue Type: New Feature >Reporter: Nihal kumar ojha >Priority: Major > > Please find the solution document attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3855) Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON.
Nihal kumar ojha created CARBONDATA-3855: Summary: Support Carbondata SDK to load data from parquet, ORC, CSV, Avro and JSON. Key: CARBONDATA-3855 URL: https://issues.apache.org/jira/browse/CARBONDATA-3855 Project: CarbonData Issue Type: New Feature Reporter: Nihal kumar ojha Attachments: CarbonData SDK support load from file .pdf Please find the solution document attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)